Week 6 DNA Data Processing, Repository and Seq Alignment PDF
Universiti Malaysia Sabah
Summary
This document provides information on DNA data processing, repository, and sequence alignment, including data formats, processing steps, and tools like GenBank and BOLD. It covers different aspects of handling DNA sequences, from basic concepts to more advanced procedures, including phylogenetic analysis and barcoding.
Full Transcript
SB 33403 Molecular Biology in Conservation
Lecturer: Dr. Siti Fatimah Md Isa (SFMI)
DNA Data Processing, Repository and Sequence Alignment

1. Processing DNA sequences

Output data from DNA sequencer
- DNA sequencer and software
- Data format
  - Electronic gel file: primary data
  - Chromatogram: secondary data
  - Nucleotide sequence: tertiary data
  - *.abi file = Chromatogram + Nucleotide sequence

Data format cont.
- Nucleotide sequence data format
  - FASTA format (*.fas)
  - ... format and commonly used ... BioEdit

Verify DNA sequence
5' ATGTGGTATGGTAGGAACAGG 3'
3' TACACCATACCATCCTTGTCC 5'

Data depository
Where to keep DNA sequences?
- GenBank (https://www.ncbi.nlm.nih.gov/)
  - Sequence database
  - Open access
  - > 170 million sequences (> 160 billion nucleotide bases)
  - Public repository for all the raw data (sequences) of scientific articles
- Barcode of Life Data Systems (BOLD)
  - Sequence and specimen database
  - Open access / private project
  - Tools: database, simple phylogenetic analysis, barcoding analysis
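The "Verify DNA sequence" step pairs a 5'-to-3' strand with its complementary strand, and the tertiary data are typically stored in FASTA format. The following is a minimal sketch of both ideas in Python. The helper names (`complement`, `reverse_complement`, `to_fasta`) and the record name `example_read` are hypothetical, and the code assumes sequences contain only the unambiguous bases A, T, G and C.

```python
# Watson-Crick base pairing for unambiguous DNA bases only (assumption:
# no IUPAC ambiguity codes such as N or R appear in the sequence).
COMPLEMENT = str.maketrans("ATGC", "TACG")


def complement(seq: str) -> str:
    """Return the complementary strand, read in the same left-to-right order."""
    return seq.upper().translate(COMPLEMENT)


def reverse_complement(seq: str) -> str:
    """Return the complementary strand written 5'->3'."""
    return complement(seq)[::-1]


def to_fasta(name: str, seq: str, width: int = 60) -> str:
    """Format one record as FASTA: a '>' header line, then wrapped sequence lines."""
    lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    return ">" + name + "\n" + "\n".join(lines) + "\n"


# The two strands shown on the slide.
top = "ATGTGGTATGGTAGGAACAGG"     # written 5'->3'
bottom = "TACACCATACCATCCTTGTCC"  # written 3'->5'

# The strands verify each other if every base pairs with its complement.
strands_match = complement(top) == bottom
print("Strands are complementary:", strands_match)

# Hypothetical record name; a real file would normally carry a specimen ID.
print(to_fasta("example_read", top), end="")
```

Saving the output of `to_fasta` to a file with a `.fas` extension gives a file in the FASTA format named in the lecture.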
Data depository cont.
How to manage DNA sequences
- BOLD
  - Information about the specimen
  - DNA extraction
  - COI: PCR and sequencing
  - 16S: PCR and sequencing
  - ITS-1: PCR and sequencing

Four components of BOLD (Barcode of Life Data Systems)
- Specimens information
- Laboratory analysis
- Data analysis

Basics of phylogenetic analysis

Making inferences from morphological data
- Cladistic method (character matrix): homology, character coding, outgroup
- Phylogenetic tree(s): branch support, length, consistency index (CI), retention index (RI)

Making inferences from DNA data
- Cladistic method (character matrix): homology, character coding, outgroup
  - Character coding: A, T, G, C; primer (gene); alignment
- Complex analysis: parsimony, likelihood, Bayesian
- Phylogenetic tree(s): branch support, length, consistency index (CI), retention index (RI)

- How to align two DNA sequences?
- How to align more than two DNA sequences?
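One common answer to the question of how to align two DNA sequences is global pairwise alignment by dynamic programming (the Needleman-Wunsch algorithm). The lecture does not say which method it uses, so the following is only an illustrative sketch. The function name `needleman_wunsch`, the scoring values (match +1, mismatch -1, gap -2) and the two toy sequences are all assumptions chosen for the example.

```python
def needleman_wunsch(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -2):
    """Globally align sequences a and b; return (aligned_a, aligned_b, score).

    The scoring scheme is an assumption for illustration, not a recommendation.
    """
    n, m = len(a), len(b)

    # score[i][j] = best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def pair_score(x: str, y: str) -> int:
        return match if x == y else mismatch

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + pair_score(a[i - 1], b[j - 1]),  # align two bases
                score[i - 1][j] + gap,                                 # gap in b
                score[i][j - 1] + gap,                                 # gap in a
            )

    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair_score(a[i - 1], b[j - 1]):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1

    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


# Toy sequences (hypothetical, not from the lecture).
aligned_a, aligned_b, total = needleman_wunsch("ACGT", "AGT")
print(aligned_a)  # ACGT
print(aligned_b)  # A-GT
print(total)      # 1
```

Aligning more than two sequences is usually handled by dedicated multiple sequence alignment programs such as Clustal, MUSCLE or MAFFT rather than by extending this table directly, because the exact dynamic-programming approach becomes very expensive as the number of sequences grows.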
3. DNA data matrix

Taxa A        - ATTCCGAAAAATATACTCAA
Taxa B        - ATTCCAAATATACTCAACAA
Taxa C        - ATTCCGAAAAATATACTAACAA
Taxa D        - ATTCCAAATATACTCAACAA
Outgroup taxa - CGG CCTTTCCAAGTAGGGGTTCA

Tree tip order (figure): Taxa B, Taxa D, Taxa C, Taxa A, Outgroup taxa

DNA data matrix cont. (figure): variable region and conserved region
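Once sequences are aligned into a data matrix, each column can be labelled as conserved (every taxon has the same character) or variable (at least two different characters). The sketch below shows one simple way to do this. The function name `classify_columns`, the taxon names and the short aligned sequences are hypothetical, and treating a gap (`-`) as just another character state is an assumption made for simplicity.

```python
def classify_columns(matrix: dict) -> str:
    """Return one symbol per alignment column: '*' = conserved, '.' = variable.

    Assumes all sequences are already aligned to the same length.
    Gaps ('-') count as an ordinary character state here (an assumption).
    """
    rows = list(matrix.values())
    length = len(rows[0])
    if any(len(row) != length for row in rows):
        raise ValueError("all aligned sequences must have the same length")

    symbols = []
    for column in zip(*rows):  # iterate over the matrix column by column
        symbols.append("*" if len(set(column)) == 1 else ".")
    return "".join(symbols)


# Hypothetical, already-aligned toy matrix (not the taxa from the lecture).
toy_matrix = {
    "taxon_1": "ATTCCGAA",
    "taxon_2": "ATTCCAAA",
    "taxon_3": "ATTCCGAT",
}

for name, seq in toy_matrix.items():
    print(f"{name}  {seq}")
print(f"{'':7}  {classify_columns(toy_matrix)}")  # *****.*.
```

Runs of `*` correspond to conserved regions and runs of `.` to variable regions. Only the variable columns can carry phylogenetic signal in methods such as parsimony.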