Questions and Answers
What is the primary function of the search feature in the KEGG Pathway database?
What is the main advantage of the KEGG Pathway database for researchers and students?
What is the source of the viral genomes in the ViruSITE database?
What sets the ViruSITE database apart from other databases?
What is the purpose of the pathway maps in the KEGG Pathway database?
What is the primary focus of the ViruSITE database?
What is the primary goal of gene annotation?
What is the benefit of gene annotation in terms of understanding biological processes and diseases?
What step in the gene annotation process involves identifying open reading frames (ORFs) and predicting coding regions?
What is the purpose of accession numbers in gene annotation?
What is the final output of the gene annotation process?
What is the purpose of the iterative improvement step in gene annotation?
What is the primary purpose of using a primary database?
What type of database is GenBank?
What is the main difference between a primary database and a secondary database?
Which database is used to store 3D structures of proteins and nucleic acids?
What is the origin of the EMBL database?
What is the original format of GenBank's data?
What is the purpose of rich annotations in a database?
What is the purpose of BankIt, Sequin, and tbl2asn tools?
What type of data is stored in the CoreNucleotide database?
What is the main goal of the International Nucleotide Sequence Database Collaboration (INSDC)?
What is a characteristic of GenBank files?
What is the current status of GenBank's database?
What is the main objective of pairwise sequence alignment?
What type of sequence alignment is used to identify short conserved regions in protein or nucleotide sequences?
What is the difference between global and local alignment?
What is the purpose of a scoring system in pairwise sequence alignment?
What type of sequence alignment involves aligning two sequences to identify the optimal pairing of the sequences?
What is a characteristic of global alignment?
Study Notes
Primary Databases
- GenBank is a repository of known nucleotide sequences with a flat file structure, readable by humans and computers.
- GenBank files contain information such as accession numbers, gene names, phylogenetic classification, and references to published literature.
- GenBank records are still distributed as flat files, but the data are also maintained and exchanged in structured formats such as ASN.1 and XML, which are easier to manage and to share between systems.
- GenBank is an open-access sequence database that accepts submissions from individual laboratories and exchanges data with other sequence databases such as EMBL and DDBJ.
- GenBank is an annotated collection of all nucleotide sequences available to the public.
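To make the flat file structure described above concrete, here is a minimal Python sketch that pulls a few header fields and the sequence out of a GenBank-style record. The sample record and the helper name `parse_flat_file` are made up for illustration, and the parser handles only the handful of keywords shown; real records carry many more fields and multi-line values, so a maintained parser would normally be the better choice.

```python
# Illustrative record only: the accession, organism, and sequence are invented.
SAMPLE_RECORD = """\
LOCUS       EXAMPLE01                 24 bp    DNA     linear   SYN 01-JAN-2000
DEFINITION  Hypothetical example sequence.
ACCESSION   X00000
VERSION     X00000.1
SOURCE      synthetic construct
  ORGANISM  synthetic construct
            other sequences; artificial sequences.
ORIGIN
        1 atggccattg taatgggccg ctga
//
"""

WANTED = ("DEFINITION", "ACCESSION", "VERSION", "ORGANISM")


def parse_flat_file(text):
    """Return selected header fields and the sequence from one record."""
    fields = {}
    sequence_parts = []
    in_origin = False
    for line in text.splitlines():
        if line.startswith("//"):        # record terminator
            break
        if in_origin:
            # Sequence lines: a position number, then blocks of bases.
            sequence_parts.extend(line.split()[1:])
            continue
        if line.startswith("ORIGIN"):    # sequence section begins
            in_origin = True
            continue
        parts = line.split(None, 1)      # keyword, then the rest of the line
        if len(parts) == 2 and parts[0] in WANTED:
            fields[parts[0].lower()] = parts[1].strip()
    fields["sequence"] = "".join(sequence_parts).upper()
    return fields


record = parse_flat_file(SAMPLE_RECORD)
print(record["accession"], record["organism"], len(record["sequence"]))
```

The same text is readable by a person and by this short script, which is the point of the flat file layout.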
GenBank Structure
- At NCBI the nucleotide data are divided into three databases: CoreNucleotide, Expressed Sequence Tag (EST), and Genome Survey Sequence (GSS).
- The CoreNucleotide database holds most of the nucleotide sequences in common use: it contains every nucleotide record that is not in the EST or GSS databases.
Submission to GenBank
- Sequences can be submitted to GenBank using BankIt, Sequin, and tbl2asn tools.
KEGG Pathway Database
- The KEGG Pathway database offers features to explore biosynthesis pathways, including search function, pathway maps, and data links.
- Pathway maps provide detailed information on molecules, enzymes, and genes involved in each step.
- Data links access relevant databases for genes, proteins, and other molecules associated with the pathway.
ViruSITE Database
- ViruSITE is a comprehensive database designed for viral genomics, integrating information on viral genomes, genes, and proteins from various sources.
- ViruSITE incorporates all genomes of viruses, viroids, and satellites deposited in the NCBI Reference Sequence Database (RefSeq).
- Data from numerous resources such as NCBI RefSeq, UniProtKB, Gene Ontology (GO), ViralZone, and PubMed are computationally extracted and integrated under human supervision.
- ViruSITE offers an intuitive, user-friendly interface for navigation and exploration.
Gene Annotation
- Gene annotation involves analyzing the sequence and predicting its potential functions, adding valuable information to databases.
- Benefits of gene annotation include identifying genes and their locations, predicting gene function, providing insights into gene regulation and expression, and understanding biological processes and diseases.
Sequence Analysis Pipeline
- The pipeline involves pre-processing, gene prediction, functional annotation, structural annotation, integration and visualization, quality assessment, and database submission.
- Each step involves specific tools and techniques, including quality control, sequence assembly, identifying ORFs, predicting coding regions, protein homology, domain prediction, GO annotation, and pathway mapping.
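As a rough illustration of the gene prediction step, the sketch below scans the three forward reading frames of a DNA string for open reading frames (ORFs), defined here as an ATG start codon followed in-frame by the first stop codon. The function name `find_orfs`, the `min_codons` threshold, and the example sequence are assumptions made for this sketch; real gene predictors also search the reverse strand and use statistical models rather than this simple scan.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=3):
    """Find forward-strand ORFs.

    Returns a list of (start, end, frame) tuples, where start is 0-based,
    end is exclusive, and the span runs from ATG through the stop codon.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i                       # open a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None                    # look for the next ATG
    return orfs


example = "CCATGGCCATTGTAATGGGCCGCTGAAAGG"  # invented sequence
orfs = find_orfs(example)
for start, end, frame in orfs:
    print(frame, start, end, example[start:end])
```

Note that the TAA at positions 12 to 14 of the example does not end the ORF, because it lies in a different reading frame from the ATG at position 2.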
Accession Numbers
- Accession numbers are unique identifiers that permanently identify sequences in the database.
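A small Python sketch of how an accession number might be checked and separated from its version suffix. It assumes the commonly documented GenBank nucleotide layouts (one letter plus five digits, or two letters plus six or eight digits), optionally followed by `.version`. The pattern and the helper name `split_accession` are illustrative, and the pattern does not cover other identifier families such as RefSeq or protein accessions.

```python
import re

# Assumed nucleotide accession layouts: 1 letter + 5 digits,
# 2 letters + 6 digits, or 2 letters + 8 digits, with an optional ".N" version.
ACCESSION_RE = re.compile(
    r"^([A-Z]\d{5}|[A-Z]{2}\d{6}|[A-Z]{2}\d{8})(?:\.(\d+))?$"
)


def split_accession(identifier):
    """Return (accession, version) or None if the layout is not recognised."""
    match = ACCESSION_RE.match(identifier.strip())
    if match is None:
        return None
    version = int(match.group(2)) if match.group(2) else None
    return match.group(1), version


print(split_accession("U49845.1"))
print(split_accession("AB123456"))     # format example, not a checked record
print(split_accession("NM_000546.6"))  # RefSeq-style ID, outside this pattern
```

The accession itself stays fixed for the life of the record; only the version number after the dot changes when the sequence is updated.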
Primary vs. Secondary Databases
- Primary databases store raw, uninterpreted experimental data, while secondary databases provide a more comprehensive view of biological information with interpretations, annotations, and functionalities for exploring relationships and functions.
- Examples of primary databases include GenBank, DDBJ, and EMBL, while secondary databases include the KEGG Pathway database and ViruSITE.
Types of Sequence Alignment
- Pairwise sequence alignment involves aligning two sequences to identify the optimal pairing, based on a scoring system that assigns positive scores to matching characters and negative scores to mismatching characters or gaps.
- Multiple sequence alignment involves aligning three or more sequences to identify conserved regions and reconstruct evolutionary relationships.
- Methods of pairwise sequence alignment include the dot-matrix method, dynamic programming, and the word (k-tuple) method.
- Methods of multiple sequence alignment include exhaustive algorithms and heuristic algorithms.
- Applications of sequence alignment include identifying functional regions, understanding evolutionary relationships, and predicting protein structure and function.
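The scoring idea behind pairwise alignment by dynamic programming can be sketched in a few lines of Python. The code below computes a global alignment score (Needleman-Wunsch style, aligning both sequences end to end) and a local alignment score (Smith-Waterman style, best-scoring pair of subsequences). The score values for match, mismatch, and gap are arbitrary choices for this sketch, a simple linear gap penalty is assumed, and only scores are returned, not the alignments themselves.

```python
MATCH, MISMATCH, GAP = 1, -1, -2  # illustrative scoring scheme


def global_score(a, b):
    """Best score for aligning all of a against all of b."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * GAP            # leading gaps in b
    for j in range(1, cols):
        score[0][j] = j * GAP            # leading gaps in a
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (
                MATCH if a[i - 1] == b[j - 1] else MISMATCH
            )
            score[i][j] = max(diag, score[i - 1][j] + GAP, score[i][j - 1] + GAP)
    return score[-1][-1]


def local_score(a, b):
    """Best score over any pair of subsequences of a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (
                MATCH if a[i - 1] == b[j - 1] else MISMATCH
            )
            # Flooring at 0 lets an alignment restart anywhere.
            score[i][j] = max(0, diag, score[i - 1][j] + GAP, score[i][j - 1] + GAP)
            best = max(best, score[i][j])
    return best


seq1, seq2 = "TTTGATTACATTT", "CCCGATTACACCC"
print(global_score(seq1, seq2), local_score(seq1, seq2))
```

With these two sequences the local score rewards only the shared GATTACA core, while the global score is pulled down by the mismatched flanks, which is the practical difference between the two approaches.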
Description
Explore the features of KEGG Pathway Database, a valuable resource for researchers and students, including search function, pathway maps, and data links.