KEGG Pathway Database

HalcyonWeasel

What is the primary function of the search feature in the KEGG Pathway database?

To search for specific pathways by name or category

What is the main advantage of the KEGG Pathway database for researchers and students?

It offers insights into the intricate world of biosynthesis and cellular metabolism

What is the source of the viral genomes in the ViruSITE database?

NCBI Reference Sequence Database (RefSeq)

What sets the ViruSITE database apart from other databases?

It integrates data from various resources under human supervision

What is the purpose of the pathway maps in the KEGG Pathway database?

To view interactive pathway maps with detailed information on molecules, enzymes, and genes

What is the primary focus of the ViruSITE database?

Viral genomics and proteomics

What is the primary goal of gene annotation?

To analyze the sequence and predict its potential functions

What is the benefit of gene annotation in terms of understanding biological processes and diseases?

It enables researchers to understand how genes contribute to biological processes and diseases

What step in the gene annotation process involves identifying open reading frames (ORFs) and predicting coding regions?

Gene Prediction

What is the purpose of accession numbers in gene annotation?

To provide a unique identifier for sequences in the database

What is the final output of the gene annotation process?

Comprehensive gene annotations

What is the purpose of the iterative improvement step in gene annotation?

To update and refine annotations

What is the primary purpose of using a primary database?

To access raw, uninterpreted experimental data for further analysis or verification

What type of database is GenBank?

Primary database

What is the main difference between a primary database and a secondary database?

The level of interpretation and annotation of the data

Which database is used to store 3D structures of proteins and nucleic acids?

PDB

What is the origin of the EMBL database?

Europe

What is the original format of GenBank's data?

Flat file format

What is the purpose of rich annotations in a database?

To enable searching, browsing, and analysis

What is the purpose of BankIt, Sequin, and tbl2asn tools?

To submit sequences to GenBank

What type of data is stored in the CoreNucleotide database?

Most of the nucleotide sequences

What is the main goal of the International Nucleotide Sequence Database Collaboration (INSDC)?

To coordinate the exchange of sequence data among its member databases (GenBank, EMBL, and DDBJ)

What is the characteristic of GenBank files?

Readable by both humans and computers

What is the current status of GenBank's database?

Open access

What is the main objective of pairwise sequence alignment?

To obtain the highest possible score, indicating the degree of similarity between two sequences

What type of sequence alignment is used to identify short conserved regions in protein or nucleotide sequences?

Local alignment

What is the difference between global and local alignment?

Global alignment aligns the entire length of the sequences, while local alignment aligns only the regions with the highest density of matches

What is the purpose of a scoring system in pairwise sequence alignment?

To assign positive scores to matching characters and negative scores to mismatching characters or gaps

What type of sequence alignment involves aligning two sequences to identify the optimal pairing of the sequences?

Pairwise sequence alignment

What is a characteristic of global alignment?

It aligns the entire length of the sequences by maximizing overall similarity

Study Notes

Primary Databases

  • GenBank is a repository of known nucleotide sequences with a flat file structure, readable by humans and computers.
  • GenBank files contain information such as accession numbers, gene names, phylogenetic classification, and references to published literature.
  • GenBank has transitioned from a flat file format to a more complex structure using XML and ASN.1 formats for improved manageability and data exchange.
  • GenBank is an open-access sequence database that coordinates with individual laboratories and other sequence databases like EMBL and DDBJ.
  • GenBank is an annotated collection of all nucleotide sequences available to the public.
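
To make "readable by humans and computers" concrete, here is a minimal sketch of reading one flat-file record in plain Python. The record itself, including the accession EXAMPLE01, is made up for illustration and is not a real GenBank entry. The function name parse_flat_record is hypothetical, and the parser assumes keywords sit in the first 12 columns and that the record ends with "//". Continuation lines and the feature table are ignored, so treat this as a study aid rather than a full parser.

```python
# Hypothetical, hand-written record in GenBank-like flat-file layout.
SAMPLE_RECORD = """\
LOCUS       EXAMPLE01                 24 bp    DNA     linear   SYN 01-JAN-2000
DEFINITION  Hypothetical example record for illustration only.
ACCESSION   EXAMPLE01
ORIGIN
        1 atggccattg taatgggccg ctga
//
"""


def parse_flat_record(text):
    """Collect top-level keyword fields and the sequence from one record."""
    fields = {}
    sequence_parts = []
    in_origin = False
    for line in text.splitlines():
        if line.startswith("//"):          # end-of-record marker
            break
        if in_origin:
            # Sequence lines: a position number, then blocks of bases.
            sequence_parts.extend(line.split()[1:])
            continue
        keyword = line[:12].strip()        # keyword column (assumed width 12)
        value = line[12:].strip()
        if keyword == "ORIGIN":
            in_origin = True
        elif keyword:                      # lines without a keyword are skipped
            fields[keyword] = value
    fields["sequence"] = "".join(sequence_parts).upper()
    return fields


record = parse_flat_record(SAMPLE_RECORD)
print(record["ACCESSION"], len(record["sequence"]))
```

The same text that a person can read line by line is split here into a dictionary keyed by field name, which is the sense in which the format serves both audiences.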

GenBank Structure

  • The nucleotide database is divided into three databases at NCBI: CoreNucleotide database, Expressed Sequence Tag (EST), and Genome Survey Sequence (GSS).
  • The CoreNucleotide database holds most of the nucleotide sequences in use: it contains every nucleotide record that is not in the EST or GSS databases.

Submission to GenBank

  • Sequences can be submitted to GenBank using BankIt, Sequin, and tbl2asn tools.

KEGG Pathway Database

  • The KEGG Pathway database offers three features for exploring biosynthesis pathways: a search function, pathway maps, and data links.
  • Pathway maps provide detailed information on molecules, enzymes, and genes involved in each step.
  • Data links access relevant databases for genes, proteins, and other molecules associated with the pathway.

ViruSITE Database

  • ViruSITE is a comprehensive database designed for viral genomics, integrating information on viral genomes, genes, and proteins from various sources.
  • ViruSITE incorporates all genomes of viruses, viroids, and satellites deposited in the NCBI Reference Sequence Database (RefSeq).
  • Data from resources such as NCBI RefSeq, UniProtKB, Gene Ontology (GO), ViralZone, and PubMed are computationally extracted and integrated under human supervision.
  • ViruSITE offers an intuitive, user-friendly interface for navigation and exploration.

Gene Annotation

  • Gene annotation involves analyzing the sequence and predicting its potential functions, adding valuable information to databases.
  • Benefits of gene annotation include identifying genes and their locations, predicting gene function, providing insights into gene regulation and expression, and understanding biological processes and diseases.

Sequence Analysis Pipeline

  • The pipeline involves pre-processing, gene prediction, functional annotation, structural annotation, integration and visualization, quality assessment, and database submission.
  • Each step involves specific tools and techniques, including quality control, sequence assembly, identifying ORFs, predicting coding regions, protein homology, domain prediction, GO annotation, and pathway mapping.
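
The ORF-identification part of gene prediction can be sketched in a few lines of Python. This is a simplified illustration, not the method of any particular annotation tool: it scans only the forward strand, treats ATG as the only start codon, and uses the three standard stop codons. The function name find_orfs and the min_codons parameter are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=2):
    """Return (start, end, frame) for ATG...stop ORFs on the forward strand.

    start is the index of the A in ATG, end is one past the stop codon,
    and min_codons is the minimum number of codons before the stop.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):                      # three forward reading frames
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i                       # open a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None                    # close it and keep scanning
    return orfs


print(find_orfs("ATGGCCATTGTAATGGGCCGCTGA"))
```

A fuller pipeline would also scan the reverse complement and pass each candidate on to the homology, domain, and GO annotation steps listed above.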

Accession Numbers

  • Accession numbers are unique identifiers that permanently identify sequences in the database.

Primary vs. Secondary Databases

  • Primary databases store raw, uninterpreted experimental data, while secondary databases provide a more comprehensive view of biological information with interpretations, annotations, and functionalities for exploring relationships and functions.
  • Examples of primary databases include GenBank, DDBJ, and EMBL, while secondary databases include the KEGG Pathway database and ViruSITE.

Types of Sequence Alignment

  • Pairwise sequence alignment involves aligning two sequences to identify the optimal pairing, based on a scoring system that assigns positive scores to matching characters and negative scores to mismatching characters or gaps.
  • Multiple sequence alignment involves aligning three or more sequences to identify conserved regions and reconstruct evolutionary relationships.
  • Methods of pairwise sequence alignment include the dot-matrix method, dynamic programming, and the word (k-tuple) method.
  • Methods of multiple sequence alignment include exhaustive algorithms and heuristic algorithms.
  • Applications of sequence alignment include identifying functional regions, understanding evolutionary relationships, and predicting protein structure and function.
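
The dynamic programming method and the global versus local distinction can be illustrated with a short Python sketch. It computes only the best score, not the alignment itself, and uses a linear gap penalty. The values match=1, mismatch=-1, and gap=-2 are arbitrary choices for illustration, and the function name alignment_score is hypothetical. With local=False the table is filled in the style of Needleman-Wunsch (global); with local=True, scores are floored at zero in the style of Smith-Waterman (local).

```python
def alignment_score(a, b, match=1, mismatch=-1, gap=-2, local=False):
    """Best pairwise alignment score of a and b by dynamic programming."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    if not local:
        # Global alignment: leading gaps are penalized.
        for i in range(1, rows):
            score[i][0] = i * gap
        for j in range(1, cols):
            score[0][j] = j * gap
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            cell = max(
                score[i - 1][j - 1] + pair,   # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,        # gap in b
                score[i][j - 1] + gap,        # gap in a
            )
            if local:
                cell = max(cell, 0)           # a local alignment may restart
                best = max(best, cell)
            score[i][j] = cell
    # Global: score of aligning both full sequences. Local: best cell anywhere.
    return best if local else score[-1][-1]


print(alignment_score("GATTACA", "GATTACA"))
print(alignment_score("TTTGATTACATTT", "CCCGATTACACCC", local=True))
```

In the second call the shared GATTACA core is rewarded while the unrelated flanking bases are left out of the score, which is why local alignment suits finding short conserved regions.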

Explore the features of the KEGG Pathway database, a valuable resource for researchers and students, including its search function, pathway maps, and data links.
