Bioinformatics and Genomic Databases Overview
23 Questions

Questions and Answers

What is a characteristic of first generation sequencing compared to the other generations?

  • It uses ligation and synthesis methods.
  • It generates longer sequence reads.
  • It is more expensive and time-consuming. (correct)
  • It does not involve electrophoresis.

Which generation of sequencing can generate long reads of sequences at a time?

  • Fourth generation sequencing
  • First generation sequencing
  • Third generation sequencing (correct)
  • Second generation sequencing

What is one primary benefit of third generation sequencing compared to second generation sequencing?

  • It is less costly and less time-consuming. (correct)
  • It is dependent on electrophoresis.
  • It requires more extensive annotation.
  • It involves more manual labor.

Which of the following is NOT a category of biological databases?

  • Data processing databases (correct)

Which database is recognized for storing sequences of proteins and nucleic acids?

  • NCBI databases (correct)

What type of additional information is typically stored along with sequences in NCBI databases?

  • Name of the species (correct)

Which of the following statements about structural databases is true?

  • They contain solved structures of transcripts and proteins. (correct)

Which of these databases is NOT primarily a sequence database?

  • Protein Structure Initiative (PSI) (correct)

What is the primary purpose of the Ensembl database?

  • To annotate high-quality draft genome assemblies (correct)

Which database is specifically focused on single nucleotide polymorphisms?

  • dbSNP (correct)

Which database does NOT provide genome assemblies but rather annotations?

  • Ensembl (correct)

What type of information does the 1000 Genomes database primarily catalog?

  • Human genetic variation (correct)

Which database integrates information specifically about essential genes?

  • DEG (correct)

Who developed the GenPept database?

  • National Center for Biotechnology Information (NCBI) (correct)

Which of the following databases contains data purely related to mitochondrial genomes?

  • MITOMAP (correct)

Which option describes a database characterized primarily for allele frequencies?

  • Allele Frequency Net Database (correct)

What is the primary function of the EMBL Nucleotide Sequence Database?

  • To acquire, store, and distribute DNA sequence data (correct)

Which of the following is NOT a component of the RefSeq database?

  • Genome assembly repositories (correct)

What role does the DNA Data Bank of Japan (DDBJ) primarily fulfill?

  • It collects DNA sequence data from diverse researchers worldwide. (correct)

Which database is primarily used in genomic and proteomic research for annotated sequences?

  • RefSeq (correct)

Which bioinformatics tools are provided by the European Bioinformatics Institute (EBI)?

  • Sequence homology searching and multiple sequence alignment tools (correct)

Which of the following statements accurately describes GenBank?

  • It is the most comprehensive and annotated collection of publicly available DNA sequences. (correct)

Which organization is primarily responsible for the maintenance of the EMBL Database?

  • European Bioinformatics Institute (EBI) (correct)

Flashcards

Third-generation sequencing

Sequencing technique that is less expensive and faster than first-generation methods. It bypasses electrophoresis, sequences single molecules, and produces long reads.

Biological databases

Organized collections of DNA, RNA, and protein information, freely available for research.

Sequence databases

A category of biological databases that stores DNA, RNA, and protein sequences.

Structure databases

Biological databases that store the shapes of transcripts and proteins.

Functional databases

Biological databases that describe the functions of DNA, RNA, and proteins.

NCBI databases

One of the largest sequence databases, containing DNA, RNA, protein sequence information and annotations.

EMBL Database

European Molecular Biology Laboratory (EMBL) database, storing DNA, RNA and protein sequences.

DDBJ Database

DNA Data Bank of Japan, a major database storing DNA, RNA, and protein sequences.

Ensembl database function

Ensembl annotates publicly available vertebrate genome assemblies.

Ensembl's role

It provides annotations to existing genome assemblies.

GenBank and DDBJ

Databases where genome assemblies are deposited.

Protein sequence databases

Annotated collections of protein sequences from various sources.

GenPept database

NCBI's protein sequence database.

dbSNP

Database of single nucleotide polymorphisms (variations in DNA).

RefSeq

NCBI's reference sequence database.

Allele Frequency Net Database (AFND)

Database for allele frequency information.

EMBL/DDBJ/GenBank

The most comprehensive, annotated collection of publicly available DNA sequences, combining DDBJ, EMBL, and GenBank.

EMBL Nucleotide Sequence Database

The primary nucleotide sequence resource maintained by the European Bioinformatics Institute.

DDBJ

The DNA Data Bank of Japan, which collects DNA sequence data from researchers in Japan and worldwide.

Reference Sequence

Annotated collection of publicly available nucleotide and protein sequences.

Ensembl

A genome browser for vertebrate genomes that assists in comparative genomic research.

Comparative genomics

Study of genomes from different species to understand evolutionary relationships.

Nucleotide Sequence Databases

Collections of DNA and RNA sequences, often publicly available, used for research and analysis.

Study Notes

Bioinformatics and Genomic Databases

  • Bioinformatics utilizes information technology to collect, store, retrieve, and analyze biological data like sequences and structures of proteins and nucleic acids.
  • Biological databases are organized into Sequence, Structure, and Functional categories.
  • The first database was created after insulin protein sequencing in 1956. Insulin contains 51 amino acid residues.

History of Biological Databases

  • 1965: Margaret Dayhoff created the Atlas of Protein Sequence and Structure.
  • 1980s: EMBL Data Library cataloged biological data.
  • 1982: GenBank was established.
  • 2002: High-throughput sequencing systems were developed and used to sequence the complete E. coli genome.
  • Present: Creation of directories for multi-omic data.

Types of Biological Databases

  • Bibliographic Databases: Contain research articles and papers from various journals, like PubMed.
  • Sequence Databases: Store protein and nucleotide sequences. Examples are GenBank, DDBJ, and PIR.
  • Structure Databases: Contain 3D structures of proteins and nucleic acids (PDB).
  • Taxonomic Databases: Provide information about Earth's species of animals, plants, and more. An example is the Catalogue of Life.
  • Metabolic Databases: Contain data on biological pathways (KEGG and MetaCyc).
  • Model Organism Databases: Contain extensive biological data on well-studied model organisms (FlyBase, RGD).
  • Chemical Databases: Contain data on small organic molecules (PubChem).
  • Microarray Databases: Store gene expression data from microarray experiments. An example is GEO.
  • Enzyme Databases: Contain information on enzyme structure and function (BRENDA).
  • Disease Databases: Collect disease-related information. An example is OMIM.

Sequence Data Generation

  • Sequencing plays a crucial role in biological data analysis.
  • Researchers now use in silico analysis as a first-line method in biomedical research, ahead of the more costly and time-consuming in vitro and in vivo methods.
  • Sanger sequencing is a first-generation method that uses dideoxynucleotides to halt chain extension.
  • Maxam-Gilbert sequencing is another first-generation method based on chemical degradation.
  • Second-generation sequencing (e.g., Illumina) generates millions of short reads in parallel and is less expensive and less time-consuming than first-generation methods. It uses synthesis methods instead of electrophoresis.
  • Third-generation sequencing (e.g., SMRT, Nanopore) uses single-molecule sequencing and produces longer reads but may have lower accuracy.
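
As a rough illustration of the chain-termination idea behind Sanger sequencing, the sketch below enumerates the fragments that dideoxynucleotides could terminate and then reads the sequence back by ordering the fragments by length, which is the job electrophoresis does physically. The function names and the toy template are hypothetical. The model assumes the template is written 3'→5' so that the new strand reads 5'→3', and it ignores primers, signal noise, and base-calling errors.

```python
# Toy model of Sanger (chain-termination) sequencing.
# All names are illustrative; this is not a real base-calling pipeline.

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def synthesized_strand(template: str) -> str:
    """Return the strand a polymerase would build on the template.

    Assumption: the template is written 3'->5', so the complement in the
    same left-to-right order is the new strand written 5'->3'.
    """
    return "".join(COMPLEMENT[base] for base in template)


def terminated_fragments(template: str) -> list[tuple[int, str]]:
    """Simulate a dideoxynucleotide halting extension at every position.

    Each fragment is recorded as (fragment length, terminal base).
    """
    new_strand = synthesized_strand(template)
    return [(i + 1, new_strand[i]) for i in range(len(new_strand))]


def read_by_size(fragments: list[tuple[int, str]]) -> str:
    """Order fragments by length and read off their terminal bases."""
    return "".join(base for _, base in sorted(fragments))


if __name__ == "__main__":
    fragments = terminated_fragments("TACGGT")
    # The order in which fragments are produced does not matter,
    # because sorting by size restores the sequence.
    print(read_by_size(list(reversed(fragments))))
```

Because every fragment length occurs exactly once in this toy model, sorting by length alone is enough to recover the sequence. Real reads degrade as fragments get longer.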

Nucleotide Sequence Databases

  • EMBL/DDBJ/GenBank: A primary nucleotide sequence resource, crucial for storing human genome sequence data.
  • EMBL-Bank: Maintained by the European Bioinformatics Institute (EBI).
  • DDBJ: DNA Data Bank of Japan.
  • RefSeq: A comprehensive and annotated collection of publicly available nucleotide and protein sequences.
    • Data is generated using various techniques depending on the sequence class and organism.
  • Databases use accession numbers to help in identifying and tracking particular sequences in the databases.
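
To make the role of accession numbers concrete, here is a minimal sketch that splits a versioned identifier into its accession and version parts and builds a retrieval URL for NCBI's E-utilities `efetch` endpoint. The helper names are hypothetical and the identifiers are used only as sample inputs. The sketch builds the URL but does not send a request, so check NCBI's usage guidelines before fetching anything.

```python
from urllib.parse import urlencode

# Base URL of the NCBI E-utilities efetch endpoint.
EFETCH_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def split_accession(identifier: str):
    """Split 'ACCESSION.VERSION' into (accession, version).

    The version is None when the identifier carries no '.N' suffix.
    """
    accession, _, version = identifier.partition(".")
    return accession, int(version) if version else None


def efetch_fasta_url(identifier: str, db: str = "nuccore") -> str:
    """Build (but do not request) a URL for a FASTA record.

    'nuccore' is the nucleotide database name used by E-utilities.
    """
    query = urlencode(
        {"db": db, "id": identifier, "rettype": "fasta", "retmode": "text"}
    )
    return f"{EFETCH_BASE}?{query}"


if __name__ == "__main__":
    print(split_accession("NM_000207.3"))
    print(efetch_fasta_url("NM_000207.3"))
```

Keeping the version separate from the accession matters because the accession identifies the record, while the version changes when the underlying sequence is updated.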

Protein Sequence Databases

  • UniProt: A comprehensive and freely accessible protein sequence and functional information database.
  • AlphaFold: A Google DeepMind AI that predicts protein 3D structures from the amino acid sequence.

Additional Information

  • The NCBI databases are at www.ncbi.nlm.nih.gov, the EMBL database is at https://www.ebi.ac.uk/, and DDBJ is at https://www.ddbj.nig.ac.jp/.
  • Ensembl is a genome browser for vertebrate genomes. It supports research on comparative genomics, sequence variation, and transcriptional regulation across publicly available vertebrate genome assemblies.

Description

Explore the evolution of bioinformatics and the various types of biological databases in this quiz. From the inception of the first database to modern multi-omic data directories, understand the significance and organization of biological data. Test your knowledge about key milestones and types of databases in the bioinformatics field.
