Primary Biological Databases Overview

Questions and Answers

What is the primary purpose of primary biological databases?

  • To contain primary sequence information and annotations (correct)
  • To serve as literature databases for bioinformatics
  • To store raw experimental data without annotations
  • To provide secondary analysis of biological data

Which of the following is NOT a primary nucleotide sequence database?

  • DDBJ
  • EMBL
  • UniProt (correct)
  • GenBank Database

What type of information do secondary biological databases primarily summarize?

  • Raw nucleotide sequences
  • Experimental data without analysis
  • Primary sequence information
  • Results from analyses of primary databases (correct)

Where does the journal Nucleic Acids Research primarily feature biological databases?

  • In an entire issue dedicated to databases each January (correct)

Why is a structured filing system necessary for biological data?

  • To ensure that data is accessible and used appropriately (correct)

What is one of the roles of primary biological databases according to the content provided?

  • To provide annotations for nucleotide sequences (correct)

Which of the following is considered a primary protein sequence database?

  • NCBI Protein Database (correct)

What type of databases often summarize common features derived from primary protein sequence analysis?

  • Secondary Biological Databases (correct)

What is the main purpose of the NCBI Entrez system?

  • To query all NCBI-associated databases (correct)

Which logical operators can be used to combine search terms in Entrez?

  • AND, OR, NOT (correct)

What format should be used to restrict search terms to specific database fields in Entrez?

  • search term [field-id] (correct)

What does the ID 'SLEN' represent when conducting a search in GenBank?

  • Sequence Length (correct)

Which feature does the advanced search in Entrez offer?

  • Automatic generation of search queries (correct)

What is the European counterpart of GenBank called?

  • European Nucleotide Archive (correct)

What is the purpose of using field IDs in Entrez searches?

  • To simplify and clarify data input (correct)

When performing a search for a sequence from Saccharomyces cerevisiae with a specified length, which syntax is correct?

  • (Saccharomyces cerevisiae) [ORGN] AND 3260:3270[SLEN] (correct)

What are the three main components of UniProt?

  • UniProtKB, UniRef, UniParc (correct)

Which realm contains automatically annotated sequences in UniProtKB?

  • TrEMBL (correct)

Why is SwissProt considered the gold standard of protein annotation?

  • It has undergone manual curation by specialists. (correct)

What does TrEMBL stand for?

  • Translated EMBL (correct)

How many entries does the SwissProt database contain approximately?

  • 564,000 (correct)

Which statement accurately describes the difference in quality between TrEMBL and SwissProt annotations?

  • TrEMBL annotations are of lower quality than SwissProt annotations. (correct)

What is a notable feature about the identifiers used in both UniProtKB and EMBL databases?

  • Most identifiers are identical. (correct)

Which one of the following statements is true about the SwissProt database's historical significance?

  • It is older than the UniProt database. (correct)

What is UniRef primarily known for?

  • Allowing for fast similarity searches. (correct)

Which version of UniRef allows searching for sequences that are 100% identical?

  • UniRef100 (correct)

What is a characteristic of the NCBI protein database?

  • It compiles entries from other protein sequence databases. (correct)

How can users initiate an advanced search in UniProtKB?

  • By selecting options from drop-down menus in a dedicated interface. (correct)

Which of the following correctly represents the similarity criteria for UniRef90?

  • Sequences that are ≥ 90% identical. (correct)

Which of the following databases is NOT included in the NCBI protein database?

  • UniProtKB (correct)

What does the structure of the NCBI protein database correspond to?

  • The format of GenBank. (correct)

Which of the following statements about UniRef50 is true?

  • It provides a less comprehensive search compared to UniRef100. (correct)

What is one primary reason why relational database systems have not gained acceptance in biological databases?

  • They have a complicated structure not suited for biological data. (correct)

Which of the following is NOT a primary nucleotide sequence database?

  • MySQL (correct)

What is a significant advantage of using ASCII text files for biological data?

  • They allow for easy data manipulation without expensive systems. (correct)

What drawback is associated with flat file databases like ASCII text files?

  • Searching within the data can be slow and labor-intensive. (correct)

Which organization is responsible for maintaining the GenBank database?

  • National Center for Biotechnology Information (correct)

Which of the following improvements can be associated with indexing flat file databases?

  • Accelerating keyword-based searches. (correct)

What purpose do systems that index flat file databases serve?

  • To assist with quicker access to keywords within the data. (correct)

As of December 2016, how many sequence entries were contained in the GenBank database?

  • Roughly 199 million. (correct)

Study Notes

Primary Biological Databases

  • Primary biological databases store primary sequence information, such as nucleotide and protein sequences, together with annotations
  • Primary databases are essential for scientific research, providing the foundation for further analysis and discovery

Secondary Biological Databases

  • Summarize and organize information from primary databases
  • Provide insights into protein function, structure, and interactions

Nucleic Acids Research Journal

  • Publishes an entire issue dedicated to databases each January, covering new and updated databases

Structured Filing Systems

  • Necessary for organizing, managing, and retrieving biological data due to the vast amount of information
  • Facilitate data searching, retrieval, and analysis

Roles of Primary Biological Databases

  • Store and distribute raw biological data, making it accessible to the scientific community
  • Enable researchers to conduct comparative analysis and identify patterns

Primary Protein Sequence Databases

  • The NCBI Protein database and UniProtKB are primary protein sequence databases

Databases Summarizing Protein Features

  • Secondary protein databases, like Pfam, summarize protein families and domains based on primary protein sequence analysis

NCBI Entrez System

  • Comprehensive search engine that allows querying multiple databases within the National Center for Biotechnology Information (NCBI)

Searching Terms in Entrez

  • Entrez supports logical operators like "AND", "OR", and "NOT" to refine searches
  • Use parentheses to group terms for complex searches; square brackets are reserved for field IDs
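
As a minimal sketch of how the operators combine, the hypothetical helpers below assemble a query string from AND, OR, and NOT, with parentheses for grouping. The function names and the search terms are illustrative assumptions; they are not part of any NCBI tool.

```python
def group(expr: str) -> str:
    """Wrap an expression in parentheses so it is treated as one unit."""
    return f"({expr})"


def combine(operator: str, *terms: str) -> str:
    """Join terms with one of the logical operators AND, OR, NOT."""
    if operator not in {"AND", "OR", "NOT"}:
        raise ValueError(f"unsupported operator: {operator}")
    return f" {operator} ".join(terms)


# Build: (human OR mouse) AND insulin NOT partial
either_species = group(combine("OR", "human", "mouse"))
wanted = combine("AND", either_species, "insulin")
query = combine("NOT", wanted, "partial")
print(query)
```

Grouping the OR clause first keeps it from being split apart when the AND and NOT terms are appended.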

Restricting Search Terms in Entrez

  • Use the "search term [field-id]" syntax to limit a search term to a specific database field
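
The `search term [field-id]` format can be sketched with a small helper. This is an illustrative assumption rather than an NCBI API: the function name `restrict` is hypothetical, it writes the field ID directly after the term, and it parenthesises multi-word terms to mirror the form used in the quiz answers.

```python
def restrict(term: str, field_id: str) -> str:
    """Attach a field ID in square brackets to a search term."""
    term = term.strip()
    if " " in term:
        # Mirror the quiz form: (Saccharomyces cerevisiae) [ORGN]
        term = f"({term})"
    return f"{term}[{field_id.upper()}]"


print(restrict("Saccharomyces cerevisiae", "orgn"))
print(restrict("3260:3270", "slen"))
```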

ID 'SLEN' in GenBank

  • Represents the sequence length

Advanced Search in Entrez

  • Offers a query builder that generates search queries automatically from the selected fields and terms

European Counterpart of GenBank

  • European Nucleotide Archive (ENA), which incorporates the former EMBL-Bank

Field IDs in Entrez Searches

  • Help specify the search criteria

Searching Sequences in Saccharomyces cerevisiae

  • The correct syntax is "(Saccharomyces cerevisiae) [ORGN] AND 3260:3270[SLEN]", where 3260:3270 is a range of sequence lengths
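
A hedged sketch of how such an organism-plus-length query might be assembled and turned into a search URL. The helper names are hypothetical. The use of the NCBI E-utilities `esearch.fcgi` endpoint with `db=nuccore` is an assumption that goes beyond these notes, which only describe the Entrez query syntax. The code builds the URL only and sends no request.

```python
from urllib.parse import urlencode

# Assumption: NCBI E-utilities ESearch endpoint (not mentioned in the notes).
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def length_query(organism: str, min_len: int, max_len: int) -> str:
    """Combine an organism restriction with a sequence-length range."""
    if min_len > max_len:
        raise ValueError("min_len must not exceed max_len")
    return f"({organism})[ORGN] AND {min_len}:{max_len}[SLEN]"


def esearch_url(query: str, db: str = "nuccore") -> str:
    """Percent-encode the query and attach it to the ESearch endpoint."""
    return f"{ESEARCH}?{urlencode({'db': db, 'term': query})}"


query = length_query("Saccharomyces cerevisiae", 3260, 3270)
print(query)
print(esearch_url(query))
```

The range `3260:3270` is what distinguishes a length restriction from a plain keyword: both bounds share the single `[SLEN]` field ID.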

UniProt Components

  • UniProtKB, UniRef, and UniParc

Automatically Annotated Sequences in UniProtKB

  • Found in the TrEMBL realm

SwissProt Considered the Gold Standard

  • Due to its high-quality manual annotations

TrEMBL - Stands for

  • Translated EMBL

Number of Entries in SwissProt

  • Approximately 560,000 (as of December 2016)

Annotation Quality Differentiation

  • SwissProt annotations are manually curated, ensuring higher quality compared to TrEMBL's automated annotations

Identifiers in UniProtKB and EMBL

  • Most identifiers are identical in the two databases

Historical Significance of SwissProt

  • SwissProt is older than the UniProt database, of which it is now a part

UniRef Focus

  • Provides clustered sets of protein sequences, reducing redundancy and allowing fast similarity searches

UniRef Version for 100% Identical Sequences

  • UniRef100

NCBI Protein Database Characteristic

  • Includes protein sequences from various sources

Advanced Search in UniProtKB

  • Started by selecting options from drop-down menus in a dedicated search interface

Similarity Criteria for UniRef90

  • Sequences that are ≥ 90% identical

Excluded Database from NCBI Protein Database

  • UniProtKB

NCBI Protein Database Structure

  • Corresponds to the format of GenBank

Fact About UniRef50

  • Clusters protein sequences that are ≥ 50% identical, giving a less comprehensive search than UniRef100
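
To illustrate the identity thresholds behind UniRef100, UniRef90, and UniRef50, here is a toy sketch. It assumes two already-aligned sequences of equal length and invented example peptides; the real UniRef clustering pipeline is more involved and is not reproduced here. The function names are hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of identical positions in two equal-length, pre-aligned sequences."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def uniref_levels(identity: float) -> list[str]:
    """Return the UniRef levels whose identity cutoff the pair meets."""
    thresholds = {"UniRef100": 100.0, "UniRef90": 90.0, "UniRef50": 50.0}
    return [name for name, cutoff in thresholds.items() if identity >= cutoff]


seq_a = "MKTAYIAKQR"  # invented example peptides
seq_b = "MKTAYIAKQK"  # differs from seq_a at the last position only
identity = percent_identity(seq_a, seq_b)
print(identity, uniref_levels(identity))
```

A pair that is 90% identical meets the UniRef90 and UniRef50 cutoffs but not UniRef100, which is why the lower thresholds produce fewer, broader clusters.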

Relational Database Systems in Biological Databases

  • Haven't gained widespread acceptance due to the complexity and large scale of biological data

Non-Primary Nucleotide Sequence Database

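  • UniProt (a protein sequence database) and MySQL (a database management system); the primary nucleotide sequence databases are GenBank, EMBL, and DDBJ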

ASCII Text Files for Biological Data Advantage

  • Simple to read and easy to manipulate without expensive database systems

Drawback of Flat File Databases

  • Retrieval and analysis can be slow and inefficient
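
To make the drawback concrete, here is a minimal sketch of a keyword search over a flat text file. The layout (two-letter line codes and a `//` record terminator) is loosely modelled on sequence flat files, but the records and identifiers are invented for illustration. Every search has to read every record.

```python
FLAT_FILE = """\
ID   REC001
DE   yeast hexokinase
//
ID   REC002
DE   human insulin precursor
//
ID   REC003
DE   yeast alcohol dehydrogenase
//
"""


def parse_records(text: str) -> list[dict[str, str]]:
    """Split the flat file into records, one dict of line code -> value each."""
    records, current = [], {}
    for line in text.splitlines():
        if line.strip() == "//":  # end of record
            if current:
                records.append(current)
            current = {}
        elif line.strip():
            current[line[:2]] = line[2:].strip()
    return records


def linear_search(records: list[dict[str, str]], keyword: str) -> list[str]:
    """Scan every record's description; cost grows with the file size."""
    keyword = keyword.lower()
    return [r["ID"] for r in records if keyword in r.get("DE", "").lower()]


records = parse_records(FLAT_FILE)
print(linear_search(records, "yeast"))
```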

Organization Maintaining GenBank

  • NCBI

Indexing Flat File Databases Improvement

  • Enables faster data retrieval and analysis

Purpose of Indexing Systems

  • To create a searchable index for fast data access
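
A keyword index of this kind can be sketched as an inverted index that maps each word to the records containing it. The record identifiers and descriptions below are invented, and this is only an assumption about how such an index might look, not a description of any particular indexing system.

```python
from collections import defaultdict

# Invented records: identifier -> free-text description.
RECORDS = {
    "REC001": "yeast hexokinase",
    "REC002": "human insulin precursor",
    "REC003": "yeast alcohol dehydrogenase",
}


def build_index(records: dict[str, str]) -> dict[str, set[str]]:
    """Map every lower-cased word to the set of record IDs that contain it."""
    index: dict[str, set[str]] = defaultdict(set)
    for rec_id, description in records.items():
        for word in description.lower().split():
            index[word].add(rec_id)
    return index


index = build_index(RECORDS)
# One dictionary lookup replaces a scan over all records.
print(sorted(index["yeast"]))
```

The index is built once up front; after that, each keyword lookup no longer depends on reading the whole file.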

Number of Sequence Entries in GenBank

  • Roughly 199 million (as of December 2016)
