Gene Annotation: Identifying Protein-Coding Genes

PleasingChaparral

38 Questions

What is the process of identifying protein-coding genes within DNA sequences in a database called?

Gene annotation

What is the primary goal of the ENCODE project?

To identify functionally important elements in the human genome

What is used to compare the sequence of a protein to the products of known genes from other organisms?

Software

What is the term for the signs that indicate the presence of genes, such as translational start and stop signals, RNA splicing sites, and promoter sequences?

Patterns

What is the method used to show that the relevant RNA is actually expressed from the proposed gene?

RNA-seq

What is the field of study that focuses on the organization, regulation, and evolution of genomes?

Genomics

What is the term for the study of genes directly using available DNA sequences?

Genetic analysis

What is the ultimate goal of identifying protein-coding genes and understanding their functions?

To gain insights into questions about genome regulation, development, and evolution

What percentage of the genome is transcribed at some point in at least one cell type studied?

75%

What is the primary focus of the Roadmap Epigenomics Project?

Characterizing the epigenetic features of the genome

What is the term for the entire set of proteins expressed by a cell or group of cells?

Proteome

What is the main goal of systems biology?

To study the functional integration of genes and proteins

What is the main limitation of the ENCODE project?

It only analyzed cells in culture

What is the term for the approach to studying large sets of proteins and their properties?

Proteomics

What is the name of the project that used sophisticated techniques to disable pairs of genes one pair at a time?

Yeast research project

What enabled the development of systems biology?

Advances in bioinformatics

What role has the extra genetic material gained as genomes grew over evolutionary time played?

It has provided raw material for gene diversification

What is the main characteristic of multigene families?

They consist of two or more identical or very similar genes

What role may some Alu elements play?

Helping to regulate gene expression

What is the basis of change at the genomic level?

Mutation

What is the characteristic of α-globins and β-globins?

They are polypeptides of hemoglobin coded by genes on different human chromosomes

What is the primary function of transposable elements?

To move from one location to another in a genome

What is the characteristic of Alu elements?

They are transcribed into RNA molecules and help regulate gene expression

What is the result of the increase in genome size over evolutionary time?

An increase in raw material for gene diversification

What is the result of accidents in meiosis?

Polyploidy, the presence of one or more extra sets of chromosomes

Why do humans have 23 pairs of chromosomes while chimpanzees have 24 pairs?

Because two ancestral chromosomes fused in the human line

What is indicated by large blocks of genes on human chromosome 16 being found on four mouse chromosomes?

That the genes in each block stayed together in both the human and mouse lineages

What is thought to have accelerated about 100 million years ago?

The rate of duplications and inversions

What are chromosomal rearrangements thought to contribute to?

The generation of new species

What is an example of a gene that evolved into a new function?

The gene that encodes α-lactalbumin

What is the function of α-lactalbumin in mammals?

It plays a role in milk production

What is the result of one copy of a duplicated gene undergoing alterations?

The evolution of a completely new function for the protein product

What can errors in meiosis do to an exon within a gene?

Duplicate the exon on one chromosome and delete it from the homologous chromosome

How is the current version of the gene for tissue plasminogen activator (TPA) thought to have arisen?

Through several instances of exon shuffling and subsequent duplication

What is the consequence of inserting transposable elements within a protein-coding sequence?

It can block protein production

What is the role of transposable elements in genome evolution?

Carrying a gene or groups of genes to a new position

What is the outcome of exon shuffling?

Mixing and matching of exons between two nonallelic genes

What is the general outcome of changes caused by transposable elements?

Usually detrimental to an organism

Study Notes

Identifying Protein-Coding Genes and Understanding Their Functions

  • Gene annotation is the process of identifying protein-coding genes within DNA sequences in a database.
  • Three lines of evidence are used to identify a gene:
    • Patterns in the DNA sequence indicating the presence of genes (e.g., translational start and stop signals, RNA splicing sites, promoter sequences).
    • Comparison of the sequence to known genes from other organisms.
    • RNA-seq or other methods to show that the relevant RNA is actually expressed from the proposed gene.
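
The first line of evidence, sequence patterns, can be illustrated with a minimal Python sketch that scans one DNA strand for open reading frames (a start codon followed in frame by a stop codon). The function name `find_orfs`, the `min_codons` threshold, and the toy sequence are assumptions made for illustration. Real annotation software also handles introns, splice sites, promoters, and the reverse strand, none of which is modeled here.

```python
# Toy ORF scan: looks only for ATG ... in-frame stop on the given strand.
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(dna, min_codons=3):
    """Return sorted (start, end) pairs for open reading frames.

    `start` is the index of the A in ATG; `end` is exclusive and includes
    the stop codon. `min_codons` counts the stop codon as well.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):                      # three reading frames
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i                       # translational start signal
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None                    # look for the next ORF
    return sorted(orfs)

toy = "CCATGAAATTTGGGTAACC"                     # hypothetical sequence
for start, end in find_orfs(toy):
    print(start, end, toy[start:end])
```

A candidate found this way is only a hypothesis. The other two lines of evidence (similarity to known genes and evidence that the RNA is expressed) are what support calling it a gene.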

Understanding Genes and Gene Expression at the Systems Level

  • Genomics provides insights into genome organization, regulation of gene expression, embryonic development, and evolution.
  • The ENCODE (Encyclopedia of DNA Elements) project (2003-2012) aimed to identify functionally important elements in the human genome.
  • The project extensively characterized histone and DNA modifications and chromatin structure, and it compared results from different projects.
  • About 75% of the genome is transcribed at some point in at least one cell type studied.
  • Biochemical functions have been assigned to DNA elements making up at least 80% of the genome.

Systems Biology

  • Proteomics is the study of large sets of proteins and their properties.
  • A proteome is the entire set of proteins expressed by a cell or group of cells.
  • Systems biology focuses on the functional integration of genes and proteins in biological systems.
  • The approach is possible due to advances in bioinformatics.

Transposable Elements and Related Sequences

  • Multiple copies of transposable elements and related sequences are scattered throughout eukaryotic genomes.
  • In humans and other primates, a large portion of transposable element-related DNA consists of Alu elements.
  • Many Alu elements are transcribed into RNA molecules, which may help regulate gene expression.

Genes and Multigene Families

  • Many eukaryotic genes are present in one copy per haploid set of chromosomes.
  • The rest of the genes occur in multigene families, collections of two or more identical or very similar genes.
  • Some multigene families consist of identical DNA sequences, usually clustered tandemly, such as those that code for rRNA products.

Evolution of Genomes

  • The basis of change at the genomic level is mutation, which underlies much of genome evolution.
  • Duplication, rearrangement, and mutation of DNA contribute to genome evolution.
  • Accidents in meiosis can lead to polyploidy, which can result in genes with novel functions.
  • Alterations of chromosome structure, such as fusions and inversions, can also contribute to genome evolution.

Evolution of Genes with Novel Functions

  • One copy of a duplicated gene can undergo alterations that lead to a completely new function for the protein product.
  • For example, after the lysozyme gene was duplicated, one copy evolved into the gene that encodes α-lactalbumin in mammals.

Rearrangements of Parts of Genes

  • Errors in meiosis can result in exon duplication or deletion.
  • Exon shuffling can lead to mixing and matching of exons, either within a gene or between two nonallelic genes.
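
One way a meiotic error can duplicate an exon on one chromosome and delete it from the homolog is a crossover between misaligned copies of the same gene. The Python sketch below models a gene as a list of exon labels. The function `unequal_crossover` and the breakpoint parameters are hypothetical names chosen for this illustration, and introns are ignored.

```python
def unequal_crossover(chrom_a, chrom_b, break_a, break_b):
    """Swap the tails of two homologs at different (misaligned) breakpoints.

    Each chromosome is a list of exon labels. When break_a != break_b,
    one product gains material and the other loses it.
    """
    recombinant_1 = chrom_a[:break_a] + chrom_b[break_b:]
    recombinant_2 = chrom_b[:break_b] + chrom_a[break_a:]
    return recombinant_1, recombinant_2

gene = ["exon1", "exon2", "exon3"]

# Misalignment: chromosome A breaks after exon2, chromosome B breaks before exon2.
duplicated, deleted = unequal_crossover(gene, gene, break_a=2, break_b=1)
print(duplicated)  # exon2 appears twice
print(deleted)     # exon2 is missing
```

With equal breakpoints the two products are identical to the parents, which is why the misalignment is the essential ingredient.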

How Transposable Elements Contribute to Genome Evolution

  • Multiple copies of similar transposable elements facilitate recombination, or crossing over, between different chromosomes.
  • Insertion of transposable elements within a protein-coding sequence can block protein production.
  • Insertion of transposable elements within a regulatory sequence can increase or decrease protein production.
  • Transposable elements can also carry a gene or groups of genes to a new position or create new sites for alternative splicing in an RNA transcript.
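
As a rough illustration of how an insertion inside a protein-coding sequence can block production of the normal protein, the Python sketch below translates a toy coding sequence before and after a short element is inserted. The sequences, the 7-nucleotide "element", and the helper names `translate` and `insert_element` are all made up for this example. Real transposable elements are far longer, and a premature stop codon is only one of several ways an insertion can disrupt a gene.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

def translate(cds):
    """Translate from position 0 until the first stop codon or the end."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        amino_acid = CODON_TABLE[cds[i:i + 3]]
        if amino_acid == "*":          # stop codon ends translation
            break
        protein.append(amino_acid)
    return "".join(protein)

def insert_element(cds, element, position):
    """Return the sequence with `element` inserted at `position`."""
    return cds[:position] + element + cds[position:]

cds = "ATGGCTGAAAAATGGTAA"             # toy gene: M A E K W, then stop
element = "TAGGCTA"                    # hypothetical element, not a real Alu

print(translate(cds))                              # intact protein
print(translate(insert_element(cds, element, 6)))  # truncated after insertion
```

Here the inserted sequence happens to introduce an in-frame stop codon, so translation ends after two amino acids. An insertion whose length is not a multiple of three would also shift the reading frame downstream.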

Learn about gene annotation, the process of identifying protein-coding genes within DNA sequences, and the three lines of evidence used to identify a gene.
