BIO4BI3 Genome Annotation Lecture 7

Questions and Answers

What is the primary purpose of a scoring function in exon selection?

To maximize a scoring function dependent on individual exons (correct)
To select random segments of DNA
To minimize the number of introns
To assign equal weights to all exons

Methionine is coded by multiple triplets in DNA.

False (B)

Name the four basic signals involved in defining an exon.

Translational start site, 5’ donor splice site, 3’ acceptor splice site, translational stop codon

A ___ is defined as an open reading frame (ORF) delimited by a 3’ acceptor site and a stop codon.

terminal exon

Match the type of exon with its definition:

Initial exon = ORFs delimited by a start site and a 5’ donor site
Internal exon = ORFs delimited by a 3’ acceptor site and a 5’ donor site
Terminal exon = ORFs delimited by a 3’ acceptor site and a stop codon
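
The three exon definitions above can be read as a lookup on the pair of signals that bound an ORF. The following is a minimal sketch of that idea; the function name `classify_exon` and the signal labels ("start", "acceptor", "donor", "stop") are hypothetical names chosen for illustration, not part of the lecture material.

```python
# Hypothetical lookup: (left boundary signal, right boundary signal) -> exon type.
# The three entries mirror the three definitions in the matching question.
EXON_TYPES = {
    ("start", "donor"): "initial",      # start site ... 5' donor site
    ("acceptor", "donor"): "internal",  # 3' acceptor site ... 5' donor site
    ("acceptor", "stop"): "terminal",   # 3' acceptor site ... stop codon
}


def classify_exon(left_signal: str, right_signal: str) -> str:
    """Return the exon type for a pair of boundary signals, or 'unknown'."""
    return EXON_TYPES.get((left_signal, right_signal), "unknown")


if __name__ == "__main__":
    for left, right in [("start", "donor"), ("acceptor", "donor"), ("acceptor", "stop")]:
        print(left, right, "->", classify_exon(left, right))
```

Any pair of signals not listed in the definitions is reported as "unknown" rather than guessed.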

Which of the following is a composition bias observed in organisms?

A preference for codons coding for a particular amino acid (C)

The frequency of nucleotide pairs can help differentiate between introns and exons.

True (A)
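
As a rough illustration of what "frequency of nucleotide pairs" means in practice, the sketch below counts overlapping dinucleotides in a sequence and converts the counts to relative frequencies. The function name `dinucleotide_frequencies` is an assumption made for this example; comparing such profiles between candidate regions is left to the reader.

```python
from collections import Counter


def dinucleotide_frequencies(seq: str) -> dict:
    """Relative frequency of each overlapping nucleotide pair in seq."""
    seq = seq.upper()
    # Overlapping pairs: positions (0,1), (1,2), (2,3), ...
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    if not pairs:
        return {}
    counts = Counter(pairs)
    total = len(pairs)
    return {pair: n / total for pair, n in counts.items()}


if __name__ == "__main__":
    print(dinucleotide_frequencies("ACGCG"))
```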

What upstream elements should be considered when scoring exons?

TATA box elements

What is one significant advantage of using deep learning over traditional rule-based methods in gene prediction?

It automatically discovers relevant patterns (B)

Deep learning models can accurately predict splice sites in eukaryotes.

True (A)

Name one example of software used for alternative splicing detection.

SpliceAI

Deep learning models require large, labeled datasets for effective _____.

training

Which technique allows pre-trained models on one genome to be adapted for other species?

Transfer Learning (D)

Match the following applications of deep learning with their corresponding focus:

DeepGene = Gene Prediction
SpliceAI = Alternative Splicing Detection
DeepEnhancer = Enhancer Identification
DeepCAPE = Promoter Identification

What process is used to automatically detect features during the training of a neural network?

Feature Extraction

Computational resources are not a challenge for training deep learning models.

False (B)

Which of the following is NOT a category in the Gene Ontology classification?

Cellular interaction (A)

Gene Ontology was initiated in 1999 with a focus on plant species.

False (B)

What is a slower but effective method for similarity-based gene prediction?

Exonerate (C)

What does the 'Biological process' category in Gene Ontology describe?

The function of the gene from a cell’s point of view.

Gene Ontology provides a common vocabulary to identify genes in __________.

pathways

NGS allows for the capture of alternative splicing events in a single run.

True (A)

What is one method used for identifying conserved regions across species?

Conservation of miRNA

Match the following databases with their focus:

KEGG = Molecular interactions and reaction networks
Reactome = Reactions and pathways in human biology
BLAST2GO = Functional annotation of genes
AMIGO = Gene ontology data access

The human genome contains approximately ______ miRNAs that control tens of thousands of genes.

1900

Which of the following is a reason to use Gene Ontology?

It identifies trends in differential gene expression. (C)

A gene can exist in only one category within Gene Ontology.

False (B)

Match the sequencing types to their advantages:

NGS = Captures the near transcriptional state of tissues
Long-Read Sequencing = Improves gene model accuracy by capturing full-length transcripts
Short-Read Sequencing = May miss complex regions and alternative splicing
cDNA = Requires tremendous effort and can be expensive

What main benefit does Long-Read Sequencing provide over Short-Read Sequencing?

Captures full-length transcripts (C)

What is the main advantage of using pathway databases like KEGG?

They help to visualize how genes function in biological pathways.

Many important chromosome regions are considered genic.

False (B)

What is one challenge of Short-Read Sequencing?

It can miss large, complex regions

What is the primary use of Hidden Markov Models (HMM) in gene finding?

To predict base positions in exons, introns, or intergenic regions (A)
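
To make the idea of labelling each base as exon, intron, or intergenic concrete, here is a toy sketch of Viterbi decoding on a three-state HMM. Every number in `START`, `TRANS`, and `EMIT` is an illustrative, made-up value, and the single-base emission model is a strong simplification; real gene finders use far richer state structures and trained parameters.

```python
import math

STATES = ("intergenic", "exon", "intron")

# Illustrative, made-up parameters (not trained on any genome).
START = {"intergenic": 0.8, "exon": 0.1, "intron": 0.1}
TRANS = {
    "intergenic": {"intergenic": 0.90, "exon": 0.10, "intron": 0.00},
    "exon": {"intergenic": 0.05, "exon": 0.85, "intron": 0.10},
    "intron": {"intergenic": 0.00, "exon": 0.10, "intron": 0.90},
}
EMIT = {
    "intergenic": {"A": 0.30, "C": 0.20, "G": 0.20, "T": 0.30},
    "exon": {"A": 0.20, "C": 0.30, "G": 0.30, "T": 0.20},
    "intron": {"A": 0.35, "C": 0.15, "G": 0.15, "T": 0.35},
}


def _log(p: float) -> float:
    """Natural log that maps probability 0 to -infinity instead of raising."""
    return math.log(p) if p > 0 else float("-inf")


def viterbi(seq: str) -> list:
    """Return the most probable state label for each base of seq."""
    if not seq:
        return []
    # score[t][s]: best log-probability of any path ending in state s at base t.
    score = [{s: _log(START[s]) + _log(EMIT[s][seq[0]]) for s in STATES}]
    back = []  # back[t-1][s]: best predecessor of state s at base t
    for base in seq[1:]:
        prev = score[-1]
        row, ptr = {}, {}
        for s in STATES:
            best = max(STATES, key=lambda p: prev[p] + _log(TRANS[p][s]))
            row[s] = prev[best] + _log(TRANS[best][s]) + _log(EMIT[s][base])
            ptr[s] = best
        score.append(row)
        back.append(ptr)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[-1][s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return path[::-1]


if __name__ == "__main__":
    print(viterbi("ATATGCGCGCGCATAT"))
```

The decoder works in log space so that long sequences do not underflow to zero probability.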

The frequency of codons is consistently the same across all organisms.

False (B)

What does the log odds ratio (LP(S)) represent in codon usage?

It compares the observed codon usage probability to the expected probability under a random model.

The probability of a codon sequence $C = C_1 C_2 \ldots C_m$ in a non-coding sequence is represented as $P_0(C) = F_0(C_1) F_0(C_2) \cdots F_0(C_m)$. Assuming the random model, $F_0(C_i)$ equals _____.

1/64
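
The log odds score from the previous two questions can be sketched as a sum over codons of log(F(c) / F0(c)), with F0(c) = 1/64 under the random model. In the sketch below, `codon_log_odds` and the `coding_freq` table are hypothetical names, and the frequencies in the usage example are made-up values for illustration only. The sketch assumes every codon in the sequence has a non-zero entry in the table.

```python
import math

F0 = 1 / 64  # random-model frequency of any single codon


def codon_log_odds(seq: str, coding_freq: dict) -> float:
    """Sum of log(F(c) / F0) over the in-frame codons of seq.

    coding_freq maps each codon to its (non-zero) frequency in coding
    sequences. A trailing partial codon is ignored.
    """
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    return sum(math.log(coding_freq[c] / F0) for c in codons)


if __name__ == "__main__":
    # Made-up frequencies, for illustration only.
    toy_freq = {"ATG": 2 / 64, "GCC": 2 / 64, "TAA": 1 / 64}
    print(codon_log_odds("ATGGCCTAA", toy_freq))
```

A positive score means the codons are used more often than the random model predicts, and a score near zero means the sequence looks like the random model.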

Match the following gene elements with their descriptions:

Promoter region = Region where transcription begins
Exons = Coding sequences in a gene
Introns = Non-coding sequences that are removed during RNA processing
Stop codons = Signal to terminate protein synthesis

Which of the following programs uses Hidden Markov Models for gene prediction?

GENSCAN (B)

Codons TCT and ACG occur more frequently than expected in coding sequences.

False (B)

What is a limitation of using BLAST programs for gene finding?

They do not define intron/exon boundaries well.

What is genome annotation?

Identifying locations of genes and assigning functions (B)

Functional annotation involves the identification of precise locations of genes and regulatory elements.

False (B)

Name the two types of genome annotation.

Structural Annotation and Functional Annotation

The system used for functional annotation that includes gene roles is called _____.

Gene Ontology

Match the following classes of gene prediction with their examples:

Ab initio = Using algorithms to predict gene locations from sequences
Homology-based = Finding genes based on similarities to known genes
RNA-seq = Identifying genes by analyzing RNA transcripts
Evidence-based = Utilizing experimental data to confirm gene predictions

Which of the following statements is true about the significance of genome annotation?

It acts as a translator for genomic sequences. (C)

MAKER is a tool used for structural annotation of genes.

True (A)

What role does SnpEff play in genome annotation?

It is used for SNP functional impact assessment.

Flashcards

Genome Annotation

The process of identifying genes and features in a genome and assigning functions to them.

Structural Annotation

Identifying the precise locations of genes, regulatory elements, and repetitive sequences.