Gene Ontology and Protein Functions Quiz

Questions and Answers

Which of the following best describes the term 'Molecular Function' in Gene Ontology?

  • The location where a process occurs
  • The specific biological role of a gene product (correct)
  • The biological goal or objective at a higher level function
  • The classification of proteins based on their structure

Cellular Component indicates the biological goal or objective of a gene product.

False (B)

What are the two most reliable sources for Gene Ontology annotations?

Papers and Experiments

In Gene Ontology, 'Orthologs' are homologues created by __________ and typically have the same function in different species.

speciation

What is one of the main uses of Gene Ontology (GO)?

To enhance predictors of protein function (A)

Match the following types of protein homologs with their descriptions:

  • Orthologs = Homologs created by speciation with the same function in different species
  • Paralogs = Homologs created by gene duplication events within the same species

What does the term 'HMMs' refer to in the context of protein function prediction?

Hidden Markov Models

Human chymotrypsin and bovine chymotrypsin are examples of paralogs.

False (B)

What is the identity percentage that PAM 250 was developed to model?

20% (C)

BLOSUM62 is derived from aligned sequences of protein families called BLOCKS.

True (A)

What does the letter 'o' represent in the gap penalty formula?

gap opening constant

The PAM250 matrix quantifies the odds that one residue is mutated from another based on the __________ of amino acid pair exchanging.

observed probability

Match the following terms with their definitions:

  • PAM250 = Models sequences with 20% identity
  • BLOSUM62 = Derived from clustered sequences with >62% pairwise identity
  • Gap Penalty = Penalty for introducing gaps in alignment
  • Needleman-Wunsch Algorithm = Maximizes similarity score for global alignment

What is one of the main advantages of BLOSUM62?

It is more accurate and sensitive. (C)

The Needleman-Wunsch algorithm is used for local alignment only.

False (B)

What does the term 'indel' refer to in terms of sequence alignment?

insertions or deletions

What is the primary goal of the Needleman-Wunsch algorithm?

To find the maximum score alignment between two sequences (A)

The Smith-Waterman algorithm focuses on global alignments for sequences.

False (B)

What matrix is used for assigning similarity values in the alignment process?

BLOSUM62

The maximum match is the largest number of residues from one sequence that can be matched with another, allowing for all possible __________.

indels

Match the alignment algorithms with their characteristics:

  • Needleman-Wunsch = Global alignment of full sequences
  • Smith-Waterman = Local alignment focusing on segments
  • Dynamic Programming = Method to achieve optimal sequence alignment
  • BLOSUM62 = Matrix used for scoring residue similarities

What is the main factor considered when determining pathways in the alignment matrix?

The similarity values assigned to residues (A)

The maximum match will always be found in the outer row or column of the alignment matrix.

True (A)

What is introduced when a gap is included in the alignment?

Gap penalty

What is the order of protein structure hierarchy?

Primary - Secondary - Tertiary - Quaternary (A)

Glycine is less flexible than Alanine.

False (B)

What type of bond does Cystine form?

Disulphide bridge

The equation for free energy of folding is ΔG = ΔH - TΔS, where ΔG represents the ____.

Free energy of folding

Which amino acid is known for its rigidity due to its unique bonding?

Proline (A)

Van der Waals interactions are energetically unfavorable for protein packing.

False (B)

Name one type of non-bonded interaction in proteins.

Ionic Bond, Van der Waals Interactions, or Hydrogen Bond

What defines paralogues?

Homologues created by gene duplication within a species (C)

Match the following amino acids with their characteristics:

  • Glycine = Most flexible amino acid
  • Proline = Imparts rigidity
  • Cystine = Forms disulphide bridges
  • Alanine = Less flexible than Glycine

Paralogues can have entirely unrelated functions.

False (B)

What percentage identity indicates a potential orthologue when searching proteins from another species?

~50%

Specific domain libraries can be searched via __________ at EBI.

InterPro

Which of the following is true about the function transfer between orthologues?

It is safe to transfer function between orthologues. (A)

Match the tools with their functionalities:

  • Prosite = Short patterns for enzyme active sites
  • Profite = Extensive sequence profiles for function
  • Pfam = HMMs of domains for functional classification
  • InterPro = Comprehensive domain library search tool

Automated predictions based on homology should rely solely on local matches.

False (B)

What is a potential danger of automated predictions based on homology?

Inheriting function from a homologue with only one domain match

Which of the following is NOT one of the four most commonly annotated functions covered by flDPnn?

Carbohydrate-binding (A)

DisProt is recognized as a secondary database for intrinsically disordered proteins.

False (B)

What is the primary goal of Clinical Phase III in drug development?

To definitively prove effectiveness and evaluate safety.

A small molecule identified through biological screening with a desired effect is called a __________.

hit

Match the following clinical phases with their purpose:

  • Phase I = Determine a safe dosage and assess side effects
  • Phase II = Refine Phase I results and focus on initial effectiveness
  • Phase III = Prove effectiveness and evaluate safety

What is the average participant range for Clinical Phase I?

20-80 individuals (B)

AlphaFold2 pLDDT scores are used to predict disordered proteins.

True (A)

What is a lead in drug discovery?

A chemically optimized version of a hit.

Flashcards

Primary Protein Structure

The linear sequence of amino acids in a protein chain, determined by the genetic code.

Secondary Protein Structure

Local structures within a protein chain, formed by hydrogen bonds between backbone atoms. Alpha-helices and beta-sheets are common examples.

Tertiary Protein Structure

The overall three-dimensional structure of a protein, arising from interactions between amino acid side chains.

Quaternary Protein Structure

The structure formed when multiple protein chains (subunits) associate with each other.

Chirality

The property of a molecule, or of a stereocentre such as the Cα atom of an amino acid, of existing in two non-superimposable mirror-image forms, like left and right hands.

Free Energy of Folding (ΔG)

A measure of the energy change associated with folding a protein. Determines whether folding is favorable or not.

Hydrophobic Effect

The tendency of non-polar amino acid side chains to avoid contact with water, driving protein folding.

Van der Waals Interactions

Weak interactions between atoms that are close together, playing a crucial role in protein structure and stability.

Odds Score

A method to quantify the likelihood of one residue being mutated from another in aligned sequences, representing the odds of a residue resisting mutation.
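
As a rough illustration of an odds score, the sketch below compares the observed probability of a residue pair with the probability expected by chance and takes a scaled logarithm. The function name, the `10 * log10` scaling and the frequencies are assumptions made for this example, not values taken from the lesson or from a real PAM matrix.

```python
import math


def log_odds_score(p_ab: float, q_a: float, q_b: float, scale: float = 10.0) -> int:
    """Scaled, rounded log-odds score for aligning residues a and b.

    p_ab : observed probability of seeing a aligned with b (assumed input)
    q_a, q_b : background frequencies of a and b (assumed inputs)
    """
    odds = p_ab / (q_a * q_b)  # > 1 means the pair is seen more often than chance
    return round(scale * math.log10(odds))


# Made-up frequencies, for illustration only.
frequent_pair = log_odds_score(p_ab=0.02, q_a=0.05, q_b=0.05)   # odds = 8
rare_pair = log_odds_score(p_ab=0.001, q_a=0.05, q_b=0.08)      # odds = 0.25
print(frequent_pair, rare_pair)  # 9 -6
```

Positive scores mark exchanges seen more often than expected by chance; negative scores mark exchanges seen less often.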

PAM 250

A matrix that represents the chemical properties of amino acid substitutions, with favorable scores for hydrophobic residues that are often observed.

BLOSUM62

A matrix derived from aligned sequences of protein families called BLOCKS, considering conserved blocks and ignoring loops, to amplify the signal of evolution.

Gap Penalty

A penalty applied to gaps (insertions or deletions) in protein alignments, calculated with an opening penalty and an extension penalty.
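
The lesson's exact gap penalty formula is not reproduced here, so the following is only a sketch of one common affine form: an opening constant `o` charged once per gap plus an extension constant `e` for each additional gap position. The function name, the `(length - 1)` convention and the default constants are assumptions for illustration.

```python
def affine_gap_penalty(length: int, opening: float = 10.0, extension: float = 0.5) -> float:
    """Penalty for a single gap of `length` positions under an assumed affine model.

    The first gapped position costs `opening`; each further position costs `extension`.
    Some texts instead use opening + extension * length, so check the convention in use.
    """
    if length < 0:
        raise ValueError("gap length must be non-negative")
    if length == 0:
        return 0.0
    return opening + extension * (length - 1)


print(affine_gap_penalty(1))  # 10.0
print(affine_gap_penalty(3))  # 11.0
```

Under this model one long gap is cheaper than several short ones of the same total length, which matches the idea that a single indel event can insert or delete many residues.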

Needleman-Wunsch Algorithm

A common algorithm for sequence alignment that aims to find the best global alignment (matching entire sequences) by maximizing a similarity score.

Protein Domain

Parts of a protein sequence that are evolutionarily distinct and often correspond to different homologous families, acting as the basic unit of evolution.

Local Alignment

A specific type of sequence alignment that focuses on finding the most similar regions within two sequences, unlike global alignment which aligns the entire sequences.

Global Alignment

A type of sequence alignment that aligns the entire sequences, finding the optimal match across the entire length of the sequences.

Maximum Match

The largest number of residues from one sequence that can be matched with another, allowing for insertions and deletions (indels).

Alignment Matrix

A 2-dimensional array where each cell represents a possible pairing of residues between two sequences.

Alignment Pathway

A path through the alignment matrix, representing a possible alignment of two sequences. It reflects the insertion/deletion (indel) events chosen.

Similarity Value

A numerical value assigned to each cell in the alignment matrix, representing the similarity or dissimilarity of the two residues. A high value indicates a good match and a low value a poor one.

Smith-Waterman Algorithm

A dynamic programming method used to find the highest scoring local alignment between two sequences. It focuses on finding the best matching segments, regardless of overall length.
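
A minimal sketch of the local alignment idea follows. It uses a simple match/mismatch score and a linear gap cost rather than BLOSUM62 and an opening/extension penalty; those values and the function name are assumptions chosen to keep the example short.

```python
def smith_waterman_score(a: str, b: str, match: int = 2, mismatch: int = -1, gap: int = -2) -> int:
    """Best local alignment score between a and b (score only, no traceback)."""
    rows, cols = len(a) + 1, len(b) + 1
    h = [[0] * cols for _ in range(rows)]  # first row and column stay 0
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            s = match if a[i - 1] == b[j - 1] else mismatch
            h[i][j] = max(
                0,                     # restart: a local alignment may begin anywhere
                h[i - 1][j - 1] + s,   # align a[i-1] with b[j-1]
                h[i - 1][j] + gap,     # gap in b
                h[i][j - 1] + gap,     # gap in a
            )
            best = max(best, h[i][j])
    return best


# Only the shared "GGG" segment contributes: 3 matches x 2 = 6.
print(smith_waterman_score("AAAGGG", "TTGGGT"))  # 6
```

The `0` option in the recurrence is what makes the alignment local: a poorly scoring prefix is discarded instead of dragging down the best matching segment.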

Database Searching

A technique used to compare a query sequence against a database of sequences to find similar sequences. It utilizes alignment algorithms to identify matches and infer homology, structure, and function.

Dynamic Programming

A method that guarantees finding the highest scoring alignment for two given sequences using specific alignment scoring schemes. It also allows gaps, making it optimal for aligning sequences.

Molecular Function

The molecular-level activity of a gene product, such as binding to a specific molecule or catalyzing a chemical reaction.

Biological Process

The overall biological process that a gene product contributes to, encompassing a series of events with a specific outcome.

Cellular Component

The specific location within a cell where a gene product is active, such as a particular organelle or cellular compartment.

Evidence Code

Indicates the level of confidence in a GO annotation, based on the type of evidence used.

Protein Homology Search

A way to predict protein function by comparing a query sequence to a database of known protein sequences, identifying similar proteins likely sharing similar function.

Orthologs

Homologous proteins that arose from speciation events and typically retain the same function in different species.

Paralogs

Homologous proteins that arose from gene duplication events within a species, potentially gaining new or modified functions.

GO Group

A set of proteins or genes that share a common function or are involved in the same biological process, often identified based on their GO annotations.

Identity (ID) Percentage

The percentage of identical amino acids between two protein sequences.
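
Percentage identity can be computed from an existing alignment as sketched below. The choice of denominator is an assumption: this version counts only columns where neither sequence has a gap, whereas other conventions divide by the full alignment length or by the shorter sequence length.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of identical residues over ungapped columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only columns where both sequences have a residue.
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b) if x != gap and y != gap]
    if not columns:
        return 0.0
    identical = sum(x == y for x, y in columns)
    return 100.0 * identical / len(columns)


# 5 ungapped columns, 4 identical.
print(percent_identity("ACDE-G", "ACDQFG"))  # 80.0
```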

Enzyme Commission (EC) Number

A numerical classification system for enzymes based on their catalytic reaction, using four numbers separated by periods.

Function Transfer

The process of determining the function of a protein based on its similarity to proteins with known functions.

Pfam

A database that provides information about protein families and their domains.

InterPro

A tool that identifies functional motifs and profiles within protein sequences.

Target

A protein or molecule whose activity is modified to achieve therapeutic effects.

Hit

A small molecule identified through biological or computational screening with the desired effect (typically IC₅₀ ≤ 1 µM).

Lead

A chemically optimized version of a hit, designed to enhance therapeutic efficacy and minimize adverse effects, including toxicity and poor absorption.

DisProt

The gold standard database for intrinsically disordered proteins and regions, providing valuable information about their functions.

AlphaFold2 pLDDT

The predicted Local Distance Difference Test (pLDDT) score is a per-residue measure of confidence in a predicted protein structure; regions with low scores are often intrinsically disordered.
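
One simple way to use per-residue pLDDT values as a disorder indicator is to flag runs of consecutive low-scoring residues, as sketched below. The threshold of 50, the minimum run length and the function name are assumptions for illustration; a low score can also reflect a poorly predicted ordered region, so the output is a hint rather than a definitive disorder call.

```python
def low_confidence_regions(plddt, threshold=50.0, min_length=3):
    """Return (start, end) index pairs (0-based, inclusive) of runs with pLDDT < threshold."""
    regions = []
    start = None
    for i, score in enumerate(plddt):
        if score < threshold:
            if start is None:
                start = i  # a new low-confidence run begins
        else:
            if start is not None and i - start >= min_length:
                regions.append((start, i - 1))
            start = None
    # Close a run that extends to the end of the chain.
    if start is not None and len(plddt) - start >= min_length:
        regions.append((start, len(plddt) - 1))
    return regions


# Made-up scores: residues 2-4 and 8-10 form low runs; residue 6 alone is too short.
scores = [92, 88, 45, 40, 38, 90, 30, 85, 20, 25, 35]
print(low_confidence_regions(scores))  # [(2, 4), (8, 10)]
```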

Putative functions of IDRs

These are functions predicted for Intrinsically Disordered Regions (IDRs) based on their characteristics. They include protein-binding, DNA-binding, RNA-binding, and linkers.

Clinical Development Phases

Clinical Phase I, II, and III are the three key phases of drug development, involving testing on humans to assess safety and efficacy.

Regulatory Phase

This phase involves submitting data to regulatory bodies for approval of the drug's use and marketing.

Study Notes

Bioinformatics - Protein Structure

  • Protein Structure Hierarchy: Primary -> Secondary -> Tertiary -> Quaternary
  • Protein Backbone: The core structure of proteins, composed of repeating amino acid units.
  • Amino Acid Chirality: The Cα atom is chiral. In the L-form, viewed with the hydrogen on Cα pointing toward the viewer, the groups CO, R and N are arranged in clockwise order.
  • Amino Acid Residues: Different types of amino acids are listed with their abbreviations and side chain characteristics; Glycine is remarkably flexible compared to Alanine because its side chain is only a hydrogen atom.
  • Proline: Has less backbone flexibility because its side chain is covalently bonded to the backbone amide nitrogen, forming a ring that gives rigidity.

Protein Primary Structure

  • Definition: The linear sequence of amino acid residues that comprise a protein.
  • Main-chain: The repeating backbone that is the same for every residue, regardless of amino acid type.
  • Side-chains: Variable amino acid side chains attached to the main chain.
  • Chirality: The amino acid residues are generally L-form.

Protein Secondary Structure

  • Alpha Helix: Most common arrangement in protein secondary structure; a right-handed helix stabilised by hydrogen bonds between the backbone C=O of residue i and the backbone N-H of residue i+4.
  • Beta Sheets: Formed from beta strands using hydrogen bonds; adjacent beta-strands can form antiparallel, parallel or mixed arrangements, with the antiparallel arrangement having the strongest inter-strand bonds.

Protein Tertiary Structure

  • 3D Structure: Three-dimensional arrangement of a single protein chain with hydrophobic portions in the core.
  • Stabilization: Stabilised by hydrophobic interactions and electrostatic interactions.
  • Resolution: Determined at near atomic resolution by X-ray crystallography, NMR and electron microscopy.

Protein Quaternary Structure

  • Multiple Chains: Generally symmetric arrangement of multiple protein chains.
  • Protein Categories: Three categories of proteins: Transmembrane, Globular, and Fibrous.

Thermodynamics of Protein Folding

  • ΔG = ΔH - TΔS: The Gibbs free energy of folding is the enthalpy change (ΔH) minus the temperature-weighted entropy change (TΔS).
  • ΔH: Enthalpy change, i.e. the change in heat (e.g. from electrostatics and packing).
  • T: Absolute temperature.
  • ΔS: Entropy change; entropy measures randomness/disorder in the system.
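
The relation ΔG = ΔH - TΔS can be evaluated numerically as in the sketch below. The ΔH and ΔS values are made up for illustration and are not measurements from the lesson; the function name is likewise an assumption.

```python
def folding_free_energy(delta_h: float, temperature: float, delta_s: float) -> float:
    """Gibbs free energy of folding, ΔG = ΔH - TΔS.

    delta_h     : enthalpy change in kJ/mol
    temperature : absolute temperature in K
    delta_s     : entropy change in kJ/(mol*K)
    """
    return delta_h - temperature * delta_s


# Illustrative values only: folding releases heat (ΔH < 0) but orders the chain (ΔS < 0).
delta_h, delta_s = -200.0, -0.6

dg_298 = folding_free_energy(delta_h, 298.0, delta_s)  # about -21.2 kJ/mol
dg_350 = folding_free_energy(delta_h, 350.0, delta_s)  # about +10 kJ/mol

for t, dg in ((298.0, dg_298), (350.0, dg_350)):
    state = "favourable" if dg < 0 else "unfavourable"
    print(f"T = {t} K: ΔG = {dg:.1f} kJ/mol, folding is {state}")
```

With these numbers folding is favourable at 298 K (ΔG < 0) but not at 350 K, because the entropy term grows with temperature.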

Protein Domains

  • Structure, Unit, Evolution: Proteins are formed from functional parts known as domains; each domain is generally a distinct structural and evolutionary unit.
  • Classification: Domains are classified into different fold classes (e.g. α/α, β/β, α/β).

Database information

  • Protein Data Bank (PDB): A primary database comprising an extensive collection of protein structures.
  • Secondary Databases (SCOP, CATH, ECOD): Databases focused on protein family classifications.

Sequence Alignment

  • Pairwise Protein Sequence Alignment: Establishing similarity between two sequences using a scoring scheme.
  • Scoring Scheme: The simplest scheme scores identical amino acids as 1 and different ones as 0; more advanced schemes use substitution matrices such as Point Accepted Mutation (PAM).
  • Alignment Algorithms: The Needleman-Wunsch and Smith-Waterman algorithms are commonly used for sequence alignment.
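
The global alignment approach can be sketched with dynamic programming as follows. This is a minimal illustration that assumes a simple match/mismatch score and a linear gap cost instead of a PAM or BLOSUM matrix; the function name and scoring values are assumptions, and where several alignments tie only one is returned.

```python
def needleman_wunsch(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -1):
    """Global alignment of a and b; returns (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap  # a prefix of a aligned against gaps only
    for j in range(1, m + 1):
        score[0][j] = j * gap  # a prefix of b aligned against gaps only
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + s,   # residue vs residue
                              score[i - 1][j] + gap,     # gap in b
                              score[i][j - 1] + gap)     # gap in a
    # Trace back from the bottom-right cell to recover one optimal pathway.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            s = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + s:
                out_a.append(a[i - 1]); out_b.append(b[j - 1])
                i -= 1; j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


total, top, bottom = needleman_wunsch("ACGT", "AGT")
print(total)  # 2 (three matches, one gap)
print(top)
print(bottom)
```

Unlike a local alignment, every residue of both sequences appears in the result, so end gaps are penalised like any other gap.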

Database Searching

  • BLAST: A widely used algorithm for finding similar sequences from databases.
  • PSI-BLAST: An iterative extension of BLAST that builds a profile from multiple matched sequences for greater accuracy.
  • MMseqs2: A local search program 50x faster than Smith-Waterman.

Protein Structure Prediction

  • Reasons: To predict structure from sequence, to understand how the environment dictates structure, and to guide rational drug design, mutagenesis studies and analysis.
  • Accuracy: Quantified using RMSD (Root Mean Square Deviation), which is useful for closely similar structures, and the TM (Template Modelling) score.
  • Template-based prediction: Uses similar known structures from databases as templates to model a new structure; often highly accurate. More generalized models can also be used for predicting proteins.
  • Ab initio: Predicting structure from scratch based on the physico-chemical properties of the protein.
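
The RMSD accuracy measure mentioned above can be sketched as below. The example assumes the two structures have the same number of atoms in the same order and have already been superimposed; optimal superposition and the TM-score are outside the scope of this sketch, and the function name and coordinates are made up.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally sized lists of (x, y, z) points."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Made-up coordinates: the model is shifted by 1 unit along z for every atom.
native = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
model = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(native, model))  # 1.0
```

Because every atom contributes its squared deviation, a few badly placed regions can dominate the value, which is one reason RMSD is most informative for closely similar structures.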

Protein Docking

  • Need: To predict the structure of a complex starting from the unbound components.
  • Ab initio: The first approach; it generates many random pairings of the components and evaluates each candidate complex to find those most likely to fit together correctly.
  • Computer programs: Used to perform docking calculations to identify and refine solutions.
  • Template approach: Uses known protein-protein complex structures as templates.

Protein Function

  • Gene Ontology (GO): Controlled vocabulary to describe gene product functions.
  • Types: Molecular Function, Biological Process, Cellular Component
  • Approaches: Homology (searching) and structure-based predictions to infer function.
  • Domains/Motifs: Use known function of conserved domains/motifs to predict the function of a new sequence.
