Questions and Answers
Which of the following best describes the term 'Molecular Function' in Gene Ontology?
- The location where a process occurs
- The specific biological role of a gene product (correct)
- The biological goal or objective at a higher level function
- The classification of proteins based on their structure
Cellular Component indicates the biological goal or objective of a gene product.
False (B)
What are the two most reliable sources for Gene Ontology annotations?
Papers and Experiments
In Gene Ontology, 'Orthologs' are homologues created by __________ and typically have the same function in different species.
What is one of the main uses of Gene Ontology (GO)?
Match the following types of protein homologs with their descriptions:
What does the term 'HMMs' refer to in the context of protein function prediction?
Human chymotrypsin and bovine chymotrypsin are examples of paralogs.
What is the identity percentage that PAM 250 was developed to model?
BLOSUM62 is derived from aligned sequences of protein families called BLOCKS.
What does the letter 'o' represent in the gap penalty formula?
The PAM250 matrix quantifies the odds that one residue is mutated from another based on the __________ of amino acid pair exchanging.
Match the following terms with their definitions:
What is one of the main advantages of BLOSUM62?
The Needleman-Wunsch algorithm is used for local alignment only.
What does the term 'indel' refer to in terms of sequence alignment?
What is the primary goal of the Needleman-Wunsch algorithm?
The Smith-Waterman algorithm focuses on global alignments for sequences.
What matrix is used for assigning similarity values in the alignment process?
The maximum match is the largest number of residues from one sequence that can be matched with another, allowing for all possible __________.
Match the alignment algorithms with their characteristics:
What is the main factor considered when determining pathways in the alignment matrix?
The maximum match will always be found in the outer row or column of the alignment matrix.
What is introduced when a gap is included in the alignment?
What is the order of protein structure hierarchy?
Glycine is less flexible than Alanine.
What type of bond does Cystine form?
The equation for free energy of folding is ΔG = ΔH - TΔS, where ΔG represents the ____.
Which amino acid is known for its rigidity due to its unique bonding?
Van der Waals interactions are energetically unfavorable for protein packing.
Name one type of non-bonded interaction in proteins.
What defines paralogues?
Match the following amino acids with their characteristics:
Paralogues can have entirely unrelated functions.
What percentage identity indicates a potential orthologue when searching proteins from another species?
Specific domain libraries can be searched via __________ at EBI.
Which of the following is true about the function transfer between orthologues?
Match the tools with their functionalities:
Automated predictions based on homology should rely solely on local matches.
What is a potential danger of automated predictions based on homology?
Which of the following is NOT one of the four most commonly annotated functions covered by flDPnn?
DisProt is recognized as a secondary database for intrinsically disordered proteins.
What is the primary goal of Clinical Phase III in drug development?
A small molecule identified through biological screening with a desired effect is called a __________.
Match the following clinical phases with their purpose:
What is the average participant range for Clinical Phase I?
AlphaFold2 pLDDT scores are used to predict disordered proteins.
What is a lead in drug discovery?
Flashcards
Primary Protein Structure
The linear sequence of amino acids in a protein chain, determined by the genetic code.
Secondary Protein Structure
Local structures within a protein chain, formed by hydrogen bonds between backbone atoms. Alpha-helices and beta-sheets are common examples.
Tertiary Protein Structure
The overall three-dimensional structure of a protein, arising from interactions between amino acid side chains.
Quaternary Protein Structure
Chirality
Free Energy of Folding (ΔG)
Hydrophobic Effect
Van der Waals Interactions
Odds Score
PAM 250
BLOSUM62
Gap Penalty
Needleman-Wunsch Algorithm
Protein Domain
Local Alignment
Global Alignment
Maximum Match
Alignment Matrix
Alignment Pathway
Similarity Value
Smith-Waterman Algorithm
Database Searching
Dynamic Programming
Molecular Function
Biological Process
Cellular Component
Evidence Code
Protein Homology Search
Orthologs
Paralogs
GO Group
Identity (ID) Percentage
Enzyme Commission (EC) Number
Function Transfer
Pfam
InterPro
Target
Hit
Lead
DisProt
AlphaFold2 pLDDT
Putative functions of IDRs
Clinical Development Phases
Regulatory Phase
Study Notes
Bioinformatics - Protein Structure
- Protein Structure Hierarchy: Primary -> Secondary -> Tertiary -> Quaternary
- Protein Backbone: The core structure of proteins, composed of repeating amino acid units.
- Amino Acid Chirality: The Cα atom is chiral. For the L-form, viewed with the hydrogen pointing towards the viewer, the CO, R and N groups read clockwise (the CORN rule).
- Amino Acid Residues: The amino acids differ in their abbreviations and side chain characteristics; Glycine is remarkably flexible compared to Alanine because it has no side chain (only a hydrogen atom).
- Proline: Has less backbone flexibility because its side chain is covalently bonded to the backbone amide nitrogen, which gives rigidity.
Protein Primary Structure
- Definition: The linear sequence of amino acid residues that comprise a protein.
- Main-chain: The unchanging portion of the protein structure.
- Side-chains: Variable amino acid side chains attached to the main chain.
- Chirality: The amino acid residues are generally L-form.
Protein Secondary Structure
- Alpha Helix: Most common arrangement in protein secondary structure; a right-handed helix held together by backbone hydrogen bonds between residue i and residue i+4.
- Beta Sheets: Formed from beta strands using hydrogen bonds; adjacent beta-strands can form antiparallel, parallel or mixed arrangements, with the antiparallel arrangement having the strongest inter-strand bonds.
Protein Tertiary Structure
- 3D Structure: Three-dimensional arrangement of a single protein chain with hydrophobic portions in the core.
- Stabilization: Stabilised by hydrophobic and electrostatic interactions.
- Resolution: Determined at near atomic resolution by X-ray crystallography, NMR and electron microscopy.
Protein Quaternary Structure
- Multiple Chains: Generally a symmetric arrangement of multiple protein chains.
- Protein Categories: Three categories of proteins: Transmembrane, Globular, and Fibrous.
Thermodynamics of Protein Folding
- ΔG = ΔH - TΔS: The Gibbs free energy of folding is the difference between the enthalpy term (ΔH) and the temperature-dependent entropy term (TΔS).
- ΔH: Changes in heat (e.g. electrostatics and packing).
- T: Temperature
- ΔS: Entropy, measures randomness/disorder in the system.
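As a rough illustration of the equation above, the sketch below evaluates ΔG at two temperatures. The function name `folding_free_energy` and all numeric values are hypothetical and chosen only to show the arithmetic; they are not measurements for any real protein.

```python
def folding_free_energy(delta_h, delta_s, temperature):
    """Return dG = dH - T*dS.

    delta_h:     enthalpy change in kJ/mol
    delta_s:     entropy change in kJ/(mol*K)
    temperature: absolute temperature in K
    """
    return delta_h - temperature * delta_s


# Hypothetical values, for illustration only.
delta_h = -200.0   # kJ/mol
delta_s = -0.6     # kJ/(mol*K)

dg_room = folding_free_energy(delta_h, delta_s, 298.0)  # about -21.2 kJ/mol
dg_hot = folding_free_energy(delta_h, delta_s, 350.0)   # about +10.0 kJ/mol

print(f"dG at 298 K: {dg_room:.1f} kJ/mol")
print(f"dG at 350 K: {dg_hot:.1f} kJ/mol")
```

With these made-up numbers ΔG is negative at 298 K (the folded state is favoured) and positive at 350 K, because the TΔS term grows with temperature.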
Protein Domains
- Structure, Unit, Evolution: Proteins are formed from functional parts known as domains; a domain is generally a distinct structural and evolutionary unit.
- Classification: Domains are classified into different fold classes (e.g. α/α, β/β, α/β).
Database information
- Protein Data Bank (PDB): A primary database comprising an extensive collection of protein structures.
- Secondary Databases (SCOP, CATH, ECOD): Databases focused on protein family classifications.
Sequence Alignment
- Pairwise Protein Sequence Alignment: Establishing similarity between two sequences using a scoring scheme.
- Scoring Scheme: The simplest scheme scores identical amino acids as 1 and different ones as 0; more advanced schemes use substitution matrices such as Point Accepted Mutation (PAM).
- Alignment Algorithms: The Needleman-Wunsch (global) and Smith-Waterman (local) algorithms are commonly used for sequence alignment.
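A minimal sketch of Needleman-Wunsch global alignment by dynamic programming is given here. It assumes the simple identity scoring from the notes (1 for a match, 0 for a mismatch) plus a linear gap penalty of -1 per gap position; the gap value and the function name `needleman_wunsch` are assumptions made for illustration, not parameters taken from the notes.

```python
def needleman_wunsch(a, b, match=1, mismatch=0, gap=-1):
    """Globally align sequences a and b; return (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + s,   # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,     # gap in b
                score[i][j - 1] + gap,     # gap in a
            )

    # Trace one optimal pathway back from the bottom-right corner.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            s = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + s:
                top.append(a[i - 1])
                bottom.append(b[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1])
            bottom.append("-")
            i -= 1
        else:
            top.append("-")
            bottom.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(top)), "".join(reversed(bottom))


result = needleman_wunsch("ACGT", "AGT")
print(result)  # (2, 'ACGT', 'A-GT')
```

Swapping the match/mismatch test for a lookup in a substitution matrix such as PAM 250 or BLOSUM62 would leave the recurrence itself unchanged.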
Database Searching
- BLAST: A widely used algorithm for finding similar sequences from databases.
- PSI-BLAST: An iterative extension of BLAST that builds a profile from multiple sequences for greater sensitivity.
- MMseqs2: A local search program 50x faster than Smith-Waterman.
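For reference, the exhaustive Smith-Waterman local alignment that faster search programs are measured against can be sketched as follows. The scoring values (match 2, mismatch -1, gap -2) and the function name `smith_waterman` are illustrative assumptions; real searches use a substitution matrix and tuned gap penalties.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best_score, aligned_a, aligned_b) for the best local alignment."""
    n, m = len(a), len(b)
    # h[i][j] is the best score of any local alignment ending at a[i-1], b[j-1].
    h = [[0] * (m + 1) for _ in range(n + 1)]
    best, best_i, best_j = 0, 0, 0

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            # The extra 0 lets an alignment restart anywhere (local, not global).
            h[i][j] = max(0,
                          h[i - 1][j - 1] + s,
                          h[i - 1][j] + gap,
                          h[i][j - 1] + gap)
            if h[i][j] > best:
                best, best_i, best_j = h[i][j], i, j

    # Trace back from the highest-scoring cell until a zero cell is reached.
    top, bottom = [], []
    i, j = best_i, best_j
    while h[i][j] > 0:
        s = match if a[i - 1] == b[j - 1] else mismatch
        if h[i][j] == h[i - 1][j - 1] + s:
            top.append(a[i - 1])
            bottom.append(b[j - 1])
            i, j = i - 1, j - 1
        elif h[i][j] == h[i - 1][j] + gap:
            top.append(a[i - 1])
            bottom.append("-")
            i -= 1
        else:
            top.append("-")
            bottom.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(top)), "".join(reversed(bottom))


local = smith_waterman("MKTWQELLAGG", "PPWQELLSS")
print(local)  # (10, 'WQELL', 'WQELL')
```

The full matrix costs time proportional to the product of the two sequence lengths, which is why database search tools rely on heuristics instead of running this against every entry.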
Protein Structure Prediction
- Reasons: To predict structure from sequence, to understand how the environment dictates structure, and to guide rational drug design, mutagenesis studies and analysis.
- Accuracy: Quantified using RMSD (Root Mean Square Deviation), which is useful for close structures, and the TM (Template Modelling) score.
- Template-based prediction: Uses similar known structures from database(s) as templates to model a new structure; often highly accurate. More generalized models for predicting proteins.
- Ab initio: predicting structure from scratch based on physico-chemical properties of protein.
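The RMSD measure mentioned above can be sketched as follows. This assumes the two structures have the same number of atoms in the same order and are already superposed; a real comparison would first find the optimal superposition, which this sketch omits. The function name `rmsd` and the coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length lists of (x, y, z)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical coordinates in angstroms: the model is the reference
# with every atom shifted by 1 angstrom along x.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
model = [(1.0, 0.0, 0.0), (2.5, 0.0, 0.0), (4.0, 0.0, 0.0)]

print(rmsd(reference, model))  # 1.0
```

Because every squared distance contributes to the mean, a few badly placed atoms can dominate the value, which is one reason RMSD is most informative for structures that are already close.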
Protein Docking
- Need: To predict the structure of a complex starting from the unbound components.
- Ab initio: The first approach; it generates many random pairings and evaluates the resulting complexes to find those most likely to fit together correctly.
- Computer programs: Used to perform docking calculations to identify and refine solutions
- Template approach: Uses known protein-protein complex structures as templates.
Protein Function
- Gene Ontology (GO): Controlled vocabulary to describe gene product functions.
- Types: Molecular Function, Biological Process, Cellular Component
- Approaches: Homology (searching) and structure-based predictions to infer function.
- Domains/Motifs: Use known function of conserved domains/motifs to predict the function of a new sequence.
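One way to picture GO annotations is as records that pair a gene product with a term, one of the three aspects, and an evidence code. The sketch below is an assumption-laden illustration: the `GOAnnotation` class and helper functions are hypothetical, and the evidence codes attached to each record are invented for the example rather than copied from any database.

```python
from dataclasses import dataclass

# GO evidence codes that denote experimental support.
EXPERIMENTAL_CODES = {"EXP", "IDA", "IPI", "IMP", "IGI", "IEP"}


@dataclass(frozen=True)
class GOAnnotation:
    gene_product: str
    go_id: str
    term: str
    aspect: str    # "molecular_function", "biological_process" or "cellular_component"
    evidence: str  # GO evidence code, e.g. "IDA" (direct assay) or "IEA" (electronic)


# Illustrative records only; the evidence codes here are made up.
annotations = [
    GOAnnotation("chymotrypsin", "GO:0004252",
                 "serine-type endopeptidase activity", "molecular_function", "IDA"),
    GOAnnotation("chymotrypsin", "GO:0006508",
                 "proteolysis", "biological_process", "IEA"),
    GOAnnotation("chymotrypsin", "GO:0005615",
                 "extracellular space", "cellular_component", "IEA"),
]


def by_aspect(records, aspect):
    """Return the term names annotated under one GO aspect."""
    return [r.term for r in records if r.aspect == aspect]


def experimentally_supported(records):
    """Keep only annotations backed by an experimental evidence code."""
    return [r for r in records if r.evidence in EXPERIMENTAL_CODES]


print(by_aspect(annotations, "molecular_function"))
print([r.go_id for r in experimentally_supported(annotations)])
```

Filtering on evidence codes like this is one simple way to separate experimentally backed annotations from those inferred automatically by homology.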