Bioinformatics lecture 3+4 Bi4999en

Questions and Answers

Protein synthesis is a process that occurs in three steps: Transcription, Splicing, and Translation.

False (B)

UniProtKB is a central repository of protein sequences and is supported by a collaboration between EBI, the Swiss Institute of Bioinformatics, and the Protein Information Resource (PIR).

True (A)

Post-translational modifications refer to the changes that occur to proteins after they have been synthesized, transforming them into mature proteins.

True (A)

Motifs or profiles databases contain exhaustive primary sequences of proteins with no abstractions or patterns.

False (B)

Generalist databases, such as UniProtKB, only include sequences from very specifically defined sources.

False (B)

There is only one type of protein sequence database available for researchers.

False (B)

The quality level of annotation in databases like UniProtKB can vary between manual and automatic entry.

True (A)

The primary sequence of proteins is unrelated to annotations and cross-references in databases.

False (B)

PAM matrices represent evolutionary information based on distant protein relationships.

False (B)

BLOSUM62 is derived from sequences clustered at 62% identity or greater.

True (A)

Higher numbers in BLOSUM matrices indicate more evolutionary distance between sequences.

False (B)

PAM250 corresponds to a residue identity of 45% between proteins.

True (A)

BLOSUM1 corresponds to 1% identity and evaluates highly diverse protein alignments.

True (A)

PAM1 corresponds to a residue identity of 99%.

True (A)

The BLOSUM matrices are derived from individual sequences without any clustering.

False (B)

PAM matrices are extrapolated from PAM1 to represent various evolutionary distances.

True (A)
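
The extrapolation from PAM1 can be pictured as multiplying a one-step mutation probability matrix by itself. The block below is a minimal sketch under stated assumptions: `TOY_PAM1` is an invented 3-letter matrix (not Dayhoff's 20x20 data), and the helper names `mat_mul` and `mat_pow` are hypothetical.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_pow(m, power):
    """Raise a square matrix to a non-negative integer power by repeated squaring."""
    n = len(m)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    base = [row[:] for row in m]
    while power > 0:
        if power & 1:
            result = mat_mul(result, base)
        base = mat_mul(base, base)
        power >>= 1
    return result


# Invented one-step matrix for a toy 3-letter alphabet (assumption, not real data).
# Each row sums to 1; about 1% of residues change per step, mimicking 1 PAM.
TOY_PAM1 = [
    [0.990, 0.006, 0.004],
    [0.006, 0.990, 0.004],
    [0.004, 0.004, 0.992],
]

# 250 steps of the same process, analogous to deriving PAM250 from PAM1.
toy_pam250 = mat_pow(TOY_PAM1, 250)
```

The matrices used for scoring alignments are log-odds scores derived from such probabilities, a step this sketch leaves out.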

Introducing a gap in sequence alignment results in a negative score penalty.

True (A)

The identity matrix for protein similarity uses a score of 1 for different amino acids.

False (B)
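
As a rough illustration of identity scoring combined with a gap penalty, the sketch below scores two already-aligned rows. The function names and the gap penalty of -2 are assumptions chosen for the example, not values from the lecture.

```python
def identity_score(a, b):
    """Identity matrix: 1 for identical amino acids, 0 for different ones."""
    return 1 if a == b else 0


def score_alignment(row1, row2, gap_penalty=-2):
    """Sum the column scores of two aligned rows of equal length ('-' marks a gap)."""
    if len(row1) != len(row2):
        raise ValueError("aligned rows must have the same length")
    total = 0
    for x, y in zip(row1, row2):
        if x == "-" or y == "-":
            total += gap_penalty      # gaps carry a negative score
        else:
            total += identity_score(x, y)
    return total
```

For example, `score_alignment("ACD-", "ACEF")` adds 1 + 1 + 0 for the three residue pairs and -2 for the gap, giving 0.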

Substitution models evaluate the likelihood of one specific amino acid replacing another during mutation.

True (A)

1 PAM is defined as the time it takes for 1 out of 100 amino acids to mutate.

True (A)

The Dayhoff Mutation Data Matrix is based on inferred evolutionary distances derived from genome sequencing.

False (B)

The PAM matrix product allows for inference of homology in proteins beyond the twilight zone.

False (B)

Gaps introduced in sequence alignments are beneficial as they eliminate the need for substitution models.

False (B)

Scores in substitution models are based exclusively on the identity of the amino acids involved.

False (B)

PDB format is advantageous because it is rarely supported by the majority of tools.

False (B)

A significant disadvantage of the PDB format is the absolute limits on the size of certain items of data.

True (A)

The mmCIF format was developed to simplify the handling of complicated structure data.

False (B)

One disadvantage of the mmCIF format is that it is easily readable by humans and computers.

False (B)

A notable feature of PDB format is its consistency across individual entries.

False (B)

The mmCIF format is more suitable for accessing individual entries compared to the PDB format.

False (B)

Hydrogen bonding and active sites are part of the data captured in the PDB format.

True (A)

The maximum number of chains allowed in the PDB format is over 30.

False (B)

R-factor should always be ≤ 0.4 for reliable models.

False (B)

DRESS and RECOORD web servers provide improved versions of NMR models.

True (A)

Local errors in a structure are indicated by residue B-factors < 50.

False (B)

Predictions of atomic resolution in NMR structures can be made using the ResProx tool.

True (A)

No guidelines exist for selecting NMR structures unlike X-ray structures.

True (A)

Quality checks involve only comparisons against high-resolution structures of nucleic acids.

False (B)

A structure showing a high number of outliers is likely to be problematic.

True (A)

B-factor values are irrelevant for assessing the reliability of a structure.

False (B)

The Ramachandran plot is used to check the stereochemical quality of protein structures by plotting the Ψ versus the Φ main chain torsion angles.

True (A)
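
Each point on a Ramachandran plot is a pair of main chain torsion angles, and a torsion angle is defined by four consecutive atoms. The sketch below computes such an angle from four 3D points; the helper names are hypothetical and the coordinates in the usage note are made up rather than taken from a real structure.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Torsion angle in degrees (-180..180) around the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)
    # Project the outer bonds onto the plane perpendicular to the central bond.
    v = tuple(x - _dot(b0, b1) * y for x, y in zip(b0, b1))
    w = tuple(x - _dot(b2, b1) * y for x, y in zip(b2, b1))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))


# For residue i of a protein:
#   phi = dihedral(C of residue i-1, N, CA, C)
#   psi = dihedral(N, CA, C, N of residue i+1)
```

With the made-up points `(1, 0, 0)`, `(0, 0, 0)`, `(0, 0, 1)`, `(0, 1, 1)` the function returns a quarter turn; plotting Ψ against Φ for every residue gives the Ramachandran plot.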

In a well-defined protein structure, residues are typically dispersed in the 'disallowed' regions of the Ramachandran plot.

False (B)

Bad atom-atom contacts in protein structures are defined as two nonbonded atoms that have a center-to-center distance greater than the sum of their van der Waals radii.

False (B)

Counts of unsatisfied hydrogen bond donors are a parameter evaluated in validating protein structures.

True (A)

A real space R-factor is used to express how poorly each residue fits its electron density in a protein structure.

False (B)

Knowledge-based potentials assess how 'happy' each residue is in its local environment according to predefined criteria.

True (A)

The databases EDS and PDBREPORT provide pre-computed quality criteria for every structure in the Protein Data Bank (PDB).

True (A)

Poorly defined protein structures generally show residues clustered tightly in the most favored regions of the Ramachandran plot.

False (B)

Protein synthesis includes four steps: Transcription, Splicing, Translation, and Elimination.

False (B)

Bioinformatics relies on databases that can provide sequences from any source, such as UniProtKB.

True (A)

PAM and BLOSUM matrices are interchangeable for evaluating amino acid substitutions across all evolutionary distances.

False (B)

The dynamic programming algorithm used for sequence alignments is optimized for both global and local alignments.

False (B)

Post-translational modifications occur before protein synthesis is completed, altering proteins into their mature forms.

False (B)

Transmembrane beta-strand barrels (TMB) typically contain 10 - 30 residues.

False (B)

UniProtKB annotations may vary in quality depending on whether they are created manually or automatically.

True (A)

Word-based methods for sequence alignments guarantee optimal alignments each time they are applied.

False (B)

In the context of multiple sequence alignments, progressive methods begin by aligning the least similar sequences first.

False (B)

Low B-factor values indicate that residues are likely stable (well ordered) in their local environment.

True (A)

The positive-inside rule indicates that positively charged residues are more prevalent in loop regions outside the membrane.

False (B)

Diagonal transitions in the dynamic programming matrix represent gaps in the sequence alignment.

False (B)

Motif databases derive information solely from full primary sequences without abstract representation.

False (B)

PDB format is a flexible format that allows variable lengths for its entries.

False (B)

The Ramachandran plot illustrates the steric arrangement of amino acid residues based on the angles of the main chain torsion.

True (A)

Hydrophobicity analysis is particularly useful for predicting transmembrane beta-strand barrels.

False (B)
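
To make "hydrophobicity analysis" concrete, here is a minimal sliding-window sketch. It assumes the Kyte-Doolittle hydropathy scale and a 19-residue window, neither of which is named in the quiz; the function name `hydropathy_profile` is hypothetical.

```python
# Kyte-Doolittle hydropathy values (assumed scale for this sketch).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=19):
    """Average hydropathy of every full window along the sequence."""
    scores = [KD[aa] for aa in seq]
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]
```

A profile like this tends to flag long hydrophobic stretches such as membrane-spanning helices. The short strands of beta barrels are much harder to see this way, which is consistent with the answer above.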

The final alignment in dynamic programming corresponds to the path in the matrix that minimizes the score.

False (B)

Methods for solubility and expressability prediction do not rely on machine learning techniques.

False (B)

Back-tracing in sequence alignment starts from the top-left corner of the scoring matrix.

False (B)
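
The dynamic programming questions above can be tied together with a small global alignment sketch in the style of Needleman and Wunsch. It fills the score matrix, keeps the maximum at each cell, and traces back from the bottom-right corner. The scoring values (match 1, mismatch -1, gap -2) and the function name are assumptions for illustration only.

```python
def needleman_wunsch(s, t, match=1, mismatch=-1, gap=-2):
    """Global alignment; returns (score, aligned s, aligned t)."""
    n, m = len(s), len(t)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap          # leading gaps in t
    for j in range(1, m + 1):
        score[0][j] = j * gap          # leading gaps in s
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if s[i - 1] == t[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,   # diagonal: residue pair
                              score[i - 1][j] + gap,       # vertical: gap in t
                              score[i][j - 1] + gap)       # horizontal: gap in s
    # Trace back from the bottom-right corner to the top-left corner.
    i, j = n, m
    a, b = [], []
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and s[i - 1] == t[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            a.append(s[i - 1])
            b.append(t[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            a.append(s[i - 1])
            b.append("-")
            i -= 1
        else:
            a.append("-")
            b.append(t[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(a)), "".join(reversed(b))
```

Diagonal steps pair two residues, while vertical and horizontal steps introduce gaps; the reported alignment is the path with the highest total score.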

The mmCIF format is specifically designed to complicate the handling of structure data.

False (B)

Gaps in sequence alignments are always beneficial as they improve alignment scores.

False (B)

The substitution model scores are based solely on the identity of the corresponding amino acids.

False (B)

The PDB format is the least supported format for 3D structure data representation.

False (B)

ResProx tool is used to make predictions about atomic resolution in NMR structures.

True (A)

Using an identity matrix, a score of 1 is assigned when two different amino acids are present.

False (B)

The Dayhoff Mutation Data Matrix is based on a large sample of observed mutations for estimating evolutionary distances.

True (A)

A gap in sequence alignment is treated as a positive score penalty to encourage shorter alignments.

False (B)

Evolutionary distance in PAM is measured as the time for 1 out of 100 amino acids to remain unchanged.

False (B)

The PAM250 matrix represents a scenario where the proteins considered have approximately 45% residue identity.

True (A)

Substitution models assess the probability of observing mutations without considering evolutionary relations.

False (B)

The introduction of more gaps in sequence alignment can enhance the accuracy of biologically meaningful alignments.

True (A)

A Markov chain model is utilized to derive the PAM matrix product, which helps infer protein homology.

True (A)

The maximum number of atom records in a PDB file is limited to 99,999.

True (A)
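
The size limits follow from the fixed-column layout of PDB records: the atom serial number occupies five columns and the chain identifier a single column. The sketch below slices one ATOM line by column position. The function name `parse_atom_record` is hypothetical and the sample record is made up for illustration.

```python
def parse_atom_record(line):
    """Slice one ATOM/HETATM line by its fixed column positions."""
    return {
        "serial": int(line[6:11]),        # columns 7-11: five digits, so at most 99,999
        "atom_name": line[12:16].strip(),
        "res_name": line[17:20].strip(),
        "chain_id": line[21],             # column 22: one character per chain
        "res_seq": int(line[22:26]),
        "x": float(line[30:38]),
        "y": float(line[38:46]),
        "z": float(line[46:54]),
        "occupancy": float(line[54:60]),
        "b_factor": float(line[60:66]),
    }


# Made-up record, built with a format string so every field lands in its column.
sample = (
    "{:<6}{:>5} {:<4}{:1}{:>3} {:1}{:>4}{:1}   "
    "{:8.3f}{:8.3f}{:8.3f}{:6.2f}{:6.2f}"
).format("ATOM", 1, " N", "", "MET", "A", 1, "", 27.340, 24.430, 2.614, 1.00, 9.67)

atom = parse_atom_record(sample)
```

Because each field has a fixed width, a value that needs more characters than its columns allow simply cannot be written in this format.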

The mmCIF format is rarely supported by visualization and computational tools.

True (A)

PDB format is deemed suitable for computer extraction of information due to its consistency.

False (B)

Each field of information in the mmCIF format is linked to other fields using a designated syntax.

True (A)

PDB format allows for a maximum of 30 chains in a single file.

False (B)

Inconsistencies within a single PDB entry include different residue numbering in the SEQRES and ATOM sections.

True (A)

The advantages of the PDB format include being difficult to read and use.

False (B)

The mmCIF format is suitable for accessing individual entries as it is easily readable.

False (B)

In a Ramachandran plot, residues of a well-defined protein structure are typically dispersed in the 'disallowed' regions.

False (B)

Bad atom-atom contacts are defined as two nonbonded atoms with a center-to-center distance less than the sum of their van der Waals radii.

True (A)
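
The definition above translates directly into a distance test. The sketch below is a simplified illustration: the radii are approximate Bondi-style values assumed for the example, and the function name `is_bad_contact` is hypothetical. It requires Python 3.8 or later for `math.dist`.

```python
import math

# Approximate van der Waals radii in angstroms (assumed values for this sketch).
VDW_RADII = {"H": 1.20, "C": 1.70, "N": 1.55, "O": 1.52, "S": 1.80}


def is_bad_contact(elem_a, xyz_a, elem_b, xyz_b):
    """True if two nonbonded atoms are closer than the sum of their vdW radii."""
    distance = math.dist(xyz_a, xyz_b)   # center-to-center distance
    return distance < VDW_RADII[elem_a] + VDW_RADII[elem_b]
```

For two carbon atoms the threshold is 3.4 Å, so a pair 3.0 Å apart is flagged and a pair 4.0 Å apart is not. Validation programs typically add a tolerance and treat hydrogen-bonded pairs separately, which this sketch does not attempt.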

Hydrogen bonding energies are not assessed during the validation of protein structures.

False (B)

The Ramachandran plot is only useful in evaluating RNA structures, not protein structures.

False (B)

A high number of unsatisfied hydrogen bond donors in a protein structure is a sign of good structural quality.

False (B)

The real space R-factor is a metric that expresses how well each residue fits its electron density.

True (A)

Knowledge-based potentials evaluate how 'unhappy' each residue is in its local environment, indicating a problematic overall structure.

True (A)

All major databases provide pre-computed quality criteria for every structure in the Protein Data Bank (PDB).

False (B)

Alternative splicing can result in multiple isoforms of proteins that share identical sequences.

False (B)

Evolutionary information can enhance the accuracy of predictions related to protein properties.

True (A)

Sequence alignment aims exclusively to assess differences between sequences, without considering evolutionary relationships.

False (B)

Darwinian evolution posits that variations that enhance an individual's biological fitness will likely be inherited by future generations.

True (A)

The assumption of large inter-individual differences is essential for Darwinian evolutionary theory.

False (B)

Homology can be inferred solely from matching the primary sequences of proteins without any additional information.

False (B)

Proteins can exhibit properties such as transmembrane regions solely based on their secondary structure.

False (B)

Speciation is a direct result of the accumulation of inherited variations over time due to natural selection.

True (A)

Function is solely dictated by sequence without regard for 3D structure.

False (B)

Selective pressure operates primarily at the sequence level in proteins.

False (B)

Homologous proteins arise from genes that evolved from a common ancestor.

True (A)

Innovation in proteins occurs exclusively through large-scale genetic changes.

False (B)

3D structures of proteins are unaffected by their amino acid sequences.

False (B)

Adaptation in proteins leads to improved function in a given environment.

True (A)

Mutations cannot be passed down to subsequent generations.

False (B)

The sequence-structure-function paradigm emphasizes the relationship between these three aspects in proteins.

True (A)

Protein synthesis involves processes including Transcription, Splicing, and Translation, followed by Post-translational modifications to form mature proteins.

True (A)

UniProtKB is exclusively a specialist database that focuses solely on sequences from a limited biological pathway.

False (B)

Motifs or profiles databases do not provide abstracted information from primary sequences of proteins.

False (B)

Post-translational modifications occur prior to the synthesis of proteins and are essential for their final functional state.

False (B)

The quality of annotations in databases like UniProtKB is only determined by automatic processes, with no human intervention.

False (B)

BLOSUM matrices are used to evaluate evolutionary information based on proteins that share at least 62% identity.

True (A)

Multiple databases such as WormBase exclusively provide exhaustive primary sequences without any additional annotations.

False (B)

The mmCIF format is specifically designed to restrict access to individual entries, unlike PDB format.

False (B)

A pairwise alignment technique is only associated with Global alignments.

False (B)

Local alignments only consider similarity across the entire sequence of proteins.

False (B)

Substitution scores in amino-acid alignments are fixed and do not vary.

False (B)

Homologous proteins are those that share structural, functional, or sequence similarities regardless of their evolutionary background.

False (B)

Iterative methods are the only techniques used for multiple sequence alignments.

False (B)

Gaps in sequence alignments receive a positive score, encouraging their introduction.

False (B)

The purpose of a substitution matrix in sequence alignment is to optimize the total alignment score by pairing amino acids.

True (A)

The concept of homology in proteins is irrelevant to their structure and function.

False (B)

The dynamic programming algorithm for sequence alignments allows back-tracing from the top-left corner of the matrix for global alignment.

False (B)

Progressive methods for multiple sequence alignments first align the most divergent sequences before adding similar ones.

False (B)

Word methods in sequence alignment guarantee an optimal alignment by matching short non-overlapping sequence stretches.

False (B)

In local alignment, the Smith & Waterman algorithm allows back-tracing from any position in the alignment matrix.

True (A)
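
A compact local alignment sketch in the style of Smith and Waterman shows why the trace-back is not tied to a corner: scores are floored at zero, and the trace-back starts at the highest-scoring cell, wherever it lies, and stops at the first zero. The scoring values (match 2, mismatch -1, gap -2) and the function name are assumptions for illustration.

```python
def smith_waterman(s, t, match=2, mismatch=-1, gap=-2):
    """Local alignment; returns (best score, aligned part of s, aligned part of t)."""
    n, m = len(s), len(t)
    h = [[0] * (m + 1) for _ in range(n + 1)]
    best, best_pos = 0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if s[i - 1] == t[j - 1] else mismatch
            h[i][j] = max(0,                       # never drop below zero
                          h[i - 1][j - 1] + sub,
                          h[i - 1][j] + gap,
                          h[i][j - 1] + gap)
            if h[i][j] > best:
                best, best_pos = h[i][j], (i, j)
    # Trace back from the best cell until a zero score is reached.
    i, j = best_pos
    a, b = [], []
    while i > 0 and j > 0 and h[i][j] > 0:
        sub = match if s[i - 1] == t[j - 1] else mismatch
        if h[i][j] == h[i - 1][j - 1] + sub:
            a.append(s[i - 1])
            b.append(t[j - 1])
            i -= 1
            j -= 1
        elif h[i][j] == h[i - 1][j] + gap:
            a.append(s[i - 1])
            b.append("-")
            i -= 1
        else:
            a.append("-")
            b.append(t[j - 1])
            j -= 1
    return best, "".join(reversed(a)), "".join(reversed(b))
```

Called on `"XXHELLOYY"` and `"ZZHELLOWW"`, the function reports only the shared `HELLO` region and ignores the unrelated flanks.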

Dynamic programming algorithms for sequence alignments are known for being computationally efficient at all times.

False (B)

The relative positions of matching regions in word methods define an offset, which is the sum of corresponding coordinates.

False (B)
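
In word methods the offset of a match is the difference, not the sum, of the word's positions in the two sequences. The sketch below counts offsets of shared words; it uses overlapping words of length 3 and hypothetical names (`word_offsets`), both of which are choices made for this example.

```python
from collections import Counter


def word_offsets(seq1, seq2, k=3):
    """Count the offsets (position in seq1 minus position in seq2) of shared k-letter words."""
    positions = {}
    for j in range(len(seq2) - k + 1):
        positions.setdefault(seq2[j:j + k], []).append(j)
    offsets = Counter()
    for i in range(len(seq1) - k + 1):
        for j in positions.get(seq1[i:i + k], []):
            offsets[i - j] += 1
    return offsets


offsets = word_offsets("AAAGATTACA", "GATTACATT")
best_offset, hits = offsets.most_common(1)[0]
```

Many words sharing the same offset lie on one diagonal, which marks a candidate region for alignment. The result is a fast heuristic rather than a guaranteed optimal alignment.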

Substitution models assess the likelihood of residue pairs being aligned based solely on their sequence identity.

False (B)

Access to the PDB archive is available only through paid subscriptions.

False (B)

Systematic errors in model structures contribute to the overall accuracy of the data.

False (B)

Most structures in the PDB are of high quality, typically containing only systematic errors.

False (B)

Completely wrong structures can be caused by misinterpretation of the electron density map.

True (A)

All structures in the PDB are guaranteed to be correct and free from any type of error.

False (B)

Sequence-based and text-based queries are available through the wwPDB sites.

True (A)

Random errors are less common than systematic errors in structural models.

False (B)

Quality checks on structures require critical assessment before being used for specific purposes.

True (A)

Flashcards

Protein Synthesis Steps

Protein synthesis involves transcription (DNA to RNA), splicing (pre-mRNA to mature mRNA), translation (mRNA to protein), and post-translational modifications (protein to mature protein).

Protein Sequence Databases

Databases that store protein sequences, often with annotations and cross-references to other information. Types include generalist (like UniProtKB) and specialist databases (like WormBase) with different scopes.

UniProtKB

A central repository of protein sequences and functional information, known for detailed and quality annotations.

Protein Sequence Sources

Multiple databases hold protein sequences, categorized by scope (e.g., general or specific organism) and content (e.g., primary sequences and derived motifs).