Bioinformatics: Alignment & Similarity

Questions and Answers

Why is sequence alignment a crucial step in bioinformatics analysis?

  • It helps in measuring the similarity between sequences.
  • It aids in observing regions of conservation and variability.
  • It assists in predicting the function of new sequences.
  • All of the above (correct)

Sequence alignment assumes that sequences do NOT share a common ancestor.

False (B)

What type of alignment is used to compare only two sequences?

  • Pairwise alignment (correct)
  • Local alignment
  • Global alignment
  • Multiple sequence alignment

What does a dotplot provide?

An overview of similarities between two sequences

In a dotplot, rows represent residues from one sequence and columns represent residues from a different sequence.

True (A)

Why are gaps introduced during sequence alignment?

To improve the match between the sequences (C)

A negative score value called a ______ is used to penalize the creation of a gap in an alignment.

gap penalty

Which type of alignment compares sequences in their entire length and is useful for very similar sequences?

Global alignment (B)

What does a local alignment seek to find?

Regions of highest similarity

Local alignments are useful for finding conserved regions especially in distantly related sequences.

True (A)

What type of alignment is suitable for comparing three or more sequences?

Multiple sequence alignment (B)

Match the symbols used in sequence alignment with their meanings:

  • * (asterisk) = Positions with a single, fully conserved residue
  • : (colon) = Conservation between groups of strongly similar properties
  • . (full stop) = Conservation between groups of weakly similar properties

In sequence alignment, what does an asterisk (*) typically indicate?

A position with a single, fully conserved residue (B)

A colon (:) in a sequence alignment indicates that there are no similarities in properties between the amino acids.

False (B)

What is the primary purpose of alignment scoring?

To find the best alignment among many possible alignments (B)

What is calculated by percentage sequence identity?

The extent to which two sequences are invariant

Percentage sequence identity is a complex scoring method.

False (B)

What does a substitution matrix provide in protein sequence alignment?

A numerical score for pairing amino acids (A)

PAM and ______ are examples of substitution matrices used by protein sequence alignment algorithms with complex scoring systems.

BLOSUM

In alignment scoring, what is one of the factors that a substitution matrix is based on?

Biochemical similarity (C)

What is the purpose of gap penalties in sequence alignment scoring?

To avoid nonsense alignments (B)

What does BLAST stand for?

Basic Local Alignment Search Tool

BLAST ignores local similarities between sequences.

False (B)

What is the primary function of BLAST?

To identify unknown sequences (D)

BLAST can identify evolutionarily ______ regions that are important for structure and/or function.

conserved

What does the E-value in BLAST results indicate?

The chance that the hit is a random event (A)

A lower E-value in BLAST results represents what?

A better match

A high E-value in BLAST results indicates a high likelihood the match is significant and not due to random chance.

False (B)

Which of the following BLAST programs is used to search a protein database using an amino acid sequence?

BLASTP (C)

______ is the most sensitive BLAST program and is useful for finding very distantly related proteins.

PSI-BLAST

What type of sequence alignment is the Needleman-Wunsch algorithm used for?

Global alignment (A)

The Smith-Waterman algorithm is commonly used for Global alignment.

False (B)

What is the use of TBLASTN?

Finding homologous coding regions in un-annotated genes (B)

Match the BLAST program to the corresponding description.

  • BLASTP = Compares an amino acid query sequence against a protein sequence database
  • BLASTN = Compares a nucleotide sequence to a nucleotide sequence database
  • BLASTX = Searches a protein database using a translated nucleotide query
  • TBLASTN = Searches a translated nucleotide database using a protein query

Flashcards

Sequence Alignment

A process that arranges two or more sequences, inserting gaps where needed, to maximize their similarity.

Sequence Similarity

Sequence analysis that seeks to quantify the degree to which two or more sequences are related or identical.

Why Align Sequences?

Sequences are aligned to measure similarity, observe conservation, determine residue correspondence, and predict function.

Origin of Sequence Similarity

Sequence similarity suggests that the sequences share a common ancestor and are evolutionarily related.

Pairwise Alignment

Compares two sequences to identify regions of similarity and difference.

Dotplot

A matrix that visually represents the similarity between two sequences.

Global Alignment

A method that aligns sequences across their entire length, useful for similar sequences.

Local Alignment

A method that identifies regions of high similarity within sequences, useful for dissimilar sequences.

Multiple Sequence Alignment (MSA)

Compares three or more sequences to find conserved regions.

Asterisk (*) in Alignment

Indicates positions with a single, fully conserved residue within a sequence alignment.

Colon (:) in Alignment

Shows conservation between groups of strongly similar properties within a sequence alignment.

Full Stop (.) in Alignment

Represents conservation between groups of similar properties in a sequence alignment.

Alignment Scoring

An assessment of sequence relationship based on matches, mismatches, and gaps.

Percentage Sequence Identity

The proportion of identical positions between two aligned sequences.

Substitution Matrix

A system assigning numerical values to amino acid pairings, considering biochemical similarity and mutation data.

Gap Penalty

A penalty applied for introducing gaps, preventing nonsensical alignments.

BLAST

A search tool that identifies regions of local similarity between sequences.

What is BLAST?

A search program that finds regions of local similarity between amino acid or nucleotide sequences.

BLAST Use Cases

BLAST is useful for identifying unknown sequences.

Conserved regions in BLAST

BLAST is used to locate evolutionarily conserved regions within sequences, which are important for structure and/or function.

Study Notes

  • Bioinformatics II focuses on Alignment & Similarity.

Learning Objectives

  • Sequence analysis encompasses alignment and similarity assessments.
  • Alignment methods are employed to align sequences, and their effectiveness is evaluated through alignment scoring.
  • Homology searches, particularly using BLAST, are conducted to identify related sequences.

Why Align Sequences?

  • Sequence alignment enables the measurement of similarity between sequences.
  • It allows for observation of regions of conservation and variability.
  • It helps determine residue-residue correspondences and predict the function of new sequences, which is invaluable for genome annotation.

Sequence Alignment and Common Ancestry

  • Sequence similarity often suggests a common evolutionary origin.
  • Aligning sequences presumes a shared ancestor, indicating homology.
  • Protein sequences evolve, and alignments pinpoint homologous positions.

Pairwise Alignment

  • Pairwise alignment compares two sequences and can be visualized using a dotplot matrix.
  • Dotplots offer an overview of similarities between two sequences, with rows and columns corresponding to residues in each sequence.

Dotplots

  • A dotplot is a simple matrix representing the similarity between two sequences.
  • Rows represent the residues of one sequence and columns represent the residues of the other.
  • In a dotplot, if two sequences are identical, a diagonal line appears.
  • Differences appear as interruptions or shifts in the diagonal.
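
The dotplot idea above can be sketched in a few lines of Python. This is a minimal illustration, not the method of any particular tool: the function names `dotplot` and `render` are hypothetical, and a cell is marked only when the two residues are identical (real dotplot programs often add window and threshold filtering).

```python
def dotplot(seq_a, seq_b):
    """Return a matrix where cell [i][j] is True if seq_a[i] == seq_b[j].

    Rows correspond to residues of seq_a, columns to residues of seq_b.
    """
    return [[a == b for b in seq_b] for a in seq_a]


def render(matrix):
    """Draw the matrix as text: '*' for a match, '.' for no match."""
    return "\n".join(
        "".join("*" if cell else "." for cell in row) for row in matrix
    )


# Identical sequences produce an unbroken main diagonal.
print(render(dotplot("GATTACA", "GATTACA")))
```

Comparing a sequence with a mutated or shifted copy of itself shows the interruptions and shifts in the diagonal described above.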

Gaps in Alignments

  • Gaps are insertions or deletions introduced to improve alignment matching.
  • Gaps influence alignment scoring.

Global Alignments

  • Global alignments compare sequences in their entirety and are useful for short, similar sequences.
  • An example tool is NEEDLE, employing the Needleman-Wunsch algorithm.
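
As a rough sketch of how a global alignment score can be computed, the following Python function fills a Needleman-Wunsch style dynamic programming table. The scoring values (`match=1`, `mismatch=-1`, `gap=-2`) and the function name are assumptions chosen for illustration; they are not the defaults of NEEDLE, and only the score is returned, not the alignment itself.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-2):
    """Return the optimal global alignment score of a and b.

    Uses a simple linear gap penalty (assumed values, for illustration).
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]

    # Aligning a prefix against nothing costs one gap per residue.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap      # gap in b
            left = score[i][j - 1] + gap    # gap in a
            score[i][j] = max(diag, up, left)

    # The bottom-right cell covers both sequences over their entire length.
    return score[-1][-1]


print(needleman_wunsch_score("ACGT", "ACT"))  # 3 matches and 1 gap -> 1
```

Because the score is read from the final cell, every residue of both sequences contributes, which is what makes the alignment global.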

Local Alignments

  • Local alignments identify regions of highest similarity and extend the alignment outward.
  • This method is usually used to find conserved regions in distantly related sequences.
  • The WATER tool uses the Smith-Waterman algorithm.
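
A local alignment score can be sketched with a Smith-Waterman style table, shown below in Python. The scoring values and the function name are illustrative assumptions rather than the defaults of WATER. The two differences from a global alignment are that cell scores are never allowed to drop below zero and that the result is the best cell anywhere in the table.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between a and b.

    Scoring values are assumptions chosen for illustration.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best = 0

    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap
            left = score[i][j - 1] + gap
            # Flooring at 0 lets a new local alignment start at any position.
            score[i][j] = max(0, diag, up, left)
            best = max(best, score[i][j])

    return best


# Only the shared core "ACGT" aligns well; the flanking regions are ignored.
print(smith_waterman_score("TTTACGTAAA", "CCCACGTGGG"))  # 4 matches -> 8
```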

Multiple Sequence Alignment (MSA)

  • Multiple sequence alignment (MSA) compares three or more sequences.
  • CLUSTAL is an example of an MSA tool.

Features of Sequence Alignment

  • Accession numbers provide an entry point into a database.
  • Gaps represent insertions or deletions within a sequence alignment.
  • Consensus symbols, such as asterisks, colons, and full stops, denote levels of conservation.

Amino Acid Properties

  • Asterisks indicate fully conserved residues, colons indicate conservation between strongly similar groups, and full stops indicate conservation between weakly similar groups.
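
The consensus symbols can be derived column by column, as in the Python sketch below. The residue groups listed are an assumption: they follow the strong and weak groups commonly documented for Clustal, but they should be checked against the documentation of the tool actually used. The function names are hypothetical.

```python
# Assumed residue groups (as commonly documented for Clustal); verify before relying on them.
STRONG_GROUPS = ["STA", "NEQK", "NHQK", "NDEQ", "QHRK", "MILV", "MILF", "HY", "FYW"]
WEAK_GROUPS = ["CSA", "ATV", "SAG", "STNK", "STPA", "SGND",
               "SNDEQK", "NDEQHK", "NEQHRK", "FVLIM", "HFY"]


def consensus_symbol(column):
    """Return '*', ':', '.', or ' ' for one alignment column."""
    residues = set(column)
    if "-" in residues:
        return " "                      # columns with gaps get no symbol
    if len(residues) == 1:
        return "*"                      # single, fully conserved residue
    if any(residues <= set(group) for group in STRONG_GROUPS):
        return ":"                      # strongly similar properties
    if any(residues <= set(group) for group in WEAK_GROUPS):
        return "."                      # weakly similar properties
    return " "


def consensus_line(aligned_sequences):
    """Build the consensus line for equal-length aligned sequences."""
    return "".join(consensus_symbol(col) for col in zip(*aligned_sequences))


print(consensus_line(["MKS", "MRA", "MKC"]))  # prints "*:."
```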

Alignment Scoring

  • Alignment scoring is essential for finding the best alignment among many possible alignments.
  • Scores are influenced by sequence composition.
  • The aim is to score the similarities and differences between sequences.
  • Gaps are usually penalized, which lowers the alignment score.

Percentage Sequence Identity

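  • Percentage sequence identity measures the extent to which two sequences are invariant, that is, the proportion of identical positions in the alignment.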
  • Gaps (insertions/deletions) affect sequence identity.
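
Percentage sequence identity is simple enough to compute directly. The Python sketch below assumes the two sequences are already aligned and uses the full alignment length (including gap columns) as the denominator. That denominator is one convention among several, so values from different tools may not match exactly. The function name is hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percentage of alignment columns holding the same residue.

    Assumes '-' marks a gap and that both strings are the same length.
    The denominator is the alignment length (an assumed convention).
    """
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("aligned sequences must be non-empty and equal in length")
    identical = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * identical / len(aligned_a)


print(percent_identity("ACGT", "AC-T"))  # 3 of 4 columns identical -> 75.0
```

The gap column lowers the result, which is how insertions and deletions affect sequence identity.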

Complex Scoring Systems

  • Protein sequence alignment algorithms use substitution matrices such as PAM or BLOSUM.
  • A substitution matrix assigns a numerical score to each specific amino acid pairing.
  • The scores reflect known mutations observed between related sequences.
  • Biochemical similarity and the probability of occurrence are also factors.
  • The resulting alignment score depends on which matrix the alignment uses.
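
To make the idea of a substitution matrix concrete, the Python sketch below scores residue pairs with a tiny lookup table. The table `TOY_MATRIX` and its values are illustrative assumptions covering only three amino acids; a real analysis would load a complete PAM or BLOSUM matrix instead.

```python
# Toy substitution scores for three residues (illustrative values only).
# Identical or biochemically similar pairs score high; dissimilar pairs score low.
TOY_MATRIX = {
    ("L", "L"): 4, ("I", "I"): 4, ("D", "D"): 6,
    ("L", "I"): 2,    # both hydrophobic: mild positive score
    ("L", "D"): -4,   # hydrophobic vs acidic: penalized
    ("I", "D"): -3,
}


def pair_score(x, y, matrix=TOY_MATRIX):
    """Look up the score for a residue pair; the matrix is symmetric."""
    return matrix[(x, y)] if (x, y) in matrix else matrix[(y, x)]


def ungapped_score(seq_a, seq_b, matrix=TOY_MATRIX):
    """Sum pair scores over two equal-length, gap-free sequences."""
    return sum(pair_score(x, y, matrix) for x, y in zip(seq_a, seq_b))


print(ungapped_score("LID", "LID"))  # 4 + 4 + 6 = 14
print(ungapped_score("LID", "ILD"))  # 2 + 2 + 6 = 10
```

Unlike percentage identity, the second comparison still earns credit for the L/I substitutions because the residues are similar.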

Gaps and Penalties

  • Gaps must be accounted for, with "gap penalties" used to avoid unchecked gaps and nonsense alignments.
  • Gap creation is penalized with negative scores.
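
The effect of a gap penalty can be seen by scoring alignments that have already been built. The Python sketch below uses assumed values (`match=1`, `mismatch=-1`, `gap_penalty=-2`) and a single flat penalty per gap column; many tools instead use separate gap opening and extension penalties.

```python
def score_alignment(aligned_a, aligned_b, match=1, mismatch=-1, gap_penalty=-2):
    """Score two aligned, equal-length strings where '-' marks a gap.

    All scoring values are assumptions chosen for illustration.
    """
    total = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" or y == "-":
            total += gap_penalty   # negative score for each gap column
        elif x == y:
            total += match
        else:
            total += mismatch
    return total


# One well-placed gap keeps the score positive.
print(score_alignment("ACGT", "AC-T"))    # 3 matches, 1 gap -> 1
# An alignment made almost entirely of gaps is heavily penalized.
print(score_alignment("A-C-G", "-A-C-"))  # 5 gap columns -> -10
```

Without the penalty, the second alignment would score no worse than an empty one, which is why unchecked gaps can produce nonsense alignments.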

BLAST

  • Basic Local Alignment Search Tool locates regions of local similarity between nucleotide or amino acid sequences.
  • BLAST enables comparison of a query sequence against a database, identifying similar sequences by finding regions of similarity with local alignments.
  • BLAST can identify evolutionarily conserved regions that are important for structure and/or function.
  • These conserved regions are often strictly required for that structure or function.

BLAST Results

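  • BLAST identifies unknown sequences and finds homologous sequences.
  • Each hit is reported with an E-value, the chance that the hit is a random event; a lower E-value represents a better match.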
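
Since a lower E-value represents a better match, as covered in the questions above, a common first step with BLAST results is to keep only hits below a cutoff. The Python sketch below assumes hits are available as `(name, e_value)` pairs; the hit names, the E-values, and the `1e-5` cutoff are all made up for illustration and do not come from a real search.

```python
def significant_hits(hits, max_evalue=1e-5):
    """Keep hits at or below the E-value cutoff, best (lowest) first.

    `hits` is a list of (name, e_value) pairs; the cutoff is an assumed value.
    """
    kept = [hit for hit in hits if hit[1] <= max_evalue]
    return sorted(kept, key=lambda hit: hit[1])


# Hypothetical hits, not real BLAST output.
example_hits = [("hit_A", 2e-50), ("hit_B", 0.5), ("hit_C", 1e-8)]
for name, e_value in significant_hits(example_hits):
    print(name, e_value)
```

Here `hit_B` is dropped because its high E-value means the match could easily be a random event.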

Types of BLAST Searches

  • BLASTP: Amino acid sequence against protein sequences.
  • BLASTX: Translated nucleotide sequence against protein sequences.
  • TBLASTN: Amino acid sequence against translated nucleotide sequences.
  • TBLASTX: Translated nucleotide sequence against translated nucleotide sequences.
  • BLASTN: Nucleotide sequence against nucleotide sequences.
  • PSI-BLAST is the most sensitive program and is useful for finding very distantly related proteins.
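
The choice among the five basic programs depends only on the type of query and the type of database, so it can be written as a lookup table. The Python sketch below is a study aid, not part of any BLAST software; the function name `choose_blast` and the type labels are hypothetical.

```python
# (query type, database type) -> BLAST program, following the list above.
BLAST_PROGRAMS = {
    ("protein", "protein"): "BLASTP",
    ("translated nucleotide", "protein"): "BLASTX",
    ("protein", "translated nucleotide"): "TBLASTN",
    ("translated nucleotide", "translated nucleotide"): "TBLASTX",
    ("nucleotide", "nucleotide"): "BLASTN",
}


def choose_blast(query_type, database_type):
    """Return the BLAST program for a query/database combination."""
    try:
        return BLAST_PROGRAMS[(query_type, database_type)]
    except KeyError:
        raise ValueError(
            f"no basic BLAST program for {query_type!r} vs {database_type!r}"
        ) from None


print(choose_blast("protein", "translated nucleotide"))  # TBLASTN
```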

Key Summary Points

  • Sequence similarity and alignment are central to bioinformatics analysis.
  • Global alignments are best for short and similar sequences.
  • Local alignments are used for finding regions of homology.
  • BLAST assists in identifying unknown sequences and in finding evolutionarily conserved regions.
  • The choice of BLAST program depends on the use case.
