Questions and Answers
Define homologs and provide a general explanation.
Homologs are genes/proteins that have similar sequences and are derived from a common ancestral sequence.
What are orthologs and how are they derived?
Orthologs are homologs derived through speciation.
Explain paralogs and how they are derived.
Paralogs are homologs derived through gene duplication.
What are analogs in the context of genes, and what distinguishes them from homologs?
Analogs are genes whose similarity arose through convergent evolution rather than common ancestry, whereas homologs are derived from a common ancestral sequence.
What is the main purpose of a homology search?
To match a given sequence to known genes or proteins in a database.
BLAST is a tool that aligns complete sequences to find homologous regions.
False. BLAST finds the subsequences that align best, not alignments of complete sequences.
Study Notes
Comparative Genomics I
Background
- Homologs: genes/proteins with similar sequences derived from a common ancestral sequence
- Orthologs: homologs derived through speciation (e.g., human alpha-globin and chimpanzee alpha-globin)
- Paralogs: homologs derived through gene duplication (e.g., human alpha-globin and human beta-globin)
- Analogs: genes whose similarity is due to convergent evolution rather than common ancestry (rare at the sequence level, but genes can be functional analogs)
Homology Search
- Matching a given sequence to known genes or proteins in a database (first step after genome sequencing and gene prediction)
- With many complete genomes now sequenced, comparing genomes to determine which genes they share is common
- Distinguishing orthologs and paralogs can be impossible for distantly related species
BLAST (Basic Local Alignment Search Tool)
- Most commonly-used homology search tool
- Finds the subsequences that align best (local alignment) rather than aligning complete sequences
- For protein sequences, aligned positions are reported as:
  - Identical: the amino acids are the same
  - Positive: the amino acids have similar biochemical properties (size, charge)
- Matches are scored by E-value: the number of matches at least this good expected by chance in a database of that size
- A lower E-value indicates greater confidence in homology (e.g., E = 10^(-6) corresponds to roughly a 1 in a million chance of observing such a match at random)
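Strictly, an E-value is an expected count rather than a probability. The sketch below shows why the two are nearly the same for small E. It assumes that chance matches follow a Poisson distribution with mean E; the function name is illustrative and not part of any BLAST software.

```python
import math

def prob_at_least_one_chance_hit(e_value: float) -> float:
    """Probability of at least one chance match, given the expected count E.

    Assumption: the number of chance matches is Poisson-distributed with mean E,
    so P = 1 - exp(-E).
    """
    # -expm1(-E) equals 1 - exp(-E) but stays accurate when E is very small
    return -math.expm1(-e_value)

for e in (10.0, 1.0, 1e-3, 1e-6):
    p = prob_at_least_one_chance_hit(e)
    print(f"E = {e:g}  ->  P(at least one chance match) = {p:.3g}")
```

For E well below 1 the probability is almost equal to E itself, which is why E = 10^(-6) can be read as roughly a one-in-a-million chance. For E of 1 or more the two numbers diverge, and the E-value should be read as a count.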
BLAT and Other Alignment Tools
- BLAT (BLAST-like alignment tool): faster algorithm for quick genome searches
- Uses 11-mers (11 DNA bases) or 4-mers (4 amino acids) to find matches with:
  - 95% or greater identity over 25 bases or more (DNA)
  - 80% or greater identity over 20 amino acids or more (proteins)
- Other alignment tools (BWA, Bowtie, Stampy, NextGenMap) quickly map short reads to a reference genome
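The k-mer seeding idea behind fast search tools can be sketched in a few lines. This is a toy illustration only, not BLAT's actual implementation: it assumes the reference is indexed by non-overlapping 11-mers and that seeds must match exactly. The function names and sequences are made up for the example.

```python
from collections import defaultdict

def build_kmer_index(reference: str, k: int = 11) -> dict:
    """Map each non-overlapping k-mer of the reference to its start positions."""
    index = defaultdict(list)
    # Step by k so that indexed k-mers do not overlap (assumption of this sketch)
    for start in range(0, len(reference) - k + 1, k):
        index[reference[start:start + k]].append(start)
    return index

def find_seed_hits(query: str, index: dict, k: int = 11) -> list:
    """Return (query_pos, ref_pos) pairs where a query k-mer matches an indexed k-mer."""
    hits = []
    # Slide over every overlapping k-mer of the query and look it up in the index
    for q_pos in range(len(query) - k + 1):
        for r_pos in index.get(query[q_pos:q_pos + k], []):
            hits.append((q_pos, r_pos))
    return hits

# Hypothetical sequences: the reference is three 11-mers joined together
reference = "ACGTTGCAAGT" + "CCGATAGGCTA" + "TTGACCGTAAC"
query = "GG" + "CCGATAGGCTA" + "TT"

index = build_kmer_index(reference)
print(find_seed_hits(query, index))
```

Each seed hit marks a candidate location. A real tool would then extend and score the alignment around the seed to check criteria such as the identity thresholds listed above.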
Distinguishing Orthologs and Paralogs
- Molecular evolutionists/systematists compare orthologs, but distinguishing them from paralogs can be difficult
- Reciprocal best hits approach:
  - Gene A from species 1 is used as the query in a BLAST search of the species 2 genome; the best match is gene A'
  - Gene A' from species 2 is then used as the query in a BLAST search of the species 1 genome
  - If the best match is gene A, then A and A' are reciprocal best hits and are considered orthologs
- One-to-one orthologs: homologous genes occurring in a single copy in each genome
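The reciprocal best hits test can be sketched as a lookup in both directions. The example assumes the best hit for each gene has already been extracted from the two BLAST searches into plain dictionaries; the gene names and the function name are hypothetical.

```python
def reciprocal_best_hits(best_1_to_2: dict, best_2_to_1: dict) -> list:
    """Return (gene_in_species_1, gene_in_species_2) pairs that are each other's best hit."""
    pairs = []
    for gene_1, gene_2 in best_1_to_2.items():
        # Keep the pair only if the reverse search points back to the same gene
        if best_2_to_1.get(gene_2) == gene_1:
            pairs.append((gene_1, gene_2))
    return sorted(pairs)

# Hypothetical best-hit tables from the two searches
best_1_to_2 = {"A": "A_prime", "B": "B_prime", "C": "B_prime"}
best_2_to_1 = {"A_prime": "A", "B_prime": "B"}

print(reciprocal_best_hits(best_1_to_2, best_2_to_1))
```

In this toy data, gene C's best hit is B_prime, but B_prime's best hit is B, so C is not reported as an ortholog of B_prime. It may instead be a paralog of B.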
Description
Learn about the definitions and concepts of homologs, orthologs, and paralogs in comparative genomics. Understand the differences between these terms and how they relate to gene duplication and speciation.