Methods in Analysis of Simple Sequencing Lecture 5 PDF
New Mansoura University
Dr. Rami Elshazli
Summary
This lecture covers methods in analysis of simple sequencing, focusing on sequence alignment, phylogenetic analysis, and identification of novel genes for drug development. It explains biological sequences, nucleotides, and amino acids.
Full Transcript
Bioinformatics BIO417
Lecture 5
Methods in analysis of simple sequencing
Prepared by Dr. Rami Elshazli, Associate Professor of Biochemistry and Molecular Genetics

Analysis of Biological sequences
A biological sequence is a single, continuous molecule of nucleic acid or protein.
A nucleic acid sequence is composed of nucleotides.
The nucleotides adenine, thymine, guanine, and cytosine are the building blocks of deoxyribonucleic acid (DNA).
The nucleotides adenine, uracil, guanine, and cytosine are the building blocks of ribonucleic acid (RNA).
The primary structure of a protein is a linear chain of amino acids.
The methodologies implemented under sequence analysis include:
☛ Sequence alignment.
☛ Phylogenetic analysis.
☛ Identification of novel genes for drug development.

Pairwise Sequence Alignment
Sequence alignment is an essential step in molecular phylogenetics. It is used to analyze homologous, orthologous, and paralogous genes and to identify mutations in various genetic disorders.
Homologs are genes that are derived from a common ancestral gene.
☛ Homologous genes share evolutionary ancestry, regardless of how their functions have diverged.
☛ Homology is a broad term that encompasses both orthologs and paralogs.
Orthologs are homologous genes found in different species that evolved from a common ancestral gene through speciation.
☛ Orthologous genes generally retain the same function across species.
☛ Example: The gene responsible for hemoglobin in humans and the gene responsible for hemoglobin in mice are orthologs.
☛ They descended from a common ancestor, and both serve similar functions in oxygen transport.
Paralogs are homologous genes that arise within the same species due to gene duplication.
☛ After the duplication event, these genes can evolve new functions or specialized roles.
☛ Example: In humans, the genes for hemoglobin and myoglobin are paralogs.
☛ They originated from a common ancestral gene via duplication; over time, myoglobin evolved to specialize in oxygen storage in muscles, while hemoglobin remained specialized for oxygen transport in blood.

Pairwise Sequence Alignment
Sequence alignments may be global or local.
The goal of pairwise alignment is to find the conserved regions between two sequences.
These conserved regions are presumed to be functionally important parts of the sequences.
In the example, human hemoglobin subunit alpha (HBA_HUMAN) was used as the query sequence, and four other hemoglobin subunit sequences (HBB_HUMAN, HBG2_HUMAN, HBD_HUMAN, and HBG1_HUMAN) were used as subject sequences for comparison.

Sequence Alignment: Arranging two sequences to identify regions of similarity that may indicate functional, structural, or evolutionary relationships.
Global Alignment: Aligns the entire length of the sequences from start to end.
Local Alignment: Finds the best matching region within parts of the sequences.

(A) Global Alignment
Global alignment is used when related sequences of the same length are aligned together.
The alignment is carried out from the start to the end of the sequences.
The algorithm for aligning two protein sequences published by Needleman and Wunsch in 1970 was the first application of dynamic programming to biological sequence analysis.

(B) Local Alignment
Local alignment is mainly used for sequences that differ in length.
This method finds local matches within a stretch of the sequences instead of looking at the entire sequences.
The Smith-Waterman algorithm, a dynamic programming algorithm, was developed by Smith and Waterman in 1981.
Local alignment uses scoring matrices, which let the user choose an appropriate scoring system.
Many programs build local alignments using the Smith-Waterman algorithm or faster heuristics derived from it.
The most popular of these is NCBI-BLAST (Basic Local Alignment Search Tool).

NCBI-BLAST
BLAST (Basic Local Alignment Search Tool) is the most widely used tool for sequence alignment and similarity search.
BLAST is fast: it can be used to analyze thousands of sequences and even to compare two genomes.
BLAST is freely available to everyone and can be downloaded.
https://www.ncbi.nlm.nih.gov/BLAST/
BLAST has several subprograms, including:
☛ BLASTn (aligns a nucleotide query sequence with a nucleotide database).
☛ BLASTp (aligns a protein query sequence with a protein database).
☛ BLASTx (aligns a nucleotide query sequence with a protein database by comparing the six-frame conceptual translation of the nucleotide sequence).
☛ tBLASTx (aligns the six-frame translation of a nucleotide query sequence with the six-frame translations of the sequences in a nucleotide database).
☛ tBLASTn (aligns a protein query sequence with a translated nucleotide database).

BLAT
BLAT is another algorithm used in pairwise sequence alignment.
It can align both DNA and protein sequences and is designed to work best on sequences with high similarity.
http://genome.ucsc.edu/cgi-bin/hgBlat

FASTA
FASTA was one of the first sequence alignment programs used for DNA and protein sequence alignment.
It takes the FASTA sequence format as input, which is now the standard for sequence alignment software; compared with BLAST, it is slower but more accurate.
https://fasta.bioch.virginia.edu/
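
As a rough illustration of the global and local alignment methods described in this lecture, the sketch below reads two sequences from FASTA-formatted text and scores them by dynamic programming. It is a minimal teaching sketch under stated assumptions, not the implementation used by BLAST, BLAT, or the FASTA program: the function names (parse_fasta, alignment_score), the scoring values (match = 1, mismatch = -1, gap = -2), and the toy sequences are all hypothetical choices made for illustration. Real protein alignments would typically use a substitution matrix rather than a fixed match/mismatch score.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a dict of {identifier: sequence}.

    The identifier is the first word after '>' on each header line.
    """
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].split()[0]
            records[header] = ""
        elif header is not None:
            records[header] += line.upper()
    return records


def alignment_score(a, b, match=1, mismatch=-1, gap=-2, local=False):
    """Return the best alignment score of a and b by dynamic programming.

    local=False: global score in the style of Needleman-Wunsch (end to end).
    local=True:  local score in the style of Smith-Waterman (best region).
    Scoring values are illustrative assumptions, not values from the lecture.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]

    # Global alignment charges for gaps at the sequence ends;
    # local alignment leaves the first row and column at zero.
    if not local:
        for i in range(1, rows):
            score[i][0] = i * gap
        for j in range(1, cols):
            score[0][j] = j * gap

    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap      # gap in b
            left = score[i][j - 1] + gap    # gap in a
            cell = max(diag, up, left)
            if local:
                cell = max(cell, 0)         # a local alignment may restart anywhere
                best = max(best, cell)
            score[i][j] = cell

    return best if local else score[-1][-1]


# Toy input (hypothetical sequences, not real hemoglobin data).
toy_fasta = """>query toy sequence
TTTACGTTT
>subject toy sequence
GGACGGG
"""
records = parse_fasta(toy_fasta)
query, subject = records["query"], records["subject"]
print("global score:", alignment_score(query, subject))
print("local score:", alignment_score(query, subject, local=True))
```

With local=True every cell is clamped at zero and the highest cell in the table is reported, which lets the score ignore poorly matching ends. The global score is read from the bottom-right cell because both sequences must be aligned from start to end.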