Introduction to Computational Biology PDF

Bio16 Computational Biology
Introduction to Computational Biology
Joseph Martin Q. Paet, MSc.
Instructor 1 | Research & Dev’t Mgt. Division
Bicol University

Computational Biology
▪ the development and application of data-analytical and theoretical methods, mathematical modeling, and computational simulation techniques to the study of biological, behavioral, and social systems (Pevsner, 2015)
▪ a broad term that covers all scientific investigation on or related to biology that involves mathematics and computation (Jiang et al., 2013)
Toma, M., & Concu, R. (2021). Computational Biology: A New Frontier in Applied Biology. Biology, 10(5), 374. https://doi.org/10.3390/biology10050374
Pevsner, J. (2015). Bioinformatics and Functional Genomics (3rd ed.). John Wiley & Sons Inc.
Jiang, R., Zhang, X., & Zhang, M. Q. (2013). Basics of Bioinformatics: Lecture Notes of the Graduate Summer School on Bioinformatics of China. Tsinghua University Press. https://doi.org/10.1007/978-3-642-38951-1

Bioinformatics
▪ an emerging discipline that draws upon the strengths of computer sciences, mathematics, and information technology to determine and analyze genetic information (Singh, 2015)
▪ research, development, or application of computational tools and approaches for expanding the use of biological, medical, behavioral, or health data, including those to acquire, store, organize, analyze, or visualize such data (Pevsner, 2015)
Singh, G. B. (2015). Fundamentals of Bioinformatics and Computational Biology: Methods and Exercises in MATLAB. Springer International Publishing. https://doi.org/10.1007/978-3-319-11403-3

Bioinformatics
(Pevsner, 2015)
It did not start with DNA
Protein analysis was the starting point
▪ Edman sequencing = one-by-one cleavage of N-terminal amino acid residues with phenylisothiocyanate
▪ Insulin = first protein to be sequenced
▪ Hundreds of fragments needed to be assembled
Dayhoff: the first bioinformatician
▪ Pioneered the application of computational methods
▪ Developed COMPROTEIN = ‘a complete computer program for the IBM 7090’ designed to determine protein primary structure
▪ Developed the one-letter amino acid code (earlier work used three-letter abbreviations)
▪ Atlas of Protein Sequence and Structure = first-ever biological sequence database
Margaret Dayhoff: "the mother and father of bioinformatics"
Gauthier, J., Vincent, A. T., Charette, S. J., & Derome, N. (2019). A brief history of bioinformatics. Briefings in Bioinformatics, 20(6), 1981–1996. https://doi.org/10.1093/bib/bby063

Sequence Similarities and Species Relatedness
Emile Zuckerkandl and Linus Pauling
▪ Protein sequences as carriers of information
▪ Coined the term ‘Paleogenetics’
▪ Hypothesized that orthologous proteins evolved through divergence from a common ancestor
Figure: Sequence dissimilarity between orthologous hemoglobin subunit beta-1.
(Gauthier et al., 2019)
Multiple sequence alignment (MSA) is a common tool in phylogenetic analysis, in which the evolutionary tree of different organisms is inferred and organized as a hierarchy that places closely related species near each other.

Sequence Alignment Algorithms
Saul B. Needleman and Christian D. Wunsch
▪ Developed the first dynamic programming algorithm for pairwise protein sequence alignments
▪ Needleman–Wunsch algorithm = became the basis for the first published MSA algorithm
▪ Aligning longer sequences is impractical
Da-Fei Feng and Russell F. Doolittle
▪ Developed the first truly practical approach to MSA
▪ Feng–Doolittle algorithm = became the basis for the now popular CLUSTAL MSA software
(Gauthier et al., 2019)

Deciphering the Genetic Code in DNA
Cost-efficient reading of DNA
▪ Maxam–Gilbert sequencing = radioactive reagents
▪ Sanger sequencing = ‘plus and minus’ DNA sequencing; first to rely on primed synthesis with DNA polymerase
Extracting the information from DNA sequences
▪ Comparisons, calculations, pattern matching
▪ More computer-assisted analysis
▪ Staden Package = first sequence analysis software to include additional characters
▪ Search + contig assembly + annotation and manipulation of sequence files
Figure: Watson and Crick and their 3D model of DNA
(Gauthier et al., 2019)

Establishing Relationships between Sequences
Maximum Parsimony vs Maximum Likelihood
Maximum Parsimony (Walter M. Fitch)
▪ The least number of changes as the main mechanism driving evolutionary change
▪ Usually for protein sequence-based phylogeny
Maximum Likelihood (Joseph Felsenstein)
▪ Finding the evolutionary tree that yields the highest probability of evolving the observed data
▪ Usually for nucleic acid sequence-based phylogeny
Maximum parsimony is an optimality criterion under which the preferred phylogenetic tree is the one that minimizes the total number of character-state changes. Parsimony is the principle that the simplest possible explanation for a phenomenon is the most likely to be true.
(Gauthier et al., 2019)

Increase in Computer Processing Ability
Molecular methods to target and amplify specific genes
▪ Genes are unlike proteins and RNA: DNA is the least abundant macromolecular cell component that can be sequenced
▪ Gene cloning by Jackson, Symons, and Berg = in vivo amplification of DNA
▪ Polymerase chain reaction (PCR) by Kary Mullis = in vitro amplification of DNA
▪ In vivo = research done on a living organism; in vitro = research done in a laboratory dish or test tube
Access to computers and specialized software
▪ Development of microcomputers
Bioinformatics and the free software movement
▪ Richard Stallman = promoted the freedom to run, copy, distribute, study, change, and improve software
Desktop computers and new programming languages
(Gauthier et al., 2019)

Era of Sequencing Total DNA
Dawn of the Genomics Era
▪ Haemophilus influenzae = first complete genome of a free-living organism to be sequenced
Human Genome Project = turning point of the genomic era
▪ Public = National Institutes of Health
▪ Private = Celera Genomics
Bioinformatics went online
▪ EMBL (UK), GenBank (US), DDBJ
▪ Simplified access to bioinformatics tools
Structural Bioinformatics
▪ Myoglobin = first protein 3D structure experimentally determined
▪ Prediction of protein structure (limited by computational power)
(Gauthier et al., 2019)

Massively Parallel Sequencing and Analysis
Second-generation sequencing or NGS
▪ Started with the ‘454 pyrosequencing technology’
▪ Availability of a multitude of platforms
‘Biological Big Data’
▪ An effect of the lower cost of massively parallel sequencing technologies
▪ New repositories being made for model organisms
High-performance bioinformatics and collaborative computing
▪ Government-sponsored organizations specialized in high-performance computing have emerged
(Gauthier et al., 2019)

Perspectives of Bioinformatics
▪ Cell
▪ Organism

Eukaryotic vs Bacterial Gene Expression
Bacterial and Eukaryotic mRNA
Alberts, B., Johnson, A., Lewis, J., Raff, M., Roberts, K., & Walter, P. (2014). Molecular biology of the cell (7th ed.). New York, NY: Garland Science.

Two Main Cultures in Bioinformatics
▪ Approaches that do not require knowledge of programming and are immediately accessible
▪ High-throughput approaches that are more appropriate for analyzing large-scale datasets
Pevsner, J. (2015). Bioinformatics and Functional Genomics (3rd ed.). John Wiley & Sons Inc.

Reproducible Research in Bioinformatics
(1) A workflow should be well documented.
(2) Information should be well organized.
(3) Data should be made available to others.
(4) Metadata can be as crucial as the data.
(5) Databases that are used should be documented.
(6) Software should be documented.
(Pevsner, 2015)
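
The dynamic programming idea behind the Needleman–Wunsch algorithm, listed under Sequence Alignment Algorithms, can be illustrated with a minimal sketch. The function name `needleman_wunsch` and the scoring values (match = 1, mismatch = -1, gap = -1) are assumptions chosen for illustration; they are not taken from the slides, and real protein alignments typically use a substitution matrix rather than a flat match/mismatch score.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Global pairwise alignment (illustrative sketch, assumed scoring scheme).

    Returns (score, aligned_a, aligned_b), with '-' marking gaps.
    """
    n, m = len(a), len(b)

    def sub(i, j):
        # Score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] = best score for aligning the prefixes a[:i] and b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap  # a[:i] aligned against gaps only
    for j in range(1, m + 1):
        score[0][j] = j * gap  # b[:j] aligned against gaps only

    # Fill the table: each cell depends on its three neighbors.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1

    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))
```

For example, `needleman_wunsch("AC", "A")` returns `(0, "AC", "A-")`: one match plus one gap. The table has (n + 1) × (m + 1) cells, so time and memory grow with the product of the two sequence lengths.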
