BIOL415 Genomics & Proteomics Lecture Slides Fall 2024-2025 PDF

Summary

These are lecture slides for a course on Genomics and Proteomics, likely for an undergraduate level biology program. The slides cover a range of topics including the overview of the course, landmarks in genetics, introduction to omics, types of genomics, proteomics and more.

Full Transcript


BIOL415 Genomics & Proteomics
Fall 2024-2025
Molecular Biology and Genetics Program
Department of Biological Sciences
Fezel Nizam

Overview of the Course
1 – Intro to Genomics
2 – Importance of Genomes and Genetic Variation
3 – Mapping, Sequencing, Annotation, Databases
4 – Comparative Genomics: Human Genome, Other Genomes
5 – Evolution and Genomic Change
6 – Proteomics
7 – Transcriptomics
8 – Cancer Genomics
9 – Contemporary topics in the field:
-Epigenomics
-Nutrigenomics
-Pharmacogenomics etc.

Landmarks in Genetics and Genomics
http://www.nature.com/nature/journal/v422/n6934/pdf/timeline_01626.pdf
https://doi.org/10.1161/CIR.0000000000000211

Introduction to OMICS Technology

What is ‘omics’?
The term “omic” is derived from the Latin suffix “ome”, meaning mass or many. Thus, OMICS involve a mass (large number) of measurements per endpoint (Jackson et al., 2006).

Integration of OMICS data
Efficient integration of data from different OMICS can greatly facilitate the discovery of the true causes and states of disease; this integration is mostly done by software (Andrew et al., 2006).
In a biological context, the suffix -omics refers to the study of large sets of biological molecules (Smith et al., 2005).
The realization that DNA alone does not regulate complex biological processes (a result of the HGP, 2001) triggered the rapid development of several fields in molecular biology that together are described by the term OMICS.
The OMICS field ranges from:
-Genomics (focused on the genome)
-Proteomics (focused on large sets of proteins, the proteome)
-Metabolomics (focused on large sets of small molecules, the metabolome)

Types of Genomics
[Figure: genomics, spanning static DNA content and dynamic output, with branches: transcriptomics, interactomics, pharmacogenomics, epigenomics, proteomics, nutrigenomics. Is there more?]
HW: Name and define 3 more omics sciences.

Static DNA content, dynamic output
Dynamic components of the genome?
1- transposable elements
2- retrotransposons
3- transposons
(Long and short interspersed elements, LINEs and SINEs, are retrotransposons.)

Biological effects of transposable elements?
Sequence broadcasting, altering properties of genes, evolution, chromosomal rearrangements, epigenetic modification.

Genomics
The field of genomics has been divided into 3 major categories.
1- Genotyping (focused on the genome sequence): the physiological function of genes and the elucidation of the role of specific genes in disease susceptibility (Syvanen, 2001).
2- Transcriptomics (focused on genomic expression): the abundance of specific mRNA transcripts in a biological sample reflects the expression levels of the corresponding genes (Manning et al., 2007).
3- Epigenomics (focused on epigenetic regulation of genome expression): the study of epigenetic processes (changes in expression that do not involve changes in the DNA sequence) on a large, ultimately genome-wide, scale (Feinberg, 2007).

Proteomics
Proteomics provides insights into the role of proteins in biological systems.
The proteome consists of all proteins present in a specific cell type or tissue. It is highly variable over time and between cell types, and it changes in response to changes in its environment, which is a major challenge (Fliser et al., 2007).
The overall function of cells can be described by the proteins (intra- and intercellular) and the abundance of these proteins (Sellers et al., 2003).
Although all proteins are directly correlated to mRNA (the transcriptome), post-translational modifications (PTMs) and environmental interactions make it difficult to predict the proteome from gene expression analysis alone (Hanash et al., 2008).

Tools for proteomics
There are mainly two approaches, based on detection by mass spectrometry (MS) or by protein microarrays using capturing agents such as antibodies.
Major focuses: first, the identification of proteins and of proteins interacting in protein complexes; then, the quantification of protein abundance.
The abundance of a specific protein is related to its role in cell function at the given time (Fliser et al., 2007).

Introduction to Genomics

Genome Sequencing Projects
The static contents of the human genome, and its dynamic aspects, are similar in general features to what other genomes contain.
As genome sequencing techniques become easier, the field is progressing toward determining more and more human genome sequences, for disease anticipation and prevention.

Completed Genome Projects
-Many species have now had their genome sequenced, for at least one individual.

Human Genome Project
Begun formally in 1990 and planned to last 15 years (1990-2005).
18 countries participated, with significant contributions from the USA, UK, Germany, France, Japan and China.

Goals
-Identify all the genes in human DNA (estimated at about 100,000 at the time)
-Determine the sequences of the 3 billion chemical bases that make up human DNA
-Store this information in databases
-Develop faster, more efficient sequencing technologies
-Develop tools for data analysis
-Address the ethical, legal and social issues that may arise from the project

Recent progress
Dec 1999 - Human chromosome 22 completed (first human chromosome to be sequenced)
Mar 2000 - Drosophila genome completed
Apr 2000 - Completion of the draft sequence of human chromosomes 5, 16 and 19
May 2000 - Human chromosome 21 completed
June 2000 - Bill Clinton announced the completion of a “working draft” DNA sequence (90%) of the human genome
By 2003 - Completion of the HGP

100,000 Genomes Project timeline
https://www.genengnews.com/insights/the-human-genome-project-in-2020-hindsight/

Benefits of the HGP
-Alert patients who are at risk for certain diseases
-Reliably predict the course of disease
-Precisely diagnose disease and ensure the most effective treatment
-Develop new treatments at the molecular level

FAQs
Whose genome is being sequenced in the HGP?
Blood (female) or sperm (male) samples from a large number of donors, including J. Craig Venter, James D. Watson, etc.
What genomes have been sequenced completely?
-Several viruses and bacteria
-Yeast, roundworm and fruit fly
-The first plant genome was completed in 2000

The human genome
A human genome contains approximately 3.2 x 10^9 bp, distributed among 22 pairs of autosomes plus two X chromosomes in females, or an X and a Y chromosome in males.
The first draft human genome sequence was published in 2001.
Since then, advances in technology have made genomic sequencing cheaper and faster.
Challenges: understanding the genomic information, applying the data and analysing it.
Phenotype = genotype + environment + life history + epigenetics

International HapMap Project (http://hapmap.ncbi.nlm.nih.gov/)
Variations in sequences in populations distributed around the world; an atlas of SNPs.

The 1000-Genome Project (http://www.1000genomes.org/)
Extension of the HapMap Project towards complete genome data: sequencing of family groups, and detailed sequencing of 1000 protein-coding regions in 1000 individuals.

Several companies nowadays offer personal genome sequencing.

What are the reasons for sequencing non-human genomes?

The Tree of Life
Image from: http://yifanhu.net/TOL/tol_9_19_2011.jpg

Information from non-human genomes
-Evolutionary processes
-Conserved regions, particularly through comparative analysis with mammalian genomes
-Functional analysis
-Genomes of pathogens
-Improvement of crops and animals
-Endangered species
-History of the human species
-Help to understand the functions of the different regions of the human genome

Information from the human genome
-Clinical applications
-Genetic / genomic testing
-Genealogy
-Forensic DNA analysis
-Research: normal vs cancer cells
-Prevention, diagnosis, treatment
-Public availability of sequence data
-ELSI, e.g. privacy of individuals

The evolution and development of databases
High-throughput sequencing methods are generating immense amounts of data.
How can this information be archived and presented in useful forms?
This is the responsibility of databases.
Databases combine biological data with computer science and statistics.
Computer storage and software are essential for generating, collecting, archiving, curating, distributing, retrieving and analysing biological data.
Sources of biological data include several high-throughput streams:
-Systematic genome sequencing
-Protein expression patterns
-Metabolic pathways
-Protein interaction patterns and regulatory networks
-The scientific literature, including bibliographical databases (data mining)
✓Need for public availability: a databank without effective modes of access is merely a data graveyard
✓Developing software for information retrieval and analysis
✓Establishment of specialized institutions to organize the databases

Genome Browsers
A specific type of database aimed at presenting genomic sequences and related information is called a genome browser.
Genome browsers are projects designed to:
-Organize and annotate genome information
-Present it via web pages, together with links to related data such as evolutionary relationships or correlation with disease
-Provide tools for searching and analysis
Two major genome browsers are:
✓Ensembl
✓UCSC Genome Browser (University of California, Santa Cruz)

Comparative Genomics
Differences in gene sequences, differences in the corresponding amino acid sequences, and differences in three-dimensional structure reflect evolutionary divergence.
In general, divergence at the molecular level parallels the divergence of the species according to classical taxonomic methods.

Multiple sequence alignment
The basic tool for investigating sequence divergence is the multiple sequence alignment.
Conserved residues

What are some of the ethical, legal, and social challenges presented by genetic information?
-Who owns and controls genetic information?
-How reliable and useful is fetal genetic testing?
-Should testing be performed when no treatment is available?
-Do people’s genes make them behave in a particular way?
-Social inequalities?

Conventional Sequencing: Sanger Sequencing
The conventional DNA sequencing technique was developed by Frederick Sanger in the mid-1970s.
It is a reliable method for sequencing long DNA fragments.
It is known as the dideoxy method of Sanger.
HW: https://www.ncbi.nlm.nih.gov/pubmed/26554401
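
The slides name the dideoxy method without walking through it, so the following Python sketch may help as a rough illustration. It is a simplified model and not taken from the slides: in dideoxy sequencing, strand synthesis stops wherever a dideoxynucleotide is incorporated, so each of the four reactions (A, C, G, T) yields fragments whose lengths mark the positions of that base, and ordering all fragments by length reads off the sequence. The function names `sanger_fragments` and `read_gel` are hypothetical, and the model ignores primers, polymerase behaviour, labelling and read-length limits.

```python
def sanger_fragments(new_strand: str) -> dict[str, list[int]]:
    """For each ddNTP reaction (A, C, G, T), list the lengths of the
    chain-terminated fragments: one fragment per position where that
    base occurs in the newly synthesized strand (idealized model)."""
    lanes = {base: [] for base in "ACGT"}
    for position, base in enumerate(new_strand, start=1):
        lanes[base].append(position)
    return lanes


def read_gel(lanes: dict[str, list[int]]) -> str:
    """Read the sequence by ordering fragments from shortest to longest,
    as on a gel, and noting which reaction each fragment came from."""
    by_length = {
        length: base
        for base, lengths in lanes.items()
        for length in lengths
    }
    return "".join(by_length[length] for length in sorted(by_length))


lanes = sanger_fragments("GATTACA")  # hypothetical 7-base example strand
print(lanes)
print(read_gel(lanes))
```

In this toy model each fragment length appears in exactly one lane, which is why sorting by length is enough to recover the strand; real reads are noisier and limited in length.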
