LEC 12 - Human Genomics PDF
UWI Cave Hill
Dr. A. T Alleyne
Summary
This document provides lecture notes on the human genome, bioinformatics, structure of the human genome and associated topics. It discusses various aspects, including structural components, comparison to other genomes, CpG islands, SNPs, haplotypes, repetitive sequences, and genome-wide association studies and their applications in clinical medicine.
Full Transcript
LEC 12: THE HUMAN GENOME
BIOC 3265 - Principles of Bioinformatics
Dr. A. T Alleyne - UWI Cave Hill

LEARNING OUTCOMES
At the end of this lecture you should be able to:
1. Describe the general structural components of the human genome
2. Compare the human genome with other sequenced genomes
3. Explain what a CpG island, SNP, haplotype and repetitive sequence are
4. Describe the methods and applications of genome-wide association studies (GWAS) and whole exome sequencing (WES)
5. Discuss the use of genomics research in clinical medicine
6. Provide examples of genomic applications in medicine

HUMAN GENOME BY NUMBERS
- Approx. 3 billion base pairs
- Average gene consists of 3000 bases
- Coding genes: ~20,000 (30,000 in earlier reports)
- 99.9% of bases are the same in all people
- Alu repeat elements: 11% of the genome, contained in GC-rich areas

GENOME ARRANGEMENT
- Gene-dense areas are predominantly composed of the DNA building blocks G and C. Gene-poor areas are rich in A and T, and protein-coding genes are associated with higher GC content.
- An individual genome has ~3-4 million SNPs. Fewer than 1% of SNPs alter protein sequence.
- Stretches of up to 30,000 C and G bases repeating over and over often occur adjacent to gene-rich areas, forming CpG islands, which are believed to help regulate gene activity.
- Chromosome 1 has the most genes (2968), and the Y chromosome has the fewest (231).

HUMAN GENE CHARACTERISTICS

CpG ISLANDS
- Unmethylated regions of the genome associated with the 5' ends of housekeeping and regulatory genes
- In cancer cells, many CpG islands become hypermethylated
- Short sequence ranges where the observed/expected (Obs/Exp) CpG value is greater than 0.6 and the GC content is greater than 50%
- Figure taken from http://www.nature.com/nrurol/journal/v10/n6/fig_tab/nrurol.2013.89_F1.html
- Outside CpG islands, the cytosine of a CpG dinucleotide is usually methylated
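The two thresholds given for CpG islands (Obs/Exp greater than 0.6, GC content greater than 50%) can be checked on a sequence window with a few lines of Python. This is a minimal sketch, not the lecture's own method. It assumes the commonly used definition Obs/Exp = (number of CpG × N) / (number of C × number of G), which the slide does not spell out. The function names `cpg_stats` and `looks_like_cpg_island` are hypothetical.

```python
from typing import Tuple


def cpg_stats(seq: str) -> Tuple[float, float]:
    """Return (GC fraction, observed/expected CpG ratio) for a DNA sequence."""
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # observed CpG dinucleotides
    gc_fraction = (c + g) / n if n else 0.0
    # Assumed definition: expected CpG count if C and G occurred independently.
    expected = (c * g) / n if n else 0.0
    obs_exp = cpg / expected if expected else 0.0
    return gc_fraction, obs_exp


def looks_like_cpg_island(seq: str, min_gc: float = 0.5, min_obs_exp: float = 0.6) -> bool:
    """Apply the two thresholds quoted on the slide to one sequence window."""
    gc_fraction, obs_exp = cpg_stats(seq)
    return gc_fraction > min_gc and obs_exp > min_obs_exp


if __name__ == "__main__":
    for window in ("CGCGCGCGCG", "CCCCCGGGGG", "ATATATATAT"):
        print(window, cpg_stats(window), looks_like_cpg_island(window))
```

A GC-rich window is not enough on its own: `CCCCCGGGGG` is 100% GC but contains a single CpG, so its Obs/Exp ratio falls below the threshold. Real island finders also apply a minimum window length, which this sketch leaves out.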
- Methyl-CpG binding proteins recruit histone deacetylases

REPETITIVE SEQUENCES
- Non-coding repeated sequences make up at least 50% of the human genome
- They give insight into chromosome structure and dynamics, and reshape the genome by rearranging it, creating entirely new genes, and modifying and reshuffling existing genes
- The human genome has a much greater proportion (50%) of repeat sequences than the worm (7%) and the fly (3%)

INTERSPERSED REPEATS
1. LINEs (21% of human genome)
2. SINEs (13%)
3. Long terminal repeat transposons (8%)
4. DNA transposons (3%)

MINI AND MICRO SATELLITES
- Simple sequence repeats (SSRs) are repeats of k-mers
- Microsatellites have k = 1 to 12 bp
- Minisatellites have k = ~12 to 500 bp

SINGLE NUCLEOTIDE POLYMORPHISMS
- SNPs occur once in every 300 nucleotides on average, giving approx. 10 million SNPs in the human genome
- When SNPs occur within a gene or in a regulatory region near a gene, they may play a more direct role in disease by affecting the gene’s function
- Some diseases have been linked to SNPs:
  - Factor V (Leiden): blood clotting disease
  - Hypertension: some rare cases
  - Cystic fibrosis
  - Crohn’s disease
  - Parkinson’s
- http://learn.genetics.utah.edu/content/precision/snips/
- http://www.ncbi.nlm.nih.gov/SNP/

HAPLOTYPES

GENOME WIDE ASSOCIATION STUDIES
- Genome-wide association studies (GWAS) are used to identify genes involved in human disease.
- The genome is searched for SNPs that occur more frequently in people with a particular disease (cases) than in people without the disease (controls).
- SNPs act as biological markers. When a SNP occurs within a gene or in a regulatory region near a gene, it may have a direct role in disease by affecting the gene’s function.
- SNPs do not always affect the way a protein functions.
- Linkage disequilibrium (LD) occurs when SNPs are tightly linked to each other and form blocks.

SNPS AS DISEASE MARKERS
1. Functional variation. A SNP associated with a nonsynonymous substitution in a coding region will change the amino acid sequence of a protein.
2. Regulatory variation. A SNP in a noncoding region can influence gene expression.
3. Association. SNPs can be used in whole-genome association studies. SNP frequency is compared between affected and control populations.

THE HUMAN GENOME
- Useful in finding human SNP variation in association with specific disease traits: https://www.ncbi.nlm.nih.gov/snp/rs4247357#seq_hash
- The current assembly is GRCh38 (GRC refers to the Genome Reference Consortium at http://www.ncbi.nlm.nih.gov/projects/genome/assembly/grc/)

SINGLE NUCLEOTIDE POLYMORPHISMS (SNPs) AND HAPLOTYPES

- Whole exome sequencing (WES) has been useful for identifying variants that cause monogenic disorders.
- Mendelian diseases are typically caused by mutations affecting the coding region of a gene.
- The focus is on a small subset of the genome (∼60 megabases), enriched for functionally relevant loci.
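The association step listed under SNPs as disease markers, where SNP frequency is compared between affected and control populations, is often carried out one SNP at a time on a 2x2 table of allele counts. The sketch below is an illustration under stated assumptions, not the procedure of any particular GWAS package. It uses a Pearson chi-square test with 1 degree of freedom plus an odds ratio, and it assumes every row and column total is nonzero. The function name `allelic_association` and all counts are hypothetical.

```python
import math


def allelic_association(case_alt: int, case_ref: int, ctrl_alt: int, ctrl_ref: int):
    """Return (chi-square, p-value, odds ratio) for a 2x2 allele-count table.

    Rows are cases and controls; columns are alternate and reference alleles.
    Assumes all row and column totals are nonzero.
    """
    table = [[case_alt, case_ref], [ctrl_alt, ctrl_ref]]
    total = case_alt + case_ref + ctrl_alt + ctrl_ref
    row_sums = [case_alt + case_ref, ctrl_alt + ctrl_ref]
    col_sums = [case_alt + ctrl_alt, case_ref + ctrl_ref]

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Count expected if allele frequency were equal in cases and controls.
            expected = row_sums[i] * col_sums[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected

    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    odds_ratio = (case_alt * ctrl_ref) / (case_ref * ctrl_alt)
    return chi2, p_value, odds_ratio


if __name__ == "__main__":
    # Hypothetical allele counts: 1000 case alleles and 1000 control alleles.
    chi2, p, odds = allelic_association(300, 700, 200, 800)
    print(f"chi2={chi2:.2f}  p={p:.2e}  OR={odds:.2f}")
```

In a genome-wide scan this test is repeated for every SNP, so the p-value threshold has to be far stricter than 0.05 to account for multiple testing.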
- Whole genome sequencing (WGS) detects 3-4 million single nucleotide variants (SNVs) per individual, substantially more than a SNP array
- Trio-based WES or WGS is often used to study complex diseases
- Interpretation of variants relevant to the phenotype is challenging

https://www.who.int/healthinfo/global_burden_disease/projections2004/en/

MEDICAL GENOMICS
Molecular medicine:
- Improve diagnosis of disease
- Detect genetic predispositions to disease
- Create drugs based on molecular information
- Design “custom drugs” (pharmacogenomics) based on individual genetic profiles and/or use gene therapy approaches

MONOGENIC DISORDERS

GENETIC VARIATION
- http://www.youtube.com/genometv#p/u/1/RK2qDAA_u20
- Multi-omics in Health and Disease (Welcome and Workshop Rationale) - YouTube
- http://www.1000genomes.org/home
- HGDP | IGSR data collection (internationalgenome.org)
- https://youtu.be/Xsyp0qqKzkY?si=dl9CltmQjU6e3_OV

1000 GENOMES PROJECT (2008-2015)
1. Sequenced each sample to 4x genome coverage to allow the detection of most variants with frequencies as low as 1%.
2. Data from 2,504 samples were combined to allow genotyping of each sample at all the possible variant sites discovered.
3. Provides a publicly available resource of almost all genetic variants with a frequency of at least 1% in 2500 genomes of European, African and Asian ancestry.
4. Sample data from several Barbados studies are included under Afro-Caribbean variants data in the project.
5. Uses NextGen sequencing of whole genomes.

THE INTERNATIONAL GENOME SAMPLE RESOURCE (IGSR)
Its goals:
1. Ensure the future usability of the 1000 Genomes reference data
2. Incorporate published genomic data on the 1000 Genomes samples
3. Expand the data collection to include new populations
1000 Genomes | A Deep Catalog of Human Genetic Variation (internationalgenome.org)

SNPS AND ASTHMA
- GWAS are the preferred method for the identification of genetic determinants in complex multifactorial diseases such as asthma.
- These studies identify polymorphisms with small effect sizes and localise susceptibility regions due to linkage disequilibrium (LD).
- The beta2-adrenergic receptor (ADRB2) gene polymorphism is a major determinant of bronchodilator response to albuterol.
- The variant at Arg16 of the ADRB2 protein changes an arginine residue and results in changes in in vitro receptor function.
- Consistent clinical evidence now shows that, in vivo, patients with asthma harbouring the Arg-16 genotype may experience reduced lung function and an increased frequency of exacerbations when treated with regular short-acting β-agonists.
- http://www.snpedia.com/index.php/Asthma

CANCER GENOMICS
Cancer is a somatic disease, arising from a cell clone that has acquired somatic mutations, which lead to malignant transformation as these cells proliferate.
The goals of cancer genomics are to:
1. Enhance understanding of the molecular mechanisms of cancer
2. Improve the prevention, early detection, diagnosis, and treatment of cancer
http://ocg.cancer.gov/

BREAST CANCER AND HERCEPTIN
- Herceptin is an intravenous drug used as part of a chemotherapy regimen against breast cancer that has metastasized.
- It is only effective in women with a genetic defect which results in the overproduction of the HER2 receptor.
- Herceptin is directed against the receptor, so it only helps women who have an increased number of copies of the relevant gene.
- Herceptin should be used only after a genetic test.
- http://genomemag.com/how-personalized-medicine-is-changing-breast-cancer/#.We3KrFtSzIU
- Herceptin targets only those cancerous cells, eliminating the use of excessive chemotherapy.
- http://i2.wp.com/sitn.hms.harvard.edu/wp-content/uploads/2015/11/nicholes_herceptin.png

MICROBIAL GENOMICS AND MAN
- Microbial genomes have been sequenced for research and medicine
- Pathogenic microbes (some examples of associated diseases):
  - Meningitis
  - Anthrax
  - Ulcers
  - Tuberculosis
  - Malaria
  - Cancer

DNA/RNA VACCINES
- Microbial surface proteins are used as antigen targets
- Recombinant antigens are used to test the efficacy of new vaccines
- Used in the development of the N. meningitidis vaccine
- N. meningitidis is responsible for meningococcal disease in 32% of the US population (e.g. meningitis and sepsis)
- The genome of N. meningitidis was sequenced in 2000
- Covid-19 Pfizer vaccine

CANCER GENOMIC PROJECTS
- Cancer Genome Characterization Initiative (CGCI)
  - Provides genomic data for the identification of abnormalities in tumor cells, e.g. Burkitt’s lymphoma
- Cancer Target Discovery and Development (CTD2)
  - Focus is on cancer therapeutics and bioinformatics approaches to sequence data
- Therapeutically Applicable Research to Generate Effective Treatments (TARGET)
  - Identification of therapeutic targets in childhood cancers, e.g. ALL, AML etc.

DNA vaccination: injection of genetically engineered DNA (e.g. of pathogens) into cells so that they produce an antigen, thereby directly initiating an immunological response.
- https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3202319/pdf/cir334.pdf
- https://upload.wikimedia.org/wikipedia/commons/thumb/d/da/Making_of_a_DNA_vaccine.jpg/373px-Making_of_a_DNA_vaccine.jpg

GENETIC TESTS
A genetic test is a medical test that identifies changes in chromosomes, genes, or proteins.
Genetic testing may be used for medical management and for personal decision-making.
Genetic test results apply to the patient and also to other family members.
Most genetic disorders are rare, so testing is done by specialized laboratories.
For genetic testing to yield meaningful results:
- multiple test methodologies may be required
- other family members may need to be tested
- a genetics consultation may be appropriate

EXAMPLES: GENETIC TESTS
Currently there are several genetic tests available, not all of which are FDA approved.
http://www.genome.gov/19516567

Gene target          Disease
CFTR                 Cystic fibrosis
FMR-1                Fragile X syndrome
Viral RNA            HPV
Bcr-abl              Leukemias
BRCA1 and BRCA2      Breast cancer
K-Ras and p53        Colon cancer
16S rRNA             N. gonorrhoeae
SARS-CoV-2 gene      Covid-19

GENETIC COUNSELING
- Genetic counseling helps people understand and adapt to the medical, psychological and familial implications of genetic contributions to disease.
- It integrates:
  - Interpretation of family and medical histories to assess the chance of disease occurrence or recurrence
  - Education about inheritance, testing, management, prevention, resources and research
  - Counseling to promote informed choices and adaptation to the risk or condition
- http://www.nsgc.org/

GENE THERAPY
Gene therapy is the process of correcting genetic defects. There are several approaches:
1. A normal gene may be inserted into a nonspecific location within the genome to replace a non-functional gene.
2. An abnormal gene could be swapped for a normal gene through homologous recombination.
3. The abnormal gene could be repaired through selective reverse mutation, which returns the gene to its normal function.
4. The abnormal gene could be removed or replaced through gene editing.
5. The regulation of a particular gene could be altered.

- Sickle cell and Gene therapy treatment
- Understanding gene therapy approaches

DISEASE DIAGNOSIS
- Gene expression patterns for complex diseases such as cancer, diabetes, and neurological diseases (Parkinson’s, Alzheimer's) have been studied.
- Disease diagnosis has improved through the use of experimental procedures such as:
  - DNA microarray analysis
  - Proteomics
  - PCR
  - SNP analysis

PHARMACOGENOMICS
The relationship between drug response and the genome of an individual, group or population.
- Design of individual therapies
- Improves disease diagnosis
- Side effects of drugs can be better studied
- Differences in drug reactions in populations
https://www.genome.gov/27530645/faq-about-pharmacogenomics/

PHARMACOGENOMICS OBJECTIVES
- Correct dosage: targeting drug dose to the patient’s metabolism
- Efficacy: knowing which drug to use
- Safety: drug alternatives if a drug does not work, and prediction of adverse reactions
- Prevention: early interventions if genetic tests are done
Image taken from: https://vignette.wikia.nocookie.net/mmg-233-2014-genetics-genomics/images/7/78/The_role_of_pharmacogenomics_1%281%29.jpg/revision/latest?cb=20140828163706

WARFARIN
- CYP2C9 polymorphisms are associated with an increased risk of over-anticoagulation.
- Screening for CYP2C9 variants allows clinicians to develop dosing protocols and surveillance in patients receiving warfarin.
- Too high a dose carries a high risk of hemorrhage, and too low a dose will be ineffective.
- Dose variability is related to variations in CYP2C9 and vitamin K epoxide reductase complex subunit 1 (VKORC1).
- The FDA in 2007 noted the relevance of genetic testing for VKORC1 and CYP2C9 variants in prescribing warfarin.
- Variants in both VKORC1, which recycles the oxidized form of vitamin K1 to the reduced hydroquinone form, and CYP2C9, which predominantly metabolizes (S)-warfarin to the 7-hydroxylated form as the major metabolite, determine the warfarin dose required in atrial fibrillation (AF) and venous thromboembolism (VTE).
http://www.bloodjournal.org/content/118/11/2938?sso-checed=true

SUMMARY: GENOMICS AND MEDICINE
Applications:
- Improvements in cancer genomics
- Improved diagnosis of the molecular basis of complex diseases
- High-throughput genotyping methods
- Development of pharmacogenomics, e.g. gene therapy and DNA vaccines
- Discovery of new gene targets of disease susceptibility