The Human Genome PDF BGEN3022
Rady Faculty of Health Sciences
2024
Paul Marcogliese
Summary
This document summarizes a lecture on the human genome. It explains the composition of the human genome, how it differs from the genomes of simpler organisms, and the concept of a gene. It also covers RNA types and gene families, with examples such as Hox genes and hyaluronidases.
Full Transcript
Rady Faculty of Health Sciences
BGEN3022: The Human Genome
September 12th 2024
Paul Marcogliese (Mar – ko – yay – zeh), PhD (he/him/his)
Contact: [email protected] www.marcoglieselab.com @PCMarcogliese

Learning Objectives
What is the composition of the human genome?
How does the human genome differ from those of simpler organisms?
What is a gene?
Gene families: redundancies, specialization and pseudogenes.
Types of RNAs (lncRNAs, XIST and microRNAs).
READING: Chapters 2, 3, Thompson and Thompson, 8th Edition; Chapter 9, Strachan and Read, 4th Edition

What is a Genome?
The complete set of DNA sequence of a cell, organism, or DNA virus (for an RNA virus, the RNA sequence).
How many different genomes are in a typical human cell? Two: the nuclear genome and the mitochondrial genome.
A red blood cell? (Note: it is just there for hemoglobin.)

Mitochondrial Genome (mtDNA)
Note: comes from mom.
16.6 kb long (very small), no histones, highly redundant (each mitochondrion has multiple copies of the mtDNA genome).
Contains 37 genes in total: 2 rRNAs, 22 tRNAs, 13 oxidative phosphorylation proteins.
Mutation rate is 10x that of nuclear DNA (note: lacks protective histones).
Compact genome: no “extra” DNA.
Mitochondrial disease can result from mutations in mtDNA or nuclear DNA!
Note: bigger genome -> a lot of junk DNA.
Strachan and Read, Ch 9

Human (Nuclear) Genome Project
Used old methods of cloning, mapping and Sanger sequencing.
Published 2001.
3 billion bp (3,000 Mb) of DNA.
Computer predictions from raw data to find open reading frames, splice sites etc.
Predictions verified with cDNA sequencing.
Total of ~25,000+ protein-encoding genes.

Which has the bigger genome?
Humans or C. elegans (soil worm)?
100,000 distinct human proteins had been observed on 2-dimensional gels.
If one gene makes one protein, there should be >100,000 genes in the human genome.

Genome Comparisons
Species       Mb       Genes*    Chrom.
Human         3,000    25,000    46
C. elegans    100      23,000    12
Arabidopsis   140      27,000    10
Wheat         16,800   95,000    42
*Protein-encoding genes
Note: wheat's high gene count is due to polyploidy; genome size and gene number do not have a 1-1 relationship.
Adapted from D. Merz

Human Genome
The human genome gets more information content from its 25,000 genes (note: due to transcription/translation).
Information content of the human genome is much greater due to transcriptional regulatory complexity (e.g. alternative splicing and start sites).
Human proteome contains over 100,000 proteins (many more if you consider post-translational modifications).
One gene makes more than one protein.
Latorre & Silva, Mètode Science Studies Journal, 2014

Human Genome Complexity
Multiple transcriptional start sites at a single gene.
Alternative splicing is common.
Complex transcriptional regulatory regions in non-coding sequences (upstream, downstream, intronic, distant).
Many post-translational modifications (glycosylation etc.) create even greater diversity in the proteome.
Note: within a gene there can be another gene nested in it.
Strachan and Read, Ch 9

Take home message: strategies for organizing a genome
(Figure: two panels, each mapping DNA to proteins.)
A single gene can encode multiple proteins, i.e. 25K human genes, >100K human proteins.
Note: one gene makes multiple proteins.
The image on the right is closer to reality!
Adapted from D. Merz

Human Genome Content
1% of DNA encodes proteins.
4% of DNA encodes “non-coding” RNAs (that act as RNAs and are not translated to proteins).
95% of DNA is transposable element repeats, heterochromatin or “other” sequences.
Strachan and Read, Ch 9

Gene Families
Gene families contain multiple copies of closely related genes that arose from a single ancestral gene. These are called paralogues.
Note: multiple genes in one species derived from an ancestral gene.
They are often found in clusters (i.e. right next to one another) on chromosomes.
Gene families arise through duplication events.

Gene Duplication Mechanisms
1. Tandem duplication
2. Chromosomal translocations
3. DNA polymerase slippage in replication
4. Transposable elements
5. Whole genome duplication

1. Tandem Duplication (aka Unequal Recombination/Crossover)
Regions of repetitive DNA sequence can cause “confusion” during recombination and result in unequal crossing over.
Note: with repeats it is unclear where the crossover should occur.
Net gain and loss (deletion) of DNA material.
Can involve small or large regions.
Can replicate genes into gene “clusters” at the same chromosomal location.
Strachan and Read, Ch 9

2. Translocations (aka Duplicative Transposition by Recombination)
Exchange of DNA between different chromosomes (note: any part of the genome).
Can be reciprocal (an exchange) or uni-directional.
Results in duplications in other chromosomal locations (near centromeres).

3. DNA Polymerase Slippage
Note: occurs when replicating; can cause a duplication within a gene.
(Figure: CAG repeat example.)
Causes small duplications, insertions, deletions.
One of the main causes of DNA variants/mutations.
Not likely to duplicate an entire gene.
To be clear: DNA slippage during replication causes a loop/bubble, and it is endogenous DNA repair mechanisms that end up treating the bubble as “real” and expanding it.
Slippage happens with repetitive regions.

4. Transposable Elements (aka “Jumping genes”)
Class 1: Retrotransposons (copy and paste)
Class 2: DNA transposons (cut and paste)
Note: moving transposable DNA.
To be clear: see https://www.youtube.com/watch?v=JEBuiImSY2s for some additional clarity.

Class 1: Retrotransposition
Presence of Long Terminal Repeats (LTRs).
Use an RNA intermediate.
Transposons encode a reverse transcriptase enzyme (RNA-to-DNA transcription), like a retrovirus.
About 40% of the human genome.
Transposed sequences are typically not functional as they lack regulatory sequences (promoter etc.).
No introns.
Usually result in pseudogenes.
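
To make the “copy and paste” idea concrete, here is a minimal toy sketch in Python of why a retrotransposed copy usually ends up as an intronless, promoterless pseudogene. Everything in it is an illustrative assumption rather than real genomics software: exons are written in uppercase, introns in lowercase, `PROMOTER` is a made-up tag, and the function names `transcribe_and_splice` and `retrotranspose` are hypothetical. Strand orientation and the enzymology of reverse transcription are ignored.

```python
# Toy model, not real biology software. Assumptions: exons are uppercase,
# introns are lowercase, and PROMOTER is a made-up regulatory tag.
PROMOTER = "TATAAA"


def transcribe_and_splice(gene: str) -> str:
    """Return the spliced transcript: the promoter is not transcribed, introns are removed."""
    body = gene[len(PROMOTER):] if gene.startswith(PROMOTER) else gene
    return "".join(base for base in body if base.isupper())


def retrotranspose(genome: str, gene: str, insert_at: int) -> str:
    """Copy and paste: the source gene stays put and a reverse-transcribed copy is inserted."""
    cdna = transcribe_and_splice(gene)  # RNA intermediate copied back into DNA
    return genome[:insert_at] + cdna + genome[insert_at:]


# promoter + exon 1 + intron + exon 2, flanked by placeholder sequence (N)
gene = PROMOTER + "ATGGCC" + "gtaagtcag" + "GGATAA"
genome = "NNNN" + gene + "NNNNNNNN"
new_genome = retrotranspose(genome, gene, insert_at=len(genome) - 4)

copy = transcribe_and_splice(gene)
print(copy)                            # the inserted copy: exons only
print(gene in new_genome)              # is the original gene still present?
print(PROMOTER + copy in new_genome)   # did the copy bring a promoter along?
```

In this sketch the original gene is untouched, while the new copy has neither the intron nor the promoter, which matches the slide's points that transposed sequences lack regulatory sequences and have no introns.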

Class 2: DNA Transposition
“Cut and paste” (note: happens anytime).
Mobile DNA elements (2% of human genome).
No RNA intermediate.
Encode transposase enzymes needed to excise and re-insert.
Can carry other sequences.
Recognized in DNA by the inverted repeats that flank them.
Can result in functional genes.

5. Whole Genome Duplication
Polyploidy: extra copies of chromosomes.
Mechanism: chromosomal non-disjunction.
Human genome: two rounds of whole genome duplication occurred early in vertebrate evolution.
Teleost fish: underwent an additional genome duplication.
Strachan and Read, Ch 13

Gene Duplication Effects on Genes
Duplications produce Copy Number Variations (CNVs).
1. Redundancy
2. Specialization (sequence or expression) (note: gene family members with specific expression)
3. Degradation (note: one of the genes will get degraded)
Examples: Hox clusters, Hyaluronidases

Duplication can Cause Redundancy
Two genes with the same function.
Loss of either will have little or no effect.
Loss of both may have a significant effect.
Degradation (by mutation) of one copy can occur in the absence of selection.
This can result in a pseudogene (identified by stop codons or frameshifts that disrupt the open reading frame).

Duplication can Cause Specialization
Mutations can also alter the activity and/or expression of a duplicated gene, giving it a novel or specialized function (and thus reducing redundancy).
Can be alterations to coding sequence or to transcriptional regulatory sequences.
The “new” gene now has a unique function that may be important.
There will be selection against mutations that eliminate function.

Example 1: Hox Gene Family
Hox genes: a family of homeobox transcription factors that are differentially expressed along the rostrocaudal (head-to-tail) axis during development.
They give identity to developing tissues in different places along this axis.
https://www.khanacademy.org/science/biology/developmental-biology/signaling-and-transcription-factors-in-development/a/homeotic-genes (modified from Hox genes of fruit fly, by PhiLiP, public domain)

Hox Clusters
Invertebrates have 1 Hox cluster.
Humans have 4 Hox clusters.
Fish have 7 Hox clusters.
To be clear: see https://www.youtube.com/watch?v=HwrXeQTCcXY for some additional clarity, but more than you need to know.

Hox Clusters
Genome duplication (x2) to get 4 clusters.
Unequal recombination causes further duplications within clusters.
Specialization and degradation within clusters to stabilize or eliminate family members.
How do fish get 7 clusters?

Hox Gene Redundancy
Effects of Hox gene loss-of-function (lof) mutations:
Lethal in Drosophila.
But in humans, only subtle effects are caused by any single mutation in a Hox gene, e.g. HOXD13 mutations cause shortened or fused fingers.

Example 2: Hyaluronidases (Hyals)
Enzymes that degrade hyaluronan (a glycosaminoglycan).
Nematodes have only one hyaluronidase.
Humans have two clusters of three paralogues each, on different chromosomes.
Slightly different amino acid sequences.
Different enzymatic activities (pH).
Different tissue expression patterns.
How did this arise? Note: unequal recombination.

Example 2: Hyaluronidases
Loss of a HYAL gene causes subtle defects.
Accumulation of the substrate (HA), lysosomal storage disorders (mucopolysaccharidoses).
But not lethal. Why? Note: paralogs can compensate for the loss.

Pseudogenes
Look like genes but do NOT produce functional proteins.
Usually degraded redundant genes or retrotransposed genes that have accumulated mutations in the absence of selection.
To be clear: see https://www.youtube.com/watch?v=bLEw6O0a4zI for some additional clarity, but more than you need to know.
Two types:
Nonprocessed pseudogenes: vestigial genes inactivated by mutations in critical coding or regulatory sequences (note: there was a functional gene that became inactivated).
Processed pseudogenes: pseudogenes that have been formed by transposable elements (note: jumping genes making non-functional genes).
Are they without function? Some can still be transcribed.

Coding v Non-coding Genome (note: take home message)
25,000 protein-encoding genes comprise only 1% of the genome.
But 57% of the genome is transcribed (RNA genes, transposons, pseudogenes).
There is a cost in terms of energy used, so there must be a function?

RNA Types in the Human Genome
(Figure.) Strachan and Read, Ch 9

RNA Types in the Human Genome
Classical RNAs function in producing proteins: mRNA, rRNA, tRNA.
Non-coding RNAs do not produce proteins: long ncRNAs (e.g. XIST), microRNAs, snoRNAs, scaRNAs, snRNAs, piRNAs, exRNAs, siRNAs.
lncRNA molecules can adopt complex secondary and tertiary folding structures, just like polypeptides.
Strachan and Read, Ch 9

XIST (X-inactive specific transcript)
Note: Barr bodies.
lncRNA transcribed only from the inactive X chromosome.
Acts with other ncRNAs and proteins from the X Inactivation Centre (XIC).
Associates with the inactive X chromosome and is essential for its inactivation.
Thompson and Thompson, 8th Edition

MicroRNAs
MicroRNAs (miRNAs) regulate the expression of certain target genes by binding to the mRNAs they produce (1776 genes).
Important regulators of gene expression.
22 nt long after processing.
Bind to complementary sequence in the 3’UTRs of target mRNAs (UTR = untranslated region).
MicroRNAs are negative regulators through translational repression or mRNA degradation.
Linked to a variety of human diseases.
Garzon et al., Trends in Molecular Medicine, 2006

Bias in Human Genomics Studies
Sirugo et al., Cell, 2019

Human Genome Diversity
There is more genetic diversity within African populations than any other group.
Schlebusch et al., Science, 2017

Summary of RNA classes
(Figure.) Strachan and Read, Ch 9

So then … What is a Gene?
Central dogma: DNA -> RNA -> protein
But this view arose from phage and bacterial genomes.
One gene, one protein? But many (or most) transcriptional events do not lead to proteins.
DNA -> RNA (-> protein)

What is a Gene?
Regulatory elements can lie within other genes, can be distant, or can be shared.
Genes can lie within the introns of other genes.
Genes can be co-transcribed (poly-cistronic).
What are the physical boundaries of a gene?
How do you say a DNA variant affects a specific gene? (Next class!)

Summary of Objectives
Content of the human genome (1% protein-encoding genes etc.).
Types of RNA-encoding genes (4% of genome).
How gene families (e.g. Hox) arise via duplication events: redundancies, specialization, pseudogenes/degradation.
The challenge of defining what exactly is a gene.

Questions to ponder
What is the composition of the human genome?
How does the human genome differ from those of simpler organisms?
Why does the human genome have all this “junk” DNA?
Why not just have 100,000 genes and get rid of the “junk”?
If a stray gamma ray hits the DNA within a nucleus, what is it likely to hit?
What is a gene?

Resources
Note: we use these in research all the time!
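
As a rough worked example for two of the questions above, the sketch below turns the Genome Comparisons table into gene densities (protein-encoding genes per Mb) and uses the 1% / 4% / 95% split from the Human Genome Content slide to ask which category a random position is most likely to fall in. The numbers are copied from the slides; the variable and function names are illustrative, and the "random hit" step assumes every base is equally likely to be hit, which is a simplification and not a claim from the lecture.

```python
# Figures copied from the Genome Comparisons table: (size in Mb, protein-encoding genes).
genomes = {
    "Human": (3_000, 25_000),
    "C. elegans": (100, 23_000),
    "Arabidopsis": (140, 27_000),
    "Wheat": (16_800, 95_000),
}


def gene_density(size_mb: float, genes: int) -> float:
    """Protein-encoding genes per megabase of genome."""
    return genes / size_mb


density = {species: gene_density(mb, n) for species, (mb, n) in genomes.items()}
for species, value in density.items():
    print(f"{species:12s} {value:7.1f} genes per Mb")

# Human Genome Content slide: fraction of the genome in each category.
# Assumption: a hit lands uniformly at random along the sequence.
content = {"protein-coding": 0.01, "non-coding RNA": 0.04, "other": 0.95}
most_likely_hit = max(content, key=content.get)
print(most_likely_hit)
```

On these figures the human genome is far less gene-dense than the C. elegans or Arabidopsis genomes, which is one way to state how it differs from those of simpler organisms.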