BGEN 3022 Lecture 3 Human Genome PDF

Document Details

Rady Faculty of Health Sciences, University of Manitoba

Paul Marcogliese

Tags

human genome molecular genetics biology

Summary

This document, part of BGEN 3022, provides details on the human genome. It covers the composition of the human genome, how it differs from the genomes of simpler organisms, what a gene is, and gene families. The lecture also describes types of RNA, the mitochondrial genome, and the human (nuclear) genome project, including its methods and summary statistics.

Full Transcript

Rady Faculty of Health Sciences
BGEN3022
The Human Genome
September 12th 2024
Paul Marcogliese (Mar-ko-yay-zeh), PhD (he/him/his)
Contact: [email protected]
www.marcoglieselab.com
@PCMarcogliese

Learning Objectives
What is the composition of the human genome?
How does the human genome differ from those of simpler organisms?
What is a gene?
Gene families: redundancies, specialization and pseudogenes.
Types of RNAs (lncRNAs, XIST and microRNAs).
READING: Chapters 2, 3 Thompson and Thompson, 8th Edition; Chapter 9 Strachan and Read, 4th Edition

What is a Genome?
Complete set of DNA sequence of a cell, organism, or DNA virus (also includes RNA sequence in RNA virus).
How many different genomes are in a typical human cell?
A Red Blood Cell?

Mitochondrial Genome (mtDNA)
16.6 kb long (very small), no histones, highly redundant (each mt has multiple copies of the mtDNA genome).
Contains 37 genes in total: 2 rRNA, 22 tRNAs, 13 oxidative phosphorylation proteins.
Mutation rate 10x that of nuclear DNA.
Compact genome - no "extra" DNA.
Mitochondrial disease can result from mutations in mtDNA or nuclear DNA!
Strachan and Read, Ch 9

Human (Nuclear) Genome Project
Used old methods of cloning, mapping and Sanger sequencing.
Published 2001.
3 billion bp (3,000 Mb) of DNA.
Computer predictions of raw data to find open reading frames, splice sites etc.
Predictions verified with cDNA sequencing.
Total of ~25,000+ protein-encoding genes.
Strachan and Read, Ch 9

Which has the bigger genome?
Humans
C. elegans (soil worm)
Largest size in base pairs of DNA?
Greatest number of predicted genes?

Large Animal = Large Genome?
It was expected that humans would have many more genes in their genome than smaller animals.
100,000 distinct human proteins had been observed on 2-dimensional gels.
If one gene makes one protein, there should be >100,000 genes in the human genome.

Genome Comparisons
[Figure: # of kb and # of genes for Human, C. elegans, Arabidopsis and Wheat. Slide update!!!]

Species        Mb       Genes*    Chrom.
Human          3,000    25,000    46
C. elegans     100      23,000    12
Arabidopsis    140      27,000    10
Wheat          16,800   95,000    42
*Protein-encoding genes

Note that C. elegans and Arabidopsis have more "compact" genomes than ours. The same is true for Drosophila: not much "junk DNA", aka intergenic space.
Adapted from D. Merz

Human Genome
The human genome gets more information from its 25,000 genes.
Information content of the human genome is much greater due to transcriptional regulatory complexity (e.g. alternative splicing and start sites).
Human proteome contains over 100,000 proteins (many more if you consider post-translational modifications).
One gene makes more than one protein.
Latorre & Silva, Mètode Science Studies Journal, 2014

Human Genome Complexity
Multiple transcriptional start sites at a single gene.
Alternative splicing is common.
Complex transcriptional regulatory regions in non-coding sequences (upstream, downstream, intronic, distant).
Many post-translational modifications (glycosylation etc) create even greater diversity in the proteome.
Strachan and Read, Ch 9

Take home message: strategies for organizing a genome
[Figure: two panels, each showing DNA and Proteins]
A single gene can encode multiple proteins, i.e. 25K human genes, >100K human proteins.
Image on the right is closer to reality!
Adapted from D. Merz

Human Genome Content
1% of DNA encodes proteins.
4% of DNA encodes "non-coding" RNAs (that act as RNAs, are not translated to proteins).
95% of DNA is transposable element repeats, heterochromatin or "other" sequences.
Strachan and Read, Ch 9

Gene Families
Gene families contain multiple copies of closely related genes that arose from a single ancestral gene. These are called paralogues.
They are often found in clusters (i.e. right next to one another) on chromosomes.
Gene families arise through duplication events.

Gene Duplication Mechanisms
1. Tandem duplication
2. Chromosomal translocations
3. DNA polymerase slippage in replication
4. Transposable elements
5. Whole genome duplication

1. Tandem Duplication (aka Unequal Recombination/Crossover)
Regions of repetitive DNA sequence can cause "confusion" during recombination and result in unequal crossing over.
Net gain and loss of DNA material.
Can involve small or large regions.
Can replicate genes into gene "clusters" at the same chromosomal location.
Strachan and Read, Ch 9

2. Translocations (aka Duplicative transposition by recombination)
Exchange of DNA between different chromosomes.
Can be reciprocal (an exchange) or uni-directional.
Results in duplications in other chromosomal locations.

3. DNA Polymerase Slippage
Causes small duplications, insertions, deletions.
One of the main causes of DNA variants/mutations.
Not likely to duplicate an entire gene.
To be clear: DNA slippage during replication causes a loop/bubble, and it is endogenous DNA repair mechanisms that end up treating the bubble as "real" and expanding it. Slippage happens with repetitive regions.

4. Transposable Elements (aka "Jumping genes")
Class 1 - Retrotransposons (copy and paste)
Class 2 - DNA Transposons (cut and paste)
To be clear: https://www.youtube.com/watch?v=JEBuiImSY2s
See the above video for some additional clarity.

Class 1: Retrotransposition
Presence of Long Terminal Repeats (LTRs).
Use an RNA intermediate.
Transposons encode reverse transcriptase enzyme (RNA-DNA transcription) like a retrovirus.
About 40% of the human genome.
Transposed sequences typically not functional as they lack regulatory sequences (promoter etc.).
No introns.
Usually result in pseudogenes.

Class 2: DNA Transposition
"Cut and paste"
Mobile DNA elements (2% of human genome).
No RNA intermediate.
Encode transposase enzymes needed to excise and re-insert.
Can carry other sequences.
Recognized in DNA by inverted repeats that flank them.
Can result in functional genes.

5. Whole Genome Duplication
Polyploidy: extra copies of chromosomes.
Mechanism: chromosomal non-disjunction.
Human genome: two rounds of whole genome duplication occurred early in vertebrate development.
Teleost fish underwent an additional genome duplication.
Strachan and Read, Ch 13

Gene Duplication Effects on Genes (produce Copy Number Variations, CNVs)
1. Redundancy
2. Specialization (sequence or expression): gene family
3. Degradation
Examples: Hox clusters, Hyaluronidases

Duplication can Cause Redundancy
Two genes with the same function.
Loss of either will have little or no effect.
Loss of both may have a significant effect.
Degradation (by mutation) of one copy can occur in the absence of selection. This can result in a pseudogene (identified by stop codons or frameshifts that disrupt the open reading frame).

Duplication can Cause Specialization
Mutations can also alter the activity and/or expression of a duplicated gene, giving it a novel or specialized function (and thus reducing redundancy).
Can be alterations to coding sequence or to transcriptional regulatory sequences.
The "new" gene now has a unique function that may be important. There will be selection against mutations that eliminate function.

Example 1: Hox Gene Family
Hox genes: family of homeobox transcription factors that are differentially expressed along the rostrocaudal (head-to-tail) axis during development.
Give identity to developing tissues in different places along this axis.
https://www.khanacademy.org/science/biology/developmental-biology/signaling-and-transcription-factors-in-development/a/homeotic-genes
Modified from Hox genes of fruit fly, by PhiLiP, public domain

Hox Clusters
Invertebrates have 1 Hox cluster.
Humans have 4 Hox clusters.
Fish have 7 Hox clusters.
How do fish get 7 clusters?

Hox Clusters
Genome duplication (X2) to get 4 clusters.
Unequal recombination causes further duplications within clusters.
Specialization and degradation within clusters to stabilize or eliminate family members.
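The redundancy slide above notes that a degraded gene copy (a pseudogene) can be identified by stop codons or frameshifts that disrupt the open reading frame. The following is a minimal Python sketch of that idea, not a real annotation tool. The function name `orf_disruptions` and the toy sequences are made up for illustration, and the check assumes the input is a spliced coding sequence read from its first base.

```python
# Standard DNA stop codons.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def orf_disruptions(cds):
    """Return a list of simple signs that a coding sequence lacks an intact ORF.

    Hypothetical helper: assumes `cds` is a spliced coding sequence (no introns)
    that should run from an ATG start codon to a single terminal stop codon.
    """
    seq = cds.upper()
    problems = []
    if not seq.startswith("ATG"):
        problems.append("no ATG start codon")
    if len(seq) % 3 != 0:
        problems.append("length not a multiple of 3 (possible frameshift)")
    # Split into complete codons, ignoring any trailing partial codon.
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    # Every codon except the last should encode an amino acid.
    for index, codon in enumerate(codons[:-1]):
        if codon in STOP_CODONS:
            problems.append(f"premature stop codon {codon} at codon {index + 1}")
            break
    if not codons or codons[-1] not in STOP_CODONS:
        problems.append("no stop codon at the end of the reading frame")
    return problems


# Toy sequences (invented for this example).
print(orf_disruptions("ATGGCCAAATAA"))  # intact ORF: []
print(orf_disruptions("ATGTAAAAATAA"))  # point mutation creating an early stop
print(orf_disruptions("ATGGCAAATAA"))   # one base deleted: frame is shifted
```

An empty list only means that none of these simple checks failed. Real pseudogene annotation also compares the sequence against its functional paralogue, which this sketch does not attempt.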
To be clear: https://www.youtube.com/watch?v=HwrXeQTCcXY
See the above video for some additional clarity, but more than you need to know.

Hox gene redundancy
Effects of Hox gene lof (loss-of-function) mutations:
Lethal in Drosophila.
But in humans only subtle effects are caused by any single mutation in a Hox gene, e.g. HOXD13 mutations cause shortened or fused fingers.

Example 2: Hyaluronidases
Enzymes that degrade hyaluronan (a glycosaminoglycan).
Nematodes have only one hyaluronidase.
Humans have two clusters of three paralogues each, on different chromosomes.
Slightly different amino acid sequences.
Different enzymatic activities (pH).
Different tissue expression patterns.
How did this arise?

To be clear: https://www.youtube.com/watch?v=bLEw6O0a4zI
See the above video for some additional clarity, but more than you need to know.

Example 2: Hyaluronidases
Loss of a HYAL gene has subtle defects.
Accumulation of the substrate (HA), lysosomal storage disorders (Mucopolysaccharidoses).
But not lethal. Why?

Pseudogenes
Look like genes but do NOT produce functional proteins.
Usually degraded redundant genes or retrotransposed genes that have accumulated mutations in the absence of selection.
Two types:
Nonprocessed pseudogenes: vestigial genes inactivated by mutations in critical coding or regulatory sequences.
Processed pseudogenes: pseudogenes that have been formed by transposable elements.
Are they without function? Some can still be transcribed.

Coding v non-coding Genome
[Figure]
Strachan and Read, Ch 9

RNA Types in the Human Genome
25,000 protein-encoding genes comprise only 1% of the genome.
But 57% of the genome is transcribed (RNA genes, transposons, pseudogenes).
There is a cost in terms of energy used, so there must be a function?

RNA Types in the Human Genome
Classical RNAs function in producing proteins: mRNA, rRNA, tRNA.
Non-coding RNAs do not produce proteins: long ncRNAs (e.g. XIST), microRNAs, snoRNAs, scaRNAs, snRNAs, piRNAs, exRNAs, siRNAs.
lncRNA molecules can adopt complex secondary and tertiary folding structures, just like polypeptides.
Strachan and Read, Ch 9

XIST (X-inactive specific transcript)
lncRNA transcribed only from the inactive X chromosome.
Acts with other ncRNAs and proteins from the X Inactivation Centre (XIC).
Associates with the inactive X chromosome and is essential for its inactivation.
Thompson and Thompson, 8th Edition

MicroRNAs
MicroRNAs (miRNAs) regulate the expression of certain target genes by binding to the mRNAs they produce (1776 genes).
Important regulators of gene expression.
22 nt long after processing.
Bind to complementary sequence in 3'UTRs of target mRNAs (UTR = untranslated region).
MicroRNAs are negative regulators through translational repression or mRNA degradation.
Linked to a variety of human diseases.
Garzon et al., Trends in Molecular Medicine, 2006

Bias in Human Genomics Studies
[Figure]
Sirugo et al., Cell, 2019

Human Genome Diversity
There is more genetic diversity within African populations than any other group.
Schlebusch et al., Science, 2017

Summary of RNA classes
[Figure]
Strachan and Read, Ch 9

So then … What is a Gene?
Central dogma: DNA → RNA → protein
But this view arose from phage and bacterial genomes.
One gene, one protein, but many (or most) transcriptional events do not lead to proteins.
DNA → RNA (→ protein)

What is a Gene?
Regulatory elements can lie within other genes. Can be distant. Or can be shared.
Genes can lie within the introns of other genes.
Genes can be co-transcribed (poly-cistronic).
What are the physical boundaries of a gene?
How do you say a DNA variant affects a specific gene? (Next class!)

Summary of Objectives
What is the composition of the human genome?
Content of the human genome (1% protein-encoding genes etc).
Types of RNA-encoding genes (4% of genome).
How does the human genome differ from those of simpler organisms?
How gene families (e.g. Hox) arise via duplication events: redundancies, specialization, pseudogenes/degradation.
What is a gene?
The challenge of defining what exactly is a gene.

Questions to ponder
Why does the human genome have all this "junk" DNA?
Why not just have 100,000 genes and get rid of the "junk"?
If a stray gamma ray hits the DNA within a nucleus, what is it likely to hit?

Resources we use in research all the time!
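One way to put numbers on the objective "How does the human genome differ from those of simpler organisms?" is to compute a rough gene density from the genome comparison table earlier in the lecture. The Python sketch below uses only the Mb and gene counts from that table. The "genes per Mb" measure and the function names are illustrative assumptions, not something defined in the lecture.

```python
# (species, genome size in Mb, protein-encoding genes), copied from the
# lecture's genome comparison table.
GENOMES = [
    ("Human", 3_000, 25_000),
    ("C. elegans", 100, 23_000),
    ("Arabidopsis", 140, 27_000),
    ("Wheat", 16_800, 95_000),
]


def genes_per_mb(genes, size_mb):
    """Protein-encoding genes per megabase: a crude proxy for genome compactness."""
    return genes / size_mb


def density_table(genomes=GENOMES):
    """Return (species, genes per Mb) pairs, most compact genome first."""
    rows = [(name, genes_per_mb(genes, size_mb)) for name, size_mb, genes in genomes]
    return sorted(rows, key=lambda row: row[1], reverse=True)


for name, density in density_table():
    print(f"{name:<12} {density:7.1f} genes/Mb")
```

With the table's figures, C. elegans and Arabidopsis come out at roughly 200 genes per Mb against fewer than 10 for human and wheat, which matches the slide's note that the first two have more "compact" genomes.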
