Human Genome Project (IBSSD 1511/1520, 2024)
Midwestern University
2024
IBSSD
Dr. Jonathan Lerner
Summary
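This document contains the slides of a lecture from IBSSD 1511/1520 (2024) on the organization of the human genome. It covers DNA structure, how DNA is packaged into chromatin and chromosomes, genes and their regulatory regions, repetitive intergenic sequences, epigenetic regulation, and three related diseases (Huntington disease, Rett Syndrome, Fragile X Syndrome).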
Full Transcript
Organization of the Human Genome
IBSSD 1511/1520, 2024, M3 Lecture 1
Dr. Jonathan Lerner

Terminal Objective: Understand the general information content and organization of the human genome, including the structure of a typical gene, the nature of regulatory sequences and epigenetic modification of gene expression.

Introduction
Part 1. From DNA to chromosomes
Part 2. Genome Organization and Regulation
Part 3. Epigenetics – regulation above the DNA sequence

Introduction
Genome = complete set of genetic material
Human cells are DIPLOID (2n): 2 copies of each chromosome
22 autosomal chromosomes + sex-specific chromosomes XX or XY (+ mitochondrial genome)
[Figure labels: karyotype, chromosome, genes, DNA double helix, intergenic regions]

Genome = INFORMATION (Jurassic Park, 1993)
Human Genome = INFORMATION to make a human + progeny
- What is the physical support of genomic information?
- How is the genomic information organized in the cell?
- How is the genomic information regulated?

Why is it important?
Variations/alterations in genomic information:
- are responsible for disease: congenital disease, de novo mutations, genomic instability and cancer
- predispose to disease: genome-wide association studies, personalized/precision medicine (Wang et al., Frontiers in Immunology 2021)
Regenerative medicine: generate cells, tissues and organs at will to heal and restore normal function (neuron, osteoblast, enterocytes, hepatocyte, cardiomyocyte)

Part 1. From DNA to chromosomes
DeoxyriboNucleic Acid: the support of genetic information
DNA is a polymer of nucleotides

A) The double helix
- Bases: A, T, G, C
- Deoxyribose-phosphate backbone
James Watson & Francis Crick. Watson & Crick, Rosalind Franklin & Maurice Wilkins, Nature, 1953.

Property 1: The DNA double helix is stable
1) Hydrogen bonds between bases: A-T (2), G-C (3)
2) Phosphodiester bonds between deoxyribose and phosphate
3) Hydrophobic interaction between adjacent bases (base stacking): hydrophobic core of bases, hydrophilic backbone
[Figures: A-T and G-C base pairs; phosphodiester bonds linking the 5’ and 3’ positions of successive deoxyriboses; side view and top view of base stacking]

Property 2: The DNA double helix is directional
[Figure: the two strands of the helix, each running from a 5’ phosphate (5’P) to a 3’ hydroxyl (3’OH), in opposite orientations]
By convention, a DNA sequence is read from 5’ to 3’
Example: part of sequence of the gene for albumin
5’P CTAGCTTTTCTCTTCTGTCAACCCCACACGCCTTTGGCACAATGAAGTGGGTAACCTTTATTTCCCTTCTTTTTCTCTTTAGCTCGGCTTATTCCAGGGGTGTGTTTCGTCGAGATGCACGTAAGAAATCCATTTTTCTATTGTTCAACTTTTATTCTATTTTCCCAGTAAAATAAAGTTTTAGTAAACTCTGCATCTTTAAAGAATTATTTTGGCATTTATTTCTA AAATGGCATAGTATTTTGTATTTGTGAAGTCTTACAAGGTTATCTTATTAATAAAATTCAAACATCCTAGGTAAAAAAAAAAAAAGGTCAGAATTGTTTAGTGACTGTAATTTTCTTTTGCGCACTAAGGAAAGTGCAAAGTAACTTAGAGTGACTGAAACTTCACAGAATAGGGTTGAAGATTGAATTCATAACTATCCCAAAGACCTATCCATTGCACTATGCTT TATTTAAAAACCACAAAACCTGTGCTGTTGATCTCATAAATAGAACTTGTATTTATATTTATTTTCATTTTAGTCTGTCTTCTTGGTTGCTGTTGATAGACACTAAAAGAGTATTAGATATTATCTAAGTTTGAATATAAGGCTATAAATATTTAATAATTTTTAAAATAGTATTCTTGGTAATTGAATTATTCTTCTGTTTAAAGGCAGAAGAAATAATTGAACAT
CATCCTGAGTTTTTCTGTAGGAATCAGAGCCCAATATTTTGAAACAAATGCATAATCTAAGTCAAATGGAAAGAAATATAAAAAGTAACATTATTACTTCTTGTTTTCTTCAGTATTTAACAATCCTTTTTTTTCTTCCCTTGCCCAGACAAGAGTGAGGTTGCTCATCGGTTTAAAGATTTGGGAGAAGAAAATTTCAAAGCCTTGTAAGTTAAAATATTGATGAA TCAAATTTAATGTTTCTAATAGTGTTGTTTATTATTCTAAAGTGCTTATATTTCCTTGTCATCAGGGTTCAGATTCTAAAACAGTGCTGCCTCGTAGAGTTTTCTGCGTTGAGGAAGATATTCTGTATCTGGGCTATCCAATAAGGTAGTCACTGGTCACATGGCTATTGAGTACTTCAAATATGACAAGTGCAACTGAGAAACAAAAACTTAAATTGTATTTAATT GTAGTTAATTTGAATGTATATAGTCACATGTGGCTAATGGCTACTGTATTGGACAGTACAGCTCTGGAACTTGCTTGGTGGAAAGGACTTTAATATAGGTTTCCTTTGGTGGCTTACCCACTAAATCTTCTTTACATAGCAAGCATTCCTGTGCTTAGTTGGGAATATTTAATTTTTTTTTTTTTTTAAGACAGGGTCTCGCTCTGTCGCCCAGGCTGGAGTGCAGT GGCGCAATCTCGGCTCACTGCAAACTCCGCCTCCCGGGTTCACGCCATTCTCCTGCCTCAGCCTCCCGAGTAGCTGGGACTACAGGCGCCCGCCATCACGCCCGGCTAATCTTTTGTATTTTTAGTAGAGATGGGGTTTCACCGTGTGCCAGGATGGTCTCAATCTCCTGACATCGTGATCTGCCCACCTCGGCCTCCCAAAGTGCTGGGATTACAGGAGTGAGCCA CCGCGCCCGGCCTATTTAAATGTTTTTTAATCTAGTAAAAAATGAGAAAATTGTTTTTTTAAAAGTCTACCTAATCCTACAGGCTAATTAAAGACGTGTGTGGGGATCAGGTGCGGTGGTTCACACCTGTAATCCCAGCACTTTGGAAGGCTGATGCAGGAGGATTGCTTGAGCCCAGGAGTTCAAGACCAGCCTGGGCAAGTCTCTTTAAAAAAAACAAAACAAAC AAACAAAAAAATTAGGCATGGTGGCACATGCCTGTAGTCCTAGCTACTTAGGAGGCTGACGTAGGAGGATCGTTTGGACCTGAGAGGTCAAGGCTACAGTGAGCCATGATTGTGCCACTGCACTCCAGCCTGGGTGACAGAGTGAGACTCTGTCTCAAAAAAGAAAAAGGAAATCTGTGGGGTTTGTTTTAGTTTTAAGTAATTCTAAGGACTTTAAAAATGCCTAG TCTTGACAATTAGATCTATTTGGCATACAATTTGCTTGCTTAATCTATGTGTGTGCATAGATCTACTGACACACGCATACATATAAACATTAGGGAACTACCATTCTCTTTGCGTAGGAAGCCACATATGCCTATCTAGGCCTCAGATCATACCTGATATGAATAGGCTTTCTGGATAATGGTGAAGAAGATGTATAAAAGATAGAACCTATACCCATACATGATTT GTTCTCTAGCGTAGCAACCTGTTACATATTAAAGTTTTATTATACTACATTTTTCTACATCCTTTGTTTCAGGGTGTTGATTGCCTTTGCTCAGTATCTTCAGCAGTGTCCATTTGAAGATCATGTAAAATTAGTGAATGAAGTAACTGAATTTGCAAAAACATGTGTTGCTGATGAGTCAGCTGAAAATTGTGACAAATCACTTGTAAGTACATTCTAATTGTGGA 
GATTCTTTCTTCTGTTTGAAGTAATCCCAAGCATTTCAAAGGAATTTTTTTTAAGTTTTCTCAATTATTATTAAGTGTCCTGATTTGTAAGAAACACTAAAAAGTTGCTCATAGACTGATAAGCCATTGTTTCTTTTGTGATAGAGATGCTTTAGCTATGTCCACAGTTTTAAAATCATTTCTTTATTGAGACCAAACACAACAGTCATGGTGTATTTAAATGGCAA TTTGTCATTTATAAACACCTCTTTTTAAAATTTGAGGTTTGGTTTCTTTTTGTAGAGGCTAATAGGGATATGATAGCATGTATTTATTTATTTATTTATCTTATTTTATTATAGTAAGAACCCTTAACATGAGATCTACCCTGTTATATTTTTAAGTGTACAATCCATTATTGTTAACTACGGGTACACTGTTGTATAGCTTACTCATCTTGCTGTATTAAAACTTT GTGCCCATTGATTAGTAACCCCTCGTTTCGTCCTCCCCCAGCCACTGGCAACCAGCATTATACTCTTTGATTCTATGAGTTTGACTACTTTAGCTACCTTATATAAGTGGTATTATGTACTGTTTATCTTTTTATGACTGACTTATTTCCCTTAGCATAGTGCATTCAAAGTCCAACCATGTTGTTGCCTATTGCAGAATTTCCTTCTTTTCAAGGCTGAATAATAT TCCAGTGCATGTGTGTACCACATTTTCTTTATCCATTAATTTGTTGATTGATAGACATTTAGGTTGGTTTTCTACATCTTGACTATCATGAATAGTGTTGCAATGAACACAGGAGAGCTACTATCTCTTAGAGATGATATCATGGTTTTTATCATCAGAAAACACCCACTGATTTCTATGCTAATTTTGTTACCTGGGTGGAATAATAGTACAGCTATATATTCCTC ATTTTAGATATCTTTGTATTTCTACATACAATAAAAAAGCAGAGTACTTAGTCATGTTGAAGAACTTTAAACTTTTAGTATTTCCAGATCAATCTTCAAAACAAGGACAGGTTTATCTTTCTCTCACCACTCAATCTATATATACCTCTTGTGGGCAAGGCCAGTTTTTATCACTGGAGCCTTTCCCCTTTTTATTATGTACCTCTCCCTCACAGCAGAGTCAGGAC TTTAACTTTACACAATACTATGGCTCTACATATGAAATCTTAAAAATACATAAAAATTAATAAATTCTGTCTAGAGTAGTATATTTTCCCTGGGGTTACAGTTACTTTCATAATAAAAATTAGAGATAAGGAAAGGACTCATTTATTGGAAAGTGATTTTAGGTAACATTTCTGGAAGAAAAATGTCTATATCTTAATAGTCACTTAATATATGATGGATTGTGTTA CTCCTCAGTTTTCAATGGCATATACTAAAACATGGCCCTCTAAAAAGGGGGCAAATGAAATGAGAAACTCTCTGAATGTTTTTCTCCCCTAGGTGAATTCACCTGCTGCTTAGAAGCTTATTTTCTCTTGATTTCTGTTATAATGATTGCTCTTACCCTTTAGTTTTAAGTTTCAAAATAGGAGTCATATAACTTTCCTTAAAGCTATTGACTGTCTTTTTGTCCTG TTTTATTCACCATGAGTTATAGTGTGACAGTTAATTCTTATGAAAATTATATAGAGATGGTTAAATCATCAGAAACTGTAAACCTCGATTGGGAGGGGAAGCGGATTTTTAAATGATTTCCTGACCAAGCTTAACCAGTATATTAAATCCTTTGTACTGTTCTTTGGCTATAAAGAAAAAAGGTACTGTCCAGCAACTGAAACCTGCTTTCTTCCATTTAGCATACC 
CTTTTTGGAGACAAATTATGCACAGTTGCAACTCTTCGTGAAACCTATGGTGAAATGGCTGACTGCTGTGCAAAACAAGAACCTGAGAGAAATGAATGCTTCTTGCAACACAAAGATGACAACCCAAACCTCCCCCGATTGGTGAGACCAGAGGTTGATGTGATGTGCACTGCTTTTCATGACAATGAAGAGACATTTTTGAAAAAGTAAGTAATCAGATGTTTATA GTTCAAAATTAAAAAGCATGGAGTAACTCCATAGGCCAACACTCTATAAAAATTACCATAACAAAAATATTTTCAACATTAAGACTTGGAAGTTTTGTTATGATGATTTTTTAAAGAAGTAGTATTTGATACCACAAAATTCTACACAGCAAAAAATATGATCAAAGATATTTTGAAGTTTATTGAAACAGGATACAATCTTTCTGAAAAATTTAAGATAGACAAAT TATTTAATGTATTACGAAGATATGTATATATGGTTGTTATAATTGATTTCGTTTTAGTCAGCAACATTATATTGCCAAAATTTAACCATTTATGCACACACACACACACACACACACACTTAACCCTTTTTTCCACATACTTAAAGAATGACAGAGACAAGACCATCATGTGCAAATTGAGCTTAATTGGTTAATTAGATATCTTTGGAATTTGGAGGTTCTGGGGA GAATGTCGATTACAATTATTTCTGTAATATTGTCTGCTATAGAAAAGTGACTGTTTTTCTTTTTCAAAATTTAGATACTTATATGAAATTGCCAGAAGACATCCTTACTTTTATGCCCCGGAACTCCTTTTCTTTGCTAAAAGGTATAAAGCTGCTTTTACAGAATGTTGCCAAGCTGCTGATAAAGCTGCCTGCCTGTTGCCAAAGGTATTATGCAAAAGAATAGA AAAAAAGAGTTCATTATCCAACCTGATTTTGTCCATTTTGTGGCTAGATTTAGGGAACCTGAGTGTCTGATACAAACTTTCCGACATGGTCAAAAAAGCCTTCCTTTTATCTGTCTTGAAAATCTTTCATCTTTGAAGGCCTACACTCTCGTTTCTTCTTTTAAGATTTGCCAATGATGATCTGTCAGAGGTAATCACTGTGCATGTGTTTAAAGATTTCACCACTT TTTATGGTGGTGATCACTATAGTGAAATACTGAAACTTGTTTGTCAAATTGCACAGCAAGGGGCCACAGTTCTTGTTTATCTTTTCATGATAATTTTTAGTAGGGAGGGAATTCAAAGTAGAGAATTTTACTGCATCTAGATGCCTGAGTTCATGCATTCATTCCATAAATATATATTATGGAATGCTTTATTTTCTTTTCTGAGGAGTTTACTGATGTTGGTGGAG GAGAGACTGAAATGAATTATACACAAAATTTAAAAATTAGCAAAATTGCAGCCCCTGGGATATTAGCGTACTCTTTCTCTGACTTTTCTCCCACTTTTAAGGCTCTTTTTCCTGGCAATGTTTCCAGTTGGTTTCTAACTACATAGGGAATTCCGCTGTGACCAGAATGATCGAATGATCTTTCCTTTTCTTAGAGAGCAAAATCATTATTCGCTAAAGGGAGTACT TGGGAATTTAGGCATAAATTATGCCTTCAAAATTTAATTTGGCACAGTCTCATCTGAGCTTATGGAGGGGTGTTTCATGTAGAATTTTTCTTCTAATTTTCATCAAATTATTCCTTTTTGTAGCTCGATGAACTTCGGGATGAAGGGAAGGCTTCGTCTGCCAAACAGAGACTCAAGTGTGCCAGTCTCCAAAAATTTGGAGAAAGAGCTTTCAAAGCATGGTAAAT 
ACTTTTAAACATAGTTGGCATCTTTATAACGATGTAAATGATAATGCTTCAGTGACAAATTGTACATTTTTATGTATTTTGCAAAGTGCTGTCAAATACATTTCTTTGGTTGTCTAACAGGTAGAACTCTAATAGAGGTAAAAATCAGAATATCAATGACAATTTGACATTATTTTTAATCTTTTCTTTTCTAAATAGTTGAATAATTTAGAGGACGCTGTCCTTTT TGTCCTAAAAAAAGGGACAGATATTTAAGTTCTATTTATTTATAAAATCTTGGACTCTTATTCTAATGGTTCATTATTTTTATAGAGCTGTAGGCATGGTTCTTTATTTAATTTTTTAAAGTTATTTTTAATTTTTGTGGATACAGAGTAGGTATACATATTTACGGGGTATATGAGATATTTTGATATAAGTATACAACATATATAATCCCTTTATTTAATTTTAT CTTCCCCCCAATGATCTAAAACTATTTGCTTGTCCTTTTATGTCTTATAGTTAAATTCAGTCACCAACTAAGTTGAAGTTACTTCTTATTTTTGCATAGCTCCAGCTCTGATCTTCATCTCATGTTTTTGCCTGAGCCTCTGTTTTCATATTACTTAGTTGGTTCTGGGAGCATACTTTAATAGCCGAGTCAAGAAAAATACTAGCTGCCCCGTCACCCACACTCCT CACCTGCTAGTCAACAGCAAATCAACACAACAGGAAATAAAATGAAAATAATAGACATTATGCATGCTCTCTAGAAACTGTCAATTGAACTGTATTTGCTCATCATTCCTACCATCTACACCACCAAAATCAACCAAATTTATGAAAAAAAACAGCCCCAACATAAAATTATACACAGATAAACAGGCTATGATTGGTTTTGGGAAAGAAGTCACCTTTACCTGATT TAGGCAACTGTGAAATGACTAGAGAATGAAGAAAATTAGACGTTTACATCTTGTCATAGAGTTTGAAGATAGTGCTGGATCTTTCTTTTTATAAGTAAGATCAATAAAAACTCCCTCATTCTGTAGAAGTTATGATTTCTTTTCTAAGAGACCTTTAGAAGTCAGAAAAAATGTGTTTCAATTGAGAAAAAAGATAACTGGAGTTTGTGTAGTACTTCCCAGATTAT AAAATGCTTTTGTATGTATTATCTAATTTAATCCTCAAAACTTCTTCAATTTAGCATGTTGTCATGACACTGCAGAGGCTGAAGCTCAGAGAGGCTGAGCCCTCTGCTAACAAGTCCTACTGCTAACAAGTGATAAAGCCAGAGCTGGAAGTCACATCTGGACTCCAAACCTGATGCTTCTCAGCCTGTTGCCCCTTTTAGAGTTCCTTTTTAATTTCTGCTTTTAT GACTTGCTAGATTTCTACCTACCACACACACTCTTAAATGGATAATTCTGCCCTAAGGATAAGTGATTACCATTTGGTTCAGAACTAGAACTAATGAATTTTAAAAATTATTTCTGTATGTCCATTTTGAATTTTCTTATGAGAAATAGTATTTGCCTAGTGTTTTCATATAAAATATCGCATGATAATACCATTTTGATTGGCGATTTTCTTTTTAGGGCAGTAGC TCGCCTGAGCCAGAGATTTCCCAAAGCTGAGTTTGCAGAAGTTTCCAAGTTAGTGACAGATCTTACCAAAGTCCACACGGAATGCTGCCATGGAGATCTGCTTGAATGTGCTGATGACAGGGTAAAGAGTCGTCGATATGCTTTTTGGTAGCTTGCATGCTCAAGTTGGTAGAATGGATGCGTTTGGTATCATTGGTGATAGCTGACAGTGGGTTGAGATTGTCTTC 
TGTGCTTTCGTCTGTCCTATCTTCAATCTTTCCCTGCCTATGGTGGTGGTACCTTTCTGTTTTTAACCTGGCTATAAATTACCAGATAAACCCATTCACTGATTTGTAACTCCTTTCAGTCATGCTCTAACTGTAAATGAAGGCTTAAACTGAAGTAGAACAGTTACAAGGTTTTACTTGGCAGAACATCTTGCAAGGTAGATGTCTAAGAAGATTTTTTTTTCTTT TTTTAAGACAGAGTTTCGCTCTTGTTTCCCAGGCTGGGGTGCAATGGTGTGATCTTGGCTCAGCGCAACCTCTGCCTCCTGGGTTCAAGTGATTCTCATGCCTCAGCCTCCCAAGTAGCTGGGATTACAGGCATGCGCCACCACACCTGGCTAATTTTGTATTTTTAGTAGAGGCGGGGTTTCACCATATTGTCCAGACTGGTCTCGAACTCCTGACCTCAGGTGAT CCACCCGCCTTGGCCTCCCAAAGTGCTGGGATTACAGGCATGAGCCACCTTGCCCAGCCTAAGAAGATTTTTTGAGGGAGGTAGGTGGACTTGGAGAAGGTCACTACTTGAAGAGATTTTTGGAAATGATGTATTTTTCTTCTCTATATTCCTTCCCTTAATTAACTCTGTTTGTTAGATGTGCAAATATTTGGAATGATATCTCTTTTCTCAAAACTTATAATATT TTCTTTCTCCCTTTCTTCAAGATTAAACTTATGGGCAAATACTAGAATCCTAATCTCTCATGGCACTTTCTGGAAAATTTAAGGCGGTTATTTTATATATGTAAGCAGGGCCTATGACTATGATCTTGACTCATTTTTCAAAAATCTTCTATATTTTATTTAGTTATTTGGTTTCAAAAGGCCTGCACTTAATTTTGGGGGATTATTTGGAAAAACAGCATTGAGTT TTAATGAAAAAAACTTAAATGCCCTAACAGTAGAAACATAAAATTAATAAATAACTGAGCTGAGCACCTGCTACTGATTAGTCTATTTTAATTAAGTGGGAATGTTTTTGTAGTCCTATCTACATCTCCAGGTTTAGGAGCAAACAGAGTATGTTCATAGAAGGAATATGTGTATGGTCTTAGAATACAATGAATATGTTCTGCCAACTTAATAAAGGTCTGAGGAG AAAGTGTAGCAATGTCAATTCGTGTTGAACAATTTCCACCAACTTACTTATAGGCGGACCTTGCCAAGTATATCTGTGAAAATCAAGATTCGATCTCCAGTAAACTGAAGGAATGCTGTGAAAAACCTCTGTTGGAAAAATCCCACTGCATTGCCGAAGTGGAAAATGATGAGATGCCTGCTGACTTGCCTTCATTAGCTGCTGATTTTGTTGAAAGTAAGGATGTT TGCAAAAACTATGCTGAGGCAAAGGATGTCTTCCTGGGCATGTAAGTAGATAAGAAATTATTCTTTTATAGCTTTGGCATGACCTCACAACTTAGGAGGATAGCCTAGGCTTTTCTGTGGAGTTGCTACAATTTCCCTGCTGCCCAGAATGTTTCTTCATCCTTCCCTTTCCCAGGCTTTAACAATTTTTGAAATAGTTAATTAGTTGAATACATTGTCATAAAATA ATACATGTTCATGGCAAAGCTCAACATTCCTTACTCCTTAGGGGTATTTCTGAAAATACGTCTAGAAACATTTTGTGTATATATAAATTATGTATACTTCAGTCATTCATTCCAAGTGTATTTCTTGAACATCTATAATATATGTGTGTGACTATGTATTGCCTGTCTATCTAACTAATCTAATCTAATCTAGTCTATCTATCTAATCTATGCAATGATAGCAAAGA 3’OH 
AGTATAAAAAGAAATATAGAGTCTGACACCAGGTGCTTTATATTTGGTGAAAAGACCAGAAGTTCAGTATAATGGCAATATGGTAGGCAACTCAATTACAAAATAAATGTTTACATATTGTCAGAAGTTGTGGTGATAAACTGCATTTTTGTTGTTGGATTATGATAATGCACTAAATAATATTTCCTAAAATTATGTACCCTACAAGATTTCACTCATACAGAGAA GAAAGAGAATATTTTAAGAACATATCTCTGCCCATCTATTTATCAGAATCCTTTTGAGATGTAGTTTAAATCAAACAAAATGTTAATAAAAATAACAAGTATCATTCATCAAAGACTTCATATGTGCCAAGCAGTGTGTGCTTTGTGTAGATTATGTCATATAGTTCTCATAATCCACCTTCCGAGACAGATACTATTTATTTTTTGAGACAGAGTTTTACTCTTGT

Property 3: The DNA double helix is negatively charged
The negatively charged phosphate backbone interacts with positively charged basic amino acids (Lysine, Arginine, Histidine).
[Figure: phosphate groups along the deoxyribose-phosphate backbone]

B) Histones, Nucleosomes, Chromatin
22 autosomal chromosomes + sex-specific chromosomes XX or XY
- 1 base pair = 340 pm = 3.4 x 10^-10 m
- 150 million base pairs = 0.050 m = 5 cm ~ 2 inches
- 50-250 million base pairs per chromosome
The total physical length of the diploid genome is ~6’5” (about 2 m)!

The human genome is compacted
Nucleus: ~5-10 µm [Figure: DNA staining]

Histones compact the genome
4 main types of histones: Histone H2A, Histone H2B, Histone H3, Histone H4
- Globular proteins rich in basic amino acids
- N-terminal “tails”

Histones and DNA assemble into NUCLEOSOMES
- Histone octamer (2x H2A, H2B, H3 & H4)
- ~147 bp of DNA (no specific sequence of nucleotides)
- Histone H1 (optional)

Nucleosomes assemble into CHROMATIN
- Nucleosomes are the basic unit of chromatin
- The minimal degree of chromatin compaction is the 11 nm fiber or “beads on a string”
[Figure: in vitro assembly. Image credit: Ada Olins and Donald Olins, University of Tennessee/Oak Ridge Graduate School of Biomedical Sciences]
[Figure: electron microscopy of chromatin in the nucleus. Source: Ou et al., Science 2017]
Various degrees of chromatin compaction: the 11 nm fiber is folded to achieve diverse higher-order chromatin compaction states

C) CHROMOSOMES are made of chromatin
Variations in chromatin compaction:
- Heterochromatin vs. euchromatin (see Part 3)
- As a function of the cell cycle (growth and division of cells)
  - Mitotic chromosome: higher form of chromatin compaction (1/10000)
  - Chromosomes in interphase: lower compaction
[Figure: chromosomes in interphase vs. mitotic chromosome]

Chromosome Organization
- Telomeres: chromosome ends, (TTAGGG)n; protection against degradation & joining
- Centromeres: central region; used as handles during mitosis
- Origins of replication: 30-50k in the genome; allow copy of the genome during REPLICATION
Karyotypes, used to detect large chromosomal anomalies (extra chromosome, deletions, recombination, …), are progressively being replaced by real-time quantitative PCR & whole-genome sequencing

Part 2. Organization and Regulation of Genomic Information
Sequences of Nucleotides (ATGC) contain INFORMATION
CTAGCTTTTCTCTTCTGTCAACCCCACACGCCTTTGGCACAATGAAGTGGGTAACCTTTATTTCCCTTCTTTTTCTCTTTAGCTCGGCTTATTCCAGGGGT ...
(same excerpt of the albumin gene sequence as in Part 1)

Human Genome Project
- Determining the nucleotide sequence of the full human genome
- “Finished” in 2003
- Publicly available: Genome Browser
[Figures: Genome Browser views of chromosomes 1-22, X & Y; genes and gene names on chromosome 16; DNA sequence of the GPT2 gene on chromosome 16 (GPT2 = Alanine Transaminase, ALT)]

Genes vs. Intergenic Regions
- Genes encode a functional product, protein OR functional RNA: INFORMATION (~30,000 protein-coding genes in humans?)
- Intergenic / noncoding regions:
  - Sequences regulating gene activity: regulatory INFORMATION
  - Repetitive sequences and transposable elements: structural & regulatory INFORMATION
  - Other
INFORMATION = Function

A) Genes & Regulatory regions
Protein-coding genes
Central dogma of Molecular Biology (Francis Crick): flow of information in biological systems

Structure of Protein-Coding Genes
- Exons encode a protein sequence (genetic code)
- Introns separate exons + contain regulatory information
- UTRs are important for mRNA stability

Alternative splicing of mRNA
- genome: EXON 1, EXON 2, EXON 3, EXON 4, EXON 5
- TRANSCRIPTION -> pre-mRNA: EXON 1, EXON 2, EXON 3, EXON 4, EXON 5
- SPLICING -> mRNA: EX. 1, EX. 2, EX. 3, EX. 4, EX. 5 or EX. 1, EX. 2, EX. 4, EX. 5
- TRANSLATION -> polypeptide: Isoform 1 or Isoform 2

Genes coding for functional RNA
- Ribosomal RNA (rRNA), transfer RNA (tRNA): mRNA translation
- miRNA: mRNA regulation
- …

Gene-Regulatory regions
Information encoded: transcriptional activation of genes
- Cis-regulatory sequences close to the gene: PROMOTERS
- Cis-regulatory sequences that can lie far from the gene: ENHANCERS
[Figure: ENHANCER, PROMOTER, GENE, TRANSCRIPTION]

Promoters
Site of assembly of the transcriptional machinery: General Transcription Factors and RNA pol2
[Figure: RNA pol2 at the PROMOTER, TRANSCRIPTION of the GENE]

Enhancers
- Promoter = basal level of transcriptional activity
- Enhancers increase transcriptional levels; they are bound by Transcription Factors (TF)
- Enhancer-promoter communication goes through the MEDIATOR COMPLEX
- Some Transcription Factors repress transcriptional activation

B) Intergenic regions: repetitive DNA Sequences
50% of the human genome is comprised of repeats of DNA motifs

Tandem Repeats - Satellites
DNA motif, e.g., (AT)n, (GC)n, (TTAGGG)n, …
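To make the (motif)n notation concrete, here is a minimal Python sketch that looks for the longest head-to-tail run of a motif in a sequence. The function name `longest_tandem_run` and the toy sequence are illustrative assumptions, not part of the lecture, and the scan is a simple non-overlapping left-to-right regular-expression search rather than a dedicated repeat-finding tool.

```python
import re


def longest_tandem_run(sequence: str, motif: str) -> tuple[int, int]:
    """Find the longest head-to-tail run of `motif` in `sequence`.

    Returns (start_index, repeat_count), or (-1, 0) if the motif never occurs.
    Hypothetical helper for illustration only.
    """
    if not motif:
        raise ValueError("motif must be a non-empty string")
    motif = motif.upper()
    # (?:MOTIF)+ matches one or more back-to-back copies of the motif.
    pattern = re.compile(f"(?:{re.escape(motif)})+")
    best_start, best_count = -1, 0
    # finditer scans non-overlapping matches from left to right.
    for match in pattern.finditer(sequence.upper()):
        count = len(match.group(0)) // len(motif)
        if count > best_count:
            best_start, best_count = match.start(), count
    return best_start, best_count


# Toy sequence (made up): four telomere-like repeats, then one isolated copy.
toy = "ACGT" + "TTAGGG" * 4 + "CCATTAGGGA"
print(longest_tandem_run(toy, "TTAGGG"))  # (4, 4)
```

In the notation of the slides, the toy sequence contains (TTAGGG)4 starting at index 4; the isolated fifth copy is not part of the run.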
- Alpha satellites: 171 bp motifs, cover at least several million bp; found at centromeres, crucial for proper mitotic segregation, …
- Mini satellites: 20-70 bp motifs, cover a few thousand bp; found throughout the genome, regulation of gene expression, …
- Micro satellites: 2-6 bp motifs, cover a few hundred base pairs; found at telomeres, chromosome protection, …

Trinucleotide Repeat Disorders: Huntington disease
- CAG micro satellite in the protein-coding sequence of Huntingtin; CAG codes for Gln (Q)
- Huntingtin gene: (CAG)n; Huntingtin protein: 3142 AA
- Muscular, cognitive, behavioral, psychological, … symptoms
- Death 15-20 years after onset

Interspersed Repeats: LINEs & SINEs
- Long Interspersed Nuclear Elements (LINEs): 7000 bp, repeated 20,000-50,000 times
- Short Interspersed Nuclear Elements (SINEs): 90-500 bp, repeated 100,000 times
REPRESSED BY HETEROCHROMATIN, MAINLY INACTIVE

LINEs & SINEs are transposable elements
- transcription -> noncoding RNA
- reverse transcription (reverse transcriptase) -> DNA
- transposition (transposase) -> new genomic insertion
Consequences: evolution, genetic diversity, disease-causing mutations, genomic instability, cancer, …

Part 3. Epigenetics – information above the DNA sequence
Each cell has the same genome (neuron, enterocytes, osteoblast, cardiomyocyte, hepatocyte)
Each cell type expresses/activates a specific set of genes (open = ACTIVE, closed = INACTIVE)
cardiomyocyte:
- Actin (ubiquitous): open
- Tropomyosin (muscle-specific): open
- CPS I (Urea Cycle, liver-specific): closed
hepatocyte:
- Actin (ubiquitous): open
- CPS I (Urea Cycle, liver-specific): open
- Tropomyosin (muscle-specific): closed

Epigenetics = (heritable) changes in cell function that do not involve modifications of the DNA sequence
Epigenetic processes switch sets of genes between open = ACTIVE and closed = INACTIVE

A) Heterochromatin versus Euchromatin
Epigenetics impact the structure of chromatin (ACCESSIBLE vs. INACCESSIBLE)
Heterochromatin: densely packed, inaccessible/repressed; histone H1
- Repressed heterochromatin states repress transcription of:
  - alternate lineage genes
  - repeated DNA sequences (e.g., transposons)
- Structural compaction:
  - centromeres (proper cell division)
  - telomeres (genome integrity)
Euchromatin: loosely packed, accessible; transcriptionally active chromatin
How is heterochromatin vs. euchromatin built at different genomic locations?

B) Histone Post-Translational modification – Histone marks
Amino acids on histone tails can harbor chemical modifications:
- Acetylation (1 -COCH3) of Lysine (K)
- Methylation (1-3 -CH3) of Lysine (K)
- Phosphorylation
- Ubiquitination
- …
(Zhao et al., Systems Neurotherapeutics 2013)
Acetylation and Methylation of Lysine
- modify the electrostatic interface of nucleosomes, leading to compaction/decompaction.
- recruit proteins that can open or compact chromatin.

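Histone marks are conventionally written as histone name, residue, then modification, for example H3K9me1 or H4K16ac. As an illustration only, the sketch below splits such a name into its parts. The `HistoneMark` type and `parse_histone_mark` function are hypothetical names introduced here, and the pattern deliberately covers only lysine-style methylation (me1 to me3) and acetylation (ac), the two modifications detailed in this lecture.

```python
import re
from typing import NamedTuple


class HistoneMark(NamedTuple):
    histone: str       # e.g. "H3"
    residue: str       # one-letter amino acid code, e.g. "K" for lysine
    position: int      # residue number on the histone
    modification: str  # "methylation" or "acetylation"
    groups: int        # number of chemical groups added


# Assumed naming pattern: histone, one-letter residue, position, then me1/me2/me3 or ac.
_MARK = re.compile(r"(H1|H2A|H2B|H3|H4)([A-Z])(\d+)(me[123]|ac)")


def parse_histone_mark(name: str) -> HistoneMark:
    """Split a mark name such as 'H3K27me3' into its components."""
    match = _MARK.fullmatch(name)
    if match is None:
        raise ValueError(f"unrecognized histone mark: {name!r}")
    histone, residue, position, mod = match.groups()
    if mod == "ac":
        # Acetylation adds a single -COCH3 group.
        return HistoneMark(histone, residue, int(position), "acetylation", 1)
    # Methylation: the trailing digit is the number of -CH3 groups.
    return HistoneMark(histone, residue, int(position), "methylation", int(mod[-1]))


for name in ("H3K9me1", "H4K16ac", "H3K27me3"):
    print(parse_histone_mark(name))
```

Other modifications listed above (phosphorylation, ubiquitination) are rejected by this sketch rather than guessed at.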
Histone name/Residue/modification
- H3K9me1: on Histone H3, Lysine number 9 is mono-methylated (1 -CH3)
- H4K16ac: on Histone H4, Lysine number 16 is acetylated (1 -COCH3)
- H3K27me3: on Histone H3, Lysine number 27 is tri-methylated (3 -CH3)

Histone Acetylation
- Histone Acetyl Transferase (HAT): Lysine -> Lysine-COCH3; chromatin decondensation, transcription
- Histone DeACetylase (HDAC): Lysine-COCH3 -> Lysine; chromatin compaction

Histone Methylation (1, 2 or 3 CH3)
- Histone Methyl Transferase: adds -CH3 groups to Lysine
- Histone Demethylase: removes -CH3 groups from Lysine
The effect of histone methylation/demethylation depends on the Lysine residue:
- H3K9me3 recruits proteins compacting chromatin: HETEROCHROMATIN
- H3K4me3 recruits proteins opening chromatin: EUCHROMATIN

Histone Marks
Many contiguous histones harbor similar histone marks
[Figure: H3K9me3 heterochromatin domains on chromosome 6; scale labels 30 Mb and 2 Mb]

C) DNA methylation
- mainly on Cytosine nucleotides
- mainly when Cytosines are followed by Guanine (CpG)
- CpG methylation (example sequence with CpG sites: ATTACGCGACGAACAGCGTTCT) results in gene repression/chromatin compaction

CpG islands
- Methylation is the “default”: 70-80% of CpGs are methylated
- CpG islands are genomic regions rich in unmethylated CpGs; ~20,000 in the genome
- CpG islands are typically upstream/inside of gene promoters
- Aberrant methylation of CpG islands is involved in pathology and cancer

DNA methylation & histone marks
[Figure: MeCP2 bound to methylated cytosines (5mC), together with mSIN3A and HDACs]

Rett Syndrome
- 95% of cases: de novo mutation of MeCP2
- Almost exclusively in females (death before 2 y.o. for males, who have 1 copy of X)
- Impaired silencing of genes during neuronal development and maturation
- GENE OVEREXPRESSION -> debilitating neurological symptoms
[Figure: WT vs. Rett Syndrome]

Fragile X Syndrome (Trinucleotide Repeat Disorders)
- Micro satellite in the 5’ UTR sequence of FMR1 (on X chromosome)
- FMR1 gene: (CGG)n in the 5’ UTR -> silencing of FMR1
- Seizures, neurodevelopmental delays, behavioral and social symptoms

Human genome - SUMMARY
[Figure labels: chromosome, genes, DNA double helix, intergenic regions, heterochromatin, euchromatin]

Take Home
- Genome: diploid vs. haploid
- DNA double helix: stable + directional (5’ to 3’) + negative charge, phosphodiester bonds/backbone, DNA bases
- Chromatin: histones, nucleosomes, 11 nm fiber
- Chromosomes: mitotic chromosomes, telomeres, centromeres, origins of replication, karyotype
- Genes: diploid, genes, protein-coding genes, non-coding RNA, introns, exons, alternative splicing
- Regulatory DNA sequences: promoters, enhancers
- Repetitive DNA sequences: tandem (satellites), interspersed (LINEs and SINEs, transposable elements)
- Epigenetics: heterochromatin, euchromatin, histone marks (methylation + acetylation), DNA methylation
- Diseases: Huntington disease, Rett Syndrome, Fragile X Syndrome
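
As a small worked illustration of two take-home points (base pairing and 5’ to 3’ directionality), the Python sketch below derives the partner strand of a short sequence, written 5’ to 3’, and counts the hydrogen bonds holding the two strands together (2 per A-T pair, 3 per G-C pair). The function names are illustrative assumptions; the input is the first ten bases of the albumin excerpt from Part 1.

```python
# Watson-Crick pairing: A pairs with T, G pairs with C.
_COMPLEMENT = str.maketrans("ATGC", "TACG")
# Hydrogen bonds per base pair, as given in Property 1.
_HYDROGEN_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def _validated(seq: str) -> str:
    """Upper-case the sequence and reject anything other than A, T, G, C."""
    seq = seq.upper()
    invalid = set(seq) - set("ATGC")
    if invalid:
        raise ValueError(f"unexpected characters: {sorted(invalid)}")
    return seq


def reverse_complement(seq: str) -> str:
    """Return the partner strand, also written 5' to 3'."""
    # Complement each base, then reverse because the strands run in opposite directions.
    return _validated(seq).translate(_COMPLEMENT)[::-1]


def hydrogen_bond_count(seq: str) -> int:
    """Total hydrogen bonds between this strand and its partner."""
    return sum(_HYDROGEN_BONDS[base] for base in _validated(seq))


excerpt = "CTAGCTTTTC"  # first ten bases of the albumin excerpt in Part 1
print(reverse_complement(excerpt))   # GAAAAGCTAG
print(hydrogen_bond_count(excerpt))  # 24
```

Reversing after complementing is what keeps both strands in the 5’ to 3’ reading convention.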