Fundamental Topics in Biology 2X Lecture 3 Molecular Biology III (2024-2025) PDF
Document Details
University of Glasgow
2024
Joe Gray
Summary
This document is a lecture on molecular biology, covering the structure, function and features of the human genome. It also discusses various types of genetic variations and their prevalence in human populations.
Full Transcript
Fundamental Topics in Biology 2X: Lecture 3
Molecular Biology III: Genomes
Prof Joe Gray
26th Sept 2024

No, this is not a picture of me. Any ideas?

FTiB news
Moodle Forum: ALL academic/scientific questions/comments go on the MOODLE forum. We will look.
We will NOT answer e-mails on academic/admin Qs.
This set of 3 lectures links to the lab coming soon.
Take-Home Essay: instructions will appear on Moodle soon.

Aims and Objectives
Following this lecture you should be able to:
1. Summarise the main features of the human genome and how it differs between people
2. Briefly explain the origin, nature, and scientific importance of homologs and orthologs
3. Outline the nature and use of SNPs
4. Define "evolution" in molecular terms and explain the assumptions behind the Hardy-Weinberg equilibrium (extra reading is assumed here)
5. Perform and interpret allele frequency calculations and Hardy-Weinberg calculations

What's in a genome sequence?
At face value, NOT MUCH... unless you do a lot of analysis and experiment.
Which bits are important (have a function/role)? What functions/roles?
+ THE "human" (or any) genome is actually the genome sequence of ONE example: the "reference genome". Hence, to be meaningful, we need the genome sequences of lots of people... possibly even you and me.
+ Most of us are diploid: we have two copies of every gene, not one... heterozygous or homozygous? dominant, recessive?

All of it (HAPLOID)
1.5% EXONS. Lots of space (e.g., 40,000 bp).
GREEN BITS ARE EXONS.
Brightly-coloured bits below the line are repeated elements:
LINEs (long interspersed nuclear elements)
SINEs (short interspersed nuclear elements)
Transposons
LTRs (long terminal repeats)

Protein-encoding genes
Use: knowledge transfer
Species share HOMOLOGOUS GENES (= have sequence "similarity", i.e., high identity)
ORTHOLOGS: by common descent: probably SAME FUNCTION/ROLE
Hence study the role in one species (where ethics and costs allow)
and thus infer the role in other species (where ethics/costs/practicalities might be an issue)
PARALOGS: by duplication: probably diverged function/role

Orthologs DO the same jobs
A human gene (CDC2) can take the place of a missing yeast gene (cdc2)
After ~1 billion years of evolution (since the last common ancestor) they:
Look similar (encode a key cell cycle regulatory protein)
Really DO do the SAME job: are interchangeable
... Nobel Prize 2001

Example of orthologs
47% identity in protein sequence (>>>5%) between the yeast and human gene
Shown in 1-letter amino acid code; read left-to-right, top-to-bottom
Alignment rows: HUMAN gene; YEAST (S. pombe); BAKER'S YEAST (S. cerevisiae)

Self esteem warning (part II)
We are ALL related, inbred, mutating, mutant, (part) FUNGI
Be (even more) proud

Common genome variations between people
Individual humans, when compared pairwise, typically show 99.5% identity in genome sequence
Differences in single base pairs are important... but not the whole story

INDELS: insertions/deletions
Surprisingly common
Can be very small: 1 bp, 2 bp...
Can affect the protein product of a gene (frameshift etc.) if they occur in the protein-coding region (Open Reading Frame, ORF) of a protein-coding gene
Can affect the function of a ncRNA
Can affect regulatory regions
(MOST) do not affect gene function... they occur outside genes, in introns etc.

VNTRs: Variable Numbers of Tandem Repeats: Micro- and Mini-Satellites
Repetitive DNA
Micro-: repeating unit of ~1 bp to 9 bp ("short tandem repeats"; used in FORENSICS)
Mini-: repeating unit of ~10 bp to 100 bp
Relatively STABLE (from one generation to the next) but highly variable in LENGTH (i.e., NUMBER of repeats) across alleles in the population
Q: have you seen an example of a trait CAUSED by a microsatellite???

CNVs: Copy number variants
A VERY surprising amount of CNVs: variation in the NUMBER of copies of a segment of a chromosome
Typically 500 bp to 1 million bp segments
Might loss of a chromosomal region be recessive lethal?

Point Mutations/Variants
Range from the PRIVATE (to YOU): the ~200 de novo point mutations
to the FAIRLY COMMON (allele frequency of 1% or more): "POLYMORPHISM"
Typically only 2 versions at a LOCUS (site in the genome) across the population(s): SNPs (Single Nucleotide Polymorphisms)
Rare SNPs: between 1% and 4.9% allele frequency
Common SNPs: 5% allele frequency or more

SNPs
Can affect the protein product of a gene if they occur in the protein-coding region (Open Reading Frame, ORF) of a protein-coding gene: silent, missense, nonsense
(MOST) do not cause a phenotype... they occur outside genes, in introns etc.
Surprisingly common: a common SNP occurs approximately EVERY 1,000 bp in the human genome, i.e., we have ~3,000,000 common SNP loci (sites in the haploid genome)
and we are diploid... hence have TWO alleles at each locus: they may be the same (homozygous) or different (heterozygous)

Allele Frequency: How to calculate it... I
Frogs have TWO alleles of a gene/SNP: A and a
5 frogs have genotype AA
8 frogs have genotype Aa
3 frogs have genotype aa
What is the allele frequency of allele A?
Number of A alleles: (2 × 5) + 8 = 18
Number of alleles in TOTAL (A + a): 2 × 16 = 32
Allele frequency = (no. of A)/total = 18/32 = 0.56

What is evolution? A molecular definition
"evolution is... a change in allele frequency in a gene pool" T. Dobzhansky (1900-1975)

Evolution: change in allele frequency with time (l-r)
Figure: proportions of AA, Aa and aa genotypes at successive timepoints
But is this evolution? And will I live long enough to measure it?
If only there were a nice, lazy, quick (clever) way of looking at ONE SINGLE timepoint and asking if the population is stable or evolving?

Hardy-Weinberg: just measure the population at ONE timepoint (= now)
G. H. Hardy (1877-1947)
Wilhelm Weinberg (1862-1937)

Hardy-Weinberg equation
For a given pair of allele frequencies, p and q, it predicts the GENOTYPE frequencies that will keep that allele frequency constant from one generation to the next.
p = allele frequency of A
q = allele frequency of a
p² + 2pq + q² = 1
Where:
p² = frequency of AA homozygotes IF NO evolution
2pq = frequency of Aa heterozygotes IF NO evolution
q² = frequency of aa homozygotes IF NO evolution
NOTE: If allele frequency does not change with time: then the genotype...

Hardy-Weinberg equation: origin
For a given pair of allele frequencies, p and q, it predicts the GENOTYPE frequencies of diploids (humans, animals, plants...) assuming alleles are stable, a large homogeneous population, random mating, and all genotypes equally 'fit'.
p = allele frequency of A
q = allele frequency of a
Bi-allelic: p + q = 1
Organisms are diploid, formed by random mating when (egg) meets (sperm):
(p + q)(p + q) = 1
p² + pq + pq + q² = 1
p² (HOMOZYGOUS AA) + 2pq (HETEROZYGOUS Aa or aA) + q² (HOMOZYGOUS aa) = 1

Hardy-Weinberg Equation/Equilibrium
For a given 2-allele system (bi-allelic, e.g., SNPs), the Hardy-Weinberg equation predicts the genotype frequencies in the population that keep that allele frequency constant from one generation to the next, i.e., NO EVOLUTION.
Put it another way: if there is no ongoing/recent evolution, i.e., if the observed allele frequency is stable (or 'at equilibrium'), then the observed ratios of genotypes will be at the values predicted by the H-W equation.

Hardy-Weinberg Equation/Equilibrium: REMINDER
As for allele frequency... ALWAYS work in decimals.
You can convert from % at the start or convert to % at the end, as needed/requested/for clarity.

Genotype Frequency using H-W
Frogs have TWO alleles of a gene/SNP: A and a
Allele frequency (allele A) = p = 0.56
Allele frequency (allele a) = q = 0.44
Hardy-Weinberg Equilibrium predicts:
Frequency of AA genotype frogs = p² = (0.56)(0.56) = 0.314
Frequency of aa genotype frogs = q² = (0.44)(0.44) = 0.194
Frequency of Aa genotype frogs = 2pq = (2)(0.56)(0.44) = 0.493
Check: sums to 1.001 (the extra 0.001 is rounding; the unrounded values sum to exactly 1)

Is this population of frogs evolving?
Observed in field: 16 frogs were genotyped:
5 frogs have genotype AA: genotype freq = 5/16 = 0.31
3 frogs have genotype aa: genotype freq = 3/16 = 0.19
8 frogs have genotype Aa: genotype freq = 8/16 = 0.50
The observed frequencies (0.31, 0.19, 0.50) are close to the H-W predictions (0.314, 0.194, 0.493), so this sample gives no sign that the population is evolving at this locus.
YOU will do this kind of analysis in the upcoming lab... on humans (yourselves)

Hardy-Weinberg example calculation 2
Can use the H-W equilibrium and PARTIAL information about a population to estimate the expected allele frequency:
e.g., assuming H-W equilibrium: if cystic fibrosis (a recessive disorder) occurs in 1 in 2,500 births, what is the allele frequency of the disease allele in the population?
Have a go.

Hardy-Weinberg example calculation 2a
Same question: with cystic fibrosis in 1 in 2,500 births, what is the allele frequency of the disease allele in the population?
Affected births are aa homozygotes, so q² = 1/2,500 = 0.0004
Hence q = √0.0004 = 0.02
This is equivalent to inferring that 2% of all the gametes (eggs and sperm) in the population (= allele frequency) carry a cystic fibrosis allele.
What is the predicted frequency of carriers in the population?

Hardy-Weinberg example calculation 3
Can use the H-W equilibrium and PARTIAL information about a population to estimate the expected genotype frequency:
e.g., assuming H-W equilibrium: if 9% of a population show a recessive trait, then predict the frequency of HOMOZYGOUS WT.
Have a go.

Hardy-Weinberg example calculation 4
Can use the H-W equilibrium and PARTIAL information about a population to estimate the expected genotype frequency:
e.g., assuming H-W equilibrium: if 9% of a population show a DOMINANT trait, then predict the frequencies of carriers (= heterozygotes) and homozygous WT.
Do in own time.

Most SNPs in humans are in Hardy-Weinberg equilibrium
Hence, to a first approximation: most of our genome (US) obeys the assumptions of H-W and is NOT evolving.
For most SNPs, the human population obeys:
random mating
homogeneous population (i.e., not stratified)
population has not recently been small
mutation between alleles occurring at v low frequency
NATURAL SELECTION is not occurring
Stability/lack of evolution here is not about US as PEOPLE/groups, but rather about points across our genome.
So 'most of our genome' is NOT EVOLVING.

But some REGIONS of the genome are evolving...
Something interesting MUST be going on. Any one or more of:
NON-random mating
Population is not homogeneous (i.e., is stratified, immigration...)
Population is or recently was small (i.e., sampling becomes an issue)
Mutation between those alleles occurring at high frequency (unlikely... but we have seen an example)
NATURAL SELECTION is occurring
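
The partial-information calculations in examples 2 to 4 all follow one route: the fraction of the population that is homozygous recessive equals q² under H-W, so q is its square root and everything else follows from p = 1 - q. A minimal Python sketch of that route is below. The function name is illustrative (not from the lecture), and the whole thing assumes the population really is in H-W equilibrium at a bi-allelic locus.

```python
import math

def frequencies_from_recessive_fraction(aa_fraction):
    """Assuming H-W equilibrium, derive allele and genotype frequencies
    from the fraction of the population that is homozygous recessive (aa)."""
    q = math.sqrt(aa_fraction)   # aa frequency = q^2, so q = sqrt(aa frequency)
    p = 1.0 - q                  # bi-allelic locus: p + q = 1
    return {"p": p, "q": q, "AA": p * p, "Aa": 2 * p * q, "aa": q * q}

# Example 2a: cystic fibrosis (recessive) in 1 in 2,500 births
cf = frequencies_from_recessive_fraction(1 / 2500)
print(f"CF allele frequency q = {cf['q']:.4f}")
print(f"CF carrier (Aa) frequency 2pq = {cf['Aa']:.4f}")

# Example 3: 9% of the population show a recessive trait
ex3 = frequencies_from_recessive_fraction(0.09)
print(f"Example 3 homozygous WT (AA) = {ex3['AA']:.2f}")

# Example 4: 9% show a DOMINANT trait, so the other 91% are homozygous
# for the recessive (here wild-type) allele: aa fraction = 1 - 0.09
ex4 = frequencies_from_recessive_fraction(1 - 0.09)
print(f"Example 4 heterozygotes (Aa) = {ex4['Aa']:.4f}")
print(f"Example 4 homozygous WT (aa) = {ex4['aa']:.2f}")
```

Working in decimals throughout, as the reminder slide asks, keeps the square root step unambiguous; convert to % only when reporting.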
NATURAL SELECTION is only one way to cause EVOLUTION (the above "exceptions to H-W" being the others), but it is the only one with a "directionality" (see Khan Academy).

In the lab you will be looking at
A taste gene: TAS2R38
Bitter taste perception
2 alleles caused by SNPs
You will consider:
How to determine genotype
Test various hypotheses
Determine if alleles are in Hardy-Weinberg equilibrium across your class
if so... what does that mean?
if not... why might that be?

What makes you, you (DNA sequence)?
To a first approximation, you are the product of your unique combination of alleles:
Type: point mutations/SNPs, micro/mini-satellites, indels, CNVs
Location
Number
all interacting with each other:
Same gene/locus: homozygous/heterozygous
Same gene/locus: dominance/recessive relationships
Different genes/loci: epistasis
Environment/history/lifestyle
Epigenetics
Dynamic: ongoing somatic mutation

Puzzle
We each carry 1-2 (maybe 3) recessive LETHAL mutations
Q: Are DOMINANT LETHAL mutations EVER POSSIBLE in the population? or... have you seen an example of one?

References
(SUPERB) Hardy-Weinberg, allele frequency etc.: some excellent resources at the Khan Academy, e.g., the following link (and the series it is part of):
https://www.khanacademy.org/science/biology/her/heredity-and-genetics/v/discussions-of-conditions-for-hardy-weinberg
Mutations/polymorphisms across the genome:
Campbell & Reece, Biology. Various sections.
Griffiths, Gelbart et al., Modern Genetic Analysis. Various sections.
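
For the lab-style analysis (genotype counts in, H-W comparison out), the frog example from the lecture can be reproduced with a short Python sketch. The function names are illustrative, not part of any course material. It uses the unrounded p = 18/32 = 0.5625, so the expected values differ slightly from the slide's figures, which round p to 0.56. It only lists observed and expected frequencies side by side; judging whether a difference is bigger than chance would need a statistical test (for example chi-square), which is not covered here.

```python
def allele_frequencies(n_AA, n_Aa, n_aa):
    """Return (p, q): frequencies of alleles A and a from genotype counts."""
    total_alleles = 2 * (n_AA + n_Aa + n_aa)   # diploid: two alleles per individual
    p = (2 * n_AA + n_Aa) / total_alleles      # each AA carries 2 A, each Aa carries 1 A
    return p, 1.0 - p

def hardy_weinberg_expected(p):
    """Genotype frequencies predicted by p^2 + 2pq + q^2 = 1."""
    q = 1.0 - p
    return {"AA": p * p, "Aa": 2 * p * q, "aa": q * q}

# Frog counts from the lecture: 5 AA, 8 Aa, 3 aa (16 frogs)
counts = {"AA": 5, "Aa": 8, "aa": 3}
n = sum(counts.values())

p, q = allele_frequencies(counts["AA"], counts["Aa"], counts["aa"])
expected = hardy_weinberg_expected(p)

print(f"p = {p:.4f}, q = {q:.4f}")
for genotype in ("AA", "Aa", "aa"):
    observed = counts[genotype] / n
    print(f"{genotype}: observed {observed:.4f}, H-W expected {expected[genotype]:.4f}")
```

The same two functions should work for class genotype counts at TAS2R38, provided the locus is scored as two alleles.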