Summary

This lecture provides an overview of Genome Wide Association Studies (GWAS), including learning outcomes, sharing of ancestral chromosome segments, technological advances, and analysis of GWAS data.

Full Transcript

GENE3340 Molecular Genetics II
Genome Wide Association Studies (GWAS)
Belinda Kaskow, School of Biomedical Sciences, UWA
Email: [email protected]

Learning Outcomes
- Be able to describe two technological advances that have allowed GWAS to flourish
- Understand that GWAS are designed to identify common variants
- Be able to describe the process of carrying out a GWA scan
- Be able to describe what the data from a GWAS plot (broadly) mean
- Be able to describe and calculate an odds ratio
- Understand the limitations of GWAS and the term missing heritability
- Be able to explain the common disease-common variant and the common disease-rare variant hypotheses

Sharing of ancestral chromosome segments
- AS = LD (linkage disequilibrium): reflects segments shared because of a distant common ancestor.
- The more distant the common ancestor, the smaller each shared segment, but the greater the number of people sharing it. E.g. sharing of two segments on a single chromosome extends to all 8 individuals in generation IV.
- The extent of sharing is greatest in sibs and decreases the further individuals are separated from the common ancestor.
Figure 8.12a, Genetics and Genomics in Medicine. Strachan & Lucassen. 2nd Ed.

Linkage disequilibrium around an ancestral mutation that confers disease susceptibility
- Red asterisk = newly emergent mutation. It has the minor allele (allele 2) at a very closely linked SNP, where allele 1 is the major allele.
- Over many generations, meiotic recombination ensures that most of the original chromosome (yellow) is replaced (gray).
- Descendants who inherited the disease susceptibility variant have an increased chance of carrying allele 2, so allele 2 is at a higher frequency in them than in the control population.
Figure 8.12b, Genetics and Genomics in Medicine. Strachan & Lucassen. 2nd Ed.
Two technological advances that allowed GWAS to be achieved
- The International HapMap Project delivered hundreds of thousands, and then millions, of mapped SNP loci.
- Extension of microarray technology allowed automated genotyping of huge numbers of SNPs across the genome.
- Now: next generation sequencing (NGS). Whole genome or whole exome sequencing captures common and rare SNPs, CNVs, indels etc.
- GWAS are designed to identify common variants: they assume that common complex diseases are caused by common variants (MAF > 0.05).

GWAS: 101
- 500-1000 cases and 500-1000 controls
- Extract DNA, then genotype*
- Currently: ~$300 per individual (by standard methods) (n = 2000 x $300 = $600,000)***
- Calculate which of 300,000-500,000 SNPs and/or haplotypes are more frequent in cases than controls

Haplotype blocks
- Attempts to define ancestral chromosome segments give a high-resolution haplotype structure.
- This suggests our DNA is composed of defined blocks of limited haplotype diversity.
- Genotyping 8 SNP loci at 5q31 reveals an 84 kb haplotype block. Just two haplotypes account for the vast majority of chromosomes from a European population.
* Remember: 8 SNPs = 2^8 = 256 possible haplotypes
- Adjacent haplotype blocks at 5q31: blocks 1, 2, 3, 4 were genotyped at 5, 9, 11 SNP loci and had between two and four haplotypes with certain population frequencies.
- Dashed black lines = locations where >2% of all chromosomes 5 are seen to switch between haplotypes.

Imputing SNPs

Doing a GWAS: GWAS Steps
- Using HapMap data (a map of LD), representative SNPs are selected which differentiate (tag) the common haplotypes (A, B, C) at each locus. Locus 1 is tagged by 4 SNPs; locus 2 is tagged by 2 SNPs.
- The tag SNPs are genotyped in disease cases and controls using microarrays.
- Allele frequencies for each SNP are compared between the two groups.
- SNPs associated with disease (passing a statistical threshold) are genotyped in a second, independent cohort to establish which associations are robust.
Figure 8.13, Genetics and Genomics in Medicine. Strachan & Lucassen. 2nd Ed.
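
The haplotype arithmetic and the idea of tagging haplotypes with a few SNPs can be sketched in Python. This is only an illustration: the three haplotype strings are hypothetical (made up here, not taken from HapMap or the 5q31 data), the function names are invented for this sketch, and the brute-force search is a simple stand-in rather than the method real tag-SNP selection uses. Alleles are written as 1 (major) and 2 (minor).

```python
from itertools import combinations


def possible_haplotypes(n_snps: int) -> int:
    """Number of distinct haplotypes for n biallelic SNPs (2 choices per SNP)."""
    return 2 ** n_snps


def minimal_tag_snps(haplotypes: dict[str, str]) -> tuple[int, ...]:
    """Smallest set of SNP positions that tells every haplotype apart.

    Brute force: try all subsets of positions, smallest first, and return
    the first subset for which each haplotype has a unique allele pattern.
    """
    seqs = list(haplotypes.values())
    n_sites = len(seqs[0])
    for size in range(n_sites + 1):
        for sites in combinations(range(n_sites), size):
            patterns = {tuple(seq[i] for i in sites) for seq in seqs}
            if len(patterns) == len(seqs):
                return sites
    raise ValueError("haplotypes are not all distinct")


# Hypothetical common haplotypes A, B, C over 8 SNPs (illustration only).
common = {
    "A": "11111111",
    "B": "22112211",
    "C": "22221111",
}

n_possible = possible_haplotypes(8)
tags = minimal_tag_snps(common)
print(f"8 SNPs allow {n_possible} haplotypes; only {len(common)} are common here")
print(f"SNP positions sufficient to tag them (0-based): {tags}")
```

With three common haplotypes, a single biallelic SNP can never separate all of them, so at least two tag SNPs are needed. Genotyping those few tags instead of every SNP in the block is what keeps the array-based scan affordable.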
Visualising genome-wide association data

Quantile-quantile (Q-Q) plots
- Two types of distribution of observed test statistics generated in GWAS
- In case-control studies a chi-squared comparison of absolute genotype counts is calculated for each variant
- Red = idealized test results
- Blue = expected values under the null hypothesis of no association

Manhattan plots
- Panel C: coronary artery disease (blue = new loci, red = previously discovered loci)
- Threshold: -log10(p) = 7.3, i.e. p = 5 x 10^-8
- p = 0.05 / 1 million tests = 5 x 10^-8
- Bonferroni correction: adjusts the conventional p value by dividing it by the number of independent tests.
Figure 8.14, Genetics and Genomics in Medicine. Strachan & Lucassen. 2nd Ed.

GWAS and Odds Ratio (OR)
- Odds ratio: an effect size estimate of a risk factor that quantifies the increased odds of having the disease per risk allele count in genome-wide association studies (GWAS).
- Each SNP is an independent test. Associations are tested by comparing the frequency of each allele in cases and controls (allele counting method).

Association of rs6983267 with colorectal cancer (C is the risk allele)

            C allele    T allele
Cases       a = 875     c = 675
Controls    b = 1860    d = 1940

- a: Number of individuals with the allele and the trait.
- b: Number of individuals with the allele but without the trait.
- c: Number of individuals without the allele but with the trait.
- d: Number of individuals without the allele and without the trait.

The odds ratio (OR) is calculated as:
OR = (a/c) ÷ (b/d) = (875/675) ÷ (1860/1940) = 1.35

- OR > 1: The presence of the SNP is associated with higher odds of the trait or disease (RISK ALLELE).
- OR < 1: The presence of the SNP is associated with lower odds of the trait or disease (PROTECTIVE ALLELE).
- OR = 1: No association between the SNP and the trait or disease.

GWAS: Multiple Sclerosis
IMSGC, 2011, Ann Neurol

Limitations of GWAS
- Despite initial hopes, common disease variants identified by GWAS have very weak effects.
- Exceptions: novel factors that strongly predispose, e.g. in age-related macular degeneration.
- Even the cumulative contributions of identified variants are small.
- Available GWAS data explain only a small proportion of the genetic variance of complex diseases = missing heritability.

GWAS: Multiple Sclerosis
Genome-wide associations in MS:
- 32 in the MHC
- 1 on the X chromosome
- 200 outside of the MHC
- All in immune pathways
“we can now explain ~39% of the genetic predisposition to MS with the validated susceptibility alleles”
IMSGC, 2019, Science

Missing Heritability
Figure 18.8, Human Molecular Genetics. Strachan & Read. 5th Ed.

Common Disease – Common Variant Hypothesis
- Different combinations of variants at multiple loci aggregate in specific individuals to increase disease risk.
- In other words: SNPs at relatively high frequency in the population (>1%), but with relatively low penetrance (the probability that a carrier will express the disease), are the major contributors to genetic susceptibility to common disease.
- Explains the steep falling away of disease risk in relatives of probands with a common disease.
- Common variants are expected to be of ancient origin. They are merely susceptibility factors and so typically have weak deleterious effects (e.g. mild missense mutations or changes in gene expression).

Common Disease – Rare Variant Hypothesis
- Rare variants are expected to be of comparatively recent origin.
- Moderately rare variants may have moderate effects; very rare variants are expected to have rather strong effects and to be highly penetrant.
- They would not appear on common haplotype blocks, which have ancient origins.
- Many complex diseases have known Mendelian subsets in which pathogenesis is due to rare mutations of extremely strong effect.
- In other words: multiple rare DNA sequence variations (
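
As a check on the arithmetic in the odds ratio and Manhattan plot slides earlier in this lecture, the two calculations can be sketched in Python. The function names are made up for this sketch. The labels a, b, c, d follow the lecture's table (a and c are the cases, b and d the controls), which differs from the row-by-row labelling used in some textbooks.

```python
import math


def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Allele-count odds ratio using the lecture's labels.

    a, c: counts of the risk allele and the other allele in cases
    b, d: counts of the risk allele and the other allele in controls
    """
    return (a / c) / (b / d)


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Conventional p value divided by the number of independent tests."""
    return alpha / n_tests


# rs6983267 and colorectal cancer: counts from the lecture's table.
or_c_allele = odds_ratio(a=875, b=1860, c=675, d=1940)

# Genome-wide significance: 0.05 spread over 1 million tests.
threshold = bonferroni_threshold(0.05, 1_000_000)

print(f"OR for the C allele: {or_c_allele:.2f}")             # 1.35
print(f"Significance threshold: p = {threshold:.0e}")
print(f"On a Manhattan plot: {-math.log10(threshold):.1f}")  # 7.3
```

An OR of 1.35 is above 1, so C is the risk allele, but the effect is modest, which fits the lecture's point that common variants found by GWAS tend to have weak effects.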
