Investigating the Genome - Use of whole genome sequencing in clinical care
King's College London
2024
Tim Hubbard
Summary
This presentation, "Investigating the Genome," by Tim Hubbard discusses the use of whole genome sequencing in clinical care. It covers learning objectives, benefits and limitations of whole genome sequencing, and criteria for identifying causal genetic variants.
Full Transcript
Investigating the Genome
Use of whole genome sequencing in clinical care
Tim Hubbard
King’s College London, King’s Health Partners
ELIXIR, HDRUK, NHS, Genomics England
4MBBS103 MBBS Stage 1 Genomics
April 2024

Learning Objectives
Benefits and limitations of whole genome sequencing (WGS) compared to existing clinical genetic approaches
Criteria used to identify candidate causal genetic variants
Where use of WGS is likely to be clinically beneficial
Objectives and structure of the 100,000 Genomes Project
Examples of the types of diagnostic results that are generated and fed back to clinicians and patients

The first human genome sequence
Draft announced 26th June 2000
Finished in 2003
Cost $3.2 billion

Now is the moment to commit to WGS
[Table: which data types each technology can detect. Columns: Array CGH, SNP array, Targeted gene sequencing, Whole Exome, Whole Genome. Rows: Large-scale structural changes; Balanced translocations; Distant consanguinity; Uniparental disomy; Novel/known coding variants; Novel/known non-coding variants. The per-cell marks are not recoverable from the transcript.]

Benefit from collecting whole genome
A whole genome is complete
An individual’s genome doesn’t change
Potential to collect once, store, and refer to again and again for clinical care
– c.f. clinical images stored in PACS (picture archiving and communication system)
Only need to analyse each time for a specific question

Next Generation Sequencing
Cheap whole genome sequencing is only possible due to “next generation” sequencing technology (NGS)
NGS is based on sequencing billions of random fragments in parallel
Fragment length is 150 bases x 2
Each position in the genome (3 billion letters) is sequenced 30 times (30x) on average
Two copies of the genome in each cell: each is sequenced 15 times (15x) on average
= 300 million random DNA fragments per genome

Distribution of sequence fragments
GCGCGATTGACCAAGCCAAAATTCTAGAGGATTGATTA
ACGGCATTGACCAAGCCAATATTCTAGTGGATAGATTA

Reminder: Different types of variation
Single Nucleotide Variation (SNV)
Deletion – can be one base or many
Insertion – can be one base or many
– Special case: Tandem duplication
Inversion
Translocation

Limitations of current technology
Short reads of NGS make accurate characterisation of large variants hard
– Most human genomes have been sequenced with NGS, so knowledge of ‘normal’ structural variants is more limited
NGS accuracy is lower than that of older sequencing technology
– Variants detected by NGS were historically verified independently using targeted ‘Sanger sequencing’
– Being phased out in most cases
Long read technology (Oxford Nanopore, PacBio)
– Currently more expensive, less accurate than NGS
– Can be a useful supplement to NGS

Whole Genome vs Whole Exome
Whole Exome (protein coding exons)
– 30-50 million bases
– ~20,000 variants
Whole Genome
– 3 billion bases
– 3 million variants

Strategies to identify causal variants
Filter variants that are frequently observed
– Use gnomAD (Genome Aggregation Database), from >125,000 exomes and >15,000 whole genomes.
– gnomAD supersedes ExAC (Exome Aggregation Consortium) database, which only contained variants within coding regions
Look for variants identified as pathogenic
Look for variants in genes linked to condition
Look for variants that affect functional elements
– Protein coding sequence
– Splicing
– Regulatory element
Look for variants at genome positions that are normally conserved

MAF [remainder of slide not recoverable from the transcript]

Strategies to identify causal variants
Filter variants that are frequently observed
– Use gnomAD (Genome Aggregation Database), from >125,000 exomes and >15,000 whole genomes.
– gnomAD supersedes ExAC (Exome Aggregation Consortium) database, which only contained variants within coding regions
Look for variants identified as pathogenic
Look for variants in genes linked to condition
Look for variants that affect functional elements
– Protein coding sequence
– Splicing
– Regulatory element
Look for variants at genome positions that are normally conserved

“Pathogenic variant” false positives
Variants have been labelled as pathogenic from rare disease diagnostic sequencing and added to databases
Frequency of variant occurrence has only recently been surveyed in the normal population
ExAC database aggregated protein coding regions from 60,000 individuals (http://exac.broadinstitute.org/)
Analysis of ExAC suggests each person has an average of 54 mutations which would be labelled pathogenic, but 41 would be false positives (M. Lek et al. Nature 536, 285-291; 2016)

Strategies to identify causal variants
Filter variants that are frequently observed
– Use gnomAD (Genome Aggregation Database), from >125,000 exomes and >15,000 whole genomes.
– gnomAD supersedes ExAC (Exome Aggregation Consortium) database, which only contained variants within coding regions
Look for variants identified as pathogenic
Look for variants in genes linked to condition
Look for variants that affect functional elements
– Protein coding sequence
– Splicing
– Regulatory element
Look for variants at genome positions that are normally conserved

Sources of information about variants
Functional annotation of the reference genome
Occurrence between affected and unaffected individuals

The annotated reference genome
Annotation of SMURF2 gene, covering just 123,340 bases of genome
http://www.ensembl.org/

Current human gene annotation
GENCODE 39 statistics (December 2021): https://www.gencodegenes.org/human

Conservation across species

Variant Effect Predictor (VEP)

Clinical Diagnostics vs Research
Rare disease discovery science
– sequences of groups of affected individuals
– Looking to identify genes sharing variants
Rare disease diagnostics
– sequences of affected individual + other family members (affected or unaffected)

Use of WGS in clinical care restricted due to limits in understanding
Limit application to:
– Monogenic diseases: just looking for one variant
– Patients with a clear phenotype: can focus on genes known to be associated with condition
– Patients who are ill
Reporting currently mainly limited to:
– Variants in protein coding sequences: easier to predict effect of mutation
Cannot currently be usefully applied to:
– Diagnosis of complex diseases: e.g. diabetes
– Prediction of risk: e.g. patients worried about 23andme result
– Patients with unexplained condition

Recommendations for 100,000 genomes project
2013 - Professor Dame Sally Davies established a Strategic Priorities Working Group for the Project, chaired by Professor David Lomas (UCL)
Recommended rare diseases, certain cancers, and infections
Areas where they believe the introduction of genomic technology will have the greatest benefit for patient health

Steps in UK towards Genomic Medicine
2009: House of Lords report on Genomic Medicine
2010: Creation of Human Genomics Strategy Group (HGSG)
2011: UK Life Sciences Strategy (Office for Life Sciences, Strategy for UK Life Sciences)
– No10: http://www.number10.gov.uk/news/uk-life-sciences-get-government-cash-boost/
– BIS/DH: http://www.dh.gov.uk/health/2011/12/nhs-adopting-innovation/
2012: Human Genomics Strategy Group report; UK Life Science Strategy Update; 100K Genomes (Industrial Strategy: government and industry in partnership, Strategy for UK Life Sciences One Year On)
– DH: http://www.dh.gov.uk/health/2012/01/genomics/
– BIS: http://www.gov.uk/office-for-life-sciences/

Genomics England
http://www.genomicsengland.co.uk/
@genomicsengland

100,000 genomes project
Primarily a treatment project
NHS transformation project
All clinical whole genome sequencing (>30x)
Rare disease (proband/parent trios)
Cancer (normal/tumour pairs)
Timeline
– Announced December 2012
– Genomics England set up 2013
– Pilots 2014
– Main Programme 2015-2017

Genomics England – mission
Sequence 100,000 whole genomes
Improve health of NHS patients
Stimulate wealth generation
Create legacy of infrastructure, human capacity and capability
Enable large scale genomics research

Process Overview
Sample DNA > Sequence (BAM) > Variants (VCF) > Candidate Variants > Clinical Interpretation > Clinical Action
[Diagram: the same pipeline marked up with the procured steps (sequence, variants, annotation, interpretation), validation, the GeL clinical database and NHS action.]

Sequencing and Annotation assessment
Sequencing bake-off
– Samples sent to participants; returned sequence assessed
– Evaluation on quality and coverage
– Informed sequencing contract

NHS Genomic Medicine Centres
13 Genomic Medicine Centres covering England
Joined by NHS in Scotland, Northern Ireland and Wales
Responsibilities:
– identifying and recruiting participants
– clinical care following results

[Diagram, built up over several slides: Genomics England and the NHS on either side of a firewall; NHS Genomic Medicine Centres (consent, phenotype, DNA); Sequencing (Illumina) producing BAM/VCF; Data Centre; contracts between the parties.]
Sequencing and Annotation assessment
Sequencing bake-off
– Samples sent to participants; returned sequence assessed
– Evaluation on quality and coverage
– Informed sequencing contract
Annotation bake-off
– Sequence sent to participants (BAM+VCF): rare diseases as a trio; cancer as germline + tumour
– Harder than assessing sequencing
– Gold standard less well defined
– Lack of established data standards

Research
Protocol under new designation of “Bioresource”
Single project-wide approval: no need for site specific approvals
Independent review committee grants data access to bona-fide research uses
Consent for return of additional findings (secondary: 17 genes; and carrier status: 8 genes)
Participants can be re-contacted up to four times a year
Samples for various –omics technologies collected
Revision of diagnosis if underlying evidence changes (e.g. when a new gene is discovered)
http://www.genomicsengland.co.uk/library-and-resources/

[Diagram, final build: Genomics England and the NHS on either side of a firewall. NHS side: Genomic Medicine Centres (consent, phenotype, DNA), biobank sample, Registry (PHE), HES (HSCIC), other NHS clinical data. Genomics England side: Sequencing (Illumina) producing BAM/VCF, Data Centre, Trusted Research Environment (TRE), GeCIP/GENE embassies. Connected by contracts and grants to: Clinical Interpretation Partnership (GeCIP), GENE Consortium, Clinical Interpretation Services (genome interpretation service companies) returning a Clinical Report.]
https://www.hdruk.ac.uk/access-to-health-data/trusted-research-environments/

Genomics England Clinical Interpretation Partnership (GeCIP)
A research consortium
Partnership between over 2,500 researchers from academia and the NHS, trainees, plus international collaborators
Designed to accelerate academic/industry partnership and development of diagnostics and therapies
Over 35 topics (domains) of research; most domains cover a single disease or group of diseases, and some are wider, e.g. epigenomics, health economics and technology
All data generated contributes to the Genomics England Dataset
4 April 2024

Commercial Research
Working with industry
GENE Consortium (pilot 2015-16)
– 12 companies formed the Genomics Expert Network for Enterprises (GENE) Consortium to oversee a year-long Industry Trial
– AbbVie, Alexion, AstraZeneca, Berg, Biogen, Dimension Therapeutics, GSK, Helomics, Roche, Takeda, NGM Biopharmaceuticals, UCB
Discovery Forum (2017-)
– Exploring the business value of genomic medicine data.
– Connecting industry stakeholders to the Genomics England community.
– Providing a gateway to our Research Environment and dataset.
– Leading to discovery and development of precision methods, diagnostics, and therapeutics

Health Education England Genomics Education Programme
9 University providers of MSc in Genomic Medicine
– Aimed at NHS healthcare professionals working in England
– Full/part time study
– Fully funded places available through HEE
– Individual (CPPD) modules available for a range of professional backgrounds and groups (e.g. medicine, nursing, healthcare scientists and technologists)
Online training courses and resources
– The fundamentals of genomics
– Bioinformatics
– How to support patients through the consent process for the 100,000 Genomes Project

Infections and Pathogens
3000 multi-drug resistant TB strains
NHS implementing TB sequencing for diagnosis
Global registry of TB resistance

A co-ordinated response across health and care
Co-ordinating genomic knowledge to make the UK a world leader
Sequencing 100,000 genomes to advance genomic knowledge
Using genomic knowledge for prevention and health protection
Turning genomic knowledge into health interventions
Ensuring the NHS workforce is skilled and able to deliver for patient benefit

Chief Medical Officer’s report
https://www.gov.uk/government/publications/chief-medical-officer-annual-report-2016-generation-genome

Building the future NHS genomic medicine service
By the end of 2018 the NHS will have:
A national Genomic Medicine Service providing consistent & equitable care for 55 million population
Operating to common national standards, specifications & protocols
Standardised genomic consent for NHS care and research
Delivering an approved national testing directory covering use of single gene to WGS
Building a single UK Genomic Knowledgebase: a national NHS database with all tests that will enable care, effectiveness, and outcomes
De-identified data for academic & industry research

What are we telling participants?
Information about a patient’s main condition
Information about additional ‘serious and actionable’ conditions (optional)
Carrier status for unaffected parents of children with rare disease (optional)
Image courtesy of Health Education England

Patient involvement - the National Participant Panel
Role of the Panel is to ensure the interests of participants are always at the centre of the 100,000 Genomes Project. They do this by:
– Making sure experiences of participants are at the heart of the project
– Responding to feedback
– Overseeing who should have access to participant data
[Diagram: Participants linked to the Access Review Committee, the GeCIP Board and the Ethics Advisory Committee.]

Genome interpretation providers
Currently contracting a pilot phase for up to 8000 “reports” with three providers
Genomics England will provide web-based tools to enable collaboration between GEL, GeCIPs and GMCs to analyse, assess, review and validate the clinical interpretation of whole genomes

Standardisation through Tiering of reported variants
Tier 1: In gene panel
– Clear LOF (truncating, splicing, etc)
– Known pathogenic variants
Tier 2: In gene panel
– Missense and other VUS
Tier 3: Not in panel

Aims:
Source expert knowledge to establish a final diagnostic grade gene panel (green list) for each disorder that will be used in the classification of genetic variants to aid clinical interpretation of rare disease genomes
Engage Scientific Community, encourage open debate, and begin to establish consensus on gene panels for rare diseases.
Standardisation of terms and collection of gene-disease related information, accumulation of reviews over time, and updated releases

PanelApp
https://panelapp.genomicsengland.co.uk/
Public access
– View and download gene panels.
– View Reviewers’ comments.
Register to be a reviewer
– View and download gene panels.
– View Reviewers’ comments.
– + Evaluate genes and make comments.
– + Add genes to a gene panel.
[Screenshots: Searching Panels; PanelApp Gene Panel View]

Jessica
Epileptic encephalopathy type 9 (GLUT1)
Difficult to treat seizures
Developmental delay
Standard tests found no cause
Now 4 years old
Mutation in GLUT1 found via 100KGP
Mutation not present in either parent (‘de novo’)
Likely benefits of diagnosis
– Ends 4 year diagnostic odyssey
– Provides possible tailored therapy (ketogenic diet)
– Informs parents on risk of recurrence in another child (very low)

A 10 year-old girl with life threatening chicken pox
Ten year old girl admitted to intensive care in Manchester because of life threatening chicken pox
She had previously had other unusual infections. Detailed immune testing had not determined why.
Mutations in CTPS1 gene found via 100KGP
Likely benefits of diagnosis
– A (curative) bone marrow transplant is now planned for the girl
– Her siblings have been tested and shown not to be at risk of these infections
– The gene wasn’t recognised by immunologists as a cause of bad chicken pox.
A change in practice is now planned to test many more children for changes in this gene to identify others with the condition

A family with kidney problems
57-year-old man with kidney failure; he had other relatives who had had kidney failure too
His genome was sequenced and the genetic cause of his kidney failure was identified
His daughter already had signs of kidney failure, and she also shared the genetic variant
His teenage granddaughter was having yearly checks on her kidneys as she had a 1 in 2 chance of also getting kidney failure
Genetic tests showed she didn’t have the variant found in her mother and grandfather, so she doesn’t have to go for check-ups or worry about her kidneys any more

Georgia
KDM5B-related intellectual disability
Developmental delay
Multiple medical problems
Sees >5 hospital specialist services
Seen in two genetic centres
No cause known despite extensive testing
Now 4 years old
Mutation in KDM5B found via 100KGP – newly recognised disease gene
Mutation not present in either parent (‘de novo’)
Likely benefits of diagnosis
– Ends 4 year diagnostic odyssey
– Informs parents on risk of recurrence in another child (very low)
– This is a newly recognised disease gene. Its recognition will help diagnose other families
– A CRISPR-Cas9 mouse model of the mutation is planned as part of the collaboration between Genomics England and MRC Harwell to learn more about the condition

Non-coding mutations as a cause of choroideremia
A man with choroideremia of unknown cause under the care of Moorfields Eye Hospital
A causative non-coding (promoter) mutation upstream of the X chromosome CHM gene was found via 100KGP
A second family with the same mutation has now been found
Likely benefits of diagnosis
– Identifies the cause as X-linked and allows cascade testing of at risk relatives
– No non-coding mutations had previously been found, nor had CHM’s promoter been recognised.
– Analysis of the promoter region will now become a standard part of diagnostics, allowing diagnosis in other families

Cancer
Common cancers included initially: Lung, Breast, Ovarian, Prostate, Colorectal
Now included: Renal, Sarcoma, Childhood Cancer, Adult Brain Tumours, Endometrial, Melanoma, Upper gastrointestinal (GI) tumours, Testicular, Head and Neck, Cancer of Unknown Primary, Haematological Malignancies

Molecular pathology
Complex NHS transformation underway
Tumour samples are traditionally fixed in formalin then embedded in paraffin (FFPE) to preserve cellular architecture for diagnosis under the microscope
DNA extracted from samples treated like this is damaged and broken
– Use part of the sample for FFPE and histology
– Freeze part of the sample for genetic tests
– Need to make sure the sample contains mainly tumour cells
This new pathway requires very significant changes in sample handling, affecting surgeons, interventional radiologists, pathologists and oncologists

Cancer whole genome analysis report
Preliminary analysis report:
– Domain 1 variants: directly relevant to cancer treatment
– Domain 2 variants: other cancer related genes
Supplementary analysis report:
– Domain 3 variants & other relevant information
Links to Clinical Trials
Remainder of results are mostly of research interest for now, but in future may assist:
– Drug development
– Targeted treatment selection
– Prediction of prognosis
– Monitoring of disease progression

Status of project
Figures as at 01/06/2019
Samples: 122,650 samples collected and received at the UK Biocentre (36,696 cancer; 85,954 rare disease)
Genomes: 112,198 genomes sequenced (28,508 cancer; 83,690 rare disease)
Analysis and Results: results for 86,944 genomes sent to NHS GMCs, equivalent to 21,551 cancer genomes and 65,393 rare disease
20-25% actionable findings for Rare Disease
~50% of cancer cases contain potential for a therapy or a trial in our report

National Genomic Medicine Service
Started October 2018
[Infographic; labels grouped as far as they can be recovered from the transcript:]
National Test Directory: 300,000 tests reviewed; 25% upgraded to new technologies; 22 categories of rare disease; 4 cancers planned for WGS; many more edge cases in cancer; annual Test Directory review; pharmacogenetics from October 2019
National Laboratory Network: 7 Genomic Laboratory Hubs doing single gene, panels, clinical exome
Genomic Medicine Centres: NHS lead, providing care (continue till 2021)
Whole Genome Sequencing Provider and Clinical Interpretation Pipeline
UK Genomics Knowledgebase: GEL lead; informatics architecture & data store
Industry/academic/international partnerships supporting ongoing research & development through clinical care
Workforce development: upskilling of existing staff
500,000 whole genomes sequenced from the NHS in the next 5 years: offered consent for research; longitudinal life course; recall for research; international researchers and industry

Acknowledgements
The patients and their families
Genomics England Team
NHSE: Sue Hill, Malcolm Grant, Bruce Keogh
HEE: Sue Hill, Val Davison, Anneke Seller
PHE: Derrick Crook
All 13 Genomic Medicine Centres, UK CLL Consortium, CRUK, RCPath, NHSE, DoH, Biobank UK, Sanger, EBI, KCL, UCL and QMUL
NIHR BioResource Rare Disease, DDD
NIHR Translational Research Collaborative
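Worked example: the figures on the Next Generation Sequencing slide (150 bases x 2 per fragment, 30x average depth, a 3 billion letter genome, about 300 million fragments) follow from simple arithmetic. The sketch below only checks that arithmetic; the function and parameter names are illustrative assumptions and are not taken from the lecture or from any sequencing toolkit.

```python
def fragments_needed(genome_size: int, mean_depth: int,
                     read_length: int, reads_per_fragment: int = 2) -> int:
    """Fragments required so that each base is covered `mean_depth` times on average.

    Assumes every fragment contributes read_length * reads_per_fragment bases
    (paired-end reads) and ignores unmapped or duplicate reads.
    """
    bases_per_fragment = read_length * reads_per_fragment
    total_bases = genome_size * mean_depth
    return total_bases // bases_per_fragment


def depth_per_copy(mean_depth: float, copies: int = 2) -> float:
    """Average depth for each copy of the genome, assuming reads split evenly."""
    return mean_depth / copies


if __name__ == "__main__":
    n = fragments_needed(genome_size=3_000_000_000, mean_depth=30, read_length=150)
    print(f"{n:,} fragments")   # 300,000,000 fragments
    print(depth_per_copy(30))   # 15.0
```

The 15x figure per copy is why a heterozygous variant is typically supported by only about half of the reads at that position, which is part of what makes short-read variant calling sensitive to depth.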
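Worked example: two steps described in the lecture, filtering out variants that are frequently observed in a population database such as gnomAD and then tiering what remains against a gene panel, can be sketched together. This is a minimal illustration under stated assumptions, not the Genomics England pipeline: the `Variant` fields, the consequence labels treated as loss of function, the 0.1% frequency cut-off and the gene names are all hypothetical, and anything in the panel that is not Tier 1 is simply assigned Tier 2.

```python
from dataclasses import dataclass

# Hypothetical consequence labels treated as "clear LOF (truncating, splicing, etc)".
LOF_CONSEQUENCES = {"stop_gained", "frameshift", "splice_donor", "splice_acceptor"}


@dataclass(frozen=True)
class Variant:
    gene: str                 # gene symbol (hypothetical names in this sketch)
    consequence: str          # predicted effect, e.g. "missense"
    population_af: float      # allele frequency in a reference population
    known_pathogenic: bool = False


def assign_tier(variant: Variant, panel: set[str]) -> int:
    """Tier 1: in panel and LOF or known pathogenic. Tier 2: in panel, other. Tier 3: not in panel."""
    if variant.gene not in panel:
        return 3
    if variant.known_pathogenic or variant.consequence in LOF_CONSEQUENCES:
        return 1
    return 2


def prioritise(variants: list[Variant], panel: set[str],
               max_af: float = 0.001) -> list[tuple[int, Variant]]:
    """Drop variants more common than max_af (assumed cut-off), then sort the rest by tier."""
    rare = [v for v in variants if v.population_af <= max_af]
    return sorted(((assign_tier(v, panel), v) for v in rare), key=lambda pair: pair[0])


if __name__ == "__main__":
    panel = {"GENE_A", "GENE_B"}
    candidates = [
        Variant("GENE_A", "stop_gained", 0.00001),
        Variant("GENE_B", "missense", 0.0002),
        Variant("GENE_C", "frameshift", 0.0),
        Variant("GENE_A", "missense", 0.05),  # too common, filtered out
    ]
    for tier, variant in prioritise(candidates, panel):
        print(tier, variant.gene, variant.consequence)
```

Filtering on frequency before tiering reflects the point made on the “Pathogenic variant” false positives slide: a database label of pathogenic is weak evidence if the variant turns out to be common in the general population.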