L2 MIIM30011 2024 Medical Microbiology: Bacteriology PDF
University of Melbourne
2024
Mark Davies
Summary
These lecture slides from the University of Melbourne course MIIM30011 Medical Microbiology: Bacteriology cover genomics, microbial evolution, and epidemiology. They set out learning objectives, suggested readings, and a timeline of DNA sequencing technologies.
Full Transcript
MIIM30011 Medical Microbiology: Bacteriology
L2: Genomics, microbial evolution and epidemiology
Dr Mark Davies
Department of Microbiology and Immunology
[email protected]
(with thanks to Tim Stinear)

Suggested reading
¤ Bacterial Pathogenesis, a molecular approach
  ¤ 3rd Edition
¤ Prescott’s Microbiology
  ¤ 10th Edition, Chapter 18 + 37
¤ Klemm and Dougan. 2016. Advances in Understanding Bacterial Pathogenesis Gained from Whole-Genome Sequencing and Phylogenetics. Cell Host Microbe 19(5):599-610.
¤ Gardy and Loman. 2018. Towards a genomics-informed, real-time, global pathogen surveillance system. Nature Reviews Genetics 19(1):9-20.
¤ Scientific literature as cited throughout

L2: Learning objectives
¤ Define a microbial genome
¤ Describe the four-step process by which we sequence a genome
¤ Describe the key DNA sequencing technologies commonly used to sequence microbial genomes
¤ Describe some methods used in analysing a bacterial genome
¤ Consider how you can apply genome sequencing in bacteriology

What is a genome?
¤ Total genetic content of an organism
¤ In the context of bacteria:
  ¤ chromosome(s)
  ¤ plasmid(s)
[Figure: Salmonella Typhi genome (strain CT18): chromosome (4,500 genes), cryptic plasmid (100 genes), resistance plasmid (200 genes). Parkhill et al. 2001 Nature 413:848-52]

What is genomics?
¤ Genomics: the study of genomes
  ¤ Determine the genetic makeup of a genome
  ¤ Comparative genomics
  ¤ Population genomics
  ¤ Functional genomics
¤ Genomics has led to hundreds of other “omics”
  ¤ Some useful ones: transcriptomics, proteomics, metagenomics
  ¤ Some not-so-useful ones: foldomics, RNomics
  ¤ Some highly dubious ones: animalomics, complexomics, religiomics, legalomics, healthomics, drugomics

Why do we sequence genomes?
¤ Understand bacterial evolution
¤ Design public health interventions
¤ Monitor how pathogen populations respond/change
¤ Identify virulence factors/mechanisms of pathogenesis
¤ Design new vaccines
¤ Develop new antibiotics/therapeutic targets
¤ Develop new molecular diagnostics

Using genomics to assess microbial threats
www.cdc.gov/drugresistance/biggest_threats.html
There are ~2,800,000 ‘AGCTs’ in a S. aureus genome

DNA sequencing timeline
1953  Watson and Crick discover the structure of DNA
1972  Recombinant DNA technology
1977  Fred Sanger: ‘sequencing by chain termination’
1977  First ‘whole’ genome sequence: phage φX174, ~5 kb
1984  Epstein-Barr virus sequenced, 170 kb
1995  First whole-genome sequence of a free-living organism: H. influenzae, 1.8 Mb
2001  Draft whole-genome sequence: Human, 3000 Mb, $3b (US)
2004  First ‘next generation’ sequencer: Roche 454 pyrosequencing
2006  Illumina Genome Analyser becomes new standard
2009  Third generation ‘long-read’ sequencing, e.g. PacBio, Oxford Nanopore

Genome Sequencing ‘Explosion’
[Figure: two panels, “Speed of sequencing (bp/day)” and “GenBank-WGS sequence data”, plotted over 1985 to 2015. Sources: Genome Research Limited, https://www.yourgenome.org/stories/; Bradley et al. 2019. Nat Biotech. 37(2):152-159]

How do we sequence genomes?
Four stages in sequencing a genome

‘Sanger’ sequencing
¤ Sequencing by synthesis (recall second year lectures)
¤ Based on DNA replication
¤ Synthesis primed by oligonucleotide and specificity driven by DNA polymerase
¤ Termination by incorporation of labeled dideoxy-NTP

Available sequencing technologies
¤ Illumina (HiSeq, NextSeq, MiSeq)
¤ Applied Biosystems/Life Technologies (Ion Torrent)
¤ Pacific Biosciences (PacBio RS)
¤ Oxford Nanopore

Illumina: Sequencing by synthesis
Instrument    Output/run   Read number    Read length
MiSeq         15 Gb        25 Million     2 x 300bp
NextSeq500    120 Gb       400 Million    2 x 150bp
HiSeq 2500    1000 Gb      4000 Million   2 x 125bp
¤ Simple sample preparation
¤ Inexpensive cost per base
¤ Massively parallel sequencing

Prior: Extract genomic DNA!
1. dsDNA have adaptors ligated
2. ssDNA is attached to the surface of the glass flow cell (lawn of oligos complementary to adaptors on cell)
3. Attached fragments bridge to complementary primers
4. Fragments undergo amplification (mini-PCR) resulting in a daughter strand
5. Repeated cycles give rise to ‘clusters’ or ‘colonies’
6. Colonies consist of ~1000 identical molecules
7. Add labeled reversible terminator nucleotides
8. ‘Colonies’ are imaged to identify incorporated base
9. Terminating fluorophore cleaved
10. Repeat cycles of sequencing reactions (7-9)
11. Generate sequencing reads
www.illumina.com
https://youtu.be/fCd6B5HRaZ8

Oxford Nanopore: Single molecule long-reads
Instrument    Output/run   Read number    Read length
MinION        ~20 Gb       1-2 Million    >100kb, avg=20kb
¤ Single molecule, real-time sequencing
¤ Not sequencing by synthesis
¤ No amplification, press go!

Stages in sequencing a genome: Finishing
¤ Most accurate method is to PCR sequence across gap regions or use a different platform (e.g. Nanopore).
[Figure labels: Short reads, Long reads]
¤ Time consuming and expensive
¤ Need to consider what biological question you are trying to address

Read-mapping vs de novo assembly
Read-mapping:       easy, rapid
De novo assembly:   difficult, slower, unbiased

Stages in sequencing a genome: Genome Annotation
¤ Process that locates position of genes in the genome
¤ CDS: protein-coding DNA sequence encoded by a gene
¤ Identifies each open reading frame (ORF) in genome
¤ ORF: a reading frame >100 codons that is not interrupted by a stop codon
¤ ORF might encode a functional gene
¤ An ORF with a ribosomal binding site at the 5’ end and terminator sequence at the 3’ end might encode a CDS. Need additional transcription/protein (western blot/proteomics) data to confirm.

Annotation: predicting ORFs in 6 reading frames
Prescott’s Figure 18.9

Genome Annotation
¤ BLAST (basic local alignment search tool) computer program
¤ Base-by-base comparison of two or more gene sequences
¤ Assign tentative function of gene or protein structure based on BLAST alignment
www.ncbi.nlm.nih.gov

Annotation: predicting CDS function
Prescott’s Figure 18.10

Genome Annotation
¤ Similar gene sequence used to infer similar function
¤ Gene function not studied in every different bacterium
¤ Chain of annotation
  ¤ Function studied in bacterium A
  ¤ Annotated in B based on similarity to A
  ¤ Annotated in C based on similarity to B
  ¤ Annotated in D based on similarity to C
¤ Always try to use experimental evidence to support annotation (rarely happens).
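The ORF definition from the annotation slides (a reading frame of more than 100 codons not interrupted by a stop codon, searched in all 6 reading frames) can be sketched in a few lines of Python. This is a minimal illustration only, not how any particular annotation tool works: the function names and the `min_codons` parameter are hypothetical, the standard stop codons TAA/TAG/TGA are assumed, and no start codon or ribosomal binding site is required, so a hit is only a candidate that still needs the supporting evidence described above.

```python
from typing import List, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}          # standard stop codons (assumed)
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement, used to read the opposite strand."""
    return seq.translate(COMPLEMENT)[::-1]


def open_frames(seq: str, min_codons: int = 100) -> List[Tuple[str, int, int, int]]:
    """Scan all six reading frames for stop-free runs longer than min_codons.

    Returns tuples of (strand, frame, start, n_codons), where start is a
    0-based offset on the strand being read.
    """
    seq = seq.upper()
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            run_start, n = frame, 0
            for i in range(frame, len(s) - 2, 3):
                if s[i:i + 3] in STOP_CODONS:
                    # A stop codon closes the current run.
                    if n > min_codons:
                        hits.append((strand, frame, run_start, n))
                    run_start, n = i + 3, 0
                else:
                    n += 1
            # A run may also reach the end of the sequence without a stop.
            if n > min_codons:
                hits.append((strand, frame, run_start, n))
    return hits


if __name__ == "__main__":
    # Toy sequence; a lowered threshold is used because it is far shorter
    # than a real gene.
    toy = "ATG" + "GCT" * 5 + "TAA"
    for hit in open_frames(toy, min_codons=4):
        print(hit)
```

With the default threshold of 100 codons the toy sequence yields no candidates, which is the point of the length cut-off: short stop-free runs occur by chance in every frame.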
Annotation: predicting CDS function
[Figure: Salmonella Typhi genome (strain CT18), chromosome, 4,500 genes]
¤ Annotated genome
¤ Features identified
¤ Location and predicted CDS function

Bioinformatics
¤ Analysis of genomic data using computers
¤ Data on genome content, structure, and arrangement
¤ A very diverse discipline (software development, mathematical models, physics, applied informaticians)
¤ Addresses novel questions in microbiology
  ¤ Disease transmission / spread
  ¤ Evolution, including drug resistance
  ¤ Pathogenesis, virulence
  ¤ Microbial ecology

Example bioinformatic pipeline

Applications of genomics
¤ Foundation for other omic studies
  ¤ Transcriptome, RNAseq
  ¤ DNA binding proteins, ChIP-seq
  ¤ Proteome, ITRAQ
  ¤ Functional genomics, TraDIS
¤ Rational vaccine design
  ¤ Identify potential vaccine targets
¤ Novel drug targets
  ¤ Genes/pathways essential for bacterial growth
¤ DNA sequence-based diagnostics
¤ Understand variation in populations - Population genetics
  ¤ Phylogenetics
  ¤ Phylogeography

Bacterial evolution: population genetics
Ancestor:                      ACGT
Mutations over generations:    A->G  C->G  G->C  T->C
Population:                    GGGT GGGT GCGT GCGT ACCT ACCT ACGC ACGT

Identifying mutations within a population
Population:                    GGGT GGGT GCGT GCGT ACCT ACCT ACGC ACGT
¤ Assess all of the genetic variation at a population level
¤ Determine alleles at single nucleotide loci
  ¤ single nucleotide polymorphisms (SNPs)

Mapping sequence reads to a reference
Chromosome ~4 Mb
Reference AGATTCTTCGAGAGTTCTGAGATTAGGATATTTTATTATTTACTCTCTGGG...................................................
Reads (column alignment against the reference was lost in extraction; asterisks in the original figure mark three SNP positions):
AGATGC TCGAGA TTCTGAGA TCGGATATT TATTATTT CTCTCTG
GATGCTTCG AGTTTCTGAGAT GGATA TTATTA TTTCTCTCT
AGATGCTT GAGA TTCTGAGATTCGG TATTTTATTA CTCTCTGGG
AGATGCTTCG GAGTTCTGAGAT CGGATA TTTATTA TTTCTCTCTGGG
ATGCTTCG GAG GAGAT CGGATA TTA TTTCTCTCTG
GATGCTTC GTTCTGAGAT CGGATA TTTATTA TTTCT
* * * SNPs

Phylogenetics
¤ Process of inferring phylogeny from a set of genomic sequences
¤ Estimates the evolutionary relationships among a set of genome sequences

Epidemiology: a revision
¤ The science that evaluates the occurrence, determinants, distribution and control of health and disease in a defined human population
¤ ‘Genomic epidemiology’ links genomics with epidemiology
  ¤ e.g. Contact Tracing!
Prescott’s 10th Edition Chapter 37

Epidemiology: a revision
[Figure labels: Transmission (possible sources of infection: bore, river, swamp, infected neighbour); Population dynamics; Selective pressure (e.g. vaccine, drug, immunity); Evolution of new traits or drug resistance]
Prescott’s 10th Edition Chapter 37

Phylogeography: Following disease transmission and spread
¤ Phylogeography:
  ¤ combines phylogenetics (relationships between strains) with geographical mapping (where strains were isolated) to infer how bacteria spread
¤ Local scales: within hospital, city, town
¤ Global scales: within country, internationally
¤ Genome sequencing provides highest possible resolution for this task.

Population genomics - summary
Klemm and Dougan. 2016. Cell Host Microbe. 19(5), 599–610

L2: Learning objectives
¤ Define a microbial genome
¤ Describe the four-step process by which we sequence a genome
¤ Describe the key DNA sequencing technologies commonly used to sequence microbial genomes
  ¤ Basic steps
  ¤ Key differences
¤ Describe some methods used in analysing a genome
¤ Consider how you can apply genome sequencing in bacteriology
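The eight-sequence population from the population genetics slides gives a compact way to see how SNP sites and pairwise SNP distances are derived from aligned sequences. The sketch below is illustrative only: the function names are hypothetical, the sequences are assumed to be already aligned and of equal length, and a plain count of differing positions stands in for the evolutionary models that real phylogenetic software uses.

```python
from typing import List

ANCESTOR = "ACGT"
POPULATION = ["GGGT", "GGGT", "GCGT", "GCGT", "ACCT", "ACCT", "ACGC", "ACGT"]


def snp_sites(seqs: List[str]) -> List[int]:
    """0-based alignment positions where more than one allele is observed."""
    return [i for i, column in enumerate(zip(*seqs)) if len(set(column)) > 1]


def snp_distance(a: str, b: str) -> int:
    """Number of positions at which two aligned sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def distance_matrix(seqs: List[str]) -> List[List[int]]:
    """Pairwise SNP distances between every pair of sequences."""
    return [[snp_distance(a, b) for b in seqs] for a in seqs]


if __name__ == "__main__":
    print("Variable sites:", snp_sites(POPULATION))
    for seq in POPULATION:
        print(seq, "differs from ancestor at", snp_distance(seq, ANCESTOR), "site(s)")
    for row in distance_matrix(POPULATION):
        print(row)
```

Isolates separated by a distance of zero are indistinguishable at these sites, while larger distances indicate more divergent lineages; a matrix like this is one simple input for clustering isolates when investigating transmission.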