Microbial Taxonomy and Diversity PDF
Document Details
Uploaded by rafawar1000
Florida Atlantic University
Summary
This chapter explores microbial taxonomy and evolution, describing how and why microorganisms are classified using an evolutionary framework. It discusses challenges and controversies in microbial taxonomy and phylogeny, highlighting the vast degree of biological diversity within microbes.
Full Transcript
19 Microbial Taxonomy and the Evolution of Diversity

Source: NASA, ESA, R. O’Connell (University of Virginia), and the Hubble Heritage Team

Scientists Query: “Is the Microbial Universe Expanding?”

Should the audacious question “Does life exist that cannot be classified as archaeal, bacterial, or eukaryotic?” be taken seriously? After all, the elegant work of Carl Woese, who first described archaea, has survived the test of time to emerge stronger and more widely accepted. Yet this is the question that some microbiologists and evolutionary biologists ask when they consider “microbial dark matter.” Just as astronomical dark matter represents the majority of mass in the observable universe, most of the biomass in microbial ecosystems (including the human body) is also difficult to define.

One way to view all microbes is to sort them into one of three categories: explored, unexplored, and undiscovered. Explored microbes are those we can culture. Here we see an overrepresentation of pathogens and microbes with medical, industrial, or food applications. The unexplored are those we know exist based on gene sequences obtained by metagenomic studies, but have not yet been grown in the lab. These microbes contribute to nutrient cycling and host-microbe physiology in ways that are defined in aggregate but not at the level of a single microbial taxon (e.g., genus or species). The undiscovered, however, are most intriguing. These are the microbes yet to be found: the real microbial dark matter. Do they exist? If so, how much bigger is the microbial universe than previously thought?

The only way to answer these questions is to explore. As discussed in chapter 18, metagenomic and single-cell genome sequencing harness the power of next-generation sequencing to probe genes and genomes that belong to uncultured microorganisms. The results have been startling: Hundreds of new archaeal and bacterial phyla have recently been proposed.
Even more exciting is the fact that some newly discovered gene sequences don’t match any previously identified archaeal, bacterial, or eukaryotic nucleotide sequences. For example, very large and unusual viruses have been found to possess genes whose evolutionary history places them somewhere between archaeal and eukaryotic domains. If cellular counterparts to these viruses exist, would they represent a new domain?

As we discuss in this chapter, phyla (s., phylum) are higher-level taxa to which a group of related organisms belong. The recent, sudden increase in the number of phyla has raised concern among taxonomists. While the criteria used to define a particular bacterial or archaeal species have long been a subject of discussion, many taxonomists feel the explosion in metagenomic and single-cell genomic data warrants a discussion regarding what truly constitutes higher-level taxa, including the recently minted term “superphylum,” as well as the traditionally held phylum, class, order, family, and perhaps even domain. It seems the expansion of the microbial universe is a messy business that may require not only new nomenclature, but a new way of thinking about life itself.

In this chapter, we explore microbial taxonomy and evolution: how and why microorganisms are classified using an evolutionary framework. This was not always the case, and the move to an evolution-based system resulted in the reclassification of some genera. Indeed, the fields of microbial taxonomy and evolution continue to present challenges and controversy. But ultimately, microbial taxonomy and phylogeny open the door to a degree of biological diversity unmatched by multicellular organisms.
Readiness Check: Based on what you have learned previously, you should be able to:
✓ Describe the structure of DNA (section 13.2) and summarize methods used in DNA sequencing and metagenomics (sections 18.1 to 18.3)
✓ Outline in general terms how small subunit rRNA has been used to define the three domains of life (section 1.2)
✓ Summarize the mechanisms of horizontal gene transfer (chapter 16)
✓ Describe the rationale for the RNA world hypothesis of the origin of life (section 1.2)
✓ Contrast and compare the structural differences between bacterial, archaeal, and eukaryotic microbes (chapters 3–5)
✓ Explain the principles of next-generation sequencing and metagenomics (sections 18.1–18.3)

19.1 Microbial Taxonomy Is Based on the Comparison of Multiple Traits

After reading this section, you should be able to:
a. Explain the utility of taxonomy and systematics
b. Draw a concept map illustrating the differences between phenetic and genotypic classification

Microbiologists are faced with the daunting task of understanding the diversity of life forms that cannot be seen with the naked eye but can live anywhere on Earth. Obviously a reliable classification system is paramount. The science of classifying living things is called taxonomy (Greek taxis, arrangement or order; nomos, law, or nemein, to distribute or govern). In a broader sense, taxonomy consists of three separate but interrelated parts: classification, nomenclature, and identification. A taxonomic scheme is used to arrange organisms into groups called taxa (s., taxon) based on mutual similarity. Nomenclature is the branch of taxonomy concerned with the assignment of names to taxonomic groups in agreement with published rules. Identification is the practical side of taxonomy: the process of determining if a particular isolate belongs to a recognized taxon and, if so, which one.
The term systematics is often used for taxonomy, although it sometimes implies a more general scientific study of organisms with the ultimate objective of arranging them in an orderly manner. Thus systematics encompasses morphology, ecology, epidemiology, biochemistry, genetics, molecular biology, and physiology.

One of the oldest classification systems, called natural classification, arranges organisms into groups whose members share many characteristics. The Swedish botanist Carl von Linné, or Carolus Linnaeus as he often is called, developed the first natural classification in the eighteenth century. It was based largely on anatomical characteristics and was a great improvement over previously employed systems because natural classification provided information about many biological properties. For example, classification of humans as mammals denotes that they have hair, self-regulating body temperature, and milk-producing mammary glands in the female.

When natural classification is applied to higher organisms, evolutionary relationships become apparent simply because the morphology of a given structure (e.g., wings) in a variety of organisms (ducks, songbirds, hawks) suggests how that structure might have been modified to adapt to specific environments or behaviors. However, the traditional taxonomic assignment of microbes was not rooted in evolutionary relatedness. For instance, bacterial pathogens and microbes of industrial importance were historically given names that described the diseases they cause or the processes they perform (e.g., Vibrio cholerae, Clostridium tetani, and Lactococcus lactis). Although these labels are of practical use, they do little to guide the taxonomist concerned with the vast majority of microbes that are neither pathogenic nor of industrial consequence. Our present understanding of the evolutionary relationships among microbes now serves as the theoretical underpinning for taxonomic classification.
In practice, determining the genus and species of a newly isolated microbe is based on polyphasic taxonomy. As the term “polyphasic” suggests, this approach draws on many kinds of data that describe the microorganism, including both phenotypic and genotypic features. To understand how all of these data are incorporated into a coherent profile of taxonomic criteria, we must first consider the individual components.

Phenetic Classification

For a very long time, microbial taxonomists had to rely exclusively on a phenetic system, which classifies organisms according to their phenotypic similarity (Table 19.1). This system succeeded in bringing order to biological diversity and clarified the function of morphological structures. For example, because motility and flagella are always associated in particular microorganisms, it is reasonable to suppose that flagella are involved in at least some types of motility. Although phenetic studies can reveal possible evolutionary relationships, this is not always the case. For example, not all flagellated bacteria belong to the same phylum. This is why the best phenetic classification is one constructed by comparing as many attributes as possible.

Genotypic Classification

As the name suggests, genotypic classification seeks to compare the genetic similarity between organisms. Individual genes or whole genomes can be compared. Since the 1970s it has been widely accepted that bacteria and archaea whose genomes show at least 70% relatedness in DNA-DNA hybridization experiments belong to the same species. However, there is now a consensus building to replace this metric with a genomics-based assay that measures the average nucleotide identity between organisms. The means by which microbes are genotypically classified is discussed further in section 19.3.
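The phenetic idea of comparing as many attributes as possible can be made concrete with a similarity coefficient. The sketch below is only an illustration, not a method described in this chapter: it assumes each isolate is scored for the presence (1) or absence (0) of the same list of traits, and it computes two coefficients widely used in numerical taxonomy, the simple matching coefficient and the Jaccard coefficient. The isolate names and trait scores are hypothetical.

```python
def simple_matching(profile_a, profile_b):
    """Fraction of traits on which two isolates agree (both present or both absent)."""
    if len(profile_a) != len(profile_b) or not profile_a:
        raise ValueError("profiles must be non-empty and of equal length")
    agreements = sum(1 for a, b in zip(profile_a, profile_b) if a == b)
    return agreements / len(profile_a)


def jaccard(profile_a, profile_b):
    """Shared positive traits divided by traits present in at least one isolate."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must be of equal length")
    both = sum(1 for a, b in zip(profile_a, profile_b) if a and b)
    either = sum(1 for a, b in zip(profile_a, profile_b) if a or b)
    return both / either if either else 0.0


# Hypothetical trait order: motile, flagellated, ferments lactose, forms endospores
isolate_1 = [1, 1, 0, 0]
isolate_2 = [1, 0, 0, 1]

print(simple_matching(isolate_1, isolate_2))
print(jaccard(isolate_1, isolate_2))
```

The two coefficients differ in how they treat shared absences: simple matching counts them as agreement, whereas Jaccard ignores them. With only a handful of traits either value can be misleading, which is one way to see why a phenetic comparison improves as more attributes are scored.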
Table 19.1 Components of Polyphasic Taxonomy

Phylogenetic Classification

With the publication in 1859 of Charles Darwin’s On the Origin of Species, biologists began developing phylogenetic or phyletic classification systems that sought to compare organisms on the basis of evolutionary relationships. The term phylogeny (Greek phylon, tribe or race; genesis, generation or origin) refers to the evolutionary development of a species. Scientists realized that when they observed differences and similarities between organisms as a result of evolutionary processes, they also gained insight into the history of life on Earth. However, for much of the twentieth century, microbiologists could not effectively employ phylogenetic classification, primarily because of the lack of a good fossil record. When Carl Woese and George Fox proposed using small subunit (SSU) rRNA nucleotide sequences to assess evolutionary relationships among microorganisms, the door opened to the resolution of long-standing inquiries regarding the origin and evolution of the majority of life forms on Earth: microbes. Whole genome sequencing is the most recent analytical tool to elucidate phylogenetic relationships among microbes.

Image credit: ©Pixtal/age fotostock

Comprehension Check
1. What is a natural classification? What microbial features might have been considered when devising a natural classification scheme?
2. What is polyphasic taxonomy, and what types of data does it consider? Should each type of data be of equal weight? Why or why not?
3. Consider the finding that bacteria capable of anoxygenic photosynthesis belong to several different phyla. How do you think these bacteria might have been originally classified, and what types of data do you think were key in making the most recent taxonomic assignments?

19.2 Taxonomic Ranks Provide an Organizational Framework

After reading this section, you should be able to:
a. Outline the general scheme of taxonomic hierarchy
b. Explain how the binomial system of Linnaeus is used in microbial taxonomy

The definition of a bacterial or archaeal species is widely debated, as discussed in section 19.5. Nonetheless, for practical reasons it is essential that the established rules of taxonomy are followed. Microbes are placed in taxonomic levels arranged in a nonoverlapping hierarchy so that each level includes not only the traits that define the rank above it but also a new set of more restrictive traits (figure 19.1). Thus within each domain (Bacteria, Archaea, or Eukarya), each organism is assigned (in descending order) to a phylum, class, order, family, genus, and species epithet or name. Some microbes are also given a subspecies designation. Microbial groups at each level have a specific suffix that indicates rank or level.

The application of Linnaeus’s classification system to bacteria began in Pasteur and Koch’s time, about 1872. However, within 20 years microbiologists became dissatisfied, considered it haphazard, and called for more uniform criteria in organizing phyla, classes, orders, families, genera, and species. With the ongoing explosion in metagenomic analysis, some would argue that history is repeating itself. Within the last few decades, thousands of 16S rRNA genes and protein-coding genes have been sequenced that do not belong to any previously defined taxa. This data explosion has led to the recent development of the taxonomic classification superphylum, below domain and above phylum (e.g., in figure 19.1, superphylum would be placed between Bacteria and Proteobacteria). Ideally a superphylum includes organisms of several phyla that share a number of distinctive characteristics, such as unusual morphological or metabolic features. However, some feel that the term is being loosely applied based on insufficient data (for instance, SSU rRNA sequences alone).

At the other end of the classification scheme is the most fundamental definition of a bacterial or archaeal species. A species is a collection of strains that share many stable properties and differ significantly from other groups of strains. A strain consists of the descendants of a single, pure microbial culture. Strains within a species may be described in a number of different ways. Biovars are variant strains characterized by biochemical or physiological differences, morphovars differ morphologically, and serovars have distinctive antigenic (immunologically reactive) properties.

Because changes in species assignment are not uncommon, one strain is designated as the type strain for each species. The type strain is the holder of the species name. This ensures permanence of names when nomenclature revisions occur because the type strain must remain within the original species. It is usually one of the first strains studied and often is more fully characterized than others; however, it does not have to be the most representative member. Only those strains very similar to the type strain or type species are included in a species. Each species is assigned to a genus, the next rank in the taxonomic hierarchy. A genus is a well-defined group of one or more species that is clearly separate from other genera. In practice, considerable subjectivity occurs in assigning species to a genus, and taxonomists may disagree about the composition of genera.

Figure 19.1 Hierarchical Arrangement in Taxonomy. In this example, members of the genus Shigella are placed within higher taxonomic ranks. Not all classification possibilities are given for each rank to simplify the diagram. Note that -ales denotes order and -ceae indicates family.
Domain: Bacteria
Phylum: Proteobacteria
Class: α-Proteobacteria, β-Proteobacteria, γ-Proteobacteria, δ-Proteobacteria, ε-Proteobacteria
Order: Chromatiales, Thiotrichales, Legionellales, Pseudomonadales, Vibrionales, Enterobacteriales, Pasteurellales
Family: Enterobacteriaceae
Genus: Enterobacter, Escherichia, Klebsiella, Proteus, Salmonella, Serratia, Shigella, Yersinia
Species: S. boydii, S. dysenteriae, S. flexneri, S. sonnei
Strain: S. flexneri serotype 2a strain 2457T

Microbiologists name microorganisms using the binomial system of Linnaeus. The Latinized, italicized name consists of two parts. The first part is the generic name (i.e., the genus), and the second is the species name (e.g., Yersinia pestis, the causative agent of plague). The species name is stable; the oldest epithet for a particular organism takes precedence and must be used. In contrast, a generic name can change if the organism is assigned to another genus. For example, some members of the genus Streptococcus were placed into two new genera, Enterococcus and Lactococcus, based on rRNA analysis and other characteristics. Thus Streptococcus faecalis is now Enterococcus faecalis. To be recognized as a new species, genomic, metabolic, morphological, reproductive, and ecological data must be accepted and published in the International Journal of Systematic and Evolutionary Microbiology; until that time, the new species name will appear in quotation marks.

Microbes that have not been grown in pure culture but for which there is sufficient genetic characterization may be given a provisional genus and species name preceded by the term Candidatus, meaning candidate. For instance, Candidatus Mycoplasma girerdii is believed to be associated with preterm birth. It was first detected by metagenomic sequencing in the microbial community in a preterm infant’s mouth. Although complete genomes are known for several related strains, none have been successfully cultured. Note that its provisional genus and species names are not italicized.
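The nonoverlapping hierarchy and the binomial name can be pictured as a simple ordered record. The following is a minimal sketch, assuming the Shigella flexneri lineage shown in figure 19.1; the function names are hypothetical, and the suffix check covers only the two suffixes the figure caption mentions (-ales for order and -ceae for family).

```python
# Ranks in descending order, as listed in the text.
RANKS = ("domain", "phylum", "class", "order", "family", "genus", "species")

# Lineage of Shigella flexneri following figure 19.1.
SHIGELLA_FLEXNERI = {
    "domain": "Bacteria",
    "phylum": "Proteobacteria",
    "class": "γ-Proteobacteria",
    "order": "Enterobacteriales",
    "family": "Enterobacteriaceae",
    "genus": "Shigella",
    "species": "flexneri",
}


def rank_from_suffix(taxon_name):
    """Infer order or family from the suffix; return None for any other name."""
    if taxon_name.endswith("ales"):
        return "order"
    if taxon_name.endswith("ceae"):
        return "family"
    return None


def binomial(lineage, abbreviate=False):
    """Build the two-part name: generic name followed by the species epithet."""
    genus = lineage["genus"]
    if abbreviate:
        genus = genus[0] + "."
    return f"{genus} {lineage['species']}"


for rank in RANKS:
    print(f"{rank:8s} {SHIGELLA_FLEXNERI[rank]}")
print(binomial(SHIGELLA_FLEXNERI))
print(binomial(SHIGELLA_FLEXNERI, abbreviate=True))
```

Because the genus is stored separately from the species epithet, reassigning an organism to a new genus changes only one field, which mirrors the way Streptococcus faecalis became Enterococcus faecalis while the epithet stayed the same.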
Bergey’s Manual of Systematics of Bacteria and Archaea contains only recognized bacterial and archaeal species as discussed in section 19.6.

Comprehension Check
1. What is the difference between a microbial species and a strain?
2. Why is it important to have a type strain for each species?
3. The genus Salmonella was once thought to contain five species. Most scientists now consider only two species valid: S. bongori and S. enterica. The latter contains six subspecies, and S. enterica subspecies enterica is further subdivided into eight serovars. Three of these, Enteritidis, Typhi, and Typhimurium, were once considered Salmonella species. (a) Inspect figure 19.1 and construct a similar lineage for Salmonella. (b) What criteria do you think were used to make these taxonomic changes?

19.3 Microbial Taxonomy and Phylogeny Are Largely Based on Molecular Characterization

After reading this section, you should be able to:
a. Construct a concept map describing the approaches commonly used to determine taxonomic classification
b. Assess the impact molecular methods have had on the field of microbial taxonomy and phylogeny
c. Compare and contrast nucleotide sequencing and nonsequencing-based molecular approaches used in microbial taxonomy and phylogeny
d. Select an appropriate technique to identify a microbial genus, species, and strain
e. Predict the basic biological as well as public health implications of microbial taxonomic identification

Many different approaches are used to classify and identify microorganisms that have been isolated and grown in pure culture. For clarity, we divide these approaches into two groups: classical and molecular. The most durable identifications are those that are based on a combination of approaches.
Methods often employed in routine laboratory identification of pathogenic bacteria are covered in chapter 37, Clinical Microbiology and Immunology.

Classical Characteristics

Classical approaches to taxonomy make use of morphological, physiological, biochemical, and ecological characteristics. These characteristics have been employed in microbial taxonomy for many years and form the basis for phenetic (phenotypic) classification. When used in combination, they are quite useful in routine identification of well-characterized microbes.

Morphological Characteristics

Morphological features are important in microbial taxonomy for many reasons (table 19.2). Morphology is easy to study and analyze, particularly in eukaryotic microorganisms and more complex bacteria and archaea. In addition, morphological comparisons are valuable because structural features depend on the expression of many genes and are usually genetically stable. Thus morphological similarity often is a good indication of phylogenetic relatedness.

Table 19.2 Some Morphological Features Used in Classification and Identification

Physiological and Metabolic Characteristics

Physiological and metabolic characteristics are useful because they are directly related to the nature and activity of microbial proteins. For instance, the detection of specific end products of fermentation in a newly discovered microorganism reveals the presence of specific catabolic enzymes and the genes that encode them. Therefore analysis of characteristics, such as energy metabolism and nutrient transport, provides an indirect comparison of microbial genomes. Table 19.3 lists some of the most important of these properties.

Biochemical Characteristics

Among the more useful biochemical characteristics used in microbial taxonomy are bacterial fatty acids, which can be analyzed using a technique called fatty acid methyl ester (FAME) analysis.
A fatty acid profile reveals differences in chain length, degree of saturation, branched chains, and hydroxyl groups. Microbes of the same species will have identical fatty acid profiles, provided they are grown under the same conditions; this limits FAME analysis to only those microbes that can be grown in pure culture. Finally, because the identification of a species is done by comparing the results of the unknown microbe in question with the FAME profile of other, known microbes, identification is only possible if the species in question has been previously analyzed. Nonetheless, FAME analysis is particularly important in public health, food, and water microbiology. In these applications, microbiologists seek to identify specific pathogens.

Cross-references: Microbiology of food (chapter 41); Purification and sanitary analysis ensures safe drinking water (section 43.1)

Table 19.3 Some Physiological and Metabolic Characteristics Used in Classification and Identification

Advances in mass spectrometry (MS) have resulted in the fast and accurate identification of bacteria based on the presence of specific, highly abundant proteins. The specific type of MS used is called matrix-assisted laser desorption/ionization-time of flight (MALDI-ToF). MALDI-ToF enables the analysis of complex biomolecules that could not previously be studied by MS. The material to be analyzed is dried on a sample holder (called a target) and then mixed with a molecular film called a matrix. When a UV laser beam strikes the sample target, the matrix helps stimulate release of the sample from the surface; this is matrix-assisted desorption. Once the matrix-sample is released (desorbed), the matrix transfers protons to the sample, which then becomes ionized; this is matrix-assisted ionization.
Once ionized, biomolecules are “flown” across a space within the instrument; the time taken for a molecule to fly from one side to the other is used to determine the mass of each molecule; this is time of flight. In its simplest form, MALDI-ToF identification of bacteria involves the transfer of whole cells from a single colony grown under specific conditions to a sample target. Once dried, a matrix is deposited and the bacterial samples are analyzed. MALDI-ToF yields the masses of many highly abundant bacterial proteins. As with FAME analysis, each experimentally derived protein profile must be matched to a protein profile from a known bacterium. MALDI-ToF is becoming increasingly important in medical microbiology laboratories where the same strains of organisms are regularly encountered. As with FAME, microbes must be grown under very specific conditions, and MALDI-ToF cannot be used to identify newly discovered microbes.

Cross-reference: Proteomics explores total cellular proteins (section 18.5)

Ecological Characteristics

The ability of a microorganism to colonize a specific environment is of taxonomic value. Some microbes may be very similar in many other respects but inhabit different ecological niches, suggesting they may not be as closely related as first suspected. Some examples of taxonomically important ecological properties are life cycle patterns; the nature of symbiotic relationships; the ability to cause disease in a particular host; and habitat preferences such as requirements for temperature, pH, oxygen, and osmotic concentration. Many growth requirements are considered physiological characteristics as well (table 19.3).

Cross-references: Environmental factors affect microbial growth (section 7.5); Many types of microbial interactions exist (section 27.1)

Molecular Characteristics

It is hard to overestimate how the study of DNA, RNA, and proteins has advanced our understanding of microbial evolution and taxonomy.
Evolutionary biologists studying plants and animals draw from a rich fossil record to assemble a history of morphological changes; in these cases, molecular approaches supplement such data. In contrast, microorganisms have almost no fossil record, so molecular analysis is the only feasible means of collecting a large and accurate data set that explores microbial evolution. When scientists are careful to make only valid comparisons, phylogenetic inferences based on molecular approaches provide the most robust analysis of microbial evolution.

Microbial genomes can be directly compared and taxonomic similarity can be estimated in many ways. DNA sequence can be determined for one or a few genes, or for an entire genome, depending on the degree of identification required. The recent rapid drop in cost and time to generate a genome sequence has made this approach viable for many taxonomic applications. Molecular techniques developed in the late twentieth century are now in transition to techniques based on whole-genome sequencing (WGS). In many instances, the original nomenclature remains, although the technique is performed in silico rather than in vitro. As with phenetic approaches, comparison to standards and type strains forms the basis for identification and phylogenetic placement. Many classical properties used in strain identification are now inferred from genome sequences.

Signature Sequences

The rRNAs from small ribosomal subunits (16S from bacterial and archaeal cells and 18S from eukaryotes) have become the molecules of choice for inferring microbial phylogenies and making taxonomic assignments at the genus level. The small subunit rRNAs (SSU rRNAs) are widely applicable in studies of microbial evolution, relatedness, and genus identification for several important reasons (figure 19.2). First, they play the same role in all microorganisms.
In addition, because ribosomes are absolutely necessary for survival and ribosomes cannot function without SSU rRNAs, the genes encoding these rRNAs cannot tolerate large mutations. Thus these genes change very slowly with time. Finally, rRNA genes are rarely subject to horizontal gene transfer, an important factor in comparing sequences for phylogenetic purposes. The utility of SSU rRNAs is extended by the presence of certain sequences within SSU rRNA genes that are variable among organisms as well as other regions that are quite similar. The variable regions enable comparison between closely related microbes, whereas the stable sequences allow the comparison of distantly related microorganisms. Comparative analysis of SSU rRNA sequences from thousands of organisms has demonstrated the presence of oligonucleotide signature sequences. These are short, conserved nucleotide sequences that are specific for phylogenetically defined groups of organisms. Thus the signature sequences found in bacterial rRNAs are rarely or never found in archaeal rRNAs and vice versa. Likewise, the 18S rRNA of eukaryotes bears signature sequences that are specific to the domain Eukarya. In addition, signature sequences for a lower taxon, for example, the genus Pseudomonas, may be found in the variable regions of a higher taxon like Bacteria (figure 19.2b). The ability to amplify regions of rRNA genes (rDNA) by the polymerase chain reaction (PCR) and sequence the DNA using next-generation techniques has greatly increased the efficiency by which SSU rRNA sequences can be obtained. PCR can amplify rDNA from the genomes of different organisms because conserved nucleotide sequences flank the nucleotides that can be used to reveal the microbe’s identity (figure 19.2b). In practice, this means that PCR primers can be generated to amplify rDNA from both cultured and uncultured microbes. 
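The reason conserved flanking sequences make PCR possible across different organisms can be illustrated with a toy in silico amplification. This is a minimal sketch under stated assumptions: the primers and templates below are short, invented sequences (not the real 27F, 1492R, Ps308, or Ps1258 primers), matching must be exact, and the function name in_silico_pcr is hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def in_silico_pcr(template, forward_primer, reverse_primer):
    """Return the amplicon bounded by both primer sites, or None if a site is missing.

    The forward primer matches the template strand directly; the reverse primer
    anneals to the opposite strand, so its reverse complement is searched for.
    """
    start = template.find(forward_primer)
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse_primer)
    end = template.find(reverse_site, start + len(forward_primer))
    if end == -1:
        return None
    return template[start:end + len(reverse_site)]


# Invented primers that bind the conserved flanks shared by both toy templates.
FORWARD = "GATTACAG"
REVERSE = "TTAACCGG"  # reverse complement of the conserved site CCGGTTAA

# Two toy "organisms": same conserved flanks, different variable region between them.
organism_a = "TT" + "GATTACAG" + "ACGTACGT" + "CCGGTTAA" + "AA"
organism_b = "TT" + "GATTACAG" + "TTGGCCAT" + "CCGGTTAA" + "AA"

print(in_silico_pcr(organism_a, FORWARD, REVERSE))
print(in_silico_pcr(organism_b, FORWARD, REVERSE))
```

The same primer pair recovers a product from both templates, yet the products differ in the region between the primer sites. That internal, variable stretch is what carries the information used to tell the organisms apart.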
Any uncultivated microorganism that is identified solely on its nucleic acid sequence (or other observable, quantifiable phenotype) is termed a phylotype.

Figure 19.2 Signature rRNA and Amino Acid Sequences Help to Identify Microbes. (a) Representative examples of rRNA secondary structures from the three domains: Bacteria (Escherichia coli), Archaea (Methanococcus vannielii), and Eukarya (Saccharomyces cerevisiae). (b) Diagram of a bacterial 16S rRNA gene (~1540 bases). Conserved regions (red) are present in all bacteria, and bacterial-specific PCR primers 27F and 1492R bind where indicated. These sites are highlighted in red in (a). Variable regions (gold) contain genus-specific sequence, and PCR primers Ps308 and Ps1258 indicate the positions of signature sequences specific to Pseudomonas. These sites are highlighted in gold in (a). (c) Alignment of amino acids 203-221 in L-lactate permease from Borrelia species (Lyme disease Borrelia: B. burgdorferi, B. afzelii, B. garinii; relapsing fever Borrelia: B. hermsii, B. recurrentis, B. parkeri, B. miyamotoi). Shaded boxes are amino acids identical to the B. burgdorferi sequence. An amino acid indel (shaded blue) distinguishes species that cause Lyme disease from those that cause the condition known as relapsing fever.

The use of SSU rRNA sequences as a taxonomic marker has been validated as a molecular measure of relatedness.
When SSU rRNA sequence data are correlated with traditional measures of assigning species, organisms of the same species show at least 98.65% sequence identity in their SSU rDNA sequences. This means that when comparing the SSU rRNA sequences of two organisms, differences at more than 20 positions of a ~1,540 base rRNA indicate that they probably derive from different species.

Cross-reference: Polymerase chain reaction amplifies targeted DNA (section 17.2)

Signature sequences are present in genes other than those encoding rRNA. Many genes have nucleotide insertions or deletions of specific lengths and sequences at fixed positions. A particular nucleotide sequence that is inserted or deleted may be found exclusively among all members of a phylum. These taxon-specific insertions and deletions are called conserved indels (for insertion/deletion). Indels are particularly useful in phylogenetic studies when they are flanked by conserved regions. In such cases, changes in the signature sequence cannot be due to sequence misalignments. When an indel occurs in the coding region of a gene, it must be a multiple of three nucleotides to preserve the reading frame. Alignment of the inferred amino acid sequences reveals indels useful for phylogenetic analysis (figure 19.2c).

Whole-Genome Comparison

As the field of taxonomy shifts to whole-genome sequencing (WGS) for strain identification and taxonomic classification, new quantitative measures of relatedness are being developed and evaluated. As with any metric, standardization of methods and interpretations is required. These processes are ongoing. An initial step to identify a newly determined sequence is SSU rRNA comparison, which provides confident identification to the genus level. Identification to the species level requires detailed gene-by-gene comparison of the new isolate and type strains of its closest relatives.
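The SSU rRNA comparison step, and the arithmetic behind the figure of about 20 differences in a ~1,540 base rRNA, can be sketched as follows. This is an illustrative sketch rather than a standard tool: it assumes the two sequences have already been aligned to equal length with "-" marking gaps, and it assumes that columns containing a gap are skipped, which is one convention among several. The function names and the short test sequences are hypothetical; only the 98.65% threshold comes from the text.

```python
SSU_SPECIES_THRESHOLD = 0.9865  # same-species SSU identity cutoff quoted in the text


def percent_identity(aligned_a, aligned_b):
    """Fraction of identical positions between two pre-aligned sequences.

    Assumption: columns where either sequence has a gap ('-') are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    columns = [(a, b) for a, b in zip(aligned_a.upper(), aligned_b.upper())
               if a != "-" and b != "-"]
    if not columns:
        raise ValueError("no comparable positions")
    matches = sum(1 for a, b in columns if a == b)
    return matches / len(columns)


def max_same_species_differences(length, threshold=SSU_SPECIES_THRESHOLD):
    """Largest whole number of mismatches that still meets the identity threshold."""
    return int(length * (1 - threshold))


def could_be_same_species(aligned_a, aligned_b, threshold=SSU_SPECIES_THRESHOLD):
    """True if SSU identity meets the threshold; species assignment still needs more data."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# 1540 * (1 - 0.9865) is about 20.8, so 21 or more differences fall below the cutoff.
print(max_same_species_differences(1540))  # 20

# Short made-up sequences, far shorter than a real 16S gene.
print(percent_identity("ACGTACGTAC", "ACGTACGTAT"))
```

Passing this check is necessary but not sufficient for placing two isolates in the same species, which is why the text treats SSU rRNA comparison as a first step toward genus-level identification rather than a final answer.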
Average nucleotide identity (ANI) is now widely considered to be the standard for species identification. This technique uses pairwise alignments between all sequences shared between two genomes and calculates the fraction of identical nucleotides. ANI values for two genomes of the same species should be at least 95 to 96%. ANI is poised to replace a biochemical technique called DNA-DNA hybridization (DDH). DDH is performed by mixing genomic DNA from two strains. The mixture is heated until denaturation occurs and then slowly cooled to allow renaturation. Noncomplementary regions remain unpaired, and the degree of renaturation is calculated. This is a relatively complex assay, and results depend on the quality of the extracted DNA and other factors. By contrast, bioinformatics software quickly calculates digital DDH values using WGS data, although this technique is less commonly used than ANI.

Another biochemical technique gone digital is determination of G + C content. This metric is simply the percentage of bases in DNA that are G or C, and it is readily determined computationally from WGS data. Organisms range from around 30% to 80% G + C. Despite this wide range of variation, the G + C content of strains within a species is constant and varies little within a genus.

Subspecies and Strain Identification

For many applications, identification to levels below species is required. Typically, genes that evolve more quickly than those that encode rRNA must be analyzed. The technique called multilocus sequence analysis (MLSA) compares the sequences of several conserved housekeeping genes (figure 19.3). At least five genes are examined to avoid misleading results that can arise through horizontal gene transfer. Because many different versions, or alleles, of each gene can exist, the finding that two microbial isolates share the same alleles for multiple genes is strong evidence that the two strains are closely related, perhaps even the same strain. MLSA is now often performed from whole-genome sequences, where it is termed wgMLSA. The availability of wgMLSA data enables extended gene-by-gene comparison.

Figure 19.3 Genome Coverage of Genetic Taxonomic Approaches. A bacterial or archaeal genome contains at least one locus encoding 16S rRNA. MLSA samples multiple genes throughout the genome. SNP analysis compares more, but shorter, nucleotide sequences that span the entire genome.

To survey a large fraction of the genome, single nucleotide polymorphisms (SNPs, pronounced "snips") are identified in specific genes, intergenic regions, or other noncoding regions (figure 19.3). Originally developed to analyze human DNA, SNP analysis targets specific regions because they are normally conserved, so single base pair differences reveal evolutionary change. SNP analysis shares features with analysis of restriction fragment length polymorphism (RFLP). This technique identifies differences in restriction endonuclease digestion patterns, which reflect individual base pair changes. When this analysis is applied to the gene encoding the SSU rRNA, it is termed ribotyping. With the advent of WGS, these techniques are increasingly performed in silico.

Comprehension Check
1. What are the advantages of using each major group of characteristics (morphological, physiological and metabolic, biochemical, ecological, and molecular) in classification and identification? How is each group related to the nature and expression of the genome? Give examples of each type of characteristic.
2. Why is rRNA suitable for determining relatedness?
3. List the steps you would take to identify an organism just isolated in pure culture. What criteria would tell you if it is a new species?
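Two of the genome-derived metrics described in section 19.3, G + C content and average nucleotide identity, can be sketched in a few lines of Python. The sketch below is a deliberate simplification: it assumes the shared genome fragments have already been found and aligned to equal length, whereas real ANI software performs that search and alignment itself. The function names are hypothetical.

```python
def gc_content(dna: str) -> float:
    """Percentage of A/C/G/T bases in a DNA string that are G or C."""
    bases = [base for base in dna.upper() if base in "ACGT"]
    if not bases:
        raise ValueError("no A, C, G, or T bases found")
    gc = sum(1 for base in bases if base in "GC")
    return 100.0 * gc / len(bases)


def simple_ani(aligned_fragments) -> float:
    """Simplified ANI: percent identical nucleotides over all shared fragments.

    aligned_fragments is a list of (fragment_a, fragment_b) pairs that are
    assumed to be pre-aligned and of equal length.
    """
    identical = 0
    compared = 0
    for frag_a, frag_b in aligned_fragments:
        if len(frag_a) != len(frag_b):
            raise ValueError("each fragment pair must have equal length")
        identical += sum(1 for a, b in zip(frag_a, frag_b) if a == b)
        compared += len(frag_a)
    if compared == 0:
        raise ValueError("no aligned positions to compare")
    return 100.0 * identical / compared


def same_species_by_ani(aligned_fragments, threshold: float = 95.0) -> bool:
    """Apply the lower bound of the 95 to 96% ANI range quoted in the text."""
    return simple_ani(aligned_fragments) >= threshold
```

For example, `gc_content("ATGC")` gives 50.0, and two toy fragment pairs that differ at one of eight positions give a simplified ANI of 87.5, well below the species threshold.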
19.4 Phylogenetic Trees Illustrate Evolutionary Relationships

After reading this section, you should be able to:
a. Paraphrase the rationale underpinning the construction of phylogenetic trees
b. Compare and contrast rooted and unrooted trees
c. Outline the general considerations used in building a phylogenetic tree
d. Characterize the challenges horizontal gene transfer introduces in the study of microbial evolution

Microbial taxa within Bacteria and Archaea form discrete, genealogically clustered groups that can be illustrated in phylogenetic trees. Phylogenetic trees show inferred evolutionary relationships in the form of multiple branching lineages connected by nodes (figure 19.4). Organisms whose nucleotide sequences have been analyzed are identified at the tip of each branch. Each node (branchpoint) represents a divergence event, and the length of the branches correlates with the number of molecular changes that have taken place between the two nodes. Often sequences are obtained from well-classified microbes that have been grown in pure culture; however, this is not always the case. SSU rRNA sequences have become particularly important in both identifying microbes and constructing phylogenetic trees to describe their evolutionary relationships. An inclusive term for the organism at each branch tip, regardless of whether or not it has been cultured, is the operational taxonomic unit (OTU). The term clade refers to a group of organisms with a common ancestor. We now briefly describe how phylogenetic trees can be built with the intention of increasing an understanding of what they represent.

There are five steps in building a phylogenetic tree. First, the nucleotide or amino acid sequences must be aligned. Alignments are performed with software, although manual inspection of the alignment is also important (figure 19.4a). In addition to SSU rRNA gene sequences, protein-coding genes may be analyzed, in which case the alignment of amino acids is preferred. This is because the genetic code is degenerate, so even if a nucleotide sequence is not conserved, the amino acid sequence may be. For instance, if a protein-coding gene were mutated so that a codon changed from AGA to AGG, arginine would still be added to the growing polypeptide during translation. Next the alignment must be examined for a phylogenetic signal; this will determine if it is appropriate to continue with tree building (figure 19.4b). There are two extremes in this regard: At one end of the spectrum, the sequences align perfectly. The other extreme is 25% identity, or what would be expected if two random DNA sequences were aligned. (When there are four possible nucleotides, each position has a 1 in 4 chance of matching.) Phylogenetic analysis can only be performed on those sequences that fall in the middle, with a mixture of random and matched positions. The third step is the hardest: choosing which tree-building method to use. We briefly review some of the more popular methods. The last two steps involve the application of the selected method, performed by a computer, followed by manual examination of the resulting tree to confirm that it is logical. For example, a computer-generated tree that places a mammal and an archaeon on the same branch would certainly require correction.

Figure 19.4 Constructing a Phylogenetic Tree Using a Distance Method. (a) Nucleotide sequences are aligned, pairwise comparison is made, and the number of nonidentical nucleotides is scored. For example, when sequences from microbes 1 and 2 are compared, there are 4 mismatches out of 15 total nucleotides, yielding a calculated evolutionary distance (ED) of 0.27. (b) The calculated ED values are corrected to account for back mutation to the original genotype or other forward mutations that could have occurred at the same site before generating the observed genotype. (c) A tree-building method is then selected (in this case, a distance method is used), and computer analysis of the values generates a phylogenetic tree as shown in (d). ED values are indicated for each branch. Numbers are not exact due to rounding for a short sequence.

Sequence alignment (nucleotide positions 1-15):
Microbe 1   ACUGACUCAUAGAUC
Microbe 2   AGUGAGUCAGACAUC
Microbe 3   UCUGGGUCAGACAUC
Microbe 4   UGUGGUCCAUACAUC

Evolutionary distances:
Microbes   Mismatches   Calculated ED   Corrected ED
1 and 2    4/15         0.27            0.33
1 and 3    5/15         0.33            0.40
1 and 4    6/15         0.40            0.62
2 and 3    3/15         0.20            0.23
2 and 4    5/15         0.33            0.40
3 and 4    4/15         0.27            0.35

Approaches to building a phylogenetic tree can be divided into two broad categories: a distance-based (phenetic) approach and a character-based (cladistic) approach. Distance-based approaches are the most intuitive. Here the differences between the aligned sequences are counted for each pair and summarized in a single statistic, which is roughly the percent difference between the two sequences (figure 19.4b,c). A tree is generated by serially linking pairs that are ever more distantly related (i.e., start with those with the least number of sequence differences and move to those with the most). This is called cluster analysis and should be carefully applied as it has the unattractive capability of generating trees even in the absence of evolutionary relationships.
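The distance calculation described above can be reproduced with a short Python sketch using the four 15-nucleotide sequences from figure 19.4a. The uncorrected distance is simply mismatches divided by aligned length. For the correction step, the sketch uses the Jukes-Cantor formula as an assumption; the text does not name the correction used in the figure, so the corrected values here agree with the figure only approximately. The final lines pick the closest pair, which is the first link a cluster analysis would make.

```python
import math
from itertools import combinations

# Aligned sequences from figure 19.4a (positions 1-15).
ALIGNMENT = {
    "Microbe 1": "ACUGACUCAUAGAUC",
    "Microbe 2": "AGUGAGUCAGACAUC",
    "Microbe 3": "UCUGGGUCAGACAUC",
    "Microbe 4": "UGUGGUCCAUACAUC",
}


def mismatches(seq_a: str, seq_b: str) -> int:
    """Count nonidentical positions in two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


def calculated_distance(seq_a: str, seq_b: str) -> float:
    """Uncorrected evolutionary distance: fraction of mismatched positions."""
    return mismatches(seq_a, seq_b) / len(seq_a)


def jukes_cantor(p: float) -> float:
    """Jukes-Cantor correction (an assumed model, not named in the text)."""
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Pairwise distances for every pair of microbes.
distances = {
    (name_a, name_b): calculated_distance(ALIGNMENT[name_a], ALIGNMENT[name_b])
    for name_a, name_b in combinations(ALIGNMENT, 2)
}

# Cluster analysis starts by linking the pair with the fewest differences.
closest_pair = min(distances, key=distances.get)

for pair, p in distances.items():
    print(pair, round(p, 2), round(jukes_cantor(p), 2))
```

With these sequences the closest pair is microbes 2 and 3 (3 mismatches out of 15), so a distance method would join them first and then add the remaining microbes in order of increasing distance.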
Neighbor joining is another distance-based method. It uses a modified distance matrix that attempts to avoid generating trees that lack an evolutionary basis by adjusting the distance between each pair of nodes according to their average divergence from all other nodes.

Character-based methods for phylogenetic tree building are more complicated but generate more robust trees. These methods start with assumptions about the pathway of evolution, infer the ancestor at each node, and choose the best tree according to a specific model of evolutionary change. These methods include maximum parsimony, which assumes that the fewest number of changes occurred between ancestor and extant (living) organisms. Another approach is called maximum likelihood. This requires a large data set because, for each possible tree that can be built, its probability (i.e., the likelihood) is determined from certain evolutionary and molecular information, and the tree with the greatest probability based on these criteria is selected. All tree-building methods have their advantages and disadvantages, so it is advisable to use several methods to analyze the same data set. The desired outcome is that different approaches generate similar trees.

Importantly, a tree may be unrooted or rooted. An unrooted tree (figure 19.5a) represents phylogenetic relationships but does not indicate which organisms are more primitive relative to the others. Figure 19.5a shows that A is more closely related to C than it is to either B or D, but it does not indicate which of the four species might be the oldest. In contrast, the rooted tree (figure 19.5b) includes a node that serves as the common ancestor and shows the development of the four species from this root. It is much more difficult to develop a rooted tree. For example, there are 15 possible rooted trees that connect four species but only three possible unrooted trees.
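The counts of 15 rooted and 3 unrooted trees for four species follow from standard formulas for strictly bifurcating trees with labeled tips: (2n - 5)!! unrooted trees and (2n - 3)!! rooted trees for n taxa, where !! denotes the double factorial. The formulas are not given in the text; the short Python sketch below applies them to show how quickly the number of candidate trees grows.

```python
def double_factorial(n: int) -> int:
    """Product n * (n - 2) * (n - 4) * ... down to 1 (returns 1 for n <= 1)."""
    result = 1
    while n > 1:
        result *= n
        n -= 2
    return result


def unrooted_tree_count(taxa: int) -> int:
    """Number of distinct unrooted bifurcating trees for labeled taxa."""
    if taxa < 3:
        raise ValueError("an unrooted tree needs at least three taxa")
    return double_factorial(2 * taxa - 5)


def rooted_tree_count(taxa: int) -> int:
    """Number of distinct rooted bifurcating trees for labeled taxa."""
    if taxa < 2:
        raise ValueError("a rooted tree needs at least two taxa")
    return double_factorial(2 * taxa - 3)


for n in (4, 5, 10):
    print(n, unrooted_tree_count(n), rooted_tree_count(n))
```

For ten taxa there are already more than two million unrooted trees, which helps explain why methods that score every possible tree, such as maximum likelihood, are computationally demanding.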
An unrooted tree can be rooted by adding data from an outgroup, a species known to be very distantly related to all the species in the tree (figure 19.5c). The root is determined by the point of the tree where the outgroup joins. This provides a point of reference to identify the oldest node on the tree, which is the node closest to the outgroup. So, for example, in figure 19.5c, organism Z is the outgroup and the oldest node on the tree is marked with an arrow.

Figure 19.5 Phylogenetic Tree Topologies. (a) Unrooted tree joining four taxonomic units. (b) Rooted tree. (c) The tree shown in (a) can be rooted by adding an outgroup, represented by Z.

Once a tree is constructed, it is important to get a sense of whether the placement of its branches and nodes is legitimate. There are a variety of methods to assess the "strength" of a tree, but the most common is bootstrapping. Bootstrapping repeats the phylogenetic analysis many times, each time on a randomly selected subset of the data presented on the tree. A bootstrap value is the percent of analyses in which that particular branch was found. Typically, bootstrap values of 70% or greater are thought to support a tree. Another approach, called Bayesian inference, may also be used. Rather than looking at a single tree, Bayesian inference analyzes multiple potential trees and calculates the probability that each branch would appear based on this comparison. Although these values are also reported as percentages, they are not directly comparable to bootstrap values. Only values greater than 95% are acceptable when Bayesian inference is used.

An important feature in phylogenetic trees is the scale. Just as a scale bar on a road map indicates distance, the scale bar on a phylogenetic tree illustrates the evolutionary distance. This is usually measured in number of mutations per 100 nucleotides or amino acid substitutions per 100 amino acid residues.
This may be expressed as a number without units, for example, 0.02 (2 per 100). To continue the map analogy, just as a map does not reveal how long it takes to get from one point to another (due to traffic and weather), the branches on a phylogenetic tree do not indicate the length of time it took for an ancestral microbe to give rise to an extant form. As discussed in section 19.5, the theory of punctuated equilibria is one important reason evolutionary distance, as measured by the similarity of genes or proteins in living organisms, provides little or no information regarding how long ago evolutionary divergence occurred.

One of the biggest challenges in constructing a satisfactory tree is widespread, frequent horizontal gene transfer (HGT). Although microbiologists are careful to exclude from their analysis genes and proteins known to have been subject to HGT, the influence of HGT on phylogeny and evolution cannot be ignored. Indeed, there has been frequent gene transfer among all three domains. Clearly, the pattern of microbial evolution is not truly linear and treelike.

Figure 19.6 Core and Pan-genomes. The core and pan-genomes in cells comprising strains within a species. The core genome is common to all members (blue), while the pan-genome includes all sequences present in any member of the set.

Assessing the impact of HGT on microbial evolution has been guided by genome sequence analysis. Genome comparisons of strains within a species and among species within a phylum have revealed that microbial genomes consist of an older core genome and a more recently acquired pan-genome (figure 19.6). The core genome is the set of genes found in all members of a species (or other monophyletic group). Thus it is thought to represent the minimal number of genes needed for the group of microbes to survive.
In general, these genes encode informational proteins and RNAs involved in DNA replication, transcription, and translation (e.g., rRNA genes). These genes are thought to have been present in the group's common ancestor. By contrast, the pan-genome consists of every gene in all strains of a species (or other taxonomic unit), so it includes the core genome plus every additional gene found in at least one strain. Genes outside the core genome are more recently acquired genes that enable microbial colonization of new niches. In general, genes unique to the pan-genome are considered to have been acquired by HGT. A comparison of the core genome size with the actual genome size of a particular strain thus indicates the evolution of new traits. For instance, current values of the core genome and pan-genome of Bacillus anthracis differ by only roughly 200 genes (about 3,600 versus 3,800, respectively), reflecting the limited genetic diversity within this species. By contrast, the E. coli core genome consists of about 2,800 genes, whereas some estimate that the pan-genome consists of roughly 37,000 genes. The broad genetic diversity among E. coli strains is illustrated by comparing the genomes of the nonpathogenic E. coli strain K12 and the pathogenic strain O157:H7. Their last common ancestor is estimated to have lived about 4.5 million years ago. During this time E. coli has mutated and exchanged genes, allowing the development of many strains. These have radiated to numerous habitats to which strains continue to adapt.

For any given species, the estimated size of the core and pan-genome depends on the number of strains with sequenced genomes. Indeed, as more strains of each species are sequenced, the core genome tends to get smaller, as strains are discovered that lack genes once thought to be common to all genomes. It follows that as core genomes shrink, pan-genomes expand.
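The relationship between core genome, pan-genome, and the number of sequenced strains can be illustrated with set operations. In the Python sketch below, the core genome is the intersection of the strains' gene sets and the pan-genome is their union. The strain names and gene contents are invented toy data (the accessory gene labels acc1 to acc3 are hypothetical), chosen only to show that adding a strain can shrink the core while enlarging the pan-genome.

```python
def core_genome(strain_genes: dict) -> set:
    """Genes present in every strain (set intersection)."""
    gene_sets = list(strain_genes.values())
    if not gene_sets:
        raise ValueError("at least one strain is required")
    return set.intersection(*gene_sets)


def pan_genome(strain_genes: dict) -> set:
    """Genes present in at least one strain (set union)."""
    gene_sets = list(strain_genes.values())
    if not gene_sets:
        raise ValueError("at least one strain is required")
    return set.union(*gene_sets)


# Toy data: two sequenced strains sharing three genes.
strains = {
    "strain_1": {"rpoB", "gyrB", "recA", "acc1"},
    "strain_2": {"rpoB", "gyrB", "recA", "acc2"},
}
core_before = core_genome(strains)
pan_before = pan_genome(strains)

# A third strain is sequenced; it lacks recA and carries a new gene.
more_strains = dict(strains, strain_3={"rpoB", "gyrB", "acc3"})
core_after = core_genome(more_strains)
pan_after = pan_genome(more_strains)

print(len(core_before), len(pan_before))  # sizes with two strains
print(len(core_after), len(pan_after))    # sizes with three strains
```

In this toy case the core genome falls from three genes to two while the pan-genome grows from five genes to six, mirroring the trend described for real species as more genomes are sequenced.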
Mechanisms of genetic variation (chapter 16)

Comprehension Check
1. Could a phylotype be considered an OTU? What about a species?
2. List the differences between distance-based and character-based methods for constructing a phylogenetic tree. Which type is maximum parsimony? Explain your answer.
3. What is the difference between a rooted and unrooted tree? Which provides more information?
4. You are building a tree based on 16S rRNA sequence alignments of a group of spirochetes. Suggest a possible outgroup so that you can build a rooted tree. Refer to chapter 21 for more information about these bacteria.
5. Is HGT involved in movement of genes in the core or pan-genome? Explain.

19.5 Evolutionary Processes and the Concept of a Microbial Species Inspire Debate

After reading this section, you should be able to:
a. Diagram the endosymbiotic theory of the origin of mitochondria and chloroplasts
b. Compare and contrast the two theories that address the origin of the nucleus
c. Explain why the concept of a microbial species is difficult to define
d. List the "gold standard" taxonomic methods currently applied to species designation
e. Explain the importance of adaptive mutations in giving rise to new ecotypes

It is our goal here to describe current models that seek to explain the evolution of new microbial species. Before we begin this discussion, however, it is helpful to consider the origin of all microbes: bacterial, archaeal, and eukaryotic. We then review the controversy that surrounds the word "species" as applied to bacteria and archaea. Only then will it be possible to understand and appreciate the evolutionary mechanisms that drive the development of new species of microorganisms.

Evolution of the Three Domains of Life

As we present in chapter 1, many scientists think that the first self-replicating entity was RNA.
This is because RNA has the capacity to reproduce as well as to catalyze chemical reactions. It is thought that when RNA became enclosed in lipid spheres, the first primitive cell-like forms were generated (see figures 1.5 and 1.6). Considerable evidence indicates that by at least 3.5 billion years ago, such proto-cells (Greek protos, first) had evolved to form the ancestors of our extant microbes. Moreover, by 2.5 billion years ago, bacteria and archaea not only abounded, but each had evolved distinct taxonomic lineages. For instance, the Gram-positive bacterial phylum Firmicutes and Gram-negative phyla Proteobacteria and Cyanobacteria had developed. Indeed, the ancestors of modern cyanobacteria performed the oxygenic photosynthesis responsible for converting our anoxic planet to an oxygenated one.

A reexamination of the tree of life based on SSU rRNA (see figure 1.2) shows that the root of the tree is on the earliest region of the bacterial branch. As we discuss in chapter 1, the root is considered the last universal common ancestor, or LUCA. The placement of LUCA indicates that although bacteria and archaea share similar cellular construction, they are not phylogenetically linked. Because LUCA maps to the bacterial branch of the tree, it is thought that Archaea and Eukarya evolved independently of Bacteria. Recall that this was first suggested by Carl Woese and George Fox in the 1970s. Since that time, the distinction they inferred between bacterial and archaeal lineages has been confirmed by biochemical differences including membrane lipids, cell wall structure, and enzymes involved in gene transcription (table 19.4).

Although Archaea and Eukarya share a recent common ancestor, eukaryotes possess both archaeal and bacterial traits. There are several hypotheses that account for genes of both archaeal and bacterial ancestry on the eukaryotic nuclear genome.
One hypothesis asserts that the first eukaryotic cell arose upon the fusion of an archaeon and a bacterium that lived in close association. Over time, archaeal genes involved in metabolism were lost while bacterial genes involved in information processing were also degraded. Contrary to this "single-step" hypothesis, others suggest a multistep scenario involving a series of endosymbioses. Here an archaeal cell is thought to have engulfed a bacterium that donated genes that would eventually become part of the nuclear genome of an ancestral eukaryote. Most recently, evidence was reported that supports the hypothesis that the first eukaryotic cell arose from within the archaeal lineage. In 2017, archaeal DNA sequences recovered from marine sediments worldwide were found to encode a number of eukaryotic-like proteins, such as membrane proteins involved in phagocytosis and cytoskeletal components. These sequences were proposed to constitute a new archaeal superphylum, Asgard, that is more closely related to eukaryotes than to any other archaeal or bacterial lineage. The Asgard microbes have not been cultivated in the laboratory and are known only through DNA sequence fragments.

Unlike the uncertainty surrounding the origin of the first eukaryotic proto-cell, there is general agreement that mitochondria and chloroplasts arose by the incorporation of endosymbiotic bacteria. Like bacteria, most mitochondria and chloroplasts have a single, circular chromosome and undergo binary fission. In addition, mitochondria and chloroplasts have 70S, not 80S, ribosomes. These observations led to the development of the endosymbiotic hypothesis by Lynn Margulis (figure 19.7). This posits the following series of events. An ancestral eukaryotic cell lost its rigid cell wall but had evolved actin (or its precursor) that enabled amoeboid motility. This nucleated cell was thus able to develop endocytosis.
These mobile proto-eukaryotes became predators of other cells, including bacteria. Predation imposed selection for cellular enlargement and increased motility. Engulfment without digestion of bacterial prey evolved because a smaller bacterial cell provided energy for the larger host cell, while the host protected and supplied nutrients to the bacterial cell. Because endocytosis and oxidative phosphorylation are not compatible on the same membrane, the endosymbiont was retained as a separate subcellular entity. The energy supplied by the endosymbiont conferred a growth advantage to the proto-eukaryote, enabling it to dominate and eventually eliminate other cells that lacked both cell walls and endosymbionts.

As the endosymbiont became more dependent on its host for nutrients and protection, there was little selective pressure for the retention of genes involved in these processes. Conversely, there was strong selective pressure to retain the genes involved in energy conservation. Thus genes whose products were redundant to the host were eventually lost. In fact, such genome reduction is the rule, rather than the exception, among obligate intracellular microbes (see figure 18.18).

Ultimately, the endosymbiont evolved into an energy-providing organelle such as the mitochondrion or another mitochondria-like organelle (e.g., hydrogenosomes and mitosomes). Hydrogenosomes are found in some protists where, like mitochondria, they take up pyruvate that results from glycolysis within the host cytoplasm. Unlike in mitochondria, however, pyruvate in the hydrogenosome is converted to acetate, H2, and CO2, with ATP generated (see figure 5.15). The similarity between certain key genes (and thus their protein products) supports the notion that hydrogenosomes and mitochondria evolved from a single common ancestor, most likely a proteobacterium. Mitosomes are found in some protists.
Of these three organelles, mitochondria appear to be most highly derived (i.e., continued to evolve), since these organelles are the site of oxidative phosphorylation.
Mitochondria, related organelles, and chloroplasts are involved in energy conservation (section 5.6); The proteobacterial origin of mitochondria (section 22.1); Microsporidia are intracellular parasites (section 25.7)

Chloroplasts arose when these new aerobic eukaryotes engulfed a cyanobacterium, probably an ancestor of Prochlorococcus. Again, this led to the development of a mutualistic relationship that evolved into our extant green plants and algae, organisms that possess both mitochondria and chloroplasts. Such endosymbioses exist today in certain protists that retain living cyanobacteria or the functional chloroplasts of their algal prey. Here it is thought a eukaryotic cell with mitochondrion-like organelles engulfed a photosynthetic cell that possessed both mitochondria and chloroplasts. Recent genomic data support this hypothesis, that is to say that plants arose from an ancestral protist that engulfed a cyanobacterium. The situation is more complicated for red algae, which are derived from a protist that engulfed an ancient photosynthetic eukaryote.

Table 19.4 Comparison of Bacteria, Archaea, and Eukarya

Figure 19.7 The Endosymbiotic Theory. (a) According to this hypothesis, mitochondria derived from a proteobacterium. (b) A similar phenomenon occurred for chloroplasts, which derived from cyanobacteria.
MICRO INQUIRY On what evidence is this hypothesis based?

Phylum Cyanobacteria: oxygenic photosynthetic bacteria (section 21.4); Protists (chapter 24)

What Is a Microbial Species?

The term "species concept" describes a theoretical framework used to understand how and why certain organisms can be sorted into discrete taxonomic groups. We discussed "species definition" in sections 19.2 and 19.3, when we reviewed the criteria used to identify a microbial genus, species, or strain. From this, we can see that species definition is the application of the species concept. Both species concept and definition have changed over time and continue to be difficult for microbiologists to agree upon. Because bacteria and archaea lack sexual reproduction, extensive morphological features, and a fossil record, microbiologists are at a distinct disadvantage when defining species, as compared to biologists studying other forms of life.

Historically, the application of different criteria in making species assignments has led to taxonomic confusion. In some cases, a single microbial species is so metabolically and genetically diverse that it seems probable that the group represents multiple species. On the other hand, some species are very narrowly defined, such that two species differ very little. For instance, Bacillus anthracis strains are so similar to B. cereus that many believe all B. anthracis strains are really members of the B. cereus species. It is argued that only because B. anthracis causes anthrax does it have its own species designation.
In an effort to clarify and standardize microbial taxonomy, the ICSP recommends four criteria to meet a "gold standard" for species assignment: the microbe must be phenotypically similar to others in the group; whole-genome similarity as determined by DNA-DNA hybridization must be at least 70%; the melting temperature of the DNA (a reflection of the G + C content) must be within 5°C; and rRNA gene sequences must diverge by less than 3%. As noted previously, these techniques may soon be replaced with a genomic metric, such as average nucleotide identity. Even with these proposed updates, some remain uncomfortable with these criteria. They point out that two microorganisms with, for instance, only 75% similarity in DNA and 98% rRNA gene sequence identity can be considered the same species, but if these criteria were applied to eukaryotes, all primates (monkeys, apes, and you) would be lumped together as a single species! Indeed, it remains unresolved whether the species concept can be applied to microbes.

Microbial Evolutionary Processes

While the debate regarding the operational definition of a microbial species continues, microbiologists agree that the microbial species concept is grounded in natural selection and evolution. As the most ancient life forms on Earth, bacteria and archaea have evolved and adapted to virtually every habitat. While their diverse metabolic strategies and ability to tolerate extreme conditions explain why microbes display such enormous diversity, natural selection explains how this diversity came to be. Recall that genetic diversity in members of Archaea and Bacteria must occur asexually. Thus heritable genetic changes in these organisms are introduced principally by two mechanisms: mutation and HGT, both of which are subject to natural selection.

Generally speaking, it is thought that mutation drives initial speciation events, and HGT permits more rapid radiation thereafter. In other words, a new species must arise from its single, ancestral species.
This makes sense if one assumes that the ancestral population of microbes was genetically homogeneous. Genetic variation can only arise within a population possessing identical (or nearly identical) genomes by mutation, gene loss and gain, and intragenomic recombination (figure 19.8a). By definition, there is not enough genetic diversity to drive speciation among such a population by HGT. Anagenesis, also known as genetic drift, refers to small, random genetic changes that occur over generations. It might seem that very small genetic differences within a microbial population would be of little evolutionary significance. However, model studies designed to assess competition between microbial populations have led to some surprising observations. When selection is applied, very small genetic differences can result in one population overtaking another. How does this happen when individuals within a population have similar mutation rates and most of these mutations are neutral and have no phenotypic effect? Only those rare mutations that confer a growth advantage, called adaptive mutations, are retained and passed from one generation to the next, in which case we say the mutation is fixed. The descendants of that individual continue to evolve through mutation and other intraspecific mechanisms.

Adaptive mutation is key to the ecotype model of microbial evolution. An ecotype is a population of microbes that is genetically very similar but ecologically distinct. Ecotypes arise when members of a microbial population living in a specific ecosystem undergo a genetic event (or series of events) that enables them to outcompete the remainder of the population. According to the ecotype model, the acquisition of adaptive mutations ultimately drives the remaining members of the population into extinction and reduces the amount of genetic diversity within the surviving population (figure 19.8b). The fossil record shows that evolution does not always occur at a constant rate but is periodically interrupted by rapid bursts of speciation driven by abrupt environmental changes. Niles Eldredge and Stephen Jay Gould coined the term punctuated equilibria to describe this phenomenon. Certainly, the 3.5-billion-year history of microbial life on Earth affords the accumulation of many, many mutations; that in turn has resulted in vast speciation. Mechanisms of genetic variation (chapter 16); Comparative genomics (section 18.7)

Unlike variation introduced in the ecotype model, HGT-driven genetic variation requires genetically diverse groups of microbes. This is because HGT does not rely on replication but rather on the exchange of genetic material between microbes. The rate of HGT is extremely variable. Some microbes have very reduced genomes with no evidence of HGT. These microbes are generally highly adapted to a specific, stable ecological niche. The most extreme examples are obligate intracellular symbionts that can only grow within their host cells, where no other microbes exist with which to exchange genes. By contrast, some microbes appear to have high rates of HGT with more than half of their genome acquired from other organisms, as is the case for members of the ancient phylum Thermotogae. In such microbes, genes acquired by HGT frequently expand metabolic capabilities, thereby enabling rapid adaptation to new environmental challenges. Aquificae and Thermotogae are ancient bacterial lineages (section 21.1)

Finally, we pose the question: How many bacterial and archaeal species are there? There are two major obstacles to formulating an answer. First, most microbial species resist growth in the laboratory, so they can only be detected by metagenomic or other culture-independent approaches. Second, as we have seen, microbiologists cannot agree on a biological species concept. So we must resort to the operational definition of a species in terms of nucleic acid homologies, as well as similarities in physiology, morphology, and ecology. Based on these criteria, estimates range from 100,000 to 1,000,000 species in nature, with about 10^30 individual cells. These estimates reveal that there are probably about a billion more microbes on Earth than stars in the universe.

Figure 19.8 Evolution of Microbial Diversity. (a) Mechanisms by which a single, genetically homogeneous population of microbes can develop genetic variation include mutation, gene loss and duplication, and recombination. (b) Genetic changes within a population can lead to the development of new ecotypes. Each ecotype is subject to periodic selection events (indicated by stars) that enable cells with adaptive mutations to outcompete other lineages, which are eventually driven to extinction (dotted lines). The solid lines represent successful populations or lineages; those at the top are extant. Panel labels: (a) Mechanisms of genetic variation within a homogeneous population (mutation, recombination, gene loss, gene duplication); (b) Stable ecotype model (Ecotype 1, Ecotype 2).
MICRO INQUIRY Construct a scenario in which each of the following factors leads to the establishment of two ecotypes from a single common ancestor, as shown in (b): the availability of carbon and nitrogen sources; terminal electron acceptor; and mean local temperature.

Comprehension Check
1. Define ecotype. Do you think it is necessary to obtain microbes in pure culture before assigning different ecotypes? Explain.
2. What is the difference between the core genome and pan-genome? What might you infer if you compare two genera, one in which the size of the core genome and pan-genome are very similar, and one in which the core genome is much smaller than the pan-genome?
3. Of the following genes, which do you think are part of the pan-genome and which are part of the core genome: the genes for lactose catabolism in E. coli; the genes for heat-stable DNA polymerase in Thermus aquaticus; the genes for proteorhodopsin in marine bacteria; the genes for toxin production in Vibrio cholerae?
4. Would a protein encoded on the core genome or one encoded only on the pan-genome be best to use in constructing a phylogenetic tree? Explain your answer.

Figure 19.9 Some Major Clades of Bacteria and Archaea. This classification scheme is based on Bergey’s Manual of Systematics of Archaea and Bacteria.
Euryarchaeota: Archaea differ greatly from bacteria. Archaeal cell walls lack peptidoglycan; plasma membranes are made of different kinds of lipids than bacterial plasma membranes; RNA and ribosomal proteins are more like eukaryotes than bacteria. Examples include Methanococcus, Thermoproteus, Halobacterium.
Aquificae: The phyla Aquificae and Thermotogae are the two deepest or oldest branches of bacteria. Both are Gram-negative thermophiles. Thermotoga is named for its loose-fitting sheath, or “toga.”
Bacilli: Gram-positive bacteria. Largely solitary; many form endospores. Responsible for many significant human diseases, including anthrax (Bacillus anthracis); botulism (Clostridium botulinum); other common diseases (Staphylococcus, Streptococcus). Also include bacteria used in dairy foods (Lactococcus lactis).
Actinobacteria: Some Gram-positive bacteria form branching filaments; some produce spores. Produce many commonly used antibiotics, including streptomycin and tetracycline. One of the most common types of soil bacteria; also common in dental plaque. Streptomyces, Actinomyces.
Spirochaetes: Long, coil-shaped cells that stain Gram-negative. Common in aquatic environments. Rotation of internal flagella produces a corkscrew movement. Some spirochetes such as Treponema pallidum (syphilis) and Borrelia burgdorferi (Lyme disease) are significant human pathogens.
Cyanobacteria: Cyanobacteria are photosynthetic bacteria common in both marine and freshwater environments. Deeply pigmented; often responsible for “blooms” in polluted waters. Both colonial and solitary forms are common. Some filamentous forms have cells specialized for nitrogen fixation.
Beta: A nutritionally diverse group that includes soil bacteria such as the lithotroph Nitrosomonas that recycle nitrogen within ecosystems by oxidizing ammonia. Other members are heterotrophs and photoheterotrophs.
Gamma: A diverse group including photosynthetic sulfur bacteria, pathogens such as Legionella, and the enteric bacteria that inhabit animal intestines. Enterics include E. coli, Salmonella (food poisoning), and Vibrio cholerae (cholera). Pseudomonas (shown here) are a common genus of soil bacteria, responsible for many plant diseases, and are important opportunistic pathogens.
Delta: These proteobacteria include bacteria used in bioremediation such as this Geobacter species. In addition, this group includes predatory bacteria such as Bdellovibrio and the myxobacteria. The latter glide in multicellular groups and form upright structures called fruiting bodies.
Labels on the accompanying tree: Archaea, Crenarchaeota, Euryarchaeota, Bacteria, Thermophiles, Aquificae, Thermotogae, Chloroflexi, Deinococcus-Thermus, Gram-positive bacteria, Low G + C (Firmicutes), Bacilli, Clostridium, High G + C, Actinobacteria, Photosynthetic, Cyanobacteria, Chlorobi, Spirochaetes, Proteobacteria, Alpha, Beta, Gamma, Delta, Epsilon.
(a) ©SPL/Science Source; (b) ©Karl O. Stetter; (c) ©Andre Syred/SPL/Science Source; (d) ©Microfield Scientific Ltd/Getty Images; (e) ©Alfred Paseika/SPL/Science Source; (f) ©McGraw-Hill Education/Don Rubbelke photographer; (g) ©Dr. Kari Lounatmaa/Science Source; (h) Source: CDC/Janice Haney Carr; (i) ©Derek Lovley/Science Source

19.6 Bergey’s Manual of Systematics of Archaea and Bacteria

After reading this section, you should be able to:
a. Employ Bergey’s Manual to investigate the defining taxonomic elements used for a bacterium or archaeon that is unfamiliar to you

In 1923 David Bergey, professor of bacteriology at the University of Pennsylvania, and four colleagues published Bergey’s Manual of Determinative Bacteriology, a classification of bacteria that could be used for the identification of many bacterial species. Nine editions of this manual were published in the twentieth century. Despite its age, this text continues to serve as a relatively brief reference guide in the identification of bacteria based on physiological and morphological traits. A related publication, Bergey’s Manual of Systematic Bacteriology, included the archaea for the first time in 1984. The final edition of this reference comprised five volumes, and included morphology, physiology, growth conditions, ecology, and other information. Classification was organized based on phylogeny, and clinically important bacteria were not discussed separately, but integrated into the phylogenetic scheme. Both references are referred to simply as Bergey’s Manual, and in the twenty-first century, the current manifestation is an online reference entitled Bergey’s Manual of Systematics of Archaea and Bacteria. This electronic format allows for the resource to be frequently updated as new taxa are described and validated. Figure 19.9 illustrates most of the groups covered in Bergey’s Manual and in chapters 20–23.

Comprehension Check
1. Bergey’s Manual is no longer based on phenetic classification. Why is this the case?
2. Describe two different situations in which it would be essential to identify the genus and species of a bacterium or archaeon.

Key Concepts

19.1 Microbial Taxonomy Is Based on the Comparison of Multiple Traits
Taxonomy, the science of biological classification, is composed of three parts: classification, nomenclature, and identification. A polyphasic approach is used to classify microbes. This incorporates information gleaned from genetic and phenotypic analysis (table 19.1).

19.2 Taxonomic Ranks Provide an Organizational Framework
Taxonomic ranks are arranged in a nonoverlapping hierarchy (figure 19.1). A bacterial or archaeal species is a collection of strains that have many stable properties in common and differ significantly from other groups of strains. Microorganisms are named according to the binomial system.

19.3 Microbial Taxonomy and Phylogeny Are Largely Based on Molecular Characterization
Historically, microbial taxonomic and phylogenetic analysis used morphological, physiological, and ecological characteristics. These remain important in building a complete picture that also includes molecular information (tables 19.2 and 19.3). Nucleic acid sequencing is the most powerful and direct method for comparing genomes. Whole-genome sequencing is rapidly replacing other methods for genome comparison. The sequences of SSU rRNA are used in phylogenetic studies of microbes and can identify an organism to the genus level (figure 19.2). Additional techniques must be applied to identify a microbe at the species or strain level. They include multilocus sequence analysis (MLSA) and single nucleotide polymorphism analysis (figure 19.3). Signature sequences and indels are conserved in various taxa and provide important information on phylogenetics.

19.4 Phylogenetic Trees Illustrate Evolutionary Relationships
Phylogenetic relationships often are shown as branched diagrams called phylogenetic trees. Trees are based on pairwise comparison of amino acid or nucleotide sequences, followed by computer analysis (figure 19.4). Trees may be either rooted or unrooted and are created in several different ways. Unrooted trees can be rooted by including an outgroup when the tree is constructed (figure 19.5). Microbes have a long history of horizontal gene transfer, which complicates taxonomic analysis. Complete genome analysis has revealed a set of core genes found in all members of a given taxon, and a pan-genome, the sum of all genes outside the core genome of that taxon. The pan-genome arises from horizontal gene transfer.

19.5 Evolutionary Processes and the Concept of a Microbial Species Inspire Debate
There are several hypotheses regarding the origin of eukaryotic cells. Most biologists agree that endosymbioses of a bacterium and a cyanobacterium gave rise to mitochondria and chloroplasts, respectively (figure 19.7). The operational definition of a microbial species is based on criteria approved by the International Committee on the Systematics of Prokaryotes. This includes at least 70% whole genome similarity as determined by DNA-DNA hybridization, at least 97% 16S rRNA gene sequence identity, no more than a 5°C difference in DNA melting temperature (a reflection of G + C content), and physiological, morphological, and ecological similarity. The concept of a microbial species is based on evolution. The ecotype model describes the outcome of periodic natural selection on a genetically homogeneous microbial population. Individuals that acquire adaptive mutations are the source of microbial diversity (figure 19.8). Horizontal gene transfer is also important in microbial evolution. However, speciation is thought to be the outcome of mutation, while rapid adaptation to new niches is mediated by horizontal gene transfer.
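
The operational species criteria summarized under 19.5 amount to a set of thresholds, so they can be written as a simple check. The following is only an illustrative sketch in Python, not a tool described in the chapter: the class name `PairwiseComparison`, its field names, and the two functions are hypothetical, and the melting-temperature criterion is taken as a difference of at most 5°C, as stated in the chapter text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PairwiseComparison:
    """Hypothetical container: one strain compared against a reference strain."""
    phenotypically_similar: bool   # physiological, morphological, ecological similarity
    ddh_percent: float             # DNA-DNA hybridization similarity, in percent
    delta_tm_celsius: float        # difference in DNA melting temperature, in degrees C
    rrna_identity_percent: float   # 16S rRNA gene sequence identity, in percent


def failed_criteria(c: PairwiseComparison) -> list[str]:
    """Return the names of the operational criteria that the comparison fails."""
    failures = []
    if not c.phenotypically_similar:
        failures.append("phenotype")
    if c.ddh_percent < 70.0:                # at least 70% DNA-DNA hybridization
        failures.append("DNA-DNA hybridization")
    if abs(c.delta_tm_celsius) > 5.0:       # melting temperatures within 5 degrees C
        failures.append("melting temperature")
    if c.rrna_identity_percent < 97.0:      # at least 97% rRNA gene identity
        failures.append("rRNA identity")
    return failures


def same_species(c: PairwiseComparison) -> bool:
    """True only when every operational criterion is met."""
    return not failed_criteria(c)


# The chapter's example values: 75% DNA similarity and 98% rRNA gene identity.
# The phenotype and melting-temperature values here are assumed for illustration.
example = PairwiseComparison(True, 75.0, 2.0, 98.0)
print(same_species(example))
```

Listing the failed criteria, rather than returning only yes or no, shows which measurement separates two strains, which is the kind of information the debate over these thresholds turns on.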

19.6 Bergey’s Manual of Systematics of Archaea and Bacteria
Bergey’s Manual of Systematics of Archaea and Bacteria is based on the accepted system of prokaryotic taxonomy. Comparisons of nucleic acid sequences, particularly 16S rRNA, are the foundation of this classification.

Active Learning

1. Consider the fact that the use of 16S rRNA sequencing as a taxonomic and phylogenetic tool has resulted in tripling the number of bacterial phyla. Why has the advent of this genetic technique expanded the currently accepted number of microbial phyla?
2. You have recently established a pure culture of a new archaeon from soil. Describe the approaches you would use to identify your new microbe to the species level.
3. Discuss the problems in developing an accurate phylogenetic tree. Do you think it is possible to create a completely accurate universal phylogenetic tree? Explain your answer.
4. Why is the current classification system for Bacteria and Archaea likely to change considerably? How would one select the best features to use in the identification of unknown microbes and determination of relatedness?
5. How would you interpret a whole-genome sequence that is overall 45% G + C but shows a 20,000 bp region that is 55% G + C?
6. Horses were introduced to Iceland over 1,000 years ago and have remained an isolated population, not exposed to contagious diseases. A recent respiratory epidemic caused symptoms in almost all horses on the island. Viral infection could not be demonstrated, but Streptococcus equi subsp. zooepidemicus was recovered from nasal swabs of all infected animals. Although this bacterium was believed to be a commensal, whole-genome sequencing was performed on over 300 isolates. The population comprised four clades defined by multilocus sequencing analysis. The data table lists the distribution of these clades. Working on the hypothesis that a pathogenic strain of S. zooepidemicus was recently introduced to Iceland, which of the four clades is most likely to represent that strain? Why? Researchers compared the CRISPR spacers in different isolates to examine their relatedness. Why are CRISPR spacers a good measure of strain relatedness? (Review section 14.6 on CRISPR.)
Read the original paper: Björnsdóttir, S., et al. 2017. Genomic dissection of an Icelandic epidemic of respiratory disease in horses and associated zoonotic cases. mBio. 8:e00826-17.
7. Achromatium oxaliferum is an aquatic sulfur-oxidizing microbe with extremely large polyploid cells. Microscopy following DNA staining reveals that DNA is distributed around the cell, but not uniformly. This organism is not cultured in the lab; rather, it is isolated directly from the environment, and because of its cell size, it can be manipulated to isolate single cells. Genomic sequencing of single cells revealed multiple SSU rRNA genes per cell with 93 to 95% similarity, less similarity than should exist within a single genome. Fluorescence in situ hybridization with probes specific for one SSU rRNA demonstrated that a single cell contains a mixture of SSU rDNA sequences. Comparable results were found for other genes, and the authors conclude that Achromatium doesn’t have multiple copies of a single genome, but instead it appears to have a diverse population of genomes within a single cell. How would you define a species in this genus? How would you identify the core genome? Insertion sequences and transposases are abundant in this organism. How might these affect the observed genetic diversity in a cell?
Read the original paper: Ionescu, D., et al. 2017. Community-like genome in single cells of the sulfur bacterium Achromatium oxaliferum. Nature Communications 8:455.
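
A situation like the one in Active Learning question 5 is usually noticed by computing G + C content in windows along the genome and comparing each window with the genome-wide value. The sketch below is one possible way to do this in Python and is not taken from the chapter: the function names, the window and step sizes, and the toy sequence are all assumptions chosen for illustration, and the sketch only locates atypical regions without interpreting them.

```python
def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in a DNA sequence."""
    if not seq:
        raise ValueError("sequence must not be empty")
    seq = seq.upper()
    gc = sum(1 for base in seq if base in "GC")
    return 100.0 * gc / len(seq)


def windowed_gc(seq: str, window: int, step: int) -> list[tuple[int, float]]:
    """G + C percentage for each full window, as (start position, percent) pairs."""
    return [
        (start, gc_percent(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


def atypical_windows(seq: str, window: int, step: int,
                     min_difference: float) -> list[tuple[int, float]]:
    """Windows whose G + C content differs from the whole-sequence value
    by at least min_difference percentage points."""
    overall = gc_percent(seq)
    return [
        (start, gc)
        for start, gc in windowed_gc(seq, window, step)
        if abs(gc - overall) >= min_difference
    ]


# Toy sequence (assumed, not real data): a 40% G + C background
# with a short 80% G + C block in the middle.
toy_genome = "AATGC" * 20 + "GGCCA" * 4 + "AATGC" * 20
print(atypical_windows(toy_genome, window=20, step=20, min_difference=10.0))
```

For a real genome the window would be thousands of base pairs rather than 20, and the cutoff would need to be chosen with the natural variation of that genome in mind.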