Lecture 3 Genomes and Genetics PDF
Summary
This document discusses viral genomes, including their structures, complexities, and replication strategies. It explains the Baltimore system for classifying viruses and highlights the diversity of viral genomes. The document also touches on genetic analysis of viruses and engineering mutations.
Full Transcript
3 Genomes and Genetics

Introduction
Genome Principles and the Baltimore System
Structure and Complexity of Viral Genomes
  DNA Genomes
  RNA Genomes
What Do Viral Genomes Look Like?
Coding Strategies
What Can Viral Sequences Tell Us?
The “Big and Small” of Viral Genomes: Does Size Matter?
The Origin of Viral Genomes
Genetic Analysis of Viruses
  Classical Genetic Methods
  Engineering Mutations into Viral Genomes
  Engineering Viral Genomes: Viral Vectors
Perspectives
References
Study Questions

LINKS FOR CHAPTER 3
Virocentricity with Eugene Koonin: http://bit.ly/Virology_Twiv275
CRISPR-Cas immune systems: http://microbe.tv/twim/twim184

PRINCIPLES: Genomes and Genetics
- The genomes of viruses range from the extraordinarily small to the extraordinarily large (2,500 kbp); the diversity in size likely provides advantages in the niches in which particular viruses exist.
- Viral genomes specify some, but never all, of the proteins needed to complete the viral reproductive cycle.
- That only seven viral genome replication strategies exist for all known viruses implies unity in viral diversity.
- Some genomes can enter the reproduction cycle upon entry into a target cell, whereas others require prior repair or synthesis of viral gene products before replication can proceed.
- Although the details of replication differ, all viruses with RNA genomes must encode either an RNA-dependent RNA polymerase to synthesize RNA from an RNA template or a reverse transcriptase to convert viral RNA to DNA.
- The information encoded in viral genomes is optimized by a variety of mechanisms; the smaller the genome, the greater the compression of genetic information.
- The genome sequence of a virus is at best a biological “parts list” and tells us little about how the virus interacts with its host.
- Technical advances allowing the introduction of mutations into any viral gene or genome sequence are responsible for much of what we know about viruses.

Introduction
Earth abounds with uncountable numbers of viruses of great diversity. However, because taxonomists have devised methods of classifying viruses, the number of identifiable groups is manageable (Chapter 1). One of the contributions of molecular biology has been a detailed analysis of the genetic material of representatives of major virus families. From these studies emerged the principle that the viral genome is the nucleic acid-based repository of the information needed to build, reproduce, and transmit a virus (Box 3.1). These analyses also revealed that the thousands of distinct viruses defined by classical taxonomic methods can be organized into seven groups, based on the structures of their genomes.

Genome Principles and the Baltimore System
A universal function of viral genomes is to specify proteins. However, none of these genomes encode the complete machinery needed to carry out protein synthesis. Consequently, one important principle is that all viral genomes must be copied to produce messenger RNAs (mRNAs) that can be read by host ribosomes. Literally, all viruses are parasites of their host cells’ translation system.

A second principle is that there is unity in diversity: evolution has led to the formation of only seven major types of viral genome. The Baltimore classification system integrates these two principles to construct an elegant molecular algorithm for virologists (Fig. 3.1). When the bewildering array of viruses is classified by this system, we find seven pathways to mRNA. The value of the Baltimore system is that by knowing only the nature of the viral genome, one can deduce the basic steps that must take place to produce mRNA. Perhaps more pragmatically, the system simplifies comprehension of the extraordinary reproduction cycles of viruses.

The Baltimore system omits the second universal function of viral genomes, to serve as a template for synthesis of progeny genomes. Nevertheless, there is also a finite number of nucleic acid-copying strategies, each with unique primer, template, and termination requirements. We shall combine this principle with that embodied in the Baltimore system to define seven strategies based on mRNA synthesis and genome replication. The Baltimore system has stood the test of time: despite the discovery of multitudes of viral genome sequences, they all fall into one of the seven classes.

Replication and mRNA synthesis present no obvious challenges for most viruses with DNA genomes, as all cells use DNA-based mechanisms. In contrast, animal cells possess no known systems to copy viral RNA templates and to produce mRNA from them. For RNA viruses to propagate, their RNA genomes must, by definition, encode a nucleic acid polymerase.

[Figure 3.1 diagram: genome classes I to VII (± DNA, + DNA, ± RNA, + RNA, − RNA, and routes through + RNA, − DNA, and ± DNA intermediates) converge on + mRNA.]
Figure 3.1 The Baltimore classification. All viruses must produce mRNA that can be translated by cellular ribosomes. This classification system traces the pathways from viral genomes to mRNA for the seven classes of viral genomes.
A color key for nucleic acids, proteins, membranes, cells, and more is located in the front of this book.

Structure and Complexity of Viral Genomes
Despite the simplicity of expression strategies, the composition and structures of viral genomes are far more varied than those seen in the entire archaeal, bacterial, or eukaryotic domains. Nearly every possible method for encoding information in nucleic acid can be found in viruses. Viral genomes can be
- DNA or RNA
- DNA with short segments of RNA
- DNA or RNA with covalently attached protein
- single-stranded (+) strand, (−) strand, or ambisense (Box 3.2)
- double stranded
- linear
- circular
- segmented
- gapped

BOX 3.1 BACKGROUND
What information is encoded in a viral genome?
Gene products and regulatory signals required for
- replication of the genome
- efficient expression of the genome
- assembly and packaging of the genome
- regulation and timing of the reproduction cycle
- modulation of host defenses
- spread to other cells and hosts
Information not contained in viral genomes:
- genes encoding a complete protein synthesis machine (e.g., no ribosomal RNA and no ribosomal or translation proteins)
- genes encoding proteins of membrane biosynthesis
- telomeres (to maintain genomes) or centromeres (to ensure segregation of genomes)
- this list becomes shorter with each new edition of this textbook!

BOX 3.2 TERMINOLOGY
Important conventions: plus (+) and minus (−) strands
mRNA is defined as the positive (+) strand, because it can be translated. A strand of DNA of the equivalent polarity is also designated as a (+) strand; i.e., if it were mRNA, it would be translated into protein.
The RNA or DNA complement of the (+) strand is called the (−) strand. The (−) strand cannot be translated; it must first be copied to make the (+) strand. Ambisense RNA contains both (+) and (−) sequences.

DNA Genomes
The strategy of having DNA as a viral genome appears at first glance to be the ultimate in genetic efficiency: the host genetic system is based on DNA, so viral genome replication and expression could simply emulate the host system. While the replication of viral and cellular DNA genomes is fundamentally similar, the mechanistic details are varied because viral genomes are structurally diverse.

Double-Stranded DNA (dsDNA) (Fig. 3.2)
There are 38 families of viruses with dsDNA genomes. Those that include vertebrate viruses are the Adenoviridae, Alloherpesviridae, Asfarviridae, Herpesviridae, Papillomaviridae, Polyomaviridae, Iridoviridae, and Poxviridae. These genomes may be linear or circular. Genome replication and mRNA synthesis are accomplished by host or viral DNA-dependent DNA and RNA polymerases.

Gapped DNA (Fig. 3.3)
Members of two virus families, Caulimoviridae and Hepadnaviridae, have a gapped DNA genome. The Hepadnaviridae include viruses that infect vertebrates.
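The genome classes introduced so far, and those described in the sections that follow, each reach mRNA by a fixed route (Fig. 3.1). As a rough illustration, those routes can be written as a lookup table. This is only a sketch: the class numerals follow the standard Baltimore numbering, the names `BALTIMORE_ROUTES`, `route_to_mrna`, and `has_rna_genome` are hypothetical, and each route is simplified to the nucleic acid forms named in the text.

```python
# Sketch of the routes from packaged genome to mRNA traced in Fig. 3.1.
# Assumptions: standard Baltimore class numerals; each route lists only the
# nucleic acid forms named in the chapter, ignoring enzymes and genome replication.
BALTIMORE_ROUTES = {
    "I":   ["dsDNA", "mRNA"],
    "II":  ["ssDNA", "dsDNA", "mRNA"],
    "III": ["dsRNA", "mRNA"],
    "IV":  ["(+) RNA", "(-) RNA", "mRNA"],
    "V":   ["(-) RNA", "mRNA"],
    "VI":  ["(+) RNA", "(-) DNA", "dsDNA", "mRNA"],
    "VII": ["gapped dsDNA", "dsDNA", "mRNA"],
}


def route_to_mrna(baltimore_class: str) -> list[str]:
    """Return the ordered nucleic acid forms from genome to mRNA."""
    return BALTIMORE_ROUTES[baltimore_class]


def has_rna_genome(baltimore_class: str) -> bool:
    """True when the packaged genome (first form in the route) is RNA."""
    return "RNA" in BALTIMORE_ROUTES[baltimore_class][0]


if __name__ == "__main__":
    for cls, route in BALTIMORE_ROUTES.items():
        print(f"{cls:>3}: {' -> '.join(route)}")
```

In this sketch, the classes for which `has_rna_genome` is true are the ones that, according to the chapter, must encode their own RNA-dependent RNA polymerase or reverse transcriptase.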
As the gapped DNA genome is partially double stranded, the gaps must be filled to produce perfect duplexes. This repair process must precede mRNA synthesis because the host RNA polymerase can transcribe only fully dsDNA. The unusual gapped DNA genome is produced from an RNA template by a virus-encoded enzyme, reverse transcriptase.

[Figure 3.2 panels: (A) dsDNA genome; (B) Polyomaviridae (5 kbp); (C) Adenoviridae (30–50 kbp); (D) Herpesviridae (120–240 kbp); (E) Poxviridae (130–375 kbp).]
Figure 3.2 Structure and expression of viral double-stranded DNA genomes. (A) Synthesis of genomes, mRNA (shown as green line in yellow box), and protein (shown as brown line). The icon represents a polyomavirus particle. (B to E) Genome configurations. Ori, origin of replication; ITR, inverted terminal repeat; TP, terminal protein; L, long region; S, short region; UL and US, long and short unique regions; IRL, internal repeat sequence, long region; IRS, internal repeat sequence, short region; TRL, terminal repeat sequence, long region; TRS, terminal repeat sequence, short region; OriL, origin of replication of the long region; OriS, origin of replication of the short region.

[Figure 3.3 panels: (A) gapped, circular, dsDNA genome; (B) Hepadnaviridae (3–3.3 kbp).]
Figure 3.3 Structure and expression of viral gapped, circular, double-stranded DNA genomes. (A) Synthesis of genome, mRNA, and protein. (B) Configuration of the hepadnavirus genome.

The seven strategies for expression and replication of viral genomes are illustrated in Fig. 3.2 to 3.8. In some cases, genomes can enter the replication cycle directly, but in others, genomes must first be repaired, and viral gene products that participate in the replication cycle must first be synthesized. Examples of specific viruses in each class are provided.

Single-Stranded DNA (ssDNA) (Fig. 3.4)
Thirteen families of viruses containing ssDNA genomes have been recognized; the families Anelloviridae, Circoviridae, Genomoviridae, and Parvoviridae include viruses that infect vertebrates. ssDNA must be copied into mRNA before proteins can be produced. However, RNA can be made only from a dsDNA template, whatever the sense of the ssDNA. Consequently, DNA synthesis must precede mRNA production in the replication cycles of these viruses. All synthesis of viral DNA is catalyzed by cellular DNA polymerases.

[Figure 3.4 panels: (A) ssDNA genome; (B) Circoviridae (1.8–2.3 kb); (C) Parvoviridae (4–6 kb).]
Figure 3.4 Structure and expression of viral single-stranded DNA genomes. (A) Synthesis of genomes, mRNA, and protein. (B and C) Genome configurations.

RNA Genomes
Cells have no RNA-dependent RNA polymerases that can replicate the genomes of RNA viruses or make mRNA from RNA templates (Box 3.3). One solution to this problem is that RNA virus genomes encode RNA-dependent RNA polymerases that produce RNA from RNA templates. The other solution, exemplified by retrovirus genomes, is reverse transcription of the genome to dsDNA, which can be transcribed by host RNA polymerase.

BOX 3.3 BACKGROUND
RNA synthesis in cells
There are no known host cell enzymes that can copy the genomes of RNA viruses. However, at least one enzyme, RNA polymerase II, can copy an RNA template. The 1.7-kb circular, ssRNA genome of hepatitis delta satellite virus is copied by RNA polymerase II to form multimeric RNAs (see the figure). How RNA polymerase II, an enzyme that produces pre-mRNAs from DNA templates, is reprogrammed to copy a circular RNA template is not known.
[Box figure: (−) strand genome RNA.] Hepatitis delta satellite (−) strand genome RNA is copied by RNA polymerase II at the indicated position. The polymerase passes the poly(A) signal (purple box) and the self-cleavage domain (red circle). For more information, see Fig. 6.25. Redrawn from Taylor JM. 1999. Curr Top Microbiol Immunol 239:107–122, with permission.

dsRNA (Fig. 3.5)
There are twelve families of viruses with linear dsRNA genomes. The number of dsRNA segments in the virus particle may be 1 (Totiviridae, Hypoviridae, and Endornaviridae, viruses of fungi, protozoa, and plants); 2 (Partitiviridae, Birnaviridae, and Megabirnaviridae, viruses of fungi, plants, insects, fish, and chickens); 3 (Cystoviridae, viruses of Pseudomonas bacteria); 4 (Chrysoviridae, viruses of fungi); or 10 to 12 (Reoviridae, viruses of protozoa, fungi, invertebrates, plants, and vertebrates). While dsRNA contains a (+) strand, it cannot be translated to synthesize viral proteins as part of a duplex. The (−) strand of the genomic dsRNA is first copied into mRNAs by a viral RNA-dependent RNA polymerase. Newly synthesized mRNAs are encapsidated and then copied to produce dsRNAs.

[Figure 3.5 panels: (A) dsRNA genome; (B) Reoviridae (18.2–30.5 kbp in 1–12 dsRNA segments), with segments L1 to L3, M1 to M3, and S1 to S4 shown.]
Figure 3.5 Structure and expression of viral double-stranded RNA genomes. (A) Synthesis of genomes, mRNA, and protein. (B) Genome configuration.

(+) Strand RNA (Fig. 3.6)
There are more different types of (+) strand RNA viruses than any other, and 38 families have been recognized [not counting (+) strand RNA viruses with DNA intermediates]. These genomes are linear and may be single molecules (nonsegmented) or segmented, depending on the family. The families Arteriviridae, Astroviridae, Caliciviridae, Coronaviridae, Flaviviridae, Hepeviridae, Nodaviridae, Picornaviridae, and Togaviridae include viruses that infect vertebrates. (+) strand RNA genomes usually can be translated directly into protein by host ribosomes. The genome is replicated in two steps. The (+) strand genome is first copied into a full-length (−) strand, and the (−) strand is then copied into full-length (+) strand genomes. In some cases, a subgenomic mRNA is produced.

[Figure 3.6 panels: (A) ss (+) RNA genome; (B) Coronaviridae (27.6–41.1 kb), Flaviviridae (9.6–12.3 kb), Picornaviridae (6.7–9.07 kb), Togaviridae (9.7–11.8 kb).]
Figure 3.6 Structure and expression of viral single-stranded (+) RNA genomes. (A) Synthesis of genomes, mRNA, and protein. (B) Genome configurations. UTR, untranslated region; VPg, virion protein, genome linked.

(+) Strand RNA with a DNA Intermediate (Fig. 3.7)
Members of four virus families are (+) strand RNA viruses with a DNA intermediate; those viruses within Retroviridae infect vertebrates. In contrast to other (+) strand RNA viruses, the (+) strand RNA genome of retroviruses is converted to a dsDNA intermediate by viral RNA-dependent DNA polymerase (reverse transcriptase). Following integration into host DNA, the viral DNA then serves as the template for viral mRNA and genome RNA synthesis by cellular enzymes.

[Figure 3.7 panels: (A) ss (+) RNA with DNA intermediate; (B) Retroviridae (7–11 kb).]
Figure 3.7 Structure and expression of viral single-stranded (+) RNA genomes with a DNA intermediate. (A) Synthesis of genomes, mRNA, and protein. (B) Genome configuration.

(−) Strand RNA (Fig. 3.8)
Viruses with (−) strand RNA genomes are found in 19 families. These genomes are linear and may be single molecules (nonsegmented; some viruses with this configuration have been classified in the order Mononegavirales) or segmented. Viruses of this type that can infect vertebrates include members of the Arenaviridae, Bornaviridae, Filoviridae, Hantaviridae, Orthomyxoviridae, Paramyxoviridae, Pneumoviridae, and Rhabdoviridae families. Unlike (+) strand RNA, (−) strand RNA genomes cannot be translated directly into protein but must be first copied to make (+) strand mRNA. There are no enzymes in the cell that can make mRNAs from the RNA genomes of (−) strand RNA viruses. These virus particles therefore contain virus-encoded RNA-dependent RNA polymerases. The genome is also the template for the synthesis of full-length (+) strands, which, in turn, are copied to produce (−) strand genomes.

The genomes of certain (−) strand RNA viruses (e.g., members of the Arenaviridae and Bunyaviridae) are ambisense: they contain both (+) and (−) strand information on a single strand of RNA (Fig. 3.8C). The (+) sense information in the genome is translated upon entry of the viral RNA into cells. Replication of the RNA genome yields additional (+) sense sequences, which are then translated.

[Figure 3.8 panels: (A) ss (−) RNA genome; (B) segmented genomes: Orthomyxoviridae (10–15 kb in 6–8 RNAs); nonsegmented genomes: Paramyxoviridae (15.1–18.2 kb), Rhabdoviridae (11–15 kb); (C) ambisense (−) strand RNA: Arenaviridae (11 kb in 2 RNAs), Peribunyaviridae (12.4–16.6 kb in 3 RNAs).]
Figure 3.8 Structure and expression of viral single-stranded (−) RNA genomes. (A) Synthesis of genomes, mRNA, and protein. The icon represents an orthomyxovirus particle. (B and C) Genome configurations.

What Do Viral Genomes Look Like?
Some small RNA and DNA genomes enter cells from virus particles as naked molecules of nucleic acid, whereas others are always associated with specialized nucleic acid-binding proteins or enzymes. A fundamental difference between the genomes of viruses and those of their hosts is that although viral genomes are often covered with proteins, they are usually not bound by histones in the virus particle (polyomaviral and papillomaviral genomes are exceptions). However, it is likely that all viral DNAs become coated with histones shortly after they enter the nucleus.

While viral genomes are all nucleic acids, they should not be thought of as one-dimensional structures. Virology textbooks (this one included) often draw genomes as straight, one-dimensional lines, but this notation is for illustrative purposes only; physical reality is certain to be dramatically different. Genomes have the potential to adopt amazing secondary and tertiary structures in which nucleotides may engage in long-distance interactions (Fig. 3.9).

The sequences and structures near the ends of viral genomes are often indispensable for replication. For example, the DNA sequences at the ends of parvovirus genomes form T-shaped structures that are required for priming during DNA synthesis. Proteins covalently attached to 5′ ends, inverted and tandem repeats, and bound tRNAs may also participate in the replication of RNA and DNA genomes. Secondary RNA structures may facilitate translation (the internal ribosome entry site [IRES] of picornavirus genomes) and genome packaging (the structured packaging signal of retroviral genomes [Fig. 3.9]).

[Figure 3.9 panels: (A) linear (+) strand RNA genome of a picornavirus; (B to D) RNA structures.]
Figure 3.9 Genome structures in cartoons and in real life. (A) Linear representation of a picornavirus RNA genome. UTR, untranslated region. (B) Long-distance RNA-RNA interactions in a tombusvirus RNA genome.
The 4,252-nucleotide viral genome is shown with secondary RNA structures at the 5′ and 3′ ends. Sequences that base-pair are shown in blue (required for RNA frameshifting) and red (required to bring ribosomes from the 3′ end to the 5′ end). Courtesy of Anne Simon, University of Maryland. (C) Schematic representation of RNA secondary-structure elements in the human immunodeficiency virus type 1 5′ leader, including the core packaging signal. (D) NMR structure of the RNA shown in C, without elements colored black. Courtesy of Paul Bieniasz, Rockefeller University.

Coding Strategies
The compact genome of most viruses renders the “one gene, one mRNA” dogma inaccurate. Extraordinary tactics for information retrieval, such as the production of multiple subgenomic mRNAs, alternative mRNA splicing, RNA editing, and nested transcription units (Fig. 3.10), allow the production of multiple proteins from a single viral genome. Further expansion of the coding capacity of the viral genome is achieved by posttranscriptional mechanisms, such as polyprotein synthesis, leaky scanning, suppression of termination, and ribosomal frameshifting. In general, the smaller the genome, the greater the compression of genetic information.

Figure 3.10 Information retrieval from viral genomes. Different strategies for decoding the information in viral genomes are depicted. CBF, CCAAT-binding factor; USF, upstream stimulatory factor; IRES, internal ribosome entry site.
[Diagram column omitted. Columns below: Virus | Chapter(s) | Figures in appendix.]
Multiple subgenomic mRNAs
  Adenoviridae | 7, 8 | 1, 2
  Hepadnaviridae | 7, 10 | 11, 12
  Herpesviridae | 7 | not listed
  Paramyxoviridae | 6 | 17, 18
  Poxviridae | 7 | 25, 26
  Rhabdoviridae | 6 | 31, 32
Alternative mRNA splicing
  Adenoviridae | 7, 8 | 1, 2
  Orthomyxoviridae | 8 | 15, 16
  Papillomaviridae | 7, 8 | not listed
  Polyomaviridae | 7, 8 | 23, 24
  Retroviridae | 8, 10 | 29, 30
RNA editing
  Paramyxoviridae | 6, 8 | not listed
  Filoviridae | 8 | not listed
  Hepatitis delta virus | 8 | not listed
Information on both strands
  Adenoviridae | 7–9 | 1, 2
  Polyomaviridae | 7–9 | 23, 24
  Retroviridae | 10 | 29, 30
Polyprotein synthesis
  Alphaviruses | 6, 11 | 33, 34
  Flaviviridae | 6, 11 | 9, 10
  Picornaviridae | 6, 11 | 21, 22
  Retroviridae | 6, 11 | 29, 30
Leaky scanning
  Orthomyxoviridae | 11 | 15, 16
  Paramyxoviridae | 11 | not listed
  Polyomaviridae | 11 | not listed
  Retroviridae | 11 | 29, 30
Reinitiation
  Orthomyxoviridae | 11 | 15, 16
  Herpesviridae | 11 | not listed
Suppression of termination
  Alphaviruses | 11 | 33, 34
  Retroviridae | 11 | 29, 30
Ribosomal frameshifting
  Astroviridae | 11 | not listed
  Coronaviridae | 11 | 5, 6
  Retroviridae | 11 | 29, 30
IRES
  Flaviviridae | 11 | not listed
  Picornaviridae | 11 | 21, 22
Nested mRNAs
  Coronaviridae | 6 | 5, 6
  Arteriviridae | 6 | 5, 6

What Can Viral Sequences Tell Us?
Knowledge about the physical nature of genomes and coding strategies was first obtained by the study of the nucleic acids of viruses. Indeed, DNA sequencing technology was perfected on viral genomes. The first genome of any kind to be
sequenced was that of the Escherichia coli bacteriophage MS2, a linear ssRNA of 3,569 nucleotides. dsDNA genomes of larger viruses, such as herpesviruses and poxviruses (vaccinia virus), were sequenced completely by the 1990s. Since then, high-throughput sequencing has revolutionized the biological sciences, allowing rapid determination of genome sequences from clinical and environmental samples. Organ- and tissue-specific viromes of many organisms have been determined. In one study, over 186 host species representing the phylogenetic diversity of vertebrates, including lancelets (chordates, but considered invertebrates), jawless fish, cartilaginous fish, ray-finned fish, amphibians, and reptiles, all ancestral to birds and mammals, were sampled. RNA was extracted from multiple organs and subjected to high-throughput sequencing. Among 806 billion bases that were read, 214 new viral genomes were identified. The results show that in vertebrates other than birds and mammals, RNA viruses are more numerous and diverse than suspected. Every viral family or genus of bird and mammal viruses is also represented in viruses of amphibians, reptiles, or fish. Arenaviruses, filoviruses, and hantaviruses were found for the first time in aquatic vertebrates. The genomes of some fish viruses have now expanded so that their phylogenetic diversity is larger than in mammalian viruses. New relatives of influenza viruses were found in hagfish, amphibians, and ray-finned fish. As of this writing, the complete sequences of >8,000 different viral genomes have been determined. Published viral genome sequences can be found at http://www.ncbi.nlm.nih.gov/genome/viruses/.

The utility of viral genome sequences extends well beyond building a catalog of viruses. These sequences are the primary basis for classification and also provide information on the origin and evolution of viruses. In outbreaks or epidemics of viral disease, even partial genome sequences can provide information about the identity of the infecting virus and its spread in different populations. New viral nucleic acid sequences can be associated with disease and characterized even in the absence of standard virological techniques (Volume II, Chapter 10). For example, human herpesvirus 8 was identified by comparing sequences present in diseased and nondiseased tissues, and a novel member of the parvovirus family was identified as the cause of unexpected deaths of laboratory mice in Australia and the United States.

Despite their utility, genome sequences cannot provide a complete understanding of how viruses reproduce. The genome sequence of a virus is at best a biological “parts list”: it provides some information about the intrinsic properties of a virus (for example, predicted sequences of viral proteins and particle composition), but says little or nothing about how the virus interacts with cells, hosts, and populations. This limitation is best illustrated by the results of environmental metagenomic analyses, which reveal that the number of viruses around us (especially in the sea) is astronomical. Most are uncharacterized and, because their hosts are also unknown, cannot be investigated. A reductionist study of individual components in isolation provides few answers. Although the reductionist approach is often the simplest experimentally, it is also important to understand how the genome behaves among others (population biology) and how the genome changes with time (evolution). Nevertheless, reductionism has provided much-needed detailed information for tractable virus-host systems. These systems allow genetic and biochemical analyses and provide models of infection in vivo and in cells in culture. Unfortunately, viruses and hosts that are difficult or impossible to manipulate in the laboratory remain understudied or ignored.

The “Big and Small” of Viral Genomes: Does Size Matter?
The question “does genome size matter” is difficult to answer considering the three orders of magnitude in genome length that separate the largest and the smallest viral genomes. The two largest viral genomes known are those of Pandoravirus salinus (2.4 million bases of dsDNA) and Pandoravirus dulcis (1.9 million bases of dsDNA), encoding 2,541 and 1,487 open reading frames, respectively. The largest RNA virus genomes are far behind (Box 3.4). At the other end are anelloviruses, with a 1,759-base ssDNA genome encoding two proteins (Fig. 3.3B), and viroids, circular, single-stranded RNA molecules of 246 to 401 nucleotides that encode no protein (Volume II, Chapter 13). Anelloviruses include agriculturally important pathogens of chickens and pigs and torque teno (TT) virus, which infects >90% of humans with no known consequence. Viroids cause economically important diseases of crop plants.

All viruses with genome sizes spanning the range from the biggest to the smallest are successful as they continue to reproduce and spread within their hosts. Despite detailed analyses, there is no evidence that one size is more advantageous than another. All viral genomes have evolved under relentless selection, so extremes of size must provide particular advantages. One feature distinguishing large genomes from smaller ones is the presence of many genes that encode proteins for viral genome replication, nucleic acid metabolism, and countering host defense systems. When mimiviruses were first discovered, the surprise was that their genomes encoded components of the protein synthesis system, such as tRNAs and aminoacyl-tRNA synthetases. Tupanviruses, isolated from soda lakes in Brazil and deep ocean sediments, encode all 20 aminoacyl-tRNA synthetases, 70 tRNAs, multiple translation proteins, and more. Only the ribosome is lacking. Why would large viral genomes carry these genes when they are available in their cellular hosts? Perhaps by producing a large part of the translational machinery, viral mRNAs can be more efficiently translated. This explanation is consistent with the finding that the codon and amino acid usage of tupanvirus is different from that of the amoeba that it infects.

Another intriguing set of genes belongs to tetraselmis virus 1, which infects green algae. These hosts, found in nutrient-rich marine and fresh waters, are photosynthetic. The viral genome encodes pyruvate formate-lyase and pyruvate formate-lyase-activating enzyme, which are key members of cellular anaerobic respiration pathways and allow energy production when no oxygen is available. Green algae may use this system in waters depleted of oxygen by exuberant algal growth. If this process occurs in cells, why does the viral genome carry some of the genes involved? The answer is not known, but it is possible that the extra metabolic demands placed on cells during virus replication (especially at night) require additional fermentation enzymes for energy production. The presence of these genes suggests that tetraselmis virus 1 can change host metabolism, perhaps facilitating its reproduction.

These large viruses therefore have sufficient coding capacity to escape some restrictions imposed by host cell biochemistry. The smallest genome of a free-living cell is predicted to comprise [...]. Remarkably, this number is smaller than the genetic content of large viral DNA genomes. Nevertheless, the big viruses are not cells: their reproduction absolutely requires the cellular translation machinery, as well as host cell systems to make membranes and generate energy.

The parameters that limit the size of viral genomes are largely unknown. There are cellular DNA and RNA molecules that are much longer than those found in virus particles. Consequently, the rate of nucleic acid synthesis is not likely to be limiting. Nor does the capsid volume appear to limit genome size: the icosahedral shell of Mimivirus, which houses a 1.2 million-base-pair DNA genome, is constructed mainly of a single major capsid protein. For larger genomes, the solution is helical symmetry, which can in principle accommodate very large genomes. The Pandoraviruses, with the largest known DNA viral genomes (2,500 kbp), are housed in decidedly nonisometric ovoid particles 1 μm in length and 0.5 μm in diameter.

There is no reason to believe that the upper limit in viral particle and genome size has been discovered. The core compartment of a mimivirus particle is larger than needed to accommodate the 1,200-kbp DNA genome. A particle of this size could, in principle, house a genome of 6 million bp if the DNA were packed at the same density as in polyomaviruses. Indeed, if the genome were packed into the particle at the density [...] 12 million bp, the size of that of the smallest free-living unicellular eukaryote.

In cells, DNAs are much longer than RNA molecules. RNA is less stable than DNA, but in the cell, much of the RNA is used for the synthesis of proteins and therefore need not exceed the size needed to specify the largest polypeptide. However, this constraint does not apply to viral genomes. Yet the largest viral single-molecule RNA genomes, the 41-kb (+) strand RNAs of the nidoviruses (Box 3.4), are dwarfed by the largest (2,500-kbp) DNA virus genomes. Susceptibility of RNA to chemical and nuclease attack might limit the size of viral RNA genomes. However, the most likely explanation is that there are few known enzymes that can correct errors introduced during RNA synthesis. An exonuclease encoded in the coronavirus genome is one exception: its presence could explain the large size of these RNAs.

BOX 3.4 EXPERIMENTS
Planaria and mollusks yield the biggest RNA genomes
In the past 20 years the development of high-throughput nucleic acid sequencing methods has rapidly increased the pace of virus discovery. Yet in that time, while the largest DNA genomes have increased nearly ten times, the largest known RNA viral genome has only increased in size by ten percent. This situation has now changed with the discovery of new RNA viruses of planarians and mollusks.
Until very recently, the biggest RNA virus genome known was 33.5 kb (ball python nidovirus), which is much larger than the average sized RNA virus genome of 10 kb. The reason for the difference is that RNA polymerases make errors, and most do not have proofreading capabilities. Nidovirus genomes encode a proofreading exoribonuclease which improves replication fidelity and presumably allows for larger genomes. Even with a proofreading enzyme, the biggest RNA virus genome is much smaller than the minimal cellular DNA genome, which is 200 kb. The results of two new studies show that we can find larger virus RNAs, suggesting that we have not yet reached the size limit of RNA genomes.
A close study of the transcriptome of a planarian revealed a new nidovirus, planarian secretory cell nidovirus, with an RNA genome of 41,103 nucleotides. This viral genome is unusual because it encodes a single, long open reading frame of 13,556 amino acids, the longest viral open reading frame (ORF) discovered so far. All the other known nidoviruses encode multiple open reading frames. Phylogenetic analysis of known nidoviruses suggests that the planarian virus arose from viruses with multiple ORFs, after which their single ORF expanded in size.
The other nidovirus with a large RNA genome was discovered by searching all the available RNA sequences of the mollusk Aplysia californica. With a simple nervous system of 20,000 neurons, this mollusk has been studied as a model system in many laboratories. Aplysia californica nido-like virus has an RNA genome of 35,906 nucleotides with ORFs that encode two polyproteins.
From the perspective of genome size, the discovery of these nidovirus genomes suggests that viruses with even larger RNAs remain to be discovered. In both cases the viruses were identified from sequences that had been deposited in public databases, although in both cases, infectious viruses were not reported. Nevertheless, many organisms have not yet had their genomes sequenced and it is likely that many RNA viruses remain to be discovered. Declaring an upper limit on RNA genome size does not seem reasonable if we have not sampled every species.
Saberi A, Gulyaeva AA, Brubacher JL, Newmark PA, Gorbalenya AE. 2018. A planarian nidovirus expands the limits of RNA genome size. PLoS Pathog 14:e1007314.
Debat HJ. 2018. Expanding the size limit of RNA viruses: evidence of a novel divergent nidovirus in California sea hare, with a ∼39.5 kb virus genome. bioRxiv 307678.

[...]isms might have had RNA genomes. Viruses with RNA genomes might have evolved during this time. Later, DNA replaced RNA as cellular genomes, perhaps through the action of reverse transcriptases. With the emergence of DNA genomes probably came the evolution of DNA viruses. However, those with RNA genomes were and remain evolutionarily competitive, and hence they continue to survive to this day.

Analysis of sequences of more than 4,000 RNA-dependent RNA polymerases is consistent with the hypothesis that the first RNA viruses to emerge after the evolution of translation were those with (+) strand RNA genomes. The last common
DNA polymerases can eliminate errors during ancestor of these viruses encoded only an RNA-dependent polymerization, a process known as proofreading, and remain RNA polymerase and a single capsid protein. Double-stranded ing errors can also be corrected after DNA synthesis is com RNA viruses evolved from (+) strand RNA viruses on at least plete. The average error frequencies for RNA genomes are about two different occasions, and (−) strand RNA viruses evolved 1 misincorporation in 104 or 105 nucleotides polymerized. In an from dsRNA viruses. The emergence of viruses with the latter RNA viral genome of 10 kb, a mutation frequency of 1 in 104 genome types was likely facilitated by the capture of genes such would produce about 1 mutation in every replicated genome. as those encoding RNA helicases, to allow for the production Hence, very long viral RNA genomes, perhaps longer than 40 of larger genomes. kb, would sustain too many mutations that would be lethal. Single-stranded DNA viruses of eukaryotes appear to have Even the 7.5-kb genome of poliovirus exists at the edge of infec evolved from genes contributed from both bacterial plasmids tivity: treatment of the virus with the RNA mutagen ribavirin and (+) strand RNA viruses. Different dsDNA viruses origi causes a >99% loss in a single round of replication. nated from bacteriophages at least twice. The larger eukaryotic When new viral genomes are discovered, often many of the DNA viruses form a monophyletic group based on analysis of putative genes are previously unk nown. For example, >93% of 40 genes that derive from a last common ancestor. These vi the >2,500 genes of Pandoravirus salinus resemble nothing ruses appear to have emerged from smaller DNA viruses by known, and 453 of the 663 predicted open reading frames of the capture of multiple eukaryotic and bacterial genes, such as tetraselmis virus 1 show no sequence similarity to known those encoding translation system components. proteins. 
The implication of these findings is clear: our explo There is no evidence that viruses are monophyletic, i.e., de ration of global genome sequences is far from complete, and scended from a common ancestor: there is no single gene viruses with larger genomes might yet be discovered. shared by allviruses. Nevertheless, viruses with different ge nomes and replication strategies do share a small set of viral The Origin of Viral Genomes hallmark genes that encode icosahedral capsid proteins, nu The absence of bona fide viral fossils, i.e., ancient material cleic acid polymerases, helicases, integrases, and other en from which viral nucleic acids can be recovered, might ap zymes. For example, as discussed above, the RNA-dependent pear to make the origin of viral genomes an impenetrable RNA polymerase is the only viral hallmark protein conserved mystery. The oldest viruses recovered from env ironmental in RNA viruses. Examination of the sequences of viral capsid samples, the 30,000-year-old Pithovirus sibericum and Mol proteins reveals at least 20 distinct varieties that were derived livirus sibericum, isolated from Late Pleistocene Siberian from unrelated genes in ancestral cells on multiple occasions. permafrost, are simply too rare and too young to prov ide The emerging evidence therefore suggests that viral replica much information on viral evolution. However, the discovery tion enzymes arose from precellular self-replicating genetic of fragments of viral nucleic acids integrated into host ge elements, while capsid protein genes were captured from un nomes, coupled with the advances in determining genome related genes in cellular hosts. sequences of viruses and their hosts, has prov ided an im The compositions of the eu karyotic and bacterial viromes proved understanding of the evolutionary history of viruses, differ substant ially (Chapter 1, Fig. 1.13). In bacteria, most a topic discussed in depth in Volume II, Chapter 10. 
known vir uses possess dsDNA genomes; fewer vir uses have How viruses with DNA or RNA genomes arose is a compel ssDNA genomes, and there is a very limited number of vi ling question. A predominant hypothesis is that RNA viruses ruses with RNA genomes. In euk aryotes, most of the vi are relics of the “RNA world,” a period populated only by RNA rome diversity is accounted for by RNA vir uses, but ssDNA molecules that catalyzed their own replication in the absence and dsDNA vir uses are common (Chapter 1, Fig. 1.13). The of proteins. During this time, billions of years ago, cellular life reasons for this difference are unclear, but one possibility could have evolved from RNA, and the earliest cellular organ is that the formation of the euk aryotic nucleus erected a 74 Chapter 3 barrier for DNA vir us reproduct ion. On the other hand, the strand RNA genomes (see Chapter 6). There is some evidence eu k aryotic cy toplasm with its extensive membranous sys that segmented RNA ge nomes might have arisen from tem might have been a hospitable location for RNA vir us monopartite genomes, perhaps to allow regulation of the pro replicat ion. duction of individual proteins (Box 3.5). Segmentation proba Viral genomes display a greater diversity of genome com bly did not emerge to increase genome size, as the largest RNA position, structure, and reproduction than any organism. Un genomes are monopartite. derstanding the function of such diversity is an intriguing goal. As viral genomes are survivors of constant selective pressure, all configurations must provide benefits. One possi Genetic Analysis of Viruses bility is that different genome configurations allow unique The application of genetic methods to study the structure and mechanisms for control of gene expression. These mecha function of animal viral genes and proteins began with devel nisms include synthesis of a polyprotein from (+) strand RNA opment of the plaque assay by Renato Dulbecco in 1952. 
This ge nomes or pro duction of subgenomic mRNAs from (−) assay permitted the preparation of clonal stocks of virus, the B OX 3.5 E X P E R I M E N T S Origin of segmented RNA virus genomes Segmented genomes are plentiful in the RNA point mutat ions that gave the RNAs a fitness 1 5’ AnAOH3’ virus world. They are found in virus particles advantage over the standard RNA arose be NSP1 from differ ent fam i lies and can be dou ble fore fragmentation occurred, implying that (flavivirus NS5-like) stranded (Reoviridae) or single stranded, with the changes needed to occur in a specific se (+) (Closteroviridae) or (−) (Orthomyxoviridae) quence. The authors of the study conclude: polarity. Some experimental findings suggest “Thus, explorat ion of sequence space by a vi 2 5’ AnAOH3’ that monopartite viral genomes emerged first ral ge nome (in this case an un segmented VP1 and then later fragmented to form segmented RNA) can reach a point of the space in which genomes. a totally different genome structure (in this Insight into how such segmented genomes case, a segmented RNA) is favored over the 3 5’ AnAOH3’ may have been formed comes from studies form that performed the explorat ion.” While NSP2 with the picornav irus foot-and-mouth dis the fragmentat ion of the foot-and-mouth dis (flavivirus NS2b-NS3-like) ease vir us. The genome of this virus is a single ease virus genome may represent a step on the molecule of (+) strand RNA. Serial passage of path to segmentation, its relevance to what the vir us in baby hamster kidney cells led to occurs in nature is unclear, because the re 4 5’ AnAOH3’ the emergence of genomes with two different sults were obtained in cells in culture. VP2, VP3 large deletions (417 and 999 nucleotides) in A compelling picture of the genesis of a RNA genome of JMTV virus. The viral genome the coding region. 
Neither mutant genome is segmented RNA genome comes from the dis comprises four segments of single-stranded, (+) infectious, but when they are introduced to covery of a new tick-borne virus in China, sense RNA. Proteins encoded by each RNA are in gether into cells, an infectious virus popu la Jingmen tick virus. The genome of this virus dicated. RNA segments 1 and 3 encode flavivirus- tion is produced. This popu lation comprises a comprises four segments of (+) strand RNA. like proteins. mixture of each of the two mutant genomes Two of the RNA segments have no known se packaged sepa rately into virus particles. In quence homologs, while the other two are re another. Next, coinfection of this segmented fection is successf ul because of complementa lated to sequences of flav iv iruses. The RNA flavivirus with another unidentified virus could tion: when a host cell is infected with both genome of flav iv iruses is not segmented: it is a have produced the precursor of Jingmen tick particles, each genome prov ides the proteins single strand of (+) sense RNA. The proteins virus. missing in the other. encoded by RNA segments 1 and 3 are non The results prov ide new clues about the or Further study of the deleted viral genomes structural proteins that are clearly related to igins of segmented RNA viruses. revealed the presence of point mutations in the flav iv irus NS5 and NS3 proteins. Moreno E, Ojosnegros S, García-Arriaza J, Escarmís other regions of the genome. These mutations The genome structure of this vir us sug C, Domingo E, Perales C. 2014. Exploration of se had accumulated before the deletions ap gests that at some point in the past a flav iv i quence space as the basis of viral RNA genome seg peared and increased the fitness of the deleted rus genome fragmented to produce the RNA mentat ion. Proc Natl Acad Sci U S A 111:6678–6683. genome com pared with the wild-type ge segments encoding the NS3- and NS5-like Qin XC, Shi M, Tian JH, Lin XD, Gao DY, He JR, nome. 
proteins. This fragment at ion might have ini Wang JB, Li CX, Kang YJ, Yu B, Zhou DJ, Xu J, Ply- usnin A, Holmes EC, Zhang YZ. 2014. A tick-borne These results show how monopartite viral tially taken place as shown for foot-and-mouth segmented RNA vir us contains genome segments de RNAs may be div ided, possibly a pathway to a disease vir us in cells in cult ure, by fixing of rived from unsegmented viral ancestors. Proc Natl segmented genome. It is interesting that the deletion mutations that complemented one Acad Sci U S A 111:6744–6749. Genomes and Genetics 75 measurement of virus titers, and a convenient system for pract ice, cells are coi nfected with two mutants, and the fre studying viruses with conditional lethal mutations. Although quency of recombinat ion is calcu lated by div idi ng the titer a limited repertoire of classical genetic methods was available, of phenot ypically wild-type vir us (Box 3.7) obtained under the mutants that were isolated (Box 3.6) were invaluable in restrict ive cond it ions (e.g., high temperat ure) by the titer elucidating many aspects of infectious cycles and cell trans measured under permissive cond it ions (e.g., low tempera formation. Contemporary methods of genetic analysis based ture). The recombination frequency between pairs of mu on recombinant DNA technology confer an essentially un tants is determined, allowi ng the mutat ions to be placed on limited scope for genetic manipulation; in principle, any viral a cont iguous map. Although a locat ion can be assigned for gene of interest can be mutated, and the precise nature of the each mutat ion relat ive to others, this approach does not re mutation can be predetermined by the investigator. Much of sult in a physical map of the actual location of the base change the large body of information about viruses and their repro in the genome. duction that we now possess can be attributed to the power of In the case of RNA viruses with segmented genomes, the these methods. 
technique of reassortment allows the assignment of muta tions to specific genome segments. When cells are coinfected with both mutant and wild-type viruses, the progeny includes Classical Genetic Methods reassortants that inherit RNA segments from either parent. Mapping Mutations The origins of the RNA segments can be deduced from their Before the advent of recombinant DNA technology, it was migration patterns during gel electrophoresis (Fig. 3.11) or extremely dif ficult for investigators to determine the loca by nucleic acid hybridization. By ana lyzing a panel of such tions of mutations in viral genomes. The marker rescue tech reassortants, the segment responsible for the phenot ype can nique (described in “Introducing Mutations into the Viral be identified. Genome” below) was a solut ion to this problem, but before it was developed, other, less satisfactory approaches were Functional Analysis exploited. Complementation describes the ability of gene prod Recombination mapping can be applied to both DNA and ucts from two different mutant viruses to interact func RNA viruses. Recombination results in genetic exchange tiona lly in the same cell, perm itt ing viral reproduct ion. It bet ween genomes within the infected cell. The frequency of can be dist ing uished from recombinat ion or reassortment recombinat ion bet ween two mutat ions in a linear genome by examining the progeny produced by coinfected cells. True increases with the physical distance separati ng them. In complementat ion yields only the two parental mutants, B OX 3.6 M E T H O D S Spontaneous and induced mutations In the early days of experimental virology, mu The low spontaneous mutation rate of DNA tant viruses could be isolated only by screening viruses necessitated random mutagenesis by stocks for interesting phenotypes, for none of exposure to a chemical mutagen. 
Mutagens the tools that we now take for granted, such as such as nitrous acid, hydroxylamine, and alkyl restriction endonucleases, efficient DNA se ating agents chemically modify the nucleic acid quencing methods, and molecular cloning in preparations of virus particles, resulting in procedures, were developed until the mid to changes in base-pairing during subsequent ge late 1970s. RNA virus stocks usually contain a nome replication. Base analogs, intercalating viability of the virus. Virus mutants with this high proportion of mutants, and it is only a agents, or UV light are applied to the infected phenotype reproduce well at low temperatures, matter of devising the appropriate selection cell to cause changes in the viral genome dur but poorly or not at allat high temperatures. conditions (e.g., high or low temperature or ing replication. Such agents introduce muta The permissive and nonpermissive tempera exposure to drugs that inhibit viral reproduc tions more or less at random. Some mutations tures are typically 33 and 39°C, respectively, tion) to select mutants with the desired pheno are lethal under all conditions, while others for viruses that replicate in mammalian cells. type from the total population. For example, have no effect and are said to be silent. Other com monly sought phe notypes are the live attenuated poliovirus vaccine strains To facilitate identification of mutants, the changes in plaque size or morphology, drug re developed by Albert Sabin are mutants that population must be screened for a phenotype sistance, antibody resistance, and host range were selected from a virulent virus stock (Vol that can be identified easily in a plaque assay. (that is, loss of the ability to reproduce in cer ume II, Fig. 7.11). One such phenotype is temperature-sensitive tain hosts or host cells). 76 Chapter 3 B OX 3.7 T E R M I N O L O G Y What is wild type? Terminology can be confusing. 
Virologists of poliov irus obtained in 1909 undoubtedly is names of species (which are constructs that as often use terms such as “strains,” “varia nts,” very different from that of the virus we call sist in the cataloging of viruses). A species and “mut ants” to designate a vir us that dif wild type today. We dist ing uish caref ully be name is written in italics with the first word fers in some herit able way from a parent al or tween laboratory wild types and new virus beginning with a capital letter (other words wild-type vir us. In conventional usage, the isolates from the natural host. The latter are should be capitalized if they are proper nouns). wild type is defi ned as the original (often called field isolates or clinical isolates. For example, the causative agents of poliomy laborator y-adapted) vir us from which mu The field of viral taxonomy has its own elitis, poliovirus types 1, 2, and 3, are members tants are selected and which is used as the naming conventions which can cause some of the species Enterovirus C. A virus name basis for comparison. A wild-type vir us may conf usion. Viruses are classified into orders, should never be italicized, even when it in not be ident ic al to a vir us isolated from na fami l ies, subfami l ies, genera, and species. cludes the name of a host species or genus, and ture. In fact, the genome of a wild-type vir us These names are always italicized and start should be written in lowercase: for example, may include numerous mut at ions acc umu with a capital letter (e.g., Picornaviridae). To Sida ciliaris golden mosaic virus. A good exer lated duri ng propagat ion in the laboratory. ensure clarity, the names of viruses (like polio cise would be to see how often we have acci For exa mple, the genome of the first isolate virus) should be written differently from the dentally violated these rules in this textbook. while wild-type genomes result from recombinat ion or re will occur. 
In this way, the members of collect ions of mu assortment. If the mutations being tested are in sepa rate tants obtained by chemic al mutagenesis were init ially or genes, each vir us is able to supply a funct ional gene prod gan ized into complementat ion groups defi ni ng separate uct, allowi ng both vir uses to be reproduced. If the two vi viral funct ions. In principle, there can be as many comple ruses carry mutations in the same gene, no reproduction ment at ion groups as genes. A B L M L R3 M 1 1 2 2 3 3 4 4 5 5 6 6 L M R3 7 7 8 8 Figure 3.11 Reassortment of influenza virus RNA segments. (A) Progeny viruses of cells that are coinfected with two influenza vir us strains, L and M, include both parents and viruses that derive RNA segments from them. Recombinant R3 has inherited segment 2 from the L strain and the remaining seven segments from the M strain. (B) 32P-labeled influenza virus RNAs were fract ionated in a poly acryla mide gel and detected by autoradiography. Migration differences of parental viral RNAs (M and L) permitted identification of the origin of RNA segments in the progeny virus R3. Panel B reprinted from Racaniello VR, Palese P. 1979. J Virol 29:361–373. Genomes and Genetics 77 Engineering Mutations into Viral Genomes The complete genomes of polyomav ir uses, papi llomav i Infectious DNA Clones ruses, and adenoviruses can be cloned in plasmid vectors, and Recombinant DNA techniques have made it possible to in such DNA is infectious under appropriate conditions. The DNA troduce any kind of mutation anywhere in the genome of genomes of herpesviruses and poxviruses are too large to insert most animal viruses, whether that genome comprises DNA into conventional bacterial plasmid vectors, but they can be or RNA. The quintessential tool in virology today is the in cloned into vectors that accept larger insertions (e.g., cosmids fectious DNA clone, a dsDNA copy of the viral genome that and bacterial artificial chromosomes). 
The plasmids containing is carried on a bacterial vector such as a plasmid. Infectious such cloned herpesv irus genomes are infectious. In contrast, DNA clones, or in vitro transcripts derived from them, can be pox v irus DNA is not infectious, because the viral promoters introduced into cultured cells by transfection (Box 3.8) to re cannot be recognized by cellular DNA-dependent RNA poly cover infectious virus. This approach is a modern validation merase. Poxvirus DNA is infectious when early functions (viral of the Hershey-Chase experiment described in Chapter 1. DNA-dependent RNA polymerase and transcription proteins) The availability of site-specific bacterial restriction endonu are provided by complementation with a helper virus. cleases, DNA ligases, and an array of methods for mutagene RNA viruses. (i) (+) strand RNA viruses. The genomic sis has made it possible to manipulate these infectious clones RNA of retroviruses is copied into dsDNA by reverse transcrip at will. Infectious DNA clones also prov ide a stable reposi tase early during infection, a process described in Chapter 10. tory of the viral genome, a particu larly important advantage Such DNA is infectious when introduced into cells, as are for vaccine strains. As oligonucleotide synt hesis has become molecularly cloned forms inserted into bacterial plasmids. more ef ficient and less costly, the assembly of viral DNA ge Infectious DNA clones have been constructed for many nomes up to 212 kbp has become possible (Box 3.9). (+) strand RNA viruses. An example is the introduction of a plasmid containing cloned poliov irus DNA into cultured DNA viruses. Current genetic methods for the study of mammalian cells, which leads to the production of progeny most vir uses with DNA genomes are based on the infectivity virus (Fig. 3.12A). The mechanism by which cloned poliov i of viral DNA. 
When deproteinized viral DNA molecules are rus DNA initiates infection is not known, but it has been sug introduced into permissive cells by transfection, they gener gested that the DNA enters the nucleus, where it is transcribed ally initiate a complete infectious cycle, alt hough the infec by cellular DNA-dependent RNA polymerase from cryptic, tivity (number of plaques per microgram of DNA) may be pr