Harper's Biochemistry Chapter 34 - Nucleic Acid Structure & Function PDF
P. Anthony Weil
Summary
This document is from Harper's Biochemistry and comprises Chapter 34 which discusses the structure and function of nucleic acids. It delves into the chemical structures of DNA and RNA, explains how genetic information is encoded and replicated, and details the processes of transcription and translation involved in protein synthesis. The chapter covers key concepts in molecular biology and genetics.
Full Transcript
SECTION VII Structure, Function, & Replication of Informational Macromolecules

CHAPTER 34
Nucleic Acid Structure & Function
P. Anthony Weil, PhD

OBJECTIVES

After studying this chapter, you should be able to:

Understand the chemical monomeric and polymeric structure of the genetic material, deoxyribonucleic acid, or DNA, which is found primarily within the nucleus of eukaryotic cells, but also in organelles.
Explain why genomic DNA is double stranded and highly negatively charged.
Understand the outline of how the genetic information of DNA can be faithfully duplicated via the process of DNA replication.
Describe how the genetic information of DNA is transcribed, or copied, into myriad, distinct forms of ribonucleic acid (RNA).
Appreciate that one form of information-rich RNA, the so-called messenger RNA (mRNA), is processed posttranscriptionally, shuttled to the cytoplasm, and then translated into proteins, the molecules that form the structures, shapes, and ultimately functions of individual cells, tissues, and organs.

BIOMEDICAL IMPORTANCE

The discovery that genetic information is coded along the length of a polymeric molecule composed of only four types of monomeric units was one of the major scientific achievements of the 20th century. This polymeric molecule, deoxyribonucleic acid (DNA), is the chemical basis of heredity and is organized into genes, the fundamental units of genetic information. The basic information pathway (that is, DNA, which directs the synthesis of RNA, which in turn both directs and regulates protein synthesis) has been elucidated. Genes do not function autonomously; rather, their replication and function are controlled by various gene products, primarily proteins, often in collaboration with components of various signal transduction pathways. Knowledge of the structure and function of nucleic acids is essential in understanding genetics and many aspects of pathophysiology as well as the genetic basis of disease.

DNA CONTAINS THE GENETIC INFORMATION

The demonstration that DNA contained the genetic information was first made in 1944 in a series of classic experiments by Avery, MacLeod, and McCarty. They showed that the genetic determination of the character (type) of the capsule of a specific pneumococcus bacterium could be transmitted to another of a different capsular type by introducing purified DNA from the former pneumococcus into the latter. These authors referred to the agent (later shown to be DNA) accomplishing the change as “transforming factor.” Subsequently, this type of genetic manipulation has become commonplace. Conceptually similar experiments are now regularly performed utilizing a variety of cells, including human cells and mammalian embryos as recipients and molecularly cloned DNA as the donor of genetic information.

DNA Contains Four Distinct Deoxynucleotides

The chemical nature of the monomeric deoxynucleotide units of DNA (deoxyadenylate, deoxyguanylate, deoxycytidylate, and thymidylate) is described in Chapter 32. These monomeric units of DNA are held in polymeric form by 3′,5′-phosphodiester bonds constituting a single strand, as depicted in Figure 34–1. The informational content of DNA (the genetic code) resides in the sequence in which these monomers, the purine and pyrimidine deoxyribonucleotides, are ordered. The polymer as depicted possesses a polarity; one end has a 5′-hydroxyl or phosphate terminus while the other has a 3′-phosphate or hydroxyl terminus. The importance of this polarity will become evident.

FIGURE 34–1 A segment of one strand of a DNA molecule in which the purine and pyrimidine bases guanine (G), cytosine (C), thymine (T), and adenine (A) are held together by a phosphodiester backbone between 2′-deoxyribosyl moieties attached to the nucleobases by an N-glycosidic bond. Note that the phosphodiester backbone is negatively charged and has a polarity (ie, a direction). Convention dictates that a single-stranded DNA sequence is written in the 5′ to 3′ direction (ie, pGpCpTpAp, where G, C, T, and A represent the four bases and p represents the interconnecting phosphates).

Since the genetic information resides in the order of the monomeric units within the polymers, there must exist a mechanism of reproducing or replicating this specific information with a high degree of fidelity. That requirement, together with x-ray diffraction data from the DNA molecule generated by Franklin, and the observation of Chargaff that in DNA molecules the concentration of deoxyadenosine (A) nucleotides equals that of thymidine (T) nucleotides (A = T), while the concentration of deoxyguanosine (G) nucleotides equals that of deoxycytidine (C) nucleotides (G = C), led Watson, Crick, and Wilkins to propose in the early 1950s a model of a double-stranded (ds) DNA molecule. The model they proposed is depicted in Figure 34–2. The two strands of this double-stranded helix are held in register both by hydrogen bonds (H-bonds; see Figure 2–2) between the purine and pyrimidine bases of the respective linear molecules and by van der Waals and hydrophobic interactions (see Chapter 2 and Figure 2–4) between the stacked adjacent base pairs. The pairings between the purine and pyrimidine nucleotides on the opposite strands are very specific, and are dependent on hydrogen bonding of A with T and G with C (see Figure 34–2). A–T and G–C base pairs are often referred to as Watson-Crick base pairs.

This common form of DNA is said to be right handed because as one looks down the double helix, the base residues form a spiral in a clockwise direction. In the double-stranded molecule, restrictions imposed by the rotation about the phosphodiester bond, the favored anti-configuration of the glycosidic bond (see Figure 32–5), and the predominant tautomers (see Figure 32–2) of the four bases (A, G, T, and C) allow A to pair only with T, and G only with C, as depicted in Figure 34–3. These base-pairing restrictions explain the earlier observation that in a double-stranded DNA molecule the content of A equals that of T and the content of G equals that of C. The two strands of the double-helical molecule, each of which possesses a polarity, are antiparallel; that is, one strand runs in the 5′ to 3′ direction and the other in the 3′ to 5′ direction. Within a particular gene in the double-stranded DNA molecule, the genetic information resides in the sequence of nucleotides on one strand, the template strand. This is the strand of DNA that is copied, or transcribed, during ribonucleic acid (RNA) synthesis. It is sometimes referred to as the noncoding strand. The opposite strand is considered the coding strand because it matches the sequence of the RNA transcript (but containing uracil in place of thymine; see Figure 34–8) that encodes the protein.

The two strands, in which opposing bases are held together by interstrand hydrogen bonds, wind around a central axis in the form of a double helix. In the test tube, double-stranded DNA can exist in at least six forms (A-DNA through E-DNA, and Z-DNA). These different forms of DNA differ with regard to intra- and interstrand interactions and involve structural rearrangements within the monomeric units of DNA. The B form is usually found under physiologic conditions. A single turn of B form DNA about the long axis of the molecule contains 10 bp. The distance spanned by one turn of B-DNA is 3.4 nm (34 Å). The width (helical diameter) of the double helix in B-DNA is 2 nm (20 Å).

FIGURE 34–2 A diagrammatic representation of the Watson and Crick model of the double-helical structure of the B form of DNA. The horizontal arrow indicates the width of the double helix (20 Å), and the vertical arrow indicates the distance spanned by one complete turn of the double helix (34 Å). One turn of B-DNA includes 10 base pairs (bp), so the rise is 3.4 Å per bp. The central axis of the double helix is indicated by the vertical rod. The short arrows designate the polarity of the antiparallel strands. The major and minor grooves are depicted. (A, adenine; C, cytosine; G, guanine; P, phosphate; S, sugar [deoxyribose]; T, thymine.) Hydrogen bonds between A/T and G/C bases are indicated by short, red, horizontal lines.

There Are Grooves in the DNA Molecule

Examination of the model depicted in Figure 34–2 reveals a major groove and a minor groove winding along the molecule parallel to the phosphodiester backbones. In these grooves, proteins often interact specifically with exposed atoms of the nucleotides (via specific hydrophobic and ionic interactions), thereby recognizing and binding to specific nucleotide sequences as well as the unique shapes formed therefrom. Binding usually occurs without disrupting the base pairing of the double-helical DNA molecule. As discussed in Chapters 35, 36, and 38, regulatory proteins that control DNA replication, repair, and recombination, as well as the transcription of specific genes, act through such protein-DNA interactions and thus contribute critically to cellular function.

The Denaturation of DNA Is Used to Analyze Its Structure

As depicted in Figure 34–3, three H-bonds, formed by hydrogen atoms bonded to electronegative N or O atoms, hold the deoxyguanosine nucleotide to the deoxycytidine nucleotide. By contrast, the other canonical base pair, the A–T pair, is held together by two H-bonds. Note that the four DNA nucleotide bases ([dG, dA] purines and [dT, dC] pyrimidines) are flat, planar molecules (see Figure 32–1 and Table 32–1). These two key properties of the nucleotide bases allow them to closely stack within duplex DNA (see Figure 34–2). Moreover, the atoms within the aromatic, heterocyclic bases are highly polarizable; this, coupled with the fact that many of the atoms within the bases carry partial charges, allows the stacked bases to form van der Waals and electrostatic interactions. Collectively these forces are referred to as base-stacking forces or base-stacking interactions. The base-stacking interactions between adjacent G–C (or C–G) base pairs are stronger than those between A–T (or T–A) base pairs. Thus, overall, G–C-rich DNA sequences are more resistant to denaturation, or strand separation, termed “melting,” than are A–T-rich regions of DNA.

FIGURE 34–3 Classic Watson-Crick DNA base pairing between complementary deoxynucleotides involves the formation of hydrogen bonds. Two such H-bonds form between adenine and thymine, and three H-bonds form between cytosine and guanine. The broken lines represent H-bonds.

DNA Can be Reversibly Denatured & Specifically Renatured, Both in the Test Tube & in Living Cells

In the laboratory the double-stranded structure of DNA can be separated, or denatured, into its two component strands in solution by increasing temperature, decreasing solution salt concentrations, adding chaotropic agents (which can form competing H-bonds with the individual deoxynucleotide bases), or, as is often done experimentally, a combination of all three treatments. Under such conditions not only do the two stacks of bases pull apart, but the bases themselves unstack while remaining connected by the phosphodiester backbones of the two now single-stranded polymers. Concomitant with the denaturation of the DNA molecule into two single strands is an increase in the optical absorbance in the ultraviolet light spectrum (260 nm; see Chapter 32) of the purine and pyrimidine bases of each strand; this phenomenon is referred to as hyperchromicity of denaturation. Because of the combined strength of base stacking and the H-bonding between the complementary bases in each strand, the double-stranded DNA molecule exhibits properties of a rigid rod. Thus, native double-stranded DNA in solution is an extremely viscous material. However, on denaturation, DNA solutions lose their viscosity.

The strands of a given molecule of double-stranded DNA separate over a temperature range. The midpoint of the measured DNA denaturation is called the melting temperature, or Tm. The Tm is influenced by the base composition of the DNA and by the salt concentration or other components of the solution (see following discussion). DNA rich in G–C pairs melts at a higher temperature than DNA rich in A–T pairs, due to differences in hydrogen bond content and base stacking, as discussed earlier. A 10-fold increase of monovalent cation concentration significantly increases the Tm by neutralizing the intrinsic interchain repulsion between the highly negatively charged phosphates of the phosphodiester backbone of each DNA strand. For example, an increase of NaCl concentration from 0.01 to 0.1 M increases Tm by 16.6°C. By contrast, chaotropes such as urea (NH2CONH2; see Figure 28–16) and formamide (CH3NO) can efficiently form H-bonds with the nucleotide bases, which destabilizes H-bonding between bases. Such solution conditions will lower the Tm. Chaotrope addition allows the strands of DNA, complementary DNA–RNA hybrids, and intramolecular RNA–RNA hybrids (see following discussion) to be separated at much lower temperatures. Lower temperatures minimize phosphodiester bond breakage and chemical damage to nucleotides that can occur on extended incubation in solution. In living cells, both DNA denaturation and renaturation (see following discussion) occur naturally during the processes of DNA replication, DNA recombination, DNA repair (see Chapter 35), and DNA gene transcription (see Chapter 36). In all of these instances DNA strand separation and renaturation are mediated through the action of specific nucleic acid binding proteins and various enzymes, in combination with thermal and/or chemical energy supplied via ATP hydrolysis.

Renaturation of DNA Requires Precise Base Pair Matching

Importantly, separated strands of DNA will renature, or reassociate, when appropriate temperature and salt conditions are achieved. Reannealing is often referred to as renaturation or hybridization. The rate of strand reassociation depends on the concentration of the complementary strands. At a given temperature and salt concentration, a particular nucleic acid strand will associate tightly only with its complementary strand. Thus, renaturation is highly specific. Indeed, researchers have shown that renatured DNA hybrid molecules with but one base pair mismatch can readily be detected and quantified. Importantly, DNA-DNA, DNA-RNA, and RNA-RNA hybrid molecules will also form under appropriate conditions. For example, DNA will form a perfect double-stranded hybrid molecule with a complementary DNA (cDNA) sequence, or with a cognate, complementary RNA. When hybridization is combined with various sophisticated analytical techniques, scientists can specifically detect, identify, and even determine the sequence of vanishingly small amounts of nucleic acids (both DNA and RNA). An overview of much of the enabling technology of such nucleic acid analyses is described in Chapter 39.

DNA Exists in Relaxed & Supercoiled Forms

In some organisms such as bacteria, bacteriophages, many DNA-containing animal viruses, as well as organelles such as mitochondria (see Figure 35–8), the ends of the DNA molecules are joined to create a closed circle with no covalently free ends. This of course does not destroy the polarity of the molecules, but it eliminates all free 3′ and 5′ hydroxyl and phosphoryl groups. Closed circles exist in relaxed or supercoiled forms. Supercoils are introduced when a closed circle is twisted around its own axis or when a linear piece of duplex DNA, whose ends are fixed, is twisted. This energy-requiring process puts the molecule under torsional stress, and the greater the number of supercoils, the greater the stress or torsion (test this by twisting a rubber band). Negative supercoils are formed when the molecule is twisted in the direction opposite from the clockwise turns of the right-handed double helix found in B-DNA. Such DNA is said to be underwound. The energy required to achieve this state is, in a sense, stored in the supercoils. The transition to another form that requires energy is thereby facilitated by the underwinding (see Figure 35–19). One such transition is strand separation, which is a prerequisite for DNA replication and transcription. Supercoiled DNA is therefore a preferred form in biologic systems. Enzymes that catalyze topologic changes of DNA are called topoisomerases. Topoisomerases can relax or insert supercoils, using ATP as an energy source. Homologs of these enzymes exist in all organisms and are important targets for cancer chemotherapy. Supercoils can also form within linear DNAs if particular segments of DNA are constrained by interacting tightly with nuclear proteins that establish two boundary sites defining a topologic domain.

DNA PROVIDES A TEMPLATE FOR REPLICATION & TRANSCRIPTION

The genetic information stored in the nucleotide sequence of DNA serves two purposes. It is the source of information for the synthesis of all protein molecules of the cell and organism, and it provides the information inherited by daughter cells or offspring. Both of these functions require that the DNA molecule serve as a template: in the first case for the transcription of the information into RNA, and in the second case for the replication of the information into daughter DNA molecules.

When each strand of the double-stranded parental DNA molecule separates from its complement during replication, each independently serves as a template on which a new complementary strand is synthesized (Figure 34–4). The two newly formed double-stranded daughter DNA molecules, each containing one strand (but complementary rather than identical) from the parent double-stranded DNA molecule, are then sorted between the two daughter cells during mitosis (Figure 34–5). Each daughter cell contains DNA molecules with information identical to that which the parent possessed; yet, in each daughter cell, the DNA molecule of the parent cell has been only semiconserved.

FIGURE 34–5 DNA replication is semiconservative. During a round of replication, each of the two strands of DNA is used as a template for synthesis of a new, complementary strand. The semiconservative nature of DNA replication has implications for the biochemical (see Figure 35–16), cytogenetic (see Figure 35–12), and epigenetic control of gene expression (see Figures 38–8 and 38–9). (Panel labels: original parent molecule; first-generation daughter molecules; second-generation daughter molecules.)

THE CHEMICAL NATURE OF RNA DIFFERS FROM THAT OF DNA

RNA is a polymer of purine and pyrimidine ribonucleotides linked together by 3′,5′-phosphodiester bonds analogous to those in DNA (Figure 34–6). Although sharing many features with DNA, RNA possesses several specific differences:

1.
In RNA, the sugar moiety to which the phosphates and purine and pyrimidine bases are attached is ribose rather than the 2′-deoxyribose of DNA (see Figures 19–2 and 32–3).
2. The pyrimidine components of RNA can differ from those of DNA. Although RNA contains the ribonucleotides of adenine, guanine, and cytosine, it does not possess thymine except in the rare case mentioned in the following discussion. Instead of thymine, RNA contains the ribonucleotide of uracil.
3. RNA can exist as a single strand, whereas DNA exists as a double-stranded helical molecule. However, given the proper complementary base sequence with opposite polarity, the single strand of RNA (as demonstrated in Figures 34–7 and 34–11) is capable of folding back on itself like a hairpin and thus acquiring double-stranded characteristics: G pairing with C, and A pairing with U. G-C base pairs form three H-bonds and A-U base pairs form only two.
4. Since the RNA molecule is a single strand complementary to only one of the two strands of a gene, its guanine content does not necessarily equal its cytosine content, nor does its adenine content necessarily equal its uracil content.
5. RNA can be hydrolyzed by alkali to 2′,3′ cyclic diesters of the mononucleotides, compounds that cannot be formed from alkali-treated DNA because of the absence of a 2′-hydroxyl group. The alkali lability of RNA is useful both diagnostically and analytically.

Information within the single strand of RNA is contained in its sequence (“primary structure”) of purine and pyrimidine nucleotides within the polymer. The sequence is complementary to the template strand of the gene from which it was transcribed. Because of this complementarity, an RNA molecule can bind specifically via the base-pairing rules to its template DNA strand (A–T, G–C, C–G, U–A; RNA base listed first); it will not bind (“hybridize”) with the other (coding) strand of its gene. The sequence of the RNA molecule (except for U replacing T) is the same as that of the coding strand of the gene (Figure 34–8).

FIGURE 34–4 DNA synthesis maintains the sequence and structure of the original template DNA. The double-stranded structure of DNA and the template function of each old parental strand (orange) on which a new complementary daughter strand (blue) is synthesized.

FIGURE 34–6 A segment of a ribonucleic acid (RNA) molecule in which the purine and pyrimidine bases guanine (G), cytosine (C), uracil (U), and adenine (A) are held together by phosphodiester bonds between ribosyl moieties attached to the nucleobases by N-glycosidic bonds. Note that negative charge(s) on the phosphodiester backbone are not illustrated (compare Figure 34–1) and that the polymer has a polarity as indicated by the labeled 3′- and 5′-attached phosphates.

FIGURE 34–7 RNA Secondary Structure. (Left) Diagrammatic representation of the secondary structure of a hypothetical single-stranded RNA molecule in which a stem loop, or “hairpin”, has been formed. Formation of this structure is dependent on the indicated intramolecular base pairing (colored horizontal lines between complementary bases). Note that in RNA G pairs with C as in DNA, but that A pairs with U. (Right) Schematic of A-U (top) compared to A-T base pairs (bottom).

FIGURE 34–8 The relationship between the sequences of an RNA transcript and its gene, in which the coding and template strands are shown with their polarities. The RNA transcript with a 5′ to 3′ polarity is complementary to the template strand with its 3′ to 5′ polarity. Note that the sequence in the RNA transcript and its polarity are the same as those in the coding strand, except that the U of the transcript replaces the T of the gene; the initiating nucleotide of RNAs contains a terminal 5′-triphosphate (ie, pppA, above).

NEARLY ALL THE SEVERAL SPECIES OF STABLE, ABUNDANT RNAs ARE INVOLVED IN SOME ASPECT OF PROTEIN SYNTHESIS

Those cytoplasmic RNA molecules that serve as templates for protein synthesis (ie, that transfer genetic information from DNA to the protein-synthesizing machinery) are designated messenger RNAs (mRNAs). Many other very abundant cytoplasmic RNA molecules (ribosomal RNAs [rRNAs]) have structural roles wherein they contribute to the formation and function of ribosomes (the organellar machinery for protein synthesis) or serve as adapter molecules (transfer RNAs [tRNAs]) for the translation of RNA information into specific sequences of polymerized amino acids.

Interestingly, some RNA molecules have intrinsic catalytic activity. The activity of these RNA enzymes, or ribozymes, often involves the cleavage of a nucleic acid. Two examples are the ribozymes involved in RNA splicing and the peptidyl transferase that catalyzes peptide bond formation on the ribosome.

In all eukaryotic cells, there are small nuclear RNA (snRNA) species that are not directly involved in protein synthesis but play pivotal roles in RNA processing, particularly mRNA processing. These relatively small molecules vary in size from 90 to about 300 nucleotides (Table 34–1). The properties of the several classes of cellular RNAs are detailed in the following discussion.

TABLE 34–1 Some of the Species of Small-Stable RNAs Found in Mammalian Cells

Name    Length (Nucleotides)    Molecules per Cell    Localization
U1      165                     1 × 10^6              Nucleoplasm
U2      188                     5 × 10^5              Nucleoplasm
U3      216                     3 × 10^5              Nucleolus
U4      139                     1 × 10^5              Nucleoplasm
U5      118                     2 × 10^5              Nucleoplasm
U6      106                     3 × 10^5              Perichromatin granules
4.5S    95                      3 × 10^5              Nucleus and cytoplasm
7SK     280                     5 × 10^5              Nucleus and cytoplasm

The genetic material for some animal and plant viruses is RNA rather than DNA. Although some RNA viruses never have their information transcribed into a DNA molecule (eg, influenza viruses and coronaviruses such as SARS-CoV-2, the cause of COVID-19), certain animal RNA viruses, specifically the retroviruses (eg, the human immunodeficiency virus, HIV), are transcribed by viral RNA-dependent DNA polymerase, the so-called reverse transcriptase, to produce a double-stranded DNA copy of their RNA genome. In many cases, the resulting double-stranded DNA transcript is integrated into the host genome and subsequently serves as a template for gene expression, from which new viral RNA genomes and viral mRNAs are transcribed; the mRNAs are then translated by the host cell machinery into viral proteins. Genomic insertion of such integrating “proviral” DNA molecules can, depending on the site involved, be mutagenic, inactivating a gene or dysregulating its expression (see Figure 35–11).

THERE EXIST SEVERAL DISTINCT CLASSES OF RNA

As noted earlier, in all prokaryotic and eukaryotic organisms, four main classes of RNA molecules exist: mRNA, tRNA, rRNA, and small RNAs. Each differs from the others by abundance, size, function, and general stability.

Messenger RNA

This class is the most heterogeneous in abundance, size, and stability; for example, in brewer’s yeast, specific mRNAs are present at levels ranging from hundreds of copies per cell to, on average, ≤0.1 copy per cell in a genetically homogeneous population. As detailed in Chapters 36 and 38, both specific transcriptional and posttranscriptional mechanisms contribute to this large dynamic range in mRNA content. In mammalian cells, specific mRNA abundance likely varies over a 10^4-fold range. All members of this RNA class function as messengers conveying the information in a gene to the protein-synthesizing machinery, where each mRNA serves as a template on which a specific sequence of amino acids is polymerized to form a specific protein molecule, in this case the ultimate gene product (Figure 34–9).

Eukaryotic mRNAs have unique chemical characteristics. The 5′ terminus of mRNA is “capped” by a 7-methylguanosine triphosphate that is linked to an adjacent 2′-O-methyl ribonucleoside at its 5′-hydroxyl through the three phosphates (Figure 34–10). mRNA molecules frequently contain internal N6-methyladenine and other 2′-O-ribose-methylated nucleotides. The cap is involved in the recognition of mRNA by the translation machinery, and also helps stabilize the mRNA by preventing the nucleolytic attack by 5′-exoribonucleases. The protein-synthesizing machinery begins translating the mRNA into protein downstream of the 5′ or capped terminus. At the other end of almost all eukaryotic mRNA molecules, the 3′-hydroxyl terminus has an attached, nongenetically encoded polymer of adenylate residues 20 to 250 nucleotides in length. The poly(A) “tail” at the 3′-end of mRNAs maintains the intracellular stability of the specific mRNA by preventing the attack of 3′-exoribonucleases and also facilitates translation (see Figure 37–7). Both the mRNA “cap” and “poly(A) tail” are added posttranscriptionally by nontemplate-directed enzymes to mRNA precursor molecules (pre-mRNA). mRNA represents 2 to 5% of total eukaryotic cellular RNA.

In mammalian cells, including cells of humans, the mRNA molecules present in the cytoplasm are not the RNA products immediately synthesized from the DNA template but must be formed by processing from the precursor, or pre-mRNA, before entering the cytoplasm. Thus, in mammalian cell nuclei, the immediate products of gene transcription (primary transcripts) are very heterogeneous and can be 10- to 50-fold longer than mature mRNA molecules. As discussed in Chapter 36, pre-mRNA molecules are processed to generate mRNA molecules, which then enter the cytoplasm to serve as templates for protein synthesis.

FIGURE 34–9 The expression of genetic information within DNA into the form of an mRNA transcript with 5′ to 3′ polarity and then into protein with N- to C-polarity is shown. DNA is transcribed into mRNA that is subsequently translated by ribosomes into a specific protein molecule that exhibits polarity, N-terminus (N) to C-terminus (C).

FIGURE 34–10 The cap structure attached to the 5′ terminal of most eukaryotic messenger RNA molecules. A 7-methylguanosine triphosphate (black) is attached at the 5′ end of the mRNA (red), which usually also contains a 2′-O-methylpurine nucleotide. These modifications (the cap and methyl group) are added after the mRNA is transcribed from DNA. Note that the γ- and β-phosphates of the GTP added to form the cap (black in figure) are lost on cap addition while the γ-phosphate of the initiating nucleotide (here an A-residue; red in figure) is lost during cap addition.

Transfer RNA

tRNA molecules vary in length from 74 to 95 nucleotides and, like many other RNAs, are also generated by nuclear processing of a precursor molecule (see Chapter 36). The tRNA molecules serve as adapters for the translation of the information in the sequence of nucleotides of the mRNA into specific amino acids. There are at least 20 species of tRNA molecules in every cell, at least one (and often several) corresponding to each of the 20 amino acids required for protein synthesis. Although each specific tRNA differs from the others in its sequence of nucleotides, the tRNA molecules as a class have many features in common. The primary structure (that is, the nucleotide sequence) of all tRNA molecules allows extensive folding and intrastrand complementarity to generate a secondary structure that appears in two dimensions like a cloverleaf (Figure 34–11).

All tRNA molecules contain four main double-stranded arms or stems, connected by single-stranded loops named for their respective nucleotide composition or function. The amino acid acceptor arm terminates in the nucleotides CpCpA-OH. As with the mRNA 5′ cap and 3′ poly(A) tail, these three nucleotides (CCA) are added posttranscriptionally, in this case by a specific nucleotidyl transferase enzyme. The tRNA-appropriate amino acid is attached, or “charged,” onto the posttranscriptionally added 3′-OH group of the A moiety of the acceptor arm through the action of specific aminoacyl tRNA synthetases (see Figure 37–1). The D, TψC, and extra arms help define a specific tRNA. tRNAs compose roughly 20% of total cellular RNA. Recent work has shown that many tRNAs are specifically cleaved by certain ribonucleases to generate unique subfragments termed tRNA-derived small RNAs (tsRNAs). These relatively stable tsRNAs are thought to regulate both transcription and translation (see Chapters 36–38).

Ribosomal RNA

A ribosome is a cytoplasmic nucleoprotein structure that acts as the machinery for the synthesis of proteins from the mRNA templates. On the ribosomes, the mRNA and tRNA molecules interact to translate the information transcribed from the gene during mRNA synthesis into a specific protein. During periods of active protein synthesis, many ribosomes can be associated with any single mRNA molecule to form an assembly called the polysome (see Figure 37–7).

The components of the mammalian ribosome, which has a molecular weight of about 4.2 × 10^6 and a sedimentation velocity coefficient of 80S (S = Svedberg units, a parameter sensitive to molecular size and shape), are shown in Table 34–2. The mammalian ribosome contains two major nucleoprotein subunits: a larger one with a molecular weight of 2.8 × 10^6 (60S) and a smaller subunit with a molecular weight of 1.4 × 10^6 (40S). The 60S subunit contains a 5S rRNA, a 5.8S rRNA, and a 28S rRNA; there are also more than 50 specific polypeptides. The 40S subunit is smaller and contains a single 18S rRNA and 30 distinct polypeptide chains. All of the rRNA molecules except the 5S rRNA, which is independently transcribed, are processed from a single 45S precursor RNA molecule in the nucleolus (see Chapter 36). rRNAs are highly methylated posttranscriptionally, and are packaged in the nucleolus with the specific ribosomal proteins. In the cytoplasm, the ribosomes remain quite stable and capable of many cycles of translation. The exact functions of the rRNA molecules in the ribosome, over and above their scaffold functions, are not yet fully understood. It is clear though that rRNAs are necessary for

FIGURE 34–11 Structure of a mature, functional tRNA, yeast phenylalanyl-tRNA (tRNAPhe). Shown are primary (1°), secondary (2°), and tertiary (3°) structures (top, lower left, and lower right, respectively) of tRNAPhe. Numerals below the 76 nucleotide-long tRNAPhe primary structure indicate nucleotide numbering from the 5′ (+1) to the 3′ end (+76) of the molecule. Note that the +1 nucleotide contains a 5′ phosphate moiety (P), while the 3′ nucleotide has a free 3′ hydroxyl group (OH). Bases underlined in bold type within the sequence of tRNAPhe are heavily modified to the nucleotides shown in the 2° structural representation of tRNAPhe. This structure is often referred to as a “cloverleaf.” Some of these nucleotides have noncanonical ribonucleotide names, as represented in the 2° structural model.
Within tRNAPhe nucleotides U16 and U17 are modified to D16, D17; G37 to Y37; U39 and U55 to Ψ; and U54 to T54 (see following discussion for details). Straight lines between bases within the tRNA secondary structure represent hydrogen bonds formed between bases (A–U; G–C). Note that these regions of secondary structure form with the same strand polarity (ie, 5′ to 3′ and 3′ to 5′) as base-paired regions of DNA. The three bases of the anticodon loop are shown in red. In the case of amino acid–charged tRNAs, an aminoacyl moiety is esterified to the 3′-CCAOH terminus (brown; in this case the amino acid would be phenylalanine; not shown). Blue type highlights nontraditional nucleotides introduced by posttranslational modification, abbreviated as follows: m2G = 2-methylguanosine; D = 5,6-dihydrouridine; m22G = N2-dimethylguanosine; Cm = O2′-methylcytidine; Gm = O2′-methylguanosine; T = 5-methyluridine; Y = wybutosine; Ψ = pseudouridine; m5C = 5-methylcytidine; m7G = 7-methylguanosine; m1A = 1-methyladenosine. Essentially all tRNAs fold into similar, characteristic, tertiary structures (3o) as shown, lower right. The distinct portions of the molecule in 2o (insert) and 3o configurations are color-coded in this image for clarity. tRNAPhe was the first nucleic acid whose structure was determined by x-ray crystallography. Such distinct three-dimensional tRNA structures bind specifically to important functional sites on both aminoacyl tRNA synthetases and the ribosomes during protein synthesis (see Chapter 37). (Reproduced with permission from Transfer RNA/Wikipedia Commons https://en.wikipedia.org/wiki/Transfer_RNA.) 
ribosome assembly, and also play key roles in the binding of mRNA to ribosomes and mRNA translation. Recent studies indicate that the large rRNA component performs the peptidyl transferase activity and thus is a ribozyme. The rRNAs (28S + 18S) represent roughly 70% of total cellular RNA.

TABLE 34–2 Components of Mammalian Ribosomes

Component     Mass (MW)     Protein number   Protein mass   RNA size   RNA mass      RNA bases
40S subunit   1.4 × 10^6    33               7 × 10^5       18S        7 × 10^5      1900
60S subunit   2.8 × 10^6    50               1 × 10^6       5S         3.5 × 10^4    120
                                                            5.8S       4.5 × 10^4    160
                                                            28S        1.6 × 10^6    4700

Note: The ribosomal subunits are defined according to their sedimentation velocity in Svedberg (S) units (40S or 60S). The number of unique proteins and their total mass (MW) and the RNA components of each subunit in size (Svedberg units), mass, and number of bases are listed.

Small RNA

A large number of discrete, highly conserved small RNA species are found in eukaryotic cells; some are quite stable. Most of these molecules are complexed with proteins to form ribonucleoproteins and are distributed in the nucleus, the cytoplasm, or both. They range in size from 20 to 1000 nucleotides and are present in 100,000 to 1,000,000 copies per cell, collectively representing ≤5% of cellular RNA.

Small Nuclear RNAs

snRNAs, a subset of the small RNAs (see Table 34–1), are significantly involved in rRNA and mRNA processing and gene regulation. Of the several snRNAs, U1, U2, U4, U5, and U6 are involved in mRNA splicing, a nuclear process whereby introns are removed from mRNA precursor molecules to generate functional, translatable cytoplasmic mRNAs (see Chapter 36). The U7 snRNA is involved in production of the correct 3′ ends of histone mRNA, which lacks a poly(A) tail. 7SK RNA associates with several proteins to form a ribonucleoprotein complex that binds and regulates the elongation factor P-TEFb, thereby modulating mRNA gene transcription elongation by RNA polymerase II (see Chapter 36).

Large & Small Noncoding Regulatory RNAs: Micro-RNAs (miRNAs), Silencing RNAs (siRNAs), Long Noncoding RNAs (lncRNAs), and Circular RNAs (circRNAs)

One of the most exciting and unanticipated discoveries in the last decade of eukaryotic regulatory biology has been the identification and characterization of regulatory nonprotein coding RNAs (ncRNAs). ncRNAs exist in two general size classes, large (50–1000 nt) and small (20–22 nt). Regulatory ncRNAs have been described in most eukaryotes (see Chapter 38).

The small ncRNAs termed miRNAs and siRNAs typically inhibit gene expression at the level of specific protein production by targeting mRNAs through one of several distinct mechanisms. miRNAs are generated by specific nucleolytic processing of the products of distinct genes/transcription units (see Figure 36–17). miRNA precursors, which are 5′-capped and 3′-polyadenylated, usually range in size from about 500 to 1000 nucleotides. By contrast, siRNAs are generated by the specific nucleolytic processing of large dsRNAs that are either produced from other endogenous RNAs or introduced into the cell by, for example, RNA viruses. Both siRNAs and miRNAs hybridize to their targeted mRNAs through the formation of RNA–RNA hybrids (see Figure 38–19). To date, hundreds of distinct miRNAs and siRNAs have been described in humans; estimates suggest that there are likely thousands of human miRNA-encoding genes. Given their exquisite genetic specificity, both miRNAs and siRNAs represent exciting new potential agents for therapeutic drug development. In the laboratory, siRNAs are frequently used to decrease or “knock down” specific protein levels (via siRNA homology–directed mRNA degradation), and thus serve as an extremely useful and powerful alternative to gene-knockout technology (see Chapter 39). Indeed, several siRNA-based therapeutic clinical trials are in progress to test the efficacy of these novel molecules as drugs for treating human disease.

Other exciting recent observations in the RNA realm are the identification and characterization of two classes of larger noncoding RNAs, the circular RNAs (circRNAs) and the long noncoding RNAs, or lncRNAs. Many circRNAs have recently been discovered and characterized. circRNAs appear to be produced by RNA splicing-type reactions from a wide range of precursor RNAs, both mRNA precursors and nonprotein coding lncRNA precursors (see following discussion for more information on lncRNAs). Though not an abundant class of RNA molecules in most cells, circRNAs have been detected in all eukaryotes tested, and seem particularly abundant in metazoans. While the functions of circRNAs are still being elucidated, they seem to be particularly abundant in cells of the nervous system. Similar to lncRNAs, these molecules likely play important roles in cellular biology by regulating gene expression at multiple levels. LncRNAs, as their name implies, do not code for protein, and range in size from ~300 to thousands of nucleotides in length. These RNAs are typically transcribed from the large regions of eukaryotic genomes that do not code for protein (ie, regions outside the mRNA-encoding genes). In fact, transcriptome analyses indicate that more than 90% of all eukaryotic genomic DNA may be transcribed at some level. ncRNAs make up a significant portion of this transcription. ncRNAs play many roles ranging from contributing to structural aspects of chromatin to regulation of mRNA gene transcription by RNA polymerase II. Future work will further characterize this important, newly discovered class of RNA molecules.

Interestingly, bacteria also contain small, heterogeneous regulatory RNAs termed sRNAs. Bacterial sRNAs range in size from 50 to 500 nucleotides, and like eukaryotic mi/si/lncRNAs, control the expression/activity of a large array of distinct genes. sRNAs often repress, but sometimes activate, protein synthesis by binding to specific mRNA.

SPECIFIC NUCLEASES DIGEST NUCLEIC ACIDS

Enzymes capable of degrading nucleic acids have been recognized for many years. These nucleases can be classified in several ways. Those that exhibit specificity for DNA are referred to as deoxyribonucleases. Those nucleases that specifically hydrolyze RNA are ribonucleases. Some nucleases degrade both DNA and RNA. Within both of these classes are enzymes capable of cleaving internal phosphodiester bonds to produce either 3′-hydroxyl and 5′-phosphoryl termini or 5′-hydroxyl and 3′-phosphoryl termini. These are referred to as endonucleases. Some are capable of hydrolyzing both strands of a double-stranded molecule, whereas others can only cleave single strands of nucleic acids. Some nucleases can hydrolyze only unpaired single strands, while others are capable of hydrolyzing single strands participating in the formation of a double-stranded molecule. There exist classes of endonucleases that recognize specific sequences in DNA. One class of these DNA-cleaving enzymes, the restriction endonucleases, also termed restriction enzymes, do so directly by binding specific (usually) contiguous DNA base pairs (typically 4, 5, 6, or 8 bp), and cleaving both strands of DNA, usually within the binding/recognition sequence element. The second class of enzymes, which are ribonucleoprotein complexes, utilizes a “guide RNA” of specific nucleotide sequence that targets a nuclease to cleave distinct DNA or RNA sequences. These are the CRISPR-Cas family of enzymes. Both classes of DNA-cleaving enzyme are described in greater detail in Chapter 39.

Some nucleases are capable of hydrolyzing a nucleotide only when it is present at a terminus of a molecule; these are referred to as exonucleases. Exonucleases act in one direction (3′ → 5′ or 5′ → 3′) only. In bacteria, a 3′ → 5′ exonuclease is an integral part of the DNA replication machinery and there serves to edit, or proofread, the most recently added deoxynucleotide for base-pairing errors.

In addition to their roles in nucleic acid metabolism in living cells, the nucleases described here, in concert with a panoply of other nucleic acid synthesizing and modifying enzymes, coupled with nucleic acid cloning and sequencing techniques, represent the essential tools of modern molecular genetics and molecular medicine (see Chapter 39).

SUMMARY

DNA consists of four bases (A, G, C, and T) that are held in linear array by phosphodiester bonds through the 3′ and 5′ positions of adjacent deoxyribose moieties.

DNA is organized into two strands by the pairing of bases A to T and G to C on complementary strands. These strands form a double helix around a central axis.

The ~3 × 10^9 bp of DNA in humans are organized into the haploid complement of 23 chromosomes. The exact sequence of these 3 billion nucleotides defines the uniqueness of each individual.

DNA provides a template, both for its own replication and thus maintenance of the genotype, and for the transcription of the roughly 25,000 protein coding human genes as well as a large array of nonprotein coding regulatory ncRNAs.

RNA exists in several different cellular nucleoprotein structures, most of which are directly or indirectly involved in protein synthesis or its regulation. The linear array of nucleotides in RNA consists of A, G, C, and U, and the sugar moiety is ribose.

The major forms of RNA include mRNA, rRNA, tRNA, snRNAs, and regulatory ncRNAs. Certain RNA molecules act as catalysts (ribozymes).