Chapter 15.1: Many Genes Encode Proteins (PDF)
Utah Valley University
Summary
This document describes the one gene-one enzyme hypothesis. It details how George Beadle and Edward Tatum used Neurospora to study the effects of mutations on biochemical pathways. The text explains how the study established a relationship between genes and proteins, providing vital information for understanding genetics and biochemistry.
Full Transcript
## 15.1 Many Genes Encode Proteins

The first person to suggest the existence of a relation between genotype and proteins was English physician Archibald Garrod. In 1908, Garrod correctly proposed that genes encode enzymes, but unfortunately, his theory made little impression on his contemporaries. Not until the 1940s, when George Beadle and Edward Tatum examined the genetic basis of biochemical pathways in the bread mold Neurospora, did the relation between genes and proteins become widely accepted. Beadle and Tatum's work helped define the relation between genotype and phenotype by leading to the one gene, one enzyme hypothesis: the idea that each gene encodes a separate enzyme.

### The One Gene, One Enzyme Hypothesis

Beadle and Tatum used Neurospora to study the biochemical results of mutations. Neurospora is easy to cultivate in the laboratory, and the main vegetative part of the fungus is haploid, which allows the effects of otherwise recessive mutations to be easily observed.

* Wild-type Neurospora grows on a minimal medium, which contains only inorganic salts, nitrogen, a carbon source such as sucrose, and the vitamin biotin. The fungus can synthesize all the biological molecules that it needs from these basic compounds.
* However, mutations may arise that disrupt fungal growth by destroying the fungus's ability to synthesize one or more essential biological molecules. These nutritionally deficient mutants, termed auxotrophs, cannot grow on a minimal medium, but they can grow on a medium that contains the substance that they are no longer able to synthesize.

Beadle and Tatum first irradiated spores of Neurospora to induce mutations. Then they placed the spores in different culture tubes with a complete medium (a medium containing all the biological substances needed for growth). These spores grew into fungi and produced spores by mitosis. Next, they transferred spores from each culture to tubes containing a minimal medium.
Fungi with auxotrophic mutations did not grow on the minimal medium, which allowed Beadle and Tatum to identify cultures that possessed mutations.

### Experiment

What do the effects of genetic mutation on a biochemical pathway tell us about the gene-protein relation?

| Mutant group | Ornithine added | Citrulline added | Arginine added |
|---|---|---|---|
| I | + | + | + |
| II | - | + | + |
| III | - | - | + |

Note: A plus sign (+) indicates growth; a minus sign (-) indicates no growth.

Based on these results, Srb and Horowitz proposed that the biochemical pathway leading to the amino acid arginine has at least three steps:

precursor → ornithine → citrulline → arginine (steps 1, 2, and 3, respectively)

They concluded that the mutations in group I affect step 1 of this pathway, mutations in group II affect step 2, and mutations in group III affect step 3. But how did they know that the order of the compounds in the biochemical pathway was correct? Notice that if step 1 is blocked by a mutation, then the addition of either ornithine or citrulline allows growth because these compounds can still be converted into arginine. Similarly, if step 2 is blocked, the addition of citrulline allows growth, but the addition of ornithine has no effect. If step 3 is blocked, the spores will grow only if arginine is added to the medium. The underlying principle is that an auxotrophic mutant cannot grow on a compound that comes before a step that is blocked by a mutation in the pathway. The mutant can still carry out the steps that come after the block, so it grows when it is supplied with a compound made after the blocked step.

Using this reasoning with the information in the table, we can see that the addition of arginine to the medium allows all three groups of mutants to grow. Therefore, biochemical steps affected by all the mutants precede the step that results in arginine. The addition of citrulline allows group I and group II mutants to grow, but not group III mutants; therefore, group III mutations must affect a biochemical step that takes place after the production of citrulline but before the production of arginine.
The addition of ornithine allows the growth of group I mutants, but not group II or group III mutants; thus, mutations in groups II and III affect steps that come after the production of ornithine. We've already established that group II mutations affect a step before the production of citrulline; so group II mutations must block the conversion of ornithine into citrulline. Because group I mutations affect some step before the production of ornithine, we can conclude that they must affect the conversion of some precursor into ornithine. We can now outline the biochemical pathway yielding ornithine, citrulline, and arginine:

precursor → ornithine (step blocked in group I) → citrulline (step blocked in group II) → arginine (step blocked in group III)

Importantly, this procedure does not necessarily detect all steps in a pathway; rather, it detects only the steps that produce the compounds tested.

Using mutations and this type of reasoning, Beadle, Tatum, and others were able to identify the genes that control several biosynthetic pathways in Neurospora. They established that each step in a pathway is controlled by a different enzyme, as shown in Figure 15.3 for the arginine pathway. In addition, by conducting genetic crosses and mapping experiments, they were able to demonstrate that mutations affecting any one step in a pathway always occurred at the same chromosomal location. Beadle and Tatum reasoned that mutations affecting a particular biochemical step occurred at a single locus that encoded a particular enzyme. This idea became known as the one gene, one enzyme hypothesis: genes function by encoding enzymes, and each gene encodes a separate enzyme.

Although the genes Beadle and Tatum examined encoded enzymes, many genes encode proteins that are not enzymes, so more generally, their idea was that each gene encodes a protein. When research findings showed that some proteins are composed of more than one polypeptide chain and that different polypeptide chains are encoded by separate genes, this model was modified to become the one gene, one polypeptide hypothesis.
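The ordering argument above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' method: it assumes a linear, unbranched pathway and one blocked step per mutant group, and the names `GROWTH`, `order_compounds`, and `order_mutants` are hypothetical. The growth data are the results described in the text.

```python
from typing import Dict, List, Set

# For each mutant group, the supplements that restore growth on minimal
# medium (taken from the results described in the text).
GROWTH: Dict[str, Set[str]] = {
    "I": {"ornithine", "citrulline", "arginine"},
    "II": {"citrulline", "arginine"},
    "III": {"arginine"},
}


def order_compounds(growth: Dict[str, Set[str]]) -> List[str]:
    """Order compounds from earliest to latest in a linear pathway.

    A compound rescues only mutants blocked before it, so the fewer
    mutant groups a compound rescues, the earlier it lies in the pathway.
    """
    compounds = set().union(*growth.values())
    rescued = {c: sum(c in s for s in growth.values()) for c in compounds}
    return sorted(compounds, key=lambda c: (rescued[c], c))


def order_mutants(growth: Dict[str, Set[str]]) -> List[str]:
    """Order mutant groups from the earliest blocked step to the latest.

    A mutant rescued by more compounds is blocked earlier in the pathway.
    """
    return sorted(growth, key=lambda g: (-len(growth[g]), g))


print(order_compounds(GROWTH))  # compounds, earliest first
print(order_mutants(GROWTH))    # mutant groups, earliest block first
```

Counting rescues works only because the pathway is assumed to be linear; a branched pathway would need a different approach.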
### Concept Check 1

Auxotrophic mutation 103 grows on a minimal medium supplemented with A, B, or C; mutation 106 grows on a medium supplemented with A or C, but not B; and mutation 102 grows only on a medium supplemented with C. What is the order of A, B, and C in the biochemical pathway?

*B→A→C*

## The Structure and Function of Proteins

Proteins are central to all living processes. Many proteins are enzymes, the biological catalysts that drive the chemical reactions of the cell; others are structural components, providing scaffolding and support for membranes, filaments, bone, and hair. Some proteins help transport substances; others have a regulatory, communication, or defense function.

### Amino Acids

All proteins are polymers composed of amino acids, linked end to end. Twenty common amino acids are found in proteins; these amino acids are shown in Figure 15.5 with both their three-letter and one-letter abbreviations (other amino acids sometimes found in proteins are modified forms of these common amino acids). All of the common amino acids are similar in structure: each consists of a central carbon atom bonded to an amino group, a hydrogen atom, a carboxyl group, and an R (radical) group that differs for each amino acid. The R groups help determine the chemical properties of the amino acids.

### Protein Structure

Like that of nucleic acids, the molecular structure of proteins has several levels of organization.

* **Primary structure** of a protein is its sequence of amino acids.
* **Secondary structure** of a protein forms when, through interactions between neighboring amino acids, the polypeptide chain folds and twists. Two common secondary structures found in proteins are the beta (β) pleated sheet and the alpha (α) helix.
* **Tertiary structure** of a protein is formed by further folding of the secondary structures.
* **Quaternary structure** exists in proteins that consist of two or more polypeptide chains, which associate to form the complete protein.
* **Domains** are groups of amino acids that each form a discrete functional unit within the protein.

### Concept Check 2

What primarily determines the secondary and tertiary structures of a protein?

*The amino acid sequence (primary structure) of the protein*

## 15.2 The Genetic Code Determines How the Nucleotide Sequence Specifies the Amino Acid Sequence of a Protein

In 1953, James Watson, Francis Crick, Rosalind Franklin, and Maurice Wilkins solved the structure of DNA and identified its base sequence as the carrier of genetic information. However, the way in which the base sequence of DNA specifies the amino acid sequences of proteins (the genetic code) remained elusive for another 10 years.

One of the first questions about the genetic code to be addressed was how many nucleotides are necessary to specify a single amino acid. The set of nucleotides that encodes a single amino acid, the basic unit of the genetic code, is called a codon. Many early investigators recognized that codons must contain a minimum of three nucleotides each. Each nucleotide position in mRNA can be occupied by one of four bases: A, G, C, or U.

- If a codon consisted of a single nucleotide, only four different codons (A, G, C, and U) would be possible, which is not enough to encode the 20 different amino acids commonly found in proteins.
- If codons were made up of two nucleotides each (GU, AC, etc.), there would be 4 × 4 = 16 possible codons, still not enough to encode all 20 amino acids.
- With three nucleotides per codon, there are 4 × 4 × 4 = 64 possible codons, more than enough to specify 20 different amino acids.

Therefore, a triplet code requiring three nucleotides per codon would be the most efficient way to encode all 20 amino acids.
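The counting argument above is easy to check directly. The following sketch enumerates the possible codons of each length and finds the shortest length that can cover 20 amino acids; the function names are hypothetical and chosen only for illustration.

```python
from itertools import product
from typing import List

BASES = "AGCU"  # the four bases that can occupy each mRNA position


def codon_count(length: int) -> int:
    """Number of distinct codons of the given length (4 ** length)."""
    return len(BASES) ** length


def all_codons(length: int) -> List[str]:
    """Enumerate every possible codon of the given length."""
    return ["".join(p) for p in product(BASES, repeat=length)]


def minimum_codon_length(n_amino_acids: int) -> int:
    """Smallest codon length that yields at least n_amino_acids codons."""
    length = 1
    while codon_count(length) < n_amino_acids:
        length += 1
    return length


for n in (1, 2, 3):
    print(n, codon_count(n))
print(minimum_codon_length(20))
```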
Using mutations in bacteriophages, Francis Crick and his colleagues confirmed in 1961 that the genetic code is indeed a triplet code.

### Concept Check 3

A codon is

* a. one of three nucleotides that encode an amino acid.
* b. three nucleotides that encode an amino acid.
* c. three amino acids that encode a nucleotide.
* d. one of four bases in DNA.

*b. three nucleotides that encode an amino acid.*

## Breaking the Genetic Code

Once it had been firmly established that the genetic code consists of codons that are three nucleotides in length, the next step was to determine which groups of three nucleotides specify which amino acids. Logically, the easiest way to break the code would have been to determine the base sequence of a piece of RNA, add it to a test tube containing all the components necessary for translation, and allow it to direct the synthesis of a protein. The amino acid sequence of the newly synthesized protein could then be determined, and its sequence could be compared with that of the RNA. Unfortunately, there was no way at that time to determine the nucleotide sequence of a piece of RNA, so indirect methods were necessary to break the code.

### The Use of Homopolymers

The first clues to the genetic code came in 1961, from the work of Marshall Nirenberg and Johann Heinrich Matthaei. These investigators created synthetic RNAs by using an enzyme called polynucleotide phosphorylase. Unlike RNA polymerase, polynucleotide phosphorylase does not require a template; it randomly links together any RNA nucleotides that happen to be available. The first synthetic mRNAs used by Nirenberg and Matthaei were homopolymers, RNA molecules consisting of a single type of nucleotide. For example, by adding polynucleotide phosphorylase to a solution of uracil nucleotides, they generated RNA molecules that consisted entirely of uracil nucleotides and thus contained only UUU codons.
These poly(U) RNAs were then added to 20 test tubes, each containing the components necessary for translation and all 20 amino acids. A different amino acid was radioactively labeled in each of the 20 tubes. Radioactive protein appeared in only one of the tubes, the one containing labeled phenylalanine. This result showed that the codon UUU specifies the amino acid phenylalanine. The results of similar experiments using poly(C) and poly(A) RNA demonstrated that CCC encodes proline and AAA encodes lysine; for technical reasons, the results from poly(G) were uninterpretable.

### The Use of Random Copolymers

To gain information about additional codons, Nirenberg and his colleagues created synthetic RNAs containing two or three different bases. Because polynucleotide phosphorylase incorporates nucleotides randomly, these RNAs contain random mixtures of the bases and are thus called random copolymers. For example, when adenine and cytosine nucleotides were mixed with polynucleotide phosphorylase, the RNA molecules produced had eight different codons: AAA, AAC, ACC, ACA, CAA, CCA, CAC, and CCC. These poly(AC) RNAs produced proteins containing six different amino acids: asparagine, glutamine, histidine, lysine, proline, and threonine.

The proportions of the different amino acids in the proteins produced depended on the ratio of the two nucleotides used in creating the random copolymers, and the theoretical probability of finding a particular codon could be calculated from the ratios of the bases. If a 4:1 ratio of C to A were used in making the RNA, then the probability of C being in any given position in a codon would be 4/5 and the probability of A being in it would be 1/5. With random incorporation of bases, the probability of any one of the codons with two Cs and one A (CCA, CAC, or ACC) would be 4/5 × 4/5 × 1/5 = 16/125 ≈ 0.13, or 13%, and the probability of any codon with two As and one C (AAC, ACA, or CAA) would be 1/5 × 1/5 × 4/5 = 4/125 = 0.032, or about 3%.
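The codon probabilities worked out above can be reproduced with a short script. This sketch assumes fully random, independent incorporation of bases, the same idealization used in the calculation; the function name `codon_probabilities` is hypothetical.

```python
from fractions import Fraction
from itertools import product
from typing import Dict


def codon_probabilities(base_ratio: Dict[str, int]) -> Dict[str, Fraction]:
    """Expected frequency of each codon in a random copolymer.

    base_ratio maps each base to its relative amount, e.g. {"C": 4, "A": 1}
    for a 4:1 ratio of C to A. Positions are assumed to be independent.
    """
    total = sum(base_ratio.values())
    freq = {base: Fraction(n, total) for base, n in base_ratio.items()}
    probs: Dict[str, Fraction] = {}
    for triple in product(sorted(freq), repeat=3):
        p = Fraction(1)
        for base in triple:
            p *= freq[base]  # multiply the per-position probabilities
        probs["".join(triple)] = p
    return probs


probs = codon_probabilities({"C": 4, "A": 1})
for codon in ("CCA", "CAC", "ACC", "AAC", "ACA", "CAA"):
    print(codon, probs[codon], float(probs[codon]))
```

Exact fractions are used so that the results can be compared directly with 16/125 and 4/125.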
Therefore, an amino acid encoded by two Cs and one A should be more common than an amino acid encoded by two As and one C. By comparing the percentages of amino acids in proteins produced by random copolymers with the theoretical frequencies expected for the codons, Nirenberg and his colleagues could derive information about the base composition of the codons. These experiments revealed nothing, however, about the codon base sequence; histidine was clearly encoded by a codon with two Cs and one A, but whether that codon was ACC, CAC, or CCA was unknown. There were other problems with this method: the theoretical calculations depended on the random incorporation of bases, which did not always occur; furthermore, because the genetic code is redundant, sometimes several different codons specify the same amino acid.

### The Use of Ribosome-Bound tRNAs

To overcome the limitations of random copolymers, Nirenberg and Philip Leder developed another technique in 1964 that used ribosome-bound tRNAs. They found that a very short sequence of mRNA, even one consisting of a single codon, would bind to a ribosome. The codon on the short mRNA would then base pair with the matching anticodon on a tRNA that carried the amino acid specified by the codon. When short mRNAs that were bound to ribosomes were mixed with tRNAs and amino acids, and this mixture was passed through a nitrocellulose filter, the tRNAs that were paired with the ribosome-bound mRNA stuck to the filter, whereas unbound tRNAs passed through it. The advantage of this system was that it could be used with very short synthetic mRNA molecules that could be synthesized with a known sequence. Nirenberg and Leder synthesized more than 50 short mRNAs with known codons and added them individually to a mixture of ribosomes and tRNAs with amino acids. They then isolated the tRNAs that were bound to the mRNAs and ribosomes and determined which amino acids were present on the bound tRNAs.
For example, synthetic mRNA with the codon GUU retained a tRNA to which valine was attached, whereas mRNAs with the codons UGU and UUG did not. Using this method, Nirenberg and his colleagues were able to determine the amino acids encoded by more than 50 codons.

Other experiments provided additional information about the genetic code, and it was fully deciphered by 1968. Let's examine some of the features of the code, which is so important to modern biology that Francis Crick compared its place to that of the periodic table of the elements in chemistry.

### The Degeneracy of the Code

One amino acid is encoded by three consecutive nucleotides in mRNA, and each nucleotide can have one of four possible bases (A, G, C, or U), so there are 4<sup>3</sup> = 64 possible codons. Three of these codons are stop codons, which specify the end of translation, as we'll see shortly. Thus, 61 codons, called sense codons, encode amino acids. Because there are 61 sense codons and only 20 different amino acids commonly found in proteins, the code contains more information than is needed to specify the amino acids and thus is said to be degenerate. This expression does not mean that the genetic code is depraved; degenerate is a term that Francis Crick borrowed from quantum physics, where it describes multiple physical states that have equivalent meaning. The degeneracy of the genetic code means that the code is redundant: amino acids may be specified by more than one codon. Only tryptophan and methionine are encoded by a single codon. Other amino acids are specified by two or more codons, and some, such as leucine, are specified by six different codons. Codons that specify the same amino acid are said to be synonymous codons, just as synonymous words are different words that have the same meaning.

As we learned, tRNAs serve as adapter molecules that bind particular amino acids and deliver them to a ribosome, where the amino acids are then assembled into polypeptide chains.
Each type of tRNA attaches to a single type of amino acid. The cells of most organisms possess about 30 to 50 different tRNAs, yet there are only 20 different amino acids commonly found in proteins. Thus, some amino acids are carried by more than one tRNA. Different tRNAs that accept the same amino acid but have different anticodons are called isoaccepting tRNAs.

Even though some amino acids can pair with multiple (isoaccepting) tRNAs, there are still more codons than anticodons. One anticodon can pair with different codons through flexibility in base pairing at the third position of the codon. Examination of Figure 15.10 reveals that many synonymous codons differ only in the third position. For example, serine is encoded by the codons UCU, UCC, UCA, and UCG, all of which begin with UC. When the codon of the mRNA and the anticodon of the tRNA join, the first (5') base of the codon forms hydrogen bonds with the third (3') base of the anticodon, strictly according to the Watson-and-Crick base-pairing rules: A with U; C with G. Next, the middle bases of codon and anticodon pair, also strictly following the Watson-and-Crick rules. After these pairs have bonded, the third bases pair weakly, and there may be flexibility, or wobble, in their pairing.

In 1966, Francis Crick developed the wobble hypothesis, which proposed that some nonstandard pairings of bases could take place at the third position of a codon. For example, a G in the anticodon may pair with either a C or a U in the third position of a codon. The important thing to remember about wobble is that it allows some tRNAs to pair with more than one mRNA codon; thus, from 30 to 50 tRNAs can pair with 61 sense codons. Some codons are synonymous through wobble.

### Concept Check 4

Through wobble, a single ___ can pair with more than one ___.

*anticodon, codon*

## The Reading Frame and Initiation Codons

Findings from early studies of the genetic code indicated that the code is generally nonoverlapping.
An overlapping code would be one in which a single nucleotide might be included in more than one codon; for example, the sequence AUACGA could be read as the overlapping codons AUA, UAC, ACG, and CGA rather than as the two nonoverlapping codons AUA and CGA. Usually, however, each nucleotide is part of a single codon. A few overlapping genes are found in viruses, but codons within the same gene do not overlap, and the genetic code is generally considered to be nonoverlapping.

For any sequence of nucleotides, there are three potential sets of codons, that is, three ways in which the sequence can be read in groups of three. Each different way of reading the sequence is called a reading frame. The three reading frames have completely different sets of codons and therefore specify proteins with entirely different amino acid sequences. Thus, it is essential for the translation machinery to use the correct reading frame. How is the correct reading frame established? The reading frame is set by the initiation codon (or start codon), which is the first codon of the mRNA to specify an amino acid. After the initiation codon, the other codons are read as successive groups of three nucleotides. No bases are skipped between the codons, so there are no punctuation marks to separate the codons.

The initiation codon is most often AUG, although GUG, UUG, and other codons are also sometimes used. The initiation codon is not just a sequence that marks the beginning of translation; it also specifies an amino acid. In bacterial cells, the first AUG encodes a modified type of methionine, N-formylmethionine; thus, all proteins in bacteria initially begin with this amino acid, but its formyl group (or, in some cases, the entire amino acid) may be removed after the protein has been synthesized. When the codon AUG is at an internal position in a gene, it encodes unformylated methionine. In archaeal and eukaryotic cells, AUG specifies unformylated methionine, both at the initiation position and at internal positions.
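The idea that a nucleotide sequence has three reading frames, and that the initiation codon selects one of them, can be illustrated with a short sketch. The example sequence and the function names are hypothetical, and treating the first AUG in the string as the initiation codon is a simplifying assumption made only for this illustration.

```python
from typing import List


def reading_frames(seq: str) -> List[List[str]]:
    """Split seq into complete codons in each of its three reading frames."""
    frames = []
    for offset in range(3):
        codons = [seq[i:i + 3] for i in range(offset, len(seq) - 2, 3)]
        frames.append(codons)
    return frames


def codons_from_start(seq: str, start: str = "AUG") -> List[str]:
    """Read successive codons beginning at the first occurrence of start.

    Assumption for illustration: the first AUG found is the initiation codon.
    """
    pos = seq.find(start)
    if pos == -1:
        return []
    return [seq[i:i + 3] for i in range(pos, len(seq) - 2, 3)]


mrna = "CAUGGCUUAAG"  # hypothetical sequence, not taken from the text
for number, frame in enumerate(reading_frames(mrna), start=1):
    print(number, frame)
print(codons_from_start(mrna))
```

In this example the second frame is the one that begins with AUG, so reading from the initiation codon picks out that frame and ignores the other two.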
In both bacteria and eukaryotes, there are different tRNAs for the initiator methionine (designated tRNA<sub>f</sub><sup>Met</sup> in bacteria and tRNA<sub>i</sub><sup>Met</sup> in eukaryotes) and internal methionine (designated tRNA<sup>Met</sup>).

## Termination Codons

Three codons (UAA, UAG, and UGA) do not encode amino acids. These codons, which signal the end of translation in both bacterial and eukaryotic cells, are called stop codons, termination codons, or nonsense codons. No tRNAs have anticodons that pair with termination codons.

### The Universality of the Code

For many years, the genetic code was assumed to be universal, meaning that each codon specifies the same amino acid in all organisms. We now know that the genetic code is mostly, but not completely, universal; some exceptions have been found. Most of these exceptions are termination codons, but there are a few cases in which one sense codon substitutes for another. Many of these exceptions are found in mitochondrial genes; some nonuniversal codons have also been detected in the nuclear genes of protozoans and in bacterial DNA. One study of bacteria and bacteriophages isolated from 1776 environmental samples found nonuniversal codons in a substantial fraction, suggesting that nonuniversal codons may be more common than previously thought.

### Concept Check 5

Do the initiation and termination codons specify amino acids? If so, which ones?

*The initiation codon in bacteria encodes N-formylmethionine; in eukaryotes, it encodes methionine. Termination codons do not specify amino acids.*

## 15.3 Amino Acids Are Assembled Into a Protein Through Translation

Now that we are familiar with the genetic code, we can begin to study how amino acids are assembled into proteins. Because more is known about translation in bacteria than in eukaryotes, we will focus primarily on bacterial translation. In most respects, eukaryotic translation is similar, although some significant differences will be noted. Remember that only mRNAs are translated into proteins.
Translation takes place on ribosomes; indeed, ribosomes can be thought of as moving protein-synthesizing machines. Through a variety of techniques, a detailed view of the structure of the ribosome has been produced in recent years, which has greatly improved our understanding of translation. A ribosome attaches near the 5' end of an mRNA strand and moves toward the 3' end, translating the codons as it goes. Synthesis begins at the amino end of the protein, and the protein is elongated by the addition of new amino acids to the carboxyl end. Protein synthesis includes a series of RNA-RNA interactions: interactions between the mRNA and the rRNA that holds the mRNA in the ribosome, between the codon on the mRNA and the anticodon on the tRNA, and between the tRNA and the rRNAs of the ribosome.

Protein synthesis can be conveniently divided into four stages:

1. tRNA charging, in which tRNAs bind to amino acids.
2. Initiation, in which the components necessary for translation are assembled at the ribosome.
3. Elongation, in which amino acids are joined, one at a time, to the growing polypeptide chain.
4. Termination, in which protein synthesis halts at the termination codon and the translation components are released from the ribosome.

### The Binding of Amino Acids to Transfer RNAs

The first stage of translation is the binding of tRNA molecules to their appropriate amino acids. As we have seen, each tRNA is specific for a particular amino acid. All tRNAs have the sequence CCA at the 3' end, and the carboxyl group (COO<sup>-</sup>) of the amino acid is attached to the adenine nucleotide at the 3' end of the tRNA. If each tRNA is specific for a particular amino acid but all amino acids are attached to the same nucleotide (A) at the 3' end of a tRNA, how does a tRNA link up with its appropriate amino acid? The key to specificity between an amino acid and its tRNA is a set of enzymes called aminoacyl-tRNA synthetases.
A cell has 20 different aminoacyl-tRNA synthetases, one for each of the 20 amino acids. Each synthetase recognizes a particular amino acid as well as all the tRNAs that accept that amino acid. Its recognition of the appropriate amino acid is based on the different sizes, charges, and R groups of the amino acids. Its recognition of the appropriate tRNAs depends on the nucleotide sequences of the tRNAs. Researchers have identified which nucleotides are important in recognition by synthetases by altering different nucleotides in a particular tRNA and determining whether the altered tRNA is still recognized by its synthetase.

The attachment of a tRNA to its appropriate amino acid, termed tRNA charging, requires energy, which is supplied by adenosine triphosphate (ATP):

*amino acid + tRNA + ATP → aminoacyl-tRNA + AMP + PP<sub>i</sub>*

This reaction takes place in two steps. To identify the resulting aminoacylated (charged) tRNA, we write the three-letter abbreviation for the amino acid in front of the tRNA; for example, the amino acid alanine (Ala) attaches to its tRNA (tRNA<sup>Ala</sup>), giving rise to its aminoacyl-tRNA (Ala-tRNA<sup>Ala</sup>).

### Concept Check 6

Amino acids bind to which part of the tRNA?

*3' end*

## The Initiation of Translation

The second stage in the process of protein synthesis is initiation. At this stage, all the components necessary for protein synthesis assemble:

1. mRNA.
2. The small and large subunits of the ribosome.
3. A set of three proteins called initiation factors.
4. Initiator tRNA with N-formylmethionine attached.
5. Guanosine triphosphate (GTP).

Initiation comprises three major steps.

1. mRNA binds to the small subunit of the ribosome.
2. Initiator tRNA binds to the mRNA through base pairing between the initiation codon and the anticodon.
3. The large ribosomal subunit joins the initiation complex.

### Initiation in Bacteria

The functional ribosome of bacteria exists as two subunits, the small 30S subunit and the large 50S subunit.
An mRNA molecule can bind to the small ribosomal subunit only when the subunits are separate. Initiation factor 3 (IF-3) binds to the small ribosomal subunit and prevents the large subunit from binding during initiation. Another factor, initiation factor 1 (IF-1), enhances the dissociation of the large and small ribosomal subunits.

Where on the mRNA does the ribosome bind during initiation of translation? Key sequences on the mRNA required for ribosome binding have been identified by techniques designed to allow a ribosome to bind to mRNA, but not to proceed with protein synthesis; the ribosome is thereby stalled at the initiation site where binding occurs. A ribonuclease is added, which degrades all the mRNA except the region covered by the ribosome. The intact mRNA can then be separated from the ribosome and studied. The sequence covered by the ribosome during initiation is 30 to 40 nucleotides long and includes the AUG initiation codon. Within the ribosome-binding site is the Shine-Dalgarno sequence, a consensus sequence that is complementary to a sequence of nucleotides at the 3' end of 16S rRNA (part of the small ribosomal subunit). During initiation, the nucleotides in the Shine-Dalgarno sequence pair with their complementary nucleotides in the 16S rRNA, allowing the small ribosomal subunit to attach to the mRNA and positioning the ribosome directly over the initiation codon. These ribosome-binding sequences are within the 5' untranslated region of the mRNA.

The initiator tRNA, fMet-tRNA<sub>f</sub><sup>Met</sup>, attaches to the initiation codon. This attachment requires initiation factor 2 (IF-2), which forms a complex with GTP. At this point, the initiation complex consists of (1) the small ribosomal subunit, (2) the mRNA, (3) the initiator tRNA with its amino acid (fMet-tRNA<sub>f</sub><sup>Met</sup>), (4) one molecule of GTP, and (5) several initiation factors. These components are collectively known as the 30S initiation complex.
In the final step of initiation, IF-3 dissociates from the small subunit, allowing the large ribosomal subunit to join the initiation complex. The molecule of GTP (provided by IF-2) is hydrolyzed to guanosine diphosphate (GDP), and the initiation factors dissociate from the complex. When the large subunit has joined the initiation complex, the complex is called the 70S initiation complex.

### Initiation in Eukaryotes

Similar events take place in the initiation of translation in eukaryotic cells, but there are some important differences. In bacterial cells, sequences in 16S rRNA of the small ribosomal subunit bind to the Shine-Dalgarno sequence in mRNA. No analogous consensus sequence exists in eukaryotic mRNA. Instead, the cap at the 5' end of eukaryotic mRNA plays a critical role in the initiation of translation. In a series of steps, the small subunit of the eukaryotic ribosome, initiation factors, and the initiator tRNA with its amino acid (Met-tRNA<sub>i</sub><sup>Met</sup>) form an initiation complex that recognizes the cap and binds there. The initiation complex then moves along (scans) the mRNA until it locates the first AUG codon. The identification of the start codon is facilitated by the presence of a consensus sequence (called the Kozak sequence) that surrounds the start codon. Once the initiation codon is reached, scanning ceases and the anticodon on initiator tRNA pairs with the start codon on the mRNA. Initiation factors are then released, and the large subunit of the ribosome joins the small subunit to create the functional ribosome.

Another important difference is that eukaryotic initiation requires at least 12 initiation factors. Some of these factors keep the ribosomal subunits separated, just as IF-3 does in bacterial cells. Others recognize the 5' cap on the mRNA and allow the small ribosomal subunit to bind there.
Still others possess RNA helicase activity, which is used to unwind secondary structures that may exist in the 5' untranslated region of the mRNA, allowing the small subunit to move down the mRNA until the initiation codon is reached. Other initiation factors help bring Met-tRNA<sub>i</sub><sup>Met</sup> to the initiation complex.

In eukaryotes, the 5' cap is initially bound by several proteins, one of which is the cap-binding complex (CBC). The CBC aids in exporting the mRNA from the nucleus and then promotes the "pioneer," or initial, round of translation in the cytoplasm. This first round of translation plays an important role in checking for errors in the mRNA (see Messenger RNA Surveillance in Section 15.4). After the pioneer round of translation, the CBC is replaced by eukaryotic initiation factor 4E (eIF-4E), which promotes continued translation of the mRNA.

The poly(A) tail at the 3' end of eukaryotic mRNA also plays a role in the initiation of translation. During initiation, proteins that attach to the poly(A) tail interact with proteins that bind to the 5' cap, enhancing the binding of the small ribosomal subunit to the 5' end of the mRNA. This interaction indicates that the 3' end of mRNA bends over and associates with the 5' cap during the initiation of translation, forming a circular structure known as a closed loop. A few eukaryotic mRNAs contain internal ribosome entry sites, where ribosomes can bind directly without first attaching to the 5' cap. Furthermore, some uncapped mRNAs are translated through the binding of initiation factors and ribosomes to modified adenine nucleotides (N<sup>6</sup>-methyladenosine) in the mRNA.

Recent research that maps where ribosomes initiate translation on the mRNA has revealed surprising variation in translation initiation sites. Ribosomes frequently initiate translation at CUG and other non-AUG codons.
In addition, a number of eukaryotic genes have short reading frames in the 5' UTR of the mRNA upstream of the standard AUG start site (called upstream open reading frames, or uORFs) that may be translated into proteins, which then affect translation of downstream genes.

### Concept Check 7

During the initiation of translation in bacteria, the small ribosomal subunit binds to which consensus sequence?

*The Shine-Dalgarno sequence*

## Elongation

The next stage in protein synthesis is elongation, in which amino acids are joined to create a polypeptide chain. Elongation requires (1) the 70S initiation complex just described, (2) tRNAs charged with their amino acids, (3) several elongation factors, and (4) GTP.

A ribosome has three sites that can be occupied by tRNAs: the aminoacyl (A) site, the peptidyl (P) site, and the exit (E) site. The initiator tRNA immediately occupies the P site (the only site to which the fMet-tRNAᶠᴹᵉᵗ is able to bind), but all other tRNAs first enter the A site. At the end of initiation, the ribosome is attached to the mRNA, and fMet-tRNAᶠᴹᵉᵗ is positioned over the AUG start codon in the P site; the adjacent A site is unoccupied.

Elongation takes place in three steps.

1. A charged tRNA binds to the A site. This binding takes place when elongation factor Tu (EF-Tu) joins with GTP and then with a charged tRNA to form a three-part complex. This complex enters the A site of the ribosome, where the anticodon on the tRNA pairs with the codon on the mRNA. Once the charged tRNA is in the A site, GTP is cleaved to form GDP, and the EF-Tu-GDP complex is released. Elongation factor Ts (EF-Ts) regenerates EF-Tu-GDP to EF-Tu-GTP. In eukaryotic cells, a similar set of reactions delivers a charged tRNA to the A site.
2. The second step of elongation is the formation of a peptide bond between the amino acids that are attached to tRNAs in the P and A sites. The formation of this peptide bond releases the amino acid in the P site from its tRNA.
Peptide-bond formation occurs within the large ribosomal subunit. The catalytic activity that creates the peptide bond is a property of one of the rRNA components of the large subunit (the 23S rRNA in bacteria, the 28S rRNA in eukaryotes); this rRNA acts as a ribozyme.
3. The third step in elongation is translocation, the movement of the ribosome down the mRNA in the 5'→3' direction. This step, which positions the ribosome over the next codon, requires elongation factor G (EF-G) and the hydrolysis of GTP to GDP. Because the tRNAs in the P and A sites are still attached to the mRNA by codon-anticodon pairing, they do not move with the ribosome as it translocates. Consequently, the ribosome shifts so that the tRNA that previously occupied the P site now occupies the E site, from which it then moves into the cytoplasm, where it can be recharged with another amino acid. Translocation also causes the tRNA that occupied the A site (which is attached to the growing polypeptide chain) to occupy the P site, leaving the A site open.

Thus, the progress of each tRNA through the ribosome in the course of elongation can be summarized as follows: cytoplasm → A site → P site → E site → cytoplasm. As stated earlier, the initiator tRNA is an exception: it attaches directly to the P site and never occupies the A site.

After translocation, the A site of the ribosome is empty and ready to receive the tRNA specified by the next codon. The elongation cycle repeats itself: a charged tRNA and its amino acid occupy the A site, a peptide bond is formed between the amino acids in the A and P sites, and the ribosome translocates to the next codon. Throughout the cycle, the polypeptide chain remains attached to the tRNA in the P site.

Another protein, called elongation factor P (EF-P), enhances the translation of proteins that contain consecutive copies of the amino acid proline. If EF-P is absent, ribosomes often stall during the translation of such polyproline-containing proteins.
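
The three-step elongation cycle can be traced with a small bookkeeping sketch. The Python below is a toy model, not a biochemical simulation: the function name `elongate`, the five-entry `CODON_TABLE`, and the choice to label each tRNA by the codon it pairs with are all assumptions made for illustration. It also assumes, as a simplification, that the uncharged tRNA leaves the E site just as the next charged tRNA arrives.

```python
# Toy model of bacterial elongation: tracks which tRNA (labeled by its codon)
# sits in the E, P, and A sites after each step. Illustrative only.

# Small subset of the genetic code, enough for the example below.
CODON_TABLE = {"AUG": "Met", "UUU": "Phe", "GGC": "Gly", "CCG": "Pro", "AAA": "Lys"}


def elongate(codons):
    """Return (peptide, history) for a reading frame that starts with AUG.

    history is a list of snapshots of the three ribosomal sites.
    """
    if not codons or codons[0] != "AUG":
        raise ValueError("expected the reading frame to begin with AUG")

    # End of initiation: the initiator tRNA is in the P site; A and E are empty.
    sites = {"E": None, "P": codons[0], "A": None}
    peptide = ["fMet"]  # bacterial initiator amino acid
    history = [dict(sites)]

    for codon in codons[1:]:
        # Step 1: a charged tRNA enters the A site.
        # Simplification: any tRNA left in the E site departs at this point.
        sites["A"] = codon
        sites["E"] = None
        history.append(dict(sites))

        # Step 2: peptide bond forms; the chain is now carried by the A-site tRNA.
        peptide.append(CODON_TABLE[codon])

        # Step 3: translocation shifts P -> E and A -> P, leaving A open.
        sites["E"], sites["P"], sites["A"] = sites["P"], sites["A"], None
        history.append(dict(sites))

    return peptide, history


if __name__ == "__main__":
    peptide, history = elongate(["AUG", "UUU", "GGC"])
    print("-".join(peptide))
    for snapshot in history:
        print(snapshot)
```

Reading the printed snapshots in order shows each non-initiator tRNA passing through A, then P, then E, while the initiator tRNA appears only in the P and E sites, matching the summary above.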
Researchers have developed methods for following a single ribosome as it translates individual codons of an mRNA molecule. These studies have revealed that translation does not take place in a smooth, continuous fashion. Each translocation of the ribosome typically requires less than a tenth of a second, but sometimes there are distinct pauses, often lasting a few seconds, between translocations. Thus, translation takes place in a series of quick translocations interrupted by brief pauses. In addition to the short pauses between translocation events, translation may be interrupted by longer pauses, lasting from 1 to 2 minutes, which may play a role in regulating the process of translation.

Elongation in eukaryotic cells takes place in a manner similar to that in bacteria. Eukaryotes possess at least three elongation factors, one of which also acts in initiation and termination. Another of these elongation factors, called eukaryotic elongation factor 2 (eEF-2), is the target of a toxin produced by the bacterium that causes diphtheria, a disease that until recently was a leading killer of children. The diphtheria toxin inhibits eEF-2, preventing the translocation of the ribosome along the mRNA, and protein synthesis ceases.

### Concept Check 8

In elongation, the creation of peptide bonds between amino acids is catalyzed by ___.

*rRNA*

## Termination

Protein synthesis ends when the ribos