Chapter on Proteomes BCH311 PDF
Summary
This chapter on proteomes explores the methodology used for proteomics research (protein profiling and expression proteomics) and separation techniques like PAGE and column chromatography. It describes how proteomes are studied, focusing on the synthesis, degradation, and processing of proteins in a cell.
CHAPTER 13 PROTEOMES

13.1 STUDYING THE COMPOSITION OF A PROTEOME
13.2 IDENTIFYING PROTEINS THAT INTERACT WITH ONE ANOTHER
13.3 SYNTHESIS AND DEGRADATION OF THE COMPONENTS OF THE PROTEOME
13.4 INFLUENCE OF PROTEIN PROCESSING ON THE COMPOSITION OF THE PROTEOME
13.5 BEYOND THE PROTEOME

The proteome is the collection of protein molecules present in a cell. The proteome is therefore the final link between the genome and the biochemical capability of the cell, and characterization of the proteomes of different cells is one of the keys to understanding how the genome operates and how dysfunctional genome activity can lead to diseases. Transcriptome studies can address these issues only in part. Examination of the transcriptome gives an accurate indication of which genes are active in a particular cell but gives a less accurate indication of the proteins that are present. This is because the factors that influence protein content include not only the amount of mRNA that is available but also the rate at which the mRNAs are translated into protein and the rate at which the proteins are degraded. Additionally, the protein that is the initial product of translation may not be active, as some proteins must undergo physical and/or chemical modification before becoming functional. Determining the amount of the active form of a protein is therefore critical to understanding the biochemistry of a cell or tissue.

The issues that we must examine regarding the proteome are very similar to the issues that interested us in Chapter 12 when we studied transcriptomes. First, we will explore the various methods that are used to catalog the components of a proteome and to understand how a proteome functions within a cell. Then we must study the events involved in synthesis, degradation, and processing of the components of a proteome. Finally, we will examine more closely the link between the proteome and the biochemistry of the cell.
13.1 STUDYING THE COMPOSITION OF A PROTEOME

The methodology used to study proteomes is called proteomics. Strictly speaking, proteomics is a collection of diverse techniques that are related only in their ability to provide information on a proteome. That information encompasses not only the identities of the constituent proteins that are present but also factors such as the functions of individual proteins and their localization within the cell. The particular technique that is used to study the composition of a proteome is called protein profiling or expression proteomics. Protein profiling is usually carried out in two stages: first, the individual proteins in the proteome are separated from one another; second, the separated proteins are identified by mass spectrometry.

This basic format encompasses two different approaches called top-down and bottom-up proteomics (Figure 13.1). The difference is that in top-down proteomics, individual proteins are directly examined by mass spectrometry, whereas in bottom-up proteomics, the proteins are broken into peptides by treatment with a sequence-specific protease, such as trypsin, prior to mass spectrometry.

Figure 13.1 Top-down and bottom-up proteomics. In top-down proteomics, the intact protein is examined by mass spectrometry. In bottom-up proteomics, peptides derived from the protein are examined. In this example, peptides are obtained by treating the protein with trypsin, which cuts immediately after arginine or lysine amino acids.

The separation stage of a protein profiling project

The first stage of a profiling project is the separation of the proteome into its constituent proteins. How difficult this is depends on the complexity of the proteome. Separation of the 10,000–20,000 proteins in some mammalian proteomes requires more sophisticated methods than are needed for the less complex proteomes of bacteria or mammalian cell fractions (for example, mitochondria), which might contain fewer than 1000 proteins.
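The trypsin cleavage rule used in bottom-up proteomics (a cut immediately after each arginine or lysine, as stated for Figure 13.1) can be sketched in a few lines of Python. This is a simplified in silico digest, not a model of real enzyme behavior: the function name and the example sequence are invented for illustration, and refinements such as missed cleavages or reduced cutting before proline are deliberately ignored.

```python
def trypsin_digest(sequence: str) -> list[str]:
    """Split a protein sequence (one-letter codes) after every K or R.

    Simplifying assumption: every lysine (K) and arginine (R) is cut,
    with no missed cleavages and no special handling of proline.
    """
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        if residue in "KR":
            # The cut falls immediately after the K or R residue.
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        # Whatever remains after the last cut is the C-terminal peptide.
        peptides.append(sequence[start:])
    return peptides


# Hypothetical 15-residue sequence, used only to show the call.
print(trypsin_digest("MKWVTFISLLRGAEK"))
```

Each peptide returned by such a digest is what a bottom-up experiment would present to the mass spectrometer, whereas a top-down experiment would analyze the undigested sequence.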
The choice of separation technique is therefore dictated in part by the complexity of the proteome that is being studied.

Polyacrylamide gel electrophoresis (PAGE) is the standard method for separating the proteins in a complex mixture. Depending on the composition of the gel and the conditions under which the electrophoresis is carried out, different chemical and physical properties of proteins can be used as the basis for their separation. One technique makes use of the detergent called sodium dodecyl sulfate (SDS), which denatures proteins and confers a negative charge that is roughly proportional to the length of the unfolded polypeptide. Under these conditions, the proteins separate according to their molecular masses, the smallest proteins migrating more quickly toward the positive electrode. Alternatively, proteins can be separated by isoelectric focusing in a gel containing chemicals that establish a pH gradient. Each protein migrates to its isoelectric point, the position in the gradient where its net charge is zero.

When a complex proteome is being studied, the two versions of PAGE are often combined in two-dimensional gel electrophoresis. In the first dimension, the proteins are separated by isoelectric focusing. The gel is then soaked in sodium dodecyl sulfate and rotated by 90°, and a second electrophoresis, separating the proteins according to their sizes, is carried out at a right angle to the first (Figure 13.2). This approach can separate several thousand proteins in a single gel, the proteins revealed as a complex pattern of spots when the gel is stained (Figure 13.3). Individual spots can then be cut out of the gel so that the proteins they contain can be purified. Cutting out 20,000 spots would clearly be a laborious process, and in practice two-dimensional PAGE is not used if the aim is to catalog an entire complex proteome. It is better suited to isolating particular proteins of interest, such as those that have different abundances in two or more related proteomes: the healthy and diseased versions of a tissue, for example.

An alternative approach to protein separation by PAGE is provided by column chromatography.
This method involves passing the protein mixture through a column packed with a solid matrix. The proteins in the mixture move through the matrix at different rates and so become separated into bands. The solution emerging from the column can then be collected as a series of fractions, with each individual protein present in a different fraction (Figure 13.4). The identity of the solid phase (the matrix or resin) and the composition of the mobile phase (the liquid used to move the proteins through the column) specify which of the variable properties of the proteins forms the basis of their separation. In protein profiling, the two most commonly used types of column chromatography are as follows:

Figure 13.2 Two-dimensional gel electrophoresis. Panels: load the protein sample; first electrophoresis; rotate; second electrophoresis.

Figure 13.3 Result of two-dimensional gel electrophoresis. Mouse liver proteins have been separated by isoelectric focusing in the pH 5–6 range in the first dimension and according to molecular mass in the second dimension. The protein spots have been visualized by staining with silver solution. (From Görg A, Obermaier C, Boguth G et al. Electrophoresis 21:1037–1053. With permission from John Wiley & Sons, Inc.)

In reverse-phase liquid chromatography (RPLC), the solid phase is a matrix of silica particles whose surfaces are covered with nonpolar chemical groups such as hydrocarbons. The mobile phase is a mixture of water and an organic solvent such as methanol or acetonitrile. Most proteins have hydrophobic areas on their surfaces, which bind to the nonpolar matrix, but the stability of this attachment decreases as the organic content of the liquid phase increases. Gradually changing the ratio of the aqueous and organic components of the mobile phase therefore results in the elution of proteins according to their degree of surface hydrophobicity.
Ion-exchange chromatography separates proteins according to their net electric charges. The matrix consists of polystyrene beads that carry either positive or negative charges: if the beads are positively charged, proteins with a net negative charge will bind to them, and vice versa. The proteins can be eluted with a salt gradient, set up by gradually increasing the salt concentration of the buffer being passed through the column. The charged salt ions compete with the proteins for the binding sites on the resin, so proteins with low charges are eluted at low salt concentration, and those with higher charges are eluted at higher salt concentrations. The salt gradient therefore separates proteins according to their net charges. Alternatively, a pH gradient can be used. The net charge of a protein depends on the pH and, as described above, is zero at the pH corresponding to that protein's isoelectric point. Gradually changing the pH of the mobile phase will result in the elution of proteins with different isoelectric points, again achieving their separation.

Figure 13.4 Column chromatography. The proteins separate into bands as they move through the column. In practice, tens or hundreds of proteins can be separated in this way.

Compared with two-dimensional PAGE, column chromatography is less laborious to carry out and has the advantage that individual proteins can be collected as they elute from the column, avoiding the postseparation purification step needed to obtain a protein from its spot on a polyacrylamide gel. Column chromatography is usually carried out in a capillary tube with an internal diameter of less than 1 mm, with the liquid phase being pumped at high pressure. This procedure, called high-performance liquid chromatography (HPLC), has a high resolving power and enables proteins with very similar chromatographic properties to be separated (Figure 13.5). To increase the resolving power, different types of chromatography column can be linked together, with each consecutive fraction from one column being fed into a second column, in which a further round of separation by a different procedure is carried out (Figure 13.6A), so that the proteins can be fully separated. Alternatively, proteins can initially be separated by one-dimensional PAGE, using either the SDS version or isoelectric focusing, and the resulting gel cut into segments (Figure 13.6B). The set of proteins present in each segment is then entered sequentially into the column chromatography system.

Figure 13.5 High-performance liquid chromatography (HPLC). The diagram shows a typical HPLC apparatus. The protein mixture is injected and pumped through the column along with the mobile-phase solution. Proteins are detected as they elute from the column, usually by measuring UV absorbance at 210–220 nm. The fractions that are collected can be of equal volume, or the data from the detector can be used to control the fractionation so that each protein peak is collected as a single sample of minimum volume.

Instead of collecting protein fractions as they emerge from the column (the offline mode), the column can be directly attached to the mass spectrometer. Each protein is therefore analyzed by the spectrometer as it elutes from the column (Figure 13.6C). This online mode cannot be used with the standard version of bottom-up proteomics, because each protein must be treated with a protease to cut it into fragments prior to injection into the mass spectrometer. The online mode is, however, possible with the modification of the bottom-up approach called shotgun proteomics. With this method, the proteins are treated with the protease before the column chromatography step. These proteins could be the entire proteome, if it is not overly complex, or the mixture obtained from a segment of a one-dimensional gel. The column chromatography step then separates peptides rather than proteins, and the online mode is used to inject the eluted molecules directly into the mass spectrometer.

Figure 13.6 Three configurations for the separation phase of protein profiling. (A) Two chromatography columns are linked in series. (B) Fractions from polyacrylamide gel electrophoresis (PAGE) are entered into the chromatography column. (C) Online mode with the chromatography column directly linked to the mass spectrometer. Combinations of these three formats are also possible.

The identification stage of a protein profiling project

Separation of the components of a proteome is followed by identification of the individual proteins, either directly, in top-down proteomics, or from the peptides resulting from proteolytic cleavage, if a bottom-up method is being used. Identifying the proteins was once a difficult task, but advances in mass spectrometry, driven in part by the requirements of proteomics, have largely solved this problem.

Mass spectrometry is a means of identifying a compound from the mass-to-charge ratio (designated m/z) of the ions that are produced when molecules of the compound are exposed to a high-energy field. The first type of mass spectrometry to be used widely in protein profiling was matrix-assisted laser desorption ionization time-of-flight (MALDI-TOF). This technique forms the basis for peptide mass fingerprinting, the bottom-up procedure that identifies individual peptides in the mixture obtained by protease cleavage of a protein purified by one of the separation methods described above. The first step is ionization of the peptides. This is achieved by absorbing the mixture into an organic crystalline matrix, usually made of a phenylpropanoid compound called sinapinic acid, which is excited with a UV laser. The excitation initially ionizes the matrix, with protons then donated to or removed from the peptide molecules to give the molecular ions [M + H]+ and [M − H]−, respectively, where M is the peptide molecule. The ions are then accelerated along the tube of the mass spectrometer by an electric field. The flight path can be direct from the ionization source to a detector, but often the ions are initially directed at a reflectron, which reflects the ion beam toward the detector (Figure 13.7). As well as enabling a longer flight path to be built into a machine of a defined size, the reflectron acts as a focusing device, ensuring that all ions with the same m/z value reach the detector at the same time, even if they leave the source with slightly different energies. This is critical because a time-of-flight spectrometer uses the time that an ion takes to reach the detector in order to calculate the mass-to-charge ratio for that ion. As the charge is always +1 or −1, the time of flight can easily be converted to a molecular mass, which in turn allows the amino acid composition of the peptide to be deduced. If the peptides from a single protein spot in a two-dimensional gel are analyzed, then the resulting compositional information can be related to the genome sequence in order to identify the gene that specifies that protein.

Figure 13.7 Use of MALDI-TOF in protein profiling. (A) MALDI-TOF mass spectrometry. In the matrix-assisted laser desorption ionization time-of-flight (MALDI-TOF) mass spectrometer, the peptides are ionized by a pulse of energy from a laser and then accelerated down the column to the reflectron and onto the detector. The time of flight of each peptide depends on its mass-to-charge ratio. (B) MALDI-TOF spectrum (relative intensity plotted against mass-to-charge ratio). The data are visualized as a spectrum indicating the m/z values of the peptides. The computer converts the m/z values into molecular masses and compares these masses with the predicted masses of all the peptides that would be obtained by protease treatment of all the proteins encoded by the genome of the organism under study. The protein that gave rise to the detected peptides can therefore be identified.

The amino acid compositions of the peptides derived from a single protein can also be used to check that the gene sequence is correct and, in particular, to ensure that exon–intron boundaries have been correctly located. This not only helps to ensure that the genome annotation is accurate but also allows alternative splicing pathways to be identified in cases where two or more proteins are derived from the same gene.

MALDI-TOF is less well suited to shotgun proteomics, because the larger number of peptides that are produced when a mixture of proteins is treated with a protease increases the possibility that two peptides will have similar m/z values and hence be indistinguishable when examined by this method. Two innovations have improved the resolution of peptide mass spectrometry and hence provide support for the shotgun methods. The first of these is the use of electrospray ionization, which can be performed online between HPLC and mass spectrometry. A high voltage is applied to the solution emerging from the HPLC, generating an aerosol of charged droplets that evaporate, transferring their charges to the peptides dissolved within them. The advantage of this ionization method is that, as well as the [M + H]+ and [M − H]− molecular ions, each with a single ionized group, multiple ionized molecules such as [M + nH]n+ are also obtained. Generation of multiple ions with different m/z values from individual peptides increases the amount of information that can be used to infer the composition of that peptide.

The second innovation is to break peptides down into smaller fragments within the mass spectrometer.
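The series of multiply charged ions produced by electrospray ionization can be illustrated with a short calculation. The sketch below is a minimal example under stated assumptions: it covers positive ions only ([M + nH]n+), uses standard monoisotopic residue masses, and takes the mass of a proton as 1.007276 Da. The function names and the example peptide are illustrative and do not come from the chapter.

```python
# Monoisotopic masses (Da) of amino acid residues, i.e. the amino acid
# minus the water lost on forming a peptide bond.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # one water per peptide (free N and C termini)
PROTON = 1.007276   # mass added per positive charge


def peptide_mass(sequence: str) -> float:
    """Neutral monoisotopic mass M of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


def mz(mass: float, charge: int) -> float:
    """m/z of the [M + nH]n+ ion for a neutral mass and charge n >= 1."""
    return (mass + charge * PROTON) / charge


def charge_series(sequence: str, max_charge: int = 3) -> dict[int, float]:
    """m/z values of the 1+ .. max_charge+ ions of one peptide."""
    m = peptide_mass(sequence)
    return {n: mz(m, n) for n in range(1, max_charge + 1)}


# Hypothetical tryptic peptide, used only to show the call.
for n, value in charge_series("WVTFISLLR").items():
    print(f"{n}+ ion: m/z = {value:.4f}")
```

Because every charge state of a peptide must give back the same neutral mass M, observing several members of the series gives independent measurements of that mass, which is the extra information referred to in the text.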
Fragmentation can be induced during the ionization step by use of a hard ionization method, one that injects greater quantities of energy into the molecules being ionized, causing bonds within those molecules to break. However, in peptide mass spectrometry, fragmentation is usually delayed until a later stage, by inducing collisions between the peptide molecular ions and inert atoms such as helium. These collisions cause peptide bonds to break, resulting in a variety of fragment ions whose m/z values reveal the composition of the fragments, making it possible to use the data to work out the sequence of the peptide. Knowing the sequence of the peptide enables a much more precise identification of the parent protein than is possible simply from compositional information. Collision-induced fragmentation is also utilized in top-down proteomics, because analysis of molecular ions derived from intact proteins is usually insufficient to distinguish all of the different proteins in a proteome. Fragment ions must therefore be obtained before a protein can be identified unambiguously.

These innovations have been accompanied by a diversification in the types of mass spectrometer employed in proteomics research. As well as time-of-flight configurations, other types of mass analyzer used with peptides and proteins include the following (Figure 13.8):

The quadrupole mass analyzer has four rods placed parallel to one another, surrounding a central channel through which the ions must pass. Oscillating electrical fields are applied to the rods, deflecting the ions in a complex way so that their trajectories wiggle as they pass through the quadrupole. Gradually changing the field strengths enables ions with different m/z values to pass through the quadrupole without colliding with the rods.
The Fourier transform ion cyclotron resonance (FT-ICR) mass analyzer includes an ion trap that captures individual ions and further excites them within a cyclotron, so they accelerate along an outward spiral, with the vector of this spiral revealing the m/z ratio.

Figure 13.8 Two types of mass analyzer. (A) Quadrupole mass analyzer. (B) Fourier transform ion cyclotron resonance mass analyzer, in which ions follow a spiral trajectory within the cyclotron.

Mass analyzers can also be linked in series, further increasing the amount of information that can be obtained regarding a single peptide or protein. This is called tandem mass spectrometry. A typical configuration involves analysis of the molecular ions in the first mass analyzer, followed by fragmentation and analysis of the fragment ions in the second mass analyzer.

Comparing the compositions of two proteomes

Often the aim of a protein profiling project is not to identify every protein in a single proteome but to understand the differences between the protein compositions of two proteomes. If the differences are large, they will be apparent simply by looking at the stained gels after two-dimensional electrophoresis. However, important changes in the biochemical properties of a proteome can result from relatively minor changes in the amounts of individual proteins, and methods for detecting small-scale changes are therefore essential. One possibility is to label the constituents of two proteomes with different fluorescent markers, and then run them together in a single two-dimensional gel. This is the same strategy as is used for comparing pairs of transcriptomes (see Figure 12.5). Visualization of the two-dimensional gel at different wavelengths enables the intensities of equivalent spots to be judged more accurately than is possible when two separate gels are obtained.

A more accurate alternative in bottom-up proteomics is to label peptides with an isotope-coded affinity tag (ICAT).
These are chemical groups that can be attached to peptides and that contain either the common 12C isotope of carbon or the less common 13C isotope (Figure 13.9). The proteins in the proteomes are separated in the normal manner, and equivalent proteins from each proteome are recovered and treated with protease. One set of peptides is then labeled with 12C tags and the other with 13C tags. Because the 12C and 13C tags have different masses, the m/z ratio of the molecular ion obtained from a peptide labeled with a 12C tag will be different from that of an identical peptide labeled with a 13C tag. The peptides from the two proteomes are therefore run through the mass spectrometer together. A pair of identical peptides (one from each proteome) will occupy slightly different positions on the resulting mass spectrum, because of their distinctive m/z ratios (Figure 13.10). Comparison of the peak heights allows the relative abundance of each peptide to be estimated.

A drawback is that peptides carrying 12C and 13C tags can have slightly different chromatographic properties, especially during RPLC, so they might emerge from the column at slightly different times. The different masses are also a hindrance in tandem mass spectrometry, as the 12C- and 13C-labeled peptides pass through the first mass analyzer at different rates, so their fragment ions will be collected by the second mass analyzer at different times. These problems are avoided by isobaric labeling. An isobaric tag consists of three parts: a reactive region that forms the attachment with the peptide; a balance region, which is labeled with 12C and 13C; and a reporter region, which is also labeled with 12C and 13C (Figure 13.11). The labeling is designed in such a way that each tag has the same mass: in other words, they are isobaric. Pairs of tagged peptides from two proteomes therefore give molecular ions that have the same m/z values and so behave in the same way in the first mass analyzer.
However, the labels are distributed differently within the balance and reporter regions, so cleavage of a tag in the second mass analyzer releases a fragment ion whose mass is characteristic of that tag. The relative amounts of the reporter fragment ions, as detected in the second mass analyzer, give the relative amounts of the peptides in the two proteomes.

Figure 13.9 Typical isotope-coded affinity tag for proteome studies. The iodoacetyl group reacts with cysteine and hence forms an attachment to the peptide. The linker region contains either 12C or 13C atoms and so provides the isotope coding function. The terminal biotin group enables tagged peptides to be separated from untagged ones by affinity chromatography on a column matrix carrying avidin groups. Untagged peptides (ones that lack a cysteine group) can therefore be discarded prior to mass spectrometry.

Figure 13.10 Analyzing two proteomes by use of isotope-coded affinity tags. In the mass spectrum (relative intensity plotted against mass-to-charge ratio), peaks resulting from peptides containing 12C atoms are shown in red, and those from peptides containing 13C are shown in blue. The protein under study is approximately 1.5-fold more abundant in the proteome that has been labeled with 12C isotope-coded affinity tags (ICATs).

Figure 13.11 Typical isobaric tag, made up of reporter, balance, and reactive regions. The reporter and balance regions are labeled with 12C or 13C in such a way that all tags have the same molecular mass. When the peptide is fragmented, the tag is cleaved at the position shown, releasing the reporter group, which has a different mass for each tag. The reactive group on this tag enables amino acid R groups containing an amine to be labeled.
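The peak-height comparison used with isotope-coded affinity tags (Figure 13.10) can be sketched as a search for pairs of peaks separated by the mass difference between the light and heavy tags. This is a minimal illustration, not a description of any real analysis package. The number of 13C atoms in the heavy tag is not given in the text, so the value of nine used here is an assumption, as are the function name, the tolerance, and the example spectrum.

```python
C13_MINUS_C12 = 1.003355  # mass difference between 13C and 12C, in Da


def find_icat_pairs(peaks, n_heavy_carbons=9, charge=1, tol=0.01):
    """Pair light (12C) and heavy (13C) peaks and report light/heavy ratios.

    peaks: list of (m/z, intensity) tuples from one mass spectrum.
    n_heavy_carbons: assumed number of 13C atoms in the heavy tag.
    Returns a list of (light m/z, heavy m/z, light/heavy intensity ratio).
    """
    # Expected m/z spacing between the two members of a pair.
    spacing = n_heavy_carbons * C13_MINUS_C12 / charge
    pairs = []
    for light_mz, light_intensity in peaks:
        for heavy_mz, heavy_intensity in peaks:
            matches = abs((heavy_mz - light_mz) - spacing) <= tol
            if matches and heavy_intensity > 0:
                pairs.append(
                    (light_mz, heavy_mz, light_intensity / heavy_intensity)
                )
    return pairs


# Invented spectrum: one light/heavy pair plus an unrelated peak.
spectrum = [(1200.50, 150.0), (1209.53, 100.0), (1500.00, 80.0)]
for light, heavy, ratio in find_icat_pairs(spectrum):
    print(f"light {light} / heavy {heavy}: ratio {ratio:.2f}")
```

With isobaric tags the same ratio would instead be taken between the reporter fragment ions detected in the second mass analyzer, since the intact tagged peptides have identical m/z values.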
A final strategy that can sometimes be used with microorganisms and eukaryotic cell lines is metabolic labeling. If the cells are supplied with nutrients that contain 13C rather than 12C atoms, then all the proteins in the proteome will become labeled. If one of the proteomes that is being studied is obtained in this way, then there is no need to add tags to individual peptides, as all the proteins will be prelabeled. This approach therefore provides a rapid, high-throughput means of comparing the relative amounts of all the proteins in a proteome, albeit with the possible drawbacks described above regarding differential mobility during column chromatography and the first stage of tandem mass spectrometry.

Analytical protein arrays offer an alternative approach to protein profiling

Gel and/or column separation followed by mass spectrometry offers an effective but laborious and expensive means of profiling the contents of a proteome. These approaches are necessary during the initial characterization of a proteome, but in many research projects the objective is not to catalog the entire content of a proteome but to understand changes that occur within a proteome, for example, in response to extracellular stimuli and during the transition from a healthy tissue to a diseased one. For these applications, a more rapid method of assessing the relative abundance of different proteins is desirable.

Protein arrays provide the main alternative to the top-down and bottom-up approaches to protein profiling. A protein array is similar to a DNA array (Section 12.1), the difference being that the immobilized molecules are proteins rather than oligonucleotides. There are several types of protein array, including a version used to detect protein–protein interactions, which we will study in Section 13.2.
The particular type of protein array used in protein profiling is called an analytical protein array or an antibody array, the second name indicating that the array carries a series of antibodies, each one specific for a different protein in the proteome for which the microarray is designed. When a sample of the proteome is applied to the array, individual proteins bind to their antibodies and become captured on the array. The amount of binding at each position is dependent on the abundance of that protein in the proteome. The captured proteins are usually detected with a second, polyclonal antibody that binds to all the proteins in the proteome. This antibody is fluorescently labeled and so gives signals for those positions on the array where a protein has been captured (Figure 13.12). As with a DNA microarray, the intensities of the resulting fluorescent signals can be used to assay the amount of each protein in the proteome.

Figure 13.12 Protein detection on an antibody array. Captured proteins are detected with a polyclonal antibody that is fluorescently labeled.

The main difficulty in designing an analytical protein array is ensuring that each antibody is specific for its target protein and does not cross-react with any other proteins. Cross-reaction will occur if the epitope recognized by an antibody is a common surface feature shared by two or more distinct proteins. However, once a non-cross-reacting array has been designed, then multiple copies can be fabricated and its actual usage is relatively straightforward. Although hundreds of thousands of antibodies can be accommodated on a single chip, most antibody arrays are designed for the assay of particular components of a proteome and hence carry fewer than 1000 antibodies.
Typical applications would be screening for the presence or absence and relative abundance of cytokines in different human tissues, for which commercial arrays targeting 640 proteins are available.

13.2 IDENTIFYING PROTEINS THAT INTERACT WITH ONE ANOTHER

Further insight into how a proteome functions can be obtained by identifying pairs and groups of proteins that interact with one another. At a detailed level, this information is often valuable when attempts are made to assign a function to a newly discovered gene or protein (Chapter 6) because an interaction with a second, well-characterized protein can often indicate the role of an unknown protein. For example, an interaction with a protein that is located on the cell surface might indicate that the unknown protein is involved in cell–cell signaling (Section 14.1).

Identifying pairs of interacting proteins

There are several methods for studying protein–protein interactions, the two most useful being phage display and the yeast two-hybrid system. In phage display, a special type of cloning vector is used, based on λ bacteriophage or one of the filamentous bacteriophages such as M13. The vector is designed so that when a new gene is inserted, it is expressed in such a way that its protein product becomes fused with one of the phage coat proteins (Figure 13.13A). The phage protein therefore carries the foreign protein into the phage coat, where it is displayed in a form that enables it to interact with other proteins that the phage encounters. There are several ways in which phage display can be used to study protein interactions. In one approach, the test protein is displayed and interactions are sought with a series of purified proteins or protein fragments of known function. This approach is limited because it takes time to carry out each test, so it is feasible only if some prior information has been obtained about likely interactions. A more powerful strategy is to prepare a phage display library, a collection of clones displaying a range of proteins, and identify which members of the library interact with the test protein (Figure 13.13B).

Figure 13.13 Phage display. (A) Production of a display phage. The cloning vector used for phage display is a bacteriophage genome with a unique restriction site located within a gene for a coat protein. The technique was originally carried out with the gene III coat protein of the filamentous phage called f1, but it has now been extended to other phages including λ. To create a display phage, the DNA sequence coding for the test protein is ligated into the restriction site so that a fused reading frame is produced: this is one in which the series of codons continues unbroken from the test gene into the coat protein gene. After transformation of E. coli, this recombinant molecule directs synthesis of a hybrid protein made up of the test protein fused to the coat protein. Phage particles produced by these transformed bacteria therefore display the test protein in their coats. (B) Using a phage display library. The test protein is immobilized within a well of a microtiter tray, and the phage display library is added. After one or more washes, the phages that are retained in the well are those displaying a protein that interacts with the test protein.

The yeast two-hybrid system detects protein interactions in a more complex way, making use of the transcription factors that are responsible for controlling the expression of genes in eukaryotes. To carry out this function, a transcription factor must bind to a DNA sequence upstream of a gene and stimulate the mediator protein that regulates the initiation of transcription (see Figure 12.21).
These two abilities, DNA binding and mediator activation, are specified by different parts of the transcription factor. Some transcription factors can be cleaved into two segments, where one segment contains the DNA-binding domain and the other contains the activation domain. The two segments interact to form the functional transcription factor.

The two-hybrid system makes use of a Saccharomyces cerevisiae strain that lacks a transcription factor for a reporter gene. This gene is therefore switched off. An artificial gene that codes for the DNA-binding domain of the transcription factor is ligated to the gene for the protein whose interactions we wish to study. This protein can come from any organism, not just yeast: in the example shown in Figure 13.14A, it is a human protein. After introduction into yeast, this construct specifies synthesis of a fusion protein made up of the DNA-binding domain of the transcription factor attached to the human protein. The recombinant yeast strain is still unable to express the reporter gene because the modified transcription factor only binds to DNA; it cannot influence the mediator protein. Activation only occurs after the yeast strain has been co-transformed with a second construct, one comprising the coding sequence for the activation domain fused to a DNA fragment that specifies a protein able to interact with the human protein that is being tested (Figure 13.14B). As with phage display, if there is some prior knowledge about possible interactions, then individual DNA fragments can be tested one by one in the two-hybrid system. Usually, however, the gene for the activation domain is ligated with a mixture of DNA fragments so that many different constructs are made. After transformation, cells are plated out and those that express the reporter gene are identified. These are cells that have taken up a copy of the gene for the activation domain fused to a DNA fragment that encodes a protein able to interact with the test protein.

Figure 13.14 The yeast two-hybrid system. (A) The two-hybrid system. On the left, a gene for a human protein has been ligated to the gene for the DNA-binding domain of a yeast transcription factor. After transformation of yeast, this construct specifies a fusion protein, part human protein and part yeast transcription factor. On the right, various human DNA fragments have been ligated to the gene for the activation domain of the transcription factor: these constructs specify a variety of fusion proteins. (B) Screening for protein interactions using the two-hybrid system. The two sets of constructs are mixed and co-transformed into yeast. A colony in which the reporter gene is expressed contains fusion proteins whose human segments interact, thereby bringing the DNA-binding and activation domains into proximity and stimulating the mediator protein.

Protein–protein interactions can also be studied with a protein array. Unlike the arrays used in protein profiling, the immobilized proteins are not antibodies but instead are the actual proteins whose possible interactions we wish to assay. A fluorescently labeled version of the test protein is applied to the array, the positions of the resulting signals indicating proteins with which the test protein interacts (Figure 13.15). Although this approach enables interactions between the test protein and a broad range of other proteins to be checked in a single experiment, it is not usually the first choice for this type of work, and phage display and the two-hybrid system remain the standard methods for studying protein–protein interactions.
Protein arrays are more popular for testing interactions with DNA fragments (for example, to identify proteins that bind to a particular DNA sequence) and with small molecules such as some drugs.

Figure 13.15 Using a protein array to test protein–protein interactions. The array carries a series of different proteins. Detection of the fluorescent signal indicates which proteins bind to the test protein.

Identifying the components of multiprotein complexes

Phage display and the yeast two-hybrid system are effective methods for identifying pairs of proteins that interact with one another, but identifying such links reveals only the basic level of protein–protein interactions. Many cellular activities are carried out by multiprotein complexes, such as the mediator protein (Section 12.2) or the spliceosome that is responsible for the removal of introns from pre-mRNA (Section 12.4). Complexes such as these typically comprise a set of core proteins, which are present at all times, along with a variety of ancillary proteins that associate with the complex under particular circumstances. Identifying the core and ancillary proteins is a step toward understanding how these complexes carry out their functions. These proteins might be identified pair-by-pair by a long series of phage display or two-hybrid experiments, but a more direct route to determining the composition of multiprotein complexes is clearly needed.

A phage display library could be used to identify the members of a multiprotein complex, as in this procedure all proteins that interact with the test protein are identified in a single experiment (see Figure 13.13B). The problem is that large proteins are displayed inefficiently because they disrupt the phage replication cycle. To circumvent this problem, it is generally necessary to display a short peptide, representing part of a cellular protein, rather than the entire protein.
The displayed peptide may therefore be unable to interact with all members of the complex within which the intact protein is located, because the peptide lacks some of the protein–protein attachment sites present in the intact form (Figure 13.16).

Figure 13.16 Phage display may fail to detect all members of a multiprotein complex. The complex consists of a central protein that interacts with five smaller proteins. In the lower drawing, a peptide from the central protein is used in a phage display experiment. This peptide detects two of the interacting proteins, but the other three proteins are missed because their binding sites lie on a different part of the central protein.

A method that avoids this problem, because it works with intact proteins, is affinity chromatography. In this technique, the test protein is attached to a chromatographic matrix and placed in a column. The cell extract is passed through the column in a low-salt buffer, which allows formation of the hydrogen bonds that hold proteins together in a complex (Figure 13.17A). The proteins that interact with the bound test protein are therefore retained in the column, while all the others are washed away. The interacting proteins are then eluted with a high-salt buffer. A disadvantage of this procedure is the need to purify the test protein, which is time-consuming and hence difficult to use as the basis for a large-scale screen. An alternative is tandem affinity purification (TAP), which was developed as a means of studying protein complexes in S. cerevisiae. In this method, the gene for the test protein is modified so that the test protein, when synthesized, has a C-terminal extension that can bind to a second protein called calmodulin. The cell extract is prepared under gentle conditions so that multiprotein complexes do not break down, and then passed through an affinity chromatography column packed with a resin containing attached calmodulin molecules. This results in immobilization of both the test protein and others with which it is associated (Figure 13.17B). After elution, the identities of the purified proteins are determined by mass spectrometry. When used in a large-scale screen of 1739 yeast genes, TAP identified 232 multiprotein complexes, providing new insights into the functions of 344 genes, many of which had not previously been characterized by experimental means.

Figure 13.17 Affinity chromatography methods for the purification of multiprotein complexes. (A) In standard affinity chromatography, the test protein is attached to the resin. The cell extract is applied in a low-salt buffer so that other members of the multiprotein complex bind to the test protein. The proteins are then eluted with a high-salt buffer. (B) In tandem affinity purification (TAP), the cell extract is applied in a buffer containing 2 mM CaCl2, which promotes attachment of the modified test protein, plus the proteins it interacts with, to the calmodulin molecules attached to the chromatography resin. The proteins are then eluted with a buffer that contains no CaCl2.

A second disadvantage of affinity chromatography methods is that a single member of a multiprotein complex is used as the bait for isolation of other members. If a protein is present in the complex but does not interact directly with the bait, then it may not be isolated (Figure 13.18). These methods therefore identify groups of proteins that are present in a complex but do not necessarily provide the total protein complement of the complex. Developing ways of purifying intact complexes is therefore a major goal of current research. In co-immunoprecipitation, a cell extract is prepared under gentle conditions so that complexes remain intact. An antibody specific for the test protein is then added, which results in precipitation of this protein and all other members of the complex within which it is present. Treatment of the collection of proteins with a protease, followed by bottom-up proteomics, then enables the members of the complex to be identified. This version of shotgun proteomics is called the multidimensional protein identification technique (MudPIT). The method was first used to study the large subunit of the yeast ribosome and resulted in identification of 11 proteins that had not previously been known to be associated with this complex.

Figure 13.18 A disadvantage of affinity chromatography. If the bait protein (labeled with a B) does not interact directly with one or more proteins in the complex, then those proteins might not be isolated.

Identifying proteins with functional interactions

Proteins do not need to form physical associations with one another in order to have a functional interaction. For example, in bacteria such as Escherichia coli, the enzymes lactose permease and β-galactosidase have a functional interaction in that they are both involved in utilization of lactose as a carbon source. But there is no physical interaction between these two proteins: the permease is located in the cell membrane and transports lactose into the cell, while β-galactosidase, which splits lactose into glucose and galactose, is present in the cell cytoplasm (see Figure 8.9A). Many enzymes that work together in the same biochemical pathway never form physical interactions with one another, and if studies were to be based solely on detection of physical associations between proteins, then many functional interactions would be overlooked.

Several methods can be used to identify proteins that have functional interactions. Most of these do not involve direct study of the proteins themselves and hence, strictly speaking, do not come under the general heading of proteomics.
Nonetheless, it is convenient to consider them here because the information they yield is often considered along with the results of proteomics studies. These methods include the following:

• Comparative genomics can be used in various ways to identify groups of proteins that have functional relationships. One approach is based on the observation that pairs of proteins that are separate molecules in some organisms are fused into a single polypeptide in others. An example is provided by the yeast gene HIS2, which codes for an enzyme involved in histidine biosynthesis. In E. coli, two genes are homologous to HIS2. One of these, itself called his2, has sequence similarity with the 5′-region of the yeast gene, and the second, his10, is similar to the 3′-region (Figure 13.19). The implication is that the proteins coded by his2 and his10 interact within the E. coli proteome to provide part of the histidine biosynthesis activity. Analysis of the sequence databases reveals many examples of this type, where two proteins in one organism have become fused into a single protein in another organism. A similar approach is based on examination of bacterial operons. An operon consists of two or more genes that are transcribed together and usually have a functional relationship (Section 8.2). For example, the genes for lactose permease and β-galactosidase of E. coli are present in the same operon, along with the gene for a third protein involved in lactose utilization (see Figure 8.9A). The identities of genes in bacterial operons can therefore be used to infer functional interactions between the proteins coded by homologous genes in a eukaryotic genome.

  Figure 13.19 Using homology analysis to deduce protein–protein interactions. The 5′-region of the yeast HIS2 gene is homologous to E. coli his2, and the 3′-region is homologous to E. coli his10.

• Transcriptome studies can identify functional interactions between proteins, as the mRNAs for functionally related proteins often display similar expression profiles under different conditions.
• Gene inactivation studies can be informative. If a change in phenotype is observed only when two or more genes are inactivated together, then it can be inferred that those genes function together in generation of the phenotype.

Protein interaction maps display the interactions within a proteome

The information from phage display, two-hybrid analyses, and other methods for identifying pairs and groups of proteins that associate with one another in a cell enables a protein interaction map to be constructed. In a map of this kind, each protein is depicted by a dot, or node, with pairs of interacting proteins linked by lines, or edges. The resulting network displays all the interactions that occur between the components of a proteome. The first of these maps were constructed in 2001 for relatively simple proteomes, almost entirely from two-hybrid experiments. These included maps for the bacterium Helicobacter pylori, comprising over 1200 interactions involving almost half the proteins in the proteome, and for 2240 interactions between 1870 proteins from the S. cerevisiae proteome (Figure 13.20A). More recently, the application of additional techniques has led to more detailed versions of the S. cerevisiae map, as well as maps for humans and other eukaryotes (Figure 13.20B). Each protein interaction map forms a part of the broader interactome for the species. The interactome comprises all of the molecular interactions that occur, including those involving small molecules that regulate protein activity and between DNA-binding proteins and the genes whose expression they control.

Figure 13.20 Protein interaction maps. (A) Initial version of the S. cerevisiae map, published in 2001. Each dot represents a protein, with connecting lines indicating interactions between pairs of proteins. Red dots are essential proteins: an inactivating mutation in the gene for one of these proteins is lethal. Mutations in the genes for proteins indicated by green dots are nonlethal, and mutations in genes for proteins shown in orange lead to slow growth. The effects of mutations in genes for proteins shown as yellow dots were not known when the map was constructed. (A, from Jeong H, Mason SP, Barabási AL & Oltvai ZN Nature 411:41–42. With permission from Macmillan Publishers Limited. B, from Stelzl U, Worm U, Lalowski M et al. 122:957–968. With permission from Elsevier.)

What interesting features have emerged from these protein interaction maps? The most intriguing discovery is that each network is built up around a small number of proteins that have many interactions and form hubs in the network, along with a much larger number of proteins with few individual connections (Figure 13.21A). This architecture is thought to minimize the impact on the proteome of the disruptive effects of mutations that might inactivate individual proteins. Only if a mutation affects one of the proteins at a highly interconnected node will the network as a whole be damaged. This hypothesis is consistent with the discovery, from gene inactivation studies (Section 6.2), that a substantial number of yeast proteins are apparently redundant, meaning that if the protein activity is destroyed, the proteome as a whole continues to function normally, with no discernible impact on the phenotype of the cell.

Figure 13.21 Hubs in the S. cerevisiae protein interaction map. This map was published in 2004. (A) The complete network. The hubs are clearly visible in the complete map. (B) Removal of party hubs. After removal of the party hubs, the network remains almost intact. (C) Removal of date hubs. After the date hubs are removed, the network splits into detached subnetworks. (From Han J-DJ, Bertin N, Hao T et al. Nature 430:88–93. With permission from Macmillan Publishers Limited.)

Examination of the expression profiles of the hub proteins and their direct partners enables these hubs to be divided into two groups. The first group includes those hub proteins that interact with all their partners simultaneously.
These have been called party hubs, and their removal has little effect on the overall structure of the network (Figure 13.21B). In contrast, removal of the second group, the date hubs, which interact with different partners at different times, breaks the network into a series of small subnetworks (Figure 13.21C). The implication is that the party hubs work within individual biological processes and do not contribute greatly to the overall organization of the proteome. The date hubs, on the other hand, are the key players that provide organization to the proteome by linking biological processes to one another.

Most of the protein interaction maps that have been constructed so far are incomplete, simply because not all of the interactions occurring in the proteome have been identified. It may never be possible to construct a fully comprehensive interaction map for any proteome, bearing in mind the limitations in scope and sensitivity of the methods used to study protein–protein interactions. The accuracy of these methods also needs to be considered to ensure that the resulting networks do not contain spurious links. Both problems are illustrated by the current status of the human protein interaction map. When all reported interactions are taken into account, this network comprises almost 30,000 proteins with 350,000 interactions, but the numbers drop to 16,000 proteins and 116,000 interactions if only those interactions that have been checked by two different methods are included. Clearly these networks are far from complete: they account for only a fraction of the 70,000 proteins thought to be present in the human proteome, and many of the interactions that have been identified are unconfirmed. Despite these limitations, protein interaction maps are proving valuable as a means of probing the link between the proteome and disease. Genes whose mutations give rise to the same disease often specify proteins that occupy a distinct disease module within a network (Figure 13.22).
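The node-and-edge description of an interaction map, the filter that keeps only interactions confirmed by two different methods, and the effect of removing a hub can be illustrated with a small sketch. Everything in it is hypothetical: the protein names (P1 to P6), the interaction records, and the helper functions are invented for illustration. Hubs are picked here simply by their number of partners, which does not distinguish party hubs from date hubs (that distinction needs expression profiles).

```python
from collections import defaultdict

def build_map(records, min_methods=1):
    """Build an undirected interaction map (node -> set of partner nodes).

    records: iterable of (protein_a, protein_b, method) tuples.
    Only pairs reported by at least `min_methods` distinct methods are kept.
    """
    support = defaultdict(set)
    for a, b, method in records:
        support[frozenset((a, b))].add(method)
    graph = defaultdict(set)
    for pair, methods in support.items():
        if len(pair) == 2 and len(methods) >= min_methods:  # skip self-pairs
            a, b = tuple(pair)
            graph[a].add(b)
            graph[b].add(a)
    return dict(graph)

def hubs(graph, min_degree):
    """Proteins with at least `min_degree` interaction partners."""
    return {p for p, partners in graph.items() if len(partners) >= min_degree}

def count_components(graph, removed=frozenset()):
    """Number of connected subnetworks left after deleting `removed` nodes."""
    seen, count = set(removed), 0
    for start in graph:
        if start in seen:
            continue
        count += 1
        seen.add(start)
        stack = [start]
        while stack:  # depth-first walk over one subnetwork
            node = stack.pop()
            for partner in graph[node]:
                if partner not in seen:
                    seen.add(partner)
                    stack.append(partner)
    return count

# Hypothetical interaction records: (protein, protein, detection method)
records = [
    ("P1", "P2", "two-hybrid"), ("P1", "P2", "TAP"),
    ("P1", "P3", "two-hybrid"), ("P1", "P3", "phage display"),
    ("P1", "P4", "two-hybrid"), ("P1", "P4", "TAP"),
    ("P4", "P5", "two-hybrid"), ("P4", "P5", "co-immunoprecipitation"),
    ("P5", "P6", "two-hybrid"),  # reported by one method only
]

full_map = build_map(records)                        # all reported interactions
confirmed_map = build_map(records, min_methods=2)    # checked by two methods
hub_set = hubs(confirmed_map, min_degree=3)
before = count_components(confirmed_map)
after = count_components(confirmed_map, removed=hub_set)
```

In this toy data set the unconfirmed P5–P6 link drops out of the stricter map, taking P6 with it, and deleting the single hub splits the remaining network into separate pieces, which mirrors in miniature the fragmentation described for hub removal.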
The implication is that loss or perturbation of the biochemical function performed by the interlinked proteins of a disease module gives rise to the symptoms of the disease. Even if the map is incomplete, identification of proteins within a module that were not previously associated with the disease enables the biochemical basis of the defect to be understood in greater detail. The discovery that there are sometimes overlaps between the modules for distinct diseases, such as multiple sclerosis and rheumatoid arthritis (see Figure 13.22), also provides important insights into comorbidity, which is the tendency for patients suffering from one disease to display symptoms associated with other diseases.

Figure 13.22 Disease modules in the human protein interaction map. The modules for multiple sclerosis (MS), peroxisomal disorders (PD), and rheumatoid arthritis (RA) are shown. Disease pairs with overlapping modules (for example, multiple sclerosis and rheumatoid arthritis) have some symptoms in common and display high comorbidity. Nonoverlapping diseases, such as multiple sclerosis and peroxisomal disorders, lack detectable clinical relationships. (From Menche J, Sharma A, Kitsak M et al. Science 347:841. With permission from American Association for the Advancement of Science.)

13.3 SYNTHESIS AND DEGRADATION OF THE COMPONENTS OF THE PROTEOME

The composition of a proteome is determined by the balance between the synthesis and degradation of the individual proteins that it contains. Except for the words proteome and proteins, this is the same sentence that we used to describe the composition of the transcriptome (Section 12.2). The principles are exactly the same, and the dynamics of synthesis and degradation illustrated in Figure 12.7 for RNA could equally well refer to proteins.
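The balance between synthesis and degradation can be written as a simple rate equation. This is an illustrative sketch rather than a formula given in the chapter: P is the amount of one protein, M the amount of its mRNA, and the rate constants k_tl (translation) and k_deg (first-order degradation) are assumed symbols introduced only for this example.

```latex
\frac{dP}{dt} = k_{\mathrm{tl}}\,M - k_{\mathrm{deg}}\,P
\qquad\Longrightarrow\qquad
P^{*} = \frac{k_{\mathrm{tl}}\,M}{k_{\mathrm{deg}}}
\quad \text{at steady state } \left(\tfrac{dP}{dt} = 0\right)
```

Under these assumptions, two proteins with identical mRNA levels can still differ in abundance if their translation or degradation rates differ, which is why mRNA content alone is an imperfect guide to protein content.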
To understand how a proteome is maintained and how a proteome changes in response to external stimuli and during differentiation, development, and disease, we must therefore study the same topics as in Chapter 12 (synthesis, degradation, and processing), but this time with reference to protein rather than RNA.

Ribosomes are molecular machines for making proteins

Proteins are synthesized by the large RNA–protein complexes called ribosomes. An E. coli cell contains approximately 20,000 ribosomes, distributed throughout its cytoplasm. The average human cell contains more than a million ribosomes, some free in the cytoplasm and some attached to the outer surface of the endoplasmic reticulum, the membranous network of tubes and vesicles that permeates the cell. Originally, ribosomes were looked on as passive partners in protein synthesis, merely the structures on which mRNAs were translated into polypeptides. This view has changed over the years, and ribosomes are now considered to play two active roles:

• Ribosomes coordinate protein synthesis by placing the mRNA, tRNAs, and associated protein factors in their correct positions relative to one another.

• Ribosomes catalyze at least some of the chemical reactions that occur during protein synthesis, including the central reaction that results in synthesis of the peptide bond that links two amino acids together.

Figure 13.23 Composition of eukaryotic and bacterial ribosomes. The details given are for human ribosomes and those in E. coli. There are some variations in the number of ribosomal proteins in different species.
                 EUKARYOTES                 BACTERIA
Large subunit    3 rRNAs:                   2 rRNAs:
                 28S (4718 nucleotides)     23S (2904 nucleotides)
                 5.8S (160 nucleotides)     5S (120 nucleotides)
                 5S (120 nucleotides)
                 50 proteins                34 proteins
Small subunit    1 rRNA:                    1 rRNA:
                 18S (1874 nucleotides)     16S (1541 nucleotides)
                 33 proteins                21 proteins

When the involvement of ribosomes in protein synthesis became clear in the 1950s, biologists quickly realized that a detailed knowledge of ribosome structure would be necessary in order to understand how mRNAs are translated into polypeptides. Originally called microsomes, ribosomes were first observed in the early decades of the twentieth century as tiny particles almost beyond the resolving power of light microscopy. Electron microscopy later showed that bacterial ribosomes are oval-shaped, with dimensions of 29 × 21 nm, rather smaller than eukaryotic ribosomes, which vary a little in size depending on species but average about 32 × 22 nm. Compositional studies then revealed that a ribosome comprises two subunits, referred to as large and small, each subunit made up of one or more rRNAs and a collection of ribosomal proteins (Figure 13.23). We now know that ribosomes dissociate into their subunits when they are not actively participating in protein synthesis, and the subunits remain in the cytoplasm until they are used for a new round of translation.

Once the basic composition of eukaryotic and bacterial ribosomes had been worked out, attention became focused on the way in which the various rRNAs and proteins fit together. Important information came from the first rRNA sequences. Comparisons between these sequences identified conserved regions that can base-pair to form complex two-dimensional structures (Figure 13.24). This suggested that the rRNAs provide a scaffolding within the ribosome, to which the proteins are attached, an interpretation that underemphasizes the active role that rRNAs play in protein synthesis but which nonetheless was a useful foundation on which to base subsequent research.
Much of that subsequent research has concentrated on the bacterial ribosome, which is smaller than the eukaryotic version and available in large amounts from extracts of cells grown to high density in liquid cultures. A number of technical approaches have been used to study the bacterial ribosome:

• Nuclease protection experiments (Section 7.1) enabled contacts between rRNAs and proteins to be identified.

• Protein–protein cross-linking identified pairs or groups of proteins that are located close to one another in the ribosome.

• Electron microscopy gradually became more sophisticated, enabling the overall structure of the ribosome to be resolved in greater detail. For example, innovations such as immunoelectron microscopy, in which ribosomes are labeled with antibodies specific for individual ribosomal proteins before examination, have been used to locate the positions of these proteins on the surface of the ribosome.

• Site-directed hydroxyl radical probing, which makes use of the ability of Fe(II) ions to generate hydroxyl radicals that cleave RNA phosphodiester bonds located within 1 nm of the site of radical production, has been used to determine the exact positioning of ribosomal proteins in the E. coli ribosome. For example, to determine the position of S5, different amino acids within this protein were labeled with Fe(II) and hydroxyl radicals induced in reconstituted ribosomes. The positions at which the 16S rRNA was cleaved were then used to infer the topology of the rRNA in the vicinity of S5 protein (Figure 13.25).

Figure 13.24 Base-paired structure of E. coli 16S rRNA. The 16S rRNA is the single rRNA present in the small subunit of the bacterial ribosome. In this representation, standard base pairs (G-C and A-U) are shown as bars while nonstandard base pairs (such as G-U) are shown as dots.

These approaches have been supplemented by X-ray crystallography (Section 11.1), which has been responsible for the most exciting insights into ribosome structure.
Analyzing the massive amount of X-ray diffraction data produced by crystals of an object as large as a ribosome is a huge task, particularly at the level needed to obtain a structure that is detailed enough to be informative about the way in which the ribosome works. This challenge has been met, and detailed structures are now known for entire ribosomes, including ones attached to mRNA and tRNAs (Figure 13.26A). These studies have shown that, in bacteria, attachment of the two ribosome subunits to one another results in the formation of two sites at which an aminoacyl-tRNA (a tRNA with an attached amino acid) can bind. These are called the P or peptidyl site and the A or aminoacyl site. The P site is occupied by the aminoacyl-tRNA whose amino acid has just been attached to the end of the growing polypeptide, and the A site is entered by the aminoacyl-tRNA carrying the next amino acid that will be used. There is also a third site, the E or exit site, through which the tRNA departs after its amino acid has been attached to the polypeptide (Figure 13.26B). The structures revealed by X-ray diffraction analysis show that these sites are located in a cavity between the large and small subunits of the ribosome, the mRNA threading through a channel formed mainly by the small subunit. After each amino acid addition, the ribosome adopts a less compact structure, with the two subunits rotating slightly in opposite directions. This opens up the space between the subunits and enables the ribosome to slide along the mRNA in order to read the next codon in the open reading frame.

Figure 13.25 Positions within E. coli 16S rRNA that form contacts with ribosomal protein S5. The distribution of the contact positions (shown in pink) for this single ribosomal protein emphasizes the extent to which the base-paired secondary structure of the rRNA is further folded within the three-dimensional structure of the ribosome.
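The passage of tRNAs through the three sites can be sketched as a toy model. This is a deliberate simplification and not a description of the real mechanism: each tRNA is labeled by the codon it reads, the first tRNA is assumed to start in the P site, and the function name and example codons are invented for illustration.

```python
def site_occupancy(codons):
    """Return (E, P, A) site contents after each translocation step.

    Simplified model: the tRNA for the next codon enters the A site, the
    growing polypeptide is joined to its amino acid, and the ribosome then
    slides one codon along the mRNA, so that the P-site tRNA moves to the
    E site and the A-site tRNA moves to the P site, leaving the A site empty.
    """
    if not codons:
        return []
    states = []
    p_site = codons[0]          # assumption: the first tRNA begins in the P site
    for next_codon in codons[1:]:
        a_site = next_codon     # next aminoacyl-tRNA enters the A site
        e_site, p_site, a_site = p_site, a_site, None   # translocation
        states.append((e_site, p_site, a_site))
    return states

# Hypothetical three-codon reading frame
states = site_occupancy(["AUG", "GCU", "UGG"])
for e_site, p_site, a_site in states:
    print(f"E={e_site}  P={p_site}  A={a_site}")
```

Each tuple records which tRNA is about to leave through the E site and which now holds the growing polypeptide in the P site, with the A site vacant for the next incoming tRNA.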
Figure 13.26 Detailed structure of a bacterial ribosome. (A) Structure of a ribosome in the process of translating an mRNA. The tRNAs positioned in the A, P, and E sites are indicated in pink, green, and yellow, respectively. (B) Diagram showing relative positions of the A, P, and E sites and the channel through which the mRNA is translocated. (A, from Schmeing TM & Ramakrishnan V Nature 461:1234–1242. With permission from Macmillan Publishers Limited.)

During stress, bacteria inactivate their ribosomes in order to downsize the proteome

As well as revealing the structures of active ribosomes, X-ray crystallography has also helped to elucidate the events that enable a bacterium to bring about a general reduction in the size of its proteome during periods of stress, such as nutrient limitation. The latter is signaled by the presence in the A sites of ribosomes of tRNAs that do not have attached amino acids. These tRNAs lack amino acids because the amino acid pool in the cytoplasm has become depleted due to the starvation conditions. In E. coli, the presence of deacylated tRNA in the A site is detected by the L11 ribosomal protein, initiating the stringent response. L11 activates a ribosome-associated protein called RelA, which converts guanosine 5′-triphosphate (GTP) to guanosine 5′-triphosphate 3′-diphosphate (pppGpp) by transferring a diphosphate from adenosine 5′-triphosphate (ATP) to a GTP molecule. Guanosine pentaphosphatase then converts pppGpp to ppGpp (Figure 13.27), which is an alarmone, a stress response molecule that modifies a broad range of cellular activities. Among the responses is a general decrease in transcription but an increase in transcription of genes involved in amino acid biosynthesis.
These changes are brought about by ppGpp binding to the β and β′ subunits of the bacterial RNA polymerase and altering the affinity of the polymerase for different types of promoters. By switching on amino acid biosynthesis, the bacterium is able to carry out essential maintenance of its proteome while it rides out the stress conditions and waits for the external nutrient supply to increase.

Because its overall metabolic activity has declined, the bacterium also decreases the size of its proteome by globally reducing the rate of protein synthesis. This is achieved, at least in part, by ppGpp binding to the translation initiation factor - ing the first s