Protein Folding & Molecular Chaperones PDF

Cells, Tissues and Control Systems Unit
Moncef LADJIMI
Office: C-169

As faculty of Weill Cornell Medical College in Qatar we are committed to providing transparency for any and all external relationships prior to giving an academic presentation. I, Moncef LADJIMI, DO NOT have a financial interest in commercial products or services.

Protein Folding and Molecular Chaperones

Additional material for this lecture may be found in:
§ Lehninger’s Biochemistry (8th ed.), chapter 4: p. 128-133
§ Alberts et al., Molecular Biology of the Cell (6th ed.), chapter 6: p. 353-357

PROTEIN FOLDING AND MOLECULAR CHAPERONES
Learning objectives:
§ Describe the two formulations of the protein folding problem
§ Describe principles of protein structure leading to folding
§ Describe the pathways and thermodynamics of protein folding
§ Describe the features of two-state vs multi-state folding proteins
§ Describe chaperone-assisted folding and the major classes of cytosolic molecular chaperones

PROTEIN FOLDING: THE SECOND PART OF THE GENETIC CODE
First part of the genetic code: from RNA to a protein sequence (unfolded).
Second part of the genetic code: folding according to the Anfinsen principle, assisted by molecular chaperones (folded protein).

THE TWO FORMULATIONS OF THE PROTEIN FOLDING PROBLEM

THE PROTEIN FOLDING PROBLEM
THE PROBLEM: Let’s take a protein of 100 AA that is going to fold into a defined 3D structure:
– If each AA residue could take up 10 different conformations on average (rotations around Cα), this gives 10^100 polypeptide conformations.
– If the protein folds spontaneously by a random process in which it tries all possible conformations, and if each conformation is sampled in the shortest time possible, 10^-13 sec (the time required for a single vibration), it would take about 10^100 x 10^-13 = 10^87 sec to sample all possible conformations. That is ~10^77 years by the commonly quoted estimate (direct conversion of 10^87 sec gives ~3 x 10^79 years); for comparison, the age of the universe is estimated to be 13.7 x 10^9 years.
– Thus, it would take far longer than the age of the universe for a protein to fold by random search, yet it does so in a few seconds or less (we know proteins reach the lowest-energy fold on the microsecond to second time scale; for example, E. coli can make a complete, biologically active protein of 100 AA residues in about 5 sec).

This is called Levinthal’s paradox: it is mathematically impossible for a protein to fold by randomly trying every conformation until the lowest-energy one is found.

THE FIRST FORMULATION OF THE PROTEIN FOLDING PROBLEM: How does a protein arrive at its native, active conformation so fast?
THE SECOND FORMULATION OF THE PROTEIN FOLDING PROBLEM: Is it possible to predict the folded structure of a protein from its AA sequence?

1/ APPROACHING THE FIRST FORMULATION OF THE PROTEIN FOLDING PROBLEM
How do proteins fold so fast when it is mathematically impossible for them to fold by randomly trying every conformation until the lowest-energy one is found?

IN REALITY:
§ Protein folding is biased by free energy and does not proceed through a random search.
§ Proteins do not search every possible conformation or state but move towards lower free energy.
§ Thus, most conformations are never visited by any particular molecule.

THE THERMODYNAMICS OF PROTEIN FOLDING DEPICTED AS A FREE-ENERGY FUNNEL
Folding is initiated by a spontaneous collapse into a compact state, mediated by hydrophobic interactions (the hydrophobic effect). The collapsed state, called the « molten globule », has a high secondary structure content, but the amino acids are not yet in their final positions. Re-arrangements of the amino acid positions then occur. Thus, the folding pathway is:
1/ Random coil
2/ Molten globule formation
3/ Final re-arrangements

At the top of the funnel, the number of conformations, and hence the conformational entropy, is large. Only a small fraction of the intramolecular interactions that will exist in the native conformation are present.
As folding progresses, the thermodynamic path down the funnel reduces the number of states present (decreases entropy), increases the amount of protein in the native conformation, and decreases the free energy. Depressions on the sides of the funnel represent semi-stable folding intermediates, which in some cases may slow the folding process.

COMPETITION BETWEEN FOLDING AND AGGREGATION
[Diagram labels: Unfolded, Intermediate, Aggregate, Folded]
§ Spontaneous folding is generally efficient for small, single-domain proteins that bury exposed hydrophobic amino acid residues rapidly (within milliseconds) upon initiation of folding (two-state folding proteins).
§ In contrast, large proteins composed of multiple domains often refold inefficiently, owing to the formation of partially folded intermediates (multi-state folding proteins), including misfolded states, that tend to aggregate.
§ Misfolding originates from interactions between regions of the folding polypeptide chain that are separate in the native protein; these interactions may be stable enough to prevent folding from proceeding on a biologically relevant time scale.
§ These nonnative states often expose hydrophobic amino acid residues and segments of unstructured polypeptide backbone to the solvent. They readily self-associate and aggregate into disordered complexes, driven by hydrophobic forces and interchain hydrogen bonding.
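The competition between folding and aggregation described above can be illustrated with a toy kinetic model. This is a minimal sketch under stated assumptions, not a model taken from the lecture: folding is treated as first-order in the unfolded chain, aggregation as second-order (two chains must meet), and the rate constants `k_fold` and `k_agg` are hypothetical values in arbitrary units.

```python
import math


def folded_yield(u0, k_fold, k_agg):
    """Final fraction of chains that end up folded.

    Assumed toy model (not from the lecture):
        dU/dt = -k_fold * U - k_agg * U**2   (unfolded chains)
        dF/dt =  k_fold * U                  (folded chains)
    Integrating dF/dU from U0 down to 0 gives
        F_final / U0 = ln(1 + x) / x,  with x = k_agg * U0 / k_fold.
    """
    if u0 < 0 or k_fold <= 0 or k_agg < 0:
        raise ValueError("u0 and k_agg must be >= 0, k_fold must be > 0")
    x = k_agg * u0 / k_fold
    if x == 0:
        return 1.0  # no aggregation route: every chain folds
    return math.log1p(x) / x


if __name__ == "__main__":
    # Hypothetical rate constants; only the trend with concentration matters.
    for u0 in (0.1, 1.0, 10.0, 100.0):
        print(f"U0 = {u0:6.1f}  folded fraction = {folded_yield(u0, 1.0, 1.0):.3f}")
```

In this sketch the folded fraction falls as the starting concentration of unfolded chains rises, because the second-order aggregation route gains on the first-order folding route. The same reasoning suggests why a higher effective concentration of non-native chains favors aggregation.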
“ENERGY LANDSCAPES” OF FOLDING AND AGGREGATION

TWO-STATE VS MULTI-STATE FOLDING PROTEINS
§ Smooth landscape: “two-state” folding pathway. The protein is either folded or unfolded (fast and cooperative).
§ Rugged landscape: multiple-intermediate folding pathway. Several intermediates (slow, with kinetic traps and potential aggregation).

2/ APPROACHING THE SECOND FORMULATION OF THE PROTEIN FOLDING PROBLEM
We know that protein folding is determined solely by the protein AA sequence (Anfinsen’s dogma), but we do not know the “code” that “transforms” a protein sequence into a protein 3D structure. This leads to the question: how can the folded structure of a protein be determined (predicted) from its amino acid sequence?
- A POSSIBILITY is to use computer calculations and molecular dynamics simulations, guided by the laws of physics and chemistry (potential energy and force fields), to obtain models of folded structures (simulations of protein folding in silico; see next slides).
- SIMULATIONS AND COMPARISON OF THE COMPUTER MODELS with the corresponding “real” folded structures are a way of understanding the detailed pathways and intermediates formed in going from a nascent polypeptide chain to the final folded, native conformation.

SUCCESSFUL PREDICTION OF THE FOLDED STRUCTURE BASED ON AMINO ACID SEQUENCE IN SILICO (COMPARISON WITH A REAL FOLDED STRUCTURE)
Sophisticated algorithms have been developed to predict the native structure of proteins based solely on their amino acid sequence, without considering how the folding process actually occurs in nature.
Figure: superposition of the model “computer structure” (blue), obtained by blind structure prediction, with the “real” crystal structure (brown). Core side chains of the protein are shown. Simulations were initiated from completely extended structures.
These predictions are essential in efforts to predict the structures of the great number of protein sequences for which there is no corresponding 3D structure.
Philip Bradley et al.
Science 2005;309:1868-1871

SUCCESSFUL PREDICTION OF THE FOLDED STRUCTURE BASED ON AMINO ACID SEQUENCE IN SILICO (VIDEO)

PROTEIN STRUCTURE PREDICTIONS TO ATOMIC ACCURACY WITH ALPHAFOLD (AI)
AlphaFold is a neural network-based approach to predicting protein structures with high accuracy. It is a paradigm shift in structural biology. The release of protein structure predictions from AlphaFold will increase the number of protein structural models by almost three orders of magnitude. Structural biology and bioinformatics will never be the same, and the need for incisive experimental approaches will be greater than ever. Combining these advances in structure prediction with recent advances in cryo-electron microscopy suggests a new paradigm for structural biology.

DeepMind’s protein-folding AI cracks biology’s biggest problem
Artificial intelligence firm DeepMind (AlphaFold) has transformed biology by predicting the structure of nearly all proteins known to science in just 18 months, a breakthrough that will speed drug development and revolutionise basic science.
Nature Methods, 2021
New Scientist, 28 July 2022
Sept 21, 2023

ALPHAFOLD WEBSITE
AlphaFold: A practical guide
https://www.ebi.ac.uk/training/online/courses/alphafold/an-introductory-guide-to-its-strengths-and-limitations/what-is-the-protein-folding-problem/

What is the protein folding problem?
It is theoretically possible to predict the 3D structure of a protein just from its amino acid sequence. However, this is extremely challenging because of the sheer number of possible conformations. Artificial intelligence is ideally suited to this problem.
Protein folding problem: predicting protein structure from sequence
The protein folding problem encompasses two interrelated challenges: understanding the process of protein chain folding and accurately predicting a protein’s final folded structure.

In 1972 Christian Anfinsen shared the Nobel Prize in Chemistry for proposing that, in its standard physiological environment, a protein’s structure is determined by the sequence of amino acids that make it up. This came to be known as Anfinsen’s dogma. This hypothesis was important because it suggested we should be able to reliably predict a protein’s structure from its sequence. Decades of research into structural biology have since shown that Anfinsen was largely correct.

The computational challenge
However, it turns out that predicting protein structure from sequence is not so simple. This is because of a second concept called Levinthal’s paradox. In the 1960s, Cyrus Levinthal showed that there is a very large number of possible conformations a protein chain could theoretically adopt. If a protein were to explore them all, it would take an incomprehensible amount of time, far longer than the age of the Universe. Nevertheless, Anfinsen’s findings inspired a search for an efficient system that could reliably identify the most likely native structure of a protein, based solely on its amino acid sequence. While challenging, this was at least theoretically possible.

The role of artificial intelligence
This is where artificial intelligence comes in. Modern machine learning methods can help identify complex relationships in large datasets, enabling prediction of protein structures. Crucially, Anfinsen’s dogma implies that predicting the folded state of a protein does not necessarily require an understanding of the folding process. That is, it should be possible to predict the final 3D shape of a protein without predicting the sequence of movements that leads to this shape, sidestepping Levinthal’s paradox.
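The Levinthal estimate used earlier in this lecture is plain arithmetic and can be checked in a few lines. The sketch below uses the lecture's own inputs (100 residues, 10 conformations per residue, 10^-13 sec per conformation, and an age of the universe of 13.7 x 10^9 years); the function and variable names are illustrative only. The direct conversion comes out near 3 x 10^79 years, within a couple of orders of magnitude of the figure quoted on the slide, and the conclusion is the same either way.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # Julian year, about 3.16e7 s


def levinthal_years(n_residues=100, conformations_per_residue=10,
                    seconds_per_sample=1e-13):
    """Time, in years, to sample every conformation once at random."""
    # Python integers are exact, so 10**100 does not overflow here.
    total_conformations = conformations_per_residue ** n_residues
    total_seconds = total_conformations * seconds_per_sample
    return total_seconds / SECONDS_PER_YEAR


if __name__ == "__main__":
    years = levinthal_years()
    age_of_universe_years = 13.7e9  # value quoted in the lecture
    print(f"Exhaustive search: about {years:.1e} years")
    print(f"That is about {years / age_of_universe_years:.1e} "
          f"times the age of the universe")
```

Changing `n_residues` shows how quickly the estimate grows: each added residue multiplies the search time by ten under these assumptions.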
GENERAL PRINCIPLES GOVERNING PROTEIN FOLDING
§ The primary structure dictates the 3D structure (Anfinsen principle).
§ Most proteins fold spontaneously; others need assistance from molecular chaperones.
§ Folding is driven by the hydrophobic effect.
§ For soluble proteins, the core of the protein is tightly packed and hydrophobic. The surface of the protein is polar and interacts with water.
§ Secondary structure elements are necessary to satisfy the hydrogen bonding capabilities of the main chain atoms in the interior of the protein.
§ The native, folded state is the lowest free-energy state.
§ The stability of the folded structure is determined by non-covalent interactions (and, for some proteins, disulfide bonds):
   - Interactions between charged residues (salt bridges)
   - Hydrogen bonds
   - van der Waals interactions (tight packing)
   - Hydrophobic interactions

PROTEIN FOLDING IN THE CYTOSOL (CHAPERONE ASSISTED FOLDING)

CROWDING OF THE CYTOSOL
Figure (scale bar 0.1 µm): 3D reconstruction of the cell cytoplasm by cryo-electron microscopy (volume of about 800x800x70 nm, W. Baumeister, Germany). Actin filaments (orange-red), ribosomes and other macromolecular assemblies (green) and membranes (blue).
The macromolecules occupy a significant fraction (typically 20–30%) of the total volume. This fraction (the excluded volume) is thus physically unavailable to other molecules.
The excluded volume effects resulting from the highly crowded nature of the cytosol (about 300 to 400 mg/ml of proteins and other macromolecules in Escherichia coli) are predicted to substantially enhance the aggregation of non-native protein chains by increasing their effective concentrations.

PROTEIN FOLDING WITH CHAPERONES
[Diagram labels: Unfolded, Intermediate, Aggregate, Chaperones, Folded]
The aggregation process irreversibly removes proteins from their productive folding pathways, and must be prevented in vivo by molecular chaperones.
A certain level of protein aggregation does occur in cells despite the presence of an elaborate chaperone machinery and, in special cases, can lead to the formation of structured, fibrillar aggregates, known as amyloid, that are associated with diseases such as Alzheimer’s or Parkinson’s disease (see Protein Misfolding lecture). Compared with refolding in dilute solution, the tendency of nonnative states to aggregate in the cell is expected to be sharply increased as a result of the high local concentration of nascent chains in polyribosomes and the added effect of macromolecular crowding.

ROLE OF MOLECULAR CHAPERONES
Situations that favor premature intra- or inter-molecular interactions, and hence misfolding and aggregation:
§ Biosynthesis: high concentration of nascent “unfolded” polypeptides
§ Transport: traffic of unfolded proteins within the cytosol/organelles
§ Assembly: high concentration of unfolded monomers
§ Stress: unfolding of proteins in extreme conditions
Molecular chaperones:
§ Bind to and stabilize otherwise unstable or aggregation-prone intermediates
§ Ensure the correct fate of proteins in vivo: folding, assembly or transport through membranes

HOW DO CHAPERONES PREVENT MISFOLDING AND AGGREGATION?
§ The cellular chaperone machinery counteracts misfolding and aggregation of non-native proteins, both during de novo folding and transport, and under conditions of stress, such as high temperatures, when some native proteins unfold.
§ Many chaperones, though constitutively expressed, are synthesized at greatly increased levels under stress conditions and are classified as stress proteins or heat-shock proteins (Hsps).
§ In general, all these chaperones recognize hydrophobic residues and/or unstructured backbone regions in their substrates, i.e., structural features typically exposed by nonnative proteins but normally buried upon completion of folding (native proteins).
THE MAJOR CLASSES OF CYTOSOLIC MOLECULAR CHAPERONES

THE MOLECULAR CHAPERONE PANOPLY
[Figure labels: Unfolded, Intermediate, Aggregate, Folded; HSP25-30; HSP60 with HSP10; HSP70 with J and E co-factors; HSP90; HSP100; ATP]

HSP70 Protein Family

CONSERVATION OF THE HSP70 SEQUENCE AND STRUCTURE FROM BACTERIA TO MAN
Organism    Protein   Location
Bacteria    DnaK      Cytosol
Yeast       Ssa       Cytosol
            Ssb       Cytosol
            Ssc       Mitochondria
            Kar2      ER
Mammals     HSC70     Cyto/nucleus
            HSP70     Cyto/nucleus
            HSP70     Mitochondria
            BiP       ER

3D STRUCTURE OF THE HSP70s
[Figure labels: N-terminal domain (residues 1-384, ATP bound); Linker; C-terminal domain (residues 390-607) with the PBD (Peptide Binding Domain) and the Lid; bound peptide / unfolded protein]

HSP70s DYNAMICS
[Figure labels: Unfolded protein, N-terminal domain, Linker, C-terminal domain, PBD, Lid]
§ ADP state (high affinity): ADP is bound to the N-terminal domain and the C-terminal domain is in the CLOSED form. Movement (closure) of the lid relative to the PBD locks the unfolded protein in the binding site.
§ ATP state (low affinity): ATP is bound to the N-terminal domain and the C-terminal domain is in the OPEN form. Movement (opening) of the lid relative to the PBD allows the protein to be released from the binding site (in a folded or unfolded state).

HSP70s FUNCTIONAL CYCLE (ANALOGOUS TO THAT OF G PROTEINS)
§ ATP state: open form, low affinity for unfolded proteins
§ ADP state: closed form, high affinity for unfolded proteins
§ ATPase Activating Factors: DnaJ, Hsp40..
§ Nucleotide Exchange Factors: GrpE, Bag…

HSP70 CHAPERONES IN PROTEIN FOLDING
Pathway by which chaperones of the eukaryotic Hsp70 class (with Hsp40 and NEFs) bind and release polypeptides.
- The chaperones do not actively promote the folding of the substrate protein, but instead prevent aggregation of unfolded peptides.
- The unfolded or partly folded proteins bind first to the open, ATP-bound form of Hsp70.
Hsp40 then interacts with this complex and triggers ATP hydrolysis, which produces the closed form of the complex, in which the domains colored orange and yellow come together like the two parts of a jaw, trapping parts of the unfolded protein inside. Dissociation of ADP and recycling of Hsp70 require interaction with another type of protein called a nucleotide-exchange factor (NEF).
- For a population of polypeptide molecules, some fraction of the molecules released after the transient binding of partially folded proteins by Hsp70 will take up the native conformation. The remainder are quickly rebound by Hsp70 or diverted to the chaperone

HSP70s HIGHLIGHTS
§ Highly conserved protein family
§ Two-domain proteins that couple ATP hydrolysis and ADP-to-ATP exchange to unfolded protein binding and release
§ Bind to hydrophobic stretches of 5-7 residues in an extended (β-strand-like) conformation within unfolded proteins
§ Cooperate with JDPs and NEFs to couple multiple cycles of ATP hydrolysis and exchange to multiple cycles of unfolded protein binding and release until correct folding is complete

HSP60 (GroEL) Protein Family

CONSERVATION OF HSP60-HSP10 FAMILY
Organism    Protein        Location
Bacteria    GroEL-GroES    Cytosol
Yeast       TCP1           Cytosol
            HSP60-10       Mitochondria
Mammals     TRiC-GimC      Cytosol
            HSP60-10       Mitochondria

3D-STRUCTURE OF HSP60 (GroEL)-HSP10 (GroES)
[Figure labels: GroES (Hsp10); GroEL (Hsp60)-GroES (Hsp10)]

3D-ORGANIZATION OF HSP60 (GroEL)-HSP10 (GroES)
§ GroES: 1 ring of 7 x 10 kD
§ GroEL: 2 rings (cis and trans) of 7 x 60 kD each

HSP60 (GroEL)-HSP10 (GroES) CENTRAL CAVITY
Left: The equatorial (pink), intermediate (yellow), and apical (dark blue) domains of one subunit each in the cis and trans rings of GroEL, and one subunit of GroES (red).
Right: The accessible surface of the central cavity of the GroEL-GroES complex. Polar and charged side-chain atoms (blue); hydrophobic side-chain atoms (yellow); backbone atoms (white); and solvent-excluded surfaces at subunit interfaces (gray).
Note the hydrophobic surface (yellow) in the open region (trans) of the GroEL chaperone, which binds the hydrophobic regions of an unfolded protein substrate.

HSP60 (GroEL)-HSP10 (GroES) IN PROTEIN FOLDING
[Figure labels: unfolded protein binding; folding triggered; discharge triggered; protein folding or rebinding]
GroEL is a two-chamber system whose chambers work in alternation: when one chamber is occupied by the unfolded protein substrate, the other one is free. The chambers also act in concert with regard to ATP binding and release: when one is in the ATP form (or free), the other one is in the ADP form (there is negative cooperativity in ATP binding and release between the two rings).

HSP60 (GroEL)-HSP10 (GroES) IN PROTEIN FOLDING
[Figure labels: unfolded protein binding; folding triggered; discharge triggered; protein folding or rebinding]
Simplified reaction of protein folding in the GroEL-GroES cage: I, folding intermediate bound by the apical domains of Hsp60 (GroEL). N, native protein folded inside the cage. For a typical GroEL substrate, multiple rounds of GroEL-GroES action are required for folding. Both I and N exit the cage upon GroES (Hsp10) dissociation, and accumulate after a single reaction cycle. I is then rapidly re-bound by GroEL.

HSP60 (GroEL)-HSP10 (GroES) IN PROTEIN FOLDING
1. The hydrophobic amino acids in the open chamber of GroEL (the chamber that does not have GroES bound to it, or trans chamber) capture an unfolded substrate protein.
2. ATP binds to the loaded chamber (trans chamber) and induces a conformational change leading to the binding of GroES. The unfolded protein substrate is now encapsulated in the central cavity.
3. The chamber to which GroES and the unfolded protein substrate are bound is now in its "cis" conformation. After ATP hydrolysis, folding occurs in this cis chamber (which is now in the ADP state).
4. ATP hydrolysis to ADP in the cis chamber leads to binding of ATP to the other (unoccupied) trans chamber.
This second ATP binding event, in the trans chamber, releases GroES and the folded substrate from the complex.
5. The two-chamber chaperone system is now ready for a new cycle.

HSP60-HSP10 HIGHLIGHTS
§ Highly conserved protein family
§ Double-ring complexes, enclosing a central cavity, that couple ATP hydrolysis and ADP-to-ATP exchange to unfolded protein binding and release
§ Bind and encapsulate (in a protected environment) non-native polypeptides of up to 60 kD through their exposed hydrophobic regions
§ Cooperate with HSP10 to couple multiple cycles of ATP hydrolysis and exchange to multiple cycles of unfolded protein binding and release until correct folding is complete

DE-NOVO PROTEIN FOLDING IN THE E. COLI CYTOSOL
(similar protein folding chaperone systems are found in mammals)
[Figure labels: mRNA; Hsp70, JDP, NEF; GroEL-GroES]
§ Chaperone-independent folding: small & fast folding proteins
§ Hsp70-dependent folding: aggregation-prone proteins
§ GroEL-GroES-dependent folding: complex proteins with discontinuous surfaces

CHAPERONE ASSISTED PROTEIN FOLDING (VIDEO)
(For the GroEL-GroES mechanism, watch between 17 min and 27 min)
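The lecture notes that a typical substrate needs multiple rounds of chaperone binding and release, because only a fraction of the released molecules reach the native state in each round and the rest are rebound. A minimal sketch of that bookkeeping follows. It assumes, purely for illustration, that every cycle converts the same fixed fraction of the remaining non-native chains and that nothing is lost to aggregation; the per-cycle fraction `p_fold_per_cycle` is a hypothetical parameter, not a measured value.

```python
def native_fraction_after_cycles(p_fold_per_cycle, n_cycles):
    """Fraction of chains that are native after n binding/release cycles.

    Assumption (illustrative only): each cycle folds a fixed fraction p of
    the chains that are still non-native; the rest are rebound and retried.
    """
    if not 0.0 <= p_fold_per_cycle <= 1.0:
        raise ValueError("p_fold_per_cycle must be between 0 and 1")
    if n_cycles < 0:
        raise ValueError("n_cycles must be >= 0")
    return 1.0 - (1.0 - p_fold_per_cycle) ** n_cycles


def cycles_needed(p_fold_per_cycle, target_fraction=0.95):
    """Smallest number of cycles that reaches the target native fraction."""
    if not 0.0 < p_fold_per_cycle <= 1.0:
        raise ValueError("p_fold_per_cycle must be in (0, 1]")
    if not 0.0 <= target_fraction < 1.0:
        raise ValueError("target_fraction must be in [0, 1)")
    n = 0
    while native_fraction_after_cycles(p_fold_per_cycle, n) < target_fraction:
        n += 1
    return n


if __name__ == "__main__":
    for p in (0.05, 0.1, 0.5):  # hypothetical per-cycle folding fractions
        print(f"p = {p:.2f}: {cycles_needed(p)} cycles to reach 95% native")
```

Under these assumptions the non-native population shrinks geometrically, so a substrate that folds poorly in any single round can still be folded almost completely given enough ATP-driven cycles.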
