Protein Dynamics, Folding, and Disorder PDF

Topic 05: Protein dynamics, folding and disorder
================================================

![](media/image4.jpeg)

- While protein structures are commonly depicted as fixed and static, in reality proteins are in ceaseless motion
- Essentially all atoms undergo fast, local motions of relatively small amplitude
- But proteins are also capable of rarer and slower large-scale motions
- Protein dynamics is a key aspect of protein function

Protein dynamics: local
=======================

![](media/image8.jpeg)

- Individual chemical bonds have harmonic vibration modes (rotational, translational)
- These motions occur on \~10^-13^ s time scales
- Individual groups such as side chains may convert from one stable conformation to another by rotation around C-C bonds
- This can occur on microsecond time scales if residues are fully exposed, but will be much slower (milliseconds) if the group is buried and hindered by its neighbours

![](media/image14.jpeg)![](media/image17.jpeg)

timescale

- movement of individual domains
- folding and more local disorder/order transitions
- local secondary structure reorganization
- binding

Protein structural states
=========================

- While proteins are dynamic, they tend to spend most of their time close to a small number of specific conformations
- These distinct states are often preferentially stabilized by binding specific small molecules (e.g. an enzyme substrate) or by specific covalent modification (e.g. phosphorylation stabilizes the protein complex)
- A full understanding of the structure-function relationships of a protein requires understanding the different structural states it can populate

### e.g. Trypsin dynamics explores six microstates

- In molecular dynamics simulations, trypsin alternates between six distinct states
- These differ in loop conformation and substrate affinity
- States interconvert on a 1 - 100
- The red and green states are the most populated low and high affinity states
- The slow interconversion of these two states dominates the activation/inactivation dynamics of trypsin

Protein folding
===============

- A protein is considered "folded" when it is in a compact, stable, functional conformation; this is the conformation generally described by structural studies
- Once folded, the protein spends most of its time in one of a relatively small set of closely related conformations
- Unfolded proteins have few of the interactions seen in the folded structure, and are significantly less stable than the folded structure
- When unfolded, the protein is not constrained by native interactions and can take on a huge number of conformations, most of which are nothing like the native structure

Studying folding - biochemically
================================

- Chemical denaturants can also be used, which destabilize the structure by competing for favourable interactions
- E.g. urea and guanidine (at molar concentrations) compete for these interactions
- Strong acids and bases destabilize by causing electrostatic repulsion
- By rapidly removing the denaturant (e.g. by dilution) you can induce refolding and monitor it by various biophysical means

#### Interactions occur in both the unfolded and folded states

- Intuitively, you might think that making a new H-bond between two protein atoms during folding would have a net stabilizing effect
- However, these atoms would have been making H-bonds with water in solution - you need to break these solvent H-bonds
- But then you also gain the energy from new water-water H-bonds formed once those waters are released
- The net result might only be slightly stabilizing
- However, burying these protein groups without making H-bonds will definitely be destabilizing

Proteins are only marginally stable
===================================

- You can formulate the energetics of folding in terms of its enthalpic and entropic contributions
- ΔG~fold~ = ΔH~fold~ - TΔS~fold~
- ΔH~fold~ is net favourable, and on the order of 1000s of kJ/mol (scales with protein size)
- The TΔS~fold~ term is also on the order of 1000s of kJ/mol, but is net destabilizing (also scales with protein size)
- This is mostly due to loss of conformational entropy
- Overall ΔG~fold~ is \~-20 to -60 kJ/mol -- favourable to folding, but only slightly so
- The overall stabilizing energy of a protein is therefore small compared to the energy released by the interactions formed

e.g. Ribonuclease folding energetics
------------------------------------

- The entropy of folding is strongly unfavourable
- This largely reflects the cost of forcing the protein to adopt a single compact conformation instead of exploring a huge range of extended conformations (loss of conformational entropy)
- The enthalpy of folding is strongly favourable
- The strongly favourable enthalpic contributions outweigh the strongly unfavourable entropic contributions only slightly (by \~2 H-bonds' worth of energy)

Marginal stability is functional
================================

- If proteins were too stable, they would only be able to adopt their lowest energy state (too much energy would be required to do anything else!)
- Proteins need a certain amount of flexibility to function
- They adapt themselves to ligands, move between steps in a catalytic cycle, etc.
- Proteins that need to be modified by enzymes (e.g. in protein maturation or signaling) are modified while the target residues are unstructured
- If proteins are too stable, they lack the dynamics necessary to function
- Overly stable proteins are also challenging to unfold prior to degradation when they are no longer needed

Levinthal's paradox
===================

- Consider a small, 100 residue protein
- Assume that the backbone of each residue can take only 3 conformations (α, β or L) (a vast oversimplification)
- Then there are 3^100^ \~ 10^47^ possible conformations
- If it takes 10^-13^ s to convert from one structure to another (optimistic), an exhaustive search would take \~10^27^ years!
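The exhaustive-search estimate above (often called Levinthal's paradox) can be checked with a few lines of arithmetic. This is only a sketch of the back-of-envelope calculation: the function name and the choice of a 365.25-day year are illustrative assumptions, while the 100 residues, 3 conformations per residue and 10^-13^ s per step come from the bullets above.

```python
import math

# Assumption: one Julian year (365.25 days) expressed in seconds.
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def levinthal_estimate(n_residues=100, states_per_residue=3, seconds_per_step=1e-13):
    """Return (log10 of the number of conformations, log10 of the search time in years).

    Logarithms are used throughout so the huge numbers stay easy to read.
    """
    log10_conformations = n_residues * math.log10(states_per_residue)
    log10_seconds = log10_conformations + math.log10(seconds_per_step)
    log10_years = log10_seconds - math.log10(SECONDS_PER_YEAR)
    return log10_conformations, log10_years


log10_conf, log10_years = levinthal_estimate()
print(f"conformations     ~ 10^{log10_conf:.1f}")
print(f"exhaustive search ~ 10^{log10_years:.1f} years")
```

Under these assumptions the result is about 10^47.7^ conformations and about 10^27.2^ years, in line with the orders of magnitude quoted above.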
- Proteins fold much faster than an exhaustive search would allow, so they cannot be checking every possible configuration
- So proteins must fold in a more "directed" fashion where almost all theoretical possibilities are systematically avoided

### Anfinsen's Experiment (1973)

![](media/image53.jpeg)

- Treating purified ribonuclease with 8 M urea unfolds the protein, making it no longer catalytically active
- Reducing conditions then break the four disulfide bonds that lock the protein into its native conformation

### Anfinsen's Experiment

![](media/image55.jpeg)

- Moving the protein to oxidizing conditions allows disulfide bonds to reform
- If this is done under denaturing conditions (8 M urea), the protein remains largely inactive even if the urea is later removed
- You end up trapping a collection of 105 possible disulfide connections (eight cysteines can pair up in 7 × 5 × 3 = 105 ways), only one of which is native (and active)

### Anfinsen's Experiment

![](media/image59.jpeg)

- Adding a trace of reducing agent to trapped ribonuclease allows disulfides to break and reform
- In the absence of urea, other residues can interact, pushing the overall structure towards the native state
- The result is that the protein rearranges until it gets locked (by the right disulfide bonds) into its native structure
- The protein becomes fully active

### Conclusions from Anfinsen's experiment

- Treating purified ribonuclease with urea, then introducing oxidizing conditions, allows one to trap disulfide bond patterns different from those of the native protein
- The diversity of disulfide bonds trapped implies that there is no single preferred structure under these conditions
- This corresponds to the *unfolded state*
- When the urea is removed, the protein has the ability to refold in the absence of any other cellular components
- This demonstrates that the *protein sequence alone* contains the information that encodes the final structure of the protein
- Proteins defy Levinthal and act as little automata that spontaneously fold themselves!

Protein sequence directs structure
==================================

- Proteins are made up of 20 amino acids, each with unique physical properties: charge, size, pattern of H-bond donors and acceptors, hydrophobicity, etc.
- Given correct conditions, any protein is able to find its correct conformation without any additional input
- In essence, the protein's biophysical properties (encoded within its primary structure) *compel it* to adopt the appropriate conformation
- With the right amino acid sequence, physics folds the protein for you
- In principle, one should be able to predict a protein's structure from its amino acid sequence
- This is the protein structure prediction problem (strictly, the "inverse folding problem" is the reverse task: finding a sequence that folds into a given structure)
- This turns out to be a very difficult problem (but now largely solved!)

Melting temperatures
====================

![](media/image75.jpeg)

- If you monitor structure (e.g. by circular dichroism) as a function of increasing denaturant or temperature, unfolding is seen to occur in a nonlinear fashion
- Small increases barely affect the fraction of folded protein
- At some critical point, almost all of the protein will unfold over a narrow range
- Different proteins have different melting temperatures (or denaturant concentrations)
- T~m~ is the temperature at which half the protein is folded
- T~m~ will increase if a protein is bound to a ligand (which makes a T~m~ shift an easy experimental screen for novel ligands)

### Protein folding is highly co-operative

- The sigmoidal shape of curves of percent folding vs. \[denaturant\] or T reflects a highly co-operative folding process
- The structure becomes stable only when all residues are in place and locked into the correct configuration
- Molecules are **highly likely** to shift from being unfolded to folded **once conditions favour folding over unfolding**
- One consequence is that, in general, isolated fragments of proteins (less than a domain) are not stable and will not fold - they need tertiary interactions to stabilize the native conformation

Two state model of protein folding
----------------------------------

- The two-state folding model posits that a protein is either completely folded, or completely unfolded
- Under denaturing conditions, the protein is essentially all unfolded (in an enormous variety of possible structures)
- When conditions become right, the protein folds to its native structure
- By definition, there is no intermediate structure between these two states which the protein adopts to any significant degree
- For many small proteins, folding can be described by this model

B&T Fig 6.1

### Folding funnels model folding energetics

- The "folding funnel" is an abstract representation of the energy of a protein as a function of its conformation
- For two-state proteins, this landscape is a simple funnel with the native state as the lowest energy state
- Unfolded proteins are at the top of the funnel, exploring many conformations all at high energy
- The process of folding is one of reducing this
- The protein folds by a rapid series of changes, each of which makes it more native-like
- The protein is effectively pushed towards the native state by gains in favourable, native-like interactions

The drivers of protein folding
==============================

- *Hydrophobic collapse* is a key driving force for protein folding
- The energetic cost of not forming hydrogen bonds drives the backbone into helix/sheet conformation wherever it is buried
- The choice of secondary structure favoured depends on the side chains in the region (e.g. MALEK residues favour helices)
- The structure remains fluid and mobile until the side chains are able to lock into the correct packing arrangement
- Disulfide bonds or metal ion binding (where they occur) help lock in the correct conformation once folded

Protein folding time scale
--------------------------

- Hydrophobic collapse and formation of local secondary structure happen very fast (\~10 ns)
- Very small proteins can fold in \~1 μs
- Slowly folding proteins can take 100s of milliseconds to fold

Secondary structure forms quickly
=================================

![](media/image94.jpeg)

- The detail is from a simulation of a very fast folding protein, but experiments corroborate it
- Collapse of a hydrophobic core and formation of some secondary structure elements happen very quickly - within nanoseconds
- Secondary structure element formation is driven by local interactions within that sequence
- These secondary structure elements resemble, but are not identical to, native secondary structure, and are dynamic (extending, shrinking at each end, reorganizing)
- The 3D organization of these elements (topology) is largely incorrect (since they start at random)
- The amino acid contacts between secondary structure (ss) elements all differ from those in the native structure

Finding the correct topology is slow
====================================

- Over time, ss elements rearrange and explore different topologies
- Very few of the early amino acid contacts resemble the native structure
- Eventually ss elements happen upon a near-correct arrangement
- At this point further improvements become co-operative, and residues rapidly lock together into their correct packing arrangements
- This drives the stabilization of native topology and ss elements

Three state protein folding
---------------------------

![](media/image99.jpeg)

- Some proteins have one or more conformations that they typically adopt for significant time before adopting the native structure
- A partially folded molten globule state is found on the folding pathway of many proteins
- Other proteins form relatively long lived (millisecond), structurally well-defined intermediates as they fold

B&T Fig. 6.2

Folding funnels with intermediates
----------------------------------

- Folding intermediates can occur where the protein forms a structure similar to the native structure, but slightly less stable
- Moving towards the native-like state requires undergoing a transition that increases the energy
- This energetic barrier temporarily traps the protein in these states
- The protein will therefore spend significant time in a structural state different from the native state
- Proteins that fold with intermediates tend to fold more slowly

Molten globules
===============

- Some proteins can be observed experimentally to pass through a "molten globule" intermediate state when folding
- Molten globules are short-lived structural states that have the hydrophobic residues largely buried, and much of the native secondary structure formed
- However, not all secondary structural elements are properly placed relative to one another, and the side chains are not properly packed and locked into place
- The protein core therefore remains dynamic
- Molten globules can be understood as the protein persisting, long enough to be observed experimentally, in the state before the final, co-operative folding steps occur

### Folding intermediates can have a well-defined structure

![](media/image119.jpeg)

- This protein folds through a discrete intermediate
- This folding intermediate is close enough in energy to the native structure that at any given time \~2 % of the protein is in this state
- Sophisticated NMR experiments were used to determine the structure of this intermediate, and show that it differs from the native structure

#### Knotting requires non-native folding intermediates

![](media/image126.jpeg)

- Some proteins have a knotted topology
- Forming a knot requires that part of the chain is threaded through a loop in another part of the chain
- During this process, any interactions made by the threading portion necessarily differ from those in the final, native structure
- The longer the sequence that needs to thread through, the longer this takes
- In consequence, only a small minority of proteins form knots

Protein folding in the cell
===========================

- In cells, proteins fold as they are synthesized by the ribosome
- In a multi-domain protein, the N-terminal domains can fold before the C-terminal domains are synthesized
- *In vitro*, proteins fold in dilute solution where they very rarely collide
- However, in the cell, the presence of exposed hydrophobic patches of other proteins (folded and unfolded) at high concentrations makes *in vivo* folding potentially hazardous
- So, while the drivers for folding are intrinsic to the protein, specialized proteins are needed in the cell to accelerate folding and to help other proteins escape from misfolding traps

Protein disulfide isomerases
============================

- Proteins folding in oxidizing environments can form non-native disulfide bonds
- These tend to trap the protein and prevent it from properly folding (as in the Anfinsen experiment)
- Disulfide isomerases are proteins with reactive disulfide groups
- They can reversibly oxidize their own internal disulfides so as to reduce disulfides in other proteins
- With the incorrect disulfide broken, the protein is unlocked from the non-native conformation
- It then has a chance to fold into the native state
- This is then locked in by the disulfide isomerase forming the correct disulfide bond
- Disulfide isomerases are generally associated

### *cis-trans* isomerization of proline can be rate limiting for folding

- While the *cis* and *trans* configurations of proline have similar *net stability*, there is a large energetic barrier that must be overcome to interconvert the two conformations
- Proteins with one *cis*-peptide bond fold to a native-like state with *trans* Pro in milliseconds, but it can take minutes to reach the fully native, active conformation
- With two *cis*-prolines, it can take hours or days
- *In vitro* these proteins fold very slowly
- *In vivo*, *cis-trans* isomerizations are catalyzed by enzymes, speeding folding

Prolyl peptide isomerases
-------------------------

![](media/image138.jpeg)

- Prolyl peptide isomerases catalyse *trans-cis* isomerization
- These enzymes accelerate the isomerization by a factor of \~10^6^
- They therefore allow proteins with one or more *cis* peptides to fold *in vivo* almost as fast as all-*trans* proteins

Molecular Chaperones
====================

- Molecular chaperones are proteins that in some way help promote the transition from non-native to native structure
- Chaperones come in a wide variety of folds but generally function by either
    - Passively binding exposed hydrophobic surfaces of unfolded proteins to prevent aggregation
    - Actively unfolding proteins in an ATP-dependent manner and then allowing them to refold
- There are multiple protein families that play these roles, with different strategies for targeting, stabilization and substrate release

Protein misfolding - amyloids
=============================

- While the native protein structure is selected for by evolution, there can coincidentally be very different conformations of the same sequence that can stably pack
- Packing multiple copies of a protein region on itself in an extended β-sandwich-like conformation can be very stable if side chains interact favourably
- Such sequences can spontaneously refold to form extended fibrils
- These fibrils will recruit further copies
- Fibrils will template more fibrils, resulting in a pathological phenotype
- 50 different human diseases, including Alzheimer's

![](media/image150.jpeg)

- The membrane-adjacent region, cleaved by secretase, is prone to self-associating
- It can form a variety of fibril structures based on self-association
- These are pathological, causing Alzheimer's disease
- Certain mutations can promote forming these fibrils

Gremer et al Science 2017

Intrinsically unstructured proteins
===================================

![](media/image157.jpeg)

- Protein folding generally depends on hydrophobic collapse
- But what about proteins (or protein regions) with too few hydrophobic residues to drive collapse?
- Such proteins can stay in an unfolded state indefinitely
- These proteins are known as intrinsically disordered (or unstructured) proteins
- Despite lacking structure, these proteins play essential biological roles

Intrinsically unstructured proteins
===================================

- Intrinsically unstructured proteins (IUPs) are proteins whose function *depends* on the fact that they cannot fold spontaneously into a well-organized, globular structure
- They generally do not have enough non-polar residues to form a stable hydrophobic core
- Whole proteins can be unstructured
- More commonly, significant regions (30 aa+) can be disordered between more structured domains
- **About 30 % of eukaryotic proteins are either wholly disordered or have significant disordered domains**
- Disordered proteins are much rarer in prokaryotes (3 %)

Functional roles of IUPs
========================

- In many proteins IUP domains act as linkers between domains, allowing them to rearrange flexibly while keeping them tethered
- The length of the linker can control the affinity and dynamics of interactions
- IUPs can act as spacers; the entropic cost of restricting their conformational space allows them to push back like weak springs
- IUPs also have important roles in chaperones and stress response proteins
- The most critical roles are in protein interactions and membraneless organelles

- In Fatty Acyl Synthase, the acyl carrier protein (ACP) is a domain
- The fatty acid (FA) is attached to this ACP via a phosphopantetheine group
- ACP carries the substrate to each catalytic site in succession

Ion channel inactivation
========================

- Ion channels (e.g. the K+ channel) are quickly inactivated after firing
- This ensures that they produce a short burst of signal
- The signal is terminated by an inbuilt inactivation domain
- This domain recognizes and specifically blocks the open channel

Ball and chain model of ion channel inactivation
------------------------------------------------

- The inactivation domain is formed by the N-terminus of the molecule
- Inactivation only requires a short (\~20 a.a.) peptide that binds within the ion channel
- This peptide also works as a free peptide, but only at higher concentrations
- Biologically this peptide is at the end of a longer (\~50 a.a.) linker sequence that connects it to the transmembrane domains
- The linker is experimentally demonstrated to be fully disordered
- The linker tethers the inactivation peptide near its target
- The linker length affects the affinity (since the linker length determines the local concentration of the inactivation peptide)
- The time needed for the inactivator to find its target also depends on linker length
- This is because the length of the linker dictates the volume in which the peptide can diffuse, and therefore the time it takes to find the binding site
- This system allows gate closure timing (a critical parameter for ion channels) to be tuned simply by altering linker chain length

IUPs can drive phase separation
===============================

- A special sub-group of disordered proteins can form multiple favourable, but weak, self-associations
- Such proteins will, under the right conditions, form separated droplets within the solution
- This process is termed phase separation
- The resulting **condensate** will tend to recruit other molecules (e.g. proteins, nucleic acids) that themselves interact favourably with the condensate

Interactions driving liquid condensates
---------------------------------------

- A variety of different interactions have been found to drive phase separation
- These can include hydrophobic interactions, cation-pi interactions, or favourable electrostatics
- Short interaction-driving sequences are often interspersed with disordered regions
- Sequence matters in these proteins -- scrambled versions do not condense
- Condensates can also involve specific interactions between protein domains, or between protein and RNA

Protein condensates
===================

- Covalent modification of these proteins can regulate (dis)assembly of the condensate
- Other proteins/nucleic acids with suitable traits also selectively partition into these localized regions and become concentrated there
- This can promote e.g. transcription or signaling events, or concentrate factors for molecular processing
- Examples of condensates include nucleoli, stress granules, and viral replication condensates (including COVID-19)
- Condensates are an extremely important principle for organizing cells, and directly impact many biological processes, especially signaling

### e.g. 1: Ddx4 forms phase separated bodies

![](media/image194.jpeg)![](media/image197.jpeg)

- Ddx4 is an IDP
- *In vitro* and in cells, it forms phase separated droplets
- Ddx4 interacts with copies of itself via distinct regions of negatively and positively charged residues
- Ddx4 droplets preferentially sequester ssDNA while excluding dsDNA
- Ddx4 self-association can be blocked by methylating the Arg residues (which interferes with cation-pi interactions)

Nott et al. Mol. Cell 2015

![](media/image199.jpeg)

#### e.g. 2: FG repeats form a gel plug in the nuclear pore complex

- The nuclear pore complex (NPC) is a very large, multiprotein complex (\~600 Å inner diameter) that provides the only access to the nucleus
- A subset of NPC proteins have extended domains containing a repeated sequence with a Phe-Gly (FG) motif every \~15 a.a.
- The FG motifs self-interact, and these repeat proteins form an agarose-like gel when expressed recombinantly
- These proteins form a self-interacting plug that prevents nuclear-excluded proteins from passing through the pore

### IUPs that undergo disorder -\> order transitions to bind

- IUP regions are very common in signaling proteins
- An already-folded partner provides a stable platform to bind
- The IUP can wrap around its cognate protein target
- This allows essentially every amino acid of the IUP to interact, achieving nanomolar affinity with only 20 or so amino acids
- Binding is often dependent on a specific modification (e.g. phosphorylation), allowing transmission of a signal
- E.g. the TAZ-1 protein (pink) becomes structured upon binding

#### Binding an IUP to a pre-ordered domain allows the proteins to utilize much more of the available binding surface

- IUPs can *wrap around* their interacting partner, utilizing much more of the available binding surface
- This implies that the interacting partner protein need not be as large as if it were binding another globular protein
- The IUP itself need only be a couple of tens of amino acids; it does not form a hydrophobic core, and therefore most of the amino acids are available to interact
- The interacting proteins are smaller, allowing the overall genome to be considerably more compact, with less crowding in the cell

### HIF-1α (IUP) binding to TAZ-1 (preordered)

- HIF-1α wraps around the TAZ-1 domain of CBP
- HIF-1α forms 3 α-helices that are not present in the uncomplexed protein
- The complex buries a large amount of surface and gives a very tight (K~D~ = 7 nM) and specific interaction

Using IUPs allows highly specific but transient interactions
============================================================

- The tighter the interaction between proteins, the more specific it generally is
- However, very tight interactions result in very slow dissociation of the interacting partners
- How then to get a specific but transient interaction?
- You do so by diverting energy gained from the interaction to folding one of the partners

### The energetics of HIF-1α TAZ-1 binding

![](media/image215.jpeg)

A single scaffold protein can recognize multiple dissimilar IUP targets
-----------------------------------------------------------------------

- The CITED2 binding site on TAZ1 overlaps with the HIF-1α binding site
- Therefore, CITED2 competes with HIF-1α, and antagonizes its action
- Note that while CITED2 and HIF-1α bind the same site, they use different structural motifs to do so
- Also, CITED2 and HIF-1α bind with their termini in different locations
- CITED2 binds TAZ1 \~33x more tightly than HIF-1α, and can therefore compete off HIF-1α to shut down the hypoxic response
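
The affinities quoted for the TAZ1 complexes can be put on an energy scale using the standard relation ΔG° = RT ln K~D~. The sketch below is illustrative only: the temperature of 298.15 K and the 1 M standard state are assumptions not stated in these notes, and the function names are made up for the example. The K~D~ of 7 nM and the \~33x affinity ratio are the values given above.

```python
import math

R = 8.314462618  # gas constant, J/(mol*K)


def binding_free_energy_kj(kd_molar, temperature_k=298.15):
    """Standard binding free energy in kJ/mol from a dissociation constant.

    Assumes a 1 M standard state; tighter binding (smaller K_D) gives a more negative value.
    """
    return R * temperature_k * math.log(kd_molar) / 1000.0


def ddg_from_affinity_ratio_kj(fold_tighter, temperature_k=298.15):
    """Extra binding free energy in kJ/mol for a ligand that binds `fold_tighter` times more tightly."""
    return -R * temperature_k * math.log(fold_tighter) / 1000.0


dg_hif = binding_free_energy_kj(7e-9)      # HIF-1α : TAZ1, K_D = 7 nM
ddg_cited2 = ddg_from_affinity_ratio_kj(33)  # CITED2 binds ~33x more tightly
dg_cited2 = dg_hif + ddg_cited2

print(f"HIF-1α binding:   {dg_hif:.1f} kJ/mol")
print(f"CITED2 advantage: {ddg_cited2:.1f} kJ/mol")
print(f"CITED2 binding:   {dg_cited2:.1f} kJ/mol")
```

Under these assumptions the HIF-1α complex is worth roughly -47 kJ/mol, and the 33-fold advantage of CITED2 corresponds to only about -9 kJ/mol more, which is enough to displace HIF-1α at comparable concentrations.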
