Chapter 3: The Atomic Structure of Proteins

CHAPTER 3 THE ATOMIC STRUCTURE OF PROTEINS
Proteins are the most structurally complex and functionally sophisticated molecules known. Amino acids are linked together by covalent peptide bonds. Polypeptide backbone: the repeating sequence of atoms along the core of the polypeptide chain. Amino acid side chains: attached to the polypeptide backbone and not involved in the peptide bond. The bond angles in a polypeptide chain are limited by steric restrictions: atoms behave as hard spheres, so no two atoms can overlap. Noncovalent bonds determine the folding of the protein chain. The 3-D structure is determined by the amino acid sequence. The final folded shape of a protein is generally the one that minimizes its free energy. The conformation fluctuates constantly because of thermal energy, and it can also change when the protein interacts with another molecule. Molecular chaperones: assist proteins in folding into their correct shapes.
α helices and β sheets result from hydrogen bonding between the N-H and C=O groups of the polypeptide backbone, without involving the side chains. The cores of many proteins contain β sheets. β sheets are made of parallel or antiparallel chains held together by hydrogen bonds. An α helix forms when a single polypeptide chain twists around itself to form a cylinder, with an H-bond between every fourth peptide bond. α helices are abundant in transmembrane proteins: where the helix crosses the cell membrane, its nonpolar side chains face outward toward the lipids, while the hydrophilic polypeptide backbone is on the inside, hydrogen bonded to itself. α helices can wrap around each other to form a coiled coil, with the nonpolar side chains of the 2 helices on the inside of the coil. Examples: α-keratin in skin and myosin in muscle.
Primary structure: amino acid sequence. Secondary structure: α helices and β sheets. Tertiary structure: the full 3-D shape of a polypeptide chain. Quaternary structure: a complex of more than one polypeptide chain. Larger proteins are made of smaller protein domains.
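The hydrogen-bonding pattern of the α helix noted above can be sketched in a few lines of Python. This is an illustration only: the function name `helix_hbond_pairs` and the 1-based residue numbering are assumptions made for the example, and it models an ideal helix in which the C=O of residue i is hydrogen bonded to the N-H of residue i + 4.

```python
def helix_hbond_pairs(n_residues):
    """List (i, i + 4) residue pairs joined by backbone H-bonds in an ideal alpha helix.

    Residues are numbered from 1. The C=O of residue i is the acceptor and the
    N-H of residue i + 4 is the donor. Real helices may deviate at their ends.
    """
    return [(i, i + 4) for i in range(1, n_residues - 3)]


# Hypothetical 10-residue helix
pairs = helix_hbond_pairs(10)
for acceptor, donor in pairs:
    print(f"C=O of residue {acceptor} ... H-N of residue {donor}")
```

Because every pair involves only backbone atoms, the same pattern applies whatever the side chains are, which is why so many different sequences can form an α helix.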
The SH2 domain responds to cell signals by causing selected protein molecules to bind to each other. It is made up of 2 α helices and a three-stranded antiparallel β sheet. Src protein kinase is made up of SH2 and SH3 domains, which turn the kinase on and off, and a C-terminal portion responsible for the kinase catalytic activity. Domains are connected by intrinsically disordered regions (IDRs), lengths of polypeptide chain that act as flexible hinges between domains. Proteins are in constant motion, and they exploit this, as when a loop on the surface of a protein flips out to expose a binding site for a second molecule.
The number of possible polypeptide sequences is enormous, but only a small fraction folds into a stable conformation; these are the ones found in cells, because of natural selection. Gene duplication allows gene copies to evolve independently to perform new functions. The resulting proteins are grouped into protein families; related genes within one organism that arose by duplication are paralogs. Orthologs are the corresponding proteins, with the same function, in different organisms. Serine proteases are a family of protein-cleaving enzymes that includes chymotrypsin, trypsin, elastase, and several proteases involved in blood clotting. The amino acid sequences of their protease portions are similar (parts of them match) but not identical, and their 3-D shapes are very similar. However, each has a different function. The structures of protein family members are more conserved than their amino acid sequences, because many mutations change the sequence without changing the fold. There is a limited number of ways in which a protein domain can fold up.
Domain shuffling: accidental joining of DNA sequences that encode domains leads to multidomain proteins. New binding surfaces are created where 2 domains meet, and many of the functional sites where proteins bind small molecules are found there. Protein modules are mobile domains that have been moved from one protein to another during evolution, leading to different functions. Examples: SH2, SH3, and others with a stable core structure of β sheets, from which less ordered loops of polypeptide chain protrude. The loops contain binding sites.
Such modules are convenient because their loops allow new binding sites to evolve. Another feature is that they can be integrated easily into other proteins. Domains that have their N and C termini at opposite poles can undergo serial duplication, which places the duplicated domains in an in-line arrangement and forms extended structures; these are common in extracellular matrix molecules and in the extracellular portions of cell-surface receptors. Other domains, with N and C termini close to each other, are of the plug-in type: they insert into a loop region of a second protein. Some domains, such as the protein kinase domain, are found in many eukaryotes. The MHC antigen-recognition domain, by contrast, is found in humans but not in invertebrates such as worms and flies. Vertebrates inherited nearly all of their protein domains from invertebrates. Vertebrate proteins are more complex because of domain shuffling.
Binding site: a region on a protein's surface that can bind to another molecule through noncovalent bonds. Each polypeptide chain in a multi-chain protein is called a protein subunit. A protein made of 2 subunits is a dimer. Haemoglobin has 2 α-globin subunits and 2 β-globin subunits. Globular proteins: polypeptide chains that fold up into a compact shape like a ball. Examples: enzymes and actin. Actin is a globular protein that assembles into a long helical structure known as an actin filament, one of the major filament systems of the cytoskeleton. Helical structures are very common because identical subunits tend to fit together in only one way, the one that minimizes free energy, and repeating that same contact generates a helix.
Fibrous proteins span large distances. α-Keratin is a dimer of 2 long α helices wound into a coiled coil, with globular domains at each end that contain binding sites. Fibrous proteins are abundant outside the cell, where they help cells bind together to form tissues. Ex: collagen contains 3 polypeptide chains, each with the nonpolar amino acid glycine at every third position. The chains form a triple helix, and many collagen molecules pack into overlapping arrays.
α-Keratin, made of two α helices in a coiled coil, is an example of an intracellular fibrous protein. Proteins that are exposed to the outside of the cell are often stabilized by covalent cross-linkages, which tie together 2 amino acids in the same chain or join several polypeptide chains. Example: the covalent disulfide bond. Lysozyme has these linkages. They are made in the endoplasmic reticulum by an enzyme that links the -SH groups of two cysteine side chains. These bonds are rarely found in the cytosol, because its reducing environment converts S-S (disulfide) bonds back to -SH groups.
Benefits of using smaller subunits to make larger protein structures: 1- Only a small amount of genetic information is required. 2- Assembly and disassembly can be readily controlled because the subunits are held together by many bonds of relatively low energy. 3- Errors in synthesis can be more easily avoided, because faulty subunits can be excluded during assembly. Closed structures such as rings and spheres have additional stability because they increase the number of noncovalent bonds between subunits. Purified subunits can spontaneously assemble into the final structure, which means that the information for forming many complex macromolecular assemblies is contained in the subunits themselves. Examples: TMV, bacterial ribosome. For some self-assembling structures, a long core protein provides a scaffold that determines the extent of the final assembly. Assembly factors: act as templates for structures that cannot self-assemble; they do not take part in the final assembled structure. Collagen and insulin undergo proteolytic cleavage during their formation, so the mature proteins cannot reassemble spontaneously.
Amyloid fibrils: self-propagating β-sheet aggregates. As protein quality control declines with age, amyloid fibrils accumulate and can kill cells, as in Alzheimer’s and Parkinson’s disease. Prion diseases can spread from one organism to another when the second organism eats tissue containing the protein aggregate. PrP is normally located on the outer surface of the plasma membrane. It can convert to the infectious form, PrP*. Protein-only inheritance is observed in yeast. Several different types of amyloid fibrils can form from the same polypeptide chain.
As a result, several different strains of infectious particles can arise from the same polypeptide chain. Secretory vesicles: eukaryotic cells that store peptide and protein hormones package a high concentration of their cargo in the dense cores of secretory vesicles as amyloid fibrils, which dissolve to release the cargo. Bacteria use amyloid fibrils that project from the cell exterior to help bind to neighbouring bacteria in biofilms.
PROTEIN FUNCTION
A protein's interactions with other molecules determine its biological properties. Each protein molecule can bind to just one or a few of the many molecules it encounters. The ability of a protein to bind to a ligand depends on sets of weak noncovalent bonds. The amino acids of the binding site can belong to different portions of the polypeptide chain that are brought together when the protein folds. The atoms in the interior of the protein form the framework that gives the surface its contours and chemical properties. Even small changes to amino acids in the interior of a protein can change its 3-D shape.
The surface conformation of a protein determines its chemistry. The interaction of neighbouring parts of the polypeptide chain may restrict the access of water molecules to a binding site, so that the ligand forms tighter H-bonds with the protein in the absence of competing water. This happens because water molecules form a large hydrogen-bonded network, and it is energetically unfavourable for individual water molecules to break away from it and enter a crevice. The clustering of polar amino acid side chains can alter their reactivity. For instance, if negatively charged side chains are forced together, their affinity for a positive ion is increased. Furthermore, when amino acid side chains interact with each other, normally unreactive groups can become reactive. 2 slightly different conformations of the same protein may differ greatly in their chemistry, because the chemistry depends on which amino acid side chains are exposed and on their exact orientation.
Evolutionary tracing: identifies the sites in a protein domain that are most crucial to the domain’s function, such as binding sites.
The SH2 domain functions to link proteins together; it binds to a phosphorylated tyrosine side chain. The amino acids located at its binding site have been the slowest to change during evolution.
Surface-string binding: the surface of one protein contacts an extended loop of polypeptide chain (a "string") on a second protein. Ex: an SH2 domain binding a phosphorylated polypeptide loop; a protein kinase recognizing its substrate. Helix-helix: two α helices pair to form a coiled coil. Ex: transcription regulatory proteins. Surface-surface: precise matching of 2 surfaces.
When an antibody binds to an antigen, it inactivates it or marks it for destruction. Antibodies are Y-shaped molecules with 2 identical binding sites complementary to the antigen. The binding sites are formed from several loops of polypeptide chain that protrude from the ends of a pair of adjacent domains. Different genes generate a great diversity of binding sites by changing only the length and amino acid sequence of these loops.
Half of the binding sites will be occupied by ligand when the ligand concentration is 1/K, where K is the equilibrium constant for binding. The larger K, the stronger the binding. Binding is energetically favourable.
Increasing the substrate concentration increases the rate at which product is formed. Eventually the enzyme becomes saturated and the reaction rate reaches Vmax, which depends on how rapidly the enzyme can process the substrate. Vmax divided by the enzyme concentration is the turnover number: the number of substrate molecules processed per second per enzyme molecule. Km is the substrate concentration at which the rate is 1/2 Vmax. A low Km generally means tight substrate binding.
Enzymes are efficient because: 1- They increase the local concentration of substrates and hold them in the correct orientation. 2- They have a higher affinity for the transition state, the most unstable intermediate of the reaction, than for the substrate itself; this binding lowers the energy of the transition state and therefore the activation energy. Enzymes also contain precisely positioned atoms that alter the electron distributions in the atoms that participate in the making and breaking of covalent bonds. Enzymes can use acid and base catalysis simultaneously.
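The binding and rate relationships above can be checked numerically. The sketch below assumes the standard textbook forms, which the notes do not write out: simple one-site equilibrium binding, fraction bound = K[L] / (1 + K[L]), and the Michaelis-Menten rate law, rate = Vmax[S] / (Km + [S]). The function names and the numerical values are hypothetical and use arbitrary but consistent units.

```python
def fraction_bound(ligand_conc, K):
    """Fraction of binding sites occupied at equilibrium (simple one-site binding)."""
    return K * ligand_conc / (1.0 + K * ligand_conc)


def reaction_rate(substrate_conc, vmax, km):
    """Michaelis-Menten rate; approaches vmax as the enzyme becomes saturated."""
    return vmax * substrate_conc / (km + substrate_conc)


def turnover_number(vmax, enzyme_conc):
    """Substrate molecules processed per second per enzyme molecule."""
    return vmax / enzyme_conc


# Hypothetical values chosen only for illustration.
K = 2.0
half_occupied = fraction_bound(1.0 / K, K)                    # ligand concentration = 1/K
half_max_rate = reaction_rate(5.0, vmax=100.0, km=5.0)        # substrate concentration = Km
near_saturation = reaction_rate(5000.0, vmax=100.0, km=5.0)   # substrate far above Km
kcat = turnover_number(vmax=100.0, enzyme_conc=0.5)

print(half_occupied, half_max_rate, near_saturation, kcat)
```

Setting the ligand concentration to 1/K gives half occupancy, and setting the substrate concentration to Km gives half of Vmax, matching the two definitions in the notes.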
Lysozyme catalyses the cutting of polysaccharide chains in bacterial cell walls by hydrolysis. The reaction is energetically favourable, but there is an energy barrier. A colliding water molecule can break the bond only if the polysaccharide is distorted into the transition state. Without the enzyme this rarely happens under normal conditions, because the activation energy needed to reach the transition state is too high. Lysozyme distorts one of the sugars at the bond that is to be broken, and holds that bond close to 2 amino acids with acidic side chains, one of which transiently forms a covalent bond with the sugar. In general, an enzyme may act as a template that brings substrates together in the proper orientation, it may form interactions that rearrange the electrons in the substrate, or it may strain the shape of the substrate and so drive it toward the transition state. All of these are ways in which enzymes increase the rate of a reaction.
Small nonprotein molecules, such as metal ions or organic molecules (coenzymes), are used to perform functions that amino acids cannot. Carboxypeptidase cuts polypeptide chains with the help of a bound zinc ion that forms a transient bond with one of the substrate atoms. Biotin (a coenzyme) forms a bond with the -COO- group that is to be transferred by the enzyme carrying the biotin. Rhodopsin contains retinal, which changes its shape when it absorbs light; this triggers the signal that is sent to the brain. Haemoglobin contains heme groups, each with an iron atom at its center, which enable haemoglobin to pick up and release oxygen. Many coenzymes cannot be synthesized by the body, so they or their precursors must be obtained from the diet as vitamins.
Regulation of enzymes:
1- Regulating expression of the gene that encodes the enzyme, which controls the number of enzyme molecules produced.
2- Confining enzymes to subcellular compartments.
3- Covalent modification of the enzyme.
4- Protein destruction by proteolysis.
5- Feedback inhibition (negative regulation): a product made late in a pathway inhibits an enzyme that acts earlier in the pathway by binding to its regulatory site, causing a conformational change.
6- Protein phosphorylation.
Most proteins are allosteric: they can adopt two or more slightly different conformations. Ex: enzymes, receptors, structural proteins, and motor proteins. Each ligand stabilizes the conformation that it binds to most strongly. If the binding sites on an allosteric protein are coupled, positive regulation occurs when the binding of one ligand increases the affinity for the other, and negative regulation occurs when it decreases that affinity. If the two ligands prefer the same conformation: positive regulation. If they prefer different conformations: negative regulation.
A symmetrical assembly of identical subunits allows a sharp change in activity. The binding of a ligand to one subunit causes an allosteric change in the entire assembly, which makes it easier for neighbouring subunits to bind the same ligand. (Cooperative allosteric transition: a small change in ligand concentration switches the whole assembly on or off.) Ex: O2 pick-up by haemoglobin. In an enzyme made of 2 identical monomers, binding of the first inhibitory ligand is difficult because it disrupts the energetically favourable interaction between the 2 monomers; a second molecule, however, binds more easily because it restores the energetically favourable monomer-monomer contacts. Such symmetrical enzymes can adopt only 2 conformations, an on state and an off state.
Proteins are also regulated by the covalent addition of a smaller molecule to one or more of their amino acid side chains; most commonly a phosphate group is added. Phosphorylation (by a protein kinase): 1- Can cause a conformational change that affects the activity of the protein, because the phosphate group adds a large negative charge that can attract a cluster of positively charged amino acid side chains. 2- The attached phosphate can form part of a structure that is recognized by the binding sites of domains in other proteins (protein assembly). 3- The phosphate group can block a binding site that otherwise holds two proteins together (protein disassembly). Reversible protein phosphorylation controls the activity, structure, and cellular localization of enzymes and many other proteins.
Ex: cell division and signal amplification. Protein phosphorylation is the enzyme-catalysed transfer of the terminal phosphate group of ATP to a hydroxyl group on a serine, threonine, or tyrosine side chain of the protein; the enzyme is a protein kinase. A protein phosphatase removes the phosphate. Members of the protein kinase family share a similar amino acid sequence in the catalytic (kinase) domain. They differ in the sequences that flank the kinase domain and in short amino acid sequences inserted into loops within it; these allow the kinase to recognize the proteins it phosphorylates, or to bind to structures that localize it to specific regions of the cell. Other parts of the protein regulate the activity of the kinase. To act as signal-processing devices, kinases need inputs, and these inputs come from phosphates added to and removed from them by other kinases and phosphatases.
The Src family of protein kinases are tyrosine kinases. The short N-terminal region is covalently linked to a hydrophobic fatty acid, which holds the kinase at the cell membrane. Next comes an SH3 domain, then an SH2 domain, followed by the kinase catalytic domain. Inactive form: the SH3 domain is bound to an internal peptide, and the SH2 domain is bound to a phosphorylated tyrosine near the C-terminus. Active form: the C-terminal phosphate is removed and the SH3 domain binds to an activating protein; the kinase then phosphorylates one of its own tyrosines to self-activate.
Another way to control protein activity with phosphate is to use it in the form of bound GTP. GTP-binding proteins (GTPases) are on when GTP is bound; hydrolysis of the GTP to GDP turns them off. A monomeric GTPase called Ras plays an important role in cell signalling.
Small proteins can be covalently attached to other proteins, and in doing so they determine the activity or fate of the protein they are attached to. The carboxyl end of ubiquitin (a small protein) is linked to the amino group of a lysine side chain of the target protein.
Further ubiquitin molecules are then added one to another, creating a chain of Lys48-linked ubiquitins. This chain directs the target protein to a proteasome, where it is digested. SUMO is a different member of the ubiquitin family. The carboxyl end of ubiquitin is first activated by E1, which uses ATP to attach the ubiquitin to itself. The activated ubiquitin is then passed to E2, which works together with E3 (ubiquitin ligase), the protein that selects the target. The ubiquitin ligase binds to degradation signals (degrons) in the target protein and helps E2 form a polyubiquitin chain linked to a lysine of the target protein.
The C-shaped SCF ubiquitin ligase (5 protein subunits) has an F-box protein at one end and an E2 at the other. The F-box protein binds to a phosphorylated target protein at a specific site and positions it in the gap between the F-box protein and the E2 (the opening of the C), so that some of its lysine side chains contact the E2. This allows the E2 to build a polyubiquitin chain. The F-box subunit and the cullin (the scaffold that forms the body of the C) are interchangeable parts, which allows recognition of a wide variety of target proteins. This design also allows rapid evolution, because a new function can evolve for the entire complex simply by producing an alternative version of one of its subunits.
EF-Tu is a GTP-binding protein that acts as an elongation factor in protein synthesis, loading tRNA onto the ribosome. When GTP is bound to EF-Tu, the tRNA binds tightly to EF-Tu. When the tRNA pairs correctly with the mRNA on the ribosome, hydrolysis of the GTP bound to EF-Tu is triggered and the tRNA is released. When GTP becomes GDP, there is a small change at the nucleotide-binding site of EF-Tu. This change shifts an α helix known as the switch helix, which acts as a latch that keeps EF-Tu closed. The shifted switch helix detaches, EF-Tu “opens”, and the tRNA is released. This is an example of a small change generating a large movement.
Motor proteins are used for muscle contraction, the crawling of cells, the movement of chromosomes in mitosis, the movement of organelles, and the movement of enzymes along DNA during DNA synthesis. If no free energy is supplied, a motor protein will wander back and forth instead of moving forward in one direction. To move in one direction, one of the shape changes of the protein must be made irreversible. This is done by coupling it to the hydrolysis of ATP. Myosin is a motor protein that walks along actin filaments. Kinesin walks along microtubules.
Protein machines: large assemblies formed from many protein molecules linked by noncovalent bonds. They can move as a result of the hydrolysis of bound ATP or GTP, which causes a conformational change in one or more of the individual subunits and so changes the entire assembly.
Some IDRs are known as low-complexity domains because they are built from only a small subset of the 20 amino acids. Many unstructured regions (IDRs) have been well conserved over evolution. IDRs form specific binding sites for other proteins. Phosphorylation sites are often located in IDRs. Ex: RNA polymerase has an IDR tail at its C-terminus. This tail is covalently modified by protein kinases and phosphatases, which leads to the tail attracting proteins that help or block the transcription of mRNA by the polymerase. Elastin is a highly disordered polypeptide whose loose, unstructured chains are covalently cross-linked to form an elastic meshwork. IDRs are also used as tethers (ropes) that concentrate substrates and thereby increase the rate of reaction by increasing the chance of collision.
IDRs allow a large scaffold (framework/base) protein with multiple binding sites to gather RNA and protein molecules at a particular site. Scaffolds bring interacting macromolecules together and concentrate them in selected regions of a cell, and they can be assembled only when and where they are needed. Rigid scaffolds: cullin in the SCF ubiquitin ligase. Large, flexible scaffolds are found under the plasma membrane.
Ex: Discs-large protein (Dlg), which lies under the plasma membrane of epithelial cells and at synapses. This protein is very ancient and was first discovered in Drosophila, where mutations in its gene lead to uncontrolled proliferation of imaginal disc cells and to tumours.
Macromolecules can self-assemble to form biomolecular condensates. Each copy of a molecular machine is made of the same parts and has the same structure. Copies of flexible scaffolds are also very similar to one another. Biomolecular condensates are different: they are built from proteins held together by many very weak noncovalent bonds. Each condensate contains at least 1 scaffold. Scaffolds are multivalent macromolecules (they make many connections), and the individual bonds they make break very easily. In some cases, sequestering macromolecules in a condensate acts as temporary storage and blocks their activity. Ex: stress granules. Disordered, low-complexity domains of proteins (IDRs) mediate the weak, fluctuating bonds in condensates. These fluctuating bonds cause the condensate to behave as a liquid (liquid-liquid phase separation). Many biomolecular condensates are readily assembled and disassembled. Ex: nucleolus.
Demixing (phase separation/liquid-liquid separation) is unfavourable on entropy grounds (negative delta S, which contributes a positive term to delta G), because it converts the disordered state (both components mixed) into a more ordered state (demixed/separated). However, in a biomolecular condensate (proteins and nucleic acids), the many weak attractions between the polymers can provide a large FAVOURABLE energy change that overcomes the unfavourable entropy change of demixing and drives phase separation.
Phase diagrams: describe what happens when polymers phase separate. One phase is dilute and the other is concentrated. When more polymer is added, the concentration of polymer in each phase stays the same; instead, the volume of the concentrated phase increases and the volume of the dilute phase decreases.
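The statement that adding polymer changes the volumes of the two phases but not their concentrations follows from conservation of mass. The sketch below illustrates this with a simple mass balance (often called the lever rule), which the notes do not state explicitly; it assumes an idealized two-phase system with fixed dilute and concentrated phase concentrations, and the function name and numbers are hypothetical.

```python
def dense_phase_fraction(c_total, c_dilute, c_dense):
    """Volume fraction occupied by the concentrated phase.

    Mass balance: c_total = f * c_dense + (1 - f) * c_dilute.
    Below c_dilute the system stays mixed (f = 0); above c_dense it is
    entirely the concentrated phase (f = 1).
    """
    if c_total <= c_dilute:
        return 0.0
    if c_total >= c_dense:
        return 1.0
    return (c_total - c_dilute) / (c_dense - c_dilute)


# Hypothetical coexistence concentrations (arbitrary units)
C_DILUTE, C_DENSE = 1.0, 11.0

below_threshold = dense_phase_fraction(0.5, C_DILUTE, C_DENSE)
some_polymer = dense_phase_fraction(3.0, C_DILUTE, C_DENSE)
more_polymer = dense_phase_fraction(6.0, C_DILUTE, C_DENSE)

print(below_threshold, some_polymer, more_polymer)
```

Raising the total concentration from 3 to 6 units grows the concentrated phase from 20% to 50% of the volume, while the two phases themselves remain at 1 and 11 units.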
p53 (a protein that controls the cell’s response to adversity) is subject to numerous combinatorial regulatory codes: each covalent modification of an amino acid by the addition of a small molecule alters its behaviour. These modifications can create a site that binds to a scaffold protein in a specific region of the cell.
