Nucleic Acid Structure PDF
Document Details
Uploaded by YouthfulGothicArt
Summary
This document discusses the structure of nucleic acids, specifically focusing on nucleotides, purines, pyrimidines, and the different types of DNA structures (A-DNA and B-DNA). It explains the interactions between these molecules and their important roles in biology.
Full Transcript
Topic 10: Nucleic acid structure

deoxyAdenosine monophosphate - a typical nucleotide
Figure labels: phosphate, base, sugar (deoxyribose), nucleoside
Nucleotides are the fundamental building blocks of nucleic acids

Organization of nucleotide
Nucleotide has a sugar - (2’ deoxy)ribose in DNA
This is phosphorylated on the 5’OH
A base is attached at the 1’ position

Purine vs. pyrimidine
Purine and pyrimidine are heterocyclic aromatic compounds, with two nitrogen atoms per ring
Biological bases are derivatives of these structures with additional keto, amine and methyl groups
Lehninger fig 8.1

The standard bases
Note bases can be methylated on heteroatoms
In DNA this is typically regulatory
In RNA, modifications including methylation of bases (e.g. in ribosomes) play a structural role

Deoxynucleotides
Note nomenclature
Note nucleoside refers to the base plus sugar only, without the phosphate
Lehninger fig 8.4

Nucleic acid structural hierarchy
Primary structure: nucleotides (read sequence 5’ to 3’)
Secondary structure: double stranded helices
Tertiary structure

The DNA backbone and its potential interactions
Figure labels: 5’ end, 3’ end
PO4 generally has a single negative charge, with several H-bond acceptors, but no donor
Deoxyribose has one H-bond acceptor (O1) plus non-polar interactions
Ribose (RNA) has an additional OH that adds further H-bond acceptor interactions plus one H-bond donor interaction

The DNA backbone and its potential interactions (cont.)
The DNA backbone consists of the 5’ OH of one ribose group linked to the 3’ OH of the next through a phosphate group
The backbone of DNA has no H-bond donating groups, and no positively charged groups
The phosphate group interacts well with water
Deoxyribose is fairly non-polar
Unlike peptides, the backbone cannot make strong interactions with itself

The ribose-phosphate backbone has multiple rotatable bonds
There are six single bonds between the O3’ of one ribose and the next
These bonds are named α, β, γ, δ, ε and ζ
Each of these bonds allows free rotation of the groups above and below it
δ angles are very limited due to the constraints of the ribose ring
Therefore, the backbone of nucleic acids has much more flexibility per residue than the equivalent peptide
If you were going to make a Ramachandran plot for DNA, you would need to draw it in six dimensions
Figure labels: O3’ (5’ end), α, β, γ, δ, χ, ε, ζ, O3’ (3’ end)

Sugar pucker - endo vs. exo conformers
(deoxy)ribose has limited flexibility
Generally, 4 atoms in the ring are planar, while the other is either endo (up towards C5’) or exo (down away from C5’)

Potential interactions of a base
Top view of adenine
Figure labels: two H-bond donors, three H-bond acceptors, bond to ribose

Potential interactions of the base
Side view of adenine
Figure labels: big flat hydrophobic surface (both faces), ribose
Note that there are no free electron pairs or available protons directed out of the plane of the base

The nucleotide bases are much less varied than amino acid sidechains
None of the bases carry a net charge
All bases are non-polar along their flat faces
All bases can make H-bonds through their edges
Bases come in only two sizes (purine/pyrimidine)
The main difference between bases is the pattern of potential H-bond acceptors and donors along the edge of the base

Hydrogen bonding in nucleic acids
The bases have a lot of hydrogen bonding potential
This can be satisfied by binding water - you don’t necessarily need to H-bond with another base
However, once you form one hydrogen bond and have paid the entropic cost of freezing out the conformational freedom of the base, the next hydrogen bond contributes just as much enthalpically at a lesser entropic cost

The bases have flat, hydrophobic surfaces
The flat upper and lower surfaces of the bases have no H-bonding potential
Consequently, they exclude water and are hydrophobic - much like the rings of the aromatic amino acids
However, these surfaces are not simply non-polar - there are strong dipole moments that arise because different atoms around the ring withdraw electrons to different degrees

The hetero-cyclic bases develop complex electrostatic fields
The distribution of electronegative atoms and electron delocalization develops complex electrostatic fields within the base pairs
A & T have individual dipole moments, but the base pair overall does not
GC has a strong dipole moment with the + charge on the C
These fields help dictate how bases prefer to stack on one another
Travers A.A., 2004

Base stacking
Bases from adjacent nucleotides can maximize van der Waals interactions by stacking their flat faces
This interaction also buries these large hydrophobic surface areas away from water
The exact details of how different pairs of bases stack depend in part upon the way their individual electrostatic fields interact
Base stacking is a very important driving force in nucleic acid structure, and contributes roughly as much energy to stabilizing the DNA double helix as hydrogen bonding between bases does

Some base pairs stack better than others
First nucleotide is in the 5’ position, paired base is implied (so T A = TA stacked on AT)
In general T or A pack relatively weakly
G / C pack more strongly
These differences result in the stiffness of DNA being sequence dependent
GC electrostatic complementarity results in the strongest stacking

Canonical Watson and Crick base pairing
There are many ways to hydrogen bond pairs of nucleotides
Watson and Crick base pairing is the only pairing used in regular dsDNA
G base pairs with C, making 3 hydrogen bonds
A base pairs with T, making 2 hydrogen bonds
Stryer5 Fig. 5.12

So what’s so special about WC base pairs?
The C1’-C1’ distances for Watson Crick GC and AT base pairs are essentially identical at 10.9 Å
The C1’-N1 bonds are all at a similar angle (~53˚) to the C1’->C1’ line
This geometric similarity means that AT/TA/GC/CG base pairs are interchangeable without perturbing backbone geometry
Non-WC base pairings have different geometry and are not interchangeable
Figure labels (T-A and C-G pairs): ~53˚, ~53˚, ~10.9 Å

DNA is organized into a double helix
The strands are arranged in an anti-parallel fashion
Bases go to the inside and interact through hydrogen bonding
Phosphate groups on the outside of the helix to minimize electrostatic repulsion and maximize interactions with the solvent
The helix is right-handed
Lehninger Fig. 8-13

Major and minor groove
Figure labels: major groove, minor groove
B&T Fig. 7-4
The major groove is the longer perimeter as one goes from N1 to N1 (the atom that links the base to ribose)
The minor groove is the shorter perimeter

B-DNA geometry
Right-handed helix
Base pairs almost perpendicular to the helix axis
~10 bases per turn
3.4 Å per base pair = 34 Å per turn
Minor groove narrow, ~6.0 Å
Major groove wide, ~11.6 Å
~23 Å diameter
No hole down the center
DNA in the cell is almost all in the form of B-DNA
This is therefore the most biologically important DNA structure

A-DNA geometry
Right-handed helix
Hole down the axis
Base pairs strongly tilted, ~20˚
2.3 Å rise per base pair
~10.7 bases per turn = 24.6 Å per turn
Minor groove very wide, shallow
Major groove very narrow and deep
~27 Å diameter
A-DNA is an alternate structure for DNA
It was identified because DNA fibres diffract X-rays differently when they are hydrated (B-DNA) versus being dried out (A-DNA)

A-DNA vs B-DNA
(figure)

Z-DNA
No hole down the axis
Base pairs slightly tilted
3.6 Å rise per base pair
~12 bases per turn = 43 Å per turn
~18 Å v.d.W diameter
Major groove wide and shallow
Minor groove narrow and deep
Z-DNA is a left-handed helix
Overall shape is long and narrow
Zig-zagging backbone gives it its name
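
The pitch figures quoted on the B-, A- and Z-DNA geometry slides are simply rise per base pair multiplied by base pairs per turn. The short sketch below redoes that arithmetic with the slide values. The class name `HelixForm`, the helper `length`, and the derived twist per base pair are illustrative additions that do not appear in the slides, and the twist is reported as a magnitude only, so it does not encode the left-handedness of Z-DNA.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HelixForm:
    """One helical form, described by the two numbers given on the slides."""
    name: str
    rise_per_bp: float   # axial rise per base pair, in angstroms
    bp_per_turn: float   # base pairs per full helical turn

    @property
    def pitch(self) -> float:
        """Axial length of one full turn, in angstroms."""
        return self.rise_per_bp * self.bp_per_turn

    @property
    def twist_magnitude(self) -> float:
        """Average rotation per base pair in degrees (handedness ignored)."""
        return 360.0 / self.bp_per_turn

    def length(self, n_bp: int) -> float:
        """Approximate axial length of an n_bp duplex (hypothetical helper)."""
        return self.rise_per_bp * n_bp


# Values copied from the geometry slides above.
FORMS = [
    HelixForm("B-DNA", 3.4, 10.0),
    HelixForm("A-DNA", 2.3, 10.7),
    HelixForm("Z-DNA", 3.6, 12.0),
]

for form in FORMS:
    print(f"{form.name}: pitch {form.pitch:.1f} A, "
          f"twist {form.twist_magnitude:.1f} deg/bp, "
          f"13-mer length {form.length(13):.1f} A")
```

Running this gives pitches that match the 34 Å, 24.6 Å and roughly 43 Å stated on the slides, and it makes the comparison concrete: the same 13-mer is shortest as A-DNA and longest as Z-DNA.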
Z-DNA
Z-DNA most commonly forms from repeated (GC)n under high salt/alcohol conditions
Conformation of purine backbones differs from the pyrimidine backbone
The G bases are bent back over the sugar in the unusual syn conformation - the ribose ring stacks on the next base
AT pairs are destabilizing but accommodated in small numbers
Z-DNA has been proposed to help relieve super-coiling strain on DNA but its biological relevance is unproven

DNA hydration
DNA forms hydrogen bonds with the surrounding water
Some specific positions in the structure bind a water molecule in a standard way
These include water molecules binding in the major groove, the minor groove, and water binding to the DNA backbone

Water in the major groove of B-DNA
The major groove of B-DNA is wide enough to accommodate a few water molecules side by side
Water molecules form extensive networks linking the bases to the phosphate backbone

Water spine in the minor groove of B-DNA
The minor groove is only wide enough (6 Å) to accommodate water molecules in single file
A given water molecule position typically permits hydrogen bonds to polar groups of successive bases on opposite strands, ribose O1 atoms, and other minor groove water molecules
This water “spine” stabilizes B-DNA at high water concentrations, and is the critical factor making B-DNA more stable than A-DNA

Metal ions in nucleic acid structure
Figure: Mg2+ bound in 266D
The large net negative charge carried by the phosphate backbone requires counter ions to help stabilize it
Monovalent cations (K+ and Na+) and divalent cations (Mg2+) are most common in the cell and therefore biologically relevant
These counter-ions are critical for DNA stability – charge repulsion will prevent DNA from forming a duplex in pure water
Many of the structural positions that water molecules favour can also be bound by metal ions
By making use of bridging metal ions, phosphate groups may approach one another closely without destabilizing the structure

Factors destabilizing nucleic acid structures
Loss of the conformational entropy that comes with the high flexibility of nucleic acid chains is, as with proteins, the single largest destabilizing factor
Electrostatic repulsion of the negatively charged phosphate groups is highly destabilizing unless compensated for by cations
Enthalpic stabilization of exposed nucleotide edges by bound waters (it costs energy to strip these waters off as you form structure)

Forces stabilizing nucleic acid structures
Base stacking interactions (hydrophobic, van der Waals and electrostatic)
Hydrogen bonds between bases
Entropic gain from releasing waters associated with single stranded nucleotides into bulk solution
Binding of metal ions (electrostatic)

Energetics of DNA duplex formation
DNA duplex formation is favoured under physiological conditions
Experiments with a 13mer oligo showed that stabilization is on the order of 20 kcal/mol (roughly 1.5 kcal/mol/base pair)
However, this is the small difference between a highly stabilizing enthalpic contribution of ~120 kcal/mol (mostly from H-bond formation and the vdW component of base stacking) and an unfavourable entropic term
Entropy of duplex formation (TΔS) was unfavourable to the tune of 100 kcal/mol; this mostly reflects the cost of locking the conformation of a highly flexible molecule
One consequence of this is that DNA is readily melted by increasing the temperature

DNA tetraplex
Lehninger Fig. 8-20
Occurs only in very guanine rich sequences
Stable over a wide range of conditions
Can be parallel or antiparallel
Guanine tetrads can influence transcription, replication and recombination

DNA Holliday junction
A Holliday junction is formed when complementary DNA strands are exchanged between adjacent helices
Figure label: stacking unperturbed by crossover (PDB 1NQS)
This occurs, for example, in DNA repair and in meiosis
The individual DNA duplexes are barely distorted from standard B-DNA helical conformation, and all bases are properly stacked
This is possible due to the conformational flexibility of the DNA backbone - you do not see analogous behaviour in α-helices in proteins
In essence you have a single secondary structural element (the double helix) which continues despite changes in the contributing strand

DNA structure
Most DNA in the cell adopts the B-DNA conformation
In DNA double helices, conformations with complementary Watson and Crick base pairing are the most stable
This means that DNA has an inbuilt ability to recognize DNA with a complementary sequence

Ribonucleotides form RNA
Ribo Nucleic Acid differs from DNA in two details:
The 2’ carbon of the ribose sugar carries a hydroxyl group
Thymine is replaced by its analog Uracil, similar in all respects except that it lacks the methyl group (CH3)

Most RNA, biologically, is not found as “matched” duplexes
Most biologically relevant RNA is made by transcribing one strand of the DNA double helix only
This includes mRNA, tRNA and rRNA
dsRNA virus genomes are the one exception to this rule
Consequently, unlike DNA, RNA does not come with an inbuilt complementary strand to form a double helix with
RNA tries to find stabilizing interactions wherever it can, resulting in more irregular and complex structures than the regular double helix of DNA
Figure label: RNA polymerase

RNA is never truly disordered
RNA is often depicted as an extended molecule, with no particular structure
In proteins, disorder is possible because there are many amino acids that are not hydrophobic
However, all bases are hydrophobic, and all can base pair
In nucleic acids, most bases stack on their neighbours
Bases also pair where they can
RNA is therefore always structured
Figures: myoglobin mRNA – artistic depiction; myoglobin mRNA – AF3 model

RNA forms only one helix type, A-RNA
Base pairs strongly tilted, ~16˚
Hole down the axis
2.8 Å rise per base pair
~10.7 bases per turn ≈ 30 Å per turn
Minor groove very wide, shallow
Major groove very narrow and deep
~27 Å diameter
A-RNA closely resembles A-DNA

Why does RNA not have a B form?
The 2’OH of a B-RNA double helix would clash with the phosphate oxygen and base of the residue 3’ to it
Therefore, the B-form helix is not accessible to RNA
Figure: B-DNA close-up, marking C2’ where the 2’OH would go

Base interactions in RNA
Watson Crick base pairing is also possible in RNA, and can lead to a regular double helix
However, perfectly complementary sequences are generally unavailable
Having bases solvent exposed is unfavourable – bases are strongly driven to form stacking interactions
Stacked bases prefer to hydrogen bond; even if this results in irregular geometry, the resulting structure is often still stable

Base interactions
Nucleotide bases have a strong tendency to stack on both each other and on exposed ribose rings
They also can hydrogen bond with one another, ribose, phosphate etc.
While the canonical Watson and Crick base pairs are optimal, any pair of nucleotides can interact with one another favourably, in many different geometries
The interactions strongly drive RNA (and DNA) to form structure
Nucleic acids are essentially always structured locally through at least base stacking, even where that structure serves no particular purpose

The G.U wobble base pair
G.U wobble base pairs are very commonly found in RNA helices
A structural water molecule bridges the 2’OH and the amine group of the guanosine
This stabilizing interaction explains why G.T is not seen in DNA
The G.U base pair is thermodynamically roughly as stable as A.U
The C1’->C1’ distance of G.U is very similar to G.C or A.U base pairs
However, the angle the bases make to the C1’->C1’ line differs from WC base pairs
The helical backbone distorts to compensate for the angle difference while maintaining optimal base stacking
Figure labels: A-T pair, ~53˚ and ~53˚, ~10.9 Å; G.U pair (offset), ~65˚ and ~40˚, ~10.3 Å; two H-bonds between bases plus one water mediated

Hoogsteen base pairing
Figure labels: WC face
Hoogsteen pairs are an alternative pairing of A-T (A-U) and G-C
The pyrimidine is interacting with a different surface (normally facing the major groove), so the Watson & Crick face remains free
In each case, two hydrogen bonds form
Note that the C in G.C+ is protonated
Base pairing stabilizes this protonated form, so pKa rises from 4.2 to 7.5

WC and Hoogsteen base pairing can occur simultaneously – this forms a base triplet
Non-WC base interaction is in red
Third base makes Hoogsteen interaction
Note this places the extra nucleotide in the major groove
Lehninger Fig. 8-20a

Extended triple-stranded RNA using base triplets
This structure is part of the human telomere
Note orange and green sections form extended runs of base triplets
Note that the strands forming the WC base pairs swap out (blue/green and orange/red)
Devi et al, 2015

Other non-canonical base pairs
There are many ways two bases can interact forming 1 or 2 hydrogen bonds
Examples of different A-A base pairs are shown here
Note these pairings result in different spacings and orientations of the ribose
These non-canonical interactions necessarily distort the backbone
RNA structure often has many such unusual base pairings where the sequence does not allow standard ones

RNA topology diagrams
RNA topology diagrams depict the location of secondary structure elements
A bulge occurs where a single nucleotide is unpaired
Hairpins are tight turns where the RNA doubles back to form the second strand of dsRNA in 4-5 nt
Internal loops have multiple incompatible bases not paired
Lehninger Fig. 8-23

The function of many RNA species depends on their ability to fold into complex 3D structures
tRNAs
Ribosome (a ribozyme)
Spliceosome
Self splicing RNAs
RNAse P (essential for tRNA maturation)
7SL (SRP) RNA (involved in the membrane insertion of proteins)
Riboswitches
The RNA-composed, RNA-templated RNA synthase of the original RNA world (maybe)…

Riboswitches - an example of complex RNA structure
Riboswitches occur in the 5’ untranslated regions of mRNAs in gram positive bacteria
Riboswitches control protein levels by prematurely terminating transcription or preventing translation of an RNA when a particular metabolite is present

The SAM riboswitch structure
Figure label: S-adenosyl methionine (PDB 2GIS)
Note that this is a highly complicated three-dimensional structure
Some regions resemble standard A-RNA
Other regions have an almost random appearance
Almost all bases are stacked, but lots of odd pairings

Non-canonical interactions in the SAM riboswitch
Mg2+ coordination – GA base pair allows close approach of backbones
G.C-A base triplet
Bulged out base - note stacking of i with i+3
Stacking without base pairing
Base – 2’OH ribose H-bond

Ligand recognition by RNA
11 nt from 5 strands make up the S-Adenosyl Methionine binding pocket
Interactions include base stacking, van der Waals interactions and multiple hydrogen bonds using base edges, ribose atoms and tightly bound water molecules
Because SAM occupies a cleft between structural elements, its binding stabilizes this structure

Hammerhead ribozyme
The enzyme RNA strand (yellow) binds the substrate strand (blue) and cleaves it at the cyan position
Multiple Na+ ions help stabilize the structure
The ribozyme recognizes its substrate via base pairing, but different regions interact successively
Note that the structure is quite irregular, with only short stretches of A-RNA helix
Figure: Hammerhead ribozyme - PDB 3ZP8

Protein-Nucleic acid interactions

DNA-protein interactions
DNA is the main depository of information in the cell
However, maintaining, replicating or transcribing this information needs to be done in a tightly controlled manner where specific actions are performed at specific sites on the DNA under specific circumstances
Finding and interacting with specific sites on DNA entails, in some fashion, accessing the sequence information DNA encodes
Generally, this is done by proteins; therefore protein-DNA interaction is a critical process in the way the cell’s information content is accessed and used

Non-specific DNA binding - DNAse I
DNAse I is a (relatively) non-specific double stranded nuclease from bovine pancreas

DNAse I - DNA interactions
DNAse I binds in the minor groove of DNA, opening it up slightly
DNAse I interacts predominantly with the phosphate groups of the backbone
These allow hydrogen bonds (through H-bond donors) and electrostatic interactions
Arginine residues, which make bidentate interactions with phosphate, are especially favoured
Some of the interactions are through the water molecules that bind specific sites in the DNA
There is only one H-bond directly to a base - so interaction depends minimally on the sequence

Protein targeting of specific DNA sequences
There are 4 bases in DNA (GATC)
A specified set of nucleotides n long (not necessarily contiguous) has 4^n possible sequences
For 16 nt, there are 4^16 ≈ 4 × 10^9 possible combinations
Therefore, in the human genome (~3 × 10^9 nt), one would expect a random 16 nt sequence to appear, on average, approximately once by chance

But how do proteins recognize their target DNA sequence?
Possible strategies might include:
Unzipping DNA and “reading” the base pairing edges of one strand
Reading the exposed edges of each base pair in the double helix
Exploiting the sequence specific physical properties of DNA

The “unwind and read” strategy
Conceptually, the most obvious way to recognize a sequence would be to unzip DNA and just “read” the base pairing edges of one strand, much as the other strand does
However, the DNA double helix is very stable, so unwinding it like this would cost huge amounts of energy (especially as you might have to open up a substantial fraction of the genome to find the right binding site!)
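
As an aside on the counting argument from the "Protein targeting of specific DNA sequences" slide, the figures there can be checked in a few lines. This is a minimal sketch that assumes a genome of independent, equally likely bases and counts one strand only; both are simplifications that the slides do not spell out. The function names `possible_sequences`, `expected_occurrences` and `shortest_rare_site` are illustrative, not taken from the source.

```python
def possible_sequences(n: int, alphabet_size: int = 4) -> int:
    """Number of distinct sequences of length n over a 4-letter alphabet."""
    return alphabet_size ** n


def expected_occurrences(n: int, genome_length: float) -> float:
    """Expected chance matches to one specific n-mer.

    ASSUMPTION: bases are independent and equally likely, the site is much
    shorter than the genome, and only one strand is counted.
    """
    return genome_length / possible_sequences(n)


def shortest_rare_site(genome_length: float) -> int:
    """Smallest n for which a specific n-mer is expected less than once."""
    n = 1
    while expected_occurrences(n, genome_length) >= 1.0:
        n += 1
    return n


HUMAN_GENOME_NT = 3e9  # approximate size quoted on the slide

print("4^16 =", possible_sequences(16))
print("expected chance hits for a 16-mer:",
      round(expected_occurrences(16, HUMAN_GENOME_NT), 2))
print("shortest site expected less than once:",
      shortest_rare_site(HUMAN_GENOME_NT))
```

Under these assumptions a 16 nt site is the shortest one expected to occur less than once by chance, which is consistent with the "approximately once" estimate on the slide; counting both strands would roughly double the expected number of hits.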
However, some proteins use a modified version of this strategy to investigate individual bases for mismatches or chemical modifications; if you can flip a base out of the DNA easily, the DNA must be damaged

Exposed edges of the base pairs
(figure)

Recognizing the base edges
This strategy is plausible because each base pair has a unique pattern of potential interactions in the major groove
The minor groove is much more ambiguous; essentially, you can tell GC/CG from AT/TA by looking for a hydrogen bond donor in the middle of the base pair

Helix-turn-helix transcription factors

Cro recognizes a 17 bp pseudo-palindromic sequence
Cro recognizes a 17 bp sequence with a pseudo 2-fold organization
Some positions in this sequence are more strongly required than others

Cro protein forms a dimer
Cro is a small protein (61 residues) that is found to form a dimer both in solution and bound to DNA
The core of the structure is 3 α-helices, with a 3-strand β-sheet
The β-sheet portion is responsible for dimerization
Overall, the Cro dimer is organized so that helix 3 of both protomers sticks out on the same surface

Cro binding to DNA
Figure labels: DNA binding helix, half site 1, half site 2
The most important interaction between Cro and DNA occurs with helix 3, often termed the DNA binding helix
Helix 3 inserts into the major groove of DNA, making extensive interactions
Note that each Cro protomer interacts with one DNA half-site almost exclusively

Cro helix inserts into the major groove of DNA
Note that the helix fits snugly in the major groove, making contacts over ~180 degrees of its circumference
Specific hydrogen bonds are made with individual base pairs
This “helix enveloped by the major groove” is a common theme in protein-DNA interactions

Cro - base interactions
Figure labels: Asn31, Lys32, Ser28
Each interaction is between the insertion helix and the major groove
Each residue that contacts a base makes 2 H-bonds
These three residues make the only contacts with the bases in each half site

Cro - DNA backbone interactions
Figure labels: His35, Lys56, Asn31, Ser60, Tyr26, Arg38
Several specific hydrogen bonds to PO4 backbone
Interactions frequently involve basic residues
These interactions promote binding but are non-specific

Cro binding surface electrostatics
Figure labels: net neutral, net negative charge, net positive charge
The surface of Cro that contacts DNA is rich in Arg and Lys, poor in Asp and Glu residues, and is therefore markedly electropositive
This pattern holds for all DNA binding proteins
Electrostatic attraction between the electropositive protein and electronegative DNA helps stabilize the complex

Cro distorts DNA upon binding
The DNA in the Cro-DNA complex is distorted from conventional B-DNA form, being both bent and over-wound
Since it takes energy to distort DNA in this manner, Cro can only bind tightly if the DNA it is trying to bind is susceptible to such distortions
Since AT tracts are easier to distort than GC regions, Cro prefers AT/TA base pairs in the middle of its binding site, even though it never directly contacts these bases
Cro is reading out the mechanical properties of the DNA as much as the hydrogen bond pattern

The “Helix turn Helix” motif is common in DNA binding proteins
Figure label: recognition helix
The recognition helix inserts into the major groove and interacts specifically with the bases
The helix immediately N-terminal to it helps position and orient the recognition helix
B&T Fig 8.8

Other non-HTH proteins also use α-helices to recognize DNA
Figure label: coiled-coil dimerizes protein & orients binding helix
Figure panels: p53 tumor suppressor (B&T 9-20); GCN4, basic leucine zipper (B&T 10-21); Zif 268, 3 zinc fingers (B&T 10-3); species labels on the panels read H. sapiens

TATA binding protein (TBP)
TATA binding protein is a ubiquitous eukaryotic transcription factor
It directly binds DNA and helps recruit the pre-initiation complex that recruits RNA polymerase
It specifically binds to the sequence “TATA” located ~35 nt upstream of the transcription initiation site

TBP is a monomeric protein with two structurally similar halves
N-terminus in blue
C-terminus in green
Each half is 88 aa
The overall shape resembles a saddle, with two “stirrups”
Note that this structure is derived from the DNA complex - the stirrups are unlikely to adopt this conformation in solution
But how does this structure access DNA?
B&T Fig 9-4 a,b

TBP binds in the minor groove of DNA
The DNA is heavily distorted into an A-DNA conformation
This exposes the edges of the bases in a broad, shallow minor groove
TBP can then use the broad, flat central β-sheet on the underside of the saddle to bind to this broad, shallow minor groove
The “stirrups” also contact key residues
TBP contacts 8 consecutive base pairs in this fashion

Specific protein-base hydrogen bonds in TBP
Only the central AT are directly read out, with one H-bond each
Note that this interaction is found in each TBP pseudo-repeat
B&T 9-7

Protein-DNA backbone interactions in TBP
Extensive interactions occur with the backbone
TBP shows the typical basic DNA binding surface
Note how TBP wraps around the DNA

The surface contacting DNA bases in TBP is comprised primarily of aliphatic and aromatic residues
Figure legend: basic residue, acidic residue, aromatic residue, hydrophobic residue, hydrophilic residue, main chain atoms

The DNA-protein interface in TBP is predominantly hydrophobic
Small hydrophobic residues (Val, Ile, Leu) mediate most of the contacts with the bases in the minor groove
Note the two Phe residues that drive a wedge between bases, kinking the DNA (arrows)
Interactions with the flat side of the ribose groups are favourable
This includes contacts with the base’s hydrogen bonding groups
Binding is driven entropically by release of water from the TBP DNA binding surface to bulk solvent

Modular DNA recognition by TAL effector proteins
TAL proteins are built from sequential 34 (+/-1) a.a. repeats
Each repeat is comprised of 2 α-helices, and the repeats are almost identical to one another in structure and sequence
Helix “a” is short, helix “b” is longer and kinked (note the proline!)
Residues 12-17 connecting the helices contact and bind DNA

Repeats stack to form a hollow superhelix
TAL protein repeats form a hollow helical arrangement
11 a/b repeats per 360˚ turn
5 to > 30 repeats per TAL protein
The space within is perfectly shaped to bind B-DNA without distortion

TAL contacts only one strand of the double helix
The loop joining the a/b helices of each repeat contacts a single base, with consecutive repeats reading out consecutive bases
Deng D, et al. Science. 2012

K16 and Q17 of each repeat contact the PO4 group
Positions K16 and Q17 (b helix) interact with the backbone PO4 group (and water) of one strand
This provides binding energy but no sequence specificity

Positions 12 and 13 together specify the base
Position 12 is always Asn or His; these make distinct hydrogen bonds that position residue 13
Residue 13 contacts the base, and specifies sequence
Gly13 specifies dT by making space for the methyl group
Asp13 specifies dC, as it can make a hydrogen bond with the amine
Asn13 specifies dG or dA
Ser13 is non-specific and can accommodate all bases
Deng D, et al. Science. 2012

Here 11.5 TAL repeats specify 12 consecutive nucleotides on one strand
However, the modular nature of TAL means that arbitrarily long or short sequences can be recognized
Since only residues 12 and 13 specify the base, this makes TAL proteins trivial to engineer
TAL proteins of arbitrary specificity can be designed and made

Disorder-order transition in NA binding
Nucleic acid binding proteins often undergo disorder-order transitions as they bind
The partner NA provides a stable platform the intrinsically unstructured protein (IUP) can order itself on
Note the large number of basic a.a. and non-polar base stacking
Figure: P22 N-peptide binding boxB RNA (1A4T); labels: RNA, P22 peptide

General conclusions
All DNA binding proteins derive most of their binding energy by interacting with the phosphate backbone
Almost all sequence specific DNA binding proteins distort DNA away from its canonical B-form
A large part of the specificity of a DNA binding protein for DNA is encoded in how easily the DNA can be twisted into the desired conformation
Hydrogen bonds to the exposed edges of individual bases are used to help specify which bases are present

Conclusions cont.
Most DNA recognition occurs through binding in the major groove, where the pattern of hydrogen bonding groups is more informative
α-helices are very commonly inserted into the major groove of DNA as a means of contacting several successive bases
Two-fold symmetry (or pseudo-symmetry) is often used as a way of efficiently expanding the binding footprint of proteins on DNA; this is why many binding sites are palindromic or nearly so
TBP manages to form an extensive binding surface with the minor groove of DNA using a β-sheet by distorting DNA into an A-DNA like conformation
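
Worked example: duplex energetics

The numbers on the "Energetics of DNA duplex formation" slide are tied together by ΔG = ΔH − TΔS. The sketch below redoes that bookkeeping for the 13mer and shows why heating melts the duplex. The slides do not give the measurement temperature, so `T_REF` is an assumed value, and treating ΔH and ΔS as temperature independent is a further simplification; the temperature trend is therefore illustrative and not a melting-temperature prediction.

```python
# Values from the slide (signs chosen so that negative = favourable).
DELTA_H = -120.0     # kcal/mol, enthalpy of duplex formation
T_DELTA_S = -100.0   # kcal/mol, entropic term T*dS at the measurement temperature
N_BP = 13            # length of the oligo studied

# ASSUMPTION: the slide does not state the temperature; 298.15 K is a guess.
T_REF = 298.15       # K

# Free energy of duplex formation: dG = dH - T*dS
delta_g = DELTA_H - T_DELTA_S
per_bp = delta_g / N_BP

# ASSUMPTION: dH and dS do not change with temperature.
delta_s = T_DELTA_S / T_REF  # kcal/mol/K


def delta_g_at(temp_k: float) -> float:
    """Free energy of duplex formation at temp_k under the assumptions above."""
    return DELTA_H - temp_k * delta_s


print(f"dG = {delta_g:.1f} kcal/mol ({per_bp:.2f} kcal/mol per base pair)")
for temp in (298.15, 320.0, 340.0, 360.0):
    print(f"T = {temp:6.2f} K  ->  dG = {delta_g_at(temp):6.1f} kcal/mol")
```

The first line of output recovers the roughly 20 kcal/mol (about 1.5 kcal/mol per base pair) quoted on the slide. Because the net stability is a small difference between two large terms, a modest rise in temperature enlarges the unfavourable entropic term enough to erase it, which is the point the slide makes about DNA being readily melted.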