11_BL_20231030_WhisperAI_2.docx
ETH Zürich
Full Transcript
The last topic that I talked about was semi-conservative DNA replication. Some of you must already be thinking, and I even heard questions about this in previous years: how can you separate a very long double-helical molecule? You can compare it a little bit to trying to unwind a rope that has two strands coiled one around the other. If you took both strands at one end of the rope and tried to pull them apart, you would never be able to open up and unwind the rope. The reason is that you would start generating torsion in the rest of the rope, which would lead to the formation of knots. One of the most famous examples of this problem, a tragic one, stems from one of the first attempts to climb the Eiger North Face by German alpinists. They got stranded, and they had to extend their ropes to rappel down the face. The ropes were not long enough, so they tried to take them apart. With frozen hands, it took them hours to open up this coiled rope, because of exactly the same problem that DNA copying would obviously also face. Ultimately, this was tragic: because of the long time it took, the team ended up freezing to death.
With respect to DNA, what kind of mechanisms ensure that copying can take place in spite of this topological problem? If you open up the DNA, which is necessary to separate the strands, you unwind it, and that causes the regions of the double helix further downstream to become overwound. In addition to this problem number one, we also have a very stable DNA duplex with many base pairs: there are hydrogen bonds between the bases, there is stacking, and a lot of energy is needed to melt this DNA. Typically, if you tried to melt DNA, you would have to raise the temperature very high. But in the cell, this obviously happens at normal physiological temperatures.
On top of that, it has to be very fast: 4.6 million base pairs are copied in 20 minutes during bacterial replication. It has to be extremely accurate: you will see that in the end only about one error is made per 10 billion bases copied. If there is damage to the DNA, it has to be repaired. And of course you cannot copy just part of the DNA; copying has to be coordinated so that it always covers the entire genome.
So let's think about problem number one. The coiled structure of DNA gives it unique properties. Because of the double-helical nature of DNA, the strands coil around each other, making one full turn about every ten and a half base pairs, since each base pair contributes a turn of roughly 34 to 36 degrees. So if you melted this DNA, you would literally remove the twist of one strand around the other, and because of that, if you opened up this DNA, unwound it and reconnected it, the DNA would not be able to stay in the plane. Let me explain what is meant by this. A molecule resists changes in its twist, which is the case for any physical object that has some asymmetry along its long axis; this is due to chemical features and the physical forces that follow from them. Take an object that resists twisting, like this simple rubber band. It doesn't resist twisting very strongly, but it does. If you start twisting it and make more and more turns, you will suddenly feel the resistance. And ultimately the object starts contorting out of the plane. So you end up with a molecule that is no longer in the plane. Such a contortion is called a supercoil. So there is a coil and a supercoil. So what does this look like?
So let me introduce some basic mathematical properties of coiling and supercoiling. The branch of math that deals with changes of an object that do not involve cutting or gluing, only distortions, is called topology, and such changes are topological transformations. There are a few numerical parameters that have to be satisfied when a change of topology takes place. One of them is the so-called linking number. In this case of a circular DNA molecule, the linking number is 25. That is the number of full Watson-Crick turns of one strand around the other, so literally the number of full turns, each of which takes up 34 angstroms in double-helical DNA. That's the linking number. Then there is the twisting number. The twisting number refers to the helical twists of the strands around each other, and it equals the linking number only when the molecule lies in one plane. So when the molecule is in the plane, the linking number and the twisting number are the same, okay? However, there is yet another number here. It is called writhe, or the writhing number, and it refers to the number of superhelical twists. This is what I was referring to before: these are the twists that make the molecule go out of plane.
And how does that look? If you took this circular DNA molecule, cut it, and then unwound it by, let's say, two right-handed turns, what happens next? You reconnect it in this unwound form, and the molecule will obviously be in a strained conformation because it has an unpaired region. In this planar form, the DNA molecule has a linking number of 23, and the number of twists is also 23 because it is in one plane. The writhing number is zero. But this molecule is not stable. It is in a higher energy state, and it would ultimately like to re-form these base pairs.
But the only way these base pairs can form is if there is compensation for the unwinding that has happened here. And that can be satisfied if the molecule goes out of plane. So you end up with these supercoiled elements. The whole ring is wound around itself and crosses itself twice, and that allows all the base pairs to be satisfied. In this case, the linking number is still reduced, it's 23, but the twisting number is 25 again. Going out of plane is exactly what restores the larger number of twists, and the writhing number, the number of superhelical twists, is minus two. So the linking number is the sum of the twisting number and the writhing number: 23 equals 25 minus 2. And the number of superhelical turns the molecule takes out of plane is exactly reflected in the number of supercoil crossovers that can be observed.
So this type of geometric transformation allows two things to happen. First of all, when you are unwinding the DNA, there are enzymes that can cut a DNA strand, and these enzymes are called topoisomerases. They change the topology of the molecule by introducing a cut, so they switch the molecule from one topological state to another. Just think of how this works. You have DNA polymerases that have to copy the DNA, and the DNA is being unwound. As a result of the unwinding, you generate overwinding downstream of the unwinding location. And the only way you can relax it is by allowing the strands to separate, turn, and then reconnect. So there are enzymes that break one strand, and as a natural relaxation process, once one strand is broken, the other strand will not resist twisting because it is held together by single covalent bonds.
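The three sets of numbers quoted for the circular DNA above are consistent with the standard bookkeeping relation Lk = Tw + Wr (linking number = twist + writhe). The following is only a minimal sketch of that arithmetic; the function names `twist` and `cut_and_reseal` are made up for illustration and are not part of the lecture.

```python
def twist(linking_number: int, writhe: int) -> int:
    """Twist implied by Lk = Tw + Wr for a closed circular duplex."""
    return linking_number - writhe


def cut_and_reseal(linking_number: int, delta: int) -> int:
    """Hypothetical topoisomerase-like step: only cutting and rejoining
    a strand can change the linking number of a closed circle."""
    return linking_number + delta


# (Lk, Wr) for the three states described in the lecture.
states = {
    "relaxed, planar": (25, 0),
    "unwound by two turns, planar": (23, 0),
    "unwound by two turns, supercoiled": (23, -2),
}

for name, (lk, wr) in states.items():
    print(f"{name}: Lk={lk}, Tw={twist(lk, wr)}, Wr={wr}")

# Going from the relaxed circle (Lk=25) to Lk=23 requires a cut.
print("Lk after removing two turns:", cut_and_reseal(25, -2))
```

Without a cut, Lk stays fixed, so the only freedom the molecule has is to trade twist against writhe, which is the planar-to-supercoiled transition described above.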
So for example, if you think of the sugar-phosphate backbone of DNA with its phosphodiester linkages, there are bonds with single-bond character that can rotate freely. In that case, one bond breaks, the molecule relaxes, the enzyme then has to reconnect this break very accurately, and DNA copying and strand separation can continue. There are also enzymes that not only relax DNA during copying but can use ATP to generate supercoils. These enzymes introduce a cut and then, using energy, twist the DNA more. This twisting can then be used, for example, to package long DNA genomes into very small cells. This type of packaging has multiple roles. First of all, it organizes the genome, but it also has an effect on the stability of double-helical DNA. Some organisms have to survive at very high temperatures, and in that case supercoiling prevents the DNA from melting at high temperature. These are some examples of very highly packaged and supercoiled DNA genomes that are further stabilized by proteins. This is often the case in thermophilic bacteria or thermophilic archaea that have to survive at temperatures that sometimes exceed 100 degrees Celsius.
So this answers question number one: how do you achieve strand separation and relax the overwinding that builds up in front of the enzymes that separate the strands? There is another fundamental question, which stems from the chemical directionality of the two DNA strands. One of them runs in the 5' to 3' direction, and the complementary strand has the opposite chemical directionality and runs from 3' to 5'. So this is how it works.
So if you opened up this blue DNA and started copying it, the copying mechanism chemically involves certain nucleotides with certain chemical features, and it is natural to assume that the chemistry of DNA copying will always be the same: you will not have one special enzyme, special chemistry and special nucleotides for copying one strand of DNA, and different ones for the other strand. In that case, if you have to pry the DNA open and copy it, that would imply that on one strand you synthesize the new molecule from 5' to 3', complementary to the 3' to 5' template strand. But on the other strand the template is exposed from 5' to 3' as the DNA opens, and there the synthesis would seemingly have to go in the opposite direction: you would have to synthesize from 3' to 5'. It is known from electron microscopy that DNA is copied by being peeled open, literally, in these little bubbles. These are multiple so-called replication bubbles where the DNA is being copied. One of the replication forks runs in one direction and the other one in the other direction. So the principle of the replication fork is very clear, but it is difficult to understand how the chemistry works, again because in one case you would have to go in one direction and in the other case in the other direction. This copying of DNA, and that's another bit of a puzzle, is also very, very accurate. It needs enzymes, and I would now like to introduce the enzymes involved in DNA copying, because that will help us understand how the chemistry of copying works and explain this problem number two. The enzymes responsible for copying DNA are called DNA polymerases.
There is typically more than one DNA polymerase in a cell, and I'll explain the differences between the types of DNA polymerases. What a DNA polymerase has to achieve is to take template DNA and synthesize the complementary strand. Now the problem with copying this information is that the chemistry of copying can be much more accurate if there are as many constraints on the process as possible. You cannot really start copying DNA at some arbitrary place by introducing a first nucleotide, because this nucleotide would more or less float in space opposite the series of nucleotides that has to be copied. Even the angle of this nucleotide would not be strictly defined, and it could form different types of hydrogen bonds. So incorporation of the first nucleotide is much less accurate than extending a partially synthesized copy of the DNA. Because of this, the process of DNA copying starts with a chemically different type of molecule: an RNA molecule is synthesized first and serves as a primer for starting the copying of the DNA. The primer itself cannot be considered a copy of the DNA because it is an RNA molecule. So first there is an enzyme that is not a DNA polymerase, called primase, that sets the stage for copying the DNA information. Primase synthesizes a short RNA molecule complementary to one very small segment of the DNA that has to be copied. And then comes DNA polymerase; the fastest of the DNA polymerases, and the main workhorse of DNA replication in bacterial cells, is called DNA polymerase III.
DNA polymerase III then uses this primer to continue the synthesis, and this continued synthesis is much, much more accurate. So by using a different type of molecule to initiate DNA replication, the accuracy of copying becomes much higher. To summarize: to increase accuracy, DNA replication is primed by a short stretch of RNA that is synthesized by an enzyme called primase, and the RNA primer is removed at a later stage of replication. One important point to be aware of: even though primase is an RNA polymerase, it is not the RNA polymerase that is used for transcription, which I'll explain later.
So how is this primer extended? By now you have slowly learned some of the basic terminology connected to DNA replication. What DNA polymerase needs is a template, the sequence that has to be copied. For DNA replication it is of course DNA, but the term template is also used for copying RNA. You also need a primer, an initial segment of nucleic acid that will be extended by DNA polymerase. In molecular terms this is how it looks. You have a template and a primer, and the enzyme, DNA polymerase, incorporates the next complementary nucleotide, complementary meaning that the incoming nucleotide can form a Watson-Crick base pair with the next nucleotide in the template. In this case this is a T, so a deoxyadenosine triphosphate, deoxy-ATP, is used, and as a result of this reaction the next nucleotide of DNA is incorporated in the 5' to 3' direction. Since polymerization proceeds from 5' to 3', the newly added nucleotide again has a free hydroxy group at its 3' end, which is used for the reaction with the next deoxynucleotide. In this case it will be a G, because it has to be complementary to C.
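The template-and-primer logic just described can be sketched in a few lines of Python. This is an illustration only: the function name `extend_primer` and the example sequences are invented here, and the primer is written with DNA letters for simplicity even though the real primer is RNA.

```python
# Watson-Crick pairing rules for DNA.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def extend_primer(template_3to5: str, primer_5to3: str) -> str:
    """Extend a primer in the 5'->3' direction along a template that is
    written 3'->5', adding one complementary nucleotide at a time."""
    # The primer must already be base-paired with the start of the template.
    for template_base, primer_base in zip(template_3to5, primer_5to3):
        if COMPLEMENT[template_base] != primer_base:
            raise ValueError("primer is not complementary to the template")

    new_strand = primer_5to3
    for template_base in template_3to5[len(primer_5to3):]:
        # Each incoming nucleotide is joined to the free 3' end.
        new_strand += COMPLEMENT[template_base]
    return new_strand


# Template (3'->5'): G C A T T C; the primer CGTA pairs with GCAT.
# The next template bases are T and then C, so A and then G are added.
print(extend_primer("GCATTC", "CGTA"))
```

The function can only append to the 3' end of the growing string, which mirrors the one-directional chemistry of the polymerase.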
In the reaction, as you may have already noticed, two phosphates, that is pyrophosphate, leave, and this is what provides the chemical energy for the reaction to take place. Basically, ATP is a higher-energy nucleotide than AMP or ADP. Of the three phosphates contributed by the incoming nucleotide, one stays with the nucleotide that is incorporated, and pyrophosphate leaves the reaction. The next step will add another nucleotide, and so forth. So you gradually extend the new strand and copy the sequence of DNA such that complementary nucleotides are incorporated. Remarkably, this happens at very, very fast rates. Just imagine millions of nucleotides of the E. coli genome copied in a matter of minutes.
So what about the enzymes that are capable of such accurate and fast reactions? These are the so-called DNA polymerases. DNA polymerases share a common architecture, and you will have an easy time remembering it if you compare the shape a little bit to a right hand. Beware, it's a right hand, not a left hand. These enzymes have a shape and a handedness to them; they are not molecules that could just as well occur with the opposite chirality. So this molecule has the shape of a right hand, where the fingers build one half of the cleft that binds the DNA substrate and the thumb builds the second half of this cleft. And at the base of the cleft, where the palm would be, is where the active site is.
In addition to this active site, polymerases also have domains with other functionalities that I'll talk about later, the so-called exonuclease activities. These play an important role in correcting errors and also in removing, for example, the RNA primers that are left behind after the replication fork has opened and primase has synthesized short primers. So how about the reaction mechanism, the chemistry of the reaction? The chemistry involves cleavage of the deoxynucleotide triphosphate substrates. This is what I told you before, but in detail this is how it looks. The triphosphates are bound in the active site of the enzyme such that the enzyme checks that the base pair with the next nucleotide in the DNA template strand is complementary, and then the chemistry can take place. It involves a nucleophilic attack by the oxygen of the three prime hydroxy group of the last nucleotide in the primer strand, that is, in the DNA strand being extended, not the template strand but the one that is being synthesized. In this nucleophilic attack the oxygen attacks the phosphorus, this bond is broken, and pyrophosphate leaves. You end up with ester bonds between the phosphate and the deoxyribose sugars, so the sugar-phosphate backbone is formed, and the DNA strand that is being synthesized has been extended by one nucleotide. Now, exactly because of this mechanism, you are only able to copy DNA in one direction, because this is what nucleotides look like.
You don't have nucleotides with the triphosphate on the three prime end; they all have the triphosphate attached to the five prime carbon. This is the reason why extension only works from five prime to three prime: the DNA template is read from three prime to five prime, but the new strand is synthesized from five prime to three prime. So this is the chemistry behind the five prime to three prime directionality. That means that when you open a replication fork, you cannot synthesize the two complementary strands with two different chemical directions. Rather, one of the two strands has to be copied by a polymerase moving in the opposite direction, away from the fork. This was demonstrated experimentally: when people analyzed rapidly replicating DNA systems, stopped the reaction and analyzed the DNA that was being synthesized, they ended up with fragments of relatively uniform size. Okazaki was the scientist who showed that these fragments of DNA occur, and that explained how the so-called lagging strand is made.
When the replication fork opens, you have the leading strand, which can be copied directly: the DNA polymerase can just keep going as the DNA is unwound, while topoisomerases, the enzymes that take care of the overwinding of the structure in front, relax it. But there has to be another polymerase that goes in the opposite direction. Of course this one has to start somewhere, and when more DNA becomes available it has to restart. This is the reason why you end up with these Okazaki fragments. In the replication fork, the strand that can be copied directly without restarting is called the leading strand, and the other strand is called the lagging strand. But this, as some of you may already realize, leads to another problem.
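The difference between continuous and discontinuous synthesis can be made concrete with a toy model of a fork that opens in fixed-size steps. Everything here is an assumption made for illustration: the function `replicate_fork`, the step size and the example sequence are invented, and primers are ignored.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(seq: str) -> str:
    """Base-by-base complement, without reversing the order."""
    return "".join(COMPLEMENT[base] for base in seq)


def replicate_fork(top_5to3: str, step: int):
    """Toy fork moving left to right along a duplex.

    top_5to3 is the top parental strand written 5'->3'; the bottom
    strand is its complement and therefore reads 3'->5' left to right.
    Returns the continuous leading strand and a list of lagging-strand
    fragments, each written 5'->3'.
    """
    bottom_3to5 = complement(top_5to3)
    leading = ""
    lagging_fragments = []
    for start in range(0, len(top_5to3), step):
        window = slice(start, start + step)
        # Bottom template is read 3'->5' in the direction of fork
        # movement, so the new strand simply keeps growing 5'->3'.
        leading += complement(bottom_3to5[window])
        # Top template is exposed 5'->3', so its copy must be made
        # backwards, away from the fork, as a separate fragment.
        lagging_fragments.append(complement(top_5to3[window])[::-1])
    return leading, lagging_fragments


leading, fragments = replicate_fork("ATGCCGTAAGCT", step=4)
print("leading strand :", leading)
print("lagging pieces :", fragments)
```

Joining the fragments in reverse order of synthesis gives the full complementary strand written 5' to 3', which is the job left for the enzymes discussed next.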
Because we are restarting DNA synthesis, each one of these fragments begins with a little RNA primer that had to be synthesized. That has to be taken care of first of all, because copied DNA is ultimately pure DNA; it doesn't have any RNA nucleotides in it. On top of that, with this discontinuous lagging strand synthesis you end up with fragments. You have to connect them somehow. So how do you close these gaps? This was the next big question in understanding the entire machinery involved in copying DNA, and you can see how far we have come: from the basic proposal that DNA has to be copied, and how it might work chemically so that the genetic information is preserved, to the point of actually understanding the entire set of enzymes involved. So this is a striking illustration of the number of different enzymes that participate in the process.
During synthesis of these discontinuous fragments of the lagging strand, you have an RNA primer where the synthesis started with primase and continued with DNA polymerase III. Then, when the DNA is opened further, another primer initiates synthesis of the next segment, and again DNA polymerase III continues synthesizing it, but it stops when it reaches the already synthesized segment. Even though this is RNA, DNA polymerase will not peel it off; it will just stop, because DNA polymerase is used to copying single-stranded DNA. Other enzymes are responsible for opening up the replication fork, and you will hear about that later, but once DNA polymerase encounters a double-helical region it stops. At that point DNA polymerase III, which as I mentioned is the main workhorse, the fast DNA polymerase, dissociates, and a different DNA polymerase called DNA polymerase I binds.
When DNA polymerase I binds, it has the ability to copy DNA provided that it has a template and a primer, and in this case the primer is the segment of DNA previously synthesized by DNA polymerase III. So once DNA polymerase I binds, it recognizes this primer. But interestingly, this DNA polymerase has not only the polymerization active site; it also has an exonuclease active site that hydrolyzes the nucleotides ahead of it in the 5' to 3' direction. So the RNA primer is actually not excised as one piece. You will see later that it is excised one nucleotide after the other. Then, when the DNA polymerase reaches the DNA, it dissociates, and yet another enzyme has to step in and carry out the reaction of joining these strands. This enzyme is called DNA ligase.
What DNA ligase does is take this gap, which exists only at the level of the backbone. What is important is that the polymerase closes the gap up to the last nucleotide. There should not be a single nucleotide missing, so the only gap that exists is at the level of the backbone: here we do not have a covalent bond between two adjacent nucleotides. One of them has a phosphate, as is the case for the 5' end of the DNA where the nucleotides of the RNA were excised, and the other has a 3' OH group. But if you remember the chemistry of nucleotide addition, for this reaction to take place you need high-energy nucleotides, triphosphates. If you had an AMP, which has just one phosphate, you would not be able to generate the bond and extend the DNA strand by one nucleotide. DNA ligase nevertheless accomplishes this, and for this DNA ligase uses extra energy; in the higher-level classes you will see how that works.
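The two clean-up steps described above, primer replacement by DNA polymerase I and sealing by DNA ligase, can be sketched as string operations. This is a deliberately crude illustration: lowercase letters stand for RNA primer nucleotides, uppercase for DNA, and the function names and example fragments are invented.

```python
# RNA nucleotide (lowercase) -> DNA nucleotide that replaces it.
RNA_TO_DNA = {"a": "A", "c": "C", "g": "G", "u": "T"}


def replace_primer(fragment: str) -> str:
    """Pol I-like step: RNA nucleotides are swapped for DNA one
    nucleotide after the other; DNA nucleotides are left untouched."""
    return "".join(RNA_TO_DNA.get(nt, nt) for nt in fragment)


def count_nicks(fragments: list) -> int:
    """Backbone breaks between adjacent fragments before ligation."""
    return max(len(fragments) - 1, 0)


def ligate(fragments: list) -> str:
    """Ligase-like step: join adjacent fragments into one strand."""
    return "".join(fragments)


# Three hypothetical Okazaki fragments, listed 5'->3' along the new
# strand, each starting with a three-nucleotide RNA primer.
okazaki = ["gcuATTGC", "auaCCGTA", "cgaTTAGC"]

dna_only = [replace_primer(fragment) for fragment in okazaki]
print("after primer replacement:", dna_only)
print("nicks left for ligase   :", count_nicks(dna_only))
print("after ligation          :", ligate(dna_only))
```

The order matters in the same way as in the lecture: ligation only makes sense once every fragment consists of DNA and no nucleotide is missing between neighbours.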
So just to put the entire replication fork and the entire set of enzymes, machinery and associated proteins in context: before the DNA can be copied, it has to open up, and this happens at particular locations, the so-called origins of replication. The DNA is opened up, and then there are enzymes called helicases that literally use energy to unwind this DNA. That's the force, a little bit like what I was saying about how to unwind a long coiled rope: you have to pull on the two sides. So these are DNA helicases. To deal with the overwinding of the DNA in front, you have topoisomerases further down the line, but in the replication fork itself you have this helicase.
In addition, this enzyme that opens up the DNA, and that uses a lot of ATP, locally generates single-stranded DNA, and that's a big problem. Single-stranded DNA is extremely sensitive. Any sort of cut in single-stranded DNA means that the two DNA pieces will separate, and it is very difficult to bring them back together. These are some of the most deleterious, most damaging types of breaks that can happen to DNA. If you change a single nucleotide of the DNA it's a problem, but not nearly as big a problem as when you cut the DNA and it ends up being separated. These types of anomalies cause major genetic problems. If, for example, the daughter bacterial cell ended up with partial or fragmented DNA, that would typically be a huge problem, a lethal type of damage.
So to prevent the very sensitive single-stranded DNA from being cut by, let's say, some enzymes, or by chemicals, or by radiation, for example if the bacterium is growing somewhere where there is sunlight and UV radiation, there are small proteins called single-strand binding proteins that form tetramers and immediately coat the single-stranded DNA.
Okay, so this is the concept of what is happening around the replication fork. The leading strand is immediately copied by DNA polymerase III in the 5' to 3' direction: as the replication fork is unwound, this polymerase synthesizes the complementary DNA strand. On the lagging strand, on the other hand, primase first generates small segments of RNA, these little primers. They are synthesized first, and only once you have these primers can synthesis continue with the DNA polymerase III holoenzyme; the primers here, here and here are not shown. So the DNA polymerase holoenzyme is copying this, but as soon as it reaches the next primer and the next Okazaki fragment it falls off. Then DNA polymerase I comes in, digests the primer and closes the gap, and finally DNA ligase connects the Okazaki fragments into one continuous piece of DNA.
The two replication forks of any replication bubble that is formed go in opposite directions, and ultimately these two forks run into each other when the circle is closed. For bacterial genomes, the replication bubble opens at a site called the origin of replication. The two replication forks go in opposite directions, and as a result you copy both strands of DNA.
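Combining the two forks described here with the figures quoted earlier in the lecture (4.6 million base pairs, 20 minutes, about 10.5 base pairs per helical turn) gives a feel for the speeds involved. The sketch below assumes that both forks start at the same time and move at the same constant speed, which is a simplification; the function names are invented for illustration.

```python
GENOME_BP = 4_600_000      # genome size quoted earlier in the lecture
REPLICATION_S = 20 * 60    # 20 minutes, as quoted earlier
BP_PER_TURN = 10.5         # helical repeat quoted earlier


def fork_speed(genome_bp=GENOME_BP, total_s=REPLICATION_S):
    """Base pairs per second per fork, if two forks share the circle
    equally (assumption)."""
    return genome_bp / (2 * total_s)


def fork_positions(t_seconds, origin=0, genome_bp=GENOME_BP,
                   total_s=REPLICATION_S):
    """Positions (in bp along the circle) of the two forks at time t."""
    travelled = min(genome_bp * t_seconds / (2 * total_s), genome_bp / 2)
    clockwise = (origin + travelled) % genome_bp
    counterclockwise = (origin - travelled) % genome_bp
    return clockwise, counterclockwise


speed = fork_speed()
print(f"each fork copies about {speed:.0f} bp per second")
print(f"that unwinds about {speed / BP_PER_TURN:.0f} helical turns per second")
print("forks after 10 min:", fork_positions(600))
print("forks after 20 min:", fork_positions(REPLICATION_S))
```

Under these assumptions each fork moves at roughly 1,900 base pairs per second and has to unwind on the order of 180 helical turns per second, and after 20 minutes both forks sit at the same position, opposite the origin on the circle.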
So the green part here and the light yellow part here will be the strands complementary to the original two green and light yellow strands. You will literally have this circle peel away from itself, as shown here. The two replication forks run in opposite directions, and when they collide, that's the critical moment: that's the trigger for replication fork disassembly. It's a signal for the replisomes, the assemblies of all the machinery, the DNA polymerase, the helicase and so on, to disassemble when the two replication forks collide. This usually happens in a region that is referred to as the terminus of replication. And when the two replication forks disassemble, you end up with two fully synthesized, identical circular DNA genomes. And here is just a little bit of a magnification of the origin of replication, the site of initial melting of the DNA.
So how about this replication assembly, which is called the replisome? What does it look like? Well, because the copying of the leading and lagging strands has to be coordinated, the DNA polymerases are not separate enzymes, one copying the leading and one the lagging strand. I think this will be clearer if you think of the following situation. If you have this DNA replication fork opening, this polymerase on the leading strand could be much, much faster and keep going without much trouble. This other polymerase has to be restarted. So the worry, if you don't synchronize this copying and this copying, is that one side of the DNA gets copied much faster and you end up with very long stretches of single-stranded DNA exposed. And this is not what should happen. So two copies of DNA polymerase actually have to carry out their jobs together in a replisome. And they simultaneously copy the leading strand and the lagging strand.
Because of this, two copies of the polymerase are joined with additional proteins, with helicase, with primase, and with additional elements that increase processivity, which you will hear about in higher classes. With this I would like to end and show you a little bit of an animation that is absolutely stunning. It's the entire replisome working towards copying DNA at the site of the replication fork. In the higher-level classes that follow, you will come to understand the mechanism of each one of these enzymes perfectly well: the DNA polymerase, the enzymes involved in ensuring processivity, the primase and what it does, and the components that actually load the entire complex onto the DNA. What you notice here is that the polymerase is held in place, but the lagging strand extends a little bit like a trombone. This is what allows the holoenzyme, which has two copies of the polymerase, to stay together while the lagging strand is being copied. So in a way, the process has to solve a very complex geometric problem, because you have to open the replication fork and at the same time copy in opposite directions. This is achieved such that one strand is kept in place and the other one is allowed to balloon out. And this is what these Okazaki fragments are. This is an animation produced by Howard Hughes, and I will show you some links to different animations that you can look up separately, because you probably cannot see this one at high enough resolution in this type of recording.