11_BL_20231030_WhisperAI_3.docx
ETH Zürich
Full Transcript
Welcome to the third lecture on the topic of DNA replication, repair and recombination. Just a brief reminder of what was discussed last week. I introduced to you the chemical and geometric features of the DNA molecule. First of all, you should be aware that the DNA molecule is a biological polymer made out of two anti-parallel strands, where the backbone of this double helical molecule consists of sugars and phosphates, and that in between the backbone strands there are bases interacting with each other according to Watson-Crick rules, with G interacting with C and A interacting with T. The first proposed structure of the DNA molecule, so-called B-form DNA, explained the mechanism by which DNA can be copied. However, the architectural features of B-form DNA, with a distance of 3.4 angstroms and a rotation of 36 degrees between successive base pairs, cause the two strands to actually wrap around each other. So the double helix is in fact coiled from two anti-parallel strands. As a result, in order for this DNA to be copied, the two strands have to separate, which generates massive overwinding of the DNA in front of the region that is opened up, that is separated. And because of this, a number of enzymes have to be involved that deal with these topological problems faced in the process of copying the information in the two DNA strands. The separation of strands is necessary because experimental data showed us that the copying of DNA is semi-conservative. That means that from the two parent strands of DNA, we generate two DNAs, each one of them consisting of one old strand and one newly synthesized strand. There are other enzymes that are of course necessary to copy this information in DNA. The first problem, this overwinding, is dealt with by an enzyme called topoisomerase. It will cut one strand of the DNA, allow it to relax and then reconnect it. The actual copying mechanism again has to occur in a certain chemical direction.
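As a small illustration of the geometry just mentioned, the following sketch works out how many base pairs make up one helical turn and how many turns have to be unwound to open a stretch of DNA. It uses only the 36 degree rotation quoted above; the function names are made up for this example.

```python
# Geometry of B-form DNA as quoted in the lecture.
TWIST_PER_BP_DEGREES = 36.0  # rotation between successive base pairs


def base_pairs_per_turn():
    """Number of base pairs needed for one full 360 degree turn of the helix."""
    return 360.0 / TWIST_PER_BP_DEGREES


def helical_turns(n_base_pairs):
    """Number of helical turns contained in a stretch of n_base_pairs."""
    return n_base_pairs * TWIST_PER_BP_DEGREES / 360.0


if __name__ == "__main__":
    print("base pairs per turn:", base_pairs_per_turn())
    # Separating 1000 base pairs means this many turns must be taken out,
    # which is the winding that topoisomerase has to relieve ahead of the fork.
    print("turns in 1000 bp:", helical_turns(1000))
```

With 36 degrees per base pair, one turn takes ten base pairs, so every thousand base pairs that are opened correspond to a hundred turns that have to go somewhere.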
So I have shown you that chemically the extension of the newly synthesized chain proceeds in a five prime to three prime direction. That means that on the three prime end of the primer, the strand of DNA that has to be extended, a new nucleotide is incorporated. Nucleotide triphosphates are used as an energy source and as a substrate for the next addition of a nucleotide that is exactly complementary to the nucleotide on the template strand of DNA. As a result of this reaction, pyrophosphate leaves and the next nucleotide is connected with a phosphodiester bond. So you end up with this increasing length of the primer DNA strand that ultimately becomes the copied DNA strand. The enzyme responsible for this is DNA polymerase III. However, when you open up the two strands of DNA, because they have anti-parallel chemistry, one strand has a five prime chemical end. That means that the carbon at the five prime position in the deoxyribose is at the end of one of these two strands, and on the other strand there is an OH group connected to the three prime carbon in the deoxyribose ring. And so if you're always synthesizing from five prime to three prime, that means that when you're opening the DNA, one of the two new strands has to be synthesized in the direction opposite to the direction in which the DNA is being opened. This is another problem, and I showed you how it is resolved. Of the two newly synthesized strands, one is the leading strand and one is the so-called lagging strand, and this lagging strand is synthesized in the opposite direction. That means that it has to constantly restart. The synthesis has to restart, and these regions of DNA that are being synthesized are called Okazaki fragments. But then the question arose, well, you cannot synthesize a copy of a DNA strand in fragments.
You kind of have to connect them, and this connection happens with DNA polymerase I, which closes the gaps fully and removes the original primer, which you will remember consists of an RNA molecule rather than DNA. That's how DNA copying starts, with an RNA beginning, and this beginning is removed because it's less accurate than the rest of the synthesis. So this DNA polymerase I will remove the primer through its exonuclease activity, and then the ligase will connect these Okazaki fragments on the lagging strand of DNA. So this kind of summarizes everything that I have told you. You have a primase synthesizing primers, this is the replication fork, DNA polymerase III is responsible for fast synthesis, and DNA polymerase I for closing the gaps and removing the primer. And of course we also have a molecule that kind of opens this DNA, helps it unwind. This molecule is called helicase. It uses ATP as power to help the copying machinery open up the DNA and separate it. So what's happening next? Well, once the DNA is copied, and this is done very accurately by these polymerases, they will make only one error in a million newly incorporated nucleotides. There are other mechanisms that correct possible errors that were introduced by the polymerase to reduce the error rate to one in a million. But even then, once the DNA is fully copied, it'll experience occasional damage due to environmental effects, and even those errors have to be corrected. This is the so-called correction of spontaneous mutations. So let me first tell you how polymerases have these additional mechanisms to correct errors. Okay? Sometimes it'll simply happen that the polymerase incorporates a wrong nucleotide. There can be different reasons for this. For example, the nucleotides can exist in different tautomeric states, which change the possible hydrogen bonding pattern, and due to this, the polymerase will rarely but occasionally incorporate the wrong nucleotide. What will happen then?
Well, the wrong nucleotide will not base pair according to Watson-Crick rules with the nucleotide on the template strand, and so this end of the newly synthesized DNA will tend to fray apart. It'll kind of separate a little bit. A remarkable feature of DNA polymerases is that, in addition to the active site that I described to you at the base of this right-hand shaped molecule, these polymerases, for example DNA polymerase III, have an extra so-called exonuclease active site. This exonuclease active site will degrade, it'll hydrolyze, nucleotides in the opposite direction of synthesis, so in the direction three prime to five prime. So what happens when a wrong nucleotide is incorporated is that the active site of the polymerase cannot keep going and the DNA will tend to kind of separate. Once it separates, it'll enter this additional active site on DNA polymerase, the exonuclease active site, and this exonuclease active site will remove this nucleotide, the DNA will rebind and polymerization can continue. And this is a very elegant way in which wrongly incorporated nucleotides are corrected on the fly by DNA polymerase. So what kind of mutations do we usually see? Just in terms of terminology, I would like to introduce the kinds of mutations that can happen. First, there can be so-called point mutations, and this is exactly the case that would happen if DNA polymerase incorporated the wrong nucleotide and somehow failed to correct it. So this happens in less than one in a million cases. If you have a purine to purine change, that's called a transition. If you have a purine to pyrimidine change, when you're changing the type of the nucleotide, this is called a transversion. This is the less likely type of change, because it's more difficult to explain a nucleotide of a completely different size being incorporated than a wrong nucleotide of the same size.
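The transition and transversion terminology can be written down as a tiny classifier. This is only a sketch of the definition given above (a change within the same class of base is a transition, a change between purine and pyrimidine is a transversion); the function name is invented for the example.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_point_mutation(original, mutated):
    """Return 'transition' or 'transversion' for a single-base change in DNA."""
    valid = PURINES | PYRIMIDINES
    if original not in valid or mutated not in valid:
        raise ValueError("bases must be one of A, C, G, T")
    if original == mutated:
        raise ValueError("no change: not a point mutation")
    # Same chemical class (purine to purine, or pyrimidine to pyrimidine).
    same_class = {original, mutated} <= PURINES or {original, mutated} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


if __name__ == "__main__":
    for before, after in [("A", "G"), ("C", "T"), ("A", "C"), ("G", "T")]:
        print(before, "->", after, ":", classify_point_mutation(before, after))
```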
So the more likely case would be when you have, for example, a purine changed to a purine. Once you have a wrong nucleotide incorporated, in the next round of copying the daughter DNA will have the wrong sequence. So here you'll have a mismatch, but in the next round of copying you will actually have a wrong sequence that is fully preserved on both strands of the DNA. So let me tell you a little bit about the mechanisms that are used to correct these mutations. There are mechanisms that involve specifically excising one base. A single base can be corrected, and you will see some examples of how that is done. Other mechanisms will involve excision of a segment of DNA. This is called the nucleotide excision repair mechanism. So let's see a few examples, and I will start with the example that connects to the discussion that I just introduced, and that is the situation that happens if DNA polymerase makes a mistake. So, for example, say you have a parent strand of DNA and a daughter strand where the polymerase accidentally ended up incorporating a T opposite a G. Ordinarily T should only be incorporated opposite an A. Then you have a set of proteins, and enzymes among them, that correct this mistake. But just think about it for a minute. Once you have a mismatch between two bases in the double helical DNA, how does the repair machinery know which one of these nucleotides is correct and which one is wrong? Because in a way they're both wrong, they are not complementary. So the repair machinery has to know which strand of the molecule is the parent strand and which one is the newly synthesized one, because the newly synthesized one is the one that is likely to have the error.
And this is a very elegant mechanism. The researchers who discovered these repair systems made their discoveries decades ago, but a few years ago they received a Nobel Prize for discovering these repair mechanisms, which have very important implications for our understanding of possible sources of cancer and other diseases. So the way these repair molecules can recognize which strand should be corrected is through teamwork with so-called methylases, enzymes that modify DNA. These methylases will periodically modify certain nucleotides in DNA, and this is an example from E. coli. When the DNA is in the cell for a while, it'll get methylated. A newly synthesized strand of DNA will not immediately be methylated, it'll take a little while. So if it takes some minutes for the newly synthesized strand to be methylated, that's the window of opportunity for these correction enzymes to recognize the mismatch and correct it. And because one strand is not methylated, the repair system will know this is the strand that has the wrong nucleotide and has to be corrected. So here is the system of enzymes. One of them, MutS, will recognize the mismatch. MutL will connect the protein that recognizes the mismatch to an enzyme that will cut the DNA strand that has the mistake, the blue one, the newly synthesized one. It's an endonuclease. Once this is cut, an exonuclease will chew away hundreds of nucleotides of this strand, including the wrong nucleotide. And the remaining single strand of DNA has to be protected. So you'll remember there's the single-strand binding protein that'll bind here, and then the major workhorse of DNA replication, polymerase III, will close the gap, and then ligase will connect the two ends and you have a repaired DNA molecule, and then this one will be methylated. So this is in E. coli, but in fact mutations in homologous enzymes, components of this repair machinery, can lead to increased chances of cancer in humans.
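The logic of strand discrimination can be sketched in a few lines of code. This is a deliberately simplified toy model, not a description of the actual enzymes: it just uses the methylation flag to decide which strand is the parent and rewrites the mismatched position on the other one, whereas the real system removes and resynthesizes a long stretch of the new strand. The `Strand` class and the function names are hypothetical, and the daughter strand is written so that position i pairs with position i of the parent.

```python
from dataclasses import dataclass

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


@dataclass
class Strand:
    sequence: str     # position i pairs with position i of the partner strand
    methylated: bool  # True for the old (parent) strand


def find_mismatches(strand_a, strand_b):
    """Positions where the two strands are not Watson-Crick complementary."""
    return [
        i
        for i, (x, y) in enumerate(zip(strand_a.sequence, strand_b.sequence))
        if COMPLEMENT[x] != y
    ]


def repair_mismatches(strand_a, strand_b):
    """Use the methylated strand as the template and fix the unmethylated one."""
    if strand_a.methylated == strand_b.methylated:
        # Both or neither methylated: parent and daughter cannot be told apart.
        raise ValueError("cannot tell parent strand from daughter strand")
    parent, daughter = (strand_a, strand_b) if strand_a.methylated else (strand_b, strand_a)
    repaired = list(daughter.sequence)
    for i in find_mismatches(parent, daughter):
        repaired[i] = COMPLEMENT[parent.sequence[i]]
    return Strand("".join(repaired), methylated=False)


if __name__ == "__main__":
    parent = Strand("AGCT", methylated=True)
    daughter = Strand("TTGA", methylated=False)  # T opposite G at position 1
    print("mismatches at:", find_mismatches(parent, daughter))
    print("repaired daughter:", repair_mismatches(parent, daughter).sequence)
```

The point of the sketch is the `ValueError` branch: once both strands are methylated, the information about which base is the wrong one is gone, which is why the repair has to happen within the short window after replication.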
So one particular type of cancer, colon cancer, so-called hereditary nonpolyposis colorectal cancer, which affects a significant number of people, actually happens due to a mutation in our DNA repair system that is homologous to the system found in bacteria. Okay, how about other examples? In addition to these types of errors that the polymerase can make, there are spontaneous mutations that can happen. Acids, chemicals, cigarette smoke, pollution, UV radiation, all of these sources can cause different types of damage to the DNA. And just look at this situation here. If you have an adenine exposed to an acidic environment, you basically have deamination of this exocyclic amine, and because of that you end up with an oxygen in its place, a keto group, and then in the end this base, which typically does not exist in DNA, can base pair differently than the original A. And so you have unusual base pairing due to chemical modification of the naturally occurring bases. Here is another example, together with the system that is used to correct it. If you have a cytosine base and through acidic deamination you end up changing this base, you will not generate a new type of base but actually a base that exists in RNA molecules, uracil. And what is interesting, uracil will pair very effectively, just the way a thymine pairs. So instead of pairing with G the way cytosine would do, uracil will pair with an A like thymine does. However, uracil never exists in DNA, thymine does, and because of this there are enzymes that will simply screen the DNA and find the place where there is a uracil, because the only way this base could be there is if it is a result of deamination of cytosine.
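The reasoning here, that any uracil found in DNA must have started out as a cytosine, is simple enough to sketch. The code below is only an illustration of that reasoning on a plain string; the function names are invented, and the real repair involves several enzymatic steps rather than a string replacement.

```python
# Which base each base pairs with; uracil behaves like thymine.
PAIRS_WITH = {"A": "T", "T": "A", "G": "C", "C": "G", "U": "A"}


def deaminate_cytosine(dna_strand, position):
    """Simulate damage: turn the cytosine at the given position into uracil."""
    if dna_strand[position] != "C":
        raise ValueError("only cytosine can be deaminated to uracil")
    return dna_strand[:position] + "U" + dna_strand[position + 1:]


def find_uracils(dna_strand):
    """Screen a DNA strand for uracil, which should never be there."""
    return [i for i, base in enumerate(dna_strand) if base == "U"]


def restore_cytosines(dna_strand):
    """Every uracil in DNA is assumed to be a deaminated cytosine."""
    return dna_strand.replace("U", "C")


if __name__ == "__main__":
    damaged = deaminate_cytosine("GACT", 2)
    print("damaged strand:", damaged)
    print("uracil at positions:", find_uracils(damaged))
    print("C pairs with", PAIRS_WITH["C"], "but U pairs with", PAIRS_WITH["U"])
    print("repaired strand:", restore_cytosines(damaged))
```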
And then there are enzymes that will cut out the base. The U will be cut out because it shouldn't be in DNA, and then an endonuclease will cut the DNA strand at the position that doesn't have a base. Then there is an exonuclease that will excise this nucleotide, the abasic nucleotide, DNA polymerase I will put the right complementary nucleotide here, and the gap will be closed by a ligase. Now you may have wondered, why does an RNA molecule consist of a different set of nucleotides than DNA? Why is there uracil in RNA, and why is it replaced with thymine in DNA? Well, again, this is a mechanism by which you can better correct errors. RNA molecules are typically short-lived, they are not sources of genetic information except in some viruses, and for long-term storage thymine is a more stable and easier to error-correct type of base than uracil is, and this is the reason why DNA contains thymine. So I mentioned to you that, depending on external factors, physically damaging radiation or reactive chemicals, you can end up damaging DNA considerably. But in spite of this, our organism, through the use of these enzymes that can repair the DNA, can handle considerable DNA damage, and we in fact do. There are thousands and thousands of nucleotides that are being damaged in our DNA in every cell every day, and enzymes are keeping track of it and correcting the errors. However, we would never be able to handle, let's say, lethal doses of radiation, but some organisms can deal with such extreme radiation doses. So just look at this. A 10 gray dose is lethal for us, however you have some bacteria that can take 6,000 gray and be fine. Somehow their repair machinery can deal with this massive DNA damage. There are even some multicellular organisms like these that can move around. These are almost cute microscopic animals, they can barely be seen under the microscope, and they're called water bears.
These water bears can withstand 4,000 gray, and they're such remarkable animals for studying DNA repair and other aspects of unusual biology that can be applicable to understanding biology in general. These little organisms are even used for experiments, for example on the space shuttle. So the astronauts take them and they put them out of the space shuttle, literally. They let them freeze at temperatures close to absolute zero. They expose them to direct sun without any protective layer of ozone or our atmosphere. Then they bring them back and these guys, once the solution thaws, happily start walking around and mating and, you know, just living happily ever after. So what kind of systems do these organisms have, what kind of machinery, to avoid dying from the kind of DNA damage that literally shatters their DNA into pieces? We are not talking about, let's say, depurinations or a single cut of DNA, or maybe changing the chemical characteristics through some sort of reactive chemical radicals caused by radiation so that you change the identity of a particular nucleotide or a base. No, what we're talking about here is an intensity of radiation that literally generates double-strand breaks in DNA, and things fly apart. They get shattered, and these organisms can deal with it. So how is that dealt with? Well, the system that actually can deal with this type of catastrophic DNA damage is called recombination. So it's a special case of DNA damage repair, and in addition to this, it's also a source of genetic diversity, which I'm sure you have already heard about. Recombination can occur when you have DNA molecules that are similar in sequence, and this is the reason why many organisms that can withstand such high radiation damage will have a larger amount of these recombination enzymes and also multiple copies of their DNA. So that's the secret. Recombination is also important, for example, for the generation of antibodies.
This is happening throughout our life, and this is happening through recombination. Viruses, for example, use a slightly different type of recombination, and I'll tell you about it, to introduce their DNA into our genome. So how does this recombination work? There are two types of recombination. I'll first tell you about site-specific recombination, which is used by viruses, and then I will tell you about homologous recombination, which is used, for example, for repairing damage in the DNA. Site-specific recombination follows this mechanism. These site-specific recombinases, such as this Cre recombinase, recognize a particular sequence in the two DNA molecules that have to recombine, and I'll show you a little bit how this recombination works. Now, in terms of terminology, there is a very important intermediate of the recombination reaction that occurs both in homologous recombination and in site-specific recombination, and this intermediate of the reaction is called a Holliday junction. So this is the Holliday junction that forms during recombination. This recombinase enzyme has four domains, and they change conformation to cut and recombine the DNA molecules. There are two DNA molecules, and you'll see how that works. So we start with two DNA molecules with specific sequences where the recombination takes place. The word recombination kind of explains it. You basically take one DNA molecule, you cut it and you connect it differently, to a different DNA molecule, and the other one also has to be cut. So in a way the process is a little bit related to what topoisomerases do. They have to cut and reconnect the DNA, but here it's a much more complex process. It involves a few steps. First, one strand on each of the two DNAs has to be cut, and this is the cleavage reaction.
So one strand, the blue strand, is cut here, and the purple strand is cut on the opposite molecule, and then, just the way topoisomerases cut and reconnect, this recombinase reconnects, but it reconnects the DNA such that the blue one is now connected to the purple one and the purple one is connected to the blue one. So basically you end up with a cross connection of the DNA, where the blue strand belongs to both molecules and the purple one now belongs to both molecules. And after this the molecule, this tetramer, has to change configuration. So, the difference between this and this: you will notice that this one has a wider angle on top and this one now has a narrower angle on top. So basically you have this type of motion between the two DNAs. So it's like this and like this. So it's kind of moving like that. And because of this, you get another cleavage reaction, just the way it happened here. But this time, instead of the blue and purple strands, you cut the red and the orange. And then after that you reconnect them, and as a result you have orange and purple DNA connected in line with blue and red. And here you have the blue and red again connected to orange and purple. So this way you can introduce, for example, a piece of viral DNA into the genome. And suddenly, you know, you have this viral DNA, let's say if it's circular or something like this, it can just connect here, or some other types of recombination can take place. So how about this other type of recombination, the one that repairs damage? Let me show you what is happening here. Homologous recombination will basically happen if you have two similar molecules. And it's not happening at a particular sequence, because the two molecules are in general very similar. They don't have to be identical. That's the interesting thing.
And this is basically what happens when you have production of the cells that are used for reproduction. When you produce those, you take different parts of DNA from your parents and you kind of put them in different combinations in the reproductive cells. And this is the reason why every reproductive cell has a different combination of elements that we inherited from our parents. The molecules are recombined so that, as a result of this recombination, you have a mosaic of, let's say, black and red regions of DNA. The mechanism is very similar to what you have seen before. So basically, to cross over the DNA, you have to introduce a cut on one single strand of both of the DNAs. Then they're reconnected, but they're not connected back the way they were. They're connected across. And this is exactly the same thing as the Holliday junction that I already introduced to you. And then here is the only difference to site-specific recombination, where you have immediate cutting and reconnection of the second strand: here this crossing over can migrate. So it can move around before the second strand is cut, because there is no tetramer that keeps things in place. Because of this, you can end up with very different scenarios for how the two molecules are exchanged. And there are enzymes responsible for this type of homologous recombination. They're also responsible for repair. The key molecule in bacteria, and we have related molecules in our organism, is called the RecA protein. It uses ATP to recognize and bind double-stranded DNAs and facilitate the exchange, such that one of the two strands of the double-stranded DNA is moved out of the way and a new DNA strand is merged in. So this is using energy to facilitate this strand crossover and recombination. So just to introduce a very short kind of break.
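The mosaic outcome of a crossover can be illustrated with a toy model. This is only a sketch on strings, under the assumption of a single crossover at a randomly chosen position between two molecules of equal length; the names are invented and nothing here models the Holliday junction or RecA itself.

```python
import random


def crossover(molecule_a, molecule_b, rng):
    """Exchange everything after one random position between two molecules.

    Returns the two recombinant molecules and the crossover position.
    """
    if len(molecule_a) != len(molecule_b):
        raise ValueError("this toy model needs two molecules of equal length")
    if len(molecule_a) < 2:
        raise ValueError("molecules must have at least two positions")
    # The junction can migrate, so the final position is not fixed in advance.
    point = rng.randrange(1, len(molecule_a))
    recombinant_1 = molecule_a[:point] + molecule_b[point:]
    recombinant_2 = molecule_b[:point] + molecule_a[point:]
    return recombinant_1, recombinant_2, point


if __name__ == "__main__":
    rng = random.Random()
    # 'b' marks black regions from one parent, 'r' red regions from the other.
    for _ in range(3):
        first, second, point = crossover("bbbbbbbbbb", "rrrrrrrrrr", rng)
        print("crossover at", point, ":", first, second)
```

Each run gives a different pair of mosaics, which is the point: the same two starting molecules can be exchanged in many different ways.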
In thinking about recombination, what is important is that the place where the molecule is cut, and the place where this Holliday junction will migrate to and reconnect, is not predictable. This will happen through homologous recombination in all sorts of different ways. So suppose somebody tells you, okay, I really want to have this kind of offspring. And, you know, you choose parents, or the parents choose each other, that have certain characteristics. And, you know, this is the father, this is the mother, beautiful. And they expect a beautiful little baby, but the recombination cannot be predicted. So this might be the product of recombination between these parents. It's nevertheless cute, but very different from the expectation. Okay, so for the rest of the class today, I would like to tell you a little bit more about the terminology for the systems that I have introduced to you. And that includes the higher-order organization of the DNA molecule. In most organisms, the DNA carrying the genetic information is organized into what are referred to as chromosomes. These are large DNA molecules that have many elements, particular sequence features in the DNA molecule. A chromosome in the form of a single circular DNA is the predominant genetic element in prokaryotes. So most bacteria and archaea have a single circular chromosome, double-stranded DNA carrying most genes. On the other hand, eukaryotes have two or more linear chromosomes. So we have a number of linear chromosomes that are not circular, their ends are not connected. And that has important implications for understanding copying. For example, you should be aware that the E. coli chromosome is 4.6 million base pairs. And interestingly, in terms of this terminology about what genes are, E. coli, for example, has 4,300 genes in this 4.6 million base pair chromosome. However, this also means that not all of this DNA is in genes. There is a lot of DNA in between that has other functions.
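To get a feeling for these numbers, here is a short back-of-the-envelope calculation. It uses the genome size and gene count just quoted and, taking the one-in-a-million figure from earlier in the lecture at face value, estimates how many errors one round of copying would leave behind. The function names are made up for the example.

```python
ECOLI_GENOME_BP = 4_600_000   # base pairs, as quoted in the lecture
ECOLI_GENE_COUNT = 4_300      # genes, as quoted in the lecture
ERROR_RATE_PER_BP = 1e-6      # "one in a million", taken at face value


def average_bp_per_gene(genome_bp, gene_count):
    """Average stretch of chromosome available per gene, including gaps."""
    return genome_bp / gene_count


def expected_errors_per_copy(genome_bp, error_rate_per_bp):
    """Expected number of wrong nucleotides after copying the genome once."""
    return genome_bp * error_rate_per_bp


if __name__ == "__main__":
    per_gene = average_bp_per_gene(ECOLI_GENOME_BP, ECOLI_GENE_COUNT)
    errors = expected_errors_per_copy(ECOLI_GENOME_BP, ERROR_RATE_PER_BP)
    print(f"about {per_gene:.0f} base pairs of chromosome per gene")
    print(f"about {errors:.1f} expected errors per copy of the chromosome")
```

At that error rate a genome of a few million base pairs would pick up several errors in every round of copying, which is why the additional repair systems described earlier matter.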
These genes often tend to cluster in groups based on the function of the product of those genes. When they are expressed, they make proteins, but you will hear more about it later. Viruses can have either RNA or DNA as genomes. They can be single or double stranded, and they can be linear or circular. All combinations are possible. Viral genomes are much smaller. So the demand on the preservation of genetic information is not as high, because even if the polymerase makes an error one in a million times, if the whole genome is less than a million base pairs, viruses will mostly be fine. And sometimes it's actually good for them to change, because that way they can avoid the defense mechanisms of their hosts. There is another interesting nucleic acid element that is not the genome. These are separate elements of DNA that copy on their own, but they are not part of the chromosome, and they are called plasmids. They are double-stranded DNA and they are usually circular. They can be quite useful, because these plasmids are relatively easy to transfer to different cells. For example, if a bacterium has an antibiotic resistance gene on a plasmid, this will be beneficial for the whole bacterial population, because even if this bacterium dies, the plasmid will be very stable and it might be picked up by a neighboring bacterium, which will then become resistant, and so on, and then this resistance will spread. Plasmids can vary in copy number and size. So there may be a single genome, but a particular plasmid with the same sequence can be present in 100 copies in the cell, in the prokaryotic cell. These plasmids are inside bacterial cells, and of course, in addition to some antibiotic resistance genes, they can also encode some product that can help the bacterium defend itself from other types of bacteria around it. This is what these plasmids look like in relation to bacterial genomic DNA.
This is approximately the size of the bacterium and this is the size of the genomic DNA. Already last time I introduced to you this question of how much longer the genomic DNA is than the cell itself. You can see here how enormously large it is once the DNA is kind of splayed apart.
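The comparison between the length of the genomic DNA and the size of the cell can be estimated from numbers given in this lecture: 3.4 angstroms per base pair and 4.6 million base pairs for the E. coli chromosome. The cell length of about 2 micrometers is an assumption added here for illustration, not a number from the lecture, and the function name is invented.

```python
RISE_PER_BP_ANGSTROM = 3.4        # distance between base pairs in B-form DNA
ECOLI_GENOME_BP = 4_600_000       # base pairs, as quoted in the lecture
ANGSTROM_PER_MICROMETER = 10_000  # 1 micrometer = 10,000 angstroms
ASSUMED_CELL_LENGTH_UM = 2.0      # ASSUMPTION: rough length of an E. coli cell


def contour_length_um(n_base_pairs):
    """Length of stretched-out B-form DNA in micrometers."""
    return n_base_pairs * RISE_PER_BP_ANGSTROM / ANGSTROM_PER_MICROMETER


if __name__ == "__main__":
    dna_um = contour_length_um(ECOLI_GENOME_BP)
    print(f"chromosome length: about {dna_um:.0f} micrometers "
          f"({dna_um / 1000:.2f} mm)")
    print(f"that is roughly {dna_um / ASSUMED_CELL_LENGTH_UM:.0f} times "
          f"the assumed cell length")
```

Under these assumptions the chromosome is on the order of a millimeter and a half long, several hundred times the length of the cell that has to contain it.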