2024 Chapter 2 Structure and Function of Nucleic Acids PDF
Wuhan University Taikang Medical School
Ruilin Zhang
Summary
This document presents the structure and function of nucleic acids, including nucleotides and various aspects of DNA and RNA, suitable for an undergraduate-level biochemistry or biology course. It covers elements, molecular components, structural units, and functions.
Chapter 2 Structure and Function of Nucleic Acids
Ruilin Zhang
Department of Biochemistry and Molecular Biology
Wuhan University TaiKang Medical School

The Chemical Components of Nucleic Acids
Nucleic acids are polymers of nucleotides. The two kinds are deoxyribonucleic acid (DNA) and ribonucleic acid (RNA). Their function is to carry and transmit genetic information.
1. Elemental composition: C, H, O, N, P (phosphorus makes up 9~10%)
2. Molecular components:
- base: purine, pyrimidine
- pentose: ribose, deoxyribose
- phosphate

2.1 Structural Units of Nucleic Acids: Nucleotides
Nucleotides are the building blocks of nucleic acids. A nucleotide is composed of a base, a pentose, and one, two, or three phosphate groups.

Bases
- Purines: adenine (A), guanine (G)
- Pyrimidines: cytosine (C), uracil (U), thymine (T)
[Figure: the purine ring (positions 1 to 9) with adenine and guanine; the pyrimidine ring (positions 1 to 6) with uracil, cytosine, and thymine]

Pentoses
- Ribose (RNA): OH at both the 2′ and 3′ positions
- Deoxyribose (DNA): H in place of the 2′ OH
[Figure: ribose and deoxyribose, carbons numbered 1′ to 5′]

The structure of nucleotides
Pentose + base = nucleoside, joined by a β-N-glycosidic bond. Nucleosides are more water soluble than free bases.
- RNA nucleosides: adenosine, guanosine, cytidine, uridine
- DNA nucleosides: deoxyadenosine, deoxyguanosine, deoxycytidine, deoxythymidine
[Figure: structures of the four ribonucleosides and the four deoxyribonucleosides]
Nucleotides are phosphorylated nucleosides: a phosphoryl group is attached to the pentose through a phosphate ester bond.

Functions of Nucleotides
- Substrates for the synthesis of nucleic acids
- Energy currency: ATP, GTP
- Second messengers in signal transduction pathways: cAMP, cGMP
- Structural components of several essential coenzymes: NAD, FAD, HSCoA
- Carriers of activated intermediates in the synthesis of some carbohydrates, lipids, and proteins: UDP-G, CDP-DAG, SAM, ATP
[Figure: structure of cAMP]

2.2 Structure and Function of DNA
1 The secondary structure of DNA
2 Superhelical structure of DNA
3 Function of DNA

2.2.1 Primary structure of DNA
Nucleic acids are polynucleotides: linear polymers of nucleotides linked by 3′,5′-phosphodiester bridges. A single-stranded DNA sequence is written in the 5′ to 3′ direction.
[Figure: a trinucleotide C-A-G showing the 5′ end, the phosphodiester bonds, and the 3′ end with its free OH]

Representations of the primary structure of DNA
- Line diagram: vertical lines stand for the pentoses and P for the phosphodiester bonds, running from the 5′ P to the 3′ OH
- 5′ pApCpTpGpCpT-OH 3′
- 5′ ACTGCT 3′

2.2.2 The secondary structure of DNA
DNA contains four deoxynucleotides: deoxyadenylate, deoxyguanylate, deoxycytidylate, and thymidylate.

Chargaff's rules (Erwin Chargaff, 1905-1995; late 1940s)
- Amounts of the four bases: A = T; G = C
- Purines (A + G) = pyrimidines (T + C)

Compositions of DNA from different species
Source of DNA | A (%) | G (%) | C (%) | T (%) | A/T | G/C | G+C (%) | Purine/pyrimidine
--- | --- | --- | --- | --- | --- | --- | --- | ---
E. coli | 26.0 | 24.9 | 25.2 | 23.9 | 1.09 | 0.99 | 50.1 | 1.04
Tubercle bacillus | 15.1 | 34.9 | 35.4 | 14.6 | 1.03 | 0.99 | 70.3 | 1.00
Yeast | 31.7 | 18.3 | 17.4 | 32.6 | 0.97 | 1.05 | 35.7 | 1.00
Bull | 29.0 | 21.2 | 21.2 | 28.7 | 1.01 | 1.00 | 42.4 | 1.01
Pig | 29.8 | 20.7 | 20.7 | 29.1 | 1.02 | 1.00 | 41.4 | 1.01
Human | 30.4 | 19.9 | 19.9 | 30.1 | 1.01 | 1.00 | 39.8 | 1.01

DNA X-ray diffraction data: Rosalind Franklin and M. Wilkins.
James Watson and Francis Crick at Cambridge University in 1953: DNA is a complementary double helix; the two strands of DNA are held together by the bonding interactions between unique base pairs.
[Figure: double helix showing the hydrogen bonds, the sugar-phosphate backbone, and the major and minor grooves]
The size of DNA is expressed in base pairs (bp) or kilobase pairs (kb).

Watson and Crick model of the double-helical structure of DNA
- Two antiparallel polynucleotide strands: one runs in the 5′ to 3′ direction and the other in the 3′ to 5′ direction.
- A deoxyribose-phosphate backbone.
- Opposing bases are held together by hydrogen bonds (base pairing A=T; G≡C).
- The strands wind around a central axis in the form of a right-handed double helix.
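As a small illustration of Chargaff's rules and of the antiparallel, complementary pairing in the Watson and Crick model, the following Python sketch builds the partner strand of a single strand and counts the bases in the resulting duplex. The function names `reverse_complement` and `duplex_base_counts` are illustrative choices, not part of the lecture; the input is the example strand 5′ ACTGCT 3′ used above.

```python
from collections import Counter

# Watson-Crick pairing: A pairs with T, G pairs with C.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(strand: str) -> str:
    """Return the partner strand, written 5' to 3'.

    The partner runs antiparallel, so the complemented bases are reversed.
    """
    return "".join(PAIR[base] for base in reversed(strand.upper()))


def duplex_base_counts(strand: str) -> Counter:
    """Count each base over both strands of the duplex."""
    return Counter(strand.upper()) + Counter(reverse_complement(strand))


if __name__ == "__main__":
    top = "ACTGCT"  # 5' ACTGCT 3'
    print(reverse_complement(top))  # AGCAGT
    counts = duplex_base_counts(top)
    print(counts["A"] == counts["T"], counts["G"] == counts["C"])  # True True
```

Because every A on one strand faces a T on the other, and every G faces a C, the duplex totals always satisfy A = T and G = C, whatever the composition of the single strand.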
There are major grooves and minor grooves in the DNA molecule.

Base pairing
Base pairing depends on the hydrogen bonding of A with T and of C with G.
[Figure: the A-T and G-C base pairs with their hydrogen bonds]

Characteristics of the double-helical structure of DNA
- Right-handed double helix
- Sugar-phosphate backbone
Proteins can interact specifically with exposed atoms of the nucleotides (via specific hydrophobic and ionic interactions) in the major and minor grooves, and thereby recognize and bind to specific nucleotide sequences without disrupting the base pairing of the double-helical DNA molecule. Regulatory proteins control the expression of specific genes via such interactions.

Double-stranded DNA exists in at least six forms (A to E and Z). The B form is usually found under physiologic conditions (low salt, high degree of hydration).

Comparison of the structural properties of A-, B-, and Z-DNA
Property | A-DNA | B-DNA | Z-DNA
--- | --- | --- | ---
Helix rotation | Right-handed | Right-handed | Left-handed
Base pairs per turn | ~11 | ~10 | 12
Major groove | Very narrow, deep | Wide, moderately deep | Flattened
Minor groove | Broad but shallow | Narrow, moderately deep | Very narrow, deep
Helix diameter | 2.55 nm | 2.37 nm | 1.84 nm

The double helix is a very dynamic structure
The long-range structure of B-DNA in solution is not a rigid, linear rod. Instead, DNA behaves as a dynamic, flexible molecule.
- Localized thermal fluctuations distort the helix temporarily.
- Elastic motions of base and backbone atoms occur on a timescale of about 10^-9 s.
- Intercalating agents (aromatic macrocycles) distort the double helix.
- Sequence-dependent variations: where proteins bind to specific DNA sequences, the helix bends gently.

Weak Forces Stabilize the Double Helix

2.2.3 Superhelical structure of DNA
Supercoils are one kind of DNA tertiary structure. They are introduced when a closed circle is twisted around its own axis, or when a linear piece of duplex DNA whose ends are fixed is twisted.
- Negative supercoils are formed when the molecule is twisted in the direction opposite to the clockwise turns of the right-handed double helix found in B-DNA.
- Positive supercoils overwind the DNA double helix.
Negatively supercoiled DNA can adopt a toroidal state, forming a more stable structure by wrapping around proteins like thread around a spool. Supercoiled DNA is more compact, so it sediments faster on ultracentrifugation and migrates more rapidly in electrophoresis.

Prokaryotic DNA supercoils
In some organisms such as bacteria, bacteriophages, and many DNA-containing animal viruses, as well as in organelles such as mitochondria, the ends of the DNA molecules are joined to create a closed circle with no covalently free ends.

Eukaryotic chromosome
Chromatin is the chromosomal material extracted from the nuclei of cells of eukaryotic organisms. It consists of very long double-stranded DNA molecules and a nearly equal mass of rather small basic proteins termed histones, as well as a smaller amount of nonhistone proteins and a small quantity of RNA. The double-stranded DNA helix in each chromosome has a length that is thousands of times the diameter of the cell nucleus. One purpose of the molecules that make up chromatin, particularly the histones, is to condense the DNA.
[Figure: levels of packing, from DNA and histones to nucleosome, chromatin fiber, and chromosome]

Basic unit: nucleosome
Nucleosomes are composed of DNA wound around a collection of histone molecules.
Main components of the nucleosome:
- DNA: ~200 bp
- Histones: H1, H2A, H2B, H3, H4
Histones, a class of Arg- and Lys-rich basic proteins, interact with the anionic phosphate groups of the DNA backbone. The core histones are subject to at least six types of covalent modification: acetylation, methylation, phosphorylation, ADP-ribosylation, monoubiquitylation, and sumoylation (SUMO: small ubiquitin-related modifier).
Model for the structure of the nucleosome
The 146 base pairs of DNA are wrapped around the surface of a histone octamer consisting of two each of histones H2A, H2B, H3, and H4. This protects the DNA from digestion by a nuclease.
[Figure: histone octamer]

Higher-Order Structures Provide for the Compaction of Chromatin
10-nm fibril
In the nucleosome core particle, 1.75 superhelical turns of DNA are wrapped around the surface of the histone octamer. The core particles are separated by a linker region of about 30 bp of DNA. Most of the DNA is in a repeating series of these structures, giving the so-called "beads-on-a-string" appearance when examined by electron microscopy.
30-nm chromatin fiber
The 10-nm fibril is probably further supercoiled, with six or seven nucleosomes per turn, to form the 30-nm chromatin fiber.
[Figure: levels of chromatin compaction, labeled with base pairs per turn: 10, 80, 1200]
In interphase chromosomes, chromatin fibers appear to be organized into loops or domains of 30,000 to 100,000 bp anchored in a scaffolding (or supporting matrix) within the nucleus.
At metaphase, mammalian chromosomes possess a twofold symmetry, with the identical duplicated sister chromatids connected at a centromere, the relative position of which is characteristic for a given chromosome.

The human genome
- 23 pairs of chromosomes
- 3 billion DNA subunits (the bases A, T, C, G)
- Approximately 23,000 genes coding for proteins, which perform most life functions

Function of DNA
1. DNA is the chemical basis of heredity and is organized into genes, the fundamental units of genetic information.
2. DNA provides the template for replication and transcription.
A gene is a specific fragment of DNA. The genetic information is stored in the nucleotide sequence of the gene.

Quiz
While studying the structure of a small gene that was recently sequenced during the Human Genome Project, an investigator notices that one strand of the DNA molecule contains 20 A's, 25 G's, 30 C's, and 22 T's. How many of each base are found in the complete double-stranded molecule?
A. A = 40, G = 50, C = 60, T = 44
B. A = 44, G = 60, C = 50, T = 40
C. A = 45, G = 45, C = 52, T = 52
D. A = 50, G = 47, C = 50, T = 47
E. A = 42, G = 55, C = 55, T = 42

The chemical bond linking nucleotides in the DNA molecule is ( ).
A. Glycosidic bond
B. Hydrogen bond
C. Phosphate bond
D. Phosphodiester bond
E. Ionic bond

The main point of the DNA double helix structure is ( ).
A. Ribose and phosphate are stacked inside the double helix.
B. The bases are on the outer side of the double helix.
C. The stability of the DNA double helix does not require hydrophobic base stacking.
D. The two DNA strands are parallel.
E. The bases in the two strands are specifically paired.

2.3 Structure and Function of RNA
1 The chemical nature of RNA
2 The secondary structure of RNA
3 The species and function of RNA

The chemical nature of RNA
- In RNA, the sugar moiety is ribose rather than the 2′-deoxyribose of DNA. RNA is susceptible to hydrolysis by base, whereas DNA is susceptible to hydrolysis by acid. The alkali lability of RNA is useful both diagnostically and analytically. RNA is a relatively labile molecule that degrades easily and spontaneously.
- The pyrimidine components of RNA differ from those of DNA: instead of thymine, RNA contains the ribonucleotide of uracil.
- RNA exists as a single strand. The size of RNA is expressed in S (sedimentation coefficient).
- The RNA transcript is complementary to the template strand of the gene from which it was transcribed. The sequence of the RNA molecule (except for U replacing T) is the same as that of the coding strand of the gene.
- The guanine content of RNA does not necessarily equal its cytosine content, nor does its adenine content necessarily equal its uracil content.
- Many copies of RNA are present per cell.
[Figure: uracil (U) in RNA and thymine (T) in DNA; thymine carries an additional methyl group]

The secondary structure of RNA
RNA secondary structure is limited by intramolecular interactions and other stabilizing influences.
RNA molecules often fold into stem-loop structures (hairpins) through intrastrand base pairing between partially complementary sequences. Stems, the paired regions that form a double helix resembling the A form of DNA, are the most prominent secondary structural elements. Stems may have bulges (or internal loops) where one or two bases have no complementary base to pair with.
[Figure: stem and loop of an RNA hairpin]

The species and function of RNA
1. Various forms of RNA (messenger RNA, transfer RNA, ribosomal RNA, and small RNA) serve different roles in cells.
2. Nearly all of the several species of RNA are involved in some aspect of protein synthesis.
3. They may also act like enzymes (ribozymes), cleaving nucleic acids.

2.3.1 Structure and function of messenger RNA
In mammalian cells, including human cells, the mRNA molecules present in the cytoplasm are not the RNA products immediately synthesized from the DNA template; they must be formed by processing of a precursor molecule before entering the cytoplasm.

* Eukaryotic mRNA processing
[Figure: hnRNA (heterogeneous nuclear RNA), containing introns and exons, is processed into mRNA]

Heterogeneous nuclear RNA (hnRNA)
In mammalian nuclei, hnRNA is the immediate product of gene transcription. The nuclear product is heterogeneous in size and very large: its molecular weight may exceed 10^7, whereas the molecular weight of mRNA is less than 2 × 10^6. About 75% of hnRNA is degraded in the nucleus; only 25% is processed to mature mRNA.

* Eukaryotic mRNA structural characteristics
1. The 5′ terminal of mRNA is "capped" by a 7-methylguanosine triphosphate that is linked to an adjacent 2′-O-methyl ribonucleoside at its 5′-hydroxyl through the three phosphates: m7GpppNm-.
2. The other end of most mRNA molecules, the 3′-hydroxyl terminal, has an attached polymer of adenylate residues 20 to 250 nucleotides in length: the poly(A) tail.
[Figure: mRNA layout from 5′ to 3′: cap, 5′ non-coding region, coding region, 3′ non-coding region, poly(A) tail]
[Figure: the cap structure m7GpppNm-]

Function of the cap structure and poly(A) tail
- The cap is involved in the recognition of mRNA by the translation machinery, and it probably helps stabilize the mRNA by preventing attack by 5′-exonucleases.
- The poly(A) tail maintains the intracellular stability of the specific mRNA by preventing attack by 3′-exonucleases.

* Function of mRNA
mRNAs carry the nucleotide sequence information from DNA and act as the template for protein synthesis on ribosomes.
[Figure: in prokaryotic cells, DNA is transcribed into mRNA, which is translated into protein; in eukaryotic cells, DNA (exons and introns) is transcribed in the nucleus into hnRNA, which undergoes post-transcriptional processing into mRNA and is translated into protein in the cytoplasm]

2.3.2 Structure and function of transfer RNA
Transfer RNA adopts higher-order structure through intrastrand base pairing. All tRNA molecules fold extensively, and intrastrand complementarity generates a secondary structure that looks like a cloverleaf. tRNA contains rare bases.
All tRNAs contain five main arms or loops:
- Acceptor arm
- Anticodon arm
- DHU arm
- TΨC arm
- Extra (variable) arm
[Figure: cloverleaf structure of tRNA; the acceptor arm accepts the amino acid (a.a.)]

Acceptor arm
The acceptor arm is at the 3′ end. It ends in the unpaired sequence cytosine-cytosine-adenine (CCA). These three nucleotides are added post-transcriptionally. The 3′-OH group of the terminal adenosine binds the carboxyl group of the amino acid appropriate to that tRNA.

Anticodon arm
The anticodon arm lies at the opposite end from the acceptor arm. It recognizes the triplet codon in the mRNA: the base sequence of the anticodon is complementary to the base sequence of the mRNA codon. Because of this complementarity, the tRNA can bind specifically to the mRNA by hydrogen bonds.
DHU arm
Serves as the recognition site for the enzyme (aminoacyl-tRNA synthetase) that adds the amino acid to the acceptor arm.

TΨC arm
This arm is opposite the DHU arm. It is named for the pseudouridine (Ψ) it contains. It is involved in the binding of tRNA to the ribosome.

Extra arm or variable arm
About 75% of tRNA molecules possess a short extra arm.

* Tertiary structure of tRNA
The L-shaped tertiary structure is formed by further folding of the cloverleaf through hydrogen bonds between the T and D arms. The base-paired double-helical stems are arranged into two double-helical columns, continuous and perpendicular to one another.

Function of tRNA
tRNAs serve as adaptor molecules: each carries its specific amino acid to the site of protein synthesis, which it can do because it recognizes the nucleotide sequence (codon) in the mRNA.

2.3.3 Structure and function of ribosomal RNA
Ribosomal RNA forms a structural part of the ribosome, the site (the "factory") of protein synthesis. rRNA makes up about 80% of the RNA in the cell. Studies suggest that an rRNA component of the ribosome is itself catalytic (a ribozyme): it carries out the peptidyl transferase reaction that forms peptide bonds.
Ribosomal RNA also adopts higher-order structure through intrastrand base pairing.
[Figure: secondary structure of the 16S rRNA of E. coli]
Ribosome = protein + rRNA
[Figure: composition of prokaryotic and eukaryotic ribosomes]
Nearly all of the several species of RNA are involved in some aspect of protein synthesis.
[Figure: a ribosome (50S subunit labeled) with tRNA bound to an mRNA, 5′ AUG 3′]

2.3.4 Structure and function of non-coding RNA
- Small nuclear RNAs (snRNAs): involved in mRNA processing and gene regulation.
- MicroRNAs (miRNAs) and small interfering RNAs (siRNAs): can inhibit gene expression by degradation of specific mRNAs.
- Long non-coding RNAs (lncRNAs): chromatin remodeling and transcription control.
- Circular RNAs (circRNAs): bind microRNAs and relieve their inhibition of target genes.
miRNAs and siRNAs
Feature | miRNA (microRNA) | siRNA (small interfering RNA)
--- | --- | ---
Length | 21-25 nt | 21-25 nt, double-stranded
Origin | Encoded by endogenous genes | Mostly exogenous
Precursor | Hairpin precursors | dsRNA precursors
Targets | Recognize multiple targets; form imperfect RNA-RNA duplexes within the 3′-untranslated regions of specific target mRNAs | May be target specific; form perfect RNA-RNA hybrids with their distinct target mRNAs where the complementary sequence exists
Outcome | Translation repression (unknown mechanisms) | Degradation of the siRNA-mRNA complexes

[Figure: miRNA and siRNA pathways. Pre-miRNA hairpins and dsRNA (with RDE-1) are cut by Dicer, which contains two RNase III domains, into miRNA or siRNA of 21~25 bp, which enter RISC, the RNA-induced silencing complex. With miRNA, pairing with the target mRNA leaves a non-pairing area and translation is repressed. With siRNA, the target mRNA is cleaved by an endonuclease and then degraded by an exonuclease.]
[Figure: activities associated with RISC: exonucleolytic nuclease, helicase, homology searching, endonucleolytic nuclease]

Long non-coding RNA, lncRNA
Long non-coding RNA is a kind of functional RNA molecule whose transcript length exceeds 200 nucleotides. It can regulate gene expression through chromatin remodeling, transcriptional regulation, and post-transcriptional processing.
[Figure: functions of lncRNA]

Circular RNA
Circular RNA (circRNA) is a special kind of non-coding RNA molecule. Unlike traditional linear RNA, which has 5′ and 3′ ends, a circRNA molecule has a closed ring structure and is not affected by RNA exonucleases. Its expression is therefore more stable, and it is difficult to degrade.

Function of circRNA
1. As a sponge for microRNAs, circRNAs inhibit the binding of microRNAs to target genes. ciRS-7 has more than 60 binding sites for microRNA-7 (miR-7), thus inhibiting the binding of miR-7 to its target gene CDR1.
2. The biosynthesis of circRNA affects gene splicing.

Quiz
Which of the following bases exists only in mRNA, not in DNA? ( )
A. G
B. A
C. C
D. T
E. U

Which of the following statements about RNA is wrong? ( )
A. There are mRNA, tRNA, rRNA, and other types.
B. There is only mRNA in the cytoplasm.
C. tRNA is not the smallest RNA molecule.
D. rRNA is the main component of the ribosome.
E. There is no hnRNA or snRNA in prokaryotes.

There is ( ) at the 3′ end of the majority of eukaryotic mRNAs.
A. Cap structure
B. Stop codon
C. Poly(A)
D. TATA box
E. Start codon

Which of the following statements about tRNA is wrong? ( )
A. There is a cap structure at the 5′ end.
B. The 3′ end is CCA-OH.
C. It contains several rare bases.
D. The secondary structure is clover-shaped.
E. There is an anticodon stem.

2.4 Properties of nucleic acids
2.4.1 Ultraviolet Absorption
DNA, RNA, oligonucleotides, and even mononucleotides in solution absorb UV light very efficiently, especially at 260 nm.
[Figure: extinction coefficient versus wavelength of light for adenine, cytosine, guanine, uracil, and thymine]

2.4.2 Denaturation and renaturation of DNA
Denaturation of DNA (melting)
Denaturation separates the duplex into individual random coils. It is favored by increasing the temperature or decreasing the salt concentration. Conditions that disrupt the hydrogen bonds:
- High temperature
- pH extremes
- Low ionic strength
- Small organic compounds (formamide, urea)
Denaturation of DNA can be observed by changes in UV absorbance. Hyperchromic shift: the absorbance of a DNA solution at 260 nm increases after denaturation.
[Figure: absorbance versus wavelength (nm) for denatured DNA and native DNA]

Melting curve and melting temperature
The melting temperature (Tm) is the temperature at which half the base pairs are denatured, that is, the midpoint of the melting curve.
[Figure: melting curve, absorbance (optical density at 260 nm, 0.50 to 0.70) versus temperature (76 to 100 °C); Tm, the melting temperature, is marked at the 50% point]
The Tm is influenced by the base composition of the DNA and by the salt concentration of the solution. In 0.2 M Na+:
Tm = 69.3 + 0.41 (% G + C)

Renaturation of DNA
Denatured DNA will renature to re-form the duplex structure if the denaturing conditions are removed; this is called reannealing. Renaturation depends on both DNA concentration and time, and many of the realignments are imperfect.
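The empirical relation above lends itself to a short worked example. The Python sketch below computes the G+C percentage of a sequence and applies Tm = 69.3 + 0.41 (% G + C), which the text gives for 0.2 M Na+. The function names and the example sequence are made up for illustration, and the result is assumed to be in °C. The relation is an approximation for duplex DNA under the stated salt condition, so the numbers should be read as estimates.

```python
def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def melting_temperature(gc_pct: float) -> float:
    """Tm from the relation in the text (0.2 M Na+): 69.3 + 0.41 * (%G+C)."""
    return 69.3 + 0.41 * gc_pct


if __name__ == "__main__":
    example = "ATGCGCGCTA"  # hypothetical sequence with 60% G+C
    gc = gc_percent(example)
    print(f"G+C = {gc:.1f}%  Tm = {melting_temperature(gc):.1f}")
    # Tm rises linearly with G+C content:
    for pct in (20, 30, 60):
        print(pct, round(melting_temperature(pct), 1))
```

Since the relation is linear with a positive slope, ranking DNA samples by Tm at a fixed salt concentration is the same as ranking them by G+C content.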
2.4.3 Nucleic Acid Hybridization
Different DNA strands of similar sequence can form hybrid duplexes. If DNA from two different species is mixed, denatured, and allowed to cool slowly so that reannealing can occur, artificial hybrid duplexes may form, provided the DNA from one species is similar in nucleotide sequence to the DNA of the other.
[Figure: similar DNA sequences of the insulin gene in different species: rat I, rat II, human, chicken]
[Figure: hybridization of human and mouse DNA, 25% hybrids]
Nucleic acid hybridization is very useful in molecular biology:
- Evolutionary relationships
- Identifying specific genes (by probe)
- Quantitative investigation of gene expression
Hybridization can be DNA:DNA, RNA:RNA, or DNA:RNA (DNA-RNA hybrid duplexes).

Quiz
Which of the following statements about DNA denaturation is false? ( )
A. Acid, alkali, or heating can denature DNA.
B. Hydrogen bonds between bases are disrupted.
C. Denatured DNA has reduced absorbance at 260 nm.
D. The Tm value is directly proportional to the GC content of DNA.
E. Denaturation does not involve damage to glycosidic bonds.

The Tm value is ( ).
A. The temperature at which DNA is converted from the B type to the A type.
B. The denaturation temperature
C. The temperature at which 50% of the DNA double strands are opened.
D. The renaturation temperature
E. The temperature of replication

Which of the following DNA molecules has the highest Tm value? ( )
A. A+T content is 30%
B. G+C content is 30%
C. A+T content is 60%
D. G+C content is 60%
E. G+C content is 20%

Which of the following statements about nucleic acid hybridization is not correct? ( )
A. Single-stranded DNA from different samples can be annealed to form a partial double helix in the presence of complementary base sequences.
B. DNA can also be annealed with RNA to form a double helix.
C. RNA can also be combined with the encoded polypeptide chain to form a hybrid molecule.
D. Nucleic acid hybridization can be used to study the structure and function of nucleic acids.
E. Nucleic acid hybridization can identify the sequence similarity between two nucleic acid molecules.

2.4.4 Catalytic properties of nucleic acids
- Catalytic RNA (ribozyme): sequence-specific endonucleases that degrade RNA
- Catalytic DNA (DNAzyme): synthesized oligonucleotides for sequence-specific degradation of mRNA
[Figure credit: copyright © Nobel Media AB 2016]
There are five classes of ribozymes. Three classes carry out self-processing reactions, while ribonuclease P (RNase P) and rRNA act on separate substrates.
Hammerhead ribozymes are small self-cleaving RNAs. A hammerhead ribozyme can be truncated to a minimal, catalytically active motif consisting of three base-pairing stems flanking a central core of 15 mostly invariant nucleotides. The conserved central bases are essential for the hammerhead ribozyme's catalytic activity.
[Figure: hammerhead ribozyme with the three stems marked in color, the central core framed, and the self-cleavage site indicated]
Ribozymes identify specific nucleotide sequences.

2.5 Nuclease
Nucleases differ in their specificity for different forms of nucleic acid:
- Deoxyribonuclease (DNase): acts only on DNA
- Ribonuclease (RNase): acts only on RNA
Endonucleases are nucleases that cleave internal phosphodiester bonds of nucleic acids.
Exonucleases are nucleases that hydrolyze a nucleotide only when it is at a terminus. They act in one direction (5′→3′ or 3′→5′).
[Figure: sites of action of 5′→3′ exonuclease, 3′→5′ exonuclease, and endonuclease on the duplex
5′ AGCTTCAGGATAGCTG 3′
3′ TCGAAGTCCTATCGAC 5′]

Restriction Endonuclease
Restriction endonucleases are nucleases that cleave double-stranded DNA molecules. They can be used to map the structure of a DNA fragment. There are type I, II, and III restriction endonucleases.
- Types I and III require ATP to hydrolyze DNA. Type I enzymes cleave DNA randomly; type III enzymes cut DNA at specific sequences.
- Type II restriction enzymes do not need ATP. Their specific recognition sequences are typically 4 or 6 nucleotides in length and have a twofold axis of symmetry.
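The twofold symmetry of a type II recognition site means that the sequence reads the same 5′ to 3′ on both strands, that is, it equals its own reverse complement. The Python sketch below checks this property and locates such a site in a sequence. GAATTC is the EcoRI recognition sequence; the helper names and the test sequence are hypothetical, and the search is a plain string scan, not a model of how the enzyme works.

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Sequence of the opposite strand, read 5' to 3'."""
    return "".join(PAIR[b] for b in reversed(seq.upper()))


def has_twofold_symmetry(site: str) -> bool:
    """True if the site reads identically on both strands (palindromic)."""
    return site.upper() == reverse_complement(site)


def find_sites(seq: str, site: str) -> list[int]:
    """0-based start positions of every occurrence of site in seq."""
    seq, site = seq.upper(), site.upper()
    return [i for i in range(len(seq) - len(site) + 1)
            if seq[i:i + len(site)] == site]


if __name__ == "__main__":
    print(has_twofold_symmetry("GAATTC"))  # True
    print(has_twofold_symmetry("GAATTG"))  # False
    print(find_sites("TTGAATTCAAGGAATTCC", "GAATTC"))  # [2, 11]
```

Because the site is its own reverse complement, scanning one strand is enough: every position found on the top strand is also a site on the bottom strand.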
EcoRI
5′ N-N-N-N-N-N-G   A-A-T-T-C-N-N-N-N-N-N 3′
3′ N-N-N-N-N-N-C-T-T-A-A   G-N-N-N-N-N-N 5′
The staggered cuts leave "sticky" (cohesive) ends.

Function of Nucleases
- Participate in the synthesis and repair of DNA and in the splicing of RNA after transcription.
- Remove nucleic acids with structural or functional abnormalities, redundant nucleic acids, and exogenous nucleic acids.
- Degrade nucleic acids from food.
- Serve as important tool enzymes in recombinant DNA technology.

2.6 Genome and Genomics
What is a gene?
- 1865: Heredity is transmitted in units (Gregor Mendel)
- 1869: DNA isolated (Friedrich Miescher)
- 1909: The word "gene" is coined (Wilhelm Johannsen)
- 1911: Chromosomes carry genes (Thomas Hunt Morgan)
- 1944: DNA (not protein) transforms cells (Oswald Avery, Colin MacLeod, and Maclyn McCarty)
A gene is a specific fragment of DNA (or of RNA in some RNA viruses). The genetic information is stored in the nucleotide sequence of the gene, which carries the instructions for the synthesis of a functional protein or RNA.

How do genes work?
First, they are replicated faithfully; second, they direct the production of RNAs and proteins; third, they accumulate mutations and so allow evolution.

2.6.1 Genome
The genome is the complete set of DNA in a single cell, including gene-coding sequences and non-coding sequences.
Different regions of the genome have different functions.
- Some regions encode proteins (structural genes).
- Some regions regulate replication and transcription.
- The function of some regions is not clear.
The genomes of different organisms vary in size and complexity: viruses < bacteria < eukaryotes.
The human genome contains approximately 3 billion DNA base pairs. A small white flower from Japan (Paris japonica) has the world's longest genome, with 149 billion base pairs. A larger genome is often a burden because replicating it takes more time.
2.6.1.1 Characteristics of prokaryotic genomes
- The prokaryotic genome is usually composed of circular double-stranded DNA and is relatively small. The bacterial genome includes bacterial chromosomal DNA (located in the nucleoid region) and plasmid DNA.
- The GC content of different prokaryotic genomes varies greatly (25% - 75%) and can be used to identify bacterial species.
- There are few repetitive sequences in the genome; most genes are single copy (the genes encoding rRNA are present in multiple copies).
- Generally, coding sequences do not overlap.
- Structural genes are continuous, without introns, and do not need splicing after transcription.
- There is operon structure: functionally related structural genes are often linked in series, together with common upstream regulatory regions and downstream transcriptional termination signals, to form gene expression units.
[Figure: an operon. The upstream regulatory regions (repressor, promoter, operator) precede structural genes 1, 2, and 3, which are transcribed into one mRNA and translated into proteins 1, 2, and 3]

2.6.1.2 Characteristics of eukaryotic genomes
- The eukaryotic genome includes chromosomal DNA and mitochondrial DNA. Human mitochondria contain two to ten copies of a small circular double-stranded DNA molecule that makes up approximately 1% of total cellular DNA.
- Chromosomal DNA is compacted.
[Figure: compaction of chromosomal DNA, from DNA and histones to nucleosome, chromatin fiber, and chromosome]
- There is a telomere structure at the end of each chromosome.
- The eukaryotic genome is large, and non-coding sequence far exceeds coding sequence. There are a large number of repetitive sequences.
- In the human genome, neutral mutations lead to differences in nucleotide sequence among individuals, known as genetic polymorphism.
Types of genetic polymorphism:
- Restriction Fragment Length Polymorphism (RFLP)
- Simple Sequence Length Polymorphism (SSLP)
- Single Nucleotide Polymorphism (SNP)

Restriction Fragment Length Polymorphism (RFLP)
The hydrolysis of DNA by a restriction endonuclease produces fragments of different lengths.
[Figure: Hind III sites and the resulting fragments]

Simple Sequence Length Polymorphism (SSLP)
Arrays of repeat sequences that display length variations.
- Minisatellites, or variable number of tandem repeats (VNTRs)
- Microsatellites, or simple sequence repeats (SSRs)

Single Nucleotide Polymorphism (SNP)

Further characteristics of eukaryotic genomes:
- Genes are split genes (including exons, introns, etc.).
- There are many cis-acting elements in the genome, including promoters, enhancers, silencers, and so on.
[Figure: a eukaryotic gene from 5′ to 3′. Regulatory sequences (cis-acting elements: enhancer, response element, CAAT box, TATA box, promoter) lie upstream of the structural gene, which starts at +1 and consists of exons and introns with the ATG start codon, the TGA stop codon, and the poly(A) tailing signal]

2.6.2 Genomics
Genomics is a discipline in genetics that applies recombinant DNA, DNA sequencing methods, and bioinformatics to sequence, assemble, and analyze the function and structure of genomes. Research on single genes does not fall within the definition of genomics.

Structural genomics aims to determine the structure, composition, and gene locations of the genome.

Genetic Map
The relative distances of genetic markers from each other, measured as recombination frequency (genetic distance). The genetic markers can be detectable phenotypes (e.g. eye color) or non-coding DNA sequences that generate restriction fragment length polymorphisms (RFLPs) or single nucleotide polymorphisms (SNPs).

Physical Map
The actual distance (in base pairs) between different genes or DNA markers in a genome. The chromosomes are cut into fragments, the fragments are cloned, and those containing STS (sequence tagged site) sequences are overlapped and connected to define their locations.
Transcription Map
A map of all transcribed sequences within a region of chromosomal DNA, based on the positions of and distances between the transcribed sequences. It is constructed using expressed sequence tags (ESTs) as DNA markers. Usually, sequences about 300-500 bp long from the 5′ or 3′ untranslated region are used as ESTs.

Sequence Map
The complete nucleotide sequence of a genome. DNA sequence analysis is a multistage process that includes the preparation of DNA fragments, base analysis, and the translation of DNA information.
[Figure: workflow with the steps: family studies, chromosomal locus, look for DNA fragments, choose candidate gene, look for mutations]

Functional genomics explains the function of DNA sequences at the genome level and systematically studies the regulation of gene expression.

Comparative genomics compares genomic features among different species or organisms to investigate their genetic similarities and differences as well as the evolutionary relationships among species.

Genomics
- Structural genomics: genetic map, physical map, transcription map, sequence map, Human Genome Project
- Functional genomics: gene identification, gene function, gene expression pattern, transcriptomics, proteomics
- Comparative genomics

The Human Genome Project
The Human Genome Project is a large, multicenter, international collaborative venture aiming to determine the nucleotide sequence of the human genome. The sequence is not that of one person but a composite derived from several individuals. It is therefore a "representative" or generic sequence.
[Figure: Human Genome Project timeline with the milestones: HGP officially launched; first human gene map published; human genetic mapping goal achieved; sequencing of human chromosome 22 completed; working draft of the human genome sequence completed and published; finished version of the human genome sequence completed]
[Figure: model organism genomes on the timeline: first bacterial genome; E. coli genome; yeast genome; genome of the roundworm C. elegans; genomes of the fruit fly D. melanogaster and the thale cress A. thaliana; mouse genome]
The Human Genome Project produced a very high-quality version of the human genome sequence that is freely available in public databases, a resource that can be used for a broad range of biomedical studies. One such use is to look for the genetic variations or types of mutation that increase the risk of specific diseases, such as cancer.

Precision Medicine
Precision medicine is an emerging approach to disease treatment and prevention that takes into account each person's individual variability in genes, environment, and lifestyle. It is about matching the right drugs or treatments to the right people, based on a genetic or molecular understanding of their disease.

Precision Oncology
- Old theme: one drug fits all. Outcomes: effect, no effect, or adverse effect.
- New theme: personalized diagnosis and treatment. Profiling is followed by a matched treatment, with an effect in each patient group.
Precision oncology involves:
1. Identification of the genetic alterations that drive carcinogenesis
2. Treatment with drugs that can effectively inhibit the function of those genetic alterations

Key points
- Secondary structure of DNA
- Superhelical structure of DNA
- The chemical nature of RNA
- The species and function of RNA
- Concepts: denaturation and renaturation of DNA; nucleic acid hybridization; nuclease and ribozyme
- Genome and genomics