Summary

This document is a guidebook for cell biology, covering DNA structure, replication, and repair. It also includes information about chromosomes, DNA synthesis, and regulation of gene expression. It's a useful resource for undergraduate-level biology students.

Full Transcript

2. DNA STRUCTURE, REPLICATION AND REPAIR
Additional reading on DNA damage and repair: Molecular Biology of the Cell (4th ed.). Alberts B, Johnson A, Lewis J, et al. New York: Garland Science; 2002 (online); DNA repair in cancer therapy [electronic resource]: molecular targets and clinical applications. Elsevier, 2012; Ed. Mark R. Kelley – available online through the library. On genomic DNA, additional reading for those interested can be found in Harrison’s Principles of Internal Medicine, Chapter 61 – available online through the library.
2.1 DNA consists of two chains of deoxynucleotides connected by phosphodiester bonds, running anti-parallel to each other in a right-handed helix. In a coding region, the 5’ → 3’ strand is the coding strand, from which the protein sequence is read, while the 3’ → 5’ strand is the template strand, upon which a new daughter strand is assembled. Sequences are written 5’ → 3’ unless otherwise indicated. DNA is a negatively charged molecule due to the exposed phosphate groups, while the hydrophobic bases are packed in a stacked arrangement inside the double helix, forming Watson-Crick hydrogen bonded base pairs G–C and A–T. G–C and A–T have 3 and 2 hydrogen bonds respectively, making G–C base pairs more stable. Physically and chemically, DNA is very stable. It will denature into single strands at a melting / annealing temperature (Tm) that depends on the length of the DNA and the number of G–C base pairs, and will re-anneal upon cooling. A hyperchromic shift (increase in OD260) occurs with denaturation.
CC (1) DNA hybridization is the basis for the “genes on a chip” technology, which is used to examine mutations in genomic DNA or changes in expression patterns (cDNA). The chips are designed to study up- or down-regulation of known genes in cancers or to identify infectious microorganisms. (2) DNA with mismatched base pairs has a lower annealing temperature, which can be used in assays designed to recognize specific base mutations.
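The strand and base-pairing conventions in 2.1 can be illustrated with a short Python sketch. The function names and the example sequence are hypothetical and chosen only for illustration; the sketch assumes an unmodified sequence containing only A, T, G and C, and it counts hydrogen bonds without attempting to predict an actual melting temperature.

```python
# Illustrative helpers; names and the example sequence are hypothetical.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}  # A-T pair: 2 H-bonds, G-C pair: 3


def template_strand(coding: str) -> str:
    """Return the template strand (written 5'->3') for a coding strand written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(coding.upper()))


def hydrogen_bonds(strand: str) -> int:
    """Count the Watson-Crick hydrogen bonds in the duplex formed by this strand."""
    return sum(H_BONDS[base] for base in strand.upper())


def gc_fraction(strand: str) -> float:
    """Fraction of G-C base pairs; a higher fraction means a more stable duplex."""
    s = strand.upper()
    return (s.count("G") + s.count("C")) / len(s)


coding = "ATGGAGTAA"
print(template_strand(coding))   # TTACTCCAT
print(hydrogen_bonds(coding))    # 21
print(round(gc_fraction(coding), 2))
```

Because both strands are written 5’ → 3’, the template strand is the reverse complement of the coding strand, and applying the function twice returns the original sequence.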
2.2 A chromosome is a single DNA molecule. It is compacted 10,000 times into chromatin. Heterochromatin is highly condensed, and euchromatin is loosely condensed and transcriptionally active. Chromatin consists of nucleosomes (histones plus DNA) wrapped into solenoid fibers and loops in a hierarchical assembly. Histones are positively charged molecules. They dissociate from DNA for replication and transcription, aided by acetylation, which reduces the positive charge. During replication, chromosomes are duplicated, forming a dyad of two chromatids held together by a centromere. Telomeres at the ends of chromosomes are special structures that distinguish chromosome ends from DNA breaks.
Genomic DNA contains several types of sequences:
Unique sequences
o genes for many proteins, enzymes (single copy)
o genes for histones, immunoglobulins, rRNA (mid-range copy number)
Moderately repetitive DNA
o regulatory regions for transcription factor binding
o LINES, SINES and DNA-only transposons. LINES (L1) code for reverse transcriptase, which enables LINES and SINES (Alu) to be copied and pasted into new regions of the genome, a process called retrotransposition. Transposons move by a cut and paste mechanism called transposition (in yeast, bacteria). Many retrotransposons are relics of ancient retroviruses.
Highly repetitive DNA
o tandem repeat sequences.
Figure caption: This chart is provided to show the organization of the genome. Do NOT memorize the numbers in this chart. It is to give you an idea of the relative amounts of different types of sequences.
CC (1) Histone acetyltransferases (HAT’s) and histone deacetylases (HDAC’s) are very important for regulating accessibility of DNA. HDAC inhibitors are an important class of anticancer agents in development. (2) Only immortalized cells (germline cells, cancer cells) have the enzyme that is required to sustain telomeric length, making telomeric DNA or telomerase a target for anticancer therapy. (3) In genomic DNA, short tandem repeat sequences are very useful in forensics because the number of repeats at several chromosomal loci varies between individuals. Repeat sequences can make the DNA more prone to replication errors by mismatch, slippage, unequal crossover or transposition. (4) Alu repeats (jumping genes) occur frequently in primate DNA and have been observed in de novo cases of hemophilia and hypercholesterolemia. Insertion within exons or unequal crossover of genetic material between nearby Alu’s during homologous recombination are possible molecular causes. (5) HIV integrase targets gene-rich regions of chromosomes to insert proviral DNA, thus ensuring a high level of transcription.
2.3 DNA synthesis involves recognition of the origin of replication (ORI) (multiple origins in the eukaryotic chromosome), formation of the replication bubble, replication forks, and synthesis of leading and lagging strands. Important proteins and enzymes involved include ORI binding proteins, ssDNA binding proteins, helicase (unwinding of DNA), various DNA polymerases, and topoisomerase (relieving supercoil strain). Replication is bidirectional, spanning out in both directions from the replication bubble(s), and synthesis always occurs in the 5’ → 3’ direction, relying on the free 3’-OH of a primer. The DNA template strand is read in the 3’ → 5’ direction. Therefore one strand is synthesized continuously (leading strand) and the other in Okazaki fragments in the opposite direction from the growing replication fork (lagging strand). Additional proteins (ligases, RNA primase) are required to complete synthesis of the lagging strand. (Details in the reading and lecture.) Epigenetic processing, including methylation, is important for regulating expression of DNA. The replicated chromosome contains single stranded overhangs at the 3’-end of each strand, since there is no available primer for DNA pol.
Single stranded DNA is digested, shortening the length of the telomeres in somatic cells, and ensuring a limited number of cell divisions before cell death (the Hayflick limit). Germline cells contain telomerase, a reverse transcriptase. Telomeric RNA is the template for repairing and extending telomeres, resulting in an immortal cell line.
CC (1) Ciprofloxacin-class antibiotics target bacterial topoisomerase (gyrase), and there are anticancer agents that target mammalian topoisomerase I and II. (2) Many tumor cells have telomerase expression activated, contributing to the ability of these cells to survive despite multiple chromosomal abnormalities. (3) Slippage of the template strand during lagging strand synthesis is believed to be a mechanism for the observed expansion of the number of triplet repeats in triplet repeat disorders. (4) Many antiviral drugs are nucleoside analogs missing the free 3’-OH required by DNA polymerase, stopping viral DNA synthesis when incorporated in the growing strand. A new nucleoside reverse transcriptase inhibitor of HIV-RT, EFdA-TP, is a dATP analog with a 3’-OH that works by a different mechanism, interfering with translocation of RT along the primer. Importantly, it overcomes resistance by HIV-RT and has an excellent safety and toxicity profile. Acyclovir is an acyclic nucleoside analog that is activated by herpesvirus thymidine kinase and then blocks viral DNA synthesis.
2.4 DNA methylation. Subsequent to synthesis, DNA is methylated, typically as 5-methylcytosine, by DNA methyltransferases, which leads to formation of condensed chromatin in eukaryotes. Nucleosomes form rapidly behind the advancing replication fork. Prokaryotes recognize methylated DNA as “self”, not susceptible to restriction enzymes.
2.5 Spontaneous mutations in DNA include depurination, deamination of cytosine and oxidation of guanine, at the rate of thousands of bases per day per cell. Oxidative metabolism results in free radicals which catalyze these events.
Spontaneous DNA decay, intracellular metabolites and replication errors lead to single stranded breaks (SSBs). Environmental mutagens include non-ionizing UV light (pyrimidine dimer formation), high energy ionizing radiation (single and double stranded breaks, SSBs and DSBs), and chemical mutagens that are cross-linking agents, base analogs or intercalating agents. DNA is most vulnerable to mutagenesis during active replication (S phase) or transcription. Deamination of cytosine yields the base uracil, which will base pair with adenine. Thus uncorrected cytosine deamination causes a GC to AT base pair mutation. Eukaryotic cells contain a family of proteins called APOBEC3s, cytosine deaminases that target viral DNA. A very large number of G → A mutations scramble the viral sequence.
Figure caption: Common reactions on cytosine in DNA causing epigenetic changes (methylation) or DNA damage (deamination).
2.6 Specific DNA repair processes exist to repair each type of DNA damage (see Table below). Base excision repair (BER) removes abnormal bases. Specific glycosylases (e.g. uracil glycosylase) remove the offending base. Another BER associated enzyme is poly ADP ribose polymerase (PARP1), which is recruited in the repair of single stranded breaks. Mismatch repair (MMR) corrects non-Watson-Crick base pairs. DNA damaged by UV light is repaired with the nucleotide excision repair (NER) system and associated transcription coupled NER, since most DNA damage occurs when DNA is actively being replicated or transcribed. Both homologous repair (HR) and non-homologous end joining (NHEJ) are mechanisms for repairing strand breaks. Deficiencies in DNA repair enzymes are associated with many forms of cancer. Many deficiencies are a result of epigenetic changes rather than mutations within the DNA repair genes. Epigenetic changes may have already occurred in precancerous lesions.
It is the combination of multiple genetic mutations, which accumulate with deficiencies in DNA repair genes, that eventually leads to cancer.
CC (1) Failure of BER to correct point mutations results in increased sensitivity to mutagens. Failure of XP enzymes of the NER system leads to Xeroderma pigmentosum, a condition in which any type of sun exposure is potentially lethal; an associated disorder of transcription coupled NER is called Cockayne’s syndrome, and involves enzymes of NER plus CS enzymes. (2) MMR enzymes, specifically MSH2 and MLH1, are linked to hereditary non-polyposis colon cancer (HNPCC), which accounts for ~10% of colon cancer cases. Deficiencies in MSH enzymes are found in a variety of sporadic cancers. (3) BRCA1 and BRCA2 are enzymes of the HR system. It is interesting that while these genes code for enzymes, they exhibit phenotypic dominance (inherited cancers). The process of HR can result in loss of heterozygosity, including loss of the remaining good allele and progression to cancer earlier than in the general population. (4) Defects in NHEJ cause loss of genetic material and chromosomal translocations (chronic myelogenous leukemia and B-cell lymphoma). (5) Cells deficient in BRCA (cancer cells) can be killed by PARP1 inhibitors, since accumulation of SSBs results in DSBs with no functional enzyme to repair them. (6) APOBEC3G is a cellular enzyme that deaminates cytosine on viral nucleic acids and imparts innate antiviral activity. If APOBEC3G is incorporated in budding HIV virions, it causes G → A hypermutation of the viral genome. HIV fights back by coding for a viral infectivity factor that mediates ubiquitination of APOBEC3G.
Figure caption: NHEJ error causes a protooncogene c-myc to be translocated to an actively transcribed region of the genome.
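The link between cytosine deamination (2.5) and the G → A changes attributed to APOBEC3G can be traced step by step in a small Python sketch. The sequence, positions and function names are hypothetical; the sketch assumes the deaminated uracil is never repaired and that the damaged strand is simply copied once.

```python
# U pairs with A, so an unrepaired C -> U change is copied as A on the new strand.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G", "U": "A"}


def copy_strand(template: str) -> str:
    """Synthesize the complementary strand, written 5'->3' (reverse complement)."""
    return "".join(COMPLEMENT[base] for base in reversed(template))


def deaminate(strand: str, positions: list[int]) -> str:
    """Convert cytosine to uracil at the given 0-based positions."""
    bases = list(strand)
    for i in positions:
        if bases[i] != "C":
            raise ValueError(f"position {i} is {bases[i]}, not cytosine")
        bases[i] = "U"
    return "".join(bases)


plus = "ATGGCC"                      # hypothetical (+) strand
minus = copy_strand(plus)            # GGCCAT
damaged = deaminate(minus, [2])      # GGUCAT
new_plus = copy_strand(damaged)      # ATGACC
print(plus, "->", new_plus)          # the G at index 3 now reads as A
```

Deamination on one strand therefore appears as a G → A change on the complementary strand, which is the GC to AT base pair mutation described in 2.5.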
Some key enzymes involved in DNA repair processes (General enzymes are used in all processes)
Repair process | Key enzymes | Result of enzyme action | Diseases linked to enzyme deficiency
General enzyme functions | Endonucleases | Cut within a DNA strand | various
 | Exonucleases | Cut at end of DNA strand |
 | DNA polymerases | Synthesize missing segment |
 | Ligase | Ligate phosphodiester backbone |
BER | DNA glycosylase | AP site created by removal of abnormal base |
 | PARP1 | Single stranded break recognition and recruitment of enzymes for repair |
MMR | MSH1-6, PMS1-2 | Recognition of mismatched bases | HNPCC (MSH1), stomach, esophageal cancer, head and neck squamous cell cancer, small cell lung cancer
NER (repair of UV damaged DNA) | XPA | Recognize the distorted DNA | Xeroderma Pigmentosum (mainly caused by defects in XPA)
 | TFIIH components XPB, XPD | Helicases: unwind DNA at damaged site |
 | XPF, XPG | Endonucleases at 5’, 3’ sites |
Transcription coupled NER | Enzymes of XP; CSA, CSB | Colocalize with RNA pol II, function unclear | Cockayne syndrome
HR | BRCA1, BRCA2 | Interact with RAD51, a recombinase that permits invasion of the homologous sequence on the sister chromatid | Breast, ovarian cancer
NHEJ | DNA-PK, Ku endonuclease | | Chromosomal translocations
You are expected to know the key enzymes and the disease correlations listed in the table for these repair system deficiencies.
3. RNA STRUCTURE, SYNTHESIS, PROCESSING AND TRANSCRIPTION
3.1 RNA is a polymer of ribonucleotides connected by phosphate groups through the 3’- and 5’-hydroxyl groups, in the same linkage as for DNA. While RNA can form double helical structures and DNA:RNA hybrid duplexes, it generally adopts far more complex three dimensional structures than DNA, and typically exists as a single strand. RNA has ribose sugars (2’-OH) compared to DNA’s deoxyribose (2’-H). RNA contains uracil in place of thymine, and various unusual base modifications, which confer specific structure or function, e.g. in tRNA and rRNA.
RNA is not nearly as stable as DNA and is prone to degradation by RNAses. Ribozymes are RNA’s with catalytic function.
3.2 rRNA, tRNA and mRNA are involved in protein production. rRNA comprises 80% of the RNA in the cell; tRNA and mRNA make up almost all the rest. Besides coding and transcriptional RNA’s, there are small 22nt RNA’s which modulate mRNA levels and remodel chromatin (miRNA) or which help to degrade viral RNA (siRNA). There are also small nuclear RNA’s (snRNA) and small nucleolar RNA’s (snoRNA), which are involved in post-transcriptional processing of mRNA and rRNA, respectively. These small RNA’s direct RNP complexes to their target with high specificity dictated by Watson-Crick base pairing.
Figure caption: rRNA genes are on the p arms of acrocentric chromosomes.
CC Non-coding RNA’s are very important in development. An example is the X-linked FMR1 gene, which contains a CGG trinucleotide repeat sequence in the 5’-UTR. Expansion to > 200 repeats results in Fragile X syndrome (FXS), which occurs in both males and females, although often with more serious clinical symptoms of mental retardation in males. It is thought that epigenetic modification by miRNA’s in the repeat region leads to hypermethylation and gene silencing. The FMRP protein is expressed at its highest levels early during fetal development, and is associated with miRNA modulation of mRNA in the brain.
3.3 Ribosomal RNA’s are assembled from 4 subunits. Three of them (28S, 18S, 5.8S) are made by RNA polymerase I in the nucleolus, and are obtained from post-transcriptional processing of 45S rRNA by snoRNA’s. The fourth subunit (5S) is made in the nucleus by RNA polymerase III. The subunits are assembled with proteins into 40S and 60S ribonucleoprotein particles. The 28S subunit has catalytic activity (peptide bond formation).
tRNA is made by RNA pol III and is extensively modified after transcription, including base modifications, removal of internal bases by endonucleases and both 5’ and 3’ terminal modifications. RNAse P is a ribozyme that cleaves the 5’ end, and RNAse D adds CCA-3’-OH to the 3’ end where tRNA will be primed with an amino acid.
Figure caption: Mature tRNA structure.
mRNA is synthesized by RNA polymerase II as heterogeneous nuclear RNA (hnRNA) or pre-mRNA. Basal (generic) transcription factors associate with the promoter region of the gene to be transcribed, to form a pre-initiation complex with RNA pol II and the DNA template strand. The promoter is ~30 bp upstream of the first transcribed base on the coding strand of the DNA. Additional gene-specific enhancers or suppressors may bind to sites 1000’s of base pairs away from the promoter region (see lecture on regulation of gene expression). Specific termination sequences on the template strand signal the end of mRNA synthesis. Pre-mRNA is extensively processed into mRNA, including 5’-capping, 3’-polyadenylation and splicing out of introns, leading to eventual transport of the mRNA to the cytoplasm. The spliceosome complex is a group of snRNP’s (U1, U2, U4, U5, U6) with high specificity dictated by snRNA. U1 recognizes the 5’-splice site, containing a specific dinucleotide GU, and U2 recognizes the 3’-splice site, containing dinucleotide AG. Alternative splicing, in which exons may be skipped or different polyadenylation sites recognized, is important in adding to the variety of protein products of an individual gene. Alternative splicing is the mechanism by which antibody maturation occurs in B cells. mRNA can also be edited by deamination, changing one or more bases (C → U; A → I). Cytosine deaminase and adenosine deaminase acting on RNA are the enzymes involved in these processes. Apolipoprotein B mRNA editing complex (APOBEC1) acts on mRNA from the apolipoprotein B gene, generating a truncated form in intestinal cells.
The APOBEC family of RNA and ssDNA acting enzymes includes APOBEC3G, encountered earlier as an innate antiviral response targeting viral ssDNA. Tissue specific guide RNA is used for mRNA splicing and editing, resulting in different proteins translated in different tissues. All of the RNA polymerases synthesize RNA in the 5’ → 3’ direction. Note that unlike DNA polymerases, RNA polymerases do not require a primer. Prior to RNA processing, the RNA sequence is the same as the sequence of the coding (+) strand in duplex DNA (with U in place of T), and is the reverse complement of the template (-) strand. Many segments are spliced out or edited during RNA processing.
CC (1) Accuracy in splicing is critical because of the defined boundaries between exons and introns. Disrupted splicing is estimated to occur in 15% of disease-causing point mutations. (2) RNA pol II is the target of α-amanitin, a poison from the death cap mushroom that inactivates protein synthesis and causes complete liver failure within 48 hours of ingestion. Prokaryotic RNA polymerase is sensitive to rifampicin, an antibiotic that is used against M. tuberculosis. (3) Many transcription factor genes are proto-oncogenes, gain-of-function genes that activate cell cycle and growth. If over-expressed or mutated, they may promote uncontrolled cell growth.
3.4 The genetic code consists of 64 three-base codons, 61 of which are translated into 20 amino acids. The remaining three codons, UAA, UGA and UAG, code for a STOP signal. A continuous stretch of codons between a start (AUG, methionine) and stop signal is an open reading frame (ORF). ORF’s are read on the coding strand of DNA. There are 48 tRNA anticodons, implying degeneracy in the system, where more than one codon can pair with a tRNA anticodon, and more than one tRNA anticodon codes for the same amino acid. Wobble base pairing accommodates the codon-anticodon degeneracy.
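The codon counts and the ORF definition in 3.4 can be checked with a short Python sketch. It builds the standard genetic code from the conventional U, C, A, G ordering; the names CODON_TABLE, transcribe and translate_orf, and the example sequences, are hypothetical and used only for illustration. The sketch assumes an unprocessed sequence with no introns and reads only the first AUG it finds.

```python
from itertools import product

# Standard genetic code, one letter per amino acid, "*" for STOP.
# Codons are enumerated in U, C, A, G order (first base varies slowest).
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
STOP_CODONS = {codon for codon, aa in CODON_TABLE.items() if aa == "*"}


def transcribe(coding_dna: str) -> str:
    """The pre-mRNA matches the coding strand, with U in place of T."""
    return coding_dna.upper().replace("T", "U")


def translate_orf(mrna: str) -> str:
    """Translate from the first AUG until a STOP codon (or the end of the sequence)."""
    start = mrna.find("AUG")
    if start < 0:
        return ""
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":
            break
        peptide.append(aa)
    return "".join(peptide)


print(len(CODON_TABLE), len(CODON_TABLE) - len(STOP_CODONS), sorted(STOP_CODONS))
print(translate_orf(transcribe("ATGGAGTAA")))  # ME
```

The table contains 64 codons, of which 61 map to the 20 amino acids and three are STOP codons, in agreement with the counts given above.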
The first two base pairs of the codon – anticodon are always Watson-Crick base pairs, while the third can be a wobble base pair (see accompanying figure for all base pair configurations at the third position).
Figure caption: WC and wobble base pairing in the third position of the codon.
CC SNP’s in coding regions result in missense mutations (change in coded amino acid), silent mutations (no change in coded amino acid), nonsense mutations (chain termination) or frameshift mutations (loss or insertion of 1 or 2 bases). Examples are the missense mutation in codon 6 of the β-globin chain (Glu → Val) in sickle cell anemia, deletion of three bases in the ΔF508 mutation of CFTR in cystic fibrosis (note this is an amino acid deletion rather than a frameshift mutation), and nonsense mutations of β-globin chains in some cases of β-thalassemia (a highly heterogeneous genetic disease). Mutations in non-coding regions of a gene can lead to changes in gene expression or splicing errors.
4 PROTEIN SYNTHESIS AND POST-TRANSLATIONAL MODIFICATIONS
4.1 tRNA is the adaptor which translates the genetic code into amino acids. There are two mechanisms by which accuracy of translation is ensured by tRNA structure: 1) specificity of tRNA loading by aminoacyl tRNA synthetase (AARS) and 2) codon-anticodon complementarity. There are specific AARS’s for each tRNA molecule, which recognize the unique shape of the tRNA. AARS activates the amino acid and catalyzes aminoacylation of tRNA at the 2’-OH of the 3’-adenosine in the CCA-3’ stem. Energy from ATP → AMP (pyrophosphate released; two high energy phosphate bonds) is required for this process. The anticodon in tRNA ensures that the correct amino acid is delivered to the ribosome according to the sequence dictated by the mRNA codons.
Figure caption: Intricate interaction at aminoacyl stem and anticodon loop of tRNA (green) with its cognate AARS (blue).
4.1.1 Components of eukaryotic protein synthesis
include the small (40S) and large (60S) ribonucleoprotein subunits of the ribosome, eukaryotic initiation factors (eIF) which assemble the complete ribosome with mRNA and initiator Met-tRNAiMet, eukaryotic elongation factors (eEF) which deliver aminoacylated tRNA’s as the protein is being synthesized and are involved in ribosome translocation, and guanine nucleotide exchange factors (GEF) which recycle GTP required for the energetics of protein synthesis. Two GTP’s are required for each peptide bond (plus the energy from the aminoacylated tRNA). There are release factors (eRF) involved in termination of protein synthesis. In prokaryotes, the initiator tRNA carries N-formylmethionine, the small and large subunits are 30S and 50S, respectively, and there are prokaryotic versions of the initiation (IF) and elongation (EF-Tu, EF-G) factors (LIR Fig 31.15). Ribosome recycling factor (RRF) mediates separation of the subunits. Mitochondria also use fMet-tRNAiMet. The N-terminal residue of every synthesized protein is Met. Post-translational cleavage of the N-terminus (e.g. in secreted proteins) may occur.
4.2 Ribosomes (80S in eukaryotes, 70S in prokaryotes) are assembled and operate on the rough endoplasmic reticulum (for extracellular or membrane destinations) or in the cytoplasm.
a. Preinitiation complex formation. Cap-binding protein (eIF-4) binds to the mRNA cap, eIF-2 binds to GTP and Met-tRNAiMet, and these complexes bind to the 40S subunit, which is bound to eIF-3.
b. Ribosome assembly. The 60S subunit is bound, accompanied by hydrolysis of eIF-2-GTP to eIF-2-GDP. Other IF’s are released. GEF will recycle eIF-2-GDP to eIF-2-GTP. There are three tRNA binding sites, the A (aminoacyl), P (peptidyl) and E (empty) sites. Met-tRNAiMet is bound in the P site.
c. aa-tRNA binding. eEF-1α-GTP delivers aminoacylated tRNA to the A site, accompanied by hydrolysis of GTP. In prokaryotes, this is EF-Tu. GEF recycles the GDP back to GTP.
d. Peptide bond formation. Peptidyl transferase activity resides in the 28S rRNA component of the large subunit (23S in prokaryotes). This subunit acts as a ribozyme.
e. Ribosome translocation. eEF-2-GTP (EF-G in prokaryotes) moves messenger RNA and the new peptidyl-tRNA in register into the P site, moving deacylated tRNA into the E site and leaving a vacant A site. GTP is hydrolysed in this process, and will be regenerated by GEF. Once a new aa-tRNA is loaded into the A site, the tRNA in the E site will be released.
f. Termination. STOP codons are recognized by eRF-GTP, which binds in the A site and causes peptidyl transferase to transfer a water molecule to the peptide chain, thus terminating protein synthesis and releasing the polypeptide.
g. Dissociation. eRF dissociates, accompanied by hydrolysis of GTP, and the ribosomal subunits and mRNA dissociate. This is aided by RRF’s in prokaryotes.
The energetics of protein synthesis are very demanding. ATP → AMP provides the high energy bond of aa~tRNA, and both initiation and elongation factors hydrolyze GTP. Therefore four high energy bonds are required for each amino acid added to a growing peptide chain: two from ATP → AMP during tRNA charging and two from GTP hydrolysis during aa-tRNA delivery and translocation. Additional ATP and GTP molecules are needed for initiation and termination.
CC Prokaryotic protein synthesis is a target of inhibition by many antibiotics. Ricin and diphtheria toxin act on mammalian protein synthesis. Modulation of guanine nucleotide exchange factors is also used to regulate translation.
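The energy accounting in the elongation steps above can be written out as a small worked calculation. This is a minimal sketch with hypothetical names; it counts only the four high energy bonds per residue stated in the notes and deliberately ignores the additional ATP and GTP used for initiation and termination.

```python
# Per-residue cost as stated in the notes; initiation and termination are ignored.
ATP_BONDS_PER_CHARGING = 2  # ATP -> AMP + PPi when a tRNA is aminoacylated
GTP_PER_DELIVERY = 1        # aa-tRNA delivery to the A site (eEF-1alpha / EF-Tu)
GTP_PER_TRANSLOCATION = 1   # ribosome translocation (eEF-2 / EF-G)


def elongation_cost(n_residues: int) -> dict:
    """High energy phosphate bonds spent on charging and elongation for n residues."""
    if n_residues < 0:
        raise ValueError("n_residues must be non-negative")
    atp_bonds = ATP_BONDS_PER_CHARGING * n_residues
    gtp_bonds = (GTP_PER_DELIVERY + GTP_PER_TRANSLOCATION) * n_residues
    return {"atp_bonds": atp_bonds, "gtp_bonds": gtp_bonds, "total": atp_bonds + gtp_bonds}


print(elongation_cost(1))    # {'atp_bonds': 2, 'gtp_bonds': 2, 'total': 4}
print(elongation_cost(300))  # total of 1200 for a hypothetical 300-residue protein
```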
Antibiotic | Site of action | Process affected
Spectinomycin | 30S subunit, prokaryotes | Initiation
Streptomycin (aminoglycoside) | 30S subunit distortion, preventing assembly of 70S ribosome, prokaryotes | Initiation
Tetracycline | Aminoacyl-tRNA binding, 30S subunit, prokaryotes* | Elongation
Puromycin | Peptidyl transferase, 70S ribosomes* | Elongation
Chloramphenicol | Peptidyl transferase, 70S ribosomes* | Elongation
Erythromycin | 50S subunit, prokaryotes | Translocation
Fusidic acid | EF-G (prevents turnover) | Translocation
*some effect on eukaryotic ribosome
Toxin | Site of action | Process affected
Diphtheria toxin | eEF-2 (prevents recycling), eukaryotes | Translocation
Ricin | Depurination of 28S rRNA | Ribosome deactivation
Note: You will be expected to understand the effect of inhibiting translation at the different steps of the protein synthesis pathway. Drug names will not be tested in the Cell module, but will be tested later in pharmacology. Protein synthesis targets of toxins (ricin, diphtheria toxin) may be tested.
4.3 Post-translational modification of proteins is common and affects their distribution and function. N-terminal signal peptides on “pre”-proteins direct proteins destined for extracellular localization via the signal recognition particle into the ER lumen. The signal sequence is cleaved off by signal peptidases in the ER. Similarly, proteins destined for mitochondria, the nucleus or peroxisomes have specific peptide sequences that assist with localization. In the ER and Golgi apparatus, hydroxylation, glycosylation and other covalent modifications occur. Lysosomal enzymes are tagged with mannose-6-phosphate, which directs them to a specific receptor on the surface of lysosomes. Chaperones aid in protein folding by isolating polypeptides and providing disulfide isomerase and prolyl isomerase functionality. Misfolded proteins are tagged with ubiquitin and destined for the proteasomes.
Examples of common post-translational modifications of proteins
Modification | Notes
Acetylation | At N-terminus (common), at Lys in histone modification; reversible
Carboxylation | Of glutamate, forming γ-carboxyglutamate in proteins involved in blood coagulation; vitamin K dependent (ER)
Fatty acylation | Anchors proteins in membranes – e.g. myristoylation, palmitoylation (Golgi)
Glycosylation | Localization signal for secreted and lysosomal or membrane proteins; functional role in extracellular matrix proteins; occurs at Asn, Ser, Thr, HyL. N-glycosylation in ER; O-glycosylation in Golgi
Hydroxylation | Of Pro and Lys in collagen; in ER; vitamin C dependent
Methylation | At lysine (alters charge on protein) – with SAM; chromatin remodeling
Phosphorylation | Of Ser, Thr or Tyr. Addition and removal regulates activity, e.g. of enzymes of glycogen degradation and regulation of gene expression
Prenylation | At Cys near C-terminus; Ras, proteins involved in leukocyte activation
Proteolytic cleavage | Of zymogens, insulin – proteins are expressed as preproproteins (Golgi or trans-Golgi or in secretory vesicles)
CC (1) Vitamin K dependent carboxylation of glutamate to form γ-carboxyglutamyl residues on many zymogens plays a role in blood coagulation by making the proenzymes sensitive to Ca2+ ions. (2) Proinsulin is post-translationally proteolyzed into insulin and C-peptide, the latter an important measure of the level of endogenously produced insulin. (3) Deletion of Phe508 in the chloride channel cystic fibrosis transmembrane conductance regulator (CFTR) results in it being targeted by ubiquitination to proteasomes as a misfolded protein, despite the fact that it would have been fully functional if it reached its target membranes. (4) Defects in the enzyme which transfers mannose-6-phosphate to enzymes intended for lysosomes result in clogging of lysosomes due to material that cannot be digested; this is known as I-cell disease.
(5) The anti-inflammatory activity of statins is in part due to inhibition of prenylation; prenylation reagents are derived from the cholesterol biosynthesis pathway.
5 REGULATION OF GENE EXPRESSION
5.1 Eukaryotic Gene Regulation
5.1.1 AT THE LEVEL OF DNA
DNA methyl transferase alters chromatin structure by methylating DNA at the 5-position of cytosine bases (using SAM as a cofactor). Methylation at CpG islands in the 5’ regulatory region of many eukaryotic genes is associated with gene silencing. Methylated DNA binding proteins block transcription. They can also recruit chromatin remodeling enzymes such as histone deacetylases (HDACs), trans-acting molecules which modify histones, forming compact heterochromatin. Methylation is an epigenetic effect. Methylation patterns are heritable and genes retain memory of their parental origin, a phenomenon called genomic imprinting. Imprinting results in loss-of-function of the associated genes. It is evident that methylation patterns in many intergenic regions change rapidly during development and after birth. Epigenetic events such as histone modifications, DNA methylation and microRNA expression continue lifelong, and are affected both through germline transmission and by the environment. External effectors of epigenetic regulation include cigarette smoke, radiation, hormone disrupters, aromatic hydrocarbons, pathogens, heavy metals, particulate matter and other pollutants. Conversely, histone acetyltransferases (HAT’s) add acetyl groups to histone lysine residues, and contribute to the formation of transcriptionally active euchromatin. Acetylating lysine residues in histones reduces their positive charge and affinity for DNA. HAT’s are recruited to the transcription complex (see below). Gene amplification is a form of regulation in which multiple copies of a gene have been integrated into chromosomal DNA or appear in the form of extra-chromosomal double minutes.
This results in gain-of-function for the product of those genes.
CC (1) MeCP2 is a methylated CpG binding protein that is aberrant in Rett syndrome (developmental regression in girls at ~2 years of age). (2) Hyper- and hypomethylation are evident in tumor cells; the methylome and transcriptome can now be studied for molecular fingerprinting of cancer. (3) Tumor cells resistant to methotrexate have been observed due to amplification of the gene for DHFR, and amplification of myc or MDM2 has been observed as a mechanism for deregulation of cancer cell growth. (4) One of the earliest examples of genomic imprinting is the case of Prader-Willi and Angelman syndromes, which both involve a gene deletion in the same locus on chromosome 15, but have very different phenotypes. This region of the genome is differentially expressed based on parental origin and tissue type, with the maternal copy expressed only in the brain. Inheritance of a deletion from the mother leads to developmental deficiencies, sleep disorders and happy affect (Angelman syndrome), associated with absence of function in the brain, while inheritance of a deletion from the father leads to hyperphagia, obesity, developmental delay and sexual development deficiencies (Prader-Willi syndrome).
Figure caption: Nuclear hormones (often steroids) that bind to nuclear hormone receptors, triggering transcription of specific genes (figure label: calcitriol).
5.1.2 AT THE LEVEL OF TRANSCRIPTION
Regulatory regions of a gene occur at the 5’-end (relative to the coding strand) and are composed of a TATA box promoter (-40 – -200bp upstream) and additional regulatory elements (up to 1000’s bp upstream or downstream). These are cis-acting regulatory sequences. Trans-acting regulatory factors are diffusible molecules that bind to the regulatory sequences, including basal TF’s that bind to the promoter, and gene-specific activators / repressors that bind to the upstream regulatory elements.
[Figure: Example of the glucocorticoid receptor (LIR 32.10)]
[Figure: Signal transduction by extracellular (peptide) hormones (LIR 32.11)]

Important gene specific regulators are the hormone response elements (HRE) and the cAMP response element (CRE). HRE’s bind hormone receptors (HR) that have been activated by nuclear hormone binding, inducing gene expression. CRE binds to CRE binding protein (CREB) that has been phosphorylated by cAMP-dependent protein kinase A. This is the mechanism of signal transduction from peptide or amino acid derived hormones which bind to cell-surface hormone receptors.

Nuclear hormone receptors have several interacting domains: a ligand binding domain for receptor activation, a DNA binding domain with characteristic structure and an activator binding domain. The activator mediates association with RNA pol II and the preinitiation complex and with chromatin modifying proteins such as HAT. HRE’s are palindromic sequences, and hormone receptors form homo- or heterodimers to increase affinity and selectivity for the HRE. DNA binding domain motifs of transcription factors include the helix-turn-helix structure of the homeodomain proteins, the Zn fingers of steroid hormone receptors and the leucine zipper of CREB.

CC
(1) Over-expressed estrogen receptor (ER) in many breast cancers makes them susceptible to treatment with an ER antagonist, such as tamoxifen. ER overexpression has also been implicated in other cancer types, such as colon and prostate.
(2) Genes in the homeobox family code for transcription factors and are strictly regulated during embryonic development, directing the formation of limbs and organs and the differentiation of cells.
(3) The cAMP/CRE/CREB system is important in glucose homeostasis by activating expression of genes involved in gluconeogenesis (PEP carboxykinase, glucose-6-phosphatase).

5.1.3 POST-TRANSCRIPTIONAL REGULATION

miRNA and siRNA are small ~21-22nt single stranded RNA’s that are produced from endogenous or exogenous dsRNA.
They are formed from larger precursors by the action of the enzyme DICER and incorporated as the guide RNA in the RNA-induced silencing complex (RISC). miRNA is produced from non-coding regions of the genome, and functions in chromatin remodeling and modulation of mRNA levels. siRNA is produced as a host response to a virus, guiding RISC to form a complex with the foreign RNA, which is degraded by helicase and endonuclease enzymatic activity.

CC
RNA interference (RNAi) is a promising strategy for specific targeting of upregulated genes, and could be useful in reducing the expression of oncogene products or in treating gain-of-function autosomal dominant disorders. Antisense therapy using DNA, RNA or modified oligomers is a form of gene therapy, and suffers from similar limitations of drug delivery, since DNA and RNA do not cross the cell membrane.

5.1.4 AT THE LEVEL OF TRANSLATION

Phosphorylation of eIF-2 down-regulates protein synthesis. Phosphorylated eIF-2 binds tightly to its guanine nucleotide exchange factor (GEF), which blocks the exchange of GDP for GTP that regenerates active eIF-2. This renders eIF-2 incapable of being used to form a new ternary initiation complex, and translational initiation is reduced.

CC
In developing erythrocytes (reticulocytes), heme controlled inhibitor (HCI) is a kinase which phosphorylates eIF-2. HCI is inactivated by heme, leading to increased synthesis of globin chains. In leukocytes, interferons are induced by the detection of viral RNA. The interferons induce an RNA-dependent protein kinase that phosphorylates eIF-2. Protein production slows in the virally infected cell.

5.2 Prokaryotic Gene Regulation

Prokaryotic genes are clustered into operons, which are polycistronic and consist of a sequence of related genes, governed by an operator. The operator is turned on by an inducer, which binds to a repressor protein expressed by the regulatory gene, preventing repressor from binding to the operator.
An example of an inducer is lactose, a catabolite that acts on the lac repressor. Amino acids such as Trp act in the opposite sense on the trp repressor: Trp is a corepressor that enables the repressor to bind the operator, turning the operon off. Dual control occurs in the lac operon via a cAMP-dependent activator protein binding to the CAP site. Under conditions of low glucose (high cAMP), the CAP protein is bound, permitting expression, so that lactose is metabolized only in the absence of glucose.

6 BIOTECHNOLOGY

6.1 Introduction

Recombinant DNA technology was originally developed as a research tool but it is now being used to identify defective genes associated with disease and potentially to correct genetic defects. Completion of the human genome project led to Genome-Wide Association Studies (GWAS) in an attempt to pinpoint genetic variations (SNPs / haplotypes) as causative for, or correlated with, disease phenotype. For most chronic diseases, however, GWAS did not yield correlations despite clear indications of familial inheritance. This phenomenon is called missing heritability. Diseases in this category include diabetes, Parkinson’s disease, Alzheimer’s disease, autism, Crohn’s disease and many others. It is possible that rare mutations exist within families and that these could be found by whole genome sequencing, a once prohibitively costly enterprise that is now much more affordable. It is also likely that genes that contribute to common biochemical pathways should be assessed together, since the phenotypic outcome is a breakdown in the biochemistry. This is systems biology, the study of networks of proteins. Not only gene sequence but also epigenetic regulation, which controls protein expression levels, must be considered.

6.2 Strategies for Analyzing Sequences of DNA

[Figure: The Sanger method of DNA sequencing. Note that the sequence read from the gel is the reverse complement of the template. (From Marks Med. Biochemistry)]

6.2.1 DNA sequencing is commonly performed by the Sanger method. Dideoxynucleotides (ddNTPs) lacking a 3’-hydroxyl group are included in small amount along with larger quantities of dNTPs.
When incorporated, they terminate strand synthesis. With a suitable ratio of ddNTPs to dNTPs, strands of every possible length will be generated. They are separated on a DNA gel and the sequence read out from the gel. Originally 4 reactions, each containing a different ddNTP, would be carried out and run on 4 lanes of a gel (see accompanying figure), but this is now accomplished in a single reaction with each ddNTP labeled with a different fluorophore. The DNA is separated by capillary electrophoresis and the bases are identified by the emission wavelength of each band.

6.2.2 Restriction Enzymes are endonucleases derived from bacteria which recognize 4-6 base pair palindromic sequences in unmethylated DNA. They arose as a mechanism for bacteria to recognize foreign (viral) DNA, since bacterial DNA is methylated. Hundreds of restriction enzymes with different specificities have been isolated. Most leave “sticky ends” (see figure) but a few leave blunt ends with no overhang. Digestion of DNA with a restriction enzyme yields many small fragments. The restriction digest can be used to identify variations in base sequence in a gene or in selected regions of a genome (a DNA “fingerprint”).

[Figure: Cleavage often leaves sticky ends]

6.2.3 Probes are single stranded oligonucleotides with complementary sequence to a known gene or RNA. They are radiolabeled (32P) or fluorescently labeled. They hybridize to complementary DNA that has been converted to single stranded form by heat or alkali treatment, or to complementary RNA. Hybridized DNA is detected by autoradiography or fluorescence microscopy.
Examples of this include
o Genomic DNA from a restriction digest that is separated on a nitrocellulose blot of an electrophoretic gel (Southern Blot)
o RNA separated on a nitrocellulose blot of an electrophoretic gel (Northern Blot)
o A DNA microarray that has been selected to represent (a part of) a genome of an organism
o A cDNA microarray derived by reverse transcription of mRNA, chosen to represent a particular cell type, organism or cancer type.

[Figure: Preparation of cDNA containing the green fluorescent dye Cy3. A corresponding red fluorescent dye Cy5 could be used. DNA containing both dyes will fluoresce yellow in a cDNA microarray.]

Probes can be allele-specific oligonucleotides (ASO’s) that can distinguish between RFLP fragment sequences or sizes on a gel, or mini- or microsatellite repeat markers that detect short tandem repeat polymorphisms (STRs) (Section 2.6.3). The stringency of hybridization can be controlled with physical conditions such as temperature or salt concentration. At high temperature and low salt, stringency will be high, requiring an exact match between probe and template DNA. At low temperature and high salt, mismatches between probe and template can be tolerated, leading to a low stringency condition.

6.3 Strategies for Production of Recombinant DNA

6.3.1 Cloning

The sticky ends generated by restriction endonucleases can be used to anneal two DNA fragments that have been cleaved with the same restriction enzyme, thus forming recombinant DNA in a test-tube. This is used to insert genes within plasmids for growth in bacterial, insect or mammalian cells. Host cells containing recombinant DNA are called transformed cells if they are bacteria or transfected cells if they are eukaryotes. Plasmids have an origin of replication and can replicate in host cells. They typically have an antibiotic resistance gene for selection of transformed or transfected cells.
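The sticky-end cloning described in 6.3.1 depends on both fragments being cut by the same enzyme at a palindromic site. As a minimal sketch (not a laboratory tool), the Python below checks that a recognition site is palindromic and cuts a linear top-strand sequence at each occurrence of the site. The function names, the example sequence and the choice of EcoRI (site GAATTC, cut after the first G) are illustrative assumptions and do not come from the text above.

```python
# Illustrative sketch only: names and the example sequence are hypothetical.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def is_palindromic(site):
    """A restriction site is palindromic if it equals its reverse complement."""
    return site == reverse_complement(site)


def digest(seq, site="GAATTC", cut_offset=1):
    """Cut a linear top-strand sequence at every occurrence of `site`.

    `cut_offset` is the number of bases of the site left on the upstream
    fragment (1 for EcoRI, which cuts between G and AATTC).
    """
    fragments = []
    start = 0
    pos = seq.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(seq[start:cut])
        start = cut
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])
    return fragments


# Two EcoRI sites give three fragments; the internal fragment starts with
# AATTC, reflecting the 5' AATT overhang left on the top strand.
print(digest("AAGAATTCTTGAATTCGG"))  # ['AAG', 'AATTCTTG', 'AATTCGG']
```

Because the site is palindromic, the bottom strand is cut at the mirror-image position, so any two ends produced by the same enzyme carry complementary overhangs and can anneal.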
6.3.2 PCR is an in vitro temperature cycling method using primers known to be complementary to the ends of the DNA segment that is to be replicated. Thermal cycling of a mixture of primers, DNA substrate, dNTP’s and DNA polymerase is used to produce billions of copies of the DNA fragment.
o Heating separates the double stranded DNA.
o Partial cooling allows primers to anneal to the DNA at each end of the segment of interest. The primers have a free 3’-OH.
o DNA pol synthesizes DNA.
o Heating separates the newly formed double stranded DNA (now 2).
o The cycle is repeated, producing 4, 8, 16, … 2^n copies of DNA after n cycles.

[Figure: Detection of cystic fibrosis from PCR products using mutant and normal ASO’s]

RT-PCR is a method for obtaining amplified cDNA from mRNA. In real-time quantitative RT-PCR, the amount of cDNA produced is directly proportional to the original amount of mRNA present in the sample, an important test of mRNA expression.

6.4 Strategies for Diagnosis of Disease

6.4.1 Restriction Fragment Length Polymorphisms (RFLP)

Occasionally, disease specific mutations occur at a restriction site and therefore eliminate cleavage at that site. Mutations can also create new restriction sites. The change in size of the fragmented DNA reveals the mutation. Mutations not within a restriction site are detected using normal and mutant ASO’s for hybridization to the DNA. This technique is useful when the disease specific mutations are known.

There are many single gene diseases, e.g. cystic fibrosis, for which targeted mutation panels have been developed, containing all of the mutations known to cause disease (see www.genetests.org). A positive test in mutation panel analysis is a confirming diagnosis of the presence of a disease gene; however, a negative result does not mean that the patient does not carry a genetic mutation for the disease. For most diseases, not all disease-causing mutations are known.
In the case of familial clustering of disease, such individuals can be more accurately diagnosed by DNA sequencing of extended family members, a method that will also reveal new disease-causing mutations.

6.4.2 PCR using primers containing a known disease-causing mutation will amplify the DNA only if the mutation is present. PCR can be used to identify deletions or insertions or variable numbers of tandem repeats (VNTRs) using primers that bracket the region of interest. In cases of very large numbers of triplet repeats, such as Huntington’s disease or Fragile X syndrome, restriction digest and use of an ASO with the triplet repeat sequence is preferable because of the limited size of PCR products.

[Figure: Genetic markers in disease prediction]

6.4.3 Gene expression levels are determined using quantitative RT-PCR, a Northern Blot or a cDNA microarray. In microarrays, patient cDNA labeled with Cy5 (red) is co-hybridized with control cDNA labeled with Cy3 (green). A yellow color implies that gene expression in the control and patient cells is unchanged, since equal amounts of cDNA are present. Up-regulation of gene expression is indicated by a red spot on the microarray, and down-regulation of gene expression is indicated by a green spot. Microarrays containing whole genomes of organisms are available, as are oligonucleotide arrays that select for specific groups of genes. Uses include
o Cancer - Characterize transformation related genes, tumor markers, vaccine candidates
o Pathogens - Identification, response to antibiotics, gene expression in immune response
o Morphogenesis - Genes activated during epithelial-mesenchymal interactions - recombinant morphogens and gene therapy

[Figure: DNA microarray after hybridization with control and sample DNA]

Although conceptually simple, statistical analysis of the output of DNA microarrays is still being developed. Cluster analysis of microarrays could identify unknown genes.
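The red / green / yellow readout described in 6.4.3 amounts to comparing the two dye intensities at each spot. The Python below is a minimal sketch of that comparison, assuming background-corrected positive intensities and a two-fold change cutoff. The cutoff, the function name and the example intensities are hypothetical choices, not values from the text, and real analyses add normalization and statistical testing.

```python
import math


def classify_spot(cy5_patient, cy3_control, fold_threshold=2.0):
    """Classify one microarray spot from its two dye intensities.

    cy5_patient: red-channel intensity (patient cDNA)
    cy3_control: green-channel intensity (control cDNA)
    fold_threshold: assumed fold change needed to call a difference
    """
    if cy5_patient <= 0 or cy3_control <= 0:
        raise ValueError("intensities must be positive")
    log2_ratio = math.log2(cy5_patient / cy3_control)
    cutoff = math.log2(fold_threshold)
    if log2_ratio >= cutoff:
        return "red: up-regulated in patient"
    if log2_ratio <= -cutoff:
        return "green: down-regulated in patient"
    return "yellow: unchanged"


# Hypothetical intensities for three spots.
for red, green in [(800, 200), (100, 400), (300, 280)]:
    print(red, green, classify_spot(red, green))
```

Working with the log2 ratio treats up- and down-regulation symmetrically: a four-fold increase and a four-fold decrease give +2 and -2 rather than 4 and 0.25.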
6.4.4 Techniques used to monitor proteins

A Western blot is used to detect proteins that have been separated by gel electrophoresis and transferred to a filter. An antibody is used to bind to the relevant protein(s). To detect those antibodies that have bound, a reporter group such as an enzyme conjugated to anti-immunoglobulin antibodies is added (called 2nd antibody or conjugate). After the excess 2nd antibody is washed free, a substrate is added that will precipitate upon reaction with the conjugate, resulting in a visible band.

An ELISA is similar but no gel electrophoresis is involved. It is used to detect antigen-antibody reactivity. In an indirect ELISA, patient serum is added to antigens adsorbed in the wells of a plate, followed by washing and addition of enzyme conjugated 2nd antibody. After washing out excess 2nd antibody, substrate is added and color measured. This was the first test for HIV exposure. A direct ELISA can be used to detect specific antigens in a patient’s serum by adding enzyme-conjugated antibody directly.

[Figure: HIV test using HIV antigens in the wells and adding patient serum, followed by wash and the 2nd antibody conjugate.]

6.5 Recombinant DNA in Prevention and Treatment of Disease

6.5.1 Vaccines

Antigenic proteins are made using recombinant DNA technology and used in a vaccine. The method avoids use of an attenuated infectious agent; although attenuated organisms are for the most part safe, they have occasionally been known to cause disease. The hepatitis B vaccine is a recombinant protein vaccine. DNA vaccines are being investigated in which the DNA that produces the antigenic protein is supplied to cells, which then make the antigen. This method has not reached the clinic yet. mRNA vaccines against the SARS-CoV-2 virus utilize a similar strategy, delivering mRNA of the spike protein to cells, which then make the spike antigen.
The short lifespan of mRNA means that antigen production is temporary, but the approach has been highly successful in preventing SARS-CoV-2 infection. Vaccine development included stabilization of the mRNA, e.g. by pseudouridine substitutions (JACS 2021 Outlook on pseudouridine), and the development of lipid nanoparticles for mRNA delivery (C&EN 2021 nanoparticle delivery). mRNA vaccines against specific cancers have been decades in development (Nature review 2019 cancer vaccines).

Antibodies can be provided directly to aid in the immune response. Of recent interest are nanobodies, small stable single VHH domain antibodies that are being used to target von Willebrand factor (which promotes blood clotting) and SARS-CoV-2 among many other examples.

6.5.2 Therapeutic Proteins

Human insulin and human growth hormone are examples of recombinant proteins used to treat diabetes or deficiency in growth hormone, respectively. Previously these proteins were extracted from tissues. Creutzfeldt–Jakob disease was a dreaded complication of therapy with GH obtained from human pituitary glands. Replacement enzyme therapy is another example of therapeutic use of recombinant proteins. An example is Cerezyme® (glucocerebrosidase) for the treatment of Gaucher disease.

6.5.3 Gene Therapy is a method for introducing a corrected gene into cells. The main vehicle for transmission is a retroviral or adenoviral vector. Retroviral vectors have the advantage of integrating their DNA into the host genome, resulting in a permanent copy of the gene, but they can carry only a limited size of gene. Adenoviruses can carry larger genes, but expression is transient, and patients may have an immunological reaction or experience toxicity related to the pathogenic adenovirus. In principle, gene therapy can be used to introduce dsRNA’s that induce siRNA and alter the expression level of genes (Section 2.5.1.3). Gene therapy is also being studied to introduce susceptibility of cancer cells to chemotherapy.
For example, introduction of HSV thymidine kinase into cancer cells will make them susceptible to ganciclovir.

6.5.4 CRISPR/Cas9 gene editing is a method for changing an organism’s DNA by editing, removing or adding a DNA sequence at a specific location in the genome. CRISPR stands for clustered regularly interspaced short palindromic repeats and Cas9 for CRISPR-associated protein 9. This gene editing method was adapted from an antiviral mechanism used by bacteria. Bacteria capture snippets of DNA from invading viruses and store them as CRISPR arrays. In a subsequent infection by that or a similar virus, the bacteria produce RNA from the CRISPR arrays. The RNA selectively binds to the viral DNA sequence and recruits the endonuclease Cas9 to cleave the DNA.

CRISPR/Cas9 technology in the lab involves the use of a guide RNA to target a specific DNA sequence in the genome for editing. Cas9 causes a double stranded break of the genomic DNA at the targeted location that is subsequently mended (perhaps not perfectly) by DNA repair processes. CRISPR/Cas9 holds promise for the treatment of single gene disorders, such as cystic fibrosis, sickle cell disease, triplet repeat disorders and hemophilia. An overlapping DNA segment with the corrected sequence can be incorporated into the genomic DNA during repair of the double stranded break. CRISPR/Cas9 is also being explored for treating cancer and other more complex diseases. Ethical considerations are a concern, since gene editing of germ-line or embryonic cells could be passed to future generations. Regulation of this technology is necessary in order to avoid the specter of “designer” babies rearing its ugly head.

6.5.5 Cancer Immunotherapy (additional topic of interest in molecular based therapy but not tested in Cell Module) is an intriguing method of reactivating the immune system to attack cancer cells. Tumor cells co-opt certain immune checkpoint pathways to evade the immune system.
Immune checkpoints are inhibitory pathways crucial for self-tolerance and for modulation of immune responses. Scientists have developed antibodies that can reactivate T-cells by targeting the immune checkpoints. T-cells embedded in solid tumors are reactivated and attack the cancer cells. So far the technique has shown extraordinary promise against some of the hardest to treat cancers including pancreatic cancer, glioblastoma and lung cancer. Adverse effects including autoimmune response have not so far been a problem, although it is believed necessary to modulate dosage to avoid this complication. Combination therapy is also required to avoid resistance development.

Example review: Nature Review May 2020; NY Times series Aug, 2016: Immunotherapy for cancer
