Macromolecules of Life Lectures - PDF
Uploaded by RightfulDrama
Summary
These lecture notes cover the structure and function of nucleic acids, including DNA and RNA. Topics discussed include base pairing, DNA replication, and in vitro DNA replication (PCR). The lectures are aimed at a university-level audience.
Full Transcript
CH4305/14 Macromolecules of Life
Lecture 1 - Building Blocks of Nucleic Acids

What we will learn
- Breakdown of the chemical structure of nucleic acids
- Difference between nucleobase, nucleoside and nucleotide
- The structure and chemical properties of nucleobases
- The structure and connectivity of ribofuranose
- The structure and connectivity of the phosphate group
- The structure and names of nucleotides
- The differences between the various nucleotides

Difference between nucleobase, nucleoside and nucleotide
- Polymer of nucleotides: oligonucleotides, nucleic acids, RNA & DNA.

Nucleobase structure and properties
- Derivatives of pyrimidine or purine
- Nitrogen-containing heteroaromatic molecules
- Planar or almost planar structures
- Absorb UV light around 250–270 nm

UV Absorption of Nucleobases
- Absorption of UV light at 250–270 nm is due to π → π* electronic transitions
- Excited states of common nucleobases decay rapidly via radiationless transitions
- Effective photoprotection of genetic material

Reminder: Hydrogen bond
- H-bond donor: hydrogen attached to a highly electronegative atom
- H-bond acceptor: strongly electronegative atom (O, N, F)

Acid dissociation constant (Ka)
$$\mathrm{HA} + \mathrm{H_2O} \rightleftharpoons \mathrm{H_3O^+} + \mathrm{A^-}$$
(HA is the acid, A^- is the conjugate base)
$$K_a = \frac{[\mathrm{H_3O^+}][\mathrm{A^-}]}{[\mathrm{HA}]}, \qquad pK_a = -\log K_a$$

Acid dissociation constant (Ka) example strong acid
$$pH = pK_a + \log\frac{[\mathrm{A^-}]}{[\mathrm{HA}]}$$
$$7 = 3 + \log\frac{[\mathrm{A^-}]}{[\mathrm{HA}]}$$
$$4 = \log\frac{[\mathrm{A^-}]}{[\mathrm{HA}]}$$
$$\frac{[\mathrm{A^-}]}{[\mathrm{HA}]} = 10^{4} = \frac{10{,}000}{1}$$

Nucleobase – purine bases
- Good H-bond donors (d) and acceptors (a)
- Neutral molecules at pH 7

Nucleobase – pyrimidine bases
- Good H-bond donors (d) and acceptors (a)
- Neutral molecules at pH 7
- Approximate pKas taken from Bordwell pKa table

Nucleobase – tautomerism
- Prototropic tautomers are structural isomers that differ in the location of protons
- Keto-enol tautomerism is common in ketones
- Lactam-lactim tautomerism occurs in some heterocycles
- Both tautomers exist in solution but the lactam forms are predominant

Ribofuranose
- Shown as Fischer projection and Haworth projection
- RNA: β-D-ribofuranose
- DNA: β-2'-deoxy-D-ribofuranose

Ribofuranose - β-N-Glycosidic Bond
- In nucleotides the pentose ring is attached to the nucleobase via an N-glycosidic bond.
- The bond is formed to the anomeric carbon of the sugar in the β configuration.
- The bond is formed:
  - to position N1 in pyrimidines.
  - to position N9 in purines.
- This bond is quite stable toward hydrolysis, especially in pyrimidines.

Ribofuranose – Conformation around β-N-Glycosidic Bond
- Relatively free rotation can occur around the N-glycosidic bond in free nucleotides.
- Angle near 0° corresponds to syn conformation
- Angle near 180° corresponds to anti conformation (in B-DNA)

Phosphate groups – attached to 5'
- Negatively charged at neutral pH
- Typically attached to 5' position
- Nucleic acids are built using 5'-triphosphates: ATP, GTP, TTP, CTP

Phosphate groups - oligonucleotide
- Negatively charged at neutral pH
- Typically attached to 5' position
- Nucleic acids contain one phosphate moiety per nucleotide

Phosphate groups – alternative positions
- Phosphate can also be found in nature attached at other furanose ring positions

Structure and names of deoxyribonucleotides
- Learn the structures, names, and symbols (one-letter (A) and four-letter (dAMP) codes)

Structure and names of ribonucleotides
- Learn the structures, names, and symbols (one-letter (A) and three-letter (AMP) codes)

Differences between DNA and RNA
- Difference in monosaccharide
- Difference in nucleobase

What we have learned
- Breakdown of the chemical structure of nucleic acids
- Difference between nucleobase, nucleoside and nucleotide
- The structure and chemical properties of nucleobases
- The structure and connectivity of ribofuranose
- The structure and connectivity of the phosphate group
- The structure and names of nucleotides
- The differences between the various nucleotides

CH4305/14 Macromolecules of Life
Lecture 2 – Structure of Nucleic Acids

What we will learn
- How nucleic acid strands interact
- DNA/RNA strand connectivity and stability
- DNA dimer – WCF base pairing and helix structures
- Hoogsteen base pair - DNA trimer and G-quadruplex
- Other interactions – Hairpins and cruciforms
- RNA structures
- DNA denaturation

A recap from Lecture 1
- nucleobase = nitrogenous base
- nucleoside = nitrogenous base + pentose
- nucleotide = nitrogenous base + pentose + phosphate
- nucleic acid/DNA/RNA/oligonucleotide/polynucleotide = nucleotide polymer

Nucleic acid structure and stability
- Covalent bonds formed via phosphodiester linkages
  - negatively charged backbone
- Linear polymers
  - No branching or cross-links
- Directionality
  - 5' end is different from 3' end
  - We read the sequence from 5' to 3'
- DNA backbone is fairly stable
  - DNA from mammoths?
  - Hydrolysis accelerated by enzymes (DNase)
- RNA backbone is chemically unstable
  - In water, RNA lasts for a few years
  - In cells, mRNA is degraded in a few hours

RNA instability – base-catalysed hydrolysis
- Hydrolysis is also catalysed by enzymes (RNase)
- RNases are abundant throughout nature

Hydrogen-bond interactions: Watson-Crick-Franklin base pairs
- Two bases can hydrogen bond to form a base pair
- For monomers, a large number of base pairs is possible
- In a polynucleotide, only a few possibilities exist
- Watson-Crick-Franklin (WCF) base pairs predominate in double-stranded DNA
- Purine pairs with pyrimidine

Hydrogen-bond interactions: Watson-Crick-Franklin base pairs
- Watson-Crick-Franklin base pairing: C≡G, A=T
- The two strands are complementary
- 1 Å = 0.1 nm

The double helix
- Figure labels: rise per base pair, rise per turn

B-DNA and Other Forms of DNA

Hoogsteen (non-WCF) base pairs: triple helix
- T=A∙T
- C≡G∙C+
- Purine nucleobase H-bonds to two pyrimidine nucleobases

Hoogsteen (non-WCF) base pairs: G quadruplex
- Four guanine nucleobases H-bond to one another

Other interactions – hairpins and cruciforms
- Both strand segments have the same sequence when written 5'-3'
- Both strand segments have a different sequence when written 5'-3'

Palindromic sequences can form hairpins
- Unpaired loop/Loop = non-H-bonding nucleobases

Complementary strand hairpins can form cruciforms

RNA molecules can form complex intrastrand structures
- M1 RNA component of the enzyme RNase P
- tRNA (PDB ID 1TRA)
- Hammerhead ribozyme (PDB ID 1NME)
- Intron (PDB ID 1GRZ)

RNA molecules can form complex interstrand structures
- A single strand will fold to an A-form right-handed helix.
- Complementary and nearly complementary strands also fold to an A-form right-handed double helix.
- But less complementary strands can form more complex structures.

DNA/RNA denaturation
- Covalent bonds remain intact
  - Genetic code remains intact
- Hydrogen bonds are broken
  - Two strands separate
- Base stacking is lost
  - UV absorbance increases
- Denaturation can be induced by high temperature or a change in pH.
- Denaturation may be reversible: annealing.

DNA/RNA denaturation - Melt transition
- Several factors affect the midpoint of the transition (Tm), i.e. the melt temperature
- Tm increases with:
  - higher GC content
  - higher concentration
  - longer oligomer length
  - pH values nearer to neutral
  - higher salt concentration
- An RNA duplex is about 20 °C more stable than the corresponding DNA duplex!
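To make the GC-content and length trends in the melt-transition slide concrete, here is a minimal Python sketch. It uses the Wallace rule of thumb, Tm ≈ 2 °C × (A+T) + 4 °C × (G+C), which is not taken from the lecture and is only a crude estimate for short DNA oligonucleotides (roughly 14 to 20 nt); it ignores concentration, pH and salt entirely. The function names are illustrative.

```python
def gc_content(seq: str) -> float:
    """Fraction of G and C bases in a DNA sequence (0.0 to 1.0)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> int:
    """Rough melt temperature in degrees C from the Wallace rule.

    Assumption: short DNA oligo; 2 degrees per A/T and 4 degrees per G/C.
    """
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    for oligo in ("ATATATATATATAT", "ATGCATGCATGCAT", "GCGCGCGCGCGCGC"):
        print(oligo, f"GC = {gc_content(oligo):.0%}", f"Tm ~ {wallace_tm(oligo)} C")
```

For equal length, the estimate rises with GC content, and for equal composition it rises with length, which is the qualitative behaviour listed above. Real Tm predictions normally use nearest-neighbour thermodynamic models instead.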
What we have learned
- How nucleic acid strands interact
- DNA/RNA strand connectivity and stability
- DNA dimer – WCF base pairing and helix structures
- Hoogsteen base pair - DNA trimer and G-quadruplex
- Other interactions – Hairpins and cruciforms
- RNA structures
- DNA denaturation

CH4305/14 Macromolecules of Life
Lecture 3 – DNA replication

What we will learn
- How DNA information is preserved
- The process of DNA replication in nature: initiation, elongation & termination
- A method of replicating DNA in vitro (PCR)
- DNA modification by biochemical processes (epigenetics)
- DNA modification by exogenous damage (mutations)

Recap from Lecture 2 – Strand complementarity, denaturation and annealing
- Strand complementarity in DNA: GC and TA base pairs
- DNA denaturation: process of breaking H-bonds between strands and π stacking within strands, forming single-strand DNA (ssDNA)

Replication of Genetic Code
- Template required to replicate DNA
- Each new duplex comprises a template strand and a daughter strand
- Three steps to biosynthesis of biopolymers:
  1. Initiation
  2. Elongation
  3. Termination

Initiation - DNA strand separation
- Double helix opened with the aid of initiator proteins and unwound with helicases

Elongation – DNA replication
- Primer: short oligonucleotide sequence created by a primase enzyme
- DNA polymerase: enzyme that can build a complementary DNA strand starting from an existing primer
- Figure credit: Khan Academy

Elongation – DNA replication in more detail

Elongation - Challenge of 3' daughter strand

Elongation – Okazaki fragments
- Leading strand (template strand is being opened 3' to 5'): DNA polymerase can follow the direction of unwound DNA
- Lagging strand (template strand is being opened 5' to 3'): DNA polymerase cannot follow the direction of unwound DNA. Short Okazaki fragments are ...

Elongation – proofreading
- During DNA synthesis, errors can occur from the addition of an incorrect nucleotide
- A typical error rate of 1/100,000 = 120,000 mistakes every time a cell divides!
- Polymerase has proofreading capability
- Molecular biologists can alter the error rate in the lab

Termination – End of DNA replication
- Ter: a 20 bp sequence that traps the replication fork
- Ter traps a fork travelling in one direction along the strand, and the bound proteins then collide with the replication fork arriving from the opposite direction
- Linear DNA/chromosomes: replication is terminated when forks meet nearby replication forks
- For eukaryotic (cells containing a nucleus) chromosomes, the process is still ...

In vitro DNA replication – Polymerase chain reaction
- One pot reaction: add polymerase, DNA template, primers & dNTPs
- Amplification cycle: 3 steps at different temperatures
- After 25 cycles, one template could give about 34 million copies (2^25 ≈ 3.4 × 10^7)!

PCR amplification - doubles per cycle

PCR – introduce site-directed mutations
- Buy synthesised primers that have parts that are non-complementary to the template
- Can be used to introduce point mutations, insertions or deletions
- Primer binds despite not being a perfect match for the template
- The rest of the chain is elongated

DNA modification – Epigenetics
- Modifications are made after DNA synthesis
- 5-Methylcytosine is common in eukaryotes, also found in bacteria
- N6-Methyladenosine is common in bacteria, not found in eukaryotes

DNA modification – Epigenetics
- Epigenetics: inheritable traits beyond genetics
- Figure credit: Mike Jones

DNA modification – Degradation by deamination
- Deamination reactions are very slow
- There is a large number of residues
- The net effect is significant: 100 C → U events/day in a mammalian cell

DNA modification – Degradation by depurination
- Abasic/apurinic site
- N-glycosidic bond is hydrolysed.
- Significant for purines: 10,000 purines lost/day in a mammalian cell

DNA modification – Degradation by exogenous alkylating agents
- Alkylating agents such as electrophilic methylating agents can cause damage to DNA

DNA modification – Radiation damage
- Cyclobutane uracil dimer

DNA modification – Radiation damage
- 6,4-dithymine photoproduct

What we have learned
- How DNA information is preserved
- The process of DNA replication in nature: initiation, elongation & termination
- A method of replicating DNA in vitro (PCR)
- DNA modification by biochemical processes (epigenetics)
- DNA modification by exogenous damage (mutations)

CH4305/14 Macromolecules of Life
Lecture 4 – DNA (chemical) synthesis and sequencing

What we will learn
- How to chemically synthesise and sequence DNA
- Protected nucleotides
- Oligosynthesis cycle and coupling efficiency
- Synthesising larger pieces & libraries
- Sanger sequencing
- Illumina sequencing
- Single-strand/nanopore sequencing

Recap from Lecture 3 – How DNA is synthesised in nature
- Primer: short oligonucleotide (oligo) sequence created by a primase enzyme.
- DNA polymerase: enzyme that can build a complementary DNA strand starting from ...
- Figure credit: Khan Academy

Protected phosphoramidites
- Different protecting groups can be selectively removed with different reagents.
- Benzoyl and 2-cyanoethyl protect against chain branching.
- Dimethoxytrityl protects against multiple additions
- Diisopropylamino upon protonation forms an activated ...

The cycle for oligonucleotide synthesis
- Solid phase synthesis: makes purification after each step easier
- Chemical synthesis runs opposite to nature: 3' to 5'
- 4 steps per cycle
  1. Deblocking/deprotection
  2. Coupling

High yield/coupling efficiency required for biopolymer synthesis
- 3 coupling steps at 0.995 efficiency: 0.995 × 0.995 × 0.995 = 0.995^3 ≈ 98.5%
- 30 coupling steps at 0.995 efficiency: 0.995^30 ≈ 86%
- 30 coupling steps at 0.980 efficiency: 0.980^30 ≈ 55%

Joining oligonucleotides together using Assembly PCR
- Figure credit: Dhorspool, self-made, CC BY-SA 3.0

DNA sequencing through synthesis - automated Sanger sequencing
- Mixture similar to PCR:
  1. Template DNA
  2. DNA polymerase
  3. dNTPs (deoxynucleoside triphosphates)
  4. Very low amounts of dideoxynucleotide mimics
- Chain terminating PCR: dideoxynucleotides cannot ...

Sanger technique - Chain terminating PCR
- A small % of chain termination happens when each complementary nucleotide is added.
- Each terminated chain will have a specific fluorescence maximum wavelength depending on the final dideoxynucleotide mimetic added.

Sanger technique - capillary electrophoresis and fluorescence detection

Massive parallel sequencing – Illumina sequencing
- Confusingly also known as next generation sequencing (NGS) or second generation sequencing

Illumina sequencing – Fragmentation and addition of adapters
1. Whole genomic DNA is fragmented
2. Adapters: small oligonucleotides ('oligos') are added to the ends of the fragments
- Multiplexing: unique adapters can be added to each fragmented genome, then all combined for sequencing

Illumina sequencing – Hybridisation to flow cell and fragment amplification
1. Hybridisation (allowing DNA to dimerise) of both adapters to flow cell 'oligos' forms a bridge
2. DNA polymerase makes the complementary strand starting from oligos covalently bound to the cell

Illumina sequencing – Linearisation of strand clusters
- Linearisation: the bridges are broken to form linear strands that are now covalently attached to the flow cell
- Clusters: because the process was repeated several times, all nearby oligos have been extended with the same sequence (clonal amplification).
- This leads to clusters full of the same sequence

Illumina sequencing – Sequencing by synthesis
- Like Sanger sequencing:
  1. Strands are sequenced by synthesising new strands
  2. The four nucleotides are labelled with different fluorescent dyes
- Unlike Sanger sequencing:
  3. No natural dNTPs are added.
  4. All steps cause chain termination, but can be ...

Long reads sequencing – Oxford Nanopore
- Up to 4 Mb reads (4 million base sequence)
- Can read epigenetic changes (methylation)
- Extremely portable machine
- Amongst other things, could revolutionise personalised medicine

What we have learned
- How to chemically synthesise and sequence DNA
- Protected nucleotides
- Oligosynthesis cycle and coupling efficiency
- Synthesising larger pieces & libraries
- Sanger sequencing
- Illumina sequencing
- Single-strand/nanopore sequencing

CH4305/14 Macromolecules of Life
Lecture 5 – DNA Transcription

What we will learn
- How DNA is converted into RNA
- The central dogma of molecular biology
- Transcription on a sequence level
- Overview of transcription: initiation, elongation & termination
- Posttranscriptional processing: sequence capping & splicing
- Reverse transcription

Recap from Lecture 1 - Differences between DNA and RNA
- Difference in monosaccharide
- Difference in nucleobase

Introduction - difference between prokaryotic and eukaryotic cells
- Figure panels: a prokaryotic cell, a eukaryotic cell

The central dogma of molecular biology
- Dogma: prescribed doctrine proclaimed as unquestionably true by a particular group
- Original central dogma of molecular biology (Francis Crick): "This states that once "information" has passed into protein it cannot get out again."

Gene expression and regulation
- Regulation: relative rates of transcription will affect the total amount of a gene's corresponding RNA in a cell
- The abundance of messenger RNA (mRNA) affects the abundance of protein
- Gene translation: not all transcribed RNA are ...

Transcription
- Like DNA replication, transcription is also based on synthesis of complementary strands

  (5')CGCTATAGCGTTT(3')  DNA non-template (coding) strand
  (3')GCGATATCGCAAA(5')  DNA template strand
  (5')CGCUAUAGCGUUU(3')  RNA transcript

- The RNA transcript is the same as the DNA coding strand with T → U and is synthesised using the template strand
- RNA is synthesised 5' to 3'

RNA polymerase
- Large enzymes made out of proteins
- Can unwind and rewind DNA without assistance from helicase
- Figure labels: NTP channel; template strand (inserts from 3' direction); new RNA strand (synthesised from 5' to 3')

RNA polymerases recognise specific genetic elements
- Promoter: sequence of DNA that is bound by RNA polymerase (RNAP)
- Terminator: sequence of DNA that stops further RNA elongation, and ultimately causes RNAP to dissociate
- Upstream: genetic information that is closer to the 5' end of the non-template (coding) strand
- Downstream: genetic information that is closer to the 3' end of the non-template (coding) strand

Bacterial transcription – RNA initiation
- Sigma factor: protein subunit of the RNAP that is essential ...

RNA initiation – Promoter sequence (σ subunit binding sites)
- Efficiency of binding and initiation is determined by:
  1. The identity of the two sequences
  2. The spacing between them
  3. Their distance from the transcription start site

RNA initiation – Promoter sequence determines transcription direction
- Both strands can encode genes
- The template strand can be different for each gene
- The promoter sequence states the direction of transcription, i.e. which strand is the template

Bacterial transcription – RNA elongation and termination

Bacterial transcription termination – Terminator sequence
- The terminator sequence is transcribed (unlike the promoter sequence)
- Transcribed RNA forms a hairpin that disrupts the newly formed mRNA-DNA interaction at TTTT

Eukaryotes are more complicated
- In eukaryotes RNA is modified by:
  - 5' capping
  - Splicing
  - 3' polyadenylation (poly(A) tail)

Eukaryotic mRNA - 5' Capping and 3' polyadenylation
- Increase stability against degradation by ribonucleases
- Facilitate transport from nucleus to cytosol
- Mark as mRNA

Eukaryotic mRNA - Introns and splicing
- Prokaryotic: mostly linear coding sequence (very rare examples of introns)
- Eukaryotic: coding sequences (exons - expressed) interrupted with introns

Eukaryotic mRNA – Alternative splicing
- The excision of introns can lead to various combinations of ...

Eukaryotic mRNA – Splicing mechanism
- Intron (PDB ID 1GRZ)
- Group I & II introns: self-splicing introns
- Spliceosomal introns & tRNA introns: require proteins to catalyse excision

A messenger RNA (mRNA) strand can include multiple genes
- Messenger RNAs (mRNA) encode proteins
- A single transcribed mRNA strand can include several genes that will produce several different proteins
- More common in prokaryotes than eukaryotes

Some genetic information can flow in the opposite direction
- Evidence of reverse transcription:
  - Retroviruses, e.g. Hep-B and HIV
  - Retrotransposons in the DNA of eukaryotic cells, which unlike retroviruses can't infect other cells
  - Telomeres: the telomerase enzyme reverse transcribes RNA into DNA

In vitro method to generate complementary DNA
- Reverse transcriptase enzymes can be used in vitro to construct an mRNA-cDNA double helix
- This allows the synthesis of more chemically stable DNA from RNA strands
- Can be used for sequencing

What we have learned
- How DNA is converted into RNA
- The central dogma of molecular biology
- Transcription on a sequence level
- Overview of transcription: initiation, elongation & termination
- Posttranscriptional processing: sequence capping & splicing
- Reverse transcription

CH4305/14 Macromolecules of Life
Lecture 6 – Bioinformatics

What we will learn
- What is bioinformatics and how can we use it
- Types of bioinformatic data
- Common tasks in bioinformatics
- Online databases: GenBank & UniProt
- Have a go!

The study of biological data
- "Omics aims at the collective characterization and quantification of pools of biological molecules" https://en.wikipedia.org/wiki/Omics

What we can study with biological data

There are open-access databases online
- National Center for Biotechnology Information (NCBI): http://www.ncbi.nlm.nih.gov/
- DNA Data Bank of Japan: https://www.ddbj.nig.ac.jp
- Protein Information Resource: https://proteininformationresource.org
- European Bioinformatics Institute (EBI)
- Swiss Institute of Bioinformatics

An extraordinarily rare microorganism has been observed!
- Bournemouth press release: Legendrea

GenBank: Search
- https://www.ncbi.nlm.nih.gov/genbank/
- Search all databases for Legendrea loyezae
- Labels provide information about the nucleotide sequence
- Data can also be viewed in the FASTA & graphics tabs

GenBank: Compare
- How different is the rRNA from Legendrea loyezae to others in nature?
- BLAST: Basic Local Alignment Search Tool
- Sequence homology: a gene/sequence is either homologous or not; the term describes an equivalent gene/sequence found in a different organism
- Sequence identity: describes how many nucleotides are the same when comparing two or more sequences/genes. High sequence identity = very similar sequences.

Choose an animal and find its genome
- Click on the names in presentation mode/pdf or go to https://www.ncbi.nlm.nih.gov/datasets/ and search by name
- Oryctolagus cuniculus
- Manis pentadactyla
- Castor canadensis
- Trichechus manatus
- Ornithorhynchus anatinus
- Dromiciops gliroides
- Halichoerus grypus
- Choloepus didactylus
- Erinaceus europaeus
- Homo sapiens

Genome overview
- Browse Taxonomy
- Genome can be classified in various ways: function, location in genome etc.
- Within the genome select a protein-coding gene
- Zoom in to look at the annotation
- Run a BLAST search

Other useful websites/databases
- https://www.uniprot.org/ : a database of protein sequences and function information
- https://www.expasy.org/ : a suite of tools for various bioinformatic applications

What we will learn
- What is bioinformatics and how can we use it?
- Types of bioinformatic data
- Common tasks in bioinformatics
- Online databases: GenBank & UniProt
- Have a go!
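The sequence identity definition from the GenBank: Compare slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two sequences are taken to be already aligned and of equal length, with "-" marking alignment gaps. It is not how BLAST works internally (BLAST first finds local alignments using a scoring scheme), and the function name is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of alignment columns holding the same nucleotide.

    Assumes both inputs are pre-aligned, equal-length strings in which
    '-' marks a gap. Columns containing a gap never count as identical.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    matches = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


if __name__ == "__main__":
    # Two made-up 8-nucleotide sequences differing at the last position.
    print(percent_identity("ACGTACGT", "ACGTACGA"))
```

Dividing by the full alignment length (gaps included) is one common convention; other tools divide by the shorter sequence or by the number of ungapped columns, so reported identities are only comparable when the same convention is used.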
CH4305/14 Macromolecules of Life
Lecture 7 – RNA Translation

What we will learn
- How RNA is converted into proteins
- Codons, codon tables and codon usage
- Reading frames
- tRNA construction and wobble hypothesis
- Ribosome, translation and polysomes

Recap from Lecture 4 – Translation is the final step in genetic information flow
- Translation is the conversion of RNA information into protein
- Currently no known examples of reverse translation
- Not all DNA is converted to RNA
- Not all RNA is converted to protein

The twenty canonical amino acids

Codons are translated into amino acids
- Each codon comprises 3 nucleotides
- 4 nucleobases (AGUC) over 3 positions: 4^3 = 64 combinations
- Only 20 amino acids, therefore there is redundancy
- This code is used throughout nature
- However, some minor differences exist between some organisms, and in organisms' mitochondria

The codon lookup table
- Provides a quick method to look up codons
- 1st nucleotide: select from the rows
- 2nd nucleotide: select from the columns
- 3rd nucleotide: select from the sub rows

First        Second nucleotide
nucleotide   T          C          A          G
T            TTT Phe    TCT Ser    TAT Tyr    TGT Cys
             TTC Phe    TCC Ser    TAC Tyr    TGC Cys
             TTA Leu    TCA Ser    TAA Stop   TGA Stop
             TTG Leu    TCG Ser    TAG Stop   TGG Trp
C            CTT Leu    CCT Pro    CAT His    CGT Arg
             CTC Leu    CCC Pro    CAC His    CGC Arg
             CTA Leu    CCA Pro    CAA Gln    CGA Arg
             CTG Leu    CCG Pro    CAG Gln    CGG Arg
A            ATT Ile    ACT Thr    AAT Asn    AGT Ser
             ATC Ile    ACC Thr    AAC Asn    AGC Ser
             ATA Ile    ACA Thr    AAA Lys    AGA Arg
             ATG Met    ACG Thr    AAG Lys    AGG Arg
G            GTT Val    GCT Ala    GAT Asp    GGT Gly
             GTC Val    GCC Ala    GAC Asp    GGC Gly
             GTA Val    GCA Ala    GAA Glu    GGA Gly
             GTG Val    GCG Ala    GAG Glu    GGG Gly

Codon usage varies between organisms
- Usage: share of that amino acid (or stop signal) encoded by the codon
- Frequency (‰): occurrences out of 1000 total codons
- With 64 codons, equal usage would give 1000/64 = 15.6 per 1,000 codons
- Therefore frequency values < 15.6 are less frequent than ...

                   E. coli          Human            Tobacco
Codon              Usage   Freq.    Usage   Freq.    Usage   Freq.
ACT Thr            19%     10.3     24%     12.8     39%     20.8
ACC Thr            40%     22.0     36%     19.2     19%     10.0
ACA Thr            17%     9.3      28%     14.8     33%     17.4
ACG Thr            25%     13.7     12%     6.2      9%      4.6
UAG Stop           9%      0.3      20%     0.5      19%     0.5

Open reading frame (ORF)
- Search the genome for regions with 50 or more consecutive codons without a stop codon

Reading frames

Mutations can affect the reading frame
- Substitutions change a nucleic acid residue but make no change to the reading frame
- Insertions and deletions:
  1. complete codons added/removed: addition or deletion of amino acids but same reading frame
  2. incomplete codon added/removed: change to reading frame

Decoding machinery
- Figure labels: canonical amino acids; endogenous tRNA (maximum 64 unique); endogenous synthetases (~20); ribosome; mRNA; protein

The structure of tRNA
- Anticodon: the three nucleotides that bind to the mRNA strand
- Amino acid arm: the 3' end of each unique tRNA that is covalently bound to its corresponding amino acid

tRNA synthetase – The enzyme that covalently attaches amino acids to tRNA

Charging/Loading amino acid onto tRNA – Step 1: formation of aminoacyl-AMP

Charging/Loading amino acid onto tRNA – Step 2: transfer of aminoacyl-AMP to tRNA
- Figure panels: Class I aminoacyl-tRNA synthetase, Class II aminoacyl-tRNA synthetase

Wobble Hypothesis
- Example: ACT, ACC, ACA and ACG all code for Thr
- Not all organisms have 64 different tRNA
- The same tRNA can bind codons with a different 3rd nucleotide from the mRNA strand
- Reduces tRNA degeneracy
- Faster dissociation from mRNA
- Minimises damage from mis-reading

Wobble Hypothesis – low specificity of first anticodon nucleotide
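The codon lookup table and the reading-frame slides above can be tied together with a short Python sketch. It builds the standard codon table from the same T, C, A, G ordering as the lookup table, then translates a made-up DNA sequence in each of the three forward reading frames. One-letter amino acid codes are used instead of the three-letter codes on the slide, "*" stands for Stop, and all names and the example sequence are illustrative assumptions rather than anything from the lecture.

```python
from itertools import product

BASES = "TCAG"
# One-letter amino acids in TCAG order (first base slowest, third fastest).
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"  # first base T
    "LLLLPPPPHHQQRRRR"  # first base C
    "IIIMTTTTNNKKSSRR"  # first base A
    "VVVVAAAADDEEGGGG"  # first base G
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq: str, frame: int = 0) -> str:
    """Translate DNA or RNA in one forward reading frame (0, 1 or 2).

    Incomplete codons at the end are ignored; '*' marks a stop codon.
    """
    if frame not in (0, 1, 2):
        raise ValueError("frame must be 0, 1 or 2")
    dna = seq.upper().replace("U", "T")
    return "".join(
        CODON_TABLE[dna[i:i + 3]] for i in range(frame, len(dna) - 2, 3)
    )


def reading_frames(seq: str) -> list[str]:
    """Translations of the three forward reading frames."""
    return [translate(seq, frame) for frame in range(3)]


if __name__ == "__main__":
    # ATG, then the four Thr codons from the wobble slide, then a stop codon.
    example = "ATGACTACCACAACGTAA"
    for frame, protein in enumerate(reading_frames(example)):
        print(f"frame {frame}: {protein}")
```

Shifting the frame by a single nucleotide gives a completely different peptide, and frame 1 of the example starts with a stop codon. That is why insertions or deletions of incomplete codons are so disruptive, and why an ORF search looks for long stop-free runs.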
Ribosome – nature’s protein synthesiser 136 Ribosome - The three steps of translation 137 Ribosome - Initiator tRNA Start codon is AUG Codes methionine Initiator tRNA has modified Met in bacteria/mitochondria/chloroplast All newly synthesised proteins will have Met at N terminus 138 Ribosome – polypetide elongation 3 sites for tRNA ribosome Aminoacyl binding site Peptidyl site Exit site tRNA move along A→P→E 139 Polysome – Multiple ribosomes bound to a single mRNA strand Multiple ribosomes can bind to the same mRNA each translating the same gene into protein 140 Ribosome - termination tRNA of final amino acid at C terminus Release factor UAG stop 141 mRNA What we have learned How RNA is converted into proteins Codons, codon tables and codon usage Reading frames tRNA construction and wobble hypothesis Ribosome, translation and polysomes 142 CH4305/14 Macromolecules of Life Lecture 8 – Recombinant Technology What we will learn How we can make proteins in the lab Define recombinant technology Host cells Steps involved in recombinant protein expression Induction using IPTG Challenges faced when using recombinant technology 144 Recombinant technology Artificially created DNA that combines sequences that do not occur together in nature Heterologous Gene Expression creates Transgenic organisms Basis of much of the modern molecular biology Molecular cloning of genes Transgenic food, animals … Over-production of proteins 145 Recombinant technology - Definition Heterologous Expression means producing RNA/protein from a cloned coding sequence (often cDNA – no introns) in a host cell which is unlike the cells in which the protein is naturally expressed e.g. expressing human insulin in E. coli cells. Today, most cloned genes can be expressed in bacterial cells – at least to a level which is just detectable. However, achieving high levels of expression AND eventually producing biologically active protein cannot be guaranteed. 146 Host cell Escherichia coli (E. 
coli) most commonly used bacteria ≥ target protein is 10% (w/w) of total cellular protein ≈ 15 mg recombinant protein (un-purified) per Litre of bacterial culture (where the E. coli cultures contain 107 – 109 cells per ml) However sometimes prokaryotic cells are not suitable for producing a protein Yeast cell (Saccharomyces cerevisiae) Insect cell (S2, SF9, SF21 & high five) Chinese hamster ovary cell (CHO) 147 Steps of recombinant protein expression 1. DNA isolation/synthesis and amplification 2. Cloning into vector 3. Transformation of cells 4. Selection of transformants 5. Culture cells, induce, lyse and purify 148 Step 1 - DNA isolation and amplification Isolate a specific gene from Synthetic gene constructed the source organism and from oligonucleotide synthesis amplify it by PCR Luc Viatour 149 Step 2 – Plasmids as a vector Plasmids have: Circular DNA Origin of replication: can replicate autonomously in bacteria Selectable marker: carry antibiotic resistance genes or chromophore expressing genes The ability to clone DNA up to 15,000 bp Expression vectors must have: Promoter & terminator sequences Operator sequences Code for ribosome binding site (RBS) 150 Step 2 – Linearising plasmids using nucleases Staggered cuts give rise to sticky ends Straight cuts give rise to blunt ends Nucleases – enzymes that can hydrolyse nucleic acid backbone Endonucleases – cut double strands in the middle of the sequence (rather than at termini of linear DNA) 151 Step 2 – Inserting target gene into plasmid Restriction sites – DNA sequence recognised by restriction endonuclease Same restriction sites included on plasmid and gene = complementary sticky ends formed between gene and plasmid Ligase – enzyme that can form a phosphodiester between two 152 Step 2 – Inserting target gene into plasmid If the same restriction site used for both ends of gene then it can insert backwards too 153 Step 3 – Transformation of cells Transformation – introducing exogenous genetic material into 
cells 154 Step 4 – Selection of transformants Antibiotics kill bacteria Plasmids can carry genes that give host bacterium a resistance against antibiotics Allows growth (selection) of bacteria that have taken up the plasmid kanamycin 155 Step 5 – Culture cells, induce, lyse and purify 156 Stress induction on host cell Unnatural situation which places stresses on the cell’s ability to grow and survive The more toxic a recombinant protein is to the cell, the greater the advantage to any cell which reduces its level of expression e.g. by mutations in components of the expression system. Promoters must be STRONG and INDUCIBLE 157 pET Expression System – IPTG induction Repressor – A protein that binds to DNA/RNA and blocks transcription/translation Lac repressor – protein that binds to specific DNA sequence next to Lac promoter blocking transcription T7 RNA polymerase – separate to E Coli. polymerase 158 pET Expression System – IPTG induction Repressor – A protein that binds to DNA/RNA and blocks transcription/translation Lac repressor – protein that binds to specific operator DNA sequence next to Lac promoter blocking transcription T7 RNA polymerase – separate to E Coli. polymerase 159 pET vector also contains Challenges faced when using recombinant technology We are expressing a protein in a new environment which can cause problems 1. Protein might be highly toxic to new cell 2. Protein might be insoluble leading to inclusion bodies Expressing a eukaryotic protein in a prokaryotic cell (E. Coli) can also cause unique problems 3. Post-translation modification not included, e.g. glycosylation and cleavage of terminal amino 160 Removing introns from gene using complementary DNA (cDNA) As described in Lecture 5 on DNA transcription we can use reverse transcription to convert RNA to DNA 1. Purify mature mRNA from eukaryotic cells that has already been spliced 2. 
Generate double-stranded complementary DNA from mature mRNA

What we have learned
- How we can make proteins in the lab
- Define recombinant technology
- Host cells
- Steps involved in recombinant protein expression
- Induction using IPTG
- Challenges faced when using recombinant technology

CH4305/14 Macromolecules of Life
Lecture 9 – Newest research

What we will learn
- Some of the most recent research using macromolecules of life
- Nucleic-acid based vaccines
- De novo protein design
- Research from the Rhys Lab (Ben Orton)

Nucleic-acid based vaccines
- Viral-vector based vaccines. Example: Oxford–AstraZeneca COVID-19 vaccine. Nucleic acid: DNA. Enters: transfection
- mRNA-based vaccines. Example: Pfizer–BioNTech COVID-19 vaccine. Nucleic acid: mRNA. Enters: phagocytosis

Steps to engineer a nucleic acid vaccine
1. Identify a target antigen and epitope (part of the antigen that the immune system will recognise and attack)
2. Design a DNA/mRNA sequence that encodes the epitope
3. Assemble the DNA/mRNA into a delivery vector

Nucleic acid vaccine – Identify antigen
- Spike protein – essential protein for SARS-CoV-2 to bind to human cells
- Very unlikely for virus to evolve a different way to transfect cells
- Optimise protein sequence – two mutations to proline amino acids to 'lock' protein
Cryo-EM structure of the 2019-nCoV spike in the prefusion conformation, Volume: 367, Issue: 6483,

Nucleic acid vaccine – Optimise sequence
- UTR – "Untranslated" region, often upstream of start codon, improves translation
Jackson, N.A.C., Kester, K.E., Casimiro, D. et al. The promise of mRNA vaccines: a biotech and industrial perspective. npj

Nucleic acid – mRNA vaccine production
1. Molecular cloning of DNA or amplification of region of interest with PCR, and purification of DNA
2. In vitro transcription (IVT) of mRNA (remember – no primers required for transcription)
Image: Khan Academy
Vaccine. 2021 39 2190–2200.
Nucleic acid vaccine – Optimisation of delivery vector
Viral-vector vaccines
- Evolved to efficiently get into cells, but antigen DNA also needs to be transported to the nucleus and translated
1. Remove some genes so it can't replicate
2. Alter DNA to improve delivery of DNA into nucleus
mRNA vaccines
- mRNA is chemically unstable and highly negatively charged; need to get it into the cells' cytoplasm
Moderna lipid composition
1. 1,2-distearoyl-sn-glycero-3-phosphocholine (DSPC)
2. cholesterol
3. PEG2000-DMG (polyethylene glycol (PEG) 2000-dimyristoyl glycerol (DMG))

Advantages and disadvantages of nucleic acid vaccines
Advantages
- Very fast to design (Moderna took 2 days!)
- Very fast to produce (Pfizer–BioNTech vaccine now takes 22 days for production)
- Simple production process
- PTMs happen in the body
- Non-infectious
- Unlikely to be integrated into genome*
Disadvantages
- Currently require storing at cold temperatures
- Can't be used for non-protein based antigens such as bacterial polysaccharides
*mRNA based vaccines

The future of nucleic acid vaccines
- Increased thermostability, e.g. Moderna mRNA-1283 Next generation (2–5 °C)
- Vaccines against "latent" viruses, e.g. CMV, EBV, HSV & HIV
- Cancer vaccines, e.g. BioNTech BNT111 – Advanced Melanoma
- Personalised vaccines, e.g. individualized mRNA cancer vaccines

Freedom to create any DNA we want!
- Oligonucleotide synthesis
- Assembly PCR

What is protein design and why does it fascinate me?
- Selecting an amino acid sequence for a desired structure/function
- Protein design has many parallels to synthetic chemistry
- Learn how to build
- Improve our understanding of the world around us
Huang P.-S. et al. The coming of age of de novo protein design.
Nature 537,

It is now possible to predict protein structure accurately on the computer
Scientists like me are trying to do the reverse!
Nature 596, 583–589 (2021). https://doi.org/10.1038/s41586-021-

The inverse protein folding problem
AEGAILWRTVEVENMQWIRRTIPLALARAREALKCVVVETPLLALEKLKAEGAILWRTVEVENMQWIRRTIPLALARAREALKCVVVETPLLALEKLK
AlphaFold2
- Protein design: select amino acid sequences based on a desired structure or function
- Protein engineering: modify an existing protein function

Scientists can computationally design totally new proteins to bind to proteins in our body
Wang et al. Science 2022 Vol 377 387-394

Mizoroki-Heck cross-coupling
Efficient and versatile C-C bond formation
Traditional Mizoroki-Heck cross-coupling
- Palladium
- 80–150 °C
- Toxic ligands and solvents
- Lack of stereoselectivity
- Side reactivity
CrossArMs objectives
- Pd-free homogeneous catalysis
- 3d transition metals
- Under ambient temperatures
- Robust and efficient enzyme
- High substrate promiscuity

Establishing the chemistry
Pd-free Heck reaction conditions screen
[Figure: reaction scheme for intramolecular cross-coupling; conditions: metal, ligand, aq./org., base, 90 °C, 16 h; labelled compound: 3-methylindole (skatole)]
- Bidentate N-/O-donor ligands
- 3d transition metals: Fe, Co, Ni, Cu
- Green solvents
- OpenTron OT-2 automation (Source: OpenTron Technologies)

What we've learned
- Some of the most recent research using macromolecules of life
- Nucleic-acid based vaccines
- De novo protein design
- Research from the Rhys Lab (Ben Orton)
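
Worked example (not from the lecture slides): step 2 of engineering a nucleic acid vaccine, "design DNA/mRNA sequence that encodes for the epitope", can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions. The function names (encode_peptide, translate, to_dna), the single codon chosen for each amino acid and the example peptide are all hypothetical choices made for this sketch. Real vaccine sequences are codon-optimised and carry UTRs, a cap and a poly(A) tail, none of which is modelled here.

```python
"""Back-translate a short peptide into an mRNA open reading frame."""

# One codon per amino acid, taken from the standard genetic code.
# Which synonymous codon is used for each residue is an arbitrary,
# illustrative choice (an assumption), not an optimised design.
CODON_FOR = {
    "A": "GCC", "R": "AGA", "N": "AAC", "D": "GAC", "C": "UGC",
    "Q": "CAG", "E": "GAG", "G": "GGC", "H": "CAC", "I": "AUC",
    "L": "CUG", "K": "AAG", "M": "AUG", "F": "UUC", "P": "CCC",
    "S": "AGC", "T": "ACC", "W": "UGG", "Y": "UAC", "V": "GUG",
}
STOP_CODONS = {"UAA", "UAG", "UGA"}

# Reverse lookup covering only the codons chosen above.
AMINO_ACID_FOR = {codon: aa for aa, codon in CODON_FOR.items()}


def encode_peptide(peptide: str, stop: str = "UGA") -> str:
    """Return an mRNA sequence encoding the peptide, ending in a stop codon."""
    if stop not in STOP_CODONS:
        raise ValueError(f"{stop!r} is not a stop codon")
    try:
        codons = [CODON_FOR[aa] for aa in peptide.upper()]
    except KeyError as err:
        raise ValueError(f"unknown amino acid: {err.args[0]!r}") from err
    return "".join(codons) + stop


def translate(mrna: str) -> str:
    """Translate mRNA produced by encode_peptide back into a peptide.

    Reads codon by codon from position 0 and stops at the first stop codon.
    Codons outside the reduced table above raise KeyError.
    """
    residues = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon in STOP_CODONS:
            break
        residues.append(AMINO_ACID_FOR[codon])
    return "".join(residues)


def to_dna(mrna: str) -> str:
    """Return the DNA coding strand corresponding to the mRNA (U -> T)."""
    return mrna.replace("U", "T")


if __name__ == "__main__":
    epitope = "MKPV"  # hypothetical four-residue peptide, not a real epitope
    mrna = encode_peptide(epitope)
    print("mRNA:", mrna)
    print("DNA coding strand:", to_dna(mrna))
    print("Round trip:", translate(mrna))
```

Because the genetic code is degenerate, many different mRNA sequences encode the same peptide. The sketch fixes one codon per residue purely so that the round trip back to the peptide is easy to check.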