UM1010 Genes & Transcription PDF

UM1010: Integrated Science and Clinical Medicine 1
Semester 1
Genes and Transcription
With credit to Dr Michael Porter and Dr Elaine Browne for a few slide contents.
Dr Emyr Bakker
SCHOOL OF MEDICINE

LEARNING OUTCOMES
Identify the structure and define the function and replication of DNA
Demonstrate an understanding of the basic principles of genetic inheritance
Demonstrate an understanding of the organisation of genes and gene transcription
Identify the process of gene translation and protein post-translational modification

THE CENTRAL DOGMA OF MOLECULAR BIOLOGY: THE KEY OF THIS LECTURE
The flow of genetic information from DNA to RNA to proteins can be represented by a scheme called the central dogma of molecular biology.
We will discuss next year why this classical model is limited, but for now we’re focusing on fundamentals.

PART 1: THE BASICS OF DNA STRUCTURE

STARTER: HOW DO YOU STORE COMPUTER INFORMATION?
Pause the video for a moment and take a second to think: how many different ways do you store information?
I can’t imagine any of you use floppy disks or DVDs these days, but how about USBs? External hard drives? Cloud storage?

THE SURPRISING UNIFORMITY OF BIOLOGICAL DATA STORAGE
Despite the many ways you can store computer information, living beings on Earth all store their genetic information in one way: through DNA
There are an estimated 10-100 million different species on Earth, and cells have arguably been evolving and diversifying for over 3.5 billion years
It might therefore seem unlikely that all living things would use DNA, yet here we are
Cells are the building blocks of life, but it’s our genetics that underpin our cells

INTRODUCTION TO DNA
DNA stands for deoxyribonucleic acid
DNA is made from simple subunits called nucleotides.
Each nucleotide has:
A sugar molecule (deoxyribose)
A phosphate molecule
A nitrogen-containing side-group, or base
A nucleoside is the nucleotide without the phosphate
(Figure 1-2A, Molecular Biology of The Cell, Sixth Edition)
There are four bases:
Adenine (A)
Guanine (G)
Cytosine (C)
Thymine (T)
Thymine is replaced by Uracil (U) in RNA
Thymine has higher resistance to mutation, so it is used in DNA. RNA is short-lived, so it matters less there.

OUR NUCLEOTIDES CAN BE DIVIDED INTO TWO MAJOR CLASSES
Adenine and Guanine are derivatives of purine, whilst Cytosine, Thymine and Uracil are derivatives of pyrimidine
Do not learn the molecular structures; you will not be examined on these.

DNA STRUCTURE
A single strand of DNA (a DNA “chain”) consists of nucleotides joined together by sugar-phosphate linkages
Whilst these diagrams make it look linear, individual sugar-phosphate units are asymmetric, which gives the backbone of the strand polarity (directionality).
This directionality guides how DNA is interpreted and copied, just as written text is read in a consistent order
(Figure 1-2B, Molecular Biology of The Cell, Sixth Edition)

DNA ASYMMETRY
Nucleic acid chains are represented by base abbreviations, e.g. AGCT.
Nucleic acid chains have directionality in that the two ends are different.
One end has a phosphoryl group attached to the 5’ carbon atom of the sugar and the other end has a free hydroxyl group attached to the 3’ carbon of the sugar.
Nucleic acid sequences are written in the 5’ to 3’ direction.
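The base classes and the 5’ to 3’ writing convention can be sketched in a few lines of Python. This is only an illustration: the helper names `classify_base`, `to_rna` and `label_ends` are hypothetical and not part of the lecture material.

```python
# Illustrative sketch: purine/pyrimidine classes, the T -> U substitution
# in RNA, and the convention of writing sequences 5' to 3'.
# All function names here are made up for this example.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T", "U"}


def classify_base(base: str) -> str:
    """Return 'purine' or 'pyrimidine' for a single base letter."""
    base = base.upper()
    if base in PURINES:
        return "purine"
    if base in PYRIMIDINES:
        return "pyrimidine"
    raise ValueError(f"unknown base: {base!r}")


def to_rna(dna: str) -> str:
    """Write a DNA sequence with uracil (U) in place of thymine (T)."""
    return dna.upper().replace("T", "U")


def label_ends(seq: str) -> str:
    """Label the two different ends of a sequence written 5' to 3'."""
    return f"5'-{seq.upper()}-3'"


if __name__ == "__main__":
    for base in "AGCT":
        print(base, classify_base(base))
    print(to_rna("AGCT"))
    print(label_ends("AGCT"))
```

Because the two ends differ, AGCT and TCGA are different molecules even though they contain the same bases, which is why the 5’ to 3’ labelling is kept explicit here.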
NUCLEOTIDES PAIR WITH EACH OTHER
Guanine pairs with cytosine
Adenine pairs with thymine
Remember that although this is displayed as linear, there is polarity in the strands
(Adapted from Figure 1-2C, Molecular Biology of The Cell, Sixth Edition)

BASE PAIRING MEANS OUR DNA IS DOUBLE-STRANDED
Nucleotides WITHIN an individual strand are bonded by strong covalent bonds (phosphodiester bonds, to be exact)
Nucleotides BETWEEN individual strands are held together more weakly by hydrogen bonds

THE TWO STRANDS TOGETHER PRODUCE THE FAMOUS ‘DOUBLE HELIX’
DNA is called antiparallel (also written anti-parallel) because the strands run in opposite directions (see image on the right)

DNA IS STORED IN CHROMOSOMES
Human DNA exists in a highly condensed form in chromosomes
Chromosomes contain long lists of genes
The scale of DNA condensation is incredible: there are about 48 million nucleotide pairs of DNA in chromosome 22. If this were laid out as one long perfect double helix, it would stretch out to about 1.5 cm in length.
And yet the nucleus, which holds all chromosomes, is only about 6 μm in diameter
And there are 10,000 μm in 1 cm!
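The compaction figures can be checked with some quick arithmetic. The sketch below takes the slide values (48 million nucleotide pairs, a 6 μm nucleus) and adds one assumption that is not on the slide: a rise of roughly 0.34 nm per base pair along the double helix.

```python
# Back-of-envelope check of the DNA compaction figures.
# ASSUMPTION (not from the slides): about 0.34 nm of helix per base pair.
NM_PER_BASE_PAIR = 0.34
BASE_PAIRS_CHR22 = 48_000_000   # from the slide
NUCLEUS_DIAMETER_UM = 6         # from the slide
UM_PER_CM = 10_000              # from the slide
NM_PER_CM = 10_000_000          # 1 cm = 10^7 nm

length_nm = BASE_PAIRS_CHR22 * NM_PER_BASE_PAIR
length_cm = length_nm / NM_PER_CM
length_um = length_cm * UM_PER_CM

# How many nucleus diameters long is the fully extended helix?
fold = length_um / NUCLEUS_DIAMETER_UM

print(f"Extended length: {length_cm:.2f} cm")
print(f"Roughly {fold:,.0f} times the diameter of the nucleus")
```

Under that assumption the extended helix comes out at about 1.6 cm, in line with the roughly 1.5 cm quoted, and a couple of thousand times longer than the nucleus is wide. Remember this is one chromosome out of many sharing the same nucleus.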
Specialised proteins bind to and fold the DNA, creating several levels of organisation such as coils and loops

THE PROTEINS BEHIND CHROMOSOME STRUCTURE
DNA-binding proteins that form chromosomes are traditionally divided into two classes:
Histone proteins: these are responsible for the nucleosome, the basic level of chromosome packaging, which has a “bead-like” structure you can see on the figure below
Non-histone chromosomal proteins
Chromatin refers to the mixture of DNA and proteins that forms the chromosomes
Each class of protein contributes about the same mass to the chromosome as the DNA does, so a chromosome is about 1/3 DNA and 2/3 protein

NUCLEOSOME STRUCTURE
Each nucleosome (each “bead on the DNA string”) contains a histone OCTamer (oct = eight) consisting of two of each of the below:
Histone H2A
Histone H2B
Histone H3
Histone H4
The double-stranded DNA then winds around the octamer, as you can see on the right.
Histones can be modified through post-translational mechanisms (e.g. acetylation, phosphorylation) to alter the accessibility of DNA. You can read more here (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4566235/) if you are curious, but the main thing to remember is that histones can influence how accessible DNA is, which influences gene expression and protein production
(Figure 1, Chen et al, https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4454312/)

PART 2: DNA REPLICATION AND GENETIC INHERITANCE

ALL CELLS REPLICATE THEIR DNA THROUGH TEMPLATED POLYMERISATION
Because DNA is the information store for hereditary purposes and so is crucial to life, we need robust mechanisms to ensure it can be appropriately copied and replicated
We mentioned earlier that the bonds between nucleotides WITHIN a strand are strong, but the bonds BETWEEN strands are weaker
This allows the DNA to be ‘pulled apart’ without damaging the integrity of individual strands
New strands are synthesised against a template (old strand), and you can see from the figures that this happens in a semi-conservative manner: each new double helix keeps one old strand and gains one new strand

KEY UNDERPINNINGS OF DNA REPLICATION
DNA replication relies on base complementarity (e.g. G-C and A-T)
The separation of the double-stranded DNA allows each nucleotide on each individual strand to be recognised by a free (unpolymerised) nucleotide (e.g. if the template strand has guanine, a free cytosine nucleotide can pair with it and be added to the newly synthesised strand)

THE CHEMISTRY OF DNA REPLICATION
Primer strand refers to the ‘new’ DNA strand
Synthesis happens in a 5’ to 3’ direction, facilitated by DNA polymerase (which, really, can be one of several enzymes). It promotes the formation of a phosphodiester bond between the 3’ OH of the primer strand and the innermost phosphate of the incoming dNTP
The synthesis is driven by a large, favourable free-energy change caused by the release of pyrophosphate and its subsequent hydrolysis to two molecules of inorganic phosphate
Remember: nucleoside = base and sugar; nucleotide = base, sugar and phosphate

THE ANTIPARALLEL PROBLEM & THE DNA REPLICATION FORK
Early analyses (in the 1960s) of whole replicating chromosomes revealed a Y-shaped structure which was a localised region of replication. This became known as the “replication fork”
Humans, as eukaryotic organisms, have multiple replication origin sites on each chromosome and therefore multiple replication forks per chromosome
At replication forks, a multienzyme complex (including DNA polymerase) synthesises the DNA of both new daughter strands
However, we established earlier that DNA is antiparallel. If both strands were to be replicated continuously, one of them would have to grow 3’ to 5’, and we’d need separate polymerase machinery, as the DNA polymerase we have only works 5’ to 3’.
So how can DNA grow in the 3’ to 5’ direction?
The answer is Okazaki fragments

OKAZAKI FRAGMENTS
These are transient pieces of DNA that are 100-200 nucleotides long in eukaryotes
They are polymerised only in the 5’ to 3’ direction and are joined together after their synthesis to create long DNA chains
The antiparallel nature of DNA and the existence of Okazaki fragments lead to an asymmetric structure of the DNA replication fork: the leading strand is the one that is synthesised continuously (as it runs 5’ to 3’), whilst the other strand is synthesised discontinuously and is known as the lagging strand
This ‘back-stitching’ approach means that our polymerase only needs to synthesise 5’ to 3’

OKAZAKI FRAGMENTS VISUALISED
Left: a replication fork showing newly-synthesised DNA in red and arrows showing the 5’ to 3’ direction of DNA synthesis
Because both daughter strands are polymerised in the 5’ to 3’ direction, the DNA synthesised on the lagging strand must be made initially as a series of short DNA molecules (Okazaki fragments)
Right: the same replication fork a short time later, showing sequential synthesis of Okazaki fragments

TYPES OF DNA POLYMERASE & FUNCTIONS
There are five main types of human DNA polymerase to remember:
DNA Polymerase α (alpha), δ (delta) and ε (epsilon): for nuclear DNA
DNA Polymerase γ (gamma): for mitochondrial DNA
DNA Polymerase β (beta): for DNA repair & gap-filling
We have more, detailed here (https://febs.onlinelibrary.wiley.com/doi/full/10.1111/febs.15852) if you wish to read more
All synthesise in the 5’ to 3’ direction
As you see above, DNA polymerase can play a role in DNA repair, but DNA repair is beyond the scope of this lecture

BACK TO BASICS: FUNDAMENTAL GENETIC NOMENCLATURE
Phenotype: The physical appearance of a trait
Gene: The unit of inheritance behind a trait
Allele: The different forms of a gene
Genotype: The genetic make-up of an individual. We each have two alleles per gene, one from each parent.
Homozygous: Same two alleles for a gene
Heterozygous: Different alleles for the same gene
Dominant allele: Expression manifests whether there’s one or two copies of it. Expressed as a capital letter (e.g. T)
Recessive allele: Expression manifests only if there are two copies of it. Expressed as a lower-case letter (e.g. t)

CLASSICAL GENETIC INHERITANCE: MENDELIAN GENETICS & PUNNETT SQUARES
In reality, genetics and inheritance are much more complicated than what is shown here; you will learn more about this in Year 2. For now, focus on the basics.
Punnett square of the Tt and Tt plants:
    | T  | t
T   | TT | Tt
t   | Tt | tt

BASIC GENETIC PEDIGREE VISUALISATION

PART 3: TRANSCRIPTION

REMEMBER THE CENTRAL DOGMA

INTRODUCTION TO TRANSCRIPTION
In order for DNA to perform its information-carrying function, it can’t just copy itself; it needs to EXPRESS that information
It does so through the central dogma (DNA -> RNA -> Protein)
Transcription is templated polymerisation against DNA, like replication, but instead of producing deoxyribonucleic acid (DNA) we produce ribonucleic acid (RNA). It is the first step of gene expression and takes place in the nucleus
Deoxyribose vs ribose is shown below. Deoxyribose lacks the hydroxyl group on the 2’ carbon that ribose has, which is why it’s called deoxyribose

MANY RNA TRANSCRIPTS ARE PRODUCED FROM A SMALLER AMOUNT OF DNA
A cell (barring mutation events) contains a fixed set of DNA, its genetic archive
DNA for a single gene can guide the synthesis of many RNA transcripts
Additionally, different RNA transcripts can be made by transcribing different parts of the DNA, meaning different cells can use the same information store in different ways

RNA IS A FLEXIBLE STRUCTURE
Whilst DNA is most commonly double-stranded, RNA is (typically) single-stranded.
Like DNA, RNA has nucleotide subunits linked together by phosphodiester bonds.
Being single-stranded gives RNA molecules a flexible backbone, allowing the polymer chain to bend back on itself and self-bond
This flexibility and self-bonding can lead to multiple distinct shapes of RNA, which are guided by the RNA’s sequence
These shapes can impact function, and thus RNA folding is an extensive field of study
Correct RNA folding is crucial to health, and RNA misfolding can lead to disease. You can read here (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4508199/) if you want to learn more.

SEVERAL KINDS OF RNA PLAY A ROLE IN GENE EXPRESSION, AND GENE EXPRESSION IS NOT UNIFORM
There are many species of RNA that perform a variety of functions. However, the three most abundant classes of RNA are ribosomal RNA (rRNA), messenger RNA (mRNA) and transfer RNA (tRNA).
It is also important to be aware that different genes can be transcribed and translated with different efficiencies, allowing the cell to make vast amounts of some proteins and tiny amounts of others (illustrated below)

TRANSCRIPTION PRODUCES ssRNA THAT IS COMPLEMENTARY TO A DNA STRAND
The sequence of bases in the RNA molecule produced is the same as the sequence of bases in the non-template DNA strand (with U in place of T)
Whilst DNA replication needed DNA polymerase, transcription requires RNA polymerase
Whilst DNA replication needed dNTPs, transcription requires the four ribonucleoside triphosphates (i.e. NTPs [ATP, GTP, etc.]), and it is the hydrolysis of these high-energy bonds that provides the energy for the reaction

KEY FACTS ABOUT TRANSCRIPTION
Transcription begins with the opening and unwinding of a small portion of the DNA double helix; one strand then acts as a template
Like DNA replication, the synthesis of the RNA strand is reliant on base complementarity
Unlike DNA replication, the RNA strand does not remain hydrogen-bonded to the template DNA strand; instead, the RNA chain is displaced and the DNA helix re-forms
RNA transcripts are copied from only a limited region of the DNA strand, and so are much smaller (e.g. a DNA molecule in a chromosome could be up to 250 million nucleotide-pairs long, whilst most RNAs are no more than a few thousand nucleotides long [and many are shorter])

HUMANS, AS EUKARYOTES, HAVE SEVERAL RNA POLYMERASES
Type of RNA Polymerase | Example Transcripts
RNA Polymerase I | Ribosomal RNAs (transcribed in the nucleolus of the nucleus) [5.8S, 18S, 28S]
RNA Polymerase II | All protein-coding genes
RNA Polymerase III | tRNA genes, 5S rRNA
RNA polymerase II needs general transcription factors, and can also use specific transcription factors
Transcription factors can bind to promoter or enhancer elements in DNA (regulatory elements that modulate gene expression). Promoters are proximal, whilst enhancers are distal.
Eukaryotic transcription initiation must take place on DNA that is packed into nucleosomes

PROMOTER STRUCTURE
Promoters are specific DNA sequences that direct RNA polymerase to the proper initiation site.
There are often variations in the sequence of a promoter for different genes.
The sequence made up of the most common nucleotide at each position across these variants is called the consensus sequence.

TRANSCRIPTION INITIATION, ELONGATION, AND TERMINATION
As mentioned, RNA polymerase II requires a set of general transcription factors, e.g. TFIID, etc.; TFII stands for “transcription factor for RNA polymerase II”, so you can see the naming is somewhat arbitrary
Eukaryotic transcription initiation is shown on the right. In essence:
The promoter contains a DNA sequence known as a TATA box, about 25 nucleotides upstream of (before) the transcription initiation site
TFIID recognises and binds the TATA box
This permits adjacent binding of TFIIB
The other general transcription factors and RNA polymerase II itself can then assemble at the promoter
TFIIH uses energy from ATP hydrolysis to pry the DNA strands apart at the transcription start site
TFIIH also phosphorylates RNA polymerase II, so it can synthesise the RNA strand (elongation)
Termination occurs when the polymerase encounters a termination signal, commonly AAUAAA

THE GENERAL TRANSCRIPTION FACTORS AND THEIR ROLE IN TRANSCRIPTION INITIATION
Name | Example Function | Tip to Remember Function
TFIID | Recognises TATA box | TFIID detects the TATA box
TFIIB | Helps position RNA polymerase II | TFIIB brings RNA polymerase II to the right place
TFIIF | Stabilises RNA polymerase II interactions with other TFIIs and attracts TFIIE and TFIIH | TFIIF facilitates RNA polymerase II interaction & finds other TFs
TFIIE | Attracts and regulates TFIIH | TFIIE entices TFIIH
TFIIH | Unwinds DNA through ATP hydrolysis and phosphorylates RNA polymerase II | TFIIH uses ATP hydrolysis and helps RNA polymerase II elongate

RNA PROCESSING: MOVING FROM ‘THE PRIMARY RNA TRANSCRIPT’ TO mRNA
Eukaryotic genes are discontinuous, with coding regions called exons interrupted by noncoding regions called introns.
Pre-messenger RNA contains exons and introns. It is first modified by the addition of a 5’ cap and a poly(A) tail at the 3’ end.
The introns are spliced out by large complexes called spliceosomes to generate mature mRNA.
Introns almost always begin with a GU and end with an AG.
Figure: Transcription and processing of the β-globin gene. The gene is transcribed to yield the primary transcript, which is modified by cap and poly(A) addition. The introns in the primary RNA transcript are removed to form the mRNA. (Adapted from Figure 6-20, Molecular Biology of the Cell 6e)

PART 4: TRANSLATION AND POST-TRANSLATIONAL MODIFICATIONS

BASICS OF TRANSLATION
Takes place outside of the nucleus; mRNA is transported out through the nuclear pore complex. Free ribosomes are the site of mRNA translation whilst rough endoplasmic reticulum is the site of protein production, folding, quality control and dispatch
Translation is the second step of gene expression, where mRNA (nucleic acid sequence information) is translated into proteins (amino acid sequence information) by ribosomes
The genetic code links these two types of information:
Three nucleotides, called a codon, encode an amino acid
The code is nonoverlapping
The code has no punctuation
The code has directionality. It is read from the 5’ end of the mRNA to the 3’ end
The code is degenerate in that some amino acids are encoded by more than one codon
RNA is central to translation; mRNA, of course, encodes the genetic information, whilst rRNA catalyses the formation of peptide bonds and tRNA translates nucleic acids into ‘protein language’

THE GENETIC CODE: CODONS TO AMINO ACID
Amino acids all have an amino group (-NH2) on one end and an acid group (-COOH) on the other. The R group varies.
You may hear the terms N-terminus (beginning of polypeptide) and C-terminus (end of polypeptide) when discussing proteins; these refer to positions in the amino acid sequence.
Amino acids are joined sequentially by peptide bonds between the acid group of one and the amino group of another
Codon table: Do NOT memorise! For reference only.
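The properties of the genetic code listed above (triplet, nonoverlapping, degenerate) can be illustrated with a short Python sketch. The table is the standard genetic code written compactly with one-letter amino acid symbols and `*` for stop; the names `CODON_TABLE`, `read_codons` and `codons_per_residue` are hypothetical, chosen for this example only.

```python
from collections import Counter

# Standard genetic code, written compactly: codons are ordered U, C, A, G
# at each of the three positions; '*' marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def read_codons(mrna: str) -> list[str]:
    """Split an mRNA (written 5' to 3') into non-overlapping triplets."""
    usable = len(mrna) - len(mrna) % 3  # ignore an incomplete final codon
    return [mrna[i:i + 3] for i in range(0, usable, 3)]


# Degeneracy: count how many codons encode each amino acid (or stop).
codons_per_residue = Counter(CODON_TABLE.values())

if __name__ == "__main__":
    print(read_codons("AUGGCUUGG"))
    print("Leucine codons:", codons_per_residue["L"])
    print("Methionine codons:", codons_per_residue["M"])
    print("Stop codons:", codons_per_residue["*"])
```

Leucine has six codons whilst methionine has only one, which is what "degenerate" means in practice. As with the slide table, this is for reference rather than memorisation.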
COLINEARITY OF DNA, RNA, AND AMINO ACID SEQUENCES
Coding DNA strand: 5’ to 3’
mRNA (equivalent to the coding strand): 5’ to 3’
Protein: NH2 to COOH

MESSENGER RNA CONTAINS START AND STOP SIGNALS FOR PROTEIN PRODUCTION
Messenger RNA is translated on ribosomes.
The first codon is almost always AUG, which codes for methionine.
In eukaryotes, the AUG nearest the 5’ end is the initiator codon.

TRANSFER RNAs ARE THE BRIDGE
Transfer RNA (tRNA) molecules function as adaptor molecules between a codon and an amino acid.
A portion of the tRNA called the anticodon base pairs with the codon on the mRNA
There is at least one tRNA molecule for each amino acid.
N.B. I may bond to U, C, or A. I is inosine, an unusual base found in tRNA.
The recognition of the third base in the codon by the anticodon is sometimes less discriminating, a phenomenon called wobble. Look at the amino acid table, and you’ll see why it’s less of an issue

AND NOW TO RIBOSOMES AND PROTEIN SYNTHESIS
Ribosomes are protein-RNA complexes (1/3 and 2/3 respectively).
There are three tRNA binding sites on the ribosome:
1. The A (aminoacyl) site binds the incoming tRNA.
2. The P (peptidyl) site binds the tRNA with the growing peptide chain.
3. The E (exit) site binds the uncharged tRNA before it leaves the ribosome.
A charged tRNA has an amino acid; an uncharged one does not
N.B. the 50S and 30S subunits are prokaryotic; the eukaryotic subunits are 60S and 40S

MECHANISM OF PROTEIN SYNTHESIS
The cycle begins with peptidyl-tRNA in the P site.
(1) An aminoacyl-tRNA binds in the A site.
(2) With both sites occupied, a new peptide bond is formed.
(3) The tRNAs and the mRNA are translocated through the action of elongation factor G, which moves the deacylated tRNA to the E site.
(4) Once there, the tRNA is free to dissociate to complete the cycle.
The process continues until a stop codon is reached.
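The start and stop rules can be sketched as a simple reading loop: find the AUG nearest the 5’ end, read one codon at a time, and stop at the first stop codon. This is a deliberately simplified model that only covers how the message is read; it ignores the ribosome sites, tRNAs and elongation factors described above. The function name `translate` and the example sequence are made up for illustration.

```python
# Standard genetic code, written compactly: codons are ordered U, C, A, G
# at each of the three positions; '*' marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(mrna: str) -> str:
    """Read an mRNA 5' to 3' from the first AUG until a stop codon.

    Returns one-letter amino acid symbols, N-terminus first.
    Returns an empty string if there is no AUG.
    """
    start = mrna.find("AUG")  # the AUG nearest the 5' end
    if start == -1:
        return ""
    protein = []
    # Step through complete, non-overlapping codons only.
    for i in range(start, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[i:i + 3]]
        if residue == "*":  # stop codon: release the chain
            break
        protein.append(residue)
    return "".join(protein)


if __name__ == "__main__":
    # Hypothetical message: 2 untranslated bases, then AUG GCU UGG UAA.
    print(translate("GCAUGGCUUGGUAAGG"))
```

Note how the output mirrors colinearity: the 5’ end of the message corresponds to the N-terminus of the polypeptide (methionine first), and the bases after the stop codon contribute nothing to the protein.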
PROTEINS AS MACROMOLECULES
Proteins are composed of one or more polypeptides and additional small molecules (cofactors)
Proteins have four levels of structure:
Primary: amino acid sequence
Secondary: caused by repeated coils or folds in the polypeptide backbone due to e.g. hydrogen bonding. Examples include alpha helices and beta-sheets.
Tertiary: three-dimensional atom arrangement in a single polypeptide chain
Quaternary: the arrangement of separate polypeptide chains (subunits) into a protein
You have more on proteins with Dr Vogt!

POST-TRANSLATIONAL MODIFICATIONS
Eukaryotic proteins are also subject to a variety of post-translational modifications (PTMs), such as:
Phosphorylation
Methylation
Acetylation
Ubiquitination
Hydroxylation
Glycosylation
PTMs can have a significant impact on function, e.g. the glucocorticoid receptor (GR, NR3C1) can be phosphorylated at S211 and S226, which has opposing effects on its activity levels

SUPPLEMENTAL READING
Many figures in this lecture were taken from Molecular Biology of the Cell (6e) and Biochemistry (Berg, 8e). You can refer to these for more details if you wish, but you will only be examined on slide contents.
There are also numerous books on Access Medicine (https://msuclanac.sharepoint.com/sites/UCLanLibrary/SitePages/Access-Medicine.aspx). One in particular that is recommended is Harper's Illustrated Biochemistry (32e), Section VII.
