RNA Transcription PDF
Document Details
Uploaded by GoldenGadolinium
University of Dundee
Summary
This document provides an overview of RNA transcription, focusing on regulated gene expression, RNA polymerases, and transcription initiation. The document covers different types of RNA polymerases and their roles in transcribing various types of RNA molecules.
Full Transcript
RNA Transcription

Regulated Gene Expression
  Not all ~22,000 genes are expressed in each cell.
  Temporal and tissue-specific expression is important for differentiation and function.
  ~20-40% of the genome is regulatory sequence.
  Genes expressed in common across cells are known as constitutive.
  Regulation is crucial in determining cell identity, function and response to signals.
  Misregulation is associated with many diseases, including cancer.

RNA Polymerases
  Conserved across the 3 domains of life.
  Transcription
    Produces RNA complementary to one strand of DNA.
    RNA polymerase separates the two strands of DNA in a transient "bubble"; it does not require a helicase.
    Bubble = 12-14 bp; the RNA-DNA hybrid within it = ~8-9 bp.
    The 3'-5' strand is the template for 5'-3' RNA synthesis.
  Transcription is pervasive
    Most of the genome is transcribed, even though little of it produces protein-coding RNA.
    Non-coding DNA sequences (98% of the genome) are often transcribed into RNA molecules with important biological functions: microRNA, lincRNA, rRNA, tRNA etc.
  Transcription initiation
    Requires many proteins and complex cis-regulatory DNA elements.
  RNA Polymerase I
    Transcribes ribosomal RNA genes: 5.8S, 18S and 28S rRNA.
  RNA Polymerase II
    Transcribes all mRNAs in the cell.
    Also transcribes other RNAs: snoRNA, miRNA, siRNA, lncRNA, most snRNA.
    Complex of 12 different protein subunits (a 10-subunit core plus RPB4 and RPB7).
    Active site is at the interface between the 2 biggest subunits, RPB1 and RPB2.
    Nucleotides are continuously added to the 3' end of the RNA, forming a DNA-RNA hybrid.
    Exit groove: where RNA leaves RNA polymerase II after being separated from the DNA.
    Intake hole: pore through which nucleotides for the RNA enter.
  RNA Polymerase III
    Transcribes tRNA genes, 5S rRNA, some snRNA and genes for other small RNAs.

Core promoter elements
  Contain short (consensus) sequences that act as binding sites for GTFs.

  Element  Consensus sequence        General transcription factor
  BRE      G/C G/C G/A C G C C       TFIIB
  TATA     T A T A A/T A A/T         TBP
  INR      C/T C/T A N T/A C/T C/T   TFIID
  DPE      A/G G A/T C G T G         TFIID

  Determine the start site for transcription.
  Support basal levels of transcription initiation.
  Located immediately upstream (5') of the transcription start site.
  Fixed direction/orientation relative to the gene.

General transcription factors (GTFs)
  Help RNA polymerase recognise promoters.
  TFIID
    12 subunits: 1 called TBP and ~11 additional ones called TAFs (TBP-associated factors).
  TATA-binding protein (TBP)
    Subunit of TFIID.
    Recognises the TATA box, a short sequence of DNA rich in A and T that signals the start of a gene and lies ~30 bp from the transcription start site.
    Binds DNA with an 8-stranded beta sheet that rests on top of the DNA like a saddle; 2 protein loops droop down the sides.
    Induces a kink in the DNA backbone, bending the DNA by nearly 90°.
  TFIIB
    1 subunit.
    Recognises the BRE (TFIIB recognition element).
    Recruited to the BRE by TFIID.
    Accurately positions RNA polymerase.
  TFIIF
    3 subunits.
    Often recruited along with RNA polymerase.
    Stabilises RNA polymerase interaction with TBP and TFIIB.
    Helps attract TFIIE and TFIIH.
  TFIIE
    2 subunits.
    Attracts and regulates TFIIH.
    Stabilises the transcription bubble by interacting with the non-transcribed strand to prevent collapse.
  TFIIH
    9 subunits.
    Unwinds DNA at the transcription start point.
    Phosphorylates Ser5 of the RNA polymerase CTD, which prevents it from interacting with other general transcription factors.
    Releases RNA polymerase from the promoter.

CTD of RNA polymerase II
  Factors bound to the CTD can immediately interact with the RNA being produced.
  mRNA processing
    Capping, splicing and 3' end formation are associated with the C-terminal domain (CTD) of RNA polymerase II.
  DNA damage repair
    DNA helicase subunits of TFIIH have a role in nucleotide excision repair.
  Nuclear architecture
    Nuclear matrix.
  DNA replication
    Transcription factor binding sites occur near origins of replication.
    Actively transcribed genes are replicated early in S phase.

Regulation of transcription
  Regulated by sequence-specific transcription factors.
  These bind at cis-regulatory sequences and can act from nearby or from a distance.
  Enhancers or silencers
    Modulate (up or down) levels of initiation.
    Orientation independent.
    Location variable/flexible: can be many kilobases (in some cases ~1 Mb) away from the promoter.
    Contain binding sites for multiple transcription factors.
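The notes contrast core promoter elements (fixed orientation relative to the gene) with enhancers and silencers (orientation independent). A minimal Python sketch of that difference follows. It scans a sequence for the TATA consensus given in the core promoter table (T A T A A/T A A/T), either on the given strand only or on both strands. The function names, the regex encoding of the consensus and the example sequences are illustrative assumptions, not part of the lecture material, and exact matching is much cruder than how real transcription factors recognise DNA.

```python
import re

# TATA consensus from the notes (T A T A A/T A A/T), written as a regex.
TATA_BOX = "TATA[AT]A[AT]"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_sites(seq: str, pattern: str, both_strands: bool = False):
    """Return sorted (start, strand) hits of `pattern` in `seq`.

    `start` is 0-based on the forward strand. With both_strands=False only
    the given orientation counts (like a core promoter element); with
    both_strands=True the reverse strand is searched as well (like an
    orientation-independent element).
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping matches are all reported.
    regex = re.compile(f"(?=({pattern}))")
    hits = [(m.start(), "+") for m in regex.finditer(seq)]
    if both_strands:
        rc = reverse_complement(seq)
        for m in regex.finditer(rc):
            length = len(m.group(1))
            # Convert the reverse-strand coordinate back to the forward strand.
            hits.append((len(seq) - m.start() - length, "-"))
    return sorted(hits)


if __name__ == "__main__":
    forward = "GGCTATAAAAGGC"               # hypothetical sequence with a TATA box
    flipped = reverse_complement(forward)   # same element, opposite orientation
    print(find_sites(forward, TATA_BOX))
    print(find_sites(flipped, TATA_BOX))
    print(find_sites(flipped, TATA_BOX, both_strands=True))
```

In the flipped sequence the element is found only when both strands are searched, which is the sense in which an orientation-independent element still "counts" after inversion while a fixed-orientation one does not.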
Transcriptional activators
  Functions
    Work with co-activators, co-repressors and the general transcriptional machinery.
  Multiple protein domains
    Activation domain
      Recruitment of general transcription factors.
    Regulatory domain
      Can mediate dimerization between factors (homo or hetero).
      Nuclear transport: brings the transcription factor to the nucleus.
      Auto-inhibition.
    DNA-binding domain
      Sequence-specific recognition of promoter DNA.
      Specific amino acids can interact with different bases.
  Combinatorial control
    Transcription factors often combine and recognise longer sequences, which greatly increases specificity.
    Can be homodimers or heterodimers.
    Many binding sites tend to be palindromic.
    Inhibitory factors can also dimerise with transcription factors, preventing the factor from binding to DNA.
    Some transcription factors recognise different sequences but bind weakly on their own; they may work together to increase one another's affinity.
  Enhanceosome
    A multitude of transcription factors assembling into a macromolecular complex at enhancer sequences.
    Transcription activators work synergistically.
  Summary
    Transcriptional activators work cooperatively.
    Enhancers work from a distance to modulate the assembly of the transcription machinery at the promoter, through DNA looping.
  Modern model
    In addition to looping, proteins (cohesins and CTCF) appear to hold loops in place.
    Water is excluded, leaving a dense mass of proteins interacting with one another.

Chromatin remodellers
  Recruited by transcription factors.
  Used to initiate nucleosome sliding to move histones.
  Histone chaperones may remove histones to disassemble nucleosomes.
  Histone-modifying enzymes.
  Experiments indicate that increased activator expression induces large-scale chromatin unfolding.

Regulation of transcription factors
  Brought into play by extracellular signals.
  Examples of factor activation
    Protein synthesis.
    Ligand binding.
    Covalent modification: may be the result of a kinase cascade that ends in phosphorylation.
    Addition of a second subunit.
    Unmasking, e.g. release from a chaperone etc.
      The masking factor may keep the protein in the cytoplasm, for example.
    Stimulation of nuclear entry.
    Release from membrane.
  Dysregulation examples in disease
    Upregulation: amplification, gain of function, pathway overreaction.
    Downregulation: loss of function, overactivation of repressors.
    Changes in target genes: chromatin architecture shifts.
    Gene translocations: fusion transcription factors.

Superhelical Tension
  DNA supercoiling: the conformation DNA adopts under superhelical tension.
  How?
    When a DNA region opens, the downstream DNA must rotate.
    A 10 bp opening corresponds to 1 helical turn.
    If the DNA ends are fixed and the DNA is not able to rotate, 1 supercoil is produced per 10 bp opened, i.e. 1 supercoil for what would have been 1 helical turn. Opening 30 bp, for example, would produce about 3 supercoils.

Transcription Factors

Learning Intentions
  Recapitulation of basic mechanisms.
  Eukaryotic transcription: general transcription factors.
  Regulators of transcription: sequences and transcription factors (TFs).
  How does a transcriptional activator function (and what is the role of chromatin remodellers)?
  How can a transcriptional activator be regulated?
  Common mechanisms of transcriptional dysregulation in disease.

General Transcription Factors
  TFIID
    TBP subunit: recognises the TATA box.
    TAF subunits (~11 subunits; TBP-associated factors): recognise other DNA sequences near the transcription start point and regulate DNA binding by TBP.
  TFIIB
    Recognises the BRE in promoters.
    Accurately positions RNA polymerase at the start site of transcription.
  TFIIF (3 subunits)
    Stabilises RNA polymerase interaction with TBP and TFIIB.
    Helps attract TFIIE and TFIIH.
  TFIIE (2 subunits)
    Attracts and regulates TFIIH.
  TFIIH
    Unwinds DNA at the transcription start point.
    Phosphorylates Ser5 of the RNA polymerase CTD (C-terminal domain).
    Releases RNA polymerase from the promoter.

Eukaryotic Gene Control Region
  Consists of the core promoter and many cis-regulatory sequences.
  Enhancers and silencers
    Regions of DNA bound by up-regulatory or down-regulatory TFs respectively.
    Can be kilobases from the transcription start site.

Transcriptional activators
  Multiple protein domains allow them to perform the diverse functions required for regulating gene expression.
  DNA-binding domain
    Function
      Allows the activator to recognise and bind specifically to particular DNA sequences, normally palindromic sequences.
      P = 0.25^n is the chance that a random sequence matches a TF binding site, where n = number of bases in the site and 0.25 = chance of matching each nucleotide (1 in 4). For example, a 6-base site gives P = 0.25^6 = 1/4096.
    Importance
      By binding to DNA, the activator positions itself close to the target gene's transcription machinery.
    Examples
      Common DNA-binding motifs include helix-turn-helix, zinc finger, leucine zipper and helix-loop-helix structures.
  Activation domain
    Function
      Interacts with other proteins (GTFs, co-activators, etc.).
      Enhances transcription beyond basal levels.
    Importance
      Crucial for recruiting or stabilising components of the transcription machinery (RNA polymerase etc.).
      Can be used for modifying chromatin structure to make DNA accessible.
        Acetylation or certain methylations of histones etc.
        By HATs (histone acetyltransferases) or HMTs (histone methyltransferases).
  Regulatory domain
    Function
      Controls the activity of the transcriptional activator itself, often in response to extracellular signals.
      In some activators it facilitates dimerization (with leucine zipper domains etc.).
    Importance
      Ensures the activator is only active under certain conditions.

Example of Heterodimerization (NFAT and AP1)
  Transcription factors that often work together to activate transcription of specific target genes, particularly in the immune response.
  NFAT (Nuclear Factor of Activated T-cells)
    Activation mechanism
      1. NFAT is typically cytoplasmic and inactive in its phosphorylated state.
      2. Upon cellular activation, the calcium-calcineurin pathway is triggered.
      3. Calcineurin dephosphorylates NFAT, exposing a nuclear localisation signal (NLS).
      4. NFAT translocates into the nucleus, where it can bind DNA using its Rel-homology domain (RHD).
    DNA binding
      NFAT binds to a specific DNA sequence in the promoter or enhancer regions of target genes.
      Its DNA-binding affinity is relatively weak on its own, so it relies on cooperative binding with other TFs (e.g. AP1).
  AP1
    Activation mechanism
      1. Dimeric TF composed of proteins from the Fos and Jun families.
      2. Activated by signals like growth factors, cytokines or stress, through MAP kinase cascades.
      3.
         These pathways lead to phosphorylation and activation of Fos and Jun, enhancing their stability and DNA-binding activity.
      Binds to DNA as a leucine zipper dimer.
    DNA binding
      AP1 binds to specific DNA sequences called TRE (TPA response element) or CRE (cAMP response element).
  Fos and Jun proteins
    Protein families that form the subunits of the AP1 TF complex.
    Immediate-early genes that play a critical role in regulating a wide variety of cellular responses: proliferation, differentiation, apoptosis and stress responses.
    Associated with oncogenesis: overexpression of c-Fos or c-Jun can drive tumour formation.

Topologically Associated Domain (TAD)
  3D structural unit of a genome within which DNA sequences interact more frequently with each other than with sequences outside the domain.
  Plays a key role in gene regulation.
  Key features
    Definition
      TADs are regions of the genome (1000s to 1,000,000s of bp) where chromatin loops and other structural elements create a compartment that fosters interactions among genes, enhancers and other regulatory elements.
    Boundaries
      TAD boundaries are typically enriched with CCCTC-binding factor (CTCF) and cohesin, which help define and maintain these domains.
      Boundaries act as insulators, preventing interactions between adjacent TADs.
    Hierarchical structure
      TADs are part of a larger hierarchy of chromatin organisation.
  Function of TADs
    Gene regulation
      Groups enhancers and promoters within the same domain, ensuring regulatory elements only affect the genes they are supposed to.
      Disruption of TAD boundaries can lead to aberrant enhancer-promoter interactions, which could cause misregulation of gene expression and disease.
    Facilitating chromatin accessibility
      Within a TAD, chromatin is organised in a way that makes specific regions more accessible for TFs, co-activators or repressors.
  Chromatin structure in TADs
    Active TADs: enriched in euchromatin.
    Inactive TADs: enriched in heterochromatin.
    Histone modifications
      Active TADs: enriched with activating histone marks, which promote an open chromatin state.
      Inactive TADs: repressive histone
      marks.
      Histone modifications help define TAD boundaries.
      CTCF and cohesin presence is influenced by histone modifications.
      H2A.Z (common in open DNA) is commonly found near active TAD boundaries.
      Methylation of H3 contributes to TAD insulation.

Transcription factor domains
  Homeodomains
    Found in many transcription-regulatory proteins; mediate binding to DNA.
    Consist of 3 alpha-helices packed together by hydrophobic forces.
    DNA-binding element: helix 2 and helix 3 form a helix-turn-helix motif.
    Amino acids in the recognition helix make contacts with bases in the major groove of DNA; 3 side chains make H-bonds with bases.
    An arginine in a flexible loop of the protein contacts bases in the minor groove.
  Leucine zipper domains
    Consist of 2 long, intertwined alpha-helices.
    Hydrophobic side chains, many of them leucines, stretch out into the space shared between the helices; tight packing makes the dimer very stable.
    Extensions from the helices straddle the DNA major groove.
    Side chains from the helices make H-bonds with DNA bases in the groove.
  Zinc finger domains
    A beta sheet and an alpha-helix.
    Use a centrally coordinated zinc atom, bound by 2 Cys from the beta sheet and 2 His from the alpha-helix.
    Only large enough to bind a few DNA bases, so often found in tandem repeats as part of a larger DNA-binding region.
    Rests in the major groove of DNA; amino acid side chains contact bases, and the identity of the side chain determines which bases are bound.
    Assembling different zinc fingers allows for greater specificity of the protein.

mRNA Modifications

Capping
  Soon after transcription begins (after ~25 nucleotides are transcribed), a 5' cap is added to the pre-mRNA transcript.
  The cap consists of a modified guanine nucleotide (7-methylguanosine, m7G) linked by a 5'-5' triphosphate linkage.
  Steps of capping
    RNA triphosphatase: removes the gamma-phosphate from the 5' end of the RNA.
    Guanylyltransferase: adds a guanosine monophosphate (GMP) to the 5' diphosphate via a 5'-5' triphosphate bond.
    Methyltransferase: methylates the added guanosine at the N7 position to form
    the 7-methylguanosine (m7G) cap.
  Functions of cap
    Protection from degradation: protects RNA from exonucleases, which degrade uncapped RNAs.
    Facilitation of splicing: helps recruit the spliceosome.
    Promotion of nuclear export: recognised by the nuclear cap-binding complex (CBC), which facilitates export of mRNA to the cytoplasm.
    Translation initiation: the 5' cap is recognised by eukaryotic initiation factor 4E (eIF4E), a component of the translation initiation complex; this ensures that the ribosome binds efficiently to the mRNA.
    Quality control: together with 3' polyadenylation, allows the cell to ensure that both ends of a transcript are intact before export from the nucleus.
  Dysfunctional capping
    Mutations in RNGTT (RNA Guanylyltransferase and 5'-Phosphatase)
      The guanylyltransferase enzyme adds guanosine to form the 5' cap; crucial for RNA stability and processing.
    Disease impact
      Dysfunctional guanylyltransferase can lead to insufficient capping, which results in unstable or degraded RNA.
      Can impair translation of key neuronal proteins.
      Contributes to neurological disorders such as intellectual disabilities and developmental delays.
      As neuronal cells are highly dependent on precise and rapid gene expression, loss of proper capping disrupts processing and translation.
  Deregulated CBC
    The nuclear cap-binding complex (CBC) is composed of CBP80 and CBP20.
    Binds to the 5' cap and regulates pre-mRNA processing: splicing and nuclear export.
    Overexpression of CBC components has been linked to tumour progression in cancers; it enhances stability and nuclear export of oncogenic mRNAs.

Poly-A Tail
  Chain of adenosine (A) nucleotides added to the 3' end of a eukaryotic pre-mRNA molecule.
  Steps of polyadenylation
    Cleavage of the pre-mRNA
      The pre-mRNA is cleaved at a specific site downstream of the polyadenylation signal (AAUAAA) and upstream of a GU-rich sequence.
      Cleavage occurs ~10-35 nucleotides downstream of the AAUAAA signal and leaves a free 3' end.
    Addition of adenosine residues
      Poly(A) polymerase (PAP) adds ~50-250 adenosine residues to the 3' end of the mRNA in a
      template-independent manner.
    Poly-A tail binding
      The poly-A tail is immediately bound by poly(A)-binding proteins (PABPs), which stabilise the tail and regulate its functions.
  Functions
    mRNA stability: protects mRNA from degradation by nucleases; longer tails are associated with greater stability.
    Nuclear export: along with the cap and other proteins, the tail facilitates transport of mRNA from the nucleus to the cytoplasm.
    Translation regulation: interacts with PABPs in the cytoplasm; enhances translation by forming a circular structure with the 5' cap via interactions with eIF4G (a translation initiation factor).
  Alternative polyadenylation (APA)
    Many eukaryotic genes undergo alternative polyadenylation.
    Different polyadenylation signals are used to generate mRNA isoforms with varying 3' untranslated regions (UTRs).
    Significance
      Shorter 3' UTRs: often associated with increased mRNA stability and translation efficiency.
      Longer 3' UTRs: may contain additional regulatory elements, such as miRNA binding sites, which influence gene expression.
  Dysfunction and diseases
    Shortened poly-A tails lead to rapid mRNA degradation, reducing protein expression.
    Observed in diseases like spinal muscular atrophy (SMA), where defective mRNA processing impacts survival motor neuron (SMN) protein levels.

RNA Splicing
  Eukaryotic genes are split; not all RNA that is transcribed is used.
  Splicing
    Introns need to be removed; exons stay and code for the protein.
    A 5' cap is added to the 5' end.
  Why must introns be removed?
    They often contain stop codons, so proteins would be incomplete.
    They may shift the translational reading frame of downstream exons.
    Cells would stop growing and die.
  Classification of self-splicing introns
    Group I
      Adopt a conserved 3D structure consisting of paired helices and loop regions.
      Splicing mechanism
        1.
           First transesterification
             A free guanosine or guanine nucleotide (GMP, GDP or GTP) acts as a cofactor.
             The 3'-OH group of the guanosine attacks the 5' splice site, breaking the bond between the intron and the upstream exon.
             The guanosine becomes attached to the 5' end of the intron.
        2. Second transesterification
             The 3'-OH of the upstream exon attacks the phosphodiester bond at the 3' splice site, joining the exons and releasing the intron.
      Occurrence
        Found in some nuclear rRNA genes of protists, and in mitochondrial and chloroplast genes.
        Found in certain bacteriophage genomes.
    Group II
      Conserved secondary structure with six domains (D1-D6) that forms a complex tertiary structure.
      Key catalytic regions are in domains D1 and D5.
      Splicing mechanism
        1. First transesterification
             The 2'-OH group of an adenosine residue within the intron (branch site) attacks the 5' splice site.
             Forms a lariat structure: the intron is looped and covalently linked via a 2'-5' bond.
        2. Second transesterification
             The 3'-OH of the upstream exon attacks the phosphodiester bond at the 3' splice site.
             Joins the exons and releases the intron as a lariat.
      Occurrence
        Found in some mitochondrial and chloroplast genes.
        Found in bacteria.
      Evolution
        Considered to be ancestors of eukaryotic spliceosomal introns and the spliceosome.
        The self-splicing mechanism closely resembles spliceosomal splicing.
    Applications
      Group I introns are used in biotechnology (ribozyme engineering).
      Group II introns are exploited as tools for targeted gene insertion.

Spliceosome
  Catalyses splicing.
  Composition
    About 150 proteins.
    5 RNAs: small nuclear RNAs (snRNAs) U1, U2, U4, U5 and U6, each 100-300 nucleotides long.
    snRNAs attach to proteins to form snRNPs (small nuclear ribonucleoproteins, pronounced "snurps"), which are named after the snRNA they contain and carry out splicing.
  How are introns recognised?
    Splice site consensus sequences: most introns have the same general structure.
    Consensus sequences are recognised by snRNPs (snurps).
  How are the separate ends of the intron brought together?
    Spliceosome cycle.

Overview
  Mechanisms of RNA splicing
    Phosphodiester bond between 5' exon and intron
    is broken.
    Group I
      1. The intron folds in such a way as to hold a free guanine nucleotide (ribose form).
      2. The 3'-OH group of the guanine nucleotide reacts with the 5' splice site.
      3. The guanine nucleotide attaches to the 5' end of the intron.
      4. The 3' end of the 5' exon reacts with the 3' splice site.
      Known as self-splicing: no other proteins are involved.
    Group II
      May occur by self-splicing or using the spliceosome.
      Self-splicing
        An adenine residue within the intron sequence is used, as opposed to a free nucleotide.
        The intron forms a lariat (loop-like structure) by attaching to the adenine in the sequence.
        The 3' end of the 5' exon attaches to the 3' splice site.
      Spliceosome
        U1 binds to the consensus sequence at the 5' splice site.
        U2AF (U2 auxiliary factor) binds to the 3' splice site.
        U2AF facilitates binding of BBP (branch point binding protein) to the branch site.
        Branch point: region of the intron containing the adenine residue.
        U2 displaces BBP at the branch site, causing the adenine at the branch site to bulge.
        U2AF is released and U2 recruits the U4, U5 and U6 snRNPs.
        U6 occupies the same area as U1; as a result, U1 is released from the 5' splice site.
        U6 then interacts with U2; as a result, U4 is released.
        The 5' splice site is cleaved and a lariat is formed.
        The 3' end of the 5' exon is joined to the 3' exon.
  Spliceosome functions isoenergetically
    Splicing chemistry itself does not consume energy.
    It relies on a series of transesterification reactions that swap phosphodiester bonds, so there is no net gain or loss of energy.
  Mechanism
    The 2'-OH group of the branch point adenosine attacks the phosphodiester bond at the 5' splice site.
      Cleaves the bond between the 5' exon and the intron.
      Positioned by interactions between U6 at the 5' splice site and U2 at the branch point.
    The 5' end of the intron forms a covalent 2'-5' phosphodiester bond with the branch point adenosine.
      Creates a lariat; occurs simultaneously with splice site breakage.
    The 3'-OH group of the free 5' exon acts as a nucleophile.
      Attacks the phosphodiester bond at the 3' splice site (a transesterification rather than a hydrolysis), through associations of U2 with U6.
    The 5' exon forms a phosphodiester bond with the 3' exon, simultaneously with splice site bond breakage.

Why
Introns are not junk
  1. Facilitation of alternative splicing
       Different protein isoforms can be produced from a single gene.
       Means the genome does not need a whole additional gene to produce a structurally similar protein.
  2. Regulation of gene expression
       Introns often contain cis-regulatory elements: enhancers, silencers or TF binding sites, etc.
       Can influence transcription efficiency by affecting chromatin structure.
       Example: intronic enhancers play a role in the tissue-specific expression of genes like myosin heavy chain in muscle cells.
  3. Evolutionary advantages
       Provide a safe space for mutations to accumulate without directly affecting protein-coding sequences.
       Facilitate the evolution of new genes and regulatory networks through exon shuffling and gene duplication.
  4. Protection of coding sequences
       Can act as a buffer, protecting critical exons from harmful mutations or errors during transcription or splicing.
  5. Non-coding RNAs and regulatory molecules
       Introns can encode small non-coding RNAs (miRNAs or snoRNAs).
       These regulate gene expression post-transcriptionally.
  6.
     Chromatin structure
       Contribute to chromatin looping and organisation within the nucleus, which influences gene accessibility to TFs.
       Introns help establish topologically associated domains (TADs) and maintain proper nuclear architecture.
  Introns in TADs
    Structural elements
      Introns contain sequences that interact with architectural proteins, for example CTCF and cohesin.
      They often contain the CCCTC consensus sequence, which binds CTCF.
      They may also be involved in defining euchromatin or heterochromatin regions within TADs.

Types of Alternative Splicing
  Exon skipping
    An exon is spliced out of the transcript together with its flanking introns.
    Most common form of alternative splicing.
  Mutually exclusive exons
    Two or more exons are arranged such that only one is included in the final mRNA.
    Produces isoforms with different functional domains.
  Alternative 5' splice site
    Splicing machinery selects between two or more possible 5' splice sites within the same intron.
  Alternative 3' splice site
    Splicing machinery selects between two or more possible 3' splice sites within the same intron.
  Intron retention
    An intron is retained in the mature mRNA instead of being spliced out.
  Alternative promoters
    Different promoter regions are used, leading to alternative 5' untranslated regions (UTRs).
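To make two of the categories above concrete, the following Python sketch enumerates the mRNA isoforms that exon skipping and mutually exclusive exons could produce from a toy gene. The exon labels (E1, E2a and so on) and both function names are hypothetical, chosen only for illustration. Real splice site choice depends on splice site strength and regulatory proteins, none of which is modelled here.

```python
from itertools import combinations


def exon_skipping_isoforms(exons, cassette):
    """List every isoform obtainable by skipping any subset of cassette exons.

    exons    -- exon labels in genomic order
    cassette -- indices of exons that may be skipped; all others are constitutive
    """
    cassette = sorted(cassette)
    isoforms = []
    for n_skipped in range(len(cassette) + 1):
        for skipped in combinations(cassette, n_skipped):
            kept = [e for i, e in enumerate(exons) if i not in skipped]
            isoforms.append("-".join(kept))
    return isoforms


def mutually_exclusive_isoforms(upstream, alternatives, downstream):
    """List isoforms in which exactly one of the alternative exons is included."""
    return ["-".join(upstream + [alt] + downstream) for alt in alternatives]


if __name__ == "__main__":
    # Exon 2 (index 1) is a cassette exon: it is either kept or skipped.
    print(exon_skipping_isoforms(["E1", "E2", "E3"], {1}))
    # Exactly one of E2a / E2b is included in the mature mRNA.
    print(mutually_exclusive_isoforms(["E1"], ["E2a", "E2b"], ["E3"]))
```

With k independent cassette exons the first function returns 2^k isoforms, which gives a rough sense of how a single gene can yield many related proteins.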