Lecture Notes On DNA and Protein Interactions PDF
Uploaded by RighteousRaleigh9327
Summary
These lecture notes cover the fundamental principles of DNA and protein interactions in prokaryotes and eukaryotes, discussing various aspects of gene regulation and genome organization, as well as cell division. Includes details of DNA structure.
Full Transcript
Lecture 1: Introduction and genomes
Flow and regulation of genetic information
Prokaryote
> single supercoiled haploid dsDNA chromosome contained in nucleoid - lacks histones
> non-membrane-bound nucleoid permits linked transcription and translation
> genes expressed in operons - common promoter
> may carry extrachromosomal plasmids - useful for DNA cloning
Eukaryote
> multiple highly condensed dsDNA linear chromosomes contained in membrane-bound nucleus
> transcription/translation unlinked
> protein-coding genes have exons interspersed with introns
> multiple levels of gene regulation
> chromatin = DNA + histones
Genome organisation and DNA condensation
> nucleoid DNA looks like a bottle brush, loops of DNA coming from a central core
> NAPs = nucleoid-associated proteins
Prokaryote DNA vs euk DNA
Nucleosome is the basic unit of chromatin
> DNA wraps 147 bp / 1.75 turns around histone core
> histone core (octamer): 2 x H2A, 2 x H2B, 2 x H3, 2 x H4
Nucleosome monomer + linker = ~200 bp
Beads on a string = 10 nm fibre
Packaging of DNA affects how accessible it is
Chromatin remodelling regulates gene expression
Euchromatin = relaxed state / transcriptionally active
Heterochromatin = condensed state / transcriptionally inactive
No correlation between genome size, gene number and organism complexity > C-value paradox
C-value = haploid genome DNA content (3200 Mb in humans)
Prediction = linear relationship exists between organismal complexity and genome size
Reality = unlike in bacteria, increased euk genome size isn't correlated with increased gene number
Linear relationship exists in bacteria, evolution minimises non-functional DNA
Euk complexity is a product of gene expression and regulation
> exons, introns
> pre-mRNA splicing = process of excising non-coding introns and joining coding exons, allows one gene to express multiple mRNAs and protein isoforms
Promoter = non-coding region of DNA where transcription is initiated by RNA polymerase, controls when/where mRNA is synthesised
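Worked example for the nucleosome numbers above (147 bp wrapped, ~200 bp per nucleosome + linker, 3200 Mb haploid genome). This is only a rough sketch: it assumes the whole genome is packaged uniformly, which real chromatin is not, and the function names are made up for illustration.

```python
# Rough packaging arithmetic using the figures quoted in the notes.
# Assumption: every stretch of the genome is packaged at the same ~200 bp repeat.

HAPLOID_GENOME_BP = 3_200_000_000  # 3200 Mb, human haploid genome
NUCLEOSOME_REPEAT_BP = 200         # nucleosome monomer + linker (~200 bp)
CORE_BP = 147                      # bp wrapped around the histone octamer
HISTONES_PER_OCTAMER = 8           # 2 x H2A, 2 x H2B, 2 x H3, 2 x H4


def nucleosome_count(genome_bp, repeat_bp=NUCLEOSOME_REPEAT_BP):
    """Number of whole nucleosome repeats that fit along the genome."""
    return genome_bp // repeat_bp


def wrapped_fraction(core_bp=CORE_BP, repeat_bp=NUCLEOSOME_REPEAT_BP):
    """Fraction of each repeat that is wrapped around the histone core."""
    return core_bp / repeat_bp


n = nucleosome_count(HAPLOID_GENOME_BP)
print(f"{n:,} nucleosomes per haploid genome")
print(f"{n * HISTONES_PER_OCTAMER:,} core histone proteins needed")
print(f"{wrapped_fraction():.1%} of each repeat is wrapped, the rest is linker")
```

So roughly 16 million nucleosomes per haploid genome, before any higher-order folding.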
Bacterial genes are organised in operons, allows cells to adapt rapidly to changes in environment
Operon = cluster of coregulated genes controlled by a single promoter and expressed as a single polycistronic mRNA
Human genome = 3200 Mb
> only ~25% is genes and gene-related sequence (the protein-coding part is a much smaller fraction of this)
Regulatory / gene-related sequences = promoters, introns, 5' and 3' UTRs
Pseudogenes = fragments of genes that aren't working anymore / incomplete
75% non-coding / extragenic DNA > repeated sequence elements (tandemly repeated sequences; repeats interspersed genome-wide, 45%)
Cell cycle, mitosis, meiosis recap
Euk cell cycle = G1 (growth and normal metabolism), S (DNA replication), G2 (growth and mitosis prep), M (prophase, prometaphase, metaphase, anaphase, telophase)
G1/S checkpoint > delays progression until conditions are favourable, some cells stop dividing and exit cell cycle
G2/M checkpoint = DNA damage
Metaphase checkpoint = are chromosomes attached to the bipolar spindle?
Sister chromatids = newly replicated chromosomes still attached at centromere
Centromere = DNA region where sister chromatids remain joined
Microtubules = proteinaceous filaments involved in structure, transport and motility
Kinetochore = proteinaceous region of centromere that binds spindle microtubules (tubulin)
Centrosome = organelle that serves as MTOC
2n = 46 chromosomes (2 x 23)
46 chromatid pairs after replication
Chromatid pairs are pulled apart by microtubules and segregated equally to daughter cells
Each daughter cell inherits identical DNA content (46 chromosomes = 23 chromosome pairs)
Mitosis = cell division of all cells in body except germ cells
Meiosis = germ cells > DNA replicated then 2 rounds of cell division, produces haploid cells
2 haploid cells join, one from each parent
Homologues exchange DNA before they segregate > generates variation in DNA
Meiotic recombination step (prophase I) > chromosome homologues pair, physical exchange of DNA between non-sister chromatids at prophase I > chiasmata = points of contact between homologous chromosomes
Prelecture
DNA-protein interactions
Double-stranded DNA is a polymer of relatively uniform structure with a highly negatively charged sugar-phosphate backbone
Proteins recognise a particular sequence by having a surface that is chemically complementary to that of the DNA, forms favourable electrostatic and van der Waals interactions between protein and base pairs
Proteins that recognise specific DNA sequences exhibit remarkably diverse architectures
> when a protein binds to its preferred sequence it can form an optimal number of contacts with the base pairs and backbone
Most proteins recognise functional groups in the major groove of DNA (where each base pair can be uniquely distinguished)
Lecture 2: DNA recognition I
Many proteins bind more tightly to some DNA sequences than to others - proteins can tell one part of DNA apart from another
These proteins often 'read' the DNA through patterns of complementary chemical interactions
It takes multiple interactions to 'read' a long sequence, and many proteins need to recognise long sequences in order to function
The need for multiple interactions between protein and DNA means that the shape / fold of DNA-binding proteins is important
Some protein folds work well for DNA recognition and are found in multiple different DNA-binding proteins - understanding these allows bioinformatic analysis to predict that novel proteins might bind DNA
Catabolite Activator Protein (CAP) > turns on some genes when glucose concs are low
ChIP-seq > measures how often a protein is bound at any particular site across the genome > reveals binding pattern of CAP, binds in intergenic regions (between genes)
Affinity is a measure of how tightly a protein binds to a particular piece of DNA
Kd = dissociation constant: small Kd indicates tight binding
P + D ⇌ PD, Kd = [P][D] / [PD]
P = protein, D = DNA, PD = protein-DNA complex
Specificity is an indication of how tightly a protein binds to one site (target) relative to all other DNA sequences (non-specific sites)
For proteins with high sequence
specificity, Kd for the target sequence is much smaller than Kd for non-specific sites
Band shift assays can: measure affinity, determine the sequence specificity of a DNA-binding protein, determine the effect of environment on DNA binding, analyse effect of AA changes on DNA binding
> band shift uses the same size DNA throughout, but the DNA runs differently in the gel once protein is bound to it
Acrylamide = makes molecular mesh
Can measure affinity of a protein for DNA
Can determine effect of environment
> DNA stays double-stranded in gel = native gel
> denaturing acrylamide gel = strands separate before you load samples
Band shift and ChIP for CAP in E. coli > very, very specific
H bonds are directional (3 atoms in straight line) > distance-dependent (strength decreases when distance > 3 Å)
Major groove is more accessible and contains more information
A = H-bond acceptor, D = H-bond donor, M = methyl group, H = hydrogen
Read by putting AAs in major or minor grooves > align in certain way
H bonds between arginine and guanine are very common > so are H bonds from lysine, threonine, asparagine, glutamine, protein backbone
How to recognise rare sequences
> proteins that need to act specifically at a small number of locations on the genome must recognise sequences many bases long
> single AAs read more than one base at a time
> single residues can contact both bases in a base pair simultaneously
> single residues can contact bases in adjacent base-pair steps
> all 4 bases can form H bonds with DNA-binding proteins
An alpha helix is ideally suited to sequence recognition in the major groove > side chains protrude from the alpha helix at regular intervals, the width and depth of the major groove are a close match to the dimensions of an alpha helix
Conclusions = no universal code for DNA-protein interactions > multi-part interactions can span base pairs or base-pair steps
Lecture 3:
Sequence that has the greatest affinity for CAP > recognising a sequence that long requires many interactions > does this by acting as a dimer, each monomer contains a DNA-binding domain > full 22 base pair sequence is made up of two identical 11 base pair sequences
Each monomer binds to half the sequence.
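Sketch of what "two identical 11 bp sequences" means for a symmetric dimer: the second half is the reverse complement of the first, so each monomer meets the same sequence when its half is read 5' to 3' on its own strand (an inverted repeat). The half-site below and the function names are illustrative assumptions, not taken from the lecture.

```python
# Build a 22 bp site from an 11 bp half-site and check it is an inverted repeat.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Sequence of the opposite strand, written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_inverted_repeat(seq):
    """True if the site reads the same 5' to 3' on both strands."""
    return seq.upper() == reverse_complement(seq)


def site_from_half(half):
    """Full dimer site = half-site followed by its reverse complement."""
    return half.upper() + reverse_complement(half)


half_site = "AAATGTGATCT"  # illustrative 11 bp half-site (assumption)
full_site = site_from_half(half_site)

print(full_site, len(full_site))      # 22 bp site
print(is_inverted_repeat(full_site))  # True: both monomers see the same half
```

Any site built this way passes the check, which is why a dimer with rotational symmetry makes identical contacts in both halves.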
Multiple interactions at each monomer
> small blue dots = ionic interactions with DNA backbone
> large blue ovals = hydrogen bonds with bases
> many van der Waals interactions
> protein is interacting with both strands of target sequence, not just one
Proteins aren't simply reading out letters along one line
> one AA contact involves arginine at position 185, with bases in both strands and in adjacent base pairs simultaneously
> each interaction contributes to the affinity of CAP for this site = if one interaction is removed, CAP is still likely to bind to that region of DNA with high affinity
> most important bases are 4-8
> some proteins are more tolerant of change in their binding site than others
> endonucleases typically don't cut at sites with any changes from their optimal sequence
> all 3 residues form part of the same alpha helix, which dictates their position in space
> helix-turn-helix motif, CAP uses it to bind to DNA
> 2 of the motifs lie in adjacent major grooves; first helix (blue) interacts with recognition helix to hold it in place, and can make non-specific interactions with the backbone of DNA
> motif is always part of a larger protein
Found in hundreds of DNA-binding proteins from pro and euk (eg CAP)
Recognition helix makes specific H bonds in major groove
H-T-H proteins are often dimeric and helix spacing matches helical pitch of DNA
> shows different orientations
Spacing between the two recognition helices remains constant > to match pitch of DNA helix (3.4 nm per turn) > all these proteins are dimers.
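Sketch linking the Kd from Lecture 2 to how much of a site is occupied. It assumes simple 1:1 binding with DNA at much lower concentration than protein, so free protein is about equal to total protein; the concentrations and Kd values are made-up illustrative numbers, not measurements for CAP.

```python
# Fraction of DNA sites occupied for simple P + D <-> PD binding.
# Assumption: [DNA] << [protein], so free protein ~ total protein.


def fraction_bound(protein_conc, kd):
    """Fraction of sites bound: [P] / (Kd + [P]). Same units for both arguments."""
    return protein_conc / (kd + protein_conc)


protein = 1e-8          # 10 nM protein (illustrative)
kd_target = 1e-9        # tight binding to target site (illustrative)
kd_nonspecific = 1e-6   # weak binding to other DNA (illustrative)

print(f"target site:       {fraction_bound(protein, kd_target):.3f}")
print(f"non-specific site: {fraction_bound(protein, kd_nonspecific):.3f}")
```

When [P] = Kd the site is half occupied, so a small Kd means the site fills at low protein concentration. Losing one contact raises Kd somewhat, but the target can still be bound far better than non-specific DNA.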
Have 2 helices, proteins bind to one side of DNA molecule, recognition helices need to be the right distance apart to slot into major grooves
> important in systems where DNA binding by a protein needs to be regulated: slight conformational change in protein can change positions of helices = don't match DNA helix pitch and no optimal interactions
Have rotational symmetry = implications for sequences of DNA that they recognise
Target sequence of most binding proteins is an inverted repeat / palindrome > both strands > inverted repeat = identical contacts
Why dimerise
> a single domain of a protein that interacts with DNA (reading head) is limited in the number of bases it can interact with, normally between 1-10 base pairs of DNA
> to bind at only a small number of locations within a large genome, proteins must recognise sequences that are longer than the sequence bound by a single DNA-binding domain
> dimeric molecules bind to DNA tighter than they would do as individual monomers
> doubling the number of interactions between protein and DNA greatly increases the affinity of the interaction, squares it
> also achieved by having multiple DNA-binding domains in a single protein
Direct readout of DNA sequence > proteins that derive their sequence specificity from hydrogen bonding with bases in DNA
Indirect readout
> some proteins show clear specificity for particular sequences of DNA without any specific pattern of H bonds being made between protein residues and bases in DNA > no contacts made with the bases concerned
> the DNA structure of some sequences differs from the classical B-form
> DNA can bend or loop over long distances
> some proteins bind tighter to naturally bent regions than to straight DNA
> proteins that bend or distort DNA when they bind often show preferences for particular nucleotide sequences because they are easier to bend / distort
DNA is easier to bend in a region where A and T are next to each other on the same strand
> Proteins that distort DNA
when they bind to it can recognise sequence by how easily it allows bending, instead of by chemical signatures > indirect readout example
Many sequence-independent proteins also bind more tightly to DNA that can be distorted, eg sites of DNA damage
Eg bacterial H-NS
CAP uses both direct and indirect readout when binding DNA
> bend is ~90 degrees, formed when CAP binds
> major part of bend is a sharp kink introduced between bases 6 and 7 of each half site
> position 6 is highly conserved between CAP binding sites = important; in the structure it is not contacted directly; presence of this base pair allows CAP to introduce the kink into DNA = indirect readout
Example of a low-specificity interaction: T4 DNA ligase makes extensive sequence-independent interactions with DNA
> large contact area maximises VdW and ionic interactions
> dissociation constants for nicked and un-nicked DNA are the same = no specificity
> DNA ligase maximises non-sequence-specific interactions
> maximises van der Waals interactions by having a large SA in contact with DNA
> DNA ligase wraps itself round the DNA double helix = close contacts
> also maximises ionic interactions by lining the tunnel where DNA sits with positively charged residues
Lecture 4: Intro to genome stability; types and consequences of DNA damage; repair by direct reversal
2 faces of genome stability
> mutation and genetic variation can often lead to undesirable traits
> mutation and genetic variation are an essential aspect of evolution
> dual nature of genome = stability and instability
> stability = maintains integrity of organism's genetic material over time, eg repair mechanisms, cell cycle checkpoints, prevents mutations / cancer
> instability = drives evolutionary change, higher than normal rate of mutations, chromosome rearrangements
Cancer is based on genome instability
Yellow = beneficial
Copying > makes mistakes
Accidental DNA damage
How does DNA get damaged > by intrinsic (natural part of cell biology / life) and extrinsic
factors (lifestyle)
Consequences of damage to DNA bases
> A) mutagenic = eg C to U (lesion), mutation = permanent heritable change > no way for cell to know that a base pair is wrong when you get to replication number 2 = mutation, can lead to cancer
> B) cytotoxic > T is damaged so can't be read by DNA pol, not repaired = DNA pol gets stuck
DNA lesion / damage isn't interchangeable with mutation
DNA damage: broken bonds > examples of spontaneous hydrolytic damage
Orange = backbone of DNA
Water reacts with N-glycosyl bond > holds base on to sugar > hydrolytic cleavage > base falls off, leaves sugar and no genetic information = abasic site / AP site (apurinic/apyrimidinic)
> G/A = purine, loss of G and A is more common
Most common form of DNA damage = hydrolytic base loss
Second most common = deamination
Cytosine has amine group, deamination = water attacks bond and removes NH3 > removing amine group from cytosine produces uracil, which shouldn't be found in DNA > easy to evolve enzyme that recognises it > uracil is mutagenic, pairs with A
Spontaneous hydrolytic damage > deamination of 5-methylcytosine generates thymine, which is harder for DNA repair proteins to detect than uracil > CpG sequences are therefore common hotspots for mutations > adding methyl group to cytosine = changes shape, some proteins / enzymes only bind to the methylated version
DNA damage = adding bonds = alkylation > alkylating agents are reactive compounds that can transfer methyl or ethyl groups to a DNA base > eg methylation of the O^6 position of guanine > O^6-methylguanine forms complementary base pairs with thymine
Spontaneous oxidative damage > eg 8-oxo-guanine > one of the most common DNA lesions resulting from ROS, eg hydroxyl radicals, superoxide radicals, peroxides > base pairs ambiguously with cytosine or adenine > mitochondria are where e- transport takes place, full of ROS = more oxidative damage
Adding bonds > carcinogens > eg benzopyrene is metabolised and modifies guanine > block DNA
replication and transcription > links to cancer
Biggest environmental mutagen = UV light-induced damage > glues two adjacent bases together > forms one or two extra bonds between two bases, most common is two thymines
Problem = DNA bases need to be able to pivot, have to move as a larger blob now
Use DNA damage to kill cancer cells > 2 adjacent guanines > cisplatin forms chemical link with each guanine
Intro to repair pathways for damaged bases
Direct reversal
> DNA photolyases = repair of UV-induced photoproducts
> photoreactivation = energy derived from visible light is utilised to break the cyclobutane ring structure
> DNA photolyases are flavoproteins
> a light-harvesting cofactor transfers energy to FADH-, which then acts as a redox-active cofactor to break the cyclobutane ring
> photolyases are widely distributed in prokaryotes and eukaryotes
DNA-alkyltransferases = repair of some alkylated bases
> catalytic Cys acts as the acceptor for the alkyl group
> Ada protein becomes alkylated
> suicide reaction = the protein doesn't turn over like a classical enzyme and must be degraded
> widely distributed in pro and euk
To repair undamaged
Lecture 5: Intro to excision repair; nucleotide excision repair and non-homologous end joining
Excision repair > also applies to mismatch repair (follows DNA pol)
Find the damage > 3 billion base pairs in haploid genome, most cells have two copies of that > specificity is for damaged bases rather than structures
Cut on both sides of the damage > size of gap made distinguishes the different repair methods > nuclease activity
Remove the damaged DNA > only one of the two strands has been cut > if patch is tiny it falls off on its own, bigger = 10+ nt, need motor to take apart (helicase)
Copy the undamaged strand to make a patch > scarless
General stage of base excision repair
Enzyme that catalyses the reaction recognises uracil, just cuts the base off, not the sugar-phosphate backbone.
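Sketch of why the uracil has to be removed before replication: an unrepaired C to U lesion becomes a fixed C to T mutation by "replication number 2". This is a toy model only. Strands are copied position by position (antiparallel orientation ignored), only the damaged strand's lineage is followed, and the sequence and function names are hypothetical.

```python
# Toy model: follow one damaged strand through two rounds of replication.
# U in a template pairs with A, so the lesion is copied as if it were T.

PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G", "U": "A"}


def deaminate(strand, index):
    """Turn the cytosine at `index` into uracil (the lesion)."""
    if strand[index] != "C":
        raise ValueError("only cytosine can be deaminated to uracil")
    return strand[:index] + "U" + strand[index + 1:]


def replicate(template):
    """New strand paired base by base against the template (same left-to-right order)."""
    return "".join(PAIRS[base] for base in template)


original = "GACGT"                     # hypothetical sequence
damaged = deaminate(original, 2)       # GAUGT : lesion, not yet a mutation
round_one = replicate(damaged)         # CTACA : A inserted opposite U
round_two = replicate(round_one)       # GATGT : T now sits where C used to be

print(original, "->", damaged, "->", round_two)
```

After round two both strands are ordinary DNA (an A:T pair), so nothing marks the position as wrong any more. That is the difference between a lesion and a mutation.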
Polymerase > all add nt to 3' end of growing chain > DNA pol beta can cut flap off
Lesion-specific and general stages of base excision repair
All create an AP site
Damage recognition by DNA glycosylases > how to access a base buried in the dsDNA? = base flipping > complex is more stable with the base flipped out
Undamaged DNA structures are stable because the bases sit as they should
Damaged base flips out of DNA more easily = specific active site, close fit, lock-and-key enzyme
NT excision repair in bacteria
> complex of 4 polypeptide chains searches > burns ATP, both proteins play a role in recognising DNA damage, NER recognises many kinds of damage if bulky
> UvrB searches one strand until it bumps into something bulky, bulky lesion = motor can't go past it
> looking for general properties of DNA damage, not particular bases
UvrA is lost = UvrC is recruited
BER = base excision repair, NER = nucleotide excision repair
Both pathways found in pro and euk
Double-strand DNA breaks
> accidental = ionising radiation, DNA-damaging agents, desiccation, inappropriate nuclease activity, replication past a DNA nick; potential consequences = extreme chromosome instability, may be lethal for the cell or organism
> programmed = meiotic recombination, V(D)J recombination; consequences = beneficial genetic variation
Breaks can have one or two 'ends' depending on the source
Non-homologous end joining > take 2 ends and tether them together = end binding > leave 3' hydroxyl and 5' phosphate to be joined together > ends need to be complementary = microhomology, DNA pol fills in gaps
Non-homologous end joining
> has no requirement for a homologous donor of DNA
> can occur at any stage in the cell cycle
> NHEJ products may contain errors near the break site
> can repair double-end DSBs only
> is common in euk but most bacteria lack this pathway
> also plays a role in generation of antibody diversity
Lecture 6:
2 different pathways for the repair of double-strand breaks
> non-homologous end joining = NHEJ
> homologous recombination = similar, nearly identical DNA, bring them
together in a new combination
DNA damage on one strand vs break in both strands = compare to a different DNA molecule
General mechanism of homologous recombination
> top strand is 5' to 3', light blue is flipped around so 5' to 3' is on the bottom
To find complementary sequences > use base pairing > chews away 3' end a bit and 5' end more
Loses a bit of information around the break > doesn't know if there will be a 3' hydroxyl at the break > strand resection
Homology search and strand invasion > take one single strand of DNA, open up the complementary strands of dsDNA, search by base pairing to find where it can base pair
> DNA synthesis > DNA pol can extend 3' hydroxyl (must extend from 3' hydroxyl)
D loop formed > single-stranded DNA and double-stranded DNA
> DNA synthesis
> ligation and branch migration: double Holliday junction formation
Unwind some base pairs from one side and rewind on the other side
Double because there are two junctions
Resolution of Holliday junctions > to separate the two DNA molecules, need to cut backbone at Holliday junctions
Ligation completes repair
Orientation of the cuts at these junctions > can cut both light strands or both dark strands
Independent > what happens at one Holliday junction is independent of the other
Cut both junctions in the same way then non-crossover products are formed (patch products) > if each junction is cleaved in a different way then crossover products are formed > splice products
Diploid cell in G1 or G0 phase > homologous chromosomes
Homologous recombination using homologous chromosomes as a template can change the DNA sequence of the chromosome
Requires a homologous DNA molecule, eg homologous chromosomes or sister chromatids, but the nature of the template has important consequences for the outcome of HR
Repair is usually error-free provided that an identical sequence is used as the template: information is retained
Can repair single- and double-end DSBs
Main DSB repair pathway in unicellular organisms
Cell cycle dependent repair
in higher eukaryotes to ensure that, as far as possible, sister chromatids rather than homologous chromosomes are used as the template for DSB repair
Lecture 7: Homologous recombination 2
RuvA = protein that recognises and binds to the Holliday junction and helps assemble the other proteins
> recognition based on the shape of the DNA structure, not sequence-specific
> binds as a tetramer
> provides a platform for RuvB and RuvC to bind to
> provides grooves for the short regions of single-stranded DNA between the double-stranded DNA arms to run through when the Holliday junction migrates
RuvB = helicase that powers the movement of DNA
> uses ATP hydrolysis to power movement of DNA
> functions as a hexamer, 2 hexamers assemble at the Holliday junction, one at each of the heteroduplex branches
Helicase = motor protein
RuvC = nuclease that cuts the DNA backbone to separate the two DNA molecules from one another
> acts as a dimer, each cutting one strand
> recruited and aligned on the Holliday junction by interactions with RuvA
> symmetrical organisation of RuvC on the junction means that the two strands it cuts are essentially identical
Rec = recombination
RecBCD unwinds DNA, cuts 3' and 5' ends > motor > opens up DNA, burns ATP, starts cutting strands once open
Recognises chi sequence > stops it cutting 3' end > once recognised it changes the properties of the complex > recruits RecA > protein that does the homology search
RecA > functions as a filament > single RecA binds to single-stranded DNA, more RecA join on > 2 different DNA-binding sites > no sequence specificity > single RecA has 2 binding sites, one binds to single-stranded DNA, the other to double-stranded DNA
Big filament = hundreds of RecA > RecA forms a filament on ssDNA with 3 nucleotides per RecA > dsDNA is sampled randomly when it binds to the secondary DNA-binding site, destabilising H bonds > unstable > H bonds break > base pair flips out, if complementary it binds.
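Toy sketch of the homology search described above: the ssDNA in the RecA filament is tested against a duplex until it finds a stretch it can base pair with. Heavily simplified. Real RecA samples dsDNA randomly and tolerates imperfect matches, whereas this scans in order and needs an exact match; the sequences and function names are hypothetical.

```python
# Where could an invading single strand base pair within a duplex?
# The duplex is given by its top strand (5' to 3'); the bottom strand is implied.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Sequence of the opposite strand, written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_all(text, pattern):
    """All start positions of `pattern` in `text` (overlaps allowed)."""
    hits, start = [], text.find(pattern)
    while start != -1:
        hits.append(start)
        start = text.find(pattern, start + 1)
    return hits


def homology_sites(ssdna, duplex_top):
    """List of (position on top strand, strand the ssDNA would pair with)."""
    ssdna, duplex_top = ssdna.upper(), duplex_top.upper()
    # ssDNA identical to the top strand can pair with the bottom strand.
    sites = [(i, "bottom") for i in find_all(duplex_top, ssdna)]
    # ssDNA complementary to the top strand can pair with the top strand.
    sites += [(i, "top") for i in find_all(duplex_top, reverse_complement(ssdna))]
    return sorted(sites)


duplex = "TTGACCGATAGGCTA"                   # hypothetical donor duplex (top strand)
print(homology_sites("CCGATAGG", duplex))    # one site, pairing with the bottom strand
print(homology_sites("AAAA", duplex))        # no homology, nothing to invade
```

The strand that is not paired with the invading ssDNA is the one displaced to form the D loop.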
> Holliday junction migration > RuvAB
RuvA is a junction-specific binding protein, tetramer
RuvB is a pump / motor, ATP-dependent dsDNA pump > swapping strands between DNA molecules
RuvA forms scaffold for cleavage
At any one Holliday junction the strands that you cleave have to be the same
Orientation of RuvC is different at different Holliday junctions
HR and DNA replication fork restart
HR in meiosis > an example of genome instability
No homologous recombination = meiosis doesn't proceed
Gene conversion > programmed dsDNA breaks > Spo11 = nuclease, not sequence-specific, but there are cleavage 'hot spots' where DNA is not tightly packaged
Importance of heteroduplexes during HR in meiosis
In meiosis a nuclease called MRX generates 3'-ended ssDNA regions
> MRX first cuts away the covalently bound Spo11 protein
> it then degrades the strand whose 5' end is at the break
> the 3' end involved in homology search corresponds to the Spo11 cut site
Gene conversion > the non-reciprocal transfer of genetic material from one homologous chromosome to another
Loss of heterozygosity
> a genetic event that can occur in the dividing cells of a diploid organism heterozygous for one or more markers
Summary
Requires a homologous DNA molecule, but the nature of the template has important consequences for the outcome of HR
Repair is usually error-free provided that an identical sequence is used as a template: information retained
Can repair single- and double-end DSBs
Main DSB repair pathway in unicellular organisms
Cell cycle dependent repair in higher eukaryotes to ensure that, as far as possible, sister chromatids rather than homologous chromosomes are used as the template for DSB repair
Shared mechanistic features across kingdoms, although the identity of the enzymes for HJ translocation and resolution is better established in bacteria
Essential to chromosome pairing during meiosis and to the generation of genetic diversity of the gametes via gene conversion and crossover
Lecture 8: VDJ recombination
Paradox > immune cells generate over a billion different types of antibodies > encoded by genes in the germ cells, yet far more than the number of human genes, active against antigens not even found in nature
5' end of the mRNA encodes N-terminal end of protein
3' end = gene segment that codes constant region
Variable segment
Diversity segment > multiple segments = can form a diversity domain
Joining segment
To encode a variable domain, bring together a variable segment, diversity segment and joining segment > single V/D/J segments brought together
C segment codes constant domains
2 light chains >
Lambda and Kappa
Lambda: multiple joining-constant regions
Kappa: instead of multiple joining-constant genes, you have a series of joining segments and a single constant gene
Single recombination event in light chain compared to multiple in heavy chain
Increases diversity 2-fold > lambda or kappa
Junctional diversity > sequence of DNA at the point of joining of these VDJ elements
Recombination process is site-specific
Arrowheads have a certain direction > facing right = top strand
Grabs one sequence, binds another sequence, and intervening DNA is looped out
Don't recombine 23 with 23, or 12 with 12
RAG1 and RAG2 > recognise DNA, bring the sites together in a complex
Water molecule used as nucleophile to cut the DNA > similar to restriction enzyme > produces nick > same active site can use the 3' OH produced by cleavage as the nucleophile to cut the bottom strand
Intervening DNA forms circle, not important
Once breaks in DNA have been made you can't go back
NHEJ proteins introduce nucleotide changes during repair in the hypervariable region
> hairpin opening and fusion of coding elements
> carried out by Artemis, produces random nicking of the strands, generating overhangs
> some termed P (palindromic) nucleotides
> terminal deoxynucleotidyl transferase (TdT) > expressed in immature B cells > template-independent DNA polymerase > adds sequence on in random order
> get some homology at some point > produces microhomology alignment
(partial interactions)
> combinatorial joining / diversification
> CDR3 has additional diversity due to sloppy repair
Somatic hypermutation > 3 enzyme systems important
> somatic mutation is induced by cytidine deaminase
A cytidine deaminase (AID, activation-induced deaminase) is required for: somatic mutation and class switching
> uracil-DNA glycosylase (UNG) activity influences the pattern of somatic mutations
> mismatch repair pathway is also important
Repair pathways are overwhelmed, odd things happen to DNA, results in increased amount of mutation
Class switch recombination > downstream of VDJ region > series of coding elements for constant domains
5 main isotypes of the immunoglobulins: IgM, IgD, IgG, IgE, IgA > variations are due to differences in constant domains of the heavy chain
Isotypes retain the antigen-specific variable region
The genes that encode them lie downstream of the recombined VDJ gene on the heavy chain locus
Upstream of some of those genes are 'switch sequences' where recombination can occur
Does class switch recombination retain the VDJ antigen-binding region? > Yes
Does the mechanism of class switch recombination rely on alternative splicing? > No > IgD does, however
Which DNA repair processes supply enzymes for class switch recombination? > non-homologous end joining (leads to sloppy repair = DNA sequence changes), base excision repair (UNG)
Lecture 9: DNA replication
Gene Expression and Rearrangement (GER) MOLG22200
DNA Replication 1: DNA polymerases as accurate machines for copying DNA
Prof.
Mark Szczelkun
DNA Replication 1: DNA polymerases as accurate machines for copying DNA
> pre-lecture tasks for lectures 1 and 2 (Xerte-based reading and reflection): review of information from last year
DNA Replication 2 (main lecture): the coordinated machinery of the replication fork & mismatch repair
> post-lecture tasks for lectures 1 and 2 (Xerte): reading and reflection on the lecture material
DNA Replication 3
> post-lecture tasks for lecture 3 (PDF): read a short review and answer some questions
Learning outcomes for this week's lectures
Aim: to understand how genomes are maintained by the processes of DNA replication
At the end of the lectures, you should be able to understand:
> how the replication machinery accurately copies a genome in a coordinated manner
> how accidental errors introduced during replication are repaired
> how the initiation of DNA replication is controlled
> the problem of replication of the ends of linear DNA and a mechanism evolved to overcome this
Q. How often does damage arise during replication of a genome?
Mutation rate is low: 1 mutation per 10^9 nucleotides per cell division
For a typical 400 aa protein: 1 change every 200,000 years
Q. What are the consequences of a breakdown of a replication fork?
Collapse of a replication fork can lead to toxic dsDNA breaks (Nigel Savery lecture on HR repair)
DNA polymerases as machines for copying DNA
Based upon the conserved sequences, DNA polymerases have been grouped into eight distinct families: A, B, C, D, X, Y, PrimPol (AEP superfamily) and reverse transcriptases
The core catalytic domains of these are unrelated to each other, i.e., they adopt different protein folds as their catalytic cores! (structures shown: A-family, B-family, C-family, X-family, Y-family)
Core DNA Pol III (C-family)
> main replicating enzyme
> three polypeptides: α subunit = the polymerase, ε subunit = 3'-5' exonuclease, θ subunit stimulates ε
E. coli DNA Pol I (A-family), "Klenow Fragment" shown
> role in completing Okazaki fragments
> one polypeptide
> see post-lecture task
A brief recap from Year 1 (Biochemistry: Cellular Composition BIOC1003), also see pre-lecture Xerte task
Figure labels: template DNA, newly-synthesised DNA, palm (has catalytic residues), thumb
Keeping DNA polymerase on the template: sliding clamp (β subunit)
> role to increase polymerase processivity
> role as an assembly point for other proteins, such as repair factors
More details in the recorded lecture
Reducing errors in replication
Reducing errors #1 - catalytic selectivity
> dNTP, chain extends from the 3′ end
> 1 in 10^5 errors
Hydrogen bonding not required!
> Eric Kool: a 4-methylbenzimidazole replacement for adenosine (abbreviated Z) and a 2,4-difluorotoluene deoxynucleoside mimic for thymidine (abbreviated F) can be paired efficiently by the Klenow polymerase (DNA Pol I) (figure labels: Z, F, T, A)
> catalytic pocket (figure labels: Asp, Asp; R = sugar-phosphate backbone)
> consensus shape for the active site to accommodate AT, TA, GC and CG
> a mismatch in the catalytic pocket is the wrong shape and cannot fit properly
> incorrect base pairing is excluded by steric clashes
Reducing errors in replication
Reducing errors #2: enzymatic proofreading
> but there is a 1 in 10^5 chance of incorrect base pairing (imino or enol tautomers), e.g. the rare imino form of cytosine
> tautomerisation: 1 in 10^5 chance
> 3′-OH incorrectly positioned, 3-4 bp unwind
> palm domain senses the dsDNA structure and the polymerisation rate slows
> this allows access to the 3′-5′ exonuclease site, then the DNA re-anneals
> 1 in 10^2 errors removed
Reducing errors in replication: see Lecture 10 for mismatch repair
Polymerase Specialisation e.g.
E. coli polymerases (enzyme: polypeptides; family; function; processivity / editing)
> Pol I: 1 polypeptide; A family; RNA primer removal, Okazaki fragments; 20-100 nt / 5′-3′ and 3′-5′ exo
> Pol II: 1 polypeptide; B family; DNA repair; >1000 nt / 3′-5′ exo
> Pol III: 10 polypeptides; C family; chromosome replication; >1000 nt / 3′-5′ exo
> Pol IV: 1 polypeptide; Y family; translesion synthesis; low / none
> Pol V: 3 polypeptides; Y family; translesion synthesis; low / none
> see post-lecture task for Pol I
Polymerase Specialisation: Y-family translesion polymerases
> no proofreading capability
> can bypass DNA damage
> more open catalytic site
> low processivity
> mutagenic
[Figure: T7 DNA polymerase (PDB: 1T7P, panel A) compared with pol η (PDB: 3MR3, panel B); labels: fingers, palm, thumb, little finger, incoming dNTP, thymine dimer.] The base and catalytic metal ions in panel A are barely visible, while the thymine dimer (shown as red sticks), incoming nucleotide and both active-site Mg2+ ions in panel B are solvent-exposed.
DNA polymerases as accurate machines for copying DNA: SUMMARY
> A family of DNA copying enzymes with a similar overall structural arrangement
> Catalytic site evolved to synthesise a DNA strand by copying a variable template
> For replicative polymerases, catalytic selectivity reduces but does not eliminate the chance of an incorrect base being added
> Main copying error comes from unavoidable base structural isomerisation
> Additional domains or subunits evolved with exonuclease activity to remove newly-added bases that distort DNA structure ("proofreading")
> Some translesion/bypass polymerases can copy past DNA damage to ensure successful replication by having a relaxed catalytic selectivity, but will introduce errors
Lecture: DNA replication 3
Replicon model
> replicator = DNA
> origin is where DNA is unwound and the replisome is assembled.
> multiple origins in eukaryotes
> only a subset initiate
> origins do not re-fire
> other origins are passively copied
> when replication forks form they move in opposite directions
> red cross = origin turned off; green cross = where replication starts
Timing of replication initiation
> incomplete replication can lead to chromosome breakage
> gaining and losing bits of DNA
Cell cycle control
> cyclin-dependent kinases (CDKs)
> G1 = low CDK levels
> S phase = activity increases
> drops in mitosis
> controlled by proteins that are phosphorylated
Proteins that initiate replication are part of the AAA+ family
> hydrolyse ATP
> important in assembly of proteins at the replication fork
> Cdc6 and ORC bind on to DNA
> recruit Mcm2-7 (helicase)
> ring-shaped protein is opened up, dsDNA goes through the centre of the helicase
Complex is activated by an increase in CDKs
> DDK phosphorylates the helicase
> signal to recruit other proteins
> S-CDK phosphorylation then recruits more proteins, e.g. Dpb11
> Pol ε is the eukaryotic replicative polymerase for leading strand synthesis
> conformational change where the helicase opens, to unwind and rewind DNA
> Pol δ = lagging strand polymerase
G1 phase = loading phase
> no helicase activation due to low CDK levels
Linear chromosome = problem with replication termination
> DNA pol needs an RNA primer to synthesise DNA
> lagging strand synthesis (Okazaki fragments) is unable to copy the ends of a linear chromosome
Telomeres
> protect chromosomes
> repeating 6 base sequence
> telomere repeats allow the DNA ends to be extended
> can see steps being added on repetitively
Extend the template
of the 3′ end
> telomerase adds the sequence on repeatedly
> extended so lagging strand synthesis can lay down a primer
Telomerase = reverse transcriptase
> ribonucleoprotein
> RNA (TER) is complementary to the repeated sequence at the 3′ end
> TER is the template
Broken chromosomes are frequently rejoined
> natural ends are not
> have a series of proteins that recognise the repeats and coat the ends of telomeres to prevent them being found by repair factors
> shelterin proteins
> block kinase activity and DSB recognition
Telomeres and cell senescence
> as telomeres shorten you lose repeats
> no telomere repeats = no shelterin proteins
> ends are seen as breaks and cells die
> telomeres shorten with age
> loss of telomerase activity leads to cell senescence
DNA replication 2
> the origin unwinds and forms replication forks
> forks act independently
Machinery of the replication fork
> helicase is a ring-shaped protein, ssDNA is inserted through it
> primase
> ssDNA produced by the helicase is protected by single-stranded binding protein (SSB)
> synthesis of DNA by DNA pol, represented by hand structures
> attached by the clamp loader, DNA pols are also attached to sliding clamps
> mismatch repair enzymes
> as the replication fork unwinds DNA it changes the topology of the DNA
> the combination of all these proteins allows the replication fork to pass along the DNA and to coordinate the synthesis of leading and lagging strands
> lagging strand synthesis forms a loop of DNA
Machinery of the replication fork
> holoenzyme = whole of the complex
> epsilon = ε
> tau (τ) proteins are flexible
> allow pol to move around
> link core pol to the gamma complex clamp loader
> clamp loader is bound to the sliding clamp (beta subunit)
Sliding clamp
> ring structure
> processivity factor
> forms a central pore where DNA can pass
> a layer of water molecules sits between the protein and the DNA
How does the clamp get on the DNA?
> gamma clamp loader
> pentamer
> a claw-shaped protein machine to open the sliding clamp
> AAA+ ATPase
> forms a stable complex with the clamp to open the ring
> clamp loader has high affinity for particular DNA structures: the double strand/single strand interface
Replication: coordinating lagging and leading strands
> leading strand follows the helicase
> trombone model for replication
> lagging strand pol is loaded, synthesises an Okazaki fragment, the loop of DNA grows
> on completing the fragment it releases the sliding clamp and DNA
> sliding clamp remains on the DNA, where it can be a site where other proteins can bind and complete synthesis of the Okazaki fragment
> a new sliding clamp is loaded, pol rebinds at a new primer, forms a short loop that grows
> cycle continues
Mismatch repair enzymes
> natural bases that are mismatched
> if not repaired, the mutation becomes fixed within the DNA
> enzymes look for mismatches and repair them
Damage recognition
> different types of mismatches
Damage signalling
> MutH role is to introduce a nick into the DNA
> nick = site where the exonuclease is loaded
> Dam = enzyme that introduces a methyl group on to base A
> after replication the methylated A is paired with T and the polymerase has put in an unmethylated A on the new strand = hemimethylated site
> newly synthesised strand is unmethylated
> in the transient period following replication the DNA is hemimethylated, a target for MutH, which cuts the unmethylated strand = MutH cuts newly synthesised DNA
> the hemimethylated GATC site can be upstream or downstream of the mismatch
Lecture 12: prokaryotic gene expression
> structure of RNA pol is conserved across all 3 domains of life
> translation from codons to AA = ribosomes
> both RNA pol and ribosomes need to know where to start and stop with precision.
In prokaryotes transcripts are translated whilst they are being produced
> lag between initiation of transcription and appearance of active protein is short
> bacteria rely heavily on transcriptional responses to stresses and environmental changes
Transcripts are translated as they are produced
> mRNA comes out of the back of RNA pol, only a small bit at any one time is still within RNA pol
> mRNA has a start signal for ribosomes; as soon as it's visible at the back of RNA pol a ribosome can join on
> as the RNA gets longer, more ribosomes add on
> RNA is unstable so multiple rounds of transcription are needed to maintain a population of an mRNA in the cell
> steady state = 1 molecule per cell
> to increase the number of molecules = increase how often you initiate transcription
> increasing the frequency of transcription of a gene increases the number of molecules of that mRNA in a cell
> it is not how fast RNA pol moves through the gene, but how often it starts
> increasing the frequency of transcription further increases the RNA concentration still further
How to control the level of transcription of a gene
> a stronger promoter sequence initiates transcription more frequently = higher levels of RNA
> used when you don't want expression to change: expressed at high level = strong promoter
Repression of initiation
> a protein that disrupts the ability of RNA pol to begin RNA synthesis.
= lower levels of RNA synthesis
> gets in the way at the promoter
Activation of initiation
> a protein helps RNA polymerase to begin RNA synthesis = higher level of RNA synthesis
Cells sometimes control transcription after initiation
> e.g. by placing a controllable terminator between the promoter and a gene; when the terminator is active, levels of full-length RNA are low
> mechanisms such as anti-termination or attenuation can allow RNA pol to transcribe through the controlled terminators = higher levels of full-length RNA
Strong promoter
> makes lots of mRNA
RNA pol in bacteria has 2 different complexes
> core has 5 subunits
> beta and beta prime are the largest subunits
> core complex is efficient at transcription
> but can't recognise a promoter or start transcription
> core needs to associate with a sigma subunit
> forms holoenzyme
> recognises and binds to the promoter
Both pro and euk have to recognise promoters
> in pro the subunit needed for recognition of promoters associates with RNA pol before the whole complex binds to DNA
> in euk the factors form at the promoter, then the enzyme binds
When the RNA is 15 nt long the sigma subunit is pushed off
> released and can bind to another core RNA pol.
> core RNA pol stays on until the end
> this is the sigma cycle
Closed complex = RNA pol is at the promoter
> interactions like H bonds in the major groove recognise the sequence in the promoter
> ionic interactions with the backbone
> van der Waals interactions
> stays more tightly bound to the promoter than to non-specific DNA
> this is Kd
k2 = isomerisation step
> RNA pol interacts with the DNA and opens it up to ssDNA
> irreversible
> a rate constant, not an equilibrium constant
> plot: Kd on x axis (e.g. nM), k2 on y axis (per second)
> a very strong promoter does both steps efficiently: binds tightly (small Kd) and isomerises quickly (high k2)
> a weak promoter falls off quickly as there are not many interactions (high Kd) and isomerises slowly (low k2)
> Kd and k2 for a promoter are set by the promoter sequence
Optimal promoter
> first transcribed base is +1, second is +2, etc.
> there is no base 0
> start from +1 and count backwards: 10 bases back is -10 (not transcribed)
> size of RNA pol at the promoter is about 70 base pairs, e.g. -60 to +20
> -10 hexamer and -35 hexamer
> -35 works best if it's TTGACA
> numbers = % of promoters that have that base at that position
> every natural promoter differs from the consensus
> -10 is recognised by region 2 of sigma, region 4 recognises -35
> in the middle there's no consensus sequence, but the distance is important: 17 bp is best
> sigma subunit has 2 DNA binding domains and the distance between them matches a 17 bp spacer
> if the sequences are too far apart or too close, the protein has to strain or the DNA has to distort to let the protein-DNA interactions take place
Upstream element
> recognised by the alpha subunit of RNA pol
> increases affinity
> does not use the sigma subunit to bind to DNA
Start of RNA
> more than half start with A
> bubble starts and spreads until it gets to the point where you initiate transcription
> isomerisation is helped by weak base pairing
> pops open (
breathing)
> transiently single-stranded, then closes
> sigma subunit doesn't force the two strands apart
> when the DNA breathes, sigma captures a strand and stops it from coming back together
> captures the start of the bubble, which then unzips to the transcription start point
> open complex forms without a helicase, unlike eukaryotes, which use a helicase
Plug a different sigma subunit into RNA pol and it can transcribe a different set of genes
> E. coli has 6 types of alternative sigma factor, each of which regulates a different set of genes
> so E. coli can express 7 different types of sigma factor
> under conditions of rapid growth in the absence of external stresses the majority of the sigma factor molecules present in the cell are sigma-70
> some may not be expressed at all
Sigma factors compete for core RNA pol
> all types can bind to core RNA pol = competition
> the ratio of the different kinds of holoenzyme present in the cell at any given time depends on the relative amounts of each type of sigma factor being expressed and how tightly each factor binds to the core enzyme
Sigma-70
> the majority of holoenzyme present in the cell contains sigma-70
> there are low levels of holoenzyme containing alternative sigma factors, so genes that have promoters recognised by those sigma factors are likely to be expressed at a low level
Sigma-32
> recognises promoters with alternative -10 and -35 sequences
> these promoters are found upstream of a very small subset of genes that are required to help cells cope with high temperatures = heat-shock sigma factor
Sigma-54
> regulates a small set of genes involved in nitrogen metabolism.
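The competition between sigma factors for core RNA pol noted above can be sketched numerically. This is only an illustrative equilibrium model, not something from the lecture: it assumes simple competitive binding with free sigma concentration approximately equal to total sigma, and every concentration and Kd value is made up for the example (the function name `holoenzyme_fractions` is hypothetical too).

```python
def holoenzyme_fractions(sigma_conc, kd):
    """Fraction of core RNA pol bound by each sigma factor.

    Assumes simple competitive equilibrium binding with free sigma
    roughly equal to total sigma (an approximation, not lecture data).
    sigma_conc and kd map sigma factor name -> value in the same units.
    """
    # Each sigma's "pull" on core is its concentration relative to its Kd.
    weights = {name: sigma_conc[name] / kd[name] for name in sigma_conc}
    denom = 1.0 + sum(weights.values())  # the 1.0 term is free core
    return {name: w / denom for name, w in weights.items()}


# Illustrative numbers only (nM); not measured values.
kd = {"sigma70": 1.0, "sigma32": 1.0}
normal = holoenzyme_fractions({"sigma70": 700.0, "sigma32": 10.0}, kd)
heat_shock = holoenzyme_fractions({"sigma70": 700.0, "sigma32": 300.0}, kd)

for label, result in (("normal", normal), ("heat shock", heat_shock)):
    print(label, {name: round(frac, 3) for name, frac in result.items()})
```

With these made-up inputs, raising the amount of sigma-32 raises the share of core carrying sigma-32 and lowers the sigma-70 share, which is the qualitative point of the notes: the holoenzyme mix depends on both the relative amounts of each sigma factor and how tightly each binds core.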
> not related to sigma-70 at all
> unlike all other sigma factors, sigma-54 recognises promoters with core promoter elements at -12 and -24 instead of -10 and -35
> only sigma factor unable to form an open complex without outside help = stays as a closed complex until activated by an ATP-dependent transcription activator
Normal temperature = sigma-70 concentrations are high and sigma-32 are low
> sigma-70 dependent genes are expressed and heat shock genes aren't
Heat shock = proteins start to unfold
> genes for heat shock proteins have promoters that can only be recognised by holoenzyme containing sigma-32
> sigma-32 starts to compete with sigma-70
Regulation of transcription
> repression/negative regulation: protein acts by turning a promoter off
> activation/positive regulation: turning a promoter on
> E. coli contains at least 132 transcription factors, i.e. activators and repressors
> approx 70% of sigma-70 dependent promoters are regulated by at least one repressor, 50% by at least one activator
> many regulators have both repressor and activator functions
> many transcription factors are global regulators = a single transcription factor protein can act at multiple promoters
Go over year 1 notes: CAP and lac repressor
> sequence-specific DNA binding proteins
> what targets a particular regulator to a promoter is the presence of a binding site in the promoter for that protein
When the lac promoter has no repressor, the level of transcription is still low = OFF
> requires activation by CAP to turn on
> lac promoter is a weak promoter
> differs from the consensus promoter
> RNA pol binds to lacP1 weakly and falls off a lot, can isomerise to open complex at a moderate rate
> in the absence of other factors transcription from lacP1 will be low (off)
Activation of transcription
> activators turn weak promoters into strong ones
> stabilise binding of RNAP to the promoter = decrease Kd
> sometimes speed up isomerisation = increase k2
Transcription activation by CAP at lacP1
> binding site recognised by CAP protein
> 20 bp inverted repeat
> CAP is a homodimer
> each monomer binds to one half of
the inverted repeat
> CAP binding site at lacP1 is pretty close to consensus
> most promoters regulated by CAP contain binding sites that differ at one or more positions from this consensus, so CAP binds a little less tightly than to the consensus
> RNA pol is stabilised by protein-protein contact
> weak interactions at -35, but strong protein-protein interactions with CAP, and CAP makes strong protein-DNA interactions
> what affects the ability of the protein to activate transcription = a loop
> AA in the loop make specific protein-protein interactions with the back end of RNA pol
> loop defines the activating region of the monomer
At different promoters the CAP binding site is centred at -41, -61, -71, -81
> moves around the turn of the helix
> if moved by 5 bases = half a helical turn, so the protein is distant from RNA pol and on a different face
> linker allows the RNA pol back subunit to sample different locations along the DNA on the same face of the helix
> the need to make protein-protein interactions often makes the function of activator binding sites very sensitive to location
> prokaryotic activators usually bind within 100 bp of +1
Repression of transcription
> blocks transcription of weak promoters
> generally sequence-specific DNA binding proteins that recognise a site that overlaps the site that RNA pol wants to bind to
> binds and blocks binding by RNA pol: take the lac promoter and bind repressor, then RNA pol can't bind
> strong or activated promoter: bind repressor in the RNA pol site and it doesn't matter that activators are stabilising RNA pol, since it can't get on to the DNA
> strong promoter repression allows low level background transcription.
> repressor protein isn't covalently bound to the DNA and occasionally dissociates
> promoters can 'leak'
> repressor binding site occupies about 20 bases from +1
> dimer binds to the operator sequence, tetramer overall
Lac operon contains 3 binding sites
> O1 = operator 1, the strongest binding site, binds tightest to lac repressor
> O1 must be bound to repress transcription
> 400 bp from O2
> promoter with only O1 = 50 fold repression
> need O1 and O2
> tetrameric lac repressor can bind to two sites
> needs a loop in the DNA to bind to 2 sites on the DNA at once
> lac repressor binds more tightly = less leakiness
Most pro activators and repressors are controlled by regulating how tightly they bind to their specific target sites
> inducer diffuses in, binds to repressor, causes a change, repressor falls off the DNA
> the smaller the number, the tighter the binding; value increased to 10^-11
> repeated with non-specific DNA: lower affinity, no difference with or without inducer
> inducer only changes affinity for its target site
> IPTG turns lac repressor dependent genes on
Both CAP and Lac repressor are subject to allosteric regulation by the small molecules that they bind
> both have small molecule binding sites and DNA binding surfaces (helix-turn-helix motifs)
> binding changes the distance between the helix-turn-helix motifs
> the small molecule signals have opposite effects on
the binding specificity of CAP and Lac repressor
> CAP = the form bound to the small ligand is the high specificity form that's able to locate the target sequence and activate
> lac repressor = binds the small molecule and reverts to the lower specificity form
Lecture 3 of prokaryote gene expression
Transcription termination
> efficient and accurate transcription termination is essential in the crowded genomes of prokaryotes
> little space between genes in prokaryote genomes
> in euk genomes only about 2% is genes
> 2 genes in the same direction can be controlled by different promoters
Problems bacteria have to overcome to terminate transcription (also applies to euk)
> multiple interactions stabilise the transcription elongation complex and these must be overcome in order to terminate transcription
> RNA pol = processive = every time RNA pol adds a new nt it can fall off the DNA or add another nt; the chance of continuing a step forward is higher than the chance of falling off
> RNA-DNA hybrid is 8/9 nt long
> if RNA pol falls off then the RNA does too
> truncated RNA leads to truncated proteins = dangerous for the cell, might lack an important domain
> needs high affinity for all DNA and must not fall off
> strands get pulled apart: one goes to the active site (template), the other goes around RNA pol and is ignored
Energy to hold RNA pol in place
> protein-DNA interactions in the binding site
> RNA-DNA hybrid, and RNA pol interacts with the hybrid
> interactions of the RNA with the RNA exit channel/binding site
> to release RNA pol from the DNA these interactions must be broken
Transcription termination: 2 main types in bacteria
> intrinsic termination
> rho-dependent termination
Intrinsic termination
> only needs RNA pol and the DNA sequence being transcribed
> run of Ts in the coding strand
> DNA sequence indicates RNA sequence.
> intrinsic terminators can be seen in the DNA but function in the RNA
> formation of the RNA hairpin within the RNA binding site disrupts protein-RNA interactions and destabilises the weak A:U hybrid
> run of Us
> hairpin is rich in G and C residues
> folding of the hairpin happens in the RNA binding site
> folding invades the exit channel and wedges it apart
> opens up at exactly the same time
> RNA-DNA hybrids run antiparallel, 5′ to 3′ on each side of the hairpin
> spacing between hairpin and Us is important
> not all hairpins and runs of Us mean termination, e.g. larger distance between hairpin and Us = no termination
Rho-dependent termination
> rho is a helicase, burns ATP, assembles as rings of 6 monomers (hexamers), is a motor
> pulls the RNA out of a paused RNA pol
> rho moves at the same speed as RNA pol
> pulls RNA out of RNA pol
> alternative model proposes that rho is attached to RNA pol instead of chasing it, still pulls RNA out
> rho targets Rut (rho-utilisation) regions in the RNA of rho-dependent genes
> rho is a sequence-specific RNA binding protein
> Rut region = specific region where rho binds
Cells sometimes control transcription after initiation
> e.g. placing a controllable terminator between the promoter and a gene
> result = when the terminator is active, levels of full-length RNA are low
> can be controlled by antitermination, e.g. by N, a viral protein, which allows RNA polymerase to 'read through' transcription terminators
> N is the product of an 'early gene'
> t is an intrinsic terminator
Control of transcription by anti-terminator proteins
> N is a protein that binds both to RNA pol and to a sequence in the RNA called the Nut box (N utilisation)
> nut is upstream of the terminator
> N binds to the nut box in the RNA and to RNA pol
> N-RNA pol complex transcribes past the terminator
> hairpin can't form because not enough RNA has emerged from the elongated exit channel
> N anti-terminates by altering the physical relationship between RNA pol and the RNA components, so the terminator can't form the weak hybrid and the hairpin at the same time
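The sequence features of an intrinsic terminator described above (a G/C-rich hairpin followed closely by a run of Us) can be sketched as a simple scan of an RNA sequence. This is a toy illustration under stated assumptions, not a real terminator predictor: the function name, the perfect-match stem requirement and all thresholds (stem length, loop size, spacer, U-run length, G/C fraction) are hypothetical choices made for the example.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Reverse complement of an RNA string (A, C, G, U only)."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def find_intrinsic_terminators(rna, stem_len=6, min_loop=3, max_loop=8,
                               max_spacer=2, min_u_run=4, min_gc=0.6):
    """Return (stem_start, u_run_start) for each candidate terminator.

    A candidate is a perfectly paired, G/C-rich stem of stem_len bases,
    a loop of min_loop..max_loop bases, then a run of at least min_u_run
    Us starting no more than max_spacer bases after the hairpin.
    All thresholds are illustrative assumptions.
    """
    hits = []
    for i in range(len(rna) - stem_len + 1):
        left = rna[i:i + stem_len]
        gc_fraction = sum(base in "GC" for base in left) / stem_len
        if gc_fraction < min_gc:
            continue  # stem is not G/C-rich enough
        right_needed = reverse_complement(left)
        for loop in range(min_loop, max_loop + 1):
            j = i + stem_len + loop
            if rna[j:j + stem_len] != right_needed:
                continue  # no hairpin with this loop size
            stem_end = j + stem_len
            # The U run must follow the hairpin closely.
            for spacer in range(max_spacer + 1):
                u_start = stem_end + spacer
                if rna[u_start:u_start + min_u_run] == "U" * min_u_run:
                    hits.append((i, u_start))
                    break
    return hits


# Hairpin (GCCGCC / UUCG loop / GGCGGC) directly followed by a U run.
close = "CC" + "GCCGCC" + "UUCG" + "GGCGGC" + "UUUUUU" + "AA"
# Same hairpin, but the U run is pushed 6 bases away.
far = "CC" + "GCCGCC" + "UUCG" + "GGCGGC" + "ACACAC" + "UUUUUU" + "AA"

print(find_intrinsic_terminators(close))
print(find_intrinsic_terminators(far))
```

The second sequence is there to mirror the point in the notes that spacing matters: the same hairpin and the same run of Us no longer count as a terminator once the distance between them is too large.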
Riboswitches and attenuation
> riboswitch = RNA sequences that fold into structures that bind small metabolites; the conformation of the riboswitch changes when the metabolite is bound
> riboswitches can regulate transcription by controlling the formation of an intrinsic terminator upstream of coding sequences
> other riboswitches can control translation
> a single riboswitch does one thing or the other, e.g. transcription or translation, not both
> leader region has sequences which form the riboswitch
> if region 3 forms a hairpin with region 4, and it's upstream of Us = terminator
> if 3 doesn't base pair with 4 = no terminator hairpin
> ligand is often an AA
> only want the gene turned on if no AA is present = 2 and 3 form a hairpin and there is no termination
> if AA is present: different shape of riboswitch, allows 3 and 4 to base pair, forms terminator, premature termination = attenuation
Control of transcription termination by ribosome-mediated attenuation
> attenuation by leader peptide formation controls expression of at least 5 E. coli AA biosynthetic operons, e.g. trp and his
> in the presence of excess tryptophan, expression of the trp operon is decreased
> mechanism only works if you want to detect the presence of an AA
> in pro, ribosomes can start translation of mRNA while the RNA is being produced; not in euk
> senses trp by whether ribosomes can include trp in the peptide
> the 4 regions can anneal in certain ways
> position of the ribosome controls formation of the terminator
Prokaryote gene expression 4
Roles of translation and translational control in prokaryotic gene expression
Transcription controls more expression than translation
> time and efficiency
> transcription and translation are coupled in bacteria, so the time delay between turning a gene on at the promoter and producing active protein is short
> pro mRNA is short lived
> control at transcription = less energy, don't make useless RNA
Translation initiation
> ribosome binds to mRNA, occupies 40 nt
> sequence-specific interaction between mRNA and ribosome
> the enzymatic part of the ribosome is RNA.
> 16S ribosomal RNA
> specific binding
> strength of the ribosome binding site (RBS) differs between genes; all contain slightly different ribosome binding sites
> for most genes in E. coli there's no control over frequency of translation beyond the intrinsic control determined by the sequence of the RBS
Translational repressor proteins
> work by steric hindrance
> bind to mRNA and prevent ribosomes from binding
> e.g. E. coli CsrA protein is a homodimeric translational repressor that inhibits translation of the hfq gene
> RNA sequence-specific binding protein
Riboswitches
> region 3 of the riboswitch is complementary to the region of RNA that contains the ribosome binding site
> ribosome can't recognise double-stranded RNA, it needs to be single-stranded RNA
Small regulatory RNAs can control access of ribosomes to the RBS
> small RNAs can turn translation on
Producing multiple polypeptides
> if a ribosome had to start from the 5′ end it could only produce one polypeptide
> making tryptophan needs 5 different enzymes
> not making one long protein that gets chopped up; each protein is synthesised separately because each has its own start codon
> polycistronic mRNA and operons allow you to control expression of multiple related genes by making a single regulatory decision at a single transcription control region
Organisation of genes and operons in bacteria
> in some operons the distance between two genes on the mRNA is long, so translation of each is independent of the other
> short = little space between stop and start codons gives coupled translation
> the same ribosome continues to translate the second gene and then falls off
Lecture 16: Mechanism of euk transcription I
> transcription = DNA to RNA
> splicing = unique to euk
> compartmentalisation in euk: translation in cytoplasm, transcription in nucleus
> number of protein coding genes doesn't correlate with organism complexity
> sophisticated gene regulation is responsible for complexity
Importance of transcription regulation
> control of transcription is essential for all biological processes, e.g. metabolism
> human genome is mostly
non-coding; our genome has a huge number of regulatory sequences that control transcription of these genes
> transcriptional misregulation causes many diseases, e.g. cancer, diabetes
In euk there are at least 3 RNA pols
> only one in pro
> mitochondria have their own genome: mtRNAP and mt ribosomes
RNA pol II
> most complex
> transcribes about 20000 genes, makes mRNA
Questions about Pol II
> how does Pol II know where to start?
> when is it recruited to the genome?
> how does Pol II separate the DNA strands?
> how does Pol II transcribe long genes?
> does it make mistakes, and how are they fixed?
> how does Pol II transcribe through chromatin?
> how does it stop?
> how is the amount of mRNA made by Pol II controlled?
Transcription cycle = initiation, elongation, termination
Initiation
> pol binds DNA, DNA is opened/melted to allow access to the template strand, RNA synthesis begins
> needed at every mRNA gene = RNA pol II, general transcription factors (GTFs), promoter DNA
> needed at specific mRNA genes = sequence-specific transcription factors, regulatory elements, coactivators
> GTFs = TFIIA, TFIIB, TFIID, TFIIE, TFIIF, TFIIH
> TBP is an important subunit of TFIID
> most of these are protein complexes
Core promoter DNA
> contains DNA sequence elements that are recognised by GTFs
> forms the pre-initiation complex (PIC) from which mRNA synthesis starts
What do the GTFs do?
> duplicate the function of the sigma subunit in pro
> promoter recognition
and promoter opening
> order of addition
> more functions (don't revise, just get an idea)
Organisation of promoters
> core promoter contains sequence elements that the general transcription factors can recognise in order to start transcription
> BREu and BREd
> initiator element overlaps the start site
> at least 6 elements along the core promoter that GTFs will recognise to recruit Pol II
> don't find all elements at every gene
> redundancy between elements
> core promoters direct accurate transcription from a specific location on the genome
Beginning transcription
> some GTFs bind to core promoter elements directly
> some bind to other GTFs and RNAP
> Pol II and GTFs combine at the promoter to form the pre-initiation complex (PIC)
> once assembled the PIC undergoes multiple isomerisation steps
> opens DNA to make a transcription bubble
Promoter-proximal pausing
> in most metazoan organisms, Pol II will initiate at a promoter, transcribe for 30-60 bp and then pause
> occurs in almost all genes
> pausing is caused by 2 Pol II-binding factors; when bound to Pol II they stop it
> DSIF = DRB-sensitivity-inducing factor
> NELF
= negative elongation factor
> factors bind to Pol II and cause it to stop about 50 bp downstream of the start site
P-TEFb
> causes signal, phosphorylates different components
> pausing is relieved by the kinase P-TEFb
> recruited by TFs and other cofactors
> phosphorylation causes NELF to dissociate but not DSIF
> DSIF becomes a positive elongation factor
Function of pausing
> a quality control checkpoint: ensures mRNA is properly capped
> quick response = allows for a very fast activation of transcription
> chromatin clearance = keeps the promoter free of nucleosomes
> regulation = the amount of mRNA made is controlled by pause release rather than initiation
Elongation
> most RNA is made here; Pol II must stay bound to the DNA for the entire length of the gene or the mRNA will be incomplete, so it must be processive
> must make RNA without errors, which would introduce mutations
> errors can be detected and repaired
Productive elongation requires additional factors
> DSIF enhances processivity
> TFIIS enables RNAP2 to perform proofreading
> SPT4 and SPT5 enhance processivity
> TFIIS removes misincorporated nt and resolves backtracking by proofreading
Mechanism of euk transcription II
Activator
> activates transcription
> binds to a specific place on the genome
> can recruit proteins to the genome
> coactivators cause transcription, e.g. by recruiting the pre-initiation complex
> coactivators = SWI/SNF, Mediator complex, SAGA complex, NuA4 complex
Mediator coactivator
> discovered in fractionated yeast extracts by its ability to stimulate transcription
> mixing RNAP2, GTFs, DNA and NTPs did not produce much RNA until Mediator was added
> Mediator stimulated transcription more strongly in the presence of a TF
> Mediator interacts directly with transcription factors
> in vitro transcription experiment: 1 tube, two templates
> Mediator is a huge and modular protein complex: 26 proteins, 1.4 MDa
> split into head, middle and tail modules
> middle and head interact with RNAP2
> Mediator interacts with multiple RNAP2 surfaces and with GTFs,
but not the promoter DNA
> Mediator recruits and stabilises the PIC to stimulate transcription
> breaking these interfaces = birth defects
TFs: composition and mechanism
> bind upstream of the core promoter
> TFBS are also found downstream of the gene and far away from the gene (enhancers, only exist in metazoa)
> located: promoter proximal, adjacent to the core promoter (often called upstream activating sequences); distal, downstream of the expressed gene; distal, within the gene; distal, far upstream of the gene
> distal transcription factor binding sites are called enhancers
> enhancers can be > 1 Mb away from the gene
> enhancers and TFs can act in combinations
> given their lack of proximity, identifying which genes are regulated by a given enhancer is a major challenge in the field
TFs: activators
> activate transcription of specific target genes
> have a modular structure but not all modules are strictly required
> DBDs bind sequence-specific genomic DNA
> SSDs sense external signals, either by binding ligands and/or by post-translational modifications, e.g. phosphorylation
> ADs interact with coactivators and bring them to the genome to activate transcription
> don't always find all 4 domains in a TF
DBD
> the defining domain of all transcription factors
> TFs contain a variety of different types of DNA binding domains that recognise specific DNA sequences
> bind a specific sequence, a consensus sequence
> can still bind similar sequences: weaker conservation to the consensus results in weaker binding
> but these sequences are poor predictors of actual binding to the genome in vivo, as chromatin structure can block TF binding
> sequence is palindromic
> 2-fold symmetry = the TF that binds is likely to have 2-fold symmetry itself
> protein-protein interactions are often required for DNA binding, as most TFs can't bind as monomers
Activation
domains (ADs)
> also known as TADs (trans-activation domains)
> often highly unstructured, small protein domains with a compositional bias
> often contain essential bulky hydrophobic residues (W, F, L)
> difficult to identify by sequence, conservation between ADs is not obvious
> ADs interact directly with coactivators, enabling the activator to stimulate transcription at a specific genomic location
> ADs are highly promiscuous in their interactions and often interact with several coactivators
So peculiar and short Act