Genetics Test 2 Study Guide PDF
Summary
This document is a study guide for a genetics test and covers DNA replication, the Meselson-Stahl experiment, and other related concepts in molecular biology. This document is a sample of the format of exam papers and contains the key aspects of such papers; questions are not included.
Full Transcript
Genetics Test 2 study guide

**[DNA replication]**

2 things make replication challenging

1. Can't start from nothing -- need a primer and a template
2. Need to replicate both strands at the same time, but can only replicate in the 5'→3' direction

**Meselson Stahl Experiment**

3 models of replication being tested.

Conservative -- DNA replication makes a new DNA molecule that has 2 entirely new strands, with the original molecule keeping both its original strands.

Semi-conservative -- each parental DNA strand remains intact, and serves as a template for the daughter strand to be produced.

Dispersive -- DNA molecules are mixtures or hybrids of parental and daughter DNA, resulting in patchwork strands.

E. coli was grown in media containing the heavy isotope ^15^N. The bacteria take up the isotope and incorporate it into their DNA, making their DNA denser than the DNA of bacteria grown without ^15^N present. After allowing the bacteria to grow for several generations in the ^15^N, they switched back to normal media containing ^14^N, and allowed the bacteria to continue to grow. They removed bacteria at set intervals to see how the density of the DNA changed from one generation to the next.

To do that, they used **density gradient centrifugation**. This technique uses cesium chloride to form a density gradient, which allows molecules to be separated by their density. During centrifugation, each molecule moves through the gradient until it reaches the point where its own density matches the density of the surrounding solution. Denser molecules therefore settle lower in the tube, and less dense molecules settle higher. We end up with bands sorted by density, with the most dense at the bottom and the least dense at the top.

DNA isolated from cells at the start of the experiment ("generation 0," just before the switch to ^14^N) produced a single band after centrifugation. This result makes sense because the DNA should have contained only ^15^N at that point.
DNA isolated after one generation also had 1 band, but it was shifted higher. This indicates a hybrid form of DNA that is neither entirely ^15^N nor entirely ^14^N, which immediately rules out the conservative model of replication. In the second generation they saw 2 bands -- one was in the same position as the intermediate band from the first generation, while the second band was higher, equal to that of a ^14^N-only band. This allowed them to identify replication as being semi-conservative, because if the dispersive model was correct, you'd never have a band equivalent to ^14^N only, because there would always be some ^15^N present in the daughter strands.

**DNA polymerase** -- synthesizes new DNA strands by adding individual nucleotides in the 5'→3' direction. Some also contain 5'→3' and 3'→5' exonuclease activity.

**Exonuclease activity** -- the ability to remove an incorrect nucleotide that was added in error by the polymerase. Breaks a phosphodiester bond in the sugar phosphate backbone of DNA, and removes the terminal nucleotide from the 3' end. Polymerases can only move back 1 nucleotide.

**3 important things that have to line up to ensure correct nucleotide is added**

1. The 3' hydroxyl group at the end of the primer
2. The alpha phosphate on the incoming nucleotide
3. The catalytic core of the polymerase

The two ends that need to be connected must line up perfectly with each other and with the part of the polymerase that forms the phosphodiester bond. If all 3 of those things line up perfectly, that means you have the correct nucleotide and you get phosphodiester bond formation.

When the incoming nucleotide is mispaired with the template nucleotide, the finger domains of the polymerase only partially close, the 3 key components don't line up correctly, and that reduces the rate of bond formation.
Reducing the rate of bond formation in turn reduces the likelihood of that nucleotide being incorporated.

**Processivity** -- refers to the number of nucleotides added each time the enzyme binds DNA

**Fidelity** -- refers to how accurate the replication is

**Leading strand synthesis** -- synthesized in the same direction that the replication fork is moving. Requires 1 primer at the start of synthesis.

**Lagging strand synthesis** -- synthesized in the direction opposite to replication fork movement. Requires synthesis to be carried out in fragments called Okazaki fragments of roughly 1,000-2,000 bp in prokaryotes. Uses a new primer for each fragment.

**3 main steps in lagging strand synthesis**

1. Remove the RNA primer
2. Replace the primer with a DNA sequence
3. Link each DNA fragment together

**[Eukaryotic replication ]**

3 key differences compared to prokaryotic replication

1. Genomes are typically much larger
2. Eukaryotic genomes consist of multiple chromosomes
3. DNA molecules are linear

**Origin recognition complex (ORC)** -- the origin is not a well defined sequence, but is usually AT rich like in prokaryotes. The origin recognition complex binds the initiation sequence, and is critical for allowing the assembly of the pre-replication complex.

**Pre-replication complex (pre-RC)** -- contains several proteins that are important to know -- **CDC6, CDT1**, and the **MCM2-7** complex. ORC recruits CDC6 to form a platform that MCM2-7 can be loaded onto with the help of CDT1. ORC, CDC6, and CDT1 are all required for the MCM protein complex to bind to the origin during G1 phase. Once the pre-RC is formed, it's activated via phosphorylation by **CDK2** and **DDK**. The activation of the pre-RC allows for the assembly of the replisome.

**Replisome** -- replicative complex containing helicase, primase and polymerase

**3 main polymerases in eukaryotes - α, δ, ε**

**Replication protein A (RPA)** is required for those polymerases to bind to the DNA.
RPA assembles on ssDNA after helicases unwind it. RPA prevents nucleases from cleaving ssDNA, blocks formation of hairpin structures, and prevents the separated strands from re-pairing. Also coordinates assembly of the primosome.

**DNA Pol α** -- exists as a complex with primase. Because of this association with primase, it's the only eukaryotic polymerase that can start synthesis de novo. It synthesizes an RNA primer (primase) then switches to adding DNA nucleotides. Pol α is specialized to this task, as it has limited processivity and no 3'→5' exonuclease activity. Does this on both leading and lagging strand.

**DNA Pol ε** -- bulk of leading strand synthesis, and contains 3'→5' exonuclease activity. Locked into place on DNA by PCNA.

**DNA Pol δ** -- major polymerase for lagging strand synthesis -- elongates the RNA-DNA primers produced by the primosome. RPA removes the RNA primer, and then Pol δ adds DNA nucleotides where the primer was. Also contains 3'→5' exonuclease activity.

**PCNA** -- Proliferating Cell Nuclear Antigen -- loads Pol ε onto the DNA. A member of the sliding clamp family of proteins that the prokaryotic β-clamp is part of. PCNA is a DNA clamp that increases processivity by encircling the DNA and acting as a scaffold to recruit additional replication proteins.

**Joining of Okazaki fragments in eukaryotes**

Pol δ extends RNA-DNA primers, and once it reaches a previous RNA-DNA primer, it recruits the ssDNA binding protein RPA to come and flip out the RNA, leaving a flap of RNA primer attached to the DNA. Flap endonuclease then comes and cleaves the flap of RNA primer, allowing Pol δ to continue. When Pol δ reaches the next fragment of DNA, it drops off, allowing ligase to bind and seal the nicks.

**Eukaryotic DNA ligase** -- contains a replication factor targeting sequence (RFTS) at the N-terminus, which is used to recruit it to sites of DNA replication.
Also contains a nuclear localization sequence, and 3 functional domains -- a DNA binding domain, a nucleotidyltransferase catalytic domain, and an oligonucleotide binding domain. In order to ligate the Okazaki fragments together, the ligase progresses through three steps:

1. Addition of an adenosine monophosphate (AMP) group to the enzyme, which is referred to as adenylation. This occurs specifically at a lysine in the active site of the enzyme
2. AMP is transferred from the active site to the donor nucleotide on the DNA
3. That results in the phosphodiester bond between the 5' phosphate of the donor and the 3' hydroxyl of the acceptor, which ultimately seals the nick

**DNA replication-coupled nucleosome assembly** -- nucleosomes ahead of the replication fork have to be disassembled to allow for the movement of the replication machinery, and then reassembled once the replication machinery passes. Since histones are highly basic proteins, they need to be accompanied by acidic proteins to prevent aggregates from forming during replication. These proteins are called histone chaperones. 3 key histone chaperones are **ASF1**, **CAF-1**, and **FACT**.

**End replication problem** -- the DNA at the very end of a linear chromosome can't be fully copied in each round of replication, resulting in the gradual shortening of the chromosome. This happens because lagging strand synthesis (see replication section) requires the use of multiple primers. When the replication fork reaches the end of the chromosome, there is a short stretch of DNA that is too small for the primer to bind to, and that sequence of DNA does not get replicated.

**Replication at the telomeres** -- all of the various 3D structures present at the telomere (shelterin complex, T-loop, G-quadruplexes) make replication difficult. Additionally, since telomeres are at the end, you only have a replication fork coming from 1 direction, not both.
Two proteins involved in replication at the telomere -- **WRN** and **BLM** -- both are RecQ helicases that localize at the telomere during S phase by recognizing proteins in the shelterin complex. WRN and BLM then remove telomeric secondary structures.

**[DNA sequencing]**

**Sanger sequencing** -- chain termination sequencing -- used for short stretches up to 900 bp in length

Several ingredients mixed together for Sanger sequencing

1. A DNA polymerase
2. A primer to start the polymerase
3. The 4 normal nucleotides (dNTPs)
4. The template strand we want to sequence
5. Dideoxynucleotides marked with different color dyes.

**Dideoxynucleotides** -- lack a hydroxyl group on the 3' carbon of the sugar ring, which normally serves as a hook allowing the new nucleotide to be added to an existing chain. Once a dideoxynucleotide is added to a sequence, no additional nucleotides can be added. This is repeated over thousands of cycles so that you end up with fragments terminated by a dideoxynucleotide at every position in the sequence. The fragments are then run through capillary gel electrophoresis to separate them by size. As each fragment passes through the gel, it gets hit by a laser that reads the color dye present, thus telling you the nucleotide present at that position of the sequence.

**Next-gen sequencing** -- not 1 single assay, but a category of sequencing technologies. Advantages over Sanger sequencing include higher sensitivity, higher speed (because they're highly parallel), and lower cost. Steps in next-gen sequencing -- construct the library, PCR amplify your library, then ligate the library to an adapter sequence that contains a barcode. The library is then added to a flow cell with millions of wells that grab onto the adapter sequence. Dye labeled nucleotides are added to the sequence 1 nucleotide at a time, and an image is taken after each nucleotide is added.

**[Transcription ]**

DNA → RNA. Genes are transcribed into pre-mRNA.
Introns are removed, and exons joined to form mature mRNA.

**Template strand** -- used to produce a complementary anti-parallel RNA strand

**Non-template strand** -- coding strand that has the same sequence as the RNA being produced (with T in place of U). Template and non-template only refer to a specific gene being transcribed. The template strand of one gene can be the non-template strand of another, and vice versa.

**3 phases of transcription** -- initiation, elongation and termination

**[Transcription in prokaryotes ]**

Initiation -- need 2 things, a promoter and a polymerase.

**Promoter** -- sequence upstream of a gene that tells the RNA polymerase where to bind to start transcription. No 1 single sequence, but there are conserved sequences within the promoter.

**Transcription start site** (+1) is the DNA base that the first RNA nucleotide synthesized will be paired with.

**TATA box** (-10 sequence) -- 6 bp sequence roughly 10 basepairs upstream of the start site, with a TATAAT consensus sequence.

**-35 sequence** -- TTGACA consensus sequence. -10 and -35 are important for the σ (sigma) subunit of RNA polymerase to bind.

**[Transcription in Eukaryotes]**

RNA Pol I -- rRNA

**RNA Pol II -- mRNA, snRNA, LncRNA, miRNA**

RNA Pol III -- tRNA

**Eukaryotic promoters** -- CAAT box (~ -80 to -90) increases promoter efficiency, TATA box (-30) surrounded by GC box (GC rich sequence), and CAP site (+1), which is the transcription initiation sequence.

**Transcription factors** -- proteins that control the rate of transcription of a gene. Can control 1 or multiple genes, can increase or decrease expression, and can work alone or in a complex. Ultimately ensure that the correct genes are being expressed in the correct amount, in the correct space and at the correct time. For a protein to be a transcription factor, it must bind DNA directly.
To be able to activate transcription, the transcription factor must have a DNA binding domain, allowing it to interact with DNA, and a transactivation domain, allowing it to recruit additional proteins involved in the activation of transcription.

**General transcription factors** help position Pol II on the promoter, and are necessary for all transcription to occur. All other transcription factors that are not general transcription factors have specific subsets of genes that they regulate.

Three places a transcription factor can bind -- promoter, enhancer and silencer

**Transcriptional coactivators** -- bind to transcription factors and increase the rate of transcription

**Transcriptional corepressors** -- bind to transcription factors and decrease the rate of transcription

**Enhancers** -- no set sequence and no set distance from promoter. Transcription factors bind to enhancers and increase activity of RNA polymerase at the promoter

**Silencers** -- same as enhancers, except they suppress activity of RNA polymerase at the promoter

Activator proteins bind to enhancers, repressor proteins bind to silencers.

![A diagram of a protein Description automatically generated](media/image2.png)

Cohesins facilitate bending in the DNA to bring the enhancer or silencer near the promoter so that proteins bound there can interact with the promoter. The resulting self-interacting stretch of DNA is referred to as a **Topologically Associating Domain (TAD)**. These are important for regulating gene expression through enhancer/silencer and promoter interactions.

**Insulator sequences** -- regulate the interaction of enhancers and silencers with promoters. Help prevent enhancers or silencers from making contact with the wrong promoter, or with the right promoter but at the wrong time.
**Pre-initiation complex** -- RNA Pol II along with the general transcription factors TFIID, TFIIB, TFIIE, TFIIF and TFIIH. Don't need to know the function of each general transcription factor, just know the following...

**Eukaryotic transcription initiation** -- starts when general transcription factors bind to the promoter to help RNA Pol II bind. Once the pre-initiation complex is formed, the DNA is destabilized, leading to promoter melting, where the DNA duplex is separated. The template strand is inserted into a groove on RNA Pol II towards the active site. A domain on RNA Pol II then clamps down onto the DNA when the first nucleotide reaches the active site. Short transcripts are produced as part of **abortive initiation**.

For **sustained transcription** -- RNA Pol II must be phosphorylated at the C-terminal domain (CTD) to initiate **promoter escape**. Following promoter escape, we have **pause-release**, where short stretches are synthesized followed by pausing. That pause happens because 2 proteins, **DSIF** and **NELF**, bind to RNA Pol II and cause it to stop. This is followed by **CDK9** phosphorylating DSIF and NELF, which causes them to fall off of RNA Pol II, allowing for **productive elongation.** This pause-release occurs so that genes that need to be expressed at the same time, or in a certain order, can have their transcription coordinated in a precise manner.

**Mediator complex** -- mediates interaction between regulatory transcription factors, polymerase and enhancer, silencer or promoter. Also phosphorylates the CTD on the polymerase, allowing for transcription elongation.

**FACT** -- facilitates chromatin transcription -- disassembles nucleosomes in front of the RNA polymerase by removing H2A/H2B dimers, which loosens the DNA wrapped around the nucleosome so that the polymerase can transcribe through it. FACT then reassembles the nucleosome behind the polymerase.

**Eukaryotic transcription termination** -- no specific termination sequence.
RNA Pol II typically transcribes way beyond the end of the gene, but the oversized transcript gets **cleaved to produce the correct size pre-mRNA**. The extra transcript that gets produced beyond the end of the gene gets digested by a 5' exonuclease. That exonuclease eventually catches up to the polymerase that's still synthesizing, and causes the polymerase to fall off, stopping transcription.

**Cleavage of transcript to produce pre-mRNA** involves 4 important protein complexes -- **CPSF**, **CSTF**, **CF1** and **CF2**. CSTF binds a GU rich region, and CPSF binds an AAUAAA sequence and cleaves about 10-30 nucleotides downstream of it, but upstream of where CSTF is. CSTF stimulates the activity of CPSF. CF1 influences where the cleavage occurs, and CF2 interacts with the CTD on RNA Pol II to carry out an unknown function.

Once cleavage is complete, we have our pre-mRNA, and it needs to be processed into mRNA. The pre-mRNA consists of 3 primary regions:

- the **5' untranslated region (5' UTR)** -- a sequence of nucleotides at the 5' end of the mRNA that does not encode any of the amino acids of the final protein product
- the **protein coding region** -- made up of the codons that specify the amino acid sequence of the protein
- the **3' untranslated region (3' UTR)** -- a sequence of nucleotides at the 3' end of the mRNA that is not translated into amino acids. This area affects the stability of the mRNA and helps regulate the translation of the protein-coding sequence.

**Processing steps of mRNA** -- capping at the 5' end, addition of poly(A) tail at the 3' end, and splicing to remove introns. The C-terminal domain of RNA Pol II is important for mRNA processing. Processing proteins get recruited to the C-terminal domain so that they're in place ready to process RNA as soon as it emerges from the polymerase.
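The cleavage rule described above (CPSF binds AAUAAA, and the cut falls about 10-30 nucleotides downstream) can be sketched in a few lines of Python. This is only an illustration: the function name `find_cleavage_windows` is made up for this guide, measuring the 10-30 nt offset from the 3' end of the signal is an assumption, and real site choice also depends on CSTF and CF1, which the sketch ignores.

```python
def find_cleavage_windows(pre_mrna, signal="AAUAAA", min_offset=10, max_offset=30):
    """Locate each AAUAAA signal and the window where cleavage could occur.

    Returns a list of (signal_start, window_start, window_end) tuples using
    0-based positions. The window is the half-open slice
    [window_start, window_end), covering offsets min_offset..max_offset
    counted from the 3' end of the signal (an assumption of this sketch).
    """
    # Accept DNA-style input by converting T to U.
    rna = pre_mrna.upper().replace("T", "U")
    hits = []
    start = rna.find(signal)
    while start != -1:
        signal_end = start + len(signal)
        # Clip the window so it never runs past the end of the transcript.
        window_start = min(signal_end + min_offset, len(rna))
        window_end = min(signal_end + max_offset + 1, len(rna))
        hits.append((start, window_start, window_end))
        start = rna.find(signal, start + 1)
    return hits


# Toy transcript: 2 nt, the signal, a 40 nt spacer, then a GU-rich stretch.
toy = "GG" + "AAUAAA" + "C" * 40 + "GUGUGU"
windows = find_cleavage_windows(toy)
```

For the toy transcript, the signal starts at position 2 and the candidate cleavage window sits upstream of the GU-rich stretch, matching the CPSF/CSTF arrangement described in the notes.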
**Capping at the 5' end** -- a methylated guanosine (7-methyl-G) gets linked to the phosphates at the 5' end of the mRNA. The cap helps protect the 5' end of the mRNA from degradation by nucleases and helps to position the mRNA correctly on the ribosome. If the cap doesn't get placed on properly, the RNA is quickly degraded by processing bodies in the cytoplasm. Uncapped RNA can also be recognized as non-self by the immune system, stimulating an immune response.

**Poly(A) tail addition** -- stretch of adenines that gets added, which is important for nuclear export and stabilizes the transcript. The poly(A) tail shortens over time, eventually triggering the degradation of the mRNA when it gets short enough.

**Splicing** -- the process of removing introns. Introns contain 3 conserved sequences -- a 5' splice site, a branch site, and a 3' splice site -- that guide the removal of introns. The **spliceosome** recognizes these sites, binds, and cleaves the RNA there.

**Spliceosome** -- RNA/protein complex (ribonucleoprotein) with 5 different ribonucleoprotein subunits, as well as multiple other proteins bound to it. Ribonucleoprotein complexes within the spliceosome are referred to as snRNPs -- small nuclear RNPs (U1, U2, U4, U5, U6)

**The process of splicing** -- The U1 snRNP binds the 5' splice site and the U2 snRNP binds the branch site. U4, U5, and U6 bind to U1 and U2, along with the pre-mRNA, forming the spliceosome complex. This results in the intron forming a loop, which brings the 5' and 3' splice sites together. At that point the 5' end of the intron is cut and connected to the branch site, which creates a lariat structure. When the lariat structure is formed, U1 and U4 are released, the 3' splice site is cleaved, and the 2 exons are joined together. After the exons are joined, the intron, which is still in that lariat structure, is released along with U2, U5 and U6. The intron gets degraded and the snRNPs get recycled.
You now have a mature mRNA transcript that's ready for translation.

**Exon junction complex** (**EJC**) -- group of proteins that bind to the pre-mRNA and activate splicing by the spliceosome. After splicing is finished, the EJC remains on the mRNA at each exon junction (where the 2 exons now meet once the intron is removed). The EJC helps export the mRNA out of the nucleus and into the cytoplasm for translation.

**Alternative 3' cleavage** -- pre-mRNA cleavage can occur at different sites. Some pre-mRNA can have multiple cleavage and polyadenylation sites that are favored under different conditions. If you cleave the pre-mRNA at a different location, that will produce a different protein product if exons are removed, or it could alter the stability of the mRNA (because the 3' UTR is altered), leading to altered protein production.

**Alternative promoters** -- can have 2 or more promoters active for a given gene, which allows you to produce different proteins by including or excluding certain exons. Which promoter gets activated is dictated by which transcription factors bind, or which enhancer region the transcription factors bind to.

**Alternative splicing** -- allows a single gene to produce multiple proteins by including or excluding different exons to produce a different order or combination of exons.

**Cis-acting regulatory elements** -- a region of DNA that regulates the transcription or expression of a gene that is found on the same chromosome. Two types: splicing silencers and splicing enhancers.

**Splicing repressors/silencers** are sites where splicing repressor proteins bind and reduce the likelihood that a nearby splice site will be used. Can be intronic or exonic.

**Splicing enhancers** are sites where splicing activator proteins bind, which then increase the probability that a nearby site will be used for splicing. Can be intronic or exonic.
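To make the idea of alternative splicing concrete, the sketch below enumerates every mature mRNA a toy gene could produce when some exons are always kept and the rest can each be included or skipped. The function name `splice_isoforms`, the exon names, and the sequences are all invented for illustration, and the sketch keeps exons in their genomic order, which is a simplification.

```python
from itertools import product


def splice_isoforms(exons, constitutive):
    """Enumerate mature mRNAs for a gene with optional (cassette) exons.

    exons:        dict mapping exon name -> sequence, in genomic order
    constitutive: set of exon names that are always included
    Returns a dict mapping an isoform label (e.g. "E1-E3") to its sequence.
    """
    cassette = [name for name in exons if name not in constitutive]
    isoforms = {}
    # Try every include/skip combination for the optional exons.
    for choices in product([True, False], repeat=len(cassette)):
        included = dict(zip(cassette, choices))
        kept = [n for n in exons if n in constitutive or included[n]]
        isoforms["-".join(kept)] = "".join(exons[n] for n in kept)
    return isoforms


# Hypothetical 4-exon gene: E1 and E4 always kept, E2 and E3 optional.
toy_gene = {"E1": "AUGGCU", "E2": "GGA", "E3": "UUC", "E4": "UAA"}
isoforms = splice_isoforms(toy_gene, constitutive={"E1", "E4"})
```

With 2 optional exons the toy gene yields 4 isoforms, which is the sense in which a single gene can encode multiple proteins. In a cell the choice is not free: splicing enhancers, silencers, and the proteins that bind them bias which combination is made.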
**Trans-acting regulatory element** -- a protein that binds to a cis-acting element of a specific gene to regulate its transcription.

**SR proteins** (serine arginine rich proteins) are trans-acting regulatory elements that regulate splicing, as well as mRNA export.

Heterogeneous nuclear ribonucleoproteins (**hnRNPs**) are also trans-acting regulatory elements that bind to splicing silencers and prevent snRNPs and other activator proteins involved in spliceosome assembly from assembling on the intron.

**mRNA degradation** -- 3 main types of enzymes that are used to degrade mRNA, categorized by where they cut.

1. Endonucleases -- cleave the RNA internally
2. 5' exonucleases -- degrade RNA from the 5' end
3. 3' exonucleases -- degrade RNA from the 3' end

Ends are protected by the cap and polyA tail, which must be dealt with. Additionally, there are eIF4 proteins and PABP (polyA binding proteins) that bind either end and further protect the mRNA from degradation. The most commonly used mechanism of degradation occurs via de-adenylation, where the polyA tail is shortened. This step is reversible -- the tail can be re-adenylated if necessary.

**Decapping** -- need to remove the methylguanosine cap, so we use decapping proteins (DCP1 and DCP2), which allows the 5' exonuclease XRN1 to come in and degrade the RNA

**Exosome for 3' decay** -- exosome structure varies depending on the situation, but is generally 10-12 proteins that form a complex. Some have a 3' exonuclease function, and others act as accessory proteins involved in substrate recognition. The 5' and 3' pathways are not mutually exclusive though -- they can both be involved in the degradation of a single mRNA

**Endonuclease mediated decay** -- fastest way to inactivate an mRNA -- frequently carried out by the polysomal ribonuclease PMR1. After PMR1 cleaves the mRNA, XRN1 and exosomes come in and degrade the mRNA from either end.
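The decay pathways above come down to which ends of the mRNA are exposed. The sketch below encodes that logic under simplifying assumptions: the function name `accessible_decay_routes` is hypothetical, protection by eIF4 and PABP is folded into the cap and tail flags, and a fully de-adenylated tail is treated as "no tail".

```python
def accessible_decay_routes(has_cap, has_poly_a_tail, internally_cleaved=False):
    """List which exonucleases can attack an mRNA, given the state of its ends.

    An internal cut (endonuclease mediated decay) creates an unprotected
    5' end and an unprotected 3' end, so both routes open at once.
    """
    routes = []
    if internally_cleaved or not has_cap:
        routes.append("XRN1 (5' exonuclease)")
    if internally_cleaved or not has_poly_a_tail:
        routes.append("exosome (3' exonuclease)")
    return routes


intact = accessible_decay_routes(has_cap=True, has_poly_a_tail=True)
deadenylated = accessible_decay_routes(has_cap=True, has_poly_a_tail=False)
decapped = accessible_decay_routes(has_cap=False, has_poly_a_tail=False)
cut = accessible_decay_routes(True, True, internally_cleaved=True)
```

An intact mRNA has no open route, a de-adenylated one is open only to the exosome, and either decapping after de-adenylation or an internal cut opens both ends, which is why the pathways are not mutually exclusive.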
**P-bodies or mRNA processing bodies** -- cytoplasmic ribonucleoprotein granules where mRNA/protein complexes congregate. Typically form in the cell in response to stress, which could potentially allow the cell to adapt rapidly to changing environments by releasing or sequestering certain mRNAs

**sncRNA** -- small noncoding RNAs -- double stranded RNA 20-31 nucleotides long with 2 nucleotide overhangs. 2 types: **siRNA** (small interfering RNA) and **miRNA** (micro RNA).

**siRNA** -- typically from an exogenous source like viruses or transposable elements. Because of that, the cellular response to siRNA is to eliminate it.

**Dicer** -- an RNase III enzyme and endoribonuclease that cleaves pre-miRNA to produce miRNA, which ultimately silences a gene. Dicer can also process dsRNA into siRNA. Either processing event results in the formation of the RNA-induced silencing complex (RISC).

**RNA-induced silencing complex (RISC)** -- protein complex that forms in response to Dicer processing and is responsible for the gene silencing that occurs in response to miRNAs and siRNAs. Several variations, but the bare minimum is argonaute protein plus siRNA or miRNA.

**Argonaute** protein binds the small RNA and positions it so that it can interact with target mRNA. Once there is complementary binding between the two RNAs, argonaute cleaves the target mRNA.

**microRNA** (miRNA or miR) -- 21-25 nucleotides long and form hairpin structures, typically derived from non-coding regions. Important for downregulating gene expression through a few different mechanisms, but all of the mechanisms involve the miRNA being at least partially complementary to 1 or more mRNA. miRNA will bind mRNA and lead to either translational repression, mRNA cleavage or de-adenylation of the mRNA, all of which will reduce the amount of protein made from that mRNA. miRNA are transcribed by RNA Pol II as pri-miRNA precursors, with a 5' cap and a polyA tail.
The pri-miRNA is then processed by the microprocessor complex (RNase III Drosha and dsRNA binding protein Pasha), which results in pre-miRNA. The pre-miRNA is exported into the cytoplasm by Exportin 5. Once in the cytoplasm, the RNase III enzyme Dicer processes the pre-miRNA further to produce the final miRNA.

**siRNA as a molecular biology tool to silence genes**

1. When exogenous siRNA enters the cell, it gets incorporated into RISC
2. Once the siRNA is part of the RISC complex, the siRNA is unwound to form single stranded siRNA.
3. The strand that is thermodynamically less stable due to its base pairing at the 5' end is chosen to remain part of the RISC complex
4. The single stranded siRNA which is part of the RISC complex can now scan and find a complementary mRNA
5. Once the single stranded siRNA (part of the RISC complex) binds to its target mRNA, it induces mRNA cleavage.
6. The mRNA is now cut and recognized as abnormal by the cell. This causes degradation of the mRNA, so it is not translated into protein, thus silencing the gene that encodes that mRNA.

To deliver siRNA to the cell you can use electroporation or viral transduction.

**Electroporation** -- uses electricity to create pores in the cell membrane. The electric potential across the cell membrane rises, allowing charged molecules like DNA to be moved across the membrane.

**Viral transduction** -- virus is produced containing your siRNA, and then your cells of interest are exposed to the virus, which delivers the siRNA into the cell

**Long non-coding RNA (LncRNA)** -- any RNA transcript over 200 nucleotides long that doesn't get translated into a protein. Processed the same way mRNA are. Regulate transcription through several mechanisms, most of which are driven by the ability of lncRNA to form triplex structures (DNA/RNA/protein interactions).

1. lncRNA can bind to a transcription factor, and to the target site DNA, and strengthen or block the interaction between transcription factor and DNA
2. lncRNA can interact directly with other RNA and sequester them, acting like a sponge. If a lncRNA did this to a miRNA, it would increase expression of the target mRNA, since the miRNA could no longer interact with the mRNA
3. lncRNA can interact with splicing factors like SR proteins and facilitate their localization at transcription sites, or block them from interacting with pre-mRNA. They can also modify where the splicing occurs by disrupting different protein-protein interactions between splicing factors.

**[Techniques]**

**Antibodies** are produced by the immune system, and can recognize highly specific portions (epitopes) of proteins, which makes them useful as a research tool.

**Immunoprecipitation (IP)** -- similar to affinity chromatography, but on a smaller scale. Doesn't require a large column and can be done in an Eppendorf tube. An antibody that binds to whatever protein you want to isolate is added to your protein sample. The sample, which now contains all the proteins from the cell as well as your antibody/protein complex, is added to agarose beads. The antibody then attaches to the beads, and allows you to separate your protein of interest from all the other proteins in the sample.

**ChIP-Seq** -- Chromatin Immunoprecipitation sequencing -- allows you to identify all of the DNA sequences a specific protein binds to. The sample is crosslinked to lock any proteins bound to the DNA in place, and then the DNA is broken up into pieces. The protein of interest is then immunoprecipitated from the sample, pulling with it any DNA sequence it's bound to. You then separate the protein from the DNA it's bound to, and sequence the DNA.

**Cut & Run** -- cleavage under targets and release using nuclease -- upgrade to ChIP-Seq.
Add an antibody against your protein of interest to your cell lysates, and then a nuclease cleaves the DNA around the protein. The sequence attached to your protein/antibody complex is identified with sequencing.

**Reporter assays** -- allow you to determine if your protein of interest increases or decreases the expression of the gene it's bound to. Utilizes luciferase downstream of the promoter of the gene you're studying. If binding of your protein of interest to the promoter increases transcription, the luciferase enzyme is expressed, producing luminescence. If your protein of interest suppresses the promoter, luciferase is not produced, and no luminescence is generated.

**Promoter pulldown** -- allows you to identify what proteins bind to a particular sequence. DNA is immobilized on a probe, and then incubated in protein extract. You can then identify which proteins bound to the DNA using western blots or mass spec.

**RNA-seq** -- used to analyze the transcriptome. Tells you which genes are turned on or off and to what extent, at any given time in a particular cell type or tissue. Also gives you information about alternative promoter and alternative splicing usage. RNA is collected, reverse transcribed into cDNA, and then sequenced.

**scRNA-seq** -- bulk RNA-seq can hide different subpopulations of cells. scRNA-seq allows you to profile the transcriptome of each individual cell in a group and then categorize the cells by subtype.

**Translation**

Key components of translation to know

1. mRNA
2. tRNA
3. aminoacyl-tRNA synthetase
4. ribosomes (small and large subunit)
5. rRNA
6. initiation factors

**mRNA start codon** -- the 3 nucleotide sequence that the ribosome recognizes to initiate translation. Establishes the reading frame

**tRNA** -- transfer RNA, acts as an adapter molecule between mRNA and the growing chain of amino acids to facilitate protein synthesis at the ribosome.
![](media/image4.png)

**Parts of tRNA**

**3 nucleotide anticodon** -- binds to the 3 nucleotide codon on the mRNA to ensure the amino acid is being brought to the right portion of the mRNA, and thus added in the correct order.

**Acceptor stem** -- the 3' and 5' terminal nucleotides base pair with each other. Also the location of the CCA 3' overhang.

**CCA 3' overhang/CCA tail** -- provides a 3' hydroxyl group for attachment of the amino acid.

**Aminoacylation** -- process of adding an amino acid to a tRNA.

**Aminoacyl-tRNA synthetase** -- enzyme that aminoacylates tRNA (attaches an amino acid to a tRNA). There is a different aminoacyl-tRNA synthetase for every different amino acid. However, there is [not] a different aminoacyl-tRNA synthetase for each codon. For example, we have 6 different codons that code for leucine, but we only have 1 aminoacyl-tRNA synthetase for leucine. Catalyzes a 2 step reaction that leads to the addition of the amino acid to the CCA 3' overhang on the tRNA. First, the amino acid is activated with ATP, which forms aminoacyl-AMP and releases pyrophosphate. Then the amino acid is transferred to the tRNA and the AMP is released.

**Aminoacyl-tRNA synthetase proofreading function** -- the enzyme has 2 pockets: an active site pocket and an editing pocket. The amino acid enters the active site pocket first, and is then moved to the editing pocket. If the amino acid fits in the active site pocket but not the editing pocket, it is the correct amino acid. If the amino acid fits into both the active site pocket and the editing pocket, it is the wrong amino acid. This is mostly based on the size of the amino acid. Amino acids that are too large never fit into the active site pocket. Amino acids that are too small fit into the active site pocket, but then also fit into the editing pocket, indicating they're the wrong amino acid.
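The two-pocket proofreading logic described above can be written out as a small Python sketch. It assumes amino acid selection depends on size alone, and the size values and pocket limits are arbitrary, hypothetical numbers rather than measured dimensions. The function name `double_sieve` is likewise just an illustrative label.

```python
# Minimal sketch of size-based proofreading (all numbers are hypothetical units).
def double_sieve(size: float, active_site_limit: float, editing_site_limit: float) -> str:
    """Classify an amino acid by which pockets it fits into."""
    if size > active_site_limit:
        # Too large: never enters the active site pocket.
        return "rejected: too large for active site"
    if size <= editing_site_limit:
        # Fits both pockets: too small, so it is the wrong amino acid.
        return "rejected: fits editing pocket"
    # Fits the active site but not the editing pocket: correct amino acid.
    return "accepted"

# Hypothetical synthetase with an active site limit of 10 and an editing pocket limit of 8.
for name, size in [("too large", 12), ("correct", 9), ("too small", 7)]:
    print(name, "->", double_sieve(size, active_site_limit=10, editing_site_limit=8))
```

Only an amino acid whose size falls between the two limits passes both checks, which is how the two pockets together select a single amino acid.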
![](media/image6.png)

In addition to ensuring the correct amino acid is present, the enzyme can also check that the correct tRNA is bound. It does this by using a nucleotide binding pocket to recognize the anticodon on the tRNA, as well as by interacting with and reading the acceptor stem on the tRNA.

**rRNA** -- ribosomal RNA -- has enzymatic function (ribozyme). Combines with ribosomal proteins to form the ribosome, and is responsible for peptide bond formation during translation.

**Ribosomes** -- actual site of translation/protein synthesis. Eukaryotic and prokaryotic ribosomes are similar, with eukaryotic ribosomes slightly larger. Both are made up of 2 subunits. The eukaryotic ribosome has a 60s and a 40s subunit, which combine to form the 80s ribosome, and the prokaryotic ribosome has a 50s and a 30s subunit, which combine to form the 70s ribosome. "s" (the Svedberg unit) indicates the rate of sedimentation, which depends on shape as well as mass, which is why the subunit values don't add up. When the ribosomal subunits come together, there are 4 sites for RNA binding: 1 for the mRNA and 3 for tRNA.

**3 tRNA binding sites**\
A site -- aminoacyl-tRNA\
P site -- peptidyl-tRNA\
E site -- exit site

**Eukaryotic translation**

Formation of the pre-initiation complex and initiation complex -- mRNA comes out of the nucleus into the cytoplasm, where the eIF4 proteins A/E/G bind to the 5' cap on the mRNA. That stimulates eIF 1, 2 and 3 to come together with the 40s ribosomal subunit and a methionine tRNA. Since the start codon (AUG) codes for methionine, the methionine tRNA is always the first tRNA to interact with these components. eIF 1, 2 and 3, the methionine tRNA and the 40s subunit form the **43s pre-initiation complex**. The 43s pre-initiation complex then attaches to the eIF4 A/E/G proteins on the mRNA and forms the **48s initiation complex**. The 48s initiation complex is formed in the 5' UTR. The 48s initiation complex then scans the mRNA for the start codon.
Once the start codon (AUG) reaches the P site, eIF5 kicks off all of the other initiation factors, and the 60s ribosomal subunit comes in and binds to the 40s subunit with the methionine tRNA. This forms the **80s initiation complex**. At this point, the 80s ribosome is positioned on the mRNA with the start codon at the P site.

Now, a charged tRNA will be added to the A site to begin building the polypeptide chain. **EF-1α-GTP** is the protein responsible for binding to charged tRNAs and bringing them to the A site. Once EF-1α-GTP places the charged tRNA in the A site, there is a proofreading step (kinetic proofreading). If the tRNA is correct, the GTP is hydrolyzed quickly and EF-1α (now bound to GDP) can diffuse away, leaving the tRNA in place. If the tRNA is incorrect, GTP is hydrolyzed slowly, allowing EF-1α-GTP and the tRNA to diffuse out of the A site before the peptide bond is formed.

When the correct charged tRNA is brought to the A site, a peptide bond forms between the amino acid in the A site and the amino acid in the P site. This peptide bond formation occurs in the peptidyl-transferase center, and is carried out by rRNA. Now there is a tRNA in both the P site and the A site (with the growing chain attached to the tRNA in the A site), and the ribosome must move so that a new charged tRNA can enter the A site. The ribosome moves via a ratchet mechanism, shifting the tRNA in the P site into the E site, and the tRNA in the A site into the P site. The large subunit moves first, followed by the small subunit.

**Translation termination** occurs once the stop codon is reached. Two release factors are involved. eRF1 recognizes the stop codon, and eRF3 facilitates the release of the polypeptide chain in a GTP dependent manner. These release factors also provide another proofreading function at the ribosome.
If the initial kinetic proofreading at the A site fails and an incorrect tRNA reaches the P site, then a peptide bond has already been formed with the incorrect amino acid. This leads to a conformational change that allows eRF1 to terminate translation, despite the lack of a stop codon. If translation is not terminated at this point, an additional conformational change occurs, which makes it more likely for a second incorrect amino acid to be incorporated. This leads to a final conformational change that provides even easier access for the release factors to terminate translation.

**Ribosome recycling** -- the release factors remove the polypeptide chain from the ribosome, but do not remove the mRNA. This requires the ribosome recycling protein ABCE1, which removes the mRNA and allows the ribosomal subunits to bind to a new mRNA.

**Nonsense mediated decay** -- the exon junction complex is present on the mRNA, and gets removed when the ribosome reaches it. If the ribosome stalls or reaches a premature stop codon, the exon junction complex never gets removed, triggering nonsense mediated decay. Nonsense mediated decay proteins bind to the mRNA and de-adenylate it, leading to degradation.

**uORFs** -- upstream open reading frames -- open reading frames within the 5' UTR of the mRNA that regulate the rate of translation of the downstream coding sequence on the mRNA.

**3 main scenarios can occur when the pre-initiation complex attaches to the 5' UTR: leaky scanning, re-initiation, and stalling/drop off**

**Leaky scanning** -- the pre-initiation complex fails to recognize the start codon of the upstream open reading frame, so it passes over the uORF without initiating and keeps moving along the mRNA looking for the start codon. Translation proceeds normally once the pre-initiation complex reaches the start codon of the coding sequence.

**Re-initiation** -- the pre-initiation complex does recognize the start codon in the uORF, and it assembles fully into a ribosome and translates the uORF.
The ribosome will eventually get to the stop codon in the uORF, remain bound to the mRNA, and then re-initiate translation at the coding sequence. Translation proceeds as normal, but the rate is slowed because the ribosome spent time translating the uORF first.

**Stall/drop off** -- the pre-initiation complex does recognize the start codon in the uORF, and it assembles fully into a ribosome and translates the uORF. The ribosome will eventually get to the stop codon in the uORF, and either stall, leading to nonsense mediated decay, or drop off the mRNA without ever translating the coding sequence. This leads to a lack of translation.

**Alternative translational start site** -- an mRNA can contain multiple start sites that the ribosome recognizes. This leads to additional variation at the protein level.

**Post-translational modifications**

**Protein cleavage/proteolysis** -- removal of a portion of the polypeptide chain. Important for the activation of some enzymes that are produced in an inactive form, as well as for the localization of proteins.

**Phosphorylation** -- addition of a phosphate group to a protein. Phosphates are negatively charged, which alters the charge of the protein and can lead to conformational changes and altered function or activity. Phosphorylation often acts as a molecular switch to activate or deactivate an enzyme. This is frequently seen in signal transduction pathways. Phosphorylation can also be used for localization, allowing proteins to be sequestered in the cytoplasm or sent into the nucleus. Certain phosphorylation events can lead to degradation by the ubiquitin proteasome system.

**Kinases** -- phosphorylate proteins

**Phosphatases** -- dephosphorylate proteins

**Ubiquitination** -- addition of ubiquitin. Ubiquitin is a small protein that is added to proteins in a 3 step enzymatic process.
A ubiquitin gets attached to the enzyme E1, which then passes the ubiquitin to E2, and finally E3 adds the ubiquitin to the target protein. Different amounts of ubiquitin added at different locations on a protein can have different outcomes. However, the predominant use of ubiquitin is to mark proteins for degradation in the proteasome. The proteasome is the protein recycling machinery of the cell. It breaks ubiquitinated proteins down into short peptides, which are then further broken down into individual amino acids for recycling.

**SUMOylation** -- small ubiquitin-related modifier -- involved in localization, stability and protein-protein interactions. SUMO can mask a binding site on the target protein that a substrate protein would normally interact with, act as a scaffold so that additional proteins can attach to the target protein, or induce a conformational change in the target protein that gives it a new function.

**Methylation** -- addition of a methyl group -- mediated by methyltransferases, with S-adenosyl methionine (SAM) as the primary methyl group donor. One or multiple methyl groups can be added to a single amino acid. Typically occurs on histones for regulation of gene expression.

**Methyltransferases** -- add methyl

**Demethylases** -- remove methyl

**Acetylation** -- an acetyl group from acetyl coenzyme A is transferred to a target protein. Important for stability and localization of proteins. Can dramatically change the conformation of a protein and the substrates and cofactors it interacts with.

**Histone acetyltransferases** (HATs) -- add acetyl -- 2 types: type A is in the nucleus, and type B is in the cytoplasm

**Histone deacetylases** (HDACs) -- remove acetyl

Assays to study translation -- pulse-chase and ribo-seq

**Pulse chase** -- allows you to study protein turnover.
Use radiolabeled amino acids that you add to your cells for a short period of time, so that they get incorporated into the newly synthesized proteins (the pulse). Then, you remove the radiolabeled amino acids, and add back unlabeled amino acids (the chase). You can measure the rate of synthesis by looking at how quickly the radiolabeled amino acids are incorporated, and you can determine the half life (turnover) by looking at how much radioactivity remains after a given chase time, using a western blot followed by fluorography, which measures the level of radioisotope left.

**Ribo-seq** -- gives you a snapshot of ribosome activity in a cell at a specific time point, which allows you to determine which mRNAs are actually being actively translated. It takes RNA-seq, which tells you which mRNAs are being produced, one step further by telling you which mRNAs are actually being made into proteins. Approach is similar to ChIP-seq. Ribosomes are isolated along with the mRNA they're bound to, and a nuclease is used to digest all the mRNA that is not shielded by a ribosome. The piece of mRNA that is physically inside the ribosome is protected from the nuclease, and we can then sequence it to see which genes are being translated, and the rate they're being translated at (more ribosomes on a given mRNA = higher rate of translation). This technique has also been used to identify new translation products, because we can see exactly where the ribosome starts and stops on the mRNA. It also allows for better understanding of how uORFs function to regulate translation.

**FLIM-FRET** -- fluorescent probes can be attached to one or multiple proteins (or even DNA) to study the interactions between multiple molecules, or between different regions of a single protein. Used to measure distances at a molecular scale. Basically a molecular ruler.

**Techniques to identify the 3D structure of a protein** -- circular dichroism, NMR, electron microscopy and X-ray crystallography.
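As a worked illustration of how a half life could be pulled out of the pulse-chase measurements described above, here is a minimal Python sketch. It assumes the labeled protein decays by simple first-order kinetics during the chase, so that ln(signal) falls linearly with time and the half life is ln(2)/k. That model, the function name `half_life_from_chase`, and the example numbers are all assumptions for illustration, not data or methods from the study guide.

```python
import math

def half_life_from_chase(times: list[float], signals: list[float]) -> float:
    """Estimate protein half life from chase time points and remaining radiolabel signal.

    Assumes first-order decay: signal(t) = S0 * exp(-k * t).
    Fits ln(signal) against time by least squares and returns ln(2) / k.
    """
    logs = [math.log(s) for s in signals]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    k = -(sxy / sxx)  # decay constant is the negative of the fitted slope
    if k <= 0:
        raise ValueError("signal does not decrease over the chase")
    return math.log(2) / k

# Hypothetical chase: hours after removing the label, and band intensity remaining.
chase_hours = [0, 2, 4, 6]
band_intensity = [100.0, 50.0, 25.0, 12.5]
print(half_life_from_chase(chase_hours, band_intensity))
```

In this made-up data set the signal halves every 2 hours, so the fitted half life comes out at about 2 hours. Real band intensities would be noisier, and a protein that does not follow first-order decay would need a different model.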