Molecular Biology of Gene Regulation PDF
Document Details
Uploaded by AdequateColosseum
Summary
This document explores the intricate mechanisms governing gene expression in multicellular organisms. It explains how cells maintain distinct identities despite their shared genome and how external signals can alter gene expression. The document focuses on the concepts of transcription regulation and related aspects including the role of sequence-specific DNA-binding proteins, transcriptional activators and repressors within various biological contexts.
Full Transcript
The Different Cell Types of a Multicellular Organism Contain the Same DNA

If we compare a mammalian neuron with a liver cell, the differences are so extreme that it is difficult to imagine that the two cells contain the same genome. For this reason, and because cell differentiation often seemed irreversible, biologists originally suspected that genes might be selectively lost when a cell differentiates. We now know, however, that cell differentiation generally occurs without changes in the nucleotide sequence of a cell’s genome. The cell types in a multicellular organism become different from one another because they synthesize and accumulate different sets of RNA and protein molecules.

Different Cell Types Synthesize Different Sets of RNAs and Proteins

Studies of the number of different RNAs suggest that, at any one time, a typical human cell expresses 30–60% of its approximately 30,000 genes at some level. There are about 21,000 protein-coding genes and a roughly estimated 9000 noncoding RNA genes in humans. When the patterns of RNA expression in different human cell lines are compared, the level of expression of almost every gene is found to vary from one cell type to another.

Many processes are common to all cells, and any two cells in a single organism therefore have many gene products in common. These include the structural proteins of chromosomes, RNA and DNA polymerases, DNA repair enzymes, ribosomal proteins and RNAs, the enzymes that catalyze the central reactions of metabolism, and many of the proteins that form the cytoskeleton, such as actin.

Some RNAs and proteins are abundant in the specialized cells in which they function and cannot be detected elsewhere, even by sensitive tests. Hemoglobin, for example, is expressed specifically in red blood cells, where it carries oxygen, and the enzyme tyrosine aminotransferase (which breaks down tyrosine in food) is expressed in liver but not in most other tissues.
External Signals Can Cause a Cell to Change the Expression of Its Genes

Although the specialized cells in a multicellular organism have characteristic patterns of gene expression, each cell is capable of altering its pattern of gene expression in response to extracellular cues. If a liver cell is exposed to a glucocorticoid hormone, for example, the production of a set of proteins is dramatically increased. Released in the body during periods of starvation or intense exercise, glucocorticoids signal the liver to increase the production of energy from amino acids and other small molecules; the set of proteins whose production is induced includes the enzyme tyrosine aminotransferase, mentioned above. When the hormone is no longer present, the production of these proteins drops to its normal, unstimulated level in liver cells.

Gene Expression Can Be Regulated at Many of the Steps in the Pathway from DNA to RNA to Protein

If differences among the various cell types of an organism depend on the particular genes that the cells express, at what level is the control of gene expression exercised? There are many steps in the pathway leading from DNA to protein. We now know that all of them can in principle be regulated.
(1) controlling when and how often a given gene is transcribed (transcriptional control)
(2) controlling the splicing and processing of RNA transcripts (RNA processing control)
(3) selecting which completed mRNAs are exported from the nucleus to the cytosol and determining where in the cytosol they are localized (RNA transport and localization control)
(4) selecting which mRNAs in the cytoplasm are translated by ribosomes (translational control)
(5) selectively destabilizing certain mRNA molecules in the cytoplasm (mRNA degradation control)
(6) selectively activating, inactivating, degrading, or localizing specific protein molecules after they have been made (protein activity control)

For most genes, transcriptional controls are paramount. This makes sense because, of all the possible control points listed above, only transcriptional control ensures that the cell will not synthesize superfluous intermediates.

Control of Transcription by Sequence-Specific DNA-Binding Proteins

How does a cell determine which of its thousands of genes to transcribe? Perhaps the most important concept, one that applies to all species on Earth, is based on a group of proteins known as transcription regulators. These proteins recognize specific sequences of DNA (typically 5–10 nucleotide pairs in length) that are often called cis-regulatory sequences, because they must be on the same chromosome (that is, in cis) as the genes they control. Transcription regulators bind to these sequences, which are dispersed throughout genomes, and this binding puts into motion a series of reactions that ultimately specify which genes are to be transcribed and at what rate.
Approximately 10% of the protein-coding genes of most organisms are devoted to transcription regulators, making them one of the largest classes of proteins in the cell.

Transcription Factors Bind DNA in a Sequence-Specific Manner

A transcription regulator recognizes a specific cis-regulatory sequence because the surface of the protein is extensively complementary to the special surface features of the double helix that displays that sequence. Each transcription regulator makes a series of contacts with the DNA, involving hydrogen bonds, ionic bonds, and hydrophobic interactions. Although each individual contact is weak, the 20 or so contacts that are typically formed at the protein–DNA interface add together to ensure that the interaction is both highly specific and very strong.

Dimerization of Transcription Regulators Increases Their Affinity and Specificity for DNA

A monomer of a typical transcription regulator recognizes about 6–8 nucleotide pairs of DNA. However, sequence-specific DNA-binding proteins do not bind tightly to a single DNA sequence and reject all others; rather, they recognize a range of closely related sequences, with the affinity of the protein for the DNA varying according to how closely the DNA matches the optimal sequence. Hence, cis-regulatory sequences are often depicted as “logos”, which display the range of sequences recognized by a particular transcription regulator.

Heterodimers are often formed from two different transcription regulators. Transcription regulators may form heterodimers with more than one partner protein; in this way, the same transcription regulator can be “reused” to create several distinct DNA-binding specificities.
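The point that a regulator recognizes a range of related sequences, with affinity tracking how closely a site matches the optimum, is the idea that a sequence logo summarizes. A minimal sketch of that idea is a position weight matrix built from aligned binding sites. The sites below are invented for illustration, and the log-odds score is only a stand-in for relative affinity, not a measured quantity; the function names are likewise hypothetical.

```python
import math

# Hypothetical aligned binding sites (made up for illustration only).
SITES = ["TGACTCA", "TGAGTCA", "TGACTCA", "TGACTAA", "TTACTCA"]
BASES = "ACGT"


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Return one {base: log2-odds} dict per position of the aligned sites."""
    pwm = []
    for pos in range(len(sites[0])):
        column = [site[pos] for site in sites]
        total = len(column) + pseudocount * len(BASES)
        pwm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in BASES
        })
    return pwm


def score(pwm, seq):
    """Sum the per-position log-odds; higher means closer to the preferred site."""
    return sum(column[base] for column, base in zip(pwm, seq))


pwm = build_pwm(SITES)
# The best-scoring base at each position gives the consensus ("optimal") sequence.
consensus = "".join(max(column, key=column.get) for column in pwm)

for candidate in (consensus, "TGACTAA", "CCCCCCC"):
    print(candidate, round(score(pwm, candidate), 2))
```

In this toy model the consensus scores highest, a one-mismatch site scores a little lower, and an unrelated sequence scores far lower, which mirrors the graded recognition described above.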
Transcription Factor Interactions Increase Gene-Control Options

Transcription factors can often form homodimers or heterodimers, which further increases the number of possible transcription factors. Four different factor monomers could form a total of 10 dimeric factors, five monomers could form 15 dimeric factors, and so forth. Because each gene can carry specific binding sequences for different monomers, each gene can have unique transcription activation requirements. This is called combinatorial transcriptional regulation.

Transcription Factors Usually Have DNA-Binding Domains and Dimerization Domains

A single transcription factor can homodimerize.

Similar combinatorial transcriptional regulation is achieved through the interaction of structurally unrelated transcription factors bound to closely spaced binding sites in DNA. An example is the interaction of two transcription factors, NFAT and AP1, which bind to neighboring sites in a composite promoter-proximal element regulating interleukin-2 (IL-2). Neither NFAT nor AP1 binds to its site in the IL-2 control region in the absence of the other. However, when both NFAT and AP1 are present, protein-protein interactions between them stabilize the ternary complex composed of NFAT, AP1, and DNA. Such cooperative DNA binding of various transcription factors results in considerable combinatorial complexity of transcription control.
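The dimer counts quoted earlier in this section follow from simple combinatorics: counting each unordered pair once and allowing a monomer to pair with itself, n monomers give n(n+1)/2 possible dimers, which is 10 for four monomers and 15 for five. The sketch below enumerates them under the simplifying assumption that every pairing is allowed; the monomer labels and function names are hypothetical.

```python
from itertools import combinations_with_replacement


def dimers(monomers):
    """List every unordered pair of monomers, homodimers included."""
    return list(combinations_with_replacement(monomers, 2))


def count_dimers(n):
    """Closed form for the same count: n homodimers plus n(n-1)/2 heterodimers."""
    return n * (n + 1) // 2


four = dimers("ABCD")    # hypothetical monomers A-D
five = dimers("ABCDE")   # hypothetical monomers A-E
print(len(four), len(five))
```

Real transcription factors do not pair indiscriminately, so these numbers are upper bounds for a given family of monomers.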
As a result, the ≈2000 transcription factors encoded in the human genome can bind to DNA through a much larger number of cooperative interactions, resulting in unique transcriptional control for each of the ≈25,000 human genes.

Transcription Factors Switch Genes ON and OFF

Transcription factors that turn genes ON are called ACTIVATORS, whereas transcription factors that turn genes OFF are called REPRESSORS.

Example of a Repressor: The Tryptophan Repressor

A cluster of bacterial genes can be transcribed from a single promoter. In the tryptophan operon, each of the five genes encodes a different enzyme, and all of these enzymes are needed to synthesize the amino acid tryptophan from simpler molecules. The genes are transcribed as a single mRNA molecule, a feature that allows their expression to be coordinated. Clusters of genes transcribed as a single mRNA molecule are common in bacteria. Each of these clusters is called an operon because its expression is controlled by a cis-regulatory sequence called the operator, situated within the promoter.
This operon needs to be activated when Trp is low, so that these enzymes can produce Trp from other precursors. In the presence of abundant Trp, the operon is switched off because the enzymes are no longer necessary. This is all regulated by a repressor that binds the operator, located between the -10 and -35 boxes in the promoter.

When bound to Trp, the repressor undergoes a conformational change that allows it to bind the operator and block RNA polymerase binding to the promoter.

Activators Turn Genes ON

An activator protein binds to its cis-regulatory sequence on the DNA and interacts with the RNA polymerase to help it initiate transcription. Without the activator, the promoter fails to initiate transcription efficiently. DNA-bound activator proteins can increase the rate of transcription initiation as much as 1000-fold, a value consistent with a relatively weak and nonspecific interaction between the transcription regulator and RNA polymerase.

An Activator and a Repressor Control the Lac Operon

Complex Switches Control Gene Transcription in Eukaryotes

When compared to the situation in bacteria, transcription regulation in eukaryotes involves many more proteins and much longer stretches of DNA. As in bacteria, the time and place that each gene is to be transcribed is specified by its cis-regulatory sequences, which are “read” by the transcription regulators that bind to them.
Once bound to DNA, positive transcription regulators (activators) help RNA polymerase begin transcribing genes, and negative regulators (repressors) block this from happening. In contrast to bacteria, eukaryotic gene control involves many more intermediate factors (there are usually dozens of transcription factors controlling one gene), and it has to deal with the fact that, unlike in bacteria, genes are packed in nucleosomes.

(Figure panels: prokaryote gene, eukaryote gene)

A Eukaryotic Gene Control Region Consists of a Promoter plus Many cis-Regulatory Regions

Gene control region = promoter + cis-regulatory regions

Promoter: where the general transcription factors bind.
Cis-regulatory regions: where specific transcription factors bind. They comprise the promoter-proximal elements (PPE) plus the enhancers and silencers.
PPE: promoter-proximal elements.
Enhancers and silencers: can be thousands of bp away, either upstream or downstream, and even within introns.

Eukaryotic Transcription Regulators Work in Groups

Eukaryotic transcription regulators usually assemble in groups at their cis-regulatory sequences. Often two or more regulators bind cooperatively. In addition, a broad class of multisubunit proteins termed co-activators and co-repressors assemble on DNA with them. Typically, these co-activators and co-repressors do not recognize specific DNA sequences themselves; they are brought to those sequences by the transcription regulators.
Often the protein–protein interactions between transcription regulators and between regulators and coactivators are too weak for them to assemble in solution; however, the appropriate combination of cis-regulatory sequences can “crystallize” the assembly of these complexes on DNA.

Mediator: The Largest Co-activator

One of the most prevalent coactivators is the large Mediator protein complex, composed of more than 30 subunits. About the same size as RNA polymerase itself, Mediator serves as a bridge between DNA-bound transcription activators, RNA polymerase, and the general transcription factors, facilitating their assembly at the promoter.

Transcription Factors Can Act at Different Steps

Besides (A) promoting the binding of additional transcription regulators and (B) assembling RNA polymerase at promoters, transcription activators are often needed (C) to release already assembled RNA polymerases from promoters or (D) to release RNA polymerase molecules that become stalled after transcribing about 50 nucleotides of RNA.

Transcription Factors Work Synergistically

We have seen that complexes of transcription activators and coactivators assemble cooperatively on DNA. We have also seen that these assemblies can promote different steps in transcription initiation. In general, where several factors work together to enhance a reaction rate, the joint effect is not merely the sum of the enhancements that each factor alone contributes, but the product. If, for example, factor A lowers the free-energy barrier for a reaction by a certain amount and thereby speeds up the reaction 100-fold, and factor B, by acting on another aspect of the reaction, does likewise, then A and B acting in parallel will lower the barrier by a double amount and speed up the reaction 10,000-fold.
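The multiplicative effect can be checked with a short calculation. The sketch below assumes, beyond what the text states, that the reaction rate depends on the free-energy barrier as exp(-ΔG/RT) and that the temperature is 37 °C; under that assumption, barrier reductions add while rate enhancements multiply. The function names are hypothetical.

```python
import math

R = 8.314   # gas constant, J/(mol*K)
T = 310.0   # assumed temperature (37 degrees C), in kelvin


def barrier_drop_for_fold(fold, temperature=T):
    """Barrier reduction (J/mol) needed for a given fold speed-up, assuming rate ~ exp(-dG/RT)."""
    return R * temperature * math.log(fold)


def fold_for_barrier_drop(delta_g, temperature=T):
    """Fold speed-up produced by lowering the barrier by delta_g (J/mol)."""
    return math.exp(delta_g / (R * temperature))


drop_a = barrier_drop_for_fold(100)   # factor A alone: 100-fold
drop_b = barrier_drop_for_fold(100)   # factor B alone: 100-fold
combined = fold_for_barrier_drop(drop_a + drop_b)

print(f"each factor lowers the barrier by {drop_a / 1000:.1f} kJ/mol")
print(f"together they speed up the reaction {combined:.0f}-fold")
```

Each 100-fold factor corresponds to a barrier reduction of roughly 12 kJ/mol in this model; adding the two reductions gives the 10,000-fold speed-up, not 200-fold.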
Insulator DNA Sequences Prevent Eukaryotic Transcription Regulators from Influencing Distant Genes

What keeps a transcription regulator bound on the control region of one gene from looping in the wrong direction and inappropriately influencing the transcription of an adjacent gene? To avoid such cross-talk, several types of DNA elements compartmentalize the genome into discrete regulatory domains. A DNA element called an insulator prevents cis-regulatory sequences from running amok and activating inappropriate genes. Insulators function by forming loops of chromatin, an effect mediated by specialized proteins that bind them. The loops hold a gene and its control region in rough proximity and help to prevent the control region from “spilling over” to adjacent genes.

Some Transcription Factors Act by Modifying Chromatin Structure

The eukaryotic general transcription factors and RNA polymerase are unable, on their own, to assemble on a promoter that is packaged in nucleosomes. Thus, in addition to directing the assembly of the transcription machinery at the promoter, eukaryotic transcription activators promote transcription by triggering changes to the chromatin structure of the promoters, making the underlying DNA more accessible.

Swi/Snf

Chromatin

DNA is highly packaged within the nucleus. The complex of histones and DNA is called chromatin.

Nucleosome: ~147 bp of dsDNA wrapped around a complex of 8 histones (2 each of H2A, H2B, H3, and H4), with protruding histone tails.

30-nm fiber: the 30-nm fiber has a “zig-zag ribbon” structure made from two “strands” of nucleosomes stacked on top of each other like coins. The 30-nm fibers also include H1, the fifth major histone. This structure is maintained by interactions between histones from adjacent nucleosomes.

Condensation Influences Transcription

Condensed fibers can be further packed into heterochromatin. The regions of chromatin actively being transcribed are thought to assume the extended beads-on-a-string form.
The chromatin in chromosomal regions that are not being transcribed or replicated exists predominantly in the condensed 30-nm fiber.

Controlling Chromatin Condensation Is a Way to Control Gene Expression

Histone “tails” play a major role in controlling the local condensation state of chromatin. Each of the histone proteins making up the nucleosome core contains a flexible N-terminus of 19–39 residues, called the histone tail, that extends from the globular structure of the nucleosome.

Histone Tails

Histone tails are subject to multiple post-translational modifications, such as acetylation, methylation, phosphorylation, and ubiquitination. A particular histone protein never has all of these modifications simultaneously, but the histones in a single nucleosome usually contain several of these modifications simultaneously.

Histone Code

The particular combinations of post-translational modifications found in different regions of chromatin have been suggested to constitute a histone code.

Histone Tail Modifications

Histone acetylation: histone-tail lysines undergo reversible acetylation and deacetylation by enzymes that act on specific lysines in the N-termini. In the acetylated form, the positive charge of the lysine ɛ-amino group is neutralized. Example: lysine 16 in histone H4 is particularly important for the folding of the 30-nm fiber because it interacts with a negatively charged patch on the surface of the neighboring nucleosome in the fiber. Consequently, when H4 lysine 16 is acetylated, the chromatin tends to form the less condensed “beads-on-a-string” conformation conducive to transcription and replication.
Enzymes that add the acetyl groups: HATs (histone acetyltransferases). Enzymes that remove the acetyl groups: HDACs (histone deacetylases).

Acetylation also reduces the affinity of DNA for histones in the nucleosome by neutralizing positive charges on the histone tails. This also helps in chromatin relaxation.

Methylation: lysine ε-amino groups can be methylated up to three times, a process that prevents acetylation, thus maintaining their positive charge. Moreover, arginine side chains can also be methylated. Histone methyltransferases (HMT) methylate histone tails, while histone demethylases (HDMT) remove methyl groups from histones. As a general rule, methylation favors condensation while acetylation favors relaxation.

Histone Code: Heterochromatin vs Euchromatin

Heterochromatin usually contains histone H3 modified by methylation of lysine 9 or 27. Euchromatin generally contains histone H3 extensively acetylated on lysines 9 and 14 and, to a lesser extent, at other H3 lysines, together with methylation of lysine 4 and phosphorylation of serine 10.

Transcription Factors Can Influence Histone Tail Modification

Some transcription factors can act as histone tail modifiers by recruiting HAT / HDAC / HMT / HDMT. Often one modification brings further modifications (example: figure). These successive histone modifications often occur during transcription initiation.

The alterations of chromatin structure that occur during transcription initiation can persist for different lengths of time. In some cases, as soon as the transcription regulator dissociates from DNA, the chromatin modifications are rapidly reversed, restoring the gene to its pre-activated state. In other cases, the altered chromatin structure persists, even after the transcription regulator that directed its establishment has dissociated from DNA.
In principle, this memory can extend into the next cell generation.

Gene Regulatory Example: Development

Regulatory DNA defines the gene expression patterns in development. The genome is the same in a muscle cell as in a skin cell, but different genes are active because these cells express different transcription regulators that bind to gene regulatory elements. For example, transcription regulators in skin cells recognize a regulatory element in gene 1, leading to its activation, whereas a different set of regulators is present in muscle cells, binding to and activating gene 3. Transcription regulators that activate the expression of gene 2 are present in both cell types.

Patterning by Sequential Induction

A series of inductive interactions can generate many types of cells, starting from only a few.

Studies in Drosophila Have Revealed the Gene Control Mechanisms Underlying Development

Like the eggs of other insects, but unlike most vertebrates, the Drosophila egg (shaped like a cucumber) begins its development with an extraordinarily rapid series of nuclear divisions without cell division, producing multiple nuclei in a common cytoplasm, a syncytium. The nuclei then migrate to the cell cortex, forming a structure called the syncytial blastoderm. After about 6000 nuclei have been produced, the plasma membrane folds inward between them and partitions them into separate cells, converting the syncytial blastoderm into the cellular blastoderm.

Transcription Factors Distribute Unevenly in Drosophila

During oogenesis, mRNA for a transcription factor called Bicoid is placed by the mother at one side of the egg, and it forms a gradient in the egg.
Upon fertilization, the mRNA is translated, creating a gradient of Bicoid protein. The region with the highest concentration of Bicoid becomes the anterior region, and the one with the least Bicoid becomes the posterior side.

Different transcription factors distribute differently across the syncytium. Example: the Even-skipped (Eve) transcription factor is expressed in 7 stripes. Eve has 7 cis-regulatory regions, one per stripe. Each Eve regulatory region is controlled by 2 activators and 2 repressors.

CpG Islands

By a mechanism that remains to be elucidated, most RNA polymerase II molecules transcribing in the “wrong” direction, i.e., transcribing the non-sense strand, pause or terminate by ≈1 kb from the transcription start site. Transcription occurs in both directions, but Pol II molecules transcribing in the sense direction elongate beyond 1 kb much more efficiently than those transcribing in the antisense direction.