Microbial Genetics and Genomics - 3050 Section 4 Slides PDF
Document Details
Uploaded by IngeniousCentaur3308
Summary
These slides cover microbial genetics and genomics, including information flow in cells, DNA structure, genes, prokaryotic and eukaryotic gene structures, transcription, and regulation of gene expression.
Full Transcript
Microbial genetics and genomics, and their applications in society
Chapters 6, 9, 10, 12, 13, 19

Information Flow in Cells
In prokaryotes, transcription and translation can be coupled, with translation starting on an mRNA before it's completed.
We will see one reason this is important later when we talk about regulation of gene expression.

DNA structure
specific base-pairing between cytosine and guanine, adenine and thymine
anti-parallel strands, DNA has directional information (5'-3')
3D structure shows exposure of bases in the major and minor grooves: this is how DNA-binding proteins find the correct sequences to bind to

DNA structure
DNA is a massive structure: the E. coli chromosome is 700X the length of the cell
it is compacted in the cell through supercoiling and interactions with proteins

Genes
GENE: piece of nucleic acid that specifies a function
genes produce:
  mRNAs (translated to make proteins)
  tRNAs (involved in protein synthesis)
  rRNAs (key components of the ribosome)
  other active RNAs (regulatory, enzymatic, etc.)
not all genes encode proteins

Genes
gene structures are different in the different domains of life
prokaryotes:
  can have multiple protein coding regions on one mRNA
  no introns (usually)
  operon: cluster of co-transcribed genes with expression controlled from a regulatory region before the first gene
  polycistronic mRNA: several genes on one transcript
eukaryotes:
  one protein coding region on one mRNA
  introns are common
  primary RNA transcripts undergo processing to generate the final mature mRNA: 5' caps, poly-A tails, splicing to remove introns

Prokaryote Gene Organization
but some prokaryotic RNAs also get processed, e.g. rRNAs, which are key components of ribosomes

Transcription - RNA polymerase
RNA polymerase is a multi-protein complex
promoter: region where RNA polymerase binds, opens the dsDNA and transcription starts
termination happens at specific sites in the DNA

Bacterial sigma factors
Bacteria use the sigma protein for promoter recognition
sigma is only involved in initiation and is released after transcription begins
bacteria use different sigma factors to control transcription of different sets of genes, e.g., E.
coli has 7 sigmas

Initiation in Archaea and Eukarya
Archaea and Eukarya use TBP and TFB proteins for promoter recognition
they bind to the promoter and then RNA polymerase binds
promoters have different sequence properties compared to Bacteria

RNA polymerases and evolution
archaeal and eukaryotic RNA polymerases are much more similar to each other than to the bacterial one
fits with the evolution of the nucleus from an archaeal cell

Transcription termination
in bacteria, transcription often stops at inverted repeat sequences
after transcription of the inverted repeat sequence, the resulting RNA folds into a stem-loop structure that causes RNA polymerase to stop and fall off the DNA

Regulation of gene expression
some genes are expressed all the time: constitutive
most are expressed only when needed: regulated
don't want to have every protein present and active all the time
different mechanisms can be used to control whether or not a protein is produced and/or active
regulation points: transcription, translation, protein activity/stability

Regulation of transcription initiation
First level of control is through the sigma and TBP proteins
Additional DNA-binding proteins that are not part of RNA polymerase act to control whether it can bind to promoters and initiate transcription
Negative regulators: prevent transcription
  the regulatory protein is a repressor that inhibits binding of RNA polymerase
  repressors bind to a DNA sequence called an operator
  operators are located between the promoter and the gene(s)
Positive regulators: promote transcription
  the regulatory protein is an activator that stimulates binding of RNA polymerase
  activators bind to a DNA sequence called an activator binding site (ABS)
  ABS can
be directly beside the promoter or further away

DNA-binding regulatory proteins
DNA-binding regulatory proteins interact with DNA at specific sequences
bind at the exposed bases in the grooves of the dsDNA
in prokaryotes, many DNA-binding proteins have a HELIX-TURN-HELIX structure, where one of the helices binds to the DNA (handwritten note: most common)
often function as dimers and bind to inverted repeat sequences on the DNA, one monomer bound at each repeat

Induction
common for control of expression of genes that encode catabolic enzymes
substrate turns on expression of the gene(s)
can be controlled by a repressor (= negative induction) or by an activator (= positive induction)
e.g. lactose catabolism

Negative Induction - lac operon
lac operon (3 genes) for catabolism of the sugar lactose is controlled by negative induction:
when no lactose is present, the repressor is bound to the operator and the genes are not transcribed
the repressor protein blocks RNA polymerase from transcribing the genes (like a roadblock)
when lactose is present, the inducer binds to the repressor
this causes it to change shape so that it can no longer bind to the operator, so the genes are transcribed

Positive Induction - mal operon
catabolism of the sugar maltose: genes are only expressed when maltose is present
requires an activator protein, but the activator protein cannot bind to DNA without the inducer
RNA polymerase cannot bind to the promoter and start transcribing the genes without the activator bound to the DNA
when maltose is present, it binds to the activator protein as the inducer
this causes it to change shape, and it can then bind to DNA
it stimulates RNA polymerase to bind to the promoter and thereby activates transcription of the operon

Repression
common for control of expression of genes that encode anabolic enzymes
product of the reaction or pathway turns off expression of the
gene(s)
e.g. arginine biosynthesis enzymes: add arginine to a growing culture and the cells continue to grow as before, but expression of the arginine biosynthesis genes is turned off
when no arginine is present, the genes encoding the biosynthesis enzymes are transcribed because the repressor is not bound to the operator
when arginine is present, it binds to the repressor
this causes it to change shape, and it can now bind to the operator and transcription is stopped (roadblock)

Operons versus Regulons
genes for utilizing lactose are all in one operon, which is regulated by the LacI repressor protein
genes for utilizing maltose are actually located in three different locations, all of which are regulated by the maltose activator protein
a set of genes/operons that are all controlled by the same regulator is referred to as a regulon

Global control of gene expression
control of genes for catabolism of individual sugars is straightforward when there is only 1 sugar to consider, but what do cells do when there is more than 1 sugar available?
this has been well-studied in E. coli
in the presence of 2 sugars, the cells will only use 1 of them, and then once that has been used up, they switch to using the other
results in two distinct growth phases in the batch culture: diauxic growth

Global control of gene expression
this phenomenon is called catabolite repression
when 2 sugars are available, e.g. glucose and lactose, the cells will first use up all the glucose and then move on to using the lactose (glucose is used first because it is a better food source than lactose)
the lactose utilization genes are not turned on until the glucose is gone

Global control of gene expression
if lactose is present, why aren't the genes expressed?
expression of lac genes requires the repressor to be absent from the operator AND a bound activator protein called CRP to be present

CRP and catabolite repression
CRP = cyclic AMP receptor protein
only binds to DNA when bound to cAMP
cAMP is synthesized by the enzyme adenylate cyclase
glucose inhibits this enzyme, so high glucose = low cAMP
glucose present = low cAMP = no cAMP-CRP = no expression of lac genes
glucose absent and lactose present = repressor not bound to operator and higher cAMP = cAMP-CRP = expression of lac genes

Regulation by Two-Component Systems
one of the major ways that bacteria sense and respond to their environment is through protein signaling pathways called two-component systems (TCS)
also found in Archaea and a few cases in Eukarya
e.g. E. coli has ~50 different 2-component systems that regulate reactions to different stimuli
most consist of 2 proteins:
1. sensor kinase
2. response regulator
sensor kinase: senses something and responds by phosphorylating itself on a histidine residue (handwritten note: also called kinases)
many are located in the cytoplasmic membrane
response regulator: gets phosphorylated by the sensor kinase and then goes and regulates something
many are DNA-binding transcriptional regulators

Regulation by Two-Component Systems
E. coli controls which porin proteins it has in the OM through a TCS
sensor kinase EnvZ senses the osmotic force in the environment
activated EnvZ-P activates the response regulator OmpR (outer membrane protein regulator), which controls transcription of the porin protein genes

Chemotaxis is controlled by a complex TCS
(handwritten note: cells that swim with flagella move towards attractants and away from repellents)
cells make directed movements towards attractants and away from repellents
sensing and response is done through a multi-protein TCS
the response regulator CheY controls the rotation of the flagellum by binding to the motor (handwritten note: causes the motor to change the direction in which it is spinning; the cell tumbles)
not all response regulators control gene expression

Regulation by Quorum Sensing
some organisms sense how many other cells are around them and modify their gene expression and behaviour in response to the population density = quorum sensing (handwritten note: sensing the amount of other cells in the environment and changing behaviour based on that)
examples of QS-regulated behaviours: motility, toxin production, light production, biofilm formation
very common in Bacteria, but also happens in Archaea and Eukarya

Quorum Sensing
cells synthesize and release a signal molecule called an autoinducer (handwritten note: broad term)
when a lot of cells are present, there is a lot of the autoinducer present in the environment
a receptor protein senses the autoinducer and the cell then responds in some way
different species make different autoinducer molecules; AHLs are used by many bacteria
receptor protein becomes activated when it binds the autoinducer and cells respond in some way, often by changes in gene expression

Quorum Sensing
was discovered in bacteria that use QS to regulate bioluminescent light production
these bacteria will colonize specialized organs inside squid, where they grow to
high density and then emit light (handwritten note: symbiosis)

RNA-based regulation
some gene expression is controlled directly on the RNA, through RNA folding and RNA-RNA binding
different mechanisms, all of which serve to control protein production:
  antisense RNAs
  riboswitches
  attenuation

Regulation by Antisense RNAs
in bacteria, translation of an mRNA requires specific sequences on the mRNA called ribosome binding sites (RBS) that are next to the start codon (AUG) for the protein
antisense RNAs and RNA folding can control whether or not the RBS is available for a ribosome to bind and begin translation
can be positive or negative regulation:
  antisense RNA blocks RBS = no translation
  antisense RNA binds to the mRNA and frees the RBS = translation

Regulation by Antisense RNAs
antisense RNAs can also control if the target RNA gets degraded or is stabilized and therefore expressed
can be positive or negative regulation:
  antisense RNA promotes mRNA degradation = no translation
  antisense RNA prevents mRNA degradation = translation

(handwritten note: End of Midterm material)
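The "antisense RNA blocks RBS = no translation" case can be illustrated with a small sketch. It is only a toy model under stated assumptions: the sequences are made up, the function names are hypothetical, and pairing is treated as perfect Watson-Crick complementarity, which real antisense RNAs do not require.

```python
# Toy model: does an antisense RNA base-pair over the ribosome binding site (RBS)?
# Sequences and function names are illustrative assumptions, not from the slides.

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def antisense_covers_rbs(mrna: str, rbs: str, antisense: str) -> bool:
    """True if the antisense RNA pairs perfectly with a stretch of the mRNA
    that overlaps the RBS (simplification: perfect pairing, first match only)."""
    rbs_start = mrna.find(rbs)
    if rbs_start == -1:
        raise ValueError("RBS not found in mRNA")
    rbs_end = rbs_start + len(rbs)
    # The mRNA stretch that this antisense RNA would base-pair with.
    target = reverse_complement(antisense)
    site = mrna.find(target)
    if site == -1:
        return False  # no perfectly complementary site on this mRNA
    # Overlap test between the paired region and the RBS.
    return site < rbs_end and site + len(target) > rbs_start


mrna = "GGCUAGGAGGUAACAUAUGGCUAAA"  # toy mRNA: RBS (AGGAGG) followed by an AUG start codon
rbs = "AGGAGG"
blocking = reverse_complement("CUAGGAGGUAAC")  # designed to pair across the RBS
unrelated = "GGGGCCCCAAAA"                     # pairs nowhere on this mRNA

print(antisense_covers_rbs(mrna, rbs, blocking))   # True  -> RBS hidden, no translation
print(antisense_covers_rbs(mrna, rbs, unrelated))  # False -> RBS free for the ribosome
```

The same overlap test could be reused for the positive case by asking whether an antisense RNA pairs with an inhibitory region elsewhere in the mRNA rather than with the RBS itself.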
Regulation by Attenuation
attenuation functions by controlling the completion of mRNA synthesis (not initiation)
this mechanism relies on the coupling of transcription and translation, i.e., that translation of an mRNA can start before it has been completely synthesized/transcribed
mRNA contains a leader region, and translation of this leader region determines whether or not the transcription will continue
leader region can fold into different structures and this folding either allows transcription to continue or causes it to terminate

Regulation by Attenuation
attenuation and the tryptophan biosynthesis genes
cells need tryptophan, but they only need to make it when it's not available from the environment
they do not want to express the biosynthesis genes if they do not need to make it
attenuation is controlled by translation of a leader peptide, which contains 2 Trp residues
other amino acid biosynthesis operons are controlled in similar ways
when there is plenty of tryptophan present in the cell, the leader peptide is translated quickly as the mRNA is produced by RNA polymerase
this allows the formation of a terminator stem-loop in the mRNA next to the RNA polymerase, which causes termination of transcription
when there is low Trp, slow translation of the leader leads to different base-pairing in the mRNA and no terminator structure is formed next to the RNA polymerase, so transcription continues

Regulation by Attenuation
lots of Trp = fast leader translation, formation of terminator stem-loop right next to RNA polymerase
little Trp = slow leader translation, different base-pairing in the RNA and no formation of terminator stem-loop right next to RNA polymerase

Microbial Genomics
genome: complete genetic makeup including chromosome(s) and any plasmids, etc.
genomics: the mapping, sequencing, analyzing and comparison of genomes
first complete cellular genome sequence was from Haemophilus influenzae, reported in 1995
currently >225,000 prokaryotes have complete or in-progress genomes in the public sequence database

Microbial Genomics
the genome sequence of an organism is an information map that can be exploited for many applications

Microbial Genomics
massive increase in genome sequencing enabled by advances in high-throughput sequencing (HTS) technologies
e.g. Ion Torrent "Personal Genome Machine" (first purchased at MUN in 2011):
  in ~3 hours, it could generate 500 Mb of sequence data
  cost for all materials for 1 run ~$800
e.g. Oxford Nanopore MinION:
  ~30 Gb of DNA sequence data, 7-12 million reads
  generates ultra-long read lengths (hundreds of kb), but the sequence generated has errors
  data available in real time so you can see what you're getting while it's running
  cost to buy instrument ~$1000 and cost for materials for 1 run ~$2000
e.g. Illumina iSeq:
  ~1.2 Gb of DNA sequence data, 4 million reads
  high quality data, very few errors
  cost for instrument ~$20,000 and all materials for 1 run ~$1,000

Genetic elements in cells
all cellular organisms have double-stranded DNA as their genetic material (not true for viruses)
most prokaryotes have circular chromosomes, but not all
plasmids are extra-chromosomal self-replicating elements, can be circular or linear

Plasmids in prokaryotes
plasmids are much smaller than chromosomes
e.g., F plasmid versus E.
coli chromosome: 99,200 bp versus 4,639,675 bp

Prokaryote Genomics
large range of genome sizes for different organisms
endosymbionts and parasitic organisms have the smallest genomes, free-living species have larger genomes
within the free-living bacteria, genomes can vary in size by 10X

Prokaryote Genomics
Mycoplasma genitalium has the smallest genome of any non-endosymbiont, but it is an obligate parasite
some endosymbiont genomes are extremely small, and these organisms have essentially become organelles

Prokaryote Genomics
what does larger genome size mean? = more genes!
(figure: genome size versus number of genes; every dot represents one species)
almost perfectly 1 gene per 1000 bases, even after billions of years of evolution

Eukaryotic genomes
as with prokaryotes, large variation in genome size
can be as small as bacterial genomes
not a linear correlation between genome size and # of genes as found for prokaryotes, e.g. the protozoan Trichomonas genome is <1/10 the size of the human genome but has >2X as many genes

Eukaryotic genomes
large variation in numbers of introns per gene, which is part of the reason there is not the same linear relationship between genome size and number of genes as seen for prokaryotes

Prokaryote Genomics
sequencing the genome of an organism is very useful, but there is still a lot that is not known about what all the genes are doing, even when the genomes are fairly small and/or from a long-studied model organism like E.
coli
fairly consistent % of genes of unknown function, no matter the genome size

Function versus size
the proportions of genes involved in different cellular functions change as the genome size changes
the processes of DNA replication and translation are done by large machines (DNA polymerase and ribosome) and these machines do not change with genome size, so those functions make up a large fraction of the smallest genomes
their proportion goes down in large genomes because the # of genes required for these processes doesn't change
larger genomes and more genes means more different functions and greater flexibility, but this means a larger proportion of genes are needed to sense the environment and regulate expression accordingly (= signal transduction and transcription)
(handwritten notes: more genes = more "different things"; ORF = open reading frame = gene sequence)

Microbial Genomics
the genome sequence provides a prediction of the organism's physiological capabilities
e.g. Vampirovibrio chlorellavorus, a predatory cyanobacterium that does not have photosynthesis genes but instead attaches to algal cells and sucks out their contents for nutrients!

Using genomics to understand biology
DNA sequence is only the information storage
need to look at gene expression to understand a cell's functioning
use of microarrays or RNA-seq (sequencing) to quantify transcription
use of mass spectrometry for proteomics to quantify protein levels, modifications, etc.

Metagenomics
Sequencing of DNA directly from environmental samples
Allows us to learn about complex communities and organisms we cannot grow in the laboratory
Estimates are that we can only grow ~1% of naturally occurring microbes in the lab (remember the Asgard archaea?)
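The near-linear relationship noted in the Prokaryote Genomics slides (about 1 gene per 1000 bases) lends itself to a quick back-of-envelope estimate. The sketch below is only that: the function name is hypothetical, the density is a rule of thumb rather than an exact constant, and applying it to a plasmid is an assumption made here for illustration. The sizes are the two figures quoted in the slides. It does not apply to eukaryotes, where intron content breaks the linear relationship.

```python
# Rule of thumb from the slides: prokaryotes carry roughly 1 gene per 1000 bases.
BASES_PER_GENE = 1000  # approximate gene density, not an exact constant


def estimated_gene_count(genome_size_bp: int) -> int:
    """Rough gene-count estimate for a prokaryotic replicon of the given size."""
    return round(genome_size_bp / BASES_PER_GENE)


# Sizes quoted in the slides.
sizes_bp = {
    "E. coli chromosome": 4_639_675,
    "F plasmid": 99_200,
}

for name, size in sizes_bp.items():
    print(f"{name}: ~{estimated_gene_count(size)} genes")
# E. coli chromosome: ~4640 genes
# F plasmid: ~99 genes
```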
Single-cell genomics
technological developments now allow these techniques to be applied to single cells
can characterize individual cells from natural environments without the need to grow them in the lab in pure culture
individual cells are sorted into small wells in plates and then their DNA is sequenced

Genome evolution
how do genomes change over time?
mutations (genetic drift)
gene duplications: duplication of a gene within a genome, followed by mutations that change the function of one of the versions
gene deletions
mobile elements: mobile genetic elements such as viruses, plasmids and transposons cause important changes in genomes
horizontal gene transfer (HGT): the movement of genetic material from one organism/cell to another
prokaryotes reproduce by fission, so HGT is their version of sex
DNA that gets into a new cell will only be maintained if it:
1. integrates into a replicating part of the genome, or
2.
is capable of autonomous replication

Genome evolution
bacteria and archaea evolve very quickly through all these different processes, and genes move among different species in the natural environment, where they almost always exist in complex communities with many different species present

"Core" versus "pan" genomes
these evolutionary changes are constant, so any prokaryotic "species" is really a continuum of related cells that have some genes in common and some that are unique
core genome: the minimal set of genes that are found in all cells of a species (or whatever group is being considered)
pan genome: the collection of all genes that are found in any cell of a species (the core genes plus those present in only some strains)

Core and pan genomes
e.g.: Salmonella enterica strains
core = a set of 2811 genes that are present in each strain
each strain then also has a number of genes that are unique to it

Core and pan genomes
e.g.: Escherichia coli strains
comparison of a non-pathogenic strain (K-12) to 2 pathogenic strains
(figure: green shows regions shared by all 3 strains; red, blue and orange show chunks of additional DNA found in the pathogenic strains)
some of these regions are known as "pathogenicity islands" (PAI on figure), which are clusters of genes that make the strains more pathogenic

HGT in prokaryotes
evolution by gene duplication and mutation is "slow" whereas evolution by HGT is extremely fast
Transformation: uptake of free DNA from outside of the cell
Transduction: transfer of DNA between cells by viruses
Conjugation: transfer of plasmid DNA between cells; if a plasmid becomes integrated into the chromosome it can also lead to transfer of chromosome regions

Transformation
discovered while studying Streptococcus infection, before it was even known that DNA was the genetic material
the capsule-producing strain S ("smooth" colonies) causes disease
the R strain ("rough" colonies) that does not produce the capsule does not cause disease

Transformation
dead S cells do not cause disease and live R cells do not
cause disease, but a mixture of dead S and live R cells added together resulted in disease
live S cells could be recovered from the infected animal
R cells were "transformed" into S cells by the dead S cells; it was later shown that it was DNA from the S cells that went into the R cells, transforming them into S cells

Transformation
cells that take up DNA are competent
some species are naturally competent (e.g. Streptococcus) and these usually take up linear DNA from the environment
some species can be made competent by treatment with chemicals (e.g. E. coli) and these will then take up entire plasmids

HGT in prokaryotes
Transduction: transfer of DNA between cells by viruses
during replication of a virus inside the cell, some virus particles can end up with the cell's DNA inside instead of virus DNA
this can then be transferred to another cell
Two types, which differ because of the way different viruses replicate:
1. Generalized: any gene in the cell can be transferred
2. Specialized: only certain genes can be transferred

Generalized transduction
any gene in the cell can be transferred
during production of the virus particles, it is possible for some to end up with a piece of host DNA instead of virus DNA (these are defective viruses; handwritten note: not functional as a virus for replication)
this DNA can then be transferred to another cell, and get incorporated into the genome

Specialized transduction
only certain genes can be transferred
happens with some viruses that integrate into the cell's genome
viruses that integrate into the host genome are called TEMPERATE
when integrated in the genome, they are called a PROPHAGE
a host cell with an integrated virus is called a LYSOGEN

Specialized transduction
when the integrated virus is induced, it cuts itself out of the host genome and replicates to produce new viruses
these viruses sometimes make mistakes and cut themselves out improperly and take some host DNA along
this host DNA is then replicated and packaged
inside the particles with the virus DNA
it is only the genes right next to where the virus integrates that get packaged and transferred (i.e., it is specialized for transferring those genes)

Conjugation
transfer of plasmid DNA between cells (also called mating)
conjugative plasmids: carry genes that cause the transfer between cells
involves cell-cell contact via a pilus
e.g., the F plasmid of E. coli
genes in the tra region produce the pilus and carry out the plasmid transfer; the tra region takes up ~1/3 of the F plasmid

Conjugation
1. pilus contacts the recipient cell
2. pilus retracts, bringing cells together
3. cells fuse together
4. transfer of one strand of plasmid into the recipient, with replication of the complementary strand of DNA in the recipient
5. cells separate and both are now F+ (handwritten note: F+ = cell with the F plasmid)

Transposable elements
TEs: pieces of DNA that can move from one location to another in the genome, "jumping genes"
have a big impact on evolution of genomes
three different types:
1. insertion sequences (IS) (handwritten note: simplest)
2. transposons (Tn)
3. transposable viruses

Transposable elements
contain a gene, tnp, that encodes the enzyme responsible for the DNA movement: TRANSPOSASE
also have inverted repeat DNA sequences at their ends (20 - 1000 nucleotides)
IS are very simple: just a tnp gene and the inverted repeats
100s of different IS elements have been found in Bacteria and Archaea

Transposons
transposons are more complex: they have tnp and IR sequences, but also have additional genes
Tn5: has several genes located between 2 IS elements, IS50L and IS50R (almost identical, but the tnp gene in IS50L is not functional)
middle section contains 3 different antibiotic resistance genes (many Tns carry antibiotic resistance genes)

Transposable elements
scattered throughout most prokaryote genomes (chromosomes and plasmids)
responsible for a lot of the variation between strains of the same species
E.
coli F plasmid contains 4 TEs: IS2, 2 copies of IS3, Tn1000

TEs cause mutations
transposable elements can cause gene disruptions
insertion of a TE into a gene disrupts that gene and there will usually not be a functional gene product produced

Applications of prokaryote genetics
microorganisms, their genes and genomes are the foundation for genetic engineering and biotechnology
recombinant DNA: combining two or more different pieces of DNA into a new genetic entity
made possible by studying and using plasmids from microbes

Applications of prokaryote genetics
recombinant DNA technology was originally made possible through the use of restriction enzymes: enzymes produced by bacteria that cut DNA at specific sequences
(figure: EcoRI cutting double-stranded DNA, strands labelled 5' to 3' and 3' to 5')
present in bacteria as a defense against viruses
the cell's own DNA is protected by another enzyme that modifies the DNA structure at those sequences
incoming virus DNA gets cut into pieces

Applications of prokaryote genetics
use specialized plasmid vectors, such as pUC19, to "clone" foreign DNA so that it can be propagated in E. coli and manipulated
these plasmid vectors carry antibiotic resistance genes so that only cells containing the plasmid will grow, cells with no plasmid cannot grow = selection

Applications of prokaryote genetics
pUC19 also carries another gene so that cells containing a recombinant plasmid can be distinguished from those that only have the vector without any foreign DNA
make E.
coli cells competent by treating with specific chemicals and add the possible recombinant DNA sample (some vector DNA will reclose on itself without a foreign piece inserting)
cells with the closed vector make blue colonies because they have a functional lacZ gene that allows them to metabolize a chemical that is similar to lactose (but turns blue when metabolized)
cells with a recombinant plasmid form white colonies
cells with plasmids, both with and without extra DNA inserted, can grow but they can be distinguished by their appearance = screening
these approaches are used to produce proteins that are used in human medicine

Applications of prokaryote genetics
and production of products for agriculture
production of bovine somatotropin protein in E. coli; it is then purified and injected into dairy cows to increase milk production

Applications of prokaryote genetics
and genetic engineering in agriculture
making herbicide resistant plants, insect resistant plants, faster growing fish for aquaculture

Engineered microbes
trying to change/engineer microbes to have specific properties so they can be used to treat diseases
e.g., manipulated Listeria monocytogenes to be weaker for infecting normal mouse cells, but it could still replicate in mouse cancer cells
tagged the bacterial cells with a radioactive compound so they brought it inside the cancer cells and killed them

Engineered microbes
making synthetic genomes and cells
synthesized large overlapping fragments of a bacterial genome
put them into yeast and they recombined into one circle
transform it into a bacterial cell and a "synthetic" cell is generated after cell division

Applications of metagenomics
bio-prospecting from uncultured microorganisms in natural environments
take DNA from the environment and clone it, then screen to find the colonies with recombinant DNA that have a new interesting property
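The difference between selection (antibiotic resistance) and screening (blue versus white colonies) in the pUC19 cloning slides can be summarised as a small decision function. This is a minimal sketch: the names `Colony` and `plate_outcome` are hypothetical, and it assumes, as is standard for this kind of vector but not spelled out in the slides, that the foreign DNA inserts within lacZ and so inactivates it.

```python
from enum import Enum


class Colony(Enum):
    NO_GROWTH = "no growth"
    BLUE = "blue colony"
    WHITE = "white colony"


def plate_outcome(has_plasmid: bool, has_insert: bool) -> Colony:
    """Predict what a transformed cell does on antibiotic + indicator plates.

    Selection: cells without the plasmid lack the resistance gene and cannot grow.
    Screening: vector-only cells keep a functional lacZ (blue); an insert is
    assumed to disrupt lacZ, so recombinant colonies stay white.
    """
    if not has_plasmid:
        return Colony.NO_GROWTH  # has_insert is irrelevant without a plasmid
    return Colony.WHITE if has_insert else Colony.BLUE


for has_plasmid, has_insert in [(False, False), (True, False), (True, True)]:
    outcome = plate_outcome(has_plasmid, has_insert)
    print(f"plasmid={has_plasmid}, insert={has_insert}: {outcome.value}")
```

Selection removes the cells that are of no interest, while screening only labels the survivors; the white colonies are the ones picked for further work, including in the metagenomic bio-prospecting approach described above.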