Production & Purification of Recombinant Proteins PDF
University of Bordeaux
Sarah Courtois
Summary
These notes cover the production and purification of recombinant proteins. They discuss the construction of expression vectors, the selection of host cells, and the production and purification phases, and they outline a strategy for production and purification together with the components involved.
Full Transcript
UE TBMC P3R _ M1 CBIO, Part 2
PRODUCTION & PURIFICATION of RECOMBINANT PROTEINS
Sarah Courtois ([email protected]), UMR5536 (CNRS), Centre de Résonance Magnétique des Systèmes Biologiques (Center for Magnetic Resonance of Biological Systems)
(Course based on material by Thierry Noël)

General scheme of production and purification of a recombinant protein
1 - Construction of an expression vector, plasmid or virus, allowing strong expression, constitutive or (ideally) regulated, of the gene encoding the protein of interest
2 - Selection and mass culture of a host cell able to express the vector information
3 - Production phase to obtain the expected quantities of protein
4 - Extraction, separation and purification of the protein
Steps 1 and 2 are linked: the expression vector is chosen according to the host cell, and vice versa.
Steps 3 and 4 are linked: production and purification of the protein belong to the same process.

Developing a strategy for the production and purification of a recombinant protein should follow the reverse order (the slide contrasts this with the order of presentation):
- Design a purification method (Step 4)
- Develop the production strategy (Step 3)
- Choose the host cell (Step 2)
- Determine the choice of the vector (Step 1)

P3R: Part 2 Host-vector systems
I. Generalities on the construction of an expression vector
II. Expression systems in the bacteria
III. Expression systems in yeast and fungi
IV. Expression systems in mammalian cells

I. GENERALITIES ON THE CONSTRUCTION OF AN EXPRESSION VECTOR
Construction of an expression vector
Donor organism: an organism naturally expressing the protein of interest (Archaea, Eubacteria, virus, fungi, plants, animals), whose genome carries the gene of interest.
So the first step is to take this genetic information: you use restriction endonucleases to cut out your gene of interest. Then you amplify the gene of interest by PCR to have enough material for the cloning.
The cloning step is when you choose a vector, open it with restriction endonucleases, insert your gene of interest into it, and close the new vector with a ligase. So the step in which you prepare your expression vector is the cloning step. Then, when your vector is ready, you put it in a host cell to express your protein: the result is a genetically modified host organism expressing the protein of interest.
Summary of the scheme: donor organism genome → gene of interest (restriction endonucleases; amplification by PCR) → vector opened with restriction endonucleases → ligase (cloning) → expression vector → host cell → protein of interest.
- Transformation refers to the process of introducing foreign DNA (like a plasmid) into bacterial or yeast cells.
- Transfection is the process of introducing foreign DNA or RNA into eukaryotic cells (like mammalian, insect, or plant cells).
The choice of the host-vector couple is a determinant of success.
Criteria for choosing a host-vector system
Intended use or characteristics of the protein
- Pharmaceutical use (native, active and pure proteins)
- Industrial use (large quantity, economic production)
- Modified proteins (enhanced safety standards)
Qualitative criteria
- Folding, post-translational modifications (glycosylation, ...)
- Stability, solubility
- Biological activity
So if you want a correctly folded protein with post-translational modifications, remember that these are not possible in every host cell you could choose. You also need to consider the stability of your protein, whether it is soluble, and whether it has biological activity.
Quantitative and economic criteria
- Expression level of the protein (yield)
- Purification processes (optimized price/performance ratio)
- Equipment (simple and cheap)
Reference: Anja Schütz et al., STAR Protocols, 2023, "A concise guide to choosing suitable gene expression systems for recombinant protein production".
Genetic structure of a transgene
The transgene and the vector are two distinct but related components in genetic engineering.
The transgene is the gene of interest that will be expressed in the host organism. The vector is the tool or vehicle that carries the transgene into the host cell.
(Figure source: https://www.proteogenix.science/scientific-corner/protein-production/recombinant-protein-expression/)
Transgene = chimeric construction, does not exist in nature
Structure: a typical transgene includes:
- Transcriptional signals / promoter of the host cell (or recognized by the host cell) in which the transgene is expressed: constitutive or inducible. Example: in mammalian cells, the CMV (cytomegalovirus) promoter is often used to achieve strong, ubiquitous expression of the transgene.
- Open Reading Frame (ORF) / coding sequence (encoding the protein of interest): starts with a start codon (AUG) and ends with a stop codon (UAA, UAG, or UGA).
So a transgene is a chimeric construction: you take a gene of interest that exists in an organism, but you modify it and put it in a vector, so it is no longer a natural gene. The transgene carries transcriptional signals, or promoters, which are host promoters or promoters that the host cell can recognize. This is very important: if you choose the wrong promoter and put it in a cell where it is not recognized, your protein will not be produced.
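The ORF rule stated above (a start codon, then codons in frame, then a stop codon) is easy to check on a candidate coding sequence before cloning. The following is a minimal sketch, not part of the course material: the function name `check_orf` and the example sequences are hypothetical, and the check only covers the points listed in the notes (start codon, stop codon) plus the reading frame.

```python
# Minimal sketch (assumption: the coding sequence is given 5'->3' as DNA or RNA letters).
# check_orf is a hypothetical helper name, not a function from any library.

STOP_CODONS = {"TAA", "TAG", "TGA"}  # DNA spelling of UAA, UAG, UGA


def check_orf(cds):
    """Return a list of problems found in a coding sequence (empty list if none)."""
    seq = cds.upper().replace("U", "T")  # accept RNA or DNA spelling
    problems = []
    if len(seq) % 3 != 0:
        problems.append("length is not a multiple of 3")
    # Split into complete codons only; a trailing partial codon is ignored here.
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    if not codons or codons[0] != "ATG":
        problems.append("does not start with ATG")
    if not codons or codons[-1] not in STOP_CODONS:
        problems.append("does not end with a stop codon")
    if any(codon in STOP_CODONS for codon in codons[1:-1]):
        problems.append("internal stop codon")
    return problems


if __name__ == "__main__":
    for example in ("ATGGCTTAA", "ATGTAAGCTTAA", "GCTGCTTAA"):
        print(example, check_orf(example))
```

An empty list means the sequence satisfies the three checks; it says nothing about promoter compatibility, tags, or signal peptides, which still have to be verified on the full construct.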
Transgene map: 5' - host promoter (or recognized by host cell) - ATG - gene of interest (cDNA) - STP - host transcription termination (or recognized by host cell) - 3'
- Possibility to add a signal peptide for protein secretion
- One or several tags (in C-ter if a signal peptide is located in N-ter)
A typical transgene also includes:
- Polyadenylation signal: leads to the addition of a poly(A) tail to the mRNA transcript, which is important for mRNA stability and efficient translation in eukaryotic cells. Example: the SV40 polyadenylation (SV40 poly(A)) signal is often used in mammalian transgenes.
- Optional elements like tags (for protein detection) or a signal peptide (for protein secretion)
Genetic structure of a vector
Multiple Cloning Site (MCS): short segment of DNA containing numerous restriction enzyme recognition sites
(Figure source: https://www.proteogenix.science/scientific-corner/protein-production/recombinant-protein-expression/)
So Escherichia coli was the first host used, particularly by Eli Lilly, a major American pharmaceutical company that played a key role in the mass production of insulin and was one of the first companies to provide insulin for the treatment of diabetes.
Before this, insulin was extracted from the pancreas of cattle and pigs. This animal insulin is biologically similar to human insulin, but it has limitations in terms of scale and cost, and it can cause allergic reactions in some patients because it is not identical to the human hormone.
II. Expression systems in bacteria
- The oldest system used (Eli Lilly, insulin hormone, FDA 1982)
- The simplest system
Escherichia coli (E. coli)
- The least expensive
- Used in research and production
- Other types: Bacillus subtilis, etc.
- Limited to simple proteins
- Little or no post-translational modification
The vector: a plasmid
- Extrachromosomal circular DNA (2 to 6 kbp)
- Autonomous replication (ori), independent from the bacterial chromosome
- Copy number variable according to the plasmid, and controlled (rop or par gene)
- Carries a marker gene (antibiotic resistance)
- Possibility to introduce exogenous DNA, up to 10 kbp
- Possibility to use two plasmids simultaneously in co-transformation experiments
Common characteristics shared by the host bacterial strains
- recA- : mutation for recombination and post-replicative repair
Using recA- strains decreases the risk of recombination between your plasmid and the host genome, which helps maintain the integrity of the recombinant plasmid in the bacterial culture. You will also use dam- and dcm- strains: these mutations inactivate the DNA adenine methyltransferase (dam) and the DNA cytosine methyltransferase (dcm). This is useful when you need plasmids free of adenine or cytosine methylation, which matters especially for restriction enzymes that are blocked by methylated DNA when you construct your vector.
You will also have rB- mB- strains. This refers to a mutation in the host specificity (hsd) system, which inactivates the restriction and modification activities of the bacterial restriction-modification system. This system is responsible for bacterial defense against foreign DNA that invades the bacterium, such as plasmids or phage DNA, so you can understand that we do not want it active when we introduce a plasmid.
Common characteristics shared by the host bacterial strains (summary)
- recA- : mutation for recombination and post-replicative repair
- dam-, dcm- : defective for adenine and cytosine methyltransferases
- hsdS(rB- mB-) : defective host specificity determinant (restriction, methylation). This system is responsible for bacterial defense against foreign DNA.
- No endotoxin
- No endogenous wild-type plasmid
- Non-virulent (often associated with auxotrophy markers → selection tools), e.g. E. coli leu- mutation → transformed with a plasmid containing a functional LEU2 gene
A key element: the promoter of the expression vector
- Strong
- Regulated, ideally inducible
- Minimum level of transcription close to zero when not induced
Lactose operon
Nobel Prize in Medicine, 1965: François Jacob, Jacques Monod and André Lwoff.
The lactose operon is required for the transport and metabolism of lactose in Escherichia coli, as well as in other bacteria of the intestinal flora. An operon is a set of genes placed under the control of a single promoter. In the lactose operon, three genes are controlled by the lac promoter:
- lacZ encodes β-galactosidase, an enzyme necessary to metabolize lactose.
- lacY encodes a permease necessary for the entry of lactose into the bacterial cell.
- lacA encodes a thiogalactoside transacetylase, whose function is not really clear.
Transcription of the operon gives a single mRNA that is then translated into three different proteins. Transcription is regulated by a repressor protein, LacI, encoded by lacI and synthesized independently from the operon.
(Figure from Isabelle Borde, 2005, Biologie et Multimédia, Université Pierre et Marie Curie - UFR de Biologie, www.snv.jussieu.fr/bmedia/operonlactose/)
Promoter of Lac operon
- "Promoter" should be understood here as promoter + operator
- Strains of E. coli defective for β-galactosidase (e.g., DH5α)
- Ampicillin resistance marker
- Screening of colonies expressing β-gal on X-gal
- Induction with IPTG
- Insert: cDNA of the gene of interest + tags
β-gal (lacZ) reaction: X-gal (5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside) → X (5-bromo-4-chloro-3-hydroxyindole) + galactose
IPTG = isopropyl β-D-1-thiogalactoside = same role as allolactose
In fact, the lacZ sequence on the plasmid encodes the alpha-peptide of β-galactosidase. This peptide complements the defective β-galactosidase present in some strains of E. coli.
Blue-white screening works as follows. X-gal is a chromogenic substrate, colorless in its native form; when β-galactosidase is present, X-gal is hydrolyzed and gives a blue product. When you have IPTG in your medium, a colony that expresses an intact lacZ produces functional β-galactosidase, which produces the blue color.
- Blue: β-gal +
- White: β-gal -
If your gene of interest is not inserted in the multiple cloning site, the distance between the promoter and the lacZ gene is short, so β-galactosidase is produced. If your gene of interest is inserted, the distance between the promoter and lacZ is large and there is a stop codon at the end of your gene of interest, so β-galactosidase is no longer produced. So when you add the inducer (IPTG, the "inhibitor of the inhibitor"), you see whether or not your colonies produce the blue color, and only the white colonies carry the gene of interest.
Transcription at the Lac promoter is induced by IPTG, which blocks the repressor LacI.
- When no gene is cloned between the Lac promoter and LacZ, β-gal is produced and cleaves X-gal: blue
- When a gene is cloned between the Lac promoter and LacZ, β-gal is not produced and does not cleave X-gal: white
Promoter of T7 bacteriophage
- The T7 promoter is much stronger than the Lac promoter
- The T7 promoter is specifically recognized by T7 RNA polymerase
In the pET vector, we use a hybrid construction combining the T7 promoter, lacI and lacO. It works like this. We use a specific strain of E.
coli for this, BL21(DE3), which is lysogenic for the defective prophage λDE3. Lysogen means that it carries a prophage, a dormant virus integrated in its genome; defective means that the virus no longer has its virulence genes, so it cannot multiply. λDE3 carries the T7 RNA polymerase gene under the control of the lactose promoter/operator, as we saw previously.
So how does it work? You transform BL21(DE3) with a plasmid that carries your gene of interest under the control of the T7 promoter and the lac operator. When you do not induce expression, that is, you do not add IPTG (which acts like lactose), lacI expresses the repressor, which binds the lac operator both on your plasmid and on the chromosome. So there is no transcription of the T7 RNA polymerase gene, and therefore no transcription of your gene of interest. If you add IPTG, it binds the Lac repressor produced by lacI, which no longer binds the lac operator, either at the T7 polymerase gene or at the T7 promoter. T7 RNA polymerase is produced and recognizes your T7 promoter, and since the operator there is no longer blocked either, you get transcription and production of your protein of interest. Here is another way to present this model.
Specifications for an expression system:
- T7 promoter inducible
- T7 promoter completely off when not induced
Response:
- Hybrid T7 prom / LacI / LacO
E. coli BL21 strain (DE3), lysogen for defective prophage λDE3
- Derived from E.
coli B serotype strain
- Devoid of certain protease activities (OmpT, Lon)
- Lysogen = carrier of a prophage, a "dormant" virus integrated in its chromosome
- Defective = virus whose virulence genes are inactivated, which renders it unable to multiply
- λDE3 = recombinant bacteriophage bearing the T7 RNA polymerase gene under the control of a lactose promoter/operator (lacUV5)
(Figure credit: anonymous)
BL21 transformed with a plasmid carrying a gene of interest (promoter T7/LacO)
When not induced by IPTG, the Lac repressor LacI represses expression of the T7 RNA polymerase gene carried by the defective virus λDE3:
- No T7 polymerase produced
- No transcription at the T7 promoter on the plasmid
➔ No expression of the protein of interest
When IPTG is added to the culture medium, it enters the cells and binds the Lac repressor produced by lacI, which no longer binds LacO, either at the T7 polymerase gene or at the T7 promoter on the plasmid:
- T7 polymerase is expressed
- The gene of interest on the plasmid is transcribed
➔ The protein of interest is produced
Promoter of T7 bacteriophage
The molecular basis of recombinant protein expression in uninduced bacterial host cells using the pET system
If you add IPTG, you block the repressor, which no longer binds the operator between the promoter and the T7 RNA polymerase gene, so you get high expression of T7 RNA polymerase, which recognizes the T7 promoter and drives production of your protein of interest. Here is the cloning region of the pET vector: the T7 promoter and the lac operator sit just before the multiple cloning site.
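The two-level control described above (LacI blocks the T7 RNA polymerase gene on λDE3 and also the T7/lacO promoter on the plasmid, and IPTG relieves both) can be written as a small truth table. This is an idealized sketch under stated assumptions: the function name `pet_expression` is hypothetical, expression is treated as all-or-nothing, and leaky basal expression is ignored.

```python
# Idealized on/off model of the pET / BL21(DE3) system described in the notes.
# Assumptions: no leaky expression, no pLysS, expression is all-or-nothing.
# pet_expression is a hypothetical helper name used only for illustration.


def pet_expression(host_has_lambda_de3, iptg_added):
    """Return which products are made for a given host and induction state."""
    # IPTG binds the Lac repressor, which then releases every lac operator.
    lac_operator_free = iptg_added
    # T7 RNA polymerase gene: carried by lambda-DE3, behind a lac promoter/operator.
    t7_polymerase = host_has_lambda_de3 and lac_operator_free
    # Plasmid gene of interest: needs T7 polymerase AND a free lac operator.
    protein = t7_polymerase and lac_operator_free
    return {"t7_rna_polymerase": t7_polymerase, "protein_of_interest": protein}


if __name__ == "__main__":
    for de3 in (True, False):
        for iptg in (False, True):
            print("DE3 lysogen:", de3, "| IPTG:", iptg, "|", pet_expression(de3, iptg))
```

The case of a host without λDE3 shows why the genotype of the host matters: without a source of T7 RNA polymerase, adding IPTG alone does not give expression from a T7 promoter.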
(Figure: T7 promoter. Source: https://www.takarabio.com/products/protein-research/expression-vectors-and-systems/e-coli-expression-systems/pet-expression-system)
Promoter of T7 bacteriophage
The molecular basis of recombinant protein expression in IPTG-induced bacterial host cells using the pET system (figure from the same source)
Detail of the cloning region of a pET vector
How does it work? What should be the genotype of the bacterial host?
→ E. coli BL21 strain: one of the primary limitations of the BL21 strain is that E. coli translates poorly certain codons that are rare in its own genes but frequent in eukaryotic genes (from humans, plants, animals, etc.). These rare codons can lead to poor expression of eukaryotic proteins because the tRNAs necessary to translate them are either absent or present in very low quantities.
→ AGA and AGG (arginine), AUA (isoleucine), CUA (leucine), CCC (proline), GGA (glycine)
Improved BL21(DE3) E. coli strains
- Carry plasmids overexpressing tRNAs that are rare in E. coli (tRNA for Arg, Ile, Leu)
- Carry plasmids expressing T7 lysozyme, a protein inhibitor of T7 polymerase, to avoid basal or leaky expression of T7 polymerase when expressing toxic recombinant proteins
- A mutant strain defective for RNase E: improved mRNA stability / half-life
II. Expression systems in bacteria
A. The vector: a plasmid
B. Common characteristics shared by the host bacterial strains
C. A key element: the promoter of the expression vector
1. Lactose operon
- Promoter of Lac operon
- Promoter of T7 bacteriophage
- Detail of the cloning region of a pET vector
- E. coli BL21 strain (DE3), lysogen for defective prophage λDE3
- Improved BL21(DE3) E. coli strains
2. Promoter of arabinose operon
Promoter of arabinose operon
A genetic map of the E. coli araC and araBAD operons.
The map indicates the proteins these operons encode and the reactions in which these proteins participate.
So you have three genes, araB, araA and araD, encoding three different enzymes that take part in the conversion of arabinose, hence the name araBAD. araC codes for the regulator, which acts as a repressor in the absence of arabinose.
Promoter of arabinose operon
Negative regulation: when AraC is present, but not L-arabinose or cAMP, AraC links araO2 and araI1 to form a DNA loop, thereby repressing both araC and araBAD.
So you have a DNA loop that blocks expression of the araBAD genes. If you have arabinose in your medium, arabinose binds a small pocket in the N-terminal domain of AraC. Because of the bound arabinose, AraC changes conformation, releases the araO2 site and binds the araI2 site instead.
Positive regulation: when AraC and L-arabinose are both present and cAMP is abundant, the resulting AraC–arabinose complex releases araO2 and instead binds araI2, thereby activating araBAD transcription. This process is facilitated by the binding of CAP–cAMP. araC is repressed by the binding of AraC–arabinose to araO1.
Legend: pC: promoter of araC; pBAD: promoter of the genes for arabinose catabolism; O2: operator of the BAD operon; I1, I2: transcription inducer sites; CAP: catabolite activator protein (binds cAMP when there is no glucose); AraC: positive and negative regulator protein.
Example of an "AraC" vector for the expression of a secreted recombinant protein
Regulation of the pBAD promoter
Construction pBAD-6his-GFP; Western blot of arabinose induction (GFP); induction for 3 h at OD 0.5
Advantages
- No basal or leaky expression if there is no arabinose in the medium
- Precise control of the expression of the recombinant protein by arabinose concentration
- Expression can be regulated (lowered) by glucose (CAP protein)
- Non-toxic and inexpensive production system
Expression systems in bacteria: other regulatory elements
Control of plasmid copy number
- Different ori: Bluescript, 500-700 copies / cell; ColE1, 15-20 copies / cell, controlled by the rop gene (repressor of primer)
- Modulation by selection markers (antibiotic resistance)
If you use a rare codon, you will have less translation than with a common codon. You can play on mRNA stability by adding or removing loops inside your mRNA. You can also modify the presence of Shine-Dalgarno sequences, to help the ribosome bind the RNA and improve translation. You can also play on the growth rate of your bacteria.
Improved translation
- Play on codon usage (rare or not). Example for glutamate (E): GAG / GAA 60/40 in mammals, 35/70 in bacteria
- Enhance mRNA stability: add / remove a loop (Rho-independent 3' region), RNase E mutant (BL21 Star)
- Check / modify the presence of an SD (Shine-Dalgarno) sequence (AGGAGG) upstream of the start codon, for ribosome binding on the RNA
Induction of the expression of the recombinant protein at mid-log phase
3 main parameters can vary at this stage:
- Concentration of IPTG (0.1 to 1 mM) or arabinose
- Duration of induction (1 to 24 hours or more)
- Culture temperature during the expression phase (from 18°C to 37°C)
If expression is excessive or too fast, there is a risk of protein aggregation:
- Insoluble proteins precipitate and are sequestered in inclusion bodies
- Increased risk with the T7 promoter: when activated, recombinant proteins represent ± 50% of total cellular proteins
So you have a risk of aggregation: insoluble proteins precipitate and are sequestered in inclusion bodies. This is particularly a risk with the T7 promoter, which is very strong: when it is activated, the recombinant protein represents around 50% of total cellular protein, and with too much or too fast expression these proteins can aggregate and form inclusion bodies.
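The two translation checks listed above (codon usage and the presence of a Shine-Dalgarno sequence) lend themselves to a quick scan of a construct. The sketch below is illustrative only: it uses the rare codons named in these notes for E. coli (AGA, AGG, AUA, CUA, CCC, GGA) and the AGGAGG consensus, the function names are hypothetical, and a real design would rely on a full codon-usage table rather than a fixed list.

```python
# Minimal sketch: count rare E. coli codons in a coding sequence and look for
# the Shine-Dalgarno consensus upstream of it.
# Assumptions: rare-codon list and AGGAGG motif are taken from these notes;
# rare_codon_counts and has_shine_dalgarno are hypothetical helper names.
from collections import Counter

RARE_CODONS = {"AGA", "AGG", "ATA", "CTA", "CCC", "GGA"}  # DNA spelling
SHINE_DALGARNO = "AGGAGG"


def rare_codon_counts(cds):
    """Count rare codons, reading the sequence in frame from its first base."""
    seq = cds.upper().replace("U", "T")  # accept RNA or DNA spelling
    codons = [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]
    return Counter(codon for codon in codons if codon in RARE_CODONS)


def has_shine_dalgarno(upstream_region):
    """True if the exact AGGAGG consensus occurs in the region before the start codon."""
    return SHINE_DALGARNO in upstream_region.upper().replace("U", "T")


if __name__ == "__main__":
    print(dict(rare_codon_counts("ATGAGAAGGCCCTAA")))
    print(has_shine_dalgarno("ttaactttaagAAGGAGGtatacat"))
```

A high count of rare codons suggests either recoding the gene or using a host strain that supplies the corresponding tRNAs; the exact-match search for AGGAGG is deliberately strict and will miss weaker Shine-Dalgarno variants.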
Different cellular compartments for expression of recombinant proteins in E. coli: cytoplasm, periplasm, external medium (after I. turbica)
1 - Soluble within the cytoplasm. Most common situation.
Solubility depends on the correct folding of the protein
- Ensured by different chaperones: Trigger factor, DnaK, GroES/GroEL
- Possibility to enhance folding by overexpression of GroES/GroEL
Solubility depends on the quantity of protein produced
- Modulation by inducer concentration (IPTG, arabinose)
- Modulation by incubation temperature
- Plasmid copy number
So solubility depends on the quantity of protein you produce, and this quantity depends on the concentration of inducer you add to your medium.
2 - Secretion into the periplasm. A good location to reduce purification costs.
Protein exported by a transporter system (translocase)
- The recombinant protein must carry a secretion signal in N-ter. Examples: 21 aa from OmpA, an outer membrane protein; N-terminal region of periplasmic alkaline phosphatase (PhoA); virus envelope protein (for example the gIII secretion signal)
- Additional S-S maturation by the Dsb (disulfide bond) system
- Disadvantage: low secretion yield
3 - Insoluble inclusion body in the cytoplasm.
Formation favored by:
- Protein expressed too fast or in too large an amount
- Imbalance in the ratio recombinant protein / chaperones ensuring folding / oxidases ensuring S-S
- Hydrophobic domains.
Advantages:
- Aggregated protein is protected from proteolysis
- Protection of the cell against possible toxicity of the recombinant protein
Solubilization of inclusion bodies and refolding of the protein
1 - Solubilization
In a denaturing buffer containing 8 M urea or 6 M guanidine-HCl, and a high concentration of reducing agent (2-ME or DTT).
2 - Refolding
Consists in removing the denaturing agent:
- By successive dialysis against decreasing concentrations of denaturing agent
- By refolding on affinity chromatography: the protein solubilized in denaturing agent is loaded on a column (e.g. IMAC) and the column is washed with buffer containing no denaturing agent
So you remove the denaturing buffer slowly, to allow the proteins to fold again properly. You either do a dialysis to remove the denaturing agent completely or, on your chromatography column, you wash the column with a buffer containing no denaturing agent.
To check whether the protein is well folded, you can use the GFP folding test, which is quite simple. When you construct your transgene, you add the sequence coding for GFP at the end of your gene of interest. The principle is that when the fused recombinant protein is expressed and well folded, you see GFP fluorescence, because GFP is fluorescent only if it is properly folded. So if you see the GFP signal, your protein is well folded. If you do not, you probably have inclusion bodies and need to refold your protein.
GFP-folding test
- When fused to a recombinant protein, GFP is correctly folded if the recombinant protein is also correctly folded
- GFP is fluorescent if correctly folded
- GFP +: correct folding; GFP -: incorrect folding
- The GFP-folding test is used to screen for correct folding of proteins in high-throughput production systems
Plasmid Vectors
- pUC Series: High-copy-number plasmids commonly used for cloning and blue/white screening. E.g., pUC18, pUC19.
- pBR322: One of the earliest plasmid cloning vectors, contains genes for ampicillin and tetracycline resistance.
- pBluescript SK +/-: Used for cloning and blue/white screening, with T7 and T3 promoters for in vitro transcription.
- pGEM-T: Used for cloning PCR products, often used for TA cloning (contains T overhangs).
- pTZ Series: High-copy-number plasmids for blue/white screening, similar to pUC vectors.
- pJET1.2: Blunt-end cloning vector for cloning PCR products.
- pCR-Blunt: Cloning vector for blunt-end PCR products.
- pZErO: Contains a lethal ccdB gene to select against non-recombinant plasmids.
- pET Series: Widely used vectors for high-level expression of recombinant proteins in E. coli using the T7 promoter system. E.g., pET-28a, pET-21a.
- pBAD Series: Arabinose-inducible expression system. E.g., pBAD/Myc-His.
- pGEX Series: Glutathione S-transferase (GST) fusion protein expression vector. E.g., pGEX-4T-1, pGEX-6P-1.
- pMal Series: Maltose-binding protein (MBP) fusion protein vectors. E.g., pMAL-p2X, pMAL-c2X.
- pQE Series: High-level expression vectors with a 6xHis tag for protein purification. E.g., pQE-30, pQE-60.
- pTrc99A: Combines the lac and trp promoters for high-level, inducible expression.
- pLysS and pLysE: Used in conjunction with pET vectors to suppress basal expression of toxic proteins.
- pCOLADuet, pACYCDuet, pRSFDuet: Vectors for co-expression of multiple genes in E. coli.
- pRSET: Expression vector with a T7 promoter and 6xHis tag.
- pBBR1MCS: Broad-host-range vector, useful for expressing genes in various bacterial species.
Expression systems in bacteria
Disadvantages
- Little post-translational modification
- Low glycosylation with atypical sugars
- Sometimes insoluble, misfolded proteins
- Poor periplasmic and external secretion
- Cost of purification
Advantages
- Mass culture in fermentors (up to 2000 L)
- Simple and inexpensive media
- Genetics very well known
- Numerous mutants, improved strains
- Several expression vectors
- Modulable gene expression
Good yield in production (several g / L) 51 Examples of recombinant proteins for therapeutic use produced in E. coli Ecokinase (anticoagulant) tPA tissue plasminogen Galenus Mannheim Rapylisin activator (anticoagulant) Roche Exubera Pfizer Insulin Apidra Sanofi Aventis Humulin Eli Lilly Omnitrop Sandoz Somatotropine hGH Saizen Serono Nutropin Genentech Calcitonin (salmon) Fortical (osteoporosis) Upsher-Smith Lab GM-CSF Leukine Amgen IGF (Insulin-like factor) Increlex (Growth factor) Tercica Keratinocyte growth factor Kepivance (oral mucositis) Amgen Interferon-alpha Viraferon (antiviral) Schering-Plough Interleukine IL-2 Proleukin (Melanoma) Chiron TNF- (tumor-necrosis) Beromun (Cancer surgery) Boehringer Ingelheim 52 P3R: Part 2 Host-vector systems I. Generalities on the construction of an expression vector II. Expression systems in the bacteria III. Expression systems in yeast and fungi IV. Expression systems in mammalian cells 53 Expression system in yeasts Copyright Thierry Noël https://souslemicroscope.com/levures/ These systems are particularly advantageous. Because of their ability to perform post-transcriptional modification. Like deacetylation, phosphorylation and glucofolate. Expression systems in yeasts and filamentous fungi are powerful tools for the production of recombinant proteins, enzymes, and other biotechnological products. These systems are particularly advantageous due to their ability to perform post- translational modifications (such as glycosylation, phosphorylation, and folding), which are often required for the production of functional eukaryotic proteins. 54 Common Yeast Species Used for Protein Expression Saccharomyces cerevisiae 55 Common Yeast Species Used for Protein Expression Saccharomyces cerevisiae 56 Common Yeast Species Used for Protein Expression Saccharomyces cerevisiae Features: This is the most commonly used yeast for research and industrial applications, with a well-understood genetics. 
- Post-translational modifications: performs glycosylation, phosphorylation, and disulfide bond formation.
- Advantages: it is a Generally Recognized As Safe (GRAS) organism, making it suitable for pharmaceutical production.
- Limitations: glycosylation in S. cerevisiae often results in hypermannosylation, which differs from human glycosylation and can affect protein function in therapeutic applications.

Pichia pastoris
- Features: a methylotrophic yeast that uses methanol as a carbon source.
- Advantages: capable of growing to very high cell densities and producing high yields of protein. Glycosylation patterns are simpler than in S. cerevisiae and more similar to mammalian systems.
- Induction: protein expression is often driven by the AOX1 promoter, which is induced by methanol.
- Limitations: requires careful handling of methanol during induction.

The advantage of this yeast is that it grows to a very high cell density, so you can produce a high yield of protein. Its glycosylation is also simpler than in S. cerevisiae.

Other yeasts such as Yarrowia lipolytica and Kluyveromyces lactis are also used, but are less common than S. cerevisiae and P. pastoris.

Partners for expression of a recombinant protein
[Diagram: the gene of interest from the donor is cloned into a vector carrying a marker (MARKER +, GENE OF INTEREST +), and the vector is then transformed into a yeast host cell (marker -, gene of interest -).]

Markers of transformation
* Homologous markers (S. cerevisiae systems)
Gene belonging to the species in which it should be expressed
Ex: metabolic genes Δ URA, HIS, LEU (auxotrophic markers)
* Advantages:
- Full, optimal expression
- Clear phenotype
* Disadvantages:
- The host cell must be auxotrophic (not always compatible with industrial purposes)

Markers of transformation
* Heterologous markers (Pichia pastoris systems)
Genes from a different species (even a different phylum)
Ex: bacterial antibiotic resistance genes (kanamycin, neomycin, hygromycin, bleomycin, phleomycin, Zeocin)
* Advantages:
- Expression in any susceptible genetic background
* Disadvantages:
- Variable expression level (promoter strength, methylation rate)
- High rate of spontaneous resistance mutations

Transformation vectors for yeasts
* Plasmids of prokaryotic (bacterial) origin
Selection marker + Gene of interest + Tags
Bacterial ori:
- Replicative in bacteria
- Integrative in yeast

Transformation vectors for yeasts
* Eukaryotic plasmid
Cryptic 2µ plasmid of S. cerevisiae
Ori used for E. coli / S. cerevisiae shuttle vectors

The 2-micron plasmid is a circular, double-stranded DNA molecule of around 6.3 kilobase pairs. It is considered cryptic because it has no known phenotypic effect on the yeast cell: it is not essential for the cell's structure or for any normal cellular function. This plasmid has a high copy number, between 50 and 100 copies per cell, due to its origin of replication and its amplification mechanism. It also carries two genes, REP1 and REP2, which help to regulate its replication and to partition the plasmid between the daughter cells, so that each daughter cell receives copies of the plasmid during cell division and the plasmid is not lost.

Episomal plasmids do not insert themselves into the host genome,
but replicate independently.

Episomal autoreplicative plasmid [several tens of copies per cell]

Episomal expression vectors in yeast: example in Saccharomyces cerevisiae
- Gene of interest
- 2µm ori: replication in yeast
- pGAL1: strong promoter inducible by galactose
- CYC1: transcription terminator (cytochrome c gene)
- oriE: bacterial ori
- Ampr: bacterial selection marker (Bla)
- LEU2: yeast selection marker (isopropylmalate dehydrogenase)
(After I. Turbica, Univ. Paris 11)

Expression vector for Pichia pastoris
- AOX1 & 2: alcohol oxidases; promoter induced by methanol and repressed by glucose. CH3OH + O2 --> CH2O + H2O2 (converts alcohol to aldehyde)
- CYC1: cytochrome c, isoform 1; electron carrier of the mitochondrial intermembrane space that transfers electrons from ubiquinone-cytochrome c oxidoreductase to cytochrome c oxidase during cellular respiration
- pEM7: bacterial promoter
- pTEF1: fungal promoter
- Zeocin: antibiotic that cleaves DNA

We can use the pPICZα plasmid, an expression vector specially designed for high-level recombinant protein production.

Expression in yeast

Vector | Organism | Promoter | Selection marker | Key feature
pYES2 | S. cerevisiae | GAL1 (galactose-inducible) | URA3 | Inducible expression, propagation in bacteria
pGPD-416 | S. cerevisiae | GPD (constitutive) | URA3, HIS3, LEU2 | Constitutive expression for high production
pESC-URA/LEU/HIS/TRP | S. cerevisiae | GAL1 / GAL10 (bidirectional) | URA3, LEU2, HIS3, TRP1 | Co-expression of two genes
YEp13, YEp24 | S. cerevisiae | 2µ, multiple | LEU2, URA3 | Multi-copy plasmid for overexpression
pTEF1 | S. cerevisiae | TEF1 (constitutive) | URA3 | Continuous protein expression
pPICZ | P. pastoris | AOX1 (methanol-inducible) | ZeoR (Zeocin) | Highly inducible by methanol
pPICZα | P. pastoris | AOX1 | ZeoR | Secretion with α-mating signal
pGAPZ | P. pastoris | GAP (constitutive) | ZeoR | Constitutive expression without induction
pAO815 | P. pastoris | AOX1 | HIS4 | Stable integration at the AOX1 locus
pYES2.1/V5-His-TOPO | S. cerevisiae | GAL1 | URA3 | TOPO-compatible, propagation in E. coli
pRS316 | S. cerevisiae | CEN/ARS | URA3 | Low copy number, stable
pESC-His-Myc | S. cerevisiae | GAL1 / GAL10 | HIS3 | His/Myc tags for purification and detection
pHIL-D2 | P. pastoris | AOX1 | HIS4 | Integrates at HIS4, suitable for secreted proteins

Expression in yeast

Advantages:
- Small eukaryotic genome, genetically engineerable, finely characterized
- Mass culture in fermentors
- No toxins
- Many transformation vectors
- Simple post-translational modifications: glycosylations, carboxylations, acylations
- Good yields (g/L)
- Possible secretion into the medium

Disadvantage:
- N-glycosylation with mannans, possibly immunogenic

Solution:
- "Humanisation" of glycosylation with sialic acid

Some products on the market from P. pastoris
From Pichia.com: https://pichia.com/science-center/commercialized-products/

Expression systems in filamentous fungi
Main host: Aspergillus niger

Aspergillus niger is known for its high secretion capacity: it efficiently releases large amounts of protein into the culture medium, which simplifies purification afterwards. It is also commonly used in the food, pharmaceutical and even biofuel industries, thanks to its classification as a Generally Recognized As Safe (GRAS) organism.

Transformation vectors for filamentous fungi
Selection marker + Gene of interest + Tags
* Integrative plasmids
- Selectable markers: resistance to antibiotics, or auxotrophic markers
- Origins of replication: autonomous replication sequences (ARS)
- Integration by non-homologous recombination in filamentous fungi (90%)

Transformation vectors for filamentous fungi are plasmids designed to introduce foreign DNA into the fungal genome. You have here two kinds of selection markers: antibiotic resistance markers or auxotrophic markers, the same as we saw previously. There are also different options for the origin of replication. Some vectors carry an autonomous replication sequence (ARS)
for episomal maintenance of the vector in the fungus, while others lack this feature, promoting integration into the genome by recombination (mostly non-homologous in filamentous fungi).

In filamentous fungi, efficient protein expression relies on strong transcriptional signals, such as promoters and terminators, for both constitutive and inducible expression. The GPD promoter, for example, is a strong, constitutive promoter.

Transcription signals used for protein expression in filamentous fungi
Transgene: 5' - Promoter GPD (glyceraldehyde-3P-DH) - ATG - Gene of interest - STP - Terminator TRPC (tryptophan synthase) - 3'
- Promoters for gene expression: strong fungal promoters such as the gpdA (glyceraldehyde-3-phosphate dehydrogenase) promoter from Aspergillus or the cbh1 (cellobiohydrolase I) promoter from Trichoderma
- Possibility to add a secretion signal
- Tag in C-terminal position if a signal peptide is present
[no commercial kit for A. niger]

Main vectors used for recombinant protein production in filamentous fungi

Vector | Organism | Promoter | Selection marker | Key feature
pAN52-1 | Aspergillus niger | gpdA (constitutive) | pyrG | Strong, constitutive expression for high-level protein production
pAN56-1 | Aspergillus niger | alcA (inducible by ethanol) | pyrG | Ethanol-inducible promoter, ideal for controlled expression
pAN510 | Aspergillus niger | glaA (inducible by starch) | argB | Secretion of target proteins via glucoamylase signal sequence
pALF1 | Aspergillus nidulans | alcA | argB | Inducible expression, used for regulated production in A. nidulans
pDHt/SK | Trichoderma reesei | cbh1 (cellobiohydrolase 1) | hygR (hygromycin B resistance) | High-level secretion, strong promoter for cellulase-related production
pCAMBIA | Fusarium species | CaMV 35S | bar (herbicide resistance) | Plant-fungal shuttle vector, often used for functional studies
pFGPD-GFP | Aspergillus niger | gpdA | phleomycin resistance | Reporter vector with GFP, ideal for tracking protein localization
pFB6 | Aspergillus oryzae | amyB (amylase) | niaD | Amylase promoter-driven secretion, useful for industrial enzyme production
pTTTi | Trichoderma reesei | cbh2 | pyr4 | Expression and secretion with cellobiohydrolase 2 promoter, common in cellulase studies
pAN7-1 | Aspergillus nidulans | gpdA | hygR (hygromycin B resistance) | Strong constitutive expression, high resistance selection
pBC-hph | Aspergillus fumigatus | hsp70 | hph (hygromycin B resistance) | Heat shock promoter-driven expression, useful for stress studies

A documented example of production of a recombinant protein in Aspergillus nidulans

Protein | Production system | Company | Gene
Chymosin | K. lactis | Gist-Brocades (DSM) | Calf/cow gene
Chymosin | A. niger | Genencor | Calf/cow gene
Chymosin | A. niger * | Chr. Hansen | Calf/cow gene
Chymosin | E. coli | Pfizer | Synthetic gene
* Promoter of glucoamylase

Bovine chymosin
- Genes (paralogs A & B) encode an inactive preprochymosin; acid cleavage releases chymosin (325 aa, 35 kDa), an aspartic protease
- Proteolytic activity on casein (cuts the Phe-Met bond, aa 105-106): caseinoglycopeptide + paracasein, hence milk coagulation
- Use: dairies, cheesemaking
- Young nursing calf: 1 - 2 stomach chambers, 1 liter of rennet (rennin)
- 1 L of rennet coagulates 10,000 L of milk; needs in France = 2 million liters of rennet
- Hence recombinant chymosin

IV. EXPRESSION SYSTEMS IN MAMMALIAN CELLS

Expression systems in mammalian cells
Hamster, Mouse, Human

Why Use Mammalian Expression Systems?
- Complexity of mammalian proteins and need for post-translational modifications (PTMs)
- Advantages over prokaryotic (bacterial) and yeast systems
- Applications: biopharmaceuticals, research, and gene therapy

Commonly Used Mammalian Cell Lines

These cells are capable of complex post-translational modifications, which are essential for producing therapeutic proteins such as monoclonal antibodies. Other cell lines can also be used.

Cell line | Origin | Culture medium | Culture conditions
Animal cell lines
BHK21 | Kidney cells from Syrian hamster | 10% fetal calf serum + glutamine | 37°C, 5% CO2, non-adherent cells
CHO | Ovary cells from Chinese hamster | 10% fetal calf serum + glutamine | 37°C, 5% CO2, adherent cells
NS0 | Murine myeloma | 10% fetal calf serum + glutamine | 37°C, 5% CO2, weakly adherent cells
Human cell lines
HEK293 | Human embryonic kidney | Culture without serum possible | Weakly adherent, non-adherent at room temperature

Transient vs. Stable Expression in Mammalian Systems

Transient Expression
- Quick, high-yield protein production
- Gene is not integrated into the genome
- Useful for short-term studies or research applications
- Lower cost and faster setup

The gene of interest is introduced temporarily into the cells and does not integrate into the genome. This method gives rapid, high-yield protein production over a short period, and it is ideal for research.

Stable Expression
- Gene integrates into the host genome, allowing long-term, consistent expression
- Suitable for large-scale or therapeutic production
- Requires selection and screening processes
- Higher cost and longer time to establish

Then you have stable expression, which involves integration of the gene directly into the host cell's genome. This approach takes more time and more effort.

Constitutive expression in integrative and replicative vectors
→ Commonly used as a constitutive expression vector in mammalian cells, offering options for both transient and stable expression.
→ Constitutive expression: driven by strong promoters (e.g., CMV), allowing continuous protein production
→ Integrative vectors: integrate into the host genome for stable, long-term expression. Example: pcDNA3.1 with antibiotic selection for stable cell line creation
→ Replicative vectors: remain episomal for high-yield, short-term production (transient expression)

First, constitutive expression means that the gene is expressed continuously, driven by a strong promoter. After transfection, the cells that have randomly integrated this vector into their genome are selected by adding neomycin, an antibiotic, to the culture medium. Only the cells that have integrated the vector and express the resistance gene survive, so we select only the cells that were transfected with the vector.

In contrast, replicative vectors replicate independently of the host cell genome and are used for transient expression. They remain episomal and produce a high yield of protein for a limited time, which is what you want if you don't need a stable cell line.

In summary, integrative vectors like this one are used for stable expression, where the gene integrates into the genome, while replicative vectors are best for transient, short-term, high-level production.

- Expression driven by a strong CMV promoter (immediate early)
- Polyadenylation signals from bGH or SV40
- Integrative vector when linearized
- Episomal vector with ori SV40 (in host cells expressing the SV40 T antigen)

Bicistronic expression in animal cells
- Allows expression of two genes from one mRNA transcript
- Use of IRES sequences of viral origin (IRES: Internal Ribosome Entry Site)
- IVS: intervening sequence, a self-excisable intron, which then cleaves the polycistronic mRNA to release the IRES and allow access to ribosomes
- Expression of two proteins, one of which can be the selection marker

Application: pIRES vector
- The pIRES vector enables bicistronic expression using an IRES
- Allows simultaneous expression of two genes, e.g. a protein of interest and a fluorescent marker
- Commonly used for studies requiring co-expression or selection

This means the vector contains two multiple cloning sites, MCS A and MCS B. You choose carefully, in each case, what you clone into the MCS A site and what you clone into the MCS B site. For example, you can put your protein of interest first and then, after the IRES site, a fluorescent marker. You can then be sure that your fluorescent cells also express your protein of interest. It is a nice and very precise way to confirm that your cells express your protein of interest, for when you need very precise control.

Regulated expression: Tet-Off / Tet-On
- Tet-Off: gene expression is turned off in the presence of tetracycline or doxycycline
- Tet-On: gene expression is turned on in the presence of tetracycline or doxycycline
→ Allows precise control of gene expression levels and timing.
- pCMV-tTA (Tet-Off): tTA = tetracycline-controlled transactivator protein (TetR-based)
- pCMV-rtTA (Tet-On): rtTA = reverse tetracycline-controlled transactivator protein (rTetR-based)

When you want to choose when, and how much, protein you express, you can use other systems, called the Tet-On and Tet-Off systems. How do they work? Take the Tet-Off system first: it relies on the tetracycline transactivator protein, tTA. Once tTA binds the operator,
it will activate the promoter and, finally, the expression of your gene of interest. Adding a tetracycline derivative (+ Dox) changes this: in the Tet-Off system, gene expression is turned off in the presence of doxycycline, and in the Tet-On system, gene expression is activated in the presence of doxycycline. This system is highly useful in experiments where we need to control the timing and the level of gene expression, such as gene function studies or therapeutic applications where protein expression needs to be carefully regulated.

- TRE = Tet-responsive element = TetO-TetO-TetO
- 3x VP16 = transcription activation domains from herpes virus

Expression in mammalian cells

Advantages:
- Mass culture possible (fed-batch with adherent cells)
- Many vectors
- Maturation close to that of the protein of interest, in the case of animal/human proteins
- Proteins cytosolic or secreted
- Possible production of large proteins
- Oligomeric assembly

Disadvantages:
- Weak yields (10 mg/L)
- Slow growth
- Expensive culture
- Fragility of the cells

Improvements:
- Use of less demanding cell lines
- Engineering of cell lines expressing anti-apoptotic factors (e.g. Bcl2)

Therapeutic applications
8 main categories of recombinant proteins on the market:
- Coagulation factors
- Thrombolytic and anti-coagulant factors
- Hormones
- Growth factors
- Interferons and interleukins
- Vaccines
- Monoclonal antibodies *
- Miscellaneous
* Not always produced as recombinant proteins themselves, but involve a recombinant protein in the process of immunization

[Figure: pharmacological classification of the 173 biomedicines approved for use in humans in France in 2014]

[Figure: total annual sales (business turnover) of each pharmacological category of biomedicines in France in 2014]

[Figure: increase of the total annual sales (business turnover) of biomedicines in France from 2010 to 2014. Around 5% per year; total 5 billion €]

[Figure: world market of biomedicines in 2010: 135 billion $]

CONCLUSION
Even though mammalian cells produce little, are more fragile, and are more expensive...
➔ Therapeutic recombinant proteins = CHO

Criteria for choosing a host-vector system
Anja Schutz et al., STAR Protocols, 2023, A concise guide to choosing suitable gene expression systems for recombinant protein production
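
The trade-offs listed on the advantages/disadvantages slides of this part can be summarised in a small Python sketch. It is only an illustration, not the decision scheme of the guide cited above: the `HostSystem` class, the `candidate_hosts` function and the single yes/no criterion are hypothetical names and simplifications, and the yields are the orders of magnitude quoted in these notes (g/L for bacteria and yeast, about 10 mg/L for mammalian cells).

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class HostSystem:
    name: str
    yield_g_per_l: float           # order of magnitude quoted in the notes
    human_like_maturation: bool    # simplification of the slides (assumption)


# Illustrative values only, taken from the advantages/disadvantages slides.
HOSTS = [
    HostSystem("E. coli", 1.0, False),    # little PTM, yields in the g/L range
    HostSystem("Yeast", 1.0, False),      # mannan-type N-glycosylation
    HostSystem("Mammalian cells (CHO, HEK293)", 0.01, True),  # ~10 mg/L
]


def candidate_hosts(needs_human_like_maturation: bool) -> List[str]:
    """Return compatible hosts, highest quoted yield first."""
    suitable = [
        h for h in HOSTS
        if h.human_like_maturation or not needs_human_like_maturation
    ]
    # sorted() is stable, so hosts with equal yields keep their listed order.
    ranked = sorted(suitable, key=lambda h: h.yield_g_per_l, reverse=True)
    return [h.name for h in ranked]


if __name__ == "__main__":
    print(candidate_hosts(needs_human_like_maturation=True))
    print(candidate_hosts(needs_human_like_maturation=False))
```

With this simplified criterion, requiring human-like maturation leaves only the mammalian system, which is consistent with the conclusion that therapeutic recombinant proteins are mostly made in CHO cells despite the lower yield. A real choice would also weigh secretion, solubility, cost and glycoengineering options.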