Sense Codon Reassignment Enables Viral Resistance and Encoded Polymer Synthesis
Document Details
2021
Wesley E. Robertson, Louise F. H. Funke, Daniel de la Torre, Julius Fredens, Thomas S. Elliott, Martin Spinck, Yonka Christova, Daniele Cervettini, Franz L. Böge, Kim C. Liu, Salvador Buse, Sarah Mas
Summary
This research article describes the reassignment of sense codons in E. coli to confer viral resistance and to enable the encoded incorporation of noncanonical amino acids. Starting from a recoded strain, the authors delete the tRNAs and release factor that decode two serine codons (TCG and TCA) and the amber stop codon (TAG), show that the resulting cells resist a cocktail of phage, and reassign the freed codons for the encoded synthesis of noncanonical heteropolymers and macrocycles.
Full Transcript
RESEARCH ARTICLE

SYNTHETIC BIOLOGY

Sense codon reassignment enables viral resistance and encoded polymer synthesis

Wesley E. Robertson1†, Louise F. H. Funke1†, Daniel de la Torre1†, Julius Fredens1†, Thomas S. Elliott1, Martin Spinck1, Yonka Christova1, Daniele Cervettini1, Franz L. Böge1, Kim C. Liu1, Salvador Buse1, Sarah Maslen1, George P. C. Salmond2, Jason W. Chin1*

1 Medical Research Council Laboratory of Molecular Biology, Cambridge, UK. 2 Department of Biochemistry, University of Cambridge, Cambridge, UK.
*Corresponding author. Email: [email protected]
†These authors contributed equally to this work.

Robertson et al., Science 372, 1057–1062 (2021). 4 June 2021

It is widely hypothesized that removing cellular transfer RNAs (tRNAs), making their cognate codons unreadable, might create a genetic firewall to viral infection and enable sense codon reassignment. However, it has been impossible to test these hypotheses. In this work, following synonymous codon compression and laboratory evolution in Escherichia coli, we deleted the tRNAs and release factor 1, which normally decode two sense codons and a stop codon; the resulting cells could not read the canonical genetic code and were completely resistant to a cocktail of viruses. We reassigned these codons to enable the efficient synthesis of proteins containing three distinct noncanonical amino acids. Notably, we demonstrate the facile reprogramming of our cells for the encoded translation of diverse noncanonical heteropolymers and macrocycles.

Nature uses 64 triplet codons to encode the synthesis of proteins composed of the 20 canonical amino acids, and most amino acids are encoded by more than one synonymous codon (1). It is widely hypothesized that removing sense codons and the tRNAs that read them from the genome may enable the creation of cells with several properties not found in natural biology, including new modes of viral resistance (2) and the ability to encode the biosynthesis of noncanonical heteropolymers (3–6). However, these hypotheses have not been experimentally tested. Removing release factor 1 (RF1) (and therefore the ability to efficiently terminate translation on the TAG stop codon) from Escherichia coli provides some resistance to a limited subset of phage (7, 8). However, this resistance is not general, and phage are often propagated in the absence of RF1 (8), because the TAG stop codon is rarely used for the termination of translation (9), and, even when viral genes do terminate in an amber codon, the inability to read a stop codon does not limit the synthesis of full-length viral proteins. In contrast, sense codons are commonly at least 10 times more abundant than amber codons in viral genomes and occur over the length of viral genes; thus, we predicted that a cell that does not read sense codons would not make full-length viral proteins and would therefore be completely resistant to viruses.

Current strategies for encoding new monomers in cells are limited to encoding a single type of monomer (commonly in response to the amber stop codon) (3, 10, 11), directing the inefficient incorporation of monomers or potentially incompatible with encoding sequential monomers (12–17); these limitations preclude the synthesis of noncanonical heteropolymer sequences composed entirely of noncanonical monomers. We hypothesized that reassigning sense codons to noncanonical monomers may enable the efficient and sequential polymerization of distinct noncanonical monomers to produce noncanonical heteropolymers.

Recently, a strain of E. coli, Syn61, was created with a synthetic recoded genome in which all annotated occurrences of two sense codons (serine codons TCG and TCA) and a stop codon (TAG) were replaced with synonymous codons (18). In this study, we evolved Syn61 and deleted the tRNAs and release factor that decode TCG, TCA, and TAG codons. We show that the resulting strain provides complete resistance to a cocktail of viruses. Moreover, we demonstrate the encoded incorporation of noncanonical amino acids (ncAAs) in response to all three codons and the encoded, programmable cellular synthesis of entirely noncanonical heteropolymers and macrocycles.

Fig. 1. Strain evolution and creation of Syn61Δ3. (A) Schematic of strain evolution. Black lines connect the codons that encode serine and protein termination to the anticodons of the tRNAs or release factors predicted to decode them. The genes encoding the corresponding tRNAs and release factors are indicated in the black boxes. Cells with the decoding rules of Syn61 are denoted with a pink box throughout. Two rounds of parallel mutagenesis and dynamic selection created Syn61(ev2). serT, serU, and prfA were then deleted to create Syn61Δ3. Finally, three rounds of parallel mutagenesis and dynamic selection were applied to create Syn61Δ3(ev5). Syn61Δ3 and Syn61Δ3(ev5) are represented by the light-teal box throughout. (B) Growth rates of Syn61 and all intermediate strains in the development of Syn61Δ3(ev5). Growth rates were calculated on the basis of growth curves measured for n = 8 replicate cultures for each strain. ou, optical units. For statistics, see methods in the supplementary materials.
[Panel (A) labels: serine codons TCG, TCA, TCT, TCC, AGT, AGC; tRNA anticodons CGA (serU), UGA (serT), GGA (serW,X), GCU (serV); stop codons TGA, TAA, TAG; release factors RF2 (prfB) and RF1 (prfA). Panel (B) axis: growth rate in ou/min.]

Creating Syn61Δ3

We predicted that replacing the annotated TCA, TCG, and TAG codons in the genome would enable deletion of serT and serU (encoding tRNASerUGA and tRNASerCGA, respectively) and prfA (encoding RF1), which decode these codons, in a single strain (Fig. 1A). We previously showed that serT, serU, and prfA could be deleted in separate strains derived from Syn61 (18); however, this does not capture the potential epistasis between these genes. We sought to determine whether serT, serU, and prfA could be deleted in a single strain derived from Syn61.

Syn61 grows 1.6 times slower than the strain from which it was derived (18). To increase the growth rate of the strain before serT, serU, and prfA deletion, we applied a previously described random parallel mutagenesis and automated dynamic parallel selection strategy (19); this approach uses feedback control to dynamically dilute mutated cultures on the basis of growth rate and thereby selects fast-growing strains from within mutated populations (fig. S1A). Through two consecutive rounds of mutagenesis and selection, we created a strain, Syn61(ev2), which grew 1.3-fold faster (Fig. 1B; fig. S1, B to E; and data S1 and S2).

Next, we removed serU, serT, and prfA from Syn61(ev2) to create Syn61Δ3 (Fig. 1A, fig. S1C, and data S1 and S2). This demonstrated that removing the target codons in Syn61 was sufficient to enable the deletion of all decoders of the target codons in the same strain. However, Syn61Δ3 grew 1.7 times slower than Syn61(ev2) (Fig. 1B). This growth decrease may result from the presence of target codons in the genome of Syn61 that were not annotated and targeted (20, 21), and it may also result from the other noncanonical roles that tRNAs may play (22, 23).

Fig. 2. Lytic phage propagation and cell lysis are obstructed in Syn61Δ3. (A) Schematic of viral infection of Syn61Δ3. Deletion of serU (encoding tRNASerCGA), serT (encoding tRNASerUGA), and prfA (encoding RF1) makes the UCG, UCA, and UAG codons unreadable, and the ribosome will stall at these codons within an mRNA that contains them, as shown here for a viral mRNA. (B) Schematic of the number of TCG, TCA, and TAG codons and their positions in the genome of T6 phage. (C) Cultures were infected with T6 phage at a multiplicity of infection (MOI) of 5 × 10^−2, and the total titer (intracellular phage plus free phage) was monitored over 4 hours. PFU, plaque-forming units. Treatment with gentamicin was used to ablate protein synthesis, providing a control for cells that cannot synthesize viral proteins or produce new viral particles. (D) T6 efficiently lyses Syn61 variants but not Syn61Δ3. Cultures were infected as in panel (C), and OD600 was measured after 4 hours. (E) Number of the indicated codons per kilobase in each indicated phage. (F and G) Syn61Δ3 survives simultaneous infection of multiple phage. (F) Photos of the culture at the indicated time points after infection (+) or in the absence of infection (−). Cultures were infected with phage λ, P1, T4, T6, and T7, each with an MOI of 1 × 10^−2. (G) OD600 of the cultures was measured after 4 hours. All experiments were performed in three independent replicates; the dots represent the independent replicates, and the line (C) or bar [(D) and (G)] represents the mean. The photo (F) is a representative of data from three independent replicates.
[Panel (B) labels for T6: TCG 189, TCA 979, TAG 17; genome scale 0 to 160 kb.]

Fig. 3. Reassigning two sense codons and a stop codon to noncanonical amino acid in Syn61Δ3. (A) Schematic of each codon reassignment. Introduction of an orthogonal aaRS/tRNAYYY pair (where YYY is the sequence of the anticodon of the orthogonal tRNA, encoded by O-tRNA) to Syn61Δ3 (light-teal box, as described in Fig. 1A) enables decoding of the cognate codon (XXX) introduced into a gene of interest. The orthogonal pair directs the incorporation of a noncanonical amino acid (ncAA) in response to the XXX codon. These codon reassignments are indicated in the dark gray box. (B) TCG, TCA, and TAG codons are not read by the translational machinery in Syn61Δ3, and codon reassignment enables ncAA incorporation into Ub11XXX. Plasmids encoding the orthogonal MmPylRS/MmtRNAPylYYY pair and a C-terminally His6-tagged ubiquitin, with a single TCG, TCA, or TAG codon at position 11 (Ub11XXX), or no target codons (wild type, wt) were introduced into Syn61Δ3. "XXX" denotes a target codon, and "YYY" denotes a cognate anticodon. Expression of ubiquitin-His6 was performed in the absence (−) or presence (+) of a ncAA substrate for MmPylRS, BocK. Full-length ubiquitin-His6 was detected in cell lysate from an equal number of cells with an anti-His6 antibody. (C) Production of ubiquitin-His6 incorporating BocK, Ub-(11BocK)-His6, from a Ub11XXX gene bearing the indicated target codon was confirmed by ESI-MS. MW, molecular weight. Theoretical mass: 9487.7 Da; measured mass: 9487.8 Da (TCG), 9487.8 Da (TCA), and 9488.0 Da (TAG). The smaller peak of −100 Da results from the loss of tert-butoxycarbonyl from BocK. (D) As in (B), but using Ub11XXX,65XXX, which contains target codons at positions 11 and 65 of the Ub gene. (E) Production of ubiquitin-His6 incorporating BocK at positions 11 and 65, from a Ub11XXX65XXX gene bearing the indicated target codons was confirmed by ESI-MS. Theoretical mass: 9629.0 Da; measured mass: 9629.2 Da (TCG), 9629.0 Da (TCA), and 9629.0 Da (TAG). The smaller peak of −100 Da corresponds to loss of tert-butoxycarbonyl from BocK. (F) As in (B), but using Ub11XXX,14XXX,65XXX, which contains target codons at positions 11, 14, and 65 of the Ub gene. (G) Production of ubiquitin-His6 incorporating BocK at positions 11, 14, and 65, from Ub11XXX,14XXX,65XXX bearing the indicated target codons was confirmed by ESI-MS. Theoretical mass: 9756.0 Da; measured mass: 9756.2 Da (TCG), 9756.0 Da (TCA), and 9756.0 Da (TAG). The smaller peaks of −100 or −200 Da correspond to loss of tert-butoxycarbonyl from one or two BocK residues, respectively. (H) As in (B), but using Ub9XXX,11XXX,14XXX,65XXX, which contains target codons at positions 9, 11, 14, and 65 of the Ub gene. (I) Production of ubiquitin-His6 incorporating BocK at positions 9, 11, 14, and 65, from Ub9XXX,11XXX,14XXX,65XXX bearing the indicated target codons was confirmed by ESI-MS. Theoretical mass: 9883.3 Da; measured mass: 9883.2 Da (TCG), 9883.2 Da (TCA), and 9883.2 Da (TAG). The smaller peaks of −100 or −200 Da correspond to loss of tert-butoxycarbonyl from one or two BocK residues, respectively. All experiments were performed in biological replicates three times with similar results.

We performed three sequential rounds of random parallel mutagenesis and automated dynamic parallel selection to evolve Syn61Δ3 to Syn61Δ3(ev5), which grew 1.6-fold faster than Syn61Δ3 (Fig. 1, A and B; fig. S1, B, C, and F to H; and data S1). When grown in lysogeny broth (LB) media in shake flasks, the doubling time of Syn61Δ3(ev5) was 38.72 ± 1.02 min (fig. S1I). Syn61Δ3(ev5) contains 482 additional mutations with respect to Syn61 (420 substitutions and 62 indels), of which 72 are in intergenic regions (data S1 and S3 and fig. S2). No target codons were reverted, further demonstrating the stability of our recoding scheme. Sixteen sense codons in nonessential genes were converted
to target codons (5×TCG, 3×TCA, 8×TAG); these frequencies are comparable to those observed for other codons (data S1). Subsequent experiments used Syn61Δ3 or, once available, its evolved derivatives to investigate the new properties of these strains.

tRNA deletion ablates virus production in Syn61Δ3

We investigated the effects of deleting the genes encoding tRNASerCGA, tRNASerUGA, and RF1 on phage propagation by Syn61Δ3 (Fig. 2A) in a modified one-step growth experiment (24). For Syn61(ev2), the total titer of phage T6 [a representative of the lytic, T-even family (Fig. 2B)] briefly dropped (as phage infected cells) before rising to two orders of magnitude above the input titer, as infected cells produced new phage particles (Fig. 2C and fig. S3A). As expected, the optical density at 600-nm wavelength (OD600) of Syn61(ev2) was decreased by infection with T6 phage, which is lytic (Fig. 2D). Syn61ΔRF1 (data S1) and Syn61(ev2) produced a comparable amount of phage on a comparable time scale and showed similar changes in OD600 upon infection. We conclude that deletion of RF1 alone has little, if any, effect on T6 phage production or cell lysis.

Infection of Syn61Δ3 with T6 phage led to a steady decrease in total phage titer. Notably, this decrease was comparable to that observed when protein synthesis, and therefore phage production in cells, was completely inhibited by addition of gentamicin (Fig. 2C and fig. S3B). Moreover, T6 infection had a minimal effect on the growth of Syn61Δ3 (Fig. 2D). We conclude that Syn61Δ3 does not produce new phage particles upon infection with T6 phage and that T6 phage does not lyse these cells. Similar results were obtained with T7 phage, which has 57 TCG codons, 114 TCA codons, and 6 TAG codons in its 40-kb genome (fig. S3, A, C, and D). We treated cells with a cocktail of phage containing lambda, P1vir, T4, T6, and T7, which have TCA or TCG sense codons that are 10 to 58 times more abundant than the amber stop codon in their genomes (Fig. 2E and fig. S3E), and found that the treatment with this phage cocktail led to lysis of Syn61(ev2) and Syn61ΔRF1 but had little effect on the growth of Syn61Δ3 (Fig. 2, F and G), suggesting that the deletion of tRNAs in Syn61Δ3 provides resistance to a broad range of phage.

Reassigning target codons for ncAA incorporation

We expressed Ub11XXX genes (ubiquitin-His6 genes bearing TCG, TCA, or TAG at position 11) and genes encoding the cognate orthogonal MmPylRS/MmtRNAPylYYY pair (25) (in which the anticodon is complementary to the codon at position 11 in the Ub gene) in Syn61Δ3(ev5) (Fig. 3A and data S2). In the absence of added ncAA, little to no ubiquitin was detected from Ub genes bearing a target codon at position 11, while control experiments demonstrated that ubiquitin is produced from a "wild-type" gene that does not contain any target codons (Fig. 3B). Thus, none of the target codons are read by the endogenous translational machinery in Syn61Δ3. This further demonstrates that all of the target codons are orthogonal in this strain.

Upon addition of a ncAA substrate for the MmPylRS/MmtRNAPyl pair [Nε-(tert-butoxycarbonyl)-L-lysine (BocK)] (25), ubiquitin was produced at levels comparable to wild-type controls (Fig. 3B and data S4). Electrospray ionization mass spectrometry (ESI-MS) and tandem mass spectrometry demonstrated the genetically directed incorporation of BocK at position 11 of Ub in response to each target codon using the complementary MmPylRS/MmtRNAPylYYY pair (Fig. 3C and fig. S4A). Additional experiments demonstrated efficient incorporation of ncAAs in response to sense and stop codons in glutathione S-transferase–maltose binding protein fusions (fig. S5 and data S4). We demonstrated good yields of Ub-His6 incorporating two, three, or four ncAAs into a single polypeptide in response to each of the target codons (data S4; Fig. 3, D to I; and fig. S4, B to G), and we further demonstrated the incorporation of nine ncAAs in response to nine TCG codons in a single repeat protein (fig. S6). Together, these results demonstrate that the sense codons TCG and TCA and the stop codon TAG can be efficiently reassigned to ncAAs in Syn61Δ3 derivatives.

Encoding distinct ncAAs in response to distinct target codons

Next, we assigned TCG, TCA, and TAG codons to distinct ncAAs in Syn61Δ3(ev4) using engineered mutually orthogonal aminoacyl-tRNA synthetase (aaRS)/tRNA pairs that recognize distinct ncAAs and decode distinct codons (Fig. 4A and fig. S7). We incorporated two distinct ncAAs into ubiquitin in response to TCG and TAG codons (Fig. 4B; fig. S8, A and B; and data S4) and demonstrated the incorporation of two distinct ncAAs at four sites in ubiquitin, with each ncAA incorporated at two different sites in the protein (Fig. 4, B and C; fig. S8, C to E; and data S4). We incorporated three distinct ncAAs into ubiquitin, in response to TCG, TCA, and TAG codons (Fig. 4, D and E; fig. S8F; and data S4). We demonstrated the generality of our approach by synthesizing seven distinct versions of ubiquitin, each of which incorporated three distinct ncAAs (figs. S9 and S10 and data S4).

Fig. 4. Double and triple incorporation of distinct noncanonical amino acids into TCG, TCA, and TAG codons in Syn61Δ3 cells. (A) Reassignment of TCG (blue box), TCA (gold box), and TAG (green box) codons to distinct ncAAs in Syn61Δ3. Reassigning all three codons to distinct ncAAs in a single cell requires three engineered triply orthogonal aaRS/tRNA pairs. Each pair must recognize a distinct ncAA and decode a distinct codon. The tRNAs from these triply orthogonal pairs are labeled O-tRNA1-3. (B) The incorporation of two distinct noncanonical amino acids in response to TCG and TAG codons in a single gene. Syn61Δ3(ev4) cells containing the 1R26PylRS(CbzK)/AlvtRNAΔNPyl(8)CGA pair (16) and the AfTyrRS(p-I-Phe)/AftRNATyr(A01)CUA pair (29), which direct the incorporation of CbzK into TCG and p-I-Phe into TAG, respectively, were provided with CbzK and p-I-Phe. Cells also contained Ub11TCG,65TAG (TCG/TAG), Ub9TCG,11TCG,14TAG,65TAG (2×TCG/2×TAG), or wt Ub, which contains no target codons. Expression of ubiquitin-His6 was performed in the absence (−) or presence (+) of the ncAAs. Full-length ubiquitin-His6 was detected in cell lysate from an equal number of cells with an anti-His6 antibody. (C) ESI-MS analyses of purified Ub-(11CbzK, 65p-I-Phe) (black trace) and Ub-(11CbzK, 14CbzK, 57p-I-Phe, 65p-I-Phe) (gray trace), expressed in the presence of CbzK and p-I-Phe, as described in (E) and purified by nickel–nitrilotriacetic acid chromatography. These data confirm the quantitative incorporation of CbzK and p-I-Phe in response to TCG and TAG codons, respectively. Ub-(11CbzK, 65p-I-Phe), theoretical mass: 9707.81 Da; measured mass: 9707.40 Da. Ub-(11CbzK, 14CbzK, 57p-I-Phe, 65p-I-Phe), theoretical mass: 10,055.00 Da; measured mass: 10,054.60 Da. (D) The incorporation of three distinct noncanonical amino acids into TCG, TCA, and TAG codons in a single gene. Syn61Δ3(ev4) cells containing the 1R26PylRS(CbzK)/AlvtRNAΔNPyl(8)CGA pair, the MmPylRS/MmtRNAPylUGA pair, and the AfTyrRS(p-I-Phe)/AftRNATyr(A01)CUA pair were provided with CbzK, BocK, and p-I-Phe. Cells also contained Ub9TAG,11TCG,14TCA (TCG/TCA/TAG). Expression of this gene was performed in the absence (−) or presence (+) of the ncAAs. Full-length Ub-(9p-I-Phe, 11CbzK, 14BocK)-His6 was detected in cell lysate from an equal number of cells with an anti-His6 antibody. (E) ESI-MS of purified Ub-(9p-I-Phe, 11CbzK, 14BocK), theoretical mass: 9820.97 Da; measured mass: 9820.80 Da. Western blot experiments [(B) and (D)] were performed in five biological replicates with similar results. The ESI-MS data [(C) and (E)] were collected once.

Encoded noncanonical polymers and macrocycles

For a linear polymer composed of two distinct monomers (A and B), there are four elementary polymerization steps (A+B→AB, B+A→BA, A+A→AA, B+B→BB) from which any sequence can be composed (Fig. 5A). For ribosome-mediated polymerization, these four elementary steps correspond to each monomer acting as an aminoacyl-site (A-site) or peptidyl-site (P-site) substr[...] copy of the same type of monomer or with a different type of monomer (Fig. 5A). We encoded each elementary step by inserting TCG-TCG (encoding AA; we arbitrarily assign monomer A to the TCG codon in this nomenclature), TAG-TAG (encoding BB; we assign monomer B to the TAG codon), TCG-TAG (encoding AB), and TAG-TCG (encoding BA) at codon 3 of a superfolder green fluorescent protein (sfGFP) gene. We demonstrated the elementary steps for three pairs of monomers: A = BocK, B = (S)-2-amino-3-(4-iodophenyl)propanoic acid (p-I-Phe); A = Nε-(carbobenzyloxy)-L-lysine (CbzK), B = p-I-Phe; and A = Nε-allyloxycarbonyl-L-lysine (AllocK), B = CbzK (Fig. 5B and fig. S11). We genetically encoded six entirely nonnatural tetrameric sequences and a hexameric sequence for each pair of monomers, as well as an octameric sequence for the AllocK/CbzK pair (22 synthetic polymer sequences in total) (figs. S11 and S12 and Fig. 5, C to E). All encoded polymerizations were ncAA-dependent (figs. S11 and S12B and Fig. 5, C to E), and ESI-MS confirmed that we had synthesized the noncanonical hexamers and octamers as sfGFP fusions (Fig. 5F and fig. S12C). We encoded tetramer and hexamer sequences composed of AllocK and CbzK between SUMO (small ubiquitin-like modifier) and GyrA-CBD (DNA gyrase subunit A intein-chitin-binding domain) and purified the free polymers (Fig. 5, G to I; fig. S13; and data S4). Finally, we encoded the synthesis of a non-natural macrocycle reminiscent of the products of nonribosomal peptide synthetases (Fig. 5, G and J).

Fig. 5. Programmable, encoded synthesis of noncanonical heteropolymers and macrocycles. (A) Elementary steps in the ribosomal polymerization of two distinct ncAA monomers [labeled A (dark blue) and B (green)]. All linear heteropolymer sequences composed of A and B can be encoded from these four elementary steps. (B) Encoding heteropolymer sequences (noncanonical monomers are shown as stars). The sequence of monomers in the heteropolymer is programmed by the sequence of codons written by the user. The identity of monomers (A and B) is defined by the aaRS/tRNA pairs added to the cell. Cells can be reprogrammed to encode different heteropolymer sequences from a single DNA sequence. Sequences were encoded as insertions at position 3 of sfGFP-His6. Reassignment scheme 1 (r.s.1) uses the MmPylRS/MmtRNAPylCGA pair to assign AllocK as monomer A and the 1R26PylRS(CbzK)/AlvtRNAΔNPyl(8)CUA pair to assign CbzK as monomer B (fig. S7, D and E). r.s.2 uses the MmPylRS/MmtRNAPylCGA pair to assign BocK as monomer A and an AfTyrRS(p-I-Phe)/AftRNATyr(A01)CUA pair to assign p-I-Phe as monomer B. r.s.3 uses the 1R26PylRS(CbzK)/AlvtRNAΔNPyl(8)CGA pair to assign CbzK as monomer A and the AfTyrRS(p-I-Phe)/AftRNATyr(A01)CUA pair to assign p-I-Phe as monomer B. (C to E) Polymerization of the encoded sequence composed of the indicated ncAAs and the resulting sfGFP-His6 expression in Syn61Δ3(ev5) were dependent on the addition of both ncAAs to the medium. a.u., arbitrary units. (F) ESI-MS of purified sfGFP-His6 variants containing the indicated ncAA hexamers. BocK/p-I-Phe (expected mass after loss of N-terminal methionine: 29,172.07 Da; observed: 29,171.8 Da), CbzK/p-I-Phe (expected mass after loss of N-terminal methionine: 29,274.13 Da; observed: 29,274.0 Da), and AllocK/CbzK (expected mass after loss of N-terminal methionine: 29,091.64 Da; observed: 29,092.2 Da). The ESI-MS data was collected once. (G) Encoded synthesis of free noncanonical polymers. DNA sequences encoding a tetramer and a hexamer were inserted between SUMO and a GyrA intein coupled to a CBD, in Syn61Δ3(ev5) cells containing the same pairs as in r.s.1 (B). Expression of the constructs, followed by ubiquitin-like-specific protease 1 (Ulp1) cleavage and GyrA transthioesterification cleavage, results in the isolation of free noncanonical tetramer and hexamer polymers. Adding an additional cysteine immediately upstream of the polymer sequence results in self-cleavage and release of a macrocyclic noncanonical polymer. (H to J) Chemical structures and ESI-MS spectra of the purified linear and cyclic AllocK/CbzK heteropolymers. The raw ESI-MS spectra show the relative intensity and observed mass/charge ratios for the different noncanonical peptides. The observed masses corresponding to the expected [M + H]+ or [M + 2H]2+ ions are highlighted in bold. Other adducts and fragment ions are labeled relative to these.
[Panel (B) labels: ATG GCT TCG TAG TCG TAG TCG TAG sfGFP; Met Ala A B A B A B; r.s.1 to r.s.3.]

Discussion

We have synthetically uncoupled our strain from the ability to read the canonical code, and this advance provides a potential basis for bioproduction without the catastrophic risks associated with viral contamination and lysis (26, 27). We note that the synthetic codon compression and codon reassignment strategy we have implemented is analogous to models proposed for codon capture in the course of natural evolution (28).

Future work will expand the principles we have exemplified herein to further compress and reassign the genetic code. We anticipate that, in combination with ongoing advances in engineering the translational machinery of cells (4), this work will enable the programmable and encoded cellular synthesis of an expanded set of noncanonical heteropolymers with emergent, and potentially useful, properties.

REFERENCES AND NOTES

1. F. H. C. Crick, L. Barnett, S. Brenner, R. J. Watts-Tobin, Nature 192, 1227–1232 (1961).
2. P. Marliere, Syst. Synth. Biol. 3, 77–84 (2009).
3. J. W. Chin, Nature 550, 53–60 (2017).
5. T. Passioura, H. Suga, Trends Biochem. Sci. 39, 400–408 (2014).
6. A. C. Forster et al., Proc. Natl. Acad. Sci. U.S.A. 100, 6353–6357 (2003).
7. M. J. Lajoie et al., Science 342, 357–360 (2013).
8. N. J. Ma, F. J. Isaacs, Cell Syst. 3, 199–207 (2016).
9. G. Korkmaz, M. Holm, T. Wiens, S. Sanyal, J. Biol. Chem. 289, 30334–30342 (2014).
10. D. D. Young, P. G. Schultz, ACS Chem. Biol. 13, 854–870 (2018).
11. C. C. Liu, P. G. Schultz, Annu. Rev. Biochem. 79, 413–444 (2010).
12. Y. Zhang et al., Nature 551, 644–647 (2017).
13. E. C. Fischer et al., Nat. Chem. Biol. 16, 570–576 (2020).
14. H. Neumann, K. Wang, L. Davis, M. Garcia-Alai, J. W. Chin, Nature 464, 441–444 (2010).
15. K. Wang et al., Nat. Chem. 6, 393–403 (2014).
16. D. L. Dunkelmann, J. C. W. Willis, A. T. Beattie, J. W. Chin, Nat. Chem. 12, 535–544 (2020).
17. J. S. Italia et al., J. Am. Chem. Soc. 141, 6204–6212 (2019).
18. J. Fredens et al., Nature 569, 514–518 (2019).
19. W. H. Schmied et al., Nature 564, 444–448 (2018).
20. M. R. Hemm et al., J. Bacteriol. 192, 46–58 (2010).
21. S. Meydan et al., Mol. Cell 74, 481–493.e6 (2019).
22. A. Katz, S. Elgamal, A. Rajkovic, M. Ibba, Mol. Microbiol. 101, 545–558 (2016).
23. Z. Su, B. Wilson, P. Kumar, A. Dutta, Annu. Rev. Genet. 54, 47–69 (2020).
24. L. You, P. F. Suthers, J. Yin, J. Bacteriol. 184, 1888–1894 (2002).
25. T. Yanagisawa et al., Chem. Biol. 15, 1187–1197 (2008).
26. V. Bethencourt, Nat. Biotechnol. 27, 681 (2009).
27. J. A. Zahn, M. C. Halter, in Bacteriophages: Perspectives and Future, R. Savva, Ed. (IntechOpen, 2018).
28. S. Osawa, T. H. Jukes, J. Mol. Evol. 28, 271–278 (1989).
29. D. Cervettini et al., Nat. Biotechnol. 38, 989–999 (2020).
30. D. Cervettini, K. C. Liu, J. W. Chin, Scripts for Sense Codon Reassignment Enables Viral Resistance and Encoded Polymer Synthesis, Version 1.0, Zenodo (2021); https://doi.org/10.5281/zenodo.4666529.

ACKNOWLEDGMENTS

We thank Z. Zeng and R. Monson (Department of Biochemistry, University of Cambridge) for helping with phage assays. Funding: This work was supported by the Medical Research Council (MRC), UK (MC_U105181009, MC_UP_A024_1008, and Development Gap Fund Award P2019-0003) and an ERC Advanced Grant SGCR, all to J.W.C. Author contributions: L.F.H.F. and K.C.L. performed strain evolution experiments. L.F.H.F., W.E.R., and S.B. performed experiments to knock out serT, serU, and prfA. L.F.H.F. analyzed genome sequences. J.F. performed phage experiments, with advice and supervision from G.P.C.S. W.E.R., D.d.l.T., T.S.E., Y.C., D.C., F.L.B., M.S., and S.M. performed experiments and analysis to demonstrate codon reassignment and ncAA incorporation in response to target codons. D.C. wrote scripts to analyze codon usage in bacteriophage genomes. J.W.C. supervised the project and wrote the manuscript, together with the other authors. Competing interests: The authors declare no competing interests. Data and materials availability: The GenBank accession numbers for all the strains and plasmids described in the text are provided in data S1 and S2, and the authors agree to provide any data or materials and strains used in this study upon request. Scripts for analyzing codon usage, next-generation sequencing sample preparation, and automated strain evolution are available in Zenodo (30).

SUPPLEMENTARY MATERIALS

science.sciencemag.org/content/372/6546/1057/suppl/DC1
Materials and Methods
Figs. S1 to S13
References (31–41)
MDAR Reproducibility Checklist
Data S1 to S4

View/request a protocol for this paper from Bio-protocol.

23 December 2020; accepted 8 April 2021