Machine-Learning-Guided Peptide Drug Discovery: Development of GLP-1 Receptor Agonists

This article is licensed under CC-BY-NC-ND 4.0...

Journal of Medicinal Chemistry

Table 1. Alignment of GLP-1, Secretin, Dual-Agonist, and GLP-1R Selective Agonist GUB021794.a
a Origins of substitutions are highlighted in colors. * denotes attachment of half-life extender: C20DA-gGlu-2xOEG.

Figure 1. Overview of the data generation and data analysis workflow of the streaMLine platform. Initially, a systematic library of peptides is designed, typically on the order of hundreds to thousands of peptides. The crude library of peptides is prepared using solid-phase peptide synthesis (SPPS) and cleaved from the resin. Failed peptide samples are identified by high-resolution mass spectrometry and excluded from the analysis. A panel of high-throughput assays is run to determine receptor potency and physicochemical properties at different pH levels. For each assay end point, a random forest model is trained and used for inferring key amino acid substitutions that determine peptide properties.

agonists such as semaglutide have been shown to be important tools in the treatment of diabetes and obesity.8,9 However, GLP-1 is known to self-assemble into amyloid fibrils, and the intrinsic physical instability of GLP-1 poses a significant challenge in synthesis and formulation.10,11 Recent drug development efforts have taken advantage of the varying degrees of sequence homology of GLP-1, glucagon, and glucose-dependent insulinotropic polypeptide (GIP) to engineer unimolecular dual or triple agonists targeting receptors of GLP-1, glucagon, and GIP. This approach has proven to be a successful concept for treatment intervention in diabetes and obesity.12−17 Alternatively, one could envision exploiting sequence homology to obtain beneficial peptide properties not only in terms of receptor pharmacology but also from a synthesis and/or formulation perspective.18 Secretin is a 27-amino acid peptide hormone that together with GLP-1 belongs to the glucagon superfamily of structurally related peptide hormones, all targeting family B G-protein coupled receptors (GPCRs).19 The peptides of this family are linear and comprise 25 or more residues, allowing them to

Figure 2. Schematic representation of the parallelized development process for generating a selective and stable GLP-1R agonist based on the secretin backbone. The two-step process for converting secretin into a preclinical drug candidate. First, a minimal GLP-1R agonist was developed by introducing only necessary GLP-1 residues to provide activation of GLP-1R. Second, a parallelized workflow was initiated where a deep mutational scan, a glutamate scan, and a lipidation scan provided a blueprint for generating various soluble, physically stable, and half-life extended GLP-1R agonists.

Figure 3. Overview of substitution effects from a GLP-1 dial-in scan in the secretin backbone. (A) Effect of introducing GLP-1 residues into the secretin backbone. For each assay end point, a random forest model was trained on 768 peptides and used to compute SHAP values determining the level of contribution of each amino acid substitution. Delta mean SHAP values denote the contribution of substituting the secretin residue with the corresponding GLP-1 residue. (B) Detailed overview of SHAP values for selected positions, where substitutions were introduced to obtain analogs with dual GLP-1R and SCTR potency. Small points denote SHAP values per individual peptide and large points denote mean SHAP value.

accelerated development of novel peptide-based therapeutics by rigorous design and ML-driven analysis of large peptide libraries.

Figure 4. Overview of substitution effects from a deep mutational scan (DMS) in a secretin-derived GLP-1R and SCTR dual agonist. (A) Effect of single mutations in all positions. For GLP-1R and SCTR potency, random forest models were trained on 1152 peptides encoded using z-scales.23 From the models, the effect of single mutations was computed to normalize for assay batch effects. (B) Detailed overview of selected positions highlighting the effect of individual substitutions.

relationship (QSAR) approaches. An overview of the learning cycles in the streaMLine platform is illustrated in Figure 1. In the streaMLine platform, peptides are synthesized using solid-phase peptide synthesis (SPPS) in a plate format. The crude peptide libraries are screened directly in functional potency assays and in preformulation assays for determination of, for example, fibrillation and solubility. Each peptide library is analyzed using high-resolution mass spectrometry

chemical contexts, i.e., different backbones. The peptide sequences together with assay data are used as training data to construct random forest models22 describing the relationship between peptide sequence and assay end point. In the training data, the systematic peptide library is encoded using amino acid descriptors (z-scales23 or one-hot encoding), and potential laboratory batch effects are incorporated in the model to normalize, e.g., synthesis and assay plate differences. For each assay end point, a new model is trained and used for inferring the key amino acid substitutions affecting the end point.
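As an illustration of the encoding step described above, the following minimal sketch builds a one-hot feature vector for a peptide and appends indicator columns for a laboratory batch. The feature layout, the function names, the toy peptide, and the plate labels are assumptions made for this example; the text does not specify how streaMLine lays out its features, and noncanonical residues such as Aib would need their own columns.

```python
# Sketch only: the feature layout below is an assumption, not the authors' pipeline.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 natural residues, one-letter codes


def one_hot_peptide(sequence):
    """Flatten a peptide into a 0/1 vector with 20 slots per sequence position."""
    vector = []
    for residue in sequence:
        if residue not in AMINO_ACIDS:
            raise ValueError(f"unsupported residue: {residue!r}")
        vector.extend(1 if residue == aa else 0 for aa in AMINO_ACIDS)
    return vector


def encode_with_batch(sequence, batch_id, batch_ids):
    """Append one indicator column per batch (e.g., synthesis or assay plate)."""
    if batch_id not in batch_ids:
        raise ValueError(f"unknown batch: {batch_id!r}")
    batch_columns = [1 if batch_id == b else 0 for b in batch_ids]
    return one_hot_peptide(sequence) + batch_columns


# Hypothetical toy input: a 4-residue peptide measured on plate "P2" of three plates.
row = encode_with_batch("HSDG", "P2", ["P1", "P2", "P3"])
print(len(row), sum(row))
```

Rows of this kind, one per peptide, could then be passed together with the measured end point to a random forest regressor. Including the batch columns as ordinary features is one simple way to let the model absorb plate-to-plate differences; the platform may handle batch effects differently.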
Model inference is done either by (1) correcting assay data for batch effects and amino acid similarity, thereby computing normalized assay measurements for individual peptides, or (2) computing SHapley Additive exPlanations (SHAP) values.24 SHAP values are used to explain the effect of each amino acid substitution on the end point and can thus be used to infer the key drivers in the data set. After model inference, the most promising substitutions are assessed in a new peptide library design. Data points obtained on crude peptides are challenging to interpret individually, but in the context of the systematically designed peptide libraries, the random forest model provides accurate guidance for identifying the effect of substitutions. The performances of all models generated in this study are given in Figure S1.

Development of Dual GLP-1R-SCTR Agonists. We applied the streaMLine platform to generate selective GLP-1R agonists with suitable physicochemical parameters starting from the secretin backbone. The native secretin peptide has no GLP-1R potency; hence, to identify desired substitutions, a starting point with some GLP-1R potency was needed. We therefore aimed first at generating a dual GLP-1R-SCTR agonist that could be used as an intermediate for further optimization (Figure 2). A peptide library was designed to evaluate the effect of introducing GLP-1 residues into native secretin. Nonconserved residues (positions 2−3, 9−10, 12−14, and 17−25) were changed into the corresponding GLP-1 residue, one at a time or in combinations (Table 1). The library (768 peptides) was screened by determining GLP-1R and SCTR potency (EC50), fibril formation (ThT assay), and solubility (turbidity), and random forest models were trained to determine the relationship between measured end points and the amino acid sequence of peptides.
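To make the idea behind SHAP-based inference concrete, the sketch below computes exact Shapley values by brute force for a toy model. It is not the SHAP library and not the random forest used in this study: the substitution labels are borrowed from the text, but the effect sizes and the interaction term in `toy_model` are invented for illustration only.

```python
from itertools import permutations


def shapley_values(value_fn, features):
    """Exact Shapley values: mean marginal contribution over all feature orderings."""
    totals = {feature: 0.0 for feature in features}
    orderings = list(permutations(features))
    for order in orderings:
        present = set()
        for feature in order:
            before = value_fn(frozenset(present))
            present.add(feature)
            totals[feature] += value_fn(frozenset(present)) - before
    return {feature: total / len(orderings) for feature, total in totals.items()}


def toy_model(substitutions):
    """Hypothetical potency shift for a set of substitutions (made-up numbers)."""
    shift = 0.0
    if "9D" in substitutions:
        shift += 1.0
    if "18A" in substitutions:
        shift += 0.5
    if "9D" in substitutions and "18A" in substitutions:
        shift += 0.4  # invented interaction term, shared equally by both substitutions
    return shift


phi = shapley_values(toy_model, ["9D", "18A", "3E"])
for name, value in phi.items():
    print(f"{name}: {value:+.2f}")
```

In this toy case the attributions sum to the model output for the full set of substitutions, and a substitution with no effect in the model ("3E" here) receives an attribution of zero. Brute-force enumeration scales factorially with the number of features, so it is only practical for a handful of substitutions.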
From these models, we computed SHAP values to determine the level of contribution of each substitution.22 Substitutions with positive SHAP values increase the end point, while substitutions with negative SHAP values decrease the end point. Amino acid positions 2, 9, 18, and 22 had the highest positive SHAP values for GLP-1R EC50, thus being critical for improving GLP-1R potency. Conversely, positions 3, 9, 10, 14, and 19 exhibited the most negative SHAP values for SCTR EC50, hence being critical for abolishing SCTR potency or enhancing GLP-1R selectivity (Figure 3A). In addition to the potency effects, the propensity for fibril formation and the reduction in solubility were most pronounced when mutating amino acid positions 12, 14, 18, 19, 21, 23, and 25. The introduction of GLP-1 residues at these positions could thus negatively affect the physicochemical properties of a peptide (Figure 3B). Based on these learnings, five substitutions were introduced in the secretin backbone to achieve an agonist with dual activity on GLP-1R and SCTR (Table 1). Mutations 9D, 18A,

Figure 5. Overview of substitution effects from glutamate scan and half-life extender (HLE) scan in a secretin-derived GLP-1R and SCTR dual agonist. (A) Effect of introducing HLEs or glutamate. For each assay end point, a random forest model was trained on 576 peptides and used to compute SHAP values determining the level of contribution of each amino acid substitution. Delta mean SHAP values denote the contribution of substituting the backbone residue with either a glutamate or HLE. (B) Detailed overview of SHAP values for selected positions that tolerate HLE derivatization or glutamate substitution for improving half-life and solubility, respectively. Small points denote SHAP values per individual peptide and large points denote mean SHAP value.

have reported half-lives in the range of 2−4 min.27,28 Conjugation with fatty diacids facilitates strong binding to serum albumin, thus reducing renal clearance and enzymatic degradation.28,29 Fatty acid conjugation was investigated by attachment to the epsilon nitrogen of lysines via linker moieties. The combined fatty acid and linker is here referred to as the half-life extender (HLE).
Six different HLEs were evaluated in each position, representing different lengths of fatty diacids, octadecanedioic acid (C18DA) and eicosanedioic acid (C20DA), and varying combinations of linker moieties, L-γ-glutamyl (gGlu) and 3,8-dioxa-aminooctanoic acid (OEG). A library of 576 peptides was designed where each HLE at each position was examined in backbones comprising 0, 1, or 2 glutamate mutations. All positions were screened, except a few positions in the pharmacophore essential for receptor activation. For the glutamate substitution screening, positions 1−5, 7, and 8 were excluded, with position 15 already being a glutamate. For the HLE screening, positions 1−5, 7−9, 18, and 22 were excluded. The library was screened for GLP-1R and SCTR potency and turbidity. For each end point, we computed SHAP values to determine the contribution of each mutation relative to the backbone residue (Figure 5). Evaluating the effect of glutamate mutations on GLP-1R potency revealed several amino acid positions where glutamate was tolerated, i.e., positions 12, 16, 17, 19, 20, 21, 24, 25, and

mutational scan.

Figure 6. Overview of substitution effects from selected substitutions in a secretin-derived selective GLP-1R agonist. Effect of combining selected substitutions identified in the deep mutational scan, HLE scan, and glutamate scan. For each assay end point, a random forest model was trained on 192 peptides and used to determine SHAP values determining the level of contribution of each amino acid substitution. Mean SHAP values denote the contribution of each substitution relative to the data set mean.

Table 2. Profiling of Optimized Secretin-Derived GLP-1R Agonist

compound     hGLP1R EC50 (nM)   hSCTR EC50 (nM)   selectivity ratio(a)   solubility pH 7.0 and 8.0 (mg/mL)   fibrillation pH 7.0 and 8.0   chemical stability pH 7.0 and 8.0 (% degradation)   rat half-life, i.v. (h)
secretin     2300               0.0023            10^−6                  NA                                  no                            8.6/9                                               NA
GLP-1        0.002              800               400,000                NA                                  no                            3.8/8.1                                             NA
GUB021794    0.018              190               10,556                 >10                                 no                            1.4/0.55                                            22

(a) Selectivity ratio was calculated as hSCTR EC50 divided by hGLP1R EC50.
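The selectivity ratio defined in the Table 2 footnote (hSCTR EC50 divided by hGLP1R EC50) can be recomputed from the tabulated EC50 values, as in the short sketch below. The `pec50` helper uses the conventional definition of pEC50 as the negative base-10 logarithm of EC50 in mol/L; that the study's pEC50 values follow exactly this convention is an assumption, and the dictionary layout and function names are illustrative only.

```python
import math

# EC50 values (nM) as listed in Table 2 of the text.
ec50_nm = {
    "secretin": {"hGLP1R": 2300.0, "hSCTR": 0.0023},
    "GLP-1": {"hGLP1R": 0.002, "hSCTR": 800.0},
    "GUB021794": {"hGLP1R": 0.018, "hSCTR": 190.0},
}


def selectivity_ratio(values):
    """hSCTR EC50 divided by hGLP1R EC50, as defined in the Table 2 footnote."""
    return values["hSCTR"] / values["hGLP1R"]


def pec50(ec50_nanomolar):
    """Conventional pEC50: negative log10 of EC50 in mol/L (1 nM = 1e-9 M)."""
    return -math.log10(ec50_nanomolar * 1e-9)


for name, values in ec50_nm.items():
    ratio = selectivity_ratio(values)
    print(f"{name}: selectivity ratio = {ratio:.3g}, "
          f"GLP-1R pEC50 = {pec50(values['hGLP1R']):.2f}")
```

A ratio well above 1 indicates GLP-1R selectivity and a ratio well below 1 indicates SCTR selectivity, which is why native secretin and GLP-1 sit at opposite extremes of the table.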
Furthermore, this library would enable us to rank the mutations relative to each other and according to desired end points to identify an optimal combination of substitutions providing a selective GLP-1R agonist with

pubs.acs.org/jmc Article

Machine-Learning-Guided Peptide Drug Discovery: Development of GLP-1 Receptor Agonists with Improved Drug Properties

Jens Christian Nielsen, Claudia Hjørringgaard, Mads Mørup Nygaard, Anita Wester, Lisbeth Elster, Trine Porsgaard, Randi Bonke Mikkelsen, Silas Rasmussen, Andreas Nygaard Madsen, Morten Schlein, Niels Vrang, Kristoffer Rigbolt, and Louise S. Dalbøge*

Cite This: https://doi.org/10.1021/acs.jmedchem.4c00417

ABSTRACT: Peptide-based drug discovery has surged with the development of peptide hormone-derived analogs for the treatment of diabetes and obesity. Machine learning (ML)-enabled quantitative structure−activity relationship (QSAR) approaches have shown great promise in small molecule drug discovery but have been less successful in peptide drug discovery due to limited data availability. We have developed a peptide drug discovery platform called streaMLine, enabling rigorous design, synthesis, screening, and ML-driven analysis of large peptide libraries. Using streaMLine, this study systematically explored secretin as a peptide backbone to generate potent, selective, and long-acting GLP-1R agonists with improved physicochemical properties. We synthesized and screened a total of 2688 peptides and applied ML-guided QSAR to identify multiple options for designing stable and potent GLP-1R agonists. One candidate, GUB021794, was profiled in vivo (S.C., 10 nmol/kg QD) and showed potent body weight loss in diet-induced obese mice and a half-life compatible with once-weekly dosing.
INTRODUCTION

Peptide-based therapeutics are gaining increasing attention in the pharmaceutical industry. Peptide hormones have both high receptor potency and selectivity, minimizing off-target effects and generally translating into an excellent drug safety and efficacy profile.1,2 These features make endogenous peptides a good starting point for the development of novel peptide therapeutics. However, native unmodified peptides are rarely used as drugs because of their inherent limitations: a very short systemic half-life and unfavorable physicochemical properties, which must be circumvented to develop peptide molecules suitable for therapeutic use.1,2

Early drug discovery phases aim to improve the properties of candidate molecules by modifying their chemical structure. Such improvements can be achieved by rational design using an iterative and often laborious approach, where small batches of compounds are screened in multiple rounds of optimization. In contrast, when larger data sets are available, it is useful to construct mathematical models that capture the quantitative structure−activity relationship (QSAR) to guide drug design. QSAR models have been widely used in the development of small molecule therapeutics, e.g., to discover novel binders3 and elucidate chemical motifs necessary for binding.4 For peptide drug discovery, however, data are often sparse, and this has limited the use of machine learning (ML) for QSAR optimization. The amount and composition of data are crucial parameters for QSAR methods,5 which is why it is advantageous to generate data that are specifically designed for modeling purposes.

Glucagon-like peptide-1 (GLP-1) is an endogenous 30-amino acid peptide hormone produced by enteroendocrine L-cells and secreted into the hepatic portal in response to food intake. By activating GLP-1 receptors in the pancreas, native GLP-1 serves as an incretin hormone stimulating insulin release and inhibiting glucagon secretion.6 In addition, GLP-1 is an important appetite regulator by activating central GLP-1 receptors (GLP-1R).7 In line with this, long-acting GLP-1R

Received: February 19, 2024
Revised: June 19, 2024
Accepted: June 20, 2024
Published: July 8, 2024

© 2024 The Authors. Published by American Chemical Society. https://doi.org/10.1021/acs.jmedchem.4c00417 J. Med. Chem. XXXX, XXX, XXX−XXX

span two key receptor domains. The C-terminal region of the peptides binds to the extracellular domain of their respective receptor, after which the N-terminal region interacts with the core domain of the receptor, enabling receptor activation.20 The sequence identity of secretin and GLP-1 is shown in Table 1. In contrast to other peptides of the glucagon family, such as GLP-1, secretin is not reported to aggregate.21 Thus, secretin could serve as the starting backbone with improved physicochemical properties compared to GLP-1. The main physiological role of secretin is to regulate water homeostasis and bicarbonate secretion from the exocrine pancreas and inhibit gastric acid secretion by activating the secretin receptor (SCTR),19 i.e., endogenous activities we intended not to activate.
Hence, we aimed to leverage the more favorable physicochemical properties of secretin to develop a selective and physicochemically stable GLP-1R agonist based on the secretin backbone. With this aim, we exploited an innovative ML-based peptide drug discovery platform termed streaMLine and demonstrated how streaMLine effectively facilitates

RESULTS

The streaMLine Platform. The streaMLine platform is a drug development tool where peptide libraries are designed, synthesized, and screened to provide large data sets suitable for machine learning (ML)-enabled quantitative structure−activity

(HRMS) to determine purity.
The average purity per library ranges between 30 and 50%, and peptide samples with less than 10% purity are excluded from further analysis. The peptide libraries are designed in a highly systematic manner, where each substitution is observed multiple times in combination with other substitutions. Typically, peptide libraries consist of hundreds to thousands of peptides, which enables robust evaluation of each substitution in multiple

and 22F increased GLP-1R potency. Likewise, 2A was found to increase the GLP-1R potency. Substituting alanine with 2-aminoisobutyric acid (Aib) is well-known to prevent DPP-4 proteolytic cleavage of the GLP-1 backbone without compromising GLP-1R potency;9,25 hence, 2Aib was introduced. The 3E mutation prevented the isomerization of the native aspartic acid residue in secretin without influencing receptor potency. Next, a comprehensive sequence exploration, i.e., a deep mutational scan, was performed on the dual GLP-1R-SCTR agonist.

Development of Selective GLP-1R Agonists. A deep mutational scan (DMS) was designed such that all natural amino acids (except cysteine and methionine) were introduced in all sequence positions, either as single mutations or as double mutations. The library consisted of 1152 peptides, which were screened for GLP-1R and SCTR potency. Based on these data, we trained random forest models on the relationship between all assay end points and the peptide amino acid sequence. The models were used to normalize for batch effects (synthesis and assay plate), and the resulting pEC50 values for each single mutant are shown in Figure 4. The DMS identified several receptor-selectivity-promoting substitutions. For each position, substitution maps were obtained, allowing us to navigate toward desired properties, including GLP-1R potency and/or improved receptor selectivity.
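The single-mutation part of a deep mutational scan can be enumerated in a few lines, as sketched below. Several things here are assumptions for illustration: the backbone is the 27-residue human secretin sequence taken from general knowledge (the text does not list it), whereas the scan in this study was run on the dual agonist, which also carries noncanonical residues such as Aib; double mutants and the actual library layout are not reproduced.

```python
# Assumption: human secretin sequence from general knowledge; the text does not list it.
SECRETIN = "HSDGTFTSELSRLREGARLQRLLQGLV"
# The 20 natural residues minus cysteine and methionine, as in the scan described above.
ALLOWED = "ADEFGHIKLNPQRSTVWY"


def single_mutants(backbone, alphabet=ALLOWED):
    """Yield (label, sequence) for every single substitution, e.g., ('9D', ...)."""
    for index, native in enumerate(backbone):
        for residue in alphabet:
            if residue != native:
                label = f"{index + 1}{residue}"  # position-first naming, as in the text
                yield label, backbone[:index] + residue + backbone[index + 1:]


library = dict(single_mutants(SECRETIN))
print(len(library))
print(library["9D"])
```

With 18 allowed residues and 27 positions, each position contributes 17 non-native alternatives. Double mutants grow combinatorially, which is one reason a designed subset of combinations, rather than full enumeration, is the practical choice for a library of about a thousand peptides.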
Rather than identifying a single compound with desired properties through an iterative design process, the DMS generated a solution space of possible amino acid substitutions from which peptide candidates could be designed and synthesized. A selection of substitutions is described below.

We identified amino acid positions 9, 12, and 25 where substitutions could significantly improve GLP-1R selectivity by increasing GLP-1R potency and decreasing SCTR potency (Figures 4A, B and S2). At position 12, several substitutions improved GLP-1R selectivity. 12Y most effectively improved GLP-1R potency and reduced SCTR potency, whereas 12E only reduced SCTR potency. Aromatic residues 25H, 25F, 25Y, and 25W improved GLP-1R potency while also reducing SCTR potency, with 25H being the most effective. At position 9, only the native GLP-1 residue 9D improved potency and selectivity.

In addition, amino acid positions 10, 14, and 19 were identified to improve selectivity by decreasing SCTR potency with a negligible effect on GLP-1R potency (Figures 4A, B and S2). 10I and 10V reduced SCTR potency without significantly compromising GLP-1R potency. A similar effect was seen by substituting position 14 with F, Y, or L. Positions 16, 18, and 22 could be substituted to enhance GLP-1R selectivity. For position 16, all mutations, except P, increased GLP-1R potency (Figure S2). For positions 18 and 22, 18A, 18Aib, 18L, 18, 22F, 22W, and 22Y considerably improved GLP-1R potency while only marginally affecting SCTR potency.

Improving Solubility and Conjugation of Fatty Acid. In parallel with the DMS, we systematically investigated the effect and tolerability of glutamate substitution and derivatization with half-life extenders (HLEs) in our dual GLP-1R-SCTR agonist.
Glutamate substitutions can be used to modulate the isoelectric point of a peptide, thereby improving solubility at the desired formulation pH.26 Fatty acid conjugation is a well-described technology broadly applied to extend the half-life of peptides from minutes to hours. Native secretin and GLP-1

27. Positions 12, 16, and 24 were found to have a positive impact on GLP-1R potency compared to the backbone residue. Among these positions, we found the largest reduction in turbidity from introducing 16E and 24E, indicating improved solubility (Figure 5B). For the HLEs, we found no major difference in potency across the different fatty acid and/or linker combinations and therefore analyzed the different HLEs as a single substitution. Evaluating the effect of attaching HLEs on GLP-1R potency revealed several positions where HLEs were tolerated, i.e., positions 10, 12, 14, 16, 17, 20, 21, 24, 25, and 27. For positions 12, 14, and 16, the attachment of HLEs was found to have positive effects on GLP-1R potency compared to the backbone residue, with positions 12 and 14 also inducing GLP-1R selectivity. This selectivity effect of positions 12 and 14 was consistent with the observations from the DMS, where GLP-1R selectivity could also be improved by mutating these positions (Figure 4). Importantly, no effect on turbidity was observed when HLEs were conjugated to positions 12 (Figure 5A) and 14 (Figure 5A, B).
Fine-Tuning Potency, Selectivity, and Physicochemical Properties. Previous sections described our parallel peptide development process. The conjugation of HLEs could dramatically alter the properties of a peptide.26 We therefore set out to investigate if an HLE was compatible with positions and substitutions found to be selectivity-inducing in the deep

Design and Characterization of Final Candidate. Based on all of the substitution options identified, we designed GUB021794. The peptide sequence is shown in Table 1. We aimed for a peptide candidate with minimal (
