Learning Relationships Between Chemical and Physical Stability for Peptide Drug Development

  • Original Research Article
Pharmaceutical Research

Abstract

Purpose or Objective

Chemical and physical stability are two key attributes considered in pharmaceutical development. Chemical stability is typically reported as a combination of potency and degradation products, while physical stability is commonly measured with the fluorescent reporter Thioflavin-T. Stability studies are lengthy and resource-intensive. To reduce the resources and time they require during drug product development, we introduce a machine learning-based model that predicts chemical stability over time from both formulation conditions and aggregation curves.

Methods

In this work, we modeled the relationships between the formulation, the stability timepoint, and the chemical stability measurements, and evaluated model performance on a randomly selected test set. We developed a multilayer perceptron (MLP) for total degradation prediction and a random forest (RF) model for potency prediction.

Results

A coefficient of determination (R²) of 0.945 and a mean absolute error (MAE) of 0.421 were achieved on the test set when using the MLP to predict total degradation (TD). Similarly, we achieved an R² of 0.908 and an MAE of 1.435 when predicting potency with the RF model. When physical stability measurements are included in the MLP model, the MAE for TD prediction decreases to 0.148. With the same strategy for potency prediction, the MAE of the RF model decreases to 0.705.
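For readers unfamiliar with the two metrics reported above, the following minimal sketch shows one common way to compute them. It assumes the standard definitions (MAE as the mean of absolute errors, R² as one minus the ratio of residual to total sum of squares); the abstract does not state which definition of R² was used, and some tools report the squared correlation instead. The function names and the numeric values are hypothetical and are not taken from the study.

```python
def mean_absolute_error(y_true, y_pred):
    """Mean of the absolute differences between measured and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination, computed as 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all measured values are equal")
    return 1.0 - ss_res / ss_tot


# Hypothetical total-degradation values (%), for illustration only.
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.0]

mae = mean_absolute_error(measured, predicted)
r2 = r_squared(measured, predicted)
print(f"MAE = {mae:.3f}, R2 = {r2:.3f}")
```

A lower MAE means predictions sit closer to the measurements in the units of the measured quantity, which is why the drop in MAE after adding physical stability measurements is the headline comparison.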

Conclusions

We draw two conclusions: first, chemical stability can be modeled using machine learning techniques; second, there is a relationship between the physical stability of a peptide and its chemical stability.

Data Availability

The data used to train the models are available in the Supplemental Information document. The ThioT curve data are available upon request.

References

  1. D’Addio SM, Bothe JR, Neri C, Walsh PL, Zhang J, Pierson E, Mao Y, Gindy M, Leone A, Templeton AC. New and evolving techniques for the characterization of peptide therapeutics. J Pharm Sci. 2016;105(10):2989–3006.

  2. Peptide therapeutics market (by applications, by route of administration, and by marketing status): global industry analysis, size, share, growth, trends and forecast 2014–2020. 2015.

  3. Waterman KC. The Application of the Accelerated Stability Assessment Program (ASAP) to Quality by Design (QbD) for Drug Product Stability. AAPS PharmSciTech. 2011;12(3):932–7.

  4. Sengupta P, Chatterjee B, Tekade RK. Current regulatory requirements and practical approaches for stability analysis of pharmaceutical products: a comprehensive review. Int J Pharm. 2018;543(1–2):328–44.

  5. Li H, Nadig D, Kuzmission A, Riley CM. Prediction of the changes in drug dissolution from an immediate-release tablet containing two active pharmaceutical ingredients using an accelerated stability assessment program (ASAP Prime®). AAPS Open. 2016;2(1):1–9.

  6. ICH Harmonised Tripartite Guideline. Stability testing of new drug substances and products Q1A(R2). Current Step 4 version; 2003. p. 1–24.

  7. Rosenberg AS. Effects of protein aggregates: an immunologic perspective. AAPS J. 2006;8(3):E501–7.

  8. Jiskoot W, Randolph TW, Volkin DB, Middaugh CR, Schöneich C, Winter G, Friess W, Crommelin DJA, Carpenter JF. Protein instability and immunogenicity: roadblocks to clinical application of injectable protein delivery systems for sustained release. J Pharm Sci. 2012;101(3):946–54.

  9. Nielsen MK, Ahneman DT, Riera O, Doyle AG. Deoxyfluorination with sulfonyl fluorides: navigating reaction space with machine learning. J Am Chem Soc. 2018;140(15):5004–8. https://doi.org/10.1021/jacs.8b01523.

  10. Ahneman DT, Estrada JG, Lin S, Dreher SD, Doyle AG. Predicting reaction performance in C–N cross-coupling using machine learning. Science. 2018;360(6385):186–90. https://doi.org/10.1126/science.aar5169.

  11. Gao H, Struble TJ, Coley CW, Wang Y, Green WH, Jensen KF. Using machine learning to predict suitable conditions for organic reactions. ACS Cent Sci. 2018;4(11):1465–76. https://doi.org/10.1021/acscentsci.8b00357.

  12. Coley CW, Green WH, Jensen KF. Machine learning in computer-aided synthesis planning. Acc Chem Res. 2018;51(5). https://doi.org/10.1021/acs.accounts.8b00087.

  13. Fine J, Kuan-Yu Liu J, Beck A, Alzarieni KZ, Ma X, Boulos VM, Kenttämaa HI, Chopra G. Graph-based machine learning interprets and predicts diagnostic isomer-selective ion-molecule reactions in tandem mass spectrometry. Chem Sci. 2020;11(43):11849–58. https://doi.org/10.1039/d0sc02530e.

  14. Fine JA, Rajasekar AA, Jethava KP, Chopra G. Spectral deep learning for prediction and prospective validation of functional groups. Chem Sci. 2020;11(18):4618–30. https://doi.org/10.1039/c9sc06240h.

  15. Lai P-K, Fernando A, Cloutier TK, Kingsbury JS, Gokarn Y, Halloran KT, Calero-Rubio C, Trout BL. Machine learning feature selection for predicting high concentration therapeutic antibody aggregation. J Pharm Sci. 2021;110(4):1583–91.

  16. Lai P-K, Fernando A, Cloutier TK, Gokarn Y, Zhang J, Schwenger W, Chari R, Calero-Rubio C, Trout BL. Machine learning applied to determine the molecular descriptors responsible for the viscosity behavior of concentrated therapeutic antibodies. Mol Pharm. 2021;18(3):1167–75.

  17. Melo MCR, Maasch JRMA, de la Fuente-Nunez C. Accelerating antibiotic discovery through artificial intelligence. Commun Biol. 2021;4(1):1–13. https://doi.org/10.1038/s42003-021-02586-0.

  18. Fjell CD, Hiss JA, Hancock REW, Schneider G. Designing antimicrobial peptides: form follows function. Nat Rev Drug Discov. 2012;11(1):37–51. https://doi.org/10.1038/nrd3591.

  19. Cardoso MH, Orozco RQ, Rezende SB, Rodrigues G, Oshiro KGN, Cândido ES, Franco OL. Computer-aided design of antimicrobial peptides: are we generating effective drug candidates? Front Microbiol. 2020;10:3097. https://doi.org/10.3389/fmicb.2019.03097.

  20. Ma J, Sheridan RP, Liaw A, Dahl GE, Svetnik V. Deep neural nets as a method for quantitative structure-activity relationships. J Chem Inf Model. 2015;55(2):263–74. https://doi.org/10.1021/ci500747n.

  21. Wallach I, Heifets A. Most ligand-based classification benchmarks reward memorization rather than generalization. J Chem Inf Model. 2018;58(5). https://doi.org/10.1021/acs.jcim.7b00403.

  22. Szucs R, Brown R, Brunelli C, Heaton JC, Hradski J. Structure driven prediction of chromatographic retention times: applications to pharmaceutical analysis. Int J Mol Sci. 2021;22(8). https://doi.org/10.3390/ijms22083848.

  23. Jethava KP, Fine J, Chen Y, Hossain A, Chopra G. Accelerated reactivity mechanism and interpretable machine learning model of N-sulfonylimines toward fast multicomponent reactions. Org Lett. 2020;22(21):8480–6. https://doi.org/10.1021/acs.orglett.0c03083.

  24. Wen Y, Amos RIJ, Talebi M, Szucs R, Dolan JW, Pohl CA, Haddad PR. Retention index prediction using quantitative structure-retention relationships for improving structure identification in nontargeted metabolomics. Anal Chem. 2018;90(15):9434–40. https://doi.org/10.1021/acs.analchem.8b02084.

  25. Kapoor Y, Milewski M, Dick L, Zhang J, Bothe JR, Gehrt M, Manser K, Nissley B, Petrescu I, Johnson P, Burton S, Moseman J, Hua V, Grunewald T, Tomai M, Smith R. Coated microneedles for transdermal delivery of a potent pharmaceutical peptide. Biomed Microdevices. 2020;22(1):1–10. https://doi.org/10.1007/s10544-019-0462-1.

  26. Kuhn M, Weston S, Keefer C, Engelhardt A, Cooper T, Mayer Z, Kenkel B, R Core Team, Benesty M, Lescarbeau R, Ziem A, Scrucca L, Tang Y, Candan C. Classification and regression training. 2016. p. 198.

  27. Kuhn M. Building predictive models in R using the caret package. J Stat Softw. 2008;28(5):1–26. https://doi.org/10.18637/jss.v028.i05.

  28. Friedman J, Hastie T, Tibshirani R. Regularization paths for generalized linear models via coordinate descent. J Stat Softw. 2010;33(1):1–22. https://doi.org/10.18637/jss.v033.i01.

  29. Geurts P, Irrthum A, Wehenkel L. Supervised Learning with Decision Tree-Based Methods in Computational and Systems Biology. Mol Biosyst. 2009;5(12):1593. https://doi.org/10.1039/b907946g.

  30. Breiman L. Random forests. Mach Learn. 2001;45(1):5–32. https://doi.org/10.1023/A:1010933404324.

  31. Wright MN, Ziegler A. ranger: a fast implementation of random forests for high dimensional data in C++ and R. J Stat Softw. 2017;77(1). https://doi.org/10.18637/jss.v077.i01.

  32. Sanchez G. PLS path modeling with R. R package notes. 2013;235. CiteULike article ID 13341888.

  33. Li H, Liang Y, Xu Q. Support vector machines and its applications in chemistry. Chemom Intell Lab Syst. 2009;95(2):188–98. https://doi.org/10.1016/j.chemolab.2008.10.007.

  34. Fan RE, Chang KW, Hsieh CJ, Wang XR, Lin CJ. LIBLINEAR: a library for large linear classification. J Mach Learn Res. 2008;9:1871–4.

  35. Kuhn M, Weston S, Keefer C, Engelhardt A, Cooper T, Mayer Z, Kenkel B, R Core Team, Benesty M, Lescarbeau R, Ziem A, Scrucca L, Tang Y, Candan C. Classification and regression training. 2016. p. 198.

  36. Kuhn M. Building predictive models in R using the caret package. J Stat Softw. 2008;28(5):1–26. https://doi.org/10.18637/jss.v028.i05.

  37. Olden JD, Joy MK, Death RG. An accurate comparison of methods for quantifying variable importance in artificial neural networks using simulated data. Ecol Modell. 2004;178(3–4):389–97. https://doi.org/10.1016/j.ecolmodel.2004.03.013.

  38. Xu Y, Goodacre R. On splitting training and validation set: a comparative study of cross-validation, bootstrap and systematic sampling for estimating the generalization performance of supervised learning. J Anal Test. 2018;2(3):249–62. https://doi.org/10.1007/s41664-018-0068-2.

  39. Xu Y, Goodacre R. On splitting training and validation set: a comparative study of cross-validation, bootstrap and systematic sampling for estimating the generalization performance of supervised learning. J Anal Test. 2018;2(3):249–62. https://doi.org/10.1007/s41664-018-0068-2.

Acknowledgements

We thank Dr. Pete Wuelfing for his insights into the application of machine learning methods to peptide stability and for his support of the project. This work is funded, in part, by the Merck-Purdue Center award #40002619 on ‘Machine Learning Methods to Elucidate Peptide Aggregation’ to Gaurav Chopra.

Author information

Corresponding author

Correspondence to Gaurav Chopra.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (DOCX 3.77 MB)

Glossary

Observation

A group of measured inputs (e.g., solution pH, concentration, time) and a measured output variable (e.g., the percent label claim of a drug product).

Machine learning

A process in which an algorithm is given a set of observations and adjusts a set of weights, biases, and other parameters to predict the output variable from the input variables. Provided that a relationship exists between the inputs and the outputs, an optimized algorithm should produce a model that can predict the output from the inputs.
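As a minimal sketch of what "adjusting weights and biases" can look like, the example below fits a single weight and a single bias to a handful of observations by gradient descent on the mean squared error. It is an illustration only and is far simpler than the MLP and RF models used in this work; the function name, the learning rate, and the data are assumptions made for the example.

```python
def fit_linear(xs, ys, learning_rate=0.05, epochs=2000):
    """Fit y ~ w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0  # start from an uninformed model
    n = len(xs)
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        # Move each parameter a small step against its gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Hypothetical observations: time (input) and degradation (output), with
# the outputs generated from y = 2 * x + 1 so the answer is known.
times = [0.0, 1.0, 2.0, 3.0]
degradation = [1.0, 3.0, 5.0, 7.0]

w, b = fit_linear(times, degradation)
print(f"weight = {w:.3f}, bias = {b:.3f}")
```

Because the toy data follow an exact linear relationship, the fitted weight and bias approach 2 and 1; with real stability data the relationship is noisier and nonlinear, which is the motivation for more flexible models.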

Cross-validation

A technique for checking that a model is robust by measuring its predictive performance on data that were not used to train it.

Training set

A group of observations used to fit a model.

Test set

A group of observations used to evaluate a trained model after cross-validation. These observations are set aside before cross-validation begins and are used only to evaluate the final model. In this work, the test set was selected at random.
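A random hold-out of this kind can be sketched as follows. The 20% test fraction, the fixed seed, the function name, and the observation fields are assumptions made for illustration; the text above states only that the test set was chosen randomly.

```python
import random


def random_test_split(observations, test_fraction=0.2, seed=0):
    """Randomly set aside a fraction of the observations as a test set."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    indices = list(range(len(observations)))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    n_test = max(1, round(len(observations) * test_fraction))
    test_idx = set(indices[:n_test])
    train = [obs for i, obs in enumerate(observations) if i not in test_idx]
    test = [obs for i, obs in enumerate(observations) if i in test_idx]
    return train, test


# Hypothetical observations, each with inputs and one measured output.
observations = [
    {"pH": 4.0 + 0.5 * i, "time_weeks": i, "potency": 100.0 - i}
    for i in range(10)
]

train, test = random_test_split(observations)
print(len(train), "training observations,", len(test), "test observations")
```

The test observations are removed before any model fitting or cross-validation, so the final evaluation uses data the model has never seen.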

Validation set

A group of observations used to evaluate a trained model during cross-validation. These observations are typically removed from the training set in a systematic manner.

K-Fold Cross-Validation

A popular cross-validation technique that divides the available data into portions called folds [39]. For example, the available data can be divided into five folds, each containing 20% of the data set. One fold is held out as the validation set, and the remaining folds form the training set used to train the model. The model is then evaluated on the validation set and its performance is recorded. This process is repeated until every fold has been used as the validation set, after which summary statistics can be calculated on how well the model performed across the folds.
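The procedure described above can be sketched with index bookkeeping alone. This is an illustrative sketch, not the implementation used in this work; the function name, the shuffling step, and the seed are assumptions.

```python
import random


def k_fold_indices(n_observations, k=5, seed=0):
    """Yield (training, validation) index lists for k-fold cross-validation."""
    if not 2 <= k <= n_observations:
        raise ValueError("k must be between 2 and the number of observations")
    indices = list(range(n_observations))
    random.Random(seed).shuffle(indices)  # randomize fold membership
    # Deal the shuffled indices into k folds of (nearly) equal size.
    folds = [indices[i::k] for i in range(k)]
    for i, validation in enumerate(folds):
        # Every fold except the i-th forms the training set for this round.
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield training, validation


# With 20 observations and 5 folds, each round validates on 20% of the data.
splits = list(k_fold_indices(20, k=5))
for round_number, (training, validation) in enumerate(splits, start=1):
    print(f"round {round_number}: train on {len(training)}, validate on {len(validation)}")
```

In practice a model is fitted on each training list and scored on the matching validation list, and the per-fold scores are then averaged.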

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Fine, J., Wijewardhane, P.R., Mohideen, S.D.B. et al. Learning Relationships Between Chemical and Physical Stability for Peptide Drug Development. Pharm Res 40, 701–710 (2023). https://doi.org/10.1007/s11095-023-03475-3

  • DOI: https://doi.org/10.1007/s11095-023-03475-3
