Abstract
The Coronavirus Disease 2019 (COVID-19) pandemic has posed a serious public health threat worldwide. Anti-coronavirus peptides (ACVPs) are potential peptide drugs for COVID-19, and computational methods for ACVP identification can accelerate the development of COVID-19 therapeutics. In this work, we propose PredACVP, an ACVP prediction model that combines a generative adversarial network (GAN)-based data augmentation method with stacked ensemble learning. The GAN-based data augmentation is used to overcome the few-shot learning problem and improve prediction performance. Because it converts a high-dimensional vector into a low-dimensional vector, the stacked ensemble learning can fuse multi-view information (amino acid composition, dipeptide composition, composition of k-spaced amino acid group pairs, and physicochemical properties) without overfitting. PredACVP achieves an AUC of 0.990 on the test datasets and outperforms the state-of-the-art tools for ACVP identification, and it can thereby accelerate the development of peptide drugs for COVID-19. The GAN used in this work can be applied not only to data augmentation of ACVP peptide sequences, but also to other therapeutic peptides.
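As an illustration of two of the sequence views named above, the following minimal sketch computes amino acid composition (AAC) and dipeptide composition (DPC) with their standard definitions: residue or adjacent-pair counts normalized by the number of residues or pairs. It is not the authors' implementation. The function names `aac`, `dpc`, and `multi_view_features` are hypothetical, and the sketch assumes the input contains only the 20 standard amino acids.

```python
from itertools import product

# The 20 standard amino acids in alphabetical one-letter order (assumed alphabet).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered dipeptides: AA, AC, AD, ..., YY.
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def _validate(seq):
    """Upper-case the sequence and reject non-standard residues."""
    seq = seq.upper()
    if len(seq) < 2:
        raise ValueError("sequence must contain at least two residues")
    if any(residue not in AMINO_ACIDS for residue in seq):
        raise ValueError("sequence contains a non-standard residue")
    return seq


def aac(seq):
    """Amino acid composition: 20 residue frequencies that sum to 1."""
    seq = _validate(seq)
    return [seq.count(a) / len(seq) for a in AMINO_ACIDS]


def dpc(seq):
    """Dipeptide composition: 400 frequencies of overlapping adjacent pairs."""
    seq = _validate(seq)
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    return [pairs.count(d) / len(pairs) for d in DIPEPTIDES]


def multi_view_features(seq):
    """Concatenate the two views into one 420-dimensional vector."""
    return aac(seq) + dpc(seq)
```

Concatenating views in this way gives the high-dimensional input that the abstract says the stacked ensemble reduces to a low-dimensional vector. The remaining two views, the stacking step, and the GAN are not sketched here.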
Funding
This work was supported by the National Natural Science Foundation of China (62272004), the National Key Research and Development Program of China (2020YFA0908700), and the Anhui Medical University Science Foundation Program (2021xkj169).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Xu, J., Xu, C., Cao, R., He, Y., Bin, Y., Zheng, CH. (2023). Generative Adversarial Network-Based Data Augmentation Method for Anti-coronavirus Peptides Prediction. In: Huang, DS., Premaratne, P., Jin, B., Qu, B., Jo, KH., Hussain, A. (eds) Advanced Intelligent Computing Technology and Applications. ICIC 2023. Lecture Notes in Computer Science, vol 14088. Springer, Singapore. https://doi.org/10.1007/978-981-99-4749-2_6
DOI: https://doi.org/10.1007/978-981-99-4749-2_6
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-4748-5
Online ISBN: 978-981-99-4749-2
eBook Packages: Computer Science, Computer Science (R0)