Generative Adversarial Network-Based Data Augmentation Method for Anti-coronavirus Peptides Prediction

Xu, Jiliang; Xu, Chungui; Cao, Ruifen; He, Yonghui; Bin, Yannan; Zheng, Chun-Hou

doi:10.1007/978-981-99-4749-2_6

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14088)

Included in the following conference series:

International Conference on Intelligent Computing

Abstract

The Coronavirus Disease 2019 (COVID-19) pandemic has posed a serious public health threat worldwide. Anti-coronavirus peptides (ACVPs) are potential peptide drugs for COVID-19, and computational methods for ACVP identification can aid the development of COVID-19 therapeutics. In this work, we propose PredACVP, an ACVP prediction model that combines a generative adversarial network (GAN)-based data augmentation method with stacked ensemble learning. The GAN-based data augmentation is used to overcome the few-shot learning problem and improve prediction performance. Because stacked ensemble learning converts a high-dimensional vector into a low-dimensional one, it can fuse multi-view information (amino acid composition, dipeptide composition, composition of k-spaced amino acid group pairs, and physicochemical properties) without overfitting. PredACVP achieves an AUC of 0.990 on the test datasets and outperforms the state-of-the-art tools for ACVP identification. It can therefore accelerate the development of peptide drugs for COVID-19. The GAN used in this work can be applied not only to data augmentation for ACVP sequences but also to other therapeutic peptides.
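Two of the feature views named in the abstract, amino acid composition and dipeptide composition, have standard definitions that can be sketched in a few lines. The code below is an illustrative sketch only and is not taken from PredACVP: the function names `aac` and `dpc`, the alphabet ordering, and the example peptide are assumptions made for this example.

```python
from itertools import product

# The 20 standard amino acids; this ordering is an arbitrary choice
# for the sketch, not something specified by the paper.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def aac(seq):
    """Amino acid composition: fraction of each standard residue (20 values)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return [seq.count(a) / len(seq) for a in AMINO_ACIDS]


def dpc(seq):
    """Dipeptide composition: fraction of each ordered residue pair (400 values)."""
    seq = seq.upper()
    if len(seq) < 2:
        raise ValueError("sequence must contain at least two residues")
    # Overlapping adjacent pairs: positions (0,1), (1,2), ...
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    counts = {}
    for pair in pairs:
        counts[pair] = counts.get(pair, 0) + 1
    # Pairs containing non-standard letters count toward the total
    # but have no feature slot of their own.
    return [counts.get(a + b, 0) / len(pairs)
            for a, b in product(AMINO_ACIDS, repeat=2)]


if __name__ == "__main__":
    peptide = "GIGKFLHSAK"  # hypothetical sequence, for illustration only
    print(len(aac(peptide)), len(dpc(peptide)))
```

Concatenating views like these quickly yields hundreds of dimensions for a small dataset, which is the setting in which the abstract argues for reducing each view to a low-dimensional representation before fusion. How PredACVP performs that reduction is not described in the abstract.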

Funding

This work was supported by the National Natural Science Foundation of China (62272004), the National Key Research and Development Program of China (2020YFA0908700), and the Anhui Medical University Science Foundation Program (2021xkj169).

Author information

Authors and Affiliations

Information Materials and Intelligent Sensing Laboratory of Anhui Province and the Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, Anhui University, Hefei, 230601, Anhui, China
Jiliang Xu, Ruifen Cao, Yannan Bin & Chun-Hou Zheng
Department of Orthopaedics, The Second Affiliated Hospital of Anhui Medical University, Hefei, 230601, Anhui, China
Chungui Xu
Key Laboratory of Chemistry in Ethnic Medicinal Resources, State Ethnic Affairs Commission & Ministry of Education, Yunnan Minzu University, Kunming, 650500, Yunnan, China
Yonghui He

Corresponding authors

Correspondence to Yannan Bin or Chun-Hou Zheng.

Editor information

Editors and Affiliations

Department of Computer Science, Eastern Institute of Technology, Zhejiang, China
De-Shuang Huang
University of Wollongong, North Wollongong, NSW, Australia
Prashan Premaratne
Zhengzhou University of Light Industry, Zhengzhou, China
Baohua Jin
Zhong Yuan University of Technology, Zhengzhou, China
Boyang Qu
University of Ulsan, Ulsan, Korea (Republic of)
Kang-Hyun Jo
Department of Computer Science, Liverpool John Moores University, Liverpool, UK
Abir Hussain

Cite this paper

Xu, J., Xu, C., Cao, R., He, Y., Bin, Y., Zheng, CH. (2023). Generative Adversarial Network-Based Data Augmentation Method for Anti-coronavirus Peptides Prediction. In: Huang, DS., Premaratne, P., Jin, B., Qu, B., Jo, KH., Hussain, A. (eds) Advanced Intelligent Computing Technology and Applications. ICIC 2023. Lecture Notes in Computer Science, vol 14088. Springer, Singapore. https://doi.org/10.1007/978-981-99-4749-2_6

DOI: https://doi.org/10.1007/978-981-99-4749-2_6
Published: 30 July 2023
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-4748-5
Online ISBN: 978-981-99-4749-2
eBook Packages: Computer Science, Computer Science (R0)
