Abstract
Coronavirus disease 2019 (COVID-19), caused by SARS-CoV-2, is a widely feared disease that has claimed hundreds of thousands of lives globally. The pandemic, which has resulted in a global health emergency, is currently a much sought-after research topic. The frequently mutating virus originated in Chiroptera (bats) and was subsequently transmitted to other mammals, including humans. However, it has yet to be unraveled at the genomic level what makes humans more prone to infection by coronaviruses. Here, we implement an unsupervised machine learning model, K-means clustering, that uses a combination of different features to determine the risk of infection. K-means is chosen because it performs well in clustering analysis. The algorithm groups the sequences of the dataset into five clusters, a number chosen based on the elbow plot and the collinearity of coefficients. Principal component analysis (PCA), a dimensionality reduction technique, is used together with a 3D visualization and a heat map to show the correlation between the mutated and original sequences considered.
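The clustering step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which features were used, so the 2-mer frequency features, the toy sequences, and the helper names `kmer_features` and `kmeans` are all assumptions made for illustration. The sketch runs plain Lloyd's K-means over a range of cluster counts and records the within-cluster sum of squares (inertia), which is the quantity an elbow plot displays. The PCA projection and heat map are omitted.

```python
import random
from itertools import product


def kmer_features(seq, k=2):
    """Normalized k-mer frequencies of a nucleotide sequence (assumed feature choice)."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = {m: 0 for m in kmers}
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # skip windows containing ambiguous bases such as N
            counts[window] += 1
    total = sum(counts.values()) or 1
    return [counts[m] / total for m in kmers]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centroids, inertia)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:
            break
        labels = new_labels
    inertia = sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
    return labels, centroids, inertia


# Made-up toy sequences, not data from the paper.
toy_seqs = [
    "ATGATGATGATGATGA",
    "ATGATGATGTTGATGA",
    "GCGCGCGCGCGCGCGC",
    "GCGCGCGCACGCGCGC",
    "TTTTAAAATTTTAAAA",
    "TTTTAAAATTTAAAAA",
]
X = [kmer_features(s) for s in toy_seqs]

# Inertia for each candidate k; plotting these values against k gives the elbow plot.
inertias = {k: kmeans(X, k)[2] for k in range(1, len(X) + 1)}
for k, value in inertias.items():
    print(k, round(value, 4))
```

In an elbow plot, the chosen number of clusters is the point where adding another cluster stops reducing the inertia appreciably. The paper reports five clusters for its dataset; the toy data here is far too small to reproduce that.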
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Rath, S.L., Sinha, C., Kasturi, S.L.N.P., Mohapatra, S., Jain, K. (2022). An Unsupervised Clustering Algorithm to Cluster the New SARS-CoV-2 Virus Mutation. In: Saini, H.S., Sayal, R., Govardhan, A., Buyya, R. (eds) Innovations in Computer Science and Engineering. Lecture Notes in Networks and Systems, vol 385. Springer, Singapore. https://doi.org/10.1007/978-981-16-8987-1_19
DOI: https://doi.org/10.1007/978-981-16-8987-1_19
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-8986-4
Online ISBN: 978-981-16-8987-1
eBook Packages: Intelligent Technologies and Robotics (R0)