Keywords
2019-nCoV, Wuhan coronavirus, SARS, MERS
This article is included in the Emerging Diseases and Outbreaks gateway.
This article is included in the Coronavirus collection.
Fears are mounting worldwide over the cross-border spread of the new strain of coronavirus (denoted 2019-nCoV) that originated in Wuhan, the largest city in central China, after its spread to Thailand and Japan. The newly emerging pathogen belongs to the same virus family as the deadly severe acute respiratory syndrome and Middle East respiratory syndrome coronaviruses (SARS-CoV and MERS-CoV, respectively). The World Health Organization (WHO) has recently published surveillance recommendations for a possible “large epidemic or even pandemic” of the novel coronavirus and has issued guidelines for hospitals across the world. However, many questions about 2019-nCoV remain unanswered: (i) what is the origin and/or natural reservoir of the virus? (ii) is it easily transmitted from human to human? and (iii) what are the potential diagnostic, therapeutic and vaccine targets? Currently, only the nucleotide sequences of eight human 2019-nCoV isolates are available, without any additional information about the biological properties of the virus beyond confirmation of virion morphology by electron microscopy. This is likely not enough information to answer the questions above.
The informational spectrum method (ISM), a virtual spectroscopy method for analysis of proteins, is based on the fundamental electronic properties of amino acids and requires only the nucleotide sequence to investigate proteins1. For this reason, ISM was previously used for analysis of novel viruses for which little or no information was available2–5. Here, 2019-nCoV was analyzed with ISM to identify its possible origin and natural host, as well as putative therapeutic and vaccine targets.
The S1 surface protein sequences from eight human 2019-nCoV isolates, deposited in the publicly available GISAID database (accessed on January 19, 2020), were analyzed by ISM. The studied sequences were BetaCoV/Wuhan/IVDC-HB-04/2020, BetaCoV/Wuhan/IVDC-HB-01/2019, BetaCoV/Wuhan/IVDC-HB-05/2019, BetaCoV/Wuhan/IPBCAMS-WH-01/2019, BetaCoV/Wuhan/WIV04/2019, BetaCoV/Wuhan-Hu-1/2019, BetaCoV/Nonthaburi/61/2020, and BetaCoV/Nonthaburi/74/2020.
The phylogenetic analysis also included amino acid sequences from other coronaviruses: (i) S1 proteins from the viruses AVP78042, AVPvp78031, AY304486, AY559093, JX163927, YN2018B and KY417146, already used by other authors to study the phylogenetic relationship between 2019-nCoV and the nearest bat and SARS-like CoVs (GISAID database); and (ii) S1 proteins from the first three human MERS-CoV isolates, AGG22542, AFS88936 and AFY13307, deposited in the GISAID database.
A detailed description of sequence analysis based on ISM has been published elsewhere2. According to this approach, sequences (protein or DNA) are transformed into signals by assigning a numerical value to each element (amino acid or nucleotide). These values correspond to the electron-ion interaction potential6, which determines the electronic properties of amino acids and nucleotides that are essential for their intermolecular interactions. The resulting signal is then decomposed into periodic functions by Fourier transformation. The result is a series of frequencies and their amplitudes. The obtained frequencies correspond to the distribution of structural motifs (primary structure) with defined physico-chemical characteristics responsible for the biological function of the putative protein corresponding to the analyzed sequence. When comparing proteins that share the same biological or biochemical function, the technique allows detection of code/frequency pairs that are specific to their common biological properties. The method is insensitive to the location of the motifs and therefore does not require prior alignment of the sequences. In addition, this is the only method that allows immediate functional analysis.
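The transformation described above can be illustrated with a minimal sketch. It assumes the commonly tabulated electron-ion interaction potential (EIIP) values for amino acids (to be checked against reference 6), removes the mean of the signal before the transform, and reports the squared magnitude of each Fourier coefficient; the function name `informational_spectrum` and the toy sequence are hypothetical and are not taken from the authors' software.

```python
import cmath

# EIIP value per amino acid (one-letter code).
# Assumption: commonly tabulated values; verify against reference 6.
EIIP = {
    "L": 0.0000, "I": 0.0000, "N": 0.0036, "G": 0.0050, "V": 0.0057,
    "E": 0.0058, "P": 0.0198, "H": 0.0242, "K": 0.0371, "A": 0.0373,
    "Y": 0.0516, "W": 0.0548, "Q": 0.0761, "M": 0.0823, "S": 0.0829,
    "C": 0.0829, "T": 0.0941, "F": 0.0946, "R": 0.0959, "D": 0.1263,
}


def informational_spectrum(sequence):
    """Return [(frequency, energy)] for frequencies n/N with n = 1..N//2."""
    signal = [EIIP[aa] for aa in sequence]
    n_points = len(signal)
    mean = sum(signal) / n_points
    # Assumption: subtract the mean so the constant offset does not dominate.
    centered = [value - mean for value in signal]
    spectrum = []
    for n in range(1, n_points // 2 + 1):
        # Plain discrete Fourier transform coefficient at frequency n/N.
        coeff = sum(value * cmath.exp(-2j * cmath.pi * n * m / n_points)
                    for m, value in enumerate(centered))
        spectrum.append((n / n_points, abs(coeff) ** 2))
    return spectrum


# Hypothetical toy sequence with a motif that repeats every three residues.
toy = "DLL" * 10
peak_frequency, peak_energy = max(informational_spectrum(toy), key=lambda p: p[1])
print(round(peak_frequency, 3), round(peak_energy, 4))
```

For the toy sequence the dominant peak falls at frequency 1/3, reflecting the three-residue periodicity of the motif regardless of where the motif starts.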
The phylogenetic tree of S1 proteins from coronaviruses was generated with the ISM-based phylogenetic algorithm ISTREE, described in detail elsewhere7. In the presented analysis, the distance matrix was calculated using the amplitude at the frequency F(0.257) as the distance measure between sequences.
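As a rough illustration of a frequency-based distance matrix, the sketch below evaluates the Fourier amplitude of each EIIP-encoded sequence at a single frequency and takes the absolute difference of amplitudes as the pairwise distance. This is an assumption made for illustration only: the actual ISTREE algorithm is described in reference 7 and is not reproduced here, and the names `amplitude_at` and `distance_matrix`, the EIIP table (commonly tabulated values) and the toy sequences are hypothetical.

```python
import cmath

# EIIP value per amino acid (one-letter code).
# Assumption: commonly tabulated values; verify against reference 6.
EIIP = {
    "L": 0.0000, "I": 0.0000, "N": 0.0036, "G": 0.0050, "V": 0.0057,
    "E": 0.0058, "P": 0.0198, "H": 0.0242, "K": 0.0371, "A": 0.0373,
    "Y": 0.0516, "W": 0.0548, "Q": 0.0761, "M": 0.0823, "S": 0.0829,
    "C": 0.0829, "T": 0.0941, "F": 0.0946, "R": 0.0959, "D": 0.1263,
}


def amplitude_at(sequence, frequency):
    """Fourier amplitude of the mean-centered EIIP signal at one frequency."""
    signal = [EIIP[aa] for aa in sequence]
    mean = sum(signal) / len(signal)
    return abs(sum((value - mean) * cmath.exp(-2j * cmath.pi * frequency * m)
                   for m, value in enumerate(signal)))


def distance_matrix(sequences, frequency=0.257):
    """Pairwise |A_i - A_j|, an assumed distance (not the ISTREE definition)."""
    amplitudes = [amplitude_at(seq, frequency) for seq in sequences]
    return [[abs(a - b) for b in amplitudes] for a in amplitudes]


# Hypothetical toy sequences; real input would be full S1 protein sequences.
toy_sequences = ["DLLDLLDLL", "DLLDLLDLL", "RKRKRKRKR"]
for row in distance_matrix(toy_sequences):
    print([round(d, 4) for d in row])
```

A matrix of this kind can then be passed to any standard tree-building procedure; which procedure ISTREE uses is not stated in this section.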
To compare the informational similarity between 2019-nCoV, SARS-CoV, MERS-CoV and Bat SARS-like CoV, the cross-spectra (CS) of S1 proteins from these viruses were calculated. Figure 1a shows the CS of 2019-nCoV, SARS-CoV and MERS-CoV. These CS contain only one dominant peak, corresponding to the frequency F(0.257). Figure 1b displays the CS of S1 proteins from 2019-nCoV and Bat SARS-like CoV. Amplitudes in these latter CS are significantly lower than in the CS presented in Figure 1a. These results show that (i) S1 proteins from 2019-nCoV, SARS-CoV, MERS-CoV and Bat SARS-like CoV encode common information, which is represented by the frequency F(0.257), and (ii) S1 proteins from 2019-nCoV are remarkably more informationally similar to S1 from SARS-CoV and MERS-CoV than to S1 from Bat SARS-like CoV. This suggests that the biological properties of 2019-nCoV are apparently more similar to those of SARS-CoV and MERS-CoV than to those of Bat SARS-like CoV.
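A cross-spectrum of this kind can be sketched as the pointwise product of the Fourier amplitude spectra of the individual sequences, so that only frequencies present in all of them give a high value. The sketch below assumes that shorter sequences are zero-padded to the length of the longest one and that each signal is mean-centered first; these choices, the EIIP table (commonly tabulated values) and the names `amplitude_spectrum` and `cross_spectrum` are illustrative assumptions rather than the authors' implementation.

```python
import cmath
import math

# EIIP value per amino acid (one-letter code).
# Assumption: commonly tabulated values; verify against reference 6.
EIIP = {
    "L": 0.0000, "I": 0.0000, "N": 0.0036, "G": 0.0050, "V": 0.0057,
    "E": 0.0058, "P": 0.0198, "H": 0.0242, "K": 0.0371, "A": 0.0373,
    "Y": 0.0516, "W": 0.0548, "Q": 0.0761, "M": 0.0823, "S": 0.0829,
    "C": 0.0829, "T": 0.0941, "F": 0.0946, "R": 0.0959, "D": 0.1263,
}


def amplitude_spectrum(sequence, n_points):
    """Fourier amplitudes at n/n_points, n = 1..n_points//2, after zero-padding."""
    signal = [EIIP[aa] for aa in sequence]
    mean = sum(signal) / len(signal)
    # Assumption: center the signal, then pad with zeros to the common length.
    padded = [value - mean for value in signal] + [0.0] * (n_points - len(signal))
    return [abs(sum(value * cmath.exp(-2j * cmath.pi * n * m / n_points)
                    for m, value in enumerate(padded)))
            for n in range(1, n_points // 2 + 1)]


def cross_spectrum(sequences):
    """Return [(frequency, product of amplitudes)] over all sequences."""
    n_points = max(len(seq) for seq in sequences)
    spectra = [amplitude_spectrum(seq, n_points) for seq in sequences]
    return [((n + 1) / n_points, math.prod(column))
            for n, column in enumerate(zip(*spectra))]


# Hypothetical toy sequences that share a three-residue periodicity.
cs = cross_spectrum(["DLL" * 10, "RII" * 5])
common_frequency, common_amplitude = max(cs, key=lambda p: p[1])
print(round(common_frequency, 3), round(common_amplitude, 4))
```

The two toy sequences have no residues in common, yet their cross-spectrum peaks at frequency 1/3, which is the sense in which a shared peak represents shared periodicity rather than sequence identity.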
To confirm this conclusion, the ISM-based phylogenetic tree for S1 proteins was calculated (Figure 2). In this calculation, the amplitude at the frequency F(0.257) was used as the distance measure. As observed in Figure 2, all analyzed 2019-nCoV S1 amino acid sequences are grouped with SARS-CoV and MERS-CoV and separated from Bat SARS-like CoV. This indicates that 2019-nCoV is phylogenetically more similar to SARS-CoV and MERS-CoV than to Bat SARS-like CoV. This result differs from that obtained with homology-based phylogenetic analysis, which showed that 2019-nCoV is closely related to Bat SARS-like CoV (https://platform.gisaid.org/epi3/frontend#lightbox1296857287).
It has previously been shown that the dominant frequency in the informational spectrum of viral envelope proteins corresponds to the interaction between the virus and its receptor2,3,7,8. The ISM analysis showed that the frequency component F(0.257) is present in the CS of S1 SARS-CoV and its receptor angiotensin converting enzyme 2 (ACE2)9, but not in the CS of S1 MERS-CoV and its main receptor dipeptidyl peptidase 4 (DPP4)10. Of note, both receptors, ACE2 and DPP4, are expressed in airway epithelia. The presence of F(0.257) in the informational spectrum of MERS-CoV (Figure 1) also suggests a possible interaction between this virus and ACE2. The dominant peak at the frequency F(0.257) in the CS of S1 from SARS-CoV and MERS-CoV and ACE2 supports this possibility (Figure 3), although this has not been formally proved for MERS-CoV11.
As shown in Figure 1a, the frequency F(0.257) is also present in the informational spectrum of 2019-nCoV, suggesting that ACE2 might be the receptor for this novel coronavirus too. Calculation of the CS for the S1 protein from 2019-nCoV and all ACE2 sequences available in the UniProt database revealed that the highest amplitudes at the frequency F(0.257) correspond to ACE2 from civet and chicken. This result indicates that these species can be included as potential candidates for the natural reservoir of 2019-nCoV. However, it is possible that 2019-nCoV uses very different receptors in its natural host(s), and not only ACE2, as is putatively the case in humans.
Finally, the S1 amino acid sequence from 2019-nCoV was scanned for the domain that contributes most to the information represented by the frequency F(0.257) (Figure 4a). This analysis revealed that domain 266–330 (numbering refers to the mature protein) is essential for the interaction of 2019-nCoV with ACE2. Of note is the striking homology between this domain in the S1 proteins of 2019-nCoV and SARS-CoV, but not of MERS-CoV, for which ACE2 is not the main receptor (Figure 4b).
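A scan of this kind can be sketched as a sliding window: the Fourier amplitude at the frequency of interest is computed for every window of fixed length, and the window with the highest amplitude is reported. The window length, the one-residue step, the EIIP table (commonly tabulated values) and the names `amplitude_at` and `scan_domains` are hypothetical choices for illustration; the section does not state the parameters the authors used.

```python
import cmath

# EIIP value per amino acid (one-letter code).
# Assumption: commonly tabulated values; verify against reference 6.
EIIP = {
    "L": 0.0000, "I": 0.0000, "N": 0.0036, "G": 0.0050, "V": 0.0057,
    "E": 0.0058, "P": 0.0198, "H": 0.0242, "K": 0.0371, "A": 0.0373,
    "Y": 0.0516, "W": 0.0548, "Q": 0.0761, "M": 0.0823, "S": 0.0829,
    "C": 0.0829, "T": 0.0941, "F": 0.0946, "R": 0.0959, "D": 0.1263,
}


def amplitude_at(values, frequency):
    """Fourier amplitude of a mean-centered numeric signal at one frequency."""
    mean = sum(values) / len(values)
    return abs(sum((value - mean) * cmath.exp(-2j * cmath.pi * frequency * m)
                   for m, value in enumerate(values)))


def scan_domains(sequence, frequency=0.257, window=64):
    """Return (start, amplitude) of the window with the highest amplitude.

    `start` is a 0-based index; `window` is a hypothetical default length.
    """
    signal = [EIIP[aa] for aa in sequence]
    scores = [(start, amplitude_at(signal[start:start + window], frequency))
              for start in range(len(signal) - window + 1)]
    return max(scores, key=lambda p: p[1])


# Hypothetical toy sequence: a periodic 30-residue motif inside a flat background.
toy = "L" * 30 + "DLR" * 10 + "L" * 30
start, amplitude = scan_domains(toy, frequency=1 / 3, window=30)
print(start, round(amplitude, 4))
```

In the toy example the best window starts at index 30, exactly where the periodic motif begins; with real S1 sequences the reported start would need to be converted to the numbering of the mature protein.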
In conclusion, the results of the presented in silico analysis suggest the following: (i) the newly emerging 2019-nCoV is highly related to SARS-CoV and, to a lesser degree, MERS-CoV; (ii) civets and poultry are potential candidates for the natural reservoir of 2019-nCoV; and (iii) domain 288–330 of the S1 protein from 2019-nCoV represents a promising therapeutic and/or vaccine target. Further research on these issues is needed, including the development of reverse genetics and animal models to study the biology of 2019-nCoV.
Sequence data of the viruses were obtained from the GISAID EpiFlu™ Database. To access the database, each individual user should complete the “Registration Form For Individual Users”, which is available alongside detailed instructions. After submitting the registration form, the user will receive a password. There are no other restrictions on access to GISAID. Conditions of access to, and use of, the GISAID EpiFlu™ Database and Data are defined by the Terms of Use.
Version 4 (update): 06 Jan 21
Version 3 (revision): 27 Apr 20
Version 2 (update): 31 Jan 20
Version 1: 27 Jan 20