INTRODUCTION

Coronaviruses (CoV) include the causative agent of severe acute respiratory syndrome (SARS-CoV), which caused a global epidemic in 2002, when 8000 people were infected, 10% of whom died [1]. Later, in 2012, there was an outbreak of Middle East respiratory syndrome (MERS-CoV), and the epidemic covered 26 countries [2]. In December 2019, a new coronavirus (SARS-CoV-2) was recorded in China, which caused the global pandemic of coronavirus disease (COVID-19) in 2020 [3]. In a short time, COVID-19 spread to 215 countries of the world, including Russia. In the world and in Russia, the number of new cases of infection and deaths from the coronavirus continues to increase every day, and the total has now exceeded 45 million [4].

When infected with SARS-CoV-2, the patient develops symptoms of an acute respiratory viral disease after an incubation period of 2–14 days: increased body temperature (in 90% of cases), cough (in 80%), shortness of breath (in 30%), fatigue (in 40%), chest congestion (in 20%), sore throat, runny nose, decreased sense of smell and taste, and conjunctivitis. These symptoms may indicate the development of pneumonia without respiratory failure, acute respiratory distress syndrome (ARDS, pneumonia with acute respiratory failure), sepsis, septic shock, or multiple organ dysfunction syndrome [5]. These pathological conditions most often lead to death in patients of working age (59.7 ± 13.3 years) with chronic diseases: arterial hypertension (23.7–30%), diabetes mellitus (16.2%), metabolic syndrome, coronary heart disease (5.8%), chronic obstructive pulmonary disease (COPD), nicotine addiction, inflammatory bowel diseases, and cancers [6–8]. In addition, patients with genetic diseases may be a risk group for COVID-19 infection. For example, in a study conducted by scientists from the University Medical Center Utrecht (Netherlands), 180 (45.6%) of 395 patients with Down syndrome developed severe respiratory syncytial virus infection [9]. The clinical picture in patients of the risk groups is characterized by the development of mutual aggravation syndrome, accompanied by progressive respiratory and heart failure, which ultimately worsens their condition and leads to labor losses, early disability, and high mortality. In this regard, patients with COVID-19 who also have chronic or genetic diseases urgently need immediate diagnosis and rehabilitation.

In the midst of the COVID-19 pandemic, it is extremely urgent to study the pathogenesis of this disease and to identify new protein and gene targets that may serve as highly sensitive and specific diagnostic and prognostic markers of the severity and outcome of the disease. Genetic analysis is the only accurate method that can clearly verify COVID-19 infection, diagnose genetic diseases, form new risk groups based on the genetic predisposition among patients with chronic diseases or susceptibility to COVID-19 infection, and personalize medical rehabilitation programs.

Owing to the fact that pathogenetic mechanisms of COVID-19 infection are multiple in nature, the review considers the genetic aspects of this process at the level of virus and human genomes with an emphasis on identifying associations of potential prognostic genetic markers with the severity and outcome of the disease and genetic targets for exposure to targeted pharmaceuticals.

SARS-COV-2 CORONAVIRUS: STRUCTURE, GENOME, MUTATIONS AND VARIANTS ASSOCIATED WITH THE DEVELOPMENT OF INFECTION

Coronaviruses are large spherical particles (with a diameter of 120 nm) consisting of a lipid bilayer envelope that contains four proteins, the membrane (M), envelope (E), and spike (S) proteins and hemagglutinin esterase (HE), and surrounds the nucleocapsid, which is formed by multiple copies of the nucleocapsid protein (N) associated with single-stranded RNA [10] (Fig. 1).

Fig. 1. Micrograph (a) and schematic structure of SARS-CoV-2 coronavirus (b).

S proteins form outgrowths on the envelope of the virus, creating a kind of crown, from which the virus got its name [11]. Using these spikes, the viruses attach to receptor proteins of the host cells, which ensures fusion of the viral and cell membranes and the entry of the viral RNA into the cell. S proteins contain the receptor-binding domain (RBD, amino acids N318–T509), which includes the receptor-binding motif (RBM, amino acids S432–T486) that interacts with angiotensin-converting enzyme 2 (ACE2), a cell receptor for SARS-CoV-2 [12, 13]. In addition, the S glycoprotein contains a furin-like cleavage site [14], which is required for recognition during proteolysis and, therefore, contributes to zoonotic transmission of the virus.

The SARS-CoV-2 genome is represented by single-stranded RNA of about 30 kb containing a cap structure at the 5'-end and a poly-A sequence at the 3'-end, which allows viral RNA to be translated on the ribosomes of the host cells. Viral RNA includes regulatory sequences, at which transcription is terminated, and ten open reading frames (ORFs), which are transcribed to form mRNA (Fig. 2).

Fig. 2. The structure of RNA genes of SARS-CoV-2 coronavirus (according to [15]).

When viral RNA is translated in the host cell, the 1a/1ab polyprotein (pp1a/pp1ab) is synthesized. The ORF1a and ORF1b genes encode the pp1a/pp1ab replicase/transcriptase polyprotein, which is then cleaved by the viral chymotrypsin-like main protease (3CLpro, also designated Mpro) or by two papain-like proteases to form 16 nonstructural proteins (Nsps). Other ORFs encode structural proteins (the S protein and the membrane, envelope, and nucleocapsid proteins), as well as accessory proteins, which are more immunogenic than nonstructural proteins [10, 16]. During assembly of the virion, the nucleoprotein (ORF9a) packs viral RNA into a helical ribonucleocapsid (RNP) through its interaction with the viral genome and the membrane M protein [17]. The nucleoprotein is involved in increasing the transcription efficiency of subgenomic viral RNA and in viral replication. Nonstructural proteins are also involved in the formation of the multiprotein replication-transcription complex (RTC). The main protein of this complex is RNA-dependent RNA polymerase (RdRp), which carries out replication and transcription of viral RNA. Other nonstructural proteins of the complex perform auxiliary functions during replication and transcription. For example, the exoribonuclease, with its proofreading activity, provides the additional replication fidelity that RNA-dependent RNA polymerase lacks.

Mutations in the coronavirus genes can be phenotypically manifested as changes in amino acids of the encoded proteins and can therefore affect the functions of these proteins and the processes in which they participate. X. Tang et al. [18] conducted a population analysis of 103 SARS-CoV-2 genomes from patients in different regions of the world. On the basis of the differences between strains at two SNPs (r² = 0.954) at positions 8782 (ORF1ab, T8517C, synonymous) and 28144 (ORF8, C251T, S84L), two genetic types of the virus were identified: L (71% frequency) and S (29% frequency). The viruses of the L type, derivatives of the S type viruses, are more aggressive and virulent owing to higher rates of genetic information transmission and replication [18]. In the 103 sequenced SARS-CoV-2 genomes, mutations were identified in 149 regions, and the parental strains contained 43 synonymous, 83 nonsynonymous, and two stop mutations. Most of these mutations were singletons: 65.1% (28/43) of synonymous and 84.3% (70/83) of nonsynonymous mutations, which indicates their recent origin or an increase in the number of cases. However, 16.3% (7 of 43) of synonymous mutations and one nonsynonymous mutation (ORF8, L84S, 28144) had a frequency of ≥70% among SARS-CoV-2 strains. Other nonsynonymous mutations with alleles present in more than two SARS-CoV-2 strains affected six proteins: Orf1ab (A117T, I1607V, L3606F, I6075T), S (H49Y, V367F), Orf3a (G251V), Orfa (P34S), Orf8 (V62L, S84L), and N (S194L, S202N, P344S) [18] (Table 1).

Table 1. Mutations and functions of SARS-CoV-2 proteins

Other authors (O.A. MacLean et al. [19]) criticized this conclusion, arguing that the predominance of samples with a specific mutation is not evidence of a higher rate of spread of viruses with this mutation. To support such a statement, the observed results (the number of infected cases) must be compared with those expected under a null distribution with equal transmission rates. Since this was not done, the authors believe that there is insufficient evidence of a difference in transmission rates between different genetic types of SARS-CoV-2. The differences in the observed number of samples with and without these mutations most likely resulted from random epidemiological factors [19].

In a study by the Watson Center (New York, United States) [20], 48 SARS-CoV-2 genomes were sequenced and 129 mutations were found, including 80 unique mutations: 43 missense mutations, 21 synonymous mutations, three deletions, 11 noncoding mutations, and two deletions in noncoding regions. The most common variants, 8782C>T (ORF1ab) and 28144T>C (ORF8), were found in 13 samples, and the 29095C>T (N) variant was found in five samples. The variants 8782C>T and 29095C>T are synonymous, while 28144T>C causes the L84S amino acid substitution in Orf8. Twelve out of 13 samples with these variants were identified outside Wuhan. Among the 43 missense variants, 30 were found at the ORF1ab locus. Interestingly, the number of mutations in the Nsp3 domain is greater than in other domains. All three detected deletions are localized in the Nsp1 domain of ORF1ab, and the two deletions in noncoding regions are found in the 3'-UTR and 5'-UTR, which do not affect the function of the protein [20].

The study by C. Yin (Department of Mathematics, Statistics, and Computer Science, University of Illinois, Chicago, United States) is of interest [21]. When genotyping 442 strains of SARS-CoV-2 from the world GISAID database, several frequent mutations (variants) were found in the SARS-CoV-2 genomes, and a comparative analysis of their frequencies in samples from the epidemiological region was carried out. These SNPs are apparently able to affect the transmissibility and pathogenicity of SARS-CoV-2 (Table 1).

The first frequent mutation (SNP, 241C>T) in the SARS-CoV-2 genome is located in the leader sequence, a site of the genome important for discontinuous subgenomic replication. This mutation co-occurs with three other mutations, 3037C>T, 14408C>T, and 23403A>G, which cause a synonymous change in Nsp3 and amino acid substitutions in RNA polymerase (P323L) and the S protein (D614G), respectively. Since three of these mutations affect elements involved in replication (241C>T in the leader sequence and 14408C>T in RNA polymerase) and in the binding to the ACE2 receptor (23403A>G in the S protein) and are common in virus isolates from Europe, where more numerous and more severe cases of COVID-19 infection were observed than in other geographic regions, these variants may contribute to the increased transmissibility of the virus [21]. The second variant (28144T>C) is localized in the Nsp8 protein, causing the substitution of leucine (L) with serine (S). The Nsp8 protein initiates de novo RNA replication in the absence of primers [22]. This result is consistent with the data of a study performed on 103 genomes of SARS-CoV-2, in which the S and L types of the virus were distinguished by two co-occurring mutations (8782C>T and 28144T>C) [18]. Interestingly, primer-independent RNA polymerase (Nsp8) contains more mutations than any other protein (28144T>C, 28881G>A, 28882G>A, and 28883G>C), which may contribute to mutagenic resistance owing to increased replication fidelity. This is supported by a study in which a mutation in RNA polymerase increased the replication fidelity of an RNA virus [23]. The third SNP (26144G>T) is present in nonstructural protein 3 (Nsp3: G251V), which cooperates with Nsp4 and Nsp6 to induce double-membrane vesicles (DMVs) and the membrane complex that serves as a platform for RNA replication and assembly [24].
The D614G mutation is localized in the S1–S2 junction region near the furin recognition site (R667), which is involved in the cleavage of the S protein necessary for the entry of virions into cells and their persistence there [25]. The functional significance of the SNP (23403A>G) causing the D614G amino acid substitution in the S protein is unclear. If a mutation is lethal or reduces the ability of the virus to spread, it does not persist among circulating strains, and its functional significance cannot be established. Only mutations in the S protein that increase the affinity for cellular ACE2 receptors and decrease the immune response stand a chance of becoming established. It is likely that these variants are the result of natural selection and evolution of the virus [21].

Analysis of SNPs in samples from the endemic region revealed two SNPs (8782C>T, 28144T>C) common in China, Europe, and the United States. Later, one of these strains mutated (8782C>T, 28144T>C, 18060C>T), which led to the emergence of four SARS-CoV-2 genotypes: genotype I (11083G>T), genotype II (26144G>T), genotype III (8782C>T, 28144T>C), and genotype IV (241C>T, 3037C>T, 14408C>T, 23403A>G). Strains within the same genotype originate from the same ancestor [21]. The existence of eight genetic strains of SARS-CoV-2 is assumed [26].
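The genotype definitions above can be read as a simple lookup from marker SNPs to a genotype label. The following Python fragment is a minimal sketch of that idea, not the procedure used in [21]: the marker sets are copied from the text, while the function names and the rule that every marker of a genotype must be present are assumptions made for illustration.

```python
import re

# Marker SNP sets for the four genotypes, as listed in the text [21].
GENOTYPE_MARKERS = {
    "I": {"11083G>T"},
    "II": {"26144G>T"},
    "III": {"8782C>T", "28144T>C"},
    "IV": {"241C>T", "3037C>T", "14408C>T", "23403A>G"},
}

# Labels of the form <position><reference base>><alternative base>.
SNP_PATTERN = re.compile(r"^(\d+)([ACGT])>([ACGT])$")


def parse_snp(label):
    """Split a label such as '8782C>T' into (position, ref, alt)."""
    match = SNP_PATTERN.match(label)
    if match is None:
        raise ValueError(f"unrecognized SNP label: {label!r}")
    position, ref, alt = match.groups()
    return int(position), ref, alt


def assign_genotypes(observed_snps):
    """Return genotypes whose complete marker set occurs in the isolate.

    Assumed rule (hypothetical, for illustration only): a genotype is
    reported only when all of its marker SNPs were observed.
    """
    observed = set(observed_snps)
    for label in observed:
        parse_snp(label)  # raises ValueError on malformed labels
    return [name for name, markers in GENOTYPE_MARKERS.items()
            if markers <= observed]
```

Under these assumptions, an isolate carrying 8782C>T, 28144T>C, and 18060C>T is assigned to genotype III, while an isolate carrying only 241C>T is left unassigned because the other three markers of genotype IV are missing.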

Virus strains with functionally significant SNPs (241C>T, 3037C>T, 23403A>G) were found mainly in SARS-CoV-2 isolates from European countries. In these strains, additional mutations were revealed at positions (14408C>T, 23403A>G), which affected RNA primase (Nsp8), RNA polymerase (Nsp12), and the S protein. These mutations are likely associated with the severity of COVID-19 infection in European countries. Analysis of the SNP profiles of viruses of different generations from isolates of the world collection showed that one mutation can occur in one generation [21].

Whole genome sequencing of samples from the world collection of the SARS-CoV-2 virus (NCBI repository) made it possible to reveal 47 common SNPs affecting the main viral proteins: S protein, Nsp1, RdRp, and Orf8. The proteins are highly mutated. Mutations occurred during the spread of the virus from person to person over the past 3 months (March–May) of the pandemic. The presence of these mutations in key proteins of the virus can be the reason for its different virulence, features of a person’s response to antiviral drugs, and different lethality from viral infection [27]. Chinese scientists discovered 14 new SNPs (9 out of 14 are nonsynonymous) in the coding region of the SARS-CoV-2 genome with frequencies from 10 to 50%, which are combined into four groups LG_1–LG_4 [28]. It was shown that mutants of the LG_1 group emerged during mutational selection of strains from Europe, while selection of strains from America resulted in the formation of mutants of the LG_2 and LG_3 groups. Thus, the increase in the number of new SARS-CoV-2 alleles is determined by genetic differentiation of the strains in Europe and America. Positive and negative correlations were established between infection of patients with mutants of the LG_1 and the LG_2 and LG_3 groups, respectively, and the lethality rate from COVID-19. These observations suggest that, compared with the strains of the LG_2 and LG_3 groups, the LG_1 virus strains are more virulent, which can partly explain the higher lethality rates from COVID-19 infection in European countries than in the United States [28].

In contrast, mutations in Orf1a/b can weaken the virulence of the coronavirus. Thus, the Y6398H mutation in the Orf1a/b protein (p59/Nsp14/ExoN) weakens the virulence of the murine coronavirus (MHV-A59), which was associated with a decrease in its replication within five days after infection [29]. The T8517C mutation in Orf1ab SARS-CoV-2 does not change the protein sequence (replacement of the AGT (Ser) codon with AGC (Ser)), but most likely affects translation of the Orf1ab protein [18].
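Whether a codon change such as AGT to AGC alters the protein can be checked against the standard genetic code. The sketch below is an illustration only; the function names are hypothetical, and the nonsynonymous example in the usage note is an arbitrary codon pair rather than one taken from the cited studies.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code with codons enumerated in TCAG order
# (TTT, TTC, TTA, TTG, TCT, ...); '*' marks a stop codon.
AMINO_ACIDS = ("FFLLSSSSYY**CC*W"
               "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR"
               "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(codon):
    """Return the one-letter amino acid for a DNA or RNA codon."""
    return CODON_TABLE[codon.upper().replace("U", "T")]


def classify_substitution(ref_codon, alt_codon):
    """Classify a codon change as synonymous, nonsense, or nonsynonymous."""
    ref_aa, alt_aa = translate(ref_codon), translate(alt_codon)
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "nonsense"
    return "nonsynonymous"
```

For the change described above, classify_substitution("AGT", "AGC") returns "synonymous", since both codons encode serine. By contrast, a change such as TCA to TTA (serine to leucine) is classified as nonsynonymous.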

HUMAN GENOME VARIANTS ASSOCIATED WITH THE DEVELOPMENT OF COVID-19 INFECTION

ACE2

Genetic factors of the host organism can contribute to the development and severity of SARS-CoV-2 infection. It was found that angiotensin-converting enzyme 2 (ACE2) is the cellular receptor for SARS-CoV-2 [30, 31]. In the human body, this protein is encoded by the ACE2 gene. ACE2 polymorphisms can affect the affinity and specificity of S protein binding to ACE2 and, therefore, determine the hereditary predisposition to SARS-CoV-2 infection and death from it. These polymorphisms are also associated with the development of arterial hypertension, diabetes mellitus, and cerebral stroke, and patients with these diseases constitute a high-risk group for infection and death from COVID-19. It is noteworthy that ACE2 expression is significantly increased in patients with type 1 and type 2 diabetes treated with ACE inhibitors; that is, treatment with ACE inhibitors increases expression of ACE2. Therefore, increased expression of ACE2 may contribute to an increase in the risk of infection and mortality from COVID-19. A person's susceptibility to SARS-CoV-2 may be a result of the combined effect of the therapy and characteristics of the ACE2 gene polymorphism [32].

On the other hand, according to Chinese studies of the population frequencies of 15 SNPs of the ACE2 gene in various regions of the world, the populations of China, Southeast Asia, and North America are the most susceptible to SARS-CoV-2, while Europeans have the lowest frequencies of these SNPs. For example, rs4646127, localized in an ACE2 intron, has the highest allele frequencies (AF) in the populations of China (0.997) and Southeast Asia (0.994) and the lowest AF in the populations of Europe (0.651) and America (0.754). Moreover, 11 SNPs (AF > 0.05) and one rare variant (rs143695310) were associated with high expression of ACE2 in tissues [33] (Table 2).

Table 2. Polymorphism of human genes associated with the development of acute respiratory coronavirus infection

It was found by the method of RNA sequencing that Asians have higher expression of ACE2 than Europeans and African Americans [34]. A high level of expression of ACE2 was found in the cells of the lungs, liver, kidneys, myocardium, small intestine, and testes, organs with the highest tropism to SARS-CoV-2 [10, 35, 36]. Back in 2005, it was shown that, in the lungs, where SARS-CoV is replicated, the ACE2 gene has different splicing variants [37]. Correlations between the most common (38) ACE2 variants and the level of protein expression in various human tissues were established [38]. It was shown that age, gender, race, and smoking significantly (P = 0.008) affect expression of the ACE2 gene [38]. Our analysis of 2754 ACE2 variants from the database of the European population genomes revealed a lower ratio of missense SNPs in the population of Southern Europe compared with other regions of Europe, which may, in part, explain the higher level of COVID-19 mortality in Spain and Italy. Comparing the frequencies of five ACE2 variants (rs35803318, rs41303171, rs113691336, rs971249, rs2285666) in the Russian and European populations, we found that Russians are similar to other European populations, which indicates a similar level of infection and severity of the disease [39]. These studies may partly explain the main symptoms of COVID-19 infection and its association with the severity of chronic diseases of the organs listed above. Analysis of ACE2 gene expression in 119 cell types and 13 human tissues, as well as the spectrum of ACE2 coexpression with 51 RNA virus coreceptors and 400 membrane proteins of Chinese residents, confirmed expression of ACE2 in the lungs, liver cholangiocytes, cells of the colon and esophagus, epithelial cells of the ileum, rectum, and stomach, and proximal renal tubules. Peptidases ANPEP, DPP4, and ENPEP may be potential SARS-CoV-2 coreceptors with the most similar expression patterns to ACE2 in 13 human tissues [40]. 
Italian scientists conducted whole exome (3984 patients) and whole genome (3284 patients) sequencing to search for genetic factors of the severity of COVID-19 infection. It was found that expression levels of transmembrane serine protease 2 (TMPRSS2) and frequencies of its four SNPs (rs2298659 (p.Gly259Gly), rs17854725 (p.Ile256Ile), rs12329760 (p.Val160Met), and rs3787950 (p.Thr75Thr)) significantly (P < 2.2 × 10⁻¹⁶) differed in Italians compared with East Asian and European populations. Since the lethality from COVID-19 in Italy is among the highest of all populations, TMPRSS2 variants may serve as possible modulators of the severity of the disease in Italians [41]. Expression levels of ACE2 and TMPRSS2 were also studied in CD11b expressed in epithelial cells of the intestine and T cells of patients with inflammatory bowel disease. ACE2 and TMPRSS2 expression was not increased in patient samples compared with controls. The use of antibodies to TNF-α, vedolizumab, ustekinumab, and steroids significantly reduces expression of ACE2 in CD11b cells [42].

IL-4, IL-6, IL-18 and TNF-α

An international team of authors from the United Kingdom, the Netherlands, and Croatia conducted a meta-analysis of 386 studies searching for SNPs associated with the development of tuberculosis, influenza, respiratory syncytial virus, coronavirus, and pneumonia. One SNP (rs2070874) of the IL4 gene was significantly (OR = 1.66, 95% CI 1.29–2.14) associated with the risk of respiratory infections [43] (Table 2). An increased viral load of SARS-CoV was associated with the TT and GT genotypes of rs1946518 (T>G, c.–607, OR = 10.6, 95% CI 2.03–55.0, P = 0.014) of IL18, the TC genotype of rs1800587 (c.–889T, OR = 10.2, 95% CI 1.82–56.8, P = 0.031) of IL1A, and the TT genotype of rs2288918 (+23962T, OR = 7.2, 95% CI 1.47–35.3, P = 0.034) of RelB in 94 patients in Taiwan infected with SARS-CoV. It was also found that the viral load was higher in men (P = 0.0014) and elderly (over 65 years old) patients (P = 0.015). In COVID-19, an increase in the levels of interleukin-6 (IL-6), tumor necrosis factor-α (TNF-α), and interferon-γ is observed [44]. This cytokine storm triggers the severe inflammatory response observed in COVID-19 infection. A number of studies and a meta-analysis including nine studies and 1426 patients with the infection showed that its severity (30.66, 95% CI 7.53–53.78, P = 0.009) and the percentage of lethal outcomes (41.32, 95% CI 28.15–54.49, P < 0.001) positively correlate with the level of IL-6 in patients [45–48]. However, the association of TNFA gene variants with the development of SARS-CoV infection remains controversial. Some studies did not find an association of the TT and CT genotypes of the TNFA c.–204 locus (OR = 0.95, 95% CI 0.90–0.99) in 75 patients with interstitial pulmonary fibrosis during the development of SARS-CoV infection [49]. In contrast, M. Feldmann et al. suggested that therapy with infliximab and adalimumab, drugs against TNF-α, can reduce the intensity of the cytokine storm and inflammation and thus may be more effective in severe COVID-19 [44]. Therefore, the viral load of SARS-CoV, being a predictor of clinical outcomes in patients, is associated with polymorphisms in proinflammatory genes involved in innate immunity, depending on age and sex [50].
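The odds ratios (OR) and 95% confidence intervals (CI) quoted throughout this section are usually derived from a 2 × 2 table of carriers and noncarriers among cases and controls. The following Python sketch shows one common way to compute them, the Wald interval on the logarithmic scale. It is an illustration only: the cited studies may have used other methods (for example, adjusted regression models), and all function and argument names are hypothetical.

```python
import math

Z_95 = 1.96  # normal quantile for a two-sided 95% interval


def odds_ratio_ci(case_carriers, case_noncarriers,
                  control_carriers, control_noncarriers):
    """Return (OR, lower, upper) using the Wald interval on the log scale."""
    odds_ratio = ((case_carriers * control_noncarriers)
                  / (case_noncarriers * control_carriers))
    # Standard error of ln(OR) for a 2 x 2 table.
    se = math.sqrt(1 / case_carriers + 1 / case_noncarriers
                   + 1 / control_carriers + 1 / control_noncarriers)
    lower = math.exp(math.log(odds_ratio) - Z_95 * se)
    upper = math.exp(math.log(odds_ratio) + Z_95 * se)
    return odds_ratio, lower, upper


def se_from_reported_ci(lower, upper):
    """Recover the standard error of ln(OR) from a reported 95% CI."""
    return (math.log(upper) - math.log(lower)) / (2 * Z_95)
```

Because such an interval is symmetric on the logarithmic scale, the OR is the geometric mean of its bounds. This gives a quick consistency check for reported values: for rs2070874, the geometric mean of 1.29 and 2.14 is about 1.66, which matches the reported OR. An interval that includes 1 indicates that the association is not significant at the 5% level.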

CCL2, RANTES, and MBL

Chemokines are involved in inflammation and antiviral immunity. An association of rs1024611 variants (A>G, g.34252769A>G, G-2518A) in the chemokine ligand-2 (CCL2) gene and of variants in the mannose-binding lectin (MBL) gene with the susceptibility to SARS-CoV was established in 932 patients and in 982 individuals constituting the control group (Table 2). The GG rs1024611 genotype is associated with a high level of the CCL2 protein, and the B allele of MBL is associated with a low level of Mbl; both are associated with an increased risk of SARS-CoV infection (P = 1.6 × 10⁻⁴ and 4.9 × 10⁻⁸ for CCL2 and MBL, respectively). At the same time, the combination of variants demonstrated a cumulative effect on the risk of SARS-CoV infection (P = 1.3 × 10⁻¹⁰). No association was found between the polymorphism and disease severity [51]. The rs1800450 A/B (230G>A) polymorphism at codon 54 of the MBL gene was detected in 123 (36.0%) of 352 patients and was significantly (P = 0.00086) associated (OR = 1.73, 95% CI 1.25–2.39) with their susceptibility to coronavirus infection. At the same time, carriers of this variant had a moderate or low expression of the Mbl protein [52]. Since Mbl recognizes mannose residues within carbohydrates and glycoproteins in the envelopes of many bacteria and viruses, stimulating the lectin pathway of complement activation, a decrease in Mbl cellular expression will contribute to a decrease in innate immunity and to the risk of SARS-CoV-2 infection. In another study, an association of the CG (P < 0.0001, OR = 3.28, 95% CI 2.32–4.64) and GG genotypes (P < 0.0001, OR = 3.06, 95% CI 1.47–6.39) of rs2107538 (C>G, c.–28) of RANTES with the susceptibility to SARS-CoV infection was established in 495 Hong Kong SARS-CoV patients and 578 individuals of the control group (Table 2). In 28 patients, the G allele of rs2107538 and the CG (OR = 2.12, 95% CI 1.11–4.06) and GG genotypes (OR = 4.01, 95% CI 1.30–12.4) were also associated with increased mortality. Thus, RANTES is involved in the pathogenesis of SARS-CoV infection [53].

ABO and HLA

The coronavirus can replicate in cells expressing antigens of the ABO group. Epidemiological analysis showed that carriers of blood group I (O) had a low risk (OR = 0.699, 95% CI 0.635–0.770, P < 0.001) of coronavirus (SARS-CoV) infection [54]. French researchers used a CHO cell model to study whether antibodies to ABO antigens could block the interaction of the SARS-CoV S protein with ACE2. For this purpose, the EGFP-labeled S protein was expressed in Chinese hamster ovary cells transfected with α-1,2-fucosyltransferase and α-transferase. Human monoclonal or anti-A antibodies inhibited adhesion of the cells expressing the S protein to the cells expressing ACE2. These data show that anti-A antibodies block the interaction between the virus and its receptor [55]. In 65 patients with coronavirus infection, it was found that an increased risk of infection is associated with the alleles HLA-B*4601 (OR = 2.08, P = 0.04) and HLA-B*5401 (OR = 5.44, P = 0.02), and the severity of the disease is associated with the HLA-B*4601 allele (P = 0.0008) [56]. Interestingly, seven Italian and Spanish COVID-19 pandemic centers conducted whole genome sequencing and analyzed 8 582 968 SNPs in 835 Italians and 775 Spaniards infected with SARS-CoV-2. Significant associations were found for rs11385942 (del>A, g.45834969dup) in the LZTFL1 transcription factor gene (3p21.31 locus; OR = 1.77, 95% CI 1.48–2.11, P = 1.15 × 10⁻¹⁰) and rs657152 (A>C,T, g.133263862A>C) of ABO (9q34.2 locus; OR = 1.32, 95% CI 1.20–1.47, P = 4.95 × 10⁻⁸). At the 3p21.31 locus, the association affects the genes LZTFL1, SLC6A20, CCR9, FYCO1, CXCR6, and XCR1. Analysis of blood groups showed that a higher risk of SARS-CoV-2 infection is associated with blood group II (A) (OR = 1.45, 95% CI 1.20–1.75, P = 1.48 × 10⁻⁴), while blood group I (O) has a protective effect (OR = 0.65, 95% CI 0.53–0.79, P = 1.06 × 10⁻⁵) compared with other blood groups [57].

FcγRIIA and CD14

Researchers from Sydney (Australia) found an association (P = 0.03, OR = 3.2, 95% CI 1.1–9.1) of the homozygous H/H131 genotype of human Fc-γ receptor IIA (FcγRIIA) with the severity and outcome of coronavirus infection in 180 patients from Hong Kong infected with SARS-CoV [58]. In a later work on 152 patients, an association of the CC genotype of rs2569190 (A>G, c.–159CC g.140633331A>G) of the CD14 gene encoding a membrane protein of macrophages with the severity of coronavirus infection was established (P = 0.029, OR = 2.74, 95% CI 1.15–6.57) [59]. These data are consistent with the fact that an excessive level of the secretory sCD14 protein in the blood is associated with inflammation and gram-negative septic shock during the infectious process.

AHSG

It is known that high levels of fetuin-A (AHSG) are observed in obesity and are associated with insulin resistance [60]. In adipose tissue, fetuin-A inhibits adiponectin expression, enhances inflammation, and inactivates macrophages [60]. The gene of the cytochrome P450 4F family (CYP4F3A) encodes ω-oxidase, which inactivates leukotriene B4 (LTB4) in the liver. An association of three SNPs with the susceptibility to SARS-CoV was established in 624 patients from Guangzhou province: rs2248690 (OR = 2.42, 95% CI 1.30–4.51) and rs4917 (OR = 1.84, 95% CI 1.02–3.34) in the AHSG gene and rs3794987 (A>G, g.15640081A>G) in CYP4F3 (Table 2). In addition, rs2248690 affects the transcriptional activity of the AHSG promoter, which regulates the serum level of AHSG [61].

ICAM3

The rs2304237 (c.428A>G) polymorphism of the gene encoding the intercellular adhesion molecule-3 was studied for the susceptibility of carriers of the polymorphism to the disease and outcome of SARS-CoV infection in 817 patients with severe acute respiratory syndrome. No association of the ICAM3 polymorphism with the body’s susceptibility to the coronavirus was found. Nevertheless, in patients with SARS-CoV homozygous at Gly143 of the ICAM3 gene, a high level of lactate dehydrogenase (P = 0.0067, OR = 4.31, 95% CI 1.37–13.56) and a low number of leukocytes (P = 0.022, OR = 0.30, 95% CI 0.10–0.89) were determined, which confirms the role of ICAM3 in the pathogenesis of coronavirus infection [62].

Thus, ACE2 polymorphisms are associated with the dynamics and prevalence, polymorphisms of leukocyte antigen are associated with the susceptibility and severity of SARS-CoV, and genes that regulate signaling pathways through Toll-like receptors are associated with the severity of COVID-19 infection [63].

TARGETED DRUGS AND VACCINES

The main proteins of SARS-CoV-2, namely the S glycoprotein, RNA polymerase, RNA primase, and the main protease (Mpro, 3CLpro), are considered attractive and effective targets for the development of antiviral drugs [21].

3CLpro Inhibitors

The 3C-like protease (3CLpro) is involved in processing of polyproteins translated from viral RNA and is required for viral replication. For example, scientists from the University of Lübeck (Lübeck, Germany), on the basis of X-ray diffraction analysis of SARS-CoV-2 3CLpro, developed α-ketoamide inhibitors with the P3–P2 amide bond incorporated into a pyridone ring to increase their plasma half-life. Evaluating the pharmacokinetic characteristics of the inhibitor, the researchers established its pronounced tropism for lung tissue and its safety for inhalation administration [64]. Since 3CLpro cleaves coronavirus polyproteins at highly conserved cleavage sites, a three-dimensional model of SARS-CoV-2 3CLpro was developed in silico on the basis of the crystal structure of the very similar SARS-CoV ortholog. Using this model, virtual screening of existing antiviral drugs was carried out, and 16 candidates were identified as effective inhibitors. Among them, ledipasvir and velpatasvir proved to be the most attractive for the therapy of COVID-19 infection and had minimal side effects in the form of fatigue and headache (Table 3).

Table 3. Drugs targeted to viral and cellular proteins (according to [28])

Drugs such as Epclusa (velpatasvir/sofosbuvir) and Harvoni (ledipasvir/sofosbuvir) turned out to be particularly effective owing to their dual inhibitory action on two viral enzymes, RNA-dependent RNA polymerase (RdRp) and 3CLpro, with only mild adverse reactions. Teniposide and etoposide (and its phosphate) were selected on the basis of a model of their chemical binding to 3CLpro. However, these drugs have many side effects and must be administered intravenously. The approved drug venetoclax binds well to 3CLpro, but has side effects, including upper respiratory tract infections [65]. Other protease inhibitors, lopinavir and ritonavir, used in the treatment of HIV infection, may reduce viral load and improve the outcome of COVID-19 infection [66].

Drugs Targeted to the S Protein

The S1 domain of the SARS-CoV-2 S glycoprotein contains unique N- and O-linked glycosylation sites that interact with human CD26, which may mask SARS-CoV-2 from the host immune response and contribute to the development of virulence [67]. The S glycoprotein also contains a binding site for ACE2, the so-called receptor-binding motif (RBM, amino acids S432-T486), which is a major antigenic determinant capable of triggering the production of neutralizing antibodies, such as the human monoclonal antibody mAb80R, which competes with ACE2 for binding. Screening of RBM peptide libraries yielded short amino acid sequences (about 40 amino acids) that bind to ACE2 and to the neutralizing antibody mAb80R [1]. An earlier study (2009) showed that the receptor-binding domain (RBD) forms a complex with the Fab fragment of the neutralizing monoclonal antibody F26G19, obtained by immunizing mice with inactivated SARS-CoV. The crystal structure of the complex shows that the RBD surface recognized by F26G19 overlaps with the surface bound by ACE2. It is assumed that F26G19 antibodies neutralize SARS-CoV by blocking the interaction of the virus with host cells [68]. The S protein contains a furin cleavage site at the interface between the S1 and S2 subunits, which distinguishes SARS-CoV-2 from other SARS-CoV viruses. American scientists showed that mouse polyclonal antibodies against the SARS-CoV-2 S protein effectively inhibit the penetration of the virus into cells [69]. There is evidence that angiotensin receptor blockers may be effective in the treatment of coronavirus infection. For example, irbesartan, approved by the FDA for the treatment of hypertension and diabetic nephropathy, inhibits SLC10A1, the sodium/bile acid cotransporter (NTCP), which interacts with the C11orf74 transcriptional repressor, preventing the formation of SARS-CoV Nsp10 [70, 71].
In July–August 2020, the biotechnology company Regeneron Pharmaceuticals (New York, United States) sponsored clinical trials of the efficacy and safety of REGN-COV-2, a combination of two monoclonal antibodies that bind to two sites of the SARS-CoV-2 S protein, in 2000 asymptomatic adult carriers of COVID-19. The effectiveness of REGN-COV-2 is assessed with respect to preventing the development of the infection or of symptoms in infected patients within one month after the administration of REGN-COV-2 or placebo. All participants in the trial will be monitored for safety for seven months after the end of the trial [72].

Drugs Targeted to Lipids and Cholesterol

Another strategy for the therapy of coronavirus infection is the use of drugs that bind to the lipid envelope of viruses. For example, cyclodextrin and sterols reduce the infectiousness of coronaviruses by inhibiting the lipid-dependent attachment of the virus to the target cells of the host organism [81].

Drugs Targeted to the Estrogen Receptor

Overexpression of the estrogen receptor ESR1 was found to inhibit viral replication [82]. Nonsteroidal inhibitors of estrogen receptors, such as toremifene, can effectively block viral infections, including COVID-19, by destabilizing the viral membrane glycoprotein and inhibiting the binding of the viral envelope to the endosomal membrane, which ultimately suppresses viral replication [83]. Toremifene also regulates the expression of the RPL19, HNRNPA1, NPM1, EIF3I, EIF3F, and EIF3E proteins involved in COVID-19 infection [84]. Therefore, toremifene may be a potential drug for the treatment of COVID-19 infection [66].

Vaccines Based on Viral Proteins

The effective containment of coronavirus epidemics in farm animals with vaccines indicates that vaccination has the potential to succeed. The S protein is considered the most promising target for coronavirus vaccines, including an intranasal vaccine against MERS-CoV [85]. This research was initiated by the development of small animal models that efficiently reproduce MERS-CoV transmission and the symptoms of human disease [88]. Currently, several types of vaccines against SARS-CoV are being developed: vector DNA vaccines, live attenuated vaccines, and S protein subunit vaccines against the human MERS-CoV coronavirus [86]. An adenovirus vaccine can effectively induce both a SARS-CoV-specific T cell response and virus-neutralizing antibodies [87]. Both immune responses provide long-term protection for the body against viral infection. In patients who recovered from coronavirus infection, the virus-neutralizing antibody response decreased after about 6 years, while the SARS-CoV-specific T cell response persisted, indicating that the latter is necessary for long-term immunity [88, 89]. Moderna Inc., a biotechnology company based in Cambridge, Massachusetts (United States), together with the National Institute of Allergy and Infectious Diseases (NIAID) of the National Institutes of Health, developed the mRNA-1273 vaccine against SARS-CoV-2. The vaccine is designed to deliver mRNA encoding the SARS-CoV-2 S protein (S-2P) and induces a strong immune response. In the third phase of a randomized, placebo-controlled study on 30 000 adult volunteers in 89 clinical centers in the United States, this vaccine was shown to be safe and immunogenic, i.e., able to induce the production of a high titer of antibodies that neutralize the virus. It is assumed that the immune response will develop and prevent symptomatic manifestations of COVID-19 after two vaccinations [90, 91].
In Russia, clinical trials of the safety and tolerability of two vaccines, Gam-COVID-Vac and Gam-COVID-Vac Lyo, are being conducted on 76 volunteers on the premises of the Sechenov First Moscow State Medical University and the Burdenko Main Military Clinical Hospital. It is reported that the Russian vaccine against SARS-CoV-2 induces an immune response in 100% of volunteers and has no side effects [92]. The Gamaleya National Research Center for Epidemiology and Microbiology developed the Sputnik-V vaccine, which was announced on August 11, 2020, by the President of Russia V.V. Putin. Currently, its clinical trials on 40 000 volunteers are nearing completion. The Vector Center developed the EpiVacCorona vaccine; in October 2020, it was registered by the Russian Ministry of Health. Both vaccines are now undergoing post-approval trials. The Chumakov Center for Research and Development of Immunobiological Preparations, Russian Academy of Sciences, is developing its own vaccine against SARS-CoV-2, which is currently at the stage of clinical trials [93].

CONCLUSIONS

The severity of COVID-19 infection depends on the polymorphism of the virus genome and on some alleles of the human genome. Mutations in viral RNA polymerase (ORF8, 241C>T, S84L, 14408C>T, C251T), RNA primase (P323L), the S protein (23403A>G, D614G), and the structural proteins (N, E) increase the immunogenicity of these proteins with respect to the T cell immune response, which may be associated with the increased transmissibility and severity of COVID-19 infection in European countries. Mutations in the genes of key viral proteins alter the affinity and specificity of targeted drugs, providing the molecular basis for differences in morbidity and mortality, as well as in the body's response to antiviral drugs or vaccines. SARS-CoV-2 was established to enter the cell via the transmembrane receptor ACE2. In turn, expression of the ACE2 gene varies depending on the age, gender, ethnicity, and smoking habits of the patient. The peptidases ANPEP, DPP4, and ENPEP can serve as ACE2 coreceptors for SARS-CoV-2 in human tissues. Features of ACE2 gene expression may partly explain the main symptoms of COVID-19 infection and its association with chronic diseases. An association was established between the severity of COVID-19 infection and loci (HLA-B*4601), genes (FcγRIIA, MBL, TMPRSS2, TNFA, IL6, etc.), and the A antigen of human blood groups. Thus, genetic testing makes it possible not only to identify COVID-19 infection in patients but also to form new risk groups based on both the presence of certain chronic diseases and genetic predisposition or susceptibility to SARS-CoV-2 infection, as well as to personalize medical rehabilitation programs.