The role of SARS-CoV-2 nucleocapsid protein in antiviral immunity and vaccine development

ABSTRACT
The coronavirus disease 2019 (COVID-19) has caused enormous health risks and global economic disruption. This disease is caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). The SARS-CoV-2 nucleocapsid protein is a structural protein involved in viral replication and assembly. There is accumulating evidence indicating that the nucleocapsid protein is multi-functional, playing a key role in the pathogenesis of COVID-19 and antiviral immunity against SARS-CoV-2. Here, we summarize its potential application in the prevention of COVID-19, which is based on its role in inflammation, cell death, antiviral innate immunity, and antiviral adaptive immunity.


Introduction
The coronavirus disease 2019 (COVID-19) has caused global economic disruption and enormous health risks. According to the World Health Organization, as of 15 September 2022 there were more than 607 million confirmed cases and over 6.49 million deaths worldwide [1]. This disease is caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). The typical clinical symptoms of COVID-19 include fever, dry cough, dyspnea, muscle pain, and fatigue, and severe cases can be fatal. Besides the respiratory system, COVID-19 may affect multiple organs and tissues. Intriguingly, the involvement of other organs and tissues may be independent of the severity of the respiratory disease [2].
SARS-CoV-2 is an enveloped, positive-sense, single-stranded RNA virus that belongs to the β-coronavirus genus, as do the Middle East respiratory syndrome coronavirus (MERS-CoV) and SARS-CoV. The other human-infecting coronaviruses (HCoV) are the low-pathogenicity members HCoV-OC43, HCoV-NL63, HCoV-HKU1, and HCoV-229E. Many host proteins are essential for viral entry. Among them is the transmembrane serine protease 2 (TMPRSS2), which cleaves and thereby activates the SARS-CoV-2 surface glycoprotein, the spike (S) protein. The activated S protein recognizes and binds to angiotensin-converting enzyme 2, the SARS-CoV-2 receptor on the host cell. The binding of the S protein to angiotensin-converting enzyme 2 induces the fusion of the virus with the host cell membrane [3]. After SARS-CoV-2 enters the host cell, RNA-dependent RNA polymerase replicates and transcribes the ∼30 kb viral genome, leading to the production of four structural proteins, i.e., the envelope (E), membrane (M), nucleocapsid (N), and S proteins, nine putative accessory proteins, and 16 non-structural proteins (Nsp1-Nsp16) [4].
The main function of the N-protein is to package the viral genomic RNA into a ribonucleoprotein complex, which prompts the M and E proteins to initiate viral assembly [4]. There is abundant evidence that the N-protein is multi-functional. Here, we focus on its function in inflammation, antiviral innate immunity, antiviral adaptive immunity, and cell death. The N-terminal domain (NTD) and C-terminal domain (CTD) of the N-protein are flanked by the intrinsically disordered N-arm and C-terminus, respectively (Figure 1(A)) [5,6]. The NTD of SARS-CoV-2 N-protein has a right-handed fist shape with a four-stranded antiparallel β-sheet core subdomain, sitting between short 3₁₀ helices (loops) and a protruding β-hairpin region formed by the β3 and β4 strands out of the core (Figure 1(B)) [5,6]. Comparison of its structure with the N-protein NTD structures of other human-infecting coronaviruses indicates that the orientations of the N-terminal loops, the top tip of the protruding region, and the bottom of the core subdomain are distinct (Figure 1(C)) [5,6]. The SARS-CoV-2 N-protein CTD forms a tight homodimer with an overall rectangular slab shape. Each protomer contains five α-helices, two antiparallel β-strands, and two 3₁₀ helices (Figure 1(D)) [6,7]. The overall structure of the SARS-CoV-2 N-protein CTD is highly similar to the available N-protein CTD structures of SARS-CoV, MERS-CoV, and HCoV-NL63 (Figure 1(E)). Interestingly, the electrostatic potential surfaces at the helical side of the dimer show distinct features [6,7], suggesting novel roles of the SARS-CoV-2 N-protein CTD in viral RNA binding and transcriptional regulation. These unique structural characteristics might contribute to the unique roles of the SARS-CoV-2 N-protein in COVID-19 pathogenesis, as discussed below.
The SARS-CoV-2 N-protein has a net positive charge, which allows it to bind the negatively charged viral genomic RNA and assemble it into a higher-order structure. After binding to RNA, the SARS-CoV-2 N-protein readily undergoes liquid-liquid phase separation (LLPS), forming liquid-like droplets or condensates. LLPS of the SARS-CoV-2 N-protein depends on the RNA sequence and structure [8]. This process is enhanced at body temperature and modulated by ionic strength and RNA concentration [8,9]. The intrinsically disordered N-arm, the central linker region, and the C-terminal dimerization domain are essential for robust LLPS [9].
Intriguingly, the SARS-CoV-2 N-protein is found in the serum within two weeks after the onset of symptoms or diagnostic PCR. The serum N-protein level is correlated with a COVID-19 patient's inflammatory response, tissue damage, coagulation, and disease severity [19–21]. Extracellular SARS-CoV-2 N-protein also contributes to inflammation by regulating complement, and COVID-19 patients have been reported to show complement hyperactivation [22,23]. The N-proteins from SARS-CoV, MERS-CoV, and SARS-CoV-2 recruit and activate mannose-binding protein-associated serine protease 2 (MASP-2), a key factor in the lectin pathway of complement activation. Blocking the interaction between N-protein and MASP-2, or depleting MASP-2, can alleviate N-protein-induced complement hyperactivation and lung injury in vitro and in vivo [22,23]. Furthermore, when the SARS-CoV-2 N-protein is added to the culture medium of human primary lung microvascular endothelial cells, it significantly induces the expression of intercellular adhesion molecule 1 and vascular cell adhesion protein 1, which are markers of endothelial cell activation [24]. Simultaneously, the NF-κB and MAPK pathways are activated [24]. Even though N-proteins from different coronaviruses are highly conserved in protein sequence, N-proteins from HCoV-HKU1, SARS-CoV, and MERS-CoV have no such role in endothelial cells [24]. CU-CPT22, an antagonist of the plasma membrane-bound pattern recognition receptor Toll-like receptor 2, blocked SARS-CoV-2 N-protein-induced endothelial cell activation [24]. Additionally, incubation of macrophages with SARS-CoV-2 N-protein induces M1 polarization and the production of proinflammatory cytokines [25,26]. Also, incubation of human arterial fibroblasts with SARS-CoV-2 N-protein leads to altered expression of the receptor for the globular head of C1q, which links N-protein-induced inflammation with thrombosis in the vascular system [27].
SARS-CoV-2 N-protein also up-regulates the expression of tissue factor and intercellular adhesion molecule 1, markers involved in vascular coagulation and inflammation, and glucose transporter 4, a diabetic marker [27]. Lastly, complement activation and signaling through Toll-like receptor 2 might be involved both in the increased levels of IL-6 in the lungs of mice inoculated nasally with SARS-CoV-2 N-protein and in the acute lung injury induced in mice given SARS-CoV-2 N-protein intratracheally [26,28]. Indeed, the lung injury is associated with NF-κB p65 phosphorylation. Accordingly, the NF-κB inhibitor pyrrolidine dithiocarbamate alleviates SARS-CoV-2 N-protein-induced lung injury [25].

Modulation of cell death by SARS-CoV-2 N-protein
Although SARS-CoV-2 infection promotes NLRP3 inflammasome activation, IL-1β secretion and cell death are barely detectable early after infection. This is because the SARS-CoV-2 N-protein inhibits IL-1β secretion and pyroptosis while promoting the cleavage of pro-IL-1β [36]. Yeast two-hybrid screening of SARS-CoV-2 N-protein interaction candidates identified GSDMD, which is a substrate of caspase-1 and is involved in IL-1β secretion and pyroptosis. The binding of SARS-CoV-2 N-protein to the GSDMD linker prevents GSDMD from being cleaved by active caspase-1 [36]. Interestingly, SARS-CoV-2 N-protein binds to NLRP3 and promotes the activation of caspase-1, which can cleave GSDMD [15]; on the other hand, the N-protein can bind to GSDMD to stop pyroptosis in the same pathway [36]. GSDMD-mediated pyroptosis restricts viral production. The binding of SARS-CoV-2 N-protein to GSDMD gives SARS-CoV-2 enough time to reproduce and spread before the immune system attacks, which might explain the long asymptomatic phase of SARS-CoV-2 infection. However, SARS-CoV-2 will eventually enter the lytic phase. Because of the SARS-CoV-2 N-protein-NLRP3 interaction, the large store of mature IL-1β in the cytosol will lead to the release of abundant inflammatory cytokines, which cause tissue injury. Therefore, the SARS-CoV-2 N-protein-NLRP3 interaction and the SARS-CoV-2 N-protein-GSDMD interaction act synergistically to contribute to COVID-19 pathogenesis.
A role for SARS-CoV-2 N-protein in apoptosis has also been identified [37]. SARS-CoV-2 M-protein directly induces apoptosis in Vero E6 monkey kidney epithelial cells and HepG2 human hepatocellular carcinoma cells [37]. Mechanistically, M-protein binds to 3-phosphoinositide-dependent protein kinase-1 (PDK1) and thereby hinders the interaction between PDK1 and its substrate protein kinase B (PKB, also known as Akt) [37]. The SARS-CoV-2 N-protein does not directly induce apoptosis in Vero E6 or HepG2 cells [37]. Nevertheless, SARS-CoV-2 N-protein enhances M-protein-induced apoptosis by binding to both M-protein and PDK1 and strengthening M-protein-mediated attenuation of the PDK1-PKB/Akt interaction [37].
SARS-CoV-2 N-protein is also implicated in uncharacterized types of cell death [35,38]. SARS-CoV-2 can infect neurons, and in vitro the SARS-CoV-2 N-protein significantly speeds up the aggregation of α-synuclein protein, a key feature of Parkinson's disease [38]. Microinjection of SARS-CoV-2 N-protein into a neuronal cell model (SH-SY5Y cells) leads to less vesicle-bound α-synuclein protein and more cell death [38].
SARS-CoV-2 N-protein binds to RIG-I through the DExD/H domain of RIG-I, which hinders the recognition of viral RNA [48]. Furthermore, the interaction between SARS-CoV-2 N-protein and RIG-I can block the recruitment of tripartite motif protein 25, an E3 ligase that mediates K63-ubiquitination and subsequent activation of RIG-I, in a mode similar to N-proteins from MERS-CoV and SARS-CoV [29–31]. In addition, N-proteins of SARS-CoV and SARS-CoV-2, but not MERS-CoV, target G3BP1 to block stress granule formation [32,39], which prevents cofactors from enhancing RIG-I activation after SARS-CoV-2 infection [39]. The modulation of stress granules by SARS-CoV-2 N-protein has additional effects on RIG-I signaling. Even though one study observed no interaction between SARS-CoV-2 N-protein and MAVS [31], a study by a different group found that SARS-CoV-2 N-protein binds to MAVS in an LLPS-dependent manner, which inhibits the binding of MAVS to RIG-I and to tripartite motif protein 31, an E3 ligase that mediates K63-ubiquitination and subsequent activation of MAVS [40].
Once produced, type I interferons bind to their receptors and activate intracellular tyrosine kinases, resulting in the phosphorylation and activation of signal transducer and activator of transcription (STAT) 1/2. The SARS-CoV-2 N-protein hinders type I interferon-induced STAT1/2 phosphorylation and nuclear translocation. Mechanistically, SARS-CoV-2 N-protein binds to STAT1/2 and blocks their interaction with upstream tyrosine kinases [30,41]. Intriguingly, SARS-CoV N-protein does not have such a role [34].
In mammals, antiviral innate immunity may be directed by proteins or small RNAs [33]. After viral infection and replication, the virus-derived double-stranded RNAs are recognized and cleaved by Dicer. Following cleavage by Dicer, the resulting small RNAs are loaded into the RNA-induced silencing complex. Argonaute-clade endonucleases are core components of the RNA-induced silencing complex, which uses the diced RNA as a guide to bind and cleave viral RNA [33]. N-proteins from coronaviruses such as SARS-CoV and MERS-CoV have been identified as viral suppressors of this mechanism [33]. Similarly, SARS-CoV-2 N-protein interacts with virus-derived double-stranded RNAs and sequesters them to suppress this pathway [42,43], thereby acting as a critical immune evasion factor of SARS-CoV-2.
Finally, the human humoral fluid-phase pattern recognition molecule long pentraxin 3 binds to SARS-CoV-2 N-protein but shows no antiviral activity. Long pentraxin 3 is abundantly expressed by blood and lung myeloid cells. The serum concentration of long pentraxin 3 is associated with mortality in COVID-19 patients. The role of long pentraxin 3 in SARS-CoV-2 N-protein-mediated complement activation and cytokine production remains to be determined [49].

SARS-CoV-2 N-protein activates adaptive immunity
SARS-CoV-2 N-protein-specific CD4+ T cells are detected in most non-hospitalized and previously hospitalized COVID-19 subjects [50,51]. These CD4+ T cells express CD45RO, a marker of antigen experience [50]. A synchronous, multipolar, and antigen-specific T helper (Th) response can be detected immediately after a mild COVID-19 infection [51]. This response involves IL-2 production by T cells and correlates with virus-neutralizing antibody titers [52]. Also involved in the Th response are phenotypically stable Th1 or circulating follicular Th cells, which persist for months [50,51]. In a portion of these subjects, the increased Th responses from memory circulating follicular Th cells correlated with sustained antibody production over time [50]. However, SARS-CoV-2 N-protein-specific Th1 cells are not always protective, because even participants with higher frequencies of N-specific IFNγ+ CD4+ T cells still required ICU care [53]. Furthermore, long COVID patients with neurological symptoms showed enhanced activation of follicular Th cells as well as more Th1 cells, which is linked with increased levels of SARS-CoV-2 N-protein-specific antibodies [54].
Indeed, increased levels of SARS-CoV-2 N-protein-specific antibodies are correlated with disease severity [54–58]. However, opposite trends have also been reported. In patients with cancer, the level of SARS-CoV-2 N-protein-specific antibodies is not correlated with peak viral load during acute COVID-19. Rather, a lower level of SARS-CoV-2 N-protein-specific antibodies is correlated with prolonged viral load [59]. Accordingly, a higher level of antibody against a specific epitope in SARS-CoV-2 N-protein is associated with moderate cases [58]. Thus, epitope-specific antibody responses to SARS-CoV-2 N-protein differentiate COVID-19 outcomes. Recently, it has been reported that extracellular SARS-CoV-2 N-protein released by SARS-CoV-2-infected cells binds to heparan sulfate and heparin on neighbouring cells through high-affinity electrostatic interactions. SARS-CoV-2 N-protein binds with high affinity to eleven human chemokines and thereby may sequester these chemokines. SARS-CoV-2 N-protein-specific antibodies bind to cell surface N-protein and thereby activate Fc receptor-expressing cells. These effects might either augment or reduce the severity of COVID-19 [60]. Nevertheless, the level of SARS-CoV-2 N-protein-specific antibodies declines rapidly during convalescence. By contrast, SARS-CoV-2 N-protein-specific IgM+ or IgG+ memory B cells continue to rise until 150 days [61]. B cell depletion is strongly correlated with prolonged SARS-CoV-2 viral load [59,62]. Thus, B cell memory is very important for COVID-19 control.

SARS-CoV-2 N-protein as a target for vaccine development
Most authorized SARS-CoV-2 vaccines are based on the S-protein of the ancestral SARS-CoV-2, which elicits neutralizing antibody responses. However, Omicron contains many mutations in the S-protein, especially in the receptor-binding domain, which is the major target of neutralizing antibody responses. It is not surprising that Omicron escapes the immune response to these vaccines. Vaccination of BALB/c mice by different routes using SARS-CoV-2 N-protein elicited robust antibody and IFNγ production [56,68]. The SARS-CoV-2 N-protein has become a target for vaccine development because the majority of its putative T cell-specific epitopes are conserved in the variants of concern (Table 2) [69–81].
Even though one study reported that protection correlated with antibodies against the N-protein [73], it is unlikely that antibodies are essential for protection against infection by ongoing SARS-CoV-2 variants. There are at least three reasons for this view. (1) Immunization with N-protein mRNA induced modest control of mouse-adapted SARS-CoV-2, as indicated by viral load and body weight in BALB/c mice and Syrian hamsters. In BALB/c mice, dual S- and N-protein mRNA vaccination lowered viral load. In Syrian hamsters, dual S- and N-protein mRNA vaccination induced not only more robust control of the Delta and Omicron variants in the lungs but also increased protection in the upper respiratory tract. Under these conditions, robust N-specific binding antibodies were detected without neutralizing activity [81]. (2) The virus-neutralizing antibodies elicited by the multiantigen COH04S1 vaccine, which uses a modified vaccinia Ankara (MVA) vector, showed reduced cross-reactivity, yet vaccination with COH04S1 promoted robust resistance to the Beta and Delta variants [70]. (3) SpiN, a protein vaccine in which full-length N-protein is fused with the receptor-binding domain (RBD) of the S-protein, protected K18-ACE2 transgenic C57BL/6 mice against infection with the Delta and Omicron variants, although without detectable neutralizing antibodies [76].
Table 2 (continued)
No vaccine-related serious adverse events were recorded after intramuscular vaccination. The most common solicited adverse events were injection site pain and fatigue, mostly mild and transient. | N/A | Clinical studies: no information about N-protein-specific antibodies; potent neutralizing titers against ancestral SARS-CoV-2 virus, Delta, Omicron, and other variants of concern; the virus-neutralizing antibodies were long-lasting, as revealed with the live ancestral SARS-CoV-2 virus. | Clinical studies: no information about N-protein-specific T cells; restimulation of peripheral blood mononuclear cells with designer peptide antigens revealed a long-lasting, robust, Th1-predominant cell response as measured by IFN-γ and IL-4 ELISpot. | [80]
10 | N/A (a nucleoside-modified mRNA vaccine) | An mRNA vaccine expressing full-length N-protein from the SARS-CoV-2 ancestral virus, alone or in combination with the clinically proven S-expressing mRNA vaccine. | N/A | Intramuscular, but not intranasal, immunization of BALB/c mice and Syrian hamsters with N-protein mRNA induced modest control of mouse-adapted SARS-CoV-2, as indicated by viral load and body weight. In BALB/c mice, dual S- and N-protein mRNA vaccination further lowered viral load. In Syrian hamsters, dual S- and N-protein mRNA vaccination not only induced more robust control of the Delta and Omicron variants in the lungs but also provided enhanced protection in the upper respiratory tract. | Preclinical study: intramuscular vaccination with N-protein mRNA alone induced robust binding antibodies to N-protein but no neutralizing activity; dual S- and N-protein mRNA vaccination augmented serum neutralizing antibody activities.

EMERGING MICROBES & INFECTIONS
Because the T cell epitopes in SARS-CoV-2 N-protein are highly conserved, N-protein-specific T cells show equivalent cross-reactivity against ancestral SARS-CoV-2 and variants of concern (Table 2) [69–79]. Thus, these N-protein-specific T cells may constitute a critical second line of defense for providing long-term protection against SARS-CoV-2 variants. Indeed, in vivo CD8+ T cell depletion in Syrian hamsters after vaccination with N-protein mRNA identifies a role for CD8+ T cells in viral control as well as protection against body weight loss following Omicron challenge [81]. From this point of view, the use of T cell-specific epitopes in the SARS-CoV-2 N-protein may be a better choice than the full-length protein for vaccine development. This strategy may also avoid the possible adverse effects of binding antibodies to N-protein [54–58]. Indeed, one of the 10 vaccines that use N-protein employed the strategy of rationally designed T cell epitopes [80]. Although information about N-protein-specific T cells is unavailable, long-lasting T cell immunity against the Delta and Omicron variants has been observed [80]. This warrants a large-scale field trial for evaluation.

Conclusions
The SARS-CoV-2 N-protein plays pivotal roles in inflammation, cell death, innate antiviral immunity, and adaptive antiviral immunity. Further research is required to clarify how the SARS-CoV-2 N-protein modulates inflammation, cell death, and innate antiviral immunity. It is important to identify the key motifs involved. Additionally, even though SARS-CoV-2 N-based vaccines showed protective effects, it should be noted that SARS-CoV-2 N-protein-specific T cells may cause inflammation-related injury, especially after repeated boosting. Thus, it is essential to identify protective T cell-specific epitopes in the SARS-CoV-2 N-protein according to different HLA genotypes.

Disclosure statement
No potential conflict of interest was reported by the author(s).

Funding
This work was supported by the R&D Program of Guangzhou Laboratory (SRPG22-006), the College Students' Innovative Entrepreneurial Training Plan Program (202210487079), and Huazhong University of Science and Technology, HUST Academic Frontier Youth Team (2018QYTD10).