The Mechanism of SARS-CoV-2 Nucleocapsid Protein Recognition by the Human 14-3-3 Proteins


Introduction
The new coronavirus-induced disease, COVID19, has caused a worldwide health crisis with more than 90 million confirmed cases and 1.9 million deaths as of January 2021. 1 The pathogen responsible, Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2), is highly similar to the causative agent of the SARS outbreak in 2002-2003 (SARS-CoV) and, to a lesser extent, to the Middle East Respiratory Syndrome Coronavirus (MERS-CoV). 2,3 Each is vastly more pathogenic and deadly than the human coronaviruses HCoV-OC43, HCoV-NL63, HCoV-229E, and HCoV-HKU1, which cause seasonal respiratory diseases. 4 Like SARS-CoV and HCoV-NL63, SARS-CoV-2 uses angiotensin-converting enzyme 2 (ACE2) as its entry receptor. 5 ACE2 expression roughly correlates with the evidenced SARS-CoV-2 presence in different tissue types, which explains the multi-organ character of the disease 6 (Figure 1).
In contrast to multiple promising COVID19 vaccine clinical trials, [7][8][9] treatment of the disease is currently limited by the absence of approved efficient drugs. 10 The failure of several leading drug candidates in 2020 warrants the search for novel therapeutic targets including not only viral enzymes, but also heterocomplexes involving viral and host cell proteins. Unravelling mechanisms of interaction between the host and pathogen proteins may provide the platform for such progress.
Figure 1. 14-3-3 proteins are highly abundant in human tissues with SARS-CoV-2 presence. Correlation of ACE2 expression levels (*) and SARS-CoV-2 reported presence (**) in various tissues of COVID19 patients based on the data from Trypsteen et al., 6 shown with abundances (indicated in ppm, parts per million, i.e., one molecule of a given protein per 1 million of all proteins from a given tissue) of the seven human 14-3-3 isoforms, extracted from the PAXdb database. 40 Tissues are shown in the order corresponding to the SARS-CoV-2 presence, starting, at the top, from the highest virus presence. 6 The shown relative scale of ACE2 expression is also taken from Trypsteen et al. 6 The total abundance of the seven human 14-3-3 isoforms in a given tissue and the average abundance of an isoform in 12 selected tissues are also indicated. The latter values were used for ordering the data for 14-3-3 isoforms, left to right, from the highest average abundance (14-3-3ζ; 2423 ppm, or ~0.24%) to the lowest average abundance (14-3-3η; 575 ppm, or ~0.06%). Note that the average abundance of all seven 14-3-3 proteins in three tissues with the highest SARS-CoV-2 presence (oral cavity, gastrointestinal tract, lungs) reaches 1.21% of all proteins.

The positive-sense single-stranded RNA genome of SARS-CoV-2 coronavirus encodes approximately 30 proteins which enable cell penetration, replication, viral gene transcription and genome assembly amongst other functions. 11 The 46-kDa SARS-CoV-2 Nucleocapsid (N) protein is 89.1% identical to SARS-CoV N. Genomic analysis of human coronaviruses indicated that N might be the major factor conferring the enhanced pathogenicity to SARS-CoV-2. 4 N represents the most abundant viral protein in the infected cell, [12][13][14] with each assembled virion containing approximately one thousand molecules of N. 15 Given that each infected cell can contain up to 10^5 virions (infectious, defective and incomplete overall), 14 the number of N molecules in an infected cell can reach 10^8, accounting for ~1% of the total number of cellular proteins (~10^10). 16 The N protein interacts with viral genomic RNA, the membrane (M) protein and self-associates to provide for the efficient virion assembly. [17][18][19] It consists of two structured domains and three regions predicted to be disordered (Figure 2(A)), including a functionally important central Ser/Arg-rich region [20][21][22] and a set of potential protein-binding sites (Figure 2(B)). Such organization allows for a vast conformational change, which in combination with positively charged surfaces, 23 facilitates nucleic acid binding. 24 Indeed, crystal structures of the N-terminal domain (NTD) reveal an RNA binding groove, [25][26][27] while crystal structures of the C-terminal domain (CTD) show a highly interlaced dimer with additional nucleic acid binding capacity. 28,29 The N protein shows unusual properties in the presence of RNA, displaying concentration-dependent liquid-liquid phase separation 22,23,30,31 that is pertinent to the viral genome packaging mechanism. 32,33 In human cells, the assembly of condensates is down-regulated by phosphorylation of the SR-rich region. 30,34 SARS-CoV-2 N protein is a major target of phosphorylation by host cell protein kinases, with 22 phosphosites identified in vivo throughout the protein (Supplementary table 1). 13,35 Functions of N and viral replication can be regulated by a complex, hierarchical phosphorylation of the SR-rich region of SARS-CoV-2 N by a cascade of protein kinases. 36 Nevertheless, the potential functional role of N phosphorylation at each specific site is not understood.
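The copy-number estimate for N given above is a simple product and ratio, and it can be checked with a short sketch. The three inputs are the order-of-magnitude values quoted in the text (up to 10^5 virions per cell, about 10^3 N molecules per virion, about 10^10 proteins per cell); the function name `n_protein_share` is hypothetical and used only for illustration.

```python
# Back-of-the-envelope check of the N copy-number estimate quoted in the text.
# All defaults are the order-of-magnitude values cited there, not measurements.

def n_protein_share(virions_per_cell=1e5, n_per_virion=1e3, proteins_per_cell=1e10):
    """Return (N molecules per infected cell, fraction of all cellular proteins)."""
    n_molecules = virions_per_cell * n_per_virion
    return n_molecules, n_molecules / proteins_per_cell


if __name__ == "__main__":
    molecules, share = n_protein_share()
    print(f"N molecules per cell: {molecules:.0e}")
    print(f"Share of cellular proteins: {share:.1%}")
```

Because every input is an upper-bound or rounded figure, the result should be read as an order-of-magnitude estimate rather than a precise fraction.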
Using immunofluorescence, immunoprecipitation, siRNA silencing and kinase inhibition, it has been shown that SARS-CoV N protein shuttles between the nucleus and the cytoplasm in COS-1 cells. 37 This process is regulated by N protein phosphorylation by several protein kinases including glycogen synthase kinase-3, protein kinase A, casein kinase II and cyclin-dependent kinase. [37][38][39] Consequently, phosphorylated N associates with 14-3-3 proteins in the cytoplasm. 37 Notably, treatment with a kinase inhibitor cocktail eliminated the N/14-3-3 interaction, whereas inhibition of 14-3-3θ expression by siRNA led to accumulation of N protein in the nucleus. 37 These data suggest that 14-3-3 proteins directly shuttle SARS-CoV N protein in a phosphorylation-dependent manner: a role which may be universal for N proteins of all coronaviruses, including SARS-CoV-2. However, the molecular mechanism of the 14-3-3/N interaction remains ill-defined.
14-3-3 proteins are amongst the top 1% of highest-expressed human proteins in many tissues, with particular abundance in tissues vulnerable to SARS-CoV-2 infection including the lungs, gastrointestinal system and brain 40,41 (Figure 1). 14-3-3 proteins recognize hundreds of phosphorylated partner proteins involved in a multitude of cellular processes ranging from apoptosis to cytoskeleton rearrangements. 42,43 14-3-3 proteins are present in most tissues as seven conserved "isoforms" (β, γ, ε, ζ, η, σ, τ/θ) (Figure 1), with all-helical topology, forming dimers possessing two identical antiparallel phosphopeptide-binding grooves, located at ~35 Å distance from each other. 44 By recognizing phosphorylated Ser/Thr residues within the structurally flexible (R/K)X2-3(pS/pT)X(P/G) consensus motif, 44,45 14-3-3 binding is known to regulate the stability of partner proteins, their intracellular localization and interaction with other factors. 46 In addition to their high abundance in many tissues susceptible to SARS-CoV-2 infection (Figure 1) and a detectable increase of expression of some 14-3-3 isoforms upon SARS-CoV-2 infection, 12 14-3-3 proteins were reported as one of nine key host proteins during SARS-CoV-2 infection. 47 These data indicate a potential association of 14-3-3 with viral proteins.
In this work, we dissected the molecular mechanism of the interaction between SARS-CoV-2 N and human 14-3-3 proteins. SARS-CoV-2 N protein containing several phosphosites reported to occur during infection was produced using the efficient Escherichia coli system that proved successful for the study of polyphosphorylated proteins. 48 We have observed the direct phosphorylation-dependent association between polyphosphorylated SARS-CoV-2 N and all seven human 14-3-3 isoforms and determined the affinity and stoichiometry of the interaction. A series of truncated mutants of N localized the key 14-3-3-binding site to a single phosphopeptide residing in the functionally important SR-rich region of N. These findings suggest a topology model for the heterotetrameric 14-3-3/pN assembly occluding the SR-region, which presents a feasible target for further characterization and therapeutic intervention.

Figure 2. Characterization of the SARS-CoV-2 N phosphoprotein. A. Prediction of the propensity to intrinsic disorder (ID) and protein-binding regions across the SARS-CoV-2 N sequence made by DISOPRED 3. 87 Scores higher than 0.5 designate disorder. Protein-binding regions predicted by DISOPRED 3 87 are shown at scores higher than 0.9. Two structured protein domains are also shown. B. Schematic representation of the N sequence and its main features. The subdomains are named above (grey font), predicted Nuclear Export Signal (NES, yellow), Nuclear Localization Signal (NLS, violet), N- and C-terminal domains (NTD, CTD) and two main phosphorylation loci (cyan rectangles) are marked. Sequences of the SR-rich region and short phosphorylatable section of the CTD are shown aligned between SARS-CoV and SARS-CoV-2 N proteins, highlighting multiple conserved phosphorylation sites (bold font indicates phosphosites present in both N proteins and experimentally confirmed for SARS-CoV-2 N). Some phosphosites (red spheres and labeled) are predicted as suboptimal 14-3-3-binding sites (green arrows). C. A Phostag gel showing that bacterial co-expression of the full-length N with PKA yields polyphosphorylated protein. D. A fragmentation spectrum of the representative phosphopeptide carrying phosphorylation at Ser197 and Thr205. z- (red) and c-series (blue) of ETD fragmentation are shown. Error did not exceed 5 ppm. E. Absorbance spectra show that both recombinant unphosphorylated and PKA-phosphorylated N proteins elute from the Ni-affinity column bound to random E. coli nucleic acid. On-column washing with 3 M NaCl (50 column volumes) eliminates bound nucleic acid. F. Analysis of the oligomeric state of pN using size-exclusion chromatography (SEC) on a Superdex 200 Increase 10/300 column at 200 mM NaCl, with multi-angle light scattering detection (SEC-MALS) and SDS-PAGE of the eluted fractions. Flow rate was 0.8 ml/min. Apparent Mw determined from column calibration based on protein standards (arrows above) is compared to the absolute mass determined from SEC-MALS. The experiments were carried out three times, and the most typical results are shown.

Results
Characterization of the polyphosphorylated N protein obtained by co-expression with PKA in E. coli
Host-expressed SARS-CoV-2 N protein is a phosphoprotein, harboring multiple phosphorylation sites scattered throughout its sequence. The most densely phosphorylated locus is the SR-rich region (Figure 2(B), Supplementary table 1, Supplementary data file 1). 13,35 Remarkably, this region is conserved in N proteins of several coronaviruses, 39,49 including SARS-CoV (Figure 2(B)). Although a number of protein kinases have been implicated in SARS-CoV N phosphorylation, 36,37,39,50 the precise enzymes responsible for identified phosphosites and the functional outcomes are largely unknown. Of note, many of the reported phosphosites within the SARS-CoV-2 N protein are predicted to be phosphorylated by protein kinase A (PKA) (Supplementary table 1). Hence, PKA was used for production of phosphorylated SARS-CoV-2 N in E. coli, 48 using the same approach that was successfully applied for production of several phosphorylated eukaryotic proteins 48,51-54 including the polyphosphorylated human tau competent for specific 14-3-3 binding. 48 Indeed, co-expression with PKA yielded a heavily phosphorylated SARS-CoV-2 N (Figure 2(C)) containing more than 20 phosphosites according to LC-MS and MALDI analysis (Supplementary table 1 and Supplementary data file 1). Especially dense phosphorylation occurred within the SR-rich region, involving recently reported in vivo sites Ser180, Ser194, Ser197, Thr198, Ser201, Ser202 and Thr205 13,35 (Figure 2(B) and (D) and Supplementary data file 1), indicating that PKA co-expression successfully emulated native phosphorylation.
Interestingly, due to the frequent occurrence of Arg residues, it was possible to characterize the polyphosphorylation of the SR-rich region only with the use of an alternative protease such as chymotrypsin, in addition to datasets obtained separately with trypsin (4 independent experiments overall, see Supplementary data file 1). Due to high conservation between SARS-CoV and SARS-CoV-2 N proteins, many phosphosites identified in the PKA-co-expressed N are likely shared by SARS-CoV N (Figure 2(B)). Importantly, many identified phosphosites lie within the regions predicted to be disordered (Figure 2(A)), and contribute to predicted 14-3-3-binding motifs, albeit deviating from the optimal 14-3-3-binding sequence RXX(pS/pT)X(P/G) 44 (Figure 2(B) and Supplementary table 1).
Of note, the bacterially expressed SARS-CoV-2 N protein avidly binds random E. coli nucleic acid, which results in a high 260/280 nm absorbance ratio in the eluate from the nickel-affinity chromatography column. This is unchanged by polyphosphorylation (Figure 2(E)) and nucleic acid remained bound even after further purification using heparin chromatography (data not shown). To quantitatively remove nucleic acid from N preparations, we used continuous on-column washing of the His-tagged protein with 3 M salt. This yielded clean protein preparations with the 260/280 nm absorbance ratio of 0.6 (Figure 2(E)) of the unphosphorylated and polyphosphorylated N (pN) with high electrophoretic homogeneity (Figure 2(F)), enabling thorough investigation of the 14-3-3 binding mechanism.
SEC-MALS suggested that the nucleic acid-free protein was a ~95 kDa dimer (Figure 2(F)), based on the calculated Mw of the His-tagged N monomer of 48 kDa, regardless of its phosphorylation status. A significant overestimation of the apparent Mw of pN from column calibration, ~160 kDa for the 95 kDa dimeric species, indicates the presence of elements with an expanded, loose conformation, in agreement with the presence of unstructured regions. This necessitates the use of SEC-MALS for absolute Mw determination that is independent of the assumptions on density and shape.
The skewed Mw distribution across the SEC peak indicated the propensity of the SARS-CoV-2 N to oligomerization (Figure 2).

Polyphosphorylated SARS-CoV-2 N and human 14-3-3γ form a tight complex with defined stoichiometry
Next, we compared the ability of native full-length SARS-CoV-2 N, both unphosphorylated and polyphosphorylated (N.1-419 and pN.1-419, respectively), to be recognized by a human 14-3-3 protein. For the initial analysis we chose 14-3-3γ as one of the strongest phosphopeptide binders among the 14-3-3 family. 55 SEC-MALS (Figure 3(A)) shows that 14-3-3γ elutes as a dimer with Mw of ~55.2 kDa (calculated monomer Mw 28.3 kDa, see also Figure 3(B)). The position and amplitude of this peak did not change in the presence of the N.1-419 dimer with Mw of 94.5 kDa (calculated monomer Mw 48 kDa), where the SEC profile shows two distinct peaks. This is corroborated by SDS-PAGE analysis of the fractions, suggesting no interaction between unphosphorylated N (pI > 10) and 14-3-3 (pI ~4.5) despite the large difference in their pI values (Figure 3(C) and (D)). Thus, the presence of 200 mM NaCl was sufficient to prevent nonspecific interactions between the two proteins.
In sharp contrast, co-expression of SARS-CoV-2 N with PKA, and subsequent polyphosphorylation (Figure 2), allows for tight complex formation between pN.1-419 and human 14-3-3γ. This is evident from the peak shift and corresponding increased Mw from ~95 to 150.7 kDa (Figure 3(E)), closely matching the addition of the 14-3-3 dimer mass (calculated dimer Mw 56.6 kDa) to the pN.1-419 dimer (calculated dimer Mw 96 kDa). The presence of both proteins in the complex was confirmed by SDS-PAGE (Figure 3(F)). Collectively these data pointed toward equimolar binding upon saturation. Given the dimeric state of both proteins in their individual states (Figures 2(F) and 3) and the Mw of the 14-3-3γ/pN.1-419 complex, the most likely stoichiometry is 2:2. The ratio of the proteins does not change across the peak of the complex (Figure 3(F)), implying that they form a relatively stable complex with a well-defined stoichiometry.
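The mass argument behind the 2:2 assignment can be illustrated with a minimal sketch that enumerates small subunit combinations and ranks them by how closely their calculated mass matches the SEC-MALS value. The monomer masses (28.3 and 48 kDa) and the observed mass (150.7 kDa) are taken from the text; the function name, the `max_copies` limit and the ranking criterion are assumptions made for illustration, not the authors' analysis procedure.

```python
from itertools import product

MONOMER_14_3_3_KDA = 28.3  # calculated 14-3-3 monomer mass quoted in the text
MONOMER_N_KDA = 48.0       # calculated His-tagged N monomer mass quoted in the text


def rank_stoichiometries(observed_kda, max_copies=4):
    """Rank (n_14_3_3, n_N) combinations by |calculated - observed| mass."""
    ranked = []
    for n_1433, n_n in product(range(1, max_copies + 1), repeat=2):
        calculated = n_1433 * MONOMER_14_3_3_KDA + n_n * MONOMER_N_KDA
        ranked.append((abs(calculated - observed_kda), n_1433, n_n, calculated))
    return sorted(ranked)


# Observed SEC-MALS mass of the complex from the text.
best = rank_stoichiometries(150.7)[0]
deviation, n_1433, n_n, calculated = best
```

Mass matching alone cannot prove a stoichiometry; in the text the 2:2 assignment also rests on the dimeric state of each protein on its own and on the constant protein ratio across the peak.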

SARS-CoV-2 N interacts with all human 14-3-3 isoforms
We then questioned whether the interaction with pN is preserved for other human 14-3-3 isoforms. Analytical SEC clearly showed that the phosphorylated SARS-CoV-2 N can be recognized by all seven human 14-3-3 isoforms, regardless of the presence of a His-tag or disordered C-terminal tails on the corresponding 14-3-3 constructs (Figure 4(A)). However, the efficiency of complex formation differed for each isoform. Judging by the repartition of 14-3-3 between free and the pN-bound peaks, the apparent efficiency of pN binding was higher for 14-3-3γ, 14-3-3η, 14-3-3ζ and 14-3-3ε, and much lower for 14-3-3β, 14-3-3τ and 14-3-3σ, in a roughly descending order (Figure 4(A)). The interaction also appeared dependent on the oligomeric state of 14-3-3, since the monomeric mutant form of 14-3-3ζ, 14-3-3ζm-S58E 56 (apparent Mw 29 kDa) showed virtually no interaction relative to the wild-type dimeric 14-3-3ζ counterpart (apparent Mw 58 kDa) (Figure 4(B)).

Affinity of the phosphorylated SARS-CoV-2 N towards selected human 14-3-3 isoforms
In light of the relative positions of the two proteins, separately and complexed, on SEC profiles

A similar binding mechanism could be observed for 14-3-3ε, however in this case we could achieve pN.1-419 saturation only at much higher 14-3-3ε concentrations (Figure 5(B) and (C)), and the resulting apparent KD was ~7 times higher than for 14-3-3γ (Figure 5(C)). Nevertheless, once again the stoichiometry was close to 2:2. These findings strongly disfavor the earlier hypothesis that 14-3-3 binding affects dimerization status of N. 57 We further asked which specific regions of SARS-CoV-2 N are responsible for interaction with human 14-3-3.
The N-terminal part of SARS-CoV-2 N is responsible for 14-3-3 binding
Among multiple phosphosites identified in our pN.1-419 preparations (Supplementary table 1), the two most interesting regions are located in the intrinsically disordered or loop segments (Figure 2(A)), normally favored by 14-3-3 proteins. 45 The first represents the C-terminally located RTA[pT265]KAY site, which is predicted by the 14-3-3-Pred webserver 58 as the 14-3-3-binding site within the loop region immediately preceding the CTD. The second, SR-rich region features multiple experimentally confirmed phosphosites including several suboptimal predicted 14-3-3-binding sites (Figure 2(B)). To narrow down the 14-3-3-binding locus we used several constructs representing its N- and C-terminal parts (N.1-211 and N.212-419, respectively). The individual CTD included the C-terminal phosphosite around Thr265 (N.247-364), and the longer N-terminal construct extended toward the C-terminus to include the predicted NES sequence (N.1-238) (Figure 6(A)). This contains Asp225 and a cluster of Leu residues which together resemble the unphosphorylated 14-3-3-binding segments from ExoS/T 59 and therefore could be important for 14-3-3 binding.
As for the wild-type protein, the truncated SARS-CoV-2 N constructs were cloned and expressed in the absence or presence of PKA to produce unphosphorylated or polyphosphorylated proteins.  In contrast, N-terminal constructs including the RNA-binding domain, such as N.1-211, are monomeric (Supplementary Figure 3). Thus, our data align with the low-resolution structural model of SARS-CoV and SARS-CoV-2 N proteins, in which the CTD (residues 247-364), largely responsible for dimerization, and NTD, involved in RNA-binding, are only loosely associated. 24,28,29 Of the constructs analyzed, the N-terminal constructs N.1-211 and N.1-238 both interacted with 14-3-3γ by forming distinct complexes on SEC profiles, and this interaction was strictly phosphorylation-dependent (Figure 6). Given the similarity of the elution profiles for pN.1-211 and pN.1-238, it may be concluded that the presence of NES in the latter is dispensable for the 14-3-3 binding.
Under similar conditions, only a very weak interaction between the dimeric pN.212-419 construct and 14-3-3γ could be observed, whereas the phosphorylated dimeric CTD (pN.247-364) displayed virtually no binding (Figure 6). Neither unphosphorylated construct interacted with 14-3-3. Thus, the Thr265 phosphosite can be broadly excluded as the critical binding site. Separate phosphosites outside the CTD, for instance, in the last ~30 C-terminal residues (Supplementary table 1) likely account for residual binding of the pN.212-419 construct. Only its SEC profile showed a significant positional peak shift with phosphorylation (Figure 6(B) and (C)), indicating a potential change in the oligomeric state. It is tempting to speculate that such phosphorylation outside the CTD could affect higher order oligomerization associated with the so-called N3 C-terminal segment (Figure 2(B)). 20

According to SEC-MALS, the 14-3-3γ dimer (Mw 57 kDa) interacts with the pN.1-238 monomer (Mw of 31 kDa) by forming a ~82 kDa complex (Figure 7(A)) with an apparent 2:1 stoichiometry. It is remarkable that despite a moderate molar excess of pN.1-238 a 2:2 complex (one 14-3-3 dimer with two pN monomers) is not observed. The well-defined 2:2 stoichiometry of the 14-3-3γ complex with the full-length pN (Figure 3) suggests that the dimeric pN is anchored using two equivalent, key 14-3-3-binding sites, each located in a separate subunit of N. It is tempting to speculate that, in the absence of the second subunit, the interaction involves the key phosphosite and an additional phosphosite which is separated by a sufficiently long linker (≥15 residues), 61 to secure occupation of both phosphopeptide-binding grooves of 14-3-3 (Figure 7(B)). Such bidentate binding would prevent the recruitment of a second pN.1-238 monomer and is in line with the observed data. Similar 2:1 binding was observed qualitatively for the interaction of 14-3-3γ with pN.1-211 (Figure 6(C) and data not shown).
Our finding that the minimal N-terminal construct pN.1-211 exhibits firm binding to 14-3-3γ indicates that the key 14-3-3-binding phosphosite(s) is/are located exclusively within this region. Given the presence of numerous candidate 14-3-3-binding sites within its most C-terminal part, i.e., the SR-rich region, we further focused on the 1-211 sequence in search of the 14-3-3-binding phosphosite(s).
Localization of the main 14-3-3-binding site within the SR-rich region of SARS-CoV-2 N
Further N mutants were designed to disrupt the most probable 14-3-3-binding phosphosites. These are located in the intrinsically flexible phosphorylatable SR-rich segment centered at positions 197 and 205 (Figure 8(A)). Both represent suboptimal 14-3-3-binding motifs, SRN[pS197]TP and SRG[pT205]SP, in that they lack an Arg/Lys residue in position −3 relative to the phosphorylation site (square brackets). However, each also features a Pro residue in position +2, which is highly favorable for 14-3-3 binding 44 and absent from the other potential 14-3-3-binding phosphosites found in the SR-rich region (Figure 2(B)). These conflicting factors complicate predictions for the true 14-3-3-binding site. Moreover, even beyond the SR-rich region the NTD is predicted to host further possible 14-3-3-binding phosphosites, including the RRA[pT91]RR site, which is the highest-scoring in 14-3-3-Pred 58 prediction (Supplementary table 1).
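The two motif features weighed against each other above (a basic residue at position −3 and Pro/Gly at position +2, following the optimal RXX(pS/pT)X(P/G) sequence) can be checked mechanically. The sketch below does this for the four peptides quoted in the text; `motif_features` is a hypothetical helper, the peptides are written without the phosphate mark, and the check is far cruder than the scoring used by prediction servers such as 14-3-3-Pred.

```python
def motif_features(peptide, phospho_index):
    """Report two features of the optimal RXX(pS/pT)X(P/G) motif.

    peptide: one-letter sequence around the candidate site.
    phospho_index: 0-based index of the phosphorylatable Ser/Thr.
    Positions outside the supplied peptide count as "feature absent".
    """
    if peptide[phospho_index] not in ("S", "T"):
        raise ValueError("phosphosite must be Ser or Thr")

    def residue(offset):
        i = phospho_index + offset
        return peptide[i] if 0 <= i < len(peptide) else None

    return {
        "basic_at_-3": residue(-3) in ("R", "K"),
        "pro_or_gly_at_+2": residue(2) in ("P", "G"),
    }


# Candidate sites quoted in the text (phosphorylatable residue at index 3).
candidates = {
    "Ser197": "SRNSTP",
    "Thr205": "SRGTSP",
    "Thr265": "RTATKAY",
    "Thr91": "RRATRR",
}
features = {site: motif_features(pep, 3) for site, pep in candidates.items()}
```

On these inputs the Ser197 and Thr205 peptides lack the −3 basic residue but carry Pro at +2, whereas the Thr265 and Thr91 peptides show the opposite pattern, which mirrors the conflicting factors described in the text.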
We conceived stepwise truncations to remove the most probable 14-3-3-binding phosphosites, aiming to identify the iteration at which binding (observed for the pN.1-211 construct) ceased. 14-3-3 can bind incomplete consensus motifs at the extreme C-terminus of some proteins, 62 so truncations were designed to remove the critical phosphorylated residue. However, upstream residues of each candidate 14-3-3-binding site were preserved, in light of the sheer number of overlapping potential binding motifs in the SR-rich region (Figure 8(A)).
The new truncated constructs of N, namely N.1-204, N.1-196 and N.1-179, were obtained in unphosphorylated and phosphorylated states, as before, and again washed with high salt to remove potentially disruptive nucleic acid. SEC-MALS confirmed the monomeric state of the N-terminal N mutants (the exemplary data are presented for N.1-196, Supplementary Figure 3), consistent with the proposed architecture of the N protein. 21,24 None of the truncated mutants interacted with 14-3-3γ in the unphosphorylated state (Figure 8(B)). More importantly, no binding to 14-3-3γ was  (Figure 8(C)). This strongly indicated that all phosphosites of the 1-196 segment (including at least three phosphosites within the SR-rich region, i.e., Ser180, Ser188 and Ser194, see Figure 8(A)) are dispensable for 14-3-3 binding and at most could contribute only as auxiliary sites (as suggested by the scheme in Figure 7(B)). This narrowed the 14-3-3-binding region within SARS-CoV-2 N down to 15 residues from 196 to 211, leaving only two possible sites centered at Ser197 and Thr205 (Figure 8(A)).
By contrast, pN.1-204 showed only a very slightly altered interaction with 14-3-3γ compared to pN.1-211 (Figure 8(C)). Although this does not exclude the possibility that the Thr205 phosphosite contributes to 14-3-3 binding in the context of the full-length pN (particularly if pSer197 is absent or mutated), pSer197 appears to be critical for 14-3-3 recruitment. Intriguingly, in contrast to Thr205, Ser197 is preserved in most related coronavirus N proteins (see Figures 2(B) and 9).
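The logic of the stepwise truncation can be summarized in a few lines: the critical residue must lie beyond the end of the longest construct that fails to bind and within the shortest construct that still binds. The sketch below encodes only a qualitative reading of the results (pN.1-211 and pN.1-204 treated as binders, pN.1-196 as a non-binder); this binary coding and the helper name `localize_site` are assumptions made for illustration.

```python
def localize_site(results):
    """Bracket the critical phosphosite from C-terminal truncation data.

    results: {last_residue_of_construct: bound} for constructs starting at residue 1.
    Returns (first, last) residue of the interval that must contain the site.
    """
    binders = [end for end, bound in results.items() if bound]
    non_binders = [end for end, bound in results.items() if not bound]
    if not binders or not non_binders:
        raise ValueError("need at least one binding and one non-binding construct")
    shortest_binder = min(binders)
    longest_non_binder = max(non_binders)
    if longest_non_binder >= shortest_binder:
        raise ValueError("results are inconsistent with a single critical site")
    return longest_non_binder + 1, shortest_binder


# Qualitative coding of the SEC results for the phosphorylated constructs (assumption).
interval = localize_site({211: True, 204: True, 196: False})

# Of the two candidate sites, keep those inside the bracketed interval.
in_interval = [site for site in (197, 205) if interval[0] <= site <= interval[1]]
```

Under this coding the interval is 197-204, which contains Ser197 but not Thr205, in line with the conclusion drawn in the text.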

Discussion
In this work, we investigated the molecular association between the SARS-CoV-2 N protein and human phosphopeptide-binding proteins of the 14-3-3 family. The former is the most abundant viral protein, 12,13 the latter is a major protein-protein interaction hub involved in multiple cellular signaling cascades, expressed at high levels in many human tissues including those susceptible to SARS-CoV-2 infection (Figure 1). 40 SARS-CoV-2 N is heavily phosphorylated in infected cells, 13,35,36 which poses a significant challenge for proteomic approaches: the densely phosphorylated SR-rich region alone, functionally implicated in numerous viral processes, 34,49,63 hosts seven closely spaced Arg residues (Figure 2(B)). These arginines restrict the length of tryptic phosphopeptides and decrease the probability of their unambiguous identification and phosphosite assignment. 64 The multiplicity of implicated protein kinases (including GSK-3, protein kinase C, casein kinase II and mitogen-activated protein kinase 36,37,50 ) further hinders study of specific phosphorylations. Assuming that the mechanistic implication of a specific phosphorylation is independent of the acting kinase, we produced polyphosphorylated N protein (pN) via bacterial co-expression with a catalytic subunit of PKA. A combination of orthogonal cleavage enzymes and LC-MS phosphoproteomics mapped > 20 phosphosites (Supplementary table 1) including Ser23, Thr24, Ser180, Ser194, Ser197, Thr198, Ser201, Ser202, Thr205 and Thr391 reported recently during SARS-CoV-2 infection. 13,35 At least six of the identified phosphosites are located in the unstructured regions and represent potential 14-3-3-binding sites (Figure 2(A) and (B)).
Biochemical analysis confirmed that polyphosphorylated N is competent for binding to all seven human 14-3-3 isoforms (Figures 3 and 4), but revealed remarkable variation in binding efficiency between them (Figure 4). This was supported by the quantified affinities to two selected isoforms, 14-3-3γ (KD of 1.5 μM) and 14-3-3ε (KD of 10.7 μM) (Figure 5). Our observations are in line with the recent finding that 14-3-3γ and 14-3-3η systematically bind phosphopeptides with higher affinities than 14-3-3ε and 14-3-3σ. 55 The low micromolar-range KD values compare well to those reported for other physiologically relevant partners of 14-3-3, 65-67 indicating a stable and specific interaction. Meanwhile, the well-defined 2:2 stoichiometry of the ~150-kDa 14-3-3/pN complex, supported by titration experiments and SEC-MALS analysis (Figures 3 and 5), excludes the possibility that 14-3-3 binding disrupts pN dimerization. 57 It is reasonable to assume that the principally bivalent 14-3-3 dimer 44,46 recognizes just one phosphosite in each pN subunit because a bidentate 14-3-3 binding to different phosphosites within a single pN subunit would inevitably alter the observed 2:2 stoichiometry. Identification of the single phosphosite responsible for 14-3-3 recruitment proved challenging, as none of the potential sites were a perfect match to the currently known optimal 14-3-3-binding motifs. 65 To restrict the search, we analyzed the interaction of various N constructs with 14-3-3γ. This eliminated the high-scoring potential 14-3-3-binding phosphosite RTApT265KAY, present in the C-terminal N fragments, as the true site of interaction despite its conservation in many related coronavirus N proteins (Supplementary table 2 and 3). The residual binding of pN.212-419 to 14-3-3γ suggested the existence of auxiliary 14-3-3-binding sites located outside the folded CTD (residues 247-364).
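To give a feel for what the two quantified affinities mean in practice, the following sketch compares them with a simple one-site hyperbolic binding model. The KD values (1.5 and 10.7, taken as micromolar in keeping with the "low micromolar-range" description) come from the text; the one-site model, the assumption that the free 14-3-3 concentration equals the added concentration, and all names in the code are illustrative assumptions and not the fitting procedure used for Figure 5.

```python
KD_TIGHTER_UM = 1.5   # apparent KD of the tighter-binding isoform, from the text
KD_WEAKER_UM = 10.7   # apparent KD of the weaker-binding isoform, from the text


def fraction_bound(free_ligand_um, kd_um):
    """One-site binding isotherm: fraction of sites occupied at a given free ligand."""
    if free_ligand_um < 0 or kd_um <= 0:
        raise ValueError("concentration must be >= 0 and KD must be > 0")
    return free_ligand_um / (kd_um + free_ligand_um)


# Fold difference in apparent affinity between the two isoforms.
kd_ratio = KD_WEAKER_UM / KD_TIGHTER_UM

if __name__ == "__main__":
    print(f"KD ratio: {kd_ratio:.1f}")
    for conc in (1.0, 10.0, 50.0):  # hypothetical free 14-3-3 concentrations, in uM
        tight = fraction_bound(conc, KD_TIGHTER_UM)
        weak = fraction_bound(conc, KD_WEAKER_UM)
        print(f"{conc:5.1f} uM: tighter {tight:.2f}, weaker {weak:.2f}")
```

The ratio of the two constants is about 7, consistent with the roughly sevenfold difference stated in the Results; at equal free concentrations the model predicts higher occupancy for the tighter-binding isoform, with the gap closing as both approach saturation.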
Both pN.1-211 and pN.1-238 bound 14-3-3 with comparable efficiency, suggesting that the binding site lies within residues 1-211. Interestingly, the N-terminal constructs existed as monomers (Supplementary Figure 3), which could potentially lower the binding affinity to 14-3-3 dimers in light of the 2:2 stoichiometry. Nonetheless, sufficient phosphorylation-dependent binding was clearly observed between the dimeric 14-3-3γ and the monomeric N-terminal constructs (Figures 6 and 7).
Truncation of the SR-rich region streamlined the search for the 14-3-3-binding site to the 15-residue stretch of amino acids 196-211. This sequence hosts two principally similar potential 14-3-3-binding sites, RNpS197TP and RGpT205SP (Figure 2(B) and (D)). Importantly, the proximity of these sites rules out their simultaneous bidentate binding to the 14-3-3 dimer: 14-3-3 binds in an antiparallel manner requiring a minimum of 13-15 residues between phosphosites on a single peptide. 44,68 Thus, binding to Ser197 and Thr205 sites must be mutually exclusive.
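The spacing argument can be made explicit with a small sketch that lists which pairs of phosphosites are far enough apart for one chain to reach both grooves of a 14-3-3 dimer. The threshold of 13 intervening residues is the lower bound of the 13-15 range quoted above, and the site list uses SR-rich-region positions named in the text; the function name is hypothetical, and sufficient spacing is treated here as a necessary condition only, not as evidence of actual bidentate binding.

```python
from itertools import combinations

MIN_RESIDUES_BETWEEN = 13  # lower bound of the 13-15 residue minimum quoted in the text


def bidentate_compatible_pairs(sites, min_between=MIN_RESIDUES_BETWEEN):
    """Return pairs of phosphosites with at least `min_between` residues between them."""
    pairs = []
    for first, second in combinations(sorted(sites), 2):
        residues_between = second - first - 1
        if residues_between >= min_between:
            pairs.append((first, second))
    return pairs


# The two candidate sites in the 196-211 stretch: 7 residues apart, so no pair qualifies.
close_pair = bidentate_compatible_pairs([197, 205])

# Adding upstream SR-rich phosphosites named in the text (Ser180, Ser188, Ser194).
extended = bidentate_compatible_pairs([180, 188, 194, 197, 205])
```

With only Ser197 and Thr205 the list is empty, matching the conclusion that their binding must be mutually exclusive, whereas a more distant site such as Ser180 satisfies the spacing requirement relative to either of them.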
The markedly different binding of pN.1-204 and pN.1-196 to 14-3-3 (Figure 8) prompted us to propose Ser197 as the critical phosphosite. This finding aided the design of a topology model for the complex (Figure 9(A)), in which the 14-3-3 dimer is anchored by two identical Ser197 phosphosites from the SR-rich region in the two equivalent pN chains. Notably, the RN(pS/pT)197TP site is conserved not only in SARS-CoV and SARS-CoV-2 but also in N proteins from several bat and pangolin coronaviruses (Figure 9(B)). Meanwhile, phosphorylation of residue 205 is possible in a smaller subset of coronaviruses (Figure 9(B)).
The model shown in Figure 9(A) notably does not exclude the possibility of 14-3-3 binding to alternative phosphosites under significantly different phosphorylation conditions. Moreover, we speculate that hierarchical phosphorylation within the SR-rich region 36 would alter 14-3-3 binding, with phosphorylation at adjacent Ser/Thr residues likely to inhibit the interaction. 69 The SR-rich region is phosphorylated by both Pro-directed and non-Pro-directed protein kinases. 36,37,39,50 Since 14-3-3 proteins typically reject peptides with a proline adjacent to the phosphorylated residue, the interplay between these kinases could be regulatory. In theory, this could create a phosphorylation code and conditional binding of 14-3-3, as has been discussed recently for alternative polyphosphorylated 14-3-3 partners such as LRRK2, CFTR and tau protein. 48,67,70,71 The recruitment of the 14-3-3 dimer is expected to occlude the SR-rich region of N by masking 10-20 residues surrounding the Ser197 phosphosite within the complex. Apart from the likely effects on the properties of N and its ability to phase separate and bind RNA, the 14-3-3 binding at the SR-rich region, triggered by phosphorylation, can potentially interfere with N binding to the M protein, an event clearly relevant to the virion assembly. 17 In support of this hypothesis, the 14-3-3-occluded area reported here overlaps with the N region (residues 168-208) proposed to mediate its association with the M protein in SARS-CoV. 19 The presence of SR-rich regions in many viral N proteins suggests a broader interaction of 14-3-3 with N proteins. Indeed, using 14-3-3-Pred prediction 58 a number of 14-3-3-binding phosphosites in N proteins from human (Supplementary table 2) and bat coronaviruses (Supplementary table 3) could be identified.
Some display strong conservation of the 14-3-3-binding site around Ser197 (Figure 9(B)), whereas others contain separate high-scoring potential 14-3-3-binding sites beyond the SR-rich region. Given the reasonable threat that other zoonotic coronaviruses may ultimately enter the human population, 72,73 the relevant N proteins are highly likely to undergo phosphorylation and 14-3-3 binding, as seen for SARS-CoV-2 N. This is particularly likely, given the high concentration of both proteins in the infected cells (see above).
Our findings underline the essential role of the SR-rich region in the biology of N proteins and host-virus interactions. 49 Unrelated proteins with similar domains also tend to show RNA-binding capability (e.g., the splicing factors) 74,75 and are subject to multisite phosphorylation. 76 Such proteins are often associated with phase separation as a means to regulate membraneless compartmentalization within the cytoplasm. Likewise, SARS-CoV-2 N protein has been shown to undergo phase separation in vitro upon RNA addition, a phenomenon dependent on the salt concentration, the presence of divalent ions, the phosphorylation state of N and the RNA sequence. 30,34,77,78 Furthermore, the N protein has been shown to recruit the RNA-dependent RNA polymerase complex and granule-associated heterogeneous nuclear ribonucleoproteins, forming phase-separated granules which can aid SARS-CoV-2 replication. 30,34 Thus, 14-3-3 potentially influences phase separation by binding the SR-rich region and may also affect the accessibility of the NES sequence located nearby (Figure 2(B)). This in turn may impact nucleocytoplasmic shuttling of N, as seen for SARS-CoV N. 37
Taken together, 14-3-3 binding to the SR-rich region of N holds potential to regulate multiple host cell processes affected by N. 14-3-3 binding to pN may represent an immune-like cellular response to the viral infection aimed at arresting or neutralizing N activities. 57 On the other hand, in light of the abundance of N protein in the infected cell, 12,13 pN may instead arrest 14-3-3 proteins in the cytoplasm and indirectly disrupt cellular processes involving 14-3-3. For example, 14-3-3ε and 14-3-3η each play a role in the innate immune response via RIG-I and MDA5 signaling, respectively. 79,80 The N protein:14-3-3 interaction would modulate these and other signaling pathways involving 14-3-3 proteins. Intriguingly, two 14-3-3 isoforms, ζ and ε, have been detected in purified particles of infectious bronchitis coronavirus. 81 The 14-3-3 protein uptake could be mediated by its interaction with N, potentially resulting in 14-3-3 transmission between coronavirus hosts.
As such, understanding the molecular mechanism of pN association with 14-3-3 proteins may inform the development of novel therapeutic approaches and pave the way for structural studies.
Protein concentration was determined by spectrophotometry at 280 nm on an N80 Nanophotometer (Implen, Munich, Germany) using sequence-specific extinction coefficients calculated with the ProtParam tool in ExPASy (see Supplementary table 5).
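The conversion from absorbance to concentration follows the Beer-Lambert law. The snippet below is a minimal sketch of that arithmetic, not the instrument's own routine; the function names are hypothetical, a 1 cm path length is assumed by default, and the actual extinction coefficients are those listed in Supplementary table 5.

```python
def protein_concentration_molar(a280, ext_coeff_m_cm, path_cm=1.0, dilution=1.0):
    """Beer-Lambert law: c = A / (epsilon * l), scaled by the dilution factor.

    a280           -- absorbance at 280 nm (dimensionless)
    ext_coeff_m_cm -- molar extinction coefficient, M^-1 cm^-1
    path_cm        -- optical path length in cm (assumed 1 cm by default)
    dilution       -- dilution factor applied before the measurement
    """
    if ext_coeff_m_cm <= 0 or path_cm <= 0:
        raise ValueError("extinction coefficient and path length must be positive")
    return a280 * dilution / (ext_coeff_m_cm * path_cm)


def molar_to_mg_per_ml(conc_molar, mol_weight_da):
    """mol/L multiplied by g/mol gives g/L, numerically equal to mg/ml."""
    return conc_molar * mol_weight_da
```

With purely illustrative inputs (A280 = 0.5, an extinction coefficient of 50,000 M^-1 cm^-1 and a 46 kDa protein), the functions return 10 µM and 0.46 mg/ml, respectively.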

Isolation of E. coli tRNA
DH5α cells were incubated in 30 ml of liquid medium (LB without antibiotics) for 16 h at 37°C with maximum aeration and then harvested by centrifugation at 7000g for 10 min. The pellet was gently resuspended in RNase-free Tris-acetate buffer, followed by alkali-SDS lysis and neutralization with cold ammonium acetate. The suspension was incubated on ice for 5 min and centrifuged at 21,000g for 10 min at 4°C. The resulting supernatant (8 ml) was incubated with 12.5 ml isopropanol for 15 min at 25°C and centrifuged at 12,100g for 5 min. The pellet was resuspended in 800 µl of 2 M ammonium acetate, incubated for 5 min at 25°C and centrifuged (12,100g, 10 min). RNA was precipitated from the supernatant with 800 µl of isopropanol, incubated for 5 min at 25°C and centrifuged (12,100g, 5 min). After removal of the supernatant, the pellet was washed with 70% ice-cold ethanol, dried and dissolved in 100-200 µl of milliQ water.

SARS-CoV-2 N protein co-expression with PKA
For phosphorylation in cells, SARS-CoV-2 N was bacterially co-expressed with a catalytic subunit of mouse protein kinase A (PKA), as described previously. 48 PKA was cloned into a low-copy pACYC vector 48 which ensured that the target protein was expressed in excess over kinase.
Plasmids encoding PKA and SARS-CoV-2 N were co-transformed into E. coli BL21(DE3) cells and selected by chloramphenicol and kanamycin resistance, respectively. Cells were grown in LB to an OD600 of 0.6 before induction with 1 mM IPTG. After induction, cultivation was continued for 1.5 h, 3 h, 4 h or overnight at 37°C, and 4 h was found to be sufficient to provide for the saturating interaction with 14-3-3. For all truncated constructs of N, overnight co-expression was used. For unphosphorylated controls, proteins were expressed in the absence of PKA.
The cells with overexpressed proteins were harvested by centrifugation and resuspended in 20 mM Tris-HCl buffer, pH 8.0, containing 1 M NaCl, 10 mM imidazole and 0.01 mM phenylmethylsulfonyl fluoride, as well as RNase to reduce the RNA content. Phosphorylated proteins were purified using subtractive IMAC and gel filtration, whereby the His6-tagged PKA was efficiently removed. Phosphorylated N and its constructs typically showed significant shifts on SDS-PAGE and PhosTag gels, indicating phosphorylation.

Identification of phosphosites within N
Sample treatment for proteomics analysis. For phosphopeptide mapping, the SARS-CoV-2 N protein co-expressed with PKA for 4 h at 30°C was purified as above. An aliquot (35 µg) was subjected to enzymatic hydrolysis "in solution" either with trypsin (Sequencing Grade Modified Trypsin, Promega) or with chymotrypsin (Analytical Grade, Sigma-Aldrich). Briefly, the sample preparation was as follows. The sample was reduced with 2 mM Tris(2-carboxyethyl)phosphine (TCEP) and then alkylated with 4 mM S-methyl methanethiosulfonate (MMTS); the protein:enzyme ratio was kept at 50:1 (w/w); digestion was performed overnight at 37°C and pH 7.8 (for chymotrypsin, 10 mM Ca2+ was added to the reaction solution). The reaction was stopped by adjusting it to pH 2 with formic acid.
The resulting peptides were purified on a custom micro-tip SPE column with Oasis HLB (Waters) as a stationary phase, using elution with an acetonitrile:water:formic acid mix (50:49.9:0.1 v/v/v%). The eluate was dried and stored at −30°C prior to the LC-MS experiment.
LC-MS/MS experiment. Peptides were separated on a nano-flow chromatographic system with a flow rate of 440 nl/min (Ultimate 3000 Nano RSLC, Thermo Fisher Scientific) coupled via ESI to a mass analyzer (Q Exactive Plus, Thermo Fisher Scientific). Briefly, the protocol was as follows. Dried peptide samples were rehydrated with 0.1% formic acid. An aliquot of rehydrated peptides (5 µl) was injected onto a precolumn (100 µm × 2 cm, Reprosil-Pur C18-AQ 1.9 µm, Dr. Maisch GmbH) and eluted on a nano-column (100 µm × 30 cm, Reprosil-Pur C18-AQ 1.9 µm, Dr. Maisch GmbH) with a linear gradient of mobile phase A (water:formic acid 99.9:0.1 v/v%) with mobile phase B (acetonitrile:water:formic acid 80:19.9:0.1 v/v/v%) from 2% to 80% B within 80 min. MS data were acquired by a data-dependent acquisition approach after an MS scan of the 350-1500 m/z range. The top 12 peaks were subjected to higher-energy collisional dissociation (HCD) or electron transfer dissociation (ETD). After a round of MS/MS, masses were dynamically excluded from further analysis for 35 s.
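The data-dependent acquisition logic described above (pick the most intense precursors from each survey scan, then ignore recently fragmented masses for a fixed time) can be illustrated with a simplified sketch. This is not the instrument firmware: the function name, the input layout and the 10 ppm matching tolerance for the exclusion list are assumptions introduced here, and real acquisition methods also apply intensity thresholds and charge-state filters that are omitted.

```python
def select_precursors(scans, top_n=12, exclusion_s=35.0, tol_ppm=10.0):
    """Simplified top-N selection with dynamic exclusion.

    scans       -- list of (time_s, [(mz, intensity), ...]) in time order
    top_n       -- maximum number of precursors fragmented per survey scan
    exclusion_s -- how long a fragmented m/z stays on the exclusion list
    tol_ppm     -- hypothetical tolerance for matching against that list
    Returns a list of (time_s, [mz, ...]) with the precursors chosen per scan.
    """
    excluded = []  # entries are (mz, expiry_time_s)
    picks = []
    for time_s, peaks in scans:
        # Drop exclusion entries whose time window has elapsed.
        excluded = [(mz, t_end) for mz, t_end in excluded if t_end > time_s]
        chosen = []
        # Walk through peaks from most to least intense.
        for mz, _intensity in sorted(peaks, key=lambda p: p[1], reverse=True):
            if len(chosen) == top_n:
                break
            if any(abs(mz - ex) / ex * 1e6 <= tol_ppm for ex, _ in excluded):
                continue  # fragmented recently, skip it
            chosen.append(mz)
        # Newly fragmented masses are excluded for the next exclusion_s seconds.
        excluded.extend((mz, time_s + exclusion_s) for mz in chosen)
        picks.append((time_s, chosen))
    return picks
```

In this sketch a precursor fragmented at time 0 is skipped in a scan at 10 s and becomes eligible again in a scan at 40 s, which is how the 35 s exclusion lets less intense co-eluting peptides be sampled.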
MS data analysis. For the Mascot (MatrixScience) database search, raw LC-MS files were converted to the generic mgf format using MSConvert with its default settings. For the peptide database search, a concatenated database including SARS-CoV-2 and general contaminants was constructed. The parameters of the search were: Enzyme - Trypsin with one miscleavage allowed, or in the case of Chymotrypsin - no enzyme; MS tolerance - 5 ppm; MSMS -0.

Analytical size-exclusion chromatography (SEC)
The oligomeric state of proteins as well as protein-protein and protein-tRNA interactions were analyzed by loading 50 µl samples on a Superdex 200 Increase 5/150 column (GE Healthcare) operated at a 0.45 ml/min flow rate using a Varian ProStar 335 system (Varian Inc., Melbourne, Australia). When specified, a Superdex 200 Increase 10/300 column (GE Healthcare) operated at a 0.8 ml/min flow rate was used (100 µl loading). The columns were equilibrated with 20 mM Tris-HCl buffer, pH 7.6, containing 200 mM NaCl and 3 mM NaN3 (SEC buffer) and calibrated with the following protein markers: BSA trimer (198 kDa), BSA dimer (132 kDa), BSA monomer (66 kDa), ovalbumin (43 kDa), α-lactalbumin (15 kDa). The elution profiles were followed by absorbance at 280 nm and (optionally) at 260 nm. Diode array detector data were used to retrieve full-absorbance spectral information about the eluted samples including protein and nucleic acid. All SEC experiments were performed at least three times and the most typical results are presented.
To assess binding parameters for the 14-3-3/pN.1-419 interaction, we used serial loading of samples containing a fixed pN concentration and increasing concentrations of 14-3-3 in a constant volume of 50 µl on a Superdex 200 Increase 5/150 column operated at 0.45 ml/min. To validate that the peak amplitude increases linearly with protein concentration, serial loading of 14-3-3 alone at different concentrations was used. This linear dependence allowed conversion of the amplitude of the 14-3-3 peak into the molar concentration of its unbound form at each point of the titration. The concentration of pN-bound 14-3-3 was determined as the difference between the total and free concentrations. Binding curves representing the dependence of bound 14-3-3 on its total concentration were approximated using the quadratic equation to determine apparent K D values. Graphing and fitting were performed in Origin 9.0 (OriginLab Corporation, Northampton, MA, USA).
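The titration analysis above can be sketched in a few lines. The code below is an illustrative reconstruction under stated assumptions, not the Origin procedure used in the study: it assumes a simple 1:1 binding model (in whatever units are used for 14-3-3 and for pN binding sites), a calibration line through the origin, and a fixed, known concentration of binding sites; all function names are hypothetical, and the coarse-to-fine grid search stands in for a proper nonlinear least-squares fit.

```python
import math


def free_from_peak(amplitude, slope):
    """Free 14-3-3 concentration from its SEC peak amplitude.

    Assumes the linear calibration amplitude = slope * concentration.
    """
    return amplitude / slope


def bound_quadratic(total_ligand, total_sites, kd):
    """Complex concentration for 1:1 binding (quadratic solution).

    [C] = ((L + S + Kd) - sqrt((L + S + Kd)^2 - 4*L*S)) / 2
    where L is total 14-3-3, S is total binding sites on pN.
    """
    s = total_ligand + total_sites + kd
    return (s - math.sqrt(s * s - 4.0 * total_ligand * total_sites)) / 2.0


def fit_kd(totals, bound_obs, total_sites, lo=1e-3, hi=1e3, rounds=6, grid=50):
    """Estimate the apparent Kd by minimizing the sum of squared residuals.

    Searches a log-spaced grid between lo and hi, then repeatedly narrows
    the interval around the best grid point.
    """
    best = lo
    for _ in range(rounds):
        candidates = [lo * (hi / lo) ** (i / (grid - 1)) for i in range(grid)]
        sse = [
            sum((bound_quadratic(t, total_sites, kd) - b) ** 2
                for t, b in zip(totals, bound_obs))
            for kd in candidates
        ]
        j = sse.index(min(sse))
        best = candidates[j]
        lo = candidates[max(j - 1, 0)]
        hi = candidates[min(j + 1, grid - 1)]
    return best
```

In use, each titration point would supply the total 14-3-3 concentration and the bound concentration computed as total minus `free_from_peak(...)`; `fit_kd` then returns the Kd that best reproduces the bound values in the same concentration units.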

Size-exclusion chromatography coupled to multi-angle light scattering (SEC-MALS)
To determine the absolute masses of various N constructs and their complexes with 14-3-3, we coupled a SEC column to a ProStar 335 UV/Vis detector (Varian Inc., Melbourne, Australia) and a multi-angle laser light scattering detector miniDAWN (Wyatt Technologies). Either Superdex 200 Increase 10/300 (~24 ml, flow rate 0.8 ml/min) or Superdex 200 Increase 5/150 (~3 ml, flow rate 0.45 ml/min) columns (GE Healthcare) were used. The miniDAWN detector was calibrated relative to the scattering from toluene and, together with concentration estimates obtained from the UV detector at 280 nm, was used to determine the Mw distribution of the eluted protein species. All processing was performed in ASTRA 8.0 software (Wyatt Technologies) taking dn/dc equal to 0.185 and using the extinction coefficients listed in Supplementary table 5. Protein content in the eluted peaks was additionally analyzed by SDS-PAGE.
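For orientation, the standard dilute-solution light-scattering relation that underlies this kind of analysis can be written as follows. This is a textbook sketch rather than a description of the exact ASTRA processing: it assumes that the second virial term is negligible at the low concentrations eluting from the column and that the angular dependence of scattering is negligible for particles much smaller than the laser wavelength. The symbols R_θ (excess Rayleigh ratio), n_0 (solvent refractive index), λ_0 (vacuum wavelength of the laser), ℓ (optical path length) and ε_280 (mass extinction coefficient) are introduced here for illustration.

```latex
K^{*} = \frac{4\pi^{2} n_{0}^{2}}{N_{A}\,\lambda_{0}^{4}}
        \left(\frac{dn}{dc}\right)^{2},
\qquad
M_{w} \approx \frac{R_{\theta}}{K^{*}\, c},
\qquad
c = \frac{A_{280}}{\varepsilon_{280}\,\ell}
```

Here c is the mass concentration of the eluting species derived from the UV signal, which is why both the refractive index increment and the extinction coefficients enter the mass determination.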

Data availability
The source LC-MS data on SARS-CoV-2 N phosphoproteomics are available along with the paper as Supplementary data file 1.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.