Biological context

SARS-CoV-2, the causative agent of COVID-19, emerged late in 2019 to cause a global pandemic. This novel coronavirus is closely related to SARS-CoV, which emerged in 2002. While several coronaviruses routinely infect humans, causing mild respiratory symptoms (van der Hoek 2007), SARS-CoV-2 is capable of causing severe respiratory symptoms or even death (Bchetnia et al. 2020). Coronaviruses are enveloped viruses with large positive-sense RNA genomes. At 30 kb or more, some genomes of the order Nidovirales, which includes the coronaviruses, are among the largest RNA genomes known and require highly processive RNA synthesis for their replication.

Upon entering host cells, the coronavirus RNA genome acts as an mRNA that is translated by host ribosomes to produce the viral polyproteins pp1a and pp1ab. These polyproteins are cleaved to produce a suite of 16 non-structural proteins (nsps) that are responsible for replication and transcription of the viral RNA genome (Snijder et al. 2016). The viral nsps assemble membrane-enclosed compartments containing the viral RNA-dependent RNA polymerase. The minimal polymerase complex required for activity in vitro is composed of nsp7, nsp8 and nsp12. While the polymerase active site is contained wholly within nsp12, nsp7 and nsp8 act as essential co-factors for enzyme activity, enabling processive RNA synthesis (Subissi et al. 2014). Nsp7 has also been proposed to act with nsp8 as part of an RNA primase that generates RNA primers for viral RNA synthesis (Imbert et al. 2006; te Velthuis et al. 2012). However, neither the primase activity nor its proposed additional role in extending RNA primers has been universally reproduced (Subissi et al. 2014).

Previous structures of nsp7 show the protein to be composed of four helical regions, where the positioning of the N- and C-terminal helical regions is altered upon binding to nsp8 (Johnson et al. 2010; Zhai et al. 2005). Structures of the polymerase complexes of SARS-CoV (Kirchdoerfer and Ward 2019) and SARS-CoV-2 (Gao et al. 2020) show that two subunits of nsp8 bind to nsp12 and that one of these nsp8 subunits interacts with nsp7, giving a 1:2:1 nsp7:nsp8:nsp12 stoichiometry. A separate crystal structure of SARS-CoV nsp7 bound to nsp8, while resembling the nsp7-nsp8 interactions in the nsp7-nsp8-nsp12 cryoEM structures, showed the assembly of a large 8:8 protein complex (Zhai et al. 2005). The lack of solution evidence for this large complex, together with the alternate nsp7 and nsp8 assemblies observed in crystal structures from feline coronavirus (Xiao et al. 2012) and SARS-CoV-2 (Konkolova et al. 2020), leaves the biological role of these large nsp7-nsp8 complexes in the virus life cycle ambiguous. Because nsp7 is a well-conserved component of the virus replication machinery, a greater understanding of its structure and dynamics will accelerate our understanding of this essential protein complex, improving models of protein–protein interactions and laying an important foundation for the development of antiviral therapeutics.

Methods and experiments

Construct design

This study uses the SARS-CoV-2 NCBI reference genome entry NC_045512.2, identical to GenBank entry MN908947.3 (Wu et al. 2020). The nsp7 coding sequence from this entry was inserted into a pET46 vector containing an N-terminal His6-tag followed by enterokinase (Ek) and tobacco etch virus (TEV) protease cleavage sites. Due to the nature of the TEV protease cleavage site, the purified protein contained one artificial N-terminal residue (G0) preceding the native protein sequence.

Sample preparation

Uniformly 13C,15N-labelled nsp7 protein was expressed in Escherichia coli strain Rosetta2 pLysS in M9 minimal medium containing 1 g/L 15NH4Cl (Cambridge Isotope Laboratories), 4 g/L 13C6-D-glucose (Cambridge Isotope Laboratories) and 100 μg/mL ampicillin. Bacterial cultures were grown at 37 °C to an optical density at 600 nm of 0.8 and induced with 0.5 mM IPTG for 14–16 h at 16 °C. The cell pellet was resuspended in buffer A (10 mM HEPES, pH 7.4, 300 mM NaCl, 30 mM imidazole and 2 mM dithiothreitol). The cells were lysed using a microfluidizer operating at 20,000 psi. The lysate was clarified by centrifugation at 25,000 × g for 30 min followed by filtration through a 0.45 µm vacuum filter. Clarified supernatant was bound to Ni–NTA agarose (Qiagen), washed with buffer A and then eluted with buffer A containing 300 mM total imidazole. Protein-containing fractions were pooled and cleaved with 1% (w/w) TEV protease overnight at room temperature while dialyzing against 10 mM MOPS, pH 7.0, 150 mM NaCl, 2 mM dithiothreitol. TEV protease and the cleaved tag were removed via a second IMAC purification. The protein was further purified on a Superdex 200 column (GE Life Sciences) using a buffer containing 10 mM MOPS, pH 7.0, 150 mM NaCl, 2 mM dithiothreitol. Fractions containing the purified protein were concentrated using Amicon Ultra concentrators (Millipore Sigma). The final NMR sample contained 1.7 mM 13C,15N-nsp7, 10 mM MOPS, pH 7.0, 150 mM NaCl, 2 mM DTT, 0.025% NaN3 and 7% D2O.

NMR experiments

All experiments for the backbone and side chain assignments of nsp7 were recorded at 298 K using 600 MHz Varian VNMRS and Bruker Avance III spectrometers, equipped with an H/C/N Cryoprobe. All spectra were acquired using standard pulse sequences optimized to achieve the best performance on cryogenic probes and with non-uniform sampling. The set of NMR experiments used for resonance assignments is summarized in Table 1. Proton resonances were calibrated with respect to the signal of 2,2-dimethylsilapentane-5-sulfonic acid (DSS). Nitrogen and carbon chemical shifts were referenced indirectly to the 1H standard using a conversion factor derived from the ratio of NMR frequencies (Wishart et al. 1995). Spectra were processed using NMRPipe (Delaglio et al. 1995) with SMILE (Ying et al. 2017) for NUS reconstruction and analyzed using NMRFAM-Sparky (Lee et al. 2015).
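The indirect referencing step can be illustrated with a short calculation. The following is a minimal sketch rather than the processing actually used for these spectra: it assumes the frequency ratios (Ξ) for 13C and 15N relative to the 1H signal of DSS as commonly tabulated from Wishart et al. (1995), and the 1H frequency of DSS used in the example is a hypothetical value for a nominal 600 MHz spectrometer.

```python
# Frequency ratios (Xi) relative to the 1H resonance of DSS, as commonly
# tabulated from Wishart et al. (1995). These values are an assumption of
# this sketch; they are not listed in the text.
XI = {"13C": 0.251449530, "15N": 0.101329118}


def zero_ppm_frequency_mhz(h1_dss_mhz: float, nucleus: str) -> float:
    """Return the 0 ppm frequency (MHz) of a heteronucleus, derived from
    the measured 1H frequency of DSS (MHz)."""
    return h1_dss_mhz * XI[nucleus]


def chemical_shift_ppm(observed_mhz: float, reference_mhz: float) -> float:
    """Convert an observed frequency into a chemical shift in ppm
    relative to the given 0 ppm reference frequency."""
    return (observed_mhz - reference_mhz) / reference_mhz * 1e6


# Hypothetical 1H frequency of the DSS methyl signal (illustration only).
h1_dss = 600.130000
c13_reference = zero_ppm_frequency_mhz(h1_dss, "13C")
n15_reference = zero_ppm_frequency_mhz(h1_dss, "15N")
print(f"13C 0 ppm: {c13_reference:.6f} MHz")
print(f"15N 0 ppm: {n15_reference:.6f} MHz")
```

Only the 1H dimension needs a directly observed reference signal in this scheme; the 13C and 15N zero points follow from it by multiplication.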

Table 1 List of experiments collected to perform the sequence specific assignment of nsp7

Assignments and data deposition

The 1H-15N HSQC spectrum of nsp7 shows well-dispersed amide signals (Fig. 1). Assignments were performed with i-PINE (Lee et al. 2019) through the automated PINE-Sparky2 interface (Lee and Markley 2018), followed by manual confirmation in NMRFAM-Sparky (Lee et al. 2015). Backbone assignments are 96% complete, with G0, S1, and D67 not visible in the 15N-HSQC spectrum. For the nsp7 sequence (S1-Q83), assignments are 99% complete for Cα, Cβ, and CO (only V66 unassigned), and 98% complete for HN and N (D67 and S1 unassigned). Secondary structure was predicted with TALOS-N (Shen and Bax 2013) from the chemical shift assignments of five atoms (HN, Cα, Cβ, CO, N) for each residue in the sequence. The results for nsp7 are shown in Fig. 2.
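The per-atom completeness figures for the native sequence follow from simple counting. The sketch below is illustrative only: it assumes that each of the 83 residues (S1-Q83) contributes one expected resonance per atom type, which ignores residue-specific exceptions such as the absence of Cβ in glycine, and the function name is hypothetical.

```python
def percent_complete(n_expected: int, n_missing: int) -> float:
    """Percentage of expected resonances that were assigned."""
    if n_expected <= 0 or not 0 <= n_missing <= n_expected:
        raise ValueError("counts must satisfy 0 <= n_missing <= n_expected, n_expected > 0")
    return 100.0 * (n_expected - n_missing) / n_expected


N_RESIDUES = 83  # native nsp7 sequence, S1-Q83 (G0 excluded)

# Unassigned residues per atom group, as stated in the text.
unassigned = {
    "CA, CB, CO": ["V66"],
    "HN, N": ["S1", "D67"],
}

for atoms, residues in unassigned.items():
    pct = percent_complete(N_RESIDUES, len(residues))
    print(f"{atoms}: {pct:.1f}% (rounds to {round(pct)}%)")
```

Under this simplified counting, one missing residue out of 83 rounds to 99% and two missing residues round to 98%, consistent with the values quoted above.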

Fig. 1

Assigned 1H,15N-HSQC spectrum of the 13C,15N-labelled SARS-CoV-2 nsp7 at 1.7 mM concentration in 10 mM MOPS, pH 7.0, 150 mM NaCl, 2 mM DTT, 0.025% NaN3 and 7% D2O measured at 298 K on a 600 MHz Agilent NMR Spectrometer with backbone NH chemical shift assignments shown. The inset shows the central region of the spectrum enlarged for clarity

Fig. 2

Display of TALOS-N predicted secondary structure for nsp7. Helical probability is shown in red; residues that are highly dynamic, predicted to be coil, or for which there is no consistent prediction are not shown
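A per-residue helical probability profile such as the one displayed in Fig. 2 can be collapsed into contiguous helical segments. The following is a minimal sketch under stated assumptions: the input is a list of (residue number, helix probability) pairs prepared by the user, the 0.5 probability threshold and the minimum segment length of three residues are arbitrary illustrative choices, and the example values are hypothetical rather than taken from the nsp7 prediction.

```python
from typing import Iterable, List, Tuple


def helical_segments(
    probabilities: Iterable[Tuple[int, float]],
    threshold: float = 0.5,
    min_length: int = 3,
) -> List[Tuple[int, int]]:
    """Group consecutive residues with helix probability >= threshold
    into (first_residue, last_residue) segments of at least min_length."""
    segments: List[Tuple[int, int]] = []
    start = prev = None
    for resnum, prob in probabilities:
        is_helix = prob >= threshold
        contiguous = prev is not None and resnum == prev + 1
        if is_helix and start is not None and contiguous:
            prev = resnum  # extend the open segment
            continue
        # Close the open segment (if any) before deciding what to do next.
        if start is not None:
            if prev - start + 1 >= min_length:
                segments.append((start, prev))
            start = prev = None
        if is_helix:
            start = prev = resnum  # open a new segment
    if start is not None and prev - start + 1 >= min_length:
        segments.append((start, prev))
    return segments


# Hypothetical probabilities for illustration only.
example = [(1, 0.1), (2, 0.9), (3, 0.8), (4, 0.95), (5, 0.2),
           (6, 0.9), (7, 0.9), (9, 0.9), (10, 0.9), (11, 0.9)]
print(helical_segments(example))
```

Residues missing from the input, for example those without a consistent prediction, break a segment in this sketch, which mirrors the way such residues are left blank in the figure.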

With the exception of G0, the aliphatic and aromatic side chain C-H groups for all residues were assigned (> 99% completeness overall, 100% for nsp7 sequence). In addition, the Nδ2-Hδ2 groups of N residues, the Nε2-Hε2 groups of Q residues, the Nε-Hε groups of R residues and the Nε1-Hε1 group of W29 were also assigned. The Nζ-Hζ groups of K residues and the Nδ1-Hδ1 and Nε2-Hε2 groups of H36 were not assigned.

The structure of nsp7 from SARS-CoV was previously determined by NMR by the Wüthrich lab, first at pH 7.5 and high ionic strength (Peti et al. 2005) (BMRB ID 6513, PDB ID 1YSY) and later at pH 6.5 (Johnson et al. 2010) (BMRB ID 16981, PDB ID 2KYS). The sequences of nsp7 from SARS-CoV-2 and SARS-CoV are nearly identical, with only a single conservative amino acid difference at position 70 (Fig. 3). The backbone chemical shifts for nsp7 from SARS-CoV-2 and SARS-CoV are very similar, as might be expected given the very high sequence identity. The dihedral angles predicted by TALOS-N for nsp7 from SARS-CoV-2 are in good agreement with the previous SARS-CoV nsp7 NMR structure (Fig. 3).
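A sequence comparison of this kind, between two homologues of equal length, reduces to a position-by-position check. The sketch below illustrates the idea with short hypothetical peptides; the function names are illustrative, and the real nsp7 sequences would need to be taken from the corresponding database entries.

```python
from typing import List, Tuple


def substitutions(seq_a: str, seq_b: str, first_residue: int = 1) -> List[Tuple[int, str, str]]:
    """List (position, residue_in_a, residue_in_b) for every mismatch
    between two ungapped sequences of equal length."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length (no gaps supported)")
    return [
        (pos, a, b)
        for pos, (a, b) in enumerate(zip(seq_a, seq_b), start=first_residue)
        if a != b
    ]


def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of identical positions between two equal-length sequences."""
    if not seq_a:
        raise ValueError("sequences must not be empty")
    return 100.0 * (len(seq_a) - len(substitutions(seq_a, seq_b))) / len(seq_a)


# Hypothetical 10-residue peptides differing at one position
# (placeholders, not the nsp7 sequences).
toy_a = "MKTAYIAKQR"
toy_b = "MKTAYIARQR"
print(substitutions(toy_a, toy_b))
print(percent_identity(toy_a, toy_b))
```

Applied to two 83-residue sequences that differ at a single position, the same calculation gives an identity of 82/83, or roughly 98.8%.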

Fig. 3

Comparison of nsp7 from SARS-CoV-2 and SARS-CoV. a Sequence comparison shows a single conservative amino acid substitution. b The SARS-CoV-2 helical regions (red) shown in Fig. 2 are plotted on the SARS-CoV nsp7 structure determined by NMR at pH 6.5 (PDB 2KYS). This structure is of higher quality than the pH 7.5 structure (PDB 1YSY) because more complete assignments and a larger number of restraints were obtained at pH 6.5. The location of the single amino acid difference between SARS-CoV and SARS-CoV-2 nsp7 is shown as blue sticks

The chemical shift values for the 1H, 13C and 15N resonances of nsp7 have been deposited at the BioMagResBank (https://www.bmrb.wisc.edu) under accession number 50337. Raw data has been deposited in BMRbig (https://bmrbig.org/) under deposition ID bmrbig4.