The mechanism of RNA capping by SARS-CoV-2

The SARS-CoV-2 RNA genome contains a 5′-cap that facilitates translation of viral proteins, protection from exonucleases and evasion of the host immune response 1-4. How this cap is made is not completely understood. Here, we reconstitute the SARS-CoV-2 7MeGpppA2′-O-Me-RNA cap using virally encoded non-structural proteins (nsps). We show that the kinase-like NiRAN domain 5 of nsp12 transfers RNA to the amino terminus of nsp9, forming a covalent RNA-protein intermediate (a process termed RNAylation). Subsequently, the NiRAN domain transfers RNA to GDP, forming the cap core structure GpppA-RNA. The nsp14 6 and nsp16 7 methyltransferases then add methyl groups to form functional cap structures. Structural analyses of the replication-transcription complex bound to nsp9 identified key interactions that mediate the capping reaction. Furthermore, we demonstrate in a reverse genetics system 8 that the N-terminus of nsp9 and the kinase-like active site residues in the NiRAN domain are required for successful SARS-CoV-2 replication. Collectively, our results reveal an unconventional mechanism by which SARS-CoV-2 caps its RNA genome, thus exposing a new target in the development of antivirals to treat COVID-19.


The NiRAN domain RNAylates nsp9

Given the ability of the NiRAN domain to transfer NMPs to nsp9 using NTPs as substrates, we wondered whether the NiRAN domain could also utilize 5′-pppRNA in a similar fashion (Fig. 2a). We synthesized a 5′-pppRNA 10-mer corresponding to the first 10 bases in the leader sequence (LS10) of the SARS-CoV-2 genome (hereafter referred to as 5′-pppRNA LS10) (Extended Data Table 1). We incubated 5′-pppRNA LS10 with nsp9 and nsp12 and analysed the reaction products by SDS-PAGE. Remarkably, we observed an electrophoretic mobility shift in nsp9 that was time-dependent, sensitive to RNase A treatment and required an active NiRAN domain, but not an active RdRp domain (Fig. 2b). Intact mass analyses of the reaction products confirmed the incorporation of monophosphorylated RNA LS10 (5′-pRNA LS10) into nsp9 (Fig. 2c). The reaction was dependent on Mn2+ (Extended Data Fig. 5a) and required a triphosphate at the 5′-end of the RNA (Extended Data Fig. 5b). Substituting Ala for Asn1 reduced the incorporation of RNA LS10 into nsp9 (Fig. 2d). We also observed NiRAN-dependent RNAylation of nsp9 using LS RNAs ranging from 2 to 20 nucleotides (Fig. 2e). Mutation of the first A to any other nucleotide markedly reduced RNAylation (Fig. 2f). Thus, the NiRAN domain RNAylates the N-terminus of nsp9 in a substrate-selective manner.

[…] Fig. 6a) 24,25. Because the NiRAN domain transfers 5′-pRNA to nsp9, we hypothesized that this protein-RNA species may be an intermediate in a similar reaction mechanism to that of the VSV system. To test this hypothesis, we purified the nsp9-pRNA LS10 species by ion exchange and gel filtration chromatography and incubated it with GDP in the presence of nsp12. Treatment with GDP deRNAylated nsp9 in a NiRAN-dependent manner, as judged by the nsp9 electrophoretic mobility on SDS-PAGE (Fig. 3a) and its molecular weight based on intact mass analysis (Fig. 3b). The reaction was time-dependent (Fig. 3c), preferred Mg2+ over Mn2+ (Extended Data Fig. 6b) and was specific for GDP, and to some extent GTP, but not the other nucleotides tested (Fig. 3d). Interestingly, although inorganic pyrophosphate (PPi) was able to deAMPylate nsp9-AMP, it was unable to deRNAylate nsp9-pRNA LS10 (Fig. 3e) (see Discussion).

We used Urea-PAGE to analyse the fate of the RNA LS10 during the deRNAylation reaction. Treatment of nsp9-pRNA LS10 with nsp12 and [α-32P]GDP resulted in a [32P]-labelled RNA species that migrated similarly to GpppA-RNA LS10 produced by the Vaccinia capping enzyme (Fig. 3f). The reaction was dependent on a functional NiRAN domain but not an active RdRp domain. To confirm the presence of a GpppA-RNA cap, we digested the RNA produced from the nsp12 reaction with P1 nuclease and detected GpppA by high performance liquid chromatography/mass spectrometry (HPLC/MS) analysis (Fig. 3g). Thus, the NiRAN domain is a GDP polyribonucleotidyltransferase (GDP-PRNTase) that mediates the transfer of 5′-pRNA from nsp9 to GDP.

In our attempts to generate GpppA-RNA LS10 in a "one pot" reaction, we found that GDP inhibited the RNAylation reaction (Extended Data Fig. 6c). However, GpppA-RNA LS10 could be generated in one pot provided that RNAylation occurred prior to the addition of GDP (Extended Data Fig. 6c, d).

Nsp14 and nsp16 catalyse the formation of the cap-0 and cap-1 structures

The SARS-CoV-2 genome encodes an N7-MTase domain within nsp14 6 and a 2′-O-MTase in nsp16, the latter of which requires nsp10 for activity 7. Nsp14 and the nsp10/16 complex use S-adenosyl methionine (SAM) as the methyl donor.
To test whether NiRAN-synthesized GpppA-RNA LS10 can be methylated, we incubated 32P-labelled GpppA-RNA LS10 with nsp14 and/or the nsp10/16 complex in the presence of SAM and separated the reaction products by Urea-PAGE (Fig. 4a). We extracted RNA from the reaction, treated it with P1 nuclease and CIP, and then analysed the products by thin layer chromatography (TLC) (Fig. 4b). As expected, the NiRAN- […] GpppA-RNA LS10 to form the cap-0 and cap-1 structures, respectively (Fig. 4c). Thus, the SARS-CoV-2 7MeGpppA2′-O-Me-RNA capping mechanism can be reconstituted in vitro using virally encoded proteins.

Efficient translation of mRNAs is dependent on eIF4E binding to the 7MeGpppA-RNA cap 26. To test whether the SARS-CoV-2 RNA cap is functional, we incubated [32P]-labelled 7MeGpppA-RNA LS10 with GST-tagged eIF4E. We observed [32P]-labelled RNA in GST pulldowns of [32P]7MeGpppA-RNA but not the unmethylated derivative (Fig. 4d). Thus, the 7MeGpppA-RNA cap generated by SARS-CoV-2 encoded proteins is a substrate for eIF4E in vitro, suggesting that the cap is functional.

We determined a cryo-EM structure of the nsp7/8/9/12 complex and observed an nsp9 monomer bound in the NiRAN active site (Fig. 5a, Extended Data Fig. 7-9, Extended Data Table 2). The native N-terminus of nsp9 occupies a similar position to that in previously reported structures using a non-native N-terminus of nsp9 (Fig. 5b, c) 21. Our cryo-EM analysis was hindered by the preferred orientation of the complex and sample heterogeneity, yielding final maps with high levels of anisotropy, with distal portions of nsp9 missing, and weak density for the N-lobe of the NiRAN domain (Extended Data Fig. 7, 8). Therefore, we used our model and the complex structure by Yan et al. 21 (PDB ID: 7CYQ) to study the structural basis of NiRAN-mediated RNA capping.
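
The biochemical steps reconstituted above (RNAylation of nsp9, transfer of 5′-pRNA to GDP, then N7 and 2′-O methylation) can be restated as a toy bookkeeping sketch in Python. This is not a kinetic or chemical model: species are plain strings, all function names are hypothetical, and the requirement for a leading A is treated as absolute although the text reports only a marked reduction. Applying the N7 step before the 2′-O step is likewise an assumption of the sketch, and the partial activity seen with GTP is not modelled.

```python
# Toy bookkeeping of the reconstituted capping pathway.
# Species are plain strings; every name here is hypothetical and illustrative.

def rnaylate(rna, niran_active=True, gdp_present=False):
    """Step 1 (nsp12 NiRAN): 5'-pppA-RNA + nsp9 -> nsp9-pA-RNA."""
    if not niran_active:
        raise ValueError("RNAylation requires an active NiRAN domain")
    if gdp_present:
        raise ValueError("GDP inhibited RNAylation in the one-pot reaction")
    if not rna.startswith("pppA"):
        # Simplification: the text reports reduced, not abolished, activity
        raise ValueError("expects a 5'-triphosphate RNA starting with A")
    return "nsp9-" + rna[2:]  # keep one phosphate: pppA-RNA -> nsp9-pA-RNA


def transfer_to_gdp(adduct, acceptor="GDP", niran_active=True):
    """Step 2 (nsp12 NiRAN): nsp9-pA-RNA + GDP -> GpppA-RNA + nsp9."""
    if not niran_active:
        raise ValueError("deRNAylation requires an active NiRAN domain")
    if acceptor != "GDP":
        # PPi did not deRNAylate nsp9; GTP worked only to some extent
        raise ValueError("only GDP is modelled as an acceptor")
    if not adduct.startswith("nsp9-p"):
        raise ValueError("expects an RNAylated nsp9 species")
    return "Gpp" + adduct[len("nsp9-"):]  # nsp9-pA-RNA -> GpppA-RNA


def n7_methylate(cap):
    """Step 3 (nsp14 + SAM): GpppA-RNA -> 7MeGpppA-RNA (cap-0)."""
    if not cap.startswith("Gppp"):
        raise ValueError("expects an unmethylated GpppA-RNA cap core")
    return "7Me" + cap


def o2_methylate(cap):
    """Step 4 (nsp10/16 + SAM): adds 2'-O-Me to the first nucleotide (cap-1)."""
    if "GpppA-" not in cap:
        raise ValueError("expects a GpppA-RNA cap without 2'-O-Me")
    return cap.replace("GpppA-", "GpppA2'-O-Me-", 1)


adduct = rnaylate("pppA-RNA")
core = transfer_to_gdp(adduct)
cap0 = n7_methylate(core)
cap1 = o2_methylate(cap0)
print(adduct, core, cap0, cap1, sep=" -> ")
# prints: nsp9-pA-RNA -> GpppA-RNA -> 7MeGpppA-RNA -> 7MeGpppA2'-O-Me-RNA
```

Calling `rnaylate` with `gdp_present=True` raises an error, mirroring the observation that the one-pot reaction only works when RNAylation precedes the addition of GDP.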

The first four residues of nsp9 extend into the NiRAN active site, forming electrostatic and hydrophobic contacts in and around a groove near the kinase-like active site (Fig. 5d). Asn1 of nsp9 is positioned inside of the active site, primed for transfer of 5′-pppRNA onto its N-terminus. Although the terminal NH2 group of nsp9 is the substrate for RNAylation, the local quality of the structures is not high enough to distinguish its exact position. We have modelled the nsp9 acceptor NH2 pointing towards what appears to be the phosphates of the nucleotide analogue UMP-NPP in the active site (Fig. 5b). In the structure by Yan et al. 21, Asn1 was assigned an opposite conformation and there are unmodelled residues (non-native N-terminus; NH2-Gly-Ser-) visible in the density maps, distorting local structural features (Fig. 5c, arrow) 27.

Asn2 of nsp9 is in a negatively charged cleft around the NiRAN active site, and contacts Arg733, which extends from the polymerase domain and is partially responsible for positioning nsp9 (Fig. 5e). Both Leu4 and the C-terminal helix of nsp9 form hydrophobic interactions with a β-sheet (β8-β9-β10) in the N-lobe of the NiRAN domain (Fig. 5e, f). The N-terminal cap of the nsp9 C-terminal helix also forms electrostatic interactions with a negatively charged pocket on the surface of the NiRAN domain (Fig. 5f). Nsp12 lacking the RdRp domain (ΔRdRp; 1-326) neither RNAylates nsp9 nor processes nsp9-pRNA LS10 to form GpppA-RNA (Fig. 5g). Likewise, deleting the C-terminal helix on nsp9 (ΔC; 1-92) and Ala substitutions of Asn1 and Asn2 abolished RNAylation (Fig. 5h).

The NiRAN domain resembles SelO, with an RMSD of 5.7 Å over 224 Cα atoms (PDB ID: […]); however, like in SelO, Asp208 is next to the metal binding Asn209 (PKA; N171) and may act as a catalytic base to activate the NH2 group on the N-terminus of nsp9 (Fig. 5i).

In canonical kinases, the β1-β2 G-loop stabilizes the phosphates of ATP 28. In contrast, the NiRAN domain contains a β-hairpin insert (β2-β3) where the β1-β2 G-loop should be (Extended Data Fig. 10b). This insertion not only makes contacts with the N-terminus of nsp9, but also contains a conserved Lys (K50) that extends into the active site and stabilizes the phosphates of the bound nucleotide. Likewise, Arg116 also contacts the phosphates of the nucleotide. SelO contains a similar set of basic residues pointing into the active site that accommodate the flipped orientation of the nucleotide to facilitate AMPylation (Extended Data Fig. 10b). Notably, Lys73, Arg116 and Asp218 in SARS-CoV-1 nsp12 are required for viral replication 5.

The kinase-like residues of the NiRAN domain and the N-terminus of nsp9 are essential for SARS-CoV-2 replication

To determine the importance of the NiRAN domain and the N-terminus of nsp9 in viral replication, we used a DNA-based reverse genetics system that can rescue infectious SARS-CoV-2 (Wuhan-Hu-1/2019 isolate) expressing a fluorescent reporter 8 (Extended Data Fig. 11a). We introduced single point mutations in nsp9 (N1A, N1D and N2A) and nsp12 (K73A, D218A and D760A) and quantified the virus in supernatants of producer cells by RT-qPCR to detect the viral N gene. We observed a 400- to 4000-fold reduction in viral load for all the mutants compared to WT (Fig. 5j, Extended Data Fig. 11a). To account for the possibility of a proteolytic defect in the mutant viral polyprotein, we tested whether the main viral protease nsp5 (Mpro) can cleave an nsp8-nsp9 fusion protein containing the Asn1/Asn2 mutations in nsp9. The N1D mutant failed to be cleaved by nsp5, suggesting that the replication defect observed for this mutant is a result of inefficient processing of the viral polyprotein. However, the N1A and N2A mutants were efficiently cleaved by nsp5 (Extended Data Fig. 11b, c). Collectively, these data provide genetic evidence that the residues involved in capping of the SARS-CoV-2 genome are essential for viral replication.
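
To give a sense of scale for the 400- to 4000-fold reductions, a fold change in template can be converted to the corresponding shift in RT-qPCR threshold cycle. The sketch below assumes the standard exponential amplification model, in which template multiplies by (1 + E) per cycle, with E = 1 for perfect doubling. The text does not state how the RT-qPCR data were quantified, so the function names and the efficiency parameter are illustrative assumptions rather than the authors' analysis.

```python
import math


def fold_to_delta_ct(fold, efficiency=1.0):
    """Threshold-cycle shift equivalent to a given fold reduction.

    efficiency=1.0 assumes perfect doubling per cycle (an idealization).
    """
    if fold <= 0 or efficiency <= 0:
        raise ValueError("fold and efficiency must be positive")
    return math.log(fold) / math.log(1.0 + efficiency)


def delta_ct_to_fold(delta_ct, efficiency=1.0):
    """Inverse conversion: fold change implied by a threshold-cycle shift."""
    return (1.0 + efficiency) ** delta_ct


for fold in (400, 4000):
    print(f"{fold}-fold reduction ~ {fold_to_delta_ct(fold):.1f} cycles")
```

Under perfect doubling, 400-fold and 4000-fold reductions correspond to shifts of roughly 8.6 and 12.0 cycles, respectively; a lower assumed efficiency would translate the same fold change into a larger cycle shift.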

SARS-CoV-2 nsp12 is thought to initiate transcription/replication starting with an NTP, or a short 5′-pppRNA primer 29. Cryo-EM structures of the RTC suggest that the dsRNA product makes its way out of the RdRp active site in a straight line, supported by the nsp8 helical stalks 21,30,31. In a cis capping model, the helical duplex with nascent 5′-pppRNA would then need to unwind, flex 90°, and extend into the NiRAN active site ~70 Å away (Fig. 6a). More likely, a separate RTC could perform capping in trans (Fig. 6b). Notably, Perry et al. 32 propose that the nascent […]

The SARS-CoV-2 capping mechanism is reminiscent of the capping mechanism used by VSV, although there are some differences. The VSV large (L) protein is a multifunctional enzyme that […]

In summary, we have defined the mechanism by which SARS-CoV-2 caps its genome and have reconstituted this reaction in vitro using non-structural proteins encoded by SARS-CoV-2. Our […]

[…] Fractions containing YIPP were pooled, concentrated, and stored as above.

[…] were resolved in a 15% TBE-Urea PAGE gel. The gel was then stained with toluidine blue O and the 32P signal detected via autoradiography.

The other half of the GTase reactions were treated with 10 U/ml P1 nuclease for 1 h at 37°C. Reactions were then split in half again with one half treated with 1 U/ml Quick CIP for 30 min at […]

[…] 1H-1H TOCSY pulse train using a DIPSI-2 sequence and a field strength of 10 kHz was appended to the end of the HSQC sequence to observe signals from 1H nuclei that are further away from 31P. Given the lower sensitivity of this experiment, each FID was accumulated with 4096 scans and a repetition delay of 1 s was used for a total recording time of 2 days and 14 hours.

Methyltransferase control reaction.