Signature motifs of GDP polyribonucleotidyltransferase, a non-segmented negative strand RNA viral mRNA capping enzyme, domain in the L protein are required for covalent enzyme–pRNA intermediate formation

The unconventional mRNA capping enzyme (GDP polyribonucleotidyltransferase, PRNTase; block V) domain in RNA polymerase L proteins of non-segmented negative strand (NNS) RNA viruses (e.g. rabies, measles, Ebola) contains five collinear sequence elements, Rx(3)Wx(3–8)ΦxGxζx(P/A) (motif A; Φ, hydrophobic; ζ, hydrophilic), (Y/W)ΦGSxT (motif B), W (motif C), HR (motif D) and ζxxΦx(F/Y)QxxΦ (motif E). We performed site-directed mutagenesis of the L protein of vesicular stomatitis virus (VSV, a prototypic NNS RNA virus) to examine participation of these motifs in mRNA capping. Similar to the catalytic residues in motif D, G1100 in motif A, T1157 in motif B, W1188 in motif C, and F1269 and Q1270 in motif E were found to be essential or important for the PRNTase activity in the step of the covalent L-pRNA intermediate formation, but not for the GTPase activity that generates GDP (pRNA acceptor). Cap defective mutations in these residues induced termination of mRNA synthesis at position +40 followed by aberrant stop–start transcription, and abolished virus gene expression in host cells. These results suggest that the conserved motifs constitute the active site of the PRNTase domain and the L-pRNA intermediate formation followed by the cap formation is essential for successful synthesis of full-length mRNAs.


INTRODUCTION
The 5′-terminal cap structure (cap 0, m7G[5′]ppp[5′]N-) of eukaryotic mRNAs is composed of N7-methylguanosine (m7G) linked to the 5′-end of mRNA through the 5′-5′ triphosphate (ppp) bridge, and is essential for various steps of mRNA metabolism, including stability and translation [reviewed in (1-4)]. The cap 0 structure is formed on pre-mRNAs by the nuclear mRNA capping enzyme, with its RNA 5′-triphosphatase (RTPase) and mRNA guanylyltransferase (GTase) activities, followed by mRNA (guanine-N7)-methyltransferase (MTase). In higher eukaryotes, the cap 0 structure is further methylated at the ribose-2′-O position of the first nucleoside of mRNA by mRNA (nucleoside-2′-O)-MTase to generate the cap 1 structure (m7G[5′]ppp[5′]Nm). Recently, 2′-O-methylation of the cap structure was found to be required for avoiding anti-viral innate immunity through the IFIT-1, Mda5 and RIG-I pathways (5-9). Thus, many eukaryotic viruses need to possess the fully methylated cap 1 structure on their mRNAs to efficiently produce viral proteins in infected cells. However, the molecular mechanisms of cap formation by some RNA viral enzymes are significantly different from those of eukaryotic nuclear enzymes (10-12).
We have demonstrated that pRNA is covalently linked to the Nε2 position of a histidine residue at position 1227 (H1227) in the histidine-arginine (HR) motif of the VSV L protein through a phosphoamide bond and is subsequently transferred to GDP (pRNA acceptor) to form GpppRNA (16). It has also been shown that pRNA linked to H1227 is transferred to PPi to regenerate pppRNA, indicating that the covalent intermediate formation is a reversible reaction (16). Furthermore, both residues of the HR motif were found to be critical for the intermediate formation with the VSV and Chandipura virus (CHPV, Vesiculovirus) L proteins (16,17). Alignment of amino acid sequences of ∼90 NNS RNA viral L proteins revealed that the HR motif is strikingly conserved in L proteins of NNS viruses belonging to different families, Rhabdoviridae (e.g. rabies), Paramyxoviridae (e.g. measles), Filoviridae (e.g. Ebola) and Bornaviridae (e.g. Borna disease), except for fish novirhabdoviruses (16). Recently, we found that the HR motif of the VSV L protein is essential for accurate stop-start transcription to synthesize full-length mRNAs and for VSV growth in host cells (18).
Using our in vitro reconstituted transcription system with the recombinant VSV L and P proteins and the native N-RNA template, cap-defective mutations in the HR motif of the L protein were found to impair mRNA synthesis, but not leader RNA synthesis, by causing aberrant stop-start transcription (18). After synthesis of the leader RNA, the cap-defective mutants frequently terminate N mRNA synthesis at positions +38 and +40 to synthesize transcripts initiated with 5′-ATP (called N1-38 and N1-40, respectively). Surprisingly, a part of the RdRp, stopped at the +38/+40 termination site, was found to generate an unusual 28-nt transcript from position +41 (N41-68) followed by a long 3′-polyadenylated transcript from position +157 (N157-1326, also called N2 RNA) initiated with non-canonical GTP. Higher rates of incorrect transcription termination and re-initiation using cryptic signals within the N gene resulted in dramatic attenuation of synthesis of downstream mRNAs as well as full-length N mRNA. Based on these observations, pre-mRNA capping with the PRNTase domain in the L protein during mRNA chain elongation was suggested to be a critical step to carry out accurate stop-start transcription leading to the production of full-length mRNAs.
In 2008, Li et al. (34) identified the G1154, T1157, H1227 and R1228 residues in block V of the VSV L protein as being involved in cap formation, using alanine scanning mutagenesis and our in vitro oligo-RNA capping assay with GTP as a substrate. Although our later study revealed that H1227 and R1228 are required for the L-pRNA intermediate formation in the PRNTase reaction, it still remains unknown which step(s) of capping (e.g. GTP hydrolysis, L-pRNA intermediate formation, pRNA transfer to GDP) is impaired by the G1154A and T1157A mutations. Furthermore, no other conserved amino acid residues in the putative PRNTase domain involved in mRNA capping have been identified.
Here, using amino acid sequences of more than 220 NNS RNA viral L proteins available in a current public database, we found that five collinear sequence elements (called motifs A-E) are conserved in the putative PRNTase domains of NNS RNA viruses in the order Mononegavirales, as suggested by our previous analysis of a small number of L proteins (13). By performing extensive mutagenesis of the VSV L protein, we identified key amino acid residues in these motifs as critical for the L-pRNA intermediate formation in the PRNTase reaction and accurate stop-start transcription for synthesis of full-length mRNAs. These results suggest that these motifs constitute the active site of the PRNTase domain, and that pre-mRNA capping with the PRNTase domain plays an essential role in NNS RNA viral mRNA biogenesis.

PRNTase and GTPase assays with the VSV L protein
The in vitro oligo-RNA capping assay was performed with the recombinant VSV L protein [60 ng, wild-type (WT) or mutant] using [α-32P]GDP and pppAACAG oligo-RNA as substrates according to the method described before (11,16,49). The L-pRNA intermediate formation assay was carried out with 0.3 μg of the recombinant VSV L protein (WT or mutant) and 32P-labeled pppAACAG oligo-RNA as described previously (11,16,49). The GTPase assay was carried out with 0.3 μg of the recombinant VSV L protein (WT or mutant) using [γ-32P]GTP as described (15,16). WT and mutant L proteins were expressed in insect cells and purified as described previously (16,49).

In vitro transcription assay with the VSV L protein
The in vitro VSV transcription assay was performed with the recombinant L protein (0.15 μg, WT or mutant), recombinant P protein (0.05 μg) and N-RNA complex (0.4 μg protein) as described previously (18,49). 32P-labeled and unlabeled VSV transcripts were synthesized with [α-32P]GTP and GTP, respectively (18,49). mRNAs were deadenylated with RNase H in the presence of oligo(dT) (49) where indicated in the figure legends. Unlabeled VSV transcripts were post-labeled with vaccinia virus capping enzyme (Epicentre) in the presence of [α-32P]GTP as described (18), except that 0.1 mM S-adenosylmethionine was included instead of inorganic pyrophosphatase. 32P-labeled short (e.g. leader RNA) and long (e.g. mRNAs) transcripts were analyzed by electrophoresis in 20% and 5% polyacrylamide gels, respectively, containing 7 M urea (urea-PAGE), followed by autoradiography (18,49).

Mini-genome assay
The VSV mini-genome assay using a plasmid expressing the VSV L protein (WT or mutant) was carried out essentially as described by Grdzelishvili et al. (23) with some modifications (see the Supplementary Methods). A reporter gene product and the VSV L protein expressed in the transfected cells were detected by ELISA and Western blotting, respectively.

Generation of recombinant VSV L proteins with mutations in conserved amino acid residues surrounding the HR motif
To identify highly conserved amino acid sequence motifs close to the catalytic HR motif as candidate motifs for the putative PRNTase domain, we analyzed amino acid sequences surrounding the HR motif (block V) from representative viruses belonging to different genera in the Rhabdoviridae, Paramyxoviridae, Filoviridae, Bornaviridae and Nyamiviridae families using the PSI-Coffee alignment program (50) (Supplementary Figure S1). As highlighted in Figure 1, we found that block V contains five conserved motifs, Rx(3)Wx(3-8)ΦxGxζx(P/A) (Φ and ζ indicate hydrophobic and hydrophilic amino acids, respectively; referred to as motif A), (Y/W)ΦGSxT (motif B), W (motif C), HR (motif D) and ζxxΦx(F/Y)QxxΦ (motif E). The G1154 and T1157 residues of the VSV L protein are present within motif B, which is located ∼75 residues upstream of motif D. We also confirmed that these motifs are strikingly conserved in L proteins of more than 220 known NNS RNA viruses (see Supplementary Table S1 and Figure S2).
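The motif notation above maps naturally onto regular expressions. The following is a minimal sketch of how motifs A, B and E could be searched for in a protein sequence. The residue sets used for Φ (hydrophobic) and ζ (hydrophilic) are assumptions made for illustration, since the text does not list them, and the function name `find_motifs` and the toy peptides in the usage note are hypothetical. Motifs C (W) and D (HR) are too short to search for on their own and are omitted.

```python
import re

# ASSUMPTION: the text only says "hydrophobic" and "hydrophilic";
# these residue sets are illustrative choices, not the authors' definition.
HYDROPHOBIC = "AVILMFWYC"   # stands in for Φ
HYDROPHILIC = "STNQDEKRH"   # stands in for ζ

# Motif notation from the text translated into regular expressions.
MOTIFS = {
    # Rx(3)Wx(3-8)ΦxGxζx(P/A)
    "A": rf"R.{{3}}W.{{3,8}}[{HYDROPHOBIC}].G.[{HYDROPHILIC}].[PA]",
    # (Y/W)ΦGSxT
    "B": rf"[YW][{HYDROPHOBIC}]GS.T",
    # ζxxΦx(F/Y)QxxΦ
    "E": rf"[{HYDROPHILIC}]..[{HYDROPHOBIC}].[FY]Q..[{HYDROPHOBIC}]",
}


def find_motifs(sequence: str) -> dict:
    """Return {motif name: [(1-based start, matched text), ...]}."""
    hits = {}
    for name, pattern in MOTIFS.items():
        hits[name] = [
            (m.start() + 1, m.group())
            for m in re.finditer(pattern, sequence)
        ]
    return hits
```

As a usage example, the motif B residues named in the text (Y1152, L1153, G1154, S1155, K1156, T1157) spell `YLGSKT`, which matches the motif B pattern, whereas the T1157A-like string `YLGSKA` does not. The strings `RAAAWAAAALAGASAP` and `SAALAFQAAL` are invented toy peptides that satisfy the motif A and motif E patterns, respectively; they are not viral sequences.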
To analyze whether these conserved and some semiconserved amino acid residues of the VSV L protein participate in mRNA capping reactions, we mutated them to alanine and/or closely related amino acids (see Figure 2). We divided the mutants into four groups, and expressed and purified them together with the WT L protein. Their purity was verified by sodium dodecylsulphate-polyacrylamide gel electrophoresis followed by staining with Coomassie Brilliant Blue (Figure 2). It should be noted that the R1090A and R1090K mutants could not be obtained in sufficient quality and quantity because of their extremely low expression levels in insect cells (not shown), and therefore could not be further characterized. In addition, since the solubilities of G1154A and G1154S (not shown) were significantly lower than those of the WT and other mutant L proteins, the G1154 mutants were solubilized in the presence of 1 M NaCl instead of the 0.3 M NaCl used for the WT and other mutants. These observations suggest that mutations in R1090 and G1154 might affect the structural integrity of the domain or the whole protein, leading to their lower expression and solubility, respectively.

Conserved amino acid sequence motifs surrounding the HR motif are required for the PRNTase activity
First, we subjected the highly purified L mutants to the in vitro cap formation assay with pppAACAG and [α-32P]GDP to measure their PRNTase activities independently of their GTPase activities (11,16,49) (Figure 3A and Table 1). We used the HR-RH mutant (lane 37), which possesses an RH sequence instead of the HR motif (motif D), as a representative of the previously characterized cap-defective L mutants (16). The cap formation activities of W1094A (lane 3) and W1094F (lane 4) were equivalent to or higher than that of the WT L protein (lane 2), whereas the G1100A (lane 5), P1104A (lane 6) and P1104V (lane 7) mutations markedly reduced the cap formation activity to 11-15% of the WT activity. The Y1152A (lane 10) and Y1152W (lane 12) mutants were completely inert, whereas Y1152F (lane 11) showed about 4% of the WT activity. G1154A (lane 13) and G1154S (not shown) did not show any cap formation activity. While the S1155A (lane 14), S1155T (lane 15) and S1155G (lane 16) mutants showed 20-53% of the WT activity, S1155V (lane 17) exhibited an extremely low activity (∼2%), indicating that this residue could be partially replaced with a small amino acid with lower hydrophobicity (G > T > A). The T1157A (lane 18) and T1157S (lane 19) mutants were completely inactive. The L1153A (lane 22) and L1153F (lane 25) mutants showed low cap formation activities (10 and 3%, respectively, of the WT activity), whereas the L1153V and L1153I mutants exhibited modest activities (39 and 45%, respectively), indicating that larger aliphatic amino acids (I > V) could be partially substituted for L1153. On the other hand, alanine substitutions of nonconserved K1156 (lane 26) and S1158 (lane 27) did not have significant effects on the cap formation activity. The W1188A mutation also abolished the cap formation activity (lane 30), and this residue could not be replaced with other aromatic amino acids, such as F (lane 31), Y (not shown) and H (not shown). Interestingly, although F1269A (lane 32) was inactive in cap formation, F1269Y (lane 33) and F1269W (lane 34) exhibited 31 and 18% of the WT activity, respectively. Furthermore, similar to the HR-RH mutant (lane 37), Q1270A (lane 35), Q1270N (lane 36) and Q1270E (not shown) displayed no activity.
To investigate which step(s) of the PRNTase reaction is abrogated by the mutations, we analyzed the effects of these mutations on the covalent L-pRNA intermediate formation, a critical step of cap formation (Figure 3B and Table 1). We found that the relative L-pRNA intermediate formation activities of the cap-defective mutants [G1100A (lane 5), P1104A/V (lanes 6 and 7), Y1152A/F/W (lanes 10-12), G1154A (lane 13), S1155V (lane 17), T1157A/S (lanes 18 and 19), L1153A/F (lanes 22 and 25), W1188A/F (lanes 30 and 31), F1269A (lane 32) and Q1270A/N (lanes 35 and 36)] were consistent with their cap formation activities (Figure 3A and Table 1), suggesting that these mutations diminished or abolished the PRNTase activity at the step of the L-pRNA intermediate formation. However, we did not find any mutations that affect the pRNA transfer reaction without affecting the intermediate formation reaction.
We also examined the effects of these mutations on the GTPase activity of the L protein (Figure 3C and Table 1), which releases the γ-phosphate of GTP as inorganic phosphate (Pi). None of the mutations of the G1100 (lane 5), P1104 (lanes 6 and 7), S1155 (lanes 14-17), T1157 (lanes 18 and 19), L1153 (lanes 22-25), F1269 (lanes 32-34) and Q1270 (lanes 35 and 36) residues abolished the GTPase activity, suggesting that these residues are specifically required for the PRNTase activity. In contrast, the Y1152A (lane 10), Y1152W (lane 12), G1154A (lane 13) and W1188F (lane 31) mutations reduced the GTPase activity, indicating that these mutations affect both the GTPase and PRNTase activities.

Table 1. The PRNTase, GTPase and RdRp activities of the mutant L proteins were measured by the in vitro oligo-RNA capping, L-pRNA intermediate formation, GTPase and transcription assays as shown in Figures 3 and 4. Relative enzymatic activities of the mutant L proteins were expressed as percentages of the WT activities. Data represent the means and standard deviations from three independent experiments.
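The normalization described in the Table 1 legend (activities expressed as percentages of the WT activity, reported as means and standard deviations from three independent experiments) can be sketched as follows. This is only an illustration under stated assumptions: the text does not say whether each replicate was normalized to the WT value from the same experiment or whether the sample or population standard deviation was used, so the sketch assumes per-experiment normalization and the sample standard deviation. The function name and all numbers in the usage note are hypothetical.

```python
from statistics import mean, stdev


def relative_activity(mutant_signals, wt_signals):
    """Return (mean, SD) of mutant activity as a percentage of WT.

    ASSUMPTIONS (not stated in the text): each mutant measurement is
    paired with the WT measurement from the same experiment, and the
    sample standard deviation (n - 1 denominator) is reported.
    """
    if len(mutant_signals) != len(wt_signals):
        raise ValueError("need one WT value per mutant value")
    if len(mutant_signals) < 2:
        raise ValueError("need at least two independent experiments")
    # Normalize each replicate to its own WT control.
    percents = [100.0 * m / w for m, w in zip(mutant_signals, wt_signals)]
    return mean(percents), stdev(percents)
```

For example, with invented signal intensities `[10.0, 20.0, 30.0]` for a mutant and `[100.0, 100.0, 100.0]` for the WT protein, the function returns a mean of 20% and a standard deviation of 10%.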

Cap-defective mutants produce uncapped abortive transcripts by aberrant stop-start transcription
Our previous study (18) showed that the cap-defective mutations in the HR motif (motif D) negatively impact mRNA synthesis, but not leader RNA synthesis. To analyze the effects of the mutations in other conserved motifs on the transcription activity of the L protein, we reconstituted the transcription reaction with the WT or mutant L protein, the recombinant P protein and the N-RNA template. After the reactions, poly(A) tails on mRNAs were digested with RNase H in the presence of oligo(dT). Short (e.g. leader RNA) and long (e.g. deadenylated mRNAs) transcripts were analyzed by 20% (Figure 4A and Table 1) and 5% (Figure 4B […]) exhibited weaker RNA synthesis activities than those of other cap-defective mutants, their mRNA synthesis activities (1-3% of the WT activity) were significantly lower than their leader RNA synthesis activities (13-38% of the WT activity). In contrast, W1094A (lane 3), P1104A (lane 6), Y1152A (lane 10), Y1152W (lane 12), G1154A (lane 13), S1155V (lane 17), L1153A/V/I/F (lanes 22-25) and W1188A (lane 30) showed low or no activities to synthesize both the leader RNA and mRNAs. Other mutations had no or moderate negative effects on synthesis of both the leader RNA and mRNAs. We also found that W1094 (lane 4), Y1152 (lane 11) and F1269 (lanes 33 and 34) could be functionally replaced with another aromatic amino acid(s) in transcription.

The PRNTase motifs are required for VSV gene expression in host cells
To analyze the effects of mutations in the PRNTase motifs on VSV gene expression in host cells, we performed a mini-genome assay with a plasmid expressing a negative strand genome carrying a reporter gene instead of the five VSV genes (23). In the presence of the N, P and L proteins expressed from supporting plasmids, the mini-genome is replicated and transcribed into reporter mRNA. The expression levels of the reporter gene product in cells expressing selected mutant L proteins were compared with that in cells expressing the WT L protein (Figure 7A). We also confirmed that the mutant L proteins were expressed at levels similar to that of the WT protein in the transfected cells (Figure 7B). Consistent with the in vitro mRNA synthesis activities of W1094A and W1094F (Figure 4B), cells expressing these mutants showed 7% (Figure 7A, column 3) and 96% (column 4), respectively, of the reporter gene expression level in cells expressing the WT L protein. Gene expression from the mini-genome was not observed in cells expressing the cap-defective mutants, G1100A (column 5), T1157A (column 12), W1188F (column 20), F1269A (column 21), Q1270A (column 23) and HR-RH (column 24), as well as the transcription-defective mutants, Y1152A (not shown) and G1154A (column 10). Unexpectedly, P1104V (column 6), Y1152F (column 9), S1155A (column 11) and L1153I (column 15) did not support gene expression from the mini-genome, although they retained weak transcription activities in vitro, suggesting essential roles of P1104, Y1152, S1155 and L1153 in VSV gene expression in host cells. Consistent with the in vitro results (Figures 3 and 4), F1269 could be functionally replaced with other aromatic amino acids, Y (column 22) and W (not shown).
As we reported for the motif D mutations (18), recombinant VSVs with cap-defective mutations (e.g. T1157A, W1188F, F1269A, Q1270A) could not be generated from cDNAs using the reverse genetics system (data not shown). In contrast, recombinant VSVs with F1269Y and F1269W were rescued from cDNAs. Consistent with the in vitro transcription activities of F1269Y and F1269W (Figure 4), VSV with F1269Y formed plaques similar to WT VSV, while VSV with F1269W formed smaller plaques (Supplementary Figure S3). Taken together, we conclude that cap-defective mutations in the L protein cause defects in VSV gene expression at the step of production of translatable mRNAs, thereby being lethal to the virus.

DISCUSSION
In addition to the catalytic amino acid residues (H1227 and R1228) in motif D (16), we identified other conserved amino acid residues in motifs A (G1100), B (T1157), C (W1188) and E (F1269 and Q1270) of the putative PRNTase domain of the VSV L protein that are essential or important for the L-pRNA intermediate formation and synthesis of full-length mRNAs, but not for GTP hydrolysis or synthesis of the uncapped leader RNA. In contrast, conserved residues in the N-terminal region of the VSV L protein [e.g. H360, H639, D714 (RdRp active site)] were found to be required for synthesis of both the leader RNA and mRNAs, but not for any steps of mRNA capping (16,18). These PRNTase motifs are strikingly conserved in L proteins of divergent NNS RNA viruses belonging to different families, Rhabdoviridae (e.g. rabies), Paramyxoviridae (e.g. measles, Nipah, respiratory syncytial), Filoviridae (e.g. Ebola), Bornaviridae (e.g. Borna disease) and Nyamiviridae (e.g. Nyamanini), in the order Mononegavirales and in rhabdovirus-like bipartite negative strand RNA viruses (see Supplementary Table S1, Figures S2 and S4, and Supplementary Discussion), suggesting that they play common roles in substrate binding, catalysis and/or structural maintenance of the PRNTase domain.
A very recent cryo-electron microscopic (EM) analysis of the VSV L protein complexed with a fragment of the P protein produced a high-resolution density map of the complex, leading to an atomic model of almost the entire L protein structure, composed of the N-terminal RdRp domain with blocks I to III, the capping domain with blocks IV and V, the connector domain, the MTase domain with block VI and the C-terminal domain (51). As shown in Figure 8A and Supplementary Figure S5, the amino acid residues required for the L-pRNA intermediate formation are localized in close proximity to H1227 in the C-terminal part (block V, putative PRNTase domain) of the capping domain, juxtaposing an RNA exit channel of the RdRp domain. These observations suggest that all these residues constitute the active site of the PRNTase domain, which awaits the 5′-ends of pre-mRNAs that emerge from the exit channel.
In the cryo-EM structure of the non-transcribing VSV L protein (51), G1100 in motif A is located on a loop between two α-helices and is within ∼12 Å of H1227 in motif D (Figure 8A and Supplementary Figure S5). This highly conserved G residue was suggested to be required for efficient intermediate formation, possibly by playing a structural role in maintaining the loop or by interacting with pppRNA via its backbone amide group.
Motif B is located in the middle of a large loop structure (residues 1136-1173) between a β-sheet and an α-helix (51). Interestingly, a C-terminal portion (residues 1157-1173) of this loop is deeply inserted into the active site cavity of the RdRp domain, suggesting that it serves as a priming loop for de novo transcription initiation (51). Thus, it is not surprising that amino acid residues in this loop are involved in capping and/or transcription.
Although L proteins of paramyxoviruses belonging to the Pneumovirinae subfamily [e.g. human respiratory syncytial virus (HRSV), human metapneumovirus] contain W instead of Y in motif B, W could not be substituted for Y1152 of the VSV L protein in any of the enzymatic reactions. In contrast, we found that Y1152F exhibits weak capping and transcription activities, suggesting that an aromatic phenyl group (Y > F), but not an indolyl group, is necessary for function at this position in the VSV L protein. In the cryo-EM structure (51), the hydroxyl group of Y1152 is hydrogen-bonded to the side-chain carbonyl group of Q1270 in motif E, suggesting that the hydroxyl group plays an important role in bringing motif B into close proximity to motif D (Figure 8A and Supplementary Figure S5).
Since T1157 in motif B could not be functionally replaced with A or S in the PRNTase reaction (Figure 3), both the β-hydroxyl and methyl groups of T1157 seem to be essential for the intermediate formation. Our previous study (16) proposed that a lone pair of electrons on the ε2-nitrogen atom of the H1227 residue in motif D nucleophilically attacks the 5′-α-phosphorus of pppRNA with the help of R1228. In the cryo-EM structure of the VSV L protein (51), T1157 was found to be located in the vicinity (∼10 Å) of the catalytic H1227 (motif D) in a large loop between two α-helices. Therefore, one possibility is that, similar to the S or T residue in the glycine-rich phosphate binding (P-) loop [GxxxxGK(S/T)] and P-loop-like motifs in nucleotide binding proteins (52,53), the β-hydroxyl group of T1157 in motif B may contact a phosphate oxygen(s)/hydrogen(s) in pppRNA directly or via a metal ion during the intermediate formation with the adjacent H1227 residue.
In the cryo-EM structure of the VSV L protein (51), W1188 (motif C) is present on a loop between two α-helices, and is in close proximity (∼8 Å) to H1227 in motif D (Figure 8A and Supplementary Figure S5). The cap-defective W1188F mutant, but not W1188A, retained a low transcription activity and exhibited aberrant stop-start transcription, suggesting that F, an aromatic amino acid, could be substituted for W in partially catalyzing RNA synthesis, but not capping. Interestingly, the W1188F mutant showed a GTPase activity that is significantly lower than those of the WT and W1188A L proteins, indicating that a phenyl group at this position impairs the GTPase reaction, which may occur in the vicinity of this residue.
F1269 and Q1270 in motif E are located at the N-terminal end of an α-helix in close proximity (∼7-8 Å) to H1227 in the cryo-EM structure of the VSV L protein (51) (Figure 8A and Supplementary Figure S5). Consistent with conservation of an aromatic amino acid (F, Y or H) in motif E (Supplementary Table S1 and Figure S2), F1269 could be functionally replaced with another aromatic amino acid (Y or W). Since aromatic side chains in many nucleic acid binding proteins are known to be involved in nucleic acid binding via base stacking interactions (54), the aromatic side chain in motif E could play an essential role in RNA binding during the L-pRNA intermediate formation. The Q residue in motif E was also suggested to be required for RNA binding, because its amide group has the potential to recognize RNA (e.g. nucleotide bases) via hydrogen bonding (55,56). Furthermore, the hydrogen bonding of Q1270 with Y1152 in motif B suggests its structural role in forming the PRNTase active site, as described above.
We have not yet found any specific amino acid residues that are required for pRNA transfer to GDP (e.g. GDP binding residues) but not for the intermediate formation. We predict that some of the amino acid residues required for binding to the β-γ phosphates of pppRNA and the leaving PPi may be involved in interactions with the α-β phosphates of GDP after the formation of the L-pRNA intermediate. We propose that R1228, next to H1227, may bind to the β-γ phosphate oxyanions of pppRNA via ionic interactions and/or serve as a proton donor (general acid) to facilitate the PPi release. R1228 could be functionally replaced with H, although to a lesser extent, but not with K (16), suggesting that the secondary amine at the ε position in the positively charged guanidino group of R1228 is one of the chemical groups essential for the intermediate formation and possibly pRNA transfer. Although Liang et al. (51) described G1154 and T1157 as being involved in guanosine nucleotide binding, there is no experimental evidence to support this hypothesis.
Some conserved residues in motifs A-E may interact with common elements in the 5′-ends of NNS RNA viral pre-mRNAs, such as the 5′-triphosphate, the first purine base (A or G), riboses and internal phosphates. Since both motifs B and D are present on long flexible loops in the non-liganded domain, the active site may undergo some structural re-organization upon pppRNA binding followed by intermediate formation. On the other hand, since mRNAs of NNS RNA viruses belonging to different genera/families contain unique sets of mRNA-start sequences, amino acid residues of L proteins conserved in the respective genera/families will likely be involved in sequence-specific recognition of mRNA 5′-end sequences (see Supplementary Discussion).
The newly identified mutations, conferring the defect in the L-pRNA intermediate formation, induced termination of N mRNA synthesis mainly at position +40 (Figures 4 and 5), as previously reported for the mutations in motif D (18). We suggest that the L-pRNA intermediate formation with the PRNTase domain of the L protein is a key event that controls the fate of the RdRp domain at an early stage of mRNA chain elongation (Figure 8B). Since the minimum lengths of capped VSV transcripts naturally or artificially terminated during in vitro transcription were reported to be 23-37 nt (44-46), the L-pRNA intermediate formation may occur immediately before the cap formation on nascent pre-mRNAs. If the L protein is not able to form the covalent L-pRNA intermediate, the RdRp domain in the L protein may terminate transcription at position +40 to release uncapped transcripts before the transition into further elongation of the mRNA chain. Interestingly, a small-molecule capping inhibitor, possibly interacting with regions adjacent to motifs B and E in the HRSV L protein, was also suggested to induce premature termination of mRNA synthesis to produce uncapped RNAs of <50 nt (57) (see Supplementary Discussion).
Li et al. (34) previously reported that RNA capping with GTP is diminished with the G1154A or T1157A mutation and abolished with the H1227A or R1228A mutation in the VSV L protein. Furthermore, they showed that all these mutant L proteins produce heterogeneous 3′-truncated transcripts (100-500 nt) as well as non-polyadenylated full-length mRNAs in their reconstituted transcription reactions containing rabbit reticulocyte lysates (34,48). However, some of our results are distinctly different from those observations. First, in our hands, the G1154A and T1157A mutants were completely inert in RNA capping with GTP (not shown) as well as GDP (Figure 3), because these mutants were not able to form the L-pRNA intermediate. Second, the G1154A mutant did not synthesize detectable amounts of any transcripts in our reconstituted transcription reaction, although our incubation time (2 h) and amount of the protein (0.15 μg) are shorter and smaller, respectively, than theirs (5 h and 1 or 3 μg). Third, in our transcription reactions, the cap-defective mutants including T1157A, H1227A and R1228A produced uncapped short transcripts with particular lengths (e.g. N1-40, N41-68) and small amounts of 3′-polyadenylated full-length N mRNA (N1) and N2 RNA, but not heterogeneous 3′-truncated transcripts or non-polyadenylated full-length mRNAs (Figures 4-6) (18). Furthermore, Li et al. (34) showed that the G1100A mutant produces capped mRNAs and that P1104A is totally inactive in their transcription reactions. However, in our transcription system, the G1100A mutant displayed a typical phenotype of cap-defective mutants, producing the uncapped short transcripts, and the P1104A mutant was active in transcription, although to a lesser extent than the WT L protein. The reasons for these discrepancies are currently not known.
This study shows for the first time that the conserved motifs A, B, C and E, as well as motif D, in the PRNTase domain of the VSV L protein are essential for the covalent L-pRNA intermediate formation in mRNA capping. Our results also suggest that the successful intermediate formation followed by capping with the PRNTase domain of the L protein licenses the RdRp domain on the same molecule to enter an mRNA chain elongation mode, which enables it to ignore cryptic termination and initiation signals within genes. To understand the molecular basis for co-transcriptional pre-mRNA capping with the highly sophisticated mRNA synthesis machine, further biochemical and structural analyses are necessary. Since all the conserved motifs identified in this study are functionally essential for the PRNTase domain of the VSV L protein, we suggest that this conserved viral enzyme is a potential target for developing broad-spectrum anti-viral agents against significant NNS RNA viruses.

Figure 2. Sodium dodecylsulphate-polyacrylamide gel electrophoresis (SDS-PAGE) analysis of recombinant wild-type (WT) and mutant VSV L proteins. The WT and mutant VSV L proteins (1 μg) were analyzed by 7.5% SDS-PAGE followed by Coomassie Brilliant Blue staining. M lanes show marker proteins with the indicated molecular masses. The names of the point mutants contain the original amino acid (one-letter code) at the indicated position in the VSV L protein followed by the replacement amino acid. The HR-RH mutant carrying the H1227R and R1228H mutations (lane 33) is a representative of the previously identified cap-defective mutants (16).

Figure 3. Conserved amino acid residues in the PRNTase domain of the VSV L protein are required for the L-pRNA intermediate formation in RNA capping. (A) The WT and mutant VSV L proteins (60 ng) were subjected to in vitro capping reactions with pppAACAG and [α-32P]GDP. Digests of RNA products with nuclease P1 were analyzed by PEI-cellulose TLC followed by autoradiography. The positions of the origin (ori.) and standard GpppA cap analogue are shown. (B) The WT and mutant VSV L proteins (0.3 μg) were incubated with pppAACAG (labeled with [α-32P]AMP). The resulting L-pRNA intermediate was analyzed by 7.5% SDS-PAGE followed by autoradiography. (C) The WT and mutant VSV L proteins (0.3 μg) were incubated with [γ-32P]GTP. Released 32Pi was analyzed by PEI-cellulose TLC followed by autoradiography. Lanes 1, 8, 20 and 28 indicate no L protein.

Figure 4. Cap-defective mutant L proteins are not able to synthesize full-length mRNAs efficiently. The WT and mutant VSV L proteins (0.15 μg) were subjected to in vitro transcription reactions with [α-32P]GTP, the other three NTPs, the recombinant P protein (0.05 μg) and the native N-RNA complex (0.4 μg). After treatment with RNase H and oligo(dT), transcripts were analyzed by 20% (A) or 5% (B) urea-PAGE followed by autoradiography. Lanes 1, 8, 20 and 28 indicate no L protein. M lanes show marker RNAs with the indicated lengths. The positions of previously identified transcripts (18) are indicated on the right. Le indicates the leader RNA.

Figure 5. Cap-defective mutant L proteins produce uncapped abortive short transcripts. Transcripts, synthesized by the WT or cap-defective mutant L proteins, were post-labeled by vaccinia virus capping enzyme in the presence of [α-32P]GTP and analyzed by 20% urea-PAGE followed by autoradiography. Lanes 1 and 4 indicate no L protein. Lane 11 indicates a longer exposure of lane 7. The positions of previously identified capped RNAs (18) are shown.

Figure 6. The cap-defective mutant L proteins synthesize small amounts of polyadenylated full-length (N1) and 5′-truncated (N2) N mRNA. 32P-labeled transcripts, synthesized by the WT or mutant L protein, were treated with or without RNase H and oligo(dT) and analyzed by 5% urea-PAGE followed by autoradiography. The sample volume of transcripts synthesized with the WT L protein (lanes 1, 2, 5 and 6) was 20-fold smaller than that of transcripts synthesized with the mutants.

Figure 7. The cap-defective mutations in the VSV L protein abolish gene expression from a mini-genome in host cells. The VSV mini-genome reporter assay (23) was performed with plasmids expressing the N, P and L (WT or mutant) proteins. (A) The relative expression levels of a reporter gene product in cells expressing the WT (defined as 100%) or mutant L protein are shown. Columns and error bars represent the means and standard deviations, respectively, from three independent experiments. Columns 1, 7, 13 and 18 indicate no plasmid expressing the L protein. (B) The WT and mutant L proteins expressed in the cells were detected by Western blotting. The results shown are representatives of the three independent experiments.
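The quantities plotted in Figure 7A (expression relative to WT, shown as mean and standard deviation over three experiments) can be sketched in Python as follows. This is an illustration only, under stated assumptions: the function name `relative_expression` is hypothetical, each mutant reading is assumed to be normalized to the WT reading from the same experiment, the sample standard deviation is assumed, and the numbers in the usage example are made-up placeholders, not data from this study.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple


def relative_expression(
    wt_signals: Sequence[float],
    mutant_signals: Sequence[float],
) -> Tuple[float, float]:
    """Return (mean, SD) of mutant reporter signal as a percentage of WT.

    Assumption: readings are paired by experiment, so mutant_signals[i]
    is normalized to wt_signals[i] (WT defined as 100%).
    """
    if len(wt_signals) != len(mutant_signals):
        raise ValueError("WT and mutant readings must be paired by experiment")
    if len(wt_signals) < 2:
        raise ValueError("at least two experiments are needed for an SD")
    percentages = [100.0 * m / w for w, m in zip(wt_signals, mutant_signals)]
    # Sample standard deviation (n - 1 denominator); an assumption here.
    return mean(percentages), stdev(percentages)


if __name__ == "__main__":
    # Hypothetical reporter readings from three independent experiments.
    wt = [100.0, 100.0, 100.0]
    mutant = [10.0, 20.0, 30.0]
    avg, sd = relative_expression(wt, mutant)
    print(f"mutant expression: {avg:.1f}% of WT (SD {sd:.1f})")
```

Normalizing within each experiment before averaging removes experiment-to-experiment differences in overall signal; normalizing to the mean WT signal instead would be an equally plausible reading of the legend.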

Figure 8. Localization of the amino acid residues required for the L-pRNA intermediate formation in the putative PRNTase domain and model of co-transcriptional mRNA capping. (A) A ribbon diagram of a three-dimensional structure of the PRNTase active site was generated using the atomic coordinates for the cryo-EM structure of the VSV L protein (PDB: 5A22) and the PyMOL software (http://www.pymol.org). The amino acid residues in PRNTase motifs A–E that are required for the L-pRNA intermediate formation are shown as stick models. The indicated Nε2 position of the catalytic histidine residue (H1227) in motif D is the covalent pRNA attachment site (16). The hydroxyl group of Y1152 in motif B is hydrogen-bonded to the side-chain carbonyl group of Q1270 in motif E. (B) A schematic structure of the VSV L protein is depicted only with the RdRp and PRNTase domains. After de novo initiation of mRNA synthesis from an internal promoter in the genome (i), the 5′-end of triphosphorylated pre-mRNA (pppA-RNA) is extruded from the transcribing RdRp domain at an early stage of mRNA chain elongation (ii) to gain access to the active site of the PRNTase domain. Immediately after formation of the covalent L-pRNA intermediate (iii) with the catalytic amino acid residues (histidine and arginine) in motif D (16), pRNA is transferred to GDP to form capped pre-mRNA (GpppA-RNA), which is further elongated to full-length mRNA with the RdRp domain (iv). If the PRNTase domain fails to form the L-pRNA intermediate, the RdRp domain releases 40-nt pppRNA and reinitiates transcription using a cryptic initiation signal (v).

Table 1. Relative enzymatic activities of mutant L proteins