Structures of the Mononegavirales Polymerases

Mononegavirales, known as nonsegmented negative-sense (NNS) RNA viruses, are an order of pathogenic and sometimes deadly viruses that includes rabies virus (RABV), human respiratory syncytial virus (HRSV), and Ebola virus (EBOV). Unfortunately, no effective vaccines or antiviral therapeutics against many Mononegavirales are currently available. Viral polymerases have been attractive and major antiviral therapeutic targets. Therefore, Mononegavirales polymerases have been extensively investigated for their structures and functions.

Mononegavirales are enveloped viruses with various morphologies for different families; for example, Rhabdoviridae are bullet-shaped, Paramyxoviridae are pleomorphic or spherical, and Filoviridae are filamentous (30)(31)(32). The genome organization and replication of Mononegavirales have been extensively studied for decades (1)(2)(3). The NNS RNA viral genomes are linear and single-stranded, and their lengths range from 8.9 to 19.0 kilobases (1)(2)(3). Mononegavirales encode 5 to 10 genes, with 4 core genes shared by all members. Those core genes (Fig. 1, blue boxes) encode four shared proteins, nucleoprotein (N or NP), phosphoprotein (P or VP35), matrix protein (M), and large protein (L). Three out of four shared proteins, namely, N, P, and L, constitute the RNA synthesis machine, suggesting the central role of RNA synthesis in the Mononegavirales life cycle (33) (Fig. 1).
The RNA polymerase is the sole enzyme of Mononegavirales, and there is a critical need to delineate the molecular and structural basis of the RNA polymerase of Mononegavirales (58). Since the first structure of the L protein alone of vesicular stomatitis virus (VSV) was determined in 2015 (59), multiple structures of RNA polymerases of Mononegavirales, including HRSV, HMPV, RABV, HPIV, and VSV, have been reported in recent months, revealing the architectures of L:P complexes and interactions between L and P (59)(60)(61)(62)(63)(64). This review illustrates similarities and differences among the polymerases by comparing the structures of those polymerases and revealing the potential RNA synthesis mechanisms of the highly conserved Mononegavirales polymerases.

THE MULTIMODULAR ADAPTER OLIGOMERIC P
The multimodular adapter P protein of Mononegavirales is an oligomeric and nonglobular molecule in solution (83). Although L contains all catalytic functions, P is the essential cofactor required for L to synthesize RNA effectively (38). P is not only the cofactor of L but also an adapter that coordinates and modulates multiple proteins, including RNA-free N protein, the NC complex, and additional regulatory proteins (84,85). Notably, P forms dimers in Rhabdoviridae (83,86,87), trimers or tetramers in Filoviridae (88,89), and tetramers in Paramyxoviridae and Pneumoviridae (90)(91)(92)(93)(94). Each P protomer consists of an intrinsically disordered N-terminal domain (P NTD), an oligomerization domain (P OD), and a C-terminal domain (P CTD), connected by flexible linkers (83). Despite a high diversity in length, sequence, and even the structural folds of individual domains, this modular architecture is conserved among different Mononegavirales (Fig. 2B). The intrinsically disordered P NTD exhibits substantial conformational heterogeneity and is essential for the dynamic coordination functions of P. The key features of P can be summarized as a modular architecture in which intrinsically disordered domains and structured domains interact with the different proteins that constitute the RNA synthesis machine (95)(96)(97)(98)(99). Interestingly, differences in length seem to correlate with additional functions of the adapter P protein. For example, the linker between P OD and P CTD of RABV is longer than that of VSV and contains a dynein light chain 8 (LC8) binding site (100); P CTD of EBOV contains an additional region for RNA binding and innate immune escape (101). Furthermore, P is often phosphorylated by host kinases, and phosphorylation is essential for its regulation of RNA synthesis (102)(103)(104)(105)(106)(107).
Together, this information suggests that P plays the following critical roles within the RNA synthesis machine: (i) P is an essential cofactor that regulates the processivity of L. As an adapter, P interacts with the NC and bridges it to L so that the RNA can thread into the L active sites during transcription and replication (108)(109)(110)(111)(112)(113). (ii) P acts as a chaperone to maintain a supply of RNA-free N (N 0) and delivers N 0 to the nascent RNA genome or antigenome during replication (98, 99, 114-118). (iii) P interacts with other essential cofactors, such as M2-1 in Pneumoviridae and VP30 in Filoviridae, to coordinate the RNA synthesis activities of the RdRP (48, 119-122).

OVERVIEW OF THE STRUCTURAL ANALYSES OF THE MONONEGAVIRALES POLYMERASES
The monomeric L and oligomeric P together constitute the RdRP in Mononegavirales. Due to the large size of L and the oligomeric states of P with intrinsically flexible domains, it is challenging to obtain the crystals of the Mononegavirales RdRPs (123). The recent advance of cryo-electron microscopy (cryo-EM) offers an alternative way for a high-resolution structural characterization of such macromolecular complexes (124).
In 2015, the cryo-EM structure of the VSV L was determined at 3.8-Å resolution (PDB: 5A22) (125), and it was the first structure of a Mononegavirales polymerase. Although the VSV L was prepared in complex with the VSV P NTD, the structure allowed only the de novo model building of the entire L protein but not the model assignment of P NTD, despite the extra electron density observed (125). Since 2015, there have been many attempts at the structural characterization of the Mononegavirales polymerases. For example, crystal structures of NTD and CTD fragments of L have also been reported (126,127). In recent months, there were multiple successful cases of the structural characterization of the Rhabdoviridae, Pneumoviridae, and Paramyxoviridae polymerases by cryo-EM: one for RABV (PDB: 6UEB), one for VSV (P NTD visible; PDB: 6U1X), two for HRSV (PDBs: 6PZK and 6UEN), one for HMPV (PDB: 6U5O), and one for HPIV (PDB: 6V85) (59)(60)(61)(62)(63). For consistency, the domain organizations and cartoon representations of the individual structures are colored as follows: RdRp (blue), Cap (green), CD (yellow), MT (pink), and CTD (cyan) for L; and P NTD (magenta), P OD (red), and P CTD (orange) for P (the same as Fig. 2).

STRUCTURES OF THE RHABDOVIRIDAE POLYMERASES
A higher-resolution (3.0-Å) cryo-EM structure of the VSV polymerase (PDB: 6U1X) was reported that enables the visualization of not only the 2,109-residue VSV L but also the bound P NTD of the 265-residue VSV P (59, 125) (Fig. 3A). The root mean square deviation (RMSD) between the 3.8-Å and 3.0-Å structures of VSV L is 1.33 Å (59,125). All five domains of the VSV L except a few flexible linkers are visible in the structure, including three functional domains, namely, RdRp (35 to 865), Cap (866 to 1334), and MT (1598 to 1892), and two structural domains, namely, the connector domain (CD; 1335 to 1597) and the C-terminal domain (CTD; 1893 to 2109) (59) (Fig. 3A). The RdRp domain resembles the classical RNA polymerase fold. The Cap domain folds next to the RdRp domain, and no homolog of the Cap domain is known outside the order Mononegavirales, owing to the unique capping mechanism. The CD connects the Cap and MT domains, and the CTD folds back to lie close to the RdRp domain. The three ordered segments 49 to 56, 82 to 89, and 94 to 105 of P NTD (1 to 106) are shown to interact with the CTD, RdRp, and CD domains of L, respectively (59, 125) (Fig. 3A). The VSV and RABV L and P proteins share 35.05% and 19.22% amino acid identity, respectively. As expected, VSV and RABV L share high similarity, with a nearly complete conservation of secondary structure elements throughout the protein. Despite having a greater sequence difference, VSV P and RABV P are also structurally similar to each other. Interestingly, there is a flexible loop (1158 to 1172 in VSV and 1171 to 1186 in RABV) in the Cap domain of Rhabdoviridae L that lies against the active site of the RdRp domain. This loop is identified as the priming loop responsible for the de novo initiation of RNA synthesis (59,60,125). Due to the compact packing of the RdRp and Cap domains, the position of the priming loop appears to block the putative RNA product exit channel.
Therefore, it is believed that Rhabdoviridae L adopts an initiation state in the structures, and significant rearrangements of those domains are likely to occur during elongation and other states of RNA synthesis.
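As an illustration only, the VSV L domain boundaries quoted above can be held in a small lookup table to check which domain a given residue or segment falls in. The following Python sketch uses only the residue ranges stated in the text; the function names domain_of and domains_spanned are hypothetical helpers introduced here, not part of any published tool.

```python
# Domain boundaries of VSV L as quoted in the text (inclusive residue numbers).
VSV_L_DOMAINS = {
    "RdRp": (35, 865),
    "Cap": (866, 1334),
    "CD": (1335, 1597),
    "MT": (1598, 1892),
    "CTD": (1893, 2109),
}


def domain_of(residue, domains=VSV_L_DOMAINS):
    """Return the name of the domain containing `residue`, or None if unassigned."""
    for name, (start, end) in domains.items():
        if start <= residue <= end:
            return name
    return None


def domains_spanned(start, end, domains=VSV_L_DOMAINS):
    """Return the domains overlapped by the inclusive segment start..end."""
    return [name for name, (lo, hi) in domains.items() if start <= hi and end >= lo]


# The VSV priming loop (1158 to 1172) lies entirely within the Cap domain.
print(domains_spanned(1158, 1172))  # ['Cap']
print(domain_of(1167))              # 'Cap'
```

Such a table makes the point of the paragraph explicit: the priming loop is a Cap-domain element, even though it projects toward the RdRp active site.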

STRUCTURES OF THE PNEUMOVIRIDAE POLYMERASES
Multiple cryo-EM structures of the Pneumoviridae polymerases have also been reported in recent months, including 3.2-Å (PDB: 6PZK) and 3.67-Å (PDB: 6UEN) resolution structures of the HRSV polymerase and a 3.7-Å resolution structure of the HMPV polymerase (PDB: 6U5O) (61)(62)(63). The two structures of the HRSV polymerase are nearly identical, with an RMSD of 1.48 Å (61, 63) (Fig. 4A). The structures reveal that the RdRp (10 to 945) and Cap (946 to 1461) domains of the 2,165-residue L interact with the P OD (128 to 157) and P CTD (158 to 241) of a tetramer of the 241-residue P. Interestingly, although full-length L and P were used to reconstitute the HRSV polymerases, the EM densities of the MT domain and the structural CD and CTD domains of L, as well as of the P NTD, are missing in the 3-dimensional (3D) reconstructions (61, 63) (Fig. 4A, missing domains are shown in gray). The integrity of the proteins was confirmed by mass spectrometry. The missing EM densities suggest that those domains are intrinsically flexible (61) and that P OD and P CTD are not sufficient to lock those domains of L into a homogeneous conformation. Interestingly, the four protomers of the tetrameric HRSV P OD and P CTD adopt distinct conformations, and each of the protomers uses a different range of residues, namely, 128 to 182, 128 to 187, 128 to 202, and 128 to 241, to interact with distinct regions of HRSV L (Fig. 4A). A further comparison of the structures reveals slightly different intermolecular arrangements between L and the tetrameric P, suggesting the plasticity of the L:P interface for structural rearrangements during RNA synthesis (61).
The structure of the HMPV polymerase (PDB: 6U5O) shares a highly similar architecture to that of the HRSV polymerase, which contains the RdRp (8 to 902) and Cap (903 to 1380) domains of the 2,005-residue HMPV L and P OD (168 to 193) and P CTD (194 to 266) of a tetramer of the 294-residue HMPV P (62). The RMSD between the HRSV and HMPV L is 1.49 Å. The HMPV polymerase also lacks the MT and other structural domains (CD and CTD) of L and P NTD in the 3D reconstructions (62) (Fig. 4B). Similarly, each of the four protomers of the tetrameric HMPV P OD and P CTD adopts a distinct conformation and uses different ranges of residues, namely, 168 to 219, 168 to 231, 168 to 236, and 168 to 266, to interact with HMPV L (Fig. 4B).
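For readers less familiar with the RMSD values quoted in this section, the following Python sketch shows the basic calculation for two sets of paired atomic coordinates. It assumes that the two structures have already been superimposed and that the atoms (for example, Cα atoms) have been paired one to one; the source does not state which atoms or alignment procedure were used for the reported values, so this is an illustration of the quantity, not a reproduction of the authors' analysis.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally sized lists of (x, y, z) points.

    Assumes the two coordinate sets are already superimposed and paired in order.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared / len(coords_a))


# Toy example (hypothetical coordinates): every atom is displaced by 1 Å along z.
a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(a, b))  # 1.0
```

Because the value depends on the atom selection and on the superposition, RMSDs from different studies are comparable only when those choices match.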
The sequence identities between the HRSV and HMPV L and P are high, namely, 49.12% and 37.18%, respectively. As expected, the HRSV and HMPV polymerases share highly similar architectures, including the priming loop. Surprisingly, the priming loop in the Cap domain of the Pneumoviridae L shows a substantial shift and lies ~37 Å away from the active sites of the RdRp domain, suggesting that L adopts an elongation state in the structures (61)(62)(63). Despite the similarities, there are several noticeable differences between the structures of the HRSV and HMPV polymerases, as follows: (i) HRSV L contains an insertion (134 to 176) compared with HMPV L; (ii) HRSV L has a missing connecting helix (660 to 691), but the equivalent connecting helix of HMPV L can be partially modeled; (iii) one protomer of the HRSV P tetramer shows a different arrangement compared with its counterpart protomer of the HMPV P. Those slight differences between the two genera Metapneumovirus and Orthopneumovirus are likely due to genus-specific features of the RNA synthesis machine, and more detailed comparisons can be found in reference 128.
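Percent identity values such as those quoted above are computed from a pairwise sequence alignment. The Python sketch below shows one common convention, in which identical positions are divided by the number of aligned (non-gap) columns. The source does not state which alignment program or denominator was used, so both the convention and the toy sequences here are assumptions for illustration.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over columns where neither aligned sequence has a gap.

    Both inputs must be rows of the same pairwise alignment (equal length).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    aligned = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip gap columns under this convention
        aligned += 1
        if res_a == res_b:
            identical += 1
    if aligned == 0:
        return 0.0
    return 100.0 * identical / aligned


# Toy alignment (hypothetical): 4 identical residues out of 5 aligned columns.
print(percent_identity("ACDE-G", "ACEEFG"))  # 80.0
```

Other conventions divide by the full alignment length or by the length of the shorter sequence, which gives lower values for the same alignment.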

STRUCTURES OF THE PARAMYXOVIRIDAE POLYMERASES
Cryo-EM structures of the Paramyxoviridae polymerases have also been reported (Fig. 5A). Interestingly, although all five domains of HPIV L are present, the CTD adopts a significant domain switch compared with that of the Rhabdoviridae L (Fig. 5B). The two conformations of the HPIV polymerase (L:P) are highly similar, with slightly different orientations of the CD-MT-CTD module with respect to RdRp and Cap (Fig. 5B, right panel). Furthermore, in contrast to Pneumoviridae P, EM density is visible for only one P CTD protomer in Paramyxoviridae P, suggesting the versatile roles of P in RNA synthesis. Notably, the tetrameric Paramyxoviridae P OD is much longer than the Rhabdoviridae and Pneumoviridae P OD, highlighting the potential mechanistic differences among those families.

STRUCTURAL SIMILARITIES AND DIFFERENCES AMONG THE MONONEGAVIRALES POLYMERASES
The L proteins of Rhabdoviridae, Pneumoviridae, and Paramyxoviridae have similar lengths (2,000 to 2,300 residues) and share a similar architecture. Indeed, the RdRp domains of Mononegavirales L share the standard right-hand, ring-like fingers-palm-thumb configuration of RNA and DNA polymerases. Comprehensive comparisons of the RNA/DNA polymerases and viral polymerases have been extensively reviewed elsewhere (129)(130)(131)(132)(133)(134)(135)(136). The structural superimpositions of the motifs, namely, fingers, palm, thumb, and structural support, of the RdRp domains of the Mononegavirales L are shown in blue, red, green, and gray, respectively. The active sites (GDN) of the RdRp domains are shown as magenta spheres (Fig. 6A to E). For comparison, we also showed the
Previous studies highlighted the conserved structural motifs A to E of the Cap domain of L (33,56,137). Unlike capping in the host cells, the capping reaction of the Mononegavirales L forms a covalent protein:RNA intermediate linkage between the 5′ end of the RNA transcript and the active site H residue (motif D), followed by the attack by a guanosine nucleotide. The motifs A to E of the Cap domain of the Mononegavirales L are shown as a ribbon diagram in blue, yellow, red, magenta, and green, respectively. Those motifs are centered around the motif D (HR) active site. The proposed priming loops (orange) are next to motif B (yellow) but exhibit a dramatic conformational rearrangement (Fig. 7).
Despite the high similarities, there are several significant differences between the known structures of Mononegavirales polymerases. (i) All five domains (RdRp, Cap, CD, MT, and CTD) of the Rhabdoviridae and Paramyxoviridae L are visible in the cryo-EM structures, compared with only two domains (RdRp and Cap) of the Pneumoviridae L. (ii) P forms dimers in Rhabdoviridae but tetramers in Pneumoviridae and Paramyxoviridae. It is thought that P displays distinct structural features due to low sequence identity and different oligomerization states. Interestingly, different domains of P interact with L in the reported structures. In Rhabdoviridae, only the P NTD interacts with L, mostly with the CD and CTD and with part of the RdRp (Fig. 8B). However, in Pneumoviridae and Paramyxoviridae, the P OD and P CTD interact with the RdRp domain of L (59-63) (Fig. 8D and 8F). Unlike the oligomeric P seen in Pneumoviridae and Paramyxoviridae, the absence of the P OD in the Rhabdoviridae structures results in a monomeric P bound to L. (iii) The priming loop and the supporting helix of L (Fig. 8, colored in orange) adopt three different conformations, as follows: in Rhabdoviridae (VSV and RABV), the priming loop together with a supporting helix in the RdRp domain projects into the GDN active sites (Fig. 8A) of the RdRp domain and closes off a channel toward the Cap domain; in Pneumoviridae (HRSV and HMPV), the supporting helix is (partially) disordered, and the priming loop retracts from the RdRp active sites (Fig. 8C) and opens the channel connecting to the Cap domain; and in Paramyxoviridae (HPIV), the supporting helix is visible (similar to Rhabdoviridae), but the priming loop with a disordered tip is projected away from the RdRp active sites (similar to Pneumoviridae) (Fig. 8E).

MECHANISMS AND MODELS OF MONONEGAVIRALES RNA SYNTHESIS
Collectively, the structures of the Mononegavirales polymerases discussed here reveal multiple distinct conformational arrangements of the L and P proteins, as shown in the cartoon diagrams (Fig. 9A). The comparative analyses suggest a potential RNA synthesis mechanism of Mononegavirales in which switching between initiation and elongation is associated with priming loop and supporting helix rearrangements (59)(60)(61)(62)(63). Based on the structural similarities and differences among the Mononegavirales polymerases, we hypothesize that (i) the polymerases of the Rhabdoviridae (VSV and RABV) are likely at the initiation stage of genome replication, and (ii) the polymerases of Pneumoviridae (HRSV and HMPV) and Paramyxoviridae (HPIV) are at different phases of the elongation stage of transcription, possibly the late and early phases, respectively.
To better understand the RNA synthesis mechanism of the Mononegavirales polymerases, we superimposed other viral polymerase complexes in the initiation and elongation stages. For initiation, the superimposition of the reovirus (ReoV) λ3 initiation complex reveals that, in the presence of the RNA template (yellow), the initiating nucleotide stacks with a Trp residue of the priming loop (W1167 in VSV L and W1180 in RABV L), which is analogous to Y630 in hepatitis C virus NS5B (59,60,138,139) (Fig. 9B, left panel). Mutation of this Trp residue severely affects genome or antigenome end initiation but not internal initiation or capping (140). For elongation, the polymerases require the retraction of the priming loop, and possibly the supporting helix, to make room for the product. Indeed, fully retracted priming loop configurations are observed in both Pneumoviridae (HRSV and HMPV) and Paramyxoviridae (HPIV). The superimpositions of the influenza B (FluB) elongation complexes at early and later stages reveal that the RNA transcripts (pink) have sufficient space to extend and pass through a continuous tunnel when the priming loop is entirely retracted (141) (Fig. 9B, middle and right panels). The remaining supporting helix in Paramyxoviridae (HPIV) results in a partially extruded tunnel, whereas the missing supporting helix in Pneumoviridae (HRSV and HMPV) leads to a fully open tunnel, which is ideal for highly processive transcription.
As highlighted above, the NC is the cognate RNA template for Mononegavirales RNA synthesis. Based on the structures of Mononegavirales RNA polymerases, we propose the models of the initiation and early and late stage elongation of RNA synthesis, as  (Fig. 9C, right panel).

CONCLUSIONS
Many Mononegavirales are significant human pathogens, imposing a tremendous public threat and health care burden. However, no effective vaccines and antiviral therapeutics against many Mononegavirales are currently available (18-21, 23, 29, 142-148). Viral polymerases have been attractive and major antiviral therapeutic targets, as seen in multiple drug discovery successes in various viral pathogens, including HIV-1, hepatitis C virus (HCV), and hepatitis B virus (HBV) (149)(150)(151)(152)(153)(154)(155)(156)(157). Drug design and target search heavily rely on an accurate understanding of the structure and functions of the target molecules. Therefore, various viral polymerases have been extensively investigated for their structures and functions (129,130). To understand the mechanistic insights of Mononegavirales RNA synthesis, the precise composition and structure of the Mononegavirales polymerases, how the different activities of the L protein influence one another, and how the cofactor regulates RNA synthesis need to be elucidated.
The structures of the Mononegavirales polymerases discussed here, including the L protein in complex with its cofactor P protein of VSV, RABV, HRSV, HMPV, and HPIV, reveal three conformations poised for initiation and elongation of RNA synthesis (59)(60)(61)(62)(63). The potential channels and the relative locations of multiple catalytic sites of L suggest that L coordinates a distinct capping and methyltransferase reaction with priming for de novo initiation of transcription. Transcription and replication might have different priming configurations and potential different product exit sites. The high similarity between L and P of the Mononegavirales polymerases provides a structural basis for the development of antiviral drugs that inhibit the RNA synthesis in transcription or replication.
This difference might also explain why L shows different architecture in three different families. P NTD is speculated to lock the CD, MT, and CTD domains into a closed conformation, in which L is poised for initiation at the 3′ end of the genome or antigenome and ready for RNA synthesis. The interactions between multiple domains of L and the P NTD reveal how P induces a compact, closed, and initiation-compatible state of L and how P positions the RNA template and the putative RNA product exit channel.
Several interesting questions arise by comparing and analyzing the known structures of the Mononegavirales polymerases. First, although the mass spectrometry data indicated that the Pneumoviridae L proteins used in structural studies are intact, the mystery of the missing MT domain and structural domains of L remains. Where do the MT and structural domains (CD and CTD) go? How do we capture the snapshots of their intermediates? Second, the known structures of the Mononegavirales polymerases are protein only without RNA present in the complex. However, those polymerases are in different initiation and elongation-compatible stages. Why do the priming loop and the supporting helix of L adopt different conformations in the protein-only complex? Third, the tetrameric P has a large interaction surface between P OD and L in Pneumoviridae and Paramyxoviridae. Given that P is a dimer in Rhabdoviridae but a tetramer in Pneumoviridae and Paramyxoviridae, is it possible that the dimeric P in Rhabdoviridae may not form a tight complex with L with large interfaces? This may explain why the HRSV, HMPV, and HPIV L need to be coexpressed in the presence of P, but not VSV L, which can be expressed and purified alone.
From an evolutionary perspective, Mononegavirales have evolved to utilize a single multifunctional enzyme to transcribe individual genes (make, cap, and methylate the mRNAs) and replicate the entire genome without capping and methylation. This may be due to reduced evolutionary pressure; typically, this multifaceted process is sensitive to cell state and signaling inputs. These viruses have evolved to drive this process efficiently forward using minimal components. In eukaryotes, RNA transcription (copying the genetic information) is a delicate and complicated process involving many molecular machines, such as DNA-dependent RNA polymerases, capping enzymes, and methyltransferases. For example, the eukaryotic counterparts of the RdRp, Cap, and MT domains of the multifunctional enzyme L are (i) RNA polymerase II and polyadenylate polymerase, (ii) RNA triphosphatase and guanylyltransferase, and (iii) RNA methyltransferase, respectively (158)(159)(160)(161)(162)(163)(164)(165)(166)(167). Additionally, Mononegavirales L also mimics the replication of the entire genome by accessing the N protein-coated RNA genome, similar to eukaryotic counterparts of DNA polymerases on the histone-assembled DNA genome (168)(169)(170).
The structural similarity of the Mononegavirales polymerases agrees with the relatively high sequence conservation. Nonetheless, the structural differences also highlight the virus- or genus-specific features. Collectively, the structures of the Mononegavirales polymerases provide significant advances in understanding the molecular architectures, interrelationships, inhibitors, and evolutionary implications of the Mononegavirales polymerases. Structures of other polymerases, from measles, mumps, Nipah, and Hendra viruses in Paramyxoviridae and from Ebola and Marburg viruses in Filoviridae, need to be determined for us to fully understand the similarities and differences of the polymerases in Mononegavirales. Furthermore, structures of Mononegavirales polymerases in complex with RNA templates, RNA products, or inhibitors are desired to appreciate the specific protein:RNA interactions and druggable sites.

FIGURE PREPARATION
All the figures presenting the structural models were generated using PyMOL (171).