Structures of diverse poxin cGAMP nucleases reveal a widespread role for cGAS-STING evasion in host–pathogen conflict

DNA viruses in the family Poxviridae encode poxin enzymes that degrade the immune second messenger 2′3′-cGAMP to inhibit cGAS-STING immunity in mammalian cells. The closest homologs of poxin exist in the genomes of insect viruses suggesting a key mechanism of cGAS-STING evasion may have evolved outside of mammalian biology. Here we use a biochemical and structural approach to discover a broad family of 369 poxins encoded in diverse viral and animal genomes and define a prominent role for 2′3′-cGAMP cleavage in metazoan host-pathogen conflict. Structures of insect poxins reveal unexpected homology to flavivirus proteases and enable identification of functional self-cleaving poxins in RNA-virus polyproteins. Our data suggest widespread 2′3′-cGAMP signaling in insect antiviral immunity and explain how a family of cGAS-STING evasion enzymes evolved from viral proteases through gain of secondary nuclease activity. Poxin acquisition by poxviruses demonstrates the importance of environmental connections in shaping evolution of mammalian pathogens.

In order to productively replicate in mammalian cells, viruses must evade immune surveillance. Poxviruses are large DNA viruses which replicate exclusively in the cytosol (Moss, 2013), and encode poxvirus immune nucleases (poxins) to degrade 2′3′-cGAMP and prevent STING activation (Figure 1A; Eaglesham et al., 2019). Vaccinia virus (VACV) poxin (encoded by the gene B2R) is sufficient to antagonize cGAS-STING signaling in cells and is necessary for effective viral replication in vivo. Crystal structures of VACV poxin in the pre- and post-reactive states revealed that catalysis proceeds through a metal-independent mechanism, contorting 2′3′-cGAMP into a conformation that activates the 2′ hydroxyl for in-line cleavage of the 3′-5′ bond.
Poxviruses have a well-documented ability to acquire genes horizontally, especially from their hosts (Hughes et al., 2010). The closest homologs of VACV poxin belong not to mammals or other mammalian viruses, but to baculoviruses and to Lepidoptera (moths and butterflies), which serve as exclusive hosts of baculoviruses. While baculovirus and lepidopteran poxin homologs share <25% identity with VACV poxin, they are functional nucleases and retain 2′3′-cGAMP-specific cleavage activity. The distribution of poxin homologs in mammalian poxviruses, insects, and insect viruses indicates that poxviruses may have obtained this gene through horizontal transfer from an ancestral host-pathogen conflict. Here we use a forward biochemical approach to map the evolutionary origin of poxin enzymes and define a genetic route through which poxviruses acquired a new mechanism of immune evasion. We determine four X-ray crystal structures of baculovirus and lepidopteran host poxins, revealing an unexpected origin of poxin enzymes as descendants of viral proteases. The structures of the poxin enzymes from the cabbage looper moth Trichoplusia ni and the monarch butterfly Danaus plexippus resemble self-cleaving proteases from positive-sense single-stranded RNA ((+)ssRNA) viruses, and bind their own C-termini within a vestigial protease active-site pocket. Using the lepidopteran poxin structures as a guide, we identify functional poxin enzymes in the genomes of unclassified insect-specific RNA viruses distantly related to flaviviruses, which possess both 2′3′-cGAMP nuclease activity and autoproteolytic cleavage activity. Our results define broad importance for 2′3′-cGAMP cleavage in metazoan host-pathogen conflict and reveal an evolutionary path through which an insect RNA viral protease developed secondary nuclease activity to inhibit cGAS-STING immunity. Conservation of poxin cGAS-STING evasion among pathogens like monkeypox virus and cowpox virus highlights a deep genetic connection which allowed these mammalian poxviruses to obtain a new mechanism of immune evasion from the environment.

Results
Poxins are a diverse family of 2′3′-cGAMP nucleases
To define poxin diversity and phylogenetic distribution, we used both the VACV poxin and lepidopteran Trichoplusia ni poxin sequences to seed position-specific iterative BLAST (PSI-BLAST) searches, identifying an initial combined total of 351 unique poxin-like sequences. Poxin homologs can be classified into four major enzyme groups with <25% identity to one another and which share seven different phylogenetic origins (Figure 1B, Figure 1-figure supplement 1A). We cloned 33 diverse representatives and directly tested recombinant protein for 2′3′-cGAMP nuclease activity using thin-layer chromatography (Figure 1B). Poxin homologs from all groups efficiently degraded 2′3′-cGAMP (Figure 1B), demonstrating nuclease activity is a conserved function of this protein family regardless of the sequence origin or genomic context.
The presence of active poxin homologs within each major group demonstrates widespread distribution of an enzyme family dedicated to regulation and evasion of cGAS-STING signaling. Poxins are encoded in a diverse array of viruses and host animal species (Figure 1B, Supplementary file 1). Group 1 is composed of enzymes from mammalian and insect poxviruses, some of which are fused to a C-terminal schlafen domain (Liu et al., 2018a). Groups 2-4 consist entirely of enzymes identified in the genomes of moths and butterflies (Lepidoptera), and of viruses or parasites which infect these insects. Group 2 primarily contains poxins encoded in alphabaculovirus genomes. However, two sequences identified in the genomes of parasitoid wasps (Microplitis demolitor and Glypta fumiferanae) cluster alongside these viral enzymes. These wasps parasitize caterpillars, laying their eggs inside them and co-injecting domesticated bracoviruses which modulate caterpillar immunity to favor egg maturation (Béliveau et al., 2015; Burke et al., 2018; Strand and Burke, 2015). Lepidopteran poxin enzymes cluster within Group 3, and have been shown to be highly upregulated after infection with various pathogens, including alphabaculoviruses and bacteria, providing evidence for a role in immunity or immune regulation (Shrestha et al., 2019; Woon Shin et al., 1998). Group 4 is formed by poxin proteins from the genomes of several additional lepidopteran viruses: betabaculoviruses (also called granuloviruses), cypoviruses (double-stranded RNA viruses in the family Reoviridae), and betaentomopoxviruses (Figure 1B; Silva et al., 2020). The vast evolutionary distance between these DNA and RNA viral families suggests related poxin genes may have been transferred between viruses within coinfected cells (Silva et al., 2020; Thézé et al., 2015). Notably, the enormous diversity of poxin enzymes in insect pathogens and moth and butterfly genomes confirms a broad role for 2′3′-cGAMP degradation, and strongly suggests these genomes served as a source for emergence of poxins in mammalian poxviruses.
Host and viral poxins employ alternative catalytic residues for 2′3′-cGAMP cleavage
Poxin enzymes from different groups share at most 15-25% identity, preventing identification of active-site residues and limiting analysis of the mechanism of 2′3′-cGAMP cleavage. To enable comparative analysis with VACV poxin (Group 1), we next determined a series of structures of representative enzymes from Groups 2-4 in complex with 2′3′-cGAMP. New poxin structures include the alphabaculovirus Autographa californica nucleopolyhedrovirus (AcNPV) poxin from Group 2, lepidopteran host poxins from the moth Trichoplusia ni and the monarch butterfly Danaus plexippus in Group 3, and the betabaculovirus Pieris rapae granulovirus (PrGV) poxin from Group 4 (Figure 2A, Supplementary file 2). In spite of dramatic sequence divergence, poxin structures from each group reveal a shared core fold and head-to-tail dimeric architecture confirming poxins as a single enzyme family (Figure 2A, Figure 2-figure supplement 1A).
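The pairwise identity values quoted above depend on how identity is counted. As a minimal sketch, assuming identity is scored only over alignment columns where neither sequence has a gap (the text does not state which convention was used), the calculation can be written as follows. The function name and the short example strings are hypothetical and are not poxin sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two rows of a pairwise alignment.

    Assumption: only columns where both sequences carry a residue are
    compared; columns with a gap in either sequence are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip gapped columns
        compared += 1
        if res_a == res_b:
            matches += 1

    if compared == 0:
        return 0.0  # nothing to compare
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy alignment rows, for illustration only.
    print(percent_identity("HYK-A", "HHKGA"))
```

Other conventions, such as dividing by the full alignment length or by the shorter sequence length, give lower values for the same alignment, so thresholds like <25% are only comparable when the same denominator is used throughout.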
The new poxin structures trap a post-reaction state following 2′3′-cGAMP cleavage and allow direct comparison with the catalytic mechanism of VACV poxin. VACV poxin functions through a metal-independent mechanism, contorting 2′3′-cGAMP to position the 2′ hydroxyl for in-line attack on the 3′-5′ bond, generating a cyclic phosphate intermediate which is further resolved into a 3′ phosphate product. Strikingly, despite sharing <21% identity, each poxin structure contorts 2′3′-cGAMP in an identical strained conformation (Figure 2B-F). Consistent with specificity for 2′3′-cGAMP and a shared mechanism of catalysis, the active-site charge landscape is conserved across all poxin proteins, with hydrophobic pockets for the bases, and multiple basic residues stabilizing the negatively charged phosphates of the 2′3′-cGAMP ligand.
In contrast to the shared conformation of 2′3′-cGAMP during cleavage, poxin enzymes exhibit diverse catalytic amino acids that activate the 2′ hydroxyl for in-line attack and stabilize cleavage intermediates. Unlike the catalytic triad of histidine, tyrosine, and lysine residues essential for VACV and AcNPV poxin cleavage activity (Figure 2B,C, Figure 2-figure supplement 1B,C), the active sites of T. ni poxin and PrGV poxin are divergent. The T. ni and PrGV poxin enzymes share the conserved active-site histidine but lack a tyrosine residue entirely (Figure 2D,E). Instead, both proteins contain a second histidine residue adjacent to the first, which forms a contact with the 2′ hydroxyl or 3′ phosphate of the cleaved 2′3′-cGAMP molecule (Figure 2D). In the case of T. ni poxin, mutagenesis analysis demonstrates this second histidine residue is essential for activity, suggesting this residue functions as a general base to deprotonate and activate the 2′ hydroxyl nucleophile for in-line cleavage of the 3′-5′ bond (Figure 2-figure supplement 1B). Further, the T. ni poxin protein shows substitution of the VACV lysine for an arginine residue, whereas the lysine is conserved in PrGV poxin (Figure 2D,E). Unlike mutation of the lysine residue in the VACV and AcNPV active sites, mutation of the T. ni poxin active-site arginine to alanine only partially reduces activity, and a charge-preserving mutation of the arginine to lysine results in even less activity (Figure 2-figure supplement 1B). Together, these results reveal amino-acid diversity in the active site of poxin enzymes and suggest functional differences in the ability to control 2′3′-cGAMP stability.
To study potential functional variation between viral and host poxins, we measured kinetics for VACV, AcNPV, and T. ni poxin 2′3′-cGAMP degradation. Interestingly, mammalian and insect viral poxins from VACV and AcNPV exhibit similar kinetics, with a KM of 0.83 µM and 2.43 µM respectively, while the T. ni host poxin protein has a KM two orders of magnitude higher at 342.1 µM. However, the host T. ni poxin has a much higher rate constant of 2254 min⁻¹ compared to VACV and AcNPV poxin (93.4 min⁻¹ and 346 min⁻¹, respectively). For comparison, we determined the KD of VACV, AcNPV, and T. ni poxin for 2′3′-cGAMP using enzymes with an inactivating catalytic-site histidine mutation (Figure 2-figure supplement 2E-H). Consistent with our enzyme kinetics results, inactive VACV and AcNPV poxin exhibited a KD of 0.58 µM and 0.81 µM respectively, within a similar range as the respective KM values of wild-type enzymes. In contrast, we could not measure the KD of the catalytic-site mutant T. ni poxin, indicating it is higher than 20 µM, again consistent with the high KM value for wild-type T. ni poxin. These enzymes represent only three members of a highly divergent enzyme family, and additional analyses will be required to determine whether these trends in enzyme kinetics hold true for other viral and host poxin enzymes.
Although animal studies will be required to understand the biological role of diverse poxins, our biochemical analysis is consistent with a model where viral poxins are adapted to depletion of 2′3′-cGAMP to low levels, while the host poxin is adapted for regulation and efficient clearance of 2′3′-cGAMP after accumulation to higher concentrations. Further supporting these results, studies of the lepidopteran transcriptional response to baculovirus infection show that host poxin genes are strongly upregulated during infection in a second wave of transcription occurring after initial immune induction (Shrestha et al., 2019; Woon Shin et al., 1998). Together, these results reveal that while all poxin enzymes use the same overall mechanism, amino-acid diversity in the active site of poxin enzymes likely enables 2′3′-cGAMP nuclease activity to be tailored to alternative immuno-regulatory and immuno-evasion functions.
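The contrast drawn above between viral and host enzymes can be made concrete with the reported constants. The following is a minimal sketch, assuming simple Michaelis-Menten behaviour and treating the reported rate constant as a turnover number. The class and function names are hypothetical, and substrate concentrations are expressed in whatever unit the KM values are reported in.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PoxinKinetics:
    name: str
    k_cat: float  # rate constant from the text, in min^-1
    k_m: float    # Michaelis constant, in the concentration unit used in the text

    def rate(self, substrate: float) -> float:
        """Michaelis-Menten turnover per enzyme (min^-1) at a given substrate level."""
        return self.k_cat * substrate / (self.k_m + substrate)

    def efficiency(self) -> float:
        """Catalytic efficiency k_cat / K_M, which governs the low-substrate regime."""
        return self.k_cat / self.k_m


def crossover(a: PoxinKinetics, b: PoxinKinetics) -> Optional[float]:
    """Substrate concentration at which both enzymes turn over equally fast.

    Solves k_a*S/(K_a+S) = k_b*S/(K_b+S) for S > 0; returns None if no
    positive solution exists.
    """
    denominator = a.k_cat - b.k_cat
    if denominator == 0:
        return None
    s = (b.k_cat * a.k_m - a.k_cat * b.k_m) / denominator
    return s if s > 0 else None


# Values as reported in the text above.
VACV = PoxinKinetics("VACV", k_cat=93.4, k_m=0.83)
ACNPV = PoxinKinetics("AcNPV", k_cat=346.0, k_m=2.43)
TNI = PoxinKinetics("T. ni", k_cat=2254.0, k_m=342.1)

if __name__ == "__main__":
    for enzyme in (VACV, ACNPV, TNI):
        print(enzyme.name, round(enzyme.efficiency(), 1),
              round(enzyme.rate(1.0), 1), round(enzyme.rate(1000.0), 1))
    print("VACV / T. ni crossover:", crossover(VACV, TNI))
```

Under these assumptions the viral enzymes have the higher k_cat/KM and therefore outperform the host enzyme at low substrate levels, while the T. ni enzyme becomes faster once the substrate concentration exceeds roughly 14 in the same unit as KM. This is only an illustration of the stated model, not a re-analysis of the underlying data.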

Structural analysis reveals poxins are descended from self-cleaving RNA-virus proteases
To define the origin of poxin enzymes, we next compared each nuclease against other structures in the Protein Data Bank to identify proteins with related folds. Although poxin functions as a 2′3′-cGAMP-specific nuclease, previous analysis of VACV poxin demonstrated that no structural homology exists with other nuclease or phosphodiesterase enzymes. Instead, the N-terminal domain of VACV poxin exhibits weak homology to chymotrypsin-like serine proteases (Figure 3A). Analysis of the host T. ni poxin structure confirms a relationship with protease enzymes and demonstrates strong homology with proteases derived from (+)ssRNA viruses (Figure 3A,B). Unlike the degenerated domain in VACV poxin, comparison of the N-terminal domain of T. ni poxin with the yellow fever virus protease demonstrates near complete conservation of a dual Greek key β-barrel fold common to serine protease enzymes (Figure 3C). Like a molecular fossil, T. ni poxin likely retains extensive ancestral protease homology due to the slow evolutionary rate of insect host genomes compared to the rapid replication and divergence of viral genes (Duffy et al., 2008). In addition to the N-terminal protease domain, all poxins share a C-terminal domain required for dimerization and formation of the nuclease active site. Accordingly, this domain is highly conserved with no degeneration occurring between VACV and T. ni poxin (Figure 3C). Together these results reveal unexpectedly close homology between poxin and protease enzymes and suggest a nuclease dedicated to 2′3′-cGAMP degradation evolved through dimerization and divergence of an ancient viral protease.
Identification of (+)ssRNA viral proteases as the closest structural homologs to poxin indicates a direct evolutionary connection between these groups of enzymes. (+)ssRNA viruses typically encode gene products as a polyprotein that must be proteolytically cleaved to release individual mature peptides (Lei and Hilgenfeld, 2017). Additionally, (+)ssRNA viruses often possess accessory proteases which excise themselves from the polyprotein and serve alternative structural or immune antagonist roles (Lei and Hilgenfeld, 2017; Mann and Sanfaçon, 2019). Given that poxins function as immune antagonists and share structural homology to (+)ssRNA viral proteases, we hypothesized that poxins originated as self-cleaving accessory nucleases within ancient RNA-virus genomes.
To test this hypothesis, we compared the vestigial protease active site of T. ni poxin with the chikungunya virus capsid protein that functions as a self-cleaving accessory protease (Figure 4A). In the chikungunya virus capsid protease domain, the cleaved C-terminus remains coordinated in the active site by histidine and aspartic acid residues adjacent to the catalytic serine (Figure 4A,C; Page and Di Cera, 2008; Sharma et al., 2018). A nearly identical pocket is conserved on the surface of the T. ni poxin protease-like domain, in which the poxin C-terminus is also coordinated by histidine and aspartic acid residues (Figure 4B; Hedstrom, 2002). Comparison of lepidopteran poxins with viral poxins from AcNPV and VACV reveals greater degeneration in the protease domain and protease active-site pocket in viral enzymes that results in loss of C-terminal coordination (Figure 3A, Figure 4E-G). Notably, the protease active-site pocket is entirely distinct from the poxin nuclease active site, which forms at the dimer interface between poxin monomers (Figure 2A, Figure  frame in insect hosts or other viruses likely released selective pressure for maintenance of self-cleavage activity and resulted in protease-domain degeneration.

Insect (+)ssRNA viruses encode functional self-cleaving poxins
To further test the hypothesis that poxins originated within ancient (+)ssRNA-virus polyproteins, we next searched for modern-day viral descendants which encode functional self-cleaving poxin enzymes. Re-examination of our initial PSI-BLAST results revealed additional short (92-195 amino acids), divergent poxin-like regions within large polyproteins, 5,901-8,572 amino acids long, of eight (+)ssRNA flavivirus-like viruses that are unclassified members of the order Amarillovirales (Koonin et al., 2020). These genomes were originally characterized through RNA sequencing of diverse insects and likely represent insect-specific viruses (Kobayashi et al., 2013; Remnant et al., 2017; Shi et al., 2016; Teixeira et al., 2016). The (+)ssRNA viral poxin sequences are highly divergent, sharing 10-25% identity with one another, and occur in variable genome positions including at the extreme N-terminus and up to 1000 amino acids inside the predicted viral polyprotein (Figure 5A, Figure 5-figure supplement 1). Alignment of the (+)ssRNA viral poxin sequences reveals conservation of putative protease catalytic residues (histidine, aspartate, and serine) (Figure 5B), with the catalytic serine residing in an SGTP motif that is consistent with the vestigial AGTP motif conserved in the lepidopteran poxin protease active site (Figure 5B, Figure 4-figure supplement 1D).
We cloned putative poxins from the eight identified amarilloviruses and expressed them in E. coli as SUMO-fusion recombinant proteins (Figure 5C, Figure 5-figure supplement 2A). Activity analysis demonstrates that (+)ssRNA viral poxins from Macrosiphum euphorbiae virus 1, Gamboa mosquito virus, and Xingshan cricket virus (XCV) efficiently catalyze 2′3′-cGAMP degradation, verifying these proteins as functional poxin family members. Given the high degree of divergence exhibited by amarillovirus poxins, we compared the specificity of XCV poxin for 2′3′-cGAMP and for a cGAMP isomer with only 3′-5′ linkages (3′3′-cGAMP). XCV poxin retains a high degree of specificity for 2′3′-cGAMP, similar to poxins tested from within each other group of enzymes (Figure 5-figure supplement 2B). To assess if amarillovirus poxins possess autoproteolytic cleavage activity, we focused on the XCV poxin enzyme capable of 2′3′-cGAMP cleavage and a divergent homolog from Shuangao lacewing virus 2 (SLV2) that readily expressed to high levels in E. coli (Figure 5-figure supplement 2A). The XCV and SLV2 poxins were each fused to a C-terminal GFP tag and purified with either an N-terminal or C-terminal 6×His tag to assess self-cleavage (Figure 5D). Purification of the XCV or SLV2 poxins with an N-terminal tag yielded a fragment corresponding in size to a SUMO-poxin fusion, demonstrating proteolytic cleavage and removal of the GFP (Figure 5E). Likewise, purification with a C-terminal 6×His tag confirmed these results and yielded a smaller fragment corresponding only to the cleaved GFP tag. Mutation of the putative protease catalytic serine residue conserved between amarillovirus poxin proteins blocked all cleavage (Figure 5E).
Using Edman degradation, we mapped the XCV and SLV2 cleavage motifs to an identical H/ST sequence conserved in both viruses (Figure 5F). Removal of the mapped cleavage site from the XCV poxin-GFP fusion construct abrogates all detectable proteolysis, confirming this motif is required for self-cleavage (Figure 5-figure supplement 2C). Further, mutation of individual residues in the XCV poxin cleavage site motif demonstrates that the histidine residue at position P1 directly N-terminal to the scissile bond is critical for cleavage site recognition (Figure 5-figure supplement 2C). A search of the XCV polyprotein for the cleavage site motif NxQH (Figure 5F) reveals that this motif occurs in only two instances across the entire 7070 amino-acid polyprotein, at the mapped C-terminal cleavage site and again just N-terminal to the poxin-like region (Figure 5-figure supplement 2D,E). Conservation of this motif at only these two positions in the polyprotein suggests that XCV poxin may be adapted for self-excision at both the N- and C-termini (Figure 5-figure supplement 2F). Notably, the ability of XCV poxin to catalyze both auto-proteolysis and nucleolytic cleavage of 2′3′-cGAMP (Figure 5C,E) confirms the existence of functional self-cleaving poxin enzymes within insect (+)ssRNA viruses. Together, these data verify a model for poxin evolution and demonstrate that this family of immune evasion proteins diverged from a self-cleaving accessory protease (Figure 5-figure supplement 2F).
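The polyprotein-wide motif search described above amounts to a simple pattern scan. The sketch below is one way to perform it, assuming that a lowercase "x" in the motif stands for any residue and that the scissile bond follows the final histidine (P1). The function names and the toy sequence are hypothetical; the actual XCV polyprotein sequence is not reproduced here.

```python
import re
from typing import List


def find_motif(sequence: str, motif: str = "NxQH") -> List[int]:
    """Return 1-based start positions of every motif occurrence.

    A lowercase 'x' in the motif is treated as a wildcard for any residue.
    A lookahead is used so that overlapping occurrences are also reported.
    """
    pattern = "".join("." if char == "x" else re.escape(char) for char in motif)
    return [match.start() + 1 for match in re.finditer(f"(?=({pattern}))", sequence)]


def p1_positions(sequence: str, motif: str = "NxQH") -> List[int]:
    """1-based position of the last motif residue (P1), after which cleavage is assumed."""
    return [start + len(motif) - 1 for start in find_motif(sequence, motif)]


if __name__ == "__main__":
    # Toy sequence for illustration only, not a viral polyprotein.
    toy = "MKTNAQHSTGGLLAPQHAAANRQHSTW"
    print(find_motif(toy))
    print(p1_positions(toy))
```

Applied to a full polyprotein, the length of the returned list gives the number of motif occurrences, which is the quantity reported in the text.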

Discussion
Our results reveal that poxins are a widespread family of enzymes dedicated to 2′3′-cGAMP degradation and control of cGAS-STING immunity. Through biochemical and structural analysis, we reconstruct the evolutionary history of poxins and define a clear molecular connection with viral protease enzymes. Structures of lepidopteran poxins reveal unexpectedly strong homology with serine proteases from (+)ssRNA viruses and explain how poxins originated from a self-cleaving viral protease that gained a secondary nuclease active site for 2′3′-cGAMP cleavage (Figure 3). In this model, acquisition of a C-terminal domain enabled protease dimerization and creation of a new binding site for 2′3′-cGAMP recognition (Figure 6). Substrate contortion within this pocket catalyzes a metal-independent reaction that degrades 2′3′-cGAMP and potently inhibits host antiviral immunity (Figure 2).
A structure-guided maximum-likelihood tree of all poxin sequences identified in our study provides a global view of poxin diversity and the prominent role of antagonism of host cGAS-STING signaling in mammalian immunity and insect viral replication (Figure 6A, Figure 6-figure supplement 1). The currently available poxin enzyme sequences form nine groups or subgroups based on phylogenetic analysis and species origin. Of note, amarillovirus poxin sequences in Group 5 and baculovirus poxin sequences in Group 2C do not achieve >50% bootstrap support. As new sequences become available, future bioinformatic work will be required to reconstruct the exact mechanism of poxin horizontal spread, and to confirm RNA viruses as the progenitors of all poxin enzymes. Our phylogenetic analysis further emphasizes that nearly all insect viruses which encode a poxin enzyme exclusively infect moths and butterflies in the order Lepidoptera. As lepidopteran hosts also encode endogenized versions of poxin, our results reveal moths and butterflies and their pathogens as an epicenter for extensive radiation of this protein family. The most divergent poxin enzyme sequences are encoded in the genomes of (+)ssRNA viruses from the order Amarillovirales, distantly related to mammalian (+)ssRNA pathogens like dengue virus and hepatitis C virus. Identification of functional self-cleaving poxin enzymes with 2′3′-cGAMP nuclease activity in circulating amarilloviruses provides further support for the biochemical model of evolution of poxin enzymes from (+)ssRNA viral proteases. Horizontal transfer from an ancient amarillovirus likely seeded poxin diversification in Lepidoptera and eventual acquisition by mammalian poxviruses. Amarillovirus genomes encoding poxin have been identified in geographically distant locations and from extremely diverse insect species representing six different phylogenetic orders (Faizah et al., 2020; Kobayashi et al., 2013; Kondo et al., 2020; Remnant et al., 2017; Shi et al., 2016; Teixeira et al., 2016). As our understanding of viral diversity continues to broaden, it is likely that additional immune evasion mechanisms will be identified as shared between mammalian viruses and invertebrate pathogens.
Figure 6. Horizontal transfer and evolution of poxin enzymes from proteases to nucleases. (A) Structure-guided phylogenetic tree depicting global poxin diversity. 100 bootstrap replicates were performed, and branch support is provided for all poxin groups with support values > 50. Most poxin sequences within groups highlighted with a dotted line encode putative signal peptides, suggesting extracellular or secretory roles of some alphabaculovirus and lepidopteran poxin homologs. Poxin enzymes with crystal structures are highlighted with black dots. Unclassified poxins from the mayfly E. danica and Anomala cuprea entomopoxvirus were grouped together on the tree for simplicity but are not closely related to one another. The tree is visualized here as rooted to emphasize and enumerate global poxin sequence diversity, and the unrooted visualization is available in Figure 6
In mammals, poxin functions to degrade 2′3′-cGAMP and block induction of antiviral signaling by the cGAS-STING pathway during poxvirus infection. In most mammalian poxviruses, poxin exists as a fusion to a C-terminal domain with homology to mammalian schlafens. Recent work with ectromelia virus demonstrates the schlafen domain is dispensable for evasion of cGAS-STING signaling, and that mammalian schlafens fail to complement poxin-schlafen deletion mutant viruses (Hernáez et al., 2020). Given conservation of the poxin-schlafen fusion amongst most orthopoxviruses (Hernáez et al., 2020), further work is required to explore auxiliary roles for the schlafen domain in cGAS-STING regulation by poxin.
The widespread distribution of poxin enzymes in insect viruses supports a prominent role for 2′3′-cGAMP signaling in metazoan antiviral immunity. Recent studies of cGAS-STING signaling in insects have demonstrated that Drosophila STING drives NF-κB and autophagy signaling to restrict viral infection (Goto et al., 2018; Gui et al., 2019; Hua et al., 2018; Liu et al., 2018b; Martin et al., 2018). However, the upstream signaling machinery that activates STING in insects remains poorly understood. Insects encode enzymes like Drosophila melanogaster CG7194 and CG12970 that are part of the cGAS/DncV-like nucleotidyltransferase (CD-NTase) family (Kranzusch, 2019; Whiteley et al., 2019), but these enzymes are significantly divergent from mammalian cGAS and it is unclear if insect CD-NTases synthesize 2′3′-cGAMP or respond to cytosolic DNA. Our results show that nearly all poxin representatives from across all groups retain specificity for 2′3′-cGAMP and fail to cleave 3′3′-cGAMP. Although exceptions to poxin specificity exist (Figure 5-figure supplement 2B), these results suggest that 2′3′-cGAMP is a predominant ligand in insect immune signaling.
While most metazoan poxin enzymes were identified in the genomes of moths and butterflies, several examples suggest that poxins play an important role in immunity in diverse insects. Two different parasitoid wasp species encode poxin homologs (Glypta fumiferanae: AKD28026 and Microplitis demolitor: XP_008552911), and previous work suggests these proteins may even play a role in parasitism of caterpillars, within which the wasps lay their eggs (Béliveau et al., 2015; Burke et al., 2018; Strand and Burke, 2015). Further, the mayfly Ephemera danica encodes a poxin homolog (KAF4524375), which possesses an intact HYK catalytic triad indicating functional poxin nuclease activity, but this enzyme fails to cluster within other poxin groups (Figure 6A). Future studies of these and other insect poxin proteins will provide further insight into the ancient evolutionary relationship between RNA-virus proteases and poxin nucleases.
Whereas insect viral poxins likely restrict immune activation similar to the function of poxin in mammalian poxviruses, the biological roles of poxin enzymes endogenized in the genomes of insects are less clear. One hypothesis is that insect poxins function to limit the magnitude of STING activation. In agreement, our results suggest that host insect poxins function with kinetics distinct from those of viral poxins (Figure 2-figure supplement 2). Although viral poxins are capable of degrading 2′3′-cGAMP even at low ligand concentrations, host poxins instead appear tailored for setting an upper threshold for the immune response. In mammals, a growing body of work suggests that 2′3′-cGAMP can be released from infected cells and imported to activate bystander immunity (Carozza et al., 2020; Luteijn et al., 2019; Ritchie et al., 2019; Zhou et al., 2020). Interestingly, our bioinformatic analysis demonstrates that many baculoviruses encode two different poxin enzymes with one variant containing a signal peptide for extracellular secretion (Figure 6A; Craveiro et al., 2015). Likewise, host lepidopteran poxins are encoded as multiple isoforms with and without a signal peptide (e.g. Trichoplusia ni XP_026730193 and XP_026730202) (Supplementary file 1; Chen et al., 2019). In insects, poxins may therefore regulate extracellular 2′3′-cGAMP signaling in addition to controlling cytosolic cGAS-STING activation.
In contrast to the widespread distribution of poxin enzymes among insects and insect viruses, there is a puzzling lack of cytosolic 2′3′-cGAMP nuclease machinery in mammalian cells. The only enzyme known to degrade 2′3′-cGAMP in humans is the nuclease ENPP1, which is exclusively extracellular and regulates signaling outside of the cell (Carozza et al., 2020; Li et al., 2014). Functional homologs of ENPP1 have not been identified in insects, indicating that alternative mechanisms for regulating cGAS-STING signaling may exist in these animals. Our data indicate that insect poxins may be expressed as both secreted and cytosolic forms, perhaps having functional roles in both intra- and extracellular cGAS-STING regulation. This abundance of enzymes that efficiently degrade 2′3′-cGAMP in the cytosol of insects may have provided the opportunity for poxviruses to acquire a new mechanism of immune control that did not exist in mammalian cells. Poxins have traversed a vast evolutionary distance from an origin in insect (+)ssRNA viruses to a role in enabling mammalian poxvirus pathogens to evade cGAS-STING immunity, and a remarkable transition from protease to nuclease activity provides a clear example of how proteins can evolve through gain and loss of enzymatic function. Acquisition of poxins from insect viruses further underscores the functional similarities between mammalian and insect innate immunity and reveals the importance of environmental genetic diversity as a driver for evolution of pathogenic viruses.

Materials and methods
Bioinformatic identification and cloning of poxin homologs
VACV poxin and the host lepidopteran T. ni poxin sequences were used to initiate queries of the NCBI nonredundant protein database using position-specific iterative BLAST (PSI-BLAST) (Altschul et al., 1997) on January 17, 2020. Additional searches were performed on May 27, 2020 using the mapped boundaries for SLV2 (M1-H227) and XCV (C782-H1007) poxins, along with VACV and T. ni poxin as queries, allowing identification of 18 additional poxin sequences. For each analysis, between 6 and 10 iterations were run until the results converged. The BLAST default settings were used, specifying a PSI-BLAST E-value cutoff of 0.005 for inclusion in the next search round, the BLOSUM62 scoring matrix, and gap costs set at existence: 11, extension: 1. Initial results for the VACV and T. ni query proteins were largely overlapping, with two sequences identified only with VACV as a query and 26 identified only using T. ni poxin as a query, and were combined for a total of 351 poxin homolog sequences. Results obtained in our second analysis using SLV2 and XCV poxin sequences as additional queries included 18 sequences not previously identified, for a total of 369 poxin homologs (Supplementary file 1). Some proteins identified here as poxin enzymes have previously been referred to by other names, such as p26 in alphabaculoviruses, HDD13 in Lepidoptera, Schlafen in poxviruses (in most orthopoxviruses, poxin sequences are fused to a C-terminal schlafen domain, but in some cases this name has carried over to poxin proteins in other poxviruses which are unfused and have no homology to mammalian schlafens), and acetyltransferase-like protein in betabaculoviruses.
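For readers who want to reproduce a comparable search locally, the stated settings map onto standard options of the NCBI BLAST+ `psiblast` program. The sketch below only assembles the argument list and checks the sequence counts reported above. Whether the original searches were run through the NCBI web interface or the command line is not stated, so the command-line form, the helper names, the query file name, and the local database name are all assumptions.

```python
from typing import List


def psiblast_command(query_fasta: str, database: str = "nr",
                     iterations: int = 10,
                     inclusion_evalue: float = 0.005) -> List[str]:
    """Build a BLAST+ psiblast argument list matching the settings in the text.

    The file and database names are placeholders. The list can be passed to
    subprocess.run on a machine with BLAST+ and a local database installed.
    """
    return [
        "psiblast",
        "-query", query_fasta,
        "-db", database,
        "-num_iterations", str(iterations),            # text: 6 to 10 rounds until convergence
        "-inclusion_ethresh", str(inclusion_evalue),   # E-value cutoff for the next round
        "-matrix", "BLOSUM62",
        "-gapopen", "11",                              # gap existence cost
        "-gapextend", "1",                             # gap extension cost
    ]


def shared_hits(total: int, only_first: int, only_second: int) -> int:
    """Number of hits found by both queries, given the combined total."""
    return total - only_first - only_second


if __name__ == "__main__":
    print(" ".join(psiblast_command("vacv_poxin.fasta")))  # hypothetical file name
    print(shared_hits(351, 2, 26))   # hits returned by both the VACV and T. ni queries
    print(351 + 18)                  # total after the second search round
```

The count check makes the overlap explicit: of the 351 initial sequences, all but 28 were recovered with either query.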
Sequences smaller than 179 amino acids or larger than 532 amino acids, such as those identified within large RNA-virus polyproteins, appeared to represent either sequence fragments or proteins too large to be a poxin protein alone, and were excluded from our initial poxin biochemical screen. Of the remaining sequences, 33 representative enzymes were selected for biochemical analysis. To study amarillovirus poxin proteins, soluble fragments from within the viral polyprotein were identified using an estimated boundary of ~200 amino acids around regions of poxin homology, individual analysis of protein disorder prediction with DisoPred3 (Ward et al., 2004), and homology modeling to lepidopteran poxin structures with Phyre2 (Kelley et al., 2015). RefSeq accession numbers for the poxin proteins in biochemical screen in Figure 1B

Protein alignments and phylogenetic trees
All protein alignment diagrams were created using the MAFFT FFT-NS-i iterative refinement method (Katoh and Standley, 2013), rendered in Geneious Prime (2020.0.5) and exported for annotation in Adobe Illustrator 24.1. Phylogenetic trees were constructed in Geneious Prime, using alignments made with MAFFT FFT-NS-i iterative refinement (Figures 1 and 2) or PROMALS3D (Pei et al., 2008; Figure 6, Figure 6-figure supplement 1). The tree in Figure 1 was produced from a sequence alignment of the 33 poxin proteins selected for initial biochemical analysis, using the neighbor-joining method and Jukes-Cantor genetic distance model with no outgroup, and was rendered using proportionally transformed branches for alignment to TLC images in Illustrator. To produce the phylogeny of all poxin enzymes in Figure 6, we aligned all poxin sequences ranging in size from 179 to 397 amino acids with poxin monomer crystal structures for VACV poxin (6EA9), AcNPV poxin (6XB3), PrGV poxin (6XB4) and T. ni poxin (6XB5) using PROMALS3D (Pei et al., 2008). In order to place amarillovirus poxin-like sequences on the tree, sequence boundaries were predicted using the mapped cleavage sites of the SLV2 and XCV poxin-like regions. For both, these sites were 112 amino acids N-terminal and 115-116 amino acids C-terminal to the serine in the conserved SGxP motif. These boundaries were therefore applied to each amarillovirus poxin sequence to adjust for the wide variety of sequence lengths identified by PSI-BLAST. For Lepidoptera, a single isoform of each poxin protein was included in the alignment used to generate the tree. In poxviruses, many representatives of poxin are encoded as fusions to a C-terminal domain with homology to mammalian schlafen proteins. For these sequences, only the region corresponding to the poxin domain was included in the alignment, and the schlafen domain, extending from the conserved amino-acid motif LLNSGGG to the C-terminus, was removed.
The resulting PROMALS3D alignment was subjected to maximum-likelihood analysis with 100 bootstrap replicates using PhyML in order to generate the final poxin phylogeny (Guindon et al., 2010;Pei et al., 2008).
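The boundary rule described above for amarillovirus sequences can be sketched as a short script. This is a minimal illustration only, not the authors' pipeline: the function name `poxin_window`, the choice to use the first SGxP match in a sequence, and the use of 116 (rather than 115) residues on the C-terminal side are all assumptions made for the example.

```python
import re

# Offsets from the text: 112 residues N-terminal and 115-116 residues
# C-terminal to the serine of the conserved SGxP motif.
N_TERM_OFFSET = 112
C_TERM_OFFSET = 116  # assumption: the longer of the two reported values

SGXP = re.compile(r"SG.P")  # x = any residue


def poxin_window(polyprotein, n_off=N_TERM_OFFSET, c_off=C_TERM_OFFSET):
    """Return (start, end, subsequence) around the first SGxP motif.

    Indices are 0-based and end-exclusive, clipped to the sequence ends.
    Returns None when no SGxP motif is present. A real polyprotein may
    contain several SGxP matches; picking the first is an assumption.
    """
    match = SGXP.search(polyprotein)
    if match is None:
        return None
    ser = match.start()  # index of the motif serine
    start = max(0, ser - n_off)
    end = min(len(polyprotein), ser + c_off + 1)
    return start, end, polyprotein[start:end]
```

Applied to each PSI-BLAST hit, a helper of this kind would trim sequences of very different lengths to a comparable poxin-like region before alignment.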

Protein expression and recombinant protein purification
All poxin protein constructs were cloned into a custom pET vector designed to express an N-terminal 6×His-tagged SUMO2 fusion (Zhou et al., 2018) in E. coli, using synthetic DNA fragments (IDT) and NEBuilder HiFi DNA Assembly mix (NEB). Certain constructs produced for the amarillovirus poxin self-cleavage experiment in Figure 5E were cloned into an alternative custom pET vector with an N-terminal SUMO2 fusion and C-terminal 6×His tag (Whiteley et al., 2019). Recombinant proteins were produced in the E. coli BL21 RIL strain (Agilent), in 2 ml (small scale) or 50 ml (large scale) MDG starter cultures, before growth and induction in 10 ml (small scale, for biochemistry) or 1 L (large scale, for kinetics assays and crystallography) M9ZB cultures as previously described (Zhou et al., 2018). Selenomethionine (SeMet)-labeled proteins for crystallography were grown in modified M9ZB medium as previously described. Cells were collected by centrifugation, disrupted by sonication, and recombinant protein was purified using Ni-NTA beads (Qiagen) as previously described. Poxin proteins purified for biochemical assays in Figure 1B, Figure (Reverter and Lima, 2006). Untagged poxin proteins were then further purified using 16/600 S75 or S200 size-exclusion chromatography columns (GE) in the same buffer, concentrated for storage and cryoprotection, flash frozen in liquid nitrogen, and stored long-term at −80°C. Purified proteins were resolved on 4-20% Mini-Protean TGX gels (Bio-Rad) according to the manufacturer's specifications and stained with Coomassie G-250 (VWR).

Synthesis of cyclic dinucleotides
2′3′-cGAMP for poxin nuclease assays was synthesized using the mouse cGAS catalytic domain P147-L507. Recombinant human SUMO2-tagged mouse cGAS was expressed in E. coli and purified using Ni-NTA affinity chromatography as previously described (Zhou et al., 2018). Briefly, the SUMO2 tag was removed as described above and mcGAS was further purified with heparin ion-exchange and S75 size-exclusion chromatography. Mouse cGAS (5 µM) was incubated for 2 hr at 37°C with 200 µM ATP and 200 µM GTP in the presence of 2 µM 45 bp stimulatory dsDNA in a 20 µl reaction with final buffer composed of 50 mM HEPES-KOH pH 7.5, 5 mM Mg(OAc)2, 37.5 mM KCl, and 1 mM DTT. Reactions were trace-labeled with [α-32P]GTP (Perkin-Elmer). Reactions were terminated through addition of 1 µl Quick CIP (NEB) to digest remaining nucleoside triphosphate substrates, heat inactivated for 5 min at 80°C, and frozen at −20°C before use in nuclease assays. 3′3′-cGAMP was enzymatically synthesized in a similar manner using recombinant Vibrio cholerae DncV incubated with ATP and GTP as previously described (Kranzusch et al., 2014). All 3′-5′ linked cyclic dinucleotides were prepared with 200 µM ATP and 200 µM GTP.
2′3′-cGAMP used for crystallography was enzymatically synthesized and purified as previously described, by incubation of 100 nM recombinant mouse cGAS with 500 µM each ATP and GTP substrates and 50 µg ml−1 salmon sperm DNA in reaction buffer (10 mM Tris-HCl pH 7.5, 12.5 mM NaCl, 10 mM MgCl2, 1 mM DTT) at 37°C for 24 hr. 2′3′-cGAMP was then purified by ion-exchange (2 × 5 ml HiTrap Q columns) using a gradient of 0-2 M NH4OAc. Eluted 2′3′-cGAMP was freeze-dried and washed twice with methanol before final lyophilization and storage as powder at −20°C.

Poxin nuclease activity assays
Poxin nuclease assays were performed as previously described. Reactions were carried out at 37°C in 10 µl buffer (50 mM HEPES-KOH pH 7.5, 40 mM KCl, 1 mM DTT) with 1 µl of a cGAS 2′3′-cGAMP synthesis reaction (~20 µM final concentration of 2′3′-cGAMP). Reactions for the nuclease activity screens in Figure 1B and Figure 5C were carried out using 1 µl of buffer-exchanged Ni-NTA elutions for each recombinant protein without normalization for protein concentration. Reactions with poxin active-site mutants in Figure 2-figure supplement 1B and reactions testing specificity of diverse poxins in Figure 5-figure supplement 2B were carried out using 1 µl of a 1 µM stock for each recombinant protein, and incubated for 15 min. For the poxin nuclease activity screen in Figure 1B, reactions were incubated for 1 hr, and reactions with amarillovirus poxin proteins in Figure 5C were performed for 20 hr. Longer reactions were used to allow more sensitive detection of 2′3′-cGAMP degradation activity. All reactions were terminated by spotting on a PEI cellulose thin-layer chromatography plate (EMD Millipore), and reaction products were resolved using a TLC mobile phase composed of 1.5 M KH2PO4 pH 3.8. After developing, TLC plates were dried and exposed to a phosphor screen overnight before imaging on a Typhoon phosphor-imager (GE). TLC images were cropped and adjusted for brightness and contrast in Fiji (Version 2.0.0-rc-69-1.52p).

Poxin Michaelis-Menten kinetic analysis
In order to study poxin enzyme kinetics, poxin nuclease activity assays were carried out using stocks of chemically-synthesized 2′3′-cGAMP (Biolog) mixed with a small amount of [32P]-2′3′-cGAMP tracer to achieve defined substrate concentrations. [32P]-2′3′-cGAMP tracer was produced in 10 µl reactions with mouse cGAS as detailed above, using 3 µl [α-32P]GTP (Perkin-Elmer, ~10 µM final concentration) and 10 µM ATP. After a 2 hr incubation at 37°C, quenching with Quick CIP, and heat inactivation for 5 min at 80°C, the tracer was diluted 1:5 (50 µl final volume) in RNase-free water (VWR). Chemically-synthesized 2′3′-cGAMP was then re-suspended in water, and labeled by addition of [32P]-2′3′-cGAMP tracer at a 1:50 dilution to achieve radioactively labeled stocks at a range of concentrations. 10 µl reactions were assembled in triplicate in 8-well strips with one well serving as a tracer-only background control, and seven wells serving as experimental poxin degradation reactions with varying concentrations of substrate. Reactions were pre-warmed to 37°C in a 96-well heat block for 5 min prior to addition of 1 µl of buffer to the tracer-only background control and 1 µl of poxin protein to the experimental wells. Reactions were mixed with a multi-channel pipettor, and stopped by spotting directly onto a TLC plate after 30 s. VACV poxin reactions were carried out with 20 nM protein (10 nM enzyme dimer) incubated with 2′3′-cGAMP at the following final concentrations: 0.1, 0.25, 0.5, 0.75, 1, 2.5, and 5 µM. AcNPV poxin reactions were carried out with 20 nM protein (10 nM enzyme dimer) incubated with 2′3′-cGAMP at the following final concentrations: 0.25, 0.5, 0.75, 1, 2.5, 5, and 10 µM. T. ni poxin reactions were carried out with 100 nM protein (50 nM enzyme dimer) incubated with 2′3′-cGAMP at the following final concentrations: 50, 100, 150, 250, 500, 750, and 1 mM.
Reaction progress was monitored using thin-layer chromatography, and quantified using ImageQuant software (GE). Percentage 2′3′-cGAMP turnover was calculated by quantification of the cleaved 2′3′-cGAMP spot intensity divided by the total signal for cleaved and uncleaved 2′3′-cGAMP in each lane. This value was then adjusted by subtraction of the percentage turnover value observed for the tracer-only negative control. To obtain the initial rates of 2′3′-cGAMP degradation in µM min−1, adjusted percent turnover was multiplied by the total concentration of 2′3′-cGAMP in each reaction, and divided by the length of the reaction (0.5 min). 2′3′-cGAMP-dependent enzyme kinetics were fit using the Michaelis-Menten model in GraphPad Prism, and kcat values were determined using the concentrations of poxin dimer, the minimal active enzyme unit. Results for each enzyme shown in Figure 2-figure supplement 2 are a single experiment (n = 3 technical replicates), representative of at least two biological replicates.
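The rate calculation described above can be illustrated with a small sketch. The function names are hypothetical, turnover is handled as a fraction rather than a percentage, and the fit shown is a Hanes-Woolf linearization substituted here for simplicity; the published fits were nonlinear Michaelis-Menten fits in GraphPad Prism, so parameters estimated this way may differ slightly on noisy data. All concentrations passed to a given function are assumed to share one unit.

```python
def initial_rate(cleaved, uncleaved, background_fraction, substrate_conc,
                 minutes=0.5):
    """Initial rate (concentration per minute) from TLC spot intensities.

    Fractional turnover = cleaved / (cleaved + uncleaved), corrected by the
    turnover seen in the tracer-only control, then scaled by the substrate
    concentration and divided by the reaction time (0.5 min in the text).
    """
    turnover = cleaved / (cleaved + uncleaved) - background_fraction
    return turnover * substrate_conc / minutes


def hanes_woolf_fit(substrate, rates):
    """Estimate (Vmax, Km) by least squares on [S]/v versus [S].

    Hanes-Woolf form: [S]/v = [S]/Vmax + Km/Vmax, so the slope is 1/Vmax
    and the intercept is Km/Vmax. This is a stand-in for a nonlinear fit.
    """
    y = [s / v for s, v in zip(substrate, rates)]
    n = len(substrate)
    mean_x = sum(substrate) / n
    mean_y = sum(y) / n
    sxx = sum((x - mean_x) ** 2 for x in substrate)
    sxy = sum((x - mean_x) * (yi - mean_y) for x, yi in zip(substrate, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / slope
    km = intercept * vmax
    return vmax, km


def kcat(vmax, dimer_conc):
    """kcat (per minute) = Vmax / [enzyme dimer], both in the same unit."""
    return vmax / dimer_conc
```

Dividing by the dimer concentration rather than the monomer concentration follows the text's statement that the poxin dimer is the minimal active enzyme unit.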

Electrophoretic mobility shift assay
Stable poxin-2′3′-cGAMP complex formation was assessed using an electrophoretic mobility shift assay as previously developed for the receptor STING (Morehouse et al., 2020; Whiteley et al., 2019). All 2′3′-cGAMP-binding experiments were performed with the catalytically inactive poxin mutants VACV poxin H17A, AcNPV poxin H46A, and T. ni poxin H56A to prevent cleavage. Briefly, radiolabeled 2′3′-cGAMP was diluted to a final concentration of ~50 nM into 10 µl reactions containing 1× reaction buffer (50 mM KCl, 50 mM Tris-HCl pH 7.5, and 1 mM TCEP) and 0-20 µM recombinant poxin protein as indicated. Reactions were incubated for 30 min at 25°C, then separated on a 7.2 cm 6% nondenaturing polyacrylamide gel run at 100 V for 45 min in 0.5× TBE buffer. The gel was fixed for 15 min in a solution of 40% ethanol and 10% acetic acid before drying at 80°C for 1 hr, and was then exposed to a phosphor screen and imaged with a Typhoon Trio Variable Mode Imager (GE Healthcare). Signal intensity was quantified using ImageQuant 5.2 software (GE Healthcare) and analyzed in GraphPad Prism 8.4.3 using the specific binding with Hill slope model to determine KD. Note that use of proteins with catalytically inactivating mutations likely results in an underestimated KD value for each poxin studied.
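For reference, the specific binding with Hill slope model has the form Y = Bmax · X^h / (KD^h + X^h). The sketch below writes it out along with a residual helper; the function names are hypothetical, and how band intensities were converted to a bound signal before fitting is not stated in the text, so that step is left to the caller.

```python
def specific_binding_hill(x, bmax, kd, h):
    """Bound signal at protein concentration x.

    bmax: maximal specific binding; kd: concentration giving half-maximal
    binding; h: Hill slope. x and kd must share the same unit.
    """
    return bmax * x ** h / (kd ** h + x ** h)


def sum_squared_error(concs, observed, bmax, kd, h):
    """Sum of squared residuals between observed data and the model."""
    return sum(
        (obs - specific_binding_hill(x, bmax, kd, h)) ** 2
        for x, obs in zip(concs, observed)
    )
```

By construction, the model returns half of Bmax when the protein concentration equals KD, whatever the Hill slope.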
X-ray crystallography data were processed with XDS and AIMLESS (Kabsch, 2010), using the SSRL autoxds script (A. Gonzales, Stanford SSRL). Experimental phase information for AcNPV, PrGV, and D. plexippus poxin proteins was determined using data collected from SeMet-substituted crystals. For SeMet-labeled AcNPV poxin, PrGV poxin, and D. plexippus poxin, heavy-atom sites were identified using HySS in PHENIX (Adams et al., 2010), initial maps were produced using SOLVE/RESOLVE (Terwilliger, 1999), and model-building was carried out in Coot (Emsley and Cowtan, 2004). For AcNPV poxin, a phase solution could only be found using data processed into the space group I4₁, with 20 sites identified using HySS. Following initial manual building in Coot, a partial unrefined model was used as a molecular replacement search model, which yielded a solution in the space group P1 with 16 AcNPV poxin copies in the asymmetric unit arranged as a double-helical filament. Analysis of data pathologies after processing into the space group P1 using Xtriage within PHENIX showed a multivariate Z-score of 5.243, indicating twinning, and successful refinement was carried out in PHENIX with the twin operator -l,-h,h+k+l. Using SeMet-labeled PrGV poxin crystals, eight sites were identified using HySS, and an initial map was calculated as above followed by model-building and refinement in PHENIX. For D. plexippus poxin, HySS detected 10 sites, allowing calculation of an initial map, model-building in Coot, and refinement in PHENIX. The D. plexippus poxin structure was subsequently used as a molecular replacement search model to determine an initial map of the related T. ni poxin (50% identical), followed by model-building in Coot and refinement in PHENIX. All structure figures were rendered using PyMOL (version 2.3.3).

Dali structural homology analysis
In order to compare poxin structural homology to proteins in the Protein Data Bank, poxin monomer structures were uploaded and used to query the DALI server (Holm, 2019). Z-scores for homologs less than 90% identical to one another (PDB90) were then plotted using GraphPad Prism to compare the distribution and overall level of homology detectable between each poxin structure and proteins in the Protein Data Bank (Figure 3A,B). Hits identified for each poxin protein in the PDB were sorted into 'protease' or 'other' groups using a PDB advanced search. The search was constructed by finding overlaps between the PDB IDs identified with DALI for each poxin and those identified using the following terms: Enzyme classification names equaling 'Serine endopeptidases' or 'Cysteine endopeptidases' or 'trypsin' or 'chymotrypsin' or 'enteropeptidase', OR Annotation name - CATH equaling 'trypsin-like serine proteases', OR Structure title containing the phrases 'protease' or 'Peptidase', OR Macromolecule name containing the phrases 'protease' or 'peptidase', OR Annotation identifier - CATH equaling '2.40.10.120'. In order to compare the level of homology between T. ni poxin and eukaryotic or viral proteases by phylogenetic group, the list of homolog PDB codes returned by DALI after query with T. ni poxin was used to perform a Protein Data Bank advanced search, and filtered by phylogeny as stated in the figure (Figure 3B). PDB codes assigned to these phylogenetic groups were then plotted using GraphPad Prism to compare the global level of homology between T. ni poxin and proteases from different groups (Figure 3B). VACV poxin was the top hit in all DALI searches with new poxin structures, with a Z-score of 18.1 for AcNPV, 15.7 for PrGV, and 16.9 for T. ni, but was excluded to allow analysis of distant protease homology.
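The sorting of DALI hits into 'protease' and 'other' groups amounts to a set overlap, which can be sketched as follows. This is an illustrative helper rather than the authors' procedure: the function name is hypothetical, the hit keys are assumed to begin with a four-character PDB ID (with an optional chain suffix), and keeping the highest Z-score per PDB ID is an added assumption.

```python
def split_dali_hits(dali_z, protease_ids, exclude=()):
    """Partition DALI hits by overlap with a set of protease PDB IDs.

    dali_z: dict mapping a hit label (assumed to start with a 4-character
            PDB ID, e.g. 'xxxx-A') to its Z-score.
    protease_ids: PDB IDs returned by the protease keyword search.
    exclude: PDB IDs to drop entirely (e.g. a known poxin structure).
    Returns (protease, other), each a dict of PDB ID -> best Z-score.
    """
    protease_ids = {p.upper() for p in protease_ids}
    exclude = {e.upper() for e in exclude}
    protease, other = {}, {}
    for hit, z in dali_z.items():
        pdb_id = hit[:4].upper()
        if pdb_id in exclude:
            continue
        group = protease if pdb_id in protease_ids else other
        # Keep the highest Z-score when several chains share a PDB ID.
        group[pdb_id] = max(z, group.get(pdb_id, z))
    return protease, other
```

Passing `exclude={"6EA9"}` mirrors the removal of VACV poxin from the comparison described above.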

Edman degradation
C-terminally 6×His-tagged GFP fragments resulting from amarillovirus poxin self-cleavage were subjected to Edman degradation for cleavage site identification. GFP fragments were produced at large scale and purified using Ni-NTA and S75 size-exclusion chromatography as described above. The cleaved GFP fragments were then resolved on a 15% SDS-PAGE gel, transferred to a PVDF membrane (Bio-Rad), and stained with Coomassie G-250 (VWR). Bands corresponding to the cleaved proteins were excised from the membrane and submitted for five cycles of Edman degradation at the Tufts University Core Facility. The following assignments were made by the facility: SLV2 (S, T, P, R, R) and XCV (S, T, No Call, S, K), each of which corresponded exactly to only one site within the SLV2 and XCV GFP fusion constructs used, marked in Figure 5F.
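Checking that a five-cycle Edman read maps to a single site in a construct can be sketched as a pattern search in which a 'No Call' cycle matches any residue. The function name and the toy sequences in the usage note are hypothetical and are not the SLV2 or XCV constructs.

```python
import re


def find_edman_sites(construct, calls):
    """Return 0-based start positions where the Edman calls match.

    calls: one-letter residues in cycle order, with None for a 'No Call'
    cycle (treated as a wildcard). A lookahead is used so that overlapping
    matches are also reported.
    """
    pattern = "".join("." if c is None else re.escape(c) for c in calls)
    return [m.start() for m in re.finditer("(?=" + pattern + ")", construct)]
```

For example, `find_edman_sites("MKTAYSTPRRGG", ["S", "T", "P", "R", "R"])` returns a single position for this toy sequence; a list with exactly one element indicates that the read identifies a unique candidate N-terminus, and the cleavage site lies immediately before that residue.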