Control of bacterial immune signaling by a WYL domain transcription factor

ABSTRACT
Bacteria use diverse immune systems to defend themselves from ubiquitous viruses termed bacteriophages (phages). Many anti-phage systems function by abortive infection to kill a phage-infected cell, raising the question of how they are regulated to avoid cell killing outside the context of infection. Here, we identify a transcription factor associated with the widespread CBASS bacterial immune system, which we term CapW. CapW forms a homodimer and binds a palindromic DNA sequence in the CBASS promoter region. Two crystal structures of CapW suggest that the protein switches from an unliganded, DNA binding-competent state to a ligand-bound state unable to bind DNA. We show that CapW strongly represses CBASS gene expression in uninfected cells, and that phage infection causes increased CBASS expression in a CapW-dependent manner. Unexpectedly, this CapW-dependent increase in CBASS expression is not required for robust anti-phage activity, suggesting that CapW may mediate CBASS activation and cell death in response to a signal other than phage infection. Our results parallel concurrent reports on the structure and activity of BrxR, a transcription factor associated with the BREX anti-phage system, suggesting that CapW and BrxR are members of a family of universal defense signaling proteins.


INTRODUCTION
The constant evolutionary arms race between bacteria and the viruses that infect them, called bacteriophages (phages), has resulted in the evolution of a broad array of bacterial immune systems. These include the well-characterized restriction-modification and CRISPR-Cas systems, but also less well-understood systems like BREX (Bacteriophage Exclusion) (1,2), CBASS (Cyclic Oligonucleotide-Based Anti-Phage Signaling System) (3,4), and others (5,6). Many bacterial immune systems, including CBASS, act by a so-called abortive infection mechanism and kill an infected cell to avoid phage reproduction (3,7). Because of their destructive power, abortive infection systems must be exquisitely tuned to avoid activation in an uninfected cell, but activate rapidly and reliably upon infection.
CBASS immune systems are widespread and extremely diverse, with over 6200 distinct systems identified to date across bacteria and archaea (3,8). CBASS systems show diversity in both their activation mechanisms and their cell-killing mechanisms. All CBASS systems encode a cGAS/DncV-like nucleotidyltransferase (CD-NTase; here termed cGAS) that synthesizes a cyclic di- or trinucleotide second messenger molecule (8,9), and a second messenger-activated effector protein that mediates cell death to abort the viral infection. Type I CBASS systems encode only these two proteins, suggesting that their cGAS enzyme has an innate infection-sensing capability (10). The recent discovery of Type I CBASS systems that encode cGAS and an effector related to eukaryotic STING (STimulator of Interferon Genes) suggests that the mammalian cGAS-STING innate immune pathway evolved from these systems (11).
The majority of bacterial CBASS systems encode putative regulators that likely provide an additional level of control over their activation. Type II CBASS systems encode enzymes related to eukaryotic ubiquitin-transfer machinery, whose roles in signaling remain unknown (4,10). In Type III CBASS systems, peptide-binding HORMA domain proteins and a AAA+ ATPase, Trip13, regulate cGAS activation through HORMA-peptide binding (7). These regulators are thought to represent the evolutionary precursors of the diverse HORMA domain signaling protein family in eukaryotes, which regulate key cell-cycle checkpoints, DNA repair and recombination in mitotic and meiotic cells, and autophagy signaling (7,12).
The diversity of bacterial CBASS systems suggests that additional mechanisms of phage infection sensing and signaling regulation in these systems remain to be discovered. Here, we identify a novel transcription factor, CapW, that is associated with hundreds of distinct CBASS systems. We show that CapW is a transcriptional repressor that binds the promoter region of its cognate CBASS operon to inhibit expression of CBASS genes. Two structures of CapW from different bacteria reveal a dimeric assembly that likely binds a small-molecule ligand to control DNA binding and transcriptional repression. Structure-based mutagenesis of CapW reveals that the protein's putative ligand-binding WYL domain is required for a phage infection-dependent increase in CBASS expression, but that this increase is unexpectedly not required for robust anti-phage activity. These data suggest that CapW may mediate increased CBASS expression, and concomitant host-cell toxicity, in response to a signal other than phage infection. Parallel discovery of CapW-like transcription factors in other bacterial immune systems including BREX (13,14) suggests that CapW and its relatives make up a family of ligand-responsive transcriptional switches, which directly sense stress signals and activate expression of diverse immune systems.

Bioinformatics
To comprehensively search CBASS systems for homologs of Escherichia coli upec-117 CapW (NCBI #WP_001534693.1), we exported the genomic DNA sequence ±10 kb of 6233 previously reported CD-NTases (3) using the Integrated Microbial Genomes (IMG) database at the DOE Joint Genome Institute (https://img.jgi.doe.gov). We used NCBI Genome Workbench (https://www.ncbi.nlm.nih.gov/tools/gbench/) to perform custom TBLASTN searches for proteins related to E. coli upec-117 CapW. CBASS system type and effector assignments for each hit were taken from Cohen et al. (3) and manually updated. Each hit was manually inspected to confirm that CapW was specifically associated with CBASS rather than with a neighboring operon.
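The ±10 kb export described above amounts to simple coordinate arithmetic around each CD-NTase gene. The following is a minimal sketch of that step, not the pipeline actually used with IMG; the function name `flanking_window`, the 1-based inclusive coordinate convention, and the clamping to contig ends are all assumptions made for illustration.

```python
from typing import Tuple


def flanking_window(gene_start: int, gene_end: int, contig_length: int,
                    flank: int = 10_000) -> Tuple[int, int]:
    """Return the 1-based inclusive (start, end) of a gene plus `flank` bp
    on each side, clamped to the ends of the contig.

    Hypothetical helper: coordinate convention and clamping are assumptions.
    """
    if not (1 <= gene_start <= gene_end <= contig_length):
        raise ValueError("gene coordinates must lie within the contig")
    start = max(1, gene_start - flank)           # do not run off the 5' end
    end = min(contig_length, gene_end + flank)   # do not run off the 3' end
    return start, end
```

A gene close to a contig boundary simply yields a shorter window on that side, which is one reason each hit still benefits from manual inspection.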
Proteins were expressed in E. coli strain Rosetta 2 (DE3) pLysS (EMD Millipore). Cultures were grown at 37°C to A600 = 0.6, then induced with 0.25 mM IPTG and shifted to 20°C for 16 h. Cells were harvested by centrifugation and resuspended in buffer A (25 mM Tris pH 8.5, 10% glycerol and 1 mM NaN3) plus 300 mM NaCl, 5 mM imidazole, and 5 mM β-mercaptoethanol. Proteins were purified by Ni2+-affinity chromatography (Ni-NTA agarose, Qiagen), then passed over an anion-exchange column (HiTrap Q HP, Cytiva) in buffer A plus 100 mM NaCl, 5 mM imidazole, and 5 mM β-mercaptoethanol, collecting flow-through fractions. Tags were cleaved with TEV protease (15), and cleaved protein was passed over another Ni2+ column (collecting flow-through fractions) to remove uncleaved protein, cleaved tags, and tagged TEV protease. The protein was passed over a size exclusion column (Superdex 200, Cytiva) in buffer GF (buffer A plus 300 mM NaCl and 1 mM dithiothreitol (DTT)), then concentrated by ultrafiltration (Amicon Ultra, EMD Millipore) to 10 mg/ml and stored at 4°C. All point mutants migrated equivalently to wild type on the size exclusion column. For selenomethionine derivatization, protein expression was carried out in M9 minimal media supplemented with amino acids plus selenomethionine prior to IPTG induction (16).
For characterization of oligomeric state by size exclusion chromatography coupled to multi-angle light scattering (SEC-MALS), 100 µl of purified P. aeruginosa PA17 CapW, S. maltophilia C11 CapW, or E. coli upec-117 CapW at 5 mg/ml was injected onto a Superdex 200 Increase 10/300 GL column (Cytiva) in buffer GF. Light scattering and refractive index profiles were collected by miniDAWN TREOS and Optilab T-rEX detectors (Wyatt Technology), respectively, and molecular weight was calculated using ASTRA v. 6 software (Wyatt Technology).

Crystallization and structure determination
For crystallization of P. aeruginosa PA17 CapW, selenomethionine-derivatized protein in buffer GF (9 mg/ml) was mixed 1:1 with well solution containing 0.27 M LiSO4, 1% PEG 400, and 0.1 M sodium acetate (pH 5.0) in hanging-drop format. Crystals were cryoprotected by the addition of 30% glycerol, and flash-frozen in liquid nitrogen. Diffraction data were collected at the Advanced Photon Source NE-CAT beamline 24ID-E (see support statement below) and processed with the RAPD data-processing pipeline (https://github.com/RAPD/RAPD), which uses XDS (17) for data indexing and reduction, AIMLESS (18) for scaling, and TRUNCATE (19) for conversion to structure factors. We determined the structure by single-wavelength anomalous diffraction methods in the PHENIX Autosol wizard (20). We manually rebuilt the initial model in COOT (21), and refined in phenix.refine (22) using positional and individual B-factor refinement (Supplementary Table S3).
For crystallization of S. maltophilia C11 CapW, protein in a buffer containing 20 mM Tris pH 8.5, 1 mM DTT and 100 mM NaCl (14 mg/ml) was mixed 2:1 with well solution containing 0.1 M Tris pH 8.5 and 1.5 M lithium sulfate in sitting-drop format. Crystals were cryoprotected by the addition of 24% glycerol and flash-frozen in liquid nitrogen. Diffraction data were collected at the Advanced Light Source BCSB beamline 5.0.2 (see support statement below) and processed with the DIALS data-processing pipeline (https://dials.github.io) (23). We determined the structure by molecular replacement in PHASER (24), using individual wHTH and WYL domain structures from P. aeruginosa PA17 CapW as search models. We manually rebuilt the initial model in COOT (21), and refined in phenix.refine (22) using positional and individual B-factor refinement (Supplementary Table S3).
Nucleic Acids Research, 2022, Vol. 50, No. 9

DNA binding
For electrophoretic mobility shift DNA-binding assays (EMSA), the S. maltophilia C11 CBASS promoter region was amplified via PCR with one primer 5′-labeled with 6-carboxyfluorescein (5′-6-FAM), followed by gel purification (Macherey-Nagel Nucleospin). For the EMSA control, a random sequence from within the S. maltophilia C11 CapW gene that was similar in length and GC content to the S. maltophilia C11 promoter region was amplified and gel-purified in the same way. Reactions (10 µl) with 100 nM DNA and indicated concentrations of protein were prepared in a buffer containing 50 mM Tris-HCl pH 8.5, 50 mM NaCl, 5 mM MgCl2, 5% glycerol and 1 mM DTT. After a 1 h incubation at room temperature, reactions were loaded onto 2% TBE-agarose gels in 0.5× TBE pH 8.5 running buffer, run for 2 h at 60 V at 4°C, and imaged using a Bio-Rad ChemiDoc system (Cy2 filter settings).
For DNA binding fluorescence polarization (FP) assays, a 30 bp double-stranded DNA was produced by annealing complementary oligos, one of which was 5′-6-FAM labeled. Binding reactions (30 µl) contained 25 mM Tris pH 8.5, 50 mM NaCl, 5% glycerol, 5 mM MgCl2, 1 mM DTT, 0.01% Nonidet P-40 substitute, 50 nM DNA, and the indicated amounts of protein. After a 10 min incubation at room temperature, fluorescence polarization was read using a Tecan Infinite M1000 PRO fluorescence plate reader, and binding data were analyzed with GraphPad Prism v.9.2.0 using a single-site binding model.
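For readers who want to see what a single-site binding fit involves, the sketch below fits FP = baseline + span × [P]/(Kd + [P]) to a titration. It is an illustrative stand-in, not the Prism analysis used here: the baseline term, the function names, and the log-grid search over Kd are assumptions, and the simple hyperbola ignores depletion of free protein by the 50 nM DNA probe.

```python
from typing import Sequence, Tuple


def fraction_bound(protein_nM: float, kd_nM: float) -> float:
    """Single-site isotherm: fraction of DNA bound at a given protein concentration."""
    return protein_nM / (kd_nM + protein_nM)


def _fit_baseline_and_span(x: Sequence[float], y: Sequence[float],
                           kd_nM: float) -> Tuple[float, float, float]:
    """For a fixed Kd the model is linear in (baseline, span), so solve it
    by ordinary least squares and return (baseline, span, squared error)."""
    f = [fraction_bound(xi, kd_nM) for xi in x]
    n = len(f)
    mean_f = sum(f) / n
    mean_y = sum(y) / n
    s_ff = sum((fi - mean_f) ** 2 for fi in f)
    s_fy = sum((fi - mean_f) * (yi - mean_y) for fi, yi in zip(f, y))
    span = s_fy / s_ff
    baseline = mean_y - span * mean_f
    sse = sum((yi - (baseline + span * fi)) ** 2 for fi, yi in zip(f, y))
    return baseline, span, sse


def fit_single_site(x: Sequence[float], y: Sequence[float],
                    kd_min: float = 0.1, kd_max: float = 1e5,
                    steps: int = 600) -> Tuple[float, float, float]:
    """Scan Kd on a logarithmic grid (nM) and return (Kd, baseline, span)
    for the grid point with the smallest squared error."""
    best = None
    for i in range(steps + 1):
        kd = kd_min * (kd_max / kd_min) ** (i / steps)
        baseline, span, sse = _fit_baseline_and_span(x, y, kd)
        if best is None or sse < best[3]:
            best = (kd, baseline, span, sse)
    return best[0], best[1], best[2]
```

Calling `fit_single_site(concentrations_nM, polarization_values)` returns the best-fitting Kd on the grid together with the unbound baseline and the signal span. A grid search is used only to keep the sketch dependency-free; a nonlinear least-squares routine would normally refine Kd and report confidence intervals.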

GFP reporter assays
To generate a GFP reporter plasmid, a DNA sequence encoding full-length E. coli upec-117 CapW and its promoter, adjacent to a gene encoding super-folder GFP, was synthesized (IDT) and cloned via isothermal assembly into pBR322. Point mutations were generated by PCR-based mutagenesis. Vectors were transformed into E. coli strain JP313 (25). 100 µl of saturated overnight culture was added to 5 ml LB broth plus ampicillin and grown at 37°C to an OD600 of 0.4-0.5. 500 µl of culture was pelleted by centrifugation and resuspended in 100 µl of 2× SDS-PAGE loading buffer (125 mM Tris pH 6.8, 20% glycerol, 4% SDS, 200 mM DTT, 180 µM bromophenol blue). Samples were boiled for 5 min, then 1-10 µl was loaded onto an SDS-PAGE gel. Proteins were transferred to a PVDF membrane (Bio-Rad Trans-Blot Turbo), then the membranes were blocked with 5% nonfat dry milk and blotted for appropriate proteins. Blots were imaged using a Bio-Rad ChemiDoc system using filters to image horseradish peroxidase activity. Antibodies used: Mouse anti-GFP primary antibody (Roche) at 1:3000 dilution; Mouse anti-FLAG primary antibody (Sigma-Aldrich) at 1:3000 dilution; Mouse anti-RNA polymerase primary antibody (clone NT63; BioLegend #10019-878) at 1:3000 dilution; Goat anti-mouse HRP-linked secondary antibody (Millipore Sigma) at 1:30,000 dilution.
To measure reporter responses to phage infection, GFP reporter plasmids were transformed into E. coli strain JP313 (25). Overnight cultures were diluted into fresh LB media with ampicillin and grown to an OD600 of 0.25-0.35. Bacterial cultures were infected with bacteriophage λ cI- diluted in phage buffer (150 mM NaCl, 40 mM Tris pH 7.5, 10 mM MgSO4) plus 1 mM CaCl2 at a multiplicity of infection of 10. Cultures were incubated at 30°C and timepoints were taken immediately after infection (0 min), and 30, 60, 90 and 120 min post infection. Samples were prepared and analyzed by western blot as above.
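A multiplicity of infection of 10 means ten phage particles per bacterial cell, so the volume of phage stock to add follows from the cell count and the stock titer. The sketch below shows the arithmetic only. The conversion of 8 × 10^8 cells/ml per OD600 unit is a common rule of thumb for E. coli and is an assumption here, as are the function name and any titer passed in; none of these values come from this study.

```python
def phage_volume_ml(moi: float, od600: float, culture_ml: float,
                    titer_pfu_per_ml: float,
                    cells_per_ml_per_od: float = 8e8) -> float:
    """Volume of phage stock (ml) needed to reach a target MOI.

    cells_per_ml_per_od is an assumed calibration (rule of thumb for
    E. coli); it should be measured for the strain and instrument in use.
    """
    if titer_pfu_per_ml <= 0:
        raise ValueError("phage titer must be positive")
    total_cells = od600 * cells_per_ml_per_od * culture_ml
    phage_needed = moi * total_cells          # MOI = phage particles per cell
    return phage_needed / titer_pfu_per_ml
```

For example, `phage_volume_ml(10, 0.3, 5, 1e11)` gives the stock volume for a 5 ml culture at OD600 0.3 with a hypothetical 10^11 pfu/ml lysate.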

APS NE-CAT support statement
This work is based upon research conducted at the Northeastern Collaborative Access Team beamlines, which are funded by the National Institute of General Medical Sciences from the National Institutes of Health (P30 GM124165). The Eiger 16M detector on the 24-ID-E beam line is funded by a NIH-ORIP HEI grant (S10OD021527).

Identification of a WYL-family transcription factor associated with CBASS systems
To identify novel regulators of anti-phage signaling in CBASS systems, we manually inspected a set of Type III CBASS systems encoded by different E. coli isolates. We identified an uncharacterized gene in a CBASS system from E. coli strain upec-117, which is predicted to possess both a winged helix-turn-helix (wHTH) DNA binding domain and a WYL domain, named for a conserved three-amino acid motif (tryptophan-tyrosine-leucine) in this domain (26). We named this gene capW (CBASS-associated protein with WYL domain) (Figure 1A). To identify other CBASS systems with capW, we searched for homologs of E. coli upec-117 capW within 10 kb of 6233 previously identified CBASS systems in diverse bacteria and archaea (3), and identified 160 CBASS systems encoding this protein (Supplementary Table S1). In these systems, the capW gene is consistently found upstream of the core CBASS genes and encoded on the opposite strand (Figure 1A). CapW is associated with all major CBASS types, including Type I (40 systems), Type II (59 systems), and Type III (56 systems) (Figure 1B). These systems encode diverse putative effector proteins, including the phospholipase CapV, endonucleases Cap4 and NucC, and transmembrane proteins Cap14 and Cap15 (Figure 1A). The E. coli upec-117 CBASS system encodes three uncharacterized genes: a predicted effector similar to MTA/SAH-family nucleoside phosphorylases (10) that we named Cap17; an uncharacterized predicted 3′-5′ exonuclease that we named Cap18; and a predicted three-transmembrane helix protein that we named Cap19 (Supplementary Table S2).
In other proteins, the WYL domain adopts an SH3 β-barrel fold and has been proposed to function in ligand binding in multiple contexts, including in the regulation of bacterial immunity as part of CRISPR/Cas systems and other immune systems (26,27). This family includes WYL1, a dimeric WYL domain-containing protein that binds single-stranded RNA and positively regulates Cas13d in a Type VI-D CRISPR-Cas system (28,29). The largest family of bacterial WYL domain-containing proteins shares CapW's domain structure, with an N-terminal wHTH domain, a central WYL domain, and a conserved C-terminal domain termed WCX (WYL C-terminal extension). The structural mechanisms of these proteins, and their roles in bacterial signaling, are largely unknown. The most well-characterized members of this family are PafB and PafC, which together regulate the DNA damage response in mycobacteria (30,31). A recent structure of a naturally fused PafBC protein from Arthrobacter aurescens in the absence of bound ligand or DNA (32) revealed an asymmetric overall structure, leaving unanswered the question of how these proteins' DNA binding propensity may be regulated by ligand binding. Overall, our bioinformatics data suggest that CapW is a ligand-responsive transcription factor that may regulate expression of its associated CBASS operon in response to phage infection. Moreover, CapW represents a large family of uncharacterized bacterial transcription factors involved in diverse signaling pathways.

CapW specifically binds the CBASS promoter region
To determine whether CapW is a transcription factor that controls CBASS expression, we first purified the protein and tested its binding to the shared promoter region between the CBASS core genes and CapW. We purified three CapW proteins, from E. coli upec-117 (Ec CapW), Stenotrophomonas maltophilia C11 (Sm CapW; 41% identical to Ec CapW), and Pseudomonas aeruginosa PA17 (Pa CapW; 64% identical to Ec CapW). All three proteins are associated with Type III CBASS systems (Figure 1A). Using an electrophoretic mobility shift assay (EMSA), we found that Sm CapW robustly binds to its cognate promoter region (Figure 2A). As Sm CapW is a homodimer, we searched its promoter for palindromic sequences that would represent a likely binding site for a two-fold symmetric CapW dimer. We identified two imperfect palindromes within the S. maltophilia C11 CBASS promoter region (Figure 2B), and used a fluorescence polarization assay to demonstrate specific binding to a 24-bp sequence that overlaps the promoter's −10 site (Figure 2E). Next, we searched the promoter regions of E. coli upec-117 and P. aeruginosa PA17 CBASS for similarly positioned palindromic sequences. We identified a 21-bp imperfect palindrome in the promoter of E. coli upec-117 CBASS, and a 19-bp imperfect palindrome in the promoter of P. aeruginosa PA17 CBASS, both of which overlap their promoters' −10 sites (Figure 2C, D). Using fluorescence polarization, we found that Ec CapW and Pa CapW specifically bind these sequences (Figure 2F, G). Based on these data, we conclude that CBASS-associated CapW proteins bind palindromic sequences that overlap the −10 sites within the promoters of their cognate CBASS operons. Because the −10 site represents a key binding site for RNA polymerase and σ-factors (33), and is also where promoter melting occurs, this finding suggests that CapW acts as a transcriptional repressor by interfering with RNA polymerase-promoter binding or formation of an active transcription complex.
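The search for imperfect palindromes described above can be illustrated with a short scan that compares each window of a promoter with its own reverse complement and tolerates a few mismatched base pairs. This is a hedged sketch rather than the procedure actually used: the function names, the fixed window length, and the mismatch threshold are assumptions, and any sequence passed in for testing is hypothetical rather than a real CBASS promoter.

```python
from typing import List, Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def palindrome_mismatches(site: str) -> int:
    """Count symmetric position pairs that are not complementary.

    Each pair is counted once; the central base of an odd-length site
    is ignored because it has no partner.
    """
    site = site.upper()
    rc = reverse_complement(site)
    half = len(site) // 2
    return sum(1 for a, b in zip(site[:half], rc[:half]) if a != b)


def find_imperfect_palindromes(seq: str, length: int,
                               max_mismatches: int) -> List[Tuple[int, str, int]]:
    """Return (0-based start, site, mismatches) for every window of the given
    length with at most `max_mismatches` non-complementary pairs."""
    seq = seq.upper()
    hits = []
    for start in range(len(seq) - length + 1):
        site = seq[start:start + length]
        mismatches = palindrome_mismatches(site)
        if mismatches <= max_mismatches:
            hits.append((start, site, mismatches))
    return hits
```

In practice one would scan a range of lengths (for example 19 to 24 bp, matching the sites reported here) and then check which candidates overlap the −10 element.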

Structure and DNA binding mechanism of CapW
We next crystallized and determined the structure of Sm CapW to 1.89 Å resolution (Supplementary Table S3; Figure 3A). The CapW homodimer adopts a distinctive domain-swapped overall architecture with the N-terminal wHTH domains adjacent to one another, followed by extended linkers reaching across the dimer such that each protomer's WYL domain interacts primarily with the wHTH domain of the opposite protomer (Figure 3A). The C-terminal WCX domain adopts an extended α-β fold, and each protomer's WCX domain reaches back across the top of the dimer to interact with the WYL domain of the opposite protomer (Figure 3A). The two WCX domains define a groove across the top of the CapW dimer that extends between the putative ligand-binding sites of each WYL domain (see next section).
The overall architecture of CapW is equivalent to that of another recently-discovered bacterial defense-associated transcription factor, BrxR. Two parallel studies report the discovery of BrxR as a regulator of BREX anti-phage systems, and determine structures of the protein from two different bacteria (13,14). These structures reveal that BrxR and CapW share a common domain organization and overall domain-swapped architecture, with wHTH domains on one face of the dimer and a putative ligand-binding surface on the opposite face that is made up of WYL and WCX domains.
In DNA-free structures of both CapW and BrxR, the wHTH domains of the two protomers are positioned adjacent to one another, with their DNA-binding surfaces aligned on one face of the dimer. We modelled a DNA-bound structure of CapW based on a structure of Acinetobacter BrxR bound to its cognate palindromic DNA sequence (13), revealing that the two wHTH domains in the CapW dimer are perfectly aligned to bind a palindromic DNA sequence ∼20 base pairs in length (Figure 3B), close to the length of palindromic sequences we identified as CapW binding sites (Figure 2B). Based on our model of DNA-bound CapW, we designed two mutants to disrupt DNA binding: a single Arg32 to alanine mutant (Sm CapW R32A), and a triple mutant with Ser42, Gln45, and Ser47 all mutated to alanine (Sm CapW SQS-AAA) (Figure 3C). While Arg32 is highly conserved in CapW but not in BrxR, Ser42/Gln45/Ser47 are conserved in BrxR and all three residues are directly involved in DNA binding (13). In Sm CapW, both CapW R32A and CapW SQS-AAA eliminated detectable binding of the protein to its binding site in the CBASS promoter (Figure 3D). Together, these data suggest that our structure of Sm CapW represents a DNA-binding competent conformation of CapW.

The structure of P. aeruginosa PA17 CapW reveals a non-DNA binding conformation
The central WYL domain of CapW adopts an Sm-type SH3 β-barrel fold similar to bacterial Hfq (host factor for RNA bacteriophage Qβ replication) proteins, which bind small RNAs (34,35). Other WYL domain-containing proteins have been shown to bind single-stranded RNA (29) or DNA (36), suggesting that the CapW WYL domain may also bind nucleic acids and/or a small-molecule ligand. The WYL domain is named for a set of three highly conserved amino acids, tryptophan-tyrosine-leucine, located on one of the domain's β-strands. In CapW, the tryptophan residue is highly conserved, while the typical tyrosine residue in WYL domains is replaced by a highly conserved histidine (His171 in Sm CapW; Supplementary Figure S2A, B). This histidine residue is solvent-exposed on the top face of the CapW dimer, and is surrounded by a cluster of highly conserved hydrophobic and polar residues including Tyr145, Ser147, Trp156, Arg169, Arg183, Phe185 and Arg189 (Sm CapW residue numbering; Figure 4A, C, Supplementary Figures S2A, B, S3). Comparing these residues to a sequence logo constructed from PFAM13280, which represents over 18,000 WYL-WCX domain proteins, we found that Sm CapW Tyr145 aligns with a highly conserved tyrosine in this family, and Arg183/Phe185/Arg189 are situated in a region that shows high conservation in the broader WYL domain family (Supplementary Figure S2B). Based on this conservation and on the previously identified ligand binding role for WYL domains (26,29,36), we propose that this conserved surface on the CapW WYL domain may bind a nucleic acid or small-molecule ligand. Notably, the two putative ligand-binding sites on CapW are situated near one another on the top face of the dimer at either end of a groove defined by the two WCX domains (Figure 4A). The positioning of these sites suggests that they may cooperate to bind an extended nucleic acid ligand.
A major question for the function of CapW and related transcription factors is how these proteins are regulated in order to sense and respond to bacteriophage infection. Based on the putative ligand-binding role for the WYL domain, a compelling model is that ligand binding induces a conformational change that alters the ability of CapW to bind DNA. We determined a crystal structure of a second CapW protein, Pa CapW, that sheds light on this potential mode of regulation. We crystallized and determined a 2.3 Å resolution crystal structure of Pa CapW in a low-pH condition (pH 5.0). The structure reveals that Pa CapW shares the same overall architecture as Sm CapW, forming a homodimer of protomers with wHTH, WYL, and WCX domains (Figure 4B). The symmetric arrangement of WYL domains linked by WCX domains is similar to Sm CapW, but in Pa CapW the WYL domains are each rotated ∼10° downward and inward toward the wHTH domains. This motion is accompanied by a slight widening of the groove that runs between the WYL domains' ligand-binding sites and is bordered by the two WCX domains (Figure 4B), and also by a rearrangement of the C-termini of each WCX domain. In Sm CapW, the WCX domain C-terminus forms two short α-helices with an intervening loop that folds along the top of the domain (Figure 4A). In Pa CapW, this region undergoes a domain swap to fold against the opposite protomer's WCX domain (Figure 4B). Finally, we observe that a salt bridge between a conserved arginine in the WYL domain and an aspartate on the WCX domain is broken in the Pa CapW structure, enabling the WCX domain to move out and away from the dimer-related WYL domain (Figure 4D).
The highly conserved tryptophan residue of the WYL domain family is positioned on the opposite face of the WYL domain compared to the putative ligand binding site, and in Sm CapW this residue (Trp170) packs against an α-helix in the extended wHTH-WYL domain linker (α4; Supplementary Figure S4A). In our structure of Pa CapW, the WYL domains pinch inward by ∼6 Å (Figure 4B), positioning the equivalent tryptophan residues (Trp181) and the nearby α5 helices too close to one another to accommodate the α4 helices in their original configuration. As a result, the α4 helix of each CapW protomer rotates downward away from the WYL domains (Supplementary Figure S4B). This change, and the motion of the WYL domains in general, in turn cause a striking ∼70° rotation of each wHTH domain compared to its position in Sm CapW (Figure 4B). Whereas the wHTH domains are aligned for cooperative DNA binding in our structure of Sm CapW, in Pa CapW they are completely misaligned and would be unable to bind a contiguous DNA sequence.
Our biochemical data show that Pa CapW binds DNA with an affinity equivalent to that of Sm CapW or Ec CapW, yet our structure of this protein reveals a conformation that is clearly unable to bind DNA. We propose that the Pa CapW crystal structure reveals a conformational state equivalent to that of ligand-bound CapW, perhaps induced by the low pH of the crystallization condition or by protein-protein packing interactions. Comparing the DNA-binding and non-DNA binding states of CapW reveals a key role for the WYL domain's conserved tryptophan residue in driving conformational changes between these two states.

CapW is a transcriptional repressor for CBASS
Our identification of CapW binding sites overlapping the −10 site of the CBASS promoter suggested that CapW binding may repress CBASS transcription. To test this idea, we generated an expression reporter system for Ec CapW. We first generated DNA-binding mutants of Ec CapW equivalent to Sm CapW R32A (Ec CapW R43A) and CapW SQS-AAA (S53A/Q56A/S58A), and found that both mutants eliminated detectable binding of Ec CapW to its palindromic site by fluorescence polarization (Figure 5A). We next constructed an expression reporter system with the E. coli upec-117 capW gene and CBASS promoter linked to a gene encoding GFP (Figure 5B). To track CapW expression directly in this system, we also fused a C-terminal FLAG tag to the capW gene. We measured both GFP and CapW-FLAG expression using Western blots, in the presence and absence of capW or a DNA-binding mutant. In the presence of wild-type capW, the levels of both GFP and CapW-FLAG were nearly undetectable by Western blotting (Figure 5B). In contrast, disrupting capW or eliminating DNA binding through the R43A or SQS-AAA mutants resulted in strong expression of GFP (Figure 5B). We observed a similar increase in expression of CapW-FLAG in both DNA-binding mutants, suggesting that CapW regulates its own transcription in addition to that of the core CBASS genes. We identified a promoter in the E. coli upec-117 CBASS operon that is oriented in the reverse direction compared to the promoter driving core CBASS expression (Supplementary Figure S5). This promoter likely drives expression of capW and the system's likely effector cap17, and our data suggest that CapW binding to the CBASS promoter region affects transcription in both directions. Overall, these data show that CapW is a strong transcriptional repressor, capable of repressing transcription bidirectionally from the CBASS promoter.
Our data suggest that in its unliganded state, CapW strongly represses CBASS transcription by binding the operon's promoter. To test whether this repression is released upon phage infection, potentially by ligand binding to the protein's WYL domain, we used our GFP reporter system to measure expression after infection with phage λ. With wild-type capW, GFP expression was undetectable in uninfected cells, but was detectable within 30 minutes of phage infection, and increased through 120 minutes post-infection (Figure 5D). We generated a series of point mutants to five polar and charged residues in the putative ligand-binding site of Ec CapW (based on the structure of Pa CapW; Figure 5C), and found that all of these mutants showed either no detectable expression increase after infection, or extremely low/delayed expression compared to the system encoding wild-type CapW (Figure 5D). The equivalent point mutants of Sm CapW show no loss of DNA binding affinity in vitro (Supplementary Figure S3E), supporting a model in which these mutants render CapW a constitutive transcriptional repressor. These data reveal a key role for the WYL domain's putative ligand binding site in CapW-dependent expression control. Our GFP reporter system suggests that CapW mediates an increase in CBASS expression upon phage infection. The E. coli upec-117 CBASS system shows strong protection against infection by a strain of phage λ that obligately undergoes the lytic infection cycle (λ cI-) (Figure 5E) (37). As with other tested CBASS systems, we found that this protection depends on the system's cGAS-like enzyme CdnC and its putative effector, the predicted MTA/SAH-family nucleoside phosphorylase Cap17 (Figure 5E). We found that a catalytic-dead mutant of this system's predicted 3′-5′ exonuclease (Cap18) did not affect phage protection, but that introduction of a stop codon into the gene encoding the uncharacterized transmembrane protein (Cap19) strongly affected phage protection (Figure 5E).
When we tested the CapW WYL domain mutants that constitutively repress expression in our GFP reporter system, we surprisingly found that these mutants protect against phage as effectively as the wild-type system (Figure 5E). These data suggest that the presumably low level of CBASS expression present in the CapW-repressed state is nonetheless sufficient for a robust anti-phage response.
Despite repeated attempts, we were unable to generate mutant E. coli upec-117 CBASS constructs that either disrupted capW or eliminated its DNA binding activity. After mutagenesis, clones containing these mutations consistently also showed large deletions of critical regions of the CBASS operon (not shown). Our inability to isolate these mutants in the context of a full CBASS system, when the same mutants are readily obtainable in our GFP reporter system, suggests that elimination of CapW-mediated repression and the resulting high-level expression of CBASS genes is likely toxic to host cells. This model is consistent with a parallel study on the related BrxR transcription factor, in which deletion of BrxR leads to increased expression of BREX genes and toxicity to host cells (13,14).

DISCUSSION
Bacterial CBASS immune systems are highly diverse, and an emerging theme in these systems is that they encode multiple redundant regulatory mechanisms in order to, for example, prevent anti-phage signaling and associated cell killing outside the context of infection. All CBASS systems encode a cGAS-like oligonucleotide cyclase that likely possesses an inherent phage-dependent activation mechanism (4,10). Type II, III and IV CBASS systems additionally encode regulators, many of which represent ancestral forms of important eukaryotic signaling proteins, that provide a second level of control over the activity of their cognate cGAS-like enzymes (7,10). Here, we identify a third mode of regulation in some CBASS systems: a WYL domain-containing transcription factor, CapW. We show that CapW strongly represses CBASS expression by binding the CBASS promoter region, and that repression can be released upon phage infection in a manner that depends on the putative ligand-binding site in the protein's WYL domain. Our two structures suggest that ligand binding to the protein's WYL domain causes a conformational change in the WYL and WCX domains that triggers a large-scale rotation of the wHTH domains to release the protein from DNA (Figure 6).
Figure 6. Model for CapW function in CBASS. (A) In an uninfected cell, the dimeric CapW transcription factor binds the promoter of its cognate CBASS system and maintains its expression at a low level. Upon phage infection, cGAS is activated to produce a second messenger signal that in turn activates the system's effector, killing the infected cell. (B) In response to a secondary stress signal that produces a small-molecule or nucleic acid ligand, CapW binds the ligand and undergoes a conformational change that releases the protein from DNA. The resulting loss of transcriptional repression causes high-level CBASS expression and associated cell death even in the absence of the system's primary phage infection signal.
Mutation of conserved residues in CapW's WYL domain eliminates CapW-dependent CBASS de-repression upon phage infection, yet paradoxically does not affect the anti-phage activity of CBASS. This observation, combined with our inability to delete capW in the E. coli upec-117 CBASS system, suggests a two-pronged model for CBASS action in CapW-containing systems. We propose that the system's primary mode of action responds directly to phage infection, through either cGAS's inherent phage-sensing activity or the action of known regulators, and kills the host cell to prevent phage replication (Figure 6A). This mode does not require high-level expression of CBASS proteins, as demonstrated by the robust anti-phage activity of systems containing CapW mutants lacking the ability to respond to phage infection (Figure 5D, E). In CBASS systems encoding CapW, we propose a secondary mode in which an unknown stress signal causes production of a CapW-binding ligand, releasing CapW from DNA to drive high-level expression of CBASS genes (Figure 6B). Since our data and prior studies suggest that high-level CBASS expression is inherently toxic to host cells (38), this secondary mode would also result in cell death even in the absence of the primary phage trigger(s) sensed by the system. Thus, CapW may enable a single CBASS system to respond directly to phage infection (through its primary mode of action) and also respond to other, as-yet unidentified stress signals. Phage infection does trigger CapW-mediated transcriptional de-repression (Figure 5D), with the strongest effect observed around the time host cells undergo phage-mediated lysis (39). An important future direction will be to isolate the molecular signals that mediate this de-repression, providing insight into the stress pathways that act on CapW's secondary signaling mode.
CapW shares a similar overall architecture with another recently identified transcription factor, BrxR, which controls the expression of BREX immune systems (13,14). Like CapW, BrxR binds BREX promoter sequences to repress expression in uninfected cells (13,14). Curiously, introduction of an early stop codon into Acinetobacter BrxR to eliminate protein production does not compromise the anti-phage activity of its cognate BREX system (13). Based on this observation, Luyten et al. suggest that BrxR may activate BREX as a 'second line of defense' by responding to a ligand produced by a stress pathway or product of other defense systems like CRISPR-Cas or restriction-modification (13). We propose that the allosteric mechanism we identify for ligand-induced conformational changes in CapW also applies to BrxR, and to the larger family of bacterial defense-associated WYL domain transcription factors.
CapW/BrxR-like transcription factors are associated with a variety of bacterial immune systems including CRISPR-Cas and restriction-modification systems (14), and all of these proteins share a conserved ligand-binding surface (Supplementary Figure S2C), suggesting that they may bind the same or similar ligands. Moreover, these proteins all share the conserved tryptophan residue after which the WYL domain is named, which we implicate in allosteric communication between WYL/WCX and wHTH domains in CapW.
Outside bacterial immune systems, the WYL domain and WYL-WCX domain pair likely play a range of signaling roles. For example, the transcription factor PafBC possesses a tandem array of wHTH, WYL, and WCX domains, and plays a role in regulating the response to DNA damage in mycobacteria (32). The Cas13d regulator WYL1 shares a similar domain structure with an N-terminal ribbon-helix-helix domain, a central WYL domain, and a C-terminal dimerization domain (28,29). Examining sequence conservation in PFAM13280, which includes over 18,000 bacterial proteins that possess the WYL-WCX domain pair embedded in a variety of protein scaffolds, reveals that the ligand-binding site is much more variable across this family than within the smaller group of proteins associated with immune systems (Supplementary Figure S2B). Thus, while immune system-associated WYL proteins likely bind a common ligand, other family members have likely evolved to bind a variety of ligands to control diverse signaling pathways. Key directions for future work will be to determine the range of ligands bound by diverse WYL proteins, how ligand binding is coupled to conformational changes within different protein scaffolds, and how these conformational changes are coupled to signaling.