Genetic evidence for a regulated cysteine protease catalytic triad in LegA7, a Legionella pneumophila protein that impinges on a stress response pathway

ABSTRACT Legionella pneumophila grows within membrane-bound vacuoles in phylogenetically diverse hosts. Intracellular growth requires the function of the Icm/Dot type-IVb secretion system, which translocates more than 300 proteins into host cells. A screen was performed to identify L. pneumophila proteins that stimulate mitogen-activated protein kinase (MAPK) activation, using Icm/Dot translocated proteins ectopically expressed in mammalian cells. In parallel, a second screen was performed to identify L. pneumophila proteins expressed in yeast that cause growth inhibition in MAPK pathway-stimulatory high-osmolarity medium. Both screens identified LegA7, a protein predicted to be a member of the bacterial cysteine protease family that has five carboxyl-terminal ankyrin repeats. Three conserved residues in the predicted catalytic triad of LegA7 were mutated. These mutations abolished the ability of LegA7 to inhibit yeast growth. To identify other residues important for LegA7 function, a generalizable selection strategy in yeast was devised to isolate mutants that have lost function and no longer cause growth inhibition on a high-osmolarity medium. Mutations were isolated in the two most amino-terminal of the ankyrin repeats, as well as in an inter-domain region located between the cysteine protease domain and the ankyrin repeats. These mutations were predicted by AlphaFold modeling to localize to the face opposite from the catalytic site, arguing that they interfere with the positive regulation of the catalytic activity. Based on our data, we present a model in which LegA7 harbors a cysteine protease domain whose function is modulated by an inter-domain region and the two adjacent ankyrin repeats.
IMPORTANCE Legionella pneumophila grows in a membrane-bound compartment in macrophages during disease. Construction of the compartment requires a dedicated secretion system that translocates virulence proteins into host cells. One of these proteins, LegA7, is shown to activate a stress response pathway in host cells called the mitogen-activated protein kinase (MAPK) pathway. The effects on the mammalian MAPK pathway were reconstructed in yeast, allowing the development of a strategy to identify the role of individual domains of LegA7. A domain similar to cysteine proteases is demonstrated to be critical for impinging on the MAPK pathway, and the catalytic activity of this domain is required for targeting this pathway. In addition, a conserved series of repeats, called ankyrin repeats, controls this activity. Data are provided that argue that the interaction of the ankyrin repeats with unknown targets probably results in activation of the cysteine protease domain.


INTRODUCTION
Legionella pneumophila is a Gram-negative, facultative intracellular bacterium and the causative agent of Legionnaires' disease (1,2), which presents with either atypical pneumonia or flu-like symptoms, such as cough, fever, and chills. Individuals who are immunocompromised, smoke, or have chronic lung disease are at increased risk for disease (3). Disease is caused by inhalation or aspiration of aerosolized bacteria from freshwater sources, which are then internalized by alveolar macrophages (4). It is thought that amoebae growing in freshwater sources, such as Acanthamoeba castellanii and Hartmannella vermiformis, are the reservoirs for the bacteria that cause disease in humans (5,6). Selection for disease, thus, is thought to occur entirely in nonmammalian hosts.
Inside host cells, L. pneumophila resides and replicates in a membrane-bound vacuole that avoids fusion with late endosomes and lysosomes (7,8). Both tubular endoplasmic reticulum (ER) and secretory vesicle-derived material are recruited to the surface of the Legionella-containing vacuole (LCV), resulting in a compartment surrounded by ER (7, 9-13). Central to L. pneumophila pathogenesis is the Icm/Dot type IVb secretion system, which is critical for the formation of the LCV (14,15). Each L. pneumophila isolate injects over 300 different bacterial proteins through this secretion system into host cells (16-19). The total number of these translocated effectors identified in the Legionella pangenome is staggering, as machine learning strategies have identified at least 18,000 such proteins among isolates from more than 80 Legionella species (20,21).
The roles of individual Icm/Dot translocated substrates (IDTS) in promoting replication and vacuole formation have been difficult to elucidate because individual deletions of most substrates from the L. pneumophila genome have no consequence for intracellular growth of the bacterium. This lack of defect is thought to be due to functional redundancy, such that multiple translocated substrates target parallel pathways in the host cell or are able to complete a pathway independently of each other (22). Recent work indicates that IDTS expansion and redundancy can be partly explained by selection for growth in multiple poorly related amoebal hosts (23) and by temporal overlap in the execution of individual protein activities during intracellular growth (24,25).
Previous work has shown that mitogen-activated protein kinases (MAPKs) are modulated during L. pneumophila infection. Upon L. pneumophila challenge in amoebae, a MAPK response interferes with L. pneumophila intracellular growth (26), consistent with an evolutionarily conserved pathway that controls L. pneumophila growth. In addition, L. pneumophila challenge of mammalian cells leads to a cytokine response via NF-κB and MAPK signaling (27,28). Finally, several IDTS have been shown to interfere with host cell protein synthesis and activate the MAPK response (28,29), indicating that MAPKs may be common targets of L. pneumophila virulence-associated proteins.
The MAPK family of serine/threonine kinases is involved in directing cellular responses to a diverse array of stimuli such as growth factors, mitogens, osmotic and oxidative stress, and inflammatory cytokines (30-32). Members regulate proliferation, cell division, differentiation, apoptosis, inflammation, growth, and gene expression (30-32). The mammalian MAP kinase family includes the extracellular signal-regulated kinases (ERK1 and ERK2), c-Jun NH2-terminal kinases (JNK1, JNK2, and JNK3), and p38 proteins (p38α, p38β, p38γ, and p38δ). MAPKs are activated through phosphorylation by specific MAPK kinases (MAP2Ks). The MAP2Ks are, in turn, activated by MAP2K kinases (MAP3Ks), which receive upstream signals such as growth factors binding their respective cell-surface receptors, or chemical or physical stresses in the extracellular environment (31). MAPKs are targets of numerous effectors translocated into host cells by diverse bacterial pathogens (33).
In the following study, complementary strategies were taken to identify L. pneumophila proteins that promote stresses associated with the MAPK response. A putative member of the bacterial cysteine protease family having carboxyl-terminal ankyrin repeats was identified. We provide evidence that the function of this effector requires both a conserved catalytic triad and the most amino-terminal of its ankyrin repeats.

Identification of Icm/Dot substrates that cause elevated phosphorylation of SAPK/JNK in mammalian cells.
Proteins had been previously identified that were able to activate an NF-κB-regulated promoter by expressing a bank of L. pneumophila Icm/Dot translocated substrate (IDTS) genes in mammalian cells (34). We took a similar approach to identify bacterial proteins that activate MAPKs, because a variety of studies show that MAPKs either modulate L. pneumophila intracellular growth or respond to specific IDTS (26-29). To identify IDTS that activate mammalian MAPKs, we constructed a bank of 257 known and putative IDTS genes fused to gfp in a mammalian expression vector, expanding our previously constructed library (Materials and Methods) (34). This bank was then transfected into HEK293T cells to identify L. pneumophila proteins that could activate MAPK cascades, as determined by increased phosphorylation of ERK, p38, or SAPK/JNK relative to the empty vector control, using phosphorylation-specific antibodies (Materials and Methods). As each of the MAPKs gave similar levels of activation in response to the IDTS, repetitions used only SAPK/JNK phosphorylation as the readout (Table 1).
A rank-order table was generated based on the median absolute deviation (MAD; Materials and Methods) of SAPK/JNK phosphorylation for each transfectant relative to the empty vector control, and 24 candidates were identified that caused increased levels of SAPK/JNK phosphorylation (MAD score > 3.5; Table 1). IDTS that resulted in enhanced SAPK/JNK phosphorylation included a number of members of the SidE family of IDTS that catalyze phosphoribosyl-linked ubiquitin modification of targets (13, 35-37), the previously characterized AnkX phosphocholine transferase (38), and the Rab5-activated phospholipase VipD (39-41). To focus on a subset of IDTS candidates that affected known MAPK-activated pathways, we performed a screen in yeast to study the MAPK-related response.

Expression of legA7 in yeast inhibits osmoadaptation.
MAPK activation in response to L. pneumophila occurs in highly diverse and evolutionarily unrelated cell types such as mouse macrophages and Dictyostelium discoideum amoebae (26-28).
Therefore, we sought to identify IDTS that either impinge on or synergize with the MAPK cascade across species boundaries. We previously constructed a bank of Saccharomyces cerevisiae strains harboring ectopically expressed IDTS under GAL1 promoter control (42), so this bank was used to identify candidate IDTS that modulate the ability of yeast to respond to stresses that activate the MAPK cascade, taking advantage of a simple plate assay. To this end, we characterized IDTS previously demonstrated to cause growth defects in S. cerevisiae when expressed in a medium containing galactose to induce expression (42). For most strains expressing IDTS, the defect was dependent on the presence of galactose in the growth medium, consistent with the growth defects being dependent on the transcriptional induction of each of the IDTS genes (Dataset S1; Fig. 1).
Based on the behavior of the strains plated on galactose-containing medium, many of the S. cerevisiae strains harboring IDTS either showed mild growth defects or still had detectable colony formation after serial dilution (Dataset S1; for a few examples, see Fig. 1). We predicted that some of the strains with intermediate phenotypes would have amplified defects on medium that activates a stress pathway requiring MAPK activity for viability. After induction of IDTS expression, such strains should show lower colony formation efficiency under stress conditions than on the same medium lacking the stress.
Wild-type yeast can adapt to a variety of environmental stresses, such as high osmolarity, allowing growth in the presence of high concentrations of a variety of solutes. Growth under these conditions depends on MAPK activation to provide osmoprotection (43). To address whether high osmolarity could potentiate S. cerevisiae growth defects caused by the expression of L. pneumophila IDTS, a screen was performed to identify osmosensitive strains (Dataset S1). Using the strains that showed depressed CFU formation after galactose induction, we screened for amplification of these defects by plating on a high-osmolarity medium containing sorbitol (44-47).
Strains were then retained that showed lower CFU efficiency on sorbitol-containing medium in the presence of galactose relative to the identical medium lacking sorbitol. The strain harboring a plasmid encoding legA7 (lpg0403), which is an uncharacterized IDTS (48), had the lowest colony-forming efficiency on high sorbitol relative to the screened pool (Dataset S1; Fig. 1). S. cerevisiae harboring legA7 showed a reduction in CFU on a medium containing high sorbitol compared to the empty vector control (Fig. 1). No strains harboring IDTS showed enhanced growth in the presence of sorbitol.

SAPK/JNK.
LegA7 had been identified as a potential translocated substrate based on the presence of predicted ankyrin repeats in its sequence (48). Consistent with its designation as an IDTS, the protein was shown to contain an Icm/Dot recognition signal that allows the translocation of reporter constructs into mammalian cells (49). Furthermore, transfection with a LegA7-expressing plasmid resulted in high MAPK activation levels (Table 1). To confirm that the screen accurately represented the level of activation by LegA7, HEK293T cells were transfected in triplicate with a plasmid encoding LegA7, as well as two other plasmids encoding IDTS that had caused some level of sorbitol sensitivity in yeast (Dataset S1; lpg0030 and lpg0059). Transfectants harboring the plasmid encoding LegA7 clearly showed enhanced phosphorylation of SAPK/JNK relative to the empty vector control (Fig. 2A; quantitated in Fig. 2B), although blotting of the GFP-IDTS fusions showed that the predominant form of the protein present at steady state was missing the carboxyl-terminal 20 kDa, based on apparent molecular weights determined from electrophoretic mobility (Fig. 2A, arrow). Therefore, LegA7 induced a MAPK response in mammalian cells in addition to its potentiation of osmotic stress in lower eukaryotes.

The effect of LegA7 on yeast is connected to MAPK pathways.
To explore the connection between LegA7 and MAPK-related stresses in yeast, we examined the effect of LegA7 on yeast growth under four stress conditions that are known to activate MAPK pathways. Like sorbitol, NaCl and cold temperature (20°C) activate the high-osmolarity glycerol (HOG) pathway that controls the osmoregulation response, whereas high temperature activates the cell-wall integrity (CWI) response pathway (50-52). The three stresses related to the HOG pathway resulted in severe growth inhibition by LegA7 (Fig. 3A; compare Gal. + Sorb, NaCl, or 20°C to Gal. in the absence of stress), indicating that growth inhibition by the effector is linked to activation of the HOG pathway. In contrast, high temperature (37°C) suppressed the yeast growth inhibition caused by LegA7 at 30°C (Fig. 3A; compare 37°C to standard growth conditions). To further explore these results, we examined the yeast growth inhibition caused by LegA7, using yeast deletion mutants in the HOG pathway (hog1 and pbs2) and the CWI pathway (mpk1 and bck1; Fig. 3B). At the standard yeast growth temperature (30°C, Glu), the four deletion mutants were indistinguishable from wild-type yeast. In contrast, at 37°C, the HOG pathway mutants (hog1 and pbs2) still suppressed the LegA7-induced growth defect, whereas the two CWI pathway mutants (mpk1 and bck1) failed to suppress this defect (Fig. 3C).
These results indicate that yeast must activate the CWI pathway to suppress the growth defect mediated by LegA7 at 37°C, and in the absence of a component from this pathway, there is a growth defect at high temperatures.

LegA7 shows sequence similarity to a family of bacterial cysteine protease effectors.
Although the carboxyl terminus of LegA7 has been shown to have a series of ankyrin repeats, the amino terminus has not been annotated, but was shown to harbor the LED010 domain present in other Legionella effectors (20). We noted during BLAST searches that there was low similarity to HopN1, a Pseudomonas syringae type III secretion system translocated substrate that is similar to cysteine protease family members at key catalytic residues (53). The region of sequence similarity with HopN1 begins at the LegA7 Cys61 residue, which aligns with the predicted catalytic cysteine of the Pseudomonas protein (Fig. 4) (54). The cysteine in members of this family is part of a catalytic triad of essential amino acids that also includes a histidine and either a glutamate or an aspartate, usually located carboxyl-terminal to the cysteine (53). Scanning of the LegA7 sequence indicates that there are candidate residues that could be part of this catalytic triad and be required for the activity of this protein (Fig. 4). After analysis of the LegA7 protein sequence, we selected residues C61, H205, N220, and N222 as candidates to make up this triad. Using site-directed mutagenesis, each residue was mutated to alanine. Mutations in all four residues rescued the growth defect caused by LegA7 in yeast, with the efficiency of CFU formation for all strains being identical to the empty vector control (Fig. 5A). These results are consistent with C61, H205, N220, and N222 being important for LegA7 function and involved in a catalytic triad. The suppression of the defect was not a result of the mutations destabilizing the protein, because each of the mutant proteins showed higher steady-state levels than the wild-type protein (Fig. 5B). In fact, the enhanced production of the mutants relative to the wild type is consistent with their loss of toxicity for yeast.

Mutagenesis screen to identify residues important for the LegA7 function.
The carboxyl terminus of LegA7 is predicted to have five ankyrin repeats (residues 290-454) (48). Each ankyrin repeat contains two alpha-helices separated by beta turns, such that each 33-residue motif contains a beta turn-alpha helix-beta turn-alpha helix-beta turn. In addition to the five ankyrin repeats, there is an inter-domain region just amino-terminal to the ankyrin repeats. To determine if these repeats are important for LegA7 function, deletion mutations were constructed that lack the following: all five ankyrin repeats (290-454); the carboxyl-terminal four repeats (331-454); the amino-terminal two repeats (290-361); or the two amino-terminal repeats in conjunction with the inter-domain region (264-361) (Fig. 6). None of the deletions caused a marked growth defect when introduced into yeast and grown on galactose, galactose and sorbitol, or galactose and NaCl plates (Fig. 6A). The lack of growth defect did not appear to be a result of the LegA7 derivatives being unstable, because only the derivative lacking all five ankyrin repeats showed low steady-state levels of protein (Fig. 6C).
To further characterize residues important for LegA7 function, we performed a random mutagenesis screen for mutations on the LegA7 expression plasmid that abrogated its inhibition of yeast growth. We took the approach of placing the HIS3 (histidine biosynthesis) gene at the 3' end of legA7 to construct a protein fusion (55). This strategy allowed for selection against frameshifts, stop codons, or GAL1 promoter mutations that could cause silencing of this promoter, each of which would result in enhanced growth in the mutants relative to strains harboring the legA7 gene (Fig. 7A). Yeast harboring the legA7-HIS3 protein fusion was as defective for growth as the strain harboring only legA7 (Fig. 7B), allowing facile identification of strains with improved growth characteristics. In a similar strategy used previously to isolate IDTS point mutations, omitting the HIS3 co-selection resulted in the vast majority of mutants being noninformative truncations, strongly supporting the use of this approach (56).
A plasmid encoding the legA7-HIS3 protein fusion was subjected to mutagenesis by passage within an E. coli mutator strain (Fig. 7A; Materials and Methods). Mutated plasmids were then introduced into yeast, selecting on medium containing galactose inducer and lacking histidine, to allow selection for intact legA7-HIS3 protein fusions with increased viability after induction of expression (Fig. 7B). As a control, on day three the wild-type legA7-HIS3 fusion showed little growth, so colonies were retained from two mutagenized pools at this time point. Of these, 36 were shown to have strong growth on sorbitol-containing high-osmolarity medium and retained a functional legA7-HIS3 protein fusion (Table 2). The high-osmolarity-resistant mutants fell into two categories: a few were found in the region surrounding the catalytic triad (G59A, T202A), but the majority were found in the inter-domain region and the two most amino-terminal ankyrin repeats (Fig. 7C). Of the residues near the catalytic triad, the most notable was G59. Although not part of the triad, the Gly residue is highly conserved in members of the bacterial cysteine protease family, located 2 or 3 residues upstream from the catalytic cysteine (Fig. 4, purple arrow). Two additional abundant sites for mutant isolation were H166 and S300, which were hit seven and six times, respectively. There were 17 additional mutations in the two amino-terminal ankyrin repeats and five mutations in the inter-domain region. These 22 mutations emphasize the importance of the inter-domain region and the amino-terminal region of the ankyrin repeats for LegA7 function. In contrast, no mutations selected in this fashion were found in the three carboxyl-terminal ankyrin repeats.
To evaluate the putative active site mutations and determine if any of the mutations in the inter-domain and ankyrin repeat regions could directly interface with the target of this activity, the altered residues were modeled with AlphaFold 2.0, using the ColabFold program (Dataset S5; Materials and Methods) (57,58). The predicted three-dimensional structure is consistent with C61/H205/N220 forming a catalytic triad (Fig. 4 and Fig. 7D), with the catalytic Cys residue in close apposition to His205. In addition, the defective H42Y mutation is located in a residue that interfaces with Glu58 and Gly59, directly abutting the active site (Fig. S2). The Tyr substitution is situated so that it could impinge on the active site, preventing substrate access or altering interactions between catalytic residues.
The mutations located in the inter-domain (ID) and ankyrin repeat (Ank) regions are consistent with LegA7 activity being regulated by an interaction surface that is distant from the catalytic triad. Many of these mutations alter hydrophobic or small side-chain amino acids that appear poorly accessible to water, likely causing local structural alterations that block presentation of a binding interface to host proteins. Among the few mutations in clearly surface-exposed residues, three were aligned with each other on each side of a cleft, with Ank1/2 on one side and the catalytic domain/ID region on the other side (Fig. 7E). The clustering of these mutations (Asn279, Gly300, Arg341) is consistent with the formation of a binding surface for eukaryotic substrates, with the Asn and Arg residues participating in interprotein associations. This putative binding cleft is on the opposite side of the protein from the catalytic site, indicating that this surface may not be involved in target association with the catalytic site (Fig. 7F and G; note flip). Rather, this binding surface is oriented similarly to the interface between the L. pneumophila VipD phospholipase and the host Rab5 activating protein (59), consistent with host protein binding to the ID and Ank1/2 regions resulting in activation of LegA7 catalysis.
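The distribution of mutations across LegA7 can be tabulated with a small lookup, sketched below in Python. The Ank1-2 and Ank1-5 boundaries come from the deletion constructs described above (290-361 and 290-454); the inter-domain range (264-289) is inferred from the difference between the 264-361 and 290-361 deletions, and treating residues 1-263 as the catalytic domain is an assumption made only for illustration. The function and variable names are hypothetical.

```python
from collections import Counter

# Coarse LegA7 regions as inclusive residue ranges.
REGIONS = [
    ("catalytic domain", 1, 263),  # assumption: everything upstream of the ID region
    ("inter-domain", 264, 289),    # inferred from the 264-361 vs. 290-361 deletions
    ("Ank1-2", 290, 361),          # amino-terminal two repeats (from the text)
    ("Ank3-5", 362, 454),          # remainder of the 290-454 repeat span
]

def region_of(residue):
    """Return the name of the region containing a residue number."""
    for name, start, end in REGIONS:
        if start <= residue <= end:
            return name
    raise ValueError(f"residue {residue} is outside the annotated ranges")

# Residue positions named in the text (G59, T202, N279, R341).
positions = [59, 202, 279, 341]
tally = Counter(region_of(p) for p in positions)
print(tally)
```

Applied to the full list in Table 2, a tally of this kind would make the absence of hits in the three carboxyl-terminal repeats immediately visible.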

DISCUSSION
Legionella pneumophila has evolved an arsenal of methods to manipulate the host cell to survive and replicate intracellularly. To this end, L. pneumophila translocates hundreds of IDTS proteins into the host cell through the Icm/Dot type IVb secretion system. Among these proteins are ones known to activate NF-κB, such as LnaB and the kinase LegK1 (34,60), as well as at least five protein synthesis inhibitors that activate both NF-κB and MAPKs (28,61). To identify other IDTS that alter the host cell stress response, we introduced a 259-member plasmid bank of L. pneumophila IDTS into mammalian cells and screened for proteins that caused phosphorylation of the stress-activated MAPK SAPK/JNK (62). A complementary screen was performed in yeast, by overexpressing IDTS proteins in the absence or presence of sorbitol, to identify IDTS proteins that amplify defects on a medium that activates a stress pathway requiring MAPK activity (52). Both screens led us to focus on a single IDTS, LegA7, which seems to impinge on a stress response pathway across evolutionarily diverse hosts.
LegA7 was previously identified as a substrate of the Icm/Dot secretion system due to its ankyrin repeats (63), and its amino-terminal region shows similarity to a cysteine protease domain (Fig. 4).
Mutagenesis of conserved residues of this cysteine protease domain (Fig. 5) supports the model that these residues form a catalytic triad similar to that found in members of this family (53). Cysteine protease domain family members are associated with a variety of catalytic functions in addition to proteolysis, such as small molecule transferase activity (53,64). Most of these activities are unknown and await identification. One of the most important features of this family is that members show high substrate specificity and target single residues in their substrates. This is exemplified by the defining members, Pseudomonas syringae AvrPphB and Yersinia YopT, which are type III secretion system translocated substrates (54). AvrPphB is a protein with a papain fold that has a protease activity targeting a plant protein associated with innate immune signaling (65,66). YopT cleaves membrane-bound Rho GTPases just upstream from their acylation sites, resulting in the release of these proteins from the plasma membrane and disruption of the host actin cytoskeleton (54).
Cysteine protease domains like that identified in LegA7 are found in another L. pneumophila effector (LegA2-Lpg2215) and in putative effectors in other Legionella species (Fig. 8A). Each of these proteins (LegA7, LegA2, and the other IDTS proteins presented) belongs to an orthologous group, all harboring the cysteine protease domain at the amino-terminal part of the protein, with ankyrin repeats of varying numbers found at the carboxyl terminus (Fig. 8A). In addition, one of these IDTS proteins also harbors a predicted phosphatidylinositol 3-phosphate (PI3P) binding domain (LED027), which was previously shown to bind PI3P in other L. pneumophila effectors (67), possibly directing effectors to the LCV surface (68-70). The catalytic triads identified in LegA7 and the putative IDTS proteins presented are very similar to the ones found in numerous type III secreted effectors (Fig. 8B). However, in LegA7 an asparagine residue is located in the position that is usually occupied by an aspartic acid residue in the catalytic triad of the type III effectors (Fig. 8B). The asparagine residue that is critical for the function of LegA7 has recently been shown to be critical for other unrelated cysteine proteases that also harbor conserved cysteine and histidine residues in their catalytic triads (71). At this point, there is no clear sequence motif that distinguishes protease from transferase activity, so the presence of the asparagine in LegA7 should not be considered diagnostic of a particular activity. For instance, the Yersinia YopJ type III effector has the typical catalytic triad of a cysteine peptidase, but it functions as an acetyltransferase that targets MAPK kinases, preventing their activation (72).
A mutagenesis screen to identify residues important for LegA7 function provided additional information about other domains important for the function of LegA7 (Fig. 7C). Sequence similarity alignments with other cysteine protease domains predicted a catalytic triad (Fig. 4), and this prediction was verified by mutations in the predicted catalytic residues (Fig. 7B; Table 2). Furthermore, tertiary structure prediction from AlphaFold 2 indicated that the C61/H205/N220 triad formed a compact pocket in the catalytic domain, consistent with the mutant analysis (Fig. 7D).
Notably, a His42Tyr mutation in the catalytic domain was isolated by selection in yeast, at a site not predicted to be involved in catalysis.The structural model, however, predicted that this residue was at the base of this catalytic pocket (Fig 7D).When the insertion of the Tyr at residue 42 was modeled compared to the WT His residue, it was found to impinge on neighboring residues Glu58 and Gly59 in the pocket, perhaps distorting the site or blocking access to the catalytic residues (Fig. S2), thus providing a molecular explanation for the isolation of this mutation.
We isolated five mutations in the inter-domain region and 17 mutations in the two ankyrin repeat motifs immediately downstream from this region that reduced yeast growth inhibition. Although many of the mutations in these two regions appeared to be in buried residues or possibly altering structure, there was a series of residues predicted to be surface exposed on a face of the protein where the inter-domain (ID) region comes in contact with the Ank1 and Ank2 repeats (Fig. 7E).
The region altered by these mutations is predicted to be turned 180° away from the catalytic pocket, making it unlikely that the substrates of the protease domain bind on this face of the protein. It is more likely that this region allows LegA7 to target to a cellular locale where it can access its substrates. Alternatively, binding to this region by a host or bacterial protein could activate LegA7, allowing either localization- or time-dependent activation of the catalytic triad.
Legionella translocated proteins are known to be highly regulated, either by other translocated proteins called metaeffectors (73) or by host proteins. The fact that this proposed regulatory surface appears to be on the face opposite from the catalytic triad (Fig. 7F and G) is not unusual, and has been previously observed in the crystal structure of the L. pneumophila VipD patatin family phospholipase, which is activated by Rab5 (59).
One of the puzzles of the point mutation screen is that no lesions were identified in the carboxyl-terminal three ankyrin repeat domains (Table 2; Fig. 7C). Some insight into this result may be given by the crystal structure of AnkX, an L. pneumophila IDTS ankyrin repeat-containing protein having phosphocholine transferase activity (38,74,75). Based on primary sequence information, AnkX is predicted to have up to 12 ankyrin repeats arrayed downstream from the catalytic domain, similar to the domain arrangement of LegA7. The crystal structure of the phosphocholine transferase domain indicates that the amino-terminal four ankyrin repeats are involved in intramolecular interactions that support the catalytic activity of AnkX (75). A proteolytic cleavage product that retains only the amino-terminal four repeats and the phosphocholine transferase region retains full activity and substrate specificity, consistent with the carboxyl-terminal repeats being dispensable for activity (75). Presumably, the ankyrin repeats in this protein are divided into an amino-terminal region involved in intramolecular interactions and a carboxyl-terminal region providing intermolecular interactions that contribute to the localization or targeting of the protein. By analogy with AnkX, the two amino-terminal repeats of LegA7 may be involved in intramolecular interactions supporting the activity of the protein, with the three carboxyl-terminal repeats involved in spatial targeting of the protein or processes unrelated to lethality in yeast.
We have determined that LegA7 appears to activate at least one host cell pathway that, when disrupted, results in yeast growth inhibition. Using a genetic strategy, we were able to obtain evidence that yeast growth inhibition likely results from an enzymatic activity at the amino-terminal end of this protein that is modulated by a subset of ankyrin repeats. Future work will be required to identify the substrate of LegA7 activity and enumerate the pathways that are misregulated by the targeting of this substrate.

Strains and media
Yeast and bacterial strains, plasmids, and primers used in this study are listed in Datasets S2, S3, and S4, respectively. For E. coli strains, ampicillin was added to 100 µg·ml⁻¹ and kanamycin to 30 µg·ml⁻¹. Yeast strains were grown in synthetic defined (SD) dropout medium supplemented with 2% glucose or galactose, as indicated in the text.

Screen for MAPK activation in mammalian cells
Plasmids containing Icm/Dot translocated substrate genes fused to gfp were constructed by inserting fragments into pDONR221 (Invitrogen) and transferring the inserts into pDEST53 (pCMV-GFP) by using the Gateway™ system, as we previously described (34). The inserts in the original pDONR221 constructions were sequenced, and the appropriate recombinants were transferred into the GFP-expressing plasmid (34). The GFP fusion constructions, in which GFP-IDTS fusions were under the control of the CMV promoter, were analyzed by restriction digestion, the insertions were sequenced, and the plasmids were purified using Ultra-Pure Miniprep kits (Qiagen) for use in transfections.
To screen for MAPK activation, HEK293T cells were seeded at a density of 1 × 10⁶ cells per well of 12-well dishes and left overnight to adhere. Cells were transfected with 500 ng of each plasmid using 0.4 µl of Fugene 6 according to the manufacturer's instructions (Promega) for 40 hours. Each well was washed, solubilized in SDS sample buffer (2% SDS, 50 mM Tris-HCl (pH = 6.8), 0.1% bromophenol blue, 10% glycerol), boiled for 2 min, loaded onto two SDS-PAGE gels, and transferred to filters for immunoprobing. One filter was used to probe for relative expression levels of each of the fusion constructions, using anti-GFP polyclonal serum A-11122 (Invitrogen).
Filters were scanned by densitometry, and the phosphorylation level of each MAPK member was determined relative to the α-tubulin loading control, detected with anti-α-tubulin antibody T9026 (Sigma). Images were inverted and quantified using Adobe Photoshop.
Analysis of MAPK phosphorylation samples resulted in a few transfections that showed large amounts of JNK phosphorylation relative to the control empty vector, giving the overall dataset a large standard deviation. Therefore, to expand the number of L. pneumophila candidates that cause alterations in JNK phosphorylation relative to the empty vector control, the median and the median absolute deviation (MAD) of the samples tested were determined (76). A MAD score was then determined for each sample as (X_i − median)/MAD, in which X_i is the amount of phosphorylation of a particular sample relative to the control. Samples that had MAD scores > 3.5 were considered to be expressing candidate L. pneumophila proteins that cause enhanced activation of JNK.
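The MAD scoring described above can be sketched in a few lines of Python. This is a minimal illustration of the formula as stated in the text, (X_i − median)/MAD with a cutoff of 3.5; the gene names and phosphorylation values are hypothetical placeholders, not data from the screen, and the function name `mad_scores` is our own.

```python
from statistics import median


def mad_scores(values):
    """Return (x_i - median) / MAD for each value, following the formula in the text."""
    med = median(values)
    # MAD: median of the absolute deviations from the median
    mad = median(abs(x - med) for x in values)
    if mad == 0:
        raise ValueError("MAD is zero; scores are undefined for this dataset")
    return [(x - med) / mad for x in values]


# Hypothetical phospho-JNK levels relative to empty vector (illustration only)
relative_pjnk = {
    "vector": 1.0,
    "geneA": 0.9,
    "geneB": 1.1,
    "geneC": 1.2,
    "geneD": 0.8,
    "geneE": 4.0,
}

scores = dict(zip(relative_pjnk, mad_scores(list(relative_pjnk.values()))))

# Apply the cutoff stated in the text (MAD score > 3.5)
candidates = [name for name, score in scores.items() if score > 3.5]
print(candidates)
```

Because the median and MAD are insensitive to a few extreme values, a handful of strongly activating constructs does not inflate the threshold the way a standard deviation would, which is the motivation given in the text.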

Quantification of MAPK activation in mammalian cells
For quantitative analysis of MAPK activation in mammalian cells, HEK293T cells were seeded at a density of 1 × 10⁶ cells per well of 12-well dishes and allowed to adhere overnight. Cells were transfected with 500 ng of each plasmid using 1.5 µl of Fugene HD according to the manufacturer's instructions (Promega) for 40 hours. Each well was washed, solubilized in 2× SDS sample buffer (125 mM Tris-HCl (pH = 6.8), 20% glycerol, 4% SDS, 2% 2-ME, 0.001% bromophenol blue), boiled for 5 minutes, loaded onto three SDS-PAGE gels, and transferred to PVDF membranes for immunoprobing. One filter was used to probe for relative expression levels of each of the fusion constructions, using anti-GFP polyclonal serum A-11122 (Invitrogen). The second filter was immunoprobed with anti-phospho-JNK antibody 4668 (Cell Signaling). The third filter was immunoprobed with anti-α-tubulin antibody T9026 (Sigma). The appropriate HRP-conjugated secondary antibodies (Invitrogen) were used for ECL detection. The filters were exposed to film, and images of the films were taken using a Kodak Image Station.
To determine the normalized signal intensity, the signal intensities of phospho-SAPK/JNK for legA7, lpg0030, lpg0059, and empty vector were normalized to the intensity of the α-tubulin loading control for each particular sample. The average and standard error were calculated for each strain. To determine 'Relative Expression' over the empty vector, the average normalized signal intensity of each strain was divided by the average normalized empty vector signal intensity. Data are the mean of three samples ± S.E.
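The normalization steps above can be illustrated with a short Python sketch. The band intensities are hypothetical numbers chosen for illustration, not measurements from this study, and the helper names (`normalize`, `sem`) are our own. The standard error is computed on the tubulin-normalized intensities; how the error was carried over to the relative values is not stated in the text, so that step is left out.

```python
from math import sqrt
from statistics import mean, stdev


def normalize(pjnk, tubulin):
    """Divide each phospho-JNK intensity by the tubulin intensity of the same sample."""
    return [p / t for p, t in zip(pjnk, tubulin)]


def sem(values):
    """Standard error of the mean (sample standard deviation / sqrt(n))."""
    return stdev(values) / sqrt(len(values))


# Hypothetical densitometry readings, three samples per construct (illustration only)
bands = {
    "vector": {"pjnk": [10.0, 12.0, 11.0], "tubulin": [100.0, 100.0, 100.0]},
    "legA7": {"pjnk": [30.0, 36.0, 33.0], "tubulin": [100.0, 100.0, 100.0]},
}

normalized = {name: normalize(b["pjnk"], b["tubulin"]) for name, b in bands.items()}
errors = {name: sem(vals) for name, vals in normalized.items()}

# Relative value: average normalized intensity divided by the empty-vector average
vector_mean = mean(normalized["vector"])
relative = {name: mean(vals) / vector_mean for name, vals in normalized.items()}

for name in bands:
    print(name, round(relative[name], 2), round(errors[name], 4))
```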

Yeast growth assays
L. pneumophila effector-encoding genes were cloned under the GAL1 promoter in the pGREG523 yeast overexpression vector. Plasmids were transformed into yeast cells using a standard lithium acetate protocol (77), and transformants were selected for histidine prototrophy on minimal SD dropout plates. Resulting transformants were then grown overnight in liquid SD culture medium at 30°C, cell number was adjusted, and a series of 10-fold dilutions was made. The cultures were then spotted onto the respective SD dropout plates containing 2% glucose or galactose. When indicated, the plates were supplemented with 0.7 M NaCl (Merck) or 1 M sorbitol (Sigma).
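The arithmetic behind adjusting cell number and preparing the dilution series can be sketched as follows. The text does not state how cell number was measured or how many dilutions were spotted, so this sketch assumes optical density as a proxy for cell number, a target density of 1.0, and five spots per strain; all three are illustrative assumptions, as are the function names.

```python
def volumes_to_adjust(culture_od, target_od, final_volume_ul):
    """Return (culture_ul, diluent_ul) from C1*V1 = C2*V2. OD as a density proxy is an assumption."""
    if culture_od < target_od:
        raise ValueError("culture is less dense than the target; cannot dilute to it")
    culture_ul = target_od * final_volume_ul / culture_od
    return culture_ul, final_volume_ul - culture_ul


def tenfold_series(start_density, steps):
    """Densities of a 10-fold serial dilution, starting from the adjusted culture."""
    return [start_density / 10 ** i for i in range(steps)]


# Hypothetical overnight culture at OD 4.0, adjusted to OD 1.0 in 1 ml
culture_ul, diluent_ul = volumes_to_adjust(culture_od=4.0, target_od=1.0, final_volume_ul=1000.0)
series = tenfold_series(start_density=1.0, steps=5)

print(culture_ul, diluent_ul)
print(series)
```

Starting every strain from the same adjusted density is what makes spots in the same column comparable across strains and plates.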

Construction of 13 x myc fusions
The pGREG523 vector was used for the overexpression of 13 x myc-tagged effectors in yeast (78). This vector contains a polylinker under the yeast GAL1 promoter at the end of a 13 x myc tag.
The L. pneumophila genes examined were amplified by PCR using a pair of primers containing suitable restriction sites (Dataset S4). The PCR products were subsequently digested with the relevant enzymes and cloned into pGREG523 to generate the plasmids listed in Dataset S3. The plasmid inserts were sequenced to verify that no mutations were introduced during the PCR.

Site-directed mutagenesis and construction of deletion mutants
Site-specific mutants in the putative peptidase domain of legA7 were constructed by the PCR overlap-extension approach, as previously described (79), on the legA7 gene inserted into pGREG523, using the primers listed in Dataset S4.
Deletions of LegA7 ankyrin repeats were constructed using oligonucleotides overlapping the deletion junctions. Primers containing the desired deletion and their complements were used to amplify the entire plasmid sequence using PfuUltra II fusion HS DNA polymerase (Agilent). After PCR, the product was treated with DpnI to digest the parental DNA template. A 5-µl aliquot of the DpnI-digested PCR reaction was then transformed into CaCl₂-competent DH5α E. coli, allowing the linearized DNA to be recircularized by the E. coli cells (80). Mutants constructed were confirmed by Sanger sequencing (GeneWiz Azenta, South Plainfield, NJ). Plasmids harboring the deletion mutants described in Fig. 6 and Fig. S1 were sequenced entirely (Plasmidsaurus, Eugene, OR), with sequences and plasmids being deposited with AddGene (ID numbers: 216518-216522).
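The idea of a primer that overlaps a deletion junction can be illustrated with a short Python sketch. It joins the bases immediately upstream of the deleted segment to the bases immediately downstream and returns that oligonucleotide with its reverse complement. The template sequence, coordinates, and flank length below are toy values chosen for illustration; the primers actually used are those listed in Dataset S4, and the function names are our own.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def deletion_primers(template, del_start, del_end, flank=15):
    """Build a junction-spanning primer pair for deleting template[del_start:del_end].

    Coordinates are 0-based and half-open. The flank length is an illustrative
    assumption, not a value taken from the text.
    """
    if not (flank <= del_start < del_end <= len(template) - flank):
        raise ValueError("deletion must leave at least `flank` bases on each side")
    upstream = template[del_start - flank:del_start]
    downstream = template[del_end:del_end + flank]
    forward = upstream + downstream
    return forward, reverse_complement(forward)


def apply_deletion(template, del_start, del_end):
    """Sequence expected after the deletion has been introduced."""
    return template[:del_start] + template[del_end:]


# Toy template: the run of T's stands in for the segment to be deleted
toy_template = "ACGTACGG" + "TTTTTTTT" + "CATGCATG"
forward, reverse = deletion_primers(toy_template, del_start=8, del_end=16, flank=6)
product = apply_deletion(toy_template, 8, 16)

print(forward)
print(reverse)
print(forward in product)
```

The forward primer matches the product sequence across the junction but not the parental template, which is why amplifying the whole plasmid with such a pair yields molecules lacking the deleted segment.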

Western blot analysis
For all protein fusions examined in yeast, the formation of a fusion protein of the proper size was validated by Western blotting using anti-myc antibody 9E10 (Santa Cruz Biotechnology) and anti-Xpress tag antibody R91025 (Invitrogen). Anti-PGK1 antibody 22C5D8 (Invitrogen) was used as a loading control.

Mutagenesis screen
To isolate random mutations in the legA7 gene, the HIS3 gene was placed directly downstream of and in-frame with the legA7 gene (removing the cognate stop codon) in the plasmid pYES2/NTA (URA selection) using homologous recombination to generate a legA7-HIS3 fusion. The BY4741 strain transformed with the legA7-HIS3 fusion in pYES2/NTA was verified by spotting assay to show a stronger growth defect than the strain harboring the plasmid with legA7 alone. Mutagenesis was performed by transforming the plasmid into the E. coli XL1-Red mutator strain (mutS mutT mutD) (see Dataset S2). After growth in culture for 16 hours, DNA was isolated from two independent cultures and transformed into S. cerevisiae BY4741, selecting for growth on SD (galactose)-URA-HIS dropout medium. Strains that gave higher viability or larger colony size than the parental legA7-HIS3 protein fusion-containing strain on sorbitol-containing medium were retained for further analysis.

Modeling of mutation sites
The full-length LegA7 was submitted to the ColabFold program (58) to allow analysis by AlphaFold 2.0 (57). The five ranked models that were returned provided similar results, so the Rank 1 model was used for further analysis. The PDB file generated (Dataset 4) was displayed in iCn3D (81), allowing sites of mutations to be identified and relative orientation of domains to be evaluated.

promoter and grown on plates containing glucose (Glu), galactose (Gal, inducing conditions), or galactose supplemented with 1 M sorbitol (Gal. + Sorb.) at 30°C, in the wild-type S. cerevisiae BY4741. pGREG523 (vector) was used as a negative control.

LegA7 was overexpressed in the wild-type S. cerevisiae BY4741 (W.T.) and the hog1, pbs2, mpk1, and bck1 deletion mutants at 30°C and 37°C. pGREG523 (vector) was used as a control.

Highlighted in red are the residues predicted to be part of the catalytic triad; highlighted in purple is the glycine residue that came out in the mutagenesis screen (see text).

4, the residues C61, H206, N220, and N222 were selected as candidates for a catalytic triad.

SUPPLEMENTARY MATERIAL
Point mutations were generated, and the plating efficiency of yeast strains harboring the mutant derivatives was determined on galactose (Gal.), galactose and sorbitol (Gal. + Sorb.), and galactose and NaCl (Gal. + NaCl) plates. B. LegA7 point mutations do not reduce steady-state levels of protein.
To induce gene expression in yeast, yeast strains were grown on SD plates containing galactose.
Lysates were analyzed by immunoblot with antibodies against the myc epitope, using PGK1 for loading control.

Figure 1. Growth defects caused by L. pneumophila Icm/Dot translocated substrates expressed

Figure 2. Ectopic expression of LegA7 results in elevated phosphorylation of SAPK/JNK in

Figure 4. Similarity of LegA7 to bacterial cysteine protease family members. Sequence

Figure 5. Genetic evidence for a catalytic triad in LegA7. A. Based on sequence similarity in

Figure 6. Ankyrin repeats are required for yeast growth inhibition. A. Deletion of the ankyrin

Figure 7. Identification of residues in the amino-terminal ankyrin repeats that are essential

TABLE 1. Identification of Icm/Dot translocated substrates that result in increased phospho-JNK levels in mammalian cells. Listed are L. pneumophila genes that, when ectopically expressed from the pDEST53 plasmid, result in enhanced JNK phosphorylation relative to empty vector control, using the median absolute deviation (MAD) score > 3.5 as a cutoff for increased JNK phosphorylation relative to wild type control.