PE5-PPE4-EspG3 trimer structure from mycobacterial ESX-3 secretion system gives insight into cognate substrate recognition by ESX systems

ABSTRACT
Mycobacterium tuberculosis (Mtb) has evolved numerous type VII secretion (ESX) systems to secrete multiple factors important for both growth and virulence across its cell envelope. Three such systems, ESX-1, ESX-3, and ESX-5, have each been shown to secrete a unique set of substrates. A large class of the substrates secreted by these three systems comprises the PE and PPE families of proteins. Proper secretion of the PE-PPE proteins requires the presence of EspG, with each system encoding its own unique copy. There is no cross-talk between any of the ESX systems, and how each EspG recognizes its subset of PE-PPE proteins is currently unknown. The only current structural characterization of PE-PPE-EspG trimers is from the ESX-5 system. Here we present the crystal structure of the PE5mt-PPE4mt-EspG3mm trimer from the ESX-3 system. Our trimer reveals that EspG3mm interacts exclusively with PPE4mt in a manner similar to EspG5, shielding the hydrophobic tip of PPE4mt from solvent. The C-terminal helical domain of EspG3mm is dynamic, alternating between an 'open' and a 'closed' form, and this movement is likely functionally relevant in the unloading of PE-PPE heterodimers at the secretion machinery. In contrast to the previously solved ESX-5 trimers, the PE-PPE heterodimer of our ESX-3 trimer interacts with its chaperone at a drastically different angle and presents different faces of the PPE protein to the chaperone. We conclude that the PPE-EspG interface from each ESX system has a unique shape complementarity that allows each EspG to discriminate amongst non-cognate PE-PPE pairs.
Tuberculosis is currently the deadliest infectious disease in the world, killing 1.5 million people in 2019 (1). The lack of an effective vaccine against the most prevalent pulmonary form of tuberculosis, as well as the emergence of numerous multi-drug resistant strains of the causative agent, Mycobacterium tuberculosis (Mtb), highlight the growing need for more effective treatment options. Therefore, a more comprehensive understanding of the Mtb virulence machinery is needed to aid the development of new therapeutics.
Mtb, like all mycobacteria, contains a thick hydrophobic cell envelope that aids in protecting the mycobacterium from its environment. To overcome the limited permeability created by this envelope, mycobacteria have evolved a specialized secretion system to export proteins across their cell envelopes, the type VII secretion system, also known as the ESX system (2). Five different ESX systems are encoded in the Mtb genome, and three are known to secrete proteins: ESX-1, ESX-3, and ESX-5 (3). Recently, the structures of the core complex of both ESX-3 (4,5) and ESX-5 (6) have been solved. The ESX-5 core complex has six-fold symmetry and sits on the inner membrane (6), while the ESX-3 core complex was solved as a dimer that could be modeled onto the six-fold symmetry of the ESX-5 core complex (4,5). These systems are not functionally redundant, as their substrates are not re-routed to other ESX systems (7). The ESX systems secrete a variety of different substrates, each containing a general type VII secretion motif of YxxxD/E (8). A significant class of substrates is the PE and PPE proteins, named for conserved residues (Pro-Glu for PE and Pro-Pro-Glu for PPE) within their N-terminal domains (9,10). The N-terminal domains are about 110 (PE) or 180 (PPE) amino acids in length and interact to form a PE-PPE heterodimer. A cytosolic chaperone, EspG, is required for proper folding and/or stability of the PE-PPE proteins and, ultimately, their proper secretion (11,12). Each ESX system secretes a unique subset of PE-PPE heterodimers, and therefore each encodes an EspG that binds only to its corresponding heterodimers (11,12). The first structural insight into the EspG and PE-PPE interaction was revealed by analysis of the structure of the PE25-PPE41-EspG5 complex, a trimer from ESX-5 (12,13).
EspG5 interacts solely with PPE41 at the tip distal to the PE25 interaction and aids in preventing PE-PPE heterodimer aggregation, in part by shielding a conserved hydrophobic tip on the PPE proteins known as the hh motif (12). The additional structure of the ESX-5-related PE8-PPE15-EspG5 trimer revealed similar interactions of the substrate PE-PPE dimer with the EspG5 chaperone (14). Despite high conservation among PPE proteins in the identified EspG5-binding region from PPE41, three residues vary depending on whether the PPE protein is secreted by ESX-1, ESX-3, or ESX-5 (12). Alteration of any or all of these positions in the ESX-5-dependent PPE41 did not disrupt PPE41-EspG5 binding (12). Based on this observation, it has been suggested that structural elements outside of the EspG-binding region differentiate the ESX-5-specific PPE proteins from their ESX-1 and ESX-3 homologs to bind EspG5 (12).
This study was initiated to understand how each EspG from the different ESX systems specifically recognizes its unique subset of cognate PE-PPE heterodimers. Here we present the structure of PE5-PPE4-EspG3 from ESX-3. This structure reveals a novel binding mode of PE-PPE proteins with the EspG chaperone and suggests the molecular mechanism by which the PE-PPE dimers are specifically targeted by cognate chaperones.
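As an illustration of the general type VII secretion motif YxxxD/E described above, the following minimal Python sketch scans an amino acid sequence for the pattern (tyrosine, any three residues, then aspartate or glutamate). The function name and the toy sequence are hypothetical and are not taken from this study; a sequence match alone does not establish that a motif is functional.

```python
import re

# YxxxD/E: tyrosine, any three residues, then aspartate (D) or glutamate (E).
# The lookahead lets overlapping occurrences be reported as well.
MOTIF = re.compile(r"(?=(Y.{3}[DE]))")


def find_secretion_motifs(sequence):
    """Return (1-based position, matched text) for each YxxxD/E occurrence."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in MOTIF.finditer(seq)]


# Hypothetical toy sequence, NOT a real PE protein sequence.
toy_sequence = "MSAYLKGDAAYAAAEYQ"
for position, match in find_secretion_motifs(toy_sequence):
    print(position, match)
```

Positions are reported with 1-based numbering to match the residue numbering convention used for proteins.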

EspG3 forms a complex with PE5-PPE4 and binding is conserved across species
In order to understand the mechanism for the specificity of PE-PPE recognition by cognate chaperones, a high-resolution structure of a trimer produced by an ESX system other than ESX-5 was needed. Our efforts have focused on optimizing the ESX-3 PE-PPE-EspG trimer for X-ray structural studies. Constructs of full-length PE5 (Rv0285) and the conserved N-terminal PPE domain of PPE4 (Rv0286, residues 1-181) in a complex with the cognate full-length EspG3 (Rv0289) from Mycobacterium tuberculosis (Fig. 1a) never formed high-resolution diffraction-quality crystals, despite our best efforts. The difficulty could be due to some heterogeneity in the processing of EspG3mt within the Escherichia coli cell, as seen by the double band in Fig. 1b and Sup. Fig. 1a. Numerous variations of PE5-PPE4-EspG3 constructs were screened utilizing multiple mycobacterial species, different fusion approaches, and even mixing PE5-PPE4 dimers with EspG3 chaperones from different species (Sup. Table 1). This latter approach was inspired by the work done on the Plasmodium aldolase-thrombospondin-related anonymous protein complex (15) and, in the end, produced the best crystals for further diffraction experiments. To ensure the mixed trimers behaved the same in solution as the wild-type (WT) trimer, size-exclusion chromatography with multi-angle light scattering (SEC-MALS) experiments were performed on both the WT PE5mt-PPE4mt-EspG3mt trimer and the mixed PE5mt-PPE4mt-EspG3mm trimer, which contained EspG3 from Mycobacterium marinum (MMAR_0548) (Fig. 1b-c). Both trimers form a 1:1:1 complex with experimental molecular weights of 56.2 kDa (Fig. 1b) for the full M. tuberculosis trimer and 54.6 kDa (Fig. 1c) for the mixed trimer. The binding of different EspG3s to the same PE-PPE heterodimer suggests a common protein-protein recognition mechanism within the ESX-3 family.

Overall structure of PE5mt-PPE4mt-EspG3mm
The PE5mt-PPE4mt-EspG3mm trimer formed diffraction-quality crystals, and two different crystal forms were observed that diffracted to 3.3 Å (I422) and 3.0 Å (P212121). Final refinement and data statistics are in Table 1. Overall, there is little structural variation between the individual proteins across the copies present in the two crystal forms (Table 2).
For all structural analyses and comparisons, the first copy of the PE5mt-PPE4mt-EspG3mm trimer from the P212121 crystal form was used, because this form diffracted to higher resolution and this copy has the lowest B-factors of the non-crystallographic copies in the P212121 form. EspG3mm interacts solely with the tip of PPE4mt (Fig. 2), similar to EspG5 in the previously solved ESX-5 trimers (12-14). However, the orientation of PE5mt-PPE4mt relative to EspG3mm is dramatically different from that observed for either ESX-5 trimer, and the differences between them are described in later sections. The YxxxD/E motif for ESX secretion of PE5mt is accessible for interactions with the rest of the ESX machinery, as it is located distal to the EspG3mm interaction (11). In both crystal forms, this secretion motif is disordered, similar to the motif in PE8mt from the PE8mt-PPE15mt-EspG5mt trimer (14). The individual components of the PE5mt-PPE4mt-EspG3mm trimer align well to the individual components of the previously reported ESX-5 trimers, both PE25mt-PPE41mt-EspG5mt (4KXR and 4W4L) and PE8mt-PPE15mt-EspG5mt (5XFS), with only moderate variations (Table 3).
In a previous study on EspG structures (16), a small-angle X-ray scattering (SAXS) experiment was done on the PE5-PPE4-EspG3 trimer from M. smegmatis. We compared this SAXS analysis with our crystal structure to see whether the solution-based characterization of the trimer matched the X-ray-based characterization. We ran CRYSOL (17) on our crystal structure in comparison to the experimental scattering data from the M. smegmatis trimer. The overall χ² is 2.53, which is acceptable given that the trimers are from different species with only 54.0-73.8% sequence identity across the different components (Sup. Fig. 2). The main differences are in the extreme high- and low-resolution areas, likely arising from differences in the primary structure between the two samples and from aggregation in the SAXS sample, respectively. Therefore, we are confident that the crystal structure is an appropriate model of the ESX-3 trimer as it exists in solution.
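To make the χ² comparison above more concrete, the sketch below computes a discrepancy between experimental and model scattering intensities in one commonly used form, χ² = 1/(N-1) Σ[(I_exp - c·I_calc)/σ]², where c is a least-squares scale factor. This is only an illustration under that assumed definition; it is not CRYSOL itself, which additionally adjusts hydration-shell and excluded-volume parameters during fitting. The function name and the input values are hypothetical.

```python
def scaled_chi_square(i_exp, sigma, i_calc):
    """Return (chi_square, scale) for experimental vs. model intensities.

    chi^2 = 1/(N-1) * sum(((I_exp - c * I_calc) / sigma)^2),
    where c is the least-squares scale factor applied to the model curve.
    """
    if not (len(i_exp) == len(sigma) == len(i_calc)) or len(i_exp) < 2:
        raise ValueError("need three equal-length lists with at least 2 points")
    # Least-squares scale factor that best maps the model onto the data.
    numerator = sum(e * m / s**2 for e, s, m in zip(i_exp, sigma, i_calc))
    denominator = sum(m * m / s**2 for s, m in zip(sigma, i_calc))
    scale = numerator / denominator
    # Sum of squared, error-normalized residuals.
    total = sum(((e - scale * m) / s) ** 2
                for e, s, m in zip(i_exp, sigma, i_calc))
    return total / (len(i_exp) - 1), scale


# Hypothetical values: a model exactly proportional to the data gives chi^2 = 0.
chi2, scale = scaled_chi_square([2.0, 4.0, 6.0], [1.0, 1.0, 1.0], [1.0, 2.0, 3.0])
print(chi2, scale)
```

Values near 1 indicate agreement within the experimental errors, so the statistic depends on how well the errors σ were estimated as well as on the model.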

Interface between PPE4mt and EspG3mm
The interface between EspG3mm and PPE4mt contains numerous hydrophobic interactions, multiple hydrogen bonds, and two salt bridges centered around E140 of PPE4mt (Fig. 2b-f). Overall, the interface buries 3,121 Å² of solvent-accessible surface area, as calculated by the PISA server (18), and comprises 30 residues from PPE4mt and 49 residues from EspG3mm (Fig. 3). The tip of PPE4mt, containing the ends of α4 and α5 and the loop between them, is inserted into a groove on EspG3 composed of its central β sheet and C-terminal helical bundle. This bundle shields the hydrophobic tip of PPE4mt, including the hh motif (F128 and F129), from solvent access. The tip of PPE4mt interacts with EspG3 in such a way that the complex is unlikely to disengage at the ESX secretion machinery without structural rearrangement of the chaperone.

Mutations cause disruptions in the PPE4-EspG3 interface
In order to probe the interface of the crystal structure and test the importance of interacting residues, we made several mutations on both the PPE4 and EspG3 sides of the interface and opted to use the cognate PE5mt-PPE4mt-EspG3mt trimer to test our mutations. The PISA output (18) of the interface was analyzed along with sequence alignments of the currently known ESX-3 PPE proteins (Sup. Fig. 3) and alignments of the EspG3 proteins used in this study (Sup. Fig. 4) to select which residues in the interface would be mutated.
PPE4mt is well conserved along the interface among ESX-3-specific PPE proteins (Sup. Fig. 3), and we targeted strictly conserved residues in the interface. We selected N127 and N132 because they form buried hydrogen bonds, F128 and F129 because they are the hh motif and contribute a large amount of solvation energy to the interface according to PISA (18), and E140 because it is part of the salt bridges in the interface. We ran co-purification pull-down assays with mutated PPE4mt and EspG3mt (Table 4). As described earlier, EspG3mt is only co-purified with the PE5-PPE4 heterodimer if it forms a complex. The introduction of charges into the buried hydrogen bonds with N127D and N132E was unable to break the PPE4mt-EspG3mt interaction, and neither was the charge reversal of E140R, as all three mutants co-purify with EspG3mt (Sup. Fig. 5a). This suggests that disruption of any of these single positions is not sufficient to abolish the PPE4mt-EspG3mt interaction. Conversely, the introduction of charged residues into the hh motif with F128R or F129E did disrupt the interface and prevented EspG3mt from being co-purified (Sup. Fig. 5a), as these substitutions disrupt the hydrophobic environment deep within the EspG3mt binding pocket.
The interface of EspG3mt is also well conserved amongst the various EspG3s tested in this study (Sup. Fig. 4), and again, we targeted strictly conserved residues. We selected R208 and E212 because they contain buried hydrogen bonds, R87 and R102 because they form the salt bridge within the interface, and S231 because it sits at the top of the groove of EspG3 and could sterically block entrance into the pocket. Neither single mutation of the salt bridge, R87E or R102E, was able to prevent co-purification of EspG3mt (Sup. Fig. 5b). Also, the introduction of a charged residue with R208E was unable to prevent the interaction (Sup. Fig. 5b). In contrast, E212R was sufficient to prevent copurification, as well as S231Y (Sup. Fig. 5b), as both prevent the hydrophobic tip of PPE4mt from interacting with the binding pocket of EspG3mt either by charge repulsion or steric hindrance. Thus, our mutations on both PPE4mt and EspG3mt highlight the importance of the hydrophobic environment deep within the PPE4mt-EspG3mt interface.
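The point mutations above follow the standard notation of wild-type residue, position, and substituted residue (for example, F128R). As a minimal sketch of how such substitutions can be applied to a sequence and described by their charge change, the Python code below uses hypothetical helper names and a toy sequence rather than the real PPE4 or EspG3 sequences. Treating only K/R as positive and D/E as negative (with histidine neutral) is a simplifying assumption.

```python
import re

POSITIVE = set("KR")  # assumption: histidine is treated as neutral
NEGATIVE = set("DE")


def residue_charge(residue):
    """Return +1, -1, or 0 for a one-letter amino acid code."""
    if residue in POSITIVE:
        return 1
    if residue in NEGATIVE:
        return -1
    return 0


def apply_mutation(sequence, mutation):
    """Apply a substitution such as 'F128R' (1-based position) to a sequence."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"unrecognized mutation: {mutation}")
    wild_type, position, new = match.group(1), int(match.group(2)), match.group(3)
    # Guard against numbering errors by checking the wild-type residue.
    if position < 1 or position > len(sequence) or sequence[position - 1] != wild_type:
        raise ValueError(f"{mutation} does not match the sequence")
    return sequence[:position - 1] + new + sequence[position:]


def describe_charge_change(mutation):
    """Classify a substitution by how it changes side-chain charge."""
    before, after = residue_charge(mutation[0]), residue_charge(mutation[-1])
    if before == 0 and after != 0:
        return "introduces charge"
    if before * after == -1:
        return "charge reversal"
    if before != 0 and after == 0:
        return "removes charge"
    return "no charge change"


# Toy sequence (hypothetical): position 3 is F, so 'F3R' is a valid substitution.
print(apply_mutation("MNFFE", "F3R"))
for name in ("N127D", "E140R", "S231Y"):
    print(name, describe_charge_change(name))
```

Checking the wild-type residue before substituting helps catch off-by-one numbering errors when constructs are truncated or tagged.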

Structure of EspG3 in and out of trimer complex
Our structure is the first of EspG3 solved in complex with a cognate PE-PPE dimer, and thus we wanted to compare it to the previously solved unbound EspG3 structures. In total, there are six available EspG3 structures: four of EspG3ms (PDB codes: 4L4W, 4RCL, 5SXL, and 4W4J (13,16)), one of EspG3mt (4W4I (13)), and one of EspG3mm (5DLB (16)). These six structures can be classified into two different forms, an 'open' and a 'closed' form. The difference between these two forms is the orientation of the C-terminal helical bundle relative to the core β-sheet. The EspG3mm structure (5DLB) is representative of the 'open' form, and one of the EspG3ms structures (4RCL) is representative of the 'closed' form. Analysis of EspG3mm as it exists in the PE5mt-PPE4mt-EspG3mm trimer was done relative to these two representative structures. The overall alignment of the representative structures to the bound EspG3mm was good, with r.m.s.d. values of 2.1 Å and 1.9 Å for the 'open' and 'closed' forms, respectively (Fig. 4a). Inspection of these alignments shows the majority of differences to be within the arrangement of the C-terminal helical bundles, with the bound form of EspG3mm being close to the orientation found in the 'closed' form (Fig. 4b-c). The bound EspG3mm cannot be any closer to the 'closed' form orientation because the C-terminal helical bundle is making contacts with PPE4mt. We hypothesized that this C-terminal helical bundle is dynamic and closes on cognate PPE proteins upon interaction. To test this hypothesis, a comparison between the bound EspG3mm structure and the 'open' EspG3mm was performed with the DynDom server (19). DynDom identified a moving domain within the structures that was located in the C-terminal helical bundle (Fig. 4d). DynDom's analysis also performed a whole-structure alignment that agreed with the previous Dali alignment in Fig. 4a-b.
DynDom performed alignments between the fixed domains (residues 11-168 and 189-279) and the moving domains (residues 168-188), which resulted in much better alignments with r.m.s.d. values of 1.76 Å and 0.86 Å, respectively. Therefore, the moving domain, the C-terminal helical bundle, is essentially structurally identical between PPE4mt-bound EspG3mm and the 'open' EspG3mm, and its rotation of 30.2° and translation of 0.8 Å only moderately perturb the fixed domain. Since the moving domain makes extensive contact with PPE4mt and PPE4mt would sterically clash with the current orientation of the C-terminal helical bundle, the movement from the 'closed' to the 'open' orientation could be significant in releasing the secreted PE-PPE dimers from the chaperone at the secretion machinery.
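A hinge motion such as the one described above is reported as a rotation angle plus a translation. As a minimal sketch of how a rotation angle relates to the rotation matrix that superposes one orientation of a domain onto another, the Python code below recovers the angle from the matrix trace, θ = arccos((tr R - 1)/2). This is a general property of rotation matrices and is not a reproduction of DynDom's algorithm; the function names and the example matrix (a rotation about the z axis) are hypothetical.

```python
import math


def rotation_angle_deg(rotation):
    """Rotation angle in degrees of a 3x3 rotation matrix, from its trace."""
    trace = rotation[0][0] + rotation[1][1] + rotation[2][2]
    # Clamp to [-1, 1] to protect acos against floating-point round-off.
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_theta))


def rotation_about_z(angle_deg):
    """Build a rotation matrix about the z axis (illustrative input only)."""
    a = math.radians(angle_deg)
    return [
        [math.cos(a), -math.sin(a), 0.0],
        [math.sin(a), math.cos(a), 0.0],
        [0.0, 0.0, 1.0],
    ]


# A matrix built for a 30.2 degree rotation returns that same angle.
angle = rotation_angle_deg(rotation_about_z(30.2))
print(round(angle, 1))
```

The trace-based angle is independent of the rotation axis, so the same function applies to a hinge axis in any orientation.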

Comparison of ESX-3 and ESX-5 PE-PPE-EspG trimers
A vastly different binding mode is observed when comparing the ESX-3-specific PE5mt-PPE4mt-EspG3mm trimer to the previously published ESX-5-specific trimers. As mentioned earlier, there is good agreement when comparing individual components of the ESX-3-specific trimer to the available ESX-5-specific trimers (Table 3). The difference between the two sets of trimers became apparent when they were aligned via EspG (Fig. 5a-b). Our results focused on comparisons with the PE25mt-PPE41mt-EspG5mt (4KXR) trimer, but the same differences were present with the PE8mt-PPE15mt-EspG5mt (5XFS) trimer. The angle at which the PE-PPE heterodimer interacts with EspG is drastically different between the two trimers, with a difference of 30° (Fig. 5b). Another difference lies within the hh motif loops of PPE41mt (α4-α5 loop) and PPE4mt (α5-α6 loop) (Fig. 5c). In PPE41mt, this loop is seven residues long and adopts a compact conformation that is not altered during EspG5mt binding (12). In contrast, in PPE4mt, this loop is nine residues long and has an extended conformation, and this difference was readily apparent when PPE41mt and PPE4mt were aligned (Fig. 5c).
This loop conformation also makes each PPE protein incompatible with the other's binding mode. When looking at the PPE alignment in the context of the ESX-3 trimer, the α4-α5 loop of PPE41mt does not align over the central groove of EspG3mm and instead sterically clashes with the central β sheet of the chaperone (Fig. 5d). The tip of PPE41mt would have to adopt a drastically different conformation in order to bind in the opening of EspG3mm. In the context of the ESX-5 trimer, the α5-α6 loop does not align with the central groove of the chaperone, and instead, PPE4mt's hh motif sterically clashes with the C-terminal helical bundle of EspG5mt (Fig. 5e). Also, none of the salt bridges between PPE41mt and EspG5mt are conserved in PPE4mt. Specifically, the PPE41mt residues of the salt bridges between D134 of PPE41mt and K235 of EspG5mt, D140 of PPE41mt and R109 of EspG5mt, and D144 of PPE41mt and R27 of EspG5mt are all replaced with hydrophobic residues in PPE4mt: either T137 or L138, V144, and L147 of PPE4mt, respectively.

Discussion
In this work, we present the first structure of the PE5mt-PPE4mt-EspG3mm trimer, which is from the ESX-3 system. Our structure is a mixed trimer, and we presented evidence that EspG3 from numerous mycobacterial species can bind the PE5mt-PPE4mt heterodimer. Conservation of the EspG3s used in this study ranged from 57-83% identity, yet an enrichment in conservation is observed within PPE4-interacting residues (Sup. Fig. 4). The ability of EspG3 from numerous mycobacterial species to bind PE5mt-PPE4mt suggests that the recognition mechanism is conserved within ESX systems across species. Overall, the PE5mt-PPE4mt interaction is similar to the previously reported PE-PPE-EspG trimers (12-14) in that PPE4mt's tip is solely interacting with EspG3mm and the general secretion motif of YxxxD/E, on PE5mt, is at the distal end of the PE5mt-PPE4mt heterodimer. In all copies of PE5mt, this motif is unstructured, as it is in the PE8-PPE15-EspG5 trimer (14), and similarly, W63 of PPE4mt is pointed away from this secretion motif. This arrangement is distinct from the PE25-PPE41-EspG5 trimers (12,13) and EspB, an ESX-1 substrate that has a similar structural fold to the PE-PPE heterodimers (20,21). PE8mt contains an expanded C-terminal domain, and since the secretion motif is located in the linker between the C-terminal domain and the PE domain, the orientation of the secretion motif was unclear (14). PE5mt does not have an expanded C-terminal domain and is just the conserved PE domain, yet its secretion motif is still unstructured in our trimer. Therefore, the exact significance of the structural variations in the ESX secretion motif is still unclear, and further work is needed.
Our structure is the first of EspG3 bound to a cognate PE-PPE heterodimer. In comparisons of the various published EspG3 structures, we identified two different forms that relate to the orientation of the C-terminal helical bundle, an 'open' and a 'closed' form. EspG3mm, when bound to the PE5mt-PPE4mt heterodimer, is in a conformation slightly different from the 'closed' form due to interactions with the tip of PPE4mt. We also found that the C-terminal helical bundle is a dynamic domain and shifts between the 'open' and 'closed' forms via a hinge movement (Fig. 4d). The functional significance of this domain movement could be two-fold. First, the plasticity of the C-terminal helical bundle could allow EspG3 to accommodate any variation in the ESX-3-specific PPE tips. While the tip of ESX-3-specific PPE proteins is mostly conserved (Sup. Fig. 3), there are some variations at the end of α5 that could alter the tertiary structure and thus slightly alter the interactions between the EspG3 chaperone and the PPE protein. Second, the movement of the C-terminal helical bundle could be critical to the release of the PE-PPE heterodimers at the ESX-3 secretion machinery. It is unlikely that PPE4mt could be removed from its interactions with EspG3mm without either movement of the C-terminal helical bundle or steric clashes with the C-terminal helical bundle. Movement of this helical bundle and release of PPE4mt would likely require energy input, and a candidate to provide that energy is EccA. EccA is an ATPase (2) and interacts with both EspG and PPE proteins in yeast two-hybrid experiments (13,22).
Recent structures of the ESX machinery from both ESX-3 (4) and ESX-5 (6) suggest overall six-fold symmetry of the core ESX machinery within the inner membrane. EccA could therefore act not only to provide the energy required to uncouple the PE-PPE heterodimers from their EspG chaperone but also to provide a platform for interaction with the core secretion machinery, as EccA is likely hexameric when functional.
Previous studies showed that each EspG recognizes only PE-PPE heterodimers from its cognate system (11,12). Despite the structures of two different PE-PPE-EspG trimers from ESX-5 (12-14), it was still unclear how EspG5 differentiates between cognate and non-cognate PE-PPE heterodimers. Our structure represents the first PE-PPE-EspG trimer from ESX-3 and allows for direct comparisons between the ESX-3 and ESX-5 trimers. Our structure reveals that PE5mt-PPE4mt interacts with EspG3mm at a different angle than was shown for either ESX-5 trimer. This difference in interaction angle presents a different face of PPE4mt to EspG3mm. We hypothesize that this is a conserved feature of the ESX-3 PPE-EspG3 interaction, as both characterized ESX-5 PE-PPE heterodimers (12-14) display the same face to EspG5 despite 33% sequence identity between PPE41 and PPE15. Therefore, we hypothesize that each ESX system has a unique shape complementarity between its subset of PPE proteins and their cognate EspG chaperone, and these unique shapes are likely not compatible with interaction with non-cognate chaperones. Our structure is also the first of an ESX-3-specific PE-PPE heterodimer. PE5mt-PPE4mt shares the same global conformation as the previously solved PE-PPE heterodimers, yet PPE4mt differs drastically in the loop between α5 and α6, which contains the hh motif. This longer, more extended loop reaches deeper into the cleft of EspG3mm and is consequently much more shielded from solvent. It is possible that the longer, extended loop conformation is a feature of ESX-3 PPE proteins and could play an essential role in EspG3 recognition.
In conclusion, we presented the first structure of a PE-PPE-EspG trimer from the ESX-3 system. This structure allowed us to compare the interactions of EspG3 and a cognate PPE protein to the previously described EspG5-PPE interactions. We hypothesize that shape complementarity is a key feature by which the EspG chaperones distinguish cognate from non-cognate PPE proteins.

Bacterial strains and growth conditions
The Escherichia coli Rosetta2(DE3) strains were grown in Luria-Bertani (LB) medium or on LB agar at 37 °C. When needed, antibiotics were included at the following concentrations: chloramphenicol at 10 μg/ml, streptomycin at 50 μg/ml, and kanamycin at 50 μg/ml.

Expression and purification of PE5-PPE4-EspG3 heterotrimers
Optimized DNA sequences based on the amino acid sequences of full-length PE5 and PPE4 residues 1-180 from M. tuberculosis were obtained from Invitrogen and cloned, using NcoI and HindIII restriction sites, into a pRSF-NT vector (23), which places an N-terminal His6 tag on PE5 that is cleavable by TEV protease. EspG3mt. The EspG3mm expression plasmid was constructed as described previously (12). Mutations in PPE4mt, EspG3mt, and EspG3mm were introduced with Gibson assembly mutagenesis (SGI-DNA).
Co-expression of all heterotrimers was performed as described previously (12). Briefly, E. coli strains containing the appropriate PE5mt-PPE4mt and EspG3 plasmids were induced with 0.5 mM IPTG when they reached an OD at 600 nm of 0.5-0.8 and then continued to shake at 16 °C for 20 h. Cells were harvested by centrifugation. Cells were then resuspended in lysis buffer (300 mM NaCl, 20 mM Tris pH 8.0, and 10 mM imidazole) and 1:100 Halt protease inhibitor cocktail (Thermo Fisher Scientific, Waltham, MA). Cells were lysed using an EmulsiFlex-C5 homogenizer (Avestin, Ottawa, ON, Canada). The soluble lysate was purified over a Ni-NTA column (G-Biosciences, St. Louis, MO). Eluted protein was dialyzed against lysis buffer without imidazole and incubated with 1:20 mg of TEV protease at 4 °C for 20 h before being re-applied to the Ni-NTA column. Flowthrough and washes were pooled and concentrated for size-exclusion chromatography over a Superdex 200 Increase 10/300 GL column (GE Healthcare Life Sciences, Marlborough, MA) that was equilibrated in buffer A (100 mM NaCl and 20 mM HEPES pH 7.5).

Crystallization, data collection, and structure solution
Purified protein was concentrated to 4.195 mg/ml. Initial screening was done using the MCSG Crystallization Suite (Anatrace, Maumee, OH). This initial screening produced the P212121 crystals, which were grown in 200 mM NH4 tartrate and 20% PEG 3350. Optimization around three other hits from the initial crystal screening, containing NaCl as the precipitant and various buffers ranging from pH 5.5 to 8.0, produced the I422 crystals, which were grown in 2.0 M NaCl and 100 mM bis-tris, pH 6.5. Crystals were transferred to cryoprotectant solution, which contained the crystallization solution supplemented with either 20% (P212121) or 25% (I422) glycerol, and then flash-cooled in liquid N2. Data were collected at the Southeast Regional Collaborative Access Team (SER-CAT) 22-ID beamline at the Advanced Photon Source, Argonne National Laboratory. Data were processed using XDS and XSCALE (24). Molecular replacement using Phaser (25) was used to solve the structure of both crystal forms. First, the PE25mt-PPE41mt dimer (PDB: 4KXR (12)) and EspG3mm (PDB: 5DLB (16)) were used as search models for the I422 dataset. Later, an early model of the I422 structure was used as a search model for the P212121 dataset. DENSITY MODIFICATION AND ORIGINAL MODEL. The starting model for both forms was then iteratively rebuilt and refined using Coot and phenix.refine (26,27). The final structure for both crystal forms was refined in phenix.refine, with the P212121 form using non-crystallographic symmetry restraints (TLS?). All data collection and refinement statistics are listed in Table 1. The final model was assessed for quality using Coot and the MolProbity server (28).

Size-exclusion chromatography multi-angle light scattering (SEC-MALS)
Proteins were expressed and purified as described and then passed over an AKTA pure with an inline Superdex 200 Increase 10/300 GL column (GE Healthcare Life Sciences), miniDAWN TREOS, and Optilab T-rEX (Wyatt Technologies, Santa Barbara, CA). The system was equilibrated and run in buffer A. Samples were loaded at a volume of 500 μL at a concentration of 2-4 mg/ml and the system was run at 0.5 mL/min. Analysis of light scattering data was done using Astra (Wyatt Technologies). Molecular weight determination was done by analyzing peaks at one-half their maximum. Graphics were prepared using Prism (Graphpad Software, La Jolla, CA).
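As a rough illustration of restricting a molecular weight estimate to the part of a peak above one-half of its maximum, the Python sketch below selects the chromatogram slices whose concentration signal is at least half of the peak maximum and returns the signal-weighted (weight-average) molar mass over those slices. This is not the Astra implementation. The function name, the assumption of a single baseline-subtracted peak, and the example values are all hypothetical.

```python
def mass_over_half_max(signal, molar_mass):
    """Weight-average molar mass over slices at or above half the peak maximum.

    signal     : baseline-subtracted concentration signal per slice
                 (for example, differential refractive index)
    molar_mass : molar mass computed for each slice (same length as signal)
    """
    if len(signal) != len(molar_mass) or not signal:
        raise ValueError("signal and molar_mass must be non-empty and equal length")
    peak = max(signal)
    if peak <= 0:
        raise ValueError("signal has no positive peak")
    threshold = peak / 2.0
    # Keep only slices within the half-maximum region of the peak.
    selected = [(s, m) for s, m in zip(signal, molar_mass) if s >= threshold]
    total_signal = sum(s for s, _ in selected)
    # Weight each slice's molar mass by its concentration signal.
    return sum(s * m for s, m in selected) / total_signal


# Hypothetical slices: masses in kDa; low-signal edges of the peak are excluded.
signal = [1.0, 6.0, 10.0, 6.0, 1.0]
masses = [40.0, 55.0, 56.0, 57.0, 70.0]
print(mass_over_half_max(signal, masses))
```

Excluding the low-signal edges of a peak reduces the influence of noisy slices, where the per-slice molar mass is least reliable.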

Sequence analysis
Sequence analysis was performed using the EMBL-EBI analysis tools, specifically the Clustal Omega program (29). Rendering of sequence analysis was done with the ESPript server (30).
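Percent identity values such as those quoted in this work are derived from aligned sequences. The Python sketch below shows one simple way to compute percent identity from two rows of an alignment, counting identical residues over the columns in which neither sequence has a gap. Alignment programs differ in the denominator they use, so this convention is an assumption and may not reproduce the values reported by Clustal Omega. The function name and the toy alignment are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over alignment columns where neither row has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Compare only columns in which both sequences have a residue.
    pairs = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a != gap and b != gap
    ]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


# Toy alignment (hypothetical): 3 identical residues out of 4 gap-free columns.
print(percent_identity("MK-LV", "MKALI"))
```

Reporting the denominator convention alongside an identity value makes comparisons between different alignment tools easier to interpret.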

SAXS Data Comparison and Ab Initio Model Reconstruction
PE5ms-PPE4ms-EspG3ms trimer SAXS data (SASDDX2 (16)) was compared to a single copy of the mixed PE5mt-PPE4mt-EspG3mm trimer structure (PDB code: 6UUJ) using CRYSOL (17). Ab initio reconstruction of the envelope was completed using GASBOR (33). Monomeric symmetry was used as a constraint for GASBOR. Twenty ab initio models were generated and averaged using the DAMAVER software package (34). DAMSEL rejected only one model.

Accession codes
Coordinates and structure factors were deposited in the Protein Data Bank with accession codes 6UUJ (P212121) and 6VHR (I422).