The structure of the endogenous ESX-3 secretion system

The ESX (or Type VII) secretion systems are protein export systems in mycobacteria and many Gram-positive bacteria that mediate a broad range of functions including virulence, conjugation, and metabolic regulation. These systems translocate folded dimers of WXG100-superfamily protein substrates across the cytoplasmic membrane. We report the cryo-electron microscopy structure of an ESX-3 system, purified using an epitope tag inserted with recombineering into the chromosome of the model organism Mycobacterium smegmatis. The structure reveals a stacked architecture that extends above and below the inner membrane of the bacterium. The ESX-3 protomer complex is assembled from a single copy of the EccB3, EccC3, and EccE3 and two copies of the EccD3 protein. In the structure, the protomers form a stable dimer that is consistent with assembly into a larger oligomer. The ESX-3 structure provides a framework for further study of these important bacterial transporters.


Introduction
Mycobacteria use a set of specialized secretion systems called ESX to transport proteins across their complex, diderm cell walls (Gröschel et al., 2016). Originally described as virulence factors in Mycobacterium tuberculosis (Guinn et al., 2004; Hsu et al., 2003; Lewis et al., 2003; Stanley et al., 2003), orthologs of ESX have since been discovered in most Gram-positive bacteria (Bottai et al., 2017), and are more generally referred to as Type VII secretion systems (Bitter et al., 2009). In mycobacteria there are five paralogous ESX operons (ESX-1 to ESX-5), each of which encodes an inner membrane translocon complex consisting of three conserved Ecc proteins: EccB, EccC, and EccD. A fourth protein, EccE, is conserved in all ESX operons except the ancestral ESX-4 operon and is also considered a part of the ESX translocon complex, as it copurifies with EccB, EccC, and EccD (Houben et al., 2012). All Type VII secretion systems translocate proteins in the WXG100-superfamily, which share a common two-helix hairpin structure, are found as homo- or heterodimers (Poulsen et al., 2014), and are mutually dependent for secretion with other substrates (Fortune et al., 2005). In contrast to substrates of the general secretory apparatus (Sec), ESX substrates have been shown to be secreted in their folded, dimeric state (Sysoeva et al., 2014).
Structural and functional information has been reported for truncated and isolated, soluble domains of the ESX translocon complexes and their homologs (Korotkova et al., 2015; Korotkova et al., 2014; Renshaw et al., 2005; Rosenberg et al., 2015; Strong et al., 2006; Wagner et al., 2016; Wagner et al., 2014; Wagner et al., 2013; Zhang et al., 2015; Zoltner et al., 2016). A low resolution, negative stain electron microscopy structure of ESX-5 shows a translocon complex assembled into a hexamer (Beckham et al., 2017). Structures of other proteins encoded in ESX operons, including secreted substrates (Ilghari et al., 2011), substrate chaperones (Ekiert and Cox, 2014), and the protease MycP (Solomonson et al., 2013), have been solved. Despite revealing important functional information about ESX, structures of overexpressed and isolated proteins are insufficient to understand the regulated secretion of fully folded substrates. We therefore undertook structural studies of an endogenously expressed ESX-3 complex from the model organism M. smegmatis using cryo-electron microscopy (cryo-EM). During the preparation of this work for publication, a similar structure of the ESX-3 system expressed from a plasmid was published by another group (Famelis et al., 2019).
The ESX-3 translocon complex is important for iron acquisition (Serafini et al., 2013; Siegrist et al., 2009), cell survival, and virulence in pathogenic mycobacteria (Tufariello et al., 2016), and its role in iron homeostasis is conserved in the model system, M. smegmatis (Siegrist et al., 2009). The ESX-3 translocon complex proteins are transcribed in a single operon (Li et al., 2017), and expression of the ESX-3 operon is dependent on the transcriptional regulator IdeR, which controls iron metabolism (Rodriguez et al., 2002) and is required for growth in the human pathogen M. tuberculosis (Pandey and Rodriguez, 2014). The ESX-3 operon is 67% identical between the non-pathogenic model organism M. smegmatis and the pathogen M. tuberculosis over the 4354 amino acids of the ESX-3 operon. This high degree of conservation and essential role in cell growth make ESX-3 an important candidate for small molecule inhibition (Bottai et al., 2014), as blockade of ESX-3 will both inhibit virulence in M. tuberculosis and kill a broad range of pathogenic mycobacteria.

Results
A major innovation made possible by the dramatic improvements in cryo-EM (Cheng, 2018) is the ability to examine challenging protein samples at atomic resolution, even when samples are only available at low concentrations. When coupled with recent genetic manipulations that allow for facile insertion of chromosomal epitope and purification tags (Murphy et al., 2018), cryo-EM now holds the promise of routine structural characterization of many endogenously expressed protein complexes not previously tractable by structural techniques. We undertook the purification of the ESX-3 complex from the native host without the need for overexpression. To facilitate purification of the endogenous translocon complex, a cleavable EGFP tag was inserted into the chromosome of M. smegmatis mc²155 (wild type) and ΔideR (Dussurget et al., 1996) strains at the C-terminus of EccE3 via the ORBIT method (Murphy et al., 2018) (Figure 1A, Figure 1-figure supplement 1). EccE3 is the final gene in the 11-gene ESX-3 operon, making insertion at this site less likely to disrupt regulation and expression of the operon. Deletion of the gene for the iron acquisition regulator IdeR greatly increases chromosomal expression of ESX-3, from negligible amounts of protein to a yield sufficient for purification and structure determination (Figure 1-figure supplement 2A). Components of ESX-3 were pulled down using an anti-GFP nanobody (Rothbauer et al., 2008) and the EGFP tag was proteolytically cleaved. After size exclusion chromatography, the peak fractions were pooled and analyzed biochemically and by cryo-EM.

Global structure of the ESX-3 dimer
Four components of the ESX-3 translocon complex, EccB3, EccC3, EccD3 and EccE3, were stably affinity-purified as a large molecular weight species of about 900 kDa (Figure 1B and C, Figure 1-figure supplement 2B). The sample was imaged by cryo-EM and reconstructed, revealing a dimeric structure, which can be divided into four areas: the flexible periplasmic multimerization domain, the stable transmembrane region, the stable upper cytoplasmic region, and the flexible lower cytoplasmic motor domain (Figure 1D, Table 1). While the peak fraction does not contain particles of a larger size consistent with higher order oligomers, thorough examination of the void volume revealed a small number of particles in a higher oligomeric state [...] (Figure 1-figure supplement 6F). The highest resolution maps for each region were combined and filtered to the threshold of the lowest resolution map to form an overall 10 Å combination map for the entire ESX-3 dimer.
Figure 1. [...] Genomic deletion of ideR derepresses ESX-3 to boost expression for purification. (B) SDS-PAGE of purified ESX-3 shows four major bands corresponding to EccB3, EccC3, EccD3, and EccE3. (C) Blue native PAGE of the purified ESX-3 complex shows a large molecular weight band around 900 kDa. (D) Merged maps of all focused refinement maps (gray transparency) of the ESX-3 dimer filtered to 10 Å resolution. The transmembrane and upper cytoplasmic focused maps (3.7 Å) are segmented by subunit showing one copy per protomer of EccB3 (pink), EccC3 (blue), EccE3 (orange), EccD3-bent (yellow), and EccD3-extended (green). (E) Atomic models of the transmembrane and upper cytoplasmic regions. (F) A combined map of the full complex filtered to 10 Å resolution (gray transparency) with full models for each protein, EccD3-bent (yellow), EccD3-extended (green), EccE3 (orange), EccC3 (blue), and EccB3 (pink).
The ESX-3 dimer comprises ten proteins in total: two copies each of EccB3, EccC3, and EccE3 and four copies of EccD3. Two pseudo-symmetric protomers, referred to as i and ii, combine to form the ESX-3 dimer. Each protomer contains one copy of EccB3, EccC3, and EccE3 and two conformationally distinct copies of EccD3, referred to as EccD3-bent and EccD3-extended (Figure 1E) based on their highly asymmetric conformations. At 3.7 Å resolution, it was possible to build de novo atomic models for all observable amino acids in the transmembrane and upper cytoplasmic regions, except the two transmembrane helices of EccC3 (Figure 1E and Supplementary file 1). The lower resolution regions of density, namely the EccC3 transmembrane helices, the EccC3 ATPase 1, 2, and 3 domains, and the EccB3 periplasmic domain, were flexibly fit using homology models. Using this hybrid approach, a model of the entire ESX-3 dimer has been produced (Figure 1F).

EccD3 forms a homodimer that encloses a large hydrophobic cavity
There are two copies of EccD3 in each ESX-3 protomer (Figure 2A). The ubiquitin-like N-terminal domain of each EccD3 molecule interacts with EccE3 and EccC3 in the cytoplasm, and a long linker joins the soluble domain of EccD3 to 11 transmembrane helices (Figure 2B and Figure 2-figure supplement 1A-F). The four EccD3 molecules account for 44 of the total 54 transmembrane helices observed in the ESX-3 dimer. A distinct transmembrane cavity is formed by dimerization of the two copies of EccD3 in each protomer, with cross-sectional dimensions of ~20 × 30 Å and without significant regions of constriction. Transmembrane helices 1, 9 and 10 interact across the cavity dimer interface in a tight bundle, making passive lipid transport into the membrane from the cavity unlikely (Figure 2C). The inner surface of the periplasmic half of the cavity is composed primarily of hydrophobic residues and, in our maps, eight extended densities consistent with hydrophobic lipid tails or detergent molecules line the periplasmic inner face of the cavity (Figure 2C). In contrast, the cytoplasmic face of the cavity has several polar residues, and ordered hydrophobic densities are not visible.
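The stoichiometry quoted above can be checked with a few lines of arithmetic. The sketch below is illustrative only: it tallies the per-protomer copy numbers and transmembrane helix counts stated in this and the neighbouring sections (11 helices per EccD3, two for EccC3, two for EccE3, and one single-pass helix for EccB3). The dictionary layout and function name are assumptions introduced for the example.

```python
# Per-protomer copy number and transmembrane (TM) helix count per component,
# taken from the text. The data layout and names here are illustrative.
PROTOMER = {
    "EccB3": {"copies": 1, "tm_helices": 1},   # single-pass TM helix
    "EccC3": {"copies": 1, "tm_helices": 2},
    "EccD3": {"copies": 2, "tm_helices": 11},  # EccD3-bent and EccD3-extended
    "EccE3": {"copies": 1, "tm_helices": 2},
}
PROTOMERS_PER_DIMER = 2


def dimer_totals(protomer=PROTOMER, n_protomers=PROTOMERS_PER_DIMER):
    """Return (total proteins, total TM helices, EccD3 TM helices) for the dimer."""
    proteins = n_protomers * sum(c["copies"] for c in protomer.values())
    helices = n_protomers * sum(
        c["copies"] * c["tm_helices"] for c in protomer.values()
    )
    eccd = protomer["EccD3"]
    eccd_helices = n_protomers * eccd["copies"] * eccd["tm_helices"]
    return proteins, helices, eccd_helices


proteins, helices, eccd_helices = dimer_totals()
print(proteins, helices, eccd_helices)  # 10 54 44
```

The totals reproduce the ten proteins per dimer and the 44 of 54 transmembrane helices attributed to EccD3.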

In the cytoplasm below the membrane, a stable, upper cytoplasmic region is formed by interactions between the soluble domains of EccD3-bent, EccD3-extended, EccE3, and EccC3. The linker joining the cytoplasmic ubiquitin-like domain to transmembrane helix 1 (residues 100-127) of EccD3 conserves a high sequence identity throughout evolution (Ashkenazy et al., 2016), yet it adopts two distinct secondary structures, resulting in the asymmetric placement of the cytoplasmic domains of EccD3 (Figure 2D and Figure 2-figure supplement 1A-G). In EccD3-bent, residues 100-127 are bent, folding into an α-helix and forming a nexus of stabilizing contacts with EccB3 and EccC3 (Figure 2-figure supplement 1H). In EccD3-extended, residues 100-127 are extended and fold into a shorter α-helix that interacts with EccE3 and the cytoplasmic domain of EccD3-bent (Figure 2-figure supplement 1I). This conformational flexibility suggests that if residues 100-127 were released from their associations with EccC3 and EccE3, they could rearrange into the alternative bent or extended conformation with little energetic barrier.

EccC3 and EccE3 make extensive, stabilizing interactions with the asymmetric, cytoplasmic domains of EccD3
The next component of the stable upper cytoplasmic region is EccE3. EccE3 is positioned at the front of the ESX-3 dimer (Figure 3A), where the conserved transmembrane helix 1 of EccE3 interacts with helix 11 of EccD3-bent in the membrane. Helix 1 is followed by a second EccE3 transmembrane helix, a linker helix, and then extends into the cytoplasm (Figure [...], otherwise the protomers are superimposable. All four EccC3 transmembrane helices were modeled at 6 Å resolution through a combination of homology modeling and molecular dynamics. In protomer i, transmembrane helix 2 forms lipid-mediated hydrophobic interactions with the transmembrane helix of EccB3 in protomer i, and transmembrane helix 1 interacts with transmembrane helix 2 of EccC3 in protomer ii. Transmembrane helix 1 of EccC3 in protomer ii is shifted relative to the protomer i conformation and does not directly interact with other proteins.

The EccC3 motor domains are flexible and asymmetric across the dimer
The motor domains containing the EccC3 ATPase 1, 2 and 3 hang below the DUF domain in the flexible lower cytoplasmic region. They were resolved at a lower resolution than the upper cytoplasmic domain, but they are clearly asymmetric between protomers i and ii (Figure 4A). Although the EccC3 ATPase 1 domains in both protomers are in a similar location relative to the DUF, the ATPase 2 and 3 domains do not superimpose across protomers even at low resolution (Figure 4-figure supplement 2C), suggesting significant asymmetry between these domains. In protomer i, a homology model based on existing EccC structures fits well into the density; however, in protomer ii, the interface between ATPase 1 and 2 is broken relative to the crystal structure, with ATPase 2 and ATPase 3 rotated away from the crystal structure interface.

EccB3 extends into the periplasm and stabilizes dimer formation
The ESX-3 dimer is stabilized by cross-protomer interactions formed by the two EccB3 proteins. EccB3 begins in the cytoplasm with a flexible N-terminal tail leading into a linker helix, followed by a single-pass transmembrane helix, and an extended periplasmic domain (Figure 5A [...] protomer contacts with EccB3, EccC3, EccD3-bent and EccD3-extended from protomer ii (Figure 5C, Supplementary file 3). The linker helix of EccB3 forms further protein-protein interactions with EccC3 and EccD3-bent. The transmembrane helix of EccB3 interacts with transmembrane helix 11 of EccD3-extended. Two hydrophobic tails consistent with lipid or detergent molecules link the transmembrane helix of EccB3 to transmembrane helix 2 of EccC3 (Figure 5D). The two EccB3 periplasmic domains share a large interaction interface across the protomers, further stabilizing dimerization. Homology models of two EccB3 proteins can be docked into the periplasmic domain (Figure 5-figure supplement 1C); however, this region is not resolved sufficiently to identify specific interactions. The majority of cross-protomer interactions involve EccB3, suggesting the periplasmic domain is essential for oligomerization.

A hexameric model of ESX-3
Previous reports have shown ESX-1 and ESX-5 form hexamers or higher order multimers (Beckham et al., 2017; Houben et al., 2012). We modeled a higher order oligomeric state of ESX-3 based on the low-resolution negative stain structure of ESX-5, which had C6 symmetry imposed [...] protomers in the trimer of dimers model alternates between 72° (the angle between protomers i and ii in the ESX-3 dimer map) and 48° and contains both experimentally observed conformations of EccC3. The complete model of the ESX-3 translocon complex was docked into the ESX-5 negative stain map in the same manner, revealing major clashes between the low resolution periplasmic and motor domains in a hexameric form (Figure 6-figure supplement 1C). Accommodation of a hexameric complex would require extensive rearrangement of both EccC3 and EccB3.
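The geometry of the trimer of dimers model can be illustrated with a short calculation. The sketch below assumes only the two angles given in the text (72° within a dimer, 48° between neighbouring dimers) and shows that three such dimers close a full ring, in contrast to the uniform 60° spacing of a C6-symmetric hexamer. The function and constant names are hypothetical.

```python
from itertools import accumulate

INTRA_DIMER_DEG = 72.0  # angle between protomers i and ii (from the text)
INTER_DIMER_DEG = 48.0  # angle between adjacent dimers (from the text)
N_DIMERS = 3


def protomer_angles(intra=INTRA_DIMER_DEG, inter=INTER_DIMER_DEG, n_dimers=N_DIMERS):
    """Return (angular positions of the protomers, total angle swept by the ring)."""
    steps = [intra, inter] * n_dimers
    # Cumulative angles; the final step returns to the starting protomer.
    positions = [0.0] + list(accumulate(steps))[:-1]
    return positions, sum(steps)


positions, closure = protomer_angles()
print(positions)  # [0.0, 72.0, 120.0, 192.0, 240.0, 312.0]
print(closure)    # 360.0

# For comparison, an ideal C6 ring has equal 60 degree steps.
c6_positions, c6_closure = protomer_angles(intra=60.0, inter=60.0)
print(c6_positions)  # [0.0, 60.0, 120.0, 180.0, 240.0, 300.0]
```

Alternate protomers in the two arrangements differ by 12°, which is one way to see why the dimer cannot be placed into a C6-symmetric envelope without rearrangement.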

Discussion
The ESX-3 structure presented here is purified without the addition of substrates or nucleotide. It is therefore likely to be in a conformation representing the end of the translocation cycle, awaiting either the direct binding of substrates, the binding of nucleotide or both, to reset a substrate-binding competent state. By fitting our dimer structure into a prior low resolution envelope we suggest a model of the oligomeric state of the complex, in close agreement with Famelis et al. However, even allowing for major rearrangements in the more flexible regions of EccC3 and EccB3, a model built by the static trimerization of the experimentally determined dimer structure cannot itself explain the mechanism of action of ESX-3 secretion. The existence of an R-finger catalytic site for ATPase 1 of EccC3 (Rosenberg et al., 2015) requires the R-finger of one protomer to insert into the active site of another protomer. Given the ~65 Å distance we observe between ATPase 1 domains, the completion and activation of the catalytic site of ATPase 1 by an R-finger will necessitate an extremely large rearrangement of the position of the ATPase domains. How might this rearrangement occur? We propose movements in the highly flexible EccD3 linker lead to the release of EccE3 and EccC3 from their rigid positions, thus allowing for a rearrangement of the ATPase domains into an active conformation (Figure 6).
Once EccC3 assembles into an oligomeric state, the substrate proteins will need to translocate through the inner membrane. We have considered two models for how pore formation and transit might occur: 1) through a pore created by the oligomerization of EccC3 and EccB3 or 2) through the large cavity created by the dimerization of EccD3. In the first model (Figure 6A), the resting state of an ESX translocon complex is a hexamer, with disordered EccC3 ATPase domains free in the cytosol, stabilized by interactions with proteins not seen in the structures presented here (van Winden et al., 2016). It is possible the rare multi-dimer oligomeric state we see in the void volume, and also seen by Famelis et al., represents this state. In a hexamer model, the center of the multimer is formed by the transmembrane helices of EccC3 and EccB3, which create a cavity that could serve as a pore for translocation of substrates. These transmembrane helices are largely hydrophobic and do not contain obvious residues that would allow for the conductance of hydrated substrates. Thus the production of a protein transit channel would require either a large, conformational change in the transmembrane helices, likely facilitated by movements in the cytoplasmic domains of EccC3, EccD3 and EccE3, or a novel mechanism of action for transit through the central pore.
A hexameric pore created by EccC3 agrees well with the documented mechanism of action for motor ATPases in the additional strand catalytic E (ASCE) division of P-loop NTPases (Erzberger and Berger, 2006), which includes EccC3. A hexameric pore also agrees with the proposed mechanism of action for other bacterial secretion systems, such as the Type IV secretion system VirD4 coupling protein (Gomis-Rüth et al., 2001; Hormaeche et al., 2002), which is related evolutionarily to EccC3 (Iyer et al., 2004). The hexamer model is thus firmly grounded in the motor ATPase and bacterial secretion systems literature, although the oligomeric state of VirD4 has recently been called into question (Redzej et al., 2017) and remains controversial (Llosa and Alkorta, 2017).
In a second, more speculative model, EccD3 dimers form a channel for translocation of substrates (Figure 6B). The large cavity found in the EccD3 dimer is striking and, by structural homology, is unlike any other membrane protein in the Protein Data Bank. In our density maps, the EccD3 dimer cavity appears capped on the periplasmic side by a dense layer of lipids. In contrast, on the cytoplasmic side the cavity does not exhibit bound lipids due to the polar residues lining the lower half of the cavity. The large cavity is of sufficient diameter to transit a folded EsxG/H dimer; however, given the strong hydrophobicity of the cavity, the mechanism would not be mediated by water and would require a novel mechanism of secretion that has not been seen in other bacterial secretion systems. It is also possible that the cavity exists to transit a non-protein substrate such as a specific mycobacterial lipid. The ability to transport non-protein substrates could resolve some of the mysteries that remain about the relationships between ESX systems, cell wall stability, lipid content, and nutrient acquisition (Barczak et al., 2017; Bosserman and Champion, 2017; Siegrist et al., 2014; Tufariello et al., 2016).
As each protomer contains an EccD3 cavity, the second model, proposing translocation through EccD3, does not require hexamerization. However, this model is also not incompatible with hexamerization, which would not block a substrate path through EccD3. Further, the role of the hexamer may not be to form a central channel for substrate transit. Rather, hexamerization could serve some other purpose. For example, it may tether functional dimers together, facilitate localization, or increase local concentration and allosteric control of enzymatic activity (Kuriyan and Eisenberg, 2007).
Although the ESX-3 structure presented here allows for mechanistic hypotheses about the transit of substrates across the inner membrane, it does not provide sufficient information to allow for a structural model of transit across the outer mycomembrane. The EccB3 periplasmic domain (Wagner et al., 2016) has been found to have similarity to the peptidoglycan binding phage protein PlyCB, which forms a ring inside the bacterial cell wall, facilitating phage entrance into the cell. One hypothesis is that EccB3 is anchored to a larger outer membrane complex, but the purification conditions we have employed remove proteins required for its stabilization. The description of ESX-3 presented here agrees in both protein composition and structure with the work recently published by Famelis et al. Given the high sequence conservation among the ESX systems in mycobacteria and related actinobacteria, these structures likely represent an excellent framework for the structural modeling of the other ESX systems. Together, these structures provide a wealth of information about protein-protein interaction interfaces and ESX complex architecture, which can be used to guide structure-based drug design and to generate hypotheses for further mechanistic investigations.

Materials and methods
Key resources

Strain construction
Mycobacterium smegmatis mc²155 (wild type) and ΔideR cells were chromosomally tagged using the ORBIT protocol (Figure 1-figure supplement 1). For wild type cells, the integrase and annealase expressing plasmid was pKM444. For recombineering in the ΔideR strain, which already contained a kanamycin resistance marker, we created a modified pKM444 plasmid with a zeocin resistance cassette inserted at the EcoIV restriction site. The tagging plasmid was pKM468 with a 3C protease cleavage site added before the EGFP tag. The targeting oligo had the sequence: 5' TGTGCG TTCCACTGGTTCCCCGGCAACCACCTGCTGCACGTGAGCCAGCCGGACTACCTAGGTTTGTACCG TACACCACTGAGACCGCGGTGGTTGACCAGACAAACCCGCCGGATGACCCGCTTCC TGCGCGGCTTCATGTTCGACTGAACCCTTCACCGAGGTCCG 3'. M. smegmatis cells containing pKM444 were grown in an overnight liquid culture and induced for annealase and integrase expression. Cells were prepared for electroporation and electroporated with the targeting oligo and tagging plasmid. The transformed M. smegmatis were plated on 7H9 plates containing hygromycin (wild type) or hygromycin and kanamycin (ΔideR) and incubated at 37°C for 3 days. Colonies were verified for insertion of the tagging plasmid into the chromosome by PCR.

Western blotting
100 mL of EccE3-tagged wild type and ΔideR knockout cells were grown overnight to an OD600 of 1.0-1.2. Cells were pelleted and resuspended in 1 mL of buffer (50 mM Tris-HCl pH 8.0, 150 mM NaCl, 1% DDM) and sonicated for 30 s. Cell lysates were run on a 4-20% SDS-PAGE gel (GenScript) and transferred to PVDF membrane (BioRad) using a BioRad Trans-Blot Turbo Transfer System. The blot was washed with PBS and blocked in a 5% milk/PBS-T solution for 1 hr. The blot was incubated with mouse anti-GFP monoclonal antibody (Roche) overnight. After rinsing with PBS-T, the blot was incubated with anti-mouse IgG HRP-conjugated antibody (R&D Systems) for 2 hr. After activation (Amersham) the blot was imaged on a BioRad ChemiDoc. The blot was stripped with stripping buffer (ThermoFisher Scientific) as per the manufacturer's instructions, and incubated overnight with rabbit anti-GroEL monoclonal antibody (Sigma-Aldrich). The blot was incubated with goat anti-rabbit IgG antibody HRP (GenScript) for 2 hr, activated (Amersham), and imaged on a BioRad ChemiDoc.
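Oligo sequences copied from a typeset article often carry stray spaces or line breaks, as the targeting oligo printed above does. The following is a minimal, hypothetical helper (not part of the ORBIT protocol or any published tool) that strips whitespace, checks that only A, C, G and T remain, and reports length and GC fraction before a sequence is ordered. It is demonstrated on a short toy sequence rather than the real oligo.

```python
def clean_oligo(raw: str) -> str:
    """Remove all whitespace, upper-case, and validate a DNA oligo string."""
    seq = "".join(raw.split()).upper()
    unexpected = set(seq) - set("ACGT")
    if unexpected:
        raise ValueError(f"unexpected characters in oligo: {sorted(unexpected)}")
    return seq


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in an already cleaned sequence."""
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


# Toy example with line-wrap artifacts (illustrative, not the paper's oligo).
toy = clean_oligo("tgtgcg TTCC\nACTGG")
print(toy, len(toy), round(gc_fraction(toy), 2))
```

Passing the full targeting oligo through `clean_oligo` would give a single contiguous string suitable for synthesis orders or alignment against the target locus.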

Protein purification
Purification for high resolution structural determination: M. smegmatis was grown in 6 L of 7H9 supplemented with 0.05% Tween 80 and 20 µg/mL kanamycin to an OD600 of ~0.8. After harvest, cells were washed three times with PBS and frozen in liquid nitrogen before lysis with a cryogenic grinder (SPEX SamplePrep). 24.9 g of powdered cell material was resuspended by adding 56.3 mL of 50 mM Tris-HCl pH 8.0, 150 mM NaCl, 1% DDM supplemented with 1X protease inhibitor cocktail (Sigma-Fast) and 224 units Benzonase endonuclease. The suspension was stirred for 120 min at 4°C. After centrifugation for 30 min at 98,000 x g, the supernatant was incubated with 1.4 mL anti-GFP-nanobody resin for 110 min at 4°C. The resin was transferred to a column and washed sequentially with 28 mL of wash buffer (50 mM Tris-HCl pH 8.0, 150 mM NaCl, and 0.1% GDN), 14 mL of high salt wash buffer (50 mM Tris-HCl pH 8.0, 400 mM NaCl, and 0.1% GDN), and 14 mL of wash buffer (50 mM Tris-HCl pH 8.0, 150 mM NaCl, and 0.1% GDN). To cleave off the purification tag, the resin was incubated overnight at 4°C with 70 units Pierce HRV 3C protease (Thermo Scientific Pierce) in 2.8 mL wash buffer supplemented with 0.2 mM DTT. The resin was sedimented by gentle centrifugation (300 x g for 3 min), the supernatant was collected, and the resin was subsequently washed with 1.4 mL wash buffer. The supernatant and wash fraction were combined and concentrated using an Amicon Ultra-4 centrifugal filter unit with a 100 kDa molecular weight cut-off. The sample was centrifuged at 16,000 x g before injection on a Superose 6 10/300 column equilibrated in 50 mM Tris-HCl pH 8.0, 150 mM NaCl and 0.021% GDN. Peak fractions were concentrated using a 0.5 mL centrifugal filter unit (Amicon, 100 kDa cut-off) to an A280 of 5.52 by Nanodrop reading in ~30 µL. The purification used to examine the void fractions was similar, except that volumes were scaled for a powder weight of 21.1 g and the high salt wash was omitted.
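The text states that volumes for the void-fraction purification were scaled to the powder weight but does not list the scaled values. As a sketch, assuming the scaling was linear in powder mass (an assumption, not stated in the source), the reference volumes above can be rescaled as follows. The step names and the function are hypothetical labels for this example.

```python
REFERENCE_POWDER_G = 24.9  # powder mass used in the high-resolution prep

# Volumes (mL) quoted in the text for the 24.9 g preparation.
REFERENCE_VOLUMES_ML = {
    "lysis_buffer": 56.3,
    "nanobody_resin": 1.4,
    "wash_buffer_1": 28.0,
    "high_salt_wash": 14.0,
    "wash_buffer_2": 14.0,
    "cleavage_buffer": 2.8,
    "resin_rinse": 1.4,
}


def scale_volumes(powder_g, omit=()):
    """Scale each reference volume linearly with powder mass (assumed rule).

    Steps named in `omit` are dropped, e.g. the high salt wash.
    """
    factor = powder_g / REFERENCE_POWDER_G
    return {
        step: round(volume * factor, 1)
        for step, volume in REFERENCE_VOLUMES_ML.items()
        if step not in omit
    }


void_prep = scale_volumes(21.1, omit=("high_salt_wash",))
for step, volume in void_prep.items():
    print(f"{step}: {volume} mL")
```

Keeping the reference values in one table makes the buffer-to-powder ratio (about 2.26 mL of lysis buffer per gram) explicit and easy to reuse for other batch sizes.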

Blue-Native polyacrylamide gel electrophoresis (BN-PAGE)
BN-PAGE experiments were carried out using the Invitrogen NativePAGE Novex Bis-Tris Gel system as recommended by the manufacturer. Samples were prepared in a total volume of 10 µL using 0.5 µL 5% G-250 sample additive. Electrophoresis was performed at a constant voltage of 105-120 V for 2-3.5 hr at 4°C. The gel was fixed and stained using the Pierce silver stain kit.

Cryo-EM data acquisition
Samples were frozen for cryo-EM. Quantifoil R1.2/1.3, 400 mesh, copper grids were glow discharged using a Solarus plasma cleaner (Gatan) with an H2/O2 mixture for 30 s. 2 µL of sample was applied per grid and the grids were plunged into liquid ethane using a FEI Vitrobot Mark IV.
Initially, samples were screened, and test data sets were collected on a FEI Talos Arctica 200 kV microscope equipped with a Gatan K2 Summit detector. For the initial screen of freezing conditions, 2499 movies were collected at a magnification of 36,000 with a pixel size of 1.14 Å, a defocus range of −1.5 to −2.5 µm, an exposure time of 9 s, and a dose rate of 7 electrons/Å²/second (Table 1). Data for the final structure presented in the main text were collected on a FEI Titan Krios at 300 kV with a Gatan K2 Summit detector. Two imaging sessions were used. In the first imaging session, 2705 movies were collected at a magnification of 29,000 with a pixel size of 0.82 Å, a defocus range of −0.4 to −1.2 µm, an exposure time of 10 s to collect 100 total frames, and a dose rate of 8 electrons/Å²/second (Table 1). In the second imaging session, on the same microscope with the same detector, 4632 movies were collected at a magnification of 29,000 with a pixel size of 0.82 Å, a defocus range of −0.6 to −1.4 µm, an exposure time of 10 s to collect 80 total frames, and a dose rate of 6.7 electrons/Å²/second. Data used to analyze the void, plateau, and peak regions of the SEC profile were collected on a FEI Talos Arctica at 200 kV with a Gatan K3 detector. All micrographs were collected at a magnification of 28,000 with a pixel size of 0.9 Å, a defocus range of −1.5 to −2.5 µm, and an exposure time of 11.7 s to collect 117 total frames at a total dose of 58 electrons/Å². For the void region, 1215 micrographs were collected.
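The collection parameters above imply a total dose and a dose per frame for each session. The sketch below works these out, taking the dose rates at face value as electrons/Å²/second as printed in the text. The helper names are illustrative and are not drawn from any acquisition software.

```python
def total_dose(dose_rate_e_per_a2_s: float, exposure_s: float) -> float:
    """Total dose (e/Å²) from dose rate (e/Å²/s) and exposure time (s)."""
    return dose_rate_e_per_a2_s * exposure_s


def dose_per_frame(total_e_per_a2: float, n_frames: int) -> float:
    """Average dose per movie frame (e/Å²/frame)."""
    if n_frames <= 0:
        raise ValueError("n_frames must be positive")
    return total_e_per_a2 / n_frames


# (label, dose rate in e/Å²/s, exposure in s, frames or None if not stated)
sessions = [
    ("Arctica K2 screen", 7.0, 9.0, None),
    ("Krios session 1", 8.0, 10.0, 100),
    ("Krios session 2", 6.7, 10.0, 80),
]

for label, rate, exposure, frames in sessions:
    dose = total_dose(rate, exposure)
    line = f"{label}: {dose:.1f} e/A^2 total"
    if frames is not None:
        line += f", {dose_per_frame(dose, frames):.2f} e/A^2 per frame"
    print(line)

# The K3 data set is reported as a total dose rather than a rate.
print(f"Arctica K3: {dose_per_frame(58.0, 117):.2f} e/A^2 per frame")
```

Under these assumptions the two Krios sessions correspond to roughly 80 and 67 electrons/Å² in total, or about 0.8 electrons/Å² per frame in both cases.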

Cryo-EM data processing
For all data, movies were motion corrected using MotionCor2 (Zheng et al., 2017) and CTF correction was performed using CTFFIND4 (Rohou and Grigorieff, 2015). For the Arctica dataset, particles were picked using a Gaussian blob in either RELION (Zivanov et al., 2018) or cisTEM (Grant et al., 2018) and initial 2D classification was performed to remove obvious artifactual particles. Initially, a shotgun approach was taken to generate several initial models using RELION, cisTEM, and cryoSPARC (Punjani et al., 2017). Once an initial model that contained realistic low-resolution features was generated, a user-defined gradient descent was performed to improve the model, with the goal of achieving accurate secondary structure features. First, all particles selected during 2D classification were refined in 3D against the randomly generated initial model. Second, a round of 3D classification with four classes and default RELION settings was performed and the best class selected. Third, the best class was refined as a single class in 3D classification with increasing Tau2_Fudge and decreasing search angle size. The resulting EM density map had clear transmembrane helix densities and was used as the model for a new 3D reconstruction. This reconstruction was used to back project models for reference-based particle picking in RELION. Two rounds of 2D classification were performed and the best classes selected. One round of 3D classification was performed using the Tau2_Fudge value optimized during the previous run through (T = 12) and the best class selected. A final 3D reconstruction of the Arctica data set yielded a map of about 4.7 Å resolution (Figure 1-figure supplement 4).
After motion correction and CTF determination, the final Titan Krios dataset was processed entirely using RELION. Particles were picked using a Gaussian blob and extracted as 4x binned particles. Two rounds of initial 2D classification were performed with T = 3 on the binned particles and obvious artifactual particles were removed. The final reconstruction from the Arctica dataset was used as the initial model for a 3D reconstruction of the binned particles. 3D classification with four classes and the previously optimized Tau2_Fudge value, T = 12, was performed on the binned particles. The two best classes were selected and re-extracted without binning. A 3D reconstruction was performed. A mask was created for the high-resolution region of the reconstruction and 3D classification without image alignment was performed focused on this region. The best class was selected and the subsequent 4.0 Å reconstruction is the consensus structure for the entire complex (Figure 1-figure supplement 5). Focused classification of each protomer, the periplasmic EccB region, and the ATPase 1, 2, and 3 domains of EccC was performed. To perform focused classification, the center of mass of the region of interest was determined using Chimera (Pettersen et al., 2004). Particles were recentered on this area and re-extracted. Masks for the region of interest were generated and 3D classification without image alignment was performed. The best class was selected and used for a focused 3D reconstruction without image alignment of the region of interest. A reconstruction was generated and density outside of the region of interest was subtracted. A final reconstruction of the masked and density subtracted particles was then performed. This procedure improved the resolution to 3.75 Å for protomer i, 3.83 Å for protomer ii, 5.8 Å for the EccB3 periplasmic domain, and ~7 Å for the EccC3 lower cytoplasmic region.
To generate the symmetry expanded protomers based on non-point group symmetry (also known as non-crystallographic symmetry or NCS), a transformation matrix between the two protomers was calculated using Chimera. Particles were then transformed and aligned using the subparticles.py and star.py utilities in pyem (Asarnow et al., 2019), resulting in a particle stack with twice as many particles as the input file, each focused on protomer i or protomer ii. Density subtraction was performed to remove density outside of the symmetry expanded protomer, and focused classification and refinement were performed as described above. This procedure improved the resolution of the symmetry expanded protomer to 3.69 Å.
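The binning step in this workflow sets a hard limit on achievable resolution, which is why particles were re-extracted without binning before the final reconstructions. The sketch below illustrates the arithmetic, assuming the 0.82 pixel size quoted for the Krios data is in Å per pixel and using the standard Nyquist relation (the finest resolvable detail is twice the pixel size). The function names are illustrative.

```python
def binned_pixel_size(pixel_size_a: float, binning: int) -> float:
    """Effective pixel size (Å/pixel) after binning by an integer factor."""
    if binning < 1:
        raise ValueError("binning factor must be >= 1")
    return pixel_size_a * binning


def nyquist_limit(pixel_size_a: float) -> float:
    """Best resolution (Å) representable at a given pixel size."""
    return 2.0 * pixel_size_a


KRIOS_PIXEL_A = 0.82  # assumed to be Å/pixel

for binning in (4, 1):
    pixel = binned_pixel_size(KRIOS_PIXEL_A, binning)
    print(
        f"{binning}x binning: {pixel:.2f} A/pixel, "
        f"Nyquist limit {nyquist_limit(pixel):.2f} A"
    )
```

At 4x binning the limit is about 6.6 Å, adequate for 2D and initial 3D classification but not for the 4.0 Å consensus map or the 3.7 Å focused maps, which require the unbinned limit of about 1.6 Å.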

Atomic model building
The cytoplasmic domain from the crystal structure of EccD1 (PDB 4KV2) was docked into the cytoplasmic domains of the two EccD3 molecules and the sequence was mutated. The remaining transmembrane domains of EccD3 and residues 14-93 of EccB3 were built de novo in Coot (Emsley et al., 2010) using baton building. The alpha helices of EccE3 and EccC3 were initially modeled using the RaptorX (Källberg et al., 2012) homology server. The loops and strands of EccE3 and EccC3 were built in Coot using baton building. All models were subsequently refined individually, as a symmetry expanded protomer, as left and right protomers, and as the full model using phenix.real_space_refine (Afonine et al., 2018), Coot, and the MDFF (Trabuco et al., 2009) server, Namdinator (Kidmose et al., 2019; Supplementary file 1).

Low resolution modeling
The left and right protomer maps, the periplasmic focused refined map, and the lower cytoplasmic focused refined map were all docked into the consensus map and added together using Chimera. The combined map was filtered to 10 Å resolution to match the lowest resolution component. Homology models for amino acids 94-516 of EccB3, the transmembrane helices of EccC3, and amino acids 404-1268 of EccC3 were generated using RaptorX. These models were fit into the combined map density using the fit map to model utility in Chimera. The full model was refined using phenix.real_space_refine.

Model interpretation and display
Buried surface area between subunits was calculated by PISA (Krissinel, 2015). Atomic models for individual proteins were compared against the PDB using the DALI server (Holm and Laakso, 2016). Chimera and ChimeraX (Goddard et al., 2018) were used to display maps and models for figure creation. ConSurf (Ashkenazy et al., 2016) was used to produce multiple sequence alignments and to color structural models by homology.
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Data availability
The map files have been deposited at the EMDB with code 20820. The entry is online at https://www.ebi.ac.uk/pdbe/entry/emdb/EMD-20820. The model has been deposited at the PDB with the code 6UMM. It is online at http://www.rcsb.org/structure/6UMM.
The following datasets were generated: