A Key Motif in the Cholesterol-Dependent Cytolysins Reveals a Large Family of Related Proteins

The cholesterol-dependent cytolysins’ pore-forming mechanism relies on the ability to sense the completion of the oligomeric prepore structure and initiate the insertion of the β-barrel pore from the assembled prepore structure. These studies show that a conserved motif is an important component of the sensor that triggers the prepore-to-pore transition and that it is conserved in a large family of previously unidentified CDC-like proteins, the genes for which are present in a vast array of microbial species that span most terrestrial environments, as well as most animal and human microbiomes. These studies establish the foundation for future investigations that will probe the contribution of this large family of CDC-like proteins to microbial survival and human disease.

fold with many evolutionarily distant proteins, including members of the mammalian membrane attack complex/perforin family (MACPF) (2)(3)(4)(5)(6)(7). This structural domain is characterized by a core ␤-sheet flanked by two contiguous ␣-helical bundles (␣HBs) (Fig. 1, Fig. S1 in the Supplemental Material). Membrane-bound CDC monomers self-associate into toroidal, multimeric prepore complexes of 35 to 40 monomers, which undergo large structural rearrangements in order to form the membrane-spanning ␤-barrel pore. Prior studies in our lab have shown that the conversion of the prepore to pore requires the refolding of the two D3 ␣HBs into membrane-spanning ␤-hairpins (TMHs) with an ϳ40Å vertical collapse of the complex to form the membrane-spanning ␤-barrel pore (8)(9)(10)(11)(12)(13). The mechanism(s) and structures that initiate the prepore-to-pore transition remain incompletely understood, yet they are at the core of the CDC pore-forming mechanism.
We have identified a motif, F/Y-F/Y-X n -YGR, that is conserved between the CDCs and a large group of uncharacterized proteins (termed herein as CDC-like or CDCL) that exhibit little similarity in their primary structures with the CDCs. The conspicuous The ␣␤ loop (red) in the soluble monomer refolds into an HTH upon conversion of the prepore to the pore. (B) Side view of PFO emphasizing the two interfaces which must be disrupted for the formation of TMHs and pore conversion. In TMH1, a water network is maintained around N197 (spheres) which stabilizes the interface. The TMH2 interface is maintained by the interaction of two methionine residues on the core ␤-sheet with a phenylalanine (spheres) in the ␣HB. (C) Representation of a growing PFO prepore following membrane binding. Following membrane binding, ␤5 of the ␣␤ loop must be displaced to allow interactions to form between ␤1 and ␤4 of two monomers. Movement of ␤5 also increases the solvent accessibility of label-site residue V322 (green spheres). Upon completion of the prepore, K336 (coral spheres), located on the tip of the ␣␤ loop, forms an electrostatic interaction with E183 (coral spheres) of the neighboring monomer. This interaction is necessary to supply the force to complete the flattening of the core ␤-sheet and to disrupt the water network around N197, resulting in pore conversion. (D) The conserved CDC D3 motif in the PFO soluble monomer. The conserved motif (orange spheres and sticks) resides on the severe bend in the core ␤-sheet on strands 2 and 3. R275 makes contacts with the beginning of the ␣␤ loop at the turn between ␤-strands 4 and 5, and F230 forms a -stacking interaction with a well-conserved aromatic, F351 (red sticks), at the end of the ␣-helix.
presence of this motif in both families of proteins prompted us to investigate its role in the CDC pore-forming mechanism. We show this motif makes critical contacts with an ␣␤ loop in the soluble monomer of the CDC protein perfringolysin O (PFO) (Fig. 1D), which reveals that the ␣␤ loop plays a central role in regulating the transition of the prepore to the pore. Importantly, the crystal structure for one CDCL from Elizabethkingia anophelis reveals that it exhibits a high degree of structural similarity with domains 1 to 3 (D1 to D3) of the CDCs and that the residues of the conserved F/Y-F/Y-X n -YGR motif occupy nearly identical positions and make similar contacts in the E. anophelis CDCL crystal structure to those in PFO. These studies reveal the critical nature of the F/Y-F/Y-X n -YGR motif to the CDC pore-forming mechanism and provide the first characterization of a representative of the CDCL family of proteins.

RESULTS
Primary structures of a large class of uncharacterized proteins share a conserved F/Y-F/Y-X n -YGR motif with the CDCs. The alignment of ϳ300 CDCL and over 100 CDC primary structures revealed a conserved motif consisting of F/Y-F/Y-X n -YGR. An example is shown in Fig. S2, where the predicted primary structures of selected CDCLs from 12 genera were aligned with PFO. The F/Y-F/Y-X n -YGR motif and a downstream glycine pair (Fig. S2) are highly conserved, whereas there are relatively few other conserved residues or conservative substitutions. We have previously shown that the conserved glycine pair in PFO (G324 and G325), which is located at the beginning of the ␣␤ loop in the CDCs, does not tolerate substitution with side chain amino acids and traps it in a prepore complex (14). This alignment also shows there is no similarity between the CDCL domains that correspond to the conserved membrane-binding D4 of the CDCs, Furthermore, the classic signature undecapeptide (ECTGLAWEWWR) and cholesterol binding (TL pair) motifs of the CDC membrane-binding D4 (15,16) are absent in the CDCLs. The first aromatic residue of the F/Y-F/Y-X n -YGR motif was found to vary in these proteins (62% conservation as an aromatic, the rest have S, Q, A, or N), although the rest of the motif residues, including the GG pair, were conserved in 92 to 99% of the CDCLs. The conspicuous conservation of the F/Y-F/Y-X n -YGR across the CDC and CDCL primary structures prompted us to investigate its role in the CDC poreforming mechanism.
Pore formation by PFO mutants with alterations in the conserved F/Y-F/Y-X n -YGR CDC motif. The conserved F/Y-F/Y-X n -YGR motif in PFO, the archetypical CDC, is F 230 -Y-X 41 -Y 273 GR and is located at the bends of ␤-strands 2 and 3 between domains 3 and 1 (D3 and D1) (Fig. 1D). When individually mutated to alanine, we generally observed greater than wild-type activity accompanying a 1 to 3°C decrease in melting temperature (T m ) of PFO, with the exception being R275A, which exhibits less than 1% of wild-type PFO activity (Table 1). Therefore, we initiated our investigation of this conserved motif to first understand the basis of losing pore-forming activity in the R275A mutant.
R275 forms interactions with the ␣␤ loop critical to oligomer geometry and pore formation. R275 forms its strongest interactions via side-chain-mediated interactions with the backbone and side chain of N348, the backbone of G325, and a charge interaction with D326, which are located at the N-terminal (N348) and C-terminal (G325 and D326) ends of the ␣␤ loop (Table S1). The PFO D326A pore-forming activity was 80% of wild-type PFO, whereas PFO R275K exhibited only 10% of wild-type activity, suggesting that the side-chain-mediated backbone interactions of R275 with N348 and G325, and not its charge, are the primary stabilizing interactions for the two ends of the ␣␤ loop. These interactions are lost in the R275A mutant (Table S1), which is likely responsible for the inactive phenotype of this mutant. Therefore, we investigated why these R275-mediated interactions with the two ends of the ␣␤ loop are necessary to form the pore.
Strand ␤5 of the ␣␤ loop is displaced from ␤4 to allow formation of intermolecular backbone hydrogen bonds between ␤4 and ␤1 of two monomers during the assembly of the pore (Fig. 1C) (14,17). The environmentally sensitive probe NBD, when covalently bound to cysteine-substituted V322 on ␤4 undergoes a nonpolar to polar transition, as ␤5 disengages from ␤4 and the ␣␤ loop swings away from the monomer-monomer interface (13,14). The fluorescence emission of PFO R275A:V322C-NBD in the presence and absence of cholesterol-rich liposomes revealed that the ␣␤ loop was already displaced in the soluble PFO R275A compared to PFO ( Fig. 2A and B) (14). When PFO R275A was analyzed by SDS agarose gel electrophoresis (SDS-AGE), no oligomers were detected, but transmission electron microscopy (TEM) revealed the presence of linear oligomers ( Fig. 3A and B). To prevent ␤5 displacement in PFO R275A , a disulfide bond was engineered between ␤4 and ␤5, as previously described (14). Oxidized PFO T319C:V334C is inactive, but activity is restored whether the disulfide bond is reduced before or after the prepore is assembled (14). When PFO R275A:T319C:V334C is reduced before or after prepore assembly on liposomes, it remains inactive (Fig. 2C); however, the typical circular geometry of the CDC oligomers is restored ( Fig. 3D and E). Hence, the premature displacement of this loop apparently results in conformational changes within the PFO structure that prevent the appropriate monomer-monomer contacts necessary for forming a circular complex. However, this does not explain why stabilizing the interaction between ␤4 and ␤5 and restoring the formation of circular oligomers does not restore pore-forming activity, which is explored in the next section.
E183:K336 electrostatic interaction requires R275-mediated interactions with the ␣␤ loop. The sequential formation of intermolecular -stacking and electrostatic b ND, the T m for the NBD-labeled derivatives could not be determined since the necessary concentrations could not be achieved for the analysis. c The values in parentheses reflect the percentage of pore-forming activity of these mutants in their oxidized state compared to the activity in the prereduced PFO F230C:F351C , which was 510% of wild-type PFO.
interactions drive the flattening of the core ␤-sheet (18) by bringing ␤-strands 1 and 4 of two adjacent monomers into alignment within the prepore complex to form the upper wall of the ␤-barrel pore (13,17). Loss of either intermolecular interaction traps PFO in a prepore complex. Activity can be specifically restored to mutants lacking the electrostatic interaction by the substitution of N197 with tryptophan (19), which disrupts the water network that stabilizes the interface between ␣HB1 and D1 and D2 (Fig. 1B), thereby allowing the ␣HBs to refold into the TMHs to form the ␤-barrel pore ( Fig. 1A) (8,9,20). Pore-forming activity and circular oligomer formation was restored in PFO R275A:N197W (Table 1, Fig. 3C). K336 of the electrostatic pair resides at the tip of the ␣␤ loop. These results reveal that R275-mediated interactions with the ␣␤ loop are necessary to the formation of a functional electrostatic interaction that drives the prepore-to-pore transition (19).
Impact of the F230:F351 -stacking interaction on PFO pore-forming activity. PFO residue F230 of the conserved motif forms a parallel, offset -stacking interaction with a well-conserved aromatic at the C-terminal end of the ␣␤ loop (F351 in PFO) (Fig. 1D). The individual substitution of alanine for either F230 or F351 (PFO F230A:F351A ) reduced activity to 5% of wild-type PFO, whereas the substitution of both with leucine increased activity up to 250% of that for wild type ( Table 1). The change in the interaction energy (21) for PFO and its derivatives after energy minimization of the structures (22) is shown in Table S1. The attractive interaction energy between residues 230 and 351 for the active mutants fell between the values for wild type (strongest) and the inactive F230A:F351A mutant (weakest). Hence, weaker interactions between res- Fluorescence scanning of NBD-labeled PFO V322C shows ␤5 movement away from ␤4 upon membrane binding, as indicated by a decrease in fluorescence emission (14). Decrease in NBD fluorescence was due to an increased interaction with a polar solvent, i.e., water. (B) Fluorescence scanning of NBD-labeled PFO R275A:V322C revealed ␤5 movement in the absence of liposomes, as indicated by the low fluorescence emission for both samples. (C) Kinetics of PFO mutants (5 g) injected into HBS containing carboxyfluorescein-loaded liposomes. PFO wild type (WT) formed pores rapidly on liposomes, allowing release of entrapped dye. PFO T319C:V334C , which forms a disulfide bond between ␤4 and ␤5, was allowed to form prepores prior to reduction of the disulfide bond with DTT, which led to rapid pore formation. PFO T319C:V334C:R275A was allowed to form prepores, but upon reduction of the disulfide bond, pore formation could not be detected.
A Large Family of CDC-Like Proteins ® idues 230 and 351 increased activity, up to a point, beyond which activity was largely lost (i.e., F230A:F351A). The F230A mutant was unusual in that it exhibited variable activity, which suggests that a fraction did not maintain the necessary interaction with F351 to function. We show that the loss of activity in the F230A:F351A mutant results from the inability to form a functional intermolecular electrostatic interaction, as the introduction of the N197W mutation rescued pore-forming activity of PFO F230A:F351A (Table 1). Therefore, the F230:F351 -stacking interaction and the R275-mediated Images of PFO R275A oligomers in which an engineered disulfide bond between ␤-strands 4 and 5 was generated by substituting cysteines for T319 and V334, as previously described (14) under oxidizing (D) and reducing (E) environments. Only PFO R275A:N197W exhibits pore-forming activity (Table 1).
interactions together are necessary to anchor the two ends of the ␣␤ loop, which is required for a functional intermolecular electrostatic interaction.
Flexibility of the F230-F351 interaction is required for pore formation. An F230C:F351C mutant was generated wherein Ͼ90% of the cysteines formed disulfide bonds. Pore-forming activity is eliminated by the unreduced disulfide bond, whereas its reduction increases activity to Ͼ400% of PFO (Fig. 1A, Table 1). Reduced cysteines can form noncovalent interactions within proteins (23), which is consistent with our above observations that weakening the interaction increases the specific activity of pore formation. In contrast to PFO F230A:F351A , the oxidized (inactive) and reduced (active) forms of PFO F230C:F351C assemble SDS-resistant and heat-stable circular oligomers on membranes (Fig. S3). Although reduced PFO F230C:F351C exhibits a significant increase in the rate of pore formation if it is reduced after the prepore is formed on liposomes, no activity is recovered (Fig. 4B). Furthermore, addition of the N197W mutation does not significantly restore activity to oxidized PFO F230C:F351C (Table 1). Hence, the oxidized PFO F230C:F351C complex is trapped in a stable, off-pathway prepore, suggesting that ␤5 remains sterically trapped at the monomer-monomer interface of the prepore unless the disulfide bond is reduced prior to prepore assembly.
The necessary flexibility of this interaction was further demonstrated by covalently modifying the sulfhydryls of active F230C and F351C single mutants with the environmentally sensitive NBD (9) probe and following its emission as each bound to and formed pores on cholesterol-rich liposomes ( Fig. 4C and D, respectively). In both cases emissions increased, indicating that the polarity of their environments decreased, Reduction to PFO F230C:F351C (red) prior to addition to liposomes displayed activity greater than that for wild-type PFO. (C) PFO F230C labeled with NBD was incubated in the absence (solid line) and presence (broken line) of liposomes representing the soluble monomer and membrane-bound pore complex, respectively. A very minor increase in fluorescence intensity was observed upon pore formation, indicating a shift to a slightly more nonpolar environment in the pore. (D) PFO F351C labeled with NBD reveals a 2-fold increase in fluorescence intensity, indicating a change to a more nonpolar environment. Data are representative of a minimum of three replicates. The difference in fluorescence intensity maximum for (C) and (D) are due to differences in labeling efficiency between the two cysteine mutants. which suggests their relative positions change between the soluble monomer and pore states.
Displacement of the ␣␤ loop in soluble PFO monomers of active mutants prevents pore formation. We next determined the effect of displacing the ␣␤ loop in an active mutant. When oxidized PFO F230C:F351C:V322C is labeled at V322C, the emission scans of the soluble monomer reveal the ␣␤ loop is displaced in its oxidized and reduced forms (Fig. 5). The NBD labeling of V322C in PFO F230C:F3251C likely displaced the ␣␤ loop, since reduced PFO F230C:F351C is ϳ4 times as active as wild-type PFO, whereas NBD-labeled V322C does not displace the ␣␤ loop or eliminate activity in wild-type PFO (14). Hence, the premature displacement of the ␣␤ loop prevents the prepore-to-pore transition.
Interactions mediated by Y231, Y273, and G274 stabilize D3. Y231, Y273, and G274 of the motif do not interact with the ␣␤ loop, yet alanine substitution of Y231 and Y273 increased the specific activity by 300 to 350% over wild-type PFO, whereas the alanine mutant of G274 resulted in an unstable protein that caused pore-forming activity to exhibit variation between assays (Table 1). Y231 and Y273 form a -stacking interaction at the bend between ␤2 and ␤3 (Fig. 1D) and G274 resides at the bend in ␤3. In sharp contrast to the small changes in melting temperatures (T m ) determined for the various mutants of F275 and F230 (Ϫ1 to 3°C), the individual and double alanine mutants of Y231 and Y273 and the mutants of G274 decreased their melting temperatures by Ϫ5 to Ϫ12°C ( Table 1). The largest changes in T m were observed for G274A (Ϫ9°C) and Y231A:Y273A (Ϫ12°C). The increased activity and lower T m s for the mutants of Y231 and Y273 are similar to what we have observed for mutations that destabilize D3, which makes it easier for D3 to unfold and form the ␤-barrel pore (8,9,18,19). In contrast, alanine-substituted G274 results in an unstable protein with variable activity, which appears to be the result of large side-chain-dependent repulsive forces caused by the alanine sidechain, specifically with the backbones of Y273 and R275 (Table S1).
These studies show the F/Y-F/Y-X n -YGR motif of the CDCs exhibits a dual role: (i) F230 and R275 form critical contacts that are important for the ␣␤ loop control of the prepore-to-pore transition; and (ii) Y231, Y273, and G274 stabilize the D3 structure of the soluble monomer. The conservation of this motif between the CDCs and CDCLs initially motivated us to investigate its importance to the CDC pore-mechanism. Therefore, we next determined whether this conservation was coincidental or if the CDCLs were a family of distant CDC relatives.
Phylogenetic analysis of CDCs and CDCLs. A maximum likelihood phylogenetic analysis of the CDCs and putative CDCLs is shown in Fig. 6. The CDCLs could be divided into several major clades, the species of which are shown in Fig. S4. The F/Y-F/Y-X n -YGR motif is also present in a smaller number of CDCLs that lack the C-terminal region of the CDCLs (Fig. S5). These short CDCLs (CDCL S ) are most often present in an operon with a full-length CDCL, as shown for Elizabethkingia anophelis strain AG1 (Fig. S6) and are most frequently present in species from the Bacteroides, Chitinophaga, Chryseobacterium, and Elizabethkingia genera. The CDCL S primary structures often lack significant similarity to the cognate CDCL: the E. anophelis AG1 CDCL and CDCL S only exhibit 36% identity (Fig. S6). Most of the CDCLs are produce by species that are not known to be pathogens, although several are produced by bacteria known to cause or are associated with disease, which include Bacteroides fragilis, implicated in peritonitis and abscesses, several Prevotella that are associated with oral disease, and Flavobacterium columnare and Tenacibaculum species, which are known fish pathogens.
Crystal structure of the CDCL from Elizabethkingia anophelis reveals its relationship to the CDCs. The crystal structure of the E. anophelis CDCL (Fig. 7A) shows that D1 to D3 are highly similar to the CDC D1 to D3 structure (24) and D3 exhibits the characteristic ␣HBs of the CDCs, which refold into the TMHs of the ␤-barrel pore (8,9). Importantly, the position of the F/Y-F/Y-X n -YGR motif residues and the glycine pair are nearly identical to the cognate residues in PFO (Fig. 7B). The structure of the CDCL D4, however, does not resemble the CDC D4 fold and DALI (25) searches with the CDCL D4 structure did not reveal any similar 3D structures. The receptor-binding D4 domains of the CDCs share significant similarity in their primary, secondary, and tertiary structures, and maintain specific signature motifs (15,16), which are missing in the CDCLs. Furthermore, the D4 domains of the CDCLs exhibit little conservation in their primary structures between genera, and often exhibit little conservation between CDCLs from species that have more than one CDCL gene, as exemplified by 7 full-length CDCLs from Chitinophaga sp. strain MD30 (Fig. S7).
Pore formation and formation of oligomeric structures by the Elizabethkingia anophelis CDCL and CDCL S . The addition of both CDCL S and CDCL to carboxyfluorescein-loaded cholesterol-rich liposomes resulted in rapid dye release, which is not observed in the presence of either CDCL S or CDCL alone (Fig. 8A). However, we have to note that although this activity could be observed and replicated on the same day, it could not be repeated with fresh liposomes and the same or freshly purified proteins. We have observed this behavior for other CDCLs and have yet to identify the basis for this behavior. It did not result from contamination with an active CDC, since the individual CDCL and CDCL S proteins at twice the concentration only showed a small release of marker from the liposomes (Fig. 8A). The activity of the combined proteins is 100-to 1,000-fold less than for the CDCs, likely due to the absence of a receptor for the CDCLs on the liposomes, similar to the CD59-binding CDC intermedilysin (26).
The CDCL did not form oligomers on liposomes but appeared to form stable dimers, whereas CDCL S formed SDS-resistant and heat-stable oligomers (Fig. 8B). When the CDCL and CDCL S proteins were combined, they formed an SDS-stable oligomer with a mass slightly higher than that of PFO oligomers (Fig. 8B). When this complex was heated with SDS sample buffer, the size was reduced to that of the CDCL S oligomer. This suggests the CDCL S acts as a scaffold for the assembly of the CDCL, and that the latter's oligomer is dissociated by SDS when heated. Analysis of the CDCL and CDCL S on lipid monolayers by TEM revealed circular oligomers for the CDCL S but not the CDCL, whereas the oligomeric structures observed in the presence of equimolar concentrations of CDCL S and CDCL appeared different from those of the CDCL S alone (Fig. 8C).
Single-particle analysis (Fig. 8D) revealed that when the CDCL and CDCL S proteins are mixed, stacked oligomers are formed, which is consistent with the observed increase in mass by SDS-AGE when the two proteins are combined. No dual ring structures were observed for the CDCL S alone, suggesting that these stacked oligomers only form when both the CDCL and CDCL S are present. The CDCL S ring contained ϳ28  (11,13). This smaller ring structure would account for the smaller size of the CDCL S complex and the slightly larger size of the CDCL-CDCL S complex compared to that of PFO oligomers (Fig. 7B  and D).
Expression of CDCL and CDCL s in E. anophelis. The E. anophelis CDCLs and a large fraction of the putative CDCL proteins from other species exhibit a lipoprotein type II signal peptidase cleavage site and the cognate cysteine. The expression of the CDCL and CDCL S from E. anophelis AG1 was followed over time from a culture grown at 31°C (Fig. S8). Affinity-purified rabbit antibody to both proteins was used to probe purified membrane and spent media fractions, which revealed that both proteins are secreted into the spent medium and are first detected in mid-log phase. Much of the CDCL S , but not the CDCL, appeared to aggregate into high molecular weight species that did not penetrate into the gel, which is consistent with the above studies, where we observed that the CDCL S proteins oligomerized into SDS-resistant oligomers in the presence of liposomes.

DISCUSSION
Most, if not all, studied ␤-barrel pore-forming toxins form an obligatory prepore intermediate (10,(27)(28)(29)(30)(31), the completion of which must be detected before conversion to the pore; our studies suggest the ␣␤ loop is a critical component of this sensor. The conservation of the F/Y-F/Y-X n -YGR motif and little else between the CDCs and the large family of CDCLs provided the basis for these studies that revealed the role of the ␣␤ loop in sensing completion of the prepore and its role in the CDC pore-forming mechanism. These studies show the F/Y-F/Y-X n -YGR motif forms critical interactions necessary for the function of the ␣␤ loop. These interactions impact the subsequent formation of the key intermolecular -stacking and electrostatic interactions, which drive the prepore-to-pore transition (18,19). Once these two intermolecular interactions are initiated, the prepore is irrevocably committed to its transition to the pore. The residues of the motif play two distinct roles in the CDC mechanism: first, specific residues form interactions with the ␣␤ loop that are critical to its function in regulating the prepore-to-pore transition and second, other residues stabilize the D3 structure, ) containing the F/Y-F/Y-X n -YGR and GG motifs' residues and the conserved Q228 of PFO (CDC residues in green and the CDCL residues are in orange). The numbers in parenthesis are the CDCL residues corresponding to those of PFO. (C) Residues of the CDCL that correspond to PFO motif residues F230, Y231, Y273, and R275 and the residues with which they have significant interactions mediated by their side chains compared to those made by the analogous residues in PFO (strongest interactions in bold) determined by their net pairwise interaction energies, as calculated by the interaction energy matrix analysis (21).
A Large Family of CDC-Like Proteins ® which affects the stability and pore-forming activity of PFO. Subsequent studies into the CDCL structure revealed this motif forms similar contacts with the ␣␤ loop of the CDCLs, which suggests a common role for the CDCLs and reveals the CDCLs to be distant relatives of the CDCs.
In PFO, R275 and the F351:F230 -stacking interactions anchor the two ends of the ␣␤ loop. If either interaction is eliminated, pore formation is lost. In the R275A mutant, the loss of activity can be directly related to the premature displacement of the ␣␤ loop, which results in the formation of linear oligomers. Although the formation of circular oligomers is restored by locking ␤5 to ␤4, pore-formation activity is not restored because the loss of the R275-mediated interactions prevented the subsequent formation of the intermolecular electrostatic interaction, which drives the prepore-topore transition (19). When the ␣␤ loop was displaced in the active F230C:F351C mutant, the pore-forming activity was eliminated, likely due to moving K336 (located at the tip of the ␣␤ loop) out of position so it could not form the intermolecular electrostatic interaction with E183 of an adjacent monomer within the prepore oligomer (19). These results were consistent with the observation that the loss of activity in the R275A and F230A:F351A mutants was restored by the introduction of the N197W, which we have shown restores activity in a mutant where the intermolecular electrostatic interaction is lost (19).
The F351:F230 -stacking interaction of the ␣␤ loop must allow it a certain degree of mobility, as its substitution with a rigid disulfide bond eliminates activity unless it is reduced. Notably, the F351C:F230C mutant is Ͼ400% as active as wild-type PFO when the disulfide bond is reduced before adding the monomers to liposomes, whereas if oxidized F230C:F351C is first allowed to assemble the prepore on liposomes, pore formation is not restored by reduction of the disulfide bond. Hence, preventing the movement of the ␣␤ loop at the appropriate times also abrogates pore formation. The results of the studies with the R275 and F230:F351 mutants show that when the ends of the ␣␤ loop are not sufficiently anchored, if it is prematurely displaced or its movement restricted than PFO cannot form a functional intermolecular electrostatic interaction. The strand ␤5 of the ␣␤ loop undergoes a ␤-strand to ␣-helix transition upon pore formation to form the helix-turn-helix (HTH) in the monomers of the pore complex. This transition shortens this loop by about 9 Å, thereby placing additional stress on the K336-E183 electrostatic interaction, which breaks the final contacts between D3 and domains 1 and 2 to allow the formation of the ␤-barrel pore. Hence, the ␣␤ loop must undergo specific structural transitions in the prepore to facilitate those intermolecular interactions (18,19) that drive the prepore-to-pore transition.
Residues Y231, Y273, and G274 of this motif, on the other hand, do not interact with the ␣␤ loop but appear to contribute to the stabilization of D3. The elimination of the -stacking interaction between Y231 and Y273 significantly increased pore-forming activity and the G274A mutant resulted in an unstable pore-forming phenotype. These phenotypes correlated with significant decreases in the melting temperatures of PFO, which is largely dictated by the unfolding of D3 (19). D3 is in a metastable state; it must remain stable in the soluble monomer but undergoes significant secondary and tertiary structural changes to form the ␤-barrel pore, whereas D1 and D4 remain largely unchanged (8,9,11,13). We have shown that specific intramolecular interactions stabilize the structure of D3 in the monomer, which are broken by the formation of the intermolecular electrostatic and -stacking interactions (18,20). The interactions mediated by residues Y231 and Y273 contribute to stabilizing D3, such that weakening them decreases the T m and increases the rate of pore formation, whereas replacing G274 with side-chain-containing residues greatly reduces the T m , which results in an unstable protein. Therefore, these residues of the F/Y-F/Y-X n -YGR motif stabilize the monomer structure of D3, whereas F230 and R275 form essential contacts with the ␣␤ loop so that it can function to initiate the unfolding of D3 to form the pore.
We have shown that some residues of this motif can be substituted with other residues, which generally result in greater than wild-type activity. Why then are these residues so highly conserved among the CDCs and CDCLs when little else is conserved? For residues R275 and G274, the answer is relatively easy since the conservative substitutions of R275K and G274A result in much less active and/or unstable derivatives. F230A also exhibited an unstable pore-forming phenotype. The -stacking interaction of Y231 and Y273 seems less important to the activity of PFO, since single or double alanine mutants retained Ն300% of wild-type activity. However, this increased activity comes with a loss in thermostability of 5 to 12°C, which may be advantageous or disadvantageous under specific environmental conditions (20). The glycine pair, which is also conserved, has been shown previously to be intolerant to substitution with side chain amino acids (14), which prevent the movement of the ␣␤ loop and trap PFO in an early prepore state (32). Therefore, these residues are likely the best fit for this particular function, based on our observations. Furthermore, the central importance of this motif and that of the previously studied glycine pair (14) to the function of the ␣␤ loop in the CDC prepore-to-pore transition explains their conservation in the CDCLs, as control of the prepore-to-pore transition is critical for CDCs to complete prepore assembly prior to the insertion of the ␤-barrel pore (10, 11, 18, 19, 33-36). Furthermore, the crystal structure of the E. anophelis CDCL exhibits a high degree of structural similarity to domains 1 to 3 of the CDCs and the residues of this motif are positionally conserved between PFO and the E. anophelis CDCL, and form contacts analogous to those shown herein for PFO ( Fig. 7B and C), suggesting a common function.
The missing signature undecapeptide and cholesterol-binding motifs (15,16) from the CDCLs and the lack of significant similarity between the CDC and CDCL primary structures made it difficult to know if the CDCLs were related to the CDCs. However, the studies herein clearly show the CDCLs are distant relatives of the CDCs, which likely bind different receptors. The genes encoding nearly 300 putative CDCL proteins are present in over 200 species from at least 12 different phyla that span most terrestrial niches and the microbiomes of humans, animals, and insects. This makes them the largest family of putative pore-forming toxins/proteins in the CDC/MACPF superfamily. Although most CDCL genes are present in bacterial species, genes for two putative CDCLs were identified in the genome of the diatom Thalassiosira oceanica (the genes also exhibit an intron-exon structure) and three in the genome of the fungus Basidiobolus meristosporus strain CBS 931.73. Several species also contain genes coding for more than a single putative CDCL or a CDCL plus a CDCL S , the extremes being exemplified by Chryseobacterium nematophagum (37) and Chitinophaga sp. MD30 (38), each of which have a combined total of 12 and 11 putative CDCLs and CDCL S s, respectively.
The E. anophelis CDCLs and CDCL S s must interact to assemble a functional pore, although on a molar basis it is 100-to 1,000-fold less active than PFO in in vitro assays. This is likely due to the absence of a receptor for the CDCL on the cholesterol-rich liposomes. The putative CDCL binding domains often exhibit little similarity with each other, as illustrated by the primary structures of 7 full-length CDCLs from Chitinophaga sp. MD30 (Fig. S7). The divergence of the binding domain 3D structure is evident in crystal structures of the E. anophelis CDCL and PFO ( Fig. 1 and Fig. S1). The significant diversity observed among the CDCL putative binding domains suggests they may have evolved to bind targets specific to each species. Although the CDCs target eukaryotic cells because cholesterol is either the receptor or a coreceptor, it is yet unknown whether the CDCLs bind to and function on eukaryotic and/or prokaryotic cells.
Neither the CDCL nor the CDCL S of E. anophelis exhibited any significant poreforming activity alone, even though the CDCL S readily formed CDC-like oligomers on liposomes. Mixing the CDCL and CDCL S resulted in a larger SDS-resistant oligomer and exhibited higher pore-forming activity than either protein alone. Single particle analysis revealed stacked oligomeric complexes of two, and sometimes three, rings, with one exhibiting a slightly larger outer diameter than the other. This complex could be dissociated by treatment with SDS and heat, which resulted in the appearance of stable CDCL S oligomers and CDCL dimers. These observations suggest the CDCL S oligomer acts as a scaffold for the assembly of the CDCL oligomer. The CDCL S oligomers are smaller than CDC oligomers and are composed of ϳ28 monomers, versus 35 to 40 monomers for CDC oligomers (11,13). The stacked nature of the E. anophelis oligomers remain an enigma, as most species encode a single CDCL gene and stacked oligomers have not been observed for other pore-forming proteins. The presence of both a CDCL and CDCL S appears to be primarily associated with species of Bacteroides, Elizabethkingia, Chitinophaga, and Chryseobacterium. The stacked oligomer might interact via the top of the oligomers, thereby orienting the ␤-barrels of the CDCL and CDCL S on opposite sides of the complex; theoretically this could insert a ␤-barrel into the membranes of two cells and form a channel that could facilitate the transfer of macromolecules between the cells.
These studies reveal the basis for the conservation of the F/Y-F/Y-X n -YGR motif and suggest that the ␣␤ loop may be the sensor, or an important component thereof, that initiates the necessary structural interactions and transitions necessary for the preporeto-pore transition in CDCs. The conservation of this motif and little else in the primary structures of the CDCLs and the crystal structure of the CDCL from E. anophelis strongly suggest that these residues and the ␣␤ loop are similarly critical to their pore-forming mechanisms. Finally, these studies reveal that the CDCLs represent a large family of pore-forming proteins related to the CDC/MACPF superfamily and are widespread in nature.

MATERIALS AND METHODS
Bacterial strains, plasmids, and chemicals. PFO and E. anophelis AG1 CDCL genes were codon optimized for Escherichia coli and cloned in between BamHI and NdeI sites in pet15bϩ (Genscript), as previously described (19). All mutations in PFO were generated in the cysteine-less background of PFO C459A , referred to as wild-type PFO herein. All chemicals and enzymes were obtained from Sigma except where noted. All fluorescent probes were purchased from Molecular Probes (Invitrogen).
Generation and purification of CDC and CDCL proteins and derivatives. All amino acid point mutations in PFO were generated via QuikChange mutagenesis (Stratagene) prior to sequence verification at the Laboratory for Molecular Biology and Cytometry Research at the University of Oklahoma Health Science Center. The expression and purification of PFO and its derivatives and the CDCLs were performed as previously described for PFO (39). Frozen protein aliquots were thawed and centrifuged at 20,000 ϫ g for 10 min and assayed for concentration prior to use in experiments.
For crystallization trials, fractions containing His 6 -CDCL were pooled and dialyzed into 20 mM Na citrate (pH 6.5), 150 mM NaCl at 21°C for 16 h. Protein was concentrated to 6 mg/ml and stored at Ϫ80°C. For X-ray crystallography, selenomethionine (SeMet) CDCL was expressed in E. coli BL21(DE3) cells using Molecular Dimensions SelenoMethionine Medium Complete, according to the manufacturer's instructions. The SeMet CDCL was purified using the wild-type protocol.
Affinity purification of rabbit IgG to CDCLs and Western blot analysis. Rabbit antibody was generated (Lampire Biological Products) against purified recombinant CDCL and CDCL S purified from E. coli based on the gene sequences from E. anophelis strain AG1. Purified CDCL or CDCL S (10 mg) was coupled to 1 ml of Affigel-10 according to the manufacturer's instructions (Bio-Rad). Unbound protein was eluted with 10 column volumes of 100 mM glycine buffer, pH 2.5. The resin was then equilibrated in HEPES buffered saline (HBS) and stored at 4°C until use. Antiserum (1 ml) was recirculated over the affinity column for 1 h at room temperature and washed with 20 column volumes of HBS. Distilled water (1 ml) was passed over the column and then the column was eluted with the glycine buffer (pH 2.5) and 1 ml fractions collected and immediately neutralized with 200 l of 1 M Tris-HCl (pH 8.0). The antibodycontaining fractions were combined and concentrated to approximately 1 mg/ml using an ultra 15-ml centrifugal filter (Amicon). Blots were probed with a 1:5,000 dilution of the antibody overnight at 4°C and then processed as previously described (10).
Labeling of PFO derivatives with fluorescent probes. Cysteine labeling of protein using Alexa Fluor 488 and Alex Fluor 568 or NBD was performed as previously described (14). Briefly, PFO derivatives were incubated in a 20-fold molar excess of fluorescent probe at 4°C overnight. Free dye was separated from labeled protein via G-50 column. Absorbance of the protein and label were obtained to define labeling efficiency.
PFO modeling onto the PLY pore structure. A model of the pore form of PFO was built on the basis of the PLY pore structure 5LY6 (13) using MODELLER (40) version 9.11. Briefly, the amino acid sequences of PFO and PLY were aligned using MUSCLE (41) and the alignment, along with the structure of a single PLY pore monomer, were used to create a homology model of PFO in the pore-monomer conformation. The PFO pore-monomer structure was read into PyMOL (42) and manually superimposed on to each monomer of the PLY pore in turn, creating a set of PFO monomers that were then assembled into a complete PFO pore.
Carboxyfluorescein release assay. Toxin was serially diluted (2-fold) 24 times into 96-well plates with a top concentration of 2 M. Each toxin derivative was run in triplicate, and 100 l of 1/1,000 CF liposome in HBS was added to each well prior to a 1-h incubation. The fluorescence emission of the individual wells of the 96-well plates was scanned using a Tecan Infinite 200Pro fluorescence plate reader (1).
Melting temperature (T m ) assay. As previously described (19), PFO wild type and derivatives were prepared at 0.5 mg/ml with 1ϫ SYPRO Orange (Thermo Fisher) and subjected to an increasing temperature gradient using a 7500 Fast Real Time PCR instrument (Thermo Fisher) according to the manufacturer's protocol. Each experiment is representative of 4 replicates.
Oligomer detection assay. SDS-agarose gel electrophoresis (SDS-AGE) was conducted as previously described (10). Briefly, PFO or CDCLs were incubated with excess liposomes at 37°C for 30 min. Samples were then treated with SDS and boiled as indicated prior to being run on an SDS-AGE gel. The concentrations used are listed in the figure legends of each experiment.
Negative stain imaging. PFO mutants (0.025 mg/ml), CDCL (0.3 mg/ml), CDCL S (0.05 or 0.2 mg/ml), or CDCL S ϩ CDCL (1:1 equimolar ratio, 0.05 or 0.2 mg/ml) in 20 mM HEPES (pH 7), 150 mM NaCl were applied to Teflon wells. The prereduced PFO mutants were treated with 2 mM Tris(2-carboxyethyl) phosphine hydrochloride (TCEP) reducing agent before incubation. A lipid mixture containing cholesterol:POPC (molar ratio 65:35 for PFO, molar ratio 55:45 for CDCLs) solubilized in chloroform (1 l at 0.5 mg/ml) was then overlaid onto the wells and lipid monolayers were formed upon chloroform evaporation. Samples were incubated at 37°C for 60 min in a humidified chamber. Proteins assembled on the lipid layer were transferred onto Formvar/carbon-coated grids (mesh size 300) and stained with 1% uranyl acetate, following the method described by Rames et al. (43). All stained grids were imaged using an FEI Talos L120C transmission electron microscope (Thermo Fisher Scientific), with an acceleration voltage of 120 keV and fitted with a 4K ϫ 4K Ceta CMOS camera at a magnification between 36,000ϫ and 57,000ϫ and a defocus of 0.05 to 2 m.
Single particle analysis. Image processing, particle picking, and 2D classification were performed with RELION-3.0 (44). For CDCL S , a total of 7,678 particles from 28 micrographs (pixel size 3.1 Å) were picked and extracted with a 130-pixel box size prior to reference-free 2D classification. For CDCL S ϩ CDCL, a total of 7,018 particles from 54 micrographs (pixel size 3.8 Å) were picked and extracted with a 90-pixel box size prior to reference-free 2D classification. Dimensions of 2D class averages were measured using ImageJ (45).
Fluorescence spectroscopic measurements. All fluorescence measurements were performed using a Horiba Fluorolog 3 photon-counting fluorimeter in photon-counting mode with FluorEssence software. Excitation and emission wavelengths for 5(6)-carboxyfluorescein (CF) were 470 nm and 515 nm, with both slits set to 2 nm. Fluorescence scanning experiments for both NBD and Alexa Fluor 488 scanned emission from 500 nm to 600 nm with slits at 2 nm and 5 nm, respectively; excitation wavelengths were 490 nm (2-nm slit) and 480 (4-nm slit), respectively. The rate of pore formation by PFO and derivatives was observed using CF-containing liposomes as previously described (19). As toxin forms pores on the liposomes, CF diffuses out of the liposomes and its emission dequenched. The increase of fluorescence emission over time was monitored following addition of toxin to liposomes. Steady-state NBD fluorescence experiments were performed as previously described (46). Briefly, NBD-labeled toxin derivatives were incubated in the presence and absence of liposomes for 30 min prior to scanning for fluorescence intensity. Steady-state Förster resonance energy transfer (FRET) of donor dye Alexa Fluor 488 and acceptor dye Alexa Fluor 568 was performed as previously described (34). Briefly, donor-labeled PFO derivatives were incubated with liposomes in the presence and absence of a 4-molar excess of acceptor-labeled PFO derivatives. A decrease in donor fluorescence intensity signal due to FRET correlates with the formation of liposome-bound oligomeric complexes.
Crystallization of CDCL. Initial crystallization trials of His 6 -CDCL were set up at 21°C on a Crystal Gryphon robot (ARI, Sunnyvale CA, USA) in 96-well sitting drop format using Rigaku UVϩ 96 plates (AXT, Sydney, Australia). Drops containing 0. Data collection, structure determination, and refinement. CDCL crystals were diffracted to 2.09 Å, without cryoprotectant, on the MX2 beamline of the Australian Synchrotron (Clayton, Victoria). Data collection was controlled using Blue-Ice software (47). For phasing, diffraction data were acquired from a single CDCL crystal containing SeMet residues in place of normal methionine. Two data sets were acquired, one at the measured selenium peak for the crystal of 12867 eV and the other at 13000 eV. Each data set consisted of a full 360°rotation being 720 images with a crystal oscillation of 0.5°. Data were processed with XDS (48). The crystals belong to space group P4 2 2 1 2 with unit cell parameters of a ϭ b ϭ 121.74 Å, c ϭ 88.91 Å. Experimental phases were determined using the peak wavelength data set alone with hkl2map (49) in SAD mode. Autotracing in SHELXE found 361 residues of the 514 residues with unambiguous positioning of six of expected eight selenomethionine residues from the anomalous selenium signal. An additional data set for a native CDCL crystal containing methionine was acquired to 2.2 Å resolution, without cryoprotectant, on the MX2 beamline of the Australian Synchrotron and merged with the selenomethionine data set for final refinement. The structure was refined by iterative rounds of manual rebuilding in COOT (50) and refinement with Phenix (51) and BUSTER (52). Final refinement statistics are summarized in Table S2.
Accession numbers. Atomic coordinates and structure factors of have been deposited in the Protein Data Bank under the ID code 6XD4.

SUPPLEMENTAL MATERIAL
Supplemental material is available online only.