Predictive Rules of Efflux Inhibition and Avoidance in Pseudomonas aeruginosa

Efflux pump avoidance and inhibition are desired properties for the optimization of antibacterial activities against Gram-negative bacteria. However, molecular and physicochemical interactions defining the interface between compounds and efflux pumps remain poorly understood. We identified properties that correlate with efflux avoidance and inhibition, are predictive of similar features in structurally diverse compounds, and allow researchers to distinguish between efflux substrates, inhibitors, and avoiders in P. aeruginosa.

and inhibition of active efflux in P. aeruginosa. To achieve this, we analyzed a series of 260 peptidomimetic compounds (Rempex compounds) active against P. aeruginosa. Rempex compounds possess two biological features of interest. First, they are EPIs that target MexAB-OprM and homologous efflux pumps and potentiate the antibacterial activity of levofloxacin and other antibiotics in P. aeruginosa cells (34)(35)(36). Second, they possess an intrinsic antibacterial activity and inhibit the growth of P. aeruginosa at certain concentrations. The compounds were optimized in medicinal chemistry programs specifically against P. aeruginosa and vary broadly in their properties (34,37,38). These features make Rempex compounds an excellent tool for deciphering predictive general rules of permeation and efflux avoidance in P. aeruginosa. We experimentally segregated contributions of active efflux from OM permeation and developed novel computational approaches to quantify molecular recognition by the inner membrane (IM) efflux transporter MexB and permeation through the OM. Finally, we applied machine learning algorithms to precisely identify descriptors of efflux substrates, inhibitors, and avoiders. The developed approach combines experimental data and predictors accounting for different physicochemical conditions allowing us screen for compounds with specific properties and to effectively guide drug design against P. aeruginosa infections.

RESULTS AND DISCUSSION
Rempex compounds readily permeate the outer membrane of P. aeruginosa and are substrates of efflux pumps. To follow the fate of compounds in cells and to identify descriptors associated with different permeation barriers, we first separated the contributions of the OM barrier and active efflux in measured activities of compounds. For this purpose, the bacterial growth-inhibitory activities for all Rempex compounds were analyzed in four P. aeruginosa strains: the wild-type strain, PAO1; the PD6 strain, lacking six efflux pumps (DmexAB-oprM, DmexCD-oprJ, DmexXY, DmexJKL, DmexEF-oprN, and DtriABC); and their hyperporinated derivatives, PAO1-Pore and PD6-Pore, respectively (39). These strains were previously shown to differ dramatically in their susceptibilities to various classes of antibiotics because of differences in efflux proficiency and permeation across the OM (13,15). To normalize to the differences in target inhibition potency among compounds, our key measured parameters were efflux ratios and OM barrier ratios, defined as the 50% inhibitory concentration for the parent/mutant (IC 50 parent /IC 50 mutant ), for efflux mutants and hyperporinated mutants, respectively (see Table S1 in the supplemental material). More specifically, the IC 50 ratios for PAO1/PAO1-Pore and PD6/PD6-Pore define the contribution of the OM barrier to the activities of compounds in the presence and absence of efflux, respectively. For the majority of compounds, these ratios were 1, suggesting that unlike most antibiotics (13,28), Rempex compounds readily permeate the OM barrier, likely using the self-promoted uptake mechanism (40,41). Activities of only a few compounds were slightly affected by the OM. Among them are compounds 58 and 46 (Fig. 1A), whose activity is enhanced by hyperporination of efflux-deficient PD6-Pore cells by 16-and 8fold, respectively.
The IC 50 ratios of PAO1/PD6 and PAO1-Pore/PD6-Pore define the contribution of the efflux to the activities of compounds in the presence and absence of the OM barrier, respectively. Unlike with the OM ratios, the activities of ;75% of compounds were significantly affected by active efflux ($4-fold), with the highest ratios of 100 to 200 characteristic for such compounds as 71 and 32 (Fig. 1A). Thus, most Rempex compounds are substrates of P. aeruginosa efflux pumps.
The total contribution of the permeability barrier in activities of compounds is further elucidated by the IC 50 ratio between PAO1 and PD6-Pore cells. Among compounds with the largest PAO1/D6-Pore ratios (128 to 256) are compounds 58 and 28, whose activity is affected by both active efflux and OM barrier (Fig. 1A).
Lastly, a comparison of the PAO1/PD6-Pore IC 50 ratio and the IC 50 in the PD6-Pore strain provides a measure of how the changes in the antibacterial activity on the target correlate with the permeation of a compound. There is a weak negative correlation (Pearson coefficient, r= 20.16, P = 0.004) between these two properties, suggesting that permeation is not the major limiting factor in the antibacterial activity of Rempex compounds (Fig. 1B).
Rempex compounds inhibit efflux. EPIs potentiate antibiotics' activities by increasing their intracellular accumulation (37,42). We analyzed the EPI activities using two different assays. We first analyzed the potentiation of the antibacterial activity of the antibiotic levofloxacin in PAM1032 cells overproducing the MexAB-OprM pump due to an nfxB mutation (37). Specifically, we identified the minimal potentiating concentration that reduces the MIC of levofloxacin by 8-fold (MPC 8 ). The compounds varied broadly in their MPC 8 values starting from 0.3 mM (compounds 17 and 32) and up to 190 mM (Fig. 1C). Overall, MPC 8 values moderately correlated (r = 0.30, P = 0.01) with IC 50 values in PD6-Pore cells. This result suggests that for some of the Rempex compounds, the potentiation of the levofloxacin activity is due to their antibacterial properties. However, certain compounds significantly deviate from this trend (Fig. 1C), pointing out that the EPI and antibacterial activities are independent from each other for these compounds.
In a second, bacterial growth-independent, assay, we analyzed the ability of compounds to inhibit efflux of the fluorescent probe Hoechst 33342 (HT), which diffuses slowly between leaflets of the cytoplasmic membrane and is pumped out from cells by efflux transporters. HT fluorescence increases 134-and 32-fold upon binding to DNA and lipids, respectively (15). We carried out the HT assay in hyperporinated PAO1-Pore cells to increase the permeation of both the probe and the compounds across the OM and to achieve efflux-saturating concentrations of compounds in the periplasm (15). We extracted the initial rates and intracellular steady-state concentrations of HT accumulation in cells from the time-dependent changes in HT fluorescence in the presence of increasing concentrations, c, of compounds (15). At c = 16mM (the highest concentration of compounds used in the assay), about 50 compounds increased the intracellular concentration of HT by at least 2-fold. Some of the strongest inhibitors, (e.g., compounds 17 and 61) increased the initial rates of HT uptake by more than 10-fold (Fig. 2). This increase in the rates of HT accumulation was observed only in efflux-proficient PAO1-Pore cells, whereas few or no changes in rates were seen in efflux-deficient PD6-Pore cells. Furthermore, the increased rates of the probe accumulation in hyperporinated cells are specific to efflux inhibition, because in these cells the OM barrier does not limit the probe permeation. Thus, Rempex compounds inhibit active efflux specifically and independently from their ability to penetrate the OM of P. aeruginosa.
If both the bacterial growth-dependent and -independent assays measured the ability of compounds to inhibit active efflux, we would expect that the outcomes of the assays correlate with each other. Indeed, we found that the effect of Rempex compounds on the kinetics of HT accumulation correlates negatively, albeit weakly (r = 20.20, P = 0.0005), with their levofloxacin potentiating (MPC 8 ) activities (Fig. 3A). Importantly, the most efficient inhibitors of HT efflux also have the lowest MPC 8 values (Fig. 3B). Thus, both assays report on efflux inhibition, but additional factors also contribute.
As shown above, antibacterial and levofloxacin potentiating activities of compounds have a similar positive trend (r = 0.30, P = 0.01 [ Fig. 1C]). In agreement, the activity in terms of the inhibition of HT efflux negatively correlates with the IC 50 s of compounds in PD6-Pore cells. However, this correlation is statistically weak (r = 20.11, P = 0.04), and many compounds deviate from it. Hence, the inhibition of HT efflux by Rempex compounds is defined by some properties that are independent from their antibacterial and potentiating activities.
Efflux avoidance dominates clustering of compound activities. Relationships between various activities of compounds were analyzed by hierarchical clustering. On the dendrogram (Fig. 3C), Rempex compounds separate into two large groups. The distinction between groups 1 and 2 mainly arises from the impact of efflux inactivation in bacterial growth inhibition assays. Group 1 primarily comprises compounds that are Efflux Inhibition and Avoidance in P. aeruginosa ® not affected by the inactivation of efflux (that is, they are not efflux substrates). Group 2 primarily comprises compounds that are strongly affected by the inactivation of efflux (that is, they are efflux substrates). Investigating further the subclusters of the two primary groups, we found that the compounds cluster according to their efficiencies in inhibition of HT efflux, as well as according to their antibacterial and levofloxacin potentiating activities.
Principal-component analysis (PCA) showed that between groups 1 and 2, four subgroups could be defined comprising compounds that are either effective inhibitors of HT efflux (green, EPI; blue, EPI/substrate) or do not have significant EPI activities (red, Avoider/non-EPI; cyan, Substrate). Although the subgroups lack a distinct boundary dividing them (Fig. 3D), the preponderance of points belonging to each of these subgroups falls into separate quadrants of the PCA plot. Thus, efflux inhibitory activities and efflux avoidance are not strongly interlinked, and these properties are associated with distinct compounds.
Physicochemical, permeation, and MexB interaction descriptors of compounds. To identify properties of compounds that correlate with their biological activities, we assembled several subsets of numerical descriptors and carried out an agglomerative clustering analysis to find the relationships between various descriptors belonging to either the same or different subsets.
The chemical structures of compounds, their physical properties, and their interactions with the solvent were represented by 73 physicochemical descriptors (see Table S2 and Table S3, "Physico-chemical properties" column, in the supplemental material). These descriptors are frequently used in quantitative structure-activity Diffusion of compounds across membranes was represented by 35 permeation descriptors (Table S3, "Permeation descriptors" column). Rempex compounds readily permeate the OM of P. aeruginosa, as seen from the values of the OM ratios-close to 1 for most of them (Fig. 1A). To generate the permeation descriptors, compounds were placed into seven different layers of the OM model corresponding to the outer and inner cores of lipopolysaccharide (LPS; cores 2 and 1), lipid A headgroup, hydrophobic core of the bilayer, glycerol layer of the inner leaflet, the phospholipid layer headgroup, and water (Fig. 4B). For each simulation, the following molecular descriptors were evaluated: (i) membrane-ligand interaction energy, (ii) number of hydrogen bonds between a compound and its surrounding water shell, (iii) number of hydrogen bonds between the ligand and the surrounding OM environment, (iv) lateral mean squared displacement, (v) ligand hydration shell, and (vi) ligand cumulative entropy (Table S3, "Permeation descriptors" column).
Since ;75% of Rempex compounds are efflux substrates, they likely interact with MexB, the IM transporter of the MexAB-OprM efflux pump. MexB is responsible for recognition, binding, and transport of substrates with very different molecular properties Efflux Inhibition and Avoidance in P. aeruginosa ® (31-33, 35, 42). As expected, we found that, with a few exceptions, most of the compounds bound the purified MexB in surface plasmon resonance (SPR) assays, with equilibrium dissociation constant (K D ) values ranging from the sub-to the mid-micromolar range (see Fig. S1 in the supplemental material). To quantify interactions of Rempex compounds with MexB at a molecular level, we carried out ensemble docking calculations using available X-ray crystal structures of MexB (44,45), along with a few conformations extracted from microsecond-long MD simulations (46). The MexB trimer is suggested to functionally rotate through three main conformations-access (A), binding (B), and extrusion (C)-which enable access, binding, and extrusion of substrates from cells, respectively (23). Rempex compounds were docked to the two major putative substrate binding pockets of MexB, as known from X-ray crystallography: (i) the access pocket (AP) of the access monomer and (ii) the deep binding pocket (DP) of the binding monomer (Fig. 4C) (47). Docking calculations yielded about 60 descriptors of binding of Rempex compounds inside the AP and DP sites of MexB (Table S2, docking descriptor definitions). These descriptors include average binding affinities and the total number of contacts between compounds and specific residues lining the two analyzed pockets ( Fig. 4C; Table S3, "MexB docking descriptors" column).
We next performed a cluster analysis on these three sets of descriptors separately and on all 174 descriptors combined ("all descriptors") to identify possible correlations among different properties ( Fig. 4D; Table S3, "All properties" column). We indeed identified 29 clusters, most of which had a clear association and included descriptors related to specific properties of compounds: molecular symmetry, size, charge and polarity, and number of rings (Table S3). A comparison with the results of clustering on property-specific subsets of descriptors (physicochemical, docking, and permeation) showed that for the most part, descriptors of physically related properties (e.g., size, number of rings, or entropy of permeation) are clustered together, regardless of the subset of descriptors considered. Thus, the identified clusters appear to be reliable and generally consistent with physical intuition.
Derivation of predictive models of permeation and efflux in P. aeruginosa. To assess the importance of different descriptors, we trained a linear predictive model based on experimental measurements and determined the relative importance of different predictors through the relative weight of their coefficients. To fit this model, we employed representative descriptors from the clusters discussed above. Seven distinct variables derived from experimental ratios were used for model outputs (see Table S4 and supplemental methods in Text S1 in the supplemental material): efflux = IC 50 PD6-Pore / IC 50 PAO1-Pore , permeation = IC 50 PD6-Pore /IC 50 PD6 , EPI-1 = g(MIC PAO1 /MPC 8 PA1032 ), EPI-2 = g (MIC PD6-Pore /MPC 8 PA1032 ), EPI MPC = g(IC 50 PAO1 /MPC 8 PA1032 ), and EPI SS = SS 16 mM /SS 0 mM . SS concn (e.g., SS 16 mM ) refers to the steady-state HT accumulation ratio at that concentration, and fold difference is the fold difference in HT fluorescence (HT 16 mM /HT 0 mM ).
To identify the most generalizable descriptors, we performed feature selection employing regression analyses for (i) all descriptors and (ii) the following specific subsets: LigMexB descriptors (all except permeation descriptors), Lig descriptors (QSAR, QM, and MD descriptors), docking descriptors, and permeation descriptors. (See Fig. S2 in the supplemental material for an overview of the procedure.) The top descriptors were then used to fit a model able to predict whether a given compound based on its specific descriptors is a strong or weak (i) membrane permeator, (ii) efflux avoider, or (iii) efflux inhibitor.
Overall, we found that for the permeation, efflux, EPI MPC , and EPI SS experimental ratios (four ratios out of seven analyzed), the LigMexB and "all descriptors" subsets of descriptors generated the best-performing models (see Fig. S3 in the supplemental material). We focused on a set of final models fitted with "all descriptors" (Fig. 5), because these descriptors incorporate molecular-level interactions and diffusion of compounds across both the OM and the IM. These final models were assessed and found to be well performing through the metrics of enrichment, precision, and recall as a function of both probability and ranking (see Text S1, supplemental methods, and Fig. S4 in the supplemental material).
Efflux avoidance and inhibition correlate with distinct molecular descriptors. We next identified among the top descriptors those that correlate with experimental measurements by inspecting the corresponding coefficients in the binomial regression models. A positive (negative) coefficient implies that the descriptor positively (negatively)  Table S3.
Efflux Inhibition and Avoidance in P. aeruginosa ® correlates with the outcome. The larger the magnitude of the coefficient, the more important it is to the outcome (Fig. 5).
(i) Efflux avoiders. We assumed molecules with efflux ratios IC 50 PD6-Pore /IC 50 PAO1-Pore of #0.25 to be weak efflux avoiders, while we considered compounds with ratios IC 50 PD6-Pore /IC 50 PAO1-Pore of $0.25 to be strong efflux avoiders (Table S4). The best model predicts positive correlations with the docking descriptors' contacts with P668 of the AP and S276, Q273, and Q46 of the DP and the permeation descriptors of diffusion within lipid A and within the glycerol moieties of phospholipids (Fig. 5A). Also, an increase in the number of hydrogen bond acceptors (physicochemical descriptor) in a compound is correlated with strong efflux avoidance. On the other hand, increase in hydrogen bonding with water and lipid moieties of the membranes (permeation), more contacts with L674 in the AP site (docking), and increased acylindricity and anisotropic polarizability of compounds (physicochemical) appear to make them better efflux substrates (negative correlation).
(ii) Efflux inhibitors. From the two types of EPI assays, the growth-dependent IC 50 PAO1 / MPC 8 PA1032 (EPI MPC ) ratio and the growth-independent SS 16 mM /SS 0 mM (EPI SS ) ratio generate the best-performing models for efflux inhibitors (Fig. S4). For both of these ratios, the "good"/"bad" cutoffs were set at 0.5.
As in the case of efflux substrates, the top descriptors that discriminate between "good" and "bad" EPI MPC are dominated by permeation descriptors. The best EPI MPC ratios track descriptors indicative of slow diffusion within the phospholipid headgroups and lipid tails of the membrane (Dxy) and lower hydrogen bonding with lipid moieties and with penetrating water (HB) (Fig. 5B). The dominance of these descriptors suggests that they represent unique features of compounds, perhaps related to the fact that MexB and similar pumps appear to capture their substrates from the lipid bilayers and a water-lipid interface (48,49). The difference in hydrogen-bonding profiles between inhibitors and avoiders likely reflects the compound affinities to various layers of the OM model and their optimal positions in these layers. With respect to docking descriptors, more potent inhibitors have fewer contacts with L674 in the AP and more with T89 and R128 in the DP (Fig. 5B). Among physicochemical descriptors, the number of aromatic bonds, total density functional theory (DFT) energy and relative shape anisotropy Kappa2 increase with the potency of EPI MPC , whereas pi energy and lipophilicity (as expressed by xLogP3) decrease with increasing potencies (negative correlation) (Fig. 5B).
Interestingly, the top descriptors for the growth-independent EPI SS are dominated by interactions with MexB (Fig. 5C). Average binding affinity to the AP as well as contacts with L674 and P668 in the same site correlate positively with the activity of EPI SS . In addition, more contacts with T130 and F136 in the DP correlate with higher EPI SS activities. In contrast, seven out of eight top permeation descriptors negatively correlate with these ratios. Most of these descriptors are hydrogen bonding with polar moieties and water. Thus, decreasing hydrogen bonding propensity is expected to increase the activity of EPI SS (Fig. 5C). Among the physicochemical features, the acylindricity and the number of heteroaromatic rings negatively correlate with EPI SS .
(iii) Properties distinguishing between efflux avoiders and efflux inhibitors. Finally, a combination of experimental ratios defining efflux avoidance (IC 50 PD6-Pore / IC 50 PAO1-Pore ) and efflux inhibition (EPI MPC ) and the same thresholds as used for the models above were employed to identify descriptors that may be useful in distinguishing these two properties, rather than being solely predictive of one or the other. Initially, we considered four possible classes: good avoidance and good inhibition (efflux $ 0.25, EPI MPC $ 0.5) (GG), good avoidance but bad inhibition (efflux $ 0.25, EPI MPC # 0.5) (GB), bad avoidance but good inhibition (efflux # 0.25, EPI MPC $ 0.5) (BG), and bad avoidance and bad inhibition (efflux # 0.25, EPI MPC # 0.5) (BB). Assessment of class balance shows that approximately 57% of compounds are in the BB class, 23% are in the GB class, 20% are in the BG class, and 0% are in the GG class. Due to the lack of examples in the fourth class, we trained a 3-class multinomial regression classifier. In this model, precision and recall for all three classes are good, as well as the enrichment for classes 1 and 2, which are of greater interest (approximately 2 to 3) (Fig. S4).
An inspection of the model parameters shows that there is little predictivity for the BB class (relying solely on its intercept and presumably not being class 1 or class 2). Hence, we analyzed which descriptors are related to GB (avoider-not-EPI) alone, which are related to BG alone (EPI-not-avoider), and which are related to both (Fig. 5D). Three permeation descriptors have the largest coefficients and clearly separate compounds belonging to these two classes: (i) diffusion within lipid A layer; (ii) hydrogen bonding with core 1 layer, and (iii) total energy of interactions with phospholipid headgroups. Compounds that are neither good substrates nor EPIs (avoider-not-EPI, GB) positively correlate with diffusion in lipid A and binding to phospholipid headgroups, but negatively with hydrogen bonding with core 1. In contrast, compounds that are both efflux substrates and inhibitors (EPI-not-avoider, BG) have inverse properties and show a positive correlation with hydrogen bonding in core 1 and a negative coefficient for diffusion in lipid A and binding to headgroups (Fig. 5D).
Three docking descriptors distinguish the two classes: contacts with (i) L674 and (ii) P668 in the AP and (iii) average affinity to the DP of MexB (Fig. 5D). Compounds avoiding efflux have fewer contacts with L674 and lower preference for the DP, but more contacts with P668. In contrast, substrates/inhibitors interact more with L674 of the AP and the DP and negatively trend with P668.
Finally, three physicochemical properties distinguish the two classes: (i) molecular shape as described by acylindricity, (ii) anisotropic polarizability (Anisotropic_pol), and (iii) partition coefficient (LogD) (Fig. 5D). The propensity to be a substrate/inhibitor increases with increasing acylindricity and anisotropic polarizability but decreases with increasing partition coefficient and lipophilicity.
Taken together, these results demonstrate that for the bacterial growth-dependent measurements, there is an interrelatedness between the propensities of a compound to be an EPI and to be recognized as an efflux substrate, suggesting that both rely on a similar set of compound properties. However, efflux pump avoiders and EPIs/substrates can be separated based on their molecular interactions with membranes and MexB, their shape, lipophilicity, and electrodynamic response properties.
Models identify efflux avoiders and EPIs/substrates among structurally unrelated compounds. We next tested whether our models can rank unrelated compounds based on their ability to avoid or inhibit efflux. For this purpose, we calculated the LigMexB subset of descriptors for a library of 674 molecules. The library comprised compounds of the Chembridge Diversity set with unknown antibacterial properties, known efflux inhibitors, and traditional antibiotics that were not part of the training or testing set for any of the regression models described above. We next used the developed efflux and EPI ss models to identify which of these molecules might be expected to avoid or inhibit efflux.
The 15 top-ranked efflux avoiders (probability of $0.76) are dominated by antibiotics, including three monobactams, seven cephalosporins, and sulbenicillin from the penicillin class ( Fig. 6; see Table S5 in the supplemental material). The remaining four species are from a previously reported series of compounds with EPI activities in E.coli (50). MIC measurements for 11 of the top compounds showed that 9 have MICs in at least one of the four P. aeruginosa strains. For most of these compounds, the efflux ratios MIC PD6 /MIC PAO1 and MIC PD6-Pore /MIC PAO1-Pore ranged between 1 and 0.25, showing that indeed efflux plays a negligible role in their antibacterial activities (Fig. 6).
Eleven compounds ranked high (probability of $0.75) to have EPI-like properties and were different from the predicted efflux avoiders ( Fig. 6; Table S5). Among these topranked putative EPIs, three are antibiotics of the fluoroquinolone class and three are EPIs showing good potentiation for a range of antibiotics against both E coli and P. aeruginosa (52). We tested EPI compounds on the ability to inhibit efflux of HT, the activity used to generate the EPI ss model, and we found that all three compounds increase the intracellular accumulation of HT by at least 4-fold (Fig. 6). Fluoroquinolones are intrinsically fluorescent and could not be tested in this assay. However, these antibiotics potentiate the activities of penicillins and carbapenems (53,54), and efflux inhibition could play a role in this synergism.
Thus, predictive rules for efflux avoidance and inhibition identified using a series of Rempex analogs appear to be applicable to a broader chemical diversity of compounds.
Models guide optimization of compounds for efflux inhibition. To further validate the identified efflux inhibition "rules," we applied them to a series of compounds that do not have EPI activities against P. aeruginosa. The previously reported OU-266 series acts on AcrA, the periplasmic component of the E. coli AcrAB-TolC efflux pump and potentiates activities of novobiocin in this bacterium but not in P. aeruginosa (50). Furthermore, unlike Rempex compounds (Fig. 1) and the top predicted efflux inhibitors from the test library (Fig. 6), this series does not have considerable antibacterial properties. We next generated a limited series of OU-266 derivatives (Test S1, supplemental methods) and used the identified predictive descriptors of efflux inhibition to optimize their EPI-like properties.
The major predictors of EPIs are their acylindricity (mean values for Rempex series, 2.00), anisotropic polarizability (mean, 178.7 Atomic Units [AU]), and the number of aromatic rings (mean, 2.43), all positively correlating with EPI activity, and the partition coefficient LogD (mean, 23.36), which correlates negatively (Fig. 7). In addition, interactions with L674 (mean, 1.43) and P668 (mean, 1.53) in the AP of MexB correlate with EPI-like properties positively and negatively, respectively (Fig. 5D). The properties of OU-266 notably deviate from the mean values calculated for Rempex compounds, but the molecule is asymmetric, with a hydrophobic and a polar terminus reminiscent of some of the features seen in Rempex compounds (Fig. 7). The addition of the second dihydroimidazoline ring in OU-109 aligned several of the top properties with the desired values. In particular, the acylindricity, the number of H-bond acceptors, and  Table S3 for a complete list of descriptors.) the contacts with P668 in MexB all moved into the optimal range. These changes led to a potent EPI activity against MexAB-OprM, as seen from the MPC values of OU-109 in combinations with the fluoroquinolones levofloxacin and ciprofloxacin, novobiocin, and different b-lactams (Fig. 7). Replacing 2-chlorophenyl with 4-chlorophenyl in OU-96 reduced anisotropic polarizability to the desired level and further reduced LogD, but these changes increased undesired contacts with P668 and, as a result, reduced EPI activity. The isopropylbenzene group increased hydrophobicity of OU-71 and OU-199 and enhanced their antibacterial activity without significant improvement of their EPI potencies. On the hydrophobic terminus of OU-266, the chlorophenyl can be substituted with a bromonaphthalene moiety in OU-72 without significant loss of EPI activity. However, this substitution enhanced the antibacterial activity and efflux of the compound by MexAB-OprM, as seen from the MICs of 6.25 to 12.5 mM in PD6-Pore cells and the lack of growth inhibition in PD6-Pore(MexAB-OprM). Further increasing the aromaticity in OU-1 shifted their LogDs and contacts with both P668 and L647 into undesired areas, leading to the loss of EPI properties (Fig. 7).
Thus, the top predictors of efflux inhibition discovered in this study can effectively guide the further development of compounds for efficient efflux inhibition in the challenging pathogen P. aeruginosa.

Conclusions.
Intracellular accumulation predictors generated using the model E. coli species have limited utility in optimization of antibacterial activities against P. aeruginosa, because of the powerful active efflux and the permeability barrier of the OM of this species. Interactions of compounds with the major efflux pump MexB and with the different layers of the OM of P. aeruginosa can be converted into numerical descriptors. In combination with traditional physicochemical properties of compounds, these descriptors can be used in modeling of efflux avoidance and permeation in P. aeruginosa. Antibacterial and efflux inhibitory activities of compounds correlate weakly and can be separated using bacterial growth-independent efflux inhibition assays. The two activities correlate with different sets of descriptors. Efflux ratios are reliable reporters of the propensity of compounds to avoid or to be captured by efflux pumps. Interactions with membranes, specific residues of MexB in AP, and the affinity to DP dominate efflux avoidance predictors likely reflect the contributions of specific residues to affinities of compounds to the substrate binding sites of MexB. Growth-dependent and -independent efflux inhibitory activities correlate with each other, albeit weakly, suggesting that they report on different properties of compounds. These properties correlate with different descriptors. Permeation predictors are prominent in both efflux inhibition and avoidance models, suggesting that these predictors represent properties of compounds that are not rendered by MexB docking and physicochemical descriptors. Possibly, these descriptors reflect the ability of MexB and similar pumps to capture their substrates from the lipid bilayer and at the water-lipid bilayer interface. The majority of Rempex compounds efficiently permeate the OM of P. aeruginosapresumably by the self-promoted uptake mechanism-and their activities are only weakly affected by the OM barrier. Alternative libraries of compounds are needed to generate reliable models for OM permeation. Efflux avoidance and inhibition models are predictive of such properties among unrelated compounds, and the two models select different chemical classes of compounds. These models can be useful for in silico prefiltering of large compound libraries for the desired properties. Model-based optimization of efflux inhibitory activities leads to gain in antibiotic potentiation activities against P. aeruginosa.

MATERIALS AND METHODS
Chemicals and strains used. All strains used were described previously (13,28). The cells were grown in Luria-Bertani (LB) broth (10 g/liter tryptone, 5 g/liter yeast extract, and 5 g/liter NaCl) at 37°C with shaking. Rempex compounds were generated in the discovery/optimization campaign by Rempex Pharmaceuticals and provided by Qpex Biopharma. The EPI compounds in Fig. 6 and Table S5 were discovered, developed at, and provided by Basilea Pharmaceutica International, Ltd.
Antibacterial activities were tested using a 2-fold serial dilution broth assay as described previously (13). The antibacterial activities are expressed as MIC (defined as at least 90% of growth inhibition) and IC 50 .
The kinetics of Hoechst accumulation was analyzed as described previously in a temperature-controlled microplate reader (Tecan Spark 10M) in a fluorescence mode (15). Compounds were prescreened for possible interference with Hoechst fluorescence, and those compatible were further analyzed to establish concentration dependencies. The kinetic analysis was performed using a MatLab program as described previously (15).
MexB purification and surface plasmon resonance assays. MexB was purified from P. aeruginosa PAO1 cells harboring pMexB plasmid (55) using Cu 21 metal affinity chromatography as described previously (39,56). Surface plasmon resonance (SPR) experiments were performed using a Biacore T200 (GE Healthcare) equipped with a research-grade CM5 S2 sensor chip. The purified MexB was immobilized by amino coupling. The immobilization and subsequent binding experiments were conducted in a running buffer containing 25 mM HEPES-NaOH (pH 7.0), 150 mM NaCl, and 0.2% Triton X-100 as described previously (56). Rempex compounds were screened for binding to the immobilized MexB at a concentration of 25mM, followed by kinetic analysis of a selected subset of compounds at six different concentrations. Each compound/analyte was injected over the ligand and reference flow cells simultaneously at a flow rate of 30ml/min and at a temperature of 25°C. The complex was allowed to associate and dissociate for 20 to 30 s and 150 s, respectively. The data were fit into a simple 1:1 binding (interaction) model or two-state kinetic model using the global data analysis for the association and dissociation rate constants k a and k d , respectively, and R max available within Biacore Insight Evaluation software.
QSAR, QM, and MD calculations. For each compound, we generated the 2D structure data file (SDF format) and the protonation/charge state most populated at physiological pH 7.4 using the MOE package (57). We then used the ChemAxon's Marvin suite of programs (58) to obtain 1-2-3D descriptors commonly used in QSAR studies, such as number of heavy atoms, isoelectric point, van der Waals volume and surface, number of rotatable bonds, number of H-bond donors/acceptors, etc. (Table S2). These descriptors include LogP values obtained with the XLOGP3 program (59). The configuration of the major microspecies has been used as an input to QM calculations performed with the Gaussian16 package as described in previous work (43). We optimized the ground-state structure employing a polarizable continuum model (60) as to mimic the effect of water solvent particularly to avoid formation of strong intramolecular H-bonds. To confirm the geometry obtained to be a global minimum on the potential energy surface, we performed full vibrational analyses, obtaining real frequencies in all cases. On the optimized geometry, we then performed single-point energy calculations in vacuum to generate the atomic partial charges fitting the molecular electrostatic potential. Under the constraint of reproducing the electric dipole moment of the molecule, we used the Merz-Kollman scheme (61) to construct a grid of points around the molecule. Atomic partial charges were then generated through the two-step restrained electrostatic potential method (62) implemented in the AnteChamber package (63). Using this program, we derived general Amber force field (GAFF) parameters (64). QM descriptors associated with the groundstate optimized structure include static polarizabilities, frontier molecular orbital energies, permanent dipole moment, and rotational constants. For each compound, we performed 1-ms-long all-atom MD simulation in explicit water solution (0.1 M KCl) using the Amber18 package as described before (43). From MD simulations, we obtained structural and dynamic features of the compounds investigated by means of the PTRAJ and CPPTRAJ programs of Amber18 (65). The number and population of structural clusters were determined using a hierarchical agglomerative algorithm (66).
Ensemble docking to MexB. All molecular docking calculations were performed using the software AutoDock Vina (67), implementing a stochastic global optimization approach. The program was used with default settings except for the exhaustiveness (giving a measure of the exhaustiveness of the local search), which was set to 1,024 (default of 8). Protein and ligand input files were prepared with AutoDock Tools (68). Flexibility of both docking partners was considered indirectly by using the ensemble of conformations. In particular, for each compound we used 10 different cluster representatives extracted from MD simulations in explicit water solution, while for MexB, we considered 6 conformations, including available X-ray crystal structures (PDB ID no. 2V50, 3W9I, and 3W9J) (44,45) and MD snapshots extracted from MD simulations (46). For each docking run, we retained the top 10 docking poses. We performed two sets of guided docking runs into the two major substrate binding pockets of MexB: the access pocket of the access monomer (AP) and the deep binding pocket of the binding monomer (DP). In each case, the docking search was performed within a cubic volume of 40 by 40 by 40 Å 3 centered in the center of mass of the pocket. The interaction between each compound and MexB was quantified by means of a statistical analysis of all putative binding poses, yielding about 60 descriptors. These descriptors include average binding affinities (predicted according to the docking scoring function) as well as the total number of contacts with single residues lining the two pockets (see Table S2).
Permeation descriptors of interactions with the OM. Initial coordinates of the P. aeruginosa OM were downloaded (51). The model has been parameterized in line with the GLYCAM force field (69), and parameters are adapted to run in the GROMACS (70) molecular dynamics engine. The OM model consists of an inner leaflet composed of 1,2-dipalmitoyl-sn-glycero-3-phosphoethanolamine (DPPE) and an outer leaflet composed of a truncated LPS structure. The membrane is fully solvated using the TIP3P water model (71), and anionic charges in the LPS molecules are counterbalanced with Ca 21 cations. A schematic representation of the model is provided in Fig. 5, and its parameterization was described previously (72). Similarly, parameters for drug molecules, derived as described above (43), were consistently adapted from the general Amber force field (GAFF) (64) and transformed into GROMACS input files using the AnteChamber PYthon Parser interfacE (ACPYPE) tool (73).
To extract the molecular descriptors of drug permeation across the OM membrane, each drug was placed into seven different molecular environments corresponding to specific regions along the direction perpendicular to the OM (Fig. 4). These regions were explicitly selected in order to cover the influence of both the inner (DPPC) and outer leaflet (LPS) of the OM. Thus, seven independent simulations per drug were necessary in order to recapitulate the influence of the OM into the permeation process. The whole procedure was automated via a series of bash scripts, which iteratively connected the pulling code and energy minimization in GROMACS (70).
All simulations were run with the GROMACS 5.4.1 molecular dynamics engine 2 with a time step of 2 fs. The LINCS algorithm (74) was applied to constrain all bond lengths with a relative geometric tolerance of 10 24 . In line with its original parameterization, short-range interactions (van der Waals and Coulomb) were calculated using a cutoff scheme of 0.9 nm, which were evaluated based on a pair list recalculated every 5 time steps. Long-range interactions were handled using a reaction field (75) correction with a permittivity dielectric constant of 66. After initial setup, each system was energy minimized using 3,000 steps of conjugated gradient, followed by a thermal equilibration of 1 ns. A harmonic potential of 1,000 kJ mol 22 , along the Z vector connecting the center of mass (COM) of the drug and the OM of the membrane, was applied in order to maintain the relative position of the drug with respect to each of the seven different regions of the membrane as described in the system setup section. During equilibration, bilayers were coupled to 1.0 bar using a Berendsen barostat (76) through a semi-isotropic approach with a relaxation time of 1.0 ps. Afterwards, production runs were coupled using a Parrinello barostat (77) algorithm, and a constant temperature of 310 K was maintained by weak coupling of the solvent and solute separately to a velocity-rescaling (78) scheme with a relaxation time of 1.0 ps. Production simulations were run for 20 ns, and trajectories were saved each 20 ps.
A total of 4,207 (;84 ms) trajectories were analyzed using an in-home-developed bash script, which was directly interconnected to the in-built GROMACS tools.
Statistical and machine learning methods. Figure S2 shows an outline of the developed algorithm. There are two phases to feature selection. In the first phase, we employ a single sparse LASSO fit (using the cvglmnet function in the glmnet_python package), with a regularization parameter tuned to retain 50% 6 2% of the total descriptors in the batch of descriptors considered. In the second phase, using the retained ;50% of the descriptors, we run two further (nonsparse) regressions, employing shuffling of the data along with 5-fold cross-validation to assess the robustness of the coefficients that result for each descriptor in a simple binomial model. We retain at most one descriptor from each cluster computed by correlation clustering of the descriptor (sub)set. We choose to retain the descriptor with the largest ratio of average coefficient divided by standard deviation of coefficient, as we expect that to be the most consistent and hence most generalizable representative of this cluster. Finally, we refit in an identical manner using the cluster representatives and discard any descriptor with an average that is within a standard deviation of zero as being unimportant. We run the second phase 100 times on different stratified subsets of the training data in order to perform a bootstrap analysis of the consistency with which specific descriptors are chosen.
The modeling experiments and parameters are summarized in Table S4. Seven different variables derived from the following experimental ratios were used for model outputs (Table S4): efflux = IC 50 PD6-Pore / IC 50 PAO1-Pore , permeation = IC 50 PD6-Pore /IC 50 PD6 , EPI-1 = g(MIC PAO1 /MPC 8 PA1032 ), EPI-2 = g(MIC PD6-Pore /MPC 8 PA1032 ), EPI MPC = g(IC 50 PAO1 /MPC 8 PA1032 ), and EPI SS = SS 16 mM /SS 0 mM . SS concn refers to the steady-state HT accumulation ratio at that concentration, and fold difference is the fold difference in HT fluorescence (16mM/0 mM). The function g is a rescaling factor defined as where xi is the i-th entry in the ratio list and the MAX function is taken over the entire list.
For model fitting, we selected models created with the same number of total descriptors to avoid size effects. We employed up to the total number of descriptors retained during feature selection for the smallest subset (permeation). For final model fitting and assessment, we arrange the descriptors in order of the number of times they were chosen by the bootstrap phase of feature selection and then choose the top N descriptors, where N is the number returned by searching for a "gap" in the ordered descriptors by using the L method of Salvador and Chan (79) as a relatively conservative estimate that nonetheless does not retain descriptors in the tail of the distribution of the number of times they appeared in the bootstrap phase of feature selection. Once the final set of descriptors is selected, the LogisticRegressionCV class of the scikit-learn package was employed to learn a nonsparse binomial classifier employing the neg_log_loss scoring penalty with "balanced" class weight and an L2 penalty. The random state was arbitrarily set to 0 for consistency and ease of debugging.
In case of a 3-class multinomial regression classifier, feature selection was performed in a similar manner as for the binomial classifiers, except that in phase 1 we retain all descriptors that have a nonzero coefficient for any class, and in phase 2, we choose cluster representatives for all three classes. If two or more classes choose the same cluster representative, we retain it. Otherwise, we choose from the different class representatives randomly. Finally, we retain all descriptors for which at least for one class the average value is more than 1 standard deviation away from zero.
Model fitting is likewise performed in a similar manner, except that we employ an elasticnet penalty, which is a balance between L1 and L2 penalties that allows some of the descriptors to go to zero, in order to loosen the restrictions on descriptors relating to different classes, and we use the ovr, or onevs-rest, formulation. Because of this, we use all top descriptors (76) returned from feature selection on the full set of all descriptors.
In order to assess the quality of the classifiers learned, we employed bootstrapping to estimate errors (Fig. S2). For each model, we fit on an arbitrarily chosen training set of 75% of the data using the train_test_split function of the scikit-learn package. We ensured that the class balance was the same for both this and the retained 25% testing set.

SUPPLEMENTAL MATERIAL
Supplemental material is available online only. TEXT S1, DOCX file, 0.02 MB.