A Machine Learning-Based Prediction Platform for P-Glycoprotein Modulators and Its Validation by Molecular Docking

P-glycoprotein (P-gp) is an important determinant of multidrug resistance (MDR) because its overexpression is associated with increased efflux of various established chemotherapy drugs in many clinically resistant and refractory tumors. This leads to insufficient therapeutic targeting of tumor populations, representing a major drawback of cancer chemotherapy. Therefore, P-gp is a target for pharmacological inhibitors to overcome MDR. In the present study, we utilized machine learning strategies to establish a model for P-gp modulators to predict whether a given compound would behave as substrate or inhibitor of P-gp. Random forest feature selection algorithm-based leave-one-out random sampling was used. Testing the model with an external validation set revealed high performance scores. A P-gp modulator list of compounds from the ChEMBL database was used to test the performance, and predictions from both substrate and inhibitor classes were selected for the last step of validation with molecular docking. Predicted substrates revealed similar docking poses than that of doxorubicin, and predicted inhibitors revealed similar docking poses than that of the known P-gp inhibitor elacridar, implying the validity of the predictions. We conclude that the machine-learning approach introduced in this investigation may serve as a tool for the rapid detection of P-gp substrates and inhibitors in large chemical libraries.


Introduction
ATP-binding cassette (ABC) transporters are energy-dependent efflux pumps responsible for the active efflux of drugs, thereby reducing their intracellular concentration. Due to overexpression of ABC transporters in tumor cells, multidrug resistance (MDR) develops, which leads to the failure of chemotherapy with fatal consequences for cancer patients [1]. P-glycoprotein, being a well-known member among the ABC transporter family, is encoded by the ABCB1/MDR1 gene. It is an important determinant of MDR [2][3][4] and upregulated in many clinically resistant and refractory tumors [5,6]. Its overexpression in tumor cells is associated with efficient extrusion of a large number of established anticancer drugs and natural cytotoxic products out of cancer cells, representing a major drawback of cancer chemotherapy [7]. Resistance is either inherently present or will be acquired during chemotherapy [8][9][10]. Hence, P-glycoprotein (P-gp) represents an important target to search for pharmacological inhibitors to overcome MDR [11]. Targeting P-gp to overcome MDR is of importance to achieve higher success rates for chemotherapy. The concept is to combine P-gp inhibitors with established chemotherapy drugs to resensitize tumors [12][13][14][15].
Machine learning and artificial intelligence are recently acquiring increasing interest in the area of drug discovery [16][17][18] because these methods have an enormous potential to speed up the preclinical Data Warrior software is a multipurpose chemistry data visualization and data analysis program that calculates various molecular descriptors and properties for a given set of compounds. It was used to calculate the chemical descriptors as previously reported [24,25]. After calculation of the 32 chemical descriptors, correlation coefficients between descriptors and correlation of the descriptors with the P-gp modulator category (substrate or inhibitor) were determined using SPSS statistics software version 23.0.0.3 (IBM, Armonk, NY: IBM Corp, USA). If the correlation coefficient between the P-gp modulator category (substrate or inhibitor) and a certain descriptor was below 0.1, this descriptor was omitted. Only descriptors correlating with the P-gp modulator (substrate or inhibitor) category above 0.1 were selected for further processing. As a next step, descriptors having a pairwise correlation coefficient to the P-gp modulator category lower than 0.9 were excluded [26]. By this strategy, relevant descriptors without an issue of over-fitting can be selected.

P-Glycoprotein Modulator Prediction Model Establishment
At first, a model, which can predict whether a given compound is a P-gp modulator, was built by using the compound list from Broccatelli et al. [22] After applying the descriptor selection criteria by considering the relevancy and over-fitting issues, "logP", "H-donors", "polar surface area", "ligand efficiency dependent lipophilicity", "molecular complexity", "stereo centers", "rotatable bonds", "rings closures", "aromatic rings", "sp-3 atoms", "amides", "amines", "alkyl-amines, "and "basic nitrogens" were considered for the preparation of the P-gp modulator/non-modulator prediction model. Various classification algorithms with the leave-one-out random sampling method were tested, i.e., k-Nearest Neighboring (kNN), Neural Network, Random Forest (RF), and Support Vector Machine (SVM). Receiver operating characteristic (ROC) curves are depicted in Figure 1. The receiver operating characteristic (ROC) curve plotted the true positive rate (= sensitivity) against the false positive rate (= 1-specificity). The RF algorithm performed better than the other classification algorithms both in learning and validation steps. The overall performance for the established model based on RF algorithm is summarized in Table 3. The establishment of the P-gp modulator/non-modulator and P-gp inhibitor/substrate prediction models were performed by using the machine learning software Orange (Ljubljana, Slovenia) [27]. chemical descriptors, correlation coefficients between descriptors and correlation of the descriptors with the P-gp modulator category (substrate or inhibitor) were determined using SPSS statistics software version 23.0.0.3 (IBM, Armonk, NY: IBM Corp, USA). If the correlation coefficient between the P-gp modulator category (substrate or inhibitor) and a certain descriptor was below 0.1, this descriptor was omitted. Only descriptors correlating with the P-gp modulator (substrate or inhibitor) category above 0.1 were selected for further processing. As a next step, descriptors having a pairwise correlation coefficient to the P-gp modulator category lower than 0.9 were excluded [26]. By this strategy, relevant descriptors without an issue of over-fitting can be selected.

P-Glycoprotein Modulator Prediction Model Establishment
At first, a model, which can predict whether a given compound is a P-gp modulator, was built by using the compound list from Broccatelli et al. [22] After applying the descriptor selection criteria by considering the relevancy and over-fitting issues, "logP", "H-donors", "polar surface area", "ligand efficiency dependent lipophilicity", "molecular complexity", "stereo centers", "rotatable bonds", "rings closures", "aromatic rings", "sp-3 atoms", "amides", "amines", "alkyl-amines, "and "basic nitrogens" were considered for the preparation of the P-gp modulator/non-modulator prediction model. Various classification algorithms with the leave-one-out random sampling method were tested, i.e., k-Nearest Neighboring (kNN), Neural Network, Random Forest (RF), and Support Vector Machine (SVM). Receiver operating characteristic (ROC) curves are depicted in Figure 1. The receiver operating characteristic (ROC) curve plotted the true positive rate (= sensitivity) against the false positive rate (= 1-specificity). The RF algorithm performed better than the other classification algorithms both in learning and validation steps. The overall performance for the established model based on RF algorithm is summarized in Table 3. The establishment of the P-gp modulator/nonmodulator and P-gp inhibitor/substrate prediction models were performed by using the machine learning software Orange (Ljubljana, Slovenia) [27].    After applying the descriptor selection criteria by considering the relevancy and over-fitting issues, "logP", "total surface area", "shape index", "molecular flexibility", "rotatable bonds", "aromatic rings", "aromatic atoms", "aromatic nitrogens", "basic nitrogens", "symmetric atoms", and "acidic oxygens" were considered for P-gp inhibitor/substrate prediction model preparation. Various classification algorithms with the leave-one-out random sampling method were tested, i.e., kNN, Neural Network, RF, and SVM. The ROC curves are depicted in Figure 2. The RF algorithm performed better than the other classification algorithms. The overall performance for the established model is summarized in Table 4. After applying the descriptor selection criteria by considering the relevancy and over-fitting issues, "logP", "total surface area", "shape index", "molecular flexibility", "rotatable bonds", "aromatic rings", "aromatic atoms", "aromatic nitrogens", "basic nitrogens", "symmetric atoms", and "acidic oxygens" were considered for P-gp inhibitor/substrate prediction model preparation. Various classification algorithms with the leave-one-out random sampling method were tested, i.e., kNN, Neural Network, RF, and SVM. The ROC curves are depicted in Figure 2. The RF algorithm performed better than the other classification algorithms. The overall performance for the established model is summarized in Table 4.  In order to evaluate the model performance further and select potential inhibitors, a P-gp modulator compound list consisting of 643 compounds from ChEMBL was used.

Molecular Docking
The recently published human P-gp structure was used (nanodisc reconstituted in complex with UIC2 fab and paclitaxel at the drug-binding pocket, PDB ID: 6QEX, in the absence of a lipid bilayer) [28]. The Fab chains were deleted. The bound ligands marked as "HETATM" including taxol were also deleted from the PDB structure file in order to prevent interference with molecular docking. The preparation of the final receptor structure as ".pdbqt" file was performed with Autodock tools 1.5.7. Selected compounds from inhibitor and substrate classes have been subjected to an automated and comprising molecular docking campaign by using the high-performance supercomputer MOGON (Johannes Gutenberg University, Mainz). Compound flexibilities were taken into account and a rigid receptor structure was used. At first, three independent screening of all 643 compounds from ChEMBL with Autodock Vina algorithm was performed by focusing on the drug-binding pocket of P-gp, where the majority of the known inhibitors and substrates bind to. The grid parameters are listed in Table 5.  In order to evaluate the model performance further and select potential inhibitors, a P-gp modulator compound list consisting of 643 compounds from ChEMBL was used.

Molecular Docking
The recently published human P-gp structure was used (nanodisc reconstituted in complex with UIC2 fab and paclitaxel at the drug-binding pocket, PDB ID: 6QEX, in the absence of a lipid bilayer) [28]. The Fab chains were deleted. The bound ligands marked as "HETATM" including taxol were also deleted from the PDB structure file in order to prevent interference with molecular docking. The preparation of the final receptor structure as ".pdbqt" file was performed with Autodock tools 1.5.7. Selected compounds from inhibitor and substrate classes have been subjected to an automated and comprising molecular docking campaign by using the high-performance supercomputer MOGON (Johannes Gutenberg University, Mainz). Compound flexibilities were taken into account and a rigid receptor structure was used. At first, three independent screening of all 643 compounds from ChEMBL with Autodock Vina algorithm was performed by focusing on the drug-binding pocket of  Table 5. Afterward, the top 20 compounds in terms of binding energy yielded from both inhibitor and substrate predictions were selected for molecular docking. Each molecular docking was based on three independent dockings each consisting of 2,500,000 calculations. This means that each data point represents the mean value of 7,500,000 individual MOGON-based calculations. The Autodock 4 algorithm was used for defined molecular docking calculations on the drug-binding pocket of P-gp as described before [11], and Visual Molecular Dynamics (VMD) software (Theoretical and Computational Biophysics group at the Beckman Institute, University of Illinois at Urbana-Champaign) was used for the visualization of the docking poses. Estimated inhibition constants were calculated by the Autodock algorithm with the equation: Ki (M) ∆G (cal/mol) = 1000 * LBE (lowest binding energy, kcal/mol) R (cal/mol-K): gas constant, 1.986 cal/mol-K T (K): room temperature, 298 K

Boxplot Analysis
The distribution of the values for the descriptors used for the P-gp inhibitor/substrate prediction model and the comparison for the predicted inhibitors and substrates among the ChEMBL P-gp modulator list were subjected to Boxplot analysis using Microsoft Excel 2019 (Microsoft, USA). Statistical significances were evaluated by the t-test (two-tailed, two-sample unequal variance).

P-glycoprotein Modulator Predictions
The P-gp modulator/non-modulator prediction model was evaluated with the validation set as mentioned in the corresponding method part. The RF algorithm reached 0.938 for all parameters. The ChEMBL P-gp modulator list of 643 compounds was tested, and 641 out of 643 substances were correctly predicted as modulators.
The P-gp inhibitor/substrate prediction model with the ChEMBL P-gp modulator list of 643 compounds was evaluated. A total of 493 substances were predicted as inhibitors, and 150 compounds were predicted as substrates. Subjecting all compounds to Autodock Vina screening allowed to rank them according to their binding energies. The top 20 inhibitor predictions with strong interaction to P-gp are shown in Table 6. These inhibitors were selected for subsequent molecular docking. The top 20 substrate predictions with strong interaction to P-gp are shown in Table 7. These substrates were also selected substances for subsequent molecular docking. The complete predictions for all 493 inhibitors together with their binding affinities to P-gp are shown in Supplementary Table S1, while all predictions for the 150 substrates and their affinities to P-gp are listed in Supplementary Table S2. The average lowest binding energy (LBE) was -8.155 for the inhibitors and -9.289 for the substrates.   Table S1). The proportion of natural products was higher among the predicted P-gp substrates (69/150 = 46%) (Supplementary Table S2). This trend was even more apparent if we focused on the top 20 inhibitor or substrate compounds only (Tables 6 and 7). Here, 2/20 (= 10%) were predicted inhibitors, but 11/20 (= 55%) were predicted substrates, indicating that P-glycoprotein may expel natural xenobiotics from cells with higher probability.

Molecular Docking
After running the prediction model on the P-gp modulator list from ChEMBL and the Autodock VINA screening, the top 20 compounds from the inhibitor class and the top 20 compounds from the substrate class were selected for molecular docking analyses on human P-gp. The lowest binding energies (LBE) and predicted inhibition constants are listed in Table 8 for the inhibitors and Table 9 for the substrates.  The negative control compounds (oxprenolol, promazine, riluzole) revealed weaker interaction with P-gp (Table 10) and slightly different docking pose as well ( Figure 3).
As can be seen in Figure 4, the predicted inhibitors possessed similar docking poses as elacridar at the drug-binding pocket of P-gp. Similar results were observed for the substrates: The predicted substrates revealed similar docking poses as doxorubicin. Hence, these results validated the precision and reliability of the model. Table 10. Lowest binding energies (LBE) and predicted inhibition constants obtained by molecular docking of the non-modulators.  As can be seen in Figure 4, the predicted inhibitors possessed similar docking poses as elacridar at the drug-binding pocket of P-gp. Similar results were observed for the substrates: The predicted substrates revealed similar docking poses as doxorubicin. Hence, these results validated the precision and reliability of the model.  Predicted inhibitors and substrates interact with P-gp significantly stronger than the negative control compounds. This is clear both from the binding energies and predicted inhibition constants. Binding energies of non-modulators are within -5.380 (piluzole) to -6.933 (promazine) kcal/mol and the predicted inhibition constants are within 8.273-114.080 µ M, whereas binding energies for the predicted substrates are within -7.337 (vindoline) to -12.500 (latilagescene G) and for the predicted inhibitors -8.900 (3-methylcholanthrene) to -13.537 (karavoate P). Predicted inhibition constants for the predicted substrates are within 0.001-4.363 and for the predicted inhibitors 0.0002-0.300 µ M. Docking pose of the negative control compounds differs from that of inhibitors and substrates. Overall, it can be speculated that the predicted inhibitors interact with P-gp stronger than the predicted substrates and the non-modulators are making weak interactions with P-gp and they bind to a different site.

P-gp Inhibitor AutoDock LBE (kcal/mol) Predicted Inhibition Constant (µM)
The distribution of the values for the descriptors used to build the model and the comparison for the predicted inhibitors and substrates in terms of those descriptor values were performed with Predicted inhibitors and substrates interact with P-gp significantly stronger than the negative control compounds. This is clear both from the binding energies and predicted inhibition constants. Binding energies of non-modulators are within −5.380 (piluzole) to −6.933 (promazine) kcal/mol and the predicted inhibition constants are within 8.273-114.080 µM, whereas binding energies for the predicted substrates are within −7.337 (vindoline) to −12.500 (latilagescene G) and for the predicted inhibitors −8.900 (3-methylcholanthrene) to −13.537 (karavoate P). Predicted inhibition constants for the predicted substrates are within 0.001-4.363 and for the predicted inhibitors 0.0002-0.300 µM. Docking pose of the negative control compounds differs from that of inhibitors and substrates. Overall, it can be speculated that the predicted inhibitors interact with P-gp stronger than the predicted substrates and the non-modulators are making weak interactions with P-gp and they bind to a different site.
The distribution of the values for the descriptors used to build the model and the comparison for the predicted inhibitors and substrates in terms of those descriptor values were performed with Boxplot analysis. As can be seen from Figure 5, the inhibitors revealed significantly different values for all descriptors except logP and acidic oxygens. The average values of descriptors for inhibitors and substrates are listed in Table 11.

Discussion
In the present study, we utilized machine learning methods based on leave-one-out random sampling in order to develop a P-gp modulator prediction platform by using chemical descriptors. The main focus was to predict whether a given compound can behave as substrate or inhibitor of P-gp. The RF classification algorithm (AUC:0.774) outperformed the other tested algorithms (kNN-0.676, Neural Network-0.745, SVM-0.720). Performance scores for the external validation set were even higher than the learning set with better sensitivity (0.786 vs. 0.750), specificity (0.833 vs. 0.700), overall prediction accuracy (0.800 vs. 0.725), and precision (0.917 vs. 0.714). Further testing with the P-gp modulator list from ChEMBL yielded promising results with accurate predictions. Four compounds from inhibitor and four compounds from substrate prediction list were selected for molecular docking analyses. Validations with molecular docking on a recently released human P-gp structure were performed in terms of binding energy and docking poses by including known inhibitor (elacridar) and substrate (doxorubicin) as controls. Curcumin, miconazole, tacrolimus, and venlafaxine revealed a similar docking pose at the drug-binding pocket of P-gp with comparable binding energies with that of elacridar. MK-3207, rifampin, vindoline, and voacamine revealed similar docking poses and comparable binding energy with those of doxorubicin. Overall, the precision and reliability of the model were further confirmed.
Machine learning and artificial intelligence attracted increasing interest in the drug discovery area [18,29,30], and utilizing these methods possess great potential for drug discovery, as they save time and costs during the preclinical steps. The RF algorithm depends on multiple decision trees that are built based on the training data, and a majority voting scheme is used to make classification or regression predictions [31]. RF application to drug discovery has been recently reported, and it outperformed other algorithms such as SVM and NN in terms of feature selection [32].
There are various studies in the literature that utilized machine-learning strategies focusing on P-gp. One study pointed out a P-gp substrate prediction model based on RF algorithm to estimate transport potential for central nervous system drugs, accuracy lies between 0.713 and 0.846 whereas precision is between 0.633 and 0.777 [33]. Our P-gp modulator prediction model involves an accuracy of 0.953 for the learning set and 0.938 for the validation set, and our P-gp inhibitor prediction model has an accuracy value of 0.725 for the learning set and 0.800 for the validation set. In terms of precision, our models also perform better. Modulator prediction model involves a precision of 0.968 for the learning set and 0.938 for the validation set. Inhibitor prediction model has a 0.714 precision for the learning set and 0.917 for the validation set. Similarly, a P-gp substrate efflux ratio prediction model has been recently reported based on SVM algorithm [34]. The affinities of flavonoids to P-gp have been evaluated with an SVM-based model and a high correlation with the experimental data has been achieved [35]. Another study involving P-gp inhibitor prediction was performed for chalcone derivatives and selected inhibitor candidates were analyzed in terms of their docking pose on a homology model of human P-gp [36]. The prediction of blood-brain barrier permeability mechanism of central nervous system drugs has been utilized with an SVM-based model [37]. Binding pattern prediction based on pharmacophore ensemble/SVM method for potential P-gp inhibitors was also recently reported [38]. Another SVM-based model coupled with molecular docking aimed to predict whether a given compound may act as P-gp substrate, the accuracy lies between 0.750 and 0.800, specificity between 0.750 and 0.810, and sensitivity between 0.740 and 0.790 [39]. Our modulator prediction model outperforms that model in all those parameters. Our inhibitor prediction model outperforms in the validation set. Similarly, in 2004, SVM-based P-gp substrate prediction model was reported; sensitivity was 0.812, specificity was 0.792, and accuracy was 0.794 [40]. Our modulator prediction model outperforms that model in all those parameters. Our inhibitor prediction model outperforms in the validation set for the specificity and accuracy parameters. In general, these previously published studies have certain disadvantages, e.g., low performance scores in terms of prediction, focusing on only P-gp substrate prediction or molecular docking with homology models but not crystal structures. Our model is superior compared to the previously published studies for several reasons. It is based on leave-one-out random sampling RF algorithm, focused on both natural as well as synthetic compounds, has high sensitivity, specificity, predictive accuracy, and precision to predict at first P-gp modulator/non-modulator and as a next step to predict P-gp substrate/inhibitor depending on various chemical descriptors, and it was coupled with molecular docking using the recently released crystal structure of human P-gp. The fact that predictions on the P-gp modulator list of compounds from ChEMBL was validated with accurate molecular docking results was also advantageous for our model. Furthermore, after the initial compound screening, selected inhibitors revealed similar docking poses as elacridar (as positive control for an inhibitor) and selected substrates revealed similar docking poses as doxorubicin (as positive control for a substrate). Non-modulators have significantly weaker interaction with P-gp and they bind to a slightly different position. Overall, those observations provide further clues for the reliability of the prediction model.
Many cancer types involve P-gp overexpression, which is associated with increased efflux of established anticancer drugs and natural cytotoxic products out of cancer cells. This phenomenon represents a major drawback of cancer chemotherapy with limitations in killing tumor populations due to MDR [61,62]. P-gp overexpression is indeed one of the main reasons for MDR and thus inadequate chemotherapy success rate. Targeting P-gp is critical to achieve high success rates for chemotherapy, therefore, identification of novel P-gp inhibitors is critical in that regard.
Our prediction platform for P-gp modulators facilitates to predict whether a given compound can behave as a substrate or an inhibitor of P-gp. The selection of potential inhibitors can be further validated by molecular docking and the comparison of the binding energy and docking pose with those of known P-gp inhibitors. As a next step in the future, our model may be helpful to identify potential novel P-gp inhibitors and to develop effective chemotherapy strategies involving combination therapy with targeted chemotherapy drugs and identified P-gp inhibitors.

Conclusion
In the present study, we established P-gp modulator/non-modulator and inhibitor/substrate prediction models based on the RF algorithm and leave-one-out random sampling. Validation with molecular docking was performed. The identification of novel P-gp inhibitors is critical to overcome MDR and to achieve better chemotherapy strategies. This model can predict whether a given compound can behave as substrate or inhibitor of P-gp, and will be, thus, helpful to identify potential P-gp inhibitors.