The Potential Inhibitors in Traditional Chinese Medicine for BCR-ABL T315I Mutation of Chronic Myelogenous Leukemia

Chronic Myelogenous Leukemia (CML) is a myeloproliferative disorder characterized by abnormal proliferation of white blood cells and by the presence of the Philadelphia chromosome. Current drugs that target ABL kinase may encounter resistance or carry risks of serious side effects. We performed molecular docking and 2D-QSAR modeling on ABL and its T315I mutant to discover potential candidate compounds for CML treatment. We present four potent Traditional Chinese Medicine (TCM) compounds, salvianolic acid C, baicalin, 1,4-dicaffeoylquinic acid, and dihydroisotanshinone I, as potential lead drug candidates. These compounds might have the potential to treat Chronic Myelogenous Leukemia with fewer side effects.


Introduction
Chronic Myelogenous Leukemia (CML) is a myeloproliferative disorder characterized by abnormal proliferation of white blood cells and by the presence of the Philadelphia chromosome [1]. Approximately 95% of CML patients have the Philadelphia chromosome [2]. The Philadelphia chromosome arises when the ABL kinase gene on chromosome 9 fuses with the BCR gene on chromosome 22; the translocation t(9;22)(q34;q11) forms the Bcr-Abl fusion gene, and normal regulatory function is lost. The results are excessive expression of tyrosine kinase, continued activation of the downstream signaling pathway, continuous cell proliferation, and inhibition of apoptosis. Currently, the first-generation drug for the treatment of CML is Imatinib, the second-generation drugs are Dasatinib, Nilotinib, and Bosutinib, and the third-generation drug is Ponatinib [3][4][5]. Imatinib is a Tyrosine Kinase Inhibitor (TKI) used in the clinical treatment of CML that inhibits Bcr-Abl tyrosine kinase activity and cell proliferation and ultimately induces apoptosis [6]. Patients with Bcr-Abl point mutations are sometimes resistant to Imatinib. In this situation, patients are evaluated for treatment with second-generation CML tyrosine kinase inhibitors such as Dasatinib, Nilotinib, and Bosutinib. The second-generation drugs can inhibit many different types of Bcr-Abl point mutations [4], but drug resistance still occurs in some cases with the T315I point mutation. In 2012, the United States Food and Drug Administration (FDA) approved Ponatinib as a third-generation CML tyrosine kinase inhibitor that effectively inhibits the T315I point mutation, although Ponatinib is expensive. In 2013, the FDA asked the manufacturer of Ponatinib to suspend promotion and sales because of the risk of life-threatening blood clots and severe narrowing of blood vessels.
Almost all currently approved drugs for CML treatment have uncomfortable side effects. In addition, Bcr-Abl point mutations confer resistance to Imatinib. Nilotinib, Dasatinib, and Bosutinib can inhibit Bcr-Abl with different types of point mutations, but they cannot overcome the resistance conferred by the Bcr-Abl T315I mutation. Ponatinib can overcome the T315I mutation of Bcr-Abl and inhibit the function of the tyrosine kinase. However, Ponatinib carries a serious risk of life-threatening vascular occlusion in humans. In this paper, we combine the Traditional Chinese Medicine (TCM) database with Chinese herbal formulas for CML treatment to screen for candidate compounds with cytotoxicity similar to that of currently approved drugs but with fewer side effects. We performed molecular docking and 2D-QSAR modeling on the ABL target and the ABL T315I mutant to discover potential candidate compounds for CML treatment.

Molecular docking
The structures of the active site of human Bcr-Abl tyrosine kinase and its mutant were downloaded from the Protein Data Bank (PDB ID: 2G1T, 2V7A) [7,8]. We collected herbal prescriptions for CML treatment from the Shanghai Chinese herbal prescriptions Innovation Center (http://www.sirc-tcm.sh.cn/en/index.html) [9]. The chemical components of these Chinese herbal medicine prescriptions were combined with the chemical structures from TCM Database@Taiwan [10] to produce the TCM compound library. Molecular docking was performed using the DS 2.5 LigandFit module with the Chemistry at HARvard Macromolecular Mechanics (CHARMm) force field to screen out the candidate compounds. The candidate compounds were assessed based on DockScore and ADMET pharmacokinetic properties, including absorption, solubility, blood-brain barrier (BBB) penetration, and plasma protein binding (PPB).

2D-Quantitative Structure-Activity Relationship models (2D-QSAR)
In this study, 18 candidate inhibitors with reported biological activity (pIC50) against Bcr-Abl and Bcr-Abl T315I were collected from the literature [5] (Table 1). The compounds were randomly assigned to a training set of 14 compounds and a test set of 4 compounds. The chemical structures of these inhibitors were drawn with ChemDraw Ultra 10.0 (CambridgeSoft Inc., USA) and transformed into 3D structures using Chem3D Ultra 10.0 (CambridgeSoft Inc., USA). We applied the DS 2.5 Calculate Molecular Property module to calculate the molecular descriptors for each inhibitor. Based on these molecular descriptors and the corresponding pIC50 values, the genetic function approximation (GFA) model was used to select the highly correlated (R² > 0.8) molecular descriptors for building the 2D-QSAR model of biological activity (pIC50). We used the training set compounds to build multiple linear regression (MLR), support vector machine (SVM), and Bayesian network (BN) models. We then used the test set compounds to assess the accuracy of these models.
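The data preparation step above can be illustrated with a minimal sketch. It assumes the standard definition pIC50 = -log10(IC50 in mol/L) and that IC50 values are given in nanomolar; the compound labels, the seed, and both function names are hypothetical and do not come from the study.

```python
import math
import random

def pic50_from_ic50_nm(ic50_nm):
    """Convert an IC50 given in nanomolar to pIC50 = -log10(IC50 in mol/L)."""
    return -math.log10(ic50_nm * 1e-9)

def split_train_test(items, n_test, seed=0):
    """Shuffle a copy of items reproducibly and hold out n_test of them."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical labels standing in for the 18 literature inhibitors.
compounds = [f"cpd{i:02d}" for i in range(1, 19)]
train, test = split_train_test(compounds, n_test=4)
print(len(train), len(test))
print(round(pic50_from_ic50_nm(10.0), 2))
```

Fixing the seed makes the random 14/4 assignment repeatable, which helps when the three model types are compared on the same held-out compounds.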
Multiple linear regression (MLR) models the relationship between two or more explanatory variables and a response variable by fitting a linear function [11]. The model equation is as follows:
$$\mathrm{pIC}_{50} = a_0 + a_1 x_1 + a_2 x_2 + \cdots + a_n x_n$$
where $x_i$ represents the i-th molecular property and $a_i$ is the corresponding fitting coefficient. The MLR model was constructed with the training data set and applied for prediction and validation. The square of the correlation coefficient (R²) between the predicted and actual pIC50 values was used to verify the accuracy of the model. Finally, the MLR model was used to predict the pIC50 of the candidate compounds in the TCM library.
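As a rough illustration of the MLR fit and the R² check described above, the following sketch fits the coefficients by ordinary least squares (normal equations) and computes R² as the squared Pearson correlation between actual and predicted values. The descriptor rows and responses are made up, the function names are hypothetical, and the study itself used Discovery Studio rather than this code.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_mlr(rows, y):
    """Least-squares fit of y = a0 + a1*x1 + ... + an*xn via normal equations."""
    design = [[1.0] + list(r) for r in rows]  # leading 1 for the intercept a0
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)

def predict_mlr(coeffs, row):
    """Evaluate a0 + a1*x1 + ... + an*xn for one descriptor row."""
    return coeffs[0] + sum(a * x for a, x in zip(coeffs[1:], row))

def r_squared(actual, predicted):
    """Square of the Pearson correlation between actual and predicted values."""
    n = len(actual)
    ma, mp = sum(actual) / n, sum(predicted) / n
    cov = sum((a - ma) * (p - mp) for a, p in zip(actual, predicted))
    va = sum((a - ma) ** 2 for a in actual)
    vp = sum((p - mp) ** 2 for p in predicted)
    return cov * cov / (va * vp)

# Made-up descriptor rows (two descriptors each) and responses from a known linear law.
X = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 5.0]]
y = [0.5 + 1.0 * x1 + 2.0 * x2 for x1, x2 in X]
coeffs = fit_mlr(X, y)
pred = [predict_mlr(coeffs, row) for row in X]
print([round(c, 6) for c in coeffs], round(r_squared(y, pred), 6))
```

Because the toy responses are generated from an exact linear law, the fit recovers the generating coefficients; real descriptor data would give R² below 1.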
An SVM is primarily a classifier that distinguishes between two categories of data [12]; here we used its regression form. We constructed the regression support vector machine model using LibSVM [13][14][15]. The Gaussian radial basis function was chosen as the kernel:
$$K(\mathbf{x}_i, \mathbf{x}_j) = \exp\left(-\gamma \lVert \mathbf{x}_i - \mathbf{x}_j \rVert^2\right), \quad \gamma > 0$$
where $\mathbf{x}_i$ and $\mathbf{x}_j$ are the descriptor vectors of two compounds. The squared correlation coefficient (R²) between the actual and predicted pIC50 values represents the accuracy of the prediction model.
A Bayesian network model is a probabilistic graphical model that represents a set of variables and their conditional dependencies via a directed network. The network can be used to compute the probabilities of the various data categories. We discretized the training data and the test data into bins according to their pIC50 values. Linear regression analysis was then performed for each pIC50 bin. The Bayesian Network Toolbox (BNT) [16] for Matlab was used to construct the Bayesian network model for predicting the pIC50 bin of the training data. Assuming that the i-th pIC50 bin has n compounds, let $y_{ij}$ represent the pIC50 value of the j-th compound in the training data and $x_{ijp}$ its p-th molecular property.
The chemical structures of the inhibitors and their biological activity (pIC50).
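The Gaussian radial basis function kernel used by the SVM model can be sketched as follows. This only illustrates the kernel itself, not the LibSVM training procedure; the descriptor vectors and the value of gamma are hypothetical.

```python
import math

def rbf_kernel(x, z, gamma):
    """Gaussian RBF kernel K(x, z) = exp(-gamma * ||x - z||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)

def kernel_matrix(rows, gamma):
    """Pairwise kernel values for a list of descriptor vectors."""
    return [[rbf_kernel(a, b, gamma) for b in rows] for a in rows]

# Hypothetical descriptor vectors for three compounds.
rows = [[0.0, 0.0], [1.0, 0.0], [3.0, 4.0]]
K = kernel_matrix(rows, gamma=0.5)
for line in K:
    print([round(v, 4) for v in line])
```

The kernel equals 1 for identical vectors and decays toward 0 as the vectors move apart, so a larger gamma makes the model more local.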
The screening workflow for the candidate compounds described above is shown in Figure 1.

Molecular docking
The docking simulation evaluated the binding interactions between the target proteins and the TCM compounds. The reference compound was Ponatinib. We also performed molecular docking for the first- and second-generation drugs. After assessing the DockScore and the ADMET pharmacokinetic properties of the candidate compounds, we propose four potential inhibitory compounds (Figure 2): salvianolic acid C, baicalin, 1,4-dicaffeoylquinic acid, and dihydroisotanshinone I.
The docking scores are shown in Table 2. Salvianolic acid C, baicalin, and 1,4-dicaffeoylquinic acid had high DockScores with both the Bcr-Abl fusion protein and the Bcr-Abl T315I mutant protein. The DockScores of these three candidate compounds were higher than those of the first- and second-generation drugs and Ponatinib.
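The pharmacokinetic properties of Ponatinib, the first- and second-generation drugs, and the TCM candidate compounds are shown in the accompanying table.
For the i-th pIC50 bin of the Bayesian network model, the regression model of the data set $\{y_{ij}, x_{ij1}, \ldots, x_{ijp}\}_{j=1}^{n}$ can be written as follows:
$$\mathbf{y}_i = \mathbf{X}_i \boldsymbol{\beta}_i + \boldsymbol{\varepsilon}_i$$
where $\mathbf{y}_i = (y_{i1}, \ldots, y_{in})^{\top}$, the j-th row of $\mathbf{X}_i$ holds the molecular properties $x_{ij1}, \ldots, x_{ijp}$, and $\boldsymbol{\beta}_i$ and $\boldsymbol{\varepsilon}_i$ are the regression coefficients and error terms for the i-th pIC50 bin. The unknown regression coefficients $\boldsymbol{\beta}_i$ can be estimated by the least squares method.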
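The per-bin regression can be sketched as below. This is a simplified illustration under stated assumptions: each compound has a single descriptor, an intercept is included, the bin edges are supplied by hand, and the bin of a new compound is taken as given (in the study it is predicted by the Bayesian network). All names and numbers are hypothetical.

```python
from collections import defaultdict

def fit_line(xs, ys):
    """Ordinary least squares for y = b0 + b1 * x (single descriptor)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    b1 = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b1 * mx, b1

def fit_per_bin(samples, edges):
    """Group (descriptor, pIC50) pairs by pIC50 bin and fit one line per bin."""
    groups = defaultdict(list)
    for x, y in samples:
        bin_index = sum(y >= e for e in edges)  # number of edges at or below y
        groups[bin_index].append((x, y))
    return {b: fit_line(*zip(*pairs)) for b, pairs in groups.items()}

def predict_in_bin(models, bin_index, x):
    """Apply the regression coefficients of the given bin to a descriptor value."""
    b0, b1 = models[bin_index]
    return b0 + b1 * x

# Made-up training pairs (descriptor value, pIC50) and a single bin edge at 7.0.
samples = [(1.0, 5.0), (2.0, 5.5), (3.0, 6.0), (1.0, 7.5), (2.0, 8.5), (3.0, 9.5)]
models = fit_per_bin(samples, edges=[7.0])
print(models)
print(predict_in_bin(models, 1, 4.0))
```

Fitting a separate regression per bin lets each activity range have its own coefficients, at the cost of fewer training compounds per fit.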
Then the pIC50 value of a compound assigned to the k-th bin can be predicted by the following equation:
$$\hat{y}_k = \mathbf{x}_k^{\top} \hat{\boldsymbol{\beta}}_k$$
where $\mathbf{x}_k$ holds the molecular properties of the compound and $\hat{\boldsymbol{\beta}}_k$ is the least squares estimate of the regression coefficients for the k-th bin. The square of the correlation (R²) between the predicted and actual pIC50 values can be used to verify the accuracy of the model.
The binding interactions of the candidate compounds are shown in Figures 3.1 and 3.2. Nilotinib binds to Arg332 of Bcr-Abl with a single hydrogen bond (Figure 3.1A). Imatinib binds to Phe439 of Bcr-Abl with a single hydrogen bond (Figure 3.1B). Dasatinib binds to Leu340, Gly463, Arg332, Tyr435, and Ala337 of Bcr-Abl via double hydrogen bonds, π-cation, and π-σ bonds (Figure 3.1C). 1,4-Dicaffeoylquinic acid binds to Bcr-Abl through three hydrogen bonds, H55-O12, H56-O28, and H63-O8 (Figure 3.1D). Baicalin binds to Tyr435 of Bcr-Abl by a double hydrogen bond (Figure 3.1E). Salvianolic acid C binds to Gly463, Ala337, and Ala433 of Bcr-Abl with three hydrogen bonds (Figure 3.1F). Dihydroisotanshinone I binds to Pro465 of Bcr-Abl with a π-σ bond (Figure 3.1G).

Symbol: Description
ES_Sum_aaN: The sum of the electrotopological state values for atom type aaN, where "a" represents an aromatic bond and "N" is the nitrogen atom.
Kappa_2: Kier's Second Order Shape Index.
Jurs_PNSA_1: Partial Negative Surface Area.
Jurs_PNSA_3: Atomic Charge Weighted Negative Surface Area.
Shadow_XY: Shadow Index for the XY plane.
ES_Sum_ssNH: The sum of the electrotopological state values for atom type ssNH, where "s" represents a single bond and "NH" is the NH group.
Num_RotatableBonds: The number of bonds that allow free rotation around themselves.
Jurs_FPSA_1: Fractional Charged Partial Surface Area: PPSA-1/MW.

The correlation between the observed pIC50 values for Bcr-Abl and the values predicted by the 2D-QSAR models is shown in Figure 5, and the corresponding correlation for Bcr-Abl T315I is shown in Figure 6. For Bcr-Abl, the squared correlation coefficient (R²) is 0.913 for the MLR model, 0.712 for the SVM model, and 0.952 for the BNT model (Figure 5). Except for the SVM model (R² = 0.712), all models exceed 0.8, indicating a high correlation. For Bcr-Abl T315I, R² is 0.989 for the MLR model, 0.808 for the SVM model, and 0.834 for the BNT model (Figure 6). All three models exceed 0.8, indicating a high degree of correlation.

We used the MLR, SVM, and BNT 2D-QSAR models to predict the biological activity (pIC50) of the currently approved drugs for CML treatment and of the TCM candidate compounds. The results are shown in Table 5.
Salvianolic acid C is present in Danshen. Baicalin was found in Bing Tou Huang Qin, Chuan Huang Qin, Da Che Qian, Baihua Dan Shen, Dian Huang Qin, Gan Su Huang Qin, Mu Hudie, Mu Hu Die Shu Pi, and Zhan Mao Huang Qin; it is mainly found in Huang Qin. 1,4-Dicaffeoylquinic acid is present in Cang Er, and dihydroisotanshinone I can be extracted from Baihua Dan Shen. Danshen is widely used in Chinese herbal medicine to promote circulation and improve blood flow, and it is often used in the treatment of many diseases, including cancer [19,20]. The main components of Danshen are hydrophilic phenolic acids and lipophilic tanshinones with anti-cancer effects [21,22]. Salvianolic acid C is a phenolic compound in Danshen. Baicalin is a component of Huang Qin. Previous studies show that baicalin has therapeutic effects on cancer [23][24][25][26][27]. Cang Er is used in Chinese medicine to treat typhoid fever-related headache, sinusitis, urticaria, and arthritis [28]. Its caffeoylquinic acid constituents have antioxidant, anti-inflammatory, and anti-microbial effects, and they inhibit enzymes and platelet aggregation [29]. Dihydroisotanshinone I, another component of Danshen, is cytotoxic to various cancer cells [30,31]; it inhibits endothelial cell proliferation and angiogenesis and induces cell growth arrest in S phase, leading to apoptosis.

Conclusion
We performed structure-based virtual screening and QSAR modeling to select potential TCM candidate compounds. The results show that salvianolic acid C, baicalin, 1,4-dicaffeoylquinic acid, and dihydroisotanshinone I might have the potential to treat Chronic Myelogenous Leukemia with fewer side effects.