3D-QSAR Studies on a Series of Dihydroorotate Dehydrogenase Inhibitors: Analogues of the Active Metabolite of Leflunomide

The active metabolite of the novel immunosuppressive agent leflunomide has been shown to inhibit the enzyme dihydroorotate dehydrogenase (DHODH). This enzyme catalyzes the fourth step in de novo pyrimidine biosynthesis. Self-organizing molecular field analysis (SOMFA), a simple three-dimensional quantitative structure-activity relationship (3D-QSAR) method is used to study the correlation between the molecular properties and the biological activities of a series of analogues of the active metabolite. The statistical results, cross-validated rCV2 (0.664) and non cross-validated r2 (0.687), show a good predictive ability. The final SOMFA model provides a better understanding of DHODH inhibitor-enzyme interactions, and may be useful for further modification and improvement of inhibitors of this important enzyme.


Introduction
The dihydroorotate dehydrogenase (DHODH) ia an essential mitochondrial enzyme that catalyzes the flavin mononucleotide-dependent formation of orotic acid, a key step in de novo pyrimidine biosynthesis [1,2]. This enzyme is an attractive chemotherapeutic target in various pathogens, such as Plasmodium falciparum and Helicobacter pylori, and for the treatment of human disease, such as cancer, malaria and rheumatoid arthritis [3][4][5].

OPEN ACCESS
All potent inhibitors of DHODH published to date bind to the putative ubiquinone binding channel and display beneficial immunosuppressive and antiproliferative activities, shown to be most pronounced during T-cell proliferation [6]. Brequinar and leflunomide are two examples of low-molecular weight inhibitors of DHODH that have been in clinical development [7][8][9]. Leflunomide is now marketed as a treatment for rheumatoid arthritis. A series of analogues of the active metabolite of an immunosuppressive agent leflunomide have also been synthesized and found to inhibit DHODH [10].
Quantitative structure-activity relationships are the most important applications of chemometrics, giving useful information for the design of new compounds acting on a specific target. Quantitative structure-activity relationship (QSAR) attempts to find a consistent relationship between biological activity and molecular properties. Thus, QSAR models can be used to predict the activity of new compounds. Although there has been much interest in synthesis of various inhibitors of DHODH, there have been few QSAR studies of DHODH inhibitors [10][11][12][13]. Kuo [10] and Ren [11] have even reported the structure-activity relationships (SAR) and quantitative structure-activity relationship (2D-QSAR) of this series of analogues, respectively.
The self-organizing molecular field analysis (SOMFA) [14] is a simple 3D-QSAR technique, which has been developed by Robinson et al. The method has similarities to both comparative molecular field analysis (CoMFA) [15] and molecular similarity studies. Like CoMFA, a grid-based approach is used; however, SOMFA only uses steric and electrostatic maps, which are related to interaction energy maps, no probe interaction energies need to be evaluated. The weighting procedure of the grid points by Mean-Centered-activity is an important ingredient of the SOMFA procedure. Like the similarity methods, it is the intrinsic molecular properties, such as the molecular shape and electrostatic potential, which are used to develop the QSAR models.
A SOMFA model could suggest a method of tackling the all-important alignment, which all 3D-QSAR methods have faced. The inherent simplicity of this method allows the possibility of aligning the training compounds as an integral part of the model derivation process and of aligning prediction compounds to optimize their predicted activities.
In a recent study, leflunomide has been found to exhibit some dose-dependent side effects in a small number of patients [16]. The purpose of this paper is to describe the application of self-organizing molecular field analysis, SOMFA, on the analogues of the active metabolite of leflunomide, to analyze the three-dimensional quantitative structure-activity relationships (3D-QSAR) and to determine the structural requirements of this series of analogues for optimum activity. The 3D-QSAR together with the modeling studies will provide a more precise elucidation of the molecular forces involved in the DHODH inhibitor-enzyme interactions, and may be useful for further modification and improvement of inhibitors of this important enzyme.

Data Sets
The biological activities of analogues of the active metabolite of leflunomide were taken from the papers by Kuo et al. [10]. Not every compound from Kuo's paper was included in the 3D-QSAR analysis because of the lack of parameters (6 compounds) and the exact IC 50 values (IC 50  10 5 nM for 6 compounds). All analogues were classified into two subgroups according to the substituents at their two different positions, 42 aromatic substituted analogues and 12 side chain 3-substituted analogues.
Fifty-three analogues are divided into two sets. The training set of 42 molecules with structures are shown in Table 1 and their enzyme inhibitory activities expressed as log(1/IC 50 ) are shown in Section 3. The predictive power of the models is evaluated using a test set of 11 molecules whose structures are also shown in Table 1 and activities will be shown in Section 3. Two sets of molecules are selected in order to elucidate convenient models for the predictive discrimination between these various activities.

Molecular Modeling and Alignment
The three-dimensional structures of the analogues were constructed with the ArgusLab 4.01 [17] according to the conformations of active metabolite A771726 (Compound 43) from PDB 1d3h [18], running on an AMD Athlon 64 X2 Dual Core Processor 3600 + CPU/Microsoft Win XP platform.
Unless otherwise indicated, parameters are default. Full geometry optimizations are performed first by molecular mechanics MM2, and then optimized by PM3 semi-empirical method in the ArgusLab software. The final active conformations are then performed RMS overlapping and fitted with the compound 43 as a reference. Two different alignments are selected to define overlap. The atom numbers and corresponding sequence for each alignment are defined in Table 2. According to two alignment of the final active conformations of analogues, these compounds are then performed SOMFA analysis. The superposition of active analogues structures according to alignment 1 (considering phenyl ring similarity) has been shown in Figure 1, the superposition of active analogues according to the alignment 2 (considering all skeleton similarity) has also been shown in Figure 2. Using VEGA software [19], the final overlayed geometries are converted into CSSR file format, the only file format which the SOMFA2 program can accept to process a SOMFA analysis.

SOMFA 3D-QSAR Models
In the SOMFA study a 40 × 40 × 40 Å grid originating at (−20, −20, −20) with a resolution of 0.5 Å, is generated around the aligned compounds, and all compounds have been assigned charges by the MNDO hamiltonian semi-empirical method according to our previous works [20,21]. 12 different models using different enzyme, compound subgroups and alignment under exploration are presented in Table 3.
For all of the studied compounds, shape and electrostatic potential are generated. To sum up the predictive power of these two properties into one final model, we combine their individual predictions using a weighted average of the shape and electrostatic potential based QSAR, using a mixing coefficient (c 1 ) as illustrated in Equation 1 [14]. Activity = c 1 Activity shape + (1 − c 1 )Activity ESP (1) Clearly, multiproperty predictions could have been obtained through multiple linear regression.
Using Equation 1 instead gives greater insight into the resultant model by allowing the study of the variation in predictive power with different values of c 1 .
With the highest value of r 2 , the SOMFA models then are derived by the partial least squares (PLS), implemented in SPSS software [22] with cross-validation.
The predictive ability of the model is quantitated in terms of r CV 2 which is defined in Equation 2.
where PRESS = (Y pred − Y actual ) and SD = (Y actual − Y mean ). SD is the sum of squares of the difference between the observed values and their meaning and PRESS is the prediction error sum of squares. The final models are constructed by a conventional regression analysis with the optimum value of mixing coefficient (c 1 ) equal to that yielding the highest r 2 and r CV 2 value according to Equation 2.

Results and Discussion
SOMFA, a novel 3D-QSAR methodology, is employed for the analysis with the training set composed of 42 various compounds, from which biological activities are known. Statistical results of 12 SOMFA models are summarized in Table 3. A cross-validated value r CV 2 which is obtained as a result of PLS analysis serves as a quantitative measure of the predictability of the SOMFA model. We find that the quality of the QSAR model was dependent upon the alignment and number of molecules. The model overlayed using alignment 2 shows higher r CV 2 values than using the model of alignment 1, and the model of subgroups shows higher r CV 2 values than the model of all analogues.
Among the models tested from all analogues, good cross-validated correlation coefficient r CV 2 values (0.664), moderate non-cross-validated correlation coefficient r 2 values (0.687) proves a good conventional statistical correlation which have been obtained, and we also find that the resultant SOMFA model have a satisfying predictive ability.
The observed and predicted activities of the training set are reported in Table 4. Figure 3 shows a satisfying linear correlation and moderate difference between observed and predicted values of molecules in the training set.   It is well known that the best way to validate a 3D-QSAR model is to predict biological activities for the compounds forming the test set. The SOMFA analysis of the test set composed of 11 compounds is reported in Table 5. Most of the compounds in the test set show satisfying correlation between observed and predicted values in Figure 4. We find that two compounds of test set (compound 10 and 15) always have large residuals, and could be classified as outliers. This is true for both rat and mouse DHODH models, there may be more complicated structure-activity relationships in these two compounds. The statistical parameters r pred 2 of test compounds excluding compound 10 and 15 are also summarized in Table 3; all the models performed well (r pred 2 > 0.5) in the activity prediction of most test compounds. SOMFA calculation for both shape and electrostatic potentials are performed, then combined to get an optimal coefficient c 1 = 0.766 according to Equation 1. The master grid maps derived from the best model is used to display the contribution of electrostatic potential and shape molecular field. The master grid maps give a direct visual indication of which parts of the compounds differentiate the activities of compounds in the training set under study. The master grid also offers an interpretation as to how to design and synthesize some novel compounds with much higher activities. The visualization of the potential master grid and shape master grid of the best SOMFA model is showed in Figure 5 and Figure 6 respectively, with compound 43 as the reference.   Each master grid map is colored in two different colors for favorable and unfavorable effects. In other words, the electrostatic features are red (more positive charge increases activity, or more negative charge decreases activity) and blue (more negative charge increases activity, or more positive charge decreases activity), and the shape feature are red (more steric bulk increases activity) and blue (more steric bulk decreases activity), respectively.
It can be seen from Figure 5 and Figure 6 that the electrostatic potential and shape master grid for Rat DHODH are very similar to that for Mouse DHODH. Because Rat DHODH have structural similarities to Mouse DHODH, so active analogues have the same or a similar 3D-QSAR to them. SOMFA analysis result indicates the electrostatic contribution is of a low importance (c 1 = 0.766). In the map of electrostatic potential master grid, we find a high density of blue points around the substituent R 1 at the phenyl ring, which means some electronegative groups are favorable. Meanwhile, the SOMFA shape potential for the analysis is presented as master grid in Figure 6. In this map of important features, we can find a high density of red points around the substituent R 1 and R 2 at the phenyl ring, which means a favorable steric interaction; simultaneously, we also find a high density of blue points outside substituent R at the 3-substituted side chain, where an unfavorable steric interaction may be expected to enhance activities. Generally, the medium-sized electronegative potential substituent R 1 and R 2 (benzene ring with electron-withdrawing groups, pyridine ring, for example) at the phenyl ring increases the activity, the small-sized substituent R (methyl, ethyl, for example) at the 3-substituted side chain increases the activity.
All analyses of SOMFA models may provide some useful information in the design of new active metabolite analogues of leflunomide.

Conclusions
We have developed predictive SOMFA 3D-QSAR models for analogues of the active metabolite of Leflunomide as anti-inflammatory drugs. The master grid obtained for the various SOMFA models' electrostatic and shape potential contributions can be mapped back onto structural features relating to the trends in activities of the molecules. On the basis of the spatial arrangement of the various electrostatic and shape potential contributions, novel molecules are being designed with improved activity.