Prediction of gastro-intestinal absorption using multivariate adaptive regression splines

https://doi.org/10.1016/j.jpba.2005.05.034

Abstract

Multivariate adaptive regression splines (MARS) and a derived method two-step MARS (TMARS) were used for modelling the gastro-intestinal absorption of 140 drug-like molecules. The published absorption values for these molecules were used as response variable and calculated molecular descriptors as potential explanatory variables. Both methods were compared and their potential use in quantitative structure–activity relationship (QSAR) context evaluated.

The predictive abilities of the models were studied using different sequences of Monte Carlo cross validation (MCCV). It was shown that both types of models had good predictive abilities and that for the data used, MARS gave better results than TMARS. It could be concluded that both methods could be valuable for QSAR modelling.

Introduction

High-throughput screening has become very important in drug discovery. Since most potentially useful new molecules fail in a later phase of drug development because of unfavourable absorption, distribution, metabolism, elimination and toxicity (ADME-Tox) properties, screening methods for these properties are necessary in the first stages of drug development. In silico screening can be very useful, since it allows screening for ADME-Tox and other properties before the molecules are even synthesized. In silico methods try to build relationships between known values of the property of interest and calculated theoretical and/or experimental parameters or descriptors. These kinds of relationships are called quantitative structure–activity relationships (QSAR). This paper focuses on the relationships between theoretical descriptors and the gastro-intestinal absorption of drug molecules.

In the literature, different QSAR-models predicting the absorption of molecules can be found. They were built using linear modelling techniques such as multiple linear regression (MLR) [1], principal components regression (PCR) [2] and partial least squares (PLS) regression [2], [3], and more advanced non-linear techniques such as artificial neural networks (ANN) [4] and classification and regression trees (CART) [5]. Two well-known approaches used in screening are the Lipinski rule of five [6] and the linear free energy relationship (LFER) approach of Abraham et al. [7]. A disadvantage of these two methods is that they give only a rough classification of the molecules, allowing the elimination of only a very limited set of molecules.

In this paper, an attempt was made to build models that give a more accurate prediction of the absorption values of drug molecules. Therefore, two techniques, multivariate adaptive regression splines (MARS) and two-step MARS (TMARS), were evaluated. The latter is in fact a combination of MLR and MARS [8]. The MARS technique was introduced by Friedman in 1991 [9] and successfully used in QSAR by Nguyen-Cong et al. [10] and Ren et al. [11], [12] and in quantitative structure retention relationships (QSRR) by Put et al. [13]. TMARS was introduced and applied successfully in the prediction of retention in gas chromatography by Xu et al. [8]. It was reported there that the combined method TMARS significantly improved the predictive ability compared to the individual MLR and MARS models.

In a first step, absorption was modelled using MARS. The models were evaluated for their predictive abilities using Monte Carlo cross validation (MCCV) [14]. In a second step, a TMARS model was built, evaluated and compared to the MARS-models.
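As an illustration of the MCCV idea mentioned above, the following minimal sketch repeatedly splits a dataset at random into a training and a test part and averages the test error. It is not the authors' procedure: the function names (`mccv_rmse`, `fit_ols`, `predict_ols`), the number of splits, the test fraction and the synthetic data are all assumptions made for the example, and ordinary least squares stands in for the MARS and TMARS learners.

```python
import numpy as np

def mccv_rmse(X, y, fit, predict, n_splits=100, test_fraction=0.3, seed=0):
    """Monte Carlo cross validation: average test RMSE over random splits."""
    rng = np.random.default_rng(seed)
    n = len(y)
    n_test = max(1, int(round(test_fraction * n)))
    errors = []
    for _ in range(n_splits):
        perm = rng.permutation(n)              # new random split each round
        test, train = perm[:n_test], perm[n_test:]
        model = fit(X[train], y[train])        # fit on the training part only
        residuals = y[test] - predict(model, X[test])
        errors.append(np.sqrt(np.mean(residuals ** 2)))
    return float(np.mean(errors)), float(np.std(errors))

# Placeholder learner (assumption): ordinary least squares with an intercept.
def fit_ols(X, y):
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_ols(coef, X):
    return np.column_stack([np.ones(len(X)), X]) @ coef

# Synthetic stand-in data (assumption): 140 samples, 3 descriptors.
data_rng = np.random.default_rng(1)
X = data_rng.normal(size=(140, 3))
y = 1.0 + X @ np.array([0.5, -0.2, 0.1]) + data_rng.normal(scale=0.05, size=140)

mean_rmse, sd_rmse = mccv_rmse(X, y, fit_ols, predict_ols)
print(f"MCCV RMSE: {mean_rmse:.3f} +/- {sd_rmse:.3f}")
```

Because the learner is passed in as a pair of functions, the same loop could be reused to compare two modelling methods on identical splits by calling it with the same `seed`.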

Multivariate adaptive regression splines (MARS)

MARS is a local modelling technique that divides the data space into several, possibly overlapping, regions and fits truncated spline functions in each of these regions. Truncated spline functions consist of two segments, i.e. left-sided (Eq. (1)) and right-sided (Eq. (2)) truncated functions, separated from each other by a so-called knot location [9].

\[
b_q^{-}(x-t) = [-(x-t)]_{+}^{q} =
\begin{cases}
(t-x)^{q}, & \text{if } x < t \\
0, & \text{otherwise}
\end{cases}
\tag{1}
\]

\[
b_q^{+}(x-t) = [+(x-t)]_{+}^{q} =
\begin{cases}
(x-t)^{q}, & \text{if } x > t \\
0, & \text{otherwise}
\end{cases}
\tag{2}
\]

where $b_q^{-}(x-t)$ and $b_q^{+}(x-t)$ are the spline functions
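The left-sided and right-sided truncated functions of Eqs. (1) and (2) can be evaluated numerically as in the short sketch below. The function name `truncated_spline` and its `side` argument are hypothetical names chosen for this example; x is the descriptor value, t the knot location and q the order of the spline.

```python
import numpy as np

def truncated_spline(x, t, q=1, side="+"):
    """Evaluate a truncated spline of order q with knot t.

    side="-" gives the left-sided function, (t - x)^q for x < t, else 0.
    side="+" gives the right-sided function, (x - t)^q for x > t, else 0.
    """
    x = np.asarray(x, dtype=float)
    if side == "+":
        d = x - t
    elif side == "-":
        d = t - x
    else:
        raise ValueError("side must be '+' or '-'")
    # Clip before raising to the power so the inactive side is exactly zero.
    return np.where(d > 0, np.maximum(d, 0.0) ** q, 0.0)

x = np.array([1.0, 2.0, 3.0, 4.0])
print(truncated_spline(x, t=2.5, q=1, side="-"))  # non-zero only left of the knot
print(truncated_spline(x, t=2.5, q=1, side="+"))  # non-zero only right of the knot
print(truncated_spline(x, t=2.5, q=2, side="+"))  # second order variant
```

Each function is zero on one side of the knot, which is what lets a model built from them act locally on a region of the data space.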

Data

The data consists of intestinal absorption values for a subset of 140 molecules extracted from a dataset collected by Zhao et al. [1]. For each of the molecules the name and the percentage intestinal absorption (%HIA) are listed in Table 1. These molecules were selected because they show a high diversity in molecular structure and cover the whole absorption range (0–100%) [5].

Three-dimensional structure optimisation

The three-dimensional structures of the molecules were drawn and optimized using the Hyperchem® 6.03 professional

Building MARS-models

The model was built using the Briggsian (base-10) logarithms of the percentage human intestinal absorption (%HIA) of all 140 molecules as response variable. The descriptors were used as explanatory variables. The global MARS-model was built and pruned. The order q of the MARS-model was set to 2, which means that both linear and second order splines can be used during model building. The maximum number of terms Mmax, the stop criterion in building the global MARS-model, was set to 100. Pruning was carried
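To give a feeling for how terms enter a MARS-model during building, the sketch below performs one simplified forward step: for a single descriptor it tries every interior data value as a knot, fits an intercept plus the pair of truncated functions by least squares, and keeps the knot with the lowest residual sum of squares. This is only an illustration in the spirit of Friedman's forward pass [9], under several assumptions: it uses linear splines (q = 1) only, one descriptor, one step, no pruning, and synthetic data; the names `hinge_pair` and `best_knot` are hypothetical and the settings of the paper (q = 2, Mmax = 100) are not reproduced.

```python
import numpy as np

def hinge_pair(x, t):
    """Right-sided and left-sided linear truncated functions at knot t."""
    return np.maximum(x - t, 0.0), np.maximum(t - x, 0.0)

def best_knot(x, y):
    """Return (knot, rss) of the best single pair of truncated functions."""
    best_t, best_rss = None, np.inf
    for t in np.unique(x)[1:-1]:               # interior values as candidate knots
        right, left = hinge_pair(x, t)
        A = np.column_stack([np.ones_like(x), right, left])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        rss = float(np.sum((y - A @ coef) ** 2))
        if rss < best_rss:                     # keep the knot with the lowest RSS
            best_t, best_rss = float(t), rss
    return best_t, best_rss

# Synthetic response (assumption) with a change of slope at x = 4.
x = np.linspace(0.0, 10.0, 41)
y = 2.0 + 0.5 * np.maximum(x - 4.0, 0.0) - 1.5 * np.maximum(4.0 - x, 0.0)

knot, rss = best_knot(x, y)
print(f"selected knot: {knot}, residual sum of squares: {rss:.2e}")
```

A full model-building run would repeat such a search over all descriptors and existing terms until the maximum number of terms is reached, and would then prune the resulting global model.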

Conclusions

Comparison of the MARS and TMARS models shows that the MARS-model describes the dataset better and has a better predictive ability. The lower performance of the TMARS method can be explained by the fact that the TMARS model is based on a linear model. The obtained TMARS model shows high similarity with the linear model. Seven of the selected descriptors (nO, T(SS), Mor08m, Mor16v, HATS8v, C-030 and TIE) correspond to descriptors from the linear model. Only one descriptor (R1e) corresponds

Acknowledgment

This research is financed with a specialization grant from the Institute for the Promotion of Innovation by Science and Technology in Flanders (IWT).

References (22)

  • Y.H. Zhao et al., J. Pharm. Sci. (2001)
  • S. Winiwarter et al., J. Mol. Graph. Model. (2003)
  • S. Agatonovic-Kustrin et al., J. Pharm. Biomed. Anal. (2001)
  • C.A. Lipinski et al., Adv. Drug Deliv. Rev. (2001)
  • M.H. Abraham et al., Drug Discov. Today (2002)
  • Q.S. Xu et al., J. Chromatogr. A (2003)
  • V. Nguyen-Cong et al., Eur. J. Med. Chem. (1996)
  • R. Put et al., J. Chromatogr. A (2004)
  • Q.S. Xu et al., Chemom. Intell. Lab. Syst. (2001)
  • S.K. Poole et al., J. Chromatogr. B (2003)
  • S. Yang et al., J. Chromatogr. A (1996)