Prediction of cytotoxicity data (CC50) of anti-HIV 5-phenyl-1-phenylamino-1H-imidazole derivatives by artificial neural network trained with Levenberg–Marquardt algorithm

https://doi.org/10.1016/j.jmgm.2007.01.005

Abstract

A feed-forward artificial neural network (ANN) trained with the Levenberg–Marquardt algorithm was developed as a quantitative structure–activity relationship (QSAR) model of cytotoxicity data for anti-HIV 5-phenyl-1-phenylamino-1H-imidazole derivatives. A large number of descriptors were calculated with Dragon software, and a subset was selected by stepwise regression as the feature selection technique. The 28 molecular descriptors selected by stepwise regression as the most feasible descriptors were used as inputs for the feed-forward neural network. The neural network architecture and its parameters were optimized. The data were randomly divided into a training set of 31 compounds and a validation set of 11 compounds. The prediction ability of the model was evaluated using the validation data set and the leave-one-out cross-validation method. The root mean square error (RMSE) and mean absolute error for the validation data set were 0.042 and 0.024, respectively. The prediction ability of the ANN model was also statistically compared with the results of a linear free energy related (LFER) model. The results obtained show the validity of the proposed model in the prediction of cytotoxicity data of the corresponding anti-HIV drugs.
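The evaluation quantities named in the abstract can be illustrated with a minimal sketch. The 42-compound total and the 31/11 split come from the text; the random seed, the helper names (`rmse`, `mae`, `random_split`) and the observed/predicted values are hypothetical and are not taken from the paper.

```python
import math
import random

def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def mae(y_true, y_pred):
    """Mean absolute error between observed and predicted values."""
    n = len(y_true)
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n

def random_split(n_compounds, n_train, seed=0):
    """Randomly partition compound indices into training and validation lists."""
    indices = list(range(n_compounds))
    random.Random(seed).shuffle(indices)
    return sorted(indices[:n_train]), sorted(indices[n_train:])

# 42 compounds split into 31 training and 11 validation compounds (sizes from the text).
train_idx, valid_idx = random_split(42, 31)

# Hypothetical observed and predicted responses, for illustration only.
observed = [4.10, 3.95, 4.30, 4.00]
predicted = [4.12, 3.90, 4.33, 4.00]
print(len(train_idx), len(valid_idx))
print(rmse(observed, predicted), mae(observed, predicted))
```

Because RMSE squares each error before averaging, it is never smaller than the mean absolute error on the same data, which is consistent with the ordering of the two values reported for the validation set.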

Introduction

Acquired immunodeficiency syndrome (AIDS), which is caused by the human immunodeficiency virus type 1 (HIV-1), has become a major worldwide pandemic [1], [2]. Three million people had died from AIDS and 40 million were living with HIV-1 or AIDS at the end of 2003 [3]. From the beginning of anti-HIV-1 chemotherapy development, HIV-1 reverse transcriptase (RT) has been one of the main targets. Anti-AIDS drugs fall into three categories: the nucleoside reverse transcriptase inhibitors (NRTIs), which act as chain terminators to block the elongation of the HIV-1 viral DNA strand; the non-nucleoside reverse transcriptase inhibitors (NNRTIs), which directly inhibit the RT enzyme by binding to the allosteric site near the polymerase active site; and the protease inhibitors (PIs) [4], [5], [6], [7]. Highly active antiretroviral therapy (HAART) regimens, which are based on triple or quadruple combinations of NRTIs, NNRTIs and PIs, reduce HIV to very low levels but are unable to eradicate the infection, and long-term therapy leads to the emergence of drug-resistant mutant strains [8]. Thus, new anti-HIV-1 agents with superior efficacy and safety profiles are strongly desired. The activity data for drug-like chemicals can be conveniently assayed using cell culture. Once a well-designed subset of chemicals is tested, one can develop quantitative structure–activity relationship (QSAR) models to understand the structural basis of biological activity and to predict the potential activity of untested chemicals of the same class.

The QSAR approach has become very useful in the prediction of physical and chemical activities and properties [9]. This approach is based on the assumption that variation in the behavior of compounds, as expressed by any measured physical or chemical activities (properties), can be correlated with changes in molecular features of the compounds, termed descriptors [10]. The main steps involved in QSAR include data collection, molecular geometry optimization, molecular descriptor generation, descriptor selection, model development and, finally, model performance evaluation [11]. QSAR models can be formulated from experimentally derived descriptors or from parameters that can be computed from molecular structure without the input of experimental data. Whereas experimentally based QSARs work well with narrow classes of chemicals, experimental parameters are not available for diverse groups of chemicals. Another advantage of theoretically calculated descriptors is that they are available for any molecule, real or hypothetical. Therefore, the latter group of descriptors can be used in the evaluation of compounds not yet synthesized.

One of the important steps involved in QSAR studies is model building. There are several major approaches in QSAR modeling. One is the use of multivariate mathematical–statistical methods such as multiple linear regression (MLR) [12], [13], [14], [15] and partial least squares (PLS) projection to latent structures [16], [17], [18], [19]. These methods are linear modeling approaches and have been developed to extract the maximum information from complex data matrices based on their linear behavior. The other approach is the use of artificial neural networks (ANNs), which offer attractive possibilities for non-linear modeling and optimization when the underlying mechanisms are very complex. ANNs are computational simulations of biological networks. An ANN consists of many pathways and nodes organized into a sequence of layers. The first layer is an input layer with one node for each variable or feature of the data. The last layer is an output layer consisting of one node for each variable to be investigated. In between, there are one or more hidden layers, each consisting of a number of nodes, which are responsible for learning. Nodes of one layer are connected to the nodes of the succeeding layer. Each connection is represented by a number called a weight. Initially, a learning phase is defined in which each input parameter is applied to a processing element. The connection weights are adjusted until the network output agrees with the known target values. The system can then be applied to unknowns.
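The layered structure described above can be sketched as a single forward pass. This is a minimal illustration, not the authors' network: the tanh hidden activation, the linear output node, the bias terms and all numerical weights are assumptions chosen for the example.

```python
import math

def forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """One forward pass through a feed-forward network with a single hidden layer."""
    hidden = []
    for weights, bias in zip(hidden_weights, hidden_biases):
        # Each hidden node forms a weighted sum of all inputs, then applies tanh.
        z = sum(w * x for w, x in zip(weights, inputs)) + bias
        hidden.append(math.tanh(z))
    # A single linear output node gives one predicted response per compound.
    return sum(w * h for w, h in zip(output_weights, hidden)) + output_bias

# Hypothetical toy network: 3 input descriptors, 2 hidden nodes, 1 output.
x = [0.5, -1.0, 0.25]
W1 = [[0.1, 0.2, -0.3], [0.0, -0.5, 0.4]]  # one row of weights per hidden node
b1 = [0.0, 0.1]
W2 = [1.0, -1.0]
b2 = 0.2
y = forward(x, W1, b1, W2, b2)
print(y)
```

Training then amounts to adjusting `W1`, `b1`, `W2` and `b2` so that the outputs for the training compounds approach their measured responses.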

ANNs have been widely applied to QSAR studies as a powerful non-linear modeling technique. Some applications of ANNs to QSAR studies of the anti-HIV activity of novel compounds are as follows: a study of the inhibition of HIV replication (IC90) for 55 cyclic urea derivatives [20]; prediction of anti-HIV activity for a set of 1-[(2-hydroxyethoxy)methyl]-6-(phenylthio)thymine (HEPT) derivatives [21]; QSAR analysis for a set of 4,5,6,7-tetrahydro-5-methylimidazo[4,5,1-jk][1,4]benzodiazepin-2(1H)-one (TIBO) derivatives [22]; prediction of anti-HIV activity for a set of 107 inhibitors of the HIV-1 reverse transcriptase derived from 1-[(2-hydroxyethoxy)methyl]-6-(phenylthio)thymine [23], [24], [25], [26]; and prediction of the anti-HIV-1 activities of 20 tetrapyrrole derivatives [27]. Some evidence shows that ANN modeling gives better statistical results, both in fitting and in prediction, than linear modeling approaches in QSAR studies [22], [23], [24], [25].

To the best of our knowledge, there are no reports on the use of ANNs in QSAR studies of the 5-phenyl-1-phenylamino-1H-imidazole derivatives. The aim of the current work is therefore to apply an ANN to the structure–anti-HIV-1 activity relationship of 5-phenyl-1-phenylamino-1H-imidazole derivatives. The results obtained by the ANN will be statistically compared with those given by multiple linear regression (MLR).

Data set

The data used in this QSAR study consisted of cytotoxicity data (CC50), the concentration required to reduce MT-4 cell viability by 50%, for 42 derivatives of 5-phenyl-1-phenylamino-1H-imidazole reported by Lagoja et al. [28]. The activity data [CC50 (μM)] for the 5-phenyl-1-phenylamino-1H-imidazole derivatives (Fig. 1 and Table 1) were converted to the logarithmic scale [−log CC50 (μM)] and then used as the response variables in the subsequent QSAR analyses.
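The conversion of CC50 to the logarithmic response can be written in a few lines. This sketch assumes a base-10 logarithm, which the text does not state explicitly, and the CC50 values below are hypothetical rather than entries from Table 1.

```python
import math

def to_neg_log(cc50_micromolar):
    """Convert a CC50 value in micromolar to the -log10 scale used as the response."""
    if cc50_micromolar <= 0:
        raise ValueError("CC50 must be positive")
    return -math.log10(cc50_micromolar)

# Hypothetical CC50 values in micromolar, not taken from Table 1.
cc50_values = [10.0, 100.0, 0.1]
responses = [to_neg_log(v) for v in cc50_values]
print(responses)
```

On this scale a more cytotoxic compound (lower CC50) receives a larger response value, and a tenfold change in concentration corresponds to a change of one unit.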

Calculations

The two-dimensional structures of

Selected descriptors

Descriptor selection was carried out according to the steps described in Section 2.3. Stepwise regression showed that only 44 of the calculated descriptors have significant relationships with the cytotoxicity data of the anti-HIV 5-phenyl-1-phenylamino-1H-imidazole derivatives. From these selected descriptors, 28 were used as the most feasible descriptors in ANN modeling. A full list of the 28 selected descriptors and their chemical meanings is given in Table 2.
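The idea of selecting descriptors by stepwise regression can be illustrated with a simplified, forward-only variant. This is a sketch under stated assumptions and not the authors' procedure from Section 2.3: it omits the backward removal step of full stepwise regression, the F-to-enter threshold of 4.0 is an arbitrary choice, and the data are synthetic.

```python
import numpy as np

def rss(X, y, columns):
    """Residual sum of squares of a least-squares fit on the given columns plus intercept."""
    A = np.column_stack([np.ones(len(y))] + [X[:, j] for j in columns])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return float(resid @ resid)

def forward_stepwise(X, y, f_to_enter=4.0):
    """Greedy forward selection: add the descriptor with the largest partial F until none qualifies."""
    n, p = X.shape
    selected = []
    current = rss(X, y, selected)
    while len(selected) < p:
        best_j, best_rss = None, current
        for j in range(p):
            if j in selected:
                continue
            candidate = rss(X, y, selected + [j])
            if candidate < best_rss:
                best_j, best_rss = j, candidate
        dof = n - len(selected) - 2  # residual degrees of freedom after adding one term
        if best_j is None or dof <= 0:
            break
        f_stat = (current - best_rss) / (best_rss / dof) if best_rss > 0 else float("inf")
        if f_stat < f_to_enter:
            break
        selected.append(best_j)
        current = best_rss
    return selected

# Synthetic data: 42 "compounds", 6 "descriptors"; only columns 1 and 4 drive the response.
rng = np.random.default_rng(0)
X = rng.normal(size=(42, 6))
y = 2.0 * X[:, 1] - 1.5 * X[:, 4] + 0.05 * rng.normal(size=42)
selected = forward_stepwise(X, y)
print(selected)
```

The two informative columns enter first; whether any noise column also passes the threshold depends on the random draw, which is one reason a selected subset is usually pruned further before modeling.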

Conclusion

A neural network trained with the Levenberg–Marquardt algorithm was applied to analyze the QSAR of 5-phenyl-1-phenylamino-1H-imidazole compounds. To the best of our knowledge, this work is the first report on the use of an ANN combined with the Levenberg–Marquardt algorithm in QSAR studies. The results obtained show that this ANN modeling was able to establish a satisfactory relationship between the molecular descriptors and the anti-HIV activity. The ANN approach would seem to have great potential for determining
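For readers unfamiliar with Levenberg–Marquardt training, the following is a minimal sketch of the idea on a toy problem. Everything specific here is an assumption rather than the authors' setup: the three tanh hidden nodes, the finite-difference Jacobian, the factor-of-10 damping schedule, the function names and the synthetic data.

```python
import numpy as np

def predict(params, X, n_hidden):
    """Single-hidden-layer network: tanh hidden nodes, linear output node."""
    n_in = X.shape[1]
    k = n_in * n_hidden
    W1 = params[:k].reshape(n_in, n_hidden)
    b1 = params[k:k + n_hidden]
    W2 = params[k + n_hidden:k + 2 * n_hidden]
    b2 = params[-1]
    return np.tanh(X @ W1 + b1) @ W2 + b2

def jacobian(params, X, n_hidden, eps=1e-6):
    """Forward-difference Jacobian of the predictions with respect to the weights."""
    base = predict(params, X, n_hidden)
    J = np.empty((len(base), len(params)))
    for i in range(len(params)):
        shifted = params.copy()
        shifted[i] += eps
        J[:, i] = (predict(shifted, X, n_hidden) - base) / eps
    return J

def train_lm(X, y, n_hidden=3, n_iter=100, mu=1e-2, seed=0):
    """Levenberg-Marquardt loop: solve (J^T J + mu I) step = J^T r and adapt mu."""
    rng = np.random.default_rng(seed)
    n_params = X.shape[1] * n_hidden + 2 * n_hidden + 1
    params = rng.normal(scale=0.5, size=n_params)
    r = y - predict(params, X, n_hidden)  # residuals: observed minus predicted
    sse = float(r @ r)
    for _ in range(n_iter):
        J = jacobian(params, X, n_hidden)
        step = np.linalg.solve(J.T @ J + mu * np.eye(n_params), J.T @ r)
        trial = params + step
        r_trial = y - predict(trial, X, n_hidden)
        sse_trial = float(r_trial @ r_trial)
        if sse_trial < sse:
            params, r, sse = trial, r_trial, sse_trial
            mu = max(mu / 10.0, 1e-8)  # step accepted: behave more like Gauss-Newton
        else:
            mu = min(mu * 10.0, 1e12)  # step rejected: behave more like gradient descent
    return params, sse

# Synthetic toy data: 31 samples, 2 inputs, smooth non-linear response.
rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, size=(31, 2))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2
params, sse = train_lm(X, y)
tss = float(((y - y.mean()) ** 2).sum())
print(sse, tss)
```

The damping parameter `mu` is what distinguishes the method from plain Gauss-Newton: large values give short, gradient-like steps and small values give fast second-order-like convergence near a minimum. Because only improving steps are accepted, the training error never increases from one iteration to the next.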

Acknowledgement

The authors are thankful to the Shahrood University of Technology Research Council for the support of this work.

References (50)

  • G.A. Arteca

    Analysis of shape transitions using molecular size descriptors associated with inner and outer regions of a polymer chain

    J. Mol. Struct. THEOCHEM

    (2003)
  • M.C. Hemmer et al.

    The prediction of the 3D structure of organic molecules from their infrared spectra

    J. Vib. Spectrosc.

    (1999)
  • A.J. Adeloye et al.

    Artificial neural network based generalized storage–yield–reliability models using the Levenberg–Marquardt algorithm

    J. Hydrol.

    (2006)
  • E. Tourwe et al.

    Extraction of a quantitative reaction mechanism from linear sweep voltammograms obtained on a rotating disk electrode. Part I: theory and validation

    J. Electroanal. Chem.

    (2006)
  • K. Roy et al.

    QSAR by LFER model of cytotoxicity data of anti-HIV 5-phenyl-1-phenylamino-1H-imidazole derivatives using principal component factor analysis and genetic function approximation

    Bioorg. Med. Chem.

    (2005)
  • M.S. Gottlieb et al.

    Pneumocystis carinii pneumonia and mucosal candidiasis in previously healthy homosexual men: evidence of a new acquired cellular immunodeficiency

    N. Engl. J. Med.

    (1981)
  • F. Barre-Sinoussi et al.

    Isolation of a T-lymphotropic retrovirus from a patient at risk for acquired immune deficiency syndrome (AIDS)

    Science

    (1983)
  • UNAIDS/WHO AIDS Epidemic Update, December 2003, UNAIDS/WHO, Geneva, Switzerland,...
  • E. De Clercq

    Toward improved anti-HIV chemotherapy: therapeutic strategies for intervention with HIV infections

    J. Med. Chem.

    (1995)
  • M. Artico

    Non-nucleoside anti-HIV-1 reverse transcriptase inhibitors (NNRTIs): a chemical survey from lead compounds to selected drugs for clinical trials

    Farmaco

    (1996)
  • E. De Clercq

    Highlights in the development of new antiviral agents

    Mini-Rev. Med. Chem.

    (2002)
  • R.F. Schinazi et al.

    Insights into HIV chemotherapy

    AIDS Res. Hum. Retroviruses

    (1992)
  • A.M. Vandamme et al.

    Anti-human immunodeficiency virus drug combination strategies

    Antivir. Chem. Chemother.

    (1998)
  • A. Yasri et al.

    Toward an optimal procedure for variable selection and QSAR model building

    J. Chem. Inf. Comput. Sci.

    (2001)
  • M. Kukla et al.

    Synthesis and anti-HIV-1 activity of 4,5,6,7-tetrahydro-5-methylimidazo[4,5,1-jk][1,4]benzodiazepin-2(1H)-one (TIBO) derivatives

    J. Med. Chem.

    (1991)