Prediction of cytotoxicity data (CC50) of anti-HIV 5-phenyl-1-phenylamino-1H-imidazole derivatives by artificial neural network trained with Levenberg–Marquardt algorithm
Introduction
Acquired immunodeficiency syndrome (AIDS), which is caused by the human immunodeficiency virus type 1 (HIV-1), has become a major worldwide pandemic [1], [2]. Three million people had died from AIDS and 40 million were living with HIV-1 or AIDS at the end of 2003 [3]. From the beginning of anti-HIV-1 chemotherapy development, HIV-1 reverse transcriptase (RT) has been one of the main targets. Anti-AIDS drugs fall into three categories: the nucleoside reverse transcriptase inhibitors (NRTIs), which act as chain terminators to block elongation of the HIV-1 viral DNA strand; the non-nucleoside reverse transcriptase inhibitors (NNRTIs), which directly inhibit the RT enzyme by binding to an allosteric site near the polymerase active site; and the protease inhibitors (PIs) [4], [5], [6], [7]. Highly active antiretroviral therapy (HAART) regimens, which are based on triple or quadruple combinations of NRTIs, NNRTIs and PIs, reduce HIV to very low levels but are unable to eradicate the infection, and long-term therapy leads to the emergence of drug-resistant mutant strains [8]. Thus, new anti-HIV-1 agents with superior efficacy and safety profiles are strongly desired. The activity of drug-like chemicals can be conveniently assayed in cell culture. Once a well-designed subset of chemicals has been tested, one can develop quantitative structure–activity relationship (QSAR) models to understand the structural basis of biological activity and to predict the potential activity of untested chemicals of the same class.
The QSAR approach has become very useful in the prediction of physical and chemical activities and properties [9]. It is based on the assumption that variation in the behavior of compounds, as expressed by any measured physical or chemical activity or property, can be correlated with changes in molecular features of the compounds, termed descriptors [10]. The main steps involved in QSAR are data collection, molecular geometry optimization, molecular descriptor generation, descriptor selection, model development and, finally, model performance evaluation [11]. QSAR models can be formulated either on experimentally derived descriptors or on parameters that can be computed from molecular structure without the input of experimental data. Whereas experimentally based QSARs work well for narrow classes of chemicals, the experimental parameters they require are not available for diverse groups of chemicals. Another advantage of theoretically calculated descriptors is that they are available for any molecule, real or hypothetical. Therefore, theoretically calculated descriptors can be used in the evaluation of compounds not yet synthesized.
One of the important steps in QSAR studies is model building, for which there are several major approaches. One is the use of multivariate mathematical–statistical methods such as multiple linear regression (MLR) [12], [13], [14], [15] and partial least squares (PLS) projection to latent structures [16], [17], [18], [19]. These are linear modeling approaches, developed to extract the maximum information from complex data matrices on the basis of their linear behavior. The other approach is the use of artificial neural networks (ANNs), which offer attractive possibilities for non-linear modeling and optimization when the underlying mechanisms are very complex. ANNs are computational simulations of biological networks. An ANN consists of many pathways and nodes organized into a sequence of layers. The first layer is an input layer with one node for each variable or feature of the data. The last layer is an output layer with one node for each variable to be investigated. In between, there are one or more hidden layers, each consisting of a number of nodes, which are responsible for learning. Nodes of one layer are connected to the nodes of the succeeding layer, and each connection is represented by a number called a weight. Initially, a learning phase is defined in which each input parameter is applied to a processing element, and the connection weights are adjusted until the network output agrees with the known response to within an acceptable error. The trained network can then be applied to unknowns.
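The layered structure and learning phase described above can be sketched in a few lines. The following is only an illustrative sketch, not the network used in this work: it assumes a single input node, two tanh hidden nodes, one linear output node, a finite-difference Jacobian, a simple tenfold damping schedule for the Levenberg–Marquardt update, and a made-up toy data set. All function names (`predict`, `train_lm`, and so on) are hypothetical.

```python
import math
import random

def predict(w, x, n_hidden):
    """One input node -> n_hidden tanh nodes -> one linear output node."""
    # w = [input->hidden weights | hidden biases | hidden->output weights | output bias]
    out = w[3 * n_hidden]
    for j in range(n_hidden):
        out += w[2 * n_hidden + j] * math.tanh(w[j] * x + w[n_hidden + j])
    return out

def residuals(w, xs, ys, n_hidden):
    return [predict(w, x, n_hidden) - y for x, y in zip(xs, ys)]

def sse(r):
    return sum(v * v for v in r)

def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def jacobian(w, xs, ys, n_hidden, eps=1e-6):
    """Forward-difference Jacobian; cols[k][i] approximates d r_i / d w_k."""
    base = residuals(w, xs, ys, n_hidden)
    cols = []
    for k in range(len(w)):
        w2 = w[:]
        w2[k] += eps
        pert = residuals(w2, xs, ys, n_hidden)
        cols.append([(p - b) / eps for p, b in zip(pert, base)])
    return base, cols

def train_lm(xs, ys, n_hidden=2, n_iter=50, mu=0.01, seed=0):
    """Levenberg-Marquardt loop: solve (J^T J + mu I) step = -J^T r."""
    rng = random.Random(seed)
    w = [rng.uniform(-0.5, 0.5) for _ in range(3 * n_hidden + 1)]
    history = [sse(residuals(w, xs, ys, n_hidden))]
    n = len(w)
    for _ in range(n_iter):
        r, cols = jacobian(w, xs, ys, n_hidden)
        jtj = [[sum(u * v for u, v in zip(cols[i], cols[j])) for j in range(n)]
               for i in range(n)]
        jtr = [sum(u * v for u, v in zip(cols[i], r)) for i in range(n)]
        damped = [[jtj[i][j] + (mu if i == j else 0.0) for j in range(n)]
                  for i in range(n)]
        step = solve(damped, [-g for g in jtr])
        trial = [wi + si for wi, si in zip(w, step)]
        trial_sse = sse(residuals(trial, xs, ys, n_hidden))
        if trial_sse < history[-1]:
            # Accept the step and move towards Gauss-Newton behaviour.
            w, mu = trial, max(mu / 10.0, 1e-9)
            history.append(trial_sse)
        else:
            # Reject the step and move towards gradient descent.
            mu *= 10.0
    return w, history

# Toy data (hypothetical, not from the paper): a smooth non-linear response.
xs = [i / 10.0 for i in range(-10, 11)]
ys = [math.sin(2.0 * x) for x in xs]
weights, history = train_lm(xs, ys)
print("initial SSE:", history[0], "final SSE:", history[-1])
```

Because a step is accepted only when it lowers the sum of squared errors, the recorded error history decreases monotonically; a real QSAR network would take one input node per descriptor and would also need a separate test set to check for overfitting.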
ANNs have been widely applied to QSAR studies as a powerful non-linear modeling technique. Applications of ANNs to QSAR studies of the anti-HIV activity of novel compounds include: the study of inhibition of HIV replication (IC90) for 55 cyclic urea derivatives [20]; prediction of anti-HIV activity for a set of 1-[(2-hydroxyethoxy)methyl]-6-(phenylthio)thymine (HEPT) derivatives [21]; QSAR analysis for a set of 4,5,6,7-tetrahydro-5-methylimidazo[4,5,1-jk][1,4]benzodiazepin-2(1H)-one (TIBO) derivatives [22]; prediction of anti-HIV activity for a set of 107 HIV-1 reverse transcriptase inhibitors derived from 1-[(2-hydroxyethoxy)methyl]-6-(phenylthio)thymine [23], [24], [25], [26]; and prediction of the anti-HIV-1 activities of 20 tetrapyrrole derivatives [27]. Some evidence shows that ANN modeling gives better statistical results than linear modeling approaches in QSAR studies, in both fitting and prediction [22], [23], [24], [25].
To the best of our knowledge, there are no reports on the use of ANNs in QSAR studies of 5-phenyl-1-phenylamino-1H-imidazole derivatives. The aim of the current work is therefore to apply an ANN to the structure–anti-HIV-1 activity relationship of 5-phenyl-1-phenylamino-1H-imidazole derivatives. The results obtained by ANN will be statistically compared with those given by multiple linear regression (MLR).
Section snippets
Data set
The data used in this QSAR study consisted of cytotoxicity data (CC50), the concentration that reduces MT-4 cell viability by 50%, for 42 derivatives of 5-phenyl-1-phenylamino-1H-imidazole reported by Lagoja et al. [28]. The activity data [CC50 (μM)] for the 5-phenyl-1-phenylamino-1H-imidazole derivatives (Fig. 1 and Table 1) were converted to the logarithmic scale [−log CC50 (μM)] and then used as the response variable in the subsequent QSAR analyses.
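The conversion of the response variable can be illustrated with a short sketch. It assumes a base-10 logarithm applied directly to the CC50 value expressed in μM; the function name `neg_log_cc50` and the example concentrations are hypothetical and are not taken from Table 1.

```python
import math

def neg_log_cc50(cc50_um: float) -> float:
    """Return -log10 of a CC50 value given in micromolar units (assumed base 10)."""
    if cc50_um <= 0:
        raise ValueError("CC50 must be a positive concentration")
    return -math.log10(cc50_um)

# Hypothetical CC50 values in uM, for illustration only.
example_cc50 = [1.0, 10.0, 250.0]
responses = [neg_log_cc50(v) for v in example_cc50]
print(responses)
```

On this scale a more cytotoxic compound (lower CC50) receives a larger response value, and a tenfold change in concentration corresponds to a change of one unit.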
Calculations
The two-dimensional structures of
Selected descriptors
Descriptor selection was carried out according to the steps described in Section 2.3. Stepwise regression showed that only 44 of the calculated descriptors have a significant relationship with the cytotoxicity data of the anti-HIV 5-phenyl-1-phenylamino-1H-imidazole derivatives. Of these, 28 descriptors were used as the most feasible descriptors in ANN modeling. A full list of the 28 selected descriptors and their chemical meaning is given in Table 2.
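The idea behind regression-based descriptor selection can be sketched as follows. This is a deliberately simplified, forward-only variant and not the procedure of Section 2.3: a descriptor is added when it lowers the residual sum of squares of an ordinary least-squares fit by at least a chosen fraction (`min_gain`, an assumed threshold), whereas a full stepwise regression would normally use F-to-enter and F-to-remove tests. The descriptor names `d1` to `d3` and all data values are hypothetical.

```python
def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_sse(columns, y):
    """Residual sum of squares of a least-squares fit with an intercept."""
    n = len(y)
    design = [[1.0] * n] + columns  # intercept column plus descriptor columns
    p = len(design)
    ata = [[sum(u * v for u, v in zip(design[i], design[j])) for j in range(p)]
           for i in range(p)]
    aty = [sum(u * v for u, v in zip(design[i], y)) for i in range(p)]
    try:
        beta = solve(ata, aty)
    except ZeroDivisionError:
        return float("inf")  # exactly collinear descriptors: treat as unusable
    pred = [sum(beta[k] * design[k][i] for k in range(p)) for i in range(n)]
    return sum((yi - pi) ** 2 for yi, pi in zip(y, pred))

def forward_select(descriptors, y, min_gain=0.05, tol=1e-9):
    """Greedily add the descriptor that most reduces the residual sum of squares."""
    selected = []
    current = fit_sse([], y)
    while current > tol:
        best_name, best_sse = None, current
        for name, col in descriptors.items():
            if name in selected:
                continue
            trial = fit_sse([descriptors[s] for s in selected] + [col], y)
            if trial < best_sse:
                best_name, best_sse = name, trial
        if best_name is None or (current - best_sse) / current < min_gain:
            break
        selected.append(best_name)
        current = best_sse
    return selected

# Hypothetical toy data: the response depends on d1 and d3 but not on d2.
descriptors = {
    "d1": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    "d2": [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0],
    "d3": [1.0, 1.0, -1.0, -1.0, 1.0, 1.0, -1.0, -1.0],
}
noise = [0.1, -0.1, -0.1, 0.1, -0.1, 0.1, 0.1, -0.1]  # small deterministic perturbation
y = [2.0 * a + 3.0 * c + e
     for a, c, e in zip(descriptors["d1"], descriptors["d3"], noise)]
selected = forward_select(descriptors, y)
print(selected)
```

In a study with hundreds of calculated descriptors and only 42 compounds, a selection step of this kind mainly serves to keep the number of network inputs small relative to the number of observations.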
Conclusion
A neural network trained with the Levenberg–Marquardt algorithm was applied to analyze the QSAR of 5-phenyl-1-phenylamino-1H-imidazole compounds. To the best of our knowledge, this work is the first report on the use of an ANN combined with the Levenberg–Marquardt algorithm in QSAR studies. The results obtained show that this ANN model was able to establish a satisfactory relationship between the molecular descriptors and the anti-HIV activity. The ANN approach would seem to have great potential for determining
Acknowledgement
The authors are thankful to the Shahrood University of Technology Research Council for the support of this work.
References (50)
Radial basis function neural network-based QSPR for the prediction of critical temperature
Chemom. Intell. Lab. Syst.
(2002)
Radial basis function network-based quantitative structure–property relationship for the prediction of Henry's law constant
Anal. Chim. Acta
(2002)
Synthesis, cytotoxicity, QSAR, and intercalation study of new diindenopyridine derivatives
Bioorg. Med. Chem.
(2004)
Topochemical model for prediction of anti-HIV activity of HEPT analogs
Bioorg. Med. Chem. Lett.
(2005)
QSAR studies on some thiophene analogs as anti-inflammatory agents: enhancement of activity by electronic parameters and its utilization for chemical lead optimization
Bioorg. Med. Chem.
(2005)
QSAR by LFER model of HIV protease inhibitor mannitol derivatives using FA-MLR, PCRA, and PLS techniques
Bioorg. Med. Chem.
(2006)
QSAR analysis of substituted bis[(acridine-4-carboxamide)propyl]methylamines using optimized block-wise variable combination by particle swarm optimization for partial least squares modeling
Eur. J. Pharmaceut. Sci.
(2005)
QSAR model of the phototoxicity of polycyclic aromatic hydrocarbons
J. Mol. Struct. THEOCHEM
(2005)
Modeling of activity of cyclic urea HIV-1 protease inhibitors using regularized-artificial neural networks
Bioorg. Med. Chem.
(2006)
Evolutionary optimization, backpropagation, and data preparation issues in QSAR modeling of HIV inhibition by HEPT derivatives
Biosystems
(2003)
Analysis of shape transitions using molecular size descriptors associated with inner and outer regions of a polymer chain
J. Mol. Struct. THEOCHEM
The prediction of the 3D structure of organic molecules from their infrared spectra
J. Vib. Spectrosc.
Training feedforward networks with the Marquardt algorithm
Artificial neural network based generalized storage–yield–reliability models using the Levenberg–Marquardt algorithm
J. Hydrol.
Extraction of a quantitative reaction mechanism from linear sweep voltammograms obtained on a rotating disk electrode. Part I: theory and validation
J. Electroanal. Chem.
QSAR by LFER model of cytotoxicity data of anti-HIV 5-phenyl-1-phenylamino-1H-imidazole derivatives using principal component factor analysis and genetic function approximation
Bioorg. Med. Chem.
Pneumocystis carinii pneumonia and mucosal candidiasis in previously healthy homosexual men: evidence of a new acquired cellular immunodeficiency
N. Engl. J. Med.
Isolation of a T-lymphotropic retrovirus from a patient at risk for acquired immune deficiency syndrome (AIDS)
Science
Toward improved anti-HIV chemotherapy: therapeutic strategies for intervention with HIV infections
J. Med. Chem.
Non-nucleoside anti-HIV-1 reverse transcriptase inhibitors (NNRTIs): a chemical survey from lead compounds to selected drugs for clinical trials
Farmaco
Highlights in the development of new antiviral agents
Mini-Rev. Med. Chem.
Insights into HIV chemotherapy
AIDS Res. Hum. Retroviruses
Anti-human immunodeficiency virus drug combination strategies
Antivir. Chem. Chemother.
Toward an optimal procedure for variable selection and QSAR model building
J. Chem. Inf. Comput. Sci.
Synthesis and anti-HIV-1 activity of 4,5,6,7-tetrahydro-5-methylimidazo[4,5,1-jk][1,4]benzodiazepin-2(1H)-one (TIBO) derivatives
J. Med. Chem.
Cited by (44)
Application of artificial neural networks to the prediction of antifungal activity of imidazole derivatives against Candida albicans
2022, Chemometrics and Intelligent Laboratory Systems
Citation Excerpt: A number of studies have been described in the literature, in which ANN analysis was used to model the structure-activity relationship, classify compounds, or identify compounds as potential candidates for therapeutic substances [35–38]. Chamjangali et al. [39] research, trained an artificial neural network to predict the quantitative structure-activity relationship (QSAR) to model cytotoxicity data for imidazole derivatives in the search for anti-HIV drugs. Furthermore, in the study by Wnuk et al. [40], the activity of imidazole derivatives against Streptococcus pyogenes was predicted using ANN.
Shallow neural networks and classification methods for approximating the subsurface in situ fluid-filled pore size distribution
2019, Machine Learning for Subsurface Characterization
Molecular docking and 4D-QSAR studies of metastatic cancer inhibitor thiazoles
2018, Computational Biology and Chemistry
Citation Excerpt: The LM method is a standard technique used to solve nonlinear least squares problems. In the last QSAR studies, the LM algorithm was accepted as an efficient algorithm (Chamjangali, 2009; Chamjangali et al., 2007). The goal of NLLS optimization approach based on LM is to estimate the unknown coefficients of a receptor according to the optimal descriptor, which minimizes the sum of squares error.
Combining radial basis function neural network with genetic algorithm to QSPR modeling of adsorption on multi-walled carbon nanotubes surface
2015, Journal of Molecular Structure
Citation Excerpt: In this case, linear methods have some limitations and give poor statistical results. Flexible nonlinear artificial intelligent based algorithms, such as artificial neural network (ANN) were used to perform nonlinear mapping of the physicochemical descriptors to the corresponding property and using ANN the predictions precision was improved [4,5]. RBFN is one of the most popular neural network models and is a powerful nonlinear regression technique, which has an input, a hidden and one output layer [6–8].
3D-MoRSE descriptors explained
2014, Journal of Molecular Graphics and Modelling
Citation Excerpt: 3D-MoRSE denotes 3D molecular representations of structure based on electron diffraction descriptors and has been introduced in 1996 by Schuur, Gasteiger and coauthors in two seminal papers [10,11]. These descriptors have found a broad application and have been shown as predominant in a number of QSAR/QSPR studies [12–22]. The majority of these papers describe the effect of 3D-MoRSE values on activity but lack of interpretation how the values of used 3D-MoRSE descriptors relate to the molecular structure.