QSAR study of diethyl p-nitrophenyl phosphate derivatives for paraoxonase 1

Recent biomedical engineering literature contains many studies on the activation of human paraoxonase 1 (PON1) to reduce high concentrations of homocysteine in human serum. In this work, a set of diethyl p-nitrophenyl phosphate derivatives was studied at the B3LYP/6-311G(d,p), B3LYP/6-311++G(2d,2p) and B3LYP/6-311++G(3df,3pd) levels of theory, examining the relationship between their molecular and electronic structure and paraoxonase 1 activity. To shed light on which descriptors may contribute to the activation of PON1, this paper investigates the relationship between the activity of the enzyme and the descriptors of 15 molecules, namely HOMO, LUMO, energy gap, hardness, softness, electronegativity, chemical potential, electrophilicity index (ω), nucleofugality and electrofugality. Two data analysis methods, CART Decision Tree and Artificial Neural Networks (ANNs), were used for the linear part and non-linear part of the data set, respectively. The results show correlations between the activity of the enzyme and the studied descriptors. Moreover, the CART method identifies not only the significant descriptors but also their critical values and order.


Introduction
Paraoxonase is a group of enzymes that catalyze the hydrolysis of organophosphates and lactones. This group comprises three genotypic forms, coded as the PON set, which are located on the long arm of human chromosome 7 [1][2][3]. The three types of paraoxonase are: PON1, which is synthesized in the liver and functions as an antioxidant; PON2, which is expressed as an intracellular protein that can protect cells against oxidative damage; and PON3, which is similar to PON1 in activity but differs from it in substrate specificity [4,5]. Both PON1 and PON3 have clinical significance in that they are involved in lowering the risk of developing atherosclerosis and coronary artery disease [2,6]. PON1, or serum paraoxonase, also known as serum aryldialkylphosphatase 1 and homocysteine thiolactonase, is an enzyme found in many mammalian species, including humans, and is encoded by the PON1 gene. Although PON1 hydrolyzes aromatic carboxylic acid esters, toxic organophosphate compounds, and lactones, its natural substrates and physiological function(s) are far from settled [3][4][5]. Human PON1 is a glycoprotein composed of 354 amino acids that associates with high-density lipoprotein (HDL, a cholesterol carrier in the circulation) [1,3,4,7,8]. Serum PON1 is secreted mainly by the liver, although local synthesis occurs in several tissues. The structure contains two calcium ions, which are essential for catalytic activity and enzyme stability [3][4][5][8,9]. PON1 is one of three members of a mammalian enzyme family that also contains PON2 and PON3. These enzymes were originally discovered through their involvement in the hydrolysis of organophosphates. Members of the paraoxonase family exhibit a wide range of physiologically important hydrolytic activities, including drug metabolism and detoxification of nerve agents. Due to the similarity between PON1 and life. Organophosphates are the basis of many insecticides, herbicides, and nerve agents. These compounds are a diverse group of chemicals used in both domestic and industrial settings.
In this study, Gaussian 03 was used to carry out geometry optimization of the 15 diethyl p-nitrophenyl phosphate compounds using density functional theory (DFT). The aim of this study was to shed light theoretically on how the diethyl p-nitrophenyl phosphate derivatives would activate PON1.

Material and method
The descriptors mentioned above were calculated at the B3LYP level of theory using the Gaussian 03 program package [12,13]. The calculations were based on the 6-31G(d,p) basis set. This method has been widely implemented to study the relationship between the corrosion inhibition efficiency of molecules and their electronic properties [13]. In order to establish a correlation between the experimental data and the structural and electronic characteristics of the investigated activators, the geometries of the molecules were optimized by density functional theory (DFT) [13] with Becke's three-parameter exchange functional along with the Lee-Yang-Parr correlation functional (B3LYP) [14].
As the HOMO is often associated with the electron-donating ability of a molecule, a high HOMO energy is likely to indicate a tendency of the molecule to donate electrons to appropriate acceptor molecules with a lower-energy MO [15][16][17]. The HOMO and LUMO orbitals of the 15 molecules were obtained from quantum chemical calculations by DFT using the B3LYP/6-311G(d,p) and B3LYP/6-311++G(2d,2p) basis sets, as shown in Figure 1 below [18,19].
Both the HOMO and the LUMO of the 15 molecules were centered on the phenyl group, which indicates where the reactive orbitals of these molecules are located.
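The global descriptors listed in this study can all be derived from the two frontier orbital energies. The paper does not state which working formulas were used, so the following is only a minimal sketch under common conceptual-DFT assumptions: Koopmans-type estimates I = -E_HOMO and A = -E_LUMO, hardness taken as η = I - A (some authors use (I - A)/2), softness as 1/η, and the nucleofugality and electrofugality expressions (μ ± η)²/(2η). The function and field names are illustrative and do not come from the original work.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    energy_gap: float
    ionization_potential: float
    electron_affinity: float
    electronegativity: float
    chemical_potential: float
    hardness: float
    softness: float
    electrophilicity: float
    nucleofugality: float
    electrofugality: float


def reactivity_descriptors(e_homo: float, e_lumo: float) -> Descriptors:
    """Global reactivity descriptors from frontier orbital energies.

    Assumptions (not stated in the paper): I = -E_HOMO, A = -E_LUMO,
    and hardness eta = I - A. Inputs and outputs share one energy unit.
    """
    ionization = -e_homo              # I, Koopmans-type estimate
    affinity = -e_lumo                # A, Koopmans-type estimate
    chi = (ionization + affinity) / 2.0   # electronegativity
    mu = -chi                             # chemical potential
    eta = ionization - affinity           # hardness (assumed convention)
    if eta <= 0:
        raise ValueError("E_LUMO must lie above E_HOMO")
    omega = mu ** 2 / (2.0 * eta)         # electrophilicity index
    return Descriptors(
        energy_gap=e_lumo - e_homo,
        ionization_potential=ionization,
        electron_affinity=affinity,
        electronegativity=chi,
        chemical_potential=mu,
        hardness=eta,
        softness=1.0 / eta,
        electrophilicity=omega,
        # With eta = I - A these reduce to omega - A and omega + I.
        nucleofugality=(mu + eta) ** 2 / (2.0 * eta),
        electrofugality=(mu - eta) ** 2 / (2.0 * eta),
    )
```

Calling `reactivity_descriptors(-7.0, -2.0)` with these made-up orbital energies gives one record per molecule. Under a different hardness convention the numerical values of η, softness, ω and the two fugalities change accordingly.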
As shown in the above ESP charts, the negative charge is located around O5 of the phosphate group for all molecules under investigation. In addition, negative charges can be observed around O29, O30, O33 and O34 of the two nitro groups attached to the phenyl ring in molecule (1); O30 and O31 in molecule (2); O29 and O30 in molecule (4); O31, O32, O33 and O34 in molecule (5); O30 and O31 in molecule (11); and O31 and O32 in molecule (12). Furthermore, negative charge can be detected around O32 of the aldehyde group that is attached to the phenyl ring, O31 of the acetophenone of molecule (10), and N31 of the cyanophenyl of molecule (13).
With respect to nucleofugality (defined as the propensity of an atom or group of atoms to depart bearing the bonding electron pair in a heterolytic cleavage process [20,21]), the nucleofugality values of the 15 molecules, in descending order, were 89.62, 73.007, 62.583, 58.62, 14.828, 12.303, 11.677, 9.964, 9.263, 9.224, 9.034, 8.549, 5.826, 4.865 and 3.651 for molecules 5, 1, 6, 4, 2, 11, 8, 10, 12, 3, 13, 9, 7, 14 and 15, respectively. According to these results, the molecules that can activate the PON1 enzyme of the aforementioned group of molecules. Table 3 below shows the correlations between the calculated parameters and the activity of the enzyme. Some parameters show only a weak relationship with the enzyme activity measured in the reference (Khersonsky O, Tawfik DS. 2005).
Bond length is the distance between two bonded atoms at their minimum potential energy, or the average distance between two bonded atoms. Bond length correlates with reactivity: the longer the bond, the more reactive it is, whereas a shorter bond is more stable [22]. The longer bonds are those within the phosphate group attached to the phenyl ring, mainly the bonds between the phosphorus atom and the oxygen atoms, (P1-O4: 1.6383 Å), (P1-O2: 1.5868 Å), and (P1-O3: 1.5782 Å), whereas the shorter bonds were (O4-C20: 1.3615 Å), (O2-C6: 1.4592 Å) and (P1-O5: 1.4609 Å) [23]. [Figure 5] These bond lengths were observed across all the molecules, as shown in Figure 6 below.

Statistical analysis
A total of 15 molecules were investigated on the basis of the descriptors HOMO, LUMO, Energy Gap, Hardness, Softness, Electronegativity, Chemical Potential, Electrophilicity Index, Nucleofugality, Hyperpolarizability, Pz, Alpha and Delta, with KCAT as the dependent variable, using CART Decision Tree and Artificial Neural Networks (ANNs) in order to examine which factors may have an impact on KCAT. The data set included both linear and nonlinear calculations. CART Decision Tree was run in SALFORD Predictive Modeler 8.0 to pin down which descriptors have an impact on KCAT for the linear part, while ANNs were used for the same purpose in SPSS 20.0 for the non-linear part. With 5-fold cross-validation, the CART Decision Tree identified Chemical Potential and HOMO as statistically significant variables, with a coefficient of determination of 0.544.
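To illustrate how a CART regression tree arrives at a splitting descriptor and its critical value, the core split search can be sketched as follows. This is not the SALFORD Predictive Modeler implementation, only a simplified illustration of a single CART step: for each descriptor, candidate thresholds are taken midway between consecutive sorted values, and the split that minimizes the summed squared error of KCAT in the two child nodes is kept. All function names are hypothetical.

```python
from typing import Dict, List, Tuple


def sse(values: List[float]) -> float:
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(x: List[float], y: List[float]) -> Tuple[float, float]:
    """Return (threshold, error) of the best 'x <= threshold' split.

    The error is the summed SSE of y in the left and right child nodes.
    If x has no two distinct values, (nan, inf) is returned.
    """
    pairs = sorted(zip(x, y))
    best = (float("nan"), float("inf"))
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no threshold fits between equal values
        threshold = (pairs[i][0] + pairs[i - 1][0]) / 2.0
        left = [yy for xx, yy in pairs if xx <= threshold]
        right = [yy for xx, yy in pairs if xx > threshold]
        error = sse(left) + sse(right)
        if error < best[1]:
            best = (threshold, error)
    return best


def best_descriptor(
    columns: Dict[str, List[float]], y: List[float]
) -> Tuple[str, float, float]:
    """Pick the descriptor whose best split gives the lowest error."""
    best_name, best_thr, best_err = "", float("nan"), float("inf")
    for name, x in columns.items():
        thr, err = best_split(x, y)
        if err < best_err:
            best_name, best_thr, best_err = name, thr, err
    return best_name, best_thr, best_err
```

A full tree applies this search recursively inside each child node, and cross-validation repeats the fit on held-out folds to estimate the coefficient of determination. Both steps are omitted here for brevity.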
The variables with significant results that are in line with the objective of the study are Chemical Potential and HOMO. The model below delineates the output of this relationship and pinpoints these significant variables. The importance level of Chemical Potential is higher, since the separation starts with that descriptor, and its critical value is -10.31. If a molecule has a Chemical Potential less than or equal to -10.31, no significant descriptor other than Chemical Potential is available. By contrast, if the Chemical Potential is greater than -10.31, another significant descriptor, HOMO, adds extra information to the model. This descriptor contributes to the model with a single separating value at -6.90.
As for the non-linear part, ANNs were used to examine which descriptors may have an impact on the dependent variable, KCAT. Because the data set was small, the model was run 100 times and the average was taken. The most significant variables were Pz, Hyperpolarizability, Delta and Alpha, with a coefficient of determination of 0.62 (R2 = 0.62). In sum, the 15 molecules, described by the 14 descriptors and one dependent variable mentioned above, were investigated in order to examine which descriptors would have an impact on the dependent variable. Since the data set contained both linear and nonlinear calculations, two different statistical models, CART Decision Tree and ANNs, were used for the linear part and non-linear part, respectively. The results revealed that Chemical Potential and HOMO were significant descriptors for the linear part. Similarly, Pz, Hyperpolarizability, Delta and Alpha were significant descriptors for the non-linear part.
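The repeated-run averaging used for the ANN on this small data set can be sketched generically. The actual network was fitted in SPSS 20.0 and its settings are not reported, so `run_once` below is a hypothetical callback standing in for one training run that returns a score per descriptor, and `toy_run` produces made-up numbers purely to make the sketch executable.

```python
import random
from typing import Callable, Dict


def average_importance(
    run_once: Callable[[int], Dict[str, float]],
    n_runs: int = 100,
) -> Dict[str, float]:
    """Average per-descriptor scores over n_runs runs seeded 0..n_runs-1."""
    totals: Dict[str, float] = {}
    for seed in range(n_runs):
        for name, score in run_once(seed).items():
            totals[name] = totals.get(name, 0.0) + score
    return {name: total / n_runs for name, total in totals.items()}


def toy_run(seed: int) -> Dict[str, float]:
    """Stand-in for one ANN training run; the scores are made up."""
    rng = random.Random(seed)
    return {
        "Pz": 0.5 + rng.uniform(-0.1, 0.1),
        "Alpha": 0.2 + rng.uniform(-0.1, 0.1),
    }
```

Averaging over many differently seeded runs reduces the influence of any single random weight initialization, which matters when only 15 observations are available.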
To sum up, the ANNs used for the non-linear part show that the descriptors Pz, Hyperpolarizability, Delta and Alpha have an impact on the activity of the enzyme. Similarly, the CART model used to measure the effect of the descriptors on the activity of the enzyme for the linear part reveals that the most significant descriptors are Chemical Potential and HOMO. Most importantly, the results show that the order of the descriptors in the CART model is significant. That is, Chemical Potential, with a value greater than -10.31, was the first splitting variable having an impact on the dependent variable KCAT, and HOMO came second in importance, with a value less than or equal to -6.90. The CART method thus yields not only the significant descriptors but also their critical values and order.
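The two-level rule reported above can be written out as a small function. The thresholds -10.31 and -6.90 are taken from the text. The leaf labels are hypothetical placeholders, because the KCAT values predicted at each terminal node are not given here.

```python
CHEMICAL_POTENTIAL_SPLIT = -10.31  # critical value reported in the text
HOMO_SPLIT = -6.90                 # critical value reported in the text


def cart_leaf(chemical_potential: float, homo: float) -> str:
    """Assign a molecule to a terminal node of the reported CART model.

    Leaf names are placeholders; the paper's node predictions for KCAT
    are not reproduced here.
    """
    # First split: Chemical Potential. At or below the critical value,
    # no further descriptor is used.
    if chemical_potential <= CHEMICAL_POTENTIAL_SPLIT:
        return "leaf_1"
    # Second split: HOMO, used only when Chemical Potential > -10.31.
    if homo <= HOMO_SPLIT:
        return "leaf_2"
    return "leaf_3"
```

For example, a molecule with a Chemical Potential of -9.0 and a HOMO value of -7.0 passes the first split and is then assigned by the HOMO threshold.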

Summary and Conclusion
The major aim of this paper was to examine the descriptors of the 15 aforementioned molecules that may activate PON1. The results of the study reveal a strong relationship between the activity of the enzyme and the studied descriptors. To examine this relationship, two data analysis methods, CART Decision Tree and Artificial Neural Networks (ANNs), were used for the linear part and non-linear part of the data set, respectively. The most significant descriptors for the linear part were Chemical Potential and HOMO, and their order was determined by the CART model. Chemical Potential, with a value greater than -10.31, was the first splitting variable to have an impact on the dependent variable KCAT, and HOMO, with a value less than or equal to -6.90, was the second. The importance of the CART method was that it not only revealed and sorted the significant descriptors but also pinpointed their critical values and order. With 5-fold cross-validation, the CART Decision Tree identified Chemical Potential and HOMO as statistically significant variables, with a coefficient of determination of 0.544. In a similar fashion, the ANNs, employed for the non-linear part to determine which descriptors have an impact on the dependent variable, revealed that Pz, Hyperpolarizability, Delta and Alpha were the most significant contributors, with a coefficient of determination of 0.62 (R2 = 0.62).

Figure 5. The longest and shortest bond lengths
Figure 6.

Table 3. The correlations of the different parameters with the activity