Quantitative Structure-Activity Relationship Analysis of Organophosphate Insecticides Using Electronic and Molecular Parameters

Quantitative Structure-Activity Relationship (QSAR) analysis of organophosphate derivatives, and their insecticide activities, was performed using electronic and molecular parameters. The series of organophosphate derivatives and their activities were obtained from literature. The semi-empirical AM1 method was used to model the structure of organophosphate derivatives and calculate the parameters of QSAR. Multiple linear regression (MLR) analysis was performed on the electronic and molecular parameters as well as the activities of the organophosphate insecticides to derive the QSAR model. The best QSAR equation model was used to design, in silico, new insecticide molecules of organophosphate derivatives with higher insecticidal activity. A new insecticide molecule, 4-(diethoxy phosphoryloxy) benzene sulfonic acid, -Log LD50 = 7.344, had the highest insecticidal activity. Lastly, we recommend synthesizing and testing the new insecticide molecule further in the laboratory.


Introduction
The housefly (Musca nebulo), while the most familiar fly, is also the most responsible for the transmission of diseases. It is readily found on domestic organic waste. Organophosphate is one of the most effective insecticides, even for flies. The organophosphate compound is more easily dissolved, eco-friendly, as well as cheaper than other insecticides. Additionally, it is also more toxic [1].
Organophosphate insecticides may be designed through computational chemistry. Computational chemistry, along with a statistical approach, predicts the activity of a compound by analyzing the quantitative structureactivity relationship (QSAR) of the compound. QSAR modeling pertains to the construction of predictive models of biological activities as a function of structural and molecular information from a compound library [2]. By using the QSAR model, researchers can develop new compounds, while minimizing trial-and-error experimentation in the laboratory. A QSAR equation is a mathematical equation that correlates biological activity to a wide variety of physical or chemical parameters. QSAR models have successfully screened many com-June 2017 Vol. 21  No. 2 pounds for biological activity [3][4][5][6][7], including organophosphates and carbamates [8], by using only freeenergy-related physiochemical substituent parameters such as π, σ, among others. Furthermore, Naik, et al. [9] conducted a QSAR study of organophosphates.
One of the QSAR approaches is the Hansch method. This method assumes that biological activity is a function of the hydrophobic, electronic, and steric parameters of the molecule [10]. Hydrophobic and steric parameters are also known as molecular parameters. These parameters are used to analyze the quantitative relationships between the structures of organophosphate derivative insecticides and their activities.
When organophosphate touches the fly, it is absorbed by the fly's body and flows through its circulatory system to the neuron system [11]. The organophosphate compounds affects the neuron system by obstructing the function of acetylcholine esterase enzymes. In the insect's neuron system, acetylcholine esterase enzymes contain a natural substrate, called a acetylcholine (ACh) [12]. The dipole interaction shows that C=O of the ACh will interact with the Serine-OH, while on the -N(CH 3 ) 3 will interact with the anionic part of the enzyme, the Glutamine-COO. Organophosphate competes with the ACh by interacting with the acetylcholine esterase enzyme [11], thus, involving dipole interaction. This fact is the background of this research, using the atomic netcharge and the dipole moment as parameters.
The middle structure of the acetylcholine esterase enzyme can adapt; in which interacting with the acetylcholine and when releasing choline [11]. The transformation in this middle structure of the enzyme is possible when the acetylcholine esterase enzyme interacts with the organophosphate derivative compound. Therefore, this research also considers SA (surface area) as a molecular parameter.
The the Log P (partition coefficient) parameter was also involved in this study, since it enables the organophosphate to absorb into the fly and flow through its circulatory system.
Through computational chemistry, a new compound can be designed based on molecular modeling and simulation calculations. These calculations are based on the quantum mechanical calculation method. The semiempirical AM1 (Austin Model 1) is a commonly applied method for molecular modeling. This method was selected for its efficient and accurate calculation results. AM1 is an accepted method for calculating organic sixheterocyclic compounds, such as benzene [13]. The organophosphate derivatives in this study are also sixheterocyclic compounds, therefore the AM1 method has been selected for calculating QSAR parameters.

Experimental
Molecular Modeling of Organophosphate Derivatives. The organophosphate and its derivative structure models were built using the Hyperchem 7.0 package. The detailed structures of the organophosphate derivatives in this study are listed in Table 1.
Geometrical Optimization. The geometrical optimization calculation was performed on the series of organophosphate derivatives in the Hyperchem 7.0 program by  using the semi-empirical AM1 method with a convergence limit of 0.001 kcal/Å. The Polak Ribiere algorithm method was used with convergence limit of 0.001 kcal/Å.mol. After the compound reached convergence, single point calculations of the electronic and molecular parameters were performed. Electronic parameters used in this research were the atomic net-charge (q) and dipole moment (μ). The atomic net-charge parameters consisted of C1, C2, C3, C4, C5, C6, O7, P8, O9, O10, and O11 organophosphate main structures, while the molecular parameters were Log P and SA.
QSAR Analysis. In this research, backward multiple linear regression (MLR) analysis was used to obtain the QSAR equation model. Of 35 organophosphate derivatives, 5 compounds were chosen as the testing compounds and the rest were selected as the fitting compounds. The testing compounds validated the obtained QSAR equation model. The independent variables in the MLR analysis were the electronic and molecular parameters, while the dependent variable was -Log LD 50 , which describes the insecticide activity of the organophosphate derivatives.
The selection of the best QSAR model was based on statistical parameters obtained through MLR analysis such as r, adjusted r 2 , F count /F table , as well as from the Predicted Residual Sum of Square (PRESS) calculation of each model. To validate the best QSAR model, the insecticide activities of a series of testing-compounds were calculated and the results were compared to those obtained from a laboratory experiment through a relation plot between the experimental -Log LD 50 and the predicted -Log LD 50 .
Design of New Insecticides. Previous QSAR models were used as a guide in designing new insecticide molecules of organophosphate derivatives for higher insecticidal activity. To design new insecticide molecules, we used CH 3 , C 2 H 5 , C 4 H 9 (the same as the existing R substituents on molecules of existing organophosphate derivatives), and C 3 H 7 . The X substituents were varied so that the molecules had the properties to either removing or donating electrons. To examine the influence of the X substituents on the insecticidal activity, we kept the R substituent constant (R = C 2 H 5 ), while the X substituent varied. To examine the effect of the R substituent on the insecticidal activity, we used the X substituent with the highest insecticidal activity, while the R substituent varied with CH 3 , C 2 H 5 , C 3 H 7 and C 4 H 9 .

Results and Discussion
Method Validation. In previous research, we used two semi-empirical methods, AM1 and PM3, for the calculation of the chemical shift (δ, ppm) of dimethyl phenyl phosphate using HyperNMR, in the Hyperchem 7.0 program, with the torsion angle of C1-O7-P8-O9 kept at 180°. The result of the calculation was compared to experimental measurements ( 1 H-NMR, 400 MHz) [14][15]. The comparison revealed that the AM1 method described the chemical conformation of organophosphate derivatives more accurately than the PM3 method. Therefore, the AM1 method was the calculation method for further insecticide compound modeling in this study. The C1-O7-P8-O9 torsion angle was kept constant at 180°s ince this angle gave the lowest potential energy [15].

Generation and Selection of QSAR Equation Model.
The organophosphate derivative compound interacted with acetylcholine esterase enzyme, which involved the atomic net-charge (q), dipole moment (μ), SA, and partition coefficient (Log P). The previous research arranged the QSAR equation using the atomic net-charge (q) and dipole moment (μ) as the electronic parameters [15], while this research used the atomic net-charge (q) and dipole moment (μ) as the electronic parameters, and with the SA and the partition coefficient (Log P) as the molecular parameters to arrange QSAR equation.
The 35 active compounds, with their acute toxicity to houseflies, were randomly divided into a set of 30 fitting compounds, and a set of 5 test compounds. The tested compounds are used to validate the QSAR equation model. The parameters of the fitting compounds, chosen at the beginning of the process, were used in MLR analysis to determine the best QSAR equation model. A backward MLR analysis method was used in the derivatization. Both electronic and molecular parameters were used as independent variables. The electronic parameters included the dipole moment (μ) and the atomic net-charge (q) consisted of the atoms C1, C2, C3, C4, C5, C6, O7, P8, O9, O10 and O11. The molecular parameters involved in the derivatization of the model were Log P and SA, while the dependent variable was the insecticide activity (-Log LD 50 ) of the organophosphate derivatives. Table 2 shows some of the QSAR equation models obtained from MLR analysis.
To select the best QSAR equation, we considered statistical parameters, such as value r (correlation coefficient), adjusted r 2 , Standard Deviation (SD), F count /F table , and PRESS. The best QSAR equation model was determined from the fewest number of variables, low SD value, high value of F count /F table , and low PRESS value. The PRESS value was the residual of predicted biological activity and experimental biological activity. So far, we had model 6 with the lowest PRESS, model 8 with the lowest SD value, and model 10 and 11 for the fewest number of parameters.
Therefore, these four model candidates were further vetted by using them to calculate the activity of the testing compounds. The calculation results (predicted -Log LD 50 ) were then be plotted against experimental -Log LD 50 . The resulting linear equations are listed below. If the predicted -Log LD 50 aligned with the experiment -Log LD 50 , the slope value of the plot will be closer to unity. Among those four plots, the slope of model 10 was the closest to unity, and was, therefore, considered the best QSAR equation Design of New Insecticides. Model 10, the best QSAR equation model, was used for predicting the activity of new insecticide molecules of organophosphate derivatives. The R substituents of the new insecticides were CH 3 , C 2 H 5 , C 3 H 7 and C 4 H 9 . The X substituents were varied so that the molecules had properties to withdraw or donate electrons. Based on the Linear Free Energy Relationship (LFER) principle in designing new insecticide molecules, X substituents were attached to the phenyl ring in the para and meta position to enhance the resonance effect. On the other hand, the X substituent in the ortho position had an experimental limitation due to steric effects; so we did not design new insecticide molecules with the X substituent at this position. The design of new insecticide molecules of organophosphate derivatives, with their predicted insecticidal activity, is shown in Table 4.
For insecticide molecules of organophosphate derivatives that already exist, the organophosphate derivative compound with R = C 2 H 5 and X = 4-NO 2 had the highest insecticidal activity (-Log LD 50 = 5.2), and X = 4-NO 2 was an electron withdrawing group. In Table 3, the new insecticide molecules of organophosphate derivatives are predicted to have high insecticidal activity, and a X = 4-SO 3 H substituent, also an electron-withdrawing group. Therefore, the insecticide molecules of organophosphate derivatives with an electron-withdrawing group on the benzene ring had the greatest insecticidal activity. However, the insecticide molecules of organophosphate derivatives with the X substituent at the para position (X = 4-SO 3 H, predicted -Log LD 50 = 7.344) had better predicted insecticidal activity from those with the X substituent at the meta position (X = 3-SO 3 H, predicted -Log LD 50 = 4.582). This showed that the position of -SO 3 H substituent caused resonance effects on benzene groups, resulting in pulling electrons from O7 towards -SO 3 H substituent groups by passing the benzene. This flow of electrons from O7 towards -SO3H substituent easily broke the bonds between O7 and P8 atoms, making it easier for phosphate groups bind to the O-Serine, increasing the insecticidal activity of these molecules.
The best QSAR equation model was used to predict the activity of new insecticide molecules of organophosphate derivatives. From the design of a new insecticide compounds, the 4-(diethoxy phosphoryloxy) benzene sulfonic acid (R = C 2 H 5 , X = 4-SO 3 H) was predicted to have the highest insecticidal activity (-Log LD 50 = 7.344). It has a better insecticidal activity than those already in existence. We recommend 4-(diethoxy phosphoryloxy) benzene sulfonic acid be synthesized and tested further in the laboratory.