In Silico Studies of Lamiaceae Diterpenes with Bioinsecticide Potential against Aphis gossypii and Drosophila melanogaster

Background: The growing demand for agricultural products has led to the misuse/overuse of insecticides; resulting in the use of higher concentrations and the need for ever more toxic products. Ecologically, bioinsecticides are considered better and safer than synthetic insecticides; they must be toxic to the target organism, yet with low or no toxicity to non-target organisms. Many plant extracts have seen their high insecticide potential confirmed under laboratory conditions, and in the search for plant compounds with bioinsecticidal activity, the Lamiaceae family has yielded satisfactory results. Objective: The aim of our study was to develop computer-assisted predictions for compounds with known insecticidal activity against Aphis gossypii and Drosophila melanogaster. Results and conclusion: Structure analysis revealed ent-kaurane, kaurene, and clerodane diterpenes as the most active, showing excellent results. We also found that the interactions formed by these compounds were more stable, or presented similar stability to the commercialized insecticides tested. Overall, we concluded that the compounds bistenuifolin L (1836) and bistenuifolin K (1931), were potentially active against A. gossypii enzymes; and salvisplendin C (1086) and salvixalapadiene (1195), are potentially active against D. melanogaster. We observed and highlight that the diterpenes bistenuifolin L (1836), bistenuifolin K (1931), salvisplendin C (1086), and salvixalapadiene (1195), present a high probability of activity and low toxicity against the species studied.


Introduction
The growing demand for agricultural products has led to the misuse of insecticides; resulting in the use of higher concentrations and the need for ever more toxic products [1]. This has led to increases in toxic effects on other beneficial organisms, (those which coexist with pests within the agroecosystem), in bioaccumulation of toxic insecticide concentrations in the bodies of both predators and endpoint consumers, including humans [2][3][4]. Most of the available insecticides for crop use are synthetic and often persist as toxic waste in the environment far beyond the time desired, thus causing both organismal resistance and environmental pollution [5,6].
The undesirable effects of using chemical pesticides include resistance in pests, pollution, and acute and chronic health problems [7][8][9]. The growing concern about environmental and health risks from the use of pesticides has led to prohibition of many traditional pesticides and substitutions with less toxic compounds [10][11][12]. Plant products are a rich source of active compounds with potential as insecticides, antifeedants, antimoulting hormones, oviposition impediments, repellents, juvenile hormone simulators, and growth inhibitors; and presenting promise against specific target insects [13,14]. may reflect Azadirachtin's interference in food and metabolism regulation, and provide evidence of delayed effects at this stage of development; reinforcing its insecticidal activity.
Another study developed by Loper and Collaborators (2016) [67], with strains of Pseudomonas fluorescensis, aimed to evaluate production of toxic mediators in D. melanogaster; such as Pf-5, Rizoxima, and Orfamida A. The methodology employed gene expression, with morphological, and phylogenetic analyses. The authors concluded that the oral toxicity induced by Pseudomonas fluorescensis by means of the Pf-5 gene was significant, being induced in several genes and that the effect is promoted by an extracellular chitinase and Rhizoxin and Orfamide A analogs.

Aphis gossypii
For Aphis gossypii, only 8 articles deal with the topic. Elbanhawhy and Collaborators (2019) [62], addressed the insecticidal activity of different organic extracts from entomopathogenic fungi; Cladosporium cladosporioides, Metarhizium anisopliae, Purpureocillium lilacinum, and Trichoderma longibrachiatum against the cotton aphid, Aphis gossypii. The impact of the extracts on certain biochemical characteristics and enzyme activity was also evaluated.
The authors performed extraction of metabolites from the fungus, as well as toxicity tests, evaluation of total carbohydrates and triglycerides as effects, and evaluation of enzymatic activity. The results revealed that there was a significant increase (p < 0.05) in chitinase activity, respectively of 34.85% and 9.82%, for the C. cladosporioides and P. lilacinum extracts under study.
The authors concluded that chitinase plays an important role in the growth and development of insects, in addition they reported that this family of enzymes is involved in insect defense against entomopathogenic fungi, since chitinase activity increased after treatment with P. lilacinum methanolic extract. The extract thus can be used in chitin biodegradation, which leads to the death of the aphid.
The use of alcoholic derivatives was explored by Kim and Collaborators (2013) [68], the authors evaluated the effect of isotridecyl alcohol ethoxylation on the larvicidal activity of fungal supernatant, where the fungus was isolated and the supernatant was produced. To achieve the proposed objective, the authors determined aphicidal activity using the leaf immersion method, which suggested that the performance of the supernatant may be related to the activity of the enzymes chitinases and lipases, since they participate in the degradation of the cuticle by fungi and entomopaths. This enzymatic degradation was possibly improved by the TDE-less ethoxylate, being noted that at each ethoxylation a synergistic interaction between the supernatant and the TDE-n was evidenced.
Thus, the authors concluded that the supernatant B. bassiana (SFB-205) with less ethoxylated tridecyl alcohol (TDE) presents greater insecticidal activity against cotton aphid adults. The incorporation of TDE into the supernatant increased the potency of the fungal supernatant in an ethoxylation dependent manner. This finding represents a practical approach to effectively control harmful agricultural insects using an entomopathogenic fungal supernatant or spores (conidia).
In another study Kim and Collaborators (2010) [69], evaluated the expression of enzymes for pest control through a study with fungi, which evaluated aphicidal activity in expression of Beauveria bassiana enzymes, this being their initial study. The work aimed to describe the roles of adjuvants, such as corn oil, and polyoxyethylene-(3)-isotridecyl ether (TDE-3), in promoting the aphicidal activity of the enzymatic precipitate of Beauveria bassiana SFB-205 supernatant. The methodology consisted of isolating the fungus, preparing the suspension, and screening.
Regarding chitinase degradation, the authors reported its occurrence and variance under different conditions, with the AMEP + TDE-3 suspension (based on corn oil) degrading the specific chitinase substrate more pronouncedly than the AMEP + TDE-3 water-based suspension in dry conditions. Compared to unexposed suspension, 73% of the substrate (pNG) was degraded by treatment with the AMEP + TDE-3 suspension based on corn oil, 120 min after drying. However, the AMEP + TDE-3 water-based suspension treatment yielded only 18.5% pNG degradation for the same exposure time. Degradation was significantly affected by suspension in water or corn oil (base type (B): F1.72 = 15.9, Pb0.001; exposure time (E): F5.72 = 1.8, p = 0.133; and B × E: F5.72 = 4.1, p = 0.032). In addition, treatment with the AMEP + TDE-3 suspension based on corn oil, a large part of the suspension remained in the wells even after incubation for 120 min. On the other hand, little or no remaining suspension was observed in with the AMEP + TDE-3 water suspension treatment at the end of the exposure.
Thus, the authors conclude that corn oil with TDE-3 can promote the insecticidal activity of attagel-mediated enzyme powder (AMEP), to provide another strategy for the development of biopesticides using entomopathogenic fungi. Kim and collaborators (2010) [70], evaluated the expression of B. bassiana enzymes to control Aphis gossypii, in two FPLC fractions. Chitinase activities were expressive and identified for the fractions, corresponding to 55 KDa of the protein band. The authors concluded that the two FPLC fractions from Beauveria bassiana SFB-205 supernatant, displaying chitinase or Pr1/Pr2 protease activity and bioassayed against Aphis gossypii in different ratios, promoted a decrease in the aphid population, yet this decrease was more significantly influenced by the chitinase fraction (in a dosage-dependent manner).

Acethylcolinesterase
Acetylcholinesterase (AChE) inhibition is the main mechanism of action of organophosphates [71][72][73]. The enzyme is essential and necessary for hydrolysis of the neurotransmitter acetylcholine (ACh), and plays a fundamental role in controlling synaptic transmission in all animals; hydrolyzing acetylcholine to end its synaptic action [74][75][76].
The enzyme AChE belongs to the group of phase I metabolic enzymes and can metabolize various internal and external substrates in pests; this group of metabolic enzymes consists of broad spectrum enzymes capable of metabolizing chemical insecticides such as organophosphates, carbamates, or pyrethroids [77,78]. Increasing or decreasing the amount of these enzymes leads to loss of efficiency in insecticides; thus, agents with new and different mechanisms of action must be developed for insect control [79][80][81][82].

Drosophila melanogaster
In studies involving Drosophila melanogaster, the participation of acetylcholinesterase is widely reported, on average 400 publications on the subject are identified, and the studies use various methodologies.
In a computational study of Quantitative Chemical Structure and Biological Activity Relationship (QSAR), Rodrigues and Collaborators (2020) [83], evaluated the potential insecticidal activity of monoterpenes against the insects Reticulitermes chinensis and Drosophila melanogaster. Construction of linear regression models was performed in which the activity was expressed in pIC 50 which is equivalent to −log LC 50 . The descriptors were obtained using the Dragon software version 7.0 and the selection of variables for later calculation by the genetic algorithm was performed in the Mobydigs 1.1 software using the multiple linear regression equation (MLR) method. Only the models with the highest Q 2 values were selected according to workflows. Molecular Docking simulations were performed using the Molegro Virtual Docker 6.1.0 software, with the proteins obtained from the Protein Data Bank PDB library. One of the proteins chosen was acetylcholinesterase under code: 1QON, 2.7 Å resolution, using a 15 Å GRID in the radius, and 0.30 Å resolution at the enzyme binding site with the structures.
The results revealed that the 40 monoterpenes present interaction values very close to known insecticides. Neryl acetate presented the lowest energy at −87 kcal/mol, close to the insecticides methiocarb and pirimicarb (carbamates), with the energies of −90 kcal/mol. The compound neryl acetate was one of the most active in the series, followed by citronellyl acetate with −83 kcal/mol, geranyl acetate with −78 kcal/mol and linalyl acetate with −77 kcal/mol; monoterpenes presenting the lower energies. In addition, the compounds presented significant interactions with the enzyme, suggesting that the monoterpenes belonging to the acetates group interact more strongly with acetylcholinesterase, and possibly that other monoterpene groups may present different mechanisms of action. The stability of the interaction was verified by molecular dynamics simulations revealing that the stability of the AChE active site was guaranteed by formation of complexes with three selected terpenes, being comparable to pirimicarb and methiocarb.
In another study Gomes and collaborators (2020) [75], evaluated the insecticidal action of Croton campestris methanolic extract, and the protective effect of gallic acid. The methodology used consisted of quantification of compounds by HPLC, enzymatic and locomotion assay, and evaluation of enzyme expression (acetylcholinesterase). The results showed that organophosphates reduced the expression of acetylcholinesterase by 66%, but this result was blocked by the study compound MFCC. The action was also observed in other enzymes such as superoxide dismutase.
The authors concluded that chemical constituents of the plant prevented organophosphate induced AChE inhibition, and attributed essential neuroprotective potential to the plant. It was shown that gallic acid contributes to the fraction's protective potential as compared to other phenolic compounds. Thus, MFCC was considered a promising source of potential therapeutic agents for the treatment of organophosphate intoxication.
Similarly, Abbod (2020) [84], analyzed the mode of action of the substance 3-butylidenephthalide, a natural pesticide. Their results showed that the study compound (3-BPH) exhibited in vitro activity for i-AChE in a concentration-dependent manner; the percentage of inhibition of the enzyme varied respectively between 14% ± 3.21 to 69% ± 4.93 for 50 µg/mL to 1000 µg/mL (mean ± SE of the three repetitions). Physostigmine was used as a standard AChE inhibitor and revealed an IC 50 of 0.082 µg/mL. Thus, the authors suggested 3-BPH as an important plant protector.
Musachio and collaborators (2020) [85], demonstrate the development of Parkinson's in Drosophila melanogaster species through exposure to Bisphenol A (BPA). Their results reveal an association with acetylcholinesterase levels, since in the head samples, there was a decrease in the activity of the enzyme AChE in both groups exposed to BPA (at 0.5 mM and 1 mM), when compared to the control group (Ap < 0.0007; F = 18.08). However in the body samples, the activity of AChE did not change (Bp < 0.2738; F = 1.620). The authors concluded that bisphenol induces changes similar to Parkinson's, and is possibly associated with oxidative stress, suggesting new options for future study.

Aphis gossypii
For the involvement of acetylcholinesterase in Aphis gossypii, 88 articles were found. Authors Ulusoy, Özgür and Alpkent (2019) [80], reported on the effect of in vitro antiacetylcholinesterase and anti-carboxylesterase toxicity for various plant extracts. The plants used in the test were: Daphne odora L. The methodology consisted of preparing extracts of the plants and performing toxicity tests, evaluating the effects of inhibiting acetylcholinesterase and carboxyesterase. The results demonstrated that F. carica extract in all concentrations was the most effective in inhibiting the expression of acetylcholinesterase. Ficus carica presented the greatest AChE inhibitory effect (51.9% inhibition). The least AChE inhibitory effect was 10% with D. amoena (20.9% inhibition). The most effective plant extracts after F. carica were D. odora (41.0% AChE inhibition), and E. camaldulensis (40.3% AChE inhibition). In conclusion, it was determined that aqueous D. odora, E. camaldulensis, F. carica, and M. pulegium leaf extracts present significant bioinsecticide effect and in vitro anti-AChE activities in A. gossypii.
The influence of acetylcholinesterase expression is demonstrated through studies of resistance associated with pirimicarb. Research developed by Tieu and Collaborators (2017) [86] found that pyrimidine resistance presented a significant cost in physical conditioning in susceptible aphids and in the absence of insecticidal pressure, and that the cost of physical conditioning was related to initial resistance, due to the involvement of acetylcholinesterase receptors.
The occurrence of mutation in acetylcholinesterase receptors in the melon aphid through exposure was also addressed by Lokeshwari, Kumar, and Manjunatha (2016) [87]. The AChE enzyme assay revealed that there was no significant change in AChE activities in resistant and susceptible strains. However, the AChE inhibitory assay revealed that 50% of enzymatic activity in resistant strains was inhibited at significantly higher concentrations of dimethoate (131.87, 158.65, and 99.29 µmol L −1 ) compared to susceptible strains (1.75 and 2.01 µmol L −1 ), indicating insensitivity to AChE due to AChE modulation.
Functional analysis of such point mutations was evaluated in molecular docking studies, using modeled wild type and naturally mutated AChE2. Computational analysis showed that conformational changes in the AChE2 active site due to structural gene substitutions (A302S, S431F, and G221A) significantly reduced the level of ligand binding (OP-dimethoate, Omethoate, and CM-pirimicarb), suggesting that they are potentially associated with resistance development.
The authors concluded that multiple mutations located at the active site of the enzyme are responsible for AChE insensitivity to dimethoate, and are probably the molecular basis of resistance to dimethoate in these A. gossypii populations. In addition to studies on the occurrence of mutations, there are also studies involving proteomic profiling, and protein analysis; such as research carried out by Xi and Collaborators (2015) [88], involving Spirotetramat tolerance, where the authors demonstrated that acetylcholinesterase conferred resistance to the substance under study.

Nicotinic Acetylcholine Receptor
Acetylcholine has two types of receptors, which are classified into nicotinic (nAChRs) and muscarinic (mAChRs) receptors [89][90][91]. Nicotinic receptors present as ion channels dependent on pentameric ligands that are activated by acetylcholine, as well as by nicotine to trigger action potentials for rapid synaptic neurotransmission [92]. In mammals, nicotinic receptors are expressed in presynaptic regulation to regulate the release of dopamine and other neurotransmitters from the nigrostriatal terminals. In insects, mainly in Drosophila melanogaster species, nicotinic receptors are expressed in abundance throughout the central nervous system [93][94][95][96]. Figure 1 demonstrates the structure of the nicotinic acetylcholine receptor [97]. The receptor is pentameric. The α4 subunits are green, and β2 is blue. Nicotine (red) and sodium (pink) are represented as spheres. The disulfide Cys-loop and C-loop connections are shown as yellow spheres.

Drosophila melanogaster
For Drosophila melanogaster, about 230 articles were found related to the nicotinic acetylcholine receptor.
Research developed by Fournier-Level and Collaborators (2019) [98], addressed the expression of receptors in Drosophila melanogaster after exposure to imidacloprid. Population and quantitative genomic analysis, supported by functional tests, revealed a mixed genetic architecture for resistance involving the main genes (Paramyosin, and Receptor Nicotinic-Acetylcholine Alpha 3), and polygenes with a large exchange and thermo-tolerance. The reduced genetic differentiation in the sites associated with resistance indicated an increase in gene flow. Resistance alleles showed stronger evidence of positive selection in temperate populations as compared to tropical populations, in which chromosomal inversions In (2L) t, In (3R) Mo, and In (3R) Payne harbor susceptibility alleles.

Drosophila melanogaster
For Drosophila melanogaster, about 230 articles were found related to the nicotinic acetylcholine receptor.
Research developed by Fournier-Level and Collaborators (2019) [98], addressed the expression of receptors in Drosophila melanogaster after exposure to imidacloprid. Population and quantitative genomic analysis, supported by functional tests, revealed a mixed genetic architecture for resistance involving the main genes (Paramyosin, and Receptor Nicotinic-Acetylcholine Alpha 3), and polygenes with a large exchange and thermo-tolerance. The reduced genetic differentiation in the sites associated with resistance indicated an increase in gene flow. Resistance alleles showed stronger evidence of positive selection in temperate populations as compared to tropical populations, in which chromosomal inversions In (2L) t, In (3R) Mo, and In (3R) Payne harbor susceptibility alleles.
Thus, the authors concluded that polygenic architecture and ecological factors should be considered when developing sustainable management strategies for beneficial pests and insects.
Similarly, Shin and Ventom (2018) [96], evaluated the electrochemical mechanisms of Dopamine receptor stimulation in Drosophila melanogaster. In this study, acetylcholine and nicotine were used as stimulants, since both interact with nicotinic acetylcholine receptors (nAChRs) to evoke endogenous dopamine release. Stimulation with 10 pmol acetylcholine caused 0.26 ± 0.05 μM dopamine release, while nicotine stimulation at 70 fmol evoked 0.29 ± 0.03 μM dopamine release in the central complex. The release of dopamine stimulated by nicotine lasted much longer than the release stimulated by acetylcholine. Dopamine release is reduced in the presence of nAChR α-bungarotoxin antagonist, and the sodium channel blocker tetrodotoxin, indicating that release is mediated by nAChRs and exocytosis.
Another mechanism studied is the involvement of serine metabolism in the hunger regulation and sleep modulation in Drosophila melanogaster, developed by Sonn and collaborators (2018) [99]. Their results revealed that mutation in the stdh gene in the acetylcholine nicotinic receptor selectively rescued the suppression of sleep induced by exaggerated hunger, through the administration of an antagonist. Thus, the authors conclude that neural serine metabolism controls sleep during hunger, possibly via cholinergic signaling, due to the development of a sleep regulatory mechanism that reprograms the metabolism of amino acids for adaptive sleep behaviors in response to metabolic needs. Thus, the authors concluded that polygenic architecture and ecological factors should be considered when developing sustainable management strategies for beneficial pests and insects.
Similarly, Shin and Ventom (2018) [96], evaluated the electrochemical mechanisms of Dopamine receptor stimulation in Drosophila melanogaster. In this study, acetylcholine and nicotine were used as stimulants, since both interact with nicotinic acetylcholine receptors (nAChRs) to evoke endogenous dopamine release. Stimulation with 10 pmol acetylcholine caused 0.26 ± 0.05 µM dopamine release, while nicotine stimulation at 70 fmol evoked 0.29 ± 0.03 µM dopamine release in the central complex. The release of dopamine stimulated by nicotine lasted much longer than the release stimulated by acetylcholine. Dopamine release is reduced in the presence of nAChR α-bungarotoxin antagonist, and the sodium channel blocker tetrodotoxin, indicating that release is mediated by nAChRs and exocytosis.
Another mechanism studied is the involvement of serine metabolism in the hunger regulation and sleep modulation in Drosophila melanogaster, developed by Sonn and collaborators (2018) [99]. Their results revealed that mutation in the stdh gene in the acetylcholine nicotinic receptor selectively rescued the suppression of sleep induced by exaggerated hunger, through the administration of an antagonist. Thus, the authors conclude that neural serine metabolism controls sleep during hunger, possibly via cholinergic signaling, due to the development of a sleep regulatory mechanism that reprograms the metabolism of amino acids for adaptive sleep behaviors in response to metabolic needs.
Exposure to nicotine in larval development through activation of dopamine receptors in Drosophila melanogaster was addressed by Morris and collaborators (2018) [100]. The authors demonstrated the involvement of the Dα7 subunit of nicotinic acetylcholine receptors in the early hatching of larvae, and concluded that exposure to nicotine during Drosophila melanogaster development affects the size of the brain and the dopaminergic system.

Aphis gossypii
To demonstrate the involvement of the nicotinic acetylcholine receptor in the action of insecticides against Aphis gossypii, 30 articles were found.
Of the methodologies covered, analysis of receptor mutation was performed by Hirata and Collaborators (2017) [101], evaluating mutation of the R81T gene. This mutation is characterized as the source of resistance to neonicotinoid insecticides. The authors evaluated Molecules 2021, 26, 766 9 of 39 the differential effects of the R81T mutation in cyan and nitro-substituted neonicotinoids and in sulfoxaflor, and for this purpose, isolation of the complete coding sequences for A. gossypii in the AChRα1, α2, β1 subunits was performed.
The results revealed that when co-expressed in Xenopus laevis oocytes in chicken β2 nAChR, A. gossypii α1 evoked internal currents in a concentration-dependent manner in response to acetylcholine (ACh) and showed sensitivity to neonicotinoid and sulfoxaflor. In addition, the chicken β2 mutation T77R + E79V (double mutant equivalent of R81T) resulted in a lesser effect on cyano-substituted neonicotinoids and sulfoxaflor than on nitro-substituted neonicotinoids (neonicotinoid insecticides replaced by cyan and nitro).
The authors concluded that the R81T mutation presents resistance to nicotinoids in nAChRs, and the mutation affects distinctly cyan and nitro-substituted neonicotinoids.
The occurrence of mutations in the nicotinic receptor was also addressed by Chen and Collaborators (2017) [102]. The authors identified three mutations at the target site within the β1 subunit of the nicotinic acetylcholine receptor (nAChR) in the IMI_R strain, with the R81T mutation being responsible for imidacloprid resistance in A. gossypii and M. persicae. The V62I and K264E genes were first detected in A. gossypii. The mutations are also present in internal populations, suggesting that they play a role in resistance to imidacloprid.
Other articles that address resistance after the occurrence of mutations in subunits of the nicotinic acetylcholine receptor, comprise research developed by Toda and Collaborators (2017) [103], studying a point mutation (R81T) in the region of the D loop of the β1 subunit of the acetylcholine nicotinic receptor gene conferring resistance, and identification of neonicotinoid resistance using the Polymerase Chain Reaction (PCR) methods.
The mutation in the R81T gene has also been reported by Wang and Collaborators (2016) [104]; however, their objective was α2 and β1 subunits after administration of Sulfoxaflor. The results demonstrated that the van der Waals interactions of whole molecules were highly correlated with neonicotinoid binding capacity, and correctly predict the classification order of the association between neonicotinoids and sulfoxaflor. Further, changes in a whole molecule electrostatic energy component can potentially explain the effects of the mutation at the target site through a pattern of reduced efficacy for modeled neonicotinoids, and provide a basis for reducing the effect of this mutation on sulfoxaflor.

Protein Sequence Alignment
Protein sequence alignment helps to verify similarity and identity of a single protein in different species. Using this technique, one can analyze conserved regions and identify common residues in the active site. In addition, structural differences and similarities that can contribute to the development of drugs may be revealed. Thus, we investigated shared amino acids from AChE, nAChR, and Cht sequences in A. gossypii and D. melanogaster.
The results revealed that A. gossypii and D. melanogaster respectively present 37% and 41% identity with the Anopheles gambiae-AChE, (Figure 2). Yet despite the low identity scores, the AChE site is conserved between species, with 90% of amino acids shared. For the enzyme nAChR, A. gossypii and D. melanogaster respectively presented 29.96% and 29.39% of identity with H. sapiens-nAChR, with 55% of amino acids shared at the active site ( Figure 3). The Cht enzyme of A. gossypii and D. melanogaster respectively presented 61% and 60% identity with Ostrinia furnicalis-Cht, and 100% of amino acids shared at the active site ( Figure 4).

Homology Modeling
In this study, one AChE model, two nAChR models, and two Cht models were generated. The reliability of the models was assessed using a Ramachandran chart, which represents all possible combinations of dihedral angles Ψ (psi) versus φ (phi) for each amino acid in a protein, except glycine, which has no side chains. The model is considered to be reliable when more than 90% of the amino acids are present in the permitted and/or favored regions (colored regions of the graph). Blank regions represent outliers with poor contacts. All of the generated models presented more than 97% of their amino acids in allowed and favored regions ( Figure 5 and Table 1), and were used in the following methodologies.

Homology Modeling
In this study, one AChE model, two nAChR models, and two Cht models were generated. The reliability of the models was assessed using a Ramachandran chart, which represents all possible combinations of dihedral angles Ψ (psi) versus ϕ (phi) for each amino acid in a protein, except glycine, which has no side chains. The model is considered to be reliable when more than 90% of the amino acids are present in the permitted and/or favored regions (colored regions of the graph). Blank regions represent outliers with poor contacts. All of the generated models presented more than 97% of their amino acids in allowed and favored regions ( Figure 5 and Table 1), and were used in the following methodologies.

QSAR Modeling
The models used the RF algorithm, were built using the cross-validation procedure in the Knime software, and were evaluated for their predictive power parameters of specificity, sensitivity, accuracy, precision, (positive predicted value-PPV), and negative predicted value (NPV). Performance and robustness were evaluated using the ROC curve and the Mathews correlation coefficient (MCC). Table 2 describes the characteristics of the models in terms of predictive power and robustness, and Figure 6 presents the performance of each model. Both models presented predictive power above 70%.    The ROC curve analysis provided good results, the Aphis gossypii ROC curve presented a value of 0.78 and the D. melanogaster a value of 0.86 for cross-validation ( Figure 6), both with an accuracy value higher than 70%, revealing a model with excellent classification and performance (Tables 2 and 3). Using these models with excellent performance, diterpene and natural weed product banks were screened to select compounds potentially active against AChE, nAChR, and Cht in the studied species. The descriptors used to generate the predictive models belong to the Dragon software, we selected the 15 most influential descriptors for the Drosophila melanogaster model, eight of which belong to the block of GETAWAY (GEometria, Topologia e Atom-Weights AssemblY) descriptors (ISH, H3u, HIC, HGM, H1u, H2u, HATS2u and HATS4u), these descriptors are calculated based on the representation of the molecular structure, in the Molecular Influence Matrix (MIM), denoted by H, including the atomic coordinates that are considered concerning the geometric center of the molecule, to obtain translation invariance. The other seven descriptors selected for the Drosophila melanogaster model belong to the block of WHIM (Weighted Holistic Invariant Molecular descriptors) descriptors (Dv, De, Vm, Vv, Ve, Vp and Vs). These descriptors are geometrical based on statistical indices calculated on the projections of the atoms along principal axes, are built in such a way as to capture relevant molecular 3D information regarding molecular size, shape, symmetry, and atom distribution concerning invariant reference frames.
As we selected the 15 most influential descriptors for the model against Aphis gossypii specie, five of these belong to the block GETAWAY (GEometria, Topologia e Atom-Weights AssemblY) descriptors (R2u, R3u, H5e, H6e and H8e), these descriptors use the molecular information matrix and shows rotational invariance concerning the coordinates of the molecule, thus resulting independently of the alignment of the molecule. The other ten most influential descriptors in the Aphis gossypii model belong to the RDF (Radial Distribution Function) descriptors (RDF105v, RDF110v, RDF115v, RDF120v, RDF125v, RDF130v, RDF135v, RDF140v, RDF145v and RDF150v). These descriptors are based on a radial distribution function that can be understood as the probability distribution of finding an atom in a spherical volume of radius R, taking into account the characteristics of the atoms, the interatomic distance, and the number of atoms in the molecule.

Combined Ligand-Based and Structure-Based Analysis
The molecular docking study was performed for the AChE, nAChR, and Cht enzymes of the species selected in this study. The diterpene bank was evaluated to select molecules with good probabilities of potential inactivation. The Molegro software generates compound interaction energies, producing a Moldock Score for each protein studied. Calculations were performed to identify the compounds presenting a higher probability of being potentially active for each protein analyzed, using the following formula: where E Lig is the energy of the analyzed ligand, E MLig is the lowest energy obtained from the tested ligands, and E Inib is the energy of the PDB inhibitor ligand obtained from the crystallography data of the tested protein. Only molecules that obtained a binding energy below the binding energy of the crystallographic inhibitor ligand were considered potentially active.
Of the 1955 compounds analyzed using molecular coupling, 1702 were considered to be potentially active against Aphis gossypii-AChE, 1532 active against Aphis gossypii nAChR, 33 active against Aphis gossypii-Cht, 1719 active against D. melanogaster-AChE, 1207 active against D. melanogaster nAChR and 20 actives against D. melanogaster-Cht. Most of compounds were shown to be potentially active in both species for the enzymes AChE and ACh.
A second consensus analysis was carried out to identify potentially multi-targeting compounds, which, based on the RF model and docking, demonstrate potential active probabilities for more than one species. The following formula was used: where Prob Dc is the active potential probability of the molecular coupling analysis, ESP is the specific mean value of the RF model and P Activity is the active potential probability value of the RF model. This combined probability was conditioned, as only molecules with values above 0.5 were considered likely to be active. The combined probability values were calculated for the compounds identified for each target enzyme, and we analyzed which molecules were multi-targeting.
Of the 1955 compounds analyzed for combined probability (Prob Comb ), 313 were considered potentially active against Aphis gossyppii-AchE, with a probability ranging from 59 to 75%, 95 were considered potentially active against Aphis gossyppii-nAChR with a probability ranging from 63 to 76%, 33 were considered potentially active against Aphis gossypii-Cht, with a probability ranging from 62 to 78%, 321 were considered potentially active against D. melanogaster-AchE, with a probability ranging from 54 to 75%, 74 were considered potentially active against D. melanogaster-nAChR, with a probability ranging from 55 to 74% and 5 were considered potentially active against D. melanogaster-Cht, with a probability ranging from 50 to 62%.
After performing the combined analysis, based on the ligand and structure, and using the formula to identify multitarget molecules, we identified 15 potentially active molecules for the three A. gossypii enzymes: AChE, nAChR, and Cht (Table 4) and 37 potentially active molecules for D. melanogaster enzymes: AChE and nAChR (Table 5).  The results of the ligand efficiency (LE) [108,109] are also available as a parameter to evaluate the best diterpenes (Tables 4 and 5). The selected diterpenes for Aphis showed LE close to the insecticides while for Drosophila are similar or superior, suggesting this class of compounds as potential bioinsecticides. Additionally, flexible docking using the GOLD 5.6.2 program was performed to compare the results, being similar to the results obtained with Molegro, excepted partially for nAChR. Therefore, the results reinforce the diterpenes were selected using Molegro software (Tables 4 and 5).
Analyzing Table 4, we observe five diterpenes that presented the highest in silico activities against A. gossypii. bistenuifolin L (1836) is an ent-kaurane diterpene that occurs in a species of the genus Isodon of the subfamily Nepetoideae (Lamiaceae), and with botanical occurrence in the region of China. In its structure, we verified the presence of a heterodimeric, a six acetoxy and two hydroxyls. The second-ranking molecule is also a ent-kaurane diterpene, bistenuifolin K (1931), which has a bicyclic structure, differing from the bistenuifolin C (1934) in the number of acetoxy and hydroxyl groups, presenting three acetoxy and five hydroxyl groups, while bistenuifolin C has four acetoxy and hydroxyl groups. Bistenuifolin I (1800), also an ent-kaurane diterpene with four hydroxyls, acetoxy and carbonyl groups, occurs botanically in Isodon species, although its geographical location has China. The fifth molecule with high activity is inermes b (1936), a neo-clerodane diterpene, which has another presents subtype of diterpene and diverges from 1931 and 1934. Like some other diterpenes already mentioned, inermes b can be found in China, in species of the genus Isodon.
In Table 5, analyzing the five diterpenes that presented potential in silico activity against D. melanogaster one observes that the clerodane diterpene salvisplendin C (1086) is found mainly in the genus Salvia, subfamily Nepetoideae (Lamiaceae) and distributed in Italy (Table 6). Diterpene 1086 has one hydroxyl and a lactonic ring in its structure, providing the highest activity of D. melanogaster. The with a carbocyclic skeleton diterpene salvixalapadiene (1195) has two carbonyls and two lactones in its chemical structure, present in Lamiaceae in the genus Salvia and geographically in Mexico. The ent-kaurane diterpene racemosin A (1302), obtained from the leaves of the species Isodon henryi, also has two acetoxy, a carbonyl and a hydroxyl group. Salviarin (1027) is also a clerodane diterpene, however, the substituents differ considerably when compared to the diterpene classified as the most active (1086), considering that 1027 has one hydroxyl groups and two carbonyls: occurring in six species of the subfamily Nepetoideae. The seco-neoclerodane diterpene tonalensin (342), less active compared to the two most active diterpenes (clerodanes), has functional groups ether, carbonyl, and lactone, and can be found in Mexico in species of the genus Salvia. Table 6. Distribution of diterpenes considered active against A. gossypii, with information on the subfamily Nepetoideae (Lamiaceae) listed by tribe, genus, and species (presenting plant data, and occurrence), is available on the SistematX (https://sistematx.ufpb.br) [110]. Information on the distribution of diterpenes considered active against A. gossypii and D. melanogaster is contained in Tables 6 and 7. Table 7. Distribution of diterpenes considered active against D. melanogaster with information on the subfamily Nepetoideae (Lamiaceae) listed by tribe, genus, and species, and presenting plant data and occurrence is available on the SistematX (https://sistematx.ufpb.br) [110].

Toxicity
Toxicity was evaluated A. gossypii and of the 15 compounds considered potentially active in the RF model and docking, 11 compounds 1800, 1804, 1836, 1840, 1842, 1845, 1910, and 1931-1934 presented no predicted mutagenicity or tumorigenesis effect, or negative effects on the reproductive system, or irritability. These molecules were considered to possess the best properties for not presenting any toxicity risk. Table 8 presents the compounds with no toxicity for the evaluated parameters. Toxicity was also evaluated for D. melanogaster, and of the 37 compounds considered potentially active in the RF model and docking, 27 compounds 21, 44, 46, 47, 51, 67, 77, 95,  111, 131, 151, 199, 200, 231, 342, 434, 442, 483, 759, 787, 1015, 1027, 1086, 1184, 1195, 1302, and 1350 presented no predicted mutagenicity or tumorigenesis effect, or negative effects on the reproductive system, or irritability. These molecules were considered to possess the best properties; for not presenting any toxicity risk. Table 9 presents the compounds with no toxicity for the evaluated parameters. Table 9. Toxicity evaluation of compounds considered active and multitargeting for D. melanogaster.

Interaction Analysis
Of the compounds considered potentially active in the RF model, in docking, and multitarget, with higher LE values and with low toxicity for the species under study, we selected the two compounds with the highest Prob Comb value to analyze interactions.
The interaction values ranged from −1 to −24 kcal mol −1 . In addition, in D. melanogaster-ChE, the compound salvisplendin C (1086) formed several hydrogen bonds with the amino acids Tyr71, Gly149, Gly151, Tyr162, Ser238, and His480. It also formed two hydrophobic bonds with the amino acids Tyr71 and Glu237 (Figure 7). It was observed that the energetic contributions of the interactions varied from −4 to −11 kcal mol −1 (Table 11) [111]. The compound salvixalapadiene (1195) has less-stable interactions; with only one hydrogen bond with the Tyr370 residue with an interaction value of −29.72 kcal mol −1 . Also, it shows two hydrophobic interactions with the amino acids Glu237 and Hist480. The diterpenes have stronger interactions than the insecticide Methomyl, which formed only one hydrophobic bond with His480 (−5.26 kcal mol −1 ) and another with Trp83 (−26.29 kcal mol −1 ) (Figure 8).
In D. melanogaster, salvisplendin C (1086) formed two hydrogen bonds with the amino acids Tyr137 and Lys189 and two hydrophobic interactions with the amino acids Tyr137 and Trp193 (Figure 9). Among these amino acids, Tyr137 stands out, with an interaction value with the diterpene corresponding to −28.27 kcal mol −1 (Table 11) [111]. The compound salvixalapadiene (1195) showed only hydrophobic interactions with the Tyr137 and Tyr243 residues. Already the insecticide acetamiprid formed only with a hydrogen bond with the amino acid Tyr137 and hydrophobic interaction with the amino acid Tyr243 (Figure 10).

Chitinase
The Chitinase enzyme presents over 2000 amino acids in its protein structure, yet the N-terminal region alone is responsible for its activity. Thus, homology models were built with this region only, and the amino acids in the docking images were renamed. The correct (non-altered) numbering of the amino acids corresponding to the N-terminal region of Cht is indicated in parentheses.

Alignment of Protein Sequences
The AChE, nAChR, and Cht enzyme sequences from A. gossypii and D. melanogaster were obtained from GenBank [112], and global alignment was performed using the Clustal Omega web tool [113], which aligns protein sequences inserted by the user. Unshared end regions were excluded from the alignment. The alignment facilitated the investigation of active sites, determination of similarities, and shared identity between the enzymes in the two species under study.

Homology Modeling
Due to the lack of experimentally known 3D protein structures, homology models of the enzymes under study for Aphis gossypii and Drosophila melanogaster were built. The sequences of the enzymes and species selected in the study were obtained from the GenBank database [114], and the model structures were obtained from the Protein Data Bank (PDB) [115]. Three enzymes were selected for the construction of homology models: AChE, ACh, and Cht. The template enzymes were: AChE from Anopheles gambiae (PDB ID: 5YDH) [116], ACh from Homo sapiens (PDB ID: 6PV7) [117], and ChtII from Ostrinia furnicalis (PDB ID: 6JAV) [107]. The enzyme models were built using the molecular homology modeling method in the MODELLER 9.20 software [118]. Five models were generated and the lowest energy model was chosen. The model's stereo-chemical qualities were evaluated using the PSVS web server (protein structure validation software suite) (http:// psvs-1_5-dev.nesg.org/), and PROCHECK [119]. PROCHECK generates a Ramachandran graph [120], which determines allowed and disallowed regions of the main chain of amino acids.

Molecular Docking
Molecular docking calculations flexible approach were carried out by the Molegro Virtual Docker (MVD) 6.0 software [121], and three targets were selected for anchorage studies. The 3D structure of the Drosophila melanogaster-AChE enzyme was obtained from the Protein Data Bank (PDB), using the following code: PDB ID 1DX4 [122]. Aphis gossypii-AChE, A. gossypii-nAChR and D. melanogaster-nAChR, and A. gossypii-Cht and D. melanogaster-Cht were obtained by homology. Initially, all water molecules were removed from the crystalline structure and the mean square quadratic deviation (RMSD) was calculated from the poses, indicating the degree of reliability of the adjustment. The RMSD provides the connection mode close to the experimental structure and is considered successful if the value is below 2.0 Å. The MolDock score was used as a scoring function to predict the best interactions between the ligand and the receptor. The anchor assistant was then generated, and the enzyme and ligands were inserted to analyze the stability of the system based on the interactions identified with the active site of the enzyme.
Enzyme information is contained in Table 12. In addition, we consider the connection ligand efficiency values (LE).
The program GOLD 5.6.2 [123] was used to perform flexible docking with the compounds selected using Molegro software. The parameters are standardized by the program, which makes the random search of the conformational space, decreasing the probability of the ligand being stuck in minimal locations. The scoring function selected was GoldScore. This function is similar to molecular mechanics with four terms: where, Shb_ext is the hydrogen protein ligand binding score, Svdw_ext is the van der Waals protein ligand score, Shb_int is the contribution of the ligand's intramolecular and hydrogen bonds; and Svdw_int is the contribution of intramolecular agglutinating stress [123]. been shown to outperform the other GOLD scoring schemes [125]. The default calculation form, which provides the most accurate docking results, was selected for all calculations.
In the standard calculation mode, by default, the GA run comprised 100 000 genetic operations on an initial population of 100 members divided into five subpopulations, and the annealing parameters of fitness function were set at 4.0 for van der Waals and 2.5 for hydrogen bonding [123,125]. GOLD gives the best poses by a genetic algorithm (GA) search strategy, and then various molecular features are encoded as a chromosome [124]. GOLD uses a genetic algorithm (GA) in which the following parameters are optimized: Dihedrals of ligand rotatable bonds; ligand ring geometries; dihedrals of protein OH groups and NH3 groups; and the mappings of the fitting points [123]. Initially, the protein is imported and the water molecules and cofactors are removed. Then hydrogens are added throughout the protein. We selected the option to detect the cavity of the active site using the template at a distance of 10 Angstrons. Then the compounds in sdf are imported and we chose the GoldScore scoring function. The GoldScore was used as an empirical scoring function because it has been shown to outperform the other GOLD scoring schemes [125]. The default calculation form, which provides the most accurate docking results, was selected for all calculations. In the standard calculation mode, by default, the GA run comprised 100 000 genetic operations on an initial population of 100 members divided into five subpopulations, and the annealing parameters of fitness function were set at 4.0 for van der Waals and 2.5 for hydrogen bonding [123,125].

Quantitative Structure-Activity Relationship (QSAR) Modeling
Knime 3.5.3 software (KNIME 3.5.3, Konstanz Information Miner Copyright, 2018, www.knime.org) [134] was used to perform analyses and generate in silico models. Given the success of our previous studies [52,83,122,[135][136][137], we chose to perform a 3D QSAR analysis for each species or genus bank. All compounds studied with a resolved chemical structure were saved in the special data file format (SDF), and imported into the Dragon 7.0 software [138], to generate descriptors.
The banks of molecules and their calculated descriptors were imported from the Dragon software, the data was divided as a "Partitioning" tool, using the "Stratified sample" option, which separated the data into training and test sets, respectively representing 80% and 20 % of the compounds. The sets were randomly selected, yet the proportions of active and inactive substances were maintained in both databases.
The Random Forest (RF) algorithm, using WEKA nodes [139] was used to build predictive models. The parameters selected for the RF algorithm were as follows for all models: Total number of forests = 250, and one seed was used to generate random numbers. Cross-validation was performed to estimate the predictive power of the models developed.
The external performances of the selected models were analyzed for sensitivity (truepositive rate or active rate), specificity (true-negative rate or inactive rate), and accuracy (general predictability). The sensitivity and specificity of the receiver's operating character curve (ROC) was used since it describes the actual performance more clearly than does accuracy-general predictability.
The models were also analyzed using the Matthews correlation coefficient (MCC), which can evaluate the model globally based on the results obtained in the confusion matrix. The MCC is a correlation coefficient for the observed and predictive binary classifications, resulting in values of between −1 and +1, where a coefficient of +1 represents a perfect forecast, 0 represents a random forecast and −1 indicates total disagreement between prediction and observation [140].
The MCC can be calculated using the following formula: where VP represents true positives, VN represents true negatives, FP represents false positives, and FN represents false negatives. The applicability domain (APD) was used to analyze the compounds in the test sets to assess whether the predictions were reliable. The APD is a theoretical chemical space that involves the model descriptors and the modeled response, estimating uncertainty when predicting a compound's activity in the training set used in the development of the model. This technique is important for verifying the reliability of QSAR models, comparing predicted values with observed values [141].
The APD is calculated using the following formula: where d and σ are respectively the Euclidean distances and the mean standard deviation for the compounds in the training set. Z is an empirical cutoff value, which in this study was set to 0.5 [142].

Toxicity
OSIRIS Property Explorer (https://www.organic-chemistry.org/prog/peo/) [143] generally allows designing chemical structures and predicting ADMET profiles [144]. In this study, we used OSIRIS to analyze toxicity. Four predictive parameters were provided by the software: Mutagenicity, tumorigenicity, reproductive effect, and irritability.

Conclusions
The insect species studied in this work, Aphis gossypii and Drosophila melanogaster, cause damages to both agriculture and man. In addition, improper use of insecticides to combat these and other pests, has resulted in accumulation of toxic effects in beneficial organisms as well, this, while harming ecosystems due to soil contamination. Through var-ied computational approaches, the present study aimed to identify potential bioinsecticides with low toxicity.
Bioinsecticide designs against essential targets, in five homology models, were built for enzymes not available in databases. The selected enzymes were AChE, nAChR, and Cht. On the Ramachandran graph, the models presented more than 96% reliability.
Predictive models were also built to predict the biological activity of diterpenes against A. gossypii and D. melanogaster. In this study, the models obtained excellent performance results, for the models, the Aphis gossypii ROC curve presented a value of 0.78 during cross-validation and for D. melanogaster a value of 0.86, both with an accuracy greater than 70%. To increase the predictive power and decrease the number of false positives generated by these models, combined analysis, based on the ligand and structure, was used. The combined analysis, based on the Random Forest and docking models, was able to identify potentially active molecules.
In this work, we apply methodologies to develop predictive QSAR models using Random Forest, homology models, molecular docking, combined ligand, and structurebased analysis, and toxicity to verify the interaction of important enzymes involved in the mechanisms of action of commercial insecticides for Aphis gossypii species and Drosophila melanogaster and evaluate the performance of 1955 diterpenes (Lamiaceae). As a result of the QSAR modeling, 11 diterpenes were selected with promising potential activity against Aphis gossypii and 27 structures against Drosophila melanogaster.
Out of 1955 diterpenes that were analyzed by combined probability (Prob Comb ), several potentially active compounds were identified: 313 were considered potentially active against (Aphis gossypii-AChE) with probability potentials ranging from 50 to 73%, 95 were active against (Aphis gossypii-AChE) with potentials between 50 to 76%, 33 were active against (Aphis gossypii-Cht) with potentials between 55 to 78%, 321 were active against (D. melanogaster-AChE) with potentials between 50 to 68%, 74 were active against (D. melanogaster-nAChR) with potentials between 50 to 58% and 5 were active against (D. melanogaster-Cht) with potentials between 50 to 62%. We also identified 15 potentially active molecules for the enzymes, AChE, nAChR, and Cht from A. gossypii, and 37 potentially active molecules for D. melanogaster for the enzymes: AChE and nAChR.
Toxicity was also evaluated for the species A. gossypii, and of the 15 compounds considered potentially active in the RF model and in docking, 11 compounds presented no toxicity for the evaluated parameters. The compounds were 1800, 1804, 1836, 1840, 1842 We also analyzed the structures of the diterpenes presenting significant results and noticed that the most active were ent-kaurane, kaurane, and clerodane. We also found that the interactions formed by these compounds were either more stable or similar to the commercialized insecticides. Overall, we conclude that the compounds bistenuifolin L (1836), and bistenuifolin K (1931), are potentially active against A. gossypii enzymes. Salvisplendin C (1086) and salvixalapadiene (1195), are potentially active against D. melanogaster. We further suggest that in species studied, these diterpenes: Bistenuifolin L (1836), bistenuifolin K (1931), together with salvisplendin C (1086), and salvixalapadiene (1195), deserve to be highlighted because of their high probability of activity and low toxicity.

Conflicts of Interest:
The authors declare no conflict of interest financial, or otherwise.