BIOINFORMATICS ANALYSIS USING HOMOLOGY MODELING TO PREDICT THE THREE-DIMENSIONAL STRUCTURE OF Spodoptera littoralis (Lepidoptera: Noctuidae) AMINOPEPTIDASE

During recent years, the number of protein sequences has increased rapidly. Although X-ray crystallography is the main method for determining protein structure, it is time-consuming and succeeds only when suitable conditions for growing crystals can be found (Wieman et al., 2004). For this reason, three main computational methods are used to predict the three-dimensional (3-D) structure of a protein from its sequence: homology modeling, threading and de novo methods (Polanski and Kimmel, 2007). Threading and de novo methods are used when no homologous structure is available, but they are not yet very accurate. Homology modeling, in contrast, is an improved method based on the fact that homologous proteins have similar 3-D structures. It is therefore highly desirable to model the 3-D structure of a protein using a structural bioinformatics approach, with homology modeling as the computational tool (Sateesh et al., 2010).

The lepidopteran worm, Spodoptera littoralis, is a polyphagous pest that affects various economically important crops. The homology modeling approach was used to predict the 3-D structure of aminopeptidase N (APN) of Spodoptera littoralis in silico in order to identify its gene function (Bravo et al., 2007; Choi et al., 2009; Singh et al., 2010). APN is one of four different kinds of insect receptors that bind Bacillus thuringiensis (Bt) toxins via an oligomerization process. The other three are cadherins, glycoproteins and alkaline phosphatase (Knight et al., 1994; Vadlamudi et al., 1995; Valaitis et al., 2001; Jurat-Fuentes and Adang, 2004).
Aminopeptidases N (APNs) are a class of exopeptidases that cleave the N-terminus of polypeptides to release single amino acids (Pigott and Ellar, 2007; Crava et al., 2010). They are members of the zinc-dependent metalloprotease M1 type-I family, which need the divalent cation zinc to activate a molecule of water, and they belong to a subfamily named gluzincins (Albiston et al., 2004; Crava et al., 2010). Furthermore, Luan and Xu (2007) and Crava et al. (2010) stated that a single zinc ion is bound by a highly conserved HEX2HX18E amino acid motif, which plays a major role in Cry toxin interactions (Yang et al., 2010). The APN receptor involved in these interactions is located in the apical membrane of the microvilli of insect midgut epithelial cells (Bravo et al., 2004, 2005; Crava et al., 2010). Luo and Adang (1996), Denolf et al. (1997) and Crava et al. (2010) reported that all the different APN genes encode proteins of approximately 1000 amino acids that undergo various forms of post-translational modification, including glycosylphosphatidylinositol (GPI) membrane anchoring and N-glycosylation, to produce mature proteins ranging from 90 to 170 kDa in size.
During sporulation, the Gram-positive bacterium Bacillus thuringiensis (Bt) forms crystalline protein inclusions that possess insecticidal activity (Bravo et al., 2005). The mode of action by which Bt toxins kill insects is not fully understood. However, Bravo et al. (2007) discussed the principal features of the effects of Bt toxins in lepidopteran insects. The crystals of Bt toxins are solubilized in the insect midgut lumen owing to its characteristic pH and reducing conditions. The soluble protoxins are then activated by midgut proteases to release the toxin fragment (Bravo et al., 2005, 2007; Pigott and Ellar, 2007).
Spodoptera litura and S. littoralis are susceptible to the Bt Cry1C toxin (Agrawal et al., 2002; Yassin et al., 2010). Moreover, Yassin et al. (2010) linked susceptibility to Cry1C to a 2882 bp gene encoding a 109 kDa APN receptor protein, which was isolated from S. littoralis and cloned into the pGEM-T Easy vector.
As is well known, the 3-D structure of a protein is essential for understanding how the protein performs its function. Protein structure can be determined at high resolution either experimentally (i.e., by X-ray crystallography or nuclear magnetic resonance, NMR) or computationally using bioinformatics tools, as described by Sasin and Bujnicki (2004). A variety of advanced homology modeling methods have been developed to provide reliable models of a protein that shares even 30% or more sequence identity with a known protein structure (Burley, 2000). The same author also reported that homology modeling offers a possibility for identifying target amino acid residues for protein engineering. Comparative homology modeling takes advantage of the structural similarities within a protein family to construct an atomic-resolution model of a protein from its amino acid sequence (Sali and Blundell, 1999). This approach relies on the observation that proteins in the same family share the same basic fold despite a low level of sequence identity (Choi et al., 2009).
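The sequence-identity threshold mentioned above can be made concrete with a small calculation. The following Python sketch computes percent identity between two sequences that have already been aligned. The function name `percent_identity`, the gap character and the choice of denominator (columns in which neither sequence has a gap) are illustrative assumptions, and the toy sequences are not taken from SlAPN.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip gapped columns (one possible convention)
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Toy example: 5 ungapped columns, 4 identical residues.
print(percent_identity("ACDE-F", "ACDQWF"))
```

Other conventions divide by the full alignment length or by the shorter sequence length, so reported identities are only comparable when the denominator is stated.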
The objective of this study was to predict the complete 3-D structure of the Spodoptera littoralis aminopeptidase N (SlAPN) protein by comparative homology modeling. The resulting 3-D model is used to explore the molecular basis of a potential reaction mechanism between this APN receptor protein and Bt Cry toxin.

Sequence identification
The APN nucleotide sequence of Spodoptera littoralis (SlAPN), 2882 bp in length, was obtained from Yassin et al. (2010). The sequence was translated using Vector NTI® Suite software, version 11 (Informax, Inc., Bethesda, MD).
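The translation step can be illustrated with a short Python sketch. It is not the Vector NTI procedure used in the study: it simply applies the standard genetic code starting at the first ATG and stopping at the first in-frame stop codon, which is a simplifying assumption. The function name `translate_orf` and the toy input are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_orf(dna: str) -> str:
    """Translate from the first ATG to the first in-frame stop codon."""
    dna = dna.upper().replace("U", "T")
    start = dna.find("ATG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(dna) - 2, 3):
        aa = CODON_TABLE.get(dna[i:i + 3], "X")  # X marks an ambiguous codon
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Toy example with flanking untranslated bases (not the SlAPN sequence).
print(translate_orf("ccATGGCCAACTAAgg"))
```

A real cDNA may contain several candidate start codons, so the choice of open reading frame would normally be checked against the expected protein length.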

Sequence retrieval, alignment and homology modeling
For protein sequence homology analysis and homology modeling, Spodoptera littoralis APN was compared with the predicted model of Spodoptera litura APN that has been deposited in the Protein Model Database (http://www.caspur.it/PMDB/) under ID PM0074654. The homology model of Spodoptera litura APN was constructed from the X-ray crystallographic structures of two proteins: tricorn interacting factor F3 of Thermoplasma acidophilum (PDB: 1z1w) and the D375N mutant of human leukotriene A4 hydrolase (PDB: 1gw6).

RESULTS AND DISCUSSION
In the current study, homology modeling and associated computer programs were used to predict the 3-D structure of the SlAPN protein. Fig. 1 gives a schematic representation of the predicted amino acid sequence of SlAPN, deduced from the nucleotide sequence previously reported by Yassin et al. (2010). The results showed that the isolated SlAPN gene encodes a putative protein of 952 amino acid residues, with a calculated molecular weight of 108.58 kDa (see Khan and Ranganathan, 2009). In silico analysis of the primary amino acid sequence identified SlAPN as a member of the aminopeptidase family. This assignment rests on the presence of motifs common to the aminopeptidase family, which include the following conserved regions: the zinc-binding/gluzincin motif, the gluzincin aminopeptidase motif and the GPI anchor signal. These results agree with Crava et al. (2010) and Tajne et al. (2012), who made similar observations on the aminopeptidase N gene family in lepidopterans.
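A molecular weight of this kind is typically obtained by summing average residue masses and adding one water molecule. The Python sketch below shows that calculation; it is an illustration under stated assumptions, not the program used in the study, and the function name `molecular_weight_kda` is hypothetical. For orientation, 108.58 kDa over 952 residues corresponds to roughly 114 Da per residue.

```python
# Average residue masses in daltons (amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528  # added once for the free N- and C-termini


def molecular_weight_kda(protein: str) -> float:
    """Average molecular weight of an unmodified polypeptide, in kDa."""
    total = WATER_MASS + sum(RESIDUE_MASS[aa] for aa in protein.upper())
    return total / 1000.0


# Toy dipeptide (glycylalanine), not the SlAPN sequence.
print(round(molecular_weight_kda("GA") * 1000, 2))
```

The sketch ignores glycosylation and the GPI anchor, both of which would raise the mass of the mature protein.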
In the present study, both the signal peptide and the GPI region of SlAPN were excluded before the 3-D model of the deduced amino acid sequence was analyzed.
The signal peptide is a sequence of 15-30 amino acids at the N-terminus of a secreted protein; it is required for transport through a membrane and is cleaved off after secretion (Lodge et al., 2007). The GPI anchor tethers the C-terminal end of the APN protein to the cell membrane by a covalent linkage, as reported by Pierleoni et al. (2008). These regions were therefore not considered to be involved in the Bt toxin interaction in this investigation. They were identified with the SignalP and GPI-SOM prediction programs at http://www.cbs.dtu.dk/services/SignalP/ and http://gpi.unibe.ch/, respectively.
The SignalP program predicted an N-terminal cleavable signal peptide, for retention in the endoplasmic reticulum, at residues 1 to 20, in line with Lodge et al. (2007). The cleavage site between the signal peptide and the mature protein lies between amino acid residues 20 and 21. Moreover, the GPI prediction program indicated the presence of a GPI anchor signal sequence at the C-terminus. This sequence consists of three small amino acids (SNS) followed by a stretch of 20 hydrophobic amino acid residues (PTIFASSFLILAAMLIQLYR). There is also a signal sequence of three amino acids (DSA) for attachment of the GPI anchor.
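The hydrophobic character of the C-terminal stretch reported above can be checked with a simple hydropathy calculation. The Python sketch below uses the Kyte-Doolittle scale; this is an illustrative check only and is not the method implemented by GPI-SOM. The function names are hypothetical.

```python
# Kyte-Doolittle hydropathy values; positive values are hydrophobic.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(segment: str) -> float:
    """Mean Kyte-Doolittle hydropathy of a peptide segment."""
    return sum(KYTE_DOOLITTLE[aa] for aa in segment.upper()) / len(segment)


def hydrophobic_fraction(segment: str) -> float:
    """Fraction of residues with a positive hydropathy value."""
    hits = sum(1 for aa in segment.upper() if KYTE_DOOLITTLE[aa] > 0)
    return hits / len(segment)


# C-terminal stretch quoted in the text.
GPI_TAIL = "PTIFASSFLILAAMLIQLYR"
print(len(GPI_TAIL), round(mean_hydropathy(GPI_TAIL), 2))
```

On this scale the segment has a clearly positive mean hydropathy, although a few of its residues (for example Q and R) are individually hydrophilic.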
The APN analysis in the present study showed the presence of three conserved regions. The first is the zinc-binding/gluzincin motif HEX2HX18E (residues 355 to 378), which is part of a typical catalytic active site in the majority of zinc-dependent metallopeptidases and is required for enzymatic function. The second is the third zinc-binding ligand, which is conserved in the sequence motif NEXFA (residues 377 to 381). The last is the gluzincin aminopeptidase motif GAMEN (residues 319 to 323), which is believed to form part of the active site and is also involved in aminopeptidase activity (Herrero et al., 2005; Kyrieleis et al., 2005).
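The three motifs described above can be expressed as regular expressions, with X standing for any residue. The following Python sketch scans a protein sequence and reports 1-based residue ranges; the function name `find_motifs` and the synthetic test sequence are assumptions made for illustration, not SlAPN data.

```python
import re

# X in the motif notation means "any residue", written as "." in a regex.
MOTIFS = {
    "HEXXHX18E": r"HE..H.{18}E",  # zinc-binding/gluzincin motif (24 residues)
    "NEXFA": r"NE.FA",            # third zinc-binding ligand
    "GAMEN": r"GAMEN",            # gluzincin aminopeptidase motif
}


def find_motifs(protein: str) -> dict:
    """Return {motif name: [(start, end), ...]} with 1-based inclusive positions."""
    protein = protein.upper()
    hits = {}
    for name, pattern in MOTIFS.items():
        hits[name] = [
            (m.start() + 1, m.end()) for m in re.finditer(pattern, protein)
        ]
    return hits


# Synthetic sequence in which the E closing HEXXHX18E is also the E of NEXFA,
# mirroring the overlap of residues 355-378 and 377-381 in the text.
toy = "GAMEN" + "A" * 5 + "HEAAH" + "A" * 17 + "NEAFA"
print(find_motifs(toy))
```

Because `re.finditer` returns non-overlapping matches per pattern, overlaps between different motifs are still reported, as in the toy sequence.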
Furthermore, the present results showed potential N-linked glycosylation sites (NXS/T) at residues 103, 377, 430, 574, 711 and 782. The amino acid sequence also contained four Cys residues, which are highly conserved among APN molecules of higher mammals (rat, rabbit, pig and human).
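Potential N-linked glycosylation sites of the NXS/T type can be located with a one-line pattern search. The Python sketch below follows the pattern exactly as written in the text (any residue at X); many tools additionally exclude proline at X, which is offered here as an optional, assumed parameter. The function name `find_sequons` is hypothetical.

```python
import re


def find_sequons(protein: str, exclude_proline: bool = False) -> list:
    """Return 1-based positions of Asn residues in N-X-S/T sequons."""
    middle = "[^P]" if exclude_proline else "."
    # A lookahead keeps the match zero-width after N, so overlapping
    # sequons such as N-N-S-T are all reported.
    pattern = re.compile(r"N(?=" + middle + r"[ST])")
    return [m.start() + 1 for m in pattern.finditer(protein.upper())]


# Toy sequence (not SlAPN): sequons start at residues 3 and 8.
print(find_sequons("AANGSAANATA"))
```

A sequence match only marks a potential site; whether a given Asn is actually glycosylated depends on the cellular context.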
Finally, the present analysis revealed a highly conserved stretch of 64 amino acid residues, from Leu129 to Pro193, in S. littoralis APN. These residues are shared with 11 APN family proteins and are believed to be important for Cry1Aa toxin binding (Nakanishi et al., 1999). After binding to a receptor in the insect midgut, the toxin undergoes a conformational change that leads to pore formation (Sanjay et al., 2001).
Based on computational alignment, a theoretical 3-D model of S. littoralis APN was obtained in this study. The model comprises 814 amino acid (aa) residues, corresponding to residues 58-871 of the primary structure; the N-terminal cleavable signal peptide (residues 1-20) and the C-terminal GPI modification site (residues 930-952) were excluded from the predicted 3-D model, as discussed above. The SlAPN model contains four structural domains, running from the N-terminal domain I to the C-terminal domain IV over the regions Asn58-Ile266, Ser267-Gly506, Asn507-Leu581 and Ser582-Ala871, respectively (Fig. 2). The overall dimensions of the model are 91 Å × 55 Å × 65 Å, and the domains together form a hook-like structure.
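The domain boundaries given above can be checked arithmetically: the four ranges are contiguous and together account for the 814 modeled residues. The Python sketch below encodes the boundaries as a small lookup table; the names `DOMAINS`, `domain_lengths` and `domain_of` are illustrative assumptions.

```python
# Domain boundaries (1-based, inclusive) as stated in the text.
DOMAINS = {
    "I": (58, 266),
    "II": (267, 506),
    "III": (507, 581),
    "IV": (582, 871),
}


def domain_lengths(domains: dict) -> dict:
    """Number of residues in each domain."""
    return {name: end - start + 1 for name, (start, end) in domains.items()}


def domain_of(residue: int, domains: dict = DOMAINS):
    """Name of the domain containing a residue, or None if outside the model."""
    for name, (start, end) in domains.items():
        if start <= residue <= end:
            return name
    return None


lengths = domain_lengths(DOMAINS)
print(lengths, sum(lengths.values()))  # domain sizes and their total
print(domain_of(355))                  # first residue of the zinc-binding motif
```

The same lookup can be used to assign any annotated residue, such as a glycosylation site or a conserved Cys, to its domain.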
The N-terminal domain I contains the highly conserved Bt Cry1 toxin-binding region (64 amino acid residues from Leu129 to Pro193). Domain II is the most important one, as it contains the highly conserved regions of the aminopeptidase family, including the zinc-binding/gluzincin motif, the third zinc-binding ligand and the gluzincin aminopeptidase motif. N-linked glycosylation sites are located in all four domains: domain I (Asn103-Thr105), domain II (Asn377-Ser379 and Asn430-Thr432), domain III (Asn574-Thr576) and domain IV (Asn711-Ser713 and Asn782-Ser784). The four Cys residues (728, 735, 763 and 799) that are highly conserved among APN molecules of higher eukaryotes, as reported by Agrawal et al. (2002), are located in domain IV. Of the 814 aa of the SlAPN model, 312 aa are in α-helices, 124 aa in β-sheets, 378 aa in turns and 0 aa in random coil.
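The secondary-structure counts reported above can be converted into percentages of the 814 modeled residues. The short Python sketch below does this; the dictionary and function names are illustrative assumptions.

```python
# Residue counts per secondary-structure class, as reported in the text.
SECONDARY_STRUCTURE = {
    "alpha-helix": 312,
    "beta-sheet": 124,
    "turn": 378,
    "random coil": 0,
}


def composition_percent(counts: dict) -> dict:
    """Percentage of residues in each class, rounded to one decimal place."""
    total = sum(counts.values())
    return {name: round(100.0 * n / total, 1) for name, n in counts.items()}


print(sum(SECONDARY_STRUCTURE.values()))
print(composition_percent(SECONDARY_STRUCTURE))
```

The counts sum to 814, matching the size of the model, so the percentages describe the whole modeled region.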
Domain I of the SlAPN protein contains the highly conserved Bt Cry1 toxin-binding region (64 amino acid residues from Leu129 to Pro193). This region begins shortly before strand S4 and continues to Pro193, which is located in the loop after helix H1. The Cry1 toxin-binding regions have many conserved amino acid residues that may recognize and bind to a common structure in Cry1 toxins. Most of those amino acids are also conserved in lepidopteran APNs, suggesting that the toxin might bind to this region of APN in insects (Nakanishi et al., 2002). This suggestion is supported by the structure of the region, which is formed by several β-strands and an α-helix together with large loops that provide flexibility for binding to Cry1 protein, and by its position in the part of the molecule most distal to the membrane anchor segment, at the bottom of the saddle-like structure. Together, these features provide the stability of the conserved region that is necessary for toxin binding.
A closer look at this domain also reveals that the oligosaccharide-binding site at residue 103 is located within the Bt Cry1 toxin-binding region, which possibly allows an initial interaction between toxin and receptor followed by irreversible binding at the APN recognition site. Binding of the toxin to the receptor is mediated by a two-step mechanism involving initial reversible binding followed by irreversible binding and membrane insertion in the midgut (Saraswathy and Kumar, 2004). In more detail, Chen et al. (2005) proposed that the Cry1-APN interaction has two steps: carbohydrate recognition and irreversible protein-protein interaction.
Moreover, Pigott and Ellar (2007) regarded glycosylation as an important determinant of Cry1A binding. Knight et al. (2004) also considered the carbohydrates attached to the 120 kDa APN of Manduca sexta (tobacco hornworm) to be epitope sites for the Cry1Ac toxin. Likewise, Ning et al. (2010) found that binding between Cry1Ac and HaALPs depended on the presence of N-linked oligosaccharides on these proteins, since digestion with N-glycosidase F eliminated toxin binding.
In domain II, many conserved structures were observed in this investigation, such as the conserved zinc-binding/gluzincin motif (HEXXHX18E) in helices H4 and H5, the third zinc-binding ligand (NEXFA) in helix H5, and the gluzincin aminopeptidase motif (GAMEN) in β-strand S12. Hence, Cry1 proteins can exert toxic activity in a broad spectrum of pest insects by recognizing these conserved structures (Nakanishi et al., 1999; Agrawal et al., 2002; Herrero et al., 2005). Conserved residues also appear in domains II and III of Cry toxins, which are implicated in receptor recognition and pore formation (Shinkawa et al., 1999). Thus, studying the conserved regions in both toxin and receptor could help to clarify the interaction of this enzyme with Bt toxins. However, the main mechanism of resistance to Cry toxins is mutation that affects the toxin-receptor interaction (Fernández et al., 2008).
Comparison of the APN sequence of the studied insect with vertebrate and fungal aminopeptidases showed that the most striking similarity was around the zinc-binding motif, suggesting a role for the zinc ion in enzyme catalysis. The catalytic role of domain II in binding to different substrates has been demonstrated in several zinc-metallopeptidases, including the aminopeptidase of the beetle Tenebrio molitor (Cristofoletti and Terra, 2000).
The C-terminal superhelix domain IV is shown in Fig. 3D. The figure shows that each of its two helical modules is arranged in two layers. The first module is composed of eight parallel helices: four in the outer layer (H11, H13, H15 and H17) and four in the inner layer (H12, H14, H16 and H18). The second module contains ten helices: seven in the outer layer (H19, H21, H23, H24, H25, H27 and H28) and the remaining three in the inner layer (H20, H22 and H26).
Evidence that APNs function as Cry1 receptors is increasing. Most of this evidence is based on in vitro binding and membrane reconstitution experiments, and on in vivo expression in transgenic Drosophila (Banks et al., 2003). Among the in vivo evidence, silencing of the APN gene resulted in elevated resistance of Spodoptera litura larvae to Cry1C protein, thereby demonstrating a functional role for this protein in Cry protein-mediated toxicity (Rajagopal et al., 2002). Similarly, Sf21 cells expressing HaAPN1 from the cotton bollworm (Helicoverpa armigera) showed increased toxin sensitivity (Sivakumar et al., 2007). The same authors also found that silencing of HaAPN1 in H. armigera by dsRNA resulted in decreased larval susceptibility to Cry1Ac toxin. Moreover, findings reported by Herrero et al. (2005) suggested that the lack of APN production in laboratory-selected beet armyworm (Spodoptera exigua) was correlated with resistance to Cry1C toxin. The strongest evidence supporting the assumption that APN proteins play an important role in Cry toxicity comes from the findings of Zhang et al. (2009), who postulated that toxin-resistant H. armigera had a mutation in the APN gene. These findings are in accordance with those of Yang et al. (2010), who stated that knocking down any one of the three APNs in the sugarcane borer (Diatraea saccharalis) resulted in a decrease in Cry1Ab susceptibility.
As explained above, Cry1 toxins are toxic to several lepidopteran insects, and APNs from several insect species have been identified as Cry1 toxin receptors. To understand the interaction between them, it is important to identify the binding sites on both the Cry toxin and the receptor. In this connection, Atsumi et al. (2005) hypothesized that the receptor-binding sites on Cry1 toxins have two basic features. The first is a highly conserved structure, because Cry1 toxins have similar primary sequences and three-dimensional structures and can recognize similar APNs in the midguts of several lepidopteran insects. The second is a non-conserved structure, because Cry1 toxins also exhibit highly specific insecticidal activity and can distinguish host species within the lepidopteran spectrum.
From all the foregoing results, it can be concluded that the three-dimensional structures of APNs of economically important pest insects should be determined and exploited in biological control based on Cry toxins. The results provide insights into the functional properties of APN and improve the understanding of receptor-toxin interactions, which will be valuable for the production of Cry toxin proteins with greater activity.

SUMMARY
Insect pests are the major cause of damage to commercially important agricultural crops. The continuous application of synthetic pesticides has led to severe insect resistance and caused irreversible damage to the environment. Bacillus thuringiensis (Bt) has emerged as a valuable biological alternative for pest control. The midgut aminopeptidase N (APN) of pest insects is a receptor for Bacillus thuringiensis Cry1 toxin. A 108.58 kDa APN has been characterized in Spodoptera littoralis. In the present in silico study, a homology model of SlAPN was constructed using the Swiss-Model protein modeling server. The study showed that the SlAPN three-dimensional structure has four structural domains. Domain I of the receptor is the region that recognizes Cry1 toxins, and part of this domain may be especially important in this role. Domain II has functions in the Cry1 protein-APN interaction. Domain III has a sandwich topology and domain IV is a superhelix. The present data help in developing a roadmap for the design and synthesis of novel Cry toxins with improved toxic activity based on the conserved structures of APN, which will contribute to the management of insect resistance in the field.