In silico analysis and characterization of fresh water fish ATPases and homology modelling

ATPases are known to be crucial in many biological activities of organisms. In this study, the physicochemical properties and structural models of fish ATPase proteins were analysed using an in silico approach. ATPase proteins from five fish species were used: gold fish (Carassius auratus auratus), zebra fish (Hypancistrus zebra), white fish (Coregonus autumnalis), grass carp (Ctenopharyngodon idella) and Anabas testudineus (koi). The computed physicochemical characteristics were molecular weight (25045.58-25148.57 Da), theoretical isoelectric point (9.30-9.97), extinction coefficient (26470-34950), aliphatic index (147.31-150.35), instability index (32.84-42.67), total number of negatively and positively charged residues (5/7-6/8), and grand average of hydropathicity (1.014-1.151). All proteins were classified as transmembrane proteins. In secondary structure prediction, all proteins were composed predominantly of random coils, followed by extended strands, alpha helices and beta turns. The three-dimensional structures of the proteins were predicted and verified as good structures. All model structures were evaluated as acceptable and reliable based on structural evaluation and stereochemical analysis.
Research Article


Introduction
ATPases are enzymes that hydrolyze ATP, especially to ADP and inorganic phosphate; they are also called adenosine triphosphatases. Transmembrane ATPases are membrane-bound enzyme complexes/ion transporters that use ATP hydrolysis to drive the transport of protons across a membrane. Some transmembrane ATPases also work in reverse, harnessing the energy from a proton gradient and using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. There are several different types of transmembrane ATPases, which can differ in function (ATP hydrolysis and/or synthesis), structure (e.g., F-, V- and A-ATPases, which contain rotary motors) and in the type of ions they transport. The different types include the following. F-ATPases (ATP synthases, F1F0-ATPases) are found in mitochondria, chloroplasts and bacterial plasma membranes, where they are the prime producers of ATP, using the proton gradient generated by oxidative phosphorylation (mitochondria) or photosynthesis (chloroplasts). V-ATPases (V1V0-ATPases) are primarily found in eukaryotes, where they function as proton pumps that acidify intracellular compartments and, in some cases, transport protons across the plasma membrane; they are also found in bacteria. A-ATPases (A1A0-ATPases) are found in Archaea and function like F-ATPases, though with respect to their structure and some inhibitor responses they are more closely related to the V-ATPases. P-ATPases (E1E2-ATPases) are found in bacteria and in eukaryotic plasma membranes and organelles, and function to transport a variety of different ions across membranes. E-ATPases are cell-surface enzymes that hydrolyse a range of NTPs, including extracellular ATP.
F-ATPases (also known as ATP synthase, F1F0-ATPase, or H(+)-transporting two-sector ATPase) are composed of two linked complexes: the F1 ATPase complex is the catalytic core and is composed of 5 subunits (alpha, beta, gamma, delta, epsilon), while the F0 ATPase complex is the membrane-embedded proton channel that is composed of at least 3 subunits (A-C), with additional subunits in mitochondria. Both the F1 and F0 complexes are rotary motors that are coupled back-to-back. In the F1 complex, the central gamma subunit forms the rotor inside the cylinder made of the alpha(3)beta(3) subunits, while in the F0 complex, the ring-shaped C subunits form the rotor. These ATPases can also work in reverse in bacteria, hydrolyzing ATP to create a proton gradient.

The ATPase protein sequences of the fish species were retrieved from the NCBI Protein database (http://www.ncbi.nlm.nih.gov) in FASTA format for analysis. The sequences obtained were then used for complete protein analysis (structural and functional annotation) and for model building using a comparative modelling approach. Complete primary structure analysis was performed using ExPASy's ProtParam server (http://expasy.org/cgi-bin/protparam). SOPMA was used for secondary structure prediction of the protein sequences.
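Sequences retrieved in FASTA format have to be read into memory before any downstream analysis. The following is a minimal sketch of that step, assuming plain FASTA text as input; the function name `parse_fasta` and the toy record are hypothetical illustrations and do not correspond to a real NCBI entry.

```python
from typing import Dict


def parse_fasta(text: str) -> Dict[str, str]:
    """Parse FASTA-formatted text into a {header: sequence} dictionary."""
    records: Dict[str, str] = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:]  # drop the leading '>'
            records[header] = ""
        elif header is not None:
            # Sequence lines may be wrapped; concatenate them.
            records[header] += line.upper()
    return records


# Hypothetical toy record (not a real database entry).
example = ">toy_atpase hypothetical example\nMPQLNPAPW\nFAILVFSW\n"
records = parse_fasta(example)
print(records)
```

Each value in `records` is a single unwrapped sequence string that can be passed on to composition or alignment routines.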

Sequence alignment
Multiple sequence alignment of the mitochondrial ATPase sequences from the different fish species was performed using the ClustalW2 server (http://www.ebi.ac.uk/tools/msa/clustalW2/). A neighbour-joining phylogenetic analysis of the protein sequences was also generated using Clustal Omega [1][2][3]. ClustalW2 is a server for multiple sequence alignment that can also be used for phylogenetic tree analysis. PHYLIP and MEGA are other tools available for phylogenetic tree analysis.
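Sequence similarity in an alignment is often summarised as pairwise percent identity. The sketch below shows one common way to compute it from two already aligned sequences, assuming gaps are written as `-` and that columns containing a gap are excluded from the comparison. It is not the ClustalW2 or Clustal Omega implementation, and the toy sequences are hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue  # ignore columns containing a gap
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Hypothetical toy alignment rows, for illustration only.
row_1 = "MPQLNPAP-W"
row_2 = "MPQLDPAPFW"
identity = percent_identity(row_1, row_2)
distance = 1.0 - identity / 100.0  # a simple distance usable by tree-building methods
print(round(identity, 2), round(distance, 3))
```

A matrix of such distances for all sequence pairs is the kind of input a neighbour-joining method works from, although alignment servers typically apply their own distance corrections.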

Physicochemical characterization
Physicochemical properties of the investigated proteins, such as molecular weight (Mol. wt.), amino acid composition, theoretical isoelectric point (pI), total number of positive (Arg+Lys) and negative (Asp+Glu) residues (+R/-R), extinction coefficient (EC), instability index (II), aliphatic index (AI), and grand average of hydropathicity (GRAVY), were analysed using ExPASy's ProtParam server (http://web.expasy.org/protparam/). Domain structures were examined with the Simple Modular Architecture Research Tool (SMART) program (http://smart.embl-heidelberg.de/), and primary structure analysis was carried out with ExPASy's ProtParam server.
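Several of these parameters are simple functions of amino acid composition. As a rough sketch, GRAVY is the mean Kyte-Doolittle hydropathy value over all residues, and the aliphatic index is AI = X(Ala) + 2.9 X(Val) + 3.9 (X(Ile) + X(Leu)), where X is the mole percent of each residue. The function names below are hypothetical, and the code assumes sequences contain only the 20 standard one-letter codes.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(seq: str) -> float:
    """Grand average of hydropathicity: mean hydropathy per residue."""
    return sum(KD[aa] for aa in seq) / len(seq)


def aliphatic_index(seq: str) -> float:
    """Aliphatic index from the mole percent of Ala, Val, Ile and Leu."""
    n = len(seq)

    def pct(aa: str) -> float:
        return 100.0 * seq.count(aa) / n

    return pct("A") + 2.9 * pct("V") + 3.9 * (pct("I") + pct("L"))


def charged_counts(seq: str) -> tuple:
    """Return (negative, positive) counts: (Asp+Glu, Arg+Lys)."""
    negative = seq.count("D") + seq.count("E")
    positive = seq.count("R") + seq.count("K")
    return negative, positive


toy = "AVIL"  # hypothetical toy peptide
print(gravy(toy), aliphatic_index(toy), charged_counts("DEKRK"))
```

A positive GRAVY value indicates an overall hydrophobic sequence, and a higher aliphatic index indicates a larger relative volume occupied by aliphatic side chains.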

Functional analysis
The SOSUI server (Hirokawa et al.) was used to identify the type of each protein.
CYS_REC (http://linux1.softberry.com) was used to predict the presence or absence of disulphide bonds and their bonding pattern, which are crucial in defining the functional linkage and the stability of a protein.

Protein structure prediction
The secondary structure of the proteins was predicted using the SOPMA server (http://npsa-prabi.ibcp.fr/cgi-bin/npsa_automat.pl?page=/NPSA/npsa_sopma.html) with the default parameters (window width: 17; similarity threshold: 8; number of states: 4) [4,5]. Homology modelling was carried out using the SWISS-MODEL server (http://swissmodel.expasy.org/) [6,7]. SWISS-MODEL is a server for 3D structure prediction that also performs template selection; a template is selected based on maximum similarity or identity with the query sequence. The quality and accuracy of the predicted models were validated using RAMPAGE for Ramachandran plot analysis [10,11]. The best models were selected based on the number of residues in the most favoured, additional allowed, generously allowed and disallowed regions and on the overall G-factor: a model was expected to have over 90% of residues in the most favoured regions and an overall G-factor above the cut-off value (> -0.5) [11,12]. RaptorX is another server available for 3D structure prediction, and PSIPRED and GOR IV are free servers for secondary structure analysis. Other tools available for protein structure prediction include Phyre2, HHpred, MODELLER, CPHmodels, LOMETS, ModBase and Robetta.
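The model selection rule described above (over 90% of residues in the most favoured regions and an overall G-factor above -0.5) can be written as a small check. This is only a sketch under the stated thresholds; the class name, field names and the example values are hypothetical and are not output of RAMPAGE or PROCHECK.

```python
from dataclasses import dataclass


@dataclass
class RamachandranSummary:
    """Percentages of residues per Ramachandran region plus the overall G-factor."""
    most_favoured: float
    additional_allowed: float
    generously_allowed: float
    disallowed: float
    overall_g_factor: float


def passes_criteria(summary: RamachandranSummary,
                    min_favoured: float = 90.0,
                    min_g_factor: float = -0.5) -> bool:
    """True if the model exceeds both the favoured-region and G-factor cut-offs."""
    return (summary.most_favoured > min_favoured
            and summary.overall_g_factor > min_g_factor)


# Hypothetical summary values for illustration only.
model = RamachandranSummary(92.0, 6.0, 1.0, 1.0, -0.2)
print(passes_criteria(model))
```

Keeping the thresholds as parameters makes it straightforward to report how many models pass under a stricter or more lenient cut-off.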

Physico-chemical characterisation
The amino acid composition of the ATPase (ATP synthase F0) proteins and their physicochemical characteristics were computed using ExPASy's ProtParam tool (Table 1). The isoelectric point (pI) of the proteins ranged from 9.30 to 9.97 (more than 7), implying the basic character of these proteins. The pI values are useful for protein purification by isoelectric focusing on a polyacrylamide gel. The total numbers of negatively (Asp+Glu) and positively (Arg+Lys) charged residues (-R/+R) ranged from 5 to 6 and from 7 to 8, respectively.

The extinction coefficient (EC) of the proteins measured at 280 nm was in a range of 31970 to 34950 M-1 cm-1 (assuming all pairs of cysteine residues form cystines). The EC at 280 nm reflects the content of residues absorbing at this wavelength (tryptophan, tyrosine and cystine) and is useful for quantifying the protein concentration in a volume of solution.

The instability index (II) estimates the stability of a protein in a test tube; a protein is regarded as stable when its II value is smaller than 40 and as unstable when the value is above 40. The II values of the proteins were in a range of 32.84-42.67, showing that the zebra fish protein is unstable (II > 40) and the rest are stable (II < 40) [13][14][15][16][17][18][19][20][21][22]. The aliphatic index (AI) is a parameter for estimating the thermal stability of a protein and is directly associated with the mole fraction of aliphatic side chains (alanine, isoleucine, leucine and valine) in the protein. In this study, the high aliphatic index values of the proteins (147.31-150.35) imply high thermostability. A low grand average of hydropathicity (GRAVY) is regarded as a measure of the stability of globular proteins at high temperature; the GRAVY values obtained here (1.014-1.151) are positive, consistent with hydrophobic proteins.
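The two threshold-style quantities in this paragraph can be illustrated with a short sketch. The extinction coefficient at 280 nm is estimated here with the commonly used per-residue values (Tyr 1490, Trp 5500, cystine 125 M-1 cm-1), and the stability class follows the II cut-off of 40 given above. The function names are hypothetical, and treating II exactly equal to 40 as unstable is an assumption of this sketch, since the text only defines "smaller than 40" and "above 40".

```python
def extinction_coefficient(seq: str, cystines_formed: bool = True) -> int:
    """Estimated molar extinction coefficient at 280 nm (M^-1 cm^-1)."""
    n_tyr = seq.count("Y")
    n_trp = seq.count("W")
    # Each cystine needs two cysteines; an unpaired cysteine does not contribute.
    n_cystine = seq.count("C") // 2 if cystines_formed else 0
    return 1490 * n_tyr + 5500 * n_trp + 125 * n_cystine


def stability_class(instability_index: float) -> str:
    """Classify by instability index: below 40 stable, otherwise unstable (assumed at 40)."""
    return "stable" if instability_index < 40 else "unstable"


# Hypothetical composition: 5 Trp and 5 Tyr, no Cys. This is one composition
# that yields 34950, not a statement about the actual sequences studied.
print(extinction_coefficient("W" * 5 + "Y" * 5))
print(stability_class(32.84), stability_class(42.67))
```

For a sequence without cysteine, the `cystines_formed` flag has no effect, so both ProtParam-style estimates coincide.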
All proteins were classified as transmembrane proteins by the SOSUI program. The transmembrane regions predicted from the protein sequences are shown in Table 2. The amino acid sequences of these membrane proteins have 6 transmembrane helices, except that of Coregonus autumnalis, which has 5 transmembrane regions.
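To give an intuition for how transmembrane candidates can be located from sequence alone, the sketch below uses a classic Kyte-Doolittle sliding-window hydropathy scan (window of 19 residues, mean hydropathy above 1.6). This is a swapped-in, simpler technique and not the SOSUI algorithm; the function name and the toy sequence are hypothetical.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobic_segments(seq: str, window: int = 19, threshold: float = 1.6):
    """Return merged (start, end) ranges, 0-based and end-exclusive, covered by
    windows whose mean hydropathy exceeds the threshold."""
    segments = []
    for start in range(len(seq) - window + 1):
        mean = sum(KD[aa] for aa in seq[start:start + window]) / window
        if mean > threshold:
            end = start + window
            if segments and start <= segments[-1][1]:
                # Overlapping or adjacent window: extend the previous segment.
                segments[-1] = (segments[-1][0], max(segments[-1][1], end))
            else:
                segments.append((start, end))
    return segments


# Hypothetical toy sequence: polar flanks around a 25-residue hydrophobic core.
toy = "DEKRS" * 4 + "LIVLA" * 5 + "DEKRS" * 4
print(hydrophobic_segments(toy))
```

Because every residue of a qualifying window is reported, segment boundaries are approximate and tend to extend a few residues into the flanking regions.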

Sequence alignment
A multiple amino acid sequence alignment of the proteins was performed [1] (Figure 1A,B). The result indicated a high amino acid sequence similarity between the ATPases of the five fish species studied; a greater similarity was observed between Coregonus and Hypancistrus. A neighbour-joining phylogenetic tree was constructed using Clustal Omega.

Functional analysis result
Disulphide bonds are significant in protein folding and stability; they are generated between the thiol groups of cysteine residues by the oxidative folding process. In this study, the cysteine residues in the proteins were determined using the CYS_REC server. The results revealed that none of these proteins contains cysteine residues, and no probable patterns of cysteine pairs were found (Table 3), suggesting that none of the proteins contains disulphide bonds [4].
No cysteine residues were detected by CYS_REC in any of the five fish species. Cysteine residues and disulphide bonds are important in determining the thermostability of proteins. The results indicated the probable absence of disulphide bonds in these proteins.

Protein structure prediction and validation
The secondary structure of the ATPase proteins from the fish species was predicted using SOPMA (Table 4). The results showed that all proteins except the ATPase from grass carp contain alpha helix as the predominant secondary structure element, followed by random coil, extended strand and beta turn.
The three-dimensional structures of the fish ATPase proteins were modelled based on sequence and structural similarity to different available protein structure templates from the PDB (Table 4). The final structures of the models, represented with the Swiss-PdbViewer, are shown in Figure 2. The validation of the predicted models by RAMPAGE Ramachandran plots is presented in Table 5 [4].
The stereochemical quality and accuracy of the proposed models were examined by PROCHECK analysis (Table 4). The predicted models for the ATPases of grass carp, zebra fish, Anabas testudineus, Carassius auratus and Coregonus autumnalis had over 88% of residues in the most favoured regions (88.8%, 88.4%, 88.4%, 88.5% and 88.6%, respectively). A further 7.7%, 7.6%, 7.6%, 7.7% and 7.7% of residues were found in the additional or generously allowed regions, and 3.5%, 4.0%, 4.0%, 3.8% and 3.8% in the disallowed regions. With the most favoured and allowed regions combined, these homology models were considered to be of acceptable quality. All protein models had fewer than 8% of residues in the generously allowed regions, indicating that they may be close to good-quality models. The QMEAN score values were in the range (-5.68-5.34), and the models were accepted.

Conclusion
In this study, five ATPase proteins of freshwater fish species were selected for characterisation using computational tools. The physicochemical and functional characteristics of the proteins were investigated. All proteins were classified as transmembrane proteins; alpha helices and random coils, in approximately similar numbers, were computed to be the dominant elements, followed by extended strands, in the secondary structure of all proteins. The three-dimensional models of the proteins were predicted and validated by Ramachandran plot analysis; the results suggested that all proposed models are reliable and valid. This study provides information on the physicochemical characteristics, structural properties and molecular functions of fish ATPases, which is useful for further studies on their specific functions.