Computational Comparative Homology Based 3D-structure Modelling of the HSp70 Protein from GWD

GWD: Guinea Worm Disease; HLA-DR: Human Leukocyte Antigen-Antigen D Related; MHC: Major Histocompatibility Complex; HSp70: Heat Shock Protein 70; BLAST: Basic Local Alignment Search Tool; HHBlits: Lightning-Fast Iterative Protein Sequence Searching by HMM-HMM Alignment; SMTL: SWISS-MODEL Template Library; QMEAN: Qualitative Model Energy Analysis; PSVS: The Protein Structure Validation Software Suite; PDB: Protein Data Bank


Introduction
Heat shock proteins participate in a broad range of protein folding processes and control the activity of regulatory proteins. Nearly all cells react in a similar manner to abrupt changes in the external environment such as heat, chemicals (amino acid analogues, ethanol, arsenite, several heavy metals, and certain inhibitors of mitochondrial function), cold [1][2], UV light [3], and wound healing or tissue remodeling [4], which in turn result in rapid changes in gene expression levels. The 70-kDa class of stress proteins (HSP70) draws major attention because of its diverse range of cellular functions. In most species this protein is purely stress-inducible, the notable exception being primate cells [5]. In human cells, HSP70 is cell cycle regulated [6]. HSPs are abundant in virtually all living organisms, from bacteria to humans. HSP70 proteins are named according to their molecular weight. Hsp70s (70 kDa) assist in a wide range of protein folding operations, including the folding and assembly of newly synthesized proteins, the refolding of misfolded and aggregated proteins, and the control of regulatory protein activity [7][8][9][10][11][12][13]. These proteins also perform housekeeping functions in the cell. They interact with hydrophobic peptide segments in an ATP-controlled fashion. The protein has two distinct functional regions: a peptide binding domain (PBD) and the amino-terminal ATPase domain (ABD). The PBD contains a groove with affinity for neutral and hydrophobic amino acid residues, whereas the C-terminal region is rich in alpha-helical structure and behaves as a 'lid' for the substrate binding domain. When ATP is bound to the protein, the lid remains open and peptides bind and release rapidly; when ADP is bound, the lid remains closed and peptides are held tightly in the substrate binding domain.
In malignant melanoma this protein is over-expressed [14], whereas under-expression is seen in renal cell cancer [15]. Extracellular Hsp provides a potent route for sending danger signals to the host immune system in response to an infection. Hsp-peptide complexes are also involved in MHC class I- and class II-restricted antigen presentation and enable enhanced activation of T cells. The specific interaction of mammalian cytosolic Hsp70 molecules with HLA-DR molecules suggests that Hsp70 may transfer the bound antigenic peptides in the ternary complex into the binding groove of HLA-DR molecules. The study by Rohrer et al. suggests that the interaction of Hsp70 and HLA-DR takes place outside the peptide binding groove and is assigned to the ATPase domain of the Hsp70 molecule, which enables enhanced presentation of peptides to the antigen-presenting cells and improved T cell proliferation [16]. Computational methods are a reliable way to turn an amino acid sequence into a 3-D structure model [17], and a wide range of such approaches is routinely applied in many biological applications. Homology modeling is based on the reasonable assumption that two homologous proteins share very similar structures. The name homology modelling conveys exactly what the procedure is about: modeling a structure using a homologous structure as template (which is usually an exact X-ray or NMR-determined

The protein model statistical assessment
Statistical assessment of the protein model was performed with the Protein Structure Validation Software suite (PSVS). PSVS is applicable to protein structures generated by NMR, X-ray crystallography and homology modelling. It incorporates analyses from several widely used structure quality evaluation tools, including RPF, PROCHECK, MolProbity, Verify3D, ProsaII, the PDB validation software and other structure validation tools [23]. PSVS provides standard constraint analyses, statistics on the goodness-of-fit between structures and experimental data, Z-score values and knowledge-based structure quality scores in a standardized format suitable for database integration.

Visualization of 3D model
The generated model was visualized in 3D using the RasMol molecular viewer. Information on the chains, atoms, groups and bonds of the 3D model was extracted from RasMol and analysed [24].

Results and Interpretation
A complete assessment of the resulting 3-D protein structure model was performed and analyzed, and is described under the following headings.
Template validation: For each identified template, the template's quality was predicted from features of the target-template alignment, as shown in Table 2. The templates with the highest quality were selected for model building. Template 3c7n.1.B was validated for the generated Hsp70 model, and structure analysis was performed on the template chain. The sequence alignment of the target protein sequence with chain B of template 3c7n.1, produced by HHblits, shows 88.45% identity, as shown in Figure 1.
Model analysis: The HSP70 protein model structure was built from the target-template alignment. Template 3c7n.1 had the highest quality and the best alignment with the target (Table 3) and was therefore selected for modelling, as shown in Figure 2.
The PSVS results provide the stereochemical properties of the model. The molecular weight of the model is 131894. The RMS deviation of the bond angles is 1.4°, the RMS deviation of the bond lengths is 0.011 Å, and the number of close contacts (within 2.2 Å) is 0. (Scores are expressed with respect to the mean and standard deviation for a set of 252 X-ray structures of <500 residues, with resolution ≤ 1.80 Å, R-factor ≤ 0.25 and R-free ≤ 0.28; a positive value indicates a better structure.)
In homology modeling it is important that the modeler finds a template structure with the highest possible sequence identity. If the identity between the input sequence and the template structure falls below 40%, the output model is likely to be implausible. Earlier studies show that protein structures are more conserved than protein sequences among homologues [18].
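The sequence-identity criterion can be illustrated with a minimal sketch. It assumes a pairwise alignment given as two equal-length gapped strings and counts identity over aligned (non-gap) columns only; other definitions of percent identity exist, and the one used by the modelling server may differ. The function names and the toy sequences are hypothetical; the 40% threshold is the one quoted in the text.

```python
def percent_identity(aligned_target: str, aligned_template: str) -> float:
    """Percent identity over the columns where both sequences have a residue.

    Assumption: gaps are written as '-' and both strings come from the
    same pairwise alignment (equal length).
    """
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have equal length")
    aligned = 0
    identical = 0
    for a, b in zip(aligned_target, aligned_template):
        if a == "-" or b == "-":
            continue  # gap column: not counted
        aligned += 1
        if a == b:
            identical += 1
    if aligned == 0:
        return 0.0
    return 100.0 * identical / aligned


def template_is_usable(identity: float, threshold: float = 40.0) -> bool:
    """Apply the 40% identity rule of thumb quoted in the text."""
    return identity >= threshold


if __name__ == "__main__":
    # Toy alignment (hypothetical sequences, not the GWD Hsp70 data).
    ident = percent_identity("MKV-LA", "MKIQLA")
    print(ident, template_is_usable(ident))
```

Under this rule the reported target-template identity would pass the threshold comfortably, which is consistent with the choice of 3c7n.1 as template.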

Molecular modeling of protein
Model building was based on the target-template alignment using ProMod-II. Templates were selected on the basis of maximum sequence similarity. Template 3c7n.1 had the maximum sequence similarity and was taken as the template, and the target was aligned against it. Coordinates that are conserved between the target and the template were copied from the template to the model. Insertions and deletions were remodelled using a fragment library, and the side chains were then rebuilt. Finally, the geometry of the resulting model was regularized using a force field. The model obtained was visualized with DeepView. The global and per-residue model quality was assessed using the QMEAN4 scoring function [22].
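The coordinate-copying step can be sketched as follows. This is a simplified illustration, not the ProMod-II implementation: it assumes one representative coordinate per template residue (for example the C-alpha atom), copies it for every aligned column, and marks target insertions as `None` so that a later step (fragment-library remodelling in the text) can fill them in. All names and the toy data are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

Coord = Tuple[float, float, float]


def copy_aligned_coordinates(
    aligned_target: str,
    aligned_template: str,
    template_coords: Sequence[Coord],
) -> List[Optional[Coord]]:
    """Return one entry per target residue.

    Aligned residues receive the template coordinate; target residues
    facing a template gap (insertions) receive None. Template residues
    facing a target gap (deletions) are skipped.
    """
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have equal length")
    n_template = sum(1 for c in aligned_template if c != "-")
    if n_template != len(template_coords):
        raise ValueError("one coordinate per template residue is required")

    model: List[Optional[Coord]] = []
    t_idx = 0  # position in template_coords
    for a, b in zip(aligned_target, aligned_template):
        coord: Optional[Coord] = None
        if b != "-":
            coord = template_coords[t_idx]
            t_idx += 1
        if a != "-":
            model.append(coord)
    return model


def residues_to_rebuild(model: Sequence[Optional[Coord]]) -> List[int]:
    """Indices of target residues that still need coordinates."""
    return [i for i, c in enumerate(model) if c is None]


if __name__ == "__main__":
    coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
    model = copy_aligned_coordinates("AC-DE", "A-GDE", coords)
    print(model, residues_to_rebuild(model))
```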

Ligand modelling
Ligands present in the template structure (SO4, BEF-ADP, ADP) were transferred by homology to the model, provided that the ligand did not clash with the target protein and that the residues in contact with the ligand were conserved between the target and the template.
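The two ligand-transfer conditions named above (no clash with the protein, conserved contact residues) can be expressed as a small check. This is only a sketch under stated assumptions: the clash cutoff is a caller-supplied, hypothetical parameter because the text does not give one, and the contact residues are passed in as (target, template) residue pairs that are assumed to have been identified beforehand.

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def has_clash(
    ligand_atoms: Sequence[Coord],
    model_atoms: Sequence[Coord],
    clash_cutoff: float,
) -> bool:
    """True if any ligand atom lies closer than clash_cutoff to a model atom.

    clash_cutoff (in Å) is a hypothetical parameter chosen by the caller.
    """
    return any(
        math.dist(lig, atom) < clash_cutoff
        for lig in ligand_atoms
        for atom in model_atoms
    )


def contacts_conserved(contact_pairs: Sequence[Tuple[str, str]]) -> bool:
    """True if every ligand-contacting residue is identical in target and template."""
    return all(target == template for target, template in contact_pairs)


def can_transfer_ligand(
    ligand_atoms: Sequence[Coord],
    model_atoms: Sequence[Coord],
    contact_pairs: Sequence[Tuple[str, str]],
    clash_cutoff: float,
) -> bool:
    """Combine both conditions described in the text."""
    return (not has_clash(ligand_atoms, model_atoms, clash_cutoff)
            and contacts_conserved(contact_pairs))


if __name__ == "__main__":
    ligand = [(0.0, 0.0, 0.0)]
    protein = [(3.0, 0.0, 0.0)]
    print(can_transfer_ligand(ligand, protein, [("K", "K"), ("D", "D")], 2.0))
```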

Ramachandran plot statistics:
The Ramachandran plot displays the phi and psi backbone conformational angles for each residue in the target protein Hsp70, as shown in Figure 4. The darkest regions, shown in red, correspond to the "core" regions and represent the most favourable combinations of phi-psi values. A few residues were found in the allowed regions. The percentages of residues in the core regions are given in Table 4.
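For readers unfamiliar with how the plotted angles are obtained, the sketch below computes a signed dihedral angle from four atomic positions and applies it to the standard backbone definitions: phi uses C(i-1), N(i), CA(i), C(i), and psi uses N(i), CA(i), C(i), N(i+1). It illustrates the geometry only and is not the code used by PROCHECK; the function names and test coordinates are hypothetical.

```python
import math
from typing import Tuple

Vec = Tuple[float, float, float]


def _sub(a: Vec, b: Vec) -> Vec:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a: Vec, b: Vec) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a: Vec, b: Vec) -> Vec:
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral(p0: Vec, p1: Vec, p2: Vec, p3: Vec) -> float:
    """Signed dihedral angle in degrees (range -180 to 180) about the p1-p2 bond."""
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)  # normal of the plane p0-p1-p2
    n2 = _cross(b2, b3)  # normal of the plane p1-p2-p3
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


def phi(prev_c: Vec, n: Vec, ca: Vec, c: Vec) -> float:
    """Backbone phi: C(i-1), N(i), CA(i), C(i)."""
    return dihedral(prev_c, n, ca, c)


def psi(n: Vec, ca: Vec, c: Vec, next_n: Vec) -> float:
    """Backbone psi: N(i), CA(i), C(i), N(i+1)."""
    return dihedral(n, ca, c, next_n)


if __name__ == "__main__":
    # Four points arranged so that the dihedral is a right angle.
    print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

Each residue then contributes one (phi, psi) point, and the fraction of points falling in the core regions gives percentages of the kind listed in Table 4.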

PROCHECK output:
The stereochemical quality of the protein was analysed. The PDB structure of the HSp70 protein was examined with the PROCHECK tool. The PROCHECK G-factor (Figure 5) evaluates how probable the dihedral angles of each residue type are, with the following results: (a) The PROCHECK G-factor for phi-psi over all ordered residues is -0.659.
(b) The PROCHECK G-factor for all dihedral angles over all ordered residues is -0.619.

Output from MolProbity
The MolProbity server (within the PSVS server) is a valuable structure validation tool in the final stage of structure refinement. Van der Waals violations computed with MAGE gave a clashscore with mean 38.49 and SD 0.0000. MolProbity was used to calculate the clash score and to visualize atomic overlaps and Cβ position deviations (Figure 6).

PDB validation software output
In the PDB validation software analysis, a cutoff of 3.5 Angstroms was used for hydrogen bonding within the asymmetric unit, and distances smaller than 2.2 Angstroms between heavy atoms in the same asymmetric unit were considered close contacts. The RMS deviation for covalent bonds relative to the standard dictionary is 0.011 Angstroms (Tables 5 and 6).
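The geometric checks in this section can be sketched in a few lines. The example below assumes heavy-atom coordinates in Angstroms, a caller-supplied set of bonded index pairs to exclude from the contact search, and ideal bond lengths supplied from some standard dictionary. It uses the 2.2 Angstrom contact cutoff and the 6.0 × RMSD outlier criterion mentioned in this document, but it is an illustration, not the PDB validation software itself; all names and values are hypothetical.

```python
import math
from itertools import combinations
from typing import FrozenSet, List, Sequence, Tuple

Coord = Tuple[float, float, float]


def close_contacts(
    coords: Sequence[Coord],
    cutoff: float = 2.2,
    bonded: FrozenSet[Tuple[int, int]] = frozenset(),
) -> List[Tuple[int, int]]:
    """Index pairs (i < j) of non-bonded atoms closer than cutoff."""
    hits = []
    for (i, a), (j, b) in combinations(enumerate(coords), 2):
        if (i, j) in bonded:
            continue  # covalently bonded atoms are expected to be close
        if math.dist(a, b) < cutoff:
            hits.append((i, j))
    return hits


def bond_length_rmsd(observed: Sequence[float], ideal: Sequence[float]) -> float:
    """RMS deviation of observed bond lengths from their ideal values."""
    if not observed or len(observed) != len(ideal):
        raise ValueError("need two non-empty lists of equal length")
    total = sum((o - i) ** 2 for o, i in zip(observed, ideal))
    return math.sqrt(total / len(observed))


def bond_outliers(
    observed: Sequence[float],
    ideal: Sequence[float],
    rmsd: float,
    factor: float = 6.0,
) -> List[int]:
    """Indices of bonds deviating from ideal by more than factor * rmsd."""
    return [
        k for k, (o, i) in enumerate(zip(observed, ideal))
        if abs(o - i) > factor * rmsd
    ]


if __name__ == "__main__":
    atoms = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (5.0, 0.0, 0.0)]
    print(close_contacts(atoms, bonded=frozenset({(0, 1)})))
    print(bond_outliers([1.60, 1.53], [1.52, 1.52], rmsd=0.011))
```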
The following table contains a list of the covalent bonds greater than 6.0*RMSD.

Number of Atoms 4403
Number of Bonds 4343

Chirality
Chirality was checked, and no incorrect carbon chiral centers were found.

Conclusion
In the present study, we modelled the 3D structure of the Hsp70 protein by homology modeling and visualized it with online computational tools. Homology modelling indicated significant amino acid sequence similarity between the target and template sequences. In this approach, the template is a homologous protein identified by its sequence similarity to the target, and 89.45% identity was found. The validation results indicate the regions in which the residues lie: Ramachandran plot analysis from PROCHECK showed that most residues, 81.40%, lie in the most favoured regions, and MolProbity from the Richardson lab placed 85.8% of the selected residues in favoured regions. 3D structures of the nematode Hsp70 protein have rarely been reported for GWD. In the present study we used the GWD HSP70 protein sequence as target and 3c7n.1 as template, because during template selection this template showed higher sequence identity to the target than the other templates found. This study may encourage the use of computational approaches for 3D molecular modelling; a computer-generated model is expected to be accurate, but it cannot substitute for a crystal structure. The HSP70 3-D model should prove useful for dracunculiasis disease outbreak databases, for studying residue or derivative relationships, for future experimental verification, and for drug design and development.
Table 6: List of the covalent bond angles greater than 6.0 * RMSD.