Data set of in-silico analysis and 3D modelling of boiling stable stress-responsive protein from drought tolerant wheat

Boiling stable proteins are widespread, evolutionarily conserved proteins found in several kingdoms, including plants, fungi and bacteria. Evidence of their accumulation in response to dehydration suggests a widespread adaptation and an evolutionary role of these protein families in protecting cellular structures from the effects of water loss across a wide range of water potentials. Although boiling stable proteins represent just 0.1% of total plant proteins, they resist coagulation upon boiling and are believed to be involved in water stress adaptation in plants. The present data profile the in-silico analysis of the cloned boiling stable protein encoding gene wBsSRP from a drought tolerant cultivar of wheat. The data presented here are from a gene isolated from total RNA/mRNA samples of wheat variety PBW 175 subjected to drought stress. The gene sequence is available in the EMBL data repository under accession number LN832556.


Specifications Table
Subject: Biology
Specific subject area: In-silico analysis of drought responsive gene
Type of data: Data tables and figures in Word files
How data were acquired: Gene was isolated by RT-PCR and in-silico analysis was done using BLAST

Value of the Data
The data profile the in-silico analysis and 3D modelling of a boiling stable protein encoding gene.
The data can be used to provide in-depth knowledge that boiling stable proteins might play an important role in the protection of plants under water, salt, ionic, cold or heat stress conditions.
The data can provide new insights for studies that use diverse cultivars under control and drought conditions to examine boiling soluble protein at both pre-flowering and post-flowering stages of plant development, in order to validate its role as a potential drought stress related marker.

Data
This dataset represents the in-silico analysis and 3D modelling of a drought stress responsive gene encoding a boiling stable protein. One microliter of cDNA prepared from drought stressed leaves of the tolerant wheat cultivar PBW 175 was used as a template for RT-PCR amplification of the CDS (protein coding sequence) encoding a hydrophilic protein with a K-segment, using a pair of gene-specific primers (WZY2 gene, LEA II family gene; accession no: EU395844) (Fig. 1). This stress related gene was submitted to EMBL/GenBank and designated wBsSRP (wheat boiling soluble stress responsive protein; accession number LN832556). An ORF encoding a 45 amino acid long protein sequence was retrieved and subjected to BLAST-P and BLAST-N analysis (Table 1, Fig. 2A). Multiple amino acid sequence alignment (Fig. 2B) indicated a typical conserved signature sequence. The phylogenetic tree depicted two major groups, A and B (Fig. 2C). Physicochemical properties of the protein sequence were computed with the ProtParam tool (Table 2). Glycine was more abundant than the other amino acids (Supplementary Fig. 1).
Hydropathy plot data based on the Kyte-Doolittle scale are shown in Fig. 3A and Supplementary Fig. 2. Structural disorder was predicted in silico with PONDR-FIT (Fig. 3B and Supplementary Fig. 3). Secondary structure was predicted by the Chou-Fasman method (Fig. 3C), and PSIPRED also validated the presence of a helix in the protein sequence (Fig. 3D). Thermal mobility of residues is described by the B-factor profile (BFP) (Fig. 3E). I-TASSER was used for 3D modelling (Table 3). Threading based modelling of the wBsSRP protein by the I-TASSER server predicted ligand binding sites (Fig. 4B). Functional prediction was carried out using the ProFunc tool (Fig. 4C). The validity and quality of the model were checked by Ramachandran plot (Fig. 5), VADAR and PROSA (Fig. 6 and Supplementary Fig. 4), which indicated a good three dimensional model. The PDBsum server was used for structural motif assessment (Fig. 7A). The helical wheel diagram of the K-segment in the wBsSRP protein predicted that the helix was amphipathic, with hydrophobic residues (marked in green and blue) on one side and hydrophilic residues (marked in red and empty circles) on the other side of the helix (Fig. 7B and Supplementary Fig. 5). Active sites were predicted by the CASTp tool (Fig. 8).
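The amino acid composition reported by ProtParam can be approximated with a few lines of Python. The sketch below is only illustrative: the function name `aa_composition` is hypothetical, and the example sequence is the commonly cited dehydrin K-segment consensus, used as a stand-in because the wBsSRP sequence itself is not reproduced in this text.

```python
from collections import Counter

def aa_composition(seq):
    """Return {residue: percent of sequence length}, most abundant residue first."""
    seq = seq.upper()
    counts = Counter(seq)
    total = len(seq)
    return {aa: 100.0 * n / total for aa, n in counts.most_common()}

# Assumption: stand-in sequence (dehydrin K-segment consensus), NOT wBsSRP.
example = "EKKGIMDKIKEKLPG"
composition = aa_composition(example)

for aa, pct in composition.items():
    print(f"{aa}: {pct:.1f}%")
```

Running the same function on the 45-residue wBsSRP translation (retrievable under accession LN832556) should give percentages comparable to the ProtParam output summarised in Table 2.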

Plant material and growth conditions
Seeds of the drought tolerant cultivar Triticum aestivum L. cv. PBW 175 [1] were surface sterilized, imbibed for 6 h and germinated for three days. Drought stress was imposed on 3-day-old seedlings for 48 h by withholding water supply.

PCR amplification and cloning of wBsSRP gene
Total RNA was extracted from the drought stressed seedlings of drought tolerant cv. PBW 175 using the Nucleospin RNA plant isolation kit (Macherey Nagel, Duren, Germany), following the manufacturer's instructions. One mg of RNA sample was reverse transcribed using the "Transcriptor High Fidelity cDNA Synthesis Kit" (Roche Diagnostics, Mannheim, Germany) with oligo(dT) as the primer. One microliter of cDNA was used as a template for PCR amplification of the CDS (protein coding sequence) encoding a hydrophilic protein with a K-segment, using a pair of gene-specific primers (WZY2 gene, LEA II family gene; accession no: EU395844) (Fig. 1), following Rakhra et al. (2017) [2]. The sequence was deposited in EMBL/GenBank under accession number LN832556. The wBsSRP gene was cloned into the TA cloning vector pTZ57R/T using "InsTAclone"™ (Thermo Fisher Scientific, Waltham, Massachusetts, USA). Table 1B.

Sequence analysis of wBsSRP
The ORF Finder tool at NCBI (www.ncbi.nlm.nih.gov) was used to identify the coding regions. The wBsSRP gene and protein sequences were subjected to a homology search using BLAST at NCBI (www.ncbi.nlm.nih.gov) to deduce similarity with sequences available in the databases. Conserved region analysis among various protein homologues was carried out using the CLUSTAL-W tool (http://www.ebi.ac.uk/Tools/msa/clustalw2/). A phylogenetic tree was constructed from the aligned protein sequences of various plants using the bootstrap Neighbour Joining method in MEGA 4 [3]. Physicochemical properties were calculated with the ProtParam tool at ExPASy (www.expasy.org). The Chou-Fasman (www.biogem.org/tool/chou-fasman/) and PSIPRED (http://bioinf.cs.ucl.ac.uk/psipred/) tools were used for secondary structure prediction from the amino acid sequence. Hydropathy analysis was carried out using ProtScale at ExPASy with the following parameters: scale: Hphob./Kyte & Doolittle; window size: 9; weight variation model: linear. The PONDR-FIT tool (http://www.pondr.com/) with the VLXT predictor was used to assess the intrinsically disordered nature of the protein.
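The sliding-window hydropathy calculation can be sketched as follows. This is a minimal illustration, not the ProtScale implementation: it assumes the relative weight of the window edges is left at 100%, in which case the linear weight model reduces to a plain mean over the 9-residue window. The function name `hydropathy_profile` is hypothetical, and the example sequence is the dehydrin K-segment consensus rather than wBsSRP.

```python
# Kyte & Doolittle (1982) hydropathy index for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def hydropathy_profile(seq, window=9):
    """Return [(centre_position, mean_score)] with 1-based centre positions.

    Assumption: uniform weights across the window (edge weight 100%).
    """
    if window % 2 == 0 or window > len(seq):
        raise ValueError("window must be odd and no longer than the sequence")
    half = window // 2
    scores = [KD[aa] for aa in seq.upper()]
    profile = []
    for centre in range(half, len(scores) - half):
        win = scores[centre - half:centre + half + 1]
        profile.append((centre + 1, sum(win) / window))
    return profile

# Assumption: stand-in sequence (dehydrin K-segment consensus), NOT wBsSRP.
profile = hydropathy_profile("EKKGIMDKIKEKLPG", window=9)
for position, score in profile:
    print(position, round(score, 3))
```

Negative window means indicate hydrophilic stretches, which is the pattern expected for a boiling stable, LEA-type protein.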

Molecular modelling (3-D) and evaluation of wBsSRP protein
The three dimensional structure of the wBsSRP protein was predicted with the iterative threading assembly refinement (I-TASSER) standalone package (version 1.1) [4].

Table 3A List of top ten templates used by I-TASSER for 3D structure prediction of wBsSRP.
Rank    PDB hit
1       1zvoC
2       2kfeA
3       2kk7A
4       2kfeA
5       1ddzA
6       3u1cA
7       3itcA
8       2rb6A
9       2hgqE
10      2i9oA

Sequence alignment of the DNA binding template predicted by the ProFunc server. The sequence alignment is driven by the residues equivalenced by the template match: the sequences of the query and target proteins are aligned using the matched residues from the template search, together with any equivalenced residues within 10 Å of the template centre. Query protein: the amino acid sequence, residue numbers and secondary structure "wiring diagram" of the query protein are shown. The wiring diagram schematically illustrates the protein's helices as red jagged elements and its beta strands as yellow arrows. The sequence itself is coloured according to the residue similarity to the aligned residues in the target protein. Target protein: the amino acid sequence, residue numbers and secondary structure "wiring diagram" of the target protein are shown. The wiring diagram schematically illustrates the protein's helices as red jagged elements, its beta strands as yellow arrows, and its coil regions as purple lines. The sequence itself is coloured according to the residue similarity to the aligned residues in the target protein. Template residues: the residues highlighted in red correspond to the template residues and the equivalent residues in the other structure that they matched. Equivalenced residues: the dots identify which residues in each sequence lie within 10 Å of the template centre and hence show which were used to drive the alignment. Boxed regions: the boxed regions of the alignment represent segments where the sequence identity of the two sequences exceeds 35%, that is, regions of reasonably significant sequence similarity. Fittable regions: the red line segment identifies the structurally "fittable" and conserved functional region common to the two proteins in the alignment. This corresponds to the segment from both proteins whose C-alpha coordinates can be structurally superposed with an r.m.s.d. of less than 3.0 Å.

Validations, structural and functional analysis
Structural analysis and validation were done using VADAR (http://redpoll.pharmacy.ualberta.ca/vadar) with the following programme options: van der Waals radii: Shrake; standard Voronoi procedure for volume calculation. PROSA (http://prosa.services.came.sbg.ac.at/prosa.php) and the Phi/Psi Ramachandran plot (www.ebi.ac.uk/pdbsum) were also used. PDBsum was used to identify structural motifs. The ProFunc server of EMBL-EBI was used to identify the likely biochemical function. Helical wheel prediction was carried out using the Pepwheel tool (http://www.bioinformatics.nl/cgi-bin/emboss/pepwheel) with the following parameters: number of steps: 18; turns: 5; output format: PNG. Helixator was also used to find amphipathic TMCs (http://www.tcdb.org/progs/helical_wheel.php).
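The Pepwheel settings quoted in this section (18 steps, 5 turns) correspond to 3.6 residues per turn, i.e. a rotation of 100° per residue. The sketch below places residues on such a wheel and summarises how tightly the hydrophobic residues cluster on one face. It is an illustration only: the `HYDROPHOBIC` grouping and the function names are assumptions made here, not Pepwheel's or Helixator's own definitions, and the example sequence is the dehydrin K-segment consensus rather than the wBsSRP K-segment.

```python
import math

STEPS, TURNS = 18, 5                         # Pepwheel settings quoted in the text
ANGLE_PER_RESIDUE = 360.0 * TURNS / STEPS    # 100 degrees per residue

# Assumption: illustrative hydrophobic grouping, not Pepwheel's own classes.
HYDROPHOBIC = set("AILMFVWC")

def wheel_positions(seq):
    """Return (residue, 1-based index, angle in degrees, is_hydrophobic) tuples."""
    return [
        (aa, i + 1, (i * ANGLE_PER_RESIDUE) % 360.0, aa in HYDROPHOBIC)
        for i, aa in enumerate(seq.upper())
    ]

def hydrophobic_face(seq):
    """Mean direction (degrees) and resultant length (0 to 1) of hydrophobic residues.

    A resultant length near 1 means the hydrophobic residues sit on one face
    of the helix, the signature of an amphipathic helix.
    """
    angles = [math.radians(a) for _, _, a, hydrophobic in wheel_positions(seq) if hydrophobic]
    if not angles:
        return None
    x = sum(math.cos(a) for a in angles) / len(angles)
    y = sum(math.sin(a) for a in angles) / len(angles)
    return math.degrees(math.atan2(y, x)) % 360.0, math.hypot(x, y)

# Assumption: stand-in sequence (dehydrin K-segment consensus), NOT wBsSRP.
positions = wheel_positions("EKKGIMDKIKEKLPG")
direction, resultant = hydrophobic_face("EKKGIMDKIKEKLPG")
print(positions)
print(direction, resultant)
```

The resultant length is a crude clustering measure chosen for brevity; a hydrophobic moment weighted by a hydropathy scale would be a more conventional alternative.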