Bioinformatics Insights into Microbial Xylanase Protein Sequences

Published by Oriental Scientific Publishing Company © 2018. This is an Open Access article licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (https://creativecommons.org/licenses/by-nc-sa/4.0/), which permits unrestricted Non Commercial use, distribution and reproduction in any medium, provided the original work is properly cited.

Several bioinformatics studies have been carried out on various xylanases. In-silico analysis of the structural attributes of commercially important xylanases from diverse sources has been reported [15] (Arora et al., 2009). Attempts have been made to study the changes in structural dynamics of the Trichoderma longibrachiatum xylanase upon binding with xylohexaose and xylan ligands [16] (Uzuner et al., 2010). In-silico structural prediction of Bacillus brevis xylanase and its comparative assessment with a few bacterial and fungal xylanases has been reported recently [17] (Mathur et al., 2015).
Efforts have been made to analyze several plant cell wall degrading enzymes (PCWDEs), including xylanases and polygalacturonases of Fusarium virguliforme, using bioinformatics tools to develop fungal-resistant soybean [18] (Chang et al., 2016). Homology modeling of xylanase from the Aspergillus fumigatus R1 isolate has been attempted to gain insight into its three-dimensional structure [19] (Deshmukh et al., 2016).
This manuscript reports in-silico characterization of xylanase protein sequences retrieved from NCBI representing diverse microbial sources, namely fungi, bacteria, actinomycetes and yeast. Bioinformatics assessment of these sequences for homology, sequence alignment, physio-chemical attributes, motif assessment and phylogenetic tree construction is reported. The bioinformatics-driven characterization of available sequences of microbial xylanases could be utilized for developing appropriate strategies for molecular cloning and expression of xylanase genes. Further, the sequence-structure-function relationship could be established from in-silico studies, and novel xylanases could be derived using state-of-the-art technologies such as metagenomics or directed evolution.

Database search and sequence retrieval
Xylanase protein sequences representing different microbial sources were retrieved from GenBank, NCBI (http://www.ncbi.nlm.nih.gov/). The retrieved sequences were saved in FASTA format and truncated proteins were discarded. The source organisms represent four major groups, namely fungi, bacteria, actinomycetes and yeast, and all the xylanase sequences belong to the GH11 family.
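The retrieval step above can be illustrated with a minimal Python sketch that reads FASTA text and discards records that look truncated. The paper does not state how truncated proteins were identified, so the criteria used here (the word "partial" in the header, a missing initial methionine, or a length below a hypothetical `min_length` threshold) are assumptions made only for illustration.

```python
from typing import Iterator, List, Tuple


def parse_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header = None
    chunks: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts; emit the previous one first.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def drop_truncated(records, min_length: int = 100) -> List[Tuple[str, str]]:
    """Keep records that look full length.

    Assumed heuristics (not taken from the paper): skip headers flagged
    "partial", sequences not starting with Met, and sequences shorter
    than the hypothetical min_length cutoff.
    """
    kept = []
    for header, seq in records:
        if "partial" in header.lower():
            continue
        if not seq.upper().startswith("M") or len(seq) < min_length:
            continue
        kept.append((header, seq))
    return kept
```

In practice the cutoff would be chosen from the expected length of GH11 xylanases in the data set; the default of 100 residues is only a placeholder.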

Multiple sequence alignment and phylogenetic analysis
The protein alignment of full-length amino acid sequences of xylanases was performed with CLUSTAL X version 2.1 [21] (Larkin et al., 2007). The phylogenetic tree was constructed from the protein sequences by the neighbor-joining (NJ) method using the MEGA 7.0 program [22] (Kumar et al., 2016).
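The NJ method works from a matrix of pairwise distances between the aligned sequences. The sketch below shows one simple way to obtain such a matrix, the p-distance (proportion of differing sites) with gapped positions ignored pairwise. The substitution model and gap treatment used in MEGA are not stated in the text, so both choices here are assumptions, and the function names are hypothetical.

```python
from itertools import combinations
from typing import Dict, Tuple


def p_distance(a: str, b: str) -> float:
    """Proportion of differing residues between two aligned sequences.

    Columns where either sequence has a gap ('-') are skipped
    (pairwise deletion, an assumed setting).
    """
    if len(a) != len(b):
        raise ValueError("sequences must come from the same alignment")
    compared = differing = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x != y:
            differing += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return differing / compared


def distance_matrix(aligned: Dict[str, str]) -> Dict[Tuple[str, str], float]:
    """Pairwise p-distances for every pair of named aligned sequences."""
    return {
        (m, n): p_distance(aligned[m], aligned[n])
        for m, n in combinations(sorted(aligned), 2)
    }
```

The resulting matrix is the kind of input a neighbor-joining implementation consumes; the tree-building step itself is left to a dedicated program such as MEGA.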

Identification of conserved motifs
The protein sequences of xylanase were analyzed with the Multiple EM for Motif Elicitation (MEME) program version 4.12.0 (http://meme.nbcr.net/meme/) [23] (Timothy et al., 2009). The maximum number of motifs was set to 10. The minimum motif width was set to 6 and the maximum width to 50 amino acids, with all other parameters left at their default values.
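For readers who run MEME locally rather than through the web server, the settings above map onto the command-line options `-protein`, `-nmotifs`, `-minw` and `-maxw`. The sketch below only assembles such a command as an argument list. Whether the authors used the web interface or the command-line tool is not stated, and the file and directory names are hypothetical.

```python
from typing import List


def build_meme_command(fasta_path: str, out_dir: str) -> List[str]:
    """Assemble a MEME invocation matching the reported settings.

    Reported in the text: at most 10 motifs, motif width 6 to 50 residues,
    other parameters at their defaults. Paths are illustrative only.
    """
    return [
        "meme", fasta_path,
        "-protein",         # input is amino acid sequence
        "-nmotifs", "10",   # maximum number of motifs
        "-minw", "6",       # minimum motif width
        "-maxw", "50",      # maximum motif width
        "-oc", out_dir,     # output directory (overwritten if present)
    ]
```

The list can be passed to `subprocess.run` on a machine where the MEME suite is installed, for example `subprocess.run(build_meme_command("xylanases.fasta", "meme_out"), check=True)`.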

Physio-chemical characterization of xylanases
A total of the 122 xylanase protein

Multiple sequence alignment analysis
It has also been reported that the glutamate residue responsible for catalysis is conserved in genera of ascomycetes and basidiomycetes representing the GH11 and GH10 families of xylanases [32] (Cervantes et al., 2016). Xylanase proteins representing actinomycetes revealed several conserved amino acid residues

Phylogenetic analysis
The phylogenetic trees based on microbial xylanase protein sequences were constructed by the NJ method (Figure-2 A, B, C, D). The phylogenetic tree representing fungal xylanase protein sequences revealed 5 distinct major clusters designated as groups I, II, III, IV and V (Figure-2A).
Genus-specific clusters for different species of Aspergillus and Fusarium were observed. This indicates sequence-level similarity among xylanases representing specific genera and could be utilized to decipher specific sequence features for designing genus-specific probes or primers exclusively for xylanase genes. Further, distinct sub-clusters representing multiple strains of predominantly Aspergillus niger and Fusarium oxysporum were also observed (Figure-2A). In the case of bacterial xylanases, two distinct clusters designated as I and II, comprising exclusively Paenibacillus and Dictyoglomus species, were observed (Figure-2B).
Xylanase from Fibrobacter succinogenes occupied a distinct place in the phylogenetic tree. The major clusters I and II represented predominantly multiple strains of Paenibacillus polymyxa and Dictyoglomus thermophilum, indicating strain-specific sequence similarity. Similarly, the phylogenetic tree for xylanases from actinomycetes revealed two major clusters I and II with 15 and 4 members respectively. The major cluster I was further divided into three sub-clusters, i.e. A, B and C (Figure-2C).
In the case of xylanases from yeast sources, two major clusters I and II with 12 and 8 sequences were observed, which were further divided into two sub-clusters A and B respectively (Figure- predominantly from Fusarium genera along with some sequences from yeast and the major cluster G with 27 sequences comprises both fungal and yeast source organisms (Figure-2D). Phylogenetic trees revealing xylanases representing the GH10 and GH11 families and also basidiomycetes- and ascomycetes-specific fungal groups have been reported [32,29] (Cervantes et al., 2016; Ellouze et al., 2011). Distinct clades representing the GH10, GH11 and GH30 families, revealing evolutionary relatedness based on 22 xylanase protein sequences, were also deciphered [33] (Liao et al., 2015).

Motif distribution and characterization
The conserved motifs deduced by MEME are generally analyzed for biological function using  Trichoderma harzianum has been reported [15] (Arora et al., 2009).
The relevance of bioinformatics in enzyme engineering has been witnessed in recent years, and several in-silico tools, mainly focusing on prediction of the three-dimensional structure of enzymes based on the availability of protein sequences, are now being routinely used [35,36] (Damborsky and Brezovsky, 2014; Suplatov et al., 2015). The in-silico analysis of the sequences of genes/proteins of several industrially important enzymes mainly focusing on homology search, multiple sequence Molecular cloning of relevant genes coding for enzymes and their expression needs bioinformatics intervention targeting substantial improvement in the enzyme for desired features. Recently, functional diversity of multiple xylanases from Penicillium oxalicum GZ-2, revealing functional redundancy using a bioinformatics approach, has been reported [33] (Liao et al., 2015).

Conclusions
Using a bioinformatics approach, an attempt has been made to characterize microbial xylanase sequences for several important attributes, which could be targeted for enzyme engineering to develop novel xylanases. The knowledge about the sequences is being applied for deciphering the three-dimensional structure using appropriate in-silico tools prior to wet-lab experimentation. The tools of bioinformatics are also relevant in the era of genomics, where several microbial genome sequences have been deciphered. This provides an opportunity to perform genome-wide identification and characterization of multigene families of industrially important enzymes and to analyze their functional redundancy. There has been substantial improvement in advanced enzyme technologies, including metagenomics and directed evolution, based on recent bioinformatics-driven approaches.
(Walia et al., 2015) and high aliphatic index indicates stability of xylanases over a wide temperature range. The aliphatic index of xylanase protein sequences from Aspergillus niger (ALN49265), Dictyoglomus thermophilum (WP_012582654, AAC46361), Aureobasidium melanogenum (KEQ63689), Aureobasidium namibiae (XP_013422490) and Aureobasidium pullulans (KEQ80629) was above 70 (Table-1). Another important physio-chemical attribute analyzed by ProtParam is the GRAVY value, derived by calculating the sum of hydropathy values [28] (Kyte and Doolittle, 1982) of all the amino acids, divided by the number of residues in the sequence [20] (Gasteiger et al., 2005). An increasingly positive score indicates greater hydrophobicity. The microbial xylanase protein sequences revealed negative GRAVY values ranging from -0.832 to -0.093, indicating a hydrophilic nature.
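The two indices discussed above are straightforward to compute by hand, which can help when checking ProtParam output. The sketch below is a minimal Python version, not ProtParam's own code: GRAVY is the mean Kyte-Doolittle hydropathy per residue, and the aliphatic index follows the commonly used formula AI = X(Ala) + 2.9 X(Val) + 3.9 [X(Ile) + X(Leu)], with X in mole percent. The function names are hypothetical.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(seq: str) -> float:
    """Grand average of hydropathy: summed hydropathy / sequence length.

    Raises KeyError for residues outside the 20 standard amino acids.
    """
    seq = seq.upper()
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def aliphatic_index(seq: str) -> float:
    """Aliphatic index from mole percentages of Ala, Val, Ile and Leu."""
    seq = seq.upper()
    n = len(seq)

    def mole_percent(aa: str) -> float:
        return 100.0 * seq.count(aa) / n

    return (
        mole_percent("A")
        + 2.9 * mole_percent("V")
        + 3.9 * (mole_percent("I") + mole_percent("L"))
    )
```

A negative `gravy` result corresponds to the hydrophilic character reported for the xylanase sequences, and a larger `aliphatic_index` corresponds to a larger relative volume of aliphatic side chains.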

Fig. 1. Multiple sequence alignment of xylanase protein sequences from (A) Fungal (B) Bacterial (C) Actinomycetes and (D) Yeast sources. Strongly conserved amino acid residues are indicated by an asterisk (*) above the alignment.

Multiple sequence alignment of the retrieved xylanase sequences was performed by CLUSTAL X version 2.1 and is shown in Figure-1 (A, B, C & D). Several conserved amino acid residues are observed for different source organisms, while comprehensive multiple sequence alignment of all 122 xylanase sequences revealed two highly conserved regions, namely YGW and EYYI (Figure-1E). The presence of these conserved amino acid residues has been reported for xylanases, especially from fungal and bacterial sources [29][30][31] (Ellouze et al., 2011; Sapag et al., 2002; Torronen et al., 1992). Similar conserved amino acid residues have been observed for the xylanase of T. longibrachiatum. Another conserved amino acid residues with

Fig. 2(a). Phylogenetic tree constructed using protein sequences of 58 fungal xylanases. The distinct major clusters designated as I, II, III, IV and V, comprising 19, 19, 5, 10 and 5 members respectively, are highlighted.

Fig. 2(c). Phylogenetic tree constructed using 19 xylanase protein sequences of actinomycetes. The distinct major clusters designated as I and II, comprising 15 and 4 members respectively, are highlighted.

protein BLAST and domains are characterized by InterProScan to reveal the best possible match based on the highest similarity score. The distribution of five motifs among microbial xylanase protein sequences is shown in Figure-3A, B, C and D. The distribution of five motifs among 58 fungal xylanase protein sequences was analyzed (Figure-3A), and the motifs with their widths and best possible match amino acid sequences are shown in Table-2A. The predominance of motifs with a conserved domain representing a unique feature of the GH11 family was observed. Motif 1, with amino acid sequence IDGTATFTQYWSVRQNKRSSGTVTTSNHFNAWAKLGMNLGTHNYQIVATE, and motif 2, with sequence PSGNGYLSVYGWTTNPLVEYYIVESYGTYNPGSGGTYKGTV, were uniformly distributed among fungal xylanases. Similarly, the motif assessment for bacterial (Figure-3B, Table-2B), actinomycetes (Figure-3C, Table-2C) and yeast (Figure-3D, Table-2D) source organisms revealed predominance of conserved domains specific to the GH11 family. The comprehensive analysis of all the xylanase sequences, irrespective of source

Table 1. List of xylanase protein sequences from different microbial sources with in-silico physio-chemical attributes revealed by ProtParam.
2539.7KDa. The isoelectric point (pI) was in the range of 3.93-9.69. A molecular weight in the range of 8.5 to 85 kDa and a pI in the range of 4-10.3 have been reported for bacterial [5] (Chakdhar et al., 2015) and fungal xylanases [25] (Polizeli et al.