In silico characterization of nifH gene of Rhizobium sp. TN04 isolated from the rhizosphere of non-leguminous potato plants



Introduction
Nitrogen (N) availability significantly limits agricultural crop productivity worldwide. The use of N fertiliser is increasing significantly on a global scale, with around 40% of the world's population depending on N fertiliser for crop cultivation [1]. The excessive use of N fertiliser not only incurs high costs but also poses environmental problems. Biological nitrogen fixation (BNF) is a fundamental microbiological process within soil and plant ecosystems, vital for providing N to crops [2]. The process of N-fixation is regulated by nod, fix and nif genes [3].
Among them, nif genes show maximum diversity and encode proteins crucial in regulating the process of N-fixation [4].
The nif genes function only in microaerophilic or anaerobic environments because the nitrogenase they encode is highly sensitive to oxygen.
The N-fixing machinery in diazotrophic organisms consists of 19 nif genes [5], which are responsible for converting N from an unusable form to a usable form using the enzyme nitrogenase.

It is well established that leguminous plants obtain nitrogen by establishing endosymbiotic relationships with rhizobia [9]. These bacteria fix N by forming nodules on the roots of their host plants and play a beneficial role in promoting the growth of these plants. Rhizobia, characterized as non-sporulating, Gram-stain-negative aerobic rods, belong to α-proteobacteria and β-proteobacteria [10,11]. These bacteria are distributed among 18 genera within various families and are usually referred to as legume endosymbionts.
Moreover, they have also been identified in association with the roots of non-leguminous plants, including certain cereals such as rice, wheat, and maize [12-14].
However, the isolation and identification of Rhizobium as a free-living diazotroph in potato (Solanum tuberosum L.) plants remain relatively unexplored.
Potato is a major vegetable crop cultivated in 79% of countries worldwide [15]. It

Material and methods
The study conducted by Naqqash et al

3D structure modelling and verification
Among the three strains, Rhizobium sp. TN04 was selected for the prediction of its

Protein-protein association network
The STRING

Results and discussion
This […] 3). The PSIPRED tool was used to predict the secondary structure [26] of the nifH protein in all three Rhizobium strains. The results showed that the nifH protein from both the non-leguminous and leguminous strains of Rhizobium consists of the three main secondary conformations: sheets, coils, and helices (Figures 3 and 4). Coils are flexible segments within a protein that do not possess a well-defined secondary structure. A higher percentage of coils may suggest increased surface accessibility and flexibility [47]. These areas are frequently involved in substrate […] The presence of secondary configurations, including α-helices and β-sheets, suggests that the nifH proteins in all strains are not in an unfolded state, indicating their stable nature.
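The proportions of helices, sheets, and coils discussed above can be tallied directly from a three-state prediction string. The following is a minimal sketch, assuming the prediction is available as a string of the letters H (helix), E (strand), and C (coil), as in PSIPRED's three-state output; the function name and the toy string are hypothetical and do not represent the nifH results of this study.

```python
from collections import Counter


def secondary_structure_composition(ss: str) -> dict:
    """Return the percentage of helix (H), strand (E) and coil (C) states.

    `ss` is assumed to be a three-state secondary-structure string
    (one letter per residue), such as the one PSIPRED reports.
    """
    states = ss.strip().upper()
    if not states:
        raise ValueError("empty secondary-structure string")
    unexpected = set(states) - set("HEC")
    if unexpected:
        raise ValueError(f"unexpected state symbols: {sorted(unexpected)}")
    counts = Counter(states)
    # Counter returns 0 for states that never occur, so all three keys exist.
    return {state: 100.0 * counts[state] / len(states) for state in "HEC"}


# Hypothetical 20-residue prediction, for illustration only.
toy_prediction = "CCCEEEEECCHHHHHHHHCC"
composition = secondary_structure_composition(toy_prediction)
print(composition)
```

Comparing these percentages across strains gives a quick numerical counterpart to the visual comparison of the PSIPRED diagrams.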
Thermophiles have been found to have a higher proportion of their amino acid residues in α-helical configuration, which helps them tolerate high temperatures [49]. In this study, the presence of α-helices in the nifH protein suggests that it is thermally stable. Moreover, Roy et al. [50] demonstrated that it is important to detect structural modifications in a protein of interest, especially regarding its function and stability, as it undergoes changes under different conditions. The MEMSAT-SVM tool was developed to predict the topology of transmembrane domains in proteins [51]. Although the nifH protein is not classified as a membrane protein [52], MEMSAT-SVM was used to identify any segments or features within the nifH protein that might be related to membrane-associated structural features or interactions. In this study, nifH was predicted to be extracellular in all three strains (Figure 5), which aligns with the results of the physicochemical analysis.
The functional annotation of the nifH protein from Rhizobium sp. TN04 was further assessed using I-TASSER, COFACTOR, and COACH.
The I-TASSER modelling process starts from structure templates identified in the PDB library by the LOMETS approach. LOMETS is a meta-server threading system comprising multiple threading programs, each of which generates numerous template alignments. Within I-TASSER, only the most significant templates from the threading alignments are used, as determined by their Z-score [53]. This score represents the difference between the raw and average alignment scores in units of standard deviation; the top 10 templates extracted from the threading programs are shown in Figure 6. Further, at the level of Topology, it exhibits a "Rossmann fold" (Topology […]

The structural subunits of nitrogenase enzymes encoded by the nifD, nifK, and nifH genes are responsible for nitrogenase activity. The proteins identified in diazotrophs such as Bradyrhizobium japonicum, Herbaspirillum seropedicae, Azotobacter vinelandii, and Pseudomonas stutzeri share a common function and structure along with similar sequences [6-8]. The availability of a structural model of the nifH protein is essential for investigating biological N-fixation at the molecular scale. However, there is limited knowledge regarding the function and structure of nifH proteins in potato plants.
ranks fourth in global production, following wheat and maize. It is known for its cost-effectiveness and rich nutritional profile (vital amino acids, proteins, minerals, antioxidants, vitamins, and carbohydrates) [16]. Various plant-growth-promoting bacteria have been identified in potato plants, including Bacillus, Aeromonas, Azospirillum, Moraxella, and Pseudomonas, among others [17-19]. However, few studies report the isolation of Rhizobium from non-leguminous potato plants. Naqqash et al. [20] reported the isolation and characterization of Rhizobium sp. TN04 from the potato rhizosphere. The results of their study showed increased levels of N under controlled and field conditions [20]. Thus, a more detailed understanding of the pathway involved in Rhizobium N-fixation, particularly the role of the nifH gene responsible for its nitrogenase activity, is needed. Therefore, the present study aimed to investigate the function of the nifH gene in N-fixation by developing an in silico model. To gain a deeper understanding of N-fixation, this study examined the protein structural variations and analysed the primary, secondary, and tertiary structures using in silico modelling techniques. However, the tertiary structures of numerous nitrogenase proteins from various diazotrophs, especially those of symbiotic organisms, have not been determined so far. Hence, constructing a model of the nifH tertiary structure is essential to gain deeper insight into its activity.
identify templates for the protein of interest and subsequently generates a 3D model using data obtained from these templates [27]. The constructed 3D model was assessed using the SAVES server (http://services.mbi.ucla.edu/SAVES/) to verify its quality. ERRAT (http://services.mbi.ucla.edu/ERRAT/) was used to distinguish correctly determined regions of the protein structure from those that might have been incorrectly determined. Errors introduced during model building can randomise the otherwise non-random distribution of atoms within a protein. ERRAT examines the distribution of atoms in the query model and provides an overall quality factor that assesses non-bonded atomic interactions. Higher scores indicate higher quality, with values greater than 50 typically considered acceptable, as stated by Li and Wang (2007) and Naveed et al. (2016). COFACTOR and COACH were used to identify enzyme and ligand binding sites [27].

CATH classification
CATH (http://www.cathdb.info/search/by_structure) is a system that organises protein domain structures into a hierarchical classification. The acronym CATH is formed from the first letters of the four highest levels of the classification system. The "Class" level represents the general composition of the domain's secondary structure, indicating whether it consists mainly of alpha-helices, mainly of beta-sheets, of a mixture of both, or of few secondary structures. "Architecture" denotes significant structural resemblance without any indication of homology, such as the presence of an alpha/beta sandwich. "Topology" refers to the large-scale arrangement of structural elements with specific structural characteristics. The "Homologous superfamily" level indicates a verified evolutionary relationship [28]. This classification was performed on the nifH protein of Rhizobium sp. TN04 to elucidate the structural characteristics of this protein.
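The four-level hierarchy described above maps naturally onto the dotted numeric codes used by CATH, where each successive number refines the previous level. The sketch below is an illustration under stated assumptions: the helper name `cath_levels` is hypothetical, and the example code 3.40.50.300 was chosen only for illustration (3.40.50 is the CATH topology labelled Rossmann fold); it is not a value reported for nifH in this study.

```python
CATH_LEVELS = ("Class", "Architecture", "Topology", "Homologous superfamily")


def cath_levels(code: str) -> dict:
    """Split a four-part CATH code into its cumulative level identifiers.

    For a code "C.A.T.H" the result maps
    Class -> "C", Architecture -> "C.A", Topology -> "C.A.T",
    Homologous superfamily -> "C.A.T.H".
    """
    parts = code.strip().split(".")
    if len(parts) != len(CATH_LEVELS) or not all(p.isdigit() for p in parts):
        raise ValueError(f"expected a code of the form C.A.T.H, got {code!r}")
    return {
        level: ".".join(parts[: depth + 1])
        for depth, level in enumerate(CATH_LEVELS)
    }


# Illustrative code only; not taken from the study's results.
example = cath_levels("3.40.50.300")
for level, identifier in example.items():
    print(f"{level}: {identifier}")
```

Keeping the cumulative identifier at each level makes it easy to check whether two domains share, for example, the same Topology without sharing a Homologous superfamily.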

(http://stringdb.org/newstring_cgi) database was used to find genes that have functional relationships with Rhizobium sp. TN04. To achieve this, STRING searches for gene clusters in which the query gene occurs repeatedly. STRING is based on earlier studies indicating that genes which frequently appear near each other in genomes tend to encode proteins that are functionally related and participate in similar metabolic pathways [29, 30].
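The neighbourhood idea behind this approach can be illustrated with a toy calculation: count in how many genomes two genes occur next to each other. This is a simplified sketch, not STRING's actual scoring scheme; the function name, the `window` parameter, and the gene orders below are all hypothetical.

```python
from collections import Counter


def neighbourhood_counts(genomes: dict, window: int = 1) -> Counter:
    """Count, for each gene pair, the genomes in which the two genes
    lie within `window` positions of each other.

    `genomes` maps a genome name to an ordered list of gene names.
    Each genome contributes at most once to any given pair.
    """
    counts = Counter()
    for gene_order in genomes.values():
        pairs_in_genome = set()
        for i, gene in enumerate(gene_order):
            for other in gene_order[i + 1 : i + 1 + window]:
                if other != gene:
                    pairs_in_genome.add(frozenset((gene, other)))
        counts.update(pairs_in_genome)
    return counts


# Hypothetical gene orders, invented purely for illustration.
toy_genomes = {
    "genome_A": ["nifH", "nifD", "nifK", "nifE", "nifN"],
    "genome_B": ["fixA", "nifH", "nifD", "nifK", "nifB"],
    "genome_C": ["nifB", "nifH", "nifD", "rpoN", "nifK"],
}
counts = neighbourhood_counts(toy_genomes, window=1)
for pair, n in counts.most_common(3):
    print(sorted(pair), n)
```

Pairs that recur as neighbours in many genomes are the ones such an approach would flag as candidates for a functional association.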

Figure 1: Phylogenetic tree constructed using nifH gene of Rhizobium sp. isolated from potato rhizosphere ( ) in comparison with previously published sequences. The numbers indicated at the branch points represent bootstrap values greater than 70%.

Figure 2: Phylogenetic tree constructed using nifH protein of Rhizobium sp. isolated from potato rhizosphere ( ) in comparison with various protein sequences. The numbers indicated at the branch points represent bootstrap values greater than 70%.

[…] showed 20 and 8, respectively. The total number of positively charged residues was lower than that of negatively charged residues, which suggests the extracellular nature of the nifH protein [43]. The extinction coefficient (EC) measures the light absorption by proteins at a specific wavelength. Computed extinction coefficients and protein concentration enable quantitative analysis of protein-ligand and protein-protein interactions in solution [44]. In this study, the EC at a wavelength of 280 nm in water was estimated under two assumptions: that all cysteine residues are reduced, or that all pairs of cysteine residues form cystines. The EC value of the nifH protein was similar in all three strains. The polypeptide chains of proteins are built from 20 amino acid residues, each possessing distinct characteristics crucial for particular functions within a protein. The percentages of charged, polar, aromatic, and aliphatic residues in proteins vary depending on their function and location [45]. Phosphorylation plays a pivotal role in enabling signaling pathways to function. Among the primary amino acid residues, Threonine and Tyrosine are commonly phosphorylated because their side chains contain hydroxyl groups that facilitate phosphate group binding [46]. In this study, the ProtParam tool was used to estimate the percentages of all 20 amino acids. Among them, Glycine had the highest percentage, with values of 13.5%, 12.80%, and 13% in Rhizobium sp. TN04, Rhizobium sp. S1SS148 and R. rosettiformans, respectively, whereas no Tryptophan was observed in any of the three strains (Table 3).
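The quantities discussed in this paragraph (amino acid percentages, counts of charged residues, and the extinction coefficient at 280 nm) can be computed from a sequence with a few lines of code. The sketch below is a hedged illustration rather than the ProtParam implementation: the function names are hypothetical, the toy sequence is invented and is not a nifH sequence, and the molar extinction values used (Tyr 1490, Trp 5500, cystine 125 M⁻¹ cm⁻¹) are the commonly cited values on which this kind of estimate is based.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition_percent(seq: str) -> dict:
    """Percentage of each of the 20 standard amino acids in `seq`."""
    counts = Counter(seq.upper())
    return {aa: 100.0 * counts[aa] / len(seq) for aa in AMINO_ACIDS}


def charged_residue_counts(seq: str) -> tuple:
    """Return (negatively charged, positively charged) residue counts,
    counting Asp + Glu as negative and Arg + Lys as positive."""
    counts = Counter(seq.upper())
    return counts["D"] + counts["E"], counts["R"] + counts["K"]


def extinction_coefficient_280(seq: str, cystines_formed: bool = True) -> int:
    """Estimate the molar extinction coefficient at 280 nm in water.

    If `cystines_formed` is True, every pair of Cys residues is assumed
    to form a cystine; otherwise all Cys residues are assumed reduced.
    """
    counts = Counter(seq.upper())
    coefficient = counts["Y"] * 1490 + counts["W"] * 5500
    if cystines_formed:
        coefficient += (counts["C"] // 2) * 125
    return coefficient


# Invented 20-residue toy sequence; NOT a real nifH sequence.
toy_sequence = "MGGKGDEEYYCCRLAIGSTV"
percentages = composition_percent(toy_sequence)
negative, positive = charged_residue_counts(toy_sequence)
print(f"Gly: {percentages['G']:.1f}%  Trp: {percentages['W']:.1f}%")
print(f"negative: {negative}  positive: {positive}")
print(extinction_coefficient_280(toy_sequence, cystines_formed=True))
print(extinction_coefficient_280(toy_sequence, cystines_formed=False))
```

Reporting the extinction coefficient under both cysteine assumptions mirrors the two values described in the text, and a sequence with no Tryptophan simply contributes nothing to the Trp term.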

Figure 6: Top 10 templates extracted from threading programs

Figure 10: STRING analysis of the interaction of nifH protein with other proteins.

Table 1: Identification of nifH gene and protein of TN04 strain isolated from non-leguminous […]

study uses bioinformatic tools to predict the structure and function of the nifH gene of Rhizobium sp. TN04 isolated from non-leguminous potato plants. The […] maize plant, and similarly observed that two distinct strains of Rhizobium clustered together.

Table 2: Physicochemical properties of nifH protein of Rhizobium strains

[…] whereas its net charge is zero, indicating the stable and compact state of the protein.

Table 3: Amino acid composition of nifH protein of Rhizobium strains

NUST Journal of Natural Sciences, Vol. 8, Issue 1, 2023
[…] associated nif genes, including nifN, nifE, nifK, nifD, nifX and nifB, within this network are major elements of the nif operon and play direct roles in the process of N-fixation [62, 63]. Thus, it can be concluded that nifH protein of Rhizobium […] the quaternary structure of its nifH protein.