Unraveling the role of salt-sensitivity genes in obesity with integrated network biology and co-expression analysis

Obesity is a multifactorial disease caused by complex interactions between genes and dietary factors. A salt-rich diet is related to the development and progression of several chronic diseases, including obesity. However, the molecular basis of how salt sensitivity genes (SSGs) contribute to adiposity in obesity patients remains unexplored. In this study, we used microarray expression data from visceral adipose tissue samples and constructed a complex protein-interaction network of salt sensitivity genes and their co-expressed genes to trace the molecular pathways connected to obesity. The Salt Sensitivity Protein Interaction Network (SSPIN) of 2691 differentially expressed genes and their 15474 interactions showed that adipose tissues are enriched with the expression of 23 SSGs, 16 hubs and 84 bottlenecks (p = 2.52 × 10⁻¹⁶) involved in diverse molecular pathways connected to adiposity. Fifteen of these 23 SSGs, along with 8 other SSGs, showed co-expression with enriched obesity-related genes (r ≥ 0.8). These SSGs and their co-expression partners are involved in diverse metabolic pathways, including adipogenesis, the adipocytokine signaling pathway and the renin-angiotensin system. This study concludes that SSGs could act as molecular signatures for tracing the basis of adipogenesis among obese patients. Integrated network-centered methods may accelerate the identification of new molecular targets from complex obesity genomics data.


Introduction
Obesity, an excessive accumulation of body fat, is a major risk factor for the development of diverse chronic diseases such as impaired insulin metabolism, glycemic abnormalities, hypertension and cardiovascular diseases. Owing to its complex multifactorial nature, obesity challenges not only the molecular scientists trying to decode its molecular basis but also the clinicians involved in its treatment, prevention and management. Approximately 30% of the world population is either overweight or obese [1]. So far, the specific molecular and cellular mechanisms through which environmental factors increase the risk of developing obesity in genetically susceptible individuals remain a mystery. Chronic low-grade inflammation in different tissues is one of the characteristic features of obesity [2]. In particular, chronic inflammatory reactions which take place in adipose tissues contribute to obesity-associated insulin insensitivity. Adipose tissue plays an important role in the development of metabolic diseases owing to the dysregulated release of adipocytokines from adipocytes in the visceral fat of obese individuals. This subsequently induces insulin resistance in muscles and liver. The faulty insulin sensitivity of adipose tissues connects obesity with other chronic diseases like diabetes, hyperlipidemia, arthritis, hypertension, cardiovascular disease, ischemic stroke, hyperglycemia and different types of cancer [3], [4].
The importance of excess salt intake in the pathogenesis of metabolic diseases is widely recognized. Salt sensitivity is a physiological trait in which changes in salt intake are paralleled by changes in blood pressure [5]. The expression status of salt sensitivity genes (SSGs) in adipose tissues is not yet well explored. In the present study, we focused on SSGs expressed in adipose tissues to determine their role in the pathogenesis of obesity. We considered genes from the renin-angiotensin system pathway, which maintains the homeostasis of salt and body fluids and regulates blood pressure [6]. In addition, expression of the renin-angiotensin system in adipose tissue is involved in the regulation of triglyceride accumulation, adipocyte formation, glucose metabolism, lipolysis, and the initiation of the adverse metabolic consequences of obesity [7], [8]. Therefore, to identify candidate SSGs and their molecular signature networks connected to the pathogenesis of obesity, gene expression datasets collected from visceral adipose tissues were analyzed by knowledge-based systemic investigations and statistical methods. We used graph-theoretic parameters to pick biomarkers from the gene expression data. We also used gene-gene correlation, which relies on the fact that disease candidate genes showing similar expression patterns are more likely to interact with one another in their biological functioning [9]. Our integrated network biology investigation offers novel associations with potential biological insights and supports future translational assessment of SSGs and obesity.

Gene expression dataset
The microarray gene expression dataset with the reference ID GSE88837 was collected from the GEO (Gene Expression Omnibus) database [10]. This gene expression data was generated on the Affymetrix microarray platform using total RNA extracted from human visceral adipose tissue of 16 overweight adolescent women (BMI > 25) and 14 lean adolescent women (BMI < 25). Complete information about the individuals and testing methods can be found in S1 Table.

Normalization of gene expression data
Gene expression data analysis of the samples was implemented by means of R packages [11], [12]. For standardization and noise reduction in the probe data, CEL files were imported into the R package Affy, and the unprocessed signal intensity values of each gene expression probe set were standardized with the help of a statistical algorithm called RMA (Robust Multi-array Average).

Construction of subnetwork
The complex Protein Interaction Network (PIN) was reduced to a significant subnetwork, the Salt Sensitivity Protein Interaction Network (SSPIN), by following accepted notions in network biology. From the PIN, we extracted genes that are (a) hubs based on degree centrality (DC), (b) bottlenecks based on betweenness centrality (BC), or (c) salt sensitivity genes. The PIN created from Bisogenet was optimized and imported into Cytoscape 3.2.1 in order to represent and measure network centrality parameters such as DC and BC for each individual protein in the biological network [20]. The NetworkAnalyzer [21] Cytoscape plugin was used to compute the network's local and global centrality parameters [14], [22], [23], [24].

Selection of hub proteins
The DC of a gene is the number of partners connected to that specific gene. Genes which show higher DC in a given biological network possess many interacting partners [25]. In the PIN, genes having higher DC correspond to essential genes. To identify the hubs, we followed the hub classification approach previously described by Rakshit et al. [26]. The DC cut-off used to select hub proteins is defined in terms of Avg and SD, where Avg is the average DC of significantly expressed genes in the PIN and SD denotes the standard deviation [26].
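The hub-selection step described above can be sketched in Python. This is a minimal illustration, not the authors' implementation: the toy edge list, the function names `degree_centrality` and `select_hubs`, the use of the population standard deviation, and in particular the multiplier `k` in a threshold of the form Avg + k × SD are all assumptions, because the exact cut-off expression is not reproduced in the text.

```python
from collections import defaultdict
from statistics import mean, pstdev


def degree_centrality(edges):
    """Count distinct interaction partners per gene (self-loops ignored)."""
    partners = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue
        partners[a].add(b)
        partners[b].add(a)
    return {gene: len(p) for gene, p in partners.items()}


def select_hubs(degrees, k=2.0):
    """Return (hubs, cutoff) for a threshold of the form Avg + k * SD.

    k is a hypothetical multiplier; substitute the value from the hub
    classification scheme that is actually being followed.
    """
    values = list(degrees.values())
    cutoff = mean(values) + k * pstdev(values)
    hubs = {gene for gene, dc in degrees.items() if dc > cutoff}
    return hubs, cutoff


# Toy network with hypothetical gene names: G0 interacts with nine others.
toy_edges = [("G0", f"G{i}") for i in range(1, 10)] + [("G1", "G2")]
toy_degrees = degree_centrality(toy_edges)
hubs, cutoff = select_hubs(toy_degrees)
```

In the toy network only `G0` exceeds the cut-off. With real data, `toy_edges` would be replaced by the edge list exported from the PIN.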

Identification of bottlenecks
A higher DC corresponds to biologically essential genes, but DC alone cannot quantify the significance of a gene in the network [27]. Because DC is a local property of a protein, it does not assess the protein's global value in the network. Several other key indicators can show the importance of a protein based on its global significance. A global BC measure was therefore implemented to determine the characteristics of any query gene at the level of the entire interactome [28]. BC is measured by applying the following formula:

$$BC(n) = \sum_{s \neq n \neq t} \frac{\sigma_{st}(n)}{\sigma_{st}}$$

where $s$ and $t$ are network nodes other than $n$, $\sigma_{st}$ is the number of shortest paths from $s$ to $t$, and $\sigma_{st}(n)$ is the number of shortest paths from $s$ to $t$ that $n$ lies upon [29]. The significantly expressed genes falling in the top 25% of the node betweenness distribution are regarded as bottlenecks.
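To make the bottleneck criterion concrete, the sketch below computes betweenness centrality with Brandes' algorithm on an unweighted, undirected graph and then keeps the top quarter of nodes. It is an illustration under stated assumptions: the study used the NetworkAnalyzer plugin, which may apply a normalization not reproduced here; the function names, the counting of each unordered pair (s, t) once, and the rounding rule used for "top 25%" are hypothetical choices.

```python
from collections import defaultdict, deque


def betweenness_centrality(edges):
    """Brandes' algorithm for an unweighted, undirected graph.

    For each node n, returns the sum over unordered pairs (s, t), with
    s != n != t, of sigma_st(n) / sigma_st (no normalization applied).
    """
    adj = defaultdict(set)
    for a, b in edges:
        if a != b:
            adj[a].add(b)
            adj[b].add(a)

    bc = dict.fromkeys(adj, 0.0)
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        order = []
        preds = defaultdict(list)
        sigma = dict.fromkeys(adj, 0)
        dist = dict.fromkeys(adj, -1)
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Back-propagate pair dependencies, farthest nodes first.
        delta = dict.fromkeys(adj, 0.0)
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was visited once from each endpoint.
    return {n: score / 2.0 for n, score in bc.items()}


def select_bottlenecks(bc, top_fraction=0.25):
    """Return the nodes ranked in the top `top_fraction` by betweenness."""
    ranked = sorted(bc, key=bc.get, reverse=True)
    keep = max(1, int(len(ranked) * top_fraction))
    return set(ranked[:keep])


# Toy path graph A - B - C - D - E (hypothetical node names).
toy_bc = betweenness_centrality([("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")])
toy_bottlenecks = select_bottlenecks(toy_bc)
```

In the toy path graph the middle node `C` lies on the most shortest paths and is the only node retained, which mirrors the idea that bottlenecks connect otherwise separate parts of a network.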

Salt sensitivity genes
The genes involved in the renin-angiotensin-aldosterone system pathway were collected, as they serve as chief components in the regulation of the salt and water balance of the body [30]. We also collected salt sensitivity genes from a detailed literature survey [31], [5], [32], [33]. In total, we obtained 47 SSGs, as listed in S2 Table.

Mapping of weighted gene-gene correlations
The map detailing gene-gene correlations was created by computing the Pearson correlation across the entire gene set in the SSPIN. The r value indicating the correlation between gene pairs in the expression data was generated with Pearson's correlation coefficient (PCC). The formula used for calculating the PCC for gene pairs is given in Formula 3.
$$\mathrm{PCC}(r) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}$$
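The correlation formula can be written out directly in a few lines of Python. This is a minimal sketch for illustration only: the function name `pearson_r` and the expression values are hypothetical, and the study itself does not state which software computed the coefficients.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (>= 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    ss_x = sum((xi - mean_x) ** 2 for xi in x)
    ss_y = sum((yi - mean_y) ** 2 for yi in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("a constant profile has no defined correlation")
    return cov / (sqrt(ss_x) * sqrt(ss_y))


# Hypothetical expression values for three genes across five samples.
gene_a = [1.0, 2.0, 3.0, 4.0, 5.0]
gene_b = [2.0, 4.0, 6.0, 8.0, 10.0]
gene_c = [5.0, 4.0, 3.0, 2.0, 1.0]
r_ab = pearson_r(gene_a, gene_b)  # perfectly co-varying profiles
r_ac = pearson_r(gene_a, gene_c)  # perfectly opposed profiles
```

Here `x` and `y` are the expression values of two genes across the same n samples, matching the roles of x_i and y_i in the formula.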

Functional enrichment analysis
Functional enrichment analysis validates the physiological importance of the genes involved in a biological process and helps to reveal unintended gene activity. ToppGene Suite was employed to perform functional enrichment of the filtered genes [34].
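As a rough illustration of what an enrichment score measures, the sketch below computes a hypergeometric tail probability for the overlap between a query gene list and the genes annotated to one term. This is an assumption made for illustration: the text does not state which statistic or multiple-testing correction was applied in ToppGene, and the function name and all counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(background, annotated, selected, overlap):
    """P(X >= overlap) when `selected` genes are drawn without replacement
    from `background` genes, of which `annotated` carry the term."""
    upper = min(annotated, selected)
    total = comb(background, selected)
    tail = sum(
        comb(annotated, k) * comb(background - annotated, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20 background genes, 5 annotated to a pathway,
# 4 genes in the query list, 3 of which are annotated.
p_value = hypergeom_enrichment_p(background=20, annotated=5, selected=4, overlap=3)
```

A small value indicates that the observed overlap would be unlikely if the query genes were drawn at random from the background.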

Microarray gene expression profile analysis
We obtained 2691 significant genes from the analysis of raw gene expression signals using RMA, with a statistical significance of p value ≤ 0.05. The intensity values of genes in the expression profiles, before and after normalization, are depicted as box plots, a standardized way of representing the data distribution, in Fig 1.

Constructed Protein Interaction Network
In total, the 2691 differentially expressed genes obtained from the microarray expression profile were submitted to Bisogenet, a Cytoscape plugin, to create the PIN by extracting all potential connectivity between the genes. The created PIN contained artifacts such as replicated edges and self-loops. The PIN was transformed into a stable network by eliminating self-loops and replicated edges, and was then used to calculate the standardized graph centrality parameters for each gene. The plugin created a complex PIN consisting of 2691 nodes and 15474 edges, with an average edge-node ratio of 5.75. Next, the NetworkAnalyzer plugin calculated the degree centrality and betweenness centrality parameters of the network, which are considered local and global graph parameters respectively [21]. Table 1 describes the top 10 significant genes ranked by degree centrality, along with general centrality parameters.
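The clean-up of self-loops and replicated edges, and the edge-node ratio quoted above, can be sketched as follows. The function name `simplify_network` and the toy edge list are hypothetical; in the study this step was carried out inside Cytoscape rather than in code like this.

```python
def simplify_network(raw_edges):
    """Drop self-loops and collapse duplicate undirected edges."""
    unique = set()
    for a, b in raw_edges:
        if a == b:
            continue  # self-loop
        unique.add(frozenset((a, b)))  # (a, b) and (b, a) are the same edge
    edges = sorted(tuple(sorted(e)) for e in unique)
    nodes = {n for e in edges for n in e}
    return nodes, edges


# Hypothetical raw edge list with one self-loop and one duplicate.
raw = [("G1", "G2"), ("G2", "G1"), ("G2", "G2"), ("G2", "G3"), ("G1", "G3")]
nodes, edges = simplify_network(raw)
toy_ratio = len(edges) / len(nodes)

# The same ratio for the node and edge counts reported in the text.
reported_ratio = round(15474 / 2691, 2)
```

For the reported counts, 15474 edges over 2691 nodes gives a ratio of 5.75, in agreement with the figure stated for the PIN.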

Salt Sensitivity Protein Interaction Network (SSPIN)
PIN genes have been grouped into hubs and bottlenecks based on criteria of graph centrality to establish a large network of protein interactions. The cut-off limit for hubs and bottlenecks

Functional enrichment analysis
We used the ToppGene computational annotation system to determine the functional and biological importance of the genes. The genes of HBS have been enriched by 2192 biological  Table 3).
These genes were also involved in pathways associated with obesity, such as regulation of lipolysis in adipocytes, adipogenesis, adipocytokine signaling pathway, renin-angiotensin system, signaling by leptin, toll-like receptor pathway, PI3K-Akt signaling pathway, Ras signaling pathway, cytokine signaling in immune system, insulin pathway, glucocorticoid receptor regulatory network and NF-kappa B signaling pathway (Table 4). The detailed list of genes involved in these pathways is given in S3 Table. The enriched genes were also involved in obesity-related diseases such as Diabetes Mellitus, Hypertensive disease, Asthma, Autoimmune Diseases, Diabetes Mellitus (Insulin-Dependent), Congestive heart failure, Cardiovascular Diseases, Coronary Artery Disease, Heart failure, Coronary heart disease, Coronary Arteriosclerosis, Depressive disorder, Hyperglycemia, Metabolic Syndrome X, Essential Hypertension, Ischemic stroke and Hyperlipidemia. Obesity is one of the leading causes of the aforesaid diseases. The interaction map of genes to the diseases is depicted in the

Co-expression analysis
The expression pattern similarity between the 574 HBS genes was established and ranked using Pearson's correlation algorithm (Fig 3) for the arrays of control and disease samples. For control and disease samples, the algorithm (Formula 3) computed the PCC for 328329 gene pairs from the 574 genes. Gene pairs were screened based on established concepts: i) gene pairs whose expression levels show a high positive correlation are retained; ii) genes with similar patterns of expression are more likely to interact. For the obesity samples, gene pairs with r ≥ 0.8 were chosen from the correlation map, as a higher r score indicates a stronger relationship. The corresponding gene pairs were extracted from the normal correlation map to identify the variation in co-expression between obese and normal samples. In total, 226 genes were observed to co-express with obesity-related genes, with 1126 interactions in the obese condition (Fig 4). The set contained 88 obesity-related genes and 23 SSGs that were co-expressed in samples of obese adipose tissue. We focused on the 23 SSGs found to be co-expressed with obesity-related genes.
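The screening procedure above (correlate every gene pair in the obese samples, keep pairs at or above the threshold, then look up the same pairs in the normal samples) can be sketched in Python. The gene names, expression values and function names are hypothetical, and the sketch assumes the threshold of 0.8 is applied to the obese samples only.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


def correlation_map(expression):
    """Pearson r for every unordered gene pair in {gene: [value per sample]}."""
    return {
        (g1, g2): pearson_r(expression[g1], expression[g2])
        for g1, g2 in combinations(sorted(expression), 2)
    }


def disease_specific_pairs(disease_expr, control_expr, threshold=0.8):
    """Pairs with r >= threshold in disease samples, with their control r."""
    disease_map = correlation_map(disease_expr)
    control_map = correlation_map(control_expr)
    return {
        pair: (r, control_map[pair])
        for pair, r in disease_map.items()
        if r >= threshold
    }


# Hypothetical expression values (three genes, four samples per condition).
obese = {"SSG1": [1, 2, 3, 4], "OB1": [2, 4, 6, 9], "OB2": [4, 1, 3, 2]}
lean = {"SSG1": [1, 2, 3, 4], "OB1": [3, 1, 4, 2], "OB2": [2, 3, 1, 4]}
selected = disease_specific_pairs(obese, lean)
```

In this toy example only the pair `("OB1", "SSG1")` passes the threshold in the obese samples, while the same pair is uncorrelated in the lean samples; such pairs are the ones treated as differentially co-expressed.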
By performing co-expression analysis, we obtained 23 SSGs co-expressed with obesity-related genes. Eight of the 23 co-expressed genes were not previously reported for the disease obesity via gene ontology analysis (Table 5).
Table 5. ACE, ACE2, ADD1, ADRB2, AGT, AGTR1, AGTR2, ANPEP, ATP6AP2, CMA1, CYP17A1, GNB3, GRK4, KLK1, LNPEP, MAS1, MME, NEDD4L, PRCP, PRKG1
We developed an interaction map of the unreported SSGs with obesity-related genes (Fig 5), taking their correlation score as the edge weight (Table 6). We extracted the edge weight of gene pairs in both obese and normal samples to identify distinct variations between the two conditions. This was done because differentially co-expressed genes participate in numerous biological processes, resulting in adverse or complementary effects. The plot clearly shows that the majority of the genes co-expressed in the obese condition are not co-expressed in the normal condition. Considering them as a disease subnetwork, we calculated the local topological parameters based on graph theory. Among the 8 unreported SSGs, the gene most highly connected with obesity-related genes is ENPEP, followed by WNK1. These two genes had 21 and 20 direct connections, respectively, to obesity-related genes in the co-expressed state. The SSGs THOP1, CLCNKB and SCNN1G had poor connectivity in the disease subnetwork. Notably, CYP3A5 and CTSA formed two separate networks, with 6 and 3 connections, respectively, to obesity-related genes. The interactions of the unreported SSGs with obesity-related genes were separated and are depicted in Fig 6. We narrowed down the unreported SSGs to 5 prioritized genes (ENPEP, WNK1, CYP3A5, SLC24A3 and CTSA) based on their co-expression and topological parameters. An attempt was made to associate the novel genes found in this study with genome-wide association studies on many disease traits from around the world in the GWAS catalog (MacArthur et al., 2016).
We extracted the reported traits of these co-expressed genes from the GWAS catalog to identify their association with obesity (Table 7). Many of these traits were related to obesity or its associated traits in cardiovascular or metabolic diseases.

Discussion
Traditional gene profiling approaches are based on detecting individual target genes that vary between the experimental group and the control group. However, the mere identification of differentially expressed genes cannot always help in understanding the regulation of the biological pathways (metabolism, transcription, and gene interactions, among others) involved in disease pathogenesis [35]. This is especially true for multifaceted or complex disorders like obesity, which do not progress because of instabilities in a single gene but because of changes in several pathways comprising various biological networks [14]. In the current study, we applied the concepts of gene regulatory networks in order to profile the significant variations of salt-sensitive genes involved in obesity.
The local parameter DC and the global parameter BC were used to dissect the complex interactome. The DC of a gene is the number of partners connected to that specific gene. Protein Interaction Networks (PINs) are mathematical representations of physical and/or functional interactions between nodes, where the nodes are genes and the edges represent the connections between them, which may be binding, metabolic interaction or regulatory crosstalk [36]. In our PIN, significant alterations were observed in the expression levels of the selected genes in our experimental settings. Initially, a complex network of significant genes from adipose tissue was constructed, which was further decomposed into a Salt Sensitivity Protein Interaction Network based on hubs and bottlenecks. Hubs are considered key features of networks because they form critical intersections whose removal disturbs the network [37]. In the constructed PIN, highly essential genes show a high degree of connectivity. Several publications suggest that disease genes have higher connectivity and cross-talk than non-disease genes, which supports the impact of hubs in the network [38]. We obtained 40 hubs with an average connectivity of 208 edges. The enrichment analysis revealed that 16 hub genes were involved in obesity and 13 hubs were involved in Type 2 Diabetes, which is closely related to obesity. Thus, the identification of hub molecules in the PIN is of substantial interest for gaining better insight into disease pathogenesis. On the other hand, functionally relevant vertices (nodes) in the network were detected using betweenness centrality (BC). This approach helped to identify vertices linking dense networks, rather than nodes located inside a dense cluster [28]. Functional enrichment analysis identified 84 bottleneck genes involved in obesity.
Unintended interactions between genes may lead to deregulated functions. Hence, to better understand gene function in a cellular context, we need to understand how genes are interconnected within biological processes and molecular signaling pathways. This type of structural and functional interactome can be created by evaluating the functional features of the genes. Therefore, gene enrichment analysis is a vital part of exploring the high-throughput data extracted from different biological observations and experiments. This methodology helps to discover non-predefined interactions between functional genes that significantly regulate different biological processes. Gene ontology analysis showed the involvement of 125 genes in obesity; 24 of them were SSGs, accounting for 50 percent of the total SSGs. These findings signify the critical role of SSGs in obesity. To explore the salt-related genes further, co-expression analysis of obesity-related genes in adipose tissue was carried out. By performing co-expression analysis, we obtained 23 SSGs co-expressed with obesity-related genes. Eight of the 23 co-expressed genes were not previously reported for the disease obesity via gene ontology analysis. Gene co-expression can be explained by the fact that genes showing similar regulation or expression patterns are more frequently interconnected with each other than with arbitrary genes [9]. The interaction map of the unreported SSGs with obesity-related genes showed stronger interactions in the disease state. The plot (Fig 5) clearly shows that the majority of the genes co-expressed in the obese condition are not co-expressed in the normal condition.
The novel obesity-associated SSGs and their interactions support the view that differentially co-expressed genes are likely to be involved in numerous molecular processes, resulting in adverse or balancing effects [39].
An established theory in network biology is that disease-related genes in close physical proximity are most likely to cause diseases with a similar molecular basis. In addition, in a network of disease genes, the non-disease genes have a higher tendency to interact with other disease genes [40]. Considering this theory, we looked into the diseases and pathways related to the prioritized genes among the unreported SSGs. WNK1 and ENPEP act as central hubs in the network, with a high number of co-expressed partners. In the functional enrichment data, the gene WNK1 is reported in diseases like Diabetes Mellitus, Cardiovascular Diseases, Metabolic Syndrome X, Hyperglycemia and heart failure. These enriched diseases also show a close relationship with obesity. A recent report by Ding et al. [41] in a mouse model suggests WNK1 as a novel signaling molecule involved in the development of obesity. It suggests that a lack of Akt3 in adipocytes raises the WNK1 protein level, which in turn activates SGK1 and stimulates adipogenesis through phosphorylation and inhibition of the FOXO1 transcription factor, subsequently activating the transcription of PPARγ in adipocytes. An increase in adipocytes results in high fat accumulation and ultimately in obesity. Thus, WNK1 can act as a potential biomarker or target for controlling obesity. Additionally, at the pathway level, WNK1 is known to be a potent regulator of Na+ and Cl- ion transport, and consequently of blood pressure. Ewout et al. (2011) describe the role of WNKs in salt metabolism via the regulation of sodium, chloride, potassium and blood pressure [42]. WNKs are involved in crucial molecular pathways that connect hormones such as angiotensin II and aldosterone to sodium and potassium transport.
WNK1 is significantly involved in homeostasis and in the regulation of several biological processes, including but not limited to cell survival, proliferation and signaling. WNK1 activates the epithelial sodium channel (ENaC) subunit genes SCNN1A, SCNN1B, and SCNN1D. It is also known as an activator of SGK1. By inhibiting WNK4 activity through kinase phosphorylation, WNK1 controls Na+ and Cl- ion transport. Moreover, WNK1 acts as a switch (activation/inhibition) for the Na-K-Cl cotransporters (NKCC) [43]. ENPEP is a member of the M1 family of endopeptidases. It plays a role in the catabolic pathway of the renin-angiotensin system, which in turn is involved in the regulation of blood pressure [44]. The gene is observed in Hypertensive disease, which is closely associated with obesity. Currently, inhibition of ENPEP activity is one of the approaches used to treat hypertension. Hypertension is a growing problem affecting 40% of adults, owing to the growing prevalence of obesity and diabetes in many parts of the world [45]. In addition, a DNA methylation study in human adipose tissue revealed ENPEP as one of the differentially methylated genes associated with obesity and related traits [46]. ENPEP is found to be a candidate gene associated with obesity and hypertension traits in genome-wide association studies (GWAS). ENPEP is highly correlated with obesity-related genes and is also associated with diseases that may be comorbid conditions of obesity. Therefore, our work provides strong evidence that ENPEP is a novel gene contributing to obesity. CYP3A5 plays a role in the metabolism of many drugs and other metabolites, such as steroids. CYP3A5 is also involved in the oxidative metabolism of xenobiotics, as well as calcium channel blocking drugs and immunosuppressive drugs.
CYP3A5 is a member of the cytochrome P450 superfamily of enzymes. These proteins are monooxygenases catalyzing reactions in metabolism of drugs, cholesterol, steroids and other lipids. The main functions associated with CYP3A5 are monooxygenase activity, iron ion binding, lipid metabolism and oxidoreductase activity [47].
The potassium-dependent sodium/calcium exchanger (SLC24A3) plays an important role in intracellular calcium homeostasis. It facilitates the exchange of intracellular Ca2+ and K+ ions for extracellular sodium ions [48]. CTSA is a member of the cathepsin family, a group of lysosomal proteases that have a key role in cellular protein turnover. CTSA is not directly reported in obesity, but an analysis performed by Nadia et al. (2010) implicates the cysteine protease cathepsins S, L, and K in complications of obesity [49]. Similarly, a study by Araujo et al. (2018) reports that CTSB, a member of the cathepsin family, controls autophagy in adipocytes. In obese individuals, the expression of this gene increases, which in turn regulates inflammatory markers [50]. In our analysis, CTSA is co-expressed with obesity-related genes, suggesting a critical role in the pathway of obesity, since members of the cathepsin family play an important role in obesity. The major functions associated with CTSA are glycosphingolipid metabolism, protein transport and enzyme activator activity. In GWAS analysis, the genes CYP3A5, SLC24A3 and CTSA are observed in obesity-related diseases like Hypertensive disease, Asthma, Coronary Artery Disease, Essential Hypertension and Heart failure. We found that the gene CYP3A5 is reported as one of the loci associated with obesity-related traits in GWAS studies [51]. It is also associated with Factor VII and blood metabolite levels. A recent study by Takahashi et al. [52] reports the relationship between Factor VII and obesity. The results propose that Factor VII is an adipokine, enhanced by TNF-α or isoproterenol, which plays a crucial role in the pathogenesis of obesity.
SLC24A3 and WNK1 are mapped to traits like fat body mass and body mass index which are closely associated with obesity. This analysis of integrating GWAS studies also substantiates the possible association of novel genes identified through this study to obesity related traits and comorbidity symptoms and diseases.
We acknowledge that our strategy has some technical constraints. First, experimentally derived protein interactions were retrieved using the Bisogenet plugin. This plugin draws on multiple databases of protein-protein interactions; hence, any interaction that has not been updated in those databases may not have been included in our study. In addition, the insufficiency of data pertaining to certain genes in the Gene Ontology (GO) should also be considered. To overcome these limitations, we tried to include protein interactions based on co-expression. Overall, our analysis has demonstrated the effectiveness of linking gene expression with functional relationships in the identification of obesity candidate genes. Further experimental validation is required to demonstrate the involvement of the novel candidate genes mentioned in this study.

Conclusions
This work systematically outlines an integrated bioinformatics pipeline for identifying the most indispensable key signatures from the Salt Sensitivity Protein Interaction Network (SSPIN). The biologically relevant findings show that 50% of the SSGs have experimental evidence for their role in the pathogenesis of obesity. A detailed downstream analysis based on biological insights yielded 5 candidate genes that can act as potential biomarkers or targets for obesity. To support our results, we illustrate the possible role of ENPEP and WNK1, which appeared at the top of the prioritized list. Overall, our analysis has demonstrated the effectiveness of linking gene expression with functional relationships in the identification of obesity candidate genes.