Altered sphingolipid function in Alzheimer's disease; a gene regulatory network approach

Sphingolipids (SLs) are bioactive lipids involved in various important physiological functions. The SL pathway has been shown to be affected in several brain-related disorders, including Alzheimer's disease (AD). Recent evidence suggests that epigenetic dysregulation plays an important role in the pathogenesis of AD as well. Here, we use an integrative approach to better understand the relationship between epigenetic and transcriptomic processes in regulating SL function in the middle temporal gyrus of AD patients. Transcriptomic analysis of 252 SL-related genes, selected based on GO term annotations, from 46 AD patients and 32 healthy age-matched controls, revealed 103 differentially expressed SL-related genes in AD patients. Additionally, methylomic analysis of the same subjects revealed parallel hydroxymethylation changes in PTGIS, GBA, and ITGB2 in AD. Subsequent gene regulatory network-based analysis identified 3 candidate genes, that is, SELPLG, SPHK1 and CAV1 whose alteration holds the potential to revert the gene expression program from a diseased towards a healthy state. Together, this epigenomic and transcriptomic approach highlights the importance of SL-related genes in AD, and may provide novel biomarkers and therapeutic alternatives to traditionally investigated biological pathways in AD.


Introduction
Alzheimer's disease (AD) is the most common age-related neurodegenerative disorder, representing one of the main causes of dementia . The incidence and prevalence of AD has increased in the last 10 years representing a major challenge for the public health system and society. Currently, no therapy that can effectively halt or attenuate the disease process exists. AD is histologically characterized by the progressive overproduction and accumulation of amyloid β (A β) peptide and hyperphosphorylated tau protein that lead to the formation of extracellular senile plaques and intracellular neurofibrillary tangles, respectively ( Di Fede et al., 2018 ). These pathological changes are associated with neurotoxicity and inflammation as well as neuronal degeneration, with large downstream effects on the physiology of the central nervous system (CNS). The complex pathogenesis of AD is still not fully understood, but there is a growing body of evidence suggesting that genetic and environmental factors are involved in the development and progression of the disease and associated cognitive impairment.
A balanced lipid metabolism is essential for normal brain function, while dysfunction may contribute to neurodegeneration. While the link between aberrant lipid metabolism and AD was disclosed already in 1906 by Alois Alzheimer, its role in the pathophysiology of neurodegeneration gained more interest in 1993 when a higher risk for developing AD was found among those carrying the cholesterol transporter APOE type 4 allele ( Corder et al., 1993 ). Later, alterations in the balance of certain membrane and structural lipids such as sphingolipids (SLs) and ceramides were demonstrated to play a crucial role in AD. For instance, high serum ceramide levels in cognitively normal elderly individuals have been associated with an increased risk of developing cognitive impairment and subsequent AD. ( Mielke et al., 2012 ). SLs are ubiquitous structural lipids in cellular membranes and also potent regulators of critical biological processes. In the brain, SLs are abundantly present in different cell types, including neurons and glia. To guarantee optimal neuronal function, the balance of SLs and associated metabolites is tightly regulated. Alterations of this balance may contribute to the development of disease ( Crivelli et al., 2020 ;Olsen and Faergeman 2017a ).
Interestingly, different metabolic and lipidomic analyses have shown a positive association between SL metabolites and A β and tau in cerebrospinal fluid samples of healthy individuals with a familial history of AD ( Mielke et al., 2014 ). These analyses have strengthened the notion that defects in SL metabolism correlate with A β and tau levels. Additionally, there is a large body of evidence that gangliosides, a class of glycosylated SLs, contribute directly and indirectly to the initiation and progression of AD by facilitating plaque formation ( Olsen and Faergeman, 2017a ). Despite the fact that several studies are clearly showing an effective association between dysfunction of SL metabolism and AD, the specific molecular pathways driving these alterations still remain unclear. To this aim, a better understanding of the relationship between epigenetic and transcriptomic processes in regulating SL function is of utmost importance for elucidating the underlying role of SLs in the pathophysiology of AD and the potential development of novel SL-targeted AD therapeutics.
In the present study, we examined SL-related genes from an epigenetic-transcriptional point of view, to further understand the involvement of downstream SL (dys)function in AD. Accordingly, the main aim was to identify if, and if so, to which extent SL genes are dysregulated at the methylomic and transcriptomic levels in brain tissue from AD patients. For this purpose, we first identified a set of 252 SL-associated genes based on manually selected Gene Ontology (GO) terms. The samples investigated in the current study represent a subset of data reported in Lardenoije et al. (2019) , which passed our quality checking control, taking into account both the data on gene expression and DNA (hydroxy)methylation. Transcriptomic analyses showed a profound enrichment of SL-related differentially expressed genes in AD brains. Among those, the conducted epigenetic data analysis revealed PTGIS, GBA , and ITGB2 to be differentially hydroxymethylated, reflecting a significant overrepresentation (Fisher's exact test, P-value < 1.09e-06). Furthermore, to evaluate how SLs influence the disease, we performed a Gene Regulatory Network (GRN) analysis. The reconstructed phenotype-specific networks were employed for in-silico perturbation analysis and identified SELPLG, SPHK1 and CAV1 to be the most influential gene combination in the AD network. Taken together, these findings confirmed the initial hypothesis that SL metabolism is significantly altered in AD. Furthermore, the identification of dysregulated SL-related genes and systematic dissection of their downstream effects by in-silico network perturbation analysis revealed the potential of this approach to identify diagnostic biomarkers as well as aid in the development of novel SL-targeted AD therapeutics.

Identification of sphingolipid pathway-associated genes
The borders for classifying 'sphingolipid related genes' are not strict, which is mainly due to the lack of a clear classification of genes belonging to this metabolic pathway in the literature, as well as to lack of absolute boundaries between cellular processes in general. Hence, as an unbiased approach, we selected relevant manually curated Gene Ontology (GO) terms related to SL metabolism as provided in Supplementary file 1, including key terms such as 'sphingoid metabolic process' and 'sphingolipid transporter activity'. In addition, we included terms related to core SL-associated functions, such as caveola and membrane raft processing, in order to characterize the changes observed in those biological processes most directly related to SL function in our AD datasets. The reasoning behind the inclusion of 'caveola' is based on existing evidence in the literature which, amongst others, suggests that caveolin-1 (CAV1) deficiency results in altered cellular lipid composition, and plasma membrane (PM) phosphatidylserine distribution in CAV1-deficient cells ( Ariotti et al., 2014 ). We also included GO terms related to lipid rafts because they are enriched with sterols such as SLs (e.g., sphingomyelin and glycosphingolipids) and cholesterol, and they are associated with specific raft proteins ( Bieberich, 2018 ).
After selection of terms, the genes connected to these terms were extracted by using the WikiPathways plugin for PathVisio, which allowed to save all elements connected to a GO term of interest in an xml type file format (gpml format) ( Slenter et al., 2018 ) ( Kutmon et al., 2015 ). This plugin requires the GO ontology file ("go.obo") ( Bieberich, 2018 ) geneontology.org; downloaded Nov. 17th, 2018) and a bridgeDb file with gene identifier mappings ('Hs_Derby_Ensembl_91.bridge'from www.pathvisio. org in this case) ( van Iersel et al., 2010 ). Thereafter, an R script was used to extract all contributing genes (as identified by their HGNC symbols) from the gpml files for each term. Subsequently, all information per gene was combined, by merging all GO terms from the selection by which the gene is annotated (Supplementary file 2). Furthermore, a basic 'tree-like' textual display of the selected terms was generated highlighting which sub-term(s) fall(s) under which exact master-term(s). For example, the master SL term 'sphingolipid metabolic process' (GO:0 0 0 6 6 65) contains multiple sub-terms such as 'glycosylceramide metabolic process' (GO:0 0 0 6 677), which, in turn, contains various sub-terms like 'ganglioside metabolic process' (GO:0 0 01573). For further details of this hierarchy, please see the Supplementary files 3, 4, and 5. In conclusion, this procedure resulted in a gene set consisting of 252 SL-related genes that were assessed in downstream applications.

Postmortem brain tissues
This study made use of brain tissue from donors of the Brain and Body Donation Program (BBDP) at the Banner Sun Health Research Institute (BHSRI), who signed an informed consent form approved by the institutional review board, including specific consent of using the donated tissue for future research ( Beach et al., 2008 ). DNA was obtained from the middle temporal gyrus (MTG) of 46 AD patients and 32 neurologically normal control BBDP donors stored at the Brain and Tissue Bank of the BSHRI (Sun City, Arizona, USA) ( Beach et al., 2008 ) ( Beach et al., 2015 ). The groups were matched for age, gender, and APOE genotype. There were 38 male and 40 female samples with an average age of 85 years. The average Braak score for the considered samples was 3.85 with most of the samples belonging to Braak stages 3 and 5, 19 and 17 samples, respectively. The organization of the BBDP allows for fast tissue recovery after death, resulting in an average post-mortem interval of only 2.8 hours for the included samples. A consensus diagnosis of AD or non-demented control was reached by following National Institutes of Health (NIH) AD Center criteria ( Beach et al., 2008 ). Comorbidity with any other type of dementia, cerebrovascular disorders, mild cognitive impairment (MCI), and presence of non-microscopic infarcts was applied as exclusion criteria. Detailed information about the BBDP has been reported elsewhere ( Beach et al., 2008 ) ( Beach et al., 2015 ). The current analysis was performed on the only dataset around to date that includes both data on mRNA expression and levels of UC, 5-mC and 5-hmC. The uniqueness of this work therefore lies in the parallel analysis of data from several different, but interdependent layers of gene regulation extracted from the same AD and control post-mortem brain samples. A table summarizing the demographic characteristics for control and AD samples has been provided in the Supplementary  Table S7.

Differential gene expression analysis
For differential gene expression analysis, Illumina HumanHT-12 v4 beadchip expression array data for the same MTG samples was obtained from a recently published study ( Piras et al., 2019 ). Preprocessing and analysis of the raw datasets was conducted in R (version 3.4.4) ( Team, 2016 ). Raw expression data was log-transformed and quantile-quantile normalized. For computing the cell type composition, the Neun_pos (Neuronal positive) cell percentage was calculated from the methylation data. The same regression model used for assessing methylation was applied to the expression data where the effects of age, gender and cell type composition were regressed out using limma (version 3.32.10). The nominal P-values obtained from limma were FDR-adjusted for only the set of 252 genes in the SL pathway and only the genes with adj. P-value < 0.05 were considered significantly differentially expressed.

Differential (hydroxy)methylation analysis
For assessing differential DNA methylation (5-methylcytosine, 5mC), hydroxymethylation (5hydroxymethylcytocine, 5hmC) and levels of unmodified cytosine (uC), data was obtained from a recently published study from our group ( Lardenoije et al., 2019 ), where Illumina HM 450K arrays were used for quantifying DNA (hydroxy)methylation status of 485,0 0 0 different human CpG sites. We only considered methylation datasets related to 46 AD patients and 32 controls for which the corresponding gene expression profiles were also available. Preprocessing and analysis of the raw datasets was conducted in R (version 3.4.4) ( Team, 2016 ). Raw IDAT files corresponding to the selected individuals were read into R using the wateRmelon "readEpic" function (version 1.20.3) ( Pidsley et al., 2013 ). The "pfilter" function from the wateRmelon package (version 1.18.0) ( Pidsley et al., 2013 ) was used to filter datasets based on bead count and detection p-values. Background correction and normalization of the remaining probe data was performed by using the "preprocessNoob" function of minfi package (version 1.22.1) ( Aryee et al., 2014 ). Beta values for the probes were obtained by the "getBeta" function of the minfi package. We used the MLML function within the MLML2R package ( Kiihl et al., 2019 ) for estimating the proportion of uC, 5mC and 5hmC for each CG site (CpG), based on the combined input signals from the bisulfite (BS) and oxidative BS (oxBS) arrays. All of the cross-hybridizing probes and the probes that contained a SNP in the sequence were removed resulting into 407,922 probes to be considered for the differential methylation analysis ( Chen et al., 2013 ). Raw IDAT files corresponding to the selected individuals were loaded into R using the minfi "read.metharray" function (version 1.22.1) ( Aryee et al., 2014 ) to generate an RGset for computing the cell type composition of the samples by using the "estimateCellCounts" function of the same package. For estimating the cell composition, we used the FlowSorted.DLPFC.450k package (version 1.18.0) [13] as the reference data for "NeuN_pos" cell composition within the frontal cortex. The limma package (version 3.32.10) ( Ritchie et al., 2015 ) was used to perform linear regression in order to test the relationship between the beta values of the probes and the diagnosis of AD. The used regression model considered beta values as outcome, AD diagnosis as predictor, and age, gender, and neuronal cell proportion as covariates. In order to identify significantly differentially methylated probes (DMPs), FDR correction for multiple testing was applied, where unadjusted P -values were corrected for those 103 genes that were significantly differentially expressed and belong to the SL pathway. Illumina human UCSC annotation was used for assigning methylation probes to the HGNC gene symbols.

Correlation between methylation state and gene expression
For those CpG sites and associated genes that showed significant differences in (hydroxy)methylation and gene expression levels in AD patients and controls, we assessed whether there was a significant association between the normalized (hydrox)methylation beta values and corresponding gene expression levels, across all samples. For this purpose, we used the "cor.test" function from the base R package to calculate the Spearman correlation coefficient for the paired samples.

Gene-gene interaction network Gene Regulatory Network (GRN) reconstruction
We used the software Pathway Studio ( Nikitin et al., 2003 ) to obtain directed functional interactions between the genes belonging to SL associated pathway. The ResNet Mammalian database in Pathway Studio contains a collection of literature-curated and experimentally validated directed gene-gene interactions. The high level of literature curation ensures the creation of confident interaction network maps. In order to obtain a set of functional regulatory interactions among the selected genes, our analysis was restricted to interactions belonging to categories "Expression", "Regulation", "Direct Regulation", "Promoter Binding", and "Binding". The obtained interactions are directed, that is, the source and target genes are known. Furthermore, information about the interaction type (activation or inhibition) is taken into consideration, if available.
In order to reconstruct phenotype-specific networks for disease (AD) and healthy phenotypes, we employed an in-house developed differential GRN reconstruction approach ( Zickenrott et al., 2016 ). Briefly, this tool relies on a genetic algorithm for removing interactions that are not compatible with Booleanized gene expression states of the disease and control phenotypes. As some of the interactions retrieved from Pathway Studio have an unspecified effect, that is, information on the activating or inhibitory consequence of the interaction is missing, the tool also infers missing regulatory effect data from the given gene expression and network phenotype under consideration. Here, we used the set of significantly differentially expressed SL genes and regulatory interactions between them obtained from Pathway Studio, to reconstruct differential networks with stable steady states representing the disease and healthy phenotypes.

In-silico network simulation analysis for phenotypic reversion
The differential network topology allowed us to identify common and phenotype-specific positive and negative elementary circuits, that is, network paths which start and end at the same node and with all the intermediate nodes being traversed only once. These circuits have been shown to play a significant role in maintaining network stability ( Gouzé, 2003 ) and the existence of these circuits is considered to be a necessary condition for having stable steady states ( Thomas, 2011 ). Regarding the biological relevance of these circuits, it has been shown that perturbation of genes in positive circuits induces a phenotypic transition ( Crespo et al., 2013 ). Furthermore, the differential network topology also aids in identifying differential regulators of the genes common to both phenotype-specific networks. Altogether, the differential regulators and genes in the elementary circuits constitute an optimal set of candidate genes for network perturbation as they are predicted to be able to revert most of the gene expression program upon perturbation. Identification of network perturbation candidates was carried out using the Java implementation proposed by Zickenrott and colleagues ( Zickenrott et al., 2016 ). The same software was used to perform a network simulation analysis by perturbing multi-target combinations of up to 3 network perturbation candidate genes. As a result, a ranked list of single-and multi-gene combinations (maximally 3 genes) is obtained, and scores for each combination, which represent the number of other genes in the network whose expression is predicted to be reverted through the chosen perturbation genes. Generally, a high score for a single-or multi-gene perturbation is indicative of its ability to regulate the expression of a large subset of downstream genes, hence playing a key role in the maintenance and stability of the phenotype under consideration.

Transcriptome analysis of sphingolipid genes
In order to identify SL-associated genes, we used the chosen gene ontology (GO) terms as previously described and employed the WikiPathways plugin ( Slenter et al., 2018 ) for PathVisio ( Kutmon et al., 2015 ) to convert each GO term of interest into a tree-like pathway diagram, which facilitates extraction of the mapped genes. By removing the genes belonging to irrelevant families and keeping only those related to SL GO terms, we identified 252 genes involved in SL-related processes. Next, information on the expression of these 252 genes within the MTG was extracted from available microarray data, derived from brain tissue samples of AD patients and age-match controls ( Lardenoije et al., 2019 ). The genome-wide differential expression analysis (DEA) of the transcriptomic data highlighted 7,776 genes as significantly (FDR corrected P-value ≤ 0.05) differentially expressed between AD and controls. By applying multiple testing correction for the number of SL-associated genes, we confirmed 103 out of 252 genes as significantly differentially expressed (see Table 1 for the top 30 differentially expressed genes and Supplementary file 6 for a complete overview of genes in the networks).

The SL pathway is significantly dysregulated in AD
The comparison of genome-wide gene expression data for SLassociated genes between AD patients and unaffected controls revealed that 24.5% (7,776 out of 31,726 genes) of the genes were significantly differentially expressed ( adj. P-value < 0.05) between the 2 conditions. However, out of the 252 identified SL pathwayassociated genes, 103 had significantly altered activity (40.87%), reflecting a significant overrepresentation (Fisher's exact test, P-value < 7.0e-09).
In terms of epigenetic dysregulation, no significant alterations were observed after multiple testing adjustments. Prior to the adjustments, 129, 109, and 170 probes displayed nominally significant (unadjusted P-value < 0.05) differential levels of 5-mC, 5-hmC, and uC, respectively. Larger sample sizes will be required in the future to assess whether FDR-adjusted significance can be shown  for these probes at the observed effect sizes. These CpG sites were associated to 90, 78, and 112 unique genes, respectively. Interestingly, we noticed a relatively high degree of overlap in genes when comparing the various cytosine states (a Venn diagram representing the overlap of genes across different levels is shown in Fig. 1 A). Similarly, the overlap of specific probes across different epigenetic levels is depicted in Fig. 1 B. Notably, there were 28 genes that were both nominally differentially methylated, hydroxymethylated as well as displaying different levels of unmodified cytosine ( Fig. 1 A), showing consistent differences across all levels of methylation. This suggests that there is a robust cluster of alterations in an epigenetic regulatory circuit centered around SLassociated genes in AD. Next, in order to assess the significance of DNA (hydroxy)methylation changes, we corrected the p-values for multiple hypothesis testing. Three CpG sites, associated to PTGIS, GBA, ITGB2 genes, displayed differential levels of hydroxymethylation after corrections ( P-value < 0.0 0 048) when comparing AD and control samples (see Table 2 ). Differential hydroxymethylation analyses showed a profound enrichment of SL-related differentially hydroxymethylated genes in AD brains, reflecting a significant overrepresentation (Fisher's exact test, P-value < 1.09e-06).

Correlation analysis of cytosine states and mRNA expression
In order to examine the association between alterations at the epigenetic and transcriptomic level, we tested the significance of the Spearman correlation between the nominally differentially methylated probes at the mC, hmC and uC levels, and their corresponding gene expression levels across all samples ( Table 3 ). No correlations were significant after multiple testing adjustments, but nominal significance ( P-value < 0.05) between 15 differentially methylated probes (mC) and corresponding gene expression levels was observed, which may serve as candidates for further evaluation using larger sample sizes. Similarly, 10 hmC and 10 uC probes were found to have nominally significant correlations with corresponding gene expression levels. Interestingly, many genes displayed a nominally significant association between levels of mul-tiple types of cytosine states and mRNA expression. For example, the CLN8 gene a significant association between differential methylation at the mC hmC and mRNA expression level was observed ( Table 3 ).
Finally, we assessed whether there is a relationship between levels of gene expression and methylation with measures of AD pathology. We have computed a Spearman correlation analysis comparing expression levels of individual SL-related genes across all samples with the available pathological measures (e.g., CERAD, Braak, Plaque, Tangle, and CSF scores). We observed significant associations between some of the top-ranked DEGs and various pathological measures. For example, the correlation coefficient between STS gene expression values and CERAD scores is -0.49 with a hochberg-adjusted P-value of 5.35 E-06 . The correlation coefficient between the expression of this gene and Braak score is -0.60 with a hochberg-adjusted P-value of 1.99 E-06 . The data is provided as Supplementary Tables S10-S13 and, additionally, in the case of a significant association between gene expression and disease pathology markers, in the form of a heatmap (Supplementary Image S1). Similarly, we examined the correlation between the beta values representing the mC, hmC, and UC levels and the pathological measures described above. The results are provided in the form of Supplementary Tables S12 (hmC), S13 (mC), and S14 (UC) describing the correlation between methylation levels and various pathological features.

Gene regulatory network analysis reproducing context-specific network
In order to gain a deeper understanding of SL-associated dysregulations at a systems level, we conducted a differential gene regulatory network (GRN) analysis to reconstruct 2 context-specific networks, representing the AD and unaffected control phenotypes. The employed GRN reconstruction approach ( Zickenrott et al., 2016 ) relies on Booleanized differential gene expression data and a prior knowledge network (PKN) of gene-gene interactions to reconstruct context-specific networks. The reconstructed AD network comprised 41 genes and 64 interactions ( Fig. 2 A), whereas the unaffected control phenotype network consisted of 42 genes and 57 interactions ( Fig. 2 B). Although both the networks representing the AD and control phenotypes look very similar, the underlying differences actually lie in the expression levels of the genes in the networks, that is, whether they are up-or down-regulated in the respective phenotypes. As we operate in a differential expression setting, a gene that is up-regulated in the AD phenotype, is down-regulated in the control phenotype and vice versa. Therefore, changes in the network topology are not the main denominator, but the state (expression) of the genes in the respective net- works (i.e., up-or down-regulated) is. Further information about the topological characteristics (e.g., in-and out-degree, closeness centrality, and clustering coefficient) of the networks representing the disease and healthy phenotype are provided in Supplementary Tables (Table S8 and S9), helping us in identifying the subtle differences in the networks that can be overlooked by simply looking at the images of network topology.

In-silico network perturbation shows candidate genes able to revert the AD phenotype
Following the paradigm of modeling diseases as network perturbations ( del Sol et al., 2010 ), we performed in-silico network perturbations to identify the most influential combinations of genes in the constructed GRN representing the AD phenotype. This network perturbation analysis highlighted the key roles of the perturbation candidates in the GRN and revealed that a 3-gene perturbation combination, involving CAV1, SPHK1 , and SELPLG, has the potential to revert the expression levels of 18 genes in the network from a disease phenotype towards a healthy phenotype (see Table 4 ).
Although the gene signatures identified here are not necessarily responsible for disease onset and progression, they are predicted to revert the expression of a large number of diseaseassociated genes upon perturbation in the in-silico network model representing the disease phenotype. This indicates a key regulatory role of the predicted genes in the establishment and maintenance of the disease phenotype. Taken together, the insilico network perturbation analysis highlights novel candidates that could serve as potential targets for therapeutic intervention in AD.

Discussion
While new high-throughput sequencing technologies and computational analyses of omics datasets have extended our knowledge on AD, the mechanisms underlying the dysregulation of cellular pathways in AD still require further investigation. In this study, characterizing interconnected layers of regulation, we provided more insight into the genes and mechanisms behind the dysregulation of specific SL pathways in AD. We were able to identify SL-related genes differentially expressed in AD patients which also display multiple types of epigenetic alterations affecting different cytosine states, corroborating the relevance of SL alterations in AD. Furthermore, using in-silico analyses, we identified key regulatory genes in the disease-specific network and candidate genes able to revert the disease state towards the healthy phenotype. These data have helped to identify specific SL-related cellular processes that display pronounced alterations in AD, with potential relevance for the development of new biomarkers and therapeutic strategies.

Many SL-related genes display altered expression and DNA (hydroxy)methylation in AD
The present study reveals a significant enrichment of SL-related gene expression alterations in the MTG of AD patients in comparison to age-and sex-matched controls. We observed significant differential expression in 103 out of 252 SL-associated genes. Many of these genes have already been reported to display alterations in AD and/or other neurodegenerative disorders. For example, STS , encodes for a sulfatase protein representing the top underexpressed gene in our data set, is involved in the synthesis of cholesterol, which is a well-known interactor of SLs and fundamental to maintain the equilibrium of cell membranes in philological functions ( Lucki and Sewer 2010 ). STS has previously been observed to have decreased expression in AD ( Wu et al., 2019 ) and to harbor a  Perturbation score (Score) represents the total number of genes whose gene expression is reverted upon inducing a perturbation of a given gene combination (Combination).
genetic variant associated with cognition ( Humby et al., 2017 ). Similar to STS, ARSG , encodes for a sulfatase and is involved in the formation of cholesterol and steroids ( Frese et al, 2008 ). As aforementioned, the regulation of SLs metabolism by steroid hormones is involved in several physiological events as development, reproduction, and metabolism ( Lucki and Sewer, 2010 ). Furthermore, EZR , encoding for the protein ezarin, a top-ranked over-expressed gene in our data, also showed increased expression in other studies on AD ( Zhang et al., 2018 ). EZR , is an intermediate between the plasma membrane and the actin cytoskeleton, has recently emerged as a target of SL regulation mainly during endocytosis, exocytosis and cellular trafficking. ( Adada et al., 2014 ). In fact, Sphingosine 1-phosphate (S1P) activates ezrin-radixin-moesin complex proteins contributing to cytoskeletal remodeling and changing membrane properties, which is essential for cellular homeostasis ( Cencetti et al., 2019 ).
Interestingly, the analysis also reveals new AD-associated alterations, such as the overexpression of CLN8 , a gene that has previously been linked to neurological dysfunction, but not directly to AD ( Lonka et al., 2005 ). CLN8 , hypothesized to mediate SL synthesis, known to be involved in the ceramide synthesis and homeostasis ( Adhikari et al., 2019 ), also displayed a significant (albeit nominally) positive correlation between expression levels and DNA methylation and a negative correlation with DNA hydroxymethylation.
A further new gene of interest is ARSG, underexpressed in AD in the analyzed data. While it has previously been linked to multiple complex disorders such as lysosomal storage disorders, certain cancers, and neurodevelopmental dysfunction ( Wiegmann et al., 2013 ), our study provides the first evidence for a dysregulation of ARSG in AD.
Another SL-related gene displaying multiple alterations was ITGB2, encoding integrin subunit beta 2. This gene has been demonstrated to influence the reorganization of lipid rafts, membrane platforms rich in SLs ( Bang et al., 2005 ). ITGB2 displays significant hyper-hydroxymethylation as well as increased mRNA expression in AD patients, which is in line with a previous study reporting increased expression levels of this gene in a mouse model of AD ( Swartzlander et al., 2018 ). Notably, we also observed a positive correlation between expression levels and DNA hydroxymethylation for ITGB2 .
Similarly, the gene PTGIS showed increased DNA hydroxymethylation in AD. It has previously been suggested to contribute to neurodevelopmental disorders, including childhood onset schizophrenia, and up to now was never linked directly to AD ( Ambalavanan et al., 2016 ). Epigenetic variation was also observed in GBA, which displayed an increased level of hydroxymethylation, concomitant with a negative correlation with its expression levels. This gene is known to harbor risk factor variations associated with Parkinson's disease ( Davis et al., 2016 ), but has not been linked directly to AD. Finally, PRKD1 , which encodes for the protein kinase D1 ( Cobbaut et al., 2018 ) displayed increased expression, concomitant with an increase in DNA methylation in AD patients, indicating an altered epigenetic regulation. Interestingly, this protein kinase has been suggested to have protective functions in Parkinson's disease ( Asaithambi et al., 2014 ), but has, to the best of our knowledge, not been studied in the context of AD.

Gene regulatory network and perturbation analysis
Given the results from previous studies showing significant alterations of SL-related genes in AD patients, we aimed to provide a more comprehensive and detailed characterization of these alterations at the gene regulatory network level. For this purpose, we conducted an integrative analysis of genes involved in SL functions by reconstructing phenotype-specific networks for AD patients and unaffected controls. Using a dedicated GRN approach and associated perturbation analyses, we identified multiple genes, in particular CAV1, SPHK1 , and SELPLG , that could play a key role as regulatory genes in maintaining the AD phenotype. Interestingly, a single gene perturbation of CAV1 alone was predicted to normalize gene expression profiles of 16 genes in the AD network, providing a promising candidate for further experimental investigation. Moreover, CAV1 was over-expressed in AD patients, consistent with findings from prior studies that have linked its elevated expression levels to cerebral amyloid angiopathy in AD ( Gaudreault et al., 2004) ( Van Helmond et al., 2007. Similarly, the gene SPHK1 was found both to represent a key regulatory node in the network maintaining the AD phenotype and occurred among the top candidate genes derived from the in-silico network perturbation analysis. Furthermore, SPHK1 displayed a significant correlation between expression levels and (hydroxy)methylation levels. These results are in line with a previously suggested role of this gene in the progression of AD ( Maceyka et al., 2012 ) ( Lee et al., 2018 ) ( Lee et al., 2018 ).
Both CAV1 and SPHK1 encode for key proteins involved in SL pathways and SL metabolism. CAV1 , which encodes for the membrane protein caveolin 1, displays a strong and dynamic interaction with SLs in cellular membranes in which they concomitantly and reciprocally regulate each other's levels and functions ( Sonnino and Prinetti 2009 ). In particular, CAV1 interacting with cholesterol and sphingomyelin affects the structural composition of lipid rafts, thereby influencing the transduction of physiological signaling as well as pathological processes. A disturbed balance of CAV1-SLs rich-membrane domains has been implicated in various pathological processes linked to neurodegeneration, prion disease and viral infections ( Quest et al., 2004 ) ( Olsen and Faergeman 2017b ). SPHK1 is a central enzyme in SL pathways, it phosphorylates sphingosine (S) into sphingosine 1 phosphate (S1P), which represents a potent SL that acts as an activator of various cellular signaling pathways regulating stress resistance, proliferation, differentiation, and mature phenotypes of nervous system cells. In particular, S1P, modulates pathways known for their engagement in the regulation of cell survival and differentiation, and therefore it is also recognized as an important molecule during aging ( J ę śko et al., 2019 ). Dysregulation at the level of S1P, caused by alterations in SPHK1 activity, has damaging consequences on the physiology of the brain leading to neurodegeneration and neuroinflammatory processes ( Czubowicz et al., 2019 ).
As such, SPHK1 plays an important role in various cellular processes including cell proliferation, differentiation, angiogenesis and inflammation. Furthermore, it has been implicated in different disorders including AD ( Alemany et al., 2007 ) ( Adams et al, 2016 ) ( Dominguez et al., 2018 ).
Finally, SELPLG , another gene ranked high in the perturbation analysis, encodes for the E-selectin receptor, which cross-interact with glycoSLs in the membrane, promoting the transduction of E-selectin-mediated signaling ( Winkler et al., 2012 ). SELPLG , was also found to be upregulated in the brains of mice suffering from cerebral amyloidosis in a prior study ( Haure-Mirande et al., 2019 ), lending further support to a potential involvement of this gene in AD.

Conclusion
Overall, our integrative analyses have revealed both novel candidate AD-associated genes and confirmed previously reported associations. As a limitation, the moderate sample size available for the study might have restricted the detectable changes in AD, and further analyses on independent samples are warranted. In spite of these restrictions, our results show a statistically significant enrichment of changes in SL genes and related pathways in AD, corroborating the key role SL-associated alterations in AD. The data presented here may serve as a starting point to help filling the current knowledge gaps concerning the role of SLs in AD. Followup studies using extensive molecular profiling analyses across multiple brain regions in combination with perturbation experiments using in-silico and in-vivo AD models are needed to obtain a more comprehensive characterization and mechanistic understanding of the role of SLs in AD.