Apomorphine-induced pathway perturbation in MPP + -treated SH-SY5Y cells

: Apomorphine (APOM) is a non-selective dopamine agonist for Parkinson’s disease (PD). It also offers protection against oxidative stress. Thus, it has been used for treating advanced PD patients who do not respond to levodopa or other dopamine agonists. However, side effects such as orthostatic hypotension, nausea, and fibrotic nodules at the site of APOM injection have been reported after long-term use of APOM in PD patients. To secure the use of APOM for PD treatment without side effect, it is essential to understand the molecular mechanism involved in the action of APOM in PD. In this study, gene expression profile changes by APOM in a PD cell model, i.e., MPP + -treated SH-SY5Y cells, were measured at six time points (0, 3, 6, 9, 12, and 24 h) after APOM treatment using a commercial whole-genome expression array. A total of 2249 genes showed significant and differential expression profile. Pathways significantly affected by APOM were estimated using signaling pathway impact analysis (SPIA). In addition, differentially regulated regions within each affected pathway were identified with covariance analysis using a structure equation model.


Introduction
Apomorphine (APOM) is a strong antioxidant and free radical scavenger as well as a nonselective dopamine agonist that can stimulate D1-like (D1, D5) and D2-like (D2, D3, D4) receptors [1]. Due to its bioactivities, APOM has become the first dopamine agonist used to treat patients with Parkinson's disease (PD). As PD is caused by the loss of dopamine-generating neurons in the central nervous system, increase of dopamine in the brain has been used as a standard strategy for treating PD. Levodopa (l-dihydroxyphenylalanine), a natural dopamine precursor, can cross the blood-brain barrier. It can be converted to dopamine in the brain. Therefore, it has been used as a main drug for initial treatment of PD [2,3]. However, long-term use of levodopa may cause drug resistance and aggravation of the symptoms [4]. APOM has been used to treat advanced PD patients with persistent and disabling motor fluctuations [5][6][7]. It has been suggested that APOM's dyskinetic effect might be mediated by excessive activation of afferents to the centromedian-striatopallidal or pallidal-pedunculopontine pathways [8].
Although the neuroprotecive effect of APOM has been demonstrated both in vivo and in vitro experiments [9,10], the molecular mechanisms involved in the protection remains unclear. Furthermore, long-term use APOM for treatment of PD patients may lead to side effects such as orthostatic hypotension, nausea, and fibrotic nodules at the site of APOM injection [2]. To secure the use of APOM in PD treatment, it is essential to understand the underlying molecular mechanisms involved in the protection of APOM in PD patients. The aim of this study was to identify pathway regions significantly affected by APOM in PD cell model through analyzing two groups of time series microarray data, i.e., APOM treatment group and reference group using a structure equation model (SEM). Human neuroblastoma SH-SY5Y cells were treated with 1-methyl-4-phenyl-pyridium (MPP + ) and used as a PD cell model because they mimic many aspects of dopaminergic (DAergic) neuronal death observed in PD [11,12]. MPP + , an active metabolite of 1-methyl-4-phenyl-1, 2, 3, 6tetrahydropyridine (MPTP), is taken up by DAergic neurons via dopamine and noradrenaline transporters, resulting in inhibition of complex I of the mitochondrial membrane potential and formation of reactive oxygen species (ROS) [13,14]. As mitochondrial complex I deficiency has long been implicated in the pathogenesis of PD [15], MPP + treated SH-SY5Y cells have been extensively used as PD cell model. We investigated gene expression profile of SH-SY5Y cells co-treated by APOM and MPP + at six time points (0, 3, 6, 9, 12, and 24 h) with a commercial whole-genome expression array. To examine the effect of APOM in MPP + -treated SH-SY5Y cells, a reference without APOM treatment was needed for comparison. It has been reported that the genome-wide gene expression data of MPP + -treated SH-SY5Y cells at 0 (control), 3, 6, 9, 12, and 24 h without APOM treatment [12]. Therefore, these expression data were used as reference. The treatment with APOM to MPP + -treated SH-SY5Y cells resulted in significant expression profile changes for a total of 2249 genes. Signaling pathway impact analysis (SPIA) for these 2249 genes were performed and identified KEGG (Kyoto Encyclopedia of Genes and Genomes) pathways significantly affected by APOM, including endoplasmic reticulum (ER) protein processing pathway, fanconi anemia pathway, colorectal cancer, pathogenic Escherichia coli infection, cell cycle, and TGF-beta signaling pathway. Covariance structure analysis of these perturbed pathways using SEM identified differentially regulated regions, i.e., significant modules within them. Significant modules might be closely related to the molecular function of APOM in PD cell model. Therefore, our results might improve the understanding of neuroprotective mechanisms of APOM in PD caused by the loss of DAergic neurons.

RNA extraction and microarray experiment
At each time point after co-treatment with APOM and MPP + , cells were harvested and total RNAs were extracted using TRIzol ® (Invitrogen Life Technologies, USA). RNAs were purified using RNeasy columns (Qiagen, USA) according to the manufacturers' protocol. The purity and integrity of the extracted RNA were examined with denaturing gel electrophoresis, optical density comparison of 260/280 ratio, and Agilent 2100 Bioanalyzer (Agilent Technologies, USA). Microarray experiment was performed for the extracted RNA according to protocols described in Choi et al. [16]. Briefly, 550 ng of the extracted total RNA was reverse-transcribed to first-strand cDNA using a T7 oligo (dT) primer. These first-stand cDNA was then converted to double stranded cDNA (ds-cDNA). The resulting ds-cDNA was employed as template for in vitro transcription to prepare labeled cRNA with biotin-NTP. Labeled cRNA (750 ng) was then hybridized to each human HT-12 expression v.4 bead array at 58 °C for 16-18 h according to the manufacturer's instructions (Illumina, Inc., San Diego, USA). Detection of the array signal was carried out using Amersham fluorolink streptavidin-Cy3 (GE Healthcare Bio-Sciences, Little Chalfont, UK) following the bead array manual. Arrays were scanned using Illumina bead array Reader confocal scanner according to the manufacturer's instructions. A total of 18 bead arrays were used for three controls (before treatment) and three samples at each time point (3,6,9,12, and 24 h) after co-treatment with APOM and MPP + .

Normalization of microarray data
After microarray data were exported using Illumina GenomeStudio software v2011.1, values of probe signal were log2 transformed and normalized using the function of lumiN() from the R package lumi [17]. Since the Illumina BeadChip employed in this study was a single-channel array, quantile normalization was chosen. After the normalization, signals under the detection limit were replaced with missing values. Probes with two missing values for the three replicates were filtered out.

Reference microarray data
To identify perturbed pathways after the addition of APOM to MPP + -treated SH-SY5Y cells, gene expression data of MPP + -treated SH-SY5Y cells from the study of Kim et al. [12] were used as reference, including time series microarray data at the control (before MPP + treatment) and 3, 6, 9, 12, and 24 h after exposure to 1 mM of MPP + .

Selection of genes with differential expression profiles between two time series of microarray data
To select genes showing differential expression profile from two time-series microarray data, maSigPro bioconductor package was employed [18]. This program uses a two-step regression approach to identify DEGs in time series microarray data. A global model was adjusted in the first step whereas the application of variable selection strategy was made to identify significant profile differences between two groups under comparison in the second step. Before comparing expression profiles between two groups of microarray data, log2 ratio of treated sample versus untreated sample (control) was calculated for each group of the microarray data. Microarray data from the study of Kim et al. [12] were used as data for the reference group. Genes showing differential expression profiles were considered significant genes. Enriched GO terms in the set of significant genes were explored with the function enrichGO() in the R package clusterProfiler [19].

Identification of perturbed pathways with SPIA
To identify affected pathways by APOM addition to MPP + -treated SH-SY5Y cells, an R package called SPIA [20] was employed. This program can find significantly affected pathways based on two types of evidence captured by two independent probability values, PNDE and PPERT. PNDE captures the significance of a given pathway Pi as provided by an over-representation analysis of the number of DEGs observed in the pathway whereas PPERT captures a total perturbation of the pathway and is estimated in a bootstrapping process where both pathway and the number of DEGs per pathway are maintained. Global probability PG was used for pathway ranking. It was estimated with the following equation: The net perturbation accumulation at the level of each gene i ( i g ), ( ) i Acc g , was calculated using the following equation: where ij  was the strength of the interaction between genes i g and j g . In this study, the value of  was +1 for activation and −1 for inhibition.
was the perturbation factor of gene i g whereas ( ) ds j N g was the number of down-stream genes of gene j g . Activation/inhibition of pathway was decided by the sign of the total net accumulated perturbation in the pathway:

Construction of the shortest path model for perturbed pathways
After identifying pathways affected by APOM addition to MPP + treated SH-SY5Y cells, connected structures of DEGs within them were assessed by considering each pathway as a directed graph with the function get.shortest.paths() of the R package igraph [21]. Each node and edge in the shortest path could not be presented for more than once. Self-loops were excluded but feed-backs and cycles were kept. Non-DEGs might be included in the shortest paths linking DEGs to allow the discovery of genes that are not DEGs but can be important for their mediating roles. The shortest path model (SPM) for each perturbed pathway was constructed by merging all of shortest paths among DEGs for the corresponding pathway as shown in our previous work [22].

Modularity measurement of SPM
The modularity of SPM for each perturbed pathway was calculated with the following equation: Where m is the number of edges, ij A is the element of the i-th row and j-th column of the A adjacency matrix, i k and j k are the degree of i and j, respectively, i c and j c are component of i and j, respectively. The sum includes over all i and j pairs of vertices, and ( , ) x y  is 1 if x y  and 0 otherwise [23]. Modules within each SPM were detected by using the function walktrap.community() in the R package igraph [21], which could find densely connected subgraphs in a graph via random walks. If the value of modularity is greater than 0.35, each module in the SPM is considered separately.

Evaluation of module/SPM with SEM
To examine if a module or SPM are statistically significant by the addition of APOM in MPP +treated SH-SY5Y cells, they were evaluated with SEM, as in our previous work [22]. Briefly, a module or SPM are considered to have casual relationships among variables such as structure of SEM where all variables can be represented as a system of linear equation: Every i th node in the set of variables V are characterized by uni-directional relationships with its "parents" pa(i) via path coefficients ( ij  ). The covariance structure can delineate the bi-directional relationships between the i th node ( i U ) and its "siblings" sib(i), as quantified by their covariance ( ij  ). The parameters the R package lavaan. After parameter estimation of a module/SPM, the significant module/SPM was detected with the following omnibus test:  ) values at a given degree of freedom. When the p-value in the Chi-square test is less than 0.05, the module/SPM is considered as statistically significant.

Identification of genes with differential expression profiles
Gene expression profiles of SH-SY5Y cells co-treated with APOM and MPP + were determined with Illumina whole-genome array (human HT-12 expression v.4 bead array) using 47,231 probes. A total of 18 arrays were employed for the examination of gene expression profile at six time points (0, 3, 6, 9, 12, and 24 h) with three replicates for each time point. After log2 transformation, probe signals were normalized with quantile normalization. Log2 ratio (co-treated sample with APOM and MPP + vs. control sample at 0 h) was obtained by subtraction of the average of normalized log2 values of three control arrays from normalized log2 value of each time point. When a gene had multiple probe IDs, the average value was assigned to the gene. Of 47,321 probes, only 32,421 probes with unique gene name were considered. To explore gene expression profile changes by the addition of APOM to MPP + -treated SH-SY5Y cells, gene expression data of MPP + -treated SH-SY5Y cells were required as reference. Thus, we employed the microarray data from the work of Kim et al. [12] as reference (see Materials and Methods section for details).
To identify genes showing differential expression profile between the two groups (APOM added group and reference group), a bioconductor package maSigPro [18] was employed using a tworegression step approach. In the first step, a global regression model was adjusted with all the defined variables to detect differentially expressed genes (DEGs). In the second step, a variable selection strategy was applied to examine differences between groups and identify statistically significant profiles. When the p-value threshold associated to the F-Static in general regression model was 0.05, the number of DEGs in the first step was 9395. In the second step, 2249 genes of these 9395 DEGs were found to have significantly differential expression profiles between the APOM added group and the reference group (adjusted p-value < 0.05). In this study, genes showing differential expression profiles were considered as significant genes. Their expression profiles are shown in Figures 1 and 2 using the clustering and plotting functions available in the maSigPro. In Figure 1, horizontal axis represents the array of reference and APOM added group at different time point. Replicate of each array was marked with a number. The vertical axis represents expression value (log2 ratio). A positive/negative value indicates up-/down-regulation, respectively. A value of 0 means no expression change. Clear contrast in expression profiles between the APOM added group and the reference group was observed in cluster 1. Genes in cluster 1 tended to be up-regulated in the APOM added group. However, they were not changed at expression level or down-regulated in the reference group. For easy comparison of their expression profiles between the two groups, average expression profiles were shown in Figure 2. The three values at each time point corresponded to the three replicates. Red and green lines represent the average expression profiles of the reference and the APOM added group, respectively. In the APOM added group, genes in cluster 1 and 3 tended to be up-regulated while genes in cluster 2 and 4 tended to be down-regulated at early stage.
To catch functional characteristic of each cluster genes, enriched gene ontology (GO) terms were examined with the function enrichGO in the R package clusterProfiler [19]. With a cutoff q-value of 0.05, GO terms such as GO:0036297 (interstrand cross-link repair) and GO:0022616 (DNA strand elongation) were significantly enriched in cluster 1 whereas GO terms such as GO:0009185 (ribonucleoside diphospahte metabolic process), GO:0019320 (hexose catabolic process), and GO:0006096 (glycolytic process) were significantly enriched in cluster 3. The up-regulation of genes in cluster 1 and 3 might have contributed to resistance against DNA damage and cellular metabolic deactivation resulted from MPP + toxicity. No significantly enriched GO terms were detected for cluster 2 or 4. Top 15 genes of each cluster ranked by p-value are shown in Table 1.

Analysis of perturbed pathways by the addition of APOM in MPP + -treated SH-SY5Y cells
Pathways significantly affected by APOM might be closely related to molecular roles of APOM in MPP + -treated SH-SY5Y cells. Thus, pathways significantly affected by the addition of APOM were examined with a Bioconductor package SPIA through signaling pathway impact analysis [20]. This analysis considered two independent probability values, PNDE and PPERT, which were calculated for each pathway with incorporating parameters such as log2 ratios (APOM added group/reference group) of genes showing differential profiles, statistical significance of the set of pathway genes, and the topology of the signaling pathway. Two types of probability were finally combined into a global probability value, PG, which was used for ranking the pathways and testing the hypothesis that the pathway was significantly perturbed by APOM addition. The impact analysis was performed for 2249 significant genes that showed differential expression profiles between the APOM added group and the reference group for 135 well characterized human gene signaling pathways available in KEGG (Kyoto Encyclopedia of Genes and Genomes). Since several pathways were tested simultaneously, the significance level was set at 5% after false discovery rate (FDR) correction [24].
Three KEGG pathways (protein processing in endoplasmic reticulum (ER), Fanconi anemia pathway, and TGF-beta signaling pathway) showed significant perturbation at 6, 12, and 24 h after APOM treatment. It is interesting to note that the ER protein pressing pathway is affected by APOM addition because MPP + toxicity in SH-SY5Y cells might induce ER stress [16]. Table 2 shows the six  Figure 1 for the reference group and the APOM added group.
perturbed pathways detected at 12 h. Status of each pathway was determined by the total net accumulated perturbation of a given pathway, A t , which was calculated as the sum of all perturbation accumulations for all genes in the pathway. If the value of A t for a given pathway was negative, the pathway was considered as negatively perturbed, i.e., inhibited. On the contrary, if the value of A t for a given pathway was positive, the pathway was considered as activated. APOM addition to MPP +treated SH-SY5Y cells resulted in the activation of three pathways (ER protein processing, colorectal cancer, and TGF-beta signaling pathway) and the inhibition of three pathways (fanconi anemia, pathological Escherichia coli infection, and cell cycle) compared to the reference group (Table 2).

Identification of significantly affected regions within each perturbed pathway with SEM
In the above section, pathways significantly affected by APOM addition in MPP + -treated SH-SY5Y cells were estimated with SPIA. Perturbed pathway may have different regulation structure between the two groups under comparison. To detect specific regions showing different regulation in each perturbed pathway, a graph model describing interactions in the pathway is useful. Since the perturbation of pathway is driven by significant genes showing differential profile between the APOM added group and the reference group, we focused on the connection of these genes. To identify how significant genes were connected in each perturbed pathway with the other genes on the microarray, the shortest paths (geodesic distance) between genes with differential profile on each perturbed pathway were searched with the function get.shortest.paths() of the R package igraph [21]. All shortest paths detected in a given pathway were merged into a graph called a shorted path model (SPM), where each node and each edge could not be presented for more than once while self-loops were excluded but feed-backs and cycles were kept. Table 3 shows the number of nodes and edges of SPMs corresponding to the six perturbed KEGG pathways shown in Table 2. Modularity and module for each SPM were detected by using the function of walktrap.community() of the R package igraph. The modularity indicates how good the division is or how separate the different vertices are from each other. A high degree of modularity indicates dense connections between the nodes within the same module but rare connections between nodes in different modules. The SPM for cell cycle and its module are shown in Figure 3. The connections between nodes in the same module were indicated in a black arrow while the connections between nodes in the different modules were represented by a red arrow. Each module within the SPM of the cell cycle was connected with genes such as CDC2 and CDC6.
Next, we detected specific regions showing significantly different regulation in each perturbed pathway. For this, each SPM/module was evaluated with SEM to verify whether the covariance structure was significantly different between the two groups. Here, covariance structures for SPM/module were calculated from the two conditions of gene expression data, i.e., the APOM added group and the reference group. SPM/module showing significantly different covariance structures between the two groups corresponded to regions perturbed by the addition of APOM in MPP + -treated SH-SY5Y cells. Evaluation of covariance structure with SEM was performed against each module of SPM except for two pathways (fanconi anemia and pathogenic Escherichia coli infection) which showed very low modularity in their SPMs (Table 3).  Evaluation of these two pathways was carried out against SPM itself and no significant difference was detected at any time points. Significant modules were detected in the other four pathways and summarized in Table 4. Module 1 of TGF-beta signaling pathway showed significance only at 3 h whereas module 4 of cell cycle showed significance at 6, 9, and 24 h. The fusion of all significant modules might provide an insight into the specific cellular mechanisms induced by APOM addition in MPP + -treated SH-SY5Y cells. Figure 4 shows the union of all significant modules in Table 4. The yellow and green nodes represent significant genes and non-significant genes within modules, respectively. The total number of genes on the graph of Figure 4 was 47, including 18 significant genes and 29 non-significant genes. Although non-significant genes did not show differential expression profile between the APOM-added group and the reference group, they might have important role as signal mediator when they are connected directly or indirectly to significant genes.
To examine the association of these 47 genes with disease, enriched disease ontology (DO) terms were investigated with the function enrichDO() of the R package DOSE [25]. DO analysis against these 47 genes detected many significant DO terms such as DOID:0060116 (sensory system cancer, q-value = 1.52 × 10 −15 ), DOID:768 (retinoblastoma, q-value = 6.07 × 10 −14 ), and DOID:769 (neuroblastoma, q-value = 6.99 × 10 −6 ). Most of the enriched DO terms were related to cancer.  This indicates that APOM might induce the expression change of cancer related genes to overcome the toxicity of MPP + . Figure 5 shows the expression profiles of 18 significant genes shown in Figure 5. The red color and the green color represent the reference and the APOM added group, respectively. The expression profile of BCL2 showed clear contrast between the two groups. It showed a time-dependent increase in the reference group but a time-dependent decrease in the APOM added group. This indicates that APOM may be able to induce the inhibition of BCL2 in MPP + -treated SH-SY5Y cells. Genes such as CDC14A, RBL2, and SMAD4 tended to be up-regulated in the reference group while genes such as ABL1, GADD45G, and GSK3B tended to be up-regulated in the APOM added group.