Metabolic Profiling Reveals Biochemical Pathways and Potential Biomarkers of Spinocerebellar Ataxia 3

Spinocerebellar ataxia 3, also known as Machado-Joseph disease (SCA3/MJD), is a rare autosomal-dominant neurodegenerative disease caused by an abnormal expansion of CAG repeats in the ATXN3 gene. In the present study, we performed a global metabolomic analysis to identify pathogenic biochemical pathways and novel biomarkers in SCA3 patients. Metabolic profiles of serum samples from 13 preclinical SCA3 patients, 13 symptomatic SCA3 patients, and 15 healthy controls were mapped using ultra-high-performance liquid chromatography-mass spectrometry and gas chromatography-mass spectrometry. The symptomatic SCA3 patients showed a metabolic profile significantly distinct from those of the preclinical SCA3 patients and healthy controls. The principal differential metabolites were involved in the amino acid (AA) metabolism and fatty acid metabolism pathways. In addition, four candidate serum biomarkers, FFA 16:1 (palmitoleic acid), FFA 18:3 (linolenic acid), L-Proline and L-Tryptophan, were selected by receiver operating characteristic curve analysis and together discriminated symptomatic SCA3 patients from healthy controls with an area under the curve of 0.979. Our study demonstrates that symptomatic SCA3 patients present distinct metabolic profiles with perturbed AA metabolism and fatty acid metabolism, and identifies FFA 16:1, FFA 18:3, L-Proline and L-Tryptophan as potential disease biomarkers.


INTRODUCTION
Spinocerebellar ataxia 3 (SCA3) or Machado-Joseph disease (MJD) is the most common of the SCAs, with a worldwide prevalence of 1.5 cases per 100,000 individuals (Ruano et al., 2014). It is caused by an abnormal expansion of CAG repeats in the ATXN3 gene and is characterized by a wide range of clinical features, including progressive ataxia, spasticity, ophthalmoplegia, and extrapyramidal signs (Durr et al., 1996; Riess et al., 2008; Schmitz-Hübsch et al., 2008; Jacobi et al., 2011; Costa Mdo and Paulson, 2012; Paulson, 2012). The median age of disease onset is about 40 years, and patients usually die within 15-20 years (van de Warrenburg et al., 2005; Rub et al., 2013). Unfortunately, the pathogenic mechanisms of SCA3 are not fully elucidated, and no current therapeutic approach alleviates the symptoms effectively (Evers et al., 2014; Li et al., 2015; Wu et al., 2015).
An increasing body of evidence indicates that the preclinical stage of SCAs already presents with detectable non-ataxia signs, including oculomotor deficits in SCA3, slowing of saccades in SCA7 and SCA2, and impaired smooth pursuit eye movements (SPEMs) in SCA17 (Maas et al., 2015; Wu et al., 2017). Thus, the preclinical stage may provide a window for disease intervention. Although some disease-modifying compounds have emerged in clinical trials, sensitive biomarkers to measure subtle therapeutic benefits are still lacking (Schulte et al., 2001; Saute et al., 2014, 2015). Therefore, identification of molecular pathways and biomarkers may prove beneficial in uncovering pathogenic mechanisms, identifying drug targets, monitoring disease progression, and assessing therapeutic effects (Lima and Raposo, 2018).
Metabolomics has emerged as a powerful technique that explores metabolic responses to internal or external stimuli by comprehensively monitoring variations in small molecules in biological samples (Nicholson and Lindon, 2008). Alterations in brain function can directly impact the composition of biofluids, in which metabolites are in dynamic equilibrium. Metabolites in the biofluid reflect the chemical imbalances in the cerebrospinal fluid (CSF) at the brain level, or in the blood and urine at the systemic level (Lima and Raposo, 2018). Metabolomics has been widely used to map potential perturbations and identify novel biomarkers in neurodegenerative diseases, such as Alzheimer's disease, Parkinson's disease, motor neuron disease, and Huntington's disease (Zhang et al., 2013; Chen-Plotkin, 2014; Jové et al., 2014). Iorio et al. (1993) investigated the serum fatty acid profile using gas chromatographic analysis in patients with Friedreich's ataxia and SCA and found no significant differences in the fatty acid profiles of these patients. Griffin et al. (2004) defined a metabolomic phenotype in the brain of a SCA3 mouse model using ¹H-NMR and identified an increase in the glutamine concentration and a decrease in the myo-inositol concentration in the brain. More recently, Toonen et al. (2018) performed a metabolomic analysis of the plasma in a SCA3 mouse model and identified tryptophan as the most promising biomarker. However, metabolomic analysis has not yet been reported in SCA3 patients.
Herein, we sought to determine the biochemical pathways and potential biomarkers in SCA3. A global metabolomics approach combining ultra-high-performance liquid chromatography-mass spectrometry (UHPLC-MS) and gas chromatography-mass spectrometry (GC-MS) platforms was used to analyze the serum samples of preclinical and symptomatic SCA3 patients.

Subjects
A cohort of 26 genetically confirmed SCA3 patients was enrolled from the First Affiliated Hospital of Zhengzhou University, including 13 preclinical SCA3 patients and 13 symptomatic SCA3 patients. All patients underwent a detailed neurological examination by two neurologists. The severity of ataxia was evaluated using the Scale for the Assessment and Rating of Ataxia (SARA) and the International Cooperative Ataxia Rating Scale (ICARS). Symptomatic SCA3 was defined as a proven SCA3 mutation with symptomatic ataxia (SARA ≥ 3). Pre-SCA3 was defined as a proven SCA3 mutation with mild coordination deficits (SARA < 3) and/or unspecific neurological symptoms (Schmitz-Hübsch et al., 2006; Maas et al., 2015; Lima and Raposo, 2018). The Mini-Mental State Examination (MMSE) was used to measure general cognitive function. Non-ataxia features were also evaluated, including muscle cramps, sensory disturbances, hyperreflexia or hyporeflexia in the lower limbs, extrapyramidal signs, extensor plantar responses, and impaired vibration sense. In addition, 15 age-, sex-, and BMI-matched volunteers without any neurological or psychiatric diseases were enrolled as healthy controls. The CAG repeats of ATXN3 were tested using capillary electrophoresis (Souza et al., 2016).
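The grouping rule above can be written down as a small function. This is only an illustrative sketch of the SARA cut-off as stated in the text; the function name, its arguments, and the returned labels are hypothetical, and the "unspecific neurological symptoms" criterion is not modelled.

```python
def classify_carrier(sara_score: float, has_atxn3_expansion: bool = True) -> str:
    """Assign a study group from the SARA cut-off described in the text.

    Hypothetical helper: SARA >= 3 in a confirmed mutation carrier is
    treated as symptomatic SCA3, SARA < 3 as Pre-SCA3.
    """
    if sara_score < 0:
        raise ValueError("SARA score cannot be negative")
    if not has_atxn3_expansion:
        # Non-carriers fall outside both patient groups.
        return "non-carrier"
    return "symptomatic SCA3" if sara_score >= 3 else "Pre-SCA3"


if __name__ == "__main__":
    for score in (0.5, 2.5, 3.0, 12.0):
        print(score, classify_carrier(score))
```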
All the serum samples were collected and stored using standard procedures (Yin et al., 2013; Kamlage et al., 2014). No participant took any medication or consumed irritant food or drink during the 72 h before sampling. Peripheral venous blood was collected in 5 ml K⁺-EDTA anticoagulant tubes (Sarstedt) in the morning after an overnight fast. The blood was gently mixed and allowed to clot for 30 min in a 37 °C water bath. Serum was extracted by centrifuging at 5,000 rpm for 10 min (4 °C) and was stored immediately at −80 °C until analysis.
The Ethics Committee of the First Affiliated Hospital of Zhengzhou University approved this study. Written informed consent was obtained from each participant.

UHPLC-MS Analysis
An aliquot of 200 µl methanol containing internal standards was fully mixed with 50 µl serum to remove the protein. The suspension was subsequently drawn and lyophilized. The lyophilized powder was resuspended in 50 µl 20% acetonitrile in water by a 30 s vortex. After centrifugation, the supernatant was directly used for UHPLC (Waters, Milford, MA, USA) coupled to Q Exactive HF MS (Thermo Fisher Scientific, Waltham, MA, USA) analysis. For electrospray ionization positive (ESI+) mode, a BEH C8 column (100 mm × 2.1 mm, 1.7 µm; Waters, Milford, MA, USA) was used for separation. The mobile phases were 0.1% formic acid in water (A) and 0.1% formic acid in acetonitrile (B). The gradient started with 10% B and was maintained for 1 min, subsequently increased to 40% B within 5 min, and then reached 100% B at 17 min. After being maintained for 5 min, it returned to the initial 10% B. For electrospray ionization negative (ESI−) mode, an HSS T3 column (100 mm × 2.1 mm, 1.8 µm; Waters, Milford, MA, USA) was employed with 6.5 mM NH₄HCO₃ in water (C) and 6.5 mM NH₄HCO₃ in 95% methanol (D) as the mobile phases. The gradient started with 0% D and was maintained for 1 min, increased linearly to 40% D at 2 min and subsequently to 100% D at 13 min. After being maintained for 6 min, the gradient returned to the initial 0% D. For both modes, the column temperature was set at 50 °C with an elution flow rate of 0.35 ml min⁻¹, the MS capillary temperature was 300 °C, and the auxiliary gas heater temperature was 350 °C. The sheath gas and auxiliary gas flow rates were set at 45 and 10 units, respectively. Full scan resolution was set as 12 million. For the positive mode, the scan range was m/z 80-1,200 and the spray voltage was 3.5 kV. For the negative mode, the scan range was m/z 70-1,200 and the spray voltage was 3 kV.
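The two gradient programs described above can be represented as piecewise-linear breakpoint tables. The sketch below is one possible reading of the text, not the instrument method itself: it assumes that "within 5 min" means 40% B is reached at 5 min, that the hold at 100% lasts until 22 min (ESI+) and 19 min (ESI−), and it omits the re-equilibration step. All names are hypothetical.

```python
from bisect import bisect_right

# (time in min, % organic phase) breakpoints; an assumed reading of the text.
ESI_POS_GRADIENT = [(0.0, 10.0), (1.0, 10.0), (5.0, 40.0), (17.0, 100.0), (22.0, 100.0)]
ESI_NEG_GRADIENT = [(0.0, 0.0), (1.0, 0.0), (2.0, 40.0), (13.0, 100.0), (19.0, 100.0)]


def percent_organic(t_min, program):
    """Linearly interpolate the organic-phase percentage at time t_min."""
    times = [t for t, _ in program]
    if t_min <= times[0]:
        return program[0][1]
    if t_min >= times[-1]:
        return program[-1][1]
    # Index of the first breakpoint strictly later than t_min.
    i = bisect_right(times, t_min)
    (t0, p0), (t1, p1) = program[i - 1], program[i]
    return p0 + (p1 - p0) * (t_min - t0) / (t1 - t0)


if __name__ == "__main__":
    for t in (0.5, 3.0, 11.0, 20.0):
        print(f"t = {t:4.1f} min  ESI+ {percent_organic(t, ESI_POS_GRADIENT):5.1f}% B"
              f"  ESI- {percent_organic(t, ESI_NEG_GRADIENT):5.1f}% D")
```

Writing the program as data makes the ambiguous points of the description (the 5 min breakpoint and the hold durations) explicit and easy to change.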

GC-MS Analysis
An aliquot of 400 µl methanol containing internal standard was fully mixed with 100 µl serum to remove the protein and extract metabolites. After centrifugation at 15,000 g and 4 °C for 15 min, 360 µl of the supernatant was lyophilized. An aliquot of 100 µl methoxyamine pyridine solution (20 mg ml⁻¹) was added to the lyophilized powder. After a 30 s vortex and 15 min ultrasonication, oximation was performed at 40 °C in a water bath for 2 h. Subsequently, 80 µl of N-methyl-N-(trimethylsilyl) trifluoroacetamide (MSTFA) was added for silylation, which lasted for 1 h at 40 °C in a water bath. After centrifugation, the supernatant was ready for GC-MS analysis. The prepared sample (1 µl) was injected into the GCMS-QP 2010 analysis system (Shimadzu, Kyoto, Japan) with a split ratio of 10:1. A DB-5 MS capillary column (30 m × 250 µm × 0.25 µm; J & W Scientific, Folsom, CA, USA) was used for metabolite separation. The detailed separation parameters have been reported in our previous work (Ye et al., 2014). The full scan mode (33-600 m/z) was chosen with the event time set to 0.5 s.
To ensure the data quality, quality control (QC) samples were prepared by mixing all the samples. During the analysis of the samples, one QC sample was run after every 10 injections.
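The QC scheme can be illustrated with a short run-order generator. This is a minimal sketch that reads "one QC sample after every 10 injections" as one pooled-QC injection after every 10 study samples; the function name, sample identifiers, and the `QC` label are hypothetical, and any randomization of sample order is not shown.

```python
def build_run_order(sample_ids, qc_every=10, qc_label="QC"):
    """Interleave a pooled QC injection after every `qc_every` study samples."""
    order = []
    for position, sample_id in enumerate(sample_ids, start=1):
        order.append(sample_id)
        if position % qc_every == 0:
            order.append(qc_label)
    return order


if __name__ == "__main__":
    # 41 study samples: 13 preclinical, 13 symptomatic, 15 controls.
    samples = [f"S{i:02d}" for i in range(1, 42)]
    run_order = build_run_order(samples)
    print(len(run_order), "injections,", run_order.count("QC"), "of them QC")
```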

Data Analysis
For both the UHPLC-MS and GC-MS raw data, peak alignments were first performed and the ion tables exported. Unique ions for known metabolites were chosen and the derived final metabolite tables were imported into GC-MS solution software (Shimadzu, Kyoto, Japan) and TraceFinder software (Version 3.2, Thermo Fisher Scientific, Rockford, IL, USA) for batch integration. The ion structure elucidation was performed with the National Institute of Standards and Technology (NIST) library for GC-MS database and an in-house LC-MS2 library for UHPLC-MS database.
The clinical parameters were expressed as means (SD, standard deviation). Statistical analysis was carried out using the Chi-square test, one-way analysis of variance (ANOVA), and Student's t-test in SPSS software (Version 20.0, IBM Corporation, Armonk, NY, USA). Differences were considered significant at p < 0.05.
The annotated metabolites from the UHPLC-MS (ESI+, ESI−) and GC-MS analysis platforms were combined for further analysis. Multivariate analysis was performed to visualize general clustering of the samples using SIMCA-P+ software (Version 13.0, Umetrics, Umeå, Sweden). We performed unsupervised analysis by Principal Component Analysis (PCA), and supervised analysis by Orthogonal Projections to Latent Structures Discriminant Analysis (OPLS-DA) to assess the classification of the samples. Cross-validation was performed to check the robustness of the constructed OPLS-DA model. Variable Importance in Projection (VIP) in the OPLS-DA was used to identify metabolites that mainly contributed to the separation between two groups. Metabolites with VIP > 1.0 were chosen for the subsequent Wilcoxon-Mann-Whitney test. To correct for multiple testing, false discovery rates (FDR) were calculated using q values. Metabolites with both multivariate and univariate significance (VIP > 1.0 and q < 0.05) were considered differential markers. The potential association between the candidate metabolites and clinical characteristics in SCA3 patients was also evaluated using the Pearson correlation test in SPSS software.
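The two-stage selection rule (VIP > 1.0, then q < 0.05) can be sketched as follows. The text does not state which q-value procedure was used, so the Benjamini-Hochberg adjustment is swapped in here as an assumption; the function names and the example VIP and p values are hypothetical, and the p values are taken as already computed by a Wilcoxon-Mann-Whitney test.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values (q values), in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values


def select_differential(names, vip, p_values, vip_cut=1.0, q_cut=0.05):
    """Keep metabolites with VIP > vip_cut, then those with q < q_cut."""
    kept = [(n, p) for n, v, p in zip(names, vip, p_values) if v > vip_cut]
    q_values = benjamini_hochberg([p for _, p in kept])
    return [n for (n, _), q in zip(kept, q_values) if q < q_cut]


if __name__ == "__main__":
    names = ["m1", "m2", "m3", "m4"]          # hypothetical metabolites
    vip = [1.5, 2.0, 0.8, 1.2]                # hypothetical VIP scores
    p_values = [0.01, 0.04, 0.03, 0.20]       # hypothetical raw p values
    print(select_differential(names, vip, p_values))
```

Applying the FDR correction only to the VIP-filtered metabolites mirrors the order of steps described in the text; correcting across all annotated metabolites would be a more conservative alternative.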
The scatter plots of the candidate biomarkers were generated using GraphPad Prism version 7.0. Receiver operating characteristic (ROC) curve analysis and logistic regression analysis were performed using SPSS software to evaluate the predictive potential of the candidate diagnostic biomarkers.
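For a single metabolite, the area under the ROC curve equals the probability that a randomly chosen patient value exceeds a randomly chosen control value, with ties counted as one half. The sketch below computes it that way; it is an illustration rather than the SPSS procedure used in the study, and the function name and example values are hypothetical.

```python
def roc_auc(case_values, control_values):
    """AUC as the fraction of (case, control) pairs in which the case is higher."""
    if not case_values or not control_values:
        raise ValueError("both groups must contain at least one value")
    wins = 0.0
    for case in case_values:
        for control in control_values:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5  # ties contribute half a win
    return wins / (len(case_values) * len(control_values))


if __name__ == "__main__":
    patients = [3.0, 4.0, 5.0]   # hypothetical relative intensities
    controls = [1.0, 2.0, 3.0]
    auc = roc_auc(patients, controls)
    # For a metabolite that is lower in patients, report 1 - AUC instead.
    print(round(max(auc, 1.0 - auc), 3))
```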

Metabolic Overviews of the SCA3 Patients
Three independent metabolomics platforms, UHPLC-MS (ESI+), UHPLC-MS (ESI−), and GC-MS, were employed to collect comprehensive metabolic data. QC samples were prepared to monitor instrument robustness during data acquisition. The relative standard deviation (RSD) of the ions detected in each platform is shown in Supplementary Figure S1. Typical total ion chromatograms (TIC) from the QC are shown in Supplementary Figure S2. The results indicated good reproducibility and stability during the procedure, and the data were therefore suitable for the subsequent metabolomics analysis. Based on the integrated metabolomics analytical data, 321 metabolites in total were identified, including 115 metabolites from UHPLC-MS (ESI+), 62 metabolites from UHPLC-MS (ESI−) and 144 metabolites from GC-MS (Supplementary Table S1). In the PCA plot of the three groups, the SCA3 group was separated significantly from the control and Pre-SCA3 groups, while the Pre-SCA3 and control groups overlapped (Figure 1A, R²X(cum) = 0.93, Q²(cum) = 0.86). The PCA plot of the Pre-SCA3 and control groups showed no significant distinction (Figure 1B, R²X(cum) = 0.249, Q²(cum) = 0.057).
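The relative standard deviation used to judge QC reproducibility is the sample standard deviation of an ion's intensity across the QC injections divided by its mean, expressed as a percentage. A minimal sketch, with a hypothetical function name and invented intensities:

```python
from statistics import mean, stdev


def rsd_percent(qc_intensities):
    """Relative standard deviation (%) of one ion across repeated QC injections."""
    average = mean(qc_intensities)
    if average == 0:
        raise ValueError("mean intensity is zero; RSD is undefined")
    # stdev() is the sample standard deviation (n - 1 in the denominator).
    return 100.0 * stdev(qc_intensities) / average


if __name__ == "__main__":
    qc_peak_areas = [100.0, 110.0, 90.0]  # hypothetical peak areas of one ion
    print(f"RSD = {rsd_percent(qc_peak_areas):.1f}%")
```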
To further identify the metabolites that discriminate the SCA3 group from the Pre-SCA3 or control group, supervised OPLS-DA models were built for each pair of groups (Figures 1C,D). The variables were scaled to unit variance, and cross-validation with 200 permutation tests was used to assess the reliability of the models. The R²Y(cum) and Q²Y(cum) of the OPLS-DA model for the SCA3 and control groups were 0.945 and 0.794, respectively, with four components responsible for the classification. The R²Y(cum) and Q²Y(cum) of the OPLS-DA model for the SCA3 and Pre-SCA3 groups were 0.851 and 0.691, respectively, with two components.
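For orientation, the goodness-of-fit and predictive-ability statistics reported for such models are conventionally defined as below. These are the textbook definitions; the exact cross-validation scheme applied by the software in this study is not stated in the text, so the notation $\hat{y}_{(-i)}$ (the prediction for sample $i$ from a model fitted without that sample's cross-validation group) is an assumption about the general form only.

```latex
R^2Y = 1 - \frac{\sum_i \left(y_i - \hat{y}_i\right)^2}{\sum_i \left(y_i - \bar{y}\right)^2},
\qquad
Q^2 = 1 - \frac{\sum_i \left(y_i - \hat{y}_{(-i)}\right)^2}{\sum_i \left(y_i - \bar{y}\right)^2}
```

A Q² close to R²Y suggests that the class separation is not driven by overfitting, which is what the permutation tests are meant to check.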

Differential Metabolites Related to SCA3
Univariate statistical analysis was subsequently performed based on the VIP of the OPLS-DA model. Metabolites with VIP > 1 in OPLS-DA and q value < 0.05 in the Wilcoxon-Mann-Whitney test after correction for multiple testing were selected as significantly differential metabolites. In total, 18 differential metabolites were highlighted between groups (Table 2). The correlation analysis showed negative correlations between the SFA levels and ICARS scores (r = −0.714; p = 0.006), FFA 16:0 levels and ICARS scores (r = −0.649; p = 0.016), and a positive correlation between L-proline levels and MMSE scores (r = 0.593; p = 0.033; Table 3).
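The correlation coefficients above follow the usual Pearson formula, and their significance is assessed through a t statistic with n − 2 degrees of freedom. The sketch below shows both steps; the function names are hypothetical, and the sample size of 13 used in the example is an assumption (the number of symptomatic patients), not a value stated alongside the correlations.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def correlation_t_statistic(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * sqrt((n - 2) / (1.0 - r * r))


if __name__ == "__main__":
    # Reported r for SFA vs ICARS, with an assumed n of 13.
    print(round(correlation_t_statistic(-0.714, 13), 2))
```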

Perturbed Metabolic Pathways
Based on MetaboAnalyst 3.0, HMDB, and KEGG, a map of the altered metabolic pathways in SCA3 patients was constructed (Figure 2). The serum metabolic profile changed with the progression of the disease, and the changes were mainly associated with the AA and fatty acid metabolism pathways.

Potential Biomarkers
The ROC curves were plotted based on the differential metabolites between the symptomatic SCA3 group and the control group. Metabolites with an area under the curve (AUC) > 0.7 were selected as potential biomarkers. FFA 16:1 (palmitoleic acid), FFA 18:3 (linolenic acid), L-Proline and L-Tryptophan were identified as potential biomarkers (Figures 3A-E). A combined ROC analysis was performed with the selected biomarkers, followed by binary logistic regression analysis, with the AUC reaching 0.979 (Figure 3F, Table 5).

DISCUSSION
This study presents the first evaluation of the serum metabolomic profile of patients with SCA3. Our results show that the metabolomic profile of symptomatic SCA3 patients differs significantly from preclinical SCA3 patients and healthy controls, while the metabolomic profile of preclinical SCA3 patients shows no obvious difference compared to healthy controls. Importantly, the differential metabolites of symptomatic SCA3 patients revealed perturbations in AA metabolism and fatty acid metabolism.
The AA metabolic pathway was found to be significantly disrupted in the symptomatic SCA3 group. Branched-chain amino acids (BCAAs), including valine and leucine, and aromatic amino acids (ArAAs), including tryptophan and tyrosine, were all downregulated in the serum of SCA3 patients. Moreover, proline and hippuric acid, a product of phenylalanine metabolism, were also decreased. With respect to the perturbed biochemical pathways, these altered metabolites are not only related to energy metabolism, but also influence the metabolism of neurotransmitters. We inferred that 5-hydroxytryptamine (5-HT), dopamine, and γ-aminobutyric acid (GABA) may also be affected in SCA3 patients. Indeed, previous studies have suggested that dopamine and 5-HT pathways are associated with SCA3 (Schols et al., 1998, 2015; Takei et al., 2002, 2005; Teixeira-Castro et al., 2015; Martinez et al., 2017). In addition, BCAAs have been associated with lowered 5-HT levels because they compete with tryptophan (a precursor of 5-HT) for transport across the blood-brain barrier (Choi et al., 2013). Interestingly, a recent metabolomic study of plasma from a mouse model of SCA3 reported tryptophan as the most promising biomarker, which is consistent with our results (Toonen et al., 2018). Tryptophan has also been reported to be elevated in Huntington disease because of increased 3-hydroxyanthranilate oxygenase activity (Schwarcz et al., 1988). Thus, the detection of tryptophan and 3-hydroxyanthranilate oxygenase in SCA3 patients corroborates our results.
Fatty acid metabolism was another significantly perturbed pathway in symptomatic SCA3 patients. β-oxidation of free fatty acids (FFAs) is a multi-step process in which fatty acids are broken down in various tissues to produce energy. In this study, the level of saturated fatty acids (SFAs) decreased in the serum of symptomatic SCA3 patients, whereas the levels of monounsaturated fatty acids (MUFAs) and polyunsaturated fatty acids (PUFAs) increased. Specifically, palmitic acid (FFA 16:0) and stearic acid (FFA 18:0) were decreased, while palmitoleic acid (FFA 16:1), oleic acid (FFA 18:1), linoleic acid (FFA 18:2) and linolenic acid (FFA 18:3) were increased. Stearoyl-CoA desaturase (SCD) is the key enzyme that catalyzes the conversion of saturated fatty acids to monounsaturated fatty acids, especially for FFA 16:0 and FFA 18:0. The results indicated two-fold elevated SCD indices in symptomatic SCA3 patients. Indeed, previous studies have shown that SCD indices are elevated in some diseases, such as Alzheimer's disease and amyotrophic lateral sclerosis (Astarita et al., 2011; Henriques et al., 2015). However, the role of SCD in SCA3 patients still needs to be elucidated in order to explain our results.
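An SCD (stearoyl-CoA desaturase) index is commonly estimated as the ratio of a monounsaturated product to its saturated precursor. The text does not give the formula used in this study, so the sketch below assumes the conventional ratios FFA 16:1 / FFA 16:0 and FFA 18:1 / FFA 18:0; the function name, dictionary keys, and intensities are hypothetical.

```python
def scd_indices(ffa_levels):
    """Product-to-precursor ratios as surrogate indices of SCD activity.

    `ffa_levels` maps fatty acid labels to relative serum levels
    (assumed convention, not taken from the source).
    """
    return {
        "SCD-16": ffa_levels["FFA 16:1"] / ffa_levels["FFA 16:0"],
        "SCD-18": ffa_levels["FFA 18:1"] / ffa_levels["FFA 18:0"],
    }


if __name__ == "__main__":
    # Hypothetical relative intensities for one control and one patient.
    control = {"FFA 16:0": 10.0, "FFA 16:1": 1.0, "FFA 18:0": 8.0, "FFA 18:1": 4.0}
    patient = {"FFA 16:0": 8.0, "FFA 16:1": 1.6, "FFA 18:0": 5.0, "FFA 18:1": 5.0}
    for label, levels in (("control", control), ("patient", patient)):
        print(label, scd_indices(levels))
```

Because each index is a ratio, a simultaneous fall in the saturated precursor and rise in the unsaturated product, as reported here, raises it more than either change alone.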
Glycochenodeoxycholate (GCDCA) is a conjugated bile acid, composed of glycine and chenodeoxycholic acid, which is associated with cholesterol metabolism (Li and Chiang, 2009). The carnitine system, including free carnitine and acylcarnitines, is essential for cellular energy metabolism as a carrier of long-chain fatty acids for β-oxidation or as a reservoir of acyl-CoA (Jones et al., 2010). Hippuric acid (N-Benzoylglycine) is synthesized from benzoic acid and glycine by the enzyme acyl-CoA:glycine N-acyltransferase (GLYAT; Irwin et al., 2016). The decrease of hippuric acid in SCA3 patients indicates the abnormal AA and fatty acid metabolism of SCA3 patients. However, it is unclear whether the key enzyme, GLYAT, functions abnormally.
According to the ROC curve analysis, a set of four candidate biomarkers, composed of FFA 16:1, FFA 18:3, L-Proline and L-Tryptophan, was selected to differentiate the symptomatic SCA3 patients from the healthy controls with an AUC value of 0.979. These potential metabolite markers provide a novel and promising diagnostic approach for detection of SCA3.
However, this study has limitations. The concentration of metabolites in blood may only weakly represent their concentration in the brain, and obtaining CSF or brain tissue may be invasive and unethical. The small number of samples is another limitation, as this is an exploratory study designed to evaluate the metabolic profiles of age-, sex-, and BMI-matched preclinical and symptomatic SCA3 patients. A further limitation is the cross-sectional design, which may only capture short-term metabolic perturbations. Nevertheless, the data serve as a foundation for future longitudinal studies.
SCA3 is a slowly progressing disorder with a long preclinical period, and molecular biomarkers are urgently needed. Recently, promising candidate molecular biomarkers of SCA3 have emerged, including genetic modifiers, transcriptional biomarkers, and mitochondrial DNA damage (Franca et al., 2012; Raposo et al., 2015a,b, 2017, 2019; Ramos et al., 2019). Although the biomarkers identified in our study cannot differentiate between preclinical carriers and symptomatic patients, they should help to improve sensitivity when used as a complement to clinical and imaging markers. Furthermore, once ameliorating drugs become available, biomarkers able to detect pathogenic alterations will be useful for optimizing therapeutic efficacy.
In conclusion, the serum metabolic profile changes as the disease progresses in SCA3 patients, with the perturbations mainly associated with AA metabolism and fatty acid metabolism pathways. We propose a panel of four promising biomarkers that discriminated symptomatic patients from healthy controls. However, further longitudinal studies on larger cohorts, especially those designed to investigate changes in neurotransmitters and metabolomic profiles in the CSF or cerebellum of SCA3 patients, are required to validate our findings.

DATA AVAILABILITY
All datasets generated for this study are included in the manuscript and/or the Supplementary Files.

ETHICS STATEMENT
This study was carried out in accordance with the recommendations of the Ethics Committee of the First Affiliated Hospital of Zhengzhou University with written informed consent from all subjects. All subjects gave written informed consent in accordance with the Declaration of Helsinki. The protocol was approved by the Ethics Committee of the First Affiliated Hospital of Zhengzhou University.

AUTHOR CONTRIBUTIONS
YX and GX designed the project and reviewed the article. ZY, CS and LZ performed the metabolomic analysis, data analysis, and drafted the manuscript. YLi revised the manuscript. JY and YLiu assessed the clinical characteristics of the subjects. CM and HL collected the serum.