Urine Protein Biomarker Candidates for Autism


Introduction
It is estimated that 1 in 88 children in the USA have autism spectrum disorder (ASD) with a worldwide average prevalence of about 1% [1].
Diagnosis of ASD is based on DSM behavioral criteria and best addressed by a multi-disciplinary team, utilizing standardized tools and laboratory evaluation for co-existing conditions. Diagnosticians require extensive training to apply diagnostic criteria to children at various developmental levels and recognize the wide heterogeneity in ASD features. Professionals are currently unable to meet the demands for timely and accurate diagnoses. Despite the fact that symptoms are often present before age 2 [2,3], the average age of diagnosis was 3.1 years for children with autistic disorder, 3.9 years for pervasive developmental disorder not otherwise specified, and 7.2 years for Asperger's disorder [4]. Significant racial/ethnic disparities exist in the recognition of ASD [5]. Surveyed parents of children with autism waited almost three years to receive a diagnosis following their first visit to a professional regarding their child's development [6]. Urine biomarkers of autism would be very useful as urine is easy to obtain non-invasively and diagnosis could be hastened. Non-proteomic candidate urine biomarker studies have been reviewed [7]. Proteomic methods have been advocated for discovery of autism biomarkers [8]. We present data regarding potential urine biomarkers discovered with the use of isobaric Tags for Relative and Absolute Quantitation (iTRAQ), a proteomic method not previously used for ASD biomarker discovery.

Sample collection
This project was approved by the University of Minnesota Institutional Review Board, and informed written consent was obtained from all subjects or their parents. Urine samples were collected from patients with well-characterized severe ASD (n = 8) and from age- and gender-matched normal controls (n = 8), who were pediatric subjects with no complaints of any kind, normal growth, no medications and no known diseases. Subjects ranged in age from 5 to 15 years with a mean age of 9.4 years; 14 of 16 were male. ASD subjects were recruited from a University of Minnesota clinic specializing in care for ASD patients by a physician with special interest, training and experience in ASD. Diagnoses were established through medical evaluations that included clinical observation and Autism Diagnostic Observation Schedule testing performed by trained clinical psychologists and/or physicians. This paper examines the differences between the ASD and normal control groups. All urine samples were stored at -80°C until use. Aliquots of urine were then used for protein extraction and further analysis.

Protein extraction and quantification
For isobaric Tags for Relative and Absolute Quantitation (iTRAQ™) analysis, a mass spectrometry technology for relative protein quantitation that incorporates stable isotopes into peptides, each urine sample was mixed with 4x acidic acetone, left at -20°C for 2 h and then centrifuged at 4000 × g for 15 min to precipitate the protein.
The pellet was then dissolved in water, and the acetone precipitation described above was repeated two more times. The proteins were then dissolved and dialyzed with iTRAQ sample buffer.

Abstract
Autism Spectrum Disorder (ASD) is increasingly common and treatment is most successful if instituted early in life. Urine biomarkers of autism could hasten diagnosis as urine is easy to obtain non-invasively. The purpose of this research was to compare urine samples from 8 ASD subjects and 8 age- and gender-matched controls. Samples were analyzed with isobaric Tags for Relative and Absolute Quantitation (iTRAQ™), a mass spectrometry method which enables identification and relative quantification of many proteins. We identified 231 proteins present in at least two control and two autism subjects for statistical analysis. We ranked the proteins according to the p-values between autism and control groups. The top five proteins were increased in ASD subjects compared to controls and included alpha 1-acid glycoprotein, prostaglandin-H2 D-isomerase, kininogen-1 isoform 2, leucine-rich alpha-2-glycoprotein 1 and immunoglobulin fragment Fab New lambda light chain. Two of the top 10 proteins have previously been related to autism, while six have previously been related to inflammation. We analyzed the 231 proteins with Ingenuity Pathway Analysis (IPA) to assess pathways involved and potential biomarkers. IPA suggested 104 of the 231 proteins as possible biomarkers in urine. The remaining 127 urinary proteins we identified are novel as they are not included as IPA urinary proteins. These research data fit with some current hypotheses regarding autism and suggest a relationship between ASD, inflammation and gastrointestinal disease. Specific urinary proteins are identified which could potentially serve as biomarkers for ASD.

Urine iTRAQ™-labeling experiments
Protein concentration after sample desalting was determined with Bradford reagent (Bio-Rad, Hercules, CA) by absorbance measurements at 590 nm and serial dilutions of samples and bovine serum albumin standard (1 mg/ml). The urine samples were thawed, diluted and loaded onto an IgY serum depletion column (Proteomelab IgY HC Spin Column; Beckman Coulter, Fullerton, CA) to remove highly abundant proteins according to the manufacturer's instructions. The column removes 12 highly abundant proteins (albumin, IgG, IgA, IgM, HDL, Apo-A-I, Apo-A-II, haptoglobin, alpha1 antitrypsin, alpha1 acid glycoprotein and alpha2 microglobulin) from the urine samples, which together account for about 96% of the total protein. Equal amounts of protein (20 µg) from urine of normal and autism subjects were labeled with iTRAQ™ reagents (ABI, Foster City, CA) according to the manufacturer's protocol as described previously [9]. Two iTRAQ 8-plex experiments were performed, each containing 4 control samples labeled with iTRAQ reagents 113, 114, 115 and 116 and 4 autism samples labeled with reagents 117, 118, 119 and 121.
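The labeling layout described above can be written down as a small lookup table. The following is a minimal sketch, assuming the two 8-plex experiments are simply numbered 1 and 2; the names `build_design`, `CONTROL_CHANNELS` and `AUTISM_CHANNELS` are hypothetical and not part of any vendor software.

```python
# Reporter-ion channels as stated in the text:
# 113-116 carry control samples, 117-119 and 121 carry autism samples.
CONTROL_CHANNELS = (113, 114, 115, 116)
AUTISM_CHANNELS = (117, 118, 119, 121)


def build_design(n_experiments=2):
    """Return {(experiment, channel): group} for every labeled sample.

    Experiment numbering (1, 2, ...) is an assumption made for illustration.
    """
    design = {}
    for experiment in range(1, n_experiments + 1):
        for channel in CONTROL_CHANNELS:
            design[(experiment, channel)] = "control"
        for channel in AUTISM_CHANNELS:
            design[(experiment, channel)] = "autism"
    return design


design = build_design()
```

A table of this kind makes it easy to check that each of the 16 samples is assigned to exactly one channel and that both groups are balanced within every 8-plex run.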

Strong cation exchange (SCX) chromatography
Peptide mixtures were reconstituted with SCX load buffer and fractionated off-line in the first dimension by SCX liquid chromatography as described previously [10] with the following modifications. Fractions were collected at 3 min intervals during a linear gradient from 10 to 100 mM potassium phosphate over 55 min, followed by a 100 to 500 mM potassium phosphate gradient from 55 min to 75 min. Fractions with UV absorbance > 2.0 mAU at 280 nm were dried in vacuo for reversed-phase LC on a C18 column.

LC-MALDI and 4800 MS/MS
The Tempo™ LC MALDI spotting system (AB Sciex, Foster City, CA) was used to separate peptides in select (7 of 14 peptide-containing) SCX fractions. LC eluate was deposited onto an LC MALDI target in a 1232-spot format. MS data were acquired on a 4800 MALDI TOF/TOF™ analyzer (AB Sciex) with a 200 Hz repetition rate Nd:YAG laser as described previously [11].

Peptide and protein identification
Tandem mass spectra were analyzed with ProteinPilot™ version 3.0.1 software (ABI), which uses the Paragon™ scoring algorithm [12], and a list of inferred proteins was provided. The ratios of select pairs of iTRAQ™ "reporter ions" (m/z 113.1, 114.1, 115.1, etc.) provided relative protein abundance among sample types for select proteins. The peptides used for quantitation were determined by ProteinPilot™ and no peptides were manually curated out of the final report. Briefly, any peptide included in the protein iTRAQ™ average with a peptide ID confidence ≥ 1%, not associated with two distinct proteins, and with an S/N sum for the peak pair of > 9 was included in the report. Search parameters included: 8-plex peptide mode; quantitation; fixed cysteine modification by methyl methanethiosulfonate; trypsin enzyme; thorough search mode (which includes semi-tryptic peptides during the search); biological modifications (includes > 220 post-translational and artifactual modifications); and minimum ProtScore / protein confidence of 95% (see below). Precursor and product ion tolerances are calculated iteratively during the database search and are optimized from within the ProteinPilot software. The protein database was the Human subset of the NCBI RefSeq (http://www.ncbi.nlm.nih.gov/RefSeq/) database, date of release: June 2008; 33850 target sequences (which contained 179 common contaminant protein sequences from the cRAP database (http://www.thegpm.org/crap/index.html)), and all reversed protein sequences. We removed 'reversed' hits from our analysis [13]. The ProGroup algorithm within ProteinPilot software was used to create protein groups based on peptides observed.
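The peptide inclusion rules stated above (confidence ≥ 1%, mapping to a single protein, S/N sum > 9) together with the removal of reversed hits can be illustrated as a simple filter. This is only a sketch of the stated criteria, not ProteinPilot's internal logic; the `Peptide` record and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Peptide:
    sequence: str
    proteins: tuple      # accessions the peptide maps to
    confidence: float    # peptide ID confidence, in percent
    sn_sum: float        # summed S/N of the reporter-ion peak pair
    is_reversed: bool    # True for a hit against a reversed (decoy) sequence


def usable_for_quantitation(p, min_conf=1.0, min_sn=9.0):
    """Apply the inclusion rules described in the text to one peptide."""
    return (
        not p.is_reversed                  # 'reversed' hits are removed
        and p.confidence >= min_conf       # confidence >= 1%
        and len(set(p.proteins)) == 1      # not shared by two distinct proteins
        and p.sn_sum > min_sn              # S/N sum for the peak pair > 9
    )


# Hypothetical peptides, for illustration only.
peptides = [
    Peptide("PEPTIDEA", ("P1",), 95.0, 20.0, False),
    Peptide("PEPTIDEB", ("P1", "P2"), 99.0, 30.0, False),  # shared peptide
    Peptide("PEPTIDEC", ("P3",), 50.0, 9.0, False),        # S/N not above 9
    Peptide("PEPTIDED", ("P4",), 99.0, 40.0, True),        # reversed hit
]
kept = [p.sequence for p in peptides if usable_for_quantitation(p)]
```

Only the first hypothetical peptide passes all four conditions; each of the others fails exactly one rule, which makes the effect of every criterion visible.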

Quantitative analyses
iTRAQ analysis: Relative protein abundance from iTRAQ was normalized across experiments based on the iterative regression approach of Oberg et al. [14]. For normalization, only legitimate peptide sequences that mapped to a uniquely identified protein within an experiment were used. Any accession values that were partially nested between experiments were combined. Then any peptides that still mapped to multiple proteins across experiments were removed. After normalization, each protein was evaluated separately to estimate differences between disease groups [14]. The top 25 proteins, ranked by smallest p-value, were plotted with means and 95% confidence intervals based on quantiles from a normal distribution.
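As an illustration of a 95% confidence interval based on normal quantiles, the sketch below computes mean ± z × SE for a list of values. It assumes a plain sample mean and sample standard deviation; the estimates in the study come from the regression model of Oberg et al. [14], which is not reproduced here, and the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_with_normal_ci(values, level=0.95):
    """Return (mean, lower, upper) using quantiles of the normal distribution."""
    # For level = 0.95 this is the 97.5% quantile, approximately 1.96.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    m = mean(values)
    half_width = z * stdev(values) / sqrt(len(values))
    return m, m - half_width, m + half_width


# Hypothetical log abundance differences for one protein.
m, lower, upper = mean_with_normal_ci([1.0, 2.0, 3.0])
```

With so few subjects per group, a normal-quantile interval is narrower than the corresponding t-based interval, which is worth keeping in mind when reading such plots.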
Ingenuity pathway analysis: Ingenuity Pathway Analysis (IPA; Ingenuity Systems, Redwood City, CA, http://www.ingenuity.com) constructs hypothetical protein interaction clusters on the basis of the regularly updated "Ingenuity Pathways Knowledge Base (IPKB)". IPKB is a very large curated database that consists of millions of individual relationships between proteins, culled from the biologic literature. This database, which is updated every two weeks, also integrates a large range of biological information including protein function, cellular localization, and small molecule and disease inter-relationships. The IPA version available on October 18, 2012 was used for our analysis. IPA functional analysis identifies biological functions and/or diseases related to the proteins identified. After the number of proteins associated with each biological function and/or disease is counted, a Fisher's exact test is used to calculate a p-value to quantify the strength of evidence for those relationships. A p-value cut-off of 0.01 was considered for the analysis in this study [15,16].
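The Fisher's exact test used for such enrichment questions can be sketched with the hypergeometric distribution. The version below is a one-sided (right-tailed) test written from first principles; whether IPA uses exactly this form, and what reference set it uses, are assumptions, and the function name and arguments are hypothetical.

```python
from math import comb


def fisher_right_tail(k, n, K, N):
    """P(X >= k) for X ~ Hypergeometric(N, K, n).

    k: dataset proteins annotated to the function or disease
    n: proteins in the dataset
    K: annotated proteins in the reference set
    N: proteins in the reference set
    """
    total = comb(N, n)
    upper = min(n, K)
    # Sum the probabilities of observing k or more annotated proteins.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Classic 2x2 example: 4 of 4 drawn items are annotated, out of 4 annotated in 8.
p_small = fisher_right_tail(4, 4, 4, 8)
```

A small p-value indicates that the overlap between the dataset and the annotated set is larger than expected if proteins had been drawn at random from the reference set.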

Protein profile
To find urine proteins that may provide evidence of biological differences between normal and autism groups, we compared urine samples from 8 autism patients and 8 normal gender- and age-matched controls with 8-plex iTRAQ LC-MS. Labeled peptide fragments were subjected to 2D LC-MS/MS. The NCBI RefSeq human database was used to correlate MS/MS spectra to peptides and the corresponding proteins in the database. Initially, the raw iTRAQ data had 362 proteins and 3,024 peptides with a measurable area. This was reduced to 324 proteins and 832 peptides after removing 'reverse hits' and peptides that mapped to multiple proteins within iTRAQ experiments. After combining the iTRAQ experiments, combining partially nested accession numbers, and removing repeated peptides, there were 249 proteins and 734 peptides. Finally, 231 proteins were quantified in at least two control and two autism subjects and were retained for statistical analysis.
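The final filtering step, which keeps only proteins quantified in at least two control and two autism subjects, can be sketched as follows. The data layout (a dictionary of per-group value lists with `None` for missing quantification) and all names are hypothetical.

```python
def quantified_count(values):
    """Number of subjects with a quantified value (None marks missing data)."""
    return sum(v is not None for v in values)


def keep_protein(groups, min_per_group=2):
    """True if both groups have at least `min_per_group` quantified subjects."""
    return all(
        quantified_count(groups[g]) >= min_per_group
        for g in ("control", "autism")
    )


# Hypothetical log abundance values, for illustration only.
proteins = {
    "protein_A": {"control": [0.1, None, 0.3, None],
                  "autism": [0.5, 0.7, None, None]},
    "protein_B": {"control": [0.2, None, None, None],
                  "autism": [0.4, 0.6, 0.9, None]},
}
kept = [name for name, groups in proteins.items() if keep_protein(groups)]
```

In this toy input, `protein_B` is dropped because only one control subject has a quantified value.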
For the 231 protein groups with quantification data, the iTRAQ log abundance ratios of the top 25 proteins of interest (i.e., those showing the largest estimated differences between autism subjects and normal controls) are shown in Figure 1, listed in ascending order of the p values for the mean difference in log abundance between the autism subjects and the normal controls. If a protein was increased in autism subjects compared to controls, the resulting ratio falls above the horizontal line. Proteins that are either significantly increased or decreased compared to controls could potentially serve as biomarkers. Table 1 shows the top 10 proteins ranked by p value. Some of these proteins are related to inflammation, which is one of the findings in the gastrointestinal tract in autism. Alpha 1-acid glycoprotein and collagen alpha-1(XII) have been reported to be involved in autism [17,18]. Proteins in the list related to inflammation include prostaglandin-H2 D-isomerase, which is related to colitis [19], and leucine-rich alpha-2-glycoprotein, which is related to the acute inflammatory response [20] (Table 1).
The 231 quantitated proteins from the iTRAQ experiments were imported into IPA for functional analysis of canonical pathways associated with these proteins or genes. The functional analysis identified the biological functions and/or diseases that were most significant to the dataset. In this functional IPA analysis, the number of proteins associated with each biological function and/or disease is counted and a range of p values is generated. The top disease and disorder categories with the smallest p value ranges from this analysis were Organismal Injury and Abnormalities (3.23E-06 to 3.20E-02), Inflammatory Response (7.47E-06 to 4.91E-02) and Gastrointestinal Disease (4.06E-03 to 3.20E-02), all of which have been reported to be related to autism (Table 2). Interestingly, the third lowest p value range was for Renal and Urological Disease (3.84E-05 to 6.24E-03), though only 2 molecules were involved in this pathway, while the other pathways included 7, 19 and 11 molecules, respectively.
The Pathway Report from IPA lists several molecules involved in signaling pathways related to the inflammatory response. These molecules include chemokine CX3C ligand, trefoil factor 2, inducible T-cell costimulator ligand, kallikrein-related peptidase 3 and plasminogen.
Pathway analysis also found that hepatocyte growth factor receptor and complement component 3 receptor are among the proteins involved in macropinocytosis signaling, and that integrin, alpha 9 is involved in the ERK5 signaling pathway. These inflammation pathways are consistent with the increased inflammation reported in autism.
IPA biomarker analysis of the 231 quantitated proteins shows that 104 proteins in this list could be biomarkers in urine (indicated by asterisks in Supplemental Table 1). The remaining 127 of the 231 urine proteins we identified are not included as urine proteins in the IPA database, which suggests that they have not previously been reported in urine.

Discussion
Urine is an ideal source of biomarkers [21] because it is abundant, easily acquired and enriched with proteins and metabolites. Abundant plasma proteins such as albumin are not filtered by the normal renal glomerulus [22] providing an advantage in urine proteomics. In addition, the array of proteases activated upon blood clotting and resulting in many degradation products [23] is not a problem with urine. Fiedler et al. [24] defined how variables within urine itself, such as salt composition, pH and sample handling influence protein recovery and made proteomic workup of urine reproducible. The urinary proteome did not undergo significant changes when urine was stored for 3 days at 4°C or 6 h at room temperature [25,26]. Urine offers a sampling of most plasma proteins, with increased proportions of low-molecular-weight proteins [27]. A urine study in healthy individuals [28] identified more than 1500 proteins.
Since the microvascular architecture of the kidney can be altered by pathologic processes in other parts of the body, it is not surprising that proteome analysis of the urine has revealed biomarkers for nonrenal diseases. Urine biomarkers for graft-versus-host disease have been identified [29] and confirmed [30] using proteomic techniques. Similar studies have defined urine biomarkers related to cardiovascular disease [21]. No urine biomarkers are currently used clinically in patients with ASD. However, there is evidence that urine could be useful in the study of ASD.
Many ASD patients have gastrointestinal dysfunction [31,32]. A variant in the gene encoding the MET receptor tyrosine kinase is associated with ASD. MET is a receptor that functions in both brain development and gastrointestinal repair. The ASD-associated MET promoter variant is enriched in a subset of individuals with co-occurring autism spectrum disorder and gastrointestinal conditions [33]. Since gastrointestinal disorders may be associated with altered intestinal permeability, resulting blood and urine levels could also be affected.
AutDB (http://www.mindspec.org/autdb.html), a publicly available database for autism research, is built on information from studies on molecular genetics and biology of ASD. As of December 1, 2012, AutDB listed 331 autism-associated genes encoding 325 proteins, 31 of which have been reported in urine (Table 3), according to IPA and other studies. These 31 urinary proteins have not been studied in autism, but by using mass spectrometry with inclusion lists these proteins can be specifically targeted for analysis. Seven of these ASD-associated proteins (* in first column of Table 3) were identified in the MS analyses in this study, and 6 of them have been reported by IPA as urinary proteins. These 6 proteins are: 1) C4B, a protein of the complement system, which is involved in inflammation [34]; 2) CD44, which has also been reported to be involved in inflammation [35,36]; 3) neurexin, which has been reported to be involved in neuronal function and damage [37,38]; 4) protein kinase C beta, which has been linked to autism [39,40]; 5) cell adhesion molecule 1, which has been related to autism [41,42]; and 6) titin, which has been related to inflammation in renal allograft rejection [43].
Syndecan-2 is a protein found in our autism urine samples which has not been reported in IPA. Syndecan-2 has also been associated with neuronal inflammation [44]. These studies support the validity of proteomic study to explore the urine of patients with ASD.
Ingenuity Pathway Analysis of the identified proteins showed that the top related diseases and disorders were Organismal Injury and Abnormalities, Inflammatory Response and Gastrointestinal Disease.
This is consistent with literature reports about possible pathological mechanisms related to autism. There are also reports that autoimmune mechanisms and neuronal abnormalities are involved in autism [45][46][47][48].
Most of the top 10 proteins that differed between ASD subjects and controls have been reported to be related to autism or inflammation (Table 1). For example, alpha 1-acid glycoprotein has been related to autism [17]. Prostaglandin-H2 D-isomerase [19] and leucine-rich alpha-2-glycoprotein have been related to inflammatory disease [49]. Kininogen has been related to neuronal function [50,51]. Lithostathine-1-alpha has been related to inflammatory disease [52,53]. Alpha-2-glycoprotein 1 might be involved in both neuronal and inflammatory function [20]. Collagen alpha-1 has also been related to inflammatory disease and neuronal abnormalities [18,54,55]. Inter-alpha-trypsin inhibitor heavy chain H4 may also be involved in inflammatory disease [56,57]. Immunoglobulin is commonly known to be involved in inflammation, but the increase of immunoglobulin fragment Fab New lambda light chain in autism deserves further investigation. Vitelline membrane outer layer 1 homolog, isoform CRA_b is expressed in normal tissue, but no description of its function has yet been reported.
Previous proteomic studies of ASD mainly describe brain or serum proteins. In a post-mortem study [58], brain proteins were isolated from the frontal lobe gray matter of eight autism patients. After 2DE and LC-MS/MS, the authors identified glyoxalase I (Glo1) with altered mobility because of a single-nucleotide polymorphism. Although Glo1 has been reported in urine (see Table 3), it was not a highly ranked biomarker candidate in our study. An LC-MS/MS-based proteomic study identified four potential serum biomarkers in autistic children: apoB100, complement factor H related protein, complement C1q, and fibronectin 1 [59]. We did find one of these four possible biomarkers, fibronectin 1 isoform 1, in urine; although it was well identified, it did not reach our top 10 list of candidates. We did not find complement factor H related protein or complement C1q; however, we did find the related complement proteins complement component C1r, C3, C4A, C4B, C6 and complement factor H, some of them with high-confidence identification. Candidate non-proteomic urinary biomarkers for autism which have previously been reviewed do not include the high-ranking candidates that we discovered with the iTRAQ technique.
A potential limitation is the possibility of false positive results when evaluating the relative abundance of such a large number of proteins (231) in so few independent subjects (8 in each group). The primary objective of the iTRAQ experiment was to determine a ranking of the most promising potential protein biomarkers in a pilot study and not to declare precisely which proteins are and are not significantly different between autism and control subjects, which would require a much larger study. If one were to implement a correction for the multiple comparison issue, the Bonferroni correction is well known to be overly conservative.
A more powerful and valid approach, commonly used in genomic/proteomic (and other -omic) settings, is control of the False Discovery Rate (FDR) using the q-value [60], with liberal thresholds reflecting the preliminary nature of the study. Using this approach, the top ranked protein would be statistically significant at a q-value threshold of 0.3 and the 2nd ranked protein at a threshold of 0.5. However, several additional investigations (e.g. IPA studies) were also conducted to further support the findings of the iTRAQ evaluation; in this sense, the likelihood of false positives is reduced.
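To make the contrast between the two corrections concrete, the sketch below computes a Bonferroni per-test threshold and Benjamini-Hochberg adjusted p-values. Note that Benjamini-Hochberg is a simpler, related FDR procedure swapped in here for illustration; it is not the q-value method of [60] used in the study, and the p-values shown are hypothetical.

```python
def bonferroni_threshold(alpha, m):
    """Per-test significance threshold when m tests are performed."""
    return alpha / m


def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# With 231 proteins, Bonferroni at alpha = 0.05 requires p < 0.05 / 231.
threshold = bonferroni_threshold(0.05, 231)

# Hypothetical p-values, for illustration only.
adjusted = bh_adjust([0.01, 0.04, 0.03, 0.005])
```

The Bonferroni threshold shrinks in proportion to the number of proteins tested, whereas the FDR adjustment scales each p-value by its rank, which is why FDR control retains more power in a ranking-oriented pilot study.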
It is surprising that Renal and Urological Disease was also among the top related diseases and disorders. Although autism has been reported to coexist with renal abnormalities [61], we are not aware of any obvious mechanism related to renal pathology. The fact that only two molecules were involved in this pathway further questions the relationship of this finding to autism. IPA analysis showed that several molecules in this study were involved in signaling pathways related to the inflammatory response, such as chemokine CX3C ligand, trefoil factor 2, inducible T-cell costimulator ligand, kallikrein-related peptidase 3 and plasminogen. Pathway analysis also found that hepatocyte growth factor receptor and complement component 3 receptor are among the proteins involved in macropinocytosis signaling, and that integrin, alpha 9 is involved in the ERK5 signaling pathway. These inflammation pathways are consistent with the increased inflammation reported in autism. It has been reported that some signal transduction pathway changes are involved in the inflammation response in an autism model [62,63]. Our study and analysis suggest more details about this mechanism. IPA biomarker analysis also showed that 104 of the urine proteins in the identified protein list could be possible biomarkers for autism. In summary, iTRAQ urine analysis combined with IPA analysis could be an effective discovery approach to search for possible biomarkers of autism. This study suggests specific proteins of interest in ASD subjects compared to controls.