Autoantibody Profiling in Multiple Sclerosis Using Arrays of Human Protein Fragments

Profiling the autoantibody repertoire with large antigen collections is emerging as a powerful tool for the identification of biomarkers for autoimmune diseases. Here, a systematic and undirected approach was taken to screen for profiles of IgG in human plasma from 90 individuals with multiple sclerosis-related diagnoses. Reactivity patterns of 11,520 protein fragments (representing ∼38% of all human protein-encoding genes) were generated on planar protein microarrays built within the Human Protein Atlas. IgG reactivity was observed for more than 2,000 antigens, among which 64% were recognized only in single individuals. We used reactivity distributions among multiple sclerosis subgroups to select 384 antigens, which were then re-evaluated on planar microarrays, corroborated with suspension bead arrays in a larger cohort (n = 376) and confirmed for specificity in inhibition assays. Among the heterogeneous patterns within and across multiple sclerosis subtypes, differences in recognition frequencies were found for 51 antigens, which were enriched for proteins of transcriptional regulation. In conclusion, using protein fragments and complementary high-throughput protein array platforms facilitated an alternative route to discovery and verification of potentially disease-associated autoimmunity signatures, which are now proposed as additional antigens for large-scale validation studies across multiple sclerosis biobanks.

Molecular & Cellular Proteomics 12
Autoimmune diseases are commonly described by the breakdown of the immunological self-tolerance mechanisms (1). The onset of autoimmune diseases is believed to be induced by complex interactions of genetic alterations and environmental triggers. Recent genome-wide association studies have refined the genetic landscape across autoimmune diseases although only a limited clinical significance could be added from genetic associations (2).
As autoimmune diseases ultimately manifest themselves at the protein level, proteomic approaches have potential for investigating them (3,4). Even though it is still unclear whether autoantibodies contribute to pathogenesis or are merely epiphenomenal (2), their presence in the circulation is a known fundamental feature of autoimmune diseases and they are therefore regarded as appealing biomarker candidates. In addition, compared with many other serum and plasma proteins, immunoglobulins are generally abundant and stable molecules with a common scaffold, for which a wide range of detection reagents is available. These features enable an efficient analysis of autoimmunity signatures in plasma without extensive pre-analytical sample preparation (4,5).
There is growing evidence that multiple target antigens could be involved in the response in autoimmune diseases (6), which provides the rationale to collect reactivity patterns rather than single reactivity features. Accordingly, the use of antigen microarrays for a multiparallel determination of antibody reactivity toward hundreds or thousands of antigens represents an appealing, high-throughput concept (7-9), especially if arrays can be built without biased target selection so that novel autoantigen candidates can be proposed. Antigen microarrays, either in planar or bead-based format, have recently been shown to be useful for autoantibody profiling in a range of diseases including, but not limited to, autoimmune diseases (10-16). Regardless of whether the antigens are expressed and then immobilized, or directly expressed on-site (17,18), a resource of either protein antigens or cDNA clones is needed to build such arrays.
One such protein antigen resource is the Human Protein Atlas project, which aims at producing these antigens for the generation of antibodies toward the human proteome. Within the Human Protein Atlas, fragments from protein encoding genes are routinely selected based on regions of low similarity to other proteins, cloned, expressed, and purified (19,20). The protein fragments are eventually used for immunization and subsequently to affinity purify antibodies and to produce antigen microarrays, on which they serve to verify the specificity and selectivity of the generated antibodies (21). These arrays are built with 384 antigens each and because they are linked to the antibody production, their composition is not related to any criteria and therefore new antigen batches with new content are produced continuously.
For the presented study, we have extended the application range of these in-house produced antigen microarrays to the systematic profiling of the autoimmunity repertoire of plasma in the context of multiple sclerosis (MS). MS is the most common cause of nontraumatic neurological disability among young adults and it is characterized by chronic inflammation in the central nervous system (CNS) causing axonal damage, demyelination, and neurologic disability (22). MS remains under the umbrella of autoimmune disorders (23) because of several arguments supporting that it is immune-mediated, most likely by autoimmune mechanisms: 1) the organ-specific immune attack, 2) mimicry of MS by immunization of rodents with myelin antigens, 3) HLA and non-HLA gene association to immune genes and 4) the therapeutic response to immunomodulatory treatments directed at various immune functions or cells (24,25).
Autoantibodies against myelin antigens, such as myelin oligodendrocyte glycoprotein (MOG), myelin basic protein (MBP), and myelin-associated glycoprotein (MAG) have been investigated as autoimmunity targets in animal models of MS with positive results (26,27) but similar studies have resulted in conflicting data (28,29). The target self-antigens in human MS and its different subtypes are still conjectural (30,31) despite various phage display library screening and mass spectrometry-based proteomics studies, as reviewed elsewhere (32). Conversely, antigen microarrays have been used so far only to analyze antibody reactivity toward a preselected collection of antigens in the form of dedicated lipid microarrays (33) or myelin microarrays (34,35).
Herein we describe a three-stage strategy for undirected proteomic profiling of the autoimmunity repertoire within MS using antigen microarrays built on protein fragments. The discovery stage constitutes the systematic analysis of MS plasma to collect autoantibody reactivity profiles on more than 11,000 protein fragments representing over 7,500 unique proteins. This was followed by the within- and across-platform verification of the selected antibody reactivity profiles and the extended analysis of the plasma sample cohort using a suspension bead array platform.

EXPERIMENTAL PROCEDURES
Samples and Sample Preparation-EDTA plasma samples were obtained from an in-house biobank containing samples collected during routine neurological diagnostic work-up at the neurology clinic of Karolinska University Hospital Stockholm, Sweden. The patients with MS were classified as primary progressive (PPMS), secondary progressive (SPMS), and relapsing remitting (RRMS) MS, in which the latter subtype was subdivided further into patients during relapse (RRrel) or remission (RRrem). Additionally, samples from patients with a single demyelinating event, referred to as clinically isolated syndrome (CIS), were included in the study. The control group consisted of individuals with other neurological diseases (OND) and ONDs with signs of inflammation (ONDinf). The individuals with OND had a variety of other neurological signs and symptoms such as sensory symptoms, visual disturbance, headache, etc., whereas ONDinf consisted of individuals with other autoimmunity-driven diseases including rheumatoid arthritis (RA), systemic lupus erythematosus (SLE), neuropathy, or with viral/bacterial infections, for example herpes encephalitis. Sample donor information from both discovery and verification studies is summarized in Table I. All study enrolment followed the recommendations of the Declaration of Helsinki and the study was approved by the Ethics Committee of the Karolinska Institute. Oral and written information was given to the patients and confirmed consent in writing was received before inclusion.
For discovery, neat plasma samples were aliquoted into plates with a liquid handling system (EVO150, TECAN). Here, in total, 8.4 µl of each sample was diluted 1:250 in assay buffer. The content of the deep-well plate was shared between 34 replicate plates. All plates were frozen and stored at −80°C until usage. The same strategy was applied to aliquot and dilute samples for the extended verification cohort.
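The volumes stated above can be checked with a short calculation. The sketch below assumes that the volumes are in microliters and that 1:250 means one part plasma in 250 parts final volume; the variable names are illustrative only.

```python
# Worked arithmetic for the sample dilution and plate replication.
# Assumption: volumes are in microliters and "1:250" means a 250-fold
# final dilution (1 part plasma in 250 parts total).
sample_volume_ul = 8.4
dilution_factor = 250
replicate_plates = 34

# Total diluted volume available per sample in the deep-well plate.
total_volume_ul = sample_volume_ul * dilution_factor

# Volume available per sample if shared evenly across replicate plates.
volume_per_plate_ul = total_volume_ul / replicate_plates

print(f"total diluted volume per sample: {total_volume_ul:.0f} ul")
print(f"volume per replicate plate:      {volume_per_plate_ul:.1f} ul")
```

Under these assumptions each replicate plate receives slightly more than 60 µl per sample, which is in line with the 60 µl/well applied to the planar arrays.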
Antigens-A total of 11,520 antigens, also denoted as protein epitope signature tags (PrESTs), were used in this study comprising 11,384 unique antigens, representing 7,644 unique Ensembl Gene IDs (ENSGs). The strategies and protocols for the design (36,37), cloning and recombinant expression of the antigens and their verification with mass spectrometry within the Human Protein Atlas routine workflow were applied as previously described (38-41). In brief, using a whole-genome bioinformatics approach, antigens of 80-100 amino acid residues were designed in silico based on the principle of lowest sequence similarity to other human proteins, avoiding transmembrane, signal peptide and respective restriction site regions. The antigens were then produced in E. coli Rosetta DE3 strain as fusion protein fragments with an N-terminal dual affinity tag (His6-ABP) consisting of a hexahistidyl (His6) tag, which allows a one-step purification on nickel columns, and an albumin binding protein (ABP).
Planar Antigen Arrays-For the discovery phase of the study, antigen microarrays from 30 production batches were used, each consisting of 384 different antigens selected based on antibody production criteria. These arrays were generated as follows: Antigens were diluted to 40 µg/ml in 0.1 M urea in 1× PBS, pH 7.4 and 40 µl of each antigen was transferred to a 384-well printing plate. The antigens were immobilized by depositing approx. 100 pl onto epoxy slides as solid support (CapitalBio) using a noncontact inkjet arrayer (GeSIM Nanoplotter 2.0E), resulting in slides with 14 identical subarrays, each containing 384 different antigens. The printed slides were allowed to dry overnight at 37°C in a heat chamber (Amersham Biosciences Hybridization Oven), followed by a wash in 1× PBS and blocking of the surface for 1 h with PBS-T (PBS, 0.1% (v/v) Tween 20, pH 7.4) supplemented with 3% (w/v) BSA (Fraction V, Saveen Werner). Slides were then washed 2× in PBS-T and 1× in PBS for 15 min each prior to storage at 4°C.
For the experimental verification phase, 384 candidate antigens selected according to the applied criteria were reprinted in a single batch utilizing another arrayer (Marathon, Arrayjet) enabling a higher density and 21 identical subarrays per slide were printed. Here, the printing buffer was changed to 50 mM sodium carbonate-bicarbonate buffer supplemented with 50% (v/v) glycerol.
Assays on Planar Antigen Arrays-An assay protocol was adapted to analyze plasma from a previously developed procedure for antibody validation (21). Here, assay conditions were optimized in terms of plasma sample dilution rate, assay buffer and incubation time. A subset of seven plasma samples was diluted 1:100, 1:250, 1:500, and 1:1,000 in numerous buffers, including 1× PBS, PBS-T, PBS-T supplemented with 10% (w/v) BSA, PBS-T supplemented with 10% (w/v) BSA and 5% (w/v) milk powder, and PBS-T supplemented with 3% (w/v) BSA and 5% (w/v) nonfat milk powder, and incubated on the antigen arrays for 1 h or overnight. A sample incubation time of 1 h and a plasma dilution rate of 1:250 were selected as optimal conditions in terms of signal-to-noise ratios. For the assays, a buffer containing PBS-T with 3% (w/v) BSA and 5% (w/v) nonfat milk powder supplemented with 0.4 µg/ml of chicken-generated His6-ABP tag-specific IgY antibodies (Agrisera) was selected and used during the discovery and experimental verification stages of the study.
The abbreviations used are: ABP, albumin binding protein; AU, arbitrary units; EBNA, Epstein-Barr virus nuclear antigen; CIS, clinically isolated syndrome; CSF, cerebrospinal fluid; IgG, immunoglobulin G; MFI, median fluorescence intensity; MOG, myelin oligodendrocyte glycoprotein; MS, multiple sclerosis; OND, other neurological diseases; PPMS, primary progressive multiple sclerosis; PrEST, protein epitope signature tag; RRMS, relapsing remitting multiple sclerosis; RRrel, RRMS with relapse; RRrem, RRMS with remission; SPMS, secondary progressive multiple sclerosis.
For each batch, the content of the sample assay plate (60 µl/well) was applied onto the sub-arrays on slides by using an adhesive, 16-well silicone mask (Schleicher & Schuell, Keene, NH) on each slide and the samples were incubated for 1 h at RT. The slides were washed 2× in PBS-T and 1× in PBS for 5 min each, followed by a 1 h incubation with the secondary antibody mixture prepared in the assay buffer: To detect human IgG bound to the arrayed antigens, Alexa Fluor 647 conjugated goat anti-human IgG (H+L) (Invitrogen) was used at 30 ng/ml. For grid alignment purposes, Alexa Fluor 555 conjugated goat anti-hen IgG (Molecular Probes, Eugene, OR) at 30 ng/ml was co-incubated for the detection of bound His6-ABP tag-specific chicken antibodies. After washing, the slides were spun dry and scanned at 10 µm resolution using a microarray scanner (Agilent G2565BA array scanner), followed by image analysis using GenePix 5.1 (Molecular Devices).
In terms of reagents and conditions, the assay procedure for the re-analysis of selected and reprinted antigens in the experimental verification stage was carried out as described above. Changes included handling of slides with a 96-well microarray hardware (Arrayit Corporation) attached to an adhesive silicone mask for generating 4 × 24 chambers.
Antigen Suspension Bead Arrays-Antigens selected for verification were coupled to carboxylated magnetic beads (MagPlex-C, Luminex Corp.) as per previously developed antigen- and antibody-coupling protocols (42,43) with minor changes. In brief, 5 × 10^5 beads per bead identity were distributed across 96-well plates (Greiner BioOne, Longwood, FL), washed and re-suspended in phosphate buffer (0.1 M NaH2PO4, pH 6.2) using a plate magnet (Dexter) and a plate washer (EL406, Biotek, Winooski, VT). Beads were activated by 0.5 mg 1-ethyl-3-(3-dimethylaminopropyl)carbodiimide (Pierce, Waltham, MA) and 0.5 mg N-hydroxysuccinimide (Pierce) in 100 µl phosphate buffer. After 20 min incubation on a shaker (Grant Bio), beads were washed and re-suspended in activation buffer (0.05 M MES, pH 5.0). Antigens were diluted to 40 µg/ml in activation buffer. Besides these antigens, four internal controls were employed for coupling: 1.6 µg of rabbit anti-human IgG antibody (Jackson ImmunoResearch, West Grove, PA), 4 µg of recombinant EBNA-1 protein (Tebu-Bio), 1.6 µg of His6-ABP (fusion tag present in all antigens), and a protein-free activation buffer solution. All solutions were transferred to separate bead identities using a liquid handler (SELMA, Cybio). The coupling reaction was allowed to take place for 2 h at RT, after which the beads were washed 3× in PBS-T, resuspended in 50 µl PBS-T and stored in plates at 4°C overnight. A 384-plex antigen suspension bead array was then prepared by combining equal volumes of each bead identity and re-suspending them in storage buffer (Blocking reagent for ELISA, Roche) supplemented with 0.5% NaN3. The final volume was adjusted to enable the transfer of 5 µl of bead solution per well, and the bead array was stored at 4°C until further use. Immobilization of the protein fragments was confirmed by the use of antigen-specific antibodies generated within the Human Protein Atlas (data not shown).
These rabbit antibodies were diluted 1:1,000 in PBS-T and 45 µl of each of the different antibodies, with a final concentration of ~100 ng/ml, were transferred into wells of a flat-bottomed 96-well plate (Greiner BioOne), mixed with 5 µl of the bead array and incubated for 1 h on a shaker (Grant Bio). The beads were washed using a magnet and a vacuum device (Gilson, Villier Le Bel, France) and resuspended in 50 µl of R-phycoerythrin (R-PE) conjugated anti-rabbit IgG antibody (0.5 µg/ml, Jackson ImmunoResearch). After an incubation of 20 min and a final washing step, the beads were resuspended in 100 µl of PBS-T for read-out by a FlexMap3D instrument (Luminex Corp.). Also, a monoclonal mouse antibody specific for the His6 tag (R&D Systems, Abingdon, Oxfordshire, UK), detected with R-PE conjugated goat anti-mouse IgG (MOSS), was used to validate successful immobilization of antigens on the beads, resulting in signal intensities varying between 1,000 and 10,000 AU for all bead IDs with an antigen, as shown before (44).
Based on the optimized conditions, the assay protocol can be summarized as follows: Before analysis, the plasma samples were pre-incubated with His6-ABP: Using a liquid handler, 45 µl of plasma samples diluted 1:250 in assay buffer were added to 5 µl of His6-ABP (1.6 mg/ml in PBS) distributed across 96-well plates and incubated for 1 h on a shaker (Grant Bio) at RT. Then, 45 µl were added to 5 µl of the 384-plex antigen suspension bead array using a liquid handler (SELMA, Cybio) and incubated for 1 h on a shaker (Grant Bio) at RT. The beads were washed with 3 × 100 µl PBS-T on a plate washer (EL406, Biotek) and resuspended in 50 µl of R-PE conjugated goat anti-human IgG (H+L, MOSS) at 1 µg/ml. After incubation with the secondary antibody for 45 min, the beads were washed with 3 × 100 µl PBS-T and resuspended in 100 µl PBS-T for measurement by a FlexMap3D instrument (Luminex Corp.). At least 50 events per bead identity were counted and binding events were displayed as median fluorescence intensity (MFI) values.
For inhibition experiments, plasma samples were incubated in the presence of 25 µg/ml antigens for 1 h before being added to the bead array assays, as described above.
Data Analysis-Data analysis and visualizations were performed using R (45) and various R packages, unless otherwise indicated. The analysis of the discovery stage data on planar arrays consisted of two parts. First, the degree of heterogeneity of the plasma autoantibody profiles was investigated. To this aim, arbitrary sample-specific intensity thresholds were applied to the data of each antigen batch. IgG reactivity in a sample was dichotomized by transforming it to a binary variable that is set to 1 if the signal exceeds the median signal of that specific sample over the 384 antigens in a batch plus 5× the standard deviation, and to 0 otherwise. Second, the antigen profiles across various sample groups were compared via different statistical approaches to identify antigens with group separation power. The Wilcoxon rank-sum test was applied for a comparison between ONDs and the entire MS group and the Kruskal-Wallis test was applied for a multigroup comparison between MS subtype groups. Similarly, ANOVA was applied and carried out in Qlucore Omics Explorer software (Qlucore AB, Lund, Sweden). As multivariate methods, Between-Group Analysis (BGA, (46)) by applying the "MADE4" package (47) and Partial-Least Squares-Linear Discriminant Analysis (PLS-DA, (48)) by applying the "caret" package (49) were used. Antigens fulfilling the criteria set by the sample-specific intensity threshold and group discrimination were selected for verification on the suspension bead array platform.
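The sample-specific intensity threshold described above can be illustrated with a minimal sketch. The original analysis was carried out in R; the Python version below is an illustration only, and it assumes that the sample standard deviation is computed over all 384 antigens of a batch for one sample. The function name and the simulated signals are hypothetical.

```python
from statistics import median, stdev

def dichotomize(signals, n_sd=5.0):
    """Binary reactivity calls for one sample across one antigen batch.

    A call is 1 when the signal exceeds median + n_sd * SD of that
    sample's signals in the batch, and 0 otherwise.
    Assumption: SD is the sample standard deviation over all antigens.
    """
    cutoff = median(signals) + n_sd * stdev(signals)
    return [1 if s > cutoff else 0 for s in signals]

# Simulated batch: 383 antigens at background level and one strong signal.
signals = [100.0 + (i % 7) for i in range(383)] + [5000.0]
calls = dichotomize(signals)
print("antigens called reactive:", sum(calls))
```

Because the standard deviation is computed over all antigens of the batch, including the reactive ones, several strong reactivities in one sample raise that sample's cutoff.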
The data from the antigen suspension bead array were normalized to the signal intensity of the control analyte, anti-human IgG, as follows. The median of the signals for anti-human IgG across all samples was determined and a normalization factor was calculated for each sample by dividing its signal for anti-human IgG by the median across all samples. Signal intensities for all antigens within each sample were then divided by the corresponding normalization factor for that sample. The intensity threshold for an antigen was set to the 90% quantile of the data for each sample. It was also checked that this value was 50% greater than the signal from the fusion tag His6-ABP. Based on this, the data were dichotomized for each sample by transforming them to a binary variable. Fisher's exact test was performed for the statistical evaluation of differences in the proportion of antibody-positive subjects between sample groups.
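The normalization and thresholding steps for the bead array data can be sketched as follows. This is an illustration under stated assumptions, not the original R code: the dictionary keys, the function names and the toy values are hypothetical, the quantile uses linear interpolation (the "inclusive" method of Python's statistics module), and the tag check is read as requiring the threshold to be at least 1.5 times the His6-ABP signal.

```python
from statistics import median, quantiles

CONTROL = "anti_human_IgG"  # hypothetical key for the control bead
TAG = "His6_ABP"            # hypothetical key for the tag-only bead

def normalize(samples):
    """Divide every signal of a sample by its anti-human IgG factor."""
    ctrl_median = median(sig[CONTROL] for sig in samples.values())
    normalized = {}
    for sample_id, sig in samples.items():
        factor = sig[CONTROL] / ctrl_median
        normalized[sample_id] = {k: v / factor for k, v in sig.items()}
    return normalized

def call_reactivity(sig, antigens):
    """Dichotomize one sample at its 90% quantile across the antigens."""
    values = [sig[a] for a in antigens]
    cutoff = quantiles(values, n=10, method="inclusive")[-1]
    tag_ok = cutoff >= 1.5 * sig[TAG]  # assumed reading of the tag check
    calls = {a: int(sig[a] > cutoff) for a in antigens}
    return calls, cutoff, tag_ok

# Toy data: two samples, twenty antigens, one strong reactivity in s1.
antigens = [f"ag{i}" for i in range(20)]
raw = {
    "s1": {CONTROL: 20000.0, TAG: 80.0,
           **{a: 100.0 + i for i, a in enumerate(antigens)}},
    "s2": {CONTROL: 10000.0, TAG: 60.0,
           **{a: 90.0 + i for i, a in enumerate(antigens)}},
}
raw["s1"]["ag19"] = 6000.0
norm = normalize(raw)
calls, cutoff, tag_ok = call_reactivity(norm["s1"], antigens)
print(sum(calls.values()), "antigens above the 90% quantile in s1")
```

Note that a quantile-based cutoff marks roughly the top 10% of antigens in every sample by construction, so the comparison with the tag signal is what separates reactivity from background in samples with few or weak signals.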

RESULTS
Overview of the Study Structure-The goal of this study was to discover and verify autoantibody reactivities potentially associated with multiple sclerosis by screening a large panel of human antigens representing more than one third of all human proteins. To this aim, two types of in-house generated antigen microarrays were used, as illustrated in Fig. 1, and the study was organized into three parts: a discovery phase in which 90 samples were profiled using more than 11,000 antigens on planar microarrays, followed by two verification phases. In the first, 384 selected antigens were reprinted on microarrays and re-analyzed with the same 90 samples. In the second, a suspension bead array format was used to enable a cross-platform validation and an analysis of an increased sample size of 376 samples, as described in Fig. 2.
Discovery of Autoantibody Profiles on Planar Antigen Arrays-The discovery stage of the study focused on the systematic analysis of 30 different antigen microarray batches, each consisting of 384 different antigens. For each of these batches, seven slides with 14 identical sub-arrays were used to analyze 90 samples of the initial MS sample cohort, resulting in a total of 210 analyzed slides. A dual-color setup was chosen and co-incubation of an antibody for the detection of immobilized antigens via their tag was used for spot localization and assessment of spotting.
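The layout figures quoted above are mutually consistent, as the short calculation below shows. All input numbers are taken from the text; the variable names are illustrative.

```python
# Consistency check of the discovery-stage array layout.
batches = 30
antigens_per_batch = 384
slides_per_batch = 7
subarrays_per_slide = 14
samples = 90

total_antigens = batches * antigens_per_batch                 # antigens screened
positions_per_batch = slides_per_batch * subarrays_per_slide  # sub-arrays per batch
total_slides = batches * slides_per_batch                     # slides analyzed
spare_positions = positions_per_batch - samples               # sub-arrays beyond the 90 samples

print(total_antigens, positions_per_batch, total_slides, spare_positions)
```

Seven slides per batch provide more sub-arrays than the 90 samples require, which leaves a few positions per batch beyond the cohort samples.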
The background reactivity created by the plasma samples on the arrays was consistently low, with the median for the background signal ranging between 50 and 150 AU across the 30 different batches of arrays. Similarly, the median of signals for detection of the tag, which is present in all antigens, was in the range of 1,000-2,000 AU across the batches. At the same time, the maximum signal for the detection of bound human IgG ranged between 9,000 and 65,000 AU across the batches. Examples of typical sample profiles are shown in Fig. 3, in which autoantibody reactivity was clearly distinguishable with high signals over the sample-specific intensity threshold.
Signal intensity from triplicated, sample-free incubations varied with an average intra-assay CV of 3% and from three different, randomly selected, triplicated plasma samples it varied by 17%, 24%, and 25% respectively.
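For reference, an average intra-assay coefficient of variation (CV) of the kind reported above can be computed as sketched below. How the CVs were aggregated in the original analysis is not stated here, so the sketch assumes a per-antigen CV from the replicate signals (sample standard deviation divided by the mean) averaged over antigens. The function names and example values are hypothetical.

```python
from statistics import mean, stdev

def cv_percent(replicates):
    """Coefficient of variation of replicate signals, in percent."""
    return 100.0 * stdev(replicates) / mean(replicates)

def average_cv_percent(replicate_sets):
    """Mean of per-antigen CVs; each item holds one antigen's replicates."""
    return mean(cv_percent(r) for r in replicate_sets)

# Hypothetical triplicate signals (AU) for three antigens.
triplicates = [
    [90.0, 100.0, 110.0],
    [480.0, 500.0, 520.0],
    [1000.0, 1000.0, 1000.0],
]
print(f"average intra-assay CV: {average_cv_percent(triplicates):.1f}%")
```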
Global Analysis of Reactivity Profiles-A very large set of autoantigens was recognized during the discovery stage of the study, in which 90 plasma samples were profiled on a set of 11,520 antigens. When applying the sample-specific intensity threshold, 2,397 antigens (21% of the antigen discovery set) were recognized by at least one sample. As illustrated by Fig. 4, 1,539 of these 2,397 antigens (64%, and 13% of the entire antigen discovery set) were recognized in no more than one individual sample. On the other hand, 19 antigens (supplemental Table S2) were identified that were recognized by autoantibodies in at least 20 individuals. For instance, the antigen representing P4HA2 (prolyl 4-hydroxylase subunit alpha-2) was recognized in 71% of the cohort (64/90 individuals). Interestingly, prolyl 4-hydroxylase is an already identified target antigen of anti-endothelial cell antibodies, which are detected not only in autoimmune and/or inflammatory conditions but also in healthy individuals and are therefore presumed to be "natural auto-antibodies" (54). A STRING and FunCoup analysis revealed no known or predicted interactions between these 19 antigens, which were recognized by autoantibodies in at least 20 individuals. Yet, considering GO terms, four out of these 19 antigens were associated with regulation of transcription and two of these four antigens contained the protein domains ZINC_FINGER_C2H2_2 (Prosite Entry# PS50157) and SAND (Prosite Entry# PS50864), which are nucleic acid binding protein structures.
FIG. 2. Schematic summary of the multistage strategy by using two complementary antigen array platforms for antibody response profiling. Initial analysis of a pilot cohort consisting of 90 plasma samples was performed on the planar array platform, during which 30 batches of antigens, each consisting of 384 different antigens, were spotted onto glass slides. This stage resulted in IgG reactivity profiles against a total of 11,520 antigens (Stage I). Combinations of statistical methods were applied to select candidate antigens, of which 384 antigens were subsequently challenged with technical verification on planar arrays by reprinting these 384 antigens and repeating the screening of the pilot cohort (Stage II). This was followed by coupling these antigens on magnetic beads to perform a biological verification in a larger cohort of 376 plasma samples using the suspension bead array platform (Stage III).

Considering the overall antigen recognition frequencies, all samples within the cohort contained antibodies against at least 16 out of the 11,520 antigens (0.1%). Interestingly, the highest total number of different antigens recognized per sample was 88 for an OND plasma sample, whereas the lowest total number of different antigens recognized per sample, 16, was also observed for an OND sample. The median number across the cohort was 55 antigens recognized per sample. The median numbers of recognized antigens within the MS subtypes and controls with ONDs were not significantly different from each other (Kruskal-Wallis test p value = 0.45), varying between a median of 52 antigens for OND, 56 for remitting RRMS, 58 for relapsing RRMS and 60 for SPMS samples (Fig. 5A).
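The per-antigen and per-sample counts reported in this section can be derived from the dichotomized data with a simple tally. The sketch below is illustrative only: it assumes the binary calls are available as one set of reactive antigen identifiers per sample, and the function name, identifiers and toy data are hypothetical.

```python
from collections import Counter
from statistics import median

def summarize(calls, shared_min=20):
    """Summarize recognition frequencies.

    calls: dict mapping sample id -> set of antigen ids called reactive.
    """
    per_antigen = Counter(a for ags in calls.values() for a in ags)
    return {
        # antigens recognized by at least one sample
        "recognized": len(per_antigen),
        # antigens recognized in exactly one individual
        "single_individual": sum(1 for n in per_antigen.values() if n == 1),
        # antigens recognized in at least `shared_min` individuals
        "shared": sum(1 for n in per_antigen.values() if n >= shared_min),
        # median number of antigens recognized per sample
        "median_per_sample": median(len(ags) for ags in calls.values()),
    }

# Toy example with three samples and four antigens.
toy_calls = {"p1": {"A", "B"}, "p2": {"A"}, "p3": {"A", "C", "D"}}
summary = summarize(toy_calls, shared_min=3)
print(summary)
```

Applied to the discovery data with the default cutoff of 20 individuals, the same tally would yield the kind of figures reported above (antigens recognized at all, in single individuals only, and by at least 20 individuals).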
In total there were 82 unique antigens recognized by more than 10% of the entire cohort. When investigating the different sub-groups using this criterion of shared reactivity, 102 were recognized by ONDs and an even greater number by MS samples: There were 182 antigens recognized by the SPMS group, 161 and 149 by relapsing RRMS and remitting RRMS groups, respectively. Similar to this, the proportion of samples recognizing more than 55 antigens, which is the median across the entire cohort, was 42% for OND group, 69% for the SPMS group, 56 and 53% for the remitting RRMS and relapsing RRMS groups. These trends imply that an increase in both the inter-individual heterogeneity as well as diversity of the autoimmune profiles could potentially be related to the progression of the disease (Fig. 5B).
When investigating age in relation to the number of recognized antigens, no linear correlation could be observed (Pearson's r = 0.02) (supplemental Fig. S2A). Similarly, there was no significant difference between males and females considering the median number of antigens per sample (Wilcoxon rank-sum test p value = 0.14) (supplemental Fig. S2B).
Filtering for Putative Candidate Antigens-Applying the sample-specific intensity threshold revealed a total of 2,397 antigens being recognized in one or more samples. Further analysis was then carried out to examine the number of common antigens recognized across the entire cohort or across the individual sample groups, in order to reduce the number of putative antigens by eliminating relatively less informative ones (e.g. those with no difference across sample groups). Therefore, combinations of statistical methods were applied to filter for targets with potential group-discriminating power. This included both uni- and multivariate analysis as well as dual and multigroup comparisons. By combining different statistical tests for indications of significance, 803 antigens were identified. After filtering out those that did not pass the sample-specific intensity threshold (487 antigens passed the intensity threshold), 384 were selected based on the number of individuals showing the respective antigen profiles; thus, antigens identified only in single individuals were not chosen (Fig. 7).
Development of Antigen Suspension Bead Arrays-Antigens were coupled to beads to create antigen suspension bead arrays after different buffers and supplement combinations were tested to define antigen profiles with low background binding, replicate consistency and a broad dynamic range. In general, signal intensities across all antigens in quadruplicates of chicken serum, serum-free control, and two different randomly selected plasma samples varied with an average intra-assay CV of 5%, 4%, and 6-10%, respectively. Similarly, one antigen was coupled on three different bead identities and the signal intensities across all samples for this triplicate antigen varied with an average intra-assay CV of 8%. For the beads carrying only the His6-ABP fusion tag, signal intensities ranged between 50 and 150 AU, whereas the bead not subjected to protein coupling revealed an MFI of 40-100 AU. The signal intensity from beads with anti-human IgG was 23,000-25,000 AU, serving as a positive control. Furthermore, the immunogenic EBNA-1 antigen (55) was included as a control antigen in the bead array. All individuals except two belonging to the OND group showed reactivity toward EBNA-1 (MFI 800-27,000 AU) and the relation of reactivity toward EBNA-1 in different sample groups and with age and gender is shown in supplemental Fig. S3.
Sample groups were also investigated in terms of the number of antigens recognized by more than 10% of the group, percentage of antigens recognized by more than one individual and percentage of samples within the group recognizing more than 55 antigens, which is the median number of antigens recognized per sample across the entire cohort (B).
Verification of Reactivity Profiles-To assess the reproducibility of the identified antibody reactivities and to increase the stringency for the verification phase, the selected 384 antigens were re-printed on planar microarrays using a new arraying device and were also used in the development of a suspension bead array assay. Based on this cross-platform comparison strategy, consistency in reactivity was assessed in 90 samples in parallel.
At first, the similarity of individual samples was summarized by performing unsupervised hierarchical clustering with data from planar and suspension bead arrays (supplemental Fig. S4), which revealed that 80% of the individuals (72/90 individuals) clustered in pairs irrespective of the microarray platform. This indicated that there were only minor platform-driven effects and that the same samples analyzed on both array platforms showed good congruency.
FIG. 6. Representative reactivity profiles across experiments and array platforms. IgG reactivity profiles in terms of sample-specific relative signal intensity are shown against the verification set of 384 antigens (on x-axes) in three different plasma samples A-C. The first two profiles for each sample were obtained on planar array platform during the discovery and experimental verification stages (Stage I and II) and the last profile was obtained on the suspension bead array platform (Stage III). Concordance of reactivity could in general be observed on both array platforms and at least two stages of the study. Yet, detection of reactivity against certain antigens was platform-specific and could not be confirmed at multiple stages.
Next, antigen reactivity profiles were investigated to monitor the concordance between the array platforms and assays. The representative profiles in Fig. 6 show that reactivity profiles were either confirmed in all three assays, or they were platform- or assay-specific. This finding highlighted the importance of both experimental and technical data replication. Here, the concordance in reactivity profiles generated on different array platforms was summarized by listing the intersection of the top 10 antigens recognized in each of the 90 samples and merging these lists across all the samples. Based on this analysis, 56% of the antigens (214 of 384) could be verified on the planar microarray platform by re-printing them as an experimental verification step and 53% of the antigens (204 of 384) could be verified on the suspension bead array platform as a technical verification step. With regard to all antigens used, 28% of the antigens (107 of 384) were common between the two verification stages according to these stringent criteria (Fig. 7).
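The top-10 intersection summary described above can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names, the dictionary layout (sample, then antigen, then signal) and the toy values are assumptions, and the example uses a top-2 cutoff only to keep the data small.

```python
from typing import Dict, Set

Profile = Dict[str, float]  # antigen name -> signal intensity (hypothetical layout)


def top_n(profile: Profile, n: int = 10) -> Set[str]:
    """Return the n antigens with the highest signal in one sample."""
    ranked = sorted(profile, key=lambda antigen: profile[antigen], reverse=True)
    return set(ranked[:n])


def verified_antigens(assay_a: Dict[str, Profile],
                      assay_b: Dict[str, Profile],
                      n: int = 10) -> Set[str]:
    """Intersect per-sample top-n lists from two assays, then merge across samples."""
    merged: Set[str] = set()
    for sample in assay_a.keys() & assay_b.keys():
        merged |= top_n(assay_a[sample], n) & top_n(assay_b[sample], n)
    return merged


# Toy data, invented for illustration only.
planar = {
    "S1": {"A": 900.0, "B": 850.0, "C": 100.0, "D": 50.0},
    "S2": {"A": 60.0, "B": 70.0, "C": 800.0, "D": 40.0},
}
beads = {
    "S1": {"A": 5000.0, "B": 200.0, "C": 300.0, "D": 100.0},
    "S2": {"A": 150.0, "B": 120.0, "C": 7000.0, "D": 6000.0},
}

common = verified_antigens(planar, beads, n=2)
print(sorted(common))
```

In this toy case an antigen counts as verified as soon as it appears in the top-n list of both assays for at least one sample, which is one plausible reading of "merging these lists across all the samples".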
Extended Sample Analysis and Identification of 51 Targets-The selected set of 384 antigens was used on suspension bead arrays to analyze an extended cohort of 376 individuals. The reactivity profiles from the confirmed 107 antigens were further investigated by comparing the recognition frequencies for these antigens in different MS subtypes and controls with ONDs using Fisher's exact test. Here, 51 out of 107 targets (48%) revealed statistically significant differences in recognition frequencies between groups, as summarized in Table II and detailed in supplemental Table S3.
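To make the group comparison concrete, the following sketch computes a two-sided Fisher's exact test p value for one antigen from a 2x2 table of reactive and non-reactive samples in two groups. It is an illustration under stated assumptions: the function name and the counts are invented, and the two-sided convention used here (summing all tables no more probable than the observed one) is a common choice that the text does not confirm as the one applied in the study.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test p value for the 2x2 table [[a, b], [c, d]].

    Rows are two sample groups; columns are (reactive, non-reactive) counts.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x reactive samples in group 1.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one;
    # the small tolerance guards against floating-point ties.
    return sum(p for p in (prob(x) for x in range(lowest, highest + 1))
               if p <= observed * (1 + 1e-9))


# Invented counts: 8 of 10 samples reactive in group 1, 1 of 10 in group 2.
p_value = fisher_exact_two_sided(8, 2, 1, 9)
print(f"p = {p_value:.4f}")
```

For these invented counts the p value is about 0.0055, so the antigen would pass a 0.05 cutoff. Because the test works on counts of reactive samples rather than on mean signal, it can flag an antigen recognized strongly by only a few individuals in one group.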
In Fig. 8A the recognition frequencies of the 51 antigens across sample groups are shown. Here, unsupervised hierarchical clustering of recognition frequencies highlights the presence of five main antigen clusters. The first and the third clusters comprise antigens with relatively high or low recognition frequencies in the PPMS group, respectively. Both the second and fourth clusters comprise antigens with differential recognition frequencies for a range of group comparisons such as SPMS versus CIS or RRrem versus OND. The recognition frequencies are on average higher for cluster 4 antigens than for cluster 2 antigens. The fifth, small cluster comprises antigens that are widely recognized across all sample groups. Recognition frequencies in different sample groups are shown in Fig. 8B for five representative antigens, each one selected from a different antigen cluster.
Reactivity profiles toward these 51 antigens were also studied using paired cerebrospinal fluid (CSF) samples from the discovery set (n = 90) on the re-printed (Stage II) planar microarrays. As summarized in supplementary information, profile concordance was identified for 27% of these 51 antigens, indicating that reactivity toward certain targets, such as ANO2 (anoctamin 2), can be found both in plasma and CSF.
As a further step toward understanding the 51 targets and their relations to each other, STRING, FunCoup and Gene Ontology (GO) analyses were used. The STRING protein-protein interaction analysis, based strictly on evidence from experimental repositories, revealed no prominently known interaction partners among the targets (supplemental Fig. S5A), whereas including computational prediction methods by FunCoup revealed 11 proteins with potential functional relation to each other above a confidence cutoff of 0.9 (supplemental Fig. S5B). The "lowest" GO term in the GO hierarchy for the "biological process" category was extracted for each of the 51 antigens (supplemental Table S3). Interestingly, eight out of 51 targets were associated with regulation of transcription and two of these eight transcription regulation factors were zinc finger proteins (ZNF70 and ZNF480).
Analysis of the discovery sample cohort revealed a total of 2,397 antigens, which were recognized in one or more samples based on the sample-specific intensity threshold. At the same time, applying four different statistical methods to the entire antigen set revealed different lists with different numbers of antigens having a group-separating power, either between ONDs and the entire MS group or between the different MS subtype groups. There were in total 803 antigens indicated by more than one out of the four methods. Out of these 803 antigens, 487 were among 2,397 antigens recognized by more than one sample. These 487 antigens were furthermore ranked based on the number of samples recognizing them and a final list of 384 antigens was selected as the verification set. Fifty-six percent of these antigens could be verified on either of the planar or bead array platforms and 107 antigens, corresponding to around 28% of the verification set, could be verified on both array platforms. For 51 of these 107 antigens there were statistically significant differences in their recognition frequencies across different sample groups.

Specificity Assessment Via Inhibition Assays-To further verify that the signals derive from the anticipated antigen-specific autoantibody interactions, an inhibition study was performed for five of the 51 selected antigens (RNF126, ZNF480, ZNF70, ANO2, and PGAM5) using the antigen suspension bead array. For each of these five representative antigens, three different individuals, each demonstrating a prominent reactivity toward the selected antigen, were selected. The selected samples were pre-incubated for 60 min with the corresponding antigens, and subsequent analysis with the 384-plex suspension bead array revealed antigen-specific signal inhibition.
On average, the specific inhibition reduced the signal intensities to 11% of the original signal; the strongest reduction (relative signal intensity of <1%) was observed for PGAM5 and the mildest (relative signal intensity of 36%) for ZNF70 (Fig. 8C). Furthermore, signals for antigens other than the inhibiting antigen remained unaffected for each of the five representative antigens, as shown in supplemental Fig. S6. In all, these results indicated that the antibody reactivity to the antigens was specific and therefore allowed multiplexed monitoring of IgG autoantibody responses for individual antigens in plasma using the suspension bead array platform.

DISCUSSION

We herein describe the broad exploration of antigen arrays for proteomic profiling of autoantibody repertoires. A proteomic resource of antigens generated within the Human Protein Atlas project (20) was used to characterize autoimmunity signatures across 11,520 antigens on planar microarrays in 90 individuals with MS-related diagnosis. A set of 384 antigens, identified as potentially interesting candidates, was verified using both planar and suspension bead arrays. The bead arrays were further employed to analyze more samples (n = 376) to define a set of 51 antigens that showed differences in recognition frequencies across ONDs and different MS subtypes.
FIG. 8. Recognition frequencies for 51 antigens within the different sample groups and inhibition assays demonstrating the specificity of autoantibody reactivity. The heatmap (A) summarizes the recognition frequencies within different MS subtypes, ONDs and the CIS group for 51 antigens, which were verified on both array platforms and at three stages, and for which the differences in recognition frequency were statistically significant (Fisher's exact test p value < 0.05). Color intensity denotes the degree of recognition frequency for an antigen within the sample group. Recognition frequencies for five of these antigens within each subtype are shown in (B), each demonstrating a slightly different frequency pattern across different sample groups. Examples of significant differences in recognition frequencies are denoted either with a single (p value < 0.05) or a double star (p value < 0.01). Inhibition assays for this representative set of five antigens revealed that antigen-specific signals could be substantially reduced in all the samples (S1-S15) for each selected antigen, in which the reduced signal intensities varied between <1% (for PGAM5) and 36% (for ZNF70) (C).

Planar antigen arrays are increasingly considered a powerful tool for the study of antibody responses in autoimmune diseases. Using antigen arrays, very small volumes of body fluids can be screened to decipher the diversity of the autoimmune repertoire. Accordingly, the described study consumed less than 10 µl of collected plasma. Yet, the assessment of the diversity of the autoimmune repertoire is dependent on the comprehensiveness of the applied antigen collection. In many cases, collections are built on selected sets of antigens, which were previously associated with the disease of interest or with a related tissue/organ or a physiological process (e.g. inflammation). The inherent limitation of such a strategy can be addressed by studying untargeted collections of antigens.
This could be achieved by utilizing commercially available but costly antigen arrays (10, 12, 56-59). We used here an alternative approach by producing arrays in-house, which, however, requires access to sustainable resources of large antigen collections (60) such as the Human Protein Atlas. Within this project, human protein fragments are produced recombinantly and new sets of 384 of them, selected in an unbiased manner, are printed routinely and continuously, creating many different, untargeted array batches, of which 30 were used in this study.
The design of the antigens generated within the Human Protein Atlas and employed in this study is directed by the aim to produce unique sequence representations of a protein-encoding gene, and the produced antigens are continuous stretches of 80-100 residues, representing selected areas of the target protein. We used 11,520 antigens representing 7,644 unique proteins, corresponding to 38% of the human protein-encoding genes and roughly 9% of all human protein sequences. As detailed in supplemental Table S1, more than one antigen had been designed for 2,663 of these 7,644 target proteins. Such a "multiple antigen approach" might still not account for the possibility of tertiary recognition elements not being accessible at all. It is therefore likely that using protein fragments might limit findings to conformation-independent autoantibody epitopes. Yet, it still illustrates the possibility of capturing polyclonal reactivities to various areas of a target via a set of representative antigens. Besides, given the length of the protein fragments used here, one may speculate that the antigens still form secondary structural features such as coiled coils, which are suggested to be involved in epitope binding of autoantibodies (61,62). If so, these conformations could be recognized by autoantibodies as long as they present structures similar to those of the corresponding parts of the full-length proteins they represent. Finally, it may also be disadvantageous to use recombinant antigens in general, because autoantibodies toward proteins with post-translational modifications cannot be identified.
Protein arrays for autoantibody profiling should preferably contain full-length proteins so that conformation-sensitive autoantibodies, such as those toward folded myelin oligodendrocyte glycoprotein (MOG), can be identified (63). On the other hand, autoantibodies toward oligodendrocyte specific protein (OSP) recognize only the denatured OSP and a certain peptide but not the folded OSP (64). This example illustrates the need to study both the conformation-sensitive and conformation-independent autoantibody responses for capturing a greater part of autoantibody complexity in body fluids, even if it is speculated that linear epitopes might comprise about 10% of the autoantigenic epitopes (5). "The repertoire of target autoantigens is a Wunderkammer – a collection of curiosities – of molecules with no obvious linking principle", as stated by Paul Plotz in 2003 (65). A decade later, our understanding of the nature of autoantigens still seems very limited. Accordingly, there is no established "ultimate" strategy in terms of the type of affinity reagents employed, which autoantibodies could potentially recognize. As reviewed very recently (66), there are studies demonstrating not only the value of employing full-length proteins but also the value of peptides (67), peptoids (68), or lipids (34). Even antibody arrays can be used to study in particular those autoantigens circulating in complex with their autoantibodies (69). There is a vast universe of possible affinity reagents to interrogate the autoantibody repertoire, each with its inherent biases and advantages. We believe that using human protein fragments, which were selected to be the most unique representatives of a full-length protein, is a complementary approach that offers alternative routes for autoantibody profiling.
Currently, immunoblotting is the most widely used technique for the confirmation of autoimmunity data generated on planar antigen arrays and relies on the analysis of one protein at a time. Thus, it is critically important to establish efficient strategies suitable for the high-throughput verification of large data sets generated on planar arrays. In line with this, we here describe the use of two independent antigen array platforms. During the discovery stage, planar arrays with an epoxide solid support were used, whereas for verification, a strategy was employed relying on data concordance between the planar array and a suspension bead array platform, which uses beads with carboxyl groups as solid support. Antigens are immobilized to the carboxyl surface via their primary amine groups, whereas they can bind to the epoxide surface also via their exposed thiol and hydroxyl groups. In this regard, the epoxide surface might offer a more versatile surface in terms of accessibility of epitopes as compared with the carboxyl surface, especially if an antigen has several lysine residues. All in all, the chemical and kinetic properties of the two surfaces are different, potentially influencing antigen recognition. These factors, together with the differences in assay buffers and detection antibodies, could explain differences in recognition patterns for some of the antigens on the two different platforms. To our current knowledge, this is the first large-scale study aiming at systematic comparison between these two different antigen array platforms for profiling autoantibody repertoires. This mutual validation approach enables an efficient verification to discriminate consistent reactivity patterns and might therefore be envisaged as a required component of high-throughput autoantibody reactivity profiling screenings.
In addition, the suspension bead array platform is considered suitable for eventual implementation of multiplexed antigen assays in clinical assays (5), which additionally highlights the value of verifying reactivity patterns using this platform.
One of the challenges in analyzing autoimmunity data is to define autoantibody reactivity. Although antigen arrays are used extensively to study autoantibody reactivity, there are very few reports available (59, 70) with well-described data analysis strategies. Yet, defining thresholds for reactivity is especially critical when filtering putative antigens for further verification. The challenge arises largely from differences across individuals in terms of their plasma reactivity to the overall antigen content. Some individuals may reveal prominent profiles to a small set of the antigens, whereas others might have a more diverse reactivity pattern to a much larger set. We therefore defined a sample-specific threshold on which to base the reactivity or "positivity" of samples. Considering the analysis of autoimmunity data, there is also another important difference when compared with analysis of protein profiles, in which mostly classical statistical tests (e.g. Student's t test) are used to identify significant changes in mean or median of signals across different groups. This type of statistical analysis may not suit autoimmunity data because of the relative correlation between signal intensity and autoantibody concentration (7). Because the aim in studies like the presented one is to identify targets recognized with high reactivity even in a small number of individuals within a group, Fisher's exact test was used to identify antigens with differential recognition frequencies.
One of the main findings of this study was the number and diversity of antigens reacting with plasma antibodies, which varied greatly between individuals and irrespective of disease status. A vast majority of these profiles were only detected in single individuals, suggesting a tremendous heterogeneity of autoimmunity signatures. Generalizing this observation on a whole proteome scale may eventually suggest that the autoimmune repertoire in plasma is under the influence of various individual factors and humans potentially host autoantibodies toward hundreds, if not thousands, of autoantigens. This observation is in line with the outcome of similar recent studies (12,13).
Despite the heterogeneity, a small portion of the antibody profiles detected in this study displayed differences in recognition frequencies across the ONDs and MS subtypes. At this stage we can only speculate on their origin and role: They could represent a spreading of the immune response from an initial triggering pathogenic response against a so far unknown critical target, followed by liberation of antigens upon damage to the CNS and a subsequent response to these antigens. The increased numbers of targets recognized going from OND to RRMS and progressive MS would be consistent with such a hypothesis (Fig. 5B). This does not exclude the possibility that they may take part in the MS pathogenesis or be potentially useful as part of a biomarker set. In particular, one might speculate that they can take part in driving the progressive phase of MS, in which meningeal lymphoid follicles with abundant collections of B cells are present (71). Furthermore, it does not exclude the possibility that some of the antibody reactivities indeed represent primary MS pathogenic events. Although MS is thought to be mainly T-cell driven, an obvious speculation is that B cells are pivotal in antigen presentation to pathogenic T cells and may enrich for antigens present in low concentration and thereby drive the pathogenic T-cell response. Thus, the detection of potentially disease-associated autoantibodies might direct us to the autoantigens driving the disease. If these can also be well defined via functional studies, it would open the way for antigen-specific tolerogenic protocols that have been successful in rodent models.
As an outcome of this study, a set of 51 antigens was identified with differences in recognition frequencies mainly within different disease subtypes (Table II and supplemental Table S3). A remarkable portion of this set comprised proteins associated with regulation of transcription. From a broader perspective this is interesting and in line with a report focusing on autoimmunity to regulatory elements such as transcription factors and highlighting the concept of "immunoregulomics" (72). The majority of the targets within the set of 51 antigens have not been described as related to MS before, but this set also included targets reported as potential autoantibody targets in MS or which are closely related to such targets. One is GPR62, which is a G-protein coupled receptor; reactivity toward antigen targets belonging to the GPCR family has been reported in two other recent studies (73,74), and we observed increased reactivity toward this target in the more progressive form of MS compared with relapsing-remitting MS. Similarly, DNAJ (Hsp40) homologue was reported as one of the top ten significant autoimmune targets in MS by Beyer et al. (74). In our study there was accordingly no reactivity toward DNAJC3 (DnaJ (Hsp40) homolog, subfamily C, member 3) in the CIS group, whereas profiles in the progressive MS subtype revealed an increased reactivity. Toward the antigen PGAM5 (phosphoglycerate mutase family member 5), frequent reactivity was observed in all sample groups, though at a statistically significantly higher frequency within relapsing-remitting MS as compared with secondary-progressive MS. Not PGAM5 but another antigen belonging to the phosphoglycerate mutase family, PGAM1, had been reported as a potential autoimmune target in MS (75-77).
Besides, two antigens representing ATP10A (ATPase, class V, type 10A) and UBE3A (ubiquitin protein ligase E3A) were among the set of 51 antigens identified in this study; these had not been reported within the context of MS before. The genes coding for these proteins were found to be imprinted, constituting a candidate region for autism-spectrum disorders, and the proteins were considered to be involved in CNS signaling (78). Furthermore, the set included APP (amyloid precursor protein), which has been reported as an autoimmune target in MS in two different studies (34,74). Finally, the target list included ANO2 (anoctamin 2), also known as transmembrane protein 16B (TMEM16B), which was the antigen demonstrating the highest degree of concordance in recognition frequency in paired plasma-CSF samples (supplementary Fig. S1). The recognition frequencies for this antigen were significantly different when comparing ONDs to the entire MS group and also to the relapsing-remitting MS group. Like KIR4.1, the very recently reported potential autoantigen in MS (79), ANO2 is also an ion channel highly expressed in photoreceptor synaptic terminals (80).
Nevertheless, identification of these targets alone may not provide full biological insight because they might not be the immunogen eliciting the original immune response, as pointed out in a very recent report (81) suggesting the need to use additional tools to determine the causal event triggering the autoimmune response. Indeed, we have not yet demonstrated a direct pathogenic role for the autoantibodies identified. Our results indicate an increase or decrease of autoantibody reactivity across diagnostic groups of MS for 51 antigens, and these exploratory observations give a first insight into the potential of studying IgG reactivity on hypothesis-free assembled protein fragment collections. Considering the fact that not only an increase but also a decrease or loss in the abundance of certain autoantibodies can be associated with advancing disease status (82), the antigens for which we report changes in autoantibody reactivity across diagnostic groups of MS can be included in large-scale, targeted validation studies using larger and preferably multicenter MS biobank collections.
In conclusion, substantial microarray-based screening utilizing protein fragments for the identification of IgG-derived immunoreactivity profiles offers an emerging and appealing approach for broad analysis of autoantibody signatures. As exemplified here in the context of MS, heterogeneity in autoimmune-response demands tailored data analysis strategies and exemplifies the necessity of further exploration in larger sample collections.