Cerebrospinal-fluid-derived Immunoglobulin G of Different Multiple Sclerosis Patients Shares Mutated Sequences in Complementarity Determining Regions

B lymphocytes play a pivotal role in multiple sclerosis pathology, possibly via both antibody-dependent and -independent pathways. Intrathecal immunoglobulin G in multiple sclerosis is produced by clonally expanded B-cell populations. Recent studies indicate that the complementarity determining regions of immunoglobulins specific for certain antigens are frequently shared between different individuals. In this study, our main objective was to identify specific proteomic profiles of mutated complementarity determining regions of immunoglobulin G present in multiple sclerosis patients but absent in healthy controls. To achieve this objective, we purified immunoglobulin G from the cerebrospinal fluid of 29 multiple sclerosis patients and 30 healthy controls and separated the corresponding heavy and light chains via SDS-PAGE. Subsequently, bands were excised, trypsinized, and measured with high-resolution mass spectrometry. We sequenced 841 heavy and 771 light chain variable region peptides. We observed 24 heavy and 26 light chain complementarity determining regions that were solely present in a number of multiple sclerosis patients. Using stringent criteria for the identification of common peptides, we found five complementarity determining regions shared in three or more patients and not in controls. Interestingly, one complementarity determining region with a single mutation was found in six patients. Additionally, one other patient carrying a similar complementarity determining region with another mutation was observed. In addition, we found a skew in the κ-to-λ ratio and in the usage of certain variable heavy regions that was previously observed at the transcriptome level. At the protein level, cerebrospinal fluid immunoglobulin G shares common characteristics in the antigen binding region among different multiple sclerosis patients. The indication of a shared fingerprint may indicate common antigens for B-cell activation.

Autoimmune mechanisms play a central role in the pathogenesis of multiple sclerosis (MScl).
Recent trials indicate that B-lymphocyte depletion therapy can substantially reduce disease activity in relapsing-remitting MScl patients (1). Clinical amelioration after depletion seems to precede a reduction in autoantibody levels, possibly because this treatment rapidly affects the antigen-presenting cell functions of B cells (2). This finding has boosted interest in studies on the pathogenic role of autoreactive B cells. Despite the success in inhibiting antibody-independent functions of B cells, arguments remain for an additional chronic pathogenic role for autoantibodies within the central nervous system (CNS). These include (a) the presence of antibodies in cerebrospinal fluid (CSF) and brain tissue (3), (b) depositions of antibody within areas of demyelination along with local complement activation (3), and (c) myelin oligodendrocyte glycoprotein-specific antibodies in some subpopulations of MScl patients (4,5). Additionally, KIR4.1 was recently identified as a target of the autoantibody response in a subgroup of persons with MScl (6).
It has been shown that the distribution of genes used to generate antibodies in B cells from CSF and lesions of MScl patients is skewed relative to naturally expected distributions. Several groups described clonal B-cell populations within the CNS, sometimes even skewed to certain families of variable heavy (VH) regions (7,8). No common motifs have yet been found to be shared between different MScl patients. This would be in line with the classical immunological insight that suggests that it is extremely rare to find common sequences in the immunoglobulin G (Ig) variable regions among different individuals. However, this view has recently been challenged (9-11). Both after vaccination and in paraneoplastic syndromes such as anti-Hu, strikingly identical shared complementarity determining region (CDR) motifs were observed among patients (12). Of note is also a surprising study in which it was observed that malignant chronic lymphocytic leukemia B cells in different patients all recognized a single fungal antigen and showed shared use of CDR3 sequences among different individuals (13).
A novel approach for studying Ig gene usage in the biofluids of MScl patients is the use of proteomic sequencing. Obermeier and colleagues described overlap between Ig B-cell (CSF) transcriptomes and proteomes in four individual MScl cases, without interindividual overlapping sequences (14). However, this elegant proof-of-principle study was limited to four MScl patients, and there was no comparison between patients and controls. The possibility of sequencing CSF Ig at the protein level (10,14,15) may bring along some advantages. The genetic approaches used so far share the benefit that complete sequences can be identified at the single-cell level, but Ig derived from such clones does not necessarily represent the actual Ig repertoire found in CSF. Furthermore, whereas genomic studies are restricted to CSF cells, humoral CSF studies also include Ig proteins from other anatomical brain areas, such as parenchyma, meninges, and Virchow-Robin spaces. Finding common characteristics of the antigen binding sites of Ig among patients might provide leads regarding the question of whether common antigenic stimuli are responsible for the recruitment of intrathecal B cells in MScl. We previously described a new approach using advanced nanoscale liquid chromatography coupled online to a high-resolution mass spectrometer (LC-MS) (10,16,17), a reliable and powerful method for the sensitive detection of CDR peptides (18). Moreover, it can also be used to compare CDR peptide profiles among a relatively large number of patients and controls.
Our main question was whether we could detect specific proteomic profiles of CDR present in MScl patients but absent in controls. Here, we report a number of common CDR sequences in Ig of a group of MScl patients that were not observed in healthy controls. In addition, we show disturbed κ/λ chain ratios in CSF Ig of MScl patients and differences in VH family usage in patients relative to controls.

MATERIALS AND METHODS
Clinical Samples: MScl Patients and Non-neurological Controls-CSF samples were collected from untreated MScl patients, who were selected by an experienced neurologist (R.Q.H.) and were followed prospectively by the Rotterdam Multiple Sclerosis Center ErasMS at the Department of Neurology at the Erasmus Medical Center (Rotterdam, The Netherlands). The procedure for CSF sample collection was as described previously (19). All MScl patients had defined relapsing-remitting MScl or a clinically isolated syndrome according to the 2005 McDonald criteria for MScl (20). The control individuals were free of any neurological disorders; therefore, these individuals are hereinafter referred to as healthy controls. They underwent minor, non-neurologically indicated surgeries, and CSF was taken prior to the administration of sedatives as part of the anesthesia procedure. Immediately after collection, the CSF samples were centrifuged (10 min at 3000 rpm) so that cells and cellular elements could be discarded, and the supernatant was aliquoted and stored at −80°C until further use for this study. Blood-contaminated CSF samples were excluded based on the presence of erythrocytes detected via microscopic examination immediately after sampling. One aliquot of a sample was used for routine CSF diagnostics in the Clinical Chemistry Department of Erasmus Medical Center. This diagnostic procedure for the MScl patients included the quantification of total protein and albumin, assessment of the number of oligoclonal bands, and Ig index. Samples were also taken from another aliquot that had been studied previously to determine intraindividual variations in CSF protein abundances (37). This study was approved by the Institutional Ethical Committee of the Erasmus Medical Center, and written informed consent was obtained from all participants.
Ig Quantification Assay-An ELISA assay using 96-well plates (Immuno 96 MicroWell™ solid plate, Thermo Fisher, Bremen, Germany) was used to determine Ig concentrations in CSF samples. In this assay, AffiniPure F(ab′)2 fragment goat anti-human IgG (H + L) (Jackson ImmunoResearch Laboratories, Suffolk, UK) at a concentration of 1.3 mg/ml was used to coat the wells as a capture antibody. As a detection antibody, horseradish peroxidase conjugated polyclonal secondary antibody anti-human Fc goat (anti-human IgG HRP, Sigma-Aldrich, Saint Louis, MO) was used. Samples were incubated with antibodies for 5 min and gently shaken (400 rpm) at 4°C on a thermo-cycler shaker (Eppendorf AG, Hamburg, Germany). 3,3′,5,5′-Tetramethylbenzidine (Sigma-Aldrich, Saint Louis, MO) (100 µl per well) was used as a substrate for horseradish peroxidase and developed a soluble blue reaction product. The reaction was stopped with 100 µl of 1 M hydrochloric acid. The Ig concentration was determined photometrically based on absorbance at 450 nm and was quantified using an eight-point calibration curve ranging from 1.0 × 10⁻⁵ to 0.5 g/l.
Ig Purification-Ig was purified from CSF samples using a Melon Gel IgG Spin Purification Kit (Pierce, Rockford, IL) according to the manufacturer's protocol for serum, with slight modifications. All CSF samples were diluted at a ratio of 1:3 with purification buffer. We used 100 µl CSF from patients and 200 µl from controls based on the Ig concentration assay (two times the CSF volume was used for controls in order to normalize for Ig concentration). Subsequently, the diluted CSF was added to a spin column containing Melon Gel resin. After 15 min of incubation, the spin column was centrifuged at 5000 × g, and the flow-through containing the purified Ig was collected. Ig concentrations were determined in the flow-through fractions after purification (with the above-described Ig ELISA). Equal amounts of Ig across all patients and controls were then taken and subsequently lyophilized (Sublimator 400, Zirbus Technology, Tiel, The Netherlands) for six hours. The lyophilized Ig fractions were then stored at −20°C for one day before undergoing separation by SDS-PAGE.
For SDS-PAGE separation, loading buffer was added to each lyophilized sample and heated at 90°C for 10 min. Purified Ig antibodies were resolved into heavy (IgH) and light (IgL) chains by means of reducing one-dimensional SDS-PAGE using Bio-Rad Mini-Protean electrophoresis system gels (10% polyacrylamide gels of 0.75-mm thickness). The gels were stained with Novex® Colloidal Blue Staining (Invitrogen, Carlsbad, CA) according to the manufacturer's instructions. Overnight destaining was performed for visualization of IgH and IgL chain protein bands, and subsequently gels were scanned.
In-gel Trypsin Digestion-We excised gel bands manually in a laminar flow cabinet to minimize contamination by environmental keratin and other proteinaceous materials. Protein bands were cut into plugs and transferred into Eppendorf tubes. We performed reduction by dithiothreitol and alkylation with iodoacetamide. Subsequently, in-gel digestion was performed overnight at 37°C, and further peptide extraction procedures were performed (18). After peptide extraction, samples were dried for three hours in a vacuum centrifuge (SPD 1010, Thermo Savant, Holbrook, NY) and then stored at −80°C until LC-MS measurements.
Chromatography Separation and Mass Spectrometric Measurement-Before LC-MS measurements, the dried peptide samples were dissolved in 40 µl of an aqueous solution of 0.1% TFA and sonicated. The samples were measured with a nano-LC system (Ultimate 3000, Thermo Fisher Scientific, Amsterdam, The Netherlands) coupled online to a hybrid linear ion trap/Orbitrap mass spectrometer (LTQ-Orbitrap-XL, Thermo Fisher Scientific, Bremen, Germany). Samples were loaded onto a trap column (PepMap C18, 300-µm inner diameter by 5-mm length, 5-µm particle size, 100-Å pore size; Thermo Fisher Scientific) and washed and desalted for 10 min using 0.1% TFA (in water) as the loading solvent. Then the trap column was switched online with the analytical column (PepMap C18, 75-µm inner diameter by 250 mm, 3-µm particle size, 100-Å pore size; Thermo Fisher Scientific) and peptides were eluted with the following binary gradient: 100% solvent A, then from 0% to 25% solvent B in 60 min and from 25% to 50% solvent B in 30 min. Solvent A consisted of 2% acetonitrile and 0.1% formic acid in HPLC-grade water, and solvent B consisted of 80% acetonitrile and 0.08% formic acid in HPLC-grade water. All LC solvents were purchased from Biosolve (Valkenswaard, The Netherlands). The column flow rate was set at 300 nl/min, and eluting peptides were measured by a UV detector (at a wavelength of 214 nm in a 3-nl nanoflow cell (Thermo Fisher Scientific)) and consecutively introduced into the mass spectrometer. For electrospray ionization, metal-coated nano-electrospray ionization emitters (New Objective, Woburn, MA) were used and a spray voltage of 1.5 kV was applied. For MS detection, a data-dependent acquisition method was used: a high-resolution survey scan from 400-1800 Th was detected in the Orbitrap (target of automatic gain control = 10⁶, resolution = 30,000 at 400 m/z, lock mass set to 445.120025 Th (protonated [Si(CH3)2O]6 (21))).
On the basis of this full scan, the five most intense ions were consecutively isolated (automatic gain control target set to 10⁴ ions), fragmented via collision-activated dissociation (applying 35% normalized collision energy), and detected in the ion trap. Precursor masses within a tolerance range of ±5 ppm that were selected once for MS/MS were excluded from MS/MS fragmentation for 3 min or until the precursor intensity fell below a signal-to-noise ratio of 1.5 for more than five scans. Samples were prepared and measured in a randomized order. An internal quality-control sample was measured once in every five measurements. Before each run, a blank run was performed to monitor the background of the system.
Data Analysis and Peptide Identification-Peptide Identification by Mass and Fragmentation-Acquired LC-MS profiles for the purified IgH chain and IgL chain datasets were analyzed separately using the Progenesis LC-MS software package (version 2.6, Nonlinear Dynamics Ltd., Newcastle-upon-Tyne, UK). Individual runs were aligned with each other to compensate for variations in retention times (samples that could not be aligned by at least 200 vectors using the automated alignment option were excluded from further analysis as recommended by the manufacturer). Before peak selection, integration of the area of the peaks was performed. The resulting peaks could then be associated with the amino acid sequence information if corresponding fragmentation spectra were available.
From raw data files, MS/MS spectra were extracted and converted into MGF files using extract_msn (part of Xcalibur version 2.0.7, Thermo Fisher Scientific). Sequencing of the fragmentation spectra was conducted via a Mascot MS/MS database search (version 2.3.01, Matrix Science Inc., London, UK) against the human subset of the NCBInr non-redundant sequence database (August 15, 2010, Homo sapiens taxonomy; 232,854 sequences). The following settings were used for the database search: a maximum of two missed cleavages, tryptic cleavage, oxidation as a variable modification of methionine (+15.995 u), and carbamidomethylation as a fixed modification of cysteine (+57.021 u). The peptide mass tolerance was set at 10 ppm, and the fragment mass tolerance at 0.5 Da. For peptide identification, a minimum ion score of 25 was required. The resulting peptide identifications were filtered using Scaffold (version 3.2.0, Proteome Software Inc., Portland, OR). Peptide false discovery rates were calculated by Scaffold as (false positives)/(false positives + true positives). On average, the false discovery rate determined was always lower than 0.1%. Filter criteria for the generated identification result table were a peptide probability and a protein probability greater than 95%. At this stage, all non-Ig proteins (albumin, transferrin, keratin, etc.) were filtered out on the basis of their protein names so we could focus solely on the Ig proteins. Subsequently, the identified peptides and proteins were imported into the Progenesis software package and linked to their corresponding peptide peaks. The Progenesis analysis matrix contained mass, charge, intensity, abundance, and MS/MS fragmentation spectra of the detected peptides.
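The mass tolerance and false discovery rate criteria above can be illustrated with a minimal sketch. The 10 ppm tolerance and the FDR formula come from the text; the function names and example values are hypothetical, and the actual filtering in this study was performed by Mascot and Scaffold, not by code of this kind.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Relative mass deviation in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(observed_mz, theoretical_mz, tol_ppm=10.0):
    """True if the precursor lies within the peptide mass tolerance.

    tol_ppm defaults to the 10 ppm peptide tolerance quoted in the text.
    """
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm


def false_discovery_rate(false_positives, true_positives):
    """FDR as defined in the text: FP / (FP + TP)."""
    total = false_positives + true_positives
    if total == 0:
        raise ValueError("no identifications to evaluate")
    return false_positives / total


if __name__ == "__main__":
    # Hypothetical m/z values, chosen only to show the arithmetic.
    print(within_tolerance(500.004, 500.0))   # deviation of about 8 ppm
    print(within_tolerance(500.006, 500.0))   # deviation of about 12 ppm
    print(false_discovery_rate(1, 999))
```

A tolerance expressed in ppm scales with the mass, which is why the same 10 ppm window is narrower in absolute terms for a small precursor than for a large one.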
The abundance listed for a peak can represent either background or signal. To separate background from real peak signals, we used the following procedure: First, around 30 randomly chosen peaks in the low-intensity region of all samples were reviewed regarding their isotopic pattern. Peaks with more than two isotopes were classified as valid peaks, and peaks with no isotope or just one or two isotopes were classified as background. We performed manual checks on all peptides of interest to confirm the background level and the detection of the exact location and overlap of peaks among samples in the abovementioned Progenesis software package. The specific aim of this work was to find CDR peptide fragments that were present in MScl patients but not in the control group. Therefore, we filtered the datasets for candidates that were found in at least three MScl patients but were absent in controls.
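The candidate filter described above (present in at least three MScl patients, absent in all controls) can be sketched as follows. The data structure, the function name, and the sample and peptide labels are hypothetical and serve only to illustrate the selection rule; the study itself worked from the Progenesis analysis matrix.

```python
def exclusive_peptides(presence, patients, controls, min_patients=3):
    """Return peptides seen in >= min_patients patients and in no control.

    presence: dict mapping a peptide label to the set of sample IDs in
              which that peptide was detected above background.
    patients, controls: sets of sample IDs.
    The result maps each qualifying peptide to its patient count.
    """
    hits = {}
    for peptide, samples in presence.items():
        n_patients = len(samples & patients)
        n_controls = len(samples & controls)
        if n_patients >= min_patients and n_controls == 0:
            hits[peptide] = n_patients
    return hits


if __name__ == "__main__":
    # Toy labels, not data from the study.
    patients = {"P1", "P2", "P3", "P4"}
    controls = {"C1", "C2", "C3"}
    presence = {
        "pep_A": {"P1", "P2", "P3"},        # qualifies
        "pep_B": {"P1", "P2", "P3", "C1"},  # also seen in a control
        "pep_C": {"P1", "P2"},              # too few patients
    }
    print(exclusive_peptides(presence, patients, controls))
```

Lowering `min_patients` to 2 corresponds to the less stringent threshold mentioned later in the results.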
Manual Confirmation of Peptide Identifications-For further confirmation of the identification of peptides as described above, we assessed the MS/MS fragmentation spectra. If a peptide is common among patients, one should expect a similar spectral pattern (in terms of mass fragmentation and retention time window) to be present in different samples. Additionally, we evaluated the isotopic patterns in the mass spectrometry spectra in terms of the number and appearance of isotopes. Using the method described above, those marker peptides that did not pass our filter were not qualified as markers.
CDR Identification (Assigning Location in Ig Structure)-All identified amino acid sequences were aligned to a variable (V), diversity (D), joining (J), or constant (C) region human (Homo sapiens) Ig germ line sequence derived from the International ImMunoGeneTics Information System (IMGT) database (Montpellier, France) (22). As described previously (16,17,23), we used the BLASTp search algorithm (NCBI BLAST version 2.2.22) to align the identified peptide sequences to the corresponding Ig.
All peptide alignments with bit scores greater than 12.5 were selected for further analysis. Peptides aligned to the variable domain germ line were further submitted to the IMGT/domain gap alignment tool, which positions the peptide to the germ line sequence in the IMGT unique numbering residue system. The alignment with the most homologous germ line allele was provided by the IMGT tool and included in the data if the identity score was at least 70%. With this approach we were able to locate a peptide in a CDR or framework region. A CDR peptide was defined as a peptide containing a minimum of three amino acids in the CDR part, irrespective of framework length. An overview of the methodology is presented in the form of a flow chart in Fig. 1. The analysis matrix generated for IgH and IgL chain comparison by the Progenesis software included with the peptide alignment summary (derived from a BLAST-IMGT search) is presented in supplemental Files S1 and S2.
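The selection thresholds and the CDR peptide definition above can be written down as a short sketch. It assumes that each residue of a peptide has already been assigned a position in the IMGT unique numbering (for example by the IMGT alignment tool) and that the standard IMGT V-domain CDR boundaries apply (CDR1 27-38, CDR2 56-65, CDR3 105-117); the function names and inputs are hypothetical, and the sketch is not the authors' pipeline.

```python
# CDR boundaries in the IMGT unique numbering for V domains (assumed here).
CDR_RANGES = {"CDR1": (27, 38), "CDR2": (56, 65), "CDR3": (105, 117)}


def passes_alignment_filters(bit_score, identity_pct):
    """Thresholds quoted in the text: bit score > 12.5, identity >= 70%."""
    return bit_score > 12.5 and identity_pct >= 70.0


def classify_peptide(residue_positions, min_cdr_residues=3):
    """Assign a peptide to a CDR or to the framework.

    residue_positions: IMGT positions occupied by the peptide's residues.
    A peptide counts as a CDR peptide when at least min_cdr_residues of
    its residues fall inside one CDR, irrespective of framework length.
    """
    positions = list(residue_positions)
    best_name, best_count = "framework", 0
    for name, (first, last) in CDR_RANGES.items():
        count = sum(first <= p <= last for p in positions)
        if count >= min_cdr_residues and count > best_count:
            best_name, best_count = name, count
    return best_name


if __name__ == "__main__":
    print(classify_peptide(range(50, 63)))  # seven residues inside CDR2
    print(classify_peptide(range(45, 58)))  # only two residues inside CDR2
```

Counting residues by their assigned positions, not by the numeric span of the peptide, matters because the IMGT numbering contains unoccupied positions (gaps) in many sequences.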
Statistical Analysis-We found a set of CDR peptides that were present in at least three MScl samples but not in any control samples (true result). To determine the probability that this finding was not due to chance alone, we performed permutation tests. We permutated (randomized) the sample group assignment (patient or control group) and reran the determination of exclusively present CDR peptides 5000 times (false hits). The relative frequency of the occurrence of exclusive CDR peptides in a randomized sample set was calculated by dividing the number of false hits exceeding the true result by the numbers of randomization trials. For these computations we used the statistics package R (version 2.15.2). Other statistical analyses of data and graphical presentations were performed using GraphPad Prism (GraphPad Software version 5.00) or by an Excel 2010 function.
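The permutation procedure described above can be sketched in a few lines. The original computations were done in R; this Python version is only an illustration under stated assumptions: the presence data structure and function names are hypothetical, and a permutation is counted when its number of exclusive peptides is at least as large as the observed one (a common conservative convention, whereas the text speaks of hits "exceeding" the true result).

```python
import random


def count_exclusive(presence, patients, controls, min_patients=3):
    """Number of peptides found in >= min_patients patients and no control."""
    return sum(
        1
        for samples in presence.values()
        if len(samples & patients) >= min_patients and not (samples & controls)
    )


def permutation_p_value(presence, patients, controls, n_perm=5000, seed=0):
    """Estimate how often shuffled group labels match the observed count.

    presence: dict mapping a peptide label to the set of sample IDs in
              which it was detected; patients and controls are sets of IDs.
    Returns (observed_count, relative_frequency).
    """
    rng = random.Random(seed)  # fixed seed for reproducibility
    observed = count_exclusive(presence, patients, controls)
    all_samples = sorted(patients | controls)
    n_patients = len(patients)
    hits = 0
    for _ in range(n_perm):
        shuffled = all_samples[:]
        rng.shuffle(shuffled)
        perm_patients = set(shuffled[:n_patients])
        perm_controls = set(shuffled[n_patients:])
        # Assumption: ">=" is used here instead of a strict ">".
        if count_exclusive(presence, perm_patients, perm_controls) >= observed:
            hits += 1
    return observed, hits / n_perm


if __name__ == "__main__":
    # Toy data: four peptides present in every patient and in no control.
    patients = {f"P{i}" for i in range(1, 7)}
    controls = {f"C{i}" for i in range(1, 7)}
    presence = {f"pep{i}": set(patients) for i in range(4)}
    print(permutation_p_value(presence, patients, controls, n_perm=1000))
```

Group sizes are preserved in every permutation, so each shuffled dataset has the same number of "patients" and "controls" as the real one.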

RESULTS
In total, 59 samples were included for analysis of the IgH comparison set (n = 29 MScl versus n = 30 controls). The IgL comparison set included 55 samples (n = 29 MScl versus n = 26 controls). Four IgH samples (two MScl and two controls) and six IgL samples (one MScl and five controls) were excluded after label-free analysis because of weak alignment and UV information.
Patient Characteristics-All patient characteristics are shown in Table I. Significant differences in gender and age (Mann-Whitney test, p < 0.0001) were observed between MScl patients and controls, but no other parameters showed any differences between the groups (Mann-Whitney test, p > 0.05). MScl patients have slightly increased Ig concentrations in CSF. Therefore, we normalized the CSF Ig concentration. After normalization, the UV-quantified area (peptide abundance) obtained during LC-MS measurements did not show any significant difference (Mann-Whitney test, p = 0.80 IgH and p = 0.18 IgL) in the concentration of digested peptides between groups (supplemental Fig. S1).
VH and VK Family Distribution in MScl Patients and Controls-The peptide spectral count (based on MS/MS identification) information (Scaffold-based) acquired at the individual level (Fig. 1) was used to analyze the Ig VH family distribution between MScl patients and controls. An alignment match score of 70% was used as a cutoff value, and a mean score of 90% ± 8.5% was observed. The mean sequence length was 10 ± 2.1 amino acids (mean ± S.D.). Peptide counts were normalized at the individual patient level to the total number of peptides found for the given IgH chain family. The usage of peptides assigned to each family was compared between groups. Log-transformed data were not normally distributed (according to a D'Agostino and Pearson omnibus normality test); therefore, a non-parametric statistical test was used to determine statistical significance using raw data. In comparison to controls, VH4 usage differed slightly but statistically significantly (p = 0.03), and a trend toward increased VH3 in MScl was observed (p = 0.05) (Fig. 2). No significant differences were found for the other six VH families (p > 0.05) (supplemental Fig. S2A). The difference in family usage of the VK chains between MScl patients and controls was determined in a similar way. We found a peptide alignment match mean score of 86% ± 11%. The mean amino acid sequence length was 10 ± 2.6 (mean ± S.D.). Analysis did not show a significant difference between groups (supplemental Fig. S2B).
Disturbed κ/λ Ratio in MScl Patients-The κ/λ ratio was analyzed in controls (n = 26) and MScl patients (n = 29). The ratio of κ and λ light chains was determined from the abundance data for representative peptides from the LC-MS dataset (Progenesis). The peptides used were C-region peptides (n = 10) having an alignment match score of 100% ± 0% (mean ± S.D.) for κ (n = 6) and 97% ± 3% (n = 4) for λ peptides. The κ/λ mean ratio in controls was 2.52 ± 1.71, and in MScl it was 4.06 ± 2.76. A two-tailed t test was applied to the log-transformed data, which were normally distributed according to normality testing. The ratio showed significant elevation in the MScl group (p = 0.03) (Fig. 3).

FIG. 1. Schematic illustration of the proteomics-based methodology used to assign CDR presence exclusively in the CSF Ig of MScl patients. A, Ig was purified from the CSF of MScl patients (100 µl) and non-neurological controls (200 µl) based on the Ig concentration assay. Purification was performed using a Melon Gel IgG Spin Purification Kit. Purified Ig was separated into IgH and IgL chains via reducing one-dimensional SDS-PAGE gels. B, after in-gel trypsin digestion of excised IgH and IgL bands, the mixture of peptides was measured via nano-LC-LTQ-Orbitrap MS. Mass spectra were analyzed by Progenesis software, and the peptide search was performed by Mascot. Identified peptides were used for CDR identification using the IMGT database and BLASTp search algorithm to find MScl-specific CDRs. (1) Nano-LC-LTQ-Orbitrap MS-generated LC-MS profiles (raw MS run data) for the IgH and IgL datasets. These were analyzed separately using the Progenesis LC-MS software package (label-free quantification). Sequencing of the MS/MS spectra was executed via a Mascot MS/MS database search against the human subset of the NCBInr sequence database. Afterward, the identified peptides were imported into Progenesis and linked to their corresponding peaks. The peak abundance (UV area under the curve) information was used for analysis of CDR presence or absence. (2) Identified peptides were extracted and converted into FASTA format. They were aligned to V, D, J, and C elements of Ig (Homo sapiens) germline sequences derived from the IMGT database. We used the BLASTp search algorithm to align the identified peptide sequences to the corresponding Ig fragments. Next, they were submitted to the IMGT/domain gap alignment tool. This analysis provided alignment details of CSF Ig peptides relative to the Ig germline that included the gene name, homology match score, mutation/mismatch, and start and end positions. (3) Peptide identification details were uploaded in Scaffold. The Scaffold file contained a peptide identification view report based on the spectral counts (at MS/MS level). Peptide counts were obtained at the individual level for each sample and were exported to a spreadsheet containing detailed information about the protein and peptide hits. On the basis of the resulting combined peptide set, alignment summaries relative to the germline were assigned at the individual level (using a BLASTp algorithm and the IMGT database). This information was used for VH and VK family distribution analysis. BLAST, basic local alignment search tool; BLASTp, basic local alignment search tool for protein; IMGT, ImMunoGeneTics information system; Feature (or Peak), object of defined mass with an identified charge state, retention time, and isotopes characterized by the analysis software.
Identical CDR Peptide Identification in Several MScl Patients-As a result of nano-LC-LTQ-Orbitrap MS measurements of tryptically digested Ig bands, we found 54,057 MS peaks in the IgH chain and 56,664 in the IgL chain comparison set from Progenesis. The database (NCBInr) analysis by Mascot identified 1086 peptides for IgH samples. Similarly (via NCBInr), we identified 920 peptides for IgL samples. Next, the BLASTp algorithm aligned our (experimental) peptide sequences to a database (IMGT) of germline sequences that were present in naive B cells (as described in "Materials and Methods"). The best-matching germline allele was selected from the database. Comparison of the peptide with the best-matching germline allele also revealed which amino acids were most likely mutated during rearrangement and affinity maturation. In this way, from the 1086 IgH peptides, 809 sequences were assigned to the V region, 32 sequences to the J region, and 99 sequences to the C region. Similarly (by IMGT), from the 920 IgL peptides, 722 peptides were assigned to the V region, 49 sequences to the J region, and 99 to the C region. Next, within the IgH variable domain, we were able to assign 41 peptides to CDR1, 128 to CDR2, and 171 to CDR3. Within the IgL variable domain, we were able to assign 78 peptides to CDR1, 233 to CDR2, and 51 to CDR3. A summary of the identification is presented as a flowchart in supplemental Fig. S3. Many of the IgH and IgL peptides in our dataset were unassigned. A possible reason for this might be that rearranged CDR3 sequences do not have enough similarity to the germline sequence to allow for alignment.
We were interested in peptides that were observed only in the MScl group. We found 24 IgH and 26 IgL peptides that could be identified only in the MScl group and not in controls. In contrast, as expected, peptides derived from the Ig C region were equally distributed in both patients and controls (data not shown). We searched for those CDR-related peptides in the group of peptides that were common among multiple MScl patients. Next, we zoomed into the CDR regions to assess for peptides that were shared in at least three MScl patients and absent in the controls. We found nine IgH peptides and six IgL peptides. At a lower threshold (i.e. shared in at least two MScl patients) we observed 14 IgH peptides and 13 IgL peptides (data not shown). These results emphasize that the CDR peptides were found in three or more MScl patients and not in controls. As a confirmation step, a second round of assessment was performed by means of two-dimensional view analysis via the Progenesis software package. To check whether the fragmentation spectra of a peptide were similar in all patients in whom that peptide had been identified, we critically assessed the fragmentation spectra of the peptides. We found five peptides that were confirmed to be shared by three or more MScl patients. The exact characteristics are shown in Table II.
To determine the probability that the exclusive presence of CDR in MScl was not due to chance alone, we permutated the entire dataset repeatedly 5000 times for the CDR marker category. Analysis showed that CDR's exclusive presence in MScl was not due to chance in IgH (p = 0.0005) (see supplemental material). Five peptides were found to be specific to three or more patients in the MScl group. Interestingly, one of the mutated CDR2s commonly used was seen in seven different MScl patients (supplemental Table S1A). IMGT alignment analysis showed that this CDR peptide had different mutations (T, E, and N) or an insertion (F) at the same spot. The QDGSETYYVDSVK peptide (T is mismatched/mutated relative to the Ig germline) was quantified in 6 MScl patients and not in 30 controls (Fig. 4). Furthermore, the peptide was identified via MS/MS in four out of the six MScl patients (but not in controls), providing additional support for proper sequence identification (Fig. 4). Sequence alignment with the human germline sequences (derived from the IMGT database) showed homology to IGHV3-7, and threonine (T) was found mutated/mismatched from the lysine (K) of the germ line. A similar peptide, QDGSEEYYVDSVK, was identified in three MScl patients (supplemental Fig. S5). Alignment showed the mutation of a glutamic acid (E) at the same spot. In addition to the peptide described above, two resembling CDRs with other mutations, QDGSETFYVDSVK (F represents the insertion) and QDGSENYYVDSVK, were also observed solely in one MScl patient (supplemental Table S2 and supplemental Fig. S6). Next, the IDWDDDKYYSTSLK peptide was quantified exclusively in four MScl patients (supplemental Fig. S7A). For two of these MScl patients, fragmentation spectra were obtained that showed identical peptide MS/MS results, supporting the robustness of the peptide identification.
Alignment analysis showed homology to IGHV 2-70 *01 (CDR2) for these peptides; another identical CDR, IDWDDDKYYTTSLK, with a mutation, was observed solely in one MScl patient (supplemental Table S2 and supplemental Fig. S7B). Supplemental Table  S1B shows common uses of the same peptide in five MScl patients. Next, the YNSAPLTFGGGTK peptide was identified exclusively in three patients (supplemental Fig. S8). A homology search showed alignment to the CDR3 of gene IGKV 1-27 *01 and IGKJ 4 *01. Finally, the LLIHGASNR peptide was identified solely in three MScl patients (supplemental Fig. S9). Alignment showed homology to IGKV 3-20 *02 CDR2, and histidine (H) and asparagine (N) were found mutated from the tyrosine (Y) and serine (S) relative to the germ line.
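The permutation analysis mentioned above (5000 permutations of the dataset) can be sketched as a label-shuffling test. The text does not specify the exact test statistic, so the one used here (the number of peptides present in at least three patients and in no control) is an assumption, as are the function names and the toy presence matrix.

```python
import random


def count_exclusive(presence, labels, min_patients=3):
    """Count peptides present in >= min_patients cases (label 1) and no control (label 0).

    presence: list of rows (one per sample), each a list of 0/1 flags per peptide.
    """
    count = 0
    for j in range(len(presence[0])):
        cases = sum(row[j] for row, lab in zip(presence, labels) if lab == 1)
        controls = sum(row[j] for row, lab in zip(presence, labels) if lab == 0)
        if cases >= min_patients and controls == 0:
            count += 1
    return count


def permutation_p_value(presence, labels, n_perm=5000, min_patients=3, seed=0):
    """Estimate how often shuffled group labels give a count at least as large as observed."""
    rng = random.Random(seed)
    observed = count_exclusive(presence, labels, min_patients)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the link between samples and group labels
        if count_exclusive(presence, shuffled, min_patients) >= observed:
            hits += 1
    # Add-one correction so the estimated p value is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)


# Toy data (hypothetical): 6 patients followed by 6 controls, 2 peptides.
# Peptide 0 occurs only in four patients; peptide 1 occurs in every sample.
toy_presence = [[1, 1]] * 4 + [[0, 1]] * 8
toy_labels = [1] * 6 + [0] * 6
observed, p = permutation_p_value(toy_presence, toy_labels, n_perm=5000)
print(observed, p)
```

With 5000 permutations and the add-one correction, the smallest attainable p value is 1/5001, which bounds the resolution of such a test.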

DISCUSSION
The main observation of this study was the shared use of identical mutated sequences in CDRs of purified Ig from the CSF of different MScl patients. In contrast, in a set of 30 healthy controls, no overlap in the tryptic peptide sequences of CDRs existed. To the best of our knowledge, this was the first study to use a proteomics approach to analyze and compare the sequences of purified Ig from the CSF of a significant, large set of MScl patients and controls. Most studies until now have investigated clonality at the transcriptome level (8, 24-26). In these previous studies, no common sequences between distinct individuals had been reported. These studies used FACS-based cell sorting methods or CSF cell isolation and ended up with a limited number of CD19+ B cells and/or CD138+ plasma cells (8,25). Further, they used cloning at the nucleic-acid level by means of PCR-based technologies. Apart from the larger number of patients investigated, this study differs in that we targeted the CSF Ig at the protein level and did not investigate CSF B cells.
a′ and a″ denote sequences resembling sequences with a different mutation, present exclusively in one patient, and are shown in supplementary Table S2. b The position of the sequence in the Ig structure is indicated.
Peptides were analyzed by Progenesis LC-MS, and comparison between MScl and control groups was performed. After the chromatograms had been aligned in the Progenesis software, two-dimensional profiles were compared with find-expression profiles between the two groups. A, precursor mass of the peptide not found in any of the 30 controls. B, precursor mass of the peptide identified in six out of 29 MScl patients. The region covered in A and B by red boxes/isotope boundaries indicates the area of interest and shows the precursor mass isotopic pattern. The peptide abundance was calculated as the sum of the peak areas within these isotope boundaries. C, example of identical MS/MS fragmentation spectra in different MScl samples.
Although all MScl CSF samples tested here had signs of elevated intrathecal Ig production, we cannot claim that the public sequences shared here between distinct individuals are responsible for the oligoclonal bands seen in immunoelectric focusing used in routine hospital chemistry. It should be realized that the sophisticated MS technique applied here makes it possible to detect peptide sequences at concentrations far below the threshold for routine immunoelectric focusing (18).
The current paradigm in immunology is that the antigen specificity of B cells is determined via random mechanisms, and therefore one would expect different sequences of the antigen binding CDR in different individuals. This has recently been challenged by a number of observations. First, Scheid et al. cloned 576 new human immunodeficiency virus (HIV) antibodies from four unrelated individuals and found that despite extensive hypermutation, these antibodies shared consensus sequences in both framework and CDR V regions of IgH chains (9). In another study on Sjögren syndrome, secreted human Ro52 antibody from unrelated patients was found to share public V region sequences (27). Finally, our group observed 28 common Ig-derived sequences in paraneoplastic anti-Hu-syndrome that were specific for autoantigen and were found exclusively in samples from single autoantibody-defined clinical neurological entities (12).
Our study in MScl patients appears to follow the same paradigm-challenging pattern. A considerable set of 24 IgH and 26 IgL CDR peptides was exclusively present in MScl patients. We found nine peptides in the IgH set and six peptides in the IgL set shared among at least three MScl patients. These numbers were somewhat higher at the lower threshold of sharing in at least two MScl patients, where we observed 14 CDR peptides in IgH and 13 in IgL exclusively in MScl patients. This indicates that under (auto)antigenic pressure there may be common selection mechanisms for the production of intrathecal Ig by B lymphocytes. The suggestion of shared sequences within different MScl individuals is reminiscent of what has been reported on recruited T lymphocytes in this disease (28). Although the reason remains unclear, it would not be farfetched to speculate that some CDR sequences might better survive the clonal selection process than others, perhaps because of a stronger binding of the three-dimensional structure of the CDR to the antigen. Better insight into common selection mechanisms for (auto)antibodies in MScl and the identification of interindividually shared specific CDR sequences might even deliver markers for subgroup identification.
Apart from focusing on the CDR regions, we also investigated the possible use of common VH and IgL chain families. VH repertoires can be divided into seven families based on sequence similarity (22). Previous studies at the genomic level indicated a skewed use of VH4 in MScl (8,29). Although it is hard to draw firm conclusions here, it was striking to see the overrepresentation of peptide sequences (p = 0.03) of the VH4 family in MScl patients relative to controls. In addition, we observed a trend toward increased VH3 family usage in MScl patients (p = 0.05). No significant difference in the use of the VK chain was observed between the two groups.
An additional observation here was the increased κ/λ ratios in the MScl group. This observation is in agreement with previous studies (30) using conventional assays for IgL detection.
Compared with the other studies, a limitation of our study is that we do not know the antigen specificity of the V regions identified here. Moreover, we do not show complete Ig sequences, because this approach requires trypsin digestion. In light of the shared sequences observed in known antigen-specific antibodies in HIV infection, anti-Hu paraneoplastic disease, and Ro52 autoimmunity, it would be of future relevance to investigate the possible use of public V-region sequences in purified specific antibodies against MScl candidate antigens such as MOG, neurofascin, and KIR4.1 (6,31,32).
In the available previous studies, no attempts were reported to match the identified sequences with those available in public databases, for example via BLAST (NCBI). We here performed such a cross-check and were surprised to notice that similar sequences were found in other studies investigating B cells from the CSF of MScl patients.
Although the technique used makes it impossible to show the sequence of complete Ig proteins, it is striking that these peptides have also been identified in other MScl studies. For example, our study showed a common CDR usage in seven different patients. The QDGSEKYYVDSVK peptide, which is part of CDR2, belongs to the IGHV3-7 germline. We found different amino acid mutations (T, E, or N) and an insertion (F) at the position of the germline lysine (K). The QDGSETYYVDSVK and QDGSETFYVDSVK sequences were published in the database from a study assessing intrathecal CSF B-cell sequences in two MScl patients using an RT-PCR approach (33). The peptide QDGSETFYVDSVK was also shown by a different study that sequenced CSF oligoclonal bands in four MScl patients using a proteomics approach (14). Furthermore, the IGHV3-7 gene associated with these peptides has also been reported in previous studies performed on CSF B cells of MScl patients (14,33,34). Next, gene IGHV2-70, related to IDWDDDKYYSTSLK and IDWDDDKYYTTSLK, has also been described in different studies performed on CSF B cells of MScl patients (34,35). The peptide IDWDDDKYYSTSLK has no mutation; although it is plausible that this peptide may be found in the larger, healthy population, we presume that this peptide is enriched in the MScl population. Certain V genes may be preferentially selected in the antibody response against certain antigens (repertoire bias) (7). Thus, such peptides, especially as part of a larger panel of peptides, can help in the identification of patient populations even if they are not unique to that population. Next, peptide YNSAPLTFGGGTK does not contain amino acids introduced by somatic hypermutation, but it was generated by V(D)J recombination. Nevertheless, it is unique because of its particular deletion of two nucleotides of the V region in conjunction with one nucleotide of the J region and the lack of any additional N nucleotides. In all, three of the CDRs observed in three or more MScl patients have been linked to MScl in the past.
Red circles (left) in the figure indicate the location of the retention time and the mass (m/z) where the MS/MS scan was triggered. Almost identical MS/MS fragmentation spectra were observed in different patients. D, MS/MS spectra containing the m/z values and abundances of peptide ion products. A longer series of contiguous y and b ions shows a higher probability of correct identification. The sequence proposes a mutated CDR2 peptide. Red circles show a mutated (T) amino acid from the germline that was found during alignment using the IMGT database.
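The comparison of the observed CDR2 peptides with the IGHV3-7 germline peptide QDGSEKYYVDSVK can be illustrated with a simple position-wise check. This is only a sketch for equal-length substitution variants and is no substitute for an IMGT alignment; in particular, it does not model the variant described in the text as an insertion (F). The function name is hypothetical, and positions are counted within the peptide, not in IMGT numbering.

```python
def germline_mismatches(peptide, germline):
    """List (position, germline_residue, observed_residue) for each mismatch.

    Positions are 1-based and relative to the peptide itself. Only
    equal-length sequences are compared; insertions and deletions
    would require a proper alignment.
    """
    if len(peptide) != len(germline):
        raise ValueError("position-wise comparison needs equal-length sequences")
    return [
        (i + 1, g, p)
        for i, (g, p) in enumerate(zip(germline, peptide))
        if g != p
    ]


GERMLINE = "QDGSEKYYVDSVK"  # germline CDR2 peptide quoted in the text
for variant in ("QDGSETYYVDSVK", "QDGSEEYYVDSVK", "QDGSENYYVDSVK"):
    print(variant, germline_mismatches(variant, GERMLINE))
```

All three substitution variants differ from the germline peptide at the same residue, the lysine (K), which is the pattern the text describes as mutations "at the same spot".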
In studies involving MScl patients, the phenomenon of repertoire bias (7,8,29,36) has been described, whereby specific genes from the germline repertoire are favored in the panel of antibodies that is produced during the immune process. Such selection pressures drive antibodies in convergent directions. One might expect many of the shared sequences to be CDR3 sequences, as CDR3 is important for the specificity of an antibody. However, two aspects may favor the detection of shared sequences in CDR1/2 instead. First, CDR3 is generally highly mutated on the border of V, D, and J germline alleles. As such, it is difficult to identify based on homology to the germline sequences, and it may remain unidentified. Second, it might be that the highly mutated CDR3 is in fact mostly unique to an individual, and that motifs shared between individuals are instead found in the moderately mutated CDR1/2 regions.
To conclude, this proteomic study shows for the first time CDR peptides shared among individual MScl patients but not controls. There was striking overlap with a few CDR peptides identified in other studies that assessed B-cell clonality in the CSF of MScl patients at the nucleic-acid level. Whether such common B-cell responses are indeed driven by autoantigens remains to be determined. It will be of interest to study common V-region use in known autoreactive Igs that appear to play a role in MScl (6).