Peptides Presented by HLA-DR Molecules in Synovia of Patients with Rheumatoid Arthritis or Antibiotic-Refractory Lyme Arthritis*

Disease-associated HLA-DR molecules, which may present autoantigens, constitute the greatest genetic risk factor for rheumatoid arthritis (RA) and antibiotic-refractory Lyme arthritis (LA). The peptides presented by HLA-DR molecules in synovia have not previously been defined. Using tandem mass spectrometry, rigorous database searches, and manual spectral interpretation, we identified 1,427 HLA-DR-presented peptides (220–464 per patient) from the synovia of four patients, two diagnosed with RA and two diagnosed with LA. The peptides were derived from 166 source proteins, including a wide range of intracellular and plasma proteins. A few epitopes were found only in RA or LA patients. However, two patients with different diseases who had the same HLA allele had the largest number of epitopes in common. In one RA patient, peptides were identified as originating from source proteins that have been reported to undergo citrullination under other circumstances, yet neither this post-translational modification nor anti-cyclic citrullinated peptide antibodies were detected. Instead, peptides with the post-translational modification of S-cysteinylation were identified. We conclude that a wide range of proteins enter the HLA-DR pathway of antigen-presenting cells in the patients' synovial tissue, and their HLA-DR genotype, not the disease type, appears to be the primary determinant of their HLA-DR-peptide repertoire. New insights into the naturally presented HLA-DR epitope repertoire in target tissues may allow the identification of pathogenic T cell epitopes, and this could lead to innovative therapeutic interventions.

Rheumatoid arthritis (RA), the most common form of chronic inflammatory arthritis, is an autoimmune disease of unknown cause. In contrast, Lyme arthritis (LA), another type of inflammatory arthritis known to result from infection with Borrelia burgdorferi (1), can usually be treated successfully with antibiotic therapy, an outcome called antibiotic-responsive LA. However, in a small percentage of LA patients, synovitis persists for months to several years after apparent spirochetal killing with antibiotic therapy. This outcome, called antibiotic-refractory LA, may result from infection-induced autoimmunity (2). Inflamed synovial tissue, which shows synovial hypertrophy, vascular proliferation, and infiltration of mononuclear cells, including macrophages, plasma cells, and T and B cells, has a similar appearance in all forms of chronic inflammatory arthritis, including RA and antibiotic-refractory LA, and is a target tissue of the immune response in these patients. Inflamed synovia show marked up-regulation of HLA-DR molecules on professional antigen-presenting cells (APCs) and synoviocytes (3,4), and this provides evidence that HLA-DR expression is intense throughout the synovial lesion.
We and others have reported that specific HLA-DR alleles constitute the greatest known genetic risk factor for RA or antibiotic-refractory LA (5-7). In RA, the implicated DR alleles, primarily the DRB1*0401, -0404, -0405, -0101, and -0102 alleles, code for a highly homologous amino acid sequence at positions 70-74 of the B1 chain of the molecule (8-10). This region of the molecule is thought to be important in the specificity of peptide binding, and therefore, it seems to be a critical factor for defining a person's HLA-DR-peptide repertoire. These same RA-associated HLA-DR alleles and the DRB5*0101 allele, which bind an epitope of B. burgdorferi outer surface protein A (OspA(161-175)), occur more frequently in patients with antibiotic-refractory LA than in those with antibiotic-responsive LA (7).
It is unclear how these HLA-DR molecules are involved in autoimmune arthritis (11): these DR molecules may present specific arthritogenic autoantigens in the joint; they may fail to present specific self-peptides during ontogeny, resulting in the survival of certain autoreactive T cells; or they may simply be markers for closely related inflammatory genes (12,13). These hypotheses are not mutually exclusive, and all three factors may have a role in autoimmune arthritis. However, it has been difficult to prove these hypotheses, and pathogenic T cell epitopes have not yet been identified in any form of autoimmune arthritis, including RA or antibiotic-refractory LA (14,15).
The advent of highly sensitive nanoflow liquid chromatography-tandem mass spectrometry (LC-MS/MS) systems has made it possible to identify peptides presented by HLA-DR molecules in patients' cells or tissues (16). In 1995, in the first study of this type, Gordon et al. (17) identified 14 HLA-DR-presented peptides in the spleen of an RA patient with Felty syndrome. Subsequently, larger numbers of HLA-DR-peptides were identified in colon tissue from patients with inflammatory bowel disease (18), kidney primarily from patients with renal cell carcinoma (19), pooled bronchoalveolar lavage (BAL) cells from patients with sarcoidosis (20), or thyroid from patients with Graves disease (21). These studies identified an average of only 20-40 peptides per patient, but the lists of peptides did include suspected or known autoantigens, such as thyroglobulin in Graves disease.
In the study reported herein, we used high performance liquid chromatography-tandem mass spectrometry, rigorous application of multiple database search approaches, and manual spectral interpretation to identify HLA-DR-presented peptides, their post-translational modifications, and source proteins from the synovia of four patients, two diagnosed with RA and two diagnosed with antibiotic-refractory LA. Three of the four patients each had at least one of the DR alleles associated with these diseases (6,7), but only two patients, one with RA and one with antibiotic-refractory LA, had the same allele (DRB1*0101). A wide range of both intracellular and extracellular self-proteins apparently entered the HLA-DR pathway of APCs in the inflamed synovia of these patients. However, only a few identified epitopes appear to be disease-specific. Instead, two patients who had different diseases but the same HLA-DR allele had the largest numbers of epitopes in common, suggesting that epitope selection is more dependent on HLA-DR genotype than on the type of arthritis.

EXPERIMENTAL PROCEDURES
Study Patients-The Human Investigations Committee at Massachusetts General Hospital approved the study, and patients gave written informed consent. The two patients with RA met the American Rheumatism Association criteria for the diagnosis of this disease (22), and the two patients with LA met the Centers for Disease Control and Prevention criteria for the diagnosis of Lyme disease (23,24). Both patients with LA had antibiotic-refractory arthritis, defined as persistent arthritis for ≥3 months after the start of ≥2 months of appropriate oral antibiotic therapy, the start of ≥1 month of intravenous antibiotic therapy, or both (2). HLA-DR alleles were determined by Dr. Lee Ann Baxter-Lowe (Immunogenetics and Transplantation Laboratory, University of California, San Francisco) using high resolution molecular typing with sequence-specific oligonucleotide probes, automated sequencing, and sequence-specific priming (25).
Synovial Tissue Preparation-Reagents were obtained from Sigma (St. Louis, MO). Fat was removed from the tissue by dissection, and the tissue was frozen in Dulbecco's phosphate-buffered saline with 10% dimethyl sulfoxide (RA2) or with fetal calf serum (RA1, LA1, and LA2) at -124°C. For analysis, 8-10 g of tissue was thawed and processed using a Polytron tissue homogenizer (Kinematica, Bohemia, NY) on ice with 2.5 ml of lysis buffer (150 mM NaCl, 20 mM Tris-HCl, pH 8.0, 5 mM EDTA, 0.5 mM 4-(2-aminoethyl)benzenesulfonyl fluoride hydrochloride, 10 µg/ml leupeptin, 10 µg/ml pepstatin A, 5 µg/ml aprotinin)/ml of tissue. The detergent CHAPS was then added to 1% final concentration, and the extract was incubated at 4°C for 1 h. Insoluble material was removed by centrifugation at 800 × g for 5 min and 27,000 × g for 30 min.
Immunoaffinity Purification of HLA-Peptide Complexes-The procedures used for MHC peptide isolation were similar to those reported recently by Fissolo et al. (26) for their isolation of peptides presented by soluble MHC I and II molecules obtained from lesions in the central nervous systems of multiple sclerosis patients. Water (Honeywell Burdick and Jackson, Morristown, NJ) and acetic acid (Mallinckrodt Baker) were HPLC grade. CNBr-activated Sepharose 4 fast flow beads (GE Healthcare), a highly cross-linked preactivated matrix produced by reaction of cyanogen bromide with Sepharose beads, were used for coupling of the HLA-DR-specific antibody (L243; 4 mg) through multipoint covalent linking of -NH2 groups in the antibody. Before antibody coupling, the Sepharose beads were prehydrolyzed for 3 h in coupling buffer (0.1 M NaHCO3, 0.5 M NaCl, pH 8.3) to reduce the number of reactive groups on the beads, as excess covalent coupling of antibody to beads can reduce antibody binding to HLA-DR complexes through steric hindrance or loss of antigen-binding sites. After prehydrolysis, antibody was incubated with the beads for 2 h followed by a 2-h incubation with blocking buffer (0.015 g/ml glycine in coupling buffer) to block any unreacted sites. Preclear beads were prepared identically but without antibody.
Lysates were incubated with Preclear beads for 4 h at 4°C and then with L243-conjugated beads overnight at 4°C. Beads were washed 4 × 15 min and 1 × 30 min with 10 ml of 1% CHAPS-containing lysis buffer and then 4 × 15 min and 1 × 40 min with 10 ml of 20 mM Tris-HCl, pH 8, 150 mM NaCl (wash buffer 1). Beads were transferred to a column (5 ml, Handee, Pierce) and washed with 50 ml of wash buffer 1 and 50 ml of 20 mM Tris-HCl, pH 8 by gravity flow. Residual buffer was removed by centrifugation for 1 min at 800 × g. Peptides were eluted by 3 × 5-min incubations at RT with 0.5 ml of 10% acetic acid followed by centrifugation for 1 min at 800 × g to collect eluates. Proteins were removed using 10,000 molecular weight cutoff filters (Microcon, Millipore, Billerica, MA). Eluates were dried under vacuum and stored at -20°C.
LC-MS/MS-All solvents were LC-MS grade (Fisher Optima). Trifluoroacetic acid (TFA) and formic acid (FA) ampoules were from Pierce. Solvent A was 0.1% FA in water. Solvent B was 0.1% FA in acetonitrile (ACN). Peptides were desalted prior to LC-MS/MS using C18 tips (OMIX, 100 µl; Varian, Palo Alto, CA) and eluted with 100 µl of 50% ACN, 0.1% TFA. The eluate ACN concentration was reduced by vacuum concentration of the eluate to ~20 µl followed by addition of 75 µl of 0.1% FA. For LC-MS/MS, a 3-9-µl aliquot of this solution was used. A CapLC HPLC system (Waters) was used for LA1, LA2, and RA1, and a Waters nanoACQUITY UPLC system was used for RA2. Both systems ran 500 nl/min at the column and used 180-µm i.d. trap and 100-µm i.d. analytical columns. The gradient was +0.5% B/min from +2-35% B and 2% B/min from 35-85% B for both the CapLC and nanoACQUITY systems except that the gradient started at 3% B for the nanoACQUITY. The nanoelectrospray source was a polyether ether ketone tee (Microtight; Upchurch Scientific, Oak Harbor, WA) fitted with a platinum electrode (World Precision Instruments, Sarasota, FL) and a 10-µm fused silica emitter (360-20-10-6.25 uncoated; New Objective, Woburn, MA). Sensitivity and resolution of the two LC systems were found to be comparable by using a reference digest (enolase, Waters).
HPLC-tandem mass spectrometry analyses were performed on a Sciex QSTAR Pulsar i quadrupole-orthogonal-TOF mass spectrometer (Applied Biosystems, Framingham, MA) with a nanospray stage (Proxeon, Odense, Denmark) and the CapLC HPLC system (Waters) with the conditions listed above. Data were acquired in information-dependent acquisition mode using Analyst 1.0. Survey scans covered the range m/z 350-1600, and the five most abundant ions (2+ to 5+) were selected for MS/MS. MS/MS spectra (m/z 70-1600) were obtained using nitrogen as the collision gas with rolling collision energy. Ions selected for MS/MS were dynamically excluded for 75 s. Data processing was performed using AnalystQS 1.1. All spectra files were recalibrated using high confidence peptides identified during preliminary database searches. Average mass accuracy was 2.1 ppm with a standard deviation of 7.7 ppm. The results from two LC-MS/MS runs were combined for each sample.
Analysis of post-translational modifications was conducted using accurate precursor mass measurements and collision-induced dissociation on an LTQ-Orbitrap Discovery mass spectrometer (Thermo Fisher Scientific, Waltham, MA) that was fitted with a TriVersa NanoMate nanoelectrospray source (Advion, Ithaca, NY) and used the nanoACQUITY UPLC system with the columns and solvent gradient described above. Mass resolution was 60,000, and mass accuracy was within 2 ppm using external calibration against a mixture of peptide standards (Bruker Daltonics, Billerica, MA). Survey and high resolution MS scans were performed in the Orbitrap over the range m/z 300-2000. The five most intense ions were selected for CID MS/MS in the linear ion trap with a minimum signal of 8,000, an isolation width of 4 m/z, and a normalized collision energy of 35%. Singly charged ions were rejected, and dynamic exclusion was applied to selected ions for a duration of 75 s and a repeat count of 2. The automatic gain control target was set to 30,000 for the ion trap and 2,000,000 for the Orbitrap. The RAW file was converted to Mascot generic file format with Mascot Daemon 2.2.2 (Matrix Science, London, UK). Mascot 2.2.0 (Matrix Science) was used for database searching with a precursor tolerance of 5 ppm and a fragment ion tolerance of 0.5 Da.
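The mass accuracy and precursor tolerance figures quoted above are relative errors expressed in parts per million. The following minimal sketch shows how such an error could be computed and checked against a tolerance; the function names and example values are illustrative assumptions and are not taken from the software used in the study.

```python
def ppm_error(observed: float, theoretical: float) -> float:
    """Signed relative mass error in parts per million (ppm)."""
    return (observed - theoretical) / theoretical * 1e6


def within_tolerance(observed: float, theoretical: float, tol_ppm: float = 5.0) -> bool:
    """True when the absolute ppm error does not exceed tol_ppm.

    The 5 ppm default mirrors the precursor tolerance quoted in the text;
    everything else here is a hypothetical helper.
    """
    return abs(ppm_error(observed, theoretical)) <= tol_ppm


if __name__ == "__main__":
    # Illustrative values only, not measurements from the study.
    print(round(ppm_error(1000.002, 1000.0), 3))
    print(within_tolerance(1000.010, 1000.0))
```

Because the error is a ratio, the same calculation applies whether the inputs are neutral masses or m/z values at the same charge state.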
Protein Database Searching-Spectra were searched against version 3.41 of the International Protein Index human database, which contained 72,155 protein sequences (European Bioinformatics Institute, www.ebi.ac.uk) concatenated with a randomized (decoy) version of the same database generated by the Perl script decoy.pl (Matrix Science). Data processing was performed using AnalystQS 1.1. Mass lists in Mascot generic format were obtained using Mascot.dll version 1.0.0.23. The results from two LC-MS/MS runs were combined for each sample.
Database searches were conducted with Mascot 2.2.0 (Matrix Science), OMSSAcl 2.1.1.win32 (NCBI), and X!Tandem TORNADO win32-08-02-01-1 (The Global Proteome Machine Organization, www.thegpm.org). Searches were conducted with no fixed modifications but with variable modifications for oxidized methionine, pyroglutamic acid from amino-terminal glutamine and glutamic acid, and deamidation of glutamine and asparagine. Separate database searches were conducted to identify additional post-translationally modified peptides, including O-N-acetylglucosamine at Ser or Thr; acetylation at Lys or the amino terminus; amidation at the carboxyl terminus; Arg to citrulline; S-cysteinylation; S-cysteinylglycine; S-glutathionylation; glycation at Lys or the amino terminus; S-homocysteinylation; 4-hydroxy-5-nonenal at Cys, Lys, or His; malondialdehyde at Lys; methylation at Lys or Arg; oxidation at Cys; and phosphorylation at Ser, Thr, or Tyr. All identifications were manually verified.
Because cystine in plasma may be a source for cysteinylation of peptides, we were concerned that the processing of the RA1, LA1, and LA2 samples in the presence of serum may have led to artifactual cysteinylation of the peptide found in patient LA1. To exclude this possibility, the sample from patient RA2, the last one analyzed, was not processed in the presence of serum, and it was this sample in which most of the cysteinylated peptides were identified (see "Results" and "Discussion").
Search Result Filtering by Consensus Matching-Each spectrum was searched against the International Protein Index database using three search programs: Mascot, OMSSA, and X!Tandem. Score cutoffs for individual database searches were as follows: Mascot score of 20 or higher, OMSSA e-value of 0.001 or less, and X!Tandem e-value of 0.2 or less. Spectrum lists were obtained from Spectrum List View of OMSSA browser 2.1.1 (NCBI). A Microsoft Access query was run to associate peptide-to-spectrum matches from the three programs to searched spectra. For each spectrum searched, a peptide-to-spectrum match was accepted only when two or three of the search programs assigned the identical sequence (consensus match) to that spectrum. The false discovery rate was calculated as follows: ((2 × decoy)/(target + decoy)) × 100, where target is a match to the intact database and decoy is a match to the randomized database.
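The filtering logic described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the Microsoft Access query used in the study: the program keys ("mascot", "omssa", "xtandem"), the function names, and the example sequences are hypothetical, whereas the score cutoffs and the false discovery rate formula are those given in the text.

```python
from collections import Counter
from typing import Dict, Optional


def passes_cutoff(program: str, score: float) -> bool:
    """Apply the per-program cutoffs quoted in the text."""
    if program == "mascot":
        return score >= 20        # Mascot ion score, higher is better
    if program == "omssa":
        return score <= 0.001     # OMSSA e-value, lower is better
    if program == "xtandem":
        return score <= 0.2       # X!Tandem e-value, lower is better
    raise ValueError(f"unknown search program: {program}")


def consensus_sequence(assignments: Dict[str, str]) -> Optional[str]:
    """Return the sequence assigned by at least two programs, else None.

    `assignments` maps a program name to the peptide sequence it assigned
    to one spectrum (only assignments that already passed their cutoff).
    """
    if not assignments:
        return None
    sequence, votes = Counter(assignments.values()).most_common(1)[0]
    return sequence if votes >= 2 else None


def fdr_percent(target: int, decoy: int) -> float:
    """False discovery rate (%) = ((2 x decoy) / (target + decoy)) x 100."""
    return (2 * decoy) / (target + decoy) * 100


if __name__ == "__main__":
    # Hypothetical assignments for a single spectrum.
    hits = {"mascot": "PEPTIDEK", "omssa": "PEPTIDEK", "xtandem": "PEPTLDEK"}
    print(consensus_sequence(hits))
    print(fdr_percent(target=996, decoy=4))
```

Requiring agreement between independent scoring schemes trades some sensitivity for specificity, which is consistent with the low false discovery rate reported in "Results."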
Predicted HLA-DR Binding of Identified Peptides-All peptide sequences were screened with the bioinformatics program TEPITOPE to predict anchor residues and the likelihood of HLA-DR binding of one or more of 25 different HLA-DR molecules (27). The binding potential of each peptide was assessed at a threshold ranging from 1 to 10 (the most stringent to the least stringent score).
Statistics-The identity of groups was compared by χ² tests. All p values are two-tailed. A p value ≤0.05 was considered statistically significant.
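For readers who want to reproduce this kind of comparison, the sketch below computes a Pearson χ² statistic and its p value for a 2 × 2 table using only the standard library. It assumes no continuity correction, which the text does not specify, and the counts in the example are invented for illustration; they are not data from this study.

```python
import math
from typing import Tuple


def chi_square_2x2(a: int, b: int, c: int, d: int) -> Tuple[float, float]:
    """Pearson chi-square test (1 degree of freedom) for the table [[a, b], [c, d]].

    Returns (chi2, p). No Yates continuity correction is applied; that is an
    assumption of this sketch, not something stated in the paper.
    """
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("every row and column must have a non-zero total")
    chi2 = n * (a * d - b * c) ** 2 / denominator
    # Survival function of the chi-square distribution with 1 df.
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


if __name__ == "__main__":
    # Hypothetical counts: shared vs. non-shared epitopes in two patient pairs.
    chi2, p = chi_square_2x2(10, 20, 20, 10)
    print(f"chi2 = {chi2:.3f}, p = {p:.4f}")
```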

RESULTS
Patients-The two patients with chronic, symmetrical polyarthritis (RA1 and RA2) met clinical criteria for RA (22); neither tested positive for either rheumatoid factor or anti-CCP antibodies. Patient RA1, a 70-year-old woman, had the onset of arthritis at age 4; her arthritis persisted throughout adulthood. Her synovial tissue was obtained when a hip prosthesis was revised 66 years after disease onset. Patient RA2, a 37-year-old woman, had a suboptimal response to therapy with disease-modifying anti-rheumatic drugs (DMARDs), and one knee remained markedly inflamed. An arthroscopic synovectomy of that joint was performed 3 years after disease onset.
The two patients with antibiotic-refractory LA (LA1 and LA2), a 12-year-old boy and a 43-year-old man, met the Centers for Disease Control and Prevention criteria for Lyme disease (23). Their synovitis, which affected one knee, persisted for more than 1 year after 3-4 months of oral doxycycline and intravenous ceftriaxone or penicillin therapy. Therefore, they underwent arthroscopic synovectomies. Consistent with results in previous patients (2,28), their joint fluid and synovial samples obtained at synovectomy had negative PCR results for B. burgdorferi DNA and negative cultures for B. burgdorferi. Thus, their synovitis persisted after apparent spirochetal killing with antibiotic therapy but was treated successfully with synovectomy.
Although the four patients all had different HLA-DR genotypes (Table I), both patients with antibiotic-refractory LA and one with RA (RA2) had DRB1 alleles (0101, 0401, or 1001) associated with these diseases (6,7). The other RA patient (RA1) had the DRB1*0402 allele (verified by sequencing). This allele has been reported to protect against severe RA (8), but she had had severe, erosive arthritis for 66 years.
Number of Peptides and Source Proteins Identified from HLA-DR-Peptide Complexes-Altogether, 7,260 MS/MS spectra were obtained from the synovial samples of the four patients (Table I). Of the 7,260 spectra, 1,427 (20%) had a consensus match with at least two of the three search programs (Mascot, OMSSA, and X!Tandem), and 262 of these corresponded to non-redundant sequences; 2,258 (31%) of the peptide assignments were obtained with one of the three programs. Search results provided by the three programs are included in the supplemental material pages 3-84 and supplemental Excel file E1. The 262 epitopes identified by consensus match originated from 166 different source proteins. Using a target-decoy database strategy (29), the number of false-positive hits for the 1,427 spectra was estimated to be between 0 and 3 per sample; this represents an overall false discovery rate of 0.6%, which indicates a high degree of accuracy for peptide identification. Because peptides identified by consensus match generate a greater level of confidence, these peptides were taken as the focus for the subsequent analysis, the results of which are presented here.
Source Proteins for HLA-DR-presented Peptides-The peptides presented by HLA-DR molecules in the synovia of the four patients were derived from 166 source proteins, including intracellular, membrane, extracellular matrix, and plasma proteins that have a wide range of biologic functions (Fig. 1, A and B, and Table II). When both primary and secondary locations are considered, it becomes apparent that 64-87% of the proteins in each sample could be found in plasma and presumably in synovial fluid exudates. The selectivity of our results is indicated by the fact that the list of source proteins for the MHC-presented peptides differs from the reports of Dasuri et al. (30) who used fibroblast-like synoviocytes to analyze the overall synovial proteome and Tilleman et al. (31) who analyzed the proteome of inflamed synovial tissue from osteoarthritis and RA patients. In addition, it is notable that a total of 44% of the 166 synovial source proteins had been identified previously as source proteins for HLA-DR-peptides in at least one other type of human tissue, including thyroid, kidney, spleen, or BAL cells (17, 19-21).
Of the 166 synovial source proteins, five have been associated with autoantibodies in RA. These autoantigens include vimentin, fibrinogen, fibronectin, and collagen, which may induce anti-CCP antibodies, and immunoglobulin, an autoantigen that is recognized by rheumatoid factors (32,33). The HLA-DR-peptides from these source proteins (Table VI and supplemental Tables 1-4) were found in one or both patients with RA or LA. For example, peptides from five types of collagen (type I, α-1; I, α-2; V, α-2; XI, α-1; and XIV, α-1) were identified in the synovial sample of patient RA2 (Table VI), whereas peptides from only a single collagen type were identified in the RA1 sample (type I, α-1) (supplemental Table 1) or the LA1 sample (type XII, α-1) (supplemental Table 3), and no collagen peptides were identified in the LA2 sample.
General Characteristics of HLA-DR-eluted Peptides-The number of peptides identified for a given core sequence (the nine amino acids bound in the HLA-DR-peptide-binding groove) ranged widely from those identified only once to the 44 overlapping sequences for one epitope of serum albumin that were identified in the sample from patient LA1. The HLA-DR-eluted peptides ranged in length from 9 to 26 amino acids, but the majority (59%) had a length of 14-16 residues (Fig. 2A), which is common for HLA-DR-presented peptides. Furthermore, proline residues were over-represented near the amino and carboxyl termini of the peptides, specifically in the N2 and C2 positions (Fig. 2B). These may serve as anchor positions as suggested by Rammensee (34). Approximately 20% of the peptides had proline residues in these positions, whereas 6% would be expected at random based on the frequency of proline in the human protein database (www.ebi.ac.uk, Integr8 release 88). Supplemental Tables 1-4 present lists of all identified, non-redundant, HLA-DR-eluted peptides and their source proteins for each of the four patients.
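The proline tally described above can be illustrated with a short sketch. It assumes that N2 denotes the second residue from the amino terminus and C2 the second residue from the carboxyl terminus, which is an interpretation rather than a definition given in the text; the function names and the example sequences are hypothetical and are not peptides from this study.

```python
from typing import Iterable


def has_terminal_proline(peptide: str) -> bool:
    """True when proline occupies the assumed N2 or C2 position.

    N2 is taken as the second residue from the amino terminus and C2 as the
    second residue from the carboxyl terminus (an assumption of this sketch).
    """
    if len(peptide) < 2:
        return False
    return peptide[1] == "P" or peptide[-2] == "P"


def fraction_with_terminal_proline(peptides: Iterable[str]) -> float:
    """Fraction of peptides carrying proline at the assumed N2 or C2 position."""
    peptide_list = list(peptides)
    if not peptide_list:
        return 0.0
    hits = sum(has_terminal_proline(p) for p in peptide_list)
    return hits / len(peptide_list)


if __name__ == "__main__":
    # Made-up sequences for illustration only.
    examples = ["APKLMNQRSTVW", "GHIKLMNQRSPA", "ACDEFGHIKLMN", "PACDEFGHIKLP"]
    print(fraction_with_terminal_proline(examples))
```

Comparing this observed fraction with the background frequency of proline in the reference proteome gives the kind of enrichment estimate quoted in the paragraph.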
TEPITOPE (27), one of the most accurate prediction algorithms for MHC class II molecules (35), predicts peptide-binding registers for the nine core amino acids bound within the MHC groove for each of 25 HLA-DR molecules. The threshold setting assesses the probability of epitope binding; it can be set anywhere from most stringent (1) to least stringent (10). Only one patient (LA2) had an allele (DRB1*1001) that was not modeled in the algorithm. Using TEPITOPE, almost all of the 1,427 peptides (99%) identified by two or more of the mass spectrum search programs were predicted to bind at least one of the associated patient's HLA-DR molecules at a threshold score of ≤10, and nearly half of the peptides from each patient were predicted to bind that patient's HLA-DR molecules at the most stringent score of 1 (data not shown). At a threshold setting of ≤3, a setting associated with low false-positive frequency (27), about three-quarters of the peptides from each patient (65-78%) were predicted to bind that patient's HLA-DR molecules (Fig. 3).
Epitopes in Common According to Patients' Disease or HLA-DR Type-Among the four patients, the majority of epitopes identified in a given synovial sample were found only in that sample. Only three epitopes (derived from collagen type I, α-1; plexin domain-containing protein 2; and tumor necrosis factor (TNF)-inhibitory protein) were identified exclusively in the two RA patients, and only two epitopes (derived from epididymal secretory protein E1 and Rho GDP dissociation inhibitor 1β) were identified exclusively in the two antibiotic-refractory LA patients (Table III). At the other end of the range, two epitopes (one derived from α2-macroglobulin and the other derived from complement C3) were found in all four patients' synovia, and eight other epitopes were identified in three patients' samples.
Six of these 10 commonly identified epitopes were predicted to bind more than half of the 25 DR molecules modeled in the TEPITOPE program at a threshold of ≤3, suggesting that these epitopes have promiscuous binding behavior (Table III).
In general, seven to 11 epitopes (8-12%) were mutually shared between any two patients' samples (Table IV). The only exception was patients RA2 and LA1 who had different forms of arthritis but the same HLA-DR allele (DRB1*0101). They had 38 (37%) mutually shared epitopes (Tables IV and V), which was a significantly larger number than other patient pairs (p ≤ 0.002). Moreover, of these 38 peptides, 27 (71%) were predicted to bind the DRB1*0101 molecule at the most stringent threshold setting of 1, and 36 of the 38 peptides (95%) were predicted to have a specific binding register that could be utilized by both the DRB1*0101 and -0401 molecules (data not shown). Furthermore, these two patients (RA2 and LA1) had a greater percentage of the same source proteins than patients with different alleles (46 versus 21-29%, p = 0.024) (Table IV). This is presumably due to the binding specificity of the patient's DR molecules and not to a different pool of source proteins entering the HLA-DR pathway. Patient RA2, who had the HLA-DRB1*0401/0101 alleles, had 19 epitopes identified from five types of collagen and from fibrinogen, fibronectin, vimentin, and immunoglobulin (Table VI), which are thought to be potential source proteins for anti-CCP antibodies or rheumatoid factors (33). Almost all of these peptides were predicted to be bound well by the DRB1*0401 molecule and, in some instances, by the DRB1*0101 molecule. A few of these epitopes were also identified in the sample from patient RA1 or in the samples from the patients with Lyme arthritis. The sequence locations of the epitopes on the most highly represented proteins showed that the presented peptides originated from narrow regions of the proteins rather than being uniformly distributed (Fig. 4).
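The pairwise comparison of epitopes between patients amounts to intersecting sets of core sequences. A minimal sketch follows, assuming each patient's epitopes have already been reduced to nine-residue core strings; the function name and the placeholder cores are hypothetical, and the sketch does not reproduce how the percentages in Table IV were normalized.

```python
from itertools import combinations
from typing import Dict, Set, Tuple


def shared_epitopes(cores_by_patient: Dict[str, Set[str]]) -> Dict[Tuple[str, str], Set[str]]:
    """Map each (alphabetically ordered) patient pair to the cores found in both."""
    return {
        (first, second): cores_by_patient[first] & cores_by_patient[second]
        for first, second in combinations(sorted(cores_by_patient), 2)
    }


if __name__ == "__main__":
    # Placeholder core sequences, not epitopes from the study.
    cores = {
        "RA2": {"AAAAAAAAA", "CCCCCCCCC", "DDDDDDDDD"},
        "LA1": {"AAAAAAAAA", "CCCCCCCCC"},
        "LA2": {"AAAAAAAAA"},
    }
    for pair, common in shared_epitopes(cores).items():
        print(pair, len(common))
```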

Search for Peptides with Post-translational Modifications-The tandem mass spectra were searched for 14 different post-translational modifications (see "Experimental Procedures"). Because autoantibodies to citrullinated proteins are found in about 60% of patients with RA (36-38) and because each patient, particularly patient RA2, had peptides identified from source proteins that may undergo citrullination, the spectra were searched carefully for the 0.984-Da increase in mass that corresponds to the post-translational conversion of arginine to citrulline. However, citrullinated peptides were not found in any of the four patient samples; this result is consistent with the fact that no patient had anti-CCP antibodies.
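The 0.984-Da shift follows directly from the elemental change when arginine is converted to citrulline (net gain of one oxygen and loss of one nitrogen and one hydrogen). The sketch below derives it from monoisotopic atomic masses and converts it to the expected m/z shift for a given charge state; the constant and function names are illustrative assumptions.

```python
# Monoisotopic atomic masses in Da.
MONOISOTOPIC = {"H": 1.00782503, "N": 14.00307401, "O": 15.99491462}

# Arg -> citrulline: the guanidino =NH is replaced by =O (net +O, -N, -H).
CITRULLINATION_DELTA = MONOISOTOPIC["O"] - MONOISOTOPIC["N"] - MONOISOTOPIC["H"]


def mz_shift(delta_mass: float, charge: int) -> float:
    """Expected shift in m/z for a mass change observed on an ion of given charge."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return delta_mass / charge


if __name__ == "__main__":
    print(f"{CITRULLINATION_DELTA:.4f} Da")
    for z in (2, 3, 4):
        print(z, f"{mz_shift(CITRULLINATION_DELTA, z):.4f}")
```

Deamidation of asparagine or glutamine produces the same elemental change and therefore the same mass shift, so mass alone cannot distinguish the two; the modified residue has to be localized from the fragment ions, which is one reason manual verification of such spectra matters.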
Instead, in patient RA2, 10 peptides were identified with the post-translational modification of S-cysteinylation (Table VII). These 10 peptide sequences were derived from four source proteins: cathepsin S, vitronectin, microfibril-associated glycoprotein 4, and von Willebrand factor. The cathepsin S peptides were modified with a disulfide-linked cysteine on the carboxyl-terminal cysteine residue. The structural assignments indicated in the schemes above the two spectra in Fig. 5 allow determination of the nature of the modification and clearly define its location; comparison of the two spectra indicates the presence of an additional amino-terminal leucine residue in the peptide whose CID MS/MS spectrum is shown in Fig. 5B. The three S-cysteinylated peptides derived from cathepsin S were also found in the sample from patient LA1. Peptides from the other three proteins were modified on internal cysteine residues. Tandem mass spectra for all modified peptides are included in the supplemental material pages 75-84.

DISCUSSION
Using LC-MS/MS, we identified 1,427 HLA-DR-presented peptides (220-464 per patient) derived from 166 source proteins in synovial tissue from two patients each with RA or antibiotic-refractory LA. Such studies have marked clinical and technical hurdles. Clinical challenges include difficulty in obtaining large amounts of synovial tissue, patient heterogeneity even among patients who meet criteria for the same disease, and selection of control groups.
Large amounts of synovial tissue, the target tissue in chronic inflammatory forms of arthritis, can only be obtained from patients who undergo surgical procedures because of inadequate responses to medical therapy, a narrow spectrum of the total patient population. Even among patients who meet clinical criteria for RA or antibiotic-refractory LA, patient heterogeneity still exists, and several HLA-DR alleles contribute to disease severity (1,9). Ideally, one would like to compare the HLA-DR-peptide repertoire of inflamed synovia with that of normal synovial tissue. However, normal synovium is microscopically thin, and expression of HLA-DR molecules is considerably less than that in inflamed synovium, making it unsuitable for the type of analysis done here. Therefore, we chose to compare synovial samples from patients with two different types of inflammatory arthritis, RA and antibiotic-refractory LA. This allowed us to begin to identify possible disease-specific or promiscuous epitopes and to compare the peptide repertoires of patients with the same or different HLA-DR alleles. Although an analysis of four patients may seem small, this study gives the first description of peptides presented by HLA-DR molecules in synovia of patients with chronic inflammatory arthritis, and it reports the largest number of HLA-DR-peptides identified in any human disease prior to the submission date of this report.
The technical challenges of HLA-DR-peptide identification in patients' samples include the inability to purify all HLA-DR-peptide complexes from the mixture of cells in synovial tissue, the inability of LC-MS/MS to ionize or select all peptides, and the inability of the search programs to match all spectra to peptide sequences due to information-poor CID spectra that contain few significant peaks (e.g., spectra recorded with low signal intensities, interference from co-eluting species, or unanticipated post-translational modifications). Consequently, a peptide may be present in a mixture but not identified, and therefore, it is currently possible to identify only a limited number of the HLA-DR-presented peptides present in patients' tissues (39). Despite these challenges, LC-MS/MS instruments and computer search programs are steadily improving. In this study, 220-464 peptides per patient were identified; this is on average 10-fold more than in previous studies (17-21). Furthermore, the validation of the peptide assignments was rigorous: three different search programs were used, and individual spectrum-to-peptide matches were only accepted when at least two search programs gave a consensus match. The rate of false-positive matches was very low, and all spectrum assignments of HLA-DR-peptides were manually verified. In support of the accuracy of this approach, the majority of peptides identified were of the usual length for peptides presented by HLA-DR molecules (21), a number contained a putative proline-processing motif (40), and most were predicted to be bound by the appropriate patient's HLA-DR molecules. Thus, we are confident that the peptides were presented by HLA-DR molecules in inflamed synovia.
a Amino acid sequence of HLA-DR-bound epitopes. Because the peptide flanking regions varied, the longest identified peptide is shown. A capitalized superscripted letter after a peptide indicates that the HLA-DR-bound peptide was also identified in other tissues: "T" for thyroid (21), "B" for bronchoalveolar lavage cells (20), and "K" for kidney (19). Tandem MS/MS spectra obtained for the listed peptides are presented in supplemental material pages 3–17.
b Number of times a peptide sequence was identified in the patients' samples.
c Of the 25 HLA-DR molecules modeled in TEPITOPE, the number of molecules predicted to present the peptide at a threshold setting of ≤3.

HLA-DR molecules are highly expressed on a variety of cell types in inflamed synovia, including synoviocytes and infiltrating cells such as macrophages, dendritic cells, and T and B cells. Because synoviocytes are the most numerous cell type in the synovial lesion, the majority of the HLA-DR-presented peptides identified in this study probably came from synoviocytes, although they could have come from any of these cell types. The 1,427 peptides identified here were derived from 166 source proteins, including constituents of plasma present in synovial fluid exudates and intracellular and membrane-bound proteins. This list of proteins was quite different from the source proteins of HLA-DR-presented peptides in cultured cell lines (41,42), a difference that likely reflects the proteins present in the culture medium or released through the natural turnover of cells in the cell line. Therefore, the identification of pathogenic T cell epitopes requires that these analyses be performed using human target tissues, as cell lines grown in culture medium do not give a true representation of the in vivo HLA-DR-peptide repertoire.
Plasma or synovial fluid proteins likely entered the HLA-DR pathway of APCs through phagocytosis or pinocytosis of synovial exudates, whereas intracellular or membrane-bound proteins likely entered due to phagocytosis of apoptotic cells or possibly by autophagy (43). It appears that the endosomal cathepsins, which are important for antigen processing in APCs (44,45), may also be degraded and converted to antigens that are presented on the cell surface. Thus, the proteins that enter the HLA-DR pathway of APCs in inflamed synovia reflect the wide range of proteins found in the intracellular and extracellular milieu of the joint.
a Because the identified peptides varied in length, only the longest peptide is shown. Amino acid residues in bold print indicate the TEPITOPE-predicted P1 binding sites for HLA-DRB1*0101. Because multiple binding registers were predicted for many peptides, only the register with the highest likelihood of binding is indicated. Tandem MS/MS spectra obtained for the listed peptides are presented in supplemental material pages 18–55.
b Number of times a peptide sequence was identified in the patients' samples.

a Because the identified peptides varied in length, only the longest peptide is shown. Predicted P1 residues are bold for 0401 and underlined for 0101. Because multiple binding registers were predicted for many peptides, only the register with the highest likelihood of binding is indicated. For some peptides, a binding register to only one of the patient's HLA-DR molecules was predicted. Tandem MS/MS spectra obtained for the listed peptides are presented in supplemental material pages 56–74.
b Number of peptides identified in the patient's sample. Because of peptide length variability, not all peptides contained the predicted nine-amino acid core sequence for each HLA-DR molecule. Therefore, where the two numbers differed, 0401 is bold, and 0101 is underlined.
c The TEPITOPE binding score for the P1 site indicated.

For the most part, the synovial HLA-DR-peptide repertoire identified here was different in each of the four patients, presumably because the patients generally had different HLA-DR alleles and because only a minority of the HLA-DR-presented peptides could be identified in each patient. Although a small number of possibly disease-specific or promiscuous epitopes were detected, two patients with different diseases (RA2 and LA1) who had the same HLA-DR allele (HLA-DRB1*0101) had the largest number of epitopes in common (37%). Moreover, TEPITOPE predicted that most of these epitopes would have a common binding register that could be utilized by both the DRB1*0101 and -0401 molecules. Thus, these data begin to define the HLA-DR epitope repertoire in inflamed synovia in patients who have alleles that are characteristic of RA or antibiotic-refractory LA (7–10). Moreover, when we compared our patients' synovial HLA-DR repertoires with those reported in previous studies of spleen, thyroid, BAL, or kidney, 26% of the synovial HLA-DR-presented peptides had a core sequence identical to that of peptides in these other tissues, and most (73%) were from patients who expressed an HLA-DR allele shared by at least one of our four patients. Although the number of patients tested is still small, these data suggest that epitope selection is dependent primarily on HLA-DR alleles rather than type of disease or tissue. Therefore, a given autoimmune disease is probably perpetuated by a limited selection of possible autologous epitopes. In a study that was published while this manuscript was being revised in response to its initial review, Bassani-Sternberg et al. (46) reached a similar conclusion based on data obtained for peptides presented by soluble HLA molecules in the plasma of cancer patients.
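The comparison of epitope repertoires between patients or tissues discussed above reduces to counting shared core sequences. The sketch below shows one way such an overlap could be tabulated. The core sequences are hypothetical toy strings, and because the text does not state which denominator underlies the reported percentages, the function returns the shared fraction relative to each sample separately; this is an assumption made for illustration, not the calculation used in this study.

```python
def epitope_overlap(cores_a, cores_b):
    """Compare two collections of nine-residue core sequences.

    Returns the shared cores and the shared fraction relative to
    each sample, because the appropriate denominator depends on the
    question being asked.
    """
    set_a, set_b = set(cores_a), set(cores_b)
    shared = set_a & set_b
    return {
        "shared": sorted(shared),
        "fraction_of_a": len(shared) / len(set_a) if set_a else 0.0,
        "fraction_of_b": len(shared) / len(set_b) if set_b else 0.0,
    }


# Hypothetical toy cores, not sequences identified in this study.
patient_x = ["FKAAVLSTA", "YRGSLNEVA", "LQSAVTGKM", "IVNPGSEAL"]
patient_y = ["YRGSLNEVA", "LQSAVTGKM", "WTDKHPAQL"]
result = epitope_overlap(patient_x, patient_y)
```

Comparing core sequences rather than full-length peptides avoids counting the same epitope twice when its flanking regions differ in length.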
Approximately 60% of patients with RA have anti-CCP antibodies (34). This is the first autoantibody identified that seems to be specific for RA (33). Moreover, the presence of anti-CCP antibodies correlates with RA-associated alleles, including DRB1*0401 and -0101 (47–49). Interestingly, patient RA2, who had both of these alleles, had peptides identified from source proteins (vimentin, fibrinogen, fibronectin, and collagen) that may undergo citrullination in RA (32,50). Furthermore, TEPITOPE predicted that the majority of these peptides would have a high probability of being bound by the HLA-DRB1*0401 molecule and sometimes by the HLA-DRB1*0101 molecule. Therefore, this study begins to define T cell epitopes that may be important in the activation of T helper cells involved in the production of high affinity autoantibodies, including anti-CCP antibodies.
a All 10 cysteinylated peptides derived from the four source proteins were identified in patient RA2, and the three peptides from cathepsin S were also found in patient LA2. The predicted P1 contact residues are shown in bold for HLA-DRB1*0401, underlined for DRB1*0101, and italicized for DRB1*1101. Because multiple binding registers were predicted for some peptides, only the register with the highest likelihood of binding is indicated. For some peptides, a binding register to only one of the patient's HLA-DR molecules was predicted. TEPITOPE does not model the binding of peptides with post-translational modifications; therefore, predictions of the P1 contact residues were based upon unmodified sequences.
b The TEPITOPE binding score for the P1 site indicated.

However, a careful search of the LC-MS/MS spectra from the synovium of this patient failed to reveal any spectra that corresponded to the post-translational conversion of arginine to citrulline. It is possible that a B cell epitope of these proteins might undergo citrullination but not the corresponding T cell epitope. On the other hand, consistent with our LC-MS/MS results, she did not have anti-CCP antibodies as determined by ELISA. Instead, 10 peptides were identified from other source proteins with the post-translational modification of S-cysteinylation. This sample was not processed in the presence of serum (which could be a source of cystine that might result in the cysteinylation of peptides), and thus, it is clear that this modification occurred in vivo. S-Cysteinylated peptides have been found among both MHC class I- and class II-presented peptides, and this post-translational modification may alter both peptide binding and presentation to T cells (51–54), leading to novel T cell epitopes. However, it is not yet known whether cysteinylated peptides play a role in autoimmunity in RA. Cysteinylated peptides that have been identified in this study will be synthesized to allow investigation of their behavior.

Regardless of the source protein, post-translational modification, or clinical correlation, it is difficult to predict whether a self-epitope may be immunogenic. This may depend on the concentration of the epitope on the cell surface, the number of autoreactive T cells in inflamed joints, the inflammatory milieu within the joint, or more likely all three factors. As a next step in these investigations, it will be important to test the nonredundant peptides identified from each patient for reactivity with that patient's peripheral blood or synovial fluid mononuclear cells. In addition, to identify the full repertoire of synovial HLA-DR-presented peptides, it will be necessary to analyze synovia from more patients.
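Searching spectra for citrullination or S-cysteinylation, as discussed above, rests on the characteristic mass change each modification adds to a peptide. The sketch below lists the candidate masses one would look for, using standard monoisotopic values (conversion of arginine to citrulline, +0.984016 Da; S-cysteinylation of cysteine, +119.004099 Da). The peptide sequence is a hypothetical example, and the code is only an illustration of the arithmetic, not the search configuration used in this study.

```python
# Standard monoisotopic residue masses (Da).
RESIDUE = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276
CITRULLINATION = 0.984016    # Arg -> citrulline
CYSTEINYLATION = 119.004099  # disulfide-linked cysteine on Cys


def peptide_mass(sequence):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE[aa] for aa in sequence) + WATER


def candidate_masses(sequence):
    """Neutral masses for the unmodified peptide and for forms carrying
    one citrulline or one S-cysteinylated cysteine, where possible."""
    base = peptide_mass(sequence)
    candidates = {"unmodified": base}
    if "R" in sequence:
        candidates["one citrulline"] = base + CITRULLINATION
    if "C" in sequence:
        candidates["one S-cysteinyl"] = base + CYSTEINYLATION
    return candidates


def mz(neutral_mass, charge):
    """m/z of the protonated ion at the given positive charge."""
    return (neutral_mass + charge * PROTON) / charge


# Hypothetical sequence, not a peptide identified in this study.
candidates = candidate_masses("GACSRLK")
```

Because citrullination changes the mass by less than 1 Da, it is much easier to confuse with isotope peaks or deamidation than the roughly 119 Da shift from S-cysteinylation, which is one reason such assignments benefit from manual inspection.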
HLA-DR molecules are a central component of disease pathogenesis in most autoimmune diseases. Knowledge of the naturally presented HLA-DR epitope repertoire in target tissues may allow the identification of pathogenic T cell epitopes and thus could revolutionize therapy. Currently, synthetic DMARDs, such as methotrexate, are effective for the treatment of RA and other forms of chronic inflammatory arthritis, but they result in generalized immunosuppression and may cause a number of side effects. The newer biologic DMARDs, such as the TNF inhibitors, target a specific inflammatory mediator but may still compromise the ability to fight infection. In contrast, the identification of pathogenic T cell epitopes could lead to peptide-specific immunotherapies that may result in deletion, anergy, or immune deviation of disease-specific T effector cells or in the activation of T regulatory cells without compromising other immune functions (55,56).
The data associated with this study may be downloaded from the ProteomeCommons.org Tranche network using the following hash: 88UFG9TOa93L19Ao0fXWbD8zGbkqQLriNR6RbW21JDvpBVQUioDT538InKNmH0Ck4WOeK6gbAlxcP4HuXQQy+HXrnvIAAAAAAAAQbQ==. The hash may be used to prove exactly what files were published as part of this study's data set, and the hash may also be used to check that the data have not changed since publication.

Because the difference between the two peptides is at the amino terminus, the two b-series of fragments exhibit a mass shift consistent with the presence of the additional amino-terminal Leu residue that corresponds to the difference between the two molecular masses, whereas the y-series show common fragments.
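The fragment-ion behavior described above can be reproduced with a short calculation. The sketch below computes singly charged b- and y-ion m/z values for two peptides that differ only by an extra amino-terminal Leu. The sequences are hypothetical stand-ins (the actual peptide pair is not given in this passage), and the residue masses are standard monoisotopic values.

```python
PROTON = 1.007276
WATER = 18.010565
# Standard monoisotopic residue masses (Da) for the residues used here.
RESIDUE = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203,
    "V": 99.06841, "L": 113.08406, "K": 128.09496,
}


def b_ions(sequence):
    """Singly charged b-ion m/z values (b1 .. b(n-1)), built from the N terminus."""
    values, total = [], 0.0
    for aa in sequence[:-1]:
        total += RESIDUE[aa]
        values.append(total + PROTON)
    return values


def y_ions(sequence):
    """Singly charged y-ion m/z values (y1 .. y(n-1)), built from the C terminus."""
    values, total = [], 0.0
    for aa in reversed(sequence[1:]):
        total += RESIDUE[aa]
        values.append(total + WATER + PROTON)
    return values


# Hypothetical pair: the longer peptide carries one extra N-terminal Leu.
short_peptide = "GASVK"
long_peptide = "LGASVK"

# b(k+1) of the longer peptide versus b(k) of the shorter one.
b_shifts = [
    long_b - short_b
    for long_b, short_b in zip(b_ions(long_peptide)[1:], b_ions(short_peptide))
]
# y-ions of the same index contain the same C-terminal residues.
shared_y = list(zip(y_ions(long_peptide), y_ions(short_peptide)))
```

Each entry of `b_shifts` equals the Leu residue mass (113.08406 Da), whereas every pair in `shared_y` is identical, mirroring the shifted b-series and common y-series noted in the text.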