A proteomic profile of synoviocyte lesions microdissected from formalin-fixed paraffin-embedded synovial tissues of rheumatoid arthritis

Rheumatoid arthritis (RA) is a systemic autoimmune disease characterized by chronic inflammation of the synovial joints. Early diagnosis followed by early intervention can result in disease remission; however, both early-stage diagnosis and the provision of effective treatment have been impeded by the heterogeneity of RA, whose pathological mechanisms remain unclear. Despite numerous genomic and proteomic investigations of RA, the proteins that interact in RA synovial tissues, which contain various types of synoviocytes, are not yet sufficiently understood. We therefore conducted an HPLC/mass spectrometry-based exploratory proteomic analysis focusing on synoviocyte lesions laser-microdissected (LMD) from formalin-fixed paraffin-embedded (FFPE) synovial tissues (RA, n = 15; OA, n = 5), with osteoarthritis (OA) tissues used as the control. A total of 508 proteins were identified from the RA and OA groups. Using semi-quantitative comparisons based on spectral counting, namely the spectral index (SpI) and the log2 protein ratio (RSC), together with the statistical G-test, 98 proteins were found to be significantly associated (pair-wise p < 0.05) with the RA synovial tissues. These include stromelysin-1 (MMP3), proteins S100-A8 and S100-A9, plastin-2, galectin-3, calreticulin, cathepsin Z, HLA-A, HLA-DRB1, ferritin, neutrophil defensin 1, CD14 and MMP9. Our results confirmed the involvement of known RA biomarkers such as stromelysin-1 (MMP3) and proteins S100-A8 and S100-A9, and also that of leukocyte antigens such as HLA-DRB1. Network analyses of protein–protein interactions for the proteins significant to RA revealed a dominant participation of the ribosome pathway (p = 5.91 × 10⁻⁴⁵) and, interestingly, an association with p53 signaling (p = 2.34 × 10⁻⁵). The involvement of proteins including CD14 and S100-A8/S100-A9 suggests activation of the NF-κB/MAPK signaling pathway.
Our strategy of laser-microdissected FFPE-tissue proteomic analysis in rheumatoid arthritis thus demonstrated its technical feasibility for profiling proteins expressed in synovial tissues, which may play important roles in RA pathogenesis.

from one's own immune system, but the details of the pathological mechanism are not clear. Recently, genomic and proteomic technologies have dramatically extended our ability to investigate the pathogenic process of RA. A series of reports has compared "fingerprint" profiles using proteomic approaches and has identified some RA-specific proteins, including S100A9/A8, serum amyloid A, galectin, and ubiquitin-proteasome pathway components [9][10][11][12][13][14][15][16][17][18][19][20][21][22][23][24][25][26]. However, most of these studies were conducted with peripheral blood, synovial fluid (SF), or cultured synovial cells from patients with RA, whereas most synoviocytes responsible for the inflammatory joint disorders in RA are found in the whole RA synovial tissue. Only a few studies have focused on the expression profile of such whole RA synovial tissue. Recent advancements in shotgun sequencing and quantitative mass spectrometry for protein analyses could make proteomics amenable to clinical biomarker discovery [27][28][29][30]. Moreover, selective collection of target cells from formalin-fixed paraffin-embedded (FFPE) tissues by laser microdissection (LMD) allows access to tissues of a variety of cell types with a definite diagnosis [31][32][33][34]. Hence, we applied this approach to obtain a proteomic profile of RA from laser-microdissected FFPE synoviocyte lesions, which should help to better understand the molecular mechanisms involved in RA.

Protein candidates characteristic of RA and OA
We identified a total of 508 proteins from the OA and RA samples, among which 165 proteins were unique to RA, 309 proteins were common to both, and only 35 were unique to OA, as shown in Fig. 1a. These proteins were subjected to the Protein ANalysis THrough Evolutionary Relationships (PANTHER) Classification System version 9.0 [35] to highlight their biological processes. As Fig. 1b shows, large differences between RA and OA were found in the following biological processes of the characteristically expressed proteins: 3, localization (GO:0051179); 5, biological regulation (GO:0065007); 6, response to stimulus (GO:0050896); 8, multicellular organismal process (GO:0032501); 9, biological adhesion (GO:0022610); 11, immune system process (GO:0002376). Differential protein expression analysis was performed using the spectral index (SpI) [36] and the fold change of an expressed protein on the base-2 logarithmic scale (RSC) [37], both of which are based on spectral counting. The G-test was used to evaluate differential protein expression in the pair-wise comparison, RA vs. OA [38].
A protein was defined as characteristic of either group if it satisfied p < 0.05 in the pair-wise G-test and RSC > 1 or < −1; under these criteria, 98 proteins were characteristic of RA and 71 proteins of OA among the 508 identified proteins.
The full list of these 169 proteins is given in Additional file 1: Table S4; it contains numerous previously known RA biomarkers, for example stromelysin-1 (MMP3) and proteins S100-A8 and S100-A9. Table 1 lists the 31 proteins expressed in RA and OA with p values < 0.0001 in the G-test and RSC values > 2 or < −2.
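The selection criteria above can be illustrated with a minimal sketch. It assumes that the log2 ratio follows a commonly used spectral-count form with a pseudo-count f; the formula, the value f = 1.25, the sign convention (RA over OA), the function names and all counts are assumptions made for illustration, not values or code from this study.

```python
import math


def rsc(n_ra, n_oa, t_ra, t_oa, f=1.25):
    """Log2 spectral-count ratio of RA over OA for one protein.

    n_ra, n_oa: spectral counts of the protein in each group.
    t_ra, t_oa: total spectral counts of all proteins in each group.
    f: pseudo-count keeping the ratio finite when a count is zero
       (assumed value, not taken from the paper).
    """
    return (math.log2((n_ra + f) / (n_oa + f))
            + math.log2((t_oa - n_oa + f) / (t_ra - n_ra + f)))


def characteristic_group(p_value, rsc_value, alpha=0.05, cutoff=1.0):
    """Return "RA", "OA" or None according to p < alpha and |RSC| > cutoff."""
    if p_value >= alpha:
        return None
    if rsc_value > cutoff:
        return "RA"
    if rsc_value < -cutoff:
        return "OA"
    return None


# Hypothetical counts: 40 spectra in RA vs. 10 in OA, 1000 total spectra each.
value = rsc(40, 10, 1000, 1000)
print(round(value, 2), characteristic_group(0.01, value))
```

Because the ratio is on a log2 scale, the cutoff of 1 corresponds to a two-fold difference in either direction.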
The co-expression of fibromodulin (FMOD) and biglycan (PGS1) observed in the OA group is consistent with a recent study using a genetic mouse model, which suggested that these ECM proteins are essential in regulating chondrogenesis and extracellular matrix turnover in temporomandibular joint (TMJ) osteoarthritis [39]. It has also been reported that asporin, also known as periodontal ligament-associated protein 1 (PLAP1) and a member of the small leucine-rich proteoglycan (SLRP) family, is expressed within the cartilage extracellular matrix (ECM) and has a genetic association with osteoarthritis [40].

Network analysis of candidate proteins
Network analysis of significant proteins is helpful in understanding how these proteins interact with other key proteins and pathways. This study used the significant proteins (n = 98) relevant to RA to develop a predictive network model, which has the potential to be used for further biological investigation. This was done using the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) database [43,44], in which data were obtained from biological functions of local networks surrounding the protein candidates. The STRING network of proteins differentially expressed in RA is shown in Fig. 2, where node proteins of potential importance in RA are indicated by red circles. STRING network enrichment analyses suggested a preferential association of RA with hematopoietic system disease (DOID 74: p = 3.53 × 10⁻¹⁰) and immune system disease (DOID 2914: p = 5.28 × 10⁻⁹). Enrichment analyses of the KEGG pathways indicated that RA was dominantly associated with the ribosome pathway (hsa03010: p = 5.91 × 10⁻⁴⁵). Interestingly, they also indicated that RA may involve protein networks that interact with both p53 signaling (hsa04115: p = 2.34 × 10⁻⁵) and leukocyte transendothelial migration (hsa04670: p = 5.75 × 10⁻⁴). Recent therapeutic interventions for RA have targeted cytokines such as TNF-α, IL-1 and IL-17, which regulate matrix degradation. Numerous studies have shown that anti-TNF agents slow or prevent the progression of bone and cartilage damage in RA, which could be attributed to the suppression of osteoclasts in joint lesions. In addition, it has been reported that the pathogenesis of bone erosions in RA relates to osteoclast-mediated bone resorption, which is regulated by RANKL, the ligand of RANK (receptor activator of nuclear factor (NF)-κB) [45]. RANKL is expressed by a variety of cell types involved in RA, including T-cells and synoviocytes.
NF-κB is activated in the synovium of patients with RA [12,46] and regulates genes including TNFα, IL-6, IL-8, inducible nitric oxide synthase (iNOS) and cyclooxygenase-2 (COX-2), all of which contribute to inflammation. It should be noted that the mitogen-activated protein (MAP) kinases are also regulators of cytokine and metalloproteinase production [47,48]. AKT2, IL6, MAPK1 and TP53 are all associated with the drugs used in RA treatment. It is known that methotrexate (MTX) causes single- and double-strand DNA breaks, which are associated with TP53 [49][50][51]. It is also known that the p53 pathway is affected by bucillamine (Buc), which is mainly used for pain-reduction purposes as part of RA treatment in Japan [52]. Several proteins identified as being specific to the RA group are related to human leukocyte antigens, such as the HLA class I histocompatibility antigens Cw-12 (HLA-C 1C12) and A-33 (HLA-A), and the HLA class II histocompatibility antigen DRB1-4 β chain (HLA-DRB1). Specific human leukocyte antigen (HLA)-DR genes, which reside in the major histocompatibility complex (MHC) and participate in antigen presentation, have been considered to be genetically associated with RA [12,53]. The protein 2B14 is the DRB1-4 β chain, whose third hypervariable region carries the susceptibility epitope; this epitope may also influence the severity of the disease and represents the strongest suggested genetic link between the MHC and RA [54]. S100-A8 and S100-A9 (calgranulins, MRP8 and MRP14) are prominently released by activated macrophages. Inflammatory mediators such as IL-1, tumor necrosis factor (TNF) α or interferon (IFN) γ stimulate macrophages to up-regulate and secrete S100A8/S100A9, which induces proinflammatory responses in leucocytes and endothelial cells [55,56].
Table 1 footnote: Proteins are listed in descending order of SpI value, and "_HUMAN" is removed from UniProtKB entry names.
One of the RA-related proteins identified in this study is CD14 (monocyte differentiation antigen), which is involved in Toll-like receptor signaling [57]. It has been reported that Toll-like receptor (TLR) 4 is the dominant receptor for S100A8 signaling, and that stimulation by S100A8/S100A9 leads to nuclear factor (NF)-κB and MAP kinase (MAPK) signaling [58,59]. S100A8, S100A9 and their heterodimer accumulate in inflammatory fluids, suggesting that they are involved in the pathogenesis of rheumatoid arthritis [60].

Conclusions
In this study, we employed an exploratory proteomic analysis of laser-microdissected FFPE tissues to elucidate protein expression profiles of synoviocyte lesions obtained from RA and OA patients, with the OA samples serving as the control. To the best of our knowledge, this is the first proteomic study to have used FFPE synovial tissues of both RA and OA. Among a total of 508 identified proteins, we found 98 and 71 significant proteins (p < 0.05 and RSC > 1 or < −1) expressed in RA and OA, respectively. The molecular mechanisms leading to RA development involve complex and diverse protein network interactions and thus are not yet completely understood. Identification, quantification and functional characterization of proteins are essential for further understanding RA pathogenesis. Our results confirmed the involvement of known RA biomarkers such as stromelysin-1 (MMP3) and proteins S100-A8 and S100-A9, and also that of leukocyte antigens such as HLA-DRB1. The STRING protein-protein network analysis of RA indicated the dominant participation of the ribosome pathway and, interestingly, highlighted associations with both the p53 signaling and NF-κB/MAPK signaling pathways. We have successfully identified several proteins expressed in RA synovial tissues that may play important roles in RA pathogenesis. These results will help provide additional information about the molecular mechanisms of RA and may improve diagnostic strategies in the future. Thus, laser-microdissected FFPE-tissue proteomic analysis is a technically feasible method in this area, and further research, including exploratory and validation studies in individual patients and in larger populations across multiple locations, should be carried out.

Ethics approval
The study protocol conformed to the principles of the Declaration of Helsinki. All patients provided informed consent, and the study protocol was approved by the ethics committees of both Niizashiki Central General Hospital and Medical ProteoScope Co. Ltd.

Patients' characteristics
Synovial tissue samples were obtained from patients with RA (n = 15) and OA (n = 5) undergoing various orthopedic surgeries (wrist joint, elbow joint, hip joint, knee joint) at Niizashiki Central General Hospital. All patients fulfilled the American College of Rheumatology criteria for the diagnoses of RA and OA [61][62][63]. Table 2 summarizes the patients' characteristics and clinical information.

FFPE tissue sample preparation
The synovial samples were dissected from connective tissues and immediately stored at −80°C until use. Synovial tissues were then surgically removed, fixed with a buffered formalin solution containing 10-15% methanol, and finally embedded in paraffin by a conventional method. Paraffin blocks were cut into 4-μm sections for diagnosis and 10-μm sections for proteomics. The 10-μm sections were stained only with hematoxylin, and the diagnosis was made using the 4-μm sections stained with hematoxylin-eosin (HE) according to the WHO classification.

Laser capture and protein solubilization
Targeted synoviocyte lesions were identified on serial sections of synovial tissues stained with hematoxylin and eosin (HE). For proteomic analysis, a 10-μm thick section prepared from the same tissue block was attached onto DIRECTOR® slides (OncoPlexDx, Rockville, MD, USA), de-paraffinized twice with xylene for 5 min, rehydrated with graded ethanol solutions and distilled water, and stained with hematoxylin. The slides were air-dried and subjected to laser microdissection with a Leica LMD6000 (Leica Microsystems GmbH, Ernst-Leitz-Strasse, Wetzlar, Germany). The DIRECTOR® slide is similar to a standard (uncharged) glass microscope slide, but has an energy transfer coating on one side. Tissue sections are mounted on top of the energy transfer coating, and when the slide is turned over, the tissue faces down under the microdissection system. Cells or tissue areas of interest are targeted on a computer display. The laser energy is converted to kinetic energy upon striking the coating, vaporizing it and instantly propelling the selected tissue features into the collection tube. At least 30,000 cells (ca. 8.0 mm²) were collected directly into a 1.5-mL low-binding plastic tube. Proteins were extracted and digested with trypsin using Liquid Tissue® MS Protein Prep kits (OncoPlexDx, Rockville, MD, USA) according to the manufacturer's protocol. Targeted lesions were laser-microdissected from FFPE synovial tissues as exemplified in Fig. 3.

Table 2 Patients' characteristics and clinical information
The values are mean ± SD unless otherwise indicated.

Liquid chromatography-tandem mass spectrometry
We adopted label-free semi-quantitation using spectral counting by liquid chromatography (LC)-tandem mass spectrometry (MS/MS) for a global proteomic analysis. The digested samples were analyzed in triplicate and in randomized order by LC-MS/MS using reversed-phase liquid chromatography (RP-LC; Paradigm MS4; Michrom Bioresources, USA) interfaced with an LTQ-Orbitrap XL hybrid mass spectrometer (Thermo Fisher Scientific, Bremen, Germany) via a closed nano-electrospray device (ADVANCE Spray Source; AMR Inc., Japan), as described in detail previously [64]. Briefly, the RP-LC system con- column, and the whole columns were developed for 100 min with a linear acetonitrile concentration gradient made from 5 to 35% solvent B (10% distilled water and 90% acetonitrile containing 0.1% formic acid) at a flow rate of 300 nL/min. A 5-μL aliquot (corresponding to 1/10 of the total sample amount) was used for each LC-MS analysis.
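The gradient arithmetic stated above can be checked with a small sketch. It only restates the 5 to 35% solvent B ramp over 100 min and the 300 nL/min flow rate; the function names are hypothetical, and a perfectly linear ramp without delay volume is assumed.

```python
def percent_b(t_min, start=5.0, end=35.0, duration=100.0):
    """Solvent B percentage at time t_min of an ideal linear gradient."""
    if not 0.0 <= t_min <= duration:
        raise ValueError("time outside the gradient window")
    return start + (end - start) * t_min / duration


def eluent_volume_ul(duration_min=100.0, flow_nl_per_min=300.0):
    """Total eluent volume delivered over the gradient, in microlitres."""
    return duration_min * flow_nl_per_min / 1000.0


# Composition at the midpoint of the run and total volume delivered.
print(percent_b(50.0), eluent_volume_ul())
```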
The LTQ was operated in the data-dependent MS/MS mode to automatically acquire up to three successive MS/MS scans in the centroid mode. The three most intense precursor ions for these MS/MS scans were selected from a high-resolution MS spectrum (a survey scan) that the Orbitrap had previously acquired during a predefined short time window in the profile mode at a resolution of 30,000 and a lock mass of m/z 536.1654 in the m/z range of 350-1,500. The sets of acquired high-resolution MS and MS/MS spectra for peptides were converted to single data files, which were merged into Mascot generic format files for database searching.
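The data-dependent selection described above amounts to a "top three" rule on each survey scan, which can be sketched as follows. This is a simplified illustration, not the instrument's acquisition logic: real acquisition software also applies settings such as dynamic exclusion and charge-state filtering, which are omitted here, and the function name and peak values are hypothetical.

```python
def select_precursors(survey_peaks, top_n=3, mz_min=350.0, mz_max=1500.0):
    """Pick the top_n most intense peaks within the survey m/z range.

    survey_peaks: iterable of (mz, intensity) tuples from one survey scan.
    """
    in_range = [peak for peak in survey_peaks if mz_min <= peak[0] <= mz_max]
    ranked = sorted(in_range, key=lambda peak: peak[1], reverse=True)
    return ranked[:top_n]


# Hypothetical survey scan as (m/z, intensity) pairs; values are made up.
scan = [(300.2, 9e5), (421.7, 2e5), (536.2, 5e4), (785.4, 7e5),
        (912.5, 3e5), (1600.0, 8e5)]
precursors = select_precursors(scan)
print([mz for mz, _ in precursors])
```

Peaks outside the 350-1,500 m/z window are ignored even when they are intense, which is why the two strongest hypothetical peaks at the edges are not selected.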

Database search
Mascot software (version 2.2.06, Matrix Science, London, UK) was used for the database search against Homo sapiens entries in the UniProtKB/Swiss-Prot database (release 2012_02, 20413 entries). Peptide mass tolerance was 5 ppm, fragment mass tolerance was 0.5 Da, and up to two missed cleavages were allowed for errors in trypsin specificity. Carbamidomethylation of cysteines was set as a fixed modification, and methionine oxidation and formylation of lysine, arginine and N-terminal amino acids as variable modifications. A p value of < 0.05 was considered significant. Lists of identified proteins were made under the criteria of peptide probability >95%, protein probability >99% and a minimum of two unique peptides, and were then merged into a master file in which the primary accession numbers and entry names from UniProtKB were used. The false positive rates for protein identification were estimated using a decoy database created by reversing the protein sequences in the original database; the estimated false positive rate of peptide matches was 0.2% under protein score threshold conditions (p < 0.001).
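Two of the quantities above are simple ratios, sketched here under stated assumptions. The ppm check restates the 5 ppm precursor tolerance; the decoy estimate assumes the simple ratio of decoy matches to target matches, which may differ from how the search engine reports it. Function names and example numbers are hypothetical.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Mass error of an observed m/z relative to theory, in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(observed_mz, theoretical_mz, tol_ppm=5.0):
    """True if the observed m/z lies within tol_ppm of the theoretical m/z."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm


def decoy_rate(n_decoy_matches, n_target_matches):
    """Estimated false positive rate as decoy matches over target matches
    (assumed definition for this sketch)."""
    if n_target_matches <= 0:
        raise ValueError("need at least one target match")
    return n_decoy_matches / n_target_matches


# Hypothetical example: 2 decoy matches against 1000 target matches.
print(within_tolerance(1000.003, 1000.0), decoy_rate(2, 1000))
```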

Semi-quantitative group-comparison with spectral counting
Mascot search results were processed through Scaffold software (version 3.3.3, Proteome Software, Portland, OR, USA) to semi-quantitatively analyze the differential expression levels of proteins by spectral counting, as described [32]. The number of peptide MS/MS spectra identified with high confidence (Mascot ion score, p < 0.005) was used for calculating spectral counts. Differential protein expression analysis was performed using the spectral index, SpI, which takes into account non-normal distributions and limited replicates and/or sample sizes [36]. SpI takes a value between −1 and 1, and a protein with SpI > 0.4 or < −0.4 is considered significant. RSC > 1 or < −1 corresponds to a fold change > 2 or < 0.5. The G-test was used for evaluating differential protein expression between the paired groups, RA vs. OA [38]. Although the G-test does not require replicates, the spectral counts for each protein from the triplicate analyses were pooled and used for the G-statistic calculation, which improves the performance of the G-test and significantly decreases false positive rates [38]. The calculation used a two-way contingency table arranged in two rows, for the target protein and all other proteins, and two columns, for the two groups, and was carried out using an Excel macro. Statistical significance was set at p < 0.05. The Yates correction for continuity was applied to the 2 × 2 tables.
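The G-test on the 2 × 2 table described above can be sketched in a few lines. This is an illustrative reimplementation in Python, not the Excel macro used in the study; it assumes the usual form of the Yates correction, in which each observed count is moved 0.5 toward its expected value, and obtains the p value from the χ² distribution with one degree of freedom. The function name and counts are hypothetical.

```python
import math


def g_test_yates(k_ra, k_oa, total_ra, total_oa):
    """G statistic and p value for one protein, RA vs. OA.

    k_ra, k_oa: pooled spectral counts of the target protein per group.
    total_ra, total_oa: pooled spectral counts of all proteins per group.
    """
    grand = total_ra + total_oa
    if grand <= 0:
        raise ValueError("total spectral counts must be positive")
    observed = [[k_ra, k_oa], [total_ra - k_ra, total_oa - k_oa]]
    row_sums = [sum(row) for row in observed]
    col_sums = [total_ra, total_oa]
    g = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / grand
            obs = observed[i][j]
            # Yates correction (assumed form): shift 0.5 toward expectation,
            # without overshooting it.
            if obs > expected:
                obs = max(obs - 0.5, expected)
            elif obs < expected:
                obs = min(obs + 0.5, expected)
            if obs > 0:
                g += obs * math.log(obs / expected)
    g = max(2.0 * g, 0.0)
    # Survival function of chi-square with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(g / 2.0))
    return g, p_value


# Hypothetical pooled counts: 60 vs. 5 spectra out of 1000 per group.
print(g_test_yates(60, 5, 1000, 1000))
```

With the correction, a zero count becomes 0.5 before the logarithm is taken, so proteins detected in only one group can still be tested.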
The Yates correction makes it possible to handle data containing small spectral counts, including zero. Statisticians, however, have shown that the results of a G-test on a contingency table containing small counts are not fully convincing, because the test assumes that the G statistic asymptotically obeys a χ² distribution with one degree of freedom. To validate the G-test results, we calculated exact p values for the significant proteins without making any assumptions about the statistical distribution, based on the permutational distribution of the test statistic, i.e., Fisher's exact test and the Mann-Whitney U test for the contingency tables, using an R package.
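To illustrate what an exact p value means for a 2 × 2 table, the following is a minimal two-sided Fisher's exact test written with the Python standard library. It is a sketch for illustration only and not the R package used in the study; the two-sided rule assumed here sums the probabilities of all tables that are no more likely than the observed one, and the function name is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell
        # with all margins held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    total = sum(prob(x) for x in range(lowest, highest + 1)
                if prob(x) <= p_observed * (1 + 1e-9))
    return min(1.0, total)


# Small textbook-style table; counts are illustrative only.
print(fisher_exact_two_sided(3, 1, 1, 3))
```

No asymptotic distribution is involved: the p value is obtained by enumerating every table compatible with the observed margins.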

Network analysis of protein-protein interactions
Network analysis of protein-protein interactions was carried out using STRING version 9.1 [43], in which nodes are proteins and edges are the predicted functional associations based on primary databases, comprising KEGG and GO, and the primary literature. STRING predicts these interactions based on neighbourhood, gene fusion products, homology and similarity of co-expression patterns. Network interaction scores for each node are expressed as a joint probability derived from curated databases of experimental information, text mining and computational prediction by genetic proximity [44]. In this study, STRING networks were calculated with the default settings: medium confidence score, 0.400; network depth, 0; and up to 50 interactions.
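The idea of a joint probability score with a confidence cutoff can be sketched as follows. This is a simplified illustration that combines independent evidence scores as 1 − Π(1 − s); it omits the prior correction that STRING applies, so it should not be read as STRING's exact scoring. The function names, protein pairs and per-channel scores are hypothetical.

```python
def combined_score(channel_scores):
    """Combine per-channel probabilities as 1 - prod(1 - s_i) (simplified)."""
    not_linked = 1.0
    for score in channel_scores:
        if not 0.0 <= score <= 1.0:
            raise ValueError("scores must be probabilities in [0, 1]")
        not_linked *= 1.0 - score
    return 1.0 - not_linked


def confident_edges(edges, min_score=0.400):
    """Keep edges whose combined score reaches the confidence cutoff."""
    kept = []
    for protein_a, protein_b, channel_scores in edges:
        score = combined_score(channel_scores)
        if score >= min_score:
            kept.append((protein_a, protein_b, round(score, 3)))
    return kept


# Hypothetical evidence scores per channel; not taken from STRING.
edges = [
    ("S100A8", "S100A9", [0.90, 0.80]),
    ("CD14", "S100A8", [0.20, 0.30]),
    ("MMP3", "FTL", [0.10, 0.05]),
]
print(confident_edges(edges))
```

Raising `min_score` above the medium-confidence default of 0.400 would yield a sparser network with fewer, better-supported edges.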