Proteomic Maps of Human Gastrointestinal Stromal Tumor Subgroups*

In Brief
Quantitative proteome profiling of GIST from 13 patients classified into very low/low, intermediate and high-risk subgroups was performed. An extended cohort of GIST (n = 131) was used for immunohistochemical validation of proteins of interest. Functional insights into proteins associated with GIST risk grade were assessed. Immunohistochemical analysis revealed that GIST patients with high PTPN1 had low chances of developing metastasis. This work provides valuable information for understanding the inherent biology and evolution of GIST.

Highlights
Quantitative proteomes of GIST very low/low, intermediate and high risk subgroups.
9177 quantified proteins covering 57% of the GIT transcriptome.
Functional insights into proteins associated with different GIST risk stage.
GIST patients with high PTPN1 expression have low chances of developing metastasis.

Gastrointestinal stromal tumor (GIST) is a common sarcoma of the gastrointestinal tract (GIT) with high metastatic and recurrence rates, but its proteomic features remain poorly understood. Here we performed systematic quantitative proteome profiling of GIST from 13 patients classified into very low/low, intermediate and high risk subgroups. An extended cohort of GIST (n = 131) was used for immunohistochemical validation of proteins of interest. In total, 9177 proteins were quantified, covering 55.9% of the GIT transcriptome from The Human Protein Atlas. Of the 9177 quantified proteins, 4930 were observed in all 13 cases, with 517 upregulated and 187 downregulated proteins in tumorous tissues independent of risk stage. Pathway analysis showed that the downregulated proteins were mostly enriched in metabolic pathways, whereas the upregulated proteins mainly belonged to the spliceosome pathway. In addition, 131 proteins showed statistically significant differential expression patterns among GIST subgroups. The 13 GIST cases were classified perfectly into 3 subgroups based on the expression of these proteins. An intensive comparison of the molecular phenotypes and possible functions of quantified oncoproteins, tumor suppressors, phosphatases and kinases between GIST subgroups was carried out. Immunohistochemical analysis of the phosphatase PTPN1 (n = 117) revealed that GIST patients with high PTPN1 expression had low chances of developing metastasis. Collectively, this work provides valuable information for understanding the inherent biology and evolution of GIST.


achieved to predict the potential malignancy of GIST, a deep proteome coverage and unbiased proteomic studies of human GIST subgroups are still necessary (21,22,23).
In this study, we performed large-scale quantitative proteome analysis of tumorous tissues (T) and paired adjacent nontumorous tissues (N) of GIST, including 3 paired NIH consensus criteria-based very low/low risk (NIH-L), 5 paired NIH consensus criteria-based intermediate risk (NIH-I), and 5 paired NIH consensus criteria-based high risk (NIH-H) GIST samples. Of the 9177 proteins quantified in the GIST proteome, 4930 were quantified with good reproducibility (Student's t test, p > 0.1), and 517 upregulated and 187 downregulated proteins were found in all 13 GIST tumorous tissues. Clustering and cluster-specific enrichment analysis of 131 differentially expressed proteins clearly showed distinctive biological processes and pathways enriched in GIST subgroups, and partial least squares discrimination analysis (PLS-DA) confirmed the distinctive expression patterns of these 131 proteins in GIST subgroups. We further systematically assessed the expression patterns of oncoproteins, tumor suppressors (TSs), kinases, and phosphatases, and discussed their potential functions in GIST. Immunohistochemical analysis of the phosphatases tyrosine-protein phosphatase nonreceptor type 1 (PTPN1 or PTP1B) and serine/threonine-protein phosphatase 2A catalytic subunit beta isoform (PPP2CB) in an extended cohort of GIST indicated that GIST patients with high PTPN1 expression had low chances of developing metastasis. Collectively, this work is the first large-scale quantitative characterization of the GIST proteome and provides valuable information for understanding the etiology of GIST progression.

EXPERIMENTAL PROCEDURES
Experimental Design and Statistical Rationale-We performed in-depth proteomic analysis of tumorous and paired adjacent nontumorous tissues of GIST, including 3 paired NIH-L, 5 paired NIH-I and 5 paired NIH-H GIST samples, using tandem mass tag (TMT) 10plex labeling for each subgroup. To reduce the influence of data-dependent acquisition, all samples were run in duplicate. To minimize the protein abundance difference among individuals, the protein intensity of each tumor was normalized to that of its corresponding nontumorous tissue. We used the p value from Student's t test to assess the reproducibility of the technical duplicates. If the protein ratios quantified in duplicates had p values (Student's t test) greater than 0.1, meaning that the quantified ratio difference between duplicates was small, the quantification of the protein was considered reproducible, and such proteins were used for data mining.
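As an illustration only, the normalization and reproducibility criterion described above could be sketched as follows. This is a minimal sketch and not the authors' actual pipeline: the function names are hypothetical, and it assumes an unpaired two-sample Student's t test with pooled variance comparing the T/N ratios of one protein in the first technical run against the second, which the text does not specify. The p value is obtained by numerically integrating the t density so that the example needs only the Python standard library.

```python
import math
from statistics import mean, variance


def tn_ratios(tumor, normal):
    """Normalize each tumor intensity to its paired nontumorous intensity."""
    return [t / n for t, n in zip(tumor, normal)]


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t, df, steps=2000):
    """Two-sided p value: 1 - 2 * integral of the density from 0 to |t| (Simpson's rule)."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return max(0.0, 1.0 - 2.0 * total * h / 3.0)


def student_t_test(a, b):
    """Unpaired Student's t test with pooled variance (an assumption); returns (t, p)."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    se = math.sqrt(pooled * (1 / na + 1 / nb))
    diff = mean(a) - mean(b)
    if se == 0:
        # No variation at all: identical runs are trivially reproducible.
        return (0.0, 1.0) if diff == 0 else (math.copysign(math.inf, diff), 0.0)
    t = diff / se
    return t, two_sided_p(t, df)


def is_reproducible(ratios_run1, ratios_run2, cutoff=0.1):
    """Keep a protein when the duplicate runs do not differ (p > cutoff)."""
    _, p = student_t_test(ratios_run1, ratios_run2)
    return p > cutoff
```

Under these assumptions, a protein whose T/N ratios are similar in both runs passes the filter, whereas a protein with a systematic shift between runs is discarded.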
Clinical Samples Collection-One hundred thirty-one GIST patients were recruited from the Department of Gastrointestinal Surgery, West China Hospital, with ethical approval from the Biomedical Ethics Committee of West China Hospital (Permission Number: 2017-254). All cases were diagnosed as GIST by two independent pathologists according to the Chinese consensus guidelines for diagnosis and management of GIST. To avoid protein degradation, all tissues were frozen in liquid nitrogen 30 min after surgery. The GIST samples were collected as follows. Of the 131 cases, 13 had tumors with paired adjacent nontumorous tissues, and 118 had only tumorous tissues. All tumorous and adjacent nontumorous specimens obtained from these patients were stored in the West China Hospital Biobank of Sichuan University.
Protein Extraction and Isobaric Labeling-The GIST tissues were shredded and lysed in RIPA buffer (1% NP-40, 0.5% (w/v) sodium deoxycholate, 150 mM NaCl, 50 mM Tris (pH = 7.5)) containing protease inhibitor, and then homogenized twice by Gentle-MACS (Miltenyi Biotec GmbH) under the procedure "protein 01. 01", followed by 5 min of sonication (0.3 s on and 1.7 s off) at 195 W using a JY92-IIN (NingBoXinYi, China). The lysate was centrifuged at 20,000 rcf for 30 min, and the supernatant was transferred to a new tube. The protein concentration was measured by Bradford assay. Extracted proteins (50 μg) from each sample were reduced with 0.5 M tris(2-carboxyethyl)phosphine (TCEP) at a final concentration of 10 mM at 56°C for 1 h, alkylated with 1.0 M iodoacetamide at a final concentration of 20 mM in the dark at room temperature for an additional 30 min, and then precipitated with methanol, chloroform, and water (CH3OH:CHCl3:H2O = 4:1:3). The precipitate was air-dried and digested using sequencing-grade trypsin in triethylammonium bicarbonate (TEAB) buffer. The tryptic peptides of each sample were labeled with TMT (Thermo Fisher Scientific) reagents according to the manufacturer's protocol. After quenching with 5% hydroxylamine, the TMT-labeled peptides of each GIST risk grade were mixed and fractionated on a reverse-phase C18 column.
LC-MS/MS Analysis-The desalted peptides were lyophilized and resuspended in buffer A (2% ACN, 0.1% FA). LC-MS/MS analysis was performed using an EASY-nanoLC 1000 nanoflow LC instrument coupled to a high-resolution mass spectrometer (Q Exactive Plus, Thermo Fisher Scientific). A 100 μm (inner diameter) × 2 cm (length) trap column and a 75 μm (inner diameter) × 12 cm (length) analytical column were pulled and packed in-house with C18 particles (DIKMA). Data dependent acquisition (DDA) was performed in positive ion mode at a flow rate of 300 nL/min. MS spectra were acquired from 350 m/z to 1600 m/z with a resolution of 70,000 at m/z = 200. The automatic gain control (AGC) value was set at 3e6, with a maximum injection time of 20 ms. For MS/MS scans, the top 15 most intense parent ions were selected with a 1.6 m/z isolation window and fragmented with a normalized collision energy (NCE) of 30%. The AGC value for MS/MS was set to a target value of 1e5, with a maximum injection time of 100 ms and a resolution of 35,000. Parent ions with a charge state of z = 1 or with unassigned charge states were excluded from fragmentation, and the intensity threshold for selection was set to 2e5.

1 The abbreviations used are: GIST, gastrointestinal stromal tumor; GIT, gastrointestinal tract; MS, mass spectrometry; DDA, data dependent acquisition; FDR, false discovery rate; NIH-L, National Institutes of Health consensus criteria based very low/low risk; NIH-I, National Institutes of Health consensus criteria based intermediate risk; NIH-H, National Institutes of Health consensus criteria based high risk; KIT or CD117, mast/stem cell growth factor receptor Kit; DOG1 or ANO1, anoctamin-1; PTPN1 or PTP1B, tyrosine-protein phosphatase nonreceptor type 1; PPP2CB, serine/threonine-protein phosphatase 2A catalytic subunit beta isoform; PLS-DA, partial least squares discrimination analysis; TMT, tandem mass tag; HPFs, high-power fields; IOD, integrated optical density; CV, coefficient of variation; RSD, relative standard deviation; GO, Gene Ontology; KEGG, Kyoto Encyclopedia of Genes and Genomes.
Data Processing and Analysis-All the raw files were searched against the Swiss-Prot human protein sequence database (20413 entries, 2017/01/14) in MaxQuant (version 1.6). The precursor peptide mass tolerance was 10 ppm and the fragment ion mass tolerance was 0.02 Da. Two missed trypsin cleavages were allowed. Cysteine carbamidomethylation was set as a fixed modification. Oxidation of methionine and protein N-terminal acetylation were set as variable modifications. Peptides with < 1% false discovery rate (FDR) were chosen for further data processing. Supplemental Table S1 contains the 9177 proteins quantified in all GIST subgroups in duplicates, excluding entries flagged as reverse or potential contaminant, zero-intensity proteins, and some proteins with single-peptide identification. Supplemental Table S2 contains the quantified proteins of each GIST subgroup in duplicates. Supplemental Table S3 includes the 7230 proteins simultaneously quantified in all GIST cases. Supplemental Table S4 contains the quantified proteins of each subgroup with good reproducibility between the two technical repeats (Student's t test, p > 0.1). Supplemental Table S5 includes the 4930 proteins quantified in all GIST subgroups with good technical reproducibility (Student's t test, p > 0.1). Supplemental Table S6 contains the 517 upregulated and 187 downregulated proteins in all subgroups. Supplemental Table S7 contains the GIT-specific and -unspecific genes in each GIST subgroup. Supplemental Table S8 contains the differentially expressed proteins between tumorous and paired adjacent nontumorous tissues in each subgroup. Supplemental Table S9 shows the Gene Ontology Cellular Component (GOCC) enrichment results for the differentially expressed proteins in each subgroup. Supplemental Table S10 includes the proteins with statistically significant differential expression between GIST subgroups (Student's t test, p < 0.05) and a small difference within a GIST subgroup. Supplemental Table S11 includes the cluster-specific enrichment results for Gene Ontology Biological Processes (GOBP), GOCC and Kyoto Encyclopedia of Genes and Genomes (KEGG). Supplemental Table S12 includes all oncoproteins and TSs simultaneously quantified in all GIST subgroups, based on 138 well-annotated cancer driver genes (24). Supplemental Table S13 includes all phosphatases and kinases simultaneously quantified in all GIST subgroups, with reference to the Eukaryotic Kinase and Phosphatase Database (EKPD, 2019/1/1). Supplemental Table S14 includes the immunohistochemical results for both PPP2CB and PTPN1. Supplemental Table S15 contains the statistics of PPP2CB and PTPN1 positive rates in GIST subgroups. Gene annotation, including biological process, molecular function and cellular component, and KEGG pathway analysis were performed using DAVID 6.8 and gene set enrichment analysis (GSEA). Unsupervised clustering and PLS-DA were applied to evaluate the differences among GIST subgroups. Pearson correlation coefficient analysis was used to confirm the difference and similarity among GIST subgroups, as well as the technical reproducibility in duplicates of each subgroup. The fold change statistics of kinases and phosphatases were calculated in Excel. Volcano plots were used to show the significantly changed proteins in each subgroup.
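The exclusion rules used to assemble supplemental Table S1 (reverse hits, potential contaminants, zero-intensity proteins and single-peptide identifications) can be illustrated with a short sketch. It is hypothetical rather than the authors' script: the column names ("Reverse", "Potential contaminant", "Intensity", "Peptides") and the "+" flag are assumed to follow the tab-separated protein groups table written by MaxQuant, and requiring at least two peptides is an assumption, because the text states only that some single-peptide proteins were excluded.

```python
import csv
import io


def keep_protein(row, min_peptides=2):
    """Return True when a protein-group row passes all four exclusion rules."""
    if row.get("Reverse", "") == "+":  # decoy (reverse) hit
        return False
    if row.get("Potential contaminant", "") == "+":  # flagged contaminant
        return False
    if float(row.get("Intensity") or 0) <= 0:  # zero-intensity protein
        return False
    # Assumed threshold: drop single-peptide identifications.
    return int(row.get("Peptides") or 0) >= min_peptides


def filter_protein_groups(handle):
    """Read a tab-separated protein groups table and keep the passing rows."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [row for row in reader if keep_protein(row)]


# Usage with a small in-memory table (hypothetical entries).
example = io.StringIO(
    "Protein IDs\tReverse\tPotential contaminant\tIntensity\tPeptides\n"
    "P1\t\t\t1000\t5\n"
    "P2\t+\t\t800\t4\n"
    "P3\t\t+\t900\t6\n"
    "P4\t\t\t0\t3\n"
    "P5\t\t\t700\t1\n"
)
kept_ids = [row["Protein IDs"] for row in filter_protein_groups(example)]
```

In this toy table only the first entry survives; each of the other four rows fails exactly one rule.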
Immunohistochemical Staining-All tumor specimens diagnosed as GIST by two independent pathologists were made into donor paraffin blocks. Tumor microarray construction and immunohistochemical staining were performed as described in our previous study (25). The antibodies for immunohistochemistry were purchased from HuaBio (China; anti-PTPN1 antibody, Cat#RT1521; anti-PPP2CB antibody, Cat#ET1611-54). Image-Pro Plus 6.0 software was used to evaluate the intensity of protein expression.

RESULTS
Proteome Profiling of GIST Subgroups-We collected 131 GIST cases deposited in the tumor biobank of West China Hospital. All 131 GIST cases were diagnosed by immunohistochemical staining of KIT, DOG1 and hematopoietic progenitor cell antigen CD34 (CD34), as well as histopathological review by two independent pathologists. Of these, 13 tumors with their paired adjacent nontumorous tissues were used for proteome profiling to study the molecular phenotypes during GIST progression (Table I), whereas the remaining 118 cases with only tumorous tissues were used for immunohistochemical staining. In the 13 GIST cases, 3 patients were diagnosed with very low/low risk (tumor size < 2 cm, or mitotic count < 3/50 HPFs (high-power fields)) GIST, 5 patients with intermediate risk (tumor size 5-10 cm and mitotic count < 5/50 HPFs, or mitotic count 5-10/50 HPFs) GIST, and 5 patients with high risk (mitotic count > 10/50 HPFs, or tumor size > 10 cm) GIST, according to the NIH standard classification system (Table I). It should be mentioned that one case belonging to the NIH-I subgroup was KIT negative and was included in the samples used for proteome profiling.

Table I. Clinicopathologic features of the collected GIST cases for proteome profiling. Thirteen GIST cases were selected for quantitative proteomic studies. All these tissues were frozen in liquid nitrogen 30 min after surgery. Among them, three cases were diagnosed as very low/low risk, five cases as intermediate risk, and five cases as high risk. Note: M: male; F: female. *Prognostic classification based on tumor size and mitosis count.
Quantitative proteomic analysis using isobaric isotope reagents such as TMT has been widely used to determine the relative protein ratios between different samples. In this study, to obtain the changing patterns of the proteome between tumorous and nontumorous tissues in GIST subgroups, we performed quantitative proteomics using the TMT-10plex isobaric label reagent, with five biological repeats for both NIH-I and NIH-H GIST and three biological repeats for NIH-L GIST (Fig. 1A). To increase the throughput and minimize co-isolated peptide contamination, the combined TMT-labeled peptides were separated into 40 fractions by reversed-phase high-performance liquid chromatography (RP-HPLC) under basic pH, and further analyzed by mass spectrometry with two technical replicates. The raw mass spectrometry (MS) data were searched using MaxQuant (version 1.6) against the Swiss-Prot human database (20413 entries, 2017/01/14), with a protein false discovery rate (FDR) of 1% and an MS2 tolerance of 10 ppm at peptide level (Fig. 1A). After removing reverse, contaminant and zero-intensity proteins, as well as proteins with single-peptide identification, a total of 9177 proteins, representing about 55.9% of all GIT coding genes (26), were quantified in GIST samples (Fig. 1B, and supplemental Table S1). The median protein coverage was ~21.33% and the median number of unique peptides for each protein was 10.1. Specifically, 8650, 8287, and 7797 proteins were quantified in the NIH-H, NIH-I, and NIH-L GIST subgroups, respectively (supplemental Fig. S1A-S1D, and supplemental Table S2). Good reproducibility was observed between technical repeats (supplemental Fig. S1E-S1G). In addition, 7230 quantified proteins were shared by all GIST samples (Fig. 1C, and supplemental Table S3). Then, we applied Student's t test to the T/N ratios in duplicates and found that the p values of 4930 proteins were greater than 0.1 between technical duplicates, indicating good reproducibility of protein quantification between duplicates of each GIST subgroup; these proteins were used for further data mining (supplemental Fig. S1H, and supplemental Table S4-S5).
To further demonstrate the reliability of our proteomic data, we first examined some proteins known to be overexpressed in GIST, such as KIT, protein PML (PML), tyrosine-protein kinase transmembrane receptor 2 (ROR2), ATP-dependent RNA helicase DDX39A/B (DDX39A/B), apoptosis regulator Bcl-2 (BCL2), BTB/POZ domain-containing protein KCTD12 (KCTD12), BTB/POZ domain-containing protein KCTD10 (KCTD10), and ANO1 (Fig. 1D). Notably, the changing patterns of all these proteins in our proteomic data were consistent with previous reports (9,10,22,23,27,28). Next, we calculated the Pearson correlation coefficient (r) of the 4930 quantified proteins between GIST subgroups, and found that the coefficient was 0.77 between the NIH-L and NIH-I subgroups, 0.77 between the NIH-L and NIH-H subgroups, and 0.8 between the NIH-I and NIH-H subgroups (Fig. 1E), suggesting a closer similarity between the NIH-I and NIH-H subgroups, consistent with the clinical outcome that recurrence-free survival of NIH-L GIST is significantly higher than that of NIH-I and NIH-H GIST (7). Collectively, we performed the deepest quantitative proteomics described to date on NIH-L, NIH-I, and NIH-H GIST subgroups.
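For reference, the Pearson correlation coefficient used for the subgroup comparison above can be computed as in the following minimal sketch. The function name is hypothetical, and the inputs are simply two equal-length lists of per-protein values (for example, the average T/N ratio of each protein in two subgroups); whether the ratios were log-transformed before correlation is not stated in the text and is left to the caller.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of the centered values.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)
```

A value close to 1 indicates that proteins change in the same direction and to a similar extent in the two subgroups being compared.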
Common Proteomic Features of GIST Subgroups-To identify the general features of the GIST proteome independent of risk grade, we processed the data to define upregulated and downregulated proteins in all GIST subgroups using the following filtration conditions. First, we selected the proteins whose T/N ratios in all 13 cases were simultaneously greater than or simultaneously less than 1.0, which yielded 1011 proteins (supplemental Table S6). Then, among these 1011 proteins, those whose average T/N ratios in the 13 cases were greater than 1.5 were defined as upregulated proteins, and those whose average T/N ratios in the 13 cases were less than 0.67 were defined as downregulated proteins. As a result, 517 upregulated proteins and 187 downregulated proteins in all subgroups were obtained (Fig. 2A, and supplemental Table S6). Gene Ontology (GO) analysis of these 704 common differentially expressed proteins revealed that the upregulated proteins were significantly enriched in the nucleus and cytoskeleton, whereas only the downregulated proteins were enriched in vesicles, mitochondrion and extracellular space (Fig. 2B). KEGG pathway analysis showed that the upregulated proteins mainly belonged to the spliceosome, whereas the downregulated proteins were involved in metabolic pathways including carbon metabolism, glycolysis/gluconeogenesis, metabolism of xenobiotics by cytochrome P450, TCA cycle, propanoate metabolism and drug metabolism, which is related to the dysregulation of metabolism and detoxification in cancer (Fig. 2C). In total, 28 spliceosome proteins, such as probable ATP-dependent RNA helicase DDX5 (DDX5) and serine/arginine-rich splicing factor 1 (SRSF1), were upregulated (Fig. 2D, and supplemental Fig. S2A). DDX5, an RNA helicase regulating DNA replication and microRNA expression, is upregulated in a majority of malignant breast cancer tissues (29,30). SRSF1 is a member of the SR protein family, which is closely associated with mRNA metabolism, including mRNA splicing, stability, and translation (31), and is overexpressed in various cancer types as a potent proto-oncogene (29,32,33). RNA mis-splicing contributes to a great number of human diseases (34), and our data indicate a potential correlation between RNA mis-splicing and GIST risk stage. Some critical metabolic enzymes such as fructose-1,6-bisphosphatase 1/2 (FBP1/2), fructose-bisphosphate aldolase B (ALDOB), ribose-phosphate pyrophosphokinase 2 (PRPS2), phosphoenolpyruvate carboxykinase (PCK), alanine aminotransferase 1 (GPT), acyl-CoA synthetase short-chain family member 1/2 (ACSS1/2), succinyl-CoA ligase [ADP/GDP-forming] subunit alpha/beta (SUCLG1/2), 2-oxoglutarate dehydrogenase (OGDH), alcohol dehydrogenase [NADP(+)] (AKR1A1), isocitrate dehydrogenase [NADP] (IDH1, cytoplasmic), isocitrate dehydrogenase [NADP] (IDH2, mitochondrial) and aldehyde dehydrogenase (ALDH) were dramatically downregulated (Fig. 2E-2F). The dysregulation of metabolic enzymes is closely associated with cancer. For example, down-regulation of IDH1/2 could decrease the effective level of α-ketoglutarate and in turn increase the stability of hypoxia-inducible factor 1-alpha (HIF-1A), a transcription factor facilitating tumor growth in a low-oxygen environment (35), suggesting that down-regulation of IDH1/2 might also be involved in GIST progression. In addition, by statistically analyzing the GIT-specific proteins annotated in the Human Protein Atlas (26), we found that these proteins were significantly downregulated compared with GIT-unspecific proteins in all GIST subgroups (supplemental Fig. S2C, and supplemental Table S7), indicating that loss of GIT tissue identity is a common feature of GIST.
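The two-step filtration used in this section to define the common upregulated and downregulated proteins (a consistent direction of the T/N ratio in all cases, followed by an average ratio above 1.5 or below 0.67) can be written compactly. The sketch below is illustrative only; the function name and the "neither" label are hypothetical, and the input is assumed to be the list of T/N ratios of one protein across all cases.

```python
from statistics import mean


def classify_common(ratios, up=1.5, down=0.67):
    """Classify one protein from its T/N ratios across all cases.

    Step 1: all ratios must lie on the same side of 1.0.
    Step 2: the average ratio must exceed `up` or fall below `down`.
    """
    average = mean(ratios)
    if all(r > 1.0 for r in ratios) and average > up:
        return "up"
    if all(r < 1.0 for r in ratios) and average < down:
        return "down"
    return "neither"


# Usage with made-up ratios for three cases.
labels = [
    classify_common([1.2, 2.0, 1.9]),  # consistent and average 1.7
    classify_common([0.5, 0.6, 0.7]),  # consistent and average 0.6
    classify_common([1.2, 1.3, 1.1]),  # consistent but average only 1.2
    classify_common([0.9, 2.0, 3.0]),  # direction not consistent
]
```

Requiring a consistent direction in every case before applying the fold-change cutoff prevents a few extreme cases from dominating the average.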
The Proteomic Variances in GIST Subgroups-Previous studies showed that the 5-year disease-free survival rates of very low/low, intermediate, and high-risk GIST patients were 96%, 54%, and 20%, respectively (7), indicating that intrinsic molecular differences exist among GIST subgroups. To explore this, we first analyzed the proteomic features of the GIST subgroups separately. For each subgroup, we processed the data using the following screening conditions: (1) the intensity difference of the same protein between tumors and adjacent nontumorous tissues should be significant (Student's t test, p < 0.05), and the T/N ratio difference of the same protein between technical duplicates should be small (Student's t test, p > 0.1); (2) the average T/N ratio should be greater than 1.5 with the ratio of each patient in the subgroup greater than 1.0, or the average T/N ratio should be less than 0.67 with the ratio of each patient in the subgroup less than 1.0. Consequently, we obtained 896 upregulated and 93 downregulated proteins in the NIH-L subgroup, 549 upregulated and 243 downregulated proteins in the NIH-I subgroup, as well as 809 upregu-  Table S8). Remarkably, kinesin-1 heavy chain (KIF5B), pre-B-cell leukemia transcription factor-interacting protein 1 (PBXIP1) and peptidyl-prolyl cis-trans isomerase FKBP10 (FKBP10) in the NIH-L subgroup, protein disulfide-isomerase TMX3 (TMX3), tubulin alpha-1B chain (TUBA1B) and SWI/SNF-related matrix-associated actin-dependent regulator of chromatin subfamily A member 5 (SMARCA5) in the NIH-I subgroup, and myosin light chain 6B (MYL6B), sarcoplasmic/endoplasmic reticulum calcium ATPase 3 (ATP2A3), serine/arginine repetitive matrix protein 2 (SRRM2) and protein quaking (QKI) in the NIH-H subgroup were dramatically upregulated, whereas histamine N-methyltransferase (HNMT) and A-kinase anchor protein 1 (AKAP1, mitochondrial) in the NIH-L subgroup, glycerol-3-phosphate dehydrogenase [NAD(+)] (GPD1, cytoplasmic) and D-3-phosphoglycerate dehydrogenase (PHGDH) in the NIH-I subgroup, and medium-chain specific acyl-CoA dehydrogenase (ACADM, mitochondrial), succinate-semialdehyde dehydrogenase (ALDH5A1, mitochondrial) and very long-chain specific acyl-CoA dehydrogenase (ACADVL, mitochondrial) in the NIH-H subgroup were clearly downregulated (supplemental Fig. S3B, and supplemental Table S4). Some of these proteins are known to be closely related to tumorigenesis. For example, overexpression of PBXIP1 can enhance TGF-β-induced epithelial-mesenchymal transition (EMT) (36) and activate the G1/S and G2/M cell cycle checkpoints (37), indicating that the highly expressed PBXIP1 in tumors might play important roles in the early stage of GIST development.
Next, we performed GOCC analysis of the significantly changed proteins in each GIST subgroup using DAVID 6.8. We found that extracellular exosome proteins were predominantly enriched in the NIH-H subgroup, and some membrane proteins were mostly enriched in both the NIH-L and NIH-I subgroups (Fig. 3A, and supplemental Table S9). GSEA analysis using the tumor hallmark database (Version: h.all.v6.2.symbols.gmt [Hallmarks]) revealed that many signaling pathways showed similar regulatory patterns in the three GIST subgroups (Fig. 3B). For example, G2M checkpoint and E2F targets, which are closely related to the cell cycle, were upregulated, whereas oxidative phosphorylation, xenobiotic metabolism, fatty acid metabolism, and adipogenesis were downregulated in all subgroups. It should be mentioned that the cholesterol homeostasis pathway was almost unchanged in the NIH-L subgroup, slightly downregulated in the NIH-I subgroup, and dramatically downregulated in the NIH-H subgroup (Fig. 3B, and supplemental Fig. S3C). ACSS2, an important enzyme associated with cholesterol homeostasis, was significantly downregulated in NIH-H GIST tumors. Low expression of ACSS2, which showed a tendency toward elevated keratin, type II cytoskeletal 7 (KRT7) expression and decreased keratin, type II cytoskeletal 20 (KRT20) and homeobox protein CDX-2 (CDX2) expression, was an independent prognostic factor for poor 5-year progression-free survival in colorectal carcinoma (38).
Further, to better understand the proteomic variances, we screened for differentially expressed proteins that differed significantly between GIST subgroups but only slightly within a GIST subgroup, using two filtration conditions. First, the protein p values (Student's t test) were less than 0.05 between two GIST subgroups in at least two comparisons. Second, the relative standard deviation (RSD) of the ratios of the same protein within a GIST subgroup was less than 0.4, or the T/N ratios of the same protein in all cases of a GIST subgroup were all greater than or all less than 1.0. Consequently, 131 proteins were identified as differentially expressed proteins (supplemental Table S10). Unbiased clustering and cluster-specific enrichment analysis using GOBP, GOCC, and KEGG (DAVID 6.8) revealed the biological processes, cellular components and pathways enriched in different GIST subgroups (Fig. 3C, and supplemental Table S11). Pathways and biological functions enriched in NIH-L GIST included the endocytosis pathway and protein transport processes, whereas in NIH-H GIST, ubiquitin-mediated proteolysis was significantly enriched. PLS-DA analysis confirmed the distinctive features of these 131 differentially expressed proteins in GIST subgroups (39), and the 13 GIST cases were classified into 3 subgroups based on the expression of these proteins (Fig. 3D). Collectively, these analyses of specifically enriched proteins unveiled the unique proteomic features of GIST subgroups.
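The two filtration conditions above can be sketched as follows. This is a hypothetical illustration, not the authors' code: the pairwise p values are assumed to be precomputed by Student's t test, the RSD is taken as the sample standard deviation divided by the mean, and the within-subgroup condition is assumed to be required for every subgroup, none of which is spelled out in the text.

```python
from statistics import mean, stdev


def rsd(ratios):
    """Relative standard deviation: sample standard deviation over the mean."""
    return stdev(ratios) / mean(ratios)


def consistent_within(ratios, max_rsd=0.4):
    """Small spread (RSD < max_rsd) or all ratios on the same side of 1.0."""
    same_side = all(r > 1.0 for r in ratios) or all(r < 1.0 for r in ratios)
    return rsd(ratios) < max_rsd or same_side


def is_subgroup_marker(p_values, subgroup_ratios, alpha=0.05, min_comparisons=2):
    """Apply both conditions to one protein.

    p_values: pairwise comparison name -> precomputed t test p value.
    subgroup_ratios: subgroup name -> T/N ratios of the cases in that subgroup.
    """
    n_significant = sum(1 for p in p_values.values() if p < alpha)
    if n_significant < min_comparisons:
        return False
    return all(consistent_within(r) for r in subgroup_ratios.values())


# Usage with made-up numbers for one protein.
ratios = {"NIH-L": [2.0, 2.2, 1.9], "NIH-I": [1.1, 1.2, 1.0], "NIH-H": [0.5, 0.6, 0.4]}
passes = is_subgroup_marker({"L-I": 0.01, "L-H": 0.003, "I-H": 0.2}, ratios)
fails = is_subgroup_marker({"L-I": 0.01, "L-H": 0.30, "I-H": 0.2}, ratios)
```

In this example the first call passes because two of the three comparisons are significant and every subgroup is internally consistent, whereas the second fails on the first condition.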
Oncoproteins and Tumor Suppressors in GIST-We next assessed the homeostasis of oncoproteins and TSs in GIST subgroups. The activation of oncogenes and inactivation of TS genes are the main causes of tumor progression. To identify the oncoproteins and TSs associated with GIST risk stage, we selected them based on 138 well-annotated cancer driver genes (24) and summarized their fold-changes in all GIST subgroups (Fig. 4A-4B, and supplemental Table S12). In total, 21 oncoproteins and 31 TSs were quantified in all 13 GIST cases (supplemental Table S12). KIT, which belongs to the PI3K pathway and is a hallmark of GIST, as well as some other oncoproteins including tyrosine-protein kinase JAK1 (JAK1), splicing factor U2AF 35 kDa subunit (U2AF1), splicing factor 3B subunit 1 (SF3B1) and serine/threonine-protein phosphatase 2A 65 kDa regulatory subunit A alpha isoform (PPP2R1A), were significantly upregulated in all GIST subgroups. Epidermal growth factor receptor (EGFR) is a well-established therapeutic target for tumor treatment, and its overexpression and activation are associated with the development of a wide variety of tumors (40). As shown in Fig. 4A, EGFR was highly expressed in the NIH-H subgroup, but not in the NIH-L subgroup, indicating that the EGFR pathway might be involved in the late stage of GIST development. Moreover, we found that some TSs, including fizzy-related protein homolog (CDH1 or FZR1) and mothers against decapentaplegic homolog 4 (SMAD4), were downregulated in all GIST subgroups. Loss of function of CDH1 is associated with a strongly increased incidence of gastric cancer (41). We speculate that the down-regulation of CDH1 is most likely associated with the progression of GIST.
Kinases and Phosphatases in GIST-Kinases and phosphatases, two classes of essential cellular signaling regulators, have been widely used as targets for new drug development. These phosphorylation regulatory enzymes play very important roles in the evolution of cancers including GIST. For example, the overexpression of KIT, a tyrosine kinase, has been recognized as a significant feature of GIST. To comprehensively understand the roles of kinases and phosphatases during GIST progression, we selected kinases and phosphatases based on the EKPD database, and 18 phosphatases (Fig. 5A) and 32 kinases (Fig. 5B) were quantified in all 13 GIST cases (supplemental Table S13). Notably, most phosphatases and kinases were upregulated in GIST. Specifically, kinases such as KIT, nik-related protein kinase (NRK), cyclin-dependent kinase 7 (CDK7), serine/threonine-protein kinase Nek6 (NEK6), serine-protein kinase ATM (ATM), and bromodomain-containing protein 3 (BRD3) were upregulated in tumors (Fig. 5B). In contrast, we found that eukaryotic elongation factor 2 kinase (EEF2K) was downregulated in the three GIST subgroups. EEF2K phosphorylates its substrate eEF2 and regulates the elongation stage of protein synthesis. It has a dual role in promoting or inhibiting tumorigenesis depending on the cancer type (42).

Fig. 3 (legend, continued). Six thousand fifty-three proteins of the NIH-L subgroup, 6598 proteins of the NIH-I subgroup, and 6955 proteins of the NIH-H subgroup were used for GSEA analyses. C, Unbiased clustering and cluster-specific enrichment analysis of 131 differentially expressed proteins between GIST subgroups. The heatmap was drawn using the "Heatmap" R package, and the rows were scaled. The Z-score is a statistical normalization of the 13 T/N ratios of the GIST subgroups, computed by an algorithm included in the "Heatmap" R package. D, Partial least squares discrimination analysis (PLS-DA) of 131 differentially expressed proteins between GIST subgroups. PLS-DA analysis was performed using the "mixOmics" R package.
Compared with the kinases, the phosphatases were still understudied. Interestingly, the phosphatase PPP2CB showed distinctive expression patterns in GIST subgroups (Fig. 5A). Previous studies revealed that PPP2CB was related to cell cycle arrest phenotype (43,44) and established as a suppressor of NF-B signaling (45). Reduced expression of PPP2CB was observed in prostate cancer (46). Our MS results disclosed that the PPP2CB was upregulated in tumors of low-risk subgroup and downregulated in tumors of high-risk subgroup (supplemental Fig. S4A, and supplemental Table S10 And the size of circle positively correlated with the number of fold-change. Blue represented that the proteins were downregulated, and red mean that the proteins were upregulated. B, The corresponding pathways that the quantified oncoproteins and TSs belonged to. The red color proteins were oncoproteins, and the black color proteins were TSs. In the parentheses, the first number represented the average ratio in NIH-L subgroup, the second number represented the average ratio in NIH-I subgroup, and the third number represented the average ratio in NIH-H subgroup. differences between subgroups was significant (student-t test, p ϭ 0.012 between NIH-H and NIH-I groups. p ϭ 0.002 between NIH-H and NIH-L subgroups). These MS data were further confirmed by Western blots (supplemental Fig. S4B). Inspired by these results, we checked the absolute PPP2CB expression in tumors of an extended cohort of GIST cases (n ϭ 113) by immunohistochemical analysis using tissue microarray to explore the potential role of PPP2CB in different GIST risk stage (supplemental Table S14). As a result, we found that all the tumorous tissue spots belonging to NIH-L subgroups were positively stained, whereas only 86.1% and 90.4% of spots were positively stained in NIH-I and NIH-H subgroups, respectively (supplemental Fig. S4C, and supplemental Table S15). 
The differences in the average integrated optical density (IOD) of staining between GIST subgroups were not statistically significant (supplemental Fig. S4D). We assumed that the protein abundance of PPP2CB differed greatly between individuals in the normal state (47). Although the absolute PPP2CB expression in tumors did not change significantly between GIST subgroups, the T/N ratios were very different. We therefore assumed that a high PPP2CB level was likely to be associated with low GIST risk.
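The contrast drawn above between absolute tumor abundance and paired T/N ratios can be illustrated with a minimal sketch. It assumes paired tumor and adjacent nontumorous intensities per patient and uses the classical pooled-variance Student's t statistic; the function names and all intensity values are hypothetical and are not data from this study. Converting the t statistic to a p value requires the t distribution and is left to a statistics library.

```python
import math
import statistics


def log2_tn_ratios(tumor, normal):
    """Per-patient log2(T/N) from paired tumor and adjacent-normal intensities."""
    return [math.log2(t / n) for t, n in zip(tumor, normal)]


def student_t(a, b):
    """Pooled-variance (Student's) two-sample t statistic and its degrees of freedom."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * statistics.variance(a)
                  + (nb - 1) * statistics.variance(b)) / df
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(pooled_var * (1 / na + 1 / nb))
    return t, df


# Hypothetical intensities (arbitrary units), NOT values from this study.
# Baseline (normal) abundance varies widely between patients in both groups.
low_tumor, low_normal = [80, 220, 400, 150], [40, 100, 200, 60]
high_tumor, high_normal = [100, 210, 380, 90], [200, 400, 800, 200]

# Compare the two groups on absolute tumor intensity and on paired log2(T/N).
t_abs, df = student_t(low_tumor, high_tumor)
t_ratio, _ = student_t(log2_tn_ratios(low_tumor, low_normal),
                       log2_tn_ratios(high_tumor, high_normal))
print(f"absolute tumor intensity: t = {t_abs:.2f} (df = {df})")
print(f"log2(T/N) ratio:          t = {t_ratio:.2f} (df = {df})")
```

With these toy values, the absolute tumor intensities of the two groups overlap and give a small t statistic, whereas the paired log2(T/N) ratios separate the groups clearly, because dividing by each patient's own nontumorous tissue removes the between-individual baseline variation.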
In addition, among all upregulated phosphatases (Fig. 5A), PTPN1 drew our attention because its function in GIST was completely unknown and it was potentially associated with tumor metastasis (48,49). To reveal the relationship between PTPN1 and GIST progression, we first verified the MS data for PTPN1 (Fig. 5C) with Western blots and found that the Western blotting results were consistent with the MS data (Fig. 5D). We then performed immunohistochemical analysis of 117 GIST tumorous tissues, which revealed a high percentage of PTPN1-positive cases in the NIH-L subgroup and a gradual decrease in the PTPN1-positive rate from the NIH-L to the NIH-H subgroup (Fig. 5E-5G and supplemental Tables S14-S15). In particular, the average IOD of PTPN1 was significantly higher in the NIH-L than in the NIH-H subgroup (Student's t test, p = 0.037). Further analysis revealed that GIST patients with high PTPN1 had low chances of developing metastasis (Fig. 5H). Taken together, these results indicated that PTPN1 might be a suppressor of GIST metastasis, consistent with reports that PTPN1 stabilizes VE-cadherin-mediated cell-cell adhesions and controls cell motility and invasion (48,49).

DISCUSSION
In summary, we performed in-depth quantitative proteomics to reveal the proteomic features of GIST subgroups. In total, 9177 proteins were unambiguously quantified in GIST samples, covering 55.9% of the GIT transcriptome from The Human Protein Atlas. Among them, 704 proteins with similar expression patterns and 131 proteins with distinctive expression patterns were identified across GIST subgroups. In addition, we performed immunohistochemical analysis in an extended GIST cohort to study the phosphatases PTPN1 and PPP2CB, which were found to be closely correlated with GIST metastasis and risk grade, respectively.
Previous studies showed that PTPN1 is a double-faceted molecule in tumors. It acts as a tumor promoter in breast (50), non-small cell lung (51), ovarian (52), and prostate (53) cancers but as a tumor suppressor in esophageal cancer (54) and lymphoma (55). Our immunohistochemical results suggested that patients with very high expression of PTPN1 were metastasis-free and that the risk of metastasis increased as PTPN1 abundance decreased (Fig. 5H). A previous study indicated that PTPN1 acted as a tumor suppressor that inhibited cell mobility and invasion by binding to and negatively regulating VEGFR and by stabilizing cell-cell adhesions via tyrosine dephosphorylation of VE-cadherin (48). A similar mechanism is likely to be associated with GIST metastasis, with the function of PTPN1 depending largely on its absolute abundance. The functions and mechanisms of PTPN1 during GIST progression require more intensive study in future research.
Positive expression of KIT is very common in GIST patients (56). Among all our GIST cases, only one patient was KIT negative. Although we compared the proteomic features of GIST subgroups without considering the influence of KIT, we found that the differences in proteomic patterns between the KIT-negative and KIT-positive cases were obvious in the NIH-I subgroup (Fig. 3A). More KIT-negative GIST cases may be required to study the impact of KIT on the GIST proteome.
However, it should be mentioned that the sample set used for proteomic discovery in this study was relatively small, because the surgical principle in China for localized and potentially resectable GIST is R0 resection, which makes it very difficult to obtain paired adjacent nontumorous tissue after surgery (57). The small sample size for discovery might affect the quantification accuracy of some proteins and make the global statistical analyses less robust. Some interesting and important discoveries in this study remain to be verified in a larger GIST cohort when samples become available.
The heatmap was created in Excel. C, The fold changes (T/N) of the phosphatase PTPN1 in all GIST subgroups measured by MS. D, Western blotting results of PTPN1 in paired tumorous and nontumorous tissues of GIST subgroups. E, Immunohistochemical analysis of PTPN1 in GIST tumorous tissues. The left picture represented positive PTPN1 expression, and the right picture represented negative PTPN1 expression. F, The average IOD of PTPN1 in GIST subgroups. p values were calculated by Student's t test. G, The positive ratios of PTPN1 in GIST subgroups based on the immunohistochemical analysis of 117 GIST cases; 14 of the 131 cases were excluded because the observable areas of these tissues in the microarray were too small. p values were calculated by Student's t test. H, The relationship between PTPN1 abundance and tumor metastasis. High proportions of metastatic cases were observed in the groups with low PTPN1 expression.
Acknowledgments-We thank the Department of West China Hospital biobank for its assistance in specimen collection, and the Department of Pathology of the Second People's Hospital of Neijiang City for providing technical assistance during the experiment.

DATA AVAILABILITY
All mass spectrometry raw data and the MaxQuant output tables have been deposited to iProX and are available using the iProX accession: IPX0001353000.
This article contains supplemental material.
To whom correspondence may be addressed. E-mail: lunzhi.dai@scu.edu.cn.
Competing financial interests: There is no competing financial interest.