Defining protein pattern differences among molecular subtypes of diffuse gliomas using mass spectrometry

Molecular characterization of diffuse gliomas has thus far largely focused on genomic and transcriptomic interrogations. Here, we utilized mass spectrometry and overlaid protein-level information onto genomically-defined cohorts of diffuse gliomas to improve our downstream molecular understanding of these lethal malignancies. Bulk and macrodissected tissues were utilized to quantitate 5,496 unique proteins over 3 glioma cohorts subclassified largely based on their IDH and 1p19q co-deletion status (IDH wildtype (IDHwt), n=7; IDH mutated (IDHmt), 1p19q non-codeleted, n=7; IDH mutated, 1p19q-codeleted, n=10). Clustering analysis highlighted proteome and systems-level pathway differences in gliomas according to IDH and 1p19q-codeletion status, including 287 differentially abundant proteins in macrodissection-enriched tumour specimens. IDHwt tumours were enriched for proteins involved in invasiveness and epithelial to mesenchymal transition (EMT), while IDHmt gliomas had increased abundances of proteins involved in mRNA splicing. Finally, these abundance changes were compared to IDH-matched glioblastoma stem-like cells (GSCs) to better pinpoint protein patterns enriched in putative cellular drivers of gliomas. Using this integrative approach, we outline specific proteins involved in chloride transport (e.g. chloride intracellular channel 1, CLIC1) and EMT (e.g. procollagen-lysine, 2-oxoglutarate 5-dioxygenase 3, PLOD3 and serpin peptidase inhibitor clade H member 1, SERPINH1) that showed concordant IDH-status dependent abundance differences in both primary tissue and purified GSC cultures. Given the downstream position proteins occupy in driving biology and phenotype, understanding the proteomic patterns operational in distinct glioma subtypes could help propose more specific, personalized and effective targets for the management of patients with these aggressive malignancies.


Introduction
Diffuse gliomas represent the most common intrinsic brain tumour type, but as a group, carry a remarkably variable clinical course (1)(2)(3). Some rapidly evolve, while others remain relatively stable for years before progression. Therefore, precise characterization of the unique biology of the different glioma subtypes is essential for personalized management. The current state-of-the-art classification system of adult gliomas relies on an integrated approach that incorporates histomorphologic and genomic features of malignancy. First, diffuse gliomas are subdivided into classes based on their microscopic resemblance to native astrocytes (astrocytomas) and/or oligodendrocytes (oligodendrogliomas). This is accompanied by quantifying features of aggressiveness including presence of nuclear atypia, proliferative activity and necrosis & microvascular proliferation (World Health Organization (WHO) grade II-IV, respectively) (4). To combat subjective inter-observer variability and more precisely resolve prognostically relevant subgroups, this microscopic exam is now supplemented with a handful of clinically relevant molecular alterations (5)(6)(7)(8). Today, this primarily includes assessment of mutations in isocitrate dehydrogenase genes (IDH1/2) and presence of 1p & 19q chromosomal arm co-deletions (1p19q codel). Together, this integrated approach defines 3 major diffuse glioma sub-groups: (i) IDH-wildtype (IDHwt) astrocytomas, (ii) IDH-mutated (IDHmt) astrocytomas, and (iii) IDHmt, 1p19q codeleted oligodendrogliomas (9-11). These distinct molecular groups are of tremendous clinical interest as they have both biological and prognostic significance. The vast majority of IDHwt astrocytomas represent glioblastomas (GBMs), that is, gliomas that show high grade (WHO grade IV) features at presentation and arise "de novo" in older patients with no prior history of a lower grade lesion. 
IDHmt gliomas, however, often arise as lower grade tumours (WHO grade II-III) in younger individuals and, with time, undergo secondary malignant transformation to higher grade tumours (WHO grade III-IV), eventually becoming glioblastoma. While IDHmt gliomas share some clinical and biological similarities, such as arising in the frontal cortical brain regions of young adults, astrocytic lesions (1p19q non-codel) usually evolve into "secondary" glioblastoma (WHO grade IV) and have an aggressive clinical course. IDHmt gliomas with 1p19q codeletions have distinct molecular drivers and follow a more indolent clinical course, even following transformation.
Although more common in younger patients, circumscribed gliomas that lack brain infiltration, a key feature of WHO II-IV gliomas, also exist. One such entity, referred to as pilocytic astrocytoma (WHO grade I), is defined histomorphologically by piloid cytoplasmic processes, Rosenthal fibers, microvascular proliferation and chronic inflammation. These tumours have quite distinct and recurring genomic alterations (BRAF duplications/fusions) that drive their distinct and more indolent biology.
Notwithstanding these significant milestones in our genome-level understanding and classification of diffuse gliomas, treatments between patients and even across molecular subgroups remain relatively uniform, non-specific and ineffective (12,13). The existing multi-platform genomic data (RNA, DNA-copy number and DNA-methylation) however continues to [...] improve proteomic coverage, considerably compromises sample throughput and eventual clinical utility (16). To overcome these practical limitations and determine if and what proteome-level differences exist between diffuse glioma subtypes, we decided to use a recently validated FFPE-compatible LC-MS/MS workflow (20,21). We previously showed that this approach can generate biologically relevant signatures from minute amounts of microscopically defined tissue, making it a useful tool to study regionally heterogeneous tissue types. Although this abbreviated approach is not designed to provide comprehensive coverage of the proteome of profiled samples, it provides sufficient protein sequencing depth and more acceptable sample throughput to resolve distinct pathways operational in different cellular compartments (20). Here, we extend this shotgun LC-MS/MS workflow to provide an overview of the proteomic landscapes of genetically defined glioma subgroups. Importantly, we highlight pathway and individual protein-level differences in different glioma subtypes that are maintained in complementary glioma stem cell cultures, a larger publicly available genomic dataset and by immunohistochemical staining.
Together, our dataset, which helps nominate key macromolecules and processes that remain operational at multiple levels of glioma biology, provides a rich resource for hypothesis generation and for the selection of robust therapeutic targets for these aggressive brain neoplasms.

Experimental Design and Statistical Rationale
We developed 3 unique glioma cohorts to interrogate protein-level differences of glioma biology (See Supplemental Table 1 for clinical and molecular details). Briefly, our first cohort consisted exclusively of bulk fresh frozen tissue of different glioma subtypes (n=15, 14 of these were primary non-previously treated/recurring tumours). This included 3 pilocytic astrocytomas (pediatric, WHO grade I), 2 IDHwt astrocytomas (WHO grade IV), 6 IDHmt, 1p19q-codel oligodendrogliomas (WHO grade II-III) and 4 IDHmt, 1p19q non-codeleted astrocytomas (WHO grade II-IV). One of these latter astrocytic cases (WHO grade III) was a recurrence of a previous lesion. Here, the pilocytic astrocytomas were meant to serve as neoplastic controls to avoid comparing protein profiles of the WHO II-III diffuse gliomas to non-neoplastic brain tissue. Our FFPE cohort similarly consisted of both glioma (e.g. 3 IDHmt, non-codel; 4 IDHmt, codel and 5 IDHwt gliomas) and non-glioma control neoplastic specimens (e.g. meningioma and medulloblastoma). All gliomas in this cohort represented primary, non-recurrent, untreated tumours. For statistical analyses, independent tumour samples were treated as biological replicates. Glioma samples were obtained from the Canadian Brain Tumour Foundation, international collaborators (Carlo Besta Neurological Institute) and the in-house UHN tumour archive, and their use was approved by the UHN Research Ethics Board (REB). In our experience, resected glioma tissue presents a unique challenge as it is often contaminated with vastly varying amounts of non-lesional tissue (whole slide tumour purity: 45-95%; See Supplemental Table 1). Therefore, while frozen tissue is often preferred for molecular analysis, the goal of the additional FFPE cohort was to assess whether potential glioma molecular subgroup differences are maintained in macrodissected tumour-enriched regions free of large areas of potentially confounding tissue elements between subgroups (e.g. necrosis, brain tissue).
Careful macrodissection of these FFPE cases allowed us to reach more homogenous tumour purities (80-95%; See Supplement Table 1 for pre-and post-macrodissection purity of each case). As it was impractical to remove all non-neoplastic tissue elements intermixed within the tumour tissue (vasculature and infiltrated brain), our final cohort consisted of GBM stem-like cell cultures and served to further validate if subtype-specific profiles identified in primary tumours were maintained in more purified populations of tumour cells. All cases were reviewed by at least two neuropathologists and additional ancillary molecular studies (e.g. immunostaining, sequencing and cytogenetics) were used to appropriately group our cohort into the major genomically-defined diffuse glioma subtypes for analysis (See Supplemental Table 1; "Class for Analysis").

Glioma stem-like cell (GSC) culture conditions
To study the protein patterns of more purified and controlled populations of glioma samples, we used three patient-derived cultures, either IDHwt (n=2) or IDHmt (n=1, recurrence). GSCs were grown in duplicate experiments as neurospheres in serum-free media containing EGF/FGF (22) (Base media: DMEM/F12 media, 1% Antibiotic-Antimycotic, 1% L-glutamine, 0.2% B27) or as differentiated cells in serum-supplemented media (base media with 10% FBS) without growth factors for eight days. We reasoned this variable exposure to serum would help capture protein patterns of both precursor and more mature neoplastic cells known to heterogeneously comprise the tumour bulk. For sample collection (day 0 and 8), media was withdrawn and cells were washed twice with ice cold 1x PBS. Cells were either centrifuged (as undifferentiated) or scraped in lysis buffer (8 M urea, 0.05 M ammonium bicarbonate in ddH2O) and prepared for MS as described below.

Sample Preparation
LC-MS/MS analysis was performed on a Thermo Scientific Q Exactive Plus high-resolution mass spectrometer as previously described (20). Briefly, for FFPE tissues we prepared 10 µm thick sections on charged glass slides and stained them with hematoxylin and eosin (H&E) to highlight tumour regions free from contaminating surrounding brain tissue, hemorrhage and tissue necrosis. Areas enriched in lesional tissue were carefully macrodissected off adjacent deeper slides with a scalpel and proteins were prepared in MS-compatible cell lysis reagent (0.2% Rapigest™, Waters Corp.) by sonicating on ice for 5 cycles of 10 seconds each. Protein crosslinks were reversed by boiling (at 95ºC) in the presence of 5µM DTT for 60 minutes followed by 80ºC for 90 minutes. Proteins were quantitated using Coomassie protein quantification assay (Pierce) and trypsin (Sigma) digestion was performed overnight at 37ºC. 15 μg of trypsin-digested peptides were cleaned in C18 tips for MS analysis. Fresh frozen bulk tumours and tissue culture samples were processed identically but in 8M urea lysis reagent without the 95ºC and 80ºC crosslink reversal step. Tissue lysates from each sample were processed in trypsin and unfractionated digests were analyzed by LC-MS/MS to generate proteomic profiles. For all MS/MS experiments, a liquid chromatography and nanoelectrospray pump was used in a 120 min (fresh frozen tumour and GSC samples) or 90 min (FFPE tumour samples) data-dependent acquisition (DDA) program and raw data files were searched using the MaxQuant (version 1.5.5.1) Andromeda search engine (Supplemental Table 11 describes sample identifiers used in raw mass spectrometry files). Searches were performed against the Swiss-Prot database (www.uniprot.org July, 2017 release, 41,345 protein entries). Searches were performed for trypsin-digested peptides with up to 2 missed cleavages permitted.
and tables of all identified proteins for the three datasets are included (Supplemental Tables 2-4).

Label Free Quantification (LFQ) Mass Spectrometry
Each sample was concentrated using Omix C18 (Agilent Technologies) tips and eluted with 3 µL of buffer A (0.1% formic acid, 65% acetonitrile). To each sample, 57 µL of buffer B (0.1% formic acid) was added, of which 10 µL (1.5 µg of peptides) was loaded from a 96-well microplate autosampler onto a C18 trap column using the EASY-nLC1000 system (Thermo Fisher [...] to 3e6 with maximum injection time (IT) of 100 ms, MS2 AGC was set to 5e4 with maximum IT of 50 ms, isolation window was 1.6 Da, underfill ratio 2%, intensity threshold 2e4, normalized collision energy (NCE) was set to 27, charge exclusion was set to fragment only 2+, 3+ and 4+ charge state ions, peptide match set to preferred and dynamic exclusion set to 42 (for 90 min method) or 48 seconds (for 120 min method).

Dataset Availability
The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE (23,24) partner repository with the dataset identifier PXD010099 (

Biostatistical and informatics analysis
Raw data files were searched using the MaxQuant Andromeda search engine (www.coxdocs.org) against the Human Swiss-Prot protein database (July, 2017 version) using the match between runs algorithm. Analysis of proteomic datasets was performed using the biostatistics software platforms Perseus (www.coxdocs.org), R (www.r-project.org), and Cytoscape (www.cytoscape.org). LFQ intensity values were used to perform statistical analysis by filtering out protein IDs with less than 60% of samples containing a measurement and imputing the remaining missing values (width 0.42, downshift 1.78).
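The filtering and imputation step can be illustrated with a minimal sketch. It assumes log2-transformed LFQ intensities and assumes that missing values are drawn, per sample, from a normal distribution whose mean is shifted down by `downshift` standard deviations and whose spread is `width` standard deviations of the observed values; the text gives only the two parameters, so the per-sample convention and the function name `filter_and_impute` are illustrative assumptions, not the published pipeline.

```python
import random
import statistics

def filter_and_impute(log2_lfq, min_valid_fraction=0.6,
                      width=0.42, downshift=1.78, seed=0):
    """Hypothetical helper. log2_lfq maps protein ID -> list of log2
    intensities (None = missing), one entry per sample."""
    rng = random.Random(seed)
    # Keep proteins quantified in at least min_valid_fraction of samples.
    kept = {p: v for p, v in log2_lfq.items()
            if sum(x is not None for x in v) / len(v) >= min_valid_fraction}
    imputed = {p: list(v) for p, v in kept.items()}  # copy, input untouched
    n_samples = len(next(iter(kept.values()))) if kept else 0
    for j in range(n_samples):
        # Assumption: the imputation distribution is estimated per sample
        # from observed values (needs at least two observed values).
        observed = [v[j] for v in kept.values() if v[j] is not None]
        mu = statistics.mean(observed)
        sd = statistics.stdev(observed)
        for p in imputed:
            if imputed[p][j] is None:
                # Down-shifted, narrowed normal mimics low-abundance signal.
                imputed[p][j] = rng.gauss(mu - downshift * sd, width * sd)
    return imputed
```

Observed values pass through unchanged; only the gaps in proteins that survive the 60% filter are filled.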
Cellular components and biological processes were assessed by uploading Gene names to DAVID functional annotation tool (https://david.ncifcrf.gov). Pearson correlations were performed by omitting imputed values and using protein measurements found in 100% of the analyzed samples. For hierarchical clustering protein measurements were Z-score transformed and plotted using complete Euclidean hierarchical clustering method with K-means preprocessing using either whole datasets or only statistically significant protein changes. Protein level changes between glioma subgroups were determined by either modified Welch's student t-test or multi-sample ANOVA using FDR adjustment of 0.1, as indicated. GO term enrichment analysis was performed using the Fisher exact test with FDR cut-off of 0.02. Gene set enrichment analysis (GSEA) was used to define driver pathways within subgroups using the geneset permutation type with a weighted t-test enrichment statistic of 0.1. Heatmaps and networks of the proteins in GSEA were visualized using Cytoscape.
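As a rough illustration of the two-group comparison, the sketch below computes Welch's t statistic with its Welch-Satterthwaite degrees of freedom and applies a Benjamini-Hochberg adjustment to a list of p-values. Two caveats: the Benjamini-Hochberg procedure is swapped in here as a simple stand-in and may differ from the FDR procedure actually used, and the "modified" form of the test mentioned above is not reproduced. Function names are hypothetical.

```python
import math
import statistics

def welch_t(a, b):
    """Welch's t statistic and degrees of freedom for two samples."""
    va = statistics.variance(a) / len(a)  # squared standard error, group a
    vb = statistics.variance(b) / len(b)  # squared standard error, group b
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

Proteins whose adjusted value falls below the chosen threshold (0.1 in the text) would then be called differentially abundant.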

Tissue microarray and glioblastoma atlas RNAseq database analyses
In order to validate protein expression patterns in an independent cohort of gliomas, we assembled and constructed a tissue microarray of 22 non-treated primary glioma tissue samples (See Supplemental Table 1 for clinical details). FFPE tissue arrays consisting of 3 mm cores (diameter) were sectioned to 5 µm in thickness, and slides were deparaffinized in xylene and rehydrated in decreasing concentrations of ethanol. Antigen retrieval was then performed in citrate dihydrate at pH 6.0 by boiling in a pressure cooker for 20 minutes. Slides were then washed three times in PBS and endogenous peroxidase activity was blocked by incubation in 3% hydrogen peroxide in methanol for 20 minutes. Slides were then blocked for 1 hour in 10% FBS, 0.1% Triton X-100. Antibody incubation was performed with Anti-PLOD3 antibody (Sigma, HPA001236) in the blocking buffer overnight at 4℃ followed by washing and incubation with anti-rabbit HRP secondary antibody for 1 hour at room temperature. To further confirm our defined proteomic profiles, we leveraged the expansive glioma subtype cohorts of

Protein patterns define glioma subgroups that correlate with genomic alterations
To determine if proteomic analysis can cluster gliomas into groups that correlate with established genomic subtypes, we first [...] Table 1) (Fig. 1A). Macrodissected FFPE tumour regions yielded ~2,500 proteins, comparable to the ~2,900 proteins per sample isolated from bulk fresh frozen tumour and cultured precursor specimens, with significant overlaps in protein IDs (Fig. 1B). Thus, we further confirm that FFPE tissue proteomics, with slight technical modifications in sample preparation, can yield MS-based proteomic datasets comparable to their fresh-frozen tissue counterparts. In fact, the only distinguishing feature seems to be the slightly larger percentage of proteins identified by single peptide hits (41% vs. 32% in FFPE vs. fresh-frozen tissue, respectively) (Fig. 1B). Altogether, we recovered 5,496 proteins from our integrated effort, producing the largest proteomic resource of gliomas to date and expanding on previous analyses of glioma tissues using mass spectrometry (25) and MALDI mass spectrometry (26) (Fig. 1B). Importantly, proteins derived from both biopsy sources had high degrees of similarity in molecular functions, biological processes and cellular components (Fig. 1C).
Unsupervised hierarchical clustering of Pearson correlations of protein intensities across samples in frozen and FFPE tumour cohorts found 3 major branches of proteomic patterns (Fig. 1D). In the frozen tissue cohort, pilocytic astrocytomas (WHO grade I) formed their own branch, with a separate branch of WHO Grade II-IV IDHmt codel and non-codel gliomas.
To validate and extend our findings, we performed a similar analysis using our macrodissected FFPE tissue cohort. In addition to a non-glioma cluster, IDHwt formed a distinct branch from IDHmt gliomas (1p19q-codel and non-codel). In both cohorts, this unsupervised analysis of the entire dataset highlighted that there are distinct protein pattern differences that closely reflect the well-understood histologic and molecular glioma subgroups.
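A minimal sketch of correlation-based sample clustering is shown below. It assumes a distance of 1 minus the Pearson correlation between sample profiles and complete linkage; the Methods describe Euclidean distance with K-means preprocessing, so this is an illustrative simplification, and the function names are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def complete_linkage_clusters(profiles, n_clusters):
    """profiles maps sample name -> list of protein intensities.
    Agglomerates samples until n_clusters groups remain."""
    names = list(profiles)
    # Assumption: dissimilarity is 1 - Pearson r between sample profiles.
    dist = {(a, b): 1 - pearson(profiles[a], profiles[b])
            for a in names for b in names}
    clusters = [{name} for name in names]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Complete linkage: distance of the farthest member pair.
                d = max(dist[a, b] for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] |= clusters[j]
        del clusters[j]
    return clusters
```

Cutting the tree at three groups would correspond to the three major branches described for each cohort.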

Supervised proteomic analysis resolves distinct pathway-level differences among glioma subtypes
To identify proteins and potential biological pathways driving clustering divergence between molecular subtypes of gliomas, we performed multi-sample ANOVA analysis in the two glioma datasets (Fig. 2A). In the bulk frozen tumour comparisons, major differences in protein abundance levels were due to the highly distinct molecular landscape of pilocytic astrocytomas (486 proteins, FDR<0.1, Supplemental Table 5). Hierarchical clustering of samples reinforced tumour type-specific differences, where IDHwt gliomas clustered in their own node with a secondary bimodal node composed of pilocytic astrocytomas and an intermixed IDHmt branch of 1p19q codeleted and non-codeleted diffuse gliomas (Fig. 2A). Relative to the entire dataset of quantified proteins, these 486 protein groups are significantly enriched for the mitochondrion with biological functions of developmental growth, metabolism and catabolic processes (Fisher exact test, FDR<0.1, Supplemental Table 6). Biological processes enriched in each tumour molecular subgroup were investigated by defining 4 protein clusters: IDHwt diffuse glioma and pilocytic astrocytoma specific (Cluster 1), IDHwt diffuse glioma exclusive (Cluster 2), IDHmt (Cluster 3) and IDHwt depleted (Cluster 4). Only clusters 1 and 3 had statistically enriched gene ontology (GO) biological processes (FDR<0.1, Table 1). This analysis revealed that IDHwt tumours exhibit a higher inflammatory response while pilocytic astrocytomas were defined by elevated membrane invagination and endocytosis. Tumour-infiltrating lymphocytes are a common and well recognized phenomenon in lower grade gliomas, highlighting the power of our analysis to capture heterogeneous biological elements and processes within profiled tissues. These pathway-level findings were further recapitulated when low grade pilocytic astrocytomas (pilo) were compared to higher grade tumours using the GSEA approach (Supplemental Fig. 1).
For example, pilocytic astrocytomas are represented by interferon proteins and, most notably, integrins and other proteins of the extracellular matrix and NABA matrisome pathway. In contrast, pilocytic astrocytomas had reduced levels of proteins involved in glutamate and other synaptic signaling proteins that were enriched in higher grade tumours. Expression of neurotransmitter pathways is an emerging phenomenon in glioma biology with promising therapeutic potential (27)(28)(29). Conversely, it could represent intervening infiltrated brain tissue that would be over-represented in diffuse gliomas. Notably, higher grade tumours also demonstrated an enrichment of MYC target genes and mRNA processing.
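The Fisher exact test used for the GO term enrichment in this section can be sketched as a one-sided hypergeometric tail probability. The one-sided (over-representation) form is an assumption, as the text does not state sidedness, and the function and argument names are hypothetical.

```python
from math import comb

def fisher_enrichment_p(k, n, K, N):
    """One-sided over-representation p-value.
    N: proteins in the background (all quantified proteins)
    K: background proteins carrying the annotation
    n: proteins in the significant list
    k: significant proteins carrying the annotation
    Returns P(X >= k) for X ~ Hypergeometric(N, K, n)."""
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)
```

For a term such as "mitochondrion", `N` would be the number of quantified proteins and `n` the 486 significant proteins; the resulting p-values across all terms would then be FDR-adjusted.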

IDHmt gliomas bear distinct proteomic signatures relative to IDHwt GBMs
While our analysis of frozen bulk tumour biopsies was successful at defining proteomic differences between different tumour grades, differentiation between 1p19q codeleted and non-codeleted diffuse gliomas was less pronounced. We reasoned this could be a result of contaminated normal and non-informative tumour elements (e.g. necrosis). Thus, to define proteomic differences between these more closely related tumours, we performed macrodissection of lesional tumour regions from normal brain and necrotic tumour tissue within FFPE surgical biopsies. By performing multiple sample ANOVA, we identify 287 proteins with significant level differences (Permutation-based FDR<0.1) between the three glioma subtypes defined by their IDH and codeletion status (Supplemental Table 7, Fig. 3A). This approach resulted in robust clustering of IDHmt, codeleted oligodendrogliomas and non-codel astrocytomas with a separate cluster of IDHwt GBMs. Three clusters of protein abundance patterns are apparent by hierarchical clustering composed of astrocytoma-specific (Cluster 1), IDHwt-specific (Cluster 2) and IDHmt-specific (Cluster 3) proteins. GO term enrichment analysis revealed that Cluster 2 proteins largely belong to the extracellular matrix involved in exocytosis (Supplemental Table 8). Distinct proteomic landscapes of macrodissected tumours were further demonstrated by PCA where IDHmt, non-codel astrocytomas and IDHmt,codel oligodendrogliomas occupied a region distinct from IDHwt GBMs (Supplemental Fig. 2). Direct comparison of IDHmt, non-codel to IDHwt astrocytomas reveals 116 significantly different proteins (Supplemental Table 9, Fig. 3B). Among the most abundant IDHwt GBM proteins is retinol-binding protein 1 (RBP1), known to be frequently hypermethylated in IDHmt gliomas and negatively correlated with patient survival (31).
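To make the multi-group comparison concrete, the sketch below computes a one-way ANOVA F statistic for a single protein across the three subtypes and estimates a p-value by permuting sample labels. This is a per-protein label-permutation test offered as an assumption-laden illustration; it is not the dataset-wide permutation-based FDR procedure referred to above, and the function names are hypothetical.

```python
import random

def f_statistic(groups):
    """One-way ANOVA F statistic for a list of groups of values."""
    values = [x for g in groups for x in g]
    grand = sum(values) / len(values)
    ss_between = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    if ss_within == 0:
        return float("inf")  # no within-group variation at all
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

def permutation_anova_p(groups, n_perm=2000, seed=0):
    """Return (observed F, permutation p-value) by shuffling labels."""
    rng = random.Random(seed)
    observed = f_statistic(groups)
    pooled = [x for g in groups for x in g]
    sizes = [len(g) for g in groups]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        shuffled, start = [], 0
        for size in sizes:
            shuffled.append(pooled[start:start + size])
            start += size
        if f_statistic(shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (hits + 1) / (n_perm + 1)
```

Each group would hold the log2 intensities of one protein in one glioma subtype (IDHwt; IDHmt, non-codel; IDHmt, codel).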
To more systematically interrogate proteins with abundance level differences between IDHmt and IDHwt tumour samples, we performed GSEA on our datasets (Supplemental Fig. 3A). Consistent with the GO term enrichment analysis, proteins more abundant in IDHwt tumours belong to the extracellular matrix (i.e. NABA matrisome and laminin interactions), BRAF signalling and epithelial to mesenchymal transition (EMT) processes. On the other hand, proteins enriched in the IDHmt tumours are involved in oxidative-induced senescence, non-coding RNA metabolism, class I HDACs and cadherin signaling. Proteins belonging to the epithelial to mesenchymal transition are traditionally thought to be involved in increasing the stem cell pool of the proliferating tumour, increasing its likelihood of metastasis and radiotherapy resistance (32).
Consistent with this, IDHwt GBMs exhibited high levels of EMT-related proteins such as TGFBI, TAGLN, DCN and FN1 as well as several collagen chain proteins (i.e. COL1A2, COL6A2 and COL4A2) (Fig. 3C). Members of this group of genes were also detected in a previous RNAseq analysis comparing GBMs to normal brain tissue (33). In our analysis, we demonstrate that these EMT genes are detectable at the protein level and are more specifically enriched in IDHwt GBMs. In contrast, IDHmt GBMs contain an enrichment of HDAC-mediated chromatin remodelling and mRNA post-transcriptional regulation in mRNA splicing (Supplemental Fig. 3 and Fig. 3C).
To validate the generalizability of our proteomic signatures, we leveraged the large resource of RNA transcriptional profiles of brain tumors in The Cancer Genome Atlas (TCGA) consortium database (https://cancergenome.nih.gov). We extracted RNAseq values of 421 glioma tumour samples that included IDHwt and IDHmt GBMs, as well as IDHmt astrocytomas and oligodendrogliomas. Correlating our MS-based measurements of 2,567 proteins to corresponding RNAseq values demonstrates that our shotgun MS strategy quantitates some of the highest expressing genes (Supplemental Fig. 4A).
Direct comparisons of RNAseq-to-protein abundance values from our datasets indicate some protein-level changes between molecular subtypes that could have been missed by previous transcriptome-level analysis. Spearman correlations between RNAseq and MS-based quantifications demonstrate that the highest levels of similarity of ~0.5 are found in corresponding tumour types (Supplemental Fig. 4B). This calculation supports the relatively low level of correlation between RNA and protein measurements previously observed in other systems. Importantly, and despite the limitations of RNA-protein correlations, when we extracted RNAseq values of 260/287 proteins we defined as significantly different between anaplastic oligodendrogliomas and GBMs, hierarchical clustering correctly separates samples according to their IDH-mutation status (Supplemental Fig. 4C). Thus, although our MS-based proteomic signatures of IDHwt GBMs were derived from a limited number of patient tissues, they hold true when extrapolated to an independently derived, sufficiently large, glioma cohort.
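The RNA-to-protein comparison rests on the Spearman rank correlation, which can be sketched as a Pearson correlation computed on average ranks. The function names below are hypothetical, and matching RNAseq and protein values by gene name is assumed to have been done beforehand.

```python
import math

def average_ranks(values):
    """1-based ranks; tied values share the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the block of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman rank correlation of two equal-length numeric lists."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / math.sqrt(sxx * syy)
```

Here `x` would hold the RNAseq values and `y` the LFQ intensities for the same ordered set of genes within one tumour type.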
Interestingly, although there is a low number of IDHmt GBMs (n=10) in the TCGA cohort, 8/10 co-clustered with other IDHmt tumour samples, reflecting this subtype's distinct molecular landscapes compared to IDHwt neoplasms. Given that our proteomic profiles were generated from macrodissected FFPE tissues, we also utilized an independent RNAseq dataset of [...] there are modest levels of clustering of intra-tumoural regions suggesting that our macrodissected proteomic profiles indeed capture sub-tumour specific regional profiles (Supplemental Fig. 4D).

Proteomes in GBM-derived stem-like cell lines maintain IDH-genotype specific tumour patterns
Our global proteomic analysis of primary FFPE GBM tissues revealed proteomic signatures that were specific to the IDHmutation status of tumours. To more accurately define relevant proteins to tumour cells and perhaps even stem-like cells, we analyzed proteomes of established GBM-derived stem-like cell lines (GSCs) from the MD Anderson Cancer Center (34, 35).
We reasoned that proteins identified in both primary tumour tissue cohorts and confirmed in GSC cultures could serve as tumour cell-specific candidate proteins that could be prioritized for future functional studies. Towards this, we analyzed three GSC lines (IDHwt GSC1, IDHwt GSC-2 and IDHmt GSC1) in their undifferentiated state, in presence of EGF and FGF, or as differentiating cells in serum without growth factors for 8 days (Supplemental Fig. 5A). Hierarchical clustering yields two main nodes of proteome patterns, separating the IDHmt and IDHwt samples irrespective of culture conditions (Supplemental Fig. 5B). Multiple sample ANOVA identified 845 significantly different proteins that distinguish cell lines according to their IDH mutation status (Supplemental Table 10 and Supplemental Fig. 5C). While we detected some proteome-level changes after one week in serum supplemented media, inter-GSC proteome differences and IDH-status were a more dominant feature driving clustering (Supplemental Fig. 5D).
Importantly, of the 287 protein IDs with significantly different abundance levels in our primary GBM tumour analysis, we found that 170 proteins were also detected in GSC cultures. When extrapolated, these 170 proteins accurately segregate the cell lines into hierarchical clusters of IDHwt and IDHmt GSCs (Fig. 4A). We find that IDHmt-GSCs contained relatively low levels of classic precursor markers such as Nestin (NES), glial fibrillary acidic protein (GFAP) and vimentin (VIM), compared to IDHwt-GSCs. Direct comparisons of primary glioma tumours to GSCs using these overlapping proteins by Spearman Rank Coefficients demonstrate that IDHwt GBMs more closely resemble IDHwt than IDHmt GSCs (Fig. 4B).
Although overall correlation of fold changes in protein levels are modest (Spearman rank IDHwt vs. IDHmt GSC/GBM=~0.56) a subset of 79 proteins have a remarkably similar level change in both primary tumours and cultured GSCs (Fig. 4C). These proteins encompass some previously discussed biological functions of EMT (e.g. PLOD3, SERPINH1 and COL4A2), interferon and integrin signaling (STAT1 and ITGB1) and the extracellular matrix (LMNB2, LAMA5 and ANXA1). GO term enrichment analysis of these proteins revealed that they are largely located to the endoplasmic reticulum and are involved in response to hormone stimulus and stress (Fig. 4D). To begin validating our identified candidates in a larger external cohort, we carried out orthogonal immunohistochemical staining of PLOD3, an endoplasmic reticulumlocalized protein we found to be enriched in both IDHwt tissues and GSCs, in an independent tissue microarray (TMA) of 22 additional glioma samples. This analysis revealed a variable expression pattern across the tissues ranging from being restricted to vessels to low, moderate or high staining patterns within the tumour cells (Fig. 4E). We also used the available immunohistochemically-stained cores of the online www.proteinatlas.org resource (containing a TMA of high (n=10) and low grade (n=2) gliomas) to validate our findings. While both low grade gliomas had undetectable PLOD3 in this dataset, 6/10 high grade gliomas had moderate or high levels of PLOD3. A similar frequency patterns was observed in our own TMA further validating the robust protein differences we identified in our discovery MS cohorts (Fig. 4F). While this online resource does not contain comprehensive glioma molecular information, our genomically-annotated TMA confirms that vigorous and diffuse staining of PLOD3 protein is a feature of IDHwt tumours (6/12) and found at much lower levels in IDHmt lesions (Fig 4F, Supplemental Fig. 6). 
To lend further support to the biological relevance of the candidates defined in our cohort, we reasoned that protein candidates such as PLOD3 or the chloride signalling-related protein CLIC1, found to be elevated in our IDHwt vs. IDHmt comparisons, could also correlate with clinical outcomes (survival) in larger clinical cohorts. Indeed, stratifying tumours based on the available RNA expression of CLIC1 and PLOD3 in the online www.proteinatlas.org dataset yielded 3-year survivals of 10% and 19% for low-expressing tumours and 0% and 6% for high-expressing tumours, respectively (Fig. 4G).
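
The 3-year survival figures above are the kind of value read off a Kaplan-Meier curve. The following is a minimal sketch of that estimator on made-up follow-up data for a high- and a low-expression group. The times, censoring flags and function names are hypothetical; it does not reproduce the www.proteinatlas.org analysis or its log-rank test.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one death.

    times  -- follow-up in years
    events -- 1 if death was observed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            surv *= 1 - deaths / at_risk
            curve.append((t, surv))
        # Deaths and censored patients both leave the risk set.
        at_risk -= leaving
    return curve

def survival_at(curve, t):
    """Read the step function: last estimate at or before time t."""
    s = 1.0
    for time, surv in curve:
        if time > t:
            break
        s = surv
    return s

# Made-up cohorts for illustration only.
high_times, high_events = [0.5, 1.0, 1.5, 2.0, 2.5], [1, 1, 1, 1, 1]
low_times, low_events = [1.0, 2.0, 2.5, 4.0, 5.0], [1, 0, 1, 1, 0]

high_3y = survival_at(kaplan_meier(high_times, high_events), 3.0)
low_3y = survival_at(kaplan_meier(low_times, low_events), 3.0)
print(f"3-year survival, high expression: {high_3y:.0%}")
print(f"3-year survival, low expression: {low_3y:.0%}")
```

Censored patients contribute to the risk set until their last follow-up, which is why the estimate differs from a simple fraction of patients alive at 3 years.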

Discussion
Our molecular understanding of brain tumours has been largely dominated by genomic tools and profiling approaches. Given the functional role proteins play in biology, complementary approaches that overlay global, unbiased, translational readouts onto well-characterized genomic events stand to substantially improve our models of gliomas. While accumulating studies are now beginning to unravel the glioma proteome, our study offers distinguishing insights. Firstly, we performed shotgun MS analysis on diverse gliomas spanning the common genetic subtypes, providing important insights into inter-tumoural differences and thus extending previous comparisons to normal tissue. Specifically, we demonstrate that pilocytic astrocytomas are proteomically distinct from higher grade infiltrative tumours even when bulk frozen tumours are analyzed.
Given that such shotgun LC-MS/MS workflows can be rapidly carried out on FFPE tissue, optimized proteomic assays (e.g. Selected Reaction Monitoring, SRM) could provide an alternative molecular tool for rapidly differentiating between these clinically distinct glial neoplasms. While differences between these circumscribed lesions were easily resolved in bulk tissue, our analysis further highlighted the challenges of expression-based assays for more infiltrative tumours (36). Using our FFPE cohort, we show how this can be overcome by macrodissecting away large regions of normal brain and necrotic tissue that can sometimes dominate a biopsy or resection specimen. We use this approach to propose biological pathways operational in distinct glioma subtypes. By comparing proteins similarly enriched in patient tissue and GSCs derived from the same molecular background, we further refine promising tumor-specific candidates for future subtype-specific mechanistic and therapeutic studies. For instance, our data show that CLIC1, a chloride intracellular channel protein, and PLOD3, a collagen-crosslinking protein of the endoplasmic reticulum that is involved in EMT, are enriched in IDH wildtype glioma cells. Building on other studies that connect CLIC1 expression to poor survival in gliomas, we now specifically associate its activity with the GSC compartment (47,48). This is an exciting target, as the biguanide drug family, which includes metformin and its derivatives, appears to have good CLIC1 inhibitory activity in glioblastoma (49,50). Similarly, additional ion dysregulation in glioma progression could also be mediated by the overabundance of the S100A11 protein, which we found to be enriched in IDHwt GBMs/GSCs and which is known to be over-expressed in a wide range of other cancers (51). Together, we believe these protein-based analyses may help explain the better outcomes of IDH-mutated gliomas and offer new precision-driven avenues for therapeutic intervention.
Our study is not without its own limitations. To allow us to cover a diverse cohort of glioma subtypes, we chose to accept somewhat modest protein depths and small subgroups of gliomas. While the distinct biology of the major glioma subtypes allowed us to capture meaningful protein-level differences within our relatively small cohorts, there are other fruitful opportunities to define clinically relevant protein-level differences within glioma subtypes (e.g. long-term survivorship in IDHwt GBMs) (3,52,53). In addition to larger cohorts, these would likely require deeper proteomic coverage and investigation of post-translational modifications, which were also outside the scope of this initial study. With those limitations in mind, we note that we detect substantial protein pattern differences between glioma subtypes and define pathway enrichments that are preserved across a variety of different datasets. Our dataset thus offers a reliable resource for hypothesis generation and more targeted novel biomarker development.
While diffuse gliomas are now understood to comprise distinct genomically-defined subgroups, the downstream phenotypic changes driven by their unique genetic alterations are largely unexplored. This study provides the first integrated overview of the proteomic landscape of glioma and identifies distinct and targetable pathways differentially operational in subtypes defined by their IDH status. Moreover, we highlight how the challenges of expression-based readouts of samples with heterogeneous compositions can be overcome with careful macrodissection and comparison to controlled and relatively purified glioma cultures. Assembly of larger, more clinically-annotated cohorts and incorporation of advanced proteomic approaches such as deep (phospho)proteomic analysis could greatly expand our phenotypic understanding of this complex disease. Similarly, routine application of the presented FFPE-compatible workflows can help define patient-specific pathways driving individual tumors and better guide personalized glioma care.

Conflicts of Interests:
The authors declare no conflicts of interests.

Table 7). Three row clusters identify astrocytoma-specific (Cluster 1), IDHwt-specific (Cluster 2) and IDHmt-specific (Cluster 3). (B) Volcano plot directly comparing IDHwt and IDHmt GBMs identifies proteins with the most robust significant changes in abundance. (C) GSEA analysis of protein levels identifies enrichment of processes specific to IDHwt GBMs (epithelial to mesenchymal transition) or IDHmt astrocytomas (increased mRNA splicing). For full GSEA processes see Supplemental Fig. 3.

(E) Representative cores of PLOD3 immunohistochemistry in a glioma tissue microarray demonstrate diverse protein abundance patterns, either vessel only and/or tumour cells (low, moderate or high). Insets show high magnification of tissue cores. For the full TMA see Supplemental Fig. 6. (F) Ratios of PLOD3 expression patterns in gliomas from the www.proteinatlas.org pathology resource (left panel) and our own independently assembled TMA (right panel). In our TMA, gliomas were classified according to IDH and 1p19q co-deletion status. (G) Kaplan-Meier survival curves of glioma patients according to PLOD3 and CLIC1 RNA expression values from 153 gliomas cataloged in www.proteinatlas.org. P score signifies the log rank P value of the correlation between RNA expression and patient survival.