Human Leukocyte Antigen (HLA) Peptides Derived from Tumor Antigens Induced by Inhibition of DNA Methylation for Development of Drug-facilitated Immunotherapy

Treatment of cancer cells with anticancer drugs often fails to achieve complete remission. Yet, such drug treatments may induce alteration in the tumor's gene expression patterns, including those of Cancer/Testis Antigens (CTA). The degradation products of such antigens can be presented as HLA peptides on the surface of the tumor cells and be developed into anticancer immunotherapeutics. For example, the DNA methyl transferase inhibitor, 5-aza-2′-deoxycytidine (Decitabine) has limited antitumor efficacy, yet it induces the expression of many genes, including CTAs that are normally silenced in the healthy adult tissues. In this study, the presentation of many new HLA peptides derived from CTAs and induced by Decitabine was demonstrated in three human Glioblastoma cell lines. Such presentation of CTA-derived HLA peptides can be exploited for development of new treatment modalities, combining drug treatment with anti-CTA targeted immunotherapy. The Decitabine-induced HLA peptidomes include many CTAs that are not normally detected in healthy tissues or in cancer cells, unless treated with the drug. In addition, the study included large-scale analyses of the simultaneous effects of Decitabine on the transcriptomes, proteomes and HLA peptidomes of the human Glioblastoma cells. It demonstrates the poor correlations between these three levels of gene expression, both in their total levels and in their response to the drug. The proteomics and HLA peptidomics data are available via ProteomeXchange with identifier PXD003790 and the transcriptomics data are available via GEO with identifier GSE80137.

Grade IV astrocytoma, also called glioblastoma multiforme (GBM), is the most common and aggressive primary brain tumor, with a prognosis of about three months if untreated. Despite advances in treatment, and even after applying a combination of maximal surgical resection followed by radiation and Temozolomide (TMZ) chemotherapy, the survival rate for GBM has not changed much over the past 50 years, and it remains a largely incurable disease with a median survival of about 15 months (1). Hence, there is a pressing need to develop better therapeutic modalities, such as immunotherapies, which are emerging as promising treatment options for otherwise incurable cancers, such as GBM (2,3). Different cancer immunotherapy regimens are now approved by the FDA and the European Medicines Agency for various cancers (4), including HLA peptide vaccines (5,6). Presentation of HLA peptides derived from tumor antigens (TA) can lead to activation of anticancer T cells and to cytotoxic killing of the diseased cells. Therefore, the cancer cells' HLA peptidomes were studied extensively as sources for tumor antigens potentially useful as cancer vaccines (5,7-10). The HLA peptidomes (also called the immunopeptidome or the HLA ligandome) are the assortments of peptides bound and presented at the cells' surface by the HLA molecules. Tumor antigens potentially useful for cancer immunotherapy are tumor-specific antigens (TSA), including neo-antigens derived from mutations unique to the tumor cells (10-14). Tumor-associated antigens (TAA) are another useful group of antigens that are expressed in larger amounts on tumor cells relative to normal cells (15,16) and therefore pose some risk of eliciting autoimmune reactions (17).
A special group of tumor antigens attracting significant attention as potential targets for cancer immunotherapy are the cancer/testis antigens (CTA), which are preferentially expressed in immune privileged sites, such as male germ cells, placenta, and ovary, but are often absent from the normal somatic cells (18,19). The biological role of the CTAs in the germline cells is not always known, yet their expression in the tumor cells was exploited to elicit immune responses in treated cancer patients, bringing about complete regressions in a few cases (19-21). Furthermore, combining TA vaccines with inhibitors of immune modulators may become an even more powerful modality for cancer treatment (5,13,22,23).
The expression level of many of the CTAs is regulated by constant DNA methylation of their promoters in normal and transformed cells (24-26). Although most of the CTA promoters are methylated in the healthy adult tissues, such regulation is often lost in the tumor cells. These epigenetic abnormalities, prevalent in cancer cells, induce differential expression of oncogenes and CTAs as side effects of the loss of cellular control (27-30). Moreover, although the expression of CTAs in human cancer cells is heterogeneous, their expression can be up-regulated by treatment with inhibitors of methylation, such as 5-aza-2′-deoxycytidine (Decitabine). Decitabine is a cytosine analog that inhibits DNA methyltransferases by trapping these enzymes after its incorporation into the DNA, thus reducing methylation of newly synthesized DNA strands (31-33). Decitabine was also shown to reduce the methylation and to elevate the expression of the MHC class I genes along with different tumor antigens (34-36). Treatment of human glioma cells with Decitabine uniformly up-regulated the expression of NY-ESO-1 and other well characterized CTAs in the cancer cells (37) but not in nonmalignant cells (38-40). Decitabine is not the only anticancer drug affecting the epigenome. Currently, 87 clinical trials of epigenetic cancer therapy are registered at ClinicalTrials.gov (http://www.clinicaltrials.gov) (30).
Identification of HLA peptides as tumor antigens and candidates for immunotherapy is complicated by the difficulty of detecting their presentation on tumor cells. HLA peptidome analysis is limited by the yield of immunoaffinity purified HLA molecules and by the technical limitations of the chromatography and mass spectrometry analysis of the recovered peptides (7-9,41). To this end, analysis of the tumor cells' exomes and transcriptomes may provide useful data from which the presentation of HLA peptides can be inferred (9,13,14,16,42-45). The cancer cells' proteomes are less useful for elucidating the presentation of HLA peptides, because the levels of the presented peptides correlate poorly with the levels of their source proteins (42-48).
This research aimed to promote the development of a reliable method for selection of potential candidates for cancer immunotherapy, especially CTAs induced by drugs, such as Decitabine. It followed the effect of the drug on the transcriptomes, proteomes, and the HLA peptidomes of cultured human GBM cells, resulting in the discovery of large sets of drug-induced CTAs. The obtained data are potentially useful for advancing the development of new approaches for immunotherapy, based on the induction by drugs of tumor antigens that are not normally expressed by the cancer cells and can be used for combined drug and immunotherapy cancer treatment.
Drug Treatment Regimen-Cells were counted and plated in new 150 mm Petri dishes 1 day prior to the treatment. The following day, the medium was removed and replaced with fresh medium with or without 1 µM of 5′-Aza-2′-deoxycytidine (Decitabine, AdooQ Bioscience, Irvine, CA). After incubation with Decitabine for 72 h, the cells were harvested, counted and prepared for FACS, HLA peptidomics, proteomics analysis, or for RNA sequencing.
Experimental Design and Statistical Rationale-Three Glioblastoma cell lines (U-87, T98G, and LNT-229), treated and untreated with Decitabine, were used for the HLA peptidomics analysis, proteomics analysis, and for the RNA sequencing and quantification. Three biological replicates of the HLA peptidomics and proteomics analyses were performed with each of the treated and untreated Glioblastoma cell lines. In addition, two out of these three replicates of each of the cell lines were also used for RNA-seq analyses.
FACS Analysis-About 2 × 10^5 Decitabine-treated and untreated cells were used for each flow cytometric analysis. The cells were tagged with the W6/32 monoclonal antibody (anti-HLA-A, B and C) produced from the HB95 hybridoma and used at a final concentration of 0.5 µg/µl. Secondary antibodies were anti-mouse IgG conjugated to FITC (Sigma, St. Louis, MO). FACS analysis was performed using an LSRII instrument (BD Biosciences, San Jose, CA) and the data were analyzed using FCS Express 5 Plus (DeNovo Software, Glendale, CA).
Affinity Purification of HLA Molecules-HLA class I molecules were purified from three biological replicates for each Glioblastoma cell line. Each replicate, with about 5 × 10^8 cells, was lysed with 0.25% sodium deoxycholate, 0.2 mM iodoacetamide, 1 mM EDTA, 1:200 Protease Inhibitors Mixture (Sigma), 1 mM PMSF, and 1% octyl-β-D-glucopyranoside (Sigma) in PBS at 4°C for 1 h. The cell extracts were cleared by centrifugation for 45 min at 18,000 rpm, 4°C. The recovered HLA class I molecules were immunoaffinity purified using the W6/32 mAb bound to Amino-Link beads (Thermo Scientific, Waltham, MA) as in (49). The HLA molecules with their bound peptides were eluted from the affinity column with five column volumes of 0.1 N acetic acid. The eluted HLA class I proteins and the released peptides were loaded on disposable C18 columns (Harvard Apparatus, Holliston, MA) and the peptide fraction was recovered with 30% acetonitrile in 0.1% TFA. The peptide fractions were dried using vacuum centrifugation, reconstituted in 100 µl of 0.1% TFA, reloaded on Stage-Tips (50), eluted with 80% ACN, dried, and reconstituted with 0.1% formic acid for LC-MS-MS analysis.
Proteomics analysis-Two approaches for proteome analysis were employed in this study. The first was in-solution tryptic digest followed by resolution of the peptides by long (3 h) one-dimensional reversed phase capillary chromatography and tandem mass spectrometry. The other was based on in-gel proteolysis of 5 gel slices from each lane, followed by 2 h LC-MS-MS of the tryptic peptides from each gel slice, as in (51). Thirty µg of total protein were used for the in-gel digest and LC-MS-MS and 2.5 µg of protein were used for the in-solution digest and LC-MS-MS analysis.
Identification of the HLA and Tryptic Peptides-The HLA and tryptic peptides were resolved by capillary chromatography and electrospray tandem mass spectrometry with Q-Exactive-plus mass spectrometers (Thermo Scientific). The reversed phase capillary chromatographies were performed with a home-packed 30 cm long, 75 µm inner diameter column with 3.5 µm silica ReproSil-Pur C18-AQ resin (Dr. Maisch GmbH, Ammerbuch-Entringen, Germany). HLA and tryptic peptides were eluted with a linear gradient of 5-28% of buffer B (100% ACN and 0.1% formic acid). The gradients were run at flow rates of 0.15 µl/min for 2 h for the HLA peptides and the in-gel digested fractions and for 3 h for the in-solution digests. Data were acquired using a data-dependent "top 10" method, fragmenting the peptides by higher-energy collisional dissociation (HCD). We acquired full scan MS spectra at a resolution of 70,000 at 200 m/z with a target value of 3 × 10^6 ions. Ions were accumulated to an AGC target value of 10^5 with a maximum injection time of generally 100 msec. For the HLA peptides with unassigned precursor ion charge states, or charge states of four and above, no fragmentation was performed. For tryptic peptides, the fragmentation was performed on charge states between 2 and 7. The peptide match option was set to Preferred. Normalized collision energy was set to 25% and MS/MS resolution was 17,500 at 200 m/z. Fragmented m/z values were dynamically excluded from further selection for 20 s.
Data Analysis-The MS data were analyzed by the MaxQuant computational proteomics platform (52) version 1.5.0.25 and searched with the Andromeda search engine (53). Peptide identifications were based on the human section of the Uniprot database (http://www.uniprot.org/) of July 2015 containing 69,693 entries. Mass tolerances of 4.5 ppm for the precursor masses and 20 ppm for the fragments were allowed. Methionine oxidation was accepted as variable modification for both tryptic and HLA peptides. Carbamidomethyl cysteine was accepted as a fixed modification for the proteomics data and as a variable modification for the HLA peptidome data. Methionine sulfoxide and N-acetylation were set as variable modifications for both the proteomics and HLA peptidomics analyses. Minimal peptide length was set to seven amino acids and a maximum of two miscleavages was allowed for tryptic peptides. The false discovery rate (FDR) was set for tryptic peptides to 0.01 for protein identifications, and 0.05 for the MHC peptides. The resulting identified protein tables were filtered to eliminate the identifications derived from the reverse database, as well as common contaminants. The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium (54) (http://proteomecentral.proteomexchange.org) via the PRIDE partner repository with the data set identifier PXD003790.
Normalization of the Transcriptome, Proteome, and HLA Peptidome Data-The transcriptome data of each of the cell lines were normalized using a normalization factor (size factor) for each sample, using the DESeq2 package (http://bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.pdf). The proteome data were normalized by the iBAQ and the LFQ tools of the MaxQuant software (indicated in the relevant legends). The medians of the fold changes of each HLA peptidome experimental repetition, before and after the Decitabine treatment, were zeroed and differentially expressed peptides were defined as those changing by at least twofold.
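The median-zeroing step for the HLA peptidome fold changes can be illustrated with a minimal Python sketch. It assumes that the fold changes are treated/untreated intensity ratios handled on a log2 scale, so that "zeroing the median" means subtracting the median log2 ratio of one repetition and "at least twofold" means an absolute centered value of 1 or more. The function names and the example ratios are hypothetical and are not taken from the study.

```python
import math
from statistics import median

def median_center_log2(fold_changes):
    """Convert treated/untreated ratios of one repetition to log2 values
    and subtract their median, so the median peptide shows no change."""
    logs = [math.log2(fc) for fc in fold_changes]
    med = median(logs)
    return [x - med for x in logs]

def changed_at_least_twofold(centered_log2, threshold=1.0):
    """Flag peptides whose centered log2 fold change is >= 1 or <= -1
    (assumption: twofold is applied symmetrically in both directions)."""
    return [abs(x) >= threshold for x in centered_log2]

# Hypothetical ratios for three peptides in one repetition.
centered = median_center_log2([1.0, 2.0, 4.0])
flags = changed_at_least_twofold(centered)
```

Centering each repetition separately removes global shifts between runs, such as differences in the amount of recovered peptides, before any per-peptide threshold is applied.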
mRNA-Seq Library Preparation and Sequencing-Total RNA was prepared from 1 to 2 × 10^6 GBM cells, using the Total RNA Purification Plus kit (Norgen Biotek, Thorold, ON, Canada) as per the kit instructions. The isolated total RNA was quantified using a Qubit 2.0 machine (Invitrogen, Carlsbad, CA) before the sequencing library preparation. The mRNA population in each sample was enriched by removing ribosomal RNA (rRNA) using the Epicenter ScriptSeq Complete kit (human, Illumina, Madison, WI) as per the kit instructions. Gene expression analysis of 12 samples, including two biological replicates for each of the three GBM cell lines before and after Decitabine treatment, was performed by Illumina HiSeq 2500 in 50 base pair single read mode in the Technion Genome Center. The RNA-seq reads were mapped to the GRCh38 genome using TopHat version 2.0.13 (55). Only uniquely mapped reads were counted in the analysis, using the HTSeq-count package version 0.6.1 with "union" mode (56). The count normalization and the differential expression analysis were performed using the DESeq2 package version 1.8.1. The RNA sequencing data have been deposited to the GEO Consortium (http://www.ncbi.nlm.nih.gov/geo/) with the data set identifier GSE80137.
RT-qPCR of NY-ESO-1 Expression-Total cellular RNA was extracted using the Total RNA Purification Plus kit (Norgen Biotek) as per the kit instructions. The RNA concentration was determined using the Epoch microplate reader and software (BioTek, Winooski, VT). One microgram of total RNA was used to generate cDNA with the qScript cDNA synthesis Kit (Quanta Biosciences, Beverly, MA). Quantitative polymerase chain reaction (RT-qPCR) of NY-ESO-1 mRNA was performed using SYBR green FastMix assay (Quanta Biosciences) and the Bio-Rad (Hercules, CA) real-time PCR system and software. HPRT1 mRNA was used as the endogenous control. Primers specific for NY-ESO-1 and HPRT1 were NY-ESO-1: forward primer: 5′-CGCCTGCTTGAGTTCTACCT-3′, reverse primer: 5′-TGCAGCAGTCAGTCGGATAG-3′; HPRT1: forward primer: 5′-ACCCCACGAAGTGTTGGATA-3′, reverse primer: 5′-AAGCAGATGGCCACAGAACT-3′. Each sample was amplified by 1 cycle at 95°C for 3 min, followed by 40 cycles at 95°C for 10 s and 60°C for 30 s. All RT-qPCR assays were carried out in triplicate. Negative controls for the RT-qPCR reactions were performed by omitting the cDNA templates in at least one reaction per tested transcript.
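The text does not state how the RT-qPCR signals were converted into relative expression. One common option for a target gene measured against an endogenous control is the 2^(-ΔΔCt) calculation, sketched below purely as an illustration. It assumes roughly 100% amplification efficiency for both primer pairs, and the Ct values are hypothetical, not measurements from this study.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_untreated, ct_ref_untreated):
    """Fold change of a target transcript (e.g. NY-ESO-1) in treated vs
    untreated cells, normalized to a reference transcript (e.g. HPRT1),
    using the 2^(-ddCt) formula (assumes ~100% PCR efficiency)."""
    delta_treated = ct_target_treated - ct_ref_treated
    delta_untreated = ct_target_untreated - ct_ref_untreated
    delta_delta = delta_treated - delta_untreated
    return 2.0 ** (-delta_delta)

# Hypothetical mean Ct values from triplicate wells.
fold = relative_expression(25.0, 20.0, 30.0, 20.0)
```

In this made-up example the target crosses the threshold five cycles earlier after treatment while the reference is unchanged, which corresponds to a 32-fold increase.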
HLA Typing-In brief, DNA was extracted from 10^6 cells using the DNeasy Blood & Tissue kit (Qiagen, Hilden, Germany) as per the kit instructions. The HLA typing was conducted with this DNA at the Sheba Medical Center Tissue Typing Unit (Israel). HLA typing was performed using Luminex technology and Immucor Transplant Diagnostic kits, as per the kits' instructions to obtain HLA A*, B*, and C* loci typing at low/intermediate resolution.
proteome analyses identified about 7500 proteins and the HLA class I peptidome analyses resulted in above 25,000 identified HLA peptides (Table I). Two biological repetitions of the transcriptome, three of the proteome and three of the HLA peptidome were performed with each of the cell lines and treatments, resulting in highly reproducible data sets. Even though rather similar transcriptomes and proteomes were detected in the three cell lines (Fig. 2A-2B), the detected HLA peptidomes were largely different (Fig. 2C). In addition, although the transcriptomes and the proteomes of the cells were reasonably correlated with each other in each of the cell lines, the correlations of both with the HLA peptidomes of the same cells were poor (Fig. 3). Significantly, many of the gene products induced by Decitabine were derived from TAs, including many CTAs, which are prime candidates for further evaluation as immunotherapeutics (Table II).
Large and Unique HLA Peptidomes were Detected in Each of the Cell Lines-The HLA class I peptidome analyses, before (Table I and supplemental Table S1). Most of these peptides (23,439, 92%) fitted the typical HLA class I ligands' lengths of 8-14 amino acids. The majority of peptides fitted the expected sequence motifs of the HLA allotypes of the studied cells (listed in Fig. 1 and supplemental Table S1) according to the NetMHC server (http://www.cbs.dtu.dk/services/NetMHC/) (58), with predicted scores below 1000 nM. As many as 67.3% of the peptides in the LNT-229 cells, 56.7% of the peptides in the T98G cells, and 67.9% of the peptides in the U-87 cells fitted the cells' HLA sequence motifs with scores below 1000 nM. Larger similarities were observed between the HLA peptidomes of the T98G and U-87 cells (Fig. 2C and supplemental Fig. S1), which share an HLA allotype (HLA-A*02:01), relative to the LNT-229 cells with their completely different HLA haplotype (listed in Fig. 1).
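The two filters described above, a ligand length of 8-14 amino acids and a predicted score below 1000 nM for at least one of the cell's HLA allotypes, can be sketched as follows. This is an illustration only: the peptide sequences and predicted values are invented, the predictions are assumed to be already available as a dictionary (no NetMHC call is made), and computing the fitting fraction over length-filtered peptides is an assumption about the denominator.

```python
def is_typical_length(peptide, min_len=8, max_len=14):
    """True if the peptide length is typical for an HLA class I ligand."""
    return min_len <= len(peptide) <= max_len

def fits_hla_motif(predicted_nm_by_allotype, cutoff_nm=1000.0):
    """True if the predicted score is below the cutoff for any allotype."""
    return any(nm < cutoff_nm for nm in predicted_nm_by_allotype.values())

def fraction_fitting(predictions):
    """Fraction of typical-length peptides fitting at least one allotype.
    `predictions` maps peptide sequence -> {allotype: predicted nM}."""
    typical = [scores for pep, scores in predictions.items()
               if is_typical_length(pep)]
    if not typical:
        return 0.0
    return sum(fits_hla_motif(s) for s in typical) / len(typical)

# Hypothetical peptides and predicted values (not from the data set).
example = {
    "AAAAAAAAA": {"HLA-A*02:01": 35.0},     # 9-mer, below the cutoff
    "CCCCCCCCCC": {"HLA-A*02:01": 4200.0},  # 10-mer, above the cutoff
    "DDDDDD": {"HLA-A*02:01": 10.0},        # 6-mer, excluded by length
}
fitting = fraction_fitting(example)
```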

Many HLA Peptides Derived from Tumor Antigens Were Up-regulated by Decitabine-The Decitabine treatment resulted in small up-regulation of the cells' surface HLA class I expression, as assessed by flow cytometry and by the LFQ values of the individual HLA allotype proteins detected in each cell line during the proteome analyses (supplemental Fig. S2). Most of the identified HLA peptides were detected at similar levels in each of the cell lines and their levels were not significantly affected by the Decitabine treatment (Fig. 4A). However, of the 23,439 identified HLA peptides, as many as 1855 peptides were up-regulated in their presentation levels by the Decitabine treatment and only a small number of the affected peptides were down-regulated (see statistics in Table I and the full data set in supplemental Table S1). When the HLA peptides were identified both in treated and untreated cells, significance was defined according to a "one sample test" with p value < 0.05 and a log2 fold change x < -1 or x > 1. If the HLA peptides were identified only before or after the treatment, they were considered as changing significantly only if they were not detected in any of the three repetitions before (or after) the treatment, and were positively identified in at least two repetitions of the other (treated or untreated) cells. A significant number of the up-regulated HLA peptides were detected only after the Decitabine treatment and were not observed in the untreated cells (Fig. 4A). The Decitabine treatment affected in a similar manner different HLA peptides derived from individual proteins in more than one of the cell lines (Fig. 5 and supplemental Table S1). Importantly, more than 730 of the identified HLA peptides were derived from tumor antigens (Table I and supplemental Table S1) and as many as 72 of these HLA peptides were up-regulated by twofold or more following treatment with Decitabine. Moreover, out of these 72 up-regulated HLA peptides, 21 belong to the CTA group.
For example, five different HLA peptides derived from the CTA antigen, MAGEA1, were up-regulated by Decitabine in at least one of the three cell lines (boxed in Fig. 5).
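The two-branch rule used above for calling an HLA peptide up- or down-regulated can be written as a small decision function. The sketch below assumes that the p value of the one-sample test and the log2 fold change have already been computed elsewhere, and that each condition has three repetitions; the function name and the "unchanged" label are illustrative choices, not terms from the study.

```python
def classify_peptide(n_untreated, n_treated, log2_fc=None, p_value=None):
    """Classify one HLA peptide as 'up', 'down' or 'unchanged'.

    n_untreated / n_treated: number of repetitions (0-3) in which the
    peptide was identified before / after the Decitabine treatment.
    log2_fc and p_value are only used when both counts are non-zero.
    """
    if n_untreated > 0 and n_treated > 0:
        # Detected in both conditions: significance test plus twofold cutoff.
        if log2_fc is None or p_value is None or p_value >= 0.05:
            return "unchanged"
        if log2_fc > 1:
            return "up"
        if log2_fc < -1:
            return "down"
        return "unchanged"
    # Detected in one condition only: absent from all repetitions of one
    # condition and present in at least two repetitions of the other.
    if n_untreated == 0 and n_treated >= 2:
        return "up"
    if n_treated == 0 and n_untreated >= 2:
        return "down"
    return "unchanged"
```

A peptide seen in only a single repetition of one condition therefore stays "unchanged", which protects the "detected only after treatment" class from sporadic identifications.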
The Effect of Decitabine on the Cellular Proteomes-A total of 7555 different proteins were identified by the proteome analyses, with at least two identified tryptic peptides per protein. As expected, significantly higher correlations were observed between the proteomes of the different cell lines (Fig. 2B and supplemental Fig. S3) relative to their HLA peptidomes (Fig. 2C and supplemental Fig. S1). A total of 2990 proteins were affected by the Decitabine treatment with approximately equal numbers up- or down-regulated (Table I, Fig. 4B and supplemental Table S2). Significantly affected proteins were defined as those changing with a log2 fold change x < -1 or x > 1 between treated and untreated cells in at least one of the repetitions, with the other two repetitions not changing in the opposite direction. The proteomics analyses included both "gel-slicing" and "in-solution" tryptic digests, which cannot be combined for proper t test analysis because of the different sizes of the lists of quantified proteins. Therefore, for the proteome data, this manual analysis was employed rather than the t tests used for the transcriptome and the HLA peptidome analyses. Another group of proteins included those detected only before or after Decitabine treatment. These proteins were defined as up- or down-regulated only if they were not detected in any of the three biological repetitions before (or after) the treatment, and were identified in at least two repetitions of the treated (or untreated) cells. As many as 141 different TA proteins were identified in the proteomics analyses and 45 of them were up-regulated by the Decitabine treatment in at least one of the cell lines whereas seven of them were down-regulated (Fig. 6). Of these 45 up-regulated TAs, 14 belong to the CTA group (supplemental Table S2).
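The manual rule applied to the proteome data can be sketched in Python as follows. The sketch reads "not in the opposite direction" as "the log2 fold change of the remaining repetitions does not have the opposite sign", which is an interpretation rather than a definition given in the text, and the example values are hypothetical.

```python
def classify_protein(rep_log2_fc):
    """Classify a protein from its per-repetition log2 fold changes.

    'up'   : at least one repetition above +1 and none below 0.
    'down' : at least one repetition below -1 and none above 0.
    Anything else is reported as 'unchanged'.
    """
    if any(x > 1 for x in rep_log2_fc) and all(x >= 0 for x in rep_log2_fc):
        return "up"
    if any(x < -1 for x in rep_log2_fc) and all(x <= 0 for x in rep_log2_fc):
        return "down"
    return "unchanged"

# Hypothetical log2 fold changes from three biological repetitions.
calls = [
    classify_protein([1.5, 0.2, 0.0]),     # one strong increase, none opposed
    classify_protein([1.5, -0.3, 0.4]),    # one repetition disagrees in sign
    classify_protein([-2.0, -0.1, -0.5]),  # consistent decrease
]
```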
The Effect of Decitabine on the Cells' Transcriptomes-The transcriptomes of the cell lines were defined to a much larger depth than the proteomes or the HLA peptidomes of the same cells. The reproducibility of the transcriptome analyses was high (supplemental Fig. S4), resulting in detection of 30-36 million "reads" per sample, with 77-82% of the reads uniquely mapped. Significant changes in gene expression were defined as those passing a t test with p value < 0.05 and a log2 fold change x < -1 or x > 1 (supplemental Table S3). More transcripts were up-regulated by the Decitabine than down-regulated, with 704 of these up-regulated significantly in all three cell lines (Fig. 4C and supplemental Table S3). A larger number of TA-derived transcripts (313 transcripts) were identified in the transcriptome analyses relative to the proteome and HLA peptidome analyses, with about a third of these TA transcripts up-regulated by the Decitabine treatment (109 transcripts, Table I). Out of these 109 TA transcripts, 42 belong to the CTA group.
To obtain an indication of the sensitivity and coverage of the analysis performed in this study, the NY-ESO-1 gene was selected as a model CTA, because it is known to be up-regulated by Decitabine (37). The NY-ESO-1 gene products were not observed in any of the analyses described here; therefore, an RT-qPCR analysis was performed with RNA extracted from the three GBM cell lines, before and after treatment with Decitabine. The results of the RT-qPCR indicated a uniform up-regulation of the NY-ESO-1 transcripts in the three Decitabine-treated cell lines (supplemental Fig. S5).

The Differential Effects of Decitabine on the Transcriptomes, Proteomes, and HLA Peptidomes of the Different Cell Lines-The large numbers of transcripts, proteins and HLA peptides identified in these analyses from each of the three GBM cultured cells (Table I and supplemental Tables S1, S2, and S3) prompted us to attempt to determine the correlations between these three levels of gene products. Larger numbers of transcripts were identified and quantified relative to the proteins or HLA peptides because of the different technologies used. It is difficult to compare levels of transcripts, which are defined by multiple RNA reads, with levels of proteins that are often defined by just a few tryptic peptides (59). Furthermore, it is even more challenging to compare the levels of transcripts and proteins with those of the HLA peptides, which are detected only with a single intensity peak during the LC-MS analyses. Here we attempted to bypass this limitation by comparing instead the drug-induced fold-changes in the levels of the transcripts, proteins and HLA peptides of the same genes rather than their total signal intensities. This was based on the assumption that the ratios between the LC-MS signals of the tryptic and HLA peptides and their cellular amounts remain constant despite the drug-induced perturbation.
The Decitabine treatments caused a reproducible effect on the transcriptomes, proteomes and HLA peptidomes of the different cell lines, mostly up-regulating only a subset of these gene products (Table I, supplemental Tables S1, S2, and S3, and Fig. 7). Concurrent changes in the transcripts, proteins and different HLA peptides derived from the same 96 genes could be observed in at least one of the tested cell lines. Yet, the correlations between the changes induced by the Decitabine treatment in these three levels of gene expression were very poor (Fig. 7). In addition, in some cases, the levels of different HLA peptides derived from one protein were affected differently by the drug treatment. Such a phenomenon likely reflects a complex effect of the drug on the HLA peptidome production pipeline, and not only changes in the levels of the source transcripts or in the rates of synthesis or degradation of the proteins. PLIN2 is an example of a gene whose transcript, protein and HLA peptides changed uniformly in all three tested cell lines, whereas SPTBN2 represents an example of a more complex response to the drug (Fig. 8).
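Comparing drug-induced fold changes between two levels of gene expression amounts to correlating two vectors of log2 fold changes over the genes quantified at both levels. The sketch below uses the Pearson coefficient as one possible measure; the text does not state which correlation measure was used, and the gene names and values are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def shared_gene_correlation(level_a, level_b):
    """Correlate log2 fold changes over genes present in both levels.
    Each argument maps gene name -> log2 fold change."""
    genes = sorted(set(level_a) & set(level_b))
    return pearson([level_a[g] for g in genes],
                   [level_b[g] for g in genes])

# Hypothetical log2 fold changes for transcripts and HLA peptides.
transcripts = {"GENE_A": 1.0, "GENE_B": 2.0, "GENE_C": 3.0, "GENE_D": 9.0}
peptides = {"GENE_A": -1.0, "GENE_B": -2.0, "GENE_C": -3.0}
r = shared_gene_correlation(transcripts, peptides)
```

Restricting the comparison to shared genes matters here because far more transcripts than proteins or HLA peptides are quantified, so most genes lack a value at one or more levels.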
Selection of Tumor Vaccine Candidates-CTAs selected for further investigation as tumor antigens were those detected as up-regulated HLA peptides. The selection for further investigation was based on information about their mRNA gene expression patterns in healthy and diseased tissues using the BioGPS data bank (http://biogps.org/) (60,61). The preferred CTAs for development of immunotherapeutics are likely those derived from genes expressed only in malignant tissues, in male germline cells or in placenta, and not in any other healthy tissues. The selected CTA genes in this study were those with mRNA expression levels below nine units in all normal, essential tissues (BioGPS), which are expressed at significantly higher levels in tumor cells. It was assumed that transcript levels below nine units could be considered as background because these are the expression levels of many well-characterized CTAs in healthy (non-testis) adult tissues. Moreover, some of these CTAs are already used in multiple clinical studies without observable autoimmune reactions (6,62-64). Candidates for combined drug and immunotherapy treatments were selected in this analysis if their transcripts, proteins or HLA peptides were up-regulated at least twofold in the GBM cells in response to the Decitabine treatment (supplemental Table S4). A few examples of TAs, including CTAs, which were up-regulated at all three levels of gene expression following treatment with Decitabine, are listed in Table II. Also listed are their degrees of change and their CTA scores, which indicate their potential relevance for immunotherapy. A CTA score of '1' indicates that the gene's mRNA is expressed at levels below nine units in all the healthy (non-testis) tissues (according to the BioGPS database).
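The background filter described above, expression below nine units in every healthy tissue other than germline or placental tissue, can be sketched as a simple predicate. The tissue names, the exempt set and the expression values below are hypothetical placeholders, and no BioGPS query is performed; the expression table is assumed to be already loaded as a dictionary.

```python
BACKGROUND_UNITS = 9.0  # threshold stated in the text (BioGPS units)

def below_background_in_healthy_tissues(expression_by_tissue,
                                        exempt=("testis", "placenta")):
    """True if mRNA expression is below the background threshold in every
    healthy tissue except the exempt (immune privileged) ones.
    The choice of exempt tissues is an assumption for illustration."""
    return all(units < BACKGROUND_UNITS
               for tissue, units in expression_by_tissue.items()
               if tissue not in exempt)

# Hypothetical expression profiles for two genes.
restricted_gene = {"testis": 850.0, "liver": 4.1, "heart": 3.7, "brain": 5.0}
broad_gene = {"testis": 120.0, "liver": 64.0, "heart": 3.0, "brain": 4.5}

restricted_ok = below_background_in_healthy_tissues(restricted_gene)
broad_ok = below_background_in_healthy_tissues(broad_gene)
```

A gene passing this predicate would still need to show the at least twofold drug-induced up-regulation described above before being listed as a candidate.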
These TAs were further analyzed for their mRNA expression levels using two additional databases, the Human Protein Atlas (http://www.proteinatlas.org/) and ISTonline (http://ist.medisapiens.com/) to confirm that their mRNA expression is restricted to tumor and nonessential tissues (Table II).
DISCUSSION
As expected, the treatment of the GBM cell lines with Decitabine induced relatively mild effects on the cells without any detectable stress. Yet, it had a pronounced effect on the expression levels of a subset of genes, many of which are CTAs, whose transcripts, proteins, and most significantly, HLA class I peptides, appeared only after the drug treatment. Thus, we suggest that such analyses can be exploited for the development of immunotherapy based on induction of CTAs by drug treatments. In addition, this is the first study, to our knowledge, investigating the simultaneous effects of a drug on the three important levels of gene expression, namely the transcriptome, proteome and HLA peptidome, all in the same cultured human cancer cells. The data described here indicate that only minimal levels of correlation exist between the HLA peptidomes, the transcriptomes and the proteomes of the cells (Fig. 7). The use of modern LC-MS-MS and RNA-seq technologies enabled reaching a significant coverage of the transcriptomes, proteomes and HLA peptidomes, which was useful for the selection of numerous drug-induced CTA candidates (supplemental Table S4) and most importantly HLA peptides that appear only after the Decitabine treatment (Fig. 4).
The most significant dilemma in the selection of antigens as candidates for cancer immunotherapy relates to the selection of sufficiently immunogenic antigens, while avoiding induction of potentially life-threatening autoimmune reactions (17). Tumor antigens of choice are likely those expressed at sufficient levels in the tumors but not at all in any of the essential tissues. Assigning to each antigen its CTA score, proposed here, can help prioritize the identified CTAs for subsequent immunogenicity testing. These scores indicate whether the individual antigens are absent from all essential healthy tissues and are expressed at sufficiently high levels in the tumor tissues. Furthermore, such CTA scores may provide additional input on how strongly the immune reactions to the administered antigens can be enhanced by use of strong adjuvants, without fear of causing damaging autoimmunity (17). To obtain the CTA score for each antigen detected as an HLA peptide or transcript, its expression needs to be evaluated in every patient's tumor sample. To obtain the expression levels of the same antigens in the healthy tissues it is essential to rely on public gene expression databases of healthy tissues, such as the three databases used here (BioGPS, Human Protein Atlas and ISTonline). The assumption is that in the absence of transcripts in healthy tissues, the derived HLA peptides are also not presented. We were somewhat surprised to notice discrepancies between the databases used here in the listed expression levels of some of the CTAs in normal tissues (examples in Table II). Although these databases are based on numerous studies, further analyses are clearly needed to resolve these conflicts and confirm that the selected CTAs are indeed not expressed in any of the essential tissues.
Although neo-antigens are likely the preferred type of immunotherapeutics (11,13,14,22,65), they are rare, difficult to identify and will require costly analyses and validations before their routine incorporation into the personalized immunotherapy pipeline. The more general CTAs, expressed by the tumors of large cohorts of patients, may become clinically useful earlier and at lower costs. Therefore, the selection of CTAs expressed in many patients' tumors and absent from all essential healthy tissues will be beneficial for the development of cancer immunotherapy (17). The suggested strategy may involve pre-vaccination of the patients and induction of a strong preexisting immune reaction against the selected tumor antigens, which will activate the cytotoxic T lymphocytes to eradicate the tumor cells as soon as they express the CTAs after the drug treatment.
Some overlapping groups of peptides with extended N- and C-termini were observed in this analysis, as is commonly observed in HLA peptidome analyses (66,67). We listed these extended-length peptides separately, each with its NetMHC score according to the HLA allomorphs of each cell line, assuming that while being trimmed in the ER these extended peptides may bind and be stabilized by the different HLA molecules present in the different cells.
The poor correlations between the levels of the HLA peptides and their source transcripts or source proteins (Fig. 3), and the poor correlations of the fold-changes observed in response to the drug stimulations (Fig. 7), point to the importance of following the effects of the drug on the HLA peptidomes rather than on the transcriptomes or proteomes. Transcript levels can be defined more accurately than protein levels because of the larger numbers of detected RNA 'reads' relative to the quantifiable tryptic peptides derived from each of the proteins (59, 68 -70), and certainly more accurately than the levels of the HLA peptides. HLA peptides are quantified with single LC-MS signal intensities, which correlate poorly with the amounts of the peptides. Thus, the precise ratios between the copy numbers per cell and the LC-MS signal intensities (MA/MS ratios) of the individual HLA peptides will need to be defined in the future by large-scale quantitative mass spectrometry analyses, such as SRM or PRM (41,71,72), which were outside the scope of this study.
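As an illustration of how such a comparison of drug responses might be computed, the following minimal sketch derives log2 fold-changes for matched transcripts and HLA peptides and compares them by Spearman rank correlation. It does not reproduce the analysis behind Fig. 7; the pseudocount, the function names and the choice of Spearman correlation are assumptions made for the example.

```python
import math


def log2_fold_changes(treated, untreated, pseudocount=1.0):
    """log2(treated / untreated) per gene; the pseudocount avoids division by zero."""
    return [
        math.log2((t + pseudocount) / (u + pseudocount))
        for t, u in zip(treated, untreated)
    ]


def average_ranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

In use, one list of fold-changes would be computed from transcript levels and a second from the HLA peptide intensities of the same source genes, and `spearman` would be applied to the two lists; a value near zero corresponds to the poor correlation described above.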
The findings described here raise the question of whether transcriptome analyses may be sufficient in the future for selection of tumor antigens for immunotherapy without HLA peptidome analyses. Will it be possible to design tumor immunotherapy solely based on 'reverse immunology' using exome (for selection of neoantigens) or transcriptome (for selection of CTAs) analyses? Prediction of the HLA peptidomes based on the HLA typing of the presenting cells and familiarity with the HLA consensus binding motifs is currently limited in its capacity to define the nature of the presented HLA peptidome (5,9,44,73,74). Analyses of large HLA peptidomes presented by different HLA allotypes may facilitate the prediction of the sequences of the presented HLA ligands derived from each of the genes (73,75). Because vaccination with peptides should preferably be based on peptides that are presented at sufficient levels to drive strong immune reactions to the deformed cells, quantitative HLA peptidome analysis may be beneficial for selection of the more highly expressed antigens for personalized immunotherapy (5).
The NY-ESO-1 transcript was absent from the RNA-seq analysis, although it is well known from numerous gene expression studies to be elevated in cancer cells, including cells treated with Decitabine (37). The NY-ESO-1 transcript was clearly observed in the qRT-PCR analysis performed in this study (supplemental Fig. S5). This discrepancy between the qRT-PCR and the RNA-seq arises because the RNA reads associated with this transcript are "multi-mapped" and were therefore removed from the transcriptome listing by the HTSeq-count package. Such a shortfall should be avoided when looking for gene expression in the healthy tissues' transcriptome databases. It is important to confirm that the selected genes are truly not expressed in the healthy tissues and not just artificially removed from the databases.
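The effect of discarding multi-mapped reads can be shown with a toy counting sketch. This is a simplified illustration and not the HTSeq-count implementation; the gene names, the input layout and the fractional-assignment alternative are hypothetical.

```python
from collections import Counter


def count_reads(alignments, drop_multimapped=True):
    """Count reads per gene from {read_id: [gene at each aligned locus]}.

    With drop_multimapped=True, any read aligned to more than one locus is
    discarded, so a gene covered only by such reads ends up with zero counts.
    Otherwise each locus of a read receives an equal fraction of that read.
    """
    counts = Counter()
    for genes in alignments.values():
        if len(genes) > 1 and drop_multimapped:
            continue  # multi-mapped read is ignored entirely
        weight = 1.0 / len(genes)
        for gene in genes:
            counts[gene] += weight
    return counts


# Hypothetical example: GENE_A has a near-identical copy elsewhere in the genome,
# so every read from it aligns to two loci.
toy_alignments = {
    "read1": ["GENE_A", "GENE_A_COPY"],
    "read2": ["GENE_A", "GENE_A_COPY"],
    "read3": ["GENE_B"],
}
strict = count_reads(toy_alignments)
lenient = count_reads(toy_alignments, drop_multimapped=False)
```

Under the strict setting GENE_A receives no counts even though two reads originate from it, which is the kind of artificial absence that should be ruled out before a gene is considered unexpressed in healthy tissues.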
To conclude, our data suggest that drugs such as Decitabine should be considered as potential treatments for different cancers, in combination with pre-vaccination with the drug-induced CTAs. The identification of numerous HLA peptides derived from CTAs and induced by the drug treatment suggests that large-scale HLA peptidome and transcriptome analyses should be combined to search for other HLA peptides derived from tumor-specific antigens to be tested as cancer vaccine candidates.