IRIS: Discovery of cancer immunotherapy targets arising from pre-mRNA alternative splicing

Significance
Despite the success of cancer immunotherapy, discovering actionable tumor antigens as immunotherapy targets remains a major challenge. Aberrant alternative splicing (AS) is widespread in cancer and generates a large repertoire of potential immunotherapy targets. However, there is no well-established strategy to discover AS-derived immunotherapy targets. We describe an integrated computational workflow for comprehensive discovery and characterization of AS-derived immunotherapy targets, leveraging large-scale RNA-seq resources of tumor and normal tissues. We demonstrate the application of this workflow for target discovery of neuroendocrine prostate cancer, a highly lethal cancer with no effective therapies. We experimentally confirm the immunogenicity and T cell recognition of AS-derived T cell receptor targets. Collectively, this work introduces a broadly applicable framework for discovering cancer immunotherapy targets arising from AS.

Figure S1. IRIS DB: A reference database of alternative splicing (AS) profiles across tumor and normal tissue samples.
a, Percent-spliced-in (PSI)-based principal component analysis (PCA) of RNA-seq data of 9,932 samples from 33 tumor types from the TCGA consortium. b, PSI-based PCA of RNA-seq data of 9,024 samples from 51 normal tissue types of 30 histological sites from the GTEx consortium.

Figure S2. CAR-T target prediction by IRIS.
Computational workflow for annotating protein extracellular domain (ECD)-associated AS events for discovering chimeric antigen receptor T-cell (CAR-T) targets.

Figure S3. IRIS discovery of AS-derived targets for NEPC arising from four types of AS events.
Stepwise results of IRIS to identify AS-derived cancer immunotherapy targets from 23 neuroendocrine prostate cancer (NEPC) samples. Skipped exon (SE), alternative 5' splice sites (A5SS), alternative 3' splice sites (A3SS), and retained intron (RI) events identified by the IRIS RNA-seq data processing module (blue) were screened against 11 normal tissue types from the IRIS DB (yellow) to identify tumor-associated events and predict corresponding T-cell receptor (TCR) and CAR-T targets (purple). ECD, extracellular domain; HLA, human leukocyte antigen; SJ, splice junction.

Figure S4. Representative examples of 5 IRIS-predicted CAR-T targets for NEPC.
IRIS-predicted CAR-T targets are visualized by IRIS in paired violin and bar plots. Each row shows one IRIS-predicted CAR-T target. Violin plots show the PSI values of each target in NEPC and the normal tissue panel. Bar plots show the fraction of samples expressing the SJ(s) of the tumor-enriched isoform in NEPC and the normal tissue panel. If the tumor-enriched isoform is the exon inclusion isoform, the bar plot displays the upstream and downstream inclusion SJ as two bars. If the tumor-enriched isoform is the exon skipping isoform, the bar plot displays the skipping SJ as one bar. Positions of ECDs in amino acid (aa) sequences of the corresponding UniProtKB canonical proteins are shown for individual CAR-T targets. ECD, extracellular domain; PSI, percent spliced in; SJ, splice junction.

Figure S5. IRIS Explorer: a web-based tool to explore and visualize IRIS results.
Shown are various visualizations generated by the web-based tool, IRIS Explorer, using an AS event in SYPL1 as an example. For this AS event, we used IRIS Explorer to generate five visualizations using data across NEPC and six selected normal tissue types from the IRIS DB (heart, blood, lung, liver, brain, and nerve). a, Violin plots show the PSI values in NEPC and the normal tissue panel. b, Bar plots show the fraction of samples expressing the skipping SJ in NEPC and the normal tissue panel. c, Bar plots show the fraction of samples expressing the inclusion SJs in NEPC and the normal tissue panel. The upstream and downstream inclusion SJs are displayed as two bars. d, Violin plots show the SJ count (in CPM) of the skipping SJ in NEPC and the normal tissue panel. e, Violin plots show the SJ count (in CPM) of the inclusion SJs in NEPC and the normal tissue panel. The upstream and downstream inclusion SJs are displayed as two violin plots. CPM, counts per million; SJ, splice junction.

Figure S6. IFNγ ELISA of TCR-transduced PBMCs co-cultured with K562-A2 and exogenously added peptides.
a, IFNγ ELISA of TCR-transduced peripheral blood mononuclear cells (PBMCs) co-cultured with K562 cells expressing HLA-A*02:01 (K562-A2) antigen-presenting cells (APCs) pulsed with a single cognate peptide. Error bars represent standard deviation (n=3). b, IFNγ ELISA of TCR-transduced PBMCs co-cultured with K562-A2 APCs in a dilution series of cognate peptide.

Dataset S1 (separate file). IRIS proteo-transcriptomics analysis of alternative splicing (AS)-derived peptides in cell line immunopeptidomes.
a. Summary of AS-derived epitopes in JeKo-1 cancer cell line. b. Summary of AS-derived epitopes in B-LCL-S1 normal cell line. c. Summary of AS-derived epitopes in B-LCL-S2 normal cell line.

Dataset S3 (separate file). Selected IRIS-predicted NEPC epitopes for TCR isolation and characterization.
a. Selected 76 IRIS-predicted NEPC epitopes for TCR isolation and characterization. b. Unique TCR clones recognizing the total IRIS-predicted epitope pool. c. Unique TCR clones specifically recognizing single IRIS-predicted epitopes.

Supplementary Materials and Methods
IRIS tumor-recurrence screen
IRIS's in silico screening module provides three distinct screening tests: a 'tumor-association screen', a 'tumor-specificity screen', and a 'tumor-recurrence screen', to identify alternative splicing (AS) events of varying degrees of tumor association and specificity. The first two screening tests are described in detail in the main manuscript. Here, we describe the tumor-recurrence screen, which allows IRIS to identify AS events that are recurrent (shared) among independent cohorts of similar tumor types. Specifically, for tumor-associated AS events identified by the tumor-association screen, the tumor-recurrence screen compares tumor types of similar histology (i.e. independent tumor cohorts selected from the IRIS DB or provided by users) against a matched normal tissue type selected from the IRIS DB. To define a differential AS event in this test, IRIS sets two default requirements: 1) a significant p-value from a statistical test (default: one-sided t-test p < 0.01, unequal variance allowed) in the same direction as identified in the tumor-association screen, and 2) a threshold of average PSI value difference (default: abs(ΔPSI) > 0.05). IRIS defines an AS event as tumor-recurrent if the number of significant tests for independent tumor cohorts against the matched normal tissue type reaches a user-defined threshold (e.g. 1 out of 2 independent tumor cohorts tested).
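The decision rule of the tumor-recurrence screen can be illustrated with a minimal sketch. This is not the IRIS implementation: the function names (`t_upper_tail`, `welch_one_sided_p`, `is_tumor_recurrent`), the encoding of direction as +1/-1, and the PSI values are hypothetical, and the t-distribution tail is approximated numerically with the standard library only so that the example is self-contained. In practice, a statistics library routine for Welch's t-test would normally be used instead.

```python
import math
from statistics import mean, variance


def t_upper_tail(t, df):
    """Approximate P(T > t) for Student's t with df degrees of freedom (Simpson's rule)."""
    if t < 0:
        return 1.0 - t_upper_tail(-t, df)
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

    steps = 2 * max(1000, int(t * 200))  # even number of Simpson intervals
    h = t / steps
    total = pdf(0.0) + pdf(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return max(0.0, 0.5 - total * h / 3)


def welch_one_sided_p(tumor, normal, direction):
    """One-sided Welch (unequal variance) t-test p-value; direction is +1 or -1."""
    n1, n2 = len(tumor), len(normal)
    v1, v2 = variance(tumor) / n1, variance(normal) / n2
    if v1 + v2 == 0:
        return 1.0  # no variation: test undefined, so never call it significant
    t = direction * (mean(tumor) - mean(normal)) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t_upper_tail(t, df)


def is_tumor_recurrent(cohorts, normal, direction, min_hits=1,
                       p_cut=0.01, dpsi_cut=0.05):
    """Count cohorts that are differential vs. the matched normal tissue.

    direction: sign of the PSI change found in the tumor-association screen.
    min_hits: user-defined number of cohorts that must pass (assumed name).
    Returns (recurrent, number_of_passing_cohorts).
    """
    hits = 0
    for psi in cohorts:
        dpsi = mean(psi) - mean(normal)
        if (abs(dpsi) > dpsi_cut and dpsi * direction > 0
                and welch_one_sided_p(psi, normal, direction) < p_cut):
            hits += 1
    return hits >= min_hits, hits


# Hypothetical PSI values for two independent tumor cohorts and one normal tissue.
cohort_a = [0.80, 0.85, 0.78, 0.82, 0.90]
cohort_b = [0.22, 0.24, 0.20, 0.26, 0.21]
normal_tissue = [0.20, 0.25, 0.22, 0.18, 0.30]
print(is_tumor_recurrent([cohort_a, cohort_b], normal_tissue, direction=+1))
```

With `min_hits=1`, the event in this toy example counts as recurrent because one of the two cohorts passes both requirements, mirroring the "1 out of 2" example threshold given above.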

Additional criteria to evaluate IRIS-predicted targets
IRIS evaluates and visualizes predicted targets based on multiple criteria (Fig. 4). In addition to the three main criteria described in the main manuscript (degree of tumor association, FC of the tumor-enriched isoform between tumor and normal tissues, and gene expression level in tumor tissues), additional features are reported for predicted targets. The 'degree of tumor specificity' is the number of normal tissue types against which all SJ(s) of the target's tumor-enriched isoform are tumor-specific, as defined by the SJ count-based tumor-specificity screen. The 'degree of tumor recurrence' is the number of independent tumor cohorts for which the AS event is defined as differential against a matched normal tissue type in the tumor-recurrence screen. The 'predicted HLA binding affinity' is the IEDB-predicted binding affinity for the epitope of interest. The 'mappability' is a measure of the SJ region's uniqueness or repetitiveness in the genome, based on the UCSC mappability track (1). The 'peptide uniqueness' indicates whether the SJ peptide sequence is unique among all annotated SJs and detected SJs in the analysis.
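Two of these features reduce to simple counting, which the following minimal sketch illustrates. The function names, input layouts, SJ identifiers, and peptide strings are all hypothetical and are not taken from the IRIS code base; the sketch assumes that per-tissue tumor-specificity calls for each SJ and the SJ peptide sequences have already been computed.

```python
from collections import Counter


def degree_of_tumor_specificity(sj_calls_by_tissue):
    """Count normal tissues in which every SJ of the tumor-enriched isoform
    was called tumor-specific (an empty call list never counts)."""
    return sum(1 for calls in sj_calls_by_tissue.values() if calls and all(calls))


def peptide_uniqueness(sj_peptides):
    """Map each SJ id to True when no other SJ in the analysis yields the
    same peptide sequence."""
    counts = Counter(sj_peptides.values())
    return {sj: counts[pep] == 1 for sj, pep in sj_peptides.items()}


# Hypothetical inclusion isoform with an upstream and a downstream SJ.
calls = {
    "heart": [True, True],   # both SJs tumor-specific vs. heart
    "lung": [True, False],   # downstream SJ fails vs. lung
    "liver": [True, True],
}
print(degree_of_tumor_specificity(calls))

# Hypothetical SJ peptides; sj2 and sj3 share a sequence.
peptides = {"sj1": "KLAVEQTSL", "sj2": "GLPDTVYEL", "sj3": "GLPDTVYEL"}
print(peptide_uniqueness(peptides))
```

Requiring all SJs of an isoform to pass follows the definition above, under which an inclusion isoform counts toward the degree of tumor specificity for a tissue only when all of its junctions are tumor-specific against that tissue.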

IRIS function for chimeric antigen receptor T-cell (CAR-T) target prediction
IRIS maps AS events to protein extracellular domains (ECDs) to discover potential CAR-T targets (Fig. S2). Specifically, IRIS collects and curates annotations of protein ECDs from UniProtKB (2). First, protein cellular localization information is retrieved from the UniProtKB database (flat file downloaded in April 2018). ECD information is retrieved by searching for the term 'extracellular' in topology annotation fields, including 'TOPO_DOM', 'TRANSMEM', and 'REGION', in the flat file. Next, BLAST (3) is used to map exons in the gene annotation (GENCODE V26) to proteins with topology annotations. Finally, the BLAST result is parsed to associate exons with protein ECDs. These curated annotations are used to identify SJs that overlap with protein ECDs as sources for potential AS-derived CAR-T targets.

Human granulocyte-colony stimulating factor (G-CSF)-mobilized peripheral blood was purchased commercially (HemaCare, Northridge, CA). Leukopaks were purchased from HemaCare. CD34+ hematopoietic stem cells (HSCs) were isolated using CliniMACS (Miltenyi) at UCLA. HSCs were thawed in warm R10 media and subsequently resuspended in dendritic cell (DC) differentiation media, which consisted of Minimum Essential Medium alpha (MEMα; Gibco, Thermo Fisher Scientific, CAT# 12571063) supplemented with 20% defined FBS and a 2X concentration of the following cytokines: 5 ng/mL stem cell factor (SCF; Peprotech, CAT# 300-07), 5 ng/mL FMS-like tyrosine kinase 3 ligand (FLT3-L; Peprotech, CAT# 300-19), 50 ng/mL thrombopoietin (TPO; Peprotech, CAT# 300-18), and 10 ng/mL granulocyte macrophage colony-stimulating factor (GM-CSF; Peprotech, CAT# 300-03). DMS79 cells were cultured in R10 media.

Autologous T-cell priming by DCs
When conventional type 1 dendritic cells (cDC1 cells) were used as antigen-presenting cells (APCs), autologous T cells were originally frozen down as the CD34− fraction of mobilized peripheral blood. On the day of DC harvesting, CD34− cells were thawed in warm R10 media and incubated overnight in AIM-V media supplemented with 5% heat-inactivated human AB serum, 1X GlutaMAX, and 55 µM β-mercaptoethanol with 5 ng/mL IL7 (Peprotech,. Isolated CD45+ DCs were resuspended in DC differentiation media with 10 µg/mL poly(I:C) (Sigma, CAT# P1530-25MG), 10 µg/mL R848 (Sigma, CAT# SML0196-10MG), and the peptide pool of interest, such that the concentration of each peptide was within 2-10 µg/mL. The DCs were then plated overnight to mature in a 96-well V-bottom plate at 200 µL/well. The next day, T cells were isolated using magnetic beads (Miltenyi,. Mature DCs and isolated T cells were combined at a 1:4 ratio in AIM-V media supplemented with 5% heat-inactivated human AB serum, 1X GlutaMAX, 55 µM β-mercaptoethanol, 5 µg/mL poly(I:C), 5 µg/mL R848, 10 ng/mL GM-CSF, 30 ng/mL IL21 (Peprotech,, and 5 ng/mL IL7. The cell suspension was then plated into 48-well plates at 500 µL/well. Three days later, cells were supplemented with fresh AIM-V media supplemented with 5% heat-inactivated human AB serum, 1X GlutaMAX, 55 µM β-mercaptoethanol, 10 ng/mL IL7, and 10 ng/mL . T-cell expansion in IL-7/IL-15 was continued until analysis. Priming assay replicates were kept separate throughout the expansion phase to prevent dilution of antigen-specific responses.

T-cell priming by PBMCs
Frozen vials of primary human PBMCs from a single donor were purchased from AllCells (Alameda, CA). PBMCs were thawed and plated at 5 x 10^6 cells/mL in AIM-V media supplemented with 5% heat-inactivated human AB serum, 1X GlutaMAX, 55 µM β-mercaptoethanol, 50 U/mL IL-2, and 1 ng/mL in a 24-well plate. Cells were rested overnight. On the following day, 1 mL of media containing 1-10 µg/mL peptide and 50 U/mL IL-2 was added. A half media change with 2X peptide and cytokine was performed every 2-3 days for up to 9 days.

10X Genomics single-cell VDJ sequencing
The TCR VDJ libraries were constructed by the Technology Center for Genomics & Bioinformatics at UCLA per the standard 10X Genomics protocol. Libraries were then sequenced on MiSeq or NextSeq (Illumina).

Cloning of TCR constructs
TCR alpha and beta chain sequences from activated cells returned from 10x Genomics sequencing were ordered as gBlocks (IDT or Twist) and cloned as previously described (4).

TCR screening in Jurkat NFAT-GFP
TCRs were cloned into the pmaxCloning™ Vector (Lonza, CAT# VDC-1040) and screened by co-culturing Jurkat-NFAT-GFP CD8 reporter cells and K562-A2. Miniprep plasmid DNA was purified (Qiagen, CAT# 27106) and eluted in nuclease-free water. DNA concentrations were routinely above 200 ng/µL. Jurkat cells were spun down at 596 x g for 5 min and resuspended in 20 µL of the Lonza SE cell line media (Lonza, CAT# V4XC-1032) per transfection reaction. Next, 5 x 10^5 Jurkat cells and 2 µL of miniprep DNA were added per nucleofection well and electroporated using the 4D Nucleofector Jurkat E6.1 protocol (Lonza). Reactions were rested for 10 min, after which 80 µL of warm R10 media was added to each well and the cells were plated in 500 µL of R10 media overnight. The next day, cells were stimulated with K562-A2 cells loaded with DMSO (solvent control) or peptide at 10 µg/mL. Co-cultures were incubated overnight. The following day, 96-well plates were spun down at 1,026 x g for 2 min and stained with CD8-PE (Invitrogen, CAT# 12-0088-42) and murine TCR beta-APC (BioLegend, CAT# 109212) antibodies. The response was quantified using flow cytometry with the following gating scheme: light scatter, murine TCR+/CD8+, and GFP+.

Peptide pooling and deconvolution
Peptides were pooled both into a total pool and into a tiled sub-pool matrix in which each peptide was represented in two unique sub-pools. TCRs reactive to two sub-pools were then re-stimulated with the corresponding single peptide. Peptide pool deconvolution and single-peptide confirmation were performed using sub-pools of the total peptide pool and single peptides at 10 µg/mL.
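The logic of matrix deconvolution can be illustrated with a minimal sketch. The square row/column layout, the pool labels (`R1`, `C1`, ...), the peptide names, and both function names are assumptions made for illustration; the source states only that each peptide was represented in two unique sub-pools, not the exact matrix dimensions used.

```python
import math


def build_tiled_pools(peptides):
    """Place peptides on a near-square grid; each peptide joins one row pool
    and one column pool, so it appears in exactly two sub-pools."""
    side = math.ceil(math.sqrt(len(peptides)))
    pools = {}
    for i, pep in enumerate(peptides):
        row, col = divmod(i, side)
        pools.setdefault(f"R{row + 1}", []).append(pep)
        pools.setdefault(f"C{col + 1}", []).append(pep)
    return pools


def deconvolve(pools, reactive_pools):
    """Return the peptides shared by all reactive sub-pools (candidates for
    single-peptide confirmation)."""
    sets = [set(pools[name]) for name in reactive_pools]
    return set.intersection(*sets) if sets else set()


# Hypothetical names standing in for the 76 selected epitopes.
peptides = [f"pep{i}" for i in range(1, 77)]
pools = build_tiled_pools(peptides)
print(len(pools))                        # number of sub-pools on the grid
print(deconvolve(pools, ["R2", "C3"]))   # candidate at the row/column intersection
```

A TCR that responds to one row pool and one column pool points to the single peptide at their intersection, which is why the protocol re-stimulates with that peptide alone as the confirmation step.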