In vitro proteasome processing of neo-splicetopes does not predict their presentation in vivo

Proteasome-catalyzed peptide splicing (PCPS) of cancer-driving antigens could generate attractive neoepitopes to be targeted by T cell receptor (TCR)-based adoptive T cell therapy. Based on a spliced peptide prediction algorithm, TCRs were generated against putative KRASG12V- and RAC2P29L-derived neo-splicetopes with high HLA-A*02:01 binding affinity. TCRs generated in mice with a diverse human TCR repertoire specifically recognized the respective target peptides with high efficacy. However, we failed to detect any neo-splicetope-specific T cell response when testing the in vivo neo-splicetope generation and obtained no experimental evidence that the putative KRASG12V- and RAC2P29L-derived neo-splicetopes were naturally processed and presented. Furthermore, only the putative RAC2P29L-derived neo-splicetope was generated by in vitro PCPS. The experiments raise serious questions about the notion that available algorithms or the in vitro PCPS reaction reliably simulate in vivo splicing and argue against the general applicability of an algorithm-driven ‘reverse immunology’ pipeline for the identification of cancer-specific neo-splicetopes.


Introduction
Defined anti-tumor CD8+ T cell responses require the proteasome-dependent processing of intracellular proteins and the efficient generation of antigenic peptides presented in the context of HLA class I molecules at the cell surface for TCR recognition. An important step in defining the proteasome as the HLA class I epitope-generating machinery was the early observation that purified 20S proteasomes in combination with synthetic polypeptide substrates encompassing the epitope of interest reproduced the in vivo generation of these epitopes with high fidelity (Boes et al., 1994; Guillaume et al., 2012; Kessler et al., 2006; Niedermann et al., 1995; van der Bruggen and Van den Eynde, 2006). Thus, in vitro antigen processing experiments in combination with specific CD8+ T cells to monitor HLA class I binding and immune recognition are a widely used and reliable tool to verify the generation efficiency of antigenic peptides of viral, bacterial, and human origin.
Our view on antigen processing was significantly extended by analysis of cancer patient-derived CD8+ T cells revealing that, by proteasome-catalyzed peptide splicing (PCPS), proteasomes can also fuse excised peptide fragments in a reverse proteolysis reaction, thereby generating new immune-reactive spliced epitopes (splicetopes) with an amino acid sequence that differs from that of the substrate protein (Hanada et al., 2004; Vigneron et al., 2004). The isolation of splicetope-specific CD8+ T cells from cancer patients and the finding that splicetope-specific CD8+ T cells derived from tumor-infiltrating lymphocytes (TILs) inhibited the engraftment of human acute myeloid leukemia cells in SCID mice indicated the potential immune relevance of such tumor antigen-derived splicetopes (Robbins et al., 1994).
Importantly, for the fibroblast growth factor (FGF)-5 and several splicetopes derived from the tumor differentiation antigen gp100mel, in vitro proteasome splicing reactions were also found to mimic the in vivo splicing reactions (Ebstein et al., 2016; Warren et al., 2006), suggesting that in vitro PCPS reactions may be a useful tool to discover new spliced epitopes generated from tumor antigens of interest. To be able to identify splicetopes in in vitro PCPS experiments independent of the availability of patient-derived CD8+ T cells, we developed the prediction algorithms ProteaJ (Liepe et al., 2010) and the here-described ProtAG. Using these algorithms, we established an inclusion list of potentially immune-relevant spliced peptides theoretically generated from a given antigen, which in combination with the mass spectrometric analysis of the in vitro digest should allow the identification of new splicetopes. Testing the feasibility of such an algorithm-aided 'reverse immunology' approach, we had isolated CD8+ T cells from Listeria monocytogenes-infected mice that specifically recognized two phospholipase PlcB-derived splicetopes generated by the proteasome in vitro and in vivo (Platteel et al., 2017).
Targeting somatic cancer-specific driver mutations derived from neoantigens by TCR-mediated adoptive T cell transfer (ATT) represents a promising approach for personalized cancer therapy. One drawback of this approach is that neoepitopes often may not exhibit HLA class I binding affinities sufficient to trigger an efficient T cell response or may not be generated efficiently by the proteasome. In fact, even if a suitable neoepitope is generated, its HLA haplotype specificity frequently does not match the patient's HLA class I alleles, consequently excluding these tumor patients from ATT.
As outlined above, taking advantage of PCPS for the identification of spliced neoepitopes (neo-splicetopes) may therefore represent an interesting approach to identify suitable TCR targets when the recurrent somatic mutations in a tumor antigen do not result in the production of a non-spliced tumor neoepitope exhibiting either a sufficient HLA class I binding affinity or the appropriate HLA class I haplotype. Furthermore, because it ligates two distant peptide fragments, PCPS has the potential not only to generate high-affinity neo-splicetopes harboring the respective somatic mutation but also to extend the HLA haplotype diversity of epitopes generated from a given neoantigen.
In a proof-of-principle 'reverse immunology' study, we here identified HLA-A*02:01-restricted putative neo-splicetopes, predicted by a spliced peptide prediction algorithm, that are derived from the two recurrent somatic mutations KRAS G12V and RAC2 P29L. TCRs specific for the putative neo-splicetopes were generated in huTCR-α/huTCR-β gene loci transgenic HLA-A*02:01 mice (Li et al., 2010). The TCRs recognized the respective putative neo-splicetope with high efficacy when tested in vitro. However, we failed to detect a neo-splicetope-specific T cell response when testing the in vivo (in cellulo) generation of the predicted neo-splicetopes and thus failed to gain evidence that the two KRAS G12V- and RAC2 P29L-derived neo-splicetopes were also generated in vivo as predicted by algorithm-aided studies. In addition, only the predicted neo-splicetope for RAC2 P29L could be confirmed by in vitro proteasomal digest. The experiments raise serious questions about the applicability of the previously highlighted pipeline for the identification of immune-relevant neo-splicetopes.

Generation of KRAS G12V splicetope-specific TCRs in a humanized mouse model
To analyze the immunogenicity of spliced epitopes, we utilized transgenic mice (ABabDII mice) that harbor the human TCRαβ gene loci as a source for a diverse human TCR repertoire that is selected by chimeric HLA-A*02:01 (Li et al., 2010). Upon immunization with the peptides KLVVGAVGV and KLVVVAVGV (representing sp1 and sp2, respectively), these mice mounted a CD8+ T cell response detected by in vitro re-stimulation of peripheral blood lymphocytes 7 days after the last immunization, whereas mice without immunization did not show reactivity (Figure 1A, Figure 1-figure supplement 1A). Both peptides induced a specific CD8+ T cell response. By sorting IFNγ-positive sp1- and sp2-reactive CD8+ T cells from splenocytes of responder mice using an IFNγ-capture assay (not shown), specific TCRs were isolated by rapid amplification of cDNA ends (5′ RACE)-PCR and cloning of the most abundant rearranged TCR-α and TCR-β genes for each individual mouse. One TCR directed against the sp1 epitope (1376) and two TCRs specific for the sp2 epitope (9383B2 and 9383B14) were isolated. Codon-optimized sequences encoding the α- and β-chains were linked with a P2A element and inserted into the retroviral expression vector pMP71, transduced into human peripheral blood mononuclear cells (PBMC) (Figure 1B, Figure 1-figure supplement 1B), and tested for specificity by measuring release of IFNγ in a co-culture with TAP-deficient T2 cells loaded with titrated amounts of sp1 (Figure 1C) or sp2 peptides (Figure 1-figure supplement 1C), respectively. All three TCRs induced robust IFNγ release at peptide concentrations down to 10⁻¹⁰ M, suggesting high functional avidity for these TCRs. For TCR 1376 and TCR 9383B2, cross-reactivity to the in silico predicted linear KRAS G12V KLVVVGAVGV peptide was only seen for the highest peptide concentrations (Figure 1D and not shown, respectively).

KRAS G12V splicetope-specific TCRs do not recognize cancer cells endogenously expressing mutant KRAS G12V
One of the critical tests for the usefulness of therapeutic TCRs in genetically modified T cells is the recognition of cancer cells that endogenously express the respective mutation. This test was even more decisive for our approach because the neo-splicetopes had so far only been predicted in silico but not detected in cells. Therefore, PBMC genetically engineered to express sp1- and sp2-specific TCRs were co-cultured with a series of cancer cell lines that harbored the G12V mutation within the KRAS gene. MCF7 and Mel624 cells with two KRAS wildtype copies served as controls. Whereas some of the cell lines used expressed HLA-A*02:01 (Figure 2A, Figure 2-figure supplement 1A, B), the HLA-A*02:01-negative cell lines were transduced with an HLA-A*02:01-expressing retroviral construct (Figure 2B). Presence of sufficient amounts of HLA-A*02:01 for T cell recognition was analyzed by prior loading of the tumor cells with 10⁻⁶ M of the respective peptide. In all cases, peptide-loaded cancer cells were recognized by the TCR-engineered T cells. In contrast, IFNγ release by TCR 1376 (Figure 2A, B) or TCR 9383B2 and TCR 9383B14 engineered T cells (Figure 2-figure supplement 1A, B) was not above background when cancer cells were co-cultured without prior peptide loading, indicating that the endogenous KRAS G12V protein is not recognized. Cancer cells were also treated with IFNγ 48 hr prior to co-culture with the respective TCR-modified PBMCs. As exemplarily shown for sp2-specific TCRs, again only peptide-loaded tumor cells were recognized.
Figure 1. (A) [...] (Li et al., 2010) 7 days after the last immunization with sp1 (KLVVGAVGV). Stimulation with CD3/CD28 beads served as positive control; co-culture without peptide (Ø) was used as negative control. Numbers in brackets represent percent IFNγ+ CD8+ T cells, respectively. Spleens of mice with IFNγ-reactive CD8+ T cells were cultured for 10 days in the presence of 10⁻⁸ M of sp1 KRAS peptide, and reactive CD8+ T cells were purified by IFNγ-capture assay for isolation of TCR α and β chains by RACE-PCR. (B) The corresponding TCR α and β chains isolated from one KRAS G12V sp1 peptide-immunized ABabDII mouse (1376) were cloned into the retroviral vector pMP71 and re-expressed in human PBMC. Transduction efficacy was measured by staining of the mouse TCRβ constant chain on CD8+ T cells, and the number of positive CD8+ T cells is shown in brackets. (C) TCR gene transfer confers specificity for mutant spliced KRAS G12V peptide KLVVGAVGV (sp1). IFNγ production of KRAS G12V splice-specific 1376 TCR-transduced T cells upon co-culture with sp1-peptide-loaded T2 cells (1376 [solid bars]). As negative control, T2 cells were not peptide loaded. For maximal stimulation, phorbol myristate acetate (PMA) and ionomycin (P + I) were added to the co-culture. All target cells were also co-cultured with non-transduced T cells (Ø, open bars). (D) TCR gene transfer confers cross-reactivity for mutant linear KRAS G12V peptide KLVVVGAVGV. IFNγ production of KRAS G12V splice-specific 1376 TCR-transduced T cells upon co-culture with KRAS G12V linear peptide-loaded T2 cells (1376 [solid bars]). As negative control, T2 cells were not peptide loaded. For maximal stimulation, PMA and ionomycin (P + I) were added to the co-culture. All target cells were also co-cultured with non-transduced T cells (Ø, open bars). Experiments were done at least in duplicate.
The online version of this article includes figure supplements for Figure 1.

T cells harboring KRAS G12V splicetope-specific TCRs do not recognize overexpressed KRAS G12V
One challenge of targeting neoepitopes with T cells is the low abundance of many neoepitopes presented by the respective HLA class I molecules at the cell surface, which may hamper recognition by T cells. To exclude low expression level as one reason for the failure of TCR 1376-engineered T cells to recognize the spliced form of the KRAS G12V peptide on the cancer cells, we generated cancer cells (MCF7, Mel624, and mouse NIH-HHD) that ectopically overexpressed triple minigenes encoding three copies of the KRAS G12V mutation interconnected by an AAY sequence that ensures proteasomal cleavage (Spiotto et al., 2002). Specifically, we generated triple minigene cassettes that either encoded the N-terminal 35mer polypeptide of KRAS G12V 1-35 or, as controls, triple minigenes that encoded the predicted non-spliced 10mer KRAS G12V 5-14 peptide epitope, the spliced 9mer KRAS G12V 5-8/10-14, or the KRAS wt 5-8/10-14 peptide epitope, respectively (Figure 3A). As shown in Figure 3B-D, TCR 1376-positive T cells efficiently recognized the KRAS G12V 5-8/10-14 peptide either when loaded onto MCF7 (Figure 3B), Mel624 (Figure 3C), and mouse NIH-HHD (Figure 3D) cells or when expressed as a triple epitope. In contrast, no IFNγ release was elicited with cells expressing the triple KRAS G12V 1-35 35mer polypeptide (Figure 3B-D). Quantitative PCR analysis of the KRAS G12V triple minigene 35mer and the KRAS G12V triple epitope spliced nonamer revealed that the KRAS G12V triple minigene 35mer is expressed at almost twice the level of the KRAS G12V triple epitope spliced nonamer (Figure 3E).
Altogether, this indicates that the spliced peptide, theoretically predicted, is either not generated in vivo or, despite the overexpression of the KRAS G12V 1-35 substrate, is produced at amounts insufficient to be recognized by KRAS G12V 5-8/10-14 -specific high-affinity T cells.

KRAS G12V splice peptide-specific TCR 1376 cross-reacts with HLA-C*07 alleles
We initially identified two cell lines with the G12V mutation (SW480 and SW620) that induced IFNγ release by TCR 1376-transduced PBMC upon co-culture (Figure 3-figure supplement 1A). Upon co-culture with a panel of lymphoblastoid B cell lines (BLCLs) that harbor a series of different HLA class I molecules, a test for potential HLA allo-reactivity that we routinely perform with TCRs obtained from ABabDII mice, we uncovered reactivity to several BLCLs. To finally prove HLA allo-reactivity of the TCR 1376 to HLA-C*07, we performed co-culture with the HLA-deficient myelogenous leukemia cell line K562 that had been transduced with HLA-C*07:01, HLA-C*07:02, and HLA-A*02:01 molecules, respectively. The experiments showed that K562-C*07:01 and K562-C*07:02 cell lines were recognized by TCR 1376-transduced PBMC from three independent donors irrespective of loading with peptide sp1, whereas K562-A*02:01 cells only induced IFNγ release when these cells were loaded with sp1 peptide prior to co-culture (Figure 3-figure supplement 1C). These results clearly indicate that the TCR 1376 directly recognizes members of the HLA-C*07 sub-family and/or peptides bound therein as well as sp1 peptide bound to HLA-A*02:01.

Triple 35mer polypeptide of KRAS G12V 1-35 minigenes are not immunogenic in vivo
In order to analyze whether triple 35mer polypeptide KRAS G12V 1-35 minigenes would induce a CD8+ T cell response in vivo, we performed immunizations of ABabDII mice with an adenovirus expressing the N-terminal 35mer polypeptide of KRAS G12V 1-35 as a triple minigene. Despite multiple immunizations, restimulation with neither the linear 10mer KLVVVGAVGV nor the spliced epitopes sp1 KLVVGAVGV, sp2 KLVVVAVGV, sp3 YLVVVGAVGV, or sp4 KLVVVGVGV induced IFNγ release by CD8+ T cells in an intracellular cytokine staining of PBMCs 7 days after the last immunization (Figure 3-figure supplement 2). This supports the notion that the KRAS G12V mutation is not immunogenic in the context of HLA-A*02:01, irrespective of whether splicing events occur or not.
Our failure to detect immune-reactive KRAS G12V-derived neo-splicetopes under in vivo conditions raised doubts with respect to the reliability of the previously proposed solely algorithm-based pipeline for identification of immune-relevant neo-splicetopes. Therefore, we studied the generation of the KRAS G12V-derived neo-splicetopes in more detail in in vitro PCPS assays. Accordingly, the polypeptide substrates KRAS G12V 2-35, KRAS G12V 2-32, KRAS G12V 2-21, and KRAS G12V 2-14 were synthesized. However, due to the extreme hydrophobicity of the KRAS protein, the designed longer polypeptide substrates KRAS G12V 2-35 and KRAS G12V 2-32 encountered considerable difficulties during synthesis and subsequent purification, resulting in a highly impure product not suited for in vitro digestion experiments (Figure 3-figure supplement 3). Consequently, we used the polypeptides KRAS G12V 2-21 and KRAS G12V 2-14 for the in vitro PCPS reactions. KRAS G12V 2-14 was chosen based on previous data showing that C-terminal cleavage generating the C-terminal anchor residue is not essentially required to generate a spliced gp100-derived epitope (Liepe et al., 2010; Vigneron et al., 2004).
Monitoring the kinetics of proteasomal spliced peptide generation represents an essential parameter for assessing the fidelity of in vitro PCPS reactions. To search for spliced peptides, a FASTA file generated with ProtAG was loaded onto PD2.1 and the kinetics analyzed with LC Quan 2.7 (Willimsky et al., 2021). At t = 0, none of the predicted neo-splicetopes was identified. However, following the generation of the KRAS G12V-derived putative spliced neoepitopes from the polypeptide substrates KRAS [...]
The apparent contradiction between our in vivo experiments reported above and the results of the in vitro PCPS reactions was unexpected, considering that for the several spliced epitopes published so far there seemed to be a good correlation between the in vitro and in vivo results (Dalet et al., 2011; Ebstein et al., 2016; Michaux et al., 2014; Mishto et al., 2012; Platteel et al., 2017).
Figure 3. [...] Target cells were loaded with 10⁻⁶ M spliced peptide or transduced with either KRAS G12V triple minigene 35mer or KRAS G12V triple epitope spliced nonamer. KRAS wt triple epitope spliced nonamer and KRAS G12V triple epitope linear decamer were used as controls. IFNγ production of transduced T cells is shown (red bars). For maximal stimulation, phorbol myristate acetate (PMA) and ionomycin (PMA/Iono) were added, and all target cells were also co-cultured with non-transduced T cells (gray bars; Ø). Representative measurements are shown, and experiments were done at least in duplicate. (E) Relative amounts of KRAS G12V triple minigene 35mer and KRAS G12V triple epitope spliced nonamer were determined by qPCR on transduced NIH-HHD cells. KRAS G12V triple epitope spliced nonamer expression is arbitrarily set to 1. The online version of this article includes figure supplements for Figure 3.
This led us to perform a more detailed MS analysis of the polypeptide substrate used for the in vitro PCPS experiments. Indeed, we found that most likely the accumulation of hydrophobic amino acid residues within the KRAS G12V polypeptide substrates had led to mistakes during polypeptide synthesis, resulting in the synthesis of faulty polypeptides (Supplementary file 2) mimicking in sequence the results of the predicted splicing reaction (Figure 3-figure supplement 5B, Willimsky et al., 2021). Therefore, it was impossible to decide whether the candidate KRAS G12V-derived spliced peptides identified in vitro were true splicing products or, as it appeared, the product of normal proteasomal cleavage of preexisting faulty polypeptides inappropriately simulating a splicing event. Furthermore, depending on the substrate, in vitro generation of non-spliced epitopes can be orders of magnitude more efficient than the generation of spliced epitopes.
Thus, polypeptide substrates with mistakes in their sequence that are degraded at a rate similar to the rate of the correct substrate ( Figure 3-figure supplement 5C) may become a prevalent source for the generation of faked spliced peptides. We eventually obtained a KRAS G12V 1-21 polypeptide substrate (JPT Peptide Technologies, Berlin, Germany) without contaminants mimicking the predicted KRAS G12V 5-8/10-14 (sp1) and KRAS G12V 5-10/12-14 (sp4) splicing events. However, using this new KRAS G12V 1-21 polypeptide as substrate for kinetic in vitro PCPS experiments, we now failed to identify generation of either predicted KRAS G12V 9mer neo-splicetope. Although our experiments cannot completely exclude the generation of minor amounts of KRAS G12V -derived spliced peptides, they are in line with our failure to detect any immune-reactive KRAS G12V -derived spliced epitopes in vivo.
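Why a faulty substrate can imitate a splicing product is easy to see at the sequence level. The following minimal sketch assumes, purely for illustration, that the synthesis errors are single-residue deletions; the text above does not specify the type of error, and the function names are hypothetical. It uses only the KRAS G12V 5-14 stretch (KLVVVGAVGV) and the predicted spliced 9mers given in the text.

```python
def single_deletion_variants(substrate, first_position=1):
    """Map the position of each deleted residue to the resulting deletion sequence."""
    return {
        first_position + i: substrate[:i] + substrate[i + 1:]
        for i in range(len(substrate))
    }


def deletions_mimicking(substrate, peptide, first_position=1):
    """Positions whose single deletion yields `peptide` as a contiguous (linear) stretch."""
    variants = single_deletion_variants(substrate, first_position)
    return sorted(pos for pos, seq in variants.items() if peptide in seq)


# Linear KRAS G12V 5-14 stretch and predicted spliced 9mers, as quoted in the text.
KRAS_G12V_5_14 = "KLVVVGAVGV"
PREDICTED = {"sp1": "KLVVGAVGV", "sp2": "KLVVVAVGV", "sp4": "KLVVVGVGV"}

# Assumption: errors are single deletions; residue numbering starts at position 5.
mimics = {
    name: deletions_mimicking(KRAS_G12V_5_14, peptide, first_position=5)
    for name, peptide in PREDICTED.items()
}

for name, positions in mimics.items():
    print(name, "is mimicked by deleting residue(s) at position(s)", positions)
```

Under this assumption, each predicted 9mer is already present as a linear sequence in at least one deletion variant, so ordinary proteasomal cleavage of such a contaminant would release a peptide indistinguishable in sequence from the spliced product.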

Identification and functional characterization of RAC2 P29L -derived neosplicetopes
RAC2 is a small GTPase belonging to the Rho family of GTPases. The RAC2 P29L mutation is another so-called driver mutation facilitating tumor growth as well as metastasis and thus presents a potential target in ATT. The linear RAC2 P29L FLGEYIPTV epitope has a predicted IC50 of 2 nM. To identify RAC2 P29L-specific neo-splicetopes, we applied the ProtAG algorithm to predict all theoretically possible RAC2 P29L 20-44-derived spliced peptides. From this initial screen, we selected all theoretical linear spliced 9mer peptides with a calculated HLA-A*02:01 binding affinity of IC50 < 100 nM (Jurtz et al., 2017). To establish a cleavage map and identify all linear proteasomal cleavage products generated from the RAC2 P29L 20-44 polypeptide substrate, we performed in vitro digestions for 24 hr and 48 hr using erythrocyte and LcL 20S proteasomes (Willimsky et al., 2021). The non-spliced RAC2 P29L 28-36 neoepitope FLGEYIPTV was also identified in these digests (Figure 4A). To search for spliced peptides, a FASTA file generated with ProtAG was loaded onto PD2.1 (Willimsky et al., 2021). In this search, the spliced RAC2 P29L 28-34/36-37 (28 FLGEYIP 34 / 36 VF 37) peptide, with a calculated HLA-A*02:01 binding affinity of IC50 = 24.7 nM, was found to be the only HLA-A*02:01-restricted putative RAC2 P29L neo-splicetope with an IC50 < 100 nM generated. To confirm the initial identification of RAC2 P29L 28-34/36-37, kinetic in vitro PCPS reactions were performed and analyzed by applying the LC Quan software version 2.5 (Thermo Fisher). The amounts of peptides generated in an in vitro processing reaction can vary dramatically depending on the assay conditions, allowing only a relative estimation. However, judging by ion counts, in vitro generation of the non-spliced RAC2 P29L 28-36 neoepitope was approximately 200-fold more efficient than generation of the putative neo-splicetope RAC2 P29L 28-34/36-37 (Figure 4A).
To exclude that generation of RAC2 P29L 28-34/36-37 was the result of an accidental singular splicing event we screened the digests for additional PCPS products. Interestingly, the putative RAC2 P29L 28-34/36-37 neo-splicetope seemed to be the result of the excision of a single aa residue (T 35 ) and the C-terminal ligation of the dipeptide 36 VF 37 to the N-terminal 28 FLGEYIP 34 fragment. However, repetitive specific ligation of a dipeptide in a PCPS reaction would require the unlikely existence of a corresponding specific dipeptide binding site close to the active site and the respective acceptor fragment.
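The prediction step described above (listing every theoretically possible spliced 9mer before filtering by HLA binding) can be sketched as a simple enumeration. This is not the ProtAG implementation, which is not described here; it is a minimal illustration that assumes forward cis-splicing of exactly two fragments with at least one excised residue between them. The function name is hypothetical, and the substrate is limited to residues 28-37 (FLGEYIPTVF), as reconstructed from the fragments quoted in the text.

```python
from collections import defaultdict


def cis_spliced_peptides(substrate, length, first_position=1):
    """Enumerate forward cis-spliced peptides of a given length.

    Each peptide is an N-terminal fragment ligated to a downstream C-terminal
    fragment, with at least one residue excised in between. Returns a dict
    mapping peptide -> list of (n_start, n_end, c_start, c_end) coordinates.
    """
    origins = defaultdict(list)
    n = len(substrate)
    for i in range(n):                      # start of the N-terminal fragment
        for j in range(i + 1, n):           # end (exclusive) of the N-terminal fragment
            c_len = length - (j - i)        # residues still needed from the C-terminal fragment
            if c_len < 1:
                break
            for k in range(j + 1, n - c_len + 1):   # k > j enforces an excised gap
                peptide = substrate[i:j] + substrate[k:k + c_len]
                origins[peptide].append((
                    first_position + i,
                    first_position + j - 1,
                    first_position + k,
                    first_position + k + c_len - 1,
                ))
    return dict(origins)


# RAC2 P29L residues 28-37, reconstructed from 28 FLGEYIP 34, T 35, and 36 VF 37.
RAC2_P29L_28_37 = "FLGEYIPTVF"
candidates = cis_spliced_peptides(RAC2_P29L_28_37, length=9, first_position=28)

for peptide, coords in sorted(candidates.items()):
    print(peptide, coords)
```

In a real pipeline, each candidate would then be scored for HLA-A*02:01 binding and only those below the chosen IC50 cut-off retained; that scoring step is deliberately left out here because it depends on an external predictor.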

RAC2 P29L 28-34/36-37 splicetope-specific TCR does not recognize RAC2 P29L triple epitope 45mer
Spliced RAC2 P29L 28-34/36-37 peptide-specific TCRs were generated by immunizing ABabDII mice with the corresponding synthetic 9mer peptides (not shown). For analysis of in vivo generation and presentation of the spliced RAC2 P29L 28-34/36-37 peptide, we transduced Mel21a cells to express a triple RAC2 P29L 1-45 45mer polypeptide minigene (Figure 5A) and monitored HLA-A*02:01 epitope presentation by T cell recognition. As shown in Figure 5B, no IFNγ release was obtained for the putative neo-splicetope RAC2 P29L 28-34/36-37 using TCR 20967A2-transduced T cells, while peptide-loaded Mel21a cells were readily recognized and IFNγ production demonstrated the target specificity of the TCR. Thus, despite the overexpression of the RAC2 P29L 1-45 45mer substrate peptide, we failed to verify the in vivo generation of the RAC2 P29L 28-34/36-37 peptide. We also raised a TCR (TCR 22894) in ABabDII mice against the linear RAC2 P29L peptide. Because Mel21a cells, transduced to express a triple RAC2 P29L 1-45 45mer (Figure 5B) or RAC2 P29L cDNA (Figure 5C), were readily recognized by T cells transduced with TCR 22894, we could exclude that our inability to detect cell surface expression of the RAC2 P29L 28-34/36-37 peptide was due to defects in the antigen presentation pathway. We repeated the experiments with TCR-engineered mouse T cells derived from TCR1xCD45.1xRag1⁻/⁻ mouse splenocytes (expressing a monoclonal irrelevant TCR against SV40 large T) to monitor RAC2 P29L 28-34/36-37 peptide cell surface expression using mouse NIH-HHD cells expressing a chimeric HLA-A*02:01 (HHD) molecule. As observed in Mel21a cells, the non-spliced RAC2 P29L neoepitope (derived from the RAC2 P29L 1-45 45mer as well as cDNA) was efficiently presented also by NIH-HHD cells, excluding potential differences in the catalytic properties of mouse and human proteasomes (Figure 5D).
More importantly, again the spliced peptide-specific TCR 20967A2 did not confer any reactivity to T cells upon co-culture without prior peptide loading of the target cells or overexpression of the spliced epitope ( Figure 5B-D). Quantitative PCR analysis of the triple RAC2 P29L 1-45 45mer polypeptide minigene and RAC2 P29L triple epitope spliced nonamer expressed in mouse NIH-HHD cells revealed that RAC2 P29L 1-45 45mer polypeptide minigene is expressed almost fivefold higher than the RAC2 P29L triple epitope spliced nonamer ( Figure 5E). These experiments do not categorically exclude any in vivo generation of the in vitro identified RAC2 P29L -derived spliced peptide. However, they clearly demonstrate that even if the RAC2 P29L 28-34/36-37 neo-splicetope is derived from an overexpressed substrate protein, its amounts are negligible and insufficient to allow its recognition by T cells.
In summary, our results strongly question the idea that the in vitro PCPS reaction simulates the in vivo situation with the same high fidelity as the in vitro generation of non-spliced epitopes and contradict the previously highlighted idea that an algorithm-supported identification of in vitro-generated spliced epitopes is a suitable general approach for the facilitated identification of tumor-specific immune-relevant neo-splicetopes for subsequent TCR generation.

Discussion
Effective CD8+ T cell-induced immune responses depend on both the quality and the amount of proteasome-generated antigenic peptides available for presentation by HLA class I molecules to peptide-specific TCRs at the cell surface (Niedermann et al., 1999; Princiotta et al., 2003). Not neglecting TCR affinity or the HLA class I binding affinity of an epitope, in each case the amount of a specific epitope generated by proteasomes from a given antigen has to rise above a certain threshold to elicit a relevant T cell response.
In addition to substrate amounts and protein turnover rates, the generation efficiency of both non-spliced and spliced epitopes is determined by the sequence of the epitope and its surrounding protein sequence and, connected with these, by the cleavage site usage and cleavage strength of proteasomes (Mishto et al., 2014; Niedermann et al., 1995; Sijts and Kloetzel, 2011). Thus, even high-affinity peptides, if embedded in a non-favorable protein sequence, will not surpass the necessary threshold for eliciting a T cell response.
The cleavage properties of proteasomes have been intensively studied. However, due to the complexity of protein sequences and the lack of information on cleavage strength, which decisively determines epitope generation efficiency, algorithms predicting proteasomal generation of immune-relevant non-spliced epitopes still do not reach prediction efficiencies sufficient for large-scale 'reverse immunology' approaches (Calis et al., 2015; Di Carluccio et al., 2018; Singh and Mishra, 2016). Thus, many predicted epitopes may be false-positives, which could impede immunotherapy, for example, neoantigen vaccines. Prediction algorithms for spliced epitopes that are based on protein sequence and proteasomal cleavage properties do not yet exist.
Therefore, in vitro generation of epitopes using purified 20S proteasomes and synthetic polypeptide substrates encompassing the epitope(s) of interest, in combination with mass spectrometric analyses and both in vitro and in vivo CD8+ T cell assays, still represents the most frequently used tool to validate the generation of immune-relevant non-spliced peptides, assuming that it closely simulates the in vivo (in cellulo) situation with respect to both quality and relative amounts of the epitope (Kessler et al., 2001; Kessler and Melief, 2007; Sijts and Kloetzel, 2011).
While a number of virus- or tumor-derived non-spliced epitopes have been validated by in vitro experiments and correlated to the in vivo situation, examples for spliced HLA class I epitopes are still very limited. Nevertheless, what applied to non-spliced epitopes also seemed to be valid for spliced HLA class I epitopes generated in vitro by PCPS. Thus, FGF-5, SP110, and several gp100-derived spliced epitopes that are recognized by CD8+ T cells on the cell surface were demonstrated to be produced also in in vitro PCPS assays (Ebstein et al., 2016; Vigneron et al., 2019). Although in these cases the generation efficiency of spliced epitopes in in vitro PCPS assays seemed to be in line with the in vivo situation, it should be noted that the abundance of spliced epitopes presented at the cell surface is a matter of substantial controversy (Liepe et al., 2016; Liepe et al., 2019; Mylonas et al., 2018; Paes et al., 2019; Rolfs et al., 2019). Thus, in light of recent reports (Mylonas et al., 2018; Rolfs et al., 2019), the amount of cell surface-presented spliced epitope seems to be considerably less than initially estimated.
On the other hand, and supporting a potential immune relevance, the initial discovery of the splicing event and spliced epitopes was based on the identification of patient-derived CD8+ T cells reactive towards spliced epitopes generated from tumor antigens (Hanada et al., 2004; Vigneron et al., 2004). Widespread identification of spliced epitopes is, however, limited by the rare availability of corresponding specific CD8+ T cells. Therefore, we developed prediction algorithms allowing the mass spectrometric identification of predicted and in vitro proteasome-generated spliced peptides. Indeed, applying such an algorithm-aided 'reverse immunology' approach had previously led to the identification of two spliced phospholipase PlcB epitopes that primed antigen-specific CD8+ T cells in L. monocytogenes-infected mice (Platteel et al., 2017).
Because somatic mutations in tumor antigens frequently do not result in the generation of neoepitopes suitable for generation of TCRs for ATT therapy, we applied spliced peptide prediction algorithms to identify neo-splicetopes with HLA-A*02:01 binding affinity predicted to be generated from the mutant tumor antigens KRAS G12V and RAC2 P29L and used those for TCR generation.

Figure 5 continued
Respective human and mouse target cells were loaded with 10⁻⁶ M spliced or non-spliced RAC2 P29L peptide, or transduced with either RAC2 P29L triple epitope 45mer, RAC2 P29L triple epitope nonamer, or RAC2 P29L cDNA. Upon co-culture with recombinant TCR+ T cells, IFNγ release was measured. For maximal stimulation, phorbol myristate acetate (PMA) and ionomycin (PMA/Iono) were added, and all target cells were also co-cultured with non-transduced T cells (gray bars; Ø). Representative measurements are shown, and experiments were done at least in duplicate. (E) Relative amounts of RAC2 P29L triple epitope 45mer and RAC2 P29L triple epitope nonamer were determined by qPCR on transduced NIH-HHD cells. RAC2 P29L triple epitope nonamer expression is arbitrarily set to 1.
In vitro PCPS experiments in combination with MS analysis aiming at the identification of the algorithm-predicted putative KRAS G12V -derived neo-splicetopes, however, gave no conclusive evidence for their in vitro generation. In the initial kinetic splicing reactions, we seemed to have identified the predicted spliced peptides, thereby also corroborating data obtained with a KRAS G12V 2-35 polypeptide substrate reported by Mishto et al., 2019. However, we found that most likely the extreme hydrophobicity of the KRAS amino acid composition had led to faulty polypeptide synthesis, resulting in polypeptide substrates mimicking in sequence the results of the predicted splicing reaction (Figure 3-figure supplement 5, Supplementary file 2). Considering that in general the generation of spliced peptides is significantly less efficient than that of non-spliced peptides (Figure 4A), even minor amounts of faulty peptide substrates will become prevalent in in vitro splicing reactions (see also Figure 3-figure supplement 5B). Because high-quality peptide synthesis reaches a purity of 95-99% at most, a thorough substrate analysis appears essential to avoid false-positive results in in vitro PCPS experiments, particularly for chemically difficult substrates. However, when we used a newly synthesized KRAS G12V substrate not contaminated with peptides mimicking the KRAS G12V 5-8/9-14 and KRAS G12V 5-10/12-14 splicing reactions, we failed to identify the in silico-predicted neo-splicetope. This negative result cannot conclusively prove the non-existence of the KRAS G12V 5-8/9-14 and KRAS G12V 5-10/12-14 spliced peptides, but being in line with the in vivo experiments, one has to conclude that if these KRAS G12V -derived spliced epitopes are generated, then their amount is below the detection limit. In contrast to the negative results obtained with respect to KRAS G12V , the analysis of RAC2 P29L led to the identification of the in silico-predicted RAC2 P29L 28-34/36-37 peptide in in vitro PCPS experiments.
Nevertheless, the putative RAC2 P29L 28-34/36-37 neo-splicetope was generated significantly less efficiently than the non-spliced RAC2 P29L 28-36 neoepitope. Testing the in vivo generation of the spliced KRAS G12V 5-8/10-14 and RAC2 P29L 28-34/36-37 peptides using the respective peptide-specific TCRs, which were of high affinity and recognized as little as 10⁻¹⁰ M peptide, we obtained no T cell signal and no evidence for the immune relevance of the two candidate neo-splicetopes, independent of the experimental conditions. Our experiments provide no evidence that either KRAS G12V 5-8/10-14 or RAC2 P29L 28-34/36-37 is produced in vivo or presented at the cell surface. However, even if both neo-splicetopes were generated in vivo, their generation efficiency and the total amount presented by HLA-A*02:01 molecules on the cell surface are too low to be of any immune significance. One possible explanation for the failure to verify the in vitro PCPS reaction for RAC2 P29L 28-34/36-37 in in vivo settings could be the high substrate and proteasome concentrations used for in vitro PCPS, which may force splicing reactions that either do not occur or occur only inefficiently under in vivo conditions, where only a single substrate protein enters the catalytic cavity of the proteasome for processing at a given time.
Thus, quite in contrast to the experience resulting from proteasome-dependent processing of non-spliced epitopes in vitro, in vitro generation of spliced epitopes by PCPS may not exhibit the same fidelity and does not always simulate the efficiency of in vivo spliced epitope generation. This strongly questions the general application of the recently highlighted experimental pipeline for the identification of cancer-specific neo-splicetopes. Reconsidering the workflow for the identification of neo-splicetopes, it thus seems that in vitro PCPS, even when combined with peptide binding and TAP transport assays, is not sufficient for the prediction of their immune relevance. We therefore believe that it is mandatory to first prove the cell surface presentation of algorithm-predicted candidate neo-splicetopes, either in humanized mice under conditions requiring processing and presentation or by peptide elution experiments, before TCR generation. Our data also support the notion (Mylonas et al., 2018) that the frequency of spliced epitopes is largely overestimated.

Materials and methods
Peptides, proteasome, and PCPS
Biochemistry (Dr. Petra Henklein) using standard Fmoc (N-(9-fluorenyl) methoxycarbonyl) methodology (0.1 mmol) on an Applied Biosystems 433A automated synthesizer. The peptide was purified by HPLC and analyzed by mass spectrometry (ABI Voyager DE PRO). The KRAS G12V 1-21 (MDC 27) (MTEYKLVVVGAVGVGKSALTI) polypeptide substrate was obtained from JPT Peptide Technologies (Berlin, Germany). 20S proteasomes were purified from human red blood cells, LcL or T2.7 cells, in principle following the procedure previously described (Textoris-Taube et al., 2019). The proteasome composition of LcL and T2.7 cells, which express immunoproteasomes, may however vary depending on batch and culture conditions. For better comparison, only the results obtained with erythrocyte 20S proteasomes were therefore used for the kinetic experiments. Proteasome digests of the synthetic RAC2 P29L and KRAS G12V polypeptides were performed in 100 µl of TEAD buffer (20 mM Tris, 1 mM EDTA, 1 mM NaN3, 1 mM DTT, pH 7.2) over time at 37 °C. For establishing a cleavage map for RAC2 P29L 20-44 , processing times were 24 hr and 48 hr. The RAC2 P29L 20-44 and KRAS G12V 1-21 synthetic polypeptides were each digested at a concentration of 60 µM by 8 µg 20S proteasome. Proteasomal processing of the synthetic KRAS G12V 2-21 and KRAS G12V 2-14 polypeptides was performed at a substrate concentration of 40 µM or 60 µM and in the presence of 4 µg or 8 µg 20S proteasome, respectively.
10 µl of digested sample was loaded for 5 min onto a trap column (PepMap C18, 5 µm × 300 µm × 5 mm, 100 Å, Thermo Fisher Scientific) with 2:98 (v/v) acetonitrile/water containing 0.1% (v/v) trifluoroacetic acid (TFA) at a flow rate of 20 µl/min and analyzed by nanoscale LC-MS/MS using an Ultimate 3000 and LTQ Orbitrap XL mass spectrometer (Thermo Fisher Scientific). The system comprised a 75 µm i.d. × 250 mm nano LC column (Acclaim PepMap C18, 2 µm; 100 Å; Thermo Fisher Scientific) or a 200 mm PicoFrit analytical column (PepMap C18, 3 µm, 100 Å, 75 µm; New Objective). The mobile phase (A) was 0.1% (v/v) formic acid in water and (B) was 80:20 (v/v) acetonitrile/water containing 0.1% (v/v) formic acid. For elution, a gradient of 3-45% B in 85 min with a flow rate of 300 nl/min was used. Full MS spectra (m/z 300-1800) were acquired on an Orbitrap instrument at a resolution of 60,000 (FWHM). First, the most abundant precursor ion was selected for data-dependent collision-induced dissociation (CID) fragmentation with parent list (1+ and 2+ charge states included). Fragment ions were detected in an Ion Trap instrument. Dynamic exclusion was enabled with a repeat count of 2 and 60 s exclusion duration. Additionally, the theoretically calculated precursor ions of the expected spliced peptides were preselected for two Orbitrap CID (resolution 7500) and higher energy collisional dissociation (HCD) (resolution 15,000) fragmentation scans. The maximum ion accumulation time was set to 200 ms for MS scans and to 500 ms for MS/MS scans. Background ions at m/z 371.1000 and 445.1200 served as lock masses.
For LC-MS/MS runs using a Q Exactive Plus mass spectrometer coupled with an Ultimate 3000 RSLCnano (Thermo Fisher Scientific), samples were trapped as described above and then analyzed by the system, which comprised a 250 mm nano LC column (Acclaim PepMap C18, 2 µm; 100 Å; 75 µm; Thermo Fisher Scientific). A gradient of 3-40% B (alternatively 3-45% B) in 85 min was used for elution. The mobile phase (A) was 0.1% (v/v) formic acid in water and (B) 80% acetonitrile in water containing 0.1% (v/v) formic acid. The Q Exactive Plus instrument was operated in the data-dependent mode to automatically switch between full-scan MS and MS/MS acquisition. Full MS spectra (m/z 200-2000) were acquired at a resolution of 70,000 (FWHM) followed by HCD MS/MS fragmentation of the top 10 precursor ions (resolution 17,500; 1+, 2+, and 3+ charge states included; isolation window of 1.6 m/z; normalized collision energy of 27%). The maximum ion injection time was set to 50 ms for MS scans (automatic gain control (AGC) target value of 1 × 10⁶ ions) and to 100 ms for MS/MS scans (AGC target value of 5 × 10⁴); dynamic exclusion was set to 20 s. Background ions at m/z 391.2843 and 445.1200 served as lock masses.
Peptides were identified by PD2.1 software (Thermo Fisher Scientific) based on their merged tandem mass spectra (MS/MS) of CID and HCD. For peptide identification, precursor mass tolerances were set to 10 ppm (LTQ Orbitrap XL) or 6 ppm (Q Exactive Plus). Fragment ion tolerances were 0.6 Da (Ion Trap detection) or 0.06 Da (Orbitrap detection) for the LTQ Orbitrap XL, and 0.02 Da for the Q Exactive Plus.
In addition, for spliced peptides we compared the retention time and the merged MS/MS of CID and HCD with the fragmentation pattern of their synthetic counterparts. To identify spliced peptides, a FASTA file was generated with ProtAG for the KRAS G12V and RAC2 P29L polypeptide substrates and loaded into PD2.1. The kinetics were analyzed with LC Quan 2.7. HLA-A*02:01 binding affinity of putative spliced epitopes was calculated by the netMHCpan 4.0 algorithm (Jurtz et al., 2017).
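The precursor mass tolerances quoted above can be illustrated with a minimal Python sketch. It is not part of PD2.1 or ProtAG; the function names are hypothetical, the residue masses are standard monoisotopic values rounded to five decimals, and the example sequence is simply the nonamer obtained by joining residues 5-8 and 10-14 of the KRAS G12V 1-21 substrate sequence given above.

```python
# Monoisotopic residue masses in Da (standard values, rounded to 5 decimals).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # added once per peptide (free N- and C-terminus)
PROTON = 1.00728   # added once per positive charge


def precursor_mz(sequence: str, charge: int = 1) -> float:
    """Theoretical m/z of an unmodified, protonated peptide (hypothetical helper)."""
    neutral = sum(RESIDUE_MASS[aa] for aa in sequence) + WATER
    return (neutral + charge * PROTON) / charge


def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Signed deviation of the observed from the theoretical m/z in ppm."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(observed_mz: float, theoretical_mz: float, tol_ppm: float) -> bool:
    """True if the observed precursor lies inside the +/- tol_ppm window."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm


if __name__ == "__main__":
    theo = precursor_mz("KLVVGAVGV", charge=1)
    observed = theo * (1 + 8e-6)  # hypothetical reading, 8 ppm too high
    print(within_tolerance(observed, theo, 10.0))  # 10 ppm window
    print(within_tolerance(observed, theo, 6.0))   # 6 ppm window
```

In this made-up case the same reading would be accepted under the 10 ppm window but rejected under the 6 ppm window, which shows how the instrument-specific tolerance affects candidate assignment.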

TCR gene transfer
TCR gene transfer was carried out as described before (Niedermann et al., 1995). In brief, the packaging cell lines HEK-293-GALV (amphotropic) or Plat-E (ecotropic) were grown to approximately 80% confluence and transfected with pMP71 vector carrying the TCR cassette using Lipofectamine2000 (Life Technologies), and retrovirus-containing supernatant was harvested 48 hr and 72 hr after transfection.
Human PBMCs were isolated from healthy donors by Ficoll gradient centrifugation. 1 × 10⁶ freshly isolated or frozen hPBMCs were stimulated on plates coated with 5 µg/ml anti-CD3 (OKT3) and 1 µg/ml anti-CD28 (CD28.2) (BioLegend) in the presence of 300 U/ml recombinant human interleukin 2 (hIL-2, Peprotech). Transductions were performed 48 hr and 72 hr after stimulation by addition of retrovirus-containing supernatant and 4 µg/ml protamine sulfate followed by spinoculation. Transduced T cells were kept in the presence of 300 U/ml hIL-2 for a total of 2 weeks followed by at least 2 days of culture in the presence of 30 U/ml hIL-2 before they were used for experiments.

Functional assays
IFNγ production was measured by ELISA after 16 hr co-culture of 1 × 10⁴ TCR-positive T cells with 1 × 10⁴ target cells (human/mouse tumor cell lines or peptide-loaded T2 cells). Stimulation with phorbol myristate acetate (PMA) and ionomycin was used as a positive control. All samples were measured in duplicate.

ProtAG prediction algorithm
The ProtAG prediction algorithm was used in combination with mass spectrometry to identify peptides and spliced peptides derived from an oligomeric protein substrate. The peaks of the MS intensity profile were approximated by Gaussian functions. Goodness of fit was used as one criterion for assessing the reliability of a mass peak. Only peaks above a user-defined noise threshold were compiled together with their HPLC retention times. The likelihood for correctly assigning a peptide to an MS peak was scored by the correspondence between computational and experimental values for peptide mass, occurrence of expected m/z values, similarity of retention times, and MS/MS data. Chemically modified peptides (e.g., by oxidation) were identified by adding to the theoretical mass the masses of possible modifiers. Such modified peptides were included in the list of identified peptides only if the non-modified peptide could also be identified. After assigning the MS peaks to all direct or chemically modified fragments that theoretically can be derived from the protein substrate by one or multiple cleavages, a group of significant but 'unexplained' MS peaks remains, which may represent possible spliced products, that is, peptides composed of fragments distant in the parental protein substrate. The likelihood for correctly assigning a splice peptide to unexplained MS peaks was computed in the same way as for conventional peptides, with the additional criterion that the two fragments merged together in the presumed splice peptide were also present in the set of identified conventional peptides.
Proteasomal cleavage products (PCP) of a substrate peptide can be described unambiguously by the positions of their first and last amino acid within the substrate: P(i,j) denotes the peptide of length j - i + 1 starting with amino acid i and ending with amino acid j. Proteasomal splice products (PSP) consist of two such fragments and are therefore denoted SP(i,j,k,l), consisting of the peptides P(i,j) and P(k,l) and having a length of (j + l - i - k + 2). They can be in normal order (i ≤ j < k ≤ l), inverse order (k ≤ l < i ≤ j), or overlapping. An overlapping splice product is one in which at least one position m is part of both P(i,j) and P(k,l), that is, max(i,k) <= min(j,l); such a splice product must therefore consist of parts of two substrate peptides.
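The coordinate notation can be sketched in a few lines of Python. This is an illustration only, not ProtAG code: the helper names are hypothetical, positions are 1-indexed and inclusive as in the text, and the overlap test is taken to be the usual interval-intersection condition max(i,k) <= min(j,l). The substrate is the KRAS G12V 1-21 sequence given in the Methods.

```python
KRAS_G12V_1_21 = "MTEYKLVVVGAVGVGKSALTI"  # substrate sequence from the Methods


def fragment(substrate: str, i: int, j: int) -> str:
    """P(i,j): residues i..j of the substrate, 1-indexed and inclusive."""
    if not 1 <= i <= j <= len(substrate):
        raise ValueError(f"invalid fragment P({i},{j})")
    return substrate[i - 1:j]


def splice_product(substrate: str, i: int, j: int, k: int, l: int) -> str:
    """SP(i,j,k,l): P(i,j) ligated to P(k,l)."""
    return fragment(substrate, i, j) + fragment(substrate, k, l)


def splice_length(i: int, j: int, k: int, l: int) -> int:
    """Length of SP(i,j,k,l), i.e. (j - i + 1) + (l - k + 1)."""
    return j + l - i - k + 2


def splice_order(i: int, j: int, k: int, l: int) -> str:
    """Classify SP(i,j,k,l) as 'normal', 'inverse', or 'overlapping'."""
    if max(i, k) <= min(j, l):   # the two intervals share a position
        return "overlapping"
    return "normal" if j < k else "inverse"


if __name__ == "__main__":
    print(splice_product(KRAS_G12V_1_21, 5, 8, 10, 14))  # KLVVGAVGV
    print(splice_length(5, 8, 10, 14))                   # 9
    print(splice_order(5, 8, 10, 14))                    # normal
```

For an overlapping product, the concatenation in `splice_product` still applies, but the two parts would have to originate from two copies of the substrate.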
Splice peptides in normal order with j + 1 = k are identical to the original PCP P(i,l) and should therefore be excluded. Splice peptides can also have the same sequence as PCPs, for example, if the sequence of the substrate is redundant or if one of the parts is short; such peptides can be excluded from the database. Different splice peptides can likewise share the same sequence: SP(i,j,k,l) and SP(i,j + 1,k + 1,l) have the same sequence if the amino acids in positions j + 1 and k are the same. Nevertheless, both versions should be kept within the database because, if the original peptides P(i,j) and P(k,l) are found among the cleavage products but P(i,j + 1) is not, the first version of the PSP is more likely.
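The sequence-equivalence rule and the preference for the better-supported version can be sketched as follows. This is a hedged illustration, not the ProtAG implementation: the helper names are hypothetical, and the set of "observed" cleavage products is invented for the example. In the KRAS G12V 1-21 substrate, positions 8 and 9 are both valine, so SP(5,7,9,14) and SP(5,8,10,14) yield the same nonamer.

```python
from typing import Dict, List, Set, Tuple

Coords = Tuple[int, int, int, int]   # (i, j, k, l), 1-indexed and inclusive

KRAS_G12V_1_21 = "MTEYKLVVVGAVGVGKSALTI"  # substrate sequence from the Methods


def splice_product(substrate: str, i: int, j: int, k: int, l: int) -> str:
    """SP(i,j,k,l): P(i,j) ligated to P(k,l)."""
    return substrate[i - 1:j] + substrate[k - 1:l]


def group_by_sequence(substrate: str, candidates: List[Coords]) -> Dict[str, List[Coords]]:
    """Collect all coordinate versions that produce the same sequence."""
    groups: Dict[str, List[Coords]] = {}
    for coords in candidates:
        groups.setdefault(splice_product(substrate, *coords), []).append(coords)
    return groups


def support(coords: Coords, observed_pcps: Set[Tuple[int, int]]) -> int:
    """Number of the two parent fragments (0, 1, or 2) found among the PCPs."""
    i, j, k, l = coords
    return int((i, j) in observed_pcps) + int((k, l) in observed_pcps)


def rank_versions(candidates: List[Coords], observed_pcps: Set[Tuple[int, int]]) -> List[Coords]:
    """Order equivalent versions so the best-supported one comes first."""
    return sorted(candidates, key=lambda c: support(c, observed_pcps), reverse=True)


if __name__ == "__main__":
    versions = [(5, 7, 9, 14), (5, 8, 10, 14)]
    print(group_by_sequence(KRAS_G12V_1_21, versions))
    observed = {(5, 8), (10, 14)}          # hypothetical cleavage products
    print(rank_versions(versions, observed)[0])
```

Keeping both coordinate versions in the database, as the text recommends, is what makes this kind of after-the-fact ranking possible.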
When searching for splice products that are at the same time epitopes for MHC class I or MHC class II, the length of the predicted splice products should be limited. Therefore, corresponding length limits are included in the algorithm. In addition, the database of spliced products can become very large if all possible splice products of a large substrate are evaluated without limits. A substrate of length 100 has about 25 million possible splice products, which results in a database of nearly 3 GB; the size of the database can thus be predicted without evaluating it, and an impractically large evaluation can be avoided.
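The figure of about 25 million can be reproduced with a short counting sketch, under the assumption that "all possible splice products" means every ordered pair of fragments P(i,j) and P(k,l), including contiguous and overlapping pairs. The function names are hypothetical.

```python
def count_fragments(n: int) -> int:
    """Number of fragments P(i,j) with 1 <= i <= j <= n."""
    return n * (n + 1) // 2


def count_splice_products(n: int) -> int:
    """Number of ordered fragment pairs SP(i,j,k,l) without any limits."""
    return count_fragments(n) ** 2


def total_residues(n: int) -> int:
    """Total number of amino acid characters over all unrestricted pairs.

    The summed length of all fragments is n(n+1)(n+2)/6; every fragment
    appears once as first part and once as second part per partner.
    """
    summed_fragment_length = n * (n + 1) * (n + 2) // 6
    return 2 * count_fragments(n) * summed_fragment_length


if __name__ == "__main__":
    n = 100
    print(count_fragments(n))        # 5050
    print(count_splice_products(n))  # 25502500
    print(total_residues(n))         # 1734170000
```

Under this assumption the sequence characters alone amount to roughly 1.7 GB for a 100-residue substrate; header lines and line breaks of a FASTA file add to that, which is in line with the order of magnitude stated above.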
The ProtAG program evaluates a database of splice products in FASTA format according to the following parameters:
- Sequence of the substrate of length L(sub).
- Minimal/maximal length of the parts of the splice peptides (MinP, MaxP).
- Minimal/maximal length of the gap between the parts of the splice peptide (only used if you evaluate splice peptides in normal or inverse order) (MinG, MaxG).
- Minimal/maximal length of the whole splice peptide (MinS, MaxS).
- Do you want to include PCPs into the database (recommended)?
- Do you want to exclude PSPs with sequences identical to PCPs (recommended)?
- Do you want to evaluate only PSPs in normal order, or PSPs in normal or inverse order (coming from the same substrate), or all PSPs including PSPs from different substrates?
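How such parameters could restrict the database can be sketched with a brute-force enumerator. This is a minimal sketch under stated assumptions and not the ProtAG program: function names, argument names, and the FASTA header format are hypothetical, only normal and inverse order from a single substrate are covered, and part lengths are assumed to be at least 1.

```python
from typing import Iterator, Tuple

Coords = Tuple[int, int, int, int]   # (i, j, k, l), 1-indexed and inclusive


def enumerate_pcps(substrate: str, min_s: int, max_s: int) -> Iterator[Tuple[Tuple[int, int], str]]:
    """Yield all cleavage products P(i,j) within the overall length limits."""
    n = len(substrate)
    for i in range(1, n + 1):
        for j in range(i + min_s - 1, min(n, i + max_s - 1) + 1):
            yield (i, j), substrate[i - 1:j]


def enumerate_psps(substrate: str, min_p: int, max_p: int, min_g: int, max_g: int,
                   min_s: int, max_s: int, allow_inverse: bool = False,
                   exclude_pcp_identical: bool = True) -> Iterator[Tuple[Coords, str]]:
    """Yield SP(i,j,k,l) in normal (and optionally inverse) order."""
    n = len(substrate)
    for i in range(1, n + 1):
        for j in range(i + min_p - 1, min(n, i + max_p - 1) + 1):
            for k in range(1, n + 1):
                for l in range(k + min_p - 1, min(n, k + max_p - 1) + 1):
                    if j < k:                      # normal order
                        gap = k - j - 1
                        if gap == 0:               # contiguous: equals P(i,l)
                            continue
                    elif l < i:                    # inverse order
                        if not allow_inverse:
                            continue
                        gap = i - l - 1
                    else:                          # overlapping: not covered
                        continue
                    if not min_g <= gap <= max_g:
                        continue
                    if not min_s <= (j - i + 1) + (l - k + 1) <= max_s:
                        continue
                    seq = substrate[i - 1:j] + substrate[k - 1:l]
                    if exclude_pcp_identical and seq in substrate:
                        continue                   # same sequence as a PCP
                    yield (i, j, k, l), seq


def build_fasta(substrate: str, *, include_pcps: bool = True, **limits) -> str:
    """Assemble the records into FASTA text (header format is an assumption)."""
    records = []
    if include_pcps:
        for (i, j), seq in enumerate_pcps(substrate, limits["min_s"], limits["max_s"]):
            records.append(f">P_{i}_{j}\n{seq}")
    for (i, j, k, l), seq in enumerate_psps(substrate, **limits):
        records.append(f">SP_{i}_{j}_{k}_{l}\n{seq}")
    return "\n".join(records) + "\n"


if __name__ == "__main__":
    kras = "MTEYKLVVVGAVGVGKSALTI"  # KRAS G12V 1-21 substrate from the Methods
    nonamers = list(enumerate_psps(kras, min_p=2, max_p=7, min_g=1, max_g=5,
                                   min_s=9, max_s=9))
    print(len(nonamers), "normal-order nonamer candidates")
```

The four nested loops are deliberately naive; for realistic substrates the limits on part, gap, and total length are what keep the enumeration tractable.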