Role of ultraviolet mutational signature versus tumor mutation burden in predicting response to immunotherapy

Hydrophobic neoantigens are more immunogenic because they are better presented by the major histocompatibility complex and better recognized by T cells. Tumor cells can evade the immune response by expressing checkpoints such as programmed death ligand 1. Checkpoint blockade reactivates immune recognition and can be effective in diseases such as melanoma, which harbors a high tumor mutational burden (TMB). Cancers presenting low or intermediate TMB can also respond to checkpoint blockade, albeit less frequently, suggesting the need for biological markers predicting response. We calculated the hydrophobicity of neopeptides produced by probabilistic in silico simulation of the genomic UV exposure mutational signature. We also computed the hydrophobicity of potential neopeptides and extent of UV exposure based on the UV mutational signature enrichment (UVMSE) score in The Cancer Genome Atlas (TCGA; N = 3543 tumors), and in our cohort of 151 immunotherapy-treated patients. In silico simulation showed that UV exposure significantly increased hydrophobicity of neopeptides, especially over multiple mutagenic cycles. There was also a strong correlation (R² = 0.953) between weighted UVMSE and hydrophobicity of neopeptides in TCGA melanoma patients. Importantly, UVMSE was able to predict better response (P = 0.0026), progression-free survival (P = 0.036), and overall survival (P = 0.052) after immunotherapy in patients with low/intermediate TMB, but not in patients with high TMB. We show that higher UVMSE scores could be a useful predictor of better immunotherapy outcome, especially in patients with low/intermediate TMB, likely due to increased hydrophobicity (and hence immunogenicity) of neopeptides.

Introduction
Cells in the human body naturally present antigens, which are short peptide fragments derived from intracellular and extracellular sources, on their surfaces in major histocompatibility complex (MHC) proteins. Intracellular antigens are usually presented by MHC class I molecules to effector T cells [1] to help the immune system recognize whether the cell is healthy and whether it belongs to the host ('self'), whereas extracellular antigens are often displayed in MHC II moieties on professional antigen-presenting cells like dendritic cells, B cells, and monocytes [2]. Healthy cells displaying valid 'self' antigens are not recognized by effector T cells due to negative selection in the thymus, although T reg cells, which are meant to suppress the immune system, do recognize these cells [2]. In contrast, cancer cells should be recognized and attacked by cytotoxic T cells because malignant cells harbor mutations that manifest as altered peptide neoantigens marking them as nonself. Antigens presented in MHC II could also be important in activating CD4+ 'helper' T cells, which are involved in a variety of antitumor responses [2]. Logically, more mutations should result in a greater probability of presenting immunogenic neoantigens on the cell surface [3].
To survive and evade the immune response, highly mutated tumor cells use several evasion techniques. One is downregulating MHC I expression; however, because the presence of MHC I inhibits natural killer cell activity [2,5], natural killer cells are more likely to target such cells unless a separate evasion mechanism, such as the shedding seen in some prostate cancers [4], is also involved. Another is expressing immune checkpoint surface proteins, such as programmed cell death 1 (PD-1) ligands [6], to dull the adaptive response that their foreign antigens trigger. PD-1 ligands are induced by interferon gamma found in the proinflammatory tumor microenvironment [7] and cause CD8+ T cells to become anergic, even if they recognize the foreign antigens present on the tumor. Checkpoint blockade immunotherapies (e.g., anti-PD-1/PD-L1 antibodies) counter this effect by obstructing the PD-1/PD-L1 interaction. These are more effective in cancers characterized by a high tumor mutational burden (TMB) such as melanoma [8] and the high TMB subset of patients in other cancers [8][9][10][11][12][13][14]. Other factors that correlate, albeit imperfectly, with a propensity to PD-1/PD-L1 inhibitor responsiveness include PD-L1 overexpression [8,12,13] and Apolipoprotein B mRNA Editing Enzyme, Catalytic Polypeptide-like (APOBEC) mutational activity [15].
Most mutations in melanomas are caused by exposure to ultraviolet (UV) light [16] through the action of free radicals formed by high-energy UV rays disrupting covalent double bonds in pyrimidine DNA bases [16]. There are three forms of UV radiation, categorized by wavelength, and therefore energy level: UVA (340-400 nm), UVB (280-320 nm), and UVC (200-280 nm). Free radical oxidation reactions due to UVB and UVC light cause dipyrimidine dimers to form, most of which should be repaired by nucleotide excision repair. However, some residual dipyrimidine mutations remain uncorrected, leading to increased brittleness in the DNA helix and improper replication and transcription. Dipyrimidines composed of linked cytosines are usually mispaired with two adenines during DNA replication, resulting in the characteristic CC→TT mutations commonly associated with UV light [17]. Lower energy UVA radiation tends to cause G→T mutations by free radicals oxidizing guanine, creating a new 7,8-dihydro-8-oxoguanine species that can pair with adenine, which then causes the guanine's replacement with a thymine in a succeeding DNA replication cycle [16].
In this paper, we describe biochemical (hydrophobicity) and clinical outcomes as related to UV-induced hypermutation. We show that neoantigens produced from a UV-mutated genome tend to be more hydrophobic and therefore are likely to be more immunogenic because they are better presented by MHC and they are more easily recognized by T cells [18][19][20]. We also show a positive correlation between response to immunotherapy and level of UV mutations in 151 patients seen at the University of California San Diego Moores Cancer Center. These data suggest that the shift toward hydrophobicity induced by UV mutations is likely to underlie the enhanced responsiveness to immunotherapy.

In silico UV mutagenesis
We generated all possible 6-nucleotide stretches (representing two codons) and applied the UV mutational signature, as previously described [21], onto them. The 6-nucleotide length for each stretch was used to allow us to consider the effect of mutations occurring in all possible reading frames in a codon. This is because the UV mutation signature is defined using the context of the two nucleotides flanking the substitution site since mutations arising from UV exposure often involve reactions between neighboring bases [22], therefore necessitating the presence of at least five nucleotides per stretch to allow the signature to be applied to every possible reading frame of the codon. The 4096 6-nucleotide stretches, before and after mutation, were then virtually transcribed into their corresponding amino acids whose total hydrophobicity was calculated with the Kyte-Doolittle hydrophobicity scale [23] (with and without the reciprocal strand). The hydrophobicity of the dipeptides was multiplied before and after mutagenesis by the probability of observing the codons corresponding to the dipeptide on the human coding genome (derived from the Kazusa's codon usage database [24]) and the probability of UV mutagenesis on the stretch [21]. The change in hydrophobicity due to in silico mutagenesis was compared using the Wilcoxon signed-rank test.
For example, the 6-mer TCCGAG, encoding the dipeptide Ser-Glu, has a Kyte-Doolittle hydrophobicity index of (−0.8) + (−3.5) = (−4.3) arbitrary units (AU) [23]. This 6-mer can be mutated with the most frequent UV mutagenesis pattern, TCC>TTC at the second position, to yield the sequence TTCGAG (Phe-Glu), which has a hydrophobicity index of (+2.8) + (−3.5) = (−0.7) AU. The substitution alone results in an increase in hydrophobicity of +3.6 AU, but this value must be further weighted by the probability of the mutation occurring in the specific 6-nucleotide stretch. The mutation occurrence probability is calculated as the joint probability of encountering the original TCCGAG sequence in the genome, 0.00070092, and the probability of the TCC>TTC mutation occurring based on signature 7 from [21], 0.2887, yielding a joint mutation probability for this specific case of 0.00070092 × 0.2887 = 2.024 × 10⁻⁴.
The relative change in hydrophobicity of this substitution is therefore 2.024 × 10⁻⁴ × (+3.6) ≈ +7.28 × 10⁻⁴. Analogous calculations were performed for all possible nucleotide substitutions (three unique nucleotides per position) at all definable, mutable positions (2nd, 3rd, 4th, and 5th positions) in all 6-nucleotide stretches (n = 4096), resulting in a total of 49 152 possible singly mutated stretches, whose relative hydrophobicity changes were summed together to estimate the genome-wide hydrophobicity change for each round of mutagenesis. Each succeeding cycle of mutagenesis starts with the preceding 6-nucleotide stretches, and the joint probability of the two codons was modified based on the mutations applied in the previous cycle. Each iteration of mutagenesis described above corresponds to a single iteration of UV-mediated mutagenesis, equivalent to one arbitrary unit (AU) dose of UV exposure. Multiple iterations of mutagenesis in this method are intended to correspond to increasing doses of UV light. We repeatedly simulated UV mutagenesis for up to 100 iterations to investigate and model the effects of long-term UV exposure on antigen hydrophobicity.
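The worked TCCGAG example can be reproduced with a short Python sketch. The function names (`hydrophobicity`, `weighted_change`) are hypothetical and are not taken from the authors' software; the codon table is the standard genetic code, the scale values are the published Kyte-Doolittle indices, and scoring stop codons as 0 is an assumption made only so that every 6-mer has a defined value.

```python
from itertools import product

# Standard genetic code, built from the compact TCAG-ordered string.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Kyte-Doolittle hydropathy indices (one-letter amino acid codes).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobicity(six_mer, stop_value=0.0):
    """Sum the hydropathy of the two codons in a 6-nucleotide stretch.

    Stop codons ('*') contribute stop_value (an assumption of this sketch).
    """
    codons = (six_mer[0:3], six_mer[3:6])
    return sum(KYTE_DOOLITTLE.get(CODON_TABLE[c], stop_value) for c in codons)


def weighted_change(before, after, p_sequence, p_mutation):
    """Hydrophobicity change weighted by the joint mutation probability."""
    delta = hydrophobicity(after) - hydrophobicity(before)
    return p_sequence * p_mutation * delta


# Values quoted in the text: sequence frequency and signature 7 probability.
change = weighted_change("TCCGAG", "TTCGAG", 0.00070092, 0.2887)
print(f"{change:.3e}")
```

Summing `weighted_change` over every singly mutated stretch would give the per-cycle estimate described above, provided the full codon-pair frequencies and signature probabilities are supplied.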

Analysis of UV mutational signature in TCGA repository pan-cancer tumor samples
Molecular profiles, obtained by next-generation sequencing (NGS) of human tumors and consisting of mutations (such as substitutions or small insertions/deletions) and mRNA expression data, were downloaded from the community resource project The Cancer Genome Atlas (TCGA), using the Broad GDAC Firehose website (https://gdac.broadinstitute.org; standardized data run release 2016_01_28). All samples were published and available without usage restrictions as of January 14, 2019. All TCGA data used in this study respected the TCGA Human Subjects Protection and Data Access Policies (https://cancergenome.nih.gov/abouttcga/policies/tcga-human-subjects-data-policies). A second set of mutation data, covering acral melanomas [25], was downloaded from cBioPortal (http://www.cbioportal.org/study?id=mel_tsam_liang_2017). The acral melanoma data were collected in accordance with the protocol approved by Vanderbilt University and Memorial Sloan-Kettering Cancer Center Institutional Review Boards, as detailed in Liang et al. [25]. The mutation annotation files from both data sources were then filtered by genomic coordinates corresponding to the exon regions sequenced by Foundation Medicine.

UV mutational signature enrichment estimation for TCGA samples
An estimation of the enrichment of mutations due to UV exposure was performed using our implementation of a signature estimator (publicly available software tool at https://github.com/UCSD-CCAL/Mutational-Signature-Enrichment-Calculator). The results were given as a numerical score (UV mutational signature enrichment, UVMSE) representing the enrichment of mutations likely to be caused by UV exposure. In TCGA tumors, for each sample the total number of mutations was multiplied by its UVMSE score to quantify the extent of UV-induced mutagenesis in each sample when correlating it with each sample's overall neopeptide hydrophobicity.

Hydrophobicity analysis
From 9166 samples in the TCGA database (33 distinct tumor types), we selected 3543 tumors without (a) POLE and POLD1 mutations, (b) mismatch repair gene loss, underexpression, or mutations, and (c) microsatellite instability-high alterations, because these alterations are already known to influence immunotherapy response [12,26,27]. Using the mutation description available for these tumors, we then performed two types of analysis: (a) in the first analysis, for each tumor, the differences in total hydrophobicity (i.e., the sum of the hydrophobicity of all amino acids) of each transcript's full-length peptide product (after versus before mutagenesis) were considered; and (b) in the second analysis, for each tumor, mutated transcripts were used to generate all possible 8- to 10-mer neoantigens encompassing a mutation (since MHC I presents 8-10 amino acid peptides); the differences in total hydrophobicity of the neoantigens after versus before mutagenesis were considered. The results of both (a) and (b) above were computed either not weighted by mRNA expression levels or weighted by these levels (in order to take into consideration whether the neoantigens were actually transcribed and their respective levels of expression). For each sample, the weighted and unweighted hydrophobicities were then correlated against the UVMSE-weighted total mutation count.
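The second analysis relies on enumerating every 8- to 10-mer that spans a mutated residue. A minimal Python sketch of that enumeration follows; the function name `windows_covering` and the 0-based `mut_index` argument are hypothetical choices for illustration, and the sketch handles a single substituted residue only.

```python
def windows_covering(peptide, mut_index, lengths=(8, 9, 10)):
    """Return every k-mer (k in lengths) of peptide that contains mut_index.

    mut_index is the 0-based position of the mutated residue.
    """
    windows = []
    for k in lengths:
        # Earliest and latest window starts that still include mut_index
        # and stay inside the peptide.
        first = max(0, mut_index - k + 1)
        last = min(mut_index, len(peptide) - k)
        for start in range(first, last + 1):
            windows.append(peptide[start:start + k])
    return windows


# Toy 30-residue sequence (made up for illustration), mutation at index 15.
toy = "ACDEFGHIKLMNPQRSTVWYACDEFGHIKL"
print(len(windows_covering(toy, 15)))
```

For a mutation far from either end of the protein, each length k contributes k windows, so 8 + 9 + 10 = 27 candidate neopeptides per substitution; fewer are produced near the termini.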

Analysis of UV mutagenesis signature and tumor neoantigen hydrophobicity in patients receiving immunotherapy (PD-1/PD-L1 blockade agents)
We reviewed the electronic medical records of 1638 eligible patients with malignancies at UC San Diego Moores Cancer Center who had undergone hybrid capture-based NGS (Foundation Medicine, Cambridge, MA, USA) starting in October 2012. Only patients having received at least one line of immunotherapy were considered (N = 151). For each case, responses to therapy were assessed based on physician notation, using the Response Evaluation Criteria in Solid Tumors (RECIST). This study was performed in accordance with UCSD Institutional Review Board guidelines for data analysis (NCT02478931) and for any investigational treatments for which patients consented. In addition, the study methodologies conformed to the standards set by the Declaration of Helsinki.
Formalin-fixed paraffin-embedded tumor samples from these patients were submitted for NGS to Foundation Medicine's Clinical Laboratory Improvement Amendments (CLIA)-certified laboratory. The patients' mutations were assessed with the FoundationOne® assay (hybrid capture-based panel exome NGS; panel of up to 315 genes; http://www.foundationone.com/). The methods have been previously described in Frampton et al. [28]. Average sequencing depth of coverage was greater than 250×, with more than 99% of exons covered at greater than 100×. TMB, measured in mutations per megabase (Mb), was calculated by extrapolating the number of somatic mutations detected on NGS to the whole exome with a validated algorithm [26,29]. Alterations likely or known to be bona fide oncogenic drivers and germline polymorphisms were excluded. TMB levels were divided into three groups: low (1-5 mutations/Mb), intermediate (6-19 mutations/Mb), and high (≥ 20 mutations/Mb), which stratified roughly 50% of patients to low TMB, 40% to intermediate TMB, and 10% to high TMB in our cohort [30].
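The three TMB groups can be expressed as a small Python helper. This is only a sketch: the function name `tmb_group` is hypothetical, and assigning every value below 6 mutations/Mb (including 0 and any non-integer value between 5 and 6) to the low group is an assumption, since the text lists the low range as 1-5 without covering those cases.

```python
def tmb_group(mutations_per_mb):
    """Classify tumor mutational burden using the cutoffs stated in the text."""
    if mutations_per_mb >= 20:
        return "high"
    if mutations_per_mb >= 6:
        return "intermediate"
    # Assumption: everything below 6 mutations/Mb is treated as low.
    return "low"


for value in (3, 6, 19, 20):
    print(value, tmb_group(value))
```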

UV mutational signature enrichment (UVMSE) for patient samples
A UVMSE score was computed for each of the 151 patients from their NGS data on the Foundation Medicine gene panel, using the same MSE software tool that was used to calculate the UVMSE scores of the TCGA data. Demographic data for the patients were previously provided [8].
Variant call format (.VCF) files for the 151 Moores Cancer Center patients were generated by processing the Binary Sequence Alignment/Map format (.BAM) files, obtained from Foundation Medicine Inc. (www.foundationmedicine.com/) NGS, with the FreeBayes variant detection algorithm, and excluding low-quality variants (QUAL score of < 50 or read depth of < 100). The VCF files were then filtered by genomic coordinates corresponding to the exonic regions that Foundation Medicine sequences for its commercially available report. We then defined a single-strand, DNA-specific UV signature, signature 7 from [21]. Substitutions in the reverse complement were treated the same as those in the forward strand for counting purposes. The UVMSE score quantifies how frequently mutations described in the defined UV signature 7 occur at a specific sequence context compared to analogous single nucleotide substitutions in other contexts. It is our adaptation of the quantification method described in Roberts et al. [31]. The mutation TCC→TTC, the most frequent UV-induced mutation according to Alexandrov et al. [21], can be used to illustrate our method. Its enrichment is calculated from four quantities. Mut_TCC→TTC is the number of TCC→TTC and GGA→GAA reverse complement mutations counted in a 41-nucleotide stretch of the human genome (GRCh37.75) centered around a detected single nucleotide substitution from the VCF file. Con_TCC→TTC is the number of potentially mutable TCC contexts found in the reference genome copy of the stretch. Mut_C is the number of C→T and G→A reverse complement mutations found in the stretch, and Con_C is the number of C and G nucleotides found in said stretch. This enrichment value was then weighted by the probability of the mutation occurring, according to the signature [21].
The weighted enrichment value for each of the 192 described mutations in the signature was then summed together to yield the UVMSE score for a particular sample.
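The enrichment formula itself is not reproduced in the text above. Given the four quantities defined there and the stated adaptation of Roberts et al. [31], one plausible form, offered as an assumption rather than the authors' exact expression, is the ratio below, where $p_m$ denotes the signature 7 probability of mutation $m$ [21] and $E_m$ its enrichment (both symbols are introduced here for illustration):

```latex
E_{TCC \to TTC} \;=\;
  \frac{\mathrm{Mut}_{TCC \to TTC} \times \mathrm{Con}_{C}}
       {\mathrm{Mut}_{C} \times \mathrm{Con}_{TCC \to TTC}},
\qquad
\mathrm{UVMSE} \;=\; \sum_{m=1}^{192} p_m \, E_m
```

Under this reading, a value of $E_m$ above 1 means that the mutation occurs in its specific context more often than C→T substitutions occur at cytosines in general within the same 41-nucleotide stretch.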

Clinical outcome analysis of patients receiving PD-1/PD-L1 blockade agents
Patients were divided into two groups of interest: (a) patients presenting a tumor with a high load of UV mutations ('UV high'); and (b) patients presenting a tumor with a low load of UV mutations ('UV low'). The optimal threshold (0.7917) for the UVMSE estimate was selected using the receiver operating characteristic (ROC) curve method (with UV high being ≥ 0.7917) by evaluating the performance of the UVMSE score in discriminating patient outcomes. (It should be noted that the UV-high cutoff differed between the patient set and the TCGA set, probably because the tumor type distribution differed; for instance, 88 of 3543 (2.5%) TCGA tumors were melanoma, while 52 of the 151 treated patients (34%) had melanoma. Furthermore, the UV high designation was not used to determine outcome in the TCGA dataset.) Patients were then grouped by best response: complete or partial responses (CR/PR) were considered favorable outcomes, whereas stable or progressive disease (SD/PD) was considered a poor outcome. Best response observed, progression-free survival (PFS) and overall survival (OS) in months, TMB, and patient demographics were compared between patients presenting a high UVMSE score (≥ 0.7917) versus patients presenting a low UVMSE score.
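The text does not state which ROC criterion defined the optimal cutoff. The Python sketch below assumes Youden's J statistic (sensitivity + specificity − 1), a common choice, and uses made-up scores and outcomes purely for illustration; the function name `youden_threshold` is hypothetical and the data are not from the study cohort.

```python
def youden_threshold(scores, responders):
    """Return (threshold, J) maximizing sensitivity + specificity - 1.

    A sample is called 'high' when score >= threshold; responders holds
    1 for CR/PR and 0 for SD/PD.
    """
    n_pos = sum(responders)
    n_neg = len(responders) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one responder and one non-responder")
    best_threshold, best_j = None, -1.0
    for t in sorted(set(scores)):
        true_pos = sum(1 for s, r in zip(scores, responders) if r and s >= t)
        true_neg = sum(1 for s, r in zip(scores, responders) if not r and s < t)
        j = true_pos / n_pos + true_neg / n_neg - 1.0
        if j > best_j:  # ties keep the lowest threshold
            best_threshold, best_j = t, j
    return best_threshold, best_j


# Hypothetical enrichment scores and best responses (illustrative only).
toy_scores = [0.75, 0.76, 0.77, 0.78, 0.79, 0.80, 0.81, 0.82, 0.83, 0.84]
toy_response = [0, 0, 0, 1, 0, 0, 1, 1, 1, 1]
print(youden_threshold(toy_scores, toy_response))
```

For a single dichotomizing cutoff, the area under the resulting two-point ROC curve equals (sensitivity + specificity) / 2, so maximizing J and maximizing that area select the same threshold.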

Statistical analysis and outcome assessment
The association between the UVMSE score and clinical outcome was analyzed using SAS® University Edition software (Cary, NC, USA; http://support.sas.com/software/products/university-edition/) and GraphPad Prism® version 6.01 (San Diego, CA, USA; http://www.graphpad.com/scientific-software/prism/). Two-tailed tests were used, and a P-value ≤ 0.05 was considered significant.
Statistical significance for the in silico modeling results was assessed using a Wilcoxon signed-rank test (nonparametric paired test) for the comparison of change in total hydrophobicity before and after UV mutagenesis in 6-nucleotide stretches. Change in total hydrophobicity was also calculated in TCGA cohort samples, and those with and without UV mutagenesis were compared; these calculations were performed for the products of full-length transcripts as well as for 8- to 10-mer peptides; the Mann-Whitney U-test (nonparametric unpaired test) was used.
For the clinical study, Fisher's exact test was used to assess for the association between categorical variables and the response to therapy, defined as CR/PR, and stable disease or progressive disease (SD/PD) by RECIST criteria. Patient characteristics were summarized using descriptive statistics. Medians and respective 95% confidence intervals (CIs) and range were calculated, whenever possible. The association of UVMSE and TMB levels with PFS and OS in months (calculated using the Kaplan-Meier method) was assessed using the Mantel-Cox log-rank test. PFS and OS were calculated from the date of starting the immunotherapy. For patients who received multiple immunotherapy regimens, the treatment with the longest PFS was chosen for analysis. Patients were excluded from the survival analysis if they were lost to follow-up before their first restaging. Patients were censored at date of last follow-up for PFS or OS, if they had not progressed or died, respectively. In multivariate analysis, associations between categorical variables were tested using a binary logistic regression model and associations between categorical and continuous values such as survival time were assessed with the Cox's proportional hazards model. Linear variables were tested using the Mann-Whitney U-test for univariate analysis.

Results
Overall, both in silico and TCGA-based analysis demonstrated increased hydrophobicity (which is in turn associated with increased immunogenicity) [18][19][20] after UV mutagenesis, and that a high UV signature predicts longer PFS and OS in low/intermediate TMB tumors, but not high TMB tumors, after checkpoint blockade (Tables 1-2 and Tables S1-S9, Figs 1-4 and Figs S1-S2).

UV mutational signature and increased exome hydrophobicity are associated as determined by in silico mutagenesis
Overall, 4096 six-nucleotide stretches were generated for in silico mutagenesis, each consisting of different combinations of the four canonical DNA nucleotides A, T, C, and G (4⁶ = 4096). The 2nd, 3rd, 4th, or 5th position on each stretch was mutated once, separately, per cycle of mutagenesis, creating 49 152 possible mutated stretches [4096 stretches × 4 mutable positions × 3 possible mutations (nucleotides other than the preexisting one)]. Overall exome hydrophobicity increased proportionally with the number of in silico UV mutagenesis cycles. After one cycle of UV mutagenesis when considering all 4096 possible stretches (P < 0.0001), the median hydrophobicity increased by 1.6 × 10⁻⁷ AU, and the average hydrophobicity increased by 5.8 × 10⁻⁶ AU (Tables 1 and S1). The calculations were repeated with five-nucleotide stretches and showed the same significant increase in hydrophobicity (data not shown).
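The stretch and mutant counts quoted above can be checked by direct enumeration. The Python sketch below is illustrative only (the helper name `single_mutants` is hypothetical) and does not apply any signature probabilities; it simply lists every 6-mer and every single substitution at positions 2 to 5.

```python
from itertools import product

NUCLEOTIDES = "ACGT"

# All 4^6 six-nucleotide stretches.
stretches = ["".join(p) for p in product(NUCLEOTIDES, repeat=6)]


def single_mutants(stretch, positions=(1, 2, 3, 4)):
    """Yield every single substitution at the 2nd-5th positions (0-based 1-4)."""
    for i in positions:
        for base in NUCLEOTIDES:
            if base != stretch[i]:
                yield stretch[:i] + base + stretch[i + 1:]


total_mutants = sum(1 for s in stretches for _ in single_mutants(s))
print(len(stretches), total_mutants)
```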
The number of codons coding for hydrophobic amino acids increased, while the number of hydrophilic amino acid codons decreased with each successive iteration of in silico UV mutagenesis (Fig. 1A-C). The number of stop codons present in the exome also increased with each iteration of mutagenesis: by 38% after one round and by 383% after 20 rounds (Fig. 1D). (Transcripts with premature stop codons tend to become neoantigens due to a quality control mechanism involving the pioneer round of translation in the nucleus [32].) An overall increase in exome hydrophobicity proportional to the number of iterations of in silico mutagenesis is the net result of these changes (Fig. 1E).

Neoantigen hydrophobicity correlates with UV exposure mutations in TCGA samples
In the selected TCGA cutaneous melanoma cohort, the total number of mutations in each sample was weighted by the sample's UVMSE as a proxy for how many UV-induced mutations are present in the sample. Neoantigen hydrophobicity shows a strong correlation (R² = 0.953, P < 0.0001) with UVMSE-weighted mutation quantity in the selected 88 cutaneous melanoma tumors. Weighting neoantigen hydrophobicity by expression preserves the significance (R² = 0.5527, P < 0.0001; Fig. 2). Figure S1 shows that melanoma has higher UVMSE versus nonmelanoma (Panel A; P = 0.0002), but the MSE of other signatures was not significantly different between the two groups (Panel B) in the 151 UCSD Moores Cancer Center patients. Further, patients who attained a CR or PR after immunotherapy had a higher UVMSE in the pan-cancer cohort (P = 0.0165), but this difference in UVMSE was not significant (n.s.) in the melanoma patients, perhaps because of the small number of cases (Panels C and D).
Table 1. Overall hydrophobicity of the human coding genome increases in a single iteration of UV mutagenesis (per in silico computation)*.
* Table 1 shows the analysis using all existing 6-nucleotide stretches; alterations on the reciprocal strands were not included because of an existing bias against mutations in the reciprocal strand [21]. See Table S1. Bolded values are meant to highlight statistically significant results.
a UV signature pattern 7 (as described by Alexandrov et al. [21]) was used. All stretches have at least one mutation. For one iteration, every possible six-nucleotide combination was generated and virtually transcribed to amino acids; the hydrophobicity of the amino acids was then calculated and multiplied by the frequency with which the two codons (six nucleotides) would appear in the human genome (probability based on Kazusa's codon usage database [24]). We then virtually mutated nucleotides 2, 3, 4, and 5 of each 6-nucleotide stretch (since that would result in all possible configurations for the two codons), and for each mutation, we multiplied by the probability that the mutation would occur as part of the UV signature, with the latter being derived from Alexandrov et al. [21]. Finally, the hydrophobicity of the new amino acids was calculated and multiplied by the probability of the two original codons occurring.

UVMSE correlates with tumor types that are known to have high UV exposure
The UVMSE scores of the cutaneous melanoma cohort in TCGA were also compared with those from the acral melanoma cohort [25] in order to determine whether UVMSE can accurately predict high or low UV exposure (Fig. S2). The acral samples were all placed in the 'low UV' group because of the low rate of UV mutation in acral melanomas [25]. The low UV group had an average UVMSE of 1.206 (95% CI: 0.8290-1.583), while the high UV group had an average UVMSE of 1.842 (95% CI: 1.762-1.922; calculated on whole exome). The difference between the two groups was statistically significant (P < 0.0001). We therefore used the data to determine a threshold of 1.642 using the ROC curve method to dichotomize the TCGA data between low and high UV exposure. This threshold has a sensitivity of 73.10% and a specificity of 73.68% when used to predict whether a TCGA patient was diagnosed with cutaneous melanoma.
In our cohort of 151 patients, the mean UVMSE for patients with a melanoma diagnosis, which we used as a proxy for high UV exposure status because the most common etiology of melanoma is excessive UV exposure [16], was significantly higher at 0.8043 (95% CI: 0.7887-0.8199) compared to nonmelanoma patients, who had an average UVMSE of 0.7827 (95% CI: 0.7737-0.7918; P = 0.0029; calculated on the genomic real estate in the Foundation Medicine panel; Fig. 3).
[…] Table 1 becomes even more pronounced. Increasing rounds of mutagenesis cause a loss in hydrophilic amino acid encoding codons and a gain in hydrophobic amino acid encoding codons, therefore increasing the overall hydrophobicity of peptides encoded by the exome, including those of neoantigens. Neoantigen production could potentially increase as well due to an increase in the number of stop codons caused by increasing mutagenesis.

Clinical factors associated with high UV signature
In a univariate model based on our 151 patients in the UCSD Moores Cancer Center Cohort, the factors associated with high UV signature included Caucasian ethnicity, tumor type being melanoma, and TMB high (all P < 0.003; Table 2).

Univariate and multivariate analyses of factors associated with response, PFS, and OS in immunotherapy-treated patients
Overall, 151 patients from the UCSD Moores Cancer Center were analyzed for immunotherapy response. Fifty-two percent were ≤ 60 years old; 62% were men; 74% were Caucasian; 34% had melanoma; 25% had high TMB; and 68% received a PD-1/PD-L1 inhibitor as a single agent (Table 2). Overall, 30% of patients achieved a CR or PR. The median PFS for all patients was 4.6 months, and median OS was 25.4 months (Table 2).

Univariate analysis of all patients
A UVMSE threshold value of 0.7917 was designated to distinguish between UV-high and UV-low status in patients (threshold determined using the ROC curve method; i.e., we identified the threshold that maximizes the area under the ROC curve for predicting clinical outcome using the UVMSE as the predictor). Table S8 shows that tumor type melanoma, TMB high, UV high, and immunotherapies other than single-agent checkpoint inhibitors were significantly associated with better response rates as well as longer PFS and OS (all P < 0.01; univariate analysis). Overall, 24/46 patients (52% of UVMSE high) versus 21/105 (20% of UVMSE low) patients responded (P = 0.0002; Table S8). In the UV-high group, the median PFS was 9.3 versus 3.2 months (P = 0.0001) in the UV-low group, whereas median OS was not reached (n.r.) in the UV-high group compared to 21 months in the UV-low group (P = 0.0139; Table S8).
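The response comparison above is a 2 × 2 table (24 responders and 22 nonresponders in the UV-high group; 21 responders and 84 nonresponders in the UV-low group), which is the setting for Fisher's exact test named in the statistical methods. The following is a minimal pure-Python sketch of the two-sided test, not the SAS or GraphPad implementation used in the study; the function name `fisher_exact_two_sided` is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P-value for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    low, high = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(
        p for p in (prob(x) for x in range(low, high + 1))
        if p <= p_observed * (1 + 1e-7)
    )


# Rows: UV high, UV low; columns: CR/PR, SD/PD (counts from the text).
p_value = fisher_exact_two_sided(24, 22, 21, 84)
print(f"{p_value:.2e}")
```

The printed value can be compared with the P = 0.0002 reported in the text; small differences between software packages in how the two-sided tail is defined are possible.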

Multivariate analysis of all patients
Multivariate analysis demonstrated that only melanoma and TMB high were selected as independent variables predicting response rate, PFS, and OS (all P < 0.03; Table S8).
Results were similar in that UV high versus UV low was associated with response in univariate but not multivariate analysis when only nonmelanoma or only melanoma patients were analyzed (Tables S2 and S3). For PFS, UV high was selected as an independent factor predicting PFS only in melanoma (but not in nonmelanoma) patients (Tables S4 and S5). OS could not be associated with any factor once the groups were split into nonmelanoma and melanoma patients, perhaps because of the limited number of patients in each group (Tables S6 and S7).

UV high versus UV low predicts favorable outcome in the TMB-low/intermediate but not the TMB-high subgroup
Considering the lower and higher TMB groups separately, UVMSE score was effective at identifying responders in the low/intermediate TMB group [Table S9]. Similar results were seen for PFS and OS: In the low/intermediate TMB group, the UV-high and UV-low status predicted longer PFS (P = 0.036) and OS (P = 0.052) but did not stratify PFS or OS for the TMB-high patients (Fig. 4).
There was a significant association between UVMSE and clinical response in univariate (P = 0.0026) and in multivariate (P = 0.0108) analysis among patients with low or intermediate TMB. Within those patients, melanoma tumor type was also significant in both univariate (P = 0.0041) and multivariate analyses (P = 0.0139), but immunotherapy type was only significant in univariate analysis (P = 0.0101; Table S10). UVMSE was significantly associated with PFS or OS time in univariate analysis (P = 0.036 and P = 0.05) within the low TMB group, but not in multivariate analysis ( Fig. 4 and Tables S11 and S12); since only 22 patients were in the high UVSME group in this subanalysis, the small numbers of patients may have precluded robust correlations.

Discussion
Immunotherapy such as checkpoint blockade has been lauded in both the scientific and popular press because it can effectively suppress or even eradicate some advanced cancers. This phenomenon is due to the way immunotherapy serves as a force multiplier for the body's endogenous immune system by reactivating it so that cancer cells are recognized and attacked [33].
While immunotherapy results in exceptional responses in certain tumors, such as melanoma, which is characterized by a high TMB [8], it has only an approximately 20% overall response rate in the unselected population of patients with malignancies [8]; further, in certain cases, checkpoint blockade may result in hyperprogression of the tumor, and it is not without toxicity [34]. Interestingly, other factors such as PD-L1 amplification may also predict immunotherapy response [35]. It is apparent that more predictive biomarkers, in addition to histologic diagnosis and TMB, and a better understanding of the immunosuppressive mechanisms that incapacitate the natural immune response are needed to more effectively route the appropriate therapies to patients.
We show that the mutational landscape caused by UV light, as quantified by the UVMSE, is positively correlated with increased hydrophobicity of exome protein products in both in silico simulation and pancancer TCGA data. This observation is similar to that for APOBEC signatures, which also increase hydrophobicity, but differs from other signatures such as microsatellite instability and tobacco, which may decrease hydrophobicity (even while increasing the number of mutations) [15]. The increased hydrophobicity of UV-mutated proteins derived from the altered coding genome would increase the immunogenicity of the antigens derived from them because T cells have a higher affinity for more hydrophobic antigen peptides [18] and because more hydrophobic peptides bind more strongly to MHC class I molecules' hypervariable regions, particularly when the antigen's hydrophobic residues are located at the anchor positions [19,36]. An increased number of UV mutations, which bias the resulting peptides toward increased hydrophobicity, would logically increase the probability of hydrophobic residues being placed at the anchor positions. In addition, antigens intended for presentation in MHC I are only 8-10 amino acids long [2], with one or two of those being anchor residues, so changing only the anchor positions may have important effects. Similar effects regarding antigen hydrophobicity enhancing presentation in MHC II have also been noted, since the peptide binding groove of MHC II requires hydrophobic amino acids at key locations [37,38]. However, antigens bound to MHC II can be larger than those in MHC I due to the open structure of MHC II, and antigens with more hydrophilic external peptides have been shown to be more immunogenic to CD4+ T cells interacting with MHC II [39].
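To make the argument about anchor positions concrete, the sketch below scores a hypothetical 9-mer before and after a single Pro-to-Leu substitution (a CCN-to-CTN, i.e., C>T, codon change) at position 2. Several elements are assumptions introduced only for illustration: the Kyte-Doolittle hydropathy scale (this section does not name the scale used), the peptide sequences, the helper names, and the choice of position 2 and the C-terminus as anchor positions.

```python
# Kyte-Doolittle hydropathy values (assumed scale, chosen for illustration).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def mean_hydropathy(peptide):
    """Average hydropathy over all residues of the peptide."""
    return sum(KYTE_DOOLITTLE[aa] for aa in peptide) / len(peptide)

def anchor_hydropathy(peptide, anchors=(1, -1)):
    """Average hydropathy over assumed anchor positions.

    The default (0-based indices 1 and -1) corresponds to position 2 and the
    C-terminal residue; this is an assumption, not a property of every allele.
    """
    return sum(KYTE_DOOLITTLE[peptide[i]] for i in anchors) / len(anchors)

# Hypothetical 9-mers differing only by Pro -> Leu at position 2.
wild_type = "SPAGKDTES"
mutant = "SLAGKDTES"

for name, pep in (("wild type", wild_type), ("mutant", mutant)):
    print(f"{name}: mean = {mean_hydropathy(pep):.3f}, "
          f"anchors = {anchor_hydropathy(pep):.3f}")
```

In this toy case a single substitution shifts the whole-peptide average only modestly, while the average over the assumed anchor positions changes sign, which is the kind of localized effect the paragraph above describes.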
The fact that the number of UV mutations is associated with an increase in putative neoantigen hydrophobicity is a possible explanation for why melanomas tend to respond well to checkpoint blockade immunotherapy [40]. In addition, the increase in the number of stop codons also increases the number of neoantigens produced from the mutated exome, because the resulting excess of faulty transcription products is routed to and processed into antigens via mechanisms such as nonsense-mediated decay and a separate quality control process located in the nucleus that is part of the pioneer round of translation [32].
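The link between C>T transitions and new stop codons can be illustrated by enumerating which sense codons become a stop codon after one C>T change on the coding strand, or one G>A change (the same lesion read from the complementary strand). This is a simplified sketch: it ignores the dipyrimidine sequence context that the UV signature requires, as well as tandem CC>TT changes, and the function names are hypothetical.

```python
from itertools import product

STOP_CODONS = {"TAA", "TAG", "TGA"}

def single_substitutions(codon, ref, alt):
    """Yield every codon obtained by changing one `ref` base in `codon` to `alt`."""
    for i, base in enumerate(codon):
        if base == ref:
            yield codon[:i] + alt + codon[i + 1:]

def stop_gain_codons():
    """Map each sense codon to the stop codons reachable by one C>T or G>A change."""
    gains = {}
    for codon in map("".join, product("ACGT", repeat=3)):
        if codon in STOP_CODONS:
            continue  # only consider codons that currently encode an amino acid
        # G>A on the coding strand corresponds to C>T on the complementary strand.
        mutated = set(single_substitutions(codon, "C", "T"))
        mutated |= set(single_substitutions(codon, "G", "A"))
        hits = sorted(mutated & STOP_CODONS)
        if hits:
            gains[codon] = hits
    return gains

gains = stop_gain_codons()
print(gains)
# {'CAA': ['TAA'], 'CAG': ['TAG'], 'CGA': ['TGA'], 'TGG': ['TAG', 'TGA']}
```

Under these simplifications, only the two glutamine codons (CAA, CAG), one arginine codon (CGA), and the tryptophan codon (TGG) can be converted to a stop codon by a single such transition.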
In our study, higher UVMSEs correlated with response to therapy, PFS, and OS. However, despite its significance in univariate analysis, we found that the UV mutational signature was not an independent variable predicting outcome in multivariate analysis; the tumor type, specifically a melanoma versus nonmelanoma histology, and TMB (high versus low/intermediate) were independent predictors of better outcome. These observations are consistent with UV exposure being strongly associated with melanoma diagnoses due to it being the predominant etiology for the disease [16]. Even so, UVMSE can effectively detect responders as well as those with longer PFS and OS after immunotherapy in the low or intermediate TMB patient cohort. This could be explained by the extremely hydrophobic nature (hydrophobicity being associated with immunogenicity [18]) of the UV mutational signature, hence driving the immune response. On the other hand, UVMSE does not appear to be predictive of better outcome in the high TMB cohort, possibly because the sheer number of neoantigens arising from a highly mutated sample, regardless of mutational source, would elicit a strong immune response.
There are several limitations to this study. First, the range of UVMSE signatures differs when the patient cancer type distribution differs. This issue can be seen in the range of the UVMSE in TCGA compared to that of the 151 Moores Cancer Center patients. Second, patients had a variety of tumor types and immunotherapies, and the limited number of patients with individual tumor types precluded an analysis by histology. However, the data may also suggest that the results are generalizable across cancers and treatments.

Conclusion
In summary, we show, through in silico simulation and analysis of TCGA data, that a genome altered with the characteristic UV exposure mutation signature would produce 8-10 mer antigens of significantly elevated hydrophobicity. The hydrophobicity of these neoantigens is also proportional to the number of mutations caused by UV exposure in individual samples. This increased hydrophobicity, among other physicochemical properties, may cause T cells of the immune system to recognize cells presenting the neoantigens extracellularly in MHC I molecules as 'foreign', marking these cancer cells for destruction. T cells preferentially bind through their T-cell receptors to more hydrophobic antigens, both because these antigens are more likely to be presented in MHC I moieties due to the requirement for hydrophobic anchor positions to facilitate antigen presentation, a prerequisite for interacting with T cells, and because of the hydrophobic peptides' intrinsically immunogenic nature [15,18-20]. Therefore, checkpoint blockade immunotherapies are more likely to be effective since the tumor will be infiltrated by T cells attached to cancer cells prevented from doing their effector functions only by PD-1/PD-L1 interactions. The correlation of UV exposure with better immunotherapy outcome appears to be more important in cases with low/intermediate TMB (versus high TMB), perhaps because, in the latter, the large number of mutations already permits immune recognition once T cells are reactivated after checkpoint blockade therapy.

Table S1. Consequences of a single iteration of UV mutagenesis on the overall hydrophobicity of the human coding genome, including mutations appearing on the reciprocal strand (computed in silico)*.
Table S2. Univariate and multivariate analysis of factors affecting response rate for non-melanoma patients treated with immunotherapy agents (N = 99).
Table S3. Univariate and multivariate analysis of factors affecting response rate for melanoma patients treated with immunotherapy agents (N = 52).
Table S4. Factors associated with PFS on immunotherapy for 99 non-melanoma patients treated with immunotherapy.
Table S5. Factors associated with PFS on immunotherapy for 52 melanoma patients treated with immunotherapy.
Table S6. Factors associated with OS on immunotherapy for 99 non-melanoma patients treated with immunotherapy.
Table S7. Factors associated with OS on immunotherapy for 52 melanoma patients treated with immunotherapy.
Table S8. Univariate and multivariate analysis of factors affecting response rate, progression-free and overall survival for all patients treated with immunotherapy agents (N = 151).
Table S9. Factors associated with response to immunotherapy for the treated 151 patients separated into higher and lower TMB groups.
Fig. S1. UV signature enrichment analysis in a cohort of 151 patients and correlation to the response to immunotherapy.
Fig. S2. UV signature enrichment analysis in a cohort of 328 acral and cutaneous melanomas.