Genomics of NSCLC patients both affirm PD-L1 expression and predict their clinical responses to anti-PD-1 immunotherapy

Background Programmed Death Ligand 1 (PD-L1) is a co-stimulatory and immune checkpoint protein. PD-L1 expression in non-small cell lung cancers (NSCLC) is a hallmark of adaptive resistance and its expression is often used to predict the outcome of Programmed Death 1 (PD-1) and PD-L1 immunotherapy treatments. However, clinical benefits do not occur in all patients and new approaches are needed to assist in selecting patients for PD-1 or PD-L1 immunotherapies. Here, we hypothesized that patient tumor cell genomics influenced cell signaling and expression of PD-L1, chemokines, and immunosuppressive molecules and these profiles could be used to predict patient clinical responses. Methods We used a recent dataset from NSCLC patients treated with pembrolizumab. Deleterious gene mutational profiles in patient exomes were identified and annotated into a cancer network to create NSCLC patient-specific predictive computational simulation models. Validation checks were performed on the cancer network, simulation model predictions, and PD-1 match rates between patient-specific predicted and clinical responses. Results Expression profiles of these 24 chemokines and immunosuppressive molecules were used to identify patients who would or would not respond to PD-1 immunotherapy. PD-L1 expression alone was not sufficient to predict which patients would or would not respond to PD-1 immunotherapy. Adding chemokine and immunosuppressive molecule expression profiles allowed patient models to achieve a greater than 85.0% predictive correlation among predicted and reported patient clinical responses. Conclusions Our results suggested that chemokine and immunosuppressive molecule expression profiles can be used to accurately predict clinical responses thus differentiating among patients who would and would not benefit from PD-1 or PD-L1 immunotherapies. Electronic supplementary material The online version of this article (10.1186/s12885-018-4134-y) contains supplementary material, which is available to authorized users.


Background
In clinical trials of PD-1 or PD-L1 checkpoint immunotherapies, patients with NSCLC separate into groups that respond or do not respond to immunotherapy treatment [1][2][3][4]. Objective responses for NSCLC in these studies range from 19.0-23.0%. Patients are selected for immunotherapy based on immunohistochemistry (IHC) detection of PD-L1 reactivity. Positive PD-L1 reactivity in tumors is considered to be important to predicting the success of PD-1 and PD-L1 immunotherapy treatments [2,5]. However, IHC results to detect PD-L1 reactivity can vary depending upon different IHC platforms, differences in anti-PD-L1 antibodies, differences in scoring systems, and differences in positivity cut-off values [4,[6][7][8][9]. In the Blueprint PD-L1 IHC Assay Comparison Project [10], similar antibody-specific differences were seen. In all, this variability presents challenges for using PD-L1 reactivity as a sole marker for diagnosis and as a marker to predict the success of PD-1 and PD-L1 immunotherapy treatments.
More likely, there is a complex profile of molecules that contributes to the regulation of PD-L1 and to the subsequent immunosuppressive effects that NSCLC cells have on immune cells [11]. Chae et al. suggested that reliable predictive molecules need to be identified that can be used to select patients who would benefit from immunotherapy, yet limit the exposure of patients who would not benefit or have adverse reactions [12]. Multifaceted predictive biomarker systems have also been proposed that contain input on PD-L1 expression, tumor mutations, and the roles of inflammatory cells to identify patients that would respond or not respond to immunotherapy treatment [13,14]. For better treatment outcomes, there is a recognized need to develop additional methods that can identify a profile of molecules that contributes to the regulation of PD-L1 expression and affirms IHC PD-L1 positive reactivity. One approach is to use the influence of patient cell genomics on tumor cell signaling to identify the downstream effects on PD-L1 expression.
In this study, we hypothesized that patient tumor cell genomics influences cell signaling and the expression of PD-L1, chemokines, and immunosuppressive molecules. We also hypothesized that these profiles can be used to predict patient clinical responses. Rizvi et al. assessed the mutational profiles that determined sensitivity to PD-L1 blockade from patients with NSCLC treated with pembrolizumab [15] and we used the Rizvi et al. dataset to test our hypothesis. We first assessed the effect of patient genomics on the expression profile of 24 molecules: PD-L1, 9 chemokines, and 14 immunosuppressive molecules. Differences among patient-specific models reflected the input that their deleterious gene mutation profiles had on modeled signaling pathways and the expression of PD-L1, chemokines, and immunosuppressive molecules. Second, we used the expression profiles of these 24 chemokines and immunosuppressive molecules to sort patients into those that would or would not respond to PD-1 immunotherapy. The 9 chemokines were used to generate an index to predict dendritic cell infiltration and PD-L1 and the 14 immunosuppressive molecules were selected as tumor-derived molecules with a long list of reported immunosuppressive functions (Additional file 1: Table S1). Our results suggest that patient-specific chemokine and immunosuppressive molecule expression profiles can be used to accurately predict clinical responses thus differentiate among patients who would or would not respond to PD-1 immunotherapy.

Patient clinical characteristics and mutation profiles
This was a retrospective study and patient data, clinical characteristics, and exome sequencing information for each of 34 patients were obtained directly from Supplement Table 3 of the Rizvi et al study. study [15]. To maintain anonymity, a random string generator was used to create a new random, 6-character uppercase alpha numeric string for each patient. This blinded both the identities of the patients in this study and their link to the prior published dataset we modeled.
All patients had stage IV NSCLC and were treated at Memorial Sloan Kettering Cancer Center (n = 29) or the University of California at Los Angeles (n = 5) on protocol NCT01295827. All patients had consented to Institutional Review Board-approved protocols permitting tissue collection and sequencing by the co-authors in this study (Naiyer A. Rizvi and Timothy A. Chan). All patients initiated therapy in 2012-2013 and were treated at 10 mg/kg every 2-3 weeks. Five patients were treated at 2 mg/kg every 3 weeks. The overall response rate and progression-free survival were reported to be similar across dose and schedules. PD-L1 expression on NSCLC tumor cells and immune cells by IHC was reported and scored semi-quantitatively: ≥50.0% membranous staining was considered strong, 1-49.0% was considered weak, and < 1.0% was considered negative [15].

Simulation models
A validated cancer network containing a database of proteins involved in cell signal transduction, metabolism, and epigenetics obtained from manual review of new and published research (Additional file 3: Figure S1) was used to create patient NSCLC-specific predictive computational simulation models. This approach modeled protein-protein interactions at each step in a signaling pathway using ordinary differential equations (ODE) [21] and to predict specific pathway output [22]. Pathway protein-protein interactions at each specific node were modeled as Michaelis-Menten equations that contained the reaction, enzyme, initial concentrations of protein intermediate reactants, and parameters of the reaction like Ka, Km, kcat, Vmax, etc. ODE were solved at each step by the Radau method [23]. To demonstrate this modeling approach, an annexure section of the PD-L1 pathway is illustrated showing the step-by-step details of the protein-protein interactions at each node in the pathway as an example of the modeling process that also occured in all of the other pathways (Additional file 4: Supplementary Materials and Methods and Supplement Table 3 of the Rizvi et al study). The cancer network and the schema for creating these simulation models, predicting molecule responses, and identifying those patients who would or would not respond to PD-L1 immunotherapy is shown in Fig. 1.
NSCLC models in the cancer network were created for each patient. At the initial step, models did not contain patient-specific deleterious gene mutation profiles and were simulated to reach a homeostatic steady state, which served as the control baseline for the molecules of interest. Then patient-specific deleterious gene mutation profiles were converted into a computational format and annotated into the NSCLC cancer network, simulated to induce the patient-specific cancer disease states, and used to predict the expression of PD-L1, chemokines, and immunosuppressive molecules. At the network level, mutations of oncogenes were represented as gain of function at the activity level and mutations of tumor suppressor genes were represented as a loss of function at the activity level unless explicit functionality of the mutation was known from published studies. Copy number variations such as amplifications and deletions were represented as over-expression or deletion of gene function at the expression level. The time required to achieve a patient-specific network varied depending upon on the complexity of the patient-specific deleterious gene mutation profile.
The modeled output contained the expression profiles of 24 molecules (e.g., PD-L1, 9 chemokines, and 14 immunosuppressive molecules). PD-L1 expression was reported as percent change calculated as ((D/C)-1)*100. C was the absolute value of the non-tumorigenic baseline control (μM) and D was the absolute value of PD-L1 obtained from the patient-specific cancer state network (μM) [24]. CCL2 [25], CCL3 [26], CCL4 [27], CCL5 [28], CCL11 [29], CCL20 [30], and CX3CL1 [31] expression were determined similarly. These chemokines are capable of trafficking dendritic cells into the tumor microenvironment. Individual chemokine percent expression values were given weightage and normalized to sum to 1. A dendritic cell infiltration index was then calculated to be the sum of each prediction % change * weightage (Additional file 6: Table S4). Finally, the expression of 14 immunosuppressive molecules thought to facilitate the ability of cancer cells to escape normal tumor surveillance was determined (Additional file 1: Table S1).
Patient-specific simulation model predictions were also assessed using Weka 3, a data mining software program in Java [32]. Weka 3 contained machine learning algorithms for data pre-processing; data classification, regression, clustering, and association rules; and data visualization. Using the predicted responses in Table 1, several machine-learning algorithms were implemented to learn prediction models (Additional file 7: Table S5).

Clinical response projections
Differences among the expression of 14 molecules were used in a 3-step process to sort patients into those that  Fig. 1 The schema for creating predictive computational simulation models to predict molecule responses and identify patients that would respond or not respond to PD-1 immunotherapy treatment using patient SA97V5 as a model example. Exome information from patient SA97V5 (a) contained 1192 total mutations with 36 deleterious mutations. This profile (b) was converted from a mutational profile to a computational format and annotated into the computational workflow to convert (c) a nontransformed model in the cancer network into (d) a patient SA97V5-specific simulation model. The patient SA97V5-specific simulation model was used to predict PD-L1 expression (e.g., 67.0% with respect to control), dendritic cell (DC) infiltration index (e.g., 23.8% with respect to control); and an immunosuppressive molecule expression profile (e.g., range − 1.9% to 56.5% with respect to controls) (e). Predicted expression responses were all used (f) to sort patients into groups that would respond or not respond to PD-1 immunotherapy treatment. SA97V5 was identified as a patient who would respond to PD-1 immunotherapy treatment. Numerous validation checks (g) occurred on the cancer network, the simulation model predictions, and the PD-1 match rates between the predicted responses and the patient clinical responses would or would not respond to PD-1 immunotherapy (Fig. 2). Patients were sorted by their PD-L1 expression (Step 1), their dendritic cell infiltration index (Steps 2a and b), and their immunosuppressive molecule expression (Steps 3a and b).

Signal pathways
Graphic representations of the simulation model networks for each patient-specific model and the underlying network relationships were created as previously described [24] to identify similarities in patient-specific signaling pathways and to identify the influence of the pathway intermediates altered by the patient deleterious gene mutation profiles.

Simulation model validations
A series of internal control check analyses were used to validate the cancer network input and output data. These control checks monitored a) the effects of select pathway molecule over-expression or knockdown on pathway predictions, b) the effects of select drugs on pathway predictions, and c) the effects of activation, regulation, and cross-talk interactions among pathway intermediates on pathway predictions. A cross-validation approach was used to assess the match scores of the PD-1 predicted responses against the PD-1 clinical responses in the Rizvi et al. 2015 Discovery dataset vs. the Validation dataset [14]. The datasets were then pooled and re-partitioned into two new Training and Test datasets. A similar crossvalidation approach was then used to assess the match scores of the PD-1 predicted responses vs. the PD-1 clinical responses. Differences between the match rates of the PD-1 predicted responses and the PD-1 clinical responses were performed via chi-square test or Fisher's exact as previously described [24]. All statistical tests utilized a 0.05 level of significance.

Simulation models
In this retrospective study, we created 29 of 34 separate and patient-specific simulation models from the exome sequencing information for each of the 34 patients listed in Supplement Table 3 of the Rizvi et al study [15]. In the Discovery dataset, 13 of 16 patients had sufficient information to create simulation models and patients RYRJFL, IYXPLI, and GOFKQI did not (Additional file 2: Table S2). In the Validation dataset, 16 of 18 patients had sufficient information to create computational simulation models and patients 6NLFT5 and 32I5VC did not (Additional file 2: Table S2). The 5 patients with insufficient information were omitted from this study since Step 2a Non-responders J0T9TJ ZX7V33 If DC Index <20% True If DC Index >60% Step 3a Step 3b Step 2b Responders MJXYP6 DFZLO2 True False Start Fig. 2 A decision tree was used to identify PD-1 drug responder status. At step 1, 9 patients with PD-L1 expression below 29.0% were identified as PD-1 drug non-responders. The remaining 16 patients (including patient SA97V5) with PD-L1 expression equal to or greater than 29.0% proceeded to step 2. At Steps 2a and 2b, 2 patients with dendritic cell infiltration index values below 20.0% were identified as non-responders and 2 patients with index values greater than 60.0% were identified as PD-1 drug responders. Twelve patients with index values greater than 20.0% (including patient SA97V5), but less than 60.0% proceeded to step 3. At Step 3, 4 patients with immunosuppressive molecule (ISM) values higher than that of their PD-L1 expression with a margin of greater than 5.0%, were identified as non-responders (Step 3a) and 8 patient-specific models with values lower than that of their PD-L1 expression with a margin of greater than 5.0% were identified as responders (Step 3b, including patient SA97V5). Mismatch patients GI7AGZ, 2FCOH7, F3FK2W were not listed their deleterious gene mutation profiles lacked driver genes and we were unable to achieve an increase in tumor phenotypes of proliferation and viability with the subset of gene aberrations reported. The remaining patients in the Discovery dataset (n = 13) contained 5 clinical responders and 8 clinical non-responders and the patients in the Validation dataset (n = 16) contained 6 clinical responders and 10 clinical non-responders (Table  1). Objective responses to PD-1/PD-L1 immunotherapies are known to vary. For example, objective responses to PD-1 immunotherapies for NSCLC were reported to range from 19.0-21.0% and objective responses to PD-L1 immunotherapies for NSCLC were reported to range from 10.0-23.0% [1,2]. Hence the proportion of more non-responders than responders in this sample size was representative of the responses previously reported in larger patient populations.

Patient molecule responses
Results reported for patient SA97V5 were used as an example to clearly illustrate the process of model creation, model prediction, and model validation.
Modeled chemokine expression was used to create a dendritic cell infiltration index. This index was a weighted function of the percentage change of each of the 9 individual chemokines (Table 1) Table S4.
Modeled expression profiles for 14 immunosuppressive molecules, including those for patient SA97V5, are also shown in Table 1.

Clinical response projections
The expression of PD-L1, dendritic cell infiltration index, and immunosuppressive molecules were used in a 3-step process to sort patients into those that would or would not respond to PD-1 immunotherapy (Fig. 2).
At step 1, 9 patients with PD-L1 expression below 29.0% were identified as PD-1 drug non-responders ( Figs. 2 and 3a). The remaining 16 patients with PD-L1 expression equal to or greater than 29.0% proceeded to step 2. Patient SA97V5 had a PD-L1 expression value of 67.0% and proceeded to step 2.
At Step 2, 2 patients with dendritic cell infiltration index values below 20.0% were identified as nonresponders (Figs. 2 and 3b) and 2 patients with index values greater than 60.0% were identified as PD-1 drug responders (Figs. 2 and 3b). One mismatch occurred at this step. Clinical non-responder GI7AGZ with a dendritic cell index of 64.9% was misidentified as a PD-1 responder (Fig. 3b). Twelve patients with index values greater than 20.0%, but less than 60.0% proceeded to step 3. Patient SA97V5 had a dendritic cell infiltration index of 23.9% and proceeded to step 3. At Step 3, 4 patients with immunosuppressive molecule values higher than that of their PD-L1 expression with a margin of greater than 5.0% were considered to be non-responders (Fig. 2, Step 3a) and 8 patientspecific models with values lower than that of their PD-L1 expression with a margin of greater than 5.0% were considered to be responders (Fig. 2, Step 3b). Three mismatches occurred at this step. Clinical responder patient 2FCOH7 had an immunosuppressive molecule expression profile of a non-responder: vascular endothelial growth factor (VEGF), cytotoxic T-lymphocyte-associated protein 4 (CTLA4), ganglioside GM3 (GM3), and ganglioside GD2 (GD2) were all higher than that of PD-L1 with a margin of greater than 5.0%. Clinical non-responder patients F3FK2W and 6QFSVV had immunosuppressive molecule expression profiles of responders. Patient SA97V5 had all 14 immunosuppressive molecules below the threshold of PD-L1 and was identified as a PD-1 drug responder (Figs. 2 and 4a) and patient QIA43T had molecules TGFB1 and IL6 above the threshold of PD-L1 and was identified as a PD-1 drug non-responder (Figs. 2 and 4b).
Patient-specific model predictions in the Discovery and Validation datasets were also checked using Weka 3 [32] and SMO support vector machine with a normalized polynomial kernel had the best performance (Additional file 7: Table S5). The relationship between PD-L1 expression and predicted TGFB1 expression using Weka 3 algorithms for all patients in the dataset is shown in Additional file 8: Figure S2 and similar trends were seen when comparing the PD-L1 expression level to the other 13 predicted molecules. Weka 3 correctly identified 24 out of 29 patients whereas the computational simulation models correctly identified 25 of 29 patients.

Model validation
In the cross-validation analysis of the match scores between the Rizvi et al. 2015 Discovery and Validation datasets, there were no significant differences between the match scores of non-responders and responders in the PD-1 clinical response group (38.5% vs. 37.5%; p = 0.9577, Additional file 9: Table S6) and PD-1 predicted response group (30.8% vs. 56.3%; p = 0.2642, Additional file 9: Table S6). Even though the Discovery dataset had a higher match score rate among the PD-1 clinical response group and the PD-1 predicted response group Fig. 4 Patient-specific simulation models were used to predict the expression of 14 immunosuppressive molecules. At step 3 of the decision tree, patients with immunosuppressive molecule predictions higher than that of PD-L1 with a margin of greater than 5.0% (bold line), were considered to be non-responders and patient-specific models with predictions lower than that of PD-L1 with a margin of greater than 5.0% were considered to be responders. Eight remaining patients were identified as responders and 4 remaining patients were identified as non-responders. Patient SA97V5 (a) had all 14 molecules below the threshold and was identified as a PD-1 drug responder. Patient QIA43T (b) had 2 molecules above the threshold and was identified as a PD-1 drug non-responder Similarly, in the cross-validation analysis of the match scores between the Training and Test datasets, there were no significant differences between the match scores of non-responders and responders in the PD-1 clinical response group (38.9% vs. 36.4%; p = 0.9999, Additional file 9: Table S6) and PD-1 predicted response group (44.4% vs. 45.5%; p = 0.9577, Additional file 9: Table S6). In the Training dataset, the PD-1 predicted responses had an 83.3% match score with the PD-1 clinical responses and in the Test dataset the PD-1 predicted responses had a 90.9% match score with the PD-1 clinical responses. Again, there was no significant difference between the two datasets (p = 0.9999).

KRAS mutations and PD-L1 expression
To further affirm the association between the presence of KRAS mutations or KRAS co-mutations and PD-L1 expression, 2 additional datasets [33,34] were modelled beyond the Supplement Table 3 of the Rizvi et al study [15].
KRAS mutations in lung adenocarcinoma were reported to be associated with co-mutations in TP53. In modeled simulations of the Dong, et al. dataset [33], the KRAS+TP53 co-mutation (KP subgroup) was predicted to increase PD-L1 expression. The KRAS +TP53 co-mutation had higher levels of predicted PD-L1 expression than the KRAS mutation and TP53 mutation alone.
KRAS mutations in lung adenocarcinoma were also reported to be associated with co-mutations in STK11/ LKB1 (the KL subgroup) [34]. In modeled simulations of the Skoulidis, et al. dataset [34], KRAS+STK11+KEAP1 co-mutation was predicted to reduce PD-L1 expression. The KRAS+STK11+KEAP1 co-mutation had lower levels of predicted PD-L1 expression than the KRAS +TP53 co-mutation.

Discussion
In this retrospective study, we used a recent dataset from NSCLC patients treated with pembrolizumab and identified deleterious gene mutational profiles in patient exomes. We annotated the deleterious gene mutational profiles into a cancer network to create NSCLC patientspecific predictive computational simulation models. We used these models as a tool to identify and validate a profile of 24 chemokines and immunosuppressive molecules that could accurately affirm expression of PD-L1 and predict patient clinical responses to PD-1 immunotherapy. We found that patient tumor cell genomics influenced cell signaling and altered the expression of PD-L1, 9 chemokines, and 14 immunosuppressive molecules. We also found that expression profiles of these 24 chemokines and immunosuppressive molecules could be used to identify patients who would or would not respond to PD-1 immunotherapy. Adding chemokine and immunosuppressive molecule expression profiles to a predicted PD-L1 profile allowed models to achieve a greater than 85.0% predictive correlation among predicted and reported patient clinical responses. This differentiated patients who would and would not benefit from PD-1 or PD-L1 immunotherapies. To validate our results, we used retrospective correlation of our simulation models against patient genomic signatures and clinical outcome data that was available in the NSCLC cohort of the Rizvi et al. study [15]. We also used Weka 3 to validate predictions determined using the predictive computational simulation models. The Weka 3 results were similar to that generated via machine-learning methods and the Chi-square test was used to show no differences among the match rate results in these datasets. It is important to note that expanding PD-L1 expression profiles to include 23 additional chemokine and immunosuppressive molecule expression responses allowed models to achieve a greater than 85.0% correlation among predicted and reported patient clinical responses.
The 24 molecules used in this study have immunosuppressive properties. The role of PD-L1 in tumor pathogenesis is well known. Increased expression of PD-L1 on tumor cells inhibits T-cell proliferation, reduces T-cell survival, inhibits cytokine release, and promotes T-cell apoptosis [3,[35][36][37][38]. This leads to T-cell exhaustion and adaptive tumor immunosuppression [39].
Cytokines also have a role. Cytokine scores were recently found to be associated with overall survival in CheckMate 017 and 057 (both for nivolumab and docetaxel treated patients) [40]. In this present study, 9 chemokines were selected that chemoattractant dendritic cells [41][42][43]. In other studies, dendritic cells were present in NSCLC tumors [42,44] and dendritic cell infiltration was reported to be an independent prognostic factor for NSCLC [44]. The simulation models here captured the role of dendritic cells by predicting chemokine expression in the form of a functional dendritic cell index (Additional file 6: Table S4).
Using a 24 molecule expression profile allowed computational models to achieve a greater than 85.0% predictive correlation. However, this list was not exclusive and incorporating additional molecules into patient NSCLCspecific expression profiles may have merit and improve computational model accuracy. These included Lymphocyte activation gene-3 (LAG-3), T cell immunoglobulin-3 (TIM-3), and T cell immunoglobulin and ITIM domain (TIGIT) co-inhibitory receptors [64]. LAG-3 is a coinhibitory receptor upregulated on activated CD4+ T cells, CD8+ T cells, and subsets of natural killer (NK) cells [64][65][66]. It impairs T cell proliferation and cytokine production and alters NK cell cytotoxicity and cytokine production. TIM-3 is a cell surface molecule expressed on IFNγproducing CD4+ T helper 1 cells, CD8+ T cytotoxic 1 T cells, NK cells, monocytes, and dendritic cells [64]. TIM-3 dampens the development of protective immunity and TIM-3 blockade improves cell function. In patients with NSCLC, co-blockade of the TIM-3 and PD-1 pathways suppresses tumor growth. TIGIT is another co-inhibitory receptor expressed on NK cells, T cells, and Treg cells [64]. CD155, CD112, and TIGIT ligands suppress immune responses through CD155 on dendritic cells. TIGIT is thought to work with PD-1 and TIM-3 to attenuate T cell responses and promote T cell dysfunction.
After the NSCLC patient-specific predictive computational simulation models were created and the profiles of 24 chemokines and immunosuppressive molecules were predicted, we created a decision tree to identify patients who would or would not respond to PD-1 immunotherapy. Decision cutoffs were established at 29.0% PD-L1 expression (Step 1), < 20.0% dendritic cell infiltration (Step 2a), > 60.0% dendritic cell infiltration (Step 2b), and immunosuppressive molecule expression as < PD-L1 with a margin of greater than 5.0% (Step 3) (Fig. 2). The decision tree was robust and had built-in redundancy. Basing the PD-L1 drug responder status on 3 separate predicted criteria allowed a responder/non-responder not identified at one step to be identified at a later step. Also the thresholds were specific. At 29.0% PD-L1 expression (Step 1), 9 non-responder patients were identified. Decreasing the PD-L1 expression cutoff from 29.0% to 25.0% identified only 6 non-responders. Increasing the PD-L1 expression cutoff from 29.0% to 35.0% identified a number of false negatives and setting the PD-L1 expression cutoff at 35.0% identified up to 13 non-responder patients: the three additional patient responders L8MTGU, P90A0O, and 26YMUF would be falsely identified as nonresponders.
A diversity of signaling pathways are reported to be involved in the expression and regulation of PD-L1 [67][68][69] and these pathways were observed in expression and regulation of PD-L1 in this study (Fig. 5). Responder patients had mutations around the rapidly accelerated fibrosarcoma (RAF)-rat sarcoma (RAS)-ERK pathway including KRAS/BRAF and MEK-related mutations that predicted the profiles to have stronger expression of PD-L1. However, the presence of KRAS cannot be the only criteria for predicting strong expression of PD-L1 and thus a likely PD-1 drug responder, since there were nonresponder profiles that also had KRAS mutations. We observed that matched predicted and clinical nonresponder patients 195P5D and J0T9TJ and mismatched predicted responder and clinical non-responder patient 6QFSVV all had KRAS mutations (Additional file 2: Table S2, Fig. 5).
Recent studies support the concept that NSCLC is not a homogeneous disease and at least 3 subtypes of KRAS mutations involving LKB1 or TP53 can be identified. The tumors with these mutations have different PD-L1 expression patterns (higher in KRAS mutations and TP53 mutations) and different sensitivities to immune checkpoint blockade. Thus the effects of KRAS mutations and KRAS co-mutations on PD-L1 expression was further assessed using 2 additional datasets [33,34] beyond the Supplement Table 3 of the Rizvi et al study [15].
Dong, et al. [33] reported that TP53 and KRAS mutations may predict which patients would or would not respond to PD-1 immunotherapy. Modeling the dataset in their study, we predicted that KRAS+TP53 co-mutation (KP Subgroup) would lead to increased PD-L1 expression. Skoulidis, et al. [34] reported that KRAS mutations in lung adenocarcinoma were associated with co-mutations in STK11/LKB1 (the KL subgroup) [34]. KL tumors had high rates of KEAP1 mutations with lower PD-L1 expression.
Modeling the dataset in their study, we predicted that KRAS+STK11+KEAP1 co-mutations (KL Subgroup) also would lead to reduced PD-L1 expression. We predicted that KRAS+CDKN2A/B co-mutation (KC Subgroup) would lead to reduced PD-L1 expression. There was a reduction in positive regulation due to reduction in AMPK, mTOR pathway and also due to KEAP1 loss of function. There was an increase in the WT TP53 mediated inhibitory regulation of PD-L1 expression. This was a novel finding based on network analysis.
The techniques described in this retrospective study have application. Although the techniques were complicated and need more extensive validation with larger datasets, their utility in clinical practice is possible. Profiling of tumors is becoming more main stream for precision personalized medicine. The approach may not necessarily be expensive, but in fact provides more utility to the generated profiling data for most tumor samples.

Conclusions
Patient tumor cell genomics were found to influence cell signaling with downstream effects on the expression of 24 chemokines and immunosuppressive molecules. This allowed us to establish patient-specific profiles of these molecules that could be used to predict patient clinical responses with greater than 85.0% correlation among predicted and reported patient clinical responses. Developing a workflow incorporating immunosuppressive molecules could a) be used as a potential complementary assay to affirm IHC results or used as an alternate assay where IHC in unfeasible, b) affirm patient PD-1 and PD-L1 drug responder status, c) as a method to determine influencing factors on PD-L1 expression, and d) as a potential clinical decision support system facilitating selection of therapies based on individual patient mutational profiles. The latter application used shortly after cancer diagnosis and just before cancer treatment could generate important patient-specific treatment options that could assist clinicians in selecting appropriate mono-therapies or combination therapies.

Additional files
Additional file 1: Table S1. Molecules with immunosuppressive functions used in simulation models to predict PD-1 drug responder status. (DOCX 55 kb) Additional file 2: Table S2. Individual mutational profiles of patient drug responders (n = 11) and nonresponders (n = 18). (DOCX 19 kb) Additional file 3: Figure S1. A schematic pyramid showing the levels of information used to develop the validated, cancer network. This network was created from published reports on cell receptors, signaling pathways, pathway signaling intermediates, activation factors, transcription factors, and enzyme kinetics. Information on each pathway node, its functionality, and its links with other genes, proteins, and pathways was manually researched, analyzed, curated, and aggregated to construct the integrated network maze. Every process or reaction was modeled mathematically using Michaelis Menton kinetics, mass action kinetics, and variations of these representations using ordinary differential equations (ODEs). Modeled events included but were not limited to interactions at the cell surface (e.g., binding of ligands to receptors, etc.), metabolic and cell signaling (e.g., signal pathway events, cross talk interactions among pathways, feedback control, etc.), activation and regulation of genes (e.g., activation links of transcription factors, etc.), intracellular processes such as proteasomal degradation, endoplasmic reticulum (ER) stress, oxidative stress, DNA damage and repair pathways, and cell cycle pathways. Time-dependent changes in signaling pathway fluxes of every biological reaction modeled utilizing modified ODEs were solved with a proprietary solver. Models were validated with a series of internal control analysis checks on predictions. These checks included assessing the effects of pathway molecule over-expression or knockdown on pathway predictions; effects of drugs on pathway predictions; and activation, regulation, and cross-talk interactions among pathway intermediates on pathway predictions.  Table S3. An example of the predictive computational modeling process. Specific details on an annexure section of the PD-L1 pathway show the step-by-step reactions, mechanisms, and reaction equations that occur. Such reactions also occurred in all of the other pathways. (DOCX 102 kb) Additional file 6: Table S4. Creation of the dendritic cell infiltration index for the patient SA97V5-specific simulation model. Chemokines CCL11, CCL20, CCL2, CCL3, CCL4, CCL5, CCL7, CX3CL1, and CXCL14, capable of trafficking of dendritic cells into the tumor microenvironment, were used to create the index. Individual chemokine percent expression (with respect to non-tumorigenic baseline controls) was predicted and given weightage so as to normalize the total to 1. The index was then calculated to be the sum of each prediction % change * weightage. (DOCX 16 kb) Additional file 7: Table S5. Analysis of the Discovery and Validation datasets was performed using Weka 3. The first number in each column represented the number of patient treatment responses correctly classified by the model. The second number represented the number of incorrectly classified patient treatment responses. The GOAL row at the bottom of each column described the number of correctly and incorrectly classified patients in the simulation models. The Test Set columns described the output from applying the model trained on the Discovery set to the Validation set. The "Test and Train" columns described test set accuracy (test set column) plus the training error (results obtained by applying the model to the training set, i.e. training error). (DOCX 19 kb) Additional file 8: Figure S2. An example of the relationship between PD-L1 expression and predicted TGFB1 expression using Weka 3 algorithms for all patients in the dataset. Similar trends were seen when comparing the PD-L1 expression level to the other 13 predicted molecules. For this, the number of gene mutations identified for each patient ranged from 2 to 36 with a total of 264 unique genes between all patients. This categorical data was preprocessed and expanded into a gene vector of length 264 to represent each of the unique genes. For each gene in the vector, the data was represented in binary; a 1 was assigned if the patient had a mutation in this gene, a 0 otherwise. Two datasets, one including gene mutations (Molecules and Gene Mutations) and one without (Molecules), were both used to learn prediction models. The Discovery and Validation datasets were determined based on the split provided to allow for comparable results. The performance of a subset of these models on the testing and training sets for both Molecules and Molecules and Gene Mutations datasets are shown. The SMO support vector machine with a normalized polynomial kernel had the best performance when applied to the molecule dataset. This model correctly identified 24 out of 29 patients whereas the simulation models correctly identified 25 of 29. This was only a difference of one match between the two prediction methods. Still, several other methods, while not performing as well overall, were able to identify 9 patients in the test dataset accurately. This was near the computational simulation model prediction capability in which 10 patients were successfully identified in the test dataset. In general, adding the gene mutation data to the molecule data either maintained or decreased the performance of a model. (DOCX 4114 kb) Additional file 9: Table S6. Comparisons of clinical and predicated responses and match scores. We used a cross-validation approach to assess the match scores in Table 1  . NIDCR did not have a role in the study design, analysis, or interpretation of the data, writing of the manuscript, or the decision to submit the manuscript for publication.

Availability of data and materials
All data generated or analyzed during this study are included in this published article, figures, table, and additional files. Patient data, clinical characteristics, and exome sequencing information for each of 34 patients were obtained directly from Additional file 5: Table S3 of the Rizvi et al. study [15]. To maintain anonymity, a random string generator was used to create a new random, 6-character uppercase alpha numeric string for each patient. This blinded both the identities of the patients in this study and their link to the prior published dataset we modeled.
Authors' contributions NAR and TAC provided the patient data, clinical characteristics, and exome sequencing information for each of 34 patients (Additional file 5: Table S3) from their study [15]. DP, TA, and SV performed the predictive computational modeling of the data. ARH and TB performed machine learning analysis as a validation step of the modeled predictions. FQ performed statistical analysis of the dataset to also validate predictions. KAB analyzed and interpreted the computational model predictions. ADB and MMM provided background information and assisted in preparation of the manuscript. All authors assisted in preparation and writing of the manuscript. All authors have read and approved the final manuscript for publication.
Ethics approval and consent to participate All patients had stage IV non-small cell lung cancer (NSCLC) and were treated at Memorial Sloan Kettering Cancer Center (n = 29) or the University of California at Los Angeles (n = 5) on protocol NCT01295827. All patients had consented to the Memorial Sloan Kettering Cancer Center Institutional Review Board-approved protocols permitting tissue collection and sequencing. All patient related research was Memorial Sloan Kettering Cancer Center Institutional Review Board-approved and treated under protocol NCT0129827. Written informed consent was obtained from all patients.

Consent for publication
Not applicable.

Competing interests
KAB has had a Cooperative Research and Development Agreement with Cellworks Group Inc., San Jose, CA. TA and SV work for Cellworks Group Inc., San Jose, California. DP, NKS, and UM work for Cellworks Research India Ltd.,