Integrate Exploration to Identify Radiosensitive Biomarkers in Genes in PD-L1 Expression and PD-1 Check Point Pathway in Cancer


 Purpose

Exploration to identify radiosensitive biomarkers in genes in PD-L1 expression and PD-1 check point pathway in cancer.
Methods and Materials:

Gene expression datasets and information were downloaded from TCGA. Stepwise multivariate Cox regression based on AIC was performed using stacking multiple interpolation data to identify radiosensitive (RS) genes.
Results

Among the 74 PD-1/PD-L1 pathway genes, we identified 10 RS genes in BRCA dataset, 11 RS genes in STAD dataset and 13 RS genes in HNSC dataset. These genes can be thought as independent factors to identify the sensitivity of cancer patients to radiotherapy. Gene CD274 was the common gene in the three tumor datasets. And gene ZAP70 was verified as a RS gene in the external validation. There were moderate co-expression relationships and interactions in these genes. Functional enrichment analysis showed that most of these genes were related to T cells.
Conclusions

Our study identified potential radiosensitive biomarkers of several main cancer types in an important tumor immune checkpoint pathway. New types of RS genes were identified based on expanded definition to radiosensitive genes. Different types of tumors may share some common carcinogenic mechanisms and may have same RS genes.


Introduction
Radiation therapy remains the primary treatment for nearly two-thirds of cancers, including the primary curative or palliative treatment for breast cancer and adjuvant therapy for radical resection of gastric cancer [1][2][3]. Unfortunately, because of tumor heterogeneity, tumor response rates to radiotherapy can vary conspicuously, even among patients who are diagnosed with the same tumor type [4]. Despite signi cant technological advances in radiation therapy for tumors in recent years, personalized radiotherapy regimens based on cancer biology have become increasingly di cult [5]. A major issue in radiation therapy is predicting cancer radiosensitivity.
Biomarkers that provide information about tumor prognosis and predict tumor's inherent radiation sensitivity or its response to treatment may be valuable in helping to personalize radiation dose, allowing clinicians to make decisions about treatment regimens for different patients, while avoiding radiationinduced toxicity in patients who are unlikely to reap the bene ts from the treatment [6,7]. Tumor molecular mapping has been used to develop radiosensitive genetic signatures and has been used to identify prognostic or predictive biomarkers of radiation responses [8][9][10]. Given strong evidence of the pathway-based genetic nature of cancer, one of the main shortcomings of past studies is the failure to use prior biological information into identifying biomarkers [11]. The potential for carcinogenic mechanisms are grouped into pathways based on biological functions such as cell cycle, hypoxia, DNA damage, tumor micro-environment, immune checkpoints and others [12][13][14][15][16].
Programmed death-1 (PD-1) and its ligand PD-L1 check point pathway as a key regulatory immune checkpoint, plays a crucial role in maintaining the balance between immune tolerance and autoimmunity [17]. Studies have shown that PD-L1 presented on the surface of the tumor cells can activate the downstream of the PD-1/PD-L1 pathway to over-inhibit T cells proliferation and differentiation [18] and promote immune escape and tumor growth [19]. In addition, the expression of PD-1/PD-L1 has been found associated with tumor radiosensitivity in a variety of solid tumor types also. When Bum-Sup Jang et al. [20][21][22] evaluated the predictive value of radiosensitive gene signatures in invasive breast carcinoma and lower grade glioma, they discovered the relationship between radiosensitive gene signatures and PD-L1. Xintong Lyu et al. [23] reported that in head and neck cancer, patients with high PD-L1 expression had better recurrence-free survival in receiving radiotherapy.
These evidences seem to indicate that PD-L1 expression and its regulation in solid tumors is affected by radiotherapy, thereby altering the outcome of patients' prognosis. In this case, it is necessary to understand the regulatory mechanism of PD-L1 in cancer. In solid tumors, up-regulation of PD-L1 is caused by activation of pro-survival pathways MAPK and PI3K/Akt as well as transcriptional factors HIF-1, STAT3 and NF-kappa B [24]. It can be supposed that genes regulating PD-1/PD-L1 check point pathway in cancer may as well associate with cancer radiosensitivity and might be useful biomarkers for predicting radiosensitive of cancer. In fact, the relationship between these genes and radiotherapy sensitivity of gastric cancer has been preliminarily investigated, and some conclusions have been obtained [25].
In this study, we have enhanced the evidence and supplemented the previous studies. We explored the radiosensitivity of genes in PD-1/PD-L1 check point pathway in several cancers using reliable method and validated in an external cohort. Conclusively, for precision medicine, our work offered more evidence for using PD-1/PD-L1 related pathway genes as potential biomarkers to predict radiosensitive for cancer patients.
The corresponding expression datasets were collated to exclude normal tissues and retain tumor samples. At the same time, we examined clinical information on each type of tumor and found GBM had too few samples for no radiotherapy (n = 18) while LIHC had too few samples for radiotherapy (n = 14). These two datasets were abandoned. Next, we removed patients with missing survival and radiotherapy information. Patients with survival time less than 5 days were also excluded. Then multivariate Cox stepwise regression analysis (see Table 1, TableS1/2) was performed on the remaining six tumor datasets, and three tumor datasets (BRCA, HNSC, STAD) whose radiotherapy was protective effect (hazard ratio, HR < 1, P < 0.05) were selected for subsequent analysis. In addition, we also performed external validation, using the Molecular Taxonomy of Breast Cancer International Consortium (METABRIC) cohort (https://www.cbioportal.org/datasets). Figure 1 is the ow chart.

Radiosensitive genes (RS genes)
We obtained a total of 74 genes (See TableS3)  In this study, we de ned radiosensitivity as patients with different gene expression obtained discrepant bene t from radiotherapy. Based on the median of gene expression, the whole included participants were roughly divided into two groups as the high expression group and the low expression group. If one group (A group) had better overall survival (OS) in receiving radiotherapy (RT) than in non-RT (scenario A) while the other of the two groups (B group) had consistent OS whatever in RT group or non-RT group (scenario B) and meanwhile A group had a better OS than B group in receiving RT (scenario C) or A group had a lower OS than B group in non-RT (scenario D), we de ned this gene as RS gene. That is if scenario A and scenario B happened and meantime scenario C or scenario D happened, the gene was deemed as RS gene.

Analysis methods
The relationship between genes expression levels and radiosensitivity was analyzed by the multivariate Cox proportional hazards models, the stepwise method based on Akaike information criterion (AIC) was used for variable selecting. The variables that remained in the model were considered to have an impact on OS. In this study, we included as many clinical variables as possible to screen out the best correction factors. Then the whole included participants could be divided into four groups: high-RT group, low-RT group, high-non-RT group and low-non-RT group. An example of how to identify RS gene. In high expression group, if radiotherapy had remained in the multivariate Cox regression model (HR < 1), corresponding to scenario A; and in low expression group, radiotherapy had no impact (scenario B); meanwhile in RT group, high expression group compared to low expression group had a HR < 1 and remained in the multivariate Cox regression model, corresponding to scenario C. This gene was considered as a RS gene.
For missing variable data, R packet mice (multiple imputation by chained equations) was used for multiple interpolation [27]. Next we utilized the strategy of imputation stacking, where multiple imputations of the missing data were stacked on top of each other to create a large dataset [28]. We then estimated parameters in the analysis model by tting a weighted model for Y |X on the stacked dataset [29]. Kaplan-Meier (K-M) curves were used to show the survival curves. The log-rank test evaluated the statistically signi cant differences. Wilcoxon test was used to compare continuous variables that were not normal. Correlation was calculated by Pearson correlation coe cient (r). Absolute value of r > 0.8, as strong correlation; Between 0.3 and 0.8, as a moderate correlation; Below 0.3, considered as weak correlation [30]. The Search Tool for the Retrieval of Interacting Genes (STRING) [31] online tool was applied to analyze the protein-protein interaction (PPI) network (minimum required interaction score > = 0.4). Functions and pathways were analyzed by Gene Ontology (GO) and KEGG with p value cutoff = 0.05 and q value cutoff = 0.05. All statistical analyses were performed using the R (4.0.2). A P-value of 0.05 was considered signi cant. All statistical tests were two-sided.

Identi cation of RS genes
We take BRCA as an example to illustrate the identi cation of RS genes. expression had better OS than the low. That is BRCA patients with relative high expression of RASGRP1 and TRAF6 could bene t from radiotherapy, we called them radiosensitive genes within high expression (RGH). High expression of Gene TIRAP had a lower OS than the low expression in non-RT group but get high OS when received RT, which was also a RGH gene. In the other hand, in non-RT group, low expression of genes CD3G, IFNG, NFKBIA, PDCD1, CD274, STAT1 and ZAP70 had much lower OS compared to the high. But these low expression genes could obtain a big promotion OS in RT group (That is so called RGL). In addition, we found that without adjustment for clinical factors, these genes were strong indicators as well (See K-M curves in Fig. 3). RS genes of HNSC and STAD see TableS4. We compared RS genes in the three tumor datasets and found some crossover genes (See Fig. 4). Gene CD274 was the common gene in the three tumor datasets.

Distribution of RS genes in BRCA
We extracted BRCA patients receiving radiotherapy who survived more than 8 years (n = 49) and those who survived less than 3 years (n = 29). We compared the expression of the 10 RS genes in the two groups (See Fig. 5A). From the boxplot, genes RASGRP1 and TRAF6 had a signi cantly difference expression level (P < 0.05) between alive group (higher) and dead group (lower). By contrast, among non-RT patients, most RGL genes had a trend that their median expression values in alive group (n = 32) would be higher than those in dead group (n = 29) (See Fig. 5B).

Relationship of BRCA RS genes
We explored the correlation among these 10 RS genes expression level, the result is as shown in Fig. 6. Genes NFKBIA, RASGRP1, TIRAP and TRAF6 had a correlation coe cient of less than 0.3 with other genes (Fig. 6A). The remaining six genes were moderate correlated, r=(0.3,0.8) (Fig. 6B). Specially, there was a strong co-expression relationship between PDCD1 and ZAP70. Further analysis of PPI network (Fig. 6C) shows that CD274, PDCD1, CD3G, STAT1, IFNG and ZAP70 were at the hub position.
GO and KEGG analysis of 10 RS genes to obtain the biological process (BP), molecular function (MF), cellular component (CC), and pathways. KEGG pathway analysis (Fig. 7A) showed that the 10 RS genes in BRCA mainly related to "PD-L1 expression and PD-1 checkpoint pathway in cancer" and "T cell receptor signaling pathway". The BP of the 10 RS genes mainly related to "positive regulation of lymphocyte activation" and "T cell activation"; the CC of the 10 RS genes mainly involved in "plasma membrane signaling receptor complex" and "T cell receptor complex"; the MF of the 10 RS genes mainly associated with "tumor necrosis factor receptor binding" (Fig. 7B).  Figure 8 shows the unadjusted K-M curves of ZAP70 from METABRIC.

Discussion
Along with some chronic diseases such as cardiovascular disease, cancer remains one of the biggest killers of human health [32]. The World Health Organization (WHO, https://www.who.int/) has recently announced on 5 March, 2021 that, the breast cancer has now overtaken lung cancer as the world's mostly commonly-diagnosed cancer and the new global breast cancer initiative highlights renewed commitment to improve survival. At the same day, new WHO publication provides guidance on radiotherapy equipment to ght cancer like colorectal and lung cancer. Radiotherapy is remain one of the most effective tools to mitigate pain and suffering associated with advanced cancers, also, improve the quality of life and survival [33,34]. Nevertheless, heterogeneity in terms of tumor characteristics, prognosis, and survival among cancer patients has been a persistent problem for many decades. Vast studies have shown that, the investigation of biomarkers related to radiation could provide another means by which radiotherapy could become personalized [2,35].
Understanding the mechanism of tumors is also a major issue in identifying effective biomarkers and potential drug targets of radiosensitivity [36, 37]. PD-1 and its ligand PD-L1 are important immune checkpoints as a potential therapeutic target in cancer [19]. PD-L1/PD-1 pathway plays a critical role in transmitting co-stimulatory molecules to activate T cells as the second signal and maintain the balance of the immune microenvironment [38]. Well, when the body is invaded by the tumors, the balance of the immune microenvironment is destroyed. PD-L1 on tumor cells may engage the PD-1 receptors resulting in suppression of T-cell mediated immune response. Studies show that therapeutic antibodies blocking the PD-1/PD-L1 pathway by targeting PD-L1 or PD-1 are highly effective in rescuing T cell anti-tumor effector functions [18,39]. In addition, the expression level of PD-L1 seems to be related to the radiotherapy sensitivity of tumors [20,22]. As PD-L1 expression is regulated by the upstream signaling pathway, while PD-1/PD-L1 combination is transferred to the downstream T cell regulation as the second signal, the expression level of relevant genes in regulating PD-L1 expression and in PD-1 checkpoint pathway in cancer appears to be of vital importance, which may indicate the potential sensitivity of the tumor to radiotherapy.
In this study, we identi ed the radiosensitivity of genes in PD-L1 expression and PD-1 checkpoint pathway in cancer using the TCGA datasets of BRCA, HNSC and STAD. Because radiotherapy had non-positive effect (HR > = 1) to OS in lung cancer and LGG, we excluded these type of tumors for further exploration and perhaps they could be the subject of the next study. Then, we developed a more comprehensive de nition of radiosensitive genes since most studies have neglected many genes that directly affect the OS of patients without radiotherapy (scenario D). Such as gene IFNG, although its expression level did no effect to OS when people received RT, in non-RT group, patients with low expression of IFNG had a signi cantly amazing lower OS than the high (scenario D, see Fig. 2/3). And through scenario B we can see, patients with low expression of IFNG with RT had a much improved OS.
In addition, we systematically considered clinical factors in the datasets as many as possible. We performed multiple interpolation to missing clinical variables and stacked them to perform weighted multivariate Cox regression. Therefore, the clinical variables were well controlled to ensure the reliability of the results. In the BRCA dataset, radiotherapy, chemotherapy, age, surgery type, margin status, PR status, menopause status, NM stage and pathological stage were the impact factors of OS, which were reasonable and validated [40]. In the HNSC dataset, the impact factors included radiotherapy, age, gender, TN stage, margin status, anatomic site and smoking. Notably, females OS was not as good as males (HR: 1.149(1.016, 2.066), P = 0.041). And as for STAD, radiotherapy, age, gender, TN stage and residual tumor were the main in uencing factors.
Totally, among genes in regulating PD-1/PD-L1 pathway in cancer, we identi ed 10 RS genes in BRCA dataset, 11 RS genes in STAD dataset and 13 RS genes in HNSC dataset, with overlapping genes between each other to varying degrees. CD274 was the common gene in the three tumor datasets. As known to all, CD274 is the gene that encodes PD-L1, predicting the expression level of PD-L1. The expression level of CD274 has been speculated to be related to radiosensitivity of a variety of cancers [22,23,25].
Theoretically, there are two types of radiosensitive genes. The expression level of the rst type of genes (A genes) do not affect patients' OS, but their different expression level can in uence patients' OS after radiotherapy, like RASGRP1 and TRAF6. Only those with high expression of these genes could obtain bene t from RT. More often, however, are the second type of genes (B genes). Their expression could in uence patients' OS, for instance, patients with low expression of CD274 had much lower OS than the high. But these patients would bene t much receiving RT. And these genes can be thought as independent factors to identify the sensitivity of cancer patients to radiotherapy (Fig. 3/5). In addition, in B genes, there were moderate co-expression relationships and interactions (Fig. 6). Functional enrichment analysis showed that most of these genes were related to T cells. Nevertheless, more experimental studies are needed to con rm the ndings of this study. In the external validation, ZAP70 was veri ed as a RS gene. Many studies have shown that it is related to the immunity of cancers [41,42]. Importantly, there was also a strong co-expression relationship between PDCD1 and ZAP70 in METABRIC (r = 0.8) (See

FigureS1).
This study has its merits. Firstly, we expanded the de nition of radiosensitive genes and identi ed radiosensity of those genes in important pathway of cancer using TCGA public datasets recognized as authoritative. Secondly, we took into account as much useful clinical information as possible to control in uence factors by stacking multiple interpolation data, making the results more persuasively. Thirdly, we also validated the results with a big external dataset, METABRIC, although only one gene ZAP70 was turned out to be consistent. However, this might be due to different sample sizes and large gaps in followup time. The limitation of this study is that we don't have performed experimental study, also no cohort to verify the ndings. In addition, because we only explored a few major cancers, more tumor types should be brought into the discussion.
In conclusion, our study identi ed potential radiosensitive biomarkers of several main cancer types in an important tumor immune checkpoint pathway. New types of RS genes may be identi ed based on expanded de nition to RS genes. Different types of tumors may share common carcinogenic mechanisms and may have same RS genes. We hope that further studies will be performed to con rm our ndings.

Figure 1
Schematic of study design.

Figure 8
The unadjusted survival curves for the association analysis between OS and radiotherapy under different expression levels of ZAP70 in METABRIC.

Supplementary Files
This is a list of supplementary les associated with this preprint. Click to download.