A prognostic model of drug tolerant persister-related genes in lung adenocarcinoma based on single cell and bulk RNA sequencing data

Background Acquired resistance to targeted drugs is a major challenge in cancer. The drug-tolerant state has been proposed to be an initial step towards acquisition of real drug-resistance. Drug tolerant persister (DTP) cells are purported to survive during treatment and stay dormant for several years. Single cell sequencing can provide a comprehensive landscape of gene expression in DTP cells, which can facilitate investigation of heterogeneity of a drug tolerant state and identification of new anticancer targets. Methods The genetic profiling of DTPs was explored by integrating Gene Expression Omnibus (GEO) datasets, and a prognostic signature of DTP-related genes (DTPRGs) in lung adenocarcinoma of TCGA LUAD cohort was constructed. The scores of infiltrating immune cells were calculated and activity of immune-related pathways was evaluated by single-sample gene set enrichment analysis (ssGSEA). Functional enrichment analysis of the DTPRGs between low- and high-risk groups was performed. Immune cell subtypes and immune-related pathways were analyzed. Results An 11-gene panel (MT2A, UBE2S, CLTB, KRT7, IGFBP3, CTSH, NPC2, HMGA1, HNRNPAB, DTYMK, and IHNA) was established. DTPRGs were mainly correlated with nuclear division, chromosome segregation, and cell cycle pathways. Infiltration of immune cells was lower in the high-risk group while the inflammation-promoting and MCH-class I response pathway had higher activity in the high-risk group. A nomogram was generated with prognostic accuracy, further validated using clinical outcomes following therapy with epidermal growth factor receptor (EGFR) tyrosine kinase inhibitors (TKIs). Discussion A prognostic model of lung adenocarcinoma based on DTPRGs was constructed. Targeting DTP cells is a potential therapeutic approach to prevent a drug tolerant state.


Introduction
The outcome of anti-cancer therapy is inevitably perturbed by drug resistance.A small population of tumor cells evade cell death from cytotoxic and targeted therapy by entering a slow proliferation but reversible state, known as the "drug tolerant persister (DTP)" state [1].Evidence suggest that DTP cells without bona fide resistance mechanisms display a reversible phenotype characterized by reduced drug sensitivity and decreased cell proliferation, that can maintain residual disease and may serve as a reservoir for resistant phenotypes [2].Residual tumors rely on DTP cells to evade treatment and give rise to disease relapse.The persister cells comprise a reservoir from which drug-resistant cancers may arise.Drug holidays at the drug-tolerant state enable tolerant cells to reverse into parental cells and become re-sensitized to the drug they were previously exposed to Ref. [3].
Drug tolerant persisters have been reported in various cancer types, models and therapies [1].A similar phenomenon was also observed in patients receiving immune checkpoint inhibitors (ICI), with a subpopulation of immunotherapy persister cells (IPCs) identified that could resist CD8 + T cell-mediated cell killing [4].In non-small cell lung cancer (NSCLC), patients with NSCLC characterized by activating mutations in the epidermal growth factor receptor (EGFR) clearly benefit from EGFR-tyrosine kinase inhibitor (TKI) therapy.However, EGFR-TKIs lead to acquired resistance and eventual tumor relapse, thereby affecting the patient survival and highlighting the urgent need to investigate novel strategies for circumventing drug resistance [5].DTP cells majorly contribute to the development of drug insensitivity state and irreversible resistance to EGFR TKIs [6,7].For instance, osimertinib, is an irreversible third-generation EGFR TKI that is approved as first-line treatment for EGFR mutation-positive NSCLC [8].It is of vital importance to explore the mechanisms of tumor repopulating cells induced by osimertinib exposure.
The genes involved in drug tolerance are diverse and complex, reflecting that the process of DTP is rather multifaceted and requires urgent elucidation.The bulk sequencing of tumor tissues is not enough to depict the temporal and spatial status of a cancer patient.Single cell sequencing has superior performance, with comprehensive analysis of different subgroups within the tumor.It also provides the molecular landscape of each subpopulation to explore diverse cancer targets.Moreover, single cell sequencing can be integrated with bulk RNA-seq data deposited in public datasets, which could be used to construct biomarkers for predicting the clinical outcomes of cancer patients.
In this study, we described the spectrum of changes associated with drug tolerance in EGFR TKI-tolerant cancers.The drug tolerant persister-related genes (DTPRGs) in osimertinib-tolerant cancer were first screened based on a single cell sequencing dataset and a bulk RNA sequencing dataset.Then, a prognostic DTPRGs risk score was developed by univariate and multivariate cox analysis and LASSO method.Furthermore, a nomogram combining DTPRGs risk score and clinical variables were constructed.Our findings demonstrated that these drug tolerant persister-related genes play crucial roles in the process of lung cancer development and could be potential targets for treatment of lung cancer patients.

Data collection
Single-cell RNA sequencing (RNA-seq) and corresponding information of GSE150949 were retrieved from GEO database for exploring potential DTPRGs, which contained 8139 cycling cell samples from PC9 cells treated with osimertinib.GSE153183 dataset with paired parental and TKI-induced persister cells were also downloaded.Batch effect was corrected by sva function in two different datasets.Besides, the RNA-seq fragments per kilobase of exon per million fragments mapped reads (FPKM) from TCGA-LUAD cohort were downloaded.Gene expression datasets GSE68465, GSE72094, GSE68571, GSE41271 and GSE31210 were used as external validation sets consisting of 926 lung adenocarcinoma patients.

Data processing and DTPRGs screening
The type of lung cancer cycling PC9 cells was isolated from the mixed total cell samples in GSE150949.Then, the count matrix and clinical information were combined to generate the object by "Seurat" package in R software.The poor quality of cells and genes were filtered out, and only the genes with more than only 5 cells detected and the cells detected more than 1500 gene numbers were Z.Du et al.
selected.A false discovery rate (FDR) < 0.05 was set.Dimensions with significant separation were screened out through principal component analysis (PCA).The t-distributed stochastic neighbor embedding (t-SNE) algorithm was employed to perform dimension reduction for the top 20 principal components (PCs) and obtain the major clusters.Marker genes in each cluster were accessed with log2 [fold change (FC)] ≥1 and FDR <0.05, and the top 10 % of marker genes were laid out in the heatmap.Clusters were annotated based on marker genes through 'SingleR' package.Pseudotime and trajectory analyses of PC9 cells were performed by 'Monocle' package.Then, intracellular differentially expressed genes with distinct differentiation states were designated as DTPRGs with |log2 (FC)| ≥1 and FDR <0.05.The DTPRGs were screened by "limma" package for each set.Then, the tumor mutation burden (TMB) per megabase of each sample was calculated in the exome content.

Co-expression network construction
The expression profiles of DTPRGs between DTP and control group were extracted to construct weighted gene co-expression network (WGCNA) using "WGCNA" package.First, scale independence and mean connectivity were calculated with candidate softthresholding powers .The first candidate power was selected as the proper power with degree of independence >0.8.Then, a co-expression network was constructed.Modules were identified with the parameters mergeCutHeight set to 0.3 and minModuleSize set to 30.Finally, the Pearson correlation coefficient and p value between each module were calculated.

Construction and validation of the DTPRGs prognostic model
Cox regression analysis was employed to evaluate the correlations between each gene and survival status in the TCGA-LUAD cohort.The cut-off p value was set as 0.2 The least absolute shrinkage and selection operator (LASSO) Cox regression model was used to narrow down the candidate genes.Ultimately, 11 genes and their coefficients were obtained, and the penalty parameter (λ) was decided by the minimum criteria.The risk score was calculated after centralization and standardization of TCGA-LUAD data, and the risk score formula was as follows: The risk score = where Expr i represents the expression level of gene i and coef i represents the regression coefficient of gene i in the signature.
The TCGA-LUAD patients were divided into low-and high-risk groups according to the median risk score, and the OS time was compared between the two groups.PCA based on the 11-gene signature was performed by "stats" package.The "survival", "survminer" and "timeROC" packages were employed to perform a 5-year receiver operating characteristic (ROC) curve analysis.For the external validation, lung adenocarcinoma patients from the GEO databases (GSE68465, GSE72094, GSE68571, GSE41271 and GSE31210) were employed.The risk score was calculated by the same formula used for the TCGA cohort.The patients in the GEO cohort were also divided into low-or high-risk groups, and these groups were then compared to validate the model.

Prognostic analysis of the risk score
The clinical information (age, gender, and stage) of patients in the TCGA LUAD cohort were extracted, and the age and stage data of patients in the GEO cohort were also obtained.These parameters were analyzed with risk score in regression model.Univariate and multivariable Cox regression models were used.Then, a nomogram was constructed with the above factors through "rms", "nomo-gramEx", and "regplot" packages.Next, ROC and calibration curves analysis were performed.

Functional enrichment of the DTPRGs
Patients in the TCGA-LUAD cohort were stratified into low-and high-risk groups according to the median risk score.DEGs between two groups were filtered with |log2FC| ≥1 and FDR <0.05.Based on these DEGs, Gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses were performed via "clusterProfiler" package.The single-sample gene set enrichment analysis (ssGSEA) via "gsva" package.

Statistical analysis
For violin plots, the Wilcoxon rank sum test was performed.Kaplan-Meier method with a two-sided log-rank test was used compare the OS of patients between subgroups.Univariate and multivariate Cox regression models were employed to assess the independent prognostic value of the risk model.Mann-Whitney test was used to compare the immune cell infiltration and immune pathway activation.All statistical analyses were carried out in R (version 4.0.3) and Perl software (version January 6, 7601), and p < 0.05 was considered to be statistically significant.★ p＜0.05，★★ p＜0.01，★★★ p＜0.001.

Quality control and normalization of data-108 DTPRGs was identified in GEO database
GSE153183 and GSE150949 datasets were analyzed.The study flow diagram is shown in Supplementary Fig. S1.Based on the selection criteria, 8139 cycling PC9 cell samples and 1500 high variable genes were identified in GSM4561725 and GSM4561726 samples derived from GSE150949 dataset.Fig. 1A shows the range of single cell RNA numbers and the RNA count of each cell, indicating a good quality control.Fig. 1B illustrates the 1500 high variable genes and the names of the top 10 genes of the cell samples.There was a significantly positive correlation between the gene numbers and sequencing depth with Pearson's r = 0.89 (Fig. 1C).According to the tSNE algorithm (Fig. 1D), 7143 PC9 cells are classified into 7 clusters, and the top 10 % of marker genes are displayed on the heat map, which represent low to high gene expression levels (Fig. 1E).Expression profile of GSE150949 displays 276 DTPRGs, in which 202 and 74 genes are upregulated and downregulated, respectively (Fig. 1F, Supplementary Table 1).Based on the cut-off values of |log2 (FC)| ≥1 and FDR <0.05, 1971 DTPRGs of persister cells were recognized in GSE153183 dataset, comprising 1007 upregulated and 964 down-regulated genes (Fig. 1G, Supplementary Table 2).We obtained 108 DTPRGs (36 downregulated and 72 upregulated genes) as the overlapping gene profile of persister by Venn plots (Fig. 1H).
WGCNA and identification of key modules-genes in key modules are mainly involved in nuclear division and chromosome segregation.
WGCNA on the TCGA-LUAD dataset was performed to find the key modules.After the intersection of DTPRGs in GSE153183 and GSE150949 sets, 2139 DTPRGs were enrolled in WGCNA.Seven modules were evaluated with soft threshold = 2 (Fig. 2A-B), of which 3 modules (e.g., blue, yellow module and turquoise module) were associated to overall survival (OS) of lung adenocarcinoma patients.The turquoise module was the most highly correlated with clinical traits from the heatmap of module-trait correlations (Fig. 2C).GO and KEGG analyses were performed to reveal biological functions of the genes within the turquoise module.Significant GO terms and KEGG pathways, are shown in Fig. 2D-E.This analysis indicates that genes within the turquoise module are mainly involved in nuclear division and chromosome segregation.

Identification of DTPRGs between normal and tumor tissues-hub genes were identified
The expression levels of 108 DTPRGs were compared in TCGA data from 54 normal and 497 tumor tissues, and 87 differentially expressed genes (DEGs) were identified.Among them, 36 and 51 genes were downregulated and enriched in the tumor group, respectively (Table 1).The RNA levels of the genes are presented as heatmaps in Fig. 3A.A protein-protein interaction (PPI) analysis was conducted to explore the interactions of these DTPRGs (Fig. 3B).The minimum required interaction score was set at 0.4.CTSD, HSPB1, KRT7, CLDN7, CD59, CLDN4, CLU, LGALS3, SPP1, CTSA, IGFBP3, IGFBP5, TIMP3, and TIMP2 were the top ranked genes and identified as hub genes.All the DEGs between normal and tumor tissues were among them.The correlation network of DEGs is presented in Fig. 3C.

Tumor classification based on the DTPRGs-DTPRSs was associated with tumor stage and survival
To investigate the correlations between the expression of the 87 DTPRGs and lung adenocarcinoma subtypes, a consensus clustering analysis was performed with all 504 patients in the TCGA LUAD cohort.With clustering variable (k) increased from 2 to 10, we found that when k = 2, the intragroup correlations are the highest and the intergroup correlations are low, indicating that the patients can be well divided into two clusters based on the 87 DTPRGs (Fig. 4A).The gene expression profile and the clinical parameters including lymph node metastasis (N), metastasis (M), tumor (T), stage, age, and gender are presented in a heatmap (Fig. 4B).We noticed that patients with early lung adenocarcinoma are associated with cluster 1, explaining why cluster 1 is associated with a better survival advantage (Fig. 4C).We also found that females and patients ≥65 are associated with cluster 1 and have a better overall survival.

Prognostic value of the risk model
The univariate Cox regression analysis indicated that the risk score is an independent factor predicting poor survival in both TCGA and GEO cohorts (HR = 5.349, 95 % CI: 3.066− 9.333 for TCGA, and HR: 2.344, 95 % CI: 1.518− 3.618 for GEO cohorts, Fig. 6A and C).The multivariate analysis also suggested that the risk score is a prognostic factor for patients with lung adenocarcinoma in both cohorts after adjusting for other confounding factors (HR = 4.445, 95 % CI: 2.458− 8.040 for TCGA, and HR: 1.962, 95 % CI: 1.666− 2.310 for    GEO cohorts, Fig. 6B and D).The heatmap of clinical features for the TCGA cohort (Fig. 6E) found that gender, age, and the tumor stage are diversely distributed between the low-and high-risk subgroups.

Construction and validation of a nomogram
A nomogram was constructed combining the prognostic signature with clinical factors, such as age, gender and stage (Fig. 7A).The calibration curve of 45-degree line indicated the actual survival outcomes.The 1-, 3-, and 5-year OS demonstrated that the nomogrampredicted survival matched with the optimal prediction performance.The AUCs for the 1-, 3-, and 5-year OS were 0.749, 0.725, and 0.682, respectively (Fig. 7B).The calibration plot showed that the performance of the nomogram was an ideal model (Fig. 7C).These findings suggest that the nomogram combining the signature with clinical factors has a great prognostic accuracy.

External validation of the risk signature
A total of 442 lung adenocarcinoma patients from GSE68465 were utilized as the validation set.Based on the median risk score of the TCGA LUAD cohort, 228 and 214 patients in the GEO cohort were classified into the low-and high-risk group, respectively (Fig. 8A).PCA and t-SNE showed well separation between the subgroups (Fig. 8B-C).Patients in the low-risk subgroup had longer survival times (Fig. 8D).Kaplan-Meier analysis demonstrated a significant difference in the survival rate between the groups (HR 0.69, 95 % CI: 0.53-0.89,log-rank test p = 0.004, Fig. 8E).ROC curve analysis showed that the model has good prognostic efficacy (AUC = 0.713, 0.660, and 0.629 for 1-, 3-, and 5-year survival, Fig. 8F).
Functional analyses-the DTPRGs are mainly correlated with the nuclear division, chromosome segregation, and cell cycle pathways.
In total, 139 DTPRGs between the low-and high-risk groups were identified in the TCGA cohort.Among them, 72 and 67 genes were upregulated and downregulated in the high-risk group, respectively (Table 3).GO enrichment and KEGG pathway analysis were performed based on these DTPRGs.The results indicated that the DTPRGs are mainly correlated with the nuclear division, chromosome segregation, and cell cycle pathways (Fig. 9A-D).
Cell trajectory analysis, interrelation of the risk Score and TMB-the risk score is positively correlated with tumor malignancy.
According to the cell trajectories of PC9 cells, differential genes are scattered throughout the cell differentiation trajectory, with diversity and heterogeneity (Fig. 9E-G).The expression levels of 11 selected DTPRGs (MT2A, UBE2S, CLTB, KRT7, IGFBP3, CTSH, NPC2, HMGA1, HNRNPAB, DTYMK, and IHNA) in seven clusters are shown in Fig. 9H.Pseudotime and trajectory analysis indicated that MT2A and UBE2S increased in cluster 5 (belonging to state 1 and 9); CLTB in cluster 6 (belonging to state 7).We further found that there is a significant difference of TMB between high-and low-DTPRG associated prognostic model (DAPM) (Fig. 9I).However, there is no significant difference in OS between two groups (Fig. 9J).We then explored whether the combination of DAPM and TMB could be a better prognostic analysis.DAPM and TMB were integrated to stratify the samples into TMB high /DAPM low , TMB low /DAPM low , TMB high / DAPM high , and TMB low /DAPM high groups.As shown in Fig. 9K, there are significant differences among all groups (Log-rank test,p < 0.001), and patients in the TMB high /DAPM low group had the best OS.These results demonstrate that the risk score and tumor malignancy is positively correlated.

Immune activity between subgroups-the risk score is associated with immune activity
The enrichment scores of 16 types of immune cells and the activity of 13 immune-related pathways between the low-and high-risk groups in both the TCGA and GEO cohorts were compared by ssGSEA.In the TCGA cohort (Fig. 10A), the high-risk subgroup had lower levels of immune cell infiltration, especially aDCs, B cell, iDCs, DCs, mast cells, macrophages, NK cells, neutrophils, tumor-infiltrating lymphocytes (TILs), T helper cells, and regulatory T (Treg) cells.The inflammation-promoting and MCH-class I response pathway showed higher activity in the high-risk group in the TCGA cohort (Fig. 10B).Similar conclusions were drawn with GEO cohort (Fig. 10C-D).

Discussion
In this study, pseudotime and trajectory analyses of single-cell RNA-seq data were carried out to screen DTPRGs.Bulk RNA sequencing data were also integrated with single cell RNA-seq data.WGCNA was used to explore potential genes associated with survival.Cox regression model was used to select genes and establish the prognostic model in TCGA LUAD cohort.A 11-gene signature of prognostic capability was constructed.External validation was performed using multiple GEO datasets of osimertinib-tolerant cell lines.The immune activity between subgroups was compared, and a nomogram of risk score combined with clinical characteristics was constructed and validated using external cohorts.
The results indicated that the biological processes in GO analysis related to DTP are nuclear division, organelle fission, chromosome segregation, mitotic nuclear division, positive regulation of cell cycle, and DNA replication, etc., which is consistent with previous studies demonstrating that the mitogenic and survival pathways are related to cells with slow-proliferating drug-tolerant phenotypes [9].The mechanism of DTP state to EGFR TKI is diverse.The molecular mechanism behind the emergence of DTPs includes inactivation of AKT/Ets-1 signaling [10] and the expression of branched-chain amino acid aminotransferase (BCAT1) [11].A metabolic shift to fatty acid oxidation and upregulation of antioxidant genes is associated with DTP state in multiple cancer types [12].DTCs display markers associated with stem-cell like cancer cells [13].A recent study using single-cell RNA sequencing confirmed the existence of diverse tumor cell subpopulations associated with EGFR TKI-tolerance, including epithelium development, epithelial-to-mesenchymal   transition, drug metabolism vesicle mediated transport, and cholesterol homeostasis [8].Mechanisms involving epigenetic modifications may also be responsible for DTP due to the reversible nature of drug tolerance [14], which is demonstrated by the fact that DTPs can be inhibited by treatment with chromatin-modifying drugs [13].Studies of patient derived xenograft (PDX) models using DNA-barcodes for clonal tracking demonstrated that DTPs are more likely driven by a stochastic reprogramming process than a Abbreviations: FDR, false discovery rate; LogFC, Log fold change.pe-existing clonal selection [15,16].Altogether, 11 genes have been selected to construct a prognostic signature.MT2A gene encodes for metallothionein 2A protein.The levels of MT2A mRNA are considered as a marker of poor prognosis for lung cancer patients [17].Additionally, inhibited MT2A expression resulted in cell death and apoptosis in prostate cancer cells [18].UBE2S gene encodes for ubiquitin conjugating enzyme E2 S, which is a regulator of mitosis [19].UBE2S contributes to lung cancer development by regulating canonical Wnt signaling [20].UBE2S expression in lung cancer resulted in binding with IκBα to activate NF-κB signaling and cancer cell metastasis [21].Keratin 7 (KRT7) is a member of the keratin gene family.Also known as cytokeratin-7 (CK7), KRT7 expression in lung cancer is associated with lymph node metastasis and T stage, which may serve as an independent factor for poor prognosis of lung cancer [22].IGFBP3 gene encodes Insulin Growth Factor Binding Protein 3, which is regulated by tumor suppressor p53.IGFBP-3 is responsible for transporting IGF-I and IGF-II in circulation, thus, prolonging the half-life of IGFs [23].IGFBP3 could modulate cell growth and lung tumorigenesis through IGF1 signaling [24].CTSH encodes for cathepsin H. Cathepsins are lysosomal enzymes belonging to the papain family.Cathepsins are involved in cancer progression and drug resistance [25].NPC2 encodes for NPC intracellular cholesterol transporter 2. The major function of NPC is to regulate cholesterol transport through lysosomal system [26].NPC2 can be secreted by early-stage lung cancers and influence the tumor microenvironment [27].HMGA1 (High Mobility Group AT-Hook 1) has been related to lung cancer.Phosphoproteomic studies reveals HMGA1 as a drug-resistant target in NSCLC [28].HNRNPAB encodes for heterogeneous nuclear ribonucleoprotein A/B (hnRNP AB), which belongs to the hnRNP families.Elevated expression of HNRNPAB was positively associated with more aggressive diseases and poorer survival rates in breast cancer [29].DTYMK encodes deoxythymidylate kinase, which catalyzes biosynthesis of deoxythymidine triphosphate (dTTP).A previous study has shown that DTYMK is a predictive factor of poor prognosis in NSCLC [30].IHNA (Inhibin Subunit Alpha) encodes TGF-β (transforming growth factor-beta) superfamily proteins.Elevated level of inhibin-α subunit is pro-tumorigenic and pro-metastatic in prostate cancer [31].CLTB gene encodes clathrin light chain b.Upregulation of clathrin light chain b has been shown to couple with dynamin-1 in cancer cells, which leads to adaptive clathrin-mediated endocytosis and increases metastasis [32].
In our study, the high-risk subgroup had lower levels of immune cells infiltration and the inflammation-promoting and MCH-class I response pathway had higher activity in the high-risk group.Furthermore, combination of DAPM and TMB could be a powerful

Z
. Du et al.

Fig. 1 .Fig. 2 .
Fig. 1.Quality control of single cell RNA-seq data and dimensionality reduction of differentially expressed DTPRGs.(A) Cells with poor quality were filtered out and the detected gene counts, mitochondrial gene sequences and sequencing depth were analyzed in sub-populations.(B) A subset of features that exhibit high cell-to-cell variation was calculated.Red dots indicate the 1500 variable genes.The top 10 gene names were labeled.nFeature_RNA indicates the numbers of genes detected in each cell.nCount_RNA indicates the total number of mRNA molecules detected in cells.Percent.mtindicates the ratio of mitochondrial gene expression to all cellular gene expression.(C) Correlation analysis between mitochondrial gene sequences and sequencing depth.nFeature_RNA indicates the numbers of genes detected in each cell.nCount_RNA indicates the total number of

Fig. 4 .
Fig. 4. Tumor classification based on the DTPRGs.(A) 504 patients from TCGA LUAD cohort were grouped into two clusters according to the consensus clustering matrix (k = 2).(B) Heatmap and the clinicopathologic characters of the two clusters grouped by DTPRGs.(C) Kaplan-Meier OS curves of the two clusters.N: lymph node metastasis, describe regional lymph node involvement of the tumor; M: Metastasis, identify the presence of distant metastases of the primary tumor; T, tumor, identify the size and extension of the tumor.★ p＜0.05，★★ p＜0.01，★★★ p＜0.001.

Fig. 5 .
Fig. 5. Construction of risk signature in the TCGA-LUAD cohort.(A) Univariate cox regression analysis of OS for 24 DTPRG with p < 0.2.(B) LASSO regression of the 11 OS-related genes.(C) Tuning parameter selection via cross-validation in the LASSO regression.(D) Distribution of patients based on the risk score.(E) PCA plot for PC9 based on the risk score.(F) t-SNE plot for OS based on the risk score.(G) Survival status of each patient (lowrisk: left side of the dotted line; high-risk: right side of the dotted line).(H) Kaplan-Meier curves for the OS of patients in the high-and low-risk groups.(I) ROC curves demonstrated the predictive efficiency of the risk score.

Fig. 6 .
Fig. 6.Univariate and multivariate Cox regression analyses for the risk score.Univariate analysis (A) and multivariate analysis (B) for the TCGA cohort.Univariate analysis (C) and multivariate analysis(D) for the GEO cohort.(E) Heatmap (blue and red indicate low and high expression, respectively) for the correlation of clinicopathologic features and the risk groups.★ p＜0.05，★★ p＜0.01，★★★ p＜0.001.(For interpretation of the references to color in this figure legend, the reader is referred to the Web version of this article.)

Fig. 7 .
Fig. 7. Construction and verification of the nomogram.(A) Nomogram constructed combining DTPRGs risk signature and clinical parameters.(B) The ROC curve and AUC of 1-, 3-, and 5-year of the nomogram for TCGA LUAD cohort.(C) The calibration plot of the nomogram for TCGA-LUAD cohort.

Fig. 8 .
Fig. 8. Validation of the risk model in the GEO cohort.

Fig. 10 .
Fig. 10.The ssGSEA scores for immune cells and immune pathways.(A, B) The enrichment scores of 16 types of immune cells and 13 immunerelated pathways between low-and high-risk group in the TCGA-LUAD cohort.(C, D) the tumor immunity between low-and high-risk group in the GEO cohort.

Table 1
Genes downregulated and enriched in the tumor group.

Table 2
The coeffient of 11 genes.
Z. Du et al.

Table 3
DTPRGs between the low-and high-risk groups in the TCGA LUAD cohort.