Construction and Validation of a Robust Genomic Classifier for High-Risk Sporadic Type 2 Papillary Renal Cell Carcinoma: Support More Accurate Utilization of mTOR inhibitors

This study aimed to construct a robust genomic classifier for high-risk sporadic type 2 papillary renal cell carcinoma (sPRCC2) and identify potential therapeutic targets. A cohort from The Cancer Genome Atlas and two datasets from Gene Expression Omnibus were examined. Common differentially expressed genes were screened, and the ConsensusClusterPlus package was used to identify potential high-risk molecular subtypes of sPRCC2. Targeting protein for Xklp2 (TPX2) expression was used as a surrogate for the classifier according to a logistic regression model. Ninety-two samples from Fudan University Shanghai Cancer Center (FUSCC) were obtained, and TPX2 immunostaining was performed to validate the predictive ability of the genomic classifier. Retrospective analysis was used to compare drug efficacy between the groups. High-risk sPRCC2 had worse overall survival (hazard ratio [HR] = 7.804) and disease-free survival (HR = 8.777) than low-risk sPRCC2, and the tumor and lymph node stages were significantly higher in high-risk sPRCC2. Gene set enrichment analysis revealed significant enrichment of genes in the mammalian target of rapamycin (mTOR) complex 1 signaling pathway in the high-risk group. In the FUSCC cohort, high-risk sPRCC2 (high TPX2 expression) was significantly correlated with worse overall and progression-free survival. Retrospective analysis indicated that the mTOR inhibitor everolimus had greater efficacy in the high-risk group than in the low-risk group (overall response rate: 28.6% vs. 16.7%) and that everolimus had greater efficacy than sunitinib in the high-risk group (overall response rate: 28.6% vs. 20%). This study successfully constructed a genomic classifier for identifying high-risk sPRCC2. mTOR inhibitors may have good efficacy in patients with high-risk sPRCC2.


Introduction
Renal cell carcinoma (RCC) is the third most common malignant tumor of the genitourinary system. In 2019, 73,750 people were newly diagnosed with renal tumors, and 14,830 deaths were attributed to renal tumors [1]. Clear cell RCC (ccRCC) represents approximately 70% of kidney cancer cases in adults [2].
Papillary renal cell carcinoma (PRCC) is the most common non-clear cell RCC (nccRCC), accounting for 10-15% of RCCs [3]. Delahunt and Eble [4] characterized the histologic dissimilarities of PRCC and divided this malignancy into two subtypes (PRCC1 and PRCC2). Molecular analysis further clarified differences between the two subtypes. PRCC1 features gains in chromosomes 7, 17, 16, and 20 but loss of the Y chromosome [5]. MET pathway activation is frequently implicated in PRCC1 [6]. Conversely, PRCC2 has a more heterogeneous spectrum of chromosomal gains and losses. It has been reported that 8q gains are especially related to the poor prognosis of PRCC2, and the NRF2-ARE pathway was also revealed to be enriched in PRCC2 [7,8]. Previous studies demonstrated that PRCC2 has significantly worse clinical outcomes than PRCC1 [9,10]. In summary, PRCC2 differs from PRCC1 and features a more aggressive phenotype.
PRCC2 can be further divided into hereditary and sporadic types. The hereditary form is associated with biallelic inactivation of the gene encoding the Krebs cycle enzyme fumarate hydratase (FH). This inactivation leads to hereditary leiomyomatosis and RCC (HLRCC) syndrome, which is characterized by a high incidence of RCC, uterine leiomyoma, and cutaneous leiomyomatosis [11,12]. Patients with HLRCC syndrome are also genetically susceptible to bladder cancer, collecting duct tumors, and adult Leydig cell tumors of the testes [13][14][15]. Sporadic PRCC2 (sPRCC2) accounts for most cases of PRCC2, and previous studies demonstrated that despite differences in genetic etiology, sPRCC2 shares many clinical and morphologic phenotypes with HLRCC syndrome [16]. The most prominent common biochemical feature of HLRCC syndrome and sPRCC2 is the continuous activation of NRF2. In HLRCC syndrome, this activation is caused by intracellular fumaric acid accumulation attributable to FH inactivation [16], but the mechanism of NRF2 activation in sPRCC2 has not been determined.
Although rapid progress in medical science has facilitated the development of cancer therapy and multiple new drugs exert antitumor effects against ccRCC, problems remain in the management of PRCC2. Numerous clinical trials have explored potentially useful treatments for PRCC. Ravaud et al. [17] found that sunitinib was effective in the treatment of metastatic PRCC1 and PRCC2, but its efficacy was lower than that against metastatic ccRCC. Both progression-free survival (PFS) and overall survival (OS) were longer in PRCC1 than in PRCC2. In addition, Armstrong et al. [18] reported that compared with everolimus, sunitinib improved PFS in patients with metastatic nccRCC. However, the results of these clinical trials, which included various targeted therapies and immunotherapies, did not revolutionize the treatment of PRCC [19][20][21][22]. Because of the rarity and heterogeneity of metastatic PRCC2, there is little useful information regarding its rational clinical management.
Although OS is considered short in PRCC2, we also identified a subset of patients with histopathologically confirmed sPRCC2 and prolonged survival. It is important to clarify the mechanism responsible for the difference in survival. In this research, we focused on the molecular pattern of sPRCC2 and explored outcomes in different sPRCC2 subtypes using bioinformatics. A genomic classifier was successfully constructed, and the high-risk sPRCC2 subgroup was identified. Ninety-two samples from Fudan University Shanghai Cancer Center (FUSCC) were examined to validate the predictive ability of the genomic classifier, and retrospective analysis was used to compare drug efficacy between the groups.

Materials And Methods
Comparison of PRCC1 and PRCC2 in The Cancer Genome Atlas (TCGA) cohort
Data for 77 patients with PRCC1 and 85 patients with PRCC2 who had complete genetic alteration and clinical data were obtained from TCGA. Clinical information and genetic alterations in PRCC1 and PRCC2 were obtained from cBioPortal (https://www.cbioportal.org/). The Kaplan-Meier method was used to compare OS and disease-free survival (DFS) between the PRCC1 and PRCC2 groups. The chi-squared test and Kruskal-Wallis test were also applied to assess other clinical information, including American Joint Committee on Cancer tumor stage, lymph node stage, metastasis stage, and serum calcium levels.
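As an illustration of the Kaplan-Meier method named above, the following minimal sketch computes the product-limit survival estimate in plain Python. The function name `kaplan_meier` and the follow-up values are hypothetical and are not taken from the TCGA cohort; the original analysis may have used different software.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times  -- follow-up time for each patient
    events -- 1 if the event (death or recurrence) was observed, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        d = deaths.get(t, 0)
        if d > 0:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Everyone whose follow-up ends at t (event or censored) leaves the risk set.
        at_risk -= sum(1 for x in times if x == t)
    return curve


# Hypothetical follow-up data in months (illustration only).
times = [5, 8, 8, 12, 20, 24]
events = [1, 1, 0, 1, 0, 0]
curve = kaplan_meier(times, events)
```

Patients censored at a given time are still counted as at risk for events at that same time, which is the usual convention for this estimator.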
Gene expression profiles of PRCC2 and differential gene expression analysis
Gene expression profiles and clinical information for patients with PRCC in TCGA were downloaded from https://portal.gdc.cancer.gov/. Germline mutation data in PRCC2 were obtained from the supplementary file of a previous study [23]. Because the molecular patterns and clinical behavior differed between patients with hereditary PRCC2 (patients with FH germline mutation) and those with sPRCC2, this study focused only on sPRCC2 (82 samples from TCGA, Table 1), and hereditary PRCC2 was excluded (two patients). The mutation patterns and corresponding gene expression patterns of these 82 sPRCC2 samples were obtained from cBioPortal. Two datasets containing expression profiles of sPRCC2 were downloaded from Gene Expression Omnibus: GSE26574 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE26574, contains 12 sPRCC2 samples) and GSE48352 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE48352, contains 19 sPRCC2 samples). The limma package [24] and GEO2R were used to explore differentially expressed genes (DEGs) between normal and sPRCC2 tissues from these three cohorts (adjusted p < 0.05 and fold change ≥ 2). A Venn diagram was applied to identify the overlapping upregulated and downregulated DEGs.
A hub gene analysis tool [25], implemented as a Cytoscape plug-in [26], was used to identify the most significant hub genes in the protein-protein interaction (PPI) network. According to the expression of the hub genes, the ConsensusClusterPlus [27] package was applied to identify potential molecular subtypes of sPRCC2, and a cumulative distribution function (CDF) curve was used to determine the most appropriate number of clusters (a relative change in the area under the CDF curve ≥ 0.1 was considered significant). L-clusters and H-clusters were used to denote the clusters identified according to the expression patterns of downregulated and upregulated hub genes, respectively.
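The DEG screening thresholds and the Venn overlap described above can be sketched as follows. This is a minimal illustration, assuming that "fold change ≥ 2" applies in both directions (|log2 fold change| ≥ 1). The function names, gene labels, and numbers are hypothetical; the original analysis used limma and GEO2R in R.

```python
import math


def filter_degs(results, adj_p_cutoff=0.05, fold_change_cutoff=2.0):
    """Split one cohort's results into upregulated and downregulated gene sets.

    results -- dict mapping gene symbol to (log2 fold change, adjusted p-value)
    """
    log2_cutoff = math.log2(fold_change_cutoff)  # fold change >= 2 means |log2FC| >= 1
    up, down = set(), set()
    for gene, (log2_fc, adj_p) in results.items():
        if adj_p >= adj_p_cutoff:
            continue  # not significant after multiple-testing adjustment
        if log2_fc >= log2_cutoff:
            up.add(gene)
        elif log2_fc <= -log2_cutoff:
            down.add(gene)
    return up, down


def common_degs(cohorts):
    """Intersect the up and down sets across cohorts (the Venn diagram overlap)."""
    ups, downs = zip(*(filter_degs(r) for r in cohorts))
    return set.intersection(*ups), set.intersection(*downs)


# Hypothetical results for three cohorts (placeholder gene labels and values).
cohort_a = {"GENE_UP": (2.1, 0.001), "GENE_DOWN": (-3.0, 0.0005), "GENE_MIXED": (1.5, 0.20)}
cohort_b = {"GENE_UP": (1.4, 0.010), "GENE_DOWN": (-2.2, 0.0020), "GENE_MIXED": (1.2, 0.03)}
cohort_c = {"GENE_UP": (1.8, 0.004), "GENE_DOWN": (-1.1, 0.0400), "GENE_MIXED": (0.4, 0.01)}
common_up, common_down = common_degs([cohort_a, cohort_b, cohort_c])
```

Only genes that pass both thresholds in every cohort survive the intersection, which is why `GENE_MIXED` is dropped in this toy example.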
In this research, the numbers of L-clusters and H-clusters were set at 5 and 7, respectively, according to the CDF curves. The clusters were separately simplified to two main clusters according to the cluster dendrogram. The Kaplan-Meier method was used to compare OS and DFS between the clusters. The H-clusters were successfully simplified into two subtypes, and the subtype with significantly worse prognosis was defined as the high-risk group. Receiver operating characteristic (ROC) curves [28] were constructed, and the area under the curve (AUC) was used to describe the discriminative performance of the classifier.
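The CDF-based choice of cluster number can be sketched as below. This is an assumption-laden illustration rather than the ConsensusClusterPlus implementation: the area is computed from the identity that, for consensus values in [0, 1], the integral of the empirical CDF equals one minus the mean, and the rule "take the largest k whose relative change is at least 0.1" is one plausible reading of the criterion stated above. All function names and area values are hypothetical.

```python
def cdf_area(consensus_values):
    """Area under the empirical CDF of consensus indices on [0, 1].

    For values confined to [0, 1], the integral of the CDF equals 1 - mean.
    """
    if not consensus_values:
        raise ValueError("need at least one consensus value")
    return 1.0 - sum(consensus_values) / len(consensus_values)


def relative_area_change(area_by_k):
    """Relative change in CDF area from the previous k; the smallest k keeps its raw area."""
    ks = sorted(area_by_k)
    delta = {ks[0]: area_by_k[ks[0]]}
    for prev, k in zip(ks, ks[1:]):
        delta[k] = (area_by_k[k] - area_by_k[prev]) / area_by_k[prev]
    return delta


def largest_k_above(delta, threshold=0.1):
    """Largest k whose relative change still meets the threshold, or None."""
    eligible = [k for k, d in delta.items() if d >= threshold]
    return max(eligible) if eligible else None


# Hypothetical areas for k = 2..5 (not values from this study).
areas = {2: 0.40, 3: 0.50, 4: 0.56, 5: 0.57}
delta = relative_area_change(areas)
chosen_k = largest_k_above(delta)
```

In this toy example the gain from k = 4 to k = 5 is under 2%, so k = 4 is retained; the package's own numeric approximation of the area may differ slightly from the closed form used here.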

Comparison of low-risk and high-risk sPRCC2
The H-clusters were simplified into low-risk (n = 69) and high-risk groups (n = 13). The chi-squared test and Kruskal-Wallis test were also applied to compare clinical information and genomic variation between the two groups using cBioPortal. To explore the mechanism responsible for the worse prognosis of high-risk sPRCC2, we performed gene set enrichment analysis (GSEA) to identify potential differences in transcriptomics.
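For a categorical variable such as tumor stage, the chi-squared comparison between the two risk groups can be sketched as follows. The row totals match the group sizes above (13 and 69), but the split by stage is invented for illustration, and the function name is hypothetical. The sketch uses the Pearson statistic without continuity correction.

```python
import math


def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared test (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, p_value) with one degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs at least one observation")
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical counts: rows are high-risk and low-risk,
# columns are higher stage and lower stage.
statistic, p_value = chi_squared_2x2(9, 4, 15, 54)
```

When expected cell counts are small, as can happen with a 13-patient group, an exact test is usually preferred over the chi-squared approximation.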
Validating the genomic classifier using the FUSCC cohort
Because the genomic classifier that identified high-risk sPRCC2 was constructed using the expression information of 14 upregulated hub genes, we aimed to simplify it. Bivariate logistic regression analysis was used, and we found that clusters separated by targeting protein for Xklp2 (TPX2) expression (with the cutoff set at the thirteenth expression value) fully matched the clusters defined by the genomic classifier. This study included 92 patients (clinical information is listed in Table 2) with histopathologically confirmed sPRCC2 (positive staining for FH) who underwent surgical treatment at FUSCC between 2009 and 2019, and tumor specimens were obtained with informed consent. Immunostaining of TPX2 was performed using a rabbit monoclonal anti-TPX2 antibody (Cat. ab270612, Abcam, USA). Staining on formalin-fixed, paraffin-embedded slides was independently assessed by two experienced pathologists. The staining intensity level was graded as follows: 0, no staining; 1, weak staining; 2, moderate staining; and 3, strong staining. The extent of staining was scored from 0 to 4 based on the percentage of immunoreactive tumor cells (0%, 1-25%, 26-50%, 51-75%, and 76-100%, respectively). The overall immunohistochemistry (IHC) score was obtained by multiplying the staining intensity by the extent of staining. IHC scores of 0-3 represented low risk, and scores of 4-12 indicated high risk. Then, the Kaplan-Meier method was used to compare OS and PFS between the groups.
Validation of the potential utility of the genomic classifier for accurate drug selection
In total, 24 patients in the FUSCC cohort had a pathologically confirmed diagnosis of metastatic sPRCC2. We retrospectively collected the baseline characteristics, treatment details, and clinical outcomes of these patients by reviewing their electronic medical records, and the details were verified by two investigators. In the low-risk group, six patients received everolimus as first-line therapy. In the high-risk group, seven patients were treated with everolimus as first-line therapy, and five patients were treated with sunitinib as first-line therapy. Radiologic assessment was performed according to RECIST 1.1 criteria [29] to classify the best response to treatment as complete response (CR), partial response (PR), stable disease (SD), or progressive disease (PD).
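The TPX2 IHC scoring rule described in the methods (intensity multiplied by extent, with scores of 0-3 as low risk and 4-12 as high risk) can be written out as a short sketch. The function names are hypothetical, and the handling of fractional percentages between the stated bins (for example 25.5%) is an assumption, since the text lists only whole-number ranges.

```python
def extent_score(percent_positive):
    """Map the percentage of immunoreactive tumor cells to the 0-4 extent score."""
    if not 0 <= percent_positive <= 100:
        raise ValueError("percentage must be between 0 and 100")
    if percent_positive == 0:
        return 0
    if percent_positive <= 25:
        return 1
    if percent_positive <= 50:
        return 2  # assumption: values between 25 and 26 fall in the higher bin
    if percent_positive <= 75:
        return 3
    return 4


def ihc_risk_group(intensity, percent_positive):
    """Return (score, group): score = intensity x extent; 0-3 is low risk, 4-12 is high risk."""
    if intensity not in (0, 1, 2, 3):
        raise ValueError("intensity must be 0, 1, 2, or 3")
    score = intensity * extent_score(percent_positive)
    return score, ("high" if score >= 4 else "low")
```

For example, moderate staining (2) in 30% of tumor cells gives a score of 4 and is classed as high risk, whereas weak staining (1) in 60% of tumor cells gives a score of 3 and is classed as low risk.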

Results
PRCC2 was more aggressive than PRCC1
Both OS and DFS were shorter in patients with PRCC2 than in those with PRCC1 (both p < 0.05, Fig. 1A-B), and the chi-squared test indicated that PRCC2 was often correlated with a higher tumor stage and lymph node stage (Fig. 1C-D). The somatic mutation patterns of PRCC1 and PRCC2 differed (Fig. 1E). PRCC1 had higher frequencies of KMT2C and PCLO mutation, and its most characteristic somatic alteration was MET mutation, which was detected in only two PRCC2 samples. Meanwhile, PRCC2 had higher frequencies of CUL3, SETD2, and PBRM1 mutation. In summary, PRCC2 had a higher frequency of pathogenic mutations.
Some common genes may play a key role in the malignant phenotype of sPRCC2
The TCGA cohort included two patients with hereditary PRCC2, as indicated by FH germline mutation, and the remaining 82 patients with PRCC2 were grouped into the sPRCC2 cohort (Fig. 2A). CUL3 mutation was most common in patients with sPRCC2, and SETD2, PBRM1, and KMT2C mutations were also common. These somatic mutations inevitably influenced the gene expression pattern of sPRCC2 (Fig. 2B). The GSE26574 and GSE48352 datasets were also used to explore DEGs between sPRCC2 and normal tissues (Fig. 2C-D). In total, 316 downregulated and 65 upregulated genes were identified (Fig. 2E-F).
Genomic classifier could not identify the subtype with worse prognosis in L-clusters
A PPI network was constructed, and downregulated hub genes (BDKRB2, NPY1R, SUCNR1, KNG1, PTGER3, S1PR3, S1PR1) were screened (Fig. 3A-B). L-clusters were divided into five clusters (L-clusters A-E), and the survival curve of each cluster was drawn (Fig. 3C-E). L-clusters were then simplified into two clusters according to the cluster dendrogram. The two subtypes did not exhibit significant differences in either OS or DFS (both p > 0.05).
High-risk sPRCC2 is a highly aggressive molecular subtype
The genomic classifier divided the TCGA cohort into two subtypes (Fig. 5A-C), namely high-risk (n = 13) and low-risk groups (n = 69). Chi-squared tests indicated that the high-risk group had a higher tumor stage, a higher lymph node stage, a higher frequency of new neoplasm events, lower hemoglobin levels, and a relatively higher genomic alteration frequency (all p < 0.05, Fig. 5D-H). GSEA indicated that compared with the findings in the low-risk group, gene expression was significantly enriched for E2F targets, the G2M checkpoint, Myc targets, and other pathways in the high-risk group (Fig. 5I). Notably, gene expression in the high-risk group was also enriched in the mammalian target of rapamycin (mTOR) complex 1 (mTORC1) signaling pathway (Fig. 5J), which may shed light on the accurate use of mTOR inhibitors such as everolimus.
The genomic classifier also identified the subtype in the FUSCC cohort with the worse outcome and shed light on the accurate use of everolimus in sPRCC2
As described in the methods, TPX2 expression level was used in place of the genomic classifier (Fig. 6A-B). Ninety-two patients with histopathologically confirmed sPRCC2 (positive staining for FH) were included (Fig. 6C-D), and images of TPX2 expression (IHC, low and high) are presented in Fig. 6E-F. Samples with low and high TPX2 expression comprised the low-risk (N = 49) and high-risk groups (N = 43), respectively. Survival analysis indicated that both OS (HR = 3.361, p < 0.0001, Fig. 6G) and PFS (HR = 4.209, p < 0.0001, Fig. 6H) were significantly worse in the high-risk group than in the low-risk group.
Although the difference was not statistically significant, among patients who received first-line everolimus therapy, PFS was better in the high-risk group (N = 7) than in the low-risk group (N = 6, Fig. 6I). A retrospective analysis also indicated that in the high-risk group, everolimus exhibited better efficacy than sunitinib (N = 5, Fig. 6J). In summary, everolimus displayed greater efficacy in the high-risk group than in the low-risk group (overall response rate: 28.6% vs. 16.7%), and everolimus had greater efficacy than sunitinib in the high-risk group, including a better overall response rate (28.6% vs. 20%) and greater reduction of the target lesion (Fig. 6K).
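The overall response rates quoted above follow from simple arithmetic on the group sizes. In the sketch below, the overall response rate is taken as the share of patients whose best response was CR or PR. The responder counts (2 of 7, 1 of 6, and 1 of 5) are inferred from the reported percentages and group sizes and are not stated in the text; the function name is hypothetical.

```python
def overall_response_rate(n_responders, n_patients):
    """Overall response rate in percent: patients with CR or PR divided by all treated patients."""
    if n_patients <= 0:
        raise ValueError("need at least one patient")
    if not 0 <= n_responders <= n_patients:
        raise ValueError("responders must be between 0 and the number of patients")
    return 100.0 * n_responders / n_patients


# Responder counts are inferred from the reported rates, not given in the source.
orr_everolimus_high = round(overall_response_rate(2, 7), 1)
orr_everolimus_low = round(overall_response_rate(1, 6), 1)
orr_sunitinib_high = round(overall_response_rate(1, 5), 1)
```

With groups this small, a single additional responder shifts each rate by 14 to 20 percentage points, which is consistent with the lack of statistical significance noted above.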

Discussion
The present study successfully constructed a genomic classifier for identifying patients with high-risk sPRCC2 using transcriptomic data from multiple cohorts. The high-risk group had a significantly worse prognosis than the low-risk group concerning both OS and DFS. In addition, the high-risk group featured a higher tumor stage, a higher lymph node stage, a higher frequency of new neoplasm events, lower hemoglobin levels, and a relatively higher genomic alteration frequency. The GSEA results indicated that compared with the findings in the low-risk group, gene expression in the high-risk group was significantly enriched in the mTORC1 signaling pathway, which may shed light on the use of mTOR inhibitors. In the external validation, TPX2 expression was used as a surrogate for the classifier according to the logistic regression results, and it fully matched the clusters. This method for simplifying the classifier may not be strictly rigorous, but the simplified classifier showed a strong ability to predict OS and PFS in the FUSCC cohort. The greater efficacy of everolimus in the high-risk sPRCC2 group, although not statistically significant, exceeded our expectations.
Previous studies demonstrated that PRCC2 represents a heterogeneous group of lesions that can be divided into various subtypes according to genetic and molecular patterns, and these patterns reflect differences in the clinical course and prognosis of the disease. In a previous study [30], comprehensive genomic profiling was performed to sequence 315 genes, and the commonly altered genes in PRCC2 were CDKN2A/B (18%), TERT (18%), NF2 (13%), and FH (13%). Yang et al. [31] identified two highly distinct molecular PRCC subclasses via morphologic correlation, and they found that G1-S and G2-M checkpoint genes were dysregulated in class 1 and class 2 tumors. A similar pattern was observed in this research. We found that gene expression in high-risk sPRCC2 was enriched in the G2M checkpoint pathway (Fig. 5I), which suggests that the G2M checkpoint plays a key role in the malignant phenotype of sPRCC2. In 2016, the TCGA research network [6] revealed that PRCC2 can be further classified into three individual subgroups based on molecular differences associated with patient survival. In this research, we compared the previously reported classifier with that developed in this study. The predictive accuracy of the two classifiers for OS was similar (Figure S1).
Our results demonstrated that high-risk sPRCC2 was significantly correlated with a higher tumor stage, a higher lymph node stage, and worse OS. Because the treatment of advanced PRCC2 remains difficult, it is of great importance to identify potential targets for suppressing PRCC2 growth. As mentioned in a previous study [6], the classification of PRCC may have a significant impact on clinical and therapeutic management and clinical trial design. Mutation of NF2 (the Hippo pathway tumor suppressor) was observed in a number of PRCCs, and this pathway has been targeted in other cancers [32]. The NRF2-ARE pathway was upregulated in both hereditary PRCC and sPRCC2. Currently, researchers are interested in the NRF2-ARE pathway, and novel strategies targeting this pathway have recently been developed [33,34]. In this study, we found that high-risk sPRCC2 exhibited elevated mTORC1 signaling pathway activity, which suggests a potential role for the targeted use of mTOR inhibitors. Everolimus, an oral mTOR inhibitor, has antitumor activity in multiple cancer types [35], and previous research demonstrated that everolimus has some clinical benefit in patients with metastatic PRCC [36,37]. In our retrospective analysis of the FUSCC cohort, everolimus exhibited a stronger drug effect against high-risk sPRCC2 than against low-risk sPRCC2, and everolimus had greater activity in the high-risk group than sunitinib. This result suggested that the genomic classifier can also guide the accurate use of everolimus in sPRCC2.
This study had several limitations. The nature of retrospective research limits the clinical value of this work. Further validation in multicenter or prospective studies is needed to verify the findings. However, it is difficult to conduct randomized controlled trials in sPRCC2 because of its rarity. There is also an urgent need for in vitro and in vivo experiments to explore the underlying mechanisms behind the genomic classifier.