GPX2 predicts recurrence-free survival and triggers the Wnt/β-catenin/EMT pathway in prostate cancer

Objective This study aimed to establish a prognostic model related to prostate cancer (PCa) recurrence-free survival (RFS) and identify biomarkers. Methods The RFS prognostic model and key genes associated with PCa were established using Least Absolute Shrinkage and Selection Operator (LASSO) and Cox regression from the Cancer Genome Atlas (TCGA)-PRAD and the Gene Expression Omnibus (GEO) GSE46602 datasets. The weighted gene co-expression network (WGCNA) was used to analyze the obtained key modules and genes, and gene set enrichment analysis (GSEA) was performed. The phenotype and mechanism were verified in vitro. Results A total of 18 genes were obtained by LASSO regression, and an RFS model was established and verified (TCGA, AUC: 0.774; GSE70768, AUC: 0.759). Three key genes were obtained using multivariate Cox regression. WGCNA analysis obtained the blue module closely related to the Gleason score (cor = –0.22, P = 3.3e − 05) and the unique gene glutathione peroxidase 2 (GPX2). Immunohistochemical analysis showed that the expression of GPX2 was significantly higher in patients with PCa than in patients with benign prostatic hyperplasia (P < 0.05), but there was no significant correlation with the Gleason score (GSE46602 and GSE6919 verified), which was also verified in the GSE46602 and GSE6919 datasets. The GSEA results showed that GPX2 expression was mainly related to the epithelial–mesenchymal transition (EMT) and Wnt pathways. Additionally, GPX2 expression significantly correlated with eight kinds of immune cells. In human PCa cell lines LNCaP and 22RV1, si-GPX2 inhibited proliferation and invasion, and induced apoptosis when compared with si-NC. The protein expression of Wnt3a, glycogen synthase kinase 3β (GSK3β), phosphorylated (p)-GSK3β, β-catenin, p-β-catenin, c-myc, cyclin D1, and vimentin decreased; the expression of E-cadherin increased; and the results for over-GPX2 were opposite to those for over-NC. The protein expression of GPX2 decreased, and β-catenin was unchanged in the si-GPX2+ SKL2001 group compared with the si-NC group. Conclusion We successfully constructed the PCa RFS prognostic model, obtained RFS-related biomarker GPX2, and found that GPX2 regulated PCa progression and triggered Wnt/β-catenin/EMT pathway molecular changes.


INTRODUCTION
Prostate cancer (PCa) is one of the most common male malignant tumors in the United States and the second leading cause of cancer-related deaths in men (Kang et al., 2020). More than 80% of PCa cases are diagnosed as local diseases and usually treated by radical prostatectomy. However, about 15% of patients have a biochemical recurrence within 5 years after surgery, and the recurrence rate has been reported to be as high as 40% within 10 years. Local PCa that relapses after treatment can progress to fatal castration-resistant prostate cancer (CRPC) (Li et al., 2017). The causes of PCa recurrence are complex and diverse, and the specific mechanism has not yet been clarified (Siegel, Miller & Jemal, 2019). Therefore, research on the mechanism of PCa recurrence and the application of prognostic biomarkers may be of great significance in improving the survival rate of patients with PCa.
Many studies have shown that the epithelial-mesenchymal transition (EMT) and Wnt/βcatenin signaling pathways play an essential role in the occurrence and development of PCa (Montanari et al., 2017). EMT is necessary for PCa occurrence and distant metastasis, and plays a critical role in PCa metastasis to other organs (He et al., 2020). Epithelial cells attain the biological characteristics of stromal cells (Chaves et al., 2021). Studies have shown that the EMT and Wnt/β-catenin signaling pathways are closely related. Wnt binding to its receptor frizzled protein results in protein phosphorylation, which inhibits GSK-3 β activity. Consequently, β-catenin degradation is blocked and β-catenin accumulates in the cytoplasm, enters the nucleus, interacts with cytokines, activates the transcription of downstream target genes, induces EMT in cells, and promotes tumor growth and metastasis (Hseu et al., 2019;Sun et al., 2020).
Bioinformatics analysis is one of the crucial methods used for gene molecular research based on Big Data (Hutter & Zenklusen, 2018;Botía et al., 2017). In this study, PCa RFSrelated differentially expressed genes (DEGs) were screened by analyzing the data of PCarelated gene expression and clinicopathological characteristics in the Cancer Genome Atlas (TCGA) and Gene Expression Omnibus (GEO) databases. We analyzed the protein-protein interaction (PPI) based on DEGs. Survival, Cox regression, and LASSO regression analyses were used to establish and verify the prognostic model. The DEGs between different Gleason scores of PCa tissues and the key gene glutathione peroxidase 2 (GPX2) were obtained by weighted gene co-expression network analysis (WGCNA). The GSEA of GPX2 and its significance in prognosis and immunity were analyzed. Finally, a series of in vitro experiments were conducted to explore the potential role of GPX2 in PCa, so as to provide new clues for diagnosing and treating PCa.

Data acquisition
The data of gene expression profiles from the TCGA-PRAD dataset were downloaded from the TCGA database and standardized. At the same time, the clinical information of patients, including age, gender, TNM stage, pathological stage, and prognosis, was downloaded. The samples with incomplete clinical information and survival data were excluded. A total of 481 PCa samples and 51 adjacent tissue samples were included in the study. Three datasets of gene expression and clinical profiles of Pca were downloaded from the GEO database (GSE70768, GSE46602, and GSE6919).

Construction and validation of the Pca RFS prognostic model
The data were analyzed using the R software DESeq and Limma package. The volcano and heat maps were drawn using the P value <0.05 and |logfc|>2 as the screening conditions. After taking the intersection, Pca DEGs were obtained. The PPI network of DEGs was constructed using the STRING database, and the setting was adjusted to the interactive score of 0.7. Cytohubba and MCODE modules were used to screen Top30 and topologyrelated genes from Cytoscape 3.7.2. The prognostic model was established using univariate Cox, LASSO, and multivariate Cox regression analyses. Finally, the model was validated in the TCGA-PRAD and GSE70768 datasets.

WGCNA and GPX2 predicted PCa RFS
A total of 2,191 TCGA DEGs were analyzed using WGCNA. The Pearson correlation coefficient between genes was calculated. The scale-free network was constructed and the appropriate threshold was selected for network construction. Using two-step construction, the adjacency matrix was transformed into a topological overlap matrix, the clustering tree was generated through hierarchical clustering, and clustering was combined through a dynamic cut. The significance of gene and module was estimated, and the clinical sample grouping information was obtained. The identity of each gene module was calculated to measure the importance of genes in each module. Setting parameters |gene module|>0.8 and |gene significance|>0.2 as criteria, we screened the hub genes of modules closely related to clinical traits. The blue module was found to be significantly related to the Gleason score.
The key genes and blue modules were crossed, and the single gene GPX2 was obtained. The TCGA-PRAD dataset verified the GPX2-predicted PCa RFS.

GSEA and immunohistochemical (IHC) analysis
GSEA 4.0.1 was used to compare and analyze the DEGs between high-and low-expression groups of GPX2 in the PCa tumor samples of the TCGA-PRAD dataset. Gene set database selection KEGG v7. 4 was used to set the replacement times to 1,000; P < 0.05 and false discovery rate (FDR) <0.25 indicated significantly enriched genes. Between May 2021 and January 2022, the Second Affiliated Hospital of Nanjing University of Chinese Medicine collected tissues from 20 patients with PCa (10 patients with a high Gleason score ≥8 and 10 patients with a low Gleason score ≤7) and 10 patients with benign prostatic hyperplasia. Detailed information is shown in Table S1. All patients signed an informed consent form. This study was approved by the ethics committee of the Second Affiliated Hospital of the Nanjing University of Chinese Medicine (2021SEZ-030-01). The biopsy samples were collected, fixed with 10% formaldehyde, and embedded in paraffin after routine treatment. The prepared wax block was cut into sections with a thickness of 2 µm. Immunohistochemical staining was performed on the treated sections. The processed sections were stained with GPX2 (ab140130; Abcam, Cambridge, UK). Two pathologists used the double-blind method to judge each slice. The sections were observed using a low-power mirror under a microscope to select the best field of vision, and then a high-power lens was used in this range 10× 40. Five visual fields were randomly observed, and the IHC score was defined as the product of the frequency of positive cells and the intensity of staining.

GPX2 expression with immune cells in PCa
In this study, the CIBERSORT algorithm was used to calculate the infiltration proportion of 22 kinds of immune cells in PCa tissue, and 481 PCa samples were analyzed using the R software. The samples with a P value <0.05 were included in the follow-up analysis. Taking the median expression level of GPX2 mRNA as the boundary, the samples were divided into high-and low-expression groups.

Cell culture and qRT-PCR analysis
Human PCa cell lines (PC-3, DU145, LNCaP, and 22RV1) were purchased from Procell (Wuhan, China). The cells were cultured in RPMI-1640 (bl303a;Biosharp, Anhui, China) at 37 • C and in the presence of 5% CO2. The cells adhered to the wall and were passaged every 3 days. The cells in the logarithmic growth phase were used for the experiment. TRIzol reagent (bs259a, Biosharp, China) was used to extract the total RNA of PC-3, LNCaP, 22Rv1, and DU145 cells. A reverse transcription kit (11119es60; Yeasen, Shanghai, China) was used to reverse transcribe RNA into cDNA, and a SYBR Green Kit (11201es50; Yeasen, China) was used for qRT-PCR amplification, with β -actin as an internal control. The 2-CT method was used for calculation. The primers were as follows: GPX2: 5 -GCCTCCTTAAAGTTGCCATA-3 and 5 -GCCCAGAGCTTACCCA-3 ; β -actin: 5 -GAAGAGA-GAGACCCTCACGCTG-3 , and 5 -ACTGTGAGGAGGGGAGATTCAGT-3 . The experiment was repeated three times.

Transfection and grouping
LNCaP and 22RV1 cells in the logarithmic growth phase (n = 200,000) were inoculated into the cell culture plate and transfected according to the Lipofectamine 2000 (11668-027, Invitrogen, Waltham, MA, USA) instructions. They were divided into GPX2 low expression (si-GPX2) and negative control (si-NC) groups, GPX2 overexpression (over-GPX2) and negative control (over-NC) groups, and SKL2001(HY-101085, MCE, USA) +si-GPX2 groups. The transfection effect was verified using qRT-PCR and Western blotting. The experiment was repeated three times.

CCK-8 assay
LNCaP and 22RV1 cells in the logarithmic growth phase (n = 2,000) were inoculated into the cell culture plate. After undergoing corresponding treatment according to experimental grouping, 10 µL of cells were added to each well containing CCK-8 solution (PR645; Dojindo, Kumamoto, Japan). The culture plate was incubated and the absorbance value was determined to be 450 nm using a microplate reader (SpectraMax i3; Molecular Devices, San Jose, CA, USA). Cell proliferation inhibition rate = (control group absorbance valueexperimental group absorbance value)/control group absorbance value ×100%. The experiment was repeated three times.

Flow cytometry assay
The transfected cells were collected and digested with trypsin without EDTA. The adherent cells were collected and centrifuged. The supernatant was discarded and the cell precipitate was washed twice with phosphate-buffered saline (PBS). Annexin V-FITC/PI (556547, BD; Franklin Lakes, NJ, USA) was added. After incubation in the dark at room temperature for 5 min, we detected the apoptosis rate of LNCaP and 22RV1 cells using a flow cytometer (LSRII instrument; BD, Franklin Lakes, NJ, USA). The experiment was repeated three times.

Transwell invasion assay
The transfected cells were collected and the cell concentration was adjusted to 3× 105/mL. The cells were inoculated into the upper layer of the Transwell chamber (3422; Corning, Corning, NY, USA) which contained a serum-free medium, and 100 µL/well of the cell suspension was added. Additionally, 600 µL of the fresh culture medium was added to the lower layer of the chamber. The liquid in the upper chamber was discarded after culturing for 24 h, and the cells were wiped off the upper-chamber membrane with a wet cotton swab. The cells on the lower-chamber membrane were fixed with methanol for 20 min, dyed with crystal violet, rinsed with PBS until the background was clean, dried, and imaged after sealing. ImageJ software was used to count the number of transmembrane cells. The experiment was repeated three times.

Statistical analysis
The Student t -test was used for continuous variables, while the classification variables were analyzed using the χ 2 test. Cox and LASSO regression models were used to analyze the predictors of RFS. The data were expressed as mean ± standard deviation. All data were analyzed with R version 4.1.2, SPSS 24.0 and GraphPad Prism 8.0. A P value <0.05 indicated a significant difference. All tests were repeated three times.

Identified DEGs
The GSE46602 dataset had 211 upregulated genes and 409 downregulated genes. The TCGA-PRAD dataset was comprised of 898 upregulated genes and 1,293 downregulated genes. The volcano and heat maps showed DEGs (Figs. 1A and 1B). After further taking the intersection of the aforementioned datasets, we obtained the common 262 DEGs (94 upregulated genes and 168 downregulated genes) (Table 1 and Fig. 1C). STRING and Cytoscape were used to construct the PPI network of DEGs (Fig. 1D). Cytohubba and MCODE modules were used to screen Top30 and topology-related DEGs (Figs. 1E and 1F).  and low-risk groups, and the RFS-related scatter plot and the heat map of the prognostic model were constructed (Figs. 2C-2E). The ROC curve of the risk score was also generated, with AUC = 0.774 (Fig. 2F). The GSE70768 dataset was used to verify the prognostic model, and the RFS-related scatter plot and the heat map of the prognostic model were constructed (Figs. 2H-2J); the ROC curve of the risk score had AUC = 0.759 (Fig. 2K).

PCa RFS prognosis model
Three key genes obtained by the intersection of the prognostic model and Top30 were GPX2, EZH2, and COL2A1 (Fig. 2L). The multivariate Cox regression showed that the three genes significantly correlated with patient RFS (P = 0.001, 0.023,0.008, Fig. 2M).

WGCNA analysis and GPX2
According to WGCNA analysis and taking the correlation coefficient of 0.85 as the standard, the pickSoft threshold function was used to select the weight parameter of the adjacency matrix (soft threshold); β = 2 was the standard gene module (Figs. 3A-3C). Using the two-step method, the minimum number of genes in each gene module was set to 30, and the height of cutting branches and merging modules was set to 0.25. Finally, nine modules were obtained (Fig. 3D). Of these, we selected the blue module for this study, which was comprised of 450 genes with the correlation (r = -−0.22, P = 3.3e−05) (Fig. 3E). The intersection of blue module and key genes yielded only one gene GPX2 (Fig. 3F), which was used as in follow-up research.

GPX2 expression independently predicted RFS in PCa
To further evaluate the prognostic value of GPX2 for PCa, the prognostic nomogram was constructed by integrating clinical factors and gene expression (Fig. 4D), and the correction curve was drawn to evaluate the predictive ability of the nomogram. The correction curve showed that the risks predicted by the nomogram were highly consistent with the observed RFS for 1, 3, and 5 years (Figs. 4A-4C).

GSEA analysis
The GSEA analysis showed that GPX2 expression was mainly related to EMT and infectious disease biology, including the EMT and Wnt signaling pathways (Figs. 4E and 4F). The

CIBERSORT analysis
A significant difference was found in the degree of immune cell infiltration in 61 samples (20 in the GPX2 low-expression group and 41 in the GPX2 high-expression group) (Figs. 5A and 5B). Eight kinds of immune cells (activated dendritic cells, resting dendritic cells, M0 macrophages, M2 macrophages, monocytes, neutrophils, resting memory CD4 T cells, and CD8 T cells) showed GPX2 expression with significant differences (Fig. 5C).

GPX2 expression in PCa with the Gleason score
The immunohistochemical analysis showed that a GPX2-positive immune reaction was located in the cytoplasm and brownish yellow particles existed in the cytoplasm (Fig. 6A). GPX2 expression in benign prostatic hyperplasia tissue was significantly higher than in PCa tissue, with no significant difference in Gleason score (Fig. 6B). GSE66602 and GSE6919 datasets also confirmed the aforementioned results (Fig. 6C).

Expression of GPX2 in PCa cells and transfection
qRT-PCR and Western blotting were used to detect the highest mRNA level of GPX2 in LNCaP and 22RV1 cells, and GPX2 was used for subsequent experiments (Fig. 6D). Western blotting showed that the expression of GPX2 in the si-GPX2 group was significantly lower than that in the si-NC group. Also, the expression of GPX2 in the over-GPX2 group was significantly higher than that in the over-NC group, indicating that silencing and overexpression were successful and could be used in subsequent experiments (Figs. 6E and 6F).

GPX2 regulates the Wnt/β-catenin/EMT pathway in LNCaP and 22RV1 cells
The protein expression of Wnt3a, GSK3 β, p-GSK3 β, β-catenin, pβ-catenin, c-myc, cyclin D1, and vimentin decreased and that of E-cadherin increased in the si-GPX2 group compared with the si-NC group. The results for over-GPX2 were opposite to those for over-NC (Figs. 8A-8B and 8E-8F). Additionally, the protein expression of β-catenin increased and that of GPX2 decreased in the si-GPX2 + SKL2001 group compared with the si-NC group (Figs. 8C-8D and 8G-8H).

DISCUSSION
Although the incidence rate of PCa in Asia is far lower than that of Europe and North America, the incidence and mortality rate of PCa in China has rapidly increased in recent years (Bray et al., 2018;Gu et al., 2018). The routine clinical application of prostate-specific antigen has produced good results in helping the early diagnosis of PCa (Perera et al., 2021). However, PCa is a clinically heterogeneous cancer with large individual differences, particularly involving the diagnosis and treatment of early tumor diagnosis and later tumor progression, tumor metastasis, and hormone resistance (Ji et al., 2019). Identifying the genes related to the occurrence and development of PCa and clarifying the pathogenesis of cancer can provide a theoretical basis for preventing and treating PCa (Giri et al., 2018;Velho et al., 2018). Therefore, in-depth clinical and basic research involving a larger sample size is necessary in order to explore reliable diagnosis and treatment methods. PCa is a complex disease affected by multiple genes. In this study, we developed a risk score based on 18 genes that was verified using GSE70768 and TCGA-PRAD datasets, with a good prediction performance. WGCNA was used to analyze the TCGA-PRAD dataset, and nine modules associated with the pathological grade, Gleason scores, TNM stage, and clinical characteristics of PCa were obtained. We selected the Gleason score and blue module for analysis and found a significant correlation (Cor = -−0.22, P = 3.3e−05). Then, we predicted the intersection of three parts of the model genes using the blue module, Top30, and key genes. Finally, only the core gene GPX2 was obtained. A nomogram was constructed to predict the recurrence of PCa. The nomogram could predict the possibility of recurrence in PCa patients and was more accurate than clinical indicators.
GPX2, also known as gastrointestinal-specific glutathione peroxidase, is a seleniumcontaining protein. It is mainly expressed in the gastrointestinal system and exerts antiinflammatory and antioxidant effects (Lennicke et al., 2017). In recent years, GPX2 has been found to be highly expressed in a variety of tumors, especially inflammation-induced tumors, and may promote cell proliferation and inhibit apoptosis (Minato et al., 2021;Ji et al., 2021;Tian et al., 2021). GPX2 is also overexpressed in human and mouse CRPC cells and promotes the malignant proliferation of PCa cells. Inhibition of GPX2 expression significantly inhibited the proliferation of PCa cells and made them stagnate in the G2/M phase (Naiki et al., 2014). Inhibiting the expression of GPX2 can also improve the level of reactive oxygen species (Wu et al., 2021). These results showed that GPX2 had a certain correlation with tumor immunity. In this study, the high and low expression of GPX2 could influence eight kinds of immune cells to participate in the immune response of PCa. However, there have been few studies on the relationship between the expression of GPX2 and PCa prognosis and mechanism. The Gleason score system is a PCa pathological grading system that was introduced in 1974 ( Gleason & Mellinger, 1974). It has become the most powerful tool used to predict the prognosis of patients with PCa (Nagpal et al., 2020). It is closely related to the differentiation and invasion of PCa, which is of great significance for clinicians choosing and making treatment plans (Thomsen et al., 2020). IHC staining showed that the expression of GPX2 in PCa tissues had no significant correlation with the Gleason score; two datasets (GSE66602 and GSE6919) were used to verify the same results. Therefore, we speculated that GPX2 played an important role in PCa with no correlation to the Gleason score. We concluded that data mining must be combined with experimental verification. Furthermore, the survival prognosis of patients with high and low GPX2 expression was analyzed according to the public datasets GSE70768 and TCGA-PRAD. The results showed that the RFS time of patients in the GPX2 high-expression group was shorter than that of patients in the GPX2 low-expression group. In addition, this study used TCGA-PRAD data to construct a nomogram to predict the prognosis of patients with PCa, which helped us more intuitively understand the importance of the GPX2 expression levels in predicting PCa prognosis.
We used lentivirus transfection technology to promote the over-expression and low expression of GPX2 in LNCaP and 22RV1 cells in order to determine the role of GPX2 in the occurrence and development of PCa. The corresponding in vitro cell experiment results showed that inhibiting the expression of GPX2 could inhibit the proliferation and invasion of LNCaP and 22RV1 cells and induce apoptosis. Also, promoting the expression of GPX2 could promote proliferation and invasion and prevent the apoptosis of LNCaP and 22RV1 cells. The Wnt/β-catenin and EMT pathways were closely related to the occurrence and development of PCa (Kaplan et al., 2021;Chaves et al., 2021). Nath et al. (2019) found that Abi1 loss promoted the progression of PCa by modulating the Wnt signal and inducing EMT. Zhang & Li (2020) found that long noncoding RNA NORAD contributed to the metastasis of PCa via the Wnt/β-catenin/EMT pathway. However, the modulation of GPX2 on the Wnt/β-catenin/EMT pathway has not been reported. This study was novel in reporting that when the expression level of GPX2 changed, the proteins related to the Wnt/β-catenin/EMT pathways also changed. Therefore, we concluded that the mechanism of GPX2 in influencing the occurrence and prognosis of PCa was related to the Wnt/β -catenin/EMT signaling pathway.
However, despite the clinical significance of our findings, this study had some limitations. First, although the performance and AUC values of the calibration curve were excellent in the validation group, multicenter clinical application is still needed to further evaluate the external utility of the prognostic model. Only 262 genes were defined as genes related to the recurrence of PCa, and the construction of the prognostic model was evaluated. Some important genes might have been excluded before establishing the prognostic model. Second, GPX2 was highly expressed in benign prostatic hyperplasia compared with PCa. However, over-GPX2 promoted PCa cell proliferation and invasion and inhibited apoptosis, indicating that over-GPX2 promoted tumor progression to a certain extent. The underlying mechanism needs to be further examined.
In conclusion, the expression of GPX2 in PCa can be used as a new prognostic biomarker of RFS of PCa. GPX2 might regulate PCa progression via the Wnt/β-catenin/EMT pathway, and is expected to become a potential target for treating PCa.

ADDITIONAL INFORMATION AND DECLARATIONS Funding
This study was funded by The Second Affiliated Hospital of Nanjing University of Chinese Medicine (grant number SEZ202006). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.