An Individualized Prognostic Model in Patients with Locoregionally Advanced Nasopharyngeal Carcinoma Based on Serum Metabolomic Profiling

Purpose: This study aims to evaluate the value of a serum metabolomics-based metabolic signature for locoregionally advanced nasopharyngeal carcinoma (LA-NPC) patients, thereby assisting clinical decisions. Methods: In this retrospective study, a total of 320 LA-NPC patients were randomly divided into a training set (ca. 70%; n = 224) and a validation set (ca. 30%; n = 96). Serum samples were analyzed using widely targeted metabolomics. Univariate and multivariate Cox regression analyses were used to identify candidate metabolites related to progression-free survival (PFS). Patients were categorized into high-risk and low-risk groups based on the median metabolic risk score (Met score), and the PFS difference between the two groups was compared using Kaplan–Meier curves. The predictive performance of the metabolic signature was evaluated using the concordance index (C-index) and the time-dependent receiver operating characteristic (ROC) curve, and a comprehensive nomogram was constructed using the Met score and other clinical factors. Results: Nine metabolites were screened to build the metabolic signature and generate the Met score, which effectively separated patients into low- and high-risk groups. The C-index in the training and validation sets was 0.71 and 0.73, respectively. The 5-year PFS was 53.7% (95% CI, 45.12–63.86) in the high-risk group and 83.0% (95% CI, 76.31–90.26) in the low-risk group. During the construction of the nomogram, Met score, clinical stage, pre-treatment EBV DNA level, and gender were identified as independent prognostic factors for PFS. The predictive performance of the comprehensive model was better than that of the traditional model. Conclusion: The metabolic signature developed through serum metabolomics is a reliable prognostic indicator of PFS in LA-NPC patients and has important clinical significance.


Introduction
Nasopharyngeal carcinoma (NPC) is a malignant epithelial tumor with unique geographic distribution characteristics; it is primarily found in Southern China and Southeast Asia [1]. Due to its insidious onset and non-specific symptoms in early stages [2], nearly [...] identifying metabolites. Notably, widely targeted metabolomics based on multiple reaction monitoring, a novel metabolomics technology, integrates the advantages of non-targeted and targeted metabolomics [32]. This high-throughput technology not only detects hundreds of metabolites in batches but also shows higher sensitivity and accuracy in the identification and quantification of metabolites [33].
In this study, serum widely targeted metabolomic profiling based on Metabolon ultra-performance liquid chromatography-tandem mass spectrometry (UPLC-MS/MS) was conducted in 320 individuals with LA-NPC before treatment. We sought to develop a metabolic signature for predicting the prognosis of patients with LA-NPC and to build a metabolomics-based prognostic model for predicting progression-free survival (PFS). We hope that our research will map the metabolomic landscape of LA-NPC and provide a new biomarker for tailoring treatment decisions.

Study Design and Clinical Specimens
The overall study design and workflow are presented in Figure 1. A total of 320 patients with LA-NPC (stage III to IVa) treated at Sun Yat-sen University Cancer Center between 2013 and 2016 were enrolled in this study. The study population was randomly divided at a ratio of 7:3 into a training set (224 cases, 70%) and a validation set (96 cases, 30%). According to the eighth edition of the American Joint Committee on Cancer (AJCC) Staging Manual, two radiologists reevaluated all patients by reassessing baseline MRI scans and resolved discrepancies by consensus. Complete baseline clinical information and follow-up data were available for these patients. All patients underwent a complete physical examination, MRI of the nasopharynx and neck, bone scan, fiberoptic nasopharyngoscopy, and complete blood sampling, including routine blood tests, biochemical characteristics, and EBV-DNA copies before treatment. Individuals with impaired heart, lung, liver, or kidney function and those with a history of malignant tumors were excluded from this study. All patients underwent platinum-based chemotherapy and intensity-modulated radiotherapy (IMRT). The Research Ethics Committee of the Sun Yat-sen University Cancer Center approved this study (B2022-429-01).
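The 7:3 random allocation described above can be illustrated with a minimal Python sketch. This is not the procedure used in the study (which relied on an R package); the function name, the seed, and the use of consecutive integers as patient identifiers are assumptions made only for illustration.

```python
import random


def split_train_validation(patient_ids, train_fraction=0.7, seed=2023):
    """Randomly split patient IDs into training and validation sets.

    The seed and the function name are illustrative assumptions.
    """
    ids = list(patient_ids)
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(ids)
    n_train = round(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]


# 320 hypothetical patient identifiers, matching the cohort size in the text
train_ids, validation_ids = split_train_validation(range(1, 321))
print(len(train_ids), len(validation_ids))
```

With 320 patients and a training fraction of 0.7, the split yields 224 and 96 patients, the set sizes reported above.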

Clinical Endpoints and Follow-Up
The primary endpoint was 5-year PFS, and the secondary endpoints were 5-year overall survival (OS), 5-year distant metastasis-free survival (DMFS), and 5-year local-regional recurrence-free survival (LRFS). PFS was defined as the time from the initial diagnosis until relapse, progression, or death due to any cause, whichever occurred first. OS was defined as the time from the date of diagnosis to death from any cause. DMFS was defined as the time from diagnosis to the date of distant recurrence, and LRFS was defined as the time from diagnosis to local or regional recurrence.
Patients underwent routine imaging every 3 months for the first 2 years and every 6 months for years 3-5. The follow-up duration was defined as the interval from diagnosis to death or the last follow-up (31 December 2020). All local and regional recurrences were confirmed by radiological examination and pathology. The diagnosis of distant metastases was based mainly on imaging methods, such as MRI, CT, or PET/CT.
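As an illustration of the endpoint definitions above, the following Python sketch derives a PFS time and event indicator from dates. The follow-up cutoff (31 December 2020) is taken from the text; the function name, the example dates, and the conversion of days to months with an average month length of 30.4375 days are assumptions.

```python
from datetime import date

STUDY_CUTOFF = date(2020, 12, 31)  # last follow-up date stated in the text
DAYS_PER_MONTH = 30.4375           # assumed average month length


def pfs_time_and_event(diagnosis, first_event=None, last_follow_up=STUDY_CUTOFF):
    """Return (PFS in months, event indicator).

    first_event is the earliest date of relapse, progression, or death;
    None means no event was observed, so the patient is censored at the
    last follow-up.
    """
    if first_event is not None and first_event <= last_follow_up:
        end, event = first_event, 1
    else:
        end, event = last_follow_up, 0
    return (end - diagnosis).days / DAYS_PER_MONTH, event


# Hypothetical patients: one relapse one year after diagnosis, one censored
months_a, event_a = pfs_time_and_event(date(2014, 1, 1), date(2015, 1, 1))
months_b, event_b = pfs_time_and_event(date(2014, 1, 1))
print(round(months_a, 2), event_a)
print(round(months_b, 2), event_b)
```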

Blood Sample Collection
To minimize the impact of food and nutrition on the serum metabolome and laboratory results, peripheral blood samples were collected in the morning under fasting conditions before the initial treatment. The blood samples were immediately centrifuged at 1000× g at 4 °C for 10 min to isolate serum, which was stored at −80 °C until assayed. Baseline routine hematology and biochemical assays were performed at admission using a fully automated hematology analyzer Sysmex XE-5000 (Sysmex, Kobe, Japan) and an automated immunoturbidimetric analyzer 7600-020 (Hitachi High-Technologies, Tokyo, Japan). The circulating EBV-DNA levels in the blood were detected using quantitative real-time polymerase chain reaction (RQ-PCR).

Widely Targeted Metabolite Profiling
We applied a pseudotargeted metabolomics strategy based on UPLC-MS/MS. The sample order was randomized, and all samples were measured in a single batch. Serum samples were thawed on ice, vortexed for 10 s, and mixed well. We then added 300 µL of extracting solution, which consisted of internal standards and 20% methanol/acetonitrile, to 50 µL of serum. The mixture was stirred for 3 min and centrifuged at 12,000 rpm for 10 min at 4 °C. The supernatant was transferred to a new tube and centrifuged at 12,000 rpm for 5 min at 4 °C. The supernatant was then placed at −20 °C for 30 min before being centrifuged at 12,000× g for 5 min at 4 °C. Finally, 150 µL of supernatant was transferred to a liquid chromatography (LC) injection vial, and the injection volume was 2 µL for on-board analysis using LC-MS.
In the training and validation sets, widely targeted metabolite profiling of samples was performed using the self-built Metware database (MWDB) developed by MetWare Biotechnology Co. (Wuhan, China). In total, 746 endogenous water-soluble metabolites were detected using a UPLC-MS/MS system (ExionLC AD coupled to a QTRAP spectrometer). The specific information of all metabolites is given in Table S1. To facilitate the observation of changes in the relative content of metabolites, we standardized the original relative content of the differential metabolites identified by the screening criteria (unit variance scaling, UV scaling) and then drew a heatmap (Figure S1). For additional details on the widely targeted metabolomics analysis and quality control analysis, see the experimental conditions in Supplementary Materials and Figure S2.

Metabolites Selection and Metabolomics Signature Building
All the samples were randomly divided into training and validation sets at a ratio of 7:3. The training set was used to calculate the risk score (Met score) and establish a prognostic model, whereas the validation set was used to verify the performance of the model. We used the R language to standardize the metabolome data (746 metabolites) with the Z-score method and screened out 27 metabolites associated with PFS by univariate Cox regression (false discovery rate [FDR] < 0.05) (Table S2). We then built a prognostic model from the screened metabolites using multivariate Cox regression, with forward-backward stepwise selection guided by the Akaike information criterion (AIC). A smaller AIC value indicates a better balance between model fit and model complexity. Finally, we obtained a metabolically related risk regression model named the 9-metabolite signature.
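The two preprocessing and screening steps named above, Z-score standardization and false discovery rate control, can be sketched in plain Python. This is a minimal illustration rather than the study's R code: the function names and the example p-values are hypothetical, the Cox regression itself is not reproduced, and the sample standard deviation (n − 1 denominator) is assumed for scaling.

```python
from statistics import mean, stdev


def z_score(values):
    """Standardize one metabolite's abundances to mean 0 and SD 1."""
    m, s = mean(values), stdev(values)  # sample SD (n - 1), an assumption
    return [(v - m) / s for v in values]


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


scaled = z_score([1.0, 2.0, 3.0])
# Hypothetical Wald-test p-values from four univariate Cox models
fdr = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
selected = [i for i, q in enumerate(fdr) if q < 0.05]
print(scaled, fdr, selected)
```

Metabolites whose adjusted p-value falls below 0.05 would pass on to the multivariate selection step.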

Prognostic Validation of 9-Metabolite Signature
We calculated the Met score for all patients using the Cox regression coefficients of each metabolite in the 9-metabolite signature. The patients were divided into high- and low-risk groups using the median Met score, and Kaplan-Meier (KM) survival analysis with the log-rank test was used to compare survival between the two groups in both sets. Then, time-dependent receiver operating characteristic (ROC) analysis and the concordance index (C-index) were applied to assess the predictive power of the 9-metabolite signature for PFS. We also evaluated the predictive performance of the 9-metabolite signature for the secondary endpoints, including 5-year OS, 5-year DMFS, and 5-year LRFS.
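For readers unfamiliar with the KM method, the product-limit estimator behind the survival curves can be sketched as follows. This is an illustrative Python implementation under stated assumptions, not the R code used in the study; the function name and the five example patients are hypothetical.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier product-limit estimate.

    times  : follow-up times (e.g., months)
    events : 1 if progression/death was observed, 0 if censored
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = 0  # events at time t
        leaving = 0   # patients leaving the risk set at time t
        while i < len(data) and data[i][0] == t:
            observed += data[i][1]
            leaving += 1
            i += 1
        if observed:
            survival *= 1 - observed / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical follow-up data for five patients
curve = kaplan_meier([6, 12, 12, 24, 36], [1, 0, 1, 1, 0])
print(curve)
```

Applying the function separately to the high- and low-risk groups gives the two curves that a log-rank test would then compare.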

Performance of the Nomogram Based on Metabolomics and Traditional Clinical Factors
The prognostic value of the metabolic signature and traditional prognostic factors was assessed using Cox proportional hazard models. Variables with p < 0.05 in the univariate analyses were used for further multivariate analysis. The final Cox model retained significant variables as independent predictors (p < 0.05), including the Met score (continuous variable), sex (male vs. female), overall stage (III vs. IVa), and EBV-DNA level (<4000 vs. ≥4000 copies/mL). For the multivariable Cox regression model, coefficients were used to construct the nomogram. The discrimination of the model predicting the 5-year PFS probability was evaluated using the C-index in the training and validation sets. Clinical usefulness was estimated by decision curve analysis (DCA). Finally, a nomogram was used to visualize the regression coefficients of the comprehensive model and was calibrated using a calibration plot [24].
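The quantity plotted in a decision curve is the net benefit at a given threshold probability. The sketch below shows the calculation for a simplified binary outcome (progression within 5 years, yes or no) and deliberately ignores censoring, which a survival-based DCA must handle; the function names and example values are hypothetical.

```python
def net_benefit(predicted_risk, progressed, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold.

    Simplified binary-outcome version: censoring is ignored (assumption).
    """
    n = len(predicted_risk)
    pairs = list(zip(predicted_risk, progressed))
    true_pos = sum(1 for r, y in pairs if r >= threshold and y == 1)
    false_pos = sum(1 for r, y in pairs if r >= threshold and y == 0)
    return true_pos / n - false_pos / n * threshold / (1 - threshold)


def net_benefit_treat_all(progressed, threshold):
    """Net benefit of the reference strategy that treats every patient."""
    prevalence = sum(progressed) / len(progressed)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Hypothetical predicted 5-year progression risks and observed outcomes
risks = [0.10, 0.20, 0.60, 0.80]
outcomes = [0, 0, 1, 1]
nb_model = net_benefit(risks, outcomes, threshold=0.15)
nb_all = net_benefit_treat_all(outcomes, threshold=0.15)
print(nb_model, nb_all)
```

A model is considered clinically useful over the range of thresholds where its net benefit exceeds both the treat-all and treat-none (net benefit 0) strategies.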

Statistical Analysis
Data analyses were performed using R software (version 4.1.0). Participants were randomly assigned to the training and validation sets at a ratio of 7:3 using the R package "IOBR" [25]. Metabolomic data were normalized using the Z-score method. Metabolites related to PFS were screened by univariate Cox regression (FDR < 0.05): the p-value of each Cox proportional hazards regression model was obtained by the Wald test, and p-values were adjusted for the FDR with the Benjamini-Hochberg method using the R function p.adjust. Multivariate Cox regression analysis with selection by the minimum Akaike information criterion (AIC) was then performed to further narrow down the candidate metabolites related to PFS. The Kaplan-Meier analyses and log-rank tests were performed using the R package survival, and survival curves were visualized using the R package survminer. The time-dependent area under the ROC curve was computed using the R package "timeROC". The C-index was calculated using the rcorr.cens function implemented in the Hmisc package. The bootstrap percentile method was used to calculate 95% confidence intervals. The rms package was used for nomograms and calibration curves. Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis was performed on the differential metabolites using the clusterProfiler package in R. All figures and heatmaps were produced using the ggplot2 and pheatmap packages in R, unless otherwise specified. All statistical tests were conducted with a two-sided significance level of 0.05.
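The C-index reported throughout can be understood from a direct pairwise computation. The following Python sketch implements Harrell's concordance index under the convention that a higher score means a higher risk; it is an illustration, not the Hmisc implementation, and the function name and example data are hypothetical.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored data.

    A pair (i, j) is usable when patient i had an observed event and a
    strictly shorter time than patient j. The pair is concordant when
    the patient with the shorter time has the higher risk score; tied
    scores count as half.
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue
        for j in range(n):
            if times[i] < times[j]:
                usable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    return concordant / usable


# Hypothetical follow-up times (months), event indicators, and risk scores
c_index = concordance_index(
    times=[5, 10, 15, 20],
    events=[1, 1, 0, 1],
    risk_scores=[0.9, 0.7, 0.8, 0.1],
)
print(c_index)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ranking of patients by risk.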

Patient Sets and Baseline Characteristics
For this study, we collected 320 pretreatment serum samples from patients with non-distant metastatic LA-NPC. Table 1 shows the baseline characteristics of the patients in the training (n = 224) and validation (n = 96) sets. There were no significant differences in baseline characteristics between the two sets in terms of age, sex, smoking, drinking, BMI, stage, EBV-DNA level, and some biomarkers that have been extensively investigated in NPC and are highly correlated with survival outcomes [34][35][36]. The median follow-up was [...]

Construction of the 9-Metabolite Signature
Widely targeted metabolite profiling of samples was performed using the self-built Metware database (MWDB) developed by MetWare Biotechnology Co. In total, 746 endogenous metabolites were detected using a UPLC-MS/MS system. Metabolite annotation details are described in Table S1.
Twenty-seven metabolites were significantly associated with PFS in the training set (FDR < 0.05) (Table S2). Using an Akaike information criterion (AIC)-based stepwise selection method, we obtained an optimal predictive model including nine metabolites (Figure 2). Among these metabolites, 2-(4-Hydroxyphenyl)ethanol, 11-Ketoetiocholanolone, 4-Guanidinobutyric acid, and tridecanedioic acid were negatively correlated with PFS, whereas Leu-Gly, bis(1-inositol)-3,1′-phosphate 1-phosphate, and N-(3-Indolylacetyl)-L-alanine were correlated with positive clinical outcomes. We constructed a metabolic risk score (Met score) with the Cox regression coefficients of each metabolite in the 9-metabolite signature: Met score = (0.2630 × abundance of MEDN0554) + [...]
The Met score for each patient in the training set is shown in Figure 3a. Patients were stratified into high- and low-risk groups based on the median Met score (cutoff = 0.92). The Met-score distribution was significantly different between the groups with and without progression (p < 0.001, Figure 3b). A heatmap of significantly different metabolites is shown in Figure 3c to facilitate understanding of the prognostic signature.
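The Met score is a weighted sum of metabolite abundances, followed by a split at the training-set median. The Python sketch below illustrates this calculation. Only the coefficient 0.2630 for MEDN0554 and the cutoff 0.92 come from the text; the other two metabolite names and coefficients are hypothetical placeholders (the full signature has nine terms), and assigning scores above the cutoff to the high-risk group is an assumption.

```python
# Coefficient for MEDN0554 is from the text; the other entries are
# hypothetical placeholders standing in for the remaining terms.
COEFFICIENTS = {
    "MEDN0554": 0.2630,
    "metabolite_B": -0.15,  # hypothetical
    "metabolite_C": 0.31,   # hypothetical
}
MEDIAN_CUTOFF = 0.92  # training-set median Met score reported in the text


def met_score(abundances, coefficients=COEFFICIENTS):
    """Weighted sum of standardized metabolite abundances."""
    return sum(coef * abundances[name] for name, coef in coefficients.items())


def risk_group(score, cutoff=MEDIAN_CUTOFF):
    """Assign 'high' above the cutoff, 'low' otherwise (assumed rule)."""
    return "high" if score > cutoff else "low"


# Hypothetical standardized abundances for one patient
patient = {"MEDN0554": 2.0, "metabolite_B": 1.0, "metabolite_C": 1.0}
score = met_score(patient)
print(round(score, 3), risk_group(score))
```

Using the same fixed cutoff in the validation set, rather than recomputing a median there, keeps the validation independent of the validation data.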

Prognostic Validation of 9-Metabolite Signature
We assessed the relationship between the 9-metabolite signature and PFS in the training set and validated it in the validation set. KM survival analysis was used to compare survival between the risk groups in each set. Patients were divided into low- and high-risk groups based on the median Met score (cutoff = 0.92). The KM curves showed that the prognosis of the high-risk group was worse than that of the low-risk group in the training set (Figure 4a; log-rank p < 0.001; HR = 3.40, 95% CI 2.13-5.40). Five-year PFS was 53.7% (95% CI 45.12-63.86) for the high-risk group and 83.0% (95% CI 76.31-90.26) for the low-risk group. When patients in the validation set (n = 96) were stratified according to their Met score with the same cutoff value used in the training set, the 5-year PFS rates for the low-risk (n = 52, 54.2%) and high-risk (n = 44, 45.8%) groups were 90.9% (95% CI 82.8-99.8) and 58.6% (95% CI 46.4-74.1), respectively (Figure 4b; log-rank p = 0.003; HR = 3.68, 95% CI 1.68-8.08). Time-dependent ROC curves for each specified time point were used to assess the accuracy of the 9-metabolite signature in predicting PFS in LA-NPC patients. The area under the curve (AUC) for predicting 5-year PFS was 0.76 in the training set (Figure 4c) and 0.75 in the validation set (Figure 4d). Furthermore, the C-index of the metabolite-based model also indicated good prognostic power (Table 2). Next, we explored the prognostic value of the 9-metabolite signature for other endpoints. In the training set, a high Met score was associated with a higher risk of distant metastasis (HR 2.76, 95% CI 1.52-5.03, p = 0.002; Figure S3a), local recurrence (HR 4.21, 95% CI 2.12-8.33, p < 0.001; Figure S3c), and death (HR 2.45, 95% CI 1.26-4.75, p = 0.012; Figure S3e). This trend was also observed for DMFS, LRFS, and OS in the validation set, although it was not statistically significant (Figure S3b,d,e). The ROC curve shown in Figure S4 indicates [...]

Development of an Individualized Prognostic Model
We developed a prognostic model considering both the 9-metabolite signature and other clinical factors for predicting PFS. In univariate analysis, Met score and clinical factors including EBV-DNA level, overall stage, smoking, sex, and CRP were found to be significantly associated with the 5-year PFS (Table 3). The results showed that our constructed 9-metabolite signature could serve as a robust and novel biomarker for predicting prognosis.

Performance and Validation of the Prognostic Nomogram
A nomogram for individualized PFS prediction was constructed using these variables (Figure 5a). Compared with the traditional prognostic model (sex, overall stage, and pretreatment EBV-DNA), the model that additionally included the Met score showed better discrimination in the training (C-index, 0.77 vs. 0.71) and validation sets (C-index, 0.72 vs. 0.67) (Table 2). The calibration curves plotted at the 5-year time point visually confirmed a good fit between the nomogram-predicted PFS probability and the observed PFS rates in the two sets (Figure 5b,c). DCA was used to evaluate the clinical utility of the nomogram. The DCA for the 5-year PFS nomogram showed that, for a threshold probability >15%, using the comprehensive nomogram to predict the 5-year PFS added more net benefit than the traditional model in the training set (Figure 5d), a finding that was confirmed in the validation set (Figure 5e).

Metabolite Set Enrichment Analysis
The metabolites that were significantly related to PFS were analyzed for enrichment in KEGG pathways using the KEGG compound database. We observed that metabolites were enriched in four pathways: tyrosine metabolism, inositol phosphate metabolism, fructose and mannose metabolism, and arginine and proline metabolism (Figure 6b). The identified PFS-related metabolites were widely involved in amino acid and energy metabolism. The tyrosine metabolism pathway was the most significantly enriched pathway (p < 0.05, Figure 6a).
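Pathway over-representation of this kind is commonly tested with a hypergeometric tail probability. The sketch below shows that calculation in Python; it illustrates the general technique and is not a description of the exact settings used here. The background of 746 detected metabolites and the 27 PFS-related metabolites come from the text, whereas the pathway size, the number of hits, and the choice of background are hypothetical assumptions.

```python
from math import comb


def enrichment_p_value(hits, selected, pathway_size, background):
    """Upper-tail hypergeometric probability P(X >= hits).

    hits         : selected metabolites annotated to the pathway
    selected     : number of selected (PFS-related) metabolites
    pathway_size : background metabolites annotated to the pathway
    background   : total number of background metabolites
    """
    upper = min(selected, pathway_size)
    total = comb(background, selected)
    tail = sum(
        comb(pathway_size, k) * comb(background - pathway_size, selected - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# 746 and 27 are from the text; a pathway of 20 metabolites with 4 hits
# is a hypothetical example.
p_value = enrichment_p_value(hits=4, selected=27, pathway_size=20, background=746)
print(p_value)
```

In practice the p-values from all tested pathways would also be adjusted for multiple testing before interpretation.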

Discussion
In our study, we developed an individualized metabolomics-based biomarker, named the 9-metabolite signature, for predicting the PFS of patients with LA-NPC. A prognostic model integrating metabolites and important clinical characteristics performed better in the training and validation sets than the model based on conventional clinical data. The study focused on investigating new prognostic factors to complement the TNM staging system and improve prognostic accuracy in patients with LA-NPC, so that more aggressive treatment plans can be offered to high-risk groups. To our knowledge, this is the first large-scale study to focus exclusively on the relationship between patient prognosis and pretreatment serum metabolomics in advanced NPC.
The reprogramming of cellular metabolism is one of the hallmarks of cancer transformation and progression [23,37]. The metabolites are closely related to the phenotype and dynamic changes in the organism, which are generally used as biomarkers for cancer diagnosis, progression, and assessment of therapeutic efficacy [22,38]. Thus, high-throughput metabolomics can provide more comprehensive information about tumors. Many studies have explored the association between metabolic characteristics and the occurrence and progression of various types of cancer [27,39].
However, metabolomics research in NPC is still in its infancy. Previous studies mainly focused on the early diagnosis of NPC and detected serum metabolites in small cohorts of patients with NPC and normal controls by GC-MS-based metabolic profiling [29,30]. Research regarding the prognostic significance of metabolomics in NPC is less extensive. Tang et al. [29] observed changes of metabolites in 19 patients with primary NPC at three time periods after radiotherapy. They found that high levels of kynurenine, N-acetylglucosamine, N-acetylglucosaminylamine, and hydroxyphenylpyruvate were related to cancer recurrence and distant metastasis. The drawbacks of these works include the use of a small cohort, the lack of an independent validation set, the relatively short follow-up period, and the absence of evaluation of other clinical variables related to survival. Compared with earlier studies, our study has several differences and advantages. We analyzed a relatively larger cohort, with hundreds of samples available for metabolomic analysis, and performed a 5-year follow-up. In addition, a widely targeted metabolomics approach using ultra-performance liquid chromatography-tandem mass spectrometry (UPLC-MS/MS) was used to investigate serum metabolites and produce a metabolic profile in LA-NPC patients. This technology not only enables high-throughput analysis but also ensures reliable high-quality data with higher resolution separations [40]. It broadens the scope of potential biomarker discovery by improving the detection limit of low-abundance metabolites. Additionally, we added the Met score to clinical data in developing the prognostic model and assessed its performance in a validation set. Notably, the Met score remained an independent prognostic factor after the inclusion of conventional clinical biomarkers as covariates in the multivariable analysis.
In our study, we combined a newly developed metabolomics-based biomarker with reliable clinical prognostic factors including sex, overall stage, and pretreatment EBV-DNA level to add more value for prognostic prediction and clinical application [41][42][43]. This complementary ability seems to be practical because serum metabolomic profiling is more convenient and cheaper than other omics methods. This nomogram provides a simple and accurate tool for predicting prognosis in patients with LA-NPC before treatment.
Most of the metabolites screened in this study have previously been reported to be related to the occurrence and development of diseases, which is consistent with our results. Luo et al. found that 2-(4-Hydroxyphenyl)ethanol was involved in inflammation and NF-κB activation [44]. Etiocholanolone is an excreted metabolite of testicular hormone, and 11-Ketoetiocholanolone is a metabolite of etiocholanolone [45]. We observed that 11-Ketoetiocholanolone was significantly negatively correlated with a good prognosis in LA-NPC, which may be consistent with the poorer prognosis observed in male patients. Zeebroeck discovered that the dipeptide L-Leu-Gly, a non-transport agonist of the active transporter receptor, could activate the PKA pathway, which may promote the proliferation of tumor cells [46]. Studies have demonstrated that dopaquinone is an intermediate in melanogenesis that can produce toxic oxygen free radicals and cause cellular damage [47]. In addition, 4-Guanidinobutanoic acid, a gamma-amino acid and uremic toxin, has been shown to be increased in early renal cell carcinoma compared with healthy individuals [48]. Tridecanedioic acid, an unusual fatty acid related to peroxisomal disorders, can discriminate healthy people from patients with psoriatic arthritis, as reported in previous literature [49,50]. In contrast, few reports are available on the relationship between disease and bis(1-inositol)-3,1′-phosphate 1-phosphate, sorbitol 6-phosphate, or N-(3-Indolylacetyl)-L-alanine. The roles of these metabolites in NPC and other diseases are currently unclear, and our results suggest that further research is needed.
KEGG enrichment analysis identified metabolites annotated using the KEGG compound database. The tyrosine metabolism pathway was the most significantly enriched pathway (p < 0.05). Several studies have confirmed that tyrosine metabolism is an important process that is often dysregulated in cancer development and progression [51][52][53]. Cheng et al. [54] observed that tyrosine metabolism is disturbed in esophageal squamous cell carcinoma patients, and the metabolites involved in the tyrosine pathway can be used as diagnostic biomarkers of the disease. These results suggest a relationship between tyrosine metabolism and nasopharyngeal carcinoma that needs to be investigated in future studies.
Finally, we developed an integrated prognostic model incorporating the 9-metabolite signature, gender, overall stage, and pretreatment EBV-DNA level. In addition to the Met score, the traditional prognostic factors included in this integrated model have been proven to be powerful independent prognostic factors for NPC in previous studies. Previous research has shown that male patients are more likely to have inferior overall survival and progression-free survival rates than female patients [41,55], which is consistent with our findings. Male and female patients with NPC may exhibit differences in tumor behavior, indicating potential biological distinctions. These gender disparities in NPC incidence and prognosis may be shaped by genetic variations that are influenced by hormonal factors. The TNM stage remains an important clinical factor that affects cancer patient prognosis, as it dictates the overall clinical stage [42]. Furthermore, multiple studies have shown that pre-treatment EBV-DNA is closely correlated with the tumor burden and prognosis in patients with NPC [56,57]. In our study, we combined these reliable clinical prognostic factors with a newly developed metabolomics-based biomarker to enhance the prognostic prediction and clinical application of our model.
The results showed that the integrated model was better than the traditional model in predicting 5-year PFS in locally advanced nasopharyngeal carcinoma (training set: C-index 0.77 vs. 0.71; validation set: C-index 0.72 vs. 0.67). A likely reason is the limitations of the TNM staging system [42,43]. This staging system mainly focuses on the macroscopic information of tumors provided by imaging but omits the biological characteristics of tumors and cannot reflect tumor heterogeneity. As a powerful tool, metabolomics can provide more comprehensive tumor information through the analysis of circulating metabolites in vivo. Therefore, it can complement traditional prognostic factors, improve the accuracy of prognosis prediction, and help provide active treatment options for high-risk groups in a timely manner. In addition, the metabolic signature developed in this study has potential clinical application value. Compared with other omics approaches, serum metabolomics uses samples that are convenient to obtain and is less expensive, so it does not add to the economic burden of patients. This metabolic signature may provide a simple and accurate prognostic prediction tool for locally advanced NPC patients before treatment.
The present study had certain limitations. First, all study participants were enrolled from a single institution in endemic NPC areas, which may have enhanced the standardized sample collection. However, these observations require confirmation in an external validation set so that our prognostic model could become more widely applicable. Second, the biological functions and underlying mechanisms of these significant metabolites related to poor prognosis require further investigation. Third, as the metabolome is highly dynamic and changes in response to the occurrence and progression of the disease, longitudinal, continuous, and prospective metabolomic profiling during the treatment process is necessary to capture more precise biological information. Finally, we will use multi-omics data to explore new insights into survival prediction in NPC in the future.

Conclusions
In conclusion, we developed and validated a signature based on a panel of metabolites as a novel biomarker to predict PFS in patients with LA-NPC (stages III-IVa). Nine metabolites that were closely associated with PFS were identified via highly sensitive UPLC-MS/MS-based widely targeted metabolomic analysis. The 9-metabolite signature can effectively distinguish LA-NPC patients at high or low risk of progression so that appropriate treatment strategies can be implemented. The prognostic model combining the metabolic signature with the staging system and clinical data showed robust power for predicting PFS before treatment, which may have translational implications in clinical practice.