The Prognostic Value and Potential Mechanism of Tumor-Nutrition-Inflammation Index and Genes in Patients with Advanced Lung Cancer

Background. Lung cancer (LC) has the highest mortality rate of any cancer worldwide. Novel biomarkers that are easily accessible and inexpensive are needed to identify patients with LC at an early stage.
Methods. A total of 195 patients with advanced LC who had received first-line chemotherapy were included in this study. The optimized cut-off values of AGR and SIRI (AGR = albumin/globulin; SIRI = neutrophil × monocyte/lymphocyte) were determined by survival function analysis in R software. Cox regression analysis was performed to identify the independent prognostic factors, and a nomogram model comprising these factors was built to calculate the TNI (tumor-nutrition-inflammation index) score. Predictive accuracy was assessed with the concordance index (C-index), ROC curves, and calibration curves.
Results. The optimized cut-off values of AGR and SIRI were 1.22 and 1.60, respectively. Cox analysis revealed that liver metastasis, SCC, AGR, and SIRI were independent prognostic factors in advanced lung cancer. The nomogram model comprising these independent prognostic factors was then built to calculate TNI scores, and patients were divided into four groups based on the TNI quartile values. Kaplan–Meier analysis and the log-rank test indicated that patients with higher TNI had worse OS (P < 0.05). The C-index and 1-year AUC were 0.756 (0.723–0.788) and 75.62, respectively. The calibration curves showed high consistency between predicted and actual survival proportions in the TNI model. In addition, the tumor-nutrition-inflammation index and genes play an important role in LC development and might affect pathways related to tumor development at the molecular level, including the cell cycle, homologous recombination, and the P53 signaling pathway.
Conclusion. TNI might be a practical and precise analytical tool for predicting the survival of patients with advanced LC. The tumor-nutrition-inflammation index and genes play an important role in LC development. A preprint has previously been published [1].


Introduction
Lung cancer (LC), which has the highest mortality rate worldwide, is the third most common cancer type behind breast and prostate cancers [1,2]. Most LC patients are diagnosed at an advanced stage, with a poor prognosis and short survival time. Therefore, it is vital to explore potential biomarkers that may predict survival time and identify patients who may benefit from early treatment. Immune checkpoints, including programmed cell death protein 1/programmed cell death 1-ligand 1 and cytotoxic T-lymphocyte-associated protein 4, are widely accepted biomarkers in immunotherapy. Other biomarkers such as EGFR, RAS, and TP53 are widely used in targeted therapy for LC. However, detecting these biomarkers requires costly invasive procedures, including obtaining pathological tissue. Therefore, it is desirable to identify potential biomarkers that are easily accessible and inexpensive.
Laboratory blood tests, including indicators such as absolute white cell counts, albumin, globulin, neutrophils, monocytes, and lymphocytes, are widely used in clinical practice. Previous studies have suggested that these blood indicators could be used as predictive and prognostic biomarkers for various tumors, including LC [3-5]. Albumin and globulin are the most common clinical nutritional indicators, whereas inflammatory indicators, including neutrophils, monocytes, and lymphocytes, are commonly used to reflect the inflammatory state. It has also been reported that low AGR or high SIRI is associated with poor survival outcomes [6-8]. Nutritional and inflammatory indicators are usually related to the prognosis of patients with cancer [2,9,10]. Nevertheless, these studies have limitations: they focus on a single marker, on patients at a certain stage, or on a particular cytological classification. Few studies have investigated the association between combined factors and the prognosis of advanced LC. This study aimed to explore the prognostic significance and potential mechanisms of integrated nutritional and inflammatory values and genes in patients with advanced LC.

Patients.
A retrospective analysis was conducted on patients with a definite diagnosis of stage IV LC who were treated in the respiratory medicine department of the Fourth Affiliated Hospital of Zhejiang University School of Medicine over the 5 years from February 2015 to December 2019. The inclusion and exclusion criteria were as follows. Inclusion criteria were: at least 18 years of age with a definite diagnosis of stage IV LC by CT or MRI imaging and pathological examination; initial first-line chemotherapy received at the Fourth Affiliated Hospital; ECOG PS 0-1; complete clinical and follow-up information; and sufficient pretreatment routine blood laboratory test data.
Exclusion criteria were: no diagnosis of LC; duplicate records (repeated name and hospital admission number); incomplete clinical information; insufficient follow-up information; no available routine blood laboratory data; and a history of acute infection.
The study was approved by the Research Ethics Committee of the Fourth Affiliated Hospital of Zhejiang University School of Medicine (reference number: K2021063).

Data Collection.
Clinical information was collected from the electronic medical record system. CT or MRI imaging and pathology examinations were performed by at least two professional physicians. Laboratory test data were obtained from peripheral blood samples taken within two weeks prior to first-line chemotherapy.

Follow-Up.
All patients were followed up every three months. The endpoint was overall survival (OS), defined as the interval from the start of first-line chemotherapy to the time of death or of the patient's last follow-up. All patients were followed up to January 2021.

The Evaluation of AGR and SIRI.
The albumin, globulin, neutrophil, monocyte, and lymphocyte values were collected to calculate the AGR and SIRI (AGR = albumin/globulin; SIRI = neutrophil × monocyte/lymphocyte). The optimized cut-off values were determined by survival function analysis using R software 3.6.2, and all included patients were dichotomized into elevated and low groups accordingly. The cut-off values of AGR and SIRI were 1.22 and 1.60, respectively.
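The two ratios and their dichotomization can be illustrated with a minimal Python sketch. It is not the authors' R code. The class `BloodPanel`, the function names, and the example values are hypothetical, and the text does not state how a value exactly equal to the cut-off is handled, so assigning it to the elevated group is an assumption.

```python
from dataclasses import dataclass

# Cut-off values reported in the text.
AGR_CUTOFF = 1.22
SIRI_CUTOFF = 1.60


@dataclass
class BloodPanel:
    """Pretreatment blood values (hypothetical container; units must be
    consistent within each ratio)."""
    albumin: float
    globulin: float
    neutrophils: float
    monocytes: float
    lymphocytes: float


def agr(p: BloodPanel) -> float:
    """AGR = albumin / globulin."""
    return p.albumin / p.globulin


def siri(p: BloodPanel) -> float:
    """SIRI = neutrophil * monocyte / lymphocyte."""
    return p.neutrophils * p.monocytes / p.lymphocytes


def classify(p: BloodPanel) -> dict:
    """Dichotomize both indices at the reported cut-offs.

    Assumption: a value equal to the cut-off falls in the elevated group.
    """
    return {
        "AGR": "elevated" if agr(p) >= AGR_CUTOFF else "low",
        "SIRI": "elevated" if siri(p) >= SIRI_CUTOFF else "low",
    }


# Illustrative values only, not patient data.
example = BloodPanel(albumin=40.0, globulin=30.0,
                     neutrophils=4.0, monocytes=0.5, lymphocytes=1.0)
print(classify(example))
```

In the study, low AGR and high SIRI are the adverse categories, so a patient classified as `"AGR": "low"` or `"SIRI": "elevated"` would fall in the corresponding higher-risk group.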

The Analysis of Biological Functions and Pathways.
The dataset was downloaded from TCGA-LUAD (https://portal.gdc.cancer.gov/) and included 522 LUAD patients with complete clinical information, survival information, gene expression data, etc. A total of 389 CRP-ALB-related genes were obtained from https://www.gsea-msigdb.org/gsea; the neutrophil-related gene set was taken from HP_ABNORMALITY_OF_NEUTROPHILS, and the albumin-related gene set from HP_HYPOALBUMINEMIA. Differential genes between tumor and nontumor tissues were identified with the limma package using the criterion P < 0.05. Independent prognostic risk genes were screened by univariate and multivariate Cox regression and LASSO regression using the survival and glmnet packages in RStudio. KEGG functional enrichment analysis was performed to explore the potential mechanisms.

Statistical Analysis.
All statistical analyses were performed using R software (version 3.6.2), SPSS (IBM SPSS Statistics, version 20.0), and GraphPad Prism (version 6). Count data are presented as continuous values or percentages. The chi-square test, Fisher's exact test, and Bonferroni correction were used to compare categorical variables. Kaplan-Meier (KM) survival curves and log-rank tests were used to compare OS between the groups of categorical variables. Univariate and multivariate Cox regression analyses were performed to identify significant independent prognostic factors. Statistical significance was set at P < 0.05.
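For readers unfamiliar with the KM method named above, the product-limit estimate can be sketched in a few lines of plain Python. This is an illustration of the general technique under stated assumptions, not the analysis code used in the study (which was run in R). The function name `kaplan_meier` and the toy data are hypothetical.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Product-limit (Kaplan-Meier) estimate of the survival function.

    times  : follow-up duration of each patient (e.g. months)
    events : 1 if death was observed at that time, 0 if censored
    Returns a list of (time, survival probability) pairs, one per
    distinct time at which at least one death occurred.
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)          # deaths + censorings at each time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            # Patients censored at t are still counted as at risk at t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve


# Toy data only: six patients, two of them censored.
toy_times = [5, 8, 8, 12, 15, 20]
toy_events = [1, 1, 0, 1, 0, 1]
for t, s in kaplan_meier(toy_times, toy_events):
    print(f"t = {t:>2}  S(t) = {s:.3f}")
```

Median OS is the first time at which the estimated curve drops to 0.5 or below. When the curve never reaches 0.5 within follow-up, the median is not estimable.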

Patient Selection.
This retrospective analysis initially included 945 patients with LC who were hospitalized in the respiratory medicine department of the Fourth Affiliated Hospital of Zhejiang University School of Medicine over the 5 years from February 2015 to December 2019. Based on the inclusion and exclusion criteria, 395 patients lacked complete information, 20 had not received first-line chemotherapy, 277 lacked follow-up information, and 108 lacked sufficient laboratory test data. Finally, 195 patients were included in this retrospective study. R software was used to randomly split the patients in a 7 : 3 ratio: the nomogram prediction model was established in the training group of 136 patients, and the other 59 patients were assigned to the validation cohort. All patients together were assigned to the testing cohort to assess the model (Figure 1).
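The 7 : 3 random split can be sketched as follows. The study performed this step in R; the function name `split_cohort` and the seed are hypothetical, and rounding the training size down is an assumption chosen because it reproduces the reported 136/59 split.

```python
import random


def split_cohort(patient_ids, seed=2021):
    """Randomly split patients into training and validation sets (7 : 3).

    The seed is arbitrary and only makes the sketch reproducible.
    The training size is rounded down (integer arithmetic).
    """
    ids = list(patient_ids)
    rng = random.Random(seed)
    rng.shuffle(ids)
    n_train = len(ids) * 7 // 10
    return ids[:n_train], ids[n_train:]


# 195 placeholder identifiers standing in for the included patients.
training, validation = split_cohort(range(195))
print(len(training), len(validation))  # 136 59
```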

Association between AGR, SIRI, and OS.
The characteristics of the training cohort are summarized in Table 1. The median age was used as the cut-off value for age. The cut-off values of CRP, CEA, and CA199 were defined as the upper limit of the normal range set by the Fourth Affiliated Hospital of Zhejiang University School of Medicine. The training cohort consisted of 96 (70.6%) men and 40 (29.4%) women. Table 1 shows that low AGR was significantly associated with other prognostic characteristics, including no history of LC operation (P = 0.005), body mass index (BMI) of ≥18.5 (P = 0.008), carcinoembryonic antigen (CEA) of ≥5 (P = 0.018), and an increased C-reactive protein (CRP) level (P < 0.001). High SIRI was significantly associated with gender (P = 0.002), pathology (P = 0.008), and CRP (P < 0.001).
Subsequently, univariate and multivariate Cox regression analyses were performed on the variables that were significant in Table 1 or clinically meaningful. A history of LC operation, liver metastasis, history of smoking, BMI, CA199, squamous cell carcinoma antigen (SCC), CRP, AGR, and SIRI were significantly associated with OS (P < 0.05; Figure 2). Multivariate Cox proportional hazards analysis revealed that liver metastasis, SCC, AGR, and SIRI were independent prognostic factors in advanced LC (P < 0.05; Figure 3).
To explore the prognostic value of AGR and SIRI in patients with advanced LC, KM analysis and the log-rank test were performed. The relationship between low AGR and poorer OS was statistically significant in the training set (hazard ratio [HR] = 2.435 [1.55-4.88], P = 0.007; Figure 4(a)). The low AGR group had a lower 5-year OS rate (0% vs. 42.3%) and a shorter median OS time (15.0 months vs. 30.3 months) than the elevated AGR group. Similarly, patients in a hyperinflammatory state, that is, with a high SIRI level, had a lower 5-year OS rate (0% vs. 54.9%) and a shorter median OS time (16.7 months vs. NA; HR = 3.135 [1.77-5.24]; P < 0.001; Figure 4(d)). Similar results were confirmed in the validation and testing sets (P < 0.05; Figures 4(b), 4(c), 4(e), and 4(f)).

The Analysis of the Prognostic Value of TNI.
The potential value of the clinical factors in the training set was further explored. SCC and liver metastasis are known to be important biomarkers for LC screening [11,12]. However, not all patients with high SCC or liver metastasis have a poor survival time, and SCC or liver metastasis alone is insufficient as a prognostic biomarker for patients with advanced LC. Therefore, more prognostic biomarkers for LC need to be explored. To predict survival precisely and quantitatively, a nomogram model based on the relevant parameters was established. The total points were calculated by summing the scores of the parameters in the nomogram shown in Figure 5(a). Liver metastasis had the largest interval, while the SCC risk score had the smallest range in this model. The total points were defined as the TNI, which was calculated for each patient based on the model. This gives the formula: TNI = 10 × liver metastasis (yes) + 5.37 × SCC (high) + 5.69 × AGR (low) + 5.55 × SIRI (high), where each term is 1 if the condition is present and 0 otherwise. TNI scores were calculated using R software for each patient with advanced LC. Then, all included patients were divided into four groups based on their TNI quartile values. KM analysis and the log-rank test indicated that the high-risk TNI group had significantly poorer OS than the other groups, as shown in Figure 5.
To verify whether the nomogram model is applicable to both the training and validation sets, the TNI score for each patient was computed in the same manner as in the testing set. The survival curves remained statistically significant, as plotted in Supplementary Figures 1A and 1B (P < 0.05). To further validate the predictive ability of the nomogram model, the concordance index (C-index) and time-dependent receiver operating characteristic (ROC) curves were computed in RStudio for the SCC combined with liver metastasis model, the AGR combined with SIRI model, and the TNI model. The C-index was 0.658 (0.621-0.694), 0.703 (0.666-0.739), and 0.756 (0.723-0.788), respectively. The 1-year AUC values were 68.93, 67.34, and 75.62, respectively (Supplementary Table 1; Supplementary Figure 1C). This demonstrated that TNI had a higher predictive ability than the other two models. Calibration curves at 1 year, 2 years, and 3 years showed high consistency between predicted and actual survival proportions for the TNI model in the training, validation, and testing sets (Supplementary Figures 2A-2I).
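The TNI formula and the quartile grouping described above can be sketched in Python as follows. The point values are those reported in the text. Everything else is an assumption made for illustration: the names `POINTS`, `tni_score`, and `quartile_group`, the toy cohort, the quantile method (the default of `statistics.quantiles`), and the handling of scores that tie with a quartile boundary. The study itself computed the scores in R.

```python
import statistics

# Nomogram points reported in the text for each adverse factor.
POINTS = {
    "liver_metastasis": 10.0,
    "scc_high": 5.37,
    "agr_low": 5.69,
    "siri_high": 5.55,
}


def tni_score(liver_metastasis, scc_high, agr_low, siri_high):
    """Sum the points of every adverse factor that is present."""
    present = {
        "liver_metastasis": liver_metastasis,
        "scc_high": scc_high,
        "agr_low": agr_low,
        "siri_high": siri_high,
    }
    return sum(POINTS[name] for name, flag in present.items() if flag)


def quartile_group(score, cut_points):
    """Map a score to group 1 (lowest) .. 4 (highest).

    cut_points holds the three quartile boundaries (Q1, Q2, Q3).
    Assumption: a score equal to a boundary stays in the lower group.
    """
    return 1 + sum(score > q for q in cut_points)


# Toy cohort only: (liver metastasis, SCC high, AGR low, SIRI high).
toy_cohort = [
    (False, False, False, False),
    (False, True, False, False),
    (False, False, True, True),
    (True, False, False, True),
    (True, True, True, False),
    (True, True, True, True),
]
scores = [tni_score(*patient) for patient in toy_cohort]
cuts = statistics.quantiles(scores, n=4)
for score in scores:
    print(f"TNI = {score:5.2f}  group = {quartile_group(score, cuts)}")
```

Because the score is built from four binary factors, it can take at most 16 distinct values (0 to 26.61), so several patients will often share a score and quartile groups may be unequal in size.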
The clinical characteristics of the total population according to TNI score are shown in Supplementary Table 2. We then performed univariate and multivariate Cox regression analyses, as shown in Supplementary Table 3. BMI, CRP, and TNI were independent prognostic factors in patients with advanced LC (P < 0.05). A nomogram prognostic model was established to predict survival rates according to these three independent risk factors in the total population (Figure 6). Using the prognostic model, the survival rate of patients with advanced LC can be read off intuitively.
The treatment of LC has entered an era of precision medicine, so subgroup analyses were conducted to explore the potential significance of TNI. Patients were first divided into EGFR-mutation and non-EGFR-mutation groups. The optimized cut-off value was obtained using the R package, as shown in Supplementary Figures 3A and 3B. In both subgroups, patients with low TNI showed longer survival time. Additionally, patients were separated into chemotherapy-only and targeted or immunotherapy groups according to the first-line regimen, and survival was compared between TNI groups within each (Supplementary Figures 3C and 3D). We therefore conclude that TNI may be a potential biomarker for patient screening and for informing the choice of treatment regimen.

The Potential Mechanism Exploration of Tumor-Inflammation-Nutrition Genes.
The results above showed that inflammation and nutritional levels play a crucial role in tumor development, so we explored the potential mechanisms by which neutrophil- and albumin-related genes affect lung cancer. Differential gene screening, univariate and multivariate Cox regression analyses, and LASSO regression identified 11 independent prognostic tumor-neutrophil-albumin-associated genes (AK2, BTK, DMD, DSG2, EIF2AK3, PIK3CG, PRKCD, RFXAP, ANLN, MYO1E, OSGEP) (Table 2). We then performed pathway enrichment analysis using these genes, and the results indicated that these differential tumor-neutrophil-albumin-associated genes may be involved in the following pathways: cell cycle, homologous recombination, P53 signaling pathway, pyrimidine metabolism, pathogenic Escherichia coli infection, alpha-linolenic acid metabolism, arachidonic acid metabolism, aldosterone-regulated sodium reabsorption, vascular smooth muscle contraction, and autoimmune thyroid disease (Figures 7(a)-7(j)). The p53 protein is a nuclear transcription factor that regulates the expression of a wide variety of genes involved in apoptosis, growth arrest, or senescence in response to genotoxic or cellular stress. An abnormal cell cycle and abnormal homologous recombination could cause overproliferation of cells and an accumulation of abnormal cells. Therefore, the genes above may substantially influence some tumor signaling pathways during tumor progression.

Discussion
In this study, we found that liver metastasis, SCC, AGR, and SIRI were significant independent prognostic factors in patients with advanced LC, which is consistent with previous studies [8,13]. Liver metastasis and SCC are widely accepted biomarkers in clinical studies. AGR has been reported to be related to long-term survival in various tumors, and it has been suggested that AGR is an independent prognostic factor [7,14]. A previous meta-analysis of 12 studies on AGR and gastric cancer outcomes demonstrated that a higher AGR was associated with longer survival time [15]. SIRI has also been demonstrated to be an effective prognostic biomarker for solid tumors, including stage III non-small cell LC [8,16-19]. However, the use of these indicators as independent prognostic biomarkers remains controversial. On the one hand, the nomogram of TNI captures the relationship among the variables in the prediction model: it is based on multifactor regression analysis, integrates multiple predictors, and represents them as scaled line segments drawn on the same plane. Complex regression equations are thereby transformed into visual graphs that make the results of the prediction model more readable, intuitive, and convenient for patient assessment, and easy to understand in medical research and clinical practice. On the other hand, similar nomograms have been used in the past to evaluate tumor prognosis [20,21], but no such model exists for TNI. Therefore, TNI may be a potential biomarker for effectively and precisely predicting the survival rate of patients with advanced LC. This study contributes to the literature because it is the first to combine nutritional, inflammatory, and clinical indicators to establish an integrated biomarker and nomogram model for predicting survival outcomes. This provides a practical analytical tool for more accurate prediction of survival outcomes in patients with advanced LC.
Albumin, synthesized in the liver, is considered the most essential protein in human plasma. It supports several bodily functions, including nutrition, maintenance of osmotic pressure, and the binding and transport of hormones, pharmaceuticals, fatty acids, and cations [13]. Serum albumin level is closely related to nutritional status. Some previous studies have revealed that albumin participates in the inflammatory response process [22,23]. It has also been shown that upregulation of albumin promotes tumor proliferation and metastasis by activating the expression of tumor necrosis factor-α, interleukin-1, and interleukin-6 [24]. Moreover, albumin nanovectors, such as albumin paclitaxel, play a crucial role in drug delivery and in increasing drug availability [25]. Low albumin levels are associated with poor liver function, and patients with advanced cancer have been reported to display a high incidence of malnutrition due to cancer cachexia and cancer-associated bleeding [26]. Globulin contains a large number of immune-related products, which can trigger antigen binding and recognition, complement activation, and Fc receptor binding by stimulating the lymphatic system [27]. Although it has been confirmed that globulin plays an important role in the immune microenvironment [28,29], it remains unclear whether it affects tumor immunotherapy. Studies have shown that neutrophils can promote tumor metastasis through arachidonate 5-lipoxygenase-dependent leukotriene synthesis [30]. In addition, they can inhibit the activation of CD+ T cells and increase the secretion of cathepsin G, neutrophil elastase, and other factors that promote tumor metastasis [31,32]. Monocytes can differentiate into macrophages or dendritic cells, which are involved in the immune response. Studies have confirmed that the CCL2-CC chemokine receptor 2 signaling pathway can be blocked by reducing the activation and proliferation of monocytes, thereby inhibiting tumor cell metastasis [33]. Lymphocytes are cells with an immune recognition function and can be divided into T lymphocytes, B lymphocytes, and natural killer (NK) cells. T lymphocytes, such as CD4+ cells, usually play a crucial role in the tumor immune response, including releasing immunoregulatory factors and inhibiting tumor growth and metastasis [34]. B lymphocytes and NK cells play a crucial role in the tumor immune response and in inhibiting tumor proliferation and metastasis through the secretion of tumor-specific antibodies [35,36]. Combining the above indicators would therefore yield a valuable prognostic indicator. In addition, it may affect some pathways related to tumor development at the molecular level, including the cell cycle, homologous recombination, and the P53 signaling pathway. However, whether these genes actually affect lung cancer invasion and metastasis at the molecular level, and whether they affect overall antitumor efficacy, remains unknown. This mechanism needs to be further explored.
This study has several limitations. First, although the results from the training set were verified in the validation cohort and the testing cohort, this is still a single-center retrospective study, and multicenter retrospective studies with more patients and high-quality prospective studies are needed in the future. Second, this study only included patients with advanced LC; patients who underwent surgery or had concurrent infection were not included. In addition, salvage treatments may have unintentionally altered the results in favor of one group. Third, this study only evaluated the nutritional and inflammatory status of patients with advanced LC before first-line chemotherapy. It is still unknown whether TNI can be used as an indicator for dynamic monitoring during treatment, since the influence of subsequent chemotherapy, radiotherapy, and other antitumor therapies on the overall nutritional and inflammatory status of patients has not been thoroughly explored. Fourth, the follow-up time was short because enrollment of the patients included in this study started in February 2015. The follow-up time should be extended in future studies to make the results more reliable. Finally, the cut-off values adopted in this study were optimized by calculation using R software. It is still uncertain whether these values classify patients better than alternatives, or whether they can be applied in a larger population for business reasons.

Conclusion
In conclusion, to our knowledge, this study is the first to combine nutritional, inflammatory, and clinical indicators to establish an integrated biomarker, TNI, for predicting survival outcomes. It also showed that there are potential molecular mechanisms linking neutrophil-albumin-related genes and LC development. TNI provides a practical analytical tool for more accurate prediction of survival outcomes in patients with advanced LC. This analysis may provide strong support for the selection of clinical treatment strategies.
(H) 2 years in testing set; (I) 3 years in testing set. Each point in the plot refers to a group of patients, with the nomogram-predicted probability of survival shown on the x axis and the actual survival proportion shown on the y axis. Distributions of predicted survival probabilities are plotted at the top. Error bars represent 95% confidence intervals.
Supplementary figure 3: Kaplan-Meier survival curves of different TNI groups in patients with EGFR mutation (A), patients without EGFR mutation (B), patients who received chemotherapy only as the first-line regimen (C), and patients who chose targeted or immunotherapy regimens as the first-line regimen (D).