A innovative prognostic symbol based on neutrophil extracellular traps (NETs)-related lncRNA signature in non-small-cell lung cancer

Neutrophil extracellular traps (NETs) are closely related to cancer progression. NETs-related lncRNAs play crucial roles in non-small-cell lung cancer (NSCLC) but there have been no systematic studies regarding NETs-related long noncoding RNA (lncRNA) signatures to forecast the prognosis of NSCLC patients. It’s essential to build commensurate NETs-related lncRNA signatures. The expression profiles of prognostic mRNAs and lncRNAs and relevant clinical data of NSCLC patients were downloaded from The Cancer Genome Atlas (TCGA) database. The NETs-related genes came from the results of our transcriptome RNA microarray data. The co-expression network of lncRNAs and NETs-related genes was structured to confirm NETs-related lncRNAs. The 19 lncRNAs correlated with overall survival (OS) were selected by exploiting univariate Cox regression (P < 0.05). Lasso regression and multivariate Cox regression (P < 0.05) were utilized to develop a 12-NETs-related lncRNA signature. We established a risk score based on the signature, which suggested that patients in the high-risk group displayed significantly shorter OS than patients in the low-risk group (P < 0.0001, P = 0.0023 respectively in the two cohorts). The risk score worked as an independent predictive factor for OS in both univariate and multivariate Cox regression analyses (HR> 1, P< 0.001). Additionally, by RT-qPCR, we confirmed that NSCLC cell lines have higher levels of the three adverse prognostic NETs-related lncRNAs than normal lung cells. The expression of lncRNAs significantly increases after NETs stimulation. In short, the 12 NETs-related lncRNAs and their model could play effective roles as molecular markers in predicting survival for NSCLC patients.


INTRODUCTION
Lung cancer is the leading cause of tumor-related death worldwide and ranks second in incidence among malignancies [1]. Specifically, the proportion of non-small-cell lung cancer (NSCLC) in diagnosed lung cancer cases is approximately 80-85% [2]. Although advances in chemotherapy, radiation therapy, immunotherapy, and targeted therapy have reduced mortality among patients with NSCLC over the years [3], the long-term cancer-specific survival rates have scarcely been increased, especially compared with those of other cancers [4]. Hence, it is essential to probe the molecular mechanisms of NSCLC progression and to explore more precise tumor prognostic markers that accurately predict the survival of patients with NSCLC.
Neutrophils [5], the most affluent endogenous immune effector cells, can desorb modified chromatin structures decorated with given cytoplasmic and granular proteins called neutrophil extracellular traps (NETs) to respond to specific stimuli, mainly via a cell death process termed NETosis [6]. Commonly, NETs trap, neutralize AGING and kill bacteria, fungi, viruses, and parasites and are thought to prevent bacterial and fungal dissemination [7,8]. Consequently, the generation of NETs and NETosis are deemed evolutionary processes, of which disorder and dysregulation can result in many diseases, such as infection, thrombosis, tissue injury, organ dysfunction, and cancer metastasis [9,10]. However, the molecular mechanisms of NETs in cancer remain poorly understood. Knowledge about these mechanisms may assist in the development of NETs-focused therapeutic interventions.
Long noncoding RNAs (lncRNAs) [11], recognized as transcripts of more than 200 nucleotides that cannot be translated into proteins, were identified to execute an extensive range of functions in various crucial biological activities, such as cell proliferation and differentiation, genetic regulation of gene expression, action variation of the transcriptome, and microRNA (miRNA) regulation [12]. Importantly, lncRNAs contribute to the development of NSCLC in proliferation, migration, invasion, and chemoresistance [13,14]. Previous studies have commonly focused on a single or a few lncRNAs for NSCLC [14][15][16]. Moreover, NETs-related lncRNA expression profiles have not yet been developed to explore novel biomarkers for forecasting the prognosis of NSCLC. Meanwhile, whether NETs encourage the malignant phenotype of NSCLC through some lncRNAs remains largely unknown. Consequently, we aimed to utilize bioinformatics to establish NETS-related lncRNA signatures and to seek new biomarkers to predict the prognosis of patients with NSCLC.

RESULTS
The flow diagram of this research is shown in Figure 1. The RNA-seq data of 1037 NSCLC tissue samples, and 108 healthy lung samples were obtained from The Cancer Genome Atlas (TCGA) and were randomly divided into two groups, a training cohort and a validation cohort, were finally enrolled. Meanwhile, the clinical data of 967 NSCLC patients were obtained from TCGA, and their detailed clinical characteristics are shown in Table 1.

Identification of the candidate NETs-related genes
DEGs between NSCLC patients and healthy people from TCGA were analyzed (Figure 2A, 2B). Simultaneously, we obtained the results of DEGs from the transcriptome RNA microarray in the NETs treated and untreated group ( Figure 2C, 2D). Then, with | log2 (fold change) | ≥ 1 and FDR < 0.05, we performed a comprehensive analysis on 1115 prognostic NSCLCrelated DEGs and 834 NETs-related DEGs, ultimately obtaining 119 genes that were mainly were correlated with NSCLC and NETs ( Figure 3A).

Establishment of a prognostic model in the training cohort
A prognostic model based on the 19 lncRNAs referenced above was built following univariate Cox regression analysis, and 12 lncRNAs were determined through Lasso Cox regression based on 19 lncRNAs. The formula, risk score = 0.044 * expression level of AC020765.2 + 0.079 * expression level of AC090152.1 + 0.133 * expression level of AF131215.5 + 0.067 * expression level of AL035587.1 + 0.044 * expression level of AP000695.2 + 0.059 * expression level of AP004608.1 + 0.216 * expression level of MMP2.AS1 + 0.006 * expression level of MYOSLID + 0.076 * expression level of NKILA + 0.018 * expression level of SCAMP1.AS1+ 0.027 * expression level of SNHG10+ 0.063 * expression level of TRG.AS, was utilized to calculate the risk score. By evaluating these 12 NETs-related lncRNAs, we were able to acquire the risk score of each patient. The patients were divided into a high-risk group (n=241) and a low-risk group (n=241) in accordance with the risk score ( Figure 5C). Next, the Kaplan-Meier curve suggested that patients in the high-risk group had a significantly worse OS than their low-risk counterparts ( Figure 5A, P < 0.05). Additionally, PCA indicated that the patients in different risk groups were distributed in two directions ( Figure 5B). Patients with high risk had a higher AGING probability of earlier death than those with low risk ( Figure 5D, 5E).

Verification of the 12 NETs-related lncRNA signature in the validation cohort
To test the strength of the model built from the training cohort, the patients in the validation cohort were also categorized into high-and low-risk groups by the median value calculated with the same formula as that of the training cohort. Similar to the results obtained from the training cohort, patients in the high-risk group, had a reduced survival time, compared with those in the low-risk group ( Figure 6A, P < 0.05) and were more likely to die earlier ( Figure 6D, 6E). Additionally, PCA confirmed that patients in the two subgroups were distributed in discrete directions ( Figure 6B).

Independent prognostic value of the 12 NETs-related lncRNA signature in both two cohorts
Univariate and multivariate Cox regression analyses were carried out among the available variables to determine whether the risk score was an independent prognostic predictor for OS. In univariate Cox regression analyses, the risk score was significantly associated with OS in both the training and validation cohorts (HR = 3.715, 95% CI = 2.622-5.263, P < 0.05; HR = 3.539, 95% CI = 1.841-6.802, P < 0.05, respectively) ( Figure 7A, 7C). After correction for other  confounding factors, the risk score still proved to be an independent predictor for OS in the multivariate Cox regression analysis (training cohort: HR = 3.316, 95% CI = 2.307-4.767, P < 0.05; test cohort: HR = 2.908, 95% CI = 1.496-5.652, P < 0.05; Figure 7B, 7D).

Validation of the expression of the 3 adverse prognostic lncRNAs at the NSCLC cell level
As mentioned before, survival data analyses clearly revealed that AP000695.2, MYOSLID, and NKILA were unfavorable prognostic factors. Here, we conducted further verification to understand the characteristics of these 3 lncRNAs at the cell level. As shown in Figure 8A-8C, the levels of AP000695.2 and NKILA in A549 cell were much higher than those in human normal epithelial lung cells (BEAS-2B), whereas the expression of MYOSLID in the A549 cells was lower. The expression levels of 3 lncRNAs, namely, AP000695.2, MYOSLID, and NKILA, in NSCLC cells (H1299, SK-MES-1, H1703) was distinctly higher than those in BEAS-2B, as detected by RT-qPCR ( Figure  8D-8F). Therefore, we tested the 3 lncRNAs that were expressed at higher levers in A549 and SK-MES-1 cells following NETs treatment or not for 12 h ( Figure 9A-9F), and the results showed that the expression levels of MYOSLID and NKILA were elevated in both NSCLC cells lines after NETs treatment for 12 h (P < 0.05).

DISCUSSION
Non-small cell lung cancer, a primary thoracic malignancy worldwide, is the leading cause of cancerous death, thus, it is urgent to find reliable molecular biomarkers that can predict the prognosis of NSCLC to improve the survival rate [2,17]. NETs have been reported to play complicated and momentous roles in NSCLC progression [18]. NETs can regulate cell AGING activities and metabolism by inducing tumor microenvironment heterogeneity and chiefly promote tumor progression by setting the premetastatic niche, such as capturing circulating tumor cells (CTCs) and inducing epithelial-mesenchymal transition (EMT) [19]. LncRNAs, as a significant kind of noncoding RNAs, have an intimate connection with the genomes that impacts the transformed phenotype of cancer cells in terms of cell cycle variation, survival, immune response, and other processes [20]. Moreover, because the expression levels of lncRNAs were found to be different in tumors, they became one of the direct reasons for the normal cells to convert into tumor cells [21,22], and play vital functions in cancer diagnosis and prognosis as new biomarkers [14,23]. However, the prognostic significance of related lncRNAs in NSCLC accepting NETs has not been covered. Here, we established a 12 NETs-related lncRNA signature model to predict the prognosis of NSCLC patients.
In the present research, we comprehensively examined NETs-related lncRNAs by building a co-expression network between lncRNAs and NETs-related genes. Furthermore, 12 prognostic NETs-related lncRNAs AGING were identified through Lasso Cox regression as follow: AC020765.2, AC090152.1, AF131215.5, AL035587.1, AP000695.2, AP004608.1, MMP2.AS1, MYOSLID, NKILA, SCAMP1.AS1, SNHG10, and TRG.AS. A fresh prognostic model was constructed based on these lncRNAs and was tested in a validation cohort, and all showed the potential of being prognostic biomarkers. Six NETs-related lncRNAs (AF131215.5, MYOSLID, NKILA, AC090152.1, SNHG10, and TRG.AS) have been reported to be associated with cancer progression. AF131215.5 was found to have an independent prognostic value of overall survival for patients with lung adenocarcinoma (LUAD) [24]. In terms of partial epithelial-mesenchymal transition (p-EMT), MYOSLID expression in head and neck squamous cell carcinoma (HNSCC) was closely correlated with Slug, PDPN, and LAMB3, and is a key regulator of tumor cell survival. Knockout of MYOSLID in the HNSCC cell lines Cal27, SCC4 and SCC9 significantly inhibited migration and invasion [25]. NKILA is considered part of a class of NF-κB modulators that suppress cancer metastasis [26], whereas Huang et al [27] reported that NKILA could sensitize T cells to activation-induced cell death (AICD) which can promote tumor immune evasion. Research [28] has shown that AC090152.1 is capable of effectively predicting the overall survival (OS) in HCC patients with or without fibrosis. As novel drivers of the malignant phenotype of HCC, SNHG10 [29] and its homolog SCARNA13, which form a positive feedback loop, coordinately contribute to the cancer development. High expression of TRG.AS [30], results in poorer survival than in patients with low expression, as TRG.AS serves as a molecular sponge for microRNA-543 (miR-543), thereby mechanistically contributing to the increased expression of Yesassociated protein 1 (YAP1). For the six remaining NETs-related lncRNAs (AC020765.2, SCAMP1.AS1, AL035587.1, AP000695.2, AP004608.1, and MMP2.AS1), there have been no studies exploring their potential roles in the development of cancer at present. Regrettably, there are still few reports revealing the association between the 12 lncRNAs mentioned above and NSCLC, reports on how the mutual effect of lncRNAs with NETs-related genes are even unusual.  AGING Therefore, it is essential to carry out correlational research in the future. For this reason, more research is necessary to explore whether these lncRNAs are closely connected with the prognosis of NSCLC patients after NETs are stimulated.
A signature-based on 12 NETs-related lncRNAs significantly predicted the prognosis of NSCLC patients. In keeping with former studies [31,32], the OS of the low-risk group was longer than that of the highrisk group. This phenomenon reminded us that the risk score signature had a definite ability to forecast survival. Remarkably, we observed identical results in the validation cohort. The independent prognostic value of the signature was corroborated by employing both univariate and multivariate Cox analyses.
Our study observed that the expression levels of 3 adverse prognostic lncRNAs (AP000695.2, MYOSLID, and NKILA) in NSCLC cells tended to be higher than in normal lung epithelial cells, which is in accordance with previous research [28,33]. Moreover, the expression levels of the lncRNAs rose distinctly with NETs treatment, compared to the untreated group, which verified the prognostic model outlined above. Hence, it is our hope that our discoveries will contribute to identifying the prognostic lncRNAs related to NETs stimulation, offering opinions on their possible roles in NSCLC tumorigenesis and progression.
There are several limitations in this study. First of all, without functional enrichment analyses, the potential molecular mechanisms of NETs-related lncRNAs in prediction remains unclear. Second, more prospective studies are needed to confirm the prognostic function of NETs-related lncRNAs because our research is a retrospective study. Third, we only tested our conclusion at the cell level. Animal experiments and even human studies are warranted to verify its clinical utility. In addition, we tested three adverse prognostic lncRNAs treated with NETs by RT-qPCR. Focusing on validated target lncRNAs may exclude potential targets  that the experiment has not validated. In brief, we conducted all-sided bioinformatics analysis of NETsrelated lncRNAs in NSCLC. Our research detected 12 NETs-related lncRNAs deemed to be distinctly associated with the prognosis of the NSCLC patients. A NETs-related lncRNA model consisting of the lncRNAs above was regarded as independently associated with OS in both the training and validation cohorts, offering insight into the forecast of NSCLC prognosis proven in cell experiments. Hence, the model outlined suggests that the 12 NETs-related lncRNAs identified could play an influential role as molecular markers in NSCLC.

Cell culture
The human bronchial epithelial cells line (BEAS-2B), and human lung adenocarcinoma cells (A549 and H1299), human squamous cell carcinoma cell lines (SK-MES-1 and H1703), purchased from the Type Culture Collection of the Chinese Academy of Sciences, Shanghai, China, were all grown in high glucose DMEM medium (BI, Israel) supplemented with 10% fetal bovine serum (Gibco, Grand Island, USA), penicillin (100 U/ml), and streptomycin (100 μg/ml). The cells were incubated in a humidified atmosphere containing 5% CO 2 at 37° C.

NETs isolation and treatment
NETs were obtained from rat neutrophils following phorbol 12-myristate-13-acetate (PMA) 100 uM stimulated 4h as previous studies [34,35]. A549 cells were randomly divided into two groups (each group n=3): without NETs-stimulated group, NETs-stimulated 12h group. TRIzol reagent (Invitrogen, Carlsbad, USA) was used to isolate and purify the total RNA of A549 cells. Then, RNA samples, A549 cells with or without NETs treated 12h, were sent to the laboratory of the OE Biotech Company (Shanghai, China) and the microarray profiling was carried out.

Data and sample collection
The RNA sequencing (RNA-seq) data of 1037 NSCLC patients and 108 healthy people samples were received from the TCGA database (https://portal.gdc.cancer.gov/ repository). Additionally, the detailed clinical materials of partial patients were obtained from TCGA. Patients without clinical data recorded were eliminated when the related clinical prognostic analysis was performed. Both NSCLC patients and healthy people samples fell into two groups at random equally and served as training and validation data sets, respectively. The gene expression profiles data from TCGA were normalized, and normalization was not needed. The data from TCGA is openly getable, so that the present study was not required the permission of local ethics committees. The current study abides by the access policies and publication guidelines of TCGA.
The NETs-related genes were obtained from the transcriptome RNA microarray data of A549 cells with or without NETs treated 12h.

Identification of NETs-related genes and lncRNAs
"limma" R package was exploited to identify the differentially expressed genes (DEGs) between NSCLC and healthy people with | log2(Fold Change) | ≥ 1, and FDR < 0.05, in the training cohort, and the differentially expressed NETs-related genes from the transcriptome RNA microarray were screened in the same way. Then, prognostic NETs-related genes were selected by using Venny 2.1.0 based on further analysis of genes aforesaid. Subsequently, Pearson correlation was utilized to measure the relationship between lncRNAs and NETs-related genes. NETs-related lncRNAs were ensured according to the square of correlation coefficient |R 2 | > 0.3 and P < 0.05.

Establishment and test of a prognostic NETs-related lncRNA signature
NETs-related lncRNAs with prognostic values were picked out via univariate Cox analysis of overall survival (OS). P values were adjusted by Benjamini and Hochberg (BH) correction. The Lasso Cox regression analysis [33,36] constructed an NSCLC prognostic model to lessen the chance of overfitting as much as possible. The Lasso algorithm and "glmnet" R package were utilized for selecting variable and shrinkage together. The normalized expression array of prognostic NETs-related lncRNAs was the independent variable in the regression, and the dependent variables were the overall survival and status of patients in the training cohort. Then, the Co-expression networks were depicted through Cytoscape software 3.7.2. Judging by the normalized expression level of each lncRNA and its homologous regression coefficients, the risk scores of the NSCLC patients were gained. The formula was operated: score = sum (each gene's expression × corresponding coefficient). The patients were divided into high-and low-risk groups severally according to the median value of the risk score. PCA was performed with the "ggplot2" function in the "Rtsne " R package in light of the expression of lncRNAs in USAthe signature. The ideal cut-off expression value was calculated via the "surv_cutpoint" function of the AGING "survminer" R package to achieve survival analysis for each lncRNA.

Real-time quantitative polymerase chain reaction
Total RNA from the BEAS-2B, A549, H1299, SK-MES-1, and H1703 cell, according to the manufacturer's specification, was separated using Trizol (Invitrogen, Carlsbad, USA). Then, the first-strand cDNA at a volume of 20μL was compound by using the reverse transcription kit and finally diluted to 100uL. Quantitative real-time polymerase chain reaction (RT-qPCR) analyses were processed utilizing an SYBR Green mix in the Real-Time PCR System (TAKARA, Kyoto, Japan). 20μL of PCR reaction was prepared as follows: 10μL of 2×SYBR Green PCR master mix, 1 μL of 10μM of suitable forward and reverse primers, 7μL of RNase-free water, and 2μL of cDNA template. RT-qPCR was performed for 10 minutes at 95° C, followed by 40 cycles of 5 seconds at 95° C and 30 seconds at 60° C. A dissociation melting curve was generated using thermal conditions from 60° C to 95° C. Human GAPDH was employed as an internal housekeeping reference. The sequences for all primers were as follows: (reverse primer): TGCTGTAGCCAAATTCGTTG.

Statistical analysis
The OS between different groups was compared by Kaplan-Meier analysis with the log-rank test. Univariate and multivariate Cox regression analyses were implemented to identify independent predictors of OS. All statistical analyses were finished with R software (Version 3.5.3) or GraphPad Prism 8 (Version 8.0). If not particular remark, P < 0.05 was considered statistically significant, and all P values were twotailed.

AUTHOR CONTRIBUTIONS
YL designed the research. YW and SY made a significant contribution to the data analyses. CF drew up the manuscript. XQ and XZ accomplished statistical analysis for the data. RC and ZX conducted molecular experiments. XQ and BF processed the figures and tables. FL and YL undertook support for the project. All authors have browsed and agreed with the final manuscript.