Identification of KIF4A as a prognostic biomarker for esophageal squamous cell carcinoma

Esophageal squamous cell carcinoma (ESCC) is the most common and aggressive tumor worldwide, and the long-term survival of these patients remains poor. Three databases (GSE17351, GSE20347, and GSE100942) were obtained from Gene Expression Omnibus, and 193 differentially expressed genes including 56 upregulated and 137 downregulated genes were identified by paired test using limma R package. Then, functional enrichments by gene ontology and Kyoto Encyclopedia of Genes and Genomes analyses showed these genes were mainly related protein digestion and absorption, and IL-17 signaling pathway. We then constructed a protein-protein interaction network and cytoHubba module to determine the six hub genes and overall survival analysis of the six hub genes were evaluated by UALCAN and GEPIA2 analysis. Ultimately, the experimental results confirmed the KIF4A was overexpressed in the ESCC tissues and cell lines compared with the normal esophageal mucosal tissues and was linked to poor prognosis. Moreover, we also revealed that KIF4A facilitates proliferation, cell cycle, migration, and invasion of ESCC in vivo and in vitro. Overall, these findings demonstrated that KIF4A could serve as diagnostic and prognostic biomarkers and may help facilitate therapeutic targets in ESCC patients.


INTRODUCTION
Esophageal cancer (EC) is an aggressive malignant tumor globally, which ranks seventh in morbidity and fourth in mortality among all tumors. Meanwhile, obvious geographical differences are found in esophageal cancer and most of which are in East Asia [1,2]. Among all the histological types of esophageal carcinoma, esophageal squamous cell carcinoma (ESCC) is the most frequent worldwide [3]. Recently, although various therapies such as esophagectomy, chemotherapy, radiotherapy, and immunotherapy are frequently used to treat ESCC, the long-term outcome remains poor, with a five-year survival rate in the range of 15-25% [4]. ESCC is a highly heterogeneous disease involving changes in multiple gene expression patterns. Thus, revealing ESCC initiation mechanisms and potential prognostic molecular markers are essentially required to improve patients' outcomes with ESCC [5].
The kinesin superfamily (KIFs) is a major component of the intracellular transport system, responsible for cell In recent decades, microarray assays and bioinformatic analysis have been extensively used to study the molecular landscapes of tumors at multiple levels, such as somatic mutations, copy number alterations, and epigenetic alterations [19]. In our study, three datasets were downloaded from the Gene Expression Omnibus database, and they contain 26 pairs of tissues. The sva and limma R packages were utilized to remove the batch effects across different platforms and integrate them to obtain differentially expressed genes (DEGs) between paired samples. The gene ontology (GO) functional annotation and Kyoto encyclopedia of genes and genomes (KEGG) pathway analysis were then performed. Protein-protein interaction (PPI) network and module analyses were performed by Cytoscape software. Meanwhile, pathological staging and OS analysis of the six hub genes were evaluated by UALCAN and GEPIA2 analysis. Finally, we identified that KIF4A might be a potential biomarker for ESCC. To further determine the specific mechanism of KIF4A in ESCC, we found that the expression of KIF4A in ESCC tissues and cell lines was higher, and its high expression level predicted worse survival in ESCC patients. Besides, we demonstrated that KIF4A accelerated ESCC proliferation in vitro and in vivo. Furthermore, KIF4A promoted metastasis by attenuating epithelial-mesenchymal transition (EMT) in ESCC cells. In short, we propose for the first time that KIF4A overexpression could serve as a novel prognostic biomarker and potential therapeutic target toward ESCC therapy.

Microarray data
GSE17351, GSE20347, and GSE100942 databases were downloaded and analyzed. If a dataset needs to be normalized, this can be accomplished using the normalize Between Arrays function in the limma R package, as shown in Supplementary Figure 1. In this regard, GSE100942 and GSE17351 did not need to be normalized, and GSE20347 needed to be normalized.

Differentially expressed genes between paired samples of primary tumors and normal tissue by paired test
The sva and limma R packages were utilized to remove batch effects ( Figure 1A, 1B) across the different platforms and integrate datasets to obtain DEGs between paired samples by paired test. Principal component analysis (PCA) was done for the merged dataset for dimensionality reduction and quality control ( Figure 1C). A total of 193 DEGs, including 56 upregulated and 137 downregulated genes, were identified. The Volcano map and heatmap of DEGs were depicted in Figure 1D-1E.

GO Functional and KEGG Pathway Enrichment Analysis of DEGs.
GO analysis showed that DEGs were mainly involved in skin development, collagen-containing extracellular matrix, and extracellular matrix structural constituent ( Figure 1F). The enriched pathways obtained in KEGG analysis were amoebiasis, protein digestion and absorption, and IL-17 signaling pathway ( Figure 1G).

PPI network construction and hub gene selection
As displayed in Figure 2A, the most significant module and PPI network of DEGs was obtained by STRING and Cytoscape. Hub gene screening was accomplished using cytoHubba module. We then applied the three most common topological algorithms (Maximal Clique Centrality, Density of Maximum Neighborhood Component and Degree) to filter the top 20 genes, and finally, we obtained six hub genes. Figure 2B and Figure 2C displayed the Venn diagram and heatmap of six hub genes, respectively. The expression difference between paired samples of tumor and normal group of the six genes from GEO database is demonstrated in Figure 2D. The six hub genes, namely CDKN3, BUB1, TOP2A, CEP55, KIF4A, and ECT2, were all upregulated in ESCC. The names, abbreviations, and functions for these hub genes were shown in Table 1.

Analysis of hub genes as prognostic indicators of survival in ESCC
To determine whether hub genes were linked to ESCC patients' survival, six prognostic hub genes were identified and survival analysis was further performed. Only high KIF4A and CDKN3 expression were correlated with poor OS in ESCC patients ( Figure 3A-3B, and Supplementary Figure 2). However, CDKN3 demonstrated that it might promote proliferation, migration and invasion of ESCC Figure 1. Identification of DEGs. The merged dataset before removed the batch effect (A). The merged dataset after removed the batch effect (B). PCA was done for the merged dataset for dimensionality reduction and quality control (C). Volcano plot of DEGs, the red dots represent the upregulated genes and the green dots represent the downregulated genes based on an adjusted P-value < 0.001 and | logFC | ≥ 2, the black spots represent genes with no significant difference in expression (D). Heatmap of DEGs, red representing high relative expression, and blue representing low relative expression (E). DEGs with the top 5 enriched GO terms (F) of each subclass and KEGG (G) pathway enrichment analyses of DEGs (P < 0.05) AGING via the AKT pathway [44]. In order to study whether these six hub genes affected ESCC metastasis and invasion, the expression levels of these genes were analyzed through the GSE21293 dataset which containing invading and non-invading EC cells [17]. Primary esophageal cells were established from surgically resected specimens of ESCC patients (n = 35). The present study showed that the expression levels of all six hub genes were up-regulated in invading cells ( Figure 3C). In conclusion, we determined that KIF4A was our research target. Meanwhile, for N categories, high KIF4A expression was tightly correlated in patients with N1, N2, and N3 stages. Moreover, KIF4A expression was closely associated with stage I, II, III, and IV ( Figure 3D). All in all, among these six hub genes, KIF4A may be a prognostic indicator of ESCC.  It is essential for spindle-assembly checkpoint signaling and for correct chromosome alignment. Acting as a substrate for anaphase-promoting complex or cyclosome (APC/C) in complex with its activator CDH1 (APC/C-Cdh1)

ECT2
Guanine nucleotide exchange factor (GEF) that catalyzes the exchange of GDP for GTP

CDKN3
May play a role in cell cycle regulation. Dual specificity phosphatase active toward substrates containing either phosphotyrosine or phosphoserine residues. Dephosphorylates CDK2 at 'Thr-160' in a cyclin-dependent manner.

KIF4A was upregulated in ESCC and indicated poor prognosis
To further explore KIF4A expression between ESCC and paired adjacent normal tissues, quantitative realtime PCR (qRT-PCR) was used to confirm the mRNA levels of KIF4A, and the result showed that in 25 paired of tumor tissues, KIF4A mRNA levels were significantly higher than those in adjacent normal tissues (1.7 vs. 5.3-fold changes) ( Figure 4A). Levels of KIF4A proteins were also detected in four primary ESCC tissues and matched adjacent normal tissues by western blotting (WB). The findings revealed that KIF4A protein levels were higher in primary ESCC tissues than in paired adjacent normal tissues ( Figure  4B), and immunohistochemistry (IHC) analysis demonstrated similar results ( Figure 4C). WB analysis also indicated that KIFA expression was higher in all ESCC cell lines than immortalized esophageal epithelial cell line NE1 ( Figure 4D). Based on IHC results, 34 cases were defined as low KIF4A expression, while 31 cases were identified as high expression. The KIF4A protein expression and clinicopathological features are summarized in Table  2. The results indicated that KIF4A protein expression levels were markedly correlated with tumor diameter (P = 0.021), degree of infiltration (P = 0.012), lymph node metastasis (P = 0.047), distant metastasis (P = 0.009) and TNM stage (P < 0.001). Then, we analyzed 65 human ESCC samples and the results showed that higher KIF4A expression relate to a poor OS (P < 0.05), and our samples confirmed that patients with higher KIF4A expression predicted a decreased OS (P < 0.05) and DFS (disease-free survival, P < 0.01) ( Figure 4E). Multivariate analysis exhibited that KIF4A was an independent predictor (P = 0.001) ( Table 3).

KIF4A facilitated the proliferation of ESCC in vitro
In order to study the potential role of KIF4A play in the progress of ESCC, we used lentiviral KIF4A short hairpin RNAs (shRNAs) stable knockdown the expression of KIF4A in KYSE150 and TE1 cells, and a scrambled shRNA was used as a negative control (NC). The expression of KIF4A in cells was evaluated by WB ( Figure 5A) and qRT-PCR ( Figure  5B). Based on these results, CCK-8 cell assay showed that the proliferation capability of the two cell lines was significantly reduced after KIF4A knockdown ( Figure 5C). Colony formation experiment confirmed that KIF4A knockdown cells produce lower numbers and smaller colonies ( Figure  5D). In addition, we have further investigated whether exhaustion of KIF4A can lead to cell cycle stoppage. The results showed that KIF4A knockdown can trigger the G2/M phase arrest of KYSE150 and TE1 cells, resulting in a significant increase in the ratio of G2/M phase cells ( Figure 5E). In summary, these data suggested that KIF4A might be essential for proper mitotic progression. However, KIF4A had no significant effect on apoptosis in both ESCC cell lines (Supplementary Figure 3). These results suggest that KIF4A promote ESCC cell proliferation.

KIF4A promote cell migration and invasion of ESCC
An earlier study found that KIF4A expression was significantly associated with metastasis and prognosis in HCC patients [26]. Our data revealed that the expression of KIF4A was associated with distant metastasis in ESCC patients. By studying the cell movement of ESCC after KIF4A knockdown, the AGING effect of KIF4A on the migration and invasion of ESCC cells was explored. The results showed that invading and migrating ESCC cells were significantly inhibited ( Figure 6A-6B). These results indicated that KIF4A promoted migration and invasion of ESCC cells.

KIF4A induced EMT and the MAPK and PI3K/AKT pathway in ESCC cells
We used WB to test the expression levels of phosphorylated AKT and P38, ERK1/2 and JNK after knockdown of exogenous KIF4A in ESCC cells, with the   purpose of studying the effect of KIF4A on PI3K/AKT and MAPK pathways, and the results showed that all the markers above decreased in KIF4A knockdown cells.
EMT plays a crucial role in migration and invasion of tumor cells derived from epithelial cells. We detected EMT hallmark using WB, and we discovered increased expression of E-cadherin (E-cad) and decreased expression of vimentin and N-cadherin (N-cad) after exogenous KIF4A knockdown in ESCC cells. The results were all shown in Figure 7. Overall, our findings showed that KIF4A promoted tumorigenesis and metastasis through the promotion of the PI3K/AKT and MAPK pathway and EMT in ESCC, respectively.

KIF4A promoted ESCC tumor xenograft growth in vivo
The role of KIF4A in the development of the ESCC in vivo was assessed in a xenograft mouse model. The KYSE150 cells which stably expressing shCtrl or shKIF4A were injected subcutaneously into nude mice at the same dose. As illustrated by Figure 8, knockdown KIF4A suppressed the tumor xenografts growth in mice and reduced the tumor volumes and weights ( Figure 8A-8C). The findings demonstrated that KIF4A knockdown suppressed tumor growth of ESCC in vivo.

DISCUSSION
Esophageal squamous cell carcinoma (ESCC) is a global common cancer, particularly in some parts of Asia [1,2]. The exact molecular mechanisms behind ESCC are still incompletely defined, leading to a poor ESCC prognosis. Resolving these concerns can be accomplished by analyzing three GEO mRNA AGING microarray datasets to identify DEGs between primary tumor tissues (ESCC) and matched adjacent normal esophageal mucosa. Functional enrichment analyses were performed by GO and KEGG analyses. Then, we constructed a PPI network to visualize interactions of DEGs. Eventually, pathological staging and prognostic analysis of hub gene expression were carried out. Ultimately, we uncovered that high KIF4A expression implied a poor ESCC prognosis and revealed that KIF4A facilitated proliferation, migration, and invasion of ESCC in vivo and in vitro. Overall, these results indicated that KIF4A had a significant function as an oncogene in ESCC progression.
KIF4A, the vital subfamily of kinesin superfamily proteins (KIFs), is essential to regulate mitosis and AGING meiosis, spindle organization and cytokinesis, and promotes proper axon growth in neuronal cells [9][10][11]. Interestingly, KIF4A is overexpressed in various cancers. Nevertheless, it is downregulated in gastric cancer by inhibiting gastric cancer cell proliferation [17], suggesting its distinct functions and mechanisms for different tumors. Previous studies have demonstrated that high KIF4A expression levels are associated with poor patient prognosis [10,18]. Similarly, our study showed that KIF4A was upregulated in ESCC tissues relative to normal matched tissues and ESCC cell lines. Moreover, our follow-up analysis exhibited that upregulation of KIF4A expression was positively linked to increased invasion depth, tumor diameter, lymph node metastasis and advanced TNM stage. Moreover, multivariate Cox regression further demonstrated that KIF4A could be an independent predictor of poor prognosis in ESCC patients.
ESCC progression is characterized by excessive proliferation, avoiding programmed cell death, and infinite invasion and metastasis of cancer cells [19].
Previous studies have reported that KIF4A is involved in proliferation, apoptosis, and metastasis of cancer cells [20,21]. From our data, ESCC cell proliferation and colony formation were downregulated by KIF4A knockdown in cultured ESCC cells. Additionally, KIF4A knockdown significantly restricted tumor growth in vivo. KIF4A downregulation decreased CRC cell lines relative to regulate p21 to promote CRC cell cycle progression at G1/S transition and by overexpression of cyclin D1-Cdk2/Cdk4 and cyclin E-Cdk2 directly implicate cyclins in RCC cell lines [20,22]. Given our findings, Down-regulation of KIF4A significantly increases the percentage of cells in G0/G1 phase, while the distribution of cells in S phase and G2 phase decreases. Unfortunately, we did not detect the expression of p21, p27, Cdk2, Cdk4, and Cdk6 proteins. In addition, decreasing apoptosis is also a hallmark of cancer progression [23]. Yet, there was no evidence that KIF4A was involved in regulating ESCC cells' apoptosis in our study.
To profoundly study the underlying mechanisms of ESCC, GO and KEGG pathway enrichment analyses AGING were conducted to identify functional processes and pathways. Furthermore, PI3K pathway holds a significant function in cell cycle entry and glucose metabolism [24]. AKT phosphorylation promotes tumorigenesis via several oncogenic events, including apoptosis and cell proliferation [25]. KIF4A enhanced cell proliferation via activating Akt signaling in hepatocellular carcinoma; meanwhile [26], KIF4A downregulation dramatically decreased the expression of p-AKT in colorectal cancer [27]. Our data showed that KIF4A knockdown obviously decreased p-AKT expression in ESCC cell lines. MAPK had a dozen signaling pathways involved in cell growth, differentiation, proliferation, and apoptosis [28]. However, three major signaling pathways have been well-studied until now: Extra cellular regulated protein kinases (ERK) transduction pathway, P38 mitogen~ activated protein kinase MAPK pathway, and C-Junterminal kinase (JNK) transduction pathway. In particular, ERK signaling pathway is often upregulated in many tumors, mainly cell proliferation [29]. P38 transduction pathway is the main pathway involved in apoptosis initiation. After activated by extracellular stimulation, it exerts its biological effect and participates in differentiation, proliferation and apoptosis of cells [30]. JNK signaling pathway activation regulates a series of intracellular responses including cell differentiation, apoptosis and stress response [31]. Meanwhile, our study demonstrated similar results.
EMT is a feature of epithelial cells transforming into mesenchymal phenotype cells through specific procedures, holding a crucial role in migration and invasion of tumor cells derived from epithelial cells [32,33]. An important hallmark of EMT is decreased expression of E-cadherin (E-cad) and increased expression of vimentin. In our study, transwell experiments revealed that ESCC cell migration and invasion were inhibited by KIF4A silencing. Moreover, E-cadherin level is downregulated, and vimentin expression is up after KIF4A knockdown, suggesting that KIF4A could promote tumor invasion and metastasis through EMT in vitro.

AGING
To this end, our study showed that KIF4A was upregulated in ESCC and significantly correlated with a poorer prognosis for the first time. KIF4A facilitated proliferation by PI3K/AKT, and MAPK signaling pathways and promoted migration and invasion through EMT. These findings shed light on KIF4A prospects as a prognostic ESCC biomarker. However, how KIF4A is activated and how it affects downstream pathways to achieve the function remains unclear.

Human ESCC samples
The human research was approved by the principles of the Declaration of Helsinki and the local ethics committee of East Hospital Affiliated to Tongji University. Samples of Primary ESCC tumor and paired adjacent normal tissue were collected from 65 patients who had an oesophagectomy at the Department of Thoracic Surgery in East Hospital (Shanghai, China) from March 2014 to March 2016. None of the patients had a history of radiotherapy or chemotherapy before surgery. In accordance with WHO standards, each specimen underwent histological review and evaluation by two senior pathologists.

Identification of DEGs
We removed the probe sets that cannot match any of the gene symbols and averaged the genes that correspond to more than one probe set and if the dataset needs to be normalized, then normalized it using the normalize Between Arrays function in the limma R package. The sva R package was employed to remove batch effects across different platforms. The three datasets were then integrated into a whole dataset for analyzing the following DEGs. The limma R package analyzed DEGs between paired samples by paired test. The DEGs were defined as |logFC| ≥2 and adj. P < 0.001.

KEGG and GO enrichment analyses of DEGs
Regarding common DEGs function, significant gene ontology and Kyoto Encyclopedia of Genes and Genomes enrichment analysis (P < 0.05) were performed by R package clusterProfiler [38,39].

PPI network construction and cytoHubba module analysis
In this study, PPI network of DEGs was constructed using STRING database (https://string-db.org/) [40], and a combined score of interaction >0.4 was considered statistically meaningful [41]. We then used cytoHubba module of Cytoscape (version 3.4.0) to screen key genes, predicting and exploring significant nodes and subnet of a given network through a number of topological algorithms. We applied the three most common topological algorithms (Maximal Clique Centrality, Density of Maximum Neighborhood Component and Degree) to filter the top 20 genes. Finally, we took the intersection of 60 genes and got six hub genes.

Pathological staging and prognostic analysis of hub gene expression
For prognostic analysis, GEPIA2 and UALCAN analysis were used to analyze the overall survival, and P-value <0.05 was considered to be significantly related to prognosis. Besides, UALCAN analysis was utilized to analyze the relationship between KIF4A and N categories and clinical staging.

Analysis of the role of prognostic hub genes in EC invasion
A further dataset, GSE21293 [42], which comprising gene expression data for invasive and non-invasive ESCC cells, was evaluated based on the GEO database to explore the role of hub DEGs in EC invasion.

Establishment of KIF4A-knockdown ESCC cells
Lentiviral which comprising shRNAs targeting KIF4A was provided by GeneChem (GeneChem Co., Ltd., Shanghai, China) and transfected into KYSE150 and TE1 cells as directed by the manufacturer. Stable cell lines were screened with puromycin as described above. KIF4A shRNA targets were CCCTTTACAGAACA GACTA. The cells which transfected with scrambled shRNA were taken as NC.

Immunohistochemical staining
Esophageal tissue of ESCC patients and paired adjacent normal tissues were fixed with 4% paraformaldehyde for 24 hours, and then the fixed tissues were paraffinembedded and sliced into four µM sections. Immunohistochemical staining was performed with an antibody KIF4A (Abcam, ab122227). Two pathologists evaluated all slides blindly and independently. Tumor and normal tissues were categorized as KIF4A overexpression and low expression according to whether cancer cells with KIF4A staining was ≥5%.

Cell proliferation assay
TE-1 and KYSE150 were cultured in 96-well plates at 5 × 10 3 in serum starvation overnight. Each well was handled with 10 μL reagents from Cell Counting Kit-8 (Beyotime Biotechnology, Shanghai, China) and then each well was supplemented with 10 µL CCK-8 and incubated at 37°C for 2 h. Three auxiliary holes in each group. Optical density (OD) was measured at 450 nm using a microplate reader (BioTek, USA) [43]. The cells were transfected by shRNA for 48 hours and then counted.

Flow cytometry for cell cycle and apoptosis analysis
After 48 hours of transfection, synchronize the cells through serum starvation overnight and induce re-entry into the cell cycle by incubating for 4 hours. Fix the collected cells in precooled 70% ethanol at 4°C overnight, and then resuspend them with RNaseA at 37°C for 30 minutes. Subsequently stained with propidium iodide (PI) in a dark environment at 4°C for 30 minutes. Apoptosis assay was carried out using Annexin V-PE apoptosis detection kit (Beyotime Biotechnology, Shanghai, China) as per manufacturer's protocol. After 48 hours of transfection with shRNA, the cells were stained with membrane Annexin V and PE for 15 minutes, and then analyzed by flow cytometry (Beckman Coulter, USA).

Transwell migration and invasion assays
Briefly, the upper chamber is filled with 1 × 10 5 cells, 200 μL of fetal bovine serum (FBS)-free medium is added, and the lower chamber with 600 μL of 80% FBS-containing medium. Then the cell line was fixed with 10% formalin for 15 minutes and stained with 0.25% crystal violet for 15 minutes, rinsed again with sterile water. Finally Counting the migrating cells in 5 photographed areas at 200 × magnification.

Colony formation assay
For colony formation analysis, we inoculated 200 cells on each 6-well plate. After 14 days of culture in RPMI1640 medium with 10% FBS, crystal violet staining was used to count cell colonies. We repeated experiments at least three times, and data were represented as mean ± SEM.

Mouse xenograft tumor model
BALB/c nude mice (aged 6-8 weeks) were purchased from Beijing Vital River Laboratory Animal Technology Corporation. Mice experiments were approved by Tongji Hospital Affiliated to Tongji University, Animal Care Committee of Tongji Hospital Affiliated to Tongji University. Stable KIF4A knockdown cells and control Kyse150 cells (5 × 10 6 ) were injected into the mice's armpits. The volume (V) of formed tumor was computed using the formula AGING V = L × W 2 /2 (L: long axis; M: short axis). Twenty-one days after the injection, the nude mice were sacrificed, and the tumors were dissected and weighed.

Data analysis
The results are represented as mean ± SEM. Two-tailed Student's tests were used for two-group comparisons, and ANOVA followed by post hoc Tukey's test was used for multiple-group comparisons. GraphPad Prism 8.0 (San Diego, CA, USA) and SPSS 22.0 (New York, USA) statistical software are used for data analysis. A p-value less than 0.05 was considered statistically significant.

AUTHOR CONTRIBUTIONS
Yunqing Mei and Qinchuan Li helped to formulate the hypothesis. Lei Zhou and Lingwei Wang conceived the project, analyzed the data and wrote and revised the manuscript. Gang Liu analyzed the data and revised the manuscript. Enkhbat Bolor-Erdene helped operate the experiments.