Clinical and Survival Impact of Sex-Determining Region Y-Box 2 in Colorectal Cancer: An Integrated Analysis of the Immunohistochemical Study and Bioinformatics Analysis

Transcription factor sex-determining region Y-box 2 (SOX2) is involved in the maintenance of cancer stem cells. However, the role of SOX2 in colorectal cancer (CRC) remains unclear. This study was conducted to investigate the effect of SOX2 on CRC. Studies were searched using electronic databases. The combined odds ratios (ORs) or hazard ratios (HRs: multivariate Cox survival analysis) with their 95% confidence intervals (CIs) were calculated. The Cancer Genome Atlas (TCGA) and GEO datasets were further applied to validate the survival effect. The functional analysis of SOX2 was investigated. In this work, 13 studies including 2337 patients were identified, and validation data were enrolled from TCGA and GEO datasets. SOX2 expression was not significantly related to age, gender, microsatellite instability (MSI) status, clinical stage, histological grade, tumor size, pT-stage, lymph node metastasis, distal metastasis, or cancer-specific survival (CSS) but was correlated with worse overall survival (OS: n = 536 patients) (P < 0.05). Furthermore, TCGA data demonstrated similar results, with no significant correlation between SOX2 and pathological characteristics. Further validation data (OS: n = 1408 and disease-free survival (DFS): n = 1367) showed that SOX2 expression was correlated with worse OS (HR = 1.35, 95% CI: 1.11–1.65, P = 0.004) and DFS (HR = 1.30, 95% CI: 1.04–1.62, P = 0.02). The functional analyses showed that SOX2 was involved in cell-cell junction, focal adhesion, extracellular matrix- (ECM-) receptor interaction, and MAP kinase activity. Our findings suggest that SOX2 expression may be correlated with a worse prognosis of CRC.


Introduction
Colorectal cancer (CRC) is one of the most common malignant tumors and a major cause of cancer-related death in the world and in China [1,2]. According to the GLOBOCAN estimates, approximately 1,800,977 new cases were diagnosed with CRC, leading to approximately 861,663 deaths due to this disease in 2018 worldwide [1]. Although some significant achievements have been made in early detection and treatment in recent years, most patients are diagnosed with advanced disease and show a poor 5-year survival rate [3][4][5].
Therefore, new and efficient biomarkers are needed to guide the treatment of CRC and to predict its prognosis.
Cancer stem cells (CSCs) are a special subpopulation of tumor cells with the capacity for self-renewal, multilineage differentiation, and proliferation, and they are related to tumor progression, metastasis, recurrence, prognosis, and drug resistance [6][7][8]. Many CSC-related markers have been identified and reported to be associated with poor prognosis and resistance to therapy in cancer [9,10]. Sex-determining region Y-box 2 (SOX2), a key transcription factor with a high mobility group (HMG) DNA-binding domain, is mapped to human chromosome 3q26.3-q27 [11]. SOX2 regulates the self-renewal and pluripotency of undifferentiated stem cells such as human embryonic stem cells and plays an important role in maintaining the stem cell-like features in cancer cells [12][13][14]. Additionally, SOX2 is involved in the migration, invasion, and proliferation of cancer cells and in resistance to therapy [15,16]. SOX2 has been reported to be commonly expressed in many human cancers, such as breast cancer, lung cancer, esophageal cancer, and CRC [17]. For example, SOX2 expression is correlated with worse prognosis in breast cancer and gastric cancer [18,19], whereas it is associated with a favorable prognosis in non-small cell lung cancer and cervical cancer [20,21]. Therefore, it is of great importance to determine the prognostic role of SOX2 expression in CRC.
The previous meta-analysis included only eight studies with small sample sizes (n = 1113) and analyzed only the correlation between SOX2 and overall survival (OS) and a small subset of the clinical features of CRC. Additionally, the result on OS was not reasonable [22]. Therefore, the significance of SOX2 expression in CRC is not fully understood. For example, Lundberg et al. reported that SOX2 expression was not related to cancer-specific survival in CRC [23], whereas Miller et al. reported that SOX2 expression was correlated with worse cancer-specific survival [24]. Thus, the current work with 3745 CRCs was performed to determine the survival impact of SOX2 expression on CRC (cancer-specific survival (CSS), overall survival (OS), and disease-free survival (DFS)) and to evaluate the association between SOX2 expression and general clinicopathological characteristics.

Literature Search.
This meta-analysis was performed based on the Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA) statement [25] (Supplementary Materials). The electronic databases PubMed, EMBASE, and Web of Science were comprehensively searched to identify eligible publications until July 29, 2019, using the following key words and search terms: "colorectal cancer OR colorectal tumor OR colorectal carcinoma OR colorectal neoplasm OR CRC OR rectal cancer OR rectal tumor OR rectal carcinoma OR colon cancer OR colon tumor OR colon carcinoma" and "SOX2 OR SOX-2 OR Sex-determining region Y-box protein 2 OR Sex determining region Y box-2 OR Sex-determining region Y-box 2 OR SRY box-2" (Supplementary Materials). Moreover, the reference lists of the identified articles were also examined to find other relevant studies.

Eligibility Criteria.
For the enrollment of publications, the main inclusion criteria were as follows: (1) the study reported patients with CRC; (2) SOX2 was detected using immunohistochemistry (IHC); (3) the expression status of SOX2 was defined in the original article; (4) the study provided available data to assess the correlation between SOX2 expression and clinicopathological parameters; (5) the study provided a hazard ratio (HR) with 95% confidence interval (CI) based on multivariate Cox survival analysis, sufficient to evaluate the prognostic impact of SOX2 expression on CRC patients. If the results of interest were not completely reported, the corresponding author was contacted via e-mail. When multiple articles using overlapping tissue samples from the same institute were published, only the most recent article or the article with the most complete data was selected. We mainly excluded review articles, letters, conference abstracts, case reports, cell line/animal studies, and articles lacking sufficient data.

Data Extraction and Study Quality Assessment.
The literature search and data extraction for the included studies were conducted by two independent authors (KS and JH) using standardized forms, which recorded the surname of the first author, time of publication, country, ethnicity, median/mean age, disease stage, antibody information, number of participants, cutoff values of SOX2, expression frequency, clinicopathological parameters such as age, gender, microsatellite instability (MSI) status, clinical stage, histological grade, tumor size, pT-stage, lymph node metastasis, and distal metastasis, and the survival data of multivariate Cox analysis such as cancer-specific survival (CSS), overall survival (OS), and disease-free survival (DFS). The quality of the available publications was assessed according to the Newcastle-Ottawa Scale (NOS) for cohort studies [26,27].
The three quality parameters gave a maximum total of nine points: selection (0-4), comparability (0-2), and outcome assessment (0-3). In this meta-analysis, a publication with a score of ≥6 was considered to be of high quality, and a publication with a NOS score of <6 was defined as a low-quality study. Any disagreements about the selected literature were discussed by all authors until a consensus was reached.
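The NOS scoring rule described above can be illustrated with a minimal Python sketch. The class name `NosScore` and the example scores are hypothetical and are not taken from the included studies; only the component ranges and the ≥6 threshold come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NosScore:
    """Newcastle-Ottawa Scale components for one cohort study (hypothetical container)."""
    selection: int      # allowed range 0-4
    comparability: int  # allowed range 0-2
    outcome: int        # allowed range 0-3

    def __post_init__(self):
        # Reject component scores outside the ranges stated in the text.
        limits = {"selection": 4, "comparability": 2, "outcome": 3}
        for name, upper in limits.items():
            value = getattr(self, name)
            if not 0 <= value <= upper:
                raise ValueError(f"{name} must be in 0..{upper}, got {value}")

    @property
    def total(self) -> int:
        # Sum of the three components, at most nine points.
        return self.selection + self.comparability + self.outcome

    @property
    def quality(self) -> str:
        # A total of 6 or more is treated as high quality, otherwise low.
        return "high" if self.total >= 6 else "low"


# Hypothetical example scores, for illustration only.
example = NosScore(selection=3, comparability=2, outcome=2)
print(example.total, example.quality)
```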

Validation Data from TCGA.
The RNA-sequencing data and corresponding clinical information on CRC were obtained from The Cancer Genome Atlas (TCGA) (https://portal.gdc.cancer.gov/repository). Eventually, 618 CRC cases with sufficient expression data and clinical information were selected.

Functional Analysis of SOX2.
The association between SOX2 and other genes was analyzed using TCGA sequencing data. Genes with an absolute Spearman coefficient of >0.2 and P < 0.001 were considered to be associated with SOX2. The potential functions and biological mechanisms of SOX2, such as GO (Gene Ontology) terms and KEGG (Kyoto Encyclopedia of Genes and Genomes) pathways, were investigated using the clusterProfiler package.
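As an illustration of the co-expression filter, the following Python sketch computes Spearman coefficients and keeps genes with |rho| > 0.2 and P < 0.001. It is not the authors' code: the function names are hypothetical, the toy data are invented, and the P value uses a large-sample normal approximation to the usual t statistic, which is an assumption that is only reasonable for cohorts of several hundred samples.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # average of the tied 1-based positions
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman coefficient: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def approx_p_value(rho, n):
    """Two-sided P value via a normal approximation (assumption: large n)."""
    if abs(rho) >= 1.0:
        return 0.0
    t = rho * math.sqrt((n - 2) / (1 - rho * rho))
    return math.erfc(abs(t) / math.sqrt(2))


def select_coexpressed(sox2, expression_by_gene, rho_min=0.2, p_max=0.001):
    """Keep genes whose |rho| with SOX2 exceeds rho_min and whose P is below p_max."""
    hits = {}
    for gene, values in expression_by_gene.items():
        rho = spearman_rho(sox2, values)
        p = approx_p_value(rho, len(sox2))
        if abs(rho) > rho_min and p < p_max:
            hits[gene] = (rho, p)
    return hits
```

The selected genes would then be passed to an enrichment tool; the text states that clusterProfiler was used for that step, which this sketch does not reproduce.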

Survival Analysis from Validation Datasets.
Normalized GSE17538 (n = 232 patients), GSE39582 (n = 558 patients), and TCGA (n = 618 patients) datasets were used because they provided sufficient survival information, and batch effects were adjusted using the ComBat method [28,29]. In total, 1408 CRC patients were used to confirm whether SOX2 expression was still correlated with worse OS, and 1367 CRC patients were used to validate whether SOX2 expression was still related to disease-free survival (DFS). The optimal cutoff value (SOX2: 3.31) was selected, and survival analysis was performed using the "survival" and "survminer" packages.
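The grouping and survival estimation step can be sketched in plain Python as follows. This is an illustration only: the authors used the R packages named above, the text does not state how the optimal cutoff was derived or whether values equal to the cutoff count as high, and the function names and the "strictly greater than" rule here are assumptions.

```python
def split_by_cutoff(expression, cutoff=3.31):
    """Label samples 'high' when expression > cutoff (assumed rule), else 'low'."""
    return ["high" if value > cutoff else "low" for value in expression]


def kaplan_meier(times, events):
    """Kaplan-Meier estimate as a list of (time, survival) at each event time.

    events uses 1 for an observed event (e.g. death) and 0 for censoring.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        leaving = sum(1 for ti in times if ti == t)  # events plus censored at t
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


def km_by_group(expression, times, events, cutoff=3.31):
    """Kaplan-Meier curves for the high and low expression groups."""
    labels = split_by_cutoff(expression, cutoff)
    curves = {}
    for group in ("high", "low"):
        idx = [i for i, label in enumerate(labels) if label == group]
        curves[group] = kaplan_meier([times[i] for i in idx],
                                     [events[i] for i in idx])
    return curves
```

Comparing the two curves formally would require a log-rank test or a Cox model, as provided by the "survival" package; that part is not reproduced here.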

Journal of Oncology

Statistical Analysis.
Data for the meta-analysis were obtained from the original articles. The pooled odds ratios (ORs) and the corresponding 95% CIs were calculated to assess the relationship of SOX2 expression with the clinicopathological parameters. The pooled HRs with their 95% CIs were calculated to estimate the prognostic impact of SOX2 expression on CRC patients using multivariate Cox survival analysis. As described by IntHout et al., the Hartung-Knapp-Sidik-Jonkman (HKSJ) method was applied to improve the reliability of the pooled results with ≤10 studies in the current meta-analysis [30]. Heterogeneity between studies was measured using Cochran's Q statistic [31]. A Q test with P < 0.1 was considered to indicate significant heterogeneity. When substantial heterogeneity was detected, sensitivity analyses were conducted by removing a single study at a time to estimate the resulting change in heterogeneity and the stability of the recalculated results [32]. Possible publication bias was assessed by Egger's test [33]. The pooled ORs and HRs were calculated using the "metafor" package via R version 3. For bioinformatics validation data, the TCGA patients were divided into positive and negative groups based on the median value of SOX2 expression. The relationships between SOX2 expression and the clinicopathological parameters were assessed using univariate logistic regression analysis. The clinicopathological parameters included age, MSI status, clinical stage, lymph node metastasis, distal metastasis, venous invasion, and lymphatic invasion. Analysis of validation data was performed using R version 3.5.1 [R Core Team, 2018].
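The pooling procedure can be sketched in Python as follows. This is a hedged illustration rather than the authors' "metafor" workflow: the text does not state which between-study variance estimator was used, so the DerSimonian-Laird estimator here is an assumption, the function names are hypothetical, and the t critical value must be supplied by the caller because the standard library has no t distribution.

```python
import math

Z_975 = 1.959964  # 97.5% quantile of the standard normal distribution


def log_hr_and_se(hr, lower, upper):
    """Recover log(HR) and its standard error from a reported 95% CI."""
    return math.log(hr), (math.log(upper) - math.log(lower)) / (2 * Z_975)


def pool_random_effects(effects, ses, t_crit):
    """Random-effects pooling of log effects with an HKSJ-adjusted CI.

    effects: log HRs or log ORs; ses: their standard errors;
    t_crit: 97.5% quantile of the t distribution with k - 1 degrees of freedom.
    """
    k = len(effects)
    w = [1 / se ** 2 for se in ses]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)

    # Cochran's Q statistic for heterogeneity (k - 1 degrees of freedom).
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))

    # DerSimonian-Laird between-study variance (assumed estimator).
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)

    # Random-effects weights and pooled estimate.
    w_re = [1 / (se ** 2 + tau2) for se in ses]
    mu = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)

    # HKSJ variance: weighted residual variance scaled by (k - 1) * sum of weights.
    var_hksj = sum(wi * (yi - mu) ** 2 for wi, yi in zip(w_re, effects)) / (
        (k - 1) * sum(w_re)
    )
    half_width = t_crit * math.sqrt(var_hksj)
    return {
        "estimate": math.exp(mu),
        "ci": (math.exp(mu - half_width), math.exp(mu + half_width)),
        "Q": q,
        "tau2": tau2,
    }
```

For example, with three studies the degrees of freedom are 2 and the corresponding t critical value is about 4.303, which makes the HKSJ interval noticeably wider than a normal-based interval when few studies are pooled.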

Correlation of SOX2 Expression with Clinicopathological Variables.
A summary of the calculated results is shown in (Figure 3).

Survival Impact of SOX2 Expression Using Multivariate Cox Analysis.
Data showed that SOX2 expression was not correlated with cancer-specific survival (CSS) of CRC in three studies with 855 patients (HR = 1.18, P = 0.667) (Figure 4) but was correlated with worse overall survival (OS) in two studies with 536 patients (P < 0.05) [24,34].

Heterogeneity Analysis.
Heterogeneity analysis was conducted for the associations of SOX2 expression with clinical stage, histological grade, and pT-stage. We conducted sensitivity analyses to estimate the change in the pooled results. When the study of Lundberg et al. [23] was removed, the recalculated result still showed no correlation with the clinical stage (OR = 0.55, 95% CI = 0.06-4.66, P = 0.350), with no evidence of heterogeneity (P = 0.272). When the study of Han et al. [40] was omitted, the recalculated result still showed no association with the pT-stage (OR = 1.04, 95% CI = 0.30-3.54, P = 0.908), with no heterogeneity (P = 0.216). When the study of Yan et al. [36] was removed, the recalculated result was significantly correlated with the histological grade (OR = 2.70, 95% CI = 1.15-6.37, P = 0.035), with no heterogeneity (P = 0.250).
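The leave-one-out idea behind these sensitivity analyses can be sketched in Python as follows. For brevity the sketch uses simple fixed-effect inverse-variance pooling, which is a simplification and not the random-effects HKSJ approach described in the statistical analysis; the function names and the input values in the usage example are hypothetical and are not data from the included studies.

```python
import math


def pooled_and_q(log_ors, ses):
    """Fixed-effect inverse-variance pooled log OR and Cochran's Q (simplified)."""
    w = [1 / se ** 2 for se in ses]
    pooled = sum(wi * yi for wi, yi in zip(w, log_ors)) / sum(w)
    q = sum(wi * (yi - pooled) ** 2 for wi, yi in zip(w, log_ors))
    return pooled, q


def leave_one_out(labels, log_ors, ses):
    """Recompute the pooled OR and Q after omitting each study in turn."""
    results = {}
    for i, label in enumerate(labels):
        rest_effects = log_ors[:i] + log_ors[i + 1:]
        rest_ses = ses[:i] + ses[i + 1:]
        pooled, q = pooled_and_q(rest_effects, rest_ses)
        results[label] = {
            "or": math.exp(pooled),      # pooled OR without this study
            "Q": q,                      # heterogeneity statistic without it
            "df": len(rest_effects) - 1  # degrees of freedom for the Q test
        }
    return results


# Hypothetical inputs: study C is an outlier that drives the heterogeneity.
demo = leave_one_out(["A", "B", "C"], [0.0, 0.0, 3.0], [1.0, 1.0, 1.0])
for study, stats in demo.items():
    print(f"omit {study}: OR={stats['or']:.2f}, Q={stats['Q']:.2f}, df={stats['df']}")
```

A study whose removal sharply lowers Q, as study C does in this toy input, is a candidate source of heterogeneity.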

Publication Bias.
No publication bias was found for the analyses of SOX2 expression with gender or histological grade (P > 0.1) (Figure S1).

Discussion
CSCs contribute to tumor metastasis, prognosis, and therapeutic resistance [45,46]. CSCs may offer promising new therapeutic targets for treatment modalities applicable to various human cancers [47,48]. SOX2 is involved in the development and maintenance of stem-like properties in cancer cells [14,15]. SOX2 participates in many signaling pathways such as VEGF, MAPK, Notch, P53, Wnt, and Jak-STAT, regulates the expression of many genes, and regulates self-renewal and differentiation of stem cells, which may contribute to migration, invasion, and proliferation of cancer cells and the stemness of cancer stem cells, thereby affecting cancer progression, prognosis, and resistance to anticancer therapies [14,15,17]. SOX2 expression is found across a wide range of human cancers such as breast cancer, lung cancer, and esophageal cancer [17]. SOX2 expression is correlated with the prognosis of some human cancers, with a better prognosis in non-small cell lung cancer and cervical cancer [20,21] and a worse prognosis in breast cancer and gastric cancer [18,19]. Studies have reported that SOX2 is frequently expressed in CRC [6,34,35]. However, the function of SOX2 and its survival impact in patients with CRC are still largely uncertain. In the present work, we determined the clinicopathological significance of SOX2 expression and its prognostic impact in patients with CRC. The relationships between SOX2 expression and the clinicopathological characteristics of CRC were investigated.
The pooled data showed that SOX2 expression was not correlated with age or MSI status, which was consistent with previous studies on age [23,38] and MSI status [23,24,37,39]. TCGA data also showed no correlation with age or MSI status. Although no correlation was found between SOX2 expression and gender among all eligible studies, a large study with 441 cases reported that SOX2 expression was negatively correlated with gender and was lower in males than in females [23]. Data from TCGA also demonstrated a negative association with gender, suggesting that more studies with larger sample sizes are needed to resolve this controversial finding. SOX2 expression was not associated with clinical stage, histological grade, tumor size, pT-stage, lymph node metastasis, or distal metastasis, which was in line with previous publications regarding clinical stage [14,35,38], histological grade [14,36,41], tumor size [36], pT-stage [36,38,41], lymph node metastasis [14,40], and distal metastasis [36]. Moreover, TCGA data also confirmed no significant association with clinical stage, lymph node metastasis, or distal metastasis.
These results suggested that SOX2 expression was not significantly associated with progression and metastasis of CRC. Further relevant studies are necessary to confirm our results in the future. The survival impact of SOX2 expression in patients with CRC was assessed based on multivariate Cox analysis. SOX2 expression was not correlated with CSS but was significantly associated with worse OS in CRC. Additionally, further validation data from TCGA and GEO datasets (OS: n = 1408 CRC patients and DFS: n = 1367 CRC patients) demonstrated that SOX2 expression was significantly correlated with worse OS (HR = 1.35, P = 0.004) and DFS (HR = 1.30, P = 0.02). These analyses suggested that SOX2 may be an independent prognostic marker for predicting poor prognosis. Additionally, the functional analysis showed that SOX2 was involved in cell-cell junction, focal adhesion, extracellular matrix- (ECM-) receptor interaction, transmembrane receptor protein tyrosine kinase activity, transmembrane-ephrin receptor activity, semaphorin receptor activity, mitogen-activated protein kinase binding, transmembrane receptor protein kinase activity, MAP kinase activity, etc., which further suggested that SOX2 may be closely linked with CRC prognosis.
Figure 4: Forest plot for the correlation between SOX2 and cancer patients' prognosis using multivariate Cox analysis in cancer-specific survival (CSS). Studies shown: Miller et al. [24], Lundberg et al. [23], and Ong et al. [42]. HR: hazard ratio; CI: confidence interval.
There are several limitations in the present study. First, this meta-analysis has not been registered online. Second, the main ethnic populations were Asian and European; other ethnic groups, such as Africans, were lacking. Third, because few studies were available, subgroups were insufficient and meta-regression analysis was not performed. Although heterogeneity was explored through sensitivity analysis, the detailed reasons for the potential heterogeneity remain uncertain. Inappropriate or differing IHC conditions, such as different or unclear cutoff values of SOX2 expression and different sources of anti-SOX2 antibody, are potential sources of heterogeneity. In future studies, a uniform standard should be used to define SOX2 expression as positive or negative. Fourth, only Han et al. reported the correlation between SOX2 expression and DFS, in 164 CRC cases (HR = 2.558, P = 0.020) [6]. Finally, the sample sizes for the analyses of SOX2 expression with the clinicopathological features and the prognosis were relatively small in the meta-analysis. More studies with large study populations are necessary to further determine whether SOX2 has clinical and prognostic value in CRC based on IHC methods.
In conclusion, the current study suggested that SOX2 expression was not significantly correlated with age, gender, MSI status, clinical stage, histological grade, tumor size, pT-stage, lymph node metastasis, distal metastasis, or CSS but was associated with worse OS and DFS. Our data suggested that SOX2 may be used as an independent prognostic marker for worse prognosis in CRC. More large-scale prospective studies are required to further validate these findings.

Data Availability
Data are available in this manuscript.

Ethical Approval
All data collection and processing, including the consenting process, were performed after approval by all local institutional review boards and in accordance with The Cancer Genome Atlas and Gene Expression Omnibus Human Subjects Protection and Data Access Policies.

Conflicts of Interest
The authors declare that they have no conflicts of interest.

Authors' Contributions
Kun Song and Pushi Chen contributed to the study conception and design. Kun Song, Jingduo Hao, Zuyin Ge, and Pushi Chen contributed to drafting of the article and final approval of the submitted version. Kun Song, Jingduo Hao, Zuyin Ge, and Pushi Chen contributed to the analyses and interpretation of the data and completion of figures and tables. All authors read and approved the final manuscript.