SPP1 Might Be a Novel Prognostic Biomarker for Patients With Malignancy: a Meta-analysis and Sequential Verification Based on Bioinformatic Analysis


Background: Several studies have investigated the relationship between secreted phosphoprotein 1 (SPP1) expression and the prognosis of various tumors, but the results are far from conclusive. We therefore performed this meta-analysis to investigate the prognostic value of SPP1 across cancer types, followed by confirmation of the results in The Cancer Genome Atlas (TCGA) database.
Methods: We systematically searched the PubMed, Embase, Web of Science, and Cochrane Library databases and pooled 19 articles, covering 3403 patients and 9 tumor types, in the meta-analysis. Overall survival (OS) and disease-free survival (DFS) in relation to SPP1 expression were the primary outcomes. Subgroup analyses, sensitivity analysis, and publication bias tests were used to examine the heterogeneity and reliability of the results. We also explored the relationship between SPP1 expression and the clinical parameters of tumor patients. Finally, the results were verified with the TCGA database, and we further explored the relationship between SPP1 expression and the tumor immuno-microenvironment (TIME), DNA methylation, and enriched gene pathways.
Results: Our meta-analysis showed that high SPP1 expression was significantly related to poor OS and DFS in various cancers, especially in liver hepatocellular carcinoma (LIHC). High SPP1 expression was also significantly correlated with tumor grade. In the majority of tumor types analyzed from the databases, SPP1 expression was much higher in tumor tissues than in the corresponding normal tissues. In TCGA data, high SPP1 expression was also related to poor OS and DFS in LIHC, which supported the conclusion of the meta-analysis. In addition, high SPP1 expression was related to the infiltration of 6 immune cell types in the TIME and to DNA methylation regulatory genes. Finally, Gene Set Enrichment Analysis (GSEA) suggested that tumor-related gene sets, such as hypoxia and lipid metabolism, were significantly enriched in the high-SPP1 group.
Conclusions: SPP1 is highly expressed in various tumor tissues and is correlated with poor prognosis. SPP1 might promote cancer invasion and metastasis by affecting tumor grade, the TIME, DNA methylation, hypoxia, and lipid metabolism. SPP1 is expected to become a new clinical indicator for tumor detection and prognosis and to provide a new idea for targeted tumor therapy.


Introduction
The incidence and mortality of cancer are increasing year by year, making it a major public health problem worldwide. According to incomplete statistics, about 18 million people are newly diagnosed with cancer and about 9 million die from it each year [1]. Although significant advances have been made in recent decades in the diagnosis, treatment, and precise management of oncological patients, their prognosis remains bleak. On the one hand, the majority of patients are diagnosed at an advanced stage. On the other hand, accurate markers for early diagnosis are lacking. Therefore, it is pivotal to find biomarkers that enable early diagnosis and prognostic evaluation of cancer.
Secreted phosphoprotein 1 (SPP1), also known as osteopontin (OPN), is encoded by the SPP1 gene located on the long arm of human chromosome 4 (4q22.1) [2]. SPP1 is mainly secreted by osteoblasts, vascular smooth muscle cells, and endothelial and epithelial cells, which makes it widely detectable in body fluids such as blood and bile [3]. In normal tissues, SPP1 expression is found in the bone matrix, gallbladder, bile and pancreatic ducts, gastrointestinal (GI) tract, and respiratory, urinary, and reproductive tracts [4,5]. SPP1 is primarily located in the extracellular matrix and is involved in many physiologic processes, including bone matrix remodeling, biomineralization, immune regulation, and anti-apoptosis [6][7][8].
Recently, a growing number of studies have shown that SPP1 is related to cancer invasion, metastasis, and poor prognosis. SPP1 promotes the growth of tumor cells by regulating cell-matrix interaction and cellular signaling through binding to CD44 receptors and integrins [9]. SPP1 can also promote tumorigenesis and metastasis by stimulating angiogenesis and inhibiting tumor cell apoptosis [10,11]. Numerous studies have also suggested that SPP1 expression is increased in tumor tissues compared with normal tissues and is associated with poor prognosis in some cancers, such as lung cancer, glioblastoma, hepatocellular carcinoma, gastric cancer, and colorectal cancer [12][13][14][15][16]. Hence, we conducted the present meta-analysis to explore the relationship between SPP1 expression and prognosis across cancer types. In addition, we verified the results of the meta-analysis with The Cancer Genome Atlas (TCGA, http://cancergenome.nih.gov) database.

Search strategy
Our meta-analysis followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines [17]. We searched the PubMed, Embase, Web of Science, and Cochrane Library databases for literature published up to October 1, 2020, using both MeSH and free terms in Title/Abstract. We used the following terms for literature selection: ("cancer" OR "tumor" OR "neoplasm" OR "carcinoma") AND ("SPP1" OR "Secreted phosphoprotein 1") AND ("prognosis" OR "prognostic" OR "outcome"). In addition, reference lists and other relevant studies were reviewed to find more potential articles. The search was conducted independently by two investigators (AM Jiang and N Liu).

Inclusion and exclusion criteria
Eligible articles were selected by two investigators (QQ Ding and FM Zhao). Inclusion criteria were as follows: (1) the study was conducted in humans and the cancer type was a solid tumor; (2) the study examined the correlation between SPP1 expression and survival data of tumor patients; (3) the study provided the relevant clinicopathological parameters. Literature that met any of the following criteria was excluded: (1) conference abstracts, reviews, patents, case reports, or meta-analyses without original data; (2) articles not written in English; (3) studies not based on humans; (4) overlapping or duplicate data; (5) insufficient hazard ratios (HRs) or other data.
Data extraction and quality assessment
Two investigators (HR Zheng and QQ Ding) extracted the data independently, using a standardized method. In the study selection and data extraction phases, any disagreement was resolved by discussion with a third investigator (AM Jiang). The following information was retrieved from each article: (1) the first author's name; (2) year of publication; (3) country; (4) number of patients; (5) tumor type; (6) detection method of SPP1; (7) cut-off criteria; (8) data on overall survival (OS) or disease-free survival (DFS); (9) antibody type used for immunohistochemistry (IHC); (10) clinical parameters. When HRs and their 95% confidence intervals (CIs) were not reported directly, Engauge Digitizer 4.1 software was used to extract the data from the Kaplan-Meier (K-M) plot [18]. The Newcastle-Ottawa Quality Assessment Scale (NOS) was used to assess the quality of the selected studies [19].
Data collection and analysis from TCGA database
The TCGA database includes more than twenty thousand tumor samples from 33 tumor types and their corresponding normal samples. The RNA sequencing data of SPP1 gene expression for tumor and adjacent non-carcinoma tissues, together with their clinicopathological parameters, were extracted from the TCGA database. Because the TCGA database contains too few normal samples, we integrated the data of normal tissues from the Genotype-Tissue Expression (GTEx, https://gtexportal.org) database to analyze the difference in SPP1 expression between tumor and normal tissues. All patients were divided into two groups according to the median SPP1 expression level. Subsequently, K-M survival curves and log-rank tests were used to compare survival between the two groups. DNA methylation is associated with tumorigenesis and cancer metastasis. As DNMT1, DNMT2, DNMT3A, and DNMT3B are the major enzymes of DNA methylation [20], we further analyzed the relationship between their expression levels and that of SPP1 in tumor tissues from the TCGA database.
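The median split and log-rank comparison described above can be illustrated with a minimal, standard-library Python sketch. It is not the authors' pipeline: the function names `median_split` and `logrank_test` are hypothetical, and assigning samples exactly at the median to the low group is an assumption, since the text does not say how ties were handled.

```python
import math
from statistics import median

def median_split(expression):
    """Label samples 'high' if expression is above the cohort median, else 'low'.
    Assumption: values equal to the median go to the 'low' group."""
    cut = median(expression)
    return ["high" if x > cut else "low" for x in expression]

def logrank_test(times, events, groups):
    """Two-group log-rank test; returns (chi-square statistic, p-value).

    times  : follow-up time per patient
    events : 1 if the event was observed, 0 if censored
    groups : 'high' or 'low' per patient
    """
    obs_minus_exp = 0.0
    variance = 0.0
    for t in sorted({t for t, e in zip(times, events) if e == 1}):
        # Patients still at risk just before time t (overall and in 'high').
        n = sum(1 for ti in times if ti >= t)
        n1 = sum(1 for ti, g in zip(times, groups) if ti >= t and g == "high")
        # Events occurring exactly at time t (overall and in 'high').
        d = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        d1 = sum(1 for ti, e, g in zip(times, events, groups)
                 if ti == t and e == 1 and g == "high")
        obs_minus_exp += d1 - d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi2 = obs_minus_exp ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    return chi2, math.erfc(math.sqrt(chi2 / 2))
```

In practice a dedicated survival package would normally be used; the sketch only makes the grouping rule and the test statistic explicit.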

Relationship between SPP1 and tumor immuno-microenvironment
We explored the correlation between SPP1 expression and the tumor immuno-microenvironment (TIME) using the Tumor Immune Estimation Resource (TIMER, https://cistrome.shinyapps.io/timer) database. TIMER is a database designed to analyze immune cell infiltration across cancer types. It uses statistical methods confirmed by pathological examination to estimate the infiltration of neutrophils, macrophages, dendritic cells, B cells, and CD4+/CD8+ T cells in tumor tissues [21]. Using the TIMER database, we examined the associations between SPP1 expression and the infiltration of these 6 immune cell types.
The ESTIMATE algorithm in the estimate package of R software was used to estimate the proportion of immune and stromal components in the TIME, expressed as three scores: ImmuneScore, StromalScore, and ESTIMATEScore. These reflect the immune component, the stromal component, and the sum of both, respectively; the higher a score, the larger the proportion of the corresponding component in the TIME [22]. We analysed the correlation between SPP1 expression and the three scores using Spearman's test.
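For readers unfamiliar with Spearman's correlation, the following standard-library Python sketch shows the computation: both variables are converted to ranks (ties receive the average rank) and the Pearson correlation of the ranks is taken. The function names are hypothetical, and the sketch returns only the coefficient, not the P value reported in the study.

```python
import math

def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)
```

Because only ranks enter the calculation, any monotonic relationship between SPP1 expression and a score yields a coefficient of magnitude 1, regardless of its shape.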

Gene Set Enrichment Analysis
We downloaded the Hallmark gene set collection from the Molecular Signatures Database (MSigDB, https://www.gsea-msigdb.org/gsea/msigdb/index.jsp) as the target set for Gene Set Enrichment Analysis (GSEA) [23]. The whole transcriptome of all tumor samples was used for GSEA, and only gene sets with NOM p < 0.05 and FDR q < 0.25 were considered significant.
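The significance rule stated above (NOM p < 0.05 and FDR q < 0.25) amounts to a simple filter over the GSEA result table. The sketch below assumes each result is a dictionary with hypothetical keys `name`, `nom_p`, and `fdr_q`; these names do not come from the GSEA software or from the study.

```python
def significant_gene_sets(results, p_cut=0.05, q_cut=0.25):
    """Return names of gene sets passing both the nominal p and FDR q thresholds."""
    return [r["name"] for r in results
            if r["nom_p"] < p_cut and r["fdr_q"] < q_cut]

# Made-up rows for illustration only; they are not results from the study.
example = [
    {"name": "GENE_SET_A", "nom_p": 0.010, "fdr_q": 0.10},
    {"name": "GENE_SET_B", "nom_p": 0.040, "fdr_q": 0.30},  # fails the FDR cut
    {"name": "GENE_SET_C", "nom_p": 0.200, "fdr_q": 0.20},  # fails the p cut
]
print(significant_gene_sets(example))
```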

Statistical analysis
The prognostic value of SPP1 for OS and DFS was calculated as pooled HRs with 95% CIs. Odds ratios (ORs) with 95% CIs were used to evaluate the relationship between SPP1 expression and clinicopathological features. Heterogeneity was assessed using Cochran's Q test and the I² statistic. When heterogeneity was statistically significant (P < 0.05 and I² > 50%), a random-effects model was adopted; otherwise, a fixed-effect model was used. Subgroup analyses were conducted to explore the sources of heterogeneity. We also performed a sensitivity analysis to evaluate the quality and stability of the results by omitting one study at a time. Begg's test and Egger's test were used to assess publication bias. All analyses were performed using Stata V.14.0 (Stata Corporation, College Station, TX) and R V3.6.0 software (R Foundation, Vienna, Austria). P < 0.05 was considered statistically significant in all statistical methods.
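The pooling procedure can be sketched in standard-library Python as follows. This is an illustration, not the Stata or R code used in the study. It assumes inverse-variance weighting on the log HR scale, recovers each standard error from the 95% CI, and uses the DerSimonian-Laird estimator for the random-effects model; the text does not name the estimator, so that choice is an assumption. The function names are hypothetical.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    z = x / 2
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))
    # Recurrence Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1).
    while a < df / 2:
        q += z ** a * math.exp(-z) / math.gamma(a + 1)
        a += 1
    return q

def pool_hazard_ratios(studies, z=1.96):
    """studies: list of (HR, lower 95% limit, upper 95% limit), at least two."""
    if len(studies) < 2:
        raise ValueError("need at least two studies")
    y = [math.log(hr) for hr, _, _ in studies]
    se = [(math.log(hi) - math.log(lo)) / (2 * z) for _, lo, hi in studies]
    w = [1 / s ** 2 for s in se]
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    # Cochran's Q, its P value, and I-squared (in percent).
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))
    df = len(studies) - 1
    p_het = chi2_sf(q, df)
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    # DerSimonian-Laird between-study variance (assumed estimator).
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # Model choice as stated in the text: random effects if P < 0.05 and I2 > 50%.
    use_random = p_het < 0.05 and i2 > 50
    if use_random:
        w = [1 / (s ** 2 + tau2) for s in se]
    pooled = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    se_pooled = math.sqrt(1 / sum(w))
    return {
        "model": "random" if use_random else "fixed",
        "hr": math.exp(pooled),
        "ci": (math.exp(pooled - z * se_pooled), math.exp(pooled + z * se_pooled)),
        "q": q, "p_het": p_het, "i2": i2, "tau2": tau2,
    }
```

Working on the log scale keeps the confidence interval symmetric around the pooled estimate before it is transformed back to an HR.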

Study characteristics and quality assessment
In total, 753 studies were identified and 378 duplicates were excluded. After irrelevant studies were excluded by reading titles and abstracts, 78 articles remained eligible. Of these, 59 studies did not provide HRs or other necessary data. Finally, 19 eligible studies were included in our meta-analysis [24][25][26][27][28][29][30][31][32][33][34][35][36][37][38][39][40][41][42]. The detailed flow chart of the study selection process is presented in Fig. 1. These eligible studies contained 3403 patients and involved 9 types of cancer: breast cancer (BRCA) (n = 5), intrahepatic cholangiocarcinoma (n = 3), non-small cell lung cancer (NSCLC) (n = 2), soft tissue cancer (n = 1), oesophageal squamous cell carcinoma (n = 1), renal clear cell carcinoma (n = 2), hepatocellular carcinoma (n = 3), colorectal cancer (n = 1), and nasopharyngeal carcinoma (n = 1). The characteristics of the eligible studies are listed in Table 1. NOS scores for all studies were more than 5 points. Table S1 shows the results of the quality assessment.
[...] these studies by using the random-effects model (I² = 59.7%, P = 0.001) (Fig. 2A). Additionally, 9 articles, including 1836 patients, were used to evaluate the association between SPP1 expression and DFS. The pooled HR and 95% CI showed that high SPP1 expression was significantly correlated with poor DFS in tumor patients (HR = 1.60, 95% CI = 1.18-2.18, P = 0.002), with significant between-study heterogeneity, also using the random-effects model (I² = 57.6%, P = 0.016) (Fig. 2B). In subgroup analyses based on analysis method, antibody type, region, sample size, and NOS score, high SPP1 expression remained related to poor OS, except in patients from America and in the monoclonal and polyclonal antibody subgroups. We did not find the source of heterogeneity. Because of the small number of studies, we did not conduct a regression analysis to search for it further.

The relationship between SPP1 and clinical parameters
We explored the relationship between SPP1 expression and clinical parameters to find further clinical value of SPP1 (

Sensitivity analysis and publication bias
The sensitivity analysis showed that no individual study influenced the overall results (Fig. 3A and 3B). Begg's test (Fig. 3C and 3D) and Egger's test showed no publication bias in the studies on the association of high SPP1 expression with OS (P = 0.173 for Begg's test; P = 0.083 for Egger's test) or DFS (P = 0.917 for Begg's test; P = 0.184 for Egger's test).
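A leave-one-out sensitivity analysis of the kind reported here recomputes the pooled estimate with each study omitted in turn. The standard-library Python sketch below assumes log HRs and their standard errors as inputs and uses DerSimonian-Laird random-effects pooling; the estimator and the function names are assumptions for illustration, not the Stata routine used in the study.

```python
import math

def dl_pooled_hr(log_hrs, ses):
    """Random-effects pooled HR (DerSimonian-Laird, assumed estimator)."""
    w = [1 / s ** 2 for s in ses]
    fixed = sum(wi * yi for wi, yi in zip(w, log_hrs)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_hrs))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(w) - 1)) / c) if c > 0 else 0.0
    w_re = [1 / (s ** 2 + tau2) for s in ses]
    return math.exp(sum(wi * yi for wi, yi in zip(w_re, log_hrs)) / sum(w_re))

def leave_one_out(log_hrs, ses):
    """Pooled HR recomputed with each study omitted in turn."""
    results = []
    for i in range(len(log_hrs)):
        rest_y = log_hrs[:i] + log_hrs[i + 1:]
        rest_se = ses[:i] + ses[i + 1:]
        results.append(dl_pooled_hr(rest_y, rest_se))
    return results
```

If every leave-one-out estimate stays on the same side of HR = 1 and close to the full pooled value, no single study is driving the result.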
The expression level of SPP1 extracted from TCGA and GTEx databases
The differences in SPP1 RNA expression between various tumor tissues and the corresponding normal tissues were obtained from the TCGA and GTEx databases (Fig. 4). SPP1 expression was much higher in tumor tissues than in the corresponding normal tissues in 25 types of cancer, such as BRCA, cholangiocarcinoma (CHOL), colon adenocarcinoma (COAD), esophageal carcinoma (ESCA), liver hepatocellular carcinoma (LIHC), and NSCLC. In contrast, SPP1 expression was lower than in normal tissues in kidney chromophobe (KICH) and kidney renal clear cell carcinoma (KIRC).
Correlation between SPP1 expression and survival from TCGA database
To validate the prognostic value of SPP1, we explored the relationship between SPP1 expression and the OS and DFS of tumor patients from the TCGA database. High SPP1 expression was related to poor OS in LIHC, bladder urothelial carcinoma (BLCA), glioblastoma multiforme (GBM), brain lower-grade glioma (LGG), ovarian serous cystadenocarcinoma (OV), and thyroid carcinoma (THCA). High SPP1 expression was also linked with poor DFS in LIHC and esophageal carcinoma (ESCA). Taken together, high SPP1 expression was correlated with both poor OS and poor DFS in LIHC (Fig. 5), which is consistent with the results of our meta-analysis. Based on this result, we further explored the effect of SPP1 expression on the TIME and methylation in LIHC.

Correlation between SPP1 expression and DNA methylation
We explored the correlations between the expression of DNA methylation regulatory genes (DNMT1, DNMT2, DNMT3A, and DNMT3B) and SPP1 expression (Fig. 7). SPP1 expression was correlated with the expression of DNA methylation regulatory genes in 14 types of cancer, such as LIHC, BRCA, and COAD. In LIHC, SPP1 expression was positively correlated with DNMT2 (R = 0.20, P < 0.001), DNMT3A (R = 0.15, P = 0.010), and DNMT3B (R = 0.20, P < 0.001).

Gene Set Enrichment Analysis
GSEA was used to assess the biological significance of SPP1 expression in cancers (Fig. 8). In the KEGG collection, three pathways (pathogenic Escherichia coli infection, pentose phosphate pathway, and proteasome) were significantly enriched in the high-SPP1 group, and three pathways (ABC transporters, ether lipid metabolism, and linolenic acid metabolism) were significantly enriched in the low-SPP1 group. In the HALLMARK collection, mTORC1 signaling, hypoxia, and glycolysis were significantly enriched in the high-SPP1 group, whereas Hedgehog signaling, WNT beta-catenin signaling, bile acid metabolism, and KRAS signaling were significantly enriched in the low-SPP1 group.

Discussion
SPP1 plays an important role in many physiologic processes, such as bone matrix remodeling, immune regulation, anti-apoptosis, and wound healing [6][7][8]43]. However, a growing body of research has shown that SPP1 is correlated with the tumor microenvironment and poor prognosis in various cancers. A single study is limited by insufficient data and a single experimental model, so a meta-analysis pooling multiple studies is necessary to explore the potential clinical value of SPP1.
Our meta-analysis showed that high SPP1 expression was significantly correlated with poor OS and DFS in various cancers, especially in LIHC. The analysis of clinical parameters suggested that high SPP1 expression is associated with tumor grade, which might in turn contribute to poor clinical prognosis. We then used the TCGA database to validate the results of our meta-analysis. SPP1 expression was much higher in tumor tissues than in the corresponding normal tissues in 25 types of cancer from the TCGA and GTEx databases. Survival analysis of TCGA data showed that high SPP1 expression was related to poor OS and DFS in LIHC, which supported the conclusion of our meta-analysis. In addition, our study found that SPP1 was related to the TIME and DNA methylation in LIHC. According to GSEA, high SPP1 expression was correlated with many biological pathways relevant to tumor prognosis, such as hypoxia, lipid metabolism, and mTOR signaling.
Although SPP1 has been shown to be highly expressed in various tumor tissues and to be related to tumorigenesis and metastasis, the exact role it plays remains unclear. Bramwell et al. found that high SPP1 mRNA expression was correlated with higher grade in soft tissue tumors [25], which is consistent with our result. However, our study did not find a relationship between SPP1 expression and other advanced features of cancer, such as tumor size, distant metastasis, lymph node metastasis, and vascular invasion. A possible reason is that the included studies mostly involved resectable, early-stage tumor samples, while advanced tumor samples were too few.
We found positive correlations between high SPP1 expression and the infiltration of B cells, T cells, neutrophils, macrophages, and dendritic cells in LIHC. Besides osteoblasts, vascular smooth muscle cells, and endothelial and epithelial cells, SPP1 can also be synthesized as a multifunctional cytokine by activated immune cells, including T cells, natural killer cells, and macrophages [3,44]. In addition, SPP1 promotes cell-mediated immune responses and plays an important part in chronic inflammation [45]. During local inflammation, SPP1 promotes immune cell chemotaxis, B cell proliferation, immunoglobulin production, and mast cell degranulation [46]. Furthermore, SPP1 enhances the activity of Th1 cells by stimulating the production of IL-12 and IFN-γ and inhibiting the production of Th2-dependent IL-10 [47][48][49]. SPP1 also affects the maturation, migration, and polarization of dendritic cells [50]. Overall, SPP1 might have a certain effect on the TIME, and the specific mechanisms need to be further verified by more in vitro and in vivo experiments.
SPP1 is correlated with histone H3 lysine 4 trimethylation (H3K4me3) as a key regulator [51]. H3K4me3 often leads to transcriptional activation of tumor-related genes [52] and is linked to tumor progression and poor prognosis in hepatocellular carcinoma, lung cancer, prostate cancer, and pancreatic cancer [53][54][55][56]. Additionally, SPP1 expression is stimulated by hypoxia, and SPP1 signaling also transcriptionally upregulates the expression of hypoxia markers, enhancing tumor angiogenesis and promoting cancer progression and invasion in breast cancer and melanoma [57,58]. SPP1 is involved in lipid metabolism and is related to the prognosis of breast cancer [59]. These findings are consistent with our results.
At present, a large number of studies have shown that high SPP1 levels in body fluids and tumor tissues are correlated with the occurrence and development of many kinds of tumors. As a secreted protein, SPP1 can easily be detected in body fluids, such as blood, urine, pleural fluid, and ascites [3]. SPP1 may therefore be a convenient clinical biomarker for diagnosing cancer and evaluating prognosis. However, SPP1 splice variants (OPN-a, OPN-b, and OPN-c) in tumors are cell- and tissue-type specific and might be functionally heterogeneous [60]. This poses a challenge for the detection methods and predictive value of SPP1.
However, our research still has some limitations. First, many unavoidable factors, such as different tumor types, analysis methods, antibody types, and sample sizes, contributed to the heterogeneity. Although we performed subgroup analyses, we did not find the source of heterogeneity in OS. Because of the small number of studies, we did not conduct a regression analysis to search further for the source of heterogeneity. Second, we used Engauge Digitizer software to extract HRs and their 95% CIs from K-M plots when the data could not be obtained directly from the article, which might affect the accuracy of the results. Third, we unified the names of some subgroups of clinicopathological features, such as age and tumor size, to facilitate comparison. Last but not least, our analysis of the relationship between SPP1 expression and the tumor immuno-microenvironment, DNA methylation, and enriched pathways was based on database data. The specific mechanisms still need to be verified by basic and clinical research.
