Identification of seven tumor‐educated platelets RNAs for cancer diagnosis

Abstract Background Tumor‐educated platelets (TEPs) may enable blood‐based cancer diagnosis. This study aimed to identify diagnostic TEPs genes involved in carcinogenesis. Materials and Methods The TEPs differentially expressed genes (DEGs) between healthy samples and early/advanced cancer samples were obtained using bioinformatics. Gene ontology (GO) analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis were used to identify the pathways and functional annotation of TEPs DEGs. Protein‐protein interaction of these TEPs DEGs was analyzed based on the STRING database and visualized with Cytoscape software. Correlation analysis and diagnostic analysis were performed to evaluate the diagnostic value of TEPs mRNA expression for early/advanced cancers. Quantitative real‐time PCR (qRT‐PCR) was applied to validate the role of DEGs in cancers. Results TEPs mRNAs were mostly involved in protein binding, extracellular matrix, and cellular protein metabolic process. RSL24D1 was negatively correlated with early‐stage cancers compared to healthy controls and may potentially be used for early cancer diagnosis. In addition, HPSE, IFI27, LGALS3BP, CRYM, HBD, COL6A3, LAMB2, and IFITM3 showed an upward trend in expression from early to advanced cancer stages. Moreover, ARL2, FCGR2A, and KLHDC8B were positively associated with advanced, metastatic cancers compared to healthy controls. Among the 12 selected DEGs, the expression of 7 DEGs, including RSL24D1, IFI27, CRYM, HBD, IFITM3, FCGR2A, and KLHDC8B, was verified by qRT‐PCR. Conclusion This study suggests that the 7‐gene TEPs liquid‐biopsy biomarkers may be used for cancer diagnosis and monitoring.

To date, many blood-based biomarkers, including circulating nucleic acids, circulating tumor cells (CTCs), extracellular vesicles, and TEPs, have been used for cancer detection and diagnosis. [4][5][6][7][8][9][10] Nevertheless, detection of CTCs and ctDNA is difficult and not equally feasible in all tumor entities. 11 As the second most abundant cell type in peripheral blood, generated from the cytoplasm of megakaryocytes, platelets are a central regulator of thrombosis and hemostasis. 12 Platelets can ingest cellular RNA and proteins. 13,14 Recent studies have demonstrated that platelets have essential roles in tumor progression and metastasis. 15,16 They can be directly or indirectly activated by tumor cells, which alters their behavior and RNA profile. 17,18 Tumor-educated platelets (TEPs) can promote tumor metastasis. 15,19,20 TEPs RNAs have been emerging as potential blood-based biomarkers for cancer diagnosis, prognosis, and prediction. 9,10 Recent studies have reported that TIMP1 and ITGA2B mRNA in TEPs and a three-platelet mRNA set (MAX, MTURN, and HLA-B) may be used as diagnostic biomarkers for colorectal cancer and lung cancer. [21][22][23][24] Yet, the role and function of TEP mRNAs in other types of cancers are still not clear. Most platelet profile analysis data from patients with different cancers have been uploaded to public databases, but thorough analysis has not been performed. Data from independent studies are also limited by their sources (i.e., single cohort studies) and sample heterogeneity.
Therefore, the use of integrated bioinformatics methods to re-analyze these data may provide new insights for further cancer diagnosis.

| Data process
The edgeR package was used to identify differentially expressed genes by linear modeling. Genes with FC (fold change) > 1 and adj p-value (adjusted p-value) < 0.05 were considered to be differentially expressed between platelets collected from early/metastatic cancer samples and those from healthy samples. R software was then used to generate heatmaps and volcano plots of the differentially expressed genes in TEPs.
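The filtering step described above can be sketched in a few lines. This is a minimal illustration and not the edgeR pipeline itself: it assumes the fold-change cutoff applies to the absolute log2 fold change (the text does not spell this out) and that the adjusted p-value is a Benjamini-Hochberg correction. The function names and the toy inputs are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_degs(genes, log2_fold_changes, pvalues, lfc_cutoff=1.0, alpha=0.05):
    """Keep genes passing both cutoffs and label their direction.

    Assumption: the cutoff is applied to |log2 fold change|.
    """
    adjusted = benjamini_hochberg(pvalues)
    selected = []
    for gene, lfc, adj_p in zip(genes, log2_fold_changes, adjusted):
        if abs(lfc) > lfc_cutoff and adj_p < alpha:
            selected.append((gene, "up" if lfc > 0 else "down"))
    return selected


# Toy input with placeholder gene labels (not taken from the datasets).
genes = ["geneA", "geneB", "geneC", "geneD"]
log2fc = [2.0, -1.5, 0.3, 1.2]
pvals = [0.001, 0.01, 0.02, 0.5]
degs = select_degs(genes, log2fc, pvals)
```

With this toy input, only geneA and geneB pass both cutoffs; geneC fails the fold-change cutoff and geneD fails the adjusted p-value cutoff.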

| Functional and pathway enrichment analysis
Gene ontology (GO) analysis was used to reveal the function of genes and gene products in many organisms. 25 Kyoto Encyclopedia of Genes and Genomes (KEGG) allots genes and genomes functional meanings at the molecular and higher levels. 26

DAVID (Database for Annotation, Visualization and Integrated Discovery, http://www.david.niaid.nih.gov) is a database that is applied for annotation, visualization, and integrated discovery. 27 Terms with p-value < 0.1 28 were considered enriched.
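To make the enrichment p-value concrete, the sketch below computes a plain hypergeometric over-representation tail for one annotation term. It is an illustration only: DAVID reports a modified Fisher exact score (EASE), which this function does not reproduce, and the counts in the example are hypothetical.

```python
from math import comb


def enrichment_pvalue(hits, list_size, annotated, background):
    """P(X >= hits) under the hypergeometric distribution.

    hits:       genes in the submitted list that carry the term
    list_size:  number of genes in the submitted list
    annotated:  genes in the background that carry the term
    background: total number of background genes
    """
    upper = min(list_size, annotated)
    total = comb(background, list_size)
    tail = sum(
        comb(annotated, k) * comb(background - annotated, list_size - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Hypothetical counts: 3 of 43 list genes carry a term that annotates
# 40 of 20000 background genes.
p = enrichment_pvalue(hits=3, list_size=43, annotated=40, background=20000)
is_enriched = p < 0.1  # cutoff stated in the text
```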

| PPI network construction
The Search Tool for the Retrieval of Interacting Genes (STRING, https://string-db.org/) database collects and integrates known and predicted protein-protein association data across many organisms, representing functional interactions between expressed proteins. 29 Thus, the protein-protein interaction (PPI) network of DEGs was built using the STRING database. The Molecular Complex Detection (MCODE) plugin for Cytoscape was applied to screen modules of the PPI network with degree cutoff = 2, node score cutoff = 0.2, and k-core = 2.
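The k-core = 2 setting can be illustrated with a small pruning routine. This sketch shows only the k-core idea (repeatedly removing nodes with fewer than k neighbours); it is not the MCODE algorithm, which additionally scores nodes and grows clusters from seeds. The protein labels and edges are hypothetical.

```python
def k_core(edges, k=2):
    """Return the adjacency map of the k-core of an undirected graph."""
    adjacency = {}
    for a, b in edges:
        if a == b:
            continue  # ignore self-loops
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    changed = True
    while changed:
        changed = False
        # Collect nodes whose current degree is below k, then drop them.
        low_degree = [n for n, nbrs in adjacency.items() if len(nbrs) < k]
        for node in low_degree:
            for neighbour in adjacency.pop(node):
                if neighbour in adjacency:
                    adjacency[neighbour].discard(node)
            changed = True
    return adjacency


# Hypothetical network: a triangle P1-P2-P3 with a pendant node P4.
edges = [("P1", "P2"), ("P2", "P3"), ("P1", "P3"), ("P1", "P4")]
core = k_core(edges, k=2)
```

In this toy network the pendant node P4 is pruned and the triangle remains as the 2-core.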

| Patients and healthy volunteers
Approval for obtaining whole-blood samples from healthy volunteers and patients was obtained from the Ethics Committee of The  Spearman correlation coefficient was used to assess the correlation between average gene expression and sample group, in order to identify genes whose expression rises or falls strictly monotonically across groups. The Mann-Whitney test was then applied to identify the differentially expressed genes among the different stages. Receiver operating characteristic (ROC) curve analysis was used to evaluate the discriminatory power of the combinations. Data were shown as median ± interquartile range.
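The statistics named above can be sketched with the standard library alone. The example computes a Spearman coefficient (Pearson correlation of average ranks), the Mann-Whitney U statistic, and the area under the ROC curve via the identity AUC = U / (n1 * n2). It does not compute p-values, and the stage coding (0 = healthy, 1 = localized, 2 = metastatic) and expression values are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of positions i..j, 1-based
        for t in range(i, j + 1):
            ranks[order[t]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman coefficient as the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def mann_whitney_u(group_a, group_b):
    """U statistic for group_a: pairs with a > b count 1, ties count 0.5."""
    return sum(
        1.0 if a > b else 0.5 if a == b else 0.0
        for a in group_a
        for b in group_b
    )


def auc(group_a, group_b):
    """Area under the ROC curve, using AUC = U / (n_a * n_b)."""
    return mann_whitney_u(group_a, group_b) / (len(group_a) * len(group_b))


# Hypothetical data: stage codes and one gene's expression per sample.
stage = [0, 0, 1, 1, 2, 2]
expression = [1.0, 1.2, 2.0, 2.1, 3.0, 3.5]
rho = spearman(stage, expression)
```

A coefficient close to +1 or -1 indicates expression that rises or falls monotonically with stage, which is the pattern the correlation step screens for.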

| mRNA profiles of TEPs in localized and metastatic NSCLC cancer
The data of GSE89843 include 779 platelet samples collected from  (Table S3). In addition, there were 49 differentially expressed mRNAs between 402 metastatic NSCLC treated platelet samples and 234 platelet samples from healthy controls, including 39 up-regulated and 10 down-regulated genes (Table S4).
Hierarchical clustering and a volcano plot were used to identify the differentially expressed mRNAs (Figure 2B,C, Figure S2A,B). As shown in Figure 2D, among the 33 commonly altered differential mRNAs from the above platelet RNA-sequencing datasets, 26 were consistently up-regulated, and seven were consistently down-regulated.

| Identification of differentially expressed mRNAs in platelets between localized and metastatic cancer patients
To investigate the differentially expressed mRNAs in platelets between localized or metastatic cancer and healthy donors, we analyzed the microarray data of GSE68086 and GSE89843. Using fold  Table S5). In addition, to further screen out the consistently altered DEGs in the localized and/or metastatic tumor-educated platelets, we intersected the four up-regulated groups and the four down-regulated groups. In the four up-regulated groups, we extracted 13 common DEGs, eight DEGs mainly found in tumor-educated platelets, and 12 DEGs in the metastatic tumor-educated platelets. In the four down-regulated groups, we found two common DEGs, seven DEGs mainly located in tumor-educated platelets, and one DEG in the metastatic tumor-educated platelets (Figure 3A,B, Table 1).
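The intersection step can be pictured with plain set operations. The sketch below is one possible reading, stated as an assumption because the text does not give the exact rule: a DEG counts as localized or metastatic when it appears in that stage in both datasets, and as common when it appears in all four groups. The gene labels are placeholders, not genes from the study.

```python
def partition_degs(localized_a, metastatic_a, localized_b, metastatic_b):
    """Split DEGs from four groups (two stages x two datasets).

    Assumption: a gene must be reported in both datasets to count
    for a given stage.
    """
    localized = localized_a & localized_b
    metastatic = metastatic_a & metastatic_b
    return {
        "common": localized & metastatic,
        "localized_only": localized - metastatic,
        "metastatic_only": metastatic - localized,
    }


# Placeholder gene labels for the up-regulated groups of two datasets.
groups = partition_degs(
    localized_a={"g1", "g2", "g3"},
    metastatic_a={"g1", "g4", "g5"},
    localized_b={"g1", "g2", "g6"},
    metastatic_b={"g1", "g4", "g7"},
)
```

The same function would be applied separately to the down-regulated groups.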

| Molecular concepts significantly enriched in tumor educated platelets
To explore the underlying mechanism and signaling pathways of Cell components showed that DEGs were enriched in extracellular exosome, extracellular region, proteinaceous extracellular matrix, extracellular matrix, mitochondrial outer membrane, and extracellular space (Figure 3D, Figure S4, Table S6). By using KEGG analysis, we found these 43 DEGs were significantly enriched in metabolic processes, mostly in glycine, serine, and threonine metabolism (Figure 3E,  Figure 3F).

| Diagnostic value of RSL24D1 for early, localized cancer
To further investigate the signature of these TEPs DEGs, we primarily Taking the results from two datasets into intersection, we found that RSL24D1 was negatively correlated with early, localized cancer as compared to healthy controls and had diagnostic value for early, localized cancer, with a sensitivity of 71.8% and a specificity of 64.3%. We next explored the diagnostic value of these TEPs DEGs for advanced, metastatic cancers. Using the Spearman correlation coefficient, we identified that, apart from CA1, the expression levels of the remaining 12 TEPs DEGs were positively or negatively correlated with advanced cancers across the two datasets, GSE68086 and GSE89843 (Figure 6A, Figure S7A Figure S7B). Intersecting these results, we found that ARL2, FCGR2A, and KLHDC8B were positively correlated with advanced, metastatic pan-cancer in comparison with healthy controls and showed diagnostic value for advanced, metastatic cancers, with sensitivities of 59.2%, 61.8%, and 59.7% and specificities of 80%, 89.1%, and 83.6%, respectively.
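Sensitivity and specificity figures such as those above depend on the expression cutoff chosen. The sketch below shows how the two values follow from a cutoff and picks the cutoff that maximizes Youden's J (sensitivity + specificity - 1). The use of Youden's J is an assumption, since the text does not state how cutoffs were selected, and the expression values are hypothetical.

```python
def sensitivity_specificity(cases, controls, threshold, higher_in_cases=True):
    """Classify by a single expression cutoff and return (sens, spec)."""
    if higher_in_cases:
        true_pos = sum(v >= threshold for v in cases)
        true_neg = sum(v < threshold for v in controls)
    else:
        # For markers that are lower in cases, such as a down-regulated gene.
        true_pos = sum(v <= threshold for v in cases)
        true_neg = sum(v > threshold for v in controls)
    return true_pos / len(cases), true_neg / len(controls)


def best_youden_threshold(cases, controls, higher_in_cases=True):
    """Scan observed values and keep the cutoff with the largest Youden J."""
    best = None
    for threshold in sorted(set(cases) | set(controls)):
        sens, spec = sensitivity_specificity(
            cases, controls, threshold, higher_in_cases
        )
        j = sens + spec - 1
        if best is None or j > best[0]:
            best = (j, threshold, sens, spec)
    return best[1], best[2], best[3]


# Hypothetical expression values for an up-regulated marker.
cancer = [5, 6, 7, 8]
healthy = [1, 2, 3, 6]
cutoff, sens, spec = best_youden_threshold(cancer, healthy)
```

For a marker that decreases in cancer, passing `higher_in_cases=False` flips the direction of the comparison.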

| Validation of selected DEGs by qRT-PCR
To confirm the 12 DEGs derived from the public datasets, we collected and washed platelets from NSCLC patients, CRC patients, and healthy volunteers to further test the expression of the DEGs. The relative expression levels of selected DEGs were detected using qRT-PCR. Among the 12 selected DEGs, the relative expression of RSL24D1 was down-regulated in cancer patients' platelets, while IFI27, CRYM, HBD, IFITM3, FCGR2A, and KLHDC8B were up-regulated in cancer patients' platelets, which was consistent with the results of the bioinformatics analysis (Figure 7).

| DISCUSSION
In this study, we downloaded the next-generation sequencing datasets, GSE68086 (285 samples) and GSE89843 (636 samples), from GEO (https://www.ncbi.nlm.nih.gov/geo/). We then explored the mechanisms of changes in TEPs RNA profiles by using bioinformatics analysis, correlation analysis, diagnostic analysis, and qRT-PCR.
Seven TEPs mRNAs were discovered in our study: RSL24D1, IFI27, CRYM, HBD, IFITM3, FCGR2A, and KLHDC8B in TEPs, which are mainly enriched in protein binding, extracellular matrix, and metabolic process, and may be used for cancer diagnosis.
FIGURE 2 mRNA profiles of TEPs from localized and metastatic NSCLC cancer patients, and platelets from healthy controls, based on the dataset GSE89843. (A) Number of platelet samples of healthy controls and NSCLC cancer patients at different stages. (B, C) Hierarchical clustering heatmap of DEGs in the expression profiling dataset GSE89843. (B) Heatmap of DEGs in TEPs collected from healthy controls and early, localized NSCLC cancer patients. (C) Heatmap of DEGs in TEPs collected from healthy controls and metastatic NSCLC cancer patients. The horizontal axis indicates the sample, and the vertical axis indicates the DEGs. Red represents the up-regulated DEGs, and green represents the down-regulated DEGs. (D) Identification of TEPs mRNAs between localized and metastatic NSCLC cancer. Left, commonly altered differentially expressed TEPs mRNAs. Middle, identification of up-regulated differentially expressed TEPs mRNAs. Right, identification of down-regulated differentially expressed TEPs mRNAs
| 9 of 15 GE et al.
Blood samples for liquid biopsy include circulating tumor cells (CTCs), cell-free nucleic acids, exosomes (DNA, RNA, miRNA, proteins), and TEPs. 31 TEPs-based liquid biopsy has been shown to be useful for the early detection of cancers compared to CTCs, cell-free nucleic acids, and exosomes, which are found at lower levels at the early stage. 31 In this study, 7 TEPs mRNAs were verified, including RSL24D1, IFI27, CRYM, HBD, IFITM3, FCGR2A, and KLHDC8B, which may be used for cancer detection. RSL24D1 (ribosomal L24 domain containing 1, also known as C15orf15, RPL24L, or My024) encodes the ribosome biogenesis protein RLP24, which is involved in the biogenesis of the 60S ribosomal subunit and ensures the docking of GTPBP4/NOG1 to pre-60S particles. RSL24D1 is related to hypercholesterolemia and chronic kidney disease (CKD) in children. 36,37 A recent genome-wide methylation profile analysis indicated that changes in RSL24D1 were associated with advanced-stage NSCLC. 38 Our study found that RSL24D1 in TEPs was negatively associated with cancers at an early stage, including breast cancer, lung cancer, CRC, PAAD, and HBC, compared to healthy controls. In addition, we also demonstrated the diagnostic value of RSL24D1 for early pan-cancer with a sensitivity of 71.8% and a specificity of 64.3%.
FCGR2A and KLHDC8B in TEPs were positively related to metastatic cancers in comparison with healthy controls and had potential diagnostic significance for metastatic cancers, with sensitivities of 61.8% and 59.7% and specificities of 89.1% and 83.6%, respectively.
Previous studies reported that FCGR2A regulates cancer growth and cancer invasion and has an important role in tumor recurrence. 39,40 On the other hand, the role of KLHDC8B in tumors is not clear.   [49][50][51] According to the GO term analysis and KEGG pathway analysis, TEPs mRNAs were associated with protein binding, extracellular matrix, cellular protein metabolic process, mitochondrial outer membrane, and innate immune response in the mucosa, and were enriched in metabolic processes, mostly glycine, serine, and threonine metabolism. As one of the ten cancer hallmarks, metabolic abnormalities are mutually causal with tumorigenesis. 52 Tumor blood metastasis can be divided into three  promote tumor growth and angiogenesis of tumor tissue. 16,54 Consequently, we believe alternative TEPs mRNAs, including RSL24D1, IFI27, CRYM, HBD, IFITM3, FCGR2A, and KLHDC8B mRNA, can potentially serve as non-invasive biomarkers for diagnosing cancers and may even predict the prognosis of pan-cancer.
Relative expression level of KLHDC8B, FCGR2A, ARL2, IFITM3, LAMB2, COL6A3, HBD, CRYM, LGALS3BP, IFI27, HPSE and RSL24D1. NSCLC: non-small cell lung carcinoma; CRC: colorectal cancer. *p < 0.05, **p < 0.01, and ***p < 0.001
Yet, more scientific research and evidence are needed to further verify this conclusion.

CONFLICT OF INTEREST
The authors declare no conflict of interest.

DATA AVAILABILITY STATEMENT
All data are available upon reasonable request from the corresponding author.