Plastin-3 is a diagnostic and prognostic marker for pancreatic adenocarcinoma and distinguishes from diffuse large B-cell lymphoma

Altered Plastin-3 (PLS3; an actin-binding protein) expression has been associated with human carcinogenesis, including pancreatic ductal adenocarcinoma (PDA). This study first assessed differentially expressed genes (DEGs) and then confirmed, bioinformatically and experimentally, that PLS3 can predict PDA prognosis and distinguish PDA from diffuse large B-cell lymphoma. This study screened multiple online databases and revealed DEGs among PDA, normal pancreas, diffuse large B-cell lymphoma (DLBCL), and normal lymph node tissues and then focused on PLS3. These DEGs were analyzed using Gene Ontology (GO) terms, Kaplan–Meier curves, and the log-rank test to characterize their association with PDA prognosis. The receiver operating characteristic (ROC) curve was plotted, and Spearman’s tests were performed. Differential PLS3 expression in different tissue specimens (n = 30) was evaluated by reverse transcription quantitative polymerase chain reaction (RT-qPCR). There were a great number of DEGs between PDA and lymph node, between PDA and DLBCL, and between PDA and normal pancreatic tissues. Five DEGs (NET1, KCNK1, MAL2, PLS1, and PLS3) were associated with poor overall survival of PDA patients, but only PLS3 was further verified by the R2 and ICGC datasets. The ROC analysis showed a high PLS3 AUC (area under the curve) value for PDA diagnosis, and PLS3 was able to distinguish PDA from DLBCL. The results of Spearman's analysis showed that PLS3 expression was associated with levels of KRT7, SPP1, and SPARC. Differential PLS3 expression in different tissue specimens was further validated by RT-qPCR. Altered PLS3 expression was useful in the diagnosis and prognosis of PDA as well as in distinguishing PDA from DLBCL.


Introduction
common cause of cancer death in the USA by 2030 [2]. The development of PDA is associated with many risk factors that are poorly characterized, rendering PDA prevention almost impossible [1]. Moreover, despite recent advances in our understanding of the tumor's biology, PDA is still frequently diagnosed at an advanced stage; only up to 15% of PDA patients have surgically resectable tumors at initial diagnosis [1]. Surgical tumor resection is not possible in many cases, and PDA is insensitive to chemoradiotherapy [3,4]. In early-stage PDA, preoperative chemotherapy followed by surgical resection is regarded as a curative treatment [5], but this approach has not been associated with significant improvement in patient outcomes or quality of life in advanced PDA [6]. Thus, there is an urgent need to understand the molecular mechanisms of PDA carcinogenesis and progression in order to develop biomarkers to diagnose PDA early, to predict prognosis and treatment responses, and to design novel strategies for the control of PDA.
In terms of early PDA detection, the differential diagnosis of an abdominal mass, such as PDA, is challenging. Primary pancreatic lymphoma (PPL) is an extremely rare form of extranodal malignant lymphoma, accounting for less than 0.5% of pancreatic neoplasms and 1% of extranodal lymphomas [7]. The most common histological type of PPL is diffuse large B-cell lymphoma (DLBCL), which accounts for nearly 60% of all PPL cases [8]. PPL manifests as an abdominal mass that is similar to PDA [9]; however, chemotherapy (such as the CHOP regimen) is the preferred treatment option for PPL patients, so the differential diagnosis is crucial. If PPL is misdiagnosed as PDA, patients may undergo unnecessary surgery, which is not an ideal treatment option [10][11][12].
In this study, we performed bioinformatic analyses of multiple datasets to assess differentially expressed genes (DEGs) in PDA, normal pancreas, DLBCL, and normal lymph node tissue. We then focused on PLS3 as a biomarker for the early diagnosis of PDA, as well as for the prediction of prognosis and differential diagnosis from DLBCL. We expect these results to provide useful information regarding PLS3 as a biomarker for PDA for future validation studies.

Identifications of DEGs
To identify the DEGs, we processed the downloaded data with the "limma" R package in R version 3.6.3 (https://www.r-project.org), following a previous study [21]. To control for false-positive results, P-values were adjusted (adj. P) using the Benjamini-Hochberg method. Fold changes were expressed on a logarithmic scale (logFC). A gene was considered a DEG when adj. P < 0.05 and |logFC| > 1. The R package "sva" was used to adjust for batch effects among the GSE16515, GSE15471, GSE32676, and GSE71989 datasets [22]. DEGs were visualized with the "pheatmap" R package. An online Venn diagram tool (http://bioinformatics.psb.ugent.be/webtools/Venn/) was used to identify intersections among gene sets.
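The DEG criterion above (adj. P < 0.05 and |logFC| > 1) can be illustrated with a minimal Python sketch. This is not the limma pipeline used in the study; it only reproduces the Benjamini-Hochberg adjustment and the thresholding step on toy values, and the function names and gene labels are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_degs(genes, log_fc, p_values, adj_p_cutoff=0.05, log_fc_cutoff=1.0):
    """Split genes into up- and downregulated DEGs using both thresholds."""
    adj = benjamini_hochberg(p_values)
    up, down = [], []
    for gene, lfc, q in zip(genes, log_fc, adj):
        if q < adj_p_cutoff and abs(lfc) > log_fc_cutoff:
            (up if lfc > 0 else down).append(gene)
    return up, down


# Toy input (hypothetical gene labels, not data from the study).
toy_up, toy_down = select_degs(
    ["GENE_A", "GENE_B", "GENE_C", "GENE_D"],
    [2.0, -1.5, 0.5, -3.0],
    [0.01, 0.04, 0.03, 0.005],
)
```

In this toy case GENE_C passes the adjusted P-value cut-off but is dropped because its |logFC| does not exceed 1.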

Gene ontology (GO) analysis
GO analysis was applied to annotate genes and their products (mRNA or proteins) and to identify characteristic biological properties of high-throughput transcriptome or genome data. These analyses were conducted with the "clusterProfiler" R package. The cut-off criterion for a significant function was set as adj. P < 0.05 [23], and the GO terms were classified into three groups: biological processes (BP), cellular components (CC), and molecular functions (MF). Data were plotted with the "GOplot" R package [24].
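GO over-representation analysis of this kind is typically based on a hypergeometric test: given a background of annotated genes, how likely is it to see at least the observed overlap between the query gene list and a GO term by chance? The sketch below shows that calculation in Python under this assumption; it is not the clusterProfiler implementation, and the function name and arguments are hypothetical. The resulting P-values would still need multiple-testing adjustment before applying the adj. P < 0.05 cut-off.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_term, n_query, n_overlap):
    """Upper-tail hypergeometric P-value: P(overlap >= n_overlap).

    n_background: number of annotated genes in the background
    n_term:       background genes annotated to the GO term
    n_query:      genes in the query list (e.g. the DEG list)
    n_overlap:    query genes annotated to the GO term
    """
    total = comb(n_background, n_query)
    upper = min(n_term, n_query)
    tail = sum(
        comb(n_term, k) * comb(n_background - n_term, n_query - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total
```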

Tissue samples, RNA isolation, and reverse transcription quantitative polymerase chain reaction (RT-qPCR)
Ten pairs of PDA and normal pancreas samples and five pairs of DLBCL and normal lymph node samples were obtained from Tongji Hospital of Huazhong University of Science and Technology, Wuhan, China. This study  Table S1). Tissue samples were subjected to total RNA isolation using the RNA isolater Total RNA Extraction Reagent (Vazyme, Nanjing, China) and reverse-transcribed into cDNA using the HiScript III RT SuperMix for qPCR (+gDNA wiper) (Vazyme) according to the manufacturer's instructions. qPCR was then performed using the ChamQ Universal SYBR qPCR Master Mix (Vazyme) in the iQ5™ quantitative PCR detection system (Bio-Rad, Richmond, CA, USA). The primers were: PLS3, 5′-AAG ACC TTC CGC AAA GCA ATC-3′ and 5′-TGT TCC TTC GCT GGA CAA CTC-3′, and ACTB, 5′-GTC CAC CGC AAA TGC TTC TA-3′ and 5′-TGC TGT CAC CTT CAC CGT TC-3′. The qPCR data were quantified using the 2^(-ΔΔCt) method.
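As a brief illustration of the 2^(-ΔΔCt) calculation, the following Python sketch computes the relative expression of a target gene (here PLS3) normalized to a reference gene (here ACTB) and to a control sample. The function name and the Ct values in the comment are hypothetical and are not measurements from this study.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change of the target gene by the 2^(-ddCt) method."""
    # dCt: target Ct normalized to the reference gene within each sample.
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    # ddCt: normalized sample relative to the normalized control.
    delta_delta = delta_sample - delta_control
    return 2.0 ** (-delta_delta)


# Hypothetical Ct values: target 22 vs reference 18 in the sample,
# target 25 vs reference 19 in the control, giving ddCt = -2.
fold_change = relative_expression(22.0, 18.0, 25.0, 19.0)
```

A ΔΔCt of -2 corresponds to a four-fold higher relative expression in the sample than in the control.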

Statistical analyses
To assess bivariate correlations between variables, we determined Spearman's rank correlation coefficient (r_s) using R 3.6.3 and SPSS 21.0 (SPSS Inc., Chicago, IL, USA). The results were visualized using the "corrplot" R package, and P < 0.05 was considered statistically significant. Kaplan-Meier curves were plotted for 176 PDA patients, whose data were obtained from TCGA and downloaded from the UCSC Xena site (https://xena.ucsc.edu), and overall survival was compared with the log-rank test after stratifying patients by DEG expression. Additional data were downloaded from the R2: Genomics Analysis and Visualization Platform (http://r2.amc.nl) and the International Cancer Genome Consortium (ICGC; https://dcc.icgc.org/) [25]. Kaplan-Meier analysis of these data was performed with the "survival" R package. For each DEG, patients were divided into high- and low-expression groups at the median expression level, and the groups were compared using the log-rank test (significance threshold P < 0.05). We also performed univariate and multivariate Cox regression analyses using the "survival" R package. The association between PLS3 expression and clinicopathological features was analyzed with a Chi-square test (P < 0.05). To evaluate the utility of PLS3 in diagnosing PDA, we plotted receiver operating characteristic (ROC) curves with the "pROC" R package [26] and calculated the area under the curve (AUC) with SPSS 21.0.
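The median split followed by a log-rank test can be sketched in plain Python as follows. This is a simplified two-group illustration, not the "survival" R package used in the study; the function names are hypothetical, and placing patients exactly at the median in the low group is an assumption of this sketch.

```python
import math
from statistics import median


def split_by_median(expression):
    """Label each patient 'high' or 'low' relative to the median expression."""
    cut = median(expression)
    return ["high" if x > cut else "low" for x in expression]


def logrank_test(times, events, groups):
    """Two-group log-rank test; returns (chi-square statistic, P-value).

    times:  follow-up time per patient
    events: 1 if death was observed, 0 if censored
    groups: group label per patient (exactly two distinct labels)
    """
    labels = sorted(set(groups))
    if len(labels) != 2:
        raise ValueError("exactly two groups are required")
    first = labels[0]
    observed_minus_expected = 0.0
    variance = 0.0
    event_times = sorted({t for t, e in zip(times, events) if e})
    for t in event_times:
        at_risk = [(g, tt, e) for g, tt, e in zip(groups, times, events) if tt >= t]
        n = len(at_risk)
        n1 = sum(1 for g, _, _ in at_risk if g == first)
        d = sum(1 for _, tt, e in at_risk if tt == t and e)
        d1 = sum(1 for g, tt, e in at_risk if tt == t and e and g == first)
        observed_minus_expected += d1 - d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi2 = observed_minus_expected ** 2 / variance
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

A call such as `logrank_test(times, events, split_by_median(expression))` would then compare the high- and low-expression groups for one gene.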

Identification of PDA-related DEGs using various online datasets
In this study, we downloaded multiple datasets and performed bioinformatic analyses to identify DEGs in PDA, normal pancreas, DLBCL, and normal lymph node tissue. Using the MERAV dataset, we found a total of 1611 DEGs between pancreatic and lymph node samples and 3063 DEGs between pancreatic and DLBCL samples (Fig. 2A, B and Table 3). Using 113 PDA and 70 pancreas samples, we compared the DEGs identified with the MERAV dataset with those identified with the GEO datasets (GSE16515, GSE15471, GSE32676, and GSE71989) (Table 1). This approach ultimately resulted in the identification of 1881 upregulated DEGs and 128 downregulated DEGs (Table 3 and Fig. 3A).
We then filtered for DEGs that distinguish PDA from normal pancreas, lymph node, and DLBCL. To do so, we intersected the "Pancreas > Lymph Nodes", "Pancreas > DLBCL", and "PDA > Pancreas" gene sets derived from GEO and MERAV, relying on the transitivity of these inequalities in gene expression. The intersection showed that 84 DEGs were significantly higher in PDA than in the other three tissue types (Fig. 3B); however, the corresponding intersection of the "Pancreas < Lymph Nodes", "Pancreas < DLBCL", and "PDA < Pancreas" groups contained no DEGs (Fig. 3C).
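The intersection step rests on transitivity: a gene that appears in "PDA > Pancreas", "Pancreas > Lymph Nodes", and "Pancreas > DLBCL" is, by implication, higher in PDA than in all three other tissue types. A minimal Python sketch of this set logic is given below; the function name and gene labels are hypothetical placeholders rather than the study's gene lists.

```python
def genes_in_all_sets(*gene_sets):
    """Return the genes present in every supplied comparison set."""
    sets = [set(s) for s in gene_sets]
    if not sets:
        return set()
    return set.intersection(*sets)


# Hypothetical gene labels for the three comparisons.
pda_gt_pancreas = {"GENE_A", "GENE_B", "GENE_C"}
pancreas_gt_lymph_node = {"GENE_A", "GENE_C", "GENE_D"}
pancreas_gt_dlbcl = {"GENE_A", "GENE_C", "GENE_E"}

candidates = genes_in_all_sets(
    pda_gt_pancreas, pancreas_gt_lymph_node, pancreas_gt_dlbcl
)
```

Only the genes shared by all three sets remain as candidates, which mirrors the central region of the three-way Venn diagram.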

Functional GO term analysis of these DEGs
We focused solely on these 84 DEGs for the GO term analysis and found that the top six terms (including "cell-cell junction", "cell adhesion molecule binding", "apical part of cell", "lateral plasma membrane", and "desmosome") were significantly associated with PDA development (Fig. 4 and Table 4).

Validation of these DEGs using TCGA and GTEx data
To validate the DEGs from the MERAV and GEO datasets, we searched and downloaded mRNA sequencing data on 178 PDA and 171 normal pancreas samples from the TCGA and GTEx databases. Among a total of 2971 upregulated DEGs in the MERAV and GEO datasets, 46 were confirmed as highly expressed in PDA in the TCGA and GTEx datasets. These DEGs may serve as indicators for differentiation between PDA and the other three tissue types (Fig. 5A). An intersectional analysis of GSE62165 (13 pancreas and 118 PDA samples) and GSE62452 (61 pancreas and 69 PDA samples) revealed that 16 of these 46 DEGs were significantly overexpressed in PDA (Fig. 5B and Table 1).

Association of DEGs with PDA prognosis
We investigated the association of these 16 DEGs with PDA prognosis by plotting Kaplan-Meier curves and performing the log-rank test on data for the 176 PDA patients (Additional file 1: Table S1). Our data showed that five (NET1, KCNK1, MAL2, PLS1, and PLS3) of these 16 DEGs were associated with poor overall survival (OS) in PDA patients (Fig. 6A, C, E, G, and I). However, only the PLS3 result was verified by the survival data integrated from the R2 and ICGC databases (Additional file 1: Table S1). The data on PLS1 were contradictory: data from TCGA showed that PLS1 expression was associated with poor survival, whereas data from another dataset showed the opposite result, namely that a decrease in PLS1 expression was associated with poor patient survival (Fig. 6B, D, F, H, J). We therefore removed PLS1 from our subsequent analyses. The tumor N classification was inversely associated with survival of PDA patients (Additional file 2: Figure S1), but we did not find a correlation between PLS3 expression and the clinicopathological features of PDA (Table 5).
Fig. 3 DEGs identified from the batched GEO datasets with the Venn diagrams. A The hierarchical cluster heatmaps of DEGs between PDA and normal pancreas from the four GEO batched datasets. The gradual change from red to green represents changes in gene expression from high to low. The black color indicates no difference in gene expression. B The intersection among "Pancreas > Lymph Nodes", "Pancreas > DLBCL" and "PDA > Pancreas" groups. C The intersection among "Pancreas < Lymph Nodes", "Pancreas < DLBCL" and "PDA < Pancreas" groups. DEG differentially expressed gene, GEO Gene Expression Omnibus, PDA pancreatic ductal adenocarcinoma, DLBCL diffuse large B-cell lymphoma
The results of univariate Cox analysis showed that PDA N classification (P = 0.004) and PLS3 expression (P = 0.037) were significantly associated with survival of PDA patients (Additional file 1: Table S1). The results of multivariate analysis showed that PDA N classification (P = 0.036) and PLS3 expression (P = 0.026) were independent predictors of PDA survival (Table 6).

Accuracy of PLS3 expression in the diagnosis of PDA
To assess the diagnostic value of PLS3 expression for PDA, we plotted ROC curves and found that PLS3 expression was significantly elevated in PDA (Fig. 7A-D). The diagnostic efficiency of PLS3 in distinguishing PDA from normal pancreas was moderate (AUC 0.7-0.9) in GSE62452 and high (AUC 0.9-1.0) in the other three datasets (Fig. 7E-H). We found that PLS3 expression was higher in pancreas than in lymph nodes in the MERAV data (Fig. 8A). The diagnostic efficiency of PLS3 was moderate (Fig. 8B). The examination of 46 normal pancreas and 10 normal lymph node samples revealed a similar result for GSE71729 (Fig. 8C, D and Table 1).
The diagnostic value of PLS3 expression in differentiating pancreas from DLBCL was then assessed in 171 pancreas samples from TCGA and GTEx, 16 pancreas samples from MERAV, 48 DLBCL samples from TCGA, and 4 DLBCL samples from MERAV. Our data showed that PLS3 was overexpressed in normal pancreas and that the diagnostic efficiency of PLS3 was high (Fig. 8E-H). To verify higher PLS3 expression in PDA than in DLBCL and lymph nodes, we analyzed expression data from GSE71729 (145 PDA and 10 lymph node samples) and cell line data from MERAV (58 PDA and 17 DLBCL samples). The results revealed that PLS3 expression in PDA was dramatically higher than in DLBCL and normal pancreatic tissues (Fig. 5B, 8I, K) and that the diagnostic efficiency of PLS3 was high (Fig. 8J, L). These data indicate that PLS3 could serve as an effective diagnostic marker to differentiate PDA not only from normal pancreas but also from DLBCL and lymph nodes.
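For readers who want to see what the AUC measures, the sketch below computes it in Python as the probability that a randomly chosen sample from the positive group (for example PDA) has a higher expression value than a randomly chosen sample from the negative group, with ties counted as one half. This is an illustration, not the pROC or SPSS procedure used in the study. The grading helper follows the 0.7-0.9 (moderate) and 0.9-1.0 (high) bands quoted above; the handling of the exact boundary at 0.9 and the "low" label below 0.7 are assumptions of this sketch.

```python
def roc_auc(scores_positive, scores_negative):
    """AUC as the fraction of positive/negative pairs ranked correctly."""
    if not scores_positive or not scores_negative:
        raise ValueError("both groups need at least one sample")
    wins = 0.0
    for pos in scores_positive:
        for neg in scores_negative:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5  # ties contribute half a correct ranking
    return wins / (len(scores_positive) * len(scores_negative))


def grade_auc(auc):
    """Map an AUC to the qualitative bands used in the text (assumed edges)."""
    if auc >= 0.9:
        return "high"
    if auc >= 0.7:
        return "moderate"
    return "low"
```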

Association of PLS3 with known prognostic and diagnostic markers in PDA
To further elucidate the role of PLS3 expression in PDA, we selected various biomarkers that were previously used to diagnose PDA or to predict prognosis in affected patients [27][28][29] and calculated the r_s values. Using the batched dataset from GEO data, we found that KRT7 (also known as CK7) and SPP1 (secreted phosphoprotein 1) were associated with PLS3 expression (Fig. 9A), while SPARC (secreted protein acidic and rich in cysteine) had a strong association with PLS3 expression in PDA (Fig. 9B-D). These patterns were also confirmed in TCGA PDA data (Fig. 9F-H). TCGA data revealed a weak association between KRT19 (another PDA marker, also known as CK19) and PLS3, while the GEO dataset showed a moderate association (Fig. 9E, I). We also performed Kaplan-Meier analysis. KRT7 was confirmed to be associated with prognosis in PDA patients in both datasets (Fig. 9J, N). KRT19 and SPP1 were validated by only one dataset (Fig. 9K, L, O, P). There was insufficient evidence to confirm the association between SPARC expression and OS in PDA patients (Fig. 9M, Q).
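Spearman's r_s is the Pearson correlation of the ranks of two variables, with tied values receiving their average rank. The Python sketch below illustrates this definition; it is not the R or SPSS routine used in the study, and the function names are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, assigning tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rs(x, y):
    """Spearman's rank correlation coefficient between two samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x = sum(rx) / len(rx)
    mean_y = sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("r_s is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)
```

Applied to the expression values of PLS3 and a second marker across the same samples, `spearman_rs` returns a value between -1 and 1.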

Validation of differential PLS3 expression in different tissue specimens
We performed RT-qPCR analysis of PLS3 in ten PDA samples, ten normal pancreas samples, five DLBCL samples, and five lymph node samples (Additional file 1: Table S1). We found that the level of PLS3 mRNA in PDA was significantly higher than that in normal pancreas, DLBCL, and lymph node samples. PLS3 expression in normal pancreas was higher than that in DLBCL and lymph node samples. Intriguingly, PLS3 expression was similar in DLBCL and lymph nodes (Fig. 10).

Discussion
In the current study, we bioinformatically analyzed the DEGs among PDA, normal pancreas, DLBCL, and normal lymph node tissue samples by downloading the corresponding data from multiple online databases. We narrowed our analysis for association with PDA prognosis to those DEGs that occurred in all datasets, then focused on PLS3 as a biomarker for the early diagnosis of PDA, the prediction of prognosis, and differentiation from DLBCL. Our current study identified a great number of DEGs among these tissue samples, and our intersectional analysis narrowed them down to 84 DEGs. PLS3 may also be used to diagnose PDA. In conclusion, our current data demonstrate that the detection of PLS3 expression may be used effectively to diagnose PDA, to predict OS in PDA patients, and to distinguish PDA from DLBCL. However, future study is warranted to verify our current data on the use of PLS3 as a biomarker for PDA.
PDA is one of the most aggressive and lethal malignancies, and early diagnosis is crucial in controlling this deadly cancer. Surgical resection is the only method proven to control PDA clinically. In contrast, DLBCL is usually treated with chemotherapy in the clinic. It is very important to make a differential diagnosis between PDA and pancreatic DLBCL because they have similar clinical presentations and imaging findings [11,12,20]. Because of its anatomic location, the pancreas is difficult to access and carries a risk of intraoperative injury, so biopsy via open or laparoscopic surgery might not be the best choice. EUS-FNA (endoscopic ultrasound-guided fine-needle aspiration biopsy) could be used to take a biopsy of peripancreatic masses, but as it is an invasive procedure, patients still face a risk of complications. In addition, the technical difficulty of the procedure is an unavoidable issue [30]. By contrast, despite their lower accuracy, serum tumor markers are non-invasive and easier to measure.
To date, there have been numerous studies investigating biomarkers for PDA, including single markers [31], multiple biomarker panels [32], and immune-based proteomic panels [33]. For example, a recent study of oral flora showed that specific phylotypes were associated with increased risk for developing PDA [34]. Other recent work identified new-onset diabetes as a biomarker for early pancreatic cancer [35,36]. K-Ras is mutated or altered in 95% of all PDA cases; p16 is mutated or altered in 95%; p53 is mutated or altered in 75%; Smad4 is mutated or altered in 55% [37]. Mutations of any of these genes are associated with a poor PDA prognosis [3]. Regarding PLS3, one recent study that included 207 PDA tissue specimens showed that the overexpression of PLS3 was associated with tumor stage and pathology as well as poor OS in PDA patients. PLS3 was an independent prognostic factor in PDA patients [14]. The authors' in vitro data demonstrated that PLS3 expression induced PDA cell proliferation and invasion, which were associated with increased PI3K/AKT activity in PDA cells [14]. Our current data support these recent reports of the role of PLS3 in PDA [14]. In addition, there have been a number of studies of PLS3 in other human cancers, including gastric, colorectal, and breast cancers, as well as cutaneous T-cell lymphoma and acute myeloid leukemia [13][14][15][16][17][18][19].
PPL is an extremely rare form of malignant lymphoma with histology characteristic of DLBCL. Similarly to stage-matched DLBCL in other organs, the condition can be treated with CHOP chemotherapy, which achieves equivalent outcomes [8]. One previous case report described the challenges in diagnosing DLBCL with an intra-sinusoidal pseudoglandular growth that mimicked poorly differentiated metastatic PDA in an intra-abdominal lymph node [38]. It is difficult to differentiate these DLBCLs from PDA [39,40]. However, our current study did not include a direct comparison of the microarray data for PDA and the other three tissue types. Instead, we performed indirect analyses of differences in gene expression profiles between PDA and each one of the three other types. We believe that such analyses could be informative, and the conclusion was validated by qPCR (Fig. 10). Our current study provided data on the differential diagnosis for PDA and DLBCL. We identified DEGs that were expressed at significantly higher levels in pancreas, compared with lymph nodes and DLBCL, including PLS3. PLS3 expression was also confirmed to significantly differ between PDA and the pancreas, which was consistent with a recent study of PLS3 in PDA [14]. Moreover, we applied bioinformatic methods to highlight the potential role of PLS3 as an improved PDA marker with differential diagnostic value for DLBCL that a traditional tumor marker may not have.
Fig. 8 Diagnostic value of PLS3 in PDA and pancreas vs. lymph nodes and DLBCL. A, B Level of PLS3. PLS3 was significantly higher in the normal pancreas than in the lymph nodes in the MERAV and GSE71729 datasets. C, D ROC curves of PLS3 in the MERAV and GSE71729 datasets. E, F Level of PLS3. PLS3 was significantly higher in normal pancreas than in DLBCL in the MERAV, TCGA, and GTEx datasets. G, H ROC curves for PLS3 in the MERAV, TCGA, and GTEx datasets. I Level of PLS3. PLS3 expression was notably higher in PDA than in DLBCL in the cell-line data from the MERAV. J ROC curve of PLS3 in the cell line data. K PLS3 level in GSE71729. PLS3 expression was higher in PDA than in lymph nodes. L ROC curve of PLS3 in GSE71729. PDA pancreatic ductal adenocarcinoma, MERAV Metabolic Gene Rapid Visualizer, ROC receiver operating characteristic curve, DLBCL diffuse large B-cell lymphoma, TCGA The Cancer Genome Atlas, GTEx Genotype-Tissue Expression; ****P < 0.0001, ***P < 0.001, **P < 0.01, and *P < 0.05, analyzed by Student's t-test
We further verified the association of PLS3 expression with other PDA biomarkers, like KRT7 and KRT19 [41], and SPARC and SPP1. For example, SPARC was reported to affect tumor cell proliferation and migration by activating PI3K/AKT signaling and the epithelial-mesenchymal transition in liver, lung, and head and neck cancers [42][43][44]. SPP1 was associated with the development of gastric cancer [45]. However, their interaction in PDA needs to be further investigated.
Overall, our current study has several advantages. To the best of our knowledge, we assessed PLS3 as a biomarker to aid in the differential diagnosis of PDA from DLBCL. We endeavored to collect gene expression data from multiple databases, ultimately obtaining a total of 681 PDA, 361 normal pancreas, 69 DLBCL, and 14 lymph node samples, leading to a large sample size. Measurements of PLS3 are feasible because PLS3 levels can be detected in circulating tumor cells, indicating the potential value of PLS3 as a serum tumor marker [17,46,47]. However, our current study does have some limitations. For example, our current data did not allow for conclusions as to whether PLS3 overexpression occurred consistently in all PDA cells throughout our clinically heterogeneous population. Moreover, low levels of PLS3 expression may prevent accurate PPL diagnosis because of the inability to rule out the possibility of secondary pancreatic lymphoma. In addition, our current study did not include other tumor types, like neuroendocrine tumors, follicular lymphoma, and Hodgkin's lymphoma. Furthermore, our filter criteria for DEGs were extremely stringent, resulting in the inevitable loss of some useful information. Finally, our current study utilized data from different online databases and datasets, leading to different numbers of cases and comparison groups, which sometimes made comparisons difficult. For example, we were only able to associate 16 DEGs (not 46 or 84) with patient survival (because these 16 DEGs were verified in the MERAV, GEO, and TCGA databases).

Conclusions
Our current study demonstrated that the detection of PLS3 expression may be useful in diagnosing PDA, predicting PDA prognosis, and distinguishing PDA from DLBCL. Further study will be needed to verify our current data on PLS3 as a biomarker or even as a novel target for PDA therapy.
Fig. 10 RT-qPCR validation of differential PLS3 expression in tissue specimens. Ten pairs of PDA and normal pancreas and five pairs of DLBCL and normal lymph node samples were subjected to total RNA isolation, reverse-transcribed into cDNA, and then subjected to qPCR. Data were quantified using the 2^(-ΔΔCt) method. ***P < 0.001 and *P < 0.05 by ANOVA