mRNA expression profiles obtained from microdissected pancreatic cancer cells can predict patient survival

Background: Pancreatic ductal adenocarcinoma (PDAC) is one of the most devastating malignancies in developed countries because of its very poor prognosis and high mortality rate. At the time of diagnosis, only 20-25% of patients are candidates for surgery, and survival remains low even among patients who do undergo surgery. Lymph node invasion is a strongly adverse prognostic factor for this disease.
Methods: We analyzed the mRNA expression profile in 30 PDAC samples from patients with resectable local disease (stages I and II). Neoplastic cells were isolated by laser microdissection in order to avoid sample 'contamination' by non-tumor cells. Because the prognoses of PDAC patients with and without lymph node involvement (stage IIB and stages I-IIA, respectively) differ substantially, we also analyzed the association between mRNA expression profiles and survival within each of these groups.
Results: We identified expression profiles associated with patient survival in the whole patient cohort and in each group (stage IIB samples or stage I-IIA samples). Our results indicate that survival-associated genes are different in the groups with and without affected lymph nodes. Survival curves indicate that these expression profiles can help physicians to improve the prognostic classification of patients.


INTRODUCTION
Pancreatic ductal adenocarcinoma (PDAC) is currently the fourth leading cause of cancer-related death in developed countries, and has a considerable economic and social impact [1]. Despite the availability of many treatment options, the prognosis for patients with PDAC remains very poor, with a 5-year survival rate of less than 3-4%. This type of tumor is usually diagnosed at a late stage (because symptoms do not usually present until the cancer is advanced), and this is directly related to the poor prognosis for this disease [2]. PDAC tends to rapidly invade surrounding structures and organs, metastasize early, and be highly resistant to both chemo- and radiation therapies.
Currently, surgery remains the only curative treatment for pancreatic cancer, but only 10-20% of patients are candidates for surgery at the time of diagnosis [3]. Even when patients undergo radical resection, only 20% of them remain alive after 5 years [4].
Unfortunately, there are currently no screening tests nor any useful biomarkers available for early PDAC detection which would allow pancreatic adenocarcinoma to be distinguished from other inflammatory pancreatic diseases like chronic pancreatitis, or which can be used to evaluate treatment responses or relapses in follow-up examinations [2].
The most commonly used marker for clinical diagnosis and for assessing the effects of treatment is CA19-9. However, its sensitivity and specificity are not very high, and its serum levels can also be significantly increased in inflammatory diseases of the pancreas and biliary tract. This means that CA19-9 is not useful for predicting patient responses or prognoses [5]. Although many other approaches have been taken to find pancreatic cancer biomarkers, the clinical utility of these biomarkers remains to be determined, and many of these studies are still in their early phases. Genome analysis has shown that PDAC tumors contain a wide spectrum of mutations; however, only a few mutations are detected in most tumors (for instance, those in the KRAS gene or loss/inactivation of known tumor suppressor genes, including TP53 or SMAD4) [6]. Moreover, follow-up work comparing patient-matched primary PDAC tumors and subsequent metastases revealed the acquisition of further mutations in these metastases [7].
Enriching our knowledge of genes related to PDAC pathogenesis might allow us to develop prognostic tests and/or to identify new biomarkers or potential targets for therapy. One strategy is to analyze the transcriptome profile in order to detect genes with altered expression in tumor cells and to identify any association they may have with survival time. Gene expression studies in pancreatic cancer face a specific problem: PDAC neoplastic cells often represent only a minor part of the tumor-mass cell population, while dense desmoplastic stroma is the predominant component. Several methods have been used to bypass this problem, including the use of pancreatic cancer cell lines [8], comparing a mixture of RNAs from pancreatitis and non-tumoral pancreas to pancreatic tumor cell RNA, and cancer cell enrichment by aspiration [9]. However, these approaches have their own limitations. Although several studies have tried to detect a specific mRNA profile in different tumor stages, cell types, or patient survival groups, no definite prognostic signatures for patient survival have so far been identified [10,11].
Laser-capture microdissection, which was first described several years ago [12], allows neoplastic epithelial cells to be isolated from non-tumoral cells, thus allowing the specific analysis of mRNAs from cancer cells while avoiding 'contaminating' the mixture with mRNA from non-cancerous cells. This method helps to solve some of the aforementioned problems regarding the study of gene expression in PDAC samples.
The aim of the work we describe here was to analyze the association between mRNA profiles and patient survival. In order to reduce interference from genes expressed in non-cancerous cells present in the tumor samples, we analyzed mRNA levels specifically in pancreatic ductal tumor cells, which we isolated by microdissecting samples from PDAC patients.

RESULTS
After excluding cases with inadequate material (mixed histology, scant neoplastic pancreatic ductal material, etc.), we obtained enough RNA from the microdissected cells in 30 patient samples, which we then analyzed by microarray. These 30 patients had been followed up for 20.75 ± 18.4 months in our oncology clinic; the patient and tumor characteristics are shown in Table 1.

Analysis of the association between gene expression and survival
We analyzed mRNA expression levels in relation to patient survival (more or less than 24 months), which allowed us to identify 10 genes (Table 2) with altered mRNA levels in pancreatic ductal tumor cells compared to normal pancreatic cells (p < 0.001), although their association was not significant after Benjamini-Hochberg adjustment [13]. A dendrogram analysis of mRNA profiles (Figure 1) allowed us to correctly cluster all the samples into short-term (< 24 months) and long-term (> 24 months) survival groups, with one exception. This dendrogram also indicated that gene expression varied greatly among patients and within both survival groups.
We decided to perform further analyses by classifying samples into two groups, depending on the presence of regional lymph node metastases: group A (without affected lymph nodes, stages IA to IIA), and group B (those with affected lymph nodes, only stage IIB). Group A included 14 samples (7 whose survival was less than 24 months); group B included 16 samples (10 whose survival was less than 24 months), see Table 1.
Analysis of group A identified 47 genes whose differential expression was associated with patient survival (p < 0.001; Table 3). Figure 2A shows the results of the clustering analysis: samples with long and short survival times could be distinguished by the mRNA expression profile of these 47 genes. This figure also indicates that 16 genes were upregulated and 31 downregulated in patients with shorter survival times. Analysis of group B identified 24 genes whose differential expression was associated with patient survival (p < 0.001; Table 4). Clustering analysis of this group (Figure 2B) also showed that samples with long and short survival times could be distinguished by the expression profile of these 24 mRNAs (except for one sample). Nine of these genes were upregulated and 15 were downregulated in patients with shorter survival times. In contrast to the dendrogram from the whole group, the dendrograms in Figures 2A and 2B show higher gene-expression homogeneity within samples from stages I-IIA (group A) and within samples from stage IIB (group B).
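The clustering step described above can be illustrated with a minimal sketch. The text does not state which distance measure or linkage rule was used to build the dendrograms, so the choices here (one minus the Pearson correlation as distance, average linkage, and a cut into two clusters) are assumptions, and the expression values are hypothetical toy numbers rather than study data.

```python
from math import sqrt


def correlation_distance(x, y):
    """Return 1 - Pearson correlation between two expression profiles.

    Assumes neither profile is constant (otherwise the denominator is zero).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return 1.0 - cov / (sd_x * sd_y)


def average_linkage(profiles, n_clusters=2):
    """Agglomerative clustering of samples; returns sorted lists of sample indices."""
    clusters = [[i] for i in range(len(profiles))]
    while len(clusters) > n_clusters:
        best = None
        # Find the pair of clusters with the smallest mean pairwise distance.
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                dists = [
                    correlation_distance(profiles[i], profiles[j])
                    for i in clusters[a]
                    for j in clusters[b]
                ]
                mean_dist = sum(dists) / len(dists)
                if best is None or mean_dist < best[0]:
                    best = (mean_dist, a, b)
        _, a, b = best
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return sorted(sorted(c) for c in clusters)


# Hypothetical profiles: one row per sample, one column per gene.
toy_profiles = [
    [1.0, 2.0, 3.0],  # sample 0
    [2.0, 4.0, 6.5],  # sample 1, same trend as sample 0
    [3.0, 2.0, 1.0],  # sample 2, opposite trend
    [6.0, 4.0, 1.5],  # sample 3, same trend as sample 2
]
groups = average_linkage(toy_profiles)
```

In this toy case the two clusters recover the two expression trends. With real data, the resulting clusters would then be compared against the short and long survival labels, as was done for Figures 2A and 2B.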
Finally, we analyzed the patient survival curves according to the mRNA profile classifications described above (Figure 3). In the whole group (30 samples), there was a significant difference between the short and long survival-time cohorts classified by these mRNA profiles (p < 0.001); the mean survival was 16

DISCUSSION
To date, some studies on gene expression profiles in pancreatic cancer focusing on differential expression between normal and tumor tissues or cell lines have been published [14-18]. However, the findings of these studies are controversial and there is little concordance between the results. Data from these types of studies have been integrated into a meta-analysis which gave interesting results [19], as well as into the Pancreatic Expression Database [20] and Pancreatic Cancer Database (http://www.pancreaticcancerdatabase.org). These types of retrospective data integration analyses have helped to determine that differences in the results may be due to design biases such as tissue or sample selection, cancer cell enrichment procedures, the type of microarray used, or because of differences between tumors and their development, statistical limitations, etc.
Table 1 (notes). Tumor stage (number of patients): IA (3), IB (7), IIA (4), IIB (16). Age, follow-up, tumor length, lymph node ratio, and positive adenopathies are indicated as mean ± standard deviation. Lymph node ratio: ratio of the number of metastatic lymph nodes to the number of removed lymph nodes.
Some studies have focused on detecting differences between primary tumors and metastases and have found different signatures related to prognosis or survival [10,21]. Others have analyzed PDAC samples from patients with and without affected lymph nodes and have identified many genes that are differentially expressed between these groups [22,23]. Donahue et al. (2012) [24] identified and analyzed the mRNA levels of 171 genes which were able to define two prognosis groups based on their probability of disease-free survival. However, variation in the study designs and the use of different methodologies between these studies (including the patient inclusion criteria, tumor type and stage, comparison of primary or metastatic tumors, purification methods, or microarray technologies used) have produced considerable differences in the results. Therefore, little progress has so far been made in the study of associations between mRNA profiles and survival [10,24].
The presence of non-tumor cells in the samples (which are more abundant in PDAC compared to other cancer types), and the altered expression levels of a wide range of genes, can lead to distorted results [19,25,26]. In order to avoid this problem, we microdissected tumor cells and only extracted RNA from them. In addition, we only included tumors which were surgically resectable and without distal metastases (stages IA, IB, IIA, and IIB). We analyzed three separate cohorts: first, the entire sample group, second, patients with tumor stages IA to IIA (group A, without affected lymph nodes), and third, subjects with tumor stage IIB (group B, with affected lymph nodes).
Different genes were identified in each of the three cohorts, indicating that there is a lot of variability in the mRNA expression associated with survival in each group, which may help to explain the large amount of variability found in the group as a whole. Our results improve upon previously published data because we were able to analyze isolated tumor cells, and we took the sample cancer stage into consideration.
We identified many genes that are differentially expressed between the long and short survival-time groups in group B, some of which have previously been found to be altered in pancreatic cancer such as ZNF345, ZNF280B, UNC45B, ZFYVE9, DGKD, PDPN, OLA1, SORBS1, PTPN20A, PTPRA, EYA2, and BCAS4 (according to the Pancreatic Expression and Pancreatic Cancer Databases). Other genes represent a genetic variation which correlates with increased pancreatic cancer risk (ERCC4) [40], have been related to pancreatic cancer development (CCDC18 or DGKD) [41,42], can predict the prognosis or risk for other cancers (PDPN and TMC6) [43,44], or have been related to cell proliferation and migration (EYA2) [45]. It is interesting to note that these include three zinc finger genes (ZNF345, ZNF280B, and ZNF616) which may indicate the presence of an important alteration in gene regulation in PDAC.
Taken together, these data indicate that there is a lot of variability among genes that are altered in, and/or related to, pancreatic tumor development, progression, and survival. Our results show that different gene expression profiles are associated with the survival of patients with tumors with or without local lymph node involvement. These profiles may help to guide future decision making regarding treatment options for PDAC patients. However, it is important to note that we analyzed only a small sample and that only some of the genes identified may truly be involved in this classification. Our results should be confirmed in a larger population in future studies.

Patients
For this study we selected patients with PDAC stages I and II who had undergone surgery for the disease between 1998 and 2010 and who had both a full clinical follow-up until death and sufficient and appropriate histological material for analysis. Overall survival was considered to be the time from PDAC diagnosis to death. The Ethics Committee at the Hospital Clínico Universitario in Valencia approved this work and all the patients gave their informed written consent for their samples to be included in the INCLIVA Biobank.

Tissue samples
We collected the histological material (paraffin-embedded tumor tissue) from the INCLIVA Biobank. Histological sections from all the samples were reviewed by two pathologists to confirm the diagnosis and rule out any cases presenting a mixed-type histology. The most representative block was selected from each case. The paraffin blocks were cut into 6-μm sections on slides, deparaffinized, and stained with sterile hematoxylin and eosin in order to visualize the tumor cells (in RNase-free conditions at all stages of the process). Sections were then microdissected in an AS-LMD Laser Microdissection System (Leica Microsystems) to obtain a minimum of 10,000 cells per sample.
From the 44 patients we initially selected, four cases were discarded because of mixed-type histology, four cases had insufficient tumoral material for microdissection, and in three cases the paraffin blocks were not usable. We eventually processed 33 samples and obtained about 5 ng of total RNA from each sample using a High Pure FFPE RNA Micro kit (Roche). RNA from each sample was proportionally amplified with a Sensation™ RNA Amplification kit (Genisphere) to obtain sufficient RNA for a chip assay. After this process we obtained enough RNA for microarray analysis from 30 of the samples.

Expression studies
We used human HT-12-v4 expression BeadChips for Whole Genome DASL assays (Illumina). Samples were loaded randomly into an Illumina HiScan system to avoid bias in the analysis of the different groups, and we followed the manufacturer's "Whole Genome DASL Assay" protocol. Raw data were obtained using the GenomeStudio program (v.2011.1) and its Gene Expression module (v.1.9.0) from Illumina, without normalizing. A quality analysis screen was performed using the same GenomeStudio software. The "Average Signal", "Bead Standard Deviation", "Average Number of Beads", and "Detection p Value" columns were exported, as recommended for the beadarray package used in the analysis.
Data analysis was performed in R with Bioconductor (R_2.14.0 and Biobase_2.14.0), using the beadarray package for quality control and normalization (beadarray_2.4.1). Samples were normalized using the QC Spline method. Other packages used were limma (statistical analysis) and genefilter (sample filtering). The p values were adjusted using the Benjamini-Hochberg correction [15]. Microarray data were submitted to GEO (accession no. GSE84219).
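As an illustration of the Benjamini-Hochberg correction mentioned above, the following minimal sketch computes adjusted p values in plain Python. It is not the code used in the study (the analysis was run in R/Bioconductor); the function name and the five raw p values are hypothetical and serve only to show the calculation.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values, in the input order."""
    m = len(p_values)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so that adjusted values
    # never increase as the raw p value decreases.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        candidate = p_values[idx] * m / rank
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p values for five probes (illustrative only).
raw = [0.001, 0.008, 0.039, 0.041, 0.60]
adj = benjamini_hochberg(raw)
```

Because each raw p value is multiplied by the number of tests divided by its rank, a probe with p < 0.001 can lose significance after adjustment when many thousands of probes are tested at once.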

Statistical analysis
For descriptive analysis, patient data were analyzed using SPSS (v.22) software and expressed as the mean ± the standard deviation. Kaplan-Meier survival curves based on the classifications obtained from the mRNA profiles were also created using SPSS (v.22) software.
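For readers unfamiliar with the Kaplan-Meier estimator, a minimal sketch of the calculation follows. The curves in this work were produced with SPSS; the function below and the follow-up times are hypothetical and only illustrate how the survival probability is updated at each observed death. Because all patients in the study were followed until death, every event flag in the toy data is set to 1, although the function also accepts censored observations (flag 0).

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct death time.

    times  -- follow-up in months for each patient
    events -- 1 if death was observed at that time, 0 if censored
    """
    curve = []
    survival = 1.0
    at_risk = len(times)
    for t in sorted(set(times)):
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        leaving = sum(1 for ti in times if ti == t)
        if deaths > 0:
            # Multiply by the fraction of at-risk patients surviving time t.
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Patients who died or were censored at t leave the risk set.
        at_risk -= leaving
    return curve


# Hypothetical overall survival times in months (illustrative only).
months = [6, 10, 10, 18, 30, 42]
died = [1, 1, 1, 1, 1, 1]
curve = kaplan_meier(months, died)
```

Computing one such curve for each profile-defined group and comparing them (for example, with a log-rank test) corresponds to the type of comparison shown in Figure 3.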