Association between Genetic Subgroups of Pancreatic Ductal Adenocarcinoma Defined by High Density 500 K SNP-Arrays and Tumor Histopathology

The specific genes and genetic pathways associated with pancreatic ductal adenocarcinoma (PDAC) are still largely unknown, partly because of the low resolution of the techniques applied so far to their study. Here we used high-density 500 K single nucleotide polymorphism (SNP)-arrays to define the chromosomal regions that most commonly harbor copy number (CN) alterations and loss of heterozygosity (LOH) in a series of 20 PDAC tumors, and we correlated the corresponding genetic profiles with the most relevant clinical and histopathological features of the disease. Overall, our results showed that primary PDAC frequently (>70%) display extensive gains of chromosomes 1q, 7q, 8q and 20q, together with losses of chromosomes 1p, 9p, 12q, 17p and 18q; these chromosomal regions harbor multiple cancer- and PDAC-associated genes. Interestingly, these alterations clustered into two distinct genetic profiles characterized by gains of the 2q14.2, 3q22.1, 5q32, 10q26.13, 10q26.3, 11q13.1, 11q13.3, 11q13.4, 16q24.1, 16q24.3, 22q13.1, 22q13.31 and 22q13.32 chromosomal regions (group 1; n = 9) versus gains at 1q21.1 and losses of the 1p36.11, 6q25.2, 9p22.1, 9p24.3, 17p13.3 and Xp22.33 chromosomal regions (group 2; n = 11). From the clinical and histopathological point of view, group 1 cases were associated with smaller, well/moderately-differentiated grade I/II PDAC tumors, whereas group 2 PDAC were larger and mainly consisted of poorly-differentiated grade III carcinomas. These findings confirm the cytogenetic complexity and heterogeneity of PDAC and provide evidence for the association between tumor cytogenetics and histopathological features. In addition, we show that the altered regions identified harbor multiple cancer-associated genes that deserve further investigation to determine their relevance in the pathogenesis of PDAC.


Introduction
Pancreatic ductal adenocarcinoma (PDAC) is a fatal disease with a 5-year mortality rate of almost 100%. As in other types of cancer, understanding the molecular mechanisms involved in tumor development and progression is a prerequisite for improving early diagnosis and therapy. The use of a wide battery of techniques, such as fluorescence in situ hybridization (FISH), comparative genomic hybridization (CGH) and array-CGH (aCGH), has allowed the identification of multiple recurrently altered chromosomal areas in PDAC tumors; the most frequently reported alterations include losses of chromosomes 8p, 9p, 17p and 18q, together with gains of chromosomes 3q, 8q and 20q [1][2][3][4]. However, identifying the specific genes targeted by such abnormalities has proven difficult with these approaches, partly because of their relatively limited resolution. In fact, the highest-resolution approach applied so far to the study of PDAC is aCGH [5,6], whose resolution remains relatively limited for the detailed characterization of small regions carrying genetic changes and for the identification of the genes involved.
The development of genome-wide approaches such as high-density single nucleotide polymorphism (SNP)-arrays has further improved on the sensitivity of aCGH and provided the opportunity for large-scale genotyping with a more accurate definition of the magnitude of the abnormalities detected, through the identification of copy number variation (CNV) and loss of heterozygosity (LOH) for hundreds of thousands of SNPs [7]. This allows highly precise mapping of the genetic changes that occur across the entire genome in a major fraction of all tumor cells, providing a promising starting point for the identification of novel candidate genes affected by such genomic alterations and profiles. To the best of our knowledge, only Jones et al. and Harada et al. [8,9] have previously applied SNP-array technology to primary PDAC samples, and neither study investigated the potential association between SNP-array profiles of copy number alterations and tumor histopathology.
In the present study, we applied higher-density 500 K SNP arrays, with a resolution of 2.5 Kb, to a series of 20 PDAC tumors vs. paired peripheral blood (PB) samples from the same number of patients, all of whom underwent complete tumor resection. Our major goal was to map the most common recurrent chromosomal alterations present at diagnosis in PDAC tumors and to correlate them with the histopathological subtypes of the disease. Overall, the copy number values (CNV) obtained confirm that primary PDAC frequently (>70%) carry extensive gains of chromosomes 1q, 7q, 8q and 20q, together with losses of chromosomes 1p, 9p, 12q, 17p and 18q; these chromosomal regions contain multiple cancer genes known to be directly related to PDAC. Most interestingly, we show for the first time the existence of two major groups of PDAC, defined on the basis of their altered SNP-array profiles, which showed a close association with tumor histopathology.

Patients and samples
Tissue specimens were obtained at diagnosis from 20 sporadic PDAC patients (15 males and 5 females; mean age 67 years, range 45 to 84 years). All patients underwent surgical tumor resection at the Division of Hepatobiliary and Pancreatic Surgery of the University Hospital of Salamanca (Salamanca, Spain) between October 2003 and October 2008. The study was approved by the local ethics committee of the University Hospital of Salamanca (Salamanca, Spain), and written informed consent was given by each individual prior to entering the study, according to the Helsinki Declaration.
Tumors were diagnosed and classified according to Adsay et al. [10] with the following distribution: 5 cases corresponded to well-differentiated/grade I tumors, 7 to moderately-differentiated/grade II tumors, and 8 to poorly-differentiated/grade III PDAC. Histopathological grade was confirmed in all cases in a second independent evaluation by an experienced pathologist. Most tumors (18/20, 90%) were localized in the head of the pancreas; the remaining two cases were localized in the pancreatic body and body/tail, respectively. Mean tumor size at diagnostic surgery was 3.0 ± 0.95 cm; 10 cases corresponded to TNM stage IIA tumors and the other 10 to TNM stage IIB. The most relevant clinical and laboratory patient characteristics are summarized in Table 1.
Once the histopathological diagnosis had been established, part of the tumor sample showing both macroscopic and microscopic infiltration was used to prepare single cell suspensions for iFISH and SNP-array studies. From the paraffin-embedded tissue samples, sections were cut from three different areas representative of the tumoral tissue and placed on poly-L-lysine coated slides. All tissues were evaluated after hematoxylin-eosin staining to confirm the presence and determine the quantity of tumor cells infiltrating the material to be studied by SNP-arrays. For SNP-array studies, tumor DNA was extracted from freshly-frozen tumor tissues mirror cut to those used for iFISH analyses, which contained ≥70% tumor cells. In turn, normal DNA was extracted from matched PB leucocytes from the same patient. For both types of samples (tumor tissue and PB leucocytes), DNA was extracted using the QIAamp DNA mini kit (Qiagen, Hilden, Germany) following the manufacturer's instructions.

SNP-array studies
Paired samples of purified tumoral DNA and normal PB DNA from individual patients were each hybridized to two 250 K Affymetrix SNP Mapping arrays (NspI and StyI SNP arrays; Affymetrix, Santa Clara, CA, with a median resolution of 2.5 Kb and an average distance between SNPs of 5.8 Kb), using a total of 250 ng of DNA per array, according to the instructions of the manufacturer. Fluorescence signals were detected using the GeneChip Scanner 3000 (Affymetrix) and data were stored in CEL files. Analysis of paired tumoral/normal CEL files containing data on the SNP-array results was done using the Genotyping Console software (GTC v2.1, Affymetrix). Genotypes were generated using the BRLMM algorithm included in the GTC software. The mean call rate for individual SNPs was systematically ≥86.5% (median of 98.6%). Copy number (CN) alterations and loss of heterozygosity (LOH) were inferred by a Hidden Markov Model-based algorithm implemented in the GTC software program, using the parameter settings recommended by Affymetrix for tumoral/normal paired samples and a minimum physical length of at least 5 consecutive SNPs for putative genetic alterations. "Genetic gains" (CN≥2.5) and "losses" (CN≤1.7) were defined according to GTC working criteria. In turn, "high CN gains" and "homozygous losses" were considered to be present when CN values ≥4 and CN≤0.3 were found, respectively.
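The CN thresholds and the 5-SNP minimum length quoted above can be illustrated with a short sketch. This is not the Hidden Markov Model implemented in GTC; it only applies the stated cut-offs to per-SNP CN estimates that are assumed to be already available. The function names (`classify_cn`, `direction`, `call_segments`) and the input format (a list of CN values ordered by genomic position) are hypothetical and chosen for illustration only.

```python
from typing import List, Tuple

# Cut-offs quoted in the text (GTC working criteria).
GAIN = 2.5
LOSS = 1.7
HIGH_GAIN = 4.0
HOMOZYGOUS_LOSS = 0.3
MIN_SNPS = 5  # minimum number of consecutive SNPs for a putative alteration


def classify_cn(cn: float) -> str:
    """Map one copy-number estimate to a category label."""
    if cn >= HIGH_GAIN:
        return "high_gain"
    if cn >= GAIN:
        return "gain"
    if cn <= HOMOZYGOUS_LOSS:
        return "homozygous_loss"
    if cn <= LOSS:
        return "loss"
    return "normal"


def direction(cn: float) -> str:
    """Collapse the five categories into gain / loss / normal."""
    label = classify_cn(cn)
    if label in ("gain", "high_gain"):
        return "gain"
    if label in ("loss", "homozygous_loss"):
        return "loss"
    return "normal"


def call_segments(cn_values: List[float]) -> List[Tuple[int, int, str]]:
    """Return (start, end_exclusive, direction) for altered runs.

    A run is reported only if it spans at least MIN_SNPS consecutive SNPs
    that all point in the same direction (assumption: SNPs are sorted by
    position on a single chromosome).
    """
    segments = []
    start = 0
    for i in range(1, len(cn_values) + 1):
        run_ended = (
            i == len(cn_values)
            or direction(cn_values[i]) != direction(cn_values[start])
        )
        if run_ended:
            d = direction(cn_values[start])
            if d != "normal" and i - start >= MIN_SNPS:
                segments.append((start, i, d))
            start = i
    return segments


# Illustrative values only, not data from the study.
example = [2.0, 2.0, 3.0, 3.0, 2.6, 4.5, 3.1, 2.0, 1.0, 1.0, 2.0]
print(call_segments(example))
```

In this toy input the five-SNP gain is reported, whereas the two-SNP loss is discarded because it is shorter than the minimum length.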
At every locus, LOH was assumed to be present when, in tumor DNA from heterozygous individuals, one allele was detected at a greater percentage than the other. LOH was further subclassified as either true LOH, when one of the parental copies of a chromosome was deleted at the locus, or copy-neutral LOH (cnLOH), when tumoral DNA displayed two copies of a chromosomal region from one parent in the absence of the allele derived from the other parent. Analysis of LOH was restricted to DNA sequences from autosomal chromosomes.
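The distinction between true LOH and cnLOH can be sketched as a simple decision rule. The version below is a simplification under stated assumptions: it uses discrete genotype calls ("AA", "AB", "BB") rather than allele percentages, it reuses the CN cut-offs of 1.7 and 2.5 from the SNP-array criteria to decide whether a copy was lost, and the function name `classify_loh` and its return labels are hypothetical.

```python
LOSS_CN = 1.7  # CN loss cut-off quoted in the SNP-array methods
GAIN_CN = 2.5  # CN gain cut-off quoted in the SNP-array methods
AUTOSOMES = {str(n) for n in range(1, 23)}  # LOH analysis is autosome-only


def classify_loh(chrom: str, normal_gt: str, tumor_gt: str, tumor_cn: float) -> str:
    """Classify one SNP locus from paired normal/tumor calls.

    Assumption: genotypes are simple calls "AA", "AB" or "BB"; the source
    describes allelic percentages, which this sketch does not model.
    """
    if chrom not in AUTOSOMES:
        return "not_assessed"   # sex chromosomes are excluded
    if normal_gt != "AB":
        return "uninformative"  # only heterozygous germline loci can show LOH
    if tumor_gt not in ("AA", "BB"):
        return "retained"       # both alleles still detected in the tumor
    if tumor_cn <= LOSS_CN:
        return "LOH"            # one parental copy deleted
    if tumor_cn < GAIN_CN:
        return "cnLOH"          # two copies, both from the same parent
    return "unclassified"       # LOH with CN gain: outside the two classes above


# Illustrative calls only, not data from the study.
print(classify_loh("17", "AB", "AA", 1.0))
print(classify_loh("18", "AB", "BB", 2.0))
```

The "unclassified" label is a design choice of this sketch, since the text defines only two LOH subclasses.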

Interphase fluorescence in situ hybridization (iFISH) studies
In all cases, iFISH studies were performed on an aliquot of the single cell suspension prepared from the tumor sample. A set of 12 locus-specific FISH probes, directed against DNA sequences localized in 11 different human chromosomes and specific for the chromosomal regions most frequently gained or deleted in PDAC, was systematically used to validate the results obtained with the SNP-arrays (Table 2). The methods and procedures used for the iFISH studies have been previously described in detail [11].

Statistical Methods
For all continuous variables, mean values, standard deviations (SD) and ranges were calculated using the SPSS software package (SPSS 12.0 Inc, Chicago, IL, USA); for dichotomous variables, frequencies were reported. To evaluate the statistical significance of differences observed between groups, the Mann-Whitney U and χ² tests were used for continuous and categorical variables, respectively (SPSS). A multivariate stepwise regression analysis (regression, SPSS) was performed to examine the correlation between the chromosomal abnormalities found by iFISH versus SNP-array techniques. Hierarchical clustering analysis was performed to classify cases according to their CN genetic profile using the Cluster 3.0 software (PAM software; http://www-stat.stanford.edu/~tibs/PAM). Clustering was run using a Euclidean distance metric and the average linkage method. For visualization of dendrograms, the TreeView software 1.0.4 [12] was used. P-values <.01 were considered to be associated with statistical significance.
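To make the clustering settings concrete, the following is a minimal pure-Python sketch of agglomerative clustering with a Euclidean distance metric and average linkage. It is not the Cluster 3.0 implementation used in the study; the function names (`euclidean`, `average_linkage`) and the toy CN profiles are assumptions introduced for illustration, and the naive search is only suitable for small series such as the 20 cases analyzed here.

```python
import math
from itertools import combinations
from typing import List, Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two CN profiles of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(profiles: Sequence[Sequence[float]], n_clusters: int) -> List[List[int]]:
    """Merge cases bottom-up until n_clusters groups remain.

    The distance between two clusters is the mean of all pairwise
    Euclidean distances between their members (average linkage).
    Returns lists of case indices.
    """
    if n_clusters < 1:
        raise ValueError("n_clusters must be at least 1")
    clusters = [[i] for i in range(len(profiles))]
    while len(clusters) > n_clusters:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            total = sum(
                euclidean(profiles[a], profiles[b])
                for a in clusters[i]
                for b in clusters[j]
            )
            d = total / (len(clusters[i]) * len(clusters[j]))
            if best is None or d < best[0]:
                best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]  # i < j, so deleting j is safe
        del clusters[j]
    return [sorted(c) for c in clusters]


# Toy CN values for three hypothetical regions in four hypothetical cases.
toy_profiles = [
    [3.0, 3.0, 2.0],  # gains in regions 1 and 2
    [2.8, 3.1, 2.0],
    [2.0, 2.0, 1.0],  # loss in region 3
    [2.1, 2.0, 0.9],
]
print(average_linkage(toy_profiles, 2))
```

Stopping at a fixed number of clusters is a simplification; a dendrogram viewer such as TreeView instead displays the full merge hierarchy.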
Cancer-associated genes coded in chromosomal regions frequently altered in PDAC
By integrating public genomic data (Ensembl release 59, Human build GRCh37) with our CN and LOH results, we sought to identify regions that showed recurrent CN changes and contained at least one known and well-characterized gene (Table 4). Accordingly, CN gains were frequently detected (≥75%) in the chromosomal regions coding for the PSCA, SLURP1, NTSR1, CDH4, BAI1, TARS2, GML, OGFR and PTPRN2 genes, which have been described to be involved in cancer and/or pancreatic functions. Remarkably, among these genes, the PSCA, NTSR1, OGFR and TNFRSF6B genes have also been associated with pancreatic malignancies, and the sonic hedgehog gene has been directly related to stem-cellness. In turn, the most commonly deleted gene was MYOCD, a cancer-related gene which also displayed LOH in most of our cases (80%; Table 3); other frequently deleted cancer-associated genes included the EYA3, NR2C1, PTAFR and DCC genes, which have also been implicated in pancreatic cancer. Common regions of LOH also included two genes that have been implicated in pancreatic cancer: the RPH3AL and SERPINF1 genes at chromosome 17p13. Notably, regions of LOH identified in chromosome 18q12 also contain genes that have been reported to be involved in pancreatic malignant tumors, e.g. the MAPRE2 gene, found to be deleted by LOH in 50% of the cases and by cnLOH in another 20% of the tumors. Other cancer-associated genes coded in chromosomal regions displaying LOH in a relatively high proportion of cases are listed in Table 3.

Discussion
PDAC are heterogeneous tumors that frequently display complex genetic profiles, as confirmed in the present study, where multiple CNV and LOH regions were identified in every case analyzed. Overall, our findings indicate that the genetic profile of primary PDAC is defined by imbalanced losses of chromosomes 1p, 9p, 12q, 17p and 18q together with gains of the 1q, 7q, 8q and 20q chromosomal regions. These results confirm previous analyses using chromosome banding techniques [13,14], CGH [15], aCGH [1,2,4,16], low-resolution 100K SNP-arrays [8] and gene sequencing combined or not with SNP-array technology [9,17]. Although a high correlation was also found between the SNP-array results and iFISH analyses performed on the same series of primary tumor samples as regards the most commonly deleted (e.g. 17p, 18q, 9p and 8p) and gained (e.g. 1q, 15q and 8q) chromosomal regions [11], a higher frequency of deletions at chromosomes 1p and 17q, and of gains at chromosomes 7q and 20q, was found by SNP-arrays vs. the iFISH technique (around 70-75% vs. 5-25%, respectively). Such discrepancies could be explained, at least in part, by the increased sensitivity of SNP-array vs. iFISH studies in the identification of small interstitial changes [18]. A more detailed analysis of the most frequently altered chromosomal regions shows that they contain multiple cancer-associated genes, including several genes which have been specifically related to PDAC. Among others, these latter genes included gained genes such as the PSCA gene, a plausible PDAC tumor marker associated with pancreatic cancer progression [19][20][21][22]; the TNFRSF6B gene (a member of the tumor necrosis factor receptor family), which is amplified in many tumors [23][24][25][26] and whose overexpression blocks growth inhibition signals in PDAC [27]; and the NTSR1 and OGFR genes, involved in cancer progression [28][29][30], modulation of angiogenesis [31] and regulation of cell proliferation [32].
In turn, frequently deleted genes of interest were the RPH3AL gene, a potential tumor suppressor gene related to insulin exocytosis [33]; the SERPINF1 gene, which has been implicated in many epithelium-derived tumors [34,35]; and the MAPRE2 gene, previously found to be lost in leukemic cells [36], pancreatic cancer [37] and esophageal squamous cell carcinoma [38]. Interestingly, deletions of other cancer-associated genes that have not been previously associated with pancreatic malignancies (MYOCD [39,40], NR2C1 [41] and PTAFR [42]) were found at higher frequencies than those of genes already shown to be recurrently altered/deleted in PDAC (e.g. CDKN2A, TP53 or SMAD4 [43,44]). These results underline the potential role of several previously unexplored tumor suppressor genes in the pathogenesis of PDAC. In turn, genes which have been previously found to be amplified in PDAC patients by SNP-arrays [8], such as the SACP2 gene, were also altered in our series but at a lower frequency (e.g. 60% vs. 40% of cases, respectively). Such variability could be partially related to the lower number of patients analyzed and to the effect of studying paired tumoral/normal DNA samples on the resolution of the SNP-array for detection of CN alterations.
Most interestingly, based on the overall genetic profile of PDAC tumors detected by SNP-arrays, two well-defined groups of PDAC tumors emerged, which were differentially characterized by gains of the 2q14.2, 3q22.1, 5q32, 10q26.13, 10q26.3, 11q13.1, 11q13.3, 11q13.4, 16q24.1, 16q24.3, 22q13.1, 22q13.31 and 22q13.32 chromosomal regions (group 1) and by gains at 1q21.1 with coexisting losses of the 1p36.11, 6q25.2, 9p22.1, 9p24.3, 17p13.3 and Xp22.33 chromosomal regions (group 2), respectively. From the clinical and histopathological point of view, while group 1 PDAC mostly corresponded to smaller well/moderately differentiated grade I/II cases, group 2 mainly consisted of larger and poorly-differentiated PDAC. Among the few well/moderately differentiated carcinomas included in this latter group, 2/3 cases showed intermediate cytogenetic features, with coexistence of gains of chromosome 1q21.1 together with gains of chromosomes 10q, 22q and 11q. Whether these two distinct cytogenetic profiles reflect different cytogenetic pathways or sequential stages of development of PDAC remains to be determined. However, the identification of rather different and non-overlapping chromosomal changes in the two groups of tumors suggests that they more likely reflect two genetically different diseases. Further studies in larger series of patients are warranted to elucidate this question and to determine the specific role of the cancer-associated genes (INPP5A, CDX1, MB, CAMK2A, APOL6 vs. VPS53, FAM57A, GEMIN4, SFRS13, ELP2P, GLOD4, CSF2RA and IL3RA) differentially altered in the two groups of tumors. In this regard, it should be noted that, among those genes, two or more are involved in common intracellular pathways, such as the cytokine-cytokine receptor interactions involving Jak-STAT signaling (IL3RA and CSF2RA genes) or RNA processing pathways (GEMIN4 and SFRS13A genes) [45,46].
Further analyses of gene expression profiles may contribute to determine their relevance in the pathogenesis of PDAC.
In summary, in the present study we confirm the cytogenetic complexity and heterogeneity of PDAC and provide evidence for the association between tumor cytogenetics and histopathological features. In addition, we show that the most frequently altered regions identified harbor multiple cancer-associated genes that deserve further investigation to determine their relevance in the pathogenesis of PDAC.

Figure 1. Two well-defined groups of patients (p = 0.03) were identified: group 1 includes tumors showing gains of chromosomes 2q, 3q, 5q, 10q, 11q, 16q and 22q and a high rate of grade I/II PDAC tumors (tumor cases highlighted in yellow), while group 2 predominantly included cases with losses of chromosomes 1p, 6q, 9p, 17p and Xp, and gains of chromosome 1q, in association with a higher frequency of poorly-differentiated/grade III carcinomas (tumor cases highlighted in purple). CNV obtained for each chromosomal region are represented in a color code: red corresponds to chromosomal losses (CN≤1.7), green to chromosomal gains (CN≥2.5) and black to a normal CN value of 2. The color intensity represents the magnitude of the change, down to CN values <0.3 (homozygous deletions) and up to CN values ≥4 (high gains/amplification). Known cancer-associated genes coded in these chromosomal regions and found to be altered include the INPP5A (at 10q26.3), CDX1, CAMK2A (at 5q32), MB and APOL6 (both at 22q13.1) genes gained among group 1 cases, and the SFRS13A (1p36.11), VPS53, FAM57A, GEMIN4, ELP2P and GLOD4 (all of them at 17p13.3) and the CSF2RA and IL3RA (both at Xp22.33) genes deleted in group 2 cases. Panel B: Illustrative histopathological images corresponding to group 1 (histologic grades I and II PDAC, characterized by typically well-formed glands and by less well-defined glands with incomplete glandular lumina, respectively; images a and b) and group 2 (grade III PDAC showing non-structured glands and solid sheets of neoplastic cells; image c) PDAC cases. Original magnification: (a) ×200; (b) ×100; (c) ×400. doi:10.1371/journal.pone.0022315.g001