Gene expression profiling of calcifications in breast cancer

We investigated the gene expression profiles of calcifications in breast cancer. Gene expression analysis of surgical specimens was performed using Affymetrix GeneChip® Human Gene 2.0 ST arrays in 168 breast cancer patients. The mammographic calcifications were reviewed by three radiologists and classified into three groups according to malignancy probability: breast cancers without suspicious calcifications; breast cancers with low-to-intermediate suspicious calcifications; and breast cancers with highly suspicious calcifications. To identify differentially expressed genes (DEGs) among these three groups, a one-way analysis of variance was performed with post hoc comparisons using Tukey’s honest significant difference test. To explore the biological significance of the DEGs, we used DAVID for gene ontology analysis and BioLattice for clustering analysis. A total of 2551 genes showed differential expression among the three groups. ERBB2 was up-regulated in breast cancers with highly suspicious calcifications (fold change 2.474, p < 0.001). Gene ontology analysis revealed that the immune, defense and inflammatory responses were decreased in breast cancers with highly suspicious calcifications compared to breast cancers without suspicious calcifications (p from 10⁻²³ to 10⁻⁸). The clustering analysis also demonstrated that the immune system is associated with mammographic calcifications (p < 0.001). Our study showed that calcifications in breast cancers are associated with high levels of mRNA expression of ERBB2 and decreased immune system activity.


Results
Patient characteristics. Demographic characteristics and clinicopathologic findings are summarized in Table 1. The median patient age was 50.0 years. The mean tumor size for the study group was 3.1 cm. Patients had invasive ductal carcinomas (89.9%), invasive lobular carcinomas (1.8%) or other types, with a clinical stage of I (14.3%), II (63.1%), or III (22.6%). Ninety-two patients were hormone receptor (HR)-positive and the other 76 were HR-negative. For HER2 receptor status, 38 patients were positive and 130 were negative. HER2 positivity, DCIS and comedo necrosis (all p < 0.001) were more frequently observed in breast cancers with highly suspicious calcifications. The other demographic, clinical and pathologic findings were not significantly different among the three groups.

Mammographic features analysis.
Mammographic features of the patients are shown in Table 1. The majority of cases presented as a mass with or without calcifications (p < 0.001). There was no significant difference among the three groups in breast composition or in the margin and density of the mass. The morphology of calcifications was coarse heterogeneous (n = 1), fine pleomorphic (n = 21), and fine linear and linear branching (n = 10), and the distribution was regional (n = 1), grouped (n = 12), and segmental (n = 19).
Imaging-genomic correlation. In breast cancers with highly suspicious calcifications compared to those without suspicious calcifications, focusing on down-regulated genes, we found that the 10 top-ranked biological functions (p from 10⁻²³ to 10⁻⁸, Figure 1) included the immune response, antigen processing and presentation, defense response, the regulation of cytokine production, the positive regulation of immune system and response to wounding. In breast cancers with highly suspicious calcifications compared to those with low-to-intermediate suspicious calcifications, gene ontology biological process terms included two dominant functions: skeletal system development and immune response. Focusing on down-regulated genes, we found that the top-ranked biological functions (p < 0.05) included cartilage development, skeletal system development, limb morphogenesis, osteoblast differentiation, ossification, immune response, leukocyte mediated immunity, and response to wounding. In breast cancers with low-to-intermediate suspicious calcifications compared to those without suspicious calcifications, focusing on down-regulated genes, the 10 top-ranked biological functions (p from 10⁻⁶ to 10⁻²) included antigen processing and presentation, regulation of apoptosis, cytokine mediated signaling pathway, regulation of programmed cell death, and defense response.
BioLattice analysis of the DEGs between breast cancers with highly suspicious calcifications and those without suspicious calcifications produced a concept lattice with 60 clusters annotated by gene ontology (GO) terms in the biological process category (Figure 2). Only 24 of the 60 clusters had at least one significant GO term (p < 0.001). Overall, the dataset showed 125 significant annotations with 106 unique GO terms. Four core concepts (shown in red) were associated with the immune system, including defense response, immune response, and inflammatory response. For the DEGs between breast cancers with highly suspicious calcifications and those with low-to-intermediate suspicious calcifications, the concept lattice was constructed with 17 clusters annotated by GO terms in the biological process category. Four clusters had significant GO terms (p < 0.001), which also included defense response and immune response. In the comparison of breast cancers with low-to-intermediate suspicious calcifications and those without suspicious calcifications, the concept lattice was constructed with 16 clusters annotated by GO terms in the biological process category. Only one cluster had significant GO terms (p < 0.001); it was composed of immune response, response to pest, pathogen or parasite, and response to external biotic stimulus.
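To illustrate what a concept lattice over clusters and GO terms looks like, the following minimal sketch enumerates the formal concepts of a tiny binary context by brute force. It is a generic formal concept analysis example, not BioLattice's implementation; the function name `formal_concepts`, the cluster names, and the toy annotations are all hypothetical.

```python
from itertools import combinations


def formal_concepts(context):
    """Enumerate the formal concepts of a small binary context.

    `context` maps each object (here: a gene expression cluster) to the set
    of attributes (here: GO terms) annotated to it. Returns a set of
    (extent, intent) pairs, both as frozensets.
    """
    objects = list(context)
    all_attrs = frozenset().union(*context.values())
    concepts = set()
    # Brute force over every subset of objects: only suitable for toy inputs.
    for size in range(len(objects) + 1):
        for subset in combinations(objects, size):
            # Intent: attributes shared by every object in the subset.
            intent = all_attrs
            for obj in subset:
                intent = intent & context[obj]
            # Extent: every object that carries the whole intent.
            extent = frozenset(o for o in objects if intent <= context[o])
            concepts.add((extent, frozenset(intent)))
    return concepts


# Hypothetical toy context: three clusters and their GO annotations.
toy_context = {
    "cluster_1": {"immune response", "defense response"},
    "cluster_2": {"immune response", "inflammatory response"},
    "cluster_3": {"skeletal system development"},
}

concepts = formal_concepts(toy_context)
# Print concepts from the most specific extent to the most general one.
for extent, intent in sorted(concepts, key=lambda c: (len(c[0]), sorted(c[0]))):
    print(sorted(extent), "<->", sorted(intent))
```

In this toy context, the concept pairing `cluster_1` and `cluster_2` with the shared term "immune response" sits above the two single-cluster concepts, which is the kind of set inclusion ordering a concept lattice arranges graphically.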

Analysis of tumor infiltrating lymphocytes.
The mean tumor infiltrating lymphocyte (TIL) score was 37.0 ± 29.5 for breast cancers with highly suspicious calcifications, 34.9 ± 30.4 for those with low-to-intermediate suspicious calcifications, and 42.4 ± 32.8 for those without suspicious calcifications. There was no significant difference among the three groups (p = 0.504) in the 130 patients with available slides. Subgroup analysis according to immunohistochemistry results also revealed no significant difference among the three groups.
When we combined breast cancers with highly suspicious and low-to-intermediate suspicious calcifications into a single group of breast cancers with suspicious calcifications (n = 52) and compared its mean TIL score with that of breast cancers without suspicious calcifications (n = 78), there was no significant difference between the two groups (35.9 vs. 42.4, p = 0.251). However, in the triple-negative subtype, the mean TIL score was significantly lower in breast cancers with suspicious calcifications than in those without suspicious calcifications (41.5 vs. 62.7, p = 0.045). The other subgroups did not show statistically significant differences.
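As a quick arithmetic check of the combined group, the mean of the pooled suspicious-calcification group can be approximated as the size-weighted mean of the two subgroup means. This is only a sketch: it assumes the subgroup sizes of 26 and 26 given in the Methods, it uses the rounded means reported above, and the helper name `pooled_mean` is hypothetical.

```python
def pooled_mean(means, sizes):
    """Size-weighted mean of several group means."""
    return sum(m * n for m, n in zip(means, sizes)) / sum(sizes)


# Reported (rounded) mean TIL scores of the two subgroups and their sizes.
combined = pooled_mean([37.0, 34.9], [26, 26])
print(round(combined, 2))
```

The result, about 35.95, agrees with the reported combined mean of 35.9 within the rounding of the subgroup means.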

Discussion
We searched for gene expression profiles of breast cancers with suspicious calcifications and compared those with gene expression profiles of breast cancers without suspicious calcifications. Gene expression patterns were different according to the status of mammographic calcifications in breast cancer. First, breast cancers with highly suspicious calcifications are associated with high levels of mRNA expression of ERBB2 and decreased expression of COL11A1 and FNDC1. Second, GO and clustering analysis using DAVID and BioLattice revealed that breast cancer patients with highly suspicious calcifications on mammography were highly associated with decreased immune system activity.
In our experiments, ERBB2 repeatedly appeared as overexpressed in the lists of the top 20 genes ordered by p value or fold change and among the genes of the commercially available signatures PAM50, MammaPrint® and Oncotype DX® (Tables 2-4). To the best of our knowledge, this is the first report of a relationship between mammographic calcifications and mRNA expression of ERBB2. In addition, it supports and bridges the findings of other studies that showed an association between HER2 overexpression and calcifications in breast cancer patients 16,18,19,25,26. Yepes et al. 27 reported that a mass with pleomorphic calcifications on mammography may predict an intermediate to high recurrence score in patients with stage I-II ER-positive, HER2-negative, and lymph node negative invasive breast cancer. Similarly, Chae et al. 28 reported that the high-risk group assessed by the 21-gene recurrence score assay was associated with the presence of calcification in the mass and that the absence of calcification in the mass was an independent predictor of a low recurrence score. In our study, breast cancers with suspicious calcifications had low expression of COL11A1 and FNDC1. Little is known about the FNDC1 gene. COL11A1 is an extracellular matrix molecule that plays an important role in endochondral ossification 29. In addition, there is evidence that COL11A1 overexpression is related to up-regulation of TGF-β1 and is a biomarker of activated cancer-associated fibroblasts in several cancers of epithelial cell origin 30,31. Several articles reported that co-culture of cancer-associated fibroblasts with breast cancer cells increased metastatic ability 32,33. COL11A1 is known to be associated with tumor aggressiveness, tumor progression, infiltration, metastasis and poor survival in several cancers 30,34,35. However, Fuentes-Martínez et al. 36 reported that COL11A1 is a stromal marker but does not have prognostic value in breast cancer.
Taken together, these findings suggest that breast cancers with suspicious calcifications may form calcifications through a pathway other than endochondral ossification and may have little association with stromal remodeling.
To the best of our knowledge, this is the first report that breast cancers with mammographic calcifications are associated with decreased immune system activity. Although the direct cellular mechanisms or biologic pathways linking calcifications and the immune system have not yet been discovered, we can offer one possible explanation. Tse et al. 37 reported that rapidly proliferating tumor cells that consume the blood supply result in tumor necrosis and subsequent acidosis in the microenvironment, which finally causes calcium accumulation in the ducts. We can assume that an activated immune system may deter the proliferation of tumor cells and the necrosis caused by hypoxia 38,39. By contrast, breast cancers with a decreased antitumor immune response would have uncontrolled tumor cell proliferation and tumor necrosis, finally causing calcifications in the ducts. Therefore, it is possible that breast cancers with suspicious calcifications are associated with decreased immune system activity. However, this explanation, in which suspicious calcifications follow rapid tumor cell proliferation under decreased immune system activity, conflicts with the fact that DCIS is an indolent non-invasive tumor yet is frequently associated with mammographic calcifications 40,41. The mechanisms by which suspicious calcifications are produced may vary between invasive cancer and DCIS, and this must be evaluated in future studies. The mean TIL score was significantly lower in breast cancers with suspicious calcifications than in breast cancers without suspicious calcifications only in the triple-negative subtype. The TIL results partly support our gene expression analysis showing that breast cancers with suspicious calcifications are associated with decreased immune system activity. This might suggest that the association between calcifications and the immune system is stronger and more apparent in the triple-negative subtype than in the others.
Figure 2. Concept lattice constructed from the comparison between breast cancers with highly suspicious calcifications and those without suspicious calcifications (n = 1838) in the total population, with 60 clusters annotated by GO terms in the biological process category. Only 24 of 60 clusters demonstrate at least one significant GO term (p < 0.001). Overall, the dataset shows 125 significant annotations with 106 unique GO terms. The core-periphery substructures are marked with colors (i.e., core in red, communicating in green, independent in yellow and peripheral in gray).

Our study has several limitations. First, this study has a retrospective design, and there may be selection bias in our database. Second, we do not have an independent validation set to support our results. At the time of our study, the Cancer Imaging Archive of breast TCGA data had only 4 patients with preoperative mammography. In addition, we could not perform analyses regarding survival or recurrence because of a lack of long-term follow-up data. We therefore tried to use commercially available gene signatures; however, not all of their genes were available on the Affymetrix GeneChip® Human Gene 2.0 ST arrays. To address this weakness, we analyzed pathologic TIL scores in the same population. However, a limitation remains in that we simply hypothesized that TIL might be an indicator of immune system activity, even though the immune system is a very complex and interactive system.

Third, the interpretation of mammographic calcifications is subjective, and other imaging findings were not considered in this study. Fourth, breast cancer is a heterogeneous disease with regard to its gene expression profiles. It is possible that the differences between the groups are due not only to the calcification status but also to the intrinsic subtype of breast cancer. Finally, we did not perform in vitro experiments on the cellular mechanism of calcium deposition or ex vivo experiments to determine whether immune cells are differentially present or whether immune cell markers are differentially expressed according to mammographic calcifications. Similarly, we did not analyze the specific mineral species in the specimens. However, our study has several distinct strengths.
Having both fresh frozen tissues obtained from surgical specimens and initial digital mammography is a rare and valuable resource. Additionally, to the best of our knowledge, this is the first study examining the global gene expression profiles of calcifications in breast cancer, and, with over one hundred patients with microarray data enrolled, it is the largest study to correlate microarray data with imaging features of breast cancer.
In addition, our results may guide further studies of the biological processes or cell signaling pathways involved in calcification formation.
In conclusion, gene expression patterns in breast cancer are different according to mammographic calcifications. Breast cancers with highly suspicious calcifications are associated with high levels of mRNA expression of ERBB2 and decreased immune system activity. These results, if validated, could be used as the basis for future hypothesis-based studies.

Methods
Study population. The institutional review board of Seoul National University Hospital (IRB No. 1409-128-612) approved this retrospective study, and all patients provided written informed consent for their breast cancer tissue to be used for genome sequencing (IRB No. 1405-088-580) before operation. All experiments were performed in accordance with relevant guidelines and regulations. From the Breast Imaging Center database of Seoul National University Hospital, we retrospectively identified 2095 consecutive patients with primary operable breast cancer who underwent preoperative imaging workup and surgery between 2003 and 2012. We excluded patients who had (a) non-invasive cancer, (b) prior neoadjuvant chemotherapy, or (c) prior excisional biopsy or breast surgery. As a result, a total of 168 women (mean, 50.7 yrs; age range, 21-79 yrs) comprised our study group (Figure 3).

Mammography acquisition and analysis. Mammography was performed using a Senograph 2000D or Senograph DS (GE Healthcare, Milwaukee, WI, USA) or a LORAD Selenia (Hologic, Boston, MA, USA) digital mammography unit. Standard two-view mammography was performed with additional views as necessary. A Senograph system was used on 104 (61.9%) women, and a Selenia system was used on 64 (38.1%) women.
Mammographic features of the patients were assessed according to the Breast Imaging-Reporting and Data System (BI-RADS) 6. Using calcifications at mammography as the criterion for grouping the patients, three radiologists (S.E.S., A.C., and W.K.M.) with different degrees of experience in interpreting mammography independently analyzed the calcifications without access to genomic data. Each radiologist recorded a BI-RADS category for each case: 1, normal; 2, benign; 3, probably benign; 4A, low suspicion; 4B, intermediate suspicion; 4C, high suspicion; and 5, highly suggestive of cancer. After each radiologist finished the analysis, a final consensus was established for each case. Cases with a BI-RADS category of 4C or 5 were classified as breast cancer with highly suspicious calcifications (n = 32), cases with a BI-RADS category of 3, 4A, or 4B as breast cancer with low-to-intermediate suspicious calcifications (n = 37), and the other cases, with a BI-RADS category of 1 or 2, as breast cancer without suspicious calcifications (n = 99). For example, fine pleomorphic, fine linear or fine linear branching calcifications were classified as highly suspicious calcifications, whereas amorphous or coarse heterogeneous calcifications were classified as low-to-intermediate suspicious calcifications 6,7 (Figure 4).
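The grouping rule described above can be written as a small lookup. This is a minimal sketch of the stated mapping from consensus BI-RADS category to study group; the function name `calcification_group` and the input normalization are illustrative assumptions, not part of the study's workflow.

```python
# Consensus BI-RADS categories that define each study group (from the text).
HIGHLY_SUSPICIOUS = {"4C", "5"}
LOW_TO_INTERMEDIATE = {"3", "4A", "4B"}
NOT_SUSPICIOUS = {"1", "2"}


def calcification_group(birads_category: str) -> str:
    """Map a consensus BI-RADS category for calcifications to a study group."""
    # Normalize input such as "4 a" or " 4c " to "4A" / "4C".
    category = birads_category.replace(" ", "").upper()
    if category in HIGHLY_SUSPICIOUS:
        return "highly suspicious"
    if category in LOW_TO_INTERMEDIATE:
        return "low-to-intermediate suspicious"
    if category in NOT_SUSPICIOUS:
        return "without suspicious calcifications"
    raise ValueError(f"unknown BI-RADS category: {birads_category!r}")


print(calcification_group("4C"))
print(calcification_group("4 A"))
print(calcification_group("2"))
```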

Tissue samples and microarray analysis: RNA isolation, preparation, hybridization, and data acquisition. Tissue samples were dissected through the centers of the carcinomatous region during surgery at our hospital between 2003 and 2012. These samples were frozen in liquid nitrogen within 20 min following surgical devascularization and stored at −80 °C. Total RNA from each sample was extracted using TRIzol® reagent (Invitrogen, Carlsbad, CA, USA). RNA quality was assessed with an Agilent 2100 Bioanalyzer using the RNA 6000 Nano Chip (Agilent Technologies, Amstelveen, The Netherlands), and the quantity was determined with an ND-2000 spectrophotometer (Thermo Inc., DE, USA). The median RNA concentration was 1.202 g/L (range, 0.143-2.986 g/L). Total RNA was measured as the UV absorbance at 260 nm. Sample purity was assessed by measuring the OD 260:280 nm and OD 260:230 nm ratios. The integrity of RNA samples was confirmed by the appearance of distinct 28S and 18S bands of ribosomal RNA. The RNA integrity number (RIN) was determined using the RIN algorithm of the Agilent 2100 Expert Software 42. RNA quality was good, with 260/280 and 260/230 absorbance ratios greater than 1.7 and 1.3, respectively, for each sample. The mean 28S/18S ratio was 1.0 (range, 0.3-1.9), and the RIN was greater than or equal to 5.0.
The sample preparation was performed according to the instructions and recommendations provided by the manufacturer. For each RNA sample, 300 ng was used as input into the Affymetrix procedures as recommended by the protocol (http://www.affymetrix.com/). RNA samples were converted to double-strand cDNA. Using a random hexamer incorporating a T7 promoter, amplified RNA (aRNA) was generated from the double-strand cDNA template through an in vitro transcription reaction and purified with the Affymetrix sample cleanup module. The cDNA was regenerated through a random-primed reverse transcription using a dNTP mix containing dUTP. The cDNA was then fragmented by UDG and APE 1 restriction endonucleases and end-labeled by a terminal transferase reaction incorporating a biotinylated dideoxynucleotide. Fragmented and end-labeled cDNAs were hybridized to the GeneChip Human Gene 2.0 ST oligonucleotide arrays (53,617 probes) for 16 hours at 45 °C and 60 rpm, as described in the GeneChip Whole Transcript (WT) Sense Target Labeling Assay Manual (Affymetrix). After hybridization, chips were stained and washed in the GeneChip Fluidics Station 450 (Affymetrix). An Affymetrix Model 3000 G7 scanner and Affymetrix Command Console Software 1.1 were used for scanning and data extraction. The raw CEL file containing intensity data was used for further analysis. For normalization, the robust multiarray-average algorithm 43, which was developed in the Speed Lab at UC Berkeley, was used.

Statistical analysis and bioinformatics analysis. Demographic characteristics and clinical, pathologic, and mammographic findings were compared between groups using the chi-square test and Fisher's exact test for categorical variables. A one-way analysis of variance (ANOVA) was performed for numerical variables.
Statistical analyses of microarray data were performed using R software, version 3.2.4 (http://www.r-project.org/). The oligo package for R, which is freely available from Bioconductor (http://www.bioconductor.org/), was used for processing the microarray data 44,45. To identify differentially expressed genes among the three groups, a one-way ANOVA was performed with post hoc comparisons using Tukey's honest significant difference test 46, and a p value < 0.05 was applied as the threshold for statistical significance in the subsequent data analysis.
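For readers who want to see the per-gene computation spelled out, the sketch below calculates the one-way ANOVA F statistic for one gene across three groups, together with a fold change from log2-scale values. It is written in plain Python rather than the R workflow used in the study; the expression values are invented toy numbers, the function names are hypothetical, and defining fold change as 2 raised to the difference of group means is an assumption about log2-scale (RMA-type) data. The p value and the Tukey post hoc test are left out here; they could be obtained from a statistics library, for example `scipy.stats.f_oneway` and `scipy.stats.tukey_hsd`.

```python
from statistics import fmean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = fmean([x for g in groups for x in g])
    means = [fmean(g) for g in groups]
    # Between-group and within-group sums of squares.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


def fold_change(log2_a, log2_b):
    """Ratio of mean expression a/b, assuming both inputs are log2 values."""
    return 2 ** (fmean(log2_a) - fmean(log2_b))


# Invented log2 expression values for one gene in the three groups.
no_suspicious = [7.0, 7.2, 6.8]
low_intermediate = [7.1, 7.3, 6.9]
highly_suspicious = [8.3, 8.5, 8.1]

f_stat, df1, df2 = one_way_anova_f(
    [no_suspicious, low_intermediate, highly_suspicious]
)
fc = fold_change(highly_suspicious, no_suspicious)
print(f"F({df1}, {df2}) = {f_stat:.2f}, fold change = {fc:.3f}")
```

In a full analysis this calculation would be repeated for every gene on the array, with the resulting p values compared against the chosen significance threshold.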
To gain insight into the underlying biology of DEGs related to mammographic calcifications, functional categories enriched in the differentially expressed genes were identified using the functional annotation and clustering tool of the Database for Annotation, Visualization, and Integrated Discovery (DAVID) v6.7 (https://david.ncifcrf.gov/) 47,48. The probability that a GO biological process term was overrepresented was determined using a modified Fisher's exact test, comparing the proportion of genes in the entire genome that are part of that GO term to the proportion of differentially expressed genes that are part of the same GO term 49.
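The enrichment test described above can be sketched with a hypergeometric upper tail. The example below computes a one-sided Fisher-type p value and a more conservative variant in which one gene is removed from the count of list genes annotated to the term, which is the general idea behind DAVID's modified test. This is a simplified illustration under stated assumptions: the function names and toy counts are hypothetical, and the numbers it produces are not guaranteed to match DAVID's output.

```python
from math import comb


def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k) when n genes are drawn from N, of which K belong to the term."""
    k = max(k, 0)
    upper = min(n, K)
    total = comb(N, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total


def enrichment_p(list_hits, list_size, term_size, background_size):
    """One-sided Fisher-type p value for over-representation of a GO term."""
    return hypergeom_upper_tail(list_hits, list_size, term_size, background_size)


def conservative_enrichment_p(list_hits, list_size, term_size, background_size):
    """Same test after removing one list gene from the term (simplified variant)."""
    return hypergeom_upper_tail(list_hits - 1, list_size, term_size, background_size)


# Toy counts: 20 background genes, 5 in the term, 5 DEGs, 3 DEGs in the term.
p_fisher = enrichment_p(3, 5, 5, 20)
p_conservative = conservative_enrichment_p(3, 5, 5, 20)
print(f"Fisher-type p = {p_fisher:.4f}, conservative p = {p_conservative:.4f}")
```

Subtracting one hit always yields a p value at least as large as the unmodified test, which penalizes terms supported by very few genes.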
In addition, to interpret and organize the observed biological changes, we used BioLattice (http://www.snubi.org/software/biolattice/), a mathematical framework based on concept lattice analysis that associates gene expression clusters with biological ontologies or biological pathways. BioLattice considers gene expression clusters as objects and annotations as attributes and provides a graphical summary of the order relationships by arranging them on a concept lattice in an order based on the set inclusion relationship. Rather than interpreting one cluster at a time, BioLattice integrates all gene expression clusters and annotations into a unified framework as a lattice of concepts 50. We used Pearson correlation as the similarity measure and set k arbitrarily to 60, 17 and 16 for the comparisons between breast cancers with highly suspicious calcifications and those without suspicious calcifications, breast cancers with highly suspicious calcifications and those with low-to-intermediate suspicious calcifications, and breast cancers with low-to-intermediate suspicious calcifications and those without suspicious calcifications, respectively, so that each cluster contained 15 to 30 genes 51. We selected a threshold of p < 0.001. In addition, we compared our results with commercially available gene signatures used to provide a recurrence score: the PAM50 assay (NanoString Technologies Inc., Seattle, WA, USA) 52,53, the MammaPrint® 70-gene Breast Cancer Recurrence Assay (Agendia, Huntington Beach, CA, USA) 54,55 and the 21-gene Recurrence Score® assay (Oncotype DX®, Genomic Health, Inc., Redwood City, CA, USA) 56,57.

Pathologic validation. Hematoxylin and eosin-stained slides of frozen human tumor tissue were examined per standard protocols for the pathologic diagnosis. Immunohistochemical analysis was performed on formalin-fixed, paraffin-embedded 4-μm tissue sections using primary mouse monoclonal antibodies for ER, PR, and HER2. For equivocal HER2 results (2+), the status was determined using fluorescence in situ hybridization 58.
After the gene ontology and BioLattice analyses, we found that breast cancers with suspicious calcifications are associated with decreased immune system activity. We therefore additionally planned a pathologic review of TIL as a part of validation. One pathologist (H.S.R., with 10 years of experience) retrospectively reviewed the H&E slides of the patients and assessed the TIL score using the methodological recommendations of the International TILs Working Group 2014 59. Of the 168 patients, pathologic slides were available for review in only 130 patients (77.4%): 78 patients (78.8%) with breast cancers without suspicious calcifications, 26 patients (70.3%) with breast cancers with low-to-intermediate suspicious calcifications, and 26 patients (81.3%) with breast cancers with highly suspicious calcifications. The mean TIL score was compared among the three groups using ANOVA and between two groups using the independent-samples t-test.