The prognostic and predictive value of ESR1 fusion gene transcripts in primary breast cancer

In breast cancer (BC), recurrent fusion genes of estrogen receptor alpha (ESR1) and AKAP12, ARMT1 and CCDC170 have been reported. In these gene fusions, the ligand-binding domain of ESR1 has been replaced by the transactivation domain of the fusion partner, constitutively activating the receptor. As a result, these gene fusions can drive tumor growth hormone-independently, as has been shown in preclinical models, but the clinical value of these fusions has not been reported. Here, we studied the prognostic and predictive value of different frequently reported ESR1 fusion transcripts in primary BC. We evaluated 732 patients with primary BC (131 ESR1-negative and 601 ESR1-positive cases), including two ER-positive BC patient cohorts: one cohort of 322 patients with advanced disease who received first-line endocrine therapy (ET) (predictive cohort), and a second cohort of 279 patients with lymph node negative (LNN) disease who received no adjuvant systemic treatment (prognostic cohort). Fusion gene transcript levels were measured by reverse transcriptase quantitative PCR. The presence of the different fusion transcripts was associated, in uni- and multivariable Cox regression analyses that included current clinico-pathological characteristics, with progression-free survival (PFS) during first-line endocrine therapy in the predictive cohort, and with disease-free survival (DFS) and overall survival (OS) in the prognostic cohort. The ESR1-CCDC170 fusion transcript was present in 27.6% of the ESR1-positive BC subjects and in 2.3% of the ESR1-negative cases. In the predictive cohort, none of the fusion transcripts were associated with response to first-line ET. In the prognostic cohort, the median DFS and OS were 37 and 93 months, respectively, for patients with an ESR1-CCDC170 exon 8 gene fusion transcript and 91 and 212 months, respectively, for patients without this fusion transcript.
In a multivariable analysis, this ESR1-CCDC170 fusion transcript was an independent prognostic factor for DFS (hazard ratio (HR) 1.8, 95% confidence interval (CI) 1.2–2.8, P = 0.005) and OS (HR 1.7, 95% CI 1.1–2.7, P = 0.023). Our study shows that in primary BC only the ESR1-CCDC170 exon 8 gene fusion transcript carries prognostic value. None of the ESR1 fusion transcripts, which are considered to have constitutive ER activity, was predictive for outcome in BC patients with advanced disease treated with endocrine therapy.

ESR1-positive (ESR1+) breast cancers (BC) [1]. However, in the metastatic setting, nearly half of the patients are de novo resistant to endocrine therapy, while the remaining cases acquire resistance over time [2,3]. One of the best characterized mechanisms of acquired resistance to endocrine therapy is the acquisition of mutations within the ligand-binding domain (LBD) of the estrogen receptor alpha gene (ESR1), which activate the receptor constitutively and thereby render tumor cells less dependent on estrogen [4][5][6][7]. Another mechanism that leads to less estrogen dependency of BC cells is the occurrence of ESR1 fusion proteins. Through analysis of RNA-sequencing data in breast cancer, recurrent intragenic fusions of the 5′ end of ESR1 and the 3′ ends of AKAP12, ARMT1 or CCDC170, amongst other genes, have been identified [8][9][10][11][12][13]. The AKAP12, ARMT1 and CCDC170 genes, together with the ESR1 gene, were selected for our evaluation because they are all located at the 6q25.1 locus within 1 Mb distance [14], and because fusions between the two non-coding 5′ exons of ESR1 and the 3′ ends of CCDC170, AKAP12 and ARMT1, which lie upstream of ESR1, were identified in patients resistant to endocrine treatment [9,10].
Gene fusions were preferentially detected in high-grade disease and/or endocrine-resistant forms of ESR1+ BC [10,13]. In particular, an enrichment of the ESR1-CCDC170 fusion was previously reported in HER-positive patients (luminal A 9%, luminal B 3-8% and HER2 3.1%) and was correlated with a worse clinical outcome after endocrine therapy [9,15,16]. The ESR1-AKAP12 fusion was identified in 6.5% of breast cancers that were resistant to treatment with the aromatase inhibitor letrozole [17]. The novel ESR1-ARMT1 fusion was instead detected in a HER2-negative patient with luminal A-like subtype [16] and in a breast cancer patient who had not received endocrine therapy [18]. Moreover, a recent study based on molecular characterization of luminal breast cancer in African American women reported the fusions at a frequency of 11% for ESR1-CCDC170, 8% for ESR1-AKAP12 and 6% for ESR1-ARMT1 [19]. Despite the diversity among these fusions, they share a common structure: they retain the hormone-independent transactivation domain as well as the DNA-binding domain, whereas the ligand-binding domain is lost and replaced with a functional (transactivating) domain of the fusion partner, suggesting a pathological impact in ESR1+ BC [13]. However, the clinical significance of these fusions has not yet been properly addressed in uniform and well-annotated cohorts.
In this study, we explored the occurrence of fusion transcripts of three of the most commonly reported fusion partners of ESR1 (i.e. CCDC170, AKAP12 and ARMT1) and determined the associations of their presence with clinical outcome in a cohort of 732 breast cancer patients, which allowed us to investigate their predictive value for endocrine treatment failure as well as their prognostic value.

Study cohorts
The protocol to study biological markers associated with disease outcome was approved by the medical ethics committee of the Erasmus Medical Centre Rotterdam, The Netherlands (MEC 02.953) and was performed in accordance with the Code of Conduct of the Federation of Medical Scientific Societies in The Netherlands (https://www.federa.org/codes-conduct). The use of coded left-over material for scientific purposes and, therefore, for the greater good, does not require informed consent according to Dutch law and the new European General Data Protection Regulation (GDPR).
In this retrospective study (see Fig. 1A for the CONSORT diagram of the study), we included female patients who underwent surgery for invasive primary breast cancer between 1980 and 2000 in the Netherlands. A further selection criterion was the absence of previously diagnosed cancers, with the exception of basal cell carcinoma or stage Ia/Ib cervical cancer. Within this study, only data from sections of primary tumors with at least 30% invasive tumor cells were included. The details of tissue processing, RNA isolation, cDNA synthesis and quality control (QC) of this cohort have been described previously [20,21]. Tumor grade was assessed according to standard procedures at the time of inclusion. To classify the patients' RNA samples by expression of the estrogen and progesterone receptors, as well as by human epidermal growth factor receptor 2 (HER2) amplification status, reverse transcriptase quantitative PCR (RT-qPCR) was used with cut-offs previously described by us [20,21].
The total cohort consisted of 732 patients with primary breast cancer (131 ESR1-negative and 601 ESR1-positive cases) (Fig. 1B). The clinical relevance of the gene fusion transcripts was evaluated in a predictive and a prognostic cohort of ESR1+ BC patients.
The predictive cohort consisted of 322 breast cancer patients with ESR1+ primary tumors, of whom 235 patients received tamoxifen (40 mg daily) and 87 patients an aromatase inhibitor (AI: anastrozole, letrozole, exemestane [22]) as first-line treatment for recurrent disease. Clinical response to tamoxifen therapy was defined as previously described [20,23]. The prognostic cohort included primary tumors from 279 lymph node negative (LNN) ESR1+ BC patients who had not received any systemic (neo)adjuvant therapy. Of note, 122 of these LNN ESR1+ patients were also included in the predictive cohort. Clinicopathological characteristics of each of these two cohorts are described in Table 1. Associations of ESR1 fusions with clinical parameters of patients enrolled in the predictive cohort and in the prognostic cohort are reported in Table 2 and Table 3, respectively.

RNA isolation and RT-qPCR
Total RNA isolation from human breast cancer tissue and breast cancer cell line models, and quality control, were performed as previously described [20]. Next, cDNA was generated by a cycle at 48 °C for 30 min with RevertAid H-minus (Applied Biosystems, Carlsbad, CA), according to the manufacturer's instructions. The cDNA was then pre-amplified for specific genes as previously described [20]. Briefly, 2 µL of cDNA (0.1 to 1 ng/µL) was subjected to a pre-amplification of 15 cycles using a multiple loci target-specific amplification for ESR1 fusions with AKAP12, ARMT1 and CCDC170 and two reference genes, the Epithelial Cell Adhesion Molecule (EPCAM) and the Hypoxanthine Phosphoribosyltransferase 1 (HPRT1), with TaqMan PreAmp Master Mix (Applied Biosystems), as recommended by the manufacturer. Pre-amplified products were then diluted 12-fold in LoTE buffer (3 mM Tris-HCl/0.2 mM EDTA, pH 8.0) prior to downstream analysis.
Fig. 1 Overview of the study and selection of available patients. A Flow diagram of the study; B Workflow of processing samples: fusion gene mRNA levels were measured in 322 ER-positive primary tumors (predictive cohort) by quantitative reverse transcriptase PCR (RT-qPCR). All patients in this cohort were hormone-naïve, and all experienced a disease recurrence and subsequently received first-line endocrine therapy. The association of the presence of ESR1 fusion genes in the primary tumor with progression-free survival (PFS) after the start of first-line tamoxifen (n = 235) or aromatase inhibitors (n = 87) was evaluated. Similarly, disease-free survival (DFS) and overall survival (OS) were investigated in 279 lymph node negative ER-positive breast cancer patients (prognostic cohort) who had not received any (neo)adjuvant systemic therapy. ESR1: Estrogen Receptor 1 gene; AI: Aromatase Inhibitor; LNN: Lymph node negative; ER: Estrogen Receptor; RT-qPCR: Quantitative reverse transcriptase PCR
Next, 5 µL of each diluted pre-amplified sample was subjected to TaqMan probe-based real-time quantitative PCR (qPCR) for each gene combination, according to the manufacturer's instructions, in an MX3000P Real-Time PCR System (Agilent, Santa Clara, CA). The average expression of HPRT1 and the epithelial marker EPCAM was used as a reference to control RNA quality and to calculate the expression levels of target genes, as previously described [20]. Only those samples with a ∆Cq > 25 relative to the two reference genes were used for further evaluation of gene fusions, as previously described [24][25][26]. Additional file 1 describes the primer sets used in the pre-amplification combination, as well as the TaqMan qPCR assays used to quantify the fusions and reference genes. For ESR1-CCDC170 fusion transcripts, the variants in which exon 2 of ESR1 is fused to the coding region (exon 2 to 11) of CCDC170 were examined (E2-E2, E2-E3, E2-E4, E2-E5, E2-E6, E2-E7, E2-E8, E2-E10 and E2-E11). Samples with a ∆Cq > 25 relative to the reference genes were subsequently validated by MultiNA analysis (Shimadzu Europe, Duisburg, Germany). Only those samples with a MultiNA fusion product of the expected size were considered positive for the fusion transcripts (Additional file 2). The detection of ESR1-CCDC170 fusion transcripts with RT-qPCR and MultiNA analysis was verified and confirmed in a set of breast cancer cell lines reported to be fusion-positive (Additional files 3, 4 and 5).
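The reference-gene normalisation described above can be illustrated with a short sketch. It assumes the common convention that ∆Cq is the mean Cq of the reference genes minus the Cq of the target, and a PCR efficiency of 100%, so that the relative level equals 2^∆Cq. The function names and Cq values are hypothetical and are not taken from the study or from reference [20].

```python
from statistics import mean


def delta_cq(target_cq, reference_cqs):
    """Return delta-Cq = mean reference Cq - target Cq (assumed convention)."""
    return mean(reference_cqs) - target_cq


def relative_expression(target_cq, reference_cqs):
    """Target level relative to the reference genes, assuming 100% PCR efficiency."""
    return 2.0 ** delta_cq(target_cq, reference_cqs)


# Hypothetical Cq values for a single sample (illustration only, not study data).
sample = {"HPRT1": 24.0, "EPCAM": 22.0, "ESR1-CCDC170_E2-E8": 26.0}
reference_cqs = [sample["HPRT1"], sample["EPCAM"]]

dcq = delta_cq(sample["ESR1-CCDC170_E2-E8"], reference_cqs)
level = relative_expression(sample["ESR1-CCDC170_E2-E8"], reference_cqs)
```

Averaging two reference genes, as done here with HPRT1 and EPCAM, makes the normalisation less sensitive to variation in either gene alone.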

Statistical analysis
All data were entered in SPSS version 24 (IBM Corp., Armonk, NY, USA) to generate the tables and perform the statistical analyses. For contingency tables, the Pearson Chi-Square Test was used. All P-values are 2-sided and P < 0.05 was considered statistically significant.
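As an illustration of the contingency-table test named above, the following sketch computes a Pearson chi-square statistic and its two-sided P-value for a 2×2 table using only the Python standard library. It is not the SPSS procedure used in the study; the function name and the counts are hypothetical, and no continuity correction is applied.

```python
import math


def pearson_chi_square_2x2(table):
    """Pearson chi-square statistic (no continuity correction) and P for a 2x2 table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, the chi-square survival function is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts (not study data): rows = fusion present / absent,
# columns = two levels of a clinical parameter.
example_table = [[30, 20], [40, 60]]
chi2, p_value = pearson_chi_square_2x2(example_table)
is_significant = p_value < 0.05  # two-sided threshold stated in the text
```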

Association of ESR1 with its CCDC170, AKAP12 and ARMT1 fusion partner
The presence of the ESR1 fusions with AKAP12, ARMT1 and CCDC170 (exon 2 to exon 11) was evaluated in breast cancer tissue samples from 732 breast cancer patients. Fusion transcripts were predominantly detected in the ESR1+ population, with CCDC170, AKAP12 or ARMT1 fusion transcripts observed in 27.6%, 4.04% and 1.4% of the ER-positive cases, respectively, and in 2.3%, 0.8% and 0% of the ESR1-negative cases, respectively (P < 0.001, two-tailed Fisher's exact test; Table 4 and Additional file 6). In ER-positive tumors, full-length ESR1 and CCDC170 mRNA levels were strongly correlated (R² = 0.31, P < 0.0001) (Additional file 7A), and transcript levels of both were significantly higher in the group of samples with an ESR1-CCDC170 fusion transcript than in the group without (Student's t-test, P = 0.0316 and 0.0001, respectively; Additional file 7B).
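The comparison of ESR1-CCDC170 prevalence between ESR1-positive and ESR1-negative cases can be sketched as a two-sided Fisher's exact test. The counts below (166 of 601 and 3 of 131) are back-calculated from the reported percentages (27.6% and 2.3%) and are therefore an assumption; they may differ from the exact counts in Table 4. The function is a plain standard-library implementation written for illustration, not the software used in the study.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    denom = comb(n, row1)

    def prob(k):
        # Hypergeometric probability of k events in row 1 with all margins fixed.
        return comb(col1, k) * comb(n - col1, row1 - k) / denom

    p_observed = prob(a)
    k_min = max(0, col1 - row2)
    k_max = min(col1, row1)
    # Sum the probabilities of all tables that are no more likely than the observed one.
    return sum(
        p
        for p in (prob(k) for k in range(k_min, k_max + 1))
        if p <= p_observed * (1 + 1e-7)
    )


# Assumed counts reconstructed from the reported percentages (not taken from Table 4):
# ESR1-positive: 166 fusion-positive, 435 fusion-negative (n = 601)
# ESR1-negative:   3 fusion-positive, 128 fusion-negative (n = 131)
p_ccdc170 = fisher_exact_two_sided(166, 435, 3, 128)
```

Under these assumed counts, the resulting P-value falls below the 0.001 level reported in the text.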

Prevalence of ESR1 fusion genes in normal mammary tissue, benign lesions and carcinoma in situ of the breast
While AKAP12 and ARMT1 fusion transcripts were not found in 36 non-malignant breast tissues taken at a distance from the primary tumor, ESR1-CCDC170 fusion transcripts were detected in 67% of these normal breast tissues of patients with diagnosed breast cancer (Table 4). Note that CCDC170, but not ESR1, mRNA levels were significantly higher in these normal (adjacent to tumor) tissues than in cancer tissue (Kruskal-Wallis test, P < 0.0001; Fig. 2). To investigate this unexpectedly high incidence in more detail, we analyzed normal breast tissues of ten women without diagnosed breast cancer, 16 benign fibroadenomas and 13 ductal carcinoma in situ (DCIS) tissues, all of them ESR1-positive.
In addition, we measured the fusion transcripts in three sets of patient-matched normal breast tissue and primary tumor carcinomas and in four patient-matched sets of primary breast tumors and metastatic lymph nodes, also all ESR1-positive. In none of these cases did we detect any ESR1 fusion transcripts with AKAP12 or ARMT1. However, one of the breast tissues of women without a breast cancer diagnosis (10%) showed ESR1-CCDC170 exon 2 (E2-E2) fusion transcripts, one of the DCIS cases (7.7%) had ESR1-CCDC170 exon 6 (E2-E6) fusion transcripts, and four patients with fibroadenoma (25%) had ESR1-CCDC170 exon 8 (E2-E8) fusion transcripts (Table 4 and Additional file 6). For one of the three matched normal-tumor cases, we found an ESR1-CCDC170 exon 8 fusion in both the primary tumor and the normal breast tissue taken at a distance from the primary tumor.
Finally, for two of the four patients for whom we had a matched primary tumor and lymph node metastasis, an ESR1-CCDC170 exon 2 fusion was present in both the primary tumor and the lymph node metastasis.

Prevalence of ESR1 fusion genes in breast tumor tissues
Since fusion transcripts were predominantly detected in the ESR1+ population, we decided to investigate the clinical relevance of these transcripts in primary tumors.
To this end, we stratified ESR1+ patients into two distinct cohorts: a predictive cohort of advanced BC patients treated with first-line endocrine therapy and a prognostic cohort of primary BC patients with lymph node negative (LNN) disease who did not receive any adjuvant systemic treatment.
In these two ESR1+ cohorts, ESR1-ARMT1 fusion transcripts were detected in four patients of the predictive cohort (1.2%) and in three patients of the prognostic cohort (1.1%). Due to the low incidence of this ESR1-ARMT1 fusion transcript, it was not further pursued. ESR1-AKAP12 fusion transcripts were more common, and observed in 13 patients of the predictive cohort (4.0%) and in five patients of the prognostic cohort (1.8%). The ESR1-CCDC170 fusion transcripts, however, were the most prevalent and detected in the predictive cohort     Table 4).

Association of ESR1 fusion genes with DFS and OS in the prognostic cohort
The presence of ESR1-CCDC170 fusion transcripts in the primary tumor of our ESR1+ LNN patients predicted a shorter disease-free survival in a Cox proportional hazards regression survival analysis (HR ± 95% CI: 1.44 (1.01–2.05), P = 0.044) (  Fig. 3A and Fig. 3B, respectively. A multivariate analysis was performed in which age at primary surgery, pathological tumor classification, tumor grade, progesterone receptor and HER2 status were included. The analysis revealed HER2 status as a significant prognostic factor for overall survival, but not for DFS (P = 0.36) (Table 5).
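The relation between a hazard ratio, its 95% confidence interval and the P-value can be checked with a short worked calculation. The sketch below assumes a Wald-type interval that is symmetric on the log scale, which is the usual output of Cox regression software. The function name is hypothetical, and because the published limits are rounded, the result only approximates the reported P-value.

```python
import math


def wald_p_from_hr_ci(hr, ci_low, ci_high, z_crit=1.959964):
    """Approximate two-sided Wald P-value from a hazard ratio and its 95% CI."""
    # On the log scale the interval is log(HR) +/- z_crit * SE.
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = math.log(hr) / se
    # Two-sided P for a standard normal statistic: erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))


# Univariable DFS estimate reported in the text: HR 1.44 (95% CI 1.01-2.05), P = 0.044.
p_dfs = wald_p_from_hr_ci(1.44, 1.01, 2.05)
```

With the rounded inputs from the text, the back-calculated value lies close to the reported P = 0.044, which indicates that the interval and the P-value are mutually consistent.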

Association of ESR1 fusion genes with clinical characteristics, PFS and post-relapse overall survival in advanced BC patients
The fusion transcripts were related to traditional clinical parameters and to response to first-line endocrine therapy in the predictive cohort (n = 322; tamoxifen, n = 235; aromatase inhibitors, n = 87) (Table 2). In the predictive cohort ESR1-CCDC170 fusion transcripts showed an association with age at start of first-line treatment, whereas ESR1-AKAP12 fusion transcripts were enriched in patients with progesterone-negative primary    Table 5). In addition, the presence of these fusion transcripts did not affect the time from relapse to death (post-relapse survival, HR ± 95% CI: 1.16 (0.85–1.60), P = 0.35 and 1.92 (0.84–4.35), P = 0.12, for ESR1 fusions with CCDC170 and AKAP12, respectively) (Table 5). Similarly, no association with PFS on first-line aromatase inhibitors (n = 87) was found for ESR1-CCDC170 fusion transcripts (HR ± 95% CI: 0.85 (0.53–1.37), P = 0.50) or for ESR1-AKAP12 fusion transcripts (HR ± 95% CI: 1.62 (0.73–3.60), P = 0.24). With data available for only 27 patients post-relapse, we did not analyze post-relapse survival for aromatase inhibitors. Moreover, no significant associations with PFS were seen when the ESR1-CCDC170 exon 2 and exon 8 fusion transcripts were analyzed separately (Table 5).

Discussion
The genetic landscape contributing to de novo or acquired resistance to endocrine therapy in breast cancer patients is not yet completely understood. In this study, we investigated the occurrence of recurrent fusion transcripts between ESR1 and three different loci adjacent to ESR1 (CCDC170, AKAP12 and ARMT1) and correlated their presence with clinical outcome. All of the fusion transcripts analyzed are recurrent and most frequently present in ER-positive disease, and among them ESR1-CCDC170 fusion transcripts were the most predominant. As proposed by others [10,13], the presumption was that these fusion transcripts, which are considered to cause constitutive ER signaling, might signify resistance to endocrine therapy. However, in patients with advanced breast cancer, we did not find that the presence of any of these fusion transcripts was associated with outcome of endocrine therapy, whether it concerned first-line tamoxifen or an aromatase inhibitor. Importantly, smaller effect sizes of these variants may have gone undetected due to the relatively small sample size of the study cohort: 87 patients treated with aromatase inhibitors and 235 with tamoxifen. In contrast, in patients with primary BC who did not receive adjuvant systemic hormone treatment, we found that fusion between ESR1 and CCDC170 in general, and between exon 2 of ESR1 and exon 8 of CCDC170 in particular, predicted shorter disease-free survival as well as shorter overall survival in uni- and multivariable analyses. Thus, ESR1-CCDC170 fusion transcripts pinpoint cancers with an adverse outcome. Understanding the molecular mechanisms that underlie the origin of fusion transcripts could help to comprehend the role of these fusions in carcinogenesis as well as improve the diagnosis of cancer patients [10,13].
Although progress in DNA sequencing has enhanced the detection of recurrent and pathological breast cancer fusions, the complexity of the underlying genomic rearrangement patterns often makes their characterization at the DNA level difficult. Fusions between ESR1 and its neighboring gene CCDC170 are potentially generated by tandem duplication [9,13,27,28], which also causes other genetic rearrangements in cancer [9,29,30]. Kim et al. found a region within the ESR1 genomic locus that is most vulnerable to DNA strand breakage, which often included the intron 6 region of its neighboring gene CCDC170, resulting in an oncogenic ESR1-CCDC170 fusion mRNA transcript in which exon 2 of ESR1 is connected to exons 2–11 of CCDC170, i.e. the C-terminal domain of CCDC170 [31]. Irrespective of the mechanisms causing the gene fusions, they occur in a patient-specific manner, which makes their identification at the DNA level less suitable for routine diagnostics. Our method to analyze fusion transcripts is much less dependent on the exact position of the underlying gene fusion at the DNA level and is therefore better suited for evaluation as a general biomarker in large patient cohorts. However, an important caveat of detecting gene fusions at the transcript level is that it cannot distinguish between fusion transcripts arising from actual genetic rearrangements and those that arise from transcription reading from one gene into the next without a genetic cause. Interestingly, Giltnane et al. rejected the option of run-on transcription for these genes, since the 5′ end of ESR1 is fused to the 3′ ends of CCDC170 and AKAP12, which are upstream of the ESR1 gene [10]. Finally, artefactual fusion sequences, which are randomly ligated during the sequencing procedure, might be generated, as previously reported by Veeraraghavan et al. [13].
Table footnote: The reported P-value is from a log-rank test and the test statistics are from Cox regression analyses.
Overall, because we performed RT-qPCR analysis and investigated RNA rather than DNA, we cannot tell whether the fusion transcripts are the result of DNA rearrangements. Furthermore, to our great surprise, ESR1-CCDC170 and ESR1-AKAP12 fusions were detected in ER-negative patients, albeit at low frequency (2.3% and 0.8%, respectively). Besides sampling bias, this finding might be explained by the challenges of ER and PR determination. Although immunohistochemistry (IHC) is the "gold standard" to determine the surrogate markers ER and PR for breast cancer classification, several studies have addressed limitations of IHC by shedding light on the discordance rates in scoring hormone receptor status, with false-negative and false-positive rates in ER and PR status higher than 20% [32,33]. Similarly, a recent article by Fakhri et al. found that 12.5% of samples negative for ER by IHC were positive by microarray analysis [34]. In this context, we performed RT-qPCR to accurately determine hormone receptor status. However, this method could be subject to bias during RNA measurement. Moreover, a recent study found that in primary breast cancers, the ER-negative phenotype is not the result of mutations in the ER gene, but is due to deficient ER expression at the transcriptional or post-transcriptional level [35]. Therefore, we might hypothesize that ER expression could be restored in ER-negative patients due to the strong impact of the signaling environment, as already demonstrated for breast cancer cells via inhibition of DNA methylation or histone deacetylation [36].
Another interesting question regards the biological significance of clinically relevant fusion transcripts. Gene fusions and their products (RNAs and proteins) are assumed to be exclusive to cancer. However, RNA-sequencing analyses of normal-appearing margins of cancerous specimens showed fusion transcripts in normal tissues as well [37]. In fact, oncogenic rearrangements, such as the EML4-ALK [38], NPM-ALK [39], JAZF1-JJAZ1 [40] and BCR-ABL1 [41] fusions, are also expressed at a low level in histologically non-neoplastic tissues [9]. In our study, expression of ESR1 fused to exon 2 and exon 8 of CCDC170 was found in mammary epithelial tissues derived from women without a diagnosis of breast cancer and in cases with (benign) fibroadenomas, respectively. We also detected fusion transcripts in early stages of breast cancer, such as DCIS. Moreover, ESR1-CCDC170 fusion transcripts were detected in normal breast tissues of patients with diagnosed breast cancer. This argues that a proportion of these transcripts may result from transcript read-through rather than from gene fusions.
According to our results, the expression of ESR1-CCDC170 exon 2 and exon 8 fusion transcripts was linked to a less favorable disease course in BC patients who had not received adjuvant systemic treatment. Overall, our results are in agreement with those reported by Veeraraghavan et al., who showed that ESR1-CCDC170 fusions, when introduced into ER-positive breast cancer cells, lead to a marked increase in cell motility and colony-forming ability, an increase in S-G2/M phase cells and a decrease in G0/G1 phase cells. Although several functional studies [9,42] demonstrated a role of ESR1-CCDC170 fusions in endocrine therapy resistance, no relationship between fusion transcripts and treatment outcome was observed in our predictive cohort. Since ESR1-CCDC170 fusions demonstrated no predictive value for endocrine therapy resistance in our study, their prognostic value might be explained by the recurrent incidence of read-through events during cell cycle progression. The latter has been exemplified by the abundance of CTSD-IFITM10 read-through fusions during breast cancer cell proliferation [43].

Conclusions
The most important conclusion from our work is that among the fusion transcripts evaluated measuring ESR1-CCDC170 exon 8 fusion transcripts in primary