RNA-Based CTC Analysis Provides Prognostic Information in Metastatic Breast Cancer

In metastatic breast cancer (MBC), the molecular characterization of circulating tumor cells (CTCs) provides a unique tool to understand metastasis biology and therapy resistance. We evaluated the prognostic significance of gene expression in EpCAM(+) CTCs in 46 MBC patients based on a long follow-up. We selected a panel consisting of stem cell markers (CD24, CD44, ALDH1), the mesenchymal marker TWIST1, receptors (ESR1, PGR, HER2, EGFR) and the epithelial marker CK-19. Singleplex RT-qPCR was used for TWIST1 and CK-19 and multiplex RT-qPCR for stem cell markers and receptors. A group of 19 healthy donors (HD) was used as control. Univariate (p = 0.001) and multivariate analysis (p = 0.002) revealed the prognostic value of combined gene expression of CK-19(+), CD44high/CD24low, ALDH1high/CD24low and HER2 over-expression for overall survival (OS). The Kaplan–Meier estimates of OS were significantly different in patients positive for CK-19 (p = 0.028), CD44high/CD24low (p = 0.002), ALDH1high/CD24low (p = 0.007) and HER2 (p = 0.022). Our results indicate that combined gene expression analysis in EpCAM(+) CTCs provides prognostic information in MBC.


Introduction
Circulating tumor cells (CTCs) are rare, heterogeneous and difficult to analyze, but constitute highly important players in liquid biopsy [1-3], since rational treatment decisions and monitoring of therapeutic response could be based on the identification of cancer-specific biomarkers in these cells [4]. CTC enumeration with the CellSearch system provides prognostic information in metastatic breast, prostate and colorectal cancer and is the only FDA-cleared assay up to now [5,6].
In addition to enumeration, the molecular characterization of CTC both at a bulk and single cell level [7,8] reveals tumor heterogeneity and can give valuable information on cancer evolution in real time [7,9,10]. CTC analysis at the RNA level could provide important prognostic information and represents an innovative and promising approach in the clinical management of cancer patients [11]. There is now evidence showing that RNA analysis in CTCs of patients with metastatic castration resistant prostate cancer (mCRPC) at several time-points can demonstrate a continuous change in the expression of AR-V7 splice variant and this has already been connected with response to chemotherapy and androgen deprivation therapy [12,13]. The prognostic significance of CK-19 positive CTCs has already been shown in early and metastatic breast cancer (MBC) [14]. There is evidence that the expression of a variety of markers in CTCs can change during systemic treatment of MBC patients [15,16], and specific gene signatures of CTCs in breast cancer are correlated with the risk for brain metastasis [4]. The presence of mutations in specific genes like PIK3CA or ESR1, or epigenetic alterations in CTCs can define specific subgroups of patients suitable for targeted therapy [17,18].
In primary tumors, it has been shown that subpopulations of tumor cells that display the CD44high/CD24low profile and express aldehyde dehydrogenase 1 (ALDH1+) are of high tumorigenic potential [19]. The putative stem cell phenotype (CD44+/CD24− and/or ALDH1+/CD24−) has been detected in CTCs [20], and the tumor microenvironment regulates the plasticity of cancer stem cells through transition between epithelial to mesenchymal-like (EMT) and mesenchymal to epithelial-like (MET) states [21]. Based on molecular analysis in EpCAM (+) CTCs at the RNA level, we have previously shown that in early stage breast cancer, there is a significantly higher risk of relapse and death in patients expressing both stem cell and mesenchymal characteristics [22].
Various studies have shown that the expression of ER, PR and HER2 in CTCs may change during the course of the disease [23]. Differences reported in HER2 amplification between CTCs and the primary tumor [24] suggest that HER2 amplification in CTCs could add important information for HER2-targeted therapy [25]. In addition to HER2 amplification, the detection of HER2 overexpression in CTCs at the RNA level is able to predict the HER2 status of metastases [23]. It has also been shown in patients with HER2-positive breast cancer that the expression of truncated HER2 on CTCs is associated with poor survival and could be related to resistance to trastuzumab [26]. On the other hand, the effectiveness of endocrine therapy in patients with hormone receptor (HR)-positive breast cancer is limited by high rates of de novo resistance and acquired resistance during treatment [27]. Therefore, real-time monitoring of ER/PR status in CTCs may help to understand the basis of resistance to endocrine treatment [28].
In the present study we report on the prognostic significance of combined gene expression analysis in EpCAM (+) CTCs in MBC patients using a panel consisting of the epithelial marker CK-19, a panel of receptors (ESR1, PR, HER2, EGFR), the mesenchymal marker TWIST1 and stem cell markers (CD24, CD44, ALDH1).

Cell Lines
The human mammary carcinoma cell line SKBR-3 was used as a positive control for CD24, CD44, ALDH1, and HPRT expression [22]. SKBR-3 and T47D cancer cell lines were used as positive controls for the development of the quadraplex RT-qPCR assay for ESR1, PR, HER2, and EGFR expression. MCF-7 cells were used as a positive control for CK-19 expression, while for TWIST1 expression MDA-MB-231 cells were used [22]. Cells were counted in a hemocytometer and their viability was assessed by trypan blue dye exclusion. cDNAs of all these cancer cell lines were kept in aliquots at −20 °C and used as positive controls in parallel to the analysis of patient samples.

Patients
Forty-six patients with already diagnosed MBC were enrolled in the Medical Oncology Unit of "Elena Venizelou" Hospital from July 2009 until February 2011, and their clinical characteristics at the time of diagnosis are shown in Table 1. Peripheral blood was collected from all patients when metastasis was verified and before the first line of treatment. The median follow-up period since primary tumor diagnosis was around 8 years (mean ± SD: 7.7 ± 4, median: 8 years). All patients signed an informed consent to participate in the study, which was approved by the Ethics and Scientific Committees of our Institution. A group of 19 healthy female blood donors was used as a control group.

Isolation of EpCAM (+) CTCs, RNA Extraction and cDNA Synthesis
Peripheral blood (10 mL in EDTA) from patients and healthy donors (HD) was collected and processed within 3 h in exactly the same manner. All steps, including the isolation of EpCAM (+) cells, RNA extraction and cDNA synthesis, were performed as previously described [22].

RT-qPCR
Singleplex RT-qPCR assays were used for the epithelial marker CK-19 [29] and the mesenchymal marker TWIST1 [22], as previously described. A quadraplex RT-qPCR was used for the stem cell markers (CD24, CD44, ALDH1) together with the reference gene HPRT, as recently described [22]. For ESR1, PR, HER2, and EGFR we developed and validated a novel quadraplex RT-qPCR assay. Each probe set included a 3'-fluorescein (F) donor probe and a 5'-LightCycler (LC) acceptor probe that was different for each gene: ESR1 at 610 nm, PR at 640 nm, HER2 at 670 nm, and EGFR at 705 nm. A color compensation test was performed using pure dye spectra to correct for spectral overlap between dyes [16]. Component concentrations and the cycling conditions of the quadraplex RT-qPCR assay were optimized in detail. The amplification reaction mixture (10 µL) for the ESR1, PR, HER2, EGFR multiplex assay contained 1 µL of the PCR Synthesis Buffer (5X), 1.6 µL of MgCl2 (25 mM), 0.2 µL dNTPs (10 mM), 0.2 µL BSA (10 µg/µL), 0.1 µL Hot Start DNA polymerase (HotStart, 5 U/µL, Promega, USA), 0.6 µL of a mixture containing all eight primers (10 µM), 0.5 µL of a mixture containing all eight dual hybridization probes (3 µM) and H2O (added to the final volume). Cycling conditions were: 95 °C/2 min; 45 cycles of 95 °C/20 s, annealing at 58 °C/20 s and extension at 72 °C/20 s. For the development and analytical evaluation of the assay, we generated individual PCR amplicons corresponding to the four gene targets studied to serve as quantification calibrators, as we have previously described [16,22]. All RT-qPCR reactions were performed in the LightCycler 2.0 (Roche, Germany), and for every RT-qPCR assay we set a cut-off value following strict criteria. For CK-19 and EGFR we did not estimate any cut-off value, since these genes are not expressed in PBMCs of HD.
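The reaction mixture listed above can be checked with the dilution relation C_final = C_stock × V_added / V_total. The following is a minimal sketch, not part of the published protocol; it assumes that the stated stock concentrations of the primer and probe mixtures apply to each oligonucleotide individually, and the variable names are illustrative only.

```python
# Final concentrations in the 10 uL quadraplex reaction (illustrative check).
# Assumption: the 10 uM / 3 uM figures are per-oligonucleotide concentrations
# in the primer and probe mixtures; the source does not state this explicitly.
REACTION_VOLUME_UL = 10.0

# name: (volume added in uL, stock concentration in the unit named in the key)
components = {
    "MgCl2 (mM)": (1.6, 25.0),
    "dNTPs (mM)": (0.2, 10.0),
    "BSA (ug/uL)": (0.2, 10.0),
    "primers (uM)": (0.6, 10.0),
    "probes (uM)": (0.5, 3.0),
}

def final_concentration(volume_ul, stock, total_ul=REACTION_VOLUME_UL):
    """Dilution relation: C_final = C_stock * V_added / V_total."""
    return stock * volume_ul / total_ul

final = {name: final_concentration(v, c) for name, (v, c) in components.items()}

# Polymerase is usually reported as units per reaction rather than a concentration.
polymerase_units = 0.1 * 5.0  # 0.1 uL of a 5 U/uL stock

# Volume taken up by the listed reagents (buffer 1 uL and polymerase 0.1 uL included).
listed_volume_ul = 1.0 + 0.1 + sum(v for v, _ in components.values())
remaining_volume_ul = REACTION_VOLUME_UL - listed_volume_ul

for name, value in final.items():
    print(f"{name}: {value:.2f}")
print(f"polymerase: {polymerase_units:.1f} U per reaction")
print(f"volume left for water (and template): {remaining_volume_ul:.1f} uL")
```

Under these assumptions the listed reagents occupy 4.2 µL, leaving 5.8 µL of the 10 µL reaction for water and, presumably, the cDNA template, whose volume is not given in the text.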
For all other genes, which were also expressed in HD samples (due to the background of PBMCs co-isolated with CTCs), we estimated the cut-off value based on the mean Cq value and the standard deviation for each gene in HD samples. Thus, the mean + 2 SD was used as the upper limit of normal background expression (with 95% confidence) [22,30].
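The mean + 2 SD rule can be sketched as follows. This is an illustration under stated assumptions rather than the exact procedure used here: the input is taken to be a per-gene expression value for each HD sample on a scale where larger values mean more transcript, the sample standard deviation (n − 1) is used, and the function names and numbers are hypothetical.

```python
from statistics import mean, stdev

def background_cutoff(hd_values, n_sd=2.0):
    """Upper limit of background expression: mean + n_sd * SD of healthy-donor values.

    Assumption: hd_values are expression values where larger means more transcript;
    stdev() is the sample standard deviation (n - 1 in the denominator).
    """
    return mean(hd_values) + n_sd * stdev(hd_values)

def is_positive(sample_value, cutoff):
    """A sample is called positive when its value exceeds the background cut-off."""
    return sample_value > cutoff

# Hypothetical healthy-donor values for one gene (not data from this study).
hd_values = [1.0, 2.0, 3.0]
cutoff = background_cutoff(hd_values)  # mean 2.0, SD 1.0
print(cutoff, is_positive(4.5, cutoff), is_positive(3.5, cutoff))
```

If raw Cq values were compared directly, the direction of the test would have to be reversed, since a lower Cq corresponds to higher expression.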

Statistical Analysis
Statistical analysis was performed using SPSS (SPSS Statistics 25.0). The Mann-Whitney U test was used to evaluate the differences in gene expression between cancer patients and healthy individuals. Associations between gene expression markers and other clinicopathological variables of the MBC patients were analyzed using Fisher's exact test. The overall survival (OS) rate was calculated by the Kaplan-Meier method and was evaluated by the log-rank test. OS was defined as the time from sample collection, and thus registration to the study, to death from any cause, or was censored at the time of last contact. Univariate Cox regression analysis was conducted to estimate the prognostic utility of gene expression markers for the OS of MBC patients. Multivariate Cox proportional hazards models were used to evaluate the relationship between gene expression status and event-time distributions, with tumor size, grade, number of involved lymph nodes, ER, PR, HER2 and age. All p-values are two-sided. A level of p < 0.05 is considered statistically significant.
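For readers unfamiliar with the Kaplan-Meier method, the product-limit estimate can be sketched in a few lines. The analysis in this study was run in SPSS; the code below is only an illustrative reimplementation of the estimator on made-up follow-up times, and it does not include the log-rank test or Cox regression.

```python
from collections import Counter

def kaplan_meier(times, events):
    """Product-limit estimate of survival.

    times  -- follow-up time of each patient (e.g. months from sample collection)
    events -- 1 if the patient died at that time, 0 if censored at last contact
    Returns a list of (time, survival probability) at each distinct death time.
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)          # deaths and censorings at each time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]         # remove everyone who left the risk set at t
    return curve

def median_survival(curve):
    """First time at which the estimated survival drops to 0.5 or below."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median not reached

# Hypothetical follow-up data (not from this study).
times = [6, 12, 12, 20, 30]
events = [1, 1, 0, 1, 0]
curve = kaplan_meier(times, events)
print(curve, median_survival(curve))
```

Censored patients contribute to the number at risk up to their last contact but never trigger a drop in the curve, which is why OS needs both a time and an event indicator for every patient.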

Combined Gene Expression Analysis in EpCAM (+) CTCs
We quantified CK-19, ESR1, PR, HER2, and EGFR transcripts and evaluated the stem cell profiles CD44high/CD24low and ALDH1high/CD24low and the mesenchymal marker TWIST1 in EpCAM (+) CTCs in all patient samples (Figure 1). Prior to proceeding to combined gene expression analysis, the quality of all cDNAs was checked through RT-qPCR for HPRT (reference gene). We defined an EpCAM (+) CTC fraction as positive or negative for the expression of a specific gene based on the RT-qPCR results and on the detection of this gene transcript in the HD control group (analyzed in exactly the same way). CK-19 and EGFR transcripts were quantified using the absolute quantification approach, since no transcripts of these genes were detected in the EpCAM (+) fraction of HD. Accordingly, all samples showing amplification curves in RT-qPCR for CK-19 and EGFR were considered positive. ESR1, PR, HER2, CD24, CD44, ALDH1, and TWIST1 transcripts were quantified using the relative quantification approach, since these transcripts were also detected in the EpCAM (+) fraction of HD samples. More specifically:
Our results are based on bulk CTC analysis; by using immunomagnetic beads targeting the epithelial antigen EpCAM, we not only enrich our samples for CTCs but also co-isolate a low fraction of non-specifically bound PBMCs. The presence of these non-specific cells is verified by the expression of HPRT in all our EpCAM (+) fractions (Figure 2). HPRT expression is used as an internal control for sample quality to avoid false negative results, but also as a reference gene for relative quantification, since it is expressed in all cells, both EpCAM (+) cells and PBMCs. Following this procedure and the identical analysis of HD peripheral blood samples, we define a sample as CTC-positive based on the expression of these specific genes that differentiate CTCs from PBMCs.
When we evaluated a sample as CTC-positive based on the expression of one of the nine genes studied individually in the EpCAM (+) fraction, a positivity rate ranging between 2.2% and 21.7% was obtained; e.g., the CTC positivity rate based on CK-19 was 21.7% (10/46). However, a significant increase in the positivity rate was observed when we evaluated a sample as CTC-positive based on the expression of at least one of these genes in the EpCAM (+) fraction; in this case the cumulative positivity rate was 52.2% (24/46) (Fisher's exact test, p = 0.010) (Table S1). Of the MBC patients' EpCAM (+) samples, 7/46 (15.2%) were positive for CD44high/CD24low (stem cell profile) and 3/46 (6.5%) were positive for ALDH1high/CD24low (stem cell profile). A representative heat map of gene expression in EpCAM (+) fractions of all samples and HD is shown in Figure 2.
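The difference between single-marker and cumulative positivity can be illustrated with a short sketch. The per-sample marker calls below are invented for illustration and use only three markers; the function names are hypothetical.

```python
def is_ctc_positive(marker_calls):
    """A sample counts as CTC-positive if at least one marker is called positive."""
    return any(marker_calls.values())

def positivity_rate(samples, marker=None):
    """Fraction of positive samples, for one marker or for the combined panel."""
    if marker is None:
        hits = sum(is_ctc_positive(s) for s in samples)
    else:
        hits = sum(bool(s[marker]) for s in samples)
    return hits / len(samples)

# Hypothetical marker calls for four samples (not data from this study).
samples = [
    {"CK-19": True,  "HER2": False, "TWIST1": False},
    {"CK-19": False, "HER2": True,  "TWIST1": False},
    {"CK-19": False, "HER2": False, "TWIST1": False},
    {"CK-19": True,  "HER2": True,  "TWIST1": False},
]

single_rate = positivity_rate(samples, "CK-19")   # 2 of 4 samples
combined_rate = positivity_rate(samples)          # 3 of 4 samples
print(single_rate, combined_rate)
```

Because samples positive for different markers only partly overlap, the combined rate is always at least as high as the best single-marker rate, which is the effect reported above (21.7% for CK-19 alone versus 52.2% for the panel).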

Comparison between HER2 and ER/PR Status of EpCAM (+) CTCs and the Primary Tumor
For 43 out of these 46 patients the ER, PR, and HER2 status of the primary tumor at the time of initial diagnosis was known. Of these primary tumors, 34/43 (79.1%) were ER+, 30/43 (69.8%) PR+ and 9/43 (20.9%) HER2+. The concordance rates between ESR1, PR, and HER2 expression in EpCAM (+) CTC fractions and the primary tumor were 27.9%, 34.9% and 65.1%, respectively (Table S2). The high discrepancy observed could be explained by the fact that HR status and HER2 positivity were compared between the primary tumor at the time of diagnosis and CTCs at the time of metastasis verification. There were only four cases positive for ESR1 both in the primary tumor and EpCAM (+) CTCs and three cases positive for PR both in the primary tumor and EpCAM (+) CTCs, while all samples that were positive for HER2 overexpression in the EpCAM (+) CTC fractions were negative for HER2 amplification in the primary tumor.
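A concordance rate of this kind is the fraction of patients whose CTC call agrees with the primary tumor call, counting both positive/positive and negative/negative pairs. The sketch below uses invented paired calls; the concordant counts it derives from the reported percentages are an inference for illustration, not figures stated in the text.

```python
def concordance_rate(primary, ctc):
    """Fraction of paired samples with the same status in primary tumor and CTCs."""
    if len(primary) != len(ctc):
        raise ValueError("paired lists must have the same length")
    agreeing = sum(p == c for p, c in zip(primary, ctc))
    return agreeing / len(primary)

# Hypothetical paired ER / ESR1 calls for five patients (not data from this study).
primary_er = [True, True, True, False, True]
ctc_esr1 = [True, False, False, False, False]
rate = concordance_rate(primary_er, ctc_esr1)  # 2 agreeing pairs out of 5

# Concordant case counts implied by the reported percentages for 43 patients
# (inferred by rounding; the counts themselves are given in Table S2, not here).
reported_percent = {"ESR1": 27.9, "PR": 34.9, "HER2": 65.1}
implied_counts = {gene: round(pct / 100 * 43) for gene, pct in reported_percent.items()}
print(rate, implied_counts)
```

A low concordance for ESR1 and PR is expected when most primary tumors are receptor-positive and most CTC fractions are negative, since only the few double-positive and double-negative pairs count as agreement.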

Univariate analysis confirmed the prognostic value of combined gene expression of CK-19 and/or CD44high/CD24low and/or ALDH1high/CD24low and/or HER2 in the EpCAM (+) CTC fraction for OS (p = 0.001), together with tumor grade (p = 0.030) and ER (p = 0.004) (Table 2). Multivariate analysis, based on the expression of at least one of these four profiles, also confirmed the prognostic value of gene expression in EpCAM (+) CTCs (p = 0.002), independently of patients' age, tumor T stage, grade, nodal status, and the receptor status (ER, PR, HER2) of the primary tumor (Table 2).
The majority of MBC patients had bone metastasis (35/46, 76.1%), followed by lung, liver and brain metastasis (10/46, 21.7%) (Table 1). The association of gene expression with bone metastases and other sites of metastasis is shown in Supplementary Table S3. [...] (Figure S1B), ALDH1high/CD24low (6.0 mo vs. 45.9 mo; p = 0.002) (Figure S1C) and HER2 (20.7 mo vs. 50.9 mo; p = 0.030) (Figure S1D), compared to patients who were negative for these gene transcripts in EpCAM (+) CTCs.

Discussion
In the present study we report on the prognostic significance of combined gene expression analysis in EpCAM (+) CTCs in MBC patients using a panel consisting of nine genes. It is now evident that molecular characterization of CTC at the RNA level can provide important information on the administration [31] or change of a particular treatment in prostate cancer [32]. Our gene expression analysis was based on a panel of stem cell markers (CD24, CD44, ALDH1), the mesenchymal marker TWIST1, a panel of receptors (ESR1, PR, HER2, EGFR) and the epithelial marker CK-19 and was performed according to the established guidelines [33]. Our results clearly indicate that when using a combined RNA analysis based on the expression of these nine genes the positivity rate for CTC presence in the EpCAM (+) fraction was significantly increased in comparison to that derived when CTC detection was based on one or a few genes.
Kaplan-Meier analysis showed that the group of patients who were positive for at least one marker had a worse prognosis. Similar to our results, Bredemeier et al. have also shown that the expression of a gene combination in EpCAM (+) CTCs has a negative prognostic effect in MBC patients 8-12 weeks after chemo-, hormone or antibody therapy [34], and Reijm et al. identified an 8-gene CTC predictor which discriminates good and poor outcome to first-line aromatase inhibitors in MBC patients [35].
Our results indicate a high heterogeneity in gene expression in CTCs, in accordance with many previous studies highlighting the high heterogeneity of these cells using different isolation and detection methods [1,2,14,16,18,22,28,36-38]. Our approach was based on testing multiple RNA markers in CTCs, so that we could increase the sensitivity of CTC detection. To achieve this, we developed and used multiplex RT-qPCR assays; multiplex molecular assays have the advantage of requiring a limited amount of sample for many different analytes, while the cost and time of analysis are also reduced [35]. Multiplex RT-qPCR assays are well suited to CTC analysis, since the amount of available sample for analysis is usually very low and CTCs are highly heterogeneous [16,34,39].
Univariate and multivariate analysis confirmed the prognostic value for OS of the expression of at least one of the following four gene expression profiles in the EpCAM (+) CTC fraction: CK-19, CD44high/CD24low, ALDH1high/CD24low and overexpression of HER2. Recently, using an RT-PCR assay analyzing a 46-gene panel, it was reported that only 14 genes were significantly differentially expressed between CTC-positive and CTC-negative patients, and only four of these genes (CK-19, EPCAM, CDH1 and SCGB2A2) were significantly differentially expressed between responders and non-responders [40]. Gene expression in CTCs could further support the discovery of therapeutic predictors and is very promising for real-time identification of emerging resistance mechanisms in MBC patients [39]. Since CTCs represent a rare population, a combination of gene expression markers enables CTC detection and molecular characterization even in cases where only one CTC is detected [41].
Survival and univariate analysis revealed that patients whose EpCAM (+) CTCs were CK-19 (+) or HER2 (+) had significantly shorter OS. The presence of CK-19 (+) CTCs was of prognostic significance in early breast cancer patients [14]. The presence of CK-19 (+) CTCs after the completion of chemotherapy is associated with increased risk of late relapse [36] and poor survival [37] in MBC. In early breast cancer, the detection of CK-19 (+) CTCs and HER2 (+) CTCs is associated with shorter disease-free survival [14]. CTCs in women with HER2-negative breast cancer could acquire a HER2+ subpopulation that is more proliferative but not addicted to HER2, consistent with activation of multiple signaling pathways [38]. Georgoulias et al. have shown that "secondary adjuvant" trastuzumab therapy based on the HER2 phenotype of CTCs could eliminate CK-19 mRNA/HER2-positive CTCs, and that these patients had a prolonged disease-free survival [25]. We report that the receptors ESR1, PR and HER2 are detected in EpCAM (+) CTCs at different positivity rates. More specifically, the HER2 positivity rate was higher (17.4%) than that of ESR1 (13.0%) and PR (8.7%). Our results are in concordance with those reported by the DETECT Study Group, who have shown that HER2 is more highly expressed in CTCs than ESR1 and PR, even though a different CTC isolation method and a different volume of PB were used [23]. All six patients who were positive for HER2 expression in EpCAM (+) CTCs were negative for HER2 amplification in the primary tumor. Previous studies have also shown that in advanced breast cancer a subset of patients with HER2-negative primary tumors develop HER2-positive CTCs during disease progression [42,43], even after several months of either endocrine treatment or chemotherapy [44], which is an important part of endocrine resistance [27].
Several studies have shown that the majority of CTCs are ESR1/PR-negative [23] regardless of ER and PR expression in the primary tumors. Our results indicate a high discrepancy (>60%) in the ESR1/PR status between the primary tumor and EpCAM (+) CTCs, and that ER/PR are expressed at a very low percentage in CTCs relative to paired primary tumors. In MBC patients with ER-positive primary tumors, it has been proposed that lack of ESR1 expression in CTCs could be a possible mechanism of resistance to endocrine therapy [45], and that it is also associated with increased migration and invasion [46]. In MBC patients with ER-positive disease, the presence of ESR1 mutations in CTCs is one of the diverse mechanisms of acquired endocrine drug resistance and could explain failure to suppress ER signaling within CTCs after 3 weeks of endocrine therapy [47]. Another study has shown that in MBC, ESR1 mutations were absent in primary tumor tissue samples and were detected only in metastases obtained after CTC characterization [48]. Our group has previously shown that epigenetic silencing of ESR1 through methylation was associated with lack of response to endocrine treatment [18].
CTCs positive for stem cell markers are chemo-resistant, and their presence independently predicts an unfavorable outcome in MBC [49]. In the present study, Kaplan-Meier and univariate analysis revealed that patients with a positive CD44high/CD24low or ALDH1high/CD24low profile in CTCs had significantly shorter OS. Our findings are in agreement with previous studies showing that the presence of HER2-positive CTCs co-expressing a breast cancer stem cell profile (HER2+/CD44+/CD24low) and elevated ALDH1 activity is related to aggressiveness and radioresistance [50].

Conclusions
Our results indicate that combined gene expression analysis in EpCAM (+) CTCs provides prognostic information in MBC. These results need to be further confirmed in a prospective study, including a larger and well-defined patient cohort.

Informed Consent Statement:
Written informed consent has been obtained from the patient(s) to publish this paper.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author. The data are not publicly available due to ethical restrictions.

Acknowledgments:
We would like to thank all patients and healthy donors who agreed to take part in this study by providing samples.

Conflicts of Interest:
The authors declare no conflict of interest.