Immune infiltration, aggressive pathology, and poor survival outcomes in RECQL helicase deficient breast cancers

RECQL is essential for genomic stability. Here, we evaluated RECQL in 449 pure ductal carcinomas in situ (DCIS), 152 DCIS components of mixed DCIS/invasive breast cancer (IBC) tumors, 157 IBC components of mixed DCIS/IBC tumors and 50 normal epithelial terminal ductal lobular units (TDLUs). In 726 IBCs, infiltration by CD8+, FOXP3+, IL17+, PDL1+ and PD1+ tumor-infiltrating lymphocytes (TILs) was investigated in RECQL deficient and proficient cancers. Tumor mutation burden (TMB) was evaluated by genome sequencing in five RECQL germline mutation carriers with IBC. Compared with normal epithelial cells, a striking reduction in nuclear RECQL was evident in DCIS and was associated with aggressive pathology and poor survival. In RECQL deficient IBCs, CD8+, FOXP3+, IL17+ or PDL1+ TILs were linked with aggressive pathology and shorter survival. In germline RECQL mutation carriers, increased TMB was observed in 4/5 tumors. We conclude that RECQL loss is an early event in breast cancer and promotes immune cell infiltration.


Introduction
DNA helicases are molecular motors that unwind DNA and are essential for the maintenance of genomic stability [1]. RECQL (also known as RECQ1 or RECQL1) belongs to the RecQ family of DNA helicases [2]. RECQL 3′-5′ helicase activity is required to unwind DNA, an essential step during DNA replication and DNA repair. RECQL is implicated in homologous recombination. It interacts with PARP1, RPA, RAD51, Top3α, EXO1, MSH2/6, MLH1-PMS2 and Ku70/80 during DNA repair [3,4]. Preclinically, RECQL depletion leads to increased spontaneous sister chromatid exchanges, chromosomal instability, and DNA damage accumulation in cells [3,4]. Emerging data indicate a role for RECQL in breast cancer pathogenesis. We have previously shown that germline mutations in RECQL are extremely rare and may increase the risk of developing breast cancer [5]. In sporadic invasive breast cancers (IBC), we demonstrated that RECQL deficiency at the transcriptomic and proteomic levels is associated with aggressive breast cancer phenotypes and poor patient survival [6]. More recently, we validated these observations in an independent clinical cohort of ER-positive breast cancer where RECQL deficiency was associated with poor survival [7]. Preclinically, in ER-positive breast cancer cells, we observed that RECQL interacts with the FOXA1 transcription factor and regulates expression of the ESR1 gene, which encodes the ERα protein [8]. These studies suggest a role for RECQL in breast cancer pathogenesis and prognosis. We thus hypothesized that RECQL deficiency may be an early event during breast cancer pathogenesis. Moreover, RECQL deficient, genomically unstable tumors may have increased neoantigen expression that could promote tumoral T-cell infiltration and aggressive pathology.

Results
RECQL protein expression was assessed, using tissue microarrays (TMA), in 50 normal terminal ductal lobular units (TDLUs), 449 pure DCIS, 152 DCIS components of DCIS/IBC, and 157 IBC components of DCIS/IBC. Patient demographics are summarized in Supplementary Table 1. IHC revealed strong nuclear expression of RECQL in the normal luminal epithelial cells of the adjacent TDLUs and lower nuclear expression in the cancerous epithelial cells with occasional inflammatory cells. Cytoplasmic expression was not detectable (Fig. 1A-F). Median nuclear histochemical (H)-scores were 230, 160, 90, and 70 in adjacent normal TDLUs, primary DCIS, the DCIS component of DCIS/IBC, and the IBC component of DCIS/IBC tumors, respectively. Assessment of RECQL expression revealed higher RECQL levels in the normal epithelial cells of the adjacent TDLUs than in the cancerous epithelial cells of the pure DCIS (p = 3.0 × 10⁻⁶) (Fig. 1G). In the DCIS component of DCIS/IBC tumors, RECQL expression was lower compared to pure DCIS (mean H-scores 90 versus 160) (p = 4.2 × 10⁻¹⁷) (Fig. 1G). In the IBC component of DCIS/IBC tumors, RECQL expression was lower compared to the DCIS component of DCIS/IBC tumors (mean H-scores 70 versus 90) (p = 1.0 × 10⁻⁵) (Fig. 1G). Overall, these observations reveal a clear reduction in RECQL level from normal epithelial cells to invasive cancer cells (mean H-scores 230 versus 70) (p = 9.4 × 10⁻⁴⁴) (Fig. 1G).

RECQL and ductal carcinoma in situ (DCIS)
In pure DCIS, low RECQL expression was observed in 89/449 (20 %) cases and high expression in 360/449 (80 %) cases. Low RECQL expression showed an association with high DCIS grade (p = 0.028) but not with other parameters (Supplementary Table 2). In the DCIS component of DCIS/IBC tumors, low nuclear RECQL expression was associated with large DCIS size (p = 0.010) and high DCIS grade (p = 0.045) only (Supplementary Table 3). Univariate analysis in the pure DCIS showed that low RECQL expression was associated with shorter local recurrence free interval (LRFI) (both in situ and invasive recurrence) (p = 0.009) (Fig. 1H). In pure DCIS, low RECQL expression was also associated with shorter LRFI in patients treated by breast conserving surgery (BCS) followed by adjuvant RT (p = 0.003) (Fig. 1I) but not in patients treated with BCS only (p = 0.058) (Fig. 1J). Multivariate Cox regression analysis for recurrence free interval in the pure DCIS series demonstrated that low RECQL expression was an independent poor prognostic factor of all recurrences in patients treated with BCS (p = 0.028; HR = 0.538; 95 % CI = 0.309-0.936). Other independent risk factors included age of the patients at the time of diagnosis (p = 0.004; HR = 0.538; 95 % CI = 0.309-0.936) and DCIS tumor size (p = 0.001; HR = 0.398; 95 % CI = 0.228-0.695) (Supplementary Table 4). Furthermore, the multivariate analysis in the pure DCIS series with IBC recurrence also demonstrated that low RECQL expression was an  4).
Taken together, the data provide the first clinical evidence that RECQL deficiency in DCIS is associated with an aggressive phenotype and adverse prognosis.
We have previously shown that low levels of RECQL protein are associated with aggressive IBC, including larger tumor size, lymph node positivity, high tumor grade, high mitotic index, pleomorphism, dedifferentiation, ER negativity and poor survival [6,7]. We hypothesized that genomic instability induced by RECQL deficiency [3,4] will not only lead to increased mutagenicity/carcinogenicity but can also increase the neoantigen load on the tumor cell surface, resulting in increased immunogenicity and T-cell infiltration [9]. To address this possibility, we first correlated RECQL expression with the expression of a panel of DNA repair markers in the IBC cohort (Fig. 2A). T-cell infiltration [CD8+, FOXP3+, and PD1+ cells] and tumor PD-L1 expression (Fig. 2B-I) were then investigated in RECQL deficient and RECQL proficient IBC. CD8+, FOXP3+ and PD1+ T-cells were evaluated within tumor cell nests, adjacent stroma or distant stroma. Patient demographics of the IBC cohort (n=726) are shown in Supplementary Table 5. The immunohistochemical staining protocol is shown in Supplementary Table 6 and was described previously [10]. As shown in Fig. 2A and Supplementary Table 7, we observed a positive correlation between RECQL and RECQL4, RECQL5, BLM, RPA1, Ku70, MRE11, RAD50, BRCA1, XRCC1, Polymerase beta, pCHK1, CHK2, DNA-PKcs, ERCC1 and PARP1 (all p values <0.0001).
CD8+ T-cell infiltration in RECQL deficient IBC: Breast cancers with enhanced immunogenicity will be susceptible to CD8+ T-cell infiltration. The number of CD8+ T-cells was counted in each tumor core. CD8+ T-cells were counted in three locations in each tumor: the intratumoral compartment (within the tumor cell nests), the adjacent stroma (defined as CD8+ cells within one tumor cell diameter of the tumor) and the distant stroma (defined as > one tumor cell diameter away from the tumor). The total number of CD8+ T-cells was determined by combining the counts for these three compartments. Tumors with any number of CD8+ cells were considered positive for CD8+ T-cell infiltration.
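The counting rule described above (three compartments, a combined total, and positivity defined as any CD8+ cell) can be sketched as follows. This is a minimal illustration only: the class name, field names and example counts are hypothetical and are not taken from the study data.

```python
from dataclasses import dataclass


@dataclass
class CD8Counts:
    """CD8+ T-cell counts for one tumor core (names are illustrative assumptions)."""
    intratumoral: int     # within the tumor cell nests
    adjacent_stroma: int  # within one tumor cell diameter of the tumor
    distant_stroma: int   # more than one tumor cell diameter away from the tumor

    def total(self) -> int:
        # The total count combines the three compartments.
        return self.intratumoral + self.adjacent_stroma + self.distant_stroma

    def is_positive(self) -> bool:
        # Any number of CD8+ cells is treated as positive infiltration.
        return self.total() > 0


# Hypothetical core: no intratumoral cells, some stromal cells.
core = CD8Counts(intratumoral=0, adjacent_stroma=3, distant_stroma=5)
print(core.total(), core.is_positive())
```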
RECQL deficient tumors with CD8+ T-cell infiltration within tumor cell nests (Table 1) or adjacent stroma (Supplementary Table 8) were highly significantly associated with larger tumors, high grade, dedifferentiation, pleomorphism, higher mitotic index, high Ki67 expression, high risk Nottingham prognostic index (NPI), ER-, PR- and triple negative breast cancers (all p values ≤0.001) compared to RECQL proficient CD8- tumors. RECQL deficient tumors with CD8+ T-cell infiltration in distant stroma were significantly associated with pleomorphism and high-risk NPI (all p values ≤0.005) (Supplementary Table 9) compared to RECQL proficient CD8- tumors. When CD8+ T-cell infiltration was taken together (within tumor cell nests, adjacent and distant stroma), RECQL deficient tumors with total CD8+ T-cells were significantly associated with high grade, dedifferentiation, pleomorphism, higher mitotic index, high Ki67 expression, high-risk NPI and triple negative breast cancers (all p values ≤0.05) (Supplementary Table 9) compared to RECQL proficient CD8- tumors.
FOXP3+ T-cell infiltration in RECQL deficient BC: T regulatory cells (Tregs) can inhibit antitumor responses and influence the activity of CD8+ TILs [11]. FOXP3, a member of the forkhead family of transcription factors, is restricted to a specific population of Tregs [11]. We evaluated the associations between FOXP3+ T-cell infiltration and RECQL deficient IBC. RECQL deficient tumors with FOXP3+ T-cell infiltration within tumor cell nests (Table 2), within adjacent stroma (Supplementary Table 10), within distant stroma (Supplementary Table 10) and total FOXP3+ T-cells were all highly significantly associated with high grade, dedifferentiation, pleomorphism, higher mitotic index, high Ki67 expression, high-risk Nottingham prognostic index (NPI), ER-, PR- and triple negative breast cancers (all p values ≤0.0001).
We also correlated expression of RECQL with immune cell infiltration. As shown in Supplementary Table 18, there was a significant inverse correlation between RECQL expression and PD1+ (p=0.004) or IL-17+ immune cell infiltration (p=0.01).
Multivariate analysis for survival: In multivariate analysis, we observed that RECQL, CD8 and FOXP3 were independently associated with breast cancer-specific survival (BCSS) (Supplementary Table 19). Larger tumor size and positive lymph node status were other parameters independently associated with survival in this analysis (Supplementary Table 19).
Gene expression profiling in RECQL knock-out (KO) MDA-MB-231 cells: The immunohistochemical analysis presented above shows that RECQL low tumors with T-cell infiltration (CD8+, FOXP3+, IL17+, PDL1+, or PD1+) are associated with triple negative breast cancers. Preclinically, in a triple negative breast cancer cell line (MDA-MB-231), we generated isogenic RECQL-wildtype (WT) and RECQL-knock-out (KO) clones using a CRISPR/Cas9 system (Fig. 3A and B). Total RNA was extracted from MDA-MB-231 RECQL-WT and RECQL-KO clones and subjected to RNA-seq analysis. Over-representation KEGG pathway analysis for genes expressed higher or lower in RECQL-KO cells compared to WT cells is shown in Fig. 3C. Interestingly, we observed enrichment of TNFα (hsa04668) and IL17 (hsa04657) pathway genes (Fig. 3D and E). The data suggest that higher levels of TNFα and the transcription factors AP1 and CREB may lead to an increase in cytokine and chemokine signaling in RECQL-KO MDA-MB-231 cells.
Taken together, the clinical and preclinical data provide strong evidence that RECQL deficient tumors have frequent T-cell infiltration, which is associated with aggressive pathology and poor survival.

Genomic and transcriptomic analysis of RECQL in BC-TCGA
Preclinically, utilizing an unbiased integrative genomics approach, we have observed that expression of ESR1, the gene encoding ERα, is directly activated by RECQL. More than 35 % of RECQL binding sites were co-bound by ERα genome-wide [8]. Mechanistically, RECQL cooperates with FOXA1, the pioneer transcription factor for ERα, to enhance chromatin accessibility at the ESR1 regulatory regions in a helicase activity-dependent manner [8]. Given the potential role for RECQL in transcriptional regulation, we speculated that RECQL deficiency in breast cancer will not only promote genomic instability but will also lead to global transcriptomic alterations that could adversely influence its pathology. To address this possibility, we conducted genomic and transcriptomic analysis of tumor samples from patients with invasive breast cancers (IBC) in The Cancer Genome Atlas (TCGA) cohort [17].
First, we utilized cBioPortal to examine mutations and copy number variations of the RECQL gene in the BC-TCGA firehose legacy cohort (1101 patients). Interestingly, only 25/963 patients (2.5 %) showed alterations, the majority of which were RECQL amplifications. Only 2 missense mutations were identified (L286Q and P535S). Copy number variation and gene expression showed a significant positive correlation (Fig. 3F; n=960, Pearson 0.49, p < 0.001). We evaluated DNA methylation status within the transcription start site (TSS) CpG island using the SMART app, which utilizes UCSC Xena datasets to correlate beta values (Illumina Infinium 450K methylation chip) with gene expression values (RNA-Seq dataset). The majority of the CpG island probes within the TSS were unmethylated and showed a significant but weak negative correlation with gene expression. The most internal CpG site (cg5389560) varied greatly in methylation status (beta value 0-0.8) but still only showed a weak positive correlation with gene expression (R = 0.16, p <0.01; Supplementary Fig. 5A and B). Therefore, DNA methylation is not linked with low expression of RECQL in IBC.
Next, we investigated whether RECQL levels can influence global gene expression. Gene expression in low RECQL expressing tumors and high RECQL expressing tumors was compared in the BC-TCGA RNA-Seq dataset (Supplementary Table 20). High expression of 10061 genes and low expression of 477 genes were observed in low RECQL tumors (Supplementary Table 20). Interestingly, among the genes expressed at higher levels in low RECQL tumors, only 18 % are protein-coding genes (Fig. 3G). In contrast, 63 % of the genes expressed at lower levels in RECQL low tumors were protein-coding (Fig. 3H). This suggests that RECQL low tumors were associated with a higher expression of lncRNAs and pseudogenes, which is a feature of increased genomic instability [18,19]. Gene enrichment analysis identified significant KEGG pathways (FDR p-value <0.05; top five pathways shown in Supplementary Fig. 5C). Interestingly, integrins expressed at lower levels (log2 fold changes: ITGA4 -1.273, ITGB3 -1.244, ITGB1 -1.229) were the main genes within these pathways. The PI3K-Akt (PKB) signaling pathway showed lower expression of PI3K subunits and higher expression of HRas (log2 fold change 1.2713).

Tumor mutation burden (TMB) and homologous recombination deficiency (HRD) in breast cancers with RECQL germ-line mutations:
The bioinformatic data in RECQL low sporadic breast cancers suggest a genomic instability phenotype. To validate these observations further, we conducted genomic analysis in five breast cancer patients with RECQL germline deficiency. In the five carriers of RECQL germline mutations predicted to abolish its helicase activity [5], we performed whole genome sequencing (WGS) on matched tumor and germline DNA. Whole exome sequencing (WES) was completed in tumor samples alone. We used matched tumor-normal WGS and WES data for assessing microsatellite instability (MSI) and tumor mutational burden (TMB). Tumor-only WGS data were utilized for homologous recombination deficiency (HRD) analysis. The data are summarized in Supplementary Table 21. All WGS data were suitable for MSI and HRD analysis. Only one tumor sample (patient 3) had a high HRD score of 59, likely due to an oncogenic CDK12 mutation (NP_057591.2:p.Arg1067Ter) found in this tumor, which is associated with HRD [20]. All tumor samples had low microsatellite (MS) instability (MSI-L) with an unstable MS proportion of around or over 20 %. One tumor sample (patient 5) had a high TMB (>15) and three had an intermediate TMB score of around 10. The tumor with the high HRD score (patient 3) had a low TMB (5.4). Taken together, the data provide evidence that germline RECQL deficiency may contribute to genomic instability with an increased TMB phenotype.

Discussion
RECQL helicase is essential for the maintenance of replication fork progression, recombination, and DNA repair [2]. RECQL loss is therefore expected to increase genomic instability and promote a mutator phenotype [21], leading to increased risk of cancer. In the current study we not only provide the first clinical evidence that RECQL loss may be an early event during breast cancer pathogenesis, but also show that in established invasive breast cancers, RECQL deficiency is associated with immune cell infiltration, aggressive pathology, and poor prognosis.
The incidence of pre-invasive breast DCIS continues to increase [22]. Although surgery (mastectomy or wide local excision), with or without adjuvant radiotherapy, is the main treatment modality, personalization of DCIS therapy is an area of unmet need. Whilst a subset of low-grade DCIS may never progress to invasive cancer, a proportion of high-grade DCIS, despite surgery and adjuvant radiotherapy, may still recur [22]. Therefore, development of biomarkers of aggressive phenotype is highly desirable. Emerging data suggest that aggressive DCIS may result from the accumulation of somatic mutations [23]. We speculated that RECQL loss may influence the development of high-grade DCIS. In the current study, we provide the first clinical evidence that RECQL loss is a feature of some DCIS and is associated with aggressive phenotype and adverse survival outcomes. We have previously shown that loss of key base excision repair (BER) proteins such as XRCC1 [24] or polymerase β [25] in DCIS is also linked with aggressive clinicopathological features and survival.
In established IBC, the complex tumor microenvironment may include infiltrating immune cells. We and others have shown that CD8+ T lymphocytic infiltration is associated with high tumor grade, hormone receptor negative, and basal-like phenotype tumors [26,27]. Moreover, high total CD8+ counts are associated with better survival outcomes [26,27]. Although mechanisms of immune cell infiltration are multifactorial, impaired tumor cell DNA repair with associated genomic instability will increase mutagenicity, increase neoantigen load on the tumor cell surface and enhance immunogenicity. Previously, we have shown that low RECQL breast tumors were significantly associated with low PARP1, BRCA1 negative, low RAD51, low ATM, low nuclear pChk1, low nuclear Chk2, low XRCC1, low FEN1, low SMUG1, and low DNA-PKcs expression [6]. Moreover, low RECQL tumors were also significantly associated with low levels of other RecQ helicases, including RECQL4, BLM, and WRN [6]. Together, the data support the view that low RECQL tumors have features of genomic instability associated with low expression of multiple DNA repair proteins [6]. In the current study, we have shown for the first time that RECQL low tumors with increased CD8+, FOXP3+, PDL1+ or PD1+ TILs are not only associated with aggressive phenotype but also with adverse survival outcomes. In a recent study, we have also shown that breast tumors that expressed low XRCC1 are also associated with high CD8+ TIL counts, aggressive phenotype and poor survival. Importantly, PD1+ or PDL1+ breast cancers with low XRCC1 were linked to aggressive cancers and reduced survival, including in ER- breast cancers, in that study [10]. BRCA1 and BRCA2 proteins perform critical functions during homologous recombination. Mutations within the BRCA genes lead to impaired DNA repair and an increased risk of early-onset breast cancer. BRCA-mutated breast tumors are also characterized by the presence of TILs such as CD4+, CD8+, and FOXP3+ T lymphocytes [28]. However, BRCA-mutated breast cancers with increased TILs are associated with better survival outcomes [28]. Similarly, loss of MMR genes such as MLH-1, PMS-2, MSH-2 and MSH-6 leads to MSI and increases the risk of colorectal cancers (CRC). CRCs with MSI have increased TMB, increased TILs and better survival outcomes [29]. In contrast to BRCA and MMR studies, our data show that RECQL low tumors with immune cell infiltration have poor survival outcomes.
Although the reason for this observation is unknown, we speculate that different DNA repair deficient backgrounds could influence which T-cell subsets infiltrate the tumor, or that tumor cell biology in different DNA repair deficiency states could itself influence outcomes. An intriguing preclinical observation in the current study was that RNA-seq analysis in RECQL-KO MDA-MB-231 cells revealed enrichment of chemokine and cytokine gene expression. Whilst an altered tumor cytokine/chemokine profile could influence the type of TIL subsets, detailed mechanistic investigations will be required to address this possibility.
In the IBC-TCGA cohort, low RECQL tumors had a higher expression of lncRNAs and pseudogenes, which is a feature of increased genomic instability [18,19]. For additional validation, we exome and genome sequenced breast cancers from five patients who were RECQL germline mutation carriers. We observed intermediate to high TMB in 4/5 tumors and an HRD phenotype in 1/5 tumors. High TMB was also reported in a colon tumor sample from a patient with a germline pathogenic mutation [30]. Taken together, the data support the view that RECQL deficiency in breast cancer leads to genomic instability and immune infiltration. We conclude that RECQL based stratification in breast cancer is feasible for immune-oncology approaches.

Fig. 3. RECQL depletion and gene expression analysis in MDA-MB-231 cells. A. Isogenic MDA-MB-231 RECQL knockout (RECQL-KO) and its wild-type control (RECQL-WT) cells were generated using the CRISPR/Cas9 system and validated by immunoblot detection of RECQL in total cell lysates. RECQL protein levels were measured in two sample volumes. GAPDH is used as a loading control. B. RECQL mRNA levels measured by RT-qPCR and normalized to GAPDH. SDHA served as a second housekeeping gene. Relative expression of RECQL and SDHA is shown here. C. Volcano plot obtained from differential gene expression (DGE) analysis (fold change ≥±1 combined with adjusted p-value <0.05) comparing genes associated with low versus high RECQL mRNA expression. D. Over-representation KEGG pathway analysis for genes expressed higher or lower in RECQL KO cells compared to WT cells. Pathways shown with positive enrichment ratio are expressed higher in RECQL KO and pathways with negative enrichment ratio are expressed lower in RECQL KO. All pathways have an FDR corrected p-value <0.05, from genes that were significantly differentially expressed (log2FC ≥1 and FDR p value <0.05). E. Representation of the genes that were enriched in the TNF (hsa04668) and IL17 (hsa04657) pathways, highlighting the higher levels of TNFα and transcription factors AP1 and CREB, leading to an increase in cytokine and chemokine signalling in RECQL KO cells. Genes shown in bold were expressed at a significantly higher level in RECQL KO cells compared to WT (log2FC ≥1 and FDR p value <0.05). F. Comparison of RECQL gene expression to copy number variation in TCGA-BRCA (n = 960). GISTIC analysis shows changes in RECQL mRNA levels in tumors with copy number variations. The expression data were from normalized Illumina HiSeq RNA-Seq data. The copy number variations are deep deletions (>2 copies deleted), shallow deletion (few copies altered), diploid, gains (few copies gained), amplification (>2 copies gained). Pearson correlation R = 0.49, p < 0.001. G, H. RNA gene types (Ensembl MART) are shown for non-coding RNAs (lncRNA, all pseudogenes, miRNAs and other RNA, which include snoRNA, tRNA and MT-RNA) plus protein-coding genes, as percentages in the pie charts for G) RNAs expressed higher in low RECQL tumors (n = 9277 confirmed gene types) and H) RNAs expressed lower in low RECQL tumors (n = 451 confirmed gene types).

Patients and methods
Two cohorts, invasive breast cancer (n=1600) and DCIS (n=776), diagnosed and treated at City Hospital, Nottingham, United Kingdom from 1987 to 2013 were used in this retrospective study. All the samples were arranged in tissue microarrays (TMA) as previously described.
For the DCIS cohort, demographic data and histopathological parameters were recorded, including age at diagnosis, tumor size, grade, diagnostic method (screening or symptomatic), presence of comedo necrosis, adjuvant radiotherapy (RT) and local recurrence-free survival (LRFS) in months. Tumors with a positive margin underwent re-excision after breast conserving surgery (BCS), and disease detected within the first six months after BCS was not considered a recurrence. Tumors that developed on the contralateral side were not considered a recurrence. Molecular classification based on hormonal receptor expression (Estrogen Receptor (ER) and Progesterone Receptor (PR)), HER2 status, and Ki-67 proliferation index was available. Demographic data are summarized in Supplementary Table 1. For the invasive BC cohort, clinical and tumor characteristics including patient's age at diagnosis, histologic tumor type, grade, tumor size, lymph node status, Nottingham Prognostic Index (NPI), and lymphovascular invasion (LVI) were available (Supplementary Table 5). All samples in the study series were constructed in TMAs using a 3D His-tech® Grand Master® machine and 1 mm cores.

RECQL protein expression:
RECQL protein expression in invasive BC and DCIS was assessed by immunohistochemistry (IHC) using the Novocastra Novolink™ Polymer Detection Systems kit (Code: RE7280-K, Leica Biosystems, Newcastle, UK). TMA sections (4 μm thick) were dewaxed and endogenous peroxidase activity was blocked with 0.3 % hydrogen peroxide in methanol for 10 min. Antigen retrieval was performed in citrate buffer pH 6.0 using a microwave (Whirlpool JT359 Jet Chef 1000 W) for 20 min. RECQL antibody (Bethyl Laboratories, catalog no. A300-450A) was used at a dilution of 1:1,000 for 60 minutes. The sections were counterstained with hematoxylin. Negative controls (omission of the primary antibody and IgG-matched serum) and positive controls were included in each run. The negative control ensured that all the staining was produced from the specific interaction between antibody and antigen.
RECQL expression was assessed using the percentage and intensity of the expression, and the H-score (semi-quantitative histochemical scoring) was calculated as previously described [6].

CD8, FOXP3, IL17, PDL1 and PD1 immunohistochemistry (IHC) in IBC:
TMAs were immunohistochemically profiled for CD8, FOXP3, PDL1, PD1 and other biological antibodies [26]. Supplementary Table 6 summarizes protocols, antibody dilution and scoring methodology, as previously described [6,10,24]. Immunohistochemical staining was performed using the Thermo Scientific Shandon Sequenza chamber system (REF: 72110017), in combination with the Novolink Max Polymer Detection System (RE7280-K: 1250 tests) as described in previous publications [26,31], and the Leica Bond Primary Antibody Diluent (AR9352), each used according to the manufacturer's instructions (Leica Microsystems). The number of CD8+ T lymphocytes was counted in three locations in each tumor: the intratumoral compartment (within the tumor cell nests), the distant stroma (defined as > one tumor cell diameter away from the tumor), and the adjacent stroma (defined as CD8+ cells within one tumor cell diameter of the tumor). The total number of CD8+ T cells was determined by combining the counts for these three compartments. FOXP3, IL17, PD1 and PDL1 positive T lymphocytes were similarly assessed. Not all cores within the TMA were suitable for IHC assessments as some cores were missing or contained inadequate invasive cancer (<15 % tumor).
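For readers unfamiliar with the H-score mentioned at the start of this subsection, the sketch below shows the conventional semi-quantitative formula, in which the percentage of cells at each staining intensity (weak = 1, moderate = 2, strong = 3) is weighted and summed to give a value between 0 and 300. The exact scoring procedure used in this study is the one described in [6]; the function name and the example percentages here are hypothetical.

```python
def h_score(pct_weak: float, pct_moderate: float, pct_strong: float) -> float:
    """Conventional H-score: 1*%weak + 2*%moderate + 3*%strong (range 0-300).

    Assumption: this is the standard textbook formula, not a transcription
    of the scoring protocol in reference [6].
    """
    total = pct_weak + pct_moderate + pct_strong
    if min(pct_weak, pct_moderate, pct_strong) < 0 or total > 100:
        raise ValueError("percentages must be non-negative and sum to at most 100")
    return 1 * pct_weak + 2 * pct_moderate + 3 * pct_strong


# Hypothetical core: 10 % weak, 30 % moderate, 50 % strong nuclear staining.
print(h_score(10, 30, 50))  # 10 + 60 + 150 = 220
```

Under the cut-off stated in the statistical analysis (H-score ≥215 for high RECQL), this hypothetical core would be classed as RECQL high.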
Statistical analysis: All statistical analysis was conducted with IBM SPSS software v26 (Chicago, IL, USA). A two-sided p value <0.05 was considered statistically significant. To correlate RECQL protein level and clinicopathological factors in the invasive BC and primary DCIS series, the Crosstabs chi-square test was used after dichotomizing RECQL protein level into high and low based on the X-tile value (X-tile software version 3.6.1, copyright Yale University 2003-05). An H-score of ≥215 was taken as the cut-off for high RECQL level. Continuous data analysis was carried out using the Mann-Whitney U test and the Kruskal-Wallis test. RECQL was combined with CD8, FOXP3, IL17, PDL1 and PD1 to assess the impact of their co-expression on the clinicopathological parameters of breast cancer. Univariate and multivariate statistical analysis and Kaplan-Meier curves with patients' outcomes were based on LRFI in the pure DCIS series and on BCSS in the invasive BC cohort.
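The dichotomization and chi-square step described above can be illustrated with a short, self-contained sketch. It is not the SPSS procedure itself: it computes an uncorrected Pearson chi-square for a 2x2 table in plain Python, and the counts in the example are invented purely for illustration. Only the cut-off of 215 comes from the text.

```python
import math
from typing import Tuple

CUTOFF = 215  # H-score cut-off for high RECQL stated in the text


def recql_status(h_score: float) -> str:
    """Dichotomize an H-score into 'high' or 'low' RECQL."""
    return "high" if h_score >= CUTOFF else "low"


def chi_square_2x2(a: int, b: int, c: int, d: int) -> Tuple[float, float]:
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns the statistic and the two-sided p value for 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column must contain at least one case")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom the chi-square survival function is erfc(sqrt(x/2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Invented counts: rows = RECQL low/high, columns = high grade/low grade.
stat, p = chi_square_2x2(30, 10, 20, 40)
print(round(stat, 2), p < 0.05)
```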

Bioinformatics analysis
cBioPortal was used to query the Breast invasive carcinoma TCGA firehose legacy cohort (1101 patients) to identify mutations and copy number variations for the RECQL gene [32]. The BRCA (TCGA breast cancer) cohort RNA expression data were analyzed. The data were obtained from GDC (https://portal.gdc.cancer.gov/). The RNA-seq data (specimens n=1090) were first ranked (lowest to highest expression) for RECQL, then split into quartiles. Differentially expressed genes between Q1 and Q4 were identified using DESeq2 [33]. Differentially expressed genes were defined by a log2 fold change of 1 and above and an FDR p value <0.05. Gene set enrichment analysis was performed using WebGestalt, with significant KEGG pathways shown (FDR p value <0.05) [34]. Using the SMART app, we correlated DNA methylation beta values (from the Infinium 450K methylation array) with RNA expression data (RNA-seq) [35].
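The ranking and quartile split described above can be sketched as follows. This is an illustrative outline only: the sample identifiers and expression values are made up, and the handling of ties and of sample counts not divisible by four (integer division here) is an assumption, since the text does not specify it.

```python
from typing import Dict, List, Tuple


def q1_q4_samples(recql_expr: Dict[str, float]) -> Tuple[List[str], List[str]]:
    """Rank samples by RECQL expression and return the lowest (Q1) and highest (Q4) quartiles."""
    ranked = sorted(recql_expr, key=recql_expr.get)  # lowest to highest expression
    k = len(ranked) // 4                             # quartile size (assumption: floor)
    if k == 0:
        raise ValueError("need at least four samples")
    return ranked[:k], ranked[-k:]


# Hypothetical expression values for eight samples.
expr = {"s1": 5.2, "s2": 1.1, "s3": 9.8, "s4": 3.3,
        "s5": 7.4, "s6": 0.4, "s7": 6.0, "s8": 8.9}
low, high = q1_q4_samples(expr)
print(low, high)  # ['s6', 's2'] ['s8', 's3']
```

The two returned groups would then be the inputs to a differential expression tool such as DESeq2.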

RNA isolation and RT-qPCR
Isogenic MDA-MB-231 RECQL-knock-out (KO) and its wild-type control (RECQL-WT) cells were generated using the CRISPR/Cas9 system [36]. Total RNA was extracted using the TRIzol reagent (Invitrogen) according to the manufacturer's instructions. A total of 0.5 μg of RNA was used for reverse transcription using the iScript Reverse Transcription Supermix kit (Bio-Rad) according to the manufacturer's instructions. The cDNA was subjected to real-time quantitative PCR using iTaq Universal SYBR Green Supermix (Bio-Rad) in triplicate. Reactions were cycled at 95 °C for 30 s, followed by 40 cycles of 94 °C for 10 s and 60 °C for 15 s, with fluorescence data collection during the anneal/extension step on the CFX96 Real-Time PCR System (Bio-Rad). The relative transcript levels were normalized to the housekeeping gene GAPDH and differential expression was measured using the 2^(-ΔΔCT) method. The housekeeping gene SDHA served as a negative control in RT-qPCR experiments. The RECQL primers are Forward 5′-CAATGGCTGGAAAGGAGGTA-3′; Reverse 5′-CAGAGTTAAAAGCAGCCCTGGT-3′.
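As a worked illustration of the 2^(-ΔΔCT) calculation named above, the following sketch normalizes a target gene to a reference gene in a test sample and in a control sample, then expresses the result as a fold change. The Ct values are hypothetical and chosen only to make the arithmetic easy to follow; the function and parameter names are likewise illustrative.

```python
def relative_expression(ct_target_sample: float, ct_ref_sample: float,
                        ct_target_control: float, ct_ref_control: float) -> float:
    """Fold change of the target gene in the sample versus the control (2^-ΔΔCt)."""
    delta_sample = ct_target_sample - ct_ref_sample      # ΔCt in the test sample
    delta_control = ct_target_control - ct_ref_control   # ΔCt in the control
    delta_delta = delta_sample - delta_control           # ΔΔCt
    return 2 ** (-delta_delta)


# Hypothetical Ct values: RECQL and GAPDH in KO cells versus WT cells.
fold = relative_expression(ct_target_sample=30.0, ct_ref_sample=18.0,
                           ct_target_control=24.0, ct_ref_control=18.0)
print(fold)  # ΔΔCt = 12 - 6 = 6, so 2^-6 = 0.015625
```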

RNA-seq analysis
Total RNA from MDA-MB-231 RECQL-WT and RECQL-KO clones was extracted using the RNeasy plus micro kit (Qiagen). RNA integrity was checked with a Bioanalyzer (Agilent), and only samples with an RNA integrity number (RIN) of >9.5 were subsequently subjected to mRNA-seq. The mRNA-seq samples were pooled and sequenced on HiSeq using the Illumina TruSeq mRNA Prep Kit RS-122-2101 and paired-end sequencing. The samples had ~79-101 million pass filter reads, with more than ~90 % of bases at Q30 or above. Reads were trimmed for adapters and low-quality bases using Trimmomatic software before alignment with the reference genome (Human - hg19) and the annotated transcripts using STAR. The average mapping rate of all samples was ~95 %. Unique alignment was above 89 %. The mapping statistics were calculated using Picard software. The samples had ~0.88 % ribosomal reads. Percent coding bases were between 64-66 %. Percent UTR bases were 29-31 %, and mRNA bases were between 93-94 % for all the samples. Library complexity was measured in terms of unique fragments in the mapped reads using Picard's MarkDuplicates utility. The samples had 64-70 % non-duplicate reads.
Read count per gene was calculated by HTSeq under the annotation of Gencode and normalized by size factor as implemented in the DESeq2 package. Regularized logarithm transformation (rlog) values of gene expression were used to perform hierarchical clustering and principal component analysis. To assess differential gene expression between different conditions (e.g., constructs vs. mocks), we used a generalized linear model within DESeq2 that incorporates information from counts and uses a negative binomial distribution with fitted mean and a gene-specific dispersion parameter. DESeq2 used Wald statistics for significance testing and the Benjamini-Hochberg adjustment for multiple testing correction.
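The Benjamini-Hochberg adjustment mentioned above is applied internally by DESeq2; the plain-Python sketch below is only meant to show what the adjustment does to a list of raw p values. The input values are invented, and the function is a generic implementation of the step-up procedure rather than DESeq2's own code.

```python
from typing import List


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices, smallest p first
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity of the adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Invented raw p values for four genes.
print([round(q, 4) for q in benjamini_hochberg([0.01, 0.04, 0.03, 0.005])])
# [0.02, 0.04, 0.04, 0.02]
```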

Western blotting
Cells were harvested after washing with phosphate-buffered saline (PBS). Whole cell lysates were prepared using radioimmunoprecipitation assay (RIPA) buffer containing protease inhibitor cocktail (Roche) and subjected to immunoblot detection of RECQL using an anti-RECQL antibody (Bethyl Laboratories).

Tumour and germline DNA analysis
Whole genomes of matched germline and tumour DNA from five patients with the French-Canadian founder RECQL mutation (c.634C>T, p.Arg215*) were sequenced (20x mean depth of coverage). An IDT library kit was used for library preparation and a NovaSeq 6000 was used for sequencing. Whole exome sequencing was done for the tumour DNA at a higher depth of coverage (200x) to obtain a better estimate of TMB. Dragen Somatic software version 4.0.3 from Illumina Inc. was used to analyze the matched sequence data and determine TMB, MSI and HRD scores.
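TMB is conventionally reported as the number of eligible somatic mutations per megabase of sequenced region. The sketch below shows only that arithmetic; it does not reproduce the variant filtering or region definitions applied by the Dragen software, and the mutation count and region size are hypothetical.

```python
def tumor_mutation_burden(n_somatic_mutations: int, region_size_bp: int) -> float:
    """Mutations per megabase over the analyzed region (generic definition)."""
    if region_size_bp <= 0:
        raise ValueError("region size must be positive")
    return n_somatic_mutations / (region_size_bp / 1_000_000)


# Hypothetical tumor: 300 eligible somatic mutations over a 30 Mb region.
print(tumor_mutation_burden(300, 30_000_000))  # 10.0 mutations per Mb
```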

Fig. 1. RECQL nuclear protein expression in DCIS. A&B. DCIS negative stain (4X and 10X power magnification, respectively). C&D. Intense nuclear staining in pure DCIS (4X and 10X power magnification, respectively). E&F. Stronger nuclear staining in the DCIS component (thick arrow) than in the invasive component (thin arrow) (4X and 10X power magnification, respectively). G. RECQL nuclear protein expression boxplot showing higher nuclear RECQL expression in the normal TDLUs, decreased expression in the pure DCIS series, a further decrease in the DCIS component, and the lowest level in the IBC component of the mixed DCIS/IBC cohort.

Table 1
Clinicopathological significance of RECQL and CD8 (within tumour cell nest) coexpression in breast cancers.

Table 2
Clinicopathological significance of RECQL and FOXP3 (within tumour cell nest) co-expression in breast cancers.