Case-control analysis of truncating mutations in DNA damage response genes connects TEX15 and FANCD2 with hereditary breast cancer susceptibility

Several known breast cancer susceptibility genes encode proteins involved in DNA damage response (DDR) and are characterized by rare loss-of-function mutations. However, these explain less than half of the familial cases. To identify novel susceptibility factors, 39 rare truncating mutations, identified in 189 Northern Finnish hereditary breast cancer patients in parallel sequencing of 796 DDR genes, were studied for disease association. Mutation screening was performed for Northern Finnish breast cancer cases (n = 578–1565) and controls (n = 337–1228). Mutations showing potential cancer association were analyzed in additional Finnish cohorts. c.7253dupT in TEX15, encoding a DDR factor important in meiosis, associated with hereditary breast cancer (p = 0.018) and likely represents a Northern Finnish founder mutation. A deleterious c.2715 + 1G > A mutation in the Fanconi anemia gene, FANCD2, was over two times more common in the combined Finnish hereditary cohort compared to controls. A deletion (c.640_644del5) in RNF168, causative for recessive RIDDLE syndrome, had high prevalence in majority of the analyzed cohorts, but did not associate with breast cancer. In conclusion, truncating variants in TEX15 and FANCD2 are potential breast cancer risk factors, warranting further investigations in other populations. Furthermore, high frequency of RNF168 c.640_644del5 indicates the need for its testing in Finnish patients with RIDDLE syndrome symptoms.


Results
Case-control analysis for the identification of breast cancer associated alleles. Filtering and validation steps presented in Fig. 1 were taken in order to select potentially disease associated protein truncating or splice site mutations for the case-control association analysis. In total, 39 different alterations met the filtering criteria (21 nonsense, 9 frameshift and 9 canonical splice site mutations) and were studied further by the case-control approach in the Northern Finnish cohorts (Supplementary Table S1). Mutations in ten genes (CHD1L, CYP19A1, DCLRE1A, ERCC2, EXO1, GNL3, IGHMBP2, NAT10, PTPRH and TOP3A) turned out to be singletons and mutations in MSH3, NINL and ZRANB3 were present only in two patients while absent from controls, thus lacking the power for statistical comparisons. The segregation of these mutations in the carrier families could not be studied due to the lack of additional DNA samples from suitable family members, leaving their impact on breast cancer risk unknown. In total, four of the recurrent mutations showed significant (TEX15 c.7253dupT and FANCD2 c.2715 + 1G > A) or borderline association (TEX15 c.8325G > A and RNF168 c.640_644del5) with hereditary breast cancer in the Northern Finnish cohort. These mutations were further genotyped in expanded set of Finnish breast cancer cases and controls originating from Helsinki, Kuopio and Tampere (Table 1, and meta-analyses in Table 2). Of note, two of the studied mutations were also simultaneously identified in Helsinki: FANCD2 c.2715 + 1G > A in clinical panel sequencing of a patient with early onset triple-negative breast cancer and RNF168 c.640_644del5 in exome sequencing of hereditary breast cancer patients. TEX15 c.7253dupT mutation is stable at mRNA level and associates with breast cancer. Two different protein truncating mutations in the TEX15 gene showed potential association with breast cancer in the Northern Finnish cohort. TEX15 encodes a protein that has been suggested to have a specific role in DNA double-strand break repair, as it is necessary for formation of DMC1 and RAD51 foci on meiotic chromosomes 15 . TEX15 has initially been classified into the group of cancer/testis (CT) antigen encoding genes activated in various cancers, and its expression has been considered to be more or less strictly restricted to testis among mature organs 17,18 . However, besides testis, low level TEX15 expression has been reported also in other normal tissues (including uterus, brain and smooth muscle), whereas in cancer samples TEX15 expression has been recurrently observed in breast, lung and bladder cancer, and cutaneous melanoma 17,19,20 . We confirmed that TEX15 was expressed in the tested patient-derived LCLs and also in breast epithelial cancer cells (MCF7).
The TEX15 frameshift mutation (c.7253dupT, Leu2418PhefsTer6, rs760604179) was observed in three cases from the Northern Finnish hereditary cohort (3/247, 1.2%), whereas only one carrier was identified in controls (1/1190, 0.1%, p = 0.018, odds ratio [OR] = 14.6 and 95% confidence interval [CI] = 1.5-141.1) ( Table 1). All three cases were diagnosed with breast cancer at young age (39, 36 and 38) and did not have mutations in previously screened cancer predisposition genes 3,21,22 . Curiously, two of them had a father diagnosed with prostate cancer: one was confirmed as a TEX15 c.7253dupT carrier, the other was not available for testing ( Supplementary  Fig. S1a). The mother of the third index case was diagnosed with breast cancer at the age of 40 years (carrier status unknown). Three more carriers (diagnosed at the age of 37, 58 and 47, respectively) were identified in the unselected cohort ( Supplementary Fig. S1a). Only one sample from family members was available for mutation testing: the mother of the index (family 6), diagnosed with bone marrow cancer at the age of 81, was confirmed as a carrier. Data from pathology reports was available for all six carriers. Tumors were mainly ER/PR positive, and in five carriers these were of ductal origin (Supplementary Table S2). No additional TEX15 c.7253dupT carriers were identified in either the cases or the controls originating from Helsinki or Tampere (both Southern Finland), indicating that the mutation is geographically restricted to Northern Finland.
Another TEX15 truncating mutation (c.8325G > A, Trp2775Ter, rs146619272) was identified in 7/247 (2.8%) of the Northern Finnish hereditary cases and it showed a borderline association with breast cancer (p = 0.063, OR = 2.7 and 95% CI = 1.1-6.8). The enrichment was restricted to the hereditary cohort, whereas the prevalence of the mutation in the unselected cases was not different from the controls (Table 1). Further screening of TEX15 c.8325G > A in the additional Helsinki and Tampere cohorts revealed several carriers in both hereditary cases and controls with similar frequencies, indicating that TEX15 c.8325G > A might not be a breast cancer risk factor or that the risk associated with it is low ( Table 1 and Table 2).
To resolve this apparent discrepancy in breast cancer association between the two protein truncating mutations in the same gene, we tested their effect at mRNA level. Curiously, TEX15 c.7253dupT and c.8325G > A behaved differently: the breast cancer associated TEX15 c.7253dupT mRNA was stable, whereas the transcript from the c.8325G > A allele was efficiently targeted by nonsense-mediated decay (Fig. 2). This indicates that the effect of c. 8325G > A is at the dosage level of TEX15, whereas c.7253dupT could disturb at least some cellular functions of TEX15, potentially in dominant-negative manner. Although we were able to study the effect of the TEX15 c.7253dupT mutation at mRNA level, the presence of the truncated TEX15 protein product could not be confirmed due to lack of a suitable antibody.
As defects in some DDR genes have been associated with increased genomic instability 12, 13, 23 , we tested the prevalence of chromosomal aberrations in the TEX15 c.7253dupT carriers. They exhibited an increase in the frequency of spontaneous chromosomal rearrangements when compared to controls, however, the difference showing only a borderline significance (p = 0.067) (Supplementary Table S3). All observed chromosomal aberrations were considered random, as no preference for specific break site or evidence for clonality was observed, thus suggesting a plausible effect on genomic instability for the heterozygous TEX15 c.7253dupT mutation.
Fanconi anemia allele in FANCD2 shows enrichment in breast cancer cases.  Table 1). This mutation induces the use of a cryptic splice-donor site downstream the exon (c.2715 + 28_29) resulting in inclusion of a 27 bp sequence of the intron into mRNA, which translates into a premature stop codon after three inserted amino acids. cDNA-specific sequencing of the mutation carrier LCL mRNA confirmed the expression and at least partial stability of the mutated allele ( Supplementary Fig. S2). The deleterious effect of the mutation for the protein function is supported by its occurrence in two FA patients at compound heterozygous state 24 . All three c.2715 + 1G > A carriers from the hereditary cohort were diagnosed with breast cancer at relatively young age (39, 40 and 42). In addition to several breast cancer cases in their families, two index cases had also a close relative (1st or 2nd degree) diagnosed with blood cancer (at the age of 1 and 17 years, respectively) ( Supplementary Fig. S1b). However, the segregation of the FANCD2 mutation with  (Tables 1 and 2), but the frequency remained low, which is typical for mutations causative of FA 25 . In Helsinki, the FANCD2 mutation was identified in four breast cancer cases in the genotyped hereditary cohort (diagnosed at the age of 41, 44, 49 and 68, respectively). Two of the index cases were diagnosed also with basal cell carcinoma. However, additional DNA samples were available only in one family, in which the mutation did not segregate with breast cancer (Supplementary Fig. S1c). Data from pathology reports was available for eleven carriers, the tumors being mainly ductal (9/11). Curiously, three of the tumors (27%) were triple-negative (ER-/PR-/HER2-,   Supplementary Table S2), a subtype that has previously been associated with defects in other DDR genes 26 . For the unselected carriers from all of the analyzed cohorts, the mean age at diagnosis was 57 years (variation 48-66 years). In the meta-analysis, the mutation had an OR of 2.6 among the BRCA1/2 mutation negative hereditary cases when compared to controls, but the association remained under the level of statistical significance (Table 2).

Mutations in genes causative for other inherited syndromes.
Besides the FA gene FANCD2, targeted sequencing of DDR genes revealed several truncating variants in genes known to result in recessively inherited syndromes with variable phenotypes, including sensitivity to UV-light, impairment of eyesight or muscle function and also susceptibility to various cancers (OMIM http://www.omim.org/): APTX, CEP164, CYP19A1, ERCC2, ERCC6, IGHMBP2, NTHL1, RNF168 and UVSSA. The observed recurrent heterozygous alleles in these genes did not associate with breast cancer, although the association for the singleton mutations in CYP19A1, ERCC2 and IGHMBP2 could not be excluded (Supplementary Table S1). Among this group of syndrome genes, RNF168 was particularly interesting since it encodes an E3 ubiquitin-protein ligase involved in the signaling pathway that is required to recruit BRCA1 to the site of DNA damage 16 . In addition, biallelic RNF168 truncating mutations have previously been reported in two cases with recessively inherited RIDDLE syndrome (Radiosensitivity, Immunodeficiency, Dysmorphic features and Learning Difficulties) 16 . The currently identified RNF168 allele (c.640_644del5, Lys214Terfs, rs777601326) was observed in 4/247 (1.6%) of the Northern Finnish hereditary cases, but it was also detected in 6/1185 (0.5%) of the controls and showed only a borderline association with breast cancer susceptibility (p = 0.077, OR = 3.2 and 95% CI = 0.9-11.5) ( Table 1). The prevalence of the RNF168 c.640_644del5 allele was also tested in Helsinki, Tampere and Kuopio case-control cohorts. Unexpectedly, it was observed at higher frequency in controls than in breast cancer patients in all three series and showed no association with breast cancer in the meta-analysis (Table 2). Furthermore, it had a strikingly high carrier frequency in Kuopio (Eastern Finland) not only in cases (2.6%) but also in controls (4.5%) ( Table 1), indicating that homozygotes for this mutation should be expected in the Finnish population.

Discussion
The extensive case-control association analysis performed here for the rare, presumably deleterious alleles in genes encoding important players in the DDR pathway identified two potentially breast cancer predisposing mutations: TEX15 c.7253dupT and FANCD2 c.2715 + 1G > A. Whereas FANCD2 has an integral role in the canonical FA pathway, the biological functions of TEX15 are only gradually emerging. However, TEX15 has already been shown to operate in homologous recombination and to guide the loading of RAD51 to the sites of DNA double-strand breaks along with BRCA1/2 during male meiosis 15,27 . Reflecting this, biallelic mutations in TEX15 lead to spermatogenic failure in both men and mice 15,28 . Despite these testis-specific functions, we confirmed the TEX15 expression in LCLs and MCF7 cell line. Together with the observed association with breast cancer this indicates that the role of TEX15 extends beyond testis and meiosis-specific double-strand break repair. The breast cancer associated TEX15 c.7253dupT differed from the other observed TEX15 mutation, c.8325G > A, in that it was stable at mRNA level whereas the c.8325G > A mRNA was efficiently degraded. It is possible that this stable allele might disturb some of the normal cellular functions of TEX15, potentially relating to cancer predisposition, more severely than a complete null allele. Both c.7253dupT and c.8325G > A mutations reside in the same long exon (altogether 1259 base pairs), preceding the last one. The degraded TEX15 null mutation (c.8325G > A) creates a stop codon close to the original, whereas the stop codon eventually created by the c.7253dupT frameshift is more distant. According to a recent paper by Lindeboom et al. 29 , the distance of a new stop codon from the original one and the long length of the exon where the mutation resides are the key factors for defining the efficiency of nonsense-mediated decay. Indeed, this could provide a plausible explanation to the currently observed situation, which is actually analogous to that observed for the Finnish PALB2 and MCPH1 founder mutations. Both of them reside in a long exon of the gene, are stable at mRNA level, and are more common in people affected with breast cancer compared to healthy controls 12,23 . Besides introducing TEX15 as a putative new breast cancer predisposition gene, the current results can be of relevance for individuals suffering from idiopathic spermatogenic failure. The TEX15 null allele (c.8325G > A) has a carrier frequency of approximately 1% in the Finnish population, and individuals homozygous for this mutation are likely to suffer from spermatogenic failure as in the family described by Okutman et al. 28 .
The prevalence of FANCD2 c.2715 + 1G > A mutation among the studied cohorts supports the previously established link between FA and breast cancer: monoallelic mutations in BRCA2/FANCD1, BRIP1/FANCJ, PALB2/FANCN, RAD51C/FANCO and BRCA1/FANCS genes predispose to breast and/or ovarian cancer, while their biallelic mutations cause FA, characterized by genomic instability, bone marrow failure, developmental abnormalities and increased incidence of cancer, particularly leukemia 14,30,31 . Breast cancer associated genes from the FA pathway function in the homologous recombination repair part of the pathway, while mutations in the core complex FA genes have not been associated with breast cancer 25,32 , the exceptions potentially being FANCM 8,33 and FANCC 34 . Curiously, also in the study by Thompson et al. 2012, one of the index cases with truncating FANCC mutation carried a deleterious BRCA2 mutation 34 , analogous to the situation observed in the current study, where one FANCD2 and BRCA1 double mutant was identified. Although high FANCD2 expression levels in breast tumors have previously been linked to poor survival of the patients 35 , and one common FANCD2 SNP has been associated with sporadic breast cancer risk 36 , to our knowledge, this study is the first one reporting FANCD2 mutations in the context of familial breast cancer.
This study also provides integral information for the field of clinical genetics as many biallelic mutations in DDR genes are known to cause severe congenital disorders. Particularly interesting is the recurrent RNF168 c.640_644del5 mutation, which clearly represents a founder mutation in Finland, but whether it is causative for RIDDLE syndrome remains to be demonstrated. Based on the observed carrier frequency, and if the mutation is not embryonically lethal when homozygous, it is likely that RIDDLE syndrome patients exist in Finland, thus Scientific RepoRts | 7: 681 | DOI:10.1038/s41598-017-00766-9 warranting further investigations. Only two patients with this condition have so far been described worldwide 16,37 and as the phenotype of even these patients is different, it is possible that the severity of the condition might vary. For rare conditions like RIDDLE syndrome, the identification of additional patients is important for the better characterization and understanding of the disease phenotype and etiology, and it may ultimately lead to better diagnosis and potential treatment.
Overall, this study shows that many heterozygous protein-truncating mutations in DDR genes identified in breast cancer cases are also frequently found in the healthy population and thus are not high-risk susceptibility factors for breast cancer, although any low-penetrance effects cannot be excluded. Our results indicate a potential role in breast cancer predisposition for heterozygous truncating mutations in FANCD2 and TEX15. Based on our results, FANCD2 c.2715 + 1G > A might act as a moderate breast cancer risk allele, adding FANCD2 to the list of shared genes between FA and breast cancer. According to the ExAC database 38 , FANCD2 c.2715 + 1G > A occurs also outside Finland and should be studied further in larger sample sets to clarify its role in breast cancer predisposition. TEX15 represents yet another potential breast cancer susceptibility gene from the DDR pathway. Even though the currently identified TEX15 c.7253dupT most likely represents a Northern Finnish founder mutation, other truncating mutations, and their effect at mRNA level, warrant further studies in other populations.

Materials and Methods
Targeted sequencing of 796 DDR genes in Northern Finnish breast cancer patients. Initial sequencing was previously performed for 189 Northern Finnish breast cancer patients with indication of hereditary disease susceptibility [62 cases with young disease onset (≤40y), and 127 cases with family history of breast or breast and ovarian cancer] 12 . Selection and sequencing of the 796 DDR genes (Supplementary Table S4), and annotation of the variants are described in Mantere et al. 12 . Briefly, the selected genes encoded (1) proteins identified as being part of DNA repair processes using the GeneOntology searches and STRING v.9.0 (n = 612) 39,40 , and (2) novel BRCA1 and PALB2 interacting proteins identified in protein complex purification assays performed with epitope tagged versions of the proteins in HeLaS3 cells (n = 184). wANNOVAR 41 , SureCall (Agilent technologies, Santa Clara, CA, USA) and Integrative Genomics Viewer (IGV) 42 were used for annotation and visualization of the variants. All variants selected for further studies were confirmed by Sanger sequencing (ABI3130xl, Applied Biosystems, Foster City, CA, USA).

Case-control cohorts. Oulu cohort (Northern Finland). The hereditary breast cancer cohort (n = 247)
consisted of 166 familial and 81 young breast cancer cases (includes familial and young cases from the discovery cohort). The familial cases were affected index individuals of Northern Finnish breast, or breast and ovarian cancer families. Inclusion criteria were the following: 1) three or more breast and/or ovarian cancers in first-or second-degree relatives (n = 86); 2) two cases of breast, or breast and ovarian cancer in first-or second-degree relatives. Of these at least one had early disease onset (<35 years), bilateral breast cancer, or multiple primary tumors including breast or ovarian cancer in the same individual (n = 26); or 3) two cases of breast cancer in first-or second-degree relatives (n = 54). Young breast cancer cases were unselected for a family history of cancer and diagnosed with breast cancer at or below the age of 40 (median 38, variation 25-40 years). These cases were combined with the familial cohort to form a single hereditary cohort, based on the assumption that when a woman below the age of 40 years develops breast cancer, a hereditary predisposition may be suspected regardless of the family history 43 . In the hereditary cohort fulfilling the inclusion criteria of the current study, 19/247 (8%) of the cases carried known mutations in BRCA1 or BRCA2 21 , Supplementary Table S5. These were included in the analysis in order to identify additional risk factors, as in some families BRCA1/2 mutations could not alone explain the family burden of the disease, and also several individuals with double mutations in integral DDR genes have previously been described in the literature 3,12,34 . The unselected breast cancer cohort consisted of 1326 consecutive cases operated at the Oulu University Hospital during 2000-2014. They were unselected for a family history of cancer and age at disease onset. Altogether 1228 healthy geographically matched Finnish Red Cross blood donors (758 females and 470 males) were used as population controls.
Helsinki cohort (Southern Finland). The unselected breast cancer cohort from Helsinki was collected consecutively at Helsinki University Hospital Department of Oncology in 1997-1998 and 2000 44,45 and Department of Surgery in 2001-2004 46 . The unselected cohort consisted of 1727 patients with invasive breast cancer. Altogether 416 patients from the unselected series had a family history of breast cancer and were included also in the hereditary breast cancer cohort. Additional familial breast cancer patients were collected at Departments of Oncology and Clinical Genetics as previously described 46,47 . In total, the hereditary breast cancer cohort consisted of 1175 patients. Of these, 612 patients were from families with three or more breast or ovarian cancers among first-or second-degree relatives and 563 patients had one affected first-degree relative. The familial breast cancer patients have been tested negative at least for BRCA1/2 founder mutations. Geographically matched 1272 healthy Finnish Red Cross blood donors were used as controls. All patients and controls were females.
Tampere cohort (Southern Finland). The hereditary cases from Tampere consisted of 87 index cases of BRCA1/2 mutation negative hereditary breast, or breast and ovarian cancer families as described in Kuusisto et al. 48 . The unselected breast cancer cases (n = 646) were collected during 1997-1999 at the Tampere University Hospital 44 . 767 geographically matched female Finnish Red Cross blood donors were used as controls.
Kuopio cohort (Eastern Finland). Altogether 487 cases and 288 controls were available from the Kuopio Breast Cancer Project (KBCP) 49 . The KBCP material includes prospective breast cancer cases (unselected for the family history of breast cancer) from the province of Northern Savo in Eastern Finland, diagnosed at the Kuopio University Hospital between 1990 and 1995. The control individuals in KBCP were selected from the National Scientific RepoRts | 7: 681 | DOI:10.1038/s41598-017-00766-9 Population Register at the same time interval and were matched for sex, age and long-term area of residence for the cases. Additional set of unselected breast cancer cases (n = 181) were available from the ILRS breast cancer cohort. The ILRS cases were diagnosed and collected at the Kuopio University Hospital during 2011-2014. Altogether these accounted for 668 unselected breast cancer cases and 288 controls.
Genomic DNA extracted from peripheral blood was used for mutation screening from all samples. The study was carried out with informed consent from all the participating individuals. The study was approved by the Ethical Board of the Northern Ostrobothnia Health Care District, the Ethical committee of Helsinki University Central Hospital, the Ethical Committee of Tampere University Hospital and the National Authority for Medicolegal Affairs, the Ethical committee of the University of Eastern Finland and Kuopio University Hospital and the Ministry of Social Affairs and Health in Finland. The study and the methods were conducted in accordance with the approved guidelines. Chromosomal analysis. Chromosomal analysis of six heterozygous TEX15 c.7253dupT carriers [three affected index cases, one male diagnosed with prostate and skin cancer, and two healthy females (aged 25 and 58)] and nine wild type controls was carried out on metaphases obtained from regular short-term 3-day cultures of peripheral blood T-lymphocytes 13 . The blood samples of the patients selected for chromosomal analysis were collected at least 5 years after the initial breast cancer diagnosis and treatment. The controls were healthy female individuals. A minimum of 50 Giemsa-banded metaphases for each sample were evaluated by light microscopy and photographed with an automatic chromosome analyzer (CytoVision version 7.2, Applied Imaging). Chromosomal aberrations were divided into five classes as previously described 13 . mRNA analysis. Total RNA was isolated from MCF7 breast cancer cell line and Epstein-Barr virus transformed lymphoblastoid cell lines (LCLs) of TEX15 c.7253dupT, TEX15 c.8325G > A and FANCD2 c.2715 + 1G > A mutation carriers and wild type controls using RNeasy Mini Kit (Qiagen, Hilden, Germany) and reverse transcribed with iScript cDNA synthesis Kit (Bio-Rad, Hercules, CA, USA). The expression and outcome of different mutations at mRNA level was studied using cDNA specific sequencing in LCLs (primers in Supplementary Table S6). TEX15 expression in MCF7 cell line was tested by cDNA specific amplification. Statistical analysis. Carrier frequencies between cases and controls were compared using Fisher's exact or χ 2 test. Mann-Whitney U-test was used to determine the difference in the number of different chromosomal aberrations between TEX15 c.7253dupT carriers and controls. The meta-analyses of the mutations occurring in more than one study cohort (FANCD2 c.2715 + 1G > A, TEX15 c.8325G > A and RNF168 c.640_644del5) were performed using logistic regression model combining all the analyzed cohorts and exploiting all individual datasets. All the analyses were performed using IBM SPSS Statistics 22.0 for Windows (SPSS Inc., Chicago, IL, USA). All p-values were two-sided and values <0.05 were considered statistically significant.