Prevalence of PALB2 mutations in Australian familial breast cancer cases and controls

PALB2 is emerging as a high-penetrance breast cancer predisposition gene of the order of BRCA1 and BRCA2. However, large studies that evaluate the full gene, rather than just the most common variants, in both cases and controls are required before all truncating variants can be included in familial breast cancer variant testing. In this study we analyse almost 2000 breast cancer cases sourced from individuals referred to familial cancer clinics, thus representing typical cases presenting in clinical practice. These cases were compared to a similar number of population-based cancer-free controls. We identified a significant excess of truncating variants in cases (1.3 %) versus controls (0.2 %) (p = 0.0001; odds ratio (OR) 6.58, 95 % confidence interval (CI) 2.3–18.9), including six novel variants. Three of the four control individuals carrying truncating variants had at least one relative with breast cancer. There was no excess of missense variants in cases overall, but the common c.1676A > G variant (rs152451) was significantly enriched in cases and may represent a low-penetrance polymorphism (p = 0.002; OR 1.24, 95 % CI 1.09–1.47). Our findings support truncating variants in PALB2 as high-penetrance breast cancer susceptibility alleles, and suggest that a common missense variant may also confer a small increase in breast cancer risk.


Introduction
Partner and localizer of BRCA2 (PALB2) plays a central role in homologous recombination-mediated repair of double-strand DNA breaks [1], and biallelic mutations are responsible for Fanconi anemia complementation group N [2]. Monoallelic inactivating germline mutations in PALB2 were subsequently shown to be associated with familial breast cancer [3]. Numerous studies have supported this association in various populations and established a mutation prevalence of approximately 1 % among familial breast cancer cases (varying from 0.1 % to 2.7 %, as reviewed by Southey et al. [4]). Most recently, Antoniou et al. used a modified segregation analysis approach to determine that the age-specific risk of breast cancer among female mutation carriers overlaps the risk conferred by BRCA2 mutations [5], establishing that, despite the rarity of mutations, PALB2 is the most significant breast cancer predisposition gene after BRCA1 and BRCA2.
In Australia, early studies identified PALB2 c.3113G > A (p.Trp1038*) as a recurring truncating mutation among familial breast cancer index cases, and established the enrichment of c.3113G > A in cases compared to controls [6]. Further studies have identified a spectrum of truncating variants among breast cancer cases [7][8][9][10], the collective frequency of which has not been compared to Australian controls. Indeed, few studies of PALB2 mutations have analysed significant numbers of family cancer clinic-ascertained cases or matched controls. Because early studies focused on screening just for the presumed common pathogenic mutations, Australian guidelines (eviQ Cancer Treatments Online; [11]) do not recommend testing for PALB2 truncating mutations aside from the recurring c.3113G > A variant. However, it is likely that all truncating mutations confer an equivalent loss of gene function and consequent breast cancer risk. Other guidelines, such as those of the National Comprehensive Cancer Network [12], make no specific distinction between different PALB2 mutations but do raise a general caution around the interpretation of testing for mutations in PALB2 and other "moderate penetrance" breast cancer predisposition genes, especially as part of panel tests. Identification of genetic risk factors is critical for individual risk assessment and reduction strategies, and in the near future may provide avenues for personalised therapy [4]. Therefore it is important to continue to amass the data necessary to support the implementation of whole-gene testing of PALB2 in breast cancer families. In this study, we performed germline mutation analysis of the entire coding region of PALB2 in a cohort of 1996 breast cancer index cases who were referred to familial cancer clinics for genetic testing and tested negative for BRCA1 and BRCA2 mutations, as well as in 1998 Australian cancer-free female controls. This represents the largest single case/control screen of germline PALB2 mutations to date.

Samples for mutation analysis
Cancer-affected women in the study were referred by their physician to a specialist Familial Cancer Centre (FCC) for genetic testing of BRCA1 and BRCA2 between 1997 and 2014, and were identified as being at "high risk" of carrying a predisposing allele. The criteria for high risk included a personal history of breast cancer, two or more first- or second-degree relatives with breast and/or ovarian cancer, and an additional risk factor (additional affected close relatives, diagnosis before 40 years, multiple primary breast or ovarian cancers in one individual, or Ashkenazi Jewish ancestry). From 2003, individuals with a ≥10 % risk of carrying a BRCA1 or BRCA2 mutation, as estimated by BRCAPro, including tumour pathology, were also eligible [13].
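The eligibility rules above can be restated as a small predicate. This is only an illustrative sketch: the `Referral` class, its field names and the `is_high_risk` function are hypothetical, and reading the three listed criteria as jointly required (with the BRCAPro route as an alternative from 2003) is an assumption about how the clinics applied them.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Referral:
    """Hypothetical record of one referred individual (field names are illustrative)."""
    year: int                              # year of referral
    personal_breast_cancer: bool           # personal history of breast cancer
    affected_relatives: int                # first- or second-degree, breast and/or ovarian cancer
    additional_risk_factor: bool           # e.g. diagnosis before 40, multiple primaries
    brcapro_risk: Optional[float] = None   # estimated BRCA1/2 carrier probability, 0..1


def is_high_risk(r: Referral) -> bool:
    # Assumed reading: all three family-history criteria must hold together.
    family_route = (
        r.personal_breast_cancer
        and r.affected_relatives >= 2
        and r.additional_risk_factor
    )
    # From 2003, a BRCAPro estimate of at least 10 % was also sufficient.
    model_route = (
        r.year >= 2003
        and r.brcapro_risk is not None
        and r.brcapro_risk >= 0.10
    )
    return family_route or model_route
```

Under this reading, a 1999 referral with only one affected relative would not qualify even with a high BRCAPro estimate, because the model-based route applied only from 2003.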
Our final case cohort (Additional file 1) included 997 breast (95 %) or ovarian (5 %) cancer-affected index cases from the Hunter Area Pathology Service (HAPS), Newcastle, Australia [9]. Family history information was available for a subset of this cohort only. A further 999 breast cancer-affected index cases, each with detailed family history available, were obtained from the combined Victorian Familial Cancer Centres (FCCs) through the Variants in Practice (ViP) study. For all cases, clinical genetic testing of BRCA1 and BRCA2, including for large rearrangements by multiplex ligation-dependent probe amplification (MLPA), returned negative results.
A cohort of 1998 participants in the LifePool study [14] was utilised as cancer-free population control samples for this analysis. LifePool recruits female participants through the Australian population mammography screening program (BreastScreen) for research studies utilising prospectively collected epidemiological, genetic and mammographic data, with ongoing clinical follow-up obtained through the Victorian Cancer Registry. Participants provided breast cancer family history information for close relatives only. The average age of the participants recruited to this study was 58.84 ± 9.9 years (range 19-91).
All cases and controls provided informed consent for genetic analysis of their germline DNA. This study was approved by the Human Research Ethics Committees at each participating ViP centre (see Acknowledgements), the Peter MacCallum Cancer Centre, Hunter New England Health and The University of Newcastle. This study was carried out in accordance with all relevant regulations and guidelines.

Germline mutation analysis
Germline mutation analysis of the PALB2 gene was performed as part of a custom sequencing panel. All coding PALB2 exons were amplified from 225 ng of germline DNA extracted from blood or saliva using the HaloPlex Targeted Enrichment Assay (Agilent Technologies, Santa Clara, CA, USA) according to the manufacturer's protocol using an Agilent Bravo Automated Liquid Handling System. Paired-end 100 or 150 bp sequence reads were generated from the indexed, pooled libraries on a HiSeq2500 Genome Analyzer (Illumina, San Diego, CA, USA). Sequence reads were trimmed of adapter sequences using Cutadapt [15] and aligned using either BWA or BWA-MEM [16]. Genome Analysis Toolkit (GATK) v3.1 was used to perform indel realignment, and UnifiedGenotyper was used for variant calling [17,18]. Protein consequence and additional annotations were added using Ensembl v73 Variant Effect Predictor [19]. Variant positions were determined by reference to GenBank reference sequence NM_024675.3 according to Human Genome Variation Society (HGVS) guidelines [20]. All novel variants were validated by Sanger resequencing of germline DNA using primers from Tischkowitz et al. [21]. The following in silico prediction tools were used to assess the possible pathogenicity of missense mutations: Combined Annotation-Dependent Depletion (CADD) [22], Condel [23], SIFT [24] and PolyPhen2 [25]. CADD scores evaluate both missense and indel variants, integrating conservation measures and regulatory, transcriptional and protein effects to estimate the relative deleteriousness of the variants.

Coverage
A total of 1996 breast cancer index cases and 1998 cancer-free controls were screened for germline mutations in the coding regions of PALB2. These coding regions were well covered by sequence reads in both cases and controls.
The mean read depth across the entire gene for all samples was 217 (192 for cases, 242 for controls), with an average of 98.66 % of the coding regions covered by at least 20 reads (98.12 % for cases and 99.20 % for controls).
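The two coverage metrics quoted above (mean read depth, and the percentage of coding bases covered by at least 20 reads) can be illustrated with a minimal sketch. The per-base depth list below is toy data and the function names are hypothetical; the study's actual coverage tooling is not described. The last helper only checks that the reported overall figures are consistent with a sample-size-weighted average of the case and control figures, which is an assumption about how the overall values were derived.

```python
def mean_depth(depths):
    """Average number of reads per targeted base."""
    return sum(depths) / len(depths)


def breadth_at(depths, min_depth=20):
    """Percentage of targeted bases covered by at least min_depth reads."""
    covered = sum(1 for d in depths if d >= min_depth)
    return 100.0 * covered / len(depths)


def pooled_mean(value_a, n_a, value_b, n_b):
    """Sample-size-weighted mean of two group-level summaries."""
    return (value_a * n_a + value_b * n_b) / (n_a + n_b)


# Toy per-base depths for one hypothetical sample (not study data).
toy_depths = [250, 180, 19, 300, 20, 5, 210, 400]
print(mean_depth(toy_depths))   # 173.0
print(breadth_at(toy_depths))   # 75.0

# Consistency check on the reported summaries (1996 cases, 1998 controls).
print(round(pooled_mean(192, 1996, 242, 1998)))          # 217
print(round(pooled_mean(98.12, 1996, 99.20, 1998), 2))   # 98.66
```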
The personal and family history information for carriers of the PALB2 truncating variants is given in Table 2 and Additional file 2. As expected, the cases generally have a strong family history of cancer, especially breast cancer. In the controls, four individuals were identified with truncating variants. One individual had a maternal aunt diagnosed with breast cancer at under 40 years of age, and her mother, father and brother all had cancer, although not of the breast. The mothers of two other individuals were each diagnosed with breast cancer at over 70 years of age, and one of these individuals also had two second-degree relatives with breast cancer. The final and youngest carrier (aged 48) did not report any breast cancer in her family. Thus, three of the four carriers have some family history of breast cancer.

Missense and synonymous variants
A large number of missense variants (n = 54) were detected in the cohort (Table 3). There was a slight … Considering only those rare variants present in fewer than five carriers among 3994 cases and controls (approximately 0.1 %), a similar number of missense variants were detected in both groups (40 in cases (2 %), 28 in controls (1.4 %)), which does not suggest any association of rare missense variants with risk. There was also no significant enrichment in cases when the analysis was limited to rare variants that were predicted to be deleterious by any of Condel, SIFT or PolyPhen2 (28/1996 cases, 18/1998 controls) or that had a CADD score of >10 (29/1996 cases, 20/1998 controls).
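A carrier-count comparison of this kind can be checked with a two-sided Fisher's exact test on the 2 × 2 table of carriers and non-carriers. The sketch below is a self-contained illustration only: the text does not say which test was used for the missense comparisons, the function names are hypothetical, and the reported counts are assumed to be numbers of carriers.

```python
from math import exp, lgamma


def log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def fisher_two_sided(case_carriers, n_cases, control_carriers, n_controls):
    """Two-sided Fisher's exact p-value for carriers in cases versus controls."""
    total = n_cases + n_controls
    carriers = case_carriers + control_carriers
    log_denom = log_comb(total, n_cases)

    def pmf(k):
        # Hypergeometric probability that k of the carriers fall among the cases.
        return exp(
            log_comb(carriers, k)
            + log_comb(total - carriers, n_cases - k)
            - log_denom
        )

    lowest = max(0, n_cases - (total - carriers))
    highest = min(carriers, n_cases)
    p_observed = pmf(case_carriers)
    # Sum every table that is no more probable than the observed one.
    p_value = sum(
        p for p in (pmf(k) for k in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-7)
    )
    return min(1.0, p_value)


# Rare missense carriers: 40/1996 cases versus 28/1998 controls.
p_rare_missense = fisher_two_sided(40, 1996, 28, 1998)
print(p_rare_missense > 0.05)  # True: no significant difference at the 5 % level
```

Applied to the predicted-deleterious subsets (28 versus 18 carriers, or 29 versus 20), the same function gives the corresponding p-values.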
We detected 23 synonymous variants (Table 4). Neither the most common variant alone (c.3300T > G) nor all synonymous variants together were significantly enriched in cases or controls.

Discussion
This study screened Australian individuals with breast cancer who had been referred to a Familial Cancer Centre for genetic testing and in whom no pathogenic BRCA1 or BRCA2 variant could be identified. The frequency of PALB2 truncating variants in this cohort (1.3 %) is similar to that in other studies of high-risk breast cancer individuals (0.64-3.4 %, 1.35 % overall [3, 6-9, 26-41]) or triple-negative breast cancer (0.9-2.5 % [10,42,43]), but this study is the largest to include an analysis of the full gene in both cases and controls. However, our sequencing approach would not detect large deletions or rearrangements. The low frequency of truncating variants in controls supports PALB2 as a high-penetrance breast cancer predisposing gene. The diversity of truncating mutations identified, comprising 16 different variants in eight of the 13 exons, including six novel variants, highlights the need for full gene screening rather than screening for only the most common variant, c.3113G > A (rs180177132). These data will enable evidence-based clinical guidelines to include full PALB2 screening where they had previously advised testing limited to the specific common variant.
The prevalence of truncating variants in cancer-free controls was 0.2 % in the LifePool cohort. These individuals were ascertained from women attending population-based mammographic screening, which in Australia is targeted towards women over 50, although some younger women are included. Thus, this volunteer cohort may not be entirely representative of the general population, although all were cancer-free at the time of analysis. Nonetheless, the frequencies of missense and synonymous variants are consistent with those reported in large databases such as 1000 Genomes [44], Exome Aggregation Consortium [45] and Exome Variant Server [46].
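The odds ratio and confidence interval reported in the abstract for truncating variants can be reproduced from carrier counts with a standard log-odds (Woolf) interval. This is a sketch under stated assumptions: the function name is hypothetical, the case carrier count of 26 is not stated directly but is inferred from 1.3 % of 1996 cases, the control count of four is taken from the text, and the Woolf method is assumed because it matches the reported interval, not because the text names it.

```python
from math import exp, log, sqrt


def odds_ratio_woolf(case_carriers, n_cases, control_carriers, n_controls, z=1.96):
    """Odds ratio with a Woolf (log-odds) confidence interval; z=1.96 gives 95 %."""
    a = case_carriers
    b = n_cases - case_carriers        # non-carrier cases
    c = control_carriers
    d = n_controls - control_carriers  # non-carrier controls
    odds_ratio = (a * d) / (b * c)
    # Standard error of the log odds ratio.
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * se)
    upper = exp(log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# 26 inferred carriers among 1996 cases versus 4 carriers among 1998 controls.
or_value, ci_low, ci_high = odds_ratio_woolf(26, 1996, 4, 1998)
print(round(or_value, 2), round(ci_low, 1), round(ci_high, 1))  # 6.58 2.3 18.9
```

The wide interval reflects the small number of control carriers: the 1/4 term dominates the standard error, so the estimate is sensitive to a change of even one control carrier.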
We did not observe any significant enrichment in missense mutations overall, although the frequency was slightly higher in the cases when only rare, deleterious mutations were considered. The contribution of rarer variants to breast cancer risk will need to be evaluated in larger case-control cohorts. Surprisingly, the common variant c.1676A > G (p.Gln559Arg; rs152451) was significantly enriched in cases versus controls, although with only a modest odds ratio (1.24). There was a trend towards homozygous carriers of this variant being enriched in cases versus controls, with an OR of 2.08. This variant was shown to be associated with an increased breast cancer risk in multiple-case breast cancer families in Chile compared to population controls [47], with an OR of 2.0 when at least three family members were breast or ovarian cancer-affected. No association was found for individuals diagnosed at a young age (<50) and with no affected relatives. In a small Malaysian case-control study, there was a trend towards enrichment for carriers of the variant in non-familial breast cancer cases (286/871, 33 %) versus controls (70/257, 27 %, OR 1.3 [38]). However, cases and controls were not well matched for ethnicity, with an excess of Indian and Malay women over Chinese in the controls compared to cases. Larger numbers of cases and controls will be required to confirm whether the association of rs152451 with breast cancer is a robust finding. In addition, the wide variation in the frequency of the minor allele in different populations means that cases and controls will have to be carefully matched for ethnicity. This variant is not located in a known protein domain and was consistently predicted to have benign effects on protein function by all algorithms tested.
However, this base change is only 9 bp away from the exon 4 splice donor site, and Human Splicing Finder (v3) found that rs152451 could alter an exonic splicing enhancer motif [48], offering a potential mechanism for how this variant could affect PALB2 function. It should be noted that such a prediction was relatively common for the variants we detected in PALB2 (35/77 missense or synonymous variants had a similar prediction from at least three algorithms), and any effect would need to be confirmed by an RNA-based assay.
Only one study to date has examined the likely functional effect of missense variants in PALB2, assessing p.Leu939Trp, p.Leu1143Pro and p.Thr1030Ile [49]. The first two variants had subtle but significant effects on homologous recombination repair: p.Leu1143Pro in particular showed decreased repair capacity and binding to BRCA2 and RAD51C. PALB2 p.Thr1030Ile was unstable, leading to decreased protein levels, and this was assumed to impair homologous recombination repair. However, it should be noted that these functional assays were performed by overexpression of a retroviral transgene in a null cell line and may not reflect the heterozygote situation. In our study, p.Leu939Trp was not enriched in cases (four in cases, eight in controls), p.Leu1143Pro was only seen in two cases and no controls, and p.Thr1030Ile was not observed in either cases or controls.