BRCA1/2 Mutation Detection in the Tumor Tissue from Selected Polish Patients with Breast Cancer Using Next Generation Sequencing

(1) Background: Although recurrent pathogenic variants account for a large percentage of the BRCA mutations detected in Polish patients with breast cancer, there is an increasing need to assess rare BRCA1/2 variants using NGS. (2) Methods: We studied 75 selected patients with breast cancer (negative for the 5 mutations tested in the Polish population in the prophylactic National Cancer Control Program). DNA extracted from the cancer tissue of these patients was used to prepare a library and to sequence all coding regions of the BRCA1/2 genes. (3) Results: We detected nine pathogenic variants in 8 out of 75 selected patients (10.7%). We identified one somatic and eight germline variants. We also analyzed the NGS FASTQ files with different bioinformatic software programs and established that tertiary analysis performed with different tools was more likely to give the same outcome when the input files came from secondary analysis performed with the same method. (4) Conclusions: Our study emphasizes (i) the importance of an NGS validation process that includes the bioinformatic procedure; (ii) the importance of screening for both somatic and germline pathogenic variants; (iii) the urgent need to identify additional susceptibility genes in order to explain the high percentage of non-BRCA-related hereditary cases of breast cancer.


Introduction
Breast cancer is the most common cancer in women worldwide [1]. Approximately 5-10% of breast cancers are hereditary. Women carrying BRCA mutations have an increased risk of breast cancer and/or ovarian cancer development, with a probability of 45-75% and 18-40%, respectively [2][3][4][5].
Hereditary ovarian/breast cancer (HOBC) is frequently caused by founder mutations in the BRCA1 and BRCA2 genes. Founder mutations were historically present with high frequencies in small, geographically or culturally isolated groups and are derived from one or more ancestors [6]. A founder effect can be observed in a population characterized by lower genetic diversity, which might be caused by the parental population suffering a dramatic decrease or bottleneck. The parental population could give rise to a larger population in which new variants could occur spontaneously or be transferred from other populations [7].
Due to the high incidence of breast cancer worldwide and its relatively high mortality and morbidity, it is important to implement appropriate screening tests which enable rapid and efficient mutation screening in the BRCA1 and BRCA2 genes. Based on the available wide range of technologies, and owing to the presence of the founder effect or a frequent mutation in the screened population, there are two possible approaches to mutation screening.
Over the years, a broad range of PCR-based mutation detection methods have been used for screening, starting from the historical SSCP [24,25] and protein truncation test (PTT) [26], followed by ASO-PCR and RT-PCR of BRCA1 mRNA [27], which helped to discover large genomic rearrangements, and subsequently followed by MLPA and CGH [28].
The second approach to BRCA1 and BRCA2 mutation testing is the sequencing of all coding regions using next-generation sequencing (NGS). The main advantage of this method is its potential to detect not only founder mutations, but also other ones, including both frequent and rare pathogenic changes. In addition, BRCA1 and BRCA2 tests expanded with CNV analysis are available on the market (EntroGen BRCA Complete ver. 2 (EntroGen, Inc, Los Angeles, CA, USA), Devyser BRCA (Devyser, Stockholm, Sweden), SureMASTR BRCA Screen (Agilent, Santa Clara, CA, USA)), enhancing the use of NGS in diagnostics. Studies by Gorski et al. (2000) and Perkowska et al. (2003) in the Polish population showed that the BRCA1 founder effect exists, with a predominant presence of c.181T>G (p.Cys61Gly) and c.5266dupC, although the mutation spectrum is more dispersed [29,30]. Full BRCA gene mutation analysis in Polish high-risk families was first postulated in 2003, and NGS studies in this group followed in 2015. Kluska et al., in their study of Genetic Counseling Unit patients with early-onset or familial breast/ovarian cancer, selected 512 cases negative for 11 BRCA1 and 9 BRCA2 mutations. BRCA1/2 testing using NGS technology showed that 52 out of 512 (10%) Polish patients had additional BRCA1/2 pathogenic variants [31]. There were 367 patients representing only familial breast cancer, among whom 26 patients had additional BRCA1/2 pathogenic variants detected using NGS in the group studied by Kluska et al. [31]. In a parallel study, 335 Polish patients with triple-negative breast cancer (negative for 3 BRCA1 mutations) were tested with NGS. The study revealed the presence of deleterious variants in 33 of 335 patients (9.9%) [32]. Finally, Kowalik et al. detected 40 (8.8%) pathogenic variants in a subpopulation of 454 healthy individuals and patients with breast and/or ovarian cancer referred to the Genetic Counseling Outpatient Clinic [33].
Since that time, the first FDA approval for targeted therapy with PARP inhibitors for patients with BRCA-positive ovarian cancer has introduced renewed hope, especially for triple-negative or metastatic breast cancer patients, as many inhibitors are under evaluation for their potential clinical benefits in patients with BRCA-positive breast cancer. In the present study, we attempted to determine the prevalence of pathogenic or likely pathogenic variants of BRCA1 or BRCA2 among the selected group of patients with breast cancer (not diagnosed as mutation carriers with the standard screening procedure in 2003-2015).

Materials
The studied population, comprising patients enrolled in the study between 2003 and 2015, was screened for the presence of known mutations in the BRCA1 gene (c.5266dupC, c.181T>G, c.4035delA, c.68_69delAG, c.3700_3704delGTAAA) by the National Cancer Control Program supervised by the Department of Health Promotion and Prevention (Franciszek Lukaszczyk Oncology Center, Bydgoszcz, Poland). Patients who were negative for these BRCA1 mutations in this targeted testing and who developed breast cancer (between 2003 and 2015) were referred after providing their informed consent.
A total of 75 breast cancer tissue samples (archived between 2003 and 2017) were selected by the Department of Tumor Pathology and Pathomorphology. The study was approved by the Bioethics Committee of the Nicolaus Copernicus University in Torun (KB 844/2018).

DNA Isolation
The percentage of tumor cells in material qualified by the pathologist ranged from 5% to 80%. DNA was isolated from breast cancer tissue fixed in formalin and embedded in paraffin (FFPE) using the QIAamp DNA FFPE Tissue Kit (Qiagen, Hilden, Germany), according to the manufacturer's protocol. The initial concentration and quality of DNA were measured using a NanoDrop 1000 (Thermo Scientific, Waltham, MA, USA).

DNA Quality Assessment
The quality and quantity of DNA were evaluated using real-time PCR with the Fragmentation Quantification Assay (FQA) CE-IVD (EntroGen) and/or by fluorometric methods using Quantus (Promega) with the QuantiFluor® dsDNA System (Promega).
FQA allows the amplification of 37-, 150- and 300-base pair (bp) fragments of isolated double-stranded DNA in a reference location on chromosome 5 and the evaluation of DNA degradation through assessment of DNA concentration (ng/µL), amplifiable copy number for the three amplicon sizes, as well as fragmentation ratio (F ratio). The F ratios 150 bp/37 bp and 300 bp/37 bp provided information on the amount of dsDNA needed as a template for library preparation. Based on the manufacturer's protocol, we found that archive FFPE material (from 2010-2017) was significantly degraded, but the lack of other biological material forced us to use DNA with an F ratio of <0.5, below the manufacturer's recommendations.
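The F ratio calculation described above can be sketched as follows. This is a minimal illustration, assuming that each F ratio is simply the amplifiable copy number of the longer amplicon divided by that of the 37 bp amplicon; the class name, function names and input values are hypothetical and are not taken from the EntroGen assay software. The 0.5 cutoff is the value mentioned in the text.

```python
from dataclasses import dataclass


@dataclass
class FqaResult:
    """Amplifiable copy numbers for the three FQA amplicon sizes (hypothetical container)."""
    copies_37bp: float
    copies_150bp: float
    copies_300bp: float


def fragmentation_ratios(result: FqaResult) -> tuple:
    """Return (F ratio 150 bp/37 bp, F ratio 300 bp/37 bp).

    Assumption: an F ratio is the copy number of the longer amplicon
    divided by the copy number of the shortest (37 bp) amplicon.
    """
    if result.copies_37bp <= 0:
        raise ValueError("37 bp copy number must be positive")
    return (
        result.copies_150bp / result.copies_37bp,
        result.copies_300bp / result.copies_37bp,
    )


def is_below_cutoff(f_ratio: float, cutoff: float = 0.5) -> bool:
    """Flag DNA whose F ratio falls below the cutoff quoted in the text (0.5)."""
    return f_ratio < cutoff


if __name__ == "__main__":
    # Invented copy numbers for illustration only, not study data.
    sample = FqaResult(copies_37bp=1000.0, copies_150bp=400.0, copies_300bp=100.0)
    f150, f300 = fragmentation_ratios(sample)
    print(f"F(150/37) = {f150:.2f}, below cutoff: {is_below_cutoff(f150)}")
    print(f"F(300/37) = {f300:.2f}, below cutoff: {is_below_cutoff(f300)}")
```

A lower ratio means that fewer long templates survive relative to short ones, which is the pattern expected for strongly fragmented FFPE DNA.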

Library Preparation for Next-Generation Sequencing
Libraries for NGS were prepared using the BRCA Complete CE-IVD test (EntroGen), which allowed the sequencing of all exons of the BRCA1 and BRCA2 genes according to the manufacturer's protocol. The test can detect single-nucleotide variants (SNVs), small insertions and deletions. The BRCA Complete test uses the target enrichment method in order to amplify the desired DNA fragments in 4 separate reactions, which are combined into one. The target enrichment step allows 115 amplicons with an approximate length of 254 bp to be obtained, and it is followed by DNA end preparation and ligation of adaptor sequences to the amplified DNA fragments. All prepared libraries were dual-indexed. The libraries were quantified using the real-time PCR Library Quantification Assay CE-IVD (EntroGen). All libraries were pooled, denatured and diluted to 5 pM together with the control library PhiX. Sequencing was performed on a MiniSeq platform (Illumina) using the MiniSeq Mid Output Kit, 2 × 150 cycles, to obtain reads for both strands.

Bioinformatic Analysis
FASTQ files were generated on the BaseSpace Onsite system. Our goal was to establish an NGS workflow for the automated and reliable assessment of many samples simultaneously for diagnostic purposes. We wanted to evaluate whether the final identification of pathogenic and likely pathogenic variants using different customized bioinformatic programs and platforms would be the same. We used two separate algorithms for the secondary analysis: (1)  Finally, an additional tertiary analysis for selected samples with detected pathogenic variants was performed using the NGeneAnalySys software (Seoul, Korea), which independently re-annotated samples [29]. All variants are described with the following reference sequences: NM_007300.3 (BRCA1), NM_000059.3 (BRCA2). There was one exception, c.5266dupC, which is the common name of a frequent BRCA1 variant with reference sequence NM_007294.4; NM_007194.4 (CHEK2); NM_024675.4 (PALB2).

Sanger Sequencing
All pathogenic variants were evaluated in non-cancerous tissue (FFPE with 0% content of tumor cells) or in blood. Sanger sequencing was carried out using an Applied Biosystems SeqStudio Genetic Analyzer (Thermo Fisher Scientific, Stockholm, Sweden).

Quality Control Parameters
Overall, 74/75 sequenced samples had uniformity of coverage above 71.5% (mean 80.8%); one sample had 62.3%. Uniformity of coverage for Illumina platforms is defined as the percentage of targeted base positions at which the read depth is greater than 0.2 times the mean target region coverage depth. The mean base quality for samples assessed with the OncoKDM platform was 68.8. In addition, 13.3% of the 75 analyzed samples, selected at random, were checked with additional applications (FastQC, Flagstat and samtools stats) to control the quality of the FASTQ files and the alignment.
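The uniformity of coverage definition given above translates directly into a short calculation. The sketch below is only an illustration of that definition, assuming per-base read depths over the targeted positions are already available as a list of numbers; the function name and the example depths are hypothetical, and this is not the Illumina implementation.

```python
from typing import Sequence


def uniformity_of_coverage(depths: Sequence[float], fraction: float = 0.2) -> float:
    """Percentage of targeted positions with depth > fraction * mean depth.

    `depths` holds one read depth per targeted base position.
    `fraction` defaults to 0.2, the factor quoted in the text.
    """
    if not depths:
        raise ValueError("at least one targeted position is required")
    mean_depth = sum(depths) / len(depths)
    cutoff = fraction * mean_depth
    passing = sum(1 for depth in depths if depth > cutoff)
    return 100.0 * passing / len(depths)


if __name__ == "__main__":
    # Invented depths for five positions: mean = 80, cutoff = 16,
    # so four of five positions pass.
    example_depths = [100, 120, 80, 10, 90]
    print(f"{uniformity_of_coverage(example_depths):.1f}%")
```

With this definition, a single poorly covered amplicon lowers the value in proportion to the number of target bases it spans.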

Statistical Analysis
The Mann-Whitney U test was used for statistical analysis.
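For readers unfamiliar with the test, the following is a minimal pure-Python sketch of a two-sided Mann-Whitney U test using average ranks for ties and the normal approximation with tie correction (no continuity correction). The text does not specify the software used or exactly which groups were compared, so the function name, the grouping (age at diagnosis in carriers versus non-carriers) and the example ages are assumptions made purely for illustration. For small samples an exact test from a statistics package would normally be preferred.

```python
import math
from typing import Sequence, Tuple


def mann_whitney_u(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float, float]:
    """Return (U_x, U_y, two-sided p-value) via the normal approximation."""
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both groups must be non-empty")
    # Pool the observations, remembering group membership (0 = x, 1 = y).
    combined = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    n = n1 + n2
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and combined[j + 1][0] == combined[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[k] = average_rank
        t = j - i + 1  # size of this group of tied values
        tie_term += t ** 3 - t
        i = j + 1
    rank_sum_x = sum(r for r, (_, group) in zip(ranks, combined) if group == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    u_y = n1 * n2 - u_x
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u_x, u_y, 1.0  # all values tied: no evidence of a difference
    z = (u_x - mean_u) / math.sqrt(var_u)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return u_x, u_y, p_value


if __name__ == "__main__":
    # Invented ages at diagnosis, NOT study data.
    ages_carriers = [31, 35, 38, 40, 42]
    ages_non_carriers = [39, 44, 47, 52, 58, 61]
    u_x, u_y, p = mann_whitney_u(ages_carriers, ages_non_carriers)
    print(f"U_x = {u_x}, U_y = {u_y}, p = {p:.4f}")
```

The result would be compared against the chosen significance level, which in this study was α = 0.05.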

Patient Characteristics
The median age was 43 years at the time of diagnosis (75 patients with breast cancer enrolled in the study were negative for the 5 most common BRCA1 mutations). Pathogenic variants were detected predominantly in patients diagnosed with breast cancer who were under 46 years of age (n = 8), and this prevalence observed in the studied subpopulation of 25-69 years of age was statistically significant (Mann-Whitney U test, α = 0.05) (Table 1). Based on the histopathological results, we categorized patients with breast cancer into the following groups: (i) triple-negative patients (ER-, PR-, HER2-; TNBC), (ii) (ER+, PR+, HER2+), (iii) (ER+, PR+, HER2-), (iv) (ER-, PR-, HER2+), (v) other subtypes. As expected, the highest frequency of pathogenic variants of BRCA1/2 was detected in the TNBC patients (23%) (Table 2). Among the examined patients, the two most frequent breast cancer subtypes were ER+, PR+, HER2- (n = 42) and TNBC (n = 15). A pathologist qualified the FFPE material, which had a varied percentage of tumor cells, ranging from 5% to 80%. Eight detected pathogenic variants were present in samples with a cancer cell content below 41%. However, the majority of the detected pathogenic variants (8/9) were germline, and tumor cell content was not crucial at the moment of detection in the studied population.

Characteristics of Pathogenic Variants
We detected nine pathogenic variants in the BRCA1 and BRCA2 genes in 8 out of 75 patients (10.7%). Among them, eight were missense and one was a frameshift variant (Table 3). All detected pathogenic variants were present in the coding regions (Figure 1). In the TNBC patients, we identified the pathogenic variants c.4752C>G (BRCA1) in two cases and c.1687C>T (BRCA1) in one case. The frequency of the detected pathogenic variants in these samples was 66.6%, 79% and 47%, respectively.
In algorithm (1), the classification of variants was fully consistent, independently of the application used for VCF file annotation (Illumina Variant Studio 3.0 and EntroGen EVA1.1: results consistent for all 75 samples; and NGeneAnalySys: results consistent for 8 samples with pathogenic variants detected). When the results obtained using algorithms (1) and (2) were compared, detections of pathogenic variants were consistent in 98.7% (two variants were not detected by the OncoKDM platform during testing, but subsequently, OncoDNA improved its database, and since October 2019, there has been a 100% correlation).

Discussion
We detected nine pathogenic variants in 8 (10.7%) out of 75 selected breast cancer patients from families with hereditary breast and ovarian cancer syndrome, whose genetic tests for five BRCA1 mutations (c.5266dupC, c.181T>G, c.4035delA, c.68_69delAG, c.3700_3704delGTAAA) were negative. It is challenging to compare the frequency of mutations detected in selected groups because of the different study inclusion criteria (e.g., differences in the preliminary assessment of pathogenic variants used to exclude patients from a study).
According to Kluska et al., pathogenic and likely pathogenic germline BRCA1/2 variants were detected in the blood in 7.1% of 367 patients with familial breast cancer in the Polish population [31]. Although the percentage of pathogenic and likely pathogenic variants detected was slightly lower, the authors performed a broader preliminary assessment and genotyped 11 known pathogenic variants in BRCA1 and nine in BRCA2 [31]. None of the preliminary variants tested by Kluska et al. were present in our patient groups. In another study, Kowalik et al. revealed only the percentage of germline variants (pathogenic, variant of uncertain significance (VUS), and benign) detected in patients with breast and/or ovarian cancer (12.8%) and did not specify the percentage of germline variants in patients with breast cancer [33].
It has been shown that breast cancer with a BRCA1 mutation is more often associated with medullary-like histopathology, TNBC and basal phenotype [34]. Our data and those of Kowalik [33] indicate that pathogenic BRCA1/2 variants are more common in TNBC than in other types of breast cancer. We showed that pathogenic and likely pathogenic BRCA1/2 variants detected in TNBC accounted for 44.4% of all cases in the studied population (4/9). In the Świętokrzyskie Oncology Center in the south of Poland, 37% (7 out of 19) of germline pathogenic BRCA1/2 variants detected in breast cancer were reported in TNBC patients [33]. The differences might be due to the group size and patient inclusion criteria, but our result of over 44% of pathogenic or likely pathogenic variants in TNBC indicates the importance of referring TNBC patients.
In the present study, we showed that the prevalence of pathogenic BRCA1/2 variants in the TNBC group of patients was 26.7% (4/15). In an unselected Australian TNBC patient cohort, where 59% did not have any family history of breast or ovarian cancer, only 9.3% were found to have germline pathogenic BRCA1/2 variants [32]. These results show the importance of the selection of enrolled patients.
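The proportions quoted in this section can be rechecked with a few lines of arithmetic. The sketch below uses only counts stated in the text (8 of 75 patients, 4 of 15 TNBC patients, 4 of 9 pathogenic variants); the helper name and dictionary labels are illustrative.

```python
def percent(numerator: int, denominator: int, digits: int = 1) -> float:
    """Return numerator/denominator as a percentage rounded to `digits` decimals."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return round(100.0 * numerator / denominator, digits)


# Counts as reported in the text.
REPORTED_COUNTS = {
    "patients carrying a pathogenic variant": (8, 75),
    "TNBC patients carrying a pathogenic variant": (4, 15),
    "pathogenic variants found in TNBC patients": (4, 9),
}

if __name__ == "__main__":
    for label, (k, n) in REPORTED_COUNTS.items():
        print(f"{label}: {k}/{n} = {percent(k, n)}%")
```

The two TNBC figures answer different questions: 4/15 is the prevalence of carriers among TNBC patients, whereas 4/9 is the share of all detected pathogenic variants that occurred in TNBC patients.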
Interestingly, in sample BR3/18, two pathogenic BRCA2 variants were detected: germline NM_000059.3:c.5645C>A in exon 11 and somatic NM_000059.3:c.9097dupA in exon 23 (Table 2). Both mutations cause premature translation termination (stop codon); therefore, the somatic mutation will probably not affect treatment with PARP inhibitors. Nevertheless, the latest reports describe acquired reversion mutations after platinum-based chemotherapy and PARPi therapy [35,36]. These somatic mutations (in the presence of a germline BRCA mutation) restore BRCA1/2 function and are associated with a decreased response to current treatment and with tumor progression [35,36]. The possibility of detecting reversion mutations is one more argument for the superiority of BRCA1/2 sequencing in tumor tissue. On the other hand, we must also admit that this single-nucleotide duplication was present in a homopolymer region. Additional Sanger sequencing could be performed on tissue to better discriminate real somatic changes from a false-positive signal; however, 35% tumor content together with a homopolymer region makes this challenging.
In breast cancer patients, other molecular changes, such as epigenetic silencing of BRCA1/2 and other genes, may be present. Esteller et al., in 2000, reported that hypermethylation and inactivation of BRCA1 was detected in 13% of sporadic breast tumors [42]. Other potential changes are large insertions, deletions or structural rearrangements [43]. As mentioned earlier, large genomic rearrangements (LGRs) in the BRCA1/2 genes are present in a small percentage of patients with breast and ovarian cancers, but the method used in the present study does not allow them to be detected. In general, LGRs are not tested in the Polish population because of their low frequency (3.1-3.7%) [21,22].
Interestingly, in six samples, we detected VUSs. Previous studies have shown that co-segregation analysis and family history in a large cohort [44], as well as protein structure [45], in addition to the increasing number of functional studies, the development of computational prediction algorithms and database enlargement, allow VUSs to be reclassified as well [46,47].
The lack of unified reporting and VUS reclassification guidelines complicates comparisons between similarly designed studies within populations, in which the NGS technology was used, and encourages caution in extracting data from publications. For example, two Polish patient populations were examined at different oncology centers: Kowalik et al. [33] in 2018 considered pathogenic, VUS and benign variants using the ACMG recommendation, whereas Kluska et al. [31] in 2015 mainly used BIC, Condel Score and a literature search and did not mention VUS as a category separate from pathogenic and likely pathogenic variants in their study report. Depending on the biological material, variants detected can be classified using the ACMG guidelines (2015) [48] or TIER classification (2017) [49]. Another major difficulty concerning comparisons of studies is the different inclusion criteria for individuals selected for the study and the fact that some authors do not indicate whether the detected variants are germline or somatic [50].
In the present study, we identified one somatic variant in a tissue with 35% of tumor cells (Table 3), which is slightly lower but still in concordance with the data of other groups. Kowalik et al. reported that the lowest percentage of tumor cells with detectable pathogenic variants was 40% in ovarian cancer tissue samples [51], while Ellison et al. suggested that the starting material should contain at least 10% of tumor cells in order to detect low-frequency somatic variants [52].
NGS data analysis is the next key step. Laboratories use different programs and applications for secondary and tertiary analysis, depending on funds, experience and the number of tests performed. Various analysis algorithms, filters used in secondary analysis, strand biases, unbalanced strand mapping and variant calling can cause variability in the results of analyses [53,54]. Tools used in the analysis of sequencing data are crucial for the appropriate identification and interpretation of variants. It is important to choose the right tools and algorithms for secondary analysis. In our study, we observed differences between the variant call format (VCF) files obtained from the same FASTQ files by means of different variant calling algorithms for sample BR53/18 (Table 3). The accuracy (proportion of reads that are correctly mapped) and sensitivity (proportion of reads mapped to the reference genome) of the tools used should be taken into account [55]. Tertiary analysis performed with different tools is more likely to have the same outcome if files from the secondary analysis of the same type are used.
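A comparison of call sets from two secondary-analysis pipelines, as discussed above, can be sketched as a set comparison per sample. This is an illustrative sketch only: it assumes that variants have already been parsed from the VCF files and normalized to (chromosome, position, reference allele, alternate allele) tuples, and the function names, sample identifiers and coordinates are hypothetical placeholders, not study data or real genomic positions.

```python
from typing import Dict, Set, Tuple

Variant = Tuple[str, int, str, str]  # (chromosome, position, ref, alt)
CallSets = Dict[str, Set[Variant]]   # sample ID -> set of called variants


def compare_call_sets(calls_a: CallSets, calls_b: CallSets) -> Dict[str, Dict[str, Set[Variant]]]:
    """For each sample, split variants into shared, only-in-A and only-in-B."""
    report = {}
    for sample in sorted(set(calls_a) | set(calls_b)):
        a = calls_a.get(sample, set())
        b = calls_b.get(sample, set())
        report[sample] = {"shared": a & b, "only_a": a - b, "only_b": b - a}
    return report


def sample_concordance(report: Dict[str, Dict[str, Set[Variant]]]) -> float:
    """Fraction of samples for which both pipelines produced identical calls."""
    if not report:
        raise ValueError("no samples to compare")
    concordant = sum(
        1 for entry in report.values() if not entry["only_a"] and not entry["only_b"]
    )
    return concordant / len(report)


if __name__ == "__main__":
    # Placeholder sample IDs and coordinates for illustration only.
    pipeline_a = {
        "S1": {("chr17", 100, "C", "G")},
        "S2": {("chr13", 200, "C", "A"), ("chr13", 300, "T", "TA")},
        "S3": set(),
        "S4": set(),
    }
    pipeline_b = {
        "S1": {("chr17", 100, "C", "G")},
        "S2": {("chr13", 200, "C", "A")},  # the insertion is missing here
        "S3": set(),
        "S4": set(),
    }
    report = compare_call_sets(pipeline_a, pipeline_b)
    print(f"Concordant samples: {sample_concordance(report):.2%}")
    print("Discordant in S2:", report["S2"]["only_a"])
```

Exact tuple matching is a deliberate simplification. Insertions and deletions in particular may be represented differently by different callers, so a real comparison would need the variants to be left-aligned and normalized first.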

Conclusions
In conclusion, our study has important value on a national scale owing to the well-selected population. We detected nine pathogenic variants in 8 out of 75 selected patients (10.7%), of which one was somatic and eight were germline variants. Our results underline the importance of validating the NGS workflow, including the bioinformatic procedure, the importance of screening for both somatic and germline mutations, and the role of additional susceptibility genes in breast cancer. We present a preliminary step to estimate the size of the eligible population for PARPi treatment. Moreover, our study, alongside other similar ones, underlines the need to identify additional breast cancer susceptibility genes, particularly to explain the high percentage of non-BRCA-related hereditary cases. The next-generation sequencing technology offers new hope for this purpose.