Next-generation sequencing in familial breast cancer patients from Lebanon

Familial breast cancer (BC) represents 5 to 10% of all BC cases. Mutations in the two high-susceptibility genes BRCA1 and BRCA2 explain 16–40% of familial BC, while other high-, moderate- and low-susceptibility genes explain up to 20% more of BC families. The prevalences of BRCA1 and BRCA2 deleterious mutations previously reported in Lebanon (5.6% and 12.5%) were lower than those reported in the literature. In the present study, 45 Lebanese patients with a reported family history of BC were tested using whole exome sequencing (WES) followed by Sanger sequencing validation. Nineteen pathogenic mutations were identified in 13 different genes: ABCC12, APC, ATM, BRCA1, BRCA2, CDH1, ERCC6, MSH2, POLH, PRF1, SLX4, STK11 and TP53. In this first application of WES to BC in Lebanon, we detected six BRCA1 and BRCA2 deleterious mutations in seven patients, for a total prevalence of 15.5%, a figure that is lower than those reported in the Western literature. The p.C44F mutation in the BRCA1 gene appeared twice in this study, suggesting a founder effect. Importantly, the overall mutation prevalence was 40%, justifying the urgent need to deploy WES for the identification of genetic variants responsible for familial BC in the Lebanese population.

In Lebanon, BC is the most common cancer type in females and constitutes one-third of all reported cancer cases. BC incidence rates are expected to reach 137 per 100,000 by 2018 [16]. Yet, to date, only two studies have investigated the role of BRCA1 and BRCA2 mutations in the Lebanese population. These studies reported prevalences of pathogenic BRCA mutations ranging from 5.6% to 12.5% in BC cases [17,18]. The reported prevalences of both BRCA1 and BRCA2 deleterious mutations were lower than those reported for Western populations, which suggests the involvement of other genes in the pathogenesis of BC [19]. The reported low prevalence does not support the hypothesis that BRCA1 and BRCA2 mutations alone are responsible for the majority of early-onset BC cases observed in Lebanese women. This finding is consistent with BC being a disease with a high level of genetic heterogeneity in which both monogenic and polygenic models of inheritance may exist.
Since the completion of the Human Genome Project, massive leaps have reshaped the field of clinical genomics. The development of next-generation sequencing (NGS) platforms has allowed a more robust, fast and accurate analysis of diseases and syndromes of a polygenic nature. NGS approaches, including WES, are believed to improve the diagnosis of, and therapy development for, many diseases including BC [20][21][22][23].
In the present study, we utilized WES to investigate germline genetic variations in 45 Lebanese cases diagnosed with familial BC and of unknown BRCA1 or BRCA2 status. We found several rare variants that can potentially explain BC susceptibility in the analyzed cases.

Inclusion criteria
From 2012 to 2015, 45 unrelated patients with inherited BC were selected to undergo DNA testing. They were referred because of hereditary BC from a wide variety of settings all over the country, ranging from private physicians' clinics to major academic medical centers. All patients had a personal history of invasive BC and met at least one of the following criteria: A) diagnosis at age ≤ 40 years, B) BC at any age at onset with at least 2 first- and/or second-degree relatives, C) BC < 50 years in a first- or second-degree relative, D) ovarian cancer in at least 2 first- and/or second-degree relatives, E) breast and ovarian cancer in at least 2 first- and/or second-degree relatives, F) both breast and ovarian cancer in a single first- or second-degree relative.
Approval to conduct the study was obtained from the Ethics Committee of Saint-Joseph University-Lebanon. All patients signed the informed consent and agreed to share their variant data. Once all ethical requirements were fulfilled, 10 ml of peripheral blood was collected from each enrolled individual and DNA was extracted using the salting-out method [24].

Whole exome sequencing
Exon capture and sequencing: Samples were prepared for whole exome sequencing and enriched according to the manufacturer's standard protocol. The concentration of each library was determined using Agilent's QPCR NGS Library Quantification Kit (G4880A). Samples were pooled prior to sequencing, with each sample at a final concentration of 10 nM. Sequencing was performed on the Illumina HiSeq2000 platform using TruSeq v3 chemistry.
Mapping and alignment: Read files (FASTQ) were generated from the sequencing platform via the manufacturer's proprietary software. Reads were aligned to the hg19/b37 reference genome using the Burrows-Wheeler Aligner (BWA) package v0.6.1 [25]. Local realignment of the mapped reads around potential insertion/deletion (Indel) sites was carried out with the Genome Analysis Tool Kit (GATK) v1.6 [26]. Duplicate reads were marked using Picard v1.62. Additional BAM file manipulations were performed with Samtools 0.1.18 [27]. Base quality (Phred scale) scores were recalibrated using GATK's covariance recalibration. SNP and Indel variants were called for each sample using the GATK Unified Genotyper [28]. SNP novelty was determined against dbSNP. A list of 134 genes known to be associated with hereditary BC and other cancers was studied (Additional file 1).
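The last step described above, restricting the called variants to the 134-gene panel, can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the record layout (dictionaries with a `gene` key), the function name `filter_to_panel` and the three-gene example panel are assumptions made for the example; the actual panel is the one given in Additional file 1.

```python
from typing import Iterable


def filter_to_panel(variants: Iterable[dict], panel_genes: set[str]) -> list[dict]:
    """Keep only the variants whose gene symbol is on the panel (case-insensitive)."""
    panel = {gene.upper() for gene in panel_genes}
    return [v for v in variants if v.get("gene", "").upper() in panel]


# Hypothetical example data; the real panel contains 134 genes.
example_panel = {"BRCA1", "BRCA2", "TP53"}
example_calls = [
    {"gene": "BRCA1", "change": "p.C44F"},
    {"gene": "TTN", "change": "off-panel placeholder"},
    {"gene": "tp53", "change": "c.673-36G>C"},
]

kept = filter_to_panel(example_calls, example_panel)
print([v["gene"] for v in kept])
```

Matching on upper-cased symbols is a design choice of this sketch; a production filter would more likely match on genomic coordinates or stable gene identifiers.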

Variants evaluation
The variants obtained were reported using five categories according to the Human Gene Mutation Database (HGMD Professional) [29]. These categories are listed in Table 1.
The first variant category consists of alleles labeled as disease-causing mutations (DM) in HGMD Professional. These alleles must be rare: <1% allele frequency in 6,500 exomes from the National Heart, Lung, and Blood Institute (NHLBI) Exome Sequencing Project ("Exome Variant Server" 2015) and in the 1000 Genomes Project [30].
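The rule for this first category can be written as a small predicate. The sketch below is only an illustration under stated assumptions: the function name `is_rare_dm`, its argument names, and the choice to treat an allele absent from a dataset (`None`) as having frequency 0 are not taken from the study; only the "DM" label and the <1% threshold come from the text.

```python
RARE_AF_THRESHOLD = 0.01  # <1% allele frequency, as stated in the text


def is_rare_dm(hgmd_class, esp_af, kg_af, threshold=RARE_AF_THRESHOLD):
    """Return True if an allele is labeled 'DM' in HGMD and is rare in both
    reference datasets (NHLBI ESP and 1000 Genomes).

    Assumption of this sketch: a frequency of None means the allele was not
    observed in that dataset and is treated as 0.
    """
    if hgmd_class != "DM":
        return False
    freqs = [0.0 if af is None else af for af in (esp_af, kg_af)]
    return all(af < threshold for af in freqs)


print(is_rare_dm("DM", 0.0005, None))   # rare in ESP, absent from 1000 Genomes
print(is_rare_dm("DM", 0.02, 0.001))    # too common in ESP
print(is_rare_dm("DM?", 0.0, 0.0))      # not a plain DM label
```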

Patient characteristics and sequencing statistics
The mean age at diagnosis of BC for the 45 patients was 44 years (range 29-79). Sixteen patients provided us with their histopathological results. Seven patients had estrogen-receptor (ER) and progesterone-receptor (PR) positive disease, 5 had ER- and PR-negative disease and 2 had ER-negative, PR-positive disease. Two patients had triple-negative disease, one of whom (Family 30) carried the p.C44F mutation in BRCA1 (Fig. 1).
We obtained an average of 44 million reads per sample, with a mean target coverage of 94% at a mean depth of 20X.

WES analysis
Within this cohort, a total of 126 variants were detected by WES; these are listed in Table 2. In 7 of the 45 patients (not listed in Table 2), no variants in cancer-predisposing genes (Additional file 1) were identified.
We detected 19 HGMD DM variations, of which 9 are specifically associated with breast cancer (Table 2). The distribution of the remaining variants in the HGMD categories was: 11 DM?, 11 DP, 1 FP, and 9 DFP. In addition, 75 novel variations were detected in this study (Table 2).
Six BRCA1 and BRCA2 DM mutations were detected in 5 and 2 patients, respectively, for a total prevalence of 15.5% (Table 2). Nine truncating mutations were detected in 9 different patients (Table 2). Three of these mutations were DM in HGMD: the first woman carried p.R1443* in BRCA1, the second carried p.V220I* in BRCA2 and the third carried p.G164X in ABCC12 (Table 2). The six remaining truncating mutations were not found in HGMD: p.Q613X in SLX4, p.R170X in ERCC3, p.Q117X in EZH2, p.P742fs in NSD1, p.357_364del in BARD1 and p.L1697fs in BRCA1 (Table 2).
In families where several different variants were found, we analyzed the co-segregation of these variations with the cancer phenotype in order to assess which variant is pathogenic. This was done in 3 families: 12, 13, and 32 (Figs. 1 and 2).
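As a quick consistency check, the per-category counts given here can be summed and compared with the 126 variants reported for the cohort. The dictionary below simply restates the numbers from the text; the variable names are illustrative.

```python
# Variant counts per category, as reported in the text.
category_counts = {
    "DM": 19,
    "DM?": 11,
    "DP": 11,
    "FP": 1,
    "DFP": 9,
    "novel": 75,
}

total_variants = sum(category_counts.values())
print(total_variants)  # 126, matching the total reported for the cohort
```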
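The prevalence figure follows directly from the carrier counts: (5 + 2) / 45 ≈ 15.56%, reported as 15.5%. With a cohort of 45, the uncertainty around such an estimate is wide. The sketch below computes the point estimate and, as an addition that is not part of the study, a 95% Wilson score interval; the function names are illustrative.

```python
import math


def prevalence(carriers: int, cohort_size: int) -> float:
    """Proportion of carriers in the cohort."""
    return carriers / cohort_size


def wilson_interval(carriers: int, cohort_size: int, z: float = 1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 for ~95%)."""
    n = cohort_size
    p = carriers / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


# 5 BRCA1 carriers and 2 BRCA2 carriers among 45 patients.
brca_prevalence = prevalence(5 + 2, 45)
low, high = wilson_interval(5 + 2, 45)
print(f"{brca_prevalence:.1%} (95% CI {low:.1%} to {high:.1%})")
```

The Wilson interval is used here only because it behaves reasonably for small samples; other interval methods would give somewhat different bounds.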
Two members of family 12 were diagnosed with BC; their mother and maternal uncle were diagnosed with primary lung cancer and bone cancer, respectively. The nonsmoking mother was affected at the age of 63, but the age of the maternal uncle at diagnosis was not available. WES in proband 12/B35, diagnosed with BC at the age of 42, identified 2 variants: p.I94L in RAD50, classified as DM? in the HGMD Professional database, and one novel variation, p.G191R in ARL11 (Fig. 1).
Six members of family 32 were diagnosed with BC (Fig. 2). Members III-3, III-4 and III-6 were diagnosed with BC at the ages of 56, 48 and 50, respectively. WES in proband III-4 identified 2 relevant variants: p.M1I in CDH1 and p.T1354M in BRCA2. The prediction tool SIFT indicated that both changes are damaging, and both are DM according to the HGMD Professional database (Table 2 and Fig. 2). The analysis of this family showed that these variations were carried both by affected members and by siblings who are not affected to date (Fig. 2). The latter were advised to join our screening program.
We noted that the most frequently altered genes in our familial cases are DNA repair genes (Fig. 3a) and that some variants were recurrent in our cohort: p.W149X in ARL11, p.S836S in RET, p.A126T in RAD51C, p.T241M in XRCC3, p.G998E in PALB2 and c.673-36G > C in TP53 (Table 2 and Fig. 3b). In four cases, as in the 4 families shown in Fig. 1, individuals appear to co-inherit multiple cancer-causing or cancer-predisposing gene mutations. Unlike the old strategy, in which the investigation stopped once a pathogenic mutation was identified, NGS makes it possible to collate all known mutations/variants in a sample, which may permit a more comprehensive understanding of the polygenic landscape model of cancer. An important question to be answered is: does an individual in Family 13 harboring all three DM mutations have a different penetrance, genotype-to-phenotype correlation, type of cancer or age of onset than a sibling with only one DM variant? This critical question can only be answered when all germline variant data of cancer patients and their comprehensive phenotypes from around the world are combined in well-curated databases.
In the Lebanese population, the p.C44F mutation in the BRCA1 gene was found twice in this study and 5 times in previous studies [17,18], for a total of 7 of 367 cases studied (1.9%). In fact, 2 of 9 patients carried a
In families 23 and 35, we identified the truncating mutation p.357_364del in BARD1 (Table 2). A previous study of this variation showed the absence of cosegregation with the disease, and it was considered a neutral polymorphism [35]. Because we observed this variant in breast cancer patients from our population, we recommend that a more thorough functional examination of this variant be conducted in the future.
In families 12, 13 and 32, we identified 7 variants in ARL11, BRCA1, BRCA2, CDH1, RAD50, SLX4, and STK11. Which of these variations increases predisposition to BC was unknown; we therefore analyzed the segregation of these variations with BC within the families. In family 13, only p.C44F in BRCA1 segregated with BC. In family 12, p.I94L in RAD50 (a DM? mutation) was found in both affected and healthy sisters, so no conclusion could be drawn regarding predisposition to BC. In family 32, p.M1I in CDH1 and p.T1354M in BRCA2 are implicated in gastric cancer and BC, respectively; given that the family presented with BC only, two hypotheses can be formulated. First, III-6 can be considered a phenocopy; second, sisters III-5, III-7 and III-9, who are healthy to date, are at high risk (Fig. 2). In fact, in high-risk families, women testing negative for the familial BRCA mutation have an increased risk of BC and should be considered for continued surveillance [36]. Interestingly, two members of this family, III-4 and III-6, presented with invasive lobular breast cancer (Fig. 2). The association between CDH1 gene mutation and lobular cancer has been well established previously [37], and it is not unrealistic to suggest that this CDH1 variant may be the cause of lobular breast cancer in this family.
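The segregation reasoning used above can be expressed as a simple decision rule. The sketch below is a deliberately simplified illustration, not the authors' method: the record layout, the function name `segregation_summary`, the member identifiers and the returned labels are all assumptions, and it ignores incomplete penetrance and age at testing, both of which matter in practice.

```python
def segregation_summary(members):
    """Classify how a variant tracks with disease among tested relatives.

    members: list of dicts with 'affected' (bool) and 'carrier'
    (True, False, or None when the person was not tested).
    """
    tested = [m for m in members if m["carrier"] is not None]
    affected_carriers = [m for m in tested if m["affected"] and m["carrier"]]
    affected_noncarriers = [m for m in tested if m["affected"] and not m["carrier"]]
    unaffected_carriers = [m for m in tested if not m["affected"] and m["carrier"]]

    if not affected_carriers:
        return "no affected carrier tested"
    if affected_noncarriers:
        return "does not fully segregate (possible phenocopy)"
    if unaffected_carriers:
        return "inconclusive (carried by unaffected relatives)"
    return "segregates with disease"


# Hypothetical pedigree resembling the situation described for family 12:
# the variant is present in an affected sister and in a healthy sister.
family_like_12 = [
    {"id": "sister_affected", "affected": True, "carrier": True},
    {"id": "sister_healthy", "affected": False, "carrier": True},
    {"id": "untested_relative", "affected": True, "carrier": None},
]
print(segregation_summary(family_like_12))
```

An unaffected carrier does not exclude pathogenicity, since she may simply not have developed the disease yet; this is why the sketch reports such cases as inconclusive rather than negative.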
The pathogenic status of the majority of the novel substitutions found, and of the 6 variations considered as DM? according to HGMD Professional, remains problematic (Table 2). HGMD Professional uses DM? for a likely pathological mutation that is reported to be disease causing in the corresponding report, but for which the author has indicated that there may be some degree of doubt, or for which subsequent evidence has come to light in the literature calling the deleterious nature of the variant into question [29]. Further studies are needed to define the pathogenic status of the novel substitutions and the DM? variations found in our cohort of patients with BC. These future studies should be conducted in a larger number of affected families and in control population samples.
NGS and traditional sequencing methods are not well suited to detecting BRCA genomic rearrangements, including large deletions or duplications. Deletion and duplication genomic rearrangements vary significantly among countries and within ethnic groups [38]. We acknowledge, therefore, that our reported BRCA mutation prevalence is underestimated.
Among the DM mutations found, several were associated with syndromes (Peutz-Jeghers), different cancer types (renal cell carcinoma, gastric cancer) and other diseases (xeroderma pigmentosum, ataxia telangiectasia) (Table 2). Clinically, none of the symptoms found in these diseases were manifested in the studied families, except for family 24. In this family, proband 24/B49 carried the mutation p.R1443* in BRCA1 and two MSH2 variants (Fig. 1). Her mother had ovarian cancer and her sister uterine cancer; both are deceased and could consequently not be tested for these variants. MSH2 mutations have been reported in families from Kuwait with endometrial cancer (Lynch syndrome) and breast cancer [39].
This is the first application of NGS to BC in Lebanon. In this study, we showed that the prevalence of deleterious BRCA mutations (15.5%) is lower than expected [17,18] and that the overall mutation prevalence is 40%, justifying the urgent need for the adoption of high-throughput NGS technologies to identify genes responsible for familial BC in the Lebanese population. Indeed, in addition to BRCA mutations, highly penetrant mutations in genes associated with various hereditary cancer syndromes, such as CDH1, TP53, MSH2, ATM and POLH, were found in the Lebanese population. Finally, we cannot rule out that, for some of these families, the explanation lies in a polygenic model in which moderate- and low-penetrance alleles, acting together, may play a predominant role [20,40,41]. Our findings support performing genetic testing by massively parallel sequencing on Lebanese familial BC cases. Moreover, we would like to use this technology for tumor genome sequencing in order to identify somatic alterations, which would provide valuable guidance towards individualized cancer therapy for Lebanese patients with BC. However, it is worth noting that our study reports only a small number of variants that are clinically actionable.
Given the high rate of novel variants identified in BRCA1/2 and other breast cancer-associated genes, the clinical usefulness of the data is currently limited. Unless larger and rigorous studies are undertaken in this area of the world to correctly classify the variants identified here or in other studies, the diagnosis and treatment of breast cancer will remain suboptimal.

Conclusion
This is the first study to utilize NGS technology to study genetic variants in patients with familial breast cancer from Lebanon, here a cohort of 45 patients. The deleterious mutation prevalence was 40%, with only 15.5% accounted for by the BRCA1 and BRCA2 genes. These data should encourage a different strategy for familial breast cancer genetic screening in Lebanon, one that is based on WES rather than on the initial screening of BRCA1/2 genes. We report here novel and rare variants in breast cancer predisposing genes, which will be valuable to researchers and clinicians around the world for variant classification and for patient care in general.