Prevalence of RECQL germline variants in Pakistani early-onset and familial breast cancer patients

The RecQ Like Helicase (RECQL) gene has previously been shown to predispose to breast cancer mainly in European populations, in particular to estrogen receptor (ER) and/or progesterone receptor (PR) positive tumors. Here, we investigated the contribution of pathogenic RECQL germline variants to hereditary breast cancer in early-onset and familial breast cancer patients from Pakistan. Comprehensive RECQL variant analysis was performed in 302 BRCA1 and BRCA2 negative patients with ER and/or PR positive breast tumors using denaturing high-performance liquid chromatography followed by DNA sequencing. Novel variants were classified using Sherloc guidelines. One novel pathogenic protein-truncating variant (p.W75*) was identified in a 37-year-old familial breast cancer patient. The pathogenic variant frequencies were 0.3% (1/302) in early-onset and familial breast cancer patients and 0.8% (1/133) in familial patients. Further, three novel variants of unknown significance, p.I141F, p.S182S, and p.C475C, were identified in familial breast cancer patients diagnosed at 47, 68, and 47 years of age, respectively. All variants were absent in 250 controls. Our data suggest that the RECQL gene plays a negligible role in breast cancer predisposition in Pakistan.

The RecQ Like Helicase (RECQL) gene was identified in West European and East Asian populations as a candidate breast cancer susceptibility gene [6,7]. It encodes a DNA helicase, which is involved in the repair of DNA double-strand breaks and plays a crucial role in the maintenance of genomic stability. Several studies conducted among unselected breast cancer patients from Belarus and Germany [8] and the USA [9], and among early-onset and familial breast cancer patients from Poland [6], Canada [6], and Australia [10], reported pathogenic RECQL variant frequencies ranging from 0 to 2.6%. Breast tumors associated with pathogenic RECQL variants were predominantly positive for the estrogen and progesterone receptors (ER and PR) [6][7][8][11].
Apart from two studies conducted in an East Asian population from China [7,11], data on the contribution of pathogenic RECQL variants to early-onset and/or familial breast cancer in other Asian regions are lacking. In Pakistan, breast cancer is the most common malignancy and the main cause of cancer-related deaths in women. The burden of breast cancer in terms of estimated age-standardised incidence and mortality rates is 43.9 and 23.2 per 100,000, respectively [12]. Pathogenic variants in high- and moderate-penetrance breast cancer susceptibility genes (BRCA1, BRCA2, TP53, CHEK2, RAD51C, and PALB2) account for about 27% of early-onset and familial breast cancers in Pakistan [13][14][15][16][17], leaving a substantial proportion of cases unexplained. In the present study, we determined the contribution of pathogenic RECQL variants to hereditary breast cancer in 302 early-onset and familial BRCA1 and BRCA2 negative patients with ER positive and/or PR positive breast cancer in a South Asian population from Pakistan.

Study subjects
Patients diagnosed with invasive breast cancer were selected from the institutional registry of genetically enriched breast and ovarian cancer families enrolled at the Shaukat Khanum Memorial Cancer Hospital and Research Centre (SKMCH&RC) in Lahore, Pakistan, from June 2001 to August 2015. All patients fulfilled the inclusion criteria described previously [17,18]. The present study included 302 early-onset and familial breast cancer patients with ER positive and/or PR positive tumors. All study participants had tested negative for pathogenic variants in BRCA1 and BRCA2 [17,18], and about 60% had also tested negative for pathogenic variants in PALB2 (n = 187), TP53 (n = 180), CHEK2 (n = 168), and RAD51C (n = 168) [13][14][15][16] (Muhammad U. Rashid, unpublished TP53 data). We categorized study participants into four risk groups based on age at cancer diagnosis or family history of breast and/or ovarian cancer (Table 1) [17].
The control population comprised 250 healthy women with no family history of breast/ovarian cancer. They were selected from the institutional registry of 1012 female controls enrolled in a Pakistani breast cancer case-control study as previously described [19]. The Institutional Review Board (IRB) of the SKMCH&RC approved the current study (IRB approval number ONC-BRCA-001/2). All study participants signed informed written consent.

Variant screening
The complete coding sequence and exon-intron junctions of the RECQL gene (Genbank accession number NM_002907.3) were screened in the 302 index patients and 250 controls by denaturing high-performance liquid chromatography (DHPLC) analysis. PCR primer details are described elsewhere [7]. When available, a positive control with a known variant was included in each set of DHPLC analyses. Bidirectional DNA sequencing was performed to confirm each variant, as described elsewhere [20].
Variants were classified as pathogenic, likely pathogenic, benign, likely benign, or as variants of uncertain significance (VUS), according to the Sherloc guidelines [21]. Sherloc is a semiquantitative system in which each criterion is awarded a preset number of points on orthogonal benign (1B-5B) or pathogenic (1P-5P) scales using clinical and functional criteria. Point thresholds are 5P and 5B for pathogenic and benign classifications, 4P and 3B for likely pathogenic and likely benign classifications, and < 4P and < 3B for VUS. Pathogenic and benign point scores were calculated separately.
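The point thresholds above can be illustrated with a minimal sketch. It assumes that the pathogenic and benign totals have already been summed for a variant, and it is not the full Sherloc rule set. The function name `classify_sherloc` and the handling of the case in which both scales reach a threshold (returned here as VUS) are assumptions made for illustration only.

```python
def classify_sherloc(p_points: int, b_points: int) -> str:
    """Map summed Sherloc points to a class using the thresholds in the text.

    p_points: total on the pathogenic scale (e.g. 8 for "8P")
    b_points: total on the benign scale (e.g. 8 for "8B")
    """
    # Thresholds taken from the text: 5P / 4P and 5B / 3B.
    if p_points >= 5:
        path_call = "pathogenic"
    elif p_points >= 4:
        path_call = "likely pathogenic"
    else:
        path_call = None

    if b_points >= 5:
        benign_call = "benign"
    elif b_points >= 3:
        benign_call = "likely benign"
    else:
        benign_call = None

    # Assumption for this sketch: evidence reaching a threshold on both
    # scales is treated as conflicting and reported as VUS.
    if path_call and benign_call:
        return "VUS"
    return path_call or benign_call or "VUS"


if __name__ == "__main__":
    print(classify_sherloc(8, 0))  # score reported for p.W75* (8P)
    print(classify_sherloc(1, 8))  # score reported for c.868-2A > G (1P and 8B)
```

With the scores reported in this study, the sketch returns "pathogenic" for 8P and "benign" for 1P and 8B, in agreement with the classifications given in the Results.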

RNA analysis of the c.868-2A > G splice-site variant
Total RNA was extracted using TRIzol reagent (Invitrogen, Carlsbad, CA, USA) from blood samples of the proband and an unaffected sister, both harboring RECQL c.868-2A > G, another unaffected, variant-negative sister, and a variant-negative control. Total RNA was transcribed to cDNA using the RevertAid First Strand cDNA Synthesis Kit (Thermo Fisher Scientific, Vilnius, Lithuania) with random hexamer primers according to the manufacturer's protocol. Reverse transcriptase (RT)-PCR was performed using the forward primer (5′-CAG TTC CCT AAC GCA TCA CT-3′) and reverse primer (5′-TTT CAT TGG CTG ACC ATT TT-3′) located on exon 7 and exon 9 of the RECQL transcript variant 1 (ENST00000444129.7), respectively. PCR reactions were carried out in a 25 μl volume containing 1 μl of the respective cDNA, 1x PCR Gold Buffer (Applied Biosystems, California, USA), 2.5 mM MgCl2, 0.2 μM of each primer, 250 μM of each dNTP (Invitrogen, Carlsbad, CA, USA), and 1 unit AmpliTaq Gold DNA polymerase (Applied Biosystems, California, USA). After an initial denaturation for 15 min at 95°C, cDNA was amplified by 35 cycles of 1 min at 94°C, 1 min at 57.5°C, and 1 min at 72°C, followed by a final extension step of 5 min at 72°C. Five μl of each RT-PCR product was loaded on a 2% agarose gel containing ethidium bromide (Sigma-Aldrich, Steinheim, Germany), and electrophoresis was performed at 140 V for 80 min. The RT-PCR products were confirmed by Sanger sequencing as described previously [20].
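As a quick sanity check on the two RT-PCR primers quoted above, the following sketch computes their length, GC content, and a rough melting temperature. The Wallace rule (2°C per A/T, 4°C per G/C) is a rule of thumb chosen here for illustration; it is not stated to be the method used for primer design in this study, and the helper names are hypothetical.

```python
# Primer sequences as given in the text, with spaces removed.
FORWARD = "CAGTTCCCTAACGCATCACT"
REVERSE = "TTTCATTGGCTGACCATTTT"


def gc_count(seq: str) -> int:
    """Number of G and C bases in the primer."""
    return sum(1 for base in seq.upper() if base in "GC")


def wallace_tm(seq: str) -> int:
    """Rough melting temperature in degrees C: 2*(A+T) + 4*(G+C)."""
    gc = gc_count(seq)
    at = len(seq) - gc
    return 2 * at + 4 * gc


if __name__ == "__main__":
    for name, seq in (("forward", FORWARD), ("reverse", REVERSE)):
        gc_percent = 100 * gc_count(seq) / len(seq)
        print(f"{name}: {len(seq)} nt, GC {gc_percent:.0f}%, "
              f"Wallace Tm ~{wallace_tm(seq)} C")
```

Under this approximation the annealing temperature of 57.5°C used in the protocol lies between the two estimated melting temperatures.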

Characteristics of the study participants
A total of 302 BRCA1 and BRCA2 negative index breast cancer patients were screened for RECQL germline variants. Of these patients, 122 (40.4%) were early-onset breast cancer patients (≤30 years of age), 133 (44.0%) belonged to families with two or more breast cancer cases with at least one case diagnosed at 50 years or younger, 18 (6.0%) belonged to families with both breast and ovarian cancer, and 29 (9.6%) were male breast cancer cases diagnosed at any age (Table 1). Of the index patients, 223 presented with ER positive and PR positive breast tumors, 55 with ER positive tumors, and 24 with PR positive tumors. The mean age of disease presentation was 36.6 years (range 20-78) for female breast cancer (n = 273), and 51.5 years (range 27-73) for male breast cancer (n = 29).

Spectrum of identified RECQL variants
In total, 31 distinct RECQL variants were detected. Of these, 20 were novel: one nonsense variant, one splice-site variant, three missense variants, three silent variants, and twelve noncoding variants (Table 2). The remaining eleven variants were previously reported: three missense variants and eight noncoding variants.

Classification and characteristics of identified RECQL variants
The novel variants were analyzed for their potential functional effect using Sherloc guidelines [21], taking into account a minor allele frequency (MAF) > 1% as evidence for benign variants, as reported in the Genome Aggregation Database (gnomAD) or observed in our study (Table 2), and in silico prediction tools (Table 3). One variant was classified as pathogenic, three as VUS, and 16 as benign/likely benign.
The novel pathogenic RECQL variant is a nonsense variant at nucleotide position 225 in exon 4 (c.225G > A (p.W75*)), which is predicted to result in premature protein termination. It was identified in a 37-year-old familial breast cancer patient (III:3, Fig. 1a) of Punjabi ethnicity and was absent in 250 controls. The patient carrying this variant presented with a grade 3, ER positive and PR positive invasive ductal carcinoma (IDC) with lymph node involvement. The pathogenic variant frequencies were 0.3% (1/302) in early-onset and familial breast cancer patients and 0.8% (1/133) in familial patients. The variant had a Sherloc score of 8P and was classified as pathogenic (Table 4).
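To make the arithmetic behind the reported frequencies explicit, and to show how much uncertainty a single carrier implies, the sketch below recomputes 1/302 and 1/133 and adds a Wilson score interval. The interval is an illustrative addition and was not reported in this study; the function name `wilson_interval` and the choice of z = 1.96 (approximately 95% coverage) are assumptions.

```python
from math import sqrt


def wilson_interval(carriers: int, total: int, z: float = 1.96):
    """Wilson score interval for a binomial proportion carriers/total."""
    p = carriers / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


if __name__ == "__main__":
    # Counts taken from the text: one carrier among all 302 index patients,
    # and one carrier among the 133 familial breast cancer patients.
    for label, k, n in (("all index patients", 1, 302),
                        ("familial patients", 1, 133)):
        low, high = wilson_interval(k, n)
        print(f"{label}: {100 * k / n:.1f}% "
              f"(interval {100 * low:.2f}% to {100 * high:.2f}%)")
```

Because each estimate rests on a single carrier, the intervals are wide relative to the point estimates, which is consistent with the call for larger studies in the Discussion.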

Benign or likely benign variants
One novel variant in a canonical splice acceptor site of intron 7, c.868-2A > G, was detected in a 36-year-old familial (II:4, Fig. 1e), a 61-year-old male (II:8, Fig. 1f), and a 25-year-old female early-onset breast cancer patient (II:9, Fig. 1g) of Punjabi, Urdu speaking and Pathan ethnicity, respectively (1%, 3/302). It was also found in one of the two tested unaffected sisters (II:7, Fig. 1g) of the early-onset patient. Moreover, c.868-2A > G was detected in two controls (0.8%, 2/250). The similar frequencies in cases and controls indicate that this variant is not likely to be pathogenic. A high frequency of the G allele (MAF = 0.5669%) was reported among South Asians (n = 13,582) in gnomAD, which was taken into account in the Sherloc classification. The variant was predicted to have a functional impact by three of five splice-site prediction tools (Table 3).
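The statement that the carrier frequencies in cases (3/302) and controls (2/250) are similar could be checked formally with a two-sided Fisher's exact test. The sketch below is an illustration only; no such test is reported in this study, and the function name is hypothetical. It uses only the Python standard library.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_value = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)  # tolerance for float rounding
    )
    return min(1.0, p_value)


if __name__ == "__main__":
    # Counts from the text: 3 carriers among 302 patients,
    # 2 carriers among 250 controls.
    p = fisher_exact_two_sided(3, 302 - 3, 2, 250 - 2)
    print(f"two-sided p-value: {p:.3f}")
```

A p-value well above conventional significance thresholds would be consistent with the interpretation that the variant is not enriched in cases.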
To assess whether c.868-2A > G affects splicing, we performed RT-PCR analysis of RNA extracted from two variant carriers and two non-carriers (one family member and one control), which revealed the presence of one transcript corresponding to the reference full-length transcript (364 bp) in all samples (Fig. 2a). All transcripts were confirmed by Sanger sequencing (Fig. 2b-e). Thus, this variant may not affect the splicing of RECQL. It had a Sherloc score of 1P and 8B and was classified as benign (Table 4). The remaining eleven variants (three missense variants and eight noncoding variants) have been previously reported as benign/likely benign in the ClinVar database (by April 2020) or in other populations.
Table footnotes (in silico predictions): a, the variant is considered deleterious by six of the six protein function prediction algorithms (coding variants) or three of the five splice-site prediction algorithms (noncoding variants); b, a > 20% change in score (i.e., a wild-type splice-site score decreases and/or a cryptic splice-site score increases) is considered significant; c, the canonical splice acceptor site is abolished (MaxEntScan score +2.46 → -5.49) and a cryptic splice acceptor site is created at c.877.

Discussion
This is the first study to investigate the prevalence of pathogenic RECQL germline variants in BRCA1 and BRCA2 negative high-risk patients (n = 302) with ER positive and/or PR positive breast tumors from Pakistan. We identified a single novel pathogenic RECQL variant. Although several studies have previously been conducted in Europe and two in East Asia, there is still conflicting evidence for a role of RECQL in breast cancer predisposition [6,8,10,23]. Our study provides additional information on the contribution of the RECQL gene to hereditary breast cancer in a South Asian population from Pakistan.
The novel pathogenic RECQL variant, p.W75*, was identified in 0.3% of early-onset and familial breast cancer patients with hormone receptor-positive tumors, but not in controls, suggesting that p.W75* may be disease-causative. In other studies performed in China [7,11], higher pathogenic variant frequencies ranging from 0.54 to 1.6% were observed in BRCA1 and BRCA2 negative early-onset and/or familial breast cancer cases. In Caucasian studies conducted in Australia [10], Canada [6], Poland [6], and the USA [9], similar variant frequencies ranging from 0.1 to 0.4% have been reported in familial breast cancer patients, while no pathogenic variants were detected in studies performed in South-West Poland and West Ukraine [24]. In other Caucasian studies conducted in Belarus, Germany, and Australia, the frequency of pathogenic variants identified in controls was similar to or higher than that in cases [8,10]. Overall, these findings suggest that the role of RECQL as a breast cancer susceptibility gene remains controversial.
Previously, a missense variant (p.R215Q) in the highly conserved RecA-like domain D1 of RECQL (amino acid residues 63 to 281) was reported to disrupt the RECQL helicase activity and was classified as a pathogenic mutation [7]. In the current study, a novel missense variant, p.I141F, in the same domain was found in one familial breast cancer patient (0.3%), but not in controls. It may also affect the ATP-dependent translocation activity of RECQL, leading to disruption of helicase activity [25]. However, functional assays are warranted to confirm this hypothesis. In addition, p.I141F was rare among South Asians in gnomAD. Taken together, population data, variant type, clinical observations, and findings from in silico predictions suggest that p.I141F is best classified as a VUS based on the Sherloc guidelines.
The recurrent splice-site variant, c.868-2A > G, was identified in three breast cancer patients (1.0%) and two controls (0.8%). Its similar frequency in cases and controls indicates that this variant may be benign. This is supported by its high frequency (0.5669%) among South Asians in gnomAD. In addition, RT-PCR analysis revealed that it did not affect RECQL splicing. Thus, based on the Sherloc variant classification guidelines, our data suggest that c.868-2A > G may be benign. However, we cannot exclude that the aberrantly spliced allele escaped detection owing to nonsense-mediated decay, or that other splicing events occurred that were not investigated in the present study.
The ER and PR positive breast tumor of the Pakistani patient with the pathogenic RECQL variant showed high grade and IDC histology. These findings are in line with those from other studies conducted in China [7], Poland [6], Belarus, and Germany [8], further supporting the notion that high grade, hormone receptor-positive breast tumors of IDC histology may be predictors of pathogenic RECQL variant status.
Our study has several limitations. First, although the study was of reasonable size, larger studies are warranted to confirm our findings. Second, mutation analysis was restricted to patients with ER and/or PR positive breast tumors, in whom a predominance of pathogenic RECQL mutations has been reported [6][7][8][11]. Since patients with both ER and PR negative or triple-negative breast tumors were not tested, this restriction may have biased the prevalence of pathogenic RECQL variants reported in this study. Further, the functional analyses of the splice-site variant should be extended in order to confirm its classification as benign.

Conclusion
In summary, we identified a single pathogenic RECQL variant in 302 BRCA1 and BRCA2 negative high-risk patients with ER positive and/or PR positive breast tumors. The frequencies of the novel pathogenic variant were 0.3% (1/302) in early-onset and familial breast cancer patients and 0.8% (1/133) in familial patients. Our data suggest that pathogenic RECQL variants explain a negligible proportion of hereditary breast cancer in Pakistan.