Heterogeneity of germline variants in high risk breast and ovarian cancer susceptibility genes in India

Abstract
Breast and ovarian cancers now account for one in three cancers in Indian women and their incidence is rising. Major differences in the clinical presentation of breast and ovarian cancers exist between India and the United Kingdom. For example, Indian patients with breast cancer typically present a decade earlier than in the UK. Reasons for this could be multifactorial, including differences in underlying biology, environmental risks, and other systematic factors including access to screening. One possible explanation lies in variable incidence or penetrance of germline mutations in genes such as BRCA1 and BRCA2. We performed a methodical database and literature review to investigate the prevalence and spectrum of high-risk cancer susceptibility genes in Indian patients with breast and ovarian cancers. We identified 148 articles, but most studies were small, with inconsistent inclusion criteria and based on heterogeneous technologies, so that mutation frequency could not be reliably ascertained. Data were also often lacking on penetrance, histopathology, and survival outcomes. After filtering out unsuitable studies, only 13 remained, comprising 1028 patients. Large-scale research studies are urgently needed to determine mutation prevalence, spectra, and clinico-pathological features, and hence derive guidelines for screening, treatment, and prevention specific to the Indian population.


Introduction
The global cancer burden is expected to increase from 14.1 million new cases and 8.2 million deaths in 2012 to 21.7 million cases and 13 million deaths by 2030. However, these large numbers are contrasted by the very diverse nature of cancer that makes every patient unique. Precision medicine has enormous potential to transform cancer care by identifying genomic and epigenetic markers for screening, treatment, and prognosis. These gains are particularly relevant for countries such as India, grappling with both a rising cancer burden and competing demands for essential health care. India's cancer burden, currently estimated at over 1.5 million new cases, is predicted to nearly double in the next 20 years, with age-adjusted mortality rates of 64.5 per 100 000 (GLOBOCAN 2012). 1 Cumulatively, breast, cervical, ovarian, and uterine cancers account for more than 70% of cancers in women in India, thus establishing women's cancers as a high priority for healthcare providers and research. 2
Significant phenotypic differences exist in breast and ovarian cancers between patients in India and in the UK. The incidence of breast and ovarian cancer is relatively low in India in comparison with the UK: breast cancer 23.8 versus 92.9 cases per 100 000 women, and ovarian cancer 4.9 versus 11.7 cases per 100 000 women (GLOBOCAN 2012). 1 However, a high proportion (~11-26%) of Indian patients with breast cancer present at ages younger than 35 years. 3 Conversely, approximately half of newly diagnosed breast and ovarian cancer cases occur in women aged 65 years and older in the UK, compared with only 15% in India (Fig. 1). The incidence of the more aggressive histological type of breast cancer, triple-negative disease, is also estimated to be higher at 31% in India, nearly double that of the UK. 5 Breast cancer incidence also varies substantially across India, with age-standardised incidence rates ranging from 41/100 000 in urban centres such as New Delhi to 12.4/100 000 in rural cancer registries, thus adding a further layer of complexity. 6
These phenotypic differences could be a result of differences in tumour biology, such as differences in the incidence of high-risk germline susceptibility genes, environmental modifiers, 7,8 or systematic factors such as access to screening and treatment. Germline mutations in high-risk susceptibility genes (e.g.0][11][12] Women with a germline BRCA1 mutation have a risk of ovarian cancer by age 70 years of up to 63% and of breast cancer by age 70 years of up to 85%. 13 Risks of ovarian and breast cancers by age 70 years among BRCA2 carriers are reported to be up to 27% and 84%, respectively. Other genes in which germline mutations confer susceptibility to breast and/or ovarian cancer, albeit with lower frequency and penetrance, include PALB2, TP53, PTEN, CDH1, STK11, CHEK2, RAD51, and ATM. 14
We systematically reviewed the literature and relevant data repositories to characterise the prevalence and spectrum of germline variants in breast and ovarian cancer susceptibility genes in the Indian population, including putative BRCA1 and BRCA2 founder mutations. We excluded SNPs with high frequency in the population. We investigated the literature for details of clinical, family history, pathology, and survival data in these patients.

Search strategy, inclusion and exclusion criteria
A comprehensive literature search was performed to include articles published between 1 January 1990 and 1 December 2016 using the following search terms on ethnicity, condition, and high penetrance genes (Table 1): 'India and (breast cancer or ovarian cancer) and (BRCA1 or BRCA2 or PALB2 or TP53 or PTEN or CDH1 or STK11 or CHEK2 or RAD51C or RAD51D or ATM or BARD1 or NBN or MLH1 or MSH2 or MSH6 or PMS2 or EPCAM)' in EMBASE and PubMed/Medline to identify relevant published and unpublished studies as well as studies in progress. Further searches were carried out in the BIC 15 database using the keyword 'Indian' in the ethnicity fields and also in the ClinVar database. 16 Additional database searches included the 1000genomes, 16 TCGA, 13 COSMIC, 18 dbSNP, 19 ICGC, 20 HGMD, 21 ExAC, 22 and the GWAS catalog. 23
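The Boolean search string quoted above can be assembled from its three term groups, which makes it easier to audit or extend. The following is a minimal sketch only: the helper names `or_group` and `build_query` are illustrative assumptions, and the exact field syntax accepted by EMBASE or PubMed is not reproduced here.

```python
# Term groups copied from the search string quoted in the text.
ETHNICITY = "India"
CONDITIONS = ["breast cancer", "ovarian cancer"]
GENES = [
    "BRCA1", "BRCA2", "PALB2", "TP53", "PTEN", "CDH1", "STK11",
    "CHEK2", "RAD51C", "RAD51D", "ATM", "BARD1", "NBN",
    "MLH1", "MSH2", "MSH6", "PMS2", "EPCAM",
]


def or_group(terms):
    """Join alternative terms with 'or' and wrap them in parentheses."""
    return "(" + " or ".join(terms) + ")"


def build_query(ethnicity, conditions, genes):
    """Combine ethnicity, condition, and gene groups with 'and' (hypothetical helper)."""
    return " and ".join([ethnicity, or_group(conditions), or_group(genes)])


query = build_query(ETHNICITY, CONDITIONS, GENES)
print(query)
```

Keeping the gene list as data rather than as a hand-typed string reduces the risk of a gene being dropped when the search is re-run at a later date.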
Figure 1. Comparison between the UK and India of newly diagnosed breast and ovarian cancer (BOC) incidence in women by age. 3
Table 1. List of genes, with high and moderate penetrance, used in the search terms in association with breast and ovarian cancer as well as Lynch syndrome. Column headings recovered: Moderate-penetrance genes; Lynch syndrome genes.

This initial search was supplemented by checking reference lists, and contact with authors of included studies for information on any relevant published or unpublished studies. No language restrictions were applied. Two reviewers assessed titles, abstracts, and keywords to select potentially relevant studies from the retrieved list of articles.

Study selection criteria for literature search
All studies included in the analysis met the following inclusion criteria: (i) data reported on any genes included in Table 1; (ii) at least 10 patients of Indian origin; and (iii) contained DNA sequence variation data. The susceptibility genes selected are those commonly tested in clinical practice. Lynch syndrome genes were included as they confer susceptibility to ovarian cancer in addition to colon and uterine cancers (Table 1). Importantly, inclusion was not restricted by NCCN or Manchester definitions of familial risk to ensure broad inclusion of studies with available data.
The exclusion criteria were: (i) articles containing data limited to loss of heterozygosity and/or methylation studies; (ii) duplicate publications; (iii) studies that did not perform direct DNA sequencing to validate variants detected by PCR-based techniques using re-amplified genomic DNA; and (iv) studies that did not screen the entire susceptibility gene. If studies had overlapping data, only the latest or largest study was included (Fig. 2).
The first step of a two-stage selection process involved screening titles and abstracts. Subsequently, for all references categorised as 'include' or 'uncertain' by both reviewers, full text was retrieved wherever possible and final inclusion decisions were made on the full paper. Data extraction was carried out using predesigned and piloted data extraction forms with differences resolved by consensus and/or arbitration involving a third reviewer.
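The inclusion and exclusion criteria listed above amount to a simple eligibility rule, which can be sketched as a filter over study records. This is an illustration only, not the extraction form used in the review: the `Study` class, its field names, and the example records are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Study:
    # Hypothetical fields mirroring the stated criteria; not the review's actual form.
    reports_table1_gene: bool               # inclusion (i)
    n_indian_patients: int                  # inclusion (ii): at least 10
    has_sequence_variant_data: bool         # inclusion (iii)
    loh_or_methylation_only: bool           # exclusion (i)
    is_duplicate: bool                      # exclusion (ii)
    variants_validated_by_sequencing: bool  # exclusion (iii) applies when False
    screened_entire_gene: bool              # exclusion (iv) applies when False


def meets_inclusion(s: Study) -> bool:
    return (s.reports_table1_gene
            and s.n_indian_patients >= 10
            and s.has_sequence_variant_data)


def triggers_exclusion(s: Study) -> bool:
    return (s.loh_or_methylation_only
            or s.is_duplicate
            or not s.variants_validated_by_sequencing
            or not s.screened_entire_gene)


def is_eligible(s: Study) -> bool:
    return meets_inclusion(s) and not triggers_exclusion(s)


# Made-up records for illustration; they do not correspond to studies in the review.
example_ok = Study(True, 25, True, False, False, True, True)
example_small = Study(True, 8, True, False, False, True, True)      # fewer than 10 patients
example_partial = Study(True, 40, True, False, False, True, False)  # gene only partly screened

print([is_eligible(s) for s in (example_ok, example_small, example_partial)])
```

The rule for overlapping data sets (keep only the latest or largest study) needs a comparison between records and is deliberately left out of this per-study check.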

Data extraction from literature search
Three reviewers extracted detailed information relating to variants; clinical evidence, including family history when available; clinical diagnosis; and histopathology. The information collected included the following: year of publication; authors' names; journal; geographic location of study; cancer type; genotyping methods; details of germline variant; total numbers of cases and controls; frequencies of variant carriers in cases and controls; histopathology; overall and progression-free survival where available; and age of presentation.
All variants extracted from the publications were queried against the BIC database for BRCA1 and BRCA2 genes and ClinVar 16 to confirm whether they had been reported previously by other studies and to obtain their pathogenic classification.The SNP identifier for each of the variants, where available, was obtained from the dbSNP database. 24

Characteristics of included studies
The combined search for key terms led to the selection of 148 articles. After screening titles, abstracts, and keywords, we extracted 120 full texts of articles considered eligible for inclusion. After reviewing the full texts and citations, we identified 67 studies meeting the inclusion criteria, of which 31 contained data suitable for extraction. Of the 31 articles, only 13 contained usable data that satisfied both the inclusion and exclusion criteria (Fig. 2, Table 2). These publications included familial breast and/or ovarian cancer as well as sporadic cases. For the purposes of this review, we used a broad definition of FEOTN (familial/early-onset/triple-negative) based on the studies included in the review, specifically one or more of the following: at least one first-degree relative with breast and/or ovarian cancer irrespective of age; early-onset breast and/or ovarian cancer diagnosed with a family history; first- or second-degree relatives affected; triple-negative breast cancer in an early-onset case; or bilateral breast cancer diagnosed at < 50 years. Data were included from probands and from family members who were carriers, where given. We also included data from sporadic cancer patients where the paper contained this information. However, none of the publications on sporadic cases reviewed reported any pathogenic germline variants and therefore we focused our analysis on FEOTN cases (Fig. 2).
We identified a total of 1028 breast and/or ovarian cancer cases from the 13 studies. A breakdown of the number of studies from different categories of breast and/or ovarian cancer is presented in Table 3. The majority of the studies were conducted in or near the largest cities of India, with the exception of two that were carried out within the Indian populations of Malaysia and Singapore. The patients recruited in any study usually resided in or near the big cities, which are densely populated and more affluent than the rural areas of India (Fig. 3).

Platforms used for genetic testing
Many different platforms were used for genetic testing in the 13 studies, with the majority using PCR-based approaches including heteroduplex formation, single-strand conformation polymorphism (SSCP) analysis, denaturing high-performance liquid chromatography (dHPLC), and Sanger sequencing.
Only two studies, with cohort sizes of 141 and 91, used next-generation sequencing (NGS) with the Illumina HiScanSQ system, and these also reported the highest proportions of variants in the cohort.

Study findings on prevalence of cancer susceptibility genes
All 13 FEOTN publications reported data on BRCA1 and/or BRCA2 and only three studies tested for other susceptibility genes such as TP53, RAD50, RAD52, ATM, and CHEK2, with mutations in these found very rarely if at all. We therefore limited our analysis to BRCA1 and BRCA2 genes. Twelve studies reported previously identified pathogenic BRCA1 variants and 10 reported novel variants they considered likely to be pathogenic. The novel variants were not present in any of the online databases listed in the Methods section. Initially, we considered only variants causing protein truncation to be likely pathogenic. We then predicted the functional effects of non-synonymous missense variants using SIFT, PolyPhen, and CADD and identified two additional variants, 5360A>C and 5377G>A, considered deleterious/probably damaging by all three algorithms (Supplementary Table 1). In total, we identified 26 previously reported pathogenic variants and 18 novel likely pathogenic variants for BRCA1 from a total cohort of 926 (Tables 4 and 5). In combination, the previously reported and the novel variants were detected in 71/926 cases, 39 of whom carried the 'Ashkenazi' 185delAG mutation. For BRCA1, there were seven additional recurrent mutations, five in BIC and/or ClinVar and two that were novel (Tables 4 and 5). Of the five previously reported variants, c.2275C>T, c.2338C>T, c.3352C>T, and 4838delAGinsGCC each occurred in two cases and the other, c.4485-1G>A, occurred in three cases. The two novel variants were c.1052delT and c.632insT, the former detected in four cases and the latter in two cases, all from single studies (Table 5).
For BRCA2, there were four variants previously reported as pathogenic in ClinVar detected in the FEOTN cases; these were detected in 6/974 cases. The only recurrent variant, 6079del4, was detected in 3/974 cases from two different studies (Table 6). The number of variants reported to be novel and likely pathogenic was 16, and each of these variants was detected in single cases in single studies (Table 7). Furthermore, there were nine non-synonymous missense variants, of which only one, c.3578T>C, was considered deleterious/probably damaging by SIFT, PolyPhen, and CADD (Supplementary Table 2).
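Because the pooled cohorts are small, it can help to attach an interval to the carrier proportions quoted above. The sketch below uses the Wilson score interval, which is an assumption made here for illustration; the review itself reports raw counts only and does not state that any interval was calculated.

```python
from math import sqrt


def wilson_interval(carriers, n, z=1.96):
    """Wilson score interval for a proportion (z = 1.96 gives roughly 95% coverage)."""
    p = carriers / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half, centre + half


# Counts as quoted in the text:
#   BRCA1: previously reported plus novel variants, 71 of 926 cases
#   BRCA2: previously reported pathogenic variants only, 6 of 974 cases
cohorts = {"BRCA1": (71, 926), "BRCA2": (6, 974)}

summary = {}
for gene, (carriers, n) in cohorts.items():
    low, high = wilson_interval(carriers, n)
    summary[gene] = (carriers / n, low, high)
    print(f"{gene}: {100 * carriers / n:.1f}% "
          f"(95% CI {100 * low:.1f}-{100 * high:.1f}%)")
```

Such intervals describe sampling uncertainty only; they do not correct for the heterogeneous inclusion criteria and detection methods across the pooled studies.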

Prevalence of founder mutations in BRCA1 and BRCA2
Ten of the 13 studies reported data on the putative founder mutation BRCA1 185delAG (Fig. S1, see online supplementary material). The mutation was detected in 39/927 (4.2%) cases with breast or ovarian cancer, the majority being from South India or Malaysians of Indian descent. The frequency of 185delAG varied: for example, one study from New Delhi found only one carrier in 204 cases, whereas a high prevalence was reported in Bangalore (10/61 cases, 0/100 controls, Fisher exact test P = 3.7 × 10⁻⁵) and Chennai (10/91 cases, 0/2 controls) 25,26 (Table 2).
The reported BRCA2 founder mutation 6174delT was not detected in any of the studies included in our analysis. 2 Frequencies of BRCA mutations identified in the included studies in the Indian population are contrasted with those of white European populations (Tables 4 and 6).
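The Fisher exact test quoted for the Bangalore cohort can be reproduced from the 2 × 2 counts given in the text. The sketch below is one standard two-sided formulation (summing the probabilities of all tables no more likely than the observed one) written with the Python standard library; the function name is illustrative and the source does not state which software or sidedness was used.

```python
from math import comb


def fisher_exact_two_sided(carriers_cases, n_cases, carriers_controls, n_controls):
    """Two-sided Fisher exact P value for a 2 x 2 carrier/non-carrier table."""
    total_carriers = carriers_cases + carriers_controls
    lowest = max(0, total_carriers - n_controls)
    highest = min(n_cases, total_carriers)
    # Unnormalised hypergeometric weight of each possible table, kept as exact integers.
    weights = {
        a: comb(n_cases, a) * comb(n_controls, total_carriers - a)
        for a in range(lowest, highest + 1)
    }
    observed = weights[carriers_cases]
    as_or_more_extreme = sum(w for w in weights.values() if w <= observed)
    return as_or_more_extreme / sum(weights.values())


# Counts as quoted in the text.
p_bangalore = fisher_exact_two_sided(10, 61, 0, 100)  # 10/61 cases, 0/100 controls
p_chennai = fisher_exact_two_sided(10, 91, 0, 2)      # 10/91 cases, 0/2 controls

print(f"Bangalore: P = {p_bangalore:.2e}")
print(f"Chennai:   P = {p_chennai:.2f}")
```

Under this formulation the Bangalore counts give a value consistent with the reported 3.7 × 10⁻⁵, while the Chennai counts give P = 1.0, since two controls carry essentially no information about carrier frequency in unaffected women.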

BIC and ClinVar search and additional database search for variants from Indian ethnicity cases
The BIC and the ClinVar databases contain DNA sequence variations reported by genetics clinics from across the world. The majority of the DNA variants in these repositories are unpublished. The most frequently reported entry in BIC for the BRCA1 gene was 185delAG, which was also the most prevalent in our analysis (Table 8). Eight out of the 20 top entries in BIC were also detected in our literature survey, although not all of these variants were shown to be pathogenic (Table 8). None of the pathogenic BRCA2 variants identified from our literature search were present in the top 20 BIC entries for BRCA2 (Table 9). A search in BIC using the keyword 'Indian' in the ethnicity field revealed 23 BRCA1 variants and 11 BRCA2 variants. All these variants were detected in patients of Indian descent from Singapore or Malaysia. Seven of the BRCA1 variants were present in our dataset collated from the literature (Tables 10 and 11). However, of the seven variants that overlapped, only two (180delA and 185delAG) were classed as pathogenic in BIC and ClinVar (Tables 10 and 11). Of the 11 BRCA2 variants present in BIC with Indian ethnicity, three were also present in our literature dataset, and of the three only one, Q2957X, was classed as pathogenic. Another interesting observation was that the BRCA2 variant E1593D, present in both our dataset and in the subset of 11 BIC variants, was also reported in two additional Pakistani patients in BIC.
The same search performed in ClinVar with 'Indian' detected 40 variants for BRCA1 and 30 for BRCA2, which included all variants also present in BIC.
Individual searches in additional databases such as TCGA, ICGC, dbSNP, the GWAS catalog, COSMIC, and HGMD did not yield any results. Although these databases contain ethnicity data, they use a very broad definition of 'Asians', whereas the ethnicity data in the 1000 Genomes database are region-specific, which makes comparisons difficult. Furthermore, there were no data in ICGC on breast and ovarian cancers from India.

Details of family history, penetrance, and survival in included studies
Studies in the literature used very heterogeneous criteria to define a family history of disease. Mutation prevalence in women with a family history of breast and/or ovarian cancer was presented in 11 of the 13 studies, but only seven of these provided clear criteria for family history (≥1 first-degree relative affected with breast or ovarian cancer at any age). Women with sporadic breast or ovarian cancer were reported in seven publications. None of the 13 studies provided penetrance data. One small study with 91 patients presented survival information and found no significant association with pathogenic BRCA1 or BRCA2 mutations. 25

Histopathology
Two studies 27,28 provided some data on breast cancer histopathology, with neither describing complete histological details such as grade of cancer, hormone receptor, and HER2 status. Eachkoti et al. reported the majority of cases (22/25) to be infiltrating ductal carcinoma (IDC), with two inflammatory carcinomas (an aggressive type of breast cancer) and one case of Paget's disease. Similarly, Thirthagiri et al. identified IDC as the commonest histological type for both BRCA1 and BRCA2 carriers. Where grade was available, tumours were of grade 2 and 3, with no grade 1 tumours identified. BRCA1 tumours were largely triple negative and less commonly HER2 positive, whereas BRCA2 tumours were more likely to be hormone receptor positive. The data, however, were not available for the three markers in eight cases and for at least one of the three markers in an additional seven cases out of the total 28 tumours included. No studies were identified including information on the histology of ovarian tumours.
Table 6. Previously reported pathogenic BRCA2 variants identified from the literature search that are also present in BIC and ClinVar.

Discussion
We have reported the findings of a methodical review of reported germline variants in BRCA1, BRCA2, and other high-penetrance breast and ovarian cancer susceptibility genes within women of Indian descent. Our searches highlight both the diversity of the Indian population and the paucity of data on germline variants in these genes in the Indian population. There are very limited Indian-specific data and, even where these are available, there is great variability in inclusion criteria, definition of high-risk groups (such as those with a family history), mutation detection methods, geographical origin, and ethnicity, thus making any India-wide assessment unreliable. The small cohort sizes mean that the spectrum of mutations identified in BRCA genes is unlikely to be representative of the Indian population and is indeterminate for other high-risk susceptibility genes in this population. Our searches have identified 18 BRCA1 and 16 BRCA2 variants in the Indian population that had not been previously reported elsewhere and are not currently present in BIC or ClinVar. There were no studies of sporadic or unselected cases and also very limited data on penetrance or survival that could be used for calculating cancer risks and hence implementing counselling and screening in Indian populations.
The spectra of BRCA1 and BRCA2 mutations have been characterized in a number of different populations worldwide, with significant variation among populations in the contributions of these genes to hereditary breast and ovarian cancer. 29 Founder mutations account for differing proportions of cancer in different populations; for example, in the Ashkenazi Jewish population [12], three founder mutations have a combined population frequency of 2% and represent 60% of breast cancer families with a BRCA1 or BRCA2 gene mutation. Similarly, BRCA1 and BRCA2 founder mutations account for 78% of families with hereditary breast cancer in Chile. 30 Our search reveals a much lower frequency (2.3%; 39/1700) of the putative Ashkenazi founder mutation 185delAG in Indian patients with breast and/or ovarian cancer. The carriers of this mutation were usually from the south of India. Other studies have explored how this variant arose in the Indian population. Kadalmani et al. examined the haplotypes of carriers of this variant and their families, and concluded that it arose independently from the Ashkenazi variant. Another study by Laitman et al. came to a similar conclusion based on haplotype analyses of carriers from ethnically diverse backgrounds, which included Indians from Cochin, south India. 31,32 Other founder BRCA1 and BRCA2 mutations were not detected in any of the Indian patients with breast and ovarian cancers, and no India-specific founder mutations were detected.
Our literature search shows that variation in the prevalence of high-penetrance alleles in genes such as BRCA1 and BRCA2 may contribute to the reported differences in breast and ovarian cancer incidence across India, in Indians in other countries, and between India and the West. The earlier average age at presentation of breast cancer among Indian women is especially intriguing in this respect. Data are, however, very limited and have not been collected systematically in terms of inclusion criteria, details such as family history, and critical clinical co-variates such as histopathology. Furthermore, very limited work has been published to address environmental risk factors specific to the Indian population and distinct from Western populations, such as consanguineous marriage, betel quid consumption, and pregnancies. Current guidelines on cancer screening and prevention in gene carriers are based on evidence predominantly derived from white populations of northern European origins. Work is needed to modify existing risk-prediction models such as Manchester or BOADICEA for use in women of different ethnicities. Indeed, previous work has found that overall sensitivity, specificity, and positive-predictive values were lower in the Asian population than in Caucasian populations. 26
In conclusion, there is an urgent unmet need for large-scale studies in geographically distinct regions, with high-quality data and longitudinal studies of relatives, to help elucidate the role of breast and ovarian cancer susceptibility genes in the Indian population. Understanding these differences through research to derive India-specific paradigms for diagnosis, screening, prevention, and treatment is critical to improving women's health in India. 1 Clinics in countries with the Indian diaspora and established clinical genetics services may be able to contribute to penetrance and survival data and further tease out the differences in environmental risk factors between Indian diaspora and Indian patients.

Figure 2. Flow diagram illustrating the criteria for selection of publications and corresponding number of articles.

Figure 3. Geographical distribution of the cohorts from the selected studies. The size of each star is proportional to the size of the study cohort.

a Recurrent variant detected in multiple studies. N = nonsense, F = frameshift, and SS = splice site.

Table 2. Publications reporting variations in high-penetrance breast and ovarian cancer genes.

Table 3. Breakdown of cancer subtypes from data extracted.

Table 4. Previously reported pathogenic BRCA1 variants identified from the literature search that are also present in BIC and ClinVar. N = nonsense, F = frameshift, SS = splice site, IVS = intervening sequence (i.e. the intron), Indel = insertion and deletion. Recurrent variant detected in multiple studies: Vaidyanathan et al. (61 cases, 10 carriers of 185delAG), Saxena et al. (204 cases, 1 carrier of 185delAG).
Bold face indicates variants also identified in our literature search.

Table 10. BIC searching with keyword 'Indian' for BRCA1. Bold face indicates variants also identified in our literature search.