Genetic overlap between endometriosis and endometrial cancer: evidence from cross‐disease genetic correlation and GWAS meta‐analyses

Abstract Epidemiological, biological, and molecular data suggest links between endometriosis and endometrial cancer, with recent epidemiological studies providing evidence for an association between a previous diagnosis of endometriosis and risk of endometrial cancer. We used genetic data as an alternative approach to investigate shared biological etiology of these two diseases. Genetic correlation analysis of summary level statistics from genomewide association studies (GWAS) using LD Score regression revealed moderate but significant genetic correlation (r_g = 0.23, P = 9.3 × 10^−3), and SNP effect concordance analysis provided evidence for significant SNP pleiotropy (P = 6.0 × 10^−3) and concordance in effect direction (P = 2.0 × 10^−3) between the two diseases. Cross‐disease GWAS meta‐analysis highlighted 13 distinct loci associated at P ≤ 10^−5 with both endometriosis and endometrial cancer, with one locus (SNP rs2475335) located within PTPRD associated at a genomewide significant level (P = 4.9 × 10^−8, OR = 1.11, 95% CI = 1.07–1.15). PTPRD acts in the STAT3 pathway, which has been implicated in both endometriosis and endometrial cancer. This study demonstrates the value of cross‐disease genetic analysis to support epidemiological observations and to identify biological pathways of relevance to multiple diseases.


Introduction
Endometriosis (defined as tissue resembling endometrium in extrauterine sites) and endometrial cancer (cancer of the uterine corpus) are serious gynecological diseases with major impacts on the quality of life of affected women. Endometriosis is a relatively common disease affecting 6-10% of women of reproductive age and 35-50% of infertile women [1,2]. Affected women commonly experience severe menstrual pain, pelvic pain, subfertility or infertility, and bowel-related symptoms. Endometrial cancer is the most common invasive gynecological cancer in Australia, ranking sixth for incident cancers in women [3]. This disease is associated with significant morbidity due to surgery and radiotherapy [4], and treatment is further complicated by the fact that most patients present at relatively older age and with major comorbidities, notably obesity and diabetes. Finding the genes and pathways underlying these complex diseases is an essential step toward developing better diagnostic and therapeutic tools for both diseases. Both diseases are known to have a genetic component, with twin studies showing heritability of endometriosis at ~50% (51%, 95% confidence interval = 33-66% [5]; H = 47%, 95% CI = 36-57% [6]) and of endometrial cancer 27% (95% CI = 11-43% [7]). Genomewide association studies have, to date, identified 19 independent SNPs as being significantly associated with endometriosis [8] and nine independent SNPs with endometrial cancer [9,10]. These genomewide-associated SNPs, and the genetic regions in which they occur, are nonoverlapping between the diseases.
Epidemiological, biological, and molecular data all indirectly suggest that there could be links between the two disorders. Endometriosis and endometrial cancer are both hormonally regulated diseases, with increased risk in women exposed to higher levels of estrogen, and decreased or ameliorated risk or symptoms through treatments such as the contraceptive pill and hormonal therapies that include progesterone [11]. Both are associated with increased risk of uterine fibroids [12,13] and with ovarian cancer: endometriosis through an increased risk of this disease, and endometrial cancer through multiple shared risk factors, and histopathologic and molecular features [14,15]. Cancer-related genetic changes such as loss of heterozygosity, and altered methylation and expression patterns have been reported for endometriosis [16]. Numerous endometrial cancer-associated genes, including PTEN and other genes in the Ingenuity "endometrial cancer pathway," have been shown to be dysregulated in endometriosis [17,18].
Epidemiological studies have shown conflicting evidence for a link between a diagnosis of endometriosis and risk of endometrial cancer [13,19-24]. The interpretation of results from epidemiological studies is complicated by several factors, including small sample sizes, the underdiagnosis and misdiagnosis of endometriosis, inability to adjust for confounders including oral contraceptives and parity, and the exclusion criteria of some epidemiological studies which assumed coincidental diagnosis of endometrial cancer in women ascertained via a diagnosis of endometriosis. For example, Rowlands et al. showed an overall 1.5-fold increased risk of endometrial cancer that was reduced by excluding cases diagnosed with endometriosis <1 year before the endometrial cancer diagnosis; however, the subset of women with surgically confirmed endometriosis diagnosed >1 year prior to cancer showed a significant 2.6-fold increased risk of endometrial cancer [13]. However, a recent study in US nurses which was also able to adjust for diagnosis intervals found no association between either self-reported or laparoscopically confirmed endometriosis and risk of endometrial cancer [22]. Meanwhile, two population-based studies had shown associations between the diseases, although neither was able to adjust for confounders such as parity. A large study including 45,790 Danish women with a clinical diagnosis of endometriosis found increased risks of endometrial cancer >1 year (standardized incidence ratio (SIR) = 1.43, 95% CI = 1.13-1.79) and ≥10 years (SIR = 1.51, 95% CI = 1.15-1.95) following the endometriosis diagnosis [23]. Another study including 15,488 Taiwanese women diagnosed with endometriosis found a similar link, but only in women diagnosed with endometriosis at over 40 years of age (adjusted hazard ratio = 7.08, 95% CI = 2.33-21.55) [24]. Age-related effects, if present, could have further confounded the results of previous epidemiological studies investigating shared risk of endometriosis and endometrial cancer.
Given the methodological complications inherent in epidemiological studies, unbiased genetic approaches are an ideal way to test for shared biological etiology between endometriosis and other diseases. For example, a degree of shared genetic etiology has recently been demonstrated between endometriosis and ovarian cancer, including with ovarian cancer subtypes not previously thought to be associated with endometriosis [25]. We used separate genomewide association study (GWAS) datasets for endometriosis and endometrial cancer to estimate the degree to which these two diseases share a common genetic etiology. We then combined these datasets in a cross-disease GWAS meta-analysis to identify genetic loci potentially contributing to the genetic risk of both endometriosis and endometrial cancer.

Genetic overlap between endometriosis and endometrial cancer: datasets and analyses
This study utilized data from four previously published genetic datasets for endometriosis and endometrial cancer (outlined below and in the following section; Table 1) [26][27][28]. Three of the datasets were GWAS datasets, genotyped using Illumina 610Quad and 670Quad BeadChips (Illumina Inc, San Diego, CA) and containing data for 462,430 SNPs in common between them. Of these, the endometriosis GWAS dataset included 3194 Australian (QIMR Berghofer Medical Research Institute (QIMR)) and UK (Oxford) women with surgically confirmed endometriosis as cases [26]. The first endometrial cancer GWAS dataset included 1262 Australian (ANECS) and UK (SEARCH) endometrioid subtype endometrial cancer patients [27], and the second (NSECG) included 795 UK endometrial cancer cases and 895 nonoverlapping controls [28]. All endometrial cancer cases were histologically confirmed to be invasive cancer of the endometrium lining [27]. As previously published, the endometriosis and ANECS-SEARCH endometrial cancer GWAS datasets included the same sets of controls: 1870 Australian controls and 5190 UK Wellcome Trust Case Control Consortium (WTCCC) controls. Hence, to avoid overlapping control samples in this study, the controls were redistributed as follows: the 1870 Australian controls and two-thirds of the WTCCC controls (n = 3460, randomly assigned) were included in the endometriosis GWAS dataset, while an additional set of 1241 Australian controls [28] and the remaining one-third of the WTCCC controls (n = 1730) were included in the ANECS-SEARCH endometrial cancer GWAS dataset.
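The random redistribution of shared controls can be illustrated with a minimal sketch. The sample identifiers, the seed, and the function name `split_controls` are hypothetical and are not taken from the original analysis; only the counts (5190 WTCCC controls split into 3460 and 1730) come from the text.

```python
import random

def split_controls(control_ids, n_first, seed=0):
    """Randomly partition control IDs into two non-overlapping sets.

    The first set receives n_first IDs, the second set receives the rest.
    """
    rng = random.Random(seed)      # fixed seed so the split is reproducible
    shuffled = list(control_ids)   # copy, so the caller's list is untouched
    rng.shuffle(shuffled)
    return shuffled[:n_first], shuffled[n_first:]

# Hypothetical IDs standing in for the 5190 WTCCC controls.
wtccc_ids = [f"WTCCC_{i:04d}" for i in range(5190)]

# Two-thirds go to the endometriosis GWAS, one-third to ANECS-SEARCH.
endometriosis_controls, endometrial_cancer_controls = split_controls(wtccc_ids, 3460)
```

Because each control appears in exactly one of the two lists, the two disease datasets share no samples, which is the property the later LD Score regression step relies on.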
Following quality control [26][27][28], association analyses were performed for each GWAS dataset using PLINK [29]. Australian and UK cases and controls were analyzed as separate strata within the same GWAS for the endometriosis and ANECS-SEARCH endometrial cancer datasets, adjusting for the first two (endometriosis, ANECS, NSECG) or three (SEARCH) principal components of the genomic kinship matrix [26][27][28]. The summary results for the ANECS-SEARCH and NSECG datasets were then included in an inverse variance, fixed effects meta-analysis performed using METAL [30], to produce one set of endometrial cancer GWAS results. A fixed effect model was considered more appropriate than a random effect model as our hypothesis is that a proportion of SNPs will be associated with both diseases with the same direction of effect, and a fixed effect model is conservative given no expectation that the effect size is similar. The degree of genetic overlap between endometriosis and endometrial cancer was then examined using two programs that test the degree of genetic correlation/concordance between diseases using GWAS summary results (individual SNP effect sizes and P-values), SNP effect concordance analysis (SECA) [31] and LD Score regression [32].
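As an illustration of the inverse variance, fixed effects approach, the following minimal sketch pools per-study log odds ratios for a single SNP. It is a simplified stand-in for what METAL computes, not METAL itself; the function name and the toy effect sizes are assumptions made for the example.

```python
import math

def fixed_effects_meta(betas, ses):
    """Inverse-variance weighted fixed-effects meta-analysis for one SNP.

    betas: per-study log odds ratios, all for the same effect allele.
    ses:   the corresponding standard errors.
    Returns the pooled beta, pooled SE, z score, and two-sided P-value.
    """
    weights = [1.0 / se ** 2 for se in ses]   # weight = 1 / variance
    total_weight = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    se = math.sqrt(1.0 / total_weight)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))    # two-sided normal tail
    return beta, se, z, p

# Toy values for two studies (not taken from the paper).
pooled_beta, pooled_se, z_score, p_value = fixed_effects_meta([0.10, 0.12], [0.03, 0.04])
pooled_or = math.exp(pooled_beta)             # back-transform to an odds ratio
```

Studies with smaller standard errors receive larger weights, so the pooled estimate is pulled toward the more precise study.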
To account for linkage disequilibrium (LD) between SNPs, SECA employs a "P-value informed" SNP clumping procedure to extract a subset of independent SNPs [31] (23,817 SNPs for the current analysis). These SNPs are then partitioned into 12 P-value "bins" (e.g., P ≤ 0.01, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, and 1.0) for each disease. Using the default settings, a number of binomial and Fisher exact tests were performed on SNPs across all bins (12 × 12 bins = 144 SNP subset combinations), and on SNP subsets within bins (see Results), to determine the degree to which individual SNPs are concordant in their P-value level and direction of effect across two diseases, which can indicate the presence of genetic concordance and SNP pleiotropy [31]. For these analyses, the endometriosis dataset was designated as Dataset 1 and endometrial cancer as Dataset 2.
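The idea of testing effect-direction concordance across a 12 × 12 grid of P-value thresholds can be sketched as follows. This is a deliberately simplified illustration and is not the SECA implementation: it applies a one-sided exact binomial (sign) test to each SNP subset, whereas SECA combines binomial and Fisher exact tests with further steps. The function names and toy SNP values are hypothetical, and the sketch assumes LD-independent SNPs with effects aligned to the same allele in both datasets.

```python
import math

# The 12 P-value thresholds; every pair of thresholds defines one SNP subset.
THRESHOLDS = [0.01, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]

def binomial_upper_tail(k, n, p=0.5):
    """Exact probability of observing k or more successes in n trials."""
    return sum(math.comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))

def concordance_test(snps, p1_max, p2_max):
    """Sign test for concordant effect directions within one SNP subset.

    snps: list of (beta1, p1, beta2, p2) tuples, one per independent SNP.
    Returns (concordant count, subset size, one-sided binomial P-value).
    """
    subset = [(b1, b2) for b1, p1, b2, p2 in snps if p1 <= p1_max and p2 <= p2_max]
    n = len(subset)
    k = sum(1 for b1, b2 in subset if b1 * b2 > 0)   # same sign in both datasets
    p_value = binomial_upper_tail(k, n) if n > 0 else 1.0
    return k, n, p_value

# Toy data: (beta in dataset 1, P in dataset 1, beta in dataset 2, P in dataset 2).
toy_snps = [
    (0.10, 0.005, 0.20, 0.03),
    (-0.10, 0.04, -0.05, 0.20),
    (0.30, 0.50, -0.10, 0.90),
    (0.05, 0.008, 0.02, 0.04),
]

# One test per threshold pair: 12 x 12 = 144 SNP subset combinations.
grid = {(t1, t2): concordance_test(toy_snps, t1, t2)
        for t1 in THRESHOLDS for t2 in THRESHOLDS}
```

Because the 144 subsets are nested and therefore not independent, the number of nominally significant subsets cannot be judged against a simple 5% expectation; an empirical null distribution is needed for that comparison.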
Taking a different approach, cross-trait LD Score regression utilizes the presence of LD, calculating an LD score between SNPs within a 1 cM window and then regressing the product of the SNP association results (z scores) from the two diseases against the LD score [32]. Following the recommendations at https://github.com/bulik/ldsc/wiki/Heritability-and-Genetic-Correlation, the "--no-intercept" option was used to constrain the LD Score regression intercept to 0 as there was no sample overlap between the two disease datasets.
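With no sample overlap, the cross-trait LD Score regression model reduces to E[z1j·z2j] = (√(N1·N2) · ρg / M) · ℓj, where ℓj is the LD score of SNP j, N1 and N2 are the sample sizes, M is the number of SNPs, and ρg is the genetic covariance. The sketch below fits that line through the origin with ordinary least squares. It is a simplified illustration under stated assumptions: the ldsc software additionally uses regression weights and a block jackknife for standard errors, and the function names and toy numbers here are hypothetical.

```python
import math

def genetic_covariance_no_intercept(ld_scores, z1, z2, n1, n2, m):
    """Estimate genetic covariance by regressing z1*z2 on LD score through the origin.

    ld_scores: LD score per SNP.
    z1, z2:    z scores for the same SNPs in the two disease GWAS.
    n1, n2:    GWAS sample sizes; m: number of SNPs used to build the LD scores.
    """
    products = [a * b for a, b in zip(z1, z2)]
    # Least-squares slope with the intercept constrained to 0.
    slope = (sum(l * p for l, p in zip(ld_scores, products))
             / sum(l * l for l in ld_scores))
    return slope * m / math.sqrt(n1 * n2)

def genetic_correlation(rho_g, h2_1, h2_2):
    """Scale the genetic covariance by the two SNP heritabilities."""
    return rho_g / math.sqrt(h2_1 * h2_2)

# Toy inputs in which z1*z2 is exactly 0.01 times the LD score.
ld = [10.0, 20.0, 40.0]
rho_g = genetic_covariance_no_intercept(ld, [1.0, 2.0, 4.0], [0.1, 0.1, 0.1],
                                        n1=10_000, n2=10_000, m=100_000)
r_g = genetic_correlation(rho_g, h2_1=0.25, h2_2=0.16)
```

Constraining the intercept to 0 is only appropriate when the two datasets share no samples; with overlapping samples the intercept absorbs the resulting correlation in test statistics and should be estimated freely.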

Cross-disease meta-analysis between endometriosis and endometrial cancer
The cross-disease meta-analysis was performed using an inverse variance, fixed effects model in METAL [30] to search for genetic loci potentially contributing to the increased risk of both endometriosis and endometrial cancer. The results for the top SNPs (P ≤ 10^−5) from the endometriosis-endometrial cancer meta-analysis were then compared with results for the same SNPs from the fourth dataset included in this study, a separate, independent sample of 4402 endometrial cancer cases and 28,758 controls genotyped at 211,155 SNPs using a custom Illumina Infinium iSelect array by the Collaborative Oncological Gene-environment Study ("iCOGS") [33,34]. SNPs not included on the iCOGS array were imputed (including all SNPs within 1 Mb of the target SNP) using IMPUTE (v2) software [35] and the 1000 Genomes Project (2012 release) as the reference panel [9]. Imputation quality scores ranged from 0.34 to 1.00. Association testing on the iCOGS SNPs was performed using SNPTEST (v2) [36] employing frequentist tests with a logistic regression model adjusting for eight separate strata and the first 10 principal components [9,28]. These results were then included in the replication meta-analysis, which included all four datasets and was conducted as described for the cross-disease meta-analysis above.

Genetic overlap between endometriosis and endometrial cancer
Genetic correlation analyses of GWAS datasets for endometriosis and endometrial cancer revealed the presence of weak to moderate, but significant, genetic overlap between the two diseases. The LD Score regression analysis indicated moderate but significant genetic correlation (r_g) between the two diseases (r_g = 0.23, P = 9.3 × 10^−3). The SECA primary test for the overlap of associated effects, including all 144 SNP subsets, revealed more subsets than expected by chance showing at least nominally significant pleiotropy between endometriosis and endometrial cancer (P = 6.0 × 10^−3): the pair of SNP subsets producing the minimum exact binomial test P-value for pleiotropy (endometriosis SNP subset with P ≤ 0.002 and endometrial cancer SNP subset with P ≤ 0.86) had P = 3.3 × 10^−4. The primary test for concordant effects between endometriosis and endometrial cancer also revealed that the number of SNP subsets with nominally significant concordant effects (P ≤ 0.05) was significantly more than expected by chance (P = 2.0 × 10^−3): the pair of SNP subsets producing the minimum Fisher's exact test P-value for effect correlation (endometriosis SNP subset with P ≤ 0.37 and endometrial cancer SNP subset with P ≤ 1) had P = 2.1 × 10^−4. The primary results indicate that SNP effects are correlated, with the presence of allelic effects that increase the risk of both traits. Including only specific (default) SNP subsets in the analyses [31], SNP effects were positively, although not significantly, correlated for SNPs at P ≤ 0.05 in both datasets (P = 8.4 × 10^−2) and for SNPs with P ≤ 1.0 × 10^−5 in the larger endometriosis dataset and with P ≤ 0.05 in the endometrial cancer dataset (P = 6.8 × 10^−2). Together, these results indicate that overall more SNPs than expected by chance were associated with the same direction of effect for both diseases, particularly amongst nominally or marginally associated SNPs.

Discussion
A link between endometriosis and endometrial cancer has long been postulated due to the numerous risk factors shared by the two diseases, but has only recently been convincingly demonstrated epidemiologically [23,24]. Our genetic study indicates that endometriosis and endometrial cancer have a moderate, but significant, shared genetic etiology. Genetic correlation analyses indicated the presence of pleiotropic SNPs as well as correlation in the direction of genetic effects, particularly amongst SNPs marginally and nominally associated with each disease individually. This is consistent with our hypothesis that a proportion of endometrial cancer cases (but certainly not all) will share genetic predisposition factors in common with endometriosis cases.
Cross-disease meta-analysis identified one genomewide significant locus associated with the risk of developing both diseases, and several other loci worthy of prioritization for future studies. These findings indicate that genetic factors underlie at least part of the shared disease risk implied by the epidemiological evidence. The SNP most significantly associated with disease in the endometriosis-endometrial cancer meta-analysis was rs2475335 (P = 4.9 × 10^−8). Located on chromosome 9p23, rs2475335 lies within intron 2 of an alternative transcript of the protein tyrosine phosphatase receptor type D (PTPRD) gene. PTPRD is a member of the receptor protein tyrosine phosphatase (PTP) family, a number of which have been found to function as either tumor suppressors or as oncogenes [37]. PTPRD deletions and mutations have been detected in numerous tumor types, including endometrial tumors [38]: the Catalogue of Somatic Mutations in Cancer (COSMIC) database (http://cancer.sanger.ac.uk/cosmic; accessed 10/12/2016) indicates ~5% of endometrioid carcinomas harbor PTPRD mutations. Mutations in PTPRD enhance cell growth and migration in melanoma cell lines, while the presence of mutated PTPRD protein enhanced growth and abrogated dephosphorylation of the STAT3 oncoprotein in human astrocytes [38,39]. Elevated STAT3 expression has been implicated in both endometriosis and endometrial cancer [40,41] and has been suggested as a potential target for treatment for both diseases [42,43]. While PTPRD is an attractive candidate gene for regulation by rs2475335, the gene targeted by this SNP (or equally by another SNP/s in high linkage disequilibrium with rs2475335) is as yet unknown, and further experimental studies in both endometriosis and endometrial cancer models are now required to investigate the biology underlying the increased risks of both diseases associated with this variant [44].
A number of the remaining risk loci prioritized by the meta-analysis harbor genes that are potentially relevant candidates for the etiology and/or treatment of endometriosis and endometrial cancer. For example, the missense variant rs2278868 located on chromosome 17q21.32 within exon 7 of the SKAP1 gene is in perfect linkage disequilibrium (r^2 = 1) with rs1452666, which we have previously reported as having borderline GWAS significant association with endometrial cancer in the combined GWAS and iCOGS datasets [9]. SNP variation in the SKAP1 region is associated with ovarian cancer, subtypes of which are clearly linked epidemiologically and genetically to endometriosis [25,45] and to endometrial cancer [46], although rs2278868 is in extremely low LD with the top ovarian cancer SNP rs9303542 (r^2 = 0.034) [33]. Publicly available gene expression data (http://www.gtexportal.org) indicate the potential for SNPs rs2278868 (and rs1452666) and rs9303542 to act as expression quantitative trait loci (eQTLs) for SKAP1, altering SKAP1 expression in liver, tibial nerve, testis, and pancreatic tissues, as well as for  Other SNPs of interest include rs12303900 on chromosome 12q21, located between the KITLG and DUSP6 genes. DUSP6 is a critical regulator of ERK signaling, a pathway dysregulated in both endometriosis and endometrial cancer and a potential target for treatment for both diseases [47][48][49]. SNP rs10008492, located on chromosome 4p14, is an eQTL for nearby toll-like receptors TLR1 and TLR6 (http://www.gtexportal.org). Both TLR1 and TLR6 are upregulated in endometriotic mesenchymal stem cells [50] and are expressed in endometrial cancer cell lines [51].
However, as for the PTPRD locus, all of these association results need to be further validated in additional replication datasets for both diseases, and relevant functional studies undertaken, before more is hypothesized about their genetic and biological effects on the risk of both endometriosis and endometrial cancer.
In this cross-disease genetic correlation and genomewide association study, we have provided evidence for overlap in genetic risk factors for endometriosis and endometrial cancer. Our genetic correlation analysis supports recent large epidemiological studies indicating an increased risk of endometrial cancer in women previously diagnosed with endometriosis, while the cross-disease meta-analysis has revealed plausible loci that could increase the risk of both diseases and which should be pursued further in functional studies. This work on endometriosis and endometrial cancer also adds further evidence to the utility of cross-disease genetic correlation and GWAS analyses as tractable and attractive methodologies to identify susceptibility loci that predispose to multiple diseases, which could lead to new diagnostic or treatment options for affected individuals.

Acknowledgments
We acknowledge with appreciation all women who participated in the QIMR Berghofer Medical Research Institute and OXGENE endometriosis studies and the ANECS-SEARCH, NSECG, and iCOGS endometrial cancer studies. For the endometriosis study, we thank Endometriosis Associations for supporting study recruitment, and the many hospital directors and staff, gynecologists, general practitioners, and pathology services in Australia and the UK who provided assistance with blood collection and confirming diagnoses. We are grateful to the many research assistants and interviewers for assistance with the studies contributing to the QIMR and OXGENE collections.

Supporting Information
Additional supporting information may be found in the online version of this article: Figure S1. Forest plots of association between the top 13 SNPs in the endometriosis-endometrial cancer meta-analysis and each of the datasets included in the analysis. Table S1. Results for the top SNPs from the endometriosis and endometrial cancer (ANECS-SEARCH and NSECG) GWAS meta-analysis, with iCOGS as the endometrial cancer replication dataset.