Large-scale genome-wide meta-analysis of polycystic ovary syndrome suggests shared genetic architecture for different diagnosis criteria

Polycystic ovary syndrome (PCOS) is a disorder characterized by hyperandrogenism, ovulatory dysfunction and polycystic ovarian morphology. Affected women frequently have metabolic disturbances including insulin resistance and dysregulation of glucose homeostasis. PCOS is diagnosed with two different sets of diagnostic criteria, resulting in a phenotypic spectrum of PCOS cases. The genetic similarities between cases diagnosed based on the two criteria have been largely unknown. Previous studies in Chinese and European subjects have identified 16 loci associated with risk of PCOS. We report a fixed-effect, inverse-weighted-variance meta-analysis from 10,074 PCOS cases and 103,164 controls of European ancestry and characterisation of PCOS related traits. We identified 3 novel loci (near PLGRKT, ZBTB16 and MAPRE1), and provide replication of 11 previously reported loci. Only one locus differed significantly in its association by diagnostic criteria; otherwise the genetic architecture was similar between PCOS diagnosed by self-report and PCOS diagnosed by NIH or non-NIH Rotterdam criteria across common variants at 13 loci. Identified variants were associated with hyperandrogenism, gonadotropin regulation and testosterone levels in affected women. Linkage disequilibrium score regression analysis revealed genetic correlations with obesity, fasting insulin, type 2 diabetes, lipid levels and coronary artery disease, indicating shared genetic architecture between metabolic traits and PCOS. Mendelian randomization analyses suggested variants associated with body mass index, fasting insulin, menopause timing, depression and male-pattern balding play a causal role in PCOS. The data thus demonstrate 3 novel loci associated with PCOS and similar genetic architecture for all diagnostic criteria. The data also provide the first genetic evidence for a male phenotype for PCOS and a causal link to depression, a previously hypothesized comorbid disease. 
Thus, the genetics provide a comprehensive view of PCOS that encompasses multiple diagnostic criteria, gender, reproductive potential and mental health.


Introduction
Polycystic ovary syndrome (PCOS) is the most common endocrine disorder in reproductive-aged women, with a complex pattern of inheritance [1][2][3][4][5]. Two different diagnostic criteria based on expert opinion have been utilized: the National Institutes of Health (NIH) criteria require hyperandrogenism (HA) and ovulatory dysfunction (OD) [6], while the Rotterdam criteria add polycystic ovarian morphology (PCOM) and require at least two of the three traits to be present, resulting in four phenotypes (S1 Fig) [6,7]. PCOS by NIH criteria has a prevalence of ~7% in reproductive-age women worldwide [8]; the use of the broader Rotterdam criteria increases this to 15-20% across different populations [9][10][11].
PCOS is commonly associated with insulin resistance, pancreatic beta cell dysfunction, obesity and type 2 diabetes (T2D). These metabolic abnormalities are most pronounced in women with the NIH phenotype [12]. In addition, the odds for moderate or severe depression and anxiety disorders are higher in women with PCOS [13]. However, the mechanisms behind the association between the reproductive, metabolic and psychiatric features of the syndrome remain largely unknown.
Genome-wide association studies (GWAS) in women of Han Chinese and European ancestry have reproducibly identified 16 loci [14][15][16][17]. The observed susceptibility loci in PCOS appeared to be shared between NIH criteria and self-reported diagnosis [17], which is particularly intriguing. Genetic analyses of causality (by Mendelian Randomization analysis) among women of European ancestry with self-reported PCOS suggested that body mass index (BMI), insulin resistance, age at menopause and sex hormone binding globulin contribute to disease pathogenesis [17].
We performed the largest GWAS meta-analysis of PCOS to date, in 10,074 cases and 103,164 controls of European ancestry diagnosed with PCOS according to the NIH (2,540 cases and 15,020 controls) or Rotterdam criteria (2,669 cases and 17,035 controls), or by self-reported diagnosis (5,184 cases and 82,759 controls) (Tables 1 and S1). We investigated whether there were differences in the genetic architecture across the diagnostic criteria, and whether there were distinctive susceptibility loci associated with the cardinal features of PCOS: HA, OD and PCOM. Further, we explored the genetic architecture with a range of phenotypes related to the biology of PCOS, including male-pattern balding [18][19][20][21].

Results
We identified 14 genetic susceptibility loci associated with PCOS, adjusting for age, at the genome-wide significance level (P < 5.0 × 10^-8), bringing the total number of PCOS-associated loci to 19 (Tables 2 and S2 and Fig 1). Three of these loci were novel associations (near PLGRKT, ZBTB16 and MAPRE1, respectively; shown in bold in Table 2). Six of the 11 reported associations were previously observed in Han Chinese PCOS women [14,15]. Eight loci have been reported in European PCOS cohorts [16,17]. Obesity is commonly associated with PCOS and in most of the cohorts, cases were heavier than controls (Table 1). However, adjusting for both age and BMI did not identify any novel loci, and the 14 loci remained genome-wide significant. All variants demonstrated the same direction of effect across all phenotypes including NIH, non-NIH Rotterdam, and self-report (Fig 2 and S2 Table). Only one SNP near GATA4/NEIL2 showed significant evidence of heterogeneity across the different diagnostic groups (rs804279, Het P = 2.6 × 10^-5; Fig 2 and S3 Table). For this SNP, the largest effect was seen in NIH cases and the smallest in self-reported cases. Credible set analysis, which prioritises variants in a given locus with regards to being potentially causal, was able to reduce the plausible interval for the causal variant(s) at many loci (S4 Table). Of note, 95% of the signal at the THADA locus came from two SNPs. Examination of previously published genome-wide significant loci from Han Chinese PCOS [14,15] demonstrated that index variants from the THADA, FSHR, C9orf3, YAP1 and RAB5B loci were significantly associated with PCOS after Bonferroni correction for multiple testing in our European ancestry subjects (S5 Table).
Table 1 footnotes: (2) Rotterdam diagnostic criteria include the NIH criteria. All subjects from the indicated cohorts were used in the Rotterdam analysis. (3) Controls were screened for regular menses and no hyperandrogenism. * PCOS diagnosis was based on NIH criteria, ** Rotterdam criteria, or *** self report. Results are reported as mean (SD) or a number (%).
We assessed the association of the PCOS susceptibility variants identified in the GWAS meta-analysis with the PCOS related traits: HA, OD, PCOM, testosterone, FSH and LH levels, and ovarian volume in PCOS cases (Tables 3 and S6 and S2 Fig). We found four variants associated with HA, eight variants associated with PCOM and nine variants associated with OD. Of the eight loci associated with PCOM, seven were also associated with OD. Three of the four loci associated with HA were also associated with OD and PCOM. Two additional loci were associated with OD alone, one of which was the locus near FSHB (S6 Table). This locus was also associated with LH and FSH levels. There was a single PCOS locus near IRF1/RAD50 associated with testosterone levels (S6 Table). We repeated this analysis with susceptibility variants reported previously in Han Chinese PCOS cohorts [14,15]. In this analysis, there was one association with HA (near DENND1A), three with PCOM and three with OD (S2 Fig and S5 Table). A limitation of these analyses is the variable sample size across the phenotypes analysed. Additionally, the known referral bias for the more severely affected NIH phenotype (patients having both OD and HA) may result in more PCOS diagnoses than the other criteria [22], and may have contributed to the number of associations between the identified PCOS risk loci and these phenotypes.
In the analyses of the weighted genetic risk score in the Rotterdam cohort, we observed an increase in the risk for PCOS (S3 Fig). Compared to individuals in the third quintile (reference group), individuals in the top (fifth) quintile of risk score have an OR of 1.9 (1.4-2.5; 95% CI) for PCOS based on NIH criteria and an OR of 2.1 (1.7-2.5; 95% CI) for Rotterdam criteria based PCOS. Of the associations, only the effect estimate for the Rotterdam criteria was significant, possibly due to the smaller sample size available for cases diagnosed according to the NIH criteria. When looking at the area under the ROC curve for SNP sets selected at different P-value thresholds, we found a maximum AUC of 0.54 using SNPs with a P-value < 5 × 10^-6 for both diagnostic criteria. While this is significantly better than chance, it is unlikely that a risk score generated from the variants discovered to date would represent a clinically relevant tool.
LD score regression analysis revealed genetic correlations with childhood obesity, fasting insulin, T2D, HDL, menarche timing, triglyceride levels, cardiovascular diseases and depression (Table 4), suggesting that there is shared genetic architecture and biology between these phenotypes and PCOS. There were no genetic correlations with menopause timing or male-pattern balding. Mendelian randomization suggested that there was a causal role for BMI, fasting insulin and depression pathways (Table 5). Interestingly, while there was no genetic correlation detected for male-pattern balding or menopause timing with PCOS, the Mendelian randomization analyses were significant. The difference between the genetic correlation and the Mendelian randomization results suggests that there may be a small number of key biological processes that are common between the phenotypes, and that the common genetic causal variants are limited to the variants shared by this subset of key biological processes. The importance of BMI pathways on reproductive phenotypes was further demonstrated by the attenuation of significance of Mendelian randomization analysis for age-at-menarche when BMI-associated variants were excluded from the analysis.
Fig 2. Odds ratio of polycystic ovary syndrome (PCOS) as a function of diagnostic criteria applied. The Y-axis specifies the diagnostic criteria and the X-axis indicates the odds ratio (OR) and 95% confidence intervals (CI) for PCOS (black circle and horizontal error bars). Data derived as follows: NIH = groups recruiting only NIH diagnostic criteria; NonNIH_Rotterdam = Rotterdam diagnostic criteria excluding the subset fulfilling NIH diagnostic criteria; Rotterdam+NIH = all groups except self-reported; self-reported = 23andMe; and combined = all groups. Specific ORs [95% CI, 5% CI] are indicated on the right. rs804279 in the GATA4/NEIL2 locus demonstrates significant heterogeneity (Het P = 2.6 × 10^-5). The * indicates statistically significant association for PCOS and the variant in that specific stratum. https://doi.org/10.1371/journal.pgen.1007813.g002

Discussion
We found 14 independent loci significantly associated with the risk for PCOS, including three novel loci. The 11 previously reported loci implicated neuroendocrine and metabolic pathways that may contribute to PCOS (1.1 Note in S1 Data). Two of the novel loci contain potential endocrine-related candidate genes. The locus harbouring rs10739076 contains several interesting candidate genes: PLGRKT, a plasminogen receptor, and several genes in the insulin superfamily (INSL6, INSL4, RLN1 and RLN2), which encode endocrine hormones secreted by the ovary and testis and are suspected to impact follicle growth and ovulation [23]. ZBTB16 (also known as PLZF) has been marked as an androgen-responsive gene with anti-proliferative activity in prostate cancer cells [24]. PLZF activates GATA4 gene transcription and mediates cardiac hypertrophic signalling from the angiotensin II receptor 2 [25]. Furthermore, PLZF is upregulated during adipocyte differentiation in vitro [26] and is involved in control of early stages of spermatogenesis [27] and endometrial stromal cell decidualization [28]. The third novel locus harbours a metabolic candidate gene, MAPRE1. It interacts with the low-density lipoprotein receptor related protein 1 (LRP1), which controls adipogenesis [29] and may additionally mediate ovarian angiogenesis and follicle development [30] (1.2 Note in S1 Data). Thus, all the new loci contain genes plausibly linked to both the metabolic and reproductive features of PCOS. We found that there was no significant difference in the association with case status for the majority of the PCOS-susceptibility loci by diagnostic criteria. All susceptibility variants demonstrated the same direction of effect for the NIH phenotype, non-NIH Rotterdam phenotype and self-report, with only one variant demonstrating significant heterogeneity among the groups.
It is of considerable interest that the cohort of research participants from the personal genetics company 23andMe, Inc., identified by self-report, had similar risks to the other cohorts where the diagnosis was clinically confirmed. Our findings suggest that the genetic architecture of these PCOS definitions does not differ for common susceptibility variants. Only one locus, GATA4/NEIL2 (rs804279), was significantly different across diagnostic criteria: it was most strongly associated in NIH cases compared to the Rotterdam phenotype and self-reported cases. Deletion of GATA4 results in abnormal responses to exogenous gonadotropins and impaired fertility in mice [31]. The locus also encompasses the promoter region of FDFT1, which encodes the first enzyme in the cholesterol biosynthesis pathway [32] (cholesterol being the substrate for testosterone synthesis) and is associated with non-alcoholic fatty liver disease [33]. The major difference between the NIH phenotype and the additional Rotterdam phenotypes is metabolic risk; the NIH phenotype is associated with more severe insulin resistance [34]. rs804279 does not show association with any of the metabolic phenotypes in the Type 2 Diabetes Knowledge Portal (type2diabetesgenetics.org, 2015 Feb 1; http://www.type2diabetesgenetics.org/variantInfo/variantInfo/rs804279), so it may represent a PCOS-specific susceptibility locus.
The significant association of PCOS GWAS meta-analysis susceptibility variants with the cardinal PCOS related traits OD, HA and PCOM further strengthened the hypothesis that specific variants may confer risk for PCOS through distinct mechanisms. Three variants, at the C9orf3, DENND1A and RAB5B loci, were associated with all PCOS related traits. The findings were consistent with the Han Chinese DENND1A variant association with HA, as suggested previously [35]. Thus, these loci, along with GATA4/NEIL2 (as discussed above), may help identify pathways that link specific PCOS related traits with greater metabolic risk. In contrast, the variants at the ERBB4, YAP1 and ZBTB16 loci were strongly associated with OD and PCOM, and therefore might be more important for links to menstrual cycle regularity and fertility. In addition, the FSHB variant was associated with the levels of FSH and LH [16,17], suggesting that it may act by affecting gonadotropin levels. This variant maps 2 kb upstream from open chromatin (identified by DNase-Seq) and an enhancer (identified by peaks for both H3K27Ac and H3K4me1) in a lymphoblastoid cell line from ENCODE, indicating a potential role for a regulatory element ~25 kb upstream from the FSHB promoter. Furthermore, the association between the IRF1/RAD50 variant and testosterone levels may indicate a regulatory role in testosterone production. Of note, results of the follow-up analysis show a high level of shared biology between PCOS and a range of metabolic outcomes, consistent with previous findings [17]. In particular, there is genetic evidence for increased BMI as a risk factor for PCOS. There is also genetic evidence that fasting insulin might be an independent risk factor. This study also confirmed a causal association with the pathways that underlie menopause [17], suggesting that PCOS has shared aetiology with both classic metabolic and reproductive phenotypes.
Furthermore, there was an apparent effect of depression-associated variants on the likelihood of PCOS, suggesting a role for psychological factors in hormonally related diseases. However, the links between PCOS and depression might be complicated by pathways that are also related to BMI, as BMI pathways are causal in both PCOS and depression [36]. In addition, male-pattern balding-associated variants showed strong effects on PCOS, suggesting that this might be a male manifestation of PCOS pathways, as has been previously suggested [18,20,21,37]. This observation may reflect the biology of hair follicle sensitivity to androgens, seen in androgenetic alopecia, a well-recognised feature of HA and PCOS [38,39]. The Mendelian randomization results for male-pattern balding and menopause are significant despite non-significant genetic correlation results, suggesting that the shared aetiology may be specific to only a few key pathways.
In conclusion, the genetic underpinnings of PCOS implicate neuroendocrine, metabolic and reproductive pathways in the pathogenesis of disease. Although specific phenotype stratified analyses are needed, genetic findings were consistent across the diagnostic criteria for all but one susceptibility locus, suggesting a common genetic architecture underlying the different phenotypes. There was genetic evidence for shared biologic pathways between PCOS and a number of metabolic disorders, menopause, depression and male-pattern balding, a putative male phenotype. Our findings demonstrate the extensive power of genetic and genomic approaches to elucidate the pathophysiology of PCOS.

Ethics statement
All research involving human participants has been approved by the authors' Institutional Review Board (IRB) or an equivalent committee, and all clinical investigation was conducted according to the principles expressed in the Declaration of Helsinki. Written informed consent was obtained from all participants. The Boston cohort was approved by the Partners IRB (# 2002P001924) and the University of Utah IRB (IRB_00076659). The deCODE cohort was approved by the National Bioethics Committee of Iceland (VSN 03-007), which was conducted in agreement with conditions issued by the Data Protection Authority of Iceland. Personal identities of the participants' data and biological samples were encrypted by a third-party system (Identity Protection System), approved and monitored by the Data Protection Authority.

Subjects
The meta-analysis included 10,074 cases and 103,164 controls from seven cohorts of European descent. For the analysis of PCOS related traits, three additional cohorts, the Northern Finnish Birth Cohort (NFB66) [40], Twins UK [41] and the Nurses' Health Study (NHS) [42], were included. Cases were diagnosed with PCOS based on NIH or Rotterdam criteria or by self-report. The NIH criteria require the presence of both OD and clinical and/or biochemical HA for a diagnosis of PCOS [6]. The Rotterdam criteria require two out of three features for a diagnosis of PCOS [7]: 1) OD defined by oligo- or amenorrhea (chronic menstrual cycle interval >35 days in all cohorts), 2) clinical and/or biochemical HA and 3) PCOM. Non-NIH Rotterdam was defined by OD and PCOM, or by clinical and/or biochemical HA and PCOM. Self-reported female cases from research participants in the 23andMe, Inc. (Mountain View, CA, USA) cohort either responded "yes" to the question "Have you ever been diagnosed with polycystic ovary syndrome?" or indicated a diagnosis of PCOS when asked about fertility ("Have you ever been diagnosed with PCOS?" or "What was your diagnosis? Please check all that apply." Answer = PCOS), hair loss in men or women ("Have you been diagnosed with any of the following? Please check all that apply." Answer = PCOS) or research question ("Have you ever been diagnosed with PCOS?") [17]. 23andMe controls were female only.
HA was defined as hirsutism and quantified by the Ferriman-Gallwey (FG) score. The FG score assesses terminal hair growth in a male pattern in females, and a score above the upper limit of normal controls (>8) is considered hirsutism [43]. Hyperandrogenemia was defined as testosterone, androstenedione or DHEAS greater than the 95% confidence limits in control subjects in the individual population. OD was defined as cycle interval <21 or >35 days [44]. PCOM was defined as 12 or more follicles of 2-9 mm in at least one ovary or an ovarian volume >10 mL [7]. The quantitative PCOS traits included levels of total testosterone (T), follicle-stimulating hormone (FSH), and luteinizing hormone (LH) and ovarian volume (S1 Table). An overview of the cohorts, diagnostic criteria and number of subjects included in each subphenotype or trait analysis are summarized in Tables 1 and S1.

Data collection and quality control
Each study provided summary results of genetic per-variant estimates produced in either case-control or trait association analyses. Adjustment for principal components was performed at the study level. The collected files underwent quality control (QC) by two independent analysts using the EasyQC pipeline [45]. Variants were excluded based on minor allele frequency (MAF) < 1%, or imputation quality R² < 0.3 for MACH or info < 0.4 for IMPUTE2 [46,47]. Per-cohort QC results from EasyQC are shown in S7 Table, and the allele frequency spectra for each cohort and for the combined cohort after meta-analysis are shown in S4 Fig.
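The exclusion thresholds above can be written as a simple per-variant predicate. The following is a minimal sketch only, not the EasyQC pipeline itself; the function name, the record fields (`maf`, `quality`, `imputer`) and the example variants are hypothetical.

```python
# Illustrative per-variant filter using the thresholds stated in the text.
# Field names and example records are assumptions, not EasyQC's schema.
QUALITY_THRESHOLDS = {"MACH": 0.3, "IMPUTE2": 0.4}  # R^2 for MACH, info for IMPUTE2

def passes_qc(maf, quality, imputer):
    """Return True if a variant survives the MAF and imputation-quality filters."""
    if maf < 0.01:  # exclude variants with MAF < 1%
        return False
    # exclude variants whose imputation quality is below the software-specific cut-off
    return quality >= QUALITY_THRESHOLDS[imputer]

variants = [
    {"id": "v1", "maf": 0.200, "quality": 0.95, "imputer": "MACH"},
    {"id": "v2", "maf": 0.005, "quality": 0.99, "imputer": "MACH"},     # fails MAF
    {"id": "v3", "maf": 0.100, "quality": 0.35, "imputer": "IMPUTE2"},  # fails info
    {"id": "v4", "maf": 0.100, "quality": 0.35, "imputer": "MACH"},     # passes R^2
]
kept = [v["id"] for v in variants if passes_qc(v["maf"], v["quality"], v["imputer"])]
```

Note that the same quality value (0.35) passes for MACH but fails for IMPUTE2, because the two programs report different metrics with different cut-offs.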

Meta-analysis of PCOS status and PCOS related traits
The per-variant estimates collected from the summary statistics of contributing studies were meta-analysed using a fixed-effect, inverse-weighted-variance meta-analysis that employed either GWAMA [48] or METAL [49]. In addition to the overall meta-analysis, we performed meta-analyses for studies with available data for the separate PCOS diagnostic criteria: NIH, non-NIH Rotterdam [7] and self-report [17], as well as for the PCOS related traits of HA, OD and PCOM. The meta-analysis of PCOS status was performed using two models: (1) age-adjusted and (2) age- and BMI-adjusted, given the high prevalence of obesity in affected women that resulted in cases being significantly heavier than controls in most cohorts (Table 1).
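For a single variant, a fixed-effect meta-analysis of this kind weights each study's effect estimate by the inverse of its variance. The sketch below shows the generic textbook calculation under that assumption; it is not the GWAMA or METAL implementation, and the function name and input values are illustrative.

```python
import math

def ivw_fixed_effect(betas, ses):
    """Pool per-study log-odds ratios using inverse-variance weights.

    betas: per-study effect estimates (log-OR) for one variant
    ses:   corresponding standard errors
    Returns (pooled beta, pooled SE, z-score, two-sided p-value).
    """
    weights = [1.0 / se ** 2 for se in ses]
    pooled = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    z = pooled / pooled_se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided p-value from the normal distribution
    return pooled, pooled_se, z, p

# Two hypothetical studies: the more precise study dominates the pooled estimate.
beta, se, z, p = ivw_fixed_effect([0.10, 0.20], [0.05, 0.10])
```

With these inputs the weights are 400 and 100, so the pooled estimate (0.12) lies closer to the first, more precise study.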
We removed any variants that were not present in more than 50% of the effective sample size prior to combining with 23andMe, as this was the largest cohort in the meta-analysis, providing approximately 51% of the PCOS cases and 80% of controls. We also removed any variants only present in one study. The meta-analysis of PCOS related traits was performed adjusting for age and BMI. Identified variants were annotated for insight into their biological function using ANNOVAR [50] to assign refGene gene information, SIFT scores [51], PolyPhen2 scores [52], CADD scores [53], GERP scores [54] and SiPhy log odds [55].

Comparison of PCOS diagnostic criteria
In order to compare the different PCOS diagnostic criteria [(1) NIH, (2) non-NIH Rotterdam and (3) self-reported] included in the PCOS meta-analysis, an additional meta-analysis was performed to test for heterogeneity across these independent PCOS case groups. These three PCOS case groups were combined in an inverse-variance-weighted fixed-effect meta-analysis and the heterogeneity statistics (Cochran's Q and I²) were obtained using GWAMA [48]. Any variant with a statistically significant Cochran's Q p-value (P < 0.05/14 = 0.0036, corrected for multiple testing) and I² > 70% was considered to exhibit heterogeneity across the PCOS case groups. Further analysis of the heterogeneity included comparison of the 95% confidence intervals for the direction of effect and overlaps.
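As an illustration of the heterogeneity criterion, the sketch below computes Cochran's Q and I² for one variant across three case groups. It uses the standard definitions rather than GWAMA's code; the function name and effect sizes are hypothetical, and the p-value uses the closed form exp(-Q/2), which is valid only for two degrees of freedom (three groups).

```python
import math

def cochran_q_i2(betas, ses):
    """Cochran's Q, its degrees of freedom, and I^2 (in percent) for one variant."""
    weights = [1.0 / se ** 2 for se in ses]
    pooled = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, betas))
    df = len(betas) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, df, i2

def q_pvalue_three_groups(q):
    """Chi-square survival function for df = 2 only (the three-group case)."""
    return math.exp(-q / 2.0)

# Hypothetical log-ORs for NIH, non-NIH Rotterdam and self-reported cases.
q, df, i2 = cochran_q_i2([0.30, 0.15, 0.05], [0.05, 0.05, 0.03])
p_het = q_pvalue_three_groups(q)
heterogeneous = (p_het < 0.05 / 14) and (i2 > 70.0)
```

Both conditions must hold for a variant to be flagged, mirroring the joint requirement on the Q p-value and I² described above.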

Identifying associations between PCOS Loci and PCOS related traits
In order to understand the biology relevant to the identified PCOS susceptibility loci, we assessed the association between index SNPs at each genome-wide-significant locus and the PCOS related traits HA, OD and PCOM, as well as the quantitative traits testosterone, LH and FSH levels and ovarian volume. The threshold for significance in this analysis was p < 4.5 × 10^-4 (Bonferroni correction: 0.05/[14 independent loci × 8 traits]).

Identifying shared risk loci between European ancestry and Han Chinese PCOS
In order to identify shared risk loci between the previously reported GWAS in Han Chinese PCOS cases and our European ancestry cohort, 13 independent signals (represented by 15 SNPs) at 11 genome-wide significant loci reported by Chen et al. [14] and Shi et al. [15] were investigated for association in our meta-analyses of PCOS and PCOS related traits. The adjusted P-value for this analysis was <0.00048 (Bonferroni correction [0.05/(13 independent signals x 8 traits)]).
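The Bonferroni thresholds quoted in the Methods follow directly from dividing 0.05 by the number of tests. The short sketch below reproduces that arithmetic; the helper name is illustrative.

```python
def bonferroni(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests

heterogeneity_threshold = bonferroni(0.05, 14)      # 14 loci tested for heterogeneity
trait_threshold = bonferroni(0.05, 14 * 8)          # 14 loci x 8 traits
han_chinese_threshold = bonferroni(0.05, 13 * 8)    # 13 independent signals x 8 traits

# Formatted to the precision used in the text.
formatted = (
    f"{heterogeneity_threshold:.4f}",
    f"{trait_threshold:.1e}",
    f"{han_chinese_threshold:.5f}",
)
```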

Biologic function of genes in associated loci
Information on the biological function of the gene nearest to the index SNP of each identified risk locus (or genes, if variants were equidistant from more than one coding transcript and annotated as such by ANNOVAR [50]) was collected by performing a search of the Entrez Gene Database [56], and collecting the co-ordinates of the gene (genome build 37; hg19) as well as the cytogenetic location and the summary of the gene function. In addition to the Entrez Gene Database queries, the gene symbol was used as a search term in the PubMed database [57], either alone or combined with the additional search term "PCOS", to identify relevant published literature in order to obtain information on putative biological function and involvement in the pathogenesis of PCOS (summarized in 1.1 Note in S1 Data).

Weighted genetic risk score and prediction
One potential use of genetic risk scores is prediction of disease. The ability of genetic risk scores calculated from loci discovered in analysis of the different diagnostic criteria to discriminate cases from alternative criteria was measured. We constructed a weighted genetic risk score based on a meta-analysis excluding the Rotterdam Study subjects. The weighted genetic risk score was divided into quintiles and tested for association with PCOS in the Rotterdam cohort. The middle quintile was used as the reference and the odds for having PCOS based on both Rotterdam and NIH criteria was then calculated.
Additionally, the 23andMe results were used to select independent SNPs with cut-offs of p < 5 × 10^-4 to p < 5 × 10^-8. The Rotterdam cohort was then used to calculate risk scores and the area under the curve (AUC) for both NIH and Rotterdam diagnostic criteria. Analyses were performed using PLINK v1.9 and SPSS v21 (IBM Corp, Armonk, NY) [58].
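A weighted genetic risk score of the kind described here is a sum of risk-allele dosages weighted by per-SNP effect sizes, and a quintile comparison can be summarised as an odds ratio against the reference quintile. The sketch below illustrates both steps under simplifying assumptions: the odds ratio is computed from a 2x2 table with a Wald confidence interval rather than from the regression models run in PLINK or SPSS, and all names and counts are hypothetical.

```python
import math

def weighted_grs(dosages, log_odds_weights):
    """Sum of risk-allele dosages (0 to 2) weighted by per-SNP log-odds ratios."""
    return sum(d * w for d, w in zip(dosages, log_odds_weights))

def odds_ratio_ci(cases_q, controls_q, cases_ref, controls_ref):
    """Odds ratio and Wald 95% CI for one quintile versus the reference quintile."""
    or_ = (cases_q * controls_ref) / (controls_q * cases_ref)
    se = math.sqrt(1 / cases_q + 1 / controls_q + 1 / cases_ref + 1 / controls_ref)
    lower = math.exp(math.log(or_) - 1.96 * se)
    upper = math.exp(math.log(or_) + 1.96 * se)
    return or_, lower, upper

# One hypothetical individual genotyped at three risk SNPs.
score = weighted_grs([0, 1, 2], [0.10, 0.20, 0.30])

# Hypothetical counts: top quintile versus middle (reference) quintile.
or_top, lo_top, hi_top = odds_ratio_ci(40, 60, 20, 80)
```

In practice the scores would be computed for every individual, ranked into quintiles, and the comparison repeated for each quintile against the middle one.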

Mendelian randomization
Phenotypes of interest, both where there was evidence of shared genetic architecture and where there was previous evidence for genetic links, were assessed using Mendelian randomization methods. Mendelian randomization differs from LD score regression in that one phenotype is analysed as a potential causal factor for another. Mendelian randomization was performed using both inverse weighted variance (IVW) and Egger's regression methods [68], with the IVW method being more powerful, but Egger's method being resistant to directional pleiotropy (where there is a set of SNPs that appear to have an alternative pathway of effect). We report here the results of the IVW method, as none of the MR-Egger intercepts was significant and there was thus no indication that the MR-Egger results were more appropriate (Table 5). In addition to the phenotypes implicated by the LD score regression measures, male-pattern balding has a strong biological rationale and was therefore included. The genetic score for childhood obesity substantially overlaps with the score for adult BMI, such that a violation of the InSIDE assumption of Mendelian randomization (where the effect of SNPs on a confounding factor scales with that on the trait of interest) would likely occur [69]. Therefore only a score for BMI was used, with the proviso that this represents BMI across the whole of the life course after very early infancy. The SNPs for depression were drawn from the results of a more recent analysis, for which there was not, at the time of analysis, publicly available genome-wide data.
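The two estimators can be contrasted on summary statistics. In the sketch below, which uses the standard textbook formulations and not necessarily the software used in this study, IVW is a weighted regression of SNP-outcome effects on SNP-exposure effects through the origin, while MR-Egger fits the same regression with an intercept that captures directional pleiotropy. Function names and effect sizes are hypothetical.

```python
import math

def mr_ivw(bx, by, se_y):
    """IVW causal estimate and its fixed-effect SE (regression through the origin)."""
    w = [1.0 / s ** 2 for s in se_y]
    den = sum(wi * x * x for wi, x in zip(w, bx))
    est = sum(wi * x * y for wi, x, y in zip(w, bx, by)) / den
    return est, math.sqrt(1.0 / den)

def mr_egger(bx, by, se_y):
    """MR-Egger intercept and slope from a weighted regression with an intercept."""
    # Orient every SNP so that its effect on the exposure is positive.
    signs = [1.0 if x >= 0 else -1.0 for x in bx]
    bx = [s * x for s, x in zip(signs, bx)]
    by = [s * y for s, y in zip(signs, by)]
    w = [1.0 / s ** 2 for s in se_y]
    sw = sum(w)
    mx = sum(wi * x for wi, x in zip(w, bx)) / sw
    my = sum(wi * y for wi, y in zip(w, by)) / sw
    sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, bx))
    slope = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, bx, by)) / sxx
    return my - slope * mx, slope

# Hypothetical instruments with a true causal effect of 0.5 and no pleiotropy.
bx = [0.05, 0.10, -0.08, 0.12]
by = [0.5 * x for x in bx]
se_y = [0.010, 0.020, 0.015, 0.010]
ivw_est, ivw_se = mr_ivw(bx, by, se_y)
egger_intercept, egger_slope = mr_egger(bx, by, se_y)
```

An Egger intercept close to zero, as in this example, is the situation in which the IVW estimate is preferred for its greater power.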

Credible sets
We defined a locus as mapping within 500kb of the lead SNP. For each locus, we first calculated the posterior probability, $\pi_{Cj}$, that the jth variant is driving the association, given by:

$$\pi_{Cj} = \frac{\Lambda_j}{\sum_k \Lambda_k},$$

where the summation is over all retained variants in the locus. In this expression, $\Lambda_j$ is the approximate Bayes' factor [70] for the jth variant, given by

$$\Lambda_j = \sqrt{\frac{V_j}{V_j + \omega}} \exp\left[\frac{\omega \beta_j^2}{2 V_j (V_j + \omega)}\right],$$

where $\beta_j$ and $V_j$ denote the estimated allelic effect (log-OR) and corresponding variance from the meta-analysis. The parameter $\omega$ denotes the prior variance in allelic effects, taken here to be 0.04 [70]. The 99% credible set [71] for each signal was then constructed by: (i) ranking all variants according to their Bayes' factor, $\Lambda_j$; and (ii) including ranked variants until their cumulative posterior probability of driving the association attained or exceeded 0.99.
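The ranking procedure can be sketched in a few lines. The example below assumes the Wakefield form of the approximate Bayes' factor with prior variance 0.04 and works in log space for numerical stability; the function names and the three example variants are hypothetical.

```python
import math

def log_abf(beta, var, omega=0.04):
    """Log of the approximate Bayes' factor for one variant (Wakefield form, assumed)."""
    return 0.5 * math.log(var / (var + omega)) + omega * beta ** 2 / (2 * var * (var + omega))

def credible_set(variants, coverage=0.99, omega=0.04):
    """variants: list of (variant_id, beta, variance). Returns the ids in the credible set."""
    logs = [(vid, log_abf(b, v, omega)) for vid, b, v in variants]
    max_log = max(l for _, l in logs)
    # Subtract the maximum before exponentiating to avoid overflow.
    scaled = [(vid, math.exp(l - max_log)) for vid, l in logs]
    total = sum(s for _, s in scaled)
    ranked = sorted(scaled, key=lambda t: t[1], reverse=True)
    selected, cumulative = [], 0.0
    for vid, s in ranked:
        selected.append(vid)
        cumulative += s / total  # posterior probability of this variant
        if cumulative >= coverage:
            break
    return selected

# One strong signal and two weak neighbours in a hypothetical locus.
locus = [("rsA", 0.20, 0.0004), ("rsB", 0.05, 0.0004), ("rsC", 0.01, 0.0004)]
cs99 = credible_set(locus)
```

When one variant carries nearly all of the posterior probability, as here, the credible set collapses to that single variant; when the evidence is spread evenly, the set grows to include every variant needed to reach the coverage.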

Supporting information
S1 Data. Supplementary results: suggestive evidence of a 15th signal, rs151212108, near ARSD on the X chromosome, and literature lookup of genes at PCOS risk loci.
(XLSX) S4 Table. Fine-mapping of PCOS risk loci identified in the meta-analysis to narrow candidate causal variants.