A variant in MCF2L is associated with osteoarthritis.

Osteoarthritis (OA) is a prevalent, heritable degenerative joint disease with a substantial public health impact. We used a 1000-Genomes-Project-based imputation in a genome-wide association scan for osteoarthritis (3177 OA cases and 4894 controls) to detect a previously unidentified risk locus. We discovered a small disease-associated set of variants on chromosome 13. Through large-scale replication, we establish a robust association with SNPs in MCF2L (rs11842874, combined odds ratio [95% confidence interval] 1.17 [1.11-1.23], p = 2.1 × 10⁻⁸) across a total of 19,041 OA cases and 24,504 controls of European descent. This risk locus represents the third established signal for OA overall. MCF2L regulates a nerve growth factor (NGF), and treatment with a humanized monoclonal antibody against NGF is associated with reduction in pain and improvement in function for knee OA patients.

Osteoarthritis (OA) is the most common form of arthritis and is associated with a large health economic burden. [1] The sibling recurrence risk (λs) for OA has been estimated to be approximately 5 in the UK. [1] Two loci (GDF5 [MIM 601146] on chromosome 20 and a signal on chromosomal region 7q22, both with allelic odds ratios of ~1.15) have reached genome-wide significance in European populations. [2-5] This paucity of established risk loci could be ascribed to limitations caused by insufficient sample sizes, phenotype heterogeneity, resolution of known variation, associations with low-frequency and/or rare variants, interaction effects, or structural variation. [6,7] We recently carried out a large genome-wide association scan (GWAS) restricted to knee and/or hip OA and detected no replicating signals (arcOGEN GWAS). [8] Imputation based on the 1000 Genomes Project (1KGP) has been proposed as an approach that will increase power and resolution in genetic association studies, [9] and researchers have already applied the technique to fine map known association signals. [10,11] In this work, we applied 1KGP-based imputation and identified a genome-wide significant locus for OA within a gene previously unlinked to the disease.
We used 1KGP pilot 1 data of 60 CEU individuals as a reference set and imputed 1KGP-identified variants into the arcOGEN GWAS of 3177 cases and 4894 UK controls [12-14] (Figure 1). The 3177 OA cases are unrelated individuals of European ancestry collected in the UK on the basis of two criteria: (1) radiographic evidence of disease (defined as a Kellgren-Lawrence [KL] grade ≥ 2 [15]) and/or (2) clinical evidence of disease requiring joint replacement (TJR). The 4894 UK-population-based controls were unrelated individuals from the 1958 British Birth Cohort (58BC) and the UK National Blood Donor Service (UKBS) and were obtained from an early release of the Wellcome Trust Case Control Consortium 2 (WTCCC2) data. The genotyping and quality control (QC) of these individuals and their genotype data were described previously in the initial arcOGEN GWAS. [8] Our primary 1KGP imputation was based on the April 2009 release of haplotypes for 57 individuals. After removing rare variants (with minor allele frequency [MAF] < 0.01) and SNPs with low imputation quality (r² < 0.3), 7,258,070 variants were tested for association with OA. Further quality control was applied by closely examining all SNPs with p < 10⁻⁵ in the association test, removing poorly clustering directly typed SNPs in their vicinity (up to 300 kb away), repeating the imputation step with the August 2009 1KGP release of haplotypes from 56 individuals, and reassessing evidence for association. We selected eight SNPs from six loci for validation in the original arcOGEN data and for de novo genotyping in independent follow-up sample sets (Table S1, available online).
As part of our follow-up, we first genotyped an independent set of 5165 arcOGEN-collected cases and 6155 population-based controls from the 58BC and UKBS cohorts. Seven out of the eight SNPs were successfully typed with a Sequenom MassArray iPLEX Gold assay (Table S1) and one SNP, rs11842874 on 13q34, replicated with p = 2.  Table 1). We subsequently took this signal forward to de novo genotyping in two further sample sets from the UK: the Genetics of Osteoarthritis and Lifestyle (GOAL) study [16,17] (1686 total joint replacement cases, 743 non-OA controls) and an additional independent set of 2409 newly recruited arcOGEN cases and 2319 population-based controls from the 58BC and UKBS cohorts. The combined UK meta-analysis (n = 12,437 cases, 14,111 controls) allelic OR was 1.22 [1.14-1.30], p = 2.24 × 10⁻⁸. We further investigated association with this variant in four non-UK OA sample sets: two from the Netherlands ( 21,22 and one from Iceland (deCODE, 1552 cases and 3071 controls, de novo genotyping). We used a meta-analysis framework to combine results across the follow-up studies only and across all data. We obtained the combined estimates of ORs for reference alleles by weighting the log ORs of each study by the inverse of their variance via a fixed effects model. We investigated evidence of heterogeneity of ORs by using Cochran's Q and I² statistics. The meta-analysis was performed with the GWAMA software package. [23] In all seven follow-up datasets combined, rs11842874 was associated with OA with p = 3.0 × 10⁻⁵ (allelic OR 1.13 [1.07-1.20]). Combined with the discovery sample set, the overall fixed effects meta-analysis (across 19,041 cases and 24,504 controls) established association at this variant with p = 2.07 × 10⁻⁸ (allelic OR 1.17 [1.11-1.23]; Figure 2, Table 1).
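The inverse-variance fixed effects combination and the heterogeneity statistics described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the GWAMA implementation: the function name fixed_effects_meta is hypothetical, the three per-study odds ratios and confidence limits are invented for the example and are not entries from Table 1, and standard errors are recovered from the 95% confidence limits on the log scale.

```python
import math

Z95 = 1.959963984540054  # standard normal quantile for a 95% interval


def fixed_effects_meta(odds_ratios, ci_lowers, ci_uppers):
    """Inverse-variance fixed effects meta-analysis on the log-OR scale.

    Hypothetical helper for illustration only. Returns the pooled OR with
    its 95% CI, a two-sided p value, Cochran's Q, and I^2 (in percent).
    """
    log_ors = [math.log(o) for o in odds_ratios]
    # Standard error of each log OR, recovered from the width of its 95% CI.
    ses = [(math.log(u) - math.log(l)) / (2 * Z95)
           for l, u in zip(ci_lowers, ci_uppers)]
    weights = [1.0 / se ** 2 for se in ses]
    total_w = sum(weights)

    pooled = sum(w * b for w, b in zip(weights, log_ors)) / total_w
    pooled_se = math.sqrt(1.0 / total_w)
    z = pooled / pooled_se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p value

    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, log_ors))
    df = len(log_ors) - 1
    # I^2: share of variation beyond chance, truncated at zero.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    return {
        "or": math.exp(pooled),
        "ci_low": math.exp(pooled - Z95 * pooled_se),
        "ci_high": math.exp(pooled + Z95 * pooled_se),
        "p": p_value,
        "q": q,
        "i2": i2,
    }


# Hypothetical per-study estimates (NOT taken from the article).
result = fixed_effects_meta(
    odds_ratios=[1.25, 1.10, 1.18],
    ci_lowers=[1.10, 0.95, 1.02],
    ci_uppers=[1.42, 1.27, 1.37],
)
print(result)
```

Working on the log scale keeps the weights proportional to the precision of each study, so larger studies dominate the pooled estimate, while Q and I² indicate whether the per-study estimates differ by more than sampling error would suggest.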
The variant appears to be more strongly associated with knee OA (allelic OR 1.17 [1.10-1.25], p = 2.52 × 10⁻⁶, effective sample size of 28,987) than with hip OA (allelic OR 1.11 [1.03-1.19], p = 3.54 × 10⁻³, effective sample size of 27,452). Studies contributing data to this manuscript acquired informed consent from all participants and were approved by the appropriate ethics committee(s) for the respective institutions and countries.
rs11842874 is one of several highly correlated SNPs at 13q34 that together constitute the observed association signal, which spans 12.7 kb (Figure 3). The surrounding 1 Mb region is characterized by low levels of linkage disequilibrium and contains only nine SNPs correlated with rs11842874 at r² > 0.7. rs11842874 was selected for replication because it is the only variant present on some GWAS platforms. The OA risk-increasing allele is the major allele with population frequency of 0.927 (mean over the UK control data), and as a common variant with low OR, it contributes little to the sibling recurrence risk (estimated λs = 1.001). All SNPs that comprise the signal reside in  human cells MCF2L regulates neurotrophin-3-induced cell migration in Schwann cells. [26] Neurotrophin-3 is a member of the nerve growth factor (NGF) family. Treatment of knee OA patients with a humanized monoclonal antibody that inhibits NGF was found to be associated with joint pain reduction and an improvement in function. [27,28] The MCF2L OA locus was taken forward on the basis of evidence accrued through 1KGP-based imputation. Direct typing of ~600,000 SNPs through GWAS resulted in modest (p > 10⁻⁵) evidence for the association of a single variant, rs11842874, with OA in this region, and HapMap-based imputation left the picture unchanged (Figures 3A and 3B). Prioritization strategies for follow-up in our published GWAS [8] down-weighted lone variants with no corroboration of association from neighboring SNPs. The denser 1KGP reference set empowered the association of multiple additional correlated (i.e., nonindependent), imputed variants, several of which showed stronger evidence for association with OA, thus highlighting this region for validation and replication genotyping (Figure 3C, Table S1).
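The small sibling recurrence risk quoted above for rs11842874 can be approximated from the risk allele frequency and the allelic odds ratio. The sketch below assumes a multiplicative (log-additive) model, Hardy-Weinberg proportions, and that the odds ratio approximates the per-allele relative risk. The article does not state which formula was used, so this is one standard approximation rather than the authors' calculation, and the function name is hypothetical.

```python
def locus_sibling_recurrence_risk(risk_allele_freq, allelic_or):
    """Locus-specific sibling recurrence risk under a multiplicative model.

    Hypothetical helper. Assumes Hardy-Weinberg proportions and treats the
    allelic odds ratio as the per-allele relative risk (an approximation,
    since OA is a common disease).
    """
    p = risk_allele_freq
    q = 1.0 - p
    # Ratio of additive genetic variance to squared mean risk at the locus.
    w = p * q * (allelic_or - 1.0) ** 2 / (p * allelic_or + q) ** 2
    return (1.0 + w / 2.0) ** 2


# Values reported in the text: risk allele frequency 0.927, combined OR 1.17.
lambda_s = locus_sibling_recurrence_risk(0.927, 1.17)
print(round(lambda_s, 3))  # 1.001
```

Under these assumptions the result rounds to 1.001, in line with the reported estimate and far below the overall λs of approximately 5 cited for OA in the UK.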
Figure 2. The purple squares designate the estimated odds ratio (OR) for each individual study, and the error bars extending out on both sides show the 95% confidence interval for the OR estimate. The pink diamond designates the estimated OR of the fixed effects meta-analysis of all the studies.
Figure 3. The left y axis is the −log (p value) of SNPs in the region, and the right y axis is the recombination rate (cM/Mb) as calculated from the pilot 1 release of the 1KGP. Each diamond represents a variant and is colored according to its correlation (r²) with rs11842874. The green arrows below provide an overview of the genes in the region and their transcriptional direction. Imputed variants are denoted by circles and directly typed variants are denoted by diamonds.
The associated variants are common, but their minor allele frequencies are toward the lower end of the frequency spectrum (at ~0.07). The identification of similar variants with modest effect sizes (OR 1.17) at genome-wide significance levels will require sample sizes in the order of 23,000 cases and an equal number of controls. Through several rounds of cluster-plot inspection, removal of poor quality SNPs, and reimputation, we observed that the majority of signals were caused by genotyping and imputation artifacts. This fact highlights the need for imputed signals to be scrutinized postimputation before follow-up studies are deployed. As the field of complex trait association studies shifts its focus toward low-frequency and rare variants, thorough quality control of signals becomes highly relevant. In this study, we have restricted 1KGP-based imputation to variants with an MAF > 0.01. As reference panel sizes become larger, imputation of lower frequency variants will become more feasible, empowering the examination of rare variation in next generation association studies. The genetic architecture of OA has not been elucidated yet. By identifying this susceptibility locus, the third one discovered for OA, our study now provides a foundation on which functional studies can be based. New additions to the genetic study toolset, including larger sample sets, well-characterized phenotypes, and resequenced reference-panel-based imputation approaches, hold the promise of providing insights into the etiology of this common degenerative joint disease.
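The sample-size remark above (in the order of 23,000 cases and as many controls for an OR of 1.17 at a minor allele frequency near 0.07) can be explored with a simple power calculation. The sketch below is an assumption-laden illustration, not the authors' calculation: it uses a normal approximation to the allelic log-odds-ratio test, takes 5 × 10⁻⁸ as the genome-wide significance threshold (the text does not state the threshold or the target power), and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def allelic_test_power(n_cases, n_controls, control_freq, allelic_or,
                       alpha=5e-8):
    """Approximate power of a two-sided allelic association test.

    Hypothetical helper. control_freq is the frequency in controls of the
    allele to which allelic_or refers; alpha is an assumed threshold.
    """
    # Expected allele frequency in cases implied by the odds ratio.
    case_odds = control_freq / (1.0 - control_freq) * allelic_or
    case_freq = case_odds / (1.0 + case_odds)

    # Variance of the log OR from a 2x2 allele-count table
    # (each individual contributes two alleles).
    var_log_or = (
        1.0 / (2 * n_cases * case_freq * (1.0 - case_freq))
        + 1.0 / (2 * n_controls * control_freq * (1.0 - control_freq))
    )
    z_effect = abs(math.log(allelic_or)) / math.sqrt(var_log_or)
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)

    norm = NormalDist()
    return norm.cdf(z_effect - z_crit) + norm.cdf(-z_effect - z_crit)


# Risk allele frequency 0.927 and OR 1.17 are taken from the text;
# the threshold and the equal case/control design are assumptions.
for n in (10_000, 23_000, 40_000):
    print(n, round(allelic_test_power(n, n, 0.927, 1.17), 3))
```

Varying the sample size in the loop shows how quickly power changes for a variant of this frequency and effect size, which is the practical point behind the sample-size estimate in the text.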

Supplemental Data
Supplemental Data include one table and the Acknowledgments and can be found with this article online at http://www.cell.com/AJHG/.

Acknowledgments
The collection of GOAL samples was funded by AstraZeneca UK. The author affiliated with AstraZeneca is an employee of AstraZeneca, a global research-based biopharmaceutical company focused on discovering, developing and marketing medicines for some of the world's most serious illnesses, and owns stocks or stock options in the company and has a pending patent application. The authors who are affiliated with deCODE genetics are all employees of deCODE, a biotechnology company that provides genetic testing services, and some own stocks or stock options in the company. Additional acknowledgments are in the Supplemental Data.