Limitations of Human Genetic Studies on Osteoarthritis

Despite its high prevalence and social impact, osteoarthritis (OA) lags far behind other skeletal diseases such as osteoporosis in the development of disease-modifying treatments. This is mainly because little is known about the underlying molecular mechanisms that could serve as therapeutic targets. Since OA is a multifactorial disease caused by a complex interplay between environmental and genetic factors, with heritability estimated at around 50% depending on the site [1], considerable effort and expense have been devoted to human genetic studies on OA worldwide. Although linkage studies have shown large chromosomal regions to be associated with the disease, they have failed to identify the susceptibility genes. Candidate gene studies have proposed over 100 genes as being responsible; however, most of these associations have not been replicated in larger meta-analyses. More recently, genome-wide association studies (GWAS) have led to the discovery of over 600 gene loci in over 50 common multifactorial diseases, but most of the gene variants have only a minimal individual effect. Even though genes identified with such small effect sizes could possibly be therapeutic targets or at least prognostic markers, it is questionable whether these conventional OA genetic studies are worth such an enormous investment. Aiming at a well-powered approach for this highly polygenic disease, in which multiple risk loci confer small effects, consortium studies such as Treat-OA and arcOGEN have been developed to enlarge the sample size. Considering the characteristics and prevalence of the disease, however, it is our opinion that not only the quantity but also the quality of studies is critical for identifying the genetic architecture. In this sense, to us as clinicians rather than genetic experts, the conventional OA genetic studies do not seem to have been performed with sufficient scientific rigor, even compared with those on other common diseases.


Editorial
Several studies indicate that an inconsistent and ambiguous definition of OA is a critical limitation of conventional genetic studies [2]. In addition to the stringency of disease definition raised by those studies, we propose here, from a clinician's point of view, two other major issues in the conventional studies: selection of appropriate controls and adjustment for environmental/clinical factors.

Stringency of Disease Definition
Although most conventional genetic studies define OA radiographically as a Kellgren-Lawrence (KL) grade of 2 or higher (Table 1) [3][4][5][6][7], KL grading is limited in reproducibility and sensitivity because it depends on the subjective judgment of observers and classifies subjects into only five categorical grades [8]. In the ROAD (Research on Osteoarthritis against Disability) study, a high-quality population-based cohort with detailed environmental and genetic information on more than 3,000 participants [9], we exclude the intermediate and ambiguous KL=2 subgroup from the case-control analysis to increase the detection power. For example, our association analysis of the EPAS1 gene, which was identified as crucial for OA development in mice, detected a significant difference in the minor allele frequency (mAF) of a SNP in the gene between KL=3 & 4 (case; mAF=11.1%) and KL=0 & 1 (control; mAF=15.2%) [10]. The mAF of the omitted KL=2 subgroup was 12.3%, consistent with an inverse relationship between the mAF of the SNP and the KL grade. This indicates that including the KL=2 subjects in the case group would have decreased the detection power. In fact, the association was not reproduced by conventional Japanese and Chinese studies that included KL=2 in the case group [11]. However, because the prevalence of the KL=2 subgroup is fairly high in representative epidemiologic studies (17.3-41.3%; difference between KL ≥ 2 and KL ≥ 3 in Table 2), removing this subgroup inevitably reduces the total sample size.
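The dilution effect described above can be illustrated with a minimal Python sketch. The allele frequencies are those quoted in the text; the share of KL=2 subjects among the pooled cases (`kl2_share`) is a hypothetical value chosen only for illustration, not a figure from the ROAD study, and the function names are our own.

```python
def allelic_odds_ratio(maf_case, maf_control):
    """Odds of carrying the minor allele in cases relative to controls."""
    return (maf_case / (1 - maf_case)) / (maf_control / (1 - maf_control))


def pooled_case_maf(maf_kl34, maf_kl2, kl2_share):
    """mAF of a case group in which a fraction kl2_share of subjects is KL=2."""
    return (1 - kl2_share) * maf_kl34 + kl2_share * maf_kl2


# Frequencies quoted in the text for the EPAS1 SNP.
MAF_KL34 = 0.111     # KL=3 & 4 (strict case definition)
MAF_KL2 = 0.123      # KL=2 (omitted subgroup)
MAF_CONTROL = 0.152  # KL=0 & 1

# Strict definition: only KL=3 & 4 count as cases.
strict_or = allelic_odds_ratio(MAF_KL34, MAF_CONTROL)

# Conventional definition: KL>=2 counts as a case.
# ASSUMPTION: half of the pooled cases are KL=2 (hypothetical share).
diluted_maf = pooled_case_maf(MAF_KL34, MAF_KL2, kl2_share=0.5)
diluted_or = allelic_odds_ratio(diluted_maf, MAF_CONTROL)

print(f"strict OR:  {strict_or:.3f}")
print(f"diluted OR: {diluted_or:.3f}")
```

Under this assumed share, the odds ratio moves from about 0.70 toward the null value of 1 (about 0.74), which is the loss of contrast the text attributes to including KL=2 subjects among the cases.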
More generally, the lack of an objective and quantitative measure for defining the disease remains a critical limitation of clinical OA studies. The ROAD study has recently established a fully automatic program, KOACAD (knee OA computer-aided diagnosis), to quantify the major OA parameters (joint space, osteophyte, etc.) on plain radiographs [8]. We believe that the KOACAD system, as well as magnetic resonance imaging systems [12], will serve as optimal measures for the definition of OA in the near future, just as bone mineral density does in osteoporosis.

Selection of Appropriate Controls
In genetic studies on common diseases with a high prevalence, selection of disease-free controls is essential to avoid the potential bias caused by contamination of the control group with affected subjects. In representative epidemiologic studies worldwide, the prevalence of radiographic knee OA (KL ≥ 2) in the elderly was ≥ 30% in all populations and >60% in Asian populations such as Japan (ROAD study) and China (Shanghai) (Table 2) [13]. Furthermore, the prevalence of asymptomatic knee OA was 24-36% in all populations. Hence, if so-called healthy subjects without knee symptoms were collected as controls, a considerable number of OA subjects would be included in the control group. Even in a series of genetic studies in Japan, where OA prevalence is high [13], the control subjects are miscellaneous mixtures of various populations, including considerable numbers of so-called healthy subjects and patients with other diseases who have no radiographic diagnosis (Table 1) [3,5,7], indicating that a substantial percentage of the control groups are affected subjects. A recent analysis of how controls selected with different levels of stringency affect the association of known knee OA susceptibility genes demonstrates that poorly selected or unselected controls cannot be compensated for by increasing the sample size [14]. Hence, selection of appropriate controls confirmed to be disease-free may be crucial to achieving a high detection power.
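How contamination of the control group shrinks the observable allele-frequency contrast can be sketched as follows. This is an illustration under stated assumptions, not an analysis from any cited study: the allele frequencies are borrowed from the EPAS1 example purely as placeholder values, affected subjects hidden among the controls are assumed to carry the same allele frequency as the cases, and the contamination fractions 0.24 and 0.36 simply reuse the range of asymptomatic knee OA prevalence quoted above.

```python
def contaminated_control_maf(maf_disease_free, maf_affected, contamination):
    """mAF observed in a control group of which a fraction is actually affected."""
    return (1 - contamination) * maf_disease_free + contamination * maf_affected


def observed_difference(maf_case, maf_disease_free, contamination):
    """Control-minus-case mAF difference seen with a contaminated control group.

    ASSUMPTION: affected subjects hidden among the controls have the same
    mAF as the case group.
    """
    observed_control = contaminated_control_maf(
        maf_disease_free, maf_case, contamination
    )
    return observed_control - maf_case


# Placeholder frequencies (illustrative only).
MAF_CASE = 0.111
MAF_DISEASE_FREE = 0.152

# Contamination fractions: none, and the 24-36% range quoted in the text.
differences = {
    c: observed_difference(MAF_CASE, MAF_DISEASE_FREE, c)
    for c in (0.0, 0.24, 0.36)
}

for contamination, diff in differences.items():
    print(f"contamination {contamination:.0%}: observed mAF difference {diff:.4f}")
```

Under these assumptions the observed difference is the true difference multiplied by one minus the contamination fraction, so a small true effect becomes smaller still before any statistical test is applied.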

Adjustment for Confounding Environmental/Clinical Factors
Lastly, we should again note that OA is a multifactorial disease with environmental and genetic backgrounds and that the genetic contribution is less than half in knee OA [1]. Takahashi et al. constructed knee OA prediction models based on genotype (a combination of three risk alleles of ASPN, GDF5 and DVWA) and on environmental/clinical information (age, gender and BMI), and evaluated their predictive power by the area under the curve (AUC; range, 0.5 [worst] to 1 [best]) of a receiver operating characteristic (ROC) curve in a case-control association study [15]. The predictive power of the genotype information was very small (AUC=0.554), implying that the three well-known genotypes are of little use as a prognostic marker. In contrast, the environmental/clinical information was a much better predictor (AUC=0.678), and it was barely improved by adding the genotype information (AUC=0.685), again confirming how little the genotypes contribute. Hence, to achieve a high detection power for susceptibility genes, every effort should be made to exclude the influence of environmental/clinical factors. Surprisingly, however, there are large differences in age and gender between the case and control groups in previous representative studies (Table 1). Even the age difference alone, about 20 years between the case and control groups in the Japanese studies [3,5,7], is calculated to raise the odds ratio for OA to 2.65 (= 1.05^20), according to the authors' own estimate (1.05 per year) [15]. We are not opposed to the recent activities of OA consortiums in pooling subjects worldwide; however, we should note that the pooled subjects are miscellaneous mixtures of various populations with different backgrounds. Selecting case and control subjects with similar backgrounds is essential to minimize selection bias, which strongly influences the results of genetic studies in which the risk alleles have small effect sizes.
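The age calculation above follows from compounding a per-year odds ratio over the age gap. A minimal Python sketch, assuming the per-year odds ratio of 1.05 quoted in the text applies multiplicatively and uniformly across the whole age range (the function name is our own):

```python
def cumulative_odds_ratio(per_year_or, years):
    """Odds ratio accumulated over an age gap, assuming a constant
    multiplicative effect per year."""
    return per_year_or ** years


PER_YEAR_OR = 1.05  # per-year estimate quoted in the text [15]

# Age gaps between case and control groups; 20 years is the gap cited in the text.
for gap in (5, 10, 20):
    print(f"{gap:2d}-year gap: OR = {cumulative_odds_ratio(PER_YEAR_OR, gap):.2f}")

age_or_20 = cumulative_odds_ratio(PER_YEAR_OR, 20)
```

A 20-year gap therefore corresponds to an odds ratio of roughly 2.65 from age alone, before any genetic effect is considered.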
Hence, at least for the initial screening, case and control groups should be selected from a single population-based cohort to adjust the living environment and    The reproducibility may then be examined in other replication cohorts of the worldwide consortiums, after adjustment for the specific confounding factors in the respective cohorts.
Taken together, conventional OA genetic studies appear to compare a case group containing a substantial number of ambiguously defined subjects with a control group containing a substantial number of affected subjects, and to do so without adjustment for confounding environmental/clinical factors. In contrast to the genetic studies, clinical trials and observational epidemiologic studies are performed with sound scientific rigor, in compliance with very strict rules, to estimate accurately the effect sizes of interventions and of environmental/clinical factors, respectively. Introducing strict regulation in the genetic field, like the CONSORT guidelines in the clinical trial field [16], might improve its scientific rigor.