Genome-wide association scan identifies candidate polymorphisms associated with differential response to anti-TNF treatment in rheumatoid arthritis.

The prediction of response (or non-response) to anti-TNF treatment for rheumatoid arthritis (RA) is a pressing clinical problem. We conducted a genome-wide association study using the Illumina HapMap300 SNP chip on 89 RA patients prospectively followed after beginning anti-TNF therapy as part of Autoimmune Biomarkers Collaborative Network (ABCoN [Autoimmune Bio-markers Collaborative Network]) patient cohort. Response to therapy was determined by the change in Disease Activity Score (DAS28) observed after 14 wks. We used a two-part analysis that treated the change in DAS28 as a continuous trait and then incorporated it into a dichotomous trait of "good responder" and "nonresponder" by European League Against Rheumatism (EULAR) criteria. We corrected for multiple tests by permutation, and adjusted for potential population stratification using EIGENSTRAT. Multiple single nucleotide polymorphism (SNP) markers showed significant associations near or within loci including: the v-maf musculoaponeurotic fibrosarcoma oncogene homolog B (MAFB) gene on chromosome 20; the type I interferon gene IFNk on chromosome 9; and in a locus on chromosome 7 that includes the paraoxonase I (PON1) gene. An SNP in the IL10 promoter (rs1800896) that was previously reported as associated with anti-TNF response was weakly associated with response in this cohort. Replications of these results in independent and larger data sets clearly are required. We provide a reference list of candidate SNPs (P < 0.01) that can be investigated in future pharmacogenomic studies.


INTRODUCTION
Tumor necrosis factor-α (TNF-α) is a key regulator of the inflammatory cascade in rheumatoid arthritis (RA) and several other inflammatory diseases (1)(2)(3). To date, three TNF antagonists, infliximab (Remicade), etanercept (Enbrel), and adalimumab (Humira), have been approved by the FDA to treat RA and other inflammatory diseases. The molecular mechanisms for these three TNF inhibitors are similar: they block the binding of TNF-α to its cell-surface receptors and limit subsequent cell signaling pathways that are induced or regulated by TNF-α. Etanercept is a dimeric TNF receptor-IgG fusion protein and mimics the inhibition effects of soluble TNF re-ceptor by binding to TNF-α. Infliximab is a chimeric mouse-human antibody, while adalimumab is a fully humanized antibody.
Although the therapeutic utility of TNF-α antagonism is well established, patients display substantial heterogeneity in their response to anti-TNF therapies, and the efficacy of any anti-TNF agent in a given patient is unpredictable. Approximately one-third of patients have minimal or no response to these agents (4). A genetic influence has been suggested based on candidate gene studies (5,6), but no comprehensive analysis of this issue has been reported. In a recent review, Coenen et al. (7) summarized 17 pharmacogenetic studies of anti-TNF treatment that were conducted after 2001. All of the 17 studies focused on polymorphisms in genes known to be involved in RA pathogenesis, genes encoding TNF-α receptors, or genes implicated in TNF-α metabolism. Several groups reported that a single nucleotide polymorphism (SNP), -308G > A, in the promoter of TNFA is associated significantly with the outcome to anti-TNF treatment (5,(8)(9)(10)(11)(12)(13)(14). This positive association also is supported by a meta-analysis that was performed using 311 patients combined from several studies (13).
To identify biomarkers influencing response to anti-TNF therapy, the Autoimmune Biomarkers Collaborative Network (ABCoN) has prospectively enrolled a cohort of RA patients beginning anti-TNF treatment. Using these patient samples, we took an unbiased genome-wide approach to finding common genetic variations that could be responsible for individual differences in response to the three anti-TNF agents. We report results for SNPs with lowest P values from the GWA study, and provide details for selected candidate genes.

Patients
The Autoimmune Biomarkers Collaborative Network (ABCoN) was established to explore the use of new technologies for biomarker discovery in both RA and Systemic Lupus (SLE). The ABCoN RA cohort includes 116 active RA patients followed prospectively to evaluate efficacy of the three available anti-TNF agents. To examine the response to anti-TNF therapy in RA, blood samples, laboratory data, and clinical data were collected at baseline (prior to anti-TNF therapy), 6 wks, 3 months, 6 months, and 1 year post treatment. DNA, RNA, peripheral blood cells, plasma, serum, and urine were obtained at the time of each study visit. Enrollment criteria included having a minimum of six swollen joints at enrollment, and no previous exposure to anti-TNF agents during the 6 months prior to en-rollment in the study. We did not enroll patients taking more than 10 mg of oral steroid therapy per d at the time of enrollment. All the patients provided written informed consent. The study protocols were approved by local ethics committees.

Efficacy Measurements
Disease severity was evaluated using the DAS28 score, which is the Disease Activity Score that includes 28-joint counts and C-reactive protein (15). DAS28 was measured at three time points: baseline, 6 wks, and 14 wks. Two scales were considered to evaluate efficacy of anti-TNF treatment. First, a relative improvement in disease activity was calculated for each patient using the DAS28 scores at baseline and at wk 14: .
RelDAS28 has a continuous scale and is approximately normal. Second, according to the EULAR definition published elsewhere (16), patients are classified as good, moderate, or non-responders, using the individual amount of change in the DAS28 (ΔDAS28) and DAS28 values at 14 wks (16). Briefly, a good responder is classified if ΔDAS28 ≥ 1.2 and DAS28 at 14 wks ≤ 3.2; a moderate responders are patients with (ΔDAS28 ≥ 1.2 and DAS28 at 14 wks > 3.2) or (0.6 < ΔDAS28 ≤ 1.2 and DAS28 at 14 wks ≤ 5.1). Patients are classified as non-responders if they do not fall into any of these categories (16).

Genotyping and Quality Control
The patients' genomic DNA was extracted from peripheral blood using standard protocols. We genotyped 317,000 Single Nucleotide Polymorphisms (SNPs) on 102 anti-TNF-treated patients using an Illumina Beadstation and Illumina HAP300 chips according to the Illumina Infinium 2 assay manual (Illumina, San Diego, CA, USA), as described previously (17). The HAP300 chip includes, on average, one SNP every 10 kilobases across all the autosomes, and interrogates approximately 87% of the common genetic variation in populations of European descent (18).
To ensure SNP marker quality and reduce the possibility of false associations, quality control procedures were performed on each of the 317k SNPs before association tests were carried out. The SNP set was filtered on the basis of genotype call rates (≥ 95%), and minor allele frequency (MAF ≥ 0.05). Hardy-Weinberg equilibrium (HWE) was calculated for individual SNPs using an exact test. All of the SNPs reported in this manuscript have HWE P values > 0.001. After filtering, 283,348 polymorphic SNPs were analyzed on chromosome 1 through chromosome 22. The average call rate for the filtered SNPs was 99.5%. We removed patients if their percentage of missing genotypes was more than 5% or if there was evidence of possible contamination in their DNA sample.

Statistical Analysis
We evaluated associations between SNP markers and response to the anti-TNF therapies in two stages. In the first stage of our analysis, we used relDAS28 as a continuous dependent variable to evaluate the associations. Because the sample size is relatively small in our study, a continuous scale of the dependent variable has more power than a categorized variable. Linear regression was carried out to evaluate association between individual SNP markers and response to anti-TNF therapies (that is, rel-DAS28) in the context of additive genetic effect model. A t statistic was derived from regression and used to evaluate association between individual SNP markers and response to anti-TNF therapies (that is, relDAS28). The t statistic is robust, and, thus, some departure from normality for the dependent variable is acceptable. Because more than 283,348 tests were involved in this study, a permutation test was carried out to account for multiple testing on each chromosome. The permutation test also can address the slight deviation from normality in the dependent variable. To obtain adjusted P values, the phenotypic values were shuffled randomly to break the relationship between phenotype and genotype. The entire analysis was repeated on the shuffled data; therefore, the shuffled data is representative of the null hypothesis. The 1,000 smallest P values were obtained from each of the N = 1,000 permutation iterations for the whole set of SNPs on individual chromosomes.
In the second stage analysis, we use a categorized dependent variable-response status to the anti-TNF therapy (that is, non-responders versus good responders) to evaluate association with SNPs selected from the first stage. The probability of being a non-responder was modeled using logistic regression with additive genetic effect model. Unless specified, all calculations for statistical analysis were carried out using the R software package (Version 2.2.1).
Population admixture (that is, sampling of subjects from two or more subpopulations) has been recognized as a major cause for inconsistent results and spurious associations for genetic studies (19,20). U.S. populations are genetically admixed. Although self-reported ethnicity data was recorded in this study, it is incomplete and may be inaccurate. To accurately classify individuals according to ancestry and to remove any possible related individuals, we calculated pairwise identity-by-state (IBS) distance for the 102 subjects and performed subsequent complete linkage agglomerative clustering and multidimensional scaling using genome-wide SNP markers in Plink software (version 1.00, http:// pngu.mgh.harvard.edu/~purcell/plink/ index.shtml). Clustering data were plotted to identify major population subdivisions. In addition to removing outliers from the dataset, we further evaluated potential effect from subpopulations (for example, northern and southern Europeans) by the EIGENSTRAT program with genome-wide SNP data (21). The top ten principal components (PCs) were obtained. Correlation analysis between the top PCs and the phenotypes (delDAS28 and dichotomous response status to anti-TNF) was performed to detect if the phenotypic difference among individuals was due to population stratification. If the spread of samples in these principal components was due purely to population stratification, it can be removed by forcing all samples to have zero value in these principal components. Then a "virtual" genotype can be obtained by rotating the corrected principal components back to the original genotype space. Pearson′s chi-square test was performed for association between selected SNPs and response status to the anti-TNF therapy (that is, nonresponders versus good responders).
All supplementary materials are available online at molmed.org.

RESULTS
Among the 102 active RA patients with complete genotypic and phenotypic data, self-reported ancestry included 83 Europeans, four Asians, three African Americans, as well as three subjects with reported Latino ethnicity. Nine patients have missing information for their ethnicity. Linkage agglomerative clustering and multidimensional scaling identified 89 patients with primarily European ancestry (Supplementary Figure 1), among them 83 were self-reported to be white and six had missing ethnicity data. These 89 patients were used for subsequent association analysis.
The baseline characteristics of the 89 patients are summarized in Table 1. At the time of diagnosis, their ages in years were 47 ± 14 (mean ± S.D), their disease duration was 8 ± 9 years, 75% were women, and 15% were current smokers. The average serum CRP level (mg/dl) at baseline was 1.7 (standard deviation (SD = 2.0). On average, the DAS28 at the baseline (before anti-TNF therapy) was 5.22 (SD = 0.80), indicating that the RA disease activity was high for most of the patients (16). Fifty-four subjects were treated with etanercept, 37 with infliximab, and 25 with adalimumab. These patients had a mean disease duration of 8 years; 46% had been treated previously with other DMARDs, and 7 patients had been treated previously with TNF inhibitors.

Genome-Wide Association Studies: First Stage Analysis
The GWA analyses were performed to test for association between 283,348 polymorphic SNPs and the relative change in DAS28 (relDAS28) using an additive genetic model. We did not adjust the principal components in our regression model further, because they did not correlate significantly with relDAS28 (that is, the major phenotypic difference among individuals was not due to population stratification) (Supplementary Table 1). The  chromosomal distribution of P values for the genome-wide association is shown in Figure 1. To address multiple testing issues, a chromosome-wide permutation test was performed. Sixteen SNPs remain significant (permutation exact P ≤ 0.05) or borderline significant (0.05 < permutation exact P ≤ 0.1) after permutation test to obtain exact significance levels. The P values, along with annotation information for these 16 SNPs, are shown in Table 2. Among these sixteen SNPs, four are located within genes, one is in the 3′UTR, and eleven are in the flanking regions of genes. All these SNPs are common polymorphisms (minor allele frequency > 0.1). In addition to Table 2, we also provide data for 2985 SNPs with relDAS28 P values ≤ 0.01 in Supplementary Table 1.
We used the Illumina annotation file "HumanHap317K_geneannotation.txt" together with the UCSC Genome Browser to annotate SNP details (22). Among these leading associations, the rs6028945 marker ~500 kb 3′ of the MAFB locus on chromosome 20 is marginally the strongest association (P = 0.003 corrected) when the relative change in DAS28 is considered as a continuous variable; a second marker in the region of MAFB, rs6071980 also shows evidence of association (P = 0.05, corrected). Likewise, multimarker evidence of association is seen with markers in the Paraoxinase 1 gene (PON1) as well as in a region of chromosome 9 that contains the interferon kappa (IFN-κ), MOBKL2B, and C9orf72 loci. Other loci showing some evidence of association include: a guanylate nucleotide binding protein (GBP6) at 1p22.2; LAG1 (longevity assurance homolog 6, LASS6) at chr2; cystatin D (CST5) at 20p11.21; centaurin, delta 1 (CENTD1) at 4p14; and quaking homolog and KH domain RNA binding (QK1) at 6q26-q27 (Table 2).

Association between SNPs and Response to Anti-TNF Therapy: Second Stage Analysis
According to the criteria of EULAR (16), the 89 patients under study can be categorized into three groups: 23 non-responders, 31 good responders, and 35 moderate responders using the baseline DAS28 and the change in DAS28 at 14 wks. We compared allele frequencies between non-responders and good responders using the Fisher's exact test (23). The last two columns of Table 2 display Fisher's exact P values and Odds Ratios (ORs) with 95% confidence intervals (CIs) for the non-responder status to the anti-TNF therapies. Note that these 95% CIs are very wide, reflecting the small sample size (that is, 23 non-responders versus 31 good-responders) in this study. The major principal components did not appear to predict good responder versus nonresponder status (Supplementary Figure 2). Because the sample size is quite small (23 non-responders versus 31 responders), a minor effect due to subpopulation may have a large impact on the 2 × 2 table estimate. Therefore, we further adjusted subpopulation structure to give adjusted Chi-square P values using EIGENSTRAT. As shown in Table 2, these adjusted Chi-square P values are generally larger than the Fisher's exact test P values.

SNPs in Candidate Genes
Of the markers present on the Illumina HapMap 300 chip, four SNP associations with TNF response have been reported previously in candidate gene studies: rs1800896 in IL10 (8,24), rs419598 in interleukin 1 receptor antagonist (IL1RN) (14), rs1041981 in LTA (25)(26)(27), and rs4149570 in TNFRSF1A (26). These markers were evaluated separately in view of their increased prior probability. In the current study, only rs1800896 shows evidence of association, with a P value of 0.0132 (uncorrected) with the delDAS28 and 0.0183 (corrected for stratification only) with responding status to anti-TNF therapy.

DISCUSSION
In this report, we describe the first genome-wide association study to evaluate pharmacogenetic effects on the response to anti-TNF treatment for rheumatoid arthritis. Several SNPs show significant association with the change in DAS28 observed in these patients over a 14-wk period of treatment.
Two SNPs located approximately 500 kb downstream of the MAFB (v-maf musculoaponeurotic fibrosarcoma oncogene homolog B) locus show significant evidence of association after correction by permutation and after accounting for the possible effects of population stratification. This gene is a member of the Maf family of bZIP transcription factors that are involved generally in cellular differentiation (28). MAFB is a putative tumor suppressor in the myeloid lineage, with a key role in monopoiesis (29) as well as monocyte-dendritic cell differentiation (30). Little is known about the biology of this gene in the context of inflammatory disease.
Another interesting association with anti-TNF response was found in the paraoxonase (PON1) locus. Paraoxonase is an enzyme associated with high density lipoproteins that may play a role in inflammatory disease (31). Low serum levels of PON1 have been reported in inflammatory disorders (32,33), including rheumatoid arthritis (34), although, in the setting of autoimmunity, most attention as been on the anti-inflammatory effects of PON1 in risk for cardiovascular disease (35). Genetic regulation of Paraoxonase 1 levels has been documented (36,37), although not always by polymorphisms within PON1 (38). TNF-α, IL6, and PON1 levels appear to be correlated in the setting of RA (39), although it is certainly not clear why genetic variation in PON1 should influence response to TNF inhibition.
Five SNPs in a region of chromosome 9 across a 70-kb region also showed association with treatment response. Of the three genes in this region with high LD, IFNκ is the most compelling candidate for involvement in inflammation and/or response to anti-TNF treatment, since type I IFNs clearly play a role in inflammatory disease (40).
Only four specific SNPs previously tested for association with anti-TNF treatment were included on the 317 K chip. Of these, we found that the G allele of rs1800896 (flanking region of IL10) is associated with good response to anti-TNF therapies (OR = 2.7 [95% CI:1.2~6.7], Fisher's P = 0.0183), which is consistent with the previous result that the combination of GG at this locus was associated with good response to etanercept (8). One SNP reported to be associated with response to anti-TNF treatment is rs1800629, located in the promoter of the TNF-α gene (7). Neither this SNP nor perfect surrogates are on the Illumina 317 K array. Thus, our data cannot replicate this previous finding.
As with other previous reports on the pharmacogenetics of TNF response, this study is limited by a relatively small sample size. Therefore, instead of a dichotomous classification of patients into non-responders versus good-responders, we first utilized the continuous score of the relative change in DAS28 (that is, delDAS28) as the dependent variable in a regression model so that we were able to utilize all subjects, and, therefore, maximize the statistical power of the study. We also performed chromosome-wide permutation tests to address the issue of multiple testing. Bonferroni adjustments often are used in adjusting statistical significance for multiple tests. However, the Bonferroni test is highly conservative, particularly in the way it tests the overall null hypothesis, that is, all null hypotheses are true simultaneously (41). Further- In addition to choosing the analytical approach in this analysis carefully, we investigated sub-population structure in our study subjects thoroughly to reduce false positive association that might be caused by population admixture (19,20). We calculated IBS estimation and further performed agglomerative clustering analysis using the PLINK software to detect population admixture and strata. We selected 89 subjects with predominantly European ancestry and performed subsequent statistical analyses using only these subjects. Furthermore, we performed principal component analysis using the EIGENSTRAT software to detect possible minor phenotypic difference in individuals due to any subpopulation structure and provided adjusted Chi-square P values for selected SNPs in this report.
A further caution regarding these results relates to the fact that our population was rather diverse clinically, with some patients having quite longstanding disease and nearly 40% of the patients were CCP antibody negative; 7 of the 89 patients had previously been treated with a TNF inhibitor. Therefore, this population is clearly not representative of a new onset cohort, where information about TNF responsiveness might be most clinically useful. Clearly, replications of these results in independent and much larger data sets are required to confirm these findings. Given the fact that so many patients have been treated with these agents over the last decade, it is disappointing that these population resources are not readily available. Nevertheless, several European RA cohorts may be useful for such replication studies (7,42). In addition, we, and others, are actively engaged in an effort to recruit a United States of America cohort of up to 1,000 RA patients beginning treatment with TNF inhibitors. Samples sizes of this magnitude or larger will be required to provide robust replication of the current results.
A test predicting response or nonresponse to anti-TNF treatment would be an important tool in the hands of physicians. TNF inhibitors are currently a mainstay of biologic therapy in RA. However, only a fraction of patients show a clinically meaningful response to anti-TNF treatment after 3-6 months. Furthermore, many patients lose response over time. After patients fail their first TNF inhibitor, physicians are faced with a choice between trying a second TNF inhibitor or moving to a biologic therapy with a different mechanism of action (43). Additional biologic agents with different mechanisms of action are available for RA patients who have an inadequate response to TNF inhibitors and more potential therapies are in late phases of clinical development. The ability to determine whether patients are likely to be a non-responder to a TNF inhibitor would make these treatment decisions more rational. The current results suggest that this may be achievable.