Genome-wide association study of liver enzyme elevation in an extended cohort of rheumatoid arthritis patients starting low-dose methotrexate

Aim: A follow-up genome-wide association study (GWAS) in an extended cohort of rheumatoid arthritis (RA) patients starting low-dose methotrexate (MTX) treatment was performed to identify further genetic variants associated with alanine aminotransferase (ALT) elevation. Patients & methods: A GWAS was performed on 346 RA patients. Two outcomes within the first 6 months of MTX treatment were assessed: ALT > 1.5-times the upper level of normal (ULN) and maximum level of ALT. Results: SPATA9 (rs72783407) was significantly associated with maximum level of ALT (p = 2.58 × 10 -8 ) and PLCG2 (rs60427389) was tentatively associated with ALT > 1.5 × ULN. Conclusion: Associations with SNPs in genes related to male fertility ( SPATA9 ) and inflammatory processes ( PLCG2 ) were identified.

Elevation of liver transaminases is a common side effect of treatment with low-dose methotrexate (MTX), which is frequently used to treat rheumatoid arthritis (RA) [1].Long-term therapy with MTX has been associated with the development of hepatic damage leading to fibrosis and possibly cirrhosis.Prevention of MTX-induced hepatotoxicity relies heavily on long-term follow-up of liver function tests [2].When low-dose MTX treatment is initiated in RA patients, it is recommended to monitor the liver enzyme alanine aminotransferase (ALT) every 2-4 weeks for the first 3 months of therapy, every 8-12 weeks after 3 months of therapy and every 12 weeks beyond 6 months of therapy [3].Risk factors for MTX-induced hepatotoxicity include excessive alcohol use, obesity, diabetes and concurrent administration of other potentially hepatotoxic drugs [4].Pretreatment ALT elevation, female sex, statin treatment and increasing BMI were previously identified as predictors of ALT elevation during MTX therapy [5].Genetic risk factors may also play a role.Studies have identified several single-nucleotide polymorphisms (SNPs) potentially affecting both the efficacy and toxicity of MTX by modulating the expression of key enzymes (e.g., MTHFR [6], FPGS [7], DHFR [8], ATIC [9] and SHMT [10]) in the folate biosynthesis pathway.
Identifying more genetic risk factors for MTX-induced liver toxicity would aid the development of personalized therapy for RA.The current authors previously performed a genome-wide association study (GWAS) in a cohort of 198 RA patients starting MTX treatment that revealed novel associations between ALT elevation and SNPs that could regulate the expression of RAVER2 and JAK1.This was strengthened in a meta-analysis with a replication cohort of 160 patients who underwent direct genotyping of the SNPs of interest [11].In this follow-up study, genomewide genotyping of the patients included in the replication cohort was performed and the results of a joint analysis of the discovery and replication cohorts are presented.Other potential associations between genetic variations and an increase in ALT during low-dose MTX therapy in this larger cohort were also assessed.

Cohort description
Patient recruitment and study methodology have been described in detail elsewhere [11].Briefly, a total of 198 patients starting low-dose MTX therapy for RA between 1 January 2005 and 30 April 2013, at the Rheumatology Department at Uppsala University Hospital, Sweden, were recruited and followed for a minimum of 6 months.An identical group of 160 patients starting therapy between 1 May 2013, and 30 September 2017, at the Rheumatology Department, Uppsala University Hospital, Sweden, or between 1 January 2005 and 30 September 2017, at the Rheumatology Department, Sunderby Hospital (Lulea), Sweden, were also recruited.As in the previous study, the primary outcome was an elevation of ALT >1.5-times the upper limit of normal (ULN) within the first 6 months of treatment [11], and patients were divided into cases and controls.The secondary outcome was the maximum value of ALT in all patients within the first 6 months of treatment.

Genome-wide array data & analyses
The original GWAS cohort was genotyped with the Illumina Infinium OmniExpressExome 1 M Array (n = 198 individuals) and the second cohort (n = 160 individuals) with the Illumina Infinium Global screening array 24v3, which is optimized for imputation and has a lower cost.Genotype calls were generated with the GenomeStudio software (Illumina, CA, USA) and the Genome Reference Consortium human assembly GRCh37.Preimputation quality control included gender checks, exclusion of variants with Hardy-Weinberg equilibrium p-values <5 × 10 -8 or minor allele frequency (MAF) <0.5% and exclusion of individuals with >2% missing variants.Imputation was performed separately for each Illumina batch using the Sanger imputation server.The haplotype reference consortium (HRC r1) panel [12] was used as a reference for the pipeline with Eagle2 v2.3.3 [13] prephasing and positional Burrows-Wheeler transform (PBWT) imputation [14].Postimputation, the datasets were filtered for quality (impute2 imputation quality >0.7) using bcftools and converted to hard calls using PLINK v1.9 [15].The resulting two PLINK datasets were finally merged.In this study, variants available from both Illumina batches with good imputation were used and an overall MAF cutoff of 0.01 was applied, yielding a total of 7,489,589 variants for the analysis.Due to an increasing number of spurious results at low allele frequencies, for the secondary continuous outcome (max ALT within 6 months) a MAF cutoff of 0.03 was applied, resulting in a total of 6,074,601 variants for the analysis.PLINK v1.9 was used for logistic regression analysis on a genome-wide level for the primary outcome, and the analyses were adjusted for sex, age and the first four principal components.Analysis of the secondary outcome (max ALT within 6 months) was performed using linear regression implemented in PLINK v1.9.Prior to the analysis, max ALT was log 2 transformed due to the right-tail skewed distribution of the variable.The analysis was adjusted for age, sex, log 2 baseline ALT (ALT at the start of MTX treatment) and the first four genetic principal components.SNP effects were modeled as additive and statistical significance was set at the traditional genome-wide level of p <5 × 10 -8 [16].

Statistical analyses & power calculation
Descriptive data are expressed as mean ± standard deviation (SD), minimum (min), maximum (max) and frequency (%).For comparative analyses between cases and controls, Student's t-tests or Mann-Whitney U-tests were used for continuous variables and χ 2 tests or Fisher's exact tests were used for categorical variables.Statistical power was estimated using the R package GeneticsDesign (function GPC) using the observed prevalence of 33/346 for ALT >1.5 × ULN in the cohort.For a marker with a MAF of 0.2, there was approximately 80% power to detect an odds ratio of 6.2 at a genome-wide level, and for a marker with a MAF of 0.4, an odds ratio of 4.5 (Supplementary Figure 1).

Functional variant analysis
The top GWAS SNPs associated with ALT elevation were intersected with different databases to obtain functional annotations, including the candidate cis-regulatory elements (cCREs) registry from the ENCODE project [17], a cell and tissue-specific collection of representative DNase hypersensitivity sites (rDHSs) supported by either histone modifications (H3K4me3 and H3K27ac) or CTCF-binding data; HaploReg, a tool for exploring annotations of noncoding variants on haplotype blocks [18]; the Tomtom motif comparison tool [19] and the Transcription factor Affinity Prediction (sTRAP) tool [20] to identify candidate transcription factors whose binding could be affected by the studied variants; and gene expression levels from the Genotype-Tissue Expression (GTEx) project [21] to investigate the tissue-specific expression of the associated genes.

Cohort characteristics
A total of 346 patients of mainly Swedish ancestry with successful genotyping data were available in the combined discovery and replication dataset, their characteristics are shown in Table 1.Note that 12 of 160 patients from the replication cohort in the original study had DNA sample yields that were too low for genome-wide analyses and were excluded from the current analysis.
The study of the primary outcome was performed on 33 patients defined as cases due to at least one elevation of ALT >1.5 × ULN within the first 6 months of MTX treatment with the remaining 313 patients defined as controls (Table 1).The proportion of individuals with a previous history of ALT elevation before MTX treatment was significantly higher among cases compared with controls (42.4% vs 16.6%; p = 0.0018).The ethnicity of the patients did not differ significantly between cases and controls (p = 0.27; Table 1) with both parents of 79% of the cases and 84% of the controls born in Sweden.
The secondary outcome, the maximum level of ALT within the first 6 months of treatment, was investigated in a total of 341 patients (5 of 346 patients with normal ALT were excluded because exact ALT values were not recorded, although not >1.5 × ULN).All patients were treated with MTX 7.5-25 mg once a week.According to Swedish guidelines, the initial dose of 7.5 mg per week is gradually increased during 1-3 months to a maximum of 25 mg per week based on individual decisions by the treating physician.The patient is supplemented with at least 5 mg of folic acid per week.MTX dose was not associated with ALT elevation in this cohort [5].

Genome-wide association analysis
As previously observed in the original discovery cohort (n = 198), no statistically significant genome-wide association was found when comparing patients with ALT >1.5 × ULN within 6 months with controls in the extended cohort (n = 346).Tentative associations (p < 1 × 10 -5 ) were observed for SNPs close to PLCG2, KNOP1, HSD17B13 and PRKCB (Figure 1A & Supplementary Table 1).
The maximum level of ALT within 6 months was significantly associated at a genome-wide level (p <5 × 10 -8 ) with one SNP (rs72783407) located at the SPATA9 locus (Figure 1B & Supplementary Table 2).This SNP had a beta-value of 0.60 (95% CI: 0.40-0.81)per increase of one minor allele, meaning that the geometric mean of ALT increased 1.51-times per minor allele of rs72675408.An ALT of 1.00 would thus be predicted to increase to 1.51 in patients with one variant allele and to 2.31 in patients with two variant alleles.The possible interaction between rs72783407 and gender was investigated, however, the interaction was not significant (p = 0.0982; Supplementary Figure 2).We also observed a near-significant association with a cluster of five SNPs (rs3920617, rs55889764, rs72675408, rs72675414 and rs72675415) at the RAVER2 locus.A statistically significant association with rs72675408 was previously demonstrated when using data from the combined cohort (n = 354) [11].

Genomic landscape of the top genome-wide association study SNPs
The variant rs60427389 located in the 13th intron of PLCG2 was the SNP closest to statistical significance in association with the first outcome, ALT >1.5 × ULN within 6 months.PLCG2 is expressed in several tissues including lung, skin, lymphoid tissues, blood and, in a smaller measure, the liver.The genomic region harboring rs60427389 is annotated as quiescent in blood and shows DNase signals and H3K4me1 and H3K27ac enhancer signatures in liver tissue [18].The SNP is located <150 base pairs upstream of an ENCODE annotated cCRE (EH38E3193997) defined by DNase and CTCF signals [17].Motif analysis showed how rs60427389 could alter the binding of several transcription factors, among others: ZNF135, ZNF524, NR2F2, NR2F1, NR2F6, RARG, RARA and MAF [18,19].
The variant rs72783407, located in the second intron of SPATA9, was significantly associated at a genome-wide level with the second outcome, the maximum level of ALT within 6 months.rs72783407 is an eQTL for SPATA9 in testis [21] and is located in a genomic region annotated as quiescent in blood and liver tissue [18] where it is expressed to a much lower degree.It is predicted to alter the binding of different transcription factors: CEBPD, E2F-family, PLZF and PAX4.The transcription factor Affinity Prediction tool revealed a preferential binding of transcription factors on the reference T allele (E2F, PLZF) with respect to the alternative A allele (CEBPD and PAX4).

Discussion
In this expanded GWAS aiming to identify associations between genetic variations and early ALT elevation in RA patients starting MTX treatment, no genome-wide signals were identified when comparing RA patients with or without ALT >1.5 × ULN within 6 months of the start of MTX therapy.The increased cohort size thus did not lead to a significant association, with the primary outcome mirroring the results of the original GWAS performed in the 198-patient cohort.A genome-wide near-significant association was observed for rs60427389, located in an intron of PLCG2 that encodes for the transmembrane signaling enzyme PLCG2.Gain of function mutations in PLCG2 have been associated with severe spontaneous inflammation and autoimmunity [22].PLCG2 has also emerged as a candidate gene to control the inflammation in RA and, in a meta-analytic study, it was enriched in several signaling pathways related to inflammation, antigen presentation, hypoxia and apoptosis [23].In addition, overexpression of PLCG2 in rat hepatocytes has been shown to inhibit hepatocyte proliferation and increase apoptosis [24].
When examining genetic associations with the secondary outcome, a single SNP, rs72783407, was associated with the max value of ALT within 6 months of the start of low-dose MTX therapy.rs72783407 is located in an intron of SPATA9 (also known as NYD-SP16), a gene that encodes for SPATA9, predicted to be involved in cell differentiation and spermatogenesis [25].This membrane protein is involved in capacitation and acrosome reaction and has been investigated as a potential biomarker for male infertility.Patients with Sertoli cell-only syndrome or in maturation arrest at an early stage of spermatogenesis show no expression of SPATA9 in the testis [26].The observed association linked to SPATA9 was unexpected considering the nature of the study cohort, composed of approximately 70% female patients.A trend was observed in which the association of rs72783407 with the outcome was more pronounced in males, although not statistically significant (Supplementary Figure 2), a larger (male) cohort would be required to confirm this finding.The possibility that the association is a chance finding cannot be ruled out and should thus be cautiously interpreted.
A limitation of the study cohort is that both cases and controls have an overrepresentation of female patients (Table 1).However, as for many autoimmune diseases, RA displays an imbalance between the sexes, with females representing the majority of cases [27,28].The associations reported in this study were adjusted by sex to avoid bias.Another limitation is that only around 10% experienced at least one elevation of ALT >1.5 × ULN within the first 6 months of MTX treatment.A significant difference was also observed between cases and controls regarding the history of ALT elevation before MTX treatment, which could lead to biased associations.We previously showed that a history of ALT elevation is a predictor of ALT elevation during MTX therapy.Correction for pretreatment ALT elevations was not employed, since it is unknown whether a genetic contribution to a predisposition for ALT elevation exists.In doing so, a genetic association with ALT elevation regardless of treatment may have been lost.Further, the patients were exposed to other potential causes of liver toxicity (e.g., alcohol use, obesity and diabetes), although no significant differences were seen in this respect between cases and controls (Table 1).
Cui et al. [29] recently reported two new associations at loci for RAP1GAP2 and ULK2.These two SNPs were not reported in the current study due to low imputation quality (rs28661593) and the rarity of the variant, present in only 3 donors (rs76993247).Overall, the results of this study, of our current and previous GWAS [11] and of a meta-analysis [30] have not identified shared common variants associated with liver function, but rather point toward several different loci that in the future might be used to define a polygenic risk score for MTX-induced liver toxicity.

Conclusion
No single genetic variant strongly associated with liver enzyme elevation was identified in this extended cohort of patients starting low-dose MTX.The overall results of the presented GWAS highlighted two genomic loci possibly associated with elevation of ALT levels during the first 6 months of treatment.SNPs at these loci were tentatively associated with genes whose expression could affect biological processes related to male fertility (SPATA9) and inflammatory processes (PLCG2).Further studies are required to validate these findings.

Summary points
• Low-dose methotrexate (MTX) represents the first-line therapy for rheumatoid arthritis (RA).
• A common side effect of MTX treatment is liver toxicity with elevations in alanine aminotransferase (ALT).
• A genome-wide association study was performed in an extended cohort of RA patients starting low-dose MTX treatment to pinpoint new genetic risk factors associated with ALT elevation.• A variant located in an intron of SPATA9 was significantly associated with maximum ALT level within 6 months of starting MTX treatment.• A variant located in an intron of PLCG2 was tentatively associated with ALT >1.5 × the upper limit of normal within 6 months of starting MTX treatment.

Figure 1 .
Figure 1.Manhattan plots for the primary and the secondary outcomes.(A) Primary outcome.(B) Secondary outcome.ALT is log 2 transformed and adjusted by log 2 (baseline ALT), age, sex and genetic principal components 1-4.The red line denotes the genome-wide significance level p <5 × 10 -8 .ALT: Alanine aminotransferase; ULN: Upper level of normal.

Table 1 .
Characteristics of patients with rheumatoid arthritis initiating treatment with methotrexate.
† Patients with ALT Ͼ1.5 × ULN within 6 months from start of MTX treatment.‡ Patients with ALT Ͻ1.5 × ULN within 6 months from start of MTX treatment.§ Missing data for 1 patient.¶ Missing data for 11 patients.# Missing data for 4 patients.† † Missing data for 7 patients.‡ ‡ Missing data for 5 patients.§ § Missing data for 2 patients.ACPA: Anticitrullinated protein antibodies; ALT: Alanine aminotransferase; MTX: Methotrexate; RA: Rheumatoid arthritis; SD: Standard deviation; ULN: Upper limit of normal.