Predicting drug-free remission in rheumatoid arthritis: A prospective interventional cohort study

Background: Many patients with rheumatoid arthritis (RA) achieve disease remission with modern treatment strategies. However, having achieved this state, there are no tests that predict when withdrawal of therapy will result in drug-free remission rather than flare. We aimed to identify predictors of drug-free remission in RA.
Methods: The Biomarkers of Remission in Rheumatoid Arthritis (BioRRA) Study was a unique, prospective, interventional cohort study of complete and abrupt cessation of conventional synthetic disease-modifying anti-rheumatic drugs (DMARDs). Patients with RA of at least 12 months duration and in clinical and ultrasound remission discontinued DMARDs and were monitored for six months. The primary outcome was time-to-flare, defined as disease activity score in 28 joints with C-reactive protein (DAS28-CRP) ≥ 2.4. Baseline clinical and ultrasound measures, circulating inflammatory biomarkers, and peripheral CD4+ T cell gene expression were assessed for their ability to predict time-to-flare and flare/remission status by Cox regression and receiver-operating characteristic (ROC) analysis respectively.
Results: 23/44 (52%) eligible patients experienced an arthritis flare after a median (IQR) of 48 (31.5–86.5) days following DMARD cessation. A composite score incorporating five baseline variables (three transcripts [FAM102B, ENSG00000228010, ENSG00000227070], one cytokine [interleukin-27], one clinical [Boolean remission]) differentiated future flare from drug-free remission with an area under the ROC curve of 0.96 (95% CI 0.91–1.00), sensitivity 0.91 (0.78–1.00) and specificity 0.95 (0.84–1.00).
Conclusion: We provide proof-of-concept evidence for predictors of drug-free remission in RA. If validated, these biomarkers could help to personalize immunosuppressant withdrawal: a therapy paradigm shift with ensuing patient and economic benefits.


Introduction
The past two decades have witnessed a remarkable revolution in rheumatoid arthritis (RA) outcomes, from a disease of inexorable joint destruction and disability to one where sustained remission is now a realistic and achievable treatment target [1]. Many of these advances have been realised through the effective use of disease-modifying antirheumatic drugs (DMARDs), especially their initiation in the early phases of disease and escalation in a treat-to-target fashion [2].
Although transformative for patients living with RA, the use of DMARDs comes at a price. Severe life-threatening toxicity is possible, including bone marrow suppression and hepatotoxicity [3]. Less severe but equally debilitating adverse effects are frequently encountered, such as nausea. Furthermore, the prescription and safety monitoring requirements are costly for healthcare providers and intrusive to patients' lifestyles. There are thus several motivations to consider DMARD minimisation in the setting of RA remission, a concept which is now recognised in international RA management guidelines [1,4]. Indeed, complete cessation of DMARDs is possible, with drug-free remission (DFR) a well-documented occurrence in 10-20% of patients in longitudinal cohorts [5,6].
Interventional studies of complete DMARD cessation in RA suggest that arthritis flare occurs in approximately half of cases [7-9], a risk that is likely to be unacceptably high for many patients and clinicians owing to the negative impact on quality of life [10,11] and the risk of cumulative joint damage [12,13] with arthritis flare. Prediction of DFR vs. flare prior to DMARD withdrawal would help identify patients in whom DMARD tapering and cessation is more likely to be successful; however, there are currently no reliable biomarkers of DFR to help guide clinicians and patients in this setting.
In this study, we present the findings of a prospective interventional study of complete cessation of conventional synthetic DMARDs in patients with RA in stable remission. Our aim was to identify biomarkers across a broad spectrum of domains (clinical, ultrasound, serological, and transcriptional) which, when measured prior to DMARD cessation, predict future drug-free remission.

Recruitment criteria
Eligible patients were identified by their supervising rheumatology clinical team across five National Health Service (NHS) Trusts in the North East of England between September 2014 and October 2016. Patients were eligible for study enrolment if they had a clinical diagnosis of RA made at least 12 months previously and were judged to be currently in clinical remission by their healthcare professional. Only methotrexate, sulfasalazine and/or hydroxychloroquine therapy was permitted; patients receiving biologics or any other DMARD in the past 6 months (or 12 months in the case of leflunomide), or glucocorticoids (enteral, parenteral or intra-articular) in the past 3 months, were excluded. Patients who were part of another clinical trial, and women who were planning pregnancy in the next 6 months, were also excluded.

Study design
In order to be eligible for DMARD cessation, patients had to be in clinical remission at the point of study enrolment with no power Doppler signal on a 7-joint ultrasound examination (see below). Initially, the 2011 ACR/EULAR Boolean remission criteria [14] were used to define remission. However, in order to facilitate recruitment and allow for analysis of baseline ACR/EULAR Boolean remission criteria as a predictor of drug-free clinical remission, this was changed following study amendment approval to a disease activity score in 28 joints with C-reactive protein (DAS28-CRP) < 2.4 [15,16]. Eligible patients completely stopped all DMARD therapy without tapering. All other medications were continued, including non-steroidal anti-inflammatory drugs if required. Routine study reviews were scheduled at month 1, month 3 and month 6, with additional study visits at patient request in the case of suspected flare. The primary outcome was time-to-flare, defined as a DAS28-CRP ≥ 2.4 at any time during the six-month follow-up period. A single measure of DAS28-CRP ≥ 2.4 was permitted if there was an alternative explanation (e.g. concurrent infection causing a rise in inflammatory markers); in these cases, a repeat DAS28-CRP < 2.4 two weeks later was mandatory for continuation in the study. Patients who experienced an arthritis flare could receive glucocorticoids (parenteral, intra-articular or enteral) at physician discretion, before being discharged from the study to rapidly recommence DMARDs under the guidance of their rheumatologist.
The study design was approved by the North East - Tyne & Wear South Research Ethics Committee (National Health Service Health Research Authority, reference 14/NE/1042). The study was conducted in accordance with the Declaration of Helsinki, and all patients provided informed written consent.

Clinical variable assessment
A pre-specified list of clinical variables was recorded, with corroboration of data by clinical notes review (Supplementary Table S1).
Serum C-reactive protein (CRP), erythrocyte sedimentation rate (ESR), rheumatoid factor (RhF) titre, and anti-citrullinated peptide autoantibody (ACPA) titre were measured by the hospital clinical laboratory. Where CRP levels fell below the detection threshold of the local laboratory (< 5 mg/L), a value of zero was used for the purposes of DAS28-CRP calculation.
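The handling of undetectable CRP can be illustrated with a minimal sketch of the DAS28-CRP calculation. It uses the standard published four-variable DAS28-CRP formula, which is assumed here rather than stated in the text. The function and argument names are hypothetical, and representing an undetectable CRP as `None` is an assumption about how the laboratory result would be encoded.

```python
import math

FLARE_THRESHOLD = 2.4       # DAS28-CRP >= 2.4 defines flare in this study
CRP_DETECTION_LIMIT = 5.0   # mg/L, detection threshold of the local laboratory


def das28_crp(tender28, swollen28, crp_mg_l, patient_global_vas_mm):
    """DAS28-CRP from tender/swollen 28-joint counts, CRP (mg/L) and
    patient global assessment on a 0-100 mm visual analogue scale."""
    # Study convention: a CRP below the detection threshold is entered as zero.
    # None (assumed encoding of a "< 5 mg/L" report) is treated the same way.
    if crp_mg_l is None or crp_mg_l < CRP_DETECTION_LIMIT:
        crp = 0.0
    else:
        crp = crp_mg_l
    return (0.56 * math.sqrt(tender28)
            + 0.28 * math.sqrt(swollen28)
            + 0.36 * math.log(crp + 1.0)
            + 0.014 * patient_global_vas_mm
            + 0.96)


def is_flare(score):
    """True when the score meets the study definition of flare."""
    return score >= FLARE_THRESHOLD
```

Under this convention the CRP term contributes nothing whenever CRP is undetectable, so a patient with no tender or swollen joints and a global assessment of zero scores the formula's constant of 0.96.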

Ultrasound (US) assessment
US was performed at study enrolment and month 6 using the same machine (Xario XG Diagnostic Ultrasound System model SSA-680A, Toshiba Medical Systems Corporation) by the same operator (KFB), who is trained in musculoskeletal US assessment. All scans were performed using the same linear mixed array transducer (part number PLT-1204BT). B-mode frequency was fixed at 12 MHz for all scans, and B-mode gain was individually set to a level providing optimal contrast between soft tissue, tendons and bony surfaces. Power Doppler images were acquired at a Doppler frequency of 5.3 MHz for all scans, with Doppler gain individually set to the maximum level possible without cortical bone artefact.
A minimum of 30 still images were recorded per scan, corresponding to the individual views of the seven joints of the US7 protocol of Backhaus et al. [17]: the dominant wrist, 2nd and 3rd metacarpophalangeal joints, 2nd and 3rd proximal interphalangeal joints, and 2nd and 5th metatarsophalangeal joints. Baseline scans were performed blinded to the disease activity score. The level of greyscale (GS) change at each joint, and the levels of power Doppler (PD) signal at each joint and tendon complex, were scored using semi-quantitative scales (0-3) as per the approach of Scheel et al. [18] and Szkudlarek et al. [19] respectively. Tendon-associated GS and joint erosions were scored as either present (1) or absent (0). Minor vessel-related Doppler signal at the wrist was not scored as power Doppler signal so long as all of the following criteria were satisfied: a) only a single vessel was present; and b) the origin of the vessel could be easily visualised as arising from a vessel superficial to the tendons of extensor digitorum; and c) no further branching of the vessel occurred deep to the tendons of extensor digitorum; and d) the vessel did not traverse any areas of any level of greyscale change. Such an approach is in keeping with representative images from a published atlas of musculoskeletal ultrasonographic scoring for use in clinical research [20]. Scan images were rescored by KFB and a second observer (BT) with good intra- and inter-rater agreement (overall Cohen's kappa 0.73 and 0.62 respectively).
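As an illustration of the agreement statistic quoted above, the following is a minimal sketch of unweighted Cohen's kappa for two sets of categorical scores. Whether the study used an unweighted or weighted kappa is not stated, so the unweighted form is an assumption, and the function name is hypothetical.

```python
from collections import Counter


def cohens_kappa(scores_a, scores_b):
    """Unweighted Cohen's kappa for two equal-length lists of category labels
    (for example semi-quantitative ultrasound grades 0-3)."""
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("score lists must be non-empty and of equal length")
    n = len(scores_a)
    # Observed proportion of items on which the two ratings agree.
    observed = sum(a == b for a, b in zip(scores_a, scores_b)) / n
    # Agreement expected by chance, from each rating's marginal frequencies.
    freq_a = Counter(scores_a)
    freq_b = Counter(scores_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both ratings use a single identical category throughout
    return (observed - expected) / (1.0 - expected)
```

The same function covers intra-rater agreement (one observer scoring the images twice) and inter-rater agreement (two observers), since both reduce to comparing two lists of grades for the same images.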

CD4+ T cell isolation and RNA extraction
CD4+ T cells were isolated from peripheral blood samples by negative CD36 selection followed by positive CD4 selection as previously described [21]. The median (IQR, range) purity of CD4+ T cell isolations was 99.0% (98.3-99.3, 95.8-99.7) as confirmed by flow cytometry, with a median (IQR) yield of 2.2 (1.6-2.9) × 10^5 cells per ml whole blood. Extracted T cells were then immediately lysed in the presence of β-mercaptoethanol before freezing at −80°C. Frozen T cell lysates were subsequently thawed, and RNA was extracted using the AllPrep™ DNA/RNA/miRNA Universal Kit (Qiagen) as per the manufacturer's instructions. The quantity and quality of RNA in each T cell lysate was measured by gel electrophoresis using a Tapestation™ 4200 machine (Agilent). The median (IQR, range) RNA yield was 838 (636-976, 277-2275) ng per million cells lysed. The quality of RNA was excellent, with a median (IQR, range) estimated RNA integrity number (RINe) of 9.4 (9.1-9.5, 8.7-9.8).

Next-generation RNA sequencing (RNAseq)
1.5 μg of total RNA per sample was used for RNAseq processing; where total RNA < 1.5 μg, the entire sample was used. Total RNA was processed using the TruSeq™ Stranded mRNA Library Prep Kit (Illumina), according to the 'High Sample Protocol' section of the manufacturer's instructions. RNAseq was performed using an Illumina NextSeq™ 500 in high-output mode. This configuration delivered 400 million reads over 75 cycles for 40 samples loaded across 4 lanes per flow cell. Sequencing was performed in batches across 4 separate flow cell sequencing runs. Samples were allocated to sequencing batches such that computational correction for any batch-to-batch variation at the level of either the RNA extraction (6 batches) or RNA sequencing (4 batches) could be achieved, according to a predetermined experimental design using the duplicate correlation command of the 'limma' Bioconductor/R package (v3.32.5) [22]. Samples were sequenced to a mean (range) depth of 12.1 (9.4-18.4) million reads per sample, with excellent quality demonstrated by a mean Phred score > 30 across all read positions.
Transcript abundance was estimated from the raw FASTQ files using Kallisto software (v0.43.0) [23] run in single-end mode, and using an index based on Gencode v24 transcript sequences [24]. Read counts were imported to R (v3.4.1) [25] using the 'tximport' package [26], removing genes with a mean read count of < 60. Gene annotation using the Ensembl GRCh38 assembly [27] was performed using the 'biomaRt' package [28]. Read counts were normalised using trimmed mean of M-values normalisation (TMM), and were then logarithmically transformed to log counts per million (logCPM) using the variance modelling at the observational level (voom) approach [29]. CD4+ T cell gene expression data are available at the NCBI Gene Expression Omnibus (accession number GSE122612).
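The filtering and log transformation steps can be sketched as follows. This is a simplified illustration in Python rather than the R pipeline used in the study: it applies the mean read count filter and the log2 counts per million transformation as defined in the voom publication, but it does not implement TMM normalisation or voom's precision weights. The function names are hypothetical, and the caller is assumed to supply effective library sizes (raw library size multiplied by the TMM factor).

```python
import math

MIN_MEAN_COUNT = 60  # genes with a mean read count below this were removed


def filter_genes(counts):
    """Keep genes whose mean read count across samples is at least 60.

    counts: dict mapping gene identifier -> list of per-sample read counts.
    """
    return {gene: row for gene, row in counts.items()
            if sum(row) / len(row) >= MIN_MEAN_COUNT}


def log_cpm(counts, effective_lib_sizes):
    """voom-style transformation: log2((count + 0.5) / (lib_size + 1) * 1e6).

    effective_lib_sizes: one value per sample, assumed to be already
    TMM-adjusted (TMM itself is not computed in this sketch).
    """
    return {
        gene: [math.log2((c + 0.5) / (n + 1.0) * 1e6)
               for c, n in zip(row, effective_lib_sizes)]
        for gene, row in counts.items()
    }
```

The offsets of 0.5 and 1 keep the logarithm defined for zero counts and ensure the ratio stays strictly between zero and one.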

Serum protein biomarkers
The levels of 39 circulating cytokines, chemokines and acute phase proteins were measured by electrochemiluminescence (V-PLEX™ plates, MesoScale Discovery) according to the manufacturer's instructions. All baseline samples were processed together on the same plates to avoid batch variation. Assays where < 20% of measurements fell above the lower limit of detection were excluded, leaving 26 biomarkers available for analysis (Supplementary Fig. S1). Baseline serum protein biomarker data were unavailable for one patient, who was excluded from the serum and integrative biomarker analyses.
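The assay exclusion rule can be expressed as a short sketch. The data layout (one list of measurements and one lower limit of detection per assay) and the function names are assumptions made for illustration only.

```python
MIN_DETECTABLE_FRACTION = 0.20  # assays below this fraction were excluded


def detectable_fraction(values, lower_limit):
    """Fraction of measurements that fall above the lower limit of detection."""
    return sum(1 for v in values if v > lower_limit) / len(values)


def retained_assays(measurements, lower_limits):
    """Return the names of assays kept for analysis.

    measurements: dict mapping assay name -> list of measured concentrations.
    lower_limits: dict mapping assay name -> lower limit of detection.
    An assay is excluded when fewer than 20% of its measurements are detectable.
    """
    return [name for name, values in measurements.items()
            if detectable_fraction(values, lower_limits[name])
            >= MIN_DETECTABLE_FRACTION]
```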

Statistical analysis
This was an exploratory study to identify biomarkers for future validation, and the statistical analyses were accordingly designed to prioritize the reduction of type II error. Analysis was performed in the R environment, version 3.3.2 [25], with additional packages as specified, according to the following standardised schedule.
First, the association between each variable and time-to-flare was analysed by univariate Cox regression within each variable domain (i.e. clinical, ultrasound, serum protein, RNAseq) using the 'survival' package [30]. Next, variables were selected based on their univariate p-value to be taken forward to a multivariate Cox regression model. For clinical, ultrasound and cytokine data, an elevated significance threshold (p < 0.2) was used in order to reduce the risk of type II error at this preliminary stage, in keeping with established precedent [31,32]. A more stringent significance threshold (p < 0.001) was utilised for RNAseq univariate analysis in reflection of the greater number of variables analysed. Variables were then advanced to multivariate Cox regression with backwards stepwise variable selection based on the Akaike information criterion (using the 'MASS' package [14]). Variables that remained significantly (p < 0.05, or < 0.001 for RNAseq data) associated with time-to-flare in each domain multivariate model were then combined in a final multivariate integrative analysis to form a composite score, weighted by their respective coefficients. No significant departure from proportional hazards (as assessed by Schoenfeld residuals) was observed except where stated. An optimum biomarker threshold based on Youden's index was then calculated using receiver operating characteristic (ROC) analysis to assess the sensitivity/specificity and area under the ROC curve (ROC AUC), with 95% confidence intervals calculated using bootstrapping (2000 replicates) and the DeLong procedure respectively (using the 'pROC' package [33]). Survival curves were compared between the dichotomised groups (using the 'survminer' package [34]) by the log-rank test as a measure of their utility in predicting time-to-flare after DMARD cessation.
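The threshold selection step can be illustrated with a minimal sketch of Youden's index, J = sensitivity + specificity − 1, maximised over candidate thresholds. This is a simplified Python illustration and not the 'pROC' implementation used in the study: it takes each observed score as a candidate threshold and does not compute confidence intervals. The function name and the convention that a score at or above the threshold predicts flare are assumptions.

```python
def youden_threshold(scores, flared):
    """Return (threshold, sensitivity, specificity) maximising Youden's J.

    scores: biomarker score per patient.
    flared: True where the patient flared, False where remission was sustained.
    A patient is predicted to flare when score >= threshold.
    """
    positives = sum(1 for f in flared if f)
    negatives = len(flared) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both flare and remission cases are required")
    best = None
    for threshold in sorted(set(scores)):
        true_pos = sum(1 for s, f in zip(scores, flared) if f and s >= threshold)
        true_neg = sum(1 for s, f in zip(scores, flared) if not f and s < threshold)
        sensitivity = true_pos / positives
        specificity = true_neg / negatives
        j = sensitivity + specificity - 1.0
        # Strict comparison keeps the lowest threshold when several tie on J.
        if best is None or j > best[0]:
            best = (j, threshold, sensitivity, specificity)
    return best[1], best[2], best[3]
```

Maximising J weights sensitivity and specificity equally, so the chosen threshold makes no allowance for any difference in the clinical cost of a missed flare versus an unnecessary continuation of therapy.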

Patient outcomes
78 patients attended for baseline assessment, of whom 44 were eligible for DMARD cessation (Fig. 1). Prior to revision of the remission criterion by protocol amendment, one patient exited the study at 69 days despite remaining in DAS28-CRP remission, and was censored in remission at this time point. Of the patients who discontinued DMARDs, the majority had established but stable disease, all were Caucasian, and all satisfied the 2010 ACR/EULAR RA classification criteria [35] (Table 1).
Fig. 1. Study design and recruitment. 78 patients attended a baseline visit, of whom 44 stopped disease-modifying anti-rheumatic drug (DMARD) therapy. Patients then attended routine study visits at 1, 3 and 6 months following DMARD cessation, with additional unscheduled visits at the request of the patient in the event of suspected arthritis flare. Flare was confirmed if disease activity score in 28 joints with CRP (DAS28-CRP) ≥ 2.4, at which point the patient exited the study to restart DMARD therapy via their referring rheumatology team. Patients who maintained drug-free remission at 6 months remained without DMARDs and exited the study. PD: power Doppler.
23/44 (52%) patients experienced an arthritis flare at a median (IQR) time to flare of 48 (31.5-86.5) days after DMARD cessation (Fig. 2). The median (IQR, range) DAS28-CRP score at the time of flare was 3.12 (2.62-3.94, 1.58-4.51). One patient was classified as flare despite a DAS28-CRP of 1.58 due to the presence of synovitis (clinical and ultrasound) in the ankles and feet; discounting this patient gives a DAS28-CRP range of 2.45-4.51 at the time of flare. A further patient (who maintained DFR) was treated with a 7-day course of oral prednisolone by their general practitioner for nasal polyposis at 5 months after DMARD cessation; no other patients received systemic steroids during the course of the study.
There were no breaches of study protocol. There were 101 adverse events recorded, none of which were judged to be a consequence of DMARD cessation (Supplementary Table S2). There were no serious adverse events.

Clinical biomarkers
The association between baseline clinical variables and time-to-flare was assessed by univariate Cox regression (Supplementary Table S3). Those with a univariate p < 0.2 were advanced to form a multivariate stepwise Cox model incorporating 9 baseline clinical variables, of which 6 were associated with time-to-flare at the p < 0.05 significance level (Table 2). RhF positivity, longer time from diagnosis to starting first DMARD, and longer symptom duration at time of diagnosis were all associated with an increased hazard of flare. In contrast, fulfilment of ACR/EULAR Boolean remission criteria at baseline, longer time since last change in DMARD therapy, and longer disease duration were associated with a reduced hazard of flare.

Ultrasound biomarkers
Total greyscale synovial, greyscale tenosynovial, and joint erosion scores were not significantly associated with time-to-flare in univariate Cox regression analysis (Supplementary Table S4).

CD4+ T cell RNAseq biomarkers
The baseline expression of 19 genes within peripheral CD4+ T cells was associated with time-to-flare in univariate Cox regression analyses at the p < 0.001 significance threshold (Supplementary Table S6). From these genes, a multivariate stepwise Cox regression model was formed incorporating 11 genes, of which three were significant at the p < 0.001 threshold (Supplementary Table S7). Two of these genes were associated with an increased hazard ratio (HR) of flare: family with sequence similarity 102 member B (FAM102B; HR for flare 1060, 95% CI 22.6-50000, p = 3.88 × 10^−4) and the predicted novel antisense gene ENSG00000227070 (HR for flare 5.94, 95% CI 2.08-16.9, p = 8.63 × 10^−4). In contrast, the remaining gene (ENSG00000228010, also a predicted novel antisense gene) was associated with an increased chance of sustained drug-free remission (HR for flare 0.02, 95% CI 0.00-0.14, p = 2.24 × 10^−4).

Integrative biomarker analysis
Based on the aforementioned multivariate analyses, 11 baseline variables were advanced to a final integrative analysis: six clinical variables (RhF status, ACR/EULAR Boolean remission, months since last change in DMARD therapy, time from diagnosis to first DMARD, symptom duration at diagnosis, disease duration), two cytokines/chemokines (IL-27, MCP1), and three CD4+ T cell genes (FAM102B, ENSG00000227070, ENSG00000228010). In a multivariate backwards stepwise Cox regression model, there was some evidence for departure from the proportional hazards assumption attributable to minor outlying data for ACR/EULAR Boolean remission only, although not for the model as a whole (p = 0.36). Five variables were significantly (p < 0.05) associated with time-to-flare (Table 3), and were combined with their respective coefficients to form a composite biomarker score. ROC analysis was used to set an optimum threshold (39.65) for the prediction of flare following DMARD cessation (Fig. 3A). The composite biomarker score performed well in predicting arthritis flare, with a sensitivity of 0.91 (95% CI 0.78-1.00), specificity of 0.95 (0.84-1.00), positive predictive value of 0.96 (0.86-1.00), negative predictive value of 0.90 (0.78-1.00), and ROC AUC of 0.96 (0.91-1.00). A negative composite biomarker score (< 39.65) was a strong predictor of sustained DMARD-free remission, with a significant difference in DMARD-free survival between those with positive versus negative baseline scores (p < 0.0001, log-rank test) (Fig. 3B).
To account for reclassification of one patient on grounds of ankle/feet flare (see Patient outcomes above), a sensitivity analysis was performed with this patient classified in remission, with no notable effect on biomarker performance (Supplementary Tables S8 and S9).
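The construction and evaluation of a composite score of this kind can be sketched as follows. The score is taken to be a coefficient-weighted sum of the baseline variables, as described in the statistical analysis section, and a score at or above the reported threshold of 39.65 is treated as positive. The model coefficients are not reproduced here: the variable names and coefficient values in the usage note are hypothetical placeholders, as are the function names.

```python
THRESHOLD = 39.65  # optimum threshold reported for the composite score


def composite_score(values, coefficients):
    """Weighted sum of baseline variables using their model coefficients."""
    return sum(coefficients[name] * values[name] for name in coefficients)


def predicts_flare(score, threshold=THRESHOLD):
    """A positive score (>= threshold) predicts flare; a negative score
    (< threshold) predicts sustained drug-free remission."""
    return score >= threshold


def _ratio(numerator, denominator):
    # Undefined when a group is empty; reported as NaN rather than raising.
    return numerator / denominator if denominator else float("nan")


def diagnostic_metrics(predicted, flared):
    """Sensitivity, specificity, PPV and NPV from paired boolean lists."""
    pairs = list(zip(predicted, flared))
    tp = sum(1 for p, f in pairs if p and f)
    fp = sum(1 for p, f in pairs if p and not f)
    fn = sum(1 for p, f in pairs if not p and f)
    tn = sum(1 for p, f in pairs if not p and not f)
    return {
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
    }
```

For example, with placeholder coefficients `{"marker_a": 2.0, "marker_b": -1.0}` and values `{"marker_a": 25.0, "marker_b": 5.0}`, `composite_score` returns 45.0, which `predicts_flare` classifies as positive. Unlike sensitivity and specificity, the predictive values depend on the proportion of patients who flare, so they would be expected to shift in a cohort with a different flare rate.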

Discussion
Advances in the management of RA over recent years have made sustained remission a realistic and achievable goal for many patients. This has, however, ushered a new dilemma into the clinic: how best to manage potentially toxic and costly DMARD therapy in such individuals. There have been several recently published studies of DMARD tapering in RA remission [36], although the majority focus on the partial tapering or spacing of biologic agents with continuation of conventional synthetic DMARDs, rather than addressing the concept of complete DMARD cessation.
Our unique study design of abrupt and complete DMARD cessation enabled us to compare baseline characteristics in subsequently flaring and non-flaring patients. 21/44 (48%) patients maintained clinical remission for 6 months following DMARD cessation, an observation that is comparable with previously published studies. In the RETRO study, randomisation to withdrawal (tapering followed by cessation, or immediate cessation) of a variety of biologics and conventional synthetic DMARDs resulted in DFR (DAS28-ESR < 2.6) in 35/63 (56%) patients at 12 months, compared to sustained remission in 32/38 (84%) patients who continued DMARD therapy [8]. In the BeSt study, DFR (DAS44 < 1.6) was observed in 59/115 (51%) patients who tapered DMARDs to complete cessation, with a median duration of remission of 23 months [7]. The consistent rate of DFR observed across these studies is remarkable given the heterogeneity of DMARD therapy, and perhaps suggests an intrinsic propensity for DFR within disease subtypes that may be independent of the specific DMARDs initially used to achieve remission [37]. In this context the relationship between drug-free remission and true immune tolerance deserves further exploration, particularly because there is a high unmet need for biomarkers to guide tolerogenic therapy development and implementation [38].
Lower disease activity at the point of DMARD cessation/tapering was predictive of DFR in several previous studies [5,7,39], in keeping with the predictive value of ACR/EULAR Boolean remission observed in