Systematic Review with Meta-Analysis: Fecal Calprotectin as a Surrogate Marker for Predicting Relapse in Adults with Ulcerative Colitis

The clinical course of ulcerative colitis (UC) is characterized by unpredictable periods of remission and relapse. Recent studies have suggested that fecal calprotectin (FC) can predict clinical relapse in UC patients in remission, but this has not yet been widely accepted. To assess the predictive value of FC for clinical relapse in adult UC patients based on updated literature, we carried out a comprehensive electronic search of PubMed, Web of Science, Embase, and the Cochrane Library to identify all eligible studies. Diagnostic accuracy measures, including pooled sensitivity, specificity, positive likelihood ratio (PLR), negative likelihood ratio (NLR), diagnostic odds ratio (DOR), and the area under the summary receiver operating characteristic (sROC) curve, were calculated using a random effects model. Heterogeneity across studies was assessed by the I² statistic. Sources of heterogeneity were explored using subgroup analysis. Metaregression was used to test potential factors correlated with DOR. Publication bias was assessed using Deeks' funnel plot. Fourteen articles enrolling a total of 1110 participants were finally included, and all articles underwent a quality assessment. The pooled sensitivity, specificity, PLR, and NLR were 0.75 (95% confidence interval (CI): 0.70–0.79), 0.77 (95% CI: 0.74–0.80), 3.45 (95% CI: 2.31–5.14), and 0.37 (95% CI: 0.28–0.49), respectively. The area under the sROC curve was 0.82, and the DOR was 10.54 (95% CI: 6.16–18.02). Our study suggests that FC is useful as a simple and noninvasive marker for predicting clinical relapse in adult UC patients in remission.


Introduction
Ulcerative colitis (UC), a subtype of inflammatory bowel disease (IBD) characterized by chronic mucosal inflammation, affects more than 1 million people in the United States and Europe [1,2]. The etiology of UC remains incompletely understood. Studies suggest that intestinal microbial dysbiosis, host genetics, and the external environment may play an important role in triggering the chronic inflammation of UC and in determining its subsequent disease behavior and outcomes [3][4][5]. The clinical course of UC is characterized by remission and relapse, and remains unpredictable [6]. Unpredictable clinical recurrence impairs the quality of life of UC patients and requires extended therapy as well as extra medical costs [7]. If patients at high risk of clinical flare-up could be identified, treatment could be adjusted at a presymptomatic stage. Therefore, earlier prediction of relapse is urgently needed by clinicians. Generally, endoscopy together with histological examination is considered the standard for assessing UC relapse [8]. However, as an invasive method, it is often poorly tolerated and inconvenient, which limits its use in predicting UC relapse [9]. A simple, reliable, and readily available test is needed to detect an imminent flare for timely escalation of treatment and better disease control.
Fecal calprotectin (FC), a calcium-binding protein, is mainly derived from neutrophils during inflammation. The concentration of FC reflects the extent of neutrophil migration to the gastrointestinal tract [10]. It is becoming the most useful noninvasive tool for monitoring the inflammatory status of the mucosa and for assessing patients' response to therapy [11][12][13]. However, the role of FC as a predictor of clinical relapse in UC patients remains controversial. In the present study, we aimed to pool the updated literature in this field and determine the predictive value of FC for clinical relapse in adult UC patients.

Materials and Methods
This meta-analysis was conducted in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement [14].
Literature Search.
Databases including PubMed, Web of Science, Embase, and the Cochrane Library were searched up to December 31, 2018 to identify all eligible studies. To avoid omission of potentially useful articles, we used both Medical Subject Heading (MeSH) terms and free words, including "Inflammatory Bowel Disease;" "Bowel Diseases, Inflammatory;" "Colitis, Ulcerative;" "ulcerative colitis;" "UC;" "calprotectin;" "Leukocyte L1 complex;" "relapse;" "recrudescence;" "recur;" "recrudesce;" and "recurrence." In addition, previous systematic reviews and meta-analyses were screened for potentially relevant studies. No language restriction was applied in the search strategy.

Study Selection.
A study was included if it met the following criteria: (1) it was a prospective study that used FC to predict UC relapse, (2) the FC level used for predicting UC relapse was measured at remission, (3) estimates of diagnostic accuracy (such as sensitivity or specificity) were provided, (4) the identification of relapse was based on clinical symptoms or endoscopic findings, and (5) the study was conducted in an adult population. Two reviewers (Jiajia Li and Xiaojing Zhao) independently screened the citations and reviewed the search results to determine article inclusion. In cases of discordance, a consensus was reached through discussion with another author (Xueting Li). Studies were excluded if they met either of the following criteria: (1) patients had other coexisting gastroenterological diseases or (2) UC was not analyzed separately from other IBDs such as Crohn's disease (CD) and inflammatory bowel disease unclassified (IBD-U).
Figure 1: Flow diagram of study selection. Records identified through PubMed, Embase, Web of Science, and Cochrane Library (n = 474); records after duplicates were removed (n = 361); articles excluded based on title or abstract (n = 307); full-text articles assessed for eligibility (n = 54); records excluded for reviews (n = 14); full-text articles for comprehensive analysis (n = 40); articles excluded, with reasons including not restricted to adults (n = 6) and no sufficient data (n = 12); articles included in this meta-analysis (n = 14).

Data Extraction.
To ensure accuracy, the quantitative data were collected independently by two investigators. Data were extracted using a standardized form, including general information (name of the first author, year of publication, and population characteristics), FC assay, test results, cutoff value, and follow-up time. Test results were presented as the numbers of true positives (TP), false positives (FP), false negatives (FN), and true negatives (TN) for each study.
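The accuracy measures derived from each study's TP, FP, FN, and TN counts can be sketched as follows. This is a minimal illustration rather than the software used in this study; the function name `accuracy_from_counts`, the 0.5 continuity correction for zero cells, and the example counts are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Accuracy:
    sensitivity: float
    specificity: float
    plr: float  # positive likelihood ratio
    nlr: float  # negative likelihood ratio
    dor: float  # diagnostic odds ratio


def accuracy_from_counts(tp, fp, fn, tn, correction=0.5):
    """Compute accuracy measures from one study's 2 x 2 table."""
    # Assumption: add a continuity correction to every cell only when a
    # zero cell would otherwise make a ratio undefined.
    if min(tp, fp, fn, tn) == 0:
        tp, fp, fn, tn = (x + correction for x in (tp, fp, fn, tn))
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return Accuracy(
        sensitivity=sensitivity,
        specificity=specificity,
        plr=sensitivity / (1 - specificity),
        nlr=(1 - sensitivity) / specificity,
        dor=(tp * tn) / (fp * fn),
    )


# Hypothetical counts, not taken from any included study.
example = accuracy_from_counts(tp=30, fp=10, fn=10, tn=50)
```

For a single table, the DOR equals the ratio PLR/NLR, which offers a quick consistency check on extracted counts.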

Quality Assessment.
Two authors rated each selected study for quality according to the QUADAS-2 (Quality Assessment of Diagnostic Accuracy Studies) tool, which is recommended by the Cochrane Diagnostic Reviewers' Handbook [15]. The QUADAS-2 checklist comprises four domains: patient selection, index test, reference standard, and flow and timing. Each of the first three domains is assessed for both risk of bias and concerns regarding applicability, while the last domain is assessed for risk of bias only. Disagreements were resolved by consensus.

Statistical Analysis.
First, for each study, sensitivity, specificity, positive likelihood ratio (PLR), negative likelihood ratio (NLR), and diagnostic odds ratio (DOR) were calculated after constructing a diagnostic 2 × 2 table. Then, pooled estimates of all included studies with 95% confidence intervals (CIs) were calculated using a DerSimonian-Laird random effects model. For threshold analysis, the correlation between sensitivity and specificity (presented as logit true positive rate (TPR) vs. logit false positive rate (FPR)) was tested to explore threshold effects, and the Moses-Shapiro-Littenberg model was used to assess whether the DOR was constant. After that, a summary receiver operating characteristic (sROC) curve was constructed; depending on whether the DOR was constant, a symmetrical or asymmetrical sROC curve was used [16]. Heterogeneity across studies was assessed by the I² statistic. Statistically, I² > 50% indicates significant heterogeneity, in which case the random effects model should be used; otherwise, the fixed-effect model should be adopted [17,18]. To investigate the source of heterogeneity, subgroup analysis was conducted. Preplanned subgroups were defined according to the number of patients in the respective studies (<80 or ≥80), mean age (<40 or ≥40), male ratio (<50% or ≥50%), FC assay (Bühlmann or non-Bühlmann), FC cutoff (<150 μg/g or ≥150 μg/g), and follow-up time (<1 y or ≥1 y). Potential factors correlated with DOR were also tested by metaregression. Finally, publication bias was tested using Deeks' funnel plot [19]. Continuous values were presented as mean ± standard deviation or a range, and discrete variables as numbers and percentages. A P value < 0.05 was considered statistically significant.
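As an illustration of the pooling step, the sketch below applies the DerSimonian-Laird estimator to study-level log DORs and reports I² alongside the pooled estimate. It is not the software used in this study; the function names and the 0.5 continuity correction added to every cell are assumptions made for illustration.

```python
import math


def log_dor_with_variance(tp, fp, fn, tn):
    """Return ln(DOR) and its approximate variance for one study."""
    # Assumption: 0.5 is added to every cell to avoid division by zero.
    a, b, c, d = (x + 0.5 for x in (tp, fp, fn, tn))
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d


def dersimonian_laird(effects, variances):
    """Pool effects with the DerSimonian-Laird random effects estimator.

    Returns (pooled, standard_error, tau2, i2_percent).
    """
    weights = [1 / v for v in variances]
    total = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / total
    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    c = total - sum(w * w for w in weights) / total
    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # I2: share of total variation attributed to heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    re_weights = [1 / (v + tau2) for v in variances]
    re_total = sum(re_weights)
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / re_total
    return pooled, math.sqrt(1 / re_total), tau2, i2
```

When the effects are log DORs, exponentiating the pooled value and the bounds pooled ± 1.96 × standard error gives the pooled DOR with an approximate 95% CI.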

Results
Study Selection and Characteristics of Included Studies.
The flow diagram of the study selection is summarized in Figure 1. The search strategy yielded 474 articles. After the removal of 113 duplicates, 361 citations remained. Based on the title or abstract, 307 citations were excluded, and the remaining 54 articles were assessed in full text. Fourteen articles were excluded because they were reviews. Another 26 studies were removed because they were not restricted to adults, mixed UC with diseases such as CD and IBD-U, or did not provide sufficient data. Finally, 14 articles enrolling a total of 1110 subjects were eligible for our meta-analysis [20][21][22][23][24][25][26][27][28][29][30][31][32][33]. The baseline characteristics of the included studies are shown in Table 1.

Assessment of Methodological Quality of the Included Studies.
The 14 studies underwent quality assessment using the QUADAS-2 tool. All included studies were judged to be of good quality, which supports the reliability of the pooled results. A summary of the results is presented in Figure 2.
The correlation between the TPR and FPR was not significant. The Moses-Shapiro-Littenberg method showed that the DOR was constant (b = −0.239, P = 0.212). Therefore, a symmetrical sROC curve was appropriate for calculating the diagnostic accuracy (Figure 4). As shown in Figure 4, the area under the sROC curve (AUC) was 0.82 (SE 0.027) and the Q* statistic was 0.76 (SE 0.025).
To examine whether any covariates were correlated with DOR, we performed metaregression. The covariates were the number of patients, age, male/female ratio, FC assay, cutoff value, and follow-up time. No significant correlation between the covariates and DOR was detected in the univariate metaregression analysis (Table 3).
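The Moses-Shapiro-Littenberg check and the symmetrical sROC summary can be sketched as follows. The regression D = a + bS is fitted with D = logit(TPR) − logit(FPR) and S = logit(TPR) + logit(FPR); a slope b close to zero indicates a constant DOR. This is an unweighted, illustrative sketch under that assumption, not the software used in this study, and the function names are hypothetical.

```python
import math


def logit(p):
    return math.log(p / (1 - p))


def moses_littenberg_fit(rates):
    """Ordinary least squares fit of D = a + b * S.

    rates: sequence of (tpr, fpr) pairs, each strictly between 0 and 1.
    D = logit(TPR) - logit(FPR) is the log DOR; S = logit(TPR) + logit(FPR)
    is a proxy for the positivity threshold. Returns (a, b).
    """
    rates = list(rates)
    d = [logit(t) - logit(f) for t, f in rates]
    s = [logit(t) + logit(f) for t, f in rates]
    n = len(d)
    s_mean, d_mean = sum(s) / n, sum(d) / n
    sxx = sum((x - s_mean) ** 2 for x in s)
    sxy = sum((x - s_mean) * (y - d_mean) for x, y in zip(s, d))
    b = sxy / sxx
    return d_mean - b * s_mean, b


def symmetric_sroc_summary(dor):
    """AUC and Q* of a symmetrical sROC curve with constant DOR (DOR != 1)."""
    # Closed-form area under TPR = DOR * FPR / (1 + (DOR - 1) * FPR).
    auc = dor / (dor - 1) ** 2 * ((dor - 1) - math.log(dor))
    # Q* is the point on the curve where sensitivity equals specificity.
    root = math.sqrt(dor)
    return auc, root / (1 + root)
```

Plugging the pooled DOR of 10.54 into `symmetric_sroc_summary` gives an AUC of about 0.83 and a Q* of about 0.76, close to the values of 0.82 and 0.76 reported from the fitted curve. The small difference in AUC may arise because the intercept of the fitted sROC regression need not equal the log of the separately pooled DOR.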

Publication Bias Analysis.
Deeks' funnel plot asymmetry test was used to assess the publication bias of the included studies (Figure 5), and no obvious publication bias was detected (P = 0.79).
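The regression behind Deeks' funnel plot asymmetry test can be sketched as follows: ln(DOR) is regressed on 1/√ESS, weighted by the effective sample size (ESS), where ESS = 4·n₁·n₂/(n₁ + n₂) for n₁ patients with relapse and n₂ without. This is an illustrative sketch rather than the software used in this study; the function names are hypothetical, and the P value is deliberately left out because the Python standard library has no t distribution.

```python
import math


def effective_sample_size(n_relapse, n_no_relapse):
    """ESS = 4 * n1 * n2 / (n1 + n2)."""
    return 4 * n_relapse * n_no_relapse / (n_relapse + n_no_relapse)


def deeks_regression(log_dors, ess):
    """Weighted regression of ln(DOR) on 1/sqrt(ESS), weighted by ESS.

    Returns (slope, intercept, slope_standard_error).
    Requires at least three studies.
    """
    x = [1 / math.sqrt(e) for e in ess]
    total = sum(ess)
    x_mean = sum(w * xi for w, xi in zip(ess, x)) / total
    y_mean = sum(w * yi for w, yi in zip(ess, log_dors)) / total
    sxx = sum(w * (xi - x_mean) ** 2 for w, xi in zip(ess, x))
    sxy = sum(
        w * (xi - x_mean) * (yi - y_mean)
        for w, xi, yi in zip(ess, x, log_dors)
    )
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    # Weighted residual sum of squares with n - 2 degrees of freedom.
    rss = sum(
        w * (yi - intercept - slope * xi) ** 2
        for w, xi, yi in zip(ess, x, log_dors)
    )
    slope_se = math.sqrt(rss / (len(x) - 2) / sxx)
    return slope, intercept, slope_se
```

The ratio of the slope to its standard error is compared with a t distribution with n − 2 degrees of freedom; a slope that differs significantly from zero suggests funnel plot asymmetry.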

Discussion
The clinical course of UC is characterized by periods of remission with recurrent episodes of symptom exacerbation caused by acute intestinal inflammation [34]. Presently, UC remains an incurable disease, and the aim of existing treatments is to induce remission, promote mucosal healing, and decrease the incidence of relapse [35]. Relapses in UC are hard to predict, and the identification of patients at high risk of clinical flare-up could allow targeted treatment at a presymptomatic stage. To better monitor the course of UC, FC has been proposed as a reliable biomarker for the prediction of possible relapse in patients in remission [36].
Our study revealed that FC had good predictive value for the clinical relapse of UC, with a pooled sensitivity of 0.75 (95% CI: 0.70–0.79) and a pooled specificity of 0.77 (95% CI: 0.74–0.80). The maximum joint sensitivity and specificity was 0.76 (SE 0.025), with an AUC of 0.82 (SE 0.027). These findings are consistent with a previous meta-analysis [37].
No consensus has been reached on the definition of clinical relapse in UC, and the criteria adopted in the included studies were not identical. The basic criterion of clinical relapse is worsening of symptoms. Other criteria included a TW score ≥ 11 [32,33] and a partial Mayo score ≥ 3 [20,22,23]; the remaining studies focused on the Mayo endoscopic subscore. This could be a source of heterogeneity. We noticed that the specificity in the study of Scaioli et al. [26] was extremely high (100%). This may result from the relatively loose definition of relapse used in that study. To date, the gold standard for assessing intestinal inflammation is histological examination [38]. A more accurate, well-recognized standard of clinical relapse should be established.
The FC value adopted in our analysis was the baseline value measured at enrollment. However, elevated FC is not synonymous with bowel inflammation, and FC levels are subject to day-to-day and diurnal variation. Thus, many studies have held that consecutive FC measurements could better predict relapse in patients with UC [40]. De Vos et al. suggested that two consecutive measurements >300 mg/kg were more specific than a single measurement for predicting relapse [30]. Consecutive FC measurements can monitor disease status, and the likelihood of clinical relapse increases if FC remains abnormal. We therefore suggest that patients who attend regular outpatient visits during remission undergo consecutive FC measurements.
Subgroup analysis showed that the diagnostic accuracy was higher in studies with a longer follow-up time (≥1 year) than in those with a shorter follow-up time. This suggests that FC is more useful in predicting long-term outcome. Since the disease course of UC is chronic, this finding may help in the long-term monitoring of UC patients.
Patients included in this analysis were restricted to adults. However, in clinical practice, pediatric patients make up an important part of the UC population and are drawing increasing attention [41]. Relevant studies show that FC can also serve as an activity marker of IBD in children [42,43]. Walkiewicz et al. found that among children with CD in remission, FC levels may be useful in predicting impending clinical relapse. In their study, 89% of CD encounters with FC levels less than 400 μg/g remained in clinical remission [44]. However, it has been reported that the reference ranges of FC in children are age-related and vary considerably [45]. In addition, the clinical characteristics of pediatric and adult UC patients differ substantially. Therefore, only articles restricted to adult patients were included in our meta-analysis. To establish the role of FC in predicting relapse in pediatric UC patients, more clinical studies should be conducted.
Recently, the quantitative fecal immunochemical test (FIT) has been proposed as a surrogate method to predict relapse in ulcerative colitis [46,47]. FIT holds several advantages over FC with regard to user friendliness, including a lower cost, easy and clean handling, and the ability to make rapid measurements by using an automated measurement system [48]. Moreover, studies have reported that diagnostic accuracy improved significantly when FIT was applied together with FC and other biomarkers such as CRP [49,50]. However, the present data remain insufficient, and further studies regarding the combination of FIT and FC for predicting relapse of UC are warranted.

Conclusion
Our results support the utility of FC for predicting UC relapse in adults. Owing to its simplicity and noninvasiveness, measuring FC levels at clinical remission appears to be a reliable and reproducible indicator for predicting UC relapse. More well-designed studies are required to confirm our results and to find the best cutoff value of FC concentration for identifying recurrence in UC patients in remission.

Disclosure
Part of the study's results was selected as a poster presentation in Abstracts Published Only, Journal of Digestive Diseases. The abstract was also presented at the 26th United European Gastroenterology Week, Vienna, 2018.