Original Article
Three risk of bias tools lead to opposite conclusions in observational research synthesis
Introduction
Assessing the methodological quality or risk of bias (RoB) of primary studies is an essential component of any systematic review or meta-analysis [1], [2] and should play a relevant role in interpreting the results of the review [3]. Moreover, the inclusion of poor-quality studies in a review may lead to invalid conclusions [3], [4]. In fact, the results of such quality assessments often exert an important influence on some decisions made in the review process, such as whether to exclude studies not meeting certain quality standards, to perform sensitivity analyses, to determine the strength of evidence, or to guide recommendations for future research and clinical practice [5], [6].
Compared with clinical trials, the quality assessment of observational studies is often more demanding because of the variety of designs involved and their greater susceptibility to bias [5], [7], [8]. These difficulties probably explain why, in some areas such as health psychology, only about half of all reviews that include cohort and case–control studies assessed the RoB of the primary studies [9]. Although several authors have reviewed a wide range of tools suitable for observational studies [10], [11], [12], there is no consensus on the best procedure or tool for assessing RoB in observational designs, even though observational studies are commonly included in systematic reviews, including those of Cochrane [13]. Moreover, most of these tools were poorly developed, and their developers often failed to follow standard methodological procedures or to test their tools' validity and reliability [10], [14]. Thus, RoB assessments of a single study using different tools may lead to different conclusions [4], [15], [16], both in randomized controlled trials [1], [14], [17] and in observational studies [7], [8], [18].
Meanwhile, the use of scales that provide a single summary score is strongly discouraged [4], [15], [19] because it involves weighting the component items, some of which may not be related to RoB [3], [11]. The alternative seems to be a domain-based RoB assessment [20], [21], [22], [23], an approach that is increasingly applied and that appears to provide a more structured framework within which to make qualitative decisions on the overall quality of studies and to detect potential sources of bias [16].
The general purpose of this study was to assess the agreement and compare the performance of three different instruments in assessing the RoB of comparative cohort studies included in a meta-analysis related to health psychology. The selected tools were as follows: (1) the Newcastle-Ottawa Scale (NOS) [24], the most frequently used scale to assess the quality of cohort and case–control studies [9], which provides a summary score; (2) quality of cohort studies (Q-Coh) [21], a domain-based tool with good psychometric properties designed specifically to assess the RoB of cohort studies; and (3) risk of bias in nonrandomized studies of interventions (ROBINS-I) [22], a new domain-based tool proposed by Cochrane, which is intended to assess RoB in nonrandomized studies of interventions but is also applicable to a wide variety of observational designs [25]. The specific objectives were as follows:
- To estimate, for each tool, the degree of interrater agreement when examining items, domains of RoB, and overall quality rating.
- To estimate the level of agreement between tools for specific biases, domains of RoB, and overall quality rating.
- To appraise the qualitative aspects of the tools related to their usability: the average time spent, clarity of instructions and items, coverage, and validity.
- To determine the effect of quality ratings on the results of a meta-analysis.
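As an illustration of the first two objectives, the sketch below computes an unweighted Cohen's kappa for two sets of categorical RoB ratings. This excerpt does not state which agreement statistic the authors used; kappa is assumed here only because the reference list cites several papers on its behavior. The function name `cohen_kappa` and the ratings are hypothetical and are not data from the study.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equally long lists of category labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)

    # Observed agreement: share of studies given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: product of the raters' marginal proportions per category.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    categories = freq_a.keys() | freq_b.keys()
    expected = sum(freq_a[c] * freq_b[c] for c in categories) / (n * n)

    if expected == 1.0:
        # Both raters used one identical category throughout; kappa is
        # undefined (0/0), so perfect agreement is returned by convention.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical overall RoB ratings for eight studies (illustrative only).
rater_1 = ["low", "low", "moderate", "high", "low", "moderate", "low", "high"]
rater_2 = ["low", "moderate", "moderate", "high", "low", "low", "low", "high"]

kappa = cohen_kappa(rater_1, rater_2)
print(f"{kappa:.2f}")  # prints 0.60 (observed 0.75, chance 0.375)
```

The same function could be applied to agreement between tools by passing the overall ratings from two instruments instead of two raters, provided both sets are first mapped onto a common set of categories.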
Risk of bias assessment tools
The NOS [24] was developed to assess the quality of observational studies included in systematic reviews. This tool exists in separate versions for cohort and case–control designs, although only the scale for cohort studies was applied here. Studies are assessed using eight items broken down into three dimensions: selection (four items), comparability (one item), and exposure for case–control studies or outcome for cohort studies (three items). A study can be awarded a maximum of nine stars.
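The star arithmetic can be sketched as follows. Eight items yield a maximum of nine stars because, in the published NOS, the single comparability item can earn up to two stars; that detail comes from the scale itself rather than from this excerpt. The names `NOS_COHORT_MAX_STARS` and `nos_total` are hypothetical, and no cut-offs for RoB categories are assumed, since the excerpt does not report them.

```python
# Maximum stars per NOS dimension for the cohort version.
# Assumption drawn from the published scale: the one comparability
# item may be awarded up to two stars; every other item earns at most one.
NOS_COHORT_MAX_STARS = {"selection": 4, "comparability": 2, "outcome": 3}


def nos_total(stars):
    """Validate per-dimension star counts and return the total (0 to 9)."""
    total = 0
    for dimension, max_stars in NOS_COHORT_MAX_STARS.items():
        awarded = stars.get(dimension, 0)
        if not 0 <= awarded <= max_stars:
            raise ValueError(
                f"{dimension}: expected 0 to {max_stars} stars, got {awarded}"
            )
        total += awarded
    return total


# Hypothetical study: full marks except one selection item and one outcome item.
example_study = {"selection": 3, "comparability": 2, "outcome": 2}
print(nos_total(example_study))  # prints 7
```

A summary score of this kind weights every star equally, which is the property of scale-based tools that the introduction describes as discouraged.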
Risk of bias assessment
Fig. 1 shows a summary of the consensus results of RoB assessment for each tool, for overall RoB and by domain. The NOS scores ranged from 5 to 9, with a median and mode of 8 (25th percentile [p25] = 6; p75 = 9). Once the studies were classified into categories, there were 21 studies with low RoB and seven studies with moderate RoB. None of the studies were placed in the category of high RoB. According to the Q-Coh results, three studies were classified as low RoB, 11 studies as moderate RoB,
Discussion
Our comparison of three tools for RoB assessment of nonexperimental studies suggests that we are dealing here with three different approaches to RoB assessment, each of which could lead to different conclusions about the final quality grade assigned to each study. In this study, no agreement between tools was found for overall RoB. While 75% of the studies can be considered to be at low RoB when the NOS is applied, 86% of the studies would be at serious RoB according to ROBINS-I. Overall RoB
Conclusions
The present study, which compared the performance of three different tools in assessing the RoB of 28 cohort studies, shows that assessing the RoB of the same study with different tools may lead to opposite conclusions, especially at low and high levels of RoB: most of the studies were rated as low RoB with the NOS, whereas most were rated as high RoB with ROBINS-I. Therefore, both the NOS and ROBINS-I showed low capability in grading RoB in observational studies. Our
References (57)
- et al. Methodological quality is underrated in systematic reviews and meta-analyses in health psychology. J Clin Epidemiol (2017)
- et al. Inclusion of nonrandomized studies in Cochrane systematic reviews was found to be in need of improvement. J Clin Epidemiol (2014)
- et al. Impact of quality scales on levels of evidence inferred from a systematic review of exercise therapy and low back pain. Arch Phys Med Rehabil (2002)
- et al. A systematic review finds that diagnostic reviews fail to incorporate quality despite available tools. J Clin Epidemiol (2005)
- et al. Q-Coh: a tool to screen the methodological quality of cohort studies in systematic reviews and meta-analyses. Int J Clin Heal Psychol (2013)
- et al. Testing the Newcastle Ottawa Scale showed low reliability between individual reviewers. J Clin Epidemiol (2013)
- et al. Long sleep duration and health outcomes: a systematic review, meta-analysis and meta-regression. Sleep Med Rev (2018)
- et al. Association between stressful life events and autoimmune diseases: a systematic review and meta-analysis of retrospective case-control studies. Autoimmun Rev (2016)
- et al. Significant discrepancies were found in pooled estimates of searching with Chinese indexes versus searching with English indexes. J Clin Epidemiol (2016)
- et al. High agreement but low Kappa: I. The problems of two paradoxes. J Clin Epidemiol (1990)
- Behavior and interpretation of the κ statistic: resolution of the two paradoxes. J Clin Epidemiol
- High agreement but low kappa: II. Resolving the paradoxes. J Clin Epidemiol
- Bias, prevalence and kappa. J Clin Epidemiol
- Does quality of reports of randomised trials affect estimates of intervention efficacy reported in meta-analyses? Lancet
- The rationale for rating risk of bias should be fully reported. J Clin Epidemiol
- Risk of bias versus quality assessment of randomised controlled trials: cross sectional study. BMJ
- Panning for the gold in health research: incorporating studies' methodological quality in meta-analysis. Psychol Health
- Cochrane handbook for systematic reviews of interventions version 5.1.0 [updated March 2011]. The Cochrane Collaboration, 2011
- The hazards of scoring the quality of clinical trials for meta-analysis. JAMA
- Systematic reviews: CRD’s guidance for undertaking reviews in health care. CRD, University of York; 2009
- Systematic reviews in health care: assessing the quality of controlled clinical trials. Br Med J
- Reliability and validity of three quality rating instruments for systematic reviews of observational studies. Res Synth Methods
- Quality assessment of observational studies in a drug-safety systematic review, comparison of two tools: the Newcastle–Ottawa scale and the RTI item bank. Clin Epidemiol
- Evaluating non-randomised intervention studies. Health Technol Assess
- Tools for assessing quality and susceptibility to bias in observational studies in epidemiology: a systematic review and annotated bibliography. Int J Epidemiol
- Methodological quality assessment tools of non-experimental studies: a systematic review. An Psicol
- Assessment of study quality for systematic reviews: a comparison of the Cochrane collaboration risk of bias tool and the effective public health practice project quality assessment tool: methodological research. J Eval Clin Pract
- Adjustment of meta-analyses on the basis of quality scores should be abandoned. J Clin Epidemiol
Conflict of interest: None.
Funding: This work was supported by the Spanish Ministry of Science and Innovation (grant number: PSI2014-52962-P). I.O. was supported by a predoctoral grant from the Ministry of Education, Culture and Sport of the Spanish Government (grant number: FPU14/04514). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the article.