Review
Outcomes in systematic reviews of complex interventions never reached “high” GRADE ratings when compared with those of simple interventions

https://doi.org/10.1016/j.jclinepi.2016.03.014

Abstract

Objectives

To investigate the application of the Grading of Recommendations Assessment, Development, and Evaluation (GRADE) approach and the quality of evidence ratings in systematic reviews of complex interventions.

Study Design and Setting

This study examined all 40 systematic reviews published in three Cochrane Review Groups from 2013 to May 2014: Cochrane Developmental, Psychosocial and Learning Problems Group (CDPLPG); Cochrane Public Health Group (CPHG); and Cochrane Depression, Anxiety, and Neurosis Group (CCDAN). The reviews were coded and classified into “complex” (n = 24) and “simple” (n = 16) intervention review groups based on the predefined complexity dimensions from the extant literature mapped into the PICOTS framework. All the data were analyzed in these two groups to help identify specific patterns of the GRADE ratings in the reviews of complex interventions.

Results

Outcomes of complex intervention reviews had higher proportions of “very low” quality of evidence ratings compared with those of simple intervention reviews (37.5% vs. 9.1% for the primary benefit outcomes) and were more frequently downgraded for inconsistency, performance bias, and study design. None of the outcomes of complex intervention reviews (0%) were given “high” GRADE ratings.

Conclusion

Results suggest that the GRADE assessment may not adequately describe the evidence base of complex interventions.

Introduction

As compared to many pharmacologic and clinical treatments, interventions commonly used in the disciplines of psychology, education, social work, criminology, and public health are most frequently complex [1]. A recent content analysis of 207 published articles shows that the vast majority of scientific publications in health research use dimensions suggested by the Medical Research Council (MRC) to characterize complex interventions [2]. These include multiplicity and interactions between intervention components, number and difficulty of behaviors required by those delivering or receiving the intervention, number of groups targeted by the intervention, multiplicity and variety of outcomes, and degree of flexibility or tailoring of the intervention [1].

With the adoption of complex dynamic systems approaches in epidemiology and health care research in the last decade, however, less emphasis is now placed on the aspects of interventions as determinants of complexity. In this perspective, complexity is characterized as a set of properties determining the causal relationship among the intervention components, their implementation, the context, and the outcomes [3], [4]. Interventions are viewed as events in systems, where complexity lies as well in the contexts or settings into which interventions are introduced [5]. Expected outcomes, on the other hand, may occur in a nonlinear fashion, and develop over time as a result of the feedback loops of the systems, or they may emerge suddenly preceded by a period of little change [6].

In an attempt to draw systematic differences between the evaluation of “complex” and “simple” interventions, Rehfuess and Akl used the extant conceptualizations of complex interventions, including complex dynamic system approaches and the widely cited MRC dimensions of complexity [1], [7], and mapped these into the well-established Participants, Interventions, Comparisons, Outcomes, Timing and Setting (PICOTS) framework [7]. The left-hand column of Table 1 highlights characteristics of complex interventions in PICOTS that include, but are not limited to, the aspects of interventions themselves. For instance, complexity in the evaluation of many social and public health interventions, such as housing improvements for health and associated socioeconomic outcomes [8], may result from targeting healthy general or at-risk participants at a population level (P), and the “proactive” nature of the intervention (I), which requires active engagement by participants and changes in their behaviors and psychosocial outcomes, including attitudes, cognitions, and norms [9]. In addition, evaluations of these interventions are more likely to use broadly defined “business as usual” as the main comparator (C), target many health and social outcomes (O), require long time periods for the expected outcomes to emerge (T), and be implemented in multiple settings including communities and households (S), which may interact with the intervention effects. In comparison, many pharmacologic interventions generally target sick populations seeking care, are mainly delivered on an individual level (P) and are usually “reactive” (I), that is, they directly trigger biological mechanisms without a need to actively involve intervention recipients in the production of change. They usually use a well-defined comparator of alternative treatment, placebo, or no treatment (C), target few health outcomes (O), require shorter time periods for the outcomes to manifest (T), and are commonly implemented in health care settings (S).
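The contrast drawn in the preceding paragraph can be laid out as a small lookup table. The Python sketch below is an illustration only: the names `PICOTS_CONTRAST` and `describe` are hypothetical, and the entries paraphrase the paragraph rather than reproduce Table 1 or the coding scheme used in the study.

```python
# Illustrative only: entries paraphrase the paragraph above, not Table 1 itself.
# Each value is (dimension name, typical complex intervention, typical simple intervention).
PICOTS_CONTRAST = {
    "P": ("Participants",
          "healthy general or at-risk participants, population level",
          "sick populations seeking care, individual level"),
    "I": ("Interventions",
          "proactive: requires active engagement and behavior change",
          "reactive: directly triggers biological mechanisms"),
    "C": ("Comparisons",
          "broadly defined business as usual",
          "alternative treatment, placebo, or no treatment"),
    "O": ("Outcomes",
          "many health and social outcomes",
          "few health outcomes"),
    "T": ("Timing",
          "long time periods for outcomes to emerge",
          "shorter time periods for outcomes to manifest"),
    "S": ("Setting",
          "multiple settings, including communities and households",
          "health care settings"),
}


def describe(dimension: str) -> str:
    """Return a one-line summary of the contrast for one PICOTS letter."""
    name, complex_side, simple_side = PICOTS_CONTRAST[dimension]
    return f"{name}: complex = {complex_side}; simple = {simple_side}"


if __name__ == "__main__":
    for letter in PICOTS_CONTRAST:
        print(f"({letter}) {describe(letter)}")
```

The table is descriptive, not a classifier: it records the typical poles described in the text and does not decide how any individual review should be coded.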

The foregoing features challenge the evaluation of many social and public health interventions. Specifically, many of these interventions are not feasible to evaluate using randomized controlled trials (RCTs) and might need to draw on different study designs and evidence types to adequately inform practice decision-making [10]. Approaches to reviewing and synthesizing evidence that account for the complex relationships between these interventions and their outcomes are not straightforward and warrant further development. Recent academic discussions suggest the integration of a spectrum of methods and types of evidence, including quantitative, qualitative, and mixed-method approaches, when reviewing the evidence on the effectiveness of complex interventions [11], [12]. To enhance the internal and external validity of the findings, the review questions addressing complex interventions often need to go beyond the standard PICOTS components [13], [14], and also interrogate the implementation and the context of these interventions [15]. Likewise, these considerations need to be reflected in how the quality of evidence of these interventions is defined and rated to inform recommendations for practice [16].

The Grading of Recommendations Assessment, Development, and Evaluation (GRADE) approach was developed in the previous decade as a transparent, comprehensive, and structured framework for rating the quality of evidence (also termed confidence in evidence and certainty in effect estimates) and the strength of recommendations to guide evidence-based decision-making in health care [17]. More than 80 organizations worldwide have adopted the GRADE approach, including the Cochrane Collaboration, the World Health Organization (WHO), and the National Institute for Health and Care Excellence (NICE) [18].

Although the GRADE Working Group supports the application of the GRADE approach across all types of evidence in health care, there have been contentions over the suitability of this framework for complex interventions [19], [20], [21]. As a result, uptake of the GRADE approach has been hindered in the disciplines that investigate these interventions. For example, entities such as the United States Community Preventive Services Task Force [22] and the NICE Centre for Public Health Excellence [23] have explicitly rejected the GRADE approach, deeming it inappropriate for the types of evidence they study [7]. A recent publication looking at the experiences of applying the GRADE approach to public health interventions revealed that the organizations that currently use the approach value its transparency; however, major concerns were expressed regarding the interpretations of heterogeneity, degree of indirectness, choice of outcomes, and outcome measures in systematic reviews of public health interventions, as well as discrimination between different types of observational studies and terminology in guidance [7]. Furthermore, several respondents highlighted the lack of a systematic mechanism to integrate intervention implementation and contextual data when rating the quality of evidence in GRADE. Because the evaluation of complex interventions may integrate evidence of different types, several respondents found it important to consider “parallel evidence,” including evidence on intervention implementation, to further enhance the process of assessing the quality of evidence in complex interventions [7].

To date, no empirical investigation has examined the utilization of and output from the GRADE approach in reviews of complex interventions. Meanwhile, as prior empirical evidence suggests [7], the GRADE approach might not be fully adequate for complex interventions. As an extension to this work, this study aimed to investigate the current application and ratings of the GRADE approach in systematic reviews of complex interventions. To that end, it compared these reviews with systematic reviews focusing on simple pharmacologic treatments to elicit the patterns of GRADE application specific to reviews of complex social interventions.

Section snippets

Data sources and extraction

This study focused on systematic reviews published on the Cochrane Database of Systematic Reviews (2013–May 2014). To obtain an adequate and manageable number of recently published systematic reviews on both psychosocial (likely to be classified as complex) and pharmacologic (likely to be classified as simple) interventions within the same Cochrane Review Group, publications from the following three groups were considered: Cochrane Developmental, Psychosocial, and Learning Problems Group

Description of included studies

As of May 2014, a total of 279 systematic reviews were identified and screened for publication year in the three Cochrane Review Groups (see Fig. 1). Of these records, 44 reviews were conducted in 2013 and 2014. Four reviews in the CDPLPG were empty [26], [27], [28], [29]. A full-text examination of the final 40 records identified 24 reviews that were classified in the domain of “complex interventions,” whereas 16 reviews focused on “simple interventions.”
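The screening flow described above can be checked with a few lines of arithmetic. The following Python sketch uses only the counts reported in this paragraph; the variable names and the `share` helper are hypothetical, and the percentages are derived here rather than quoted from the article.

```python
# Counts as reported in the paragraph above.
identified = 279            # reviews screened for publication year
published_2013_2014 = 44    # reviews conducted in 2013 and 2014
empty_reviews = 4           # empty reviews in the CDPLPG
complex_reviews = 24
simple_reviews = 16

# Final sample after dropping the empty reviews.
included = published_2013_2014 - empty_reviews


def share(part: int, whole: int) -> float:
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100 * part / whole, 1)


if __name__ == "__main__":
    # The two classification groups should account for every included review.
    assert included == complex_reviews + simple_reviews
    print(f"Included reviews: {included}")
    print(f"Complex: {share(complex_reviews, included)}%")
    print(f"Simple: {share(simple_reviews, included)}%")
```

Under these counts, the complex and simple groups together make up the full sample of 40 reviews, with no review left unclassified.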

In general, the interventions assessed

Discussion

This study is the first empirical investigation of GRADE application in systematic reviews of complex interventions. The findings indicate that current uptake of the GRADE approach in complex intervention reviews is similar to that of simple intervention reviews. This can be explained by the Cochrane policy on GRADE adoption, which requires all review authors to incorporate GRADE [71]. In general, the findings in this study are consistent with the observations highlighted in the feedback from

Acknowledgments

All the coauthors thoroughly reviewed the manuscript and approved the final draft before submission. This study is supported by the Centre for Evidence-Based Intervention, University of Oxford, and does not involve any financial competing interests.

References (89)

  • A. Fretheim et al.

    Interrupted time-series analysis yielded an effect estimate concordant with the cluster-randomized controlled trial result

    J Clin Epidemiol

    (2013)
  • E.A. Rehfuess et al.

    Beyond direct impact: evidence synthesis towards a better understanding of effectiveness of environmental health interventions

    Int J Hyg Environ Health

    (2014)
  • A. Movsisyan et al.

    Users identified challenges in applying GRADE to complex interventions and suggested an extension to GRADE

    J Clin Epidemiol

    (2016)
  • P. Craig et al.

    Developing and evaluating complex interventions: new guidance

    (2008)
  • J. Datta et al.

    Challenges to evaluating complex interventions: a content analysis of published papers

    BMC Public Health

    (2013)
  • A. Shiell et al.

    Complex interventions or complex systems? Implications for health economic evaluation

    BMJ

    (2008)
  • P. Hawe et al.

    Theorising interventions as events in systems

    Am J Community Psychol

    (2009)
  • S. Galea et al.

    Causal thinking and complex system approaches in epidemiology

    Int J Epidemiol

    (2010)
  • E.A. Rehfuess et al.

    Current experience with applying the GRADE approach to public health interventions: an empirical study

    BMC Public Health

    (2013)
  • H. Thomson et al.

    Housing improvements for health and associated socio-economic outcomes

    Cochrane Database Syst Rev

    (2013)
  • M.W. Fraser

    Intervention research: developing social programs

    (2009)
  • H. Walach et al.

    Circular instead of hierarchical: methodological principles for the evaluation of complex interventions

    BMC Med Res Methodol

    (2006)
  • M. Petticrew

    Time to rethink the systematic review catechism? Moving from “what works” to “what happens”

    Syst Rev

    (2015)
  • M. Gray

    Evidence-based healthcare and public health: how to make recommendations about health services and public health

    (2009)
  • G.H. Guyatt et al.

    GRADE: an emerging consensus on rating quality of evidence and strength of recommendations

    BMJ

    (2008)
  • C. Barbui et al.

    Challenges in developing evidence-based recommendations using the GRADE approach: the case of mental, neurological and substance use disorders

    PloS Med

    (2010)
  • D.N. Durrheim et al.

    Modifying the GRADE framework could benefit public health

    J Epidemiol Community Health

    (2010)
  • E.A. Rehfuess et al.

    GRADE for the advancement of public health

    J Epidemiol Community Health

    (2011)
  • Anon

    The NICE public health guidance development process

    (2012)
  • G.F. Moore et al.

    Process evaluation of complex interventions: Medical Research Council guidance

    BMJ

    (2015)
  • M. Weinstein

    TAMS Analyzer for Macintosh OS X: The native Open source, Macintosh Qualitative Research Tool

    (2014)
  • C.G. Brennan-Jones et al.

    Auditory-verbal therapy for promoting spoken language development in children with permanent hearing impairments

    Cochrane Database Syst Rev

    (2014)
  • S. Gantasala et al.

    Gastrostomy feeding versus oral feeding alone for children with cerebral palsy

    Cochrane Database Syst Rev

    (2013)
  • B. Parker et al.

    Psychoanalytic/psychodynamic psychotherapy for children and adolescents who have been sexually abused

    Cochrane Database Syst Rev

    (2013)
  • A. Vernon-Roberts et al.

    Fundoplication versus postoperative medication for gastro-oesophageal reflux in children with neurological impairment undergoing gastrostomy

    Cochrane Database Syst Rev

    (2013)
  • J. Barlow et al.

    Group-based parent training programmes for improving parental psychosocial health

    Cochrane Database Syst Rev

    (2012)
  • J.I. Bisson et al.

    Psychological therapies for chronic post-traumatic stress disorder (PTSD) in adults

    Cochrane Database Syst Rev

    (2013)
  • R. Churchill et al.

    “Third wave” cognitive and behavioural therapies versus treatment as usual for depression

    Cochrane Database Syst Rev

    (2013)
  • G.M. Cooney et al.

    Exercise for depression

    Cochrane Database Syst Rev

    (2013)
  • G.L. Fellmeth et al.

    Educational and skills-based interventions for preventing relationship and dating violence in adolescents and young adults

    Cochrane Database Syst Rev

    (2013)
  • S. Fletcher-Watson et al.

    Interventions based on the Theory of Mind cognitive model for autism spectrum disorder (ASD)

    Cochrane Database Syst Rev

    (2014)
  • V. Hunot et al.

    “Third wave” cognitive and behavioural therapies versus other psychological therapies for depression

    Cochrane Database Syst Rev

    (2013)
  • A.C. James et al.

    Cognitive behavioural therapy for anxiety disorders in children and adolescents

    Cochrane Database Syst Rev

    (2013)
  • N. Livingstone et al.

    Restorative justice conferencing for reducing recidivism in young offenders (aged 7 to 21)

    Cochrane Database Syst Rev

    (2013)
  • Cited by (28)

    • Effectiveness of interventions to improve drinking water, sanitation, and handwashing with soap on risk of diarrhoeal disease in children in low-income and middle-income settings: a systematic review and meta-analysis

      2022, The Lancet
      Citation Excerpt :

      Of note, the GRADE approach was developed for rating the evidence of health-care interventions [29]. Public health interventions, such as WASH, are often complex and contain several interacting components, require multiple behaviours and target groups, and are often inherently difficult to randomise or mask, which makes them more likely to be rated as low or very low certainty evidence compared with interventions that can be better controlled, such as clinical interventions [104,105]. Results of the preceding WASH review [5] are largely consistent with the findings in this meta-analysis; however, this review found lower reductions in risk of diarrhoea with drinking water of higher quality supplied on premises.

    • Grading nutrition evidence: Where to go from here?

      2021, American Journal of Clinical Nutrition
    • Systematic review and evidence based recommendations on texture modified foods and thickened liquids for adults (above 17 years) with oropharyngeal dysphagia – An updated clinical guideline

      2018, Clinical Nutrition
      Citation Excerpt :

      Based on this and balancing between desirable and undesirable consequences our findings permitted weak recommendations against routinely use of modified liquids in adults with OD. It has been speculated whether the application of the GRADE system in reviewing complex interventions such as OD management strategies [27], which are characterized by active engagement by participants and changes in behaviour in multiple settings, often leads to downgrading of evidence due to performance bias, imprecision and indirectness resulting in weak recommendations [28]. However, the updated recommendations did not change the focus of the recommendations in the clinical guideline reported in Andersen et al. [3].

    • CHIMERAS showed better inter-rater reliability and inter-consensus reliability than GRADE in grading quality of evidence: A randomized controlled trial

      2018, European Journal of Integrative Medicine
      Citation Excerpt :

      While the GRADE approach has been adopted by more than 65 international organizations in recent years, controversy on its application has been raised at the same time. The following concerns about the validity and reliability of GRADE were mentioned in current literature: no validation of the rating scheme [3], no differentiation on the quality of qualitative conclusion (direction of effect) and quantitative conclusion (effect size and its precision) [4], lack of methodological support for quality assessment criteria [5], insufficient guidance on usage, excessive complexity for making reliable and consistent judgement [6,7], poor inter-rater reliability [8,9] and being inapplicable for grading evidence on complex interventions [10]. In order to address these challenges, efforts have been made to refine GRADE by different research teams.
