Methods to select results to include in meta-analyses deserve more consideration in systematic reviews
Introduction
Systematic reviews of randomized controlled trials (RCTs) of health care interventions have the potential to have a major impact on patient health, research agendas, and policy making. However, the validity of systematic review findings can be compromised by challenges in undertaking meta-analysis. One challenge is that multiple effect estimates in a trial report may be available for inclusion in a particular meta-analysis [1], [2]. For example, a trial report may present effect estimates for two depression scales, at weeks three, six, and nine, each analyzed as unadjusted and adjusted for covariates. Multiplicity of effect estimates may lead to “selective inclusion of results,” whereby the process for selecting the trial effect estimates for inclusion in a meta-analysis is based on the estimates themselves, which may, in turn, result in biased meta-analytic effects [3].
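The combinatorial nature of this multiplicity is easy to underestimate. A minimal sketch of the worked example above (the scale names are assumed here purely for illustration; the original does not name them) shows how quickly the candidate effect estimates multiply:

```python
from itertools import product

# Hypothetical trial from the example above: two depression scales,
# three time points, and two analysis types are all reported.
scales = ["HAM-D", "BDI"]  # assumed scale names, for illustration only
time_points = ["week 3", "week 6", "week 9"]
analyses = ["unadjusted", "covariate-adjusted"]

# Every combination is a distinct effect estimate a reviewer could
# select for the same meta-analysis cell.
candidates = list(product(scales, time_points, analyses))
print(len(candidates))  # 2 x 3 x 2 = 12 candidate effect estimates
```

If the selection among these 12 candidates is influenced by the estimates themselves (e.g., picking the largest), bias can enter the meta-analysis even though each individual estimate is validly reported.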
Several organizations that produce systematic reviews (e.g., [4], [5], [6]) have recommended methods that aim to reduce selective inclusion of results. The methods (specified a priori) aim to uniquely identify results that will be included in a meta-analysis and can be placed in two broad categories, which we label “eligibility criteria to select results” and “decision rules to select results.” Eligibility criteria to select results include specifying lists of measurement scales, intervention/control groups, time points, and analyses that systematic reviewers consider eligible to include in the review (ideally based on some clinical or methodological rationale). Providing specific criteria discourages the use of broad outcomes such as “pain,” and instead encourages specification of details such as the eligible pain measurement scales and time points of interest to the review [1], [2].
Predefining eligibility criteria to select results is an effective method to minimize the number of effect estimates available for inclusion in a particular meta-analysis. However, this method may not always identify a single eligible effect estimate per trial, and in such cases, the addition of decision rules is useful. Decision rules are strategies to either select one effect estimate, or combine effect estimates, when multiple estimates are available. An example of a decision rule to select one effect estimate is when commonly encountered measurement scales for a particular outcome domain (e.g., depression) are ranked based on their psychometric properties, and for trials that report the results of more than one scale, the results for the scale with the best measurement properties are selected. Such a strategy has previously been referred to as an “outcome data hierarchy” [2], [7], [8]. An example of a decision rule to combine effect estimates is when a trial includes more than one active treatment arm (e.g., placebo vs. high-dose drug vs. low-dose drug), and rather than selecting data from only one of the active arms (e.g., only one dosage group), data from all active treatment arms are combined (e.g., any dosage vs. placebo) [9], [10].
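Both kinds of decision rule are mechanical once specified. The sketch below illustrates them under stated assumptions: the function names and the example scale hierarchy are hypothetical, and the arm-combining step uses the standard formulae for pooling the means and standard deviations of two groups (as given in the Cochrane Handbook), not any method specific to this article.

```python
import math

def select_by_hierarchy(results, hierarchy):
    """Decision rule 1 (outcome data hierarchy): return the result for
    the highest-ranked scale the trial reports.
    `results` maps scale name -> effect estimate; `hierarchy` is ordered
    best-first."""
    for scale in hierarchy:
        if scale in results:
            return scale, results[scale]
    return None  # no eligible scale reported in this trial

def combine_arms(n1, mean1, sd1, n2, mean2, sd2):
    """Decision rule 2: merge two active arms into a single group,
    using the standard formulae for combining means and SDs."""
    n = n1 + n2
    mean = (n1 * mean1 + n2 * mean2) / n
    var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2
           + n1 * n2 / n * (mean1 - mean2) ** 2) / (n - 1)
    return n, mean, math.sqrt(var)

# Usage with hypothetical values: a high-dose and a low-dose arm are
# combined into one "any dose" arm for comparison against placebo.
n, m, sd = combine_arms(50, 12.0, 4.0, 50, 10.0, 5.0)
print(n, m)  # 100 11.0
```

The key point is that both rules are deterministic functions of prespecified inputs (the hierarchy, the pooling formula), so applying them cannot be steered by the effect estimates themselves.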
To our knowledge, only two previous studies have investigated multiplicity of results in trial reports or the methods systematic reviewers use to select results to include in meta-analyses [2], [11]. The first study, which examined interobserver variation in results extracted from trials for use in meta-analyses, found that decision rules to select final vs. change from baseline values were reported in 4 of 10 review protocols [11]. The second study [2], which examined the impact of multiplicity of trial results on meta-analysis results, found that multiplicity was common but that methods to select results to include in meta-analyses were rarely predefined. In 83 RCTs included in 19 Cochrane reviews published from 2006 to 2007, 35% of the RCTs had multiple measurement scales, 29% had multiple intervention/control groups (i.e., in multi-arm RCTs), and 36% had multiple time points that were available for inclusion in a particular meta-analysis. Eligibility criteria for measurement scales and intervention/control groups were defined in all review protocols, and eligibility criteria for time points were defined in eight (42%). In contrast, decision rules to select measurement scales or intervention/control groups were not reported in any of the review protocols, whereas a decision rule to select time points was reported in one review protocol (5%) [2].
To inform methods guidance regarding inclusion of results when there is multiplicity, several issues still require exploration. First, the review protocols examined in these previous studies were published before 2006, and it is unclear whether reporting of eligibility criteria and decision rules to select results has changed over time. Second, most systematic reviewers do not report working from a review protocol [12], [13], and the methods used to select results to include in such reviews have not been examined. Third, there has been no investigation of the frequency of other types of multiplicity that may arise in RCTs [e.g., reporting of results from intention-to-treat (ITT) and per-protocol analyses, or from unadjusted and covariate-adjusted analyses]. Fourth, no one has examined whether multiplicity of results and reporting of methods to select results to include in meta-analyses differ between clinical conditions. One might hypothesize that there is less multiplicity of results for clinical conditions that have “core outcome measurement sets” available [14], [15], [16]. Core outcome measurement sets are measurement scales recommended for use in RCTs and systematic reviews of a particular health condition and are designed to increase consistency in scale selection.
Our aim was to investigate multiplicity of results in trial reports and methods systematic reviewers use to select results to include in meta-analyses. The primary objectives were to investigate the frequency and types of: (1) multiplicity of results that arise in RCTs and (2) eligibility criteria and decision rules to select results, which are reported in review protocols and reviews. Secondary objectives were to examine how the extent of multiplicity of results was modified by the existence of a review protocol and the clinical condition of the review and how the reporting of eligibility criteria and decision rules to select results was modified by the clinical condition of the review. We also plan to investigate whether there is evidence of selective inclusion of results in the sample of reviews and what impact this may have on meta-analytic effect estimates [17]; the results of this research will form a subsequent article.
Methods
Our study protocol that describes the eligibility criteria, search strategies, selection of systematic reviews, data extraction, and planned analyses is published elsewhere [17]. An overview of the methods is provided here.
Results
A flow diagram of the identification, screening, and inclusion of systematic reviews is presented in Fig. 1. Searching yielded a total of 2,590 records. A full-text report was retrieved for 264 records. Of these, 145 were screened and excluded (the most common reasons for exclusion were that no meta-analyses were conducted or no continuous outcomes were analyzed in the review). The target sample size was reached after screening 189 randomly sorted full-text articles (leaving 75 full-text
Discussion
Our investigation of multiplicity of results demonstrates that systematic reviewers can expect to commonly encounter multiple eligible effect estimates in trials when they do not predefine methods to select results to include in meta-analyses. Multiple measurement scales and intervention/control groups (in multi-arm RCTs) were the most common types of multiplicity. At least one eligibility criterion and decision rule to select results were reported in more than 80% of review protocols and
Acknowledgments
This work was conducted as part of a PhD undertaken by M.J.P., which is funded by an Australian Postgraduate Award administered through Monash University, Australia. J.E.M. is supported by an NHMRC Australian Public Health Fellowship (1072366).
References (34)
- et al. Attention should be given to multiplicity issues in systematic reviews. J Clin Epidemiol (2008)
- et al. Many scenarios exist for selective inclusion and reporting of results in randomized trials and systematic reviews. J Clin Epidemiol (2013)
- et al. Osteoarthritis: rational approach to treating the individual. Best Pract Res Clin Rheumatol (2006)
- et al. Developing core outcome measurement sets for clinical trials: OMERACT Filter 2.0. J Clin Epidemiol (2014)
- et al. Design and conduct of clinical trials in patients with osteoarthritis: recommendations from a task force of the Osteoarthritis Research Society: results from a workshop. Osteoarthritis and Cartilage (1996)
- et al. Multiplicity of data in trial reports and the reliability of meta-analyses: empirical study. BMJ (2011)
- et al. Methodological standards for the conduct of new Cochrane Intervention Reviews. Version 2.3 (2013)
- Finding what works in health care: standards for systematic reviews (2011)
- Methods guide for effectiveness and comparative effectiveness reviews (2014)
- et al. Meta-analysis: chondroitin for osteoarthritis of the knee or hip. Ann Intern Med (2007)
- Meta-analysis of multitreatment studies. Med Decis Making
- Disagreements in meta-analyses using outcomes measured on continuous or rating scales: observer agreement study. BMJ
- Epidemiology and reporting characteristics of systematic reviews. PLoS Med
- An evaluation of epidemiological and reporting characteristics of complementary and alternative medicine (CAM) systematic reviews (SRs). PLoS One
- OMERACT: an international initiative to improve outcome measurement in rheumatology. Trials
- An empirical investigation of the potential impact of selective inclusion of results in systematic reviews of interventions: study protocol. Syst Rev
Conflict of interest: M.J.P. has roles in The Cochrane Collaboration including systematic review trainer for the Australasian Cochrane Centre; Methodological Editor for the Depression, Anxiety, and Neurosis Group; member of the Bias Methods Group, Statistical Methods Group, and Trainer's Network; and author of Cochrane systematic reviews. J.E.M. has roles in The Cochrane Collaboration including Co-convenor of the Statistical Methods Group; member of the Methods Executive, Methods Board, and the Bias Methods Group; Statistical Editor for the Consumers and Communication Review Group; Editor of Cochrane Methods; and author of Cochrane systematic reviews. M.C. has a role in The Cochrane Collaboration as author of Cochrane systematic reviews. S.E.G. has roles in The Cochrane Collaboration including Co-Director of the Australasian Cochrane Centre; past editor of the Cochrane Handbook for Systematic Reviews of Interventions; and author of Cochrane systematic reviews. A.F. has a role in The Cochrane Collaboration as member of the Statistical Methods Group. The views expressed in this article are those of the authors and not necessarily those of The Cochrane Collaboration or its registered entities, committees, or working groups.