Asking questions changes health-related behavior: an updated systematic review and meta-analysis

Objectives: The question-behavior effect (QBE) refers to whether asking people questions can result in changes in behavior. Such changes in behavior can lead to bias in trials. This study aims to update a systematic review of randomized controlled trials investigating the QBE, in light of several large preregistered studies being published.
Study Design and Setting: A systematic search for newly published trials covered 2012 to July 2018. Eligible trials randomly allocated participants to measurement vs. non-measurement control conditions or to different forms of measurement. Studies that reported health-related behavior as outcomes were included.
Results: Forty-three studies (33 studies from the original systematic review and 10 new studies) compared measurement vs. no measurement. An overall small effect was found using a random-effects model: standardized mean difference = 0.06 (95% CI: 0.02–0.09), n = 104,388. Statistical heterogeneity was substantial (I² = 54%). In an analysis restricted to studies with a low risk of bias, the QBE remained small but significant. There was positive evidence of publication bias.
Conclusion: This update shows a small but significant QBE in trials with health-related outcomes but with considerable unexplained heterogeneity. Future trials with lower risk of bias are needed, with preregistered protocols and greater attention to blinding.


What is new?
Key findings
The QBE appears to be a genuine phenomenon, albeit small and inconsistently found.
Evidence on the QBE has considerable unexplained heterogeneity and is at risk of publication bias.
What this adds to what was known?
Risk of bias is a concern in primary RCTs, but the QBE is still evident in RCTs with a low risk of bias.
What is the implication and what should change now?
Future RCTs need to be preregistered and require close attention to risk of bias.
The QBE is a potential source of bias in RCTs with behavioral outcomes.

Introduction
Existing systematic reviews have supported the idea that measurement can affect behavior [1–5]. Much of this evidence derives from studies where people who were asked to complete a questionnaire showed changes in behavior relative to a control group. This phenomenon is often called the 'question-behavior effect' (QBE). The findings of these systematic reviews are consistent: (a) there are overall small effects of asking questions on objective and subjective measures of behavior; (b) there is considerable heterogeneity in effects on behavior across primary studies; (c) many of the primary studies in the reviews have high risk of bias, with a lack of preregistration of protocols as a particular weakness; and (d) publication bias is present in the reviews, but not of sufficient extent to reduce best estimates of effects on behavior to zero. Theoretical explanations of how asking questions can produce changes in people include increasing awareness of their own behavior; providing information about consequences of behavior; or attentional effects through increasing the salience of components of health, behavior, or the link between the two. These explanations suggest that being asked questions may produce increases in health-promoting behaviors [4,5]. The QBE is a specific, well-recognized example of measurement reactivity (MR), which describes the phenomenon where any type of measurement (including objective and subjective measures) can affect the people being measured in terms of cognition, emotion, and behavior [6].
Assessing the strength of evidence for and quantifying the QBE is important because MR may introduce bias in otherwise well-conducted randomized controlled trials (RCTs). Bias may occur because the usual methods of conduct and analysis of trials implicitly assume that the taking of measurements does not affect subsequent outcome measurements or interact with the trial intervention, or that any effects of measurement-taking will be the same in each experimental group and hence are unlikely to bias treatment comparisons. Where any of these implicit assumptions are incorrect, the presence of the QBE is likely to result in incorrect estimates of the intervention effect. The MEasurement Reactions In Trials (MERIT) project has developed Medical Research Council (MRC)/National Institute for Health Research (NIHR) guidance on minimizing the risk of bias in trials of health care interventions as a result of MR [7]. We report here on an update of an existing systematic review [2] of the QBE on health-related behavior that was conducted to provide an evidence base for the new guidance [7]. An updated evidence base on the QBE is required because many of the trials to date exhibit a high risk of bias [8]. A lack of trials with preregistered protocols is a particular limitation of existing studies [2]. In recent years, some RCTs have been published with a lower risk of bias and preregistered protocols [9,10], including some large ones with null findings [9].
The systematic review by Rodrigues et al. [2] has been selected for updating because it focusses on health contexts, includes only RCTs as the most robust study design for testing the effectiveness of an intervention, and includes a thorough assessment of risk of bias of existing studies [8]. There was a need to update this systematic review, given that the original search for this review was conducted in December 2012.
The objectives of this updated systematic review were to provide an updated estimate of the effect size of the QBE for all RCTs including new studies, to explore several moderators of the QBE, and to assess whether the effect size is robust with regard to risk of bias of included studies and inclusion of studies with/without a preregistered protocol.

Materials and methods
The protocol for this updated systematic review was published in the PROSPERO database (CRD42018102511).

Inclusion criteria
Trials randomly allocating any type of participant to measurement or non-measurement control conditions, or trials in which groups were randomly allocated to different forms of measurement, were eligible. Eligible studies reported health-related behavior as outcomes, defined as behavior judged to reduce the risk or severity of diseases or promote health, including preparatory behaviors [11]. We included studies with any length of follow-up, although eligible outcomes needed to be assessed at a separate time point from the intervention manipulation measures; that is, studies comparing measures across different formats (e.g., interviews vs. online) were excluded. See Table 1 for PICOS criteria for inclusion and exclusion.

Search strategy
A systematic search of MEDLINE, Cochrane Register of Controlled Trials (CENTRAL), EMBASE, and PsycINFO was conducted from 2012 to July 2018, using the same terms as used in the original systematic review (Supplementary material 1) [2]. In addition, key authors in the research field were invited to provide any additional published literature that fulfilled the inclusion criteria, and a SCOPUS citation search of the original systematic review was conducted.

Study selection and data extraction
Screening titles and abstracts, and then full papers, for eligibility was completed independently by two reviewers (L.M. and A.R.). Full text was retrieved for 51 papers taking an inclusive approach (see Figure 1); the full paper was scrutinized where the title/abstract was identified by either L.M. or A.R. Each full paper was then assessed independently according to the inclusion criteria (κ = 0.78). For six papers, the reviewers could not decide on inclusion, so consensus was reached after discussion with a third reviewer (D.F.). Data extraction was completed independently by two reviewers (L.M. and A.R.) using an extraction form developed for the original systematic review, which covered study and participant characteristics, details of the intervention and control groups, and health behavior outcomes.

Assessment of risk of bias
Risk of bias was appraised independently by two reviewers (L.M. and A.R.) using version 1 of the Cochrane risk of bias tool [12]. Each study was appraised against seven criteria: adequate sequence generation, allocation concealment, blinding (participants, personnel and assessors), incomplete outcome data addressed, and free of selective outcome reporting. The papers were categorized as low, unclear, or high risk of bias and scored 0, 1, or 2, respectively, for each of the seven risk of bias criteria. There was substantial agreement between the two reviewers (κ = 0.78). Overall risk of bias scores were then calculated ranging from 0 to 14; higher scores indicated a greater risk of bias. For two papers, the reviewers could not decide on the risk of bias score for one and two criteria, respectively, so consensus was reached after discussion with a third reviewer (D.F.). Each paper was also assessed for whether there was a preregistered protocol for the study.
[Figure 1: flow diagram of study selection. Duplicate paper, n = 2; in original systematic review, n = 4; excluded, n = 35 (see Supplementary Table 2).]
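The risk of bias scoring and the inter-rater agreement statistic described in this section can be illustrated with a short Python sketch. It is not the reviewers' actual procedure: the ratings are invented, the function names are hypothetical, and because the text does not state whether κ was weighted, an unweighted Cohen's κ is assumed.

```python
from collections import Counter

# Points per rating, as described in the text: low = 0, unclear = 1, high = 2.
POINTS = {"low": 0, "unclear": 1, "high": 2}

def overall_risk_score(ratings):
    """Sum the points over the seven criteria (possible range 0 to 14)."""
    if len(ratings) != 7:
        raise ValueError("expected one rating per criterion (seven in total)")
    return sum(POINTS[r] for r in ratings)

def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters (assumed variant)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must judge the same non-empty set of items")
    n = len(rater_a)
    # Proportion of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1:
        return 1.0  # both raters used a single identical category throughout
    return (observed - expected) / (1 - expected)

# Hypothetical ratings for one study (not taken from the review).
study = ["low", "low", "unclear", "high", "unclear", "low", "low"]
print(overall_risk_score(study))  # 4

# Hypothetical judgments by two reviewers on four items.
reviewer_1 = ["low", "low", "high", "high"]
reviewer_2 = ["low", "low", "high", "low"]
print(cohens_kappa(reviewer_1, reviewer_2))  # 0.5
```

In this toy example the reviewers agree on three of four items (0.75) against a chance agreement of 0.5, which gives κ = 0.5.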

Analysis
Meta-analysis of the included studies was conducted using Comprehensive Meta-Analysis (CMA version 3.3.070) software. Dichotomous and continuous outcomes were combined to produce standardized mean differences (SMDs) for all included studies. Details of the analytic strategy for deriving the SMDs for studies in the original systematic review are published elsewhere [2]; where relevant, the same principles were applied in making decisions for selecting the most intensive measurement condition and/or merging or selecting reported outcomes. The SMDs and key moderator variables for the newly identified studies were added to the original CMA data file to facilitate meta-analysis of the new data set, using a random-effects model.
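To make the pooling step concrete, the sketch below shows one common way to place a dichotomous outcome on the SMD scale and to pool SMDs under a random-effects model. Several parts are assumptions rather than details given here: the log odds ratio conversion (d = ln(OR) × √3/π), the DerSimonian-Laird estimator of between-study variance, the function names, and all input values, which are hypothetical. The actual analysis was run in CMA.

```python
import math

def log_odds_ratio_to_smd(log_or, var_log_or):
    """Convert a log odds ratio and its variance to an SMD and its variance."""
    factor = math.sqrt(3) / math.pi
    return log_or * factor, var_log_or * factor ** 2

def random_effects_pool(effects, variances):
    """DerSimonian-Laird random-effects pooled estimate with a 95% CI."""
    k = len(effects)
    if k < 2 or k != len(variances):
        raise ValueError("need at least two studies with matching variances")
    w = [1 / v for v in variances]                        # fixed-effect weights
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))  # Cochran's Q
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)                    # between-study variance
    w_star = [1 / (v + tau2) for v in variances]          # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1 / sum(w_star))
    return {"smd": pooled,
            "ci": (pooled - 1.96 * se, pooled + 1.96 * se),
            "q": q,
            "tau2": tau2}

# Hypothetical inputs (not data from the review).
d1, v1 = log_odds_ratio_to_smd(math.log(1.2), 0.04)   # one dichotomous outcome
effects = [d1, 0.10, 0.02]
variances = [v1, 0.01, 0.0025]
print(random_effects_pool(effects, variances))
```

When the between-study variance estimate is zero, the random-effects weights reduce to the fixed-effect weights, so the two models give the same pooled estimate.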
Heterogeneity across studies was assessed using Cochran's Q statistic and the I² statistic. Publication bias was examined by a funnel plot (inverse of the standard errors of effect estimates). This was assessed visually to see whether there was evidence of asymmetry. Egger's regression test [12] was used to formally test for the presence of publication bias. Subgroup analyses were performed to assess the impact of potential prespecified moderators of the QBE: features of participants (student or non-student), content of measurement (cognition, behavior, or both), measurement of attitudes (yes/no), format of measurement (questionnaire or interview), type of health-related behavior (e.g., physical activity, screening), and outcomes (self-report or objective). SMDs for each subgroup were calculated, alongside Cochran's Q statistic and the I² statistic, to assess heterogeneity.
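The heterogeneity and publication bias statistics named above can be sketched as follows. This is an illustration under assumptions: I² is computed with the standard formula (Q − df)/Q, Egger's test is implemented as an ordinary least squares regression of the standardized effect on precision, the function names are hypothetical, and the effect sizes and standard errors are invented. Only the intercept and its t statistic are returned; obtaining a P value would additionally require the t distribution with k − 2 degrees of freedom.

```python
import math

def i_squared(q, df):
    """I-squared as a percentage: share of variability beyond chance."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100

def egger_intercept(effects, standard_errors):
    """Egger's regression: regress effect/SE on 1/SE by least squares.

    Returns the intercept and its t statistic (df = k - 2). An intercept
    far from zero suggests funnel plot asymmetry.
    """
    k = len(effects)
    if k < 3 or k != len(standard_errors):
        raise ValueError("need at least three studies with matching SEs")
    x = [1 / se for se in standard_errors]                    # precision
    y = [d / se for d, se in zip(effects, standard_errors)]   # standardized effect
    mean_x, mean_y = sum(x) / k, sum(y) / k
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residual_ss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    sigma2 = residual_ss / (k - 2)
    se_intercept = math.sqrt(sigma2 * (1 / k + mean_x ** 2 / sxx))
    return intercept, intercept / se_intercept

# Q and df reported for the original review's meta-analysis.
print(round(i_squared(57.39, 32)))  # 44

# Hypothetical SMDs and standard errors (not data from the review).
smds = [0.30, 0.15, 0.08, 0.05]
ses = [0.20, 0.10, 0.05, 0.03]
print(egger_intercept(smds, ses))
```

With Q = 57.39 and df = 32, the formula gives about 44%, which agrees with the I² reported for the original systematic review.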
To test the robustness of the systematic review findings, sensitivity analyses were conducted to assess whether there were differences in QBE on the basis of risk of bias, presence of a preregistered protocol for the study, or exclusion of an outlying very large study (n = 39,538) [9]. A dichotomous variable of high or low risk of bias was generated based on the risk of bias score: below the median (3.5) indicated a low risk and above the median indicated a high risk. SMDs were generated, alongside Cochran's Q statistic and the I² statistic, to assess heterogeneity. This systematic review update is reported in accordance with the PRISMA guidance [13].
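The median split used for the risk of bias sensitivity analysis can be written out in a few lines. This is a sketch with invented scores and a hypothetical function name; the handling of a score exactly equal to the median is an assumption, since the text only defines scores below and above it.

```python
from statistics import median

def classify_by_median(scores):
    """Label each score 'low' (below the median) or 'high' (above it)."""
    cut = median(scores)
    labels = []
    for s in scores:
        if s < cut:
            labels.append("low")
        elif s > cut:
            labels.append("high")
        else:
            # Assumption: the text does not say how ties with the median
            # are treated, so they are flagged rather than forced into a group.
            labels.append("at median")
    return cut, labels

# Hypothetical overall risk of bias scores (range 0 to 14).
scores = [0, 2, 3, 4, 6, 9]
cut, labels = classify_by_median(scores)
print(cut)     # 3.5
print(labels)  # ['low', 'low', 'low', 'high', 'high', 'high']
```

A median of 3.5, as reported in the text, cannot coincide with an integer score, so in that case every study falls cleanly into one of the two groups.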

Results
Ten papers reporting 10 studies (see Table 2) met the inclusion criteria, in addition to the 41 studies (see Supplementary Table 3) that were included in the original review [2]. Data from each of these 10 studies were suitable to add to the meta-analysis (of 33 studies) presented in the original systematic review. A flow diagram of the study selection process is available in Figure 1. The study characteristics and findings of the studies not included in the meta-analysis in the original systematic review have previously been published [2].

Meta-analysis
For 43 studies (33 studies from the original systematic review and 10 new studies) comparing measurement vs. no-measurement conditions, there was an overall small but significant QBE using a random-effects model: SMD = 0.06 (95% CI: 0.02–0.09); n = 104,096, see Figure 2. This is slightly smaller than the summary effect size published in the original systematic review (SMD = 0.09, 95% CI: 0.04–0.13, n = 37,452). Statistical heterogeneity in the updated meta-analysis is substantial, with an I² of 54% and a Q value of 93.95; df = 42, P < 0.001. This is an increase in heterogeneity compared with the original systematic review (I²: 44%, Q: 57.39, df: 32, P: 0.004).

Sensitivity analyses
Like the original systematic review, the risk of bias among newly identified studies was considerable: scores ranged from 0 to 12; the median score was 4.5 (compared with a range of 0–9 and median 3.0 for the studies in the original systematic review meta-analysis). A breakdown of risk of bias scores by category for each new study is available in Supplementary Table 4. When analyses were restricted to studies with a low risk of bias (score below 3.5), the QBE remained small but significant (SMD = 0.07, 95% CI: 0.04–0.11, k = 22, n = 90,558) with substantial heterogeneity (I² = 63%), see Table 3.
Half of the newly identified studies had a preregistered protocol [9,10,15,17,19], whereas only one of the 41 studies in the original systematic review had a preregistered protocol [22]. A sensitivity analysis of the six studies with a preregistered protocol suggests no evidence of the QBE: SMD = −0.02 (95% CI: −0.07 to 0.04, n = 59,053), with considerable heterogeneity (I² = 72%). In the sensitivity analysis of the 37 studies without a preregistered/published protocol, a small positive effect of measurement is demonstrated: SMD = 0.09 (95% CI: 0.05–0.13), I² = 37%. The sensitivity analysis excluding the large study by O'Carroll et al. [9] did not alter the findings.

Subgroup analyses
Table 4 shows subgroup analyses investigating potential moderators of the QBE. Overall, results are consistent with the original systematic review findings [2]: a larger QBE effect size is reported in students compared with non-students, and the QBEs for cognition-only measurement conditions (compared with behavior and cognition/behavior conditions) and questionnaire-based measurements (compared with interviews) were significantly different from zero.
In terms of type of health-related behavior, the QBEs for flossing and physical activity remain unchanged, as there are no new studies on these outcomes. With the publication of new studies, it was possible to assess vaccination uptake as a separate outcome, showing evidence of a small QBE (SMD = 0.07, 95% CI: 0.02–0.13). With the publication of new studies, there is now some suggestion that attitudes and type of outcome could be moderators of the QBE. Studies measuring attitudes showed a QBE (SMD = 0.07, 95% CI: 0.02–0.13), and behavioral outcomes measured objectively were affected by questions (SMD = 0.06, 95% CI: 0.02–0.10).

Publication bias
Asymmetry in the funnel plot (see Figure 3) and Egger's regression test (P = 0.02) show that there is a significant risk of publication bias; this was also identified in the original systematic review [2].

Discussion

Key findings
The results of this update show a small but significant QBE in RCTs with health-related outcomes. In an analysis restricted to studies with a low risk of bias, the QBE remained small but significant. An issue raised in the original systematic review is the possibility that risk of bias of primary studies could produce overestimates of the observed QBE [2,8]. The systematic review update showed that the methodological quality of the included studies was variable, and the risk of bias in the newly identified studies was comparable, both in terms of variability and overall scores. Importantly, the present review showed that the QBE remains intact when restricted to studies with a lower risk of bias. However, the substantial heterogeneity in the sensitivity analysis of studies with a low risk of bias indicates that there is still a lot of unexplained variance, likely due to large variation in studies with respect to content of measurement, types of health-related outcomes, length of follow-up, and characteristics of participants. Findings on potential moderators of the QBE are broadly consistent with existing evidence [4,5].
An important quality criterion raised in the discussion of the original systematic review was whether each included trial had an associated preregistered protocol [2,8]. Only one study in the original systematic review had a protocol preregistered [22], but five of the 10 studies identified in the review update had protocols preregistered [9,10,15,17,19]. The present review showed that the QBE remains intact when new studies with preregistered protocols are included, but a sensitivity analysis restricted to the six studies with a preregistered protocol suggests no evidence of the QBE. This sensitivity analysis showed substantial unexplained heterogeneity. Within this small group of studies, there is large variation in content of measurement and types of health-related outcomes in particular.
Preregistration of trials is a safeguard against publication bias, so it is helpful to consider the results of this sensitivity analysis in light of the funnel plot which detected publication bias. Together, these results suggest that publication bias remains a risk to the evidence base on the QBE. Indeed, previous authors have highlighted publication bias as a particular issue for the QBE literature [2,5,8]. Overall, it is possible that a QBE is undetectable within such a small number of preregistered studies, but nevertheless, this finding suggests there is a risk that the observed QBE is an artifact of publication bias.
Effect sizes of the QBE reported in existing systematic reviews tend to be slightly larger, but these reviews have included non-randomized study designs [4,5] and unpublished data [3], so they are not directly comparable. The present authors suggest that the more modest quantification of the QBE in the present review better takes account of the risk of bias and other limitations of existing studies.

Strengths and weaknesses
Systematic reviews specifically follow systematic processes for identifying, selecting, and evaluating relevant studies with a view to minimizing the risk of bias. Particular strengths of this review are the thorough appraisal of risk of bias of included studies, identification and selection of studies for inclusion in duplicate, and exploration of potential sources of heterogeneity. However, the search was limited to the English language and published literature and was not supplemented by handsearching of included studies or topic-related reviews, so it is possible that some studies could have been missed.

Future research and implications
The present review has highlighted the need for further well-designed RCTs with preregistered protocols to facilitate further scrutiny of the impact of publication bias on the QBE literature. All future studies on the QBE need to pay attention to risk of bias, with particular attention to all aspects of blinding (participant, personnel, and assessor). Use of online or automated methods for outcome assessment can help overcome some of these issues. Furthermore, there is a greater need for theorizing about when the QBE is expected and for which groups.
Nevertheless, this systematic review offers empirical support for the idea that measurement can affect the people being measured. There is a need for further primary studies investigating the issue of MR more broadly than the QBE. For example, there is a particular absence of evidence around the potential reactivity of dietary assessments. We also need a greater understanding of when and how the QBE (and MR more broadly) leads to bias in trials. This is an important consideration for trial design. Approaches to minimizing the risk are addressed in new MRC/NIHR guidance on reducing bias from measurement reactions in trials of health care interventions [7].

Conclusions
Overall, this systematic review update provides evidence of a small but significant QBE in RCTs on health-related outcomes. A greater proportion of the more recent studies have a preregistered protocol and several are large studies. The QBE remains intact when analyses are restricted to studies with a low risk of bias. Although there is a risk that the QBE is an artifact of publication bias, the present review lends support to the conclusion that the QBE is a genuine phenomenon albeit small and inconsistently found.