A Methodology for Quantifying Effects and Psychological Functioning of Behavior-Change Techniques

We present a methodology to quantify the effects of behavior change techniques (BCTs) that allows forecasting campaign effects on behavior and psychological constructs. The approach involves the gathering of longitudinal data during actual campaigns in which different combinations and sequences of BCTs are applied to different groups. Approximate metric data are gathered by asking for simple and specific evaluations. The data are analyzed using regression models that consider the value range of the dependent variable as bounded (bounded linear regression). Based on these models, forecasts of the intervention effects are calculated, considering the uncertainty of the parameter estimates. The methodology is applied to investigate the effects of prompts (external memory aids), public self-commitments, and implementation intentions on affective and instrumental attitudes, injunctive and descriptive norms, forgetting, perceived behavior control, and behavior in a health-promotion campaign in Bolivia. Prompts and public self-commitments reached more than half of the target population but only showed relevant effects when combined or repeated. The effects of both BCTs on behavior were mainly mediated by forgetting. Implementation intentions were not well received by the promoters and the population. From the few cases that implemented this BCT, no clear psychological effects could be derived.


Introduction
Many of the most urgent problems humanity faces could be mitigated at least partly by changing individual behavior. For example, the health conditions of hundreds of millions of people in developing countries could be improved by making people disinfect their drinking water and wash their hands with soap (Prüss-Üstün, Bos, Gore, & Bartram, 2008). Psychology could play a crucial role in solving problems related to individual behaviors by providing behavior change theories and evidence of the impact of specific behavior change techniques (BCTs) on psychological constructs and behavior. Based on such information, large-scale behavior change campaigns could be planned, applying the most efficient combination of BCTs for the given conditions. Unfortunately, most psychological studies fail to provide the information relevant for planning behavior change campaigns, namely, how much certain BCTs change different psychological constructs and how much the behavior changes due to the changes in the psychological constructs.
Many studies do not investigate real-world intervention campaigns, but are more or less artificially set up as laboratory or field experiments (e.g., Sheeran et al., 2005; Webb, Ononaiye, Sheeran, Reidy, & Lavda, 2010). While such studies produce high-quality data on isolated effects of BCTs, the situational context of the study is often unfamiliar to the participants, the induced behaviors are of low relevance, and the social effects normally induced by large-scale campaigns are absent. A second type of study investigates real-world behaviors and settings but only compares a group in which a single intervention is undertaken to a control group without any degree of intervention (e.g., Cox, Cox, & Cox, 2005; Hill, Abraham, & Wright, 2007). Studies that only compare intervention and control groups for statistically significant behavioral differences are of limited value, as it is not very surprising that doing something has a greater effect than doing nothing. What is of genuine interest is the size of intervention effects when comparing different single BCTs, as well as their combinations and repetitions. A third type of study seems to overcome the issue and uses complex intervention campaigns, comprising many different BCTs. Unfortunately, often only the overall effect of the entire campaign is investigated (e.g., Cairncross, Shordt, Zacharia, & Govindan, 2005), without clarification of how each technique works.
The type of data that many intervention studies collect constitutes a fourth shortcoming. Often, only behavioral outcomes are measured, with no information provided regarding psychological constructs (e.g., Cairncross et al., 2005; Mosler, Kraemer, & Johnston, 2013). To understand how BCTs affect behavior (and which psychological factors determine behavior), data on psychological constructs are needed. Finally, many studies limit their analysis to linear covariance structures of cross-sectional data (e.g., Hill et al., 2007; Kraemer & Mosler, 2010). However, to investigate intervention effects and how determinants of behavior are changed, longitudinal data that represent changes over time in absolute terms are required.
Success (or lack thereof) in delivering the BCTs to the study participants is another aspect that is usually ignored during studies on intervention effects. In many studies, the researchers or their assistants directly intervened in the investigated cases. Thus, a 100% successful delivery is guaranteed, but no information about the success of delivering the BCT in a large-scale campaign is obtained. Investigations of actual large-scale campaigns mostly assume that the BCTs reached every person in the intervention group. However, this assumption is usually wrong and the delivery is only partially successful, for many reasons. Most interesting are the factors that lead to different delivery success rates for different BCTs, such as their attractiveness to the target persons (affecting refusal rate and the motivation of the promoters) or the effort and time required to apply the BCT correctly.
To develop theories and models of behavior change that support planning, guiding, and evaluating behavior change campaigns such as health promotion interventions, the effects of combined techniques on behavior and mediating psychological constructs under real-world conditions need to be quantified. Or, as stated by Michie, Rothman, and Sheeran (2007), "We need to move beyond assuming the theory indicates how to change behavior to studying behavior change techniques in their own right" (p. 252). The present study takes one step in this direction by presenting an exemplary procedure regarding how the effects of BCTs can be quantified and applying this method to the investigation of prompts, public self-commitments, and implementation intentions. Furthermore, the study investigates the success rate of delivering these different BCTs. While the main goal is to present a methodology that allows quantifying the effects of BCTs, a number of substantial results are derived regarding the effectiveness of the mentioned BCTs to change behavior and the psychological mechanisms that lead to these changes. Before explaining the methodology, we present the BCTs used in this study.

The Investigated Behavior Change Techniques
Prompts (reminders or external memory aids; Intons-Peterson & Fournier, 1986) are objects indicating that a certain behavior should be performed. Implementation intentions (Gollwitzer, 1999) are simple plans that associate a certain behavior with a specific situation. Public self-commitment (Kiesler, 1971) is a more or less formal statement of the intent to perform a behavior that is made visible to other people. All three techniques are similar in their focus on the implementation of an intended behavior. These techniques do not change infrastructure to enable behaviors or persuade persons to increase their intention to perform a behavior. Rather, the techniques facilitate performance at the right moment of an already intended behavior in its correct form. All three techniques are widely used, and their effects are often investigated (e.g., Armitage, 2007; Guynn, McDaniel, & Einstein, 1998). However, research almost exclusively focuses on their effects on behavior. Many ideas on psychological modes of action are discussed, but empirical findings to back up these ideas are scarce.
The mode of action of prompts and implementation intentions seems to be very similar (Guynn et al., 1998, for reminders; Gollwitzer & Brandstätter, 1997, for implementation intentions). (1) Planning processes are activated by setting up a prompt or implementation intention, which should facilitate implementation of an intended behavior. (2) The association between a situational cue and the intended behavior reminds a person to perform the behavior at the right moment. A prompt is an object explicitly set up to be associated with the behavior; implementation intentions use naturally occurring cues to link to the target behavior. (3) Memory aids (Shapiro & Krishnan, 1999) and implementation intentions (Chasteen, Park, & Schwarz, 2001) remind a person how to perform the behavior. A dynamic model presented by Tobias (2009) on the effects of memory aids found that the more committed a person is to performing a behavior, the more effective memory aids are in reminding that person to perform the behavior. This, in turn, leads to stronger habits, which, when developed, prevent forgetting the behavior. It can therefore be expected that prompts will be particularly effective if combined with interventions that increase commitment. Examples of such BCTs include implementation intentions and public self-commitments.
While the effectiveness of self-commitment on behavior has been well investigated (e.g., Burn & Oskamp, 1986), only speculation exists regarding its mode of action. In general, authors refer to tension states (e.g., dissonance, Festinger, 1957) that emerge if the behavior and/or attitudes are not in line with the self-commitment, causing them to be adjusted accordingly. Some authors (e.g., Cialdini, 2001) assume that self-commitment and implementation intention work similarly, with the latter specifying the situational context and the former specifying the behavior. In the case of public self-commitment, further normative effects are expected, due to increased social pressure to conform to the publicly displayed self-commitment and increased visibility of the descriptive norm (e.g., Nyer & Dellande, 2010).
To summarize, the most probable mode of action of the three BCTs seems to be the facilitation of remembering. Furthermore, effects on norms and attitudes have to be considered. In the present study, norms are differentiated into injunctive and descriptive norms (Cialdini, Reno, & Kallgren, 1990). While injunctive norms describe perceptions of what ought to be done, descriptive norms express people's perceptions of what is done in the social environment. In a similar fashion, we differentiate attitudes into affective and instrumental attitudes (e.g., Trafimow & Sheeran, 1998). Affective attitude refers to how pleasant it is to perform a certain behavior, while instrumental attitude refers to the advantages and disadvantages of performing the behavior. Behavioral control also needs to be taken into account (Ajzen, 1991), as this can be an important constraint to behavior performance (for example, in cases wherein people do not have enough time or resources to perform the behavior).

The Present Study
This study presents a methodology that quantifies BCT effects in a way that allows forecasting the psychological and behavioral effects of intervention campaigns. The approach is applied to investigate the effects of prompts, public self-commitments, and implementation intentions, using longitudinal data from a large-scale campaign promoting solar water disinfection (SODIS) in rural Bolivia. The effects on behavior and the psychological constructs of each BCT alone and of some parallel and sequential combinations, as well as success in delivering the BCTs, are quantified.
From an applied research perspective, the principal question is, "How much behavior change can be achieved with certain BCTs or their combinations?" Thus, the study investigates two applied research questions (RQ):

Participants, Procedure, and Interventions
Data were gathered in the northern highlands of Chuquisaca (Bolivia) during a campaign aimed at reducing the high infant mortality rate caused by diarrhea by promoting SODIS and hand washing. SODIS is undertaken by filling transparent plastic bottles with water and exposing them to the sun for 6 hr (or 2 consecutive days if cloudiness exceeds 50%). Sunlight inactivates pathogenic microorganisms due to the radiation in the UV-A spectrum. SODIS significantly reduces levels of bacterial contamination in the laboratory (e.g., Berney, Weilenmann, Simonetti, & Egli, 2006) and under field conditions (e.g., Sommer et al., 1997). A brief overview of microbiological, medical, and psychological research on SODIS is given by McGuigan et al. (2012). Although the application of SODIS is simple, the adoption rate has been rather slow (Tamas & Mosler, 2011). Therefore, a campaign was initiated by the local non-governmental organization (NGO) Fundación SODIS and implemented in collaboration with the Ministry of Health of Chuquisaca and the Department of Health Service (SEDES). These organizations had the required permits, and the study was carried out in accordance with universal ethical principles. The project started in June 2007, with a baseline evaluation followed by three longitudinal panels in August 2007, November 2007, and March 2008. Intervention waves were placed between these panels, and radio spots promoting SODIS were on the air for the entire campaign. Because the population was made aware of the SODIS promotion between the baseline and the first panel, changes during this period might be overstated. To avoid bias in the estimates of the intervention effects due to social desirability, the changes from the baseline to the first panel were not used in the analyses presented. The BCTs were distributed during information events before the campaign by local health volunteers who were trained in applying these techniques. The BCTs applied are compiled in Table 1.
Due to the high illiteracy rate, surveys were conducted via face-to-face interviews; written consent could not be obtained from survey participants. However, the persons contacted by the interviewers had been clearly informed about the study and that participation was completely voluntary. Eight students from Sucre were recruited and trained to conduct the interviews in an interviewer workshop. Nine villages were selected, mainly based on their accessibility. Due to small community sizes and low density of households within the villages, random sampling was not feasible, and the interviewers were instructed to interview every possible household. Figure 1 shows the flowchart of study participants and the rather complex design of the study. Not all households that were interviewed were reached by the health volunteers who distributed the BCTs. Households without interventions serve as the control group, ensuring maximal similarity between the control and intervention groups.

Measures
Based on previous research on SODIS promotion (e.g., Heri & Mosler, 2008), a standardized questionnaire was developed, translated by local experts, discussed for identical understanding of items with the interviewers, and pre-tested. Table 2 compiles the items for the constructs analyzed here. For the behavior measure, the interviewees were asked to estimate the quantity of water (in cups) consumed by the household on an average day. Then they estimated the number of cups that are boiled, the number treated with SODIS, and the number consumed raw. Based on this information, the percentage of each water type was calculated and used in the analyses. Previous studies indicate that this measure is a good indicator for observed SODIS behavior and, in particular, that it shows very similar changes due to interventions as indicators based on observation (Mosler et al., 2013).
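The behavior measure described above can be illustrated with a minimal sketch. The function name and the example household are hypothetical, and the study does not state how it handled households whose three estimates did not add up to the reported total; the sketch simply assumes normalization by the sum of the three estimates.

```python
def water_shares(boiled_cups, sodis_cups, raw_cups):
    """Return the share (0..1) of boiled, SODIS-treated, and raw water.

    Assumption: shares are normalized by the sum of the three estimates,
    not by the separately reported total quantity.
    """
    total = boiled_cups + sodis_cups + raw_cups
    if total <= 0:
        raise ValueError("household reported no water consumption")
    return {
        "boiled": boiled_cups / total,
        "sodis": sodis_cups / total,
        "raw": raw_cups / total,
    }


# Hypothetical household: 5 cups boiled, 10 treated with SODIS, 5 consumed raw.
shares = water_shares(5, 10, 5)
```

Multiplying each share by 100 gives the percentages used in the analyses.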
Each item is interpreted as an approximate metric measure that quantifies the evaluation of one specific aspect of the investigated behaviors. These single evaluations are then aggregated into more abstract constructs by computing scores using specific formulas (see Table 2). Following the taxonomy of Law, Wong, and Mobley (1998), the constructs used here are aggregate constructs and not latent constructs. Latent constructs could not be used because for modeling individual processes, data must be gathered that can be interpreted for each individual case (i.e., in absolute terms). In contrast, latent constructs can only be interpreted in comparison to other cases (Borsboom, Mellenbergh, & van Heerden, 2003). A further discussion of this approach is presented in the Supporting Information (SI) on page S-2.
The use of aggregate instead of latent constructs allows the estimation of absolute effects. However, a consequence of this approach is that we cannot assume a priori that the items of a construct are correlated. For example, a person who thinks SODIS-treated water is good for health does not necessarily also think that treating water with SODIS is cheap. Therefore, estimating the reliability of the measures based on internal consistency (e.g., Cronbach's alpha) is not possible. Because we do not know how reliable our data are, we explicitly consider the uncertainty in the forecasts of the intervention effects. The uncertainties can be estimated due to the longitudinal design, but they comprise not only the effects of unreliable measurements but also other influences, such as instability of the constructs over time, variations in intervention effects among subjects, or shortcomings of the model. To conclude, we cannot quantify the reliability of our measures, but the fit indicators of the models quantify the effects of all random influences that might impair a forecast and, thus, the interpretation of the data.

Table 1
Prompt | One side prompted to do SODIS ("Put the bottles into the sun"), one side presented the steps of doing SODIS, one side prompted hand washing, and one side had a current calendar. The prompts were printed in color and could be placed on furniture or hung from the ceiling. The health volunteers gave the prompt with an instruction to place it near where water was usually handled. | Prompt well visible when visited by interviewer
Public self-commitment | A plasticized A4-sized poster stating in Spanish "We are committed to drink water treated by the sun," a SODIS logo, and a picture of a promoter shaking hands with a Bolivian woman holding a SODIS bottle in her other hand. The health volunteers asked how many SODIS bottles the household needed to treat all drinking water. The subjects then committed themselves by stating in Spanish: "I will prepare ___ bottles of SODIS water every day." The "contract" was sealed with a handshake. The public self-commitment poster was set up above the outside door of the house. | Poster well visible when visited by interviewer
Implementation intention | A paper sheet, A4 size, containing the sentence in Spanish "Every day after _____ (e.g., getting up, breakfast) I will prepare the SODIS bottles and put them _____ (e.g., on the roof) where they are lying in the sun the whole day," a SODIS logo, and two pictures, one showing bottle filling; the other, bottles in the sun. Promoters discussed the best time and place for doing SODIS and filled out the sentence on the paper accordingly. The subjects were asked to form the implementation intention by pronouncing the completed phrase. | The household was able to produce the paper sheet; asking
Note. SODIS = solar water disinfection.

Table 2 compiles the formulas for computing the scores of the constructs. These formulas are linear combinations of items in which evaluations in favor of SODIS have a positive value and evaluations in favor of raw water have a negative value. Furthermore, the scores are scaled to the range of [−1, +1] and [−1, 0] respectively. Data were gathered regarding behavior, whether the individual remembered to practice the SODIS technique, affective and instrumental attitudes, and injunctive or descriptive norms. Evaluations of the target behavior (consuming SODIS water) and the competing behavior (consuming raw water) were performed. For attitudes and the injunctive norm, the two evaluations were considered as separate constructs. In the case of behavior and the descriptive norm, the two evaluations were aggregated to one construct each, as they depend on each other: If Behavior A is performed more often, Behavior B has to be performed less often; the extent to which individuals consume raw water qualifies the number of times they consume SODIS-treated water. For remembering and behavior control, an evaluation regarding raw water consumption does not make sense. In the case of behavior control, an alternative for disinfecting water had to be considered: boiling the water. With the exception of the availability of fuel for boiling the water, no evaluation of consuming boiled water showed any effect on the consumption of raw or SODIS water.
The evaluation of the behavior of boiling water before consumption is included in the construct of behavior control for SODIS, as the behavior directly reduces the need for SODIS. The item on the difficulty of performing SODIS correctly was entered as an instrumental attitude and not as a behavior control, as the question is not about being able to perform SODIS but about confidence that the positive effects of SODIS can actually be achieved.

Figure 1 note. Interventions were applied in three subsequent waves (1, 2, and 3). Pub. com. = public self-commitment; pr. = prompt; imp. int. = implementation intention; P = panel. Unusable cases are cases excluded from the analysis due to missing values or combinations of behavior-change techniques too rare to be investigated on their own.

Analyses
Because the methods of this investigation are not widely used in psychology, they are explained in detail in the SI on pages S-3 to S-6. Most importantly, the models consider that the dependent variables are bounded (in [−1, +1] and [−1, 0] respectively), and therefore, not all theoretically possible changes can actually be observed. For example, if a person reports a behavior of 0.75 in a previous panel, the maximal observable intervention effect is 0.25, even if the intervention can have an effect of up to 1.0 in other individuals. As this model is linear between these bounds, it is called the bounded linear model. The data from the three panels were combined into one set of differences by subtracting the values (for all investigated cases) of Panel 1 from the Panel 2 values, and the Panel 2 values from the Panel 3 values. Therefore, most cases are used twice in the analyses; this is permissible because dependencies are neutralized through the use of differences. From a modeling perspective, this means that the effects under investigation are assumed to be time-independent. However, in regression models, two constants are used (one for each time step), as unexplained changes might differ over time. In addition to the difference values, dummy variables for the interventions are used (0 = no intervention, 1 = intervention). Interactions of interventions must also be considered, specifically (1) if two interventions were applied together at the same time and (2) if interventions were applied in the previous time step (for all interactions, see Table 3).
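The data preparation and the boundedness argument above can be sketched as follows. This is only an illustration under stated assumptions: the full model is specified in the SI and is not reproduced here, so `bounded_change` reads "linear between these bounds" as clipping the linear prediction to the value range, which is one plausible interpretation. All function names and the example panels are hypothetical.

```python
def panel_differences(p1, p2, p3):
    """Stack the P2-P1 and P3-P2 changes into one set of differences.

    Each panel is a dict mapping a case id to a score. Every row keeps its
    time step so that a separate constant can be used for each step.
    """
    rows = []
    for step, (before, after) in enumerate(((p1, p2), (p2, p3)), start=1):
        for case_id, start in before.items():
            if case_id in after:  # skip cases missing in the later panel
                rows.append({
                    "case": case_id,
                    "step": step,
                    "start": start,
                    "change": after[case_id] - start,
                })
    return rows


def bounded_change(start, linear_change, lower=-1.0, upper=1.0):
    """Observable change when the end value is clipped to [lower, upper].

    Assumption: clipping stands in for the bounded linear model of the SI.
    """
    end = min(upper, max(lower, start + linear_change))
    return end - start


# Hypothetical panels for two cases; case "b" is missing in Panel 3.
rows = panel_differences({"a": -0.5, "b": 0.0}, {"a": 0.0, "b": 0.25}, {"a": 0.5})

# The example from the text: a previous value of 0.75 and an effect of 1.0.
observable = bounded_change(0.75, 1.0)
```

With the example from the text, the observable change is 0.25 even though the underlying linear effect is 1.0, which is why an unbounded linear regression on such data would underestimate the effect.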
A regression model for change in behavior on intervention variables is used to answer RQ2; regression models for changes in each psychological construct on intervention variables are calculated to answer RQ3; and a regression model for change in behavior on changes of all psychological constructs is estimated, with the exception of perceived behavioral control, to answer RQ4. The latter construct could not be considered in this model because of too many missing values. RQ5, which attempts to explain the behavior change in the control group, is investigated with a bounded linear regression model of behavior on the psychological constructs prior to the interventions. For all regression analyses, confidence intervals were estimated using a bootstrap approach with 1,000 samples. In addition, data were tested for multicollinearity of independent variables, heteroscedasticity, and autocorrelation. Outliers with residuals greater than two standard deviations were eliminated (on average, about 10% of all the cases). Pre-conditions for estimating regression models are met for all the presented results. However, not all parameters could be estimated for some models. The fit of the models to the data is quantified with the following indicators: explained variance (R² and adjusted R²); standard error of the estimate (s of e); mean absolute error (MAE); root mean squared error (RMSE); percentage of cases with absolute residuals smaller than the measurement resolution of 0.25 (|e| ≤ 0.25; interpreted as adequate forecasts); and the percentage of cases with absolute residuals larger than twice the measurement resolution (|e| > 0.5; interpreted as unusable forecasts).
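The residual-based fit indicators and the bootstrap step can be sketched as below. The function names and example residuals are hypothetical. The text does not say which bootstrap variant was used, so the sketch assumes a simple percentile bootstrap and applies it to a generic statistic rather than to the regression parameters themselves.

```python
import math
import random


def fit_indicators(residuals, resolution=0.25):
    """MAE, RMSE, and the shares of adequate and unusable forecasts."""
    n = len(residuals)
    abs_e = [abs(e) for e in residuals]
    return {
        "MAE": sum(abs_e) / n,
        "RMSE": math.sqrt(sum(e * e for e in residuals) / n),
        "adequate": sum(a <= resolution for a in abs_e) / n,     # |e| <= 0.25
        "unusable": sum(a > 2 * resolution for a in abs_e) / n,  # |e| > 0.5
    }


def bootstrap_ci(values, statistic, n_samples=1000, level=0.95, seed=0):
    """Percentile bootstrap interval (an assumed variant, see lead-in)."""
    rng = random.Random(seed)
    n = len(values)
    stats = sorted(
        statistic([values[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_samples)
    )
    tail = (1.0 - level) / 2.0
    # Approximate percentile positions in the sorted bootstrap statistics.
    lower = stats[int(tail * n_samples)]
    upper = stats[int((1.0 - tail) * n_samples) - 1]
    return lower, upper


def mean(xs):
    return sum(xs) / len(xs)


# Hypothetical residuals and observed changes, for illustration only.
indicators = fit_indicators([0.1, -0.2, 0.3, -0.6, 0.05])
changes = [0.25, 0.5, 0.0, 0.75, 0.25, 0.5, -0.25, 0.5]
ll, ul = bootstrap_ci(changes, mean)
```

In the study, the bootstrapped quantity would be a regression parameter or a forecast derived from it; the lower and upper limits (LL, UL) then feed into the effect classification.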
To answer the research questions, forecasts of the intervention effects were calculated based on the parameters estimated in the regression models, and the effect sizes were classified based on the following considerations. The maximal measurable effect is 2 (i.e., a change from −1 to +1; in the case of remembering, the maximal measurable effect is 1). An intervention within an expensive large-scale campaign should lead at least to an effect of one step on a questionnaire scale (i.e., 0.25). Furthermore, because other factors increase the uncertainty of intervention effects in the real world, the factors that can be calculated should be of high probability; the 0.25 minimal effect should be reached with a probability of 97.5%. Intervention effects are evaluated based on the lower or upper limit of the 95% confidence interval, depending on whether the effect is positive or negative. If both limits have the same sign, the weaker effect has to be at least 0.125 to be mentioned. Effects between 0.25 and 0.5 are labeled as small; between 0.5 and 0.75, medium; between 0.75 and 1.0, strong; and greater than 1.0, very strong.
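The classification rules above can be written down as a small function. This is a sketch under assumptions: the text does not say which class a value exactly on a boundary belongs to, so the lower bound of each class is treated as inclusive, and the labels for effects below 0.25 are hypothetical names. The lower limits in the usage example are taken from the results reported below; the upper limits are invented for illustration.

```python
def classify_effect(lower_limit, upper_limit):
    """Label an effect by the limit of its 95% CI that is closer to zero."""
    if lower_limit * upper_limit <= 0:
        return "not mentioned"  # interval includes zero
    weaker = min(abs(lower_limit), abs(upper_limit))
    if weaker < 0.125:
        return "not mentioned"
    if weaker < 0.25:
        return "mentioned, not relevant"
    if weaker < 0.5:
        return "small"
    if weaker < 0.75:
        return "medium"
    if weaker <= 1.0:
        return "strong"
    return "very strong"


# Lower limits from the text; upper limits are hypothetical.
labels = [
    classify_effect(0.211, 0.58),    # prompts alone
    classify_effect(0.638, 1.20),    # prompts combined with commitments
    classify_effect(1.362, 1.90),    # repeated and combined
    classify_effect(-0.90, -0.338),  # negative effect: the upper limit is weaker
]
```

For negative effects the upper limit is the one closer to zero, which is why the function compares absolute values rather than always using the lower limit.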

Descriptive and Regression Results
The descriptive statistics of the variables used are compiled in Table 3. The lower part of Table 3 presents intervention counts. Cases with implementation intentions are sparse, and results for this BCT have to be interpreted with caution. The success rate of delivering the different BCTs was limited. Even the most successfully delivered BCTs, prompts and public self-commitments, reached only a little more than half of the targeted population (54% and 56%, respectively). Implementation intentions reached only 22% of the targeted population.

Table 4 note. Effects that with p = 2.5% are stronger than ±0.125 are marked with †, ±0.25 with *, ±0.5 with **, and ±0.75 with ***. RW = raw water; SODIS = solar water disinfection; LL / UL = lower / upper limit of the 95% confidence intervals; N/A = not estimated due to numerical problems; MAE = mean absolute error; RMSE = root mean square error.
The results of the bounded linear regressions for estimating the intervention effects are presented in Table S-1 of the SI. The fit indicators for each model are compiled at the bottom of Table 4. While the overall fit of the models is acceptable, the fit was only found to be good for remembering, behavior control, and both SODIS attitudes (MAE ≤ 0.25; more than 50% of cases with |e| ≤ 0.25 and less than 20% of cases with |e| > 0.5). Table 4 shows predicted intervention effects together with their 95% confidence intervals, adjusted for the changes that occurred without interventions (control group). Unadjusted estimates can be found in Table S-2 of the SI. The results in Table 4 can be used directly for planning a campaign. For example, prompts are expected to increase the behavior by 0.395; with 97.5% probability, the effect will be 0.211 or larger. Answering RQ2 and RQ3, the main results from Table 4 can be summarized as follows. Figure 2 gives an overview of the effects compiled in Table 4.

Predictions of Intervention Effects
Effects on behavior change. Applied once and alone, neither prompts (LL = 0.211) nor public self-commitments (LL = 0.088) show relevant effects. However, if combined, the effect of these BCTs is more than double the sum of their single effects (LL = 0.638) and shows a statistically significant difference when compared with the effect of prompts and public self-commitments alone. Similar effects are found for the combination of prompts and implementation intentions (LL = 0.489) and repeating the same BCTs (LL = 0.597 and LL = 0.686, respectively). Repeating and combining prompts and public self-commitments has the strongest effect (LL = 1.362, showing a statistically significant difference in comparison with all the other effects). The confidence intervals (CIs) for differences between effects are compiled in Table S-3 of the SI.
Effects on psychological constructs. Prompts (LL = 0.494) and public self-commitments (LL = 0.567) have a weak to medium effect on remembering, and prompts also have a weak effect on the instrumental attitude toward SODIS (LL = 0.261). Furthermore, public self-commitments have a non-relevant influence on the injunctive norm of SODIS (LL = 0.230). If combined, the two BCTs show the same effect on remembering (LL = 0.564), but weak to medium effects for all evaluations of raw water (LL = −0.338 on affective attitude; LL = −0.728 on instrumental attitude; LL = −0.300 on injunctive norm). The combination of prompts with implementation intentions shows no relevant effect on psychological variables. This might be due to the small number of cases that received this treatment. If the BCTs are repeated, the patterns of the effects (i.e., their influence relative to each other) remain the same, although the overall effects become stronger.

Explaining Behavior Change With Psychological Constructs
Changes due to interventions. Table 5 compiles the results of the bounded linear regression analysis of behavior change on the changes of the psychological constructs. The estimate for the parameter of the instrumental attitude toward SODIS-treated water turned out to be unstable (i.e., the value changed completely when the model specification was altered slightly, such as by adding or removing a variable). This problem might be due to the many cases with high values for this variable that already existed before the intervention. The potential intervention effect had to be estimated with a few cases that still had a low value before the intervention. Because of this, we did not use the instrumental SODIS attitude in the analyses, even though this construct might be a strong mediator for intervention effects in many cases. The model fits the data well (adj. R² = 86%, MAE = 0.238, 68% of the data points have an |e| ≤ 0.25, and only 5% an |e| > 0.5). This indicates a high reliability of the measurements and an adequate model specification, including considering all the important determinants of behavior change.
All psychological constructs have statistically significant effects on behavior change, even though the effect of the instrumental attitude toward raw water is minimal. Surprisingly, the injunctive norms are negatively related to the respective behavior. Because the injunctive norm for SODIS is not correlated with behavior (r = 0.031, p = 0.652) and the injunctive norm for raw water consumption is negatively correlated (as expected) with behavior (r = −0.250, p < 0.001), this result can be explained in relation to the strong effect of the affective attitudes and remembering. Two interpretations are possible: (1) Persons who felt social pressure to perform SODIS did not respond as strongly to the change in attitudes and remembering as did persons with less perceived social pressure; and (2) what seems more probable is that those with more perceived social pressure reported a stronger increase in attitude and remembering that is not reflected in the behavior change. Thus, the effects of the injunctive norms might correct the effects of social desirability on attitudes and remembering. The constant also has a weak influence in the first time step (B = 0.16). Therefore, some systematic change in behavior is not explained by the psychological variables.
Changes without intervention. For the first time step, no relevant effects were found in the control group, apart from a barely relevant effect on descriptive norms (LL = 0.242). In the second time step, weak effects on behavior (LL = 0.317) and the instrumental attitude toward raw water (LL = −0.372) were observed in the control group. The remaining effects of single interventions were found only for remembering (LL = 0.474 for prompts; LL = 0.588 for public self-commitments). The behavior returned to the same value as for the control group (M = −0.121 for prompts, M = −0.104 for public self-commitments).
To explain the aforementioned behavioral changes in the control group, a regression of behavior change on the psychological variables before the interventions was computed, as shown in Table 6. Due to the small number of cases, most estimates of the parameters are not statistically significant. The attitudes toward SODIS are strong predictors of behavioral change without interventions (B = 1.35 for instrumental and B = 0.62, n.s., for affective attitude). Surprisingly, however, the affective attitude toward raw water has a strong positive effect on behavior change (B = 0.79). This might be related to the fact that the taste of SODIS water is very close to the taste of raw water, and taste is an important component of the affective attitude. Thus, persons who prefer the taste of raw water to that of boiled water evaluated SODIS-treated water more positively. The only relevant (but barely statistically significant) negative effect is on remembering (B = −0.56), indicating that a lack of fully developed habits might be the principal barrier for people with positive attitudes to routinely perform SODIS.

Discussion
We applied a methodology that quantifies BCT effects in a way that allows forecasting the psychological and behavioral effects of intervention campaigns. The analyses led to a number of interesting substantive results. Those results are discussed next, after which the methodology itself is discussed.

Discussion of the Substantive Results
To investigate the potential impact of prompts, public self-commitments, and implementation intentions on behavior and on a number of psychological constructs, bounded linear regression models were fitted to data gathered during a large-scale behavior change campaign in rural Bolivia. Based on these models, the effects of the different BCTs were estimated. The results allow all of the RQs to be answered.
Results regarding the success of delivering the different BCTs (RQ1) and effects on behavior change (RQ2) are summarized from an application-oriented perspective. Prompts, on their own, had the strongest effect on behavior (expected change, EC = 0.395 of a maximal possible effect of 2.0), and they were the easiest to distribute (about 54% of the target population received the prompt). A similar delivery success rate (56%), but a lower impact on behavior change (EC = 0.249), was observed for public self-commitments. Implementation intentions failed with respect to delivery success rates (22%); however, they relevantly increased the effect of prompts (EC = 0.883). A similar effect was found for the combination of prompts and public self-commitments (EC = 0.986). Thus, combining BCTs increases the effects on behavior change to levels higher than the sum of the two single BCTs. The same holds true for repeating the same technique over time (EC due to repeated prompts = 0.918, and due to repeated public self-commitments = 1.081). The strongest effect found in this study was the combination of prompts and public self-commitments after initial public self-commitments (EC = 1.818). Neither prompts nor public self-commitments showed long-term effects on the behavior if applied only once. No data are available on the long-term effects of combinations and repetitions of BCTs.
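The claim that combined and repeated BCTs exceed the sum of the single effects can be checked directly from the EC values reported above. The short Python sketch below does this arithmetic. Treating twice the single-application effect as the additive benchmark for repetition is an assumption made here for illustration, and the variable names are hypothetical.

```python
# Expected changes (EC) in behavior as reported in the text.
ec_single = {"prompt": 0.395, "commitment": 0.249}
ec_combined = 0.986  # prompts + public self-commitments
ec_repeated = {"prompt": 0.918, "commitment": 1.081}

# Additive benchmark: what the sum of the two single effects would give.
additive_combined = ec_single["prompt"] + ec_single["commitment"]
surplus_combined = ec_combined - additive_combined

# Assumed benchmark for repetition: twice the single-application effect.
surplus_repeated = {
    name: ec_repeated[name] - 2 * ec_single[name] for name in ec_single
}

print(f"combination: {ec_combined:.3f} vs. additive {additive_combined:.3f}")
for name, surplus in surplus_repeated.items():
    print(f"repeated {name}: surplus over additive = {surplus:.3f}")
```

Under these benchmarks, all three surpluses are positive, which is consistent with the statement that combinations and repetitions yield more than the sum of single applications.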
These results have two main implications for theory development and application. First, the limited success of delivering the BCTs illustrates the importance of determining whether interventions actually reach the target population. For theory development, the effects of successfully applied interventions need to be quantified independently from the success in delivering them, as completely different processes determine the two success rates. Regarding application, BCTs should be designed in a way that they appear attractive and are easy to understand and apply. The colorful and practical prompts could be handed out with little instruction and were well received by the people, which in turn motivated the health volunteers. The implementation intentions turned out to be too difficult for the local health volunteers to understand, too time-consuming to be applied, and not very attractive to the target population. Thus, within a given amount of time, the health volunteers were not able to apply as many of these BCTs as the others and encountered more refusals; these two factors might have reduced the motivation of the volunteers, leading to an even lower delivery success rate. The critical role of promoter characteristics, particularly their level of commitment, was recognized by Meierhofer and Landolt (2009). Therefore, it is important to work with BCTs that keep the motivation of the promoters high. Second, the effects of BCTs depend on other interventions applied before or at the same time. This is not surprising, but it is almost never scientifically investigated. Mosler et al. (2013) applied a number of BCTs consecutively, but the interactions between BCTs were not considered in the statistical models. The conclusion that can be derived from our results in terms of theory development is that intervention effects need to be investigated as individual processes to understand how they interact.
For application purposes (i.e., campaign planning), it can be concluded that the investigated BCTs should be combined and repeated to increase their effectiveness.
Results regarding the effects of the BCTs on psychological constructs (RQ3) and the psychological mechanism behind the BCT effects (RQ4) can be summarized together from a theoretical perspective: Prompts affect the instrumental attitude toward SODIS, while public self-commitments affect the injunctive norm of SODIS. Combining prompts and public self-commitments leads to effects on raw-water-related attitudes and norms. Thus, the often assumed effects on attitudes and norms were confirmed. However, the strongest effects were found on remembering, which is absent in most behavior change theories. Only Tobias (2009) considered this construct a central driver of behavior change dynamics, and Mosler (2012) included it in his conceptual model. Moreover, no relevant effects were found on the descriptive norm. A possible reason may be the scattered layout of the settlements in the target region, which makes it difficult to know what neighbors are doing.
Considering the relationships between behavior change and changes in the psychological variables, it turns out that only change in remembering reflected a positive mediating effect of the BCTs on behavior change. It must be mentioned, however, that the effect of the change in instrumental attitude toward SODIS-treated water and the effect of perceived behavioral control on behavior change could not be estimated, due to numerical problems and too few cases, respectively. In particular, instrumental attitude might be an important mediator for the prompt effect in cases where this construct is not very high before the interventions. Roughly, the BCTs changed remembering at least by about 0.5 (lower limit of the 95% CI), while the lower limit of the CI for the effect of a change in remembering on behavior change is 0.4. Thus, with a probability of more than 97.5%, the investigated BCTs changed the behavior by at least 0.5 × 0.4 = 0.2 due to a mediation of the change in remembering. The expected mediation effect of remembering is about 0.5 for single interventions and 0.7 for repeated interventions on behavior.
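The lower-bound argument above can be written compactly. The symbols are introduced here for illustration only and do not appear in the original analysis: \Delta R denotes the change in remembering caused by a BCT, \beta_R the effect of a change in remembering on behavior change, and \Delta B_{\text{mediated}} the part of the behavior change transmitted through remembering.

```latex
\[
\Delta B_{\text{mediated}} \;=\; \beta_R \,\Delta R \;\ge\; 0.4 \times 0.5 \;=\; 0.2
\qquad \text{whenever } \beta_R \ge 0.4 \text{ and } \Delta R \ge 0.5 .
\]
```

The inequality only requires that both quantities are positive and lie at or above their respective lower confidence limits.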
It might be seen as surprising that prompts and public self-commitments show such similar effects, as they are often categorized as completely different behavior change techniques. For example, Michie et al. (2013) categorized prompts within the cluster "associations," and behavioral contracts (what we call self-commitment) in the cluster "goals and planning" (together with action planning, what we called implementation intentions). However, the effects of BCTs are determined not only by the form of the BCT but also by the problem it solves. Here, the problem was forgetting to put the bottles in the sun at least 6 hr before the water was needed. As it seems that both techniques could solve the problem, and because this was the only critical factor that hindered the performance of the behavior, both techniques show similar effects. One might wonder how self-commitment can prevent forgetting. One explanation is that the commitment sign worked as a prompt; another is that the self-commitment intervention increased the importance of the behavior, and, thus, the persons in charge put more effort into not forgetting to put the bottles in the sun (e.g., by setting up self-made reminders or associating specific situations with performing the behavior, something that also occurred in the control group in the experiment by Gollwitzer & Brandstätter, 1997).
In addition, strong relations between the affective attitudes and behavior change and weaker ones between changes in the norms and behavior change were observed, but no systematic effects of the intervention on these constructs were detected. It might be that the relationships between the changes in the psychological constructs and behavior change reflect only a self-report bias (i.e., over-reporting of the change in attitudes by people who felt social pressure to use SODIS) or that the effects of the interventions on the psychological constructs were not detected due to insufficient measurement or modeling.
Another modeling concept used for planning behavior change campaigns can be discussed as well: stage models (e.g., the Transtheoretical Model of Change by Prochaska & DiClemente, 1983; the Health Action Process Approach by Schwarzer, 2008). Such models have been successfully applied to the use of SODIS (e.g., Kraemer & Mosler, 2011; Mosler & Kraemer, 2012). However, our model explains the data comprising all levels of SODIS use almost perfectly without considering stages of change. The reason for this is that the bounded value ranges of the variables have been considered in the model. As in stage models, changing some constructs might have no effect for certain persons. However, this is not because these persons are in stages wherein the changes of the constructs have no effect, but because the constructs have values close to their bounds, and, thus, nothing can be won by changing these constructs. In addition, cases without interventions (i.e., in the control group) showed changes in behavior at the second time step (EC = 0.525) and changes in instrumental attitude toward raw water (EC = −0.589). Such changes in the control group were also observed in other SODIS promotion campaigns (e.g., Mosler et al., 2013), but they were never analyzed. According to our results, persons who demonstrate a greater increase in SODIS use have not only more positive attitudes toward SODIS but also a more positive affective attitude toward consuming raw water. As mentioned before, this might be due to the similarity of the taste of SODIS-treated water to that of raw water. More generally, it is important to note that a positive attitude toward a competing behavior does not necessarily impede the target behavior; in fact, it may even promote it. Therefore, theories of behavior change should consider the interaction of competing behaviors.
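The argument made above about constructs close to their bounds can be illustrated with a small Python sketch. It assumes a hypothetical construct scaled to the range [0, 1] and a hypothetical raw intervention effect of 0.3; neither value is taken from the study, and `realized_change` is an illustrative helper, not part of the authors' model.

```python
def realized_change(value, raw_effect, lower=0.0, upper=1.0):
    """Change that remains after clipping the new value to [lower, upper]."""
    new_value = max(lower, min(upper, value + raw_effect))
    return new_value - value


# The same raw effect of +0.3 applied at different starting values
# (all numbers are hypothetical).
for start in (0.2, 0.6, 0.9, 1.0):
    print(start, round(realized_change(start, 0.3), 3))
```

A person starting at 0.2 or 0.6 realizes the full change, a person starting at 0.9 only a third of it, and a person already at the upper bound none. The pattern resembles a stage effect even though no stages are modeled.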
There is another interesting aspect related to these results: The attitudes are not related to the behavior itself but to the change in the behavior. Thus, persons who changed their behavior without intervention treated less water with SODIS at the beginning than at the end of the campaign. If these persons evaluate SODIS so positively, why did they not use SODIS right from the beginning? Because remembering is negatively related to behavior change, it can be concluded that these persons often forgot to apply SODIS at the right moment. However, as Tobias (2009) demonstrated, if they were still applying SODIS at least once in a while, habits could have developed that prevented forgetting after some months, even without intervention.

Discussion of the Method
The methodology used in this study to quantify intervention effects consists of a number of key elements. Most of the methods used are common in engineering and the natural sciences but not as common in psychology. Therefore, the approach is summarized in terms of the design of the campaign, data gathering, and data analysis.
Campaign. To investigate intervention effects, data should be gathered during actual campaigns. Ideally, these are large-scale campaigns, but smaller pilot campaigns can be more practical for trying out a number of BCTs. In these campaigns, different BCTs should be applied on their own, in combination, and in sequence to different target groups. Furthermore, data gathering must be designed in a manner that allows investigation of possible problems with the delivery of the BCTs.
Data gathering. To obtain usable estimates of intervention effects, longitudinal and (approximate) metric data must be gathered. In psychology, this can be achieved by asking the interviewees for very simple and specific evaluations of the behaviors of interest. Based on these data, scores for more complex and abstract constructs can be computed.
Analysis. Intervention effects should be quantified in absolute terms and not just by demonstrating the statistical significance of an effect compared with a control group. This can be accomplished with regression analyses. However, the following points must be considered. First, bounded linear models are necessary if the dependent variable is bounded, as in the case of metric data on psychological constructs. Second, to determine the effects of BCTs, predictions of the effects must be calculated based on the parameter estimates. Third, the uncertainty of these predictions must be estimated. Bootstrapping can be a useful approach, because it considers interdependencies among uncertainties of parameter estimates.
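The second and third points can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the estimator used in the study: ordinary least squares with a single 0/1 intervention dummy stands in for the bounded linear fit, the bounded range is only enforced by clipping the forecast, and the function names, bounds, and toy data are all hypothetical. Because every bootstrap resample refits intercept and slope together, the resulting interval reflects the dependence between the two parameter estimates.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    if len(set(xs)) < 2:
        raise ValueError("x needs at least two distinct values")
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b


def clip(value, lo, hi):
    """Restrict a forecast to the bounded range of the variable."""
    return max(lo, min(hi, value))


def bootstrap_forecast(xs, ys, x_new, lo, hi, n_boot=500, seed=0):
    """Point forecast and 95% percentile interval from a case bootstrap."""
    rng = random.Random(seed)
    n = len(xs)
    forecasts = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample whole cases
        bx = [xs[i] for i in idx]
        by = [ys[i] for i in idx]
        if len(set(bx)) < 2:  # skip resamples without variation in x
            continue
        a, b = fit_line(bx, by)  # intercept and slope are refitted jointly
        forecasts.append(clip(a + b * x_new, lo, hi))
    forecasts.sort()
    k = len(forecasts)
    lower = forecasts[int(0.025 * (k - 1))]
    upper = forecasts[int(0.975 * (k - 1))]
    a, b = fit_line(xs, ys)
    return clip(a + b * x_new, lo, hi), lower, upper


# Toy data: x = 1 if a household received the BCT, y = change in a
# construct assumed to be bounded in [-1, 1] (values made up).
xs = [0, 0, 0, 0, 1, 1, 1, 1]
ys = [0.0, 0.1, -0.1, 0.2, 0.5, 0.7, 0.4, 0.6]
point, lower, upper = bootstrap_forecast(xs, ys, x_new=1, lo=-1.0, hi=1.0)
print(point, lower, upper)
```

The lower limit of such an interval plays the role of the LL values reported in the results, and the point forecast corresponds to the expected change.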
This approach quantifies intervention effects in a form that can be used for the development of behavior change campaigns such as health-promotion interventions. Thus, the shortcoming of many studies criticized in the introduction (i.e., that they only show that an intervention is effective but not how much and what type of effect can be expected) is overcome. Furthermore, this methodology supports the development of process theories of behavior change and solves all of the limitations of previous studies mentioned in the introduction. Table 7 summarizes these issues and how they were solved using the above approach.
Table 7

| Limitations of previous studies | How these issues were solved in this study |
| --- | --- |
| Setting is artificial: laboratory or field experiment instead of a real-world campaign. | For this study, data were gathered during an actual campaign. |
| Investigation limited to test for statistically significant differences. | The changes are quantified using meaningful (psycho-)metric scales. |
| Only differences between an intervention group and a control group are investigated, or a combination of various techniques is investigated as one intervention without differentiating the effects of the techniques applied. | A number of explicitly defined combinations of different intervention techniques are investigated considering the effects of each technique on its own and in combination with other techniques or repetitions of the same technique. |
| Only behavioral outcomes are measured. | Behavioral outcomes and a number of psychological constructs are measured. |
| Considering only cross-sectional data. | Longitudinal data were used for this study. |
| Data is only investigated in form of linear covariance structures. | A case-based approach with bounded-linear models was used considering the limited range of the constructs. |
| The delivery of the behavior-change techniques is artificial or assumed to be perfect. | The behavior-change techniques were delivered within a real-world campaign and the delivery success investigated. |

A particular strength of the present study is that it uses data gathered from a "real-world" campaign, thereby ensuring high external validity. However, this comes at the price of not having data of the highest quality. Therefore, for this specific study, a number of shortcomings have to be considered before generalizing the results. First, the forecasts of the intervention effects are rather rough. For most constructs, adequate forecasts (i.e., with |e| ≤ 0.25) could only be achieved for about half of the sample, and in many cases, about one-quarter of the forecasts were unusable (i.e., |e| > 0.5). The problem is also reflected in the poor explained variance (adj. R²), which is often below 50% and, for some models, even below 20%. This needs to be considered, particularly in the case of further analyzing the forecasts. A second problem is the small number of cases for some interventions. In particular, few cases received combinations of BCTs, and results regarding these effects should be interpreted with caution. Third, the cases were not completely randomly assigned to the treatment conditions. The households could not select the BCTs, but they could reject them. However, no statistically significant differences in behavior or psychological constructs before the interventions were found between households with and without intervention. Finally, we could not implement a mediation analysis, even though such analyses are commonly performed in psychological studies (e.g., Baron & Kenny, 1986; Hayes, 2009; MacKinnon, Fairchild, & Fritz, 2007). We did calculate all models (i.e., the direct effects of the interventions on the behavior, the effects of the interventions on the psychological constructs, and the effects of the psychological constructs on the behavior), but we did not correct for the direct effect in the regression of the behavior on the psychological constructs. This was not possible due to the high explicative power of the models: Considering the direct effect in the regression of the psychological constructs would have led to strong multicollinearity.
However, our approach allows an absolute estimate of the mediation effect under consideration of uncertainty in the sample, measurement, and model, which might be even superior to the traditional approach, which only tests for statistical significance. Nevertheless, when comparing the results with other mediation analyses, this difference has to be considered.

Conclusions
For any science, the ability to forecast the effects of the application of a technique in the real world is a critical step in the application as well as development of a knowledge domain. Regarding application, science should help us foresee the consequences of our actions; regarding theory development, deviations between expected and observed consequences are the most valuable basis for improving models. A prerequisite for this step is to apply adequate methods of analysis to adequate data. In this article, we presented a methodology that allows quantification of the effects of BCTs in a form that allows forecasting of the effects of real-world campaigns. Furthermore, we provided first insights into how three BCTs worked in a large-scale health promotion intervention. However, these results are only a first step, and further research is needed to actually understand how BCTs work. Besides quantifying the effects of other campaigns to see how far the results of this case study can be generalized and what other effects the investigated BCTs can have, new theories of behavior change are needed. Such theories need to focus on psychological processes triggered by BCTs instead of only listing possible determinants and stages of behavior change. Investigating individual processes during large-scale campaigns requires different approaches to data gathering and analysis, as presented herein. Knowing what effects can be expected from different BCTs could help in the design of better behavior change campaigns, and, thus, mitigate many urgent problems faced by humanity.