Are visits of Dutch energy coach volunteers associated with a reduction in gas and electricity consumption?

In a number of European countries, local municipalities, housing cooperatives, and citizen-based initiatives have been training energy coaches to help citizens improve the sustainability of their homes. These local volunteers offer an analysis of a citizen’s home to advise on how to make it more sustainable, comparing citizens’ consumption patterns with similar others’. While energy coaches are widely employed, evidence on the effectiveness of energy coaches and their approach is lacking. We collaborated with a housing cooperation that trains and provides tools for energy coaches in the Netherlands, comparing the electricity and gas consumption of households before the visit of a local energy coach and their consumption 1 year later. Our results suggest that the visit of an energy coach was associated with a reduction in energy consumption, but only for those who were told by the energy coach that they were consuming more energy than comparable others.


Introduction
Reducing greenhouse gas emissions to mitigate the negative effects of climate change is a central objective of this generation and ingrained in the policy of the United Nations (United Nations, 2015). Many European countries such as the Netherlands are still largely reliant on the consumption of natural gas for cooking and heating houses (Dehullu, 2017), but want to drastically change this by 2050 (Ministry of Economic Affairs and Climate, 2019, 2020). Transitioning away from natural gas can seem complex and costly for households; however, the prior step of insulating homes and making energy-efficiency investments in the residential sector is less costly and provides several direct benefits. Still, despite financial incentives that shorten the payback period of energy-efficiency investments, increased home comfort, and reduced environmental impact, the uptake of energy-efficiency investments in the residential sector remains too low, and most European countries like the Netherlands are unlikely to reach their goal to reduce their greenhouse gas emissions by 49% in 2030 compared to 1990 (Ministry of Economic Affairs and Climate, 2019).
There are various reasons why citizens oppose the transition away from using gas for heating and cooking, one of them being that some citizens are afraid that the transition from natural gas to new energy sources will cost a lot of money (Koning et al., 2020). The substitution of natural gas with alternative energy sources and the insulating of all homes require the involvement and acceptance of citizens and businesses, as they are the key to a successful neighborhood approach (Koning et al., 2020). Municipalities have used strategies that build on research findings showing that feedback on energy consumption results in energy savings and awareness (Weiss & Guinard, 2010). However, there is also evidence that not all forms of providing feedback to households about their energy usage will result in substantial energy savings (Buchanan et al., 2015; Schultz et al., 2015). Providing energy usage feedback through smart meters or other in-home displays might be gaining global popularity; yet, it seems vital to further investigate and develop feedback mechanisms that take into account user engagement and unintended consequences of feedback (Buchanan et al., 2015). Households might prefer information and cost-framed feedback, but these do not seem to lead to reduced energy usage, whereas normatively framed feedback does seem to reduce consumption (Schultz et al., 2015). Another popular way of providing feedback to residents is to train intrinsically motivated volunteers to encourage others to behave more sustainably and to equip them with the tools to give feedback on residents' consumption behaviors. Municipalities and housing corporations have been training and mobilizing volunteers to act as so-called energy coaches (!WOON., 2021).
A trained energy coach can give various sorts of advice on energy consumption, energy saving, and investments, such as how to insulate a house (Bongers & Holtappels, 2019;Ozawa-Meida et al., 2017;Rotmans, 2011). Energy coaches can draw up a personal advisory report for the residents based on the data obtained by the energy coach during a home visit. This includes the energy consumption, the wishes of the residents, and the condition of the house that the energy coach determines during the home visit (Bongers & Holtappels, 2019). These advisory reports include a wide range of information, such as possible investments in sustainability, subsidies for this, and the time frame in which the investment could be recouped (Bongers & Holtappels, 2019).
Advice about the technical energy efficiency of a house is not new, as energy audits have been available and administered for decades. However, the uptake of such paid services has been low, although it was carried out by experts that focus on energy efficiency (Ingle et al., 2012). Relatedly, home energy efficiency investments have been considerably below levels that seem reasonable following a technological economic perspective (Ingle et al., 2012). Energy coaches provide their information in a less formal and more approachable situation as they are local volunteers trying to help and not professionals who need to earn money providing a service. While some energy coaches give advice on how adjustments in the home or behavioral changes can lead to lower energy consumption, other energy coaches also provide a comparison of the energy consumption of the inhabitants compared to that of other similar households (Bale, 2016;!WOON., 2021).
We argue that social comparison information may be critical for the success of energy coaches in lowering energy consumption. Given the vast amount of successful studies utilizing social influence in the field of pro-environmental behavior (Abrahamse & Steg, 2013), we want to investigate if the visits of energy coaches are associated with a decrease in energy usage and to what degree social comparisons play a role in this. Case studies in the field of marketing on domestic electricity consumption showing that feedback on electricity consumption for individuals as well as social norm feedback can lead to reductions of consumption of about three percent, suggest that individual feedback might be sufficient by itself, and field experiments in the energy domain should be careful combining intervention elements as the impact of social norm information might be confounded with that of individual feedback (Harries et al., 2013;Vesely et al., 2022). Research from behavioral economics shows that non-monetary interventions such as social comparisons have the potential to significantly reduce energy consumption of private households, yet that it is crucial to evaluate the impact and effect sizes of such interventions before implementing policy interventions (Andor & Fels, 2018). Similar meta-analyses of the effectiveness of incentivizing lower electricity consumption show that monetary, informational, and behavioral incentives can achieve reduced electricity consumption; however, they do not always produce significant effects and on average achieve 2-4% of energy reduction (Buckley, 2020). Our research on energy coaches contributes to this understanding by investigating a very popular energy saving program that has until now not been explored adequately. Although our study contains a convenience sample of self-selected households and does not have a control group, we think it is vital to explore the effect of social comparison information in this context. 
In particular, we argue that it is important to explore whether the limited effects of social comparisons might gain in effectiveness by energy coaches providing information to the right people who are interested in saving energy and by providing that information in a salient and personal way.
Besides the desired effect of residents consuming less energy when they are made aware of the fact that they are consuming more than comparable others, there is also an undesirable effect, which in the social comparison literature is called the "boomerang" effect. The boomerang effect refers to individuals who are consuming below the norm and then adapt to the standard of similar individuals and thereby consume more energy (Rasul & Hollywood, 2012; Schultz et al., 2007, 2018). It therefore seems essential not only to look at whether energy coach visits are associated with a general reduction in energy consumption but also to differentiate between residents initially consuming more and residents initially consuming less. The boomerang effect can easily be confounded with a "regression to the mean" effect. A "regression to the mean" effect refers to a common statistical phenomenon in which, due to random variability, scores that are initially above average tend to decline and scores that are initially below average tend to go up again. Therefore, we do additional analyses to disentangle social influence effects from a regression-to-the-mean effect. To evaluate the relationship between energy coaches and lower energy consumption, we test the prediction that the change in consumption at least partially depends on the social comparison energy coaches provide to the resident. We use both smart meter data and hand-filled records of electricity and gas consumption before and after a visit of an energy coach. In the time span between 2017 and 2019, 3888 households signed up and were visited by such an energy coach. We do not have any demographic or profile information about the energy coaches, nor do we know about their level of competence in the field of sustainability. Yet, we know that the housing cooperation !WOON provided a mandatory one-day training on energy efficiency advice and on how to use a tablet application that helps to calculate individual home improvement advice.
Additionally, !WOON provided the opportunity to get home improvement gifts such as LED lamps, so that anyone interested in becoming an energy coach could provide the same valuable information and gifts to local households. Once an energy coach is trained, they are asked to be available for a minimum of 1 hour per week to help other locals save energy. Of the 3888 home visits that took place in the cities of Amsterdam and Haarlem, we received 467 gas or electricity measurements, which are discussed in the methods section. We investigate whether a visit by an energy coach is associated with lower gas (m³) and electricity (kWh) consumption of the visited households 1 year after the visit. We differentiate between households that were told by the energy coach that they had been consuming more gas or electricity than comparable households and households that were told by the energy coach that they had been consuming less gas or electricity than comparable households.

Theory
Energy coaches provide information on how to reduce energy consumption; when this information and advice are applied, this should be noticeable in terms of reduced subsequent energy consumption. Several studies found that environmentally conscious attitudes have a positive impact on environmentally conscious behavior (Bissing-Olson et al., 2013; Clark et al., 2003; López-Mosquera et al., 2015; Meinhold & Malkus, 2005; Zhang et al., 2020). Other research suggests that environmentally conscious attitudes do not necessarily lead to environmentally conscious behavior (Moser, 2015; Prati et al., 2017), which is also known as the attitude-behavior gap (Peattie, 2010). Diekmann and Preisendörfer (2003) provide an explanation for this gap by arguing that costs are an often forgotten factor in attitude-behavior research and can help to explain the variation in correlations between attitude and behavior. Diekmann and Preisendörfer (2003) refer to costs more broadly: both costs in a financial sense and the cost of behavior. Behavioral costs refer to how much effort something takes to do (Hunecke et al., 2001). Examples of high behavioral costs are the considerable time investment and the high cognitive load involved in processing complex information related to energy savings (Huang et al., 2020; Stern, 2011). Individuals are less likely to save energy when behavioral costs are high (Steg, 2008).
Following Diekmann and Preisendörfer's (2003) low-cost hypothesis, behavior is only explained by attitudes when acting upon these attitudes causes little cost and inconvenience to the individual. The effect of attitudes on behavior would therefore depend on the cost intensity of the situation: the higher the cost in behavior or money, the less people act on the attitude they have. According to the low-cost hypothesis, individuals should save energy if behavioral costs are reduced, as individuals with environmentally conscious attitudes are then more likely to act according to their values. Assuming that many individuals who make the effort of signing up to get help with making their home more sustainable and energy efficient have strong pro-environmental attitudes, the information, tips, and tools provided by an energy coach should lead to lower energy consumption.
Energy coaches analyzed in this study are trained by the housing cooperation !WOON that provides their energy coaches with an application on a tablet to register all characteristics relevant for energy consumption of the home they are visiting, to provide detailed technical energy efficiency advice. The energy coach is also allowed to give away products worth up to 20 euros that help saving energy like LED lamps and radiator foil. An energy coach could therefore be expected to be effective in reducing energy consumption by lowering behavioral costs, through providing residents with energy saving information. Specifically, we expect the associated decrease in time and effort needed to figure out how to make their house more sustainable as well as the associated aid in overcoming barriers, such as how feedback from tools like a smart meter can be applied to their specific home situation, to lead to more action to reduce energy consumption (Geelen et al., 2019;Huang et al., 2020).
Another crucial contributor towards behavior change and decision-making is the interaction with other people in one's surrounding. Social norm theory explains this phenomenon where people are influenced by others (Parece et al., 2013). People look to other people to determine which behavior is acceptable and not acceptable and which behavior is most frequently displayed, to align their own behavior with it (Parece et al., 2013). Cialdini and Trost (1998, p. 152). If someone is the only one in a neighborhood without solar panels on the roof, they can be socially excluded (Voss, 2001), and therefore, try to avoid this by purchasing solar panels like the others. Normative social influence has been shown to cause substantial behavior changes in energy conservation when compared to other information to conserve energy, even though individuals themselves indicate that normative information is not an important driver of their own behavior (Nolan et al., 2008).
Given that energy coaches lower behavioral costs for people to lower their energy consumption as explained above and can provide a social comparison indicating that similar residents are consuming less energy, we hypothesize that a visit by an energy coach is associated with residents reducing their energy consumption: H1a: A visit by an energy coach is associated with residents reducing their gas consumption. H1b: A visit by an energy coach is associated with residents reducing their electricity consumption.
As part of the social influence dynamic, we also need to consider the so-called boomerang effect. This effect is an unintended consequence of a descriptive standard, in which individuals who are below the norm adapt to the standard of comparable individuals and thereby consume more energy (Rasul & Hollywood, 2012). This boomerang effect has been found in several studies. For example, Schultz et al. (2007) found that it matters to whom information is given about descriptive standards. Giving descriptive standards to individuals with high energy consumption compared to average energy consumption led to a decrease in energy consumption. The opposite was found for individuals with lower energy consumption compared to the average. These individuals started to consume more energy (Schultz et al., 2007). Similar results have been found by Buchanan et al. (2015), where individuals started to consume more energy feeling free to meet the social norm, after seeing feedback on a display in the home that they consumed less energy compared to others.
The !WOON energy coaches we investigate provide the residents they visit with such a descriptive social standard, giving the resident a so called frame of reference of what is "normal" (Handgraaf et al., 2013). More specifically, the energy coach fills in all characteristics of the house and the consumption data of the resident, and then, the tablet indicates whether the occupant consumes more or less than similar other households. The residents are therefore made aware whether they consume more or less energy compared to the norm, by the information of the tablet analysis and the assessment of the coach. Following this reasoning, we compare the group of residents that received the positive social comparison with those residents that received the negative social comparison. We predict the following: H2a: Households that are told they are consuming more gas than a comparable household by an energy coach reduce their gas consumption, but households that are told they are consuming less gas than a comparable household will increase their gas consumption. H2b: Households that are told they are consuming more electricity than a comparable household by an energy coach reduce their electricity consumption, but households that are told they are consuming less electricity than a comparable household will increase their electricity consumption.

Procedure
In order to test our hypotheses, we collected data about the energy coach project from the independent non-profit housing-foundation !WOON, which had organized 3888 energy coach home visits between 2017 and 2019. Trained energy coaches had been visiting interested households, until the corona crisis in 2020 made home visits less desirable and led to a decline in energy coach visits and prohibited further data collection with a fitting control group. The housing foundation !WOON promoted free visits from an energy coach, through flyers with a response rate of 6-8%, and Facebook and advertisements through the municipalities Amsterdam and Haarlem. After a resident had signed up for an energy coach, a visit was scheduled. The housing foundation !WOON is independent, and they provide information, advice, and services like the energy coach program to all interested residents. The client base therefore is very diverse ranging from home owners, renters up to people looking for housing. However, the households that participated in the energy coach program might be particular in the sense that they were interested in saving energy. As a result, they do not necessarily form a representative sample of Dutch households along dimensions such as energy consumption, income, or any other relevant parameter. Households were selected to get help by an energy coach on a first-come-first-serve basis, without any further selection criteria. During such a visit by an energy coach, the resident would share information about the state of their house and the resident's consumption behavior.
The energy coach entered this information into a tablet that was provided by !WOON. Besides combining all possible savings options into a savings report, expressed in m³ of gas and kWh of electricity that could be saved, the energy coach also provided a comparison of the household's current consumption to that of comparable households in terms of number of inhabitants (electricity) and housing type (gas). For gas, this comparative score was calculated by taking into account the average gas consumption of the applicable housing type, the average gas consumption of the city of the household, and the national average gas consumption (details in the Appendix). For electricity, the comparative score was calculated by using the household size, the average electricity consumption of the city of the household, and the national average electricity consumption (details in the Appendix). Note that a questionnaire was not part of the energy coach visit, and such a questionnaire would probably have made participation more selective. Therefore, we do not have detailed additional data on the visited households, such as other types of behaviors or attitudes.
After the home visit, the energy coach sent the savings report to the household, and when given permission, the tablet data was sent to the foundation !WOON. The new consumption data can be used to make a comparison with the old consumption data. Gas and electricity consumption was measured in the months of January, February, and March of the years 2018 and 2019. As a benchmark, we consider the change in overall Dutch energy consumption that occurred between 2017 and 2018. The average gas consumption increased from 1240 to 1270 m³ in these 2 years, which is an increase of 2.4%, while the electricity use decreased from 2860 to 2790 kWh, which is a decrease of 2.5% (CBS, 2022). In comparison to the other 26 EU member states in the year 2019, the Netherlands is one of the larger energy consumers of the EU with the highest percentage of gas used for space heating at 84.9% and electricity, with renewables and waste only accounting for 2.5% and 8.5% (Eurostat, 2021).

Participants
From the total of 3888 visits the housing cooperation !WOON had organized, we were able to construct a sample of 467 cases. Each case is a household for which either gas or electricity is measured. If we have both gas and electricity measurements, the respective household is included in the sample twice. Figure 1 shows how we arrive at our ultimate sample of 467. For 109 cases (54 gas and 55 electricity), energy consumption in both years was directly derived from smart meters. The reason this number is so low is twofold: many households did not give permission to use smart meter data, and for some that did give permission, the energy company had technical problems reading out the smart meter. For the remaining 358 measurements (165 gas and 193 electricity), old consumption scores were filled in by hand by the energy coach, and new consumption scores were self-reported by the household to !WOON. Our sample of 467 cases combines these 358 cases with the 109 smart meter cases. The 467 cases consist of 219 gas comparisons and 248 electricity comparisons. A weather-corrected analysis is not possible: it is only known that there was 1 year between the old and new consumption measurements, and the specific dates of measurement are missing for the hand-filled data. Gas consumption varies depending on the severity of the winter, and ideally, if the comparison months had been known for more than 54 cases, weather degree days could have been accounted for. We do not have demographic information about the residents and therefore cannot analyze how representative our sample households are of Dutch households in general. We thus do not know how sample selection limits representation of the wider population of Dutch households. For the social comparison hypotheses, we need to consider the household size for the electricity measurements and the type of house for the gas measurements, so that we are able to check what sort of social comparison the households received.
The Appendix shows the formulas provided by !WOON that the energy coach app used to calculate the comparison scores, which told a resident whether their gas or electricity consumption was higher or lower than that of comparable consumers. The information needed for the social comparison on electricity consumption is missing for 53 households. So, for these analyses, we only have 248 − 53 = 195 cases. We know that of these 195 cases, most are apartments (161, 82.6%), followed by 18 (9.2%) corner houses, 10 (5.1%) terraced houses, and 1 (0.5%) detached house. Of these 195 cases, 104 (53.3%) households consisted of a single person, followed by 58 (29.7%) two-person households, 24 (12.3%) three-person households, 6 (3.1%) four-person households, and 3 (1.5%) households with five inhabitants. For the gas measurements, we do not know household size, but only the type of house. Of the 219 cases, 153 (69.9%) are apartments, followed by 40 (18.3%) terraced houses, 24 (11.0%) corner houses, and 2 (0.9%) detached houses. The "Results" section covers both the smart meter and hand-filled cases. An overview of the descriptive statistics can be seen in Table 1. Table 1 shows that the reduction in gas consumption is 8.4%, while the reduction in electricity consumption is 6.3%. This decrease in gas consumption was realized even though the average household gas consumption in the Netherlands increased by 2.4% between 2017 and 2018, as mentioned above (CBS, 2022). The 6.3% decrease in electricity consumption that we observe is substantially larger than the 2.5% decrease in electricity consumption that occurred in the Netherlands between 2017 and 2018. The data also show that the participating households used less energy than average Dutch households in those years, which indicates some selectivity in the sample relative to the average Dutch household.
The minimum value of − 14,329.2 kWh for new electricity consumption indicates that there are households with solar panels who ended up with negative net consumption as they gave back more to the electricity grid than they consumed.

Analytical approach
The energy consumption data are not normally distributed: the old and new gas consumption scores show right-skewed distributions with some outliers having high gas consumption scores. The old and new electricity consumption scores also show right-skewed distributions, with some outliers with high electricity consumption scores before and after the visit of an energy coach. There are also some cases of negative electricity consumption, presumably due to solar panels feeding electricity to the grid. After log transformation (where we deal with negative values using f(x) = sign(x) · ln(|x|)), our distributions of energy readings still exhibit significant deviations from normality. We therefore conducted non-parametric Wilcoxon signed-rank tests in addition to t-tests. We report both in order to show that our results mostly do not depend on the choice of test. Specifically, for testing the first hypothesis, we perform non-parametric Wilcoxon signed-rank tests and one-sided paired-sample t-tests comparing gas and electricity consumption before and after the energy coach visit. We provide 90% confidence intervals in line with the α = 0.05 one-sided tests. For both the first and second hypotheses, we conduct one-sided tests, following our theory, which provides one-sided predictions. This implies that we treat small, non-significant differences in the expected direction and large differences in the unexpected direction alike, as a lack of evidence for our theoretical predictions. For testing the second hypothesis on the boomerang effect, we perform the same tests separately for the two groups (households told that they consume more and households told that they consume less than comparable households), comparing gas and electricity consumption before and after the energy coach visit. All results of the tests can be found in Table 2.
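As an illustration of the quantities described above, the following minimal Python sketch implements the signed log transformation and the summary statistics of a paired comparison (mean difference, paired t statistic, and a paired Cohen's d). The readings are hypothetical and are not the study's data, the function names are ours, and mapping a reading of exactly zero to zero is an assumption, since the formula is undefined there. p-values and the Wilcoxon signed-rank test would require a statistics library and are omitted.

```python
import math
from statistics import mean, stdev


def signed_log(x: float) -> float:
    """f(x) = sign(x) * ln(|x|); a reading of exactly 0 is mapped to 0 (assumption)."""
    if x == 0:
        return 0.0
    sign = 1.0 if x > 0 else -1.0
    # Note: for 0 < |x| < 1 the logarithm is negative, so the sign of the result flips.
    return sign * math.log(abs(x))


def paired_summary(old, new):
    """Mean difference, paired t statistic, degrees of freedom, and paired Cohen's d."""
    diffs = [n - o for o, n in zip(old, new)]
    n = len(diffs)
    delta_m = mean(diffs)
    sd = stdev(diffs)                      # sample SD of the paired differences
    t = delta_m / (sd / math.sqrt(n))      # paired-sample t statistic
    d_z = delta_m / sd                     # identical to t / sqrt(n)
    return {"delta_m": delta_m, "t": t, "df": n - 1, "d_z": d_z}


# Hypothetical electricity readings in kWh (illustration only)
old = [2500.0, 3100.0, 1800.0, 4200.0, 2900.0]
new = [2300.0, 3150.0, 1700.0, 3900.0, 2950.0]
result = paired_summary(old, new)
print(result)
print([round(signed_log(v), 3) for v in (2500.0, -14329.2)])
```

Because the paired effect size equals t / √n, reported values can be cross-checked: t(247) = −1.59 with n = 248 gives about −0.101, and t(112) = −5.99 with n = 113 gives about −0.56, which is consistent with the effect sizes reported in the results, although the text does not state which variant of Cohen's d was used.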
We do not have data on installations of solar panels, improved energy efficiency, or other socio-technical changes, such as changes in work or household size, that may have occurred within the 12 months after the visit of the energy coach and could have affected energy usage. We are therefore not able to consider such aspects and have to base our analyses on comparing the imported energy consumption data we have. We focus on a straightforward comparison of energy use before and after the energy coach visit, distinguishing only between above- and below-average users. We do not use the type of house or household size as covariates in our analysis because there are too few cases for drawing type- or size-specific conclusions. Given that we do not have a control group of similar residents and can only refer to the national average consumption scores, we are limited in our causal interpretation of effects based on a comparison of pre- and post-measurement of energy consumption. In the discussion, we go into more depth as to how our analyses may be extended to arrive at firmer conclusions. Table 2 provides an overview of the results that will be discussed below.

Difference between old and new energy consumption after an energy coach visit
New gas consumption (n = 219) 1 year after the visit of an energy coach was on average lower than old gas consumption prior to the visit (ΔM = − 85. New electricity (n = 248) consumption is lower than old electricity consumption (ΔM = − 125.8, CI 90% [− 256.2, 4.6], t(247) = − 1.59, p = 0.056, Z = − 2.43, p = 0.008) so electricity consumption did not significantly decline according to the t test but did significantly decline according to the Wilcoxon test. This provides mixed support for H1b. The decline of 6.5% corresponds to a Cohen's d of − 0.101 suggesting a small effect size.

Social comparison effects
For households with above-average gas consumption, there was a significant decline in the scores between their old gas consumption and new consumption (ΔM = − 225.9, CI 90% [− 288.4, − 163.4], t(112) = − 5.99, p < 0.001, Z = − 6.21, p < 0.001). This indicates that this group of residents, who were told that they were doing worse than comparable other households with regard to gas consumption, reduced their gas consumption. This decline of 15.9% corresponds to a Cohen's d of − 0.56, which indicates a medium-sized effect. For households with below-average gas consumption, there was a marginally significant increase in consumption from their old gas consumption to their new consumption (ΔM = 63.5, CI …). This shows that this group of residents, who were told that they were doing better than the rest with regard to gas consumption, increased their gas consumption. Figure 2 illustrates these differences in gas consumption changes between the two types of households.
Table 2 Comparisons of gas and electricity consumption before and after the visit of an energy coach. # p < 0.10, *p < 0.05, **p < 0.01, ***p < 0.001 (one-sided)

Regression to the mean
The differences in consumption change between above-average and below-average consumers provide tentative support for our social comparison hypotheses 2a and 2b.

Fig. 2 Gas consumption before and after an energy coach visit (gas consumption in m3; below-average and above-average gas consumers)
However, an alternative interpretation of Figs. 2 and 3 is regression to the mean: consumption deviations from the mean in any year may be due to reasons specific to a household in that particular year, e.g., living abroad for half a year, or due to measurement error. In these cases, above-average consumption households would be expected to see their consumption reduced in the following year even without an energy coach visit. Vice versa, below-average households would then be expected to increase their consumption even without an energy coach. To disentangle regression to the mean from a social influence effect, we tested whether the variance in consumption differed before and after the visit of the energy coach. If we assume that adaptations in gas and electricity consumption are occurring just due to a regression to the mean, we would expect the variance both for gas and for electricity to be similar at both time points.
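The reasoning behind this check can be illustrated with a small simulation. The sketch below is not based on the study's data: it assumes hypothetical households whose consumption is a stable level plus independent year-specific noise, with arbitrarily chosen means and standard deviations, and applies no intervention at all. Under these assumptions, the above-average group declines and the below-average group increases purely through regression to the mean, while the overall variance stays roughly the same in both years.

```python
import random
from statistics import mean, pvariance

random.seed(1)  # fixed seed so the sketch is reproducible

N = 5000
# Hypothetical households: a stable consumption level plus year-specific noise.
stable = [random.gauss(1200, 300) for _ in range(N)]
year1 = [s + random.gauss(0, 200) for s in stable]
year2 = [s + random.gauss(0, 200) for s in stable]  # no intervention of any kind

# Split households by whether their year-1 reading was above the year-1 average.
cutoff = mean(year1)
change_above = mean(y2 - y1 for y1, y2 in zip(year1, year2) if y1 > cutoff)
change_below = mean(y2 - y1 for y1, y2 in zip(year1, year2) if y1 <= cutoff)

# Under pure regression to the mean, the spread of consumption is unchanged.
variance_ratio = pvariance(year2) / pvariance(year1)

print(f"above-average group: mean change {change_above:.1f}")
print(f"below-average group: mean change {change_below:.1f}")
print(f"variance ratio (year 2 / year 1): {variance_ratio:.3f}")
```

A social influence effect that pulls households toward the norm would, in contrast, be expected to shrink the spread of consumption after the visit, which is why the comparison of variances is informative.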

Fig. 3 Electricity consumption before and after an energy coach visit. Group label: below-average electricity consumers
We measured the change in the spread of energy use using the interquartile range, the distance between the 25th and 75th percentiles. The interquartile range, which can be seen in Table 1, decreased for both gas (−126 m3) and electricity (−148 kWh). We then conducted Levene's tests for equality of variances. The first Levene's test failed to reject the hypothesis that the variances of the 219 gas consumption measurements before the visit of an energy coach and the 219 gas consumption measurements after the visit are equal (F(1,436) = 2.42, p = 0.121). For the Levene's test for electricity, we left out the two outliers with negative values for the new consumption; including these values would further increase the variance in new consumption. Again, the Levene's test failed to reject the null hypothesis that the variances of the 246 electricity consumption measurements before the visit of an energy coach and the 246 electricity consumption measurements after the visit are equal (F(1,490) = 1.08, p = 0.299).
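The two spread measures used above can be sketched as follows. This is an illustration only: the text does not state which variant of Levene's test was used, so the mean-centered form is an assumption here (passing `center=statistics.median` gives the Brown-Forsythe variant), and the function names and data are hypothetical. The sketch returns the test statistic and its degrees of freedom; it does not compute a p value.

```python
import statistics

def interquartile_range(values):
    """Distance between the 25th and 75th percentiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return q3 - q1

def levene_statistic(groups, center=statistics.mean):
    """Levene's W statistic with its degrees of freedom (df1, df2).

    W is a one-way ANOVA F statistic computed on the absolute
    deviations of each observation from its group's center.
    """
    deviations = []
    for group in groups:
        c = center(group)
        deviations.append([abs(x - c) for x in group])

    k = len(deviations)
    n_total = sum(len(d) for d in deviations)
    grand_mean = sum(sum(d) for d in deviations) / n_total
    group_means = [statistics.mean(d) for d in deviations]

    between = sum(len(d) * (m - grand_mean) ** 2
                  for d, m in zip(deviations, group_means))
    within = sum((v - m) ** 2
                 for d, m in zip(deviations, group_means) for v in d)

    df1, df2 = k - 1, n_total - k
    return (between / df1) / (within / df2), df1, df2

# Hypothetical before/after measurements for 219 households.
before = [1000 + 5 * i for i in range(219)]
after = [1000 + 4 * i for i in range(219)]
w, df1, df2 = levene_statistic([before, after])
print(interquartile_range(before), interquartile_range(after), w, df1, df2)
```

With two groups of 219 measurements, the degrees of freedom are 1 and 436, matching the F(1,436) reported for gas.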

Discussion
Our results, summarized in Table 2, indicate that energy coach visits are associated with lower energy consumption. By comparing the electricity and gas consumption of households before the visit of a local energy coach with their consumption 1 year later, we find that both gas and electricity consumption decreased substantially and significantly, with the exception of electricity consumption when evaluated using a t test. One year after the visit of an energy coach, gas consumption had declined by 8.4% and electricity consumption by 6.3%. These findings support hypotheses 1a and 1b. These changes for this select group of people who signed up for an energy coach session are considerably larger than the changes in the average Dutch population (CBS, 2022). The somewhat clearer decreases in gas consumption vis-à-vis electricity consumption might be attributed to financial incentives leading households to focus more on changing their gas consumption, as this can help save hundreds of euros annually in the Netherlands, compared to rather low electricity bills.
Table 2 further shows that residents who were told that they were doing worse and consuming more energy than comparable households reduced their gas and electricity consumption. In contrast, residents who were told that they were doing better than similar others with regard to gas and electricity consumption did not show a decrease in energy consumption. For gas consumption, we even found a small increase in consumption after households heard they used less than comparable households. These findings support hypotheses 2a and 2b and the theory that social comparisons with worse-performing others lead to boomerang effects. The support is very tentative, however, as we could not rule out that increases in consumption by households who were initially below average were simply an artifact stemming from the statistical tendency for observations to revert to the mean. Our analysis comparing the variance in consumption among households before and after the visit by an energy coach failed to clearly identify social comparison effects above and beyond regression to the mean.

Conclusion
Our study indicates that the visits of energy coaches are associated with a decrease in energy use. As expected, we find different changes in energy consumption depending on the social comparison information that residents received. However, our evidence is tentative, and further study is required to assess the effectiveness of energy coaches at helping households decrease their energy consumption. Several limitations in particular should be considered and taken up as challenges to be addressed in future research. The first is the small amount of usable data. Due to the corona crisis, a second wave of data collection could not take place, as the energy coach project relies on home visits to provide the personal energy consumption advice.
A second limitation is the lack of a good control group. Our use of national averages provides a poor benchmark. We cannot rule out that comparable households that were not visited by an energy coach also saw declines in energy consumption. A control group in the form of households that do not receive guidance from an energy coach would be a major improvement for future research, because self-selection into participation might lead to an overestimation of the effects. For example, people might have signed up for a visit by an energy coach in anticipation of changing their energy consumption. Such individuals may have used less energy for heating even if the energy coach had not provided them with guidance. However, this would not explain consumption changes that depend on the social comparison information provided by the energy coach: those interested in learning how to save more energy through the visit of an energy coach did not achieve the intended improvements if the coach told them they were already doing better than average.
Third, we drew heavily on self-reported energy consumption data, which raises reliability concerns. Using only smart meter data would be another improvement, as such data provide reliable information on energy use over time and make it possible to correct for weighted heating degree days.
A fourth limitation is our inability to correct for weighted heating degree days between the years of measurement. Nonetheless, as mentioned previously, the differences in gas and electricity consumption between 2017 and 2018 were relatively small (CBS, 2022), and since natural gas consumption in the Netherlands was fairly stable in 2017, 2018, and 2019 (CEIC, 2021), we expect that weather differences did not affect our results substantially.
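A degree-day correction of the kind mentioned above could look like the following. This is a minimal sketch, not a procedure used in the study: the 18 °C base temperature, the seasonal weights, and the function names are assumptions chosen for illustration, and scaling total gas use treats all gas as heating-related, which overstates the correction for households that also use gas for cooking and hot water.

```python
# Assumed seasonal weights (hypothetical defaults, not from the study):
# 1.1 for November-February, 1.0 for March and October,
# 0.8 for April-September.
ASSUMED_MONTH_WEIGHTS = {
    1: 1.1, 2: 1.1, 3: 1.0, 4: 0.8, 5: 0.8, 6: 0.8,
    7: 0.8, 8: 0.8, 9: 0.8, 10: 1.0, 11: 1.1, 12: 1.1,
}

def weighted_degree_days(daily_temps, base_temp=18.0,
                         weights=ASSUMED_MONTH_WEIGHTS):
    """Sum weighted heating degree days.

    daily_temps: iterable of (month, mean_temperature_celsius) pairs.
    Days warmer than base_temp contribute zero.
    """
    return sum(weights[month] * max(0.0, base_temp - temp)
               for month, temp in daily_temps)

def normalize_gas_use(gas_m3, actual_degree_days, reference_degree_days):
    """Rescale gas use from the observed weather to a reference year."""
    return gas_m3 * reference_degree_days / actual_degree_days

# Hypothetical three-day example: a cold January day, a warm July day,
# and a mild April day.
sample_days = [(1, 8.0), (7, 20.0), (4, 13.0)]
sample_degree_days = weighted_degree_days(sample_days)
normalized = normalize_gas_use(1500, actual_degree_days=3000,
                               reference_degree_days=2700)
print(sample_degree_days, normalized)
```

Comparing normalized rather than raw consumption across years would separate behavioral changes from a milder or colder winter.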
One avenue for future research is to change the reference point in social comparisons, that is, to change which peer or standard a household is compared with. Comparing households to the most sustainable neighbor rather than the average neighbor, or to those that have reached municipal sustainability goals, could possibly incentivize more energy savings and might also motivate below-average energy consumers to save even more.
Another possible direction for future research is to explore household-type and size-specific effects. This will require more comprehensive data, but would allow researchers to include covariates and investigate whether effects of an energy coach visit differ depending on household size or type of house. Similarly, future studies could conduct comparisons with energy programs in other EU states.
Future research testing social comparison hypotheses will likely need a larger sample of participants to allow differentiation from regression-to-the-mean effects. Assessing the effect of being told one is above-average instead of below-average in consumption requires excluding the possibility that changes in energy consumption consistent with social comparison theory are actually just due to random variation, which leads scores that are initially above average to naturally decline and scores that are initially below average to go up. One approach that requires sufficient numbers of cases near the threshold of average consumption is a regression-discontinuity design, in which the discontinuous effect of a social comparison treatment is separated from the continuous effect of pre-treatment consumption on post-treatment consumption (Allcott, 2011).
Given the current energy crisis, it seems important to also look into the problem of energy poverty. We do not have any data on the financial situations of the households that participated in the energy coach program; however, future research may be able to investigate whether normative comparisons could also lead to desirable increases in energy consumption. Households that are currently underheating their homes could be prompted through social comparisons to increase their heating to avoid the health risks associated with underheating.
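The regression-discontinuity idea can be sketched on simulated data. This is an illustration under stated assumptions, not an analysis from the study: the data are simulated with a built-in jump of −100 at the threshold, a single common cutoff is assumed (in practice each household's benchmark may differ, so the running variable would be consumption minus its own benchmark), and the names `fit_line` and `rdd_jump` are hypothetical. Straight lines are fitted separately on each side of the cutoff, and the treatment effect is estimated as the gap between the two fitted values at the cutoff.

```python
import random
import statistics

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - slope * mx, slope

def rdd_jump(pre, post, cutoff, bandwidth):
    """Estimate the discontinuity in post-treatment consumption at cutoff.

    Households above the cutoff are treated as having been told they are
    above average. Only cases within `bandwidth` of the cutoff are used.
    """
    pairs = list(zip(pre, post))
    left = [(x - cutoff, y) for x, y in pairs
            if cutoff - bandwidth <= x <= cutoff]
    right = [(x - cutoff, y) for x, y in pairs
             if cutoff < x <= cutoff + bandwidth]
    intercept_left, _ = fit_line(*zip(*left))
    intercept_right, _ = fit_line(*zip(*right))
    return intercept_right - intercept_left

# Simulated data (hypothetical): consumption carries over smoothly from
# year to year, plus a jump of -100 for households above the cutoff.
rng = random.Random(7)
cutoff = 1400.0
pre = [rng.uniform(800, 2000) for _ in range(4000)]
post = [200 + 0.8 * x + rng.gauss(0, 50) - (100 if x > cutoff else 0)
        for x in pre]

estimated_jump = rdd_jump(pre, post, cutoff, bandwidth=300)
print(estimated_jump)
```

Because regression to the mean changes smoothly with pre-treatment consumption, it is absorbed by the fitted lines, whereas a social comparison effect would appear as a jump at the threshold. The need for many cases close to the cutoff is why this design requires a larger sample.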

Declarations
Ethical approval We obtained approval and a waiver of informed consent from the Ethics Committee of the Faculty of Social and Behavioral Sciences of Utrecht University. The approval is filed under number 22-0215.