The effectiveness of building retrofits under a subsidy scheme: Empirical evidence from Switzerland



Introduction
Buildings-related emissions reached a record high of 10 GtCO2e in 2019, accounting for 28% of global energy-related emissions (IEA, 2020b). Even though the sectoral energy intensity has declined since 2010, floor area growth has offset this progress, resulting in a de facto rise in emissions (IEA, 2020b). Space heating accounts for the majority of building energy emissions, with the remaining share coming from water heating, appliances, and lighting (EIA, 2021; Eurostat, 2022).
Energy retrofits are essential to reduce building energy consumption, as a large share of 2050's building stock is already standing today due to the long lifetimes of the built environment (IEA, 2020a; Ürge-Vorsatz et al., 2020). Therefore, the highest mitigation potential in developed countries can be achieved through retrofitting existing buildings, as comprehensive retrofits can reduce heating requirements by 30-60% (IPCC, 2022; Ürge-Vorsatz et al., 2020). Moreover, thermal retrofits can offer advantages beyond energy savings, such as higher comfort, energy security, or increasing housing value (Du et al., 2022; Gillingham et al., 2021). However, retrofits are currently not deployed to the extent required to reach climate goals. The impact of retrofits on building emissions is determined by both the retrofit rate (the number of buildings that get retrofitted annually) and the retrofit depth (the energy savings that are achieved by retrofitting). The current annual energy renovation rate is about 1-2% globally, with renovations reducing energy intensity by less than 15% on average (IEA, 2020a). To adhere to the sustainable development scenario of the IEA, the retrofit rate and depth need to increase to reduce energy intensity by 30-50% in 2030 (IEA, 2020a).
Ex-ante engineering studies show that many energy efficiency measures, such as retrofitting, have a positive net present value and consumers should thus be willing to implement them, even without incentives. The gap between the energy reduction that could be achieved by cost-effective measures and their lack of uptake has been coined the 'energy efficiency gap' or 'energy efficiency paradox' (Jaffe and Stavins, 1994). Many barriers to retrofits that could explain the energy efficiency paradox have been suggested in the literature, such as split incentives between owners and occupiers, high upfront transaction costs, uncertainty about the energy savings or pay-back period, or risk aversion (Du et al., 2022; Yu et al., 2021). However, studies have also found that ex-ante engineering studies often overestimate the energy savings that can be achieved through retrofits, the so-called performance gap (Cozza et al., 2020). Due to the performance gap, the financial benefits of retrofits based on engineering studies are overestimated, thus suggesting that the energy efficiency paradox is not as dominant as [...]
[...] subsidies partially funded retrofits. They found through a regression analysis that the energy consumption decreased by 3-4% after the retrofit. As they did not have a control group of untreated buildings, they could not identify whether these effects could be attributed solely to the retrofits. Coyne and Denny (2021) and Scheer et al. (2013) analyzed an energy efficiency scheme in Ireland. They both included a control group of non-retrofitted buildings. The dataset from Scheer et al. (2013) consisted of 210 buildings and was based on a combination of actual measurements and an algorithm based on these measurements. They compared the energy consumption before and after the program duration for the treatment and control group and found through a t-test that a 21% reduction in gas consumption was achieved through the energy efficiency upgrades.
Coyne and Denny (2021) used a difference-in-differences (DiD) approach to analyze and compare the bimonthly energy consumption of 1300 buildings that underwent a subsidized retrofit to 6700 buildings that were not retrofitted. They found that the effect of the program was an average reduction in energy consumption of 9% per year. When splitting the model for specific retrofit measures, they found that external wall insulation reduced energy consumption by 23%, whereas the impacts of cavity wall and roof insulation were not statistically significant.
The effectiveness of the subsidy is also strongly affected by the free-rider effect, which occurs when beneficiaries of the subsidy would have also implemented the project in the absence of the subsidy, thus costing money without achieving additional savings (Bertoldi et al., 2021; Labandeira et al., 2020). A review of different energy efficiency policies by Labandeira et al. (2020) found that free riders were responsible for 15-40% of achieved energy reductions, thus significantly reducing the policy effectiveness. Other studies have reported spill-over effects that could partially offset the free-rider effect (Rosenow and Galvin, 2013). The size of the free-rider effect greatly depends on the policy design, particularly on eligibility and the percentage of costs that are covered by the subsidy.
Several studies have estimated the free-rider effect for retrofitting policies. Dolšak et al. (2020) and Nauleau (2014) used survey data to evaluate whether the retrofitting rate increased under the policy, thus estimating which share of retrofits is additional. Dolšak et al. (2020) found the retrofitting subsidy program in Slovenia to be ineffective in the first three years, implying the presence of free riders. They found the policy to be effective for the following three years, for which they identified no free-riding in the first year and a free-rider effect of 52% and 62% in the two years that followed. Nauleau (2014) identified annual rates of free-riding between 85% and 61% for insulation measures, with the highest rate at the beginning of the tax credit policy in France. Grösche and Vance (2009) used survey data to determine the willingness to pay for retrofits to estimate the free-rider effect. They found that the calculated willingness to pay was higher than the costs of retrofitting for 50% of households in Western Germany, suggesting that a high level of free-riding would have taken place if a subsidy were present.
Few studies have linked the effectiveness of retrofits to costs, with varying methods. Coyne and Denny (2021) calculated the actual value for money for the household by scaling the theoretical value for money by the ratio between actual and theoretical change in energy consumption, finding an average actual value for money of around 40 €/kWh/m2/year for the households. Allcott and Greenstone (2017) evaluated the benefit-cost ratio of the program through an accounting approach and found a ratio of 0.92 with an internal rate of return of −4.1%, thus finding that the costs outweighed the benefits.
In this study, we conduct a three-step ex-post empirical assessment with longitudinal energy data to evaluate the effectiveness of retrofit subsidies for insulation within a Swiss program that partially funds the retrofits. We created a unique dataset consisting of quarterly energy data for 432 buildings over a period of 11 years from 2010 to 2020 (19,008 observations), of which 46 buildings received at least one subsidy to insulate the building, and 15 buildings were insulated without receiving a subsidy. As a first step, we investigate to what extent retrofits reduced energy consumption. Second, we assess if subsidized retrofits led to larger energy savings than non-subsidized retrofits. Last, we evaluate if higher subsidies lead to higher reductions by linking the achieved reduction and the subsidy amount, thus determining the marginal cost-effectiveness of each subsidy.
With this dataset and the three-step approach, we contribute to the literature in several ways. First, it has one of the longest observation periods of an ex-post study, thus enabling the control for temporal influences on the rebound effect, such as behavioral change after the retrofit. Second, we use a DiD approach to mitigate the effects of unrelated building trends, which so far has only been done by one other study (Coyne and Denny, 2021). Furthermore, in step 2, we compare the subsidized retrofits not only to buildings that were not retrofitted, but also to non-subsidized retrofits within the same sample, which, to the best of the authors' knowledge, has not been done before. This contributes to the existing free-rider literature, as we hypothesize that subsidies could prompt free riders to retrofit deeper than they would have done without the subsidy, thus making them only "partial" free riders. This effect would not become apparent when studying free-riding based solely on the retrofitting rate. Lastly, we calculate the marginal cost-effectiveness of each subsidy from the perspective of the policymaker. This approach, evaluating the costs of an individual policy, offers a new perspective on the cost-effectiveness of retrofit subsidies.
Section 2 provides the context for the case study by presenting the current state of energy consumption in buildings in Switzerland and detailing the subsidy program. Section 3 presents the method, with the research design in 3.1, descriptive statistics of the dataset in 3.2, and the statistical methods in 3.3. We show and discuss the results in section 4, and provide conclusions and policy recommendations in section 5.

Building energy efficiency in Switzerland
Buildings are responsible for a quarter of total greenhouse gas emissions and 44% of total energy consumption in Switzerland (BAFU, 2021; Kemmler et al., 2021). Buildings, therefore, play a large role in the Swiss Energy Strategy 2050, which was developed by the Swiss Federal Council as a holistic plan to reduce emissions related to energy (Prognos, 2012). Space heating is of particular importance for the strategy, as it accounts for 70% of total building energy consumption (Kemmler et al., 2021).
Under the Swiss Net-Zero base scenario, the average energy consumption per building area needs to be reduced by 55% in 2050 compared to 2010 (BFE, 2022). Switzerland has set an intermediate target to reduce building-related emissions by 50% on average in 2026 and 2027 compared to 1990 (BFE, BAFU, 2021). To reach these targets, the cantons enact building standards for new and renovated buildings as specified in the model regulations of the cantons in the energy sector ("Mustervorschriften der Kantone im Energiebereich", MuKEn) (EnDK, 2018). Moreover, the federal government provides up to 450 million CHF per year to the building program ("Das Gebäudeprogramm"), which aims to reduce the emissions of the Swiss building stock (Ecoplan, 2017). Part of the building program is the incentivization of energy retrofits, which is essential as the retrofit rate in Switzerland is only 1% (Jakob et al., 2014).

The building program
Through the building program, Switzerland has paid 2.7 billion CHF in subsidies since 2010, resulting in estimated cumulated savings of 72 billion kWh and over 17.8 Mt CO2 (Infras, 2022). As part of this program, retrofits are partially subsidized. In 2021, Switzerland spent 361 million CHF on subsidies for the federal and cantonal building program (Infras, 2022). The program is organized on a cantonal level, where cantons design their programs based on a harmonized funding model where some subsidies have to be offered by all cantons ("Harmonisiertes Fördermodell", HFM). During the period we investigate, from 2010 to 2020, the HFM 2009 was in effect (Kessler and Moret, 2012). We evaluate the subsidy scheme in canton St. Gallen, located in eastern Switzerland. The buildings in our case study are all connected to a waste-powered district heating network in two neighboring towns in St. Gallen. Because all buildings are part of the same network, they can be compared without considering energy carrier efficiencies.
Under the building program, building owners can apply for a variety of subsidies. We consider two subsidies for retrofits that reduce energy demand: renovation in extensive steps and individual insulation measures. From 2000 to 2016, the subsidies were provided by the national government and only a subsidy for insulating individual components was available. From 2017 onwards, the subsidies were provided by the cantons, which also included a subsidy for extensive renovation. The subsidy amount and energy reduction requirements are shown in Table 1. The individual measures can address the façade, roof, windows, or floor. It is possible to combine multiple individual insulation measures, but the two subsidy types cannot be combined. The subsidies cover part of the retrofit costs, about 10-20% with a maximum of 50%. Building owners apply for the subsidy before they start the retrofit. They can start the retrofit after their application is approved, and they will receive the subsidy after the retrofit is finished.

Methods
In this section, we explain the employed methodology. In section 3.1, the three steps of the research design are explained with the corresponding hypotheses. The dataset is described in section 3.2, and the different statistical methods are discussed in section 3.3.

Research design
The first research question asks whether retrofits reduced energy consumption under the subsidy scheme. In line with results from earlier studies, equation (1) shows the null hypothesis that household energy consumption after retrofit is lower than before retrofit.
The second research question examines whether subsidized retrofits resulted in a stronger reduction in energy consumption than non-subsidized retrofits. This expands on the approach to identify free-riding by evaluating the retrofit rate. We hypothesize that some subsidy recipients who would have retrofitted their house without a subsidy, and are thus free riders, would have retrofitted to a lesser depth without the subsidy due to the stronger energy reduction requirements set in the subsidy. The potential energy reduction from the retrofit would thus not be fully attributed to free-riding, and this could be seen as "partial" free-riding. Due to the small number of non-subsidized retrofits in our sample (15), the results need to be interpreted with caution. The null hypothesis, shown in equation (2), depicts the expectation that subsidized retrofits result in a higher reduction in energy consumption than non-subsidized retrofits, because the required retrofit depth for subsidies is higher than demanded through the building standards.
Note (Table 1). SFH = Single family home; MFH = Multi-family home; COM = Commercial; Q_h = Heating demand ("Heizwärme"); Q_eh = Heating energy demand ("Heizenergiebedarfs"); U-value = thermal transmittance coefficient ("Wärmedurchgangskoeffizenten"). a Each subsidy is capped at 100,000 CHF or 50% of investment costs. b Amount is based on projected energy savings.
The third research question investigates the effect of the subsidy amount on the change in energy consumption. This quantifies the marginal cost-benefit ratio of a subsidy. The null hypothesis states the expectation that the change in energy consumption is higher for a building where a larger subsidy was received (equation (3)).
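The three hypotheses can be written compactly as follows. This is only a sketch in assumed notation, not the authors' equations (1) to (3): ΔY denotes the relative change in energy consumption (negative values are reductions), the superscripts SR and NR denote subsidized and non-subsidized retrofits, and S_i is a hypothetical symbol for the subsidy amount received by building i.

```latex
\begin{align}
H_1 &: \; EC_i^{PostRet} < EC_i^{PreRet} \\
H_2 &: \; \Delta Y^{SR} < \Delta Y^{NR} \\
H_3 &: \; \frac{\partial \Delta Y_i}{\partial S_i} < 0
\end{align}
```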

Dataset description
We aggregated data from five different sources into one final dataset, as shown in Table 2. The energy data was shared by the waste-to-energy plant that provides energy to the district heating network through metered connections. Data on the size and timing of the subsidies were provided by canton St. Gallen. The two municipalities in which all buildings are located provided the public building permit database. Building owners are required to request a permit or notify the municipality for any large construction work that is done, and energy retrofits should therefore be present in the permit database. Buildings for which no evidence of an energy retrofit was found in either the subsidy or the permit database were assumed to not have undergone energy retrofits during this time. Buildings that received a permit during this period for other renovations that could affect energy consumption, such as expansions, were omitted from the dataset. Building features such as building type and construction year were sourced from the Swiss Building and Dwelling Registry ("Eidgenössische Gebäude-und Wohnungsregister", GWR). Lastly, weather data was sourced from MeteoSwiss.
After aggregating the datasets, we balanced the dataset by omitting the buildings that did not have data over the entire period. The aggregated dataset consists of a balanced panel of 432 buildings with quarterly readings of the energy data from 2010 to 2020 (19,008 quarterly readings). The occurrence of the retrofits is shown over time in Fig. 1.
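The balancing step can be illustrated with a minimal sketch. The data layout (a dictionary of readings per building keyed by year and quarter) and the function name `balance_panel` are assumptions made for illustration and are not taken from the study's data pipeline.

```python
# Quarterly periods from 2010 to 2020 inclusive: 11 years x 4 quarters = 44.
YEARS = range(2010, 2021)
QUARTERS = (1, 2, 3, 4)
PERIODS = [(year, quarter) for year in YEARS for quarter in QUARTERS]


def balance_panel(readings):
    """Keep only buildings that have a reading for every period.

    `readings` maps a building id to a dict {(year, quarter): kWh}.
    This layout is hypothetical and chosen only for the example.
    """
    return {
        building: series
        for building, series in readings.items()
        if all(series.get(period) is not None for period in PERIODS)
    }


# Consistency check of the reported panel size: 432 buildings x 44 quarters.
n_observations = 432 * len(PERIODS)
```

With 432 buildings retained, `n_observations` matches the 19,008 quarterly readings reported above.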
The sample is split into three categories: buildings that received a subsidy for a retrofit (SR), buildings that retrofitted without receiving a subsidy (NR), and buildings that did not retrofit, the control (CL). The characteristics of the different samples are shown in Table 3, together with a comparison of the national building stock in Switzerland. The building characteristics are further explained in the next section. Compared to the national building stock, the median annual energy consumption per area is lower in the sample. There is also an overrepresentation of residential buildings. The retrofitted buildings in the sample are overall much larger than the control group, suggesting a self-selection of larger buildings into the program. However, the median annual energy consumption per area is constant across the sub-samples, which is an important identifier of the retrofit potential.
We also consider the effect of "retrofit comprehensiveness", which we define as the number of building elements that are retrofitted, in line with Collins and Curtis (2016). Comprehensiveness is one attribute of retrofit depth, where retrofit depth is defined as the extent to which the retrofit resulted in energy savings. To evaluate the effect of comprehensiveness, we categorize the different retrofits into three different categories: single measure, two measures, or extensive retrofit. A retrofit is categorized as single or two measures if, respectively, 1 or 2 building elements out of the façade, roof, windows, or floors were renovated. A retrofit was categorized as extensive if at least 3 elements were renovated, or the subsidy was categorized as extensive. Not all entries in the permit database provided the elements that were renovated or another indication of the retrofit comprehensiveness. In this case, the retrofits were categorized as 'unknown'.
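The categorization rule described above can be sketched as a small function. The element labels, the function name, and the `subsidy_is_extensive` flag are hypothetical names chosen for illustration; the study's actual coding of the permit and subsidy records may differ.

```python
# The four building elements named in the text (labels are assumed spellings).
ELEMENTS = {"facade", "roof", "windows", "floor"}


def comprehensiveness(elements, subsidy_is_extensive=False):
    """Classify a retrofit by the number of renovated building elements.

    `elements` is an iterable of element labels, or None when the permit
    entry gives no indication of what was renovated.
    """
    if subsidy_is_extensive:
        # A subsidy of the extensive type overrides the element count.
        return "extensive"
    if elements is None:
        return "unknown"
    count = len(set(elements) & ELEMENTS)
    if count >= 3:
        return "extensive"
    if count == 2:
        return "two measures"
    if count == 1:
        return "single measure"
    return "unknown"
```

For example, `comprehensiveness(["roof", "windows"])` falls in the two-measure category, while a record without element information and without an extensive subsidy is classified as unknown.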

Dependent variable and covariates
Throughout the analyses, the dependent variable is based on the energy consumption and specified for each analysis in section 3.3. The energy consumption is measured by the district heating network provider. The consumption includes space heating and can include water heating, depending on the building's water heating system, but does not include electricity consumption. When required for the analysis, an inverse hyperbolic sine (IHS) transformation was applied to the energy consumption as the dependent variable, as specified for each analysis in section 3.3. Other variables included in the analyses are heating degree days (HDD), energy reference area (ERA), building type, and construction period. Depending on the analysis, HDD and ERA are used to correct the energy consumption or included in the model as covariates. Building type and construction period are only included as covariates. Table 3 shows how the sub-samples compare for the building characteristics. Further details on the calculations of all covariates are described in Appendix C.
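As a brief illustration of the IHS transformation, the sketch below uses the standard definition asinh(x) = ln(x + sqrt(x^2 + 1)), which is defined at zero and behaves like ln(2x) for large x. The helper `approx_pct_effect` is a hypothetical convenience that converts a coefficient into an approximate percentage change using the usual log-style approximation; this is an assumption that is reasonable only when consumption values are large, and it is not a formula stated in the study.

```python
import math


def ihs(x):
    """Inverse hyperbolic sine: ln(x + sqrt(x^2 + 1))."""
    return math.asinh(x)


def approx_pct_effect(beta):
    """Approximate percentage change implied by a coefficient on an
    IHS-transformed outcome, treating IHS like a logarithm (assumption)."""
    return 100.0 * (math.exp(beta) - 1.0)


# For large consumption values the IHS is close to ln(2x).
difference = ihs(1000.0) - math.log(2 * 1000.0)
```

Unlike the natural logarithm, `ihs(0.0)` is well defined, which is one common reason to prefer the transformation when observations can be zero.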
We use HDD to account for the weather, which is required for methods where data is compared across different time periods, due to the effect of the weather on energy consumption. We use the 'heizgradtage 20/12' as recommended by Mojic and Haller (2019) and defined in norm 381/3 from the Swiss Society of engineers and architects (SIA) (SIA, 1982, 2015a). To compare the energy consumption between buildings of different sizes, we correct for the ERA of the building. We calculate the ERA according to the methodology developed by EcoSpeed and TEP Energy, which is used by different government agencies in Switzerland (Hartmann and Jakob, 2016) and defined by the SIA in norm 2028-C1:2015 (SIA, 2015a). The building type and construction period are defined by the Swiss Building and Dwelling Registry ("Eidgenössische Gebäude-und Wohnungsregister", GWR). Both are included in the analyses as categorical variables. In addition to building characteristics and the weather, socio-economic factors can also affect energy consumption. When this data is available, it should be included in the analyses, as is common in survey-based studies (Alberini et al., 2014; Dolšak et al., 2020; Nauleau, 2014). However, this data was not available for the buildings in our dataset, and socio-economic factors could thus not be included as covariates. The omission of socio-economic characteristics is a common limitation for studies that use datasets based on measured energy consumption and subsidy data (Allcott and Greenstone, 2017; Coyne and Denny, 2021; Scheer et al., 2013).
Note (Table 2). The data cleaning and aggregation process is detailed in Appendix A. a Due to data confidentiality, this data is not publicly available. b Data is publicly available on request (https://www.meteoschweiz.admin.ch/service-und-publikationen/service.html). c Data is available on request (https://www.housing-stat.ch/de/madd/).

Econometric methods
Frondel and Schmidt (2005) suggest three observational evaluation approaches to study the effect of policy interventions when longitudinal data are available: 1) before-after comparisons, 2) difference-in-differences estimators, and 3) matching estimators. Matching estimators require a large sample size that is not available in our study, and we therefore focus on before-after comparisons and difference-in-differences estimators.
The before-after comparison has the advantage that persistent unobservable household characteristics do not matter, as they are subtracted when comparing the change within households. However, there might be other factors that could cause part of the change in energy consumption. With a difference-in-differences approach, any changes in environmental and macroeconomic conditions are accounted for. A disadvantage, however, is that the parallel trends assumption must hold, meaning that energy consumption in the treated group would have followed the same trends as the non-treated group, had they not been treated. Moreover, unobserved heterogeneity between the control and treatment groups might affect the results (Frondel and Schmidt, 2005). For example, self-selection could cause a selection bias, as households that are more environmentally conscious might be more likely to participate in the program, even though they would have also reduced energy consumption without the subsidy. As the two methods rely on different identification assumptions that we cannot guarantee are satisfied, we employ both before-after comparisons and a difference-in-differences approach. For the before-after comparison, we use a t-test and linear regression to consider the static effect of treatment and an event study to evaluate the longitudinal effect, further discussed in sections 3.3.1 and 3.3.2, respectively. In section 3.3.3, we elaborate on our use of the difference-in-differences approach. Fig. 2 provides an overview of the three methodologies and how the longitudinal data is processed for each approach.

T-test and linear regression
A t-test can be used to compare the means of two groups of data, which we employ to compare the average reduction in energy consumption between the different treatment groups. A linear regression estimates the linear relationship between the dependent and the explanatory variables. For the t-test and linear regression, we evaluate the difference between the average energy consumption before and after the retrofit. As the treatment times vary, we correct the energy consumption for the heating degree days. To remove other year-specific factors, we take the average of the 5 years before the treatment and the 5 years after the retrofit has finished and compare these to each other. If fewer than 5 years are available, we take the available years with a minimum of 1 full year. The inclusion of energy data is depicted in Fig. 2. A sensitivity analysis was conducted on the choice of 5 years and is shown in Appendix D. Based on the description above, the dependent variable for the t-test and linear regression is calculated as shown in equation (4), where i indexes the building, ΔY_i^HDD is the change in average energy consumption corrected for the heating degree days in %, EC_i is the average annual energy consumption before (PreRet) or after (PostRet) the retrofit in kWh, HDD_mean is the average annual HDD over the full investigated time period (2010-2020), and HDD_i is the average annual HDD during the period before or after the retrofit.
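The calculation of the dependent variable can be sketched as follows. The sketch assumes that the correction scales each period's average consumption by HDD_mean / HDD_i before the relative change is taken; this is one reading that is consistent with the variable definitions above, and the exact form of equation (4) may differ. The function names are hypothetical.

```python
def hdd_corrected(ec_kwh, hdd_period, hdd_mean):
    """Scale average annual consumption to the long-run mean HDD (assumed form)."""
    return ec_kwh * hdd_mean / hdd_period


def pct_change_hdd(ec_pre, ec_post, hdd_pre, hdd_post, hdd_mean):
    """HDD-corrected change in average consumption, in percent.

    ec_pre / ec_post: average annual consumption before / after the retrofit (kWh)
    hdd_pre / hdd_post: average annual HDD in the same two periods
    hdd_mean: average annual HDD over the full study period
    """
    pre = hdd_corrected(ec_pre, hdd_pre, hdd_mean)
    post = hdd_corrected(ec_post, hdd_post, hdd_mean)
    return 100.0 * (post - pre) / pre


# Same metered consumption, but a colder post-retrofit period:
# the corrected change is negative even though raw consumption is unchanged.
example = pct_change_hdd(100.0, 100.0, 3000.0, 3300.0, 3200.0)
```

Under this assumed form, HDD_mean cancels in the percentage change, so it only affects absolute corrected consumption; the relative change depends on the ratio of consumption per heating degree day before and after the retrofit.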
To enable a comparison of the effects between treated buildings and the control group, we set an artificial treatment time for the buildings in the control group. For the buildings in the control group, we then calculate the change in energy consumption in the same way. We then estimate the model in equation (5), where i indexes the building, ΔY_i^HDD is the change in average energy consumption corrected for the heating degree days in %, R_i is a dummy variable that indicates if the building was retrofitted, B_i is a vector of building-specific variables that is adapted based on the model specification, and ε is the error term.
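Given these definitions, a linear model of the following form is implied. This is a sketch in assumed notation rather than the published equation (5): the coefficient symbols β_0, β_1, and γ are hypothetical labels, with β_1 capturing the difference in the HDD-corrected change between retrofitted and non-retrofitted buildings in percentage points.

```latex
\begin{equation}
\Delta Y_i^{HDD} = \beta_0 + \beta_1 R_i + B_i' \gamma + \varepsilon_i
\end{equation}
```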

Fixed effects event study
Event studies can be used to compare the outcome within a unit, before and after treatment over time. We use fixed effects because the building-specific characteristics tend to be stable over time. We annualized the data because the quarterly weather correction would increase outliers in the data when comparing different periods. The dependent variable is the IHS-transformation of the annual energy consumption. We therefore estimate a model in which i indexes the building and t the years, ln(Y_i,t) is the IHS-transformation of the energy consumption in MWh/year, α_i are building-fixed effects, R is a vector of binary dummy variables that indicate the timing of the retrofit, HDD is the heating degree days in °C/year, and ε is the error term.
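One specification that is consistent with this description is sketched below. It is an assumed form rather than the published equation: the index k for years relative to the retrofit, the coefficient symbols β_k and γ, and the use of a single annual HDD series are illustrative choices, and ln(Y) denotes the IHS-transformed consumption as in the text.

```latex
\begin{equation}
\ln(Y_{i,t}) = \alpha_i + \sum_{k} \beta_k R_{i,t}^{k} + \gamma \, HDD_t + \varepsilon_{i,t}
\end{equation}
```

Here R_{i,t}^{k} equals 1 when year t is k years away from the retrofit of building i, so each β_k traces the change in consumption relative to the omitted reference period.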

Difference-in-difference regression
To evaluate if the results from the fixed effect event study are not due to other effects that were not included in the analysis, a DiD analysis was conducted. A DiD model compares units with each other over time and compares units that have been treated at a certain time to units that have not been treated at that time. The dependent variable for the DiD models is the IHS-transformation of the annual energy consumption.
As our study has multiple time periods and variations in treatment timing, we use the recent DiD methodology developed by Callaway and Sant'Anna (2021). This methodology solves the problems that occur when applying a two-way fixed effects estimator to a study with staggered treatment periods, for example discussed by Goodman-Bacon (2021). In this method, treatments that occur in the same period are grouped and treatment effects are calculated per group. Due to the large number of treatment periods in our sample size, the individual groups are small. It is therefore suggested by Callaway and Sant'Anna (2021) to focus on the aggregated treatment effect, as the sample size for this is based on the total number of buildings that were retrofitted.
We use the R package as provided by Callaway and Sant'Anna (2020), which estimates the average treatment effect on the treated (ATT) through a difference-in-differences model, where i indexes the building, t the time in quarters, and g the period in which the building gets treated, ln(Y_i,t) is the IHS-transformation of the energy consumption in MWh/year, G_g is a binary variable that is equal to 1 in the period where the building is first retrofitted, and R_t is a binary variable that is equal to 1 if the building is treated at time t and 0 otherwise.
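The core building block of this estimator, the group-time average treatment effect, can be illustrated with a simplified sketch. It compares the outcome change between period g − 1 and period t for buildings first treated in g with the same change for never-treated buildings. The sketch omits covariates, weighting, aggregation, and inference, and is not the implementation of the R package; the data layout and the function name `att_gt` are assumptions.

```python
from statistics import mean


def att_gt(outcomes, first_treated, g, t):
    """Unconditional group-time ATT with a never-treated comparison group.

    outcomes: dict building -> dict period -> transformed outcome
    first_treated: dict building -> first treatment period, or None if never treated
    g: treatment cohort (period of first treatment); t: evaluation period
    """
    base = g - 1  # last period before cohort g is treated
    treated = [
        y[t] - y[base] for b, y in outcomes.items() if first_treated[b] == g
    ]
    control = [
        y[t] - y[base] for b, y in outcomes.items() if first_treated[b] is None
    ]
    return mean(treated) - mean(control)


# Toy data: two buildings retrofitted in period 2, two never retrofitted.
toy_outcomes = {
    "a": {1: 5.0, 2: 4.8},
    "b": {1: 6.0, 2: 5.8},
    "c": {1: 5.0, 2: 5.1},
    "d": {1: 4.0, 2: 4.1},
}
toy_first_treated = {"a": 2, "b": 2, "c": None, "d": None}
toy_att = att_gt(toy_outcomes, toy_first_treated, g=2, t=2)
```

In the toy data the treated buildings fall by 0.2 while the controls rise by 0.1, so the estimated effect is about −0.3 on the transformed scale. Aggregating such group-time effects across cohorts and periods yields the overall ATT discussed in the text.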

Results and discussion
In this section, we show the results of the aforementioned research questions. In section 4.1 we show the results of the effect of all retrofits on energy consumption. In section 4.2, we compare subsidized and non-subsidized retrofits, and in section 4.3, we break down the results further for different building types.

Result 1: Effect of retrofits on actual energy use for all buildings
We evaluate the effects of the retrofit on energy consumption through a t-test, linear regression, event study, and DiD. The weighted average annual energy consumption for buildings that were not retrofitted increased by 1.9%, whereas it decreased by 7.0% on average for buildings that were retrofitted. Based on a weighted t-test on the IHS-transformed dependent variable, these means are significantly different (p < 0.001). The change in energy consumption varied widely for both subsamples, as can be seen in Fig. 3a. The change in energy consumption of the retrofitted buildings was between −59% and +33%. To further evaluate if there are unobserved factors that could cause these differences, we conducted a linear regression with various specifications (Appendix D). In the model where all covariates were included, retrofitting reduced energy consumption by 9.9 percentage points. This result is statistically significant (p < 0.001) and consistent across the specifications.
The average change in energy consumption can be split by retrofit comprehensiveness. The average energy consumption before and after retrofit is shown in Fig. 3b. Under a linear regression with the same specifications, the single measure, two measures, and extensive retrofits were correlated with a reduction in energy consumption of 7%, 14%, and 24%, respectively (Appendix D).
Through the event study, we can see the effects of the retrofit over time (Fig. 4, Appendix D). The results show a quite consistent reduction of 8% after the retrofit.
To consider all effects and attempt to show causal effects, we conducted a DiD. The average treatment effect on the treated (ATT) is shown in Table 4. The reduction is between 11 and 20%, depending on the covariates.
Across the different models, the results show a statistically significant reduction for the retrofitted buildings that stays consistent over time. The linear regression shows that a retrofit reduced the energy consumption by 9.9 percentage points. The event study shows a similar reduction of around 8%. The DiD, in which unobserved factors are considered, shows a stronger reduction between 11 and 20%. These reductions are in line with the results reported by other ex-post studies with a control group, which found reductions of around 10-20%, depending on the type of retrofit (Coyne and Denny, 2021; Scheer et al., 2013). These reductions are still far below the 30-50% energy reduction that is needed to reach the sustainability targets as suggested by the IEA (2019). Renovations are typically not considered for a couple of decades after a building has been retrofitted, thus a lock-in effect can be created through shallow retrofits (Dubois and Allacker, 2015). Therefore, it is important that retrofits are deep. The buildings that retrofitted multiple elements achieved significantly higher energy reductions than the single measure retrofits (Fig. 3b), suggesting that focusing on policies for comprehensiveness could be a lever for policymakers to promote deep retrofits. However, policymakers would need to identify what would incentivize comprehensive retrofits, as Collins and Curtis (2016) found that the comprehensiveness did not go up when Ireland introduced a bonus to incentivize more comprehensive retrofits.
Note (Fig. 4). IHS = Inverse hyperbolic sine. The dotted line shows the 95% confidence interval. Asterisks indicate significance at the 10 percent (*), 5 percent (**), or 1 percent (***) level. Heating degree days and building fixed effects were included in the model.
The variation found in achieved energy reductions is sizable (between −59% and +33%), in line with the results from other studies (Coyne and Denny, 2021;Scheer et al., 2013). In order to target more effective retrofits and avoid lock-in, the factors that contribute to the effectiveness of the retrofits need to be understood. Building size, type, location, construction year, quality of retrofit installation, and occupant usage have been suggested as important determinants of retrofit depth (Collins and Curtis, 2016;Coyne and Denny, 2021). Of these factors, we included building size, type, and construction year as covariates in the analyses. The full results are shown in Appendix D. The effect of ERA on the relative change in energy consumption is not significant (Table D.2), yet it does significantly impact the absolute change in energy consumption, where larger buildings are correlated with higher absolute energy savings (Table D.3). This is in line with the expectation that larger buildings have a higher absolute saving potential as the energy consumption is higher before the retrofit. We found no significant effect across the construction periods, possibly due to the wide span within each construction period. Retrofitting (partially) commercial buildings achieved significantly lower energy savings than residential buildings, although the effect of the residential buildings was not significant. With the included variables, a majority of the variation in change in energy consumption was not explained, as the adjusted R2 did not reach 20%. There are thus many additional variables that determine retrofit effectiveness, such as occupant behavior. Complementary in-depth studies are likely required to further demystify the performance gap.

Result 2: Effect of receiving a subsidy on actual energy use for retrofitted buildings
Next, we compare the effects of subsidized retrofits to non-subsidized retrofits. The weighted average reduction in annual energy consumption was 12.1% for buildings with a non-subsidized retrofit and 5.8% for buildings with a subsidized retrofit. These averages were not significantly different from one another as the p-value of the IHS-transformed t-test was 0.26. The results do not become statistically significant with the inclusion of other covariates in the linear regression (Appendix E).
The temporal results of the event study are depicted in Fig. 5 and are available in Appendix E. Here it can be seen that the reduction achieved through the subsidized retrofits is sustained and significant. The results for the non-subsidized retrofits fluctuate with high uncertainty, likely due to the low number of buildings in this sub-sample.
There are not enough data points to run the DiD with staggered treatment times for the non-subsidized retrofits. Moreover, with the package from Callaway and Sant'Anna (2020), we cannot indicate at which point the non-subsidized retrofits occur. Therefore, we conducted a classic DiD with standardized treatment times, for which results are shown in Table 5. The DiD does not show significant results for the interaction term. In specification 3, where fixed effects and HDD are added, the reduction in energy consumption is 18%, which is in line with the results from the previous section.
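In the simplest case of two groups and two periods without covariates, the interaction term of a classic DiD equals a difference of differences in group means. The sketch below illustrates this with hypothetical numbers (not the study's data); the function name `did_estimate` and the input layout are assumptions, and the fixed effects and HDD used in specification 3 are deliberately omitted.

```python
from statistics import mean


def did_estimate(rows):
    """Classic 2x2 difference-in-differences on group means.

    rows: iterable of (treated, post, outcome), where treated and post
    are 0/1 flags and outcome is the energy consumption.
    """
    cells = {(g, p): [] for g in (0, 1) for p in (0, 1)}
    for treated, post, outcome in rows:
        cells[(treated, post)].append(outcome)
    m = {key: mean(values) for key, values in cells.items()}
    # (change in treated group) minus (change in control group)
    return (m[(1, 1)] - m[(1, 0)]) - (m[(0, 1)] - m[(0, 0)])


# Hypothetical observations: (treated, post, energy consumption)
rows = [
    (1, 0, 100.0), (1, 1, 85.0),
    (1, 0, 120.0), (1, 1, 105.0),
    (0, 0, 100.0), (0, 1, 98.0),
    (0, 0, 80.0), (0, 1, 78.0),
]
estimate = did_estimate(rows)
print(estimate)
```

A negative estimate indicates that consumption fell more in the treated group than in the control group over the same period.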
On average, the retrofits that were subsidized had a smaller reduction than those that were not subsidized. However, this difference is not statistically significant across any of the statistical methods. The results of the event study (Fig. 5) show that the uncertainty is much higher for the non-subsidized retrofits, likely due to the small sample size of 15 buildings. The results for the subsidized retrofits are similar to the results in 4.1, where all retrofits were included.
The lower reduction for subsidized retrofits in the linear regression is not in line with the hypothesis that subsidies would increase energy savings. The effect of the subsidy on energy consumption is not statistically significant, as estimated through the t-test, linear regression (Table E.1), and DiD (Table 5), so it cannot be inferred that subsidizing retrofits leads to a reduction in energy savings. The high uncertainty is likely caused by the small sample size. There are several possible reasons for the anecdotal evidence of the deep non-subsidized retrofits. First, there could be a selection bias, as building owners who chose to retrofit without the incentive of a subsidy might have a stronger inclination towards retrofitting and are therefore more likely to choose a deeper retrofit. Second, we identified the non-subsidized retrofits through the permit data. Building owners might conduct smaller renovations without getting a permit, leading to a bias towards larger and thus deeper retrofits in the sample. Lastly, the targets for renovations as mandated by the cantonal building standards (MuKEN) are close to the targets set by the subsidy. For example, the maximum U-values for roofs, walls, and floors after renovation are 0.25 and 0.20 W/(m²K) for the building standards and the subsidy, respectively (EnDK, 2018; Energieagentur . Hence, the policy mix does not allow for very shallow retrofits.
Note. IHS = Inverse hyperbolic sine. The dotted line shows the 95% confidence interval. Asterisks indicate significance at the 10 percent (*), 5 percent (**), or 1 percent (***) level.
Due to the small sample size and therefore the statistically insignificant results, no clear conclusions can be inferred from the analysis. Further investigating whether subsidies can drive building owners to retrofit deeper is important to quantify the full impact of the free-rider effect, and thus the effectiveness of the policy.

Result 3: Effect of the subsidy amount on energy consumption
Fig. 6 shows the change in energy consumption plotted for the subsidized retrofits versus the size of the subsidy that was received. A positive correlation appears to exist between the magnitude of the subsidy amount and the reduction in energy consumption, although many buildings do not follow this relationship. Table 6 shows the results of a linear regression of the subsidy amount on the absolute change in energy consumption, with the full results available in Appendix F. A higher subsidy amount correlates with a larger reduction in energy consumption across the specifications. Model 4 shows that a marginal increase of the subsidy by 1 CHF reduces energy consumption by 0.12 kWh per year. Assuming that the achieved reduction stays constant, as is suggested by the results from the event studies, we can calculate the subsidy costs per kWh saved over a given period. For example, over a period of 20 years, this corresponds to having spent 0.42 CHF/kWh in subsidies. Compared to a typical energy price of around 0.20 CHF/kWh, the price of avoiding energy consumption is higher than the cost of the energy itself.
This cost does not include overhead costs for the subsidy program, nor does this approach consider the free-rider effect, which would both further increase the costs. However, this benefit-cost comparison does not consider benefits beyond energy reductions. The social benefits of energy efficiency programs can far outweigh the program benefits. For example, Tonn et al. (2018) found that the societal cost-benefit ratio was almost 3 times as high as the program cost-benefit ratio. For a holistic program evaluation, all aspects such as free riders and social benefits need to be considered. With current data availability, the options to do this are limited. Incorporating in-depth policy evaluations during policy design and implementation, thus increasing the number of studies with high levels of data availability, would greatly improve our understanding of policy effectiveness.
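The subsidy cost per kWh saved follows directly from the marginal effect reported for Model 4 (0.12 kWh saved per year for each additional CHF). A minimal sketch of the arithmetic, assuming constant annual savings and no discounting; the function name `subsidy_cost_per_kwh` is hypothetical.

```python
def subsidy_cost_per_kwh(kwh_saved_per_chf_per_year, years):
    """Subsidy cost in CHF per kWh saved over `years`.

    Assumes the annual saving stays constant and applies no discounting.
    """
    total_kwh_saved_per_chf = kwh_saved_per_chf_per_year * years
    return 1.0 / total_kwh_saved_per_chf


# Marginal effect from Model 4: 0.12 kWh per year per CHF of subsidy
cost_20y = subsidy_cost_per_kwh(0.12, 20)
print(round(cost_20y, 2))  # 0.42
```

Under the same assumptions, a shorter evaluation period raises the cost per kWh proportionally, for example to roughly 0.83 CHF/kWh over 10 years.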

Conclusion and policy implications
Given the need to reduce energy demand from existing buildings, it is essential to increase the speed and depth of retrofits. In this paper, we use real-world data to confirm that retrofits are an effective method to reduce energy consumption, with an average energy reduction between 8 and 20%. The change in energy consumption varies strongly between retrofits, from increasing by 33% to decreasing by 59%. As timing between retrofits typically spans decades and shallow retrofits can cause a lock-in effect (Dubois and Allacker, 2015), policymakers should focus on incentivizing deep retrofits. One option would be to focus on increasing the comprehensiveness of the retrofit, as extensive retrofits reduced energy consumption by 24%, compared to 7% and 15% for retrofitting one and two elements, respectively. As the adjusted R2 remained low throughout the analyses, there are unobserved factors that affect the change in energy consumption, such as socio-economic factors or occupant behavior. Further studies are needed to identify the determining factors of the large variation in reduction, so policymakers can target more effective retrofits. Based on the comparison between subsidized and non-subsidized retrofits, we cannot conclude whether subsidies increased retrofitting depth. The Swiss-mandated building standards are likely a large contributor to the already high depth of the non-subsidized retrofits (EnDK, 2018). This emphasizes the importance of a coherent design of the policy mix where policies complement one another. As the rate of free riders can be high (Dolšak et al., 2020;Grösche and Vance, 2009;Nauleau, 2014), it is important to understand whether policies cause free riders to retrofit deeper, so the effectiveness of policies can be better determined.

Fig. 6. Change in energy consumption between before and after the subsidy with respect to the subsidy amount (N = 33). Note: Outliers in subsidy amount are removed according to the interquartile range method. The linear regression line is plotted based on the least squares method. The observations are colored by comprehensiveness as shown in the legend.

Table 6
Impact of subsidy amount on the absolute difference in energy consumption between before and after retrofit in kWh. Note: Asterisks indicate significance at the 10 percent (*), 5 percent (**), or 1 percent (***) level.
Last, we find that the subsidy amount correlates with the reduction in energy consumption, suggesting that higher subsidies are associated with stronger reductions. Over a period of 20 years, the marginal subsidy cost of reducing energy consumption by 1 kWh is 0.42 CHF, if no free-riding is present. It is important to note that this figure cannot be solely compared to energy prices, as retrofits are not only conducted to reduce energy consumption. Retrofits are required to uphold building condition and value and offer other co-benefits, such as increased comfort and liveability. When designing and evaluating policies, a comprehensive view should be taken that includes these co-benefits.
The study has several limitations. First, the dataset consists of 432 buildings, of which 46 had a subsidized retrofit and 15 had a non-subsidized retrofit. Due to the small sample size, the results need to be interpreted with care. This is also the most likely cause of the insignificant results when comparing the subsidized to the non-subsidized retrofits. Second, the dataset is solely based on buildings in two towns in Switzerland. As there are some differences between the studied sample and the building stock, the results cannot be generalized to the whole country without caution. Most notably, the median annual energy consumption per area is lower in the sample than the national average, suggesting higher energy savings would be possible on a national level, although we cannot say whether the studied policy would achieve this. Lastly, the R2 of the models throughout the analyses remained low, suggesting that there are unobserved factors that affect energy consumption. This is likely partially due to the omission of socioeconomic characteristics in the models, as these affect energy consumption. Another factor could be that the energy consumption can include water heating for some buildings, which would affect the relative change in energy consumption as retrofits do not affect this. Additionally, changes in these characteristics over time or behavior changes could cause changes in energy consumption that could not be explained by the models.
Compared to the low prevalence of ex-post studies of retrofit subsidies that include energy consumption data, the available datasets for this study can be considered relatively comprehensive. However, to further investigate the effectiveness of the subsidy, a larger and richer dataset would be needed, which can only be collected if researchers are already involved in the design and implementation phase of policies. We recommend that policymakers intensify partnerships with researchers to include effectiveness studies across the policy lifecycle. This would also enable the set-up of a semi-randomized trial, which would facilitate identifying causal effects rather than only correlations. Considering the large budgets being made available for retrofitting subsidies, further investigating the policy effectiveness to improve the outcomes is important.
Overall, we conclude that retrofits are an effective way to reduce energy consumption. However, as the achieved energy reductions are not high enough to reach climate targets, more and deeper interventions will be needed. Subsidies are a powerful instrument that policymakers should employ within a policy mix. Opportunities beyond retrofits should also be considered, such as non-financial interventions and inducing behavior change (Khanna et al., 2021). Lastly, we emphasize the need for policy effectiveness to not only be modeled but also measured ex-post, which should be included in the policy design.

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability
The data that has been used is confidential.

Acknowledgments
This research is supported by the Swiss Federal Office of Energy (SFOE) under the 'Policies for accelerating renewable and efficient building & district retrofits' (PACE REFITS) project with the contract number SI/501883-01.
We would like to thank Marcel Knöri for the provision of data and his valuable feedback during this project. We would like to thank Dr. Georgios Mavromatidis, Dr. Alejandro Nunez-Jimenez, Dr. Petrissa Eckle, Oliver Akeret, Julia Bachmann, Joanna Flynn, Christine Gschwendtner, David Pfeffer, and Paula Thimet for their valuable comments on earlier versions of this manuscript. The analyses were run on the Euler cluster managed by the HPC team at ETH Zurich.

A. Data cleaning and aggregation
This Appendix summarizes the steps that were taken to clean the datasets and aggregate them into one final dataset. This is graphically shown in Figure A.1.
Fig. A.1. Overview of datasets used in the study.

Energy consumption data
The energy dataset had a connection ID as a unique identifier, which was first transformed to the address:
• If one ID contained multiple addresses, the energy consumption was split by area.
• If one address was split over multiple IDs, the energy consumption was summed.
The address was then transformed in the same way to the Swiss national building ID ("Eidgenössischer Gebäude-Identifikator", EGID). Entries were dropped for the following reasons:
• Entries where the EGID consists of multiple addresses, but not all addresses were present.
• Entries where the address could not be matched to an EGID.
The original dataset contained 1408 unique entries with quarterly energy data. After the transformation to EGID, the dataset contained 1411 entries.
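The two allocation rules for mapping connection IDs to addresses can be sketched as follows. This is an illustration under assumptions, not the procedure actually used: the function name `energy_per_address`, the input layout, and the example addresses and values are all hypothetical.

```python
from collections import defaultdict


def energy_per_address(connections, area_by_address):
    """Allocate metered energy from connection IDs to addresses.

    connections: list of (connection_id, [addresses], energy_kwh)
    area_by_address: dict mapping address -> floor area in m2
    """
    totals = defaultdict(float)
    for _connection_id, addresses, energy_kwh in connections:
        total_area = sum(area_by_address[a] for a in addresses)
        for address in addresses:
            # Rule 1: one ID with several addresses is split by area.
            share = area_by_address[address] / total_area
            # Rule 2: one address served by several IDs is summed.
            totals[address] += energy_kwh * share
    return dict(totals)


# Hypothetical example data
connections = [
    ("C1", ["Main St 1", "Main St 3"], 3000.0),  # one ID, two addresses
    ("C2", ["Lake Rd 5"], 1000.0),               # one address ...
    ("C3", ["Lake Rd 5"], 500.0),                # ... over two IDs
]
area_by_address = {"Main St 1": 100.0, "Main St 3": 200.0, "Lake Rd 5": 150.0}
result = energy_per_address(connections, area_by_address)
print(result)
```

The same two rules would apply again when moving from addresses to EGIDs, with the additional step of dropping entries that cannot be matched completely.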

Subsidy data
The original subsidy dataset contained 1209 entries of buildings in canton St. Gallen that received a subsidy, and the data was indexed on the address. The address was transformed to the EGID with the same methodology as the energy dataset, after which the dataset contained 1058 entries.
The dataset contained three dates for each subsidy: 1) the application date, on which the government received the application, 2) the confirmation date, on which the government confirmed the subsidy, and 3) the payment date, on which the applicants received the subsidy payment. Renovation can only take place after the confirmation date and must be finished before the payment date. We defined the treatment timing as the date on which the application was confirmed. For the analyses where the period during the renovation was removed, the final renovation date was defined as the minimum of the payment date and the confirmation date plus a predefined period. This period was set to 150, 190, 230, or 270 days for renovations of one element, two elements, three elements, or more extensive retrofits, respectively. If one building was awarded more than one subsidy with an overlapping time period, these two subsidies were merged into one entry. The start date was set to the earliest date between the subsidies, and the end date as the latest.
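The date rules above can be expressed compactly. The sketch below is a hedged illustration: the function names, the example dates, and the encoding of "more extensive retrofits" as four or more elements are assumptions made here, while the period lengths come from the text.

```python
from datetime import date, timedelta

# Renovation periods in days by number of retrofitted elements (from the text).
# The key 4 stands in for "more extensive retrofits" (an assumption).
RENOVATION_DAYS = {1: 150, 2: 190, 3: 230, 4: 270}


def renovation_end(confirmation, payment, n_elements):
    """Final renovation date: min(payment date, confirmation date + period)."""
    period = timedelta(days=RENOVATION_DAYS[min(n_elements, 4)])
    return min(payment, confirmation + period)


def merge_overlapping(intervals):
    """Merge overlapping (start, end) subsidy intervals of one building."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlap: keep the earliest start and the latest end.
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


# Hypothetical example dates
end_single = renovation_end(date(2015, 3, 1), date(2016, 1, 15), 1)
end_early_payment = renovation_end(date(2015, 3, 1), date(2015, 5, 1), 3)
merged = merge_overlapping([
    (date(2015, 5, 1), date(2015, 9, 1)),
    (date(2015, 1, 1), date(2015, 6, 1)),
    (date(2017, 1, 1), date(2017, 3, 1)),
])
print(end_single, end_early_payment, merged)
```

In the first example the predefined period ends before the payment date, so the period determines the end date; in the second the payment date comes first and caps it.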

Building permit data
The data on building permits consisted of two datasets from the two municipalities in which the buildings are located. The combined permit dataset consisted of 8013 entries, indexed on the address. After removing the entries that occurred outside the temporal scope of our study and linking the entries to the EGID, the dataset consisted of 4335 entries.
Each entry in the permit dataset contains a description of the content of the permit. This description was used to identify permits as a thermal retrofit. The first step of the identification was conducted through the presence of one or more of the key phrases in the description. The key phrases were identified by taking a sample of 100 entries of the full dataset and identifying all phrases that could indicate a change in energy consumption. After splitting all entries with these keywords from the sample, a new sample was chosen to identify new phrases. This was repeated until 3 consecutive samples did not include any relevant entries and led to the following key phrases: "sanier, renovation, wärmedämmung, isolation, isolier, umbau, einbau, anbau, ersatz, erweiterung, neubau, fenster, dach, fassade." Next, all entries were checked by hand to identify whether they could be categorized as a thermal retrofit. The comprehensiveness of the retrofit was also determined based on the specification of the description. If no specification was present, the comprehensiveness was marked as unknown.
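The first, automated step of the identification can be sketched as a case-insensitive substring match against the key phrases listed above. The function name `is_candidate` and the example permit descriptions are hypothetical; matches are only candidates, since the text states that all entries were subsequently checked by hand.

```python
# Key phrases as listed in the text (German word stems).
KEY_PHRASES = (
    "sanier", "renovation", "wärmedämmung", "isolation", "isolier",
    "umbau", "einbau", "anbau", "ersatz", "erweiterung", "neubau",
    "fenster", "dach", "fassade",
)


def is_candidate(description):
    """Return True if the permit description contains any key phrase."""
    text = description.casefold()
    return any(phrase in text for phrase in KEY_PHRASES)


# Hypothetical permit descriptions
descriptions = [
    "Fassadensanierung und Ersatz der Fenster",
    "Aufstellen eines Gartenzauns",
    "Dachsanierung mit Wärmedämmung",
]
flags = [is_candidate(d) for d in descriptions]
print(flags)
```

Matching on stems such as "sanier" also catches compound words like "Fassadensanierung", at the cost of false positives that the manual check has to remove.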

Merging of datasets
Next, the energy, subsidy, and permit databases were consolidated into one dataset. They were linked through the EGID. The building characteristics were added to this dataset through the dataset from GWR MADD.
Note. Extreme outliers (z-score>3) are not depicted in the graphs.
As the retrofits occur at different periods over the time frame, there is a different number of periods before and after the retrofit available. For the analyses where the data is cut-off respective to the time of the period, the data availability of the retrofits is shown in Figure B.2.
Fig. B.2. Energy data availability for retrofitted buildings before and after, respective to the retrofit after time t = 0. Note. The color shows which data is included in the t-test, linear regression, and event study.

C. Description of covariates

Weather: Heating degree days
There are several methods for climate correction (Mojic and Haller, 2019). Mojic and Haller (2019) recommend using the heating degree days ("heizgradtage", HDD) method for residential buildings when there are no base temperatures available for the specific buildings. As our sample largely consists of residential buildings, we chose this method. The HDD is defined in the SIA 381/3:1982 as the sum of the differences between the indoor temperature and the daily mean temperature on days where the daily mean temperature is below a heating limit (SIA, 1982). For Switzerland, the indoor temperature and heating limit are 20 and 12 °C, respectively. This is called the HDD 20/12 and is shown in equation C.1.
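Following the verbal definition above, the HDD 20/12 for a period can be computed from daily mean temperatures as sketched below. The function name and example temperatures are hypothetical, and treating a day exactly at the heating limit as a non-heating day is an assumption based on the wording "below a heating limit".

```python
def heating_degree_days(daily_mean_temps, indoor_temp=20.0, heating_limit=12.0):
    """Sum of (indoor_temp - T) over days whose mean temperature T is
    below the heating limit; temperatures in degrees Celsius."""
    return sum(
        indoor_temp - t for t in daily_mean_temps if t < heating_limit
    )


# Hypothetical daily mean temperatures in degrees Celsius
temps = [5.0, 11.0, 12.0, 15.0, -2.0]
hdd = heating_degree_days(temps)
print(hdd)  # 46.0 = (20 - 5) + (20 - 11) + (20 + 2)
```

Days at or above the heating limit contribute nothing, so warm periods add no heating degree days regardless of how far they stay below the indoor temperature.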

Energy Reference Area
To evaluate the energy use between different buildings, it is important to correct for the area. To do this, we calculated the energy reference area ("Energiebezugsfläche", ERA) for each building. The ERA is defined as the sum of all areas within the envelope that require heating or cooling (SIA, 2015b). This was done according to the methodology developed by EcoSpeed and TEP Energy (Hartmann and Jakob, 2016), which is used by different government agencies in Switzerland and can be summarized as follows:

Building type and construction period
The building type and period stem from the dataset retrieved from the Swiss Building and Dwelling Registry ("Eidgenössische Gebäude-und Wohnungsregister", GWR). These variables are both categorical. The categories are defined in the "Merkmalskatalog" from the Swiss Federal Office of Statistics (BFS, 2018).
The building type is a categorical variable with five categories: single-family home (SFH), multi-family home (MFH), mixed-mostly residential (MMR), mixed-mostly commercial (MMC), and commercial (COM). The categories are based on the building category ("Gebäudekategorie") and building classification ("Gebäudeklasse") from the GWR. Based on the building category, four categories are included in the dataset: residential, MMR, MMC, and COM. The residential category is further divided into two categories based on the building classification (Table C.1). The construction period is a categorical value that is included in the data from the GWR, and defined as the building period ("Bauperiode") in the "Merkmalskatalog" (BFS, 2018).
Table D.1 shows the result of a t-test conducted on the average change in energy consumption for the retrofitted and non-retrofitted buildings if an average is taken of 3, 4, or 5 years of data. The averages are very close across the specifications. To consider changes over longer periods, a period of 5 years was chosen for the remaining analyses.
Table D.2 shows the results from a linear regression with the change in energy consumption in percent as the dependent variable. The effect of retrofitting is consistent across the specifications, reducing consumption by 8.9-9.9%. In specification 2, the ERA is positive and significant. This effect is negative in specifications 3 and 4 where the building type is added, as the energy consumption increases for (partially) commercial buildings, which are typically larger than residential buildings. There is no clear effect of the construction period on the change in energy consumption.
Note. SFH = Single family home. Asterisks indicate significance at the 10 percent (*), 5 percent (**), or 1 percent (***) level.
Table D.3 shows the results of a linear regression with the absolute change in energy consumption as the dependent variable. The results are consistent across the specifications. In model 4, where the most covariates are included, retrofitting is correlated with a reduction in energy consumption of 20.0 MWh/y. The effect of the ERA is negative and significant, meaning that larger buildings achieve higher savings. This is expected as it is likely that for larger buildings more surface area is retrofitted, resulting in a stronger absolute change in reductions. The building type also appears to have a significant impact on change in energy consumption. Commercial buildings have a smaller reduction in energy consumption, and for mostly commercial buildings, the effect even offsets the effect of the retrofit, resulting in an increase in energy consumption. Multi-family homes achieve higher reductions than single-family homes. The effect of the construction period is positive, and significant for some of the time periods. This suggests that the highest reductions were present in the oldest buildings, which is in line with the expectation that older buildings are more poorly insulated and thus have higher reduction potential. However, this relationship is not consistent, as the buildings built in the period 2001-2010 have a lower change in energy consumption than those from 1981 to 2000, although this effect is only significant at the 10% level.
Table D.4 shows a linear regression that was run with the same dependent variable but with the retrofit comprehensiveness as the main independent variable. The results are consistent across specifications. The comprehensiveness of some retrofits is unknown. Based on the identified reduction in the linear regression, these reductions were similar to the depth of two measures, and are thus unlikely to bias the results towards lower or higher comprehensiveness.
Note. Asterisks indicate significance at the 10 percent (*), 5 percent (**), or 1 percent (***) level.
Table D.5 shows the results from the event study that is presented in section 4.1. The event study is conducted with three specifications, on the IHS-transformed energy consumption as the dependent variable. The energy consumption is reduced after the retrofit across all specifications, although only statistically significant in specifications 2 and 3. The effect of the HDD in specification 3 is significant and the model has a higher adjusted R2 and was thus chosen to be shown in Fig. 4 in section 4.1. The effect of the retrofit in this specification is smaller than in the other two models.
Note. Asterisks indicate significance at the 10 percent (*), 5 percent (**), or 1 percent (***) level.
Table E.1 shows the results from a linear regression where only the retrofitted buildings were included. The main independent variable is whether the retrofit is subsidized or not. The effect of the subsidy is not significant in any of the specifications.
Note. Asterisks indicate significance at the 10 percent (*), 5 percent (**), or 1 percent (***) level.
Table E.2 shows the full results from the event study for subsidized and non-subsidized retrofits. The event study is conducted with three specifications for each group, with the IHS-transformed energy consumption as the dependent variable. Specification 1, without fixed effects, does not show significant results and has a negative adjusted R2. The effect of the HDD is statistically significant upon including it in specification 3, so this model is shown and discussed in section 4.2.

Table E.2
Event studies of all buildings with a subsidized retrofit (panel A) or non-subsidized retrofit (panel B) with the change in energy consumption in percent between before and after the retrofit, where t0 reflects the time of the retrofit.
Note. Asterisks indicate significance at the 10 percent (*), 5 percent (**), or 1 percent (***) level.
Table F.1 shows the full results of a linear regression of the subsidy amount on the absolute change in energy consumption in kWh. As discussed before, the effect of the ERA is negative and statistically significant, commercial buildings seem to have a lower reduction in energy consumption, and no clear relationship can be seen for the construction period.
Note. SFH = Single family home. Asterisks indicate significance at the 10 percent (*), 5 percent (**), or 1 percent (***) level.