Informing the planning of rotating power outages in heat waves through data analytics of connected smart thermostats for residential buildings

With climate change, heat waves have become more frequent and intense. Rotating power outages happen when the power supply is unable to meet the increase in cooling demand that results from extremely high temperatures. Power outages during heat waves expose residents to high risks of overheating. In this study, we propose a novel data-driven inverse modelling approach to inform decision makers and grid operators on planning rotating power outages. We first infer the building thermal characteristics from connected smart thermostat data, and then use the estimated thermal dynamics to simulate the thermal resilience during a heat wave event. Our proposed method was tested for the California power outage in August 2020 by using the open source Ecobee Donate Your Data dataset. We found that, in California, a power outage should not last more than two hours during heat waves to avoid overheating risks. Informing the residents in advance so they can prepare for it through pre-cooling is a simple but effective strategy to extend the acceptable power outage duration. In addition to assisting power outage planning, the proposed method can be used for other applications, such as to evaluate a building energy efficiency policy, to examine fuel poverty, and to estimate the load shifting potential of building stocks.


Introduction
Heat waves happen when abnormally high outdoor temperature lasts for several days [1]. As one of many consequences of climate change, heat waves have become more frequent and intense [2,3]. During the past decade, extreme heat events have been recorded in India [4], Russia [5], China [6], and many other places across the world. Heat waves are considered to be a critical public health threat, and they were estimated to be responsible for the death of more than 70 000 people in the summer of 2003 in Europe [7] and 55 000 people in Russia in 2010 [8]. With climate change, heatwave-related excess mortality is expected to increase further, especially in tropical and subtropical countries and regions [9].
Meanwhile, extremely high ambient temperature drives up electricity demand and threatens grid reliability, because higher ambient temperature leads to increased cooling loads and thus more electricity use for air conditioning. The atmospheric warming in California is expected to increase grid peak demand in summer by as much as 38% by the end of the twenty-first century [10]. In August 2020, because of the region-wide heat wave and an unanticipated power supply shortage, California residents experienced rotating power outages.
The challenges posed by heat waves are more significant in cities for two reasons. First, climate-change-induced warming is more severe in cities than in their surrounding rural areas (i.e. the urban heat island effect); the difference could reach 4 °C under a high-emission scenario [11]. Second, cooling buildings accounts for a higher proportion of the total electricity demand in cities, compared to rural areas. If a power outage is unavoidable during heat waves, it is essential to understand how long it could last, to prevent occupants from being exposed to excess heat while the grid stress is being relieved. Occupants' exposure to excess heat indoors can lead to heat exhaustion, heat edema, heat cramps, heat syncope, and heatstroke [12], all of which are dangerous health risks and can cause a public health crisis.

Heat wave and grid stress
In modern society, the building sector accounts for 32% of global energy demand (24% for residential and 8% for commercial) [13]. Among building end uses, heating, ventilation, and air-conditioning (HVAC) is a major electricity consumer, accounting for 33% of total building energy consumption in Hong Kong [14], 40% in Europe [15], 50% in the United States [16], and more than 70% in Middle East countries [17]. During heat waves, people tend to stay inside air-conditioned environments for a longer period and extend their use of air conditioning. In addition, the higher outdoor air temperature increases the cooling loads in buildings. These two factors combined lead to significant increases in the electricity demand to cool buildings.
In figure 1, we applied a five-parameter change point model [18] to examine how ambient air temperature is correlated with city-scale electricity consumption in two major metropolitan areas in California: Los Angeles and Sacramento. We used the hourly data of two Californian Balancing Authorities, the Los Angeles Department of Water & Power (LADWP) and the Balancing Authority of Northern California (BANC), collected by the U.S. Energy Information Administration [19] between 2015 and 2020. LADWP and BANC recorded the electricity use in the Los Angeles and Sacramento Metropolitan Areas, respectively.
In figure 1, a clear pattern can be observed: higher ambient temperature drives up city-scale electricity consumption. We extracted the elasticity of city-scale electricity use and peak demand with respect to ambient daily mean temperature in table 1. Compared with the base load, a 1 °C increase in ambient temperature drives up the daily total electricity consumption by 4.7% in the Los Angeles region and 6.2% in Sacramento, while it increases the daily peak electricity demand by 6.9% in the Los Angeles Metropolitan Area and 9.2% in Sacramento.
The dramatic increase in electricity demand during heat waves poses challenges to grid operation and energy security. On August 14 and 15, 2020, Northern California residents experienced a rotating power outage event. The major factor that led to the rotating outages was that California experienced a one-in-thirty-year extreme heat wave in mid-August of 2020 [20]. The heat wave drove up the electricity demand, which exceeded the existing electricity resource planning targets. The California Independent System Operator Corporation was forced to institute rotating power outages because the increasing electricity demand could not be met by electricity generated locally or imported from neighbouring areas, as this extreme weather event extended across the Western United States and accordingly strained the resources in neighbouring areas as well.
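The change point fit described above can be illustrated with a short sketch. It assumes the common piecewise-linear form of a five-parameter model (a flat base load between a heating and a cooling change point) and uses synthetic data rather than the EIA records; the function name, starting guesses, and parameter values are hypothetical, and expressing the cooling slope relative to the base load is an assumption about how a percentage sensitivity could be derived.

```python
import numpy as np
from scipy.optimize import curve_fit


def five_param_model(temp, base, heat_slope, heat_cp, cool_slope, cool_cp):
    """Piecewise-linear load: flat base load between the two change points."""
    heating = heat_slope * np.maximum(heat_cp - temp, 0.0)
    cooling = cool_slope * np.maximum(temp - cool_cp, 0.0)
    return base + heating + cooling


# Synthetic daily data (illustrative only, not the EIA records)
rng = np.random.default_rng(0)
temp = rng.uniform(5.0, 38.0, 400)        # daily mean temperature, degC
true_params = (100.0, 1.5, 14.0, 5.0, 20.0)  # hypothetical values
load = five_param_model(temp, *true_params) + rng.normal(0.0, 1.0, temp.size)

# Rough starting guess; change points start on either side of the flat zone
p0 = (load.min(), 1.0, 12.0, 1.0, 22.0)
params, _ = curve_fit(five_param_model, temp, load, p0=p0)
base, heat_slope, heat_cp, cool_slope, cool_cp = params

# Load increase per 1 degC above the cooling change point,
# expressed as a percentage of the fitted base load
cooling_sensitivity_pct = 100.0 * cool_slope / base
print(f"cooling sensitivity: {cooling_sensitivity_pct:.1f}% per degC")
```

The same function could be fitted separately to daily total consumption and daily peak demand to obtain the two kinds of sensitivity reported in table 1.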

Research gaps and objectives
A rotating power outage exposes residents to overheating risks due to the lack of air conditioning during the extreme heat wave event. If a rotating power  [21] and the U.S. Residential Energy Consumption Survey [22]. However, those conventional approaches are expensive and usually not adequately representative. For instance, the U.S. RECS is conducted every four to six years and limited to a small sample size (e.g. 5686 households throughout the country in the 2015 survey [23]). Meanwhile, for many places of the world, the information of building thermal property is not available, which makes rotating power outage planning challenging.
In this study, we propose a novel approach to inform decision makers and grid operators when planning the inevitable rotating power outages. This approach was tested using the 2020 rotating power outage in California, and has the potential to be used in other places of the world. We first applied a novel data-driven inverse modelling method to infer building thermal properties using a state-wide open source dataset collected from connected smart thermostats, the Ecobee Donate Your Data (DYD) program [24]. Then the inferred building thermal characteristics were used to plan the power outage by simulating the thermal resilience of the residential building stock.
This study is organized as follows. We first introduce the novel hybrid inverse modelling approach in section 2, where we describe the thermal dynamics model (section 2.1), the parameter estimation method (section 2.2) and the model validation approach (section 2.3) in greater detail. Then we present the results and major findings in section 3: we compare the identified thermal properties between different major cities in California (section 3.1), and then simulate the thermal resilience during a heat wave event using the identified parameters (section 3.2). We discuss the recommended power outage duration that could avoid overheating risks in section 4.1, and the contribution and limitation of this study in section 4.2, before we conclude in section 5.

Method
We proposed a two-step approach to determine the maximum allowable power outage duration, as shown in figure 2.
The first step is to infer the thermal dynamics of the residential building stock. As discussed in the Background section, the conventional approach to investigate building thermal characteristics is constrained by its high costs and small sample size. In this study, we proposed a data-driven hybrid (grey-box) modelling approach: using a thermal resistance-capacity network model (R-C model) to characterize the building thermal dynamics and then using the smart thermostat data to estimate the value of the model's parameters; in this case, the value of thermal resistance (R) and thermal capacity (C) of a house. The dataset we used in this study is the Ecobee DYD Dataset [24]. The sampling rate of this dataset is 5 min and the temperature measurement resolution is 1 °F.

Thermal dynamic reduced-order model
Inspired by the thermal-electrical analogy, researchers proposed the R-C heat transfer network model to simulate the thermal dynamics of a building [25]. There are various orders of R-C models [26], i.e. different numbers of Rs and Cs in the R-C network. As with machine learning algorithms, higher-order models can deliver a more accurate model prediction but may suffer from over-fitting. Once the model order is determined, the model parameters (e.g. values of R and C) are estimated by fitting the measured data. In this study, we selected a 1R-1C model, as it could deliver a prediction with a root mean squared error (RMSE) of less than 0.5 °C, while avoiding over-fitting risks.
The reduced-order model used to simulate a residential building's thermal dynamics is shown in equation (1), where T in and T out are the indoor and outdoor air temperatures, R and C represent the thermal resistance and thermal capacity of the building, Q HVAC represents the heat from HVAC (a negative value for cooling and a positive value for heating), and T eq is the equivalent temperature rise that accounts for solar irradiation and internal heat gains (from occupants, lights, and appliance use). The term T eq characterizes the effect of solar and internal heat gains, and is defined as T eq = R * (Q solar + Q internal). The physical implication of T eq is: because of the solar and internal heat gains, the outdoor temperature T out is equivalently increased by T eq. T eq depends on the house's characteristics: orientation, shading, window-to-wall ratio, and window thermal properties. As shown in equation (1), the indoor air temperature change is driven by three terms: heat transfer between indoor and outdoor (including heat exchange through the exterior envelope and air infiltration), solar and internal heat gains, and heating or cooling provided by the HVAC. On the left hand side of equation (1), the thermal capacity term includes the thermal capacity of the envelope, furniture, and indoor air.
Figure 2. The data analytics process to inform the maximum power outage duration in California: We proposed this two-step approach to estimate the allowable maximum power outage duration in California. The first step is to infer the thermal characteristics of residential building stock in California using the connected smart thermostat data. The second step is to predict the thermal states when a power outage happens using the inferred thermal dynamics, and based on that prediction, to estimate the allowable maximum power outage duration.
In terms of the first term on the right hand side, the thermal resistance term takes into account not only the heat transfer through the building envelope, but also the heat transfer through air infiltration. As for the second term on the right hand side, the influence of solar radiation and internal heat gains is captured by adding an extra equivalent temperature term, T eq, to the ambient air temperature. It is worthwhile to point out that T eq is Q solar + Q internal scaled by R, which makes the first two terms on the right hand side of equation (1) consistent and comparable. The value of T eq depends on (a) the local solar conditions and (b) building characteristics that are not reflected by R, including the building's orientation, window-to-wall ratio, shading, and window performance. For instance, houses with a large window-to-wall ratio and a large window solar heat gain coefficient are exposed to larger solar heat gains and therefore have a larger T eq. As T eq varies from building to building, it is inferred through the parameter estimation process as well. The third term represents the heating or cooling provided by HVAC.
In a first-order, linear time-invariant (LTI) system, the time constant is widely used to characterize the system's response to a step input. Physically, the time constant measures how quickly the system responds to a step signal. In a system in which the variable is increasing, the time constant is the time the variable takes to reach 63.2% of its final (asymptotic) value in the step response. In a system in which the variable is decreasing, the time constant is the time the step response takes to fall to 36.8% of its initial value. The thermal dynamics of residential buildings after the cooling is turned off during a power outage event resemble an LTI system's step response [27]. Therefore, we used the thermal time constant (TTC) as a key parameter to evaluate the thermal resilience of residential buildings during a power outage event.
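For reference, a form of equation (1) that is consistent with the term-by-term description above can be written as follows. This is a reconstruction from the surrounding definitions, not a verbatim copy of the published equation; the subscripted symbols correspond to T in, T out, T eq and Q HVAC in the text. The second line gives the free-floating solution (Q HVAC = 0) under the additional assumption that T out and T eq are constant over the period.

```latex
C \frac{dT_{in}}{dt} = \frac{T_{out} - T_{in}}{R} + \frac{T_{eq}}{R} + Q_{HVAC},
\qquad T_{eq} = R \left( Q_{solar} + Q_{internal} \right)

T_{in}(t) = \left( T_{out} + T_{eq} \right)
          - \left( T_{out} + T_{eq} - T_{in}(0) \right) e^{-t/(RC)}
```

In this form, the indoor temperature relaxes towards T out + T eq at a rate set by the product RC.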
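The 63.2% and 36.8% figures follow directly from the exponential step response of a first-order system. The sketch below illustrates this numerically, assuming the TTC of a 1R-1C model is the product R × C; the temperatures and the 4 h time constant are hypothetical values chosen for illustration.

```python
import numpy as np


def step_response(t, start, final, tau):
    """First-order response that relaxes from `start` towards `final`."""
    return final + (start - final) * np.exp(-t / tau)


tau = 4.0  # thermal time constant in hours (hypothetical)

# Rising case: indoor temperature drifting from 24 degC towards 40 degC
rising = step_response(tau, start=24.0, final=40.0, tau=tau)
fraction_covered = (rising - 24.0) / (40.0 - 24.0)      # equals 1 - 1/e

# Falling case: temperature relaxing from 40 degC back towards 24 degC
falling = step_response(tau, start=40.0, final=24.0, tau=tau)
fraction_remaining = (falling - 24.0) / (40.0 - 24.0)   # equals 1/e

print(f"covered after one tau:   {fraction_covered:.3f}")
print(f"remaining after one tau: {fraction_remaining:.3f}")
```

A larger TTC therefore means the indoor temperature takes longer to cover the same share of the gap to its asymptote.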

Inferring thermal parameters
In the thermal dynamic model of equation (1), there are three types of variables:
• Parameters to be estimated: R, C, T eq
• Measured variables: T out, T in
• Unmeasured variables: Q HVAC
To facilitate the parameter identification, we proposed some rules and applied them to select several chunks of data that can be used for system identification.
• Since the Ecobee DYD dataset does not record energy-related data, Q HVAC is not available. As a solution, we selected the time when heating or cooling was turned off (a.k.a. the free-floating period) to get rid of the term Q HVAC in the model.
• In the heating season, we used the data between 10 PM and 7 AM for parameter inference, because during this period (a) the solar heat gain was zero, (b) the internal heat gain was marginal, and (c) the outdoor air temperature was the lowest. Therefore, we can assume T eq is 0, and the term (T out − T in)/R represents the right-hand side of equation (1) during this period.
• In cooling season, T eq is not negligible. We used the data around noon (between 10 AM and 3 PM) because we wanted to infer the largest T eq (due to the solar radiation), which is needed in the worst scenario analysis of thermal resilience. Additionally, we used less than three hours of data so we can (a) assume T eq was constant during the model fitting, and (b) identify the largest solar heat gain for worst scenario analysis.
• We selected the free-floating periods that lasted more than 1.5 h and with a temperature change of more than 2 °C because more data points and larger state variations could help the system identification process.
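The selection rules above can be sketched as a simple run-length search over the 5 min samples. This is a minimal illustration, not the authors' code: the function name and input arrays are hypothetical, the Ecobee DYD column layout is not assumed, and "temperature change" is taken here as the difference between the last and first sample of a run.

```python
import numpy as np

SAMPLE_MINUTES = 5  # sampling interval of the thermostat data


def find_free_floating(t_in, hvac_on, min_hours=1.5, min_delta=2.0):
    """Return (start, stop) index pairs of qualifying HVAC-off runs.

    `stop` is exclusive. A run qualifies if it lasts longer than
    `min_hours` and the indoor temperature changes by more than `min_delta`.
    """
    t_in = np.asarray(t_in, dtype=float)
    off = ~np.asarray(hvac_on, dtype=bool)
    # Pad with False so runs touching either end still have two edges
    padded = np.concatenate(([False], off, [False]))
    edges = np.flatnonzero(np.diff(padded.astype(int)))
    starts, stops = edges[0::2], edges[1::2]
    periods = []
    for start, stop in zip(starts, stops):
        hours = (stop - start - 1) * SAMPLE_MINUTES / 60.0
        delta = abs(t_in[stop - 1] - t_in[start])
        if hours > min_hours and delta > min_delta:
            periods.append((int(start), int(stop)))
    return periods


# Two HVAC-off runs of equal length; only the first drifts by more than 2 degC
t_in = np.concatenate([np.full(12, 24.0), np.linspace(24.0, 27.0, 24),
                       np.full(12, 24.0), np.linspace(24.0, 24.5, 24)])
hvac_on = np.concatenate([np.ones(12, bool), np.zeros(24, bool),
                          np.ones(12, bool), np.zeros(24, bool)])
periods = find_free_floating(t_in, hvac_on)
print(periods)
```

In practice the time-of-day windows (10 PM to 7 AM in the heating season, around noon in the cooling season) would be applied before this search.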
We used scipy.optimize [28] for parameter identification. Once the parameter fitting was done, we only kept those results with an RMSE of less than 0.5 °C. We dropped a result if its RMSE was larger than 0.5 °C because a large RMSE indicates some of our assumptions might be invalid; for instance, T eq did not stay constant for this household during this period. We summarized the assumptions in table 2.
Because of the data quality issue and the restrictions we used to select the data, we could not infer the thermal properties for every residential building recorded in the database. Figure 3 plots the three major error types we encountered during the parameter identification process. The sample size of the database increased by more than eight times between 2015 and 2019. The major reason the parameter identification failed in the heating season is that the temperature variation during free floating was less than 2 °C, because California generally has a mild winter. The major reason the parameter identification failed in the cooling season is that we could not find qualified free-floating periods, for two reasons. First, cooling is less frequently used in Californian households. Second, fewer residents turned off cooling between 10 AM and 3 PM. By contrast, more occupants tend to turn off heating or reset to a lower indoor temperature setpoint after they fall asleep, so it is more likely to find a free-floating period between 10 PM and 7 AM. Once we were able to find a qualified data fitting period, the model was able to deliver regressions with few households having an RMSE larger than 0.5 °C.
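A minimal sketch of such a fit is given below. It assumes the 1R-1C form with Q HVAC = 0, in which the indoor temperature trajectory depends on R and C only through their product, so the sketch fits the time constant and T eq directly. The use of `scipy.optimize.least_squares`, the bounds, the starting values, and the synthetic data are illustrative choices, not the authors' implementation.

```python
import numpy as np
from scipy.optimize import least_squares

DT_HOURS = 5.0 / 60.0  # 5 min sampling interval


def simulate(tau, t_eq, t_in0, t_out):
    """Free-floating indoor temperature, stepped exactly per sample."""
    decay = np.exp(-DT_HOURS / tau)
    t_in = np.empty(len(t_out))
    t_in[0] = t_in0
    for k in range(len(t_out) - 1):
        target = t_out[k] + t_eq          # asymptote for this step
        t_in[k + 1] = target + (t_in[k] - target) * decay
    return t_in


def fit_period(t_in_meas, t_out, fix_t_eq_zero=False):
    """Fit tau (hours) and T eq (degC); fix T eq = 0 for night-time data."""
    def residuals(p):
        t_eq = 0.0 if fix_t_eq_zero else p[1]
        return simulate(p[0], t_eq, t_in_meas[0], t_out) - t_in_meas

    x0 = [5.0] if fix_t_eq_zero else [5.0, 2.0]        # hypothetical guesses
    lower = [0.1] if fix_t_eq_zero else [0.1, 0.0]
    upper = [200.0] if fix_t_eq_zero else [200.0, 30.0]
    sol = least_squares(residuals, x0, bounds=(lower, upper))
    rmse = float(np.sqrt(np.mean(sol.fun ** 2)))
    t_eq = 0.0 if fix_t_eq_zero else float(sol.x[1])
    return float(sol.x[0]), t_eq, rmse


# Synthetic cooling-season period (about 3 h, noise-free, illustrative)
t_out = np.full(36, 35.0)
t_in_meas = simulate(4.0, 5.0, 24.0, t_out)
tau_hat, t_eq_hat, rmse = fit_period(t_in_meas, t_out)
keep = rmse < 0.5  # acceptance rule from the text
print(f"tau={tau_hat:.2f} h, T_eq={t_eq_hat:.2f} degC, RMSE={rmse:.3f}")
```

With real thermostat data, the 1 °F measurement resolution adds quantization noise, which is one reason a filter on RMSE is useful.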

Model validation
We applied two methods to validate our approach. We first validated our model against real measurement data. Figure 4 plots the measured and predicted temperature of a random winter and summer day, showing a good fit of our model.
The second validation approach is to compare the values of the thermal time constant of the same households inferred from the heating and cooling seasons. Theoretically, the TTC inferred from summer data and the TTC inferred from winter data should be similar unless there is a major retrofit of the building. The box plot of figure 5 shows good consistency between the TTC median values and the ranges between the 25th and 75th percentiles. The variation of TTC inferred from the cooling season was larger than that inferred from the heating season for two reasons. First, as shown in figure 3, the sample size of residential buildings with successful parameter identification was larger in the heating season. Second, the difference between indoor and outdoor temperature was larger in the heating season, so the indoor temperature variation during free-floating mode was larger in the heating season. A larger temperature variation facilitates a more accurate parameter identification.

Thermal properties of Californian residential buildings
We plotted the distribution of estimated TTC and T eq for Californian cities that have more than 25 successful parameter identification houses in figure 6. It could be observed that cities in the Central Valley (Fresno, Bakersfield, and Clovis) and Northern California (Sacramento) have larger TTC values compared with cities in the Southern Coast region (Los Angeles, Santa Clarita, Irvine). This is partly because the California Building Energy Efficiency Standards [29] require building thermal insulation in colder climate zones to be higher. Better building thermal insulation leads to a larger thermal time constant.
In terms of T eq , Southern California cities such as Los Angeles, San Diego, and Rancho Cucamonga have larger T eq than Northern California cities (e.g. Sacramento, San Jose). This is because Southern California cities have more sunshine, leading to higher solar heat gains for residential buildings. The higher solar heat gains drive up the T eq of residential buildings in Southern California.

Thermal resilience in power outage
After the thermal dynamics are identified, we apply them to simulate the indoor thermal states when a power outage happens. As air conditioning is turned off during a power outage, the building enters the 'free-floating' mode. The rate of indoor temperature increase depends on the ambient weather conditions and the building thermal properties: a higher ambient temperature, higher T eq, and smaller TTC lead to a faster temperature increase. In this study, we considered the worst scenario by using the highest hourly temperature of 2020 as the ambient air temperature of each city and inferring the T eq of the noon time (see the Method section). The impact of solar radiation is considered by using T eq inferred from historical data, assuming the contribution of solar heat gains stays about the same during the heat wave event.
We used the API provided by the National Oceanic and Atmospheric Administration (NOAA) [30] to download the weather data. We downloaded the weather data from the geographically closest weather station for each city during 2020. To consider the worst scenario, we used the hourly maximum temperature as the input to analyse the residential buildings' thermal resilience during the power outage. The hourly maximum ambient temperature during the heat wave reached 50 °C in some regions, as shown in figure 7. To determine the allowable maximum power outage duration, we needed a clear definition of overheating risks in residential buildings. Based on the heat index classification of NOAA, the occupants should be Cautious when the indoor heat index is above 80 °F (26.7 °C) and Extremely Cautious when the indoor heat index is above 90 °F (32.2 °C) [31]. In Europe, based on the Chartered Institution of Building Services Engineers' Environmental Design Guideline, there should be no more than 1% of annual occupied hours over an operative temperature of 28 °C in living rooms, and no more than 1% of annual occupied hours over an operative temperature of 26 °C in bedrooms [32]. In this study we used 28 °C and 32 °C as the two thresholds of overheating.
We considered two scenarios: (a) not notifying residents about the power outage and (b) notifying residents about the power outage in advance, corresponding to two initial conditions. When the residents have not been notified about the power outage, we assumed the initial condition to be an indoor temperature of 24 °C. If the residents have been notified about the power outage in advance, they might take some pre-cooling measures to further cool down the indoor environment before the power outage; therefore the initial indoor temperature was assumed to be 22 °C (which is at the lower end of the ASHRAE cooling temperature range from 22 °C to 25 °C) once the cooling was shut off.
The evolution of indoor temperature during a power outage event is plotted in figure 8. We plotted Los Angeles and San Jose because these two cities had the largest sample size in the database and also are among the biggest cities by population in California. The temperatures of San Jose's houses rise more slowly than those of Los Angeles's houses for three reasons: (a) Los Angeles has a higher ambient temperature, (b) Los Angeles has higher solar heat gains (reflected by a higher T eq in figure 6(b)), and (c) houses in Los Angeles have less insulation (reflected by a smaller TTC in figure 6(a)). The pre-cooling measure can increase the allowable maximum power outage duration by about an hour in both cases. Figure 9 shows a plot of the percentage of households exposed to overheating risks as a function of power outage duration for six Californian cities: Los Angeles (largest California city by population), San Diego (2nd), San Jose (3rd), Sacramento (6th), Riverside (12th), and Irvine (14th). Those six cities have the largest sample sizes in the Ecobee DYD database. A higher percentage of households are exposed to overheating risks with increasing power outage duration. Because the indoor temperatures of houses in Los Angeles increase the fastest, the highest percentage of households are exposed to overheating risks in Los Angeles given the same power outage duration. Conversely, households in San Jose, a Northern Californian city, have the lowest overheating risk during the power outage event.
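The effect of the two initial conditions can be sketched with the closed-form free-floating response of a first-order model. The sketch assumes a constant ambient temperature and constant T eq during the outage (the worst-case treatment described above); the function name and the household parameters are hypothetical and are not values inferred in this study.

```python
import math


def hours_to_threshold(tau, t_eq, t_out, t_start, t_limit):
    """Hours until indoor temperature reaches t_limit with cooling off.

    Assumes constant ambient temperature t_out and constant t_eq, so the
    indoor temperature relaxes exponentially towards t_out + t_eq.
    """
    target = t_out + t_eq
    if t_start >= t_limit:
        return 0.0            # already at or above the threshold
    if target <= t_limit:
        return math.inf       # threshold is never reached
    return -tau * math.log((target - t_limit) / (target - t_start))


# Hypothetical household: TTC of 4 h, T eq of 5 degC, 40 degC outdoors
tau, t_eq, t_out = 4.0, 5.0, 40.0
unnotified = hours_to_threshold(tau, t_eq, t_out, t_start=24.0, t_limit=28.0)
precooled = hours_to_threshold(tau, t_eq, t_out, t_start=22.0, t_limit=28.0)
gain = precooled - unnotified
print(f"no notice: {unnotified:.2f} h, pre-cooled: {precooled:.2f} h")
```

For this illustrative household, starting 2 °C cooler extends the time to reach 28 °C; the size of the gain depends on the TTC, T eq, and ambient temperature of each house.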

Recommended power outage duration
The determination of the power outage duration that avoids overheating risks for residents depends on two criteria: (a) the acceptable maximum indoor air temperature, and (b) the allowable percentage of households exposed to overheating risk. In this study, the recommended allowable power outage duration was determined as the maximum period during which fewer than 10% of households are exposed to overheating risks. We selected 28 °C as the threshold value because we wanted to be more conservative. In extreme scenarios, to avoid a blackout of the entire power grid, a higher temperature such as 30 °C or even 32 °C may be considered. We chose 90% rather than 100% of households to be free of overheating for two reasons: (a) to account for measurement uncertainty and modelling error, and (b) to avoid the results being dominated by the few poorly insulated houses. The criteria to determine the maximum allowable power outage duration can be set by the local grid operators. We plotted the recommended power outage duration for Californian cities in figure 10. Informing the residents in advance of a power outage, so they can cool down their houses to a lower temperature before the power outage, is a simple and effective strategy to increase the acceptable power outage duration, by more than one hour for most cities.
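The 10% criterion can be sketched as a search over candidate outage durations. The example below assumes that a per-household time to reach the threshold temperature is already available; the function names, the 15 min search step, and the sample values are hypothetical.

```python
import numpy as np


def exposed_fraction(hours_to_limit, duration):
    """Share of households that reach the threshold within `duration` hours."""
    return float(np.mean(np.asarray(hours_to_limit) <= duration))


def recommended_duration(hours_to_limit, max_share=0.10, step=0.25,
                         horizon=12.0):
    """Longest duration (in `step` increments) with fewer than `max_share`
    of households exposed; returns 0.0 if even the first step fails."""
    best = 0.0
    for duration in np.arange(step, horizon + step, step):
        if exposed_fraction(hours_to_limit, duration) < max_share:
            best = float(duration)
        else:
            break  # exposure only grows with duration
    return best


# Hypothetical hours-to-28-degC for 20 households in one city
hours_to_limit = [1.1, 1.6, 2.1, 2.3, 2.6, 2.7, 2.9, 3.1, 3.2, 3.4,
                  3.6, 3.7, 3.9, 4.1, 4.3, 4.6, 4.8, 5.1, 5.4, 5.9]
duration = recommended_duration(hours_to_limit)
print(f"recommended maximum outage: {duration:.2f} h")
```

A grid operator could change `max_share` or the threshold temperature used to compute the input times to reflect local criteria.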

Contribution and limitation
The advantages of our proposed approach are threefold. First, it can save costs and labour compared with conventional methods of investigating the thermal properties of building stock, because we are using the existing Ecobee DYD database. Second, the sample size of this method is larger than that of the existing data sources, which enables a more robust, accurate, and reliable estimation of building thermal performance. For instance, the RECS surveyed 5.6 thousand households once every four years. The Ecobee DYD database recorded the smart thermostat data of 85 thousand U.S. households. In California, we have 8399 samples out of 11 500 thousand households state-wide, and the sample rate is 0.70 samples per thousand households, exceeding the sample rate of RECS by 23 times. Third, the hybrid grey-box approach integrates the strengths of data-driven and physics-based models: achieving a high modelling accuracy with clear physical implications. The developed R-C models and inferred parameters can be used for other applications, such as to estimate the load shifting potential of residential building stocks by leveraging the passive thermal storage of building structures, and to evaluate building thermal efficiency policies.
The major limitation of this approach lies in the potential sample bias. We can only sample from households that have installed the smart thermostats, which may not be a random sampling from the whole population. Even though some researchers found that the technology adoption intention is not influenced by household income [33], there is a lack of evidence to support the idea that the residential buildings recorded in the Ecobee DYD database are a random sampling of the whole residential stock. The positive side is, with the penetration of smart thermostat technology and increasing number of households that are willing to donate their data (the sample size of the DYD dataset increased from 7000 in 2015 to 101 000 in 2019), this method could gradually approach the true thermal property distribution of the residential building stock.
Another limitation of the approach is the use of the first-order R-C model and the related assumptions, which may lead to larger errors for certain individual houses. However, our study focuses on the building stock level. In addition, a considerable share of households' data could not be used in the study because of the modelling assumptions and selection process. With the continuous growth of data in the Ecobee DYD dataset, many more valid households' data can be used in future research.

Conclusion
With climate change, heat waves have become more frequent and intense. Heat waves pose new challenges to energy security and public health as they drive up electricity demand and expose residents to overheating risks. In extreme cases, when the power supply is unable to meet the demand increase, rotating power outages are instituted. Californian residents experienced rotating power outages in August 2020, when a historic heat wave extended across the western United States. The lack of space cooling when a power outage occurs during a heat wave exposes residents to high overheating risks, which could cause a public health crisis.
If a power outage is unavoidable during heat waves, it is essential to understand how long the power outage can last, so the grid stress can be relieved while minimizing occupants' overheating risks. In this study, we proposed a data-driven inverse modelling approach to inform decision makers and grid operators on planning a rotating power outage. Our proposed approach was tested using data from the California rolling power outage in August 2020.
Our method includes two steps: (a) infer the thermal characteristics of residential building stock using the connected smart thermostat data, and (b) simulate the thermal states when a power outage happens using the inferred thermal dynamics and, based on that prediction, recommend the maximum allowable power outage duration.
We tested our approach in California, with special focus on large Californian cities with large sample sizes. We first inferred the thermal properties of residential stock using the Ecobee DYD dataset. Residential buildings in Northern California cities have a larger thermal time constant due to more stringent building thermal regulations. Then we applied the inferred models to simulate the thermal resilience of residential buildings during the power outage. For the majority of Californian cities, the power outage should not last more than two hours during heat waves to avoid overheating risks. Informing the residents in advance, so they can cool down their houses to a lower temperature before power outages during heat waves, is a simple and effective strategy to increase the acceptable power outage duration by about one hour.

Data availability statement
The data that support the findings of this study are available upon reasonable request from the authors.