Impacts of Heat Decarbonisation on System Adequacy considering Increased Meteorological Sensitivity

This paper explores the impacts of decarbonisation of heat on demand and subsequently on the generation capacity required to secure against system adequacy standards. Gas demand is explored as a proxy variable for modelling the electrification of heating demand in existing housing stock, with a focus on impacts on timescales of capacity markets (up to four years ahead). The work considers the systemic changes that electrification of heating could introduce, including biases that could arise if legacy modelling approaches continue to prevail. Covariates from gas and electrical regression models are combined to form a novel, time-collapsed system model, with demand-weather sensitivities determined using lasso-regularized linear regression. It is shown, using a GB case study with one million domestic heat pump installations per year, that the sensitivity of electrical system demand to temperature (and subsequently sensitivities to cold/warm winter seasons) could increase by 50% following four years of heat demand electrification. The central estimate of additional peak demand is 1.75 kW per heat pump, with variability across three published heat demand profiles leading to a range of more than 14 GW in the most extreme cases. It is shown that the legacy approach of scaling historic demand, as compared to the explicit modelling of heat, could lead to over-procurement of 0.79 GW due to bias in estimates of additional capacity to secure. Failure to address this issue could lead to £100m overspend on capacity over ten years.


Introduction
Capacity markets have become a common framework for providing security of supply in energy systems without problems of oversupply of costly peaking capacity or the high economic and political costs of load shedding. Peak electricity demands are expected to grow significantly in regions with high levels of gas-fueled space heating, as heat is moved onto electrical systems to meet decarbonisation targets. These heat demands have much stronger sensitivities to meteorological and seasonal factors than historic electrical demand [1,2,3,4]. In some countries this issue is compounded by the replacement of aging conventional thermal plant by renewable generation which is highly sensitive to meteorological conditions [5,3]. The study of system adequacy uses probabilistic methods, and demand uncertainty can both bias and increase variability in the estimates of capacity to secure; this bias should therefore be identified and, wherever possible, minimized. In fact, it is the range of calculated capacity to secure values across scenarios that sets the procurement target in some markets [6]. Given the large sums allocated in the capacity market (£700 million in the most recent GB capacity auction), even modest reductions in uncertainty could yield significant dividends in terms of social welfare.
Despite this, uncertainty in underlying changes in demand is seldom considered in detail in system adequacy studies. The system adequacy literature of the past decade has primarily focused on the determination of capacity value of non-dispatchable plant (e.g., renewables, demand-side response, energy storage) [7,8,9,10,11,12,13,14]. These approaches assume a known, well-defined distribution of demand, with approaches typically scaling historic demand curves to meet projected forecast peak demand [6], neglecting changes in the distribution of demand duration curves that may occur due to clean energy transitions. To address this, a recent review of the GB capacity market by independent academic experts made the formal recommendation that this issue be explored, noting that 'The factors affecting the evolution of peak behaviour should be analysed ... from the broad perspectives of current and future technical, society and regulatory evolutions' [15]. This point is particularly pertinent given recent annual heat pump installation targets of 600,000 and one million per year (by the close of this decade) which have been proposed by the UK government and the UK's Climate Change Committee, respectively; this will increase demand-weather sensitivity dramatically [16].
Although there is a range of capacity market designs and time frames, the most important timescale for capacity markets typically lies between the τ − 1 delivery for the following peak season (e.g., 18 months ahead of time for a summer auction to be delivered in the following winter) and the τ − 4 delivery for the peak season four years ahead [7]. This relatively short timescale with respect to energy transitions [17,18,19] means that the practicalities of capacity procurement via markets (or other means) are not usually considered by papers that study the decarbonisation of heat; works usually focus on the increase in seasonality or growth of system peak over several decades. For example, the impacts of increasing heat pump demand are considered for the year 2050 in [1], estimating that 50% penetration of heat pumps would result in increases in demand of more than 20 GW. Similarly, the works [20,21] develop a spatial heat model which is combined with network models for the GB gas and electric system, modelling prices and heat demand to 2050. It is estimated that peak demands could grow by close to 10 GW between 2020 and 2025. Other works, such as [4], consider the impacts of heat without explicitly considering the timescale of changes that could occur. That work demonstrates that, if 30% of gas demand is shifted to highly efficient heat pump load, then electrical peak demands would increase by 25%. The authors of [22] find that an 80% heat pump demand scenario increases GB demand by as much as 54 GW in an unmitigated scenario, dropping to 16 GW under more favourable conditions. In [23], the authors study four heat transition scenarios, noting that the pathways with widespread electrification of heat (via heat pumps) lead to particularly strenuous impacts on electricity systems.
All of the aforementioned works point to electrification of heat causing systemic change, with significantly increased demand-weather sensitivity as well as base demand. Unfortunately, the disjuncture between the timescales of the capacity market and these works on long-term energy transitions results in challenges interpreting possible impacts on capacity procurement.
There are some works that do consider changes in demand up to five years ahead, although the implications of these works are not studied in the context of impacts on capacity markets. In [3], the authors consider a five-year forecast of demand net of renewables, and (with the scenarios considered) estimate that net demand could drop by up to 5%; the coefficient of performance (COP) of heat pumps is shown to be a key parameter that will determine the future impacts of heat electrification. The authors of [2] consider a similar problem, estimating changes in demand every five years until 2050. Long-term peak demand forecasting focuses on the problem of estimating the demand peak, using data-driven methods [24,25,26]. These approaches do not explicitly consider the system margin or the subsequent capacity required to meet this demand; additionally, it is usually implicit in data-driven models that demand growth can be extrapolated, with the main uncertainties typically in terms of economic growth (and so they are not appropriate if there is systemic change). In [27], it is shown that the seasonality of electrified gas (heat) demand leads to 60% faster peak demand growth than the equivalent electrical energy demand. However, across reviewed works, it has been identified that there is a gap in the analysis of impacts of decarbonisation of heat on capacity adequacy over the time frame of capacity markets, despite the key role that electrical heating is expected to play in achieving net zero.
This paper studies the impact of increasing the weather-dependency of electricity demand via electrification of space-heating, and how this will affect capacity markets through changes in capacity to secure. The paper develops a novel system adequacy model for this purpose which combines hourly demand and renewable generation models with an explicit space heating profile derived from daily gas demand and heat pump profiles. This enables the model to capture changes that heat pumps could have on demand profiles as well as on meteorological sensitivities. With this model, the impact of one million annual heat pump installations on a GB case study is considered, with a focus on three specific aspects.
• Changes in demand-weather sensitivity. The evolution of demand-weather sensitivities is studied using Lasso-Regularized linear regression. Weather variables known to correlate with either gas or electricity demand are studied comprehensively, with the covariates subsequently derived from climate reanalysis data.
• Bias arising from the legacy Load Duration Curve approach. Models accounting for heat demand explicitly are compared against the legacy approach whereby heat demand growth is implied by the scaling of historic demand. Possible biases introduced by the legacy implicit approach are quantified.
• Estimating scenario variability in Additional Capacity to Secure. Scenario analysis is used to capture possible changes in variability of capacity to secure, with meteorological sensitivities and heat demand profile uncertainties considered.
By considering a detailed system adequacy model with these objectives in mind, we study not only the effects of heat pumps on demand, but also how they influence the capacity required to meet security of supply standards. The contributions of the work are summarised as follows.
1. A demand model is proposed that explicitly accounts for increased electrical space heating demand at a national level, and is suitable for consideration within system adequacy studies. Space heating demand is estimated by assimilation of historical gas demand data with heat pump usage profiles.
2. The demand model is considered alongside 30 years of historic climate reanalysis data to create a demand hindcast covering winters from 1990 to 2020. The model uses Lasso-Regularized regression to avoid overfitting and exclude uninformative covariates. Net demand across each winter can then be hindcast using coincident renewable generation.
3. The model of net demand is combined with models of conventional and renewable generation to quantify security of supply in terms of loss of load expectation and subsequently capacity to secure for a GB case study. Scenario analysis across heat pump profiles and coefficients of performance shows variability in capacity to secure greater than all scenarios presently considered in the most recent GB capacity market auction.
4. It is demonstrated for the first time that significant bias in capacity to secure could be introduced if models fail to capture changes in the underlying end-uses of electrical demand. This is achieved by comparing the explicit space heating model with conventional approaches that ignore changes in time- and weather-based dependencies of electrified heating demand.
This paper has the following structure. Firstly, the novel Explicit heating model is introduced in Section 2, to illustrate how heating demand can be accounted for in time-collapsed adequacy models in a natural way. In Section 3, we outline the Lasso-based linear regression approach, used to consider how heating demand could change the nature of future demand curves. The specific details of the GB system model are outlined in Section 4, to introduce the key characteristics of the subsequent detailed case study. In Section 5, the full case study is used to study the key impacts of increased sensitivity, bias in capacity to secure estimates, and increased variability in capacity to secure. Salient conclusions on the modelling approach are drawn in Section 6.

Time-Collapsed System Adequacy Modelling
An energy system is adequate at a given time instant if there is sufficient generation to meet demand. Power systems are designed so that if all generation is available then there will always be a positive margin; however, due to forced (unplanned) outages at generators and varying meteorological conditions (in systems with significant amounts of variable renewables) there is a non-zero probability of a shortfall occurring.
In this Section, we consider the structure of the full adequacy modelling approach, as summarized in Figure 1 (with the exception of the linear regression approach which is considered in more detail in Section 3).

Time-collapsed Adequacy Model
A time-collapsed (or 'snapshot') adequacy model is designed to model the distribution of the system margin at a randomly selected time instant during the peak season [28,14]. We propose the use of an hourly time-collapsed model, where each hour of the day $t$ is modelled separately. The key advantage of this approach is that the impacts of a range of demand profiles on the whole day can be considered; this is important as some heat demand profiles peak in the morning (e.g., [29]). Using this approach, the system margin $Z_t$ for a given hour $t$ is given by the linear sum

$$ Z_t = X_t + Y_t - D_t , \qquad (1) $$

where $X_t$ represents dispatchable generation, $Y_t$ represents renewable generation, and $D_t$ represents total system demand. (Each of the variables in (1) is a random variable.) Dispatchable generation $X_t$ consists of conventional thermal plant and interconnectors, whilst renewable generation $Y_t$ is modelled as a combination of onshore wind, offshore wind, and solar generation. The 'overall' margin $Z$ combines the profile for the whole day, and as such can be written

$$ Z = \sum_{t} I_t Z_t , \qquad (2) $$

where $I_t$ is a binary variable with a value of unity if it is hour $t$, and is zero otherwise.

It is worthwhile stressing the time dependencies in (1). Dispatchable generation $X_t$ is considered to be equally likely to be available during the whole peak demand period, and so this random variable is independent of demand and renewables [30]. On the other hand, both renewable generation $Y_t$ and demand $D_t$ are dependent on weather, and so the distribution of net demand $D_t - Y_t$ must be found by considering coincident times (i.e., these are both assumed dependent on the weather of a given time). Once the net demand has been found, the distribution of the system margin $Z_t$ for each hour $t$ can be found by convolution of the probability distribution functions of the net demand $D_t - Y_t$ and conventional generation $X_t$.
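As an illustration of the convolution step, the following minimal sketch builds the distribution of the margin for a single hour on a discrete grid. The grid step `STEP_MW`, the function names `net_demand_pmf` and `margin_pmf`, and the toy numbers are assumptions made for this example rather than details of the model used in this work.

```python
import numpy as np

STEP_MW = 100  # grid resolution in MW (an assumed value, chosen for illustration)


def net_demand_pmf(demand_mw, renewables_mw, step_mw=STEP_MW):
    """Empirical pmf of net demand D_t - Y_t from coincident (same-time) samples.

    Entry j of the result is the probability that net demand is j * step_mw.
    """
    net = np.asarray(demand_mw, dtype=float) - np.asarray(renewables_mw, dtype=float)
    idx = np.rint(net / step_mw).astype(int)
    if idx.min() < 0:
        raise ValueError("this sketch assumes non-negative net demand")
    return np.bincount(idx) / idx.size


def margin_pmf(x_pmf, n_pmf):
    """pmf of Z_t = X_t - (D_t - Y_t), with X_t independent of net demand.

    x_pmf[i] is P(X_t = i * step); n_pmf[j] is P(D_t - Y_t = j * step).
    Returns the margin values (in grid steps) and their probabilities.
    """
    pmf = np.convolve(x_pmf, n_pmf[::-1])  # adds X_t and the negated net demand
    z_steps = np.arange(pmf.size) - (len(n_pmf) - 1)
    return z_steps, pmf


# Toy usage: one 100 MW unit available 90% of the time, two coincident samples.
x_pmf = np.array([0.1, 0.9])
n_pmf = net_demand_pmf([100.0, 100.0], [0.0, 100.0])
z_steps, pmf = margin_pmf(x_pmf, n_pmf)
lolp = pmf[z_steps < 0].sum()  # probability of a shortfall in this hour
```

Demand and renewable samples are differenced before any histogramming so that their shared weather dependence is retained; only the dispatchable generation, which is assumed independent, enters through the convolution.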

Evaluating Security of Supply
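The system margin $Z$ can be used to define a range of risk metrics to understand the likelihood and severity of shortfalls. The likelihood is considered using the Loss of Load Expectation (LOLE), which has units of hrs/yr and is given by

$$ \mathrm{LOLE} = n \, \mathbb{E}\left[ \mathbb{1}\{ Z < 0 \} \right] , \qquad (3) $$

where $n$ is the number of periods in the year, $\mathbb{E}$ denotes the expectation operator, and $\mathbb{1}\{\cdot\}$ is the indicator function (so that the expectation is the probability of a shortfall).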
The LOLE metric has the advantage of being the target security standard of many European systems, whilst also being closely linked to the Loss of Load Probability (LOLP), which is used as an operational indicator of scarcity by transmission system operators.
The LOLE is subsequently used to determine the Additional Capacity to Secure (ACTS). For a given security standard of $T_{\mathrm{LOLE}}$ hours per year, the ACTS is the (perfectly reliable) generation required to bring the LOLE to that security standard,

$$ \mathrm{LOLE}(Z + \mathrm{ACTS}) = T_{\mathrm{LOLE}} , \qquad (4) $$

where we use $\mathrm{LOLE}(Z)$ to denote the calculation of LOLE from (3) using system margin $Z$ (calculated as in (2)). For example, suppose that the GB security standard is 3 hours LOLE per year, but the generation already committed for the τ − 4 delivery year (the 24/25 winter) results in an LOLE of 10 hours per year, such that additional capacity is required. If, say, 1500 MW of perfectly reliable generation could bring the LOLE to 3 hours per year, then the ACTS would be 1500 MW. Note however that real generators are not perfectly reliable, and so a de-rating factor must be applied (i.e., more than 1500 MW of real generating capacity would need to be procured).
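The calculation of LOLE and ACTS from a discretised margin distribution can be sketched as follows. This is a minimal example under stated assumptions: the margin for each hour of the day is held as a probability mass function on a common MW grid, each hour is taken to be equally likely, and the function names, grid and toy numbers are hypothetical. On a discrete grid the equality defining the ACTS is replaced by the smallest grid value that meets the standard.

```python
import numpy as np


def lole(z_mw, pmf_by_hour, n_periods):
    """LOLE = n * P(Z < 0); every hour of the day is taken as equally likely."""
    lolp_by_hour = pmf_by_hour[:, np.asarray(z_mw) < 0].sum(axis=1)
    return n_periods * lolp_by_hour.mean()


def acts(z_mw, pmf_by_hour, n_periods, target_lole, step_mw):
    """Smallest capacity c on the grid with LOLE(Z + c) <= target_lole.

    LOLE(Z + c) is non-increasing in c, so a linear scan finds the first
    grid point meeting the standard. c may be negative if the system is
    already more secure than the standard requires.
    """
    z_mw = np.asarray(z_mw)
    for c in range(-int(z_mw.max()), -int(z_mw.min()) + step_mw, step_mw):
        if lole(z_mw + c, pmf_by_hour, n_periods) <= target_lole:
            return c
    raise ValueError("no grid point meets the target")


# Toy usage: a four-point margin grid and two 'hours' (illustrative numbers only).
z_mw = np.array([-200, -100, 0, 100])
pmf_by_hour = np.array([[0.0, 0.1, 0.4, 0.5],
                        [0.1, 0.1, 0.3, 0.5]])
base_lole = lole(z_mw, pmf_by_hour, n_periods=100)
capacity = acts(z_mw, pmf_by_hour, n_periods=100, target_lole=3.0, step_mw=100)
```

Adding perfectly reliable capacity only shifts the margin distribution to the right, which is why the search never needs to recompute the convolution.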

Definition of Peak Demand
It is also useful to define the peak demand for a season of $N_{\mathrm{Wtr.}}$ winter days so that the peaks of different demand distributions can be compared. This work uses the method that is used to define Average Cold Spell peak demand [31]. This approach resamples winter demands many times to determine the distribution of peak demands empirically; the median value of these demand peaks is then selected as the Peak Demand. This method can be denoted for the time-collapsed model of this work as

$$ \text{Peak Demand} = \mathrm{median}\big( \max\{ D_1, D_2, \dots, D_{24 N_{\mathrm{Wtr.}}} \} \big) , \qquad (5) $$

where the $i$th random variable over which the $\max\{\}$ function is taken, $D_i$, has the corresponding demand model of that hour's margin (e.g., the 6am model is used for $D_6$, $D_{30}$, and so forth).
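The resampling procedure can be sketched as below. The sketch assumes that each hourly demand model is represented by a column of empirical samples and that hours are drawn independently of one another, in keeping with the time-collapsed formulation; the function name `peak_demand`, the number of resamples and the seed are hypothetical choices.

```python
import numpy as np


def peak_demand(hourly_samples, n_winter_days, n_resamples=1000, seed=0):
    """Median of resampled seasonal peaks.

    hourly_samples : array of shape (n_obs, 24); column t holds samples of D_t.
    Each resample draws one value per hour for each of n_winter_days days
    and records the maximum over the whole synthetic season.
    """
    hourly_samples = np.asarray(hourly_samples, dtype=float)
    rng = np.random.default_rng(seed)
    n_obs, n_hours = hourly_samples.shape
    hours = np.arange(n_hours)
    peaks = np.empty(n_resamples)
    for r in range(n_resamples):
        rows = rng.integers(0, n_obs, size=(n_winter_days, n_hours))
        season = hourly_samples[rows, hours]  # hour t is drawn from column t
        peaks[r] = season.max()
    return np.median(peaks)
```

Using the median rather than the mean of the resampled peaks makes the result less sensitive to a few extreme draws.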

Impacts of ACTS on Capacity Markets
The design of an effective capacity market is a challenge from both a practical and theoretical point of view [32,33]. The GB capacity market is regarded as a well-designed, modern market [34], although technical details continue to develop, with annual recommendations from an independent Panel of Technical Experts [15].
A brief overview of the design of this market is given in [35]. The Target Capacity to Secure is calculated using a range of supply- and demand-side sensitivities considered around the base case, as well as system-based sensitivities based on National Grid's Future Energy (FE) Scenarios. In total, between 20 and 30 sensitivities are typically considered. Based on these scenarios, the Least Worst Regret (LWR) methodology is used to estimate the Target Capacity to Secure from all scenarios, looking to identify the generating capacity that will have the smallest regret based on projected costs associated with oversupply (based on the net Cost of New Entry) and the costs of shortfall. The latter costs are calculated as the Expected Energy Unserved multiplied by a monetary estimate of the Value of Lost Load. Cost curves for each scenario are combined, and the aggregate least worst-regret option identified.
Explicitly calculating the LWR Target Capacity to Secure is beyond the scope of the current work: suffice it to say, the Target Capacity to Secure is almost entirely dependent on only the largest and smallest estimates of the ACTS [6]. As such, not only is it important that calculations of ACTS have low bias, but also that the variability in forecasts for supply and demand is correctly captured. The Range of Capacity to Secure (RoCS) is therefore considered to evaluate the total variability across all scenarios, and is defined as

$$ \mathrm{RoCS} = \max_{S} \mathrm{ACTS}(S) - \min_{S} \mathrm{ACTS}(S) , \qquad (6) $$

where $\mathrm{ACTS}(S)$ denotes the calculation of capacity to secure ACTS using scenario $S$. For a more detailed critical discussion on the LWR methodology see [6, Apdx. 7].

Explicit System Model, using Non Daily Metered Gas Demand $G_{\mathrm{NDM}}$ as a Proxy Heat Variable
We first consider a system model with electrical demand $D_t$ which is explicitly decomposed into underlying electrical demand $E_t$ and electrified space heating demand $H_t$, as

$$ D_t = E_t + H_t . \qquad (7) $$

Given this disaggregation, we refer to the model (7) (combined with (1)) as the Explicit system model, as heating demand is accounted for explicitly within the calculations of system adequacy.
For the purposes of this work, the Explicit model (7) will be considered the ground truth (for a given heat demand model $H_t$), against which alternative approaches will be compared. This is because space heating demand $H_t$ has been observed to have a very different demand profile to that of underlying electrical demand $E_t$, irrespective of whether the means of fulfilling that demand is by gas boilers or electric heat pumps [29,4]. The approach therefore has the advantage of being more closely linked to the systemic changes driven by the electrification of heat, although a method of estimating the electrical heating demand $H_t$ is required.

Estimating Electric Space Heating Demand
To model heating demand $H_t$ for the Explicit system model, we propose that suitably scaled Non Daily Metered (NDM) gas demand $G_{\mathrm{NDM}}$ is used as a proxy variable, in a similar vein to [27,36]. NDM gas demand is largely comprised of water and space heating demand (cooking using gas is less than 3% of total domestic consumption) [37], and the customer composition is largely residential and flats/commercial properties. It does not include large industrial customers (such as gas-fired power stations), who are instead billed as part of the Daily Metered class [38].
Following other works, it is assumed that the daily electrified heating demand follows some electrified heating profile $h$, and that water heating demand $G_{\mathrm{HW}}$ is approximately constant throughout the year [39]. As such, the (space) heating demand $H$ is calculated as an affine function of the daily NDM gas demand $G_{\mathrm{NDM}}$. The parameters of this function are $G_{\mathrm{HW}}$, the (constant) daily hot water demand for gas; $f_{\mathrm{Dom}}$, the fraction of gas demand meeting domestic demands; $k_{\mathrm{COP}}$, a system-wide coefficient of performance; and $n_{\mathrm{H}}$, the number of customers with electrified heating demand.
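One possible arrangement of such an affine scaling is sketched below, purely for illustration. The order in which the factors are applied, the normalisation by a hypothetical total number of gas customers `n_gas`, the neglect of boiler efficiency, and all numerical values are assumptions of this sketch and should not be read as the formulation used in this work.

```python
import numpy as np


def electrified_heat_demand(g_ndm, g_hw, f_dom, k_cop, n_h, n_gas, profile):
    """Hourly electrified space heating demand for one day (assumed form).

    g_ndm, g_hw : daily NDM gas demand and daily hot water gas demand (GWh/day)
    f_dom       : fraction of gas demand meeting domestic demands
    k_cop       : system-wide coefficient of performance
    n_h         : number of customers with electrified heating
    n_gas       : hypothetical total number of gas customers (normalisation)
    profile     : 24 non-negative weights giving the within-day shape h
    """
    profile = np.asarray(profile, dtype=float)
    h = profile / profile.sum()               # share of daily energy in each hour
    space_heat_gas = f_dom * (g_ndm - g_hw)   # affine in g_ndm
    per_customer = space_heat_gas / n_gas     # GWh/day per customer
    return n_h * per_customer * h / k_cop     # GWh per hour, i.e. GW
```

Because the function is affine in `g_ndm`, any weather sensitivity of daily gas demand carries over directly to the electrified heat demand, scaled by the number of heat pumps and the coefficient of performance.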

Implicit System Model, using Load Duration Curves
The standard approach for considering the evolution of system demand is via the use of load duration curves. With this approach, the estimated peak demand for a given year is forecast by some means and expressed through the coefficient $k_{\mathrm{Peak}}$. Once this has been determined, the distribution of the total demand $D$ is calculated by linearly scaling the electrical demand $E$ as

$$ D = k_{\mathrm{Peak}} E . \qquad (9) $$

The model (9), combined with the system margin (1), we refer to as the Implicit system model, as changes in heat demand are implied by the coefficient $k_{\mathrm{Peak}}$. This approach is used, for example, in the methodology for modelling demands in the GB Capacity Market [40]. For the purposes of creating Implicit models that are equivalent to Explicit demands in a meaningful way, $k_{\mathrm{Peak}}$ is chosen so that the Peak Demands (5) are equal. The clear advantage of this approach is its conciseness: once suitable load duration curves $E$ have been identified, only the peak demand coefficient $k_{\mathrm{Peak}}$ for a given year needs to be determined. On the other hand, it must be assumed that the load duration curve describing the electrical demand $E$ will not change significantly. As mentioned in Section 2.2, there are good reasons to think that heat demand $H_t$ has a different distribution to electrical demand $E_t$; however, if changes to electrified heat demand are small, then this Implicit model might be preferable.
To study explicitly the differences between the models, we consider the Bias in the estimates of ACTS for a given scenario $S$ to be given by

$$ \mathrm{Bias}(S) = \mathrm{ACTS}_{\mathrm{Im.}}(S) - \mathrm{ACTS}_{\mathrm{Ex.}}(S) , \qquad (10) $$

where the subscripted $\mathrm{ACTS}_{\mathrm{Ex.}}$, $\mathrm{ACTS}_{\mathrm{Im.}}$ are the calculations of the ACTS using Explicit model (7) and Implicit model (9), respectively. In this way, the effect of changes in demand profiles on the ACTS can be taken into account for models which otherwise are identical according to their Peak Demand (5).
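The peak-matching step and the bias calculation can be sketched as follows. The sketch assumes a simplified stand-in for the Peak Demand (the sample maximum, passed as `peak_fn`), a sign convention in which positive bias corresponds to over-procurement by the Implicit model, and toy demand values; the function names are hypothetical.

```python
import numpy as np


def implicit_demand(e, h, peak_fn=np.max):
    """Scale E so that the Implicit and Explicit models share one peak.

    peak_fn stands in for the Peak Demand definition; it must satisfy
    peak_fn(k * e) == k * peak_fn(e) for k > 0, as the maximum does.
    """
    e, h = np.asarray(e, dtype=float), np.asarray(h, dtype=float)
    k_peak = peak_fn(e + h) / peak_fn(e)
    return k_peak, k_peak * e


def acts_bias(acts_implicit, acts_explicit):
    """Bias of the Implicit model, taking the Explicit model as ground truth."""
    return acts_implicit - acts_explicit


e = np.array([30.0, 40.0, 50.0])  # toy underlying demand, GW
h = np.array([0.0, 5.0, 10.0])    # toy electrified heat demand, GW
k_peak, d_implicit = implicit_demand(e, h)
d_explicit = e + h
```

In this toy case both models have the same peak, but the Implicit model places more demand in the off-peak samples than the Explicit model, which is the kind of difference in the load duration curve that the bias measures.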

Weather-Dependent Energy System Modelling
A variety of statistical inference procedures, aimed at understanding the effects of exogenous factors (such as weather) on energy demand, have been developed by both academia and industry. To ensure that all possible effects of increased space heating are captured, we consider the statistical inference methods developed by both the gas system operator, National Grid Gas, and the electricity system operator National Grid ESO (NGESO).
NGESO estimates the sensitivity of electricity demand to weather using the Average Cold Spell methodology [31]. This calculates the sensitivity of unrestricted system demand to weather variables. (Unrestricted system demand is defined as the transmission system demand with demand-side response, embedded generation and interconnector exports all accounted for.) From this, the 'underlying', non-weather sensitive demand can be estimated. The exact weather variables used are not specified, unfortunately, but by far the most common weather variable studied in academic literature is temperature (alongside temporal variables such as day of week) [24,41,42,43,1,2,30,44,5]. National Grid Gas uses the Composite Weather Variable [45] to quantify the impacts of weather on demand. Unlike the Average Cold Spell methodology, however, many of the variables used in the Composite Weather Variable are public, and are described in [46]. The variables include temperature, wind chill, solar irradiance, and in future could include the effects of precipitation. Combining the approaches from gas and electrical domains, a total of ten weather-based covariates are considered, as well as a range of temporal variables (see Table 1).

Table 1.
Variable | Description
$W_{\mathrm{On}}$, $W_{\mathrm{Off}}$ | Hourly onshore/offshore wind capacity factors
$S$, $\bar{S}$ | Solar PV capacity factor
- | Binary variable for weekdays (each of Monday to Sunday)
$t_{\mathrm{Prd},C_i}$, $t_{\mathrm{Prd},S_i}$ | $i$th harmonic time component (C/S as cosine/sine terms)
$t_{\mathrm{Sunset}}$ | Sunset time
$t_{\mathrm{Lin}}$ | Linear time variable
$\hat{E}_{\mathrm{Out}}$ | Out-turn peak electrical demand
$G_{\mathrm{Out}}$ | Out-turn mean winter gas demand

Converting Meteorological Reanalysis to Weather-Based Covariates
Accounting for the dependencies of energy systems on weather requires an approach to convert historic weather measurements into appropriate covariates. Meteorological reanalyses are becoming increasingly popular for modelling of weather within the context of energy systems. These datasets are a gridded reconstruction of past weather observations, created by combining historic observations with a high-fidelity numerical model of the earth system, providing a high quality, comprehensive record of how weather and climate have changed over multiple decades. In the context of energy systems, the high spatial resolution allows for the climate of a region or country to be captured, as well as being freely available for researchers [47,43,48,49]. In this work, we develop the methods described in [44,50] for deriving weather-based covariates from the MERRA-2 reanalysis data [51].
In total, there are six covariates based on meteorological conditions (and a further four averaged variables), as in Table 1. These parameters are derived from the raw reanalysis data as follows:
• Hourly onshore and offshore wind capacity factors $W_{\mathrm{On}}$, $W_{\mathrm{Off}}$ are constructed by modification of the method from [44], with the main steps summarised here. The MERRA-2 reanalysis near-surface wind speeds are extrapolated using a power-law to 58.9 m and 85.5 m, based on the capacity-weighted mean onshore and offshore turbine hub heights respectively (from [52]). Onshore and offshore turbine power curves from [53] are obtained using [54], with aggregated GB onshore and offshore wind capacity factors then created by considering wind farm locations from [52] for the year 2017 (Figures 2a, 2b). As in [44], this results in good accuracy compared to out-turn data [55], with Coefficient of Determination $R^2$ of 0.95, 0.90 and RMS error of 7.9%, 7.5% for onshore and offshore wind capacity factors respectively, when compared against 2018 daily forecast capacity factors.
• Hourly solar capacity factors S are modelled using a combination of surface temperature and incoming surface irradiation, as described in [44].
• Hourly temperature $T$ and cold-weather uptick variables $T_{\mathrm{Cold}}$ are calculated based on population-weighted 2-meter temperatures. (The latter is used as a proxy for the Cold Weather upturn of the Composite Weather Variable, and is intended to model increased demand at cold temperatures.) The cold-weather uptick is calculated with respect to a cold spell cut-off temperature $T_0$ of 3 °C, based on [56, p. 6]. The population weighting of GB is shown in Figure 2c.
• Wind chill is calculated by multiplying the population-weighted 2-meter temperature $T$, taken with respect to a wind chill temperature $T_{\mathrm{WC}}$, by a similarly population-weighted 2-meter wind speed $W_{\mathrm{Pop}}$, taken with respect to a threshold $W_{\mathrm{WC}}$. The wind chill parameters $W_{\mathrm{WC}}$, $T_{\mathrm{WC}}$ are chosen as −1.5 m/s and 16.5 °C, which are consistent with regional values reported in [56, p. 6].
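Some of the steps above can be sketched in a few lines. The sketch is illustrative only: the shear exponent of 1/7, the choice of reference height, the hinge form used for the cold-weather uptick (degrees below the cut-off, zero otherwise) and the function names are assumptions rather than the exact choices made in this work.

```python
import numpy as np


def extrapolate_wind(speed_ref, h_ref, h_hub, shear=1.0 / 7.0):
    """Power-law extrapolation of wind speed from height h_ref to hub height h_hub.

    The shear exponent is an assumed, commonly used default.
    """
    return np.asarray(speed_ref, dtype=float) * (h_hub / h_ref) ** shear


def population_weighted(field, population):
    """Population-weighted mean over grid cells.

    field      : array of shape (n_times, n_cells), e.g. 2-meter temperature
    population : array of shape (n_cells,), population in each grid cell
    """
    weights = np.asarray(population, dtype=float)
    weights = weights / weights.sum()
    return np.asarray(field, dtype=float) @ weights


def cold_uptick(temp_c, t0=3.0):
    """Assumed hinge form: degrees below the cut-off t0, and zero above it."""
    return np.maximum(t0 - np.asarray(temp_c, dtype=float), 0.0)
```

For example, `extrapolate_wind(speed, 10.0, 85.5)` would scale wind speeds given at an assumed 10 m reference height up to the offshore hub height quoted above.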

Linear Regression with Lasso Regularization
The goal of regression is to determine the underlying sensitivity of a dependent output variable with respect to given input variables (covariates). Least-squares linear regression achieves this goal by minimizing the sum of squared residuals, with the Least Squares sensitivities $\theta_{\mathrm{L.S.}}$ determined as

$$ \theta_{\mathrm{L.S.}} = \arg\min_{\theta} \, \| y - x\theta \|_2^2 , \qquad (13) $$

for covariates $x$ and output $y$. Unfortunately, naïve Least Squares (13) can lead to over-fitting as there is no penalty on the complexity of a model, risking returning a model with poor predictive performance due to spurious correlations [57, Ch. 7.2]. Indeed, with the relatively large number of covariates considered in this work, it was found that the models showed poor out-of-sample predictive capabilities compared to within-sample fitted data.
To overcome this issue, we use Lasso Regularization [57, Ch. 3]. In addition to mitigating against over-fitting, Lasso Regularization provides solutions $\theta_{\mathrm{Lasso}}$ that are sparse. That is, coefficients corresponding to covariates which have little or no impact on the output are set to zero. This is achieved by adding a regularization term $\alpha \| \theta \|_1$ to the Least Squares cost function (13), with the Lasso estimate of the sensitivities $\theta_{\mathrm{Lasso}}$ determined as

$$ \theta_{\mathrm{Lasso}}(\alpha) = \arg\min_{\theta} \, \| y - x\theta \|_2^2 + \alpha \| \theta \|_1 . \qquad (14) $$

The regularization term penalizes large coefficients, having the effect of reducing the magnitude of individual entries in $\theta_{\mathrm{Lasso}}$, depending on the value of $\alpha$. For $\alpha \to 0$, the Lasso estimate tends to the Least Squares estimate (13), and will therefore tend to over-fit; on the other hand, for sufficiently large $\alpha$, all values in the vector $\theta_{\mathrm{Lasso}}$ will be zero, under-fitting in most cases. Between these two extremes will be an 'optimal' value of $\alpha$, $\alpha^*$, which will maximise the out-of-sample predictive performance (in this work the Coefficient of Determination, $R^2$, is used as a scoring function). The optimal Lasso fit $\theta^*_{\mathrm{Lasso}}$ is fitted with this value of $\alpha$, i.e.,

$$ \theta^*_{\mathrm{Lasso}} = \theta_{\mathrm{Lasso}}(\alpha^*) . \qquad (15) $$

Thus, a method is required to estimate the out-of-sample predictive performance and subsequently determine $\alpha^*$.
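The sparsity of the Lasso can be seen directly in the special case where the covariates are orthonormal, for which the minimiser of the squared residuals plus $\alpha \|\theta\|_1$ is known in closed form: each Least Squares coefficient is shrunk towards zero by $\alpha/2$ and set exactly to zero if its magnitude is smaller than that. The sketch below illustrates this special case only; it is not the solver used for the regression models of this work, and the function name is hypothetical.

```python
import numpy as np


def lasso_orthonormal(x, y, alpha):
    """Lasso solution when the columns of x are orthonormal (x.T @ x = I).

    Minimises ||y - x @ theta||_2^2 + alpha * ||theta||_1 by soft-thresholding
    the Least Squares estimate x.T @ y at alpha / 2.
    """
    theta_ls = x.T @ y
    return np.sign(theta_ls) * np.maximum(np.abs(theta_ls) - alpha / 2.0, 0.0)


x = np.eye(3)                    # trivially orthonormal covariates
y = np.array([3.0, -0.2, 1.0])
theta = lasso_orthonormal(x, y, alpha=1.0)
```

Here the small second coefficient is removed entirely while the others are reduced in magnitude, illustrating both the shrinkage and the sparsity described above.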

Determining the Lasso Parameter α * and Computational Complexity
To determine the out-of-sample predictive performance, and subsequently determine an optimal choice of $\alpha$, we use k-fold cross-validation:
• $k$ cross-validation folds are created from the $k$ years of data, with each fold having one year of data for validation and $k - 1$ years for training.
• The sensitivity $\theta_{\mathrm{Lasso}}(\alpha)$ is determined for a range of values of $\alpha$ and for each of the $k$ cross-validation folds.
• Estimates of the mean and standard error of the prediction score ($R^2$) are calculated for each value of $\alpha$ using the value of $R^2$ calculated for each of the $k$ cross-validation folds.
• The value of $\alpha^*$ is chosen for which the mean prediction score is within one standard error of the maximum value of the prediction score $R^2$.
Although there is no closed-form solution to (15), the computational complexity of the Lasso is typically the same as ordinary Least Squares [57,Ch. 3]. The scikit-learn package [58] is used for all regression calculations.
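The cross-validation procedure can be sketched with scikit-learn as follows. Several details are assumptions of this sketch rather than statements about the implementation used here: where more than one value of $\alpha$ lies within one standard error of the best score, the largest (sparsest) is taken; the covariates are assumed to be standardised beforehand; and the function name `select_alpha` is hypothetical. Note also that scikit-learn's `Lasso` scales the squared-error term by $1/(2 n_{\mathrm{samples}})$, so its `alpha` is not numerically identical to $\alpha$ in an unscaled cost function.

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.metrics import r2_score


def select_alpha(x, y, years, alphas):
    """Leave-one-year-out cross-validation with a one-standard-error rule.

    Returns the chosen alpha and the mean and standard error of R^2 per alpha.
    """
    folds = np.unique(years)
    scores = np.empty((len(alphas), len(folds)))
    for i, alpha in enumerate(alphas):
        for j, year in enumerate(folds):
            train, val = years != year, years == year
            model = Lasso(alpha=alpha, max_iter=10_000).fit(x[train], y[train])
            scores[i, j] = r2_score(y[val], model.predict(x[val]))
    mean = scores.mean(axis=1)
    sem = scores.std(axis=1, ddof=1) / np.sqrt(len(folds))
    best = int(mean.argmax())
    within = mean >= mean[best] - sem[best]   # within one standard error of best
    alpha_star = max(a for a, keep in zip(alphas, within) if keep)
    return alpha_star, mean, sem
```

Holding out whole years, rather than random samples, keeps each validation fold free of the within-winter correlation that would otherwise make the out-of-sample score look better than it is.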

GB Case Study System Modelling
In this section we discuss how the energy data sources of Table 2 are used to build and validate a generally representative model of the GB system. The GB system is chosen for study as it has had a capacity market functioning for several years, and because a very high fraction of its domestic space heating demand is currently met by natural gas [59].

Growth in Underlying Demand and Generation
The peak demand season for both electrical and gas systems in Northwest Europe occurs during the winter months from November to March. Following prior works, we study of the 20 weeks of the year following the first Sunday of November, with the exception of the two weeks surrounding Christmas (these two weeks have low demand and so the likelihood of shortfall is negligible) [73,30,74]. +RXURIWKHGD\ 3RZHU*: The underlying electrical demand E is assumed to remain steady at 19/20 levels, following industry five-year forecasts [60]. To consider a rapid but credible increase in heat pumps in a system on top of this, as could be considered at some point on a pathway to net-zero, we consider a rate of one million domestic installations per year [71].
Conventional generators are represented by a two-state model, using forced-outage rates from [62, Table 1], with reported availabilities between 81% and 97% for these technologies. The forecast of total installed capacity of each class of generation technology is taken from the five-year forecast [60]; following previous works, these total values are then disaggregated into individual generating units based on unit sizes taken from National Grid's 2013 'Gone Green' scenario [27]. The distribution of interconnector flows is modelled with a uniform distribution. Specifically, imports are assumed to be equally likely between the high and low capacity factors of the individual interconnected countries reported by NGESO [75], with flows assumed independent.
With these models for interconnectors and conventional generators, the probability density of the dispatchable generation X can be determined via convolution of all individual generators and interconnectors [27]. Boxplots of the distribution of the resulting random variable X are shown in Figure 3c for each delivery year. The median of the generation X for 19/20 (discounting embedded generation) is 54.6 GW, compared to the previously procured capacity of 52.4 GW for 20/21 as reported in capacity market reports, and is therefore considered reasonably representative of the GB system.
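The convolution of two-state units described above can be sketched as follows. This is an illustrative sketch only: the function name `capacity_distribution`, the 100 MW discretisation step and the three example units are hypothetical, and interconnectors (which are modelled here with uniform distributions) are omitted for brevity.

```python
def capacity_distribution(units, step_mw=100):
    """Probability mass function of total available capacity.

    units is a list of (capacity_mw, availability) pairs, each a two-state
    generator that is either fully available or on forced outage.
    Returns a list pmf where pmf[i] = P(X = i * step_mw).
    """
    pmf = [1.0]  # With no units, available capacity is 0 MW with certainty.
    for capacity_mw, availability in units:
        shift = round(capacity_mw / step_mw)
        new_pmf = [0.0] * (len(pmf) + shift)
        for i, p in enumerate(pmf):
            new_pmf[i] += p * (1.0 - availability)  # unit on forced outage
            new_pmf[i + shift] += p * availability  # unit available
        pmf = new_pmf
    return pmf


# Hypothetical fleet: two 500 MW units (90% available), one 300 MW unit (80%).
step_mw = 100
pmf = capacity_distribution([(500, 0.9), (500, 0.9), (300, 0.8)], step_mw)
mean_mw = sum(i * step_mw * p for i, p in enumerate(pmf))
```

Each unit is folded in with one pass over the running distribution, so the cost grows with the number of units times the number of capacity bins.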

Estimating Historic System Demand
The estimation of total demand is challenging due to increasing levels of embedded generation [67] and customer demand management (CDM, colloquially referred to as 'triad avoidance', which results in up to 2.5 GW of demand-side response) [65]. Embedded generation covers a range of technologies, both dispatchable (such as small diesel generation) and renewable (wind and solar generators). Additionally, NGESO keeps reserves to cover the loss-of-largest-infeed, which at present has a value of 1.32 GW [66] (this effectively increases demand by the same amount).
The unrestricted system demand E is therefore determined as the sum of Transmission System Demand [64]; estimated embedded wind and solar generation output [64]; estimates of remaining embedded generation [67] (assuming an availability of 90%); and, estimates of customer demand management (provided by NGESO). The latter is assumed to run at 100% from 5-6 pm and at 40% at 4pm/7pm. Collating all data, peak demands (without weather correction) match NGESO estimates of weather-corrected peak to within 2 GW from 14/15 through to 19/20 winters with mean absolute error of 1.05 GW. The closeness of estimates gives confidence that the demand model E, like the generation model X, is also broadly representative of the GB system.

Heat Pump Load Profiles and System-wide Coefficient of Performance
The heat pump load profile h will have a large impact on results, as heat demand H at each hour is linearly related to this profile (8). Therefore, three heat pump profiles are considered as system-wide sensitivities, taken from literature using [54], and then normalised and compared in Figure 4. The first, which we define as our central profile C, is taken as the cold-weather weekday profile of Love et al. [29], which is based on measured data from several hundred UK-based heat pumps. Secondly, we consider the flat profile F, as considered by Eyre et al. in [23] (such a profile has been reported as the de-facto standard for heat demand in [1]). Finally, we compare these against the profile P of Sansom, as described in [69], hereafter referred to as the 'peaking' profile. It is noted in [69] that P has been very influential in policy, even though it has a peak higher than other estimates of half-hourly heat demand.
The system-wide coefficient of performance k COP is also subject to considerable uncertainty. In [23], the authors estimate the value of the COP for air-source heat pumps could increase from 2.0 up to 3.0 from 2010 through to 2050, similarly increasing from 2.5 to 4.0 for ground-source heat pumps, although the authors assume the COP at peak to be 0.8 times lower than this (due to colder temperatures during peak demands). Similarly, in [70], the authors estimate seasonal performance factors (equivalent to the COP definition used in this work) of 1.5-2.1 for air-source heat pumps and 2.0-2.8 for ground-source heat pumps; again, during cold weather the performance of these systems will likely be lower than these values. To capture the range of values and possible improvements in building stock during any refitting, we therefore consider a COP range from 1.5 to 2.8, with a central estimate of 2.0.
Figure 4: The three heat pump profiles h used for sensitivity analysis (normalised profile value against hour of the day). The central profile C is from Love et al. [29]; the flat profile F is from Eyre et al. [23]; and the peaking profile P of Sansom is from [69].
The fraction of NDM gas demand used by domestic customers f Dom is 79% [72,68]. It is assumed that hot water heating demand G HW is evenly spread through the year [39] and that commercial and domestic properties have a similar use of hot water. Under these assumptions, the mean hourly gas demand for hot water is 9.9 GW throughout the year [37].

Results
The aim of this work is to consider how electrification of heat could impact capacity markets through changes in Additional Capacity to Secure. In Section 5.1, we first demonstrate the Lasso Regularization approach (as described in Section 3.2), before considering how the meteorological sensitivity could evolve for τ − 4 delivery in the 24/25 winter. Based on this model, in Section 5.2 we then consider how bias could be introduced in estimates of ACTS by Implicit modelling, even when the equivalent Explicit model has identical peak demand levels. Finally, in Section 5.3, the variability in the estimates of ACTS is considered across a range of scenarios.

Net Demand Regression with Meteorological Variables
We first illustrate the Lasso Regularization approach outlined in Section 3.2, considering fitting the linear model for the system demand at 6pm for τ − 0 (the 20/21 winter, with no heat pump demand). Figure 5 outlines the approach: a model is fit for each of the five training datasets (each with one contiguous winter used as a hold-out validation set); from this, the mean and standard error of the Coefficient of Determination from the hold-out validation data are calculated. The optimal value α * is selected as the value that is within one standard error of the maximum mean coefficient of determination from that hold-out scoring. Each vertical line indicates the value of 1/α for which the coefficient of a given covariate becomes non-zero, from which it can be observed that there are many coefficients which are estimated to have a value of zero.
Figure 5: Linear regression approach, using Lasso Regularization with 5-fold cross-validation (coefficient of determination R 2 against model complexity 1/α). As the regularization weight decreases (so that 1/α increases), the number of non-zero coefficients increases, as indicated by the vertical lines. The optimal value of α, α * , is chosen at the point at which the mean coefficient of determination R 2 is within one standard error of the highest mean value of R 2 .
Using this approach for all hours, the Coefficient of Determination R 2 with the optimal Lasso parameter α * varies between 0.73 and 0.94; it is above 0.87 for all hourly periods between 6am and 9pm. The mean value of R 2 for each of these hours is above 0.8 using the out-of-sample validation data. This compares very favorably with the least squares model: although the mean Coefficient of Determination throughout the day was reduced by 5%, the least squares modelling over-fit dramatically at most times, with all but a single time period having a Coefficient of Determination lower than −8 using the out-of-sample validation data. If the 24 models of each of the hours are combined, the overall coefficient of determination R 2 is 0.978 with sample standard deviation of all combined residuals of 1.02 GW (1.69%), indicating good performance.

Meteorological Sensitivity from Explicit and Implicit Models
To consider more clearly how the system has evolved after 4 years of heat demand growth, we now compare the Explicit heat model (7) and Implicit heat model (9) using the central heat pump profile C with a COP of 2.0. The Implicit model requires the zero-heat demand model of τ − 0 (the 20/21 winter) to be scaled by 14.0% so that the Peak Demand matches the Explicit model (as described in Section 2.3), and so all coefficients have increased by the same amount compared to that model. In contrast, the coefficients of the Explicit model have changed in more subtle ways to match the shape of the electrical-plus-heat demand model.
Firstly, we consider the five sensitivity coefficient values with the largest mean absolute value across the day, as plotted in Figure 6. (The covariates have been normalized to have zero mean and unit variance, and so a comparison in this way is meaningful.) It can be observed that sensitivities change significantly through the day; for example, mean 24-hour solar irradiance S̄ is related to the demand, but with the highest sensitivities after 11am (even extending to the hours after dark). A clear result is the change in sensitivity coefficients with respect to temperatures T, T̄. The sensitivity coefficient corresponding to mean daily temperature T̄ has increased sharply around 7am in the Explicit model, as compared to the value obtained by uniform scaling using the Implicit model. This sharp increase is caused by the morning peak of the central heat pump profile C, which occurs around this time (Figure 4). There is also a noticeable reduction in the sensitivity of demand to weekends t Sat , t Sun using the Explicit model.
The mean value during the afternoon peak (1600-1900) and the number of non-zero values of each of the sensitivity coefficients (across all 24 hours), as calculated for the τ − 4 year 24/25, are also compared for the Implicit and Explicit models in Table 3. The mean temperature sensitivity (across the instantaneous and 24-hour mean values T, T̄) has increased by 54% in the Explicit model, more than three times the increase seen in the Implicit model. It can also be observed that, of the additional covariates selected from the Composite Weather Variable, the 24-hour wind chill W Chill is most influential, with the corresponding model coefficient more than quadrupling from 0.042 to a value of 0.205 during the peak hours. Interestingly, there is not enough correlation between many of the other weather-based covariates and demand to yield a non-zero response. This is potentially due to the scope of the modeling considered here as compared to the industry models from which they are derived: we have only considered the coldest months, whereas approaches such as the Composite Weather Variable are designed to model year-round weather sensitivities. For example, covariates such as the cold-spell uptick T Cold could account for the fact that demand-temperature sensitivities are known to be different in winter and summer in the GB system [76].
Table 3: Comparison of the Explicit (7) and Implicit (9) models for τ − 4 delivery (24/25). Despite the models having the same peak demands, the coefficients differ as the Explicit model is fit using gas demand as a proxy for heat, where the Implicit model only scales existing load duration curves. Reported are the mean of the regression coefficients during the system peak (1600-1900 hours); numbers in parenthesis indicate the number of non-zero coefficients across the whole day.

Bias in ACTS Calculations from Implicit Heat Demand Modelling
Having compared the sensitivities of the Implicit and Explicit models, we now consider how the ACTS capacity changes for each of these models through to τ − 4 delivery. Table 4 reports the Additional Capacity to Secure for each of the five years to τ − 4 for the central heat pump profile C and COP of 2.0. Whilst the underlying electrical demand E does not change, the generation fleet X does, particularly from τ − 3 to τ − 4, as there are significant levels of decommissioning of legacy plant expected during this period (Figure 3c). It can be seen that as the level of electrical heat demand H increases, the ACTS calculated by the Explicit and Implicit system models drift apart, with increasing over-procurement Bias (10). By delivery at the τ − 4 time period, the Bias is such that 0.79 GW additional ACTS is required for the Implicit model as compared to the Explicit model. By comparison, the Demand Curves used in the GB capacity market τ − 1 and τ − 4 auctions have had a width of 2 GW or less for the past two years [77]. This Bias could lead to systematic over-investment: if at τ − 1 there is a consistent over-procurement of 0.21 GW over 10 years at the net Cost of New Entry of £49m per GW-yr [77], this equates to an over-spend of £103m. A similar result is found for the other two heat pump profiles, as given in Table 5. The ACTS is biased by 0.71 GW for the flat profile F, a similar amount to the central profile C. The peaking profile P shows an even larger change of more than 2.3 GW, but that is perhaps unsurprising given the extreme evening peak of that profile (Figure 4).
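The over-spend figure quoted above follows from a simple product of the bias, the Cost of New Entry and the number of years. The sketch below reproduces that arithmetic; the function name is hypothetical, and the absence of any discounting or year-to-year variation in the bias is an assumption made for illustration.

```python
def cumulative_overspend_gbp_m(bias_gw, cone_gbp_m_per_gw_yr, years):
    """Undiscounted cost (in GBP millions) of procuring bias_gw of
    unnecessary capacity in every one of the given years."""
    return bias_gw * cone_gbp_m_per_gw_yr * years


# Values quoted in the text: 0.21 GW at tau - 1, GBP 49m per GW-yr, 10 years.
overspend = cumulative_overspend_gbp_m(0.21, 49.0, 10)
overspend_rounded = round(overspend, 1)  # 102.9, reported as GBP 103m
```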
To consider why this bias is introduced, we plot the load duration curve (LDC) of the demand D of the Explicit model for τ − 4 in Figure 7, with three different Implicit models fit. The first is a model fit to the once-per-year peak demand (as in (5)); the other two models fit the one-in-twenty-year median peak demand, and the eighteen-per-year median peak demand (i.e., the weekly median peak demand). As the Explicit and Implicit models have different distributions, this results in different demand profiles D, and subsequently different levels of Bias. It can be seen that the value of demand D at 50% duration is greater in the Implicit than Explicit models. This implies that the rate of increase of the Peak Demand of D is greater than that of the median demand of D: the Implicit scaling coefficient k Peak must over-compensate for the increased Peak Demand, and so overshoots the quantiles lower down the LDC.
Figure 7: Load duration curve of the demand D (GW, against duration of the winter) for the Explicit model (7) using central HP profile C and COP of 2.0, in comparison to the equivalent Implicit models (9), matching the eighteen-per-year, once-per-year and one-in-twenty median Peak Demands. The median demand is several GW higher for the Implicit models; it is only at high demand levels that the distributions of the Implicit models become close to that of the Explicit model (inset).
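The overshoot mechanism described above can be illustrated with a toy example. All demand values below are hypothetical and chosen only so that heat demand is concentrated in the highest-demand periods; the peak is taken as the simple maximum of the series, which is a simplification of the statistical Peak Demand definitions used in this work.

```python
from statistics import median


def load_duration_curve(demand):
    """Demand values sorted from highest to lowest."""
    return sorted(demand, reverse=True)


def implicit_scale_to_peak(base_demand, target_peak):
    """Uniformly scale base_demand so that its maximum equals target_peak."""
    k_peak = target_peak / max(base_demand)
    return k_peak, [k_peak * d for d in base_demand]


# Hypothetical hourly demands in GW.
electrical = [30.0, 35.0, 40.0, 45.0, 50.0]
heat = [1.0, 2.0, 3.0, 5.0, 7.0]  # larger at times of high demand

# Explicit model: heat is added to electrical demand period by period.
explicit = [e + h for e, h in zip(electrical, heat)]

# Implicit model: electrical demand is scaled to match the Explicit peak.
k_peak, implicit = implicit_scale_to_peak(electrical, max(explicit))

ldc_explicit = load_duration_curve(explicit)
ldc_implicit = load_duration_curve(implicit)
median_gap_gw = median(implicit) - median(explicit)
```

In this toy case both curves share the same peak, but the Implicit curve sits above the Explicit curve at the median, mirroring the behaviour seen in Figure 7.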

Sensitivity Analysis: Varying Heat Profiles and System Coefficient of Performance
As mentioned previously, sensitivity analysis is an integral part of resource adequacy planning and capacity markets, as there are many uncertainties for which it is difficult to assign probabilities accurately. In this final section we consider how the variability of ACTS capacity changes, considering first increased meteorological uncertainty (from increased demand-weather sensitivity) and secondly uncertainty in heat pump profiles h and system-wide coefficient of performance k COP . The goal is to compare the additional variability with existing variations considered by industry, and to put into perspective the bias observed in the previous section.

Meteorological Sensitivity at τ − 4 Delivery
To consider clearly how the meteorological sensitivity has changed, the first result we consider is how the τ − 4 delivery compares for models with and without heat pumps. Results are given in Table 6 for the central profile C and COP of 2.0. It is found that the ACTS increases from −0.72 GW (i.e., a small level of oversupply) to an under-supply of 6.26 GW, an increase in ACTS of 7.0 GW. This implies an increase in demand of 1.75 kW per heat pump at peak. This is remarkably close to the 1.7 kW per heat pump estimated by [29] and well within the range of 1.2 to 2.6 kW reported in [22].
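The per-heat-pump figure above can be checked directly from the two ACTS values. The sketch below assumes four million heat pumps at τ − 4 delivery (one million installations per year for four years); the variable names are illustrative.

```python
acts_without_hp_gw = -0.72  # ACTS with no heat pumps (small oversupply)
acts_with_hp_gw = 6.26      # ACTS with heat pumps at tau - 4
n_heat_pumps = 4_000_000    # assumed: one million installations per year

delta_acts_gw = acts_with_hp_gw - acts_without_hp_gw  # 6.98 GW, quoted as 7.0

# Convert GW to kW (1 GW = 1e6 kW) and divide by the number of heat pumps.
per_heat_pump_kw = delta_acts_gw * 1e6 / n_heat_pumps
```

Using the unrounded difference gives approximately 1.745 kW per heat pump; the 1.75 kW quoted in the text corresponds to the rounded 7.0 GW increase.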
Of particular interest for this work, however, is how changing the winter conditions changes the ACTS compared to the use of a long-term average. The 10/11 and 13/14 winters are chosen first for investigation as these have been identified in prior works as being of particular importance for the determination of peaking capacity in the GB system [43]. These winters have mean hourly temperatures T of 7.98 °C, 10… °C (the long-term mean and standard deviation of T are 9.46 °C, 3.4 °C, respectively); the mean offshore capacity factors W Off are 0.39 and 0.55 (with long-term mean and standard deviation of 0.51, 0.28 respectively). The results show an almost-symmetric change in the required capacity of close to 1.8 GW for those particular years with no heat pumps (Table 6). As the system meteorological sensitivity increases with heat pump numbers, however, the spread for those anomalous years increases by almost 50%, covering the range from −3.03 to 2.75 GW. Furthermore, Figure 8 plots the changes in ACTS for all winters, again comparing the model with and without heat pumps. In this figure, the colour of the points indicates the value of the mean temperature anomaly for that winter. From this figure, it is first observed that the 10/11 and 13/14 winters are close to the extremes for the ACTS spread at τ − 4 delivery, as expected. The use of several decades of climate data allows for a much deeper understanding of the variability of these estimates, and allows for the severity of individual winters to be compared.
It is interesting to note from this figure, however, that it is not only the mean temperature that determines the change in ACTS: the Pearson correlation coefficient between the mean temperature anomaly and the change in ACTS is r = −0.55. For example, the 17/18 winter is cooler than the long-term average but requires less generating capacity than the long-term climate, as the very cold temperatures from that winter do not occur at points of high net demand. This can also be observed in winters such as 06/07 and 12/13; whilst these are the extreme years in terms of mean temperature anomaly, they do not lead to the greatest changes in ACTS. As both wind generation and heat pump demand increase, the sensitivity of the system to wind (through renewables) and temperature (through demand) will continue to evolve, and so the meteorological conditions of greatest importance to explain power system variability will also continue to change.

Impacts of Changing System Performance Factors
The ACTS for each combination of heat pump profile h and coefficient of performance k COP is now considered. Figure 9 plots these sensitivities against Weather and Modelling sensitivities, as well as against the range of scenarios and sensitivities considered within the GB capacity market [75]. Weather sensitivities are based on the cold/warm winters of 10/11 and 13/14 (as in Section 5.3.1), whilst the Modelling uncertainties represent the ACTS based on the five cross-validation models (i.e., five models fit with the optimal α * in (15), but with the data of one winter not included). NGESO's FE scenarios consider a range of supply and demand pathways up to 2050, whilst ECR Sensitivities consist of (amongst others) over- and under-delivery of the capacity market, unusually cold or warm weather, and over/under estimation of nominal peak demands.
The Explicit modelling uncertainty around τ − 1 delivery has a RoCS of 4.7 GW, increasing to 14.5 GW for τ − 4. In comparison, the current industry (NGESO) sensitivities and scenarios for τ − 1 and τ − 4 remain roughly equivalent in their Range of Capacity to Secure, with total RoCS of 6.6 GW and 6.7 GW respectively. In other words, the uncertainty due to the addition of one million heat pumps per year, given the three heat pump profiles C, F, P, and COPs from 1.5-2.8, is more than double that of all sensitivities considered in the capacity market report for 2020, even without considering variations in the actual number of installed heat pumps n H . This huge range in possible outcomes for high penetrations of heat pumps has been noted before [22], although to our knowledge it has not been identified as an issue in the context of capacity markets.
The unrestricted profile P of Sansom is considered pessimistic [69], with a peak demand much greater than the other profiles. In the worst case (with system COP of 1.5), four million heat pumps would each add 4.6 kW to the peak. In terms of peak heat demand per dwelling, this is not unreasonable: measured data analysed in [69] shows a heat demand peak of 8 kW per dwelling at the coldest temperatures, and so the value calculated in this work would correspond to an equivalent COP of 1.7. However, heat pumps tend to have very different load profiles to gas boilers, as equivalent devices of the same volume and footprint are much lower power, generally resulting in much flatter profiles.
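The equivalent COP quoted above is the ratio of peak heat demand to peak electrical demand per dwelling. A minimal check of that arithmetic is sketched below, with an illustrative function name; it treats the COP as a simple ratio at the peak and ignores any diversity between dwellings.

```python
def equivalent_cop(heat_demand_kw, electrical_demand_kw):
    """Ratio of heat delivered to electricity consumed."""
    return heat_demand_kw / electrical_demand_kw


# 8 kW measured peak heat demand per dwelling against the 4.6 kW of
# electrical demand per heat pump found in the worst case of this work.
cop_equiv = equivalent_cop(8.0, 4.6)
cop_equiv_rounded = round(cop_equiv, 1)
```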

Nevertheless, even with the unrestricted profile P excluded, the uncertainty in peak demand increase using only profiles C, F shows a Range of Capacity to Secure of 5.05 GW, 75% of the τ − 4 uncertainty of NGESO (again, even without considering variation in the number of actual heat pumps installed or heat pump DSR). These numbers are comparable with white-box modelling estimates such as [22], which (by consideration only of the reported range of possible peak demands of heat pumps of between 1.2 kW and 2.6 kW) would show a range of 5.1 GW with four million heat pumps. If this uncertainty is added to the existing variability in industry forecasts (as the FE scenarios only consider very low levels of annual heat pump installations), the RoCS would increase to more than 10 GW.

Discussion: Bias, Variability and Impacts on Social Welfare
To close this section, we briefly compare bias and variability from sensitivities and scenarios in the context of the capacity market. Where possible, bias should clearly be eliminated: if we consistently over- or under-estimate the true capacity to secure, then in the long run consumers will pay for unnecessary capacity (the Cost of New Entry in the GB capacity market is £49m per GW-yr [75]). Changes in demand-weather sensitivity lead to uncertainty that is largely irreducible, and so it is sensible to ensure this sort of risk can be communicated to the market. Whilst uncertainties around the number of heat pumps installed will likely remain, uncertainties pertaining to future heat demand curves could be diminished significantly.
Given this conclusion, system operators need to continuously review the most extreme scenarios, as well as understand estimates of possible bias in their analysis. For example, if the peaking profile P could confidently be rejected, then this would deliver immediate benefits, as the target capacity to secure could be reduced to more modest levels. Similarly, if a system operator considers the Explicit heat model as the most accurate modelling approach, then using this model would eliminate Bias (as well as any potential for perceived bias), thereby providing both accurate results and modelling transparency.

Conclusions
This paper considered the impacts of electrification of heat on system adequacy, with a particular focus on the changes to the Additional Capacity to Secure driven by both scenario and meteorological uncertainties. The addition of one million heat pumps annually to the GB system is considered as a case study, which has a very large impact on the capacity required to reach system security standards.
In particular, it is shown that demand-weather sensitivities could increase by 50% for τ − 4 delivery, with similar increase in the Range of Capacity to Secure required when net demand is conditioned on individual weather years. The proposed Explicit heat demand model has advantages in terms of transparency, with the heterogeneous differences between space heating and electrical demand growth clearly demarcated. By utilising Lasso-Regularized linear regression methods, a wide range of weather-based covariates can be studied without the problems of over-fitting that occur using the standard Least-Squares approach.
It is shown that the legacy approach of linearly scaling load duration curves, which fails to disaggregate demand into heat and electrical components, leads to a bias of 0.79 GW in capacity to secure when compared to results determined using the Explicit heat model. Over the course of 10 years, it is demonstrated that this could result in an over-spend of more than £100m. Uncertainty due to heat pump demand could be greater than that across all scenarios that NGESO currently considers, although further research to narrow this uncertainty could prove fruitful. Reducing this uncertainty, in turn, would lead to benefits in terms of reducing wasteful over-procurement of peaking capacity, although this needs to be done conservatively to ameliorate dangers of costly lost load.
There are many ways in which the electrification of heating demand can impact energy system security. Amongst others, interactions with large-scale energy storage introduce time coupling which adds complexities (whether from standalone electrical, thermal, or electric vehicle storage). Increased interconnection between regions which each have high levels of space heating could further increase challenges around coincidence of peaks between markets which had previously been considered largely independent. It is concluded that modelling rapid electrification of heat in such interconnected systems will require multi-region, time-sequential modelling for the accurate determination of capacity requirements.
Physical Sciences Research Council through grant no. EP/S00078X/1 (Supergen Energy Networks hub 2018). H. Bloomfield and D. Greenwood are funded by the Supergen Energy Network's CLEARHEADS flex project. S. Sheehy is funded by an EPSRC studentship.