Model-based analysis to identify the impact of factors affecting electricity gaps during COVID-19: A case study in Germany

The recent COVID-19 pandemic has precipitated drastic changes in economic and lifestyle conditions, significantly altering residual electricity demand behavior. This alteration has widened the demand gap between actual electricity usage and the usage forecasted from pre-pandemic data, highlighting a critical global issue. Many studies during the pandemic have explored the features of this widening gap, which is affected by major social events such as the rapid spread of the virus and lockdowns. However, the influence of factors like economic shifts and lifestyle changes on this demand remains largely unexplored, primarily due to the pandemic's significant effects in these areas. Understanding the essential factors affecting the demand gap is crucial for stakeholders in the electricity sector to develop effective strategies. This study examines the hourly electricity consumption and related factors during the pandemic. We present a method combining time-series forecasting and sparse modeling to identify the critical factors affecting the electricity demand gap during the pandemic. Using this method, we identify the variables that underwent significant changes during the pandemic and evaluate their effects on the electricity demand gap. The effectiveness of the method is demonstrated by applying it to a dataset collected in Germany.

After the COVID-19 outbreak, economic changes and shifts in consumer habits have impacted electricity consumption patterns [1,2]. Moreover, recent developments in electricity power systems, such as the widespread adoption of household-scale power sources like solar energy and increased interest in energy efficiency, have intensified these effects on residual electricity demand. This term indicates the difference between overall national electricity usage and the portion derived from renewable sources [3]. Addressing these demand fluctuations presents a global challenge; discrepancies between actual demand during the pandemic (COVID-19 demand scenario) and projections based on pre-pandemic data (non-COVID-19 demand scenario) have introduced uncertainties into decision-making processes for electricity utilities, impacting areas like reserve planning and facility design. These shifts in demand patterns could affect the energy economy and progress towards global greenhouse gas reduction targets [4]. Therefore, stakeholders in the electricity sector must grasp the factors influencing these demand gaps to formulate effective strategies [5].

Related works
Historically, many studies have investigated electricity demand, suggesting various approaches to analyzing the statistical connections among variables that depict consumption patterns [6][7][8]. Early studies emphasized a limited set of variables determined by practical experience. For instance, Hor et al. [6] analyzed the correlation between monthly electricity demand and elements like weather and gross domestic product (GDP). With the evolution of the electricity system, demand may now be influenced by additional factors such as energy-saving practices and the costs of essential goods [9,10].
Lockdown measures and social distancing policies during the COVID-19 pandemic led to substantial decreases in electricity demand. In China, Huang et al. [1] found that electricity demand varied by 12 % during COVID-19 compared to regular times, correlating with lockdown stringency. Similarly, Chen et al. [11] investigated how electricity demand correlates with consumer mobility data, affirming that variations in lockdown severity drove the substantial discrepancies between the COVID-19 and non-COVID-19 demand scenarios. Such analyses, which focus on lockdown severity data, provide a rapid means to assess the pandemic's impact on electricity demand. Additionally, the impact of the demand gap during the pandemic may differ depending on the types of commercial, industrial, and residential activities within the specific area [12]. The pandemic's impact on particular sectors may lead to significant, unforeseen fluctuations in electricity demand. The impact of COVID-19 on electricity demand from individual sectors, such as commercial, industrial, and residential activities, remains largely uncharted in academic research.

Contributions
This study aims to employ a statistical model-based approach to analyze the effects of commercial, industrial, and residential activities on the electricity demand gap between the COVID-19 and non-COVID-19 demand scenarios. Conventional methods typically involve aligning time-series data on electricity demand gaps with various variables to identify correlations visually. These methods facilitate an intuitive understanding of data relationships. In contrast, developing a statistical regression model that identifies key variables from a dataset enables a quantitative analysis of how variations in each variable specifically impact the electricity demand gap. This approach offers a more comprehensive assessment than simple data alignment, providing deeper insights into the dynamics influencing electricity consumption during the pandemic.
The research questions (RQ) for this study are outlined as follows:
RQ1. What are the significant commercial, industrial, and residential changes during the COVID-19 pandemic?
RQ2. Which key activities significantly impact electricity demand?
RQ3. How do these key activities quantitatively affect the gap in electricity demand between the COVID-19 and non-COVID-19 demand scenarios?
This study focuses on the hourly residual electricity demand, defined as the total electricity demand minus the supply from renewable sources. We consider a broad array of potential explanatory variables, including aspects of power systems, economic conditions, and consumer interests, which previous studies have not sufficiently addressed. We aim to identify the key factors defining the hourly electricity gap induced during the COVID-19 crisis. We propose a data-driven methodology utilizing autoregressive integrated moving average with an exogenous variable (ARIMAX) [13] and enumerated sparse partially linear additive models (enumerated sparse PLAMs) [10] to explore these questions. ARIMAX is commonly used for forecasting time-dependent data by incorporating external or exogenous variables and is employed here to simulate scenarios under non-COVID-19 conditions. The enumerated sparse PLAM approach is crucial for pinpointing a limited number of key variables that define the situation-dependent hourly electricity demand, enabling us to analyze changes in the informative variables effectively. This study uncovers how key factors affect the electricity demand gap amid the pandemic, offering valuable insights for electricity sector stakeholders.
3. We implement enumerated sparse PLAMs for demand modeling and determine key factors influencing demand gaps.
4. We propose a methodology to quantify the impact of each key factor by comparing the actual COVID-19 demand scenario with a hypothetical non-COVID-19 scenario derived from the constructed PLAMs.
5. Utilizing this framework, we analyze a real-world dataset to explore the key factors that impact hourly electricity consumption.
[Table 1, excerpt] [21]: weekly average demand; lockdown during COVID-19 pandemic; Germany, U.K.

Organization of the paper
This paper is organized as follows: Section 2 reviews the electricity demand gap analysis and outlines the features of the specific electricity demand and several explanatory factors. Section 3 outlines the proposed methodology for identifying essential factors affecting pandemic-era demand and their impact on the electricity demand gap. Section 4 examines the outcomes of using the proposed framework on a dataset from Germany, a country heavily affected by the pandemic [14], and offers information on how the identified variables affect the electricity demand gap. Section 5 concludes the findings of this study.

Literature review
Table 1 summarizes previous studies on the behavior of electricity demand. Traditionally, analyses have predominantly relied on a small number of factors like weather and economic conditions to understand how they impact the target demand. However, where power demand exhibits complex variations, analyses have incorporated a broader array of variables. For instance, Kaneko et al. [10] examined the hourly electricity demand in Japan, identifying critical variables that influence changes in demand by analyzing numerous factors, including weather, interest rates, stock prices, calendar effects, and GDP data. Since the onset of COVID-19, several studies have examined the pandemic's impact on electricity demand changes, primarily emphasizing the effects of lockdown scale and infection rates. Ceylan et al. [15] investigated the daily electricity consumption patterns and analyzed how lockdown measures affected demand changes. Similarly, Chen et al. [11] developed a method to model daily demand, incorporating mobility data to indicate economic activity. These studies suggest that the pandemic has significantly impacted the electricity demand gap between the COVID-19 scenario and the non-COVID-19 scenario, with these effects varying based on factors such as population densities, economic development, political orientations, and COVID-19 management strategies. Moreover, Alavi et al. [12] noted that the demand gap tends to be more pronounced in commercial and industrial areas compared to residential areas. However, prior related research has not thoroughly explored the relationships between specific business categories and the demand gap.

Empirical studies of behavior of electricity demand in Germany
Various factors, such as weather, renewable energy capacity, economy, and conservation habits, can affect electricity demand (see Table 2 for details). This study specifically focuses on the hourly electricity demand in Germany [24], a region that has experienced significant economic and social repercussions due to the pandemic, which has also profoundly impacted the gap in electricity demand. Notably, the relationships between the variables are not consistently linear and may change based on seasonal conditions. Therefore, our study focuses on the linear/nonlinear behavior of the hourly electricity consumption curve.
Since the onset of the COVID-19 pandemic in 2020, political interventions such as lockdowns and shutdowns have drastically altered economic conditions, pricing structures, and electricity consumption behaviors, leading to considerable fluctuations in electricity demand. Fig. 2 shows the monthly and yearly electricity demand from 2015 to 2021. The annual electricity demand in Germany has been slowly decreasing, possibly because of the rise in small household-scale power sources like rooftop solar installations. Despite a decrease in average electricity demand in 2020, there was a noticeable increase in 2021 as the pandemic's immediate impacts began to wane.
Fig. 3(a) and (b) show the difference between the electricity demand in 2020 and 2021 and the electricity demand in 2019. The figures illustrate the variations in electricity demand compared to 2019 under different circumstances: in 2020, there was a general decrease in electricity demand compared to 2019, although there were intervals where demand exceeded that of the previous year. Conversely, in 2021, there tended to be extended periods where electricity demand exceeded the levels recorded in 2019.
Fig. 4 shows the monthly average electricity demand changes from 2015 to 2021, while Fig. 5 compares demand variations in 2020 and 2021 with 2019. These figures suggest that, in April 2020, electricity demand decreased significantly by 27.8 %, followed by a strong recovery in March 2021 with a 21.9 % surge compared to 2019. This analysis underscores the dynamic nature of electricity demand under varying economic and social conditions.
The average daily demand patterns were analyzed for different months from 2015 to 2016, shown in Fig. 6(a)-(d). These patterns vary in timing and intensity across seasons. In Germany, electricity usage typically decreases during the day, especially in 2020 and 2021, reflecting pandemic-related changes affected by seasons and specific time slots.
Fig. 7 presents the long-term trends of several critical variables influencing electricity demand from 2015 to 2021. For instance, certain variables, such as the production index of the coal mining industry [25] shown in Fig. 7(a), remained stable over the period. However, the trends of several variables shifted during the pandemic. The behavior of the German stock index [26], as shown in Fig. 7(b), reflects the domestic economic conditions, which deteriorated significantly during the pandemic. The filled ranges in these figures represent the expected variations for 2020-2021, using data trends up to 2019. The data show notable shifts in the behavior of these variables during the pandemic, when restrictions and lockdowns substantially affected economic conditions and consumer habits.
[Table 2, excerpt] DAX [26]: v18 (daily); indices of production in industry [25]: v19-47 (monthly); indices of production in service [38]: v48-70; indices of production in construction [39]: v71-72; GDP [40]: v73; producer price index [41]

Theoretical background of analysis of electricity demand
Many researchers and operators have tried to develop data-driven models for studying electricity demand. Most existing studies have relied on a narrow range of variables based on experts' empirical knowledge. They concentrate on factors affecting demand, like economic indicators such as GDP and national holidays [6,7]. Additional factors like lockdown severity have also been incorporated post-pandemic, as shown in Table 2. However, given the complexity of power systems, more comprehensive analyses that consider a wider range of variables (including demographic, climatic, commercial, industrial, and residential factors) have been discussed [5,10]. These studies indicate that the main factors affecting temporal variations in electricity demand may vary depending on the particular situation and target dynamics. High numbers of explanatory variables lead to unstable estimation results due to multicollinearity [27]. Some variables, including redundancies, inadequately capture changes in electricity demand. The concept of sparse modeling in machine learning focuses on choosing crucial variables and pinpointing those with low relevance to the target value [28]. This method has successfully solved various complex, real-world, ill-defined problems. We can understand how factors impact different situations by constructing models using data from particular seasons, pinpointing critical explanatory factors, and comparing them with models from other conditions.
Moreover, while previous studies like Dai et al. [23] have assumed simple linear relationships between the demand gap and influencing factors, utilizing statistical models based on these assumptions to analyze the main impacts of each factor on demand, recent research has increasingly highlighted the importance of considering nonlinear relationships. For instance, Huang et al. [18] presented a method using nonparametric and nonlinear modeling to enhance the description of target demand. However, such nonlinear approaches, which consider the complex effects of interactions between variables, often make it challenging to isolate the main effects of individual variables on the target value. In this context, PLAM effectively separates linear and nonlinear main impacts of explanatory variables [29], excluding complex interactions.
In the next section, we introduce a scheme utilizing sparse PLAMs. This approach is highly promising for describing the behavior of electricity demand and identifying key variables that influence the target demand in both linear and nonlinear ways.

Overview of the analysis of the gap in electricity demand
We focus on the hourly residual electricity demand and propose an approach to select key variables that describe this demand, as well as to identify the additive contribution of each variable to the electricity demand gap between the COVID-19 demand scenario and a non-COVID-19 scenario, which is hypothetically estimated using pre-pandemic data. Fig. 8 shows an overview of the proposed approach.
We employed a dataset spanning from 2015 to 2021, which includes actual residual demand and hundreds of potential explanatory variables. These variables are grouped into categories such as weather, stock prices, commodity prices, and calendar data (see Table A1 in the Appendix for more details). In Step 1, we used the enumerated sparse PLAM to select key variables for explaining electricity demand. This model is particularly effective for identifying, from numerous candidates, a limited number of critical variables that elucidate the situation-dependent behavior of annual demands. Additionally, it determines whether the plausible relationships between the selected variables and the target demand are linear or nonlinear [10]. In Step 2, we employ the ARIMAX model to estimate the hypothetical non-COVID-19 variable and demand scenarios using pre-pandemic data. The ARIMAX model is an extension of the ARIMA model, widely recognized for its efficacy in describing relationships between target demand and historical data. ARIMAX enhances this approach by incorporating exogenous explanatory variables, allowing for a more comprehensive modeling of demand behavior under different scenarios [30][31][32]. In Step 3, we estimate the gap in electricity demand between the COVID-19 scenario and the non-COVID-19 scenario. During the pandemic, several variables exhibited unexpected and significant changes, while others varied in predictable ways in response to economic fluctuations. In this study, we propose an approach to identify the key variables that deviate significantly during the pandemic based on estimated confidence intervals around the non-COVID-19 variable scenarios and to determine the additive contributions of these variables to the deviance-oriented gap. Taken together, this procedure analyzes the impact of the critical variables related to the changes in electricity demand during the pandemic.

Situation-dependent modeling based on partially linear additive models
In Step 1, we developed a situation-dependent model to describe the hourly electricity demand. Initially, we targeted specific demands and corresponding explanatory variables for each period to create models that accurately represent the behavior of hourly demand during these periods. This approach essentially illustrates how hourly electricity demand is influenced by various time-related factors.
Let $S = \{1, \ldots, J\}$ be the index set of explanatory variables, and let $\{(x_{ymdh}, l_{ymdh})\}$ be a set of pairs in which $l_{ymdh}$ is the observed electricity demand at hour $h$ on day $d$ in month $m$ of year $y$, and $x_{ymdh} = (x^1_{ymdh}, \ldots, x^J_{ymdh})$ is a vector of $J$ variables observed at the corresponding time. The daily, monthly, and yearly variables were interpolated using the nearest-neighbor method. Moreover, we let $L \subseteq S$ and $N \subseteq S$ be the index subsets of $S$ that indicate the variables with linear and nonlinear relationships to the target demand, respectively.
The PLAM, defined in Eq. (1), describes the target demand using variables with both linear and nonlinear relationships. In Eq. (1), $\theta$ denotes the set of model coefficient parameters, $\beta$ denotes the set of coefficient parameters for the explanatory variables $x$, and $\tau_j$ represents the vector of coefficient parameters for the transformation functions $\varphi$. We employ cubic spline transformation functions with $K$ bases, given in Eqs. (2) and (3) and known for their effectiveness with the PLAM in prior research [10]; the knots of the spline are selected from quantiles in the sample set.
The sparse modeling technique systematically selects informative variables and determines whether each has a linear or nonlinear relationship with the target demand. For a specific month $m$ and hour $h$, the parameters are estimated by minimizing the penalized squared-error loss $F_{mh}$ under the given subsets $L$ and $N$, defined in Eq. (4), where $\lambda$ denotes a positive regularization constant for the penalty and $\lVert \tau \rVert_2$ represents the L2-norm of the vector $\tau$. The parameter $\lambda$ influences the modeling outcomes and is determined based on a one-day walk-forward validation [33]. The components of the minimizer $\hat{\theta}$ of Eq. (4) are driven toward zero by the absolute-value and L2-norm penalties while the squared-error loss is reduced. As a result, the coefficients of redundant variables approach zero, making it easier to select a small subset of informative explanatory variables [29]. In Eq. (4), the regularization process identifies essential variables based on seasonal conditions and determines their linear or nonlinear relationships as follows:
• $\beta_j \neq 0$, $\tau_{j,k} = 0$ ($\forall k$): Variable $x^j$ exhibits a linear relationship with demand.
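Since Eqs. (1) and (4) are not reproduced in this text, the following LaTeX block gives one plausible reading of them, pieced together from the symbols defined above. It is a sketch under stated assumptions, not the authors' exact formulation: the intercept $\beta_0$, the basis index $k$, and the equal weighting of the absolute-value and group L2 penalties are all assumed.

```latex
% Assumed form of the PLAM in Eq. (1):
% intercept + linear terms + spline-expanded (nonlinear) terms
\hat{l}_{ymdh} = \beta_0
  + \sum_{j \in L} \beta_j \, x^{j}_{ymdh}
  + \sum_{j \in N} \sum_{k=1}^{K} \tau_{j,k} \, \varphi_k\!\left( x^{j}_{ymdh} \right)

% Assumed form of the penalized squared-error loss in Eq. (4)
% for one situation (m, h), summed over the years y and days d in that month
F_{mh}(\theta) = \sum_{y,d} \left( l_{ymdh} - \hat{l}_{ymdh} \right)^2
  + \lambda \sum_{j} \left( |\beta_j| + \lVert \tau_j \rVert_2 \right)
```

Under this reading, the absolute-value term can set an individual $\beta_j$ to zero, while the group L2 term can set a whole vector $\tau_j$ to zero at once, which is what allows a variable to be kept as purely linear.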
To enhance the interpretability of variables that significantly influence electricity demand on an annual basis, the enumerated sparse modeling technique [34] is utilized. This technique mechanically selects a limited number of variables that are consistently relevant across the various situational models. The enumeration scheme builds models by choosing vital variables and assessing their linearity or nonlinearity. In this scheme, plausible candidate models are enumerated for each specific situation $(m, h)$, and a representative model is chosen from these candidates to minimize the overlap of explanatory variables commonly used in models tailored for individual situations. Let $(S^{(1)}_{mh}, F^{(1)}_{mh})$ be the pair containing the index set of variables selected by the sparse PLAM as expressed in Eq. (1) and the squared-error loss under that variable set. The enumerated variable sets $S^{(i)}_{mh}$ contain almost identical information to describe electricity demand, and the corresponding squared-error losses $F^{(i)}_{mh}$ exhibit only minor differences. Therefore, the squared-error loss of every enumerated model satisfies the requirement in Eq. (5), where the positive parameter $\varepsilon$ regulates the suboptimality of the enumerated results. The situation-dependent key variables are then selected based on the criterion in Eq. (6).
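The role of the suboptimality parameter can be illustrated with a small Python sketch. Eq. (5) is not reproduced in this text, so the criterion used here (keep every candidate whose loss is within a relative tolerance of the best loss) is an assumption, and the function name and candidate values are hypothetical.

```python
def enumerate_near_optimal(candidates, eps=0.005):
    """Keep candidate models whose loss is close to the best loss.

    candidates: list of (variable_index_set, squared_error_loss) pairs.
    Assumed criterion (not taken from the paper): loss <= (1 + eps) * best_loss.
    """
    if not candidates:
        return []
    best_loss = min(loss for _, loss in candidates)
    threshold = (1.0 + eps) * best_loss
    return [(frozenset(variables), loss)
            for variables, loss in candidates
            if loss <= threshold]


# Hypothetical candidates for one situation (m, h): (selected variables, loss).
candidates = [({1, 7, 50}, 100.0), ({1, 7, 54}, 100.4), ({1, 23}, 103.0)]
kept = enumerate_near_optimal(candidates, eps=0.005)
```

A smaller tolerance keeps fewer, nearly equivalent models; a larger one admits more alternatives from which a representative model can be chosen.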
These statistical models are created to explain electricity demand variations in different seasons, helping identify essential variables. The variables selected for each seasonal situation are collected in the index set $\hat{S}_{mh}$.
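To make the structure of such a situation-dependent model concrete, here is a minimal Python sketch of how a fitted PLAM could be evaluated. It assumes a truncated-power cubic basis with knots at sample quantiles; the paper's exact spline definition is not reproduced in this text, so the basis, all function names, and the "nonlinear" and "unselected" labels are illustrative assumptions.

```python
def quantile_knots(samples, n_knots):
    """Pick knots at evenly spaced interior quantiles of the sample values."""
    ordered = sorted(samples)
    knots = []
    for k in range(1, n_knots + 1):
        pos = k / (n_knots + 1) * (len(ordered) - 1)
        lo = int(pos)
        hi = min(lo + 1, len(ordered) - 1)
        knots.append(ordered[lo] + (pos - lo) * (ordered[hi] - ordered[lo]))
    return knots


def cubic_basis(x, knots):
    """Assumed truncated-power cubic basis: max(x - knot, 0) ** 3 per knot."""
    return [max(x - c, 0.0) ** 3 for c in knots]


def plam_predict(x, beta0, beta, tau, knots):
    """Evaluate intercept + linear terms + spline terms for one observation.

    x: dict j -> value; beta: dict j -> linear coefficient;
    tau: dict j -> list of spline coefficients; knots: dict j -> list of knots.
    """
    y = beta0
    for j, b in beta.items():
        y += b * x[j]
    for j, coefs in tau.items():
        basis = cubic_basis(x[j], knots[j])
        y += sum(t * p for t, p in zip(coefs, basis))
    return y


def relationship(beta_j, tau_j, tol=1e-12):
    """Classify a variable from its fitted coefficients (labels are assumed)."""
    if any(abs(t) > tol for t in tau_j):
        return "nonlinear"
    return "linear" if abs(beta_j) > tol else "unselected"


# Hypothetical two-variable model: variable 1 linear, variable 2 nonlinear.
knots_2 = quantile_knots([float(v) for v in range(11)], n_knots=1)
pred = plam_predict({1: 2.0, 2: 7.0}, beta0=10.0, beta={1: 3.0},
                    tau={2: [0.5]}, knots={2: knots_2})
```

Because the model is additive, each variable's term can be read off separately, which is the property the later contribution analysis relies on.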

Approach for estimating the hypothetical non-COVID-19 variable scenarios
For each variable that exhibited fluctuations during the pandemic, a non-COVID-19 scenario is estimated in Step 2, and deviations from this projected scenario are subsequently analyzed. Fig. 9 provides an overview of how these variable scenarios are estimated. Specifically, using ARIMAX models, we forecast long-term variable scenarios from pre-pandemic data; each scenario includes a predicted variable and its confidence interval, as shown in Fig. 9(a). We then focus on the discrepancies between these scenarios and the observed variables to determine which variables diverged significantly from their estimated scenarios during the pandemic, as illustrated in Fig. 9(b). The variables selected for scenario forecasting, which were clearly impacted by the pandemic, are monitored daily, monthly, or annually.
Focusing on the daily variable $x^j_d$ ($d \in D$), the ARIMAX$(P, O, Q)$ model describes the target variable using historical data and exogenous variables that reflect the month and day of the week, as defined in Eqs. (7)-(9). Here, $h_a(d)$ and $h_b(d)$ represent functions that generate dummy variables for the month $A$ = (January, …, December) and the day of the week $B$ = (Monday, …, Sunday), respectively, within the target period, as given in Eqs. (10) and (11). The parameters $\kappa$, $\mu$, $\nu$, and $\xi$ are estimated by maximum likelihood estimation (MLE) in Eq. (12), which is conceptually similar to least-squares estimation [35]. In Eq. (12), $\sigma_d$ is derived from the standard deviation of the residuals, which are the differences between the observed values and the fitted values from Eq. (7) using the estimated parameters, as expressed in Eq. (13). The predicted value $\hat{x}^j_d$ ($d > D$) serves as the non-COVID-19 variable scenario during the pandemic, and the 95 % confidence interval is determined by the bounds $\hat{x}^j_{d,\mathrm{upper}}$ and $\hat{x}^j_{d,\mathrm{lower}}$ [36], which are derived in Eqs. (14)-(16). The deviating variables are identified by comparing the actual variable $x^j_d$ with the bounds of the confidence interval.
Similarly, the ARIMAX$(P, O, Q)$ model for the monthly variable $x^j_m$ ($m \in M$) is defined in Eqs. (17) and (18), and the scenarios during the pandemic are estimated from $\hat{x}^j_m$ ($m > M$), as derived in Eqs. (19)-(21). The ARIMAX$(P, O, Q)$ model for the yearly variable $x^j_y$ ($y \in Y$) is defined in Eqs. (22) and (23), and the scenarios during the pandemic are estimated from $\hat{x}^j_y$ ($y > Y$), as expressed in Eqs. (24)-(26). In both cases, the estimates $\hat{\kappa}$, $\hat{\mu}$, $\hat{\nu}$, and $\hat{\xi}$ are obtained as the minimizers ($\arg\min$ over $\kappa$, $\mu$, $\nu$, $\xi$) of the corresponding estimation objective.
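The deviation check in Step 2 can be sketched in Python as follows. This is a simplified illustration, not the authors' procedure: it assumes a symmetric interval of the form forecast ± 1.96 σ with a constant residual standard deviation, whereas a real ARIMAX forecast interval typically widens with the horizon. The ARIMAX fit itself is not shown, and all function names and numbers are hypothetical.

```python
import math


def calendar_dummies(month, weekday):
    """One-hot indicators for month (1-12) and weekday (0 = Monday .. 6 = Sunday)."""
    month_vec = [1 if m == month else 0 for m in range(1, 13)]
    weekday_vec = [1 if w == weekday else 0 for w in range(7)]
    return month_vec, weekday_vec


def residual_std(observed, fitted):
    """Standard deviation of the in-sample residuals (population form)."""
    residuals = [o - f for o, f in zip(observed, fitted)]
    mean = sum(residuals) / len(residuals)
    return math.sqrt(sum((r - mean) ** 2 for r in residuals) / len(residuals))


def flag_deviations(actual, forecast, sigma, z=1.96):
    """Return True wherever the actual value leaves forecast +/- z * sigma."""
    flags = []
    for a, f in zip(actual, forecast):
        lower, upper = f - z * sigma, f + z * sigma
        flags.append(a < lower or a > upper)
    return flags


# Hypothetical pre-pandemic fit and pandemic-period comparison.
sigma = residual_std(observed=[10.0, 12.0, 11.0, 13.0],
                     fitted=[11.0, 11.0, 12.0, 12.0])
flags = flag_deviations(actual=[12.5, 9.0, 14.5],
                        forecast=[12.0, 12.0, 12.0], sigma=sigma)
month_vec, weekday_vec = calendar_dummies(month=4, weekday=2)
```

A variable would be treated as deviating in the periods where its flag is set, and as following the expected scenario otherwise.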

Evaluation of additive contribution to demand gap
In Step 3, we analyze the demand gap between the COVID-19 and non-COVID-19 demand scenarios using the PLAMs constructed in Step 1. Fig. 11 shows an overview of this analysis. The demand gap can be classified into two components: the deviance-oriented gap, caused by significant changes in variables from their non-COVID-19 behavior, and the expected gap, resulting from variables acting according to typical non-COVID-19 scenarios. In Fig. 11, variable $x^J$ deviates from its non-COVID-19 variable scenario in the target period, leading to a deviance-oriented gap.
Here, we calculate the additive contributions of each variable over different seasonal periods, considering their significant fluctuations throughout the target period. Let $\mathrm{Gap}_{ymdh}$ be the electricity demand gap at hour $h$ on day $d$ in month $m$ of year $y$. The non-COVID-19 demand scenario, the COVID-19 demand scenario, and the demand gap between them are expressed using the model constructed in Step 1 (Eq. (1)), as detailed in Eqs. (27)-(29), where $\hat{S}_{mh} \subseteq S$ represents the index subset selected by the enumerated sparse PLAM for each seasonal situation in Step 1. The additive contribution of variable $j$, $A^j_{ymdh}$, is then given in Eq. (30).
The pandemic-induced demand gap is categorized into two types: the deviance-oriented gap, where variables behave abnormally compared to non-COVID-19 conditions, and the expected gap, where variables fluctuate within expected ranges. These gaps are quantified based on the additive contributions of the variables, as detailed in Eqs. (31)-(35), where $\hat{S}^{\mathrm{deviance}}_{mh}$ represents the indices of key variables that deviate from their respective scenarios during each seasonal period, and the remaining key variables are those that vary within the expected scenario range. These variables are identified by analyzing the differences between the non-COVID-19 variable scenario and the actual variable values, as outlined in Step 2.
Fig. 11. Analysis of the impacts of key variables on the electricity demand gap.
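The decomposition can be sketched in Python. Since Eqs. (27)-(35) are not reproduced in this text, the sketch assumes that each contribution is the difference between a variable's fitted additive term evaluated at its actual value and at its non-COVID-19 scenario value, and that the gap is taken as COVID-19 demand minus non-COVID-19 demand. The component functions and all values are hypothetical; only the variable labels are borrowed from the paper.

```python
def additive_contributions(component_fns, x_covid, x_scenario):
    """Per-variable contribution: g_j(actual value) - g_j(scenario value)."""
    return {j: g(x_covid[j]) - g(x_scenario[j])
            for j, g in component_fns.items()}


def split_gap(contributions, deviating):
    """Split the total gap into deviance-oriented and expected parts."""
    deviance = sum(a for j, a in contributions.items() if j in deviating)
    expected = sum(a for j, a in contributions.items() if j not in deviating)
    return deviance + expected, deviance, expected


# Hypothetical fitted additive terms (two linear, one nonlinear).
component_fns = {
    "v50": lambda x: 2.0 * x,
    "v23": lambda x: -1.5 * x,
    "v147": lambda x: 0.5 * max(x - 1.0, 0.0) ** 3,
}
x_covid = {"v50": 80.0, "v23": 100.0, "v147": 3.0}      # observed values
x_scenario = {"v50": 100.0, "v23": 102.0, "v147": 2.0}  # scenario values

contrib = additive_contributions(component_fns, x_covid, x_scenario)
total, deviance, expected = split_gap(contrib, deviating={"v50"})
```

Because the model is additive, the intercept cancels in the difference and the per-variable contributions sum exactly to the total gap.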

Simulation setup
The analysis of the pandemic's demand gap in Germany utilizes actual hourly electricity usage data from January 2015 to December 2021. This analysis includes 179 explanatory variables related to categories such as weather, interest rates, stock prices, calendars, and GDP data, as detailed in Table 2. To construct the sparse PLAMs, we have to tune the parameter K in Eq. (2) for the nonlinear transformation and the parameter λ in Eq. (4) for the sparse scheme. The parameters were determined through a one-day walk-forward validation [33] spanning from January 2018 to December 2021. During this period, statistical models were reconstructed daily and evaluated based on the forecasts of the subsequent day. In the enumeration scheme of Step 1, we utilized the parameter ε = 0.005, which controlled the number of enumerated models. This setup supports a consistent identification and analysis of the variables impacting the electricity demand gap during the pandemic.
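The one-day walk-forward validation can be sketched in Python as below. The expanding training window and the toy shrunken-mean model standing in for the penalized PLAM are assumptions made for illustration; the paper does not state the window type in this text, and all function names are hypothetical.

```python
def walk_forward_splits(n_days, first_test_day):
    """Yield (train_days, test_day) pairs with an expanding training window."""
    for test_day in range(first_test_day, n_days):
        yield range(0, test_day), test_day


def walk_forward_score(series, fit, predict, first_test_day):
    """Mean squared one-day-ahead error over all walk-forward steps."""
    errors = []
    for train_days, test_day in walk_forward_splits(len(series), first_test_day):
        model = fit([series[d] for d in train_days])  # refit every day
        errors.append((series[test_day] - predict(model)) ** 2)
    return sum(errors) / len(errors)


def select_lambda(series, lambdas, first_test_day):
    """Pick the penalty with the lowest walk-forward error."""
    def make_fit(lam):
        # Toy stand-in for a penalized model: a mean shrunk toward zero.
        return lambda train: sum(train) / (len(train) + lam)

    scores = {lam: walk_forward_score(series, make_fit(lam),
                                      lambda model: model, first_test_day)
              for lam in lambdas}
    return min(scores, key=scores.get), scores


# Hypothetical daily series; each step trains on all earlier days.
best_lam, scores = select_lambda([10.0] * 6, lambdas=[0.0, 1.0, 5.0],
                                 first_test_day=3)
splits = [(list(train), test) for train, test in walk_forward_splits(5, 3)]
```

The same loop applies to any tuning parameter: each candidate value is scored only on days that were not available when the model was fitted.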
We examined four distinct models outlined in Table 3 to analyze electricity demand. The Linear Model (LM) uses a limited set of explanatory variables (such as weather conditions, the German stock index, and holiday/weekday data), which are commonly employed in traditional studies [6,7], and assumes a linear relationship among these variables. Similarly, Additive Model 1 (AM1) also utilizes a limited number of variables but introduces considerations for nonlinearity among them. Conversely, Additive Model 2 (AM2) incorporates a broader array of variables (179 variables) and is used in more recent studies [5,10] to model nonlinearity among these variables. Furthermore, the PLAM engages many variables to implement the procedure outlined in Section 3 and identifies a small set of annually dominant variables. The evaluation of these models was conducted based on their descriptive accuracy.

Description accuracy of constructed models
Initially, we developed statistical models of hourly electricity demand using both naive and advanced approaches listed in Table 3, based on the dataset. The performance of the constructed PLAMs in capturing the behavior of electricity demand is substantiated by the average root-mean-squared error (RMSE) and Nemenyi tests. Table 4 presents the RMSE values used to compare the descriptive accuracy during the training period. The RMSE is defined in Eq. (36) as follows:
$$\mathrm{RMSE} = \sqrt{\frac{1}{T} \sum_{t=1}^{T} \left( l_t - \hat{l}_t \right)^2} \quad (36)$$
where T denotes the number of data samples, and $l_t$ and $\hat{l}_t$ denote the observed demand and the demand described by the model for sample t, respectively. Fig. 12 presents the outcomes of the post-hoc Nemenyi test [43], commonly employed to evaluate significant differences in descriptive ranks based on their RMSE values. This analysis assigns the model with the lowest RMSE the highest rank. The Nemenyi test compares the average ranks across multiple models and identifies significant performance disparities among them. The results reveal that models AM1 and AM2 exhibited lower descriptive accuracy than model LM. This implies that enhancing the assumptions regarding linear or nonlinear relationships between the demand and the variables, and careful selection of explanatory variables could improve descriptive accuracy. Notably, model PLAM demonstrated higher descriptive accuracy than conventional naive modeling approaches, indicating that the proposed method effectively identifies key variables and discerns the relationships of linearity or nonlinearity between them.
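The accuracy comparison can be sketched in Python as follows: an RMSE function, average ranks across evaluation cases (rank 1 for the lowest RMSE), and the critical difference used by the Nemenyi test. The model names and RMSE values below are hypothetical and are not the results in Table 4; the critical value is passed in as a parameter rather than derived.

```python
import math


def rmse(observed, described):
    """Root-mean-squared error between observed and model-described demand."""
    squared = [(o - p) ** 2 for o, p in zip(observed, described)]
    return math.sqrt(sum(squared) / len(squared))


def average_ranks(scores_by_case):
    """Average rank per model; rank 1 = lowest RMSE, ties share the mean rank."""
    models = list(scores_by_case[0])
    totals = {m: 0.0 for m in models}
    for scores in scores_by_case:
        ordered = sorted(models, key=lambda m: scores[m])
        i = 0
        while i < len(ordered):
            j = i
            while j + 1 < len(ordered) and scores[ordered[j + 1]] == scores[ordered[i]]:
                j += 1
            shared = ((i + 1) + (j + 1)) / 2.0
            for m in ordered[i:j + 1]:
                totals[m] += shared
            i = j + 1
    return {m: totals[m] / len(scores_by_case) for m in models}


def critical_difference(n_models, n_cases, q_alpha):
    """Nemenyi critical difference: q_alpha * sqrt(k (k + 1) / (6 N))."""
    return q_alpha * math.sqrt(n_models * (n_models + 1) / (6.0 * n_cases))


# Hypothetical RMSE values for four models over three evaluation cases.
cases = [
    {"A": 3.0, "B": 2.5, "C": 2.6, "D": 2.0},
    {"A": 3.2, "B": 2.4, "C": 2.7, "D": 2.1},
    {"A": 2.9, "B": 2.8, "C": 2.5, "D": 2.2},
]
ranks = average_ranks(cases)
# 2.569 is the commonly tabulated critical value for four models at the 0.05 level.
cd = critical_difference(n_models=4, n_cases=len(cases), q_alpha=2.569)
```

Two models are judged significantly different when their average ranks differ by more than the critical difference.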

Analysis of the gap caused by the pandemic
The analysis also examines the impact of the pandemic on seasonal variations. Fig. 13 displays the monthly averages of the expected and deviance-oriented gaps. In response to the COVID-19 outbreak, Germany implemented its initial nationwide lockdown in March 2020. A phased mitigation strategy followed, with the government announcing on April 15 that small retail stores would reopen on April 20, and schools would gradually begin to reopen from May 4. However, due to a resurgence in case numbers, a partial lockdown was reinstated from November 2, 2020, to March 1, 2021 [44]. Fig. 13 indicates that the deviance-oriented gap varies seasonally, showing that in April 2020, when the most substantial gap occurred, approximately 95 % of this gap resulted from the deviance-oriented demand gap, aligning with the initial lockdown measures. Further analysis suggests that despite the continuation of lockdown measures, Germany experienced a notable recovery in electricity demand from late 2020 into 2021. These findings indicate that the proposed approach effectively captures behavioral changes in the demand gap unaccounted for by movement restriction policies alone.
This section presents an analysis of hourly averages for the actual COVID-19 demand scenario alongside a hypothetical non-COVID-19 scenario, as depicted in Fig. 14(a)-(c). The results demonstrate that the proposed approach successfully reproduced the COVID-19 demand scenario through the construction of partially linear additive models (PLAMs). Additionally, the PLAMs generated hourly hypothetical non-COVID-19 scenarios, revealing fluctuations in the hourly demand gap dependent on the time of day. In 2020, a consistent trend of decreasing electricity demand was observed across all time slots. Conversely, the 2021 data show a recovery in

Discussion of selected variables
Furthermore, we examine the essential variables influencing the demand gap during the pandemic and assess the additive contributions of each variable. The situation-dependent modeling approach selected 72 out of 179 explanatory variables to characterize the annual demand behavior for each monthly hour. We highlight the variables significantly impacting the expected and deviance-oriented gaps as they vary according to the non-COVID-19 scenario during the target period. Additionally, we explore the differences in the additive contributions of these key variables to the electricity demand gap across each month and hour. Fig. 15(a) and (b) illustrate the mean additive contributions for each seasonal period of 2020 and 2021; the contributions are derived in Eq. (37). The black box highlights the variables in $\hat{S}^{\mathrm{deviance}}_{mh}$ that deviate from the non-COVID-19 scenario during the target period. These findings elucidate the variations in additive contributions across different seasonal conditions, defined by the sets of year y, month m, and hour h. The analysis identified several variables as pivotal despite their stability during the pandemic. Notably, the installed capacity of power generation (variables v7, v10, v11, v12), the manufacturing of basic pharmaceutical products and pharmaceutical preparations (v34), and the consumer price of passenger transport by road (v131) were deemed essential. Furthermore, certain variables significantly influenced the deviance-oriented gap, including the production of land transport and transport via pipelines (v50), postal and courier activities (v54), consumer prices of domestic and household services (v116), passenger transport via railways (v130), and equipment for sport, camping, and open-air recreation (v147). The general trend suggests that these variables are susceptible to fluctuations driven by lockdown measures and increased home-based activities during the pandemic, potentially leading to substantial demand gaps. These variations may also differ across seasons. For example, in March 2020, the consumer price of recording media (v141) diverged from the non-COVID-19 scenario, significantly influencing the demand gap. However, by 2021, the impact of these variables on the demand gap had diminished, suggesting a shift towards an expected scenario-dependent explanation of the gaps observed.

Table 5a
Key variables that deviated drastically from the projected scenarios.

Fig. 16 shows examples of the additive contributions of key variables for the same period covered in Fig. 15. The additive contributions are calculated based on the mean values of $A^{j}_{ymdh}$ from Eq. (30) for each year, month, and hour to explain the demand gap. As illustrated in Fig. 16(a), the manufacturing of food products (v23) and the production of land transport and transport via pipelines (v50) in January 2020 had a significant impact on the demand gap. These variables did not exhibit considerable deviations during the target period, suggesting that the demand gap in January 2020 was likely unrelated to the pandemic. Notably, COVID-19 infections in Germany commenced towards the end of January, indicating that the pandemic had minimal influence on demand during this month.
In contrast, as depicted in Fig. 16(b), the influence of variables diverging from the non-COVID-19 variable scenario during the pandemic substantially increased in February 2020 compared to January. For example, the production of legal and accounting activities (v61) had a pronounced effect on the demand gap, as this sector was severely impacted by the pandemic, leading to a downturn in business activities due to restrictions [45]. In July 2021, as portrayed in Fig. 16(c), the manufacturing of paper and paper products (v30) continued to be crucial. Additionally, the consumer price of refuse collection (v99) experienced significant deviations during daytime hours (6:00 to 17:00), highlighting the substantial impact of variable deviations during these hours in the target month. This finding suggests that the additive contributions of key variables changed markedly with the season and time of day as a result of their deviations. These observations of seasonal and temporal variation illustrate idiosyncratic changes driven largely by pandemic-induced economic conditions and consumer behaviors. To accurately predict the long-term behavior of electricity demand, it is imperative to thoroughly characterize the key variables and their behaviors under specific seasonal circumstances. To ensure reliable forecasts, these variables must be accurately modeled considering prevailing economic conditions and consumer interests.
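The averaging of per-sample additive contributions over the days within each year, month, and hour can be sketched as follows. This is a minimal illustration under assumptions: the contributions are taken to be already available as an array with one row per hourly sample and one column per variable, and the function name and array layout are hypothetical rather than taken from the study.

```python
import numpy as np

def mean_contributions(year, month, hour, contrib):
    """Average additive contributions over the days in each (year, month, hour).

    year, month, hour: integer arrays of length T (one entry per hourly sample).
    contrib: array of shape (T, J); contrib[t, j] is the additive contribution
             of variable j at sample t (hypothetical layout).
    Returns a dict mapping (y, m, h) to a length-J vector of mean contributions.
    """
    contrib = np.asarray(contrib, dtype=float)
    keys = np.stack([year, month, hour], axis=1)
    uniq, inverse = np.unique(keys, axis=0, return_inverse=True)
    inverse = np.asarray(inverse).ravel()
    # Sum the contributions of all samples that share the same (y, m, h) key.
    sums = np.zeros((len(uniq), contrib.shape[1]))
    np.add.at(sums, inverse, contrib)
    counts = np.bincount(inverse, minlength=len(uniq))
    means = sums / counts[:, None]
    return {tuple(int(v) for v in k): means[i] for i, k in enumerate(uniq)}

# Synthetic example: two days in January 2020 and one in February, all at hour 0.
year = np.array([2020, 2020, 2020])
month = np.array([1, 1, 2])
hour = np.array([0, 0, 0])
contrib = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
means = mean_contributions(year, month, hour, contrib)
print(means[(2020, 1, 0)])
```

Each resulting vector corresponds to one cell column of a heat map such as Fig. 15, with one entry per selected variable.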
Furthermore, we explore the confidence interval analysis introduced for each key variable in Section 3.3, focusing on the fluctuations in additive contributions attributed to uncertainty in the explanatory variables. We specifically concentrate on the variation in each non-COVID-19 variable scenario and examine its implications for the variability of the additive contributions to the demand gap. The variation scenarios are derived from the 0th to 100th percentiles within the confidence interval; an example of these scenarios for a specific variable is illustrated in Fig. A1 in the Appendix. For each variation scenario, we analyze the range of contributions under the variable variation using the same process for deriving the additive contribution presented in Section 3. Fig. 17 shows the range of additive contributions of each key factor. In this figure, variables marked with a blue asterisk are those whose upper and lower bounds have the same sign. This suggests that these variables consistently influence the electricity demand gap under factor uncertainty. Such results provide relatively reliable information for electricity utilities and governments when capturing electricity demand trends. Conversely, other variables exhibit erratic behavior, with their impacts fluctuating between positive and negative in response to uncertainties. Although these variables are expected to contribute to electricity demand, the direction and magnitude of their contributions may vary significantly due to their factor variability. This requires careful consideration by utilities and governments when analyzing data and forecasting demand.
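The sign-consistency check behind the asterisks in Fig. 17 can be sketched as follows. This is a minimal illustration, assuming that the additive contribution of each variable has already been evaluated under each percentile scenario; the function name, the array layout, and the numbers are hypothetical.

```python
import numpy as np

def contribution_range(contrib_by_scenario):
    """Range of additive contributions across variation scenarios.

    contrib_by_scenario: array of shape (n_scenarios, n_variables), one row
    per percentile scenario of the non-COVID-19 variable (hypothetical layout).
    Returns (lower, upper, same_sign), where same_sign[j] is True when the
    lower and upper bounds of variable j are both positive or both negative.
    """
    c = np.asarray(contrib_by_scenario, dtype=float)
    lower = c.min(axis=0)
    upper = c.max(axis=0)
    same_sign = (lower > 0) | (upper < 0)
    return lower, upper, same_sign

# Synthetic contributions for the 0th, 25th, 50th, 75th and 100th percentiles.
scenarios = np.array([
    [0.2, -0.3],
    [0.3, -0.1],
    [0.4,  0.0],
    [0.5,  0.2],
    [0.6,  0.4],
])
lower, upper, same_sign = contribution_range(scenarios)
print(same_sign)  # first variable is sign-consistent, second is not
```

A variable whose range straddles zero, like the second column here, corresponds to the erratic case described above, where the direction of the effect depends on which scenario materializes.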

Analysis of the factors affecting the gap caused by the pandemic
We analyze the trends in the additive contributions of each critical variable that influenced the deviance-oriented gap. We employ principal component analysis (PCA) [46] to extract the temporal trends of the significant patterns of contributions constituting the deviance-oriented gap. In this context, $pc(q) = \{pc^{q}_{ymdh}\}$ represents the q-th principal component, and $\Psi^{j}_{q}$ is the factor loading of variable $x^{j}$ in the q-th principal component. The $pc^{q}_{ymdh}$ values are defined in Eq. (38). Fig. 18 shows the five principal patterns of contributions that characterize the deviance-oriented gap during the target period, illustrating distinct patterns influenced by government regulations and changes in consumer behavior due to the pandemic. For instance, the patterns pc(1) and pc(2) appear strongly associated with the partial closure of public transport, while pc(2) also relates strongly to the general lockdown measures. Notably, pc(5) is prominent in the early stages of the pandemic.
Further analysis of the key factors influencing these patterns is presented in Fig. 19, which details the principal component scores of specific variables for each significant pattern; we focus on the top 10 variables with the greatest influence on each pattern. The results highlight the substantial impact of the consumer price of small tools and miscellaneous accessories (v114) and spare parts and accessories for personal transport equipment (v126) in patterns pc(1) and pc(2). This suggests that changes in demand for personal mobility resources significantly influenced the deviance-oriented gap during the partial closure of public transport. The prominence of the consumer price of equipment for sports, camping, and open-air recreation (v147) in pc(3) indicates that the demand for outdoor equipment predominantly affected the demand gap during lockdown periods. Additionally, the production of legal and accounting activities (v61) is a significant driver in pc(5), suggesting that the demand gap was responsive to variations in business and government operations during the pandemic. These findings demonstrate the efficacy of using PCA to discern diverse patterns in the deviance-oriented demand gap resulting from various pandemic-related events and to pinpoint the key factors associated with each pattern.
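The PCA step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes that the additive contributions are arranged as a matrix with one row per hourly sample and one column per variable, that the columns are mean-centered, and that each principal component value is the loading-weighted sum of the centered contributions. The loadings here are unit-norm eigenvectors, which may differ by a scaling from the factor loadings used in the study; all names are hypothetical.

```python
import numpy as np

def pca_patterns(contrib, n_components=5):
    """PCA of a (T, J) contribution matrix via singular value decomposition.

    Returns scores (T, Q), loadings (J, Q) and the explained variance ratio (Q,).
    """
    X = np.asarray(contrib, dtype=float)
    Xc = X - X.mean(axis=0)                   # center each variable
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    loadings = Vt[:n_components].T            # unit-norm directions
    scores = Xc @ loadings                    # component values per sample
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, loadings, explained[:n_components]

def top_variables(loadings, q, names, k=10):
    """Names of the k variables with the largest absolute loading on component q."""
    order = np.argsort(-np.abs(loadings[:, q]))[:k]
    return [names[i] for i in order]

# Synthetic contributions: six variables, the third with much larger variance.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
X[:, 2] *= 10.0
names = ["a", "b", "c", "d", "e", "f"]
scores, loadings, explained = pca_patterns(X, n_components=5)
print(top_variables(loadings, 0, names, k=3))
```

Plotting each column of `scores` against time would give curves analogous to the patterns in Fig. 18, and `top_variables` mirrors the top-10 selection shown in Fig. 19.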
Lastly, we focus on the production index of the equipment for sport, camping and open-air recreation (v147) and the term "solar power" used in Google Trends (v171) in Fig. 19 and analyze the relationships between the electricity demand and these key factors; Fig. 17 confirms that both variables have a consistent direction of impact on the demand gap under factor variation. Fig. 20(a) and (b) show a monotonically decreasing trend in electricity demand as the production index of the equipment for sport, camping and open-air recreation increases. These results suggest that outdoor activities have shifted electricity demand downward. Fig. 20(c) and (d) show that the electricity demand tends to decrease as the number of searches for "solar energy" in Google Trends increases; the trend becomes more noticeable during the day. These results suggest that increased consumer interest in solar power may contribute to reduced electricity demand. A plausible interpretation is the rise in small-scale solar power installations, which has accompanied heightened consumer engagement and curiosity. Alternatively, it may be that consumers' growing concern about energy issues has led them to reduce their electricity consumption. Throughout the COVID-19 pandemic, there has been a marked shift in consumer attention toward energy issues, with an increase in engagement with renewable energy and electric vehicles [47]. This shift in focus may have indirectly led consumers to adopt behaviors that reduce electricity demand.
The results suggest that our proposed data-driven analysis method allows for the mechanical identification of key variables associated with the electricity demand gap from a wide range of possible variables. Furthermore, by focusing on a select few key factors, our analysis of the additive contribution of each variable to the gap allows for a qualitative understanding of the relationships between each factor and the demand gap, relationships that have not conventionally been considered.

Conclusions
In this study, we developed a methodology to ascertain the impact of key variables on the electricity demand gap in Germany during the COVID-19 pandemic. Utilizing ARIMAX models, we quantified deviations of each explanatory variable from hypothetical non-COVID-19 variable scenarios. Additionally, we implemented a sparse enumerated PLAM to construct demand models that facilitate the selection of key variables and elucidate demand behaviors. This dual approach, employing both ARIMAX and PLAMs, effectively pinpointed the variables influenced by shifts in economic conditions and consumer behaviors throughout the pandemic, and it delineated their impact on the electricity demand gap across different seasons. Key findings of this research include:
• The ARIMAX model proved instrumental in estimating non-COVID-19 variable scenarios free of COVID-19 influences during the pandemic, providing a baseline for comparative analysis.
• Our variable selection methodology, which utilizes a sparse enumeration technique, adeptly identified critical variables that characterize electricity demand fluctuations across various seasonal contexts.
• Using PLAMs facilitated a detailed understanding of the interrelationships between electricity demand and pivotal variables under diverse seasonal conditions.
• Ultimately, our approach highlighted the primary effects of significant variables on the demand gap instigated by the COVID-19 pandemic, underscoring the responsiveness of the electricity sector to external disruptions.
This comprehensive analysis underscores the value of advanced modeling techniques in understanding complex market dynamics and aiding policymakers and industry leaders in making informed decisions during unprecedented times.

Policy implications
In this study, we focused on the electricity demand gap in Germany, analyzing the influence of the pandemic on changes in electricity demand. This analysis is instrumental in understanding the dynamics behind these changes and in forecasting future demand behaviors. Such insights are crucial for strategic decisions regarding infrastructure investment, such as enhancing grid flexibility to adapt to shifts in electricity use, and for creating informed public messages about energy conservation and management during crisis events. For instance, electricity utilities and governments can use these insights to adjust energy supply strategies, ensuring stability in electricity provision despite unpredictable demand fluctuations. Moreover, this analysis supports developing and promoting energy efficiency initiatives tailored to the observed shifts in consumption patterns. The influence of various factors on the demand gap may fluctuate with changes in government policies and other significant social events. Therefore, it is essential for electricity utilities and governments in each country and region to conduct detailed analyses focusing on individual power demands and variables. Moreover, it is imperative to perform demand analysis and estimation to prepare for future crisis events that could impact demand. This study proposes a general modeling approach that facilitates the analysis of demand factors based on a consistent methodology applicable to any period and observed variables. Electricity utilities and governments should continue to compare data from multiple perspectives, including different geographic regions and periods, especially during crisis events. This comprehensive approach will enhance the resilience and responsiveness of energy systems to global challenges.

Study limitations and future recommendations
In this study, we identified the key variables that significantly influenced the electricity demand gap during the pandemic, particularly between 2020 and 2021, when the impact of the pandemic was most pronounced. It is also crucial to examine the pandemic's ongoing effects on electricity demand beyond 2021. Post-pandemic, the consumption patterns in commercial, industrial, and residential sectors have varied considerably, with some categories reverting to pre-pandemic routines and others undergoing substantial changes. As we move forward, forecasting future electricity demand becomes increasingly complex due to the lasting impacts of the pandemic. The methodology proposed in this study effectively analyzes the direct effect of key variables on the targeted demand. However, constructing a hierarchical structure for factors that may indirectly influence the demand gap could provide a more comprehensive understanding of the increasingly intricate demand dynamics observed in recent years [48]. Although sparse modeling is a widely utilized machine learning technique for analyzing important variables mechanically, supplementing model outputs with literature reviews and expert consultations is crucial for accurately discerning the physical relationships among variables. For example, investigating the relationships between changes in long-term electricity demand and ESG (environmental, social, and governance) factors helps clarify a physical interpretation that considers the growth of the country and the company [49,50]. Future research should also focus on developing forecasting methodologies that adequately account for the uncertainty associated with key variables. These variables can exhibit high volatility over time and may significantly impact demand gaps based on economic, lifestyle, and environmental conditions. Such an approach will be essential for accurately projecting electricity demand in a post-pandemic world, ensuring that energy systems are resilient and responsive to ongoing and future changes.

λ
Observation of explanatory variables on year y
$h_a(d)$: Indicator function to derive the dummy variables based on the month
$h_b(d)$: Indicator function to derive the dummy variables based on the day of the week
κ: Coefficient parameters for the autoregressive component
μ: Coefficient parameters for the moving average component
ν, ξ: Coefficient parameters for the exogenous variables
$l_{ymdh}$: Hourly demand recorded at hour h on day d in month m of year y
J: Number of explanatory variables
$x_{ymdh} = (x^{1}_{ymdh}, \ldots, x^{J}_{ymdh})$: Explanatory variables observed at hour h on day d in month m of year y
$L \subseteq S$: Index subset showing variables linearly linked to target demand
$N \subseteq S$: Index subset showing variables nonlinearly linked to target demand
β: Coefficient parameters for explanatory variables
$\varphi^{k}_{j}(\cdot)$: Function for cubic spline transformation
K: Number of bases in cubic spline transformation functions
$\tau_j$: Vector of coefficient parameters for transformation functions
φ 1: Positive constant penalty for regularization
S = {1, …, P}: Index set of explanatory variables
$S^{(i)}_{mh}$: Index subset of the selected variables with nonzero components
$Gap_{ymdh}$: Electricity demand deviation at hour h on day d in month m of year y
$\hat{S}^{\mathrm{deviance}}_{mh}$: Index of the critical variables deviating from the scenarios without COVID-19 in each seasonal period

Fig. 1(a)-(d) show how electricity demand relates to different factors in specific seasons; solid lines represent relationships identified by the sparse PLAM introduced in Section 3. The figure highlights how electricity demand fluctuates in response to these factors.

Fig. 1. Relationships between seasonal electricity demand and several factors, with solid lines representing curves derived from the sparse PLAM (refer to Section 3). These factors include: (a) Temperature at 7 p.m. in February, (b) Temperature at 9 p.m. in April, (c) German stock index at 10 a.m. in March, and (d) German stock index at 7 p.m. in March.

Fig. 2. Behavior of electricity demand. The red line indicates the date COVID-19 was first detected, the blue line represents the annual average electricity demand, and the orange line represents the monthly average electricity demand. (For interpretation of the references to color in this figure legend, the reader is referred to the Web version of this article.)

Fig. 3. Difference between the electricity demand (a) in 2020 and (b) in 2021 and the electricity demand in 2019.

Fig. 5. Comparative analysis of monthly electricity demand from 2020 to 2021 against the same period in 2019.

Fig. 6. Sample averages of hourly electricity demand (a) in January, (b) in April, (c) in July and (d) in October.
mh , …, F (I mh ) mh )} be the set of variable indices and the corresponding squared error losses enumerated in the sparse PLAM scheme. These enumerated sets of variables ( S (2) mh , …, S (I mh ) mh

Fig. 7. Dynamics of explanatory variables across two scenarios: pre-pandemic and during the pandemic. Specifically, it focuses on: (a) Coal mining and (b) the German stock index (DAX). The black line represents the observed variable, while the orange line and shaded area depict the non-COVID-19 scenario and the 95 % confidence interval, respectively. This scenario is derived using pre-pandemic data through the ARIMAX method, as explained in Section 3.1. (For interpretation of the references to color in this figure legend, the reader is referred to the Web version of this article.)

Fig. 10(a)-(d) illustrate an overview of the observation granularity for each variable. Let $\mathcal{T} = \{1, \ldots, T\}$ be the training period, where T is the number of samples of hourly data. The training period for daily data, such as stock prices, is defined as $\mathcal{D} = \{1, \ldots, D\}$, where D denotes the number of samples. Similarly, the training period for monthly data, such as the index of production in services, is defined as $\mathcal{M} = \{1, \ldots, M\}$, and the training period for yearly data, such as power generation amounts, is defined as $\mathcal{Y} = \{1, \ldots, Y\}$.
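One way to place variables of different granularity on a common hourly index is sketched below. This is only an illustration under a stated assumption: a daily, monthly, or yearly observation is held constant over all hours it covers. The section does not specify how the study aligns the series, and the function name and values are hypothetical.

```python
import numpy as np

def broadcast_to_hourly(hours, stamps, values, unit):
    """Repeat a coarse-grained series at hourly resolution.

    hours:  array of datetime64[h] timestamps (the hourly index).
    stamps: sorted timestamps of the coarse series (days, months or years).
    values: one observation per entry of stamps.
    unit:   'D' (daily), 'M' (monthly) or 'Y' (yearly).
    """
    keys = hours.astype(f"datetime64[{unit}]")     # truncate each hour to its period
    stamps = np.asarray(stamps, dtype=f"datetime64[{unit}]")
    pos = np.searchsorted(stamps, keys)
    clipped = np.minimum(pos, len(stamps) - 1)
    if np.any(stamps[clipped] != keys):
        raise KeyError("an hour falls outside the coarse series")
    return np.asarray(values)[clipped]

# Five hours spanning the change from 2019-12-31 to 2020-01-01.
hours = np.arange("2019-12-31T22", "2020-01-01T03", dtype="datetime64[h]")
daily = broadcast_to_hourly(hours, ["2019-12-31", "2020-01-01"], [10.0, 20.0], "D")
monthly = broadcast_to_hourly(hours, ["2019-12", "2020-01"], [1.0, 2.0], "M")
print(daily, monthly)
```

Stacking the broadcast columns alongside the hourly variables would yield a single matrix with one row per hour.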

Fig. 9. Estimation overview of the non-COVID-19 variable scenario for each explanatory variable: (a) Overview of the predicted variable and its confidence interval and (b) an example of a variable that diverged significantly from its estimated scenario during the pandemic.

Fig. 12. Results of the Nemenyi test assessing the average ranks of descriptive accuracy. Hatched intervals indicate that no significant differences between average ranks were found by the test.
electricity demand, marked by increases during early morning and late-night hours and decreases during daylight hours. This pattern may indicate a shift in daily routines across commercial, industrial, and residential sectors and adaptation to remote working practices, which consequently alter traditional electricity consumption peaks. These observations suggest that the nature of electricity usage varies with the pandemic conditions.

Fig. 15. Mean additive contributions of each key variable affecting the electricity demand gap (a) in 2020 and (b) in 2021. The black box highlights variables that deviate from the non-COVID-19 scenario during the target period. The x-axis represents the transition of the situation (m,h), and the y-axis represents the selected variables. The color bar on the y-axis indicates the categories of the variable. (For interpretation of the references to color in this figure legend, the reader is referred to the Web version of this article.)

Fig. 16. Examples of additive contributions of crucial variables for the period shown in Fig. 14: (a) January 2020, (b) February 2020 and (c) July 2021. The black box illustrates key variables and their contributions to the demand gap relative to the deviance-oriented gap.

Fig. 17. Range of additive contributions of each key factor under non-COVID-19 scenario variation. Blue asterisks indicate variables whose upper and lower bounds have the same sign. (For interpretation of the references to color in this figure legend, the reader is referred to the Web version of this article.)

Fig. 19. Principal component scores of critical factors for each significant pattern. The top 10 variables with the greatest influence on the pattern are shown. The intensity of the color of the squares indicates the magnitude of the impact of each variable category on the electricity demand gap. (For interpretation of the references to color in this figure legend, the reader is referred to the Web version of this article.)

Fig. 20. Relationships between seasonal electricity demand and key factors. Solid lines represent curves derived from the sparse PLAM introduced in Section 3: (a) Production index of the equipment for sport, camping and open-air recreation at 1 a.m. in April and (b) at 1 p.m. in April, and (c) "Solar energy" in Google Trends at 1 a.m. in October and (d) at 1 p.m. in October.

Table 2
Categories of explanatory variables used.
Table A1 in the Appendix shows the observation granularity for each explanatory variable.

Table 3
Conditions of the constructed models.

Table 4
RMSE of the models as described in Table 2.

Table 5(a) displays the index of the variables in $\hat{S}^{\mathrm{deviance}}_{mh}$ deviating from the non-COVID-19 scenario. Table 5(b) lists the index and names of the key variable

Table 5b
Key variables fluctuating with scenarios.
a This category includes the amount of conventional power generation but not the generation from nuclear, lignite, hard coal, fossil gas, and hydro pumped storage.
b This category includes the mining and quarrying of various minerals and materials: abrasive materials, asbestos, siliceous fossil meals, natural graphite, steatite (talc), and feldspar.

Table A1 (continued)
"Maintenance and repair of other major durables for recreation and culture"