Artificial Intelligence for Electricity Supply Chain automation

The Electricity Supply Chain is a system of enabling procedures to optimize processes ranging from production to transportation and consumption of electricity. The proportion of distributed energy sources within the electricity system increases steadily, which necessitates an improved monitoring capability to ensure the overall reliability and quality of the Electricity Supply Chain. Automation is strongly required to process the growing amount of data. Thus, it is inevitable to handle large amounts of heterogeneous data and process the information using forecasting and optimization techniques. Artificial Intelligence techniques are crucial for extending human cognitive abilities in these tasks. In our work, we synthesize the main impacts of the Artificial Intelligence paradigm on the automation of the Electricity Supply Chain. We describe the emerging automation through Artificial Intelligence in every layer of the Smart Grid Architecture Model and highlight state-of-the-art approaches. In the review, we focus on the following Electricity Supply Chain functionalities: generation, maintenance, pre-processing, analysis, forecasting, optimization, and trading within energy systems. After investigating the individual perspectives, we examine the potential implementation of a fully automated Electricity Supply Chain. Lastly, we discuss perspectives and limitations for the transformation from conventional to automated Electricity Supply Chains, specifically in terms of human interaction, Artificial Intelligence adaptation, energy transition, and sustainability.


Introduction
The increasing number of local energy sources connected to the utility grid, such as individual photovoltaic (PV) systems, necessitates automated control to ensure a reliable and efficient electricity supply. The digitalization of the Electricity Supply Chain (ESC) is being accelerated to provide infrastructure supporting control automation in terms of collection, communication, and treatment of information for the increasing number of energy agents like prosumers. Based on the Smart Grid Architecture Model (SGAM) [1], the ESC includes the following domains of electricity: generation, including Distributed Energy Resources (DER), consumption, transmission, and distribution, as well as trading. Each of these domains has become more complex through digitalization, resulting in a growing demand to support grid operators and electricity traders. This requirement can be met by automating information treatment and the associated decision-making processes, which rely particularly on Artificial Intelligence (AI) techniques.
In this regard, AI is a collective term that combines tools to substitute the need for human cognitive ability [2], including machine perception and Machine Learning (ML) procedures. AI can be used in ESC automation because of its ability to handle large amounts of data, which are often heterogeneous and of varying quality. Furthermore, AI is capable of identifying complex patterns or relationships and requires little pre-processing [2]. While AI aids in data analysis and interpretation, it does not replace human intervention entirely. Considering the inability of AI to generalize across multiple tasks, specific problems require a specific implementation and appropriate control mechanisms. In addition, ML algorithms are often black box models with limited potential for understanding and controlling their deviations. AI applications requiring large input data sets are also known to be prone to overfitting. Nevertheless, review articles on the application of AI for energy sector automation in specific energy topics such as building energy efficiency [5] and energy economics [4] have been published in recent years. In addition, other researchers examined the adaptation of the ESC for AI in the context of electrical grid challenges to adopt automation [6] and intelligent processes [7]. Furthermore, Cheng et al. [8] presented a paper on AI evolution and taxonomy, which introduced a new generation of AI, illustrated with Smart Grid (SG) applications. Here, we intend to synthesize the main impacts of AI on the specific processes inside the ESC and discuss how AI might extend the possibilities of electrical sector automation. In this work, the focus is set on various AI approaches solving different tasks in the ESC to give perspectives for the automation of the entire ESC. We cover their specific limitations as well as the development necessary to achieve an optimal and sustainable energy supply. The specific focus of this review lies in the fields of generation, maintenance, analysis, forecasting, management, and trading within electrical systems, as well as the respective interaction between them, in order to automate not only the distinct tasks but also the entire ESC. The term "energy" essentially refers to its electrical aspect in this work.
The paper is organized as follows: Section 2 introduces AI methods that are principally used within the electricity sector. Then, a detailed description of specific AI algorithms is provided in Section 3 to solve various distinct tasks in the ESC based on the layers of the SGAM [9] (see Fig. 1 in Section 3). In Section 3.6, the combination of the distinct tasks is discussed to create a fully automated ESC. Finally, Section 4 discusses perspectives and challenges for an entirely automated ESC. In this context, an outlook on the evolution of AI implementations related to the ESC and the new possibilities of human interaction is given. AI-specific effects on energy sustainability, resilience, and transition are described as well.

Methods
This section gives an overview of the most common AI techniques used in the energy sector. First, we present a short history of AI and define shallow learning techniques like Random Forest (RF) or K-Nearest Neighbors (KNN) commonly used in energy applications. Given that Deep Learning (DL) techniques have become increasingly relevant, we briefly define them in Section 2.3 and discuss possible impacts. In Section 2.4, we examine the Deep Reinforcement Learning (DRL) approach, which is a specific application of DL playing an important role in automating energy processes. Finally, we consider Generative Adversarial Networks (GANs) and Genetic Algorithms (GAs) due to their expected role in the future electricity system and introduce them in Sections 2.5 and 2.6, respectively.

Artificial Intelligence: context and definitions
While the discipline of AI has existed for some decades now, there is still no commonly accepted definition of AI. In the following, we consider the AI definition of the Oxford Dictionary: "the theory and development of computer systems able to perform tasks normally requiring human intelligence, such as visual perception, speech recognition, decision-making, and translation between languages". From 1950 to the 1980s, the dominant paradigm of AI was the symbolic one. With symbolic AI, rules are defined that allow conclusions to be drawn from the input. Until the mid-1980s, domain experts agreed that with a sufficiently large set of rules, AI could be brought up to human cognitive levels [10]. However, symbolic AI reached its limits for very complex tasks such as image recognition, image generation, speech, and text recognition. Consequently, the field of ML gained major importance as a sub-field of AI and was defined as "the study of computer algorithms that improve automatically through experience" [10]. In the area of ML, supervised, unsupervised, semi-supervised, and Reinforcement Learning (RL) can be distinguished. While labeled data is necessary for supervised learning, no labels are required in unsupervised learning. In semi-supervised learning, a mix of labeled and unlabeled data is used for model training. Finally, the goal of RL (see Section 2.4) is to learn a strategy that maximizes a pre-defined quantity. Within the field of ML, DL approaches, especially Neural Networks (NNs), are gaining popularity. NNs can be used to learn representations of the input data through backpropagation [11]. This automatic feature learning is the basis of DL (see Section 2.3), where the parameters of multiple sequential layers are learned to represent given data. If, in contrast, only one representation layer is used in NNs, we speak of shallow learning (see Section 2.2).

Shallow learning
In contrast to DL, shallow learning uses only one representation layer. Typical methods established in the energy sector are discussed below.
Support Vector Machine (SVM). SVMs have proven to be very powerful for solving linear and non-linear problems, for regression and classification of time series, and for outlier detection [12]. An SVM defines decision boundaries as hyperplanes so that the training data is separated as optimally as possible according to its class affiliation. In practice, SVMs were found to be very well suited for the classification of complex small to medium-sized data sets [12]. In general, SVMs are popular and frequently used in the energy system, e.g., for condition monitoring of wind turbines [13] and fault diagnostics of hydropower plants [14].
Gradient Boosting Machine (GBM) and XGBoost (XGB). Some of the most common boosting algorithms [15] nowadays are GBM [16] and XGB [17]. The idea of boosting algorithms is to use an ensemble of weak models to build a stronger model [15]. Designed to be "efficient, flexible and portable" [17], XGB improves GBM from an algorithmic point of view by enhancing regularization, weighted quantile sketching, and sparsity-aware splitting. In recent years, XGB has become one of the leading benchmark techniques for ML. Due to its good performance in classification and regression, it is suited for a broad range of applications, which could soon reach the energy sector.
RF. The RF, proposed by Leo Breiman [18], is an ensemble method that combines a set of weak decision trees into a strong classification or regression model. Each decision tree is trained on bootstrapped samples of the training data. For regression tasks, the final prediction is the average of the predictions of all trees. For classification tasks, the final class is found by a majority vote over the predicted classes. In the energy sector, RFs have already been applied in the literature [19,20].
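The bagging principle behind an RF can be illustrated with a minimal sketch. It is not Breiman's full algorithm: as a simplifying assumption, each tree is reduced to a one-split regression stump on a one-dimensional input, random feature selection is omitted, and only the regression case (averaging) is shown. All function names are hypothetical.

```python
import random
from statistics import mean

def fit_stump(xs, ys):
    """Fit a one-split regression tree (stump) on 1-D inputs."""
    best = None
    for t in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        if not left or not right:
            continue
        lm, rm = mean(left), mean(right)
        err = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or err < best[0]:
            best = (err, t, lm, rm)
    if best is None:  # all inputs identical: always predict the mean
        m = mean(ys)
        return (xs[0], m, m)
    return best[1:]

def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean

def fit_forest(xs, ys, n_trees=25, seed=0):
    """Train each stump on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        forest.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return forest

def predict_forest(forest, x):
    """Regression: average the predictions of all trees."""
    return mean(predict_stump(stump, x) for stump in forest)
```

For classification, the averaging in `predict_forest` would be replaced by a majority vote over the predicted classes.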
KNN. As a lazy learner, a KNN method performs its modeling only at the time of the classification or regression request. The output of the request is simply based on the k closest training examples. For regression, the average of these k nearest neighbors is taken as the predicted value. For classification, the class with the most votes among the k nearest neighbors is selected.
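A minimal sketch of both KNN variants is given below, assuming Euclidean distance and training data stored as (feature tuple, target) pairs. The function names are hypothetical; no model is fitted in advance, which reflects the lazy-learning character.

```python
from collections import Counter
from math import dist

def k_nearest(train, query, k):
    """Return the targets of the k training points closest to `query`."""
    ranked = sorted(train, key=lambda item: dist(item[0], query))
    return [target for _, target in ranked[:k]]

def knn_regress(train, query, k=3):
    """Regression: average of the k nearest targets."""
    neighbours = k_nearest(train, query, k)
    return sum(neighbours) / len(neighbours)

def knn_classify(train, query, k=3):
    """Classification: majority vote among the k nearest labels."""
    return Counter(k_nearest(train, query, k)).most_common(1)[0][0]
```

All computation happens inside the request itself, so the cost of a prediction grows with the size of the stored training set.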

Deep Learning
NNs are algorithms that can approximate a non-linear function based on the given input data. They have the advantage that complex relations can be parameterized through the number of neurons, the activation function used, and the propagation function. Shallow NNs and Deep NNs (DNNs) differ in the depth of the network. DL is specifically defined by using more than two hidden layers. The goal of DL is to learn sequential layers for more complex data representations. Several developments have improved the training of DNNs:
1. The use of the Rectified Linear Unit (ReLU) [21] activation function and its further developments, which increase the stability of training.
2. Improved initialization functions, like Xavier initialization [22] and He initialization [23].
3. Improved optimizers like Adam [24] and RMSProp [25].
Regularization techniques like dropout [26] and batch normalization [27] have further improved the stability of training DNNs. Dropout is a regularization technique that reduces overfitting by not considering a certain number of neurons within each layer for a training step [26]. This forces the network to create more generalized solutions, which hardly depend on specific neurons. With batch normalization, overfitting is reduced by normalizing the output of each layer using the batch mean and the batch standard deviation. In the energy sector, time series forecasts are of special interest, for which Long Short-Term Memory (LSTM) networks are particularly suitable. LSTM-NNs are a special type of NNs [28] that can capture long-term dependencies. Recently, new methods for time series forecasting, such as DeepAR [29], were presented, which have achieved very good results. Due to the increasing need for forecasting in the energy system, we expect that DL will become a crucial tool. In particular, its combination with RL is of great interest for automating the energy system.
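The two regularization techniques can be sketched for a single layer output as follows. This is a simplified illustration under stated assumptions: the dropout variant shown is the common "inverted" form that rescales the kept activations, and the batch normalization omits the learnable scale and shift parameters used in practice. The function names are hypothetical.

```python
import random
from statistics import mean, pstdev

def dropout(activations, rate, rng, training=True):
    """Inverted dropout: drop each activation with probability `rate`
    and rescale the kept ones so the expected value is unchanged."""
    if not training or rate == 0.0:
        return list(activations)
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

def batch_norm(batch, eps=1e-5):
    """Normalize one feature across a batch with the batch mean
    and the batch standard deviation."""
    mu = mean(batch)
    sigma = pstdev(batch)
    return [(x - mu) / (sigma ** 2 + eps) ** 0.5 for x in batch]
```

During inference, `dropout` is called with `training=False` so that all neurons contribute to the prediction.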

Deep Reinforcement Learning
RL is a field of ML in which an agent, e.g., an electricity trader on the electricity exchange market, is trained in an environment to maximize a defined expected cumulative reward [30]. The reward is determined by the actions the agent performs in the environment, e.g., holding, buying, or selling electricity. The overall performance measure of RL is the cumulative reward, e.g., profit/loss in electricity trading. Accordingly, the idea of RL is to learn by interacting with the environment and adapting to it in a goal-oriented way. In 2015, RL was successfully combined with DNNs for the first time [31] through two improvements: (i) the introduction of a replay memory that stores agent experiences as a basis for learning, so that training data is less correlated; (ii) the separation into a target model for the evaluation of the agent's actions and an online model for the choice of actions, which stabilized the training. Moreover, an important technique called prioritized experience replay, which is related to importance sampling, was proposed in Ref. [32]. Instead of drawing experiences from memory with equal probability, important experiences are drawn from memory with a higher probability. Currently, the state-of-the-art RL methods used are Rainbow [33] (a combination of six different RL techniques) and Proximal Policy Optimization [34]. For energy systems, RL is of great interest as it can make a significant contribution to their automation in terms of grid protection, energy management, and automatic energy trading. Possible applications of RL are discussed further in Section 3.
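The replay memory and the difference between uniform and prioritized sampling can be sketched as below. This is an illustrative simplification, not the method of Refs. [31,32]: priorities are passed in by the caller (in practice they are typically derived from the learning error), and no bias correction is applied. The class and method names are hypothetical.

```python
import random
from collections import deque

class ReplayMemory:
    """Fixed-size store of (state, action, reward, next_state) transitions."""

    def __init__(self, capacity, seed=0):
        self.buffer = deque(maxlen=capacity)      # oldest entries are dropped
        self.priorities = deque(maxlen=capacity)  # one priority per transition
        self.rng = random.Random(seed)

    def store(self, transition, priority=1.0):
        self.buffer.append(transition)
        self.priorities.append(priority)

    def sample_uniform(self, batch_size):
        """Every stored experience is drawn with equal probability."""
        return self.rng.sample(list(self.buffer), batch_size)

    def sample_prioritized(self, batch_size):
        """Experiences are drawn with probability proportional to priority."""
        return self.rng.choices(list(self.buffer),
                                weights=list(self.priorities), k=batch_size)
```

A trading agent would, for example, call `memory.store((state, "buy", profit, next_state), priority=abs(error))` after each step and train on batches drawn from the memory instead of on consecutive, strongly correlated steps.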

Generative adversarial network
A GAN is a generative model, i.e., a model that describes in terms of probabilistic models how a data set is created [35]. It consists of two NNs, i.e., the generator and the discriminator, which have opposing optimization targets. The discriminator aims to detect whether the data set is real or fake. The goal of the generator is, in contrast, to produce fake data sets that are undetectable for the discriminator. For this purpose, the generator takes a random vector, such as a Gaussian noise vector, as input and produces new data, e.g., an image, audio, text, or a time series. As the objectives of the generator and the discriminator are contrary, the two networks are trained separately. Since their invention in 2014, GANs have become very popular and are considered one of the most important inventions in the field of DL in the last 20 years. For a comprehensive overview of GANs and their variants, we refer to Ref. [36]. Possible applications of GANs in the energy sector are the implementation of wind and solar data [37], but also the generation of new time series data with specific properties [38]. In this context, GANs can be used to automatically build additional training data for other models, but also to detect anomalous time series inserted by man-in-the-middle attacks or produced by erroneous measurement or production/generation equipment.

Genetic Algorithms
A GA is a procedure inspired by evolutionary theory, where the fittest individuals are chosen for reproduction. The basic idea is that through the interplay of modification and selection of better-suited individuals, good approximate solutions can be found for given problems [39]. GAs generally start with a randomly selected and evaluated population in which each individual represents a possible solution to the problem. Then, the following cycle is repeated until a termination condition is fulfilled [39]: (i) selection of individuals for recombination, (ii) recombination of the characteristics of these selected individuals into descendants, (iii) random mutation of the descendants' characteristics, (iv) evaluation of the mutated descendants, and (v) determination of the next generation. GAs play an important role in the energy industry and are widely used, e.g., to solve the unit commitment problem (see Section 3.4.2) and to automate energy production (see Section 3.1.1).
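The five-step cycle can be sketched on a toy problem with bit-string individuals. The concrete operators are assumptions chosen for brevity (tournament selection, one-point crossover, bit-flip mutation, and keeping the best individual found so far); the function name and all parameters are hypothetical, and the example is not a unit commitment solver.

```python
import random

def genetic_algorithm(fitness, n_bits=12, pop_size=20, generations=40,
                      mutation_rate=0.05, seed=0):
    rng = random.Random(seed)
    # Random initial population; each individual is a candidate solution.
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):  # termination: fixed number of cycles
        descendants = []
        while len(descendants) < pop_size:
            # (i) selection: the fitter of three random individuals wins
            p1 = max(rng.sample(population, 3), key=fitness)
            p2 = max(rng.sample(population, 3), key=fitness)
            # (ii) recombination: one-point crossover
            cut = rng.randrange(1, n_bits)
            child = p1[:cut] + p2[cut:]
            # (iii) mutation: flip each bit with a small probability
            child = [1 - g if rng.random() < mutation_rate else g
                     for g in child]
            descendants.append(child)
        # (iv) evaluation and (v) next generation, keeping the best so far
        population = sorted(descendants + [best], key=fitness,
                            reverse=True)[:pop_size]
        best = population[0]
    return best
```

Calling `genetic_algorithm(sum)` searches for the bit string with the most ones; for an energy application, the bits could stand for on/off states of generation units and `fitness` for a cost-based score.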

Electricity Supply Chain automation and Artificial Intelligence
The major contribution of this paper is the presentation and discussion of the potential improvements of the different distinct parts of the ESC based on AI but also the potential implementation for an automated ESC by combining these distinct tasks using several AI techniques.The structure of this section follows SGAM [8] as shown in Fig. 1 because it assigns different related tasks to a certain layer.
We begin our analysis of the ESC in Section 3.1, starting at the component and communication layer. These layers cover the electricity production in different generation units (RES and conventional units), the predictive maintenance of various components, and the prevention of cyber-attacks. In Section 3.2, we analyze different methods like pre-processing of data transferred from the component layer via the communication layer to the information layer. In conjunction with a knowledge-based system, data analysis and data transformation techniques are proposed to extract important information as well as to find and optimize the most suitable forecast method. Anomaly detection and substitute value formation are central within the entire process chain, ensuring valid data, especially for AI methods. Thereafter, we analyze the function layer in detail. In Section 3.3, methods for time-series forecasts are discussed, whereas the optimal solution of the unit commitment problem (UCP) based on the forecasted time series is addressed in Section 3.4. In Section 3.5, we focus on the methods of the business layer. These aim at increasing the economic benefit using flexibilities gained from the optimization in the previous layer. In this work, we primarily focus on different methods concerning Electricity Price Forecast (EPF) and energy trading strategies. Finally, the findings of the previous sections are aggregated into a fully automated ESC combining the distinct tasks of the individual layers. For this purpose, we discuss the necessities and implications that arise from the integration in Section 3.6.

Components & communication -energy access
As shown in Fig. 2, this section is focused on the services at the component and communication layer. The most crucial functions of the component layer have no links to the distributed parts of the SG. Thus, cyber-attacks are prevented by using separate networks. However, to set up a smart data-based system, it is necessary to connect different distributed parts via the communication layer. Some services that can be improved by using AI concepts are considered in Section 3.1.3. Starting with the initial stage of the ESC, we describe the impact of AI on Electricity Access (EA). Here, EA refers to energy generation and distribution as well as their management through information communication infrastructure at the component level. The following paragraphs also report the support provided by AI automation for efficient energy production, health monitoring of EA, and its metering relying on information communication. The main impacts of AI on EA are synthesized in Table 1.

Automation of electricity production
Before electricity is produced, AI can assist the design process of generation systems to ensure their optimal usage. For example, clustering was employed to maximize the yearly solar cell production through dimension reduction of the spectral characteristics [40]. Furthermore, the design and sizing of the local electrical grid for Renewable Energy Sources (RES) can be automated, e.g., for wind farms using GA [41]. The production control of hazardous energy sources like nuclear plants can also benefit from AI automation. Indeed, NNs were able to model nuclear generation regarding heat and flow transfers [42,43]. Similarly, the management of highly variable RES like wind farms can be improved by NNs that better predict their performance [44]. Beyond energy generation modeling, the definition of control policies can be automated as well, e.g., a fuzzy NN method was proposed to control hydraulic turbines [45]. Furthermore, C. Keerthisinghe et al. showed that NN techniques were comparably efficient to Dynamic Programming (DP) [46]. In their work, they applied a policy function approximation to the use case of PV storage energy management. More generally, for optimal power flow management of DER, a data-driven procedure based on multiple stepwise linear regression was used to learn DER-specific control policies [47]. Finally, the automation of energy production operations was performed for simplification and optimization purposes as well as environmental ones. For example, NNs and GAs were applied to model and predict gas emissions from coal-fired power plants to automate human supervision [48].

Health monitoring for fault detection and predictive maintenance
Energy production processes can be subject to dangerous situations since generation and distribution operations might lead to material, economic, and human risks. As a preventive measure, AI techniques have been developed to monitor the health of generation systems. Probabilistic NN classifiers were proposed for fault detection and classification in PV systems [49,50]. For wind turbines, Stetco et al. described NNs and SVMs as common techniques for condition monitoring [13]. For example, DNNs have been used for turbine fault detection [51]. Moreover, condition and fault diagnostics of hydropower plants were performed using SVM classifiers [14]. Lastly, thermal power plant monitoring was achieved by DL applied to remote sensing images [52] and fault detection by SVMs [53] and NNs [54].
The distribution devices that transport the produced energy can also be damaged and require monitoring. The fault diagnosis of power transformers in electrical substations was realized with a KNN cumulative voting approach [55]. Additionally, predictive maintenance to prevent power substation failures was achieved by an NN algorithm applied to thermal images [56]. Concerning electricity delivery lines, locations of faults were determined by NNs [57] and SVMs [58]. Furthermore, ML delivers information about fault locations and fault types of power lines [2]. The state recognition of isolating switches can also be supported by AI using SVMs to evaluate their high-voltage condition [59].

Communication infrastructure supported by Artificial Intelligence
To support EA, AI requires a strong communication infrastructure to gather data from the energy system sensors and metering devices. To meet this demand, intelligent communication approaches were proposed, like cognitive radio, which enables optimal usage of the available spectrum resources for wireless operations. This wireless communication was automated with all categories of ML techniques to ensure adaptive learning and decision making [60,61]. Because they are based on information communication, SG systems are exposed to cyber-attacks, which can be active, like the injection of false data, or passive, if data confidentiality is affected. Against active attacks, a k-means clustering algorithm was used to identify anomalies within normal data traffic of an SG [62]. Moreover, an online NN algorithm was proposed to detect malicious voltage control actions [63]. To uncover both active and passive attacks, Prasad et al. employed an ML boosting approach [64]. Ahmed et al. [65] proposed a combination of GAs and SVMs while indicating that the obtained results depend on system size. Indeed, the performance of ML techniques in detecting cyber-attacks is not homogeneous and depends on the system dimensions as well as the associated available data [66]. Lastly, due to the high cost of developing the infrastructure, AI is starting to be employed as an alternative that extends system automation with only minimal communication. In this regard, ML methods, i.e., SVM algorithms, were proposed to compute control strategies that rely only on historical data and allow local management of DER without communication needs [67].

Information -data pre-processing, anomaly detection, and analysis
The data-handling methods of the information layer (see Fig. 3) include data pre-processing, anomaly detection, and data analysis. These methods are crucial for appropriate data modeling. To enhance the automation of the ESC based on ML, reliable data models are necessary to prevent forecast errors (see Section 3.3) and false decision-making (see Sections 3.4 and 3.5). The energy data to be processed are transferred from the production sites, such as conventional or renewable power plants, as well as from metering devices such as smart meters and their gateways. Generally speaking, electricity time series concerning consumption or generation are often non-stationary. Furthermore, they possess strong daily, weekly, and annual cycles. Additionally, they contain anomalies like missing or false values [68]. Thus, the goal of data pre-processing is to generate a time series without anomalies that can be used for data analysis and forecasting. From a variety of applicable pre-processing methods, several need to be selected by the user. The choice depends on the use case, but also on prior knowledge of the time series and the selected forecast algorithm. First, each time series should be checked for erroneous behavior with the methods described in Section 3.2.1. For a new time series without any prior knowledge about its characteristics, it is preferable to run through a large number of different pre-processing steps to get an overview of its general characteristics (see Section 3.2.2). Before selecting an appropriate forecast model (see Section 3.3), different data transformation techniques (see Section 3.2.3) can be applied. In addition, methods such as correlation, clustering, classification, and regression can be used to get insights into the characteristics of a given time series. An overview of data analysis methods can be found in Table 2. If one has prior knowledge about a time series and an already trained forecast model exists, one only needs to apply the pre-processing steps to obtain the best results. Since time series might change their behavior, e.g., due to a change in the portfolio of an energy supplier, an automated forecast evaluation based on the comparison between the forecast and a standard load profile or historical data is recommended in order to react instantaneously.

Time series tests and corrections
Whether we want to process a new energy time series or new data points of an existing one, it is preferable to check its values for plausibility, detect anomalies, and substitute erroneous values. In the following, we introduce different possibilities for this task.
Plausibility tests. At the beginning of a time series analysis, plausibility tests that check for errors in the time series, e.g., regarding status information of a smart meter or sensor, are crucial for the quality of the analysis and, thus, for the forecast results. It is highly recommended to verify time series quality, and especially sensor data quality, in the context of energy time series [70]. Since this paper focuses on automation and data science, plausibility checks like visual control of measurement devices are not applicable.
Anomaly detection algorithms. The numerical behavior of time series is checked for abnormal values by anomaly detection methods [71][72][73]. These can be classified by the available knowledge of a given data set [71]: (i) no prior knowledge of data, i.e., unsupervised clustering; (ii) data labeled as normal data and anomalies, i.e., supervised classification; (iii) only normal data available, i.e., novelty detection or semi-supervised recognition. In the case of energy time series, labeled data are hardly available, and different types of anomalies can be distinguished, enabling an advanced knowledge extraction. There is a variety of definitions of abnormal data [71][72][73][74], each emphasizing different aspects. Nevertheless, three types of anomalies in energy time series might be distinguished [72]:
1. Noise data, which describes either data containing logical errors (violation of business rules, e.g., consumption data from February 30th) or inconsistent data containing format issues, deviations from the co-domain (negative energy generation), and significance issues.
2. Incomplete data, which refers to missing or duplicated data and to data that randomly deviates from statistical characteristics.
3. Outlier data, which is generated by unusual circumstances. These can be based either on subjective reasons originating from human factors, e.g., change of consumer behavior due to price signals, or on objective reasons, e.g., system maintenance or natural catastrophes.
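Rule-based checks for the first two anomaly types can be sketched as follows. The sketch assumes generation readings given as (timestamp string, value) pairs with `None` for a missing value; the timestamp format, the function name, and the issue labels are hypothetical, and outlier detection is deliberately left out because it requires a statistical or learned model.

```python
from datetime import datetime

def classify_readings(readings):
    """Return (timestamp, issue) pairs for noise and incomplete data."""
    issues = []
    seen = set()
    for ts, value in readings:
        try:
            # Logical error: dates such as February 30th cannot be parsed.
            datetime.strptime(ts, "%Y-%m-%d %H:%M")
        except ValueError:
            issues.append((ts, "noise: invalid timestamp"))
            continue
        if ts in seen:
            issues.append((ts, "incomplete: duplicated"))
            continue
        seen.add(ts)
        if value is None:
            issues.append((ts, "incomplete: missing value"))
        elif value < 0:
            # Deviation from the co-domain: generation cannot be negative.
            issues.append((ts, "noise: negative generation"))
    return issues
```

Readings flagged in this way would subsequently be passed to a correction step, whereas outliers are kept for separate analysis.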
While noise and incomplete data should be treated in a correction step, outliers include significant information about the underlying events. Thus, outlier mining might offer more insights into underlying mechanisms [73]. Outlier mining methods can be divided into (i) methods based on NNs; (ii) methods based on statistical methods, e.g., fuzzy theory, proximity-based methods, and regression analysis; (iii) ML approaches such as decision trees, rule-based systems, and time series analysis; (iv) methods using state estimation techniques; and (v) hybrid methods [71][72][73].
Substitute value formation. After the tests for plausibility and anomalies, missing or false values in the time series have to be substituted to ensure reliable forecasts, using different techniques depending on the type of error. Besides conventional methods like ignoring or deleting missing or false values and mean imputation, there are far better ways to estimate or correct values [75]. Prominent examples include imputation based on KNN and SVMs, GAs and autoregressive models, maximum likelihood, or imputation through interpolation. The latter technique should only be applied to electricity time series if the run of missing or false values is no longer than seven consecutive values [76]. If the run of missing or false values is longer, typical load or generation profiles [77] or even historical values fitted to the current date should be used if no more sophisticated methods are available. Finally, leaving the values unprocessed can also be an option when training a forecast model based on a decision tree [78].
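The interpolation rule with its fallback can be sketched as below. The sketch assumes linear interpolation, missing values marked as `None`, and a fallback `profile` (e.g., a typical load profile) of the same length as the series; gaps at the edges of the series are also filled from the profile because they have no neighbor on one side. The function name is hypothetical.

```python
def fill_gaps(series, profile, max_gap=7):
    """Interpolate short gaps linearly; fill longer gaps from `profile`."""
    filled = list(series)
    n = len(series)
    i = 0
    while i < n:
        if series[i] is not None:
            i += 1
            continue
        start = i
        while i < n and series[i] is None:
            i += 1
        end = i  # index of the first valid value after the gap (or n)
        gap = end - start
        has_neighbours = start > 0 and end < n
        if gap <= max_gap and has_neighbours:
            left, right = series[start - 1], series[end]
            for j in range(start, end):
                frac = (j - start + 1) / (gap + 1)
                filled[j] = left + frac * (right - left)
        else:
            for j in range(start, end):
                filled[j] = profile[j]
    return filled
```

For example, `fill_gaps([1.0, None, None, 4.0], profile)` interpolates the two missing values, while a run of eight missing values is replaced by the corresponding profile values.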

Time-series statistics
Different statistics can be used to test for seasonality or stationarity and to extract general characteristics of the time series. In the field of automation, statistics are also beneficial for linking acquired knowledge to the current problem [79] and help to find the optimal model configuration. Table 2 gives some possible correlation types and other statistics, which can be applied to either the scaled and corrected time series (see Section 3.2.1) or the residuals of the detrended time series (see Section 3.2.3).

Data transformation
Stationarity and detrending. Among the characteristics of a time series, stationarity, which means that the probability distribution is time-independent, is one of the most important [80]. Even after time series are corrected (see Section 3.2.1) and analyzed (see Section 3.2.2), forecast algorithms may still not be able to predict them reasonably due to included (linear and non-linear) trends, seasonality, or other periodicities. Time differencing [81,82], as well as Fourier transformation, wavelet transformation, and empirical mode transformation [83], are often used to eliminate seasonality and trend.
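Time differencing can be illustrated with a short sketch: differencing with a lag equal to the season length removes the seasonal pattern, and a further differencing with lag 1 removes a remaining linear trend. The function names and the toy series are assumptions for illustration.

```python
def difference(series, lag=1):
    """Return series[t] - series[t - lag] for all valid t."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]

def undifference(diffed, head, lag=1):
    """Invert difference() given the first `lag` original values."""
    restored = list(head)
    for d in diffed:
        restored.append(restored[-lag] + d)
    return restored

# Toy load series: a pattern of period 3 on top of a rising trend.
load = [10, 12, 14, 11, 13, 15, 12, 14, 16]
deseasonalized = difference(load, lag=3)   # seasonal pattern removed
detrended = difference(deseasonalized)     # remaining trend removed
```

Because the transformation is invertible via `undifference`, a forecast made on the differenced series can be mapped back to the original scale.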
Feature engineering and scaling. Besides classical time series transformations (see Table 2), it is strongly recommended to use feature scaling techniques to increase the comparability of different data sets and to enhance the learning process and the accuracy of different models. Time series can also be transformed with feature engineering techniques, which can enhance model accuracy as well. Possible features are: (i) date-related features to extract patterns for weekdays, weekend days, months, and even holidays; (ii) time-related features to extract patterns for, e.g., daytime and nighttime; (iii) rolling mean or variance of different time windows; (iv) domain-specific features to extract a meaningful relation by multiplying, dividing or even combining different variables.
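A few of the listed feature types can be sketched in plain Python. The feature names and the 06:00 to 18:00 daytime boundary are assumptions chosen for illustration; holiday and domain-specific features are omitted because they depend on the region and the data set.

```python
from datetime import datetime
from typing import Dict, List, Optional


def min_max_scale(values: List[float]) -> List[float]:
    """Scale values linearly to the interval [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def rolling_mean(values: List[float], window: int) -> List[Optional[float]]:
    """Trailing mean over `window` samples; None until the window is full."""
    out: List[Optional[float]] = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(values[i + 1 - window : i + 1]) / window)
    return out


def calendar_features(ts: datetime) -> Dict[str, int]:
    """Date- and time-related features for a single timestamp."""
    return {
        "hour": ts.hour,
        "weekday": ts.weekday(),  # Monday = 0
        "is_weekend": int(ts.weekday() >= 5),
        "month": ts.month,
        "is_daytime": int(6 <= ts.hour < 18),  # assumed day/night boundary
    }
```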

Function I - forecast of load and generation
This section focuses on the forecast methods for Electricity Demand (ED) and generation in the function layer (see Fig. 4), which are based on the data models of Section 3.2 and determine the achievable quality of the optimized schedule of available energy units (see Section 3.4). To ensure grid stability [84], Electricity Generation (EG) and consumption have to remain in balance. Economic and physical damage caused by the regulation of conventional power plants due to bottlenecks in the ESC can be avoided by intelligent grid management using forecasts of time series. There are three main counteracting mechanisms determining grid stability: (i) ED, (ii) EG of RES and conventional power plants, and (iii) Electricity Storage (ES) to buffer ED and EG. Their characteristics differ: aggregated ED is less variable and depends mainly on temporal patterns like daytime, day of the week, holiday, and season. In contrast, EG, esp. the generation of RES, is mainly affected by exogenous weather conditions, thus causing a stochastic behavior. However, the weather also affects human behavior and induces randomness in electricity consumption. Fortunately, using the buffering capacities of ES can help to balance ED and EG. While ED and EG of RES are independent of each other and can each be predicted by standalone forecast algorithms, ES depends on ED and EG as well as on its storage capacity and its current state. Thus, it can be treated mathematically as an optimization problem (see Section 3.4). The following gives a brief introduction to different time series forecast algorithms and AI methods to solve prediction tasks adequately. Finally, applications are discussed.

Electricity load forecast
The electricity load of a certain region comprises many different consumers, such as private and commercial users. In addition to the abovementioned dependencies (daytime, weekday, season, and weather), energy consumption is a significant cost factor in manufacturing companies but also in households. Thus, the energy price is correlated with the electricity load. This relationship is induced by the adaptation of consumption to prices from the energy market [69] and tariffs. Therefore, minimizing energy usage might be integrated into the decision-making system of production planning and control to reduce production costs [85]. In the case of a single factory load forecast, hybrid (physical) simulations of manufacturing process queues including static and dynamic processes can be used [85]. These simulations are mainly based on physical models. However, forecasting an entire grid (or region) requires an appropriate energy consumption estimation of all participants, and purely physical models would be cumbersome to use. An overview of state-of-the-art algorithms to forecast load time series for grids is given in Table 3, and details can be found in Refs. [81,82,86].

Renewable energy forecast
Since several RES such as hydropower, biomass energy, geothermal, wind, and solar power plants are part of the electricity grid, their forecast is highly important. The latter two depend strongly on the prevailing weather conditions [91,92], while the other RES have different dependencies. The expansion of renewable energies toward a higher penetration of RES in the electricity grid is primarily driven by solar and wind power generation, given that the implementability of other RES is limited for geographical, geological, and biological reasons. Therefore, the need for highly accurate solar and wind power forecasts should not be underestimated. Unfortunately, each forecasting method for wind and solar power yields results with a specific temporal resolution depending on the included geographically induced weather conditions. Thus, an evaluation of the forecast results on real measurements of the investigated power plant is strongly recommended. Additionally, there is a huge difference between predicting a single power plant and an entire electricity grid [93] due to spatiotemporal balancing effects. There are many different types of models to predict solar or wind power, which can generally be divided into statistical, physical, and hybrid models.

Table 3
Overview of forecast models and their domain of application.

Algorithm: Description

Regression: The regression formula is always manually formulated and therefore understandable, but it cannot figure out complex, non-linear relationships, and even seasonality. Regression analysis for the prediction of residential energy consumption was applied in Ref. [87].

Autoregressive Integrated Moving Average (with eXogenous variable): The Autoregressive Integrated Moving Average (with eXogenous variable) (ARIMA(X)) algorithm is mostly applied to self-correlated time series and built on a regression of the endogenous variable and its lag values as well as exogenous variables like weather data. It can be extended by a seasonal component. The mathematical model is easy to train, understandable and stable in its application. It was used, e.g., to forecast power demand for an office building and the energy demand in China [81,86].

Fuzzy: Fuzzy algorithms have been used to get a better understanding of uncertainties in the electricity load forecast and have a high accuracy considering uncertain situations. However, they have high computational complexity and lack stability [86].

NN: NNs need big data and expert knowledge to set up an appropriate architecture to avoid overfitting. Different model architectures were applied to short-term energy load forecasting and the results were compared in Ref. [88].

RF: RFs are very robust but also limited due to their constrained co-domain. RF is used, e.g., to combine feature engineering and selection in an optimization framework to solve accurate load forecasting tasks in SGs [20].

SVM: SVMs can solve non-linear problems in the case of small samples and improve generalization. However, they are very sensitive to missing data. SVMs were used to calculate the demand response baseline in office buildings to manage SGs [89] and short-term residential load forecasts [90].
Whereas statistical models calculate future values of power output from past power values, physical models use future values of numerical weather prediction or cloud movement [95,96]. Hybrid models finally merge the two concepts by optimizing statistical and physical models. Solar power forecast. The power output of a specific PV module depends on (i) the incident light (diffuse and direct irradiance) [97,98], (ii) its system configuration including inverter and coating, (iii) shadow casting, and (iv) temperature [99]. Snow and dust can additionally cover the surface and, consequently, reduce PV power output. Satellite images can be used to measure the solar irradiance directly and to extrapolate it by Cloud Movement Vectors (CMV) into the near future (<6 h) [95]. In this regard, Numerical Weather Prediction (NWP) models estimate solar irradiance for the next days using various parameters [100]. Model Output Statistics (MOS) using AI further improved the prediction of solar irradiance in topographic terrain [101]. Solar power forecasting for an entire electricity grid is very challenging due to the heterogeneous distribution of installed PV power. A comparison between different model types to predict solar power output for different forecast horizons found that the persistence forecast method performs best for forecast horizons of only a few minutes [101]. CMV, NNs, and different time series models achieve the best results for a couple of hours, whereas NWP is the best choice for larger forecast horizons [101]. A multivariate regression combines these different forecast techniques to obtain one forecast model for every forecast horizon [102]. This hybrid technique performed best according to Ref. [103].
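The persistence method mentioned above is commonly used as a reference for other forecast models. The following sketch shows the persistence forecast together with a skill score relative to it; the definition of the skill score as one minus the ratio of mean absolute errors is a common convention and an assumption here, not taken from Refs. [101-103].

```python
from typing import List, Sequence


def persistence_forecast(history: Sequence[float], horizon: int) -> List[float]:
    """Repeat the last observed value for each of the next `horizon` steps."""
    return [history[-1]] * horizon


def mean_absolute_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute deviation between two equally long sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def skill_vs_persistence(history: Sequence[float], actual: Sequence[float],
                         model_forecast: Sequence[float]) -> float:
    """1 - MAE(model) / MAE(persistence); positive values mean the model beats persistence."""
    reference = persistence_forecast(history, len(actual))
    return 1.0 - mean_absolute_error(actual, model_forecast) / mean_absolute_error(actual, reference)
```

Evaluating the skill score separately for each forecast horizon would reveal where persistence, time series models, or NWP-based forecasts are preferable.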
Wind power forecast. Wind power forecasting is another challenging task due to very different local topographical conditions, including surface roughness. Wind speed increases with height and depends on wind direction. Currently, only local wind measurements (wind power output of single plants) and numerical weather prediction, which includes discrete horizontal grids and many vertical layers, are available to predict wind power. Unfortunately, there is no accurate wind speed estimation using satellite images. However, wind speed prediction was improved by applying MOS using NNs and real wind speed measurements to NWP output [104]. Furthermore, the distance between wind power plants is also important, since the turbulent flow around a wind power plant, which depends on wind direction and velocity, affects neighboring plants [105]. Additionally, different wind turbines have varying properties with individual efficiencies to be considered. Meteorological fronts, known as ramp events, and the heterogeneous distribution of installed wind power can lead to significant challenges for power system operators [106]. Since every wind power farm has a different efficiency considering the previously mentioned effects, many researchers have investigated different models to simulate and predict single wind power farms. Single time series forecast algorithms (see Table 3) were outperformed by a hybrid combination of them [107]. An improved radial basis function NN-based model with an error feedback scheme to optimize wind power forecast accuracy for the next 72 h was proposed by Chang et al. [108]. A novel hybrid approach based on a deep convolutional network, wavelet transform, and an ensemble technique was able to outperform the standard persistence method and shallow NNs [109]. Through spatial smoothing, Matthias et al. were able to achieve a more accurate wind power prediction for several wind farms on a regional scale [110].

Forecast automation
Fig. 5. Forecast process pipeline (a) included in the workflow of TPOT (b) and auto-sklearn approach (c).

Summarizing Sections 3.3.1 and 3.3.2, the forecast process chain consists of many different steps, and each of them requires expert knowledge for selecting and configuring the problem-specific algorithms. The difficulty increases when considering the organization and operation of the entire forecast pipeline as shown in Fig. 5 (a). In consequence, there is an urgent need for unifying and automating these different tasks through, e.g., Automatic ML (AutoML). AutoML solutions only need input data, e.g., energy time series, to provide a reasonable forecast without any human intervention. Thus, they enable non-experts to use ML procedures effortlessly. An AutoML system needs to compute its tasks efficiently and in an easily comprehensible way. The general AutoML pipeline (see Fig. 5 (a)) and its challenges are described in Ref. [111]. Firstly, different data pre-processing, data analysis and feature processing steps should be applied (see Section 3.2). During the selection of a problem-specific ML model, overfitting and a possible ensemble creation need to be considered. Additionally, each forecast model has to be optimized concerning its hyper-parameters. Therefore, a procedure consisting of three steps was suggested [111]: (i) filtering methods to narrow down the range of hyper-parameters (without training the ML parameters) by applying different statistical tests (chi-square test, Fisher score, or correlation coefficient); (ii) wrapper methods using trained ML as black boxes to select hyper-parameters; (iii) embedded methods using knowledge of the ML structure and parameters to find the hyper-parameters. In Ref. [112], two state-of-the-art ML tools (auto-sklearn [113] and TPOT [114]) were tested to solve the automation process chain in the context of electricity load forecast. By using two benchmark datasets of load consumption, it was concluded that these automation systems have a high potential [112]. A scheme of the used workflow is shown in Fig. 5, in which both AutoML frameworks applied the same pre-processing steps and ML models as the forecast process pipeline (see Fig. 5 (a)). TPOT uses genetic programming in a loop to optimize hyper-parameters, whereas auto-sklearn uses Bayesian optimization. In comparison with manually configured models, AutoML can produce robust models achieving accuracy close to a prediction system designed by an expert. Several other Python packages that use AutoML approaches exist as well, e.g., pyautoweka and PyAF. It is preferable to adapt an ML pipeline to an entire time-series prediction task because some algorithms are only suitable for classification tasks [112]. Additionally, the state-of-the-art libraries (auto-sklearn and TPOT) do not include the time-series models presented in Sections 3.3.1 and 3.3.2 but give a practical overview of the AutoML approach. The future challenge is to extend these frameworks by accurate time-series forecast algorithms for energy consumption (see Section 3.3.1 and esp. Table 3) and RES generation as described in Section 3.3.2.
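The model selection step of such a pipeline can be illustrated with a deliberately simple sketch. It does not use the auto-sklearn or TPOT APIs; instead, it assumes three hypothetical baseline forecasters and chooses among them by walk-forward validation, i.e., each one-step forecast only sees data before the predicted time step.

```python
from typing import Callable, Dict, List, Tuple

Forecaster = Callable[[List[float]], float]  # history -> one-step-ahead forecast


def last_value(history: List[float]) -> float:
    return history[-1]


def seasonal_naive(period: int) -> Forecaster:
    def forecast(history: List[float]) -> float:
        return history[-period]  # value observed one season ago
    return forecast


def moving_average(window: int) -> Forecaster:
    def forecast(history: List[float]) -> float:
        tail = history[-window:]
        return sum(tail) / len(tail)
    return forecast


def walk_forward_mae(series: List[float], forecaster: Forecaster, start: int) -> float:
    """Mean absolute one-step error on an expanding window beginning at `start`."""
    errors = [abs(series[t] - forecaster(series[:t])) for t in range(start, len(series))]
    return sum(errors) / len(errors)


def select_model(series: List[float], candidates: Dict[str, Forecaster],
                 start: int) -> Tuple[str, Dict[str, float]]:
    """Return the name of the candidate with the lowest walk-forward error and all scores."""
    scores = {name: walk_forward_mae(series, f, start) for name, f in candidates.items()}
    return min(scores, key=scores.get), scores
```

An AutoML framework additionally searches over pre-processing steps and hyper-parameters, but the validation loop follows the same principle of never evaluating on data seen during fitting.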

Function II - energy management and potential solutions to the unit commitment problem
This section considers the Energy Management System (EMS) as part of the ESC, which enables the optimized scheduling and operation of energy units. Based on the forecast of RES and demands (see Section 3.3), it also belongs to the function layer of the SGAM [8].
EMS is an integral component of energy system automation and can facilitate the operation of AI technologies. The functions of an EMS can range from the introduction of energy efficiency, to process optimization measures in the industry and household sectors, to dispatch planning and operation of energy generation or consumption units. Here, we follow the interpretation of most of the scientific literature and focus on the scheduling and operation of energy units. In European countries, power generation portfolios are operated by private, public, or semi-public utilities and suppliers. With the rise of decentralized portfolios of RES generation and their direct market transactions, so-called aggregators have emerged. Most of them bundle RES generation units to collectively predict and market their generated energy. The dispatch planning of generation portfolios ranges from resource-driven long-term planning for large power plants to short-term scheduling horizons, especially for volatile RES. Short-term horizons depend on gate closure times of short-term markets and vary from a few minutes up to several days. For the technical implementation, the market penetration of virtual power plants linked to local control systems and integrated EMS is growing.
In recent years, solutions to enhance the degree of flexibility of the consumption side have been increasingly pursued, recognizing the restricted controllability of intermittent renewable generation units as well as new challenges that have emerged, e.g., through electric mobility. Important instruments defining the market- and grid-oriented flexibility of consumption facilities include demand response, demand-side management, peak shaving, and direct participation of consumption units in ancillary services markets. Core components of these EMS are optimization algorithms that decide on the optimum dispatch of generation, storage and consumption units based on boundary conditions or physical constraints. On the consumption side, basic rule-based, time-naive procedures are still commonly used. On the generation side, in contrast, elaborate approaches are the norm for planning the unit commitment of systems with horizons of days, weeks, and even months or years [115].
The Unit Commitment Problem (UCP) is a mixed-integer problem and can be divided into two sub-problems: (i) the binary decision on the operating state of each energy unit, i.e., whether it is running at a certain time of the scheduling horizon, which can be considered a combinatorial optimization problem; and (ii) the continuous decision on the power state, i.e., the rate at which the units generate, consume or store energy. The mixed-integer nature of the UCP leads, if not relaxed, to a non-convex solution space, which often results in highly complex NP-hard problems, e.g., see Ref. [116]. Common conventional approaches to solve the UCP include DP [117], Lagrangian Relaxation (LR) [118], Integer Programming (IP) [119], Branch-and-Bound [120], Mixed-Integer (Linear) Programming (MIP, MILP) [121,122], and Nonlinear Programming [123]. Uncertainties resulting from load and generation forecasting errors were addressed by stochastic and robust optimization [124,125]. As a classical optimization problem, the UCP is an ideal use case for AI and ML methods. Due to its central function in the energy system and its nature, it is not surprising that numerous scientific publications have been dedicated to the solution of the UCP via AI in recent years. The scalability of complex problems with high computing requirements is a key challenge herein, which still limits the market share of AI-based EMS. However, growing computational capabilities, new algorithms, and parallel processing potential increase the chances of AI techniques in this area. Most of the AI approaches to solve the UCP are hybrid techniques that complement and improve conventional methods such as DP and LR with AI. With growing possibilities, more approaches entirely based on AI techniques or pure AI hybrids were developed. The potentials of logic and knowledge representations of expert systems, fuzzy systems, NNs, and evolutionary computing methods have been explored since the mid-1980s [126]. A condensed overview of the diverse approaches to solving the UCP using AI techniques is given in the following sections.
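The split into a binary and a continuous sub-problem can be made concrete with a toy single-period example. The sketch below is an illustration under strong assumptions: costs are linear, start-up costs and minimum up and down times are ignored, and the binary sub-problem is solved by exhaustive enumeration, which is only feasible for a handful of units. The `Unit` fields and function names are hypothetical.

```python
from dataclasses import dataclass
from itertools import product
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Unit:
    name: str
    p_min: float          # minimum stable output when on (MW)
    p_max: float          # maximum output (MW)
    fixed_cost: float     # cost per period when on
    marginal_cost: float  # cost per MWh


def dispatch(units: List[Unit], on: Tuple[int, ...],
             demand: float) -> Optional[Tuple[float, List[float]]]:
    """Continuous sub-problem: cheapest outputs of the committed units meeting demand."""
    committed = [u for u, s in zip(units, on) if s]
    if sum(u.p_min for u in committed) > demand or sum(u.p_max for u in committed) < demand:
        return None  # this on/off combination cannot meet the demand
    power = {u.name: u.p_min for u in committed}
    remaining = demand - sum(power.values())
    # Merit order: load the cheapest units first (optimal for linear costs).
    for u in sorted(committed, key=lambda unit: unit.marginal_cost):
        extra = min(remaining, u.p_max - u.p_min)
        power[u.name] += extra
        remaining -= extra
    cost = sum(u.fixed_cost + u.marginal_cost * power[u.name] for u in committed)
    return cost, [power.get(u.name, 0.0) for u in units]


def commit_single_period(units: List[Unit], demand: float):
    """Binary sub-problem by exhaustive enumeration of all 2^n on/off states."""
    best = None
    for on in product((0, 1), repeat=len(units)):
        result = dispatch(units, on, demand)
        if result is not None and (best is None or result[0] < best[0]):
            best = (result[0], on, result[1])
    return best  # (cost, on/off states, power per unit) or None if infeasible
```

The exponential growth of the enumeration with the number of units and time steps is exactly what motivates the conventional and AI-based solution methods discussed in this section.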

Knowledge-based systems
To overcome the human cognitive barrier in power system operations and to enhance the usefulness of EMS, knowledge-based systems were proposed to replace or supplement numerical approaches in order to cope with the overwhelming quantity and rate of data obtained [127]. The suggested approach included the application of a knowledge base in an expert system using numerical methods to aid in decision-making and to solve complex optimization tasks. This approach can be categorized into rule-based, frame-based, and logic-based AI programs [127]. Another hybrid approach also based on an expert system was designed to assist operators in modifying UCP input data to meet all scheduling constraints [128]. A significant number of approaches to solving the UCP originate from the field of evolutionary algorithms and computing, which will be discussed in the following.

Evolutionary algorithms and swarm intelligence
GA. It was found that an adaptive search method using GA (for details of GA see Section 2.6) can overcome shortcomings of mathematical programming approaches for a small UCP example of thermal power generating units [129]. The required high computational time was rated as less significant considering the increasing computing power and parallelization possibilities. In a case study with systems of up to 100 thermal energy generating units, GA was used for solving the binary optimization subproblem, while variants of differential evolution algorithms were used for the continuous subproblem [130]. Significant cost savings in comparison to eight benchmarking algorithms were recorded for larger systems with 40-100 units. Unfortunately, the benchmarks included only a priority list and AI-domain approaches. Thus, further comparison with discrete mathematical optimization algorithms would be enlightening.
Evolutionary Programming (EP). An EP approach to solving the UCP for a system of 100 thermal generation units consisted of a competition-and-selection routine and a mutation operator [131]. The implemented algorithm reached a satisfactory performance and showed advantages in scalability over DP, which is only appropriate for small generator systems. A potential weakness of evolutionary methods for solving the UCP is the limited local and global search capability, which raises the risk of entrapment in local optima, esp. for large-scale problems with constraints of high complexity [132]. A recent overview of differential evolution algorithms in economic dispatch can be found in Ref. [133].
Particle Swarm Optimization (PSO). To solve the UCP for a dual-mode combined heat and power portfolio including secondary heat and thermal generation units, a Binary-PSO (BPSO) approach solved the combinatorial discrete sub-problem by applying a sigmoid function, while the PSO determined the solutions for the continuous decision variables [134]. The results were promising but appear to be dependent on the system complexity. A three-stage approach to solve a UCP included obtaining a primitive structure of the 100 thermal units of the IEEE 118-bus system in the first stage. In the second stage, the economic scheduling was solved by a weight-improved crazy PSO considering a pseudo-inspired algorithm. In the third stage, a solution restructuring process determined extra energy reserves and minimized total operating costs [132]. A recent survey on economic dispatch using PSO was conducted in Ref. [135].
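The role of the sigmoid function in BPSO can be sketched as a single particle update. This follows the generic binary PSO scheme rather than the specific implementation of Ref. [134]; the parameter values for `inertia`, `c1`, `c2`, and `v_max` are assumptions.

```python
import math
import random
from typing import List, Tuple


def sigmoid(v: float) -> float:
    return 1.0 / (1.0 + math.exp(-v))


def bpso_step(position: List[int], velocity: List[float],
              personal_best: List[int], global_best: List[int],
              rng: random.Random, inertia: float = 0.7,
              c1: float = 1.5, c2: float = 1.5,
              v_max: float = 4.0) -> Tuple[List[int], List[float]]:
    """One binary-PSO update for a single particle; returns (new_position, new_velocity)."""
    new_position, new_velocity = [], []
    for x, v, pb, gb in zip(position, velocity, personal_best, global_best):
        # Standard PSO velocity update, pulled towards personal and global best.
        v = inertia * v + c1 * rng.random() * (pb - x) + c2 * rng.random() * (gb - x)
        v = max(-v_max, min(v_max, v))  # clamp so probabilities stay away from 0 and 1
        new_velocity.append(v)
        # The sigmoid maps the velocity to the probability that the bit is 1 (unit on).
        new_position.append(1 if rng.random() < sigmoid(v) else 0)
    return new_position, new_velocity
```

Each bit of the position vector would correspond to the on/off state of one unit in one time step, while a separate continuous optimizer handles the power levels.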

Reinforcement Learning
Recently, DRL algorithms (see Section 2.4) have been increasingly studied for the solution of the UCP and showed promising results. Here, an environment is built based on the Markov decision process. The environment maps the energy unit portfolio of the EMS and its specific constraints to the constraints of an equation system of a discrete mathematical model. In this setting, single or multiple RL agents are trained to maximize a reward function. The reward function takes the expense of producing and consuming electricity into account, as well as the profits from the energy sold. The environment holds information on changing key determinants such as variable energy consumption and generation. Applied algorithms include Deep Q-Networks [136], Prioritized Deep Deterministic Policy Gradient [137], or Proximal Policy Optimization [34,138].
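The structure of such an environment can be illustrated with a toy example of a single storage unit trading against a known price series. This is a strongly simplified sketch and not the setting of Refs. [136-138]: the state, the three discrete actions, and the handling of infeasible actions are assumptions, and no learning algorithm is included.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

CHARGE, IDLE, DISCHARGE = 1, 0, -1
State = Tuple[int, int]  # (time step, state of charge)


@dataclass
class StorageEnv:
    prices: List[float]  # hypothetical market price per time step
    capacity: int = 2    # storage capacity in energy units
    t: int = 0
    soc: int = 0         # state of charge

    def reset(self) -> State:
        self.t, self.soc = 0, 0
        return (self.t, self.soc)

    def step(self, action: int) -> Tuple[State, float, bool]:
        """Apply an action; return (next_state, reward, done)."""
        # Constraint handling: actions that violate the storage limits become IDLE.
        if not 0 <= self.soc + action <= self.capacity:
            action = IDLE
        # Buying one unit costs the current price; selling one unit earns it.
        reward = -action * self.prices[self.t]
        self.soc += action
        self.t += 1
        done = self.t >= len(self.prices)
        return (self.t, self.soc), reward, done


def rollout(env: StorageEnv, policy: Callable[[State], int]) -> float:
    """Run one episode with the given policy and return the accumulated reward."""
    state = env.reset()
    total, done = 0.0, False
    while not done:
        state, reward, done = env.step(policy(state))
        total += reward
    return total
```

An RL agent would replace the hand-written policy and learn from repeated rollouts which action maximizes the accumulated reward in each state.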

Further approaches with generative adversarial network
The use of GANs (see Section 2.5) for EMS and the UCP solution promises great potential to complement the described approaches. Among others, the challenge of the high quantity of required training and test data for DRL models can be addressed by GANs through generating new data sets with the same statistical properties as the original systems. Furthermore, GANs can create data for systems where no measurement data sets are available, based on parameters such as the nominal capacity of the units and the derived behavior of similar systems.

Business - trading
Regarding the ESC, one of its last components consists of the electricity trade and the associated EPF. Here, we cover state-of-the-art procedures using ML and DL approaches for EPF, which can be subdivided into Day-Ahead (DA) and Intraday (ID) spot price forecast.

Day-Ahead spot price forecast
The DA spot price prediction is a frequently discussed topic covered by many different approaches, as generally described in Refs. [139,140], and specifically regarding DL in Refs. [141,142]. An often-used approach consists of LSTM NNs, which perform exceedingly well in spot price forecasting [143]. Different approaches enhance LSTM forecasts by using advanced hyper-parameter tuning [144] or by combining LSTM NNs with a wavelet approach to model the seasonality and recurrent patterns of the spot price [145,146]. Furthermore, a combination of an LSTM model and statistical models to forecast the DA price was proposed [147]. Besides LSTM, other NNs were also used in the literature, e.g., Recurrent NN [148], Multi-Layer Perceptron [149], or Quantile Regression NNs [150]. Quantile forecasts were also investigated by applying a Bayesian approach in combination with NNs [151], a forecasting approach based on XGB [152], and an interval forecasting approach based on an extreme learning machine in combination with bootstrapping [153].
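Quantile forecasts such as those mentioned above are typically trained and evaluated with the pinball (quantile) loss. The following sketch shows this standard loss; it is given for illustration and is not taken from Refs. [150-153].

```python
from typing import Sequence


def pinball_loss(actual: Sequence[float], predicted: Sequence[float], quantile: float) -> float:
    """Average pinball loss of quantile forecasts `predicted` at level `quantile`."""
    if not 0.0 < quantile < 1.0:
        raise ValueError("quantile must lie strictly between 0 and 1")
    total = 0.0
    for y, q in zip(actual, predicted):
        diff = y - q
        # Under-prediction is weighted by `quantile`, over-prediction by (1 - quantile).
        total += quantile * diff if diff >= 0 else (quantile - 1.0) * diff
    return total / len(actual)
```

For a quantile level of 0.9, under-predicting the price is penalized nine times more strongly than over-predicting it, which pushes the forecast towards the upper tail of the price distribution.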

Intraday spot price forecast
In comparison to the DA market, the ID market is characterized by its short trading horizon and the inherent volatility of each electricity product. With less than 24 h until delivery, market participants take sudden changes, e.g., in weather forecasts, into account and adjust their trading decisions to minimize their risks [154,155]. Due to the increase in RES, the ID spot market gained more attention in the literature in recent years. However, while there were various papers discussing the different influencing factors of the respective markets [154,156-159], the number of EPF papers dedicated to the German ID market is relatively small, especially regarding ML. In this respect, the first advances in the ID price forecast were achieved considering the Iberian spot market MIBEL [160,161]. While a Multi-Layer Perceptron NN was used to forecast the ID prices in Ref. [160], a probabilistic forecasting approach based on a statistical learning algorithm including the DA spot price was studied in Ref. [161]. Besides the Iberian spot market, the performance of different NNs on the Turkish ID market was compared with regression and LASSO approaches [162]. Most of the few studies considering the German ID market applied only an elastic net regression instead of ML methods [163-165]. However, the performance of a simple NN was compared with an auto-regressive process using external variables and a naive method on the German ID market to describe a dense grid of the spot price quantiles [166].

Machine learning trading strategies
Given the recent advances in the EPF, we want to highlight the state-of-the-art ML trading strategies, which can be classified into two groups: (i) hybrid approaches and (ii) RL approaches. In the first case, hybrid approaches base their trading strategies on underlying ML predictions of important variables, while the actual trading decision is executed by solving a MILP problem [167]. Hence, the core trading decisions of the hybrid approaches do not utilize ML approaches. Regarding the RL approaches, one can divide the group into (i) trading strategies on the national spot market and (ii) trading strategies on smaller scales, e.g., local electricity grids or microgrids. For the general electricity market, different trading strategies for the German continuous ID market applying RL based their strategy on Markov decision processes and have similar experimental settings, yet they differ in the algorithmic implementation [168-170]. A Deep Q-network implementation was used in Ref. [169], whereas in other studies the Markov decision process was restrained by a threshold policy for an analytic model [168] and for the REINFORCE algorithm [170]. However, the previous research only considered trading at a daily or hourly frequency, which contradicts the continuous nature of the ID market. This problem was addressed by applying a Proximal Policy Optimization algorithm to an every-minute trading process [171], showing the applicability of RL in ID trade. Besides the research on the German ID market, a similar approach to Ref. [171] was used for DA market trading in Ref. [172]. Additionally, an ML bidding strategy for virtually trading between the DA and ID market was proposed in Ref. [173]. Various models for electricity trading on a smaller scale, e.g., for microgrids or ES units, were published [174-176]. In particular, a multi-agent RL system was constructed to model the trading decisions of different prosumers on their self-defined grid market environment [174]. Figs. 1-7.

Fully integrated and automated Electricity Supply Chain based on Artificial Intelligence
As shown in the previous sections, there has been immense progress in the development of AI to support, improve and automate various but distinct tasks within the ESC. However, in most cases, the various methods are specifically designed for the respective layer without consideration of the overall ESC. Furthermore, the development is often done manually, based on the underlying task and data characteristics. In this work, we propose the integration of the distinct tasks into one fully automated ESC that interconnects the individual layers. Neglecting the plant design and sizing as well as predictive maintenance, which remain mainly distinct tasks, we outline an adaptive, self-learning workflow for an automated ESC based on AI as shown in Fig. 8. To achieve a fully automated ESC, commonly used rule-based engineering methods need to be replaced by knowledge-based (see Section 3.4.1) and data-driven AI approaches [177]. Furthermore, thorough communication between the individual layers has to be ensured to recurrently optimize the distinct tasks in a coherent manner. The most crucial point in the proposed data-driven approach is the data quality [178]. Thus, several standard data preprocessing steps as well as more sophisticated methods, e.g., AI-based anomaly detection and substitution [179,180], are required to ensure reliable, anomaly-free data sets (see Section 3.2). Concerning the function and business layer, the data characteristics have to be analyzed and classified to enable an automated knowledge-based decision for appropriate methods (see Sections 3.3-3.5) [182]. In this regard, AutoML optimization (see Section 3.3.3) as well as the neural architecture search with RL [183] might prove beneficial to tune the individual methods and enhance the optimization without the necessity of an expert. The automatic selection of the most suited algorithm for a specific task and for a certain data set is of particular importance. Thus, one could deploy an RL process to find the best candidate considering a validated comparison of multiple model approaches over time. To support this selection process, Explainable AI (XAI) might be useful to evaluate the quality of the chosen method and to enable the necessary human-understandable monitoring of the decision-making, leading to a reliable AI-automated system [187]. In summary, the full AI automation of the ESC is still in progress, with various points to consider. Nevertheless, with a combination of a data-driven AI, XAI, RL, and AutoML approach, we see a fully automated ESC [184] as achievable in the near-term future.

Perspectives on the Electricity Supply Chain automation
In this section, we want to highlight new trends in the AI community that are also beneficial for the ESC automation. In this regard, Section 4.1 covers the human-AI interaction, while Section 4.2 focuses on the computational challenges for the automated ESC. In Section 4.3, the question of resilience is emphasized, and we consider the influence of AI on ESC for the energy transition process in Section 4.4.

Human-centered and standardized Artificial Intelligence
The increase of information collected, analyzed, and communicated leads to the need for process automation, which has resulted in a growing number of implemented AI methods for ESC support [5,182]. In this regard, an automation method was constructed to support ESC agents by reducing the information complexity, notably through knowledge visualization of aggregated dispersed resources, and to collect feedback from different agents [184]. Associated with the notion of "Human-on-the-loop" [185], this contextualization of the collected data supports the human decision process and reduces the cognitive load for the operators. Thus, communication between automated systems and users is key for an efficient implementation of the automated ESC and should be at the center of AI system design to enhance the performance of the AI methods while simultaneously increasing user acceptance. In this regard, a Human-Centered AI framework that achieves high levels of human control and automation but also sets requirements to ensure efficiency and reliability was proposed in Ref. [186]. Here, we follow the proposition of Ref. [186] and highlight the need for a standardized AI framework design to ensure interoperability and comparability. Furthermore, privacy and security issues need to be considered, esp. due to the increase of communicating tasks in the ESC. This requires standards and testing processes to protect user access and energy system integrity [5], as proposed by the American National Institute of Standards and Technology [178]. Additionally, AI automation processes have to be monitored and validated; thus, techniques like XAI [187] are necessary to guarantee reliable, safe, and trustworthy applications [5].

Computational challenges and perspectives
The specific implementation of AI automation in the ESC encounters several challenges and opens possible directions of development. In this regard, the computing adaptation necessary to handle data streaming, processing, analysis, and storage could be provided by the cloud computing paradigm [188]. In recent years, AI has increasingly been implemented remotely to access the large computational and storage resources it needs. However, the large amount of energy data comes with specific needs in terms of IT infrastructure, data quality, data sharing, and security mechanisms that have not yet been met [189]. Moreover, big data techniques should also be adapted to energy data to perform real-time control [190]. To tackle this aspect, approaches using stream and iterative computing were proposed [191,192]. In contrast, edge computing can be preferred to cloud computing due to its lower computation latency [182] and because local data storage prevents certain privacy issues. The edge paradigm has fostered new concepts such as TinyML [193] and neuromorphic computing [194]. These concepts enable the use of ultra-low-power devices for local, hardware-based AI to improve computing efficiency and to reduce communication usage. However, one has to consider that these concepts have not yet been deployed on a large scale. An intermediate technique that provides low latency while benefiting from cloud computing infrastructure is fog computing, which extends the local infrastructure by using computing capacities near the data-creating network edge [195]. Fog computing is a promising solution combining the advantages of cloud and edge computing [195,196].

Sustainability and resilience of Artificial Intelligence automation
For sustainable systems, efficiency and resilience are necessities [197,198]. Efficiency is required for economic success and is achieved through a highly specialized system limiting the usage of resources. Resilience, in turn, is implemented using redundancy and diversification of measures, enabling systems to handle unexpected and possibly catastrophic situations caused by natural circumstances or criminal activities [199]. Thus, a trade-off between efficiency and resilience has to be found that balances costs, computation power, and time against security and robustness. To foster AI automation within the ESC, resilient information and communication technologies are crucial due to the ongoing entanglement between the critical infrastructure sectors of energy, information technology, and communication. This entanglement increases the access to energy data as well as to available services, but it also increases the risk of data errors, manipulation, and cyber-attacks. It was found that manipulations were possible by injecting false data into sensitive data sets such as energy prices, contracts, and transactions between grid entities [200]. In addition, one has to consider adversarial attacks on AI models. Recent papers revealed general weaknesses in AI time series models [201], and in load forecasting in particular [202]. Furthermore, research on EMS with RL showed similar problems [203]. One promising AI solution that addresses these general challenges is the GAN (see Section 2.5), which has been proposed to increase forecast accuracy [204] but, more importantly, has been implemented both against cyber-attacks [205] and for privacy protection [206].
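To make the notion of an adversarial attack on a load forecast more concrete, the following minimal sketch perturbs the inputs of a linear forecaster within a fixed bound per input. It is an illustration under stated assumptions, not the method of any cited reference: the linear model, its weights, the lagged load values, and the bound epsilon are all hypothetical. For a linear model, shifting every input by epsilon in the direction of the sign of its weight yields the largest possible forecast shift under this bound, namely epsilon times the sum of the absolute weights.

```python
def forecast(weights, bias, lagged_loads):
    """Linear load forecast from lagged load measurements (hypothetical model)."""
    return bias + sum(w * x for w, x in zip(weights, lagged_loads))


def sign(value):
    """Return -1, 0, or 1 depending on the sign of value."""
    return (value > 0) - (value < 0)


def worst_case_perturbation(weights, epsilon):
    """Perturbation bounded by epsilon per input that maximizes the forecast.

    For a linear model this is epsilon * sign(weight) for every input.
    """
    return [epsilon * sign(w) for w in weights]


# Hypothetical model parameters and measurements, chosen only for illustration.
weights = [0.5, 0.3, 0.2]
bias = 1.0
lagged_loads = [100.0, 98.0, 102.0]
epsilon = 2.0  # assumed maximum manipulation per measurement

clean = forecast(weights, bias, lagged_loads)
delta = worst_case_perturbation(weights, epsilon)
manipulated_loads = [x + d for x, d in zip(lagged_loads, delta)]
attacked = forecast(weights, bias, manipulated_loads)
shift = attacked - clean

print(f"clean forecast:    {clean:.2f}")
print(f"attacked forecast: {attacked:.2f}")
print(f"forecast shift:    {shift:.2f}")
```

In this toy setting, the shift grows linearly with both the manipulation bound and the magnitude of the model weights, which is one reason why bounding input sensitivity is a common robustness consideration. Nonlinear forecasters require gradient-based or search-based attacks instead of this closed form.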

Energy transition through Artificial Intelligence
Here, we briefly outline the indirect influence of AI on the electricity domain. As underlined by the International Energy Agency [207], energy efficiency is crucial for tackling the energy transition, and one way to improve it is the generation of new materials for energy technologies, including batteries and PV plants [208]. With the flexible and rapid prediction framework of ML methods, the discovery of new materials is expected to be accelerated by a factor of ten [209]. However, this self-driving laboratory paradigm carried by AI and automation is challenged by the limited quality and amount of available data as well as by the non-physical form of ML, which limits interpretation and extrapolation [210]. Another way to enhance energy efficiency is to use AI for building design and energy management (as a special form of ESC) [5]. Beyond architectural intelligence for new constructions, AI can be widely applied to predict and reduce building energy consumption while considering comfort, health, and productivity in the living spaces [177], especially by applying an automated ESC if RES are included. To be optimal, this evolution of building construction and management requires new information types such as 3D topographic data [5] as well as the combination of data-driven methods with knowledge-based methods [211] within an automated ESC. This collection of information improves diagnostic interpretation for building fault detection, which contributes to the efficient management of infrastructure. In addition to energy efficiency, the use of RES is also key to fostering the energy transition. AI supports their deployment and enables adaptation to changing RES by enabling power generation prediction, system sizing, risk evaluation, and operation scheduling [6,7,212], as shown in Section 3. Moreover, AI facilitates the usage of RES for electric vehicle charging within existing infrastructure with minimal investments [182]. This could be achieved by coordinating, e.g., solar energy availability and vehicle battery charge [181]. To date, however, AI approaches have been mainly studied for wind and solar energy systems, while more research is needed on other and hybrid RES [212].
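The coordination of solar energy availability and vehicle battery charge mentioned above can be illustrated with a simple rule-based sketch. It is not the approach of the cited references: the function name, the hourly solar surplus profile, the charger limit, and the energy demand are hypothetical, and a real system would replace the known surplus with an AI-based forecast. In each time step, the vehicle charges with the smallest of the available solar surplus, the charger limit, and the power needed to cover the remaining demand.

```python
def schedule_charging(solar_surplus_kw, max_charge_kw, energy_needed_kwh, step_h=1.0):
    """Greedy solar-following charging schedule (illustrative assumption).

    Returns the charging power per time step in kW and the energy in kWh
    that could not be covered by solar surplus.
    """
    schedule = []
    remaining_kwh = energy_needed_kwh
    for available_kw in solar_surplus_kw:
        # Never exceed solar surplus, the charger limit, or the remaining demand.
        power_kw = min(available_kw, max_charge_kw, remaining_kwh / step_h)
        power_kw = max(power_kw, 0.0)  # negative surplus means no charging
        schedule.append(power_kw)
        remaining_kwh -= power_kw * step_h
    return schedule, remaining_kwh


# Hypothetical hourly solar surplus in kW and charging requirements.
solar_surplus_kw = [0.0, 2.0, 5.0, 8.0, 3.0]
schedule, unmet_kwh = schedule_charging(
    solar_surplus_kw, max_charge_kw=4.0, energy_needed_kwh=10.0
)

print("charging power per hour (kW):", schedule)
print("energy not covered by solar (kWh):", unmet_kwh)
```

A greedy rule like this charges as early as possible and ignores prices, departure times, and forecast uncertainty. Those aspects are what the optimization and RL approaches discussed in the review are meant to address.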

Conclusion
In our work, we thoroughly investigated the newest state-of-the-art research on the automation of the ESC. Due to the rising volatility in electricity production, more and more researchers account for the complexity by proposing new AI methods on every level of the ESC. We assess both new and established methods and aggregate the most promising candidates. Following the SGAM, we categorize the AI methods on individual levels of the supply chain, offering a distinctive analysis depending on their field of application. Interestingly, we found that ML methods are not only employed for forecasting and optimization tasks but also play a vital role in data processing and anomaly detection. In this regard, implementations of AI are used to secure the control of highly hazardous or volatile energy sources as well as security environment levels, especially on the communication and information layers of the SGAM layout. Nevertheless, we also assert that the deployment of AI methods has to be considered from the viewpoint of resilience to ensure a reliable energy system. The main result of this paper is the proposition of a fully automated ESC and of enabling developments for its implementation. Even though AI is still often developed for the individual layers of the ESC, we argue that the integration into a fully automated ESC might be within reach in the coming years due to the developments in XAI, RL, knowledge-based AI, and data-driven AutoML. With a self-learning workflow, it is possible to optimize data processing from generation to forecasting and to use it for management and trading decisions. However, challenges remain in terms of the required data quantity, computational performance, resilience, algorithm robustness, human interaction, and standardization that need to be solved to ensure a fully automated ESC based on AI.
Overall, this paper targets researchers and operators of the electrical sector and aims to enhance the understanding of AI, highlight its possibilities, and offer a first introduction to the different fields of automated ESCs.

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
[9]. Next to the availability of large amounts of data and improved hardware, three decisive algorithmic enhancements were presented in 2009 and 2010 that enable the training of DNNs [9]:

Fig. 2. Component and communication layer of SGAM [9] including the discussed use cases improved by using AI and the respective AI methods.

Fig. 3. Information layer of SGAM [9] presenting the data processing methods benefiting from respective AI methods.

Fig. 4. Function layer of SGAM [9] representing AI improved forecast methods of energy time series.

Fig. 6. Optimization of scheduling and operation of energy units occurring on the function layer of SGAM [9].

Fig. 8. Scheme of an AI-based automated ESC enabled by XAI, RL, knowledge-based AI and data set classification (depicted in red). XAI methods enable the monitoring of automated decisions of applied processes, which are learned using RL, thus creating an adaptive knowledge base for certain classes of energy data sets. (The colors indicate the corresponding SGAM layer.)

Table 1
Summary of AI automation impacts on EA.
Columns: Domain; Impact of AI; Main techniques
Energy production
- Ensure continuous and autonomous optimal energy generation.
- Allow automated management of hazardous/highly variable energy sources.
- Define local control strategies.
L. Richter et al.
A Bag of patterns, B Bag of Symbolic Fourier Approximation Symbols.