Applications of Artificial Intelligence in Distribution Power System Operation

Due to the energy transition and the increasingly distributed nature of electricity generation, distribution power systems are gaining considerable attention as their importance increases and new operational challenges emerge. The integration of renewables and electric vehicles, for instance, leads to manifold changes in the system, e.g. participation in the provision of ancillary services. To address these challenges, artificial intelligence provides a variety of solutions that build on the increase in sensor data and computational capability. This paper provides a systematic overview of some of the most recent studies applying artificial intelligence methods to distribution power system operation published during the last 10 years. Based on that, a general guideline is developed to support the reader in finding a suitable AI technique for a specific operation task. To this end, four general metrics are proposed to give an orientation on the requirements of each application. Finally, a conclusion is drawn presenting suitable algorithms for each operation task.


I. INTRODUCTION
The supply of electrical energy is a necessity in modern society and its reliability has to be guaranteed by the power system operators. However, due to the energy transition, many new challenges regarding the stability and operation of the energy grid emerge. Consequently, system operators as well as other stakeholders have to find solutions to keep the grid stable and the energy supply reliable, while transitioning the energy system to carbon-neutral operation.
Hence, governments of multiple countries are developing strategies in cooperation with grid operators to increase the integration of renewables, for example the European Commission's strategy to harness the potential of offshore renewable energy [1], [2]. The German government, for instance, published a two-track strategy relying on the optimized operation of existing grid structures to reduce reserves as well as on speeding up the construction of new power lines [3].
Besides the conventional techniques in power system operation, data-driven methods and especially artificial intelligence are getting more attention as a result of the increasing amount of data provided by a growing number of installed measurement systems. Additionally, the growth of computational power is enabling the application of powerful techniques in real-time, which also leads to an enlarged interest in data-driven systems [4].
The associate editor coordinating the review of this manuscript and approving it for publication was Manoj Datta.
Given these circumstances, the research output in this field is also increasing. To illustrate this, the online databases of the publishers Elsevier, IEEE and Wiley (IET) were searched, and the growth in the number of publications per month compared to the previous year is plotted in percent in figure 1. A clear trend can be identified: according to this analysis, the number of publications on AI in power systems increased by 40 % over the last few years.
Recently, multiple studies have reviewed possible applications of AI in power systems, focusing on different aspects. Some are briefly revisited here. Zhao et al. [5] give a broad overview of the three lifecycle phases of power electronics, namely design, control and maintenance, as well as possible AI applications therein. Another broad review was presented by Monti et al. [6], focusing on different types of distributed intelligence control in smart grids. Omitaomu [7] also presented a survey on operation and security concerns focusing on AI in smart grids. Much more specific collections of applications were presented in the following studies. In [8], Cao et al. provide an overview of applications using deep reinforcement learning to solve problems in modern power systems. Another study on reinforcement learning was presented by Glavic et al. [9], covering decision and control applications of reinforcement learning in power systems. Sun et al. [10] focus on voltage control and give an overview of the challenges and opportunities of this control type. An extensive review of load flow control and its challenges and opportunities, including the integration of AI, was presented by Alhelou et al. [11]. Chai et al. [12] and Darab et al. [13] investigate different AI approaches and their applications in fault detection and diagnosis in power systems. Kumar et al. [14] present a collection of possible applications of AI and other emerging techniques used for the integration of distributed energy resources into the smart grid. A similar review was presented by Ali and Choi [15]. Cai and Lu [16] present a survey of different metaheuristic algorithms and their possible applications in power systems.
It can be concluded that a systematic review of applications of AI techniques in distribution system operation would be a valuable addition to the current literature. On that account, different aspects of the applications provided here will be revisited from a control perspective. Doing so, the aim of this article is to give a systematic overview of applications of AI already available and possible concerns. The contributions of this article include the following.
• Systematic review of AI applications in distribution system operation, including the analysis of individual requirements as well as their essential functions derived from the reviewed papers. For this purpose, four metrics are introduced: runtime, dataset, adaptability and dynamic. The severity of each metric is rated for every application from 0 to 5 based on the analysis of the reviewed studies.
• A guideline is derived in a compact format to present suitable techniques for every application, based on the performed analysis. In this way, a helpful tool for designing AI solutions for the proposed applications is provided.
The structure of this article can be outlined as follows. Section II presents a short overview of the basic concepts of the most commonly used AI methods, namely machine learning, fuzzy logic and control, and metaheuristics. In section III the review methodology is described and the metrics used throughout this study are proposed. The applications of AI methods in power system operation are reviewed in section IV, divided into decision-support and closed-loop systems. This chapter closes with a guideline for finding suitable methods for each application. Section V gives a brief outlook on AI in power systems together with a short overview of emerging concepts. Finally, in section VI this article is concluded with a summary of the contributions.
This article does not claim to be exhaustive; rather, it aims to provide a systematic overview and guideline for the selection of suitable AI algorithms in distribution power system operation.

II. ARTIFICIAL INTELLIGENCE METHODS
The term AI has been discussed in many studies, starting with definitions provided by Turing in the middle of the 1940s [17], [18]. Even in recent discussions, there is no consensus on a single definition. However, most authors agree on an information-processing system influenced by an environment that is able to learn and adapt [19], [20]. In this chapter, the basic AI techniques used in the different applications throughout this paper are briefly presented according to figure 2; this division has been adapted from [5] and [21]. It is noteworthy that in many studies standard algorithms are modified to a certain extent; nevertheless, they are classified in one category in this paper. Further information on the different variations of each standard algorithm can be found in the respective publications. Again, this article does not present a comprehensive overview but the most frequently used techniques.

A. MACHINE LEARNING TECHNIQUES
Machine learning is a group of techniques that is used frequently in recent studies. Some basics are revisited here, divided into the three subgroups supervised, unsupervised and reinforcement learning according to figure 3. In supervised learning, a dataset consisting of the input and the output/target data of the mapping strategy, such as a neural network, is used for training and validation. The training is performed using an optimizer that minimizes an error function consisting of some kind of distance measure between the actual output value and the target value of the data. Supervised learning is used for regular neural networks as well as convolutional neural networks, which additionally use a filter layer at the inputs [22]. In contrast, unsupervised learning does not have the target values included in the dataset, which leads to a training procedure where the learning algorithm has to find the individual target itself. Typical methods of unsupervised learning are the k-means clustering algorithm and, in its one-class form, the Support Vector Machine; applications are often found in the field of image classification and anomaly detection [23]. The last learning technique to be mentioned here is reinforcement learning, an agent-based method to learn a certain action strategy. Here, the agent has to decide on an action in a specific situation and earns a reward for it. That way, a utility function is approximated that describes the value of a specific action [24].
VOLUME 9, 2021
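The reward-driven approximation of a utility function described above can be illustrated with a minimal sketch of tabular Q-learning. The toy environment (a chain of five states with a reward of 1 at the right end) and all parameter values are assumptions chosen for illustration; they are not taken from any of the reviewed studies.

```python
import random


def train_q_table(n_states=5, episodes=500, alpha=0.5, gamma=0.9,
                  epsilon=0.2, seed=0):
    """Tabular Q-learning on a hypothetical chain: start at state 0, goal at the right end."""
    rng = random.Random(seed)
    moves = (-1, +1)                      # action 0 = step left, action 1 = step right
    goal = n_states - 1
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):              # cap on the episode length
            if state == goal:
                break
            # epsilon-greedy choice: explore sometimes, otherwise take the best known action
            if rng.random() < epsilon:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state = min(max(state + moves[action], 0), goal)
            reward = 1.0 if next_state == goal else 0.0
            # move the action value towards reward plus discounted best future value
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Best known action per state (1 = right, 0 = left)."""
    return [0 if row[0] > row[1] else 1 for row in q]


q_table = train_q_table()
print(greedy_policy(q_table))
```

The learned table plays the role of the utility function mentioned in the text: each entry estimates the value of taking one action in one state.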

B. METAHEURISTIC METHODS
Metaheuristic methods describe a group of algorithms that solve a given optimization problem; they are often used for finding the hyperparameters of models and controllers. The algorithms can be divided into two subgroups, trajectory-based methods and population-based methods (also called swarm intelligence), according to figure 4.
Particle swarm optimization (PSO) is probably the best known of the population-based methods. It was introduced in 1995 by Eberhart and Kennedy, and multiple improved versions are available, as mentioned in many of the studies below. The basic version of PSO uses a swarm of particles with an initial position and velocity in a search field to find a global optimum, while each particle knows its individual best and the global best position [25]. The fruit fly algorithm is another popular metaheuristic optimization algorithm. Compared to PSO, the flies collaborate implicitly to build the solution, and the algorithm only builds a geometrical representation [26]. Ant colony optimization is based on the foraging behavior of a real ant colony and was first introduced in the early 1990s [27]. This algorithm is known to be able to solve complex problems in a short amount of time. The genetic algorithm is inspired by natural evolution; consequently, only the fittest individuals are selected for reproduction by crossing the parents' genes [28]. In differential evolution a similar approach is used, while additionally utilizing the survival-of-the-fittest principle [29]. The immune algorithm was also developed from the genetic algorithm, based on the construction of the immune operator through vaccination and immune selection [30]. Tabu search is another metaheuristic algorithm that guides a local heuristic procedure to search the global solution space, based on the incorporation of adaptive memory and responsive exploration [31]. Simulated annealing transfers the physical behavior of a solid material during the cool-down phase after annealing to the solution of large combinatorial optimization problems [32].
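As an illustration of the particle update described above, the following is a minimal PSO sketch. The inertia weight and the two acceleration coefficients are common later refinements and assumed values, not part of the description in [25]; the sphere function is a hypothetical test objective.

```python
import random


def pso_minimize(objective, dim, bounds, n_particles=20, iterations=100,
                 inertia=0.7, c_personal=1.5, c_global=1.5, seed=0):
    """Minimize `objective` over a box using a basic particle swarm."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]                 # individual best positions
    best_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]     # global best position and value
    for _ in range(iterations):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # velocity: inertia + pull to individual best + pull to global best
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (best_pos[i][d] - pos[i][d])
                             + c_global * r2 * (g_pos[d] - pos[i][d]))
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:
                    g_val, g_pos = val, pos[i][:]
    return g_pos, g_val


def sphere(x):
    """Hypothetical test objective with its minimum at the origin."""
    return sum(v * v for v in x)


position, value = pso_minimize(sphere, dim=2, bounds=(-5.0, 5.0))
print(position, value)
```

In a hyperparameter search, `objective` would be replaced by a function that trains or simulates the model or controller and returns its error.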

C. RULE-BASED SYSTEMS
Rule-based systems are a group of AI techniques that allow the direct integration of human knowledge. By developing a set of if-then rules, the system is able to decide based on the rules given by an expert. Hence, a rule-based system can be defined as a modularized know-how system [33]. In multiple studies, rule-based systems are also referred to as expert systems. Besides Boolean logic, fuzzy logic and control have been used frequently in rule-based systems, as can be seen in figure 5.
The main advantage of using fuzzy theory and logic is the description of variables and relations in human linguistic terms. A fuzzy system normally consists of three basic parts. It starts with fuzzification, where the input signals are mapped onto fuzzy membership functions, yielding a membership degree. These functions can be of different shapes, for example triangular, trapezoidal or Gaussian. In the subsequent inference module, the calculated degrees of membership are integrated into IF-THEN fuzzy rules, which were previously derived from expert knowledge about the process. As the last step, defuzzification is performed, which creates an output signal that the physical system is able to handle [34]. It is noteworthy that many combinations of the different categories are possible; see, for example, [35]-[37]. In addition, some techniques do not belong to only one category and might be classified in others as well. To conclude this introductory chapter on AI techniques, basic advantages and limitations of each group of techniques are listed in table 1 together with applications.
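The three parts of a fuzzy system can be sketched as follows. The example is a hypothetical voltage-support rule base: the membership ranges, the three rules and the per-unit scaling are assumptions for illustration only, and centroid defuzzification on a sampled output range is one of several possible defuzzification methods.

```python
def triangular(x, left, peak, right):
    """Membership degree of x in a triangular fuzzy set."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Hypothetical fuzzy sets: voltage deviation (p.u.) and reactive power setpoint (p.u.)
INPUT_SETS = {"low": (-0.10, -0.05, 0.0),
              "normal": (-0.05, 0.0, 0.05),
              "high": (0.0, 0.05, 0.10)}
OUTPUT_SETS = {"inject": (0.0, 0.5, 1.0),
               "hold": (-0.5, 0.0, 0.5),
               "absorb": (-1.0, -0.5, 0.0)}
# Hypothetical expert rules: IF voltage is <key> THEN reactive power is <value>
RULES = {"low": "inject", "normal": "hold", "high": "absorb"}


def fuzzy_q_setpoint(dv, steps=201):
    """Map a voltage deviation to a reactive power setpoint."""
    dv = min(max(dv, -0.05), 0.05)        # saturate inputs outside the rule base
    # 1) fuzzification: membership degree of the input in every input set
    firing = {name: triangular(dv, *abc) for name, abc in INPUT_SETS.items()}
    # 2) inference: clip each rule's output set by its firing strength, combine by max
    # 3) defuzzification: centroid of the combined output set on a sampled grid
    num = den = 0.0
    for k in range(steps):
        q = -1.0 + 2.0 * k / (steps - 1)
        mu = max(min(firing[i], triangular(q, *OUTPUT_SETS[o]))
                 for i, o in RULES.items())
        num += q * mu
        den += mu
    return num / den if den > 0.0 else 0.0


print(fuzzy_q_setpoint(-0.025))   # slightly low voltage
```

Because the rules are stated in linguistic terms, the rule base stays readable for an operator, which is the advantage of fuzzy systems named in the text.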

III. REVIEW METHODOLOGY
In this chapter the methodology of the proposed review is described in detail. Table 2 lists all databases that were searched as well as the search string, search period and the screening procedure. After classifying the studies into groups of distribution system applications, some general metrics are defined, which are critical for either distribution grid operation or artificial intelligence design. In this way, each study can be reviewed with a focus on the defined aspects, to build a base for the concluding guideline. These metrics are then used to show the severity of the requirements that the individual applications place on possible approaches. Every metric allows a rating from 0 to 5, meaning low to very high severity of requirements in this category. After reviewing the studies for each application, a rating for every metric is chosen that summarizes the requirements mentioned in the approaches.
From the reviewed studies, a general guideline was derived, showing the applicability of all methods described in chapter II to each application. The outcome is shown in a table with a general rating summarizing the findings of the review and the metrics. In the following, the metrics are described further.
Dataset: As most of the approaches presented here are data-based, the database used in each study is presented here if accessible as well as the required set of measurements.
Runtime: The operating timescale and runtime of the proposed approaches are also mentioned if possible. Doing so, the practical applicability of each study can be reviewed as well as real-time operation possibilities.
Dynamic: The consideration of system dynamics is mandatory for some applications.
Adaptability: The effort required to adapt the reviewed approaches to new situations, such as training time, is investigated [44]. This metric is highly correlated with the dataset, as large datasets often lead to long training times; table 3 therefore gives the durations needed for adaptation.
In Table 3 the ratings of the different metrics are specified. The defined ranges for dataset, runtime and adaptability are based on the reviewed papers as a quantitative measurement. For this purpose, the values provided in the studies are sorted into five groups defining the outer limits. The highest rating achieved in one study is then shown in the corresponding figure in each subchapter. The same holds for the dynamics, but when no dedicated timescales are mentioned, the definitions provided in [45] are utilized as additional information.
Certain characteristics (or non-functional requirements) of the algorithms, like convergence speed, accuracy as well as exact training and testing time, are relevant to system dynamics, but they cannot be considered directly as an individual metric to compare the approaches. This is due to differences in the test scenarios of the reviewed studies, in the simulation software and hardware set-ups, and in the modifications of the original algorithms.
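The rating procedure can be sketched as a simple binning step. The limits below are hypothetical placeholders, not the values of Table 3, and the sample counts are illustrative; only the mechanism (sorting a reported value into groups and taking the highest rating per application) follows the description above.

```python
from bisect import bisect_right

# Hypothetical lower limits (number of samples) for dataset ratings 1 to 5.
# The actual limits are specified in Table 3 and are NOT reproduced here.
DATASET_LIMITS = [100, 1_000, 10_000, 100_000, 1_000_000]


def rate_dataset(n_samples, limits=DATASET_LIMITS):
    """Return a severity rating from 0 (low) to 5 (very high)."""
    return bisect_right(limits, n_samples)


def application_rating(sample_counts):
    """Highest rating reached by any study reviewed for one application."""
    return max(rate_dataset(n) for n in sample_counts)


# Illustrative sample counts of three studies belonging to one application
print(application_rating([9_600, 300_000, 35_000]))
```

For runtime, the same mechanism would apply with the order reversed, since a shorter required runtime means a more severe requirement.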

IV. APPLICATIONS OF AI IN DISTRIBUTION POWER SYSTEMS
In figure 6 the structural division used in this paper is shown. Distribution grid operation is divided into two different types. On the left side, the decision support systems are shown, which use the measurements to visualize different situations inside the grid, so the operator is able to take manual control actions. These decision support systems are human-in-the-loop or open-loop control, as they do not work fully automatically. This category includes in particular state estimation, fault diagnosis systems and stability assessment methods. For every system, multiple examples are presented using the three different types of artificial intelligence described in section II.
The right side of figure 6 describes the closed-loop or automatic side of grid control. Here, only fully self-reliant systems that influence the grid without the need for human interaction are considered. This is shown by the two arrows under the box representing the closed loop of the grid and the control. Nevertheless, a visualization of the automatic control actions for the distribution system operator is often necessary to check whether the control is running as intended. The AI applications found by the authors include, for example, voltage control. Today, different degrees of automation exist in distribution grids, but they are still often run by manual operator decisions. In the following section A, some of the most recent applications of AI in decision support systems are presented and different aspects are discussed. Section B investigates the applications for closed-loop control.
It is noteworthy that not all reviewed approaches were originally developed for distribution systems, but they are at least applicable to them. Due to the transition of power systems, e.g. distributed generation, operational tasks that were exclusively relevant for transmission systems might also be of interest for distribution system operators. For example, the provision of frequency ancillary services by distributed generation connected to the distribution system has been researched recently [46]. On that account, the influence of fast generation changes on the distribution grid has to be investigated, so the assessment of frequency will be of interest for future grids. Moreover, when distributed and especially inverter-based generation also has to work in grid-forming mode, the dynamics of the distribution system will also change [45]. For that reason, this study also considers tasks that might be useful for future distribution system operation, but are mainly used in transmission systems right now.
The maintenance of power system components is also part of the system operation and a field where AI is applied regularly especially in predictive maintenance. But as it is very specific for every component it will not be considered here. For further information on this topic the authors refer to [47], [48].
To categorize the applications in line with distribution management system terminology, the wording used by EPRI for the description of the advanced distribution automation (ADA) functions in [49] is used. Therein, high-level functions like real-time Distribution Operation Model and Analysis (DOMA) and Fault Location, Isolation and Service Restoration (FLIR) are described in detail. In figure 7, the four main ADA functions investigated in this paper are shown together with the functions they include. For example, the modeling of loads and the analysis of power quality are part of the DOMA. Figure 8 shows a rough timescale ranging from ms to days, which is used here to arrange the applications reviewed in the following chapter. The tasks occurring in the distribution grid are listed above the timeline. The distribution grid management functions that solve these problems can be found under the timeline. The Distribution Operation Model and Analysis function (DOMA) includes the modeling of distribution nodal loads as well as the analysis of economic efficiency and power quality. It should be highlighted that the main task of fault location, isolation and service restoration is to identify the faulted section and location and to recommend an optimal isolation strategy for the faulted part of the system. Subsequently, the emergency actions, in particular load shedding, have to be coordinated to keep the system stable. The coordination of restorative actions is necessary after the emergency has occurred and all emergency actions have been taken.
It is worth mentioning that the power quality is a sub-function describing all techniques that assess the stability of the system as well as the exploration of the given operational limits. The state estimation in distribution systems ranges from ms to minutes depending on the application and the measurement update rate, but for future systems the interest in real-time applications in the range of ms will most probably increase [50]. The aim of the fault location, isolation and service restoration is the accurate detection of faults and anomalies in a minimum amount of time and their isolation and restoration. In the dynamic optimization of the voltage, reactive or active power of the distribution grid, multiple objectives can be considered, which are discussed in part B of this chapter. The distribution grid operator is also in charge of the coordination of emergency actions like load shedding as well as the subsequent restorative actions, for instance feeder re-connection.
In the modeling of distribution nodal loads on a short timescale, the digital twin concept is a technique applied in very recent studies. The concept of the digital twin is gaining considerable attention with the rise of Industry 4.0 in different disciplines [51]. It describes the digital representation of a physical system in which behavior and state are changed based on parameter information and measurements. This concept is already well established in the manufacturing sector and is used for a variety of applications therein. Over the last few years, some applications were also found in the power systems sector, such as maintenance and power plant design and very recently monitoring and control [52]. In figure 9 some basic requirements are defined for the implementation of nodal load and circuit connectivity modeling in a power system operation environment. The processing speed is essential as well as the accuracy to show the system behavior in every possible state. Hence, a large dataset including multiple situations is necessary. Moreover, the model needs to be able to extract the system behavior from the data and adapt to all operation scenarios; consequently, the adaptability is rated high. Owing to the ability of the digital twin to change its behavior online, an estimation of the dynamic parameters is often necessary, which can be performed by utilizing AI techniques. This seems to be a very common technique to build a digital twin, so some recent applications are presented in the following.
FIGURE 9. Severity of basic requirements for modeling distribution nodal loads, distribution mode circuit connectivity.
Zhou et al. [53] propose a digital twin-based framework for online grid analysis. For this purpose, a virtual model of the power system containing a bus/breaker, node/breaker, and a bus/branch model is updated in real-time by SCADA and state estimation data. When a change in the model is detected, a complex event-processing engine performs a situation awareness analysis and feeds the results into a machine learning framework. Therein, an online security assessment prediction is performed using a neural network, which was previously trained offline. The computing time for the whole process was less than 300 ms in field tests.
He et al. [54] propose a digital twin-based power flow calculation using an artificial neural network. To this end, a neural network is developed that maps the grid inputs P and Q to the outputs, i.e. the complex voltages. A set of 9600 samples of Gaussian power fluctuation for the IEEE 9-bus system was created in MATPOWER for the training and testing of the system. Thus, the operator is able to monitor the power flow throughout the power system in real-time using only operational data.
To perform a conventional power flow calculation, a model of the whole system is mandatory, including load models in particular. Jereminov et al. [55] propose a linear first-order load model which can be utilized for power flow calculations and an algorithm for parameter fitting called PowerFit. Relying on linear models has the advantage of better convergence of the power flow algorithm. By utilizing load data from the Carnegie Mellon University campus and µPMU data from Lawrence Berkeley National Laboratories, the developed algorithm as well as the load model are tested. During operation, the algorithm searches for cut points in the data, which can be detected by drastic changes in the load data. In case of a detected cut point, the load parameters are adapted to the new situation.
The first dataset consists of 575 samples in 5 minute steps of real voltage and current, as well as imaginary current. For the second dataset 12 days were used including complex voltage and current with a 120 Hz measurement frequency, which was averaged to 500 samples.
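The idea of detecting cut points as drastic changes and then re-fitting the load parameters per segment can be illustrated with a deliberately simple sketch. It is not the PowerFit algorithm of [55]: the fixed jump threshold, the segment mean as the "fitted parameter" and the data are assumptions made for illustration.

```python
def find_cut_points(load, threshold):
    """Indices where the load jumps by more than `threshold` between two samples."""
    return [i for i in range(1, len(load))
            if abs(load[i] - load[i - 1]) > threshold]


def segment_means(load, cuts):
    """Re-fit a (trivial) constant load parameter on every segment between cut points."""
    bounds = [0, *cuts, len(load)]
    return [sum(load[a:b]) / (b - a) for a, b in zip(bounds, bounds[1:])]


# Hypothetical load measurements with one drastic change after the fourth sample
measurements = [1.0, 1.1, 0.9, 1.0, 3.0, 3.1, 2.9, 3.0]
cuts = find_cut_points(measurements, threshold=1.0)
print(cuts, segment_means(measurements, cuts))
```

In a real application the segment mean would be replaced by a fit of the actual load model parameters to the voltage and current samples of each segment.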
Wang et al. [43] propose a two-stage approach for load modeling using the Western Electricity Coordinating Council (WECC) Composite Load Model to capture the dynamic load response. In this model, each load component aggregates a different load type. During the first stage, the composition of the load at each bus is investigated using a DDQN learning agent. In the second stage, a parameter set for the model is found using Monte Carlo simulations. For training and testing purposes, TSAT in DSATools by Powertech Labs Inc. was used for creating training examples utilizing the IEEE 39-bus grid. In contrast, Cui et al. [56] propose an LSTM-based method for parameter estimation of a composite load model using the ZIP model. To extract the temporal relationship between the measurements at the target bus, namely P, Q and V, and the load model parameters, a stack of LSTMs is used for the parameters as well as for the measurements. Afterwards, both are temporally pooled to extract the average temporal latent representation; finally, they are used to estimate the new set of load parameters through linear regression. For the experimental investigation, the Siemens PSS/E 23-bus system with a Gaussian variation on every parameter and a ground fault event simulation on every bus with a sampling time of 0.4 s in a 32 s timeframe is used. Additionally, a 68-bus New England and New York interconnected system is considered with a similar parameter variation, but with transmission line outages as test cases. The data has a resolution of 0.1 s and a simulation timeframe of 20 s in this example.
A digital twin approach for load dynamics identification is proposed by Baboli et al. [57], combining system identification methods with neural networks. Hence, optimal utilization of EVs and DERs is possible. In addition to the parameters of the nodal load modeling, the overall system parameters and the structure of the system are a necessity for most calculations. In most distribution systems the topology and the state of the breakers are not known to their full extent. Consequently, some kind of topology identification system can be helpful.
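For orientation, the static ZIP model expresses the load as a sum of constant-impedance (Z), constant-current (I) and constant-power (P) shares, P(V) = P0 (a_Z V^2 + a_I V + a_P) with V in per unit and a_Z + a_I + a_P = 1. The sketch below fits these coefficients by ordinary least squares on synthetic data. This is a plain regression substituted for illustration, not the LSTM-based estimator of [56], and all numerical values are assumptions.

```python
import numpy as np


def zip_power(v_pu, p0, a_z, a_i, a_p):
    """Active power of a ZIP load at per-unit voltage v_pu."""
    return p0 * (a_z * v_pu ** 2 + a_i * v_pu + a_p)


def fit_zip(v_pu, p_meas):
    """Least-squares fit of the ZIP model; returns (p0, a_z, a_i, a_p)."""
    # The model is linear in the unnormalized coefficients p0*a_z, p0*a_i, p0*a_p
    design = np.column_stack([v_pu ** 2, v_pu, np.ones_like(v_pu)])
    coeffs, *_ = np.linalg.lstsq(design, p_meas, rcond=None)
    p0 = coeffs.sum()                  # power drawn at nominal voltage (v = 1 p.u.)
    a_z, a_i, a_p = coeffs / p0        # normalized shares, summing to one
    return p0, a_z, a_i, a_p


# Synthetic, noise-free measurements from a hypothetical load
voltage = np.linspace(0.9, 1.1, 21)
power = zip_power(voltage, p0=2.0, a_z=0.5, a_i=0.3, a_p=0.2)
print(fit_zip(voltage, power))
```

With noisy or fast-changing measurements such a direct fit becomes unreliable, which is one motivation for the learning-based estimators reviewed in the text.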
Zhao et al. [58] propose a neural network architecture with binary classifiers for online identification of the line status. The developed network is trained using a set of inputs from measurements, e.g. from PMUs. Across the hidden layers, a set of features is created, followed by a line status approximation in the output layer. The problem is thus formulated as a binary one, because the output of the neural network for each line is either one or zero, meaning the line is connected or not connected. For simulation, the IEEE 30-bus system including 41 lines is utilized. A set of 300,000 training and test samples is created together with adequate power data, setting the line status as a Bernoulli random variable with a probability of 0.6 of a line being connected. A different approach is followed by Jafarian et al. [59], using a deep neural network (DNN) for topology identification that only utilizes measurements available to DER management systems. For testing purposes, the IEEE 123 node test feeder is used with 24 different topologies and different switching positions, which are to be classified by the DNN. A training and testing set of 6,000 load and generator settings is created for every topology. Chao et al. [42] propose an approach for checking the topology of an LV distribution grid using fuzzy c-means clustering. For this purpose, the smart meter data provided by the individual households is collected over time and the correlation between different users is checked. The data is finally compared through a GIS system. In this way, the fuzzy c-means algorithm shows whether a user is listed in the right transformer area. In this study, the data of 48,220 users in 500 transformer areas were used and the connection relationship was verified.
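The binary formulation of the line status problem can be sketched in two small steps: drawing random line statuses as Bernoulli variables (41 lines, connection probability 0.6, as stated above) and turning per-line network outputs into a connected/not connected decision. The function names and the 0.5 decision threshold are assumptions; the neural network itself and the generation of the corresponding power data are omitted.

```python
import random


def sample_line_status(n_lines=41, p_connected=0.6, rng=None):
    """Draw one topology sample: 1 = line connected, 0 = line not connected."""
    rng = rng or random.Random()
    return [1 if rng.random() < p_connected else 0 for _ in range(n_lines)]


def decode_line_status(probabilities, threshold=0.5):
    """Convert per-line outputs in [0, 1] into binary line statuses."""
    return [1 if p >= threshold else 0 for p in probabilities]


rng = random.Random(0)
labels = [sample_line_status(rng=rng) for _ in range(5)]   # five label vectors
print(labels[0])
print(decode_line_status([0.93, 0.12, 0.50]))
```

Each sampled status vector would serve as the training target for one simulated operating point.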

2) STATE ESTIMATION
As a result of the transition towards distributed generation, the measurability and controllability of the distribution grid are becoming increasingly important. Because the operation practices of the distribution grid change, an estimation of the actual states is mandatory for every grid model and control. On account of the missing topology information and under-determined measurement sets that often occur in distribution systems, conventional approaches are hard to implement. Consequently, the utilization of AI techniques seems like a logical step, because of their ability to extract information on a purely data-driven basis. In the following, some approaches utilizing AI techniques are presented; for further information on distribution system state estimation, the reader is referred to Primadianto [60]. As presented in figure 10, state estimation mostly requires a large dataset and high adaptability to new system states. Additionally, the runtime is short in most cases due to the AI approaches.
Wang et al. [61] propose a physics-guided model combining machine learning methods with established physics-based methods in a hybrid model to enhance the explainability of the data-driven model. The basic idea is to include the temporal correlation of the states to obtain a better state estimation, which also takes into account the dynamics of the system. For this purpose, a deep neural network model containing LSTMs is used with the measurements of the current and previous time steps as inputs. In this way, the state of the system is estimated and fed into an AC power flow model containing the physical parameters of the system. For simulation, models for the IEEE 14-bus and 118-bus systems are trained and tested with 35,000 samples from NYISO load profiles in 5 min time resolution.
A different approach to integrating physical structures into a neural network was proposed by Zamzam and Sidiropoulos [62]. Herein, the graph structure of the electrical grid is utilized and copied as the structure of the neural network leading to a graph neural network. Doing so, the complexity and trainable parameters of the network are reduced. The approach was tested using a large dataset and the IEEE 37-feeder power system. Mestav et al. [63] developed a deep learning-based framework for real-time distribution system state estimation only relying on machine learning methods. The system consists of an offline part for training the DNN and an online part, which is a copy of the offline DNN. When new data arrives, the offline system is trained repeatedly, followed by the adaptation of the online DNN. The offline learning procedure starts with some sets of historical smart meter data, which are used to estimate the injection distribution using Gaussian, Gaussian mixture and Weibull models. In the following, a Monte Carlo sampling is performed using the estimated injection distributions to generate some sets of injection samples. These are fed into a power flow calculation, which then creates the training samples for the offline DNN training. With this framework, the creation of a full training set is possible without full observability. A bad-data detection is also performed by investigating the difference between the measurement and the learned distribution parameters. That being the case, bad data can be detected pre-estimation. For the simulation a dataset from Pecan Street collection [64] is used, containing four months of training data and four months of testing data. Zhang et al. [65] proposed a real-time state estimation with an additional forecasting system. This approach focuses on the nonlinear dynamics of the power system. This is done by utilizing two different types of DNNs, the first one for estimation and the second one for prediction. 
For estimation, a prox-linear net consisting of a plain-vanilla FNN and a prox-linear solver is presented. This system is trained offline using a dataset from the 2012 Global Energy Forecasting Competition containing real load data. When the whole system runs in real-time, three basic steps are performed. First, the states are estimated by the prox-linear net. Second, the estimate is fed to the deep RNN, which predicts the upcoming states. Third, the predictions are fed back to the prox-linear net to improve the estimation accuracy.
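The offline data generation described above for Mestav et al. [63] (fit injection distributions, draw Monte Carlo samples, run a power flow, store the result as a training sample) can be illustrated with a minimal sketch. Everything concrete in it is an assumption made for illustration: the three-bus radial feeder, the historical values, the purely Gaussian fit and the linearized voltage-drop model, which only stands in for a full AC power flow and is not the method of [63].

```python
import random
import statistics

# Hypothetical historical net injections (kW) at the two load buses of a toy feeder.
HISTORY = {
    1: [-40.0, -55.0, -48.0, -60.0, -52.0],
    2: [-20.0, -35.0, -30.0, -25.0, -28.0],
}
# Assumed per-unit voltage drop per kW of line flow on lines 0-1 and 1-2.
R_LINE = {1: 0.0004, 2: 0.0006}


def fit_gaussian(values):
    """Estimate mean and standard deviation of one injection history."""
    return statistics.mean(values), statistics.stdev(values)


def toy_power_flow(p1, p2, v0=1.0):
    """Linearized radial power flow (stand-in for a full AC power flow)."""
    # Loads are negative injections, so a line flow is minus the downstream injections.
    flow_01 = -(p1 + p2)
    flow_12 = -p2
    v1 = v0 - R_LINE[1] * flow_01
    v2 = v1 - R_LINE[2] * flow_12
    return v1, v2


def generate_training_set(n_samples, seed=0):
    """Monte Carlo sampling of injections followed by a power flow per sample."""
    rng = random.Random(seed)
    params = {bus: fit_gaussian(vals) for bus, vals in HISTORY.items()}
    samples = []
    for _ in range(n_samples):
        p1 = rng.gauss(*params[1])
        p2 = rng.gauss(*params[2])
        v1, v2 = toy_power_flow(p1, p2)
        # Only bus 1 is assumed to be metered; a DNN would learn measurement -> state.
        samples.append({"measurement": (p1, v1), "state": (v1, v2)})
    return samples


training_set = generate_training_set(1000)
```

Each sample pairs an incomplete measurement with the full state, which is the property that allows a training set to be created without full observability.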

3) POWER QUALITY ANALYSIS
The analysis of power system stability is an important part of power system operation, so in this chapter some of the most recent studies using AI techniques for stability assessment tasks in distribution system operation are presented. Thanks to their ability to efficiently extract nonlinear dynamic system behavior and their short runtime, multiple AI techniques have been applied here. As this paper is an overview of various topics of AI applications in power systems, it is not as detailed as others. For further information, the reader might take a look at Alimi et al. [66]. Some of the tasks and studies mentioned here are traditionally considered for transmission system operators, in particular frequency stability. However, as generation moves to the distribution system, frequency and non-frequency ancillary services have to be provided by generators connected to the distribution system [46]. On that account, the assessment of frequency and frequency stability might also be relevant for distribution system operators in the near future. Additionally, a distribution grid with a high share of renewables reacts dynamically to system disturbances and might affect the overall power system stability. It is also worth mentioning that there are additional stability classifications like resonance stability and converter-driven stability, defined by an IEEE PES task force [45].
FIGURE 11. Severity of basic requirements for power quality analysis.
Detecting dynamic system stability requires an accurate model of the system dynamics, and therefore a large dataset, as well as a fast runtime on account of rapid changes in stability, as shown in figure 11. The same holds for the adaptability. Nevertheless, this does not hold for every application in this chapter, such as long-term voltage stability. Voltage stability is hard to assess, considering the time behavior mentioned at the beginning of this chapter. Because of this, Zhang et al. [67] propose a hierarchical and self-adaptive data analytic method for real-time short-term voltage stability assessment. Based on PMU measurements, a voltage instability detection is performed first, meaning the voltage is checked for stable or unstable propagation. The detection of a stable status is followed by the prediction of the severity of the fault-induced delayed voltage recovery (FIDVR). For this purpose, the root-mean-squared voltage dip severity index (RVSI) is used, which is proposed in [67] and evaluates the voltage recovery performance of every single bus. That way, a hierarchical assessment system is developed, which leads to faster execution of the process, because the second hierarchy is only activated if the first hierarchy detects a stable point. This makes the first stage a classification problem and the second one a regression problem; both are solved using an extreme learning machine (ELM) ensemble. Even though the aggregation of the ELMs is done separately for each stage, the performance validation is aggregated after the training. Doing so, a multi-objective optimization problem is formulated and solved to find the balance between the earliness and the accuracy of the proposed approach. To generate the database of pre-fault conditions, the New England 39-bus system is used, running 10,000 Monte Carlo simulations of the loads and an added 700 MW wind power plant. The fault simulations are done using the Transient Stability Assessment Tool (TSAT) at a 0.01 s simulation step size, and the RELIEFF algorithm is used for feature selection. For further information on this specific algorithm, the authors refer to [68].
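To make the two-stage structure more concrete, the following sketch shows a basic extreme learning machine (random, untrained hidden layer and output weights obtained by least squares) used once as a classifier ensemble and once as a regressor ensemble. It is a generic illustration under stated assumptions, not the implementation of [67]: the synthetic features, labels and severity values, the ensemble size and the simple averaging are all hypothetical.

```python
import numpy as np


class ELM:
    """Single-hidden-layer ELM: random input weights, least-squares output weights."""

    def __init__(self, n_hidden, seed=0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Sigmoid activations of the fixed random hidden layer.
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))

    def fit(self, X, y):
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        H = self._hidden(X)
        # Output weights via the Moore-Penrose pseudo-inverse (no iterative training).
        self.beta = np.linalg.pinv(H) @ y
        return self

    def predict(self, X):
        return self._hidden(X) @ self.beta


def ensemble_predict(models, X):
    """Average the outputs of several ELMs (assumed aggregation rule)."""
    return np.mean([m.predict(X) for m in models], axis=0)


# Synthetic stand-in data: 4 features per sample, stability label, severity index.
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 4))
stable = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(float)
severity = np.abs(X[:, 2])

# Stage 1: classification (stable vs. unstable).
classifiers = [ELM(30, seed=s).fit(X, stable) for s in range(5)]
is_stable = ensemble_predict(classifiers, X) > 0.5

# Stage 2: severity regression, evaluated only for samples flagged as stable.
regressors = [ELM(30, seed=10 + s).fit(X[stable == 1], severity[stable == 1])
              for s in range(5)]
severity_hat = np.full(len(X), np.nan)
severity_hat[is_stable] = ensemble_predict(regressors, X[is_stable])

accuracy = np.mean(is_stable == (stable == 1))
```

Because the second stage is skipped for samples classified as unstable, its cost is only incurred when a severity estimate is actually needed.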
Similar approaches were proposed by Xu et al. [69] and Zhu et al. [70], which also use a two-stage system for voltage stability assessment. As a first stage, a stability detection is performed, followed by a trajectory prediction. Mohammadi et al. [71] propose an SVM for assessing the power system voltage stability using PMU measurements. The measurement data is processed with two optimization goals. The first is the misclassification rate of the SVM. In the second step, the number of input features of the SVM is reduced systematically, owing to the highly nonlinear relations between the measurements and the voltage stability. Thus, the authors try to reduce the processing time and increase the prediction accuracy. To select the subset of features containing the highest amount of information, mutual information is used, which describes the mutual dependence between two random variables. Subsequently, the dataset is processed by a biogeography-based optimization algorithm (BBO), which is an evolutionary optimization algorithm. For the first simulation, a 39-bus test system is utilized to create a database of 506 pre-fault operation conditions from load patterns, for which the stability of power flow convergence is checked. Each set contains reactive power flows, line currents, squared voltages and voltage phase angles calculated from PMUs, as well as the fault location. Afterwards, a 66-bus real power grid in Iran is used for further testing and 26 operation conditions are created from 15 days of load data. To this end, 24 PMUs are placed throughout the grid.
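Feature ranking by mutual information, as mentioned above, can be sketched in a few lines. The example is a simplified illustration and not the procedure of [71]: the feature names, the toy values, the equal-width binning and the plain top-k ranking are assumptions.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """I(X;Y) in bits for two equally long sequences of discrete symbols."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x,y) * log2( p(x,y) / (p(x) p(y)) ), written with raw counts.
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


def discretize(values, n_bins=4):
    """Equal-width binning of a continuous feature (assumed preprocessing)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def rank_features(features, labels, k):
    """Return the k feature names with the highest mutual information to the labels."""
    scores = {name: mutual_information(discretize(col), labels)
              for name, col in features.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Toy data: 0 = stable, 1 = unstable operating condition.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
features = {
    "voltage_sq": [1.02, 1.01, 1.00, 0.99, 0.90, 0.89, 0.88, 0.87],
    "line_current": [0.5, 0.9, 0.5, 0.9, 0.5, 0.9, 0.5, 0.9],
}
selected = rank_features(features, labels, k=1)
```

In this toy case the squared voltage separates the two classes, while the line current carries no information about the label, so only the former is selected.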
Another approach is proposed by Malbasa et al. [72] detecting the operating points which are different in the developed machine learning predictions and the actual system state. After detection, a training set around the identified operating points is created, so the machine learning method can be adapted. That being the case, the incoming data is divided into three different classes by their voltage stability margin. The first one contains all operating points with voltage stability margin (VSM) larger than the mean (stable) VSM, the second one operating points with a VSM in the second quantile (alert) and the third one in the smallest quantile (critical). When operating online the incoming data from PMUs as well as SCADA is collected into an unlabeled pool and fed into the machine learning system for prediction. In this way, the most inaccurate predictions can be found and the unlabeled datasets are handed over to an offline PSSE simulation, which creates accurate labels for these operating points. This leads to a labeled data pool for further training. All through this study, ANN, RF and SVM are compared as possible prediction techniques. For simulation purposes, a version of the WECC system is used with a dataset of 10,000 operating points created through PSSE simulation environment.
In power system operation, transient stability also has to be considered. Owing to the fast appearance, the detection algorithms have to operate on a very short timescale, as mentioned at the beginning of the chapter. On that account, Tan et al. [73] propose an approach for transient stability assessment based on PMU data considering different signal-to-noise ratios. Stacked autoencoders (SAE) are used for feature extraction, followed by a convolutional neural network (CNN) to perform representational learning for noise filtering. The learning process is performed offline based on historical data, utilizing unsupervised learning for the features and supervised learning for classification by the CNN. In online operation, the real-time data provided by PMUs is used for the transient stability analysis. A simulation database is created using the 39-bus New England grid and a PSD-BPA software to perform power flow calculations at different load levels, three-phase short-circuits are applied to create an unstable system. That way, 4,000 samples were obtained including different levels of SNR. Another two-stage approach for online transient stability prediction was proposed by Zhu et al. [74] by utilizing a hierarchical convolutional neural network. PMU data is used to build the fault-on trajectories of voltage magnitude, rotor angle, frequency deviation, active and reactive power of each generator. From this transient profile a spectral representation is extracted using discrete Fourier transform and a 2D-graphical representation is created, called the transient image. These images are fed into the first stage of the proposed system, consisting of a CNN as a regression model for the stability margin. For the second stage of the model, all incoming data is divided into subsets by the estimated stability margin. A CNN is trained for every subset to get a more precise estimation and a binary stability signal. 
For simulation, the IEEE 39-bus system and the Guangdong Power Grid system in south China are utilized, performing simulations in the PSD-BPA simulation package released by China-EPRI. In total, 7,200 transient cases were created by varying load and topology.
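The construction of a transient image from trajectory spectra, as described above for Zhu et al. [74], might look roughly as follows. The array layout (one image row per generator and quantity), the number of frequency bins and the per-row normalization are assumptions for illustration; the exact image layout of [74] is not reproduced here.

```python
import numpy as np


def transient_image(trajectories, n_bins=16):
    """Turn fault-on trajectories into a 2D spectral image.

    trajectories: array of shape (n_generators, n_quantities, n_samples).
    Returns an array of shape (n_generators * n_quantities, n_bins) in [0, 1].
    """
    # Magnitude spectrum of every trajectory via the real-input DFT.
    spectrum = np.abs(np.fft.rfft(trajectories, axis=-1))[..., :n_bins]
    n_gen, n_qty, _ = spectrum.shape
    image = spectrum.reshape(n_gen * n_qty, n_bins)
    # Normalize each row so quantities with different units become comparable.
    peak = image.max(axis=1, keepdims=True)
    return image / np.where(peak > 0, peak, 1.0)


# Synthetic example: 2 generators, 5 quantities, 64 samples each.
n = np.arange(64)
traj = np.empty((2, 5, 64))
for g in range(2):
    for q in range(5):
        # Each trajectory is a sinusoid with (g + q + 1) cycles in the window.
        traj[g, q] = np.sin(2 * np.pi * (g + q + 1) * n / 64)

image = transient_image(traj)
```

The resulting array can be passed to a CNN like any single-channel image.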
In contrast to that, Chen et al. [75] propose an indirect PCA approach to reduce the dimensionality of inputs for stability assessment using ML systems, so that only the relevant data is kept. In direct PCA, the reduction is performed by cutting off the components with the smallest eigenvalues, which are not necessarily the most irrelevant ones for stability assessment in power systems. For this purpose, an indirect PCA approach is presented, which calculates the difference between stable and unstable projections for every single dimension after acquiring the necessary values. Doing so, the most important dimensions, meaning those that differ most between stable and unstable cases, are kept. For testing purposes, the IEEE 39-bus system is used to create datasets containing 165 measurements (bus voltage, generation and load, branch power flows, etc.) by performing Monte Carlo simulations on active power generation and bus voltage.
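The difference between direct and indirect PCA can be illustrated with a small sketch. The scoring rule used here (absolute difference of the class means of the projections) and the synthetic two-feature data are assumptions and do not claim to reproduce the exact criterion of [75].

```python
import numpy as np


def indirect_pca(X, y, n_keep):
    """Keep the principal axes whose projections differ most between classes.

    X: samples in rows, y: 1 for stable and 0 for unstable samples.
    Returns the kept axes, the feature mean and the indices of the kept components.
    """
    mean = X.mean(axis=0)
    Xc = X - mean
    # Rows of Vt are the principal axes, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    Z = Xc @ Vt.T
    # Assumed relevance score: gap between stable and unstable mean projections.
    gap = np.abs(Z[y == 1].mean(axis=0) - Z[y == 0].mean(axis=0))
    keep = np.argsort(gap)[::-1][:n_keep]
    return Vt[keep], mean, keep


# Synthetic data: feature 0 has high variance but no class information,
# feature 1 has low variance but separates the classes.
y = np.repeat([0, 1], 100)
x0 = np.tile([-3.0, 3.0], 100)
x1 = 2.0 * y + np.tile([0.1, 0.1, -0.1, -0.1], 50)
X = np.column_stack([x0, x1])

axes, mean, keep = indirect_pca(X, y, n_keep=1)
```

Direct PCA would keep component 0 here because it has the largest variance, whereas the indirect selection keeps component 1, which actually separates stable from unstable samples.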
Frequency stability is traditionally a transmission system task, but as described above it might also become interesting for distribution system operators. Therefore, Xu et al. [76] propose an online predictor of frequency stability utilizing an Extreme Learning Machine. The frequency stability margin is described as a combination of the distance between the actual frequency and the minimal frequency allowed, and the duration of the undershoot. For the training of the ELM, a database is constructed using the New England 39-bus system. It consists of the generation and load at each bus as well as the total load and generation, which serve as inputs for the system. A 30 s simulation is performed for a tripping generator under 360 different system conditions. After offline training, the ELM predictor can be applied in an online scenario. A partially similar approach is followed by Mestav et al. [63], proposing a two-stage framework for online usage based on a DNN for the estimation and a stacked ELM for the correction phase. During the first stage, a DNN is used to estimate the frequency stability metrics, namely the frequency nadir and the time to reach it, the Rate of Change of Frequency (RoCoF) and the quasi-steady-state frequency. The results are then handed to the second stage, where the frequency metrics are corrected using a stacked ELM. For simulation purposes, a modified IEEE RTS-79 bus system is utilized to generate 30,000 samples with 261 inputs for the estimation stage, including primary reserves, inertia constants and the load damping coefficient. Yurdakul et al. [77] propose a methodology for the prediction of system frequency based on LSTMs. Multiple variables over a certain time window are used as system inputs, including frequency measurements, loads, day of the week and hour of the day. These variables are fed into a multilayer LSTM network, which is followed by a neural network with one neuron to finally provide the frequency forecast to the operator. 
For testing, a dataset from NGESO containing two months of frequency measured every second is utilized and downsampled to a resolution of one minute. Throughout the study, multiple tests were performed including different look-back windows for the inputs from 1 to 30 minutes. A much more holistic approach is proposed by You et al. [78], utilizing an artificial intelligence model for assessing transient, small-signal and frequency stability at the same time based on the same input parameters. For this purpose, dispatch data from the scheduling model is obtained by simulation to calculate the stability margins for different scenarios in the first step. By utilizing the generator dispatch levels and network data as input features, the artificial intelligence system is able to predict the stability margin indices for frequency, transient and small-signal stability after training. Throughout the study, a neural network as well as random forests are trained using an 18-bus test system with 288 stability scenarios, calculated every 5 minutes for 24 hours. A similar approach is followed by Hotz and Becker [41], utilizing an ANN for online detection of small-signal stability.
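The frequency stability metrics named above (frequency nadir, time to reach it, RoCoF and quasi-steady-state frequency) can be extracted from a simulated or measured frequency trace as sketched below. The trace values, the settling window and the choice of the steepest sample-to-sample slope as RoCoF are assumptions for illustration; the cited studies may define the metrics differently.

```python
def frequency_metrics(t, f, settle_window=3.0):
    """Compute basic frequency stability metrics from a trace.

    t: time stamps in seconds, f: frequency in Hz (same length as t).
    settle_window: trailing interval (s) averaged for the quasi-steady state.
    """
    nadir_idx = min(range(len(f)), key=f.__getitem__)
    # Sample-to-sample rate of change of frequency in Hz/s.
    rates = [(f[i + 1] - f[i]) / (t[i + 1] - t[i]) for i in range(len(f) - 1)]
    tail = [fi for ti, fi in zip(t, f) if ti >= t[-1] - settle_window]
    return {
        "nadir": f[nadir_idx],
        "time_to_nadir": t[nadir_idx] - t[0],
        # RoCoF taken as the slope with the largest magnitude (assumption).
        "rocof": max(rates, key=abs),
        "quasi_steady_state": sum(tail) / len(tail),
    }


# Hypothetical trace after a generator trip in a 50 Hz system.
t = [float(k) for k in range(11)]
f = [50.0, 49.8, 49.6, 49.5, 49.55, 49.6, 49.7, 49.7, 49.7, 49.7, 49.7]
metrics = frequency_metrics(t, f)
```

Such values would serve as training targets when a DNN or ELM is trained to predict the metrics directly from pre-disturbance system conditions.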
In [79], a framework for power quality disturbance analysis is proposed, combining compressive sensing and machine learning algorithms. Therein, a two-stage reduction is used. First, random projection is utilized to reduce the dimension of the fault signal. Second, a k-nearest-neighbor algorithm is applied to find the k nearest training samples in the whole dataset and create a reduced set for training. Finally, the fault is classified by solving an objective formulation consisting of a combination of the L1-norm and the L2-norm. For testing purposes, sixteen different scenarios including flicker and harmonics were simulated 200 times each.

4) ANALYSIS OF ECONOMIC EFFICIENCY
In the following, the economically optimal generation of a distribution system is investigated. This topic might be most relevant for the balancing group manager since the liberalization of energy markets. Nevertheless, the evaluation of economic efficiency is mentioned in [49] as part of the DOMA and the optimization might also be extended to other optimization goals. The economic dispatch problem was originally intended to minimize the generation cost. Today, some approaches also consider the reduction of carbon emissions as the main goal or at least as a secondary one. Consequently, a cost function is formulated, integrating all the different optimization goals [80]. Due to the slow change of the problem variables, the optimization does not require a fast runtime or adaptability, as shown in figure 12. Nevertheless, a certain amount of data is required for proper optimization. Many studies have been presented that solve this optimization task with metaheuristic methods; a few are reviewed in the following. Liang et al. [81] propose an improved fruit fly optimization algorithm for solving the economic dispatch problem. To this end, multiple modifications are implemented, such as penalty functions for the integration of operation constraints of the system. For testing, the IEEE 6-, 40- and 10-bus systems are used to run multiple tests. The first two grids are static, whereas the latter one is dynamic in its load and generation behavior. Chen et al. [82] applied an improved particle swarm optimizer using biogeography-based learning to the economic dispatch problem. By integrating a comprehensive learning strategy and biogeography-based optimization, the PSO particles are able to learn from each other, which leads to an efficient balance between exploration and exploitation and counteracts unintentional (premature) convergence. 
Across the study, five test systems with varying numbers of generators and loads were implemented in MATLAB and 50 individual simulations were performed for the generation of statistical information about the performance of the optimization algorithm.
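How a penalty function integrates an operation constraint into the cost function of a metaheuristic can be sketched with a plain particle swarm optimizer. The sketch is deliberately basic and is neither the improved fruit fly algorithm of [81] nor the biogeography-based PSO of [82]. The three-unit system, the cost coefficients, the demand, the penalty weight and the PSO parameters are all hypothetical.

```python
import random

# Hypothetical units: (a, b, c, p_min, p_max) with cost a + b*P + c*P^2 in $/h, P in MW.
UNITS = [
    (500.0, 5.3, 0.004, 200.0, 450.0),
    (400.0, 5.5, 0.006, 150.0, 350.0),
    (200.0, 5.8, 0.009, 100.0, 225.0),
]
DEMAND = 800.0   # MW, assumed
PENALTY = 100.0  # $/h per MW of power imbalance, assumed


def fuel_cost(p):
    return sum(a + b * pi + c * pi * pi for (a, b, c, _, _), pi in zip(UNITS, p))


def penalized_cost(p):
    """Fuel cost plus a penalty for violating the power balance constraint."""
    return fuel_cost(p) + PENALTY * abs(sum(p) - DEMAND)


def pso(n_particles=30, n_iter=300, w=0.7, c1=1.5, c2=1.5, seed=0):
    rng = random.Random(seed)
    lo = [u[3] for u in UNITS]
    hi = [u[4] for u in UNITS]
    pos = [[rng.uniform(l, h) for l, h in zip(lo, hi)] for _ in range(n_particles)]
    vel = [[0.0] * len(UNITS) for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_cost = [penalized_cost(p) for p in pos]
    g = min(range(n_particles), key=pbest_cost.__getitem__)
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]
    history = [gbest_cost]
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(len(UNITS)):
                vel[i][d] = (w * vel[i][d]
                             + c1 * rng.random() * (pbest[i][d] - pos[i][d])
                             + c2 * rng.random() * (gbest[d] - pos[i][d]))
                # Generator limits are enforced by clamping the position.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo[d]), hi[d])
            cost = penalized_cost(pos[i])
            if cost < pbest_cost[i]:
                pbest[i], pbest_cost[i] = pos[i][:], cost
                if cost < gbest_cost:
                    gbest, gbest_cost = pos[i][:], cost
        history.append(gbest_cost)
    return gbest, gbest_cost, history


best_p, best_cost, history = pso()
```

The penalty weight is chosen larger than any marginal generation cost in this toy system, so that violating the power balance is never cheaper than generating the missing power.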
Besides metaheuristic methods, reinforcement learning is also applied to economic optimization problems a lot recently. Lin et al. [83] present an approach based on deep reinforcement learning for real-time economic dispatch in a virtual power plant. By integrating edge computing, the computational and communicational load is reduced. Moreover, a 3-layer system is implemented with the virtual power plant (VPP) operator on the highest stage, followed by an agent for every region of generation and load, which is the lowest stage. To solve the economic dispatch problem, a DNN is trained offline at the VPP stage using historical data on an hourly timescale. The results are handed over to the agents as set points. Doing so, the agents are able to solve the economic dispatch for their own region online. For testing purposes, a three-area system with multiple loads and generators was designed and 45,000 samples with 24 hours of data were used to train the network. Dai et al. [84] also propose a distributed reinforcement learning algorithm for solving the economic dispatch, additionally unknown generation cost functions are taken as a premise. That being the case, the state-action value function approximation is utilized to solve this problem. A simple 4-generator system and the IEEE 39-bus system with 10 generators are used for testing. All through the study, twelve different load situations are implemented on both systems to test the algorithm.
Across this chapter, multiple approaches for modeling and analysis of power systems utilizing AI have been reviewed, so some major concerns in practical application should be briefly discussed here.
• The integration of large measurement systems, which are crucial for most modeling and analysis approaches, increases the vulnerability to cyberattacks. As this topic is not an essential part of this study but a major concern when integrating AI approaches, it should be mentioned here. It is also worth noting that some approaches already integrate trust metrics for measurement values in modeling, e.g., [85].
• The online adaptability of models and analysis tools is crucial for long-term application to power systems, as they change continuously. Consequently, new training and adaptation of the model have to be performed on a regular basis, which also requires new datasets including updated system data. Hence, such models and analysis tools have a high maintenance demand.
Based on the basic requirements proposed for every application as well as the usage statistics of the applied AI methods, a rough guideline for selecting a suitable algorithm for every application is presented in table 4. Using a quantitative approach, every algorithm's applicability to a certain application is rated. A bad rating in this table does not mean that the technique cannot be applied to a problem at all; it just provides an orientation.

B. APPLICATIONS IN DISTRIBUTION SYSTEM CLOSED-LOOP CONTROL
In the following chapter, some of the most recent studies working on closed-loop controls in distribution systems are presented. For the most part, closed-loop controls are used to optimize voltage, active or reactive power.

1) VOLT/VAR/WATT OPTIMIZATION
In the first part of this chapter, the low-frequency oscillations resulting from multiple voltage regulators in the system are investigated. For the most part, this was a problem of the transmission system operator, but when inverter-based generation connected to the distribution grid is operated in voltage-controlled mode, this might also be interesting for distribution grid operators. The integration of continuously acting voltage regulators in most conventional generating units has a significant impact on the steady-state stability of the power system. The reason is the resulting low-frequency, small-magnitude oscillations, which can become dangerous for the system without additional control. Because of this, a supplementary excitation control known as the power system stabilizer (PSS) was developed for synchronous generation [86]. The runtime of the implemented controller has to be fast to react to the dynamic behavior of the system. Nevertheless, the optimization of the controller parameters can be significantly slower, as it does not have to happen in real-time, as shown in figure 13. The same holds for the adaptability. To improve the controller performance, Sabo et al. [87] propose a Neuro-Fuzzy controller (NFC) to replace the conventional PSS as well as a coordinated multi-power system stabilizer for power system stabilization and reduction of low-frequency oscillations. The NFC combines a fuzzy controller and an ANN, so its advantages are the integration of expert knowledge into fuzzy logic, no need for a plant model and the ANN's ability to learn. In this study, a 6-layer NFC with 2 inputs, the error and the change of the error, is developed. For the coordinated multi-power system PSS, a metaheuristic farmland fertility algorithm is utilized, which divides the problem into different sections and optimizes each one separately. 
For testing purposes, an eigenvalue simulation analysis is performed based on SMIB, IEEE 3-machine, 9-bus and the 10-machine 39-bus New England grid using MATLAB. Three different scenarios are simulated in this study, first a symmetrical three-phase fault, followed by a drop and a sudden rise of generation at a certain number of generators. A similar method is applied by Douidi et al. [88], using a cascaded controller consisting of several PD fuzzy control blocks to act as a nonlinear lead-lag for low frequency damping and a krill herd algorithm for parameter optimization. Throughout the paper, disturbance tests are performed using a 3-machine 9-bus IEEE grid for simulation as well as a larger 16-machine 68-bus system. Masrob et al. [89] proposed an ANN to adjust the controller parameters in real-time for a responsive control behavior. To mimic the behavior of a PD controller a neural network with one layer and two nodes is trained based on the rotor speed aberration and its derivative. On that account, the utilized grid model is reduced in the first place by keeping the dominant eigenvalues. A one-machine infinite-bus system is used for simulation and small changes in the reference voltage are studied. A similar approach is followed by Rana et al. [90], except the ANN is used to estimate the optimal parameters for a conventional PSS online. Therefore, a set of 1,000 samples in different conditions is generated and optimized offline for training and testing of the ANN. After training, a one-machine infinite-bus system is used for testing combined with different load conditions. Chitara et al. [91] proposed a metaheuristic approach and applied the cuckoo search optimization algorithm as a power system stabilizer to reduce low frequency oscillations. To this end, the algorithm is used to optimize a cost function that consists of the damping ratio and damping factor of the eigenvalues and operating points. 
By selecting the damping factor and ratio, the unstable eigenvalues will be placed in a D-shape region in the left half of the s-plane. For simulation purposes, the New England 39-bus system with 10 generators is utilized and three operating conditions are tested varying from low to high loading. Additionally, three three-phase fault scenarios are tested on different buses. It is worth mentioning that the computation times for all implemented algorithms are over 15 min. The application of metaheuristic algorithms to PSS is a logical consequence, as the problem can be easily formulated as a figure of merit. That being the case, similar approaches as the ones described above are proposed by Dasu et al. [92], Syahputra and Soesanti [93] and Ekinci and Hekimoǧlu [94].
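An eigenvalue-based cost function of the kind described above, which pushes poorly damped modes into a D-shaped region of the s-plane, can be sketched as follows. The formulation shown is a form commonly used in the PSS tuning literature and is assumed here; the thresholds `sigma0` and `zeta0`, the weight and the example eigenvalues are hypothetical and not taken from [91].

```python
def damping_ratio(eig):
    """Damping ratio of a complex eigenvalue sigma + j*omega."""
    magnitude = abs(eig)
    return -eig.real / magnitude if magnitude > 0 else 0.0


def d_shape_objective(eigs_per_operating_point, sigma0=-1.0, zeta0=0.1, weight=10.0):
    """Penalize eigenvalues outside the D-shaped region over all operating points.

    An eigenvalue lies inside the region if its real part (damping factor) is
    below sigma0 and its damping ratio is above zeta0; such modes add no cost.
    """
    j_sigma = 0.0
    j_zeta = 0.0
    for eigs in eigs_per_operating_point:
        for e in eigs:
            if e.real >= sigma0:
                j_sigma += (sigma0 - e.real) ** 2
            if damping_ratio(e) <= zeta0:
                j_zeta += (zeta0 - damping_ratio(e)) ** 2
    return j_sigma + weight * j_zeta


# Two hypothetical operating points with one electromechanical mode each.
well_damped = [[-2.0 + 3.0j], [-1.5 + 4.0j]]
poorly_damped = [[-0.5 + 5.0j], [-2.0 + 3.0j]]

cost_ok = d_shape_objective(well_damped)
cost_bad = d_shape_objective(poorly_damped)
```

A metaheuristic such as cuckoo search would evaluate this function for each candidate set of stabilizer parameters after recomputing the closed-loop eigenvalues, which is the step that dominates the computation time.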
A different approach is followed by Zhu and Jin [95]. Here, a reinforcement learning framework is applied to the optimization problem of the power system stabilizer. The Q-learning algorithm is used to optimize the PSS parameters based on the reward received for a specific control action. Using Kundur's four-machine two-area system, multiple tests were performed in MATLAB/SIMULINK. Two different operation modes were utilized, the first one containing pulse inference and the second one a three-phase short circuit.
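The core of tabular Q-learning for parameter tuning is the update of the state-action value after each reward. The following sketch illustrates this under strong simplifications that are not taken from [95]: the PSS gain is discretized into eleven hypothetical levels, the actions only step the gain down, hold it or step it up, and the reward function is a made-up stand-in for a damping measure that would normally come from a dynamic simulation.

```python
import random

ACTIONS = (-1, 0, 1)  # decrease, keep or increase the (hypothetical) gain index


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """One Q-learning step: Q <- Q + alpha * (r + gamma * max_a' Q(s', a') - Q)."""
    best_next = max(q.get((next_state, a), 0.0) for a in ACTIONS)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)


def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy action selection."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q.get((state, a), 0.0))


def simulated_reward(gain_index):
    # Stand-in for a simulation result: damping is assumed best at index 7.
    return -abs(gain_index - 7)


def train(episodes=200, steps=20, seed=0):
    rng = random.Random(seed)
    q = {}
    for _ in range(episodes):
        state = rng.randint(0, 10)
        for _ in range(steps):
            action = choose_action(q, state, 0.2, rng)
            next_state = min(max(state + action, 0), 10)
            q_update(q, state, action, simulated_reward(next_state), next_state)
            state = next_state
    return q


q_table = train()
```

In a real study, each reward evaluation would require a time-domain or eigenvalue simulation of the test system, which is why such training is performed offline.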
In the next part of this chapter, the control of voltage and reactive power in distribution systems is briefly investigated. An in-depth analysis of challenges of voltage control in smart grids can be found in Sun et al. [10]. In this study, only some recent approaches will be reviewed to give a broad overview. On account of multiple different load and generation situations in a power system, the voltage control requires a short runtime, high adaptability and therefore a comprehensive dataset as shown in figure 14.
A multi-agent framework for voltage control using deep reinforcement learning was proposed by Wang et al. [96]. For this purpose, the voltage control problem is formulated as a Markov game and only the local measurements are available for each agent in its zone. During offline training, a power flow calculation is performed in the first step and the results are handed over to the individual agents to detect voltage band violations. To clear the violation, each agent identifies control actions, and a second power flow as well as the rewards are calculated. By utilizing the Illinois 200-bus system, four different test cases with 5,000 samples each are investigated in this study. A broad spectrum of possible failures is covered, ranging from load change to line tripping and communication failures. Diao et al. [97] and Duan et al. [98] also propose a deep reinforcement learning framework, named GridMind, which is able to take online control actions. When a set of real-time measurements arrives, a power flow is calculated for voltage band violation detection. The obtained states are then processed by the deep reinforcement learning agent together with the calculated reward to find the control actions which lead to the highest future reward. Subsequently, a second power flow is solved, which also considers the suggested control actions, to check for voltage violations again. The results are then fed back into the process to update the reward. For testing purposes, multiple systems are utilized, among them the IEEE 14-bus system, which is used to create 10,000 samples with the PSAT software.
A hybrid system combining a metaheuristic approach and reinforcement learning is presented in [36]. Here, a two-stage system is proposed consisting of a real-time automatic voltage regulation (AVR) with secondary voltage control. Therein, an artificial emotional reinforcement learning algorithm is implemented for each generator's AVR, followed by an improved dragonfly algorithm. The individuals of this algorithm consist of a real and an imaginary part to extend the search space for performing a coordinated secondary voltage control. As an additional optimization goal, the carbon emission ratios for each generator are taken into consideration by the dragonfly algorithm. Doing so, the voltage control is moved from the conventional centralized three-stage approach to a decentralized two-stage version. The proposed framework is tested on the IEEE 57-bus, 118-bus and 300-bus systems with multiple simulations. Relying on metaheuristic optimization, Yoshida and Fukuyama [99] and Iwata and Fukuyama [100] propose a parallel multi-population differential evolutionary particle swarm optimization for voltage and reactive power control. On that account, the problem is formulated as a mixed integer nonlinear optimization problem with AVR operating values, OLTC tap positions and the number of reactive power compensation equipment as state variables. Constrained by the minimum and maximum voltage and the power flow, the formulation is optimized by the PSO. In this approach, multiple sub-swarms are built with agents migrating between the sub-swarms to exchange information. In total, 100 trial simulations are performed on the IEEE 118-bus system. A hybrid particle swarm optimization was proposed by Chen [101], utilizing a fuzzy adaptive inference to control the reactive power and voltage in a distribution system. Thus, a hybrid PSO is used, consisting of three PSO variants, searching for the optimal solution of the formulated mixed integer nonlinear programming problem. 
Fuzzy adaptive inference is used to improve the search process of the proposed PSO as it tends to converge to local minima. For testing, the IEEE 33-bus system is adapted and simulations are performed based on real-world data from a Chinese grid operator and multiple scenarios created by load and generation variation. Guliyev [40] developed a fuzzy controller for reactive power control of capacitor banks. Based on the reactive power, the derivative of reactive power, the voltage and the number of commutations of the capacitor bank, a control action is obtained using 96 calculated fuzzy rules. Furthermore, the functionality of the proposed controller is tested using a capacitor bank model for simulation with multiple variations of load and generation.

2) FAULT IDENTIFICATION, ISOLATION AND SERVICE RESTORATION
Another important part of power system operation is the detection and diagnosis of faults, losses and anomalies, in particular different kinds of short circuits, communication outages and cyberattacks, to avoid power outages. For more detailed information on this specific topic, the reader might check Gururajapathy et al. [102]. As shown in figure 15, the detection of faults and anomalies requires an accurate modeling of system dynamics and a fast runtime, due to fast changing system states. The system's adaptability is also important, owing to changing topologies, e.g., the addition of a feeder. Some techniques focus on a specific error type, so a comprehensive dataset is not mandatory in every case. It is noteworthy that the system-level restoration and recovery could be open-loop in some cases, but in future scenarios we assume that it will be automatic.
First, the detection and classification of faults, mainly short circuits, will be reviewed in this chapter. Peng et al. [103] propose an intuitionistic fuzzy spiking neural P system for fault diagnosis. According to the authors' analysis, this approach has three main advantages. The first one is the use of an intuitionistic fuzzy number, which shows the amount of alarm information and its impreciseness. The second one is the fuzzy reasoning mechanism, and the third one is the representation of the diagnosis results as a membership and a non-membership function. Based on this approach, a fault diagnosis model is built, which collects information from each device in the outage area, calculates the fault confidence levels and finds the faulty component. The proposed approach is further investigated using two test grids of different sizes and voltage levels. The first one is a 69 kV grid with 10 sections and the second one a 348 kV grid with 18 system sections. For both grids, three fault scenarios are tested, ranging from single faults without failure devices to multiple faults with the rejection of circuit breakers.
A different approach by Lin et al. [39] utilizes a hybrid system combining the advantages of the Genetic Algorithm and the Tabu Search for fault diagnosis. Therefore, the objective function containing the state of the system, the breakers in the system and the protection is enhanced by integrating the influence between the main and backup protection. This was done to mitigate the problem of non-uniqueness, which was investigated in prior studies. For simulation purposes, a typical power system consisting of 4 substations, 28 components, 84 protections and 40 breakers is utilized and possible failure scenarios are tested throughout the paper. Jamil et al. [104] propose a two-stage approach for fault classification consisting of a wavelet transformation and a genetic algorithm. In the first stage, the incoming current signal is decomposed into a high and a low frequency part by a filter and then separated into detail and aggregated components using multi-resolution analysis. Here, the detail coefficients are unique for every type of fault and are used to construct the input datasets for the following genetic algorithm. Ten different types of faults are simulated on a transmission line in MATLAB with different values of fault inception angle and fault resistance. In [105], Wang et al. combine the advantages of metaheuristics and machine learning by using an SVM for the classification of anomalies in generation control, optimized by an improved PSO algorithm. Thus, the PSO is extended by an adaptive speed weighting and population splitting to overcome convergence speed and local minima problems. For experimental validation, the Electricity Consumption & Occupancy (ECO) dataset is utilized, consisting of aggregated consumption data of six households over 8 months at a one-second resolution. By adding an additional analysis stage, Deng et al. [38] proposed a hybrid three-stage approach combining techniques from different fields to detect the faults of a motor bearing. 
In the first stage, the original vibration signal is decomposed into different intrinsic mode functions using the empirical mode decomposition followed by a fuzzy information entropy, to obtain the features used in the following stages. In the second stage, an improved PSO algorithm, tailored to the problem by several modifications, optimizes the parameters of a least-squares SVM. Finally, the trained least-squares SVM is applied to the actual classification task. For testing of the developed algorithm, vibration data from the Bearing Data Center of Case Western Reserve University was used, measured at a frequency of 12,000 Hz for 10 s. Another three-stage approach targeting the identification and localization of anomalies based on PMUs was proposed by Li et al. [106]. To avoid costly labeling work, unsupervised learning was used, so there is no need for historical labels. The developed framework consists of three main parts. The first part is event detection based on the change-point method, which detects abrupt changes in the data matrix generated from PMU signals. In the second part, an identification approach based on two stages is proposed. The first stage is a PCA, which finds the most important features to cluster the events, followed by a compactness evaluation stage, in which the compactness of the normal and event data distributions is evaluated. The final part is the localization of the event that occurred, which is again done by the change-point method: the location of the event is estimated by finding the most significant change in neighboring PMUs. Across the experiments performed in the study, different events as well as different PMU penetration levels are considered. Blazakis et al. [107] propose an adaptive neuro fuzzy inference system (ANFIS) for the detection of nontechnical losses such as illegal electricity consumption, e.g., meter tampering, or grid manipulation. 
The ANFIS is the combination of an ANN using backpropagation with a Sugeno fuzzy inference system and consists of five layers. The first one is a fuzzy layer, followed by a product layer combining the results from the first layer. In the third layer, all values are normalized, followed by a defuzzification layer before all nodes are aggregated in the output layer. As testing scenarios, three base cases are identified: partial theft, when the consumption is constantly lower; overload, when the consumption is constantly higher; and periodic theft, with reduced consumption during specific hours of the day. By varying the percentage of the overall consumption, thirteen different scenarios are created. The dataset used in this study contains data from 5,000 households in Ireland collected over 6 months in a 30 min resolution. As inputs for the ANFIS, the mean, median, load factor and entropy were selected from a range of possible features using the neighborhood component analysis. In [108] Goswami et al. studied three different machine learning techniques for fault analysis and classification, focusing on the identification. For this classification task a set of 11300 samples is created, 1,000 for each fault type, using MATLAB. Each sample consists of one voltage and one current value per phase, which makes six features. The time span set for the fault data captured during the simulation is 10 ms to 280 ms, which enables the trained classifiers to identify the fault types based on their dynamic behavior. Three classification algorithms are trained on this dataset, namely a K-Nearest-Neighbor, a Support Vector Machine and a Decision Tree.
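To make the five-layer ANFIS structure described for [107] more tangible, the forward pass of a generic first-order Sugeno system can be sketched as follows. This is only an illustration and not the implementation used in [107]: the Gaussian membership shape, the function names and all parameter values are assumptions, and the training of the parameters by backpropagation is omitted.

```python
import math
from itertools import product


def gaussian(x, center, width):
    """Gaussian membership degree of x (the membership shape is an assumption)."""
    return math.exp(-0.5 * ((x - center) / width) ** 2)


def anfis_forward(inputs, mf_params, consequents):
    """Forward pass through a five-layer Sugeno-type ANFIS.

    inputs      : list of crisp input values
    mf_params   : per input, a list of (center, width) membership functions
    consequents : per rule, a tuple (coefficients, bias) of a linear output
    """
    # Layer 1 (fuzzy layer): membership degree of every input in every fuzzy set
    memberships = [
        [gaussian(x, c, s) for (c, s) in mfs]
        for x, mfs in zip(inputs, mf_params)
    ]
    # Layer 2 (product layer): one rule per combination of fuzzy sets
    firing = [math.prod(combo) for combo in product(*memberships)]
    # Layer 3 (normalization layer): relative firing strength of each rule
    total = sum(firing)
    normalized = [w / total for w in firing]
    # Layer 4 (defuzzification layer): weighted linear consequent of each rule
    rule_outputs = [
        w * (sum(p * x for p, x in zip(coeffs, inputs)) + bias)
        for w, (coeffs, bias) in zip(normalized, consequents)
    ]
    # Layer 5 (output layer): aggregation of all rule outputs
    return sum(rule_outputs), normalized


# Hypothetical example: two normalized features, two fuzzy sets per feature,
# hence four rules. All numbers are placeholders for illustration only.
mf_params = [[(0.0, 0.5), (1.0, 0.5)], [(0.0, 0.5), (1.0, 0.5)]]
consequents = [((0.1, 0.2), 0.0), ((0.0, 0.3), 0.1),
               ((0.4, 0.0), 0.2), ((0.2, 0.2), 0.5)]
score, strengths = anfis_forward([0.3, 0.8], mf_params, consequents)
print(score, strengths)
```

In a detector of nontechnical losses, the inputs would be features such as the mean, median, load factor and entropy of a load profile, and the output would be compared against a decision threshold.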
In [79], the framework also used in [109] is applied to fault classification. A real-world test grid with 13.2 kV is built in MATLAB and 10 different faults are simulated 100 times each with multiple locations and fault parameters to generate a training dataset. Wang et al. [110] present an approach for online anomaly detection in a data attack situation with automatic generation control using a multi-class classification based on k-nearest neighbors. For this purpose, k-means clustering is performed offline to form the classes, followed by an online classification based on three conformity metrics that rely on the received Area Control Error. The developed system is tested using the IEEE 39-bus grid with synthetic data and six developed test scenarios, for instance flip and ramp attacks. An ensemble system for the detection of anomalies in PMU data is proposed by Zhou et al. [37]. Therein, a set of base detectors is first trained offline. When detecting an anomaly in online operation, the anomaly scores are calculated and aggregated as a decision base. For testing purposes, a stream of synthetic PMU data created by a real-time digital simulator as well as real-world PMU data is used. For the latter, three different types of anomalies are detected: zero voltage, data during events, and data deviating by more than 5 % from the mean value and from the previous and following points. Ren et al. [111] also focused on online anomaly detection and proposed a machine learning approach integrating HPC.
In these approaches, the detection is performed after the anomaly has appeared, so a prediction algorithm is valuable for the system operator to take preventive actions. To this end, Zhang et al. [112] developed a two-step system for fault prediction based on historical data. In the first stage, three LSTM subnetworks extract the temporal information from current, voltage and active power measurements. The resulting features are fed into an SVM classifier for fault estimation. In this study, a dataset from the China Southern Power Grid in Wanjiang from the years 2012 to 2014 was used, with 2500 samples for training and testing, consisting of 500 measurement points each. These points are recorded either before a line trip or during normal operation in a 15 min resolution and labeled with the event that finally appeared. The authors highlight the practical applicability of the proposed approach, based on the performed experiments and the ability of the system parameters to be updated online to new states of the power system. Ashok et al. [113] propose an approach to detect cyber-attacks in measurement systems and their influence on the state estimation by forecasting the state behavior and comparing the prediction with the actual measurement. In [85] the anomaly detection is also integrated into the state estimation by the addition of a trust metric for every measurement.
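The principle of comparing a forecast with the actual measurement, as described for [113], can be illustrated with a minimal sketch. It is not the method of [113]: the exponential smoothing forecast, the function name and the values of `alpha` and `threshold` are assumptions chosen for illustration.

```python
def flag_anomalies(measurements, alpha=0.5, threshold=0.05):
    """Return the indices whose deviation from the forecast exceeds the threshold.

    The forecast is a simple exponential smoothing of the accepted samples
    (an assumed stand-in for a proper state forecasting model).
    """
    flagged = []
    forecast = measurements[0]  # initialize the forecast with the first sample
    for i in range(1, len(measurements)):
        value = measurements[i]
        residual = abs(value - forecast)
        if residual > threshold:
            # Suspicious sample: report it and keep the forecast unchanged,
            # so that manipulated data does not contaminate the prediction.
            flagged.append(i)
        else:
            # Trusted sample: update the forecast
            forecast = alpha * value + (1 - alpha) * forecast
    return flagged


# Hypothetical per-unit voltage magnitudes with one injected outlier
voltages = [1.00, 1.01, 1.00, 0.99, 1.20, 1.00, 1.01]
print(flag_anomalies(voltages))  # → [4]
```

Keeping the forecast unchanged for flagged samples is a design choice of this sketch; in practice, a persistent deviation might indicate a real change of the system state and would have to be treated separately.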

3) COORDINATION OF EMERGENCY ACTIONS
When the grid is in a critical mode, the coordination of emergency actions is an important part of distribution system operation. A possible reaction to a frequency drop besides control actions is the shedding of loads. For this purpose, under-frequency load shedding relays are installed that disconnect the load when the threshold is reached. Owing to the complexity of power systems and their dynamic behavior, the optimal load shedding strategy is hard to find. Accordingly, the proposed algorithms have to be adaptable but do not require large datasets, as shown in figure 16. In [114] an approach for optimal coordination of under-frequency load shedding is proposed. An analytical hierarchy process algorithm is used to rate the importance of each load for the creation of a ranking as well as different load shedding strategies. K-means clustering is used to divide the occurring instability modes into different clusters based on the detected measurements. Moreover, an ANN is trained to choose the best load shedding strategy for every cluster. For training purposes, 667 datasets are created through offline simulation of system faults causing instability. The trained algorithm is then tested on the IEEE 39-bus grid using three different fault scenarios. Another approach was proposed by Malkowski and Nieznanski [115] using fuzzy logic to create an adaptive load shedding algorithm. To this end, membership functions for the frequency as well as for the derivative of the frequency are created. The output of the inference block gives the number of load groups that have to be disconnected.
For simulation purposes, the CIGRE 23-bus system is utilized. Multiple different situations are extensively tested including different turbine controllers, frequency-dependent loads and different numbers of operating points.
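The idea of the fuzzy load shedding scheme, mapping the frequency and its derivative to a number of load groups, can be sketched as follows. This is not the controller of [115]: the 50 Hz nominal frequency, the membership breakpoints, the four rules and the function names are all assumptions made for illustration.

```python
def falling_ramp(x, full, zero):
    """Membership that is 1 at or below `full` and falls linearly to 0 at `zero`."""
    if x <= full:
        return 1.0
    if x >= zero:
        return 0.0
    return (zero - x) / (zero - full)


def load_groups_to_shed(freq_hz, dfdt_hz_s, max_groups=4):
    """Number of load groups to disconnect (zero-order Sugeno inference)."""
    # Fuzzification; the breakpoints are assumed values for a 50 Hz system
    freq_low = falling_ramp(freq_hz, 48.5, 49.5)
    freq_ok = 1.0 - freq_low
    fast_drop = falling_ramp(dfdt_hz_s, -1.0, -0.2)
    slow_drop = 1.0 - fast_drop

    # Rule base: (firing strength, number of load groups); min acts as AND
    rules = [
        (min(freq_ok, slow_drop), 0),
        (min(freq_ok, fast_drop), 1),
        (min(freq_low, slow_drop), 2),
        (min(freq_low, fast_drop), max_groups),
    ]

    # Defuzzification by the weighted average of the rule outputs
    total = sum(weight for weight, _ in rules)
    crisp = sum(weight * groups for weight, groups in rules) / total
    return round(crisp)


print(load_groups_to_shed(50.0, 0.0))   # → 0 (normal operation)
print(load_groups_to_shed(48.0, -1.5))  # → 4 (low frequency, fast drop)
```

Because both inputs enter the rule base, a fast frequency decline can already trigger shedding before the frequency itself has dropped far, which reflects the adaptive behavior motivated above.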
Usman et al. [116] propose an approach to solve the optimal load shedding coordination when voltage limits are violated. By using a multi-objective minimization problem formulation, power loss, voltage deviation and cost of the load shedding are taken into account. An evolutionary PSO algorithm is used to solve this optimization problem. Additionally, the computational efficiency is increased by integrating an evolutionary competition between the current and previous positions of particles. Throughout the study, the approach is tested on the IEEE 33-bus distribution grid using a daily demand profile. In [117], Hasanat et al. propose an ant colony optimization algorithm to minimize the amount of load shedding. To improve the solution, the algorithm is extended by a local search. As this approach is developed purely on the graph structure of the electrical grid, a graph generator is used in combination with data from the national grid of Bangladesh to create benchmark datasets. Dreidy et al. [118] present another study on the optimization of the load shedding amount, comparing PSO, binary evolutionary programming and a binary genetic algorithm. For this purpose, a part of the Malaysian distribution system with high penetration of PVs is modelled, in which ten loads are flexibly prioritized while the two remaining loads have a fixed priority.
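As an illustration of how the three objectives considered in [116] can be combined, one common option is a weighted-sum scalarization. The following expression is only a sketch under this assumption and is not necessarily the formulation used in [116]; the weights $w_1, w_2, w_3$, the reference voltage $V_{\mathrm{ref}}$, the cost coefficients $c_j$ and the shed power $\Delta P_j$ are symbols introduced here for illustration.

```latex
\min_{x}\; F(x) = w_1\, P_{\mathrm{loss}}(x)
  + w_2 \sum_{i=1}^{N} \left| V_i(x) - V_{\mathrm{ref}} \right|
  + w_3 \sum_{j=1}^{M} c_j\, \Delta P_j(x)
```

Here, $x$ denotes the load shedding decision, $N$ the number of buses and $M$ the number of sheddable loads. A particle of the PSO would then encode one candidate $x$, and $F(x)$ would serve as its fitness value.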

4) COORDINATION OF RESTORATIVE ACTIONS
The importance of the distribution system operator in restorative actions after a blackout is increasing as more generation is connected to the distribution system. Accordingly, multiple studies were recently presented working on the integration of distributed generation into the restoration strategy [119], [120]. Moreover, the integration of energy storage in restoration is also proposed in many studies, including the utilization of EVs [121], [122]. Applied approaches have to be adaptable because of the different situations that can appear after a blackout, as can be seen in figure 17.
Zhou et al. [123] propose a multi-agent system to restore a distribution grid. Therein, two classes of agents are defined, namely load agents and distribution substation agents. The goal of a load agent is to restore its own load and subsequently offer energizing actions to the neighboring loads, whereas a substation agent monitors the substation power flow and holds a list of all load agents. For testing purposes, a 16-bus 24 branch system is used considering load and substation faults. Another multi-agent based approach is provided by Sampaio et al. [124]. A system based on four different types of agents is proposed, namely substation agent, feeder agent, branch agent and equipment agent. Every agent communicates its name, status and equipment as well as loading and priority loads if accessible. For testing, a simulator was developed to represent a real MV system with four substations. In this study, two fault situations were tested. In contrast to that, a hybrid approach using machine learning and metaheuristic methods is proposed in [35]. A total of four metaheuristic algorithms, namely a modified PSO, a frog leaping algorithm, a genetic algorithm and an ant colony optimization algorithm, are used in combination with a multi-class support vector machine. To create a database, 320 faults were simulated using the IEEE 69-bus system and for every scenario each metaheuristic algorithm finds its best restoration solution. The best solution of the four algorithms serves as a target value for the SVM, and features that are extracted utilizing a discrete wavelet transform are used as inputs. In [125] another approach using metaheuristic and machine learning techniques is proposed for optimal restoration strategies. For this purpose, a PSO is used to optimize the switch positions for power loss minimization. Additionally, an ANN is fed with switch positions, system level loads and line power to find a load balancing index and the voltage profile. 
To create a database for training, the IEEE 33-bus system by NRM is utilized and load variations are performed.
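The database construction described for the hybrid approach in [35], in which the best result of several metaheuristics becomes the training target of a classifier, can be sketched as follows. The sketch is purely illustrative: the solver interface returning a `(plan, cost)` pair, the feature extraction function and all names are assumptions, and the toy solvers merely stand in for real metaheuristics.

```python
def select_best(scenario, solvers):
    """Run every solver on the scenario and return the plan with the lowest cost."""
    candidates = [solver(scenario) for solver in solvers.values()]
    best_plan, _best_cost = min(candidates, key=lambda candidate: candidate[1])
    return best_plan


def build_training_set(scenarios, solvers, extract_features):
    """Create (features, target) pairs for a supervised classifier."""
    return [
        (extract_features(scenario), select_best(scenario, solvers))
        for scenario in scenarios
    ]


# Toy stand-ins for metaheuristic solvers (hypothetical): each returns a
# restoration plan as a tuple of switching actions together with its cost.
solvers = {
    "pso": lambda s: (("close_S1",), 3.0 + 0.1 * s["fault_line"]),
    "ga": lambda s: (("close_S2",), 2.5 + 0.2 * s["fault_line"]),
}
scenarios = [{"fault_line": 1}, {"fault_line": 9}]

# The feature extraction is a placeholder; [35] uses a discrete wavelet transform.
dataset = build_training_set(scenarios, solvers, lambda s: (s["fault_line"],))
print(dataset)
```

Once trained on such pairs, the classifier can propose a restoration plan directly from the extracted features, without running the time-consuming metaheuristics online.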
In the following, some concluding remarks regarding the practical implementation of closed-loop AI algorithms in power systems will be discussed similar to the previous subchapter.
• In control theory, the robustness of controllers has been studied extensively, but in AI there have been only a few studies, and for some AI approaches the robustness is hard to investigate due to their black-box structure. Nevertheless, for the safe operation of power systems, robustness is crucial, as the provision of energy to the consumers has to be guaranteed by the system operator. Hence, future research should consider this problem, which is also closely related to the explainability that will be discussed in more detail in the next chapter.
• The dynamics in power systems change as a result of the integration of inverter-coupled participants [45], which also leads to interactions between multiple controllers with different runtimes and optimization goals. As a consequence, the influence of AI-based controllers has to be investigated in depth.
To conclude the analysis of closed-loop systems, a table is presented similar to the one proposed at the end of the previous subchapter. In this way, table 5 provides a guideline for the selection of an AI algorithm for closed-loop control applications. As described previously, this only serves as a guideline and does not exclude possible applications of low-rated methods for certain problems.
Throughout this chapter, some of the most recent studies for distribution grid operation utilizing AI techniques were systematically presented and reviewed. In this way, the current state of the art is presented in a compact format and research directions for the individual problems can be identified. In table 6 the distribution of the applications across publications is shown to visualize possible research gaps or underrepresented research fields. It can be seen from the table that the fields of economic dispatch and power system stabilizers have not been extensively researched in the past few years. Regarding the underrepresentation of research in the field of economic dispatch, a reason might be that this topic is very complex owing to market mechanisms and government rules. Power system stabilizers have been researched for some time, but they seem to have lost relevance during the last few years according to the analysis. One reason for that might be the rise of inverter-based generation also connected to the distribution grid, which needs at least an adaptation of controls, as these units are not coupled synchronously. This also leads to an enhanced description of stability, which has to be considered when designing controllers. For further information, the authors refer to [126]-[128].

V. OUTLOOK: AI IN POWER SYSTEMS
It can be concluded that AI is already widely used in power systems research, but there is still room for improvement and further research. In this chapter, a broad outlook will be given together with some major concerns regarding the practical implementation of AI, which have to be addressed in future studies now that the potential of AI applications has been shown in the last few years.

A. EXPLAINABILITY OF AI
Regarding the implementation in real-world systems, a concern is the explainability of the AI system. This is especially relevant in closed-loop control, where the system and not the operator takes control actions. Owing to the black-box structure of most AI approaches, it is not possible to check if the developed system is behaving as intended in all situations. Approaches that have been researched intensively in recent years are the integration of physics into AI techniques and explainable AI. There are multiple possibilities to explain the behavior of an AI system, ranging from understanding what the model has learned [129] to the explanation of individual predictions [130]. Nevertheless, during this review, only a few studies on explainable AI in power systems were found, so this topic has considerable potential for future research activities.

B. DATABASE
As already mentioned across this review, the database is essential for most AI applications. The complexity of power systems is high, which leads to a need for extensive data collection and sorting for training and testing of the developed models and algorithms. There are already some open source data collections available [64], [131], but these are developed on specific benchmark grids or recorded in a specific real-world situation. Therefore, the general applicability might not hold for every approach. Nevertheless, there are already collections and surveys available compiling multiple databases [132], [133]. Additionally, access to data is limited because of data privacy regulations. A lot of attention has been paid to this topic during the last few years [134]. As a result of the restrictions, a central collection and usage of data for individual loads is sometimes not possible. While necessary to protect the data, the restrictions lead to suboptimal datasets and slow down further development to a certain extent.

C. REDUCTION OF COMPUTATIONAL LOAD
Although computational power has increased massively during the last decade, most AI techniques still require a lot of time to learn complex behaviors. Hence, the application to real-time tasks in power systems is limited, especially when the learning process is performed online, e.g., online adaptation of Deep Neural Networks. This is also a critical point for metaheuristic approaches applied to optimization tasks, since they perform an extensive trial-and-error process, which takes a long time to converge. Moreover, online adaptability of the developed models and approaches is necessary due to long-term changes in power systems, such as the aging of components.
The following conclusion regarding the application of the reviewed AI approaches to future power system operation is not comprehensive but highlights four points that should be considered:
• Enhancement of explainability to increase the plausibility and traceability of AI systems
• Enhancement of robustness in every state of the system, so safe operation is possible
• Development of comprehensive datasets for training and testing of developed approaches
• Reduction of computational demand to allow real-time application and online adaptability
In the last few years, a lot of new concepts and techniques emerged in power systems that also have considerable potential for AI applications. Some of them will be mentioned here briefly.

D. SECTOR COUPLING
For the transition of the power system to carbon neutrality, the coupling of energy sectors is a research field that gained a lot of attention during the last decade [135], [136]. Concepts like energy quarters were developed for optimizing whole sector-coupled buildings and neighborhoods, which are located in the distribution grid. They can also be utilized as flexibility in the grid [137], depending on the storage capability, e.g., in battery storage and EVs. Therefore, various control concepts and algorithms have to be developed and implemented, and AI seems to be a helpful tool here [138], [139].

E. PROVISION OF ANCILLARY SERVICES
As a result of the volatility of renewable energy generation and new load characteristics, the provision of ancillary services is an upcoming research topic. On that account, flexible loads such as electric vehicles (EVs) can be utilized in addition to DER [140], [141] through advanced charging concepts [142]-[145]. A problem that occurs in the participation of EVs in ancillary services is the missing infrastructure [146]. Moreover, owing to the increase in asynchronous generation, the acquisition and provision of system inertia has to be considered, as it is no longer inherently provided by synchronous generators. This leads to multiple questions on the behavior of low inertia systems [147]. AI techniques can be applied here as well [148].

VI. CONCLUSION
In this article, some of the most recent applications of AI in distribution power system operation are reviewed. To this end, the basic functionality of the main AI methods, namely rule-based systems, metaheuristic methods and machine learning, is introduced and their application to power system specific problems is shown. Throughout this study, the applications are divided into decision-support and closed-loop control systems. A guideline for selecting a suitable algorithm for an application is developed in this review. In doing so, four general metrics are proposed: the severity of requirements on the database, runtime, dynamics and adaptability. The metrics are quantitatively assigned to each application. Based on the reviewed studies and the provided metrics, a conclusion is provided rating the suitability of each technique for each application.