A Fully Decentralized Machine Learning Algorithm for Optimal Power Flow with Cooperative Information Exchange



Background and Motivation
Despite the profound differences between modern smart grids (SG) and traditional power systems, the primary function remains unchanged: achieve a balance between power generation and demand at minimal cost while ensuring system reliability and stability. Optimal power flow (OPF) management is at the heart of this, aiming at identifying the optimal generation capacity of controllable/dispatchable generators in a power grid such that the total demand is met. OPF ensures that an electrical power grid is operating at a minimal total cost given the current demand profile and its technical and security constraints. OPF analysis is indispensable for power system operators, being continuously employed to ensure that the system is running at minimal or near-minimal operating costs (Abdi et al., 2017).
In a traditional power system, the system operator is a single central entity with access to all measurable variables in the power grid. Accordingly, constructing a deterministic AC power flow model is possible (Kotsalos et al., 2019). The AC-OPF thus incorporates all constraints, including the available generators' costs and limitations, grid structure, and associated safety constraints (e.g., bus voltage angle and transmission line power limits), to obtain the exact solution for ideal generation levels of individual generators. However, a deterministic AC power flow model and the resulting AC-OPF model are both highly complex and highly non-linear, making them formidable to construct mathematically for each specific case, in addition to being computationally expensive to solve. Therefore, there has always been an interest in the scientific literature in developing simplified and computationally efficient models, popularizing linear programming approaches (Capitanescu, 2016; Ergun et al., 2019).
Linearized OPF models do alleviate the mathematical complexity and computational expense. However, they suffer from critical drawbacks. First, the obtained solution, while deterministic in nature, is approximate and may lack accuracy. Furthermore, linearization approaches can only be applied if the objective function(s) are differentiable and continuous, and they do not model uncertain or unknown variables (Yang et al., 2018).
The presence of the latter has been an issue even for traditional power systems. While a centralized system operator would have access to all measurable variables in the grid, uncertain parameters in the model persist for two reasons. First, in conventional power systems not all variables are constantly being metered or measured. Second, the increased presence of renewable energy sources (RES) as non-dispatchable sources inherently creates uncertain parameters in the model (Cruz et al., 2018).
Alternatively, quasi-deterministic and probabilistic methods have been developed to solve power flow and OPF problems in the presence of missing/unmeasurable data and RES uncertainties (Beltrán et al., 2019).
Quasi-deterministic methods, such as Monte-Carlo (MC) simulations, account for uncertain variables by randomly generating a sufficiently large number of input samples to cover the entire uncertainty range and obtaining a deterministic solution for each sample. Thus, the uncertainty range of input variables is used to generate an uncertainty range of the outputs. While this provides accurate and complete information on grid behavior, it is often computationally expensive (Beltrán et al., 2019).
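The quasi-deterministic idea can be illustrated with a minimal sketch. The toy function `toy_line_flow`, the normal load distributions, and the sample count are assumptions made purely for illustration; in practice the deterministic model would be a full power flow solver.

```python
import random
import statistics


def monte_carlo_summary(model, samplers, n_samples=5000, seed=0):
    """Evaluate a deterministic model once per random input sample."""
    rng = random.Random(seed)
    outputs = []
    for _ in range(n_samples):
        inputs = [draw(rng) for draw in samplers]  # one sample of every uncertain input
        outputs.append(model(inputs))              # deterministic solution for that sample
    return min(outputs), statistics.mean(outputs), max(outputs)


def toy_line_flow(loads):
    # Hypothetical stand-in for a deterministic power flow solution.
    return 0.6 * loads[0] + 0.4 * loads[1]


samplers = [
    lambda rng: rng.gauss(100.0, 5.0),  # uncertain load 1 (MW), assumed normal
    lambda rng: rng.gauss(80.0, 4.0),   # uncertain load 2 (MW), assumed normal
]
low, mean, high = monte_carlo_summary(toy_line_flow, samplers)
```

The cost grows with the number of samples because every sample requires one full deterministic solution, which is the computational burden noted above.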
Probabilistic approaches employ statistical models to convert probability distributions of input parameters to those of the outputs. They have a significantly higher computational efficiency and do not necessitate constructing a physical system model. However, they require knowledge of the uncertain variables' probability distributions from a historical dataset, which makes them predecessors of modern machine learning (ML) algorithms (Alimi et al., 2020).
In order to clearly identify the state-of-the-art progress on the development of probabilistic and ML methods for new SG management paradigms which are discussed in Section 1.3, a literature review of the original probabilistic methods and their historical evolution is needed and is presented in Section 1.2.

Literature Review: History of Probabilistic Methods for Traditional Power Systems
Decades ago, prior to the proliferation of renewable/non-dispatchable generation, stochastic behavior in power grids existed essentially on the load side; hence the original name "Probabilistic Load Flow" (PLF), which is still used interchangeably with probabilistic power flow (PPF). (Borkowska, 1974) was one of the first papers to propose, implement, and test a PLF method for power system operation and planning.
The proposed method was used to obtain probability distribution functions (PDF) of branch (transmission line) power flows given those of input loads. First, three assumptions were made to simplify complex nonlinear equations: 1) linear relationship between branch flows and net nodal loads (linearized around expected values), 2) independence of active and reactive power, and 3) power balance is a function of the sum of input and output powers only (i.e., independent of individual nodal values).
Branch flow PDFs were then obtained by evaluating a recursive set of convolution integrations of the input and output PDFs. This could be used to obtain practically valuable information such as the probability of a line flow exceeding a certain value (e.g., capacity limit) or the realistically possible range for line loads.
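A minimal sketch of this idea for discretized distributions is given below. It assumes two independent loads on a 1 MW grid and a branch flow equal to their plain sum (unit sensitivity coefficients); the numbers are invented for illustration and are not taken from Borkowska's paper.

```python
def convolve_pmf(p, q):
    """Discrete convolution of two probability mass functions on unit-spaced grids."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, p_i in enumerate(p):
        for j, q_j in enumerate(q):
            out[i + j] += p_i * q_j
    return out


def exceedance_probability(pmf, offset, limit):
    """P(value > limit), where pmf[k] is the probability of the value offset + k."""
    return sum(prob for k, prob in enumerate(pmf) if offset + k > limit)


load_a = [0.2, 0.5, 0.3]  # P(load A = 10, 11, 12 MW), hypothetical
load_b = [0.6, 0.4]       # P(load B = 5, 6 MW), hypothetical
flow_pmf = convolve_pmf(load_a, load_b)  # flow = A + B, values 15..18 MW
p_overload = exceedance_probability(flow_pmf, offset=15, limit=16)
```

The nested loop makes the quadratic cost of each convolution explicit, which hints at why the number of convolutions becomes a bottleneck for larger networks.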
A major drawback of Borkowska's technique was that a very large number of convolution integrations had to be evaluated. This restricted the method to smaller networks due to limitations of both computational speed and memory, especially at the time, limiting its applicability to real-life scenarios. (Allan et al., 1981) recognized this issue and exploited frequency domain multiplication with Fast Fourier Transforms (FFT) as a computationally efficient alternative to time domain convolution integration.
The results proved FFT to be superior to convolution in terms of both computational efficiency and accuracy. In addition to proposing a more efficient PLF method, the study performed multiple important validation studies. First, the results of PLF were compared against those of MC-5000 for a case of high load value uncertainty (15 times the usual standard deviation). The results were highly similar, with a slight skew in the PDFs which was not significant for practical applications where realistic uncertainties are much lower.
Second, the use of the central limit theorem was considered. Results showed that even when all inputs had a normal distribution, the output did not, due to inherent system non-linearities. Therefore, it was concluded that the theorem should not be used regardless of system size. (Allan et al., 1981) later adapted the method to radial distribution networks, considering uncertainties in short-term (hourly) wind speed forecasts and corresponding uncertainties in produced active power and absorbed reactive power. (Hatziargyriou et al., 1993) incorporated a probabilistic model for wind turbines into the original method.
An alternative to convolution/FFT was proposed by (Zhang & Lee, 2004). The proposed algorithm relied on the statistical premise that two distributions with equal moments must also have equal cumulants.
Thus, one can be computed from the other. The algorithm started by calculating the moments, and thereby the cumulants, of injected power. Linearized equations were then used to calculate the cumulants, and thereby the moments, of line flows. PDFs of line flows were finally constructed from their moments using Gram-Charlier expansion. The study found that at least 7th order Gram-Charlier expansions should be used to provide accurate output PDFs. The proposed approach was significantly faster than MC simulations.
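The moment/cumulant relationship and the propagation through linearized equations can be sketched as follows. This is an illustration using the standard textbook relations up to fourth order and assuming independent inputs; the function names and the sensitivity coefficients are hypothetical and the Gram-Charlier reconstruction step is not shown.

```python
def cumulants_from_raw_moments(m1, m2, m3, m4):
    """First four cumulants from the first four raw moments."""
    k1 = m1
    k2 = m2 - m1**2
    k3 = m3 - 3 * m2 * m1 + 2 * m1**3
    k4 = m4 - 4 * m3 * m1 - 3 * m2**2 + 12 * m2 * m1**2 - 6 * m1**4
    return [k1, k2, k3, k4]


def linear_combination_cumulants(coeffs, input_cumulants):
    """Cumulants of sum(a_i * X_i) for independent X_i.

    The order-n cumulant of a_i * X_i scales with a_i**n, and cumulants of
    independent variables add.
    """
    orders = len(input_cumulants[0])
    return [
        sum(a ** (n + 1) * ks[n] for a, ks in zip(coeffs, input_cumulants))
        for n in range(orders)
    ]


# Raw moments of a normal injection with mean 2 and standard deviation 3.
injection_1 = cumulants_from_raw_moments(2.0, 13.0, 62.0, 475.0)
injection_2 = [1.0, 4.0, 0.0, 0.0]  # a second normal injection, given as cumulants
# Hypothetical linearized sensitivities of one line flow to the two injections.
flow_cumulants = linear_combination_cumulants([0.5, -1.0], [injection_1, injection_2])
```

Because cumulants of independent inputs simply add, no convolution is needed, which is consistent with the speed advantage reported over MC simulations.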
The different variations of PLF mentioned so far are all characterized as analytical PLF methods (refer to Figure 1). Analytical PLF methods still perform deterministic power flow calculations; however, they employ statistical theories and probabilistic approaches to model the input uncertainties and determine the corresponding output uncertainty range in a computationally efficient manner (compared to quasi-deterministic methods).
Consider some variable $y$ which is a function of random variables $\{v_1, v_2, \dots, v_{N_V}\}$. Let $y = f(v_1, v_2, \dots, v_{N_V})$, with $f$ being a deterministic function and $N_V$ being the number of random variables. A point estimation method is intended to approximate the first few moments of $y$, and thereby estimate its PDF, by evaluating $f$ a number of times around each random input variable.
While this may seem similar to a MC simulation, the crucial difference is that point estimation methods concentrate the statistical information on a few points for each random input, thereby requiring a significantly lower number of evaluations for a large number of random inputs. For comparison, MC has a complexity of $O(M^N)$, varying exponentially with the grid size (number of buses $M$) and the number of random variables $N$.
[Figure 1: Influential papers in the historical PLF and P-OPF literature. Works shown: Borkowska (1974), Allan (1981), Viviani (1981), Hatziargyriou (1993), Zhang (2004), Su (2005), Morales (2007), Schellenberg (2005). Annotations: first to propose probabilistic evaluation of power flows (PLF); convolution to evaluate output density functions; linearized load flow functions; validated the use of linear load flow functions; replaced convolution with more efficient FFT; proposed probabilistic model for wind turbines; proposed point estimation algorithm for PLF.]
A simultaneous effort was underway to develop probabilistic models for the OPF and optimal dispatch problems. One of the pioneering papers to propose a probabilistic OPF (P-OPF) method was that of (Viviani & Heydt, 1981). Dispatchable generators were modeled using second-order cost functions, and the control vector (for ideal generation levels) was modeled as a vector of PDFs using Gram-Charlier expansion. The results showed excellent agreement with MC simulations for a small (8-bus) system, which was used as a test case due to the limited computational capabilities of the time. [...] technology allowed the method to be tested on a much larger 118-bus system. The simulations showed that the proposed method provided highly accurate results with a computational efficiency an order of magnitude better than MC.
This section provided a review of some of the most influential papers in the historical literature which pioneered the application of probabilistic and statistical theories to power flow and OPF methods (represented in Fig. 1). From this historical literature review, one can categorize power flow and OPF methods in traditional/centralized power systems into four categories (as shown in Figure 2): deterministic, quasi-deterministic, analytical PPF, and approximate PPF. The different advantages and drawbacks of these methods are qualitatively compared in Table 1, according to the evaluated literature.

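[Table 1: Qualitative comparison of deterministic, quasi-deterministic, analytical PPF, and approximate PPF methods. Criteria:
• Computational expense (running time and memory).
• Capability to incorporate uncertain/random variables: ❌ ✔ ✔ ✔
• Grid physical model replaceable with historical data (ML extension possible).]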

State-of-the-Art Review: Operation of Modern Smart Grids
In modern SGs, Internet-of-Things (IoT) enabling creates an abundance of measured data from even the smallest devices in the system, which is in fact the main identifier of SGs compared to a traditional grid. This largely eliminates uncertainty caused by missing/unmeasurable variables in SGs. However, the combined effect of distributed energy resource (DER) proliferation, increased RES penetration, and highly dynamic loads due to demand-side management (DSM) and demand response (DR) policies results in new sources of uncertainty, even for a centralized system operator with global data access (Kotsalos et al., 2019; Sedhom et al., 2021).
On top of that, recent technological, societal, and policy changes are resulting in a call for decentralized operation of SGs, associated with paradigms whose names are increasingly seen in scientific and technical literature, such as citizen-run energy communities, peer-to-peer (P2P) energy trading, and the Internet-of-Energy (IoE). In these paradigms, a transition from a centralized operation structure to a decentralized one is advocated (Green & Newman, 2017).
While there are several technical, economic, and environmental benefits of this shift, it adds a further layer of operational challenges. Contrary to a central operator, decentralized agents would only have access to local data, adding a significant level of uncertainty and inaccessible information from other parts of the grid. In this case, a deterministic power flow or OPF model cannot be constructed by multiple individual agents, to whom information (including the physical model) on other regions of the grid may not always be known (Pasetti et al., 2018).
To deal with this challenge, two solution efforts exist. The deterministic approach involves the use of decomposition techniques, in which deterministic power flow or OPF global functions are decomposed into a set of local ones to be solved by individual agents, with each local function being dependent on the solutions of the local functions of other agents. Therefore, an iterative procedure is employed in which local solutions are exchanged between agents until convergence is achieved for all local functions (i.e., a global consensus is found).
However, multiple drawbacks are associated with this approach. First, decomposition techniques often suffer from convergence issues and are difficult to generalize for use with a generic power system configuration (i.e., they must be tuned/configured for each system to achieve acceptable convergence).
Second, the exchange of local function solutions between local agents must be highly synchronized for real-world applicability, making it heavily reliant on secure communication infrastructures with low latency. Finally, the incorporation of uncertain variables is very complex, if not impossible, in real-world applications (Christakou et al., 2017a, 2017b; Munsing et al., 2017).
This leads to the second solution effort: revisiting probabilistic methods and their ML successors as viable solutions for the decentralized operation of SGs without the aforementioned problems. IoT-enabling and cloud computing capabilities now make it possible for probabilistic/ML algorithms to replace centralized system operators (Alimi et al., 2020). In traditional systems, the main drawback of these methods was reliance on historical data, which is no longer an issue in modern SGs (on the contrary, data redundancy is often brought up as an issue in the IoT paradigm).
The challenge in this case is to develop new algorithms in which agents can cooperate to achieve optimal global performance (as in OPF analysis), without sharing private/personal data to fit the new fully [...]
From the conducted review of pioneering works and the subsequent categorization of methods employed in traditional/centralized power systems, it was evident that the fourth category, approximate PPF/P-OPF methods, comprises the predecessors of modern-day probabilistic and ML algorithms (Alimi et al., 2020). A major step in this evolution is the replacement of simple point estimates with non-parametric methods, which do not assume any statistical properties of the inputs to be known a priori.
Moreover, a point estimation (i.e., a deterministic calculation at the sampled points) is not needed and can be fully replaced with a historical dataset of input/output pairs, from which non-parametric methods directly estimate density distributions and corresponding output PDFs. Thus, the approach becomes purely a ML one where knowledge of a grid physical model is unnecessary. Kernel density estimation (KDE) has become the most popular non-parametric method in the scientific literature, not only in the field of power systems but across the different applications of ML, due to its reliability and computational efficiency (Kamalov, 2020).
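For readers unfamiliar with KDE, a minimal univariate sketch with a Gaussian kernel is shown below. The sample values and the fixed bandwidth are assumptions chosen for illustration; bandwidth selection rules are not covered here.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a density function estimated from samples with a Gaussian kernel."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        # Each historical sample contributes one Gaussian bump centered on itself.
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)

    return density


# Hypothetical historical observations of one variable (e.g., wind power in MW).
samples = [1.0, 1.5, 2.2, 2.4, 3.1]
pdf = gaussian_kde(samples, bandwidth=0.4)

# Numerical check that the estimated density integrates to about one.
step = 0.01
total = sum(pdf(-3.0 + k * step) for k in range(1001)) * step
```

No distribution family is assumed for the data; the shape of the estimate comes entirely from the samples, which is what distinguishes it from parameter estimation.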
(Cao & Yan, 2017) performed a two-stage P-OPF analysis on a grid with multiple wind farms. First, input PDFs of wind generation were estimated, followed by MC simulations in which sampling was done based on the generated PDFs. By constructing a combined PDF for all wind farms with wind speed dependence, its effects on the P-OPF results were analyzed and compared to using individual PDFs of the wind farms. By comparing KDE (non-parametric) and parameter estimation, the PDF produced by KDE and the corresponding P-OPF results were found to be more accurate than those of parameter estimation. The results of these studies are all in agreement with (Cao & Yan, 2017), verifying that the use of non-parametric methods in general, and KDE in particular, to estimate uncertain input variables in PPF is superior to parametric techniques. The aforementioned studies solely employed KDE as a pre-processing method, performing quasi-deterministic power flow analysis rather than exploiting KDE's full potential to replace MC simulations for PPF, which was done by other works in the literature.

Novel Contributions and Paper Organization
In this work, a novel ML algorithm for fully decentralized power flow management of SGs is proposed, formulated, implemented, and tested. The proposed method was inspired by the forecasting models of [...]
• A conceptual model is constructed for the transition from a fully centralized operation of a SG to a decentralized one, proposing the transition scheme between the two paradigms.
• In a fully decentralized SG run by local agents, a novel ML algorithm is proposed and formulated to enable this transition into cloud-based fully decentralized system operation.
• The proposed algorithm relies solely on the local historical data of each agent to accurately predict the optimal control action, without being given any information on the physical system outside their local zones (i.e., the full grid structure is unknown) and without access to historical data from other agents. As such, the proposed algorithm is designed to deal not only with variable uncertainties, but also with missing/lacking information in a decentralized system.
• The proposed algorithm incorporates the capability of cloud-based cooperative information exchange without sharing private/raw data (e.g., local historical datasets or control actions taken locally). This is achieved by proposing a new concept of an s-index vector, which is encoded information that can be shared between agents to improve their control action predictions without sharing raw information.
The rest of this manuscript is organized as follows. In Section 2, the conceptual model of the proposed approach and its mathematical formulation are presented. In Section 3, the modified IEEE 24-bus test system used to test the proposed approach is detailed, along with the data used to generate the historical datasets. In Section 4, a high-resolution (15-minute) week-ahead test case is performed for three different case studies, and the results are shown and validated against the deterministic solution based on AC-OPF. The studies investigate the effect of parameter tuning and the impact of the incorporated cooperative information transaction model, and compare the performance with a Neural Network (NN). A discussion of the implications and limitations of the proposed algorithm is presented in Section 5, along with recommendations for future and follow-up work. The final conclusions are summarized in Section 6.

Proposed Methodology
In this section, the conceptual model for decentralized operation of a SG is described. Then, the mathematical formulation of the proposed ML method for fully decentralized OPF management in this paradigm is presented. Finally, the cooperative information exchange capability of the proposed algorithm is illustrated.

Conceptual Model
Fig. 3 illustrates the conceptual model for a transition between a centralized (left) and a decentralized (right) operation paradigm for an electrical power grid. In the centralized paradigm, the system operator would have global access to all measurable variables in the system and issue control signals to all controllable ones. For a given grid, multiple overlaying models exist (e.g., corresponding to functions of real-time operation and dispatch, operation planning and unit commitment, voltage and frequency control, etc.). For each of those functions, the system operator would have a historical log of all input (measured) variables from the grid and the corresponding control actions taken (historical system states). In the case of OPF management, the logged historical system states can correspond to load values of the network and the corresponding ideal generation levels of all dispatchable generators.
On the right of Fig. 3 [...]
In the transition from the centralized to the decentralized paradigm, each agent would be handed a subset of the system operator's historical dataset, containing only the input/output variables specific to its respective zone. From this point onwards, agents utilize their individual historical datasets to manage local zones of control using a ML approach which continuously updates their datasets as new control signals are issued for new system states.
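The hand-over of zone-specific data can be pictured with the short sketch below. The record layout, the variable names (`load_bus1`, `U1`, and so on), and the values are hypothetical and only serve to show the filtering step.

```python
def zone_history(central_history, zone_inputs, zone_outputs):
    """Keep only the input/output variables that belong to one zone."""
    return [
        {
            "inputs": {name: state["inputs"][name] for name in zone_inputs},
            "outputs": {name: state["outputs"][name] for name in zone_outputs},
        }
        for state in central_history
    ]


# Hypothetical two-state log of the central operator (loads and generation in MW).
central_history = [
    {"inputs": {"load_bus1": 90.0, "load_bus2": 70.0, "load_bus7": 110.0},
     "outputs": {"U1": 120.0, "U5": 150.0}},
    {"inputs": {"load_bus1": 95.0, "load_bus2": 72.0, "load_bus7": 100.0},
     "outputs": {"U1": 125.0, "U5": 142.0}},
]
agent_history = zone_history(central_history, ["load_bus1", "load_bus2"], ["U1"])
```

Each agent's dataset keeps the same ordering of historical states as the central log, so a given position refers to the same system state for every agent.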
At the beginning of the transition, the information inherited from the retired system operator, although partial, can guarantee accurate prediction of control actions which lead to both local and global optimization of system performance. However, with time the information provided from the time of the centralized paradigm becomes obsolete, and thus a continuous cooperation scheme between the agents is required to exchange useful information without having to share private and/or raw local data.

Generalized Mathematical Formulation
Consider a system that at any given instant has a set of measurable independent input variables $V = \{v_1, v_2, \dots, v_{N_V}\}$. To ensure global applicability of the developed ML method, all the system input variables are considered as uncertain variables (i.e., $|V| = N_V$), as shown in (1). Moreover, no knowledge of the physical system model is available a priori, and thus no input variable interdependencies are assumed.
A set $Y = \{y_1, y_2, \dots, y_{N_y}\}$ is defined in (2), containing $N_y$ output variables, each of which is a function of the input vector through an unknown system model, as shown in (3).
Thus, for all recorded system states, a historical dataset of operation $H$ exists for the system, matching the input and output vectors as shown in (4)-(5).
For an updated system state, a new set of input variables $V_{new}$ is measured as shown in (6), for which a corresponding set of output variables $Y_{new}$ exists as shown in (7).
The objective of the proposed ML algorithm is to predict the value of $Y_{new}$, provided only $V_{new}$ and $H$, without any knowledge of the physical system model. The proposed ML algorithm (visualized in Fig. 4) comprises three main steps.
In the first step, a similarity index (s-index) between each historical system state and the new one is calculated using Nadaraya-Watson KDE (NW-KDE), which evaluates a product of Gaussian kernel functions of all variables (Monteiro et al., 2018), as shown in (8), to obtain an s-index with a value between 0 and 1 for each historical state, resulting in the vector of s-indices $S = \{s_1, s_2, \dots\}$ as shown in (9)-(10).
For each variable $i$, the Gaussian kernel function bandwidth $b_i$ determines the sampling window relative to the statistical range of the historical samples of this variable. A coefficient can be used to tune the individual bandwidth value for each variable as shown in (9).
Accordingly, this coefficient serves as a normalized tuning coefficient whose value can be set between 0 (exclusive) and 1 (inclusive) as shown in (12), corresponding to a bandwidth value between zero (exclusive) and the maximum range of the historical values of the variable (inclusive).
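The first step can be sketched as follows. Since Eq. (8) is not reproduced here, the exact kernel is an assumption: an unnormalized Gaussian kernel exp(-d²/(2·b_i²)) per variable, with b_i taken as the normalized coefficient times the historical range of variable i. The function and argument names are hypothetical.

```python
import math


def bandwidths(history_inputs, coefficients):
    """b_i = coefficient_i * (max - min) of variable i over the historical samples."""
    result = []
    for column, coef in zip(zip(*history_inputs), coefficients):
        if not 0.0 < coef <= 1.0:
            raise ValueError("normalized coefficient must lie in (0, 1]")
        spread = max(column) - min(column)
        if spread == 0.0:
            raise ValueError("variable has no spread in the historical data")
        result.append(coef * spread)
    return result


def s_index_vector(history_inputs, new_inputs, coefficients):
    """One s-index in (0, 1] per historical state (assumed kernel form)."""
    b = bandwidths(history_inputs, coefficients)
    s = []
    for state in history_inputs:
        # A product of Gaussian kernels equals the exponential of the summed exponents.
        exponent = sum(
            ((x_new - x_old) / b_i) ** 2
            for x_new, x_old, b_i in zip(new_inputs, state, b)
        )
        s.append(math.exp(-0.5 * exponent))
    return s


# Hypothetical history of two bus loads (MW) and a new measured state.
history_inputs = [[100.0, 50.0], [110.0, 55.0], [150.0, 80.0]]
s = s_index_vector(history_inputs, [110.0, 55.0], coefficients=[0.5, 0.5])
```

With this kernel form, a historical state identical to the new one receives an s-index of exactly 1, and smaller coefficients narrow the window so that dissimilar states decay towards 0 faster.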
The second step in the proposed method is to generate an activation vector for the most similar historical cases based on the calculated s-index vector. This can be done in two ways. First, a cut-off value can be set to activate all cases with an s-index above a certain value. The second way, which is the one used in this study, is to activate a fixed number of the top $N_S$ historical cases from $S$ (the highest $N_S$ s-indices). In both cases, the result is an activation vector $A = \{a_1, a_2, \dots\}$ whose elements are binary values, as shown in (13)-(14).
The activation vector is used to extract the most similar cases from the historical dataset. Thus, a subset $H_S$ is extracted by discarding all cases whose corresponding activation value is zero. The original indices of the extracted cases $H_S$ (in the original set $H$) are preserved in vector $I_S$. This is represented mathematically using (15)-(17) and is illustrated in Fig. 4.
The third and final step is to predict the output variables $Y_{new}$ using an ensemble of the extracted most similar historical cases. The simplest approach is to calculate the mean value of each output variable from the extracted historical values, as shown in (18).
Confidence intervals can be obtained by applying simple univariate KDE to obtain the output variables' PDFs and the corresponding confidence interval bounds, as represented mathematically in (19)-(21) in terms of the x-th percentile. The lower and upper bounds of the x-th percentile for each predicted output are given in (20) and (21), respectively.
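The second and third steps can be sketched as below, using the top-N_S activation rule and the mean-value ensemble described above. The function names and the small dataset are hypothetical, and the KDE-based confidence bounds are not included in this sketch.

```python
import statistics


def activation_vector(s_indices, n_similar):
    """Binary vector marking the n_similar historical states with the highest s-index."""
    ranked = sorted(range(len(s_indices)), key=lambda h: s_indices[h], reverse=True)
    chosen = set(ranked[:n_similar])
    return [1 if h in chosen else 0 for h in range(len(s_indices))]


def extract_similar(history_outputs, activation):
    """Return the preserved original indices and the extracted output rows."""
    indices = [h for h, a in enumerate(activation) if a == 1]
    return indices, [history_outputs[h] for h in indices]


def predict_outputs(similar_outputs):
    """Mean of each output variable over the extracted historical states."""
    return [statistics.mean(column) for column in zip(*similar_outputs)]


# Hypothetical s-indices and historical generation levels (MW) of two utilities.
s_indices = [0.87, 1.0, 0.07, 0.5]
history_outputs = [[10, 20], [12, 22], [30, 40], [15, 25]]

activation = activation_vector(s_indices, n_similar=2)
indices, similar = extract_similar(history_outputs, activation)
prediction = predict_outputs(similar)
```

Only sorting, indexing, and averaging are involved, so the cost of a prediction grows roughly with the size of the historical dataset rather than with the complexity of the grid model.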

Application to Decentralized OPF with Cooperative Information Exchange
In the case of an OPF problem, the input variables are the bus loads and the output variables are the ideal generation levels. In the decentralized operation paradigm, each agent would only have a historical dataset with variables in its region of operation. From the presented mathematical formulation, it can be seen that the proposed ML model is independent of the physical grid model, and thus even with a limited number of historical variables the prediction of the outputs can still be obtained.
One of the novel contributions of this paper is the capability of the proposed method to enable cooperative information exchange without sharing private/raw data (e.g., local historical datasets or control actions taken locally). With the proposed method, this can effectively be performed by publicly sharing the activation vector $A$ or the corresponding indices $I_S$.
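One way to picture the exchange is sketched below. How an agent combines the indices it receives is not specified in this section, so the merging rule used here (a plain union of the agent's own and the received indices, followed by the mean-value prediction on local outputs) is purely an assumption for illustration; all names and values are hypothetical.

```python
def merge_shared_indices(own_indices, shared_indices_by_agent):
    """Union of locally activated indices and those broadcast by other agents."""
    merged = set(own_indices)
    for indices in shared_indices_by_agent.values():
        merged.update(indices)
    return sorted(merged)


def predict_from_indices(local_outputs, indices):
    """Mean of each local output variable over the selected historical states."""
    rows = [local_outputs[h] for h in indices]
    return [sum(column) / len(rows) for column in zip(*rows)]


# Only indices travel through the cloud; loads and generation values stay local.
own_indices = [0, 1]
cloud = {"U2": [1, 3], "microgrid_bus4": [3]}  # hypothetical broadcasts
local_outputs = [[10.0], [12.0], [30.0], [14.0]]  # this agent's private history (MW)

merged = merge_shared_indices(own_indices, cloud)
prediction = predict_from_indices(local_outputs, merged)
```

The receiving agent applies the shared indices to its own historical outputs only, so no raw measurement or control action of another agent is ever read.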
In this way, the exchanged indices of activated historical cases would greatly improve the output prediction of other agents by giving them insight into relevant historical cases from the perspective of other agents, and thereby improve the global performance of the system, without exposing any local/private information from the transacting agent's local database and without the need for any central coordination. Accordingly, a distributed energy cloud operation scheme is enabled by the proposed method, as illustrated in Fig. 5. The activation vectors therefore serve as encoded yet useful information, broadcast into a shared public energy cloud accessible by all agents in the decentralized system. A few final remarks about the proposed algorithm and cooperation scheme are noted before proceeding to the case study and analysis:
• The size of the historical database does not need to be the same for all agents. In case a distributed energy cloud operation is adopted, standardized implementation is foreseen. However, even in the unlikely case where no standard database size is present, activated historical cases can be shared based on their timestamps rather than their index in the dataset.
• The tunable parameters of the proposed method are the normalized bandwidth coefficients and $N_S$. While tuning these parameters can improve the prediction accuracy, it will be shown in the next sections that the proposed algorithm is highly versatile, such that applying default values for all the parameters still guarantees highly accurate output predictions.
• The proposed method is highly computationally efficient. While the computational implementation is demanding, all operations are based on simple direct array multiplications and manipulations. The high computational efficiency of the implementation will also be demonstrated in the next sections.
• The proposed algorithm was implemented as original code by the authors using MATLAB R2020b, on a standard laptop computer with an Intel Core i7-8550U CPU @ 1.80 GHz, 16.0 GB RAM, and a Windows 10 64-bit operating system.

Modified IEEE 24-Bus Test System
To demonstrate and validate the proposed algorithm, a case study was constructed based on a modified IEEE 24-bus reliability test system (RTS), whose single line diagram is shown in Fig. 6. The 24-bus network has 33 transmission lines, in addition to five transformers separating the two voltage levels in the network (138 kV and 230 kV). A total of 33 generators (G1, G2, … , G33), including one synchronous generator (G15), are incorporated.
For the purposes of power flow analyses, active power generators that 1) are connected to the same bus and 2) have the same cost functions are aggregated as a single generation station or utility (Espinosa-Juárez & Hernández, 2007). Applying this to the considered 24-bus RTS results in 14 utilities (U1, U2, … , U14) being the active power generation stations of the system. In the decentralized operation paradigm, these generation utilities are considered to be the decentralized operating agents of the system, spread across five zones of operation (Z1, Z2, Z3, Z4, Z5), as is also shown in Fig. 6.
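The aggregation rule can be sketched as a simple grouping. The generator records below (bus numbers, quadratic cost coefficients, capacities) are invented for illustration and do not reproduce the RTS data; summing the capacities of the grouped units is likewise an assumption of this sketch.

```python
from collections import defaultdict


def aggregate_generators(generators):
    """Group generators sharing a bus and cost coefficients into one utility."""
    groups = defaultdict(list)
    for gen in generators:
        groups[(gen["bus"], gen["cost"])].append(gen)

    utilities = []
    ordered = sorted(groups.items(), key=lambda item: item[0])
    for number, ((bus, cost), members) in enumerate(ordered, start=1):
        utilities.append({
            "name": f"U{number}",
            "bus": bus,
            "cost": cost,
            "generators": [g["name"] for g in members],
            "p_max": sum(g["p_max"] for g in members),  # assumed: capacities add up
        })
    return utilities


# Hypothetical generator data: cost = (a, b, c) of a quadratic cost function.
generators = [
    {"name": "G1", "bus": 1, "cost": (0.01, 20.0, 100.0), "p_max": 20.0},
    {"name": "G2", "bus": 1, "cost": (0.01, 20.0, 100.0), "p_max": 20.0},
    {"name": "G3", "bus": 1, "cost": (0.02, 15.0, 50.0), "p_max": 76.0},
    {"name": "G4", "bus": 2, "cost": (0.01, 20.0, 100.0), "p_max": 20.0},
]
utilities = aggregate_generators(generators)
```

G1 and G2 merge into one utility because both conditions hold, while G3 (same bus, different cost) and G4 (same cost, different bus) remain separate.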
The zones are listed in Table 2, including the corresponding buses and utilities therein. Accordingly, in the case of an OPF analysis, each utility as a decentralized operating agent has access only to historical load values from its own zone (i.e., loads at buses inside its zone). The utilities do not have knowledge of each other's historical generation values (only their own). Table 3 lists all 14 utilities, the corresponding zone, incorporated generators, and their respective operating costs represented as coefficients of a quadratic cost function model as described in (Javadi et al., 2019). In the decentralized operation paradigm, the utilities do not have knowledge of the full grid topology (the incidence matrix is unknown), only having knowledge of transmission line connections inside their respective zones.
It can be observed that the designated zones are diverse in terms of both size and structure. This was intended to test the versatility of the proposed algorithm. Furthermore, load buses 4, 6, 9, and 10 are not incorporated in any zone, and therefore historical information from them is only provided when the energy cloud / cooperative information exchange is enabled.
This way, they are considered self-managing microgrids (with self-consumption) which can still share their activation vectors in the proposed scheme for every new system state but have no control actions over the system. In this sense, the case study can also demonstrate the applicability of the proposed algorithm for the decentralized management of interconnected multi-microgrids with varying sizes, local generation capacity, and self-consumption.

Synthetically Generated Historical Data
To synthetically generate a historical dataset of centralized operation (as described in Section 2.1) that reflects realistic conditions, a typical annual load profile of a transmission system, provided by the Portuguese Energy Regulation Services Entity (ERSE), was used. The load profile has a high (15-minute) resolution for the full year of 2019 (35040 time steps).
To apply the normalized load profile to the current network, the annual peak load was set to correspond to the marginal operation of the network at maximum loadability (considering bus voltage angle and transmission line power limits). To determine this, the network is modeled and simulated using MATPOWER 7.0, and the total load of the system is gradually increased by incrementing individual bus loads while maintaining their original power factor (pf). The maximum loadability occurs at the point when any infinitesimal increase in individual bus loads would render an AC-OPF solution infeasible (violating network constraints). By performing this, the maximum loadability of this network was found to be 3334.50 MW (total load). The maximum (active power) loadability of individual load buses ($P_D^{max}$) and their pf are detailed in Table 4 (note that non-load buses have a pf of zero).
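The gradual load increase can be sketched as the loop below. The feasibility check is a hypothetical callback standing in for an AC-OPF run (e.g., one MATPOWER solve per load level); the toy check, the step size, and the uniform scaling of all bus loads are assumptions of this sketch. Scaling active power uniformly at a fixed power factor scales reactive power by the same factor, so only active loads are tracked here.

```python
def max_loadability(base_loads, is_feasible, step=0.01, max_steps=100000):
    """Scale all bus loads in small increments until the check reports infeasibility."""
    best = None
    for k in range(max_steps):
        scale = 1.0 + k * step
        loads = [scale * p for p in base_loads]
        if not is_feasible(loads):
            break
        best = (scale, loads)  # last load level that was still feasible
    if best is None:
        raise ValueError("base case is already infeasible")
    return best


def toy_feasibility(loads):
    # Hypothetical stand-in for "the AC-OPF converges without violating constraints".
    return sum(loads) <= 300.0


scale, loads = max_loadability([100.0, 50.0, 50.0], toy_feasibility)
```

The step size sets the resolution of the result: the true limit lies between the returned scale and the next increment.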
In this way, the normalized typical load (based on Portuguese transmission systems) can be applied to the current network by setting the annual peak load at each bus to its maximum loadability. A synthesized historical dataset can then be generated by performing the following steps:
Step 1: Calculate the active power load at each bus b, and the corresponding reactive power load, for each timestep in the load profiles.
Step 2: Perform a deterministic AC-OPF calculation to determine the corresponding active power generation levels of each utility, for each timestep in the load profiles.
With this, a historical dataset for a full year (15-minute resolution) is obtained. The day of the year and hour of the day are added as independent input variables in addition to the power loads at all buses. The total load for the generated historical operation is plotted in Fig. 7.
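Step 1 can be illustrated with a short sketch. The function name `bus_loads` and its arguments are hypothetical; the sketch assumes that the active load is the normalized profile value multiplied by the bus maximum loadability, and that the reactive load follows from the constant power factor through the standard relation Q = P tan(arccos(pf)). Buses listed with a pf of zero are treated as non-load buses and assigned no load.

```python
import math

STEPS_PER_DAY = 24 * 4                 # 15-minute resolution
STEPS_PER_YEAR = 365 * STEPS_PER_DAY   # 2019 is not a leap year


def bus_loads(profile_value: float, pd_max_mw: float, pf: float) -> tuple:
    """Return (P in MW, Q in MVAr) of one bus for one normalized profile value.

    profile_value: normalized load in [0, 1] (1 corresponds to the annual peak)
    pd_max_mw:     maximum active power loadability of the bus
    pf:            power factor of the bus; 0 marks a non-load bus (assumption)
    """
    if pf == 0.0:
        return 0.0, 0.0
    p = profile_value * pd_max_mw
    q = p * math.tan(math.acos(pf))
    return p, q


# Hypothetical bus with a 100 MW peak and a power factor of 0.8
p_peak, q_peak = bus_loads(1.0, 100.0, 0.8)
```

The timestep count follows directly from the resolution: 365 days of 96 quarter-hour steps give the 35040 time steps of the annual profile.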

Performed Analyses
The case study is now fully constructed, and validation studies can be performed. Using the generated historical datasets, the proposed algorithm is tested by employing fully decentralized week-ahead operation planning for OPF in the grid at a very high resolution (15 minutes). The total system load profile for the test week, shown in Fig. 8, was generated based on the average summer values of the yearly load profile. Individual input variables (bus loads) at each time step are generated by maintaining the ratio of load buses in the test system and dividing the total system load accordingly. Three studies were performed:
Study 1: In the first study, the validity of the proposed algorithm is demonstrated, and the influence of tunable parameters is showcased. In this study, the full algorithm is employed, including the cooperative information exchange scheme between the agents. Coefficients were defined as follows:
αin: coefficient for historical input variables from sources which are physically connected to the utility (i.e., load values of the same bus).
αout: coefficient for historical input variables which are not physically connected to the utility (i.e., load values of buses other than that of the utility).
For all studies, the results are compared against the centralized scenario with a deterministic AC-OPF solution using MATPOWER. The results of all three studies are presented in the next section.

Study 1: Validation and Parameter Tuning
In the first study, the proposed algorithm is validated and the effect of tunable parameters (normalized bandwidth coefficients and NS) is assessed. Two scenarios are compared. In the first, all normalized bandwidth coefficients are set to 0.5. This can be considered the "default" value of the coefficients. In the second scenario, the coefficients are tuned by generating random value combinations until the error falls below a certain threshold. Both scenarios are assessed relative to the exact values obtained by a centralized AC-OPF solution. Here, it is noted that on the machine used for implementation, the algorithm run time was recorded to be less than 4 seconds. Tuning the parameters took less than 1 minute for each utility.
This was approximately the same as the training time of the NN used in the third study. Therefore, the tuning process was very fast, even for the demanding high-resolution week-ahead test case considered.
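The tuning procedure described above (random value combinations until the error falls below a threshold) can be sketched as a simple random search. The names `tune_random` and `evaluate_mape`, the sampling range of [0, 1] for the normalized coefficients, the trial limit, and the toy error function are assumptions made for illustration; the tuning of NS is omitted for brevity.

```python
import random
from typing import Callable, Dict, Tuple

COEFFS = ("alpha_in", "alpha_out", "alpha_day", "alpha_hour")


def tune_random(
    evaluate_mape: Callable[[Dict[str, float]], float],
    threshold: float,
    max_trials: int = 10_000,
    seed: int = 0,
) -> Tuple[Dict[str, float], float]:
    """Random search over the bandwidth coefficients.

    Starts from the default value of 0.5 for every coefficient and keeps the
    best combination seen so far; stops once the error is below the threshold
    or the trial budget is exhausted.
    """
    rng = random.Random(seed)
    best = {name: 0.5 for name in COEFFS}
    best_err = evaluate_mape(best)
    for _ in range(max_trials):
        if best_err < threshold:
            break
        candidate = {name: rng.uniform(0.0, 1.0) for name in COEFFS}
        err = evaluate_mape(candidate)
        if err < best_err:
            best, best_err = candidate, err
    return best, best_err


# Toy error surface standing in for "MAPE relative to the centralized AC-OPF".
def toy_error(coeffs: Dict[str, float]) -> float:
    return sum(abs(value - 0.3) for value in coeffs.values())


default_err = toy_error({name: 0.5 for name in COEFFS})
tuned, tuned_err = tune_random(toy_error, threshold=0.4)
```

Because the search is seeded, repeated runs return the same tuned coefficients, which keeps the tuned scenario reproducible.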
Plots of the ideal generation levels of all utilities controlled by each of the decentralized operating agents are shown in Fig. 9, and the detailed results for all utilities are listed in Table 5. It is noted that U1, U3, and U12 are found to have a zero-load factor throughout the test week as they act as baseload generators for the network. The load factor of each utility is shown in Fig. 10 based on the centralized OPF solution. Therefore, the baseload generation utilities U1, U3, and U12 were excluded from the results of this study and the two subsequent ones. The overloaded operating conditions that were used to assert the effectiveness of the proposed method can be seen in Fig. 10, with all generation centers being committed throughout the entire test week. By analyzing the results shown in Fig. 9 and Table 5, several detailed observations can be made.
First, with tuned parameters the mean absolute percentage error (MAPE), relative to the centralized AC-OPF solution, was well below 0.1% for all utilities in all zones. In all the plots of Fig. 9, it is seen that the lines corresponding to the exact centralized OPF solution and those of the values predicted by the proposed algorithm closely overlap, being hardly distinguishable.
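For reference, the error metric can be computed as in the following sketch, which takes MAPE to be the usual mean absolute percentage error with the centralized AC-OPF values as the reference. The function name `mape` and the choice to skip time steps whose reference value is zero are assumptions of this sketch.

```python
from typing import Sequence


def mape(reference: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error (in %) of predicted vs. reference values."""
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")
    # Time steps with a zero reference value are skipped to avoid division by zero.
    terms = [abs((r - p) / r) for r, p in zip(reference, predicted) if r != 0]
    if not terms:
        raise ValueError("no non-zero reference values to compare against")
    return 100.0 * sum(terms) / len(terms)


# Hypothetical generation levels in MW: each prediction is off by 1 %
error_pct = mape([100.0, 200.0], [101.0, 198.0])
```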
Second, using untuned parameters set to the default arbitrary values (10 for Ns and 0.5 for αout, αin, αday, and αhour), the ideal generation profile was still estimated using the proposed algorithm with very high accuracy. Apart from U6 in Zone 3, the ideal generation levels of all utilities in the network were accurately predicted with a MAPE less than or equal to 1%, which is very satisfactory for a worst-case performance without using any parameter tuning or information exchange.
Third, it is noted that the utility predictions most affected by parameter tuning were those of U5 (Zone 2) and U6 (Zone 3), both of which incorporate large high-cost generators, resulting in high-frequency fluctuations during peak load hours. For those utilities, using the tuned parameters reduced the MAPE to below 0.1%. U7 was similarly critical due to the three acute ramps in generation during the test week, which were also very accurately predicted after parameter tuning, with a MAPE of less than 0.1%. Finally, U10 and U11, being high load factor utilities (Fig. 10) in an energy-exporting zone (Fig. 11), were sensitive to parameter tuning (although the untuned solution still had a 1% error, which is satisfactory for the worst-case performance scenario).
By performing this study, the validity of the proposed algorithm was demonstrated, and the influence of tunable parameters was showcased. With proper parameter tuning, the algorithm was shown to be exceptionally accurate in predicting the ideal generation levels for each decentralized agent in the decentralized operation paradigm, relying solely on locally available historical data and with neither knowledge of the physical grid model nor any raw information exchange between the zones. The robustness of the proposed algorithm was showcased by guaranteeing satisfactory prediction accuracy even when using untuned, arbitrarily chosen parameters, which serves as the worst-case operation scenario.

Study 2: Effect of Cooperative Information Exchange
In the second study, the effect of the proposed cloud-based information exchange framework was investigated. In this case, two scenarios are simulated and compared, both using tuned parameters: with and without the cooperative information exchange framework. Once again, the MAPE of the predicted ideal generation levels of each utility is calculated relative to the exact values of the centralized AC-OPF solution. The simulation results of this study are shown in Fig. 12 and Table 6. By analyzing these results, the following observations are made.
First, the results show that incorporating the proposed cloud-based cooperative information exchange framework results in a profound improvement in the accuracy of all predicted generation values. By using the activation code sharing method formulated in Section 2, all the decentralized agents in all zones were capable of reducing their ideal generation prediction errors to less than 0.1% without divulging any raw information or having any knowledge of the physical grid model.
Second, by referring to the data in Table 3 and the metrics in Fig. 11, one can see that the cooperative information exchange framework proves to be most beneficial for utilities that have expensive generators and exist in zones that are net exporters of energy. This is expected, since such utilities rely heavily on information from other parts of the grid, as they mainly respond to peak loads or to zones that are net importers of energy in the grid. This is most evident in the results of U10 in Zone 4. Being in the top energy-exporting region of the grid, the U10 operator can use the proposed cooperative information exchange to reduce the decentralized prediction error from 8.07% to 0.05%.
Finally, the influence of parameter tuning vs. the cooperative information exchange functionality can be assessed by comparing the results of this study with those of the previous one. Overall, it can be seen that the impact of the implemented cooperative information sharing model is higher than that of parameter tuning, both in terms of the confidence intervals (evident from the relative narrowing of the shaded regions of the plots in Fig. 9 and Fig. 12) and overall accuracy (MAPE). That said, there are specific cases where either functionality contributes to the prediction accuracy more significantly than the other. In the case of U8, the proposed algorithm has a MAPE of 0.033%. Removing the information exchange capability greatly increases the MAPE to 8.07% (Table 6), while using untuned parameters causes a much smaller increase to 0.323% (Table 5). U6 exhibits the opposite behavior, as using untuned parameters causes the MAPE to increase from 0.057% to 6.10% (Table 5), while removing the cooperative information exchange capability only results in a slight increase to 0.069% (Table 6). This suggests that parameter tuning is critical for large, expensive generation centers with high-frequency fluctuations during peak hours such as U6, while participating in the cloud-based cooperative information exchange framework is more beneficial for expensive generators in zones that are net energy exporters in the network.

Study 3: Comparison with Neural Network
In this third and final study, a prediction of the ideal generation values of each utility is obtained using a NN, to serve as a comparison between the developed ML algorithm and a well-established one. The same local historical datasets used as the inputs for the proposed algorithm are used to train the NN for each individual agent. In this case, the results of the proposed model and those of the NN, both being decentralized solutions, are evaluated using the MAPE relative to the centralized AC-OPF solution. The NN was trained and simulated for the utility that exhibited the highest MAPE in its respective zone (with tuned parameters and cooperative information exchange): U2 from Z1, U5 from Z2, U6 from Z3, U10 from Z4, and U7 from Z5.
By inspecting the results shown in Fig. 13 and Table 7, the following points are noted.
First, the proposed algorithm significantly outperforms the NN in all cases. The NN guarantees a MAPE < 0.5%, while for the proposed method the MAPE is < 0.1% in all cases. In terms of computational speed, it was mentioned that the training time of the NN was approximately the same as the tuning time of the proposed algorithm. However, a key difference is that the NN training process must be rerun for any new output variable introduced, while the proposed algorithm is tuned once for each utility / agent.
Consequently, the proposed algorithm outperforms the NN not only in terms of accuracy but also in terms of computational time, since the proposed method's average running time is 4 seconds (for the high-resolution week-ahead test case considered). It is also important to note that the NN results depend on the training process, which contains random elements, i.e., the results of the NN are different each time the training process is rerun (hence the averaged results presented in Table 7). This is not the case for the proposed algorithm, which provides the same results whenever the same historical dataset is used, making it more reliable than a NN. In the first study, it was shown that the proposed algorithm provides accurate predictions even without parameter tuning, guaranteeing much more robust applicability compared to a conventional NN algorithm.
Finally, it is very important to note that one of the main novel contributions of the proposed algorithm is its capability to accommodate cooperative information exchange to enhance the results of individual agents. A NN implementation (like other ML algorithms) does not accommodate this, since local datasets are used to train the NN. Therefore, to improve the NN results, it would be necessary to further augment or pre-process the historical data itself, and afterwards repeat the training process. In the case of the proposed algorithm, the designed cooperative information exchange framework allows agents to improve their results dynamically while the algorithm is running, adding a significant level of versatility to the proposed approach as opposed to a NN and other ML algorithms (not to mention the higher computational efficiency).
Table 5: Results of the first study: tuned parameters and MAPE.
worst-case scenario, the full benefits of the algorithm will not be retained, but the performance will still be better than that of widely available ML algorithms, as was demonstrated. However, there will be a loss of the potential improvement brought about by the cooperative information sharing capability. In realistic terms, operators will not be inclined to invest in deploying a new method unless they are sure its benefits are worth the effort. Thus, follow-up work must be performed 1) to concretely demonstrate this from an algorithm analysis perspective, and 2) from the point of view of overlapping research fields (i.e., informatics, economics, and legislation) to fully pave the way for the transition to this new paradigm.
From an algorithm analysis perspective, several future studies are recommended. First, a full sensitivity analysis of the physical grid considered is recommended, which would compare the algorithm performance when applied to power grids of different sizes, topologies, and densities (number of lines vs. number of buses). Second, the effect of the level of decentralization (which corresponds to the resulting number of decentralized agents or operating zones) on the performance of the algorithm should be evaluated. In the context of decentralized operation, multiple zones exist a priori. With the zones of operation corresponding to different decentralized operating agents, the physical definition of the zones is what the method is applied and adapted to, rather than being an imposition by the method itself. In real applications, the zones can range from different individual prosumers in an energy community to different microgrid operators in a multi-microgrid system. The application of the proposed method to similar newly proposed physical models of decentralized power systems should be investigated, and interactions such as the merging of several zones or the splitting of one should also be simulated, to cover the full spectrum of possible applications to future power system models.
From the perspective of the transition to fully decentralized operation and overlapping research fields, several paths for follow-up work can also be recommended. First, opportunities for business models involving transactive information exchange using peer-to-peer or cloud-based platforms for cooperative operation by decentralized agents can be investigated. Using the proposed cooperative information exchange framework, it was demonstrated that sharing the s-indices of one agent can have a significant positive effect on the obtained results of other agents. The value of this information can be quantified from an economic perspective, and business models for a transactive information framework can be proposed.
Moreover, from an informatics perspective, cloud-based services dedicated to this information sharing mechanism can be developed to fully leverage the operational capabilities and techno-economic benefits of the system and to ensure the security of the information transactions.

Declaration of Competing Interest
The Authors declare no conflicts or competing interests, from financial, institutional, or personal domains.

Following the development of PLF techniques in literature, another category of PLF, approximate PLF, was proposed. As will be shown subsequently, these are the direct predecessors of state-of-the-art work on probabilistic and ML algorithms in the most recent literature. The main reason for the delay in developing approximate PLF techniques is that the statistical/mathematical theories they are based on were first discovered around the same time as PLF itself (the first point estimation method was published in 1975 (Rosenblueth, 1975), one year after Borkowska's paper). Moreover, more modern computing technologies motivated the application of such methods in the field of electrical power systems (the first application of a point estimation method to PLF was in 2005 (Su, 2005)).

Figure 1: Historical timeline of pioneering papers in literature to apply statistical and probabilistic theories to power flow and OPF problems in traditional/centralized power systems.
More than two decades later, (Schellenberg et al., 2005) combined the work of (Viviani & Heydt, 1981) and (Zhang & Lee, 2004) and proposed a cumulant-based P-OPF method.More modern computing

Figure 2: Illustration and description of different categories of methods employed for power flow and OPF in traditional/centralized power systems.
- Solves branch loads for a specific set of load/input values.
- Generate large set of random inputs (preferably) based on their PDFs.
- Solve deterministic equations to obtain a solution for each sample.
- Construct PDF of branch loads from resulting set of output values.
- to simplify the eqs. (linear/multilinear models).
- Linear probabilistic equations solved for input PDFs using a variety of methods (most prominent are convolution/FFT and cumulants).
- Depending on the technique, branch flow PDFs obtained directly (e.g. convolution/FFT) or constructed from calculated branch flow moments.
- input variables: only central moment (mean) required; estimation of output PDFs
- Knowledge of distribution of input random variables not mandatory.
- Estimation of output PDFs by evaluating first few moments at selected points.
- Most common approaches include point estimation methods.
implementation of smart metering and communication infrastructures (Mohamed Lotfi et al., 2018; Thirugnanam et al., 2021).
decentralized SG management paradigm (Wu et al., 2021). Probabilistic and ML techniques make use of historical data to provide fast and accurate predictions of solution variables even in the presence of high levels of uncertainty. Moreover, these methods rely solely on statistical relationships between input and output variables without requiring a deterministic model of the physical system to be constructed (Alimi et al., 2020; Wu et al., 2017). This makes them ideal for dealing with the aforementioned problems of decentralized operation. Moreover, they do not suffer from the drawbacks of deterministic decomposition-based techniques (lack of general applicability, convergence issues, reliance on low-latency communication infrastructures, and difficulty in incorporating uncertain parameters) (Lin et al., 2019).
Other studies such as (Ren et al., 2017) and (Constante-Flores & Illindala, 2019) have performed similar analyses, in which KDE was used to estimate the PDFs of uncertain input variables (particularly from renewable generation).

(Liu et al., 2016) applied KDE for PPF analysis of 14- and 118-bus systems with high levels of uncertainty, relying on historical operation data measurements. The method demonstrated accurate results with respect to the field measurements. In (Nosratabadi et al., 2019), a KDE-based method was proposed for PPF of unbalanced distribution networks. The method was tested on modified IEEE 13- and 37-bus test systems and compared against MC-3000 and 2N+1 point estimation. The results showed that the proposed KDE-based PPF method was superior to both MC and point estimation in terms of both computational time and accuracy of results. (Abbasi, 2020) proposed and tested PPF algorithms based on holomorphic embedding, KDE, and saddle point approximation, comparing different approximate PPF techniques to analytical and quasi-deterministic ones. The proposed methods were tested on modified IEEE 14- and 118-bus test systems with high levels of uncertainty, and compared against MC-150000, 2N+1 point estimation, and other methods. The results clearly show the effectiveness of the proposed PPF methods and their superiority in terms of computational effort, while providing the same level of accuracy as MC simulations. The paper recognized the potential of approximate PPF methods in terms of their independence of the physical system model and flexibility in application, recommending their use for complex networks and energy management in modern SGs where historical operation data is available. Aside from operation, the application of KDE-based methods has recently been observed in other areas of SG research, namely in forecasting. Indeed, by employing PPF/P-OPF based on historical data with a ML model, the problem is modeled similarly to a forecasting problem in which a desired output is to be predicted based on historical inputs. Examples of this are (Monteiro et al., 2018), which proposed a cooperative forecasting model for electricity market prices based on KDE, and (M. Lotfi et al., 2020), which proposed a KDE-based ensemble algorithm for solar power forecasting.

(Monteiro et al., 2018) and (M. Lotfi et al., 2020), combining the cooperative approach of the former and the ensemble prediction of the latter. The novel contributions of this paper are listed as follows:
the decentralized paradigm is illustrated, in which the centralized system operator is replaced by individual agents. Agents control local regions, where the agent has direct access to measured variables and can issue control actions. The agents in this fully decentralized operation paradigm are speculated to be individual utility operators, small-scale energy communities, or autonomous microgrids as demonstrated by (Martirano et al., 2021; Munsing et al., 2017; Thirugnanam et al., 2021).

Figure 3: An example illustration of the conceptual transition between a centralized (left) and decentralized (right) operation paradigm of power grids. Green lines correspond to information exchange.

Figure 4: Illustrative flowchart of the proposed and implemented ML algorithm.

Figure 5: Illustration of the cooperative information exchange made possible by the proposed and implemented ML algorithm, enabling a distributed energy cloud operation scheme.

Figure 6: Modified IEEE 24-bus test system showing the defined zones.

Figure 7: Plot of the historical total load (1 year with 15-minute resolution).

αday, αhour: coefficients for the day of the year and hour of the day for each historical case, respectively.
Study 2: In the second study, the proposed cloud-based cooperative information exchange is investigated by comparing the results for each agent with and without the exchanged activation functions.
Study 3: In the third and final case study, the performance of the proposed algorithm is assessed in comparison with a NN.

Figure 8: Total load profile for the considered test week (15-minute resolution).

Figure 9: Results of the first study: predicted week-ahead generation profile (15-minute resolution) by each utility, with and without parameter tuning, compared with a centralized solution.

Figure 10: Average load factors of the utilities.

Figure 11: Total energy import/export by each zone for the considered week.

Figure 12: Results of the second study: predicted week-ahead generation profile (15-minute resolution) by each utility, with and without information exchange, compared with a centralized solution.

(Morales & Pérez-Ruiz, 2007), varying
On the other hand, even the earliest point estimation methods are between O(2N) and O(4N) linearly with larger problems, which is the same as analytical PLF methods,

Table 2: Zones of the test system under analysis, and the corresponding buses and utilities within.

Table 3: Utilities of the test system under analysis (considered the decentralized operating agents) including the zone association, incorporated generator number, and the coefficients for the individual generator cost functions.

Table 4: Maximum loadability (peak annual load) at each load bus and corresponding pf.