Review of online learning for control and diagnostics of power converters and drives: Algorithms, implementations and applications

Power converters and motor drives are playing a significant role in the transition towards sustainable energy systems and transportation electrification. In this context, a rich diversity of new power converter and motor drive products is developed and commissioned by the industry every year. However, to achieve efficient, reliable and stable operation of power converter and drive systems, there are challenges in condition monitoring, fault diagnosis, lifecycle estimation, stability evaluation and control. Online learning is an emerging technology that can serve as a powerful remedy to these challenges. This paper aims to provide a systematic study of algorithms, implementations, and applications of online learning for control and diagnostics in the area of power converters and drives. First, online learning problems are formulated for condition monitoring, fault detection, online stability assessment, and model predictive control in power converter and drive applications. Next, guidelines are provided on how to develop online learning models and algorithms for these applications. Practical case studies are presented with experimental demonstrations. Finally, challenges and future opportunities for online learning in power converter and drive applications are discussed.


Introduction
Power converters and drives are essential components of many modern electrical systems, including renewable energy systems, electric vehicles, and industrial drives. Control and diagnostics of power converters and drives are usually done with rigid signal processing structures [1]. In other words, these control and diagnostics structures are pre-designed in an offline manner using data from linear or nonlinear system models prior to field application, and they do not adapt to potentially changing real-world conditions once they are commissioned.
Rigid signal processing structures were historically implemented in many different ways, from conventional cascaded linear control to more advanced controllers, which involve state feedback control, model predictive control (MPC), machine learning, and others [2]. Recent research has shown that advanced controllers and diagnostics methods offer potential for far better performance than conventional ones. Among them, machine learning has emerged as an especially attractive technique because it can either "imitate" any kind of advanced signal processing structure [3,4] or learn a new one from scratch. Despite these attractive traits, offline machine learning still presents some limitations because the data used for training of signal processing structures may be incomplete.
On [...] is that it can potentially improve the performance and efficiency of these systems. For example, in the design of controllers for power converters, it is possible to achieve better adaptability to varying grid impedance, which is a well-known phenomenon in converter-dominated electric power grids [7]. It is also possible to infer the efficiency curves of power converters and drives, which change over time as these components degrade. This information can then be used to track the optimal efficiency point of the system and to operate the system at that point [8]. Similarly, in the area of fault diagnosis in power drives, it is possible to design self-commissioning condition monitoring systems, which can learn the "normal" operating performance of any system on the fly and use this information to detect and diagnose faults with minimal intervention of engineers, thereby achieving high system reliability and low downtime with minor installation cost [9]. Finally, as online learning involves training the model on small batches of data at a time, it can reduce the training time and cloud computational resources required compared to training on the entire data set at once. This can be particularly useful in cases where data sets are large or when performing control or diagnostics on many heterogeneous power converters and drives in the field.
However, one needs to be aware that there are also some challenges and limitations to the use of online learning in power converters and drives, especially because they are often associated with mission-critical applications (e.g., waste and clean water pumping, industrial machinery, compressors) [8,10]. Therefore, one challenge is the need to carefully design the online learning algorithm to ensure that it is stable and performs well. Another challenge is the need to carefully select the input features and data preprocessing methods to ensure that the model can learn effectively from the data. This paper will discuss in detail several applications of online learning algorithms and measures that can be taken in order to ensure their good performance and safety.
The use of online learning in power converters and drives can be traced back to the 2000s, when researchers began exploring its potential for improving the performance and efficiency of these systems. For instance, a B-spline neural network was used in 2007 to learn the nonlinear flux-linkage characteristic of a switched reluctance motor online and in real time [11]. A promising estimation performance was shown within a certain motor speed range. A similar idea was explored in [12], where a feed-forward neural network was applied to estimate online the strong inter-axis couplings of a permanent-magnet spherical motor. The trained network was then used as a basis for a nonlinear fuzzy control strategy, which showed greatly improved performance compared to conventional controllers. An optimal controller for a boost converter was also found online via fuzzy-neural-network control in [13]. Here, an additional contribution was that the online learning algorithms were derived in the sense of Lyapunov stability, even in the presence of uncertainties. A fuzzy neural network was also employed to estimate online the lumped uncertainty of a six-phase permanent-magnet synchronous motor. The model was then used within the sliding mode control framework to achieve fault-tolerant control of the system [14]. A reinforcement learning algorithm based on Q-learning was used in the context of a variable-speed wind energy conversion system to learn online an energy-yield-maximizing mapping from measured states to control actions by updating the action values according to the received rewards in real time [15]. A controller that simultaneously ensures the fulfillment of grid codes during grid faults and minimizes dc-link voltage oscillations without overburdening the inverter was trained online using a probabilistic fuzzy neural network with an asymmetric membership function in [16]. A self-learning control scheme was developed for an interior permanent-magnet synchronous machine to achieve optimal performance in different operating regions. The essence of the scheme is first to learn a static reference flux map online and then to exploit the model search scheme to find an optimal point [17].
In recent years, online learning has been applied in condition monitoring, lifetime prediction, stability evaluation and control for power converter and motor drive systems to enhance performance. In [8], an AI-based algorithm is developed for the prediction of the remaining useful lifecycle of power electronic components and, a step further, an AI-based control parameter design considering the remaining lifecycle is developed in [10] for enhanced reliability of grid-connected inverters. A machine learning based parameter estimation approach is developed to achieve condition monitoring of components in power converters [18]. Ref. [19] proposes a physics-informed neural network based approach to achieve fast online impedance identification and online stability evaluation of voltage source converters with limited online data. An imitation learning based model predictive controller is designed for power converter systems to achieve good performance with fast online computation [20]. Overall, the advancements in online learning are revolutionizing the field of power converter and motor drive systems, offering new possibilities for enhanced performance, reliability, condition monitoring, and control. Considering the challenges in the control and diagnosis of power converters and motor drives in the energy transition, and the potential of online learning to improve performance and mitigate these challenges, it is important to develop a systematic review of online learning for power converters and motor drives, and to provide a guideline on how to develop online learning models and algorithms for various applications.
This paper aims to go a step further and present several new applications of online learning in the realm of power converters and drives, and to systematically review available algorithms. Specifically, online learning problems in condition monitoring, fault detection, online stability assessment, and model predictive control for power converter and motor drive applications are presented in Section 2. Next, the development of online learning models, together with the selection of input/output features, is presented in Section 3. Section 4 then illustrates different online learning algorithms that can be designed for condition monitoring, fault detection, online stability assessment and model predictive control. Practical case studies with experimental results are demonstrated in Section 5 as typical examples of online learning for power electronic and motor drive applications. Section 6 discusses the challenges and opportunities. The conclusion is drawn in Section 7.

Online learning problems
In this section, we will discuss the challenges in power electronic and motor drive systems, mainly focusing on condition monitoring, fault detection, stability evaluation and control; and how online learning can be utilized to address these challenges and achieve better performance.

Anomaly detection
Although power electronic systems are often designed to fulfill certain reliability and safety requirements, component-level and system-level failure and degradation processes occur as a result of complex phenomena, which involve a variety of interactions both within the system itself and between the system and its environment [21,22]. In many applications, it is therefore practically impossible to accurately model all of these mechanisms during the design phase. For this reason, machine learning methods have emerged as a viable solution to analyze degradation and failure processes, owing to their capabilities to model arbitrarily complex relationships in data [10,23].
Real-time fault detection is often based on condition monitoring techniques, which commonly employ sensors to measure known performance indicator parameters. These parameters can then be directly employed to inform operational decisions, such as the scheduling of predictive maintenance [24]. This approach can provide reliable estimates of the condition of the system, but frequently at the cost of the inclusion of additional hardware, e.g., advanced measuring equipment. An alternative method is to employ sensorless and noninvasive solutions, which may be more attractive to industry practitioners. In this context, machine learning methods can serve as a powerful tool to enable the inference of parameters relevant to the condition of the system while using only readily available measurements [23].
In machine-learning-based parameter estimation, data sets are typically obtained during lab tests, by fitting the device or system under study with the additional hardware required to monitor relevant condition indicators and recording data under a variety of operating conditions [18]. A model can then be trained offline on the collected set of data, and deployed during field operation to obtain health status estimates in real time. However, in cases where condition parameters are strongly related to mission profiles or other environmental factors, it may not be feasible to collect sufficient data to cover the entire operating space of the system solely during lab tests. Such situations may benefit from online training methods, where models can adaptively learn to monitor the behavior of the system during its field operation.
When condition parameters cannot be inferred directly from the locally measured values, fault detection methods may employ techniques from the closely related field of anomaly detection. Rather than estimating relevant condition parameters, anomaly detection methods output binary decisions, classifying observations as either normal or abnormal. In the online learning setting, anomaly detection can be a better fit for fault detection than parameter-based condition monitoring, as the latter's common requirement of additional hardware can often be relaxed, thus allowing for classification based only on available measurements. Besides maintenance, anomaly detection is commonly applied to cybersecurity problems, especially in the detection of false data injection attacks (FDIA) [25,26].
For model-based online anomaly detection, machine learning models can be trained on the healthy behavior of the system through a set of streaming measurements. Then, when the model predictions for a set of inputs differ significantly from their corresponding measurements, an alert can be raised. The diagram shown in Fig. 1 illustrates the basic concept of model-based anomaly detection, where internal and external measurements are the inputs to a neural network model, which learns to estimate the target internal state. The prediction residuals can then be employed in an anomaly test to draw conclusions on the status of the system.
A key step in anomaly detection is thus to adequately define what is considered to be the normal behavior of the system. In the online learning setting, this can be realized in several ways. To detect anomalies in systems where environmental conditions remain mostly constant, an option is to assume healthy behavior during the initial stages of operation. A model can then be trained from system commissioning until a defined time limit, and thereafter used to detect anomalous behavior. For rapidly developing faults and for more dynamic environments, an alternative is to alternate between periods of training, where the system is assumed to be healthy, and periods of evaluation, where anomalies may be identified. Ref. [27] follows this principle, while proposing several methods for online training of neural network models for anomaly detection in electric machines. In the proposed schemes, regression models are trained to predict the negative-sequence line current, and the prediction error is employed to signal stator winding turn faults. A third approach, especially suitable for distributed machine learning schemes, is to train a local model continuously while also maintaining a global model, which can be achieved through, e.g., federated learning. Local models can then be compared with the global model to draw conclusions on whether any local device presents abnormal behavior. Ref. [28] proposes the use of federated learning with incrementally trained classifier models to detect FDIAs in solar farms, under the assumption that binary labels are available during the training process.
By limiting the requirements for additional hardware and extensive preemptive testing, online learning techniques can play an important role in enabling the use of anomaly detection in power electronic systems.

Remaining useful life prediction
Remaining useful life (RUL) can be defined as the length of time from the present to the end of life of a device under study [29]. RUL prediction methods are commonly used in condition monitoring, but they aim to provide estimates of the residual lifetime of the system or component under study, rather than estimates of condition indicators or a probability of failure. In power converters and motor drive systems, RUL analysis often focuses on the degradation of power semiconductor devices and capacitors, as they are considered the most fragile components [30]. The ability to obtain reliable RUL estimates for power electronic systems allows for substantial advantages in optimal planning, design, and maintenance scheduling, thus reducing downtime and driving down its associated costs.
Typically, relevant measurements and estimated condition indicators are used as inputs to RUL prediction models. RUL estimation shares many of the challenges found in other condition monitoring methods. However, RUL is often a more directly interpretable metric than raw estimates of condition indicators, and can therefore be readily used in operational planning.
One method to obtain estimates of RUL is to employ incremental degradation models, which map a history of operating conditions to the deterioration of a device or system. Such methods can be applied to forecasted operating conditions, and the RUL estimate can be found as the time when the total accumulated damage reaches a certain threshold [8,31,32]. These models are often paired with cycle counting algorithms, such as rainflow counting, which effectively compress time-series data into counts of stress cycles, a representation that is often better suited to RUL estimation. Fig. 2 presents a diagram illustrating this basic process, where the measured states are passed through a cycle counting algorithm, and the resulting count information is used in an incremental damage model, which can be based on a neural network model.
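As a minimal sketch of the accumulation scheme described above, the following Python snippet assumes that cycle counts (for example, from a rainflow counter) are already available for each time window, and sums the damage linearly in the manner of Miner's rule. The lifetime model `cycles_to_failure`, its parameters `a` and `n`, and the class `DamageAccumulator` are hypothetical and chosen only for illustration; they are not taken from the cited references. In an online learning setting, the fixed lifetime model would be the part that is replaced by a learned, incrementally updated model.

```python
def cycles_to_failure(delta_t, a=3.0e14, n=5.0):
    """Hypothetical power-law lifetime model: cycles to failure for a
    temperature swing delta_t (in K). The values of a and n are illustrative."""
    return a * delta_t ** (-n)


class DamageAccumulator:
    """Linear (Miner-type) damage accumulation over counted stress cycles."""

    def __init__(self, threshold=1.0):
        self.damage = 0.0           # accumulated damage, 0 = new device
        self.threshold = threshold  # end of life is reached at this damage level
        self.elapsed_hours = 0.0

    def update(self, cycle_counts, hours):
        """Add the damage of one observation window.

        cycle_counts maps a temperature swing (K) to the number of cycles
        of that swing counted during the last `hours` of operation.
        """
        for delta_t, count in cycle_counts.items():
            self.damage += count / cycles_to_failure(delta_t)
        self.elapsed_hours += hours
        return self.damage

    def remaining_useful_life(self):
        """Extrapolate RUL (hours), assuming the average damage rate persists."""
        if self.damage <= 0.0:
            return float("inf")
        rate = self.damage / self.elapsed_hours
        return max(self.threshold - self.damage, 0.0) / rate
```

Calling `update` once per window and then `remaining_useful_life` gives an estimate that shrinks as damage accumulates; the constant-rate extrapolation is a simplifying assumption and would normally be replaced by a forecast of the operating conditions.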
In online applications, cycle counting algorithms must also operate continuously, which entails additional challenges [33]. Additionally, incremental damage models may need to be learned or updated online to achieve adequate precision. Namely, models inferred solely from lab testing data may be limited by the fact that it is not always practically feasible to test devices in sufficiently diverse operating and environmental conditions, to which they may be exposed in the field.
By allowing models to adapt to operating conditions not sufficiently contained in the lab training data, online learning techniques may serve as a powerful tool for RUL estimation in power electronic systems.

Impedance based stability analysis
The large-scale integration of renewables into the power system through interface power electronics converters has introduced interaction stability issues [34]. The interaction between the inner control loops (e.g., the phase-locked loop) of the interface converters and the grid may cause oscillations at different frequencies [35]. How to identify the mechanism of these oscillations and assess the interaction stability is therefore a key issue in modern power electronics system research. Broadly, there are two categories of solutions: time-domain modal analysis based on the state-space model and impedance-based analysis based on the impedance model [36,37].
The state-space model can capture the complete dynamics of large-scale systems, where eigenvalue and participation factor analysis can be used to reveal the oscillation frequencies, damping ratios and contributions of the state variables to the system dynamics [38]. However, with hundreds of power converters integrated into the power system, modeling and stability evaluation of large-scale power converter dominated systems become challenging. The result is a high-order system model with thousands of states, which is hard to use for electrical system stability analysis [38]. Meanwhile, as vendors are unwilling to share detailed information about their power converter products, an accurate state-space model of the system is difficult to obtain. Thus, time-domain modal analysis has severe applicability limitations in multi-vendor systems.
The impedance-based method is a promising solution to address the above challenge [39,40]. As each power electronics converter can be modeled as a voltage/current source along with an impedance/admittance, the impedance-based method has good modularity and scalability. Moreover, the impedance model can be experimentally measured at the terminal of the converter, which enables black-box modeling. Impedance measurement based stability analysis is thus a suitable solution for modern power electronics systems. A typical impedance measurement scheme is shown in Fig. 3 [41]. A perturbation is injected into the system at the point of common coupling (PCC), after which the current and voltage responses at the same terminal are measured. Next, the impedance of the inverter system can be calculated. Further, as reported in [42,43], an artificial neural network (ANN) can be adopted to model the multi-operating-point impedance of the system. To enable online stability analysis of the power electronics system, online impedance modeling is a key technique. Several research works have investigated online learning based impedance modeling and stability estimation, using different online learning methods to identify the impedance [19,44,45]. Thus, online learning is necessary for online impedance modeling and stability analysis.
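The step from measured responses to an impedance value can be illustrated with a short sketch. It assumes a single-tone perturbation at a known frequency, uniformly sampled voltage and current records that span an integer number of perturbation periods, and a single-phase (scalar) impedance; the function name `impedance_at` and its arguments are hypothetical. Practical measurement schemes for three-phase converters work with frequency sweeps and matrix-valued impedances, which this sketch does not cover.

```python
import numpy as np


def impedance_at(freq_hz, v, i, fs):
    """Estimate Z(j*2*pi*freq_hz) = V(f) / I(f) from sampled responses.

    v, i : sampled voltage and current responses (same length)
    fs   : sampling frequency in Hz
    A single-bin DFT extracts the phasor of each signal at the
    perturbation frequency; their ratio is the impedance estimate.
    """
    v = np.asarray(v, dtype=float)
    i = np.asarray(i, dtype=float)
    t = np.arange(len(v)) / fs
    basis = np.exp(-2j * np.pi * freq_hz * t)  # correlation kernel at freq_hz
    v_phasor = np.dot(v, basis)
    i_phasor = np.dot(i, basis)
    return v_phasor / i_phasor
```

Repeating this estimate over a set of perturbation frequencies and operating points would yield the kind of data set on which a multi-operating-point impedance model can be trained.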

Model predictive control
Because MPC can handle multi-variable systems, system constraints and nonlinearities, it has gained significant attention in recent power electronics converter and motor drive research and industrial applications [46-48]. Compared to traditional controllers, it can achieve fast dynamics and optimal performance in power electronics converter and motor drive control [49,50].
A typical MPC method for power electronics converter and motor drive control is shown in Fig. 4. First, the controlled variables are measured or estimated at the current step. Second, their values at the next step are predicted from the current values and the predictive model. Finally, the switching state is obtained by solving an optimization problem that minimizes the difference between the predicted system trajectories and the reference. Based on the MPC concept, many control methods have been proposed for different types of converters in different scenarios [51-54]. In [52], an MPC controller is designed for a three-phase inverter, which can achieve fast control of voltage and current. In [54], a composite MPC based decentralized dynamic power sharing method is proposed for a hybrid energy storage system with constant power load, where a higher-order sliding mode disturbance observer is adopted to estimate the uncertainties.
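The measure, predict and optimize sequence can be sketched for a deliberately simple case. The example below assumes a single-phase converter feeding a resistive-inductive load, a forward-Euler prediction model, a one-step horizon, and three candidate output voltages; the function names and all parameter values (`R`, `L`, `Ts`, `v_dc`) are hypothetical and serve only to illustrate the finite-control-set idea, not any specific controller from the cited works.

```python
def predict_current(i_k, v_k, R=1.0, L=10e-3, Ts=50e-6):
    """One-step forward-Euler prediction of the load current:
    i(k+1) = (1 - R*Ts/L) * i(k) + (Ts/L) * v(k)."""
    return (1.0 - R * Ts / L) * i_k + (Ts / L) * v_k


def fcs_mpc_step(i_k, i_ref, v_dc=100.0):
    """Pick the candidate converter voltage whose predicted current
    is closest to the reference (squared-error cost, horizon of one)."""
    candidates = (-v_dc, 0.0, v_dc)  # voltages reachable by the switching states
    return min(candidates, key=lambda v: (i_ref - predict_current(i_k, v)) ** 2)
```

Applied in a loop, `fcs_mpc_step` is called once per sampling period with the latest measurement, and the chosen voltage is applied to the plant. The exhaustive search over candidates is what becomes costly as the number of switching states or the horizon grows.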
However, the computational burden limits MPC implementation in more complex control cases, e.g., the control of modular multilevel converters (MMC). A neural network based method is a promising solution to address this issue. As shown in Fig. 5, if a neural network can be trained offline to mimic the MPC algorithm, it can be directly used for online implementation without solving the optimization problem online, which requires substantial computational resources. Recently, several research works have been proposed following this principle [3,55-58]. In [56], a trained neural network is used for MMC control, which reduces the computational burden and enables fast implementation of MMC control. In [58], an ANN based MPC surrogate is developed for high voltage ac (HVAC) systems control, which can also guarantee the performance of HVAC system control implementation.
As the working scenarios of power electronics converters and motor drives are variable, online model updating is necessary to further increase the scalability of learning-based MPC methods.
The summary of online learning problems in Section 2 is shown in Table 1.

Online learning models
This section discusses the formulation of online learning models for power electronic applications, together with the selection of input and output features for each case.

Table 1
Summary of Section 2 (columns: online learning problem; research issue).

Anomaly detection
Anomaly detection can be understood as a classification task, with the aim of estimating the probability that one or several anomalies or faults are present in the monitored system.
Regression models can be adopted for residual-based anomaly detection, where models are trained to predict a set of observable variables during "normal" device operation. These models can then be employed in condition monitoring, by continuously analyzing their prediction error or residuals. Thresholding and statistical tests, such as Page's cumulative sum test, can employ these residuals to determine whether a set of observations qualifies as anomalous [59]. In the online learning setting, regression models can be trained on data obtained during operation considered as normal. The main advantage of regression-based anomaly detection, which is shared with clustering methods, is therefore the fact that model training does not necessarily require anomalous data. Some notion of abnormality is however still required in order to adequately design thresholds and statistical tests. In the case of power electronic systems, regression models may be trained to predict relevant variables such as converter efficiency based on the available system information. The flexibility of data-driven models allows for the combination of multi-domain information, which may include operating setpoints, internal system measurements, and information on the environment, among others.
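A cumulative sum test on the residuals can be sketched in a few lines. The version below is a simplified one-sided variant applied to the absolute residual, which is an assumption made here for brevity; the class name `CusumDetector` and the tuning parameters `drift` and `threshold` are hypothetical and would need to be chosen from the residual statistics observed during normal operation.

```python
class CusumDetector:
    """Simplified one-sided cumulative sum (CUSUM) test on absolute residuals."""

    def __init__(self, drift, threshold):
        self.drift = drift          # residual magnitude tolerated as normal
        self.threshold = threshold  # alarm level for the accumulated statistic
        self.stat = 0.0

    def update(self, residual):
        """Process one residual; return True if an anomaly is flagged."""
        # Accumulate only the part of the residual that exceeds the drift,
        # and never let the statistic fall below zero.
        self.stat = max(0.0, self.stat + abs(residual) - self.drift)
        return self.stat > self.threshold
```

Compared with a fixed threshold on each individual residual, the accumulated statistic responds to small but persistent deviations while ignoring isolated noise spikes that stay below the drift level on average.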
Classification models, on the other hand, typically require examples of both healthy and anomalous operation, which can then be used to train classifiers for the identification of anomalous samples. Online learning is not particularly suitable for classification-based fault detection, as anomalous samples can only be collected during undesirable operation. However, data corresponding to anomalies can be collected in a controlled setting, with healthy data being added to the training data set in an online manner and used for model updates. When training a classification model for a power electronic system, the output of the model becomes a probability of a fault or anomaly being present in the system. If the model is trained to identify the appearance of several failure mechanisms, it must instead perform multi-class classification, where its output becomes a probability corresponding to each type. In either case, binary labels corresponding to each data sample must be employed to train a classifier model.
Many clustering techniques can be used for anomaly detection, including k-means clustering, self-organizing maps, and k-nearest neighbors [60]. Data clustering models can be updated online as new samples are continuously obtained. Autoencoders have been successfully applied to anomaly detection in power electronics [61,62], often by employing their reconstruction error in residual analysis, in a similar procedure to regression-based anomaly detection. Alternatively, the inferred latent representations of input data can be used in conjunction with another clustering algorithm to draw conclusions on the abnormality of the input samples.
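One way a clustering model can be updated as samples arrive is the sequential form of k-means, sketched below under the assumption that initial centroids are supplied by the user. Each new sample moves only its nearest centroid, using a running-mean update, and the distance to the nearest centroid serves as a simple anomaly score. The class name `OnlineKMeans` and its methods are hypothetical.

```python
import numpy as np


class OnlineKMeans:
    """Sequential k-means: one centroid update per incoming sample."""

    def __init__(self, initial_centroids):
        self.centroids = np.array(initial_centroids, dtype=float)
        # Each initial centroid is treated as one already-seen sample.
        self.counts = np.ones(len(self.centroids))

    def score(self, x):
        """Anomaly score: distance from x to the nearest centroid."""
        x = np.asarray(x, dtype=float)
        return float(np.min(np.linalg.norm(self.centroids - x, axis=1)))

    def update(self, x):
        """Assign x to its nearest centroid and move that centroid towards x."""
        x = np.asarray(x, dtype=float)
        j = int(np.argmin(np.linalg.norm(self.centroids - x, axis=1)))
        self.counts[j] += 1
        # Running mean of all samples assigned to cluster j so far.
        self.centroids[j] += (x - self.centroids[j]) / self.counts[j]
        return j
```

A sample would typically be scored first and used for an update only if it is judged normal, so that anomalous observations do not pull the centroids towards them.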

Remaining useful life prediction
In RUL prediction, incremental damage models are trained online to perform regression tasks. Using a set of system parameters and measurements as inputs (often after preprocessing), the models are used to predict either a point estimate of the incremental damage or the parameters of its assumed distribution function. These inputs may include system and component level measurements, and in power converter systems often involve parameters such as the switching frequency, the device and ambient temperatures, the dc-link voltage, and the input and output powers.
RUL estimation is typically sensitive to numerous sources of uncertainty, and thus it is often not sufficient to predict point estimates of the residual lifetime. Therefore, machine learning methods for RUL estimation often seek to instead model the distribution of the random variable representing RUL, which can be summarized as a confidence interval under the assumption that the distribution is Gaussian. Regression models that have the capability to quantify the uncertainty of their outputs are therefore the best fit for RUL prediction. Some possible embodiments of such models are Gaussian process regression, Bayesian networks, relevance vector machines and, more generally, any ensemble of regression models.
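For an ensemble of regression models, the Gaussian summary mentioned above reduces to a mean and a standard deviation over the individual predictions. The sketch below assumes that each ensemble member has already produced one RUL prediction for the same input; the function name `gaussian_interval` is hypothetical, and `z=1.96` corresponds to an approximate 95% interval under the Gaussian assumption.

```python
import statistics


def gaussian_interval(predictions, z=1.96):
    """Summarize ensemble RUL predictions as (lower, mean, upper).

    predictions : one RUL estimate per ensemble member (at least two)
    z           : Gaussian quantile; 1.96 gives roughly a 95% interval
    """
    mean = statistics.fmean(predictions)
    std = statistics.stdev(predictions)  # sample standard deviation
    return mean - z * std, mean, mean + z * std
```

A wide interval signals that the ensemble members disagree, which can itself be useful information for maintenance planning.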
In the context of online learning, it is sometimes not possible to obtain RUL estimates in real time to update supervised learning models. For example, quantifying the degradation of insulated gate bipolar transistor (IGBT) bonding wires is often unfeasible without prior decommissioning. In these cases, models can only be updated when device failure occurs; sets of historical system measurements and parameters can then be accordingly paired with RUL values, and the updated data set can be employed in updating the models.
In other cases, where RUL estimates can be obtained in real time, supervised models can be updated more frequently. The RUL estimation of electrical motors, including fans and pumps, falls into this category, as RUL is directly correlated with state-of-health, which can in turn be inferred from their operating characteristics.

Impedance based stability analysis
The impedance model of a power electronics converter can be expressed as a transfer function of the frequency $\omega$ and impedance $Z$, i.e., $Z(s)$, where $s = j\omega$. Because the impedance model is highly dependent on the operating point of the converter, the model should include the operating point [40]. Thus, the model can be expressed as $Z(s, \mathbf{V}, \mathbf{I})$, where $\mathbf{V}$ and $\mathbf{I}$ are the vectors of voltage and current, respectively. Theoretically, for the three-phase grid-connected inverter shown in Fig. 6, with the control structure shown in Fig. 7, the admittance model can be expressed analytically in terms of symmetric transfer matrices, the unitary diagonal matrix, and the following quantities: the transfer functions of the filter, the time delay introduced by the digital control system, the transfer function matrix of the current controller, the transfer function of the phase-locked loop (PLL), and the steady-state complex space vectors of current and voltage on the $d$-axis and $q$-axis. The impedance model can therefore be treated as a function whose inputs are the four-dimensional operating point (the $d$-axis and $q$-axis components of the steady-state voltage and current) and the frequency, and whose output is the admittance/impedance.
The model can be obtained by measurement and identified with an ANN, where the operating point and frequency are selected as the inputs and the impedance is selected as the output [42]. The trained neural network based impedance model can be used for stability analysis of the grid-converter interaction through the general Nyquist criterion.

MPC model
As shown in Fig. 5, ANN based MPC models can be separated into two types according to their inputs and outputs. The first type uses the ANN to mimic the prediction model [3,56]. Methods of this kind are usually used in scenarios where the physics of the system is hard to model. The inputs of the model are the current and voltage measurements at the current step, while the outputs are the states at the next step. The second type uses the ANN to model the whole MPC controller, which consists of the prediction and optimization stages. The inputs are the current states and the outputs are the control signal or the pulse width modulation (PWM) signal of the system [20]. Methods of this kind can reduce the computational burden and enable black-box control of the system. However, the robustness of the method often cannot be fully guaranteed, although progress in formally verifiable neural networks offers the promise that this challenge will be resolved in the near future.
The summary of online learning models in Section 3 is shown in Table 2.

Online learning algorithms
In this section, different online machine learning algorithms are presented.

Supervised learning
Within power electronics, supervised learning is the most common training method for machine learning algorithms [63]. Given a data set $\mathcal{D}$ consisting of a set of input samples $x \in \mathcal{X}$ and their corresponding labels $y \in \mathcal{Y}$, as well as a model $f: \mathcal{X} \to \mathcal{Y}$ and a loss function $\mathcal{L}: \mathcal{Y} \times \mathcal{Y} \to \mathbb{R}$, supervised learning seeks the set of parameters of $f$ that minimizes $\mathcal{L}$. In offline learning settings, it is assumed that the entirety of $\mathcal{D}$ is available for training, validation, and testing, but this assumption does not hold for online learning.
Online supervised learning therefore has the main advantage of reducing data collection and storage requirements when compared to its offline counterpart. However, it requires automated data processing and labeling. Supervised online learning can thus be a good fit for problems where the target output variables are available as measurements or can be computed at relatively low computational cost, such as the modeling of some dynamical processes. Supervised online learning has been successfully applied to control problems [64][65][66], as well as machine fault diagnostics [27].
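The storage advantage of the online setting can be seen in a minimal sketch: each labeled sample updates the model once and is then discarded. The linear model, the learning rate, and the synthetic stream (targets generated as y = 2x + 1) are illustrative assumptions, not taken from the reviewed works.

```python
import random


class OnlineLinearModel:
    """Linear model updated one sample at a time with stochastic gradient descent."""

    def __init__(self, n_inputs, learning_rate=0.1):
        self.w = [0.0] * n_inputs
        self.b = 0.0
        self.lr = learning_rate

    def predict(self, x):
        return sum(wi * xi for wi, xi in zip(self.w, x)) + self.b

    def update(self, x, y):
        """One gradient step on the loss 0.5 * (prediction - y)^2; returns the squared error."""
        error = self.predict(x) - y
        self.w = [wi - self.lr * error * xi for wi, xi in zip(self.w, x)]
        self.b -= self.lr * error
        return error ** 2


# Synthetic measurement stream (assumed): the label is available as a measurement.
rng = random.Random(0)
model = OnlineLinearModel(n_inputs=1)
for _ in range(2000):
    x = [rng.uniform(-1.0, 1.0)]
    y = 2.0 * x[0] + 1.0
    model.update(x, y)  # the sample is not stored after this call
```

The same update pattern applies to a neural network, with the gradient obtained by backpropagation instead of the closed-form expression used here.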
Neural networks are some of the most widely used models in supervised machine learning. Within power electronics, the most common type of neural network is the feed-forward multilayer perceptron trained with backpropagation [63]. A diagram exemplifying this topology is shown in Fig. 8, for a particular structure with two hidden layers and a single output $\hat{y}$. The activations $a^{(l)}$ of each layer $l$ after the input layer are obtained as
$$a^{(l)} = \sigma^{(l)}\left(W^{(l)} a^{(l-1)} + b^{(l)}\right),$$
where $W^{(l)}$ is a learned matrix of weights, $b^{(l)}$ is a learned vector of biases, and $\sigma^{(l)}$ is a nonlinear activation function. The corresponding flowchart of supervised learning is shown in Fig. 9.
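The layer-by-layer computation of the activations can be sketched as follows for a network with two hidden layers and a single output. The weights are arbitrary illustrative numbers, and the choice of ReLU hidden activations with a linear output layer is an assumption made for this example.

```python
def relu(v):
    """Element-wise rectified linear unit."""
    return [max(0.0, vi) for vi in v]


def identity(v):
    """Linear output activation (assumed here for a regression output)."""
    return list(v)


def dense(weights, biases, a_prev, activation):
    """Compute activation(W a_prev + b) for one layer."""
    z = [sum(wij * aj for wij, aj in zip(row, a_prev)) + bi
         for row, bi in zip(weights, biases)]
    return activation(z)


def forward(x, layers):
    """Propagate the input through every (W, b, activation) triple in order."""
    a = list(x)
    for weights, biases, activation in layers:
        a = dense(weights, biases, a, activation)
    return a


# Illustrative weights: 2 inputs -> 2 hidden units -> 1 hidden unit -> 1 output.
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, -1.0], relu),
    ([[1.0, 2.0]], [0.1], relu),
    ([[3.0]], [0.5], identity),
]
y_hat = forward([2.0, 1.0], layers)
```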

Table 2
Summary of Section 3.
Model predictive control: ANN to mimic the prediction model [3,56]; ANN serves as the whole MPC controller [20].

Unsupervised learning
Unsupervised machine learning methods aim to recognize patterns in unlabeled data; this comes in contrast with supervised learning, where each input sample must be paired with a corresponding target.Many clustering and decomposition methods operate in an unsupervised manner.
The most common unsupervised neural network models are autoencoders and their derivatives, such as variational autoencoders. In their basic form, autoencoders are feedforward neural networks trained to output the same set of values that they receive as inputs. For this task to be non-trivial, the neural network models are designed such that one or several of their hidden layers contain fewer units than the number of model inputs. In this way, the autoencoder learns an effective low-dimensional representation of the data set that allows for its reconstruction. When the model is used for inference, the analysis of the reconstruction error can provide information on changes in operation, such as anomalous behavior. This is because the compression and decompression architecture learned by the autoencoder is specific to its training data, and therefore samples that deviate from this set are expected to result in higher error values. Additionally, the compression half of the autoencoder can be deployed to obtain a latent representation of the input sample, which can prove useful in denoising, compression, and feature selection [63]. A diagram illustrating the structure of a basic autoencoder is shown in Fig. 10. In the context of online learning, unsupervised learning has the main advantage of being applicable to settings where data labels are unavailable. However, the applications of unsupervised machine learning have historically been more limited in scope than those of its supervised counterpart. The corresponding flowchart of unsupervised learning is shown in Fig. 9.
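The link between reconstruction error and anomalous behavior can be illustrated with a deliberately small sketch: a linear autoencoder with two inputs and a one-unit bottleneck, trained by stochastic gradient descent on synthetic data lying near the line x2 = 2 x1. The data, initial weights, and hyperparameters are assumptions chosen for illustration; practical autoencoders are nonlinear and much larger.

```python
import random


def encode(x, w_enc):
    """Compress a two-dimensional sample to a single latent value."""
    return w_enc[0] * x[0] + w_enc[1] * x[1]


def reconstruction_error(x, w_enc, w_dec):
    """Squared error between a sample and its reconstruction."""
    z = encode(x, w_enc)
    return sum((w_dec[i] * z - x[i]) ** 2 for i in range(2))


def train_linear_autoencoder(samples, epochs=40, lr=0.01, seed=0):
    """Minimize the squared reconstruction error with per-sample gradient steps."""
    rng = random.Random(seed)
    w_enc, w_dec = [0.5, 0.5], [0.5, 0.5]
    data = list(samples)
    for _ in range(epochs):
        rng.shuffle(data)
        for x in data:
            z = encode(x, w_enc)
            err = [w_dec[i] * z - x[i] for i in range(2)]
            grad_z = 2.0 * (err[0] * w_dec[0] + err[1] * w_dec[1])
            w_dec = [w_dec[i] - lr * 2.0 * err[i] * z for i in range(2)]
            w_enc = [w_enc[i] - lr * grad_z * x[i] for i in range(2)]
    return w_enc, w_dec


# Synthetic "normal" operation: samples close to the line x2 = 2 * x1.
rng = random.Random(1)
normal_data = []
for _ in range(100):
    a = rng.uniform(-1.0, 1.0)
    normal_data.append([a + rng.gauss(0.0, 0.05), 2.0 * a + rng.gauss(0.0, 0.05)])

w_enc, w_dec = train_linear_autoencoder(normal_data)
err_normal = reconstruction_error([1.0, 2.0], w_enc, w_dec)    # consistent with training data
err_anomaly = reconstruction_error([1.0, -2.0], w_enc, w_dec)  # deviates from training data
```

A sample that follows the learned structure is reconstructed with a small error, while a sample off that structure is not, which is the property an anomaly test would exploit.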

Federated learning
Federated learning is based on a set of techniques involving the aggregation of model updates across multiple devices [67]. Its goal is to allow multiple system nodes to share model information while avoiding the transmission of large data sets, which can otherwise be both costly and privacy-sensitive. In its simplest form, a machine learning model is trained individually on each device, which transmits its parameter updates to an aggregating (e.g., cloud) platform. After combining these updates via an averaging algorithm, the centralized platform sends the globally updated parameters back to the nodes, thus updating the individual models with information from their peers. The basic structure of this process is shown as a diagram in Fig. 12, and the corresponding flowchart is shown in Fig. 11.
Federated learning has found numerous applications in mobile edge networks [68], but its potential within power electronics remains largely untapped. As a recent example in power electronics, [28] presents a false data injection attack (FDIA) detection algorithm for photovoltaic systems based on a long short-term memory (LSTM) classifier, with model updates obtained through federated learning.
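The train-locally, average-centrally cycle described above can be sketched as follows. This is a minimal illustration under assumed conditions: each node fits a two-parameter linear model to its own synthetic data, and the aggregator uses a sample-count-weighted average; none of the names or values come from the cited works.

```python
def local_update(weights, local_data, lr=0.1, epochs=5):
    """Train a copy of the global model [w, b] on one node's private data."""
    w, b = weights
    for _ in range(epochs):
        for x, y in local_data:
            err = (w * x + b) - y
            w -= lr * err * x
            b -= lr * err
    return [w, b]


def federated_average(client_weights, client_sizes):
    """Average the parameter vectors, weighting each node by its sample count."""
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [sum(w[i] * s for w, s in zip(client_weights, client_sizes)) / total
            for i in range(n_params)]


def federated_round(global_weights, clients):
    """One round: broadcast, local training, then aggregation of the updates."""
    updates = [local_update(list(global_weights), data) for data in clients]
    sizes = [len(data) for data in clients]
    return federated_average(updates, sizes)


# Two nodes observe different operating ranges of the same relation y = 2x + 1.
clients = [
    [(k / 10.0, 2.0 * k / 10.0 + 1.0) for k in range(-10, 0)],
    [(k / 10.0, 2.0 * k / 10.0 + 1.0) for k in range(0, 11)],
]
weights = [0.0, 0.0]
for _ in range(50):
    weights = federated_round(weights, clients)
```

Only the parameter vectors cross the network; the raw samples in `clients` never leave their node.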

Bayesian deep learning
Bayesian deep learning (BDL) is a unified probabilistic framework that integrates deep learning and Bayesian models [69,70]. Its advantages lie in capturing conditional dependencies in high-dimensional data and in modeling uncertainty [71]. The general framework of Bayesian learning is shown in Fig. 13. The left part represents the perception component, and the right part is the task-specific component. In Bayesian deep learning, the perception component can be represented as a deep neural network. In this structure, there are three key sets of variables: the perception variables $\Omega_p$, the hinge variables $\Omega_h$ and the task variables $\Omega_t$. If the edges between the two components point towards the hinge variables $\Omega_h$, the joint distribution of all variables can be written as
$$p(\Omega_p, \Omega_h, \Omega_t) = p(\Omega_p)\, p(\Omega_h \mid \Omega_p)\, p(\Omega_t \mid \Omega_h). \tag{4}$$
If the edges between the two components originate from the hinge variables $\Omega_h$, the joint distribution of all variables can be written as
$$p(\Omega_p, \Omega_h, \Omega_t) = p(\Omega_t)\, p(\Omega_h \mid \Omega_t)\, p(\Omega_p \mid \Omega_h). \tag{5}$$
Eq. (4) can be used for supervised learning while Eq. (5) can be used for unsupervised learning. Note that besides these two vanilla cases, it is possible for BDL to simultaneously have some edges between the two components pointing towards $\Omega_h$ and some originating from $\Omega_h$, in which case the decomposition of the joint distribution would be more complex. The works in [72,73] directly use convolutional neural networks (CNN) or deep belief networks (DBN) to assist representation learning for content information, but the deep learning components of their models are deterministic and do not model noise, and hence they are less robust. The corresponding flowchart of BDL is shown in Fig. 9.

Deep reinforcement learning
Deep reinforcement learning (DRL) is a prominent machine learning paradigm that combines deep neural networks with the reinforcement learning (RL) concept [74]. The core concept of RL is visualized in Fig. 15. In each time step $t$, the agent applies an action $a_t$ to the environment. Then, based on the action and the operation rules, the environment generates the reward $r_t$ for the action, as well as the current state $s_t$. After iterating over many time steps, the agent becomes able to generate proper actions $a$ based on the measured states $s$ of the environment, which is the so-called policy $\pi(a \mid s)$.
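The agent-environment loop can be illustrated with a small tabular Q-learning sketch. The environment is a hypothetical five-state chain in which the agent moves left or right and is rewarded on reaching the last state; the chain, the reward, and all hyperparameters are assumptions for illustration. In DRL, the table below would be replaced by a deep neural network.

```python
import random

N_STATES = 5
ACTIONS = (0, 1)  # 0: move left, 1: move right


def step(state, action):
    """Toy environment: returns the next state, the reward, and a termination flag."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def train(episodes=2000, alpha=0.5, gamma=0.9, epsilon=0.5, seed=0):
    """Learn action values from repeated interaction using epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _episode in range(episodes):
        s = 0
        for _t in range(100):
            if rng.random() < epsilon:
                a = rng.choice(ACTIONS)                       # explore
            else:
                a = max(ACTIONS, key=lambda act: q[s][act])   # exploit
            s_next, r, done = step(s, a)
            target = r if done else r + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])
            s = s_next
            if done:
                break
    return q


q_table = train()
# Greedy policy: the best-valued action in every non-terminal state.
policy = [max(ACTIONS, key=lambda act: q_table[s][act]) for s in range(N_STATES - 1)]
```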
The methods of RL are summarized in Fig. 16. RL methods consist of model-based methods and model-free methods. The difference is that model-based methods explicitly estimate and update the system model and take actions based on the estimated model, e.g., Thompson Sampling and Upper Confidence RL (UCRL). Model-free methods can be separated into policy-based methods and value-based methods. Value-based methods directly estimate the optimal value function, e.g., Q-learning, SARSA, Least-Squares Policy Iteration (LSPI) and fitted Q-iteration. Policy-based methods estimate the optimal policy, and include trust region policy optimization, proximal policy optimization and many actor-critic structured methods. DRL methods bring deep learning into the RL concept, where a deep neural network is used to approximate the policy or the value function in policy-based or value-based methods, respectively. The most widely used combinations in power electronic converter and motor drive cases are the deep deterministic policy gradient (DDPG) and deep Q-learning (DQN): DDPG uses deep neural networks (DNN) to estimate the actor and the critic of the actor-critic method, and DQN uses a DNN to estimate the Q-value of Q-learning [85,86]. DQN is usually used for discrete action cases and DDPG for continuous cases [87][88][89]. The general flowchart of DRL is shown in Fig. 14.

Table 3
Summary of Section 4.
Supervised Learning. Limitations: Requires labeled data, may struggle with high-dimensional or complex data.
Unsupervised Learning. Description: Discovering patterns or structures in unlabeled data without explicit output labels. Advantages: Useful for exploratory data analysis, finding hidden patterns, and dimensionality reduction. Limitations: Lack of explicit labels may make interpretation and evaluation challenging.
Federated Learning. Description: Collaborative learning on distributed data without centralizing it. Advantages: Preserves data privacy, enables learning on decentralized devices, facilitates collaboration. Limitations: Communication and synchronization overhead, potential bias from non-representative data.
Bayesian Learning. Description: Incorporating prior knowledge and updating beliefs based on observed data using Bayesian inference. Advantages: Provides a principled framework for handling uncertainty, can make robust predictions. Limitations: Computationally intensive for complex models, may require strong prior assumptions.
Deep Reinforcement Learning. Description: Training agents to learn optimal actions based on rewards and interactions with an environment. Advantages: Effective for sequential decision-making problems, can learn complex strategies. Limitations: Requires substantial computational resources, susceptible to sample inefficiency.
Physics-Informed Neural Networks. Description: Combining physics-based knowledge with neural networks to learn from data and satisfy physical laws. Advantages: Incorporates domain-specific knowledge, improves generalization and interpretability. Limitations: Requires domain expertise to formulate physics constraints, potential challenges in balancing data-driven and physics-driven components.

Physics-informed neural network
A physics-informed neural network (PINN) is a kind of universal function approximator that can embed physical knowledge in the learning process [90]. The concept of the PINN is visualized in Fig. 18. The physical laws of the learned system are summarized as knowledge, which can accelerate or enhance the learning process.
Traditional neural network training depends heavily on data, whereas data correctness and low data availability are apparent issues in power electronic converter and motor drive research and industrial applications [44]. These characteristics render traditional neural networks ineffective in most such cases. For a PINN, embedding knowledge of general physical laws in the training of the neural network increases the robustness and the correctness of the function approximation, thus enhancing the understanding of the available data and allowing the trained models to generalize well even with a small amount of data [18,91]. When machine learning techniques are used in the power electronics domain, the PINN is therefore a promising solution. In [19], a PINN framework is adopted for online impedance measurement. In that comparison, the PINN-based method shows better performance in scenarios with small amounts of data. The corresponding flowchart is shown in Fig. 17.
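The idea of combining a data term with a physics term can be sketched with a composite loss for a hypothetical series RL circuit governed by L di/dt + R i = V. The circuit, its parameter values, and the use of finite differences are assumptions made for illustration; in an actual PINN the candidate model is a neural network and the derivative is typically obtained by automatic differentiation.

```python
import math


def physics_residual(i_model, t, r, l, v, h=1e-4):
    """Residual of L di/dt + R i - V at time t, using a central finite difference."""
    di_dt = (i_model(t + h) - i_model(t - h)) / (2.0 * h)
    return l * di_dt + r * i_model(t) - v


def pinn_loss(i_model, data, collocation_times, r, l, v, weight=1.0):
    """Mean squared data error plus weighted mean squared physics residual."""
    data_loss = sum((i_model(t) - i_meas) ** 2 for t, i_meas in data) / len(data)
    physics_loss = sum(physics_residual(i_model, t, r, l, v) ** 2
                       for t in collocation_times) / len(collocation_times)
    return data_loss + weight * physics_loss


# Hypothetical circuit parameters.
R, L, V = 1.0, 0.01, 10.0


def exact(t):
    """Candidate model that satisfies the circuit equation."""
    return V / R * (1.0 - math.exp(-R * t / L))


def wrong(t):
    """Candidate model with an incorrect time constant."""
    return V / R * (1.0 - math.exp(-0.5 * R * t / L))


data = [(t, exact(t)) for t in (0.005, 0.02)]   # only two "measurements"
collocation = [k * 0.002 for k in range(1, 21)]  # points where physics is enforced
loss_exact = pinn_loss(exact, data, collocation, R, L, V)
loss_wrong = pinn_loss(wrong, data, collocation, R, L, V)
```

The physics term penalizes a candidate at the collocation points even where no measurement exists, which is why such a loss can remain informative with very little data.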
The comparison and summary of different online learning algorithms in Section 4 are shown in Table 3.

Online learning case study demos
Based on the different online learning problems in Section 2, online learning models in Section 3, and algorithms in Section 4, this section demonstrates the practical case studies and experimental results.

Anomaly detection
In this case study for anomaly detection, the goal is to identify thermal anomalies in a motor drive by monitoring the heat sink temperature of the power electronic converter.Fig. 19 shows the laboratory setup from which data are obtained.It consists of two Danfoss VLT ® AutomationDrive FC 302 motor drives controlling a pair of coupled induction machines, allowing for testing under variable loading conditions.Randomized torque and speed references are sent to the drives, while continuously monitoring, among other values, the output current and the heat sink temperature of one of them.
A neural network model is trained online on a rolling window of measurements, mapping the last 30 min of rms output current measurements to the real-time value of the heat sink temperature, with data collected during expected operating conditions and with a sampling time of 10 s. After the training process converges, the model can be used for inference, resulting in prediction errors that can be employed to detect anomalies. Some corresponding test results are shown in Fig. 20, where the top graph displays the measurements and the corresponding estimations of heat sink temperature, and the bottom graph displays the squared prediction error. At about 5.5 h, the air inlet of the converter's cooling system is blocked with a piece of cardboard, which results in the estimated temperatures being consistently lower than the corresponding measurements, as the model has been trained on data without any blockage. This is reflected in the error values, which can then be employed in an anomaly test to conclude that a fault is present.
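With a 30 min window and a 10 s sampling time, each model input contains 180 current samples. The sketch below shows one way the windowed input-target pairs could be formed, together with a simple anomaly test on the squared prediction error. The specific test (recent mean error compared with a multiple of the fault-free baseline) and its parameters are assumptions for illustration, not the test used in the case study.

```python
WINDOW = 30 * 60 // 10  # 30 min of history at a 10 s sampling time = 180 samples


def make_windows(current_rms, heatsink_temp, window=WINDOW):
    """Pair each window of past rms current samples with the temperature at its end."""
    pairs = []
    for k in range(window - 1, len(current_rms)):
        pairs.append((current_rms[k - window + 1:k + 1], heatsink_temp[k]))
    return pairs


def detect_anomaly(squared_errors, baseline_errors, n_recent=30, factor=5.0):
    """Flag a fault when the recent mean squared error exceeds factor times the baseline."""
    baseline = sum(baseline_errors) / len(baseline_errors)
    recent = squared_errors[-n_recent:]
    return sum(recent) / len(recent) > factor * baseline


# Placeholder signals, used only to show the shapes involved.
pairs = make_windows(list(range(200)), list(range(200)))
healthy = detect_anomaly([0.1] * 30, [0.1] * 100)
faulty = detect_anomaly([4.0] * 30, [0.1] * 100)
```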
These results are obtained with a fully-connected neural network model, containing two hidden layers of 16 and 8 hidden units, respectively, each employing the ReLU activation function.The neural network model is trained with backpropagation, with a mean squared error loss function, and stochastic gradient descent with momentum as the optimizer.

Remaining useful life prediction
The diagram of the two-stage single-phase grid-connected PV system is shown in Fig. 21, where the boost converter is operated in maximum power point tracking (MPPT) mode and a proportional-resonant (PR) controller is used for the current control of the single-phase inverter. This case study illustrates the use of machine learning models for RUL estimation, with the goal of monitoring a power electronic converter system and optimizing its lifetime through proper inductor sizing. The design problem is formulated as an optimization that weighs the lifetime consumption against the filter inductors $L_1$ and $L_2$ through a weighting factor. In this example, a surrogate neural network model ANN1 is first trained to map the operating conditions and design parameters (i.e., the dc voltage and the switching frequency) to the junction temperatures, where the training data are obtained from thermal simulation of the converter. The trained ANN1 thus establishes the mapping between the parameters and the junction temperatures. Feeding the yearly mission profile of the converter to ANN1 yields the yearly junction temperature data. Applying cycle counting and Miner's rule to the yearly junction temperature data then yields the lifetime consumption. Next, ANN2 is trained to learn the mapping from the design parameters to the lifetime consumption. Based on ANN2, the Pareto front of optimal designs obtained with different weighting factors is shown in Fig. 22. With the machine learning technique, an optimal lifetime design with proper inductor sizing can be achieved.
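The final step of the chain, turning counted thermal cycles into a lifetime consumption with Miner's rule, can be sketched as follows. The cycle counting itself is assumed to have been done already, and the Coffin-Manson-type lifetime model with its coefficients is a hypothetical placeholder; the case study does not state which lifetime model was used.

```python
def cycles_to_failure(delta_t, a=3.0e14, alpha=5.0):
    """Assumed Coffin-Manson-type model: N_f = a * delta_t ** (-alpha)."""
    return a * delta_t ** (-alpha)


def lifetime_consumption(counted_cycles):
    """Miner's rule: sum of n_i / N_f,i over all counted temperature swings."""
    return sum(n / cycles_to_failure(delta_t) for delta_t, n in counted_cycles)


# Hypothetical yearly cycle counts: (junction temperature swing in K, number of cycles).
yearly_cycles = [(20.0, 1.0e5), (40.0, 1.0e4)]
lc_per_year = lifetime_consumption(yearly_cycles)
estimated_lifetime_years = 1.0 / lc_per_year  # failure is predicted when the sum reaches 1
```

In the surrogate workflow described above, ANN2 would learn to predict `lc_per_year` directly from the design parameters so that this chain does not need to be rerun for every candidate design.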

Impedance based stability analysis
This case study applies online learning to online stability analysis. The system is shown in Fig. 23, where the three-phase grid-connected inverter is controlled by a current controller and a phase-locked loop. The parameters of the system are listed in Table 4. We adopt the PINN-based method to enable online impedance modeling of the converter, and the multi-operating-point (MOP) admittance of the converter is shown in Fig. 24. The admittance in the dq-frame can be represented as a 2 × 2 matrix, which can be written as follows:
$$\mathbf{Y}(s) = \begin{bmatrix} Y_{dd}(s) & Y_{dq}(s) \\ Y_{qd}(s) & Y_{qq}(s) \end{bmatrix}.$$
Fig. 24 shows the magnitude and phase of the four elements as the operating point of the system changes. When the voltage source converter (VSC) is at operating point a, the grid-connected VSC system is unstable, while the system is stable at operating point b. The experimental results verify the accuracy of the admittance model generated with the online learning methods.

MPC for converter control
A case study of a dc microgrid controlled with the MPC method is shown in Fig. 28, where a boost converter is integrated into the system. The source and the load are represented as a constant power source and a constant power load, respectively. The inductor L1 is 1 mH, and the capacitor C1 is 1 mF. The dc-link voltage is controlled at 100 V and the voltage of the boost converter is set to 50 V. The dc-microgrid system is controlled by MPC. A dc-microgrid platform feeding a constant power load (CPL) is constructed as shown in Fig. 29. The DC/DC converter, controlled by a dSPACE 1006, is used to connect the battery to the common bus. Only the CPL is integrated into the system, since a pure CPL is the worst case in terms of stability. The experimental result of the system with a variable CPL is presented in Fig. 30. The CPL increases from 500 W to 1 kW at $t_1$ and decreases from 1 kW to 500 W at $t_2$. As shown in the figure, under the load variation the bus voltage accurately tracks the reference with smooth transient performance, and the settling time is around 2 ms. The experimental result validates the effectiveness of the MPC method in dc-microgrid control.
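The predict-then-optimize structure of MPC can be illustrated with a generic one-step finite-control-set sketch for a boost converter feeding a CPL. This is not the observer-based controller used in the experiment: the averaged switch model, the 10 µs sampling time, the reading of 50 V as the converter input voltage, and the inductor-current cost are all assumptions made for illustration. Only L1 = 1 mH and C1 = 1 mF are taken from the case study.

```python
L1 = 1e-3  # inductance in H (from the case study)
C1 = 1e-3  # capacitance in F (from the case study)
TS = 1e-5  # sampling time in s (assumed)


def predict(i_l, v_c, switch_on, v_in, p_load, ts=TS):
    """Forward-Euler prediction of inductor current and bus voltage for one step."""
    d_i = (v_in - (0.0 if switch_on else v_c)) / L1
    i_load = p_load / v_c  # constant power load draws P / v
    d_v = ((0.0 if switch_on else i_l) - i_load) / C1
    return i_l + ts * d_i, v_c + ts * d_v


def mpc_step(i_l, v_c, v_in, p_load, i_ref):
    """Evaluate both switch states and return the one with the lowest cost."""
    best_switch, best_cost = None, float("inf")
    for switch_on in (False, True):
        i_next, _v_next = predict(i_l, v_c, switch_on, v_in, p_load)
        cost = (i_ref - i_next) ** 2  # assumed cost: inductor current tracking
        if cost < best_cost:
            best_switch, best_cost = switch_on, cost
    return best_switch


# Assumed operating point: 50 V input, 100 V bus, 1 kW CPL, so i_ref = P / v_in = 20 A.
below_reference = mpc_step(i_l=5.0, v_c=100.0, v_in=50.0, p_load=1000.0, i_ref=20.0)
above_reference = mpc_step(i_l=30.0, v_c=100.0, v_in=50.0, p_load=1000.0, i_ref=20.0)
```

An ANN-based variant would replace either `predict` (learned prediction model) or the whole of `mpc_step` (learned controller), matching the two model types discussed in Section 3.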
The summary of the case study in Section 5 is shown in Table 6.

Limitations and future opportunities
Despite the accomplishments that have been achieved in online learning for power converter and motor drive systems (as reviewed by this paper), there are still many challenges that remain unsolved, and thus provide many future opportunities:

Case study: Description
Anomaly detection: Identify thermal anomalies in a motor drive by monitoring the heat sink temperature of the power electronic converter using a trained ANN.
Remaining useful life prediction: Use the trained ANN model for the optimal lifetime design with proper inductor sizing.
Impedance based stability analysis: Use the ANN-based impedance model to predict the interaction stability of the grid-inverter system.
MPC for converter control: Use the observer-based MPC to control the boost converter.

Short summary
Our work is not only a review of online learning in power converter and motor drive applications; more importantly, it provides a systematic guideline on how online learning can be designed and utilized to address challenges in recent power converter and motor drive applications. We mainly focus on four challenges, i.e., anomaly detection, remaining useful life prediction, impedance identification and online stability analysis, and control. For each challenge, we formulate online learning problems, build online learning models, implement different online learning algorithms, and provide an experimental demonstration for practical implementation. These works provide good performance in the targeted applications. However, we still find some limitations: (1) Most existing works target one specific converter or motor drive system; when moving to another converter or motor drive system, the time-consuming offline training and online implementation need to be performed again, which limits the scalability of the solutions. (2) Most existing works assume that the data they use are of good quality. Thus, the solutions might not be robust under polluted or attacked data. (3) As the obtained online learning models are black-box neural network models, it is difficult to verify afterward whether they are correct. (4) In most works, the model training is still performed offline, while online training and updating are difficult due to the limited computing resources of existing digital processors.

Future opportunities
Based on the above limitations, future works are discussed below.

Scalability and adaptation
For online learning methods for the monitoring, diagnosis and control of power converters and motor drives, the obtained machine learning model can normally only be used for the specific system under study, e.g., a machine learning model trained for one converter cannot be used for another converter, and extensive work and a huge amount of data are required if there are numerous converters. Since numerous power converter and motor drive products are developed and commissioned by industry and power grids every year, it is important to develop scalable and adaptable online learning methods such that, once a machine learning model of a converter or motor drive is obtained, only a light retraining effort is required for it to be used in other converter or motor drive products.

Robustness against data quality and cybersecurity
Online learning methods are data-driven methods whose accuracy and effectiveness rely heavily on data quality. However, real measurement data from power converters and motor drive systems may suffer from various issues, e.g., noise, missing data, measurement errors, and even cyberattacks. Small perturbations of the data may lead to dramatically different machine learning models and unreliable results. Therefore, it is important to develop methods that guarantee the robustness of online learning methods, so as to achieve acceptable monitoring, diagnosis, evaluation and control results under data corruption and cyberattacks.

Real-time performance
Power converter systems have fast dynamics and no inertia, and their operating conditions change frequently; they therefore require rapid decision-making. To guarantee system performance, online learning algorithms need to operate in real time to make decisions in response to changing operating conditions. This motivates the development of efficient online learning algorithms, as well as the utilization of high-performance computation.

Combining model-based methods and data driven methods
Purely data-driven online learning methods may suffer from the scalability and robustness issues discussed above. Power converters and motor drive systems are not black-box systems: they have physical models and domain knowledge that can provide much information. Therefore, a promising future direction is to combine model-based methods and data-driven methods, to make full use of both the physical model and the data, and to inherit the advantages of both. There are several possible ways to combine them. One way is to use physical knowledge to structure or pretrain a neural network, and then use data to train the final neural network. For example, a physics-informed neural network based modeling method is developed in [42], which uses physics knowledge and a generic model of a voltage source converter to compress and pretrain a neural network model, and then uses limited online measured data to obtain the final neural network model for the targeted power converter; it achieves fast and accurate online impedance identification of power converters with a largely reduced amount of data compared with purely data-driven methods. The other way is to use the physical model as a nominal model, and then use data to identify the unknown or varying parameters. The observer-based model predictive control method in [56] is an example that achieves more accurate control results than a purely model-based method. There are still many challenges to be solved in making the most of physical knowledge and data in online learning. Despite the limited work so far, integrating physical knowledge in online learning is recognized as a promising future direction.

Interpretation
As models obtained by online learning methods are black-box models, it is difficult to understand why a decision is made. Machine-learning-based solutions are thus less convincing for practitioners to implement in industrial applications, especially for safety-critical infrastructure. As power converters and motor drive systems are widely used in power systems, electric transportation and various industries, which are critical infrastructures of society, it is pressing to improve the interpretability of online learning solutions so that operators and customers can understand how machine learning models make their decisions, thereby enabling wide deployment.

Integration with internet of things (IoT) and other digital technologies
As online learning algorithms require real-time data and efficient data processing, there is a trend to integrate them with IoT and other digital technologies (e.g., edge computing and cloud computing) for power electronic and motor drive applications. This integration can facilitate data processing and enhance the efficiency of online algorithm implementation.
In summary, a flowchart of the future plan of online learning for control and diagnosis of power converters and drives is shown in Fig. 31.

Conclusion
This paper aims to provide an overview of online learning for power converter and motor drive applications. Power converters and motor drives are playing an important role in sustainable energy systems and electric transportation systems. However, because field operating profiles are more diverse and changeable than those in lab tests and simulations, there are challenges in condition monitoring, fault detection, online stability assessment, and control for power converter and motor drive applications. First, these typical challenges in power converter and motor drive applications are introduced. Next, the development of online learning models, including the selection of input and output features for condition monitoring, fault detection, online stability assessment and model predictive control, is presented. Then, different online learning algorithms that can be designed for these applications are illustrated. Practical case studies with experimental results are demonstrated as typical examples of online learning for power electronic and motor drive applications. Finally, the challenges and opportunities in online learning for power converter and motor drive applications are discussed. This paper also serves as a guideline on how online learning algorithms can be applied to enhance performance in the modeling, design, analysis and control of power converter and motor drive systems.

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Fig. 1. Model-based anomaly detection applied to a power electronic system.

Fig. 2. RUL estimation process for a power converter system.

Fig. 4. Typical MPC for power electronics converter and motor drive.

Fig. 5. ANN based MPC surrogate model for power electronics converter and motor drive.

Fig. 8. A feed-forward neural network model with two hidden layers.

Fig. 12. An application of federated learning. Local models are continuously trained on data from individual power converters, and are aggregated into a global model in a cloud computing platform.

Fig. 19. Laboratory setup used for data collection in the fault detection case study.

Fig. 20. Experimental results for the fault detection case study.

Fig. 21. Diagram of two-stage single phase grid connected PV system.
The operating point is swept by varying the converter current from 20 A to 80 A. Taking the magnitude of one admittance element as an example, if one current component is fixed to 20 A, a 2-D frequency-response diagram is obtained, as shown in Fig. 25. With the trained ANN model, the stability condition of the system can be obtained, as shown in Table 5. To validate the stability analysis result, an experiment is conducted. The experimental test setup is shown in Fig. 26. A dSPACE DS1007 is used to control the VSC, and a Chroma 61845 serves as the grid simulator. The two cases shown in Table 2 are tested, and the corresponding experimental results are shown in Fig. 27, which displays the grid voltage and the three output currents of the VSC.

Fig. 27. Experimental results of grid-converter interaction system at different operating points. (a) Case I: operating point a. (b) Case II: operating point b.

Fig. 31. Future plan of online learning for control and diagnosis of power converter and drives.

Table 1
Summary of Section 2.

Table 3
Summary of Section 4.

Table 5
Stability prediction results.

Table 6
Summary of Section 5.