Machine Learning for Inverter-Fed Motors Monitoring and Fault Detection: An Overview

Monitoring and fault detection can be critical for efficient, safe and reliable operation of electric drive systems. Unfortunately, developing accurate physics-based models for these tasks is difficult due to unknown machine parameters and incomplete knowledge of the physical phenomena occurring within the system. Machine Learning (ML) methods can learn the system’s behavior from data without requiring explicit models. However, expert knowledge of the system is still crucial to extract useful features before applying ML models. This paper presents an overview of the use of ML and data visualization methods for condition monitoring of inverter fed induction motors. More specifically, stator winding temperature estimation and insulation degradation are considered. The analyzed methods make use of the signals normally available in electric drives. Time and frequency-based approaches are considered. The developed methods are assessed on an experimental test bench. The paper is intended to bridge ML and electric drive domains. The desired outcome of this work is to provide useful guidelines for researchers in the electric drives field who aim to apply modern ML and data visualization techniques for monitoring and fault detection.

$x^{(j)}$: $j$-th input sample to the model.
$i_{abc}$, $u_{abc}$: Three-phase time series of currents and voltages.
$i_{\alpha\beta}$, $u_{\alpha\beta}$: Time series of currents and voltages as complex vectors.
$T$: Stator temperature.
$w(k)$: Motor speed at sample $k$.
$f(x^{(j)}, \theta)$: Parametric non-linear function defined by a set of parameters $\theta$ and input $x^{(j)}$.
$f_{linear}(x^{(j)}, \theta)$: Parametric linear function defined by a set of parameters $\theta$ and input $x^{(j)}$.
$\hat{y}$: Predicted value of the variable $y$ estimated by $f$.
$I_{\alpha\beta}(h)$: The $h$-th harmonic of the time series $i_{\alpha\beta}$.
$U_{\alpha\beta}(h)$: The $h$-th harmonic of the time series $u_{\alpha\beta}$.
$Z_h$, $Z_{-h}$: Direct and inverse sequence impedance, respectively, for the $h$-th harmonic.

III. INTRODUCTION
Induction motors are widely used in multiple sectors due to their robustness, simplicity, and competitive cost. Monitoring and fault detection techniques play a key role in ensuring their reliable and efficient operation. Detecting abnormal conditions at an early stage makes it possible to extend the lifespan of the machine, mitigate potential risks, reduce downtime and prevent catastrophic failures, eventually resulting in reduced costs and improved operational efficiency [1], [2], [3], [4]. The existing approaches for fault detection and diagnosis can be classified into three categories: model-based approaches, signature-based approaches and data-driven approaches.
Model-based approaches [4], [5] rely on mathematical models which can be derived from the physical equations or other prior knowledge of the process. By feeding these models with the actual process inputs, deviations between the estimated and actual behavior can be used to detect faults. However, the effectiveness of these approaches relies heavily on the accuracy of the model, so both the model design and the precise estimation of its parameters become crucial.
Signature-based approaches [1], [6] avoid the need for process models, focusing on specific signatures or patterns associated with known faults. Patterns are obtained from historical data analysis or from previous expert knowledge. These methods are based on the fact that certain measured variables or behaviors of the motor, such as currents [7], vibrations [8], or acoustic noise [9], contain valuable information on its mechanical and/or electrical condition. They typically use signal processing techniques, such as the Fast Fourier Transform (FFT) [1], [10] and wavelet transforms [11], to extract relevant features or descriptors from the measured signals that are expected to carry fault-related information. The resulting features can be statistically compared with the expected values under normal conditions to detect deviations indicative of a fault. However, signature-based methods still rely on the availability of patterns under both normal and fault conditions. Moreover, since these patterns are linked to particular faults, they might not uncover faults producing different patterns.
Data-driven approaches [12], [13], [14], [15] learn fault-related patterns and input/output relationships directly from process data without relying on explicit models or predefined fault signatures. These methods often use ML algorithms to train general models that can later be used for prediction or classification purposes. The growing data resources and recent advances in ML, especially in Deep Learning (DL) techniques [16], have increased the popularity of data-driven methods over model-based and signature-based approaches. A more detailed review of data-driven approaches is presented in section IV.
Monitoring and fault detection for electric drives using the above approaches present the following challenges: 1) the lack of labelled data from realistic scenarios that measure the progressive, unforced motor degradation; 2) the development of ML-based models that are invariant to changes in the operating point; and 3) the complexity and high-dimensional nature of fault models, involving electrical, electromagnetic, mechanical and thermal variables, where the mere application of ML algorithms, without wisely using the existing knowledge of the physical principles of electric machines, is unlikely to succeed. Methods and approaches that provide insight within the electrical engineering domain are required to find an optimal design under the many degrees of freedom that arise from combining variables, feature extraction techniques, level of analysis granularity, etc. However, few works include methodological considerations specific to electric motors [17], [18], such as the use of frequency descriptors of electrical variables, or the use of data-driven dynamic models for monitoring and fault detection. Surprisingly, even less frequent is the use of data visualization techniques for the discovery of useful knowledge for fault detection and diagnosis [19]. Dimensionality reduction techniques, which have been successfully used in other fields such as biomedicine [20], [21], chemical process analysis [22] and process monitoring [23] to discover relationships and useful knowledge in complex problems from multidimensional data, remain largely unexplored in the field of electric drives. Condition monitoring in inverter-fed motors also involves multidimensional data (currents, voltages, harmonics, etc.), and the relationship between the signals and faults is not straightforward. Exploratory analyses enhanced by data visualization techniques can be a powerful approach in this ill-defined scenario to discover patterns related to abnormal states, suitable to be used as potential targets for ML algorithms, and to contrast hypotheses about potential sources of fault.
This paper aims to bridge the ML and electric drives domains, discussing the applicability as well as the pros and cons of data-driven techniques for monitoring and fault detection of inverter-fed machines. As an example of application, the proposed procedure is applied to monitor the stator condition of an inverter-fed motor. Currents and voltages feeding the machine are used to train and evaluate several widely used ML and data visualization models. Data were collected from a sequence of tests in which the motor underwent progressive deterioration and eventually suffered irreversible damage.
The paper is organized as follows. First, a literature review of previous works on ML techniques for monitoring and fault detection in inverter-fed motors is included in section IV. Nomenclature and conventions used throughout the paper are presented in section V. The model of the electric motor is discussed in section VI. In section VII the test bench and experiments used to develop the methodology are described. Section VIII discusses the options for applying ML to the monitoring and diagnosis of inverter-fed motors. Section IX shows the application of the proposed methodology to the experimental data and discusses the results. Finally, section X summarizes the conclusions.

IV. REVIEW OF MACHINE LEARNING METHODS FOR MONITORING AND FAULT DETECTION OF ELECTRIC MOTORS
The use of ML classifiers for fault diagnosis of induction motors can be found in [24] and [25]. ML classifiers rely on handcrafted features or descriptors extracted from currents, voltages and vibrations of the motor [26]. Bearing condition monitoring has been widely addressed by support vector machine (SVM) classifiers [27], [28], artificial neural networks (ANN) [29], [30] and random forest classifiers [31] using different types of time-domain and frequency-domain features extracted manually from feeding signals and vibrations. However, electrical faults have been less covered by classic ML-based classifiers. In [17], the authors proposed ANN and SVM models to detect broken rotor bar faults using both time-domain and frequency-domain features computed on phase voltages and currents.
Deep learning (DL) techniques [16], [32] have been replacing classic ML approaches in recent years with classifiers and estimators based on deep neural networks. Advances like the introduction of rectified linear units (ReLU) [33], the availability of large datasets for training, and open-source libraries that make DL more accessible and allow harnessing the computational power of GPUs have contributed to this paradigm shift. DL techniques have demonstrated excellent performance compared to classical ML methods in several aspects, including the ability to learn features relevant to solving classification and regression problems.
Feature learning results from the training of neural networks with many layers that can progressively extract higher-level features from raw input data. DL models are able to learn fault-indicative features directly from the raw motor signals, avoiding handcrafted feature extraction stages. Specifically, Convolutional Neural Networks (CNNs) can learn local patterns, enabling the extraction of features associated with time dependencies in the data. This ability has led to extensive use of CNN classifiers, for example, in bearing fault detection [34], [35], [36], [37], [38], [39]. In [40], a deep CNN classifier was successfully applied to the condition monitoring and fault detection of an induction motor. Notably, the CNN demonstrated its capacity to learn highly relevant features in the frequency domain for the given task.
CNN-based models often use the currents and voltages feeding the motor and vibration signals, as classic ML approaches do. These signals are directly fed into the models as 1D multivariate time series [39] or as 2D images [34], [36], [37], [38]. Arranging the motor signals as 2D images facilitates the use of pre-trained CNN architectures consolidated in image classification problems. This transfer learning process reduces the amount of data and computational time required to properly train the models. Apart from bearing fault detection, only a few works have addressed other faults, such as broken rotor bars [35] and stator winding faults [41], [42], using DL-based classifiers.
As in the case of signature-based models, ML classifiers are limited to the faults used during the training phase. This implies that a general-purpose classifier needs a balanced and comprehensive training dataset, covering the full spectrum of possible motor faults and working conditions. However, the aforementioned approaches normally include only a few types of artificially generated fault data under similar load and speed conditions, which are not representative enough to achieve good generalization. Despite the efforts of some authors to enrich training sets with data generated by generative deep learning models [35], fault monitoring in induction motors by means of classifiers is still far from mature.
Finally, regression models can be used as soft sensors [43], [44] to predict one or more variables of interest for monitoring purposes. Regression models can replace model-based approaches as analytical copies of the process or digital twins [45], [46] by learning the relationship between the input variables of the process and the monitored signals.
If regression models are trained with healthy data, the residuals computed as the deviations between the estimation and the real values of the monitored signals are an indicator of faults [24], [47], [48]. In contrast to the previous models, the deviations detected by regression models carry information about any change in the system from its normal condition, avoiding the need to know every possible type of failure in advance. Due to their potential as general-purpose models where it is not necessary to know the faults in advance, this article considers DL-based soft sensors for monitoring and detecting faults in inverter-fed motors.

V. NOTATION AND THREE-PHASE VARIABLES TRANSFORMATIONS
The notation that will be used throughout the paper is as follows.

A. TIME SERIES
Most of the analysis presented in this paper uses discrete signals, acquired at a sampling rate $f_s = 1/T_s$. With $x(t)$ being the continuous signal, the $k$-th sample is denoted as $x(k) \overset{\text{def}}{=} x(kT_s)$. A time series composed of a sequence of $m$ consecutive samples is denoted as $\{x(k)\}$ (1). When multiple time series are considered, such as for model training, the index $j$ denotes the sample time series and the index $k$ denotes the timestep, $\{x(k)\}^{(j)}$. A more general case involves time series of vectors, where each vector may consist of measurements (e.g. voltages, currents, temperatures, etc.) or computed features (such as harmonics, peak values, averages, etc.). In this case the samples $x^{(j)}$ can be seen as a tensor quantity $x(j, k, c)$, where $j$ denotes the sample, $k$ is the timestep, and $c$ is the channel or vector component.
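The tensor layout $x(j, k, c)$ can be illustrated with a short Python sketch. This is a minimal example and not code from the paper: the helper name `make_windows`, the window length, the step and the toy data are all hypothetical.

```python
def make_windows(series, m, step=None):
    """Split a multichannel series into windows of m timesteps.

    series: list of timesteps, each a list of channel values (series[k][c]).
    Returns a nested list x with x[j][k][c]: sample j, timestep k, channel c.
    """
    step = step or m  # non-overlapping windows unless a step is given
    return [
        [list(row) for row in series[start:start + m]]
        for start in range(0, len(series) - m + 1, step)
    ]


# Hypothetical data: 10 timesteps, 2 channels (e.g. one current, one voltage).
series = [[float(k), 10.0 * k] for k in range(10)]
x = make_windows(series, m=4, step=2)  # 4 samples, 4 timesteps, 2 channels
```

Overlapping windows (a step smaller than `m`) increase the number of training samples at the cost of correlation between them.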

B. THREE-PHASE QUANTITIES AND αβ QUANTITIES
The $k$-th sample of the three-phase currents and voltages is denoted as a vector:
$$i_{abc}(k) = [i_a(k), i_b(k), i_c(k)] \tag{2}$$
$$u_{abc}(k) = [u_a(k), u_b(k), u_c(k)] \tag{3}$$
The use of complex quantities $\alpha\beta$ instead of $abc$ has well-known advantages for the modeling, analysis and control of AC systems. The corresponding current and voltage complex vectors in the $\alpha\beta$ (stationary) reference frame are obtained as (4)-(5), where $a = e^{-j2\pi/3}$ [49], [50].
Throughout the paper $[i_{\alpha\beta}(k)]$ will be used as a short form of the expanded vector $[i_\alpha(k), i_\beta(k)]$; the same applies to $u_{\alpha\beta}(k)$.
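As an illustration of the $abc$ to $\alpha\beta$ conversion, the sketch below uses the standard amplitude-invariant Clarke transform in its real-valued form. It is assumed, not confirmed, that this matches the scaling and sign conventions of (4)-(5); the function name `abc_to_alphabeta` is hypothetical.

```python
import math


def abc_to_alphabeta(xa, xb, xc):
    """Map one three-phase sample to a complex alpha-beta vector.

    Amplitude-invariant Clarke transform (assumed convention):
      alpha = (2/3) * (xa - xb/2 - xc/2)
      beta  = (1/sqrt(3)) * (xb - xc)
    """
    alpha = (2.0 / 3.0) * (xa - 0.5 * xb - 0.5 * xc)
    beta = (xb - xc) / math.sqrt(3.0)
    return complex(alpha, beta)


# A balanced, unit-amplitude three-phase set at an arbitrary angle.
angle = 0.3
v = abc_to_alphabeta(
    math.cos(angle),
    math.cos(angle - 2 * math.pi / 3),
    math.cos(angle + 2 * math.pi / 3),
)
```

With this convention a balanced positive-sequence set maps to a unit-length vector at the same angle, and any common-mode component (equal in the three phases) is removed.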

VI. DATA-DRIVEN ML MODEL OF THE MOTOR
A general model of a system can be described in the form of a parametric expression:
$$\hat{y}^{(j)} = f(x^{(j)}, \theta) \tag{6}$$
where $x^{(j)}$ denotes the input data to the model and $\hat{y}^{(j)}$ its output for the $j$-th sample; $f$ represents a parametric nonlinear function or algorithm relating the input and output; and $\theta$ is the set of parameters that shape the input-output relationship defined by $f$, and hence carry information on the system behavior.
In the case of a motor, the input and output arguments of $f$ can be direct measurements (e.g. currents, voltages, speed, temperatures), computed features (e.g. harmonics, averages, higher-order statistics, and other aggregated values considered relevant for monitoring and fault diagnosis), or time series. The parameters $\theta$ of the model can be learned from the available input/output examples $(x^{(j)}, y^{(j)})$ by minimizing the mean squared error (7).
The regularization term R(θ) is included to prevent model over-fitting and ensure generalization beyond the training data [51].
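For reference, a regularized mean-squared-error objective of this kind is commonly written as shown below. This is a sketch under assumptions rather than a quotation of (7): the number of training samples $N$, the weighting factor $\lambda$ and the quadratic form chosen for $R(\theta)$ are introduced here for illustration only.

```latex
\theta^{*} = \arg\min_{\theta} \; \frac{1}{N} \sum_{j=1}^{N}
  \left\| y^{(j)} - f\left(x^{(j)}, \theta\right) \right\|_2^2 + R(\theta),
\qquad
R(\theta) = \lambda \, \|\theta\|_2^2 \quad \text{(e.g. an L2 penalty)}
```

With an L2 penalty, larger values of $\lambda$ shrink the parameters more strongly, trading training accuracy for smoother models.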

A. TIME BASED MODELING
Time-based modeling uses the general model (6) with time series as inputs. Since time series implicitly contain time-delayed information, these models capture the dynamic behavior of the motor. The model can include other related features as inputs, such as the operating point of the machine. The prediction target in this approach may be a scalar variable, e.g. the stator temperature $T$, or any other time series of the motor, e.g. the currents $i_{\alpha\beta}(k)$.
A possible application of this approach is presented later in section IX-A to predict the motor temperature from the multivariate time series $x^{(j)}$ consisting of the complex-valued currents $i_{\alpha\beta}(k)$ and voltages $u_{\alpha\beta}(k)$ and the motor speed (8).
It is also possible to consider the motor as a dynamical system, where the function $f$ acts as a dynamical model predicting the motor's response (currents) from voltages and speed (9).
The approaches in (8) and (9) are examples of sequence-to-point (seq2point) and sequence-to-sequence (seq2seq) models found in the ML literature [52], [53], [54]. Depending on how the model makes the inference, direct mapping methods or recurrent models can be used.

1) DIRECT MAPPING METHODS
Direct mapping involves a static nonlinear transformation in the signal space, mapping the whole input time series to an output scalar or time series according to the seq2point and seq2seq schemes. In this case $f$ is a nonlinear parametric function that can be adjusted in a data-driven manner, by learning the parameters $\theta$ that minimize the loss function in (7). The most representative and efficient methods for this purpose are convolutional neural networks (CNN) [55], [56], [57]. CNN models increase the accuracy of time-based approaches and reduce overfitting by exploiting the time coherence of the input data [16], [58], [59]. Instead of using a dense connection of weights from every time step of the time series to every neuron of the next layer, CNN models involve a composition of layers in which the input time series is convolved with a set of kernels (digital filters). The convolution principle drastically reduces the number of weights to be learned by the CNN.
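The reduction in the number of weights can be made concrete by counting the parameters of a single layer. The sketch below is illustrative only; the layer sizes are hypothetical and are not taken from the models evaluated in this paper.

```python
def dense_weights(m, channels, units):
    """Weights of a fully connected layer fed with an m-step, multichannel input."""
    return m * channels * units + units  # one weight per input value, plus biases


def conv1d_weights(kernel_size, channels, filters):
    """Weights of a 1D convolutional layer; independent of the series length m."""
    return kernel_size * channels * filters + filters  # shared kernels, plus biases


# Hypothetical sizes: 1000 timesteps, 5 channels, 32 units/filters, 8-tap kernels.
dense = dense_weights(m=1000, channels=5, units=32)           # 160032
conv = conv1d_weights(kernel_size=8, channels=5, filters=32)  # 1312
```

Because the same kernels slide along the whole series, the convolutional count does not grow with the window length, which is what makes long input windows tractable.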

2) RECURRENT MODELS
In the state-space formulation, recurrent models can be written as:
$$s(k+1) = \sigma\left(s(k), x(k), \theta_\sigma\right) \tag{10}$$
$$\hat{y}(k) = g\left(s(k), x(k), \theta_g\right) \tag{11}$$
The seq2seq model defined by (10)-(11) predicts the output time series $\{\hat{y}(k)\}$ from the input time series $\{x(k)\}$ and trainable parameters $\theta = (\theta_\sigma, \theta_g)$, being a particular case of the general model (6). This kind of model involves an internal state vector $s(k)$ containing a minimal set of variables (states) that represent the dynamical condition of the system at time $k$. The state function $\sigma$ predicts the state vector at the next time step based on the current state and the inputs, and therefore conveys information on the internal dynamics (stability, bandwidth, dynamical modes, etc.). The observation function $g$ directly maps the internal state and the inputs to the outputs.
Examples of methods which fall under this category include echo-state networks (ESNs) [60], [61] and related reservoir networks [62], recurrent neural network architectures such as Long Short-Term Memory (LSTM) [63] and Gated Recurrent Units (GRU) [64], regression-based identification of non-linear dynamics [65], and Nonlinear AutoRegressive eXogenous (NARX) models [66]. These methods can identify complex behaviors. However, they are prone to instability and are sensitive during training, as they involve a complex dynamic system.
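The recurrent structure of a state function followed by an observation function can be sketched with a scalar toy model. The specific forms chosen for $\sigma$ (a tanh update) and $g$ (a linear read-out), as well as the function name `rollout`, are assumptions made for illustration; they do not correspond to any of the cited architectures.

```python
import math


def rollout(inputs, theta_sigma, theta_g, s0=0.0):
    """Run a scalar recurrent model over an input sequence.

    State update:  s(k+1) = tanh(a*s(k) + b*x(k))   (sigma, assumed form)
    Observation:   y(k)   = c*s(k) + d*x(k)         (g, assumed form)
    """
    a, b = theta_sigma
    c, d = theta_g
    s, outputs = s0, []
    for x in inputs:
        outputs.append(c * s + d * x)  # observation from current state and input
        s = math.tanh(a * s + b * x)   # state carried to the next timestep
    return outputs


# Hypothetical parameters and a short input sequence.
y_hat = rollout([1.0, 0.0, 0.0], theta_sigma=(0.5, 1.0), theta_g=(2.0, 0.0))
```

The output at each step depends on all previous inputs through the state, which is the property that lets recurrent models represent dynamics without an explicit input window.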

B. FREQUENCY BASED MODELING
Frequency-based techniques aggregate the entire time series $\{u_{\alpha\beta}(k)\}$ and $\{i_{\alpha\beta}(k)\}$ to extract descriptive features (e.g. harmonics, direct or inverse sequence impedances, etc.). Therefore, they produce a static mapping of the entire time series. This approach, although coarser than time-based techniques, can provide more reliable results. The Fast Fourier Transform (FFT) can be applied to the time series of voltages $\{u_{\alpha\beta}(k)\}$ and currents $\{i_{\alpha\beta}(k)\}$ to get the corresponding harmonic series $\{U_{\alpha\beta}(h)\}$ and $\{I_{\alpha\beta}(h)\}$, where $U_{\alpha\beta}(h)$ and $I_{\alpha\beta}(h)$ represent the $h$-th harmonic of the (complex-valued) input and output time series, $h = \ldots, -2, -1, 0, 1, 2, 3, \ldots$. Note that the FFT of a complex vector signal includes positive and negative frequencies.
From these harmonics, features or descriptors which can carry information on the motor condition can be computed. Examples of this are the direct sequence impedance $Z_h$ (12) and the inverse sequence impedance $Z_{-h}$ (13).
Impedances represent the motor behavior in terms of complex gains (amplitude and phase), and are highly invariant to changes in the amplitudes of the voltage harmonics feeding the motor. The direct sequence impedance (12) has been shown to be a reliable descriptor of the symmetric behavior of the motor [67]. On the other hand, the inverse sequence impedance (13) is not useful (it tends to infinity) for the case of an ideal (symmetric, linear) motor, but will take finite values in the presence of asymmetries or nonlinearities. Consequently, inverse sequence impedances have been used for fault detection [67], [68].
Direct and inverse sequence impedances, or any other relevant frequency-based quantities or derived expressions, can be integrated in the model in (6) as feature vectors $x^{(j)}$. In section IX-A, direct impedances are employed to estimate the stator temperature $T^{(j)}$, following the model in (14).
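The extraction of a harmonic impedance from complex time series can be sketched as follows. This is a minimal illustration under stated assumptions: the direct sequence impedance is taken to be the ratio $U_{\alpha\beta}(h) / I_{\alpha\beta}(h)$, a single-bin DFT stands in for the FFT, and the signals and impedance values are synthetic. The function names are hypothetical.

```python
import cmath
import math


def harmonic(series, h, periods):
    """DFT coefficient of a complex series at h times the fundamental.

    The series must span an integer number of fundamental periods, so that
    harmonic h falls exactly on a DFT bin (this avoids spectral leakage).
    """
    n = len(series)
    bin_index = h * periods  # a negative h selects a negative-frequency bin
    return sum(
        x * cmath.exp(-2j * math.pi * bin_index * k / n)
        for k, x in enumerate(series)
    ) / n


def direct_impedance(u, i, h, periods):
    """Z_h = U(h) / I(h); assumed form of the direct sequence impedance."""
    return harmonic(u, h, periods) / harmonic(i, h, periods)


# Synthetic signals: 240 samples spanning 2 fundamental periods, containing
# a fundamental (h = 1) and a -5th harmonic with known impedances.
n, periods = 240, 2
phase = [2 * math.pi * periods * k / n for k in range(n)]
u = [cmath.exp(1j * p) + 0.2 * cmath.exp(-5j * p) for p in phase]
i = [cmath.exp(1j * p) / (2 + 1j) + 0.2 * cmath.exp(-5j * p) / (3 - 4j) for p in phase]

z1 = direct_impedance(u, i, 1, periods)         # close to 2 + 1j
z_minus5 = direct_impedance(u, i, -5, periods)  # close to 3 - 4j
```

Since the impedance is a ratio, scaling the voltage harmonic scales the current harmonic by the same factor in a linear system, which is why the result does not depend on the harmonic amplitude. On real data an FFT routine would normally replace the single-bin sum.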

VII. TEST BENCH AND EXPERIMENTS DESCRIPTION
The test bench consists of two coupled identical induction machines, denoted as Device Under Test (DUT) and Auxiliary (AUX) motors (see Fig. 1 and Fig. 2). Main motor parameters are listed in Fig. 2. Note that the proposed methods do not require previous knowledge of the machine parameters, as they learn the behavior of the machine. The DUT fan was removed; consequently, the DUT reaches higher temperatures than the AUX motor. Both motors are fed by two three-phase inverters connected back-to-back (see Fig. 1). The dc-link voltage was limited to 400 V. During the experiments, the DUT was operated at different speeds and load levels. Two different types of modulation were used for the inverter: linear (PWM) and six-step. Six-step operation is especially appealing for diagnostic purposes as it creates well-defined harmonics at $h = 6n + 1$ multiples of the fundamental, i.e. $h = \ldots, -11, -5, 1, 7, 13, \ldots$, for $n = -2, -1, 0, 1, 2, \ldots$ [69]. The experimental data used in section IX were obtained with the inverter operating in six-step.
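The harmonic orders produced by six-step operation follow directly from $h = 6n + 1$, as the short sketch below shows. The function name `six_step_orders` is hypothetical.

```python
def six_step_orders(n_max):
    """Harmonic orders h = 6n + 1 for n = -n_max, ..., n_max."""
    return [6 * n + 1 for n in range(-n_max, n_max + 1)]


orders = six_step_orders(2)  # [-11, -5, 1, 7, 13]
```

Negative orders correspond to components of the complex $\alpha\beta$ vector that rotate opposite to the fundamental.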
Phase currents and voltages of both machines were measured using Hall-effect current and voltage sensors of 100 kHz bandwidth. Signals were sampled at 750 kHz with 16-bit resolution. A 1024-line encoder is used to measure the speed. Five type-K thermocouple temperature sensors were installed: two inserted into the end-windings of the DUT motor, two attached to the DUT and AUX frames, and one for ambient temperature. Temperatures were sampled at 1 Hz.
Experiments were performed at a maximum rate of one per day, always starting with the motors at ambient temperature. The DUT temperature increase is therefore due exclusively to the losses induced during its operation. To accelerate degradation, in some of the experiments the DUT was forced to operate at temperatures above its class for short periods of time.
A total of 28 experiments were performed before failure. Fig. 3 shows temperatures vs. time for two of them. Both took ≈ 110 min, with a maximum DUT stator winding temperature of ≈ 110 °C and ≈ 160 °C, respectively. Fig. 4 shows the duration and maximum temperature in the stator windings for all the experiments carried out.
DUT phase-to-phase and phase-to-frame insulation was measured using an insulation tester immediately before and after each experiment. For experiments #1 to #23, the measured resistance was >10 GΩ, which corresponds to the upper limit of the insulation tester. The insulation resistance decreased below 10 GΩ after test #23 (see Fig. 4). Following experiment #28 the insulation resistance dropped to a few Ω and the experiments were stopped.

VIII. APPLICATION OF ML METHODS TO MONITORING AND DIAGNOSTIC OF DUT MOTOR
This section discusses the options available for using ML methods for the monitoring and diagnosis of inverter-fed motors. The proposed workflow is depicted in Fig. 5. It is organized in eight stages, denoted by encircled numbers in the figure:
1) Currents and voltages were acquired at 750 kHz, as described in section VII. They are resampled to $f_s = 20$ kHz, which is enough to capture the harmonics considered in this paper while preventing aliasing.
2) The three-phase currents and voltages are converted to $\alpha\beta$ as outlined in section V.
3) The FFT is used to obtain the harmonics of the complex time series $\{i_{\alpha\beta}(k)\}$ and $\{u_{\alpha\beta}(k)\}$. Special care must be taken in selecting the number of samples for the FFT to avoid spectral leakage [67]. After this process $\{I_{\alpha\beta}(h)\}$ and $\{U_{\alpha\beta}(h)\}$ are available.
4) The feature extraction stage builds the input and output data structures $x^{(j)}$, $y^{(j)}$. These structures are used by the model in the monitoring and diagnosis tasks. Feature extraction involves computing meaningful aggregation operations from raw process data (e.g. averages, harmonics, or operations among them, like impedances) to obtain a reduced set of descriptors or features. These features should uniquely and compactly describe the motor condition, serving as a signature. Features and/or raw values are then organized into scalar, vector, time series or multivariate time series forms, as in the examples shown in (8), (9), (14).
5) An ML model, trained as described in stage 7 with data of the motor in healthy condition, uses the computed input features $x^{(j)}$ to predict the expected targets $\hat{y}^{(j)}$. Predictions can be used for monitoring variables of interest, such as the temperature.
6) Alternatively, analytical redundancy can be used for fault detection. The predictions $\hat{y}^{(j)}$ can be compared with the actual values $y^{(j)}$ to yield residuals $r^{(j)}$. Residuals will be close to zero if the motor behaves similarly to the model training condition. If there are differences in the response of the motor, the model of stage 5 will show errors when predicting the targets. In this case, the norm of the residuals $\|r^{(j)}\|_2$ will increase. A decision block can further use the norm of the residuals to classify the state as normal/abnormal, or to build a health indicator. The normal/abnormal classification often results from applying a threshold to the residuals. The confidence in this classification strongly depends on the chosen threshold. Section IX-B statistically demonstrates the validity of the residuals as a score on which a threshold can be applied. The performance of the resulting classification can then be assessed using established metrics, such as the F1-score, applied to data from a representative sample of motors working until failure.
7) Models are trained with sequences of input and output data under healthy conditions, $(x^{(j)}, y^{(j)})_{j \in \{\text{healthy cond.}\}}$. In this way the models capture the healthy dynamics of the system. However, the training process can also be expanded to include data under $n$ conditions of interest $c_1, c_2, \cdots, c_n$, resulting in $n$ parameter vectors $\theta^{(1)}, \theta^{(2)}, \cdots, \theta^{(n)}$ that contain information about the motor behavior under the corresponding conditions. These parameters can be used for visualization purposes in stage 8.
8) Data visualization can also be a powerful tool to provide further insight into the analysis. The trained model weights $\theta^{(i)}$ or the motor features $x^{(j)}$ representing the operating conditions at samples $j$ can be projected using a dimensionality reduction (DR) algorithm (e.g. a deep autoencoder [32]; a manifold learning algorithm like UMAP [70]; or t-SNE [71]) onto a 2D visual map, where close points represent related motor behaviors. Examples of this are shown in section IX-C.
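The decision step based on residuals, and its assessment with the F1-score, can be sketched as follows. The residual norms, labels and threshold are hypothetical values chosen for illustration; treating the abnormal state as the positive class is also an assumption.

```python
def classify(residual_norms, threshold):
    """Flag a sample as abnormal (True) when its residual norm exceeds the threshold."""
    return [r > threshold for r in residual_norms]


def f1_score(predicted, actual):
    """F1 = 2*TP / (2*TP + FP + FN), with 'abnormal' as the positive class."""
    tp = sum(p and a for p, a in zip(predicted, actual))
    fp = sum(p and not a for p, a in zip(predicted, actual))
    fn = sum(a and not p for p, a in zip(predicted, actual))
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0


# Hypothetical residual norms and ground-truth labels (True = abnormal).
norms = [0.2, 0.4, 1.5, 0.9, 2.1]
labels = [False, False, True, True, True]

flags = classify(norms, threshold=1.0)  # [False, False, True, False, True]
score = f1_score(flags, labels)         # 0.8
```

Lowering the threshold in this toy example would catch the missed abnormal sample at the risk of flagging healthy ones, which is the trade-off that makes the choice of threshold critical.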

IX. RESULTS AND DISCUSSION
Examples of use of the proposed methodology for monitoring and fault diagnosis of electric machines using data-driven models are presented in this section. Data used for the experiments were collected from the test bench as described in section VII. In section IX-A, a representative set of time-based [58], [59], [61], [63], [72] and frequency-based models [73], [74] for temperature monitoring is evaluated. The models are trained with currents and voltages in $\alpha\beta$ coordinates under healthy conditions. Section IX-B addresses the detection of stator insulation degradation by means of residual values when the stator temperature is measured. Finally, visual approaches for condition monitoring are discussed in section IX-C.
For the analysis presented here, tests #21, #24 and #27 were chosen out of the tests shown in Fig. 4. In the three tests, the inverter operated in six-step modulation and the motor worked under the same conditions (voltage, load, speed, etc.) to decouple the impact of changes in the working point on the studied models. As explained in section VII, according to the insulation tester measurements, the stator insulation was healthy for the first two tests and had suffered degradation for the third. Note that the insulation resistance for the third case was still high (>5 GΩ).
The models were trained with the motor operating at full load and fixed speed. Generalization of the models to cover the full range of torque and speed, as well as different modulation strategies, is a subject of further research.
The performance of ML-based techniques is highly dependent on hyperparameters, such as learning rate, batch size, etc., whose values define the model and control the learning process. Hence, a systematic hyperparameter search for frequency-based and time-based models was carried out using the random search optimization algorithm [75]. Once the hyperparameters were fine-tuned, the performance of the models was assessed using the regression metrics Mean Squared Error (MSE) and Mean Absolute Error (MAE) [76]. All models were cross-validated using the k-fold technique. Therefore, randomly selected training and test data are used to optimize and validate the models. Only data from healthy tests #21 and #24 were considered in the cross-validated training; data from test #27 (insulation fault already detected) were reserved for the evaluation of the residuals and the data visualization.
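The evaluation protocol (k-fold splitting plus MSE and MAE) can be sketched with the Python standard library. This is an illustration only, not the code of the repository cited in this section; the function names and the shuffling seed are hypothetical.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle the sample indices and split them into k folds of near-equal size."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[f::k] for f in range(k)]


def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Each fold is used once as test data; the remaining folds form the training set.
folds = k_fold_indices(n_samples=10, k=5)
splits = [
    (sorted(i for f in folds if f is not test for i in f), sorted(test))
    for test in folds
]
```

MSE penalizes large deviations more heavily than MAE, so reporting both gives a view of typical error as well as of occasional large errors.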
A computer running Debian GNU/Linux and equipped with an Nvidia RTX 3090 GPU card was employed to develop, train and evaluate all the data-driven models presented in this article. All the ML-based models were built on top of the scikit-learn [77] and TensorFlow [78] Python libraries. The code and the set of optimal hyperparameters per model necessary to reproduce the subsequent results can be found in the following repository: https://github.com/gsdpi/MLforInvertedFedMotors

A. TEMPERATURE ESTIMATION USING TIME AND FREQUENCY-BASED MODELING

1) TIME-BASED MODELING
For time-based models, the stator temperature estimation relies on the seq2point approach described in (8). The temperature is estimated from the multivariate time series consisting of the $\alpha\beta$ coordinates of voltages and currents. The following representative set of direct and recurrent mapping models (see (6)) is compared:
• Recurrent models:
  - Long Short-Term Memory networks (LSTM) [63].

2) FREQUENCY-BASED MODELING
As explained in section VII, six-step operation creates harmonics at $h = 6n + 1$ multiples of the fundamental frequency. The positive sequence impedance is then computed for each harmonic (12), from which an input feature vector of the type shown in (15) is formed.
Polar form, $[|Z_h|, \angle Z_h]$, was used to represent the elements in (15). Three harmonics, the −5th, 7th and 13th, were used. With these choices, (14) is replaced by (16), whereby the ML-based models estimate the stator temperature.
• MultiLayer Perceptron (MLP) neural network [74].
and frequency-based models. Resulting metrics from each fold are aggregated by their mean and standard deviation. The number of folds is set to 5.
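The polar-form feature vector built from the impedances of the selected harmonics can be sketched as follows. The impedance values are hypothetical, and expressing the angle in radians is an assumption; the function name `polar_features` is illustrative.

```python
import cmath


def polar_features(impedances):
    """Flatten complex impedances into [|Z|, angle(Z), ...] (angles in radians)."""
    features = []
    for z in impedances:
        magnitude, angle = cmath.polar(z)
        features.extend([magnitude, angle])
    return features


# Hypothetical impedances for the -5th, 7th and 13th harmonics.
z_values = [complex(3, 4), complex(0, 1), complex(-2, 0)]
x_j = polar_features(z_values)  # six real-valued features
```

Three complex impedances give a six-dimensional real feature vector, matching the dimension expected by models that take real-valued inputs.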

3) PERFORMANCE ASSESSMENT
For the time domain case, CNN and LSTM models outperform the rest of the proposed time-based models. ESN, TCN and rocket models exhibit poorer MSE and MAE values, as they tend to produce high-frequency noise in the estimated temperatures.
For frequency models, MLP and SVR approaches exhibit low mean error and variance values. Differences in MSE and MAE between the LSTM and CNN time models and the MLP and SVR frequency models are minimal. A qualitative comparison of the temperature estimations of these models is shown in Fig. 6. The overall temperature curve is well learned by all the approaches. However, local temperature variations are better captured by the frequency-based models.

B. INSULATION DEGRADATION DETECTION USING RESIDUAL VALUES
The insulation degradation information present in the experimental data discussed in section VII might be detected using the residuals as proposed in stage 7 of section VIII. The residuals are computed in (17) as the difference between the temperature measurements T(j) and the corresponding estimations T̂(j) predicted by models trained with healthy data.
The CNN and LSTM time-domain models and the SVR and MLP frequency-domain models in section IX-A are used to estimate T(j). Further details of these models can be found in the Appendix. Models are trained with healthy data from tests #21 and #24. The evolution of residuals r(j) from these tests is illustrated in the second row of Fig. 6. The residuals are seen to deviate for test #27 in both the time and frequency domains, the deviation being more relevant for higher temperatures (see zoomed subfigures in Fig. 6).
The boxplot visualization and t-test analysis in Fig. 7 confirm this behavior. Low p-values indicate that the mean of test #27 clearly deviates from the mean of test #24. This difference is even more pronounced in the frequency domain using SVR and MLP models. This suggests that frequency models might be more sensitive to system changes caused by stator insulation degradation. The t-test results also show that the mean of the residuals for healthy tests (tests #21 and #24) can be used for normal/abnormal classification. However, this classification should be further evaluated using data from a representative set of motors operating until failure, as mentioned in section VIII.
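The residual-based comparison can be sketched as follows. The residual is taken as measured minus estimated temperature, as in the text; the use of Welch's unequal-variance form of the t-test is an assumption, since the article does not state which variant was applied. The data are synthetic and the function names are hypothetical.

```python
import math
import random
import statistics

def residuals(measured, estimated):
    """Residual r = T - T_hat for each sample."""
    return [t - t_hat for t, t_hat in zip(measured, estimated)]

def welch_t(a, b):
    """Welch's t statistic and approximate degrees of freedom for two samples."""
    va = statistics.variance(a) / len(a)
    vb = statistics.variance(b) / len(b)
    t = (statistics.fmean(a) - statistics.fmean(b)) / math.sqrt(va + vb)
    dof = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, dof

# Synthetic stand-in: the model tracks the healthy test, but underestimates
# the temperature of the degraded test by about 2 degrees.
rng = random.Random(0)
measured = [60.0 + 0.1 * k for k in range(200)]
est_healthy = [t + rng.gauss(0.0, 1.0) for t in measured]
est_degraded = [t - 2.0 + rng.gauss(0.0, 1.0) for t in measured]

r_healthy = residuals(measured, est_healthy)
r_degraded = residuals(measured, est_degraded)
t_stat, dof = welch_t(r_degraded, r_healthy)
print(t_stat, dof)
```

A large |t| corresponds to a low p-value. In practice the p-value could be obtained with `scipy.stats.ttest_ind(a, b, equal_var=False)`.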

C. VISUALIZATION
Visualization methods outlined in section VIII, which correspond to stages 7 and 8 in Fig. 5, can be used to provide further insight into the correlation between measured variables and motor condition. Two approaches were considered: 1) projection of the feature vectors; and 2) projection of the parameter vectors of the model trained under different conditions.

1) Projection of feature vectors x(j). In this approach, the same feature vectors x(j) ∈ ℝ⁶ composed as in (16), containing the information of direct sequence impedances Z₋₅, Z₇, Z₁₃, were projected using the UMAP algorithm [70] onto a 2D latent space. The resulting projections are shown in Fig. 8, colored according to the test type (left) and according to the stator temperature (right). Regions A, B and C in Fig. 8 reveal motor conditions which produce characteristic impedance patterns.

2) Projection of model parameters. The second approach involves the projection of the model parameters θ trained with data of the motor under different conditions that are meaningful for the analysis. Multivariate linear models (18) were considered, with θ = (θ₀, ..., θₚ), where p = 6, relating the 6 impedance features to the temperature as in (16), under different motor conditions.
Instances of the model were trained for 300 "bags" of 30 random input-output pairs, (x(j), y(j)), taken at each combination of temperature bin (bins of 20 °C width, ranging in 18 steps of 5) and test (#21, #24 and #27). This results in a total of n = 300 × 18 × 3 = 16200 models. The resulting models have a common function f_linear(), and differ only in their parameter vectors θ, which carry information about the relationship between the impedances and the temperature under the different training conditions.
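The bagging procedure can be sketched for a single (test, temperature bin) condition as follows. This is an illustration on synthetic data, assuming that each bag is drawn without replacement and that the linear model is fitted by ordinary least squares; neither detail is stated in the article, and all names are hypothetical.

```python
import numpy as np

def fit_linear(X, y):
    """Least-squares fit of y = theta_0 + theta_1*x_1 + ... + theta_p*x_p."""
    A = np.column_stack([np.ones(len(X)), X])
    theta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return theta

def bagged_parameters(X, y, n_bags, bag_size, rng):
    """Fit one linear model per random bag; return the stacked theta vectors."""
    thetas = []
    for _ in range(n_bags):
        # Assumption: pairs are sampled without replacement within a bag.
        idx = rng.choice(len(X), size=bag_size, replace=False)
        thetas.append(fit_linear(X[idx], y[idx]))
    return np.array(thetas)

# Synthetic data for one condition: 6 impedance features, linear target.
rng = np.random.default_rng(0)
true_theta = np.array([80.0, 1.0, -2.0, 0.5, 3.0, -1.0, 2.0])
X = rng.standard_normal((500, 6))
y = true_theta[0] + X @ true_theta[1:] + 0.01 * rng.standard_normal(500)

thetas = bagged_parameters(X, y, n_bags=300, bag_size=30, rng=rng)

# Repeating this for every temperature bin and test gives the total count.
n_bags, n_bins, n_tests = 300, 18, 3
n_models = n_bags * n_bins * n_tests
print(thetas.shape, n_models)
```

Each row of `thetas` is one parameter vector with p + 1 = 7 entries. The vectors collected over all conditions could then be passed to a 2D projection, for instance `UMAP(n_components=2).fit_transform` from the umap-learn package.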
A UMAP 2D projection (Fig. 9) of the parameter vectors θ identified under the previous conditions reveals three interesting regions, A, D and E, all involving differentiated and potentially informative behaviors from test to test. Regions A and E show singular behaviors (the dots are not mixed with those from the other tests) of the motor for test #21, at high and low temperatures, respectively. Region D, in turn, reveals singular behaviors of test #27 at low temperatures. Note that, consistently, region A had been highlighted in the projection of feature vectors. It can also be noted that low and high temperatures concentrate most of the differentiated behaviors, while at medium temperatures the motor behaves similarly for all three tests.
The results in Fig. 9 are in effect a visual map of the motor behaviors under different thermal conditions and different proximity to catastrophic failure, which can potentially be useful in further studies to reveal factors influencing degradation.

X. CONCLUSION
This paper has presented an overview of machine learning and data visualization techniques for monitoring and diagnostics of inverter-fed motors. The study encompasses data acquisition, variable transformations to produce complex vector time series, computation of descriptive features, data-driven modeling of the motor behavior, and visualization of fault-related information.
Two approaches have been considered: 1) analyzing signals directly in the time domain, in terms of time series, and 2) frequency domain analysis based on FFT computation of features, exemplified by computing direct and/or inverse sequence impedances from relevant harmonics in currents and voltages.
Direct analysis of time series considers full real or complex waveforms, and allows characterizing transient states, which could provide informative insights for diagnosis. Frequency-based approaches describe the aggregated behavior of the motor over a specific period (data window), making the features more robust (e.g., against noise). However, their applicability requires steady state operation.
Two different scenarios have been considered: 1) the stator temperature is not measured, in which case the target is the estimation of the stator temperature from the measured voltages and currents; 2) the stator temperature is measured, in which case degradation of the stator winding insulation is detected from deviations between the estimated and measured temperatures.
From the experiments carried out during this research, it has been concluded that MLP and SVR ML models for frequency-based analysis and CNN and LSTM time-based deep learning models showed better performance predicting stator temperature and stator insulation degradation. It is noted that time-based methods are more sensitive to noise, while frequency-based methods require that the machine operates in steady state for short periods of time. Finally, the UMAP dimensionality reduction technique applied to direct sequence impedances has also been shown to be useful for stator insulation degradation detection.
The approach outlined in the article offers multiple opportunities for further research in the field of monitoring and fault detection of motor drives using ML, including: 1) detection of other types of faults, such as damaged rotors and bearing faults; 2) development of ML-based digital twins suitable for monitoring and fault detection; 3) performance evaluation of the residuals-based fault detection for a representative sample of motors working until failure; 4) integration in the algorithms of other types of signals which might be available, e.g. vibration from accelerometers; 5) improvement of the robustness of the algorithms against variations in the operating conditions of the machine (rotor speed, load, flux level, etc.), as well as against ambient conditions (e.g. ambient temperature or humidity); 6) optimization of the algorithms for their implementation in real time in the existing digital signal processors.

APPENDIX DEFINITION AND HYPERPARAMETERS OF THE MODELS
The architectures and hyperparameters of the ML models for fault detection discussed in section IX-B are described in this appendix.
The LSTM model uses a 1D convolutional layer followed by a max pooling layer and a long short-term memory layer. The hyperparameters of this architecture and its training are listed in Table 2.
The CNN model is composed of 6 convolutional blocks followed by a flatten operation and a fully-connected layer with one output unit. Each convolutional block consists of a 1D convolutional layer and a max pooling layer. The hyperparameters of this architecture and its training are listed in Table 3.
The MLP model uses an input layer of 6 units connected to 3 hidden fully-connected layers and an output layer of one unit. The hyperparameters of this architecture and its training are listed in Table 4.
The SVR model is based on the scikit-learn [77] implementation. The hyperparameters employed for the SVR model are listed in Table 5.
Further details about these models can be found in the repository: https://github.com/gsdpi/MLforInvertedFedMotors

FIGURE 1. Schematic representation of the experimental test bench using two back-to-back inverters.

FIGURE 4. Summary of experiments: a) length in minutes, b) final stator winding temperature. Experiments marked with a solid dot indicate that a decrease in phase-to-phase insulation had been detected with the insulation tester.

FIGURE 5. Proposed block diagram of ML-based monitoring and diagnosis of inverter-fed motor: (1) A/D acquisition, 750 kHz sampling and 20 kHz subsampling; (2) transformation to complex αβ form; (3) complex FFT of voltages and currents in αβ form; (4) feature extraction and building the inputs x(j) and the targets y(j); (5) ML model of the motor behavior; (6) anomaly detection; (7) model training; and (8) DR visualization of behaviors and conditions.

FIGURE 6. Left: estimated temperatures (top) and residuals T(j) − T̂(j) (bottom) using LSTM and CNN (time domain), for the three tests being considered. Right: same results using MLP and SVR (frequency domain).

FIGURE 7. Boxplot of temperature residuals for CNN and LSTM time domain models and for MLP and SVR frequency domain models. Values in brackets are the resulting p-values of the t-test between residuals for tests #24 and #27.

FIGURE 8. UMAP projection of feature vectors x(j) with direct sequence impedances.

FIGURE 9. UMAP projections of model parameter vectors θ learned from motor data under different test and thermal conditions (left and right).
The characteristic impedance patterns of regions A, B and C in Fig. 8 are different from the patterns for the rest of the tests. The regions were matched to the temperature plot along time. Region A represents conditions from test #21 with high temperature revealed by a small bump around sample 600; note that around this sample the residuals are slightly larger and present higher peaks. Regions B and C contain states of test #27, close to the failure, matching a temperature bump in test #27 (region B) and states with the highest temperatures reached in the three tests (region C), which could potentially be related to degradation.

Table 1 shows the resulting mean and standard deviation of MAE and MSE values collected from all the analyzed time and frequency-based models.

TABLE 1. MAE and MSE values obtained from frequency-based and time-based models.