Non-Intrusive Electric Load identification using Wavelet Transform

This paper presents the development of a decision tree for the classification of loads in a non-intrusive load monitoring (NILM) system implemented on a single-board computer (Raspberry Pi 3). The decision tree uses the total energy value of the power signal of a device, which is computed using a discrete wavelet transform and Parseval’s theorem. The power consumption data of different types of equipment were obtained from a public-access database for NILM applications. The best split point for the design of the decision tree was determined using the weighted average Gini index. The tree was validated using loads available in the same public-access database.


Hoyo-Montaño, León-Ortega, Valencia-Palomo, Galaz-Bustamante, Espejel-Blanco, and Vázquez-Palma

Introduction
Nowadays, the world is facing several challenges regarding energy usage, such as the availability of energy sources, carbon emissions, and sustainability, among others (Aiad & Lee, 2016b). Building energy management is becoming a major issue worldwide; it is estimated that nearly 40% of all electric power is consumed in buildings (Ma et al., 2016). Several countries and regions, such as China (Zhou et al., 2015), the European Union (Tsai & Lin, 2012), and Mexico (Honorable Congreso de la Unión, 2012), have developed public policies to mitigate these challenges. The implementation of energy saving and/or efficiency actions, in particular regarding domestic installations, requires information on how the energy is used. Currently, energy meters provide information about total energy consumption through monthly bills, and they do not allow the consumption of individual equipment to be determined (Aiad & Lee, 2016b).
Some studies show that when users know how much energy their equipment consumes, they change their operating habits in ways that promote energy savings, which may vary between 9% and 20% (Aiad & Lee, 2016a). A suitable monitoring system is required to know the operating conditions of electrical appliances. Monitoring these operations would facilitate the implementation of energy efficiency measures. In general terms, load monitoring is a process that seeks to identify a particular load and acquire the measurement of its energy consumption (I. Abubakar, Khalid, Mustafa, Shareef, & Mustapha, 2017).
Traditionally, monitoring the operation of the equipment connected in a facility is based on the installation and operation of a large number of sensors. Each power outlet or load has a sensor, and the system is called intrusive monitoring (He, Stankovic, Liao, & Stankovic, 2016); its disadvantages are high cost, complex installation, and difficult maintenance (I. Abubakar et al., 2017).
To overcome the drawbacks of intrusive monitoring, approaches with a reduced number of sensors have been developed, forming a Non-Intrusive Load Monitoring (NILM) system. In this type of system, the goal is to disaggregate the individual load consumption from the total by analyzing the voltage and current waveforms at a single point located at the service's point of entry (I. Abubakar et al., 2017). Figure 1 shows a general structure of a NILM system (H. H. Chang, Chen, Tsai, & Lee, 2012). This paper presents an implementation of a NILM system for power consumption signature detection based on the discrete wavelet transform, Parseval's theorem, and decision trees, suitable to be executed on a Single Board Computer (SBC) such as the Raspberry Pi 3. The system developed is part of a smart power meter; its hardware is based on a Raspberry Pi 3 and an acquisition stage, as shown in Figures 2 and 3.
In the next section, a brief review of several approaches developed for load identification in NILM systems is presented. Wavelet transform structures suitable for NILM applications are presented afterward, followed by the design process of the decision tree using data available in a public database for NILM applications. Finally, the results obtained from the validation of the design of the decision tree for the classification of loads, as well as the conclusions drawn, are presented.

Non-Intrusive Load Identification
A NILM system analyzes voltage and current waveforms, trying to identify a power consumption signature that can be associated with the nature and operating state of individual devices. These power consumption signatures can be classified as steady-state, transient, and non-traditional signatures (I. Abubakar et al., 2017).
A steady-state signature is obtained when the device has completed its starting stage and operates steadily; this identification uses parameters such as active power, reactive power, RMS voltage and current, power factor, and harmonic components (I. Abubakar et al., 2017).
A transient signature is drawn from the analysis of the period between the turn-on and steady states, or between the steady and turn-off states, of a device, because during these periods some characteristic power consumption behaviors can be associated with specific loads (I. Abubakar et al., 2017).
Non-traditional signatures, on the other hand, can be obtained using the values of non-electric variables in the load identification process. Values of temperature, lighting, time of day, and start-up time, among others, are used to give context to the device usage. Information from these variables can be combined with the previous signatures to improve identification (I. Abubakar et al., 2017).
To help in the identification of the devices operating in an installation using NILM systems, the following classification has been proposed (Bernard, Wohland, Klaassen, & Vom Bogel, 2016; Hart, 1992; Zoha, Gluhak, Imran, & Rajasegarar, 2012):

Type I. Turn-on/Turn-off
There are only two possible operating states (turn-on and turn-off); a typical example is a lamp.

Type II. Finite State Machines
They present several defined consumption levels and a cyclic operation. An example of these devices is the washing machine.

Type III. Continuous Variable Consumption
They have an infinite number of operating points when turned on; examples of these devices are light dimmers and power tools. They are a great challenge for identification due to the nature of their power consumption.

Type IV. Continuous Consumption
They operate continuously over long periods of time (days or weeks); cordless phones and any remote-controlled appliance are typical examples of this type of device.
Each one of these categories presents its own complexity for the identification of individual devices.
The total aggregate power consumption of the devices operating inside an electric installation can be described as (Hart, 1992):

$$P(t) = \sum_{i=1}^{n} a_i(t)\,P_i + e(t) \tag{1}$$

where $P(t)$ is the power consumption over time; the sum runs over the $n$ devices in the installation; $a_i(t)$ is the activation vector of device $i$, with value 0 when the device is off and 1 when it is on at time $t$; $P_i$ is the power vector of device $i$; and $e(t)$ is the error or noise term.

Data acquisition
Information about the steady and transient states must be gathered from the power waveforms of the device. A high-frequency sampling stage captures information regarding transient events, while a low-frequency sampling stage gathers steady-state information of the device.

Data processing
Data must be conditioned and processed in order to give meaningful information. This stage includes noise filtering, separation of harmonic components, signal synchronization, etc.

Event detection
Processing and storing all the information is inefficient and impractical, so it is important to detect the activation and deactivation of the device. It is necessary to establish a threshold-crossing detection mechanism for the detection of transients.
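As a minimal sketch of how a threshold-crossing detector could operate on an aggregate signal built with Hart's model (sum of activation times device power, plus noise), consider the following Python example. The function names, the two devices, their power ratings, the noise samples, and the 30 W threshold are all hypothetical values chosen for illustration; they are not taken from the system described in this paper.

```python
def aggregate_power(activations, ratings, noise=None):
    """Aggregate model: P(t) = sum_i a_i(t) * P_i + e(t)."""
    n_samples = len(activations[0])
    noise = noise or [0.0] * n_samples
    return [
        sum(a[t] * p for a, p in zip(activations, ratings)) + noise[t]
        for t in range(n_samples)
    ]


def detect_events(power, threshold):
    """Return (index, step) pairs where consecutive samples differ by more than threshold."""
    events = []
    for t in range(1, len(power)):
        step = power[t] - power[t - 1]
        if abs(step) > threshold:
            events.append((t, step))
    return events


# Hypothetical devices: a 60 W lamp and a 150 W fan (illustrative values only).
lamp_on = [0, 1, 1, 1, 1, 0, 0, 0]
fan_on = [0, 0, 0, 1, 1, 1, 1, 0]
noise = [0.5, -0.3, 0.2, 0.1, -0.4, 0.3, -0.2, 0.1]

total = aggregate_power([lamp_on, fan_on], [60.0, 150.0], noise)
events = detect_events(total, threshold=30.0)
print(events)  # positive steps are turn-on events, negative steps are turn-off events
```

The threshold must sit above the noise level and below the smallest power step of interest; in this toy signal any value between roughly 1 W and 59 W isolates the four switching events.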

Characteristic Extraction
Electric parameters, such as active power, reactive power, harmonic components, and transient waveforms, can be extracted from the event detection and data processing stages. The characteristics to be extracted depend on the disaggregation method used for load identification.

Load classification or disaggregation
Using the characteristic information gathered from the processed data, along with a known pattern, the device disaggregation can be performed from the total energy consumption, that is, the device can be identified.

Energy calculation
By identifying an individual device, its operation pattern and energy consumption can be estimated.
Using active power (P), reactive power (Q), RMS current, and harmonic components has given good results in identifying type I and II devices; however, this approach performs poorly with low-power devices. High-frequency alternatives have been developed to improve steady-state analysis by including harmonic content. Because the transient behavior of a device turns out to be distinctive in many cases, transient analysis can facilitate the identification process, although it requires a high-frequency sampling scheme (I. Abubakar, Khalid, Mustafa, Shareef, & Mustapha, 2015; Zoha et al., 2012).
There are some reports of performance improvements in steady-state analysis using voltage and current waveforms as a way to identify unique features of loads, such as peak and RMS values, phase difference, and power factor (Zoha et al., 2012). These identification methods can be complemented with harmonic analysis, along with real and reactive power features, to improve the device detection of the algorithms, but this requires high waveform sampling rates (Abubakar et al., 2015; Zoha et al., 2012).
Most devices have a distinctive transient behavior that is suitable for device identification; with a high sampling rate, it is possible to capture this transient behavior (Abubakar et al., 2015; Zoha et al., 2012). Features such as transient shape and turn-on energy calculation have been used to identify individual devices (Zoha et al., 2012).
Fourier Transform-based spectral analysis of power consumption has proven useful to detect variable loads. The Short-Time Fourier Transform has been combined with active and reactive power calculations to detect the operation of a device and estimate its energy consumption (Zoha et al., 2012). This mathematical tool transforms a time-domain function into a frequency-domain function (Marcu & Cernazanu, 2012). Figure 5 shows a spectral analysis reported by Liang, Ng, Kendall, et al. (2010) for different devices; both TVs and air conditioners have a strong presence of low-order harmonics, while devices such as induction cookers present a high content of high-order harmonics.
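As a minimal sketch of how a harmonic spectrum of this kind can be estimated (not the procedure used in the cited works), the following pure-Python example evaluates the discrete Fourier transform at the harmonic bins of a sampled waveform. The function name `harmonic_magnitudes` and the test waveform (a fundamental of amplitude 10 with a third harmonic of amplitude 3) are hypothetical.

```python
import cmath
import math


def harmonic_magnitudes(samples, cycles, max_order):
    """Amplitudes of harmonics 1..max_order for a record holding `cycles` whole fundamental periods."""
    n = len(samples)
    magnitudes = []
    for order in range(1, max_order + 1):
        k = order * cycles  # DFT bin where this harmonic falls
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(samples))
        magnitudes.append(2 * abs(coeff) / n)  # scale the bin to a sinusoid amplitude
    return magnitudes


# Hypothetical current waveform: two fundamental cycles in 200 samples.
n_samples, cycles = 200, 2
current = [
    10.0 * math.sin(2 * math.pi * cycles * i / n_samples)
    + 3.0 * math.sin(2 * math.pi * 3 * cycles * i / n_samples)
    for i in range(n_samples)
]

spectrum = harmonic_magnitudes(current, cycles, max_order=5)
print([round(m, 3) for m in spectrum])
```

The record must contain a whole number of fundamental cycles; otherwise the energy of each harmonic leaks into neighboring bins and the amplitudes are underestimated.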
Markov models have become an interesting alternative for implementing NILM systems due to their simplicity in modeling basic functions. The general scheme for implementing Markov models, in particular Hidden Markov Models (HMM), is based on the fact that the behavior of a device can be represented as a latent state and an observable output, usually active power. A system with trained Markov models can infer the most probable state sequence of a device from the set of measurements processed. HMMs have proven useful for accurately predicting the behavior of devices using measurements gathered with low-frequency sampling (<1 Hz). The goal of an HMM-based NILM system is to generate energy consumption profiles and to determine the time of use of each device operating in an installation. Usually this information is considered non-critical, and its processing is performed off-line (Mueller & Kimball, 2016).
Markov modeling requires a periodic acquisition of T measurements, where each measurement is assumed to be associated with a state Q of the process, and each state of the process can assume one of N possible values. An HMM has a three-component structure: a transition matrix A containing the state-transition probability values; the probability φ of each state; and a vector π with the initial state occupation probabilities. Figure 6 shows the structure of an HMM for a device with only three states (Mueller & Kimball, 2016). Identifying each device in an aggregate total power consumption first requires determining the sequence of states that can be used to compose the observation sequence. Using A and φ, the inferred state sequence is calculated; it represents the most probable behavior of all the devices represented as a unit.
The state sequence Q can be used to determine which state sequence has the highest probability of occurrence for each individual device (Mueller & Kimball, 2016).
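The inference of the most probable state sequence is commonly carried out with the Viterbi algorithm. The sketch below is a generic illustration for a single three-state device, not the method of Mueller & Kimball: it assumes Gaussian emissions of active power per state, and the state means, standard deviations, transition matrix, initial probabilities, and readings are all hypothetical.

```python
import math


def gaussian_logpdf(x, mean, std):
    """Log-density of a normal distribution, used here as the emission model."""
    return -0.5 * math.log(2 * math.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)


def viterbi(observations, pi, A, means, stds):
    """Most probable state sequence of an HMM with Gaussian emissions."""
    n_states = len(pi)
    # Log-probability of the best path ending in each state at the first sample.
    score = [
        math.log(pi[s]) + gaussian_logpdf(observations[0], means[s], stds[s])
        for s in range(n_states)
    ]
    backpointers = []
    for obs in observations[1:]:
        new_score, pointers = [], []
        for s in range(n_states):
            # Best predecessor state for landing in state s.
            prev = max(range(n_states), key=lambda r: score[r] + math.log(A[r][s]))
            pointers.append(prev)
            new_score.append(
                score[prev] + math.log(A[prev][s]) + gaussian_logpdf(obs, means[s], stds[s])
            )
        score = new_score
        backpointers.append(pointers)
    # Trace the best path backwards from the most probable final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    return path[::-1]


# Hypothetical three-state device: 0 = off, 1 = low power, 2 = high power.
pi = [0.8, 0.1, 0.1]
A = [[0.90, 0.08, 0.02],
     [0.10, 0.80, 0.10],
     [0.05, 0.15, 0.80]]
means, stds = [0.0, 100.0, 500.0], [5.0, 20.0, 50.0]
readings = [1.0, 2.0, 98.0, 105.0, 490.0, 510.0, 102.0, 0.0]  # active power in W

path = viterbi(readings, pi, A, means, stds)
print(path)
```

Working with log-probabilities avoids numerical underflow on long measurement sequences; all entries of `pi` and `A` must be strictly positive for the logarithms to be defined.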
The Wavelet Transform is another tool that has been used to perform transient analysis of a device (Zoha et al., 2012). Analysis based on the Wavelet Transform extracts the desired waveform by applying a translation and dilation process to a function (Chen, Chang, & Chen, 2013). A more detailed discussion of the Wavelet Transform is presented next.

Wavelet Transform and Parseval's Theorem
The Wavelet Transform can be implemented in two ways: the Continuous Wavelet Transform (CWT) and the Discrete Wavelet Transform (DWT). The DWT has a structure more suitable for digital signal analysis. The DWT is derived from the CWT definition, and can be expressed as (Chen et al., 2013):

$$DWT(m, n) = \frac{1}{\sqrt{a_0^{m}}}\sum_{k} x[k]\,\psi\!\left(\frac{k - n b_0 a_0^{m}}{a_0^{m}}\right) \tag{2}$$

where $x$ is the signal analyzed, $\psi$ is the mother wavelet applied, $a_0$ is the scaling factor, and $b_0$ is the shift factor.
Equation (2) can be rewritten as:

$$DWT(m, n) = a_0^{-m/2}\sum_{k} x[k]\,\psi\!\left(a_0^{-m}k - n b_0\right) \tag{3}$$

Setting $a_0 = 2$ and $b_0 = 1$, (3) becomes:

$$DWT(m, n) = 2^{-m/2}\sum_{k} x[k]\,\psi\!\left(2^{-m}k - n\right) \tag{4}$$

The DWT performs two operations: dilation (applying scaling factors) and translation (applying shifting factors). These operations are performed to decompose a signal into a series of short-duration waveforms derived from a mother wavelet, which has characteristics suitable for the analysis of transient events (H. H. Chang et al., 2012). Multi-Resolution Analysis (MRA) is based on the application of the DWT. MRA decomposes a complex waveform or signal into several sets of simpler waveforms; this is performed by a set of low-pass g[n] and high-pass h[n] filters (Chen et al., 2013). Figure 7 shows a three-level DWT filter structure. This type of structure provides a multilayer decomposition scheme, where the last low-pass filter g[n] gives the approximation value (level 3 in Figure 7), while the high-pass filters h[n] provide the detail values (Figure 7 shows three values). Increasing the number of levels increases the number of detail values, but only one approximation value is obtained (Chen et al., 2013).
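The cascaded structure of a three-level filter bank can be illustrated with a short sketch. For brevity it assumes the Haar wavelet, whose low-pass and high-pass filters have only two taps each, rather than the wavelet used later in this paper; the function names and the input samples are hypothetical.

```python
import math


def haar_step(signal):
    """One filter-bank stage: low-pass/high-pass filtering followed by downsampling by 2."""
    s = 1 / math.sqrt(2)
    approx = [s * (signal[i] + signal[i + 1]) for i in range(0, len(signal) - 1, 2)]
    detail = [s * (signal[i] - signal[i + 1]) for i in range(0, len(signal) - 1, 2)]
    return approx, detail


def multilevel_dwt(signal, levels):
    """Return (approximation at the last level, [detail level 1, ..., detail level N])."""
    approx, details = list(signal), []
    for _ in range(levels):
        # Only the low-pass branch is decomposed again at the next level.
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details


x = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
a3, (d1, d2, d3) = multilevel_dwt(x, levels=3)
print(len(d1), len(d2), len(d3), len(a3))
```

Each additional level yields one more set of detail coefficients while the approximation is replaced by a coarser one, which is why three levels give three detail sets and a single approximation.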
Parseval's Theorem is used to calculate the energy dissipated by a 1 Ω resistor when a discrete current $f[n]$ flows through it. The theorem uses the Fourier Transform coefficients (Kocaman & Özdemir, 2009):

$$\frac{1}{N}\sum_{n=0}^{N-1}\left|f[n]\right|^{2} = \sum_{k=0}^{N-1}\left|a_k\right|^{2} \tag{5}$$

where $N$ is the sampling period, and $a_k$ are the Fourier Transform coefficients.
In order to apply (5) to the DWT, it is rewritten as:

$$\frac{1}{N}\sum_{n}\left|f[n]\right|^{2} = \frac{1}{N_J}\sum_{k}\left|a_{J,k}\right|^{2} + \sum_{j=1}^{J}\left(\frac{1}{N_j}\sum_{k}\left|d_{j,k}\right|^{2}\right) \tag{6}$$

The first term on the right-hand side of (6) represents the energy of the approximation component of the DWT, and the second term represents the energy of the detail components. The total energy of the DWT then becomes:

$$E = \frac{\|a_J\|^{2}}{N_J} + \sum_{j=1}^{J}\frac{\|d_j\|^{2}}{N_j} \tag{7}$$

where $\|d_j\|$ is the norm of the expansion coefficients at level $j$ (likewise $\|a_J\|$ for the approximation coefficients), and $N_j$ is the number of samples used at level $j$.
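The energy split between approximation and detail components can be checked numerically. The sketch below is an illustration only, not the authors' function: it assumes a periodic (circular) extension of the signal, builds the Daubechies 3 filters from their closed-form expression, and uses a hypothetical two-tone test signal. Under these assumptions the transform is orthonormal, so the squared norms of all sub-bands add up to the squared norm of the signal.

```python
import math

# Closed-form Daubechies 3 low-pass filter coefficients.
_S10 = math.sqrt(10)
_R = math.sqrt(5 + 2 * _S10)
_C = 16 * math.sqrt(2)
DB3_LOW = [
    (1 + _S10 + _R) / _C, (5 + _S10 + 3 * _R) / _C, (10 - 2 * _S10 + 2 * _R) / _C,
    (10 - 2 * _S10 - 2 * _R) / _C, (5 + _S10 - 3 * _R) / _C, (1 + _S10 - _R) / _C,
]
# Matching high-pass filter: high[k] = (-1)^k * low[L - 1 - k].
DB3_HIGH = [(-1) ** k * DB3_LOW[len(DB3_LOW) - 1 - k] for k in range(len(DB3_LOW))]


def dwt_step(signal):
    """One periodic DWT stage; len(signal) must be even."""
    n = len(signal)
    approx, detail = [], []
    for i in range(0, n, 2):
        approx.append(sum(c * signal[(i + k) % n] for k, c in enumerate(DB3_LOW)))
        detail.append(sum(c * signal[(i + k) % n] for k, c in enumerate(DB3_HIGH)))
    return approx, detail


def subband_energies(signal, levels):
    """Squared norms of the coefficients, ordered [A_J, D_J, ..., D_1]."""
    approx, energies = list(signal), []
    for _ in range(levels):
        approx, detail = dwt_step(approx)
        energies.insert(0, sum(c * c for c in detail))
    energies.insert(0, sum(c * c for c in approx))
    return energies


# Hypothetical test signal: two sinusoids sampled at 256 points.
signal = [
    math.sin(2 * math.pi * 5 * n / 256) + 0.3 * math.sin(2 * math.pi * 40 * n / 256)
    for n in range(256)
]
energies = subband_energies(signal, levels=4)
signal_energy = sum(v * v for v in signal)
print(sum(energies), signal_energy)  # the two totals agree up to rounding error
```

Dividing each squared norm by the number of coefficients at its level gives a per-sample energy for that level. Other boundary treatments, such as zero or symmetric padding, preserve the energy only approximately.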

Load identification
Load identification from the total energy of the DWT decomposition was performed using a Decision Tree (DT). In the NILM system implemented, six types of loads were defined for identification: Air Conditioning (class 0), Compact Fluorescent Lamp (class 1), Fan (class 2), Refrigerator (class 3), Vacuum Cleaner (class 4), and Washing Machine (class 5).
In a NILM system, each type of load carries hidden information. When developing a DT, the main goal is to build a classification tree with an optimal entry node; to find it, the impurity of the tree nodes is measured using the Gini Index (Alshareef & Morsi, 2015; J. M. Gillis et al., 2016; J. Gillis & Morsi, 2016):

$$Gini(\sigma) = 1 - \sum_{c=1}^{C}\left[f(c\mid\sigma)\right]^{2} \tag{8}$$

where $C$ is the number of classes, and $f(c\mid\sigma)$ is the probability that $\sigma$ belongs to class $c$.
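A minimal sketch of the impurity measure, assuming the usual definition of the Gini Index as one minus the sum of the squared class probabilities in a node. The function name and the example nodes are hypothetical.

```python
from collections import Counter


def gini_index(labels):
    """Gini impurity of a node: 1 - sum over classes of f(c|node)^2."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


pure_node = gini_index([3, 3, 3, 3])            # every device belongs to class 3
mixed_node = gini_index([0, 1, 2, 3, 4, 5])     # one device of each of the six classes
print(pure_node, mixed_node)
```

A node holding a single class has zero impurity, while a node with the six classes equally represented reaches the maximum possible value for six classes, 1 - 1/6.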
The design procedure for a DT suitable for classification is shown in Figure 8. Considering that there are six types or classes of devices, and that an eight-level DWT analysis was applied using a Daubechies 3 mother wavelet, the total energy values of the DWT for one level of approximation (A8) and eight levels of detail (D1 to D8) were obtained. The number of levels is directly related to the harmonic spectrum covered within the sampling frequency; hence, the eight-level analysis was chosen to cover frequencies from 30 kHz (sampling frequency) down to 117,18 Hz. The sampling frequency for the analysis is set by the PLAID public database (Gao, Giri, Kara, & Bergés, 2014). To calculate the Total Energy Vector (TEV) from the DWT, a Python function was developed, and its results were compared with those of a MATLAB Wavelet Toolbox function. The Python function has a quadratic average error of 0,22%; this accuracy in the calculation of the TEV helps to differentiate appliances with close energy signatures. The code of the Python function is shown in Figure 9. Table 1 shows the DWT total energy values of 48 devices taken from PLAID. The procedure to find the best split point of the classification DT is described below.
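The lower frequency limit quoted above follows from halving the sampling frequency once per decomposition level. The short sketch below reproduces that arithmetic; the function name is hypothetical.

```python
def halved_frequencies(sampling_hz, levels):
    """Frequency obtained after each successive halving of the sampling rate."""
    return [sampling_hz / 2 ** level for level in range(levels + 1)]


edges = halved_frequencies(30_000, 8)
print(edges[-1])  # 117.1875
```

Eight halvings of 30 kHz give 117.1875 Hz, which is the value reported above as 117,18 Hz.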

Total Energy List sorting
The first step required to find the best split point is to sort the values of the Total Energy of DWT list in ascending order before calculating the Gini Index. The sorted list is partially shown in Column 1 of Table 2. Column 2 shows the mid-point between each pair of adjacent values of Total Energy of DWT.

Split point calculation based on mid-points
Columns 3 and 4 of Table 2 show the number of devices of each class with values less than (column 3) or greater than or equal to (column 4) the mid-point value of column 2. It can be seen in Table 2, for instance, that the total number of devices with Total Energy values greater than or equal to 1 465,72 is 34, distributed as follows: six of class 5, eight of class 4, eight of class 3, one of class 2, none of class 1, and eleven of class 0.

Gini Index and its weighted average
The Gini Index measures the impurity of a node; the best split point is found where its value reaches a minimum. Since there are two membership columns (columns 3 and 4), a single value that includes both must be obtained. This is done by calculating the weighted average of the Gini Index:

$$Gini_{WA}(\sigma) = \frac{S_1}{S}\,Gini(\sigma)_a + \frac{S_2}{S}\,Gini(\sigma)_b \tag{9}$$

where $S_1$ and $S_2$ are the numbers of devices with Total Energy values less than, and greater than or equal to, the mid-point value; $S$ is the total number of devices; and $Gini(\sigma)_a$ and $Gini(\sigma)_b$ are the Gini Indexes for the devices with Total Energy values less than, and greater than or equal to, the mid-point value. The results of this calculation are shown in column 6 of Table 2.
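As a worked check, the Gini Index of the node with Total Energy values greater than or equal to 1 465,72 can be computed from the class counts quoted earlier for that mid-point (eleven, none, one, eight, eight, and six devices of classes 0 to 5, 34 in total), assuming the usual definition of the index as one minus the sum of the squared class proportions:

$$Gini(\sigma)_b = 1 - \frac{11^{2} + 0^{2} + 1^{2} + 8^{2} + 8^{2} + 6^{2}}{34^{2}} = 1 - \frac{286}{1156} \approx 0{,}7526$$

The class composition of the complementary node is not listed in the text, so the weighted average for this particular mid-point cannot be completed here.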

Best split point identification
Because the Gini Index measures the impurity of a node, the minimum value of this index corresponds to the best split point, which identifies the entry point of the DT. By inspection of Table 2, the minimum value of $Gini_{WA}(\sigma)$ is 0,7242, which corresponds to a mid-point of total energy of 1 565,42. The tree developed from this entry point is shown in Figure 10.
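The four steps described above (sorting, mid-points, class membership on each side, and minimum weighted Gini Index) can be sketched in a few lines of Python. This is an illustrative implementation rather than the authors' code; the function names and the seven sample devices, with their energy values and class labels, are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty node)."""
    total = len(labels)
    if total == 0:
        return 0.0
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())


def best_split(samples):
    """samples: list of (total_energy, class_label). Returns (mid_point, weighted_gini)."""
    ordered = sorted(samples)                      # step 1: ascending total energy
    best = None
    for (e_low, _), (e_high, _) in zip(ordered, ordered[1:]):
        if e_low == e_high:
            continue                               # equal values give no usable mid-point
        mid = (e_low + e_high) / 2                 # step 2: mid-point of adjacent values
        left = [c for e, c in ordered if e < mid]  # step 3: membership on each side
        right = [c for e, c in ordered if e >= mid]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(ordered)
        if best is None or weighted < best[1]:     # step 4: keep the minimum
            best = (mid, weighted)
    return best


# Hypothetical (total energy, class) pairs, for illustration only.
devices = [(120.0, 1), (150.0, 1), (180.0, 1), (900.0, 2),
           (1400.0, 3), (1700.0, 0), (2100.0, 0)]
split_point, weighted_gini = best_split(devices)
print(split_point, round(weighted_gini, 4))
```

Applying the same search recursively to the devices on each side of the split would grow the remaining nodes of the tree.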

Simulation Results
The classification DT was tested using 30 devices from the PLAID database (Gao et al., 2014); these devices were not included in the design process. Table 3 shows the total energy values used for testing, the identification result from the DT, the device class, and whether there was an error in the identification process.
As can be seen in Table 3, only three devices were wrongly identified; this means that the proposed classification DT has a 90% success rate in identifying the load class. Some HMM solutions have reported success rates between 51,66% and 87% (Aiad & Lee, 2016a, 2016b) using the REDD database. Among the works that use DWT as part of the analysis process, Alshareef (2015), using a Daubechies 3 mother wavelet, reached 95,83% using 1 000 decision trees; Chang (2014) reported a DWT with an Artificial Neural Network reaching identification levels between 86,16% and 96,82%; and Gillis (2016) reported a DWT plus Decision Tree, using a Daubechies 3 mother wavelet and six levels of decomposition, with a 96,18% success rate in load identification.

Figure 4. Aggregated power consumption from different devices. Source: Hart (1992)

Figure 4 shows an example of aggregated power consumption from different devices. The identification of power consumption signatures has been implemented in different schemes in the literature. In general, the NILM identification process requires implementing six stages of analysis (Basu, Debusschere, Douzal-Chouakria, & Bacha, 2015; Liang, Ng, Kendall, & Cheng, 2010): data acquisition, data processing, event detection, characteristic extraction, load classification or disaggregation, and energy calculation.

Figure 6. Structure of an HMM for a three-state device. Source: Mueller & Kimball (2016)

Figure 8. Procedure for best split point identification for the Classification DT. Source: Authors

Figure 9. Python code of the DWT and Total Energy Vector calculations. Source: Authors

Figure 10. Classification DT based on the values of Table 2. Source: Authors

The Classification DT developed presents an unbalanced structure; this is not unusual when the Gini Index is used. There are eight nodes to the left of the entry node (Energy less than) and seventeen to its right (Energy greater than or equal to). The implementation of the Classification DT as a Python function is shown in Figure 11.

Figure 11. Python function code for the Classification DT. Source: Authors

Table 1. Total Energy of DWT analysis

Table 2. Sorted List of Total Energy of DWT

Table 3. Simulation Results from Classification DT for Load ID