Future Trend Forecast by Empirical Wavelet Transform and Autoregressive Moving Average

In engineering and technical fields, a large number of sensors are applied to monitor a complex system. A special class of signals are often captured by those sensors. Although they often have indirect or indistinct relationships among them, they simultaneously reflect the operating states of the whole system. Using these signals, the field engineers can evaluate the operational states, even predict future behaviors of the monitored system. A novel method of future operational trend forecast of a complex system is proposed in this paper. It is based on empirical wavelet transform (EWT) and autoregressive moving average (ARMA) techniques. Firstly, empirical wavelet transform is used to extract the significant mode from each recorded signal, which reflects one aspect of the operating system. Secondly, the system states are represented by the indicator function which are obtained from those normalized and weighted significant modes. Finally, the future trend is forecast by the parametric model of ARMA. The effectiveness and practicality of the proposed method are verified by a set of numerical experiments.


Introduction
With the rapid development of sensor techniques and signal processing, a variety of sensors are arranged in a complicated system to monitor its operational states. Each sensor can obtain a set of measured values and each reflects one side of the running system, for example, temperature, humidity, pressure, etc. However, those parametric values often have close or relaxed relationships among each other. Moreover, they are affected by noise or interference and it is difficult to judge the operating states directly by those simultaneously measured signals. At the same time, the engineers and researchers are no longer satisfied with real-time monitoring of the running states of a complex system. They want to predict the future operational trend according to the current and previous states. Although each sensor records a real signal independently, for simplicity and practicality, it is better to synthesize a comprehensive one called indicator function taking all measured signals into account. Prior to synthesizing the indicator function, random factors and interference in the measured signals must be eliminated effectively. Based on the synthesized indicator function, the future states of the complex system can be forecast reliably.
To forecast the future operational behaviors or states, the autoregressive moving average (ARMA) model can be directly used [1]. ARMA is a high-precision short-term prediction method for time series analysis and offers a simple description for correlated linear, random processes [2,3]. For a linear time-invariant system, the observed data can be expressed by time-series containing historical observations and measurement noise [4]. When using ARMA to fit or predict a signal, an important issue is to estimate the order number of ARMA [5]. ARMA can be considered as the combination

The Trend Forecast Method
In engineering and technical fields, there is a special class of signals that are captured from the same complicated system and have close or relaxed relationships among them. With the system running, each signal only reflects one side of the operational states-for example, temperature, humidity, pressure, etc. If we extract the significant modes of all measured signals, we can synthesize an indicator function, which can be applied to forecast the future operational states of the whole system. Prior to getting the reliable significant mode of each measured signal, the random noise or interference must be carefully reduced, i.e., the extracted significant modes are reliable. In order to achieve the above tasks, we propose a novel method that is based on EWT (empirical wavelet transform) and ARMA (autoregressive and moving average model). The block diagram of the approach is shown in Figure 1. Figure 1. The schematic diagram of the proposed trend forecast method.
Suppose f 1 , f 2 , · · · , f K are K signals captured from a complex engineering system. Each of them is composed of a set of frequency components. The ith signal f i (i = 1, 2, · · · , K) not only includes its significant mode, but also contains other frequency components, naturally including various noise. To extract the significant mode in f i , empirical wavelet transform (EWT) is adopted and performed on f i , due to its ability of anti-interference and computation effectiveness. Then, the extracted significant mode g i from f i (i = 1, 2, · · · , K) is normalized into the interval [0, 1], to eliminate the impacts resulted from the numerical ranges of the measured signals. The normalized result of g i can be denoted as h i , for i = 1, 2, · · · , K. After that, all significant modes h 1 , h 2 , · · · , h K are weighted and summed as the indicator function l to reflect the comprehensive operational states of the monitored complex system. Finally, ARMA provides an effective linear model by the least number of coefficients and is performed on the slowly changing l, to reliably predict the future operational states of the complex system. The following subsections will discuss the main ideas above reflected in Figure 1 in detail.

Extraction of Significant Modes
In general, the captured signals from a complicated system are inevitably affected by random factors and various noise. If a measured signal has multiple disjoint narrow-band components and wide-band noise, the narrow-band component with maximum relative energy can be considered as the significant mode of the original signal. The significant modes can be effectively extracted from all original signals by empirical wavelet transform (EWT) under the conditions of random interference. EWT essentially designs a set of suitable wavelet filters to get several different bands of a signal (each band corresponds to one mode). In particular, the mode is selected as the significant one due to its outstanding energy. At the same time, the other ones are regarded as the interference components and discarded naturally. In EWT, spectrum segmentation is the most important step to obtain different modes [25,26]. It depends on the reliable detection of the local spectrum peaks of the original signal. In classical EWT, the intermediate frequency value between two consecutive spectrum peaks can be seen as their boundary [20,26]. Suppose the spectrum interval [0, π] of each digital signal is divided into N segments and their boundaries are denoted by ω n (ω 0 = 0 and ω N = π). Hence, the empirical wavelet can be constructed by empirical scale function and empirical wavelet function, which are expressed bŷ andΨ where γ and β(x) are defined by and Suppose the ith signal f i (i = 1, 2, · · · , K) is processed by EWT, the approximate coefficients can be obtained by the inner product of the signal and empirical scale function: where W ε f i (0, t) represents the approximate coefficient, Φ 1 (t) represents empirical scale function, and f i (t) represents the object signal. f i (t), Φ 1 (t) represents the inner product of the object signal and the empirical scale function, andf i (ω) andΦ 1 (ω) represent the Fourier transform results, respectively. The symbolˇrepresents the inverse Fourier transform. Similarly, the detail coefficients of EWT are given by the inner products with the empirical wavelets: where W ε f i (n, t) represents the n-th detail factor, and Ψ n (t) represents the n-th empirical wavelet function. f i (t), Ψ n (t) represents the inner product of the object signal and the empirical wavelet function,f i (ω) andΨ n (ω) represent the Fourier transform results, respectively.
The i-th signal f i (t) can be reconstructed by where i = 1, 2, · · · , K. Therefore, the n-th empirical modes of each signal can be given by where i = 1, 2, · · · , K denotes i-th measured signal and n = 1, 2, · · · , N denotes n-th empirical mode.
In empirical wavelet transform, FFT is applied to calculate the spectrum of the object signal and then the spectrum peaks are employed to determine the boundaries of the different modes. For a noise-contaminated signal, it may result in incorrect boundaries because the spectrum peaks are sensitive to noise and interference. To improve reliability of spectrum segmentation, the modified EWT, which is based on local window maxima (LWM) [29], is adopted to extract the significant mode of the measured signals in this paper. It can reliably detect the local spectrum peaks at the cost of computation. The main idea of the method is to find all local maximum values of the spectrum and determine the global maximum value as the first peak. Then, all spectrum values around the global maximum value are set to zero. The other spectrum peaks can be found successively in that way until the number of spectrum peaks meets the predetermined requirement. The modified EWT (LWM-EWT) are not sensitive to noise and can void incorrect spectrum segmentation.
The anti-interference ability of the modified EWT (LWM-EWT) can be verified by the following example. The simulated signal is expressed as: From Figure 2, LWM-EWT can effectively detect the five significant components contained in the simulated signal and can avoid the excessive spectrum segmentation. Thus, LWM-EWT is adopted to extract the significant modes from the multiple-component signals captured in a complex system in this paper, in order to to obtain more reliable results.
Supposing that N modes are obtained by EWT performing onto the i-th signal f i (i = 1, 2, · · · , K), they can be denoted as f i n , for n = 1, 2, . . . N. The relative energy values of N modes are calculated by the following equation: In general, the mode corresponding to the maximum value of relative energy can be considered as the significant one. For a real signal, if five modes are obtained by EWT, their relative energy values are denoted as a vector [0.3, 0.5, 0.1, 0.07, 0.03], then the second mode is considered as the significant one, due to its maximum value.
Suppose K signals are measured from a complex system, for the i-th signal f i , the significant mode g i can be extracted by EWT from f i , according to the relative energy equation expressed in Equation (11), for i = 1, 2, · · · , K. Therefore, K significant modes can be obtained and denoted as g i (i = 1, 2, · · · , K). The major process of significant mode extraction can be expressed by the following algorithm: • Load all K original signals ( f i , i = 1, 2, · · · , K) measured from a complex system. • For the i-th signal f i , perform empirical wavelet transform and determine its significant mode g i a. Set the maximum mode number N i and window length W i , according to prior knowledge. b. Calculate the magnitude spectrum of the i-th signal by fast Fourier transform (FFT). c. Use the LWM algorithm to find N i spectrum peaks, using mask window with length W i . d. Calculate N i − 1 boundary values ω n for dividing the spectrum, according to N i peaks. e. Construct N i empirical wavelets with N i + 1 boundaries, including ω 0 = 0 and ω N = π. f. Perform the constructed wavelet on f i and obtain N i modes, denoted as f i n , n = 1, 2, · · · , N i . g. For N i modes, calculate the relative energy values, denoted as p n (n = 1, 2, · · · , N i ). h. Select the mode with largest relative energy as the significant one g i , corresponding to f i .
• Execute the above operations until all K significant modes g 1 , g 2 , · · · , g K have been output.

Synthesize Indicator Function
For a complicated system, suppose K signals f 1 , f 2 , · · · , f K from different sensors (temperature, humidity, pressure, etc.) can be recorded simultaneously. Each measured signal only reflects one side of operational states. To reveal overall operating state, all measured signals must be considered. For simplicity, the significant modes are taken instead of the original signals. To minimize the impacts resulted from the numerical ranges, the values of the significant modes g 1 , g 2 , · · · , g K must be normalized into [0,1]. The normalization process can be represented by where i = 1, 2, · · · , K. The values of max(g i ) and min(g i ) denote the maximum and minimum of the significant mode g i , respectively. h i denotes the normalized result of g i . Due to the complex correlating or coupling relationships among the measured signals f 1 , f 2 , · · · , f K , or their significant modes g 1 , g 2 , · · · , g K , we prefer to look for a comprehensive indicator to reveal the operational state of the complex system rather than consider multiple signals simultaneously. For simplicity, the normalized significant modes h 1 , h 2 , · · · , h K are is weighted and summed to make up an indicator function.
where w i represents the i-th weight coefficient corresponding to the normalized significant modes h i , for i = 1, 2, · · · , K. In particular, they meet the condition of ∑ K i=1 w i = 1. In general, there are two ways to obtain the weight coefficients w 1 , w 2 , · · · , w K in Equation (13): one is to derive the weight coefficients according to the accuracy physical model of the complex system. The other is to use the data-driven method to setup the related empirical formula. However, it is very difficult to determine the weight coefficients by these two methods in practice, due to the fact that there are no explicit mathematical models or available empirical formulas to express complicated relationships among the measured signals. For a specific problem, the field engineers may adjust the weight coefficients based on their prior knowledge. Under some unknown conditions, the weight coefficients can be set to the same, i.e. w 1 = w 2 = · · · = w K = 1/K. It will lead to it being impossible to accurately describe the relationships among the measurement signals, and then lead to inaccuracies of the indication function and predicted results. If the importance of each signals are known, the weight coefficients must be adjusted correspondingly.

Forecast Future Trend
On the basis of the indicator function l obtained by (13), we can accurately forecast the trend which reflects the future operational states of the complex system. The autoregressive moving average (ARMA) model is applied to implement this task in this paper. In the theory of ARMA, a measured signal is considered to be a set of random variables that depend on time t. Although the individual value which makes up the signal is uncertain, the changes of the entire signal follow a certain rule that can be approximately described by a mathematic model [3,30]. The core idea of future trend forecasts is to use the extrapolation mechanism constructed by ARMA to obtain the better prediction result. The process of ARMA(p, q) can be represented by where l t denotes the indicator function l depending on time t. ϕ i denotes the autoregressive (AR) coefficients at lags i. a t and a t−j denote the residual or error terms. θ j denotes the the moving average (MA) coefficients at lags j. p and q denote the number of AR and MA coefficients, respectively. From (14), the model of ARMA(p, q) is a memory system that includes the past states and various noise. That is, the sequential value at the certain moment can be represented by a linear combination of p historical observations and q moving average values of a white noise sequence [4,5]. The AR coefficient ϕ i (i = 1, 2, . . . , p) determines the effect of the historical observations, while the MA coefficient θ j (j = 1, 2, . . . , q) determines the effect of random factors [3,30]. One step ahead prediction can be represented as Similarly, L (> 1) steps ahead forecast is expressed as Before using the ARMA(p, q) model, we need to test the stationarity of the indicator function, using the Augmented Dickey-Fuller (ADF) criterion. If it is not stationary, the differential transformation is performed until the transformed result is stationary [1]. Then, we determine the orders of the ARMA model for the minimization of the selected criterion function. The criterion functions include the Final Prediction Error (FPE), Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC), etc.
Another important step is parameter estimation of the ARMA model. In general, the maximum likelihood estimation and the least square estimation can be used to estimate the coefficients of the ARMA model.
In this paper, we determine the parameter ranges of (p, q) using the autocorrelation function and the partial autocorrelation function, which are performed onto the indicator function. Then, we test the ARMA(p, q) model in the valid ranges of (p, q), from low orders to high ones successively. The appropriate model can be determined finally according to AIC criterion proposed by Akaike [31]. In this way, the forecast values can be calculated by (15) and (16), using the indicator function l and the selected ARMA(p, q) model. The final results can be considered as the predicted results, which reflect the future operational states of the monitored complex system. In general, the predicted results are evaluated by the percentage of relative errors: where l true is the true state value of the complex system.

Future Trend Forecast of Engine State
To verify the effectiveness and practicability of the proposed method in Section 2, we employ the experimental data from the Turbofan Engine Degradation Simulation Data Set that are recorded by several sensors to characterize engine state evolution under different operational conditions [32]. Figure 3 gives the simplified schematic of the 90 K engine. Several sensors are arranged at the High-Pressure Compressor (HPC) to monitor the engine states. The selected experimental data (from four sensor channels) are shown in Figure 4.  The empirical wavelet transform (EWT) is performed on those data and the transformed results (IMFs-intrinsic mode functions) are shown in Figure 5.
Using the formula shown (11), the relative energy values of all IMFs, decomposed from each sensor, are calculated. According to the calculation result, we can find that the first component (IMF 1 ) has much more energy than other IMFs, for all sensor channels. For each sensor channel, IMF 1 is selected as the significant mode and the other IMFs are considered as random noise or interference. The significant mode IMF 1 and corresponding signal are illustrated in Figure 6.
According to Figure 6, all significant modes coincide with their original signals very well. That is, they contain the main information from the original signal. Prior to synthesizing the indicator function, the significant modes must be normalized to the interval [0,1], in order to minimize the impacts of the numerical ranges of the sensor signals. The normalized results are shown in Figure 7.
According to Figure 7, the indicator function, reflecting the comprehensive behavior of the turbofan engine system, is synthesized by (13) and shown in Figure 8. The weight coefficients are set as the same value for all significant modes because we have no explicit mathematical models or available empirical formulas for the engine system. For any complex systems in practice, it is very difficult to accurately determine those weight coefficients, but the field engineers can moderately adjust the weighted values, based on their prior knowledge and experience.  In Figure 8, the indicator function that characterizes the turbofan engine system states slightly fluctuates around 0.5. That is, the engine system is running smoothly or its operating status has no significant changes. In particular, the waveform of the indicator function depends not only on the rise or fall of the significant modes, but also on their increase or decrease slopes.
In addition, empirical mode decomposition (EMD) can also be applied to extract the significant modes of four signals shown in Figure 4. The EMD results of those signals are shown in Figure 9. Similarly, we can also determine the first components (IMF 1 ) as the significant modes by using (11). The significant modes and original data are illustrated in Figure 10. Compared with Figure 6, the significant modes extracted by EWT are a little better than those extracted by EMD. Similarly, we can normalize the significant modes obtained by EMD and synthesize the corresponding indicator function, shown in Figure 11.  According to Figure 11, the indicator functions, obtained by EWT and EMD, are very close to each other. If those indicator functions are applied to forecast the future states of the turbofan engine system, we can obtain similar results.
Based on the indicator function obtained by EWT, shown in Figure 8, the first 200 values are selected as the known data to predict the future values from 201 to 287, using the autoregressive moving average (ARMA) model expressed by (14). The forecast results of the turbofan engine system are shown in Figure 12. According to Figure 12, the engine system operates normally in the near future (values from 201 to 287) and the forecast trend does not significantly deviate from the actual state curve. The predicted results of this experiment are consistent with the actual situations very well in the short term, although the biases increase over time.

Trend Forecast of Exchange Rates
The trend forecast approach proposed in Section 2 can be easily extended to a variety of fields, including science, technology, engineering, society, finance, etc. Here, we use the exchange rate data to verify the effectiveness of the proposed method. The adopted exchange rate data of USD/EUR, USD/GBP, USD/JPY and USD/CN are shown in Figure 13. The empirical wavelet transform (EWT) results of those exchange rate data are shown in Figure 14. Similarly, the relative energy values of IMFs are calculated by (11) and all IMF 1 components are selected as the significant modes. The extracted significant modes and their corresponding exchange rate data are shown in Figure 15. Each significant mode (IMF 1 ) can represent the overall changes of the original exchange rate data. Then, normalized significant modes are shown in Figure 16. The normalized significant modes, shown in Figure 16, are applied to synthesize the indicator function of the exchange rates, with the same weight coefficients. The synthesized result is shown in Figure 17. Although the normalized significant modes greatly change over time, the synthesized indicator function changes relatively little. This is consistent with the actual economic and financial developments. According to the indicator function, the comprehensive trend of the exchange rates is forecasted by the ARMA model. The forecast results (from July 2013 to May 2014) are shown in Figure 18. The predicted results are well consistent with the actual exchange rates in the short term. This experiment also shows that the presented method is applied successfully in the finance field.

Conclusions
For any complex system in engineering and technical fields, operational state prediction is a very important technique to guarantee safe operation. On the basis of empirical wavelet transform and the autoregressive moving average model, an effective forecast method is proposed and discussed in this paper. For the multiple signals measured from a complex system, their significant modes are extracted reliably, through taking advantage of empirical wavelet transform and relative energy relationships. Those significant modes are very consistent with the original signal, but they have smoother waveforms or higher signal-to-noise-ratio (SNR). To suppress the negative impacts resulting from numerical ranges, those significant modes are normalized, weighted and summed as an indicator function, which reflects the comprehensive operational state over time. According to the simple indicator function, the future running trend of the complex system is reliably predicted by autoregressive moving average technique. The effectiveness and practicability of the presented method have been verified by a set of experiments whose multiple channel signals were recorded from actual complex systems. The experimental results show that the proposed method has been applied successfully in engineering and financial fields. The proposed approach can also be easily extended to science, technical, social and other fields.
To obtain better forecast results of complex systems, the future work related this paper may focus on: (1) improving the robustness of significant component extraction, and (2) enhancing the accuracy of predict methods.