Improving prediction accuracy of cooling load using EMD, PSR and RBFNN

To increase the accuracy for the prediction of cooling load demand, this work presents an EMD (empirical mode decomposition)-PSR (phase space reconstruction) based RBFNN (radial basis function neural networks) method. Firstly, analyzed the chaotic nature of the real cooling load demand, transformed the non-stationary cooling load historical data into several stationary intrinsic mode functions (IMFs) by using EMD. Secondly, compared the RBFNN prediction accuracies of each IMFs and proposed an IMF combining scheme that is combine the lower-frequency components (called IMF4-IMF6 combined) while keep the higher frequency component (IMF1, IMF2, IMF3) and the residual unchanged. Thirdly, reconstruct phase space for each combined components separately, process the highest frequency component (IMF1) by differential method and predict with RBFNN in the reconstructed phase spaces. Real cooling load data of a centralized ice storage cooling systems in Guangzhou are used for simulation. The results show that the proposed hybrid method outperforms the traditional methods.


Introduction
District ice storage cooling systems use centrally-located chilling equipment to generate cooling in the form of chilled water so that it can be distributed to the nearby buildings for air conditioning. As energy prices increase, optimal control of district cooling systems has become more critical for energy cost reduction. Among issues of optimal control, accurate cooling load prediction for next-day demand is essential for the optimal scheduling of cooling system operating.
Various strategies have been developed to model cooling load demand. These attempts can be categorized in two approaches: physical-based software simulating methods and statistical methods. The physical-based software simulating methods are based on the thermal knowledge and physical equations of the buildings [1]. The statistical methods are based on the data collected in the cooling system. For district ice storage cooling systems, it is unrealistic to obtain all the building thermal behaviour. For this reason, statistical methods are more suitable for district cooling load prediction. The statistical methods include the single algorithm and the algorithm combinations. Linear-regression analysis [2], the artificial neural networks (ANN) [3], and the support vector machines (SVM) [4] are the primary ones of the single algorithm. The main feature of the algorithm combinations [5] is to combine more than one single algorithm to obtain prediction precision higher than that could be obtained from a single algorithm.
Regression analysis models formulate the cooling load prediction problem by an empirically mathematical function of the main parameters that affect the cooling load, such as temperature, humidity [6]. Common models reported in the literature include ARX, MLR, and ARMAX models.
Generally, such a model is suitable for long-term predicting with large samples. For instance, Yoshida [7] applies this method to predict air-conditioning load based on an ARX model derived from building load simulation. Results show that its average prediction error is 29% in summer and 12% in winter.
Li [8] applies four different methods including ARMAX, MLR, ANN and RC network to predict short term building cooling load demand, the results show that the MLR model and the ARMAX model are superior in prediction precision .Usually, regression models require a much shorter training time, but are less capable of modelling nonlinearity. Artificial neural network and support vector machine has been proved to obtain prediction precision higher than the regression analysis models, due to its capable of modelling nonlinearity. ANN, SVM, modified ANN and SVM have been widely used to predicting, such as power load forecasting, natural gas usage trend forecasting, energy consumption in a building, and etc [9]. It is reported that, in forecasting a building air conditioning load, compared with back propagation neural network (BPNN), the root mean square error and the average relative error obtained by RBFNN are reduced by 36%. It is also reported that, for building air conditioning load prediction, the root mean square error obtained by SVM is about 50% lower than that obtained by BPNN [6].
Cooling load demand depends on many external and internal factors of cooling system, such as outdoor weather parameters, indoor equipment usage, and indoor human activities. Nonlinear interaction of these factors in cooling system leads to chaos of cooling load demand. The chaos especially the self-similarity and non-stationary of cooling load should be considered in cooling load prediction. EMD, which is data-driven technique that represents nonlinear and non-stationary data as a sum of a finite zero-mean components referred to as IMF, can be employed to transform the selfsimilarity time series into the short-related components. Ying [10] used EMD to forecast an hourahead wind speed and power, and Yang [11] integrated EMD, Chaos-based neural network and PSO for financial time series forecasting. EMD has also been used to predict networking traffic [12], price of power [13], and etc. To improve the prediction accuracy for the next-day Cooling load demand of a centralized ice storage cooling systems in Guangzhou, we propose the hybrid EMD-PSR-RBFNN model.
The rest of this paper is organized as follows. Section 2 is a collection of cooling load data, including tests. This serves to confirm that the data are chaos time series. In section 3, we introduce the proposed EMD-PSR-RBFNN model. Section 4 consists of simulations and results of the proposed model and other relevant models. Finally, Section 5 concludes this paper and summarizes the discussion.

Data Collection and Proofing of Chaos
This paper adopted the real cooling load time series of a centralized ice storage cooling systems in Guangzhou, the data that is collected a total of 121 days from July 1, 2012 to October 31, 2012, in which the former 106 data are taken as the training set and the remained 15 data as the testing set.
In order to investigate the chaotic nature of the cooling load time series, firstly, we applied mutual information method [14] and Cao method [15] to calculate the delay time and the embedding dimension respectively. Time delay is the delay time between the adjacent two coordinates of reconstructed phase space vector. Mutual information method utilizes the first minimum value of mutual information function to determine the optimum delay time, while the mutual information function is a measure of general random association between two random variables. The delay time of the cooling load time series in this paper is calculated to be 1.
Takens embedding [16] theorem deems that: For infinite and noiseless scalar time series with ddimensional chaotic attractors, an m-dimensional phase space embedded can always be found in the sense of topological invariant, as long as dimensions 2 1 m d   . By Cao method, the embedding dimension m of the cooling load time series is calculated to be 5.
Based on the calculated delay time and embedding dimension, Wolf method [17] is applied to calculate the maximum Lyapunov exponent. The maximum Lyapunov exponent of the experimental data is 0.0676, which indicates that the cooling load series has chaotic nature. Phase space reconstruction method of chaotic series is then applied to predict the cooling load demands in this paper.

Method
Chaos is a complex behaviour manifested by the low-order deterministic nonlinear dynamical system, which has the characteristic of nonlinearity, self-similarity and etc. In this paper, the proved selfsimilarity cooling load chaotic series is decomposed with EMD, which broken the self-similarity time series into several short-related IMFs. An IMF combining method is proposed to balance the computation complexity and prediction accuracy, and the differential method is used to smooth the high frequency component of EMD (IMF1) to improve the prediction accuracy further. Then reconstruct the phase space for the combined series and predict with RBFNN algorithm.

Emd and Combination of IMFs
EMD in Hilbert-Huang transform (HHT) [18] is mainly applied to decompose the original signal into several IMFs and a residue. The essence is to determine the basic shock mode of valid signal in the series by experience. The IMFs which contain local characteristics of cooling load in different time must satisfy the following two conditions: ①in the whole data set, the number of extreme and the number of zero-crossings must either be equal or differ at most by one. ②at any point, the mean value of the envelope defined by local maxima and the envelope defined by the local minima is zero. N.E .Huang, who was the founder of HHT, explained the meaning of these two conditions in the literature [19]. The first condition stipulates that the signal form of IMF is similar to the traditional narrow band requirements for a stationary Gaussian process, which can be characterized as a form of () (t) e jt a  , in which () at represents the envelope of the signal, () t  represents the phase of the signal .The second condition guarantees the symmetry of IMF waveform, and ensures that the instantaneous frequency does not fluctuate after Hilbert transform.
The process of EMD is as follows: Firstly, identify all the local extreme points of cooling load series () ht . Secondly, connect all the local maxima points and the local minima points by a cubic spline as the upper envelope and the lower envelope, the upper and the lower envelopes should cover all the series between them, the mean of upper and lower envelope value is designated as (t) l . Thirdly, set , judge whether () mt is an IMF component ,if () mt satisfies the conditions of IMF, () mt is the first IMF of () ht , otherwise () mt is treated as () ht , and repeat the above processes. Suppose after k iterations, the difference value between signal and the mean of the envelope is m t is a new series which removes the low frequency, after repeating up to 1 k  times , the difference value is In order to reduce the computation, when the equation (1) is The threshold value  is set according to the actual needs. The smaller the threshold is, the closer to the real IMF component ,1 () k m t which satisfies equation (1) is. In this paper, the threshold is set to 0.3. Assuming will be considered as () ht , repeating the above processes. When the residual component is a monotonic function or its amplitude less than a predetermined value, just stop calculating. So we can get several IMFs, these components can be designated as ()

Phase Space Reconstructing
Takens theorem [16] considers that the evolution of a component is decided by the interaction of other components in the system, and the information in these related components are implied in the development process of any component. Therefore, phase space reconstructing [20]  (1) x (1 ) x(1 (m 1) ) x(2 (m 1) ) x(3 (m 1) )

Rbfnn
In the reconstructed phase space, RBFNN algorithm is utilized to predict the cooling load demand. There are three parameters need to be found for RBFNN simulation: the centres of base functions, the variance and the weights between the hidden layer and the output layer. In this paper, the centres of base functions were selected by self-organized method, and Gaussian function is chosen as the radial basis function. Therefore, the activation function of RBFNN is expressed as In equation (5), In this paper, k means clustering algorithm [21] is used to calculate the centre points i c , and the least squares method [22] is used to calculate the connection weights ij w .

Hybrid Prediction Model Based on EMD-PSR-RBFNN
This paper proposes a hybrid method based on EMD, PSR and RBFNN (EMDPSRRBFNN) for shortterm cooling load demand predicting. The steps are as follows: Step1: Pre-process the original cooling load series, including data cleansing and data restructuring.
Step2: Decompose the cooling load series with EMD, and then get k IMFs and a residual.
Step3: Combine the IMFs according to the IMF combination methods described in section 3. tM  of phase space domain, and process the IMF1 by differential method.
Step6: Set up the RBFNN prediction models in the reconstructed phase space for IMF1, IMF2, IMF3, the combined low-frequency component (call it IMF4-IMF6 combined) and the residual respectively, then anti-normalize the predicted values.
Step7: Add the predicted values above together to achieve the final prediction result of the cooling load demand.

Simulation and Result
This section discusses the simulation works and results of the proposed EMD-PSR-RBFNN model in predicting the cooling load demand. Works begin with evaluation criteria in section 4.1. The results in section 4.2 verified that EMD model acts as a good filter in non-stationary and self-similarity cooling load time series systems. Simulation results of each EMD component using the PSR-RBFNN are described in section 4.3. Both are displayed in fig.3 and table 3. Following to this, it is the determination of high-frequency components, low-frequency components of EMD. Predicting comparison of different IMFs combination methods are shown in section 4.4. Section 4.5 serves to improve the prediction accuracy of IMF1. Section 4.6 conducts to demonstrate the performance of the proposed EMD-PSR-RBFNN model in predicting cooling load demand. We used models of SVM, LSSVM, RBFNN and SVM, LSSVM, RBFNN based on EMD-PSR for predicting comparison.

Evaluation Criteria
The root mean square relative error (RMSRE) and the mean relative error (MRE) are used as the evaluation criteria to assess the prediction accuracy, where in equation (7), ' y i is the actual load result, i y is the predicted load result, n is the total number of the predicted result, 12   The Hurst index H is the parameter to characterize the self-similarity. When 0.5,1 H  ( ) , the series is self-similarity. The greater the H is, the stronger the self-similarity is. As shown in table 2, after EMD, the self-similarity of IMF1~IMF4 reduced significantly. It indicates that EMD reduced the selfsimilarity of the original cooling load time series. From figure 2 in section 4.2.1, we can see that the waveforms of IMF5, IMF6 and residual approach sinusoidal waveforms gradually which resulting to the random reduced greatly. The calculation error of its Hurst index H is large and we will not discuss this topic in this paper.
Normally, a self-similar random process is also a non-stationary random process. Long-range dependence time series has the disadvantages of the high model algorithm complexity and poor precision. EMD is an effective method to break the self-similarity time series down into several shortrelated components. Figure 2 shows the EMD components of the original cooling load series and their predicting results with PSR and RBFNN modelling. In figure 2, the horizontal axis represents the samples (days) and the vertical axis represents the cooling load demand (W).  Prediction errors corresponding to each EMD component are shown in table 3.As can be seen in table 3 and figure 2, there exists larger prediction errors for the former IMFs (especially the IMF1) when RBFNN modelling, while the prediction errors are smaller for the later EMD components. In section D, we will compare different combination ways of the EMD components to balance the computation complexity and the prediction accuracy.

Comparing of Prediction Errors for Different IMF Combination Methods
The means of IMF1 to IMF6 are calculated to be -9.9246, 6.4726, -2.9915, 100.369, -83.4171 and -86.3867 respectively. The first IMF whose mean deviates from zero significantly is IMF4. It indicates that IMF1, IMF2 and IMF3 are high-frequency EMD components and represent the influence factor of significant events. In the meanwhile, the IMF4, IMF5, IMF6 are low-frequency EMD components and represent the normal fluctuations factor of cooling load. The residual represents the long-term trend of the cooling load demand. To illustrate why combining the EMD components as mentioned in section 3.1, this section gives the prediction error of two different combination ways of the EMD components respectively in table 4  and table 5, then compares the results of table3, table 4 and table 5. In addition to the different combination ways of the EMD components, the combined components are all predicted respectively with RBFNN model after PSR.
According to the IMFs combination method mentioned in section 3.1, keep the IMF1, IMF2,IMF3 and the residual unchanged, and superimpose IMF4, IMF5, IMF6 into a new low-frequency component (call it IMF4~IMF6 combined). The prediction error of each combined component is shown in table4. We can find that the predicting accuracy of the new combined low-frequency component and the residual are excellent, but the prediction errors of the high-frequencies, especially the IMF1, are so great. The predicting error of another IMFs combination method is shown in table 5. In which, IMF1~IMF3 are superimposed into a new high-frequency component, IMF4~IMF6 are superimposed into a new low-frequency component, and keep the residual unchanged as before. Table V shows that the prediction error of superimposing IMF1~IMF3 is so high that the final error (RMSRE) of the cooling load prediction runs up to 3.695%. Therefore, to take the prediction accuracy and the computation complexity into account, this paper selects the combination method of EMD components which is proposed in section 3.1. Figure 3(a) and table 3 in section 4.3 show that the stochastic of IMF1 leads to the poor prediction accuracy. Differential [25] is found as a good method for pre-processing the IMF1 before RBFNN prediction. Table 6 shows the RESRE and MRE of the IMF1 prediction with RBFNN after zero to six order differential operation. It is evident that the RESRE and MRE of the IMF1 prediction after one order differential operation is the minimum, which are 1.9% and 1.3% respectively. Obviously, one order differential pre-process to IMF1 is the first choice.

Improving Prediction Accuracy for High Frequency Component of EMD
Comparing to table 4, the only change to table 7 is that the IMF1 is pre-processed by one order differential operation. Table 7 shows that the RESRE and MRE of the IMF1 prediction with RBFNN are decreased from 2.267% to 1.903% and 1.539% to1.3035% respectively. And the final prediction errors are decreased from 2.221% to 1.794% and 1.517% to1.288% respectively. The differential operation to IMF1 improved the prediction accuracy indeed.

Comparison with Different Prediction Methods
In order to discuss the influence of different prediction methods on the prediction accuracy, three single prediction methods and three hybrid prediction methods are compared. The single prediction methods are LSSVM (least square support vector machine), SVM, and RBFNN. The hybrid prediction methods are EMD-PSR-SVM, EMD-PSR-LSSVM, and EMD-PSR-RBFNN (the proposed method). The process of hybrid prediction methods are: decomposing the original cooling load series with EMD, then combining the IMFs as mentioned in section 3.1, reconstructing the phase space for the combined EMD components, predicting with different single algorithms finally. The IMF1 of the three hybrid prediction methods are all pre-processed with differential method. The RMSRE and MRE of different prediction methods are shown in table 8.   Table 8 shows that RESRE and MRE of the three hybrid prediction methods are smaller than the three single prediction methods. Among the hybrid prediction methods, the prediction accuracy of the proposed EMD-PSD-RBFNN is the maximum.

Conclusion
Predicting non-stationary and nonlinear cooling load demand for next day is very challenging, since there are many correlated independent variables involved. This paper introduced a new model comprised the EMD algorithm, the PSR, and the RBFNN. While the EMD functions as a filter for self-similarity and non-stationary cooling load time series, the RBFNN algorithm predicts the EMD-PSR components.
We proved the chaotic nature of the real operation cooling load time series of a district ice storage cooling systems in Guangzhou. The results in section 4.2 verified that EMD model acts as a good filter in non-stationary and self-similarity cooling load time series systems. In terms of the EMD component combination, different EMD component combination methods results to different prediction accuracy. To balance the computation complexity and prediction accuracy, the combination method is excellent.