Elsevier

Applied Energy

Volume 236, 15 February 2019, Pages 778-792

Designing a multi-stage multivariate empirical mode decomposition coupled with ant colony optimization and random forest model to forecast monthly solar radiation

https://doi.org/10.1016/j.apenergy.2018.12.034

Highlights

  • Solar energy can provide a viable renewable energy alternative.

  • A multi-stage machine learning model is designed to forecast solar radiation.

  • The proposed model outperforms comparative models in Australia’s energy sites.

  • The proposed multi-stage model can be adopted as a pertinent decision-support tool.

Abstract

Solar energy is an alternative renewable energy resource that has the potential to cleanly address the increasing demand for electricity in the modern era and help avert future energy crises. In this paper, a multi-stage multivariate empirical mode decomposition coupled with ant colony optimization and random forest (i.e., MEMD-ACO-RF) is designed to forecast monthly solar radiation (Rn). In the first stage of the proposed MEMD-ACO-RF model, the MEMD algorithm demarcates the multivariate climate data from January 1905 to June 2018 into resolved signals, i.e., intrinsic mode functions (IMFs) and a residual component. In the second stage, after the multivariate IMFs are computed, the ant colony optimization (ACO) algorithm is used to determine the best IMF-based features for model development by incorporating the historical lagged data at (t − 1). In the third stage, the RF model is applied to the selected IMFs to forecast monthly Rn. The results are benchmarked against M5 tree (M5tree) and minimax probability machine regression (MPMR) models integrated with MEMD and ACO, giving the comparative hybrid MEMD-ACO-M5tree and MEMD-ACO-MPMR models, respectively. The multi-stage MEMD-ACO-RF model is also evaluated against the standalone RF, M5tree and MPMR models. The proposed model and the comparative models are tested at three locations in the state of Queensland, Australia. Based on robust evaluation metrics, the proposed multi-stage MEMD-ACO-RF model outperformed the comparative models during the testing phase, showing promise as an accurate forecasting tool. The proposed multi-stage MEMD-ACO-RF model can be considered a pertinent decision-support framework for monthly Rn forecasting.

Introduction

The quest for increased generation of electricity from renewable sources is the key priority of many nations (including Australia) as a measure to combat and mitigate the drastic impacts of the changing climate. However, during the year 2014–2015, the electricity generated in Australia from renewable sources declined by 7%, of which hydro-electric generation showed the largest decline [1]. This decline was predominantly due to a reduced hydro-electric generation brought about by the recession of dam water levels following a prolonged period of drought. In contrast, solar energy could provide a viable alternative for Australia which is susceptible to drought events. Australia is one of the countries that receive an extremely high level of annual global solar radiation, indicating that solar energy could sustain a high percentage of Australia’s electricity demand [2]. In addition, solar energy has the least adverse impact on the environment, hence is regarded as one of the cleanest renewable energy sources [3]. In particular, the state of Queensland (aka the sunshine state) has a very high rate of incident solar radiation with the State Government committing to a 50% renewable energy target by the year 2030 [4].

Even though this is a very strong proposition, solar energy is known to be highly variable in nature requiring specific technological implementation and grid management systems [2]. Therefore, effective forecasting tools that can potentially be implemented into smart-grid systems are necessary for efficient supply and demand matching to ensure reliable and sustainable solar power generation. A plethora of studies has attempted to forecast solar radiation via data-driven modeling approaches [5], which can largely be classified into classical and hybrid models. In the classical modeling approach, standalone models were applied to forecast solar radiation including (but not limited to) the widely used artificial neural networks (ANN) [6], [7], [8], [9], support vector machine (SVM) [9], [10], [11], support vector regression (SVR), gradient boosted regression (GBR) and random forest [12]. Additionally, Guermoui, Melgani [13] trialed weighted Gaussian process regression both in a parallel forecasting architecture and a cascade forecasting architecture for solar radiation forecasts, while the adaptive neuro-fuzzy inference system (ANFIS) was proposed by Quej, Almorox [9]. Furthermore, a more advanced and efficient algorithm, namely the extreme learning machine (ELM), has also produced worthy results [14]. Yet, these classical and standalone models might not be able to capture all the deterministic features that exist in the historical data when predicting this important variable i.e., solar radiation.

To augment the forecasting capabilities of data-driven modeling, hybrid models have been developed and explored. Gala, Fernández [12] proposed a weighted linear combination of SVR, GBR and random forest regression (RFR) outputs, whereby the respective weights were derived from each individual model's mean absolute error (MAE) during the training period. Hybrid models are developed with the intention of extracting as many pertinent features as possible from the predictor input data set to optimize model performance. A strategic approach to achieving this is to incorporate a suitable feature selection algorithm. Bouzgou and Gueymard [15] applied the maximum relevance – minimum redundancy (MRMR) filter as a feature selection method with ELM in order to optimize forecasting performance. Similarly, fuzzy logic feature pre-processing with an ANN model also gave commendable forecasts [16]. Recently, Salcedo-Sanz, Deo [17] developed a method integrating Coral Reefs Optimization (CRO) with ELM (CRO-ELM), where the CRO acted as a feature selection function guided by an ELM algorithm. The authors compared the forecasts of their CRO-ELM method with those of the Grouping Genetic Algorithm integrated with the ELM model (GGA-ELM), and also with hybrids of Multivariate Adaptive Regression Splines (MARS), Multiple Linear Regression (MLR) and Support Vector Regression (SVR). They found that CRO-ELM performed better than the other models. A major shortcoming of these hybridized models was that they concentrated only on feature selection, while the other important issue of non-stationarity was ignored.
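As a rough illustration of the MAE-weighted combination attributed to Gala, Fernández [12] above, the following Python sketch assumes inverse-MAE weights normalized to sum to one. The exact weighting rule is not stated in this text, so the rule, the function names and all numbers below are hypothetical.

```python
def mae(observed, predicted):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def inverse_mae_weights(training_maes):
    """Assumed rule: weight each model by 1/MAE, normalized to sum to 1."""
    inverse = [1.0 / m for m in training_maes]
    total = sum(inverse)
    return [v / total for v in inverse]


def combine(forecasts, weights):
    """Weighted linear combination of per-model forecast series."""
    n = len(forecasts[0])
    return [sum(w * f[t] for w, f in zip(weights, forecasts)) for t in range(n)]


# Hypothetical training-period values (not taken from the paper).
observed = [20.0, 22.0, 18.0, 24.0]
svr = [21.0, 21.0, 19.0, 23.0]   # training MAE = 1.0
gbr = [22.0, 24.0, 16.0, 26.0]   # training MAE = 2.0
rfr = [20.5, 22.5, 17.5, 24.5]   # training MAE = 0.5

weights = inverse_mae_weights([mae(observed, m) for m in (svr, gbr, rfr)])
blended = combine([svr, gbr, rfr], weights)
```

Under this assumed rule, the model with the smallest training error (here the hypothetical RFR series) receives the largest weight in the blend.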

Owing to the day-to-day variability in irradiance, cloud cover, and other environmental and atmospheric parameters, solar energy is a strongly random process [18], which makes its time series intrinsically stochastic. Accordingly, appropriate multi-resolution data pre-processing tools are necessary to extract the information embedded within the non-stationary historical time series. A commonly used procedure is to decompose the predictor signals using the discrete wavelet transformation (DWT) into respective detailed components and an approximation component, and many studies have applied the DWT. Royer, Wilhelm [19] combined DWT with ANN to forecast short-term solar radiation, while Deo, Wen [20] combined DWT with SVM, resulting in their DWT-SVM model. Although DWT-based models achieved improved performance in comparison to the classical models, they have inherent drawbacks. Firstly, the DWT is impaired by the decimation effect, whereby half of the wavelet coefficients are recursively lost at each of the subsequent transformation levels [21], [22]. Additionally, DWT requires the adoption of a pre-selected mother wavelet; otherwise, a time-consuming trial and error process is often needed. Moreover, different decomposition levels have been found to generate varying forecasting performances [22].
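The decimation effect described above can be seen in a minimal decimated transform. The sketch below uses the Haar wavelet purely for simplicity (an assumption; the cited studies may use other mother wavelets), and the input series is hypothetical.

```python
import math


def haar_dwt_level(signal):
    """One level of the decimated Haar DWT on an even-length signal.

    Returns (approximation, detail); each has half as many samples
    as the input, which is the decimation step.
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_dwt(signal, levels):
    """Repeatedly decompose the approximation, keeping each level's details."""
    details = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_dwt_level(approx)
        details.append(detail)
    return approx, details


# Hypothetical 8-sample series (for example, eight monthly values).
series = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
approx, details = haar_dwt(series, levels=3)
lengths = [len(d) for d in details]  # coefficients available at each level
```

Each level leaves half as many coefficients as the one before, so deeper levels describe the series with progressively fewer time-localized values.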

An alternative multi-resolution analysis (MRA) tool, the empirical mode decomposition (EMD) developed by Huang, Shen [23], segregates higher frequency input series into lower frequency resolved components. The EMD gained prominence due to its self-adaptability, i.e., it neither requires any prescribed frequency bands nor imposes any basis functions [24]. This property of complete data dependence makes EMD advantageous in terms of extracting pertinent features from the predictor time series without any loss of information. In addition, the set of decomposed salient features is a representation of the physical structure of the data, since the EMD temporally decomposes the predictor inputs using the extrema information of the riding waves [25]. Accordingly, EMD integrated with ANN has proved to be a successful model for forecasting solar radiation [24]. In another recent study, Wang, Tian [18] combined EMD with local mean decomposition (LMD) to decompose the non-stationary solar radiation series into simpler components and employed least squares support vector machine (LSSVM) and Volterra models for forecasting. A comparison of these EMD-based algorithms with an autoregressive integrated moving average (ARIMA) method revealed that the EMD-LMD-LSSVM-Volterra model performed better.
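To make the role of the extrema concrete, the following Python sketch performs a single sifting step of univariate EMD. It is a simplification under stated assumptions: the envelopes are piecewise linear (standard EMD uses cubic splines), no stopping criterion is applied, and the test signal and function names are hypothetical.

```python
def local_extrema(x):
    """Indices of strict interior local maxima and minima."""
    maxima, minima = [], []
    for i in range(1, len(x) - 1):
        if x[i] > x[i - 1] and x[i] > x[i + 1]:
            maxima.append(i)
        elif x[i] < x[i - 1] and x[i] < x[i + 1]:
            minima.append(i)
    return maxima, minima


def linear_envelope(x, knots):
    """Piecewise-linear envelope through x at the knot indices,
    held constant before the first knot and after the last one."""
    env = []
    for i in range(len(x)):
        if i <= knots[0]:
            env.append(x[knots[0]])
        elif i >= knots[-1]:
            env.append(x[knots[-1]])
        else:
            for left, right in zip(knots, knots[1:]):
                if left <= i <= right:
                    frac = (i - left) / (right - left)
                    env.append(x[left] + frac * (x[right] - x[left]))
                    break
    return env


def sift_once(x):
    """Subtract the mean of the upper and lower envelopes from x."""
    maxima, minima = local_extrema(x)
    if not maxima or not minima:
        raise ValueError("no interior extrema: nothing left to sift")
    upper = linear_envelope(x, maxima)
    lower = linear_envelope(x, minima)
    mean_env = [(u + l) / 2.0 for u, l in zip(upper, lower)]
    candidate = [xi - m for xi, m in zip(x, mean_env)]
    return candidate, mean_env


# Hypothetical signal: a fast +/-1 oscillation riding on a linear trend.
x = [0.5 * i + (1.0 if i % 2 == 0 else -1.0) for i in range(11)]
candidate, mean_env = sift_once(x)
```

Away from the boundaries, the candidate component recovers the fast oscillation and the mean envelope recovers the slow trend; a full EMD would repeat this step until the candidate qualifies as an IMF and then continue on the remainder.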

Yet, the key issue with EMD and its variants (including ensemble EMD (EEMD) [26], the complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) [27], and the improved complete ensemble empirical mode decomposition with adaptive noise (ICEEMDAN) [28]) is that they can only be applied in a univariate setting, i.e., only the significant lags of the solar radiation time series can be used as predictors to forecast future solar radiation values. This is a critical issue, since the variability of incident solar radiation depends on many dynamic meteorological and environmental factors that are then left out. Meteorological parameters such as air temperature, sunshine duration, relative humidity and cloud cover are indeed correlated with solar irradiation [29]. Therefore, these parameters need to be appropriately incorporated into the respective models.

To utilize several predictors and subsequently extract most, if not all, relevant predictive features, a new hybridized modeling approach based on multivariate empirical mode decomposition (MEMD) is introduced in this study and applied to forecast monthly solar radiation at three sites. The MEMD is an extension of standard EMD to multivariate signals, where EMD has been shown to accurately perform data-driven time-frequency analysis of complex, nonlinear and multichannel dynamical processes [30]. Another key advantage is that the MEMD overcomes the mode alignment issue in the joint analysis of multiple oscillatory components within a higher dimensional signal, which has remained unresolved in standard EMD [31]. The MEMD has been applied successfully to forecasting evapotranspiration [32], soil water [33], crude oil price [34] and iceberg drift [35], but this is the first application of this technique to solar radiation forecasting in Australia.

For solar radiation forecasting, eight meteorological predictor time series (i.e., maximum temperature, minimum temperature, precipitation, evaporation, vapour pressure, estimated relative humidity at maximum temperature, estimated relative humidity at minimum temperature and potential evapotranspiration) acquired from the Scientific Information for Land Owners (SILO) are concurrently transformed into respective IMFs and residual components via the MEMD process. The MEMD addresses the important non-stationarity issue via a simultaneous demarcation of the input series into resolved components. The important problem of feature selection is resolved by implementing a robust feature selection method called ant colony optimization (ACO). The ACO is a bio-inspired algorithm that mimics the behavior of ant colonies [36] and has been successfully used in different applications [37], [38], [39], [40], [41]. The novelty of this paper lies in the proposal of a hybridized data-intelligent model that integrates MEMD and ACO with a robust tree-based model, namely random forest (RF), resulting in the hybrid MEMD-ACO-RF model for Rn forecasting. The proposed model simultaneously addresses the non-stationarity and feature selection problems that negatively impact Rn forecasting models. The MEMD-ACO-RF model is benchmarked against the comparative hybrid M5tree and MPMR models as well as the standalone RF, M5tree and MPMR models in forecasting monthly solar radiation at three solar-rich stations in the state of Queensland, Australia. In the remainder of this paper, the theoretical frameworks of these models are presented, followed by descriptions of the study region, results, discussion, and conclusions.
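The abstract states that features are built from historical lagged data at (t − 1). A minimal Python sketch of that pairing is given below, assuming each decomposed component is stored as a named monthly series; the feature names, values and the function itself are hypothetical and do not reproduce the authors' code.

```python
def lag_one_pairs(predictors, target):
    """Pair predictor values at month t-1 with the target at month t.

    predictors: dict mapping a feature name to its monthly series.
    target: monthly target series (e.g. Rn) of the same length.
    Returns (X, y, names) with one row of X per forecast month.
    """
    names = sorted(predictors)
    n = len(target)
    for name in names:
        if len(predictors[name]) != n:
            raise ValueError(f"series {name!r} has the wrong length")
    # Row for month t holds every feature observed at month t-1.
    X = [[predictors[name][t - 1] for name in names] for t in range(1, n)]
    y = [target[t] for t in range(1, n)]
    return X, y, names


# Hypothetical components (names and numbers are illustrative only).
predictors = {
    "Tmax_IMF1": [1.0, 2.0, 3.0, 4.0],
    "Rain_IMF1": [10.0, 20.0, 30.0, 40.0],
}
rn = [5.0, 6.0, 7.0, 8.0]

X, y, names = lag_one_pairs(predictors, rn)
```

The first month is dropped because it has no preceding observation; a feature selection step and a regression model would then operate on the columns of X.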

Section snippets

Theoretical overview

In this section, an overview of the random forest (RF) forecasting model, multivariate empirical mode decomposition (MEMD), the ant colony optimization (ACO) method, and the comparative M5tree and minimax probability machine regression (MPMR) models is presented.

Study region and datasets

The study area is located in Queensland (QLD), which is Australia’s sunshine state that has an abundance of solar resource. To construct a large set of predictor matrices, predictor data for neighboring meteorological sites were acquired from the Scientific Information for Land Owners (SILO) Portal developed by the Queensland Department of Environment and Resource Management [72]. The data is comprised of monthly rainfall (Rain; mm), maximum (Tmax; °C) and minimum temperature (Tmin; °C),

Results

The performance of the proposed multi-stage MEMD-ACO-RF model vs. the comparative models in the testing phase was assessed with the aid of a set of statistical metrics, visual figures and error distributions between the forecasted and observed Rn.
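The specific metrics are not listed in this snippet. As an illustration only, the Python sketch below computes three measures commonly used to compare forecasted and observed series (root mean square error, mean absolute error and Pearson's correlation coefficient); treating these as the paper's metrics is an assumption, and the data are hypothetical.

```python
import math


def rmse(observed, forecast):
    """Root mean square error."""
    n = len(observed)
    return math.sqrt(sum((f - o) ** 2 for o, f in zip(observed, forecast)) / n)


def mae(observed, forecast):
    """Mean absolute error."""
    n = len(observed)
    return sum(abs(f - o) for o, f in zip(observed, forecast)) / n


def pearson_r(observed, forecast):
    """Pearson correlation coefficient between the two series."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_f = sum(forecast) / n
    cov = sum((o - mean_o) * (f - mean_f) for o, f in zip(observed, forecast))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_f = sum((f - mean_f) ** 2 for f in forecast)
    return cov / math.sqrt(var_o * var_f)


# Hypothetical observed and forecasted monthly values.
observed = [10.0, 12.0, 14.0, 16.0]
forecast = [11.0, 11.0, 15.0, 15.0]

scores = {
    "RMSE": rmse(observed, forecast),
    "MAE": mae(observed, forecast),
    "r": pearson_r(observed, forecast),
}
```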

Discussion: limitations and opportunities for further research

In this paper, the suitability of the ACO optimized MEMD coupled RF (benchmarked with M5 Tree model and MPMR) for monthly solar radiation forecasting was investigated. Generally, the RF outperformed M5tree and MPMR models for all selected sites, thus revealing that the RF model was efficient in the detection of the features within the meteorological inputs in a physically meaningful way in order to forecast Rn. The hybridized MEMD-ACO-RF also outperformed the other models compared (i.e.,

Conclusion

A hybrid multi-stage MEMD-ACO-RF model has been designed by applying the ACO feature selection method to the decomposed input data (Tmax, Tmin, Rain, Evap, VP, RHmax, RHmin, FAO56) and using the selected IMFs to train the model to forecast future Rn at the Springfield, the Ross River, and the Clare Solar Farms. The monthly input data collected from January 1905 to June 2018 for these candidate sites were decomposed into IMFs and a residual through the MEMD algorithm. The ACO algorithm

Acknowledgements

The authors are grateful to Scientific Information for Land Owners (SILO) for providing the relevant meteorological and solar radiation data for the study regions. The authors also thank the editor and the two reviewers for their comments, which improved the quality of the paper.

References (93)

  • Z. Wang et al.

    Hourly solar radiation forecasting using a Volterra-least squares support vector machine model combined with signal decomposition

    Energies

    (2018)
  • R.C. Deo et al.

    A wavelet-coupled support vector machine model for forecasting global incident solar radiation using limited meteorological dataset

    Appl Energy

    (2016)
  • R. Prasad et al.

    Input selection and performance optimization of ANN-based streamflow forecasts in the drought-prone Murray Darling Basin region using IIS and MODWT algorithm

    Atmos Res

    (2017)
  • M.A. Colominas et al.

    Improved complete ensemble EMD: a suitable tool for biomedical signal processing

    Biomed Signal Process Control

    (2014)
  • R. Yacef et al.

    Prediction of daily global solar irradiation data using Bayesian neural network: a comparative study

    Renew Energy

    (2012)
  • W. Hu et al.

    Soil water prediction based on its scale-specific control using multivariate empirical mode decomposition

    Geoderma

    (2013)
  • A. Badr et al.

    A proof of convergence for ant algorithms

    Inf Sci

    (2004)
  • R.J. Mullen et al.

    A review of ant algorithms

    Expert Syst Appl

    (2009)
  • J.D. Sweetlin et al.

    Feature selection using ant colony optimization with tandem-run recruitment to diagnose bronchitis from CT scan images

    Comput Methods Programs Biomed

    (2017)
  • G. Singh et al.

    Ant colony algorithms in MANETs: a review

    J Network Comput Appl

    (2012)
  • R.C. Deo et al.

    Very short-term reactive forecasting of the solar ultraviolet index using an extreme learning machine integrated with the solar zenith angle

    Environ Res

    (2017)
  • R. Prasad et al.

    Soil moisture forecasting by a hybrid machine learning technique: ELM integrated with ensemble empirical mode decomposition

    Geoderma

    (2018)
  • B. Bhattacharya et al.

    Neural networks and M5 model trees in modelling water level–discharge relationship

    Neurocomputing

    (2005)
  • O. Kisi

    Pan evaporation modeling using least square support vector machine, multivariate adaptive regression splines and M5 model tree

    J Hydrol

    (2015)
  • S.J. Jeffrey et al.

    Using spatial interpolation to construct a comprehensive archive of Australian climate data

    Environ Modell Software

    (2001)
  • J. Zajaczkowski et al.

    Improved historical solar radiation gridded data for Australia

    Environ Modell Software

    (2013)
  • C.W. Dawson et al.

    HydroTest: a web-based toolbox of evaluation metrics for the standardised assessment of hydrological forecasts

    Environ Modell Software

    (2007)
  • A. Lahouar et al.

    Hour-ahead wind power forecast based on random forests

    Renew Energy

    (2017)
  • Department of Industry Innovation and Science

    Australian Energy Update 2016

    (2016)
  • Department of Energy and Water Supply. Powering Queensland Plan: Queensland Government;...
  • S.M. Al-Alawi et al.

    An ANN-based approach for predicting global radiation in locations with no direct measurement instrumentation

    Renew Energy

    (1998)
  • K. Angela et al.

    Predicting global solar radiation using an artificial neural network single-parameter model

    Adv Artificial Neural Syst

    (2011)
  • N. Kumar et al.

    Prediction of solar energy based on intelligent ANN modelling

    Int J Renew Energy Res

    (2016)
  • M. Şahin et al.

    Application of extreme learning machine for estimating solar radiation from satellite data

    Int J Energy Res

    (2014)
  • Sivaneasan B, Yu CY, Goh KP. Solar Forecasting using ANN with Fuzzy Logic Pre-processing. World Engineers Summit –...
  • J.C. Royer et al.

    Short-term solar radiation forecasting by using an iterative combination of wavelet artificial neural networks

    Independent J Manage Prod

    (2016)
  • M. Rathinasamy et al.

    Wavelet-based multiscale performance analysis: an approach to assess and improve hydrological models

    Water Resour Res

    (2014)
  • N.E. Huang et al.

    The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis

    Proc Royal Soc

    (1998)
  • P.-F. Alvanitopoulos et al.

    Radiation time-series prediction based on empirical mode decomposition and artificial neural networks

  • Z. Wu et al.

    On the time-varying trend in global-mean surface temperature

    Clim Dyn

    (2011)
  • Z. Wu et al.

    Ensemble empirical mode decomposition: a noise-assisted data analysis method

    Adv Adaptive Data Anal

    (2009)
  • M.E. Torres et al.

    A complete ensemble empirical mode decomposition with adaptive noise

  • N. Rehman et al.

    Multivariate empirical mode decomposition

    Proc Roy Soc A: Math Phys Eng Sci

    (2009)
  • D. Looney et al.

    Multiscale image fusion using complex extensions of EMD

    IEEE Trans Signal Process

    (2009)
  • S. Adarsh et al.

    Scale-dependent prediction of reference evapotranspiration based on Multi-Variate Empirical mode decomposition

    Ain Shams Eng J

    (2017)
  • K. He et al.

    Multivariate EMD-based modeling and forecasting of crude oil price

    Sustainability

    (2016)