Predictive models development using gradient boosting based methods for solar power plants
Introduction
To leave behind a more livable environment with lower carbon emissions, energy should be obtained from renewable sources, and increasing the amount of energy produced by such sources will contribute significantly to reducing the carbon footprint. Converting the clean and inexhaustible energy of the sun into usable electrical energy has been studied for decades. Meanwhile, the maturity of artificial intelligence (AI) and the availability of algorithms with proven performance motivate integrating this field into energy systems. AI methods contribute significantly to controlling a system, to deciding on actions and future strategies for it, and to increasing its efficiency. For this purpose, many machine learning and deep learning algorithms have been applied to renewable energy resources. As in this study, machine learning (ML) algorithms have been used for solar power estimation with different training sets. In [1], a two-stage estimation is carried out: irradiance is first estimated from numerical data, and solar power generation is then predicted from the estimated irradiance. Moreover, some studies combine more than one algorithm [2], while others combine modified and enhanced versions of the same algorithm [3]. In addition to machine learning methods, deep learning approaches have also been introduced for solar power forecasting [4], [5]. Among the studies [6], [7], [8], artificial neural network (ANN) models are trained for prediction in [6], whereas LSTM networks with memory cells, or their derivatives, are preferred in [7], [8]. Another factor that determines the choice of method in these studies is whether the targeted prediction is short-term or long-term.
In the same way, day-ahead forecasting has yielded successful results [9]. Further, selecting the right training data has a significant impact on the performance of predictive models. Some studies train on geographic features, and satellite images are very popular training data in this context [10]. Convolutional Neural Networks (CNN), a powerful deep learning method, stand out in this regard. Both LSTM and CNN are frequently combined with deterministic, heuristic, or meta-heuristic optimization methods that enhance learning [11]. However, a common problem with these approaches is the need for high computational power and long training times. Meteorological data, on the other hand, have a considerable influence on the amount of solar energy produced; radiation in particular is directly related to solar power generation. Accordingly, some studies have focused on estimating solar radiation rather than power and have achieved very successful results [12], [13]. Examining all these prediction studies on solar power plants reveals two difficulties. The first is selecting and preprocessing the dataset used to train the models. The second is the long training time of the models, even though they demonstrate high accuracy. To address the first difficulty, meteorological data with more than one feature can be used as the training set instead of only radiation data or satellite images. To address the second, innovative regression algorithms based on the gradient boosting machine (GBM) have been investigated to shorten the training time. The XGBoost algorithm has been shown to yield very promising results in very short-term energy forecasts [14].
Similarly, the performance reported in a load prediction study using this method is impressive [15]. In [16], XGBoost and LightGBM are used together with CNN for load prediction and achieve very successful results. In [17], LightGBM combined with CNN proves to be a useful solution for forecasting the power generation of wind turbines. In a similar study addressing the same prediction problem, two popular deep learning techniques, CNN and LSTM, are applied together with LightGBM, and a successful model is presented [18]. CatBoost, a relatively new GBM-based algorithm, is another method that has shown significant success in prediction studies. It has been used effectively to solve the long-term load prediction problem [19].
Another benefit of fast and highly accurate prediction models, as targeted in this study, is that they can augment machine learning-based control applications. This field, with the innovations it promises, is explained in detail in [20] under the name of data-driven control. In parallel with this field, in which system dynamics are learned from data and ML models, reinforcement learning (RL), which is based on learning to control a system through a model environment, is becoming increasingly popular. In [21], temporal difference-based and deep neural network-based RL algorithms are described in detail. How prediction models can improve novel control methods can be illustrated by control problems in smart grids, microgrids, or nanogrids, the last referring to the grid of a single building [22]. An ML-based control agent needs a well-designed environment model to learn to control the system. Good design here means defining the control actions appropriately, constructing the finite or infinite set of possible states correctly, and specifying the rewards that the agent receives as a result of its control decisions. In energy management or control problems of grids that contain solar power plants or panels, highly accurate prediction of the power to be generated by the solar plant is vital because the ML (RL) agent uses it to learn to control the system through the environment model [23]. In short, accurate prediction leads to correct rewards and to an agent that learns to control better [24]. In addition to RL-based methods that try to control the system directly, methods that first learn the system dynamics through data-driven control and then control the system are also promising [25].
In this context, the Sparse Identification of Nonlinear Dynamics with Control (SINDYc) and SINDY-PI algorithms [26], [27], which give very successful results in the control of nonlinear systems, can be enhanced with GBM-based prediction models. Taken together, the studies and proposed solutions reviewed above show that a predictive model that is both highly accurate and fast makes a considerable contribution to solar power plant applications. This study creates and analyzes solar power prediction models using relatively novel GBM-based algorithms (LightGBM, XGBoost, CatBoost). The performance outputs of these models are presented and compared, and their usability is examined. The next section describes the methodology of the models used. The third section gives the dataset analysis and problem formulation. The fourth section presents the performance outputs.
Section snippets
Gradient Boosting Machine
Ensemble learning (EL) algorithms come to the fore in many AI or prediction-based academic publications, competitions, and prediction projects aiming for high accuracy. The concept is based on creating multiple models and combining their results into the final output, rather than creating a single model for a forecasting problem. It is divided into two branches: bagging and boosting. In bagging, more than one model is trained in parallel using the same dataset. In
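As a hedged illustration of the boosting branch of ensemble learning, the following minimal sketch builds models one after another, each fitted to the residuals of the ensemble so far. It is not the authors' implementation and not how XGBoost, LightGBM, or CatBoost work internally: it assumes a squared-error loss, a single input feature, and one-split decision stumps as weak learners. All function names and the toy data are hypothetical.

```python
from statistics import mean


def fit_stump(xs, residuals):
    """Fit a one-split regression stump to the residuals (least squares)."""
    thresholds = sorted(set(xs))[:-1]  # the largest value would leave the right side empty
    if not thresholds:
        raise ValueError("need at least two distinct feature values")
    best = None
    for t in thresholds:
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        left_value, right_value = mean(left), mean(right)
        sse = sum((r - left_value) ** 2 for r in left)
        sse += sum((r - right_value) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, left_value, right_value)
    return best[1:]  # (threshold, left_value, right_value)


def predict_stump(stump, x):
    threshold, left_value, right_value = stump
    return left_value if x <= threshold else right_value


def fit_boosting(xs, ys, n_rounds=50, learning_rate=0.1):
    """Sequentially add stumps, each trained on the current residuals."""
    base = mean(ys)  # initial constant prediction
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        # For squared error, the negative gradient is simply the residual.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * predict_stump(stump, x)
                 for p, x in zip(preds, xs)]
    return base, stumps, learning_rate


def predict_boosting(model, x):
    base, stumps, learning_rate = model
    return base + learning_rate * sum(predict_stump(s, x) for s in stumps)


def mse(model, xs, ys):
    return mean((y - predict_boosting(model, x)) ** 2 for x, y in zip(xs, ys))


# Synthetic toy data (not from the paper): one feature and a nonlinear target.
toy_x = [float(i) for i in range(10)]
toy_y = [0.1 * x * x for x in toy_x]
constant_model = fit_boosting(toy_x, toy_y, n_rounds=0)
boosted_model = fit_boosting(toy_x, toy_y, n_rounds=50)
print("training MSE, constant model:", mse(constant_model, toy_x, toy_y))
print("training MSE, boosted model: ", mse(boosted_model, toy_x, toy_y))
```

The learning rate shrinks each stump's contribution, so many small corrections accumulate instead of one large one; production GBM libraries add regularization, multi-feature trees, and far more efficient split finding on top of this basic loop.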
Dataset analysis and problem formulation
In solar power forecasting studies, the dataset used to train the model is as important as the method used to create it. The choice of machine learning or deep learning method determines the kind of dataset that must be obtained. For example, if a CNN-based model is to be created, training with a dataset consisting of satellite images may give good results, while a prediction model using LSTM requires time-series training data so that the memory cells can learn more easily. The
Performance outputs
Once the dataset has been pre-processed and made ready for training, it is suitable for use with the proposed algorithms. The models are then created in Python using an open-source library for each method. The open-source software library prepared by the Distributed (Deep) Machine Learning Community (DMLC) group is used for XGBoost [32]. For LightGBM, the open-source software library provided by Microsoft is used [36]. Finally, the Python library prepared by Yandex
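A minimal sketch of how the three open-source Python libraries could be set up side by side is given below. It assumes the `xgboost`, `lightgbm`, and `catboost` packages are installed and uses their scikit-learn-style regressor classes; the hyperparameter values, the helper names `make_gbm_models`, `fit_and_time`, and `rmse`, and the choice of RMSE as the error measure are illustrative assumptions, not the settings or metrics reported by the authors.

```python
import time
from math import sqrt


def make_gbm_models(n_trees=200, learning_rate=0.1, depth=6):
    """Build one regressor per library.

    Imports are deferred so that only this function needs the three
    packages; the hyperparameter values are placeholders, not the paper's.
    """
    from xgboost import XGBRegressor
    from lightgbm import LGBMRegressor
    from catboost import CatBoostRegressor

    return {
        "XGBoost": XGBRegressor(
            n_estimators=n_trees, learning_rate=learning_rate, max_depth=depth),
        "LightGBM": LGBMRegressor(
            n_estimators=n_trees, learning_rate=learning_rate, max_depth=depth),
        "CatBoost": CatBoostRegressor(
            iterations=n_trees, learning_rate=learning_rate, depth=depth,
            verbose=False),
    }


def fit_and_time(model, X_train, y_train, X_test):
    """Train any model exposing fit/predict and report the training time."""
    start = time.perf_counter()
    model.fit(X_train, y_train)
    elapsed = time.perf_counter() - start
    return model.predict(X_test), elapsed


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return sqrt(sum(squared) / len(squared))
```

With a prepared train/test split, one would loop over `make_gbm_models().items()`, call `fit_and_time` on each model, and compare `rmse` against the elapsed training time, which are the two properties (accuracy and speed) the study emphasizes.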
Conclusion
Today, as the importance of renewable energy sources is felt more and more around the world, using the sun’s endless energy efficiently is vital for the environment that will be left to future generations. Increasing the efficiency of solar power generation requires accurate planning and anticipating what may happen so that an appropriate plan can be prepared. The idea of integrating the powerful and innovative prediction methods offered by machine learning with solar power
CRediT authorship contribution statement
Necati Aksoy: Conception and design of study, Acquisition of data, Analysis and/or interpretation of data, Writing – original draft, Writing – review & editing. Istemihan Genc: Conception and design of study, Acquisition of data, Analysis and/or interpretation of data, Writing – original draft, Writing – review & editing.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgment
All authors approved the version of the manuscript to be published.
Necati Aksoy earned his B.Sci in Electrical and Electronics Engineering. Then, he received his Master’s degree from Temple University, USA. He is a Ph.D. candidate at Istanbul Technical University and a member of the Smart Grid lab at ITU. His research interests include reinforcement learning, microgrids, smart grids, machine learning applications in power engineering and related fields.
References (37)
- et al. Deep learning and wavelet transform integrated approach for short-term solar PV power prediction. Measurement (2020)
- et al. Interpretable deep learning models for hourly solar radiation prediction based on graph neural network and attention. Appl. Energy (2022)
- et al. Short–mid-term solar power prediction by using artificial neural networks. Sol. Energy (2012)
- et al. Efficient daily solar radiation prediction with deep learning 4-phase convolutional neural network, dual stage stacked regression and support vector machine CNN-REGST hybrid model. Sustain. Mater. Technol. (2022)
- et al. Bagging–XGBoost algorithm based extreme weather identification and short-term load forecasting model. Energy Rep. (2022)
- et al. A CNN-LSTM-LightGBM based short-term wind power prediction method based on attention mechanism. Energy Rep. (2022)
- et al. Multi-dimensional data-based medium- and long-term power-load forecasting using double-layer CatBoost. Energy Rep. (2022)
- et al. Sparse identification of nonlinear dynamics with control (SINDYc). IFAC-PapersOnLine (2016)
- et al. Probabilistic solar power forecasting based on bivariate conditional solar irradiation distributions. IEEE Trans. Sustain. Energy (2021)
- et al. Ensemble approach of optimized artificial neural networks for solar photovoltaic power prediction. IEEE Access (2019)
- A hybrid ensemble model for interval prediction of solar power output in ship onboard power systems. IEEE Trans. Sustain. Energy
- PV power prediction based on LSTM with adaptive hyperparameter adjustment. IEEE Access
- Solar power generation forecasting with a LASSO-based approach. IEEE Internet Things J.
- Day-ahead power output forecasting for small-scale solar photovoltaic electricity generators. IEEE Trans. Smart Grid
- Short-term solar power prediction learning directly from satellite images with regions of interest. IEEE Trans. Sustain. Energy
- Hourly solar irradiance prediction based on support vector machine and its error analysis. IEEE Trans. Power Syst.
- An ensemble method to forecast 24-h ahead solar irradiance using wavelet decomposition and BiLSTM deep learning network. Earth Sci. Inform.
- Very short-term renewable energy power prediction using XGBoost optimized by TPE algorithm
Istemihan Genc received the B.Sc. degree in electrical engineering from Istanbul Technical University, the M.Sc. degrees in electrical engineering, systems and control engineering, and systems science and mathematics from Istanbul Technical University, Boğaziçi University, and Washington University, respectively. After receiving the Doctor of Science (D.Sc.) degree in 2001 from Washington University in St. Louis, he joined Istanbul Technical University where he is currently a Professor in the Department of Electrical Engineering. His research interests include power system dynamics and stability, smart grids, optimization and control applications in power engineering.