Forecast Optimization of Wind Speed in the North Coast of the Yucatan Peninsula, Using the Single and Double Exponential Method

Abstract: The installation of new wind farms in areas such as the north coast of the Yucatan peninsula is of vital importance to meet the local energy demand. For the proper functioning of these facilities, it is important to analyze the wind data collected by anemometers and to consider the particular characteristics of the studied area. However, despite the great development of anemometers, forecasting methods are necessary for the optimal harvesting of wind energy. For this reason, this study focuses on developing an enhanced wind forecasting method that can be applied to wind data from the north coast of the Yucatan peninsula (and, in general, to any type of data). Thus, strategies can be established to generate a greater amount of energy from the wind farms, which supports the local economy of this area. Four variants have been developed based on the traditional single and double exponential smoothing methods. Furthermore, these methods were compared to the experimental data to obtain the optimal forecasting method for the Yucatan area. The forecasting method with the highest performance obtained an average relative error of 7.9510% and an average mean error of 0.3860 m/s.


Introduction
With the increase of Renewable Energy Sources (RES) the demand for oil and gas has been decreasing. Moreover, 238 GW of installed wind and solar power capacity has been added during 2021, thus demonstrating the ability of RES to overcome the world's energy crisis [1,2]. Likewise, RES provided approximately 29% of the world's electricity generation [3].
Wind energy is among the main RES. This energy source already provides a significant portion of the electrical energy consumption in several countries of the European Union. In Latin America and the Caribbean, 4.7 GW were added for a regional total of 33.9 GW. In Mexico, during 2020, only 0.6 GW were added to its installed capacity. Therefore, Mexico, which was considered one of the top 10 installers in Latin America, disappeared from the list [3]. However, many regions in Mexico such as Baja California, Veracruz, Oaxaca, and Yucatan have wind power capacity potential [4]. Particularly, the Yucatan Peninsula has a technical capacity and annual generation potential of 6125 MW and 14,802 GWh, respectively [5]. With three wind farms in operation, the state of Yucatan has an installed capacity of 244.7 MW [6][7][8].
As wind farms are installed more frequently in this type of region (i.e., flat areas at sea level), a proper evaluation of the wind resource at each site is essential for the successful development of the power grid, since turbine production depends mainly on wind speed. Energy production can be calculated from the turbine power curve and the resource assessment. To carry out this evaluation, the main anemometric variables to be measured are wind speed, wind direction, temperature, and atmospheric pressure (the first two are the key variables because, as the wind is an airflow, not all of it is used for generation) [9]. However, wind speed data recorded with devices such as anemometers can contain errors due to varying wind conditions, which makes it difficult to obtain the best performance from a power grid [10]. This affects the electricity service and the economy of the people who live in areas such as Yucatan, especially those with low incomes. To address this problem, many researchers have developed mathematical models that anticipate wind conditions over different time horizons. Depending on the horizon, prediction models can be classified as very short term (a few seconds to 30 min), short term (30 min to 6 h), medium term (6 h to 24 h), long term (24 h to 72 h), and very long term (>72 h) [11][12][13][14][15][16].
Another classification is made according to methodology: models can be deterministic (or physical), statistical, or hybrid. Physical methods use previous wind power data and Numerical Weather Prediction (NWP) data. These models require a detailed physical description of the site, including roughness, obstacles, temperature, pressure, etc., which makes them computationally complex. This approach requires considerable computational and economic resources because, although the required data are common to all wind forecast models, they vary with the location of the wind farm and must be obtained for each site. Physical approaches are satisfactory for short-term and long-term forecasting [16][17][18]. Statistical methods use historical data series to make a forecast for the next few hours. These models give good results for short periods; their main disadvantage is that the prediction error increases as the forecast horizon increases. Specifically, these methods can be divided into traditional models, time-series-based models, and Artificial Neural Networks (ANN). The most frequently used statistical models include the autoregressive model, moving average, autoregressive moving average (ARMA), autoregressive integrated moving average (ARIMA), exponential smoothing, Markov chain, and Generalized Autoregressive Conditional Heteroskedasticity (GARCH), among others [19][20][21][22]. Hybrid or mixed methods combine different approaches (physical and statistical) and different time windows. Their main objective is to benefit from the strengths of each method and improve the accuracy of the forecasts. Although this is not always achieved, it has been shown that hybrid methods carry lower risks in most situations.
Among these methods are ANN-ARIMA, wavelet-ANN, wavelet-ARIMA, Kalman filter-ANN (KF + ANN), Wavelet-Support vector machine optimized by genetic algorithm (WT-SVM-GA), Isolation Forest (IF)-deep learning-ANN, CEE-CC-FS, among others [20][21][22][23][24][25].
In particular, the exponential smoothing method is highly versatile and adaptable and has a large number of applications. This method is characterized by weighting the error between the forecast and the experimental data and by using the previous forecasts to generate the following predictions. Therefore, the latest observations significantly influence the current forecast, as is the case with short-term measurements of wind speed. However, according to the literature, the exponential smoothing method has rarely been used to predict wind speed [19]. In [19], the authors analyzed the wind speed data collected in Chetumal, Quintana Roo, using a statistical analysis of the time series and the Single Exponential Smoothing (SES) method. The SES method was then compared with the artificial neural network method to show that it is very useful for wind speed forecasting (in particular for Chetumal, Quintana Roo). In [26], the authors developed a combined forecasting system and validated it with wind speed data sets from three different wind farms in Penglai, China. Furthermore, the performance of the model was demonstrated by contrasting it with an extension of the exponential smoothing method and two machine learning models. In [27], two hybrid forecasting systems based on the structural characteristics of wind speed were proposed to capture the linear and nonlinear factors hidden in wind speed data. The authors used a decomposition algorithm to eliminate noise from the raw data and reconstruct a more reliable wind speed time series. A linear model based on the exponential smoothing method or the autoregressive moving average model then captures the hidden linear patterns, and a nonlinear model based on the backpropagation neural network extracts the hidden nonlinear patterns in the data.
Although the use of the exponential smoothing method for the development of new forecasting methods for wind speed is scarce, different forecasting methods have been developed based on other traditional methods. In [10], the authors have evaluated the effects of a set of various moving average filter durations and turbulence intensities on the recorded maximum gust wind speed to present a function dependent on the average duration and turbulence intensity. In [28], a new forecasting method is proposed by the combination of the local convolutional neural network. The authors have transformed the non-convex problems into convex problems to obtain the globally optimal solutions for the convex problems by using heuristic optimization algorithms. Therefore, a more stable model can be constructed to deal with various wind speed data sets. In the work [29], the authors have developed a forecasting method to assess and predict wind speed by integrating Sentinel satellite imagery analysis. This process has been carried out by using multi-sensor satellites and machine learning methods. Furthermore, the developed method has been applied to assess wind energy potential around the Favignana island in Sicily, Italy. In [30], the authors have improved the accuracy of forecasting the short-term wind speed by developing a hybrid wind speed forecasting model based on four modules: crow search algorithm, wavelet transform, feature selection based on entropy and mutual information, and deep learning time series prediction based on long short term memory neural networks. Moreover, the proposed method developed in this work was applied to wind data from Galicia, Spain, and Iran. In [31], a combined prediction system has been proposed to develop a new forecasting method based on optimal sub-model selection, point prediction based on a modified multi-objective optimization algorithm, interval forecasting based on distribution fitting, and forecasting system evaluation.
From the current state of the art, this work focuses on developing four variants of the exponential smoothing forecasting method to optimize the forecast of wind speed. Thus, it is possible to propose strategies to improve the energy harvested from wind farms (in this case, for Yucatan peninsula wind farms) avoiding the main grid as much as possible. This would support the local economy and improve energy services. Furthermore, these methods can be applied to any kind of data. This paper is divided into six sections. After the introduction, in Section 2 the characteristics of the experiment are described (i.e., experimental test set-up and characteristics of the area where the data were collected). In Section 3, a brief description of the SES method and the Double Exponential Smoothing (DES) method is provided. Later in Section 4, the optimization of the SES and DES methods is developed and explained. Subsequently, results and discussion of these forecasting methods developed are presented in Section 5. Finally, the conclusion of this work is given in Section 6.

Measurements Site and Characteristics
Data collection was carried out on the north coast of the state of Yucatan, Mexico. The surface of the territory is mainly flat and its elevation is 2 m above mean sea level. The climate of the site is warm semi-dry with rains in summer. Its average temperature is 26°C with prevailing winds from the south-east [32].
The geographical characteristics of the measurement site are presented in Table 1 and its location in Figure 1.

Table 1. Geographical characteristics of the measurement station.

Latitude: 21°23′ N
Longitude: 89°53′ W
Height above sea level: 2 m

An ultrasonic wind sensor (Gill WindSonic anemometer), placed at a height of 40 m on a mobile phone tower, was used to measure the speed and direction of the wind; see Figure 2. This sensor records the horizontal components of the wind vector to generate the scalar values of wind speed and wind direction. The technical parameters of the ultrasonic sensor used are presented in Table 2 [33].

The monitoring frequency for both variables was 1 Hz, and the average of every 10 min was stored in a data logger. The measurements were carried out over one year, from January to December 2011 (Figure 3). The descriptive statistics for the site are presented in Table 3. The frequency distribution of the wind speed series and its probability distributions (Weibull and Normal) are shown in Figure 4. The shape (k = 2.3952) and scale (A = 7.0909) parameters of the Weibull distribution (dp(v)) for the wind speed (v) are calculated as in [34], using expressions that involve the Gamma function Γ.
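The Weibull expressions themselves are not preserved in this text. For orientation, the standard two-parameter Weibull density is written below with the symbols k, A and v used above. The two estimators that follow it are a commonly used moment-based choice and are only an assumption about the procedure of [34]; the symbols σ (standard deviation of the wind speed) and v̄ (mean wind speed) are introduced here for illustration.

```latex
d_p(v) = \frac{k}{A}\left(\frac{v}{A}\right)^{k-1}
         \exp\left[-\left(\frac{v}{A}\right)^{k}\right],
\qquad
k \approx \left(\frac{\sigma}{\bar{v}}\right)^{-1.086},
\qquad
A = \frac{\bar{v}}{\Gamma\left(1 + \frac{1}{k}\right)}
```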

Forecasting Methods for the Wind Speed
One way to analyze the wind speed data is by forecasting methods, since this type of data is essentially a complex process to be analytically modeled [35]. Moreover, as has been proved, some forecasting methods for the wind speed depend not only on the experimental data values, but can also be a function of the residuals of past forecasts, which corresponds to small periods [13,36]. Therefore, due to the behavior of the experimental data, the analysis has been carried out using these methods.

Moving Average
There are many wind speed forecasting methods. One of the most common is the moving average method, which builds a time series by averaging several sequential values of another time series [37]. In this study, the data analysis began with the moving average method, since time series methods have demonstrated accuracy in wind speed forecasting [14]. Moreover, in wind applications, the moving average method is used to extract the wind power fluctuations [11].
To start the analysis of the experimental data, four moving average forecasting methods were used: (1) the Two-Period Simple Moving Average (2-SMA); (2) the Three-Period Simple Moving Average (3-SMA); (3) the Two-Period Double Moving Average (2-DMA); and (4) the Three-Period Double Moving Average (3-DMA). In all cases, $y_t$ and $F_t$ are the experimental data and the predicted value at time t, respectively.
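The moving average equations are not reproduced in this text, so the following Python sketch only illustrates the idea. It assumes that the simple moving average forecasts each value as the mean of the preceding observations, and that the double moving average is the moving average of those moving averages (other definitions with a trend correction exist). The function names and the sample values are hypothetical.

```python
def trailing_mean(series, periods):
    """Trailing mean over `periods` values; None where the window is incomplete."""
    out = []
    for t in range(len(series)):
        window = series[t - periods + 1 : t + 1] if t >= periods - 1 else []
        if len(window) == periods and all(v is not None for v in window):
            out.append(sum(window) / periods)
        else:
            out.append(None)
    return out


def sma_forecast(y, periods):
    """F[t] is the mean of the `periods` observations that precede time t."""
    return [None] + trailing_mean(y, periods)[:-1]


def dma_forecast(y, periods):
    """F[t] is the trailing mean of the trailing means (assumed definition)."""
    return [None] + trailing_mean(trailing_mean(y, periods), periods)[:-1]


speeds = [6.0, 7.0, 8.0, 7.0, 6.0]  # illustrative values in m/s, not measured data
print(sma_forecast(speeds, 2))
print(dma_forecast(speeds, 2))
```

Entries are None wherever too few past observations are available, which is why the double moving average needs a longer warm-up than the simple one.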

Exponential Smoothing
Exponential smoothing is one of the classical methods used for forecasting. This method has been used in diverse fields of research [38,39]. In addition to forecasting, this method allows smoothing the analyzed function. Therefore, the data can be presented in a more convenient form and the random errors can be removed [19].
Owing to its robustness, the method can be implemented efficiently together with descriptive and inferential statistics [19]. This method is based on the intuitive application of moving averages, where the function is smoothed by combining the error of the last observations with one or more constants [40]. In this work, the SES and DES methods are described.
The SES method used in this study is based on the methodology described in [19] and is presented as Equation (8), where α is a constant taking values within the interval [0, 1] [19]. However, as will be shown later in Section 5, the optimal α was greater than 1 for some cases.
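The SES equation is not reproduced in this text. A minimal sketch, assuming the common error-correction form $F_{t+1} = F_t + \alpha (y_t - F_t)$ and an initial forecast equal to the first observation (both assumptions, as is the function name), could look as follows.

```python
def ses_forecast(y, alpha, initial=None):
    """One-step-ahead SES forecasts; forecasts[t] is the prediction for y[t].

    Assumed recursion: F[t+1] = F[t] + alpha * (y[t] - F[t]), with F[0] = y[0]
    unless another initial value is given.
    """
    forecast = y[0] if initial is None else initial
    forecasts = [forecast]
    for observed in y[:-1]:
        # Correct the previous forecast by a fraction alpha of its error.
        forecast = forecast + alpha * (observed - forecast)
        forecasts.append(forecast)
    return forecasts


print(ses_forecast([6.0, 8.0, 7.0], 0.5))
print(ses_forecast([6.0, 8.0, 7.0], 1.0))
```

With α = 1 the recursion simply repeats the previous observation, and values of α above 1 overshoot the last error instead of damping it.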
For the DES method, the equations presented in [40] are used (Equation (9)), where γ and β are constants. Like α, the values of γ and β must be between 0 and 1 [40]. However, as is presented in Section 5, the optimal β was greater than 1 and the optimal γ was lower than 1 for some cases.
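The DES equations of [40] are not reproduced in this text. The sketch below uses a Holt-type level and trend recursion as a stand-in. Which of the two constants smooths the level and which smooths the trend is an assumption (here γ for the level and β for the trend), as are the zero initial trend and the function name.

```python
def des_forecast(y, gamma, beta):
    """One-step-ahead double exponential smoothing; forecasts[t] predicts y[t].

    Assumed roles: gamma smooths the level, beta smooths the trend.
    The level starts at y[0] and the trend starts at zero.
    """
    level, trend = y[0], 0.0
    forecasts = [y[0]]
    for observed in y[:-1]:
        previous_level = level
        level = gamma * observed + (1 - gamma) * (level + trend)
        trend = beta * (level - previous_level) + (1 - beta) * trend
        forecasts.append(level + trend)
    return forecasts


print(des_forecast([1.0, 2.0, 3.0, 4.0], 1.0, 1.0))
print(des_forecast([1.0, 2.0, 3.0, 4.0], 0.5, 0.0))
```

With β = 0 the trend stays at zero and the recursion reduces to single exponential smoothing of the level.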

Proposed Optimization of SES and DES Methods for Wind Speed Data
The non-linear least squares function (Equation (10)) has been implemented to optimally calculate the parameters α, β, and γ for each day of the year according to the experimental data obtained, as an optimization for the SES and DES methods. The minimum daily error of the SES and DES methods was calculated by adjusting the parameters α, β, and γ. With these parameters, it is possible to develop variants of the traditional SES and DES methods as described in this section.
In Equation (10), n is the number of data and p is the set of parameters of the equations (i.e., p = {α} for (8) and p = {β, γ} for (9)). With the optimal parameters, two variants of the SES method and two variants of the DES method have been proposed. The first variant of the SES method applies the classical SES method with the monthly average $\alpha_m = \frac{1}{N}\sum_{k=1}^{N}\alpha_{\mathrm{opt},k}$, where $\alpha_{\mathrm{opt},k}$ is the optimal parameter of day k and N is the number of days of the month (i.e., N is equal to 28, 30, or 31 depending on the month). $\alpha_{\mathrm{opt},k}$ has been calculated with the lsqcurvefit command in MATLAB®. In other words, this variant obtains the average of the optimal values of α for a month and then uses this average in the classical SES method. In the second variant of the SES method, the classical SES method is applied on day k with $\alpha_{\mathrm{opt},k-1}$, that is, the optimal parameter of the previous day; for the first day of each month (k = 1), the optimum value $\alpha_{\mathrm{opt},1}$ has been used.
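The parameter selection described above can be sketched in Python. The study itself uses lsqcurvefit in MATLAB; the grid search below is a simple stand-in for that solver, not the authors' implementation. The search interval [0, 2], the SES recursion, and all function names are assumptions made for illustration.

```python
def ses_sse(y, alpha):
    """Sum of squared one-step-ahead SES errors for one day of data."""
    forecast, total = y[0], 0.0
    for previous, observed in zip(y, y[1:]):
        forecast += alpha * (previous - forecast)
        total += (observed - forecast) ** 2
    return total


def optimal_alpha(y, lower=0.0, upper=2.0, steps=200):
    """Grid-search stand-in for the least squares fit of alpha on one day."""
    candidates = [lower + (upper - lower) * i / steps for i in range(steps + 1)]
    return min(candidates, key=lambda a: ses_sse(y, a))


def monthly_mean_alpha(days):
    """First variant: average of the daily optimal alphas of a month."""
    daily = [optimal_alpha(day) for day in days]
    return sum(daily) / len(daily)


def previous_day_alphas(days):
    """Second variant: day k uses the optimum of day k-1; day 1 uses its own."""
    daily = [optimal_alpha(day) for day in days]
    return [daily[0]] + daily[:-1]


rising = [1.0, 2.0, 3.0, 4.0, 5.0]  # illustrative values, not measured data
print(optimal_alpha(rising) > 1.0)
```

On a steadily rising series the best α found by the search exceeds 1, which illustrates how an unconstrained fit can leave the classical interval [0, 1].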
The variants of the DES method mirror those of the SES method. The first variant of the DES method applies the classical DES method with $\beta_m$ and $\gamma_m$, which are defined in the same way as $\alpha_m$, where $\beta_{\mathrm{opt},k}$ and $\gamma_{\mathrm{opt},k}$ are the optimal parameters of day k, calculated with the lsqcurvefit command in MATLAB®. Finally, the second variant of the DES method uses the optimal parameters of the previous day, $\beta_{\mathrm{opt},k-1}$ and $\gamma_{\mathrm{opt},k-1}$. As for Equation (13), the optimum values $\beta_{\mathrm{opt},1}$ and $\gamma_{\mathrm{opt},1}$ have been used for the first day of each month (k = 1).

Results and Discussion
To apply the methods developed in Section 4, the optimal constants $\alpha_{\mathrm{opt}}$, $\beta_{\mathrm{opt}}$, and $\gamma_{\mathrm{opt}}$ for each day have been calculated using MATLAB®. Figures 5 and 6 illustrate the behavior of the optimal $\alpha_{\mathrm{opt}}$ for the months January-June. Figures 7 and 8 illustrate the optimal $\beta_{\mathrm{opt}}$ and $\gamma_{\mathrm{opt}}$ calculated for the months July-September. Figures 9 and 10 illustrate the behavior of the optimal $\beta_{\mathrm{opt}}$ and $\gamma_{\mathrm{opt}}$ for the months October-December.
As can be seen in Figures 5-7 and 9, the values of the optimal constants $\alpha_{\mathrm{opt}}$ and $\beta_{\mathrm{opt}}$ are in some cases greater than 1, which contrasts with the theory of the classical SES and DES methods, where α and β are assumed to lie between 0 and 1 [19,40]. Similarly, the value of the optimal constant $\gamma_{\mathrm{opt}}$ is in some cases lower than 1, whereas the classical DES method assumes that it lies between 0 and 1 [40]; see Figures 8 and 10.
With the values of the optimal constants $\alpha_{\mathrm{opt}}$, $\beta_{\mathrm{opt}}$, and $\gamma_{\mathrm{opt}}$, the values of $\alpha_m$, $\beta_m$, and $\gamma_m$ have been calculated for each month of the year, as presented in Table 4. Once the values of the constants $\alpha_m$, $\beta_m$, and $\gamma_m$ have been obtained, the methods developed in Section 4 are applied.

Simulation
To evaluate and compare the forecasting methods, the relative error $E_r$, the mean error $E_m$, the mean squared error MSE, the root mean square error RMSE, and the coefficient of determination $R^2$ have been calculated, where $\bar{y}$ is the mean of the month.

As mentioned in Section 3, the first forecasting methods considered for wind speed were based on the moving average method. Moreover, the recursive function $y_{n-1} = y_n$ was analyzed because of its similarity with the proposed method. Table 5 reports the relative errors of the methods 2-SMA, 3-SMA, 2-DMA, 3-DMA, and the recursive function $y_{n-1} = y_n$.
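The definitions of the five error measures are not reproduced in this text. The Python sketch below uses common definitions as an assumption: $E_r$ as the mean absolute percentage error, $E_m$ as the mean absolute error, and $R^2$ as one minus the ratio of the residual sum of squares to the total sum of squares about the mean $\bar{y}$. The function name and sample values are hypothetical, and the relative error requires nonzero observations.

```python
import math


def error_metrics(y, f):
    """Error measures between observations y and forecasts f (assumed definitions)."""
    n = len(y)
    residuals = [obs - pred for obs, pred in zip(y, f)]
    # Relative error in percent; every observation must be nonzero.
    relative = 100.0 / n * sum(abs(r) / abs(obs) for r, obs in zip(residuals, y))
    mean_abs = sum(abs(r) for r in residuals) / n
    ss_res = sum(r * r for r in residuals)
    mse = ss_res / n
    mean_y = sum(y) / n
    ss_tot = sum((obs - mean_y) ** 2 for obs in y)
    return {
        "E_r": relative,
        "E_m": mean_abs,
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "R2": 1.0 - ss_res / ss_tot,
    }


metrics = error_metrics([2.0, 4.0, 6.0], [2.0, 5.0, 6.0])  # illustrative values
print(metrics)
```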
As presented in Table 5, the recursive function shows the best performance among these five methods. The methods developed in Section 4 are therefore implemented to reduce the relative and mean errors obtained by the recursive function. With the values obtained for $\alpha_{\mathrm{opt}}$, $\alpha_m$, $\beta_{\mathrm{opt}}$, $\beta_m$, $\gamma_{\mathrm{opt}}$, and $\gamma_m$ in the previous subsection, the methods developed in Section 4 have been applied. The forecasts obtained with these methods have then been analyzed and compared to the experimental data using different error measures. A summary of the results obtained from this analysis is presented in Tables 6-10.

Table 9. Root mean square error RMSE of each month for the methods developed.

As can be seen in Table 6, the ranges of the relative error were: 5.8199% to 10.4158% for the SES method with $\alpha_m$; 6.0283% to 11.1853% for the DES method with $\beta_m$ and $\gamma_m$; 5.8835% to 10.5249% for the SES method with $\alpha_{\mathrm{opt},k-1}$; and 7.3060% to 18.6109% for the DES method with $\beta_{\mathrm{opt},k-1}$ and $\gamma_{\mathrm{opt},k-1}$. Furthermore, the average relative errors were 7.9510%, 8.4128%, 8.0213%, and 10.0586% for SES with $\alpha_m$, DES with $\beta_m$ and $\gamma_m$, SES with $\alpha_{\mathrm{opt},k-1}$, and DES with $\beta_{\mathrm{opt},k-1}$ and $\gamma_{\mathrm{opt},k-1}$, respectively.

In the analysis of the relative error $E_r$, the method with the best performance for each month was the SES method with $\alpha_m$, except for February and June, where the SES method with $\alpha_{\mathrm{opt},k-1}$ obtained the smallest error (see Table 6). Overall, the SES method with $\alpha_m$ performed best, as its average relative error was the lowest.
As presented in Table 7, in the analysis of the mean error $E_m$, the method with the best performance for each month was the SES method with $\alpha_m$. Therefore, based on the mean error analysis, the SES method with $\alpha_m$ has the best performance.
Based on the analysis of the mean squared error and the root mean square error (Tables 8 and 9), the method with the best performance was the SES method with $\alpha_m$. Table 10 reports the coefficients of determination. These values had the following ranges: 0.9132 to 0.9755 for the SES method with $\alpha_m$; 0.9081 to 0.9727 for the DES method with $\beta_m$ and $\gamma_m$; 0.9117 to 0.9748 for the SES method with $\alpha_{\mathrm{opt},k-1}$; and 0.4691 to 0.9621 for the DES method with $\beta_{\mathrm{opt},k-1}$ and $\gamma_{\mathrm{opt},k-1}$. The average coefficients of determination were 0.9459, 0.9426, 0.9447, and 0.8882 for SES with $\alpha_m$, DES with $\beta_m$ and $\gamma_m$, SES with $\alpha_{\mathrm{opt},k-1}$, and DES with $\beta_{\mathrm{opt},k-1}$ and $\gamma_{\mathrm{opt},k-1}$, respectively. In this case too, the SES method with $\alpha_m$ reported the best performance.
Once the error analysis had been completed, simulations were carried out to observe the behavior of the forecasting methods developed in this work. The simulations compared the SES with $\alpha_m$, DES with $\beta_m$ and $\gamma_m$, SES with $\alpha_{\mathrm{opt},k-1}$, DES with $\beta_{\mathrm{opt},k-1}$ and $\gamma_{\mathrm{opt},k-1}$, and 2-SMA methods, which have shown the lowest relative and mean errors. Figures 11 and 12 show the data taken on the first day of January together with the simulations of the forecasting methods proposed in this work. Figures 13 and 14 illustrate the errors between the experimental data and the forecasts for the first day of May, which is the month with the lowest relative errors (see Table 6). The last four simulations illustrate the behavior and comparison of the forecasts with the highest relative and mean errors (see Tables 6 and 7). Figures 15 and 16 show the data taken and the corresponding forecasts for the first day of August, the month in which the SES with $\alpha_{\mathrm{opt},k-1}$, DES with $\beta_{\mathrm{opt},k-1}$ and $\gamma_{\mathrm{opt},k-1}$, and 2-SMA methods incurred their largest relative error. Finally, Figures 17 and 18 illustrate the error between the experimental data and the forecasts for the first day of September, where the SES method with $\alpha_m$ and the DES method with $\beta_m$ and $\gamma_m$ obtained their largest relative error.

Figure 18. Error when comparing the experimental data with the simulation data of the SES method with $\alpha_{\mathrm{opt},k-1}$, DES method with $\beta_{\mathrm{opt},k-1}$ and $\gamma_{\mathrm{opt},k-1}$, and 2-SMA method for the first day of September.

Discussion
With the methods developed in this work, it is possible to obtain an average relative error as low as $E_r$ = 7.9510%, an average mean error $E_m$ = 0.3860 m/s, an average mean squared error MSE = 0.3615, an average root mean square error RMSE = 0.5974 m/s, and an average coefficient of determination $R^2$ = 0.9459. These values indicate a high degree of accuracy of the proposed methods, given the amount of experimental data used in this work (52,560 wind speed values, i.e., 144 ten-minute averages per day over 365 days). Moreover, based on the analysis carried out, the SES method with $\alpha_m$ has reported the best performance by having the lowest errors. Therefore, the method developed in this work is effective in forecasting wind speed data. However, the estimation of the model is not optimal, and errors between the experimental data and the model can be observed. Furthermore, the obtained errors are particularly noticeable in some months, which are complex to forecast because of the weather conditions. Thus, by taking the weather conditions into account, it is possible to increase the reliability of this forecasting method.

Conclusions
In this study, four methods based on the classic SES and DES methods have been developed to forecast the wind speed on the north coast of the Yucatan peninsula. Statistical tests have demonstrated the effectiveness of these methods. Thus, strategies to optimize the energy harvested from wind farms in this region can be established. However, the variability of the weather affects the performance of the proposed methods. To address this, it is necessary to collect more data from the studied area and to find patterns that improve the forecasting methods. It is also worth mentioning that the forecasting methods developed in this work can be applied not only to wind speed data but also to other kinds of data.