Validation of European-scale simulated wind speed and wind generation time series

This paper presents a validation of atmospheric reanalysis data sets for simulating onshore wind generation time series for large-scale energy system studies. The three reanalyses are ERA5, the New European Wind Atlas (NEWA) and DTU’s previous generation European-level atmospheric reanalysis (EIWR). An optional scaling is applied to match the microscale mean wind speeds reported in the Global Wind Atlas version 2 (GWA2). This mean wind speed scaling is used to account for the effects of terrain on the wind speed distributions. The European wind power fleet for 2015–2018 is simulated, with commissioning of new wind power plants (WPPs) considered for each year. A generic wake model is implemented to include wake losses that are layout agnostic; the wake model captures the expected wake losses as a function of wind speed given the technical characteristics of the WPP. We validate both point measurement wind speeds and generation time-series aggregated at the country level. Wind measurements from 32 tall meteorological masts are used to validate the wind speed, while power production for four years from twelve European countries is used to validate the simulated country-level power production. Various metrics are used to rank the models according to the variables of interest: descriptive statistics, distributions, daily patterns, auto-correlation and spatial-correlation. We find that NEWA outperforms ERA5 and EIWR for the simulated wind speed, but, as expected, no model is able to fully describe the auto-correlation function of the wind speed at a single point. The mean wind speed scaling is found to be necessary to match the distribution of generation at the country level, with NEWA-GWA2 and ERA5-GWA2 showing the highest accuracy and precision for simulating large-scale wind generation time-series.


Introduction
The generation of electricity from renewable sources is a key component of climate change mitigation plans worldwide. In 2019, for the first time, renewable power generation grew faster than electricity demand [1]. However, the global share of renewable electricity generation would need to increase to 28% by 2030 and 66% by 2050 to keep the mean global temperature rise below 2 °C by the end of the century [1]. In the European Union, the share of renewable energy reached 32% of electricity generation and 18% of total energy consumption in 2018 [2]; these shares are planned to grow to 50% of electricity and 32.5% of total energy by 2030 [2].
Accurate simulations of wind energy generation time series are needed in energy system design studies, such as the sector coupling and transmission reinforcement designs in Europe [3] and in the North Sea [4], as variability in wind power generation impacts electricity prices, and correlations in generation between countries can impact

Nomenclature
Wind speed time-series at a location
Turbulence intensity
Country-level standardized wind generation time-series
Turbine rated power (subscript WT)
Number of turbines in a plant (subscript WT)
Rotor diameter (subscript WT)
Land-use area of the plant (subscript WPP)
Variable holder for nomenclature: either the wind speed or the standardized wind generation
Index holder for nomenclature: either a location (for wind speed) or a country (for standardized wind generation)
Time
One-hour ramp time-series of a variable: its value one hour ahead minus its current value
Long-term mean of a variable, averaged over time for the full period available
Mean of a variable at each month and each hour of the day
All errors are computed between a model and observations:
Error in long-term mean
Error in normalized monthly-hourly mean
Area between the model and observed empirical cumulative density functions
MAE: Mean absolute error
RMSE: Square root of the mean squared error
Sample (or prediction) correlation coefficient
Correlation coefficient (Pearson's) between two time series
Error in 1 h lag auto-correlation
Mean error in the auto-correlations over a range of lags in hours
Mean error in the spatial-correlation, i.e. correlation between time-series at different locations
Mean error in the spatial-correlation of the one-hour ramps, i.e. correlation between ramp time-series at different locations
WT: Wind turbine
WPP: Wind power plant

Validation of mean wind speed or predictions of annual energy production for synthetic wind turbines was carried out in several studies [10].
For energy system modeling, predicting the correct mean wind speed or annual energy production is not enough: the simulated time series must also have the right distribution (probability density function, PDF), auto-correlation, and spatial-correlation among the different locations. Individual reanalysis data sets have been validated before: MERRA and MERRA-2 [9], MERRA with micro-scale wind speed scaling from the Global Wind Atlas (GWA) [11], and ERA-Interim with WRF dynamic scaling and GWA micro-scale wind speed scaling [7]. The ERA5 reanalysis was shown to give more accurate wind generation predictions than MERRA-2 [12]. Recently, a validation study of multiple weather data sets in France [13] concluded that (1) high resolution regional reanalysis (COSMO-REA2) or high resolution weather models (AROME) tend to have lower prediction errors, (2) ERA5 is very skilled in the prediction of regional wind generation despite its lower resolution and biases in mean wind speed predictions in mountainous locations, and (3) NEWA and MERRA-2 show problems with the diurnal cycle that translate into larger biases in mean wind speed. New generation regional reanalyses such as COSMO-REA2 have been shown to correlate better with both wind speed measurements [14] and wind generation in France [13] and in Switzerland [15].
The purpose of this paper is to assess the accuracy of several reanalysis data sets for wind generation simulations in large-scale scenarios. Two validation data sets are presented in this paper: (1) several years of measurements from 32 met masts and (2) reported country-level wind energy generation for four years. Validation metrics are defined for several variables of interest, and the models are ranked according to each metric. Validation metrics are defined in terms of prediction errors of descriptive statistics, mean daily cycle, auto-correlation, spatial-correlation, and the cumulative density function distance.
The hypothesis of this study is that large-scale regional wind energy generation can be simulated using the global reanalysis ERA5, corrected with microscale wind speed effects, with the same accuracy as with time series from detailed mesoscale reanalysis simulations. Mesoscale models are expected to show higher accuracy for modeling individual sites. This paper presents a new perspective on the validation of reanalysis data, focused on the specific needs of large-scale wind generation simulations, and compares multiple weather data sets, including microscale downscaling, at the European level. Compared to previous works where bias correction of wind speeds was based on measured generation data per country [9] or on a global wind speed correction factor to better match capacity factors [12], in this paper downscaling is based on microscale wind speed data (Global Wind Atlas, GWA). Measured power generation is not used as part of the model calibration, which allows model validation in the power generation domain to be fully independent of the measured generation data. A more detailed wake loss model differentiates this study from [11]. When microscale effects and wake losses are considered, the resulting power generation time series show strong agreement with measured data, which indicates that measured generation data are not required in the model calibration; in general, this approach also gives better modeling results than applying wind speed corrections such as [16]. Compared to [10], this paper adds the comparison of spatio-temporal dependencies in wind time series.
Section 2 describes the reanalysis data sets and the measurements evaluated in this paper. Sections 3 and 4 present the modeling methods and the validation metrics. Section 5 presents the results for the two validation cases. Sections 6 and 7 discuss and conclude the paper.

Reanalysis data sets
Data from three atmospheric reanalyses are compared: ERA5 [17], DTU's previous generation European-level weather reanalysis (EIWR), performed using the Weather Research and Forecasting (WRF) model [18], and the New European Wind Atlas (NEWA) [19]. Additionally, a mean wind speed scaling is applied to each reanalysis data set to match the mean wind speed reported by the Global Wind Atlas v2 (GWA2) micro-scale resource assessment; see Section 3.2 for details on this scaling. Resolutions and periods of availability for each data set are presented in Table 1.

Wind speed measurements
Data from 32 sites were collected from different sources and processed, see Table B.9. Time series of wind speed measured at the level closest to 100 m height, originally at 10 min averaged resolution, were aggregated to hourly resolution. Gaps in the measurements are identified, and the corresponding time steps are discarded from the model time series in the validations. All data sets were subjected to an adapted version of the quality control routine described in [21], and a rough attempt was made to minimize the effect of flow distortion caused by the mast on the wind speed measurements. The observations cover different time periods from 1996 to 2019, with at least one full year per site, see Table B.9. The locations of the masts are shown in Fig. 1.

European wind energy fleet
We use the database of wind plant installations in Europe from [22]. This database includes locations, hub height, installed capacity, commissioning year and turbine type, among other parameters, for each wind power plant in Europe. Additionally, the database includes the power curves for most turbine manufacturers and turbine models. Individual installation scenarios are run for each production year within 2015–2018; the plants expected to be operating are selected according to the commissioning year and an assumed plant lifetime of 25 years. Plants within 2 km of each other are merged into a single plant to consider the plant-to-plant wake losses. As the WPP data set sometimes reports even single turbines as plants, this merging gives a more unified specification of WPP sizes across Europe. Plant-to-plant wakes are particularly important for countries in Central and Western Europe, where plants (and individual turbines) are sited in close proximity to each other. In this study three different wind turbine spacings were used: 6 rotor diameters for the Nordic countries, 3 rotor diameters for Germany and France, and 10 rotor diameters for South European countries. The difference in WT spacing follows the trends in installation density in Germany presented in [23]. An overview of the country-level aggregated WPP characteristics is given in Table 2 (see Fig. 2).
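The merging of nearby plants can be illustrated with a minimal sketch. The text only states the 2 km threshold, so the distance measure (great-circle distance between plant centers), the grouping rule (single linkage via union-find), the capacity-weighted merged center, and all function and field names below are assumptions made for illustration only.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def merge_plants(plants, max_dist_km=2.0):
    """Group plants whose centers lie within max_dist_km (single linkage).

    plants: list of dicts with hypothetical keys 'lat', 'lon', 'capacity_mw'.
    Returns merged plants with summed capacity and capacity-weighted center.
    """
    parent = list(range(len(plants)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(plants)):
        for j in range(i + 1, len(plants)):
            d = haversine_km(plants[i]["lat"], plants[i]["lon"],
                             plants[j]["lat"], plants[j]["lon"])
            if d <= max_dist_km:
                parent[find(i)] = find(j)

    groups = {}
    for i, plant in enumerate(plants):
        groups.setdefault(find(i), []).append(plant)

    merged = []
    for members in groups.values():
        cap = sum(m["capacity_mw"] for m in members)
        merged.append({
            "lat": sum(m["lat"] * m["capacity_mw"] for m in members) / cap,
            "lon": sum(m["lon"] * m["capacity_mw"] for m in members) / cap,
            "capacity_mw": cap,
            "n_merged": len(members),
        })
    return merged
```

The pairwise loop is quadratic in the number of plants; for a fleet the size of Europe's, a spatial index would likely be needed in practice.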

Country-level wind generation measurements
Hourly resolution onshore wind generation data from the ENTSO-E Transparency platform are obtained from [24]. Hourly installed capacity data from [24] for Germany, Denmark and Sweden is used for calculating standardized generation time series for these countries. For the other countries, annual installed onshore wind capacity data from [25] is used with linear interpolation of the annual values to estimate the standardized generation.
A comparison of annual capacity factors from several sources is presented in Table 3. The IRENA capacity factor is calculated using both annual onshore wind generation and installed capacity from [25]. The ENTSO-E capacity factor uses generation from [24] and onshore wind installed capacity directly from [26]. The mean of the start-of-year and end-of-year installed capacities is used to approximate the annual installed capacity.
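As a worked illustration of the capacity factor calculation described above, the following sketch divides annual generation by the energy that the mean installed capacity would produce at full output. The function name, the units, and the fixed 8760 h year are assumptions for illustration; leap years would use 8784 h.

```python
def annual_capacity_factor(generation_mwh, cap_start_mw, cap_end_mw, hours_in_year=8760):
    """Annual capacity factor using the mean of start- and end-of-year capacity."""
    mean_capacity_mw = 0.5 * (cap_start_mw + cap_end_mw)
    return generation_mwh / (mean_capacity_mw * hours_in_year)


# Example: capacity grows from 90 MW to 110 MW during the year (mean 100 MW).
cf = annual_capacity_factor(219_000.0, 90.0, 110.0)
```

Because new capacity is rarely commissioned uniformly over the year, this approximation is one possible source of the differences between data sources discussed in this section.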
Curtailment is also presented in Table 3 for Germany and Ireland, where data are available from [27] and [28], respectively. For Germany, curtailment shares are available for 2015 and 2016; curtailment in 2017 and 2018 is modeled by taking the mean of the 2015 and 2016 values. On average for the period 2015-2018, a curtailment share of 4.8% is found for Germany and 4.2% for Ireland. As curtailment data are available only at the annual level, they are not applied to the hourly time series; however, they are reported in this article to show the challenges of finding accurate measured data to validate the large-scale simulations.
The resulting capacity factors in Table 3 show that for some countries, e.g., Austria, Denmark and Italy, the sources give approximately the same capacity factors. However, significant differences can be seen in, e.g., Ireland and France. For Germany, the standardized generation time series used in this article show a much higher capacity factor than the other two sources.

Methods
The overall simulation model chain is depicted in Fig. 3.

Wind speed interpolation
Cubic-spline interpolation is used for horizontal interpolation of wind speeds. Power law interpolation between the two closest heights available in the reanalysis is used for vertical wind speed interpolation. This approach is equivalent to a piece-wise linear interpolation in log-log scale. Note that the wind speeds are interpolated for every time step, without applying any smoothing or filtering.
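The vertical power law interpolation can be sketched as follows. The shear exponent is computed from the two bracketing levels at each time step, which is what makes the method linear in log-log scale. The function and argument names are illustrative assumptions, and the sketch requires strictly positive wind speeds at both levels.

```python
import math


def power_law_interp(z, z1, u1, z2, u2):
    """Interpolate wind speed to height z from two bracketing levels.

    Uses u(z) = u1 * (z / z1) ** alpha with alpha = ln(u2 / u1) / ln(z2 / z1),
    i.e. a straight line through (ln z1, ln u1) and (ln z2, ln u2).
    """
    if u1 <= 0 or u2 <= 0:
        raise ValueError("power law interpolation needs positive wind speeds")
    alpha = math.log(u2 / u1) / math.log(z2 / z1)
    return u1 * (z / z1) ** alpha


# Example: levels at 50 m and 200 m, target hub height of 100 m.
u_hub = power_law_interp(100.0, 50.0, 6.0, 200.0, 8.0)
```

At the geometric mean of the two heights (100 m here) the result is the geometric mean of the two wind speeds, which is a quick sanity check on the log-log linearity.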

Mean wind speed scaling
The long-term mean wind speed maps are pre-computed for every reanalysis data set at the available heights. The mean wind speed from GWA2 is averaged over the land-use area of the plant in order to have a representative wind speed for the full plant and the value at the center point of the plant. The scaling factor is computed between the long-term mean wind speed from the reanalysis and the mean wind speed from the GWA2. Both mean wind speeds are interpolated using the methodology described in Section 3.1. This methodology was used in several studies such as [7,11,29]. Fig. 4 depicts the scaling factors at the WPP installation locations in 2018.
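A minimal sketch of the mean wind speed scaling is given below. It assumes the scaling factor is the ratio of the GWA2 mean to the reanalysis long-term mean and that it is applied multiplicatively to every time step; the text does not spell out the formula, so this form and the names used are assumptions.

```python
def mean_wind_speed_scaling(series, long_term_mean_reanalysis, mean_gwa2):
    """Scale a reanalysis wind speed series to match a target mean wind speed.

    series: wind speeds at the plant location (m/s)
    long_term_mean_reanalysis: long-term mean of the reanalysis at that location
    mean_gwa2: GWA2 mean wind speed averaged over the plant area
    Returns the scaled series and the scaling factor.
    """
    factor = mean_gwa2 / long_term_mean_reanalysis
    return [factor * u for u in series], factor


scaled, factor = mean_wind_speed_scaling([4.0, 6.0, 8.0], 6.0, 7.5)
```

A constant multiplicative factor leaves correlation coefficients and mean-normalized cycles unchanged, so under this assumption only the mean and the spread of the wind speed distribution are affected.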

Wake modeling
The layout information is not available for any of the plants in this study; therefore, a generic wake model was developed to capture the wake losses as a function of the wind speed, the mean turbulence intensity at the site, the turbine rated power, the number of turbines, the rotor diameter and the area of the plant. The long-term mean turbulence intensity over Europe is estimated using NEWA's turbulent kinetic energy.
A database of 1000 statistically representative WPPs is generated to cover the variation of these WPP parameters observed in the European installed fleet. For each WPP, 10 different layouts are obtained using a space filling algorithm that maximizes the turbine spacing within the plant area. Wake losses as a function of wind speed are then computed for the 10 layouts at various wind directions using the Bastankhah Gaussian wake model [30] available in PyWake [31]. This wake model consists of a Gaussian wind speed deficit, with linear wake expansion and a squared-sum wake superposition. To generalize the wind direction and layout dependencies in the wake losses, an average wake loss over the layouts and wind directions is used, producing a database of wind speed dependent wake loss factors for each of the generic WPPs.
Using the database of wake loss factors and the generic WPP characteristics, an artificial neural network (ANN) with 7 hidden layers is trained. The performance of the resulting ANN is then validated on wake loss factors computed for actual WPPs in Europe. Details of the ANN architecture selection along with the results of the validation can be found in [32].

Wind generation and aggregation
WPP wind generation is obtained by interpolating the wake-affected wind plant power curve at the wind speed for every time step. Note that the WPP database used in this article [22] includes a database of power curves of the turbines installed in Europe. A WPP efficiency of 0.95 is assumed for all plants, which includes 0.97 for availability and 0.98 for electrical and additional losses, as reported in [33]. Finally, the individual WPP generation time-series are aggregated into a country-level standardized generation time-series.
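The wind-to-power step and the country-level aggregation can be sketched as below. The sketch assumes linear interpolation of a per-unit plant power curve and standardization by the total installed capacity; neither detail is stated explicitly in the text, and the data structure and names are hypothetical. The curve is clamped at its end points, so it should include the cut-out region explicitly.

```python
from bisect import bisect_right

# 0.97 (availability) x 0.98 (electrical and other losses) = 0.9506, rounded to 0.95.
WPP_EFFICIENCY = 0.95


def interp_power_curve(ws, curve_ws, curve_p):
    """Linearly interpolate a power curve at wind speed ws (clamped at the ends)."""
    if ws <= curve_ws[0]:
        return curve_p[0]
    if ws >= curve_ws[-1]:
        return curve_p[-1]
    k = bisect_right(curve_ws, ws)  # curve_ws[k - 1] <= ws < curve_ws[k]
    x0, x1 = curve_ws[k - 1], curve_ws[k]
    y0, y1 = curve_p[k - 1], curve_p[k]
    return y0 + (y1 - y0) * (ws - x0) / (x1 - x0)


def country_standardized_generation(plants, efficiency=WPP_EFFICIENCY):
    """Aggregate plant generation into a per-unit country-level time series.

    plants: list of dicts with hypothetical keys
        'capacity_mw', 'curve_ws', 'curve_p' (per-unit power), 'wind_speed'.
    """
    total_capacity = sum(p["capacity_mw"] for p in plants)
    n_steps = len(plants[0]["wind_speed"])
    series = []
    for t in range(n_steps):
        power_mw = sum(
            p["capacity_mw"] * interp_power_curve(p["wind_speed"][t], p["curve_ws"], p["curve_p"])
            for p in plants
        )
        series.append(efficiency * power_mw / total_capacity)
    return series
```

In this sketch the wake losses are taken to be already folded into the plant power curve, matching the wake-affected power curve described in the text.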

Validation metrics
Traditional validation metrics are used for comparing the observed and modeled time series: (1) the bias, or error of the mean, i.e. the difference between the modeled and observed long-term means, (2) the mean absolute error, MAE, (3) the square root of the mean squared error, RMSE, and (4) the sample or prediction correlation coefficient, i.e. the correlation between the modeled and observed time series. Here, and in the following error metric definitions, the variable represents either the wind speed at a location or the standardized wind generation of a country.
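The four traditional metrics can be computed as in the following sketch. The sign convention (model minus observation) and the function names are assumptions for illustration; both series are assumed to be aligned in time with gaps already removed.

```python
import math


def mean(x):
    return sum(x) / len(x)


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length series."""
    ma, mb = mean(a), mean(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    var_a = sum((u - ma) ** 2 for u in a)
    var_b = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def traditional_metrics(mod, obs):
    """Bias, MAE, RMSE and prediction correlation of a model against observations."""
    err = [m - o for m, o in zip(mod, obs)]
    return {
        "bias": mean(mod) - mean(obs),          # error of the mean (model minus observation)
        "mae": mean([abs(e) for e in err]),     # mean absolute error
        "rmse": math.sqrt(mean([e * e for e in err])),
        "corr": pearson(mod, obs),              # prediction correlation
    }
```

A constant offset between model and observations shows up in the bias, MAE and RMSE but leaves the correlation at 1, which is why the correlation alone cannot rank models on accuracy of the mean.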

New error metrics
This article introduces several new error metrics for the validation of wind speed or country-level wind generation time-series. The first error metric diagnoses diurnal cycle errors: it is the error in the mean wind speed at each hour of each month, normalized by the long-term mean wind speed, or the equivalent for standardized generation.
The second metric quantifies the difference between the modeled and measured distributions of wind speed or wind generation. It consists of computing the area between the modeled and observed cumulative density functions. Note that this error metric is never negative and takes the value of the error of the mean if there is a bias between two equally distributed time series; for more information refer to [19].
Four additional validation metrics are used for comparing the time-series properties. Two metrics compare the auto-correlation functions at a given location or of a single country: one focuses on the auto-correlation error at 1 h lag, while the other computes the mean of the errors in the auto-correlation function between 1 h lag and a maximum lag given in hours. The last two metrics quantify the errors in the correlations of a point pair (or country pair). The first consists of computing the mean error in the spatial-correlation, while the last computes the error in the correlation of one-hour ramps, i.e. the difference between consecutive hourly values, between two points or two countries.
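The distribution and correlation metrics described above can be sketched as follows. Several choices here are assumptions rather than the definitions used in this article: errors are taken as model minus observation, the auto-correlation at a given lag is estimated as the Pearson correlation between the series and its lagged copy, and the area between the empirical cumulative density functions is computed from sorted samples, which is exact only when both series have the same length. All names are illustrative.

```python
import math
from itertools import combinations


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length series."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    var_a = sum((u - ma) ** 2 for u in a)
    var_b = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def ecdf_area(mod, obs):
    """Area between the two empirical CDFs (equal-length samples only)."""
    if len(mod) != len(obs):
        raise ValueError("series must have the same length")
    return sum(abs(m - o) for m, o in zip(sorted(mod), sorted(obs))) / len(obs)


def ramps(x):
    """One-step ramps: value at the next time step minus the current value."""
    return [b - a for a, b in zip(x, x[1:])]


def autocorr(x, lag):
    """Auto-correlation at a positive integer lag (in time steps)."""
    return pearson(x[:-lag], x[lag:])


def mean_autocorr_error(mod, obs, max_lag):
    """Mean of the auto-correlation errors over lags 1..max_lag."""
    errs = [autocorr(mod, k) - autocorr(obs, k) for k in range(1, max_lag + 1)]
    return sum(errs) / len(errs)


def mean_spatial_corr_error(mod_by_site, obs_by_site, use_ramps=False):
    """Mean error in pairwise correlations over all site (or country) pairs."""
    prep = ramps if use_ramps else (lambda x: x)
    errs = []
    for i, j in combinations(sorted(obs_by_site), 2):
        rho_mod = pearson(prep(mod_by_site[i]), prep(mod_by_site[j]))
        rho_obs = pearson(prep(obs_by_site[i]), prep(obs_by_site[j]))
        errs.append(rho_mod - rho_obs)
    return sum(errs) / len(errs)
```

With hourly data, `autocorr(x, 1)` corresponds to the 1 h lag metric and `mean_spatial_corr_error(..., use_ramps=True)` to the ramp correlation metric. For a model that differs from the observations only by a constant offset, `ecdf_area` returns the absolute value of that offset, in line with the property noted in the text.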
Finally, in order to characterize the spatial-correlation of each data set, a characteristic length (L) is computed by fitting a correlation-to-distance model, see Eq. (8); [34] presents several additional spatial-correlation models. The model relates the correlation between a pair of point wind speeds, or of 1 h wind speed ramps, to the distance between the points. The length scale is the distance at which the correlation takes a value of 1/e; it is computed from the fitted coefficients.
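To illustrate how a characteristic length can be extracted from correlation-versus-distance data, the sketch below fits a single-parameter exponential decay, correlation = exp(-distance / L). This model form is an assumption chosen for simplicity and is not necessarily the model of Eq. (8); the fit is an ordinary least-squares line through the origin in log space, so it only accepts strictly positive correlations. Function and argument names are hypothetical.

```python
import math


def fit_characteristic_length(distances_km, correlations):
    """Fit correlation = exp(-distance / L) and return L in km.

    Least squares on ln(correlation) = -distance / L, constrained through the origin.
    """
    if any(r <= 0 for r in correlations):
        raise ValueError("log-linear fit needs strictly positive correlations")
    num = sum(d * math.log(r) for d, r in zip(distances_km, correlations))
    den = sum(d * d for d in distances_km)
    slope = num / den  # equals -1 / L
    return -1.0 / slope


# Synthetic check: data generated with L = 400 km should be recovered.
d = [100.0, 200.0, 400.0, 800.0]
rho = [math.exp(-x / 400.0) for x in d]
L = fit_characteristic_length(d, rho)
```

By construction, the fitted curve reaches 1/e at a distance equal to L, which matches the definition of the length scale given above. A fit in log space weights small correlations more heavily than a direct nonlinear fit would, so the two can give different lengths on noisy data.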

Wind speed
An example of the simulated wind speed time-series is presented in Fig. 5. All the simulations capture the larger scale trends in the wind speed, but fail to capture the specific times at which wind speed fluctuations occur. This is a well-known behavior of such data sets [35].
The overall results of the validation of the wind speed simulations are presented in Tables 4 and 5. The mean and standard deviation across all sites are also reported for every error metric and model in the tables. Fig. 6 shows the model versus measurements for the relevant variables used in the definition of error metrics focused on the wind speed distribution. The bias in the mean wind speed is smaller for ERA5 and NEWA than for the mean wind speed scaled models. However, the standard deviation of the errors (or the model uncertainty) is smaller for the GWA-scaled models than for the reanalysis data sets used directly. The wind speed measurement locations cover mostly offshore and coastal areas; thus, the results can be expected to vary if more onshore wind speed measurement locations were added to the analysis. ERA5 with GWA scaling (ERA5_GWA2) has the best traditional error metrics of all models: the lowest mean MAE, the lowest mean and standard deviation of RMSE, and the highest mean prediction correlation with the lowest standard deviation across locations. NEWA_GWA2 has the lowest standard deviation of MAE across sites, with a value very close to that of ERA5_GWA2. EIWR has the lowest mean and standard deviation of errors in the normalized diurnal cycle, see Fig. 6 center. Nevertheless, all models show the same order of magnitude in accuracy to predict the normalized diurnal cycle, with errors in the order of 10⁻³. This means that all models capture reasonably well the diurnal and seasonal variability relative to the mean wind speed conditions. Note that GWA2-scaled models are omitted in the daily cycle figure as they provide the same normalized diurnal cycle results as the non-scaled models.
NEWA has the lowest mean area between the modeled and observed cumulative density functions, while EIWR_GWA2 has the lowest standard deviation of this metric across sites; but in general, all models show similar mean error in wind speed distribution over all the sites, while the standard deviations over all sites are lower for the models with mean wind speed scaling. This means that the mean wind speed scaling does improve the model accuracy to predict the wind speed distribution at a site.
Auto-correlation error metrics are presented in Table 5 and visualized in Fig. 7 for some example sites, together with the overall wind speed auto-correlation of the models versus the measurements (for aligned lags) for all sites. All models over-predict the auto-correlations, with the models with lower horizontal resolution having larger errors: ERA5, followed by EIWR and NEWA. Note that the GWA-scaled models are omitted as the correlation is the same as for the non-scaled version of the model. The spatial-correlations over distances for all sites are presented in Fig. 8, as well as the characteristic length of the correlation-to-distance model fit. Table 5 presents the mean and standard deviation of the different error metrics across all sites. All models are able to capture the spatial-correlation of wind speeds, with a small over-estimation of correlations of the order of 3%-6%. The characteristic length scale of the wind speed spatial-correlation for all models tends to be larger than the one estimated from the measurements: 13% for EIWR, 17% for NEWA and 21% for ERA5. The spatial-correlation of 1 h ramps of wind speed is better modeled by NEWA, while ERA5 largely over-estimates the spatial-correlations of 1 h ramps over all distances. The characteristic length scale of the 1 h ramp wind speed spatial-correlation is largely overestimated by the models.

Wind energy generation
An example of the country-level generation time-series is presented in Fig. 9. All the simulations capture the larger scale trends in the wind generation, but the GWA-scaled models tend to better follow the observed production.
Tables 6 and 7 present the overall results of the validation of the country-level wind generation statistics, while Fig. 10 depicts the model versus measurements for the relevant variable used in the definition of the different error metrics. The bias and standard deviation of errors in the capacity factor (mean standardized power) are smaller for the GWA-scaled models. Even though the distribution of prediction error for the capacity factors seems widely spread, the bias in CF is within 0.01-0.02. Individual country generation distributions are presented in the appendix in Fig. A.13. ERA5_GWA2 is consistently the best performing model in terms of mean and standard deviation across countries for MAE, RMSE and prediction correlation. Individual country-level prediction correlations are reported in the appendix in Table A.8.
All models show similar mean errors for predicting the diurnal cycle of generation, in the order of 0.4 × 10⁻², while GWA2-scaled reanalyses have slightly lower standard deviations of around 0.3 × 10⁻². This indicates that improvements are needed in the time-dependent wind-to-power transformation to reach the same accuracy levels seen in the wind speed diurnal cycle. ERA5 shows the largest deviations over multiple countries, which demonstrates the need for mesoscale/microscale modeling. All the GWA2-scaled models show lower mean error metrics and standard deviations compared to using the reanalysis data directly.
The mean and standard deviation of the different error metrics across all countries for auto-correlation and spatial-correlation of the country-level wind generation simulations are presented in Table 7. There is no clear overall best performing model in terms of auto- and cross-correlations. All models have small bias in the auto-correlation metrics, while the standard deviations tend to decrease for the GWA2-scaled models. All models are able to capture the spatial-correlation of wind generation (and its 1 h ramps) at the country level, but GWA2-scaled models show similar across-site standard deviation of spatial-correlation errors as their corresponding non-scaled models (see Fig. 12). Fig. 11 presents example auto-correlation functions and the overall model versus measurement auto-correlation plot. NEWA and EIWR (both models that use WRF) show growing errors with a 24 h periodicity, which indicates that there is a drifting trend from the boundary conditions given by the ERA5 or ERA-Interim reanalyses.

Discussion
As expected from previous studies [10,13], the errors in the prediction of mean wind speed are unbiased with WRF without microscale downscaling. In contrast, the microscale mean wind speed scaling is necessary to decrease the uncertainty in the prediction of mean wind speed (standard deviation of mean wind speed errors) and to obtain unbiased predictions of the country-level capacity factor. The over-prediction of mean wind speeds by NEWA in northern Europe is not reported in other studies [10,19]; however, this can be due to the lack of reliable onshore validation data in Denmark and Germany (two countries with the largest installed capacities). Nevertheless, other authors have reported a similar trend of over-prediction of mean wind speeds in northern Europe and under-prediction of mean wind speeds in southern Europe [9,36], see Fig. 4.
The obtained prediction correlation with ERA5_GWA2 is larger than the ones reported in the literature with the previous generation reanalyses [11] and similar to the studies that rely on calibrating the generation distribution using the measurements [9]. For example, the prediction correlation of ERA5_GWA2 standardized wind generation for Germany in this study is 0.984, while it is 0.981 in [9] or 0.97-0.985 in [11]; for Spain it is 0.962 in this study, while it is 0.917 in [11]; for France it is 0.983 in this study, while it is 0.976-0.986 using high resolution regional reanalyses in [13]. This shows that ERA5 with a mean wind speed scaling (GWA2) and a simple wake model can achieve similar accuracy levels as those models that rely on the use of measured generation for calibration or on high resolution reanalyses. The present bottom-up approach has the advantages of not needing measurements and being applicable to varying future wind installations without modifications, while the top-down approach needs to assume that the transfer function calibrated on a specific installed capacity will be the same in future fleets. The main disadvantage of the present approach is the high requirement on the technical data of the WT and WPP; even more WPP technical information is needed in comparison to [11] due to the wake modeling. The prediction correlation for Portugal for ERA5-based models is significantly lower compared to the other countries and compared to other models; the reasons for this should be investigated in future work.
We chose not to apply a Gaussian filtering of the single turbine power curve as done in [9,13] because the inclusion of wake and plant shutdown modeling is able to capture the distributions of country-level power production. Furthermore, we did not apply age-dependent losses as suggested by [37] but instead used a total efficiency of 0.95. Age-dependent losses were considered, but applying them did not provide additional accuracy, so they were not applied.
This paper might provide an explanation for the results presented in [13] in terms of NEWA incorrectly simulating the wind speed daily cycle. NEWA shows a periodic discrepancy in the wind generation auto-correlation with respect to the measurements, with a 24 h period. This indicates that there are problems in the daily cycles in France and Germany even though NEWA has no bias and a similar model uncertainty in predicting the relative daily cycle of wind as the other data sets studied, including ERA5.
Larger standard deviation errors in the generation diurnal cycle than in the wind speed diurnal cycle indicate that future improvements can be achieved by implementing a more accurate wind-to-power transformation. Time-varying wake losses driven by the stability time series available in the different reanalysis data sets might be the missing diurnal-seasonal component of the generation.
We demonstrate that it is possible to accurately simulate large-scale regional wind energy generation using the GWA-scaled ERA5 data set and a generic wake model; here accuracy is understood as being able to simulate the distribution of wind generation and to produce time-series that have the right auto-correlations and spatial-correlations. In the country-level aggregation, the high-frequency variability in the measurements is smoothed out and thus compares well to ERA5 in terms of the country-level auto- and spatial-correlation statistics. We highlight that ERA5 is available from 1950 to the present, with releases occurring in almost real-time, a significantly longer period than NEWA/EIWR. This coverage will enable extreme event studies such as power fluctuations in storms [6] or modeling of the influence of wind on power outages. Regarding wake modeling, the authors plan to make a new generation generic surrogate model that considers wind direction as well as stability-dependent wake losses as time-series.
The GWA scaling approach has several limitations that stem from the micro-scale resource assessment used. The GWA2 downscaling relies on the linear flow model WAsP, which is known for over-predicting wind speeds in complex terrain [10]. The approach proposed in this paper is not sufficient to simulate individual plant wind generation because the mean wind speed scaling does not take into account the effects of terrain on the wind direction time-series. Also, detailed wake modeling will require knowledge of the turbine layout, which is not available, and time-dependent wake losses that are driven by the atmospheric stability and turbulence time-series.
The temporal properties of the ERA5 time series are significantly worse than those from NEWA for the prediction of wind speeds at individual locations or WPPs. This is a well-known problem: the resulting time-series from WRF or coarser reanalysis data sets are too smooth in comparison to individual point measurements. Stochastic models based on experimental missing wind speed spectra can be applied to simulate the missing high frequency component of the wind speed and wind generation [38][39][40].

Conclusions
Even though there is still room for improvement in terms of resolution, the ERA5 reanalysis can successfully be used for simulating large-scale wind energy production with similar levels of accuracy as higher resolution mesoscale weather modeling. However, this is only true if the mean wind speed bias is corrected based on high resolution micro-scale modeling and if wake losses are considered, at least in an approximate way. The combination of ERA5 and the Global Wind Atlas (GWA) shows good agreement with measured country-level generation data. The presented bottom-up approach can achieve similar levels of prediction correlation with the measured country-level wind generation as simulations that rely on a period of measurements for calibration.
Due to the reduced coverage of the mean wind speed validation locations, the mean wind speed error distribution presented in this paper is biased towards offshore sites; therefore, it does not represent southern European sites. For the available wind speed validation data, ERA5 shows the lowest bias in mean wind speed, while GWA-scaled models show the lowest standard deviations for mean wind speeds and a lower area between the observed and modeled wind speed cumulative density functions. All models are able to capture the daily cycles over the months with similar levels of precision and accuracy. In terms of auto- and spatial-correlation, the New European Wind Atlas (NEWA) shows the best fits to measurements but, as expected, there is missing high frequency variability in all models.

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Individual country wind generation prediction correlations are presented for all models in Table A.8, while wind generation distributions are presented in Figs. A.13 and A.14 for the GWA2-scaled mean wind speed data sets, and the CFs are reported. ERA5_GWA2 shows the best  other countries, which could be due to less coverage or more errors in the technical data of installation database [7].

Appendix B. Wind speed measurements technical data
The technical data from the wind speed measurements, including the respective heights used, time period, type of location and type of measurement device, are listed in Table B.9. Anonymized stations were named according to the location: Northern North Sea (NNS), Central North Sea (CNS), South North Sea (SNS) and Western Baltic Sea (WBS).
Table B.9. Observational data sets. Type: meteorological masts (M), LIDAR (L); location: coastal (C); land (L); offshore (S); forest (F). Data sources: a [41], b [42], c [43], d [44], e [45], f [21], g [46] and h commercial site. Availability [%] refers to the valid data within time coverage after the quality control.