A perfect prognosis downscaling methodology for seasonal prediction of local-scale wind speeds

This work provides a new methodology, based on statistical downscaling with a perfect prognosis approach, to produce seasonal predictions of near-surface wind speeds at the local scale. Hybrid predictions combine a dynamical prediction of the four main Euro-Atlantic Teleconnections (EATC) and a multilinear statistical regression, which is fitted with observations and includes the EATC as predictors. Once generated, the skill of the hybrid predictions is assessed at 17 tall tower locations in Europe targeting the winter season. For comparative purposes, hybrid predictions have also been produced and assessed at a pan-European scale, using the ERA5 100 m wind speed as the observational reference. Overall, results indicate that hybrid predictions outperform the dynamical predictions of near-surface wind speeds, obtained from five prediction systems available through the Climate Data Store of the Copernicus Climate Change Service. The performance of a multi-system ensemble prediction has also been assessed. In all cases, the enhancement is particularly noted in northern Europe. By anticipating local wind speed conditions more accurately, hybrid predictions will boost the application of seasonal predictions outside the field of pure climate research.


Introduction
Recent advances in the fields of climate modelling and seasonal prediction have resulted in skilful seasonal predictions of surface variables over the extratropics (Merryfield et al 2020). This has, in turn, led to the development of climate services that inform weather-and-climate-vulnerable socio-economic sectors of seasonal anomalies a few months ahead (Buontempo et al 2018). The energy sector takes advantage of such valuable information since energy production and demand are strongly linked to climate variability. In particular, the renewable energy industry can profit from seasonal predictions of surface wind speed (Clark et al 2017, Torralba et al 2017) and wind power generation (Lledó et al 2019) to anticipate revenues, balance electricity supply and demand or schedule maintenance activities, among others. However, those predictions still suffer from some limitations, mainly due to (1) the limited skill levels on surface variables available from current seasonal prediction systems and (2) their relatively coarse spatial scales.
Generally, seasonal anomalies of atmospheric variables arise from large-scale forcings that other components of the Earth system exert as boundary conditions, such as anomalies of sea ice extent, sea surface temperature or soil moisture. These boundary-condition forcings can be adequately represented in coarse-scale coupled models (often delivered in grids of tens of square kilometres), leading to some skill in the predictions. However, the absolute values that are experienced near the surface at the local scales can be highly affected by local effects and vary substantially even at short distances. Values of surface temperature or precipitation are affected by the local topography, particularly in complex-terrain regions (e.g. Anders et al 2006). Near-surface wind speeds are affected not only by topography but also by surface roughness, buildings and obstacles. For instance, near-surface wind speed conditions can be very different at the top of a ridge, at a mountain pass or at a valley floor. These differences in magnitude are especially relevant for deriving indicators that are non-linear and therefore sensitive to absolute magnitudes, such as the capacity factor (CF) of wind power (Pickering et al 2020).
To transfer climate information from coarser to finer scales, many downscaling techniques have been developed and employed in weather and climate studies to refine model outputs. There are essentially two different downscaling approaches. Firstly, dynamical downscaling couples a Regional Climate Model (RCM) to a Global Circulation Model (GCM) over a limited region within the global domain, using the data from the GCM as boundary conditions. The computational costs of dynamical downscaling are rather high and the additional skill is sometimes negligible (Robertson et al 2012), which explains its limited use in seasonal predictions (García-Díez et al 2015, Schwitalla et al 2020). Secondly, statistical downscaling relies on the assumption that a relationship exists between the large-scale information provided by a GCM and the fine-scale variable. Once a statistical relationship is built, local values (predictands) are inferred using large-scale information (predictors). Then, future dynamical predictions of the large-scale variables (i.e. those generated using the physically-based equations of the dynamics of the atmosphere) can be inserted as predictors into the statistical relationship to produce local-scale predictions. Statistical downscaling techniques (see Gutierrez et al 2013 for a review) can be in turn subdivided depending on whether the statistical model is fitted using observational data for both predictors and predictands (known as Perfect Prognosis or PP; Klein et al 1959) or using data from the GCM itself (often referred to as Model Output Statistics or MOS; Glahn and Lowry 1972).
The statistical downscaling approach is relatively easy to implement with climate prediction systems containing several ensemble members and has already been employed in some studies for downscaling temperature and precipitation forecasts at seasonal timescales (e.g. Pavan and Doblas-Reyes 2013, Manzanas et al 2018). However, to the best of the authors' knowledge, no attempt has yet been made to downscale seasonal predictions of wind speed.
The selection of the employed predictors is vital for the success of the statistical downscaling method. The predictors need not only to be strongly related to the predictand, but also to be predictable by the dynamical model. Teleconnection indices that summarise the state of the atmospheric circulation are well suited for this purpose. In this work, four Euro-Atlantic Teleconnection (EATC) indices (namely the North Atlantic Oscillation (NAO), East Atlantic (EA), East Atlantic/Western Russia (EAWR) and Scandinavian Pattern (SCA)) are employed as predictors to anticipate near-surface wind speed conditions in Europe. Those teleconnection indices are strongly related to wind speed conditions in Europe (Zubiate et al 2017) and wind power generation (Yang et al 2020), and have been recently shown to be predictable. Since the downscaled predictions combine a dynamical forecast of a circulation variable and a statistical relationship with a second variable of interest, they are referred to as hybrid predictions (see chapter 2 in WMO 2020), to differentiate them from purely statistical seasonal forecasts that employ observed values of potential forcing fields to derive the predictions (Kämäräinen et al 2019). Hybrid predictions take advantage of the predictability of the EATC indices from dynamical predictions, especially in winter, and thus help to overcome limitation (1). At the same time, the downscaling allows for transferring such information to a finer grid scale, circumventing limitation (2).
The objective of this work is to generate and assess the quality of a hybrid seasonal prediction of near-surface wind speeds and wind power CF by applying a statistical downscaling with a PP approach to a set of dynamical predictions of EATC indices. Sections 2 and 3 describe the data and methodology employed, respectively. Results are presented in section 4 while conclusions are drawn in section 5.

Datasets
The hindcasts from five different operationally-produced seasonal prediction systems have been used in this study: the System2 from Deutscher Wetterdienst ( All five prediction systems have been retrieved from the Climate Data Store data portal in a regular grid of 1° × 1° of spatial resolution and covering the 1993-2016 period. Particular details of the employed seasonal prediction systems, as well as the two observational references, can be found in table 1. The ERA5 HRES (hereafter ERA5) reanalysis dataset (Hersbach et al 2020) produced by the ECMWF has been used as the gridded observational reference. The dataset has been downloaded through the ECMWF retrieval system (MARS) in its native grid (i.e. approximately 0.3°), and at 1-hourly time resolution. Then, the ERA5 data has been horizontally interpolated using a conservative approach to match the spatial resolution of the predictions, allowing for bias adjustment and verification at the grid level. At the local scale, wind speeds measured in-situ at 17 tall tower locations over Europe have been considered (see details in table 2 and their spatial distribution in figure S1 (available online at stacks.iop.org/ERL/16/054010/mmedia)).
Those observations have been obtained from the Tall Tower Dataset (TTD, Ramon et al 2020), a quality-controlled collection of wind data taken at tall meteorological masts of 20 to more than 200 m height. Since these structures measure winds simultaneously at several heights above ground, we have selected at each of the 17 locations the wind speed series which is closest to the 100-metre height. Modern wind turbines are placed at those heights since the wind flow is notably less affected by surface roughness than at surface level. The 17 time series span from 6 to 30 years within the 1984-2017 period. To unify the timespan of the series, and ensure the representativeness of the comparisons against predictions, the 17 time series have been averaged into hourly values and reconstructed to cover the entire 1981-2017 period. To this end, a Measure-Correlate-Predict approach with a simple linear regression has been employed (see Carta et al 2013 for further details), using as the reference series the hourly 100 m wind series of the ERA5's closest grid point to each tall tower location.

Hybrid predictions
Hybrid predictions for the boreal winter (December-January-February, DJF) have been produced using the PP methodology as represented in figure 1. Once the dynamical forecasts of the predictors (i.e. the EATC indices) are generated, they are used in a statistical model that accounts for variations in wind speed related to variations in the EATC indices. The statistical model has been previously built solely on observations of wind speed and EATC indices. For the purposes of our work, the PP approach represents an advantage over MOS, because (1) it uses one single statistical relationship that can be applied over various dynamical prediction systems, and (2) the amount of data available for fitting the relationship is not limited to the length of the hindcast, but to the timespan of the observational series (Marzban et al 2006). A more precise description of the different steps of the PP and the generation of the hybrid predictions follows.
Firstly, four EATC indices are computed as described in Lledó et al (2020). The EATC patterns and indices have been derived from the 500 hPa geopotential height field anomalies employing a Rotated Empirical Orthogonal Function (REOF) analysis over the Euro-Atlantic domain [90°W-60°E; 20°N-80°N]. The four teleconnections obtained correspond to the North Atlantic Oscillation (NAO), East Atlantic (EA), East Atlantic/Western Russia (EAWR) and Scandinavian pattern (SCA). This procedure has been followed to obtain both observed (using the ERA5 anomalies) and predicted (employing the anomalies from DWD2, GS5GC2, MF6, SEAS5 and SPS3) EATC indices. The observed EATC patterns are shown in figure 1 in Lledó et al (2020).
Then, a statistical model that relates seasonal anomalies of near-surface wind speed and the EATC indices is built from historical observations. A very simple multilinear regression model (equation (1)) has been used here, due to the rather small sample size available for fitting the model. This method has already been used in Rust et al (2015) to model European temperatures from several teleconnections. A model that expresses anomalies of near-surface wind speeds (predictand: w′) as a linear combination of the EATC indices (predictors: NAO, EA, EAWR, SCA) is built separately at each grid point or tall tower location (x, y). The fit adjustment parameters a_n are obtained employing an ordinary least squares method (see their spatial distribution in figure S2). The reference period that is used in all the model fits is 1981-2017. Additionally, in the generation of the multilinear models, a leave-one-out cross-validation approach has been considered. The observed EATC indices and their corresponding wind observation of the year under consideration are excluded from the sample used to estimate the fit adjustment parameters. In this way, they can be used later for verification. To avoid overfitting in the statistical model, a selection of the best subset of predictors that retains the maximum information in the model without necessarily keeping all the predictors is made at each location by using the Akaike Information Criterion and a backward stepwise selection (James et al 2013). Despite the relatively simple statistical model, the coefficient of determination (R²) of the fit presented in figure 2(a) shows that the EATCs explain most of the year-to-year variability (also known as interannual variability) in the near-surface winds over extended areas of Europe (figure 2(b)).
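The regression step described above can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: it assumes the model takes the form w′ = a_0 + a_1·NAO + a_2·EA + a_3·EAWR + a_4·SCA (the intercept a_0 is an assumption), it uses synthetic data in place of the observed indices and winds, and it omits the AIC-based backward stepwise selection. All function names are hypothetical.

```python
import numpy as np

def fit_ols(indices, wind_anom):
    """Ordinary least squares fit of wind anomalies on the EATC indices.

    indices: (n_years, 4) array with columns NAO, EA, EAWR, SCA.
    Returns [a0, a_NAO, a_EA, a_EAWR, a_SCA]; the intercept a0 is an assumption.
    """
    design = np.column_stack([np.ones(len(indices)), indices])
    coeffs, *_ = np.linalg.lstsq(design, wind_anom, rcond=None)
    return coeffs

def predict(coeffs, indices):
    """Evaluate the fitted linear combination for one or several years."""
    indices = np.atleast_2d(indices)
    return coeffs[0] + indices @ coeffs[1:]

def leave_one_out_predictions(indices, wind_anom):
    """Refit the model once per year, excluding that year from the sample."""
    n_years = len(wind_anom)
    preds = np.empty(n_years)
    for year in range(n_years):
        keep = np.arange(n_years) != year
        coeffs = fit_ols(indices[keep], wind_anom[keep])
        preds[year] = predict(coeffs, indices[year])[0]
    return preds

# Synthetic stand-in for 37 winters (1981-2017) at a single location.
rng = np.random.default_rng(0)
n_years = 37
eatc = rng.standard_normal((n_years, 4))
true_coeffs = np.array([0.0, 0.8, -0.3, 0.1, 0.4])  # hypothetical values
wind = predict(true_coeffs, eatc) + 0.2 * rng.standard_normal(n_years)

loo = leave_one_out_predictions(eatc, wind)
r2_loo = np.corrcoef(loo, wind)[0, 1] ** 2  # cross-validated explained variance
```

Because each year is predicted by a model that never saw it, `r2_loo` gives a more conservative estimate of explained variance than the in-sample R².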
In order to obtain an ensemble of hybrid near-surface wind anomaly predictions, the individual-member predictions of EATC indices are inserted into the multilinear regressions, both at gridded and local scales. The seasonal ensemble predictions of the EATC indices are initialised at the beginning of winter (December) and one, two and three months in advance (i.e. November, October and September, respectively). In this work, lead-zero predictions will refer to those initialised in December, lead-one predictions will be those initialised in November, and so on. Finally, all members from the five prediction systems are pooled together to create a new dataset, the multi-system henceforth, with a total of 148 members. Multi-system ensemble predictions can outperform individual-system predictions.
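As an illustration of this step, the sketch below applies a set of fitted regression coefficients to every ensemble member of predicted EATC indices and then pools the systems into a multi-system ensemble. The coefficient values, system names and ensemble sizes are hypothetical placeholders (the actual sizes are those listed in table 1), and random numbers stand in for the predicted indices.

```python
import numpy as np

def hybrid_members(coeffs, member_indices):
    """Insert each member's predicted EATC indices into the regression.

    coeffs: [a0, a_NAO, a_EA, a_EAWR, a_SCA] from the observational fit.
    member_indices: (n_years, n_members, 4) predicted indices.
    Returns hybrid wind anomalies with shape (n_years, n_members).
    """
    return coeffs[0] + member_indices @ coeffs[1:]

rng = np.random.default_rng(1)
coeffs = np.array([0.0, 0.8, -0.3, 0.1, 0.4])  # hypothetical fitted values
n_years = 24  # hindcast years 1993-2016

# Hypothetical systems and ensemble sizes, for illustration only.
system_sizes = {"sysA": 30, "sysB": 25, "sysC": 51}

per_system = {
    name: hybrid_members(coeffs, rng.standard_normal((n_years, size, 4)))
    for name, size in system_sizes.items()
}

# Pool all members along the member axis to build the multi-system ensemble.
multi_system = np.concatenate(list(per_system.values()), axis=1)
ensemble_mean = multi_system.mean(axis=1)
```

Pooling in this way weights each system by its ensemble size, so larger ensembles contribute more members to the multi-system.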

Wind capacity factor
The wind-based CF index is obtained using the 6 h wind speed data from the predictions, and the 1-hourly winds from the ERA5 and TTD. The conversion between wind speed and power output has been made employing a power curve, which takes into account the specific efficiency characteristics of the wind turbine. Specifically, a power curve for the turbine Type I defined in the IEC-61400-12-1 international standard has been considered (see IEC 2017 and Lledó et al 2019 for further information).
Although this turbine type might not be the most suitable for all the investigated locations, it serves the purpose of investigating whether the non-linearities of its power curve affect the quality of the hybrid predictions. Once the conversion is made, CF values are obtained by dividing the power output by the nominal power capacity of the turbine. Lastly, seasonal anomalies are calculated.
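The conversion from wind speed to CF can be sketched as follows. The power curve used here is a generic, idealised one (cubic ramp between cut-in and rated speed) with hypothetical parameter values; it is not the IEC Type I curve employed in this work and only serves to show the shape of the calculation and the effect of the non-linearity.

```python
import numpy as np

# Hypothetical turbine parameters, not taken from the IEC standard.
CUT_IN, RATED_SPEED, CUT_OUT = 3.0, 12.0, 25.0  # m/s
RATED_POWER_KW = 2000.0

def power_curve_kw(speed):
    """Idealised power curve: zero below cut-in and above cut-out,
    cubic ramp up to rated speed, constant rated power in between."""
    speed = np.asarray(speed, dtype=float)
    ramp = RATED_POWER_KW * (speed**3 - CUT_IN**3) / (RATED_SPEED**3 - CUT_IN**3)
    power = np.where(speed < RATED_SPEED, ramp, RATED_POWER_KW)
    return np.where((speed < CUT_IN) | (speed > CUT_OUT), 0.0, power)

def capacity_factor(speed):
    """Power output divided by the nominal capacity of the turbine."""
    return power_curve_kw(speed) / RATED_POWER_KW

def seasonal_cf_anomalies(speeds_by_season):
    """speeds_by_season: (n_seasons, n_steps) sub-daily wind speeds.
    CF is computed per time step, averaged per season, then centred."""
    seasonal_cf = capacity_factor(speeds_by_season).mean(axis=1)
    return seasonal_cf - seasonal_cf.mean()

# Non-linearity: the CF of the mean wind differs from the mean of the CFs.
speeds = np.array([4.0, 12.0])
cf_of_mean = capacity_factor(speeds.mean())
mean_of_cf = capacity_factor(speeds).mean()
```

This is why CF has to be computed from the sub-daily wind speeds before seasonal averaging rather than from the seasonal mean wind speed.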
Hybrid predictions of CF, which might be of particular interest at a turbine or wind farm level within the wind industry, are studied in detail at one tall tower location where local wind effects represent a large proportion of the seasonal mean wind speed value, and consequently of the seasonal CF value. The relatively low r obtained for the Puijo tall tower (T16, table 2) suggests that local wind effects are likely to occur there, and a comparison against a surface station located two kilometres away confirms this (Leskinen et al 2009).

Verification metrics
The quality of the hybrid predictions has been assessed employing both gridded and local-scale observations. Multiple verification scores have been considered to account for different aspects of forecast quality: association, discrimination and reliability (Stephenson 2012, Mason 2018). In some of those scores, the performance of the hybrid predictions is compared to that of a benchmark prediction. Two different benchmarks have been employed: the climatological forecast (i.e. a 33% probability for each tercile category) and the dynamical predictions of near-surface wind speed from the considered systems (table 1). Skill scores using the climatological forecast as a reference are identified with the subscript c while those using the dynamical prediction use d.
To prepare the dynamical prediction benchmark, seasonal anomalies of surface (10 m) wind speeds for the 1993-2016 period have been obtained at gridded and local scales (in the latter case using a bilinear interpolation) and then bias-adjusted using a simple bias correction approach (Torralba et al 2017). The method adjusts predictions to have an equivalent standard deviation and mean to that of the reference dataset, which has been the ERA5 reanalysis near-surface wind speeds. A leave-one-out cross-validation approach has been again used: the prediction to be adjusted and its corresponding observation are excluded from the sample used to estimate the adjustment parameters (see equations (1)-(4) in Torralba et al 2017). The multi-system of the dynamical predictions is also generated by pooling together all the bias-corrected anomalies from the five prediction systems.
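A mean-and-variance adjustment of this kind can be sketched as below. This is an assumed form of the correction (shift to the reference mean, rescale to the reference standard deviation), written only from the description above; the exact formulation is given in equations (1)-(4) of Torralba et al (2017) and may differ in detail, for instance in how the forecast standard deviation is estimated. Synthetic data are used throughout.

```python
import numpy as np

def simple_bias_adjust_loo(forecast, reference):
    """Adjust an ensemble hindcast to the mean and standard deviation of a
    reference series, estimating the parameters in leave-one-out mode.

    forecast: (n_years, n_members) predicted values.
    reference: (n_years,) observed or reanalysis values.
    Assumption: the forecast spread is the standard deviation over all
    members and all remaining years.
    """
    n_years = forecast.shape[0]
    adjusted = np.empty_like(forecast, dtype=float)
    for year in range(n_years):
        keep = np.arange(n_years) != year  # drop the year being adjusted
        fc_mean = forecast[keep].mean()
        fc_std = forecast[keep].std(ddof=1)
        ref_mean = reference[keep].mean()
        ref_std = reference[keep].std(ddof=1)
        adjusted[year] = (forecast[year] - fc_mean) * ref_std / fc_std + ref_mean
    return adjusted

# Synthetic example: a forecast that is too strong and too variable.
rng = np.random.default_rng(3)
reference = 5.0 + rng.standard_normal(24)
forecast = 7.0 + 3.0 * rng.standard_normal((24, 20))
adjusted = simple_bias_adjust_loo(forecast, reference)
```

After the adjustment, the pooled mean and standard deviation of `adjusted` are close to those of `reference`, while the year-to-year signal of the forecast is left unchanged apart from the rescaling.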
The considered scores for the skill assessment are both deterministic and probabilistic, and the R packages easyVerification and SpecsVerification have been used for their computation:
• The Ensemble Mean Correlation (EMC) quantifies the association (i.e. linear dependency) between observed and predicted wind speeds. The EMC ranges from −1 to 1, with a value of 1 indicating a perfect association. A Student's t-test at the 95% confidence level has been applied to emphasise statistically significant areas.
• The Relative Operating Characteristic Skill Score (ROCSS) assesses the discrimination of probabilistic single-category forecasts. Here, predictions are prepared in the form of probability of occurrence of three categories defined by the 33rd and 66th percentiles of the hindcast values. The ROCSS measures the proportion of hits (i.e. correct predictions) versus false alarms (i.e. non-occurrences that were incorrectly predicted) for each of the three categories. The ROCSS ranges from −1 to 1, with negative values indicating a weaker discrimination capacity than that of the benchmark prediction.
• The Rank Histogram (RH) tests the reliability of the probabilistic predictions, by comparing how the observations rank with respect to the ensemble members of the predictions. Reliable ensemble predictions show a flat RH, which has been statistically assessed with a decomposed Pearson's χ² test as in Jolliffe and Primo (2008). When the sample size is small in comparison with the number of ranks available (i.e. the ensemble size plus one), non-flat rank histograms are likely to occur due to randomness, which is not desirable. To prevent this from happening, counts from every ten adjacent bins have been grouped so that the number of ranks has been reduced by a factor of ten.
• The Continuous Ranked Probability Skill Score (CRPSS) measures the quality of the cumulative forecast probability distribution by measuring the distance between the observed and predicted probability distributions. The CRPSS penalises errors in both reliability and resolution (the latter is closely related to discrimination). It ranges from −Inf to 1, and positive values indicate an increased skill compared to the benchmark forecast. The Diebold-Mariano test (Diebold and Mariano 1995) has been applied to explore the statistical significance of the differences between the CRPSSs of hybrid and dynamical predictions.
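To make the definition of the CRPSS concrete, the sketch below computes the CRPS of an ensemble with the standard empirical estimator, CRPS = mean|x_i − y| − ½·mean|x_i − x_j|, and turns it into a skill score against a benchmark ensemble. It is an illustration in Python under that assumption and does not reproduce the R packages used in this work, which may apply a different (e.g. ensemble-size corrected) estimator. The climatological benchmark is built here from the observed values themselves, which is also an assumption.

```python
import numpy as np

def crps_ensemble(members, obs):
    """Empirical CRPS per forecast case.

    members: (n_cases, n_members) ensemble values.
    obs: (n_cases,) verifying observations.
    """
    members = np.asarray(members, dtype=float)
    obs = np.asarray(obs, dtype=float)
    # Mean absolute distance between members and the observation.
    term_obs = np.abs(members - obs[:, None]).mean(axis=1)
    # Mean absolute distance between all pairs of members.
    pairwise = np.abs(members[:, :, None] - members[:, None, :])
    term_spread = pairwise.mean(axis=(1, 2))
    return term_obs - 0.5 * term_spread

def crpss(members, benchmark_members, obs):
    """Skill score: 1 is perfect, 0 equals the benchmark, negative is worse."""
    crps_fc = crps_ensemble(members, obs).mean()
    crps_ref = crps_ensemble(benchmark_members, obs).mean()
    return 1.0 - crps_fc / crps_ref

# Synthetic example with a forecast that tracks the observations.
rng = np.random.default_rng(2)
obs = rng.standard_normal(24)
climatology = np.tile(obs, (obs.size, 1))  # same climatological ensemble each year
skilful = obs[:, None] + 0.5 * rng.standard_normal((24, 30))
score = crpss(skilful, climatology, obs)
```

Replacing `climatology` with the bias-adjusted dynamical ensemble would give the analogue of the score with subscript d.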
Finally, areas where the hybrid model shows a poor performance, based on the R² of the statistical fit being smaller than 0.3 (grey areas in figure 2(a)), have been omitted in the verification. Those areas are located around the Black Sea and scattered around the northern Mediterranean, where low values of interannual variability are noted (figure 2(b)). There, winds respond mainly to mesoscale systems rather than large-scale circulation patterns, which may explain the inability of the hybrid model to reproduce the year-to-year variations of near-surface wind speeds. T9 has been omitted in the results as well since the local winds correlate very poorly with the ERA5 winds (table 2), which undermines the robustness of the Measure-Correlate-Predict reconstruction.

Results
In the following sections, we analyse the skill of the hybrid predictions at the local scale. We complement these results with the verification of grid-scale hybrid forecasts (i.e. adjusted to reanalysis data instead of tower observations) at a pan-European scale [27°N-72°N; 22°W-45°E]. This is important because potential users of hybrid predictions may face the limitation of the unavailability of in-situ local data needed to generate the predictions. The verification focuses on three key attributes of a probabilistic prediction: association, discrimination and reliability. For the sake of simplicity, results are shown only for the multi-system prediction. Remaining results for the individual systems are available from the authors upon request. We focus on the winter season, when wind speed variability is highest, and so is the importance of its anticipation.

Do hybrid predictions improve their dynamical counterparts?
The association between the observed and hybrid-predicted near-surface wind anomalies is measured by the EMC and illustrated in figures 3(a)-(d) at both local and grid scales. The EMC is a deterministic metric which is insensitive to forecast errors in the magnitudes and the spread of the ensemble, so only some association with the observations is required for a forecast to be skilful. In this regard, the negative correlation values noted across the Mediterranean basin anticipate a poor performance of the hybrid prediction there. Conversely, positive and significant correlations above 0.6 have been obtained for lead month zero across northern Europe (figure 3(a)). At longer leads, correlations decrease but still depict positive values above 0.4 in the British Isles and the east of the Baltic Sea. For the latter region, we observe increased and statistically significant EMC values at lead month two, which are not seen at lead months one and three. This improvement in the hybrid prediction at that particular lead month reflects an increase in the skill values of the EATC predictions. More specifically, the SCA index has the greatest weight in the hybrid model over that region (figure S2), and shows a relative maximum in correlation at lead month two (i.e. 0.42; see table S1). The differences in EMC between the hybrid and dynamical predictions (figures 3(e)-(h)) reveal that the highest gains in skill are seen at the longest leads. While the dynamical forecast offers skill only at leads zero and one (see figure S3), the hybrid prediction shows positive EMCs at all lead times. The increased scores for predictions based on the circulation patterns in the hybrid method appear to match the increase in skill seen in other recent studies (e.g. Scaife et al 2014, Baker et al 2017). Results are similar at the local scale.
The lollipop plots (figure 4) depict the most noticeable differences between hybrid and dynamical predictions at longer leads, where the improvement of the hybrid prediction is substantial for all systems but the SPS3.
The ability of the predictions to discriminate between observations belonging to different categories has been explored with the ROCSS. The ROCSS_c of the lower-tercile category for the multi-system hybrid and dynamical predictions is compared in figure 5. While both hybrid and dynamical predictions show similar skill score values at lead zero (panel (a); the density is centred around the y = x line), hybrid predictions enhance the discrimination ability at leads one, two and three (panels (b), (c) and (d); most of the density is found above the y = x line). Furthermore, this improvement is not restricted to a particular region: positive ROCSS_d values are observed all over Europe (not shown).
Analogous results are obtained for the predictions of the upper-tercile category (figure S4). On the other hand, neither hybrid nor dynamical predictions show skill for the central-tercile category (figure S5). The lack of skill in predictions of near-normal conditions is a recurrent issue which has already been addressed in the literature and stems from the definition of the skill scores themselves, thus not requiring any physical or dynamical explanation (Van Den Dool and Toth 1991).
To gain more insight into the performance of the hybrid predictions at the local scale, we have selected four tall tower locations to evaluate the reliability of the ensemble predictions by exploring their rank histograms (figure 6). The set of four locations includes T2 (Cabauw, The Netherlands), T5 (Fino2, Germany), T15 (Obninsk, Russian Federation) and T16 (Puijo, Finland), which are located on both continental sites (flat and complex terrain) and offshore platforms across northern Europe. Focusing on lead zero, the RHs of the hybrid predictions at T2 and T5 (figures 6(a) and (b), respectively) are both U-shaped, reflecting an overpopulation of the outermost ranks which can occur due to either a lack of ensemble mean signal or a lack of spread around the ensemble mean (Eade et al 2014) in the hybrid prediction. The non-flatness of the RH is statistically supported by the p-values of the Jolliffe-Primo statistical test at the 95% confidence level. Conversely, the RH at T15 (figure 6(c)) depicts an opposite convexity (i.e. overdispersion), but this outcome is not statistically significant. These results indicate a poor reliability of the multi-system hybrid predictions at these particular locations, which can also be noted for the individual systems (figures S6-S10), especially at T5. The unreliability of the multi-system hybrid predictions is observed in the RHs of 10 out of the 16 tall tower locations, while the other 6 locations show a flatter plot such as that observed at T16 (figure 6(d)). This indicates that the probability distribution of the ensemble at these six locations is in agreement with the observed values. Similar results are obtained for the other leads (not shown). The performance of the hybrid predictions could be improved further by employing calibration methods (Doblas-Reyes et al 2005, Manzanas et al 2019) or performing variance corrections to the ensemble mean and members (Eade et al 2014).
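The construction of the rank histograms discussed above, including the grouping of adjacent bins, can be sketched as follows. The function name and the synthetic ensembles are hypothetical, ties between members and observations are ignored, and the flatness test of Jolliffe and Primo (2008) is not included.

```python
import numpy as np

def rank_histogram(members, obs, group=1):
    """Count how often the observation falls at each rank of the ensemble.

    members: (n_cases, n_members); obs: (n_cases,).
    group: number of adjacent ranks merged into one bin (e.g. 10).
    """
    n_members = members.shape[1]
    # Rank = number of members below the observation, from 0 to n_members.
    ranks = (members < obs[:, None]).sum(axis=1)
    counts = np.bincount(ranks, minlength=n_members + 1)
    # Pad with empty ranks so the counts split evenly into groups.
    n_bins = -(-counts.size // group)  # ceiling division
    padded = np.zeros(n_bins * group, dtype=int)
    padded[:counts.size] = counts
    return padded.reshape(n_bins, group).sum(axis=1)

rng = np.random.default_rng(4)
obs = rng.standard_normal(500)

# An ensemble with too little spread yields a U-shaped histogram.
underdispersive = 0.3 * rng.standard_normal((500, 9))
hist_u = rank_histogram(underdispersive, obs)

# A 148-member ensemble has 149 ranks, merged here into bins of ten.
large = rng.standard_normal((500, 148))
hist_grouped = rank_histogram(large, obs, group=10)
```

With 149 possible ranks and only a few tens of verification years, most ungrouped bins would be empty, which is why the merged version is easier to interpret.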
To complete the skill assessment we compute the CRPSS, a restrictive quality metric of the ensemble distribution that accounts for both discrimination and reliability at the same time. Figure 7 presents the CRPSS_d, highlighting areas where the hybrid approach improves (positive values) or degrades (negative values) the dynamical prediction. In general, the results match those discussed for the EMC and ROCSS (figures 3 and 5, respectively) with the highest gains seen for leads two and three. However, the corresponding CRPSS_c values of the hybrid predictions are mostly negative (figure S11). Positive CRPSS_c values are only noted for the lead-zero predictions and, in the case of MF6, the hybrid forecast is the only one that offers skill (figure S12).
According to Mason (2004), some scores such as the Ranked Probability Skill Score (and thus the CRPSS) are often too harsh when the climatological forecast is considered as benchmark. This gives high chances of getting negative values even when predictions provide useful information. Such is the case observed here: although the CRPSS_c is generally negative, we observe gains in association and discrimination and, in some cases, hybrid predictions are reliable. Therefore, one should not rely solely on a single skill score but take into account the whole verification.

Can hybrid predictions always be trusted at a local scale?
At this point in the results, it has been shown that hybrid predictions improve on the dynamical predictions in many aspects, primarily in northern Europe. However, little has been discussed about how hybrid predictions perform at the micro-scale level, especially when local wind effects occur. In the following, we illustrate how hybrid predictions could be applied to predict the absolute values of the wind CF at a location where local wind effects have been reported, and quantify the error made when reanalysis gridded data (which sometimes misrepresent those effects) are used to fit the hybrid model.
The ensemble predictions of CF for the Puijo site are presented in figure 8 in the form of Probability Density Functions (PDF). We note that the direct output of the grid-scale hybrid predictions is considerably biased, with the seasonal mean CF systematically underestimated (figure 8(a)). A CRPSS_c value of −4.215 indicates that the prediction performs far worse than the climatological forecast. A later bias adjustment of this prediction (figure 8(b)) removes the bias and adjusts the variability to that observed at Puijo, although the skill score of the prediction is still negative (−0.046), indicating a similar performance to that of a climatological forecast. Finally, the hybrid prediction fitted with in-situ data also adjusts well to the observed CFs, and the CRPSS increases slightly further, up to a positive value of 0.0007, indicating that the use of local observations with the hybrid method provides the most accurate prediction of seasonal CF values.
The large bias in the grid-scale predictions in figure 8(a) stems from the fact that gridded data are a representation of the average value within a grid cell of hundreds of square kilometres. Therefore, values of variables with high spatial variability such as wind speed in complex terrain regions may differ substantially from the actual values observed at different locations within the grid cell. This misrepresentation of local values is said to produce representativeness errors. In the case of wind, local effects such as katabatic winds over complex terrain regions may account for a large proportion of the mean wind speed value, thus enlarging the representativeness error of the wind speeds in the reanalysis. These errors are propagated to the CF values, and eventually to the hybrid predictions. Hence, reanalysis gridded data are sometimes not suitable to generate hybrid predictions because these datasets are unable to represent local wind effects occurring at much finer scales, such as those observed at Puijo. A later bias-correction may enhance the grid-scale hybrid predictions, but this post-processing can only be carried out where in-situ measurements are available.

Summary and conclusions
This research proposes and applies a methodology to overcome two main limitations of seasonal predictions that jeopardise the decisions based upon them. The first impediment is the limited skill levels observed in the prediction of surface variables such as wind speed, while the second is the lack of adaptation to the local scale due to the relatively coarse scales at which forecasts are delivered.
Results show that hybrid predictions of near-surface wind speed based on a PP statistical downscaling technique help reduce the effects of both issues simultaneously. Using the indices of the four main EATCs as predictors, the hybrid predictions proposed here have been shown to improve the skill of the same predictions obtained from a dynamical approach. Besides, the statistical downscaling has enabled the transfer of the coarse-scale predictions to a station-scale level, and the comparison with station-based observations has revealed a certain level of agreement even when local wind effects play an important role. In particular:
• Hybrid predictions enhance the skill scores of the dynamical predictions at both local and pan-European scales.
• In general, hybrid predictions are able to provide skill at leads two and three, while dynamical forecasts cannot.
• The highest gains in quality are observed in the association with the observations and the discrimination between the different observed outcomes.
• Although hybrid predictions can also be built using reanalyses, it is advisable not to use gridded data to build the statistical model over areas where local effects are considerable.
• EATC predictions (and thus hybrid predictions) provide no added value in the Mediterranean basin.
Hybrid forecasts exploit the information available in the EATC predictions to anticipate near-surface wind speed or CF anomalies. The derived predictions are consistent with the main features of the atmospheric circulation, which are summarised in the state of the EATCs. This provides interpretability of the results, which enables users to make more informed decisions. For example, one can link higher winds across the UK and the North Sea to a positive NAO phase.
The wind power industry is one of the potential users that can profit most from hybrid predictions. Wind and CF forecasts have been shown to offer useful results at a wind farm scale, provided that site observations from a met mast are available. Moreover, the skill is not restricted to the shortest leads, as is often the case for the dynamical forecasts: hybrid predictions issued two or three months in advance can already anticipate the conditions for the coming season.
The PP is a simple and effective approach but also suffers from some limitations. For instance, the proposed hybrid model does not account for the biases in the EATC predictions. Future work may look into existing post-processing methods like calibration techniques to bias-correct the model output. Besides, the optimal number of EATCs employed to explain the wind variability can be tuned for each region as in Bastien (2018) (chapter 3), who found varying results over France. Investigating whether these improvements lead to a marginal or substantial increase in skill would be valuable for any potential user of the hybrid predictions.

Data availability statement
Most of the data used in this article can be accessed from publicly available sources: Climate Data Store and Tall Tower Dataset.