A climate projection dataset tailored for the European energy sector

Climate information is necessary for the energy sector. However, the use of climate projections has remained limited so far for a number of reasons, such as the lack of consistency among climate projections, inadequate temporal and spatial resolution, climate model biases, the lack of guidance for users, and the size of datasets. In this work, we develop and assess a consistent ensemble of high temporal and spatial resolution climate projections that addresses these problems. First, a methodology for sub-ensemble selection is developed and proposed. Our ensemble dataset includes eleven 12 km-resolution EURO-CORDEX simulations of temperature, precipitation, wind speed and surface solar radiation on 3-hourly and daily time scales. These variables are bias-corrected for more effective use in impact studies. The assessment of bias-corrected model simulations against observational data indicates reduced biases and increased coherence in projected changes among models compared to the raw climate projections. We provide a well-documented dataset for energy practitioners and decision-makers to facilitate the access and use of energy-relevant high-quality climate information in operation and planning. The new dataset is freely available via the Earth System Grid Federation (ESGF) platform.


Practical implications
The energy sector is sensitive to weather and climate in various ways (e.g. heating and cooling demand, extreme weather event-related damage to energy infrastructure, cooling water needs for thermo-electric power generation, renewable energy generation, etc.). This represents a challenge for the energy generation-supply balance at all time scales. Climate information is therefore necessary for the energy sector to adapt efficiently to variability and changes in climate.
However, the use of climate projections in the energy sector has remained limited for several reasons: the wide variety of available climate datasets, which are heterogeneous in terms of model ensembles and emission scenarios; the unsuitability of the temporal and spatial resolution of climate models for impact modelling; model biases; the lack of guidance for users; the absence of user-friendly platforms for data access; and special data formats (e.g. NetCDF files) that require specific software, among others.
In order to bridge this gap, four energy-relevant variables (2 m temperature, 10 m wind speed, precipitation and surface solar radiation) from 11 EURO-CORDEX regional climate models (RCP4.5 and RCP8.5) have been bias-adjusted at high spatial and temporal resolution to give energy practitioners and decision-makers easier access to, and use of, energy-relevant high-quality climate information for operations and planning. The new dataset is freely available via the Earth System Grid Federation (ESGF) nodes (https://esgf.llnl.gov/nodes.html).
Such a high-resolution multi-model climate dataset represents a large amount of data, which can hinder the uptake of climate information by some users (data storage issues, computationally expensive impact models, etc.). A sub-sampling methodology has therefore been developed, which aims at favouring skilful models while preserving as much as possible the original spread in climate sensitivity and in future climate scenarios for the variables of interest. The latter aspect is important for the energy sector in order to anticipate a wide range of plausible futures. This dataset has already been used to derive energy-oriented indicators: wind power potential, solar power potential, inflow changes (the flow of water into reservoirs) for hydropower, power demand, a power generation-supply balance indicator, and a frozen soil indicator.
The wind power and photovoltaic capacity factors, which help in planning the location of a new wind or solar park, could serve as input for long-term trend analyses and efficiency calculations to evaluate the profitability of a specific wind park.
The inflow anomaly indicator enables effective preparation for changes in future fluctuations of total and seasonal inflow. Changes in inflow affect electricity prices and the optimisation of hydropower plant operation. For example, higher inflow in northern Europe most likely increases the potential for hydropower production in a region where hydropower production is already high. However, climate projections also indicate higher winter temperatures and thus less need for energy for heating. In combination, these two changes should allow greater use of hydropower generation, and limit water spillage in case the reservoir capacity is not large enough to store the excess water.
The impact of freezing rain on energy infrastructure can also be investigated, which supports better emergency planning in regions identified as more exposed to freezing rain events. A detailed assessment with statistics on the duration of events, the prevailing wind conditions and trend analyses can support decision-making processes regarding potential adaptation measures.
On the climatological time scale, the indicator for bioenergy production conditions is the length of the season suited for forest harvesting operations. Forest harvester manufacturers can design and develop a new generation of harvesters for future conditions. Similarly, the information is useful for forestry factory investment decisions for future raw material costs, which are affected by costs of harvesting and logistics.
The energy demand indicator (estimated from heating degree-days weighted by population) aims to help the energy sector anticipate production needs and therefore the risk of imbalance between strong demand and poor renewable energy potential. For some countries, e.g. France, electricity consumption is highly correlated with this indicator; in such cases a linear model can explain much of the variation in electricity consumption.
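As an illustration of how such a demand indicator could be computed, the sketch below derives population-weighted heating degree-days from daily mean temperatures and fits a linear model of consumption against them. The 18 °C base temperature, the function names and the array layout are assumptions made for this example, not the definition used to build the indicator.

```python
import numpy as np

def heating_degree_days(tas_c, base_c=18.0):
    """Daily HDD per grid cell: max(base - T, 0).

    tas_c: daily mean temperature in degC, shape (time, cell).
    base_c: assumed base temperature (illustrative value).
    """
    return np.clip(base_c - np.asarray(tas_c, dtype=float), 0.0, None)

def population_weighted_hdd(tas_c, population, base_c=18.0):
    """Population-weighted mean HDD per day over the cells of a country."""
    hdd = heating_degree_days(tas_c, base_c)
    weights = np.asarray(population, dtype=float)
    return hdd @ (weights / weights.sum())

def fit_linear_demand(hdd, consumption):
    """Least-squares fit of consumption = slope * hdd + intercept."""
    slope, intercept = np.polyfit(hdd, consumption, 1)
    return slope, intercept
```

The fitted slope can then be read as the additional consumption per degree-day, which is only meaningful where consumption is strongly temperature-driven.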
However, there are some limitations to the use of the dataset: issues related to adequacy between model outputs and energy needs (e.g. wind speed at 10 m vs 100 m), availability of high-frequency outputs, adequacy between model spatial resolution and energy needs, bias-adjustment limitations (remaining biases over some areas, unavailability of observations at high resolution, dependency of projected trends on the adjustment method), manageable data volume, limited ensemble size and over-weighted models.

Introduction
Energy is currently the largest greenhouse gas-emitting sector (35% of total emissions worldwide) (Bruckner et al., 2014), and ambitious climate change mitigation requires that, in addition to energy efficiency measures to reduce consumption, the share of low-carbon energies in generation grow very fast in the coming decades. For instance, to be consistent with global warming of 2°C or less at the end of the 21st century, the share of low-carbon energies should globally exceed 50% of total energy supply and 80% of electricity supply by 2050 (Bruckner et al., 2014). This rapid transition will involve an increasing share of renewables, which will make the energy supply more and more sensitive to weather and climate variability and changes. Energy practitioners therefore need to anticipate renewable resources and demand in order to plan infrastructure such as power plants and transmission systems. They also need to anticipate changes in the risk of extreme events such as heat or cold spells, or low flows, in order to adapt the management of resources.
The climate impact research community has recently carried out several studies focusing on the impact of climate change on renewable energy supply (Koletsis et al., 2016; Davy et al., 2018; Soares et al., 2017; Carvalho et al., 2017; Jerez et al., 2015; Chilkoti et al., 2017). Changes in energy demand due to climate change have also been quantified in several works (Auffhammer et al., 2017; Cronin et al., 2018; van Ruijven et al., 2019). In the energy sector, however, the systematic uptake of climate projections has remained limited so far. Since a large number of high-quality regional climate simulations are available, the time is ripe for significant progress in the use of climate projections for adaptation. For this reason, we propose and assess a methodology to process regional climate projections in order to provide consistent datasets dedicated to the energy sector.
In general, climate simulations are an approximation of the real climate system, with various physical and mathematical simplifications that result in biases of the simulated climate compared to the observed one. For this reason, a subsequent adjustment (referred to here as "bias correction") towards the observed climatology is necessary. Furthermore, climate simulations are not supposed to represent observed weather conditions at a specific date: beyond their initial conditions they are not synchronised with the observed climate, and they represent only the main climate characteristics for a given period. Mean values, or the frequency of a phenomenon computed over several years (30 years for example), are therefore more representative. The robustness of an analysis can be evaluated according to the agreement of the results produced by different models. In addition, the future evolution of climate is uncertain, mainly because of the evolution of greenhouse gas concentrations in the atmosphere. For this reason, climate projections account for different concentration scenarios. The Representative Concentration Pathway (RCP) scenarios introduced in the Fifth Assessment Report (AR5) of the IPCC are named after a possible range of radiative forcing values in the year 2100 relative to pre-industrial values. In this study two of them are considered: RCP 4.5 (median) and RCP 8.5 (pessimistic).
Indeed, climate projections are often disregarded due to a lack of understanding of their nature and of the assumptions they are based on. They are often considered as forecasts, which leads to false interpretations. The wide variety of available models and climate simulations also calls for appropriate guidance for users. In particular, the lack of consistent multi-model datasets with standardized outputs, at a resolution that allows the assessment of impacts, prevents users from carrying out proper uncertainty assessments.
Several methodological advances that partly address these issues have recently been proposed; if properly assessed, they should allow better use of ensembles of climate projections. Firstly, model biases can be handled through statistical bias-correction. This statistical post-processing step adjusts selected statistics (mean, variance, distribution) of the so-called "raw" model simulations to better match observed time series over the reference period. Over the last decades several bias-correction methods have been developed and widely applied to model simulations before introducing them into impact studies. A detailed presentation and evaluation of these methods can be found in the works of Teutschbein and Seibert (2012, 2013) and Maraun (2016). Under changed future conditions, distribution mapping methods have been considered to perform better than the simpler delta-change and linear transformation approaches (Teutschbein and Seibert, 2013). In this paper we use the Cumulative Distribution Function-transform (CDF-t) method (Vrac et al., 2012), which has the advantages of a quantile matching method while accounting for the time evolution of the cumulative distribution function (CDF) as provided by the climate model. The method described in Vrac et al. (2012) was further developed in order to improve the adjustment of rain frequency (Vrac et al., 2016).
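To make the idea of distribution mapping concrete, the sketch below implements plain empirical quantile mapping, the stationary building block that CDF-t extends. It is not the CDF-t implementation used for this dataset: CDF-t additionally estimates the future observed CDF, commonly written as F_Of(x) = F_Oh(F_Mh^-1(F_Mf(x))), before mapping quantiles. The function name and the plotting-position convention are assumptions made for this illustration.

```python
import numpy as np

def empirical_quantile_mapping(obs_ref, mod_ref, mod_values):
    """Map model values onto the observed distribution of the reference period.

    obs_ref, mod_ref: 1-D samples over a common reference period.
    mod_values: model values to adjust.
    Assumes a continuous variable (few ties); values outside the reference
    range are clamped to the end quantiles by np.interp.
    """
    obs_sorted = np.sort(np.asarray(obs_ref, dtype=float))
    mod_sorted = np.sort(np.asarray(mod_ref, dtype=float))
    # Non-exceedance probabilities attached to the sorted samples.
    p_obs = (np.arange(obs_sorted.size) + 0.5) / obs_sorted.size
    p_mod = (np.arange(mod_sorted.size) + 0.5) / mod_sorted.size
    # Model CDF: probability of each value under the model reference sample.
    p = np.interp(mod_values, mod_sorted, p_mod)
    # Inverse observed CDF: observed value with the same probability.
    return np.interp(p, p_obs, obs_sorted)
```

With a model that is uniformly 2 units too high, for instance, the mapping simply removes the offset; for more complex biases it also adjusts variance and distribution shape.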
Secondly, a multi-criteria based methodology of ranking model simulations has been developed in order to help users in reducing input data considered in their analysis, when needed. In fact, it is often unclear whether all available projection models should be used, or whether a sub-sample provides sufficient information, also considering the large amount of data processing involved for each of these models. Several methodologies have been previously established to sub-sample large simulation ensembles (Mendlik and Gobiet, 2016;McSweeney et al., 2015). The methodology developed here is based on a set of criteria established a priori that sub-ensembles are to meet. However, the methodology allows for some flexibility in terms of metrics and thresholds used in order to be tailored to specific needs (see Section 3).
Furthermore, since impact studies focus on climate change signals acting at the regional scale, another question is the appropriate spatial resolution of the climate data. Nested regional climate models (RCMs) bridge the gap between large-scale and regional-to-local scale processes by dynamical downscaling of coupled atmosphere-ocean Global Climate Models (GCMs). Since numerous regional climate simulations are available, several initiatives have been set up to coordinate research and modelling activities, including a common interface for users. The "Coordinated Regional Climate Downscaling Experiment" (CORDEX) (Giorgi et al., 2009) framework aims to compare, improve and standardize regional climate modelling at the individual modelling centres worldwide, thus harmonizing the new generation of regional climate projections, which apply the most recent versions of RCM ensembles, driven by the latest GCM projections, at unprecedentedly high resolution (e.g. for Europe, simulations with 0.11° resolution are available, which corresponds to a grid size of about 12 × 12 km). Therefore, in this work we consider the EURO-CORDEX framework. It provides regional climate projections for the European CORDEX domain (Jacob et al., 2014), thereby complementing the previous PRUDENCE (Christensen et al., 2007) and ENSEMBLES (Hewitt, 2005) experiments.
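As a rough check of the correspondence between angular and metric grid spacing, and assuming a spherical Earth of radius 6371 km (an assumed value), a grid step of 0.11° along a great circle corresponds to:

```latex
\Delta x \approx 0.11^{\circ} \times \frac{\pi}{180^{\circ}} \times 6371\ \mathrm{km} \approx 12.2\ \mathrm{km}
```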
The paper is organized as follows. Section 2 describes the regional climate models and observational data used in the study. Section 3 presents the sub-ensemble selection methodology for ranking climate simulations. In Section 4 the bias-correction process is described. Section 5 includes the verification and quality control of the bias-corrected climate projections tailored for use in the energy sector. Section 6 gives remarks regarding data standardization and access, and finally Section 7 contains the conclusions.

Climate variables
In this study we consider a limited set of climate variables selected on the basis of two considerations: first, discussions with several energy practitioners about how they would use the processed variables; second, the availability of the variables with appropriate characteristics in the regional climate projection database. As a consequence, we describe the processing of four variables: near-surface (2 m) temperature (tas), precipitation (pr), near-surface (10 m) winds (sfcWind), and surface solar radiation (rsds).
Near-surface air temperature is used in several applications: to model energy demand for heating or cooling, to estimate evapotranspiration in hydrological models estimating inflow for hydropower, and to modulate photovoltaic (PV) solar energy production, as the efficiency of solar panels is sensitive to temperature. Precipitation is crucial for modelling inflow for hydropower production or thermal and thermonuclear cooling system efficiency and availability. Near-surface winds allow the estimation of wind energy production and, to some extent, of solar production, as wind influences the solar PV panel temperature. For wind energy, however, hub-height winds are required but often not available from climate model outputs; they are therefore usually recalculated from an empirical formula. Surface solar radiation, also called downwelling solar radiation at the surface, is required to estimate PV energy production.
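The empirical formula is not specified here. One common choice, shown below purely as an assumed example, is the power-law wind profile v(z) = v_ref (z / z_ref)^alpha. The function name and the exponent alpha = 1/7 (a value often quoted for neutral conditions over open terrain) are assumptions; in practice the exponent depends on surface roughness and atmospheric stability.

```python
import numpy as np

def extrapolate_wind_power_law(speed_ref, z_ref=10.0, z_hub=100.0, alpha=1.0 / 7.0):
    """Extrapolate wind speed from height z_ref to z_hub with a power law.

    speed_ref: wind speed(s) at z_ref in m/s (scalar or array).
    alpha: assumed shear exponent (illustrative default).
    """
    return np.asarray(speed_ref, dtype=float) * (z_hub / z_ref) ** alpha
```

With these default values, 10 m winds are scaled by a factor of about 1.39 to obtain 100 m winds.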
Several other variables are potentially useful in the energy sector, such as sea-level pressure, often used as a diagnostic to calculate and understand weather patterns. Snow depth also has several applications, such as estimating hydropower inflow potential storage, or assessing the practicability of forest management for bioenergy in high-latitude peatland areas. It is also an important variable for the PV sector since solar power generation is reduced to zero when snow covers the panel. Ocean variables (waves, sea level, currents) are important in particular for offshore assets, as is air humidity for heating/cooling demand. In this article, however, we restrict our selection to the four main variables mentioned above, for which abundant observations exist for comparison, for which reanalyses were available at the 3-hourly time scale from WFDEI, and which cover most of the energy sector needs, based on discussions with stakeholders.

Climate projections: the original EURO-CORDEX ensemble
The EURO-CORDEX project is a coordinated initiative to produce a multi-scenario and multi-model ensemble of regional climate projections over Europe. Detailed information on the simulations produced in the framework of this project can be found in Jacob et al. (2014), Vautard et al. (2013) and Kotlarski et al. (2014). Most of the completed simulations from the EURO-CORDEX project are published and freely available on the Earth System Grid Federation (ESGF) portals (https://esgf.llnl.gov/nodes.html). However, daily output is the highest frequency available on this server for most models and variables. In order to address wind energy, PV power production and supply-demand balance issues, a sub-daily frequency (ideally hourly) is required for tas, rsds, and sfcWind. In EURO-CORDEX, modelling groups have generally saved 3-hourly outputs, but these data are available only on request. These 3-hourly data under both RCP4.5 and RCP8.5 therefore had to be retrieved directly from the producers (see Table 1) for the present work. The initial set of simulations does not cover the low emission scenarios because, at the time of the study, only very few models had RCP2.6 scenario simulations and we preferred to keep the information provided homogeneous. Contacting producers and transferring between institutes such a large amount of data, namely sets of 150-year 3-hourly 12 km data over a large domain, is a heavy process, which led us to start with an initial EURO-CORDEX ensemble of limited size. Table 1 summarizes information related to the 11 EURO-CORDEX simulations considered in the study. It should be mentioned that not all the available EURO-CORDEX regional climate models are included in the study. The selection of the models was based on the availability of 3-hourly simulations provided by the producers at the time of the study (in Table 1, models are written in red if the data source is the producers and in blue if the data source is ESGF).
Data files were provided in NetCDF format. An initial quality check was performed, and a number of errors were corrected. Quality checking ensures the consistency of metadata across models, the completeness of all time steps and the correctness of the data format, since some raw datasets could contain small errors of this kind.
The quality check consisted in:

Reference data
Observation-based reference datasets are needed for bias-correcting the collected climate simulations. Several constraints have guided the choice of the dataset(s), such as time resolution and spatial coverage (e.g. for winds, the coverage of Integrated Surface Database (ISD) stations is very limited over some parts of Europe such as Portugal, Germany and Sweden). We finally chose the WATCH Forcing Data for ERA-Interim (WFDEI) dataset (Weedon et al., 2014) as reference data, which is provided at a 3-hourly time scale and consists of ERA-Interim reanalyses corrected for elevation, plus a monthly bias correction from gridded observations.
Additionally, for surface solar radiation, another reference dataset has been used, which is a correction of the WFDEI data, namely a daily scaling of WFDEI by the satellite-derived HelioClim-3v5 data, prepared for the Copernicus Climate Change Service (C3S) European Climate Energy Mixes project (http://ecem.climate.copernicus.eu) (Jones et al., 2017).
This additional correction was considered valuable because it is applied at the daily time scale, whereas the WFDEI correction is only applied at the monthly time scale; starting from WFDEI data was also technically easier and consistent with the other variables.
The WFDEI dataset consists of a combination of the European Centre for Medium-Range Weather Forecasts (ECMWF) ERA-Interim reanalysis (Dee et al., 2011) and observation-based datasets. This dataset has the advantage of including the required variables at the appropriate temporal resolution, namely temperature at 2 m, wind speed at 10 m, surface solar radiation, and precipitation, available at 3-hourly and daily resolution (Table 1). WFDEI has global coverage at 0.5° × 0.5° resolution (although it is available over land only) and covers the 1979-2014 period. WFDEI data are generated using the same methodology as for the widely used WATCH Forcing Data (WFD) (Weedon et al., 2010; Weedon et al., 2011), with slight differences in the basic data, processing and formatting. ERA-Interim at 0.75° is first interpolated to the WFDEI grid using the natural-neighbour methodology. Temperature at 2 m includes a bias-correction using the Climate Research Unit (CRU) TS3.1/TS3.21 (Harris et al., 2013) temperature monthly averages and averaged diurnal ranges, along with an elevation correction. Rainfall and snowfall rates are bias-corrected at the monthly scale using CRU number of wet days, Global Precipitation Climatology Center (GPCCv5/v6) (Schneider et al., 2013) precipitation totals, the ERA-Interim ratio of rainfall to precipitation and a rainfall gauge correction. Note that two WFDEI precipitation datasets are available, using either CRU only or CRU and GPCC data. We have used the CRU-GPCC product as GPCC includes a higher density of stations than CRU. Surface solar radiation takes into account CRU monthly average cloud cover and interannual changes in atmospheric aerosol loading. Wind speed at 10 m does not include any bias-correction or elevation correction.
Table 1 List of the 11 EURO-CORDEX simulations and the institutes that provided the data. The output frequency for each variable retrieved is written in red if the data source is the producers and in blue if the data source is ESGF.
Over Europe, average WFDEI temperatures are well constrained by the observations. Precipitation over mountainous areas is more problematic as these regions are poorly covered. Iizumi et al. (2014) showed that means and distributions of WFDEI temperature, surface solar radiation, wind speed and total precipitation were overall similar to near-global daily observations. Weedon et al. (2014) carried out an evaluation of WFDEI products against flux tower field observations (FLUXNET) at 4 sites in Europe (Finland, Germany, Belgium, Italy), arguing that daily temperature, surface solar radiation and precipitation rates are in good agreement with observed fluxes. At the sub-daily scale, temperature also agrees well with observations. However, they report reasonable biases and correlations alongside significant discrepancies at the various sites. Comparing grid box averages with local tower measurements has limitations, and further evaluation of WFDEI products would be necessary.
Because the WFDEI wind product does not include bias-correction with observations, we have carried out an evaluation using an in-house 10 m wind gridded dataset built from the station-based Integrated Surface Database (ISD-Lite) dataset (Smith et al., 2011). This dataset was made by averaging wind speed values over the nearest neighbour stations of each grid cell centre, provided they are not more than 75 km away. Fig. 1 highlights that WFDEI compares well with ISD-Lite station-based data (mean differences between the 2 datasets are within ± 1 m/s for most of the grid points). In most cases biases are small over continental areas but can be significant in a few coastal areas, and in some areas where wind farms are located, such as Southern France, around the Baltic Sea or in Southern Italy. The Root Mean Squared Error (RMSE) and spatial correlation calculated at the annual and seasonal time scales (Table 2) indicate a reasonable agreement between the datasets on average, especially when focusing on grid points for which the ISD-Lite-based wind speed climatology has been assessed from more than one station within 75 km of the grid point centre. These grid points are expected to have a more representative climatology than a grid point associated with a single station, which is only a local measurement.
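The gridding and comparison steps can be sketched as follows. This is a minimal illustration, not the in-house procedure: it assumes that every station within 75 km of a grid cell centre contributes equally to the cell mean, uses great-circle (haversine) distances on a sphere of radius 6371 km, and all function names are hypothetical.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0  # assumed spherical Earth

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between points given in degrees."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))

def grid_station_means(grid_lat, grid_lon, st_lat, st_lon, st_value, max_km=75.0):
    """Mean station value and station count within max_km of each grid centre.

    Cells without any station in range get NaN and a count of 0.
    """
    g_lat = np.asarray(grid_lat, dtype=float)[:, None]
    g_lon = np.asarray(grid_lon, dtype=float)[:, None]
    s_lat = np.asarray(st_lat, dtype=float)[None, :]
    s_lon = np.asarray(st_lon, dtype=float)[None, :]
    near = haversine_km(g_lat, g_lon, s_lat, s_lon) <= max_km
    count = near.sum(axis=1)
    total = np.where(near, np.asarray(st_value, dtype=float)[None, :], 0.0).sum(axis=1)
    mean = np.full(count.shape, np.nan)
    np.divide(total, count, out=mean, where=count > 0)
    return mean, count

def rmse_and_correlation(a, b):
    """RMSE and Pearson spatial correlation over points valid in both fields."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    ok = ~(np.isnan(a) | np.isnan(b))
    rmse = np.sqrt(np.mean((a[ok] - b[ok]) ** 2))
    corr = np.corrcoef(a[ok], b[ok])[0, 1]
    return rmse, corr
```

The returned station count makes it possible to restrict the comparison to grid points supported by at least two stations, as done for the second ensemble of grid points in Table 2.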
Another way of comparing WFDEI data with observations is a direct comparison between each station and the nearest grid point. With this procedure, however, we expect significant biases in coastal and mountainous regions. In particular, several stations are located at mountain tops in the Alps, and we do not expect a correspondence with 0.5° average winds. In order to evaluate the WFDEI data over areas where winds are expected to correspond to observed winds, we removed, out of 417 station anemometers, those located at an elevation higher than 1000 m and those located closer than 0.2° to the sea in 4 directions (N, S, E, W). This approach results in selecting 222 stations. The "spatial" correlation between the WFDEI nearest grid point and the 222 stations taken from the ISD-Lite dataset, for the winter and summer averages and 95th percentiles as a function of the hour of the day, is shown in Fig. 2. The figure clearly indicates that (i) wintertime winds are more reliably represented by WFDEI than summertime winds, with a correlation of about 0.8 for the mean winds indicating a fairly good representation of wind variations across Europe; (ii) in summer, correlations are weaker, probably because mesoscale processes (breezes) and planetary boundary layer (PBL) structure are not well represented in gridded data, hence the weaker skill of WFDEI; and (iii) the variability of WFDEI winter winds is nevertheless well represented, as the 95th percentile exhibits a fairly good correlation. In summer, again, the variability is probably less well represented.
In conclusion, the reference data used for bias-correction are based on WFDEI data, although some post-processing was needed. As a result, we used 2 m temperature, precipitation and wind speed directly from WFDEI. For solar radiation, we used the daily re-scaled WFDEI data.

Observations: station data
In order to validate the bias-corrected model simulations, several station datasets have been used. The daily temperature and precipitation products are compared with station data from the European Climate Assessment & Dataset project (ECA&D) (Klein Tank et al., 2002). The simulated monthly wind fields are validated against the ISD-Lite product, which is a subset of the larger Integrated Surface Data consisting of global hourly and synoptic observations (Smith et al., 2011). The bias-corrected monthly surface solar radiation has been validated against observations from the Global Energy Balance Archive (GEBA) (Gilgen et al., 1998). In the latter case, the missing monthly observations have been filled using the MASH homogenization method (Szentimrey, 2003).
Table 2 RMSE and spatial correlation are calculated for 1981-2000 annual and seasonal averages of 10 m wind speed from WFDEI and ISD-Lite-based datasets over two ensembles of grid points. The first ensemble is made up of grid points associated with at least one station within 75 km of the grid point centre and the second ensemble is made up of grid points with at least 2 stations within 75 km of the grid point centre.

Sub-ensemble selection
This section explains the method that can be used in order to subsample the ensemble of projections. The method offers a subset for a given number of simulations, and metrics to evaluate the subset. The reduction of the initial ensemble to a smaller number of simulations is necessary to alleviate the computation work of some impact modelers.
Here we develop the method and provide an example for subsets of 3, 5 and 7-member ensembles.

Model selection methodology
The method was developed in collaboration with energy stakeholders in the framework of the Copernicus C3S funded Clim4Energy project. Three main criteria have been considered important and relevant for climate change-related energy issues. These are related to i) model performance, ii) climate sensitivity and iii) type of climatic variables in future scenarios.
A fourth criterion related to model structural diversity is used additionally when needed to complete the sub-sampling process. The GCMs in the selected sub-ensemble would be ideally as different as possible in terms of model formulation (Masson and Knutti, 2011) in order to avoid undue weight to some model behaviours. Indeed, several GCMs share common code parts or even some GCMs are used to drive several RCMs. This genealogy criterion would also apply to RCMs.

Model performance criterion
The selected simulations must realistically represent the main climate features. GCM-driven Regional Climate Models (RCMs) are evaluated with regard to their ability to simulate variables of interest for the energy sector (tas, sfcWind, rsds, pr). An essential question is whether large-scale dynamics are well represented; for this reason, the Global Circulation Models (GCMs) that drive the RCMs are evaluated with regard to dynamical aspects, i.e. their ability to simulate weather regimes over Europe. Weather regimes have been used in several studies to assess climate model dynamics (Cattiaux et al., 2013; Vautard et al., 2019). Unrealistic simulations should not be included. This concerns simulations exhibiting obviously abnormal features, suggesting bugs or more profound issues in a model.
The evaluation of regional simulations focuses on temperature at 2 m (tas), 10 m wind speed (sfcWind), precipitation (pr) and surface solar radiation (rsds). Evaluation is carried out only over land since WFDEI is not available over oceans. Near-surface wind speed would also deserve such an evaluation over the ocean for offshore wind power purposes, but this was not possible here. Tables A1, A2, A3 and A4 in the Appendix show the values for the 11 simulations, the 4 variables, the 5 features and the 3 metrics. These values also serve to identify unrealistic simulations. The simulations are sorted per variable, feature and metric, based on these values X. The rank d associated with each simulation is calculated as described by Eq. (1) for metrics to maximize, i.e. Corr, and by Eq. (2) for metrics to minimize, i.e. RMSE and Q95Bias, where the minimum and maximum are calculated over the 11 simulations. The closer d is to 1, the better the simulation performs compared to the others.
We also assessed the capability of the driving GCM to simulate the main features of the large-scale atmospheric flow variability. For this, the representation of weather regimes in the GCMs is assessed. The assessment focuses here on the occurrence frequency of the four winter weather regimes (Atlantic Ridge, Blocking, North Atlantic Oscillation + (NAO+) and North Atlantic Oscillation − (NAO-)), and the four summer weather regimes (Atlantic Ridge, NAO-, Blocking, Atlantic Low). NCEP reanalyses are used as reference data since the methodology to calculate the weather regimes is the same as described in the work of Alvarez-Castro et al. (2018). Absolute biases in occurrence frequency are calculated for each regime and each GCM included in the initial EURO-CORDEX ensemble (Table A5 in Appendix). Simulations are then sorted based on their absolute weather regime frequency biases by calculating their rank in the same way as for the regional climate features (Eq. (2)).
Fig. 2. Correlation between station observations and WFDEI datasets for mean and 95th percentile values of wind speed over the 222 low-elevation and inland stations and the nearest WFDEI grid-point wind values. Winter is shown as black curves and summer as red curves.
The performance score (PS) of a sub-ensemble per variable, feature and metric is then simply defined as the mean of all calculated ranks d of the simulations making up the sub-ensemble. The mean PS (< PS >), calculated as the average over all PS associated with each variable, feature and metric, characterizes a sub-ensemble with regard to this criterion. Note that the different aspects to evaluate can be weighted differently depending on their attributed importance.
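As a minimal sketch of the ranking and performance score described above, the following Python snippet assumes that Eq. (1) and Eq. (2) are min-max normalisations over the ensemble, which is consistent with the statement that d is closer to 1 for better simulations. The function names and the sample RMSE values are illustrative only and are not taken from Tables A1-A4.

```python
from statistics import mean


def rank_to_maximize(values):
    """Assumed form of Eq. (1): d = (X - min) / (max - min), e.g. for Corr."""
    lo, hi = min(values), max(values)
    if hi == lo:  # all simulations perform identically
        return [1.0 for _ in values]
    return [(x - lo) / (hi - lo) for x in values]


def rank_to_minimize(values):
    """Assumed form of Eq. (2): d = (max - X) / (max - min), e.g. for RMSE."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [1.0 for _ in values]
    return [(hi - x) / (hi - lo) for x in values]


def performance_score(ranks, members):
    """PS of a sub-ensemble: mean rank d of its member simulations."""
    return mean(ranks[i] for i in members)


# Hypothetical RMSE values for four simulations (not taken from the paper).
rmse = [1.0, 2.0, 3.0, 5.0]
d = rank_to_minimize(rmse)          # the lowest RMSE receives d = 1
ps = performance_score(d, [0, 1])   # sub-ensemble made of simulations 0 and 1
```

The mean PS (< PS >) would then be the average of such PS values over all variables, features and metrics, optionally with weights.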

Climate sensitivity criterion
EURO-CORDEX regional climate simulations downscale 5 global climate models (GCMs) from the Coupled Model Intercomparison Project Phase 5 (CMIP5) (Taylor et al., 2012), which provides a state-of-the-art set of coordinated climate model experiments involving 20 global climate modelling groups from around the world. The selected sub-ensemble must span, to the largest possible extent, the full CMIP5 ensemble climate sensitivity range, in order to account for uncertainties in future global climate evolution. As an indicator of climate sensitivity, we use the Equilibrium Climate Sensitivity (ECS), which corresponds to the total amount of global warming induced by a doubling of the atmospheric carbon dioxide concentration, once the system reaches a new balanced energy state. The ECS spread (ECSS) of a sub-ensemble is measured using Eq. (3), where ECSsubens is the ECS array of the sub-ensemble and ECSinit the ECS array of the initial 11-member EURO-CORDEX ensemble.
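For illustration, the ECS spread can be sketched as below. This assumes that Eq. (3) is the ratio of the sub-ensemble ECS range to the initial-ensemble ECS range, an assumption consistent with the use of the condition ECSS = 1 to denote full coverage of the initial range. The function name and the intermediate ECS values are hypothetical.

```python
def ecs_spread(ecs_subens, ecs_init):
    """Assumed form of Eq. (3): sub-ensemble ECS range over initial ECS range."""
    return (max(ecs_subens) - min(ecs_subens)) / (max(ecs_init) - min(ecs_init))


# Illustrative ECS values in °C; the intermediate values are hypothetical.
ecs_init = [3.25, 3.6, 4.1, 4.1, 4.55]
ecss_full = ecs_spread([3.25, 4.1, 4.55], ecs_init)   # contains both extremes
ecss_part = ecs_spread([3.6, 4.1, 4.55], ecs_init)    # misses the lowest ECS
```

Under this reading, a sub-ensemble reaches ECSS = 1 only if it contains simulations driven by both the least and the most sensitive GCM.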
The ECS values of 27 CMIP5 GCMs are found in IPCC AR5 WG1 Chapter 9 (Flato et al., 2013; Sherwood et al., 2014; Hazeleger et al., 2012) (personal communication from W. Hazeleger for EC-EARTH). The range of this 27-member CMIP5 ensemble is 2.07-4.7 °C. The ECS range of the GCMs downscaled by the EURO-CORDEX RCMs is 3.25-4.55 °C, with the models driven by CNRM-CM5 and HadGEM-ES showing the lowest and highest sensitivity, respectively. Fig. 3 shows the ECS distribution of the 27 CMIP5 GCMs and the ECS distribution of the initial EURO-CORDEX ensemble driven by 5 different GCMs. This initial ensemble covers at most 51% of the CMIP5 ECS range and is skewed toward medium to high ECS.

Climatic variables future scenarios criterion
The range of responses to greenhouse gas forcing covered by the selected sub-ensemble for the variables of interest (tas, sfcWind, rsds, pr) has to span the initial ensemble response range to the largest possible extent. For adaptation in industry, it is highly important that the selected sub-ensemble accounts for the diversity in climate variable change signals and includes in particular the "high" and "low" scenarios. The variable responses (VR) are assessed as the differences between averages over the 2035-2065 and 1981-2010 periods. They are calculated for five quantities, namely the annual, winter and summer means, and the 10th and 90th percentiles of the distribution of daily values over the 30-year period, averaged over the 3 IPCC European domains (Fig. AI.39 and AI.40 upper part in Oldenborgh et al., 2013, Annex 1 Atlas of Global and Regional Climate Projections), over land for all variables and additionally over sea for sfcWind for wind power purposes. The variable response spread (VRS) of a sub-ensemble for a particular quantity and region is calculated using Eq. (4), where VRsubens is the variable response array of the sub-ensemble and VRinit the VR array of the initial ensemble. Maximum and minimum are used here rather than percentiles because, as mentioned above, the upper and lower response scenarios are of particular relevance for stakeholders. The mean VRS (< VRS >) is calculated as the average of all VRS associated with each variable, quantity and region, and characterizes a sub-ensemble with regard to this criterion.
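The following sketch illustrates how a variable response and its spread could be computed. It assumes that Eq. (4) is the ratio of the (maximum minus minimum) response of the sub-ensemble to that of the initial ensemble, in line with the remark that maximum and minimum are used rather than percentiles. All names and numbers are hypothetical.

```python
from statistics import mean


def variable_response(future_values, reference_values):
    """VR: difference between the future-period and reference-period averages."""
    return mean(future_values) - mean(reference_values)


def response_spread(vr_subens, vr_init):
    """Assumed form of Eq. (4): sub-ensemble VR range over initial VR range."""
    return (max(vr_subens) - min(vr_subens)) / (max(vr_init) - min(vr_init))


# Hypothetical annual-mean tas responses (°C) of a 5-member initial ensemble.
vr_init = [1.0, 1.5, 2.0, 2.5, 3.0]
vrs = response_spread([1.0, 2.0, 2.5], vr_init)

# <VRS>: average of the VRS over variables, quantities and regions
# (the two extra values stand in for other variable/quantity/region cases).
mean_vrs = mean([vrs, 1.0, 0.5])
```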

Sub-ensemble selection algorithm
The sub-ensemble member selection is a multicriteria selection process that we conduct iteratively by applying in a predefined order the criteria introduced in the previous section.
The first step of the sub-ensemble selection procedure consists in applying a first selection filter, related to the model performance criterion, to the initial ensemble. Consider an initial ensemble made up of N simulations, of which M remain after applying the first filter. There are then C(M, L) (the number of combinations of L elements among M) possible L-member sub-ensembles to choose among.
The second selection filter is related to the climate sensitivity criterion, and consists in imposing a minimum threshold on the sub-ensemble coverage of the initial ensemble climate sensitivity range. Because the initial ensemble covers only half of the CMIP5 ECS range, we retain only sub-ensembles with an ECSS equal to 1. Let P be the number of sub-ensembles that meet this condition.
The third selection filter concerns the criterion on variable responses. As for the second filter, only sub-ensembles whose coverage of the initial variable response range lies above a certain threshold are selected. Here we impose a < VRS > lying in the upper quartile of the < VRS > distribution of the P sub-ensembles.
The fourth selection filter, related to the performance score, consists in keeping only the best-performing pool of sub-ensembles. Here we impose a < PS > lying in the upper quartile of the < PS > distribution of the P sub-ensembles.
Several sub-ensembles can meet these filter conditions. To choose among the resulting possibilities the model genealogy criterion is used. The sub-ensemble that maximizes the GCM and RCM diversity is selected. If this last criterion is not sufficient to discriminate among possibilities, then the sub-ensemble that is characterized by the highest < VRS > is selected, thus less weight is given to the < PS > filter.
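The selection procedure above can be sketched in Python as follows. This is an illustrative outline rather than the code used in this work: the callables `ecss`, `mean_vrs` and `mean_ps` and the dictionaries `gcm_of` and `rcm_of` are hypothetical inputs, the quartile is computed with the default method of `statistics.quantiles`, and diversity is measured here as the number of distinct GCMs plus the number of distinct RCMs, which is one possible reading of the genealogy criterion.

```python
from itertools import combinations
from math import comb
from statistics import quantiles


def select_candidates(sim_ids, size, ecss, mean_vrs, mean_ps):
    """Apply filters 2 to 4 to all sub-ensembles of a given size."""
    subs = list(combinations(sim_ids, size))
    # Filter 2: full coverage of the initial ECS range (ECSS = 1).
    pool = [s for s in subs if abs(ecss(s) - 1.0) < 1e-9]
    if len(pool) < 2:
        return pool
    # Filters 3 and 4: upper quartile of the <VRS> and <PS> distributions.
    vrs_q3 = quantiles([mean_vrs(s) for s in pool], n=4)[2]
    ps_q3 = quantiles([mean_ps(s) for s in pool], n=4)[2]
    return [s for s in pool if mean_vrs(s) >= vrs_q3 and mean_ps(s) >= ps_q3]


def pick_final(candidates, gcm_of, rcm_of, mean_vrs):
    """Tie-break: largest GCM/RCM diversity first, then highest <VRS>."""
    def key(s):
        diversity = len({gcm_of[i] for i in s}) + len({rcm_of[i] for i in s})
        return (diversity, mean_vrs(s))
    return max(candidates, key=key)


# Number of possible L-member sub-ensembles among 11 simulations.
n_candidates = {size: comb(11, size) for size in (3, 5, 7)}
```

Enumerating all combinations is affordable here because an 11-member ensemble yields at most a few hundred sub-ensembles per size.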

Selected sub-ensembles
To propose a range of sub-ensemble sizes, the methodology has been applied here to select 3, 5 and 7 simulations out of the 11 initial ones (Table 1). The algorithm could produce sub-ensembles of other sizes if needed. In this case no unrealistic simulation was found in the first step, so the full 11-member ensemble could be considered. The main results of the sub-ensemble selection are presented in the following.

Selected 3-member sub-ensemble
There are 165 possibilities of selecting 3 simulations out of 11. Of these, 32 sub-ensembles meet the condition ECSS = 1. Two sub-ensembles meet the third and fourth filter conditions (the 75th percentile of the < VRS > distribution is 0.63; the 75th percentile of the < PS > distribution is 0.61). Only one of them is composed of GCMs and RCMs that are all different, and it is therefore selected:

Selected 5-member sub-ensemble
There are 462 possibilities of selecting 5 simulations out of 11. Of these, 231 sub-ensembles meet the condition ECSS = 1. Two sub-ensembles meet the third and fourth filter conditions (the 75th percentile of the < VRS > distribution is 0.80; the 75th percentile of the < PS > distribution is 0.61). One sub-ensemble maximises the GCM and RCM diversity, being composed of 5 different GCMs, and is therefore selected:

Selected 7-member sub-ensemble
There are 330 possibilities of selecting 7 simulations out of 11. 259 sub-ensembles meet the condition ECSS = 1. Two sub-ensembles meet the third and fourth filter conditions (the 75th percentile of the < VRS > distribution is 0.90; the 75th percentile of the < PS > distribution is 0.60). Two sub-ensembles maximise equally the GCM and RCM diversity. The one with the highest < VRS > is then selected among those 2 sub-ensembles:
The sub-selection was performed before bias-correction of the climate projections because it is based largely on comparing models with observations, which must be done before bias-correction. Moreover, for some specific impact studies, end users may need access to the raw model outputs. For example, the most widely used bias-correction methods are not able to correctly handle the most extreme values, and specifically designed methods are used in this case. Furthermore, end users may need to bias-adjust against their own reference datasets. Therefore, and because the chosen bias-correction does not modify the climate change signal (which implies, for example, that it does not affect the variable response ranges either), it was decided to apply the sub-sampling before bias-adjusting the climate model simulations.

Bias-correction method
Climate projections have biases that must be corrected or "adjusted". In this paper we use the Cumulative Distribution Function-transform (CDF-t) method, a detailed description of which can be found in Vrac et al. (2012). This method is a non-parametric quantile mapping-based technique which accounts for climate change (or changes in the underlying distribution). It corrects model values in a future period given observations and model data in a reference period. In addition, specific tuning was made, as described below. The training period used here for CDF-t is 1979-2005; it is the intersection of the historical periods of all EURO-CORDEX simulations and the WFDEI data period. This yields a similar bias-correction for the two scenarios, as they only diverge after 2006. Bias-correction is performed over moving windows of 20 years, advanced in 10-year steps. For each 20-year period, the bias-correction is calculated using the training period and the 20-year period, but the correction is saved only for the central 10-year period. The next 20-year period is then considered, shifted by only 10 years. For the first and last periods, more values are saved, at the ends of the time series.
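The moving-window scheme can be illustrated with the short sketch below. It reflects one reading of the description above: each 20-year window contributes its central 10 years, while the first and last windows also keep the years up to the start and end of the series. The function name is hypothetical, and the sketch assumes that the series length is compatible with the 10-year step.

```python
def correction_windows(first_year, last_year, window=20, step=10):
    """Return (window_start, window_end, save_start, save_end), years inclusive."""
    margin = (window - step) // 2  # years dropped on each side of a window
    windows = []
    start = first_year
    while start + window - 1 <= last_year:
        end = start + window - 1
        # The first and last windows are extended to the ends of the series.
        save_start = first_year if start == first_year else start + margin
        save_end = last_year if end == last_year else end - margin
        windows.append((start, end, save_start, save_end))
        start += step
    return windows


# Illustrative call for a 1971-2100 series.
windows = correction_windows(1971, 2100)
```

With these settings the saved sub-periods tile the full series without gaps or overlaps, so every year is corrected exactly once.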
The bias-correction method is applied separately to the climate variables: temperature at 2 m, precipitation, 10 m wind speed and surface solar radiation. One of the main adaptations of the methodology made here stems from the fact that the WFDEI observation-based dataset is provided on a grid (regular 0.5° × 0.5°) that has a coarser resolution than the model grid (rotated EURO-CORDEX grid). The procedure includes the following steps: use the differences or ratios (see below) between corrected and uncorrected model data on each observation grid cell to homogeneously correct each high-resolution model grid point value belonging to that observation cell (downscaling); merge all months and hours to create a bias-corrected data set with continuous time.
Due to the different nature of each variable, the method settings differ from case to case:
For tas, the standard CDF-t method is used (Vrac et al., 2012). Additive corrections are applied, with an identical adjustment added to each model grid point belonging to the same 0.5° × 0.5° observation cell for the downscaling stage.
For pr, the new adapted CDF-t method, which corrects both precipitation intensity and frequency (Vrac et al., 2016), is used. For the downscaling stage, a specific procedure is used: when the corrected value is lower than the uncorrected value over the observation grid cell, model values are multiplied by the ratio (≤ 1) between corrected and uncorrected values, which avoids negative values. In the opposite case, in order to avoid unrealistically large values due to a small denominator, an additive correction is applied: the (positive) difference between corrected and uncorrected values is homogeneously added to all corresponding model grid point values.
For sfcWind, a classical CDF-t method is used, but the change of grid is done using a fully multiplicative correction, as very small denominators (uncorrected very low wind speed values averaged over the coarse-resolution observation grid) leading to unrealistic corrected wind values were not found.
For rsds, a classical CDF-t method is used, but a specific downscaling procedure is applied because surface solar radiation is bounded. We applied a multiplicative correction, as for wind speed, when the corrected value is smaller than the initial value, and otherwise a multiplicative correction to the difference between the upper bound and the value, so that the final value does not exceed the upper possible bound. We a posteriori bounded the correction to the range between 0 and the observed maximum over the training period at each grid point.
Except for rsds, the posterior downscaling preserves within-cell averages, so that when re-averaging high-resolution corrected model data over each observation grid cell, the value obtained is identical to the observation-grid initially corrected value. For rsds, we relied on the posterior verification described below to assess the quality of the bias corrected set.
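The downscaling rules for tas, sfcWind and pr can be sketched as follows; the rsds case is left out because its bounded correction involves details not fully specified here. The sketch assumes that the uncorrected value on an observation cell is the mean of the model grid points it contains, and all function names and sample values are hypothetical.

```python
from statistics import mean


def downscale_additive(cell_values, corrected_mean):
    """tas: add the same offset to every model grid point of the cell."""
    offset = corrected_mean - mean(cell_values)
    return [v + offset for v in cell_values]


def downscale_multiplicative(cell_values, corrected_mean):
    """sfcWind: scale every grid point by corrected / uncorrected cell mean."""
    ratio = corrected_mean / mean(cell_values)
    return [v * ratio for v in cell_values]


def downscale_precipitation(cell_values, corrected_mean):
    """pr: multiplicative if the correction lowers the cell mean, else additive."""
    if corrected_mean < mean(cell_values):
        return downscale_multiplicative(cell_values, corrected_mean)
    return downscale_additive(cell_values, corrected_mean)


# Hypothetical precipitation (mm) at four model grid points of one cell.
cell = [0.0, 1.0, 2.0, 5.0]                  # uncorrected cell mean = 2.0
drier = downscale_precipitation(cell, 1.0)   # ratio of 0.5 applied
wetter = downscale_precipitation(cell, 3.0)  # offset of +1.0 applied
```

In both branches the average of the downscaled values equals the corrected cell value, which illustrates the preservation of within-cell averages noted above.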

Verification and quality control
After the bias-correction procedure, a number of verifications and quality control tests were carried out on the bias-corrected output files. In particular, we verified the bias-corrected model data averages over the training period, which should be close (albeit not exactly identical, due to the difference between the reference period and the moving correction periods) to the mean observed values from the low-resolution reference data. Changes with and without bias-correction are expected not to differ greatly (see Section 5.1). Large differences found in some data in fact revealed problems in the initial files, which were corrected. Finally, we also tested the high-resolution bias-corrected data directly against station data, and assessed the improvement due to the bias-correction method and the improvement due to the downscaling (see Section 5.1).
Fig. 4. Representation of the sub-ensembles characterised by an ECSS equal to 1 in the VRS-PS space for the 3-, 5- and 7-member cases in a), b) and c) respectively. The selected sub-ensembles are circled in red.
Table 3
Ensemble mean annual (yr) changes (differences between averages over the 2035-2065 and 1981-2010 periods) over the 3 IPCC European regions (R1, R2, R3) for the variables of interest (tas, sfcWind, rsds, pr).

Verification of temperature and precipitation
The differences between the bias-corrected simulations at 0.5° × 0.5° resolution and WFDEI data over the reference period were calculated separately for the winter and summer seasons for all variables, and are displayed for pr and tas as examples. Small differences are found, but they do not exceed a few tenths of mm for pr and tenths of °C for tas (see Fig. 5 for pr and Fig. 6 for tas, for the WRF331F regional model forced by IPSL-CM5A-MR).
For daily pr and tas data we tested the high-resolution bias-corrected data directly against ECA&D station data. We used 485 temperature stations and 1055 rain gauges from the ECA&D database. Statistics compare the absolute biases for (i) the final bias-corrected data at high resolution, (ii) the intermediate corrected data at WFDEI low resolution (0.5°x0.5°) and (iii) the original non-corrected data using a few metrics, separately for winter (Tables 4 and 6) and summer (Tables 5 and 7).
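A comparison of this kind can be sketched as below for a single station and its nearest model grid point. The exact metric definitions are not spelled out above, so the linear-interpolation percentile and the function names are assumptions made for illustration; multi-station statistics would average such values over all stations.

```python
from statistics import mean


def percentile(values, q):
    """Linear-interpolation percentile (q between 0 and 100) of a sequence."""
    data = sorted(values)
    pos = (len(data) - 1) * q / 100
    low = int(pos)
    high = min(low + 1, len(data) - 1)
    return data[low] + (data[high] - data[low]) * (pos - low)


def absolute_biases(model, station, q_low=5, q_high=95):
    """Absolute bias of the mean and of two percentiles, model vs. station."""
    return {
        "mean": abs(mean(model) - mean(station)),
        "q_low": abs(percentile(model, q_low) - percentile(station, q_low)),
        "q_high": abs(percentile(model, q_high) - percentile(station, q_high)),
    }


# Hypothetical daily temperatures (°C): model series one degree too cold.
biases = absolute_biases([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
```

Applying the same function to the raw, low-resolution corrected and high-resolution corrected series gives the three columns compared in the verification tables.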
For precipitation, there is a clear benefit associated with bias-correction, and the gain from higher resolution is found only for heavy precipitation (99th percentile, Q99). The gain from bias-correction is not obvious at low resolution in this case. This conclusion holds for both winter and summer. However, in summer, biases for Q99 remain high because such events are convective, of small scale, and difficult to capture even with a 0.11° resolution.
For temperature, there are clear improvements of the bias-corrected simulations when compared to station data, and the high-resolution correction provides improved results compared to the low-resolution correction. This holds for mean biases as well as for the 95th and 5th percentiles. Note the higher relative gain of high resolution for higher temperatures. Fig. 7 shows the changes (2071-2100 vs 1971-2000) obtained for precipitation at high resolution with and without bias-correction. Although slightly less pronounced, the bias-corrected (BC) changes show essentially similar features. The slightly lower amplitude of the changes may be due to an overestimation of rain amounts in the current climate (Vautard et al., 2013; Kotlarski et al., 2014). For temperature, the bias-correction has resulted in negligible modifications (Fig. 8), as expected.
Fig. 6. Same as in Fig. 5 for tas (in °C).
B. Bartók, et al. Climate Services xxx (xxxx) xxxx

Verification of wind speed and surface solar radiation
To validate 10 m wind speed simulations we used measurements from 356 ISD-Lite stations over Europe for the period 1973-2000. Fig. 9 shows the monthly biases before and after the bias-correction for the 11 regional climate models considered in the study, highlighting the added value of bias-correction. Except for CNRM (bias = −0.84 m/s), the non-bias-corrected simulations overestimate the 10 m wind speed by 0.77 m/s, while after bias-correction the biases become smaller (−0.27 m/s) and more convergent. Further monthly statistics (Table 8) show improvements in the lower edge of the distribution; however, the 95th percentile biases become larger after bias-correction. In terms of extreme values, the corrected model data (multi-model mean) give slightly higher values for low extremes and lower values for high extremes compared to the observations. Fig. 10 shows the changes in 10 m wind speed between the periods 2031-2060 and 1971-2000. The magnitude and the patterns of negative and positive changes are similar before and after the bias-correction procedure, which confirms that the bias-correction procedure does not affect the trends.
Surface solar radiation data have been validated against measurements from 61 GEBA stations over Europe for the period 1971-2010. Fig. 11 shows the biases of the original high-resolution (0.11° × 0.11°) non-corrected rsds data, the high-resolution (0.11° × 0.11°) data corrected with respect to WFDEI reference data, the intermediate low-resolution (0.5° × 0.5°) data corrected with respect to HelioClim data completed with WFDEI (where HelioClim data are missing), and the final WFDEI-HelioClim bias-corrected data at high resolution (0.11° × 0.11°). The raw modelled data overestimate rsds by 16.41 W/m²; a similar value was reported by Bartok et al. (2017) for a slightly different set of EURO-CORDEX simulations. The biases are reduced to 5.42 W/m² by applying bias-correction with WFDEI reference data. However, when using HelioClim satellite data, the biases become negative, with a bias of −9.42 W/m² at low resolution and −8.55 W/m² at high resolution.
Table 4
Verification of bias-correction (BC) performance vs. station data for daily precipitation (pr, in mm/day) for winter, HR - high resolution, LR - low resolution, multi-model mean.
Table 6
Verification of bias-correction (BC) performance vs. station data for daily mean temperature (tas, in °C) for winter, HR - high resolution, LR - low resolution, multi-model mean.
Fig. 7. Changes (2071-2100 vs 1971-2000) with (left) and without (right) application of the BC method, for the annual precipitation (pr), in mm/day, multi-model mean.
A validation of the monthly raw HelioClim data against the data from the 61 GEBA stations has been carried out, showing an underestimation of rsds (bias of −6.85 W/m²), which explains the negative sign of the biases of model simulations bias-corrected with WFDEI and HelioClim reference data. The absolute difference in bias between data using only WFDEI and WFDEI scaled with HelioClim data is 5 W/m².

Table 8
Verification of bias-correction (BC) performance vs. station data for monthly mean, Q05 and Q95 of 10 m wind speed in the 11 regional climate models (in m/s).

The biases after bias-correction are controlled by the reference data used in the procedure. For this reason, considerable effort was put into enhancing the quality of the reference data sets.
In terms of distribution, the WFDEI bias-corrected data show a small bias in low extremes (5th percentile, Q05) and a higher bias in high extremes (95th percentile, Q95). However, compared to the original simulations, the biases have been significantly reduced in both cases. HelioClim-WFDEI bias-corrected model data give slightly higher biases in the mean, but the added value of bias-correction is also obvious in this case, mainly in extremes. Moreover, the enhancement in resolution yields lower biases in high extremes, since local processes such as cloudiness, which directly influence rsds, are better captured in this case (Table 9). Fig. 12 shows the changes in rsds between the periods 2031-2060 and 1971-2000. The magnitude and the patterns of negative and positive changes are similar before and after the bias-correction procedure, which confirms that the bias-correction procedure does not affect the trends.
In general, impact studies should include bias-corrected data in terms of absolute values, because many applications involve thresholds or nonlinear processing of data. Such is the case for instance of wind power which necessitates the conversion of wind to power with a nonlinear power curve; however, these figures show that the procedure should not affect the magnitude of long-term changes (during the procedure both past and future values are corrected in line with observations).

Data standardization and access
The main goal of this work was to deliver and assess a number of climate datasets of essential climate variables to be widely used in the energy sector.

Table 9
Absolute biases of yearly mean, 5th percentile (Q05) and 95th percentile (Q95) of surface solar radiation from 11 regional climate models against GEBA station data for the period 1971-2010 (in W/m²).

... (setting up a common domain, common high resolution, harmonizing output formats, variable names, and much more) (Giorgi et al., 2009). However, before the delivery of the new bias-corrected model results, it is important that files are also standardized in a robust way. Two tasks are required: (i) because of the diversity of formats among model results, the naming conventions and metadata (NetCDF attributes) must be unified for ease of use, and (ii) information about the bias-correction must be added to clearly distinguish the original simulation results from their bias-corrected versions. These are the principal tasks of standardization of the bias-corrected datasets. Both 3-hourly and daily datasets of the 11 selected models (Table 1) for the period 1971-2100 have been standardized. The standard used is the Data Reference Syntax (DRS) for bias-corrected CORDEX simulations (Nikulin and Legutke, 2016). In addition, a check of input files and a verification of output files have been carried out before and after the standardization, respectively.
First, for the NetCDF files, the dimensions, variables and attributes have been set in accordance with the CORDEX tables. In particular, the coordinates have been transformed to rotated polar coordinates whose North Pole is as defined in the parameters of the CORDEX domain. The time unit has been unified to 'days since 1949-12-01 00:00:00' and time values have been transformed accordingly. The suffix 'Adjust' has been appended to the variable names, and the attribute «long_name» has been modified by adding 'Bias-Adjusted' in front of it.
Several global attributes have been modified: «project_id» and «product» have been set to 'CORDEX-Adjust' and 'bias-adjusted-output', respectively. Note that «contact», «institution» and «institute_id» give information about the Institut Pierre-Simon Laplace (IPSL), which is responsible for the bias-corrected data, while the information about the original data suppliers and original data files is given in the new global attributes «input_institution», «input_institution_id» and «input_tracking_id». In addition, more information about the bias-correction is provided by several new global attributes whose names are prefixed with 'bc', such as «bc_method», «bc_observation_id», «bc_period» and so on.
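The metadata changes described above can be illustrated with plain Python dictionaries, without reference to any particular NetCDF library. This is a sketch under stated assumptions: a standard calendar is assumed for the time conversion (climate models may use other calendars), the source attribute name `tracking_id` and the space after 'Bias-Adjusted' are assumptions, and the IPSL contact fields and 'bc' attributes are not filled in.

```python
from datetime import datetime

CORDEX_EPOCH = datetime(1949, 12, 1)


def days_since_epoch(timestamp):
    """Convert a datetime to 'days since 1949-12-01 00:00:00'."""
    return (timestamp - CORDEX_EPOCH).total_seconds() / 86400


def adjust_metadata(var_name, var_attrs, global_attrs):
    """Return the renamed variable and updated copies of both attribute sets."""
    new_var_attrs = dict(var_attrs)
    new_var_attrs["long_name"] = "Bias-Adjusted " + var_attrs["long_name"]

    new_global = dict(global_attrs)
    # Keep a trace of the original data supplier and files.
    new_global["input_institution"] = global_attrs.get("institution", "")
    new_global["input_institution_id"] = global_attrs.get("institute_id", "")
    new_global["input_tracking_id"] = global_attrs.get("tracking_id", "")
    new_global["project_id"] = "CORDEX-Adjust"
    new_global["product"] = "bias-adjusted-output"
    return var_name + "Adjust", new_var_attrs, new_global


# Hypothetical attributes of an original file.
name, v_attrs, g_attrs = adjust_metadata(
    "tas",
    {"long_name": "Near-Surface Air Temperature"},
    {"institution": "Example Institute", "project_id": "CORDEX"},
)
```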
Secondly, the output NetCDF file names have been set according to the naming rules of the CORDEX DRS (42). Each file name is composed of several CORDEX DRS elements, separated by underscores ('_'), in a prescribed order.
The files are currently freely accessible through the Earth System Grid Federation (ESGF) portals (https://esgf.llnl.gov/nodes.html).

Conclusions
A climate projection dataset has been produced for use primarily by the energy sector. It is characterized by state-of-the-art bias-corrected simulations, high spatial and temporal resolution in line with energy needs, a multi-model and multi-scenario ensemble to account for uncertainties in climate projections, standardized and quality-checked data, and a proposal of sub-ensembles of intermediate sizes. All these data are freely accessible through the Earth System Grid Federation (ESGF) nodes (https://esgf.llnl.gov/nodes.html).
Since the data volume is huge (e.g. one model, one variable, one scenario, 3-hourly data for 1971-2100 for Europe is about 260 Gb), the use of sub-ensembles is proposed. The main criteria for sub-sampling were set up in such a way that the selected sub-ensemble realistically represents the main climate features, spans the climate sensitivity range to the largest possible extent, accounts for the diversity in climate variable change signals, and includes in particular the "high" and "low" scenarios, an important feature for end users.
Bias-correction of climate model simulations is important for impact studies, where absolute values and thresholds are taken into account. The importance of bias-correction is obvious after validation of the bias-corrected simulations against observational data. For daily temperature, there are clear improvements in the high-resolution bias-corrected simulations (the mean absolute bias is reduced from 2.33 °C to 1.05 °C in winter and from 1.67 °C to 0.86 °C in summer). This holds for mean biases as well as for the 95th and 5th percentiles. For precipitation, there is also a clear benefit associated with bias-correction, especially at higher resolution (0.11°). The mean absolute bias is reduced from 1.22 mm to 0.52 mm in winter and from 0.85 mm to 0.25 mm in summer. The gain from higher resolution is found especially for heavy precipitation (99th percentile, Q99). However, in summer, biases for Q99 remain high because such events are convective, of small scale, and difficult to capture.
In case of 10 m wind speed after bias-correction the biases become smaller (−0.27 m/s) and more convergent. Further improvements are detected in the lower edge of the distribution, however the 95th percentile biases become larger after bias-correction. In terms of extreme values, the corrected model data (multi-model mean) give slightly higher values in case of low extremes and lower values in case of high extremes compared to the observations.
In case of surface solar radiation, the biases have been diminished to 5.42 W/m² (raw simulations having a bias of 16.41 W/m²) by applying bias-correction with WFDEI reference data. However, when using HelioClim satellite data, the biases become negative, giving −8.55 W/m² at high resolution.
In any case the bias-correction procedure does not affect the magnitude of long-term changes.
However, there are several limitations to be mentioned. In some cases the match between model outputs and energy needs is not achieved, e.g. wind speed at 10 m instead of the wind speed at ca. 100 m required for wind energy projects. Furthermore, the time frequency of the outputs, as well as the match between model spatial resolution and the spatial resolution required by end users, will probably need to be enhanced in the future. Regarding bias-correction limitations, remaining biases over some areas also have to be mentioned, as well as the unavailability of observations at high resolution. Moreover, bias-correction methods cannot fix fundamental problems of climate models; the basic assumption is that the chosen climate model represents a plausible climate change.
Fig. 12. Changes in rsds between the periods 2031-2060 and 1971-2000, multi-model mean before (left) and after bias-correction with WFDEI reference data (middle), and bias-corrected with HelioClim-WFDEI data (right) at 0.11° × 0.11° resolution.
Possible future work to improve the availability and quality of climate projections for the energy sector may include enlarging the number of simulations in the initial ensemble, the use of improved reference data for bias-correction, as well as an increased adequacy between stored model outputs and energy needs (relevant variables, at appropriate level and frequency) in future regional modelling experiments.
The lessons learned from the user engagement in this work can be expressed as follows. End users in the energy sector are used to handling meteorological datasets, either observed or forecasted. For climate change impact studies, however, because climate model simulations provide different approximations of the main climate characteristics for a given period, ensembles have to be considered and bias adjustment is needed to get closer to the observed climatology. Furthermore, different scenarios for the future greenhouse gas and aerosol pathways lead to different projected climates for a future period of interest. Therefore, for this kind of impact study, several scenarios and climate models have to be considered as equally probable future evolutions. However, end users are generally not able to handle a very large ensemble of possible evolutions, either because of limited storage space or because of time-consuming impact models. This is why an approach has been proposed here to sub-sample a large ensemble of climate simulations so that the main uncertainty range can be covered with a smaller set of possible evolutions.
All the work presented in this study has been elaborated in the framework of the finalized proof-of-concept Clim4Energy project (http://clim4energy.climate.copernicus.eu/) in collaboration with the European Climatic Energy Mixes (ECEM) project (http://www.wemcouncil.org/wp/european-climatic-energy-mixes/) (Troccoli et al., 2018), both funded by the European Union's Copernicus Climate Change Service (C3S) program. The main goal of Clim4Energy was to bring together the expertise of climate research and service centres and energy practitioners to demonstrate, from case studies, the value chain from essential climate variables to actionable information in the energy sector. In continuation, based on the results of these projects, the Copernicus Climate Change Service (C3S) Energy is now developing an operational climate service for the energy sector (https://climate.copernicus.eu/operational-service-energy-sector).

Acknowledgments
This study was funded by the European Union's Copernicus Climate Change Service (C3S) program through the Clim4Energy service. We benefited from fruitful discussions and data exchanges with a parallel C3S service (European Climatic Energy Mixes). In particular, the daily gridded HelioClim radiation data was used for solar radiation. Special thanks to the EURO-CORDEX model data providers, as well as to the providers of the WFDEI, ECA&D, ISD-Lite and GEBA observational data.

Table A5
Evaluation table for the weather regimes (absolute bias in occurrence frequency in %).