The ALEX17 diurnal cycles in complex terrain benchmark

The NEWA ALEX17 experiment was conducted with the objective of characterizing the wind conditions upstream of the Alaiz Test Site for the validation of flow models. From the intensive operational period, a case study has been selected for a Wakebench benchmark consisting of four consecutive days with relatively persistent winds from the North. The validation is centered around a 118-m mast at the Alaiz site and six additional masts located along the valley and at the lee side of a ridge delimiting a 8-km long area of interest. The benchmark is a follow-up of the GABLS3 diurnal cycle benchmark in flat terrain to test mesoscale-to-microscale transient and steady-state modeling methodologies in the assessment of stability-dependent bin-averaged wind conditions. Meso-micro methodologies reduce the wind speed mean bias from 32%, at 3-km mesoscale, to ±5%. Beyond mean bias mitigation, these initial results demonstrate the added value of meso-micro coupling at reproducing non conventional wind conditions at the test site like high-shear low-level jets in stable conditions and negative wind shear in unstable conditions. The benchmark also discusses the challenges of each meso-micro methodology going forward.


Introduction
The ALEX17 benchmark is a follow-up of the GABLS3 diurnal cycle benchmark in flat terrain [1] to test mesoscale-to-microscale methodologies in complex terrain. Further background about meso-micro flow modeling in the wind energy context can be found in [2].
In the GABLS3 case it was demonstrated that microscale models, implemented in computational fluid dynamics (CFD) codes, could successfully reproduce a diurnal cycle leading to a nocturnal low-level jet by driving the models (both LES and URANS) with mesoscale tendencies, which represent large-scale forcings from horizontal pressure gradients and advection. The same methodology is tested here in complex terrain by analyzing a 4-day episode of persistent northerly winds at the Alaiz test site near Pamplona, Spain. Likewise, we also compare this methodology with the online coupling method in WRF using the large-eddy simulation (LES) option at microscale. The Alaiz experiment (ALEX17) [3], was carried out in the New European Wind Atlas (NEWA) project to characterize the inflow conditions of the test site where a significant influence of mesoscale phenomena was expected due to the larger dimensions of the mountain range compared to Perdigão (the alternative in NEWA), with smaller size, whose wind conditions are arguably more dominated by microscale. Figure 1 shows the elevation map and the layout of the ALEX17 mast instrumentation and Z-transect scanning lidar positions. The topography is characterized by a 6-km long mountain-valley-ridge configuration, delimited by the Alaiz mountain range to the south and the Tajonar ridge to the north. A 10-km long Z-transect was built from 2D (A-and C-transect) and 3D (B-transect) Doppler scanning lidar measurements. The campaign and data report provides a detailed description of the data collection, filtering and data availability [4]. This work presents initial comparison results between several models with observations at the met mast positions. This paper presents initial results of the ALEX17 benchmark focusing on the assessment of wind speed predictions. Further results on additional quantities and the model evaluation script can be found in the associated GitHub repository [5].

Benchmark setup
This benchmark is intended for flow models that can reproduce wind conditions at microscale, relevant for wind turbine siting and energy yield assessment, with realistic large-scale forcing characterized by mesoscale modelling. Therefore, the benchmark is designed with transient meso-micro modelling in mind. Nevertheless, steady-state modellers were also welcomed to participate targeting bin-averaged wind conditions for different stability classes.
The ALEX17 intensive campaign, where all the instruments were operational, lasted 4 months from August to December 2018. During this period numerous mountain boundarylayer phenomena were captured like lee waves and hydraulic jumps downstream from Alaiz [3]. However, in this initial benchmark we selected a synoptic weather episode with persistent winds from the North (the prevailing wind sector) to test models in a more conventional setting that facilitates the interpretation of the results.The selected episode corresponds to a period when the synoptic conditions were relatively uniform, resulting in constant northerly winds for four consecutive days, starting at 00:00 UTC on September 30, 2018 and lasting until 23:50 UTC on October 3, 2018. During these days, the wind direction was constant during the entire period at the mountain top, as observed at MP5, with winds varying from 4 to 15 ms −1 . The validation data consists of hourly-averaged wind speed data from seven meteorological masts. From the 118-m tall mast at the mountaintop (MP5), we use measurements from cup anemometers at 40, 80, 100 and 118 m above ground level (agl). From the 80-m tall masts at the valley floor, we use measurements from cup anemometers (M1 and M5) and 3D sonic anemometers (M2 and M7) at 10, 20, 40, 60 and 80 m agl. At the lee side of the Tajonar ridge we use measurements from another 80-m tall mast (M3) and a 60-m tall mast (M6), both also equipped with 3D sonic anemometers.
The M7 mast, located in the center of the valley is used as reference to categorize wind conditions in terms of atmospheric stability. To this end, we use 20-Hz measurements from a Gill WindMaster Pro sonic anemometer at 10 m agl to compute the stability parameter ζ = z/L, based on the Obukhov length L and the height above ground level z. Then, bin-averages are computed following the classification of [6] resulting in the distribution of Table 1. Due to the low sample count, we shall drop the very stable class from the analysis.

Model participants
The ALEX17 Diurnal Cycles benchmark counts with the participation of several meso-micro methodologies, all of them using the Weather Research and Forecasting (WRF) model at mesoscale [7] (Table 2). Regarding meso-micro coupling approaches, the WRF-LES model is the only one doing dynamic coupling within the same code. The others adopt an offline coupling methodology by extracting input quantities from WRF to drive a CFD code at microscale that is based on either Reynolds-Averaged Navier Stokes (unsteady URANS or steady RANS) or large-eddy simulation (LES) turbulence modeling.

WRF Mesoscale
Reference mesoscale simulations have been produced adopting the WRF settings that were used in the production run of the New European Wind Atlas [8] [9]. Three one-way squared nested domains of (27,9,3)

WRF-LES Multi-scale
A WRF-LES [11,12] simulation is set-up on six telescopic one-way nested domains centered at M7 with a 1-to-3 ratio of reduced grid-spacing, i.e. (27, 9, 3, 1, 0.33, 0.11) km. The domains are square with (140, 130, 118, 106, 94, 82) grid points on each side. MYNN PBL scheme is used in the first 5 domains while a large eddy simulation (LES) model based on a 1.5-order TKE-closure subfilter-scale model [13] is activated in the sixth domain. All other settings follow the NEWA production run configuration [9] as described in the reference WRF simulation.
Other WRF-LES setup has already been used with success at Alaiz to simulate an episode of Southerly winds where an atmospheric hydraulic jump was observed on the lee-side of the Alaiz mountain [14].

WRF-Alya (URANS and LES) with 1D tendencies
Alya-RANS and Alya-LES microscale flow models are implementations of RANS and LES turbulence closures in the highly parallelized finite element code Alya, developed at Barcelona Supercomputing Center. The RANS and LES models share implementation, using the same linear finite element functions for spatial discretization schemes, the same Monin Obukhov wall functions and the same coupling to mesoscalar flow. Therefore the obtained differences are due to the turbulence model.
The RANS model uses the k-closure of Sogachev et al. [15] with temperature coupling [16], while, the Smagoringsky sugbrid scale model is used in the LES option [17]. Both simulations use the reference WRF simulation as input data following the 1D tendencies approach whereby horizontally-averaged (45x45 km) mesoscale pressure gradient and advection terms from WRF are added to the microscale momentum equations as volumetric forcing terms [18] to drive the flow. The energy equation is driven by the surface temperature, which is inferred using Monin-Obukhov similarity theory from the 2 m temperature extracted from WRF. Periodic boundary conditions are imposed laterally, Monin-Obukhov at the surface and symmetry at the top of the domain.
The topography has been simulated over a 16.5 × 15 km area, using horizontal resolution of 50 m in the central part for RANS and 35 m for LES. The topography surface mesh is extended to 33 × 43 km through a buffer mesh to reach a uniform height at the lateral boundaries. The domain height is 8 km. The elements closer to the ground have a vertical length of 2 m for the RANS simulations and 10 m for the LES simulation.

WRF-Ellypsis3D (URANS) with 3D tendencies
The EllipSys3D flow solver [19], [20] is a multiblock finite volume CFD code based on a solution of the discretized incompressible Navier-Stokes equations in a general curvilinear coordinates. The equations are solved using a collocated variable arrangement and a revised Rhie/Chow interpolation technique is integrated in order to avoid odd/even pressure coupling. The Message Passing Interface (MPI) routines and libraries are employed for code parallelization, where a nonoverlapping domain decomposition technique is utilized, enabling the code for a highly efficient execution on distributed/shared memory HPC architectures.
The URANS model used in the present work is based on the k-model [21], and ABL relevant revisions proposed by [22] (for implementations details see [23]).
The considered problem is solved on a 24 km x 24 km wide spatial domain, with vertical extension of app. 9 km. 144 tanh-distributed grid points are used in a vertical direction, starting from a 1 cm high cell. Both lateral dimensions are discretized using 240 grid points, where the focus inner zone of 7 km x 7 km is discretized by 40 m cells. Terrain is gradually smoothed out in horizontal directions, starting from approximately 3 km from the corresponding boundaries, and eventually reaching a constant height at all lateral boundaries, in order to fully integrate the periodic boundary conditions. The no-slip wall BC for velocities and mesoscale derived surface temperature (based on mesoscale data for 2m temperature, heat flux and stability) for energy equation are applied on the bottom surface, whereas symmetry BC (employing a buffer damping upper layer of 3 km) is used on the domain top. The damping layer is implemented with a ramp function going from 0 to 1 in range from 5 to 6 km AGL (0 bellow it, 1 above it), where 1 means that the vertical velocity is set to 0.
The model is driven by a 3D tendency-based approach as in [18] [1], here extended to allow for spatial (both horizontal and vertical) and temporal variations. The mesoscale "driver" of a microscale 24 km x 24 km zone is a set of advection and geostrophic wind tendencies extracted from 8 x 8 (3 km discretization) meso cells. Similarly, the wall temperature, derived in a way described above, from the 24 x 24 (1 km discretization) meso cell set is used as a temperature boundary condition. Mesoscale data are obtained using a similar WRF setup as described in section 3.1 with an additional nest at 1-km resolution.
The two WRF-EllipSys3D cases considered have numerically identical setup and differ only in a way the production run was performed. While the baseline WRF-EllipSys3D case was run for a consecutive period of 5 days (1 day spin-up time), 8x12 case was performed using 8 -(6+12) 18 hour runs, where initial 6 hours for each run was spin-up time, and the results were stitched together into a 4 day period considered.

WRF-MeteodynWT (RANS) statistical downscaling
A statistical meso-micro coupling method for wind resource assessment application is developed in Meteodyn. WRF outputs at 9 km resolution are classified by wind direction and stability categories (based on the Monin-Obukhov length obtained from WRF and stability bins in Table  1) to obtain ensemble-averaged inlet and top boundary conditions for MeteodynWT RANS K − l model. Three classes of atmospheric stability (unstable, neutral and stable for 0.2 < ζ) are simulated via the turbulence kinetic energy and the turbulence length scale [24]. A force term is added to relax the microscale flow to the mesoscale wind profiles in the meso zone that starts 500 m above ground level, vertical profiles of mesoscale model output being assimilated into the CFD model as a body force [25] [26] [27]. Below this level, no relaxation is applied because of the high topography resolution in the microscale CFD model. A 20 km microscale domain is simulated with 25 m horizontal resolution and 4 m vertical resolution resulting in a mesh size of 16 million points.  Figure 2 shows reference wind conditions at M7 mast throughout the 4-day evaluation period. Background shading has been used to indicate the stability class of each hourly period according to the binning of Table 1.

Time series
Following an industry convention, Wind shear is computed by the power-law exponent (α) between 40 and 80 m.
where S is the wind speed. Turbulence intensity T I is the wind speed standard deviation (σ S ) to the mean wind speed (S): where σ S is derived in the simulations from the turbulent kinetic energy according to: Even though M7 is located at the center of the valley in relatively flat terrain, the wind conditions are not horizontally homogeneous as the mesoscale model results might indicate. During windy periods in neutral and unstable conditions, increased mixing produces more uniform wind characteristics. However, during stable conditions and relatively low wind speeds, we observe high fluctuations of turbulence intensity and wind shear and more than 90º offset in the wind direction with respect to MP5 at the Alaiz (not shown).
While there is large spread in the temporal evolution of the meso-micro models, they introduce corrections to the mesoscale wind fields to reproduce this heterogeneity in the diurnal variation of the wind conditions. This is a good sign of effective coupling of mesoscale forcings. Nevertheless, we shall have an easier interpretation of model performance by plotting bin-averaged profiles and integrating error metrics. Figures 3, 4 5 show ensemble-averaged vertical profiles at the 7 masts for the unstable, neutral and stable bins respectively. We are interested in understanding how microscale models correct the shape of the profiles from mesoscale to approach the observed ones. In unstable conditions (Figure 3) we observe a good match of mesoscale profiles at the valley masts M7, M5 and M2. However, in the lee of the Tajonar ridge, there is large variability of the simulations at M1, M3 and M6 sites under the influence of the ridge wake. WRF and WRF-LES results underestimate the wind speed deficit in the ridge wake due to insufficient resolution to resolve flow over the steep ridge. In contrast, CFD models, at resolutions of at least 50 m, show better agreement of the wind shear with observations. At the Alaiz site, the large vertical wind speed gradient from mesoscale data at MP5 is corrected at microscale, with Ellypsis3D predicting negative wind shear as in the observations.   Figure 5) prevent flow separation and we observe a much better agreement at the lee of Tajonar. At MP5 we notice how the models predict a low-level jet forming a few hundred meters aloft. The resulting increases in wind shear at MP5 is reasonably well predicted by the microscale models.

Bin-averaged vertical profiles
In stable conditions the differences between Alya and Ellypsis3D become more apparent than in neutal or unstable conditions. This might indicate a limitation of using 1D (vs 3D) tendencies which, in this case, produce an overestimation of the wind speed in the valley and Alaiz masts. In effect, during stable conditions the influence of mesoscale forcings over microscale terrain and roughness effects is more significant than during more turbulent neutral and unstable conditions. Hence, the value of adding more detailed mesoscale inputs is noticed in a more realistic coupling with microscale.

Metrics
While previous sections provide a qualitative assessment of the models, it is only by integrating the differences with respect to the observations that we can quantify the performance of the models and assess the differences from the different meso-micro coupling strategies. To this end we use the mean normalized BIAS from the bin averages: where k is the stability bin and n k = 6 is the number of bins. BIAS is used to show how the overall error is mitigated by error compensation between bins. This is a desirable feature in wind resource assessment applications that focus on long-term  Figure 5. Vertical profiles of wind speed at each mast for stable (s) conditions averaged quantities. The evaluation script also includes the mean absolute error (M AE), which is not sensitive to error compensation and, therefore, a more robust metric to assess how the model predicts a range of stability conditions. Notice that we are not using a weighted average BIAS using the sample count from each bin. Instead, we'd rather give each bin the same weight and assess the predictive capacity of the models in all stability bins regardless of their frequency which would make the assessment more site-specific. For the same reason, we normalize by the observed bin-averaged wind speed to remove dependency on the wind speed magnitude. This approach requires that all the bins share good statistical convergence such that we avoid the influence of outliers. From Table 1 we see that, with the exception of the very stable bin, which is dropped in the assessment, all the bins have more than 10 hourly samples so we should obtain a relatively robust estimate of the mean BIAS for this four-day period. Figures 6 and 7 show the overall mean BIAS for each simulation for all the masts combined and individually. The overall BIAS shows how all the meso-micro methodologies outperform the reference mesoscale model even if this is pushed to a resolution of 333 m. At 111 m resolution the WRF-LES reduces the overall BIAS to -2% and the CFD models are at ± 3 to 5% while the reference mesoscale simulation is at 32% at 3 km resolution and 7.4% at 333 m resolution.
The WRF-MeteodynWT model presents near-zero BIAS by compensating errors between bins. It presents the largest errors at M6 similarly to WRF-LES. The WRF-LES online coupling methodology produces low mean BIAS and a fair agreement with the vertical profiles of velocity and turbulence intensity (not shown). It is conspicuous however that the M6, right at the leeside of the Tajonar hill, is particularly hard to match for the model, and drags its overall bias on wind shear (not shown). Unlike CFD models, WRF cannot adapt its grid size to different regions in the domain. Therefore, such under performance can be traced to the model's 111 m horizontal resolution and vertical discretization that includes only 6 vertical levels below 100 m.  Figure 6. Overall mean normalized bias.
That setup seems to be insufficient for capturing the flow regime given the steepness of the hill and its small characteristic length compared to that of the Alaiz mountain. Alya results produce similar overall performance with URANS than with LES. The URANS model presents large errors in stable conditions that the LES model mitigates. The LES model also improves performance at the lee side of Tajonar ridge (M6 and M3), a zone where the role of turbulence modeling is more significant than in other more exposed sites.
In most cases, Alya under-predicts the velocity and over-predicts the turbulence intensity (not shown). We believe that the use of periodic boundary conditions is the culprit of this behavior. The wake from Alaiz perturbs the incoming flow due to the use of a too-short domain. Flow visualization and analysis of the mast where higher turbulence intensity over-prediction occurs confirms this idea. An extended domain with a slight rotation shall mitigate this issue.
EllipSys3D shows better performance when using 8x12 hr runs than when running the full 4-day cycle in one run. This approach is also used in mesoscale modeling (e.g. [9]), by running intervals of a few days instead of a continuous simulation, to make sure the simulation does not drift away too much from the observed large-scale atmospheric patterns introduced by the global input data. For the same reason, EllipSys3D restarts every 12 hours to make sure the microscale simulation does not drift away too much from the mesoscale fields. Besides that, as periodical boundary conditions are used, the wake of Alaiz exiting the south/south-east boundary of the domain and reentering it from the north/north-west in a continuous run of (1+4) days can be expected to have much greater influence on the obtained results, compared to the (6+12) 18 hour runs, where this process is partially avoided.
Comparing the CFD approaches with WRF-LES, there is a more notable dependency on stability in the CFD models, which show error compensation between stability classes within each mast, while WRF-LES shows the same sign in the error per mast regardless of the stability class. We have two hypotheses to explain this. First of all, in LES mode, WRF can improve the simulation of the turbulent heat transport. However, the radiation, land, and surface layer physics, which account for other important heat exchange processes, still required to be parameterized along all the nests. The advantage over the CFD models, is that these schemes, together with the energy transport equation, remain the same, potentially leading to a bias that is consistent towards the same direction regardless of the model resolution. Another major difference resides in the use of lateral periodic boundary conditions in CFD, as this recirculates artificial topographic wake effects from the Alaiz mountain into the inflow as discussed before.

Conclusions
Initial results of the ALEX17 Diurnal Cycles benchmark are presented on the testing of mesoscale-to-microscale flow models dealing with a 4-day period of sustained northerly winds in the complex topography of the Alaiz test site. The overall performance of the models is quantified in terms of the normalized wind speed BIAS at 7 met masts for a range of stability classes, from very unstable to stable conditions. There is an overall overprediction of the wind speed of 32% if only mesoscale modeling is used at a typical resolution of 3 km. This mean BIAS is reduced below ±5% by further coupling with a microscale model.
MeteodynWT achieves a BIAS of less than 1% using a statistical coupling methodology based on steady RANS simulations. Among the transient models, the WRF-LES online coupling methodology produces the lowest BIAS. However, at 111 m resolution, this model setup is challenged in areas of steep terrain. This is where CFD models present an advantage in their capability to adapt the mesh to the terrain complexity and obtain better agreement in adverse flow conditions like at the lee side of Tajonar or at the Alaiz mountaintop, where the wind shear becomes negative in unstable conditions, as predicted, e.g., by Ellipsys3D. On the other hand, current implementations using periodic boundary conditions are problematic as they recirculate topographic wakes. This is not done in MeteodynWT resulting in lower spread of the errors. By restarting the simulation every 12 hours, Ellypsis3D mitigates the influence of periodic boundary conditions by preventing the microscale model from drifting too much away from the mesoscale inputs.
The results from Alya show that turbulence modelling with LES leads to slightly better results than with URANS. By comparing with Ellipsys3D, we noticed the benefits of using 3D in stable conditions, when mesoscale effects become more prevalent in the lower part of the atmospheric boundary layer.