A physically interpretable statistical wake steering model

Wake steering models for control purposes are typically based on analytical wake descriptions tuned to match experimental or numerical data. This study explores the potential of a data-driven statistical wake steering model with a high degree of physical interpretation. A linear model trained with large eddy simulation data estimates wake parameters such as deficit, center location and curliness from measurable inflow and turbine variables. These wake parameters are then used to generate vertical cross sections of the wake at desired downstream locations. In a validation against eight boundary layers ranging from neutral to stable conditions, the trajectory, shape and available power of the far wake are accurately estimated. The approach allows the choice of different input parameters, while the accuracy of the power estimates remains largely unchanged. A significant improvement in accuracy is shown in a benchmark study against two analytical wake models, especially under derated operating conditions and stable atmospheric stratifications. While results are encouraging, the model’s sensitivity to training data needs further investigation.


Introduction
Wind turbine wakes cause considerable power losses and increased loads at downstream machines. Control strategies to mitigate these negative effects are gaining support in the wind energy community. In particular wake steering, or wake redirection through intentional yaw misalignment (Dahlberg and Medici, 2003; Wagenaar et al., 2012), is regarded as a promising control strategy. A yaw misalignment introduces a lateral thrust force component, redirecting the downstream wake and generating two counter-rotating vortices around lower and upper tip height that curl the wake into a kidney shape (Howland et al., 2016).
Table 1. Summary of simulation parameters and classification into neutral (NBL), near neutral (NNBL), weakly stable (WSBL) and stable (SBL) boundary layers. The size (Lx,p, Ly, Lz) of the domains is normalized by the rotor diameter (D = 126 m). All parameters are identical in precursor and main simulations, except for the domain size, which is extended in streamwise direction (Lx,m). tp is the simulated time of the precursor run, ug and vg the geostrophic wind, ∂θ ∂t⁻¹ the heating rate, H the sensible heat flux and z0 the surface roughness length.

[Table 1 column headers: tp, Lx,p, Lx,m, Ly, Lz, ug, vg, ∂θ ∂t⁻¹, H, z0. The table rows were not recoverable.]

2020)), which uses a non-hydrostatic incompressible Boussinesq approximation of the Navier-Stokes equations and the Monin-Obukhov Similarity Theory to describe surface fluxes. A precursor simulation without a turbine and a subsequent main simulation with one turbine make up the simulation chain.

Precursor simulations
Realistic turbulent inflow conditions are generated from an initially laminar flow by adding random perturbations in a precursor simulation with cyclic horizontal boundary conditions. A regularly spaced grid on a right-handed Cartesian coordinate system with ∆ = 5 m is used in the boundary layer, while at higher altitudes the vertical grid size increases by 6 % per cell to save computational costs. The Coriolis parameter corresponds to 55° N and default numerical schemes are used. To study the potential of a statistical wake steering model under different inflow conditions, eight boundary layers (BLs) ranging from a neutral to a strongly stable BL are used as reference inflow conditions, all having approximately the same wind speed and direction at hub height. As reported by Vollmer et al. (2016), wake steering is ineffective in a convective boundary layer; convective conditions are therefore not considered in this study. The total simulation time and domain size are determined empirically until convergence to a stationary state occurs and are dependent on the size of the largest eddies that explicitly need to be resolved. The details of the precursor simulations are summarized in Table 1.
BL1 and BL2 portray neutral conditions with roughness lengths representing low crops (z0 = 0.1 m) and parkland (z0 = 0.5 m), which are typical landscapes found in Northern Germany. Following Basu et al. (2008), constant cooling rates at the surface are
Stationary inflow conditions are taken at 2.5 rotor diameters (D) upstream of the turbines simulated in the main simulations (Sect. 2.2) and averaged over a line of size 2 D in crosswise direction and a period of 60 minutes. These inflow conditions are assumed to be undisturbed, hence far enough from the turbine that induction does not play a role. The most relevant inflow parameters are displayed in Fig. 1, showing comparable wind speed for all simulations and dissimilar atmospheric conditions related to the simulated stratification. A more stable boundary layer, indicated by a larger Obukhov stability parameter (z L⁻¹), typically has a higher shear (α) and veer (δα) and lower turbulence intensity (TI). The spread of the parameters between the main simulations (see Sect. 2.2) in the same boundary layer, indicated by the standard deviation as whiskers in Fig. 1, is small enough to be neglected.

Main simulations
After generating stationary inflow conditions with a precursor simulation, a simulation with one turbine is performed. Information on turbulence characteristics from the precursor simulation is fed to the main simulation by adding a turbulent signal to a fixed mean inflow (turbulence recycling method) far upstream of the turbine. A radiation boundary condition ensures undisturbed outflow downstream of the simulated turbine. The size of the recycling area is equal to the domain size of the precursor simulation and the domain size of the main simulation is only extended in streamwise direction by placing a turbine at x = 6 D downstream of the recycling area. Wake data until x = 10 D are used for analysis, but the domain is extended to x = 13 D to eliminate blockage effects. The simulation time consists of 20 minutes spin-up time followed by 60 minutes used for analysis.
The simulated turbine is an Actuator Disc Model with Rotation (ADMR) representing the NREL 5 MW turbine, having a hub height of 90 m and a rotor diameter D of 126 m (Jonkman et al., 2009), as implemented in Dörenkämper et al. (2015). Turbine yaw angles (φ) of −30°, −15°, 0°, 15° and 30° are simulated, where a positive yaw angle is here defined as a clockwise rotation of the turbine when looking from above. Pitch angles (β) of 0°, 2.5° and 5° are simulated to study the effect of the thrust force on downstream wake characteristics. This adds up to a total of 120 main simulations with one turbine, i.e. eight inflow conditions times five yaw angles times three pitch angles. The effect of φ and β on the thrust coefficient C_T is shown in Fig. 2, which illustrates that the effect of the turbine yaw angle on the thrust coefficient is symmetric around zero.
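The simulation matrix described above can be enumerated in a few lines. This is only an illustrative sketch: the labels `BL1` to `BL8` follow the naming used in the text, while the variable names are assumptions made here.

```python
from itertools import product

boundary_layers = [f"BL{i}" for i in range(1, 9)]  # eight reference inflow conditions
yaw_angles_deg = [-30, -15, 0, 15, 30]             # phi, positive = clockwise seen from above
pitch_angles_deg = [0.0, 2.5, 5.0]                 # beta

# Each combination of inflow condition, yaw angle and pitch angle is one main simulation.
cases = list(product(boundary_layers, yaw_angles_deg, pitch_angles_deg))

# Number of main simulations that share one boundary layer.
per_boundary_layer = sum(1 for bl, _, _ in cases if bl == "BL1")
```

The count per boundary layer (five yaw angles times three pitch angles) corresponds to the 15 main simulations over which the inflow statistics in Fig. 1 are computed.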
The wake is described using the normalized wake deficit, defined as $u_{nd} = (u_{wake} - u_{\infty}) / u_{\infty,h}$, where $u_{wake}$ represents the observed wind speed in the wake, $u_{\infty}$ the undisturbed inflow 2.5 D upstream at the same height and $u_{\infty,h}$ the undisturbed inflow at hub height. It is assumed that the advection velocity is constant in streamwise direction (assumption of frozen turbulence) and that the wake behaves as a passive tracer in the ambient wind (Larsen et al., 2008).

Development of the statistical wake steering model
This section describes the development of the statistical wake steering model (SWSM). Figure 3 displays a flowchart of the training and execution (including testing) procedure of the model. The model is trained with the LES data representing reference inflow conditions (BLs) generated in Sect. 2. From these data, input parameters and key wake steering parameters are derived.
A coefficient matrix is subsequently generated by performing a multi-task Lasso regression. This matrix can be used in the execution (testing) of the model to estimate the key wake steering parameters from new input parameters (e.g. new inflow conditions), which can subsequently be used to produce gridded wake data.

Defining key wake steering parameters
A statistical model will not be able to produce a full multidimensional wake, but rather estimate parameters describing the wake at downstream positions, for instance at one rotor diameter intervals as done here. Since curled wakes are considered, key wake steering parameters are in this study retrieved with the Multiple 1D Gaussian method (Sengers et al., 2020). In the example below, the wake of a turbine with a +30° yaw angle in BL1 at x = 5 D is considered (Fig. 4a). This method fits a simple 1D Gaussian at every vertical level (k = 1...K) where information is available to obtain a set of local normalized wake center deficits (A = A_1 ... A_K), wake center positions (µ = µ_1 ... µ_K) and wake widths (σ = σ_1 ... σ_K). Subsequently, another Gaussian can be fitted through the local wake center deficits in the vertical (Fig. 4b) to find the overall normalized wake center deficit (A_z) and vertical position with respect to hub height (µ_z), as well as the vertical extension of the wake (σ_z). The local wake center position and width at the vertical level k that corresponds to µ_z are subsequently considered as lateral wake center position (µ_y) relative to the turbine location and wake width (σ_y). Next, by fitting a second order polynomial through the local wake center positions between upper and lower tip height (Fig. 4c), one obtains a measure for the curl (coefficient of quadratic term) and tilt (coefficient of linear term) of the wake. An expression for the wake width as function of height is found by repeating this step for the local wake widths (Fig. 4d) to obtain coefficients s_a and s_b. After this procedure, the wake can be described by the set of dimensionless parameters displayed in Table 2.
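The fitting steps above can be sketched on a synthetic, noise-free wake. Several choices here are assumptions rather than the published implementation: the Gaussian is fitted by a least-squares parabola through the logarithm of the profile (a simple stand-in for a nonlinear fit), the polynomial is taken in height relative to hub height, and all names and numbers are hypothetical.

```python
import numpy as np

def fit_gaussian_1d(x, u):
    """Fit u(x) = A exp(-(x - mu)^2 / (2 sigma^2)) via a parabola through ln|u|.

    Stand-in for a nonlinear least-squares fit; assumes a single-signed profile.
    """
    sign = np.sign(u[np.argmax(np.abs(u))])
    keep = np.abs(u) > 1e-3 * np.abs(u).max()      # avoid the logarithm of ~0
    c2, c1, c0 = np.polyfit(x[keep], np.log(np.abs(u[keep])), 2)
    sigma = np.sqrt(-1.0 / (2.0 * c2))
    mu = c1 * sigma**2
    amplitude = sign * np.exp(c0 + mu**2 / (2.0 * sigma**2))
    return amplitude, mu, sigma

# Synthetic cross section (lengths in D, heights relative to hub height).
y = np.linspace(-2.0, 2.0, 161)                    # lateral coordinate
z = np.linspace(-0.5, 0.5, 21)                     # lower to upper tip height
mu_true = -0.30 + 0.10 * z - 0.60 * z**2           # curled, tilted centerline
A_true = -0.40 * np.exp(-(z - 0.05)**2 / (2 * 0.45**2))
u_nd = A_true[:, None] * np.exp(-(y[None, :] - mu_true[:, None])**2 / (2 * 0.40**2))

# Step 1: one lateral Gaussian per vertical level k.
A, mu, sigma = np.array([fit_gaussian_1d(y, row) for row in u_nd]).T

# Step 2: vertical Gaussian through the local wake center deficits.
A_z, mu_z, sigma_z = fit_gaussian_1d(z, A)

# Step 3: lateral center and width at the level closest to mu_z.
k = int(np.argmin(np.abs(z - mu_z)))
mu_y, sigma_y = mu[k], sigma[k]

# Step 4: second order polynomial through the local wake centers.
curl, tilt, _ = np.polyfit(z, mu, 2)               # quadratic and linear coefficients
```

Repeating step 4 with `sigma` in place of `mu` would give the width coefficients s_a and s_b. With real LES data the profiles are noisy and not exactly Gaussian, so a nonlinear fit with bounds is likely the more robust choice.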
Note that this method cannot accurately capture the splitting of the wake into two separate cells, which might occur under strong veer as discussed in Vollmer et al. (2016). Such cases will result in inaccurate values for the key wake steering parameters and should be filtered out before applying the statistical model described in Sect. 3.3.

Table 2. Defined dimensionless key wake steering parameters. The normalized wake deficit is computed as described in Sect. 2.2. All length parameters are nondimensionalized by the rotor diameter D.

Scalar parameter                     Symbol
Amplitude normalized wake deficit    A_z
Vertical wake center displacement    µ_z
Width wake center height             σ_y
Vertical extent                      σ_z
Curl                                 curl
Quadratic wake width parameter       s_a
Linear wake width parameter          s_b

Input parameters
A regression model (Sect. 3.3) is used to estimate the key wake steering parameters in Table 2. A set of measurable inflow and turbine variables is used as input parameters, which are made dimensionless to make the model more universally applicable, at least within the variability found between the simulations in this study. This set of parameters is presented in Table 3.
Although these input parameters might all have their own isolated effect on the wake propagation, they are heavily correlated

[Table 3 residue: column headers Variable, Symbol, Calculated; first row: Turbine yaw angle, φ, φ. The remaining rows were not recoverable.]

variables, second-order polynomial and interaction terms are added, as well as an intercept (unity), extending the original set of three input variables to ten input parameters.
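The step from three input variables to ten input parameters can be illustrated as follows. The ordering of the terms and the function name `expand_inputs` are assumptions made for this sketch; only the counts (one intercept, three linear, three squared and three interaction terms) follow from the text.

```python
from itertools import combinations

def expand_inputs(x):
    """Return intercept, linear, squared and pairwise interaction terms of x."""
    linear = list(x)
    squares = [v * v for v in x]
    interactions = [a * b for a, b in combinations(x, 2)]
    return [1.0] + linear + squares + interactions

# Three hypothetical (already transformed, dimensionless) input variables.
features = expand_inputs([0.5, -2.0, 3.0])
```

One such row per simulation, stacked over all n samples, would form the design matrix used by the regression.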

Regression model
Since the LES data set has a relatively small sample size, a linear model is chosen, as such models perform well on small sample sizes, reduce the risk of overfitting compared to more complex Machine Learning models and are highly interpretable (Hastie et al., 2009).
The regression is formulated as a linear model in matrix form

$$\hat{Y}_{n \times d} = X_{n \times p} \, B_{p \times d} \qquad (1)$$

which estimates the output variable Y based on the design matrix X and coefficient matrix B. Matrix dimensions indicated in Eq. (1) represent the sample size n, number of downstream distances d and number of input parameters p. Note that p contains the transformed variables, their second-order and interaction terms, as well as intercepts. Since these parameters are highly correlated and not all relevant, the coefficients are determined based on a Lasso regression method as introduced by Tibshirani (1996). This guarantees a shrinkage of the number of variables through an ℓ1 regularization parameter found by cross-validation.
Additionally, it is desired that the same set of input parameters is used to estimate the output variable at all downstream distances.This is guaranteed in the multi-task Lasso method introduced by Obozinski et al. (2006), which is implemented in the multi-task Lasso algorithm from the scikit-learn Python library (Pedregosa et al., 2011).See Appendix A for further explanation.
Whereas the training is more complex than Ordinary Least Squares fitting, the predictions in the testing are generated through simple matrix multiplication as shown in Eq. (1). The algorithm is therefore highly interpretable, easy to implement and computationally inexpensive.
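A minimal sketch of this training and prediction step with scikit-learn's `MultiTaskLasso` is given below. The data are synthetic, the regularization strength `alpha` is fixed by hand instead of being found by cross-validation, and the intercept is carried as an explicit unity column with `fit_intercept=False`; all of these are assumptions for illustration, not the settings used in the study.

```python
import numpy as np
from sklearn.linear_model import MultiTaskLasso

rng = np.random.default_rng(0)
n, p, d = 120, 10, 7   # samples, input parameters, downstream distances (4 D to 10 D)

# Synthetic design matrix: unity column (intercept) plus random stand-ins.
X = np.hstack([np.ones((n, 1)), rng.normal(size=(n, p - 1))])

# Synthetic coefficients: the same three parameters matter at every distance.
B_true = np.zeros((p, d))
B_true[[0, 1, 4], :] = rng.normal(size=(3, d))
Y = X @ B_true + 0.01 * rng.normal(size=(n, d))

# Hand-picked alpha; the unity column replaces the built-in intercept.
model = MultiTaskLasso(alpha=0.05, fit_intercept=False, max_iter=10000)
model.fit(X, Y)

B = model.coef_.T      # scikit-learn stores (d, p); Eq. (1) uses (p, d)
Y_hat = X @ B          # prediction is a plain matrix product

# Input parameters kept by the regression (non-zero rows of B).
selected = np.flatnonzero(np.any(B != 0.0, axis=1))
```

Because the multi-task penalty acts on whole rows of B, an input parameter is either used at all downstream distances or dropped at all of them, which is the property required in the text.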

Wake composition: reversed Multiple 1D Gaussian
The coefficient matrix B can be used to estimate the key wake steering parameters in Table 2 from new input parameters. This information can be used to compose a vertical cross section of the wake deficit using a methodology that is very similar to the reverse of the Multiple 1D Gaussian method described in Sect. 3.1. In this section, the key wake steering parameters from the example in Fig. 4 are used to illustrate this composition method, hence no predictions with the coefficient matrix B are done. For a fair comparison of the composed wake to the original LES, a resolution identical to the LES (∆ = 5 m) is chosen.
First, using the information from the vertically fitted Gaussian (A_z, µ_z, σ_z), one can compute the amplitude of the normalized wake deficit Â at each height (k = 1...K) by simply evaluating the Gaussian function (Fig. 4b). This information is then considered the local wake center deficit at every height.
Next, to identify local wake center positions μ, one can calculate the distance to the origin (turbine location) by evaluating the second order polynomial using the curl and tilt parameters and the wake center location (µ_y, µ_z). It is assumed that curl continues outside of the rotor area (red dashed line in Fig. 4c). Since this was not included in the fitting of the parameters, ($\sigma(z)^2 / \sigma_{tip}^2 + (z - z_{tip})^2 / (z_{tip} - z_{base})^2 = 1$) is used to determine the wake width. Here, σ(z) indicates the wake width at height z, σ_tip the wake width at lower or upper tip height, z − z_tip the distance to tip height and z_tip − z_base the distance from tip height to the surface or wake top. Using information on wake center height µ_z and wake width σ_y, one obtains a set of local wake widths.
Finally, a simple 1D Gaussian can be evaluated at every vertical level using the information from Â, μ, σ, resulting in a two-dimensional grid filled with u_nd values. These data can then be plotted in a cross section, shown in Fig. 4e. Comparing this composed wake to the original LES in Fig. 4a, one can see that this simplification still contains much of the original information. The shape of the wake is conserved, as well as the displacement of the wake center. The maximum deficit of the composed wake center appears to be slightly larger than in LES. Additionally, in the composition the maximum wake deficit is always in the center (definition of a Gaussian), which is not necessarily true in LES or reality.
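The composition can be sketched as below for heights within the rotor area, so that the elliptic continuation of the wake width is not needed. The exact polynomial forms are assumptions made here: both polynomials are written in height relative to the wake center, and the width polynomial scales σ_y. The function name and parameter values are hypothetical.

```python
import numpy as np

def compose_wake(y, z, A_z, mu_z, sigma_z, mu_y, sigma_y, curl, tilt, s_a, s_b):
    """Compose a cross section of normalized wake deficit (rows: z, columns: y)."""
    dz = z - mu_z
    A_hat = A_z * np.exp(-dz**2 / (2.0 * sigma_z**2))          # local center deficit
    mu_hat = mu_y + tilt * dz + curl * dz**2                   # local center position
    sigma_hat = sigma_y * (1.0 + s_b * dz + s_a * dz**2)       # local wake width
    lateral = (y[None, :] - mu_hat[:, None])**2 / (2.0 * sigma_hat[:, None]**2)
    return A_hat[:, None] * np.exp(-lateral)

# Hypothetical parameter values, lengths in D, heights relative to hub height.
y = np.linspace(-2.0, 2.0, 81)
z = np.linspace(-0.5, 0.5, 21)
u_nd = compose_wake(y, z, A_z=-0.4, mu_z=0.0, sigma_z=0.45, mu_y=-0.3,
                    sigma_y=0.4, curl=-0.6, tilt=0.1, s_a=-0.5, s_b=0.0)

# Grid indices (z, y) of the strongest deficit.
k_max, j_max = np.unravel_index(u_nd.argmin(), u_nd.shape)
```

As noted in the text, the strongest deficit of a composed wake always sits at the wake center, here at z = µ_z and y = µ_y.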

Wake composition validation
The procedure described in Sect. 3.4 is repeated for all 120 simulations and 1 D ≤ x ≤ 10 D at every D. The metric used here to evaluate the accuracy of this method is the percentage error of available power in the rotor area of the composed wake relative to when computed with the original LES wind field (PE [%] = (P_comp − P_LES) / P_LES × 100). A few things can be noted by studying the results shown in Fig. 6. The composition shows a large systematic positive bias in the near wake (x ≤ 3 D). This is due to the so-called double bell shape of the near wake, with a speed up region around hub height. When attempting to fit this with a simple 1D Gaussian, the deficit in the rotor area is underestimated, resulting in a positive percentage error.
For this reason, the near wake will be excluded from analysis in the remainder of this work. Further downstream (x ≥ 8 D) a small negative systematic bias can be identified, which is due to the 'top-hat' shape of the wake deficit as a result of temporal averaging. This is not captured by a Gaussian function and will on average result in an overestimation of the wake deficit amplitude. The large (negative) outliers typically indicate cases where the wake does not have a Gaussian shape, such as the separation into two cells under strong veer. The median error in the region 4 D ≤ x ≤ 10 D is, however, smaller than 1 %.
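The error metric can be written down compactly. In this sketch the available power is approximated by the mean of the cubed wind speed over grid points inside the rotor disc; that proxy, the function names and the example numbers are assumptions, since the text does not spell out how available power is computed.

```python
import numpy as np

def available_power(u, y, z, y_hub=0.0, z_hub=0.0, radius=0.5):
    """Proxy for available power: mean of u^3 over grid points in the rotor disc."""
    Y, Z = np.meshgrid(y, z)                       # shapes (len(z), len(y))
    in_rotor = (Y - y_hub)**2 + (Z - z_hub)**2 <= radius**2
    return np.mean(u[in_rotor]**3)

def percentage_error(p_comp, p_les):
    """PE [%] = (P_comp - P_LES) / P_LES * 100."""
    return (p_comp - p_les) / p_les * 100.0

# Hypothetical example: the composed wake underestimates the deficit slightly.
y = np.linspace(-1.0, 1.0, 41)
z = np.linspace(-1.0, 1.0, 41)
u_les = np.full((z.size, y.size), 0.80)            # normalized wind speed in LES
u_comp = np.full((z.size, y.size), 0.82)           # normalized wind speed, composed
pe = percentage_error(available_power(u_comp, y, z), available_power(u_les, y, z))
```

Because power scales with the cube of the wind speed, a 2.5 % higher wind speed in this example already gives a positive error of roughly 7.7 %, which illustrates why an underestimated deficit in the near wake produces a large positive bias.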

Optimization
Figure 6. Accuracy of the wake composition procedure expressed as a percentage error of available power of a virtual downstream turbine. At each downstream distance, data from all 120 simulations are considered.

Numerous combinations of input parameters, consisting of different variables and their transformations, are possible. In order to find the most accurate solution, an optimization procedure needs to be executed. Here, the goal is to minimize the absolute percentage error of available power over all training data, i.e. all considered simulations and downstream distances (4 D ≤ x ≤ 10 D). All combinations of input parameters, one variable per cluster, and transformations have been tested in order to find the optimal solution, denoted oSWSM. Additionally, a second version, denoted cSWSM, is considered that is based on the cost efficient (relatively cheap to obtain) input parameters {φ, z L⁻¹, TSR}. Since the input variables are known, the optimization procedure is limited to finding the optimal transformations for these variables. Both oSWSM and cSWSM will be considered in Sect. 4.
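The exhaustive search over candidate input sets can be sketched as follows. The clusters follow the grouping of variables given in the text (yaw, atmospheric inflow, turbine variables); the list of transformations is hypothetical, as are the identifier names, and the scoring of each candidate (training SWSM and computing the absolute percentage error) is left out.

```python
from itertools import product

# One variable is drawn from each correlated cluster.
clusters = {
    "yaw": ["phi"],
    "inflow": ["d_alpha", "alpha", "zL", "TI"],
    "turbine": ["C_T", "C_Q", "TSR"],
}
transformations = ["identity", "inverse", "sqrt", "ln"]  # hypothetical candidate set

variable_sets = list(product(*clusters.values()))

# Every variable set combined with every assignment of transformations.
candidates = [
    tuple(zip(variables, transforms))
    for variables in variable_sets
    for transforms in product(transformations, repeat=len(variables))
]

# For cSWSM the variables are fixed, so only the transformations vary.
c_candidates = [c for c in candidates if [v for v, _ in c] == ["phi", "zL", "TSR"]]
```

The best candidate would then be the one with the lowest error on the training data, for example `min(candidates, key=error_fn)` with a user-supplied, here hypothetical, `error_fn`.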

Benchmark models
SWSM is benchmarked against the Gaussian (GAUS) and the Gaussian-Curl Hybrid (GCH) models present in version 2.2.2 of the FLORIS framework (NREL, 2020). Although secondary steering is not studied here, the GCH is still included because of its incorporation of initial wake deflection and the added wake recovery term. Both models share the same tuning parameters for the far wake onset (α_floris, β_floris) and wake recovery rate (k_a,floris, k_b,floris), in which subscript floris is added to avoid confusion with other parameters. Analogous to the training of SWSM discussed in Sect. 3.5, the values of the tuning parameters are determined by minimizing the absolute percentage error (APE) of available power over all considered simulations and downstream distances (4 D ≤ x ≤ 10 D). Information on inflow (e.g. u_h, TI) is taken from the LES data. The models are trained independently of each other and will therefore have different values for the tuning parameters.
The data used for the tuning include simulations with yaw and pitch angles. FLORIS adjusts the thrust coefficient numerically for yaw angles, but not for pitch angles. For this reason, the thrust coefficient lookup table was adjusted by the ratio C_T,pitch / C_T,nopitch found in LES (Fig. 2).

Performance on training data
This section displays the performance of the statistical wake steering model (SWSM) and the benchmark models when using all 120 simulations as training data. This is done to illustrate the effect of choosing a different set of input parameters on the
Following the optimization procedure as described in Sect. 3.5, the optimal combination of input parameters, denoted oSWSM, was found to be the set {φ, δα⁻¹, ln(C_T)}, and the optimal transformations of the cost efficient input parameters, denoted cSWSM, are {φ, √(z L⁻¹), TSR}. Figure 7 illustrates the performance of these versions as a function of downstream distance.
The accuracy of oSWSM and cSWSM is very comparable, with the latter showing a slightly larger spread. However, both show a median percentage error smaller than 2 % and the interquartile range stretches over less than 5 %. This suggests that SWSM allows for using different sets of input parameters at a minimal loss of accuracy.
To test whether a higher accuracy is achieved when more variables are included, allSWSM uses all (non-transformed) available variables of Table 3 as input. No optimization has been carried out for this test. The high correlation between variables is not an issue due to the use of the regression method described in Sect. 3.3. This results in an accuracy gain closer to the turbine, since the near wake is more dynamic and therefore needs more parameters to explain its variability. Further downstream this effect diminishes, but is still visible. This suggests that adding more variables can indeed lead to a higher accuracy.
Figure 8 shows the accuracy of GAUS and GCH when using all simulations for the tuning described in Sect. 3.6. GCH consistently gives a higher power estimate than GAUS due to the added wake recovery term. The median shows a negative bias closer to the turbine for both models, which is opposite to SWSM in Fig. 6. The biases of the benchmark models and oSWSM are of comparable magnitude. Most striking is the variability of the benchmark models, which is an order of magnitude larger than that of oSWSM. The reason for this will be systematically evaluated in Sect. 4.2 and 4.3.

Performance on testing data
A simple leave-one-out cross-validation technique is used to discuss the performance of SWSM compared to the benchmark models. The models are trained or tuned with seven out of the eight BLs (Fig. 1) and tested on the remaining one representing a new inflow condition. Eight evaluations can therefore be performed, i.e. each BL being tested once. Note that for each evaluation a set of optimal parameters and transformations is determined, which can differ from oSWSM and cSWSM in Fig. 7.
Similarly, GAUS and GCH are tuned again, resulting in new values for their tuning parameters. Since the models show similar behavior in relation to the downstream distance as discussed before, here only the collective result over 4 D ≤ x ≤ 10 D is discussed.
Figure 9 presents the results of this validation procedure. The shaded areas indicate a significant improvement (green), insignificant difference (yellow) or significant decline (red) of the SWSM accuracy compared to the best performing benchmark model. Statistical significance is determined using an independent Welch's t-test on the absolute percentage error with a p-value < 0.05. This test assumes a normal distribution, but can deal with unequal variances between data sets.
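For reference, the Welch statistic and its degrees of freedom can be computed with the standard library alone. The two samples below are hypothetical stand-ins for absolute percentage errors of two models; the p-value itself would additionally require the cumulative distribution function of Student's t distribution, which is not part of this sketch.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va = variance(a) / len(a)      # squared standard error of sample a
    vb = variance(b) / len(b)      # squared standard error of sample b
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    dof = (va + vb) ** 2 / (va**2 / (len(a) - 1) + vb**2 / (len(b) - 1))
    return t, dof

# Hypothetical absolute percentage errors [%] of two models in one boundary layer.
ape_model_a = [1.0, 2.0, 3.0, 4.0]
ape_model_b = [2.0, 6.0, 10.0, 14.0]
t_stat, dof = welch_t(ape_model_a, ape_model_b)
```

Unlike Student's t-test, the two sample variances are not pooled, which is what allows the test to handle the very different spreads of the compared models.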
For most BLs, SWSM shows a significant improvement over GAUS and GCH. The systematic biases (indicated by the medians) are similar for all models, in the order of a few percent, but the variability is greatly reduced in SWSM. The main reason for this is that the benchmark models do not include a pitch angle parameter β. Although the C_T tables in the models are corrected in this study, the tunable parameters do not account for this. To clarify, LES finds a decreasing wake size (in both horizontal and vertical extent) with increasing β. This is accurately captured by SWSM, but GAUS and GCH produce a wake of similar size independent of β or C_T. The inclusion of this effect is a notable improvement of SWSM. Arguably, this is currently not a major disadvantage of the benchmark models, as turbines tend to operate without derating (β = 0°). However, this might become more important in the future with control strategies such as axial induction control (e.g. Corten and Schaak, 2003;
Furthermore, BL5 contains the worst results for all models. Figure 1 indicates that this is an extreme case, as it has the highest Obukhov stability parameter and veer along with the lowest turbulence intensity. This is problematic for the models, since it is an inflow condition unlike anything they were trained for.
Figure 9 also shows that oSWSM and cSWSM have a very comparable performance, with the former generally having a slightly higher accuracy. However, the differences are small, illustrating the flexibility that SWSM provides by allowing the user to choose the input parameters.

Operation without derating
For a fair comparison between SWSM and the benchmark models, this section only considers simulations representing operation without derating the turbine (β = 0°). The training (selection of parameters for oSWSM and cSWSM) and tuning (tuning parameters of GAUS and GCH) have been repeated and the results of the leave-one-out cross-validation technique are displayed in Fig. 10. The variability of the benchmark models in (near) neutral conditions (BLs 1, 2, 7 and 8) decreases considerably, but generally SWSM still produces significantly more accurate results. In (weakly) stable boundary layers (BLs 3 to 6) GAUS and GCH still show a large variability and occasionally a large systematic bias, which is not true for SWSM. These results suggest that SWSM outperforms the benchmark models especially under stable stratifications, the conditions in which wake steering is deemed most effective. Furthermore, the model performance is assessed for partial wake operation. Figure 11 compares the models when the downstream turbine is moved 0.5 D to the left (from the upstream observer's point of view). Generally, the variability is greatly reduced since the deficit is smaller. The benchmark models display a systematic negative bias in all BLs, which is not true for SWSM.
Only BL8 shows a poorer performance of SWSM, but no satisfying explanation has been found why exactly this BL has this behavior.

A case study is displayed in Fig. 12a that presents the LES wind field in a weakly stable boundary layer (BL3). The wake has a clearly defined curl and a wake center left of the hub. The oSWSM wind field in Fig. 12b shows that the wake shape and center position are well represented. The GAUS model (Fig. 12c), however, produces a circular wake shape and a larger wake deflection to the left. The percentage errors indicated at the top of the figure show that SWSM has a high accuracy for both virtual turbines, but GAUS has large biases due to the misplacement of the wake center. Under stable conditions the wind veer is relatively high, adding a crosswise force pointing towards the right above hub height. This force effectively opposes the lateral thrust force component introduced by yaw misalignment pointing to the left, reducing the deflection of the wake. The opposite is true for negative yaw angles, where wake deflection is enhanced by veer. This asymmetry has been pointed out in Fleming et al. (2015), Vollmer et al. (2016) and Sengers et al. (2020). This effect is implicitly included in SWSM, but not in the benchmark models. Figure 12d illustrates that these models show an ever further deflecting wake, whereas the SWSM wake settles at a smaller lateral displacement corresponding to LES. This not only explains the negative bias of the benchmark models in Fig. 11, but also their larger spread observed in Fig. 10. This result strengthens the previous indication that SWSM is superior under stable stratifications.

Discussion
Although the results in Sect. 4 are encouraging, the current limitations of a statistical model need further exploration. The model is sensitive to the data used for training, and encountering conditions that were not covered can result in large errors, as illustrated by the strongly stable BL5 in Fig. 9. In this study, large eddy simulation data were used to train the model, the generation of which is computationally expensive; generating LES data for each turbine and location is not feasible. However, when properly trained, the accuracy of SWSM is expected to be significantly higher than that of analytical models, as it is specifically trained for a certain situation. Consideration of measured data from the field or wind tunnel might offer potential, although an appropriate measurement strategy and duration need to be explored.
If desired, further development of the model is needed to include the near wake, which can for instance be done by including the super-Gaussian description introduced by Blondel and Cathelain (2020).Additionally, an extension from a two-turbine setup to a wind farm could be desired.This could be achieved by for instance applying the superposition principle as done in GAUS and GCH, although the accuracy of SWSM under disturbed inflow needs attention.
Lastly, a simple evaluation of computational costs has been carried out to ensure that SWSM is sufficiently computationally efficient. The speed test consists of producing cross sections downstream of the turbine and therefore excludes the computational resources needed to generate the LES data and to train or tune the models. This test was executed on a laptop running Ubuntu 20.04.1 with eight 1.80 GHz Intel i7-8550U CPUs and 8 GB RAM, having a minimum number of processes running in the background. All files containing relevant information, such as inflow variables, were stored locally at the same location.
Run times are given as an average and standard deviation over 40 iterations, representing all simulations with β = 0°, such that no adjustment of the benchmark's thrust coefficient lookup table is needed. Table 4 shows that when producing results for the whole region considered in this study (4 D ≤ x ≤ 10 D), the run time of SWSM is comparable to GCH and slightly higher than GAUS. When simulating only one downstream distance, for instance exactly where a turbine is located, SWSM performs similarly to GAUS. These results suggest that SWSM is quick enough to be used for control purposes.

Conclusions
This study explores the potential of a statistical wake steering model that is data-driven, but retains a high degree of physical interpretation. After training with large eddy simulation data, a model consisting of only linear equations is able to accurately describe the curled wake in terms of trajectory, shape and available power. It uses measurable inflow and turbine variables as input parameters and, although an optimal set of parameters is found, it allows for choosing different, possibly more cost efficient, input parameters at a minimal loss of accuracy. In a benchmark against the Gaussian and Gaussian-Curl Hybrid models, the statistical wake steering model generally shows a significant improvement in accuracy. In particular, it performs better under derated operating conditions and stable atmospheric stratifications, since it implicitly includes the effect of turbine

Figure 1. Summary of the most relevant inflow parameters (60 min averages), given as mean (dots) and standard deviation (whiskers) over the 15 main simulations. Considered are wind speed (u_h) and turbulence intensity at hub height (TI), wind shear (α) and veer (δα) over the rotor area and the Obukhov stability parameter (z L⁻¹) at 2.5 m. Equations for these variables can be found in Table 3.

Figure 2. Overview of the effect of yaw angle φ and pitch angle β on thrust coefficient C_T. Whiskers indicate the standard deviation between all eight BLs.

Figure 3. Flowchart describing the training (a) and execution (b) procedure of the statistical wake steering model. The section in which each process is described is given in parentheses. The coefficient matrix generated in (a) is used in (b).

Figure 4. Exemplary figures (BL1, φ = +30°, x = 5 D) illustrating the key wake steering parameters. (a) Normalized wake deficit cross section (contour) of original LES data. (b) The local normalized wake center deficits A, (c) local wake center positions µ, (d) local wake widths σ. Black crosses indicate LES, red solid lines the relation fitted according to the Multiple 1D Gaussian method (Sect. 3.1) and red dashed lines the assumed continuation in the reversed Multiple 1D Gaussian composition method (Sect. 3.4). (e) Cross section (contour) of the normalized wake deficit after applying the reversed Multiple 1D Gaussian composition method.
in LES as shown in Fig. 5. One can identify several highly correlated clusters, representing 1) yaw [φ], 2) atmospheric inflow [δα, α, z L⁻¹, TI] and 3) turbine variables [C_T, C_Q, TSR]. Note that wind speed is not included, since it is approximately constant in all simulations and correlated with both inflow and turbine parameters. Because of the highly correlated clusters, it is hypothesized that one is able to achieve reasonable accuracy in estimating key wake steering parameters with the regression

https://doi.org/10.5194/wes-2021-43 Preprint. Discussion started: 21 May 2021. © Author(s) 2021. CC BY 4.0 License.

Table 3. Set of dimensionless input parameters. dir is the wind direction [°], z is the height above the surface [m], u_h and σ_{u_h} are the mean and standard deviation of the wind speed at hub height [m s⁻¹], u_eff is the rotor effective wind speed [m s⁻¹], T is thrust [N], Q is torque [N m] and ω is rotor speed [rad s⁻¹]. Subscript ut indicates upper tip and lt lower tip height.
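
The clustering of correlated inputs described for Fig. 5 can be reproduced in miniature with a Pearson correlation matrix. The sketch below uses purely synthetic data driven by one shared latent variable; the column names, the sample size and the signs of the correlations are assumptions for illustration and do not represent the LES results.

```python
import numpy as np

# Hypothetical column labels standing in for the atmospheric inflow cluster.
names = ["veer", "shear", "stability", "TI"]

rng = np.random.default_rng(1)
latent = rng.normal(size=200)  # shared driver, e.g. a stratification proxy

# Synthetic samples: three variables follow the latent driver, one opposes it.
data = np.column_stack([
    latent + 0.1 * rng.normal(size=200),
    latent + 0.1 * rng.normal(size=200),
    latent + 0.1 * rng.normal(size=200),
    -latent + 0.1 * rng.normal(size=200),
])

# rowvar=False treats each column as one variable.
corr = np.corrcoef(data, rowvar=False)

for name, row in zip(names, corr):
    print(name.ljust(10), np.round(row, 2))
```

Strongly correlated columns carry largely redundant information, which is why a subset of inputs from each cluster may suffice for the regression.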

Figure 5. Correlation matrix of the dimensionless input parameters in LES. Colors indicate a positive (red) or negative (blue) correlation

a deviation is expected. Similarly, the fitted parameters s_a, s_b can be used in a second order polynomial to find the set of local wake widths (σ), normalized by the wake width at the height of the wake center (σ_z), within the vertical extent of the rotor area. Outside of the rotor area (red dashed line in Fig. 4d), the definition of an ellipse ( σ(z)

accuracy of SWSM, as well as to highlight the differences between GAUS and GCH. In Sect. 4.2 and 4.3 a validation of the model with testing data will be shown.

Figure 9. Performance of GAUS (black), GCH (blue), oSWSM (red) and cSWSM (green) using a leave-one-out cross-validation technique. Performance is displayed as a percentage error of available power. Each box includes data from 15 main simulations and 4 D ≤ x ≤ 10 D. The shaded areas indicate a significant improvement (green), insignificant difference (yellow) or significant decline (red) in the accuracy of SWSM compared to the benchmark models.
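
The leave-one-out cross-validation named in the caption can be sketched as follows, assuming an ordinary least-squares linear model and a percentage error normalized by the reference value. The function name `loo_percentage_errors`, the error definition and the synthetic data are assumptions; the study's exact definitions may differ.

```python
import numpy as np

def loo_percentage_errors(X, y):
    """Leave-one-out cross-validation of a linear model with intercept.

    Each case is held out once, the model is fitted on the remaining cases,
    and the held-out case is predicted. Returns the percentage error per case.
    """
    n = len(y)
    design = np.column_stack([np.ones(n), X])
    errors = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i  # boolean mask excluding case i
        coeffs, *_ = np.linalg.lstsq(design[keep], y[keep], rcond=None)
        prediction = design[i] @ coeffs
        errors[i] = 100.0 * (prediction - y[i]) / y[i]
    return errors

# Synthetic, exactly linear data purely for illustration (not LES results).
rng = np.random.default_rng(2)
X = rng.normal(size=(15, 2))
y = 5.0 + 0.2 * X[:, 0] - 0.1 * X[:, 1]

errors = loo_percentage_errors(X, y)
```

Since every prediction is made for a case the model has not seen, the resulting error distribution reflects generalization rather than fit quality.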

Figure 11. Same as Fig. 10, but for partial wake operation, i.e. with a virtual downstream turbine moved 0.5 D to the left.
Model run time [ms] when simulating seven (4 D ≤ x ≤ 10 D) and one (x = 6 D) downstream distances, expressed as mean ± standard deviation over 40 iterations.
derating on wake size, and the effect of veer on wake shape and center position. Although the results are encouraging, the sensitivity of a statistical model to training data needs further investigation. This includes refining the methodology to sample atmospheric conditions and investigating the model's applicability to other locations and turbines.
As a consequence, the amount of training data was limited and the performance of the model depends on which conditions were sampled. Here, boundary layers with one reference wind speed were systematically sampled to investigate the performance of the model under different atmospheric conditions. However, one could examine which conditions occur most frequently in the field and sample accordingly. Alternatively, one could mainly target those conditions that are deemed most valuable, i.e. conditions under which high power gains are expected or the risk of losses due to erroneous yawing is high. Especially since the number of cases that can be generated numerically is limited, employing a more refined sampling method than the one used in this study is encouraged. Furthermore, since the statistical model is sensitive to training data, its direct applicability to other locations and turbine types is questionable. This can be problematic,