Bayesian analysis of turbulent transport coefficients in 2D interchange dominated ExB turbulence involving flow shear

While turbulent transport is known to dominate the radial particle and energy transport in the plasma edge, a self-consistent description of turbulent transport in mean-field transport codes remains lacking. Mean-field closure models based on the turbulent kinetic energy (k_⊥) and turbulent enstrophy (ζ_⊥) have recently been proposed to self-consistently model this transport. This contribution analyses the diffusive particle transport relations of these models by means of the Bayesian framework for parameter estimation and model comparison. The reference data include a set of isothermal simulations that cover not only the scrape-off layer (SOL) but also the outer core region (in which a drift-wave-like model is activated), and a set of anisothermal SOL simulations, both obtained with the TOKAM2D turbulence code. This analysis shows that the k_⊥(−ζ_⊥) model does not appropriately capture the diffusion coefficients for these new data sets, presumably due to the strong flows in the diamagnetic direction that appear in these new cases. While flow shear is expected to quench the turbulence and the turbulent transport, its effect was not explicitly taken into account in the earlier k_⊥(−ζ_⊥) transport models. As flow shear provides a new mechanism for the decorrelation of the turbulence, we propose to introduce an additional time scale in the diffusive transport relation as D ∼ k_⊥/(√ζ_⊥ + Ω_S). Drawing inspiration from shear decorrelation times reported in the literature, we propose several new candidate models, which are then analysed in a Bayesian setting. This allowed identifying irrelevant terms in certain models and ranking all models according to the Bayesian evidence. While the new models accounting for shear do improve the match to the data, significant errors remain. Moreover, no single model could be identified that performs best for all data sets.


Introduction
In mean-field transport codes, turbulent transport across flux surfaces is commonly approximated by diffusive models with empirically determined diffusion coefficients. To match experiments, these diffusion coefficients are typically modelled as 1D or 2D fields [1][2][3], introducing a lot of model flexibility while not addressing the underlying physics of the turbulence that dominates the radial transport. This induces a significant risk of overfitting and limits the reliability of extrapolating to different machines or operating regimes.
while in the core, the sources for the equations are S_{∥,ni} = 0 and S_{∥,pi/e} = (2/3) T_{i/e} S_{∥,ni/e}.
In these equations, t is time, r and λ are the radial and diamagnetic directions respectively, n is the density, p_i, p_e and p = p_i + p_e the ion, electron and total pressure, T_i and T_e the ion and electron temperature, φ is the electrostatic potential, ω = ∇²_⊥φ is the vorticity, and V_E = b × ∇φ is the ExB drift velocity, where b is the unit vector along the parallel direction. S_n, S_pi and S_pe are the particle, ion thermal energy and electron thermal energy sources due to influx from the core region, which have a Gaussian profile in the radial direction and are uniform in the diamagnetic direction. S_{∥,ni/e} and S_{∥,pi/e} are artificial sources accounting for the parallel dynamics. c_s = √(T_i + T_e) is the sound speed, Λ the sheath potential, and σ, σ_DW and g are parameters determining the strength of the sheath losses, the parallel drift wave (DW) dynamics and the magnetic field curvature respectively. γ_i and γ_e are the sheath heat transmission coefficients for ions and electrons respectively. τ_ei is the collisional ion-electron equilibration time. ⟨x⟩_λ = (1/L_λ) ∫₀^{L_λ} x dλ denotes the diamagnetic average of x, with L_λ the diamagnetic length of the domain. Finally, D_0 and χ_0 are (small) diffusion coefficients for particles and heat respectively and ν_0 is a viscosity. Note that all quantities in these equations are normalised to the reference ion gyro-period and gyro-radius.
Equations 1-4 are solved on a fine 2D (Cartesian) mesh perpendicular to the magnetic field direction with a small time step by the finite volume version of the TOKAM2D code [9,10]. Dynamics in the direction parallel to the magnetic field are not explicitly resolved, but are modelled through the artificial sources S ||,ni/e and S ||,pi/e . All the time and length scales of the turbulence are resolved, providing detailed reference data of the turbulence. On radial boundaries, the fluxes of all quantities are forced to zero by Neumann boundary conditions, while periodic boundary conditions are applied on diamagnetic boundaries.
In the core region, the drift wave (DW) terms involving S_{∥,ne} have been added to give a 2D representation of DW characteristics. They are derived from the electron momentum balance η_∥ n J_∥ = ∇_∥ p_e − n ∇_∥ φ, where η_∥ is the parallel resistivity. Solving for J_∥ and taking the divergence, we find ∇·J_∥ = ∇·[ (T_e^{3/2}/η_{∥,0}) (T_e ∇_∥ ln n + ∇_∥ T_e − ∇_∥ φ) ], where the resistivity is assumed to scale as η_∥ ∼ η_{∥,0} (T_0/T_e)^{3/2} with the reference temperature T_0 = 1 (in normalised units). Replacing the parallel gradient operators in this expression with a characteristic parallel wave number i k_∥ and defining σ_DW = k_∥²/η_{∥,0}, we obtain ∇·J_∥ = σ_DW T_e^{3/2} (φ − T_e ln n − T_e). In the final source term, the flux-surface (diamagnetic) average of this term is subtracted to ensure zero net divergence of the parallel current on a flux surface. This term leads to the typical DW physics, where the potential fluctuations tend to follow the density fluctuations. Note that the S_{∥,ne} term used here matches the parallel term in Ref. [17] in the isothermal case and is similar to the parallel term used in the core region in the HESEL code [18].
In the isothermal SOL case studied earlier, only equations 1-2 were solved, while the ion and electron temperature were assumed to be constant in space and time and the DW-like terms were never active. In this scenario, only the interchange instability is present to drive the turbulence. A description of the linear spectrum and the turbulence characteristics for this case can be found in Ref. [10]. Besides this original case, we will also study the isothermal case with a core region where the DW-like terms come into play, and the anisothermal SOL case in which the ion and electron thermal energy equations 3-4 are added such that the dynamics induced by temperature fluctuations are included. For the latter anisothermal SOL case, Refs. [19,20] showed that an additional instability is present in the model due to the interplay between the electron temperature fluctuations and the parallel source terms modelling the sheath. More information about the influence of these anisothermal dynamics on the turbulence can be found in these references as well. Regarding the isothermal case with core region and SOL, it is clear that the parallel term in the core introduces a DW-like instability in the core region in addition to the interchange instability still at play.
For each of these three flow cases, several simulations were performed with varying input parameters as described in Appendix A. We will refer to the resulting data sets as iso, isoDW, and ani respectively.
2.2. Models for mean-field particle transport

Next, we briefly turn our attention to the mean-field equations corresponding to equations 1-4. The analysis in this paper focuses on the mean-field particle and energy fluxes Γ̄_{E,n} and Γ̄_{E,pi/e}, each of which can be written as the sum of a mean-field convection term and a turbulent convection term. Here, both the Reynolds and the Favre decompositions are used [6,21]. The former decomposes a turbulent quantity x as x = x̄ + x′ with x̄ = lim_{T→∞} (1/T) ∫₀^T x dt. The latter uses x = x̃ + x″, where x̃ = \overline{nx}/n̄ is a Favre or density-weighted average. Thus, Γ̄_{E,n} is the average ExB particle flux and Γ̄_{E,p} the average ExB energy flux. In both expressions, the first term is a mean-field convection term, while the second term is due to turbulent convection. These turbulent convection terms are the main closure terms for which models need to be developed in mean-field transport models. In this contribution we will investigate models for the radial component of these fluxes and consider the component in the diamagnetic direction as a given. Note that this component in the diamagnetic direction was insignificant in the isothermal SOL case considered before, but that it becomes large in the new flow cases. The generation of this flow in the diamagnetic direction is also a topic of active research. Refs. [16,17] have analysed this in 2D systems for example. However, Refs. [22,23] found that geodesic curvature effects in 3D toroidal geometries alter the dynamics of zonal flows significantly. Hence, in our analysis, we acknowledge that the shear in the considered 2D system may not be completely representative of realistic 3D cases, but our working hypothesis is that these 2D systems do allow us to study the effect of flow shear on the transport, i.e. that the nature of the transport in the presence of flow shear is not fundamentally changed by the dynamics of the shear generation.
To close the turbulent fluxes, we consider typical diffusion models of the form \overline{n'V'_{E,r}} = −D ∂n̄/∂r, with analogous turbulent heat transfer coefficients χ_i and χ_e for the energy fluxes.

Figure 1 shows the radial profiles of the turbulent transport coefficients for a representative simulation of each of the three studied flow cases. The left plot in this figure shows that the diffusion coefficient for a typical iso simulation has a very flat profile. In the isoDW case, shown in the middle plot, the profile in the SOL changes little, but there is a large suppression of the diffusion coefficient in the core region. The right plot shows that the diffusion coefficient develops a clear radial profile in the ani case. This plot also shows that the turbulent heat transfer coefficients appear to be proportional to the particle diffusion coefficient with proportionality factors close to unity, i.e. χ_i ∼ χ_e ∼ D. Since this is also observed in the other ani simulations, we will focus on models for the particle diffusion coefficient D only, assuming that this will readily yield a model for χ_i and χ_e as well.

Ref. [6] studied this particle diffusion coefficient in the isothermal SOL case only and proposed the k_⊥ model:

D = C_D k_⊥,    (13)

where the turbulent kinetic energy is defined as k_⊥ = \overline{nV_E^2}/n̄ and C_D is a free parameter. A value of C_D = 23.9 was found by means of a regression analysis for a set of TOKAM2D simulations. Still considering only the SOL under isothermal conditions, Ref. [7] proposed the k_⊥ − ζ_⊥ model as a refinement of the previous model:

D = C_D k_⊥/√ζ_⊥,    (14)

where the turbulent enstrophy is defined as n̄ζ_⊥ = \overline{nω^2}/2. A regression analysis for a (slightly different) set of TOKAM2D data yielded C_D = 7.6.

A significant effect coming into play that was almost absent in the iso flow case is flow shear. In the new flow cases, it is especially pronounced in the core region of the isoDW simulations. Literature indicates that mean-flow shear changes the characteristics of the turbulence and decreases the transport [13][14][15][16][17].
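As a small illustration of how these closures act on discrete radial profiles, the sketch below evaluates the k_⊥ and k_⊥ − ζ_⊥ models in normalised units. The function and array names are ours, and the model forms D = C_D k_⊥ and D = C_D k_⊥/√ζ_⊥ with the quoted coefficients are assumptions of this sketch:

```python
import numpy as np

def d_k_model(k_perp, c_d=23.9):
    """k_perp closure: D = C_D * k_perp (normalised units)."""
    return c_d * np.asarray(k_perp, dtype=float)

def d_k_zeta_model(k_perp, zeta_perp, c_d=7.6):
    """k_perp - zeta_perp closure: D = C_D * k_perp / sqrt(zeta_perp);
    sqrt(zeta_perp) acts as the eddy turn-over (decorrelation) frequency."""
    k = np.asarray(k_perp, dtype=float)
    z = np.asarray(zeta_perp, dtype=float)
    return c_d * k / np.sqrt(z)
```

Both functions act pointwise, so the predicted D inherits any radial structure of the k_⊥ and ζ_⊥ profiles.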
It is expected that both the intensity of the turbulence (thus k_⊥ and ζ_⊥) and the magnitude of the transport for a given turbulence intensity (thus D at fixed k_⊥ and ζ_⊥) decrease. In this contribution we will only investigate the latter effect.
It might be hoped that the effect of flow shear would inherently be captured by the scales used in the k_⊥ − ζ_⊥ model; however, it will be shown that this model no longer performs satisfactorily for the new flow cases. The ansatz followed in this contribution is that this reduced accuracy is due to the presence of radial shear of the diamagnetic component of the ExB velocity. A first modification of models (13) and (14) that will be tested is to replace the total perpendicular turbulent kinetic energy k_⊥ by the radial turbulent kinetic energy k_r = \overline{nV_{E,r}^2}/n̄. The idea behind this modified version is that the turbulent kinetic energy in the radial velocity component may be more strongly related to the average radial transport than the total perpendicular kinetic energy, especially in cases with strong shear, which presumably affects the turbulent kinetic energy in the diamagnetic velocity component k_λ = \overline{nV_{E,λ}^2}/n̄. Another adjustment proposed here is to take the effect of the flow shear explicitly into account, by adding a frequency scale Ω_S accounting for the flow shear to the regular k_⊥ − ζ_⊥ model:

D = C_D k_⊥/(√ζ_⊥ + C_S Ω_S),    (15)

with C_S a proportionality constant to be determined. The rationale is that √ζ_⊥ provides a characteristic frequency for the temporal decorrelation of eddies (eddy turn-over frequency), while shear provides an additional decorrelation mechanism. As the shear becomes large, the latter starts dominating the former and decreases the turbulent diffusion coefficient for a given turbulence intensity as indicated by k_⊥ and ζ_⊥. This newly proposed model is in accordance with Refs. [13,14], which indicate that fluctuation suppression occurs when the shearing rate becomes larger than the ambient turbulence decorrelation rate.
Shear decorrelation rates indicated in the literature make a natural choice for the shear frequency scale Ω_S. A very basic frequency scale is Ω_S ∼ |S_λ|. A slightly more complicated decorrelation rate is based on the time it takes for two fluid elements a radial distance ∆r apart to be decorrelated by moving a diamagnetic wavelength 1/K_λ away from each other in the diamagnetic direction, resulting in Ω_S ∼ K_λ |S_λ| ∆r [13,15]. Finally, the interplay between regular radial decorrelation and shear-flow decorrelation may result in a hybrid decorrelation time Ω_S ∼ (D K_λ² S_λ²)^{1/3} [13,15]. These scalings only strictly apply to the transport of passive scalars in a constant shear flow. However, here we use them to model the effect of flow shear, which may vary in time, on the density transport. To this end, we first define the mean-flow shear S_{λ,mean} = ∂_r Ṽ_{E,λ}, which is the shear based on the Favre-averaged velocity in the diamagnetic direction. To convert the first shear time scale to a mean-field one, we simply take Ω_S ∼ |S_{λ,mean}|. This will be referred to as the simple shear model in the following. For the second time scale, we have to use an approximation for K_λ and a relevant scale for ∆r. We construct length scales for both based on the radial and diamagnetic turbulent kinetic energy, which fits nicely in the modelling framework constructed in Refs. [6,7]. This then yields Ω_S ∼ |S_{λ,mean}| √(k_r/k_λ), which will be referred to as the strong shear model. For the hybrid decorrelation time model, a time scale is needed in addition to the k_λ velocity scale to construct an approximation for K_λ. If the turbulent enstrophy is used for that, we find K_{r/λ} ∼ √(ζ_⊥/k_{r/λ}) and then Ω_S ∼ (D S_{λ,mean}² ζ_⊥/k_λ)^{1/3}. Note that the diffusion coefficient is now ultimately present in the model for D itself. In a mean-field code that is solved iteratively, this should not pose a problem.
In case the shear frequency is dominant (D ∼ k/Ω_S), the diffusion coefficient can be eliminated to find D ∼ k^{3/4}/√(|S_{λ,mean}| K_λ), as was done in Ref. [15]. This then yields Ω_S ∼ √|S_{λ,mean}| (k ζ_⊥/k_λ)^{1/4} for the resulting mean-field time scale. Note that in the previous derivation for the hybrid decorrelation frequency, the "regular" turbulent enstrophy time scale of the turbulence was used. It may, however, be more consistent to use the shear decorrelation frequency instead. Doing so, we find Ω_S ∼ D S_{λ,mean}²/k_λ. When this is in turn filled in in the diffusion relation for the case of dominant shear frequency, we finally obtain D ∼ √(k k_λ)/|S_{λ,mean}|. This can be seen to be equivalent to the shear-dominated regime of the strong shear model (when k = k_r). Hence, the hybrid decorrelation time reduces to the strong shear decorrelation time in the strong shear limit in these models. The diffusion models resulting from these four mean-flow shear time scales are summarised in table 1.
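The four candidate time scales derived above can be collected in a short helper. This is a sketch with our own names: the mean-flow shear, turbulent energies and enstrophy are passed in as radial profiles, the k in the fourth scale is taken as k_r, and the square roots follow the derivation in the text:

```python
import numpy as np

def shear_frequencies(s_mean, k_r, k_lam, zeta, d_prev):
    """Candidate mean-field shear decorrelation frequencies Omega_S.

    s_mean : mean-flow shear S_lambda,mean
    k_r, k_lam : radial / diamagnetic turbulent kinetic energy
    zeta : turbulent enstrophy zeta_perp
    d_prev : diffusion coefficient from a previous mean-field iteration
             (the hybrid scale contains D itself)
    """
    s = np.abs(np.asarray(s_mean, dtype=float))
    simple = s                                            # Omega_S ~ |S|
    strong = s * np.sqrt(k_r / k_lam)                     # |S| sqrt(k_r/k_lam)
    hybrid = (d_prev * s**2 * zeta / k_lam) ** (1 / 3)    # (D S^2 zeta/k_lam)^(1/3)
    shear_dom = np.sqrt(s) * (k_r * zeta / k_lam) ** 0.25 # sqrt(|S|)(k zeta/k_lam)^(1/4)
    return simple, strong, hybrid, shear_dom
```

In an iterative mean-field solver, d_prev would simply be the D field from the previous outer iteration, consistent with the remark in the text.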

Bayesian analysis methodology
The mean-field transport model development is significantly complicated by the additional dynamics that come into play in the two new flow cases which are investigated in this paper. In order to deal with this increased complexity, we apply Bayesian techniques for parameter estimation and model comparison. The solid statistical basis of this approach is expected to give a clearer view of which quantities are relevant and which models perform better compared to the least squares regression techniques used in earlier studies [6,7].

Basics of Bayesian theory
The objective of a Bayesian analysis is to infer the probability distribution of the model parameters, which provides much more information than a single parameter value resulting from a classical least-squares regression. In addition, this methodology also allows ranking different competing models according to the Bayesian evidence.
The basic identity underlying Bayesian techniques is Bayes' rule:

P(θ|D, M) = L(D|θ, M) π(θ|M) / L(D|M),    (16)

where the posterior P(θ|D, M) is the probability of the parameters θ after observing the data D for a certain model M, the prior π(θ|M) is the probability of the parameters before observing the data for the model, the likelihood L(D|θ, M) indicates how likely it is to observe the data given a specific value of the parameters and the model, and the Bayesian evidence L(D|M) is the marginal likelihood of observing the data given the model, no matter the parameters [11,12,24-26].
The main objective of a Bayesian analysis for parameter estimation is to infer the posterior distribution of the parameters P(θ|D, M) from the data. Secondly, the model evidence L(D|M) can be evaluated for model comparison. In order to achieve this, a likelihood function and a prior need to be defined. The former is basically a model for the probability distribution describing the error between the data and the model. If we consider a model M : f(x, θ) (which is a function of the input data x) that approximates the (output) data D : y, the error is defined as ε = y − f. This error is typically assumed to follow a normal distribution with zero mean, ε ∼ N(0, σ). The likelihood function is then this distribution expressed as a function of the parameters, yielding for a single data point:

L(D|θ, M) = 1/√(2πσ²) exp(−ε²/(2σ²)).    (17)

For a data set consisting of multiple points, and allowing for possible correlations between the data, we get

L(D|θ, M) = 1/√((2π)^N |Σ|) exp(−½ εᵀ Σ⁻¹ ε),    (18)

where it is understood that ε is a vector grouping the errors ε_i of all N individual data points and Σ is the covariance matrix of the error [11,12,26]. Hence, in order to evaluate the likelihood function for a single value of the parameters θ, a forward model evaluation of f(x, θ) is required for all data points. Note that while the likelihood function is a proper probability density as a function of the model error, it is not a probability density when viewed as a function of the parameters.
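For the common case of independent errors (a diagonal covariance matrix), this likelihood can be coded directly in log form; the sketch below uses our own names:

```python
import numpy as np

def log_likelihood(y, f, sigma):
    """Gaussian log-likelihood for independent errors eps_i = y_i - f_i with
    eps_i ~ N(0, sigma_i); sigma may be a scalar or a per-point array
    (i.e. a diagonal covariance matrix)."""
    eps = np.asarray(y, dtype=float) - np.asarray(f, dtype=float)
    sig = np.broadcast_to(np.asarray(sigma, dtype=float), eps.shape)
    return np.sum(-0.5 * np.log(2.0 * np.pi * sig**2) - 0.5 * (eps / sig)**2)
```

Working in the logarithm avoids numerical underflow when the likelihoods of many data points are multiplied together; a full covariance matrix Σ would instead require the quadratic form εᵀΣ⁻¹ε of equation (18).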
The prior π(θ|M) then describes what is known about the parameters prior to observing the data. As the models developed in this work are novel such that no prior information is available, it is opted to use wide Gaussian prior distributions in order not to exclude any values of the parameters from the start.
As is evident from equation (16), the product of the likelihood and the prior, L(D|θ, M)π(θ|M), determines the shape of the posterior distribution P(θ|D, M), which is the end goal of the parameter estimation, since the evidence is a scalar which can be considered a normalisation factor in equation (16). This posterior provides a comprehensive description of the information the data provide on the parameters: the regions in parameter space that are most likely, how large the spread on the parameters is, and the correlation between the different parameters of the model M.
The evidence L(D|M) is of particular interest to compare different competing models attempting to explain the data. Defined as

L(D|M) = ∫ L(D|θ, M) π(θ|M) dθ,    (19)

it is independent of the value of the parameters. Note that the evidence is not a probability, since the likelihood is not a probability distribution in terms of the parameters. However, the Bayesian evidence is directly related to the probability of a model given the data:

P(M|D) = L(D|M) P(M) / L(D),    (20)

where P(M) is the prior probability for the model before observing the data (independent of the parameters) and L(D) is a normalisation factor that purely depends on the data. Hence, if there is no prior information P(M) about which model is more probable, the ratio of the evidences of two models M₁ and M₂ is equal to the relative probability of both models given the data [24,25]:

P(M₁|D) / P(M₂|D) = L(D|M₁) / L(D|M₂).    (21)

Thus, while the evidence for a single model does not have a clear meaning (as it is not a probability), the ratio of the evidences of competing models directly indicates which model is to be preferred. Note that this only holds for models trying to explain the same data; the evidence cannot be used to compare models that explain different data sets.
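For a model with a single parameter, the evidence integral can even be evaluated by simple quadrature, which makes the model-comparison recipe concrete. This is a toy sketch with our own names; realistic multi-parameter problems require the MCMC-based estimates discussed further below:

```python
import numpy as np

def log_evidence_1d(loglike, logprior, theta_grid):
    """L(D|M) = integral over theta of L(D|theta,M) * pi(theta|M),
    computed with the trapezoidal rule and a log-sum-exp shift for stability."""
    lp = np.array([loglike(t) + logprior(t) for t in theta_grid])
    m = lp.max()
    g = np.exp(lp - m)
    integral = np.sum(0.5 * (g[1:] + g[:-1]) * np.diff(theta_grid))
    return m + np.log(integral)

# Bayes factor between two models explaining the same data:
# log_bf = log_evidence_1d(ll1, lp1, grid1) - log_evidence_1d(ll2, lp2, grid2)
```

A positive log Bayes factor favours model 1; as noted in the text, this comparison is only meaningful when both models target the same data.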
To highlight the main features of Bayesian theory, we briefly compare it to a classical least-squares regression approach. If the model error is assumed to have a Gaussian distribution as in equation (18), and the covariance matrix has only uniform diagonal elements (Σ = σ²I), it can be seen that maximising this likelihood is identical to a classical least-squares regression. So, a first generalisation is that the likelihood function also allows more complex distributions for the error if desired. In a second step, the likelihood function is multiplied with the prior distribution such that any available prior information can be self-consistently incorporated. Maximising this product yields the maximum a posteriori (MAP) estimate for the parameters. In addition, the distribution formed by the product L(D|θ, M)π(θ|M) is proportional to the probability density function of the parameters, hence including much more information than the MAP point estimate itself. Comparing different models based on the value of L(D|θ, M)π(θ|M) at MAP would still only compare models for a single parameter setting, as the least-squares comparison does. Instead, the Bayesian evidence L(D|M) integrates this product of likelihood and prior over the parameter space of the models in order to effectively compare the models for any setting of the parameters allowed by the prior. This "prior-weighted" integration of the likelihood function over parameter space adds what is called an "Occam factor", penalising more complex models, to the basic (least-squares-like) goodness-of-fit statistic included in the likelihood function [24,25].

Numerical techniques for Bayesian inference
In order to infer the complex posterior distribution P(θ|D, M), Markov chain Monte Carlo (MCMC) techniques are used in this work. The idea is to generate a chain of samples which converges to a statistical steady state, after which the samples are distributed according to the target distribution. More in particular, the Adaptive Metropolis-Hastings (AMH) algorithm [11,12,27] is used. A typical step of the algorithm for sampling from a generic target distribution h(θ) is schematically represented below:

• Given θ_i, generate a proposal θ′ according to the proposal distribution q(θ′|θ_i).
• Compute the acceptance probability α = min(1, h(θ′)q(θ_i|θ′) / (h(θ_i)q(θ′|θ_i))).
• With probability α, accept the proposal and set θ_{i+1} = θ′; otherwise, set θ_{i+1} = θ_i.

In this algorithm, q(θ′|θ_i) is the proposal distribution describing the probability of where the next proposed parameter will be located given the previous accepted sample. The choice of this distribution is crucial to the efficiency of the algorithm. If this distribution is very narrow, samples will easily be accepted as they are likely to be close to the previous accepted sample; however, it will take a long time for the chain to move to other regions and explore the whole target distribution. If the proposal distribution is very wide, on the other hand, the accepted samples will quickly cover the entire target distribution, but a high number of proposal samples will be rejected before one is accepted, since they are likely to be far off in low-probability regions. In order to increase the efficiency of the algorithm, the proposal distribution is adjusted during runtime in the AMH algorithm. The previously accepted samples are used to estimate the covariance matrix of the target distribution, which is then optimally scaled and used as the covariance matrix of the Gaussian proposal function centred around the last accepted sample [11,12,27,28]. This algorithm is used for Bayesian inference by sampling with h(θ) ∝ π(θ|M)L(D|θ, M). Chib's method, described in Ref.
[29], is then used to calculate the Bayesian evidence from the samples the AMH algorithm has generated (as the evidence is not directly an output of the latter). More information about the specifics of the algorithms used in this paper can be found in Refs. [11,12].
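A minimal version of such an adaptive Metropolis scheme can be sketched as follows. This is our own simplified implementation, not the exact algorithm of Refs. [11,12,27]: the Gaussian proposal is centred on the last accepted sample, and its covariance is periodically re-estimated from the chain history and scaled by the standard 2.38²/d factor:

```python
import numpy as np

def adaptive_metropolis(log_target, theta0, n_samples, adapt_start=500, seed=0):
    """Adaptive Metropolis sampler for an unnormalised log target density."""
    rng = np.random.default_rng(seed)
    theta = np.array(theta0, dtype=float)
    d = theta.size
    lp = log_target(theta)
    cov = 0.1 * np.eye(d)                      # initial proposal covariance
    chain = np.empty((n_samples, d))
    for i in range(n_samples):
        prop = rng.multivariate_normal(theta, cov)
        lp_prop = log_target(prop)
        # Metropolis accept/reject (the Gaussian proposal is symmetric)
        if np.log(rng.uniform()) < lp_prop - lp:
            theta, lp = prop, lp_prop
        chain[i] = theta
        if i >= adapt_start and i % 100 == 0:  # adapt from the history so far
            cov = np.cov(chain[:i].T) * 2.38**2 / d + 1e-8 * np.eye(d)
    return chain
```

After discarding a burn-in portion, the remaining chain samples approximate draws from h(θ), from which MAP estimates, parameter spreads and correlations can be read off.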

Transport coefficient modelling using Bayesian analysis
In this section, the Bayesian framework for parameter estimation and model comparison is used to study the particle transport models in the new isoDW and ani flow cases and to compare with results for the original iso case. First, the k_⊥ and k_⊥ − ζ_⊥ models developed in Refs. [6,7] will be assessed in section 4.1, and then the new models explicitly taking flow shear into account will be analysed in section 4.2.
4.1. Assessment of k_⊥ and k_⊥ − ζ_⊥ models

As data for the Bayesian analysis, we consider the 1D profiles of the TOKAM2D reference data obtained by averaging in time (from the onset of the statistical steady state) and in the diamagnetic direction (a symmetry direction). Hence, the output data y are the radial profiles of the diffusion coefficient D, and the input data x are the radial profiles of the turbulent kinetic energy k_⊥ and the turbulent enstrophy ζ_⊥.
Describing the model error in order to construct the likelihood function is a delicate matter. This error is not only composed of the statistical error that remains on the averaged output data y, but also includes the error on the model f(x, θ). The latter is composed of an error because the model form f(x, θ) is not perfect (model inadequacy), an error caused by propagating the statistical error on the averaged input data x through the model, and a numerical error in solving the model (which should, however, be negligible for the algebraic models considered here) [30]. Since very little is known about most of these error contributions, we chose to aggregate them all and assume that the error at every data point has a Gaussian distribution, which is consistent with a maximum entropy principle [26,31]. The standard deviation of the error is assumed to scale with the magnitude of the data such that ε_i ∼ N(0, σ y_i), where σ is assumed to be the same for all data points. As this "relative standard deviation" σ is not known a priori, it is an additional parameter that will be inferred. The total likelihood function is then the product of the likelihood functions of all individual data points. For the priors of C_D and σ we chose wide distributions in order not to exclude any parameter values a priori: π(C_D|M) ∼ 2N(0, 100) for C_D > 0 and π(σ|M) ∼ 2N(0, 100) for σ > 0.

The results of the Bayesian analysis for this set-up are summarised in the first rows of table 1. The maximum a posteriori (MAP) value of the parameters of both models, as well as the logarithm of their evidence, is shown. Both the original models and the variant in which the total perpendicular turbulent kinetic energy k_⊥ is replaced by the radial turbulent kinetic energy k_r are studied. Results are shown for the three data sets described in Appendix A.
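The resulting unnormalised log-posterior for the k_⊥ − ζ_⊥ model can be sketched as below. The names are ours, the model form D = C_D k_⊥/√ζ_⊥ is the reconstruction used in this sketch, and the half-normal priors with scale 100 are an assumption mirroring the wide priors described above:

```python
import numpy as np

def log_posterior_unnorm(params, k_perp, zeta_perp, d_obs):
    """Unnormalised log-posterior for the k_perp - zeta_perp model with
    multiplicative Gaussian noise eps_i ~ N(0, sigma * y_i).
    params = (C_D, sigma); both are restricted to positive values."""
    c_d, sigma = params
    if c_d <= 0 or sigma <= 0:
        return -np.inf
    f = c_d * k_perp / np.sqrt(zeta_perp)          # model prediction
    sd = sigma * d_obs                             # relative standard deviation
    loglike = np.sum(-0.5 * np.log(2 * np.pi * sd**2)
                     - 0.5 * ((d_obs - f) / sd)**2)
    # Half-normal priors on C_D and sigma (constant terms omitted,
    # since only the shape matters for MCMC sampling)
    logprior = -0.5 * (c_d / 100.0)**2 - 0.5 * (sigma / 100.0)**2
    return loglike + logprior
```

This function is exactly the h(θ) that an MCMC sampler would be handed; the σ parameter is inferred jointly with C_D, as described above.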
Looking at the MAP values of the C_D coefficient for the regular k_⊥ and k_⊥ − ζ_⊥ models in the iso case, it can be seen that these are rather close to the values of 23.9 and 7.6 respectively that were found by means of regression analysis in Refs. [6,7]. Considering the log-evidence of the models, it is clear that for all data sets the k_⊥ − ζ_⊥ model outperforms the k_⊥ model, as was also found in Refs. [11,12] and qualitatively in Ref. [7]. In addition, the models where k_r is used instead of k_⊥ perform better in the majority of cases. Remember that the evidence also depends on the data, such that it can only be meaningfully compared for models which have the same output data y. Hence, only models within the (double) columns under iso, isoDW and ani that use the same data set can be compared with each other. Also, since the (natural) logarithm of the evidence is reported, a difference of about 5 between two models can already be considered decisive in selecting one model over another [11,12,24]. Now looking at the estimate of the relative standard deviation σ of the error, it is clear that this is significantly smaller for the iso data than for the ani data, and much smaller than for the isoDW data. This indicates that the models perform considerably worse for the new data sets. Figures 2 and 3 support the observation that the k_⊥ − ζ_⊥ model is unable to capture the trends in the new data sets. Figure 2 compares the radial profiles of the diffusion coefficient for representative simulations to the models proposed here. It can be seen that the k_⊥ − ζ_⊥ model captures the profiles very well for the iso and ani (SOL) data sets, but that it fails to properly predict the diffusion coefficient for the isoDW data set. The model seems to be making a trade-off between having a high enough diffusion coefficient in the SOL and capturing the suppressed value in the core region. Figure 3 then gives an overview of the model's performance in parameter space.
It shows a scatter plot of the model value for D on the vertical axis versus the observed TOKAM2D value on the horizontal axis. Each circle in the figure represents a single TOKAM2D simulation (for a certain parameter set g, σ, ν, ...), which is now also radially averaged. While this radial averaging is not a part of the model development, since the profiles are non-uniform, this kind of figure does give an overview of the performance in parameter space. Note that the classical coefficient of determination is also reported in these figures. It is defined as R² = 1 − Σ_i (y_i − f_i)² / Σ_i (y_i − y_m)², where y_m is the mean of the y_i. The figure shows that the k_⊥ − ζ_⊥ model also performs very well in parameter space for the iso data set, while quite some scatter remains for the ani data set and it performs much worse for the isoDW data.

Table 1. Bayesian analysis of the proposed models for the particle diffusion coefficients for the iso, isoDW and ani data sets. Parameter values are reported at MAP.

Figure 3. Scatter plot of the particle diffusion coefficient for the iso (left), isoDW (middle) and ani (right) data sets for different models.

In order to further analyse the performance of the k_⊥ − ζ_⊥ models in the isoDW flow case, and to investigate the significance of the shear models for this case later on, we also compare with the "2 zone" models shown in the third row of table 1. The idea behind these models is that the dynamics in the core and those in the SOL may be inherently different because of the different parallel dynamics. As such, it could be imagined that both regions actually follow the diffusion relation of the k_⊥ − ζ_⊥ model, but that different model constants should be allowed in the core and the SOL. Table 1 shows the evidence of this model to be much higher. The middle plot of figure 2 clearly shows that this model now captures the behaviour in the SOL very well, while the fit in the core region is also improved. Figure 3 indicates that the 2 zone model performs much better in parameter space as well.
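The coefficient of determination quoted in these figures follows the standard definition, sketched here with our own names:

```python
import numpy as np

def r_squared(y, f):
    """Coefficient of determination R^2 = 1 - SS_res / SS_tot, comparing the
    model predictions f against the observed values y and their mean."""
    y = np.asarray(y, dtype=float)
    f = np.asarray(f, dtype=float)
    ss_res = np.sum((y - f)**2)          # residual sum of squares
    ss_tot = np.sum((y - y.mean())**2)   # total sum of squares about the mean
    return 1.0 - ss_res / ss_tot
```

R² = 1 for a perfect fit and 0 for a model no better than predicting the mean; unlike the Bayesian evidence, it carries no penalty for model complexity.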
It can also be seen that the C_D parameter in the SOL of this model is close to that of the iso flow case, as would be expected for a well-performing model. However, the C_D coefficients in the core region and in the ani case are remarkably lower than in the other cases. We interpret this as an indication that the turbulence characteristics and the related transport differ between the flow cases, and as an additional confirmation that the simple k_⊥(−ζ_⊥) models are insufficient to describe all flow cases.

4.2. Assessment of shear-adjusted particle transport models
The next five rows in table 1 show the shear-adjusted models suggested in section 2.2 and the results of the Bayesian analysis for them. The same set-up and priors have been used as for the analysis of the k⊥ and k⊥ − ζ⊥ models. In addition, a new prior needs to be specified for the C_S parameters, which is likewise chosen to be uninformative: a half-normal π(C_S|M) ∼ 2N(0, 100) for C_S > 0.
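For concreteness, this half-normal prior can be written out explicitly. In the sketch below, N(0, 100) is read as a normal with variance 100 (standard deviation 10); this convention, like the function name, is an assumption made for illustration:

```python
import math

def log_prior_cs(c_s, sigma=10.0):
    """Half-normal log-prior pi(C_S|M) = 2 N(0, sigma^2) for C_S > 0.
    The factor 2 folds the normal density onto the positive axis so
    that the prior still integrates to one."""
    if c_s <= 0.0:
        return -math.inf  # prior support is C_S > 0 only
    return (math.log(2.0)
            - 0.5 * math.log(2.0 * math.pi * sigma ** 2)
            - 0.5 * (c_s / sigma) ** 2)
```

With sigma = 10, the prior is essentially flat over the physically plausible range of C_S, which is what makes it uninformative for this analysis.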
The table shows that the k_r − ζ⊥ model has the highest evidence for the iso data set. Because this model already works very well, adding a new term to incorporate flow shear does not bring a significantly better fit, while it does increase the prior parameter volume. As a result, adding a shear time scale tends to decrease the model evidence. Note also that, as expected, the Bayesian techniques automatically predict very low values for the new, insignificant term in most models.
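The evidence penalty incurred by such an insignificant extra parameter can be illustrated with a simple Occam-factor estimate. The sketch below uses a Laplace approximation with a flat prior of width W; both the numerical widths and the flat-prior assumption are purely illustrative (the actual analysis uses half-normal priors):

```python
import math

def log_occam_factor(sigma_post, prior_width):
    """Per-parameter Occam factor in a Laplace approximation:
    log Z ~ log L_max + sum_k log(sigma_k * sqrt(2*pi) / W_k),
    where sigma_k is the posterior width of parameter k and W_k
    the width of its flat prior."""
    return math.log(sigma_post * math.sqrt(2.0 * math.pi) / prior_width)

# An extra, weakly constrained parameter (posterior width ~5) under a wide
# prior (width 100) costs roughly this much log-evidence when it does not
# improve the fit:
penalty = log_occam_factor(5.0, 100.0)
print(penalty)  # negative: the extra parameter lowers the evidence
```

This is the mechanism by which a wide prior on C_S automatically penalises shear terms that the data does not support.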
Looking at the ani results next, it is observed that here the shear models do appreciably improve the evidence and reduce the model error. Surprisingly, it is the simplest shear model with k_r in the denominator that works best in this case. It is also interesting to notice that the bottom model, which includes a combination of the strong shear decorrelation timescale and the hybrid decorrelation timescale to account for the possible influence of both, does not bring an improvement with respect to the model with the strong shear time scale only. Note that the model parameters tend towards those of whichever of the two models has the higher evidence, rather than making a compromise. In addition, the model evidence of this combined model is lower than that of the strong shear case because of its higher complexity and larger number of fitting parameters.
Finally, including the flow shear in the models for the isoDW data significantly increases the evidence and decreases the model error with respect to the regular k⊥ − ζ⊥ model, as expected. Interestingly, these shear models also provide an improvement over the 2 zone models, indicating that they are more significant. Here, the strong shear model with k_r in the denominator has the highest evidence, together with the model with two timescales. In the latter, the model does seem to combine both time scales to improve its results, which presumably has more impact for this case since the remaining error is larger. In general, it has to be remarked that the error of the models on this isoDW data remains large.

Figures 2 and 3 show the performance of some of the better models in terms of radial profiles and in parameter space. The left and right plots of figure 2 show that all models manage to capture the radial profiles very well for the iso and ani data sets. The middle plot shows that the strong shear model and the 2 zone model capture the base level of the suppressed diffusion coefficient in the core region rather well, but that both models show significant peaks that are not present in the data. Figure 3 then shows that all models correctly identify the trend in parameter space for the isothermal data, as expected. The shear models manage to improve the fit for the anisothermal dataset as well, although a large improvement is already achieved by using k_r instead of k⊥ in the denominator. Much more scatter remains than for the iso data, though. The middle scatter plot in figure 3 seems to indicate that both the shear model and the 2 zone model capture the trends in parameter space rather well, with the 2 zone model performing better according to the classical R² value. It has to be remarked, though, that this plot pertains to the radial average over the entire domain, both the outer core region and the SOL.
If a similar scatter plot is made for the core region only, the models perform much worse and the R² value is seriously reduced. Hence, the R² value and the Bayesian evidence yield largely the same ranking of the models over the different data sets, but for the isoDW flow case the ranking of the strong shear model and the 2 zone model is reversed. Lastly, it is worth noting that the results of the isoDW shear models seem to be applicable to the iso data as well. This is indicated by the last set of data in the left plots of figures 2 and 3, which represents the strong shear model evaluated with the MAP values of the parameters found for the isoDW data. This is compatible with the iso results, both on the radial profiles and across parameter space. Note that we did not include results from the simple shear model in any of these plots, to facilitate interpretation. The simple shear model results are very similar to those of the strong shear model in most cases, but they are slightly worse for the isoDW data in figure 3.
An important consideration, and a possible (partial) explanation for the mediocre performance of the shear-adjusted diffusion models reported here, may lie in the time-dependent nature of the flow shear. Assume, for example, that an exact expression for the relation between instantaneous transport and instantaneous flow shear were available, D(t) = f(S_λ(t)). If this relation is not linear, the time-averaged transport does not follow from the mean shear: D̄ ≠ f(S_λ,mean). In our models, we use the mean flow shear as a proxy for the shear, which may impact the performance of the mean-field models if the fluctuations are large.
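This nonlinearity effect can be demonstrated with a toy calculation: for a convex f, the time average of f(S(t)) exceeds f evaluated at the mean shear. The quadratic f and the sinusoidal shear signal below are purely illustrative and are not taken from the TOKAM2D data:

```python
import math

# Toy instantaneous relation D(t) = f(S(t)) with a convex f.
f = lambda s: s ** 2

n = 100_000
s0, amp = 1.0, 0.5                        # mean shear and fluctuation amplitude
shear = [s0 + amp * math.sin(2.0 * math.pi * i / n) for i in range(n)]

mean_of_f = sum(f(s) for s in shear) / n  # <f(S)> = s0^2 + amp^2 / 2
f_of_mean = f(sum(shear) / n)             # f(<S>) = s0^2

print(mean_of_f, f_of_mean)  # 1.125 vs 1.0: f at the mean shear underestimates <D>
```

The gap grows quadratically with the fluctuation amplitude here, so a mean-field model built on the mean shear alone is expected to degrade precisely in the cases with strong shear fluctuations.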
Finally, we remark that the influence of the Bayesian parameters and the statistical framework should be kept in mind. The results of the Bayesian analysis depend not only on the physical model (the diffusion models that are proposed), but also on the statistical model: the form of the likelihood function, the correlation length and the prior. The width of the prior in particular determines how strongly more complex models are penalised. Inferences with various correlation lengths showed that the parameter values depend on it in a rather irregular way, especially for the models that perform less well. As such, we chose to use the results where no correlation length is used. The value of the correlation length did not seem to change the ranking of the different models according to their evidence, though. In this context, it may be envisaged to smooth the results in the core region of the isoDW data, since the peaks are probably due to rather small-scale specifics of the flow field which are hard to capture in a mean-field model. However, care should be taken with the interaction with the statistical model of the Bayesian framework.

Conclusion
In this contribution, mean-field models for the particle diffusion coefficient are analysed based on detailed reference data from the TOKAM2D turbulence code. Isothermal SOL cases, isothermal simulations with both a SOL and an outer core region, and anisothermal SOL simulations are considered. The model development process is supported by a Bayesian framework for parameter estimation and model comparison. These Bayesian techniques allowed estimating the joint posterior distribution of the parameters of the proposed models, identifying terms which are irrelevant in the models, and ranking the different models according to the Bayesian evidence. This Bayesian evidence is especially relevant since it automatically penalises more complex models, hence guarding against overfitting.
This analysis has shown that the existing k⊥ and k⊥ − ζ⊥ particle transport models proposed in Refs. [6,7] do not capture the trends in the new simulations with a core region, nor in the anisothermal simulations. Flow shear is assumed to be the origin of this deficiency, and the k⊥ − ζ⊥ diffusion model is adapted by incorporating shear decorrelation frequencies inspired by the literature to take this effect into account explicitly.
The shear-corrected models provide a large improvement over the earlier models, but it was not possible to identify a single model as the best for all data. Using the radial turbulent kinetic energy instead of the total perpendicular kinetic energy provided a significant improvement of the results for all data sets, both in the original and in the shear-corrected models. However, considerable room for improvement certainly remains for the anisothermal simulations and the core region.
Based on this analysis, it is plausible that the discrepancies of the earlier models are indeed due to flow shear, but it cannot be ruled out that other effects are at play instead, or in addition. For example, the model parameters in the anisothermal case are systematically lower than those of the isothermal cases. This may be due to inherent differences between the regimes, but might also indicate that certain anisothermal effects are still missing. This will be analysed in more detail in future work. Next, forward simulations will have to show whether the models developed here effectively capture causal relations, or merely correlations without their common origin. Further research will also focus on investigating the balances of the turbulent kinetic energy and turbulent enstrophy for the new flow cases, following the methodology demonstrated in Refs. [6,7]. Another research track is to further extend the physical model of the TOKAM2D simulations that are studied. Up to now, the diamagnetic drift contribution to the vorticity has been neglected, while the literature indicates this may significantly influence the dynamics [20]. Also, anisothermal cases including both a core region and a SOL could be investigated.