Parameter estimation for a nonlinear control-oriented tokamak profile evolution model

A control-oriented tokamak profile evolution model is crucial for the development and testing of control schemes for a fusion plasma. The RAPTOR (RApid Plasma Transport simulatOR) code was developed with this aim in mind (Felici 2011 Nucl. Fusion 51 083052). The performance of the control system strongly depends on the quality of the control-oriented model predictions. In RAPTOR a semi-empirical transport model is used, instead of a first-principles physics model, to describe the electron heat diffusivity χe ?> in view of computational speed. The structure of the empirical model is given by the physics knowledge, and only some unknown physics of χe ?>, which is more complicated and less well understood, is captured in its model parameters. Additionally, time-averaged sawtooth behavior is modeled by an ad hoc addition to the neoclassical conductivity σ∥ ?> and electron heat diffusivity. As a result, RAPTOR contains parameters that need to be estimated for a tokamak plasma to make reliable predictions. In this paper a generic parameter estimation method, based on the nonlinear least-squares theory, was developed to estimate these model parameters. For the TCV tokamak, interpretative transport simulations that used measured Te ?> profiles were performed and it was shown that the developed method is capable of finding the model parameters such that RAPTOR’s predictions agree within ten percent with the simulated q profile and twenty percent with the measured Te ?> profile. The newly developed model-parameter estimation procedure now results in a better description of a fusion plasma and allows for a less ad hoc and more automated method to implement RAPTOR on a variety of tokamaks.

capable RAPTOR (RApid Plasma Transport simulatOR) code [14,5]. It solves the coupled poloidal flux diffusion equation and the electron temperature transport under the assumption of a fixed magnetic equilibrium and assuming all thermal transport is diffusive. The energy confinement time, which is in the order of 1 ms for TCV and 3.6 s for ITER, determines the required speed for real-time execution. In order to meet this requirement the RAPTOR code uses, at present, an empirical, parameterized model for the heat conductivity. While these empirical models are constructed to satisfy general dependencies of the transport coefficient on local plasma quantities such as the safety factor and magnetic shear, they have some free parameters which must be tuned based either on experiments or more complete physical models. This is presently done by manual tuning of the appropriate coefficients to match experimental or simulated data. While this is feasible in practice due to the short time required for a RAPTOR simulation (typically 10 s for a full discharge), it would be appropriate to have an automated method of tuning these parameters. This should yield a more optimal parameter choice, treatment of larger experimental databases, and an overall faster tuning procedure. In addition, the resultant parameters can yield new information for comparison with theory.
Extensive work has also been done on the use of linear models for plasma profile control [2,15,16]. In these models, the plasma profile dynamics around an operating point is represented by a set of linear ordinary differential equations which include the different timescales of resistive and thermal transport. The coefficients of this linear model are automatically identified based on experiments or simulated data around that operating point. Since the equations are linear, the parameter estimation problem is also linear, which is easily tractable numerically. However, the obtained linear model is valid only in a region close to the chosen operating point and can not be guaranteed to work in other regimes.
In this paper, we present an automated parameter estimation method for the nonlinear transport simulation equations used in RAPTOR based on data. We thereby extend the application of system identification from the linear models, as treated in [15] to the nonlinear model used in RAPTOR.
The data used to tune the model can be obtained either from experiments or from more complete and time-intensive simulations using a more comprehensive interpretative transport calculation. The latter can be either predictive simulations (full simulation of plasma evolution) or interpretative simulations (using some measured data as input for the simulation and solving only for the unknown quantities). For simplicity, we refer to this data as 'experimental' data, irrespective of whether it was obtained from a true tokamak experiment or from a numerical (simulation) experiment. In this paper we will use interpretative simulations of TCV tokamak experiments to tune the transport model.
The problem of finding the unknown model parameters is formulated as a nonlinear least-squares optimization problem, with a cost function reflecting the discrepancy between simulated and known (i.e. measured) profile evolutions. Since the model is nonlinear, also the parameter optimization problem is inherently nonlinear, which makes its solution prone to local minima and other numerical issues.
An important step that greatly facilitates solving the nonlinear optimization problem is to construct an analytical expression for the gradient of this cost function, giving the steepest descent direction of the cost function in parameter space. Once a local minimum has been found, local properties of the solution can be analyzed to give information about confidence intervals or non-identifiability of some combinations of parameters.
One may legitimately wonder why one does not resort to more complete physics-based transport models, such as TGLF [17], QuaLiKiz [18] or MMM8.1 [19] to determine the required parameters. This is due to several reasons. Firstly, despite significant progress, such models are not yet able to fully predict tokamak transport in all regimes and locations over a full discharge timescale. Moreover, they are not yet computationally fast enough to be evaluated practically for a very large region of transport parameters, though very recent work showed how this can be circumvented using a neural-network fit of a large database of QuaLiKiz results [20]. Finally, by ensuring that the simulations match (measured) experimental profile evolution, any systematic mismatch between the model and reality can be partly absorbed into the estimated parameters as well.
The rest of this paper is structured as follows: in section 2 an introduction to RAPTOR is given, followed by the section which describes the parameters of the empirical models χ e and ∥ σ in RAPTOR that need to be identified. In section 4 the nonlinear parameter estimation method for RAPTOR is presented. To demonstrate the developed method, known model parameters from simulated data are first recovered in section 5.1. As the main result of this paper, the parameters of χ e and ∥ σ were identified for a set of discharges from the TCV tokamak in section 5.2. The conclusions are presented in section 7.

Overview of the RAPTOR code
RAPTOR is a control-oriented, physics-based code for simulating the 1D plasma transport equations, that has been designed for real-time control applications [5,9]. Using a prescribed evolution of the equilibrium flux surface layout, density profile evolution, and actuator inputs (I p and power to auxiliary heating and current drive systems), it simulates the time evolution of the poloidal flux and temperature profiles. In the present work, we assume a fixed flux surface geometry, a prescribed density profile and a fixed ratio of ion temperature to electron temperature, though RAPTOR presently includes the possibility to vary these in time.
The physics model solved is that of 1D (flux surface averaged) diffusion of poloidal flux (ψ) and electron temperature (T e ) profiles as a function of the radial variable ρ, the normalized toroidal flux coordinate defined as ρ = Φ Φ / b . The equation reads, following [10], are functions or ρ that depend on the 2D MHD equilibrium Plasma Phys. Control. Fusion 57 (2015) 125008 [21]. Together with Φ b , the (scalar) toroidal magnetic flux at the plasma boundary, these quantities are computed for one representative equilibrium and kept constant thereafter. ∥ σ and j bs are, respectively, the neoclassical electrical conductivity and bootstrap current density, using the Sauter-Angioni formulas [22,23]. The driven current density j cd is modeled using a gaussian deposition shape for ECH/ECCD, which is the only heating/current drive actuator considered in this paper.
Assuming ′ V does not vary in time [14], the equation for the spatial evolution of the electron temperature T e in time is Here n e is the electron density, 2 is another geometric profile quantity and P e is the total power to the electrons. This includes ohmic and auxiliary power sources and losses via equipartition with ions, line radiation and bremsstrahlung. For the electron heat diffusivity, χ e , a nonlinear analytical function of the local safety factor q and the mag- s q q is used, which will be described in detail in section 3.1. Density profiles are prescribed (but can be time-varying) and the ion temperature is assumed to scale proportionally to the electron temperature, with proportionality factor depending on the operating regime.
RAPTOR solves the above set of two coupled nonlinear parabolic PDEs using a finite element approach. Details are discussed in [5], but are repeated here for completeness. The continuous profiles ( ) By the introduction of the state vector x(t) and the input vector u(t) which contains the auxiliary actuator powers and the total plasma current, equations (1) and (2) can be formulated as RAPTOR solves this equation in implicit form, by Newton iterations, to obtain x k+1 from x k and u k at each discrete time step. The Jacobians, f x d /d k k , needed for these Newton iterations are analytically evaluated using the chain rule on (1)-(2).
The state vector evolution is used to calculate the profiles of ι (defined in this paper as ι = q 1/ ), T e and the plasma loop voltage profile U pl at specific ρ locations and specific times, that will be compared to experimental values.
it follows that ι, T e and U pl at any point ρ at = t t k can be written as a linear function of x k : For further details, see [5].

Model parameters to be estimated in RAPTOR
As discussed in the previous section, RAPTOR uses an empirical model for the electron heat diffusivity χ e . This model captures some main physical dependencies of tokamak thermal transport, but includes unknown parameters that have to be identified for each tokamak. This model was extended, with respect to the original version in [14,5], by the inclusion of the time-averaged effect of the sawteeth and an increase of diffusivity at higher temperatures, introduced to mimic thermal confinement degradation at higher input power (as appears in confinement scaling laws). Furthermore, an ad hoc addition of the time-averaged effect of the sawtooth behavior to the model for the electrical conductivity, ∥ σ , was made. Both models contain parameters that cannot be easily computed from physical laws and therefore need to be identified. The structure of the assumed empirical transport model is described here.

The electron heat diffusivity model and inclusion of sawtooth effects
The expression for χ e in RAPTOR includes the anomalous diffusion and the dependence of the confinement on the q profile [5]. In this paper the electron heat diffusivity model was extended to include the effect of sawteeth crashes and the confinement deterioration with increasing temperature and temperature gradient. The expression for the local heat diffusivity (to be used in equation (2)) reads: In this equation χ neo is intended to represent a minimum level of transport that is due to neoclassical effects. This is chosen here as a small constant, though it could be calculated based on neoclassical transport physics [21]. The much larger anomalous diffusion is captured by c ano , and the presence of q in the anomalous diffusion term accounts for increased confinement observed at higher plasma currents I p . The term F(s) is a shear-dependent function to include the effect of improved confinement (ic) at low and negative magnetic shear s. This is used to model the improved confinement observed in some advanced scenarios. The expression for the function F(s), plotted in figure 1, is The term a ic accounts for the amount of reduction of transport, d ic is the level of the shear at which the transition takes place. The sharpness of the transition is governed by the term w ic . A scaling of the heat transport at increasing temperature is modeled by the c Te exponent, where T e0 is the electron temperature at the center of the plasma. This term is included to capture the effect of reduced thermal confinement at higher input powers. The identified coefficient c Te can be related to the power degradation exponent in thermal confinement scaling laws 3 . Other choices of the χ e dependency are possible.
Sawtooth crashes occur when the safety factor q is below unity and cause periodic expulsion of plasma current and energy from the core region. Considering the time-averaged effect, this amounts to a decreased heat confinement in this region. This effect is modeled by the second term in the χ e equation. The function G(q), similar to figure 1, is used to model the effect as a function of q using a sigmoidal function centered at q = 0.95 with smoothness of the transition governed by the parameter χ w saw . The amount of conductivity increase is governed by the parameter χ c saw In all, eight parameters have to be chosen to determine the function for χ e T neo saw s aw ano ic ic ic e (13) While this may seem like a large number, each plays a very specific role in determining the transport, either globally (e.g. χ neo , c ano ) or in a particular region of the operating space (e.g. a ic , χ c saw ).

Including sawtooth effects on the electrical conductivity ∥ σ
The neoclassical electrical conductivity formula in RAPTOR is based on the expression in [22,23] ( ) where σ ∝ T Spitzer e 3/2 is the Spitzer conductivity that evolves in time due to the T e dependency. The parameter c neo is the neoclassical correction which depends on geometric effects as well as collisionality.
As mentioned in the previous section, the time-averaged effect of sawtooth crashes is to expel plasma current from the core region. Similarly as was done for the thermal conductivity, we mimic this effect by reducing the electrical conductivity in the center of the plasma. This is accomplished by multiplying (14) by a function W(q): where function W(q) reads In this equation σ w saw represents the sharpness of the transition around q = 0.95, the term ⩽ ⩽ σ c 0 1 saw accounts for the amount of reduction of the electrical conductivity. The two model parameters, σ w saw and σ c saw , need to be tuned in order to obtain appropriate time-averaged central values of q for to account for improved confinement regimes. In this work these three parameters will be identified using data.  3 The parameter c Te in (10) can be related to the power degradation exponent in thermal confinement scaling laws as follows. The (electron) heat flux in stationary conditions can written as where P is the total (volume-integrated) input power. The scaling law can then be written in the sawtoothing plasmas. Note that we included W(q) to obtain realistic q0 values and avoid too peaked current density profiles in sawtoothing plasmas. W(q) for typical parameters is plotted in figure 2.

Theory of the nonlinear least-squares method
The nonlinear least-squares (NLS) method [24] is an estimation technique to determine the parameter set p for a nonlinear model f (t, p) such that it best describes the given measurements y k at time points t k of the output of the system. The general approach to the solution of the parameter estimation problem is to minimize a cost function, which is a function of the squares of the model residuals (errors) at the time step k: where n k is the number of data points. The estimate of the parameter set p is the minimizer of the cost function Due to the nature of the nonlinear model f (t, p) only a local minimum can be guaranteed, not the global. Before formulating the cost function, we introduce a parameter scaling = Γ − p , ..., n 1 containing the initial guess of the model parameters. This ensures that the parameters are of similar order of magnitude which aids convergence of the nonlinear optimization [24].
The cost function to be minimized is now written as: where the norm ∥ ∥ ⋅ 2 2 of the radial profiles are approximated by the sum-of-squares over several points ρ i chosen on the radial ρ grid, for example: Weighting scalars ν Te , ν ι , ν Upl were chosen such that the con- By minimizing the cost function in equation (19) the model parameters p are found such that RAPTOR describes the temporal evolution of the radial U pl , T e and ι profiles more accurately than when using the initial guess of the model parameters.
The cost function can also be extended to estimate the model parameters for n d datasets at once. This feature can be used to identify one set of model parameters which accurately describe the data for various plasma shots done in different conditions. In this case the cost function definition becomes:

Outline of the parameter estimation algorithm
The parameter estimation algorithm has to identify a (sub)set of eight χ e model parameters that appear in equation (10):  Figure 3 shows the flow diagram of the developed parameter estimation algorithm. To start the parameter estimation routine, an initial guess of the model parameters p must be specified. RAPTOR uses this initial guess to solve the nonlinear poloidal flux diffusion equation and nonlinear electron energy transport equation to obtain the state vector x(t), defined by equation (4). This state vector is used to compute the profiles ( ) ρ T e , ( ) ι ρ and ( ) ρ U pl at all times as defined in equations (7)- (9). The model predictions are compared to experimental data by computing the cost function J. The cost function is minimized with an optimization routine. The sequential quadratic programming (SQP) algorithm [25], which is readily available in MATLAB®, was used for this purpose. The SQP algorithm finds the (local) minimum of the cost function by changing the model parameters p in an iterative way. As long as the cost function is not in a local minimum, the algorithm attempts to compute a new set of parameters p for which the cost function is lower. The SQP algorithm can also deal with constraints, which is useful to impose realistic bounds on the parameters to be estimated. If the algorithm behaves correctly, the parameters obtained at the (local) minimum yield ι, U pl and T e profiles in RAPTOR that match the experimental data better than the initial guess of the model parameters. Due to the nonlinearity of the parameter estimation problem, no strict guarantees can be given on whether a local or global minimum has been found.
One can merely rely on practical techniques, such as shooting methods, to reduce the chance of finding local minima.

Computation of the cost function gradient
The performance of the SQP algorithm critically depends on the derivative of the cost function to the model parameters J p d /d . This cost function gradient gives information on the direction in parameter space in which the cost function decreases. In RAPTOR this gradient J p d /d is computed analytically. This has two major advantages, the first one being that the SQP algorithm converges faster to the (local) minimum because no finite difference approximation is needed to approximate the derivative of the cost function to the model parameters. The second advantage is that the analytical gradient is exact in contrast with the numerical approximation, which enhances the performance of the SQP algorithm. The analytical expression of J p d /d is computed as follows Note: The difference in the value of the cost function yields the finite difference number J d num . The value J d analyt was obtained using the analytical expression for the cost function gradient described in this section (equation (22)): analyt . The gradient ( ) ∂ ∂ J p / analyt was computed using the nominal model parameter p. The relative difference between the two various approaches to calculate J d are small (maximum 0.16% for σ c saw ).
The term ( ) can be obtained from direct differentiation of the cost function. The term ( ) is obtained from the forward sensitivity equation, which is discussed next.
Recall that to obtain the state vector x, RAPTOR solves equation (6) (which contains the flux diffusion equation and electron energy transport) at discrete-time points k Differentiating this equation with respect to a vector of model parameters p results in the forward sensitivity equation In this equation • The first two terms on the right hand side contain the Jacobians ∂ ∂ + f x / k k 1 and ∂ ∂ f x / k k . These are already computed at each time step in RAPTOR since they are necessary for solving the Newton problem used to obtain the next time step of the simulation; • The third term is nonzero only if the inputs are prescribed in terms of parameters, as was the case in [5], but not in this work; • The last term is nonzero for model parameters that appear in the equations (10) and (15).   Now every term is known, the ODE sensitivity equation (23) is recursively solved for each parameter in the parameter vector p starting from the initial condition ∂ ∂ x p / 0 (which is typically 0, since the initial condition does not depend on any parameters). The recursive solution of this finite difference equation yields ∂ ∂ x p / k , for [ ] ∈ k n 1, ..., k . This set of matrices is referred to as the state sensitivities with respect to the model parameters p. Once the state sensitivities are known, the cost function gradient with respect to the model parameters can be computed using (22).
We now verify that the cost function gradient is correctly computed. As shown in table 1 the analytical calculation of the cost function gradient and its finite difference estimate are in good agreement (maximum relative difference between the two approaches is smaller than 0.16%), which confirms the correct implementation.

Results
In this section we present the application of the parameter estimation algorithm to the TCV tokamak. We first present a numerical experiment to validate the parameter estimation algorithm by finding known parameters from simulated data (section 5.1). Then we will show the application of the procedure to ASTRA interpretative transport simulations of TCV tokamak plasmas (section 5.2).

Demonstration: finding known model parameters from simulated data
5.1.1. Noise-free case. A numerical experiment was performed to demonstrate that the developed estimation algorithm is able to find the model parameters in χ e and ∥ σ . In this experiment the experimental data contains a set of T e , ι and U pl profiles generated by a forward simulation of RAPTOR with known parameters and without any artificial noise. A plasma with TCV parameters exhibiting reversed shear and sawteeth behavior was simulated by applying fast changes to the plasma current and EC power. A time step of 1 ms was used in a 0.4 s simulation and 11ρ gridpoints were equally distributed from 0 to 1. In figure 4 the time traces of key quantities of the plasma during this numerical experiment are shown and in figure 5 various plasma profiles are depicted at three distinct time points.
The true model parameters p of χ e and ∥ σ that were used to generate the synthetic data are given in table 2. The developed estimation method was used to recover the model  parameters which were set to generate the data. To start the parameter estimation procedure an arbitrary initial guess of the model parameters was used (table 2). In figure 6 the cost function is plotted versus the iteration steps. The cost function reduces as the number of iterations increase, showing that the estimated model parameters at each increasing iteration step allow for a better description of the synthetic data. At the final iteration step, the relative error between the estimated model parameters (p est ) and true model parameters (p true ) is 3 .5004 10 est t rue 2 true 2 10 , meaning that the estimation routine almost perfectly recovers the model parameters which were used to generate the synthetic data set.
In figure 7(a) the relative error of ι is shown versus time for a variety of different iteration steps. In figure 7(b), the relative error of T e versus time and a variety of different iteration steps is plotted, and in figure 8 the relative error of the loop voltage U pl at the final time point (0.4 s) is shown versus iteration steps. The relative errors reduce with increasing number of iterations as the estimated model parameters converge to the model parameters which were used to generate the data within machine precision.
This numerical experiment demonstrates that the developed parameter estimation algorithm is able to recover the original model parameters from simulated data.

Noisy case.
In this section the reliability and precision of the estimated model parameters in χ e and ∥ σ are investigated. A parameter is ill-determined if its estimated value is affected strongly by seemingly insignificant variations in the data. If a series of experiments were repeated many times, this could result in estimates that differ from one replication to the next. A new numerical experiment was performed to investigate the statistical properties of the estimated model parameters. The same numerical set-up as in the previous section was used but with the difference that the synthetic data of the with the weighting factors ν as defined as for the cost function in section 4.1.1. This numerical experiment also reflects reality: the measurements of the T e , ι and U pl profiles are always contaminated with noise. Let p denote the estimated parameter vector. To investigate the reliability and precision of the estimated model parameters one needs to compute the covariance matrix for the parameters. According to the asymptotic theory, this matrix can be approximated as [26] ( ) The off-diagonal elements give information about the covariance of the parameters. The correlation matrix R, with elements R ij , can be computed from the covariance matrix C ii jj (27) The relative ( ) α − 100 1 % marginal confidence interval for the ith parameter is given by [23]: (28) in which α − t n n /2 k p denotes a quantile of the t-distribution with − n n k p degrees of freedom. Note that the statistical analysis only provides local properties for the parameter estimation problem instead of global ones, as multiple local minima with each different properties could exist due to the nonlinear nature of the equations in RAPTOR. In table 3 the estimated model parameters p and the relative deviations from the true model parameters are presented for the numerical experiment with noise when using the tabled initial guess of the model parameters. The table also contains the 95% confidence intervals.
The true model parameters are all within the 95% confidence intervals. The model parameters χ neo has a large degree of uncertainty. Typically, the value of the electron heat diffusivity χ e profile is more than a factor 10 greater than χ neo . The relative contribution of χ neo to the χ e profile is therefore very small. Perturbing the value of χ neo will have little influence on χ e and thus on the cost function. It can be concluded that the model parameter χ neo is poorly observable from the data and could be kept outside the set of identified parameters when performing the parameter estimation for experimental data. For the nonlinear parameter estimation problem, the 95% confidence intervals commonly underestimate the actual uncertainty of the estimated parameters. This is because the off-diagonal elements of the covariance matrix are ignored for the computation of the confidence intervals. As a consequence, parameter correlation is ignored. For example, from the correlation matrix R it follows that there is a strong negative correlation (−0.8767) between χ neo and c ano . By increasing χ neo and simultaneously decreasing c ano a solution can be obtained that is very nearly as good as the one with the estimated model parameters. The 95% confidence intervals in table 3 should therefore be interpreted as lower limits on the parameter uncertainties.

Parameter estimation for TCV
In this section, the parameter estimation procedure is used to identify the unknown model parameters in RAPTOR, so as to give reliable predictions of the T e , ι and U pl profile evolution in TCV experiments.

Experimental setup.
Since direct experimental measurements of the q profile are not available at TCV, interpretative transport simulations were run with the more complete ASTRA code [10] using the measured loop voltage U pl , plasma current I p , plasma boundary, electron temperature profile T e and electron density profile n e . The following procedure was followed: • TCV experiments were performed, taking measurements of the electron temperature (from Thomson Scattering), line-averaged density (interferometer), loop voltage and plasma current; • No experimental measurements of T i were available, so we used instead the following formula for the ion pressure • The deposition location of heat by the auxiliary EC heating system was obtained using the TORAY-GA code [27]; • The 'experimental' thermal conductivity was calculated filling in the spline-fitted experimental profiles of n e and T e into (2), also using information about the plasma shape obtained from equilibrium reconstruction; • When appropriate, this thermal conductivity is modified to include an effective reduction of core confinement due to sawtooth activity; • The thermal conductivity was used as input in a new predictive ASTRA simulation, evolving T e and ψ as a result of the applied ECH/ECCD actuator powers. This was done to get a high-resolution representation of the evolution of all the profiles.
This procedure was carried out for a selection of 12 relevant TCV shots, which mutually featured a wide range of confinement regimes and settings for the EC heating and current drive system. The individual discharges also contained a large variation in time of the applied axillary power. Time evolutions of the key parameters of these discharges can be seen in figures 10-13.

Parameter estimation procedure.
A three-way data split approach [28] was used in order to identify the model parameters in RAPTOR for TCV discharges. The twelve TCV shots were divided into three disjoint sets: a training, validation and test set. The training set was used to estimate the model parameters with the developed parameter estimation procedure that yield reliable predictions of the T e , ι and U pl profiles. The aim is not that RAPTOR perfectly fits the training data, but gives reliable predictions for shots it has not been trained on. Therefore a validation set was used to investigate if during the learning phase overfitting [29] occurred. This would imply that RAPTOR not only captures the underlying phenomenon of the training data but also the noise associated with the particular data set. Lastly, the test set was used to assess the real performance of RAPTOR on unseen data when using the estimated model parameters. The training set consists of shots 46 712, 46 713, 4 7500, 48 041, 48 042 and 49 293, the validation set of shots 46 715, 47 501 and 48 043. Shots 46 716, 46 714 and 48 044 form the test set which was utilized to assess the performance of RAP-TOR's predictions. The ramp-up and ramp-down phases of these discharges were disregarded to avoid issues related to the initial conditions and strongly time-varying geometric profiles.
The following model parameters in the model for ∥ σ and χ e were estimated The model parameters that describe sawteeth were estimated, because sawtooth crashes are present in the TCV shots and the q profile drops below 1. However, the model parameters a ic , w ic and d ic were not estimated, because the shots considered do not represent advanced plasma scenarios characterized by a very low or negative magnetic shear s. Therefore the data does not contain information about a ic , w ic and d ic , and these parameters are therefore unidentifiable.
Conversely, since they do not influence the profile evolution in this regime, a ic is set to 0 to remove the effect of enhanced shear entirely.

5.2.3.
Results of parameter estimation. The model parameters were estimated for the six shots simultaneously in the training set in order to describe accurately a variety of plasma scenarios with RAPTOR. The time evolution of the T e , ι and U pl profiles from the interpretative ASTRA transport simulations  were fitted by the parameter estimation procedure using the cost function definition (21). In table 4 the estimated parameters at the final iteration step and the corresponding 95% confidence intervals are given for the training set. This table also includes the realistic initial guess of the parameters that was used at the first iteration step. The cost function J of the training set versus the iteration steps is depicted in figure 12.
At each iteration step during the learning process a new set of model parameters is computed which reduce the difference between measurements and predictions. The same model parameters are also used to calculate the cost function of the validation set. Overfitting is avoided by stopping the learning process when the cost function of the validation set starts increasing while the cost function of the learning set still decreases with each iteration step. The cost function of the validation set is shown in figure 13 and is in contrast to the cost function of the learning set not monotonically decreasing because of the slight increase around iteration step 7. Despite this non monotonic character however, the estimated model parameters at the final iteration step for the training set also yield the lowest value for the cost function of the validation set, which indicates that no overfitting during the learning process occurred. Moreover, the identified model parameters    table 4 are therefore the optimal parameters to describe the TCV shots with RAPTOR and will be used to assess the quality of RAPTOR's predictions of the temporal evolution of the T e , ι and U pl profiles for shots not used in the parameter estimation process.
In figures 14-16 RAPTOR's predictions of the temporal evolution of the T e and ι profiles are compared with the results of the ASTRA code for the three shots of the test set and when using both the initial guess of the model parameters at the first iteration step and the optimal model parameters at the final iteration step. The optimal model parameters decrease the cost function of the learning set by more than a factor five compared to their initial guess. From the definition of the cost function it follows that a lower value implies better agreement between the ASTRA interpretative results and RAPTOR. This can indeed be observed from the relative difference plots of the T e and ι profiles computed by ASTRA and RAPTOR at the first and final iteration step in figures 14-16. When using the identified model parameters the relative difference between the predictions of the T e profile of RAPTOR and ASTRA is at most 20% and the relative difference of the ι profile is below 10%.
The model parameters obtained by this technique are given in table 4. We note that the values correspond to our physics understanding of core electron transport. The neoclassical level, χ neo , is about one order of magnitude smaller than the anomalous level, confirming anomalous transport is dominating (trapped electron mode in this case [30]). We find that the sawtooth effects can be assimilated with a diffusion coefficient another order of magnitude larger than c ano . This is consistent with significant profiles flattening observed in tokamak central plasmas. On the other hand the value of c Te , 0.7, is near the lower expected bound. Note that in these TCV L-mode plasmas, the power degradation is also smaller than the H-mode scaling law, P −0.55 [30], which would correspond to = c 1.2

Te
. However other parameters are involved with increasing input power, for example increased recycling and hence density. It will be interesting to perform a similar study with models including particle transport.

Outlook
In this paper a methodology was presented to estimate the parameters for a nonlinear control-oriented transport model. This was applied for estimating the parameters of the realtime transport code RAPTOR. A semi-empirical transport model was used to describe the electron heat diffusivity, which included the time-averaged sawtooth behavior by an ad hoc addition to the neoclassical conductivity and electron heat diffusivity. For future work it would be interesting to asses the effect of the choices of the cost function, weighing factors and where the L 2-norm is taken over the radial ρ i grid. The relative errors of the profiles are shown when using the initial guess of the model parameters (denoted by 'before optimization') and when using the optimal model parameters. The corresponding T e and ι profiles when using the optimal model parameters at the final iteration step are plotted at three distinct time points.
optimization routine on the quality of the predictions of the transport model. There are other means to achieve good predictions using real-time capable transport models. One option, mentioned in the introduction, is to use neural network fits of the output of a quasilinear code [20]. Another option is the recent work on the non-stiffness of edge transport in L-mode tokamaks [30]. In this work it was shown that the average effect of sawteeth on density is different than on T e , the inversion radius ρ inv being larger for n e . To date a prescribed density profile is used in RAPTOR. The reason for this choice is that the most important nonlinear coupling between plasma profiles during a tokamak discharge stems from the electron temperature-dependent resistivity and bootstrap current and the q profile-dependent confinement. The inclusion of a model for n e and the effect of sawteeth on this profile, as described in [30], would increase the quality of RAPTOR's predictions. We would like to stress that the presented parameter estimation method is generic, and also completely different models for the neoclassical conductivity and electron heat diffusivity could be used as long as they appear as analytical functions of the plasma state variables (q, T e , etc).
Capturing more physics with a control-oriented transport model will result in an increase of the number of unknown model parameters. Choosing all the right model parameters by hand will become a tedious, time consuming task and therefore the presented parameter estimation method would greatly aid with this process.
A wider operating regime of the tokamak can be spanned by the model by using data from a wider operating regime to fit the parameters. In this work the considered data of the TCV tokamak did not represent advanced plasma scenarios characterized by a very low or negative magnetic shear s. Also, more direct measurements of the q profile, without requiring the intermediate step of interpretative transport simulations, should also result in a more accurate transport model. As a result some model parameters could not be identified. In the future, the control-oriented transport model in combination with the parameter estimation method can be used on other tokamaks as well in order to assess the performance of the developed methods. Finally, we mention the possibility to use the parameter estimation technique based on high-fidelity model simulations of future tokamaks (ITER, JT60-SA) to yield a fast, real-time capable 1D profile evolution model.

Conclusions
RAPTOR is a control-oriented tokamak profile evolution model designed to run in real-time during the plasma evolution, as well as for use in actuator trajectory optimization routines and controller design and testing. It uses semi-empirical models for the electron heat diffusivity χ e and the electrical where the L 2-norm is taken over the radial ρ i grid. The relative errors of the profiles are shown when using the initial guess of the model parameters (denoted by 'before optimization') and when using the optimal model parameters. The corresponding T e and ι profiles when using the optimal model parameters at the final iteration step are plotted at three distinct time points.
conductivity ∥ σ that contain unknown parameters, which need to be estimated in order to make reliable plasma profile predictions. A systematic method, based on the nonlinear least-squares theory, was developed to identify these model parameters.
A cost function is formulated reflecting the discrepancy between simulated and known, either measured or reconstructed, profile evolutions. The performance of the developed estimation method critically depends on the derivative of the cost function to the model parameters. In the developed estimation method this gradient is computed analytically, resulting in a faster convergence to the solution and higher performance as this analytical expression is exact.
It was shown that the parameter estimation method successfully recovers known model parameters from simulated data, demonstrating the correct implementation of the method. In reality, the measured plasma profiles are contaminated with noise. This situation was mimicked by adding white noise to simulation data. A statistical analysis of the identified model parameters was done for the noisy case, as parameter values could be affected strongly by seemingly insignificant variations in the data. In the noisy case the developed parameter estimation method does not recover the model parameters exactly, but the true model parameters are all covered by the calculated 95% confidence intervals. The statistical analysis also shows that, occasionally, model parameters can not be estimated independently: the effect of a specific model parameter can be compensated by another model parameter. The parameter estimation method was applied for shots of the TCV tokamak. The model parameters of χ e and ∥ σ were estimated such that RAPTOR's predictions are reliable for a variety of plasma conditions and match post-shot interpretative profile simulations using the ASTRA code. A set of parameters was found such that the difference between the experimental and simulated profiles was less than 10% in flattop phase for the ι profile and 20% for the T e profile for the discharges considered.