Drift Estimation for Stochastic Reaction-Diffusion Systems

A parameter estimation problem for a class of semilinear stochastic evolution equations is considered. Conditions for consistency and asymptotic normality are given in terms of growth and continuity properties of the nonlinear part. Emphasis is put on the case of stochastic reaction-diffusion systems. Robustness results for statistical inference under model uncertainty are provided.

We consider a semilinear stochastic partial differential equation (SPDE)

$$dX(t,x) = \theta A X(t,x)\,dt + F(t, X(t,x))\,dt + B\,dW(t,x) \tag{1}$$

with $X(0,x) = X_0(x)$ on a suitable domain $D \subset \mathbb{R}^n$. Detailed conditions for the terms appearing in (1) are stated in Section 1.1. We write $X_t = X(t,x)$ for short. Assume that we are given complete information on the process $X$ up to a finite time $T > 0$. The statistical problem we are interested in consists in estimating the unknown value $\theta > 0$.
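To this end, we adopt a maximum likelihood based approach. Denote by $X^N$ the $N$-dimensional approximation to the solution trajectory obtained by truncation in Fourier space. $X^N$ generates a probability measure on the space of continuous paths with values in $\mathbb{R}^N$, denoted $P^N_\theta$. Of course, different values of $\theta$ lead to different measures on path space. We fix a reference parameter $\theta_0 > 0$ (which is arbitrary and does not necessarily coincide with the true parameter) and formally apply a version of Girsanov's theorem (as in [23], Section 7.6.4) in order to obtain a representation for the density of $P^N_\theta$ with respect to $P^N_{\theta_0}$ (written for the noise operator $B = (-A)^{-\gamma}$ of Section 1.1):

$$\frac{dP^N_\theta}{dP^N_{\theta_0}}(X^N) = \exp\Big( -(\theta-\theta_0)\int_0^T \langle (-A)^{1+2\gamma}X^N_t, dX^N_t\rangle - \frac{1}{2}(\theta^2-\theta_0^2)\int_0^T |(-A)^{1+\gamma}X^N_t|_H^2\,dt + (\theta-\theta_0)\int_0^T \langle (-A)^{1+2\gamma}X^N_t, F_N(t,X_t)\rangle\,dt \Big).$$

Here, $F_N$ is the $N$-dimensional Fourier approximation of $F$. Maximizing the log-likelihood with respect to $\theta$ yields the following estimator:

$$\hat\theta_N = -\frac{\int_0^T \langle (-A)^{1+2\gamma}X^N_t, dX^N_t\rangle - \int_0^T \langle (-A)^{1+2\gamma}X^N_t, F_N(t,X_t)\rangle\,dt}{\int_0^T |(-A)^{1+\gamma}X^N_t|_H^2\,dt}. \tag{2}$$

Note that the derivation of $\hat\theta_N$ is purely heuristic, so asymptotic properties of the estimator cannot simply be derived from the general theory of maximum likelihood estimation (as presented e.g. in [18]).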
The aim of this work is to extend the results from [9] to a class of semilinear stochastic evolution equations of the form (1). We analyze different variants of $\hat\theta_N$, which correspond to different ways of handling the nonlinear term, see Section 1.2 for details. All estimators are based on the Fourier decomposition of $X$. We present conditions on the growth and continuity properties of the nonlinear operator $F$ which are sufficient to guarantee consistency and asymptotic normality of these estimators as the number of Fourier modes $N$ tends to infinity (see Theorem 1.2 in Section 1.3). Special emphasis is put on the important case of stochastic reaction-diffusion systems with polynomial nonlinearities. Furthermore, we study the impact of model misspecification on estimating $\theta$ in Section 2.5. More precisely, assume that the true nonlinearity $F$ which governs the dynamics of $X$ is unknown or too complex to be handled directly. We discuss to what extent $F$ may be approximated by a simple model nonlinearity $F_{\mathrm{approx}}$ from the point of view of parameter estimation. Finally, we show how to adapt the argument in order to deal with a coupled system of reaction-diffusion equations, see Section 5. Our motivation in this regard is to study conductance-based neuronal models, see [33] and references therein.
Statistical inference, in particular drift estimation, for stochastic ordinary differential equations (SODEs) is a well-established theory, see e.g. [20,23,22]. It is well known that it is in general not possible to identify the drift term of an SODE in finite time. The reason is that, due to Girsanov's theorem, the measures on path space generated by different drift terms are mutually equivalent. However, as $T \to \infty$, the true drift can be recovered asymptotically. The same is true for stochastic evolution equations with bounded drift on general function spaces.
Notably, the situation changes for SPDEs with unbounded drift containing differential operators. In this case, it is usually possible to identify the coefficient in front of the leading term of the drift operator. This has been observed first in [14] and [17] (see also [15]), and since then various publications have been devoted to studying and expanding this phenomenon, see e.g. [26,28,16,32] for the case of non-diagonalizable linear equations. Notice also the recent works [2] dealing with local measurements and [31,3,5,6,10,8] for parameter estimation under spatially and temporally discrete observations for a high-frequency regime. Surveys are presented in [27,7]. The main focus, however, has been put on linear equations such as the stochastic heat equation, which corresponds to the case that $F$ is either zero or another linear operator. So far, only few results about parameter estimation for nonlinear SPDEs are available, most notably [9] (see also [7]), which considers the 2D Navier-Stokes equations and serves as a guideline for our work.

Recall that $V \subset H \simeq H^* \subset V^*$ with $V = D((-A)^{1/2})$ is a Gelfand triple, and for $h \in H$ and $v \in V$ we have ${}_{V^*}\langle h, v\rangle_V = (h, v)_H$, where ${}_{V^*}\langle\cdot,\cdot\rangle_V$ is the dual pairing between $V$ and its dual $V^* \simeq D((-A)^{-1/2})$. The general model we are interested in is given by the following equation in $H$:

$$dX_t = (\theta A X_t + F(t, X_t))\,dt + B\,dW_t \tag{3}$$

together with initial condition $X_0 \in H$. Here, $F : [0,T] \times V \to V^*$ is a (possibly nonlinear) measurable operator, $W$ is a cylindrical Wiener process on $H$ with respect to some stochastic basis $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t\ge 0}, \mathbb{P})$, and $B \in L_2(H)$ is of Hilbert-Schmidt type.
As we need weak solutions only, the stochastic basis and the cylindrical Wiener process W need not be determined in advance. The number θ > 0 is the unknown parameter to be estimated.
For simplicity, we restrict ourselves to the case $B = (-A)^{-\gamma}$, $\gamma > 0$. For later use, we introduce some notation. Let $(\Phi_k)_{k\in\mathbb{N}} \subset H$ be an orthonormal basis of eigenvectors of $-A$ such that the corresponding eigenvalues $(\lambda_k)_{k\in\mathbb{N}}$ (counted with multiplicity) are ordered increasingly. For $N \in \mathbb{N}$, the projection onto the span of $\Phi_1, \dots, \Phi_N$ is called $P_N : H \to \mathrm{span}\{\Phi_1, \dots, \Phi_N\} \subset H$. The Sobolev norms on the spaces $D((-A)^\rho) \subset H$ will be denoted by $|x|_\rho = |(-A)^\rho x|_H$. The following Poincaré-type inequalities hold for $\rho_1 < \rho_2$:

$$|P_N x|_{\rho_2} \le \lambda_N^{\rho_2-\rho_1}\,|P_N x|_{\rho_1}, \qquad |(I-P_N)x|_{\rho_1} \le \lambda_{N+1}^{-(\rho_2-\rho_1)}\,|(I-P_N)x|_{\rho_2}.$$

For our analysis, the regularity spaces

$$R(\rho) := C(0,T; D((-A)^{\rho})) \cap L^2(0,T; D((-A)^{\rho+\frac{1}{2}}))$$

are central. A weak solution to (3) exists if there is a stochastic basis $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t\ge 0}, \mathbb{P})$ together with a cylindrical Wiener process $W$ on $H$ and an $(\mathcal{F}_t)_{t\ge 0}$-adapted process $X \in R(\rho)$ such that

$$X_t = X_0 + \int_0^t (\theta A X_s + F(s, X_s))\,ds + \int_0^t B\,dW_s, \qquad t \in [0,T]. \tag{7}$$

We say that $X$ "is" a weak solution to (3) if a stochastic basis and a cylindrical Wiener process can be found such that (7) holds. We need the following class of assumptions, parametrized by $\rho \ge 0$:

($A_\rho$) The observed process $X$ is a weak solution to (3) on $[0,T]$, unique in the sense of probability law, with $X \in R(\rho)$ a.s.

Of course, for ($A_\rho$) it is sufficient that (3) is well-posed in the probabilistically strong sense: recall that uniqueness in the sense of probability law can be inferred from pathwise uniqueness by means of the Yamada-Watanabe theorem [25, Appendix E]. We give a short and self-contained discussion of existence, uniqueness and regularity of strong solutions to (3) in Appendix A.
Remark. In terms of statistical inference, it does not matter if the process we observe is a strong solution to (3) in the probabilistic sense or just a weak solution. The results of Theorem 1.2 below depend only on the law induced by (X t ) 0≤t≤T on path space (we need, of course, that this law is uniquely determined). The law of the process depends on θ but is independent of the way the weak solution is constructed. We want to point out that even if the examples we are interested in are in fact constructed as strong solutions (see Theorem A.1), this is not at all crucial from the statistical point of view. See [12,Chapter 8] for a discussion of weak solutions to SPDEs in the probabilistic sense.
For $N \in \mathbb{N}$, the projected process $X^N := P_N X$ satisfies

$$dX^N_t = (\theta A X^N_t + P_N F(t, X_t))\,dt + P_N B\,dW_t. \tag{8}$$

Throughout this work we assume that the eigenvalues $(\lambda_k)_{k\in\mathbb{N}}$ of $-A$ have polynomial growth, i.e. there exist $\Lambda, \beta > 0$ such that

$$\lambda_k \sim \Lambda k^{\beta}. \tag{9}$$
In particular, $\lambda_k \sim \lambda_{k+1}$. Here, $a_k \sim b_k$ denotes asymptotic equivalence of two sequences of positive numbers $(a_k)_{k\in\mathbb{N}}$, $(b_k)_{k\in\mathbb{N}}$ in the sense that $\lim_{k\to\infty} a_k / b_k = 1$. Similarly, $a_k \lesssim b_k$ means $a_k \le C b_k$ for a constant $C > 0$ independent of $k$.
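As a concrete instance of the growth condition (9), offered only as an illustration: for the Dirichlet Laplacian on the unit interval the eigenpairs are explicit, so the constants $\Lambda$ and $\beta$ can be read off directly. The choice of domain $(0,1)$ is an assumption made for this example.

```latex
\begin{align*}
&-A = -\partial_x^2 \ \text{on } (0,1) \text{ with Dirichlet boundary conditions:} \\
&\Phi_k(x) = \sqrt{2}\,\sin(k\pi x), \qquad \lambda_k = \pi^2 k^2, \\
&\text{so that } \lambda_k = \Lambda k^{\beta} \ \text{with } \Lambda = \pi^2,\ \beta = 2, \\
&\frac{\lambda_{k+1}}{\lambda_k} = \Big(1 + \frac{1}{k}\Big)^{2} \longrightarrow 1
\quad (k \to \infty).
\end{align*}
```

In this case the equivalence in (9) is an exact identity, and $\beta = 2$ agrees with the general rate $\beta = 2/n$ for the Laplacian in dimension $n = 1$.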
Finally, we introduce the parameter $\rho^*$, which turns out to describe the regularity of $X$:

$$\rho^* := \gamma - \frac{1}{2\beta}. \tag{10}$$

1.2. Statistical Inference. We describe three estimators for $\theta$ (see [9]), which correspond to different levels of knowledge about the solution trajectory $(X_t)_{t\in[0,T]}$. All estimators depend on a contrast parameter $\alpha \in \mathbb{R}$.
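(i) Given continuous-time observation of the full solution $(X_t)_{t\in[0,T]}$, the heuristic derivation of the maximum likelihood estimator (cf. [9]) yields the following term:²

$$\hat\theta^{\mathrm{full}}_N := -\frac{\int_0^T \langle (-A)^{1+2\alpha}X^N_t, dX^N_t\rangle - \mathrm{bias}_N(X)}{\int_0^T |(-A)^{1+\alpha}X^N_t|_H^2\,dt}, \tag{11}$$

where

$$\mathrm{bias}_N(U) := \int_0^T \langle (-A)^{1+2\alpha}X^N_t, P_N F(t, U_t)\rangle\,dt. \tag{12}$$

This estimator depends on the whole of $X$ via the bias term. Note that for $\alpha = \gamma$ this is precisely the estimator given in (2).

(ii) Assume we observe just the projected solution $(X^N_t)_{t\in[0,T]}$. In this case, we need to replace the term $P_N F(t, X_t)$ by $P_N F(t, X^N_t)$ and consider the estimator

$$\hat\theta^{\mathrm{partial}}_N := -\frac{\int_0^T \langle (-A)^{1+2\alpha}X^N_t, dX^N_t\rangle - \mathrm{bias}_N(X^N)}{\int_0^T |(-A)^{1+\alpha}X^N_t|_H^2\,dt}.$$

(iii) In any of the preceding observation schemes, we may leave out the nonlinear term completely:

$$\hat\theta^{\mathrm{linear}}_N := -\frac{\int_0^T \langle (-A)^{1+2\alpha}X^N_t, dX^N_t\rangle}{\int_0^T |(-A)^{1+\alpha}X^N_t|_H^2\,dt}.$$

For notational convenience, we suppress the dependence on $\alpha$ of all estimators.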
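The following is a minimal numerical sketch of how an estimator of this ratio form can be evaluated from discretely sampled Fourier coefficients $x^k = (X, \Phi_k)_H$. It is an illustration under stated assumptions, not the authors' implementation: the stochastic integral is approximated by a left-point (Itô) sum, the time integrals by Riemann sums, and the test data come from the linear case $F = 0$ with $-A$ the Dirichlet Laplacian on $(0,1)$, sampled exactly mode by mode. The function names `simulate_linear_modes` and `drift_estimator` and the argument `f_coef` are hypothetical.

```python
import numpy as np


def simulate_linear_modes(theta, gamma, lam, dt, n_steps, rng):
    """Exact sampling of dx_k = -theta*lam_k*x_k dt + lam_k**(-gamma) dW_k, x_k(0) = 0.

    Assumption: linear case F = 0, zero initial condition.
    Returns an array of shape (n_steps + 1, N).
    """
    decay = np.exp(-theta * lam * dt)
    # Standard deviation of the Ornstein-Uhlenbeck transition over one step.
    step_std = lam ** (-gamma) * np.sqrt((1.0 - decay ** 2) / (2.0 * theta * lam))
    x = np.zeros((n_steps + 1, lam.size))
    for j in range(n_steps):
        x[j + 1] = decay * x[j] + step_std * rng.standard_normal(lam.size)
    return x


def drift_estimator(x, lam, dt, alpha, f_coef=None):
    """Discretized ratio estimator for theta.

    x      : (n_steps + 1, N) Fourier coefficients x_k(t_j)
    lam    : (N,) eigenvalues of -A
    f_coef : optional (n_steps + 1, N) coefficients of P_N F along the path;
             None drops the bias term (the 'linear' variant).
    """
    w_num = lam ** (1.0 + 2.0 * alpha)   # weights of (-A)^(1+2*alpha)
    w_den = lam ** (2.0 + 2.0 * alpha)   # weights of |(-A)^(1+alpha) .|^2
    x_left = x[:-1]
    dx = np.diff(x, axis=0)
    stoch_int = np.sum(w_num * x_left * dx)          # Ito sum
    bias = 0.0
    if f_coef is not None:
        bias = np.sum(w_num * x_left * f_coef[:-1]) * dt
    denom = np.sum(w_den * x_left ** 2) * dt
    return -(stoch_int - bias) / denom


# Illustrative run; all parameter values are choices made for this example.
rng = np.random.default_rng(0)
lam = (np.pi * np.arange(1, 21)) ** 2     # Dirichlet Laplacian on (0, 1), N = 20
theta_true, gamma, dt, n_steps = 0.02, 0.4, 1e-4, 10_000
x = simulate_linear_modes(theta_true, gamma, lam, dt, n_steps, rng)
theta_hat = drift_estimator(x, lam, dt, alpha=gamma)
print(theta_hat)
```

Passing the coefficients of $P_N F(t, X_t)$ or of $P_N F(t, X^N_t)$ as `f_coef` gives discretized analogues of the full and partial variants, respectively. The time step has to be small relative to $1/(\theta \lambda_N)$, since the Itô sum is only a first-order approximation of the stochastic integral.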
² Recall that $\int_0^T \langle a_t, db_t\rangle := \int_0^T a_t^T\,db_t$ for vector-valued processes $a_t$ and $b_t$.

Remark.
• Note that by Itô's formula the stochastic integral in the numerator of the estimators has a robust representation:

$$\int_0^T \langle (-A)^{1+2\alpha}X^N_t, dX^N_t\rangle = \frac{1}{2}\sum_{k=1}^N \lambda_k^{1+2\alpha}\Big( (x^k_T)^2 - (x^k_0)^2 - \lambda_k^{-2\gamma}\,T \Big),$$

where $x^k := (X, \Phi_k)_H$. Therefore, the estimators are functionals of the observed data only.
• Consistency of any of the three estimators as $N \to \infty$, as proven in Theorem 1.2, implies that for $T < \infty$ the measures on $R(0)$ induced by $(X_t)_{0\le t\le T}$ are mutually singular for different values of $\theta$. This extends the observation first made in [14].
• In particular, $\theta$ can be reconstructed exactly from full spatial observation. This implies that $\theta$ itself is its optimal estimator in this setting. However, it is of independent interest to determine the rate and asymptotic distribution of $\hat\theta^{\mathrm{full}}_N$, because the analysis of the estimators $\hat\theta^{\mathrm{partial}}_N$

($S_\rho$) There is $\epsilon_\rho \ge \frac{1}{2}$, an integrable function $f_\rho \in L^1(0,T;\mathbb{R})$ and a continuous function $g_\rho : [0,\infty) \to [0,\infty)$ bounding $F(t,v)$ for any $t \in [0,T]$ and $v \in D((-A)^{\rho+\frac{1}{2}})$. Equivalently, we may choose $g_\rho$ to be just locally bounded, because in this case there is a continuous $\tilde g_\rho : [0,\infty) \to [0,\infty)$ with $g_\rho \le \tilde g_\rho$. We call $\epsilon_\rho$ the excess regularity of $F$.³ A slightly different version of this condition is useful too:

($S_\rho$) There is $\epsilon_\rho > 0$, an integrable function $f_\rho \in L^1(0,T;\mathbb{R})$ and a continuous function $g_\rho : [0,\infty) \to [0,\infty)$ bounding $F(t,v)$ for $t \in [0,T]$ and $v \in D((-A)^{\rho+\frac{1}{2}})$.

Either version of ($S_\rho$) is needed in order to carry out a perturbation argument with respect to the linear case.
($T_\rho$) There is $\delta_\rho > 0$ and a continuous function . Condition ($T_\rho$) is sufficient to formalize the intuition that $\hat\theta^{\mathrm{partial}}_N$ should not be worse than $\hat\theta^{\mathrm{full}}_N$, given that the nonlinear behavior is taken into account at least partially in the bias term.

The next condition is required in order to ensure well-posedness of the solution to (3). In order to state the condition, we formally write .

Finally, we state a property, dependent on a parameter $\eta > 0$, which is crucial in the examination of the estimators. However, this property results from the conditions stated above and will not be tested directly in the examples.
Here, $\bar X_t = \int_0^t S(t-s)B\,dW_s$ is the stochastic convolution with respect to the same Wiener process that is part of the (weak) solution $X$ to (3), and $S$ is the strongly continuous semigroup generated by $A$ on $H$. We use the following two sets of conditions:

The connection between the properties is summarized as follows:

The first item follows from Theorem A.1, the second item is proven in Section 4.2. Recall the standing assumption $B = (-A)^{-\gamma}$ with $\gamma > 0$ and that $\beta$ is given by (9).
(iv) For $\eta > 0$ as in Proposition 1.1, the following is true: → 0 for each $a < \beta\eta$, and the same holds for $\hat\theta^{\mathrm{linear}}_N$.

Remark.
• If $X$ is a solution to the two-dimensional stochastic Navier-Stokes equations with additive noise and periodic or Dirichlet boundary conditions, we reobtain the results from [9].
• Note that the convergence rate and the asymptotic variance do not depend on properties of $F$. In this regard, our results are compatible with previous results on linear $F$ (see e.g. [17,27]) for $\alpha = \gamma$.
• While the conditions ($S_\rho$), ($S_\rho$), ($T_\rho$) and ($C_\rho$) are natural conditions satisfied by a large class of examples, we do not claim that they are necessary for the conclusions of Theorem 1.2 to hold. Indeed, if $A$ and $F$ belong to a certain class of linear differential operators, [17] and subsequent works (cf. [28,32]) prove that an estimator of the type $\hat\theta^{\mathrm{full}}_N$ is consistent and asymptotically normal as $N \to \infty$ if and only if $\mathrm{order}(F) \le 2\,\mathrm{order}(A) + n$, where $n$ is the dimension of the domain. In particular, the degree of $F$ may exceed the degree of $A$.
• Elementary considerations show that the asymptotic variance in (20) is minimal for $\alpha = \gamma$, whereas the convergence rate is not affected by the choice of $\alpha$. In the ideal setting of full information that we study in this work, it is possible to reconstruct $\gamma$ and therefore also the regularity $\rho^*$ given by (10) from the observed trajectory $X^N$, e.g. via the quadratic variation of its first component at time $T$, which equals $\lambda_1^{-2\gamma}\,T$. Therefore, we may set $\alpha = \gamma$ right from the beginning. If $F = 0$, this corresponds to the true maximum likelihood estimator. In the case of incomplete information on $\gamma$, for example time-discrete observations, which will be studied in future work, the parameter $\alpha$ can be used to ensure the divergence of the denominator of the estimators (whose expected value corresponds to the Fisher information).
• Note that the asymptotic variance itself depends on the unknown parameter $\theta$. This means that in order to construct confidence intervals it is necessary to modify (20) in a suitable way. This can be done by means of a variance-stabilizing transform (see e.g. [36, Section 3.2]). Alternatively, Slutsky's lemma can be used together with any of the consistent estimators for $\theta$.
• In general, the parameter $\delta_\rho$ from ($T_\rho$) exceeds $\epsilon_\rho$ from ($S_\rho$), such that a better rate for $\hat\theta^{\mathrm{partial}}_N$ can be guaranteed (see Section 2.2).
• It is possible to allow for $\omega$-dependent nonlinearities $F$: in this case, it suffices to assume that ($S_\rho$), ($S_\rho$), ($T_\rho$) and ($C_\rho$) hold almost surely in such a way that $\epsilon_\rho$ and $\delta_\rho$ are deterministic, while $f_\rho$, $g_\rho$, $h_\rho$ and $b_\rho$ are allowed to depend on $\omega \in \Omega$. In particular, it is possible to extend the result to solutions of non-Markovian functional SDEs whose nonlinearity depends on the whole solution trajectory $(X_t)_{t\in[0,T]}$.
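The plug-in route via Slutsky's lemma mentioned in the remark can be sketched as follows. This is an illustration under assumptions, not a procedure taken from the text: the asymptotic variance is read as $V = 2\theta(\beta(2\alpha-2\gamma+1)+1)^2 / \big(T\Lambda^{2\alpha-2\gamma+1}(\beta(4\alpha-4\gamma+1)+1)\big)$, the form in which it appears in Theorem 5.1, the rate is taken to be $N^{(\beta+1)/2}$, and the unknown $\theta$ in $V$ is replaced by a consistent estimate. The function names are hypothetical.

```python
import math
from statistics import NormalDist


def asymptotic_variance(theta, alpha, gamma, beta, Lambda, T):
    """Asymptotic variance V as read from Theorem 5.1 (an assumption of this sketch)."""
    a = 2.0 * alpha - 2.0 * gamma + 1.0
    b = 4.0 * alpha - 4.0 * gamma + 1.0
    if beta * b + 1.0 <= 0.0:
        raise ValueError("alpha is too small: the variance expression is not positive")
    return 2.0 * theta * (beta * a + 1.0) ** 2 / (T * Lambda ** a * (beta * b + 1.0))


def confidence_interval(theta_hat, N, alpha, gamma, beta, Lambda, T, level=0.95):
    """Plug-in interval theta_hat +/- z * sqrt(V(theta_hat)) / N**((beta + 1) / 2)."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    v_hat = asymptotic_variance(theta_hat, alpha, gamma, beta, Lambda, T)
    half_width = z * math.sqrt(v_hat) / N ** ((beta + 1.0) / 2.0)
    return theta_hat - half_width, theta_hat + half_width


# Example values (illustrative only): Dirichlet Laplacian on (0, 1), alpha = gamma.
lo, hi = confidence_interval(theta_hat=0.02, N=100, alpha=0.4, gamma=0.4,
                             beta=2.0, Lambda=math.pi ** 2, T=1.0)
print(lo, hi)
```

For $\alpha = \gamma$ the expression reduces to $V = 2\theta(\beta+1)/(T\Lambda)$, which is independent of $\gamma$ and linear in $\theta$; this is why a plug-in estimate or a variance-stabilizing transform is needed.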

Applications
We now illustrate the general theory by means of some examples. We write in distribution as N → ∞.

2.2. Reaction-Diffusion Systems. In this section, we consider a bounded domain $D \subset \mathbb{R}^n$, $n \ge 1$, with Dirichlet boundary conditions.⁵ The nonlinearity $F$ is determined by a function $f : \mathbb{R}^k \to \mathbb{R}^k$ whose components are polynomials in $k$ variables. The largest degree of the component polynomials of $f$ will be denoted by $m_F$. We assume that $m_F > 1$.
The dynamical behaviour of this equation differs significantly from a linear equation. For $a \ne \frac{1}{2}$, this equation generates travelling waves, and for $a = \frac{1}{2}$, the nonlinearity is of Allen-Cahn type, as used in phase field models. However, in terms of statistical inference on $\theta$, the nonlinear setting may be treated as a perturbation of the linear case, see Corollary 2.6 below.
(i-ii) We have to control the term $|F(x)|_{\rho-\frac{1}{2}+\epsilon_\rho}$. Note that in order to control the norm $|\cdot|_{\rho-\frac{1}{2}+\epsilon_\rho}$, it suffices to control its one-dimensional components, so w.l.o.g. we assume $k = 1$. Taking into account the triangle inequality, it suffices to control is a closed subspace, and given the choices of $\rho$ and $\epsilon_\rho$, the Sobolev space $H^{2\rho-1+2\epsilon_\rho}(D)$ is a Banach algebra [1, This proves (i). For (ii), let $l \ge 4$. Then $(1 + |u|^3_V)$.

(iv) This is proven with a calculation similar to (25).

(v) As before, we can restrict ourselves to the case $F(x) = x^l$ with $0 \le l \le m_F$. For $l = 0$, the estimate from ($T_\rho$) is trivial, so assume $l \ge 1$. Again using the algebra property of the Sobolev space $H^{2\rho-1}(D)$, we have for $u, v \in D((-A)^\rho)$: and the claim follows.
Remark. Note that the same proof also covers the more general case of polynomial nonlinearities whose coefficients depend on $x \in D$, as long as these coefficients are regular enough.
Taking into account that the growth rate $\beta$ of the eigenvalues of the Laplacian is given by $\beta = \frac{2}{n}$ (see [37], or e.g. [35, Section 13.4]), we get immediately under Assumption B: 16 . If ($A_\rho$) holds for some $\rho > \frac{n}{4} - \frac{1}{m_F}$, the estimator $\hat\theta^{\mathrm{full}}_N$ is asymptotically normal with rate $N^{\frac{1}{2}+\frac{1}{n}}$ and asymptotic variance $V$.
Remark. Assume that ($A_\rho$) holds even for $\rho > \frac{n}{4} + \frac{1}{2}$. If $n \ge 2$ and $m_F \ge 3$, then the bound on the convergence rate of $\hat\theta^{\mathrm{partial}}_N$ due to ($T_\rho$) is better than the bound on the convergence rate of $\hat\theta^{\mathrm{linear}}_N$ due to ($S_\rho$). This corresponds to the intuition that $\hat\theta^{\mathrm{partial}}_N$ is "closer to the truth" than $\hat\theta^{\mathrm{linear}}_N$.⁶ In dimension $n = 1$, $\hat\theta^{\mathrm{partial}}_N$ is even asymptotically normal independently of $m_F$.
Loosely speaking, Corollary 2.4 means that the estimators have good properties whenever $X$ is regular enough. Finally, we state a result (cf. [12, Example 7.10]) on the validity of condition ($C_\rho$). This allows us to make use of the better excess regularity from condition ($S_\rho$), compared to ($S_\rho$), via Assumption A.

Proposition 2.5. Let $k = 1$. If $m_F$ is odd and the coefficient of leading order of $f$ is negative, then ($C_\rho$) holds for $\rho > \frac{n}{4} - \frac{1}{2}$.

Proof. Choose $x_0$ such that $f$ is strictly decreasing on $\mathbb{R} \setminus (-x_0, x_0)$, set D : where $C$ is the embedding constant of the (fractional) Sobolev space $H^{2\rho+1}(D)$ into $L^\infty(D)$ [21, Theorem 9.8].
Corollary 2.6. Let $\gamma > \frac{n}{2} + \frac{1}{2}$, i.e. $\rho^* > \frac{n}{4} + \frac{1}{2}$. If $k = 1$, $m_F$ is odd and the coefficient of leading order of $f$ is negative, then the following is true for every $\alpha > \gamma - \frac{n+2}{16}$:
(i) In dimension $n = 1$, all three estimators are asymptotically normal whenever $m_F \le 7$.
(ii) In dimension $n = 2$, $\hat\theta^{\mathrm{full}}_N$ is asymptotically normal and $\hat\theta^{\mathrm{partial}}_N$, $\hat\theta^{\mathrm{linear}}_N$ are consistent with optimal rate whenever $m_F \le 3$.

By "consistent with optimal rate" we mean consistent with rate $N^a$ for every $a < \frac{1}{2} + \frac{1}{n}$.

Burgers' Equation.
We point out that the validity of this example has been conjectured in [7]. Consider the stochastic viscous Burgers equation , $L > 0$, with Dirichlet boundary conditions. Here, In this setting we have $H = L^2(D)$, $D(-A) = H^2(D) \cap H^1_0(D)$.

We follow the convention to denote the viscosity parameter by $\nu$ instead of $\theta$. Likewise, the estimators will be called $\hat\nu^{\mathrm{full}}_N$, $\hat\nu^{\mathrm{partial}}_N$ and $\hat\nu^{\mathrm{linear}}_N$.

Furthermore, $\hat\nu^{\mathrm{partial}}_N$ and $\hat\nu^{\mathrm{linear}}_N$ are consistent with rate $N^a$ for each $a < 1$.

Robustness under Model Uncertainty.
In the preceding examples we assumed that the dynamical law of the process we are interested in is perfectly known. However, it may be reasonable to consider the case when this is not true. We may formalize such a partially unknown model as

$$dX_t = (\theta A X_t + F(t, X_t) + G(t, X_t))\,dt + B\,dW_t, \tag{32}$$

where $G : [0,T] \times V \to V^*$ is an unknown perturbation. We assume that the model is well-posed (i.e. ($A_\rho$) holds for $0 \le \rho < \rho^*$) and that $F$ satisfies ($S_\rho$) with $\rho + \epsilon_\rho > \rho^*$. Let $\hat\theta^{\mathrm{full}}_N$, $\hat\theta^{\mathrm{partial}}_N$ and $\hat\theta^{\mathrm{linear}}_N$ be given by the same terms as before, i.e. $\hat\theta^{\mathrm{full}}_N$ and $\hat\theta^{\mathrm{partial}}_N$ include knowledge of $F$ but not of $G$.
This follows directly from the discussion in Section 4, taking into account the decomposition (33). It is easy to verify that if ($S_\rho$) holds for $F$ and $G$ separately with excess regularity $\epsilon^F_\rho$ resp. $\epsilon^G_\rho$, then a version of ($S_\rho$) holds for $F + G$ as well, with excess regularity $\min(\epsilon^F_\rho, \epsilon^G_\rho)$. However, in general the excess regularity $\epsilon^{F+G}_\rho$ of $F + G$ can be chosen larger due to cancellation effects of $F$ and $G$.

Corollary 2.12. , then asymptotic normality with rate $N^{\frac{\beta+1}{2}}$ carries over to all estimators.

Put differently, the excess regularity of $G$ essentially determines to what extent the results from Theorem 1.2 remain valid. A large value of $\epsilon^G_\rho$ corresponds to a small perturbation.

Remark.
• In applications it is common to approximate a complicated nonlinear system by its linearization. From this point of view, the case that $F$ itself is linear in (32) becomes relevant. Of course, it is desirable to maintain the statistical properties of the linear model under a broad class of nonlinear perturbations.
• It is possible to interpret the nonlinear perturbation as follows: assume there is a true nonlinearity $F_{\mathrm{true}}$ describing the model precisely. Assume further that we either do not know the form of $F_{\mathrm{true}}$ or do not want to handle it directly due to its complexity. Instead, we approximate $F_{\mathrm{true}}$ by some nonlinearity $F = F_{\mathrm{approx}}$ which we can control. If our approximation is good (in the sense that ($S_\rho$) holds for $G = F_{\mathrm{true}} - F_{\mathrm{approx}}$ with suitable excess regularity), then the quality of the estimators which are merely based on the approximating model can be guaranteed, i.e. they are consistent or even asymptotically normal. The approximating quality of $F_{\mathrm{approx}}$ is measured by the excess regularity of $G$.
• As $G$ is unknown, no knowledge of $G$ can be incorporated into the estimators, and condition ($T_\rho$) need not be required to hold for $G$.
• The previous examples show that ($S_\rho$) is fulfilled for a broad class of nonlinearities $G$ (assuming that $\rho$ is sufficiently large if necessary).

Numerical Simulation
We simulate the Allen-Cahn equation on $[0,1]$ with Dirichlet boundary conditions and initial condition $X_0(x) = \sin(\pi x)$. We discretize the equation in Fourier space and simulate $N_0 = 100$ modes with a linear-implicit Euler scheme with temporal step size $h_{\mathrm{temp}} = 2.5 \times 10^{-5}$ up to time $T = 1$. The spatial grid is uniform with mesh $h_{\mathrm{space}} = 5 \times 10^{-4}$. The true parameter is $\theta = 0.02$. We ran $M = 1000$ Monte Carlo simulations for each of the choices $\gamma = 0.4$ and $\gamma = 0.8$. In each case, we set $\alpha = \gamma$. Recall that in this setting all estimators are asymptotically normal. Figure 1 illustrates consistency, the convergence rate and the asymptotic distribution from Theorem 1.2. As expected, the values of $\hat\theta^{\mathrm{full}}_N$ and $\hat\theta^{\mathrm{partial}}_N$ are closer to each other than to $\hat\theta^{\mathrm{linear}}_N$. Note that the quality of $\hat\theta^{\mathrm{linear}}_N$ in this simulation depends on the level of noise given by $\gamma$, with decreasing accuracy under smoother noise. Our interpretation is that the nonlinearity becomes more pronounced if the noise is less rough.
We mention that for simulations with even larger values of $\gamma$ (e.g. $\gamma = 1.3$), the values of $\hat\theta^{\mathrm{linear}}_N$ are mostly negative and therefore not related to the true parameter, while $\hat\theta^{\mathrm{full}}_N$ and $\hat\theta^{\mathrm{partial}}_N$ stay consistent. Of course, this effect may be influenced by the number of Fourier modes $N_0$ used for the simulation.
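A linear-implicit Euler scheme in Fourier space, of the kind described in this section, can be sketched as follows. This is a reduced illustration and not the code behind the reported simulations: the nonlinearity is assumed to be $f(u) = u - u^3$, the basis is the sine basis of the Dirichlet Laplacian on $[0,1]$, the projection of the nonlinearity uses a simple Riemann sum on the interior grid, and the default resolution is much coarser than the values quoted above so that the example runs quickly. The function name `simulate_allen_cahn` and all default values are hypothetical.

```python
import numpy as np


def simulate_allen_cahn(theta=0.02, gamma=0.4, n_modes=32, n_grid=255,
                        dt=1e-3, T=1.0, seed=0):
    """Linear-implicit Euler scheme for
        dX = theta * X_xx dt + (X - X**3) dt + (-A)**(-gamma) dW
    on [0, 1] with Dirichlet boundary conditions and X_0(x) = sin(pi x).
    The cubic nonlinearity u - u**3 is an assumption of this sketch.
    Returns the Fourier coefficients, shape (n_steps + 1, n_modes), and eigenvalues.
    """
    rng = np.random.default_rng(seed)
    k = np.arange(1, n_modes + 1)
    lam = (np.pi * k) ** 2                                   # eigenvalues of -A
    xg = np.arange(1, n_grid + 1) / (n_grid + 1)             # interior grid points
    h = 1.0 / (n_grid + 1)                                   # spatial mesh
    basis = np.sqrt(2.0) * np.sin(np.pi * np.outer(k, xg))   # Phi_k on the grid
    n_steps = int(round(T / dt))

    coef = np.zeros((n_steps + 1, n_modes))
    coef[0] = basis @ np.sin(np.pi * xg) * h                 # projection of X_0
    for j in range(n_steps):
        u = coef[j] @ basis                                  # field values on the grid
        f_coef = basis @ (u - u ** 3) * h                    # coefficients of P_N F(X^N)
        noise = lam ** (-gamma) * np.sqrt(dt) * rng.standard_normal(n_modes)
        # The linear part is treated implicitly, nonlinearity and noise explicitly.
        coef[j + 1] = (coef[j] + dt * f_coef + noise) / (1.0 + dt * theta * lam)
    return coef, lam


coef, lam = simulate_allen_cahn()
print(coef.shape)
```

Treating the stiff term $\theta A X$ implicitly keeps the scheme stable for high modes even when $\mathrm{d}t\,\theta\lambda_k$ is not small, while the explicit treatment of the nonlinearity avoids solving a nonlinear system in each step.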

Proof of Theorem 1.2
We follow closely the arguments which have been given in [9] for the special case of the Navier-Stokes equations in two dimensions. Using a slightly different version

where $(W^k)_{k\in\mathbb{N}}$ are independent one-dimensional Brownian motions, and the solutions have the explicit representation

Lemma 4.1 (cf. [9,27]). It holds that

Sketch of proof. Use that $\bar x^k_s$ and $\bar x^k_t$, $s \le t$, are jointly Gaussian with mean zero and .

We close this section by giving the precise regularity for the linear process $\bar X$.
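The Gaussian structure used in the sketch of proof can be made explicit mode by mode. The following computation is a sketch under assumptions: the linear equation is taken to be $d\bar X_t = \theta A \bar X_t\,dt + B\,dW_t$ with $B = (-A)^{-\gamma}$ and, as an additional assumption, zero initial condition; $\bar x^k = (\bar X, \Phi_k)_H$ denotes its $k$-th Fourier coefficient.

```latex
\begin{align*}
d\bar x^k_t &= -\theta\lambda_k\,\bar x^k_t\,dt + \lambda_k^{-\gamma}\,dW^k_t,
\qquad \bar x^k_0 = 0, \\
\bar x^k_t &= \lambda_k^{-\gamma}\int_0^t e^{-\theta\lambda_k (t-s)}\,dW^k_s, \\
\operatorname{Cov}(\bar x^k_s, \bar x^k_t)
 &= \frac{\lambda_k^{-2\gamma}}{2\theta\lambda_k}
    \Big( e^{-\theta\lambda_k (t-s)} - e^{-\theta\lambda_k (t+s)} \Big),
 \qquad s \le t, \\
\mathbb{E}\int_0^T (\bar x^k_t)^2\,dt
 &= \frac{\lambda_k^{-(1+2\gamma)}}{2\theta}
    \Big( T - \frac{1 - e^{-2\theta\lambda_k T}}{2\theta\lambda_k} \Big)
 \sim \frac{T}{2\theta}\,\lambda_k^{-(1+2\gamma)} \quad (k \to \infty).
\end{align*}
```

Summing the last line against the weights $\lambda_k^{2+2\alpha}$ shows that the expected denominator of the estimators grows like $\sum_{k \le N} \lambda_k^{1+2\alpha-2\gamma}$, which is how the eigenvalue growth (9) enters the convergence rate.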

4.2. Asymptotic Behaviour in the Semilinear Case.

Proof of Proposition 1.1 (ii).
Assuming that ($A_\rho$) and ($S_\rho$) hold for some $\rho \in [0, \rho^*)$, we define $\tilde X := X - \bar X$ and $\tilde X^N := P_N \tilde X$, where as before $\bar X$ is the solution to (35). These processes are well-defined and satisfy (45). Calculations similar to those in Lemma A.3 show⁹ (46). The nonlinear term is estimated as follows: , thus $\tilde X \in R(\rho + \epsilon_\rho)$. We have proven:

We finish the proof of Proposition 1.1 (ii) with the following Lemma:

Proof. This follows from $\bar X \in R(\rho^* - \delta)$ for $\delta > 0$, $\bar X \notin R(\rho^*)$ and $\tilde X \in R(\rho^*)$ almost surely.

An Asymptotic Growth Property.
Proposition 4.6. Assume that ($R_\eta$) holds for some $\eta > 0$. Let $\alpha > \gamma - \frac{1+\beta^{-1}}{4}$. Then as $N \to \infty$ in probability.¹⁰

Proof. We set where we used (4) and (38). The last term converges to zero a.s. because

We prove asymptotic normality of $\hat\theta^{\mathrm{full}}_N$ by means of the following CLT, which is a special case of [24, Theorem 5.5.4 (I)] and [19, Theorem VIII.4.17]:

In the present situation, we set and note that these are continuous local martingales with Proposition 4.6 and (38) give . Another application of Proposition 4.6 together with Slutsky's lemma yields Rearranging the terms, we have proven part (ii) from Theorem 1.2.
Remark. It is not necessary to perform a perturbation argument to prove asymptotic normality for $\hat\theta^{\mathrm{full}}_N$, i.e. we do not have to bound a remainder integral of the type $\int_0^T \langle (-A)^{1+2\alpha}\tilde X^N_t, P_N B\,dW_t\rangle$ directly (even if this is not difficult using the Burkholder-Davis-Gundy inequality).
Next, we prove consistency for the remaining estimators. Taking into account (51) and (52), parts (i) and (iv) from Theorem 1.2 follow immediately from the following lemma:

, then a.s. a.s. for any $\epsilon < \rho + \epsilon_\rho - \rho^*$. The same is true for $\mathrm{bias}_N(X^N)$.

The Case of Coupled SPDEs
The same techniques as applied above allow for further generalization. More precisely, $X$ may be coupled with another state variable $X^\perp$ with state space $H^\perp$. This leads to a system of the form

Let us describe this setting in more detail. Let $H$ be a Hilbert space with inner product (

The process $X$ is a unique (in the sense of probability law) weak solution to (63) on $[0,T]$ with $X \in R(\rho)$ a.s.¹¹ If ($A_0$) holds, then higher regularity ($A_\rho$), $\rho > 0$, is equivalent to $X \in R(\rho)$ almost surely. The conditions ($S_\rho$) and ($T_\rho$) have the following modified counterparts:

. This estimator uses information that is accessible in any of the preceding observation schemes. The proof of Theorem 1.2 gives immediately the following extension:

Theorem 5.1. Assume ($A_\rho$) and ($S_\rho$) hold for $\rho \in [0, \rho^*)$ such that $\rho + \epsilon_\rho > \rho^*$. Let $\alpha > \gamma - \frac{1+\beta^{-1}}{4}$.
(ii) $\hat\theta^{\mathrm{full}}_N$ is asymptotically normal. More precisely,

$$N^{\frac{\beta+1}{2}}\big(\hat\theta^{\mathrm{full}}_N - \theta\big) \to \mathcal{N}(0, V), \qquad V = \frac{2\theta\,(\beta(2\alpha-2\gamma+1)+1)^2}{T\,\Lambda^{2\alpha-2\gamma+1}\,(\beta(4\alpha-4\gamma+1)+1)},$$

in distribution as $N \to \infty$.

Note that Lemma 4.9 does not transfer to $\hat\theta^{\mathrm{partial},2}_N$ without further assumptions. The reason is that $\|X - X^N\|_{\rho+\frac{1}{2}-\delta_\rho} = |X - X^N|_{\rho+\frac{1}{2}-\delta_\rho} + |X^\perp|_{H^\perp}$, where the second summand cannot be controlled as $N \to \infty$.

Example 5.2. As an illustration for the theory developed in this section, consider a stochastic FitzHugh-Nagumo system ([13,29]) of the type on a bounded interval $I \subset \mathbb{R}$ with Neumann boundary conditions, where $a \in (0,1)$, $b \ge 0$ and $\epsilon, \sigma > 0$ are constants. Models of that type are well-studied, e.g. in neuroscience. Note that the Laplacian is contained only in the drift term of the first variable. The nonlinearity $F(v,w)$ is cubic in $v$. Computations similar to Proposition 2.3 show that ($S_\rho$) holds for any $\rho \ge 0$ with $\epsilon_\rho = \frac{1}{2} + \frac{1}{3} = \frac{5}{6}$. Consequently, $\hat\theta^{\mathrm{full}}_N$ is asymptotically normal. Similarly, ($T_\rho$) holds for $\rho > \frac{1}{4} + \frac{1}{2} = \frac{3}{4}$ with $\delta_\rho = 1$, so $\hat\theta^{\mathrm{partial},1}_N$ is asymptotically normal if $v \in R(\rho)$, $\rho > \frac{3}{4}$.
However, in many applications it would be even more natural to drop the noise $W^{(1)}$ from the equation for $v_t$, i.e. to set $\sigma = 0$. In this case, the linearization of the equation for $v_t$ reduces to the heat equation with analytic solution, so that the perturbation argument used throughout this work does not apply. New methods have to be developed for this situation.

and Young's inequality easily gives where we used $\delta_\rho \ge \frac{1}{2}$. Gronwall's lemma implies $X_t = Y_t$ for all $t \in [0,T]$. This proves Theorem A.1.