An It\^o type formula for the additive stochastic heat equation

We use the theory of regularity structures to develop an It\^o formula for $u$, the solution of the one dimensional stochastic heat equation driven by space-time white noise with periodic boundary conditions. In particular for any smooth enough function $\varphi$ we can express the random distribution $(\partial_t-\partial_{xx})\varphi(u)$ and the random field $\varphi(u)$ in terms of the reconstruction of some modelled distributions. The resulting objects are then identified with some classical constructions of stochastic calculus.

We consider {u(t, x) : t ∈ [0, T], x ∈ T = R/Z}, the solution of the additive stochastic heat equation with periodic boundary conditions and zero initial value:
$$\partial_t u = \partial_{xx} u + \xi, \qquad u(0, x) = 0, \tag{1.1}$$
where ξ is a space-time white noise over R × T. This equation was originally formulated to model a one-dimensional string exposed to a stochastic force (see [11]). From a theoretical point of view, equation (1.1) is one of the simplest examples of a stochastic PDE whose solution can be written explicitly, the so-called stochastic convolution (see e.g. [31,9]). Writing $\xi = \partial^2 W/\partial t\,\partial x$, where W is the Brownian sheet associated to ξ, one has
$$u(t, x) = \int_0^t \int_{\mathbb{T}} P_{t-s}(x - y)\, dW_{s,y},$$
where the integral $dW_{s,y}$ is a Walsh integral taken with respect to the martingale measure associated to W and P : (0, +∞) × T → R is the fundamental solution of the heat equation with periodic boundary conditions:
$$P_t(x) = \sum_{m \in \mathbb{Z}} G_t(x + m), \qquad G_t(x) = \frac{1}{\sqrt{4\pi t}} \exp\Big(-\frac{x^2}{4t}\Big).$$
It is well known in the literature (see e.g. [9]) that u admits a continuous modification in both variables t and x and that it satisfies equation (1.1) in a weak sense, that is, for any smooth function l : T → R one has
$$\int_{\mathbb{T}} u(t, y)\, l(y)\, dy = \int_0^t \int_{\mathbb{T}} u(s, y)\, l''(y)\, dy\, ds + \int_0^t \int_{\mathbb{T}} l(y)\, dW_{s,y}.$$
Looking at u as a process with values in an infinite-dimensional space, the process $u_t = u(t, \cdot)$ is also a Feller diffusion taking values in C(T), the space of periodic continuous functions, and in $L^2(\mathbb{T})$. Its hitting properties were intensively studied in [22] using the potential theory for Markov processes and some general features of Gaussian random fields. Nevertheless, the general extension of the Itô formula in infinite-dimensional stochastic calculus (see e.g. [9]) cannot be applied directly to $u_t$, because the quadratic variation of this process in this setting must coincide with the trace of the identity operator, which is infinite. Moreover, it has been shown in [30] that the process t → u(t, x) for any fixed x ∈ T has an a.s. infinite quadratic variation as a real-valued process. Therefore any attempt to apply the powerful theory of Itô calculus in its classical form seems pointless.
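As a numerical sanity check on the objects just introduced, the following Python sketch evaluates the periodic heat kernel in two ways: as a sum of Gaussian images and through its Fourier series. It assumes that P is the kernel of $\partial_t - \partial_{xx}$ on T = R/Z, so that the Gaussian carries the factor 4t; the function names and the truncation parameters are illustrative choices and do not come from the article.

```python
import math


def heat_kernel_images(t, x, n_images=50):
    """Periodic heat kernel as a truncated sum of Gaussian images G_t(x + m).

    Assumption: the operator is d/dt - d^2/dx^2, hence the factor 4t.
    """
    return sum(
        math.exp(-((x + m) ** 2) / (4.0 * t)) / math.sqrt(4.0 * math.pi * t)
        for m in range(-n_images, n_images + 1)
    )


def heat_kernel_fourier(t, x, n_modes=200):
    """Same kernel through its Fourier series on R/Z.

    The mode exp(2*pi*i*k*x) decays like exp(-4*pi^2*k^2*t).
    """
    return 1.0 + 2.0 * sum(
        math.exp(-4.0 * math.pi ** 2 * k ** 2 * t) * math.cos(2.0 * math.pi * k * x)
        for k in range(1, n_modes + 1)
    )
```

The two representations agree by Poisson summation; the image sum converges quickly for small t, the Fourier series for large t.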
Introduced in 2014 and explained through the famous "quartet" of articles [15,6,8,4], the theory of regularity structures has provided a very general framework to prove local pathwise existence and uniqueness for a wide family of stochastic PDEs driven by space-time white noise. In this article, we will show how these new techniques allow us to formulate an Itô formula for u. The formula itself will be expressed in a new form, reflecting the new perspective under which stochastic PDEs are analysed. Indeed, for any fixed smooth function ϕ : R → R, we will study the quantity $(\partial_t - \partial_{xx})\varphi(u)$, interpreted as a space-time random distribution. This choice is heuristically motivated by the parabolic form of the equation (1.1) defining u, and it is manageable within regularity structures, where it is possible to manipulate random distributions. Thus we are searching for a random distribution $g_\varphi$, depending on higher derivatives of ϕ, such that, denoting by ⟨·, ·⟩ the duality bracket, one has a.s. the identity
$$\langle (\partial_t - \partial_{xx})\varphi(u), \psi \rangle = \langle g_\varphi, \psi \rangle \tag{1.4}$$
for any test function ψ. We will refer to this formula as a differential Itô formula, because of the presence of a differential operator on the left-hand side of (1.4). By uniqueness for the heat equation with a distributional source $g_\varphi$ (see Section 2), for every (t, x) ∈ [0, T] × T we can write formally
$$\varphi(u(t, x)) = \varphi(0) + \int_0^t \int_{\mathbb{T}} P_{t-s}(x - y)\, g_\varphi(s, y)\, ds\, dy, \tag{1.5}$$
where for any fixed (t, x) the equality (1.5) holds a.s. We call such an identity an integral Itô formula because of the double integral on the right-hand side of (1.5). The resulting formula may be one possible tool to improve our understanding of the trajectories of u, even if it is still not clear whether it will be as effective as it is for finite-dimensional diffusions (see e.g. [27]).
(1.9) Thereby yielding (1.10) Let us understand heuristically what happens when ε → 0⁺. Since $\rho_\varepsilon$ is an approximation of the delta function, u is a.s. continuous and differentiation is a continuous operation between distributions, we can reasonably infer that the left-hand side of (1.10) converges in some sense to $(\partial_t - \partial_{xx})\varphi(u)$. Thus the right-hand side of (1.10) should also converge to some limit distribution. However, written in this form, this right-hand side is very hard to study because it is possible to show with respect to some norm in an infinite-dimensional space (see Remark 5.2). These two results suggest a cancellation phenomenon between two objects whose divergences compensate each other. This simple cancellation phenomenon between two diverging random quantities, which lies at the heart of the recent study of singular SPDEs, was already noticed in the pioneering article [32] (see also [14,21]), and we are now able to reinterpret that result in the general context of renormalisation theory, as explained in the theory of regularity structures. Using the notion of modelled distribution and the reconstruction theorem, we can also explain the limit as the difference of two explicit random distributions. However, these limits are only characterised by some analytical properties, which do not immediately reveal their probabilistic representation. Therefore the convergence is also linked with some specific identification theorems which describe their law. Summing up both these results we can state the main theorem of the article: $\psi(s, y)\,\varphi''(u_s(y))\,P'_{s-s_1}(y - y_1)\,P'_{s-s_2}(y - y_2)\,dy\,ds\;dW^2_{\mathbf{s},\mathbf{y}}$.
Moreover for any (t, x) ∈ [0, T] × T we have a.s. $P_{t-s}(x - y)\,\varphi''(u_s(y))\,P'_{s-s_1}(y - y_1)\,P'_{s-s_2}(y - y_2)\,dy\,ds\;dW^2_{\mathbf{s},\mathbf{y}}$, where in both formulae $P'_s(y) = \partial_x P_s(y)$, the integral $dW^2_{\mathbf{s},\mathbf{y}}$ is the multiple Skorohod integral of order two integrating the variables $\mathbf{s} = (s_1, s_2)$ and $\mathbf{y} = (y_1, y_2)$, $u_s(y) = u(s, y)$ and C : (0, T) → R is the integrable function $C(s) := \|P_s(\cdot)\|^2_{L^2(\mathbb{T})}$.
Remark 1.2. It is natural to ask whether the same techniques could be applied to a generic convex function ϕ, to establish a Tanaka formula for u. When ϕ is not a regular function, the formalism of regularity structures does not work anymore (see Section 4). However, even if we try to generalise Theorem 1.1 using only the Malliavin calculus, the presence of a multiple Skorohod integral of order 2 in both formulae would require a priori that the random variable ϕ′′(u(s, y)) be twice differentiable (in the Malliavin sense). Hence the condition $\varphi \in C^4_b(\mathbb{R})$ in the statement appears to be optimal. Finally, any Tanaka formula would require a robust theory of local times associated to u, yet this notion is very ambiguous in the literature. On the one hand, using some general results on Gaussian variables (such as [12]), for any x ∈ T we can prove the existence of a local time with respect to the occupation measure of the process t → u(t, x). On the other hand, an alternative notion of a local time for u has been developed employing distributions on the Wiener space in [14].
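The integrability of the function C defined above can be made concrete. The Python sketch below computes $C(s) = \|P_s\|^2_{L^2(\mathbb{T})}$ through Parseval's identity and compares it with a direct quadrature, with the semigroup identity $C(s) = P_{2s}(0)$, and with the small-time behaviour $(8\pi s)^{-1/2}$, which is integrable near zero. It assumes the periodic heat kernel of $\partial_t - \partial_{xx}$ on R/Z; all function names and truncation levels are hypothetical choices made for this illustration.

```python
import math


def C_parseval(s, n_modes=2000):
    """C(s) via Parseval: sum over k in Z of exp(-8*pi^2*k^2*s)."""
    return 1.0 + 2.0 * sum(
        math.exp(-8.0 * math.pi ** 2 * k * k * s) for k in range(1, n_modes + 1)
    )


def periodic_heat_kernel(t, x, n_images=20):
    """Periodic heat kernel as a truncated sum of Gaussian images (assumed form)."""
    return sum(
        math.exp(-((x + m) ** 2) / (4.0 * t)) / math.sqrt(4.0 * math.pi * t)
        for m in range(-n_images, n_images + 1)
    )


def C_quadrature(s, n_grid=2000):
    """Midpoint rule for the integral of P_s(y)^2 over one period."""
    return sum(
        periodic_heat_kernel(s, (i + 0.5) / n_grid) ** 2 for i in range(n_grid)
    ) / n_grid
```

Since C(s) behaves like $s^{-1/2}$ as s → 0, its integral over (0, T) is finite, in line with the statement that C is integrable.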
The paper is organised as follows. In Sections 3 and 4 we apply the general theorems of the theory of regularity structures to build the analytical and algebraic tools needed to study the problem; the constructions are mostly self-contained, and in some cases we also recall previous results obtained in [19,6,8]. Then in Section 5 we combine all these tools to obtain first two formulae involving only objects built in the previous sections (we will refer to them as pathwise Itô formulae) and then the identification theorems.
We finally remark that some of the techniques presented here could potentially be used to establish an Itô formula for a non-linear perturbation of (1.1), the so-called generalised KPZ equation, whose coefficients g, f, h, k are smooth functions and whose initial condition $u_0 \in C(\mathbb{T})$ is generic (we refer the reader to [2,18,5]). Establishing such a formula in this generalised setting will be the subject of further investigation. Other possible directions of research may also take into account the Itô formula for the solutions of other stochastic PDEs with Dirichlet boundary conditions (see [13]) and, using the reformulation in the regularity structures context of differential equations driven by a fractional Brownian motion (see [1]), we could recover some classical results for fractional processes (see e.g. [10,28]).

Acknowledgements
The author is especially grateful to Lorenzo Zambotti, Cyril Labbé, Nikolas Tapia, Henri Elad Altman and Yvain Bruned for many enlightening discussions, comments and suggestions concerning the content and the organisation of the present article. Special thanks also go to the organisers of the Second Haifa Probability School and of the weekly seminar on SPDEs and rough paths based in the greater Berlin area, where a preliminary version of these results was presented. Finally, the author is also very thankful for the very careful referees' reports, which led to many improvements and corrections during the revision of the article.

Hölder spaces and Malliavin calculus
We recall here some preliminary notions and notations we will use throughout the article. For any space-time variable z = (t, x) ∈ R², in order to preserve the different roles of time and space in the parabolic equation (1.1) we define, with an abuse of notation, its parabolic norm as $\|z\| := \sqrt{|t|} + |x|$.
Moreover for any multi-index $k = (k_1, k_2)$ the parabolic degree of k is given by $|k| := 2k_1 + k_2$ and we adopt the multinomial notation for monomials $z^k = t^{k_1} x^{k_2}$ and derivatives $\partial^k = \partial_t^{k_1} \partial_x^{k_2}$. According to the definition of $\rho_\varepsilon$ in (1.7), the parabolic rescaling of any function η : R² → R of parameter λ > 0 and centred at z = (t, x) is given by
$$\eta^\lambda_z(\bar t, \bar x) := \lambda^{-3}\, \eta\Big(\frac{\bar t - t}{\lambda^2}, \frac{\bar x - x}{\lambda}\Big).$$
For any non-integer α ∈ R, a function f : R² → R belongs to the α-Hölder space $C^\alpha$ when one of these conditions is verified:
• If α > 1, f has ⌊α⌋ continuous derivatives in space and ⌊α/2⌋ continuous derivatives in time, where ⌊·⌋ is the integer part of a real number. Moreover for any compact set K ⊂ R²
• If α < 0, f is an element of S′(R²), the set of tempered distributions on R², and at the same time f belongs to the dual of $C^r_0(\mathbb{R}^2)$, the set of compactly supported functions belonging to $C^r(\mathbb{R}^2)$, where r = −⌊α⌋ + 1. Moreover for every compact set K ⊂ R² where $B_r$ is the set of all test functions η supported on $\{z \in \mathbb{R}^2 : \|z\| \le 1\}$, such that all the directional derivatives up to order r are bounded in the sup norm.
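The parabolic conventions above can be illustrated with a short Python sketch. It assumes the standard parabolic norm $\sqrt{|t|} + |x|$ and the rescaling $\eta^\lambda_z(s, y) = \lambda^{-3}\eta((s - t)/\lambda^2, (y - x)/\lambda)$, where the prefactor $\lambda^{-3}$ reflects the parabolic dimension 2 + 1 and keeps the integral of η unchanged. The helper names and the crude midpoint quadrature are hypothetical and only serve the illustration.

```python
import math


def parabolic_degree(k):
    """Parabolic degree |k| = 2*k1 + k2 of a multi-index k = (k1, k2)."""
    k1, k2 = k
    return 2 * k1 + k2


def parabolic_norm(t, x):
    """Parabolic 'norm' sqrt(|t|) + |x| (assumed standard form)."""
    return math.sqrt(abs(t)) + abs(x)


def rescale(eta, lam, z):
    """Return the parabolic rescaling of eta with parameter lam, centred at z."""
    t, x = z

    def eta_lam(s, y):
        return lam ** -3 * eta((s - t) / lam ** 2, (y - x) / lam)

    return eta_lam


def integrate(f, half_width=5.0, step=0.02):
    """Midpoint rule on the square [-half_width, half_width]^2."""
    n = int(round(2 * half_width / step))
    total = 0.0
    for i in range(n):
        s = -half_width + (i + 0.5) * step
        for j in range(n):
            y = -half_width + (j + 0.5) * step
            total += f(s, y)
    return total * step * step
```

For instance, rescaling a Gaussian bump with `rescale(eta, 0.5, (0.3, -0.2))` concentrates it around the point (0.3, −0.2) at scale 0.25 in time and 0.5 in space while preserving its total mass.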
The spaces $C^\alpha$ and the respective localised versions $C^\alpha(D)$, defined on every open set D ⊂ R², are naturally a family of Fréchet spaces. Moreover for any α > 0 and any compact set K ⊂ R², defining $C^\alpha(K)$ by restriction of f to K, we obtain a Banach space using the quantity $\|f\|_{C^\alpha(K)}$. The elements $f \in C^\alpha(\mathbb{R} \times \mathbb{T})$ are interpreted as elements of $C^\alpha$ whose space variable lives in T. (The derivative $\partial^2_x f$ will be denoted in some cases by $\partial_{xx} f$ to shorten the notation.) Most of the classical analytical operations apply to the $C^\alpha$ spaces as follows:
• Differentiation: if $f \in C^\alpha$ and k is a multi-index then the map $f \mapsto \partial^k f$ is a continuous map from $C^\alpha$ to $C^\beta$ where β = α − |k|.
• Schauder estimates (see [29]): if P is the heat kernel on some domain, then the space-time convolution with P, $f \mapsto P * f$, is a well-defined map for every f supported on positive times and it sends $C^\alpha$ continuously into $C^{\alpha+2}$ for every real non-integer α.
The Hölder spaces and the operations defined on them provide a natural setting to formulate the deterministic PDE
$$\partial_t v = \partial_{xx} v + g, \qquad v(0, x) = v_0(x), \tag{2.1}$$
where $g \in C^\beta(\mathbb{R} \times \mathbb{T})$ and $v_0 \in C(\mathbb{T})$. For any β > 0, classical results on PDE theory (see e.g. [20]) imply that there exists a unique strong solution $v \in C^{\beta+2}([0, T] \times \mathbb{T})$ of (2.1) which is given explicitly by the so-called variation of constants formula
$$v(t, x) = \int_{\mathbb{T}} P_t(x - y)\, v_0(y)\, dy + \int_0^t \int_{\mathbb{T}} P_{t-s}(x - y)\, g(s, y)\, dy\, ds. \tag{2.2}$$
 [19,Lem. 6.1]). In particular, it is possible to show (see [15,Prop. 6.9] and [13,Prop. 2.15]) that for any test function ψ where $\varphi_N$ is a fixed sequence of smooth functions converging a.e. to $1_{(0,t)\times\mathbb{T}}$. Thus the solution of the equation (2.1) is given by the same formula (2.2) if $g \in C^\beta(\mathbb{R} \times \mathbb{T})$. The following procedure can be adapted straightforwardly to define a linear map $1_{[s,t]}$ :
The equation (1.1) can be expressed in the context of the spaces $C^\alpha$. Indeed for every κ > 0 it is possible to show that there exists a modification of ξ belonging to $C^{-3/2-\kappa}(\mathbb{R} \times \mathbb{T})$ such that the sequence $\xi_\varepsilon$ defined in (1.7) converges in probability to ξ with respect to the topology of $C^{-3/2-\kappa}(\mathbb{R} \times \mathbb{T})$ (see [15,Lem. 10.2]). Choosing κ < 1/2 and $v_0 = 0$, we can then apply the deterministic results on (2.1) to almost every realisation of ξ, and by uniqueness of the solution of (1.1) we obtain the pathwise representation (2.4) Using this identity we deduce immediately from the Schauder estimates that almost every realisation of u must belong to $C^{1/2-\kappa}([0, T] \times \mathbb{T})$. This property immediately excludes the possibility of defining an object like "uξ", because the sum of the Hölder regularities of the two factors is −1 − 2κ and we cannot apply the property of the product stated before. The same reasoning also applies to the formal object "$(\partial_x u)^2$", and it tells us there is no classical theory to understand these products.
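The regularity count in the last paragraph is plain arithmetic on Hölder exponents, and can be sketched in Python as follows. The sketch assumes the usual criterion that the product of an element of $C^\alpha$ with an element of $C^\beta$ is classically well defined only when α + β > 0; the value of κ and all helper names are arbitrary choices made for this illustration.

```python
from fractions import Fraction


def schauder(alpha):
    """Convolution with the heat kernel gains two degrees of parabolic regularity."""
    return alpha + 2


def differentiate(alpha, k):
    """Applying the derivative of multi-index k = (k1, k2) loses |k| = 2*k1 + k2."""
    return alpha - (2 * k[0] + k[1])


def product_is_classical(alpha, beta):
    """Assumed criterion: the product is classically defined iff alpha + beta > 0."""
    return alpha + beta > 0


kappa = Fraction(1, 10)                  # any 0 < kappa < 1/2 works here
reg_xi = Fraction(-3, 2) - kappa         # regularity of the noise
reg_u = schauder(reg_xi)                 # 1/2 - kappa
reg_dxu = differentiate(reg_u, (0, 1))   # -1/2 - kappa
```

With these values, both `reg_u + reg_xi` and `2 * reg_dxu` equal −1 − 2κ, so `product_is_classical` returns `False` for "uξ" as well as for "$(\partial_x u)^2$".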
Since we will need to compare distributions defined on R × T with distributions on R², for every distribution $v \in S'(\mathbb{R} \times \mathbb{T})$ we denote by $\tilde v \in S'(\mathbb{R}^2)$ the periodic extension of v, defined for every test function ψ : R² → R as
$$\tilde v(\psi) := v\Big(\sum_{m \in \mathbb{Z}} \psi(\cdot, \cdot + m)\Big).$$
(2.5) This operation extends to the level of distributions the usual periodic extension of functions defined on R × T to R² (which we denote by the same notation). Thanks to this definition we have the identities
From a probabilistic point of view, ξ is an isonormal Gaussian process on $H = L^2(\mathbb{R} \times \mathbb{T})$, defined on a complete probability space (Ω, F, P). That is, we can associate to any f ∈ H a real Gaussian random variable ξ(f) such that for any couple f, g ∈ H one has $\mathbb{E}[\xi(f)\xi(g)] = \langle f, g \rangle_H$. We denote by $I_n : H^{\otimes n} \to L^2(\Omega)$, n ≥ 1, the multiple stochastic Wiener integral with respect to ξ (see [24]). $I_n$ is an isometry between the symmetric elements of $H^{\otimes n}$ equipped with the norm $\sqrt{n!}\,\|\cdot\|_{H^{\otimes n}}$ and the Wiener chaos of order n, the closed linear subspace of $L^2(\Omega)$ generated by $\{H_n(\xi(h)) : \|h\|_H = 1\}$, where $H_n$ is the n-th Hermite polynomial. Thus we have the natural identifications $\xi(f) = I_1(f) = \int_{\mathbb{R}} \int_{\mathbb{T}} f(s, y)\, dW_{s,y}$.
Let us introduce some elements of Malliavin calculus for ξ (see [24] for a thorough introduction to this subject). We consider $S \subset L^2(\Omega)$ the set of random variables F of the form $F = h(\xi(f_1), \dots, \xi(f_n))$, where $h : \mathbb{R}^n \to \mathbb{R}$ is a Schwartz test function and $f_1, \dots, f_n \in H$. The Malliavin derivative with respect to ξ (see [24,Def. 1 Iterating the procedure and adopting the usual convention $\nabla^0 = \mathrm{id}$, for any k ≥ 0 one can define the k-th Malliavin derivative $\nabla^k F = \{\nabla^k_{z_1 \cdots z_k} F : z_1, \dots, z_k \in \mathbb{R} \times \mathbb{T}\}$, which is a family of $H^{\otimes k}$-valued random variables. Moreover starting from a separable Hilbert space V and considering the random variables $G \in S_V$ of the form we can also define the $H^{\otimes k} \otimes V$-valued random variable $\nabla^k G$. In all of these cases the operator $\nabla^k$ is closable and its domain contains the space $D^{k,p}(V)$, the closure of $S_V$ with respect to the norm $\|\cdot\|_{k,p,V}$ defined by The space $D^{k,p}(\mathbb{R})$ is also denoted by $D^{k,p}$. Trivially all variables belonging to some finite Wiener chaos are infinitely Malliavin differentiable.
We denote by $\delta : \mathrm{Dom}(\delta) \subset L^2(\Omega; H) \to L^2(\Omega)$ the adjoint operator of ∇, defined by duality as
$$\mathbb{E}[\delta(u) F] = \mathbb{E}[\langle u, \nabla F \rangle_H]$$
for any u ∈ Dom(δ), $F \in D^{1,2}$. The operator δ is known in the literature as the Skorohod integral and for any u ∈ Dom(δ) we will again write δ(u) with the symbol $\int_{\mathbb{R}} \int_{\mathbb{T}} u(s, y)\, dW_{s,y}$, because δ is a proper extension of stochastic integration to a class of non-adapted integrands. Using the same procedure we define $\delta^k : \mathrm{Dom}(\delta^k) \subset L^2(\Omega; H^{\otimes k}) \to L^2(\Omega)$ as the adjoint of $\nabla^k$. As before, we call the operator $\delta^k$ the multiple Skorohod integral of order k and we denote it by $\int_{(\mathbb{R} \times \mathbb{T})^k} u(\mathbf{s}, \mathbf{y})\, dW^k_{\mathbf{s},\mathbf{y}}$. Let us recall the main properties of $\delta^k$:
• Extension of the Wiener integral: For any $h \in H^{\otimes k}$, we have $\delta^k(h) = I_k(h)$.
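• Product formula (see [23, Lem. 2.1]): Let $u \in \mathrm{Dom}(\delta^k)$ be a symmetric function in the variables $t_1, \dots, t_k$ and $F \in D^{k,2}$. If for any couple of non-negative integers j, r such that 0 ≤ j + r ≤ k one has $\langle \nabla^r F, \delta^j u \rangle_{H^{\otimes r}} \in L^2(\Omega; H^{\otimes(k-r-j)})$, then $\langle \nabla^r F, u \rangle_{H^{\otimes r}} \in \mathrm{Dom}(\delta^{k-r})$ and we have
$$F\, \delta^k(u) = \sum_{r=0}^{k} \binom{k}{r}\, \delta^{k-r}\big(\langle \nabla^r F, u \rangle_{H^{\otimes r}}\big).$$
• Continuity property (see [23, p. 8]): We have the inclusion $D^{k,2}(H^{\otimes k}) \subset \mathrm{Dom}(\delta^k)$ and the map $\delta^k : D^{k,2}(H^{\otimes k}) \to L^2(\Omega)$ is continuous. In other words, there exists a constant C > 0 such that for any $u \in D^{k,2}(H^{\otimes k})$ one has
$$\|\delta^k(u)\|_{L^2(\Omega)} \le C\, \|u\|_{k,2,H^{\otimes k}}$$
as long as the right-hand side above is well defined. Similar definitions hold for the multiple Skorohod integral of order k, mutatis mutandis. Using this notation one has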

Regularity structures
In this part we recall some general concepts of the theory of regularity structures in order to show the existence of an explicit regularity structure and a model. These objects will allow us to define some analytical operations on u. For a quick introduction to the whole theory, we refer the reader to [16].

Algebraic construction
The starting point of the theory is the notion of a regularity structure (A, T, G), a triple of the following elements: • A discrete lower bounded subset A of R containing 0.
• A graded vector space $T = \bigoplus_{\alpha \in A} T_\alpha$ such that each space $T_\alpha$ is a Banach space with norm $\|\cdot\|_\alpha$ and $T_0$ is generated by a single element 1.
• A group G of linear operators on T such that for each α ∈ A, a in $T_\alpha$ and Γ in G, one has Γ1 = 1 and
$$\Gamma a - a \in \bigoplus_{\beta < \alpha} T_\beta. \tag{3.1}$$
Recalling the equations (1.10) and (1.6), our aim is then to build a regularity structure T whose elements are capable of describing, for any ε > 0, the systems of equations
Let us give a first description of T in terms of abstract symbols. We start by considering the real polynomials in two indeterminates. For any multi-index $k \in \mathbb{N}^2$, $k = (k_1, k_2)$, we will write $X^k$ as a shorthand for the monomial $X_1^{k_1} X_2^{k_2}$, while the unit will be denoted by 1. In this way, we will be able to describe smooth functions. At the same time, we introduce an additional abstract symbol Ξ to represent the space-time white noise ξ, and for any symbol σ and $k \in \mathbb{N}^2$ we introduce a family of symbols $I_k(\sigma)$ ($I_{(0,0)}(\sigma)$ is denoted by I(σ)) to represent formally the convolution of the k-th derivative of the heat kernel with the function associated to the symbol σ. Since $I_k(X^m)$ should be identified with a smooth function, we simply set it to 0 to avoid repetitions. Finally for any two symbols $\tau_1$ and $\tau_2$ we also consider the juxtaposition of symbols $\tau_1 \tau_2$ as a new symbol. To include the product between polynomials as a juxtaposition of symbols, we impose that the juxtaposition with 1 does not change the symbol and that $X^l X^m = X^{l+m}$ for all multi-indices l, m. We also denote the iterated juxtaposition of the same symbol by an integer power. Adding all these formal rules, we denote by F the set of all possible formal expressions satisfying
• For any $\tau_1, \tau_2 \in F$, $\tau_1 \tau_2 \in F$.
• For any σ ∈ F and m ∈ N 2 , I m (σ) ∈ F . We write F for the free vector space generated by F . Similarly to polynomials, we define a homogeneity map | · | : F → R which has approximately the same properties of the degree of polynomials but in the context of the Hölder spaces, as described in Section 2. In particular we set recursively: • |X k | := 2k 1 + k 2 the parabolic degree 2 ; • |Ξ| := −3/2 − κ for some fixed parameter κ > 0 ; Starting from the linear space F , we introduce a subset of F where we choose all reasonable products that we will need in our calculations. We write I 1 (Ξ) as shorthand of Definition 3.1. We define the sets of symbols T, U, U ′ ⊂ F as the smallest triple of sets satisfying: • for every k ≥ 0 and any finite family of elements τ 1 , . . . , τ k ∈ U and any couple of elements σ 1 , σ 2 ∈ U ′ , then {τ, τ Ξ, τ σ 1 , τ σ 1 σ 2 } ⊂ T and τ 1 . . . τ k ∈ U.
We denote also by T and U respectively the free vector spaces upon T and U.
The definition of T has an equivalent description in terms of symbols. Defining V = {I(Ξ) m X l : m ∈ N , l ∈ N 2 } and for any σ ∈ {Ξ, I 1 (Ξ), I 1 (Ξ) 2 } V σ := σV the set of all symbols of the form σ times an element of V , it is straightforward to show the identities Therefore, denoting by V σ the free vector space generated upon V σ , we have the decomposition T = V Ξ ⊕V I 1 (Ξ) 2 ⊕V I 1 (Ξ) ⊕U. Let us give the construction of the structure group associated to T . For any h ∈ R 3 , h = (h 1 , h 2 , h 3 ) we define the function Γ h : T → T as the unique linear map such that for any σ ∈ {Ξ, I 1 (Ξ), for any h, k ∈ R 3 and the map h → Γ h is injective. We will denote by G the group Proof. To prove that A is a discrete lower bounded set, we show that for any β ∈ R the set I := {τ ∈ T : |τ | ≤ β} is finite. For any τ ∈ I by means of the identity (3.3) there exist two indices m ∈ N, n ∈ N 2 and σ ∈ {Ξ, I 1 (Ξ), I 1 (Ξ) 2 } such that τ = σI(Ξ) m X n . From |τ | ≤ β we deduce Imposing κ < 1/2, the left-hand side of the inequality (3.6) is bigger or equal than 0 and the set I is bounded. This finiteness result implies also the identity T = γ∈A T γ where T γ = τ ∈ T : |τ | = γ . Moreover there is no need to specify a norm on T γ , since it is finite-dimensional. Finally the property (3.1) comes directly from Newton's binomial formula and the positive homogeneity of the symbol I(Ξ).  [19,. More precisely we consider U HP the smallest set of symbols of F such that {X k } k∈N 2 ⊂ U and satisfying the properties Introducing the set T HP Ξ = {Ξv ∈ F : v ∈ U HP } and T HP Ξ , U HP the corresponding free vector spaces defined on these sets, the space T HP is defined by T HP = T HP Ξ ⊕ U HP . Looking at T ∩ T HP = V Ξ ⊕ U, it is also possible to show that the action of the group G HP coincides with that of G on these subspaces and from the explicit definition of Γ one has Γ(V Ξ ) ⊂ V Ξ and Γ(U) ⊂ U for any Γ ∈ G. 
Hence the subspaces $V_\Xi$ and U are respectively a sector of regularity −3/2 − κ and a function-like sector of both $T^{HP}$ and T (see [15,Def. 2.5]). Due to these identifications, we can transfer some results of [19] to our context.
Remark 3.4. As a matter of fact, we can restrict our considerations once and for all to a subspace of T generated by all symbols with homogeneity less than some parameter ζ > 0. By convention we denote by $|\cdot|_\beta$ the Euclidean norm on $T_\beta$ (the Euclidean norm is coherent with [19] but there is no "canonical" choice because $T_\beta$ is finite-dimensional). For any β ∈ A, we will denote by $Q_\beta$ and $Q_{<\beta}$ the projection operators respectively on $T_\beta$ and $\bigoplus_{\alpha < \beta} T_\alpha$.
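The finiteness argument and the truncation at level ζ can be made tangible with a small enumeration. The Python sketch below lists every basis symbol of the form $\sigma\, I(\Xi)^m X^l$, with σ ∈ {1, Ξ, I₁(Ξ), I₁(Ξ)²}, whose homogeneity lies below ζ. It assumes the standard recursive rules $|\tau_1 \tau_2| = |\tau_1| + |\tau_2|$ and $|I_k(\sigma)| = |\sigma| + 2 - |k|$, and it reads $I_1(\Xi)$ as the symbol carrying one space derivative; the string labels and the function name are hypothetical.

```python
from fractions import Fraction


def symbols_below(zeta, kappa):
    """List (symbol, homogeneity) for all sigma * I(Xi)^m * X^l with homogeneity < zeta.

    Assumed rules: |Xi| = -3/2 - kappa, |I_k(s)| = |s| + 2 - |k|, additive on products.
    """
    assert 0 < kappa < Fraction(1, 2), "kappa < 1/2 keeps |I(Xi)| positive"
    xi = Fraction(-3, 2) - kappa
    i_xi = xi + 2            # |I(Xi)|   =  1/2 - kappa
    i1_xi = xi + 2 - 1       # |I_1(Xi)| = -1/2 - kappa
    prefixes = {"1": Fraction(0), "Xi": xi, "I1(Xi)": i1_xi, "I1(Xi)^2": 2 * i1_xi}
    found = []
    for name, base in prefixes.items():
        m = 0
        while base + m * i_xi < zeta:
            l1 = 0
            while base + m * i_xi + 2 * l1 < zeta:
                l2 = 0
                while base + m * i_xi + 2 * l1 + l2 < zeta:
                    hom = base + m * i_xi + 2 * l1 + l2
                    found.append(((name, m, (l1, l2)), hom))
                    l2 += 1
                l1 += 1
            m += 1
    return sorted(found, key=lambda item: item[1])
```

Each loop terminates because every factor I(Ξ) and every polynomial factor adds a strictly positive amount to the homogeneity, which is the mechanism behind the finiteness of {τ : |τ| ≤ β}.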
Remark 3.5. Following the definition (3.4), we can also easily prove that Γ h τ τ ′ = Γ h τ Γ h τ ′ for every symbol τ, τ ′ ∈ T such that also their product τ τ ′ belongs to T and h ∈ R 3 . We remark that the explicit expression of Γ h in (3.4) can be easily rewritten as where h ′ : U → R is the unique real character over U such that and ∆ : T → T ⊗ U is the unique linear map such that for every i = 1, 2, σ ∈ {Ξ, I 1 (Ξ), I 1 (Ξ) 2 } and all τ, τ ′ ∈ T such that τ τ ′ ∈ T . Comparing the relations (3.8) with the explicit definitions given in [15,Sec. 8.1] and [19,Pag. 14], the group G can be obtained also with the general construction presented in [15].
To apply some general results obtained in [6] and [8], we show how to express the regularity structure T using the formalism of trees. Let us recall some basic notations. We start by considering labelled, rooted trees τ. That is, τ is a combinatorial tree (a finite connected simple graph without cycles) with a non-empty set of nodes $N_\tau$ and a set of edges $E_\tau$, where we fix a specific node $\rho_\tau \in N_\tau$ called the root of τ. The trees we consider are also labelled, i.e. there exists a finite set of labels L and a function $t : E_\tau \to L$. These trees are the building blocks of a more general family of trees. We define a decorated tree as a triple $\tau^n_e = (\tau, n, e)$, where τ is a labelled rooted tree and $n : N_\tau \to \mathbb{N}^2$, $e : E_\tau \to \mathbb{N}^2$ are two fixed functions. The set of decorated trees is denoted by T.
Let us define two main operations on T. For any two elements $\tau^n_e, \sigma^{n'}_{e'} \in T$ we introduce the product tree $\tau^n_e \sigma^{n'}_{e'}$ by simply considering τσ, the tree obtained by joining the roots of $\tau^n_e$ and $\sigma^{n'}_{e'}$. Moreover we impose $n(\rho_{\tau\sigma}) = n(\rho_\tau) + n'(\rho_\sigma)$ and we keep the edge decorations unaltered. Then for any $m \in \mathbb{N}^2$, l ∈ L we define the grafting map $E^l_m : T \to T$ as follows: for any $\sigma^n_e \in T$, $E^l_m(\sigma^n_e) \in T$ is the tree with zero decoration on the root obtained by adding one more edge decorated by (l, m) to the root of σ. The set T can be constructed recursively starting from the root trees $\{\bullet_k\}_{k \in \mathbb{N}^2}$ and applying iteratively the grafting operations and the multiplication. Similarly to what we did for the set of symbols F, for any fixed s : for any $\tau^n_e \in T$. The name homogeneity is used because the function $|\cdot|_s$ has the following properties
$$|\tau^n_e \sigma^{n'}_{e'}|_s = |\tau^n_e|_s + |\sigma^{n'}_{e'}|_s, \qquad |E^l_m(\sigma^n_e)|_s = |\sigma^n_e|_s + s(l) - s(m),$$
which are similar to the properties of the homogeneity on symbols | · | introduced above.
To describe the symbols defining T using trees, we choose in this case a set of labels with three elements L = {Ξ, I, J}, associated to the symbol Ξ and I. The presence of two different labels I, J to denote I is done to isolate all the trees we need for our calculations. Once we introduce the labels, the function s is defined by where κ > 0 is a fixed parameter. This choice of s is done to be coherent with the homogeneity | · |, as explained in the Remark 3.9. We can easily draw a labelled decorated tree τ n e ∈ T by simply putting its root at the bottom and decorating the nodes and the edges with the non zero values of n, e and the labels of L. For example, when we write the tree we suggest that n is zero over three nodes and e is zero on the edge labelled by Ξ.
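The product, the grafting operation and the two homogeneity properties can be mimicked by a minimal data structure. The Python sketch below is an illustration under stated assumptions: it uses only the labels Ξ and I, with the values s(Ξ) = −3/2 − κ and s(I) = 2 chosen so that the tree homogeneity matches the homogeneity of symbols; it takes s(m) on an edge decoration m to be its parabolic degree; and it defines the homogeneity as the sum of the node decorations' degrees plus, for each edge, the label value minus the degree of the edge decoration. The class and function names are hypothetical.

```python
from dataclasses import dataclass
from fractions import Fraction

KAPPA = Fraction(1, 10)                                  # illustrative value of kappa
S = {"Xi": Fraction(-3, 2) - KAPPA, "I": Fraction(2)}    # assumed values of s on labels


def deg(k):
    """Parabolic degree of a decoration k = (k1, k2)."""
    return 2 * k[0] + k[1]


@dataclass(frozen=True)
class Tree:
    n: tuple = (0, 0)   # decoration of the root
    edges: tuple = ()   # tuple of (label, edge decoration, subtree)


def product(a, b):
    """Join the roots, add the root decorations, keep every edge unchanged."""
    return Tree((a.n[0] + b.n[0], a.n[1] + b.n[1]), a.edges + b.edges)


def graft(label, m, sigma):
    """New root with zero decoration and one edge (label, m) leading to sigma."""
    return Tree((0, 0), ((label, m, sigma),))


def homogeneity(tree):
    """Node decorations count positively; each edge adds s(label) - deg(m)."""
    total = Fraction(deg(tree.n))
    for label, m, child in tree.edges:
        total += S[label] - deg(m) + homogeneity(child)
    return total
```

For example, `graft("I", (0, 1), graft("Xi", (0, 0), Tree()))` plays the role of the symbol I₁(Ξ) and has homogeneity −1/2 − κ under these assumptions.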
To conclude the construction of a regularity structure, we need to choose a suitable subset of trees contained in T. This operation is formalised in the context of decorated trees by the notion of a "rule" (see [6,Def. 5.7]). This object takes in account the branching behaviour of the trees and consequently what type of edges are allowed next to every label. More precisely, denoting by E the set of all finite multi-sets of L × N 2 , a rule is a function R : Let us explain what rule we choose in this context.
where the brackets {} describe a subset of E and the brackets () describe a multi-set of L × N 2 (the symbol () denotes the empty multi-set).
Once we established a rule, we can consider the set of all decorated trees which strongly conforms to the rule R (see [6,Def. 5.8]), denoted by T(R), that is τ n e ∈ T(R) if the following properties are satisfied • Looking at the edges attached at the root ρ τ , they can be expressed as R(l) for some l ∈ L; • for any node x ∈ N τ \ {ρ τ }, all the edges attached at x can be written as R(t(e)), where e is the unique edge linking x to its parent.
For example let us consider the two trees where we used the shorthand notation J = ((0, 0), J). The tree on the left-hand side strongly conforms to the rule R, but the tree on the right one does not because the multiset {I, J} is not in the image of R. From the Definition 3.6 it is straightforward to see that all possible decorations of the trees in T(R) are of three types. We will abbreviate them with the shorthand notations 3 Thanks to the definition of T(R), we can extract a specific subset of trees from T. To conclude the existence of a regularity structure from T(R), it is necessary to check two last fundamental properties on R: • R is subcritical (see [6,Def. 5.14]), that is there exists a function reg : L → R such that when we extend it to E for any (l, k) ∈ L × N 2 and M ∈ E 3 to improve the readability the same trees will be henceforth drawn on a smaller then we have for any l ∈ L • R is normal (see [6,Def. 5.8]), that is R(l) = {()} for every l ∈ L such that s(l) < 0 and for any couple of multi-sets M, N ∈ E such that N ∈ R(l) for some l ∈ L and M ⊂ N, then M ∈ R(l).
Both properties are relatively easy to check in this specific case. Indeed the rule R is normal by construction and we can verify the subcriticality hypothesis using the function reg : L → R defined by as long as κ is sufficiently small. These two properties allow us to apply the results [ Even if we could give an explicit description of the group G ′ and T ′ given in the Proposition 3.7, for our purposes it is sufficient to establish a relation between the regularity structure T and T ′ . From the explicit definition of F and T, it is possible to define recursively an injective map ι : F → T as follows: • For any symbol σ such that ι(σ) is defined, then ι(I k (σ)) := E I k (σ) . • For any couple of symbols σ, σ ′ such that ι(σ) and ι(σ ′ ) are well defined we set ι(σσ ′ ) = ι(σ)ι(σ ′ ).
Restricting the map ι to T and extending it by linearity we have the following inclusion
Proof. The theorem is a direct consequence of the choices made to define T′. Firstly, by definition of R, every decorated tree ι(τ) for some τ ∈ T strongly conforms to the rule R, therefore it will strongly conform to the rule R′. Moreover by construction of s in (3.10) we have $|\iota(\sigma)|_s = |\sigma|$ for any σ ∈ F. Thus A ⊂ A′. Finally, when we consider the groups G, G′, it has been shown in [6,Equation (6.32)] that the group G′ acts on ι(T) in the same way as the operator ∆ explained in Remark 3.5. Therefore we obtain the inclusion of the regularity structures.
Remark 3.9. The function ι is an injective map from F to T, but there are many trees of T which do not belong to ι(F). In particular, if a tree contains only the label I and no Ξ, then it is not contained in ι(F), because we identified all the symbols $I_k(X^m)$ with zero. Moreover, none of the trees labelled with J belong to ι(F). In what follows, we will identify symbols and decorated trees, without writing the map ι explicitly.

Models on a regularity structure
The algebraic structure also comes with a model associated to it. In order to recall this notion and to simplify the whole exposition, we fix a parameter ζ ≥ 2 and, with an abuse of notation, we will identify throughout the article T (respectively its canonical basis T) with the finite-dimensional vector space $Q_{<\zeta} T$ (resp. the finite set {τ ∈ T : |τ| < ζ}).
The same applies also for the sets consists of a pair (Π, Γ) given by: Furthermore, for every compact set K ⊂ R 2 , one has where the set of test functions B 2 was already introduced in the Section 2.
This notion plays a fundamental role in the whole theory, because it associates to any τ ∈ T an explicit distribution Π z τ belonging in some way to C |τ | . To compare two different models defined on the same structure, we endow M, the set of all models on (A, T , G), with the topology associated to the corresponding system of semi-distances induced by the conditions (3.12) and (3.13): (3.14) Since we want to study the processes on a finite time horizon, it is sufficient to verify the conditions (3.12) (3.13) on a fixed compact set K containing [0, T ] × [0, 1] and we will avoid any reference to it in the notation. In this way (M, · M ) becomes a complete metric space (M is not a Banach space because the sum of models is not necessarily a model!). In particular if a sequence (Π n , Γ n ) converge to (Π, Γ), then Π n z τ converges to Π z τ in the sense of tempered distributions for any z, τ . To define correctly a model over a symbol of the form I(σ), we need a technical lemma related to a suitable decompositions of G, the heat kernel on R, interpreted as a function G : in such a way that R is C ∞ (R 2 ) and K satisfies: • For every polynomial Q : R 2 → R of parabolic degree less than ζ, one has Remark 3.12. Thanks to these lemmas, it is possible to localise on a compact support the regularising action of the heat kernel. Indeed it is also possible to show (see [15,Lem 5.19]) that the map v → K * v sends continuously C α in C α+2 for any non integer α ∈ R and any distribution v not necessarily compactly supported.
In what follows, for any given realisation of the periodic extension of ξ ε , we will construct a sequence of models (Π̂ ε , Γ̂ ε ) associated to it and converging to a model (Π̂, Γ̂) related to ξ. As a further simplification, we parametrise all possible Indeed it is straightforward to check that, for any given couple (Π, f ), the operators (3.16) trivially satisfy the algebraic relations in Definition 3.10, because of the identity (3.5). Since any realisation of ξ ε is smooth, we first introduce a model built upon any deterministic smooth function ζ : R 2 → R.
Proposition 3.13. Let ζ : R 2 → R be a smooth periodic function and suppose that the map Π satisfies the conditions defined for any k ∈ N d , τ,τ ∈ T such that τ X k ∈ T , I k (τ ) ∈ T and ττ ∈ T . Then there exists a unique couple (Π, f ) such that, using the identifications (3.16), the associated pair of operators (Π, Γ) is a model. We call it the canonical model of ζ.
Proof. The hypotheses on ζ and the conditions (3.17) and (3.18) imply directly that Πτ is a smooth function for any τ ∈ T which is not a product of symbols. Therefore the point-wise product on the right-hand side of the equation (3.19) is well defined and, by linearity, the operator Π exists and is unique. To conclude the proof it is sufficient to choose f such that (Π, Γ) satisfies the right analytic properties. We use (3.5) to compute explicitly for any z, z̄ ∈ R 2 , σ ∈ {I 1 (Ξ), I 1 (Ξ) 2 , Ξ, 1} and k, m as before. Imposing the condition we obtain immediately the bound |Π z τ (z̄)| ≤ C |z̄ − z| |τ | for some constant C > 0 depending on ξ and uniform over τ ∈ T . Thus the condition (3.12) is satisfied. On the other hand, the second property (3.13) can be verified directly using when τ ∈ {I(Ξ), X 1 , X 2 } and Applying the multiplicative property of Γ (see Remark 3.5) we conclude.
Remark 3.14. The existence of a canonical model is a general result already proved in [15,Prop 8.27], but we repeat a simplified version of that proof to take into account the slightly different notation of this article. Looking at the definition of f in (3.21), we remark that f depends on Π, but to define it we need neither the multiplicative property (3.19) of Π nor the smoothness of ξ. Therefore, for any map Π : T → S ′ (R 2 ), the conditions (3.21) and (3.16) uniquely identify a couple L(Π) := (Π, Γ). L(Π) is not necessarily a model, but if ΠΞ = ξ ∈ C −3/2−κ for some 0 < κ < 1/2 and Π satisfies the properties (3.17), (3.18), then the proof of Proposition 3.13 also implies that the operators Γ zz ′ given by L(Π) always satisfy the property (3.13). The kernel K is chosen to satisfy (3.15) in order to be compatible with the assumption that the symbols I k (X m ) are identified with 0.
Remark 3.15. If ξ is also periodic in the space variable, it is straightforward to prove that the canonical model (Π, Γ) associated to ξ also satisfies Thus the canonical model is also adapted to the action of translation on R (for this definition see [15,Definition 3.33]). Roughly speaking, this property allows us to apply the notion of model also to distributions periodic in space.
Remark 3.16. Recalling the inclusion of the regularity structure T in T ′ as explained in Proposition 3.8, we can immediately extend Definition 3.10 to define a model (Π ′ , Γ ′ ) over the regularity structure (A ′ , T ′ , G ′ ). This extension will be useful to define the so-called BPHZ renormalisation and the BPHZ model as In particular for any smooth function ζ : R 2 → R we can again define a canonical model in this context, starting from an explicit function Π ′ : T(R ′ ) → C ∞ . Using the grafting operation the application Π ′ is defined recursively for any k, m ∈ N 2 , τ, τ ′ ∈ T . These conditions allow us to define Π ′ without knowing R ′ in detail, and the existence of a model is provided by [6,Prop. 6.12]. By construction, when we restrict Π ′ to T we obtain the properties (3.17), (3.18) and (3.19). The map Π ′ will be important when we define the renormalisation of a model (see Theorem 3.17).

The BPHZ renormalisation and the BPHZ model
For any ε > 0 we denote by Π ε and L(Π ε ) := (Π ε , Γ ε ) the canonical model obtained by applying Proposition 3.13, where ζ is a fixed a.s. realisation of ξ ε . Since ξ ε converges to ξ a.s. in the sense of distributions, we would like to define a model by studying the convergence of the sequence (Π ε , Γ ε ) as ε → 0. Unfortunately, it is well known from [19] that the sequence Π ε (I(Ξ)Ξ) does not converge as a distribution, implying that L(Π ε ) does not converge. A natural way to get rid of this ill-posedness and to prove a general convergence result is the main content of [6] and [8]. The main consequence of these general results will be the existence of an explicit sequence of maps F ε : M → M such that the sequence F ε (L(Π ε )) := (Π̂ ε , Γ̂ ε ) converges in probability to some random model. The model (Π̂ ε , Γ̂ ε ) and the limiting model are referred to in the literature as the BPHZ renormalisation and the BPHZ model, respectively.
In order to satisfy the bounds (3.13) forΓ ε uniformly on ε > 0, it is reasonable to write (Π ε ,Γ ε ) as L(Π ε ), for some admissible mapΠ ε : T → S ′ (R 2 ) (see the Remark 3.14). This property can be obtained by defining a sequence of linear maps {A ε } ε>0 : T → T satisfying for any k ∈ N d and τ ∈ T such that τ X k ∈ T , I k (τ ) ∈ T . Indeed combining the properties of (3.23) with the explicit definition of Π ε in the proposition 3.13, the application Π ε A ε is automatically admissible (by analogy we call {A ε } an admissible renormalisation scheme) and we can define the couple L(Π ε A ε ). However the conditions (3.23) are not sufficient to prove that L(Π ε A ε ) is again a model. As a matter of fact, writing the elements of T as trees and embedding T in T ′ (see the Proposition 3.8), the BPHZ renormalisation is obtained from an explicit admissible renormalisation scheme Denoting by the combinatorial operation of the disjoint union of graphs and by ∅ the empty graph, we considerT − , the set of all graphs σ such that σ = τ 1 · · · τ n for some n ≥ 1 and The elements ofT − are called forests and we denote byT − the free vector space generated overT − . (T − , , ∅) is trivially a commutative algebra with unity. For any decorated tree τ n e we say that a forest γ ∈T − is a subforest of τ n e (γ ⊂ τ n e ) if γ is an arbitrary subgraph of τ n e with no isolated vertices. For instance let us consider In this case γ 2 and γ 3 are both elements ofT − but only γ 3 ⊂ γ 1 because γ 2 has an isolated vertex. The empty forest ∅ is always a subforest. A decorated tree τ n e and a subforest γ = σ 1 · · · σ n such that γ ⊂ τ n e are used to define the contraction tree K γ τ n e = (K γ τ, K γ n, K γ e), where • K γ τ is the tree obtained from τ replacing each σ i with a node.
• Denoting by • 1 ,· · · , • n the nodes arising from the contraction of the trees σ 1 , · · · , σ n , the function K γ n is equal to n on every non-contracted node of K γ τ and, for every i, K γ n(• i ) = ∑ y∈N σ i n(y).
• K γ e : E Kγ → N 2 is equal to e on every non contracted edge of K γ τ .
In the previous example we have K γ 3 γ 1 = •.
Once we giveT − , we define T − :=T − /J as the quotient algebra ofT − with respect to J , the ideal ofT − generated by the set The map M ε is then defined for any τ n e ∈ T(R ′ ) as (3.24) We will describe the objects ∆ − and h ε separately. First ∆ − : T → T − ⊗ T is a linear map which is explicitly given for any τ n e ∈ T(R ′ ) by the formula Let us explain the meaning of the formula (3.25). The first sum outside is done over all subforests γ ⊂ τ and for any subforest γ, denoting by N γ and ∂(γ, τ ) respectively the set of the nodes of γ and the edges in E τ that are adjacent to N γ , the second sum is done over all functions n γ : N γ → N 2 and e γ : ∂(γ, τ ) → N 2 such that for any x ∈ N γ n γ (x) ≤ n(x), where with an abuse of notation we denote by ≤ the lexicographical order between vectors of N 2 . A generic subforest γ is an element ofT − so we compose it with the canonical projection homomorphism p :T − → T − , where with an abuse of notation we identify all forests generating J to zero. Moreover, for any e γ : ∂(γ, τ ) → N 2 the function πe γ : N γ → N 2 is given by The remaining combinatorial coefficients are finally interpreted in a multinomial sense, that is for any function l : S → N 2 where S is a finite set we have and similarly for the binomial coefficients. In principle, the summations over n γ and e γ are done over an infinite set of values but the projection p and the constraints n γ ≤ n, together with the subcritical hypothesis of the rule R ′ , make the sum finite.
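The multinomial convention mentioned above can be spelled out as follows. This is a sketch under the usual multi-index conventions, which we assume are the ones intended here; the two-point set $S=\{x,y\}$ in the example is purely illustrative.

```latex
% Assumed multi-index conventions for k = (k_1,k_2), m = (m_1,m_2) in N^2:
\[
  k! := k_1!\,k_2!,
  \qquad
  \binom{k}{m} := \binom{k_1}{m_1}\binom{k_2}{m_2}.
\]
% Extension to functions l, n, n_\gamma defined on a finite set S (or N_\gamma):
\[
  l! := \prod_{x \in S} l(x)!,
  \qquad
  \binom{n}{n_\gamma} := \prod_{x \in N_\gamma} \binom{n(x)}{n_\gamma(x)}.
\]
% Illustrative example: S = \{x,y\}, l(x) = (1,2), l(y) = (0,3):
\[
  l! = (1!\,2!)\,(0!\,3!) = 2 \cdot 6 = 12 .
\]
```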
On the other hand, the map h ε has the explicit form The first object in (3.26) is a linear map A − : T − → T̂ − . It is called the twisted antipode and it is characterised as the only homomorphism (so that A − (∅) = ∅) such that for any tree τ n e ≠ ∅, denoting by M the forest product, one has the identity Finally the last object g ε (Π) : T̂ − → R is the only real character on the algebra T̂ − such that for any tree τ n e ∈ T(R ′ ) where (Π ε ) ′ is the extension of Π ε over all the decorated trees as explained in Remark 3.16. We combine all these definitions to obtain the explicit form of the map M ε .

(3.29)
where the constants C 1 ε and C 2 ε are given by Proof. Thanks to the result [6, Theorem 6.17], for any k ∈ N d , τ ∈ T such that τ X k ∈ T , I k (τ ) ∈ T the map M ε always satisfies Therefore, to prove the theorem it is sufficient to show for any m the identities we have to calculate the operators ∆ − and h ε over the elements of W . To do that we need to know, for any w ∈ W , which are the subforests γ ⊂ w; in principle, we should know explicitly the rule R ′ and the forests which define T̂ − . However, we remark that every subgraph γ included in w with no isolated vertices can be expressed as a disjoint union of trees belonging only to T(R). Thus the knowledge of R ′ is unnecessary to calculate ∆ − .
Secondly, we fix κ > 0 sufficiently small such that the only trees of T(R) with strictly negative homogeneity that are included in W are the following Denoting by τ m = ΞI(Ξ) m , we calculate the quantity M ε (τ m ) in the case m = 0, 1, explaining all the steps. Firstly, we apply ∆ − in (3.25) and the recursive definition of A − in (3.27) to obtain immediately The terms with the extra decoration (0, 1) come from the sum over e γ in the definition of ∆ − , combined with the projection operator p and the map π (we recall that all forests generating J are identified with zero). Using again the general definition of ∆ − in (3.25) and the recursive identity (3.27), the calculations for the symbol ΞX (0,1) are given by To complete the calculation of M ε , we need to apply g ε (Π) to the images of the twisted antipode. By definition of (Π ε ) ′ one has (3.33) Hence we conclude firstly Plugging the formulae (3.34) into the sums of ∆ − we obtain the right identities of (3.32) for M ε τ m , m = 0, 1. Let us pass to the calculation of M ε τ m , m = 2, 3. Writing ∆ − τ m and A − τ m , a direct consequence of (3.34) and (3.33) is that all the subforests containing the trees Ξ or ΞX (0,1) among their connected components vanish after applying h ε or g ε (Π), and therefore give no contribution to M ε . Denoting by (· · · ) all these terms we have Similarly we also have Therefore the calculation of M ε τ m is obtained once we know the constants The first two constants from the left are zero because we are taking the expectation of a product of an odd number of centred Gaussian variables.
On the other hand, using the shorthand notation K ε = K * ρ ε we have the identity Applying Wick's formula for the product of four Gaussian random variables one has Substituting the values of g ε (Π) into the above calculations of A − τ m we obtain Since we know from (3.34) and (3.35) the values of h ε on these trees, we obtain the following identity where the term (· · · ) contains some terms that become zero after we apply h ε . The combinatorial factor m appears because the tree associated to ΞI(Ξ) appears m times inside τ m . This proves the first part of the equations (3.32). We pass to the terms of the form σ m = I 1 (Ξ) 2 I(Ξ) m and η k = I 1 (Ξ)I(Ξ) k for m = 0 and k ≤ 1. Adopting the same notation as before to denote the terms that become zero after applying h ε for ∆ − or g ε (Π) for A − , we have Applying the map (Π ε ) ′ we obtain also where the first and the last identity of (3.36) are obtained because we take the expectation of a centred Gaussian variable and the function and consequently the identities (3.32) for M ε σ m and M ε η k . Passing to the calculation of M ε σ m for m = 1, 2 we have Using again the same notation As before, we apply Wick's formula and the definition of (Π ε ) ′ to obtain This finally yields Thus we obtain the final part of the identities (3.32) and we conclude.
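For reference, the two Gaussian facts used in the proof above (vanishing of odd moments and Wick's formula for four factors) read as follows for a generic centred, jointly Gaussian vector. The variables $X_1,\dots,X_4$, $X$, $Y$ are placeholders and not objects defined in the text.

```latex
% Centred jointly Gaussian random variables X_1,...,X_4 (placeholders).
% Odd products have zero expectation:
\[
  \mathbb{E}[X_1] = 0,
  \qquad
  \mathbb{E}[X_1 X_2 X_3] = 0 .
\]
% Wick's (Isserlis') formula for four factors: sum over the three pairings.
\[
  \mathbb{E}[X_1 X_2 X_3 X_4]
  = \mathbb{E}[X_1 X_2]\,\mathbb{E}[X_3 X_4]
  + \mathbb{E}[X_1 X_3]\,\mathbb{E}[X_2 X_4]
  + \mathbb{E}[X_1 X_4]\,\mathbb{E}[X_2 X_3] .
\]
% Special case with repeated factors:
\[
  \mathbb{E}[X^2 Y^2]
  = \mathbb{E}[X^2]\,\mathbb{E}[Y^2] + 2\,\mathbb{E}[XY]^2 ,
  \qquad
  \mathbb{E}[X^4] = 3\,\mathbb{E}[X^2]^2 .
\]
```

The special case is the shape that typically arises when the same noise-dependent factor appears twice in a tree, which is consistent with the combinatorial factors appearing in the computation of M ε .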
We will henceforth fix the parameter κ so that Theorem 3.17 holds. By construction of the BPHZ renormalisation (see [6,Sec. 6]) the map M ε is an admissible renormalisation scheme and, denoting by Π̂ ε = Π M ε , the couple L(Π̂ ε ) = (Π̂ ε , Γ̂ ε ) obtained from Remark 3.14 is always a model for any ε > 0. The explicit form of the map M ε obtained in Theorem 3.17 also allows us to write (Π̂ ε , Γ̂ ε ) explicitly. Proposition 3.18. For any z ∈ R 2 and z, z ′ ∈ R 2 one has Furthermore the model (Π̂ ε , Γ̂ ε ) is also adapted to the action of translation on R Proof. By definition of L(Π̂ ε ), the model (Π̂ ε , Γ̂ ε ) can be represented as the couple Thus the function f̂ ε coincides with f ε , the same function obtained from the decomposition of the canonical model (Π ε , Γ ε ) as (Π ε , f ε ). By definition of Γ we have directly Γ̂ ε zz ′ = Γ ε zz ′ . In the case of Π̂ ε we can apply immediately the identity (3.16) together with Theorem 3.17 to obtain Then the formula (3.39) holds as long as for any h ∈ R 3 and ε > 0 one has (3.41) Let us verify the identity (3.41) for any τ ∈ V Ξ ⊔ V I 1 (Ξ) 2 ⊔ V I 1 (Ξ) ⊔ U. In the case τ ∈ V I 1 (Ξ) ⊔ U, this identity holds trivially because Γ h leaves invariant the subspace V I 1 (Ξ) ⊔ U (see equation (3.4)) and M ε is the identity when restricted to this subspace. On the other hand, if τ ∈ V Ξ ⊔ V I 1 (Ξ) 2 , the multiplicative property of Γ h (see Remark 3.5) and the behaviour of M ε on the polynomials in (3.29) reduce the problem to verifying (3.41) on the symbols I 1 (Ξ) 2 I(Ξ) m and ΞI(Ξ) m for any m ≥ 1.
This yields the result. The identity (3.39) immediately implies the properties in the identity (3.22). Therefore (Π̂ ε , Γ̂ ε ) is adapted to the action of translations.
We now study the convergence of L(Π̂ ε ) in the space of models. Embedding the regularity structure T into T ′ as explained in Proposition 3.8, it is possible to prove the convergence of L(Π̂ ε ) using the general criterion given in [8, Thm. 2.15]. We introduce some notation to apply this statement. Representing all the elements τ ∈ T as decorated trees, we denote by E Ξ (τ ) the set of edges labelled by Ξ. By construction every element e ∈ E Ξ (τ ) is written uniquely as e = {e Ξ , e Ξ }, where e Ξ is one of the leaves of τ . This decomposition allows us to define the sets Moreover, expressing τ as τ n e for some decorations n, e, we write τ 0 e to denote the decorated tree whose decoration n is replaced by zero at every node. Let us state the convergence theorem in this context. with respect to the metric ‖·‖ M . We call (Π̂, Γ̂) the BPHZ model.
Remark 3.20. The BPHZ model (Π̂, Γ̂) obtained from Theorem 3.19 is an example of a random model taking a.s. values in distributions, and it will be the main object used to formulate a new type of Itô formula for u. Recalling the inclusion of the space V Ξ ⊕ U into the regularity structure T HP defined in [19] (see Remark 3.3), we can easily check that the renormalisation map M ε defined in (3.29) and the model (Π̂ ε , Γ̂ ε ) restricted to the sector V Ξ ⊕ U coincide exactly with the renormalisation procedure developed in [19,Thm. 4.5] to define what in that context is called the Itô model. By uniqueness of the limit on this sector we can apply this result directly and we obtain that Π̂ z Ξ is the periodic extension of ξ, and for every τ ∈ U, z = (t, x) ∈ R 2 and every smooth test function ψ such that ψ(s, y) = 0 for any s < t we have We stress that equation (3.44) holds only when the test function is supported in the future. Otherwise, the integrand on the right-hand side will not be adapted and we cannot interpret Π̂ z Ξτ (ψ) as a Skorohod integral. An explicit formula describing the law of Π̂ z τ in full generality has been developed in [8,Prop 4.22]. Moreover we recall that in our case we have Γ̂ zz ′ = Γ f̂ (z ′ )−f̂ (z) where f̂ : R 2 → R 3 is given by The model (Π̂, Γ̂) is also adapted to the action of translation on R, as a consequence of Proposition 3.18 applied to the converging sequence (Π̂ ε , Γ̂ ε ).

Calculus on regularity structures
In this section, we will show how the models (Π̂ ε , Γ̂ ε ) and (Π̂, Γ̂) can be used to describe respectively u ε and u and, more generally, what kind of analytical operations we can define on the regularity structure T .

Modelled distributions
The main function of a regularity structure and of a model on it is to provide a coherent framework to approximate random distributions, similarly to how polynomials approximate smooth functions via Taylor's formula. Since for any function f : R → R it is possible to describe the condition f ∈ C γ in terms of F : R 2 → R ⌊γ⌋ , the vector of its derivatives (see [17] for further details), we introduce an equivalent version of this space in our general context.
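The analogy with Taylor's formula can be made concrete in the one-dimensional, purely polynomial case. The following is a sketch for a non-integer $\gamma > 0$ and a function $f \in C^\gamma(\mathbb{R})$; the abstract monomials $X^k$ and the name $F$ for the lifted function are used here only for illustration.

```latex
% Lift of f to the vector of its Taylor coefficients (illustrative notation):
\[
  F(x) := \sum_{k=0}^{\lfloor \gamma \rfloor} \frac{f^{(k)}(x)}{k!}\, X^k .
\]
% If f \in C^\gamma with \gamma non-integer, then locally uniformly in x, y
% and for every k \le \lfloor \gamma \rfloor, Taylor's formula applied to
% f^{(k)} gives
\[
  \Bigl| f^{(k)}(y) - \sum_{j=0}^{\lfloor \gamma \rfloor - k}
     \frac{f^{(k+j)}(x)}{j!}\,(y-x)^j \Bigr|
  \le C\,|y-x|^{\gamma-k} .
\]
```

Each component of $F$ at the point $y$ is thus recovered from $F(x)$ by re-expanding the polynomial around $y$, up to an error of order $|y-x|^{\gamma-k}$. The definition of modelled distributions imposes the same type of bound, with the re-expansion performed by the operators Γ of the model.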
Remark 4.2. The definition of the set D γ,η depends crucially on the underlying model (Π, Γ). To stress this dependency, we will also use for the same set the alternative notation D γ,η (Γ) . Similarly, we recall that the quantities ‖U‖ γ,η depend on the compact set K, but we omit the symbol K from the notation: because of our finite time horizon setting, we will henceforth prove the results on a fixed compact set K ⊂ R 2 containing [0, T ]×[0, 1]. The presence of the extra parameter η allows more freedom than the classical C γ spaces: the coordinates of U are allowed to blow up at rate η near the set P = {(t, x) ∈ R 2 : t = 0}, and the condition η > −2 is imposed to keep this singularity integrable. By definition of D γ,η , for any value γ ≥ γ ′ > 0 and U ∈ D γ,η the projection For any given model (Π, Γ) the couple (D γ,η (Γ), |·| γ,η ) is clearly a Banach space. Since we will consider modelled distributions belonging to different models, for any couple of models (Π, Γ) and (Π ′ , Γ ′ ) and modelled distributions U ∈ D γ,η (Γ), U ′ ∈ D γ,η (Γ ′ ) we define the quantity where the parameters z, w, α belong to the same sets as in the quantity (4.1). This function, together with the distance ‖·;·‖ M on models, endows the fibred space
M ⋉ D γ,η := {((Π, Γ), U) : (Π, Γ) ∈ M, U ∈ D γ,η (Γ)} with a complete metric structure given by the distance ‖·;·‖ γ,η + ‖·;·‖ M . Combining the knowledge of a model (Π, Γ) ∈ M and of U ∈ D γ,η (Γ), it is possible to define uniquely a distribution such that the coordinates of U play the same role as the derivatives of a function in Taylor's formula. This association is called the reconstruction theorem and it is one of the main theorems in the theory of regularity structures (for its proof see [15, Sec. 3, Sec. 6]). • (Generalised Taylor expansion) for any compact set K ⊂ R 2 there exists a constant C > 0 such that uniformly over η ∈ B 2 , λ ∈ (0, 1] and z ∈ K; • the distribution RU ∈ C α U ∧η where α U := min{a ∈ A : Q a U ≠ 0} and in case α U ∧ η = 0 we set by convention C 0 to be the space of locally bounded functions; • (local Lipschitz property) for any fixed R > 0 and all couples (Π ′ , Γ ′ ), (Π, Γ) ∈ M, U ∈ D γ,η (Γ), U ′ ∈ D γ,η (Γ ′ ) such that ‖U; U ′‖ γ,η + ‖(Π, Γ); (Π ′ , Γ ′ )‖ M < R and α U = α U ′ = α, denoting by R and R ′ the respective reconstruction operators, there exists a constant C > 0 depending on R such that Remark 4.4. The reconstruction map has in some rare cases an explicit expression. For instance, if Π z τ is a continuous function for every τ ∈ T (like the models L(Π ε ) or L(Π̂ ε ) for any ε > 0) and U ∈ D γ,η (Γ), then RU is a continuous function given explicitly by Introducing the space D γ,η U of all modelled distributions taking values in U, the identity (4.4) holds also if (Π, Γ) is a generic model and U ∈ D γ,η U (Γ), because the elements of the canonical basis of U all have non-negative homogeneity (for further details see [15,Sec. 3.4]). We finally conclude that for any value γ ≥ γ ′ > 0 and U ∈ D γ,η we have the identity RQ <γ ′ U = RU; therefore, to define the distribution RU correctly it is sufficient to fix γ > 0 such that Q <γ T is generated by the set {τ ∈ T : |τ | ≤ 0}.
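As a reading aid, the explicit expression of the reconstruction for models made of continuous functions is, in the general theory of [15], the evaluation on the diagonal. We assume that this is what the identity (4.4) states. The polynomial example below is illustrative, with coefficient functions $u_k$ that are not objects of the text.

```latex
% Reconstruction for a model whose \Pi_z\tau are continuous functions
% (standard form in [15]; assumed to coincide with (4.4)):
\[
  (\mathcal{R}U)(z) = \bigl(\Pi_z U(z)\bigr)(z) .
\]
% Illustrative example with polynomial symbols, where
% (\Pi_z X^k)(\bar z) = (\bar z - z)^k vanishes at \bar z = z for k \neq 0:
\[
  U(z) = \sum_{k} u_k(z)\, X^k
  \quad\Longrightarrow\quad
  (\mathcal{R}U)(z) = u_0(z) .
\]
```

More generally, every symbol of strictly positive homogeneity contributes nothing on the diagonal, so only the components of homogeneity at most zero matter. This agrees with the identity RQ <γ ′ U = RU recalled in Remark 4.4.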
Remark 4.5. Concerning the regularity of RU, the result stated in Theorem 4.3 is optimal because of the presence of the parameter η in the definition and the possible explosion of the components of U. However, if we disregard the behaviour at 0, it is also possible to prove RU ∈ C β U (R 2 \ P ) where β U := min{a ∈ A \ N : Q a U ≠ 0} (see [15,Sec. 6]), and the local Lipschitz property (4.3) holds on the same space C β U (R 2 \ P ). We stress that the local Lipschitz continuity of the reconstruction operator R as given in   .22)) we and the function U is periodic in the space variable on R 2 , then using the general result [15,Prop. 3.38] we obtain also RU = u for some u ∈ C α U ∧η (R × T); with an abuse of notation we can identify RU with u.
where the second identity holds a.s. as distributions.
Proof. As recalled in Remark 4.4, to prove the first part (4.5) we can apply directly the identity (4.4), obtaining the result trivially. Using Theorem 3.19 on the convergence of models and the local Lipschitz continuity of the reconstruction map, the distribution R̂ ε (1 + Ξ) converges in probability to R̂(1 [0,+∞) Ξ) with respect to the topology of C −3/2−κ (R × T). Since ξ ε converges in probability to ξ with respect to the topology of C −3/2−κ (R × T) (see [15,Lem 10.2]) and the operator 1 [0,+∞) (introduced in Section 2) extends continuously the multiplication by the indicator 1 + , the product 1 + (z)ξ ε converges in probability to 1 [0,+∞) ξ with respect to the same topology. We conclude by uniqueness of the limit.

Operations with the stochastic heat equation
Although modelled distributions look very unusual, the reconstruction theorem associates to them a distribution, which is a classical analytical object. Under this identification, it is possible to lift some operations on the C γ spaces directly to the level of modelled distributions, as explained in detail in [15,Sec. 4,5,6]. Moreover, this "lifting" procedure is also continuous with respect to the intrinsic topology of the modelled distributions. In what follows, we will briefly recall these operations and relate them to the stochastic heat equation.

Convolution
The first operation to define is the convolution with G, the heat kernel on R. In other words, we analyse under which conditions we can associate to any ((Π, Γ), V ) ∈ M ⋉ D γ,η a modelled distribution P(V ) ∈ D γ̄,η (Π) in a continuous way such that R(P(V )) = G * RV . (4.6) For our purposes we are not interested in describing this operation in general. Indeed, recalling the formulae (2.4) and (4.5), it is sufficient to define P only in the case of the modelled distribution V = 1 + Ξ to have an expression of u ε and u, the solutions of (1.6) and (1.1), as the reconstruction of some modelled distributions. In this case, we can restate the convolution with G as the convolution with two other kernels thanks to this technical lemma (the proof is a direct consequence of [15,Lemma 7.7]).
where K is the function introduced in the Lemma 3.11, z ∈ (−∞, T + 1] × R and v is the periodic extension of v. •R is smooth,R(t, x) = 0 for t ≤ 0 and it is compactly supported.
Thanks to this decomposition, it is sufficient to write P = K + R for some operators K and R satisfying R(K(V )) = K * RV , R(R(V )) = R̃ * RV . (4.8) Considering the case of R, we remark that R̃ * v is always a smooth function for any distribution v supported on positive times. Thus for any fixed couple ((Π, Γ), V ) ∈ M ⋉ D γ,η such that RV is supported on R + × R, the operator R can easily be defined for any γ̄ > 0 as the lifting of the γ̄-th order Taylor polynomial of R̃ * RV , that is: From this definition it is straightforward to check that R(V ) ∈ D γ̄,η (Π) for any γ̄ > 0, −2 < η < γ̄ and that R(V ) satisfies the second identity of (4.8). Moreover the map R : M ⋉ D γ,η → D γ̄,η is also continuous with respect to the topology of M ⋉ D γ,η , as a consequence of [15,Lem. 7.3]. This continuity property is a consequence of the compact support of R and it is the main reason to introduce Lemma 3.11. On the other hand, the kernel K is not smooth and the definition of K depends on the model, as a consequence of this general result (for its proof see the "Extension theorem" [15, Thm. 5.14] and the "Multi-level Schauder estimates" [15, Thm. 5.14, Prop 6.16]).
are well defined and the application is a map K : M ⋉ D γ,η → Dγ ,η (Γ 2 ) satisfying the first identity of (4.8) without any restriction on the support of R(V ). Moreover K is also continuous with respect to the topology of M.
Choosing in the definition of R the same parameters γ̄ and η̄ as for K, the map P = K + R is a well defined map P : M ⋉ D γ,η → D γ̄,η which depends continuously on the topology of the models. We will denote by K̂ ε , R̂ ε , P̂ ε (resp. K̂, R̂, P̂) the operators K, R and P associated to the model (Π̂ ε , Γ̂ ε ) (resp. (Π̂, Γ̂)). Let us calculate P̂ ε (1 + Ξ) and P̂(1 + Ξ) in this case.
Proposition 4.11. For any γ > 0 and every non-integer −3/2 + κ < η < γ, using the shorthand notation γ̄ = γ + 2, the modelled distributions U ε := P̂ ε (1 + Ξ) and U := P̂(1 + Ξ) belong respectively to D γ̄,1/2−κ U (Γ ε ) and D γ̄,1/2−κ U (Γ), and they are both given explicitly for any z = (t, x) ∈ [0, T ] × R by the formulae where Proof. The proposition is a direct consequence of the definition of K in Proposition 4.10. In particular we have immediately K̂ ε (1 + Ξ) ∈ D γ̄ Considering the explicit formula (4.12), which defines K, by definition of 1 + Ξ we have for any z ∈ R 2 Hence the function N (1 + Ξ) defined in (4.10) is identically zero in the case of K̂ ε and K̂. Summing up the definition of I, the definition of J and the identity (4.5), we obtain Applying again the identity (4.5) and the definition of R in (4.9), the formulae (4.13) and (4.14) follow from the distributional identities (4.7) and (2.6). The last identities on the reconstruction follow directly from the general identity (4.6) and the property that the kernels K and R are 0 for negative times. Thus for any z = (t, x) ∈ [0, T ] × R one has and similarly with ξ ε . This concludes the proof.
Remark 4.12. For any ε > 0 it is also possible to consider P ε , the convolution operator associated to the canonical model (Π ε , Γ ε ). Following Remark 4.8 on the modelled distribution 1 + Ξ in the case of the canonical model and the proof of Proposition 4.11, we obtain also that P ε (1 + Ξ) ∈ D γ̄,1/2−κ U (Γ ε ) and the identity P ε (1 + Ξ)(z) = U ε (z) for any z ∈ R 2 , implying R ε U ε = u ε . Following Proposition 4.11, the hypothesis γ > 0 implies γ̄ > 2. However, to reconstruct u ε and u from U ε and U, as explained in Remark 4.4, we can relax this condition by writing U ε and U as elements of D γ ′ ,1/2−κ for some 0 < γ ′ ≤ γ̄.
Writing u ε and u as the reconstruction of some modelled distribution, we obtain immediately the following convergence.
where the k-th power is taken with respect to the product in U. Denoting by C n b (R) the space of C n functions with all derivatives up to the n-th order bounded, we apply the general theory to deduce a sufficient condition to define the lifting H(V ).
Proposition 4.14. For any γ > 0, 0 ≤ η < γ, the lifting of h in (4.17) is well defined and it depends continuously on the topology of , where we recall the notation β V := min{a ∈ A\N : Q a V ≠ 0}.
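For the reader's convenience, the lifting of a smooth function to modelled distributions takes the following form in the general theory of [15]. We assume that (4.17) is of this type. The coefficient functions $v$, $a$ and the symbol $\tau$ in the example are hypothetical and serve only to display the first terms of the expansion.

```latex
% Lifting of a smooth function h (assumed form, as in [15]);
% v(z) denotes the coefficient of the unit symbol 1 in V(z).
\[
  H(V)(z) := \mathcal{Q}_{<\gamma}
  \sum_{k \ge 0} \frac{h^{(k)}(v(z))}{k!}\,
  \bigl(V(z) - v(z)\mathbf{1}\bigr)^{k} .
\]
% The sum is finite after projection, because V(z) - v(z) 1 only contains
% symbols of strictly positive homogeneity.
% Illustrative example: V(z) = v(z) 1 + a(z) \tau with |\tau| > 0:
\[
  H(V)(z) = h(v(z))\,\mathbf{1}
  + h'(v(z))\,a(z)\,\tau
  + \tfrac12\,h''(v(z))\,a(z)^2\,\tau^2
  + \dots
\]
% where the dots stand for higher powers of \tau and every term of
% homogeneity at least \gamma is discarded.
```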

Space derivative
Thanks to its definition, the regularity structure T allows us to define easily a linear map D x : U → T , which behaves like a space derivative on abstract symbols. Indeed it is sufficient to characterise D x as the unique linear map satisfying for any couple τ , σ such that στ ∈ U. Thus by composition we can define for any couple ((Π, Γ), V ) ∈ M ⋉ D γ,η U the function D x V : R 2 → T . This abstract operation can pass directly at the level of the reconstruction, thanks to the explicit structure of the models we are considering.

(4.19)
where the equality is interpreted in the sense of distributions.
Proof. By construction of the application D x and using the multiplicative property of Γ h (see the Remark 3.5), it is straightforward to prove recursively for any β ∈ A and all h ∈ R 3 the following identities Hence D x is an abstract gradient operator, as defined in [15,Def. 5.25]. Let us fix a model (Π, Γ) of the form L(Π) for some admissible map Π, then the conditions (3.17) (3.18) imply that Π : U → S ′ (R 2 ) is well defined and for any u ∈ U where the derivative ∂ x is interpreted in the sense of distributions. Summing up the properties (4.20) (4.21) and recalling the definition of Π z in (3.16), for any z ∈ R 2 and u ∈ U we obtain Therefore D x is an abstract gradient operator which is compatible with (Π, Γ), as explained in [15,Def. 5.26]. The remaining part of the statement follows from [15,Prop 6.15]. The continuous dependency on M ⋉ D γ,η U comes immediately from the definition of the metric of M ⋉ D γ,η U . Applying the Proposition 4.16 to U ε and U, we can write ∂ x u ε and ∂ x u as the reconstruction of some modelled distributions.

Product
We conclude the list of operations on modelled distributions with the notion of product between modelled distributions. Even if T is not an algebra with respect to the juxtaposition product m introduced in Section 3, we can still consider m as a well defined bilinear map on some subspaces of T such as m : U ×T → T or m : Therefore for any couple of modelled distributions V 1 , V 2 and γ > 0 we define the function as long as the point-wise product on the right-hand side of (4.23) is well defined. The behaviour of this operation is described in [15,Proposition 6.12], which we recall here.
Proposition 4.18. Let (Π, Γ) ∈ M and V 1 ∈ D γ 1 ,η 1 (Γ), V 2 ∈ D γ 2 ,η 2 (Γ) be a couple of modelled distributions such that the point-wise product is well defined. If the parameters satisfy the conditions γ > 0 and −2 < η < γ, then the function V 1 V 2 is a well defined element of D γ,η . This operation is continuous with respect to the topology of M ⋉ D γ,η U .
Remark 4.19. Unlike the other operations defined before, where we related the reconstruction operator to some classical operations on distributions, we cannot define directly the reconstruction R(V 1 V 2 ) as an analytical operation between R(V 1 ) and R(V 2 ), because there is no classical notion of product between distributions. However, in the case of the canonical model (Π ε , Γ ε ) for any fixed ε > 0, we can apply the multiplicative property of Π ε z on symbols and the explicit form of the reconstruction operator in (4.4) to obtain for any couple V 1 , V 2 ∈ D γ,η (Γ ε ) the general identity (4.25) But this property no longer holds for the operators R̂ ε and R̂.
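The argument sketched in Remark 4.19 for the canonical model can be written out in a few lines. We assume that (4.4) is the diagonal evaluation formula and that the product in (4.23) is the projection $\mathcal{Q}_{<\gamma}$ of the point-wise product of symbols. The notation $\mathcal{R}^\varepsilon$ for the reconstruction operator of the canonical model follows Remark 4.12.

```latex
% Multiplicativity of the reconstruction for the canonical model (sketch).
\begin{align*}
  \mathcal{R}^\varepsilon(V_1 V_2)(z)
  &= \bigl(\Pi^\varepsilon_z\, \mathcal{Q}_{<\gamma}[V_1(z)\,V_2(z)]\bigr)(z)
   && \text{(diagonal evaluation)} \\
  &= \bigl(\Pi^\varepsilon_z [V_1(z)\,V_2(z)]\bigr)(z)
   && \text{(discarded terms vanish at } z\text{)} \\
  &= \bigl(\Pi^\varepsilon_z V_1(z)\bigr)(z)\,
     \bigl(\Pi^\varepsilon_z V_2(z)\bigr)(z)
   && \text{(multiplicativity of } \Pi^\varepsilon_z\text{)} \\
  &= \mathcal{R}^\varepsilon V_1(z)\;\mathcal{R}^\varepsilon V_2(z) .
\end{align*}
```

The second step uses that, for a model of continuous functions, $(\Pi^\varepsilon_z\tau)(z) = 0$ whenever $|\tau| > 0$. The third step is exactly what fails for the renormalised model, whose map is not multiplicative on symbols.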
Summing up all the operations defined before, we show the existence of two specific modelled distributions related to U ε and U.
Remark 4.21. Following the proof of Proposition 4.20, the choice of the parameter γ ′ and ϕ in the statement could be replaced by a generic value γ ′ > 3/2 + κ and a function ϕ with the right number of bounded derivatives. The value 3/2 + 2κ was simply chosen in order to find the smallest subspace where the modelled distributions Φ ′ (U ε )Ξ,

Itô formula
We combine the explicit knowledge of the sequence $(\hat\Pi^\varepsilon, \hat\Gamma^\varepsilon)$ with the operations on modelled distributions defined in Section 4 to describe the random distribution $(\partial_t - \partial_{xx})\varphi(u)$ and the random field $\varphi(u)$, when u is the solution of (1.1) and ϕ is a sufficiently smooth function, as explained in the introduction. The resulting formulae will be called the differential and the integral Itô formula, in accordance with the formal definitions given in the equations (1.4) and (1.5).
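At the level of smooth functions, the formal shape of the differential formula is just the chain rule. The following computation, written for a generic smooth function $v$ (a placeholder, not an object of the paper), shows where the two terms come from.

```latex
\begin{align*}
\partial_t \varphi(v) &= \varphi'(v)\,\partial_t v, \\
\partial_{xx}\varphi(v) &= \varphi'(v)\,\partial_{xx} v + \varphi''(v)\,(\partial_x v)^2, \\
(\partial_t-\partial_{xx})\varphi(v) &= \varphi'(v)\,(\partial_t-\partial_{xx})v - \varphi''(v)\,(\partial_x v)^2 .
\end{align*}
```

For the solution u of (1.1) neither product on the right-hand side is classically defined, which is why both terms are interpreted through the reconstruction of modelled distributions.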

Pathwise Itô formulae
The first identities we show are called the pathwise differential Itô formula and the pathwise integral Itô formula. We use the adjective pathwise because these identities involve the reconstruction of some modelled distributions, an object which is defined only pathwise.
Proof. The identity (5.1) will be obtained by rearranging the equality (1.10) in terms of modelled distributions and sending ε → 0. Recalling Propositions 4.11 and 4.20, we write $u_\varepsilon = \hat{R}_\varepsilon U_\varepsilon$, where $U_\varepsilon$ is the projection on $\mathcal{D}^{3/2+2\kappa,\,1/2-\kappa}(\Gamma^\varepsilon)$ of the modelled distributions introduced in (4.13). The hypothesis on ϕ and the definition of $U_\varepsilon$ allow us to lift ϕ′ and ϕ′′ to the level of modelled distributions, and we can rewrite the identity (1.10) as On the other hand Proposition 4.20 implies that for any $z \in (0,T)\times\mathbb{T}$ as a consequence of the equation (4.4) and Proposition 3.18. From these equalities we deduce an explicit relation between the functions on the left-hand side of (5.3) and the right-hand side of (5.2). To lighten the notation we write down $\Phi'(U_\varepsilon)\Xi$ and $\Phi''(U_\varepsilon)(D_x U_\varepsilon)^2$ on the canonical basis of $Q_{<\kappa}\mathcal{T}$ and $(D_x U)^2$ on the canonical basis of $Q_{<1/2+2\kappa}\mathcal{T}$ without referring explicitly to z, the indicator $1_+$ and the Let us now send ε → 0⁺. The left-hand side of (5.7) converges in probability to $(\partial_t - \partial_{xx})\varphi(u)$ thanks to Proposition 4.13 and the fact that the derivative is a continuous operation between Hölder spaces. On the other hand, the local Lipschitz property of the reconstruction operator R in (4.3) and the convergence (4.26) imply with respect to the topology of $C^{-3/2-\kappa}((0,T)\times\mathbb{T})$. Thus the theorem holds as long as the deterministic sequence $C^1_\varepsilon - C^2_\varepsilon$ converges to 0, which is the main consequence of Lemma 6.2.
Remark 5.2. Looking at the identities (5.4) and (5.6) separately and at the convergence result (5.8), we obtain the existence of two sequences of random variables $X^1_\varepsilon, X^2_\varepsilon \in C^{-3/2-\kappa}((0,T)\times\mathbb{T})$ converging in probability such that Since we know from Lemma 6.2 that the deterministic sequences $C^1_\varepsilon$ and $C^2_\varepsilon$ both diverge, we easily obtain Thus we can rigorously justify the calculations done in the introduction.
From formula (5.1) we can identify ϕ(u) with the solution of the following equation Using the general results contained in Section 2 we immediately obtain.

Identification of the differential formula
Thanks to the explicit Gaussian structure underlying the definition of u in (1.2), in order to obtain Theorem 1.1 we can identify the terms $\hat{R}(\Phi'(U)\Xi)$ and $\hat{R}(\Phi''(U)(D_x U)^2)$ appearing in formula (5.1) with some explicit classical operations of stochastic calculus (the so-called identification theorems of the introduction). In the case of $\hat{R}(\Phi'(U)\Xi)$, this identification is done by means of a general result contained in [19]. In what follows, we denote by $(\mathcal{F}_t)_{t\in\mathbb{R}}$ the natural filtration of ξ, that is $\mathcal{F}_t := \sigma(\{\xi(\psi) : \psi|_{(t,+\infty)\times\mathbb{T}} = 0,\ \psi \in L^2(\mathbb{R}\times\mathbb{T})\})$.
Proposition 5.4. Let $(\hat\Pi, \hat\Gamma)$ be the BPHZ model and $\varphi \in C^7_b(\mathbb{R})$. Then for any smooth function ψ : R × T → R with supp(ψ) ⊂ (0, +∞) × T, one has for any t ∈ (0, T] • By sending ε → 0 one has

Proof. We start by considering the trajectories of $x \mapsto u^\varepsilon_t(x)$ for any fixed t > 0. Since u is a.s. a continuous function, the regularising property of the heat kernel P implies the desired property of its trajectories. Moreover, for any integer m ≥ 0 we can pass the derivative under the Lebesgue integral to obtain Using the straightforward bound we can obtain the formula (5.12) by writing the stochastic integral in (5.12) as a Wiener integral and applying the stochastic Fubini theorem for Wiener integrals, as explained in [26, Thm. 5.13.1]. For any fixed x ∈ T we study the process $t \mapsto u^\varepsilon_t(x)$. By definition of mild solution for the equation (1.1), u satisfies the equality (1.3) for any smooth function l : T → R, thus the identity (5.13) follows by simply setting $l(y) = P_\varepsilon(x - y)$ in (1.3). Finally, for any (t, x) ∈ [0, T] × T the a.s. Hölder continuity of u in space and time implies the convergence (5.14), using the classical property of the heat semigroup on continuous functions.

Theorem 5.6. Let $(\hat\Pi, \hat\Gamma)$ be the BPHZ model and $\varphi \in C^7_b(\mathbb{R})$. Then for any smooth function ψ : R × T → R with supp(ψ) ⊂ (0, +∞) × T, one has for any t ∈ (0, T] the a.s. identity where C : (0, T) → R is the deterministic integrable function $C(s) := \|P_s(\cdot)\|^2_{L^2(\mathbb{T})}$.
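The integrability of C can be checked by hand. The sketch below assumes the normalisation $\partial_t - \partial_{xx}$ on $\mathbb{T} = \mathbb{R}/\mathbb{Z}$ used throughout, so that $P_s$ has Fourier coefficients $e^{-4\pi^2 k^2 s}$; apart from this it uses only the symmetry and the semigroup property of P.

```latex
\begin{align*}
C(s) &= \int_{\mathbb T} P_s(y)^2\,dy
      = \int_{\mathbb T} P_s(0-y)\,P_s(y)\,dy
      = P_{2s}(0)
      = \sum_{k\in\mathbb Z} e^{-8\pi^2 k^2 s}, \\
C(s) &\le 1 + 2\int_0^{\infty} e^{-8\pi^2 x^2 s}\,dx
      = 1 + \frac{1}{\sqrt{8\pi s}} .
\end{align*}
```

In particular C behaves like $s^{-1/2}$ near the origin, which is singular but integrable on (0, T).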
Proof. We first prove the result when ψ = h ⊗ l, where h : [0, t] → R is a compactly supported smooth function and l : T → R. Since ψ is compactly supported up to time t, we can drop the operator $1_{[0,t]}$ on the left-hand side of (5.15) and apply Theorem 5.1 and Proposition 5.4, obtaining
$$\hat{R}\big(\Phi''(U)(D_x U)^2\big)(\psi) = \int_{[0,t]\times\mathbb{T}} \Big(\varphi(u(s,y))\,h'(s)\,l(y) + \varphi(u(s,y))\,h(s)\,l''(y)\Big)\,ds\,dy + \int_{[0,t]\times\mathbb{T}} \varphi'(u(s,y))\,h(s)\,l(y)\,dW_{s,y}. \qquad (5.16)$$
Let us recover the right-hand side of (5.16) via a different approximation. Using the process $u^\varepsilon$ defined in (5.11), we can apply the Itô formula to the semimartingale $h(s)\varphi(u^\varepsilon_s(y))$ for some fixed y and we obtain The left-hand side of (5.17) is a.s. equal to zero by the hypothesis on h, and we can still apply the formula (1.9) with $u^\varepsilon$ instead of $u_\varepsilon$. Hence we can rewrite the equation (5.17) as Writing the integral with respect to $dW^\varepsilon_s(x)$ as a Walsh integral, we can use the boundedness of ϕ′ and ϕ to apply a stochastic Fubini theorem on $dW_{s,y}$ (see [7,Thm. 65 Let us prove that the left-hand side of (5.19) converges in L²(Ω) to the right-hand side of (5.16). From the uniform convergence (5.14) of $u^\varepsilon$, it is straightforward to show as ε → 0 $\varphi(u_s(y))\,h(s)\,l''(y)\,dy\,ds$ a.s. and the convergence holds also in L²(Ω) because these random variables are also uniformly bounded. In the case of the stochastic integral in (5.20), the same uniform convergence of $u^\varepsilon$ in (5.20) implies that $\sup_{(s,z)\in[0,T]\times\mathbb{T}} \big| \int_{\mathbb{T}} P_\varepsilon(z-y)\,\varphi'(u^\varepsilon_s(y))\,h(s)\,l(y)\,dy - \varphi'(u_s(z))\,h(s)\,l(z) \big| \to 0$ a.s., and bounding these quantities by some constant we obtain by dominated convergence Hence the proof is complete as long as the right-hand side of (5.19) converges in L²(Ω) to the right-hand side of (5.15). Using the shorthand notations $p^\varepsilon_s(y) := \partial_x P_{\varepsilon+s}(y)$, $P^\varepsilon_s(y) := P_{\varepsilon+s}(y)$, the formula (5.12) on $u^\varepsilon$ when m = 1 becomes Thus, writing it as a Wiener integral, we can express $(\partial_x u^\varepsilon_s(y))^2$ using the Wiener chaos decomposition of a product (see [24,Prop. 1 Hence, recalling the invariance by translations of $\zeta \mapsto \int_{\mathbb{T}} P(s, \zeta - y)^2\,dy$, we write the right-hand side of (5.19) as A ε We treat both terms separately. In the case of $A^\varepsilon_1$, integration by parts in the z variable and the smoothness of P outside the origin imply for any y ∈ T Using again the invariance by translations of $\zeta \mapsto \int_{\mathbb{T}} P(s, \zeta - y)^2\,dy$, we can rewrite where the function C is defined in the statement.
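The chaos decomposition invoked above is an instance of the product formula for first-order Wiener integrals. As a sketch, write $I_1$ and $I_2$ for the Wiener integrals of order one and two, and take $f(s_1,y_1) := \mathbf{1}_{[0,s]}(s_1)\,p^\varepsilon_{s-s_1}(y-y_1)$. The name $f$ is introduced only for this illustration, and the representation $\partial_x u^\varepsilon_s(y) = I_1(f)$ is our reading of (5.12) with m = 1.

```latex
\begin{align*}
I_1(f)^2 &= I_2(f\otimes f) + \|f\|^2_{L^2([0,t]\times\mathbb T)}, \\
(\partial_x u^\varepsilon_s(y))^2
 &= I_2(f\otimes f) + \int_0^s\!\!\int_{\mathbb T} p^\varepsilon_{s-s_1}(y-y_1)^2\,dy_1\,ds_1 .
\end{align*}
```

Under this reading, the deterministic term is the source of the contribution treated as $A^\varepsilon_1$, while the second-chaos term is the one handled through the double Skorohod integral in $A^\varepsilon_2$.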
Therefore from the convergence (5.14) we obtain $\psi(s,y)\,\varphi''(u(s,y))\,C(s)\,dy\,ds$ a.s. and the convergence holds also in L²(Ω) because the sequence $A^\varepsilon_1$ is uniformly bounded. We pass to the treatment of $A^\varepsilon_2$. In order to identify its limit, we interpret the double Wiener integral $I_2$ as a multiple Skorohod integral of order 2. We then rewrite this quantity using the product formula (2.8) for $\delta^2$ to commute the deterministic integral in ds dy with the stochastic integration. Defining $U^\varepsilon_s(y) := l(y)\,h(s)\,\varphi''(u^\varepsilon_s(y))$, one has $U^\varepsilon_s(y) \in \mathbb{D}^{2,2}$ because $u^\varepsilon \in \mathbb{D}^{2,2}$ belongs to some fixed Wiener chaos and ϕ′′ has two bounded derivatives. Applying the chain rule formula for the Malliavin derivative (see [24, For any ε > 0 it is straightforward to check that the hypotheses of the product formula (2.8) are satisfied, therefore we can write
$$\dots\ \nabla_{s_1,y_1} U^\varepsilon_s(y)\,p^\varepsilon_{s-s_1}(y-y_1)\,p^\varepsilon_{s-s_2}(y-y_2)\,ds_1\,dy_1\,dW_{s_2,y_2} + \int_{[0,s]^2\times\mathbb{T}^2} \nabla^2_{s_1,y_1,s_2,y_2} U^\varepsilon_s(y)\,p^\varepsilon_{s-s_1}(y-y_1)\,p^\varepsilon_{s-s_2}(y-y_2)\,ds_1\,dy_1\,ds_2\,dy_2. \qquad (5.22)$$
The deterministic integrals on the right-hand side of (5.22) are both zero as a consequence of the trivial identity Thus we can interchange the product of $U^\varepsilon_s(y)$ with the multiple Skorohod integral of order 2. For any ε > 0 the stochastic integrand inside $dW^2_{s_1,y_1,s_2,y_2}$ is a smooth function in all its variables $s_1, y_1, s_2, y_2, s, y$, hence it is square integrable over its domain. Therefore we can apply a Fubini-type theorem for Skorohod integrals (see e.g. [25]) to finally obtain $h(s)\,l(y)\,\varphi''(u^\varepsilon_s(y))\,p^\varepsilon_{s-s_1}(y-y_1)\,p^\varepsilon_{s-s_2}(y-y_2)\,dy\,ds\,dW^2_{\mathbf{s},\mathbf{y}}$.
Let us explain the convergence of $A^\varepsilon_2$ to the multiple Skorohod integral of order two in the final formula (5.15). On the one hand, we proved that all the previous terms in the formula converge in L²(Ω). Then if the sequence of functions $h(s)\,l(y)\,\varphi''(u_s(x))\,p^0_{s-s_1}(y-y_1)\,p^0_{s-s_2}(y-y_2)\,dy\,ds$ , the theorem will follow because the multiple Skorohod integral is a closed operator. From the a.s. convergence of $u^\varepsilon$ in (5.14) it is straightforward to prove that $F^\varepsilon$ converges to F a.s. and a.e. We then conclude by dominated convergence by proving that $\|F^\varepsilon\|^2_{L^2}$, the squared norm of $F^\varepsilon$ in $L^2([0,t]^2\times\mathbb{T}^2)$, is uniformly bounded in ε. Using the symmetry of $F^\varepsilon$ in the variables $s_1$ and $s_2$, we introduce the set $\Delta_{2,t} = \{0 < s_1 < s_2 < t\}$ and, writing the square of an integral as a double integral, one has where we adopted the shorthand notation $d\mathbf{s} = ds_1\,ds_2$, $d\mathbf{y} = dy_1\,dy_2$ and we applied the Fubini theorem. Integrating by parts with respect to $y_1$ and $y_2$ and applying the semigroup property of P, we obtain s∧r 0 T Bounding the terms involving the functions ϕ, l and h with a deterministic constant and applying the rough estimate there exists a constant M > 0 such that for any ε > 0 one has which yields the claim. To conclude the result when ψ is a generic smooth function supported on (0, t) × T, we apply the formula (5.15) with a sequence of test functions $h_N \otimes l_N : (0,t)\times\mathbb{T} \to \mathbb{R}$ converging to ψ as rapidly decreasing functions. This convergence is very strong and, writing $\hat{R}\big(\Phi''(U)(D_x U)^2\big)(h_N \otimes l_N)$ as the right-hand side of (5.16), we can use the same argument as before to prove $\psi(s,y)\,\varphi''(u(s,y))\,C(s)\,dy$ .
We can then repeat the argument above to prove that the double Skorohod integral converges to the respective quantity. When ψ is a generic test function defined on (0, +∞) × T, we repeat the same calculations with the sequence of test functions $\varphi_N \psi$, where $\varphi_N$ is introduced in (2.3) and converges a.e. to the indicator function $1_{(0,t)\times\mathbb{T}}$.
is the a.s. limit in the $C^{-3/2-\kappa}$ topology of the smooth random fields where $\Lambda_n([0,T])$ denotes the dyadic grid on [0, T] × T of order n and the functions $\varphi^n_z$ are obtained by rescaling a specific compactly supported function ϕ : R × T → R. When we study the sequence (5.25) in the L²(Ω) topology, its behaviour is completely determined by the terms $\hat\Pi_z(\tau\Xi)(\varphi^n_z)$ for τ ∈ U. Thus we can apply the identity (3.44) and conclude. However, considering the same approximations for $1_{[0,t]}\hat{R}\big(\Phi''(U)(D_x U)^2\big)$, we do not have the same simplification. In particular, the splitting of the heat kernel G as a sum K + R, as explained in Lemma 3.11, makes all the calculations very indirect and does not allow us to use the explicit structure of P directly. A general methodology to describe the stochastic properties of the reconstruction operator for the BPHZ model is still missing.
Remark 5.9. From the formulae (5.9) and (5.15), we can easily write the periodic extension of the reconstruction defined above. Indeed, for any smooth function ψ : R² → R with supp(ψ) ⊂ (0, +∞) × R we have the identities $\psi(s,y)\,\varphi''(u_s(y))\,C(s)\,dy\,ds$ (5.27) Moreover, the indicator operator on the right-hand side tells us that these identities also hold for any smooth function ψ (see Remark 5.7).

Identification of the integral formula
We pass to the identification of the terms involving the convolution with P. In principle this operation is deterministic and it should be obtained by applying the previous results to the deterministic test function ψ : R × T → R given by $\psi(s,y) = P_{t-s}(x-y)$ for some (t, x) ∈ [0, T] × T. However, the function ψ is not smooth because it has a singularity at (t, x). To circumvent this obstacle we recall an additional property of the function K : R² \ {0} → R, introduced in Lemma 3.11 and Lemma 4.9.
Lemma 5.10. There exists a sequence of smooth positive functions $K_n$ : (5.28) Moreover, for every distribution $u \in C^\alpha$ with −2 < α < 0 non-integer, one has for any z ∈ R² $(K * u)(z) = \sum_{n \ge 0} (K_n * u)(z)$. Following [15, Lem. 5.19], the sequence on the right-hand side of (5.30) is a Cauchy sequence with respect to the topology of $C^{\alpha+2}$. Thus by uniqueness of the limit we obtain the equality as elements of $C^{\alpha+2}$. Since α + 2 > 0, (5.31) becomes an equality between functions, which proves the claim.
Theorem 5.11. Let $\varphi \in C^7_b(\mathbb{R})$. Then for any (t, x) ∈ [0, T] × T one has the a.s. identities $\psi(s,y)\,\varphi''(u_s(y))\,C(s)\,dy\,ds$ $\psi(s,y)\,\varphi''(u_s(y))\,\partial_x P_{s-s_1}(y-y_1)\,\partial_x P_{s-s_2}(y-y_2)\,dy\,ds\,dW^2_{\mathbf{s},\mathbf{y}}$. (5.33)

Proof. We will equivalently prove the identities (5.32) and (5.33) on the periodic extension. Using Lemma 3.11, for any periodic random distribution v and (t, x) ∈ [0, T] × R we have the general identity , we can apply directly Theorem 5.6 and the previous formulae (5.26), (5.27) to the term involving the kernel R. Thus the theorem is proved as long as these formulae are also true for the kernel K. To calculate it, we apply Lemma 5.10, obtaining for any periodic random distribution v the identity where $\eta_N : \mathbb{R}^2 \to \mathbb{R}$ is a sequence of compactly supported smooth functions and the convergence is a.s. By applying Theorem 5.6 to the sequence of functions $\eta_N$, we study the convergence of the sequence $1_{[0,t]} v(\eta_N)$ in the L²(Ω) topology when v is equal to $\hat{R}\big(\Phi''(U)(D_x U)^2\big)$ or $\hat{R}(\Phi'(U)\Xi)$. In the case $v = \hat{R}(\Phi'(U)\Xi)$ one has trivially Since ϕ′ is bounded, there exists a constant M > 0 such that for any (s, y) ∈ [0, t] × R and N ≥ 0 one has $|\eta_N(s,y)\,\varphi'(u_s(y))| \le M\,G_{t-s}(x-y)$.
The function $(s,y) \mapsto G_{t-s}(x-y)$ is $L^2$-integrable on [0, t] × R and $\eta_N(s,y)$ converges a.e. to K. By using the Itô isometry and the dominated convergence theorem, we can straightforwardly prove which yields the identity (5.26) with the kernel K. Let us consider the case $v = \hat{R}\big(\Phi''(U)(D_x U)^2\big)$. To shorten the notation we adopt the convention $g_{s-r}(x-y) := \partial_x G_{s-r}(x-y)$ and $O_t = [0,t]\times\mathbb{R}$. Looking again at the equation (5.27) we have $\int_{O_t} \eta_N(s,y)\,\varphi''(u_s(y))\,C(s)\,dy\,ds$ , $\int_{O_t\times O_t}\int_{s_2\vee s_1}^{t}\int_{\mathbb{R}} \eta_N(s,y)\,\varphi''(u_s(y))\,g_{s-s_1}(y-y_1)\,g_{s-s_2}(y-y_2)\,dy\,ds\,dW^2_{\mathbf{s},\mathbf{y}}$.
From the definition of η N , one has a.e. and a.s.
(5.37) and similarly for F K by replacing η N with K(t − s, x − y). Bounding uniformly η N and the kernel K by G t−s (x − y), we have trivially F N 2 ≤ F G 2 for every N ≥ 0 and F K 2 ≤ F G 2 where F G (s, y) := t s 2 ∨s 1 R Φ G (s, y, s, y)dsdy , Φ G (s, y, s, y) := G t−s (x − y)ϕ ′′ ( u s (y))g s−s 1 (y − y 1 )g s−s 2 (y − y 2 ).
semigroup property of G we have α 2 = 2 (∆ 2,t ×R 2 )×Ot (∇ t 1 z 1 F G (s, y)) 2 ds dydt 1 dz 1 = 2 (Ot) 2 G t−s (x − y)G t−r (x − z)ϕ (3) ( u s (y))ϕ (3) ( u r (z))Γ 3 s,r (z, y)drdsdydz , where the function Γ 3 s,r (z, y) is defined through the identities Γ 3 s,r,s,t 1 (z, y)ds 1 dt 1 ds 2 , Γ 3 s,r,s,t 1 (z, y) := G s+r−2t 1 (y − z)g s+r−2s 1 (y − z)g s+r−2s 2 (y − z) . Let us consider the term Γ 3 s,r (z, y). Using the elementary estimates we can bound each term in the sum defining Γ 3 s,r (z, y) by some integrable functions depending only on r and s. For example, in case of the first term in the sum defining Γ 3 s,r (z, y) there exists a constant C > 0 such that Writing explicitly the integral on the right-hand side, there exists a constant C ′ > 0 such that this integral is bounded by C ′ ln(s + r) 2 ( √ s + r − |s − r|) + ln(s + r) Let us denote the right-hand side of (5.41) by C T (s, r). This function is integrable on [0, t] 2 . By integrating on the remaining components and bounding the derivatives with some uniform constant, there exists a constant M > 0 such that C T (s, r)drds < +∞ .
(the factor 8 comes out because the function (∇ 2 t,z F N (s, y)) 2 is symmetric under the change of coordinates s 1 → s 2 , t 1 → t 2 and s → t). The function Γ 4 s,r (z, y) is defined through the new identities Γ 4 s,r,s,t (z, y) := G s+r−2t 1 (y − z)G s+r−2t 2 (y − z)g s+r−2s 1 (y − z)g s+r−2s 2 (y − z) . Recalling the elementary estimates in (5.40), we can similarly bound every single integral appearing in Γ 4 s,r (z, y) in the same way implying there exists an integrable function B T (r, s) such that |Γ 4 s,r (z, y)| ≤ B T (s, t). Bounding ϕ (4) we conclude there exists a constant M > 0 such that B T (s, r)drds < +∞ .
Thus we conclude that the random variables $\alpha_1$, $\alpha_2$ and $\alpha_3$ are uniformly bounded and $F_N, F_K \in H$. As a matter of fact, the previous estimates have a stronger consequence, because they imply that the functions Φ N (s, y, s, y) and Φ K (s, y, s, y) defined above are a.e. in s, y, s, y and a.s. dominated by some integrable functions. Rewriting the norm on H as follows (∇ t 1 z 1 F N (s, y) − ∇ t 1 z 1 F K (s, y)) 2 ds dydt 1 dz 1 + E (Ot) 4 (∇ 2 t,z F N (s, y) − ∇ 2 t,z F K (s, y)) 2 ds dydtdz , we obtain $\|F_N - F_K\|^2_H \to 0$ by dominated convergence, because we trivially have the a.e. and a.s. convergence of the functions Φ N (s, y, s, y) → Φ K (s, y, s, y) , ∇ t 1 z 1 F N (s, y) → ∇ t 1 z 1 F K (s, y) , ∇ 2 t,z F N (s, y) → ∇ 2 t,z F K (s, y) , which proves the theorem.
Proof of Theorem 1.1. For any $\varphi \in C^7_b(\mathbb{R})$ the differential and the integral formulae are obtained by applying the previous results directly. Looking at their proofs, we realise that the Skorohod and the Wiener integrals and their convolutions with P, unlike the reconstructions, are well defined as soon as the derivatives of ϕ are bounded up to order 4. Thus for any fixed $\varphi \in C^4_b(\mathbb{R})$ we can write the differential and the integral Itô formulae for $\varphi_\delta$, where $\{\varphi_\delta\}_{\delta>0}$ is a sequence of smooth functions with all derivatives bounded converging to ϕ. Using the same calculations as in Theorem 5.11, we can prove that the terms involving $\varphi_\delta$ converge in L²(Ω) to the same terms involving ϕ, which completes the proof.
Remark 5.12. Using the integrability of the random field u in (5.10) and looking carefully at the proof of the identity (5.33), we could slightly weaken the hypothesis on ϕ in Theorem 1.1, supposing only that the second, third and fourth derivatives of ϕ are bounded. Indeed, the function ϕ′(u) will then have linear growth and the right-hand side of (5.32) will always be well defined. In this way, the same argument given in the proof above should yield Theorem 1.1 even in this case. These slight modifications should allow us to obtain a differential and an integral formula even for the random field u², giving an interesting decomposition of this random field.
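To see why the relaxed hypothesis is natural for u², note that $\varphi(x) = x^2$ has $\varphi'' \equiv 2$ and vanishing third and fourth derivatives, while $\varphi$ and $\varphi'$ are unbounded. At the purely formal level, for a smooth placeholder function $v$ (not an object of the paper), the chain rule reads as follows.

```latex
(\partial_t-\partial_{xx})\,v^2 \;=\; 2\,v\,(\partial_t-\partial_{xx})v \;-\; 2\,(\partial_x v)^2 .
```

This is only the heuristic starting point: a rigorous counterpart for u would replace both products on the right-hand side by the objects identified in Theorems 5.6 and 5.11.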