Malliavin calculus and densities for singular stochastic partial differential equations

We study Malliavin differentiability of solutions to sub-critical singular parabolic stochastic partial differential equations (SPDEs) and we prove the existence of densities for a class of singular SPDEs. Both of these results are implemented in the setting of regularity structures. For this we construct renormalized models in situations where some of the driving noises are replaced by deterministic Cameron–Martin functions, and we show Lipschitz continuity of these models with respect to the Cameron–Martin norm. In particular, in many interesting situations we obtain a convergence and stability result for lifts of $L^2$-functions to models, which is of independent interest. The proof also involves two separate algebraic extensions of the regularity structure which are carried out in rather large generality.


Introduction
We establish Malliavin differentiability and subsequently study the existence of densities (with respect to Lebesgue measure) of finite-dimensional projections of solutions to singular stochastic partial differential equations (SPDEs). The equations we consider are of the form (1), where each component $u_i$ is in general a distribution on $\mathbb{R} \times \mathbb{T}^d$ for some $d \ge 1$, subject to some initial condition $u_i(0, \cdot) = u_{i,0}$. Here, $L_i$ is an elliptic differential operator involving only spatial derivatives, the functions $F_i$ and $F_i^j$ are smooth and allowed to depend on $u = (u_i)_{i \le m}$ and finitely many derivatives of $u$, and the random fields $\xi_j$, $j \le m$, are assumed to be jointly Gaussian.
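For orientation, the class of systems described above can be sketched schematically as follows. This is only an illustrative reading of (1) and not a quotation of it: the noises are indexed here by $j \le n$ as in Assumption 1, and the precise arguments of $F_i$ and $F_i^j$ are assumptions made for the sketch.

```latex
\begin{equation*}
  \partial_t u_i
    = L_i u_i
    + F_i(u, \nabla u, \ldots)
    + \sum_{j \le n} F_i^j(u, \nabla u, \ldots)\, \xi_j ,
  \qquad i \le m .
\end{equation*}
```

In this form the noises enter linearly, with coefficients depending on the solution, which is the structure used later when the noise is shifted in a Cameron-Martin direction.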
Equations of type (1) have been subject to intensive study in recent years and led to the development of novel technical approaches [15,17,31]. While these approaches differ in their scope and technical details, in situations where more than one of them can be applied, they lead to the same notion of solution. For the purpose of this paper we focus on the theory of regularity structures, originally developed in [17], and subsequently extended and generalized in a series of papers [4,5,10], see also [21]. Interesting examples that fall under this setting include the generalized KPZ equation [12,19,33] in $1 + 1$ dimensions, the $\Phi^p_d$ equations [4,17,26,34] in $1 + d$ dimensions for $d \le 3$, and the generalized PAM equation [15,17]
$$\partial_t u = \Delta u + f(u) + \sum_{i,j \le d} f_{i,j}(u)(\partial_i u)(\partial_j u) + g(u)\xi \qquad (4)$$
in $d = 2$ or $d = 3$ dimensions. Choosing $\xi$ as white noise, which is the natural choice in these examples, all of these equations have in common that there does not exist a solution in the classical sense. The robust solution theory of [4,5,10,17] instead considers approximate, renormalized equations that take the form (5), subject to some initial condition $u^\varepsilon_i(0, \cdot) = u^\varepsilon_{i,0}$, where $\xi^\varepsilon_j = \xi_j * \rho^{(\varepsilon)}$ for some approximate $\delta$-distribution $\rho^{(\varepsilon)}$. In [4, Thm. 2.21] it was shown that under some appropriate assumptions on the equation there exists a choice of constants $c^\varepsilon_k$ with the property that the sequence of solutions $u^\varepsilon$ converges in probability to some limiting random distribution $u$ as $\varepsilon \to 0$, and we call this limit $u$ the (renormalized) solution to (1). The counter-terms $\Upsilon^k_i$ and the renormalization constants $c^\varepsilon_k$ are given explicitly in [4, (2.12)]. We recall their definition in (16) below.
The first purpose of the present article is to establish the existence of continuous path-wise derivatives of the renormalized solution to (1) in the direction of Cameron-Martin functions (in the sense of [35, Def. 3.3.1]). This is in particular enough to obtain the existence of a localized version of the Malliavin derivative ([28, Prop. 4.1.3], [7, Prop. 2.4]), which in turn is sufficient for the celebrated Bouleau-Hirsch criterion [3] to apply. The latter gives rather sharp conditions under which densities with respect to Lebesgue measure exist.
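The role of the Bouleau-Hirsch criterion can be summarised schematically as follows. This is a sketch of the standard finite-dimensional statement (compare [28, Sec. 2.1.3]); the random vector $F = (F^1, \ldots, F^N)$, the Hilbert space $H$ and the symbol $D$ for the (localized) Malliavin derivative are notation assumed here for illustration, and the precise differentiability class required of $F$ is deliberately left unspecified.

```latex
\begin{align*}
  \gamma_F &:= \bigl( \langle DF^k, DF^l \rangle_{H} \bigr)_{k,l \le N}
    && \text{(Malliavin matrix of } F\text{)}, \\
  \gamma_F \text{ invertible almost surely}
    &\;\Longrightarrow\;
    \operatorname{Law}(F) \ll \text{Lebesgue measure on } \mathbb{R}^N .
\end{align*}
```

Since $\gamma_F$ is a Gram matrix, it is invertible exactly when the derivatives $DF^1, \ldots, DF^N$ are linearly independent in $H$, which is the form in which the condition is typically verified.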
The second purpose of this article is to show that the conditions of the Bouleau-Hirsch criterion are indeed satisfied for an interesting class of equations. The equations for which we can show existence of densities include in particular the stochastic heat equation with multiplicative noise and the $\Phi^p_d$ equations. Malliavin calculus has been used in the context of SPDEs in different settings in the past. The strategy outlined above for singular SPDEs was already used in [7] to show existence of densities for the 2D PAM equation, and a recent paper [13] treated the case of the $\Phi^4_3$ equation. On the technical level, our approach for showing Malliavin differentiability uses extensions of the regularity structure and is strongly inspired by [7] (compare also [8,9] for the rough path case), although the proofs given in the present paper differ in some key aspects, which in particular allows us to obtain statements that are more general. For the main technical step, namely the proof that the "shifted" model is well defined, we exploit a noise doubling strategy similar to the one in [22], where this has been done in the context of rough paths. In the second part of the paper we apply the Bouleau-Hirsch criterion by studying the "dual" to the tangent equation, an idea inspired by [30], where it was used to study linear SPDEs driven by degenerate noise, and by the recent work [13], which studies the existence of densities for the $\Phi^4_3$ equation. Malliavin calculus has also been used to study SPDEs interpreted in Itô's theory, see for instance [32] and the references therein. In [2] the authors consider the multiplicative stochastic heat equation with Neumann boundary conditions and prove the existence of smooth densities; this was later generalized to Dirichlet boundary conditions in [27]. 
Apart from the boundary conditions this setting is identical to (8) below; note that they obtain smoothness for densities of point evaluations, whereas in the current paper we obtain densities (but not smoothness) after testing against test functions. In [24] stochastic heat equations with coloured noise in arbitrary dimensions are treated; note that Proposition 1 can in principle deal with coloured noises, but we do not explore this topic in detail (see also the remark below Proposition 1). In [29] existence and smoothness of the law for a wide class of second order SPDEs has been shown, including the stochastic heat equation (generalizing previous results from [2,24]). That paper also treats the stochastic wave equation in no more than 3 dimensions; wave equations do not fall under our setting. Relaxing the assumptions on the coefficients allowed [11] to treat (among other things) the parabolic Anderson model in one dimension.
Finally, a series of results concerning Hörmander theorems have also been obtained for SPDEs under "degenerate" forcing, see for instance [1,25], or the recent generalization to rough forcing in [14]. Note that the main technical difficulty of these papers is quite different from ours. In these papers, showing the non-degeneracy of the Malliavin matrix is the main problem; in our setting the main problem is to show that a suitable path-wise generalization of the Malliavin derivative exists in the first place.
To make things more concrete, we put ourselves in the setting of the "black box" theorem [4, Thm. 2.21]. Given non-linearities $F_i$ and $F_i^j$, and a Gaussian noise $\xi$, this theorem establishes explicit formulae for the counter-terms and renormalization constants appearing in (5), and works out concrete assumptions on the equations under which the sequence of renormalized solutions converges.

Assumption 1
We assume throughout this paper that the assumptions of [4, Thm. 2.21] on the equation, the noises and the initial condition are satisfied. To be more precise, we assume [4, (2.5), Ass. 2.6, Ass. 2.8, Ass. 2.13, Ass. 2.15, Ass. 2.16], we assume that we are given jointly Gaussian random fields $(\xi_j)_{j \le n}$ in the sense of [4, Def. 2.17], and we assume that the initial condition can be decomposed as $u^\varepsilon_0 = S^-_{\rho,\varepsilon}(\xi)(0, \cdot) + \psi^\varepsilon$ as in [4, (2.23)], with $\psi^\varepsilon$ converging to some random initial condition $\psi$ in probability in $C^{\mathrm{ireg}}$ as $\varepsilon \to 0$. We refer the reader to Sect. 2.3 for a summary of these assumptions and the definition of the space $C^{\mathrm{ireg}}$.
For the reader not familiar with these assumptions we briefly recall their purpose. In [4, (2.5)] the authors give a rigorous meaning to the notion of sub-criticality. This is a key assumption which is seen in any of the theories developed in [15,17,31], and the equations are believed to behave quite differently when this assumption is violated. It also ensures that one can algebraically build a regularity structure adapted to the equation as in [5]. Assumption [4, Ass. 2.6] deals with compositions of the solution with smooth functions. It also limits the regularity blow-up at the initial time-slice to ensure that the solution is an actual distribution on the whole space (as opposed to just $\mathbb{R}_+ \times \mathbb{T}^d$). Throughout the solution theory developed in [17] the equations are treated in their mild formulation, and Assumption [4, Ass. 2.8] guarantees the existence of a Green's function for $\partial_t - L_i$, together with suitable analytic estimates. Assumption [4, Ass. 2.13] is a technical assumption that ensures that the solution to our equation can always be written as an explicit distribution-valued, stationary, random process, plus an implicit function-valued random perturbation (by explicit we mean that this process is given as a stationary solution to a linear equation and polynomial expressions in this solution). The explicit stationary process appearing for the regularized noise is denoted by $S^-_{\rho,\varepsilon}(\xi)$ and appears in the rather cumbersome way in which the initial conditions are phrased. This is needed, since in general the spaces in which $S^-_{\rho,\varepsilon}(\xi)$ converges as $\varepsilon \to 0$ are spaces of space-time distributions, and it follows that evaluating the limit process at a fixed time is in general not well defined. Finally, [4, Ass. 2.15 and 2.16] ensure that the analytic BPHZ theorem of [10, Thm. 2.33] can be applied, which in particular establishes the existence of a limit model. 
Most notably, this assumption rules out divergent variances in the "trees" used to build the regularity structure.
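To illustrate what sub-criticality means in practice, the following power-counting sketch treats the $\Phi^4_3$ equation driven by space-time white noise. It is a heuristic computation under the usual parabolic scaling, not a restatement of [4, (2.5)]; the exponent $\kappa > 0$ and the Hölder-type spaces $C^\alpha$ are notation assumed for this sketch.

```latex
\begin{align*}
  \xi &\in C^{-\frac{5}{2}-\kappa}
    && \text{(space-time white noise, scaling dimension } 2 + 3 = 5\text{)}, \\
  u &\in C^{-\frac{1}{2}-\kappa}
    && \text{(the heat kernel gains two degrees of regularity)}, \\
  u^3 &: \quad 3\bigl(-\tfrac{1}{2}-\kappa\bigr)
       = -\tfrac{3}{2} - 3\kappa \;>\; -\tfrac{5}{2}-\kappa
    && \text{for } 0 < \kappa < \tfrac{1}{2} .
\end{align*}
```

The formal regularity of the non-linearity is thus strictly better than that of the noise, so on small scales the non-linear term is dominated by the linear part of the equation; this is the heuristic content of sub-criticality.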
Remark 1 Assumption 1 is identical to the assumption made in [4,Thm. 2.21], which guarantees the existence of solutions, together with a precise definition of the renormalisation procedure, and is therefore a convenient starting point for our analysis. These assumptions are quite complicated. Fortunately, in the present paper we will never actually need them, other than for their black-box content: They imply that there is an "abstract fixed-point problem" in Hairer's space of modelled distributions (we recall in Sect. 2.2.5 their definition) which (1) has a solution U and (2) its reconstruction u = RU coincides with the classical solution u to a renormalised SPDE whenever the noise is smooth.
We recall from [4, Thm. 2.21] that under Assumption 1 there exists a unique maximal random time $\tau = \tau(\omega) > 0$ and a maximal solution $u = (u_i)_{i \le m}$ on $[0, \tau(\omega)) \times \mathbb{T}^d$ to (1). To be more precise, there exists a choice of constants $c^\varepsilon_k$ for $\varepsilon > 0$ and a sequence of random times $\tau^\varepsilon = \tau^\varepsilon(\omega)$ with $\tau^\varepsilon \to \tau$ in probability as $\varepsilon \to 0$ and such that the classical solution $u^\varepsilon$ to (5) with $\Upsilon^k_i$ given as in (16) exists almost surely on $[0, \tau^\varepsilon)$, and such that for $T > 0$ the sequence $u^\varepsilon$ conditioned on the event $\{\tau > T\}$ converges as $\varepsilon \to 0$ to $u$ in probability in the space of space-time distributions $\mathcal{D}'([0, T) \times \mathbb{T}^d)$.
When restricted to positive times, this convergence also takes place in the Hölder-Besov space $C^{\mathrm{reg}} := \prod_{i \le m} C^{\mathrm{reg}(i)}((0, T) \times \mathbb{T}^d)$. Moreover, the random time $\tau$ can be chosen maximal, in the sense that the statement above does not hold for any random time $\tilde{\tau}$ such that $\tilde{\tau} > \tau$ with positive probability.

Main results
We want to study the finite-dimensional law of the random variables given by testing the solution $u$ against a finite number of test functions, that is, we study the law of
$$(u_i(\phi^i_l))_{l \le L,\, i \le m} \in \mathbb{R}^{L \times m} \qquad (6)$$
for some $L \in \mathbb{N}$ and test functions $\phi^i_l \in C_c^\infty(\mathbb{R} \times \mathbb{T}^d)$. Note that our results are obtained for testing the solution against test functions, rather than considering point evaluations as in [7]. We will establish Malliavin differentiability [23,28] of these random variables, and a fortiori study the existence of densities with respect to Lebesgue measure. As has already been observed in [7] and later in [13], the classical notion of Malliavin differentiability is too strong for our purposes, as it imposes moment bounds which are simply not true in general in our setting. Instead, we are led to use a version of Malliavin differentiability more adapted to this setting, and we borrow the notion of local $H$-Fréchet differentiability from [35, Def. 3.3.1], which we recall in Definition 5 below. Denoting by $H_\xi$ the Cameron-Martin space for the jointly Gaussian random fields $\xi = (\xi_i)_{i \le n}$, our main result on Malliavin differentiability reads as follows.
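Theorem 1 Under Assumption 1, let $u$ be the solution to (1) given by [4, Thm. 2.21], let $\tau = \tau(\omega) \in (0, \infty]$ be the time of existence of $u$, let $\psi := \lim_{\varepsilon \to 0} \psi^\varepsilon$, and assume that $\psi^\varepsilon$ and $\psi$ are locally $H_\xi$-Fréchet differentiable for any $\varepsilon > 0$. Then, for any $T > 0$ and any $i \le m$, the solution $u_i$ restricted to $(0, T) \times \mathbb{T}^d$ and conditioned on $\{\tau > T\}$ is locally $H_\xi$-Fréchet differentiable, and its derivative $v_h = D_h u$ in a direction $h \in H_\xi$ is the renormalized solution to the tangent equation (7). We refer the reader to (55) below for a precise formulation of what we mean by renormalized solution to (7).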
Local $H$-Fréchet differentiability is a powerful tool to establish existence of densities due to an argument by Bouleau and Hirsch [3], see also [28, Sec. 2.1.3] and the references therein. We show existence of densities under some simplifying assumptions which we introduce in Sect. 5 below. These assumptions are somewhat technical and we refrain from stating them precisely at this stage. Instead, we refer the reader to the paragraph below Theorem 2 for an informal discussion of these assumptions and to Proposition 1 for a class of interesting equations for which these assumptions are indeed satisfied. Taking Assumptions 4, 5 and 6 from Sect. 5 for granted at the moment, our main result concerning densities is the following.
Theorem 2 Assume that Assumptions 4, 5 and 6 below hold. Let furthermore $(\xi_i)_{i \le n}$ be a family of jointly Gaussian noises on some probability space $(\Omega, \mathbb{P})$ with Cameron-Martin space $H_\xi$. Let also $T > 0$ and assume that $\mathbb{P}(\tau > T) > 0$. Then, for any $L \in \mathbb{N}$ and any family $(\phi_l)_{l \le L}$ with $\phi_l \in C_c^\infty((0, T) \times \mathbb{T}^d; \mathbb{R})^{L_-}$ of linearly independent, smooth, compactly supported $\mathbb{R}^{L_-}$-valued test functions, one has that the $\mathbb{R}^L$-valued random variable conditioned on $\{T < \tau\}$ has a density with respect to Lebesgue measure.
We briefly discuss the assumptions of the previous theorem. Assumption 4 severely limits the explicit dependence of the right hand side of (5) on the derivatives of the solution. This is done mainly for convenience, as it simplifies many computations below. Assumption 5 ensures that the renormalization constants for the "dualized" tangent equation are identical to the constants appearing in the original tangent equation, which is needed in order to pass to the limit $\varepsilon \to 0$ in the dual equation. We believe that neither of these assumptions is really necessary, and it will be the subject of future research to establish a density result that does not require them. Assumption 6 on the other hand is crucial, as it ensures that in the case of multiplicative noise the term multiplying the noise does not make it degenerate.
Instead of giving the precise assumptions at this stage we will limit ourselves to the following examples in order to demonstrate the scope of Theorem 2. We also point out that Assumption 4 rules out that derivatives of the solution appear explicitly on the right hand side, which excludes in particular the (generalised) KPZ equation.
in 1 + 1 dimension, as soon as the smooth function g does not vanish anywhere on R.
In particular, the statement of Theorem 2 holds as soon as ξ is a Gaussian noise with Cameron-Martin space dense in L 2 (R × T d ).

Remark 2
We remark that in [13] the authors obtained existence of densities for the $\Phi^4_3$ equation under the same assumptions as above. Additionally, they obtained existence of densities for noises whose Cameron-Martin space is not dense in $L^2$, but which are everywhere "rough enough" in a certain sense. We expect that it is possible to generalize these arguments to all of the examples given above.

Remark 3
On the other hand, the result obtained for the 2D PAM equation in [7] falls outside our setting. Crucially, the purely spatial white noise driving the PAM equation violates our assumption on the density of the Cameron-Martin space in $L^2((0, T) \times \mathbb{T}^d)$.
Another difference is that [7] considers point evaluations of the solutions, rather than testing against smooth test functions.

Main results on the level of the model
The main results of this work, Malliavin differentiability and existence of densities, are formulated on the level of the equation. However, the main novelty of this paper is actually contained in Theorem 7, which allows "lifting" Cameron-Martin functions and Gaussian noises jointly to an extended model. We refer to Sect. 1.4 for an outline of how to finish the proof of $H$-Fréchet differentiability of solutions of SPDEs from there. For readers not familiar with the theory of regularity structures [17] (see also [4,5,10]), we refer to Sect. 2.2 for a review of some notation and results from these papers. Recall that models $Z = (\Pi, \Gamma)$ are bounded in terms of expressions of the form $\sup_{\lambda \in (0,1)}$ for test functions $\phi$ and some $\theta > 0$; see [17, Thm. 10.7] for a precise statement [we also make use of this argument in (33) below]. For this reason, the tricky part of our argument is to show that "renormalised extended" trees are well defined for Cameron-Martin functions. Once this is established, the necessary technical results (bounds on moments of the extended model uniform in the norm of the Cameron-Martin function $h$, continuity of the extended lift as a function of the noise in some probabilistic sense, Lipschitz continuity in $h$, definition of the "shifted" model) follow from comparatively standard lines of argument.
To explain the main idea behind the "noise doubling strategy" that allows us to lift Cameron-Martin functions to extended renormalised models, we consider the cherry tree from the $\Phi^4_3$ equation. In the extended regularity structure, there are 4 trees corresponding to this cherry tree, which we visualize as [diagrams of the four trees].
Here we draw a black bullet for the abstract symbol $\Xi$, which is a placeholder for white noise, and we draw a red square for the abstract symbol $\hat{\Xi}$, which is a placeholder for a Cameron-Martin function. (Note that there are really only 3 trees, since the middle two trees are the same symbol due to symmetry.) Taking the second tree above as an example, we have for any $w \in \mathbb{R}_+ \times \mathbb{T}^3$ the identity where we write Taking squares and expectations on both sides, we have In the diagram in the middle we use a dashed line for a (regularised) $\delta_0$-distribution. The point of this argument is:
1. Both the "expectation" and the "$L^2(dy)$"-norm lead to the same type of contraction.
2. The final expression is identical (modulo the factor $\|h\|^2_{L^2}$) to the second moment of the original tree.
The latter is bounded in large generality by [10, Thm. 2.31]; this allows us to show existence of extended models on the same level of generality. (For simplicity we evaluated the trees above at a point $z \in \mathbb{R}_+ \times \mathbb{T}^3$; for the actual bounds needed, one should integrate against a (rescaled) test function instead.)
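The mechanism behind the two points above can be seen in the simplest possible case, a tree with a single kernel edge and a single noise leaf. The following is a minimal sketch under simplifying assumptions that are not taken from the text: $\xi$ is space-time white noise (so that the Cameron-Martin norm is the $L^2$-norm), $K$ is a fixed kernel, $\varphi$ is a test function, and $\psi$ is an auxiliary function introduced only for this computation.

```latex
\begin{align*}
  \langle K * h, \varphi \rangle
    &= \int h(y)\, \psi(y)\, dy ,
    \qquad \psi(y) := \int \varphi(z)\, K(z - y)\, dz , \\
  \bigl| \langle K * h, \varphi \rangle \bigr|^2
    &\le \| h \|_{L^2}^2 \, \| \psi \|_{L^2}^2
    && \text{(Cauchy-Schwarz in } L^2(dy)\text{)}, \\
  \mathbb{E} \bigl[ \langle K * \xi, \varphi \rangle^2 \bigr]
    &= \mathbb{E} \bigl[ \xi(\psi)^2 \bigr]
     = \| \psi \|_{L^2}^2
    && \text{(It\^o isometry for white noise)}, \\
  \bigl| \langle K * h, \varphi \rangle \bigr|^2
    &\le \| h \|_{L^2}^2 \;
       \mathbb{E} \bigl[ \langle K * \xi, \varphi \rangle^2 \bigr] .
\end{align*}
```

The $L^2(dy)$-norm and the expectation both contract the two copies of the leaf in the same way, so the deterministic object is controlled by the second moment of the corresponding stochastic one, up to the factor $\|h\|_{L^2}^2$.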

Application: multiplicative stochastic heat equation
We apply our results to the stochastic heat equation (8) driven by a space-time dependent Gaussian noise $\xi$ on $\mathbb{R} \times \mathbb{T}$ satisfying the assumptions of Sect. 2.2.4 and vector fields $f, g \in C^\infty(\mathbb{R})$ with $g > 0$. We refrain from stating the precise assumptions on the noise at this point as they are somewhat convoluted, but we note that these assumptions allow in particular the case of space-time white noise. The regularized and renormalized equation is given by for some constants $C^i_\varepsilon$ for $i = 1, 2, 3$, subject to a (for simplicity deterministic) initial condition $u^\varepsilon(0) = u_0$. For space-time white noise $\xi$, this equation was first derived in [20], where it was also shown that in this case one can choose $C^2_\varepsilon$ and $C^3_\varepsilon$ independent of $\varepsilon$. For more general noises, it follows from [4] that given some initial condition $u_0 \in C^{1/2}(\mathbb{T})$ there exists a choice of constants $C^i_\varepsilon$, $i = 1, 2, 3$, such that the regularized solution $u^\varepsilon$ conditioned on $\{\tau > T\}$ converges to some limit $u$ in probability in the space $C^{1/2 - \kappa}((0, T) \times \mathbb{T})$. By Theorem 1, the solution is locally $H_\xi$-Fréchet differentiable, and its derivative $v_h = D_h u$ satisfies the tangent equation subject to the initial condition $v^\varepsilon_h(0) = 0$. Furthermore, assuming that the Cameron-Martin space $H_\xi$ of $\xi$ is dense in $L^2(\mathbb{R} \times \mathbb{T})$, then for any family of linearly independent test functions $(\varphi_i)_{i \le L}$ with $\varphi_i \in C_c^\infty((0, T) \times \mathbb{T})$ the $\mathbb{R}^L$-valued random variable given by $(\langle u, \varphi_1 \rangle, \ldots, \langle u, \varphi_L \rangle)$ conditioned on the event $\{\tau > T\}$ admits a density with respect to Lebesgue measure.
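The structure of the tangent equation can be guessed from a purely formal computation. The sketch below assumes that (8) has the form $\partial_t u = \Delta u + f(u) + g(u)\xi$ and ignores all renormalization counter-terms, so it only indicates the shape of the equation and is not the renormalized tangent equation itself; the notation $u^{(r)}$ for the solution driven by $\xi + rh$ is introduced here for illustration.

```latex
\begin{align*}
  \partial_t u^{(r)}
    &= \Delta u^{(r)} + f\bigl(u^{(r)}\bigr) + g\bigl(u^{(r)}\bigr)\,(\xi + r h) ,
    \qquad u^{(r)}(0) = u_0 , \\
  v_h &:= \frac{d}{dr} u^{(r)} \Big|_{r = 0} , \\
  \partial_t v_h
    &= \Delta v_h + f'(u)\, v_h + g'(u)\, v_h\, \xi + g(u)\, h ,
    \qquad v_h(0) = 0 .
\end{align*}
```

The equation for $v_h$ is linear in $v_h$, and the Cameron-Martin direction $h$ enters only through the forcing term $g(u)\,h$. This is why the non-vanishing of $g$ matters for non-degeneracy.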

Outline of the paper
In the next section we review some of the results in the literature which we need later on. Most notation which we use in the paper will also be introduced in that section. In Sect. 2.1 we introduce general conventions on notation. We review the theory of regularity structures and singular SPDEs, see [4,5,10,17], in Sects. 2.2 and 2.3, respectively. Finally, in Sect. 2.4 we review some classical results about Gaussian measure theory in infinite dimensional spaces.
The outline of the paper is then as follows. In Sects. 3 and 4 we show existence of $H$-Fréchet derivatives in the following way.
1. As in [7], we first construct in a purely algebraic step an extended regularity structure in Sect. 3.1 by adding for any noise-type a symbol $\hat{\Xi}$ that acts as an abstract placeholder for a fixed Cameron-Martin function. The extended set of trees is then given by allowing any appearance of any noise-type in any tree to be replaced by $\hat{\Xi}$.
2. In Sect. 3.2 we perform the main analytic argument, which shows that we can lift a fixed Cameron-Martin function $h$ and Gaussian noise $\xi$ to a renormalized model that in particular has the property that $\Pi \Xi = \xi$ and $\Pi \hat{\Xi} = h$. We will show that this lift is locally Lipschitz continuous in $h$.
3. An extended model can then be mapped in a locally Lipschitz continuous way onto a "shifted" model in Sect. 3.3, which in particular shows that the model behaves in a continuous way under shifting the noise by a Cameron-Martin function.
4. In Sects. 3.4 and 3.5 we show how to lift and shift abstract fixed point problems. This will in particular allow us to consider, for a fixed Cameron-Martin function $h$, the equations driven by $\xi + rh$ for any $r \in \mathbb{R}$ in an $r$-independent model, and is thus suited to study Gâteaux differentiability of the solution map in Cameron-Martin directions.
5. Gâteaux and Fréchet differentiability are then established in Sects. 4.1 and 4.2, respectively. Finally, in Sect. 4.3 this abstract theory is applied to singular SPDEs of the type (1) under Assumption 1, and we derive in particular the tangent equation (7).
In Sect. 5 we show the existence of densities. A rough outline of this section is as follows.
1. In order to establish the existence of densities we study the dual equation (57) of the tangent equation (7). We want to lift the dual equation again to an abstract fixed point problem, and since the dual equation is a stochastic PDE going backward in time, we are led to construct another extension of the regularity structure, this time extending the set of kernel-types by adding for any kernel-type a further type representing the dualized kernel.
2. We then derive in Sect. 5.2 an abstract fixed point problem for the dual equation and we identify its reconstruction as the actual solution to the dual equation in Sect. 5.3. This step is not automatic, since it is a priori not clear that the renormalization constants obtained in these two ways coincide (it is not even clear that they differ by something of order 1 in a suitable sense, which is the main reason that Sect. 5 is less general than the rest of the paper).
3. This identification relies on Assumption 5, which basically enforces the identity that we need, and we show in Sect. C that Assumption 5 is satisfied when considering single equations (as opposed to systems of equations). Finally, we derive the existence of densities in a spirit similar to [13] by showing in Sect. 5.4 that the solution to the dual equation does not vanish identically.

General conventions on notation
We introduce some notation that is used throughout this article. Given $M \in \mathbb{N}$ we write Finally, the following terminology of multisets will be useful. A multiset $m$ with values in $A$ is an element of $\mathbb{N}^A$. Given two multisets $m, n \in \mathbb{N}^A$ we write $m \sqcup n \in \mathbb{N}^A$ for the multiset given by $(m \sqcup n)(a) := m(a) + n(a)$ for any $a \in A$.
We sometimes discuss concepts in detail for concrete examples in order to clarify notation. In these cases we use notations of the form $[a, b, c, \ldots]$ to denote multisets. For instance, we write $[a, a, b] := 2 I_a + I_b$.
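As a small worked example of this notation, the following sketch assumes that $I_a \in \mathbb{N}^A$ denotes the indicator function of $\{a\}$ and that the sum of two multisets, written here with the assumed symbol $\sqcup$, adds multiplicities pointwise.

```latex
\begin{align*}
  [a, a, b] &= 2 I_a + I_b ,
  \qquad [b, c] = I_b + I_c , \\
  [a, a, b] \sqcup [b, c]
    &= 2 I_a + 2 I_b + I_c
     = [a, a, b, b, c] .
\end{align*}
```

In particular the order of the entries in the bracket notation is irrelevant, and only the multiplicity of each element of $A$ matters.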
Given a multiset m as above and a function f on A we also freely use the notation

Regularity structures
In this section we recall the main notation and results about regularity structures that we will use in the sequel. Throughout this paper we assume we are given a finite set of types $L = L_- \sqcup L_+$. The finite set $L_+$ will index the components of the equation, while the finite set $L_-$ will index the Gaussian noises appearing on the right hand side of the equation. We assume that $L$ is equipped with a homogeneity assignment $|\cdot|_s : L_\star \to \mathbb{R}_\star$ for $\star \in \{+, -\}$. Recall from [5, Def. 5.14] that a rule $R$ is a collection $(R(t))_{t \in L}$ that assigns to any type $t \in L$ a set of multisets $R(t)$ with values in $L \times \mathbb{N}^{d+1}$. We recall the notions of normal, sub-critical and complete from [5, Def. 5.7, Def. 5.14, Def. 5.22]. Let us especially recall that a rule is subcritical if there exists a map $\mathrm{reg} : L \to \mathbb{R}$ with the property that where $(t, 0), \ldots$ denotes an arbitrary (possibly vanishing) number of occurrences of $(t, 0)$.
We assume we are given a normal, subcritical, complete rule $R$ and we denote by $\mathcal{T}^{\mathrm{ex}}$ the regularity structure constructed as in [5, Def. 5.26]. We will actually work with a slightly simplified structure as far as the extended decoration is concerned, compare Sect. 2.2.1 below. We extend the homogeneity assignment $|\cdot|_s$ to $\mathcal{T}^{\mathrm{ex}}$ in the usual way, taking into account the extended decoration, and we write $\mathcal{T} \subseteq \mathcal{T}^{\mathrm{ex}}$ for the reduced regularity structure obtained as in [5, Sec. 6.4]. We will very rarely need the homogeneity assignment that neglects the extended decoration, but in these situations we will denote this by $|\cdot|_-$ as in [5, Def. 5.3]. We write $T^{\mathrm{ex}}$ and $T$ for the sets of trees in $\mathcal{T}^{\mathrm{ex}}$ and $\mathcal{T}$, respectively, so that $\mathcal{T}^{\mathrm{ex}}$ and $\mathcal{T}$ are freely generated by $T^{\mathrm{ex}}$ and $T$ as linear spaces. We write $T^{\mathrm{ex}}_\alpha$ for the set of trees $\tau \in T^{\mathrm{ex}}$ with the property that $|\tau|_s = \alpha$, we write $\mathcal{T}^{\mathrm{ex}}_\alpha$ for the linear span of $T^{\mathrm{ex}}_\alpha$, and for $\gamma \in \mathbb{R}$ we write $Q_{<\gamma}$ for the projection of $\mathcal{T}^{\mathrm{ex}}$ onto $\bigoplus_{\alpha < \gamma} \mathcal{T}^{\mathrm{ex}}_\alpha$. Finally, we make the following Assumption on the regularity structure, which is needed to apply the results of [

Singular SPDEs are the application that we have in mind in the present publication, and the formulation of Theorem 1 is therefore on the level of the equation. However, the main technical contribution of this paper is the noise doubling strategy, which shows in Theorem 7 that Cameron-Martin functions can be "lifted" to models. This theorem is carried out completely on the level of the regularity structure and does not require that we are in the setting of a singular SPDE.
Finally, we make the simplifying assumption on the rule that we do not allow products or derivatives of noises to appear on the right hand side of the equation. (Such an assumption does not seem to be crucial for the statement, but simplifies certain arguments.)

Assumption 3
We assume that for any $t \in L$ and any $N \in R(t)$ there exists at most one pair $(\mathfrak{l}, k) \in L_- \times \mathbb{N}^{d+1}$ such that $N_{(\mathfrak{l}, k)} \neq 0$, and in this case $k = 0$ and $N_{(\mathfrak{l}, 0)} = 1$.

Trees
Trees $\tau \in T^{\mathrm{ex}}$ can be written as typed, decorated trees $\tau = (T^{\mathfrak{n},\mathfrak{o}}_{\mathfrak{e}}, \mathfrak{t})$, where $T$ is a rooted tree with vertex set $V(T)$, edge set $E(T)$ and root $\rho_T$, the map $\mathfrak{t}$ assigns types to edges and is formally a Here we define the decomposition of the set of edges into $E(T) = L(T) \sqcup K(T)$ with $e \in L(T)$ (resp. $e \in K(T)$) if and only if $\mathfrak{t}(e) \in L_-$ (resp. $\mathfrak{t}(e) \in L_+$), and we write $N(T) \subseteq V(T)$ for the set of $u \in V(T)$ such that there does not exist $e \in L(T)$ such that $u = e^\uparrow$. We will often abuse notation slightly and leave the type map $\mathfrak{t}$ and the root $\rho_T$ implicit. We recall that the relation between the homogeneity assignments $|\cdot|_s$ and $|\cdot|_-$ is given by $|\cdot|_s = |\cdot|_- + \sum_{u \in N(T)} \mathfrak{o}(u)$, so that in particular one has $|\tau|_s \le |\tau|_-$ for any tree $\tau \in T^{\mathrm{ex}}$.
On a rooted tree $T$ we define a partial order $\le$ on $V(T)$ by setting $u \le v$ if and only if $u$ lies on the unique shortest path from $v$ to the root $\rho_T$, and we write edges $e \in E(T)$ as ordered pairs $e = (e^\uparrow, e^\downarrow)$ with $e^\uparrow \ge e^\downarrow$. If $u \in V(T) \setminus \{\rho_T\}$, then there exists a unique edge $e \in E(T)$ such that $u = e^\uparrow$, and in this case we write $u^\downarrow := e$. Recall that it follows from the fact that $R$ is normal (cf. [5, Def. 5.7]) that elements $u \in V(T) \setminus N(T)$ are leaves of the tree $T$.
Given a typed, decorated tree τ as above, k ∈ N d+1 and t ∈ L + we write J k t τ for the planted, decorated, typed tree obtained from τ by attaching an edge e to the root with type t and e(e) = k. We write T t ⊆ T for the set of trees τ ∈ T such that J 0 t τ ∈ T .
Example 2 Throughout the paper we will consider examples from the stochastic heat equation (8) whenever we need to clarify notation. In particular, we often consider a fixed example tree $\tau$ [tree diagram], for which we introduce the following graphical conventions:
• root $\rho(\tau)$, element of $N(\tau)$
• node, element of $N(\tau)$
• edge of kernel type, element of $K(\tau)$
• edge of noise type, element of $L(\tau)$

Algebraic notation
We use the notation T ex − ,T ex − , T ex + ,T ex + , G ex − and G ex + from [5] for the respective spaces defined in [5,Def. 5.26,(5.23), Def. 5.36], and we write G − for the reduced renormalization group defined above [5,Thm. 6.29]. We recall that T ex − and T ex + form Hopf algebras and G ex − and G ex + are defined as their respective character groups. We use the notation ex − and ex + for the co-products for negative and positive renormalization respectively, as in [5,Cor. 5

Models
We assume that for any $t \in L_+$ we are given a decomposition of the Green's function into $G_t = K_t + R_t$ with $R_t \in C^\infty(D)$ and such that $K_t \in C_c^\infty(D \setminus \{0\})$ satisfies [17, Ass. 5.1, Ass. 5.4], and given the kernel assignment $(K_t)_{t \in L_+}$ we recall the definition of admissible models [17, Def. 2.7, Def. 8.29]. We call a model $Z = (\Pi, \Gamma)$ smooth if $\Pi_x \tau \in C^\infty(D)$ for any $\tau \in T^{\mathrm{ex}}$ and any $x \in D$, and we call $Z$ reduced if $\Pi_x \tau$ does not depend on the extended decoration of $\tau$. Given an admissible tuple $\boldsymbol{\Pi} \tau \in C^\infty(D)$ for $\tau \in T^{\mathrm{ex}}$ we write $Z(\boldsymbol{\Pi})$ for the model constructed as in [5, (6.11), (6.12)], whenever this is well defined, and we write $\mathcal{M}_\infty$ for the set of smooth, reduced, admissible models for $\mathcal{T}^{\mathrm{ex}}$ of the form $Z(\boldsymbol{\Pi})$. We write $\mathcal{M}_0$ for the closure of $\mathcal{M}_\infty$ in the space of models, and given a probability space $(\Omega, \mathbb{P})$ we write $\mathcal{M}^{\mathrm{rand}}$

Remark 6 We use the notation $\Omega_\infty$ and $\Omega_0$ to be consistent with previous papers on this topic, e.g. [10, Def. 2.13]. This should not be confused with the (abstract) probability space $(\Omega, \mathbb{P})$ used throughout the paper.

Gaussian driving noises
We now discuss in some detail the assumptions we need on the driving noises; these assumptions are identical to those made in [10, (2.7)]. First, given a probability space $(\Omega, \mathbb{P})$ we write $\mathfrak{M}_\infty := \mathfrak{M}_\infty(L_-)$ for the space of $\Omega_\infty$-valued centred, stationary, jointly Gaussian random fields $(\eta_t)_{t \in L_-}$ on $(\Omega, \mathbb{P})$. Next, we want to introduce a set $\mathfrak{M}_0$ of $\Omega_0$-valued Gaussian random fields $(\eta_t)_{t \in L_-}$ that we will consider as driving noises for our SPDE. Given an $\Omega_0$-valued jointly Gaussian, stationary, centred random noise $\eta$ on $(\Omega, \mathbb{P})$, we denote by $C^\eta_{t,t'} \in \mathcal{D}'(D)$ the distributional covariance of $\eta_t$ and $\eta_{t'}$ defined via the identity for any $\varphi, \psi \in C_c^\infty(D)$. We note that this is well defined by stationarity. The next definition is as in [10, (2.7)]. We fix for any $k \in \mathbb{N}^{d+1}$ a function $P^k \in C_c^\infty(D)$ such that $P^k(x) = x^k$ in a neighbourhood of the origin.

Definition 1
We write C(L − ) for the space of families of kernels (C t,t ) t,t ∈L − such that C * t,t = C t ,t in the sense that one has for any t, t ∈ L − and any choice of test functions φ, ψ ∈ C ∞ c (D), such that C is non-negative definite in the sense that one has for any family φ ∈ C ∞ c (D) L − , such that the singular support of the distribution C t,t ∈ D (D) is contained in {0}, and denoting by Ĉ t,t the smooth function representing C t,t away from the origin, we require that
• for any test function ϕ ∈ C ∞ c (D) such that D k ϕ(0) = 0 for any |k| s < −|t| s − |t | s − |s|, one has C t,t (ϕ) = ϕ(x)Ĉ t,t (x)dx; and
• there exists θ > 0 such that one has C |·| s < ∞.
Here, we define the quantity We write M 0 = M 0 (L − ) for the set of 0 -valued, jointly Gaussian, centred, stationary random fields η with the property that C η t,t ∈ C(L − ), and we write η |·| s := C η |·| s . Finally, we make precise the type of approximations used below.

Definition 2
Given an element η ∈ M 0 and a mollifier ρ ∈ C ∞ c (D) with ρ(x)dx = 1, we call the sequence η ε := η * ρ (ε) ∈ M ∞ an approximation by mollification of η. We say that a map X from M ∞ into a topological space X can be extended continuously by mollification to M 0 if there exists a map X̂ : M 0 → X that extends X and is such that whenever η ε ∈ M ∞ is an approximation by mollification of η ∈ M 0 in the sense above, then X (η ε ) → X̂ (η).
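The approximation by mollification of Definition 2 can be illustrated numerically. The following is only a minimal sketch under simplifying assumptions that are not part of the text: the domain is replaced by a one-dimensional periodic grid, η is a discretized white noise, and the names `bump`, `mollifier_weights`, `mollify` and `pair` are hypothetical helpers introduced for this illustration. It tests η ε = η * ρ (ε) against a fixed smooth function and records how far the pairing is from the pairing of η itself.

```python
import math
import random

def bump(x):
    """Smooth bump supported in (-1, 1); normalisation happens later."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

def mollifier_weights(eps, dx):
    """Discrete stand-in for rho^(eps)(x) = eps^{-1} rho(x / eps).

    The weights are normalised to sum to 1, the discrete analogue of
    the condition that rho integrates to 1.
    """
    half = int(math.ceil(eps / dx))
    raw = [bump(j * dx / eps) for j in range(-half, half + 1)]
    total = sum(raw)
    return [w / total for w in raw]

def mollify(eta, eps, dx):
    """Periodic discrete convolution eta * rho^(eps) on a uniform grid."""
    w = mollifier_weights(eps, dx)
    half = len(w) // 2
    n = len(eta)
    return [sum(w[j + half] * eta[(i - j) % n] for j in range(-half, half + 1))
            for i in range(n)]

def pair(f, phi, dx):
    """Riemann-sum pairing <f, phi> on the grid."""
    return sum(a * b for a, b in zip(f, phi)) * dx

# Assumed toy setting: grid white noise on the unit circle.
n = 512
dx = 1.0 / n
rng = random.Random(0)
# Independent N(0, 1/dx) values, so <eta, phi> has variance close to |phi|_{L^2}^2.
eta = [rng.gauss(0.0, 1.0) / math.sqrt(dx) for _ in range(n)]
phi = [math.sin(2.0 * math.pi * i * dx) for i in range(n)]

# Distance between the mollified and the original pairing for decreasing eps.
errors = [abs(pair(mollify(eta, eps, dx), phi, dx) - pair(eta, phi, dx))
          for eps in (0.2, 0.05, 0.0125)]
```

In this toy setting the entries of `errors` shrink as ε decreases, which is the weak convergence η ε → η tested against one smooth function. The sketch says nothing about the much stronger convergence of renormalized models discussed in the text.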

Modelled distributions
We recall the terminology and notation from [17, Sec. 3, Sec. 6] of modelled distributions. Given a model Z ∈ M 0 and γ > 0, η ∈ R we write D γ,η for the space of singular modelled distributions defined in [17, Def. 6.2] allowing a singularity at the hyperplane t = 0. More precisely D γ,η consists of all maps f : D → T <γ with the property that, setting P = {z ∈ D : z 0 = 0}, one has that is finite for any compact K ⊆ D. Here, the first supremum runs over all z,z ∈ K \P with the property that z −z s ≤ |z 0 | the reconstruction operator defined in [17,Prop. 6.9], provided that α ∧ η > −|s| + s(0), where α ≤ 0 denotes the regularity of the sector V .
Finally, we denote for any t ∈ L + by K t the operator constructed in [17, (5.15)] acting between D γ,η V and D γ +|t| s ,(η∧α)+|t| s for any sector V of regularity α such that one has RK t f = K t * R f for any f ∈ D γ,η . We also define the operator P t f : Of course, the operators K t and P t depend slightly on γ . Since γ will always be clear from the context, we leave it implicit in this notation.
In [17,Sec. 6] basic properties of certain maps (multiplication, differentiation, integration, composition) between spaces of modelled distributions were derived, and we summarize them in Proposition 8 below.

BPHZ theorem
Given a smooth Gaussian field η ∈ M ∞ we define as in [5, (6 , and extending this linearly and multiplicatively, where η is such that Z η = Z( η ) is the canonical lift of η. We then define the BPHZ-character g η BPHZ ∈ G − as in [5, (6.24)] by setting for any τ ∈ T ex − := {τ ∈ T ex : |τ | − < 0}, and extending this linearly and multiplicatively. For any character g ∈ G ex − we use the notation M g : T ex → T ex for the linear operator given by and for a smooth Gaussian field η ∈ M ∞ we set for any τ ∈ T ex , and we define the BPHZ-renormalized model Ẑ , which we always assume in this paper.

Singular SPDEs
In [4] the authors established a black box theorem for solving a large class of singular SPDEs of the form (1). We briefly recall the notation introduced there, as far as we will need it later on. In order to unify the notation, we assume that #L + = n and #L − = m and we write (u t ) t∈L + and (ξ t ) t∈L − rather than (u i ) i≤n and (ξ j ) j≤m . We recall that we assume that for t ∈ L + we are given a differential operator L t involving only spatial derivatives and such that ∂ t − L t admits a Green's function for the map given by [4, (A.10)], and we assume that the initial condition u ε 0 is of the form for a sequence of C ireg -valued random fields ψ ε such that ψ ε → ψ in probability as ε → 0. We now recall the definition of the counter-terms appearing on the right-hand side of (5). For this we borrow some more notation from [4].
For t ∈ L + we often write F • t := F t in order to avoid case distinctions. The smooth functions F t are allowed to depend on D k u l where l ∈ L + and where k ∈ N d+1 ranges over a finite set of multi-indices, say |k| s ≤ r . Consequently, it makes sense to define for any l ∈ L + and k ∈ N d+1 the derivative D (l,k) F t of F t in the direction of (D k u l ). We will reserve the symbol ∂ for derivatives in direction of space-time variables.
Definition 3 For l ∈ L + we say that a tree τ ∈ T is l-non-vanishing if for any μ ∈ N (τ ) one has that does not vanish identically for any smooth function u : We write T F l for the set of trees τ ∈ T that are l-non-vanishing and are such that J 0 l [τ ] ∈ T , and we write T F l,− for the set of τ ∈ T F l such that |τ | s < 0, and we writeT

It follows from a straightforward inductive argument that the definition of l-non-vanishing given above coincides with [4,Def. 2.12]. We now define the counter-terms appearing in the renormalized equation as in [4, (2.12)].

Definition 4
For l ∈ L + and τ ∈T F l we define the function where we use the notation from Definition 3 for the type t(μ) ∈ L + .

Example 3
In the case of the stochastic heat equation (8) one has F • t = f and F t = g. Moreover, for the tree τ = from Example 2 one has

Given a character g ∈ G − and a smooth Gaussian field η ∈ M ∞ (L − ) we define the g-renormalization of (1) by Here S(τ ) ∈ N is a symmetry factor explicitly given in [4, (2.22)]. We are thus in the setting of (5) with K = #T F t,− . Given initial conditions as above and letting u ε be the solution to (17) with η replaced by ξ ε and g replaced by g ε BPHZ , the statement of [4,Thm. 2.21] says precisely that u ε converges to some limit u in probability as ε → 0.

Gaussian measure theory
In this section we review basic facts about Gaussian measures on infinite dimensional spaces as far as they are needed for the purpose of this paper. We follow in this section mostly the lecture notes [16] for basic properties of the Cameron-Martin space. For more details we refer the reader to standard literature [28]. Given a separable Fréchet space X we call a centered probability measure μ on X equipped with the Borel sigma field Gaussian if all finite dimensional projections of μ are Gaussian. A Gaussian measure μ is uniquely determined by its covariance operator C μ : X * → X defined via the identity l * (C μ (k * )) = E μ [k * l * ] for any k * , l * ∈ X * . We denote the image of the covariance operator by • H μ ⊆ X and we equip this space with a scalar product given by h, H μ under the norm induced from this scalar product is a Hilbert space known as the Cameron-Martin space. It is well known that H μ → X continuously, the space H μ determines the Gaussian measure μ uniquely, and one has the following classical result due to Cameron and Martin [6].
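In finite dimensions these objects can be written out explicitly, which may help to fix ideas. The following sketch assumes X = R^2 with a diagonal covariance matrix, an assumption made purely for illustration. In that case the Cameron-Martin scalar product reduces to the standard expression ⟨h, k⟩ = Σ h_i k_i / c_i, and the classical Cameron-Martin formula states that the law shifted by h has density exp(⟨h, x⟩ − ½⟨h, h⟩) with respect to μ. The function names are hypothetical.

```python
import math

def gaussian_density(x, mean, variances):
    """Density of a Gaussian on R^n with diagonal covariance diag(variances)."""
    value = 1.0
    for xi, mi, ci in zip(x, mean, variances):
        value *= math.exp(-(xi - mi) ** 2 / (2.0 * ci)) / math.sqrt(2.0 * math.pi * ci)
    return value

def cameron_martin_inner(h, k, variances):
    """Finite-dimensional Cameron-Martin scalar product <C^{-1} h, k>."""
    return sum(hi * ki / ci for hi, ki, ci in zip(h, k, variances))

def cameron_martin_density(x, h, variances):
    """Density of the law shifted by h with respect to the centred law."""
    return math.exp(cameron_martin_inner(h, x, variances)
                    - 0.5 * cameron_martin_inner(h, h, variances))

# Assumed toy data: covariance diag(2, 0.5), shift h, evaluation point x.
variances = (2.0, 0.5)
h = (0.3, -0.7)
x = (1.1, 0.4)
zero = (0.0, 0.0)

# Ratio of the shifted density to the centred density, computed directly.
ratio = gaussian_density(x, h, variances) / gaussian_density(x, zero, variances)
```

Here `ratio` and `cameron_martin_density(x, h, variances)` agree up to rounding. In infinite dimensions the same formula only makes sense for shifts h in the Cameron-Martin space, which is why differentiability is studied in these directions.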
We note that this version of local H -Fréchet differentiability was also used in [7, Def. 2.2] and [13,Def. 4.1]. If X is locally H μ -Fréchet differentiable we denote its H μ -Fréchet derivative by D X. The main motivation for this definition is the criterion of Bouleau and Hirsch [3] for the existence of densities. In order to deal with situations in which the solution does not exist globally, we use a slightly generalized version, and for this we make the following construction. Let U ⊆ X be a measurable subset of X. We say that U is H μ -open if for any x ∈ U there exists ε > 0 such that for any h ∈ H μ with h H μ < ε one has x + h ∈ U . We fix an H μ -open set U and we define for ε > 0 the set U ε as the ε-involution of U in H μ , i.e.
We then assume that there exists a sequence of locally H μ -Fréchet differentiable random variables ϕ ε : X → R for ε > 0 that approximates the indicator function I U from the inside in the sense that one has 0 ≤ ϕ ε ≤ 1, and ϕ ε (x) = 1 for any x ∈ U ε and ϕ ε (x) = 0 for x ∈ U c . If such a sequence exists, we say that U can be approximated from the inside.
Theorem 5 (Bouleau-Hirsch) Let X be an R n -valued random variable on a Gaussian probability space (X, μ) with separable Cameron-Martin space H μ , and let U ⊆ X be an H μ -open measurable subset of X such that U can be approximated from the inside. Let moreover X ⊆ X be the event that D X has full rank, and assume that μ(U ∩ X ) > 0. Then X conditioned on the event U ∩ X admits a density with respect to Lebesgue measure.
We will see that the time of existence τ is lower semi-continuous with respect to H μ -shifts, see Remark 12 below, which implies in particular that the event {τ > T } is H μ -open. The fact that {τ > T } can be approximated from the inside can be shown exactly as in [7,Lem. 5.3], and this leads to the following.
be the solution to a singular SPDE of the form (1) on a Gaussian probability space ( , P), let X :

We now implement the general constructions from this section in the situation that the underlying Fréchet space is given by = t∈L − D (D), and the Gaussian probability measure P on is a stationary, centred Gaussian measure such that its covariance C P is an element of C(L − ). Note in particular that in this case the random field ξ which P-almost surely agrees with the identity on X is an element of M 0 . Given such a Gaussian measure P, we will henceforth use the convention that ξ denotes this particular random Gaussian field, while η will still be used to denote more general random fields on whose laws under P are Gaussian. We will usually leave the measure P implicit in the notation and one should always think of P as arbitrary but fixed. It is straightforward to see that the space • H ξ is then given by It follows in particular that one has . We finish this section by introducing the following terminology. It is then not hard to see that for any ξ as above one has that

Extension and translation of models
In this section we introduce the main technical tools and show key estimates needed to prove Theorem 1. On the technical level, pathwise differentiability (in Cameron-Martin directions) of solutions to singular SPDEs can be effectively studied by introducing an extended regularity structure. The basic idea, which was already used in [7], is to extend the regularity structure T ex by adding for any noise type a new noise type ˆ , which plays the role of an abstract placeholder for a fixed Cameron-Martin function. We perform this extension in two separate steps. We first construct in a purely algebraic step, using the formalism developed in [5], an extended regularity structure. Afterwards we show in an analytic step, building on the result of [10, Thm. 2.31], that any fixed Cameron-Martin function h can indeed be "lifted" to a renormalized extended model, and, crucially, this lift is locally Lipschitz continuous in h.
For any r ∈ R the original regularity structure then maps into the extended structure via a map S r , which is essentially the multiplicative extension of the map → + rˆ . Conversely, any extended model maps onto a model for the original structure via the "dual" map S * r , which can be viewed as implementing this shift on an analytic level.
At the end of this section we are going to lift abstract fixed point problems to the extended regularity structure, and, using the shift operator, we also make sense of shifted fixed point problems.

Extension of the regularity structure
In the sequel it will be useful to consider general extensions of the set of noise types, and we are led to make the following general construction. Given a finite set I we define a new set of noise types byL for any ∈ L − and any i ∈ I . We extend the homogeneity assignment | · | s to L I − by setting |( , i)| s := | | s for any ∈ L − and any i ∈ I . In order to avoid case distinctions, we will sometimes add a distinct element to the index set by setting I := I { }, and we identify L I − with L − × I . Starting from the set of noise-types L I − and the (unchanged) set of kernel-types L + , we can consider an extension of the rule R to a rule R I , which is defined by allowing any appearance of any noise types ∈ L − being replaced by any extended noise type of the form ( , i) for i ∈ I .
To be more precise, with the notation N I := N L I ×N d+1 , we define a rule R I : for any t ∈ L + , and for any t ∈ L and any k ∈ N d+1 , where the sum runs over allt ∈ L I with qt = t. The following Lemma shows that one can construct a regularity structure T ex [I ] starting from the extended rule R I as in [5].

Proof In order to see that R I is subcritical, recall from [5, Def. 5.14] and the fact that R is subcritical that there exists a function reg : L → R with the property that for any t ∈ L. We extend reg to a function reg : L I → R by setting reg(t) := reg(qt) for any t ∈ L I . Then one has reg(N ) = reg(qN ), where qN is as in (20), and thus the fact that (21) holds for any t ∈ L I is a trivial consequence of the respective bound for qt and the fact that |qt| s = |t| s . Completeness (cf. [5,Def. 5.22]) is a little tedious to verify, but completely straightforward.
Note that we could always consider the completion of R I as in [5,Prop. 5.21], so that showing completeness is not really crucial. The construction in [5,Sec. 5.5,Sec.6] results in a number of spaces which are all completely determined by the rule R I . We adopt the convention that we use the notation X [I ] to denote a space X constructed from R I , and we sometimes drop [I ] from the notation, whenever I is clear from the context. In particular, we writeT ex One has the obvious embedding T ex → T ex [I ], and multiplicatively extended we obtain a Hopf algebra This embedding between the Hopf algebras induces a natural group monomorphism between their character groups G ex − → G ex − [I ], which is defined by extending any character g ∈ G ex − in such a way that g(τ ) vanishes for any tree τ outside of T ex − . We will use all of these embeddings implicitly, so that in particular we view T ex − as a sub Hopf algebra of T ex − [I ], and we view G ex − as a subgroup of G ex − .

Given an admissible family τ ∈ C ∞ (D) for any τ ∈ T ex we write Z( ) for the admissible model constructed as in [5, (6.12),(6.13)] and, as before, we write M I ∞ and M I 0 for the set of smooth reduced admissible models of the form Z( ) and for their closure in the space of models, respectively. For any finite set I we consider , and we write elements of this set as tuples (η, h) or (η, (h i ) i∈I ), with h ∈ ∞ (L I − ) and h i ∈ C ∞ (D), depending on the situation. Note that one has (η, h) ∈ ∞ (L − ) almost surely, and thus the canonical lift of (η, h) to a random admissible model Z η,h ∈ M I ∞ is well defined. Finally, we denote by Ẑ η,h BPHZ the BPHZ renormalization. Here, we denote by g η BPHZ ∈ G ex − ⊆ G ex − the BPHZ character for the smooth stationary noise η, and we use the convention introduced above to view any character g ∈ G ex − also as a character of T ex − . The particular case that I = {1} will play the most important role in the sequel.
In this case we use the shorter notationˆ := ( , 1) for any ∈ L − , and we write , and similar for the other spaces defined above. We call T ex the onefold extension of T ex . More generally, if I = {1, . . . , m}, then we call T ex [I ] the m-fold extension of T ex .
Extensions of the set of noise-types can be used to conveniently encode shifts and differences between canonical lifts of smooth functions to models. This construction will allow us later in particular to almost automatically obtain Lipschitz bounds from uniform bounds applied to extended regularity structures. We make two constructions that we will use throughout the paper. For this, let T ex be a regularity structure obtained from some noise-type set L − , and let as above T ex be its onefold extension.
The first construction concerns "shifts" of models. For this, we introduce the operator S : T ex → T ex by setting for any typed, decorated tree (τ, t) ∈ T ex , where the sum overt runs over all maps t : E(τ ) → L − such that qt = t. The shift operator algebraically encodes a binomial expansion of a tree τ when it is interpreted for the "shifted" noise f + h. The following Lemma shows in particular that this binomial expansion interacts nicely with the action of renormalization.
on T ex for any z ∈ D.

Example 4
As an example, consider the tree from Example 2. We graphically represent the shifted noise type ˆ by , so that one has S = + + + . The left and the right hand side of Eq. (23) for g = 1 then read respectively

Proof For the identity element g = 1 * ∈ G ex − , this follows directly from the definition of the canonical lift and the definition of the shift operator (22). Indeed, since the canonical lift is multiplicative, it suffices to show (23) on planted trees τ . If τ = for some ∈ L − , then S τ = +ˆ , and one has h+k then one has S τ = J k t S σ , and the result follows inductively from the respective identity for σ and the admissibility condition [17, (8.19)], see also [5, (6.14)].
If g ≠ 1 * we use the fact that S commutes with the co-product (see Lemma 6 below) and the fact that by our convention g vanishes outside of T ex − , which by definition of S implies in particular that one has the identity gS = g on T ex − . It follows that one has on T ex . Comparing this with M g = (g ⊗ Id) ex − , the result follows by applying the first part of the proof to the right components of these tensor products.
The second construction we carry out in this section concerns differences between shifts of canonical lifts. To this end we consider the sets I := {h} and J := {h, k, h−k} and the corresponding extended noise-type sets In order to construct the next operator, it will be helpful to fix for any tree τ ∈ T ex [I ] a total order onL(τ ). We then define a linear operator A : T ex [I ] → T ex [J ] by setting for any typed, decorated tree (τ, t) ∈ T ex [I ] for any edge e ∈ E(τ ). Here, t e denotes the first component of t e , so that t e = (t (1) e , h) for any e ∈L(τ ). We write A and t [u] if we want to highlight the underlying family of total orderings on the sets L(τ ) for τ ∈ T ex − used to construct A . The point of this total ordering is that we want to expand the difference z into a telescoping sum, and the statement below will be valid for any total order ofL(τ ). We will use this telescoping sum later on in order to obtain almost automatically local Lipschitz bounds from uniform bounds. The next Lemma shows in particular that this telescoping sum interacts nicely with the action of renormalization. Now, directly from the definition we get that whenever u, v ∈L(τ ) are adjacent with respect to and such that u v, then one has that t + A τ turns into a telescoping sum, which is equal to For a general character g ∈ G ex − , note that the first part of the proof implies in particular that A is independent of . Since moreover by our convention the character g vanishes outside of T ex − it follows with an argument almost identical to the one given in Lemma 2 for the shift operator S that A satisfies the following identity 4 . We conclude as in the proof of Lemma 2.
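The telescoping mechanism behind the operator A can be checked in a scalar toy model. The following is only a sketch under assumptions that are not taken from the text: every leaf is evaluated to a single number, and the ordering convention (leaves before the distinguished one carry k, leaves after it carry h) is one possible choice, since the precise convention of the operator A is not recoverable here. The function name `telescoping_terms` is hypothetical.

```python
from fractions import Fraction
from math import prod

def telescoping_terms(h, k):
    """Split prod(h) - prod(k) into one term per leaf.

    In the j-th term the leaves before position j carry k, the leaf at
    position j carries the difference h - k, and the leaves after it
    carry h (an assumed convention for this illustration).
    """
    terms = []
    for j in range(len(h)):
        term = Fraction(1)
        for i in range(len(h)):
            if i < j:
                term *= k[i]
            elif i == j:
                term *= h[i] - k[i]
            else:
                term *= h[i]
        terms.append(term)
    return terms

# Assumed toy values attached to three leaves of a tree.
h = [Fraction(2), Fraction(3, 2), Fraction(-1, 3)]
k = [Fraction(1, 2), Fraction(5), Fraction(4)]

difference = prod(h) - prod(k)
terms = telescoping_terms(h, k)

# A different total order on the leaves gives different terms but the same sum.
order = [2, 0, 1]
terms_reordered = telescoping_terms([h[i] for i in order], [k[i] for i in order])
```

Each term contains exactly one factor of the form h − k, which is the reason a uniform bound on the individual terms turns into a local Lipschitz bound for the difference. Reordering the leaves changes the individual terms but not their sum, in line with the remark that the statement holds for any total order.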

Extension of models
We now assume that we are given a partition of the set of noise-types into L − = L rand − L det − . We want to consider noises which are random, centred, stationary and Gaussian for ∈ L rand − and deterministic for ∈ L det − . To this end, we introduce the notation that given a pair (η, f ) ∈ M ∞ (L rand − ) × ∞ (L det − ) we write Z η, f for the canonical lift of the tuple η f to a random model. In such a situation, we furthermore want to consider a modification of negative renormalization that only takes into account diverging subtrees τ which have the property that all leaves u ∈ L(τ ) have types t(u) ∈ L rand − . Denoting the set of trees τ ∈ T ex − with this property by T ex − [L rand − ], we define the character g , and g η BPHZ (τ ) = 0 otherwise, and extending this linearly and multiplicatively. Finally, we define the renormalized model by We write L rand (τ ) and L det (τ ) for the set of u ∈ L(τ ) with the property that t(u) ∈ L rand − and t(u) ∈ L det − , respectively.

Remark 8
We are mainly going to be interested in the setting where T ex is itself a onefold extension with noise types L − L − , and one has L rand − = L − and L det − =L − . Note that in this case the notation of canonical lifts Z η, f and the BPHZ character g η BPHZ introduced above coincides with the notation introduced in Sect. 3.1.

4 Note however that such an identity is not true on the purely algebraic level, that is, without hitting the right component on both sides with the operator f ,h,k,h−k z . Indeed, since a tree τ can contain multiple subtrees which are different as subtrees in τ but identical as algebraic objects, there is in general no choice of total order onL(τ ) for any tree τ ∈ T ex with the property that such a statement becomes true.
We recall that, following arguments similar to [17,Thm. 7.8], see also [ uniformly in λ ∈ (0, 1), for some κ > 0, any τ ∈ T of negative homogeneity and any φ ∈ C ∞ c (D), compare [17,Thm. 7.8]. These moments can be conveniently represented as a finite sum over BPHZ-renormalized evaluations of graphs, each obtained via a "pairing" of the leaves of two disjoint copies of τ , and the bound (26) follows from bounding each of these contractions separately. This was carried out in [10] by applying a purely analytical BPHZ theorem for (hyper-)graphs. In the sequel we will need to work with a slightly different formulation of this analytical bound and in order to state it we introduce some notation. We begin with a simple lemma.

Lemma 4
Given w, z ∈ D and a tree τ ∈ T there exists a unique locally integrable function z;w τ : D L(τ ) → R, smooth away from the big diagonal 5 and away from w, z, symmetric under any permutation σ of L(τ ) with the property that t • σ = t, and such that one has for any η ∈ ∞ .
Proof The proof is straightforward using induction over the number of edges of τ , the fact that the canonical lift is an admissible model, and the fact that η w is multiplicative for the tree product.
Furthermore, for any f ∈ ∞ (L rand − ) and C ∈ C(L det − ) we define the function It follows that f ,C z;w is symmetric under any permutation σ of L det (τ ) with the property that t•σ = t on L det (τ ), so that we can naturally view the domain of definition of , t], compare the notation introduced in Sect. 2.1. We will switch frequently between these two pictures in the sequel.
Given an L det − -valued multiset m and a kernel C ∈ C(L det − ), we let • H m,C be the space given by all functions F ∈ C ∞ c (D m ) which can be written in the form for some G ∈ C ∞ c (D m ), endowed with the scalar product where G andḠ are as in (27) for F andF respectively, and we write H m,C for the closure of • H m,C under the induced norm. We also write T ex [m] ⊆ T ex for the linear subspace spanned by trees τ ∈ T ex with the property that one has m det (τ ) = m, and we note that for any f ∈ ∞ (L rand − ) and any w, z ∈ D one has Finally, it follows directly from the definition of the coproduct ex − and the character g η BPHZ that for any η ∈ M ∞ (L rand − ) one has for any multiset m, so that it makes sense to define the random variable The following theorem contains the key analytic bound on which the analysis below is based.

Remark 9
The reason for writing the above expression in this unusual way is that we are going to apply Cauchy-Schwarz estimates in the Hilbert space H m,C .
Proof First note that we can replace C t,t with a regularization C ε t,t := C t,t * ρ (ε) for some symmetric, non-negative definite mollifier ρ and then take the limit ε → 0, so that it suffices to show (29) for smooth kernels C ∈ C(L det − ). Furthermore, by definition C ∈ C(L det − ) is non-negative definite when viewed as an integral operator on L 2 (D) L det − , and as a consequence there exists a (unique) Gaussian, centred, stationary noiseη ∈ M ∞ (L det − ) with the property that for any choice of test functions φ, ψ ∈ C ∞ c (D) and any t, t ∈ L det − . Enlarging the probability space ( , P) if necessary, we can additionally assume that = ( rand × det ) and P = P rand ⊗ P det for some probability measures P rand on rand and P det on det , respectively, and such that η respectivelyη is a collection of random fields on ( rand , P rand ) respectively ( det , P det ). In particular, one has that η andη are independent, and we write ξ := η η ∈ M ∞ (L − ). We denote as usual the BPHZ character for ξ by g ξ BPHZ ∈ G ex − and we denote the BPHZ renormalized canonical lift of ξ to a model byẐ We first assume that the tree τ ∈ T has the property that for any noise type ∈ L det − there exists at most one u ∈ L(τ ) such that t(u) = . We claim that in this case one has from which (29) follows from Theorem 3. In order to see (30) note that from the assumption on τ and the fact that ξ t and ξ t are independent for any t ∈ L det − and any t ∈ L rand − it follows that g ξ BPHZ vanishes on subtrees σ ⊆ τ with the property that σ / ∈ T ex − [L rand − ], which in turn implies that one has the identity A fortiori it follows that one has − such thatt(e) = t(e) for any kernel-type edge e ∈ K (τ ) and such that the following holds.
• For any noise-type edge u ∈ L det (τ ) one hast We note the following consequences of this definition: For anyt ∈ one has (τ,t) ∈ T ex [m] and (τ,t) satisfies the assumption of the first part of the proof. It remains to apply the results of the first part to any of the trees (τ,t) individually, note that one has and use a Cauchy-Schwarz estimate. Here S is a symmetry factor given by S = In order to continue, we first note a relation between the norms of the spaces (18) and · m,C defined in (28).

Lemma 5 Let m be any multi-set with values in
Then one has

Proof By definition (28), it suffices to show the statement for #m = 1. In this case write t ∈ L det − for the type such that one has m = {t} and note that, writing π t : for the projection given by (π t (k)) t := k t δ t,t for any The result now follows from the fact that π t is a continuous projection, since both kernel and range of π t are closed in H [L det − , C ].
Recall that for τ ∈ T ex we defined the multiset m det (τ ) = [L det (τ ), t], and with this notation we define for any h ∈ H [L det − , C ] the quantities We stress that [·] τ fails to be a semi-norm unless #m det (τ ) = 1. Our ultimate goal is to show that (η, f ) →Ẑ η, f BPHZ can be extended by mollification to η ∈ M 0 (L rand − ) and f ∈ H [L det − , C ] for any kernel C ∈ C(L det − ). As a preparation for this statement, we show the following result.
Proof For any fixed γ > 0 and compact K ⊆ D the pseudo metric ·; · γ,K induces a complete metric space M 0 (K ) via metric identification, so that it is sufficient to show that one has the bound (32) for any h, k ∈ We first show that one has the bound from which a Cauchy-Schwarz estimate on the Hilbert space H m det (τ ),C shows that the left hand side of (34) can be estimated by where F z (x L(τ ) ) := D dy y;z τ (x L(τ ) )φ λ z (y). Comparing the second term in this expression with Theorem 6, the estimate (34) follows.
The bound (33) is now an almost immediate consequence of (34) applied to the extended regularity structure and Lemma 3 applied for g = 1 * . Indeed, first note that one has for any typed tree (τ, t) ∈ T ex the identity 6 where on the right hand side we denote the canonical lift of (0, h) to a model for the onefold extension T ex of T ex , and we writet := t for any t ∈ L + . By Lemma 3 it follows that one has Applying (34) to each tree on the right hand side of this identity, we obtain the desired bound. Note for this that Lemma 5 implies in particular that

The main stochastic ingredient for the proof of Theorem 7 below is the following bound, for which we introduce the notationẐ We then have the following.

Proposition 3 Let T ex be a regularity structure constructed as in Sect. 2.2 satisfying Assumption 2, and assume that we are given the decomposition L − = L det − L rand − , and let C ∈ C(L det − ). Then there exists κ > 0 such that one has for any compact K ⊆ D, any p ≥ 1, any C > 0, and any τ ∈ T ex , and any test function

Proof First note thatẐ η,h BPHZ is almost surely well defined for any η ∈ M ∞ (L rand − ) and any h ∈ H [L det − , C ] by Proposition 2. Furthermore, using Gaussian hypercontractivity it suffices to show this proposition for p = 1.
We now write h h τ ∈ H m det (τ ),C for the function in (31), and we use the identity so that by Cauchy-Schwarz we obtain the estimate The result now follows from Theorem 6.
We demonstrate the idea of the proof of Proposition 3 with the following example.

Example 5
We consider a situation similar to Example 2, but this time we assume we have two noise types L − = {, } with ∈ L rand − and ∈ L det − , and we write H := H [{ }, C ]. We want to derive in detail the bound (36) for and p = 1. First, we can write where on the right hand side we take the · H -norm of the map y →ˆ η,δ y z (φ λ z ), where δ y denotes the δ-distribution centred around y. This is a slight abuse of notation; explicitly, this expression is equal to Now, letη ∈ M 0 be a Gaussian noise independent of η and such that E[η(x),η(y)] = C (x − y). It then follows from the definition of the Hilbert space H that we have the identity From this expression we obtain the bound (36) from [10, Thm. 2.31].
The main result of this section is the following theorem.

BPHZ can be extended by mollification to a continuous map from
Moreover, for any η ∈ M 0 (L rand − ) there exists a null set N such that for any ω ∈ N c one has that the map h →Ẑ η,h BPHZ (ω) is locally Lipschitz continuous in the sense that for any γ ∈ R, any compact K ⊆ D and any R > 0 one has Finally, given an approximation η ε ∈ M ∞ (L rand − ) of η there exists a subsequence ε → 0 and a null set N with the property that η ε (ω) → η(ω) andẐ and ω ∈ N c .

Remark 10 At this stage we point out a key difference between the present approach and [7]. In the latter the authors obtain a stronger statement, as they construct a deterministic continuous extension operator which maps (Ẑ η BPHZ (ω), h) onto Ẑ η,h BPHZ (ω) for almost every fixed ω. This however comes at the expense of analytical difficulties that could only be overcome in a very special case. In the current paper, in contrast, the analytic difficulties are bypassed by constructing the extension stochastically, which in particular allows us to use the results of [10, Thm. 2.31] to show the necessary analytic estimates. This comes at the expense of a somewhat weaker statement but has the advantage of being immediately completely general.

Remark 11
The proof of this theorem relies on a "noise doubling" strategy. This is very similar to the strategy exploited previously in [22] in the context of rough path theory.
Proof We first note that as a consequence of Proposition 3 for any η ∈ M 0 (L rand − ) and uniformly in ε > 0, where η ε ∈ M ∞ (L rand − ) is an approximation of η. To see this, note that the first bound above is a consequence of Proposition 3 using an argument identical to the one given in [17,Thm. 10.7]. The existence of the extensionẐ from now on in order to simplify notation. Upon choosing a sub-subsequence, this follows from the estimate uniformly over ε > 0. Using again an argument identical to the one given in [17,Thm. 10.7] we see that (39) is implied once we show for any fixed φ ∈ C r c (D) and any τ ∈ T ex the bound uniformly over λ ∈ (0, 1) for any p ≥ 1.
Now we note that one has for any (τ, t) ∈ T ex the identityˆ follows that one has the identity and by definition of A we conclude that (36) implies (40).
The fact that the null set N such thatẐ is now a direct consequence of the uniform bound of the local Lipschitz constants.
The main application of the previous theorem is the situation in which the regularity structure is itself a onefold extension with noise types given by L − L − , and one has that L rand − = L − and L det − =L − . This leads to the following immediate corollary.

Shift of models
We recall the map q : L → L from Sect. 3.1 defined as the identity on L and mappinĝ onto for any ∈ L − . We extend this map to a projection q : T ex → T ex by defining for any typed tree τ = (τ, t) ∈ T ex with type map t : E(T ) → L the tree q(τ, t) := (τ, qt), and extending q linearly to all of T ex . An important role is then played by the following operator, which generalizes the operator defined in (22).

Definition 7
For τ ∈ T ex we denote by S [τ ] ⊆ T ex the set of trees σ ∈ T ex such that qσ = τ . For any r ∈ R we define the linear operator S r : T ex → T ex by setting for any tree τ ∈ T ex where m(σ ) := #L(σ ). We call S r the shift operator, and we write S r [T ex ] ⊆ T ex for the image of the shift operator.

Example 6
On the tree from Example 2, writing = andˆ = , we obtain the formula S r = + r + r + r 2 .
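As a hedged illustration of Definition 7 and Example 6, the following display spells out the shift operator on a tree with two noise edges. The symbols \Xi and \hat\Xi (a noise edge and its hatted copy) and \mathcal{I} (an abstract integration map) are notation assumed here for illustration only. We also assume that the defining formula reads S_r τ = Σ_{σ ∈ S[τ]} r^{m(σ)} σ, with m(σ) counting the hatted noise edges of σ, which is consistent with the four-term formula of Example 6.

```latex
% Sketch only: \Xi, \hat\Xi and \mathcal{I} are illustrative names, not the paper's notation.
\begin{align*}
S_r(\Xi) &= \Xi + r\,\hat\Xi, \\
S_r\big(\Xi\,\mathcal{I}(\Xi)\big)
  &= \Xi\,\mathcal{I}(\Xi) + r\,\hat\Xi\,\mathcal{I}(\Xi)
   + r\,\Xi\,\mathcal{I}(\hat\Xi) + r^2\,\hat\Xi\,\mathcal{I}(\hat\Xi) \\
  &= S_r(\Xi)\,\mathcal{I}\big(S_r(\Xi)\big), \\
\partial_r\big|_{r=0}\, S_r\big(\Xi\,\mathcal{I}(\Xi)\big)
  &= \hat\Xi\,\mathcal{I}(\Xi) + \Xi\,\mathcal{I}(\hat\Xi).
\end{align*}
```

The factorization in the third line is what one expects if S r is multiplicative and commutes with integration, and the last line isolates the first-order term in r, in which exactly one noise is replaced by its hatted copy.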
We extend the shift operator linearly and multiplicatively to the algebrasT ex − andT ex + , as well as to the Hopf algebras T ex − and T ex + . The following is a simple consequence of the definition.

Lemma 6 For any r ∈ R the shift operator S r commutes with the action of both Δ ex − and Δ ex + on T ex . In particular, its image S r [T ex ] forms a sector in T ex and S r commutes with the action of renormalization M g for any g ∈ G ex − . Moreover, S r maps T into T, is multiplicative under the tree product, and commutes with the operation of compositions with smooth functions, integration and differentiation.
Proof The fact that S r commutes with the action of the co-products is tedious to verify, but straightforward.
To see that S r [T ex ] forms a sector, let = (Id⊗γ ) + ex for some character γ ∈ G ex + as in [5, (6.13)]. Then one has for any τ ∈ T ex . Similarly, for any g ∈ G ex − one has Here we used the fact that by definition of the embedding G ex − → G ex − one has that g = gS r for any g ∈ G ex − . The remaining statements of the lemma are a simple consequence of the definitions.
It follows directly from the definition that the operator S r is one to one, so that we can define its inverse S −1 for any x, y ∈ D.
Proof First note that the right-hand side of the second identity in (41) makes sense, since by Lemma 6 the image S r [T] forms a sector, so that it is invariant under x,y . In order to see that the map S * r : M ∞ → M ∞ is well defined, note first that it is clear from the definition that S * r maps admissible onto admissible S * r , so that it remains to show the required analytic bounds. Noting that the map S r leaves homogeneities of trees invariant, these analytic bounds follow once we show that the identity (41) holds for any and given by the expressions [5, (6.12)] and [5, (6.13)], respectively, but with replaced by S * r . But as a direct consequence of Lemma 6 it follows that S r commutes with the positive twisted antipode Ã ex + , so that one has f z (S * r ) = f z ( )S r , where f z is as in [5, (6.12)]. Plugging this into the respective formulae for and , the identities in (41) follow.

7 We write L(X, Y ) for the space of continuous linear maps from X to Y .
Finally, the fact that S * r extends to a locally Lipschitz map on M 0 follows straightforwardly from the identity (41) and the definition of the metric in the space of models [17, (2.17)].
With this notation we have the following Theorem.
Theorem 8 Let N ⊆ be the null set constructed in Corollary 2. Then one has for any ω ∈ N c , any r ∈ R, and any h ∈ H ξ the identity 8  (42) with ξ replaced by ξ ε for any ε > 0. For this in turn it is sufficient to show the identity S * r (R g Z f ,h ) = R g Z f +rh for any r ∈ R, any g ∈ G − and any f ∈ ∞ , which is a simple consequence of Lemma 6. The rest of Theorem 8 is then a direct consequence of Theorem 7 and the fact that S * r is locally Lipschitz.
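To make the identity S * r (R g Z f ,h ) = R g Z f +rh more concrete, the following sketch checks it on a single noise symbol in the smooth case with trivial renormalization. It assumes, as our reading of the first identity in (41), that the shifted model acts by \Pi_x^{S_r^* Z} \tau = \Pi_x^{Z} S_r \tau, and that the canonical lift Z^{f,h} realizes the noise symbol as f and its hatted copy as h. The symbols \Xi and \hat\Xi are illustrative notation.

```latex
% Sketch only: one noise symbol, smooth f and h, no renormalization.
\begin{align*}
\Pi_x^{S_r^* Z^{f,h}}\,\Xi
  = \Pi_x^{Z^{f,h}}\, S_r\,\Xi
  = \Pi_x^{Z^{f,h}}\big(\Xi + r\,\hat\Xi\big)
  = f + r\,h
  = \Pi_x^{Z^{f+rh}}\,\Xi .
\end{align*}
```

On more complicated trees the same mechanism applies edge by edge, which is why the general statement reduces to the algebraic properties of S r collected in Lemma 6.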

Lifts of abstract fixed point problems
We now describe the class of abstract fixed point problems on the spaces D γ,η V that we consider in the sequel.
Let V ⊆ T be a sector spanned by a set of trees V. Then the space V ⊆ T spanned by all trees σ ∈ S [τ ] for some τ ∈ V forms a sector in T. More generally assume we are given for any t ∈ L + a sector V t in T spanned by sets of trees V t , we write V t for the sector in T constructed as above, and we write Moreover, assume we are given additionally exponents γ = (γ t ) t∈L + and η = (η t ) t∈L + . For any model Z ∈ M 0 we then define the space D γ,η V (Z ) and a system of seminorms · γ,η,K for any compact K ⊆ D by

We fix from now on families of sectors V t and V t in T for t ∈ L + , both spanned by sets of trees, and families of exponents γ = (γ t ) t∈L + , η = (η t ) t∈L + , γ = (γ t ) t∈L + and η = (η t ) t∈L + with γ t > 0 and γ t ≥ γ t − |t| s and η t , η t ∈ R. We then recall the following terminology from [17,Sec. 7.3].

Definition 8 Given a model Z ∈ M 0 , we call a map F : D γ,η Lipschitz if for any compact set K ⊆ D and any R > 0 one has the bound uniformly over all f , g ∈ D γ,η uniformly over all models Z ∈ U and f ∈ D γ,η Following our usual convention, we will drop the dependence on the model Z from the notation whenever there is no room for confusion. We say that F is a strongly locally Lipschitz family for ( V,V) if we want to emphasise the underlying sectors. We want to consider a class of strongly locally Lipschitz families that admit lifts to the extended regularity structure as described in the next definition.

Definition 9
Let F be a strongly locally Lipschitz family for ( V,V). Then we call a family F Z ,(r ) for Z ∈ M 0 and r ∈ R a lift of F if for any fixed r ∈ R the family F (r ) = (F Z ,(r ) ) Z ∈M 0 is a strongly locally Lipschitz family for ( V,V), one has that F Z ,(r ) ( f ) is jointly Lipschitz continuous in ( f , Z , r ), i.e. one can strengthen (44) to uniformly additionally for |r | ∨ |s| ≤ R, and one has the identity on V for any r ∈ R and Z ∈ M 0 . Here, on the left hand side of (46) we apply F for the model S * r Z and on the right hand side we apply F (r ) for Z . We call F a C 1 -lift if additionally one has that for any fixed model Z ∈ M 0 the map (r , with strongly locally Lipschitz continuous derivatives. In the case that such a (C 1 -) lift exists, we say that F admits a (C 1 -) lift.

Shift of abstract fixed point problems
In this section, if not explicitly otherwise stated, we make the notational convention that given Z ∈ M 0 and r ∈ R we write and We will show how to use lifts of strongly locally Lipschitz continuous non-linearities to relate abstract fixed point problems for the model Z ∈ M 0 to abstract fixed point problems for Z ∈ M 0 and consequently how to "shift" these fixed point problems in directions of Cameron-Martin functions. We start with the following Lemma.

Finally, S r maps strongly locally Lipschitz families
Proof In order to see that S r maps D γ,η , it suffices to use the identity xy S r = S r xy given by Lemma 6 and to note that S r does not change homogeneities of trees. The identity (48) is a direct consequence of the properties of the reconstruction operator, in particular [17, (3.3)], and the first identity in (41).
Let now F be a strongly locally Lipschitz family for ( V,V) and let F (r ) be a lift of F. We assume from now on that the pairs of sectors ( V t ,V t ) are chosen such that P t :V t → V t for any t ∈ L + . We also fix a strongly locally Lipschitz family W Z ∈ D γ,η V (Z ) for Z ∈ M 0 , and we define for any Z ∈ M 0 and r ∈ R the function We consider the fixed point problems for U and U (r ) t given by in D γ,η V (Z ) and D γ,η V (Z ) respectively for any model Z ∈ M 0 and Z ∈ M 0 and any r ∈ R. Sinceγ t ≥ γ t − |t| s and the right hand sides are locally Lipschitz continuous by definition, it follows from [17,Thm. 7.8] (see also [4,Thm. 5.21]) that there exist unique maximal solutions U and U (r ) to these equations. We denote the maximal time of existence by T (Z ) and T (Z , r ), respectively. We also define the stopping time τ (ω) := T (Ẑ ξ BPHZ (ω)). In this setting, we have the following result.

Proposition 4
Fix Z ∈ M 0 and r ∈ R, and let U and U (r ) be the unique solutions to the fixed point equations (50) for the models S * r Z and Z , respectively. Then one has

Proof Since the solutions to these fixed point equations are unique, we only need to show that S r U satisfies the second equation in (50). For this note that and since it follows directly from the definition that one has P t I + S r = S r P t I + the result follows.
We stress at this point that the function U on the right hand side of (51) depends on r through the model S * r Z .

Remark 12
is additionally lower semi-continuous in r ∈ R.

Remark 13
The modelled distribution W (r ),Z will be used in order to deal with the initial condition. The assumption that W Z is locally Lipschitz continuous in the model Z ∈ M 0 is unreasonably strong, as it ends up imposing that the initial condition is a locally Lipschitz continuous map of the model. 9 For the existence of the local H ξ -Fréchet derivative it is sufficient to assume that for any ω ∈ N c the map is locally Lipschitz continuous, which is a trivial consequence of the assumption that the initial condition is locally H ξ -Fréchet differentiable. Under this less restrictive assumption, the statement of Remark 12 is no longer true. However, this assumption is still sufficient to ensure in the same way as above that the time of existence is lower semi-continuous with respect to r , which ultimately ends up ensuring that the time of existence of the solution u is lower semi-continuous with respect to Cameron-Martin shifts. A similar Remark applies to the functions V Z ,(r ) and r 0 (Z , T ) introduced in Proposition 5 below.

The Malliavin derivative
We show in this section that the reconstructed solutions to the abstract fixed point problems considered in Sects. 3.4 and 3.5 admit a local H ξ -Fréchet derivative. In Sect. 4.3 we apply this abstract result to singular SPDEs of the form (1).

Differentiability of the solution to the abstract fixed point problem
We show that U (r ) is differentiable in r ∈ R with values in D γ,η V . For this let F be a C 1 -liftable, strongly locally Lipschitz family for a pair of sectors ( V,V) = ( V t ,V t ) t∈L + with the property that P t :V t → V t for any t ∈ L + , and let F (r ) be a C 1 -lift of F. Let moreover W Z ∈ D γ,η V (Z ) for Z ∈ M 0 be a family as in the previous section, and assume that additionally the map r → W (r ),Z defined as in (49) is Fréchet differentiable as a map from R into D γ,η V (Z ) for any Z ∈ M 0 . Finally, for any Z ∈ M 0 let U and U (r ) be the solutions to (50) for S * r Z and Z , respectively. We then have the following. Proof We fix for the first part of the proof a model Z ∈ M 0 . By definition one has that

Proposition 5 Under the assumption at the beginning of this section, for any Z ∈ M 0 and any T
for any model Z ∈ M 0 . In order to see that r → U (r ) is Fréchet differentiable, we make use of the implicit function theorem 10 on the map : in a neighbourhood of (0, U (0) ). Note that by (50) one has (0, U (0) ) = 0. Since Q <γ t P t I + is a bounded linear operator from Dγ t ,η t ,T Proposition 8 it also Fréchet differentiable, and since by assumption one has that r → W (r ) is Fréchet differentiable as well, it follows that is a . We now show that the derivative of with respect to U is an isomorphism . By definition, the derivative D 2 (r , U ) is a bounded linear operator between these spaces, which is given by so that we are left to show that this expression is invertible. This is equivalent to solving, for any fixed , which is in the form of a fixed point problem and admits a unique solution by [17,Thm. 7.8]. This follows from the fact that the map V t → D F is linear and continuous, and thus it is also Lipschitz continuous.
By uniqueness of solutions to the fixed point problem (50) we infer that one has necessarily u(r ) = U (r ) , so that it follows in particular that U (r ) is C 1 in (−r 0 , r 0 ). In order to see the identity (52) for the derivative, note that at this point all functions appearing in (50) are Fréchet differentiable in r ∈ (−r 0 , r 0 ), so that (52) follows by differentiating the right hand side of this identity. In order to see (53) it suffices to show local Lipschitz continuity in r and Z separately. The former follows from arguments identical to those above, which show the stronger statement of Fréchet differentiability of V (r ) in r . For the latter, by [17,Thm. 7.8] For the last part of the theorem assume that r 0 has been chosen maximally, and assume that for some r 1 > r 0 one has that T (Z , s) < T for all r ∈ (−r 1 , r 1 ). We can then redo the arguments in the first part of the proof with (0, U (0) ) replaced by (r 0 , U (r 0 ) ) to obtain s 0 > 0 such that s → U (r 0 +s) is C 1 as a map from (−s 0 , s 0 ) into D γ,η,T V (Z ), which shows that U (r ) is a C 1 map on (−r 0 , r 0 + s 0 ). A similar argument shows that the lower bound can be improved and yields a contradiction. The lower semicontinuity of r 0 is now a consequence of the lower semicontinuity of T (Z , s) in (Z , s).
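The implicit function theorem argument used in this proof can be summarized schematically as follows. This is a sketch under simplifying assumptions: G(r, U) denotes the right-hand side of the fixed point problem for U (r ) (a name introduced here for illustration), the projections and the type index t are suppressed, and V^{(r)} stands for the derivative of U^{(r)} in r.

```latex
% Sketch only: G(r,U) is an illustrative name for the right-hand side of the fixed point problem.
\begin{align*}
\Phi(r,U) &:= U - G(r,U), \qquad \Phi\big(r,U^{(r)}\big) = 0, \\
D_2\Phi(r,U)\,V &= V - D_2 G(r,U)\,V, \\
D_2\Phi(r,U)\,V = Y
  &\iff V = D_2 G(r,U)\,V + Y, \\
V^{(r)} := \partial_r U^{(r)}
  &= D_2 G\big(r,U^{(r)}\big)\,V^{(r)} + \partial_1 G\big(r,U^{(r)}\big).
\end{align*}
```

The third line is the linear fixed point problem whose unique solvability gives invertibility of the derivative with respect to U, and the last line is the schematic form of the identity obtained by differentiating the fixed point equation in r.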

Local H-differentiability of the solution
As a consequence of Proposition 5 we can show that the reconstructed solution map u = RU is Gâteaux differentiable in H ξ directions.

Lemma 9
Let ω ∈ N c and let h ∈ H ξ . Then for any T < τ(ω) there exists r 0 > 0 such that the map , T ) be as in Proposition 5. By Theorem 8 and Proposition 4 one has that Proof For fixed ω ∈ N c and h ∈ H ξ let r 0 (ω, h) := r 0 (Ẑ ξ,h BPHZ (ω)) be as in Proposition 5. Then one has the identity h denotes the derivative of U (r ) in the direction of r as in Proposition 5 for the model Ẑ ξ,h BPHZ (ω). Since h → Ẑ ξ,h BPHZ (ω) is locally Lipschitz continuous by Corollary 2, it follows that for any fixed ω ∈ N c the map h → r 0 (ω, h) is lower semi-continuous. Since furthermore one has r 0 (ω, 0) = +∞, there exist μ > 0 and a ball B μ (0) ⊆ H ξ around the origin such that one has r 0 (ω, h) > 1 for any h ∈ B μ (0). Now it follows from (42) that for any h, k ∈ B μ (0) one has so that it follows in particular from Proposition 5 that h → u(ω + h) is Gâteaux differentiable in B μ (0) with Gâteaux derivative given by so that it remains to show that this expression is continuous in (h, k) ∈ H ξ × H ξ . This follows from the fact that R is strongly Lipschitz continuous, the map (h, k) → Z ξ,k BPHZ (ω + h) is locally Lipschitz continuous and by (53) one has that V (0) is strongly locally Lipschitz continuous.

Application to subcritical SPDEs
We now apply the result of the previous section to abstract fixed point problems arising from singular SPDEs and conclude the proof of Theorem 1. That is, we show that under the assumption introduced in Theorem 1, the solution u to the singular SPDE (1) admits a local H -Fréchet derivative, and in this situation we can furthermore derive a "tangent equation" (7) for this Fréchet derivative, which is informally given by differentiating the original equation with respect to the noise. The precise meaning of (7) is that v h can be written as a limit v h = lim ε→0 v ε h , where the random smooth function v ε h = (v ε t,h ) t∈L + is the unique classical solution to the system of equations with initial condition v ε t,h (0) = D h u ε t,0 . It is not hard to see that v ε h is the H ξ -Fréchet derivative D h u ε of the solution u ε to the regularized and renormalized Eq. (5) in the direction of h ∈ H ξ . Note that both ψ ε and S − ρ,ε (ξ )(0, ·) are locally H ξ -Fréchet differentiable (for the former this follows by assumption, while for the latter this follows from the explicit definition of S − ρ,ε (ξ ) in [4, (A.10), (5.10)], which imply in particular that S − ρ,ε (ξ )(0, ·) takes values in some inhomogeneous Wiener chaos), so that the same is true for u ε 0 = ψ ε + S − ρ,ε (ξ )(0, ·).

Remark 14 The tangent equation (7) is in the form of a singular SPDE, however it does not fall under the setting of [4] since it involves a source h which is deterministic and in general not smooth. The fact that h is not necessarily smooth was the main reason that the analysis of Sect. 3.2 was necessary in the first place. However, if h happens to be smooth for any ∈ L − , then one can treat the tangent equation directly in the framework of [4], and in this case the regularized and renormalized equation derived in [4] coincides with (55).
The solution u constructed in [4] is given as the reconstruction ofŪ t + P tŨ t (c.f. [4,Prop. 5.22]), whereŨ t is the constant modelled distribution in D ∞,∞ explicitly given in [4, (5.10)]. We now introduce an abstract differentiation operator D : T → T as the derivative of S r at r = 0, so that one has With this notation we can see that the H ξ -Fréchet derivative of the function RP tŨ t in the direction of h ∈ H ξ is given by D h RẐ ξ BPHZ P tŨ t = RẐ ξ,h BPHZ DP tŨ t . The fact that u t is locally H ξ -Fréchet differentiable is now equivalent to showing that RŪ t is locally H ξ -Fréchet differentiable, which at this stage is an application of Theorem 9 to the abstract fixed point problem [4, (5.16)] forŪ . The main step here is to show that the right hand side of this fixed point problem admits C 1 lifts, and since this is largely a technical issue, we postpone the proof to Appendix B below.
It remains to derive the tangent equation. For this consider a regularization ξ ε ∈ M ∞ (L − ) of ξ given by ξ ε := ξ * ρ (ε) , let h ε := h * ρ (ε) , and let v ε h,t := D h u ε t be the local H ξ -Fréchet derivative of the solution u ε t to (5) in the direction of h ∈ H ξ . The fact that one has v h,t = lim ε→0 v ε h,t follows simply from the fact that both sides of this equation are given as the reconstruction of the modelled distribution V (0) + DŨ , where V (0) denotes the solution to the abstract tangent equation (52), for the model Z ξ,h BPHZ andẐ ξ ε ,h ε BPHZ , respectively, and the fact that the latter converges to the former as ε → 0 by Theorem 7. It is now sufficient to show that v ε h,t solves (55). But in the regularized case the map r → ξ ε (ω + rh) is smooth with derivative given by ∂ r | r =0 ξ ε (ω + rh) = h ε . Furthermore, (r , x) → ξ ε (ω + rh)(x) is a smooth function R × D → R L − and since both F and ϒ are smooth, the former by assumption and the latter by construction, c.f. (16), it follows readily from standard Schauder estimates that the map (r , x) → u ε (ω + rh)(x) is smooth as well. This is sufficient to argue that we are allowed to commute the differentiation operators ∂ t and L t with ∂ r in (5), and since by definition one has that v ε h,t = ∂ r | r =0 u ε t (ω + rh), we obtain (55) by a direct computation.
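The direct computation can be sketched in a simplified scalar setting. The display below assumes a single component u, a single noise, and nonlinearities F, G and counter-terms \Upsilon^k that depend only on u and not on its derivatives. The name G for the noise coefficient, the shorthand u^\varepsilon_r := u^\varepsilon(\omega + rh), and the sign in front of the counter-terms are conventions chosen here for illustration; the display is not meant to reproduce (5) or (55) verbatim.

```latex
% Sketch only: scalar toy version; G, u^\varepsilon_r and the sign convention are assumptions.
\begin{align*}
\partial_t u^\varepsilon_r
  &= L u^\varepsilon_r + F(u^\varepsilon_r)
   + G(u^\varepsilon_r)\,\big(\xi^\varepsilon + r\,h^\varepsilon\big)
   + \sum_k c^\varepsilon_k\,\Upsilon^k(u^\varepsilon_r), \\
v^\varepsilon_h &:= \partial_r\big|_{r=0}\, u^\varepsilon_r, \\
\partial_t v^\varepsilon_h
  &= L v^\varepsilon_h + F'(u^\varepsilon)\,v^\varepsilon_h
   + G'(u^\varepsilon)\,v^\varepsilon_h\,\xi^\varepsilon
   + G(u^\varepsilon)\,h^\varepsilon
   + \sum_k c^\varepsilon_k\,(\Upsilon^k)'(u^\varepsilon)\,v^\varepsilon_h .
\end{align*}
```

The term G(u^\varepsilon) h^\varepsilon is the deterministic source mentioned in Remark 14, and the same constants c^\varepsilon_k appear in the linearized equation because they do not depend on r.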

Density of solutions to singular SPDEs
In this section we always fix a time T > 0. We want to derive conditions such that random variables of the form X = u, φ := t∈L + u t , φ t , for some tuple of test functions, conditioned on the event {τ > T } admit a density with respect to Lebesgue measure. By Theorem 1 the random variable X is locally H ξ -Fréchet differentiable with derivative in direction of h ∈ H ξ given by v h , φ , where v h solves (7). By the Bouleau-Hirsch criterion, Corollary 1, we are led to study non-degeneracy of this local H ξ -Fréchet derivative.
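For orientation, the non-degeneracy condition can be written out for the scalar random variable X considered here. The display below is our reading of what has to be checked, not a restatement of Corollary 1: since the derivative of X in direction h is the pairing of v_h with \varphi, non-degeneracy amounts to the existence, almost surely on the event \{\tau > T\}, of a Cameron-Martin direction in which this pairing does not vanish.

```latex
% Sketch only: scalar case, stated informally.
\begin{align*}
D_h X &= \langle v_h, \varphi \rangle
       = \sum_{t \in L_+} \langle v_{h,t}, \varphi_t \rangle, \\
\mathbb{P}\Big( \{\tau > T\} \cap
   \big\{ \langle v_h, \varphi \rangle = 0 \ \text{for all } h \in H_\xi \big\} \Big) &= 0 .
\end{align*}
```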
For simplicity we make in this section the following additional assumption.

Assumption 4
We assume that the following is satisfied.
• The renormalization constants are given by the BPHZ character c ε τ = g ε BPHZ (τ ).
• For any ∈ L − and any t ∈ L + one has that F t = F t (u) and F t = F t (u) depend only on the solution (and not on its derivatives).
• For any noise type t ∈ L − , any τ ∈ T F t,− and any ε > 0 with the property that c ε τ = 0 one has either ϒ τ t = ϒ τ t (u) depends only on the solution (and not on its derivatives), or ϒ τ t (u, ∇u, . . .) = ∂ i u t for some i ∈ {0, . . . , d}. We write τ ∈ T F,• t,− in the first case and τ ∈ T F,i t,− for i ∈ {0, . . . d} in the second case.
The first assumption is merely a convenience and could be easily dropped with a little more algebraic effort later on, compare Remark 20 below. The second and the third assumption greatly simplify the computation; we believe that the statements below are still true without these assumptions, but the proofs given in this paper do not seem to easily generalize to this case. The case ϒ τ t (u, ∇u, . . .) = ∂ i u t ensures we cover the 4 We denote by L * t the dual operator to L t , which is again a differential operator involving only spatial derivatives, and we consider the equation dual to (56), which is a backward random PDE given by on (0, T ) × T d with final condition w ε t,φ (T , ·) = 0. The following lemma is a straightforward computation.
between random variables conditioned on the event {T < τ ε }.

A regularity structure adapted to the dual equation
Our goal is to understand the behaviour of w ε t,φ in the limit ε → 0. To this end we want to interpret w ε t,φ as the reconstructed solution to an abstract fixed point problem, which can be viewed as the dualization of the abstract tangent equation (52). The equation for w can be written in its mild formulation, and it is not hard to see that the Green's function for −∂ t − L * t is given by x → G t (−x) for any t ∈ L + , where G t is the Green's function for ∂ t − L t . It follows that the kernel types present in our regularity structure T ex are not rich enough to encode the dual equation, so that as a first step we are led to build an extension T ex of the regularity structure T ex in which one can consider abstract fixed point problems associated to G t (−·).
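The claim about the Green's function can be checked directly under a simplifying assumption made here only for illustration, namely that L = \sum_k a_k \partial_y^k has constant coefficients, so that L^* = \sum_k (-1)^{|k|} a_k \partial_y^k. In the display, z = (s, y) denotes a space-time point (with s for time, to avoid a clash with the type index t) and \tilde G(z) := G(-z).

```latex
% Sketch only: constant-coefficient L; z = (s,y) and \tilde G are illustrative notation.
\begin{align*}
\partial_s \tilde G(z) &= -(\partial_s G)(-z), \\
L^* \tilde G(z)
  &= \sum_k (-1)^{|k|} a_k\,(-1)^{|k|}\,(\partial_y^k G)(-z)
   = (L G)(-z), \\
(-\partial_s - L^*)\,\tilde G(z)
  &= \big((\partial_s - L)\,G\big)(-z) = \delta_0(-z) = \delta_0(z).
\end{align*}
```

The two sign changes from the reflection and from taking the adjoint cancel in the spatial part, while the single sign change in the time derivative is absorbed by the backward-in-time operator.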

Remark 15
Note that in Sect. 3.1 we considered an extension of the noise-types L − , whereas in this section we will consider an extension of the kernel-types L + . The extension constructed in Sect. 3.1 plays no role in this section, so that from now on we use the symbol T ex for the regularity structure constructed below, and we refer to this structure as the extended regularity structure from now on.
To this end we extend the set of kernel types to a set L + := L + L + , where L + := {t : t ∈ L + } is a disjoint copy of L + , and we let |t | s := |t| s for any t ∈ L + . One should think of t as representing the "dual" integral operator to t. In particular, it will represent the Green's function for a parabolic differential operator going backward in time.
Given the extended set of types L := L + L − we extend reg to a function reg : L → R by setting reg(t ) := θ for any t ∈ L + and some θ > 0 small enough, and we define an extension R of the rule R by allowing any kernel type t ∈ L + to be replaced by t , and additionally allowing an arbitrary number of types L + . To be more precise, we define N 0 := N L + ×{0} and N := N , with := L × N d+1 , and we set for any t ∈ L + , and as usual R(t) := {∅} for t ∈ L − . Here we define qt := t for any t ∈ L, qt := t for any t ∈ L + and where the sum runs over allt ∈ L with qt = t.

Remark 16
The fact that we allow for arbitrary M ∈ N 0 instead of just M = 0 in (59) has the advantage that R satisfies [4,Ass. 3.12] as soon as R satisfies this assumption. This simply ends up ensuring that one can build arbitrary products of U t for any t ∈ L + with reg(t) > 0. As was already remarked below [4,Ass. 3.12], this is not a restriction at all, since any subcritical rule can be trivially extended in such a way that this assumption holds, and we will assume from now that [4,Ass. 3.12] holds for R and thus also for R.

Remark 17
It might appear more natural to set R(t ) = R(t) for any t ∈ L + , in which case t and t would end up being completely interchangeable in the extended regularity structure constructed from R. The present formulation is more restrictive and leads to a smaller regularity structure, but as we shall see, this structure is still rich enough to lift the Eq. (57) for w to an abstract fixed point problem. The reason we choose the present formulation is that the natural sector W in which the solution to this abstract fixed point problem takes values is function-like, compare Lemma 15 below.

Example 7
Continuing Example 1 of the stochastic heat equation, the set R(t) is equal to R(t ) for any t ∈ L + and contains those multisets m ∈ N with the property that m( , 0) ≤ 1 and m(l, k) = 0 for any l ∈ L and k ∈ N d+1 \{0}. In order to continue, we recall from [5,Rem. 5.17] that givenθ > 0 one can assume without loss of generality that the function reg satisfies the bound for any t ∈ L + , where T F t denotes the set of trees τ such that J 0 t τ ∈ T and τ is t-non-vanishing as in Sect. 2.3.

Remark 18
There are some subtleties here, since in [5,Rem. 5.17] this identity was only shown with T F t replaced by the larger set T t . In general (61) might simply not be true, since the rule might be chosen larger than necessary to deal with the singular SPDE at hand. However, this problem can easily be circumvented by assuming without loss of generality that R is given as the completion of the "naive" rule R naive , which is defined in such a way that the set of trees τ ∈ T t that strongly conform to R naive coincides with the set of trees τ ∈ T t that are t-non-vanishing. One can then apply [5,Rem. 5.17] to R naive in order to obtain (61).
We assume from now on that (61) holds for someθ > 0 small enough (to be determined later), and with this convention we have the following lemma.

Lemma 11
Assume that θ > 0 is small enough and that (61) holds. Then the rule R is a subcritical rule with respect to reg. In particular, there exists a subcritical completion of R defined via [5,Prop. 5.21], which we again denote by R, and we can define the extended regularity structure T ex as in [5,Sec. 5.5].
Proof In order to see that R is subcritical, note first that for t ∈ L + one has inf N ∈R(t) Let now t ∈ L + and N ∈ R(t ), and let l ∈ L + and k ∈ N d+1 such that N {(l, k)} ∈ R(t). By (61) we can choose for any kernel-type j ∈ L + a tree τ j ∈ T F j \T such that reg(j) > |τ j | s + |j| s −θ for someθ > 0 small enough. We now consider the tree It follows that τ strongly conforms to R (c.f. [5,Def. 5.8]) and it follows from the definition of T ex in [5,Def. 5.26] that τ ∈ T F t . We also define the subtreeτ of τ by settingτ Thenτ ∈ T F t is a proper subtree of τ with identical root, and trees like this satisfy |τ | s > −|t| s + 2θ for θ > 0 small enough by [4,Ass. 2.13].
For X ∈ {G, K , R} we write X t (z) := X t (−z), so that in particular G t is the Green's function for −∂ t − L * t and the compactly supported kernels K t satisfy [17, Ass. 5.1 Ass. 5.4]. Given the kernel assignment K l for any l ∈ L + we write M ∞ for the set of smooth, reduced, admissible models for T ex and we write M 0 for the closure of this set. For f ∈ ∞ (L − ) and η ∈ M ∞ (L − ) we write Z f andẐ η BPHZ for the canonical lift of f and the BPHZ-renormalized canonical lift of η, respectively, defined as usual via [5,Rem. 6.12] and [5,Thm. 6.17]. We also write D γ for the space of modelled distributions as in Sect. 2.2.5 with T ex replaced by T ex . We will later on need to work with modelled distributions that are only defined on a domain of the form (θ, T ) × T d for some T > θ > 0, and we write D γ,(θ,T ) for the space of functions W : (θ, T ) × T d → T <γ that satisfy (13).

An abstract fixed point problem for the dual equation
Given U ∈ D γ,η V , with γ, η, V as in Proposition 9, and φ ∈ t∈L + C ∞ c (D) we want to consider the abstract fixed point problem in D γ W for some families γ t > 0, for t ∈ L + , and some sector W = t ∈L + W t , given by for any t ∈ L + . The purpose of this section is to find the right sectors and exponents for this fixed point problem to be well posed. In order to unify notation, we define for any t ∈ L + , ∈ L − and u, w ∈ D L + . With this convention F l and F l are well defined for any l ∈ L + and ∈ L − . We will sometimes write F • t := F t to avoid case distinctions.

Remark 19
In [4] the authors were working in a more general setting, in the sense that they allowed for derivatives hitting noises and noises being multiplied together. Additionally, they were considering non-linearities that can depend on the extended decoration o. Our setting can easily be embedded into this more general setting, by definingF viaF 0,0 t := F t andF I ,0 t := F t for any t ∈ L + and ∈ L − , and F l t = 0 otherwise. Whenever we refer to results from [4] in the sequel we will make these identifications implicitly.
In the sequel we will need results of [4] applied to (F l ) l∈L + . In order to do so, we need the following technical Lemma.
The claim concerning F l for any l ∈ L + and ∈ L − follows in the same way.
In analogy with (15), given l ∈ L + we say that a tree τ = (T n,o e , t) ∈ T ex is l-non-vanishing for F if and τ e is t(e)-non-vanishing for any e ∈ K (τ ) with e ↓ = ρ τ . Here we set := t(e) if there exists a (necessarily unique) noise-type edge e ∈ K (τ ) with e ↓ = ρ τ and := • otherwise, and τ e denotes the largest sub-tree of τ with root e ↑ . We define T F l ,T F l and T F l,− in analogy with Sect. 2.3, so that one has T F l := {τ ∈ T : τ is l-non-vanishing for F and J (l,0) [τ ] ∈ T } (note that in particular T F t = T F t for any t ∈ L + ), and the setsT F l and T F l,− consist of those trees τ ∈ T F l such that |τ | s ≤ 0 and |τ | s < 0, respectively. We also set T F l := T F l for any l ∈ L + . With this notation we define for any l ∈ L + the sectors We write similar to above W := l∈L + W l andW := l∈L +W l . We now have the following analogue to [4,Lem. 5.9].

Lemma 13
For any l ∈ L + the spaces W l andW l form sectors in T ex . Moreover, for any U ∈ V and W ∈ W and any ∈ L − one has that are elements ofW t for any t ∈ L + .
Proof This is the content of [4,Lem. 5.9].
In the sequel we need to understand the structure of the sets T F t for t ∈ L + . For this we introduce the following notation. Given a tree τ = (T n,o e , t) ∈ T ex and a node It follows from Lemma 11 that one has D u (τ ) ∈ T ex for any τ ∈ T ex and any u ∈ N (τ ). Given additionally an edge e ∈ K (τ ) with e ↓ = u, then we write D e u (τ ) for the tree obtained from D u (τ ) by removing e from the edge set, and removing furthermore all edgesẽ ∈ E(D u (τ )) and verticesũ ∈ V (D u (τ )) with the property that e lies on the shortest path fromẽ respectivelyũ to the root ρ τ . It is clear that one obtains another decorated, typed tree in this way by simply restricting the corresponding maps to N (D e u (τ )) and E(D e u (τ )), respectively, and since R is a normal rule, one has D e u (τ ) ∈ T ex . We now have the following Lemma.

Lemma 14
Assume that Assumption 4 holds. Then for any t ∈ L + the set T F t agrees with the set of trees D e u (τ ) for τ ∈ T F t , u ∈ N (τ ) and e ∈ K (τ ) such that e ↓ = u.
which does not vanish identically by assumption. In case u = ρ τ one has E =Ẽ, and there exists a unique edgeē ∈ Esuch thatē lies on the unique shortest path from ρ τ to u. It follows thatt u (ē) = t (ē) and t(e) =t u (e) for any e ∈ E\{f }, and using the fact that D (l ,k) F t (u, ∇u, . . . ; w, ∇w, . . .) = D (l,k) F t (u, ∇u, . . .) for any t, l ∈ L + , we obtain which does not vanish identically by assumption. Conversely, let σ = (S n e , t) ∈ T F t . It follows from the fact that F l is linear in (w, ∇w, . . .) for any l ∈ L + and ∈ L − that there exists a (unique) vertex μ ∈ N (σ ) such that t(e) ∈ L + if and only if e lies on the unique shortest path from ρ σ to μ. Let E be the set of edges e ∈ K (σ ) such that e ↓ = μ, and define j ∈ L + by setting j := t(u ↓ ) if u = ρ σ , and j := t otherwise. By definition of the rule R in (60) it follows that there exists (l, k) ∈ L + × N d+1 such that one has 11 [E, (t, e)] {(l, k)} ∈ R(j).
Choose an arbitrary treeτ ∈ T F l and define now the typed, decorated tree (Tñ e , l) by connecting ρ(τ ) to μ via an edgeē such that l(ē) = l andẽ(ē) = k, and whereñ, e and l extend the decorations and type-maps of σ andτ otherwise. It then follows that τ = (Tñ e , ql) ∈ T t and one has σ = Dē μ (τ ). The fact that τ is t-non-vanishing follows by reversing the arguments of the first part of the proof.
A particular consequence of Lemma 14 is that we can give a direct proof of the fact that the sectors W t are function-like. Note that such a statement would also follow directly from the analysis in [4] and the fact that reg(t ) > 0.

Lemma 15
For any t ∈ L + and any τ ∈ T F t one has |τ | s > −(|t| s ∨ s 0 ). In particular, the regularity of the sector W t is larger than −(|t| s ∨ s 0 ), and the sector W t is function-like.
Proof We first note that, as a corollary of the proof of [4, Lem. 5.9], in particular [4, (5.15)], it follows that for any U ∈ D γ,η,T +θ V , any l ∈ L + and any ∈ L − both Qγ t (∂ l F t )(U ) and Qγ t (∂ l F t )(U ) are elements of Dγ t ,(θ,T ) .
Moreover, combining Lemma 15 and Lemma 13, it follows that both Qγ t (∂ l F t )(U ) and Qγ t (∂ l F t )(U ) take values in a sector of regularity bigger than −|t| s . Consequently, using the results of [17, Sec. 6] (see Proposition 8), one has that

Identifying the solution to the dual equation
We fix from now on a regularization ξ ε of ξ , and we write Ẑ ε BPHZ := Ẑ ξ ε BPHZ for any ε > 0. We also write W Ẑ ε BPHZ for the solution of (62) constructed in Corollary 3 for the model Ẑ ε BPHZ with U = Ū + Ũ given as in Sect. B (recall that U ∈ D γ,η V and u = RU is the solution to (1)). As above we denote by g ε BPHZ ∈ G ex − the BPHZ-character of ξ ε (for the extended regularity structure T ex ) and we let M g ε BPHZ := (g ε BPHZ ⊗Id) ex − . Our goal is to link the abstract dual equation (62) to the dual tangent equation (57). In a first step we can use the machinery of [4] to derive an equation for the reconstructed solution RW t to the abstract fixed-point problem (62). This equation will automatically be of the form (57), but it is a priori unclear whether the renormalization constants that one obtains in these two ways coincide (or at least differ by something of order 1 in a suitable sense). This, however, is necessary if we want to take the limit ε → 0 in the model. Thus, in order to continue, we introduce the following assumption, which ensures that the dual renormalization constants are given by what we would naively expect.

Assumption 5
For any t ∈ L + one has the identity (70)

Remark 20
The simplicity of Assumption 5 is the main reason for assuming that c ε τ is given by the BPHZ character. In general, in order to pass to the limit ε → 0 in u ε , one could choose c ε τ = (h • g ε BPHZ )(τ ) where h ∈ G − is an arbitrary fixed group element. Using Proposition 6 and performing the limit ε → 0 in (58) now gives the following corollary.

Corollary 4 Assume that Assumptions 4 and 5 hold, and let
Then one has the identity
Assumption 5 is not straightforward to show in general. However, an important special case in which we can show directly that Assumption 5 holds is the case of a single equation, that is, the case #L + = 1.
Proof With the aid of Lemmas 16 to 19 below, we obtain the following chain of equalities.
Note that the summand in (76) vanishes whenever D u (τ ) ∉ T F t , and otherwise one has D u (τ ) ∈ T F t ⊆ T ex − by Lemma 18, so that g ε BPHZ ( D u (τ )) is well defined.

Existence of densities
Assumption 6 We assume that the smooth functions F t ∈ C ∞ (R L − ) and the solution u have the property that (Σ t∈L + F t (u)w t ) ∈L − ≠ 0 on (0, τ ) × T d for any w ∈ R L + \{0} almost surely.
We now have the following theorem, the proof of which is at this stage a generalization of the proof of [13,Prop. 5.3].
Theorem 10 Under Assumptions 4 to 6, assume additionally that H ξ is such that , then f = 0. Let T > 0 and let φ i , i ≤ n, be a collection of linearly independent test functions φ i ∈ C ∞ c ((0, T ) × D) L − . Then the R n -valued random variable (⟨u, φ 1 ⟩, . . . , ⟨u, φ n ⟩) conditioned on the event {T < τ} admits a density with respect to Lebesgue measure.
Proof Let X := (⟨u t , φ 1 t ⟩, . . . , ⟨u t , φ n t ⟩) t∈L + . By Theorem 1 we know that X is locally H ξ -differentiable, so that by the Bouleau-Hirsch criterion, Corollary 1, we are left to show that D X is almost surely of full rank. Assume first that n = 1. Then by Theorem 1 and Corollary 4 one has for any h ∈ H ξ the identity Using the assumption that H ξ is dense in L 2 (D) L − , it suffices to show that one has Σ t∈L + F t (u)w φ,t ≠ 0, which together with Assumption 6 is equivalent to showing that w φ,t ≠ 0 for at least one t ∈ L + . On the other hand, by assumption there exists t ∈ L + such that φ t ≠ 0, and it follows directly from (62) that W l ≠ 0 for at least one l ∈ L + . It thus suffices to argue that whenever W is a solution to (62) on some time interval (θ, T ) such that the reconstruction RW vanishes on (θ, T ) × T d , then one also has W = 0 on (θ, T ) × T d . Since W takes values in a function-like sector by Lemma 15, one has 0 = RW = ⟨W , 1⟩, and thus by [17, Prop. 3.29] it suffices to show that ⟨W t , τ ⟩ = 0 for any t ∈ L + and any non-polynomial tree τ ∈ T ex \T. Assume this was not the case, and let l ∈ L + and τ̃ ∈ T ex \T be the tree of minimal homogeneity such that ⟨W l , τ̃ ⟩ ≠ 0. It follows in particular from Lemma 15 that D F l (U ) and D F l (U ) take values in a sector of regularity α l > −|l| s . Plugging this into the fixed point equation (62) implies that min{|τ | s : τ ∈ T ex \T and ⟨W l , τ ⟩ ≠ 0} = |τ̃ | s + α l + |l| s > |τ̃ | s , which gives the desired contradiction.
The case n > 1 can readily be reduced to the case n = 1. To see this, assume that the statement holds in the one-dimensional case. Note that the null set outside of which the one-dimensional conclusion holds is fixed (it coincides with the null set outside of which the extended model converges). In particular, this is the same null set for every φ. Fix φ i t ∈ C ∞ c ((0, T ) × D) n×L − and assume now that D X is not almost surely of full rank. This implies in particular that there exist (random) λ i t ∈ R for t ∈ L + and i ≤ n such that λ is not identically zero and which in turn implies that one has ⟨v h , ψ⟩ = 0 where ψ t := Σ i≤n λ i t φ i t ≠ 0, in contradiction to the assumption that the statement holds for n = 1. Note that the coefficients λ are random in this proof; this is where we use the fact that the null set does not depend on the test function.

A Continuity of maps between spaces of modelled distributions
Proposition 8 Let V , V̄ be sectors in T of regularity α, ᾱ respectively. Then one has the following.

B Lift of the abstract fixed point problems coming from singular SPDEs
We show in this section that the right hand sides of the fixed point problems considered in [4] admit C 1 lifts to the extended regularity structure. In order to state our results in a clean way, we introduce some notation from [4]. To begin with, we fix the subspace M 0,1 of M 0 considered in [4, Def. 5.1] with metric given by It is clear that this metric is stronger than ·; · γ,K for any γ > 0 and K ⊆ D compact. From this it follows easily that all statements derived above hold true with M 0 replaced by M 0,1 . Let now Ũ t ∈ D ∞,∞ V t be the constant function defined in [4, (5.10)] and recall from [4, Prop. 5.18] that PŨ := (P t Ũ t ) t∈L + ∈ D ∞ V is strongly Lipschitz continuous. Recall from [4, Prop. 5.22] that under Assumption 1 there exists a solution u to the singular SPDE (1) and it is given as the reconstruction of a modelled distribution U ∈ D γ,η V , which in turn can be written as U = Ũ + Ū with Ũ as above and Ū satisfying the fixed point equation Since Ũ is a constant D ∞,∞ modelled distribution, its reconstruction is trivially locally H ξ -Fréchet differentiable, and it remains to show that the same is true for Ū . This will follow from the general result of Theorem 9, once we show that the right hand side of (79) admits a C 1 lift. The main statement that we will show in this section is thus the following.
First note that in [4,Lem. 5.19] and the discussions below the proof of [4,Lem. 5.19] it was shown that the non-linearity H is strongly locally Lipschitz continuous.
Recall from [4, (3.7)] that F l t are given as compositions with smooth functions f l t so that one has for some f l t ∈ C ∞ (R L + ), where we adopted the notation U = (U t,k ) (t,k)∈L + ×N d+1 where U t,k := D k U t , and we write ⟨U, 1⟩ := (⟨U t,k , 1⟩) (t,k)∈L + ×N d+1 . We then have a natural candidate for the lift H (r ) t of H t which is given by Here, given an extended model Z ∈ M 0,1 we write F l t (U + S r PŨ ) for the composition of the smooth function f l t with U + S r PŨ in the model Z . First note that (46) is a direct consequence of the definitions. We now sketch the proof that H (r ) t is a strongly locally Lipschitz map from D γ,η V into Dγ t ,η t V t . Since the proof is very similar to the one given in [4, Lem. 5.19], we will not go into too much detail. The proof essentially boils down to an application of the results of [17, Sec. 6], which we have summarized in Proposition 8, and the only thing to notice is that the arguments given in the proof of [4, Lem. 5.19] carry over to H where e k denotes the k-th unit vector in R n and V k denotes the k-th component of V ,

Proof
The fact that multiplication, differentiation and integration are C ∞ follows simply from the fact that these operations are continuous and (multi-)linear. We now show Fréchet differentiability of composition with smooth functions, together with (81). For this we write where we write Ū := U , 1, and Ũ := U − Ū . We now define for x ∈ D and α ∈ N d+1 the function g α by setting from which we infer that (82) can be re-written into k,l≤n where F k,l := D e k +e l F. Now using that F k,l : D γ,η → D γ,η is a Lipschitz continuous map, we can estimate the D γ,η norm of this expression by W 2 γ,η W 2 γ,η , which proves the claim.
With this proposition, the proof that H (r )

C The case of a single equation
In the entire section we assume that L + = {t} contains a unique element. Our goal is to derive the identities necessary to show Proposition 7.
The first lemma we are going to prove shows that functional derivatives of the original counter-terms are of the same form as the counter-terms of the dual equation.

Lemma 16
Under Assumption 4, one has for any t ∈ L + and any τ ∈T for any i ≤ n. Moreover, by definition of F t in (63), it follows that D (t i ,0) F t = D (t i ,0) F t , and hence Since i≤n N (τ i ) = N (τ )\{ρ τ }, it remains to note that D ρ τ (τ ) = τ and by definition (63).
Next, we derive a useful identity for the symmetry factors appearing in (73) and (72). In order to state it, we introduce the set D(T Proof Note first that by Lemma 14 the set T F t is included in the set D(T F t ), so that the right hand side of (85) makes sense. Since moreover f vanishes outside of T F t by definition, it follows that we can rewrite the left hand side of (85) into where m(τ ) ∈ N is a symmetry factor given by It remains to show the identity m(τ )S(τ ) = S(qτ ), which we show inductively in the number of kernel type edges of τ . If #K (τ ) = 0 or τ = qτ the identity is trivial, so that we exclude these cases in the sequel. In case that τ = J k t [τ̃ ] is planted and (86) holds for τ̃ , this identity also holds for τ since m(τ ) = m(τ̃ ), S(τ ) = S(τ̃ ) and S(qτ ) = S(qτ̃ ). It remains to treat the case that qτ is of the form qτ = X k n i=1 J k i t i [qτ i ] p i with n ≥ 1, p i ≥ 1 and (t i , k i , τ i ) ≠ (t j , k j , τ j ) for i ≠ j, k ∈ N d+1 and ∈ L − {•}, and such that (86) holds for J k i t i τ i for any i ≤ n.
By assumption there exists u ∈ N (qτ )\{ρ τ } such that τ = D u (qτ ), and we assume without loss of generality that τ is of the form which can always be achieved by simply rearranging the order of the triples (t i , k i , τ i ).
In this case one has On the other hand, we obtain and the proof is finished, noting that one has the identity m(τ ) = p 1 m(τ 1 ).
Identifying the renormalization constants takes a bit more work. We start with some intuition. The trees σ ∈ T F t , which appear in the regularity structure used to solve the dual tangent equation, contain kernel-type edges which correspond to the Green's function for the dual equation, which is given by x → G(−x), where G is the Green's function of the original equation. Reversing a kernel for an edge e in the evaluation of σ is equivalent to reversing the direction of the corresponding edge e itself. Fortunately (since the dual equation is linear and does not feed back into the original equation), all trees that produce non-vanishing counter-terms for the dual equation are such that there exists an increasing (in the tree order) sequence of edges e 1 , . . . , e r for some r ≥ 0 such that e ↓ i+1 = e ↑ i for any i < r (i.e. the edges are connected) and the set {e i : 1 ≤ i ≤ r } contains exactly those edges with dual type. Inverting the direction of these edges is therefore equivalent to moving the root from ρ(σ ) = e ↓ 1 to e ↑ r . We finally change the type of the edges that have been reversed to the original kernel. We create a new tree in this way with identical renormalisation constant to the tree we started with (apart from symmetry factors, see below), but without dual types. This will allow us to match renormalisation constants of the dual equation with renormalisation constants of the tangent equation.
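The statement that reversing a connected chain of edges starting at the root amounts to moving the root can be checked on a small combinatorial example. The sketch below is only a toy under stated assumptions: a rooted tree is encoded as a child-to-parent dictionary, the names `reroot` and `undirected_edges` are hypothetical helpers introduced here, and edge types and decorations (in particular the change of type of the reversed edges) are not modelled.

```python
from typing import Dict, FrozenSet, Hashable, Optional, Set

Node = Hashable
# Hypothetical encoding: each node maps to its parent, the root maps to None.
ParentMap = Dict[Node, Optional[Node]]


def reroot(parent: ParentMap, new_root: Node) -> ParentMap:
    """Move the root to `new_root` by reversing the edges on the root path."""
    if new_root not in parent:
        raise KeyError(f"unknown node: {new_root!r}")

    rerooted = dict(parent)
    previous: Optional[Node] = None
    current: Optional[Node] = new_root
    while current is not None:
        following = parent[current]    # parent in the original orientation
        rerooted[current] = previous   # reverse this edge (new root gets None)
        previous, current = current, following
    return rerooted


def undirected_edges(parent: ParentMap) -> Set[FrozenSet[Node]]:
    """The edge set of the tree, forgetting orientation."""
    return {frozenset((n, p)) for n, p in parent.items() if p is not None}


# Chain r - a - c with an extra leaf b attached to r.
tree = {"r": None, "a": "r", "c": "a", "b": "r"}
moved = reroot(tree, "c")
back = reroot(moved, "r")
```

Only the edges on the path from the old root to the new root change orientation, the underlying vertex and edge sets are unchanged, and moving the root back to its original position recovers the tree one started with.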
As a preparation, we introduce some notation. Given a rooted tree T with vertex set V (T ), edge set E(T ) and root ρ T , we can define for any ν ∈ V (T ) another tree ν (T ) with identical edge and vertex sets, but where we set ρ( ν (T )) := ν. Given additionally a type map t : E(T ) → L and decorations e : E(T ) → N d+1 and n : V (T ) → N d+1 we obtain another typed, decorated tree ( ν (T ) ñ e , t) by simply leaving the maps e, n, and t unchanged. For a tree τ = T n e we also define ˆ u τ by For the proof of Lemma 19, we are thus left to show that for any t ∈ L + and any σ ∈ T F t ,− one has the identity g η,L We first deal with the issue that the image of T ex under ˆ does in general not coincide with T ex , which is due to the fact that if τ is a tree that strongly conforms to the rule R, its image under ˆ might not. We will circumvent this issue by working in the Hopf algebra H 1 defined in [5, (4.10)] for the type set L. Actually, it suffices for us to work in the reduced Hopf algebra H, where H is obtained from H 1 by identifying any trees that only differ by the extended decoration and additionally factoring out any trees τ = (T n e , t) with the property that there exists e ∈ E(T ) such that t(e) ∈ L − and e ↑ is either not a leaf or one has n(e ↑ ) = 0 (or both). Following [5, Rem. 4.16], this leads to the following space.

Definition 10
We denote by H the unital algebra freely generated by typed, rooted, decorated trees τ = (T n e , t) such that τ = • and such that t : E(τ ) → L, n : N (τ ) → N d+1 and e : E(τ ) → N d+1 , and such that e ↑ is a leaf of T for any noise type edge e ∈ L(τ ).
By [5, Prop. 3.30], this space becomes a Hopf algebra when endowed with the co-product Δ 1 defined in [5, Def. 3.3]. We denote this co-product on H simply by Δ.

Definition 11
We define the ideal I + ⊆ H generated by all trees τ ∈ H such that |τ | s > 0, and we define the factor algebra H − := H/I + , with canonical embedding i ex − : H − → H.
It is straightforward to see that I + is a Hopf ideal, so that H − is a factor Hopf algebra. The following Proposition follows exactly as [5, Prop. 6.5]. We now define a subspace H̃ ⊆ H with the property that ˆ is well defined on H̃ and an involutory bijection.

Definition 12
We define H̃ − ⊆ H − (respectively H̃ ⊆ H) as the unital subalgebra generated by all trees τ ∈ H − (respectively τ ∈ H) with the property that there exists a node u ∈ N (τ ) such that for any edge e ∈ E(τ ) one has t(e) ∈ L + if and only if e lies on the unique path from u to the root ρ τ .
It is readily checked from the definition of the co-product Δ and the operation ˆ that H̃ − is closed under Δ, in the sense that Δ : H̃ − → H̃ − ⊗ H̃ − , so that H̃ − is a Hopf algebra, and ˆ : H̃ − → H̃ − is such that ˆ • ˆ = Id.
On H (respectively H − ) we define the character g η,K (respectively g η,K BPHZ ) by setting g η,K τ := E η,K τ (respectively g η,K BPHZ (τ ) = g η,K (Aτ )) for any tree τ and extending this linearly and multiplicatively. Note that T ex − ⊆ H − and T ex − ⊆ H, and on these subspaces this notation is consistent, in the sense that one has g BPHZ • ˆ on H̃ − . We now note that directly from the definition one has the identity g η,L = g η,K • q on H, and since ˆ is an involutory bijection on H̃ − , it follows that it suffices to show We also define the factor algebras K − := H − /J − and K := H/J.
Here we write e i ∈ N d+1 for the i-th unit vector.
With a proof identical to [18,Prop. 2.12], we obtain the following.

Lemma 20
The ideal J − is a Hopf ideal in H − , so that in particular K − is a factor Hopf algebra. Moreover, one has A : J → J − , so that in particular the space K forms a left co-module over K − . Now note that by definition one has τ − qˆ τ ∈ J − for any τ ∈H − , so that it remains to show that g η,K BPHZ vanishes on J − . It follows readily from Lemma 20 and the recursive identity for the twisted antipode that one has A : J − → J. It thus remains to show that g η,K vanishes identically on J. This however follows identically to [18,Property 4].