Uniqueness in law for parabolic SPDEs and infinite-dimensional SDEs

We prove uniqueness in law for a class of parabolic stochastic partial differential equations on an interval driven by a functional A(u) of the temperature u times a space-time white noise. The functional A(u) is Hölder continuous in u of order greater than 1/2. Our method involves looking at an associated system of infinite-dimensional stochastic differential equations, and we obtain a uniqueness result for such systems as well.

where A is an operator mapping C[0, 1] into itself and Ẇ is a space-time white noise. The approach is to first prove uniqueness for the martingale problem for the associated operator, where λ_i = ci² and (a_ij) is a positive definite bounded operator in Toeplitz form.

Introduction
Our goal is to obtain a uniqueness in law result for parabolic stochastic partial differential equations (SPDEs) of the form

∂u/∂t (x, t) = (1/2) ∂²u/∂x² (x, t) + A(u(·, t))(x) Ẇ(x, t), (1.1)

where Ẇ is a space-time white noise on [0, 1] × [0, ∞), suitable boundary conditions are imposed at 0 and 1, and A is an appropriate operator from C[0, 1] to C[0, 1] which is bounded above and away from zero. A common approach to (1.1) (see, e.g., Chapter 3 of Walsh [18]) is to convert it to a Hilbert space-valued stochastic differential equation (SDE) by setting X^j(t) = ⟨u_t, e_j⟩, where {e_j} is a complete orthonormal sequence of eigenfunctions for the Laplacian (with the above boundary conditions) on L²[0, 1] with eigenvalues {−λ_j}, u_t(·) = u(·, t), and ⟨·, ·⟩ is the usual inner product on L²[0, 1]. This converts the SPDE (1.1) to the ℓ²-valued SDE

dX^j(t) = −λ_j X^j(t) dt + Σ_k σ_jk(X_t) dW^k_t, (1.2)

where {W^j} are i.i.d. one-dimensional Brownian motions, σ(x) = a(x)^{1/2}, L_+(ℓ², ℓ²) is the space of positive definite bounded self-adjoint operators on ℓ², and a : ℓ² → L_+(ℓ², ℓ²) is easily defined in terms of A (see (1.3) below).
Equation (1.2) has been studied extensively (see, for example, Chapters 4 and 5 of Kallianpur and Xiong [10] or Chapters I and II of Da Prato and Zabczyk [7]), but, as discussed in the introduction of Zambotti [20], we are still far from any uniqueness theory that would allow us to characterize solutions to (1.1), except of course in the classical Lipschitz setting.
There has been some interesting work on Stroock-Varadhan type uniqueness results for equations such as (1.2). These focus on Schauder estimates, that is, smoothing properties of the resolvent, in the constant coefficient case, which corresponds to infinite-dimensional Ornstein-Uhlenbeck processes, and produce uniqueness under appropriate Hölder continuity conditions on a. For example, Zambotti [20] and Athreya, Bass, Gordina and Perkins [1] consider the above equation, and Cannarsa and Da Prato [6] consider the slightly different setting where there is no restorative drift but (necessarily) a trace class condition on the driving noise. Cannarsa and Da Prato [6] and Zambotti [20] use clever interpolation arguments to derive their Schauder estimates. However, none of the above results appear to allow one to establish uniqueness in equations arising from the SPDE (1.1). In [20] a is assumed to be a small trace class perturbation of a constant operator (see (9) and (10) of that reference), and in [6] the coefficient of the noise is essentially a Hölder continuous trace class perturbation of the identity. If we take e_j(y) = exp(2πijy), j ∈ Z (periodic boundary conditions) and λ_j = 2π²j², then it is not hard to see that in terms of these coordinates the corresponding operator a = (a_jk) associated with the SPDE (1.1) is

a_jk(x) = ∫₀¹ A(u(x))(y)² e^{2πi(j−k)y} dy, j, k ∈ Z, (1.3)

where u = Σ_j x_j e_j. In practice we will in fact work with cosine series and Neumann boundary conditions and avoid complex values; see (9.7) in Section 9 for a more careful derivation. Note that a is a Toeplitz matrix, that is, a_jk depends only on j − k. In particular a_jj(x) = ∫₀¹ A(u(x))(y)² dy, and a(x) will not be a trace class perturbation of a constant operator unless A itself is constant.
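A quick numerical illustration of this Toeplitz structure (with a made-up profile standing in for A(u(x))(y)²; none of the names below come from the paper): a_jk in (1.3) is the (j − k)-th Fourier coefficient of y ↦ A(u(x))(y)², so it depends on (j, k) only through j − k.

```python
import numpy as np

# Hypothetical stand-in for A(u(x))(y)^2: bounded above and away from zero.
# We compute a_jk = \int_0^1 A(u)(y)^2 e^{2 pi i (j-k) y} dy and check that
# it depends only on j - k, i.e. that (a_jk) is Toeplitz.
N = 4096
y = np.arange(N) / N                                # periodic grid on [0, 1)
Au_sq = (1.5 + 0.5 * np.sin(2 * np.pi * y)) ** 2    # made-up A(u)(y)^2

def a_entry(j, k):
    # A mean over a full period is spectrally accurate for trig polynomials.
    return np.mean(Au_sq * np.exp(2 * np.pi * 1j * (j - k) * y))

# Toeplitz: entries agree whenever j - k agrees.
assert abs(a_entry(3, 1) - a_entry(10, 8)) < 1e-12
# Diagonal entries are all equal to the mean of A(u)^2:
assert abs(a_entry(7, 7) - a_entry(0, 0)) < 1e-12
print(a_entry(0, 0).real)   # 1.5^2 + 0.5^2/2 = 2.375
```

The diagonal value 2.375 is the mean of the chosen profile, matching the observation that a_jj(x) = ∫₀¹ A(u(x))(y)² dy does not depend on j.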
In [1] this restriction manifests itself in condition (5.3), which in particular forces the α-Hölder norms |a_ii|_{C^α} to approach zero at a certain rate as i → ∞; this condition evidently fails unless A is constant.
Our main results for infinite-dimensional SDEs (Theorems 2.1 and 9.1 below) will in fact use the Toeplitz form of a (or, more precisely, its near-Toeplitz form for our cosine series) to obtain a uniqueness result under an appropriate Hölder continuity condition on a. See the discussion prior to (3.3) in Section 3 for how the Toeplitz condition is used. As a result, these theorems can be used to prove a uniqueness in law result for the SPDE (1.1) under a certain Hölder continuity condition on A(·) (see Theorems 2.3 and 2.4).
There is a price to be paid for this advance. First, the Hölder continuity of a in the e_k direction must improve as k gets large, that is, for appropriate β > 0,

|a_ij(y + h e_k) − a_ij(y)| ≤ κ_β k^{−β} |h|^α. (1.4)

Secondly, we require α > 1/2. Finally, to handle the off-diagonal terms of a, we assume that for appropriate γ > 0,

|a_ij(x)| ≤ κ_γ (1 + |i − j|)^{−γ}. (1.5)

To handle the SPDE, these conditions on the a_ij translate to assumptions on A. The operator A will have two types of smoothness. The more interesting type is the Hölder continuity of the map u → A(u). In order that (1.4) be satisfied, we require Hölder continuity of the map u → A(u) of order α > 1/2 and with respect to a weak Wasserstein norm involving sufficiently smooth test functions (see (2.10) in Theorem 2.3 and (9.9) in Theorem 2.4). The other type of smoothness is that of A(u)(x) as a function of x. In order that the a_ij satisfy (1.5), we require that A map C[0, 1] into a bounded subset of C^γ for sufficiently large γ.
A consequence of the fact that A must be Hölder continuous with respect to a weak Wasserstein norm is that A(u)(x) cannot be a Hölder continuous function of the point values u(x + x_i, t), i = 1, . . . , n, but it can be a Hölder continuous function of ⟨u, φ_i⟩, i = 1, . . . , n, for sufficiently smooth test functions, as in Corollary 2.6. One can of course argue that all measurements are averages of u, and so on physical grounds this restriction could be reasonable in a number of settings. Although dependence on point values is not a strong feature of our results, it is perhaps of interest to see what can be done in this direction. Let {ψ_ε : ε > 0} be a C^∞ compactly supported even approximate identity, so that ψ_ε ∗ h(x) → h(x) as ε → 0 for any bounded continuous h. Here ∗ is convolution on the line as usual. Let f : R^n → [a, b] (0 < a < b < ∞) be Hölder continuous of index α > 1/2 and x_1, . . . , x_n ∈ [0, 1]. Then a special case of Corollary 2.7 implies uniqueness in law for (1.1) with Neumann boundary conditions if

A(u)(y) = ψ_δ ∗ (f(ψ_ε ∗ u(x_1 + ·), . . . , ψ_ε ∗ u(x_n + ·)))(y), (1.6)

where u(y) is the even 2-periodic extension of u to R. As δ, ε ↓ 0 the above approaches

Ã(u)(y) = f(u(x_1 + y), . . . , u(x_n + y)). (1.7)

Proving uniqueness in (1.1) for A = Ã remains unresolved for any α < 1 unless n = 1 and x_1 = 0. In this case, and for the equation (1.1) on the line, Mytnik and Perkins [14] established pathwise uniqueness, and hence uniqueness in law, for A(u)(y) = f(u(y)) when f is Hölder continuous of index α > 3/4, while Mueller, Mytnik and Perkins [13] showed uniqueness in law may fail in general for α < 3/4. These latter results are infinite-dimensional extensions of the classical pathwise uniqueness results of Yamada and Watanabe [19] and of a classical example of Girsanov (see, e.g., Section V.26 of [16]), respectively.
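To make (1.6) concrete, here is a small discrete sketch with n = 1; everything in it (the grid, the bump ψ, the choice f(v) = 1 + |v|^{3/4}, and u) is a hypothetical stand-in, not the paper's construction. It mollifies u, composes with a Hölder continuous f taking values in [1, 2], mollifies again, and checks that the resulting A(u) stays bounded above and away from zero.

```python
import numpy as np

# Discrete sketch of (1.6) with n = 1 on the circle of circumference 2,
# using the even 2-periodic extension of u.  All choices are illustrative.
N = 1024
h = 2.0 / N
y = h * np.arange(N)                      # grid on [0, 2) ~ the circle

def bump(eps):
    """Discretized even bump of width eps, normalized to integrate to 1."""
    z = np.minimum(y, 2.0 - y)            # distance to 0 on the circle
    w = np.zeros(N)
    m = z < eps
    w[m] = np.exp(-eps**2 / (eps**2 - z[m]**2))
    return w / (w.sum() * h)

def circ_conv(p, q):
    """Circular convolution on the circle, via FFT."""
    return h * np.real(np.fft.ifft(np.fft.fft(p) * np.fft.fft(q)))

x = np.linspace(0.0, 1.0, N // 2, endpoint=False)
u = np.concatenate([np.sin(np.pi * x), np.sin(np.pi * x)[::-1]])  # even extension
f = lambda v: 1.0 + np.abs(v) ** 0.75     # Holder of index 3/4, values in [1, 2]
x1 = 0.25                                 # a sample point x_1

smoothed_u = circ_conv(bump(0.05), u)               # psi_eps * u
shifted = np.roll(smoothed_u, -round(x1 / h))       # evaluated at x1 + .
Au = circ_conv(bump(0.05), f(shifted))              # psi_delta * f(...)
assert 0.99 <= Au.min() and Au.max() <= 2.01        # bounded, away from 0
```

Convolving with the positive, normalized kernel twice is what buys the Wasserstein-type continuity in u that the point-value operator Ã of (1.7) lacks.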
These equations are motivated by branching models with interactions (f(u) = σ(u)u, u ≥ 0), stepping stone models in population genetics (f(u) = u(1 − u), u ∈ [0, 1]), and two-type branching models with annihilation (f(u) = |u|, u ∈ R). Note these examples have degenerate diffusion coefficients and, as in the finite-dimensional case, [14] does not require any non-degeneracy condition on f but is very much confined to the diagonal case in which A(u)(y) depends on u(y). In particular their result certainly cannot deal with A as in (1.6) (and conversely).
Due to the failure of standard perturbation methods to produce a uniqueness result for (1.2) applicable to (1.1), we follow a different and more recent approach to proving well-posedness of martingale problems, used first for jump processes in Bass [2], then for uniformly elliptic finite-dimensional diffusions in Bass and Perkins [5], and recently for a class of degenerate diffusions in Menozzi [12]. Instead of perturbing off a constant coefficient Ornstein-Uhlenbeck operator, the method perturbs off a mixture of such operators. Further details are provided in Section 3.
We have not spent too much effort on trying to minimize the exponents β and γ appearing in (1.4) and (1.5). (The non-uniqueness examples of [13] are for A(u)(x) = |u(x)|^{(3/4)−ǫ} and so do not satisfy our non-degeneracy condition on A.) The main existence and uniqueness results for (1.2) and (1.1) are stated in Section 2. Section 3 contains a more detailed description of our basic method using mixtures of Ornstein-Uhlenbeck densities. Section 4 collects some linear algebra results and elementary inequalities for Gaussian densities; in addition, this section presents Jaffard's theorem and some useful applications of it. The heavy lifting is done in Sections 5 and 6, which give bounds on the mixtures of Ornstein-Uhlenbeck densities and their moments, and on the second order derivatives of these quantities, respectively. Section 7 then proves the main estimate on the smoothing properties of our mixed semigroup. The main uniqueness result for Hilbert space-valued SDEs (Theorem 2.1) is proved in Section 8. Finally, Section 9 proves the slightly more general uniqueness result for SDEs, Theorem 9.1, and uses it to establish the existence and uniqueness results for the SPDE (1.1) (Theorems 2.3 and 2.4), and then some specific applications (Corollaries 2.6 and 2.7).
The proofs of some of the linear algebra results and of the existence of a solution to (2.7) are given in Appendices A and B.
We often use c_1 for constants appearing in statements of results, and c_2, c′_2, c_3, c′_3, etc. for constants appearing in the proofs.
Acknowledgment. M. Neumann acquainted us with the theorem of Jaffard and related work and also provided additional help with some of the linear algebra. We would also like to thank K. Gröchenig and V. Olshevsky for information concerning Jaffard's theorem. Finally, we want to thank an anonymous referee, who did a fine job of reading the paper carefully and making useful suggestions.

Main results
We use D_i f for the partial derivative of f in the i-th coordinate direction and D_ij f for the corresponding second order partial derivatives. We denote the inner product in R^d and the usual inner product in L²[0, 1] by ⟨·, ·⟩; no confusion should result.
Let C²_b(R^k) be the set of twice continuously differentiable functions on R^k which are bounded together with all of their first and second partial derivatives, and define T²_k to be the set of functions f on ℓ² of the form f(x) = f_k(x_1, . . . , x_k) for some f_k ∈ C²_b(R^k); we let T^{2,C}_k be the set of such f where f_k is compactly supported. Let T² = ∪_k T²_k be the class of functions in C²_b(ℓ²) which depend only on finitely many coordinates. We let X_t(ω) = ω(t) denote the coordinate maps on C(R_+, ℓ²).
We are interested in the Ornstein-Uhlenbeck type operator

Lf(x) = (1/2) Σ_{i,j} a_ij(x) D_ij f(x) − Σ_i λ_i x_i D_i f(x). (2.1)

Here {λ_i} is a sequence of positive numbers satisfying

κ_λ^{−1} i² ≤ λ_i ≤ κ_λ i² (2.2)

for all i = 1, 2, . . ., where κ_λ is a fixed finite positive constant. We assume throughout that a is a map from ℓ² to L_+(ℓ², ℓ²) for which there exist 0 < Λ_0 ≤ Λ_1 < ∞ with

Λ_0 |w|² ≤ ⟨a(x)w, w⟩ ≤ Λ_1 |w|² for all x, w ∈ ℓ². (2.3)

Later on we will suppose there exist γ > 1 and a constant κ_γ such that

|a_ij(x)| ≤ κ_γ (1 + |i − j|)^{−γ} (2.4)

for all x ∈ ℓ² and all i, j. We will also suppose there exist α ∈ (1/2, 1], β > 0 and a constant κ_β such that

|a_ij(y + h e_k) − a_ij(y)| ≤ κ_β k^{−β} |h|^α (2.5)

for all i, j, k ≥ 1 and y ∈ ℓ², where e_k is the unit vector in the x_k direction.
Recall that (a_ij) is of Toeplitz form if a_ij depends only on i − j.
We consider C(R_+, ℓ²) together with the right continuous filtration generated by the cylindrical sets. A probability P on C(R_+, ℓ²) satisfies the martingale problem for L starting at v ∈ ℓ² if P(X_0 = v) = 1 and f(X_t) − f(X_0) − ∫₀ᵗ Lf(X_s) ds is a martingale under P for each f ∈ T². Our main theorem on countable systems of SDEs, and the theorem whose proof takes up the bulk of this paper, is Theorem 2.1 below. It is routine to derive the following corollary from Theorem 2.1.
then the a_ij satisfy the assumptions of Theorem 2.1, and the ℓ²-valued continuous solution to the system of SDEs is unique in law.
Uniqueness in law has the usual meaning here: if X̃ is another process with the same initial condition, satisfying the same system of SDEs with {W̃^k} a sequence of independent Brownian motions, then the joint laws of (X, W) and (X̃, W̃) are the same.
We now turn to the stochastic partial differential equation (SPDE) that we are considering:

∂u/∂t (x, t) = (1/2) ∂²u/∂x² (x, t) + A(u(·, t))(x) Ẇ(x, t), ∂u/∂x (0, t) = ∂u/∂x (1, t) = 0. (2.7)

Recall that {e_k} is a complete orthonormal system for L²[0, 1] of eigenfunctions of the Laplacian satisfying appropriate boundary conditions. We specialize our earlier notation and let e_k(x) = √2 cos(kπx) if k ≥ 1, and e_0(x) ≡ 1. Here is our theorem for SPDEs. It is proved in Section 9, along with the remaining results in this section.
Theorem 2.3 Suppose that A satisfies the regularity conditions above, with positive constants κ_1, κ_2 and κ_3 such that (2.10) and (2.12) hold for all u ∈ C[0, 1]. Then for any u_0 ∈ C([0, 1]) there is a solution of (2.7) in the sense of (2.8), and the solution is unique in law.
To give a better idea of what conditions (2.10) and (2.12) entail, we formulate some regularity conditions on A(u) which imply them.
For δ ∈ [0, 1) and k ∈ Z_+, ‖u‖_{C^{k+δ}} has the usual definition:

‖u‖_{C^{k+δ}} = Σ_{i=0}^k sup_x |u^{(i)}(x)| + sup_{x≠y} |u^{(k)}(x) − u^{(k)}(y)| / |x − y|^δ,

with the last term omitted if δ = 0, where u^{(i)} is the i-th derivative of u and we consider the 0-th derivative of u to be just u itself. C^k is the usual space of k times continuously differentiable functions equipped with ‖·‖_{C^k}, and C^{k+δ} = {u ∈ C^k : ‖u‖_{C^{k+δ}} < ∞} with the norm ‖u‖_{C^{k+δ}}.
If f ∈ C([0, 1]), let f̄ be the extension of f to R obtained by first reflecting to define an even function on [−1, 1], and then extending to R as a 2-periodic continuous function; that is, f̄(−x) = f(x) for 0 < x ≤ 1 and f̄(x + 2) = f̄(x) for all x. In order to be able to work with real-valued processes and functions, we introduce the space C^ζ_per, that is, the set of f whose even extension f̄ to the circle of circumference 2 is in C^ζ. A bit of calculus shows that f ∈ C^ζ_per if and only if f ∈ C^ζ([0, 1]) and f^{(k)}(0) = f^{(k)}(1) = 0 for all odd k ≤ ζ. Such f̄ will be even functions, and consequently their Fourier coefficients (considered on the interval [−1, 1]) will be real.
The following theorem is a corollary to Theorem 2.3.

Theorem 2.4 Suppose there exist
and also positive constants κ_1, κ_2 and κ_3 such that conditions (2.13) and (2.14) hold for all u, v continuous on [0, 1],

Then for any u_0 ∈ C([0, 1]) there is a solution of (2.7), and the solution is unique in law.
Note that (2.13) imposes Hölder continuity in a certain Wasserstein metric.
Corollary 2.6 With f and ϕ_1, . . . , ϕ_n as above, let A(u)(x) = f(⟨u, ϕ_1⟩, . . . , ⟨u, ϕ_n⟩). Then a solution to (2.7) exists and is unique in law.
A second class of examples can be built from convolution operators. If f , g are real-valued functions on the line, f * g is the usual convolution of f and g.

then there is a solution to (2.7) and the solution is unique in law.
One can construct a physical model corresponding to Corollaries 2.6 and 2.7. Consider a thin metal rod of unit length with insulated ends, wrapped with a non-homogeneous partially insulating material. Subject the rod to random heating along its length; this is represented by Ẇ_{t,x}. The heat flows along the rod according to (1.1). The partially insulating wrapping corresponds to A(u). If n = 1 and A is a function of a weighted average of the temperatures along the rod, we are in the context of Corollary 2.6. If n = 1 and one can only measure temperatures as an average over a neighborhood of a given point, then Corollary 2.7 might apply.

Overview of proof
In this section we give an overview of our argument. For most of this overview we focus on the stochastic differential equation (1.2) where a is of Toeplitz form, that is, a_ij depends only on i − j. This is where the difficulties lie, and it puts us in the context of Theorem 2.1.
Assume we have a K × K matrix a that is of Toeplitz form; we will require all of our estimates to be independent of K. For z ∈ R^K define the constant coefficient Ornstein-Uhlenbeck operator

M^z f(x) = (1/2) Σ_{i,j} a_ij(z) D_ij f(x) − Σ_i λ_i x_i D_i f(x),

where λ_i satisfies (2.2). Let p^z(t, x, y) be the corresponding transition probability densities and let r^z_θ(x, y) be the resolvent densities. Thus Lf(x) = M^x f(x).
We were unable to get the standard perturbation method to work, and instead we use the method described in [5]. The idea is to suppose there are two solutions P^1 and P^2 to the martingale problem and to let S_Δ g = E^1 ∫₀^∞ e^{−θt} g(X_t) dt − E^2 ∫₀^∞ e^{−θt} g(X_t) dt. If the function f(x) = ∫ r^y_θ(x, y) g(y) dy were in the domain of L when g is C^∞ with compact support, we would have S_Δ((θ − L)f) = 0. Such f need not be in the domain of L, but we can do an approximation to get around that problem.
If we can show the key estimate (3.1) for θ large enough, we would then get |S_Δ g| ≤ (1/2) ‖S_Δ‖ ‖g‖_∞, which implies that the norm of the linear functional S_Δ is zero. It is then standard to obtain the uniqueness of the martingale problem from this. We derive (3.1) from a suitable bound on

∫ |(M^y − M^x) p^y(t, x, y)| dy. (3.2)

Our bound needs to be independent of K, and it turns out the difficulties all occur when t is small.
When calculating D_ij p^y(t, x, y), where the derivatives are with respect to the x variable, we obtain a factor e^{−(λ_i+λ_j)t} (see (6.1)), and thus by (2.2), when summing over i and j, we need only sum from 1 to J ≈ t^{−1/2} instead of from 1 to K. When we estimate (3.2), we get a factor t^{−1} from D_ij p^y(t, x, y) and a factor |y − x|^α ≈ t^{α/2} from the terms a_ij(y) − a_ij(x). If we consider only the main diagonal, we have J terms, but they behave somewhat like sums of independent mean zero random variables, so we get a factor √J ≈ t^{−1/4} from summing over the main diagonal, where i = j ranges from 1 to J. Therefore when α > 1/2, we get a total contribution of order t^{−1+η} for some η > 0, which is integrable near 0. The Toeplitz form of a allows us to factor a_ii(y) − a_ii(x) out of the sum, since it is independent of i, and so we are indeed left with the integral in y of (3.3). Let us point out a number of difficulties. All of our estimates need to be independent of K, and it is not at all clear that ∫ p^y(t, x, y) dy can be bounded independently of K. That it can is Theorem 5.3. We replace the a_ij(y) by a matrix that does not depend on y_K. This introduces an error, but not too bad a one. We can then integrate over y_K and reduce the situation from the case where a is a K × K matrix to the (K − 1) × (K − 1) case. We do an induction and keep track of the errors.
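The diagonal heuristic just described can be recorded in symbols (a back-of-the-envelope computation, with λ_i ≍ i² as in (2.2) and c a generic constant):

```latex
% Effective truncation: the factor e^{-(\lambda_i+\lambda_j)t} kills all but
% J \approx t^{-1/2} modes, since with \lambda_i \asymp i^2,
%   \sum_{i \ge 1} e^{-2 c i^2 t}
%     \le \int_0^\infty e^{-2 c s^2 t}\, ds
%     = \tfrac12 \sqrt{\pi/(2 c t)} \asymp t^{-1/2}.
% Each diagonal term carries t^{-1} (from D_{ii} p^y) and t^{\alpha/2}
% (from a_{ii}(y) - a_{ii}(x)); square-root cancellation over the J terms
% contributes \sqrt{J} \approx t^{-1/4}.  Altogether:
t^{-1} \cdot t^{\alpha/2} \cdot t^{-1/4} \;=\; t^{-1+\eta},
\qquad \eta \;=\; \frac{\alpha}{2} - \frac{1}{4},
% and t^{-1+\eta} is integrable near t = 0 exactly when \eta > 0,
% i.e. when \alpha > 1/2.
```

This is where the standing assumption α > 1/2 enters the argument.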

From (3.3) we need to handle
and here we use Cauchy-Schwarz and get an estimate on the resulting L² quantity. This is done in a manner similar to bounding ∫ p^y(t, x, y) dy, although the calculations are of course more complicated.
We are assuming that a_ij(x) decays at least at the rate (1 + |i − j|)^{−γ} as |i − j| gets large. Thus the diagonals other than the main one can be handled in a similar manner, and γ > 1 then allows us to sum over the diagonals.
A major complication is that D_ij p^y(t, x, y) involves a^{−1}, and we need good off-diagonal decay for a^{−1} as well as for a. An elegant linear algebra theorem of Jaffard gives us the necessary decay, independently of the dimension.
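The following numerical sketch (not from the paper; the test matrix is invented) illustrates the kind of dimension-free statement Jaffard's theorem provides: if a positive definite matrix has off-diagonal decay |a_ij| ≤ (1 + |i − j|)^{−γ} with γ > 1 and eigenvalues between fixed positive constants, then its inverse satisfies |(a^{−1})_ij| ≤ c(1 + |i − j|)^{−γ} with c independent of the dimension K.

```python
import numpy as np

# Invented test matrix: Toeplitz with a_ij = (1 + |i-j|)^(-gamma), gamma = 2.
# Its symbol 1 + 2 * sum_{k>=1} (1+k)^{-2} cos(k*theta) is bounded between
# positive constants, so these matrices are uniformly positive definite in K.
def decay_matrix(K, gamma=2.0):
    d = np.abs(np.subtract.outer(np.arange(K), np.arange(K)))
    return (1.0 + d) ** (-gamma)

worst = []
for K in (50, 100, 200):
    a = decay_matrix(K)
    assert np.linalg.eigvalsh(a).min() > 0.5      # uniform lower ellipticity
    inv = np.linalg.inv(a)
    d = np.abs(np.subtract.outer(np.arange(K), np.arange(K)))
    # sup over the k-th diagonal of |(a^{-1})_ij| * (1 + |i-j|)^gamma:
    worst.append(max(np.abs(inv[d == k]).max() * (1.0 + k) ** 2
                     for k in range(K)))
print(worst)  # roughly constant in K: the decay constant of a^{-1} is dimension-free
```

The point of the experiment is the last list: the rescaled suprema do not grow as K does, which is exactly the K-independence needed above.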
To apply the above, or more precisely its cousin Theorem 9.1, to the SPDE (1.1) with Neumann boundary conditions, we write a solution u(·, t) in terms of a Fourier cosine series with random coefficients. Let e_n(x) = √2 cos(πnx) if n ≥ 1, e_0(x) ≡ 1, and λ_n = n²π²/2, and define X_n(t) = ⟨u(·, t), e_n⟩. Then it is easy to see that X = (X_n) satisfies (1.2) with

a_jk(x) = ∫₀¹ A(u(x))²(y) e_j(y) e_k(y) dy, x ∈ ℓ²(Z_+),

where u(x) = Σ_{n=0}^∞ x_n e_n. We are suppressing some issues in this overview, such as extending the domain of A to L². Although (a_jk) is not of Toeplitz form, it is easy to see that it is a small perturbation of a Toeplitz matrix and satisfies the hypotheses of Theorem 9.1. This result then gives the uniqueness in law of X and hence of u.
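As a sanity check on the near-Toeplitz claim (with a hypothetical smooth profile in place of A(u(x))²): since √2 cos(jπy) · √2 cos(kπy) = cos((j − k)πy) + cos((j + k)πy) for j, k ≥ 1, the matrix satisfies a_jk = c_{|j−k|} + c_{j+k} with c_m = ∫₀¹ A(u(x))²(y) cos(mπy) dy; the c_{|j−k|} part is Toeplitz, and the c_{j+k} part is the small perturbation, since c_m decays rapidly when A(u(x))² is smooth.

```python
import numpy as np

# Hypothetical profile A(u)^2(y) = 2 + cos(pi y) + 0.3 cos(3 pi y), whose
# cosine coefficients c_m = \int_0^1 A(u)^2(y) cos(m pi y) dy are
# c_0 = 2, c_1 = 0.5, c_3 = 0.15, and c_m = 0 otherwise.
c = {0: 2.0, 1: 0.5, 3: 0.15}

y = np.linspace(0.0, 1.0, 200001)
Au_sq = 2.0 + np.cos(np.pi * y) + 0.3 * np.cos(3 * np.pi * y)

def trap(f):
    """Trapezoid rule on [0, 1]."""
    return (f[0] / 2 + f[1:-1].sum() + f[-1] / 2) * (y[1] - y[0])

def e(n):
    return np.sqrt(2.0) * np.cos(n * np.pi * y) if n >= 1 else np.ones_like(y)

for j, k in [(1, 1), (2, 5), (4, 3), (6, 6)]:
    a_jk = trap(Au_sq * e(j) * e(k))                        # entry of (a_jk)
    predicted = c.get(abs(j - k), 0.0) + c.get(j + k, 0.0)  # c_{|j-k|} + c_{j+k}
    assert abs(a_jk - predicted) < 1e-6
# Once j + k exceeds the smoothness scale (here j + k > 3), the c_{j+k}
# perturbation vanishes and the entries are exactly Toeplitz:
# a_{2,5} = a_{4,7} = c_3.
```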

Some linear algebra
Suppose m ≥ 1 is given. Define g_r = rI, where I is the m × m identity matrix, and let E(s) be the diagonal matrix whose (i, i) entry is e^{−λ_i s} for a given sequence of positive reals λ_1 ≤ · · · ≤ λ_m. Given an m × m matrix a, let

a(t) = ∫₀ᵗ E(s) a E(s) ds (4.1)

be the matrix whose (i, j) entry is

a_ij(t) = a_ij (1 − e^{−(λ_i+λ_j)t}) / (λ_i + λ_j).

Note lim_{t→0} a_ij(t)/t = a_ij, and we may view a as a′(0).
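The entries can be checked numerically; the sketch below assumes, as the limit a_ij(t)/t → a_ij suggests, that a(t) = ∫₀ᵗ E(s) a E(s) ds (this is the covariance-type integral of the Ornstein-Uhlenbeck semigroup), and compares a discretization of the integral with the closed form. The particular λ's and a are made up for the illustration.

```python
import numpy as np

# Check: the (i, j) entry of \int_0^t E(s) a E(s) ds equals
#   a_ij * (1 - e^{-(lambda_i + lambda_j) t}) / (lambda_i + lambda_j).
lam = np.array([0.5, 2.0, 4.5])                 # e.g. lambda_i = i^2 / 2
a = np.array([[2.0, 0.5, 0.2],
              [0.5, 2.0, 0.5],
              [0.2, 0.5, 2.0]])                 # made-up symmetric a
t = 0.3

s = np.linspace(0.0, t, 20001)
w = np.full(s.size, s[1] - s[0])                # trapezoid weights on [0, t]
w[0] /= 2
w[-1] /= 2
E = np.exp(-np.outer(lam, s))                   # E[i, m] = e^{-lambda_i s_m}
a_num = a * np.einsum('im,jm,m->ij', E, E, w)   # entrywise a_ij * integral

L = lam[:, None] + lam[None, :]
a_closed = a * (1.0 - np.exp(-L * t)) / L
assert np.allclose(a_num, a_closed, atol=1e-8)
# and a(t)/t -> a as t -> 0:
eps = 1e-6
assert np.allclose(a * (1 - np.exp(-L * eps)) / (L * eps), a, atol=1e-4)
```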
Given a nonsingular matrix a, we use A for a^{−1}. When we write A(t), this will refer to the inverse of a(t). Given a matrix b or g_r, we define B, G_r, b(t), g_r(t), B(t), and G_r(t) analogously. If r = 1 we write G for G_1 and g for g_1.
Let ‖a‖ be the usual operator norm, that is, ‖a‖ = sup{‖aw‖ : ‖w‖ ≤ 1}. If C is an m × m matrix, recall that the determinant of C is the product of the eigenvalues and the spectral radius is bounded by ‖C‖. Hence |det C| ≤ ‖C‖^m. This can be found, for example, in [8, Corollary 7.7.4].
Proof. This is immediate from the Cauchy-Schwarz inequality.

Define ā(t) as in (4.6), and let Ā(t) be the inverse of ā(t). A calculus exercise shows two-sided bounds valid for all positive λ and t, and hence for all t > 0; for the proof see Appendix A.

and for all w, w′. For the proof see Appendix A.

Lemma 4.6 Let a and b be positive definite matrices with
For the proof see Appendix A.
Let us introduce the notation

Q_m(w, C) = (2π)^{−m/2} (det C)^{1/2} exp(−⟨Cw, w⟩/2),

where C is a positive definite m × m matrix and w ∈ R^m.

Proposition 4.7 Assume a, b are as in Lemma 4.6. For the proof see Appendix A.
For the proof see Appendix A.
There is a constant c_2, depending only on c_1, γ, Λ_0 and Λ_1, but not on K, such that

|A_ij| ≤ c_2 (1 + |i − j|)^{−γ} for all i, j.

The dependence of c_2 on the given parameters is implicit in the proof in [9].
We now suppose that a is a positive definite K × K matrix such that for some positive Λ_0, Λ_1,

Λ_0 |w|² ≤ ⟨aw, w⟩ ≤ Λ_1 |w|² for all w ∈ R^K.

We suppose also that (2.4) holds. Our estimates and constants in this section may depend on the Λ_i and κ_γ, but will be independent of K, as is the case in Proposition 4.9.
Recall a(t) and ā(t) are defined in (4.1) and (4.6), respectively, and A(t) and Ā(t), respectively, are their inverses.
The proposition we will use in the later parts of the paper is the following.
For the left-hand inequality, by (4.16) and the lower bound in (4.8), it suffices to bound ‖a(t)‖ from above, and this is immediate from the uniform upper bound on a(t) in Lemma 4.4.

A Gaussian-like measure
Let us suppose K is a fixed positive integer, 0 < Λ_0 ≤ Λ_1 < ∞, and that we have a K × K symmetric matrix-valued function a on R^K whose eigenvalues all lie in [Λ_0, Λ_1]. It will be important that all of our bounds and estimates in this section do not depend on K. We will assume 0 < λ_1 ≤ λ_2 ≤ · · · ≤ λ_K satisfy (2.2). As usual, A(x) denotes the inverse of a(x); we define a(x, t) by applying (4.1) to a(x), and then A(x, t) to be the inverse of a(x, t). Let ā(x, t) and Ā(x, t) be defined as in (4.6) and (4.7), respectively. When x ∈ R^K is fixed we set π_{j,x}(y) = (y_1, . . . , y_j, x_{j+1}, . . . , x_K), and write π_j for π_{j,x} if there is no ambiguity. From (4.12) we obtain the corresponding comparison for A(y, t). The dependence of A on y but not x is not a misprint; y ↦ Q_K(y − x′, A(y, t)) will not be a probability density. It is, however, readily seen to be integrable; we show more below.
The choice of K in the next result is designed to implement a key induction argument later in this section.
Lemma 5.1 Assume K = m + 1 and a(y) = a(π_m(y)) for all y ∈ R^K, that is, a(y) does not depend on y_{m+1}. Let b(y) be the m × m matrix with b_ij(y) = a_ij(y) for i, j ≤ m, and let B(y) be the inverse of b(y). Then for all x, (a) we have det A(y) = A_{m+1,m+1}(y) det B(y), and (b) the ratio of the corresponding Gaussian densities, as a function of y_{m+1}, equals the density of a normal random variable with mean μ(y_1, . . . , y_m) and variance σ²(y_1, . . . , y_m) = (A_{m+1,m+1}(y))^{−1}.
Proof. Lemma 4.8 and some algebra show that det C(y) = det A(y) > 0, and it follows that det A(y) = det C(y) = A_{m+1,m+1}(y) det B(y).

Let B_0 = 8 log(Λ_1/Λ_0) + 4 log 2, and for B > 0 let S_{B,K} be as in (5.4). Recalling that w = y − x′, we will often use the further change of variables w′ = G(t)^{1/2} w. Note that when integrating Q_K(w′, A(y, t)) with respect to w′, y is an implicit function of w′.
Let Z_i be i.i.d. mean zero normal random variables with variance 1. The right-hand side is then the same as the corresponding Gaussian expectation, and since E e^{|Z_1|²/4} = √2, our choice of B shows that the above is at most the required bound.

For m ≤ K we let a_m(y, t), respectively ā_m(y, t), be the m × m matrix whose (i, j) entry is a_ij(π_{m,x′}(y), t), respectively ā_ij(π_{m,x′}(y), t). We use A_m(y, t) and Ā_m(y, t) to denote their respective inverses.
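The constant √2 here is the Gaussian exponential moment E e^{tZ²} = (1 − 2t)^{−1/2} evaluated at t = 1/4; a quick quadrature check (illustration only):

```python
import numpy as np

# E[e^{Z^2/4}] for Z ~ N(0, 1) equals (1 - 2t)^{-1/2} at t = 1/4, i.e. sqrt(2).
# The integrand e^{z^2/4} * e^{-z^2/2} / sqrt(2 pi) = e^{-z^2/4} / sqrt(2 pi)
# decays fast, so a truncated trapezoid rule is essentially exact.
z = np.linspace(-30.0, 30.0, 600001)
dz = z[1] - z[0]
vals = np.exp(-z**2 / 4) / np.sqrt(2 * np.pi)
integral = (vals.sum() - vals[0] / 2 - vals[-1] / 2) * dz
assert abs(integral - np.sqrt(2)) < 1e-9
```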
The main theorem of this section is the following.
Then there exists a constant c_1, depending on α, β, κ_β, p, Λ_0, and Λ_1 but not on K, such that for all t > 0 and x ∈ R^K the bounds in (a) and (b) below hold.

Remark 5.4 This is one of the more important theorems of the paper. In the proof of (a) we will define a geometrically decreasing sequence K_0, . . . , K_N with K_0 = K and K_N = j, and let C_m be the expression on the right-hand side of (a) but with K_m in place of K and A_{K_m} in place of A. We will bound C_m inductively in terms of C_{m+1} by using Lemma 5.2 and Proposition 4.7. This will give (a) and reduce (b) to boundedness in the case K = 1, which is easy to check.
Proof. [Proof of Theorem 5.3] All constants in this argument may depend on α, β, κ_β, Λ_0, Λ_1, and p. Let K_0, K_1, . . . , K_N be a decreasing sequence of positive integers such that K_0 = K, K_N = j, and 5/4 ≤ K_m/K_{m+1} ≤ 4 for each m. Our plan is to bound C_m inductively over m. Write C_m as a sum of two terms I_1 and I_2. Assume m < N. We can bound I_1 using Lemma 5.2. Turning to I_2, we see that by our hypothesis on a it satisfies a Hölder-type bound; in the last line we use Hölder's inequality and the bound furnished by (2.2). We also used the geometric decay of the {K_m}.
We now apply Proposition 4.7, for w′ ∈ S_{B_0,K_m}, with a = a_{K_m}(y, t) and b = a_{K_m}(π_{K_{m+1}}(y), t). In view of (4.13) and (5.11), we may take δ accordingly. Proposition 4.7 then yields a comparison of the two densities for w′ ∈ S_{B_0,K_m}, and therefore a corresponding bound. Recall m + 1 ≤ N, so that j ≤ K_{m+1}. Integrate over w′_{K_m} using Lemma 5.1, then over w′_{K_m−1} using Lemma 5.1 again, and continue until we have integrated over w′_{K_{m+1}+1}; this, together with (5.9), shows that (5.8) implies an inductive bound on C_m in terms of C_{m+1} for 0 ≤ m < N. This and a simple induction imply (a).

For (b), we may apply (a) with p = 0 and j = 1. Recall from Lemma 4.5 that the scalar Ā_1(y, t) satisfies (Λ_1)^{−1} ≤ |Ā_1(y, t)| ≤ (Λ_0)^{−1}, and so the above integral is bounded. The first bound in (b) follows from this and (5.17). Using the change of variables w′ = G(t)^{1/2} w, we see that the second integral in (b) equals the first.

Proposition 5.5 Under the hypotheses of Theorem 5.3,
as t → 0, uniformly in K and x.
Proof. We will use the notation of the proof of Theorem 5.3 with j = 1, p = 0, and t < 1. Using the change of variables w′ = G(t)^{1/2}(y − x′), it suffices to prove the corresponding statement for the transformed integral. We define a decreasing sequence K_0, . . . , K_N as in the proof of Theorem 5.3 with K_0 = K and K_N = 1, we let R > 0 be a real number to be chosen later, and we write the quantity to be estimated as in (5.18). We will bound each term on the right-hand side of (5.18) appropriately, and that will complete the proof.
Using (5.13), and with S_{B,K} defined by (5.4), we split the integral accordingly. Following the argument in the proof of Theorem 5.3 with this value of δ gives the desired bound; we used the uniform boundedness of C_{m+1} from Theorem 5.3 for the last inequality.
A very similar argument shows the analogous bound, where c_5 depends on R. For example, in bounding the analog of J_2(t) we may now take δ = c_6 R^α t^{α/2}, by adjusting the argument leading up to (5.11).
Here c_8 depends on R but c_7 does not. For the second inequality, recall that 3 − β − α + η < 0 and that the K_m were chosen in the proof of Theorem 5.3 so that 5/4 ≤ K_m/K_{m+1} ≤ 4. Given ε > 0, choose R large so that c_7 e^{−c_7 R} < ε, and then take t small enough so that c_8(t^{α/2} + t^{η/2}) < ε.
Proof. Bound the above integral by the sum of the contributions from S^c_{B_0,K} and S_{B_0,K}. The first term is at most c_p K^p e^{−B_0 K/16} by Lemma 5.2. The integral in the second term is at most c_1 by Theorem 5.3(b). The result follows.
Proof. Bound the above sum by two terms. The first is at most c_3 Σ_{n=1}^N n^{r−γ}, and the second is at most c_4 k^r. The result follows.
For p ≥ 1/2 we start with a rather crude bound. We write Aw′ for A(y, t)w′.
Lemma 5.8 There exists c_1 such that for all 1 ≤ k ≤ j ≤ K the stated bound holds.

Proof. By (2.4) and Lemma 4.11 we have a pointwise estimate, and we can use Corollary 5.6 with K = j to bound |w′_m|^{2p}. The bound follows.

Lemma 5.9
Assume there exists c_1 > 0 such that (5.19) holds for all j ≥ k ≥ ((j/2) ∨ 2) and t > 0. Then there is a constant c_2 so that (5.20) holds for all 1 ≤ j ≤ K and all t > 0.

Proof. If z = A(y, t)w′, then by Lemma 4.10 we have a pointwise bound on the entries of z. Use Lemma 5.8 to bound |z_k|^{2p} for k ≤ (j/2) ∨ 1 and (5.19) to bound it for k > (j/2) ∨ 1. This leads to the claimed estimate, where γ > 3/2 is used in the last line. This gives (5.20).

Lemma 5.10
There exists c_1 such that for all K ≥ j ≥ k ≥ j/2 > 0 the stated bound holds.

Proof. As usual, w′ = G(t)^{1/2}(y − x′). If j, k are as above, then by (2.5) and (5.10), and again by (5.10) and k ≥ 2, we obtain the preliminary estimates. So for w′ ∈ S_{B_0,j} we can use k ≥ j/2 to conclude the first inequality in (5.23), and therefore, using k ≥ j/2 again, the second. For w′ ∈ S_{B_0,j} we may therefore apply Proposition 4.7 with the indicated choices of a, b and δ. It follows from Proposition 4.7 and the first inequality in (5.23) that the two densities are comparable, and so (the constants below may depend on p) we may use (5.24) and (5.26) to bound the required integral by a sum of terms. The first term is at most c_p j^p e^{−B_0 j/16} by Lemma 5.2, and the last term is bounded by c_9 k^{2−α/2+p−β}, thanks to Theorem 5.3. Adding the above bounds gives the required result, because β ≥ 2 − α/2 + p.

Proof.
Consider (5.27). First assume p ≥ 1/2. As β > 3 − α, Theorem 5.3(a) allows us to assume K = j. Lemma 5.9 reduces the proof to establishing (5.19) of Lemma 5.9 for j and k as in that result, so assume j ≥ k ≥ (j/2) ∨ 2. Lemmas 5.10 and 5.11 reduce matters to evaluating the integral I. To evaluate I, change the indices in Lemma 5.1, with b̃ playing the role of b: holding the coordinates w̃ = (w′_i)_{i≠k} fixed, if ỹ = (y_i)_{i≠k} and B̃(ỹ, t) is the inverse of (b̃_mn(y, t))_{m≠k, n≠k}, then Q_j(w′, B(y, t))/Q_{j−1}(w̃, B̃(ỹ, t)), as a function of w′_k, is the density of a normal random variable with mean depending only on w̃ and variance σ² = B_kk(y, t)^{−1}. So if we integrate over w′_k, Lemma 5.1 applies, and finally we use Theorem 5.3(b) to bound the resulting integral by c′_p. Put this bound into (5.29) to complete the proof of (5.27) when p ≥ 1/2. For p < 1/2, use Cauchy-Schwarz with respect to the measure Q_j(w′, A(y, t)) dw′ and apply the above and Theorem 5.3(b).
The change of variables w′ = G(t)^{1/2} w identifies the two integrals, and (4.8) then lets us compare A(y, t) with Ā(y, t). This and (5.27) now give (5.28).

A second derivative estimate
We assume 0 < λ_1 ≤ λ_2 ≤ · · · ≤ λ_K satisfy (2.2) for all i ≤ K. Our goal in this section is to bound the second derivatives uniformly in K. Here a(y, t) and A(y, t) = a(y, t)^{−1} are as in Section 5, and we assume (2.5) for appropriate β and (2.4) for γ > 3/2 throughout. The precise conditions on β will be specified in each of the results below. The notations Ā_m, A_m, Ā from Section 5 are also used.
A routine calculation shows that for j, k ≤ K, the second derivative D_jk p^y(t, x, y) has the form given in (6.1), where w = y − x′ and, for a K × K matrix A, S_{j,k}(w, A) denotes the associated quadratic expression. We use the same notation if A is an m × m matrix for m ≤ K, but then our sums run up to m instead of K.
We will need a bound on the L² norm of a sum of second derivatives. The usual change of variables w′ = G(t)^{1/2}(y − x′) will reduce this to bounds on Gaussian-type integrals. These bounds will be derived by induction as in Theorem 5.3, and so we introduce, for m ≤ K, the integrals I^m_{jkℓ} below. As the argument is more involved than the one in the proof of Theorem 5.3, to simplify things we will do our induction from m to m − 1 rather than using geometric blocks of variables. This leads to a slightly stronger condition on β in Proposition 6.6 below than would otherwise be needed.
If A is an m × m matrix, we set A_ij = 0 if i or j is greater than m. This means, for example, that S_{j,k}(w, A) = 0 if j ∨ k > m. In what follows x is always fixed, all bounds are uniform in x, and when integrating over w′_j we will be integrating over y_j = y_j(w′_j) as well. Since w′ = G(t)^{1/2} w, we have the corresponding comparison from (5.11).

Lemma 6.1 There exists c_1 such that for all m, j, k > 0 and ℓ ≥ 0 satisfying (j ∨ k) + ℓ ≤ m ≤ K and m ≥ 2, the stated bound holds.

Proof.
Let j, k, ℓ and m be as above. The pointwise bound on Ā_m in Lemma 4.11 gives a first estimate. The triangle inequality gives another, and (4.11), together with the calculation in (6.6), implies a bound as in (6.6) above. Now use (6.6), (6.7), and (6.8) in (6.5), and then appeal to (6.4), to conclude the desired estimate. There are several integrals to bound, but the one giving the largest contribution, and requiring the strongest condition on β, is handled as follows. Apply Hölder's inequality for triples with p = (1 + α)/(1 − ε), q = (1 + α)/(α(1 − ε)) and r = ε^{−1}; here we use Corollary 5.6, Theorem 5.12, and the fact that β > 5/2 ensures the hypotheses of this last result are satisfied for ε small enough. The other integrals on the right-hand side of (6.9) lead to smaller bounds, and so the left-hand side of (6.9) is at most c_9 m^{5/2−β−α}. A similar bound applies with the roles of j and k reversed, and so the required result is proved.
Lemma 6.2 Assume β > 2 − (α/2). There exists c_1 such that for all j, k, ℓ, m as in Lemma 6.1 and satisfying 2 ≤ m, Proof. Recall that B_0 is as in Lemma 5.2. Use (6.4) on S^c_{B_0,m} and (6.3) on S_{B_0,m} to bound the above integrand by By Lemma 5. We bound I_2(t) as in the proof of Theorem 5.3 but with m in place of K_m. This requires some minor changes. Now for w′ ∈ S_{B_0,m} the δ coming from (5.11) is less than or equal to So for w′ ∈ S_{B_0,m}, applying Proposition 4.7 as before, we get and therefore where Theorem 5.12 and Cauchy–Schwarz are used in the last line. The lower bound on β shows the hypotheses of Theorem 5.12 are satisfied. Combining the bounds on I_1(t) and I_2(t) completes the proof.
Note that if Z is a standard normal random variable, then E [(Z 2 −1) 2 ] = 2.
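This identity follows from the Gaussian moment formula E[Z^{2m}] = (2m − 1)!!; a quick illustrative check (not part of the argument):

```python
import math

def std_normal_moment(k):
    """E[Z^k] for Z ~ N(0,1): zero for odd k, double factorial (k-1)!! for even k."""
    if k % 2 == 1:
        return 0
    return math.prod(range(k - 1, 0, -2)) if k >= 2 else 1

# Expand E[(Z^2 - 1)^2] = E[Z^4] - 2 E[Z^2] + 1 = 3 - 2 + 1 = 2
value = std_normal_moment(4) - 2 * std_normal_moment(2) + 1
print(value)  # prints 2
```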
Lemma 6.3 ∫ Q_{m−1}(r_{m−1}w′, A^{m−1}(y, t)) dw′_m = S_{j,j+ℓ} S_{k,k+ℓ}(w′, A^{m−1}(y, t)) 1_{((j∨k)+ℓ≤m−1)} (6.10) Proof. We apply Lemma 5.1 with m in place of m + 1 and a^m(π_{m−1}(y), t) playing the role of a(y). Then under Q_{m−1}(r_{m−1}w′, A^{m−1}(y, t)), (6.11) w′_m has a normal distribution with mean and so for j, k, ℓ, m as in the lemma. Rearranging terms, we see that the above equals When we multiply each off-diagonal term by G_m(y, t) and integrate over w′_m, we get zero. This is because the conditional normal distribution of w′_m under G_m(y, t) implies that each of Now integrate the remaining terms on the right hand side of (6.12) with to obtain the desired expression. In particular note that We treat V_2 and V_3 in (6.10) as error terms and so introduce and E(j, k, ℓ, m) = E_1(j, k, ℓ, m) + E_2(j, k, ℓ, m).
We are ready for our inductive bounds on the integral I^m_{jkℓ}, defined at the beginning of this section. Proposition 6.4 Assume β > 7/2 − α. There exists c_1 such that for all integers j, k, ℓ such that 1 ≤ j ≤ k ≤ k + ℓ ≤ K, E(j, k, ℓ, m).
Proof. If K ≥ m ≥ 2 ∨ (k + ℓ), we can combine Lemmas 6.1, 6.2 and 6.3 to bound I^m_{jkℓ} in terms of I^{m−1}_{jkℓ} and E(j, k, ℓ, m). Therefore by induction The first term in the above is I^1_{110} 1_{(k+ℓ=1)}. For m = 1, A^1(y, t) is a scalar and an argument similar to that in (b) of Theorem 5.3 shows that I^1_{110} = ∫ S_{1,1}(w′, A^1(y, t))^2 Q_1(w′, A^1(y, t)) dw′ (6.14) Use (6.14) to bound the first term in (6.13) and then bound the second terms in the obvious manner to complete the proof.
To use the above bound we of course will have to control the E(j, k, ℓ, m)'s.
Lemma 6.5 There exists a c_1 such that for all 0 ≤ ℓ ≤ K, Proof. We consider E_1(j, k, ℓ, m). There is a product giving rise to four terms, all of which are handled in a similar way. We consider only as this is the worst term. Use the upper bound on A^m_{ij} and the lower bound on A^m_{ii} from Lemma 4.11 to see that An application of Cauchy–Schwarz and Theorem 5.12 shows that for our value of β the last integral is bounded by c_3. This leads to Now sum over j, m, and k in that order to see that The other terms making up E_1(j, k, ℓ, m) are bounded in a similar manner.
The integral we have to bound now becomes Now use the upper bound on H i , Lemma 6.5, Proposition 6.4 for j ≤ k, and symmetry in (j, k) to bound the above by where Lemma 6.5 and the condition on β are used in the last line.
We need a separate (and much simpler) bound to handle the absolute values of D jk Q K (y − x ′ , A(y, t)) for j ∨ k ≥ J ζ (t).
Lemma 6.7 There exists c_1 such that for all i, j, k ≤ K and p ≥ 0, Proof. By (6.3) the above integral is at most Now apply Theorem 5.12 and Cauchy–Schwarz to obtain the required bound.
The proof of the following is left to the reader.
Proof. As in the proof of Proposition 6.6, if H_i(t) = e^{−λ_i t} G_{ii}(t)^{1/2}, then the substitution w′ = G(t)^{1/2} w leads to the last by Lemma 6.7.
A bit of calculus shows that Proposition 6.10 Assume β > 3 − (α/2). There are constants ζ_0 and c_1 such that if ζ ≥ ζ_0 and J = J_ζ(t), then Proof. Using Proposition 6.9, the sum is at most Lemma 6.8 is used in the last line, and (2.2) and j ∨ k > J ≥ 1 are used in the next-to-last line. The above bound is at most Now take ζ_0 = 2/c_4 to complete the proof.
and let A^K(x, t) be the inverse of a^K(x, t). We will apply the results of Sections 5 and 6 to these K × K matrices. We will sometimes write x^K for (x_1, . . . , x_K), and when convenient will identify π_K(x) with x^K. It will be convenient now to work with the notation N_K(t, x, y) = N_K(t, π_K(x), π_K(y)), x, y ∈ ℓ^2. (7.2) As before D_{ij} N_K(t, x, y) denotes second order partial derivatives in the x variable.
Our goal in this section is to prove the following: Theorem 7.1 Assume (a_{ij}(y)) satisfies (2.5) and (2.4) for all i, j, k ∈ N, for some α ∈ (1/2, 1], β > 9/2 − α, and γ > 2α/(2α − 1). Then there is a c_1 > 0 and η_1 = η_1(α, γ) > 0 so that for all x ∈ ℓ^2, K ∈ N, and t > 0, Proof. Note first that by (7.2) D_{ij} N_K = 0 if i ∨ j > K and so by the symmetry of a(x) and the Toeplitz form of a, the integral we need to bound is I ≡ where ζ is as in Proposition 6.10. If j > J or ℓ ≥ J then clearly i = j + ℓ > J, so that Note that |a^K_{ij}(z)| = |⟨a^K(z)e_i, e_j⟩| ≤ |⟨a^K(z)e_i, e_i⟩|^{1/2} |⟨a^K(z)e_j, e_j⟩|^{1/2} ≤ Λ_1 (7.5) by Cauchy–Schwarz. Then Proposition 6.10 implies that Recalling that x′_k = e^{−λ_k t} x_k, we can write (7.7) and set d_{α,β}(x, y) = Σ_n |x_n − y_n|^α n^{−β}. (7.9) By (2.5) and (2.4), (7.10). Therefore by (6.1) In the last line we used Proposition 6.6 on the second factor and the Cauchy–Schwarz inequality on the sum in the first factor and then Theorem 5.3(b) to bound the total mass in this factor. Next use Theorem 5.12 with p = α to conclude that It now follows from (7.11) and the choice of J that I_{1,1} is at most By splitting the above sum up at ℓ = ⌊t^{−α/(2γ)}⌋ we see that Using this in (7.12), we may bound I_{1,1} by for some η = η(α, γ) > 0 because γ > 2α/(2α − 1). Turning to I_{1,2}, note that where (2.2) and β − 2α > 1 are used in the last line. Therefore (7.10) now gives As in (7.11) we now get (again using Proposition 6.6) Now use (7.13) with x_∞^α t^α in place of t^{α/2} to conclude that for some η = η(α, γ) > 0 because γ > 2α/(2α − 1) > 4α/(4α − 1). Finally use the above bound on I_{1,2} and the bound on I_{1,1} in (7.14) to bound I_1 by the right-hand side of (7.3). Combining this with the bound on I_2 in (7.6) completes the proof.
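The bound (7.5) is an instance of the generalized Cauchy–Schwarz inequality for nonnegative definite matrices, with Λ_1 the uniform bound on the diagonal entries as in the proof above; a one-line sketch:

```latex
% Sketch: for a nonnegative definite matrix a^K(z), the bilinear form
% (u, v) \mapsto \langle a^K(z) u, v \rangle is a semi-inner product,
% so the Cauchy--Schwarz inequality gives
|\langle a^K(z) e_i, e_j \rangle|
  \le \langle a^K(z) e_i, e_i \rangle^{1/2}
      \langle a^K(z) e_j, e_j \rangle^{1/2}
  \le \Lambda_1 .
```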
For R > 0 let p_R : R → R be given by p_R(x) = (x ∧ R) ∨ (−R) and define a truncation operator τ_R : ℓ^2 → ℓ^2 coordinatewise by (τ_R(x))_n = p_R(x_n), setting a_R(x) = a(τ_R(x)). Clearly a_R(x) = a(x) whenever x_∞ ≡ sup_n |x_n| ≤ R. We write a^{K,R} for the K × K matrix (a_R)^K.
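As an illustration, the truncation can be sketched as follows; the names `p_R` and `tau_R` mirror the notation above, and the coordinatewise action of τ_R is our reading of the definition:

```python
def p_R(x, R):
    """Truncate a real number to [-R, R]: (x ∧ R) ∨ (-R)."""
    return max(min(x, R), -R)

def tau_R(x, R):
    """Coordinatewise truncation of a sequence; tau_R(x) = x when sup_n |x_n| <= R."""
    return [p_R(xn, R) for xn in x]

# p_R is a contraction (|p_R(x) - p_R(y)| <= |x - y|) and the identity on [-R, R]
print(p_R(0.5, 1.0), p_R(3.0, 1.0), p_R(-3.0, 1.0))  # prints 0.5 1.0 -1.0
```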
Proof. Assume without loss of generality that x > 0 and set Finally if x′ < R < x, then

Proof. This is elementary and so we only consider (2.5). For this note that as required.
Corollary 7.4 Assume the hypotheses of Theorem 7.1. Then for all x ∈ ℓ 2 , K ∈ N and R, t > 0, Proof. We use the notation in the proof of Theorem 7.1. By Lemma 7.3 and the proof of Theorem 7.1 it suffices to show that we have instead of (7.18). We have by Lemma 7.2 The fact that β − 2α > 1 is used in the last line. Now use (7.22) in place of (7.15) and argue exactly as in the proof of (7.18) to derive (7.21) and so complete the proof.

Uniqueness
In this section we prove Theorem 2.1. Recall the definitions of T 2 k and T 2,C k and the definition of the martingale problem for the operator L from Section 2. Throughout this section we assume the hypotheses of Theorem 2.1 are in force.

Proof.
We need only consider the second inequality, by (4.4). Our hypotheses (2.5) and (2.4) imply The second inequality follows from min(r, s) ≤ r^{1/2} s^{1/2} for r, s ≥ 0. We have γ > 2 and 2β > 2 − α by (2.5), and so

Proof. This is well known and follows, for example, from the continuity of a given by Lemma 8.1 and Theorem 4.2 of [1].
We turn to uniqueness. Let L R (x) be defined in terms of a R analogously to how L is defined in terms of a. We fix R > 0 and for K ∈ N define Note that if f ∈ T 2 k and K ≥ k, then where m is Lebesgue measure on R and δ z is point mass at z. Define Suppose P 1 , P 2 are two solutions to the martingale problem for L R started at some fixed point v. For θ > 0 and f bounded and measurable on ℓ 2 , let and where M f is a martingale under each P i . Taking expectations, multiplying both sides by θe −θt , and integrating over t from 0 to ∞, we see that . Now take differences in the above to get Next let g ∈ T 2,C k and for K ≥ k set x, y)g(y) dt γ K (dy).
Recall that N_K is defined in (7.1). Since N_K(t, x, y) is smooth in x, bounded uniformly for t ≥ ε, and N_K(t, x, y) depends on x only through π_K(x), we see Holding y fixed and viewing N_K(t, x, y) and W_{εK}(x, y) as functions of x, we see by Kolmogorov's backward equation for the Ornstein–Uhlenbeck process with diffusion matrix (a_{ij}(y))_{i,j≤K} that Alternatively, one can explicitly calculate the derivatives. Using dominated convergence to differentiate under the integral in (8.5) gives By (8.1) for all x and K ≥ k We used (8.6) in the third equality.
For x ∈ ℓ 2 fixed we first claim that boundedly and uniformly in K ≥ k as ε → 0. By virtue of Proposition 5.5, it suffices to show boundedly and pointwise as ε → 0, uniformly in K ≥ k. The boundedness is immediate from Theorem 5.3. Since g ∈ T 2 k , given η there exists δ such that |g(y) − g(x)| ≤ η if |π k (y − x)| ≤ δ, and using Theorem 5.3, it suffices to show as ε → 0 uniformly in K ≥ k. By Theorem 5.12 the above integral is at most and (8.8) is established.
Next we claim that for each ε > 0 Since t ≥ ε in the integral defining W εK (x, y) we can use dominated convergence to differentiate through the integral and conclude that As in the proof of Proposition 6.6, the substitution w ′ = G(t) 1/2 w shows that the integral over R K in (8.11) equals where (6.16) and Lemma 6.7 are used in the above. By (2.5) we have Use (8.12) and (8.13) in (8.11) to get which proves (8.10) by our hypothesis on β.
Now let K → ∞ and use (8.10) and (8.15) to conclude that Then letting ε → 0 and using (8.8), we obtain provided g ∈ T^{2,C}_k. By a monotone class argument and the fact that S^∆ is the difference of two finite measures, we have the above inequality for g ∈ T. The σ-field we are using is generated by the cylindrical sets, so another application of the monotone class theorem leads to for all bounded g which are measurable with respect to σ(∪_j T_j). Taking the supremum over all such g bounded by 1, we obtain Γ ≤ (1/2)Γ, and hence Γ = 0.
This proves that S^1_θ f = S^2_θ f for every bounded and continuous f. By the uniqueness of the Laplace transform, this shows that the one-dimensional distributions of X_t are the same under P_1 and P_2. We now proceed as in [17, Chapter 6] or [3, Chapter 5] to obtain uniqueness of the martingale problem for L_R.
We now complete the proof of the main result for infinite-dimensional stochastic differential equations from the introduction.

SPDEs
Before proving our uniqueness result for our SPDE, we first need a variant of Theorem 2.1 for our application to SPDEs. Let λ_0 = 0 and now let for f ∈ T. In this case ℓ^2 = ℓ^2(Z_+). Define N_K in terms of a and its inverse A as in (7.1). We prove the following analog of (7.20) exactly as in the proof of Corollary 7.4 and Theorem 7.1: Here note that the proof uses the bounds on N_K and D_{ij} N_K from Sections 5 and 6 and the regularity properties of a^(1) (which are the same as those of a in the proof of Theorem 2.1) separately. If we prove the analog of (9.2) with a^(1) replaced by a^(2), we can then proceed exactly as in Section 8 to obtain our theorem. That is, it suffices to fix K and R and to show (9.3) for some η_1 > 0. Very similarly to the derivation of (7.16) (see also that of (7.22)), we have . Since α ∈ (1/2, 1] and γ > 2α/(2α − 1), we have γ > 2. We can choose η_2 ∈ (0, 1) such that γ(1 − η_2) > 2, and then |(a^(2) Using this and Proposition 6.9 with p = 0 and observing that (a^(2))^{K,R} satisfies all the hypotheses in Section 6, we conclude that The condition β > γ/(γ − 2) allows us to find η_3 such that γ(1 − η_3) > 2 and βη_3 > 1. Fix i and j for the moment and let d_{α,β}(x, y) be defined as in (7.9). We write using Proposition 6.9. Since Combining with (9.4) gives (9.3), as required.
Before proving Theorem 2.3, we need the following lemma. Recall that e n (x) = √ 2 cos nπx for n ≥ 1 and e 0 ≡ 1.
Lemma 9.2 Suppose f ∈ C^ζ_per and ‖f‖_{C^ζ} ≤ 1. There exists a constant c_1 depending only on ζ such that |⟨f, e_n⟩| ≤ c_1/(1 + n^ζ) for all n ∈ Z_+.
Proof. Let T be the circle of circumference 2 obtained by identifying ±1 in [−1, 1]. Since we can extend the domain of f to T so that f is C^ζ on T, and cos y = (1/2)(e^{iy} + e^{−iy}), it suffices to show that the Fourier coefficients of a C^ζ function on T decay at the rate |n|^{−ζ}. If ζ = k + δ for k ∈ Z_+ and δ ∈ [0, 1), [21, II.2.5] says that the n-th Fourier coefficient of f^{(k)} is c_2|n|^k times the n-th Fourier coefficient of f. Writing ĝ for the Fourier coefficients of g, we then obtain the required decay. Combining these estimates proves the lemma.
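The decay rate used here is the classical estimate for Hölder functions on the circle; a sketch, assuming ζ = k + δ with δ ∈ (0, 1) (the case δ = 0 uses the C^k bound directly):

```latex
% Integration by parts k times transfers smoothness to decay:
%   \hat f(n) = (in)^{-k} \, \widehat{f^{(k)}}(n).
% For g \in C^\delta(\mathbb{T}), substituting x \mapsto x + \pi/n gives
\hat g(n) = \frac{1}{2} \int_{\mathbb{T}} \bigl( g(x) - g(x + \pi/n) \bigr) e^{-inx} \, dx,
\qquad \text{so} \qquad |\hat g(n)| \le c \, \|g\|_{C^\delta} \, |n|^{-\delta}.
% Taking g = f^{(k)} yields |\hat f(n)| \le c \, |n|^{-k-\delta} = c \, |n|^{-\zeta}.
```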
We now prove Theorem 2.3.
We first argue that A has a unique continuous extension to a map A : L^2[0, 1] → L^2[0, 1]. If y = Σ_i y_i e_i ∈ S, then by (2.10) and Hölder's inequality we have for u, v ∈ C[0, 1]. Therefore A, whose domain is C[0, 1], is a bounded operator with respect to the L^2 norm. Thus there is a unique extension of A to all of L^2. By continuity, it is clear that the extension satisfies (2.10), (2.11) (for almost every x with respect to Lebesgue measure), and (2.12).
If i or j is 0, there is a trivial adjustment of a multiplicative constant. Note both a (1) and a (2) are symmetric because cosine is an even function, and that a (1) is of Toeplitz form. Also (2.12) now shows that a (1) satisfies (2.4) and a (2) satisfies (9.1).
Finally we check (2.5). We have by (2.11) and (2.10). This establishes (2.5) for a (1) and virtually the same argument gives it for a (2) . Hence a satisfies the hypotheses of Theorem 9.1.
Turning next to uniqueness in law, let u satisfy (2.7) with u 0 ∈ C[0, 1] and define X n (t) = u(·, t), e n . The continuity of t → u(t, ·) in C[0, 1] shows that t → X t ≡ {X n (t)} is a continuous ℓ 2 -valued process. Applying (2.8) with ϕ = e k , we see that where M k (t) is a martingale such that Thus we see that {X k } satisfies (2.6) with λ i = i 2 π 2 /2.
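The eigenvalues λ_i = i²π²/2 can be sanity-checked numerically; the sketch below assumes the drift comes from (1/2)∂²_x with Neumann boundary conditions, so that e_n(x) = √2 cos(nπx) satisfies (1/2)e_n″ = −(n²π²/2)e_n:

```python
import math

def e_n(n, x):
    """Neumann eigenfunctions on [0, 1]: e_0 = 1, e_n(x) = sqrt(2) cos(n pi x)."""
    return 1.0 if n == 0 else math.sqrt(2) * math.cos(n * math.pi * x)

n, x, h = 3, 0.37, 1e-5
# Central-difference approximation of e_n''(x)
second_deriv = (e_n(n, x + h) - 2 * e_n(n, x) + e_n(n, x - h)) / h**2
lam = n**2 * math.pi**2 / 2
# Check (1/2) e_n'' = -lambda_n e_n up to discretization error
residual = abs(0.5 * second_deriv + lam * e_n(n, x))
print(residual < 1e-3)
```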
Since u_t is the L^2 limit of the sums Σ^n_{k=0} X_k(t)e_k(x) and u_t is continuous in x, u_t is easily seen to be a Borel measurable function of X(t). Thus to prove uniqueness in law of u, it suffices to prove uniqueness in law of X. It is routine to show the equivalence of uniqueness in law of (2.6) to uniqueness of the martingale problem for L. Since the a_{ij} satisfy the hypotheses of Theorem 9.1, we have uniqueness of the martingale problem for L.
Finally, the proof of Theorem 2.3 will be complete once we establish the existence of solutions to (2.7). The proof of this is standard, but we include it in Appendix B for the sake of completeness.
Proof. (a) It follows easily from Leibniz's formula that It is also clear that A(u) ∈ C γ per implies that the same is true of A(u) 2 . The result now follows from Lemma 9.2.
[Proof of Corollary 2.6] By our assumptions on f , A(u)(x) is bounded above and below by positive constants, is in C γ per , and is bounded in C γ norm uniformly in u. By our assumptions on f , Squaring and integrating over Moreover the kth derivative of A(u)(x) is bounded uniformly in x and u. If we choose γ and β large enough so that the conditions of Theorem 2.3 are satisfied, we see from the above and Proposition 9.3(a) that (2.12) holds.
Turning to the boundedness condition (2.11), we have In the last inequality we use the linearity of u → û and ê_k = e_k. Since φ_j is smooth with compact support, its Fourier transform decays faster than any power, and so |∫ φ_j(w)e^{−i2πwx} dw| ≤ c_{β/α,j}(1 + |2πx|)^{−β/α} for all x. (9.11) Now for k ≥ 0, by (9.11). Use this in (9.10) to obtain (2.10). Finally, the proof of (2.9) is easy and should be clear from (9.10). The result now follows from Theorem 2.3.

A Proofs of linear algebra results
We give the proofs of some of the linear algebra results of Section 4.

Proof. [Proof of Lemma 4.4] Our definitions imply
by the hypotheses on a. The right side is The upper bound is similar. The bounds on a(t) are a reformulation of Lemma 4.1 and the analogous upper bound. Proof.
[Proof of Lemma 4.5] The first inequality in (4.9) follows from (4.4). The second inequality holds since where Lemma 4.3 is used in the last line and symmetry is used in the next to last line.
Turning to (4.10), we have The lower bound on a(t) (and hence b(t)) in Lemma 4.4 implies that Use this and (4.9) in (A.1) to derive (4.10). (4.11) is then immediate. Proof.
[Proof of Lemma 4.6] We write .
Use the lower bound on a(t) in Lemma 4.4 to see that A(t) ≤ Λ_0^{−1}, and then use (4.9) in the above to conclude that Observe that a(t) and b(t) are positive definite, so det a(t) and det b(t) are positive real numbers. We now use the inequality e^x ≤ 1 + xe^x for x > 0 to obtain det b(t)/det a(t) ≤ 1 + θe^θ.
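The elementary inequality used in the last step has a one-line proof:

```latex
% Sketch: for x > 0, e^t \le e^x on [0, x], hence
e^x - 1 = \int_0^x e^t \, dt \le x e^x,
\qquad \text{i.e.} \qquad e^x \le 1 + x e^x .
```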

B Proof of existence
We give here the proof of existence of a solution to (2.7).
Proof. Let X_n(t) = ⟨u_t, e_n⟩. By Theorem 9.1 there is a unique continuous ℓ^2-valued solution X to (2.6) with λ_n = n^2π^2/2, where a is constructed from A as above. If u(s, x) = Integrating by parts twice in ⟨φ, e_n⟩, and using the boundary conditions of φ, we find that Since A is bounded, M is a continuous orthogonal martingale measure in the sense of Chapter 2 of [18]. This (see especially Theorem 2.5 and Proposition 2.10 of [18]) and the fact that A is bounded below means one can define a white noise Ẇ on [0, 1] × [0, ∞) on the same probability space, so that Therefore we may take limits in (B.5) and use the above, together with (B.6) and (B.7), to conclude that u satisfies (2.8).
It remains to show that there is a jointly continuous version of u(t, x). Note first that X_n(t) = e^{−λ_n t}⟨u_0, e_n⟩ + where the series converges uniformly on t ≥ ε, x, y ∈ [0, 1] for every ε > 0. It follows that û^N_t(x, y) → P_t u_0(x) for all t > 0, x ∈ [0, 1].
It now follows easily from (B.4) that u(t, x) = P t u 0 (x) +ũ(t, x) a.a. x, P − a.s. for all t ≥ 0, (B.11) where the equality holds trivially for all x if t = 0.
Clearly P_t u_0(x) is jointly continuous by the continuity of u_0, and so we next show there is a continuous version of ũ(t, x). Let 0 ≤ s < t, choose reals x < y, and fix q ≥ 1. Our constants c_i below may depend on q but not s, t, x, y. By Burkholder's inequality and (B.3) we have Next use the uniform boundedness of A(u_v)(z) (by (2.11)) and the fact that {e_n} is an orthonormal system in L^2([0, 1]) to bound the above by By using (B.13)–(B.15) in (B.12) we may conclude that for all T > 0 there is a c(T, q, δ) so that for 0 ≤ s ≤ t ≤ T and x, y ∈ R, By Fatou's Lemma and (B.10) the same upper bound is valid for E(|ũ(t, x) − ũ(s, y)|^q). Kolmogorov's continuity criterion (see, for example, Theorem (2.1) in Chapter I of [15]) shows there is a jointly continuous version of ũ on R_+ × R.
We have shown that there is a jointly continuous process v(t, x) such that u(t, x) = v(t, x) a.a. x for all t ≥ 0, and v(0, ·) = u 0 (·), P − a.s.
Here the continuity in t in L 2 of both sides allows us to find a null set independent of t. As A has been continuously extended to a map from L 2 to L 2 , we have A(u s ) = A(v s ) in L 2 [0, 1] for all s ≥ 0 a.s. and so the white noise integral in (2.8) remains unchanged if u is replaced by v. It now follows easily that (2.8) remains valid with v in place of u. Therefore v is the required continuous C[0, 1]-valued solution of (2.7).