Limit theorems for infinite-dimensional piecewise deterministic Markov processes. Applications to stochastic excitable membrane models

We present limit theorems for a sequence of Piecewise Deterministic Markov Processes (PDMPs) taking values in a separable Hilbert space. This class of processes provides a rigorous framework for stochastic spatial models in which discrete random events are globally coupled with continuous space-dependent variables solving partial differential equations, e.g., stochastic hybrid models of excitable membranes. We derive a law of large numbers which establishes a connection to deterministic macroscopic models and a martingale central limit theorem which connects the stochastic fluctuations to diffusion processes. As a prerequisite we carry out a thorough discussion of Hilbert space valued martingales associated to the PDMPs. Furthermore, these limit theorems provide the basis for a general Langevin approximation to PDMPs, i.e., stochastic partial differential equations that are expected to be similar in their dynamics to PDMPs. We apply these results to compartmental-type models of spatially extended excitable membranes. Ultimately this yields a system of stochastic partial differential equations which models the internal noise of a biological excitable membrane based on a theoretical derivation from exact stochastic hybrid models.


Introduction
In this study we present limit theorems for sequences of Piecewise Deterministic Markov Processes (PDMPs) with values in a separable Hilbert space. PDMPs are a particular class of càdlàg, strong Markov processes which combine continuous deterministic time evolution and discontinuous, instantaneous, random 'jump' events. In view of applications, this paper is ultimately motivated by the derivation of a justifiable Langevin approximation to spatio-temporal stochastic hybrid models of excitable membranes, e.g., neuronal membranes. This is accomplished by the limit theorems we present in the following. We start by briefly introducing the general idea of our framework and the main results, which are made precise in the subsequent sections. We consider a family of fully coupled, Hilbert space-valued PDMPs indexed by n ∈ N. Here fully coupled means that the PDMPs, which split into a continuously moving and a piecewise constant component, are such that the jump rates of the processes depend on the state of the full system and the continuous dynamics depend on the state of the jump component. For the limit theorems we rely on two key assumptions. Firstly, jumps possess heights decreasing to zero for n → ∞ but occur at an increasing frequency roughly inversely proportional to the jump size. We are therefore in the fluid limit setting, cf. [29,30]. Secondly, we assume that for each n the continuous dynamics in between jumps depend on the piecewise constant component only via a finite set of (Hilbert space-valued) functions thereof, which we call coordinate functions. It is the sequence of coordinate functions coupled to the continuous component for which we derive limits. The first limit theorem we present is a weak law of large numbers for PDMPs in infinite-dimensional Hilbert spaces where the deterministic limit is given by a solution of an abstract evolution equation.
Next we proceed to the presentation of a central limit theorem for the martingales associated with a PDMP. This central limit theorem gives the basis for an approximation of PDMPs by diffusion processes which are solutions of stochastic partial differential equations. Finally, we show how to represent the stochastic process arising as the limit in the central limit theorem as a solution of a stochastic partial differential equation (SPDE) which then yields a Langevin approximation for PDMPs by a system of SPDEs. The new results presented extend previous results for PDMPs and pure jump processes in Euclidean space [30,16,35]. The difficulties in extending the fluid limit theorems in [29,30,35] to processes taking values in infinite-dimensional Hilbert spaces lie, on the one hand, in the appropriate treatment of Hilbert space-valued martingales. These arise by splitting a PDMP, which is a semi-martingale, into the sum of a part with finite variation and a local martingale. As these considerations are essential we have devoted a full section, Section 3, to the discussion of the martingales. On the other hand, the existence theory of solutions to abstract evolution equations is more intricate than that of ordinary differential equations in Euclidean space and demands additional technical rigour. We apply our theoretical findings to spatially extended hybrid models of excitable membranes. A first hybrid formulation of one such model in the context of neuroscience was presented in [5] and reformulated and extended as examples for PDMPs taking values in infinite-dimensional Hilbert spaces in [13]. For example, the Hodgkin-Huxley model is a deterministic, macroscopic model for the coupled evolution of the neuronal membrane potential and the averaged gating dynamics of ion channels [21]. More realistically, the membrane potential, which is the macroscopically observed variable of interest, arises from the stochastic dynamics of finitely many ion channels.
Thus the application of our limit theorems shows that the Hodgkin-Huxley model is obtained as the limit of a sequence of stochastic microscopic models taking the form of Hilbert space-valued PDMPs in the sense of a law of large numbers. Conceptually, here the fluid limit corresponds to increasing the number of ion channels while simultaneously decreasing the influence of each individual channel on the total current. The martingale central limit theorem can then be used to define the Langevin approximation providing a relatively simple stochastic version of the Hodgkin-Huxley model incorporating internal fluctuations. Concluding this introduction, we comment on related work on fluid limits in the infinite-dimensional setting. Averaging for PDMPs in infinite dimensions, in particular for the neuron model introduced in [5], wherein also a law of large numbers was considered, has recently been studied in [20]. For a model of linear chemical reactions by jump Markov processes a law of large numbers [4] and a central limit theorem [25] have been proven based on the original work of [29,30] for finite-dimensional jump processes. In these cases the deterministic limit is a reaction-diffusion partial differential equation and the central limit theorem yields diffusion processes given by stochastic partial differential equations. Limit theorems for variations of this model have been investigated in two series of studies, cf. [26,27,28] and [7,8,9,10,11]. A central difference between spatial models of excitable media and models of chemical reactions is that the latter exhibit diffusive motion of the reactants (∼ channels) which is absent in the former. Additionally, excitable media equations exhibit non-local interaction of channels as their dynamics are coupled globally via the membrane potential. The limit theorems we establish have to account for these differences. Further, there is also a difference on the technical side.
The technique employed in [25] and in all subsequent publications cited above is based on the semigroup approach to stochastic or deterministic evolution equations. In contrast, we pursue in the present paper the approach of a weak formulation. In general, the weak formulation of evolution problems allows one to consider more general equations than when dealing with mild, strong or classical solutions, cf. a discussion of this aspect in [43,Chap. 23.1]. Finally, we also mention a central limit theorem for Hilbert space-valued martingales in [34] and a diffusion approximation of SPDEs on nuclear spaces driven by Poisson random measures in [23]. The methods of proof we employ for the theoretical results in this study are motivated by the two last references, but differ as the classes of stochastic processes considered therein and in the present manuscript are different.
The remainder of the paper is organised as follows. We first briefly define PDMPs in Section 2 and precisely state the structure for a sequence of such PDMPs to allow for a limit. Then we discuss in detail the associated martingale process in Section 3. Limit theorems and the diffusion approximation are presented in Sections 4 and 5. We have deferred the proofs of the main results to Section 6. Next, in Section 7, we discuss applications of these limit theorems to compartmental-type models of excitable membranes, where the proofs of the conditions are deferred to Appendix B. The paper is concluded in Section 8 with a brief discussion and an outlook on further developments and applications. Finally, Appendix A contains the proof of the technical Theorem 3.1 that guarantees the square-integrability of the associated Hilbert space-valued martingales and establishes an appropriate Itô-isometry.

Piecewise Deterministic Markov Processes
In the first subsection we briefly define PDMPs and, in particular, discuss the specific subclass of PDMPs for which we present limit theorems in this study. For a general discussion of PDMPs we refer to the monographs [15,22] and, specifically, for Hilbert space-valued PDMPs associated to solutions of partial differential equations we refer to [13,40]. In the second subsection we present the sequence of PDMPs for which the limits are analyzed in this study. Finally, a notational remark: in this paper pairings (·, ·) and ⟨·, ·⟩ denote the inner product and the duality pairing, respectively, with respect to a certain Hilbert space which is usually indicated with a subindex. Further, * is used to denote dual spaces.

PDMPs on Hilbert spaces
Let $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t\ge 0}, \mathbb{P})$ denote a filtered probability space satisfying the usual conditions, $X \subset H \subset X^*$ be an evolution triple of separable real Hilbert spaces and $K$ be a countable set of isolated states. The product $H \times K$ serves as the state space for a PDMP. Then, a PDMP is a càdlàg strong Markov process $X_t(\omega) = (U_t(\omega), \Theta_t(\omega)) \in H \times K$ for all $t \ge 0$ which consists of two components. The first, $U_t$, takes values in $H$, possesses continuous sample paths and is called the continuous component of the PDMP. The second, $\Theta_t$, takes values in $K$, possesses right-continuous, piecewise constant sample paths and is called the jump component. We say a PDMP is regular if the number of jumps of $\Theta_t$ is a.s. finite in every finite time interval $[0, T]$. In this study PDMPs are always regular. We next state the mechanisms which govern the time evolution of the paths of such a PDMP. Firstly, there exists for each $\theta \in K$ an abstract evolution equation
$$\dot u = A(\theta)\, u + B(\theta, u) \tag{2.1}$$
where $A(\theta) : X \to X^*$ is a linear and $B(\theta, \cdot) : X \to X^*$ a (possibly nonlinear) operator. We assume that the family of abstract evolution equations (2.1) is well-posed, i.e., given any $\theta \in K$ and any initial condition $u \in H$ there exists a unique global weak solution $\phi(\cdot, (u, \theta)) \in L^2((0, T), X) \cap H^1((0, T), X^*)$ depending continuously on the initial condition. Note that this regularity implies $\phi(\cdot, (u, \theta)) \in C([0, T], H)$, cf. [38,Chap. 11]. Then, in between jumps of the jump component $\Theta_t$, the trajectory of the continuous component $U_t$ follows the weak solution to (2.1) corresponding to the parameter $\theta$ given by the current state of the jump component.
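That is, for $\tau_k$, $k \in \mathbb{N}$, denoting the jump times of the PDMP we have that
$$U_t = \phi\big(t - \tau_k, (U_{\tau_k}, \Theta_{\tau_k})\big), \qquad \Theta_t = \Theta_{\tau_k} \qquad \text{for } t \in [\tau_k, \tau_{k+1}).$$
Secondly, describing the stochastic transition dynamics of the jump component $\Theta_t$, there exist measurable transition rates $\Lambda : H \times K \to \mathbb{R}_+$ that define the distribution of the random jump times of $\Theta_t$ in the sense that for all $\theta \in K$
$$\mathbb{P}\big[\Theta_{t+s} = \theta \ \text{for all } s \in [0, \Delta t] \,\big|\, \Theta_t = \theta\big] = \exp\Big(-\int_0^{\Delta t} \Lambda(U_{t+s}, \theta)\, ds\Big). \tag{2.2}$$
In view of (2.2) we assume that $\Lambda$ is integrable along the solutions of (2.1) on any finite time interval, i.e.,
$$\int_0^T \Lambda\big(\phi(s, (u, \theta)), \theta\big)\, ds < \infty$$
for all $T > 0$, all $\theta \in K$ and all initial conditions $u \in H$, with the integral diverging as $T \to \infty$. We note that in applications we usually find that the transition rate $\Lambda$ is bounded, which implies the regularity of the PDMP. Finally, there exists a Markov kernel $\mu$ on $H \times K$ into $K$ that gives the distribution of the post-jump value, i.e.,
$$\mathbb{P}\big[\Theta_t = \xi \,\big|\, \Theta_t \neq \Theta_{t-}\big] = \mu\big((U_t, \Theta_{t-}), \{\xi\}\big). \tag{2.3}$$
The elements of the quadruple $(A, B, \Lambda, \mu)$ are called the characteristics of the process and under the above conditions define a regular PDMP uniquely (up to versions). Furthermore, under these conditions the following result characterising the extended generator of PDMPs is proven in [13,40].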
is absolutely continuous almost surely and the map- Moreover, if in addition $f$ is continuously Fréchet-differentiable with respect to its first argument such that the Riesz representation¹ $f_u \in H$ of the Fréchet derivative satisfies $f_u(u, \theta) \in X$ for $u \in X$ and is a locally bounded composition operator in $L^2((0, T), X)$,² then the extended generator $\mathcal{A}f$ is given by

We now define the structure of the sequence of processes for which we derive the limit theorems. For all $n \in \mathbb{N}$ let $(\Omega^n, \mathcal{F}^n, (\mathcal{F}^n_t)_{t\ge 0}, \mathbb{P}^n)$ be a filtered probability space satisfying the usual conditions, and let the processes $(X^n_t)_{t\ge 0} = (U^n_t, \Theta^n_t)_{t\ge 0}$ defined thereon be regular PDMPs taking values in $H \times K_n$ with path properties as defined in Section 2.1. Correspondingly, the characteristics of the PDMPs are given by $(A^n, B^n, \Lambda^n, \mu^n)$. Note that the state space $K_n$ for the piecewise constant component changes with varying index $n$ whereas the state space $H$ for the continuous component remains fixed. Therefore, in order for such a sequence of processes to allow for a limit we need to impose a special structure on the characteristics referring to the continuous component. To this end we assume there exists an $m \in \mathbb{N}$, introduced above, such that for each PDMP $(U^n_t, \Theta^n_t)_{t\ge 0}$ there exists a family of measurable coordinate functions $z^n_i : K_n \to E$, $i = 1, \dots, m$, such that the characteristics $A^n(\theta)$, $B^n(\theta)$ depend on the piecewise constant component and on the index $n$ only via the $E$-valued coordinate process $z^n(\theta) = (z^n_1(\theta), \dots, z^n_m(\theta))$. That is, there exist measurable operators $A, B : E \times X \to X^*$ such that for all $n \in \mathbb{N}$, all $u \in H$ and all $\theta \in K_n$
$$A^n(\theta)\, u = A(z^n(\theta))\, u, \qquad B^n(\theta, u) = B(z^n(\theta), u). \tag{2.5}$$
The coordinates $z^n$ can be interpreted as a 'sufficient statistic' of the piecewise constant component for the evolution of the continuous component.
In statistics a sufficient statistic for a quantity of interest is a function of the observations that is sufficient to estimate this particular quantity. For example, the sample average of independently and identically distributed real random variables is a sufficient statistic for the mean of their distribution. In the present setting, this means that the coordinate functions contain all information about the vector θ that is needed to determine the continuous dynamics in between jumps. Further, the essence of the subsequent limit theorems is that the sequence of coordinate processes on the space E allows for a limit under certain conditions. Typically, in applications one is interested in the dynamics of the continuous components only, thus a restriction of the attention to the coordinate functions is well justified. As E is a (vector-valued) Hilbert space itself, no generality would be lost if instead of the family of coordinate functions we assumed the existence of Hilbert space-valued functions z n taking values in the same Hilbert space for each n. However, we decided to use this more detailed notation since in examples one usually encounters that it is a set of coordinate functions that encodes the information necessary for defining the dynamics of the continuous component.
In order to illustrate this set-up let us briefly discuss the Hodgkin-Huxley model as an example of the general excitable membrane model considered in Section 7. Here the sequence of abstract evolution equations (2.1) arises from parabolic partial differential equations modelling the space-time evolution of the membrane potential of the form (2.6) with constants $g_i > 0$ and $E_i \in \mathbb{R}$, cf. (7.6). The indices refer to electrical currents due to the movement of charged sodium (Na) and potassium (K) ions across the membrane and the ohmic leakage (L) current mainly due to chloride ions [24]. In hybrid versions of the Hodgkin-Huxley system the conductances $p_i(t, x)$ depend on the finite number of open ion channels distributed in the membrane, which increases with $n$. Each individual channel is modelled stochastically, opening or closing at random times with dynamics depending on $u$, cf. Section 7.2 for more details. In the case of a constant potential $u(t, x) \equiv u$ each channel would be a continuous-time Markov chain. The collection of channel states at any time instant $t$ defines the discrete component $\Theta^n_t$. Finally, the coordinate functions $z^n$ relate channels in a specific state to their location in the physical space $D$, cf. their definition in (7.5). They map the channel configurations into piecewise constant space-time functions stating the local density of channels in the particular states, thus $p^n_i(t, x) := z^n_i(\Theta^n_t) \in L^2(D) = E$. Hence, equipped with suitable boundary conditions, equation (2.6) is an abstract evolution equation of the type (2.1) where the Hilbert spaces $H$, $X$ and $E$ are spaces of real functions on $D \subset \mathbb{R}^d$.
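To make the interplay of deterministic flow, state-dependent jump rates and coordinate functions concrete, the following is a minimal sketch under strong simplifying assumptions, and not the model of Section 7. Space is dropped entirely, so that H and E are both the real line, there is a single channel type with two states, and the coordinate function is the fraction of open channels. The function name `simulate_hybrid`, the conductance `g`, the reversal potential `e_rev` and the affine rate functions are all hypothetical choices made for illustration. Jump times are generated by thinning, a simulation technique brought in here that relies on the transition rate being bounded, as noted above for typical applications.

```python
import math
import random


def simulate_hybrid(n, t_end, u0=0.0, open0=0, g=2.0, e_rev=1.0, seed=0):
    """Simulate a toy PDMP with n two-state channels and a scalar potential u.

    Between jumps u follows du/dt = -u + g * p * (e_rev - u), where
    p = (number of open channels) / n is the value of the coordinate
    function. Each closed channel opens at rate a(u) and each open
    channel closes at rate b(u). All model choices are illustrative.
    """
    rng = random.Random(seed)

    def a(u):  # opening rate, lies in [0.5, 1.5] for u in [0, e_rev]
        return 0.5 + u / e_rev

    def b(u):  # closing rate, lies in [0.5, 1.5] for u in [0, e_rev]
        return 1.5 - u / e_rev

    def flow(u, p, dt):
        # Exact solution of the linear ODE for u while p is frozen.
        decay = 1.0 + g * p
        u_star = g * p * e_rev / decay
        return u_star + (u - u_star) * math.exp(-decay * dt)

    rate_bound = 1.5 * n  # upper bound for the total jump rate
    t, u, k = 0.0, u0, open0
    path = [(t, u, k / n)]
    while True:
        dt = rng.expovariate(rate_bound)  # candidate inter-event time
        if t + dt >= t_end:
            path.append((t_end, flow(u, k / n, t_end - t), k / n))
            return path
        u = flow(u, k / n, dt)
        t += dt
        total = a(u) * (n - k) + b(u) * k  # actual total rate at the candidate time
        if rng.random() * rate_bound < total:  # thinning: accept as a real jump
            # Post-jump distribution: one channel opens or one channel closes.
            k += 1 if rng.random() * total < a(u) * (n - k) else -1
            path.append((t, u, k / n))


path = simulate_hybrid(n=200, t_end=5.0)
```

Note that the simulation only tracks the number `k` of open channels rather than the full configuration of all `n` channels. This is the 'sufficient statistic' idea in miniature: the flow and, in this toy model, also the rates depend on the configuration only through the coordinate value `k / n`.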

The associated martingale process
For the limit theorems we derive in this paper, the main estimation procedures concern certain martingales associated with the PDMP. As these are of such central importance we discuss them in this separate section. The principle aim is, on the one hand, to derive conditions that imply the convergence in probability of the associated martingales as needed for the law of large numbers (cf. condition (4.5) in Theorem 4.1) and, on the other hand, we present some necessary structure for the central limit theorems. Therefore we define for all j = 1, . . . , m the E-valued stochastic process M n j by where the integrand in the right hand side is given by Hence the integrand is a countable convex combination of elements in E with time-dependent coefficients and in between jumps it depends continuously on s. Anticipating condition (3.4) below, which we generally assume to hold, we find that the integral in the right hand side of (3.1) almost surely exists in the sense of Bochner. For a brief discussion of the Bochner integral we refer to [37,App. A]. For an application of a functional φ ∈ E * to (3.1) we obtain where the integrand is Thus the integral has the form of the extended generator, cf. Theorem 2.1, applied to the mapping (u, θ) → φ, z n j (θ) E . This already suggests that the processes (3.3) are martingales under suitable boundedness conditions. In fact we are able to establish that the processes M n j are E-valued càdlàg martingales. We refer to [14,37] for a brief discussion of martingales in infinite-dimensional spaces. The easiest way to validate the martingale property is due to the following result [37,Sec. 2.3]: , the Hilbert space-valued martingale property holds if and only if φ, M n j (t) E is a real-valued martingale for all φ ∈ E * . The following theorem gives a condition that guarantees that the processes (3.1) are square-integrable martingales and satisfy an Itô-isometry. The proof is rather technical and thus we have deferred it to the Appendix A.
Theorem 3.1. Let n ∈ N be fixed and assume that for all t > 0 it holds that Then the process M n j is a square-integrable martingale and satisfies the Itô-isometry We continue the investigation of the processes M n j as Hilbert space valued martingales. From now on we always assume that assumption (3.4) holds. Note that the finiteness of the second moments of the jump sizes is a standard condition in related fluid limit theorems [29,35,34]. We introduce a concept akin to the quadratic covariance operator in Euclidean finite dimensional spaces. This concept is important for the central limit theorems in, on the one hand, establishing weak convergence, and, on the other hand, characterising the limit. For further reference we refer to [33].
Definition 3.1. For the square-integrable, $E$-valued, càdlàg martingale $M^n_j$ we denote by $(\ll M^n_j \gg_t)_{t\ge 0}$ its predictable quadratic variation process, i.e., the unique (up to indistinguishability), predictable $L_1(E^*, E)$-valued³ process which satisfies that for all $\phi, \psi \in E^*$ the real-valued process
$$t \mapsto \langle \phi, M^n_j(t) \rangle_E \, \langle \psi, M^n_j(t) \rangle_E - \langle \phi, \ll M^n_j \gg_t \psi \rangle_E$$
is a local martingale.
The aim now is to obtain an explicit formula for the quadratic variation process of the individual martingales M n j as well as of the vector-valued process M n of all martingales M n j , i.e., the E-valued process t → M n (t) = M n 1 (t), . . . , M n m (t) . To this end we define for all i, j = 1, . . . , m operators G n ij ∈ L(E * , E) by ψ → G n ij (u, θ n )ψ := (3.7) Clearly, these are linear, bounded operators mapping E * → E and depend measurably on (u, θ n ) ∈ H × K n . For i = j each operator is non-negative, i.e., φ, G n jj (u, θ n )φ E ≥ 0 for all φ ∈ E * , and symmetric, i.e., ψ, G n jj (u, θ n )φ E = φ, G n jj (u, θ n )ψ E for all φ, ψ ∈ E * . Let (ϕ k ) k∈N denote an orthonormal basis in E * . We find due to the Riesz Representation Theorem and Parseval's identity that the trace of the operators G jj satisfies Tr G n jj (u, θ n ) = Λ n (u, θ n ) For arbitrary i, j the trace is bounded in terms of (3.8) as it follows from Young's inequality that Tr G n ij (u, θ n ) ≤ 1 2 Tr G n ii (u, θ n ) + 1 2 Tr G n jj (u, θ n ). Let Φ = (φ 1 , . . . , φ m ) and Ψ = (ψ 1 , . . . , ψ m ) be elements of E * . Summing over all operators (3.7) applied to the components of Φ, Ψ as indicated by the indices, i.e., we obtain a linear, bounded operator G n (u, θ n ) mapping E * to E. This operator is symmetric as the operators G n ij satisfy φ, G n ij (u, θ n )ψ E = ψ, G n ji (u, θ n )φ E for all i, j. Moreover, the operator G n (u, θ n ) is non-negative as it holds that Finally, the operator G n (u, θ n ) is of trace class if the operators G jj , j = 1, . . . , m, are of trace class and the trace satisfies Tr G n (u, θ n ) = Λ n (u, θ n ) kn z n (ξ) − z n (θ n )) 2 E µ n (u, θ n ), dξ . (3.10) We next prove that the operators (3.8) give the quadratic variations of the martingales (3.1).
³ $L_1(E^*, E)$ denotes the space of trace class operators from the Hilbert space $E^*$ into $E$.
Proposition 3.1. The quadratic variation of the martingale M n j satisfies for all t ≥ 0 Remark 3.1. It is an immediate consequence of Proposition 3.1 that the quadratic variation of the E-valued martingale M n is given analogously to (3.11) by integrating the operator G n .
Proof. First of all note that due to the characterisation of the trace (3.8) and condition (3.4) it holds that the process in the right hand side of (3.11) takes values in L 1 (E * , E) almost surely. Further, it holds that ≪M n j ≫ t satisfies for all φ, ψ ∈ E that Here φ, M n j (t) E and ψ, M n j (t) E are understood as real-valued stochastic integrals with respect to the associated martingale measure of the PDMP. Thus we infer that for all φ, ψ ∈ E it holds Finally, the linearity of the Bochner integral (note that L 1 (E * , E) is a Banach space) implies (3.11).
A further property of the quadratic variation is that the process $\|M^n_j(t)\|_E^2 - \mathrm{Tr} \ll M^n_j \gg_t$ is a local martingale. We note that the trace process $t \mapsto \mathrm{Tr} \ll M^n_j \gg_t$ is the unique, predictable increasing process exhibiting this property. Using the characterisation (3.11) of the quadratic variation we thus obtain that the process
$$\|M^n_j(t)\|_E^2 - \int_0^t \mathrm{Tr}\, G^n_{jj}(U^n_s, \Theta^n_s)\, ds$$
is a local martingale vanishing almost surely at $t = 0$, and analogously in the case of the $E$-valued martingale $M^n$.
We are now in a position to state a lemma which establishes the convergence in probability (3.14) of the processes (M n j ) t≥0 necessary for the law of large numbers, cf. condition (4.5) in Theorem 4.1.
Then the process (3.12) is a martingale and for all $T, \epsilon > 0$ it holds that
$$\lim_{n\to\infty} \mathbb{P}^n\Big[\sup_{t\in[0,T]} \|M^n_j(t)\|_E > \epsilon\Big] = 0. \tag{3.14}$$
Proof. As the process $M^n_j$ is an $E$-valued càdlàg martingale, it holds that $\|M^n_j\|_E^2$ is a càdlàg submartingale. Thus an application of Markov's and Doob's inequalities yields the estimates
$$\mathbb{P}^n\Big[\sup_{t\in[0,T]} \|M^n_j(t)\|_E > \epsilon\Big] \le \frac{1}{\epsilon^2}\, \mathbb{E}^n \sup_{t\in[0,T]} \|M^n_j(t)\|_E^2 \le \frac{4}{\epsilon^2}\, \mathbb{E}^n \|M^n_j(T)\|_E^2.$$
Now, the Itô-isometry (3.5) and condition (3.13) imply the convergence in probability (3.14). It remains to show that the process (3.12) is a martingale. A sufficient condition, see, e.g., [22,Prop. B.0.13], is that for all $T > 0$ it holds Estimating the term inside the expectation we obtain The expectation of the first supremum term in the right hand side is bounded due to Doob's inequality and the square-integrability of the martingale. The term inside the second supremum is increasing, thus its expectation is finite due to condition (3.13).
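The role of the Itô-isometry in this argument can be checked numerically in the simplest possible case. The sketch below is an illustration under explicit assumptions, not part of the theory above: the potential is held constant, so that n channels switch independently between closed and open with fixed rates `a` and `b` (hypothetical parameters), E is the real line, and the single coordinate is the open fraction p. We assume the isometry has the usual form, i.e., the second moment of M^n(T) equals the expected time integral of the trace in (3.10), which here reduces to (a(1 - p) + b p) / n because each jump has height 1/n and the total rate is n (a(1 - p) + b p). Since the mean of p solves a linear ODE, the right hand side is available in closed form.

```python
import math
import random


def martingale_at_T(n, t_end, a, b, k0, rng):
    """Return M(T) = p(T) - p(0) - int_0^T [a (1 - p_s) - b p_s] ds
    for n independent two-state channels with fixed rates a and b."""
    t, k = 0.0, k0
    drift_integral = 0.0
    while True:
        total = a * (n - k) + b * k  # total jump rate in the current state
        dt = rng.expovariate(total)
        p = k / n
        if t + dt >= t_end:
            drift_integral += (a * (1 - p) - b * p) * (t_end - t)
            return k / n - k0 / n - drift_integral
        drift_integral += (a * (1 - p) - b * p) * dt
        t += dt
        k += 1 if rng.random() * total < a * (n - k) else -1


def empirical_second_moment(n, t_end, a, b, k0, runs, seed=0):
    """Monte Carlo estimate of E[M(T)^2]."""
    rng = random.Random(seed)
    samples = [martingale_at_T(n, t_end, a, b, k0, rng) for _ in range(runs)]
    return sum(m * m for m in samples) / runs


def predicted_second_moment(n, t_end, a, b, p0):
    """E int_0^T (a (1 - p_s) + b p_s) / n ds, using the exact mean of p_s."""
    p_inf = a / (a + b)
    int_p = p_inf * t_end + (p0 - p_inf) * (1 - math.exp(-(a + b) * t_end)) / (a + b)
    return (a * t_end + (b - a) * int_p) / n


emp = empirical_second_moment(n=100, t_end=1.0, a=2.0, b=1.0, k0=0, runs=500, seed=1)
pred = predicted_second_moment(n=100, t_end=1.0, a=2.0, b=1.0, p0=0.0)
print(f"empirical E[M(T)^2] = {emp:.5f}, predicted = {pred:.5f}")
```

In this toy setting the predicted second moment is of order 1/n, so the bound obtained from Markov's and Doob's inequalities tends to zero as n grows, which is the mechanism behind (3.14).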

A weak law of large numbers
In order to propose a deterministic limit of the sequence of PDMPs we consider functions F j : E × H → E, j = 1, . . . , m. In combination with the operators A, B these functions are used to define a coupled system of deterministic abstract evolution equationṡ u = A(p) u + B(p, u), We assume that to suitable initial condition (u 0 , p 0 ) ∈ H × E there exists a unique weak solution (u(t), p(t)) t≥0 in C(R + , H × E) of (4.1). Additionally, we assume that for all i = 1, . . . , m the components p i satisfy That is, the components p j satisfy the equation (4.1) in the sense of an Hilbert space valued integral equation. We note that in application one usually encounters deterministic limit systems that possess strong or classical solutions and hence the current weak framework is satisfied. Finally, we assume that the operators A, B and F j , j = 1, . . . , m, satisfy Lipschitz-type conditions on L 2 ((0, T ), E × X) in the sense that for every T > 0 there exist constants L 1 and L 2 such that for all u, v ∈ L 2 ((0, T ), X) and all p, q ∈ L 2 ((0, T ), E) it holds that and where we have omitted the arguments t of the functions u, v, p and q.
Remark 4.1. In the proof of the law of large numbers, see Section 6.1, these Lipschitz conditions are applied such that one pairing (v, q) refers to a path segment of the continuous component of a PDMP and the coordinate process, and the second pairing (u, p) to the deterministic limit functions. Thus for the applications of (4.3) and (4.4) in the proof it is sufficient that these hold only for pairings (v, q) out of a set containing almost all paths of the sequence of PDMPs and for (u, p) being the deterministic limit, i.e., one single distinguished pairing. This restriction of (4.3) and (4.4) to be satisfied only for particular pairings (v, q) and (u, p) out of the whole path space has a decisive advantage: in order to establish these conditions we are able to incorporate additional qualitative results on the trajectories of the PDMPs and the deterministic limit, and the constants L 1 , L 2 may depend on (u, p). For example, in the application to excitable membrane models such an additional qualitative result is that the components corresponding to u, v, p, q are pointwise bounded.
We now present a weak law of large numbers in Theorem 4.1 below. The proof of the theorem follows the lines of previously published limit theorems considering processes in finite dimensions [29,35]. The main difficulty arising in infinite-dimensional phase space concerns the bounds on the martingale part, cf. condition (C1), which is rarely a problem in finite dimensions. However, using the appropriate martingale theory in Hilbert spaces this difficulty can be kept to a minimum. The remaining difficulties are mainly of a technical nature, as martingale theory in connection with PDMPs in infinite-dimensional spaces gets more involved and is not covered by previous results in [22]. We have established the necessary theory in the preceding Section 3 and addressed the question of the convergence of the martingale part (C1) within this framework. Most importantly, in Lemma 3.1 we have proven a sufficient condition for (C1) to be satisfied. In particular, this sufficient condition (3.13) is a natural extension of the condition employed in finite dimensions, cf. [29,35].
A different approach to establishing condition (C1) which avoids using martingale theory in Hilbert spaces is exemplified in the law of large numbers proved in [5]. In infinite-dimensional space this approach encounters the problem of simultaneously controlling countably many real martingales compared to only finitely many in the case of its finite-dimensional counterpart. This problem can be overcome with an intricate compactness argument which relies on the assumption that the dual space $E^*$ is compactly embedded in some additional normed space, and all estimates (especially an estimate which also implies condition (3.13)) have to be derived in the norm of this additional space. Furthermore, the condition that all martingales $(\langle \phi, M^n_j(t) \rangle_E)_{t\ge 0}$, $j = 1, \dots, m$ and $\phi \in E^*$, possess almost surely uniformly bounded paths has to be introduced. We are of the opinion that our approach is more elegant and, more importantly, that it avoids the introduction of additional conditions. Finally, consistently with the notation in Section 3 we use in the subsequent theorem and its proof the notation $\mathcal{A}^n \langle \cdot, z^n_j(\cdot) \rangle_E$ as defined in (3.2). Then, for given $(u, \theta^n) \in H \times K_n$, functionals $[\mathcal{A}^n \langle \cdot, z^n_j(\cdot) \rangle_E](u, \theta^n)$ on $E^*$ are defined by the mappings $\phi \mapsto [\mathcal{A}^n \langle \phi, z^n_j(\cdot) \rangle_E](u, \theta^n)$. As usual we identify the bidual $E^{**}$ with $E$ and thus $[\mathcal{A}^n \langle \cdot, z^n_j(\cdot) \rangle_E](u, \theta^n) \in E$.

Theorem 4.1. We assume that the following conditions hold: where we have omitted the argument $t$ of the functions $u$ and $\theta$.

(C3)
The initial conditions $(U^n_0, \Theta^n_0)$ of the sequence of PDMPs converge in probability to the initial conditions of the deterministic limit in the sense that for all $\epsilon > 0$

Then, for every $\epsilon > 0$ and every fixed $T > 0$ it holds that

Remark 4.2. The result (4.7) implies convergence in probability of the processes $(U^n_t, z^n(\Theta^n_t))_{t\ge 0}$ to the deterministic function $(u(t), p(t))_{t\in[0,T]}$ in the Hilbert space $L^2((0, T), H \times E)$. If the differences of the components are almost surely bounded independent of $n$ the convergence even holds in the mean, cf. the application of the law of large numbers in Theorem 7.1. Further, the conditions (C1)-(C3) are generalisations from Euclidean space to infinite-dimensional Hilbert spaces of those employed in the corresponding theorems for PDMPs in Euclidean space [35] and, in particular, of the original formulation in case of pure jump processes in Euclidean space [29]. In these cases the conditions above reduce to the corresponding assumptions.
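The convergence described in Remark 4.2 can be observed numerically in a toy setting. The following sketch rests on illustrative assumptions only: space is dropped so that H and E are the real line, there is one two-state channel type, and the model constants `G`, `E_REV` and the affine rates `a`, `b` are hypothetical. Under these assumptions the deterministic limit system reads du/dt = -u + G p (E_REV - u), dp/dt = a(u)(1 - p) - b(u) p, which has the structure of (4.1) with F(p, u) = a(u)(1 - p) - b(u) p. The PDMP is simulated by thinning, the limit is integrated with a classical Runge-Kutta scheme, and the squared L²((0, T)) distance between the two is approximated on a time grid.

```python
import math
import random

G, E_REV = 2.0, 1.0  # hypothetical conductance and reversal potential


def a(u):  # opening rate, in [0.5, 1.5] for u in [0, E_REV]
    return 0.5 + u / E_REV


def b(u):  # closing rate, in [0.5, 1.5] for u in [0, E_REV]
    return 1.5 - u / E_REV


def flow(u, p, dt):
    """Exact flow of du/dt = -u + G p (E_REV - u) for frozen p."""
    decay = 1.0 + G * p
    u_star = G * p * E_REV / decay
    return u_star + (u - u_star) * math.exp(-decay * dt)


def pdmp_on_grid(n, steps, h, u0, p0, rng):
    """Sample (U, open fraction) of the n-channel PDMP at times 0, h, 2h, ..."""
    bound = 1.5 * n  # upper bound of the total jump rate, used for thinning
    u, k = u0, round(p0 * n)
    out = [(u, k / n)]
    for _ in range(steps):
        remaining = h
        while True:
            dt = rng.expovariate(bound)
            if dt >= remaining:  # no further candidate event before the grid point
                u = flow(u, k / n, remaining)
                break
            u = flow(u, k / n, dt)
            remaining -= dt
            total = a(u) * (n - k) + b(u) * k
            if rng.random() * bound < total:  # accept the candidate as a jump
                k += 1 if rng.random() * total < a(u) * (n - k) else -1
        out.append((u, k / n))
    return out


def limit_on_grid(steps, h, u0, p0):
    """Classical RK4 integration of the deterministic limit system."""
    def rhs(u, p):
        return (-u + G * p * (E_REV - u), a(u) * (1 - p) - b(u) * p)

    u, p = u0, p0
    out = [(u, p)]
    for _ in range(steps):
        k1 = rhs(u, p)
        k2 = rhs(u + 0.5 * h * k1[0], p + 0.5 * h * k1[1])
        k3 = rhs(u + 0.5 * h * k2[0], p + 0.5 * h * k2[1])
        k4 = rhs(u + h * k3[0], p + h * k3[1])
        u += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        p += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
        out.append((u, p))
    return out


def l2_error(n, t_end=2.0, steps=200, seed=0):
    """Grid approximation of int_0^T |U - u|^2 + |p_n - p|^2 dt."""
    h = t_end / steps
    x = pdmp_on_grid(n, steps, h, 0.0, 0.0, random.Random(seed))
    y = limit_on_grid(steps, h, 0.0, 0.0)
    return h * sum((xu - yu) ** 2 + (xp - yp) ** 2 for (xu, xp), (yu, yp) in zip(x, y))


for n in (50, 500, 5000):
    print(n, sum(l2_error(n, seed=s) for s in range(5)) / 5)
```

In this sketch the averaged error should shrink as n grows, in line with the fluid limit picture of many channels each with small individual influence. The example is only a finite-dimensional caricature and says nothing about the verification of conditions (C1)-(C3) in the spatial setting.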

The central limit theorem and the Langevin approximation
We proceed to the presentation of the central limit theorem for associated martingales (M n t ) t≥0 defined in (3.1). The central limit theorem provides the theoretical basis for an approximation of spatio-temporal PDMPs by Hilbert-space valued diffusion processes where the latter can be represented by solutions of stochastic partial differential equations, see Section 5.2. Proving central limit theorems usually involves two tasks: On the one hand, one has to show the existence of a limit and, on the other hand, one has to provide a characterisation of the limit as a certain stochastic process. The former is equivalent to the problem of tightness of the stochastic processes. In the case of martingales sufficient conditions for tightness depending on the quadratic variation process are stated in [34]. In order to characterise the limit there exist different approaches, showing either that the limit solves a given (local) martingale problem which is known to have a unique solution (cf. [23,34]) or proving weak convergence of the finite dimensional distributions (cf. [25,35]). We present two central limit theorems, Theorems 5.1 and 5.2, employing the two methods, respectively, however, to avoid repetition we state only the proof of the first in the present study and refer to the PhD thesis of one of the authors [40] for the proof of the second. The two theorems differ in a technical assumption which in each case arises in addition to the central condition of the convergence of the quadratic variations. We believe that for applications of the limit theorems it is advantageous to know both versions of the martingale central limit theorem, as it is easily conceivable that only one of these technical assumptions is satisfied. Hence the theorems are applicable in different situations. Finally, we emphasise that in the following the space E need not necessarily be the same space for which the law of large numbers is satisfied. 
However, clearly, the space $E$ in the present section contains the space in the law of large numbers as a subspace. In applications, usually, the law of large numbers holds in a space with a stronger norm; for example, for the excitable membrane model considered in Section 7 the law of large numbers holds in $L^2(D)$ whereas the central limit theorem holds in the space $H^{-2s}(D)$. This is a major difference to the corresponding results in finite-dimensional space where both limit theorems hold in the same space.

A martingale central limit theorem
In this section we present central limit theorems for the scaled E-valued martingales ( √ α n M n t ) t≥0 associated with a sequence of PDMPs where α n ∈ R + , n ∈ N, is a suitable rescaling sequence such that lim n→∞ α n = ∞. Clearly, the rescaling is necessary in order to be able to obtain a limit different from the trivial limit as (4.5) implies that (M n t ) t≥0 converges to zero in distribution. We note that the sequence α n can also be interpreted as characterising the speed of convergence of the martingales (M n t ) t≥0 .
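The role of the rescaling can be illustrated in a scalar toy case. The sketch below is not one of the PDMPs considered in this paper: it assumes a compensated Poisson process with jump height 1/n and jump rate n, so that α_n = n is the assumed rescaling sequence. The function name and parameters are hypothetical. For this toy martingale the variance of M^n_t is t/n, which vanishes, whereas the variance of √n M^n_t equals t for every n.

```python
import numpy as np

def compensated_poisson_samples(n, t, num_samples, rng):
    """Samples of M^n_t = N(n t)/n - t, where N is a unit-rate Poisson process.

    Toy assumption: jumps of height 1/n occur at rate n, mimicking the
    fluid-limit scaling (small jumps, high frequency).
    """
    counts = rng.poisson(lam=n * t, size=num_samples)
    return counts / n - t

rng = np.random.default_rng(0)
t = 1.0
for n in (10, 100, 1000):
    m = compensated_poisson_samples(n, t, 20000, rng)
    # First number: empirical variance of M^n_t (shrinks like t/n).
    # Second number: empirical variance of sqrt(n) * M^n_t (stays near t).
    print(n, m.var(), (np.sqrt(n) * m).var())
```

The unscaled variances decay with n while the rescaled ones stay of order one, which is the scalar analogue of choosing α_n so that the limit is non-trivial.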
In the following let t → G(u(t), p(t)) ∈ L E * , E be a Bochner-integrable operator-valued map such that each G(u(t), p(t)) is a symmetric, positive trace class operator. Particularly this implies for all Φ ∈ E * and all t > 0, that it holds that Here (u(t), p(t)) t≥0 is the deterministic limit obtained in Theorem 4.1 and the use of this notation for the -at this point -arbitrary time-dependent operator G only illustrates that in applications the time-dependence is due to a dependence on the deterministic limit system. These operatorvalued functions are used to define a unique centred diffusion process on E, i.e., an E-valued centred Gaussian process with independent increments, continuous sample paths. Such a process is uniquely defined by its covariance operator and due to a theorem of Itô stated in [25] every family of trace class operators C * (t) ∈ L 1 (E, E) which are increasing and continuous in t define a centred diffusion process. In the present situation we define C * in the following way. We denote by ι : E → E * the canonical identification of a Hilbert space with its dual, hence we can define for x, y ∈ E, which is continuous and increasing for all x ∈ E and C * (t) is a trace class operator on E. Moreover, for operators there is obviously a one-to-one relationship between C * and C. Hence, we may say that also the latter defines a diffusion process on the space E. We proceed to the statement of the central limit theorem. The proof of the theorem employs a characterisation of the limit via the local martingale problem. The essential condition characterising the limit is the convergence of the quadratic variation processes (5.6). The second condition (5.7) is a technical condition on the jump heights which arises due to the method of proof and is usually satisfied in applications. The remaining conditions are such that (D1) guarantees tightness of the sequence of processes and in combination with (D2) that any limit is a continuous stochastic process. 
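The covariance construction described above can be summarised as follows. This is a sketch that is consistent with the surrounding statements and with the covariance $\int_0^t G(u(s),p(s))\,\mathrm{d}s$ used in Section 5.2; the exact form of the pairing is an assumption and not a verbatim restatement of (5.1) and (5.2).

```latex
% Assumed reading: C(t) is the time-integrated operator G,
% and C^*(t) is its counterpart acting on E via the identification \iota.
\begin{align*}
  C(t) &:= \int_0^t G\bigl(u(s),p(s)\bigr)\,\mathrm{d}s \;\in\; L_1(E^*,E), \\
  C^*(t) &:= C(t)\circ\iota \;\in\; L_1(E,E), \\
  \bigl(C^*(t)\,x,\,y\bigr)_E
    &= \int_0^t \bigl\langle \iota y,\; G\bigl(u(s),p(s)\bigr)\,\iota x \bigr\rangle_E \,\mathrm{d}s
    \qquad \text{for } x,y\in E .
\end{align*}
```

Under this reading, positivity of each $G(u(s),p(s))$ makes $t \mapsto (C^*(t)x,x)_E$ non-decreasing, and Bochner integrability gives continuity in $t$.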
The proof of the following theorem is deferred to Section 6.2.
Theorem 5.1. We assume that the following conditions hold: and there exists an orthonormal basis (ϕ k ) k∈N of E * such that for all k ∈ N and all (u, θ n ) ∈ H × K n except on a set of potential zero 6 where the constants γ k > 0, independent of n, t and (u, θ n ), satisfy k∈N γ k < ∞, and the constant C(t) > 0, independent of n, k and (u, θ n ), satisfies lim t→0 C(t) = 0.
(D2) For all β > 0 and every Φ ∈ E * it holds that Further, for all Φ ∈ E * and all t > 0 it holds that Finally, we assume that the jump heights of the rescaled martingales are almost surely uniformly bounded, i.e., there exists a constant C < ∞ such that it holds almost surely for all Then it follows that the process ( √ α n M n t ) t≥0 converges weakly to an E-valued centred diffusion process characterised by the covariance operator (5.2).
We now state a second version of the martingale central limit theorem wherein the limiting process is characterised by the convergence of the characteristic functions.
Theorem 5.2. Assume that the laws of the martingales ( √ α n M n t ) t≥0 form a tight sequence, e.g., condition (D1) is satisfied.
The convergence (5.6) holds and there exists a sequence β n > 0 decreasing to zero such that for all Then it follows that the process ( √ α n M n t ) t≥0 converges weakly to an E-valued centred diffusion process characterised by the covariance operator (5.2).
The central condition of the convergence of the quadratic variation processes (5.6) is unchanged, however, the second, technical condition (5.7) in (D3) is changed due to the different method of proof. That is, condition (5.8) arises instead of (5.7) as an assumption on the distribution of the jump heights employing a characterisation of the limit process using convergence of characteristic functions instead of the local martingale problem. The significance for applications of condition (5.8) in contrast to (5.7) is that the former avoids the almost sure uniform bound on the jump heights in the latter. That is, arbitrarily large jumps are possible for each martingale in the sequence as long as their probability decreases sufficiently fast. Note that (5.8) is stronger than the similar condition (D2) in the preceding theorem. We omit the proof of the theorem which is an adaptation of the estimating procedures in [30,35] to the infinite-dimensional setting. For details we refer to the PhD thesis of one of the present authors [40].
Remark 5.1. We remark without proof that the assumptions (D1) and (5.6) imply the convergence of the trace processes, i.e., for all T > 0 Tr G(u(s), p(s)) ds .

Langevin approximation
Usually, e.g., in models of excitable membranes, one is ultimately interested in the dynamics of the continuous component. We have discussed in Section 2.2 that the coordinate functions z n i , i = 1, . . . , m, carry all the information needed for the dynamics of the continuous component (U n approximation of the processes (U n t , z n (Θ n t )) t≥0 for large enough n by a deterministic evolution equation, on the one hand, and, as we argue in this section, by a stochastic partial differential equation on the other hand. To this end we first discuss representations of the limiting diffusion in Theorems 5.2 and 5.1 as a stochastic integral. By definition G(u(s), p(s)) • ι is a non-negative, self-adjoint trace class operator acting on E, hence there exists a unique non-negative square root, i.e., a non-negative Let (W t ) t≥0 be a standard cylindrical Wiener process on E with covariance operator given by the identity (cf. [14,37]). Then, as the mapping t → G(u(s), p(s)) • ι is a valid integrand process for a stochastic integral with respect to (W t ) t≥0 . That is, the process (Z t ) t≥0 defined for all t ≥ 0 by is an E-valued Gaussian process with continuous sample paths and independent increments which, in addition, is also a square-integrable martingale. Moreover, the process has the covariance given by the operator t 0 G(u(s), p(s)) ds. Therefore, due to unique definition of Gaussian processes via their covariance operators, the process (Z t ) t≥0 is a version of the limiting diffusion identified for the sequence of martingales ( √ α n M n G( U n t , P n t ) dW t . (5.10) The sequence of Langevin approximations ( U n t , P n t ) t≥0 possesses the same asymptotic behaviour as the sequence of processes (U n t , z n (Θ n )) t≥0 . 
It is obvious that for $n \to \infty$, and thus $\alpha_n \to \infty$, the noise term in (5.10) vanishes and the system approximates the deterministic solution $(u(t), p(t))_{t\geq 0}$ of the system (4.1), just as was proven in the law of large numbers, Theorem 4.1, for the sequence of PDMPs. It poses no difficulties to make this statement precise in the form of a weak law of large numbers similar to Theorem 4.1. Thus for large enough $\alpha_n$ one might expect that equation (5.10) produces behaviour similar to that of the PDMP, with the major advantage of being analytically (and numerically) considerably less complex. In order to analyse properties of the Langevin approximation, clearly, well-posedness of the system (5.10) has to be addressed first. This is suitably done within the variational approach to stochastic partial differential equations. That is, equation (5.10) is assumed to hold as an integral equation in $X^* \times E^*$, in contrast to the semigroup approach which defines the solution via the semigroup generated by the linear part of (5.10) and the variation of constants formula. (Note that in its generic form (5.10) does not necessarily possess a fully linear part.) The variational approach mirrors the approach taken in this paper of using weak solutions to abstract evolution equations to define the deterministic inter-jump motion of PDMPs. We refer to [31, Sec. 1.3.1] for a concise introduction to the variational approach to SPDEs containing an existence and uniqueness theorem as well as further references. We do not pursue the issue of well-posedness of the Langevin approximation any further at this point, as we are of the opinion that this question is best addressed when analysing the Langevin approximation for particular models.
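How the noise in a Langevin-type equation scales with α_n can be illustrated with a scalar toy equation integrated by the Euler-Maruyama scheme. The sketch below is not the system (5.10): the two-state channel with constant opening rate `a` and closing rate `b`, the diffusion coefficient, and all function and parameter names are assumptions chosen for illustration only.

```python
import numpy as np

def euler_maruyama_two_state(alpha, a=1.0, b=2.0, p0=0.2, T=5.0, steps=1000, seed=0):
    """Euler-Maruyama path of the toy Langevin equation
        dP = (a(1-P) - bP) dt + sqrt((a(1-P) + bP)/alpha) dW,
    together with the Euler path of its deterministic limit (no noise term).
    """
    rng = np.random.default_rng(seed)
    dt = T / steps
    p_noisy = np.empty(steps + 1)
    p_det = np.empty(steps + 1)
    p_noisy[0] = p_det[0] = p0
    for k in range(steps):
        p, q = p_noisy[k], p_det[k]
        drift = a * (1.0 - p) - b * p
        # Clip at zero so the square root stays real if P leaves [0, 1].
        var = max(a * (1.0 - p) + b * p, 0.0) / alpha
        dw = rng.normal(0.0, np.sqrt(dt))
        p_noisy[k + 1] = p + drift * dt + np.sqrt(var) * dw
        p_det[k + 1] = q + (a * (1.0 - q) - b * q) * dt
    return p_noisy, p_det

for alpha in (1e2, 1e4, 1e6):
    noisy, det = euler_maruyama_two_state(alpha)
    # Maximal distance between the noisy path and the deterministic path.
    print(alpha, np.max(np.abs(noisy - det)))
```

In this toy setting the distance to the deterministic path shrinks roughly like α^{-1/2}, in line with the vanishing noise term described above.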
Remark 5.2. The process (5.9) is not necessarily the only stochastic integral process which is a version of the limiting diffusion. Let $U$ be another separable, real Hilbert space, where $U = E$ is possible, and assume there exists an operator $Q \in L_1(U, U)$ (or $Q$ cylindrical) and a function $g \in L^2((0, T), L_2(U, E))$ for all $T > 0$ such that $G(u(t), p(t)) \circ \iota = g(u(t), p(t)) \circ Q \circ g^*(u(t), p(t))$ for all $t \geq 0$. Then, the process $(Z^Q_t)_{t\geq 0}$ defined by the stochastic integral is an E-valued Q-Wiener process, has the same quadratic variation as $(Z_t)_{t\geq 0}$ hence the processes coincide in distribution. Then starting from the representation (5.11) the Langevin approximation is given by (5.10) with the obvious changes in the diffusion term. We note that in finite dimensions the non-uniqueness, see, e.g., [3, Chap. 8], of a stochastic integral associated to a given covariance matrix can be exploited to improve the speed of numerical approximations in Monte-Carlo simulations of diffusion approximations by choosing an optimal diffusion coefficient structure, see [32]. In infinite dimensions the question of a practical implication of choosing a diffusion approximation based on (5.11) over (5.10) needs, to the best of our knowledge, still to be addressed.
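The non-uniqueness of the diffusion coefficient is easy to see in finite dimensions. The following sketch uses a hypothetical 3x3 covariance matrix in place of the operator G(u(t), p(t)) ∘ ι and takes Q to be the identity; it compares the symmetric square root with a Cholesky factor. Both factorisations reproduce the same covariance although the factors differ.

```python
import numpy as np

# Hypothetical symmetric positive definite matrix standing in for G ∘ ι.
G = np.array([[2.0, 0.5, 0.0],
              [0.5, 1.0, 0.3],
              [0.0, 0.3, 1.5]])

def symmetric_sqrt(mat):
    """Symmetric non-negative square root via the spectral decomposition."""
    eigvals, eigvecs = np.linalg.eigh(mat)
    # Clip tiny negative eigenvalues caused by rounding before taking roots.
    return eigvecs @ np.diag(np.sqrt(np.clip(eigvals, 0.0, None))) @ eigvecs.T

g_sym = symmetric_sqrt(G)        # analogue of the square root used in (5.9)
g_chol = np.linalg.cholesky(G)   # lower-triangular alternative factor

# Both factors satisfy g g^T = G, yet they are different matrices.
print(np.allclose(g_sym @ g_sym.T, G), np.allclose(g_chol @ g_chol.T, G))
print(np.allclose(g_sym, g_chol))
```

Which factor is preferable numerically depends on the simulation scheme; the sketch only shows that the choice exists.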
Proofs of the main results

Proof of Theorem 4.1 (Law of large numbers)
The central argument of the subsequent proof is an appropriate application of Gronwall's Lemma such that the upper bound satisfies the convergence in probability. Here the estimating procedure yielding the estimate to which Gronwall's Lemma is applied necessitates careful attention due to more intricate regularity aspects of solutions to abstract evolution equations in contrast to solutions of ODEs in Euclidean space.
The continuous component $U^n_t$ of each PDMP is in between successive jump times the weak solution of an abstract evolution equation. Similarly $u(t)$ is the weak solution of the abstract evolution equation (4.1). Therefore also the difference of the two paths is in between jump times the weak solution of an abstract evolution equation. It thus holds due to [17, Sec. 5.9, Thm. 3] for almost all $t$ that which is valid for almost all $t_0, t_1$ in between two successive jump times. Since both sides of equation (6.1) are continuous the equality (6.1) holds for all $t_0, t_1$ between successive jump times. Moreover, as $U^n_t$ is continuous also at jump times it follows that equation (6.1) holds for all $t \in [0, T]$, i.e., we have Next we employ the one-sided Lipschitz condition (4.3) to estimate the integral in the right hand side of equation (6.2). This yields the inequality The overall aim is to apply Gronwall's inequality to the growth inequality (6.3). Therefore, in the next step we derive a control on the terms $\|z^n_j(\Theta^n_s) - p_j(s)\|^2_E$ in the right hand side of inequality (6.3). As $p$ is a solution of (4.1) satisfying (4.2) we obtain for every functional $\phi \in E^*$ a decomposition where the term $\langle \phi, M^n_j(t)\rangle_E$ has precisely the form (3.3) for all $t \in [0, T]$. Next we expand the decomposition (6.4) to obtain We take the supremum over all $\phi \in E^*$ with $\|\phi\|_{E^*} \leq 1$ on both sides of this equation, square both sides and apply to the right hand side the inequality $|a_1 + \ldots + a_k|^2 \leq k(|a_1|^2 + \ldots + |a_k|^2)$ and the Cauchy-Schwarz inequality which yields We next apply the Lipschitz condition (4.4) on $F$ and obtain the estimate To further estimate this term we employ the convergence (4.5) of the term $\|M^n_j\|_E$ and the convergence (4.6) of the generator. It follows by the definition of these limits that for every $\epsilon_1 > 0$ and every $\delta > 0$ we can find an $N_{\epsilon_1,\delta}$ such that for all $n \geq N_{\epsilon_1,\delta}$ it holds due to (4.5) for all $j = 1, \ldots, m$ and all $t \in [0, T]$ that and due to (4.6) and the continuous mapping theorem that on a set $\Omega_1 \subset \Omega$ satisfying $\mathbb{P}^n(\Omega\setminus\Omega_1) \leq \delta$ for all $n \geq N_{\epsilon_1,\delta}$. Thus continuing to estimate only for paths on the set $\Omega_1$ we obtain from (6.5) the inequality In order to finally obtain the growth estimate suitable for an application of Gronwall's inequality we add inequality (6.3) and inequalities (6.6) for all $j = 1, \ldots, m$ which yields An application of Gronwall's inequality to (6.7) yields Finally, due to (C3), i.e., the convergence in probability of the initial conditions, it holds that for every $\epsilon_2 > 0$ we can find to every $\delta > 0$ an $N_{\epsilon_2,\delta}$ such that on a set $\Omega_2 \subset \Omega$ with $\mathbb{P}^n(\Omega\setminus\Omega_2) < \delta$ it holds for all $n \geq N_{\epsilon_2,\delta}$ that Let $\epsilon, \delta > 0$ be arbitrary. Then we obtain choosing $\epsilon_2 = \epsilon\, e^{-CT}$ and $\epsilon_1 = \frac{\epsilon_2}{5(m+2)}$, thus $K_1 = \epsilon_2$, that for all $n \geq N_{\epsilon,\delta} := N_{\epsilon_1,\delta} \vee N_{\epsilon_2,\delta}$ it holds that on the set $\Omega_1 \cap \Omega_2$. Therefore it holds for all $n \geq N_{\epsilon,\delta}$ that As $\delta$ and $\epsilon$ are arbitrary the statement (4.7) follows.
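For reference, the integral form of Gronwall's inequality on which the final step rests can be stated as follows. The symbols $f$, $K$ and $C$ are generic placeholders (a non-negative continuous function and non-negative constants) and are not the specific quantities appearing in (6.7).

```latex
\[
  f(t) \;\le\; K + C\int_0^t f(s)\,\mathrm{d}s \quad \text{for all } t\in[0,T]
  \qquad\Longrightarrow\qquad
  f(t) \;\le\; K\,\mathrm{e}^{Ct} \quad \text{for all } t\in[0,T].
\]
```

In particular, a constant satisfying $K \le \epsilon\,\mathrm{e}^{-CT}$ yields $f(t) \le \epsilon$ on $[0,T]$, which explains the factor $\mathrm{e}^{-CT}$ in the choice of $\epsilon_2$.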

Proof of Theorem 5.1 (Central limit theorem)
The proof of Theorem 5.1 is split into three successive steps. In the first step we prove tightness of the sequence of martingales which guarantees the existence of a limit. Secondly, we show that any limit is a continuous process. Finally, in the last step we prove that the limit is the specific diffusion process as stated in the theorem. The conditions (D1)-(D3) in Theorem 5.1 are ordered accordingly: each condition is needed, in addition to the preceding ones, in the corresponding step of the proof.

Tightness
In order to prove tightness of the sequence of E-valued martingales ( √ α n M n t ) t≥0 it suffices to show that the following conditions are satisfied, cf. [34] wherein general conditions for tightness of sequences of Hilbert space valued processes and, in particular, martingales are considered: (T1) The sequence of initial conditions ( √ α n M n 0 ) n≥0 is tight.
(T2) For all t ≥ 0 it holds that and there exists an orthonormal basis (ϕ k ) k∈N of E * such that for each ǫ > 0 The sequence of the real-valued trace processes (Tr ≪ √ α n M n ≫ t ) t≥0 , n ≥ N, satisfies the Aldous condition: For every T, ǫ, δ > 0 there exists a h > 0 and an N > 0 such that for any sequence of stopping times 8 (σ n ) n≥0 with σ n ≤ T it is valid that We next establish the above conditions. First note that condition (T1) is trivially satisfied as M n 0 = 0 for all n > 0. Hence we proceed to condition (T2). In order to establish the first condition (6.10) we use Markov's inequality to obtain the estimate where the right hand side is finite due to assumption (5.3). Taking the supremum on both sides the same assumption implies (6.10).
Next, in order to show the second condition (6.11) we employ Markov's inequality, the monotone convergence theorem (in order to change the order of expectation and the countable summation over all $k > m$), the form of the quadratic variation (3.11) and inequality (5.4) to obtain for the term in the left hand side the estimates where the upper bound is independent of $n \in \mathbb{N}$. Moreover, the property $\sum_{k\in\mathbb{N}} \gamma_k < \infty$ implies that $\lim_{m\to\infty} \sum_{k>m} \gamma_k = 0$ and hence (6.11) holds for all $t \geq 0$. Finally, it remains to show (A). Let $T, \delta > 0$ and $\sigma_n < T$ be an arbitrary sequence of stopping times, then for all $h > 0$ it holds that for $s \leq h$ Here we have used Markov's inequality, the strong Markov property of the PDMP and the assumption (5.4). As the final upper bound is independent of $s$ and $n$ and converges to zero for $h \to 0$, condition (A) follows.

Any Limit is a continuous process
In the preceding part of the proof we have established that the laws of the sequence of martingales $(\sqrt{\alpha_n} M^n_t)_{t\geq 0}$ are tight, which is equivalent to the existence of a weakly convergent subsequence. We now prove that under the additional condition (D2) any cluster point of the sequence is a measure supported on $C(\mathbb{R}_+, E)$. The method of proof follows the outline of [23, Lemma 3.2], adapted to the stochastic processes being PDMPs on Hilbert spaces, to the general setup in this study and to the particular conditions (D1) and (D2) in Theorem 5.1, which differ from [23]. Furthermore, we have extended the result in [23, Lemma 3.2], which only considers convergence on finite time intervals $[0, T]$, to convergence on $D(\mathbb{R}_+, E)$. In the following we employ the abbreviations $Z^n_t := \sqrt{\alpha_n} M^n_t$ and $\Delta_t Z^n := Z^n_t - Z^n_{t-}$, i.e., $(\Delta_t Z^n)_{t\geq 0}$ denotes the process of jump heights. Note that $\Delta_t Z^n = \sqrt{\alpha_n}\, \Delta_t z^n(\theta^n)$. Further, let $\mathbb{P}^*$ denote an accumulation point of the sequence $(\mathbb{P}^n)_{n\in\mathbb{N}}$. Without loss of generality we use $\mathbb{P}^n$, $n \geq 1$, to also denote the subsequence converging weakly to $\mathbb{P}^*$. Furthermore, here $\mathbb{P}^n$ is understood as a law on the Skorokhod space $D(\mathbb{R}_+, E)$ given by the pushforward measure of the process $(\sqrt{\alpha_n} M^n_t)_{t\geq 0}$. Then due to the Skorokhod Representation Theorem, e.g., [16, Chap. 3, Thm. 1.8], there exists a probability space $(\Omega^o, \mathcal{F}^o, \mathbb{P}^o)$ supporting $D(\mathbb{R}_+, E)$-valued random variables $\zeta^n$, $n \geq 1$, and $\zeta^*$ with distributions $\mathbb{P}^n$ and $\mathbb{P}^*$, respectively, such that $\zeta^n$ converges to $\zeta^*$ almost surely with respect to $\mathbb{P}^o$. Further, it clearly holds that $\mathbb{E}^n f(Z^n) = \mathbb{E}^o f(\zeta^n)$ for suitable functionals $f$.
We begin the proof with preliminary estimates on functions evaluated along the path of the PDMPs. These ultimately allow to infer that the process of jumps vanishes in the limit. Let g be a measurable, bounded, non-negative function g : R → R, that vanishes in a neighbourhood of 0 and of ∞, that is, there exists a finite constant C g := sup x∈R g(x)/x 2 < ∞. For such a function g and any Φ ∈ E * we define the process where M n is the martingale measure associated with the PDMP, and we infer that G n t ( Φ, Z n E ) is a martingale. Note that the above summation over all s ∈ (0, t] is well-defined as the PDMPs are regular and thus g Φ, ∆ s Z n E is non-zero for only finitely many s ≤ t. The proof now proceeds as follows. We first show (a) that for all t ≥ 0 the random variables G n t ( Φ, ζ n E ), n ∈ N, are uniformly integrable for all t and (b) that they converge to s∈(0,t] g Φ, ∆ s ζ n E in probability. This allows to infer that the convergence result also holds as convergence in mean. In part (c) we then use these results to show that the jump heights of the canonical process of the law P * are constantly zero almost surely. This implies that  Therefore, employing the special structure of the map g we obtain the estimate where the right hand side is finite for every t > 0 due to condition (5.3) in (D1).
(b) In this part of the proof we establish convergence in probability of the random variables G n t ( Φ, ∆ζ n E ). Let β > 0 be such that g(x) = 0 for |x| ≤ β, i.e., the interval (−β, β) is contained in the neighbourhood of 0 whereon g vanishes. Then we obtain using Markov's inequality and due to the boundedness of g the estimates Thus due to condition (5.5) in (D2) it holds that Therefore, combining these two convergence results we obtain that holds as convergence in probability.
(c) From parts (a) and (b) we infer that (6.13) also holds as convergence in mean. Together with Jensen's inequality this implies and hence we infer that Furthermore, G n t ( Φ, Z n E ) is a martingale which satisfies G n 0 ( Φ, Z n E ) = 0. This, in turn, implies that E n G n t ( Φ, Z n E ) = 0 for every n ∈ N. Therefore we obtain due to (6.14) In a next step, let g m be a sequence of functions satisfying the properties for functions g proposed above. Further we assume that the functions g m (x) increase pointwise to x 2 for m → ∞ (for an example of such functions we refer to [23]). Then due to the monotone convergence theorem it holds that lim Furthermore, the limiting expectation in the right hand side is zero as each element of the sequence of expectations in the left hand side is zero due to (6.15). Next we choose Φ to be an element of an orthonormal basis (ϕ k ) k∈N of E and sum the expectations over all elements of the basis yielding Due to the dominated convergence theorem we can interchange the countable summation and the expectation and, as the PDMP is regular, we afterwards interchange the resulting two summation inside the expectation. Then Parseval's identity yields As the non-negative random variable inside the expectation is zero only for continuous paths of the process (Z s ) s∈[0,t] we infer that almost all paths are continuous, i.e., P * C([0, t], E) = 1.
(d) To conclude the proof let t k , k ∈ N, be a sequence of times increasing to infinity then that is a process with distribution given by the limit P * possesses almost surely continuous paths.
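A family of cut-off functions with the properties required of g and g_m in parts (a) to (c) can be sketched numerically. The piecewise-linear cut-off below is a hypothetical choice made for illustration and is not the example given in [23]; the function name `g_m` and the grid are assumptions.

```python
import numpy as np

def g_m(m, x):
    """Hypothetical cut-off g_m(x) = x^2 * chi_m(|x|).

    chi_m vanishes for |x| <= 1/(2m) and for |x| >= 2m, equals one for
    1/m <= |x| <= 2m - 1, and is non-decreasing in m for every fixed x.
    Hence g_m is bounded, non-negative, vanishes near 0 and near infinity,
    satisfies g_m(x) <= x^2, and increases pointwise to x^2 as m grows.
    """
    r = np.abs(np.asarray(x, dtype=float))
    near_zero = np.clip(2.0 * m * r - 1.0, 0.0, 1.0)  # removes a neighbourhood of 0
    near_inf = np.clip(2.0 * m - r, 0.0, 1.0)         # removes a neighbourhood of infinity
    return r**2 * near_zero * near_inf

x = np.linspace(-5.0, 5.0, 2001)
for m in (1, 2, 4, 8):
    # Largest gap between x^2 and g_m(x) on the grid; it shrinks as m grows.
    print(m, np.max(np.abs(x**2 - g_m(m, x))))
```

With this choice the constant sup g_m(x)/x^2 is bounded by one for every m, so the uniform integrability estimate of part (a) applies with the same constant along the whole family.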

Limit is a diffusion process
In the final part of the proof we uniquely characterise the limit of the sequence of martingales $(\sqrt{\alpha_n} M^n_t)_{t\geq 0}$ under the additional assumption (D3). The method of proof is via the local martingale problem motivated by a proof presented in [34], i.e., the limiting probability measure is the unique solution to a particular martingale problem. The author in [34] considers Hilbert space valued stochastic integral equations driven by Hilbert space valued martingales with state dependent quadratic variation. A central limit theorem for the martingales is presented. The arguments of the subsequent proof are closely related to [34]. This is because the general results on martingales associated with PDMPs, which we have proven in Section 3, lead to a problem in this part of the proof that has the same underlying structure as in [34]. One difference, however, is that the present conditions (D1)-(D3) are more general than the conditions in [34] and adapted to the PDMP setup, hence some estimates differ. As in the preceding part of the proof we interpret the sequence of martingales $(\sqrt{\alpha_n} M^n_t)_{t\geq 0}$ defined on the probability spaces $(\Omega^n, \mathcal{F}^n, (\mathcal{F}^n_t)_{t\geq 0}, \mathbb{P}^n)$ as random variables on the space $D(\mathbb{R}_+, E)$ equipped with its natural σ-field $\mathcal{D}$. Further, laws on the canonical space are given by the pushforward measure. In order to simplify the notation we denote the laws on the canonical space also by $\mathbb{P}^n$. Due to the results in the preceding two parts of the proof we know the sequence $\mathbb{P}^n$, $n \in \mathbb{N}$, admits a limit $\mathbb{P}^*$ supported on $C(\mathbb{R}_+, E)$. We use $(\zeta_t)_{t\geq 0}$ to denote the canonical process on $D(\mathbb{R}_+, E)$ which is a version of the martingale $(\sqrt{\alpha_n} M^n_t)_{t\geq 0}$ under the pushforward measure $\mathbb{P}^n$ for all $n \in \mathbb{N}$ or of the weak limit under the measure $\mathbb{P}^*$.
In the following we prove that the limit P * is a solution to a local martingale problem the unique solution of which is an E-valued centered diffusion process with covariance operator C(t) ∈ L 1 (E * , E) as given in (5.2). For any twice continuously differentiable function f : E → R the extended generator Af of such a diffusion is given by Then, in order to uniquely characterise the solution to the local martingale problem connected with this generator and supported on the space C(R, E) it suffices to consider mappings f of the form Φ, · E and Φ, · 2 E for all Φ ∈ E * , cf. [34]. That is, we have to show that the canonical process ζ t is such that for all Φ ∈ E * the processes Φ, ζ t E and are P * -local martingales. We start introducing some notation and then show in parts (a) and (b) the local martingale properties of the two indicated processes on the canonical space D(R + , E).
As before we use Z n t := √ α n M n t and ∆ t Z n := Z n t − Z n t− . Further, as indicated above the notation is such that we use P n and E n to denote probabilities and expectations on the original given measurable spaces (Ω n , F n ) as well as on the canonical space (D(R + , E), D). That is, e.g., E n f (Z n t ) = E n f (ζ t ) for any bounded function f , where the former is the expectation taken on the original space (Ω n , F n , P n ) and the latter the expectation on the canonical space of càdlàg processes with respect to the pushforward measure. Furthermore, we employ the Itô-formula [33,Thm. 25.7] for smooth functions f ∈ C ∞ c (R) applied to semi-martingales. For the particular choice of the semi-martingales being the real martingales Φ, Z n t E the Itô-formula reads where (M f,n t ) t≥0 is some martingale on (Ω n , F n , (F n t ) t≥0 , P n ) depending on Z n and f . Next, we introduce on the canonical space for all positive ρ the stopping times τ ρ := inf{t ∈ R + | ζ t E > ρ} and note that due to the bound (5.7) in (D3) on the jump heights we have that for any law P n , n ≥ 1, it holds almost surely ζ τρ E ≤ ρ + C . (6.18) Analogously we define the stopping times τ n ρ := inf{t ∈ R + | Z n t E > ρ} on the spaces (Ω n , F n , (F n t ) t≥0 , P n ). Finally, as already mentioned (D t ) t≥0 denotes the natural filtration on the canonical space. Then for A ∈ D t we define A n := (Z n ) −1 F ∈ F n t its preimage with respect to the random variable Z n . We now proceed to show that the two processes Φ, ζ t E and (6.16) are indeed local martingales with respect to the limit measure P * .
(a) Let $\Phi \in E^*$ be fixed and choose for every $\rho$ a smooth function $f_\rho \in C^\infty_c(\mathbb{R})$ which satisfies $f_\rho(x) = x$ if $|x| \leq \|\Phi\|_{E^*}(\rho + C)$ and thus $f_\rho'(x) = 1$ and $f_\rho''(x) = 0$ for $|x| \leq \|\Phi\|_{E^*}(\rho + C)$. Therefore it holds $\|Z^n_t\|_E \leq \rho$ for $t < \tau^n_\rho$, which implies the estimate $|\langle \Phi, Z^n_{t\wedge\tau^n_\rho}\rangle_E| \leq \|\Phi\|_{E^*}(\rho + C)$. It follows that applying the Itô-formula (6.17) to the function $f_\rho$ and the martingale $Z^n_{t\wedge\tau^n_\rho}$ all terms besides the martingale $M^{n,f_\rho}$ vanish in the right hand side. Therefore we obtain for $t_2 \geq t_1$ and all $A \in \mathcal{D}_{t_1}$ that The proof of the first martingale property is concluded as in [34]: The mapping $\zeta \mapsto f_\rho(\langle \Phi, \zeta_{t_2\wedge\tau_\rho}\rangle_E)$ is almost surely (with respect to the probability $\mathbb{P}^*$) continuous and as $\mathbb{P}^n$ converges weakly to $\mathbb{P}^*$ it holds due to (6.19) that Here we have employed a weaker version of the continuous mapping theorem, see, e.g., [6, Thm. 2.7]. We infer from the definition of the conditional expectation that the stopped processes are martingales. Furthermore, as $\zeta_t$ possesses continuous paths almost surely under the measure $\mathbb{P}^*$ it holds that $\tau_\rho$ diverges to $\infty$ almost surely for $\rho \to \infty$. Hence, we can find a sequence of stopping times $\tau_{\rho_k}$, $k \in \mathbb{N}$, such that $\tau_{\rho_k} \to \infty$ almost surely for $k \to \infty$. Thus, the process $\langle \Phi, \zeta_t\rangle_E$ is a local martingale with respect to $\mathbb{P}^*$.
(b) For the second class of processes we consider smooth functions g ρ ∈ C ∞ c (R) such that g ρ (x) = x 2 for all |x| ≤ Φ E * (ρ + C). Starting from the definition of the conditional expectation as in (6.19) we obtain Here the first expectation in the final right hand side vanishes due to the Itô-formula (6.17): We apply the Itô-formula for the function g ρ and the martingales Z n t∧τ n ρ to the terms Φ, Z n t2∧τρ 2 E and Φ, Z n t1∧τρ 2 E . Then we find -similarly to part (a) -that the summands in the right hand side of the Itô-formula vanish. Therefore we are left with only the martingale M n,gρ and the integral term, wherein g ′′ ρ ( φ, Z n t− E ) = 2 for all t < τ n ρ . The martingale term vanishes due to the martingale property and the remaining integral is cancelled by the integral in the above expectation. Overall this shows that the first expectation vanishes. Next we take the absolute value on both sides of the above equality and obtain, estimating the second expectation and extending the integration interval to [0, T ], the inequality The convergence of the upper bound to zero for n → ∞ follows by assumption (5.6). Hence we have proven an analogous result to (6.19) in part (a). The same line of argument that concluded part (a) also concludes part (b). The proof is completed.

Application to models of excitable membranes
The primary motivation for the present work stems from the study of a stochastic version of the Hodgkin-Huxley model [21] describing action potential generation and propagation in spatially extended neurons in a PDMP formulation. This model is analogous in structure to hybrid models that are used for the modelling of calcium dynamics, cf. [18,24,42], or models of cardiac tissue, cf. [19]. Therefore, we consider as an example of the application of the presented limit theorems a general compartmental-type hybrid stochastic model for spatially extended excitable membranes introduced in [40, Sec. 3.2] which subsumes the above mentioned applications. (Another example for the application of Theorem 4.1 is the law of large numbers that is presented in [5] for a particular one-dimensional hybrid model.) We refrain from discussing the physiological derivations of this type of model and the implications and interpretations of the limit theorems in this setting. These aspects will be the subject of a forthcoming publication. We now fix some notation for the remainder of the section. The set $D \subset \mathbb{R}^d$ denotes a bounded spatial domain with the physically reasonable dimensions $d \leq 3$. That is, the set $D$ is a bounded interval when $d = 1$ and when $d \in \{2, 3\}$ we assume it possesses a $C^3$-boundary. Further, for a given dimension $d$, let $s$ denote the smallest integer such that $s > d/2$. Finally, let $m \in \mathbb{N}$ be the fixed number of states ion channels can be in.

Deterministic limit system
The deterministic limit is the solution to the membrane equation with $p_j$, $j = 1, \ldots, m$, given by solutions of the coupled equations We choose Dirichlet boundary conditions for the component $u$, i.e., $u(t, x) = 0$ for all $x \in \partial D$ and all $t \in [0, T]$, which, however, is of no particular importance for the considerations that follow and can be readily changed. Here the coefficient functions $a_{ij}$ and $g_i$ are smooth on $D$, with $g_i$ non-negative, and the differential operator is strongly elliptic. Further, the rate functions $q_{ij}$ are sufficiently smooth. Finally, the initial conditions satisfy $u_0 \in H^1_0(D) \cap H^s(D)$ and $p_i(0) \in H^s(D)$ and, in addition, the pointwise bounds $u(0, x) \in [u_-, u_+]$ and $p_i(0, x) \in [0, 1]$, $i = 1, \ldots, m$, such that $\sum_{i=1}^m p_i(0, x) = 1$, hold for all $x \in D$. Then, the deterministic system (7.1), (7.2) is well-posed, that is, there exists a unique global solution depending continuously on the initial condition, which also satisfies (4.

Compartmental-type membrane models
We briefly summarise the essential features of PDMPs $(U^n_t, \Theta^n_t)_{t\geq 0}$, $n \in \mathbb{N}$, constituting compartmental-type membrane models.
Firstly, an integral component of the sequence of models is a sequence of compartmentalisations of the spatial domain $D$. Thus, for each $n \in \mathbb{N}$ let $\mathcal{P}_n$ be a convex partition of the domain $D$, i.e., $\mathcal{P}_n$ is a finite collection of mutually disjoint convex subsets of $D$, called compartments, such that their union equals $D$. The second fundamental aspect is the channel distribution across the compartments yielding the stochastic jump dynamics and the coordinate functions $z^n$. We assume that each compartment either contains no channels or a fixed deterministic number of channels. Let $p(n)$ denote the number of non-empty compartments of the $n$th model, which we denote by $D_{1,n}, \ldots, D_{p(n),n}$, and let $l(k, n)$ be, for $k \leq p(n)$, the total number of channels in the $k$th non-empty compartment of the $n$th model. Then the piecewise constant components of the PDMPs are given by $mp(n)$-dimensional vectors $\Theta^n_t = (\Theta^{k,n}_i(t))_{i=1,\ldots,m,\; k=1,\ldots,p(n)}$ with finite state spaces $K_n$. Each component $\Theta^{k,n}_i(t)$ counts the number of channels located in the domain $D_{k,n}$ which are in state $i$ at time $t$, and it holds that $\sum_{i=1}^m \Theta^{k,n}_i(t) = l(k, n)$ for all $t \geq 0$ as channels can neither be destroyed nor created. We proceed to define the stochastic jump dynamics. As two channel switchings do not occur simultaneously, the only jumps in the configuration $\theta^n \in K_n$ with non-zero probability are transitions concerning one single channel, that is, events for which in one particular compartment one particular channel changes its state. Let $q_{ij} : \mathbb{R} \to \mathbb{R}_+$ denote the $u$-dependent instantaneous rate of one channel switching from state $i$ to $j$. Then given a specific configuration $\theta^n \in K_n$ the rate that one channel in compartment $D_{k,n}$ switches from state $i$ to $j$ is given by $\theta^{k,n}_i\, Q^{k,n}_{ij}(u)$ (7.3), where $Q^{k,n}_{ij}(u)$ is a functional of the membrane variable $u \in L^2(D)$ defined as $Q^{k,n}_{ij}(u) := q_{ij}\bigl(\frac{1}{|D_{k,n}|}\int_{D_{k,n}} u(x)\,\mathrm{d}x\bigr)$. That is, $Q^{k,n}_{ij}(u)$ is the instantaneous rate $q_{ij}$ evaluated at the average value of the membrane variable over the compartment $D_{k,n}$.
Hence the rate (7.3) is the number of channels in state $i$ in domain $D_{k,n}$ times the rate of one channel switching from $i$ to $j$. Summing over all events, this definition yields the total instantaneous rate
$$\Lambda^n(u, \theta^n) := \sum_{k=1}^{p(n)} \sum_{i=1}^{m} \sum_{j \ne i} \theta^{k,n}_i \, Q^{k,n}_{ij}(u). \tag{7.4}$$
Note that for each $n$ the total instantaneous rate is bounded and, as expected, proportional to the total number of channels, which implies that the PDMPs are regular. Finally, we define on the set $K_n$ for $i = 1, \ldots, m$ the coordinate functions
$$z^n_i(\theta^n) := \sum_{k=1}^{p(n)} \frac{\theta^{k,n}_i}{l(k,n)} \, I_{D_{k,n}}.$$
The coordinate process $z^n(\Theta^n_t)$ is càdlàg with each component taking values in $L^2(D)$. Clearly, the coordinate process is zero on those compartments which do not contain channels. Moreover, each $z^n_i(\Theta^n_t)$ is for every $t \ge 0$ a piecewise constant function on the spatial domain $D$ which takes values in $[0, 1]$.
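The bookkeeping behind the individual rates (7.3), the total rate (7.4) and the coordinate functions can be illustrated with a minimal sketch. The two-state channel, its rate functions `q`, the channel counts `theta` and the compartment averages `u_avgs` are all hypothetical choices made only for illustration; they are not taken from the models discussed here.

```python
from math import exp

def compartment_rates(theta_k, u_avg, q):
    """Rates theta_i * q_ij(u_avg) for all i != j in one compartment.

    theta_k : list of channel counts per state in the compartment
    u_avg   : average of the membrane variable over the compartment
    q       : q[i][j] is the (hypothetical) single-channel rate i -> j
    """
    m = len(theta_k)
    return {(i, j): theta_k[i] * q[i][j](u_avg)
            for i in range(m) for j in range(m) if i != j}

def total_rate(theta, u_avgs, q):
    """Sum of the individual rates over all compartments and transitions."""
    return sum(sum(compartment_rates(th, u, q).values())
               for th, u in zip(theta, u_avgs))

def coordinate_values(theta):
    """Value theta_i^k / l(k) of each coordinate function on compartment k."""
    return [[count / sum(th) for count in th] for th in theta]

# Hypothetical two-state channel: state 0 = closed, state 1 = open.
q = [[None, lambda u: 1.0 / (1.0 + exp(-u))],   # opening rate, depends on u
     [lambda u: 0.5, None]]                      # constant closing rate
theta = [[3, 1], [0, 4]]    # two compartments with four channels each
u_avgs = [0.0, 0.0]         # compartment averages of the membrane variable

print(total_rate(theta, u_avgs, q))
print(coordinate_values(theta))
```

Each coordinate value is a fraction of channels and therefore lies in [0, 1], and the total rate is bounded by the number of channels times the largest single-channel rate, in line with the remarks above.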
Thirdly, the family of abstract evolution equations (2.1) defining the dynamics of the PDMP's continuous component $U^n$ is given by the parabolic, linear, inhomogeneous second-order partial differential equations (7.6). Consistently with the deterministic limit system, we equip equation (7.6) with Dirichlet boundary conditions. Finally, the operators $A$, $B$, which depend on $\theta^n$ only via suitable coordinate functions, cf. (2.5), are defined by (7.7). To conclude, it is easy to see that the characteristics defined via the individual rates (7.3), the total jump rate (7.4) and the evolution equation (7.7) define a sequence of $L^2(D) \times K_n$-valued infinite-dimensional PDMPs $(U^n_t, \Theta^n_t)_{t \ge 0}$. Moreover, the membrane component $(U^n_t)_{t \ge 0}$ is almost everywhere pointwise bounded, i.e., $U^n_t(x) \in [u_-, u_+]$ for almost all $x \in D$ and all $t \ge 0$, where $u_- := \min_i E_i \le 0$ and $u_+ := \max_i E_i \ge 0$, for initial conditions $U^n_0$ satisfying these bounds, cf. [40, Sec. 3.2], which we always assume.
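The jump mechanism described above, in which exactly one channel in one compartment changes its state, can be sketched as a single sampling step. The sketch below makes a simplifying assumption that is not part of the model: the membrane variable is frozen between jumps, so the waiting time is exponential. In the actual PDMP the rates change along the deterministic flow. The function name `sample_jump` and the two-state rates are hypothetical.

```python
import random
from math import exp

def sample_jump(theta, u_avgs, q, rng):
    """Sample one jump of the channel configuration with frozen membrane variable.

    Returns the waiting time, the event (k, i, j) meaning that one channel in
    compartment k switches from state i to state j, and the new configuration.
    """
    events, weights = [], []
    for k, (th, u) in enumerate(zip(theta, u_avgs)):
        for i, count in enumerate(th):
            for j in range(len(th)):
                if i != j and count > 0:
                    events.append((k, i, j))
                    weights.append(count * q[i][j](u))   # individual rate
    total = sum(weights)                                 # total jump rate
    if total == 0.0:
        return float("inf"), None, [list(th) for th in theta]
    wait = rng.expovariate(total)
    k, i, j = rng.choices(events, weights=weights)[0]
    new_theta = [list(th) for th in theta]
    new_theta[k][i] -= 1     # the channel leaves state i ...
    new_theta[k][j] += 1     # ... and enters state j; no channel is created
    return wait, (k, i, j), new_theta

# Hypothetical two-state channel and configuration, for illustration only.
q = [[None, lambda u: 1.0 / (1.0 + exp(-u))],
     [lambda u: 0.5, None]]
theta = [[3, 1], [0, 4]]
u_avgs = [0.0, 0.0]

rng = random.Random(0)
wait, event, new_theta = sample_jump(theta, u_avgs, q, rng)
print(wait, event, new_theta)
```

The number of channels per compartment is unchanged by the step, which mirrors the conservation property of the jump component.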

Limit theorems for compartmental-type models
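Applying the limit theorems derived in Sections 4 and 5 to compartmental models, we find that the conditions therein translate into assumptions on the behaviour of the sequence of partitions $P_n$ and the number of ion channels in the membrane, see Appendix B. Thus, we denote by $\delta_+(n)$ the maximal diameter of the non-empty compartments in the $n$th model and by $\ell_+(n)$ and $\ell_-(n)$ the maximal and minimal number of channels in a non-empty compartment, i.e.,
$$\delta_+(n) := \max_{k=1,\ldots,p(n)} \operatorname{diam}(D_{k,n}), \qquad \ell_+(n) := \max_{k=1,\ldots,p(n)} l(k,n), \qquad \ell_-(n) := \min_{k=1,\ldots,p(n)} l(k,n).$$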
Then the law of large numbers takes the following form.
Theorem 7.1. Assume that the sequence of partitions satisfies
$$\lim_{n\to\infty} \delta_+(n) = 0, \qquad \lim_{n\to\infty} \ell_-(n) = \infty, \tag{7.8}$$
and that the initial conditions $(U^n_0, z^n(\Theta^n_0))$ converge in probability to $(u_0, p_0)$ in the space $L^2(D)^{m+1}$. Then the compartmental-type models converge in probability to the deterministic solution of the excitable media system (7.1), (7.2) in the sense that it holds for all ε > 0 that Moreover, the convergence also holds in the mean in the space L 2 ((0, T ), L 2 (D)), i.e., z n i (Θ n t ) − p(t) L 2 ((0,T ),L 2 ) = 0 . (7.10)

Next we present the appropriate quadratic variation process for the martingale central limit theorem. For the definition of the limiting diffusion we consider for $u, p_i \in C(D)$ the bilinear form

Proof. As stated in [25] it is sufficient for the statement of the proposition that the operator $G(u(t), p(t))$ is self-adjoint, positive and of trace class. These properties are easily verified and for a detailed proof we refer to [40].
In order to state the conditions on the partitions in the central limit theorem we define $\nu_+(n)$ and $\nu_-(n)$ to be the maximum and minimum Lebesgue measure of non-empty compartments, i.e.,
$$\nu_+(n) := \max_{k=1,\ldots,p(n)} |D_{k,n}|, \qquad \nu_-(n) := \min_{k=1,\ldots,p(n)} |D_{k,n}|.$$
Finally, note that in the following the coordinate functions $z^n$ are considered as maps from $K_n$ into the space $H^{-2s}(D)$.
Theorem 7.2. Let $s$ be the smallest integer such that $s > d/2$. If in addition to (7.8) and the convergence of the initial conditions the sequence of partitions satisfies

Remark 7.1. We note that for all reasonable physical domains $D$ and all initial conditions $(u_0, p_0)$, sequences of partitions $P_n$ and initial conditions $(U^n_0, \Theta^n_0)$ for the PDMPs can be found that satisfy the conditions of Theorems 7.1 and 7.2. For example, a suitable sequence of partitions is obtained from grids of uniform cubes with decreasing edge length covering the domain $D$, where channels are placed only in those cubes which are fully contained in $D$. For a more detailed discussion of these aspects we refer to the PhD thesis of one of the present authors [40].
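The cube-grid construction mentioned in Remark 7.1 can be made concrete in a small sketch. It assumes, purely for illustration, that the domain is the unit cube $[0,1]^d$, so that every grid cube is contained in the domain, and that every cube carries the same number of channels given by a hypothetical function `channels_per_cube`. The helper `balance_ratio` computes the quotient $\ell_-(n)\nu_-(n) / (\ell_+(n)\nu_+(n))$ that appears in the proof in Appendix B.

```python
from math import sqrt

def grid_statistics(n, d, channels_per_cube):
    """Partition statistics for the grid of n^d cubes of edge 1/n in [0, 1]^d."""
    edge = 1.0 / n
    count = channels_per_cube(n)      # same number of channels in every cube
    return {
        "delta_plus": edge * sqrt(d), # diameter of a cube of the given edge
        "nu_minus": edge ** d,        # all cubes have the same volume
        "nu_plus": edge ** d,
        "l_minus": count,
        "l_plus": count,
    }

def balance_ratio(stats):
    """Quotient l_- * nu_- / (l_+ * nu_+); equals 1 for a uniform grid."""
    return (stats["l_minus"] * stats["nu_minus"]) / (stats["l_plus"] * stats["nu_plus"])

# Hypothetical channel allocation: n channels per cube in the n-th model.
for n in (2, 4, 8, 16):
    s = grid_statistics(n, d=2, channels_per_cube=lambda n: n)
    print(n, s["delta_plus"], s["l_minus"], balance_ratio(s))
```

Under these assumptions the maximal diameter tends to zero and the minimal channel number tends to infinity, as required by (7.8), while the balance quotient stays equal to one.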

Conclusions
As general theoretical results for PDMPs we have derived a law of large numbers and a martingale central limit theorem in Sections 4 and 5 of this study. The former establishes a connection of stochastic hybrid models to deterministic models given, e.g., by systems of partial differential equations, whereas the latter connects the stochastic fluctuations in the hybrid models to diffusion processes. As a prerequisite to these limit theorems we carried out a thorough discussion of Hilbert space valued martingales associated to the PDMPs. Furthermore, these limit theorems provide the basis for a general Langevin approximation to PDMPs, i.e., certain stochastic partial differential equations that are expected to be similar in their dynamics to PDMPs. We have applied these results to compartmental-type models of spatially extended excitable membranes. Ultimately this yields a system of SPDEs which models the internal noise of a biological excitable membrane based on a theoretical derivation from exact stochastic hybrid models. Topics for further research are motivated by corresponding results in finite dimensions [30,35] and for spatially inhomogeneous chemical reaction systems converging to reaction-diffusion equations, cf. [25]. In these studies limit theorems are derived for the fluctuations around the deterministic limit identified by the law of large numbers. Using the notation of Section 5 we conjecture that the sequence of processes $\big(\sqrt{\alpha_n}\,(U^n_t - u(t),\, z^n(\Theta^n_t) - p(t))\big)_{t \ge 0}$, $n \in \mathbb{N}$, converges in distribution to a suitable diffusion process. Moreover, we conjecture that this limit is closely related to the asymptotic linearisation of the Langevin approximation around the solution of the deterministic limit, cf. [35], wherein this result is proven for finite-dimensional PDMPs.
Further, on the applications side we believe that the Langevin approximation to spatio-temporal PDMP models of excitable membranes is an important object for further investigation. Its derivation was the initial motivation for studying the limit theorems in the present work, and it is their main application herein, as they enable us to write down the system of SPDEs that constitutes a Langevin approximation. This system now demands further analysis; in particular, the question of existence and uniqueness of the Langevin approximation has to be addressed first. Subsequently, as SPDEs are analytically more accessible than PDMPs, a theoretical analysis of qualitative and quantitative properties of the models may be possible. Finally, we want to mention that the limit theorems presented also find applications beyond excitable membrane models. In current work in progress by one of the present authors the limit theorems derived in Sections 4 and 5 are applied to stochastic neural field equations, based on a model presented in [12], cf. a preliminary account in [39]. We also plan to investigate the connection to similar limits derived for reaction-diffusion models, cf. the series of results on variations of the model in [25,26,27,28] and [7,8,9,10,11]. An answer to this question would contribute to a more complete picture of limit theorems for spatio-temporal stochastic models.
Acknowledgements: During the time the presented work was accomplished M. Riedler was a PhD student at Heriot-Watt University supported by the EPSRC grant EP/E03635X/1. M. Riedler further acknowledges support from a joint UK Mathematical Neuroscience Network (MNN) and the Cell Signalling Network (SIGNET) travel grant.

A Proof of Theorem 3.1 (Itô-isometry)
In this proof we show that under condition (3.4) the processes M n j , j = 1, . . . , m, n ∈ N, defined in (3.1) are square-integrable, càdlàg martingales which satisfy the Itô-isometry (3.5). Throughout the proof we fix some j ∈ {1, . . . , m} and n ∈ N; the results hold for any such j and n. Therefore, speaking of a PDMP in the following always refers to the PDMP (U n t , Θ n t ) t≥0 corresponding to the fixed n. Further, for notational simplicity we omit the indices n and j discriminating processes and characteristics of PDMPs, i.e., M n j and z n j (Θ n ) are denoted simply by M and z(Θ). Finally, recall that τ k , k = 1, 2, . . ., denotes the sequence of increasing random jump times of the PDMP which are stopping times satisfying lim k→∞ τ k = ∞ almost surely.
First of all, note that the process M is càdlàg by definition. The proof of the remaining open results is split into three parts. In the first, part (a), we prove the martingale property for the real process (⟨φ, M (t)⟩ E ) t≥0 for every φ ∈ E * . Then, the first main statement of Theorem 3.1, the square-integrability of the process M (t), is proved in part (b). Moreover, as square-integrability implies integrability, the Hilbert space martingale property follows. Finally, the second main statement, the Itô-isometry (3.5), is established in part (c). The proof we present in part (b) is motivated by the proof of [22,Prop. 4.5.3] which states the corresponding results for real-valued martingales associated with PDMPs. In extending the method of proof employed therein to the present setup one has to ensure, on the one hand, that the employed results and estimation procedures all have corresponding analogues in the infinite-dimensional setting. On the other hand, one has to make sure that only the weaker regularity results available in infinite dimensions are used. Finally, the introduction of random initial conditions, not considered in [22], also necessitates some adaptations.

(a) First note that for all φ ∈ E * the real-valued processes φ, which is independent of u. It follows that the process ⟨φ, M (t)⟩ E is a local martingale if the map (A.2) is in the domain of the extended generator, cf. Theorem 2.1. Path-differentiability almost everywhere is trivially satisfied as the map t → ⟨φ, z(Θ t )⟩ E is piecewise constant. Hence, it remains to consider the integrability condition for which it is sufficient that cf. [13,15,22]. Using Young's inequality we obtain an upper bound to (A.3) by Here the first expectation is finite due to the PDMP being regular and the second is finite by an immediate consequence of assumption (3.4). Next, we show that the process is not only a local martingale but even a martingale.
As mentioned above the process φ, that M stopped at the first jump τ 1 is square-integrable. Subsequently in part (b.2) this result is extended to M stopped at any jump time τ k , k ∈ N. Then we are able to infer square-integrability of the process M . As square-integrability implies integrability it follows from part (a) that M is a Hilbert space valued martingale.
(b.1) Note that prior to τ 1 the jump component Θ of the PDMP remains constant. We introduce the notation Due to the structure of a PDMP we obtain for the conditional expectation with respect to the initial condition That is, the first term in the right hand side is the position of the stopped process M (τ 1 ∧ t) 2 E at time t if t < τ 1 times the conditional probability that the first jump does not occur before t.
The second term is its position after the jump integrated over the conditional density that a jump occurs in [0, t]. We apply integration by parts to the first term (note that N (0) = 0) and find that Therefore we obtain Due to the form of the derivative (A.4) the first two terms cancel and we are left with the equality Next we calculate the expectation of the real-valued process stopped at τ 1 . The process N 2 is connected to the process N defined at the beginning of part (b.1) inasmuch as the integrand of the former is the squared norm of the latter. Furthermore, note that N 2 is the term inside the expectation in the right hand side of the Itô-isometry (3.5).
Thus the aim is now to show that the conditional expectation of N 2 (t ∧ τ 1 ) equals the conditional expectation of M (t ∧ τ 1 ) 2 E . Again due to the particular structure of the PDMP we obtain for the conditional expectation Integration by parts applied to the integral term yields Therefore we obtain that A comparison of the right hand sides in equalities (A.5) and (A.6) shows that they are equal and thus we obtain after taking the expectation of both conditional expectations that As N 2 is increasing and thus N 2 (τ 1 ∧ t) ≤ N 2 (t) almost surely, we obtain that the right hand side in this equation is finite due to condition (3.4). Note that (A.7) is the Itô-isometry (3.5) for the stopped process M (t ∧ τ 1 ).

(b.2) In this part of the proof we show the square-integrability for the process M stopped at an arbitrary jump time τ k , k ∈ N, and finally for the non-stopped process M . To this end we first note that Analogously to part (b.1) we find that Thus taking expectations on both sides of this equality yields where the right hand side is finite as due to (3.4) both expectations are finite. By induction we next show that each M (τ k ∧ t) is square-integrable. Assume that E M (τ k ∧ t) 2 E < ∞, where the induction basis for k = 1 holds due to part (b.1). Then the reverse triangle inequality yields that Here the right hand side is finite due to (A.8) and an application of Young's inequality to the product in the left hand side yields that for all ε > 0 Then choosing ε < 1/2 we obtain a contradiction due to the induction hypothesis. In a final step of this part of the proof we show square-integrability for the non-stopped process. Using Fatou's Lemma and monotone convergence for interchanging limits and expectation we obtain the following upper estimate where the final term is finite due to condition (3.4). Moreover, as square-integrability implies integrability, the martingale property for the Hilbert space valued process M now follows due to part (a).
(c) Finally, in the last part of the proof we establish the Itô-isometry. To this end we first show that equality (A.7) holds for all τ k ∧ t, k ∈ N. Again we proceed by induction with the induction basis given by (A.7). We observe that Taking the conditional expectation with respect to the stopped σ-field F τ k ∧t we find that the second term in the right hand side of (A.10) vanishes as it holds due to the following properties of the conditional expectation: Firstly, for E-valued random variables Secondly, the Optional Sampling Theorem, i.e., E[M (τ k+1 ∧ t) | F τ k ∧t ] = M (τ k ∧ t) in the above application, also holds for Hilbert space-valued martingales. Thus we obtain Taking the expectation on both sides of this equality and using the induction hypothesis, i.e., that the second expectations on both sides of the above equality coincide, yields We conclude the proof by extending the Itô-isometry (A.11) from the stopped processes to the non-stopped process. We have already obtained the upper estimate E M (t) 2 E ≤ E N 2 (t), cf. (A.9). Hence it remains to prove that a lower bound is given by the same term. As M (t) 2 E is a real-valued submartingale it holds for all k ≥ 1 due to the standard Optional Sampling Theorem for càdlàg submartingales, see, e.g., [22,App. B], that Hence, for k → ∞ we obtain by monotone convergence E M (t) 2 E ≥ E N 2 (t) which, combined with the upper bound (A.9), yields the Itô-isometry (3.5). The proof is completed.
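A scalar toy analogue may help to see what an isometry of this type asserts. The sketch below is not the setting of Theorem 3.1: it assumes a counting process with constant rate `lam` and unit jump heights, so that the compensated process is $N(t) - \lambda t$ and the integrated rate-weighted squared jump height is $\lambda t$. Under these assumptions the second moment of the compensated process should equal $\lambda t$, which the code checks by summing against the Poisson distribution. All names are hypothetical.

```python
from math import exp

def poisson_pmf(mean, kmax):
    """Poisson probabilities p_0, ..., p_kmax, computed recursively."""
    probs = [exp(-mean)]
    for k in range(1, kmax + 1):
        probs.append(probs[-1] * mean / k)
    return probs

def second_moment_compensated(lam, t, kmax=200):
    """E[(N(t) - lam * t)^2] for a Poisson process N with rate lam (truncated sum)."""
    mean = lam * t
    return sum((k - mean) ** 2 * pk
               for k, pk in enumerate(poisson_pmf(mean, kmax)))

def integrated_squared_jumps(lam, t, jump_height=1.0):
    """Integral over [0, t] of rate times squared jump height (constant integrand)."""
    return lam * jump_height ** 2 * t

lam, t = 2.0, 1.5
print(second_moment_compensated(lam, t), integrated_squared_jumps(lam, t))
```

The two printed numbers agree up to truncation and rounding, which is the scalar counterpart of equating the second moment of the martingale with the expectation of the integrated squared jump heights.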

B.1 Proof of Theorem 7.1 (Conditions for the LLN)
We apply Theorem 4.1 for the choice of spaces X = H 1 0 (D), H = L 2 (D) and E = L 2 (D). Hence, we have to prove in the following that the assumptions therein are satisfied, i.e., (i) the one-sided Lipschitz condition (4.3) on the operators A and B defined by (7.7), (ii) the Lipschitz condition on the right hand side of the gating system (7.2), (iii) the uniform convergence of the generator and (iv) the martingale convergence. Finally, in (v) we extend the convergence in probability due to Theorem 4.1 to convergence in the mean (7.10). In the following we use · to denote the pointwise product of real functions on D.
(i) For the non-linear operator B we find that the left hand side in the Lipschitz condition is for almost all t given by a finite sum of terms with u, v ∈ H 1 0 (D) and p i , p i ∈ L 2 (D). Hence, the duality pairing corresponds to the inner product in L 2 (D). We estimate each summand of the type (B.1) separately. Using the triangle inequality we obtain Here, the first term in this right hand side is further estimated using Cauchy-Schwarz and Young's inequality, which yields For the second term we obtain, making use of the triangle inequality, Cauchy-Schwarz and Young's inequality and the pointwise bounds on p i and v, the sequence of estimates A summation over all these estimates for i = 1, . . . , m yields Adding the estimate for some γ 1 , γ 2 > 0, which holds as the linear operator A is coercive and independent of p, we obtain for a suitable constant C. Finally, integrating over (0, T ) we find that the one-sided Lipschitz condition (4.3) is satisfied.
(ii) Due to the triangle inequality it suffices to consider differences of the form p i · q(u) − p i · q(v) L 2 , where q substitutes for an arbitrary rate function q jk . Using the triangle inequality, the pointwise boundedness of $p_i$ and $q$ by $1$ and $\overline{q}$, respectively, and the Lipschitz condition on the rate functions $q$ (with common Lipschitz constant $L$) we obtain A summation over all such separate estimates, integrating and squaring both resulting sides yields the Lipschitz condition (4.4).
(iii) In order to prove the convergence of the generators (4.6) we employ in the following two technical results which we collect in a separate proposition. Firstly, the purpose of the formula (B.2) is to transform the generator of the PDMP into a form that allows comparison with the deterministic limit system (7.2). Secondly, the inequality (B.3), which bounds the norm U n L 2 ((0,T ),H 1 ) by a deterministic constant uniformly over n ∈ N, is used repeatedly in the subsequent estimation procedures.
Proposition B.1. (a) The generator of the PDMP satisfies (B.2).

(b) For all n ∈ N and all T > 0 the norm of U n in L 2 ((0, T ), H 1 ) satisfies the bound (B.3), where the constants C 1 , C 2 are deterministic and independent of n ∈ N.
Proof. (a) We denote by θ n k,i→j for all k = 1, . . . , p(n) and all i ≠ j, i, j = 1, . . . , m the configuration in K n that arises from the configuration θ n through the event that a channel in state i located in the compartment D k,n switches to state j. Then simple reorganisation of finite sums yields Thus we obtain that the generator satisfies (B.2).

(b) By definition of a PDMP it holds that the component (U n t ) t≥0 is the weak solution of the evolution equation (7.6) with initial condition U n 0 . We consider the reaction term in this equation as a given inhomogeneity. Then standard estimation procedures from the theory of linear parabolic partial differential equations, cf. [17,Sec. 7], yield, after appropriately estimating the inhomogeneous term, where the constants K 1 , K 2 are deterministic and depend only on the domain D and the coefficients of A. Further, it holds that z n i (Θ n t ) L 1 ≤ |D| and the sequence of initial conditions is bounded by assumption as U n 0 (x) ∈ [u − , u + ] for all x ∈ D almost surely. The inequality (B.3) follows.
We now proceed to the actual proof of the convergence (4.6). To this end we need to consider for almost every t and all i = 1, . . . , m, the convergence in L 2 (D) of (B.2) to F i (z n (Θ n t ), U n t ) where F i is as defined in (7.2). That is, we have to estimate We find that the single summands in the two summations match up and thus it suffices to consider each of them separately. Employing the boundedness of the coordinate functions, i.e., z n j (Θ n t ) L ∞ ≤ 1 we obtain the estimates For the last equality we have used that the summands are mutually orthogonal in L 2 (D). Next we estimate each of the remaining integrals in (B.5) using the Lipschitz continuity of q ij and Poincaré's inequality in L 2 (D k,n ), i.e., where ∇U n t L 2 (D k,n ) is the norm in L 2 (D k,n ) of the Euclidean norm of the gradient vector ∇U n t . Here we have employed that for convex domains the optimal Poincaré constant is given by π −1 diam(D k,n ) [36]. Hence, a summation over all k = 1, . . . , p(n) and employing the estimate ∇U n t 2 Integrating over (0, T ) we therefore obtain for (B.4) the estimate Finally, the norm U n t L 2 ((0,T ),H 1 ) is bounded independently of n ∈ N by a deterministic constant due to Proposition B.1(b). This upper bound holds for almost all paths of the PDMPs (U n t , Θ n t ) t≥0 and thus there exists a constant C > 0 independent of n such that almost surely. Due to the assumption (7.8) the estimate in the right hand side converges to zero for n → ∞ and the convergence (4.6) follows.
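The Poincaré inequality on convex domains with constant $\pi^{-1}\operatorname{diam}(D_{k,n})$, which drives the estimate in part (iii), can be checked numerically in one space dimension. The sketch below is only an illustration on an interval $[a, b]$: it approximates the quotient of $\|f - \bar f\|_{L^2}$ and $\pi^{-1}(b-a)\,\|f'\|_{L^2}$ by the midpoint rule for two hypothetical test functions. The inequality states that this quotient is at most one; the cosine mode is the case in which the constant is attained.

```python
from math import cos, sin, pi, sqrt

def poincare_ratio(f, df, a, b, steps=20000):
    """Approximate ||f - mean(f)||_{L2(a,b)} / ((b - a) / pi * ||f'||_{L2(a,b)})."""
    h = (b - a) / steps
    xs = [a + (i + 0.5) * h for i in range(steps)]        # midpoint rule nodes
    mean = sum(f(x) for x in xs) * h / (b - a)            # average of f
    deviation = sqrt(sum((f(x) - mean) ** 2 for x in xs) * h)
    gradient = sqrt(sum(df(x) ** 2 for x in xs) * h)
    return deviation / ((b - a) / pi * gradient)

a, b = 0.0, 0.1   # a small "compartment" of diameter 0.1

# Linear test function: the quotient stays strictly below one.
ratio_linear = poincare_ratio(lambda x: x, lambda x: 1.0, a, b)

# First cosine mode on [a, b]: the quotient is (numerically) equal to one.
ratio_cosine = poincare_ratio(lambda x: cos(pi * (x - a) / (b - a)),
                              lambda x: -pi / (b - a) * sin(pi * (x - a) / (b - a)),
                              a, b)
print(ratio_linear, ratio_cosine)
```

Since the bound scales with the diameter of the compartment, the deviation of a function from its compartment average is controlled by $\delta_+(n)$ times a gradient norm, which is the mechanism by which the estimates above vanish as the partition is refined.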
(iv) Next we consider convergence in probability of the martingale part. To this end we employ Lemma 3.1. As before we denote by θ n k,i→j the channel configuration that arises from the configuration θ n if a channel in compartment D k,n switches from state i to state j. Then it holds that This implies that Hence, under condition (7.8) the assumption of Lemma 3.1 is satisfied.
(v) Finally, we extend the convergence in probability to convergence in the mean for the individual components being in the space L 2 ((0, T ), L 2 ), see the remark following  (D) is embedded in C(D) due to the Sobolev Embedding Theorem. These two properties are essential in order to prove the conditions (5.3) -(5.7) of Theorem 5.1. All conditions except (5.6), which establishes the convergence of the quadratic variation, are straightforward consequences of the assumptions of the theorem. These are shown in part (i) of the subsequent proof. For condition (5.6) more involved estimation procedures are necessary which are presented in part (ii).
(i) We first show condition (5.3). As in the preceding section $\theta^n_{k,i\to j}$ denotes the element of $K_n$ that differs from $\theta^n$ by one channel in the $k$th compartment being in state $j$ instead of state $i$. Then, the Sobolev Embedding Theorem yields the estimate
$$\big\| z^n_i(\theta^n_{k,i\to j}(t)) - z^n_i(\Theta^n_t) \big\|_E = \sup_{\|\phi\|_{H^{2s}} = 1} l(k,n)^{-1} \big| \langle \phi, I_{D_{k,n}} \rangle_{H^{2s}} \big| \le \frac{C}{l(k,n)} \, |D_{k,n}|, \tag{B.7}$$
where $C$ is a constant resulting from the continuous embedding of $H^{2s}(D)$ into $C(D)$. Using this estimate for the jump heights in the space $H^{-2s}(D)$ we find similarly to part (iv) of the proof of Theorem 7.1 that it holds α n E n T 0 Λ n (U n t , Θ n t )

(ii) In the second part of the proof we establish the central condition (5.6) of the convergence of the quadratic variation. For simplicity of notation we omit the time argument of the PDMP paths and the deterministic solution as the following estimates hold for almost all t. First of all we expand the quadratic variation of the martingales into the finite sum We next estimate these terms separately in parts (ii.1) and (ii.2). Finally, in part (ii.3) the estimates are combined to prove the convergence of the quadratic variation.
(ii.1) A further application of the triangle inequality yields
$$
\begin{aligned}
\text{(B.8)} \le{} & \Big| \int_D p_j(x)\, q_{ji}(u(x))\, \phi_j^2(x)\,dx - \int_D z^n_j(\Theta^n)(x)\, q_{ji}(U^n(x))\, \phi_j^2(x)\,dx \Big| \\
& + \Big| \sum_{k=1}^{p(n)} \frac{\Theta^{k,n}_j}{l(k,n)} \int_{D_{k,n}} q_{ji}(U^n(x))\, \phi_j^2(x)\,dx - \alpha_n \sum_{k=1}^{p(n)} \frac{\Theta^{k,n}_j}{l(k,n)^2}\, Q^{k,n}_{ji}(U^n)\, \langle \phi_j, I_{D_{k,n}} \rangle_E^2 \Big|.
\end{aligned} \tag{B.10}
$$
We estimate the two resulting differences separately and obtain for the first term in the right hand side of (B.10) the estimate For the second term in the right hand side of (B.10) we obtain by employing $\Theta^{k,n}_j / l(k,n) \le 1$ the estimate
$$\sum_{k=1}^{p(n)} \Big| \int_{D_{k,n}} q_{ji}(U^n(x))\, \phi_j^2(x)\,dx - \frac{\alpha_n}{l(k,n)}\, q_{ji}\Big( \frac{1}{|D_{k,n}|} \int_{D_{k,n}} U^n(x)\,dx \Big) \Big( \int_{D_{k,n}} \phi_j(x)\,dx \Big)^2 \Big| \tag{B.12}$$
and we continue estimating each summand therein separately. We begin by employing the Mean Value Theorem to expand the rate function $q_{ji}$ in the integral in the left hand side such that
$$q_{ji}(U^n(x)) = q_{ji}\Big( \frac{1}{|D_{k,n}|} \int_{D_{k,n}} U^n(y)\,dy \Big) + q'_{ji}(\vartheta_{k,n}(x)) \Big( U^n(x) - \frac{1}{|D_{k,n}|} \int_{D_{k,n}} U^n(y)\,dy \Big), \tag{B.13}$$
where $\vartheta_{k,n}(x)$ denotes an appropriate mean value. For now we omit the remainder term, i.e., the second term in the right hand side of (B.13), a consideration of which is deferred. Hence, we obtain for the absolute value in each summand in (B.12) the estimate
$$q_{ji}\Big( \frac{1}{|D_{k,n}|} \int_{D_{k,n}} U^n(y)\,dy \Big) \Big| \int_{D_{k,n}} \phi_j^2(x)\,dx - \frac{\alpha_n}{l(k,n)} \Big( \int_{D_{k,n}} \phi_j(x)\,dx \Big)^2 \Big|.$$
We note that $q_{ji}$ is bounded by $\overline{q}$ and continue estimating, which yields
$$
\begin{aligned}
& \le \overline{q}\, |D_{k,n}| \, \Big| \frac{1}{|D_{k,n}|} \int_{D_{k,n}} \phi_j^2(x)\,dx - \frac{\alpha_n |D_{k,n}|^2}{l(k,n)\,|D_{k,n}|} \Big( \frac{1}{|D_{k,n}|} \int_{D_{k,n}} \phi_j(x)\,dx \Big)^2 \Big| \\
& \le \overline{q} \int_{D_{k,n}} \Big| \phi_j(x) - \frac{1}{|D_{k,n}|} \int_{D_{k,n}} \phi_j(y)\,dy \Big|^2 dx && \text{(B.14)} \\
& \quad + \overline{q}\, |D_{k,n}| \, \Big| 1 - \frac{\alpha_n |D_{k,n}|^2}{l(k,n)\,|D_{k,n}|} \Big| \Big( \frac{1}{|D_{k,n}|} \int_{D_{k,n}} \phi_j(x)\,dx \Big)^2. && \text{(B.15)}
\end{aligned}
$$
The term (B.14) is estimated using Poincaré's inequality, which yields an upper bound by $\overline{q}\, \pi^{-2} \operatorname{diam}^2(D_{k,n})\, \|\nabla \phi_j\|^2_{L^2(D_{k,n})}$. For the terms (B.15) a summation over all $k = 1, \ldots, p(n)$ yields
$$\overline{q} \sum_{k=1}^{p(n)} |D_{k,n}| \, \Big| 1 - \frac{\alpha_n |D_{k,n}|^2}{l(k,n)\,|D_{k,n}|} \Big| \Big( \frac{1}{|D_{k,n}|} \int_{D_{k,n}} \phi_j(x)\,dx \Big)^2 \le \overline{q}\, \Big| 1 - \frac{\ell_-(n)\, \nu_-(n)}{\ell_+(n)\, \nu_+(n)} \Big| \, \|\phi^n_j\|^2_{L^2}, \tag{B.16}$$
where $\phi^n_j$ is a piecewise constant approximation to $\phi_j$ defined by
$$\phi^n_j := \sum_{k=1}^{p(n)} \Big( \frac{1}{|D_{k,n}|} \int_{D_{k,n}} \phi_j(x)\,dx \Big) I_{D_{k,n}}.$$
As $\phi^n_j$ converges to $\phi_j$ in $L^2(D)$ it holds that the sequence of norms converges, hence $\|\phi^n_j\|_{L^2}$ is a bounded sequence. Therefore the right hand side in (B.16) is a componentwise product of convergent sequences. The sequence $|1 - \ell_-(n)\,\nu_-(n) / (\ell_+(n)\,\nu_+(n))|$ converges to zero, cf. condition (7.12), thus the right hand side in (B.16) converges to zero for $n \to \infty$. Finally, it remains to consider the term arising from the remainder in the expansion of $q_{ji}$, see (B.13), inserted into (B.12). By assumption $q'_{ji}$ is bounded (by a constant q). Therefore we obtain an upper bound on the respective term by Here we have employed the Poincaré inequality in $L^1$ with optimal Poincaré constant given by $\operatorname{diam}(D_{k,n})/2$ [1]. A combination of these estimates yields an upper bound to (B.8) by
$$\text{(B.8)} \le C_\Phi \Big( \|p_j - z^n_j(\Theta^n)\|_{L^1} + \|u - U^n\|_{L^1} + \frac{\delta_+(n)}{2}\, \|\nabla U^n\|_{L^1} + \delta_+^2(n) + \delta_+(n) + R(n) \Big), \tag{B.17}$$
where the term $R(n)$ is given by the right hand side of (B.16) and converges to zero for $n \to \infty$. The constant $C_\Phi < \infty$ is a suitable deterministic constant independent of $n \in \mathbb{N}$ which depends on $\Phi \in (H^{2s}(D))^m$ via the norm in $H^s(D)$ of the components of $\Phi$.

(ii.2) Next we consider the mixed terms (B.9). Analogously to part (ii.1) we apply the triangle inequality and obtain
$$\ldots \; \frac{\Theta^{k,n}_j}{l(k,n)^2}\, Q^{k,n}_{ji}(U^n)\, \langle \phi_j, I_{D_{k,n}} \rangle_E \, \langle \phi_i, I_{D_{k,n}} \rangle_E$$
As in (ii.1) we obtain for the first term in this right hand side an upper bound by Also the second term is treated as in (ii.1), i.e., applying the Mean Value Theorem and estimating the resulting terms accordingly. In particular the remainder term is estimated completely analogously. Therefore, the only term we are left to estimate is
$$\ldots \; + \overline{q}\, |D_{k,n}| \, \Big| 1 - \frac{\alpha_n |D_{k,n}|^2}{l(k,n)\,|D_{k,n}|} \Big| \Big( \frac{1}{|D_{k,n}|} \int_{D_{k,n}} \phi_i(x)\,dx \Big) \Big( \frac{1}{|D_{k,n}|} \int_{D_{k,n}} \phi_j(x)\,dx \Big). \tag{B.19}$$
First of all, using Young's inequality we obtain for the second term the estimate which converges to zero for n → ∞. We next estimate the term (B.18). Firstly, we note that as in part (a) we find using Poincaré's inequality an upper bound to the term and the upper bound is proportional to δ + (n) 2 . Next, expanding the two squared terms in (B.21) we find using the reverse triangle inequality that the term (B.21) is an upper bound to Thus also this term possesses an upper bound which is proportional to δ + (n) 2 . For n → ∞ the upper bound converges to zero. As for δ + (n) → 0 also the term spanning the first and second line converges to zero which was established in (ii.1), necessarily also the term in the third line converges to zero. Therefore we infer that the term (B.18) converges to zero proportional to δ + (n) 2 . Now, a combination of these estimates yields analogously to (B.17) in (ii.1) that (B.9) ≤ C Φ p j − z n j (θ n ) L 2 + u − U n L 2 + δ(n) 2 ∇U n L 1 + δ 2 (n) + δ(n) + R(n) . (B.22) Here R(n) is a term converging to zero for n → ∞ arising from (B.20) and it is of the same type as the term R(n) in (ii.1). The deterministic constant C Φ is independent of n ∈ N and depends on Φ via the norm in H s (D) of the components of Φ.
p i (t)−z n i (Θ n t ) L 2 + u(t)−U n (t) L 2 +δ(n) 2 ∇U n L 2 +δ 2 (n)+δ(n)+R(n)

Here we have also employed the continuous embedding L 2 (D) ↪ L 1 (D). We next square both sides of this inequality and integrate over (0, T ). Afterwards we take the square root of the integral terms and further take the expectation of the resulting inequality. Finally, appropriate applications of Jensen's inequality yield that T 0 E n Φ, G(u(t), p(t)) Φ E − α n Φ, G n (U n t , Θ n t ) Φ E dt (B.23) ≤ C Φ,T δ 2 (n)+δ(n)+R(n)+E n u−U n L 2 ((0,T ), for an appropriate constant C Φ,T < ∞. Note that in order to arrive at the estimate (B.23) we have further employed that the random term ∇U n L 2 ((0,T ),L 2 ) can be estimated by a deterministic bound independent of n ∈ N due to Proposition B.1 (b). Finally, due to the law of large numbers, i.e., Theorem 7.1, the sequence of PDMPs converges to the deterministic limit in the mean. Hence the expectation in the right hand side in (B.23) converges to zero for n → ∞. Furthermore, δ + (n) converges to zero by assumption (7.8), as does the term R(n). Thus, overall the right hand side in (B.23) converges to zero. This proves the convergence of the quadratic variation and completes the proof of Theorem 7.2.
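The piecewise constant approximation by compartment averages, which is used for the test functions in part (ii.1), can be illustrated numerically. The following sketch assumes a one-dimensional domain $[0, 1]$ with a uniform partition into `n` intervals and a hypothetical smooth test function; it computes the compartment averages and the $L^2$ distance between the function and its piecewise constant approximation by the midpoint rule. By Poincaré's inequality on each interval this distance is bounded by $(n\pi)^{-1}$ times the $L^2$ norm of the derivative.

```python
from math import sin, pi, sqrt

def cell_averages(f, n, sub=200):
    """Averages of f over the n cells of the uniform partition of [0, 1]."""
    h = 1.0 / n
    return [sum(f((k + (s + 0.5) / sub) * h) for s in range(sub)) / sub
            for k in range(n)]

def l2_error(f, n, sub=200):
    """L2(0, 1) distance between f and its piecewise constant approximation."""
    h = 1.0 / n
    total = 0.0
    for k, avg in enumerate(cell_averages(f, n, sub)):
        for s in range(sub):
            x = (k + (s + 0.5) / sub) * h          # midpoint rule node in cell k
            total += (f(x) - avg) ** 2 * h / sub
    return sqrt(total)

def phi(x):
    """Hypothetical test function; its derivative has L2 norm sqrt(2) * pi."""
    return sin(2.0 * pi * x)

for n in (4, 8, 16):
    # Poincare bound: (1 / (n * pi)) * sqrt(2) * pi = sqrt(2) / n
    print(n, l2_error(phi, n), sqrt(2.0) / n)
```

The error decreases roughly in proportion to the cell diameter, which is consistent with the convergence of the piecewise constant approximations in $L^2(D)$ as the partition is refined.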