Interacting diffusions and trees of excursions: convergence and comparison

We consider systems of interacting diffusions with local population regulation. Our main result shows that the total mass process of such a system is bounded above by the total mass process of a tree of excursions with appropriate drift and diffusion coefficients. As a corollary, this entails a sufficient, explicit condition for extinction of the total mass as time tends to infinity. On the way to our comparison result, we establish that systems of interacting diffusions with uniform migration between finitely many islands converge to a tree of excursions as the number of islands tends to infinity. In the special case of logistic branching, this leads to a duality between the tree of excursions and the solution of a McKean-Vlasov equation.


Introduction
In population dynamics and population genetics a prominent role is played by diffusion processes on [0, ∞)^G or [0, 1]^G driven by stochastic differential equations (SDEs) of the form

dX_t(i) = ( Σ_{j∈G} m(j, i) X_t(j) − X_t(i) + µ(X_t(i)) ) dt + √(σ²(X_t(i))) dB_t(i),  (1)

for t ≥ 0, where G is a finite or countable set, (B_t(i))_{t≥0}, i ∈ G, are independent standard Brownian motions and (m(j, i))_{j,i∈G} is a stochastic matrix. We will refer to the solution (X_t)_{t≥0} of (1) as the (G, m, µ, σ²)-process. In appropriate timescales and for suitable choices of µ and σ², the component X_t(i) describes the (rescaled) population size on island i ∈ G at time t ≥ 0, or the relative frequency of a genetic type present on island i ∈ G at time t ≥ 0. The linear interaction term on the right-hand side of (1) models a mass flow between the islands, which might be caused by migration of individuals or by a flow of genes. Here we will use the picture from population dynamics. The coefficient σ²(x) then is the infinitesimal variance of the local population size given its current value x ∈ [0, ∞). A classical case is Feller's branching diffusion, where σ²(x) = const · x for x ∈ [0, ∞). Moreover, the drift term µ describes the growth rate of a local population apart from immigration from other islands and emigration. A prototypical example is µ(x) = x(K − x), i.e. logistic growth.
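To fix ideas, the dynamics (1) can be discretized in a few lines. The following Euler-Maruyama sketch is ours (the parameter names, the time step and the truncation at zero are choices made here, not part of the paper); it uses the logistic drift µ(x) = γx(K − x) and the branching noise σ²(x) = 2βx:

```python
import math
import random

# Illustrative Euler-Maruyama discretization of the system (1) with
# logistic drift mu(x) = gamma*x*(K - x) and branching noise
# sigma^2(x) = 2*beta*x. Our own sketch: dt, the parameter names and
# the clipping at zero are numerical choices, not part of the SDE.

def step(x, m, gamma, K, beta, dt, rng):
    n = len(x)
    # immigration: island i receives sum_j m[j][i]*x[j]; emigration: -x[i]
    flow_in = [sum(m[j][i] * x[j] for j in range(n)) for i in range(n)]
    new_x = []
    for i in range(n):
        drift = flow_in[i] - x[i] + gamma * x[i] * (K - x[i])
        noise = math.sqrt(max(2.0 * beta * x[i], 0.0) * dt) * rng.gauss(0.0, 1.0)
        new_x.append(max(x[i] + drift * dt + noise, 0.0))  # keep state in [0, infinity)
    return new_x

def simulate(x0, m, gamma, K, beta, T, dt, rng):
    x = list(x0)
    for _ in range(int(T / dt)):
        x = step(x, m, gamma, K, beta, dt, rng)
    return x
```

For uniform migration one takes m[j][i] = 1/N for all i, j. The clipping at zero is a crude device that keeps the discretized state nonnegative; the exact SDE needs no such truncation.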
Here we address the question of the maximal effect of the migration matrix (m(i, j))_{i,j∈G} for fixed µ and σ². We will show in Theorem 2 that the total mass process of (X_t)_{t≥0} is dominated by the virgin island model, which has been constructed in [19]. This dominating process depends neither on G nor on the migration matrix. The intuition which leads to a comparison result is as follows. Consider a model with supercritical population-size independent branching and additional deaths due to competition within each island of G, i.e., µ is a concave function with µ(0) = 0 and σ² is a linear function. Now compare different distributions of individuals over space. If there is at most one individual on each island, then there are no deaths due to competition until the next birth or migration event. If, however, all individuals are on the same island, then there are deaths due to competition. The effect of the population-size independent branching is the same in both situations. We infer that more individuals survive if the distribution of individuals is more uniform over space. The migration dynamics which distributes mass most uniformly over space is uniform migration on G. As there is no uniform migration on an infinite set G, we approximate G with larger and larger finite subsets and consider uniform migration on these finite subsets. This intuition leads to the assertion (2) that the total mass of the (G, m, µ, σ²)-process is dominated by the total mass of an N-island model. For N ∈ N := {1, 2, . . .}, the N-island model (X^N_t)_{t≥0} is the solution of (1) with G := {1, . . . , N} and m(i, j) := 1/N for i, j ∈ G. We will refer to this N-island model as the (N, µ, σ²)-process. The initial configurations X^N_0 are assumed to converge suitably, as N → ∞, to a sequence (x(i))_{i∈N} whose sum is finite.
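For the logistic drift µ(x) = γx(K − x), the gain from spreading mass over space can be made explicit (an elementary computation, included here for illustration):

```latex
\mu(x)+\mu(y)-\mu(x+y)
  \;=\; \gamma\bigl[x(K-x)+y(K-y)-(x+y)(K-x-y)\bigr]
  \;=\; 2\gamma xy \;\ge\; 0 .
```

So splitting a population of total size x + y over two islands increases the total infinitesimal growth rate by 2γxy, in line with the intuition that competition losses are largest when mass is concentrated.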
Our first main result (Theorem 1) establishes convergence of (X^N_t)_{t≥0} as N → ∞. The limiting process turns out to be a forest of (mass) excursions started in x(i), i ∈ N. This forest of excursions has been constructed and analysed in [19], where it is called the virgin island model. In this model, every migrant populates a new island, similarly to the infinite-allele model in which every mutation produces a new allele. Jean Bertoin turned this idea into a tree of alleles [7] and a partition into colonies [6] (see also Section 7 of Pardoux and Wakolbinger (2010) for a connection with the virgin island model). In particular, Bertoin [7] shows in the case of independent branching that the virgin island model is the diffusion approximation of a discrete mass branching process. In Section 2 we review and discuss the virgin island model.
Here is a brief heuristic of how a forest of (mass) excursions emerges from the (N, µ, σ²)-processes as N → ∞. For finite N, every island continuously sends migrants to every other island. Every once in a while (in fact, densely in time), a migrant is successful and founds offspring which immediately find themselves interacting with the resident population on that island. For a bounded initial mass, the probability that two "large" populations descending from two different founders coexist at the same time on one and the same island turns out to be negligible as N → ∞. Thus, in the limit N → ∞, every emigrant populates a previously unpopulated island. The evolution of the population size on every freshly populated island is described by a random (mass) excursion. These mass excursions are born (densely in time) in a Poissonian manner on ever new islands with an intensity proportional to the currently extant mass, and, once born, follow the SDE

dY_t = ( −Y_t + µ(Y_t) ) dt + √(σ²(Y_t)) dB_t.  (3)

Formally, this is described by means of the excursion measure Q associated with (3) in the sense of Pitman and Yor (1982) (see also [19]). The intensity measure with which a path (z_t)_{t≥0} spawns a "daughter" excursion born at time t ≥ 0 is z_t dt Q. The roots of the forest are random paths Y^{i,x(i)}, i ∈ N, which are independent solutions of (3) with Y^{i,x(i)}_0 = x(i). Note that Figure 1 does not contain the whole tree. In fact, islands are populated by emigrants densely in time, but only finitely many of the excursions started by these emigrants reach a given strictly positive height. A noteworthy observation is that the tree structure provides us with independence of disjoint subtrees. Put differently, the virgin island model is a branching process in discrete time in the sense of Jiřina (1958), except that there are now infinitely many types, one for each excursion path. Due to this branching structure, the virgin island model is easier to study than the N-island process.
Several authors have considered analogues of the virgin island model in the case of state-independent branching; see [1,5,6,7,13,18,25,36,41] for a selection of articles.
To state our convergence result of Theorem 1 more formally, we associate with the (N, µ, σ²)-process, for every N ∈ N, its population size spectrum, and similarly define the population size spectrum of the virgin island model; see (22) below for the precise definition. Our convergence result asserts that the population size spectra converge in distribution as N → ∞; this is the convergence (6), stated precisely in Theorem 1. Now we state the comparison result (2) more precisely. Recall that µ is the infinitesimal mean in a non-spatial situation. We assume µ to be subadditive. Then a population of size x that is separated into two islands experiences (in sum) a larger growth rate than a population of the same size that is concentrated on one island. Thus the virgin island model should offer, in expectation, a more prolific evolution of the total mass than a model (1) with the same coefficients µ and σ². The infinitesimal variance σ² has an impact on a comparison in distribution. More precisely, the stochastic order in (2) depends on whether σ² is superadditive, additive or subadditive. In our prototype example of additive σ², the stochastic order is the usual increasing order: we then have

E[ f( Σ_{i∈G} X_t(i) ) ] ≤ E[ f(V_t) ]  (7)

for every non-decreasing function f : [0, ∞) → [0, ∞), where V_t denotes the total mass of the virgin island model at time t. If σ² is superadditive or subadditive, then we will use a concave increasing order and a convex increasing order, respectively. In fact, Theorem 2 is a comparison result not only for the one-dimensional, but for the finite-dimensional distributions. Thereby, results on the distribution of the total mass of the virgin island model have an immediate impact on the distribution of interacting diffusions with local population regulation. As a very special application of our comparison result, we obtain a sufficient condition for extinction of interacting diffusions with local population regulation. Here we speak of global extinction if the total mass Σ_{i∈G} X_t(i) converges to zero in distribution as t → ∞ whenever Σ_{i∈G} X_0(i) < ∞.
Theorem 2 of [19] shows that global extinction of the virgin island model is equivalent to

∫₀^∞ y/(σ²(y)/2) · exp( ∫₀^y (−x + µ(x))/(σ²(x)/2) dx ) dy ≤ 1.  (8)
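For logistic branching, µ(x) = γx(K − x) and σ²(x) = 2βx, the integrand of condition (8) simplifies: y/(σ²(y)/2) = 1/β and the inner integral equals ((γK − 1)y − γy²/2)/β. The following sketch (our own numerical illustration; the cutoff, step size and parameter values are arbitrary choices) evaluates the criterion by a plain trapezoidal rule:

```python
import math

# Numerical check of the extinction criterion for logistic branching,
#   (1/beta) * Integral_0^infty exp(((gamma*K - 1)*y - gamma*y**2/2)/beta) dy <= 1,
# obtained by specializing mu(x) = gamma*x*(K - x), sigma^2(x) = 2*beta*x.
# Plain trapezoidal rule; cutoff and step count are our choices.

def extinction_integral(gamma, K, beta, cutoff=200.0, n=200000):
    h = cutoff / n
    total = 0.0
    for i in range(n + 1):
        y = i * h
        w = 0.5 if i in (0, n) else 1.0
        total += w * math.exp(((gamma * K - 1.0) * y - gamma * y * y / 2.0) / beta)
    return total * h / beta

def dies_out(gamma, K, beta):
    return extinction_integral(gamma, K, beta) <= 1.0
```

For instance, γ = 1, K = 2, β = 1 gives an integral of roughly 3.48 > 1 (survival is possible), while the noisier, weaker-growth choice γ = 1, K = 0.5, β = 5 gives a value below 1 (global extinction).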
As a consequence of (7), if condition (8) holds, then the total mass of the (G, m, µ, σ²)-process converges to zero in distribution as t → ∞. In addition, [19] implies an upper bound for the growth rate of Σ_{i∈G} X_t(i) as t → ∞ if (8) fails to hold.

The limit of the N-island model is well-known if the random variables {X^N_0(i) : i ≤ N} are exchangeable with common law ν, say. Here the situation is different due to the unbounded total mass. Emigrants still choose to migrate to island 1, say, with rate 1/N, which is small in the limit N → ∞. However, the total mass is of order N in the case of a nontrivial exchangeable initial configuration. Consequently there is a continuous immigration of migrants onto island 1 even in the limit N → ∞. Moreover, the dynamics of the N-island model preserves exchangeability. Therefore the rate of immigration converges to the expected average over all islands as N → ∞ due to de Finetti's theorem. This leads to the following limiting process. The N-island process (X^N_t)_{t≥0} converges in distribution as N → ∞ to the solution (M_t)_{t≥0} of the McKean-Vlasov equation

dM_t = ( E[M_t] − M_t + µ(M_t) ) dt + √(σ²(M_t)) dB_t;  (9)

see e.g. Theorem 1.4 in [40] for the case σ² ≡ 1 and µ bounded and globally Lipschitz continuous. Thus the components of the N-island process become independent in the limit N → ∞. Kac (1957) described the first known instance of this phenomenon and called it propagation of chaos. After McKean (1967), propagation of chaos has been studied intensively in the mathematical literature; see Sznitman's St. Flour lecture notes [40]. Our convergence (6) is another instance of propagation of chaos. Here independence in the limiting object is independence of disjoint subtrees.
In the prototype example of logistic branching, the McKean-Vlasov equation (9) and the virgin island model can be interpreted as forward and backward process. Corollary 2 shows for that case that the solution of the McKean-Vlasov equation and the total mass process are dual to each other. This duality derives from a self-duality of the N -island model. This self-duality in turn derives from pathwise forward and backward processes in a graphical representation, see [3]. So in our prototype example, the virgin island model may be interpreted as backward process of the solution of the McKean-Vlasov equation.
Here is a selection of models for which one could think of a similar comparison result. Mueller and Tribe (1994) investigate a one-dimensional SPDE analog of interacting Feller branching diffusions with logistic growth. Bolker and Pacala (1997) propose a branching random walk in which the individual mortality rate is increased by a weighted sum of the entire population. Etheridge (2004) studies two diffusion limits hereof. The "stepping stone version of the Bolker-Pacala model" is a system of interacting Feller branching diffusions with non-local logistic growth. The "superprocess version of the Bolker-Pacala model" is an analog of this in continuous space. Blath, Etheridge and Meredith (2007) study a two-type version hereof, which is a spatial extension of the classical Lotka-Volterra model. Fournier and Méléard (2004) generalize the model of Bolker and Pacala (1997) by allowing spatial dependence of all rates. A model in discrete time and discrete space is constructed in Birkner and Depperschmidt (2007). In that paper an individual has a Poisson number of offspring with mean depending on the current configuration and, once created, offspring take an independent random walk step from the location of their mother.

The virgin island model
The virgin island model without immigration has been introduced in [19]. Here we slightly generalize this model by adding independent immigration of mass. The virgin island model is an analog of (1) in which every emigrant populates a new island. Islands with positive mass at time zero evolve as the one-dimensional diffusion (Y_t)_{t≥0} solving (3). The following assumption guarantees existence and uniqueness of a strong [0, ∞)-valued solution of equation (3); see e.g. Theorem IV.3.1 in [21].
Note that zero is a trap for (Y_t)_{t≥0}, that is, Y_t = 0 implies Y_{t+s} = 0 for all s ≥ 0. For the solution (X_t)_{t≥0} of (1) to be well-defined, we additionally assume the migration matrix to be substochastic.
Assumption A2.2. The set G is (at most) countable and the matrix (m(j, i)) j,i∈G is nonnegative and substochastic, i.e., m(j, k) ≥ 0 and i∈G m(j, i) ≤ 1 for all j, k ∈ G.
Note that Assumption A2.1 together with Assumption A2.2 guarantees existence and uniqueness of a solution of (1) with values in {x ∈ I^G : |x| < ∞}. This follows from Proposition 2.1 and inequality (48) of [20] by letting σ_i ≡ 1 for i ∈ G and using monotone convergence. Mass emigrates from each island at rate one and colonizes new islands. A new population should evolve as the process (Y_t)_{t≥0} and should start from a single individual, which has mass zero due to the diffusion approximation. Thus we need the law of excursions of (Y_t)_{t≥0} from the trap zero. For this, define the set U of excursions of (Y_t)_{t≥0} from zero, where T_y = T_y(χ) := inf{t > 0 : χ_t = y} denotes the first hitting time of y ∈ [0, ∞). The set U is equipped with the topology of locally uniform convergence. For the existence of the excursion measure Q and in order to apply the results of [19], we need to assume additional properties of µ(·) and of σ²(·). For the motivation of these assumptions, we refer the reader to [19]. Assume ∫₀^ε y/σ²(y) dy < ∞ for some 0 < ε < |I|. Then the scale function

S(z) := ∫₀^z exp( ∫₀^y (x − µ(x))/(σ²(x)/2) dx ) dy,  z ∈ I,  (11)

is well-defined.
Under Assumption A2.3, the process (Y_t)_{t≥0} hits zero in finite time almost surely and the expected total emigration intensity of the virgin island model is finite; see Lemma 9.5 and Lemma 9.6 in [19]. Moreover, the scale function S(·) is well-defined and satisfies S′(0) = 1.
Assuming A2.1 and A2.3, we now define the excursion measure Q as the limit of the law of (Y_t)_{t≥0} started in ε > 0 and rescaled by S(ε) as ε → 0. More formally, it is stated in Pitman and Yor (1982) (see [19] for a proof) that there exists a unique σ-finite measure Q on U such that

∫ F dQ = lim_{ε→0} 1/S(ε) · E[ F(Y^ε) ]  (13)

for every bounded continuous F : C([0, ∞), [0, ∞)) → R for which there exists a δ > 0 such that F(χ) = 0 whenever sup_{t≥0} χ_t < δ; here Y^ε denotes the solution of (3) started in ε. The reader might want to think of Q as describing the evolution of a population founded by a single individual. In the special case σ²(y) = 2βy, µ(y) = cy with β > 0 and c ∈ R, the process (Y_t)_{t≥0} is Feller's branching diffusion, whose law is infinitely divisible. Then the excursion measure coincides with the canonical measure.
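A useful consequence (a standard scale-function computation, reproduced here for orientation) is the Q-mass of excursions exceeding a level δ ∈ (0, |I|): since S(0) = 0 and the diffusion started in ε ∈ (0, δ) hits δ before 0 with probability S(ε)/S(δ), the rescaling defining Q gives

```latex
Q\Bigl(\sup_{t\ge 0}\chi_t \ge \delta\Bigr)
  \;=\;\lim_{\varepsilon\to 0}\frac{1}{S(\varepsilon)}\,
        \mathbf{P}_\varepsilon\Bigl(\sup_{t\ge 0} Y_t \ge \delta\Bigr)
  \;=\;\lim_{\varepsilon\to 0}\frac{1}{S(\varepsilon)}\cdot\frac{S(\varepsilon)}{S(\delta)}
  \;=\;\frac{1}{S(\delta)} .
```

In particular, Q charges the excursions reaching height δ with finite mass, which is the fact behind the Poisson limit heuristic used in Section 4.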
Next we introduce the state space of the virgin island model. The construction of the virgin island model proceeds generation by generation. The 0-th generation is the set of all islands which are non-empty at time zero or which are colonized by individuals immigrating into the system. All islands colonized by emigrants from the 0-th generation are called islands of the first generation, the second generation is colonized from the first generation, and so on. For this generation-wise construction, we use a method of indexing islands which keeps track of which island has been colonized from which island. An island is identified with a triple which indicates its mother island, the time of its colonization and the population size on the island as a function of time. Let I_0 be the set of all possible islands of the 0-th generation. For each n ∈ N, define

I_n := { (ι_{n−1}, s, χ) : ι_{n−1} ∈ I_{n−1}, (s, χ) ∈ [0, ∞) × U },

which we will refer to as the set of all possible islands of the n-th generation. This notation should be read as follows. Island ι_n = (ι_{n−1}, s, χ) ∈ I_n has been colonized from island ι_{n−1} ∈ I_{n−1} at time s and carries total mass χ_{t−s} at time t ≥ s. Denote by I the union of all I_n over n ∈ N₀. The virgin island model will take values in the set of subsets of I.
Having introduced the excursion measure, we now construct the virgin island model with constant immigration rate θ ∈ [0, ∞) started in (x_k)_{k∈N}. Let Y^{k,x_k}, k ∈ N, be independent solutions of (3) such that Y^{k,x_k}_0 = x_k almost surely. Moreover, let Π_∅ be a Poisson point process on [0, ∞) × U with intensity measure θ dt ⊗ Q(dχ). The elements of the Poisson point process Π_∅ are the islands whose founders immigrated into the system. Next we construct all islands which are colonized from a given mother island. Let {Π_ι : ι ∈ I} be a set of independent Poisson point processes on [0, ∞) × U such that Π_{(ι,s,χ)} has intensity measure χ_{t−s} dt ⊗ Q(dψ). The elements of the Poisson point process Π_{(ι,s,χ)} are the islands which descend from the mother island (ι, s, χ). All ingredients are assumed to be independent. The 0-th generation V^(0) consists of the islands carrying the initial masses Y^{k,x_k}, k ∈ N, together with the elements of Π_∅. The (n + 1)-st generation, n ≥ 0, is the (random) set of all islands which have been colonized from islands of the n-th generation,

V^(n+1) := { (ι_n, t, χ) ∈ I_{n+1} : ι_n ∈ V^(n), (t, χ) ∈ Π_{ι_n} },

and the virgin island model V is the union of all generations, V := ∪_{n≥0} V^(n). We call V the virgin island model with immigration rate θ and initial configuration (x_k)_{k∈N}.
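The generation-wise bookkeeping can be sketched in a few lines of code. The sketch below is purely illustrative: the excursion law Q is replaced by a toy sampler (`draw_excursion`, a triangular bump with random height), and daughter islands are produced by Poisson thinning with intensity equal to the mother's current mass; only the indexing scheme (mother island, colonization time, mass path) mirrors the construction above.

```python
import random

# Toy sketch of the generation-wise construction. An island is the
# triple (mother, s, chi): colonized from island `mother` at time s,
# carrying mass chi(t - s) at time t >= s. The root has mother None.

def draw_excursion(rng):
    h = rng.expovariate(1.0)  # toy height; NOT a sample from Q
    return lambda u: max(0.0, min(u, 2.0 * h - u))  # triangular bump

def offspring_times(chi, horizon, rng, grid=200):
    # Poisson times on [0, horizon] with intensity chi(t) dt,
    # generated by thinning a homogeneous Poisson process.
    bound = max(chi(k * horizon / grid) for k in range(grid + 1)) + 1e-12
    t, times = 0.0, []
    while True:
        t += rng.expovariate(bound)
        if t >= horizon:
            return times
        if rng.random() < chi(t) / bound:
            times.append(t)

def virgin_island_toy(horizon, n_gen, rng):
    root = (None, 0.0, draw_excursion(rng))
    generations = [[root]]
    for _ in range(n_gen):
        nxt = []
        for island in generations[-1]:
            _, s, chi = island
            for u in offspring_times(chi, horizon - s, rng):
                nxt.append((island, s + u, draw_excursion(rng)))
        generations.append(nxt)
    return generations
```

Each island of generation n + 1 stores a reference to its mother in generation n, so disjoint subtrees can be read off directly from the nesting of the triples.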

Main results
We begin with convergence of the N-island process. In this convergence, we allow the drift function µ_N and the diffusion function σ²_N to depend on N in order to include the case of weak immigration. For example, one could be interested in an N-island model with logistic branching and weak immigration at rate θ/N on each island. In that case, one would set µ_N(x) = θ/N + γx(K − x) and σ²_N(x) := x for x ∈ I. The equation of the N-island process now reads as

dX^N_t(i) = ( (1/N) Σ_{j=1}^N X^N_t(j) − X^N_t(i) + µ_N(X^N_t(i)) ) dt + √(σ²_N(X^N_t(i))) dB_t(i),  (20)

where i = 1, . . . , N and where (B_t(i))_{t≥0}, i ∈ N, are independent standard Brownian motions.
The idea to include weak immigration into a convergence result is due to Dawson and Greven (2010) who independently obtain convergence of an N -island model using different methods.
Assumption A3.2. The random variables (X_0(i))_{i∈N} and (X^N_0(i))_{i≤N} are defined on the same probability space for each N ∈ N. There exists a random permutation (π^N_1, . . . , π^N_N) of {1, . . . , N} such that the reordered initial states (X^N_0(π^N_i))_{i≤N} converge suitably to (X_0(i))_{i∈N} as N → ∞. Furthermore, the total mass of X_0(·) has finite expectation, E[Σ_{i∈N} X_0(i)] < ∞.
If (x_i)_{i∈N} ⊂ I is a summable sequence, then Assumption A3.2 is satisfied for X_0(i) = x_i and X^N_0(i) = x_i, i ≤ N. Next we introduce the topology for the weak convergence of the N-island process. What will be relevant here is not any specific numbering of the islands but the statistics (or "spectrum") of their population sizes, described by the sum of Dirac measures at each time point, that is,

Σ_{i=1}^N δ_{X^N_t(i)},  t ≥ 0,  (22)

where δ_x is the Dirac measure in x. The state space of the measure-valued process (22) is the set M_F(I) of finite measures on I. We equip the state space M_F(I) with the vague topology on I \ {0}. Now we formulate the convergence of the (N, µ_N, σ²_N)-process defined in (20). Theorem 1. Suppose that (µ_N)_{N∈N} and (σ²_N)_{N∈N} satisfy Assumption A3.1 and that the initial configurations (X^N_0(i))_{i≤N} and (X_0(i))_{i∈N} satisfy Assumption A3.2. Then the population size spectra (22) converge in distribution as N → ∞, in the sense of (23), where V is the virgin island model with immigration rate θ = lim_{N→∞} N µ_N(0) and initial configuration (X_0(i))_{i∈N}.
The proof is deferred to Section 4.

Remark 1.
For readability, we rewrite the convergence in terms of test functions. The weak convergence (23) is equivalent to the convergence (24) of expectations of bounded continuous test functions F evaluated at finitely many time points 0 ≤ t₁ ≤ · · · ≤ t_n ≤ T. In addition, let f̃ : I → R be a continuous function satisfying |f̃(x)| ≤ L_f̃ · x for all x ∈ I. Then, following the arguments in the proof of Lemma 22 below, one can show that (24) holds with F and f replaced by F̃ and f̃, respectively.
The assumptions of Theorem 1 are satisfied for branching diffusions with local population regulation. A prominent example is the N-island model with logistic drift µ(x) = γx(K − x) and with σ²(x) = 2βx for x ∈ [0, ∞) and constants γ, K, β > 0. More generally, Theorem 1 can be applied if µ(x) = γx − c(x) and σ² is linear, where c is a convex function with c′(0) ∈ R. We believe that Theorem 1 also holds for non-linear infinitesimal variances such as σ²(x) = x(1 − x) in the case of the Wright-Fisher diffusion. Our proof requires linearity only in one step, namely the step from equation (144) to equation (145).
In the case of logistic branching, we obtain a noteworthy duality of the total mass process of the virgin island model with the mean field model (M_t)_{t≥0} defined in (9). By Theorem 3 of [20], systems of interacting Feller branching diffusions with logistic drift satisfy a duality relation, which for the (N, γy(K − y), 2βy)-process is stated in (27); there the notation E^{yδ₁} refers to the initial configuration X^N_0 = (y, 0, . . . , 0) and E^{x1} refers to X^N_0 = (x, . . . , x). This duality is established in [20] via a generator calculation, in Swart (2006) via dualities between Lloyd-Sudbury particle models, and in [3] by following ancestral lineages of forward and backward processes in a graphical representation. Now let N → ∞ in (27). Then the left-hand side converges to the Laplace transform of the total mass process of the virgin island model (without immigration) and the right-hand side converges to the Laplace transform of the mean field model (9). This proves the following corollary.
Corollary 2. Let (V_t)_{t≥0} be the total mass process of the virgin island model without immigration starting on only one island. Furthermore, let (M_t)_{t≥0} be the solution of (9), both with coefficients µ(y) = γy(K − y) and σ²(y) = 2βy for y ∈ [0, ∞), where γ, K, β > 0. Then

E_y[ exp( −(γ/β) x V_t ) ] = E_x[ exp( −(γ/β) y M_t ) ],  x, y ∈ [0, ∞), t ≥ 0,  (28)

where E_y and E_x refer to V_0 = y and M_0 = x, respectively.
Together with known results on the mean field model, this corollary leads to a computable expression for the extinction probability of the virgin island model.
Proof. Theorem 2 of [19] shows convergence in distribution of V_t to V_∞ as t → ∞. If (29) fails to hold, then Corollary 2, together with the convergence of M_t in distribution as t → ∞, identifies the law of V_∞. The distribution of M_∞ is necessarily an invariant distribution of the mean field model (9) and is nontrivial. Lemma 5.1 of [20] shows that there is exactly one nontrivial invariant distribution for (9), and this distribution is given by (32).
The second main result is a comparison of systems of locally regulated diffusions with the virgin island model. For its formulation, we introduce three stochastic orders which are inspired by Cox et al (1996). Let Z = (Z_t)_{t≥0} and Z̃ = (Z̃_t)_{t≥0} be two stochastic processes with state space I. We say that Z is dominated by Z̃ with respect to a set F of test functions if E[f(Z_{t₁}, . . . , Z_{t_n})] ≤ E[f(Z̃_{t₁}, . . . , Z̃_{t_n})] for all f ∈ F and all 0 ≤ t₁ ≤ · · · ≤ t_n, n ∈ N. The first order is 'the usual stochastic order' ≤_st, in which Z is dominated by Z̃ if there is a coupling of Z and Z̃ in which Z_t is dominated by Z̃_t for all t ≥ 0 almost surely. Assuming path continuity, an equivalent condition is as follows. Denote the set of non-decreasing test functions of n ∈ N₀ arguments from a set S ⊆ [0, ∞) by F₊^(n)(S). Furthermore, let F_{+±} be the set of non-decreasing functions which depend on finitely many time-space points. If there is no space component, then we simply write F_{+±}(S). In this notation, Z ≤_st Z̃ is equivalent to Z ≤_{F_{+±}} Z̃; see Subsection 4.B.1 in Shaked and Shanthikumar (1994).
We will use two more stochastic orders. In the literature, the set of non-decreasing, convex functions is often used. Here an adequate set is the collection of non-decreasing functions whose second-order partial derivatives are nonnegative. As we do not want to assume smoothness, we slightly weaken the latter assumption. We say that a function f of n arguments is (i, j)-convex if

f(x + εe_i + δe_j) − f(x + εe_i) − f(x + δe_j) + f(x) ≥ 0 for all x and all ε, δ ≥ 0,

where e_i denotes the i-th unit vector. Note that if f is smooth, then this is equivalent to ∂²f/(∂x_i ∂x_j) ≥ 0. A function is called directionally convex (e.g. Shaked and Shanthikumar 1990) if it is (i, j)-convex for all 1 ≤ i, j ≤ n. Such functions are also referred to as L-superadditive functions (e.g. Rueschendorf 1983). Define the set F_{++}^(n) of increasing, directionally convex functions of n arguments, and similarly F_{+−}^(n) with '(i, j)-convex' replaced by '(i, j)-concave'. Furthermore, define F_{++} and F_{+−} as in (35). If µ is concave and σ² is subadditive, then inequality (38) holds with F_{+−} replaced by F_{++}. If µ is subadditive and σ² is additive, then inequality (38) holds with F_{+−} replaced by F_{+±}.
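In the smooth case, (i, j)-convexity amounts to a nonnegative second mixed difference f(x + εe_i + δe_j) − f(x + εe_i) − f(x + δe_j) + f(x), which is easy to probe numerically. A small sketch (helper names are ours, for illustration only):

```python
# Second mixed difference of f at x in coordinates i and j:
#   f(x + eps*e_i + delta*e_j) - f(x + eps*e_i) - f(x + delta*e_j) + f(x).
# Nonnegativity for all eps, delta >= 0 is (i, j)-convexity.

def bump(x, i, eps):
    y = list(x)
    y[i] += eps
    return y

def mixed_difference(f, x, i, j, eps, delta):
    return (f(bump(bump(x, i, eps), j, delta)) - f(bump(x, i, eps))
            - f(bump(x, j, delta)) + f(x))
```

For f(x) = x₁x₂ the mixed difference in the pair (1, 2) equals εδ ≥ 0 and the diagonal differences vanish, so f is directionally convex, whereas f(x) = −x₁x₂ violates the inequality.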
The proof is deferred to Section 5. Comparisons of diffusions at fixed time points are well-known; Theorem 2 provides an inequality for the whole process. The techniques we develop for this in Subsection 5.1 might allow one to generalize the comparison results of Bergenthum and Rueschendorf (2007) on semimartingales.
The assumption of µ being subadditive is natural in the following sense. Let us assume that letting two 1-island processes with initial masses x and y, respectively, evolve independently is better in expectation for the total mass than letting one 1-island process with initial mass x + y evolve. This assumption implies µ(x) + µ(y) ≥ µ(x + y) for all x, y, x + y ∈ I, and thus subadditivity of the infinitesimal mean µ. If σ² is not additive, then we need the stronger assumption of µ being concave for Lemma 30. From Theorem 2 and a global extinction result for the virgin island model, we obtain a condition for global extinction of systems of locally regulated diffusions. According to Theorem 2 of [19], the total mass of the virgin island model converges in distribution to zero as t → ∞ if and only if condition (40) below is satisfied. Together with Theorem 2, this proves the following corollary.

Convergence to the virgin island model

Outline
First we outline the intuition behind the proof. The virgin island process is a tree of excursions, whereas the N-island process has no tree structure. In the latter process, different emigrants may colonize the same island. In addition, the N-island process is not loop-free: an individual could migrate from island 1 to island 3 and then back to island 1. That these two effects vanish in the limit as the number of islands tends to infinity will be established in two separate steps. The first step ensures that the limit of the N-island process as N → ∞ is loop-free. For this purpose, we decompose the N-island process according to the number of migration steps. Throughout the paper, we say that an individual has migration level k ∈ N₀ at time t ∈ [0, ∞) if its ancestral lineage contains exactly k migration steps. For example, an individual starting on island 1 at time 0, moving to island 3 and then back to island 1 has migration level 2. Let N ∈ N. We define a system {(X^{N,k}_t)_{t≥0} : k ∈ N₀} of processes through equation (42); its solution will be denoted as an (N, µ_N, σ²_N)-process with migration levels. See Lemma 5 for existence of a weak solution of (42).
Lemma 24 below indicates that, in the limit N → ∞, the individuals with migration level k at a fixed time are concentrated on essentially finitely many islands. A later migration event will not hit these essentially finitely many islands because hitting a fixed island has probability 1/N. Therefore we expect that all individuals on an island have the same migration level. Inserting this into (42) suggests considering the solution (Z^{N,k}_t)_{t≥0}, k ∈ N₀, of (43), where Z^{N,−1}_t(i) := 0 for all t ≥ 0 and i ≤ N. We will refer to this solution as the loop-free (N, µ_N, σ²_N)-process or as the loop-free N-island model. Lemma 26 below establishes that the distance between the (N, µ_N, σ²_N)-process with migration levels and the loop-free (N, µ_N, σ²_N)-process converges to zero in a suitable sense as N → ∞. It turns out that some difficulties arise from the different forms of the diffusion coefficients in the (N, µ_N, σ²_N)-process with migration levels and in the loop-free (N, µ_N, σ²_N)-process. As we could not resolve these difficulties, we additionally assume for Lemma 26 that σ²_N is linear.
For linear σ²_N, the diffusion coefficients in (42) and in (43) are similar. Our proof of Lemma 26 is a moment estimate in the spirit of Yamada and Watanabe (1971).
In Subsection 4.3 we show that two emigrants colonize different islands in the limit N → ∞. Let us rephrase this more formally.
where Y^{N,ζ}_{s,s}(i) = 0 and ζ_N(t) := Σ_{j=1}^N Z^{N,k−1}_t(j). Note that (Y^{N,ζ}_{t,s}(i))_{t≥s}, i ≤ N, are independent and identically distributed. Now we are interested in the total mass |Y^{N,ζ}_{·,s}| as N → ∞. As Y^{N,ζ}_{s,s}(i) = 0 and as the immigration rate on island i tends to zero, the process (Y^{N,ζ}_{t,s}(i))_{t≥s} converges to zero as N → ∞ for every i ∈ N. However, as mass of order O(ζ_N/N) immigrates onto a fixed island, the probability that the excursion started by these immigrants reaches a given level δ > 0, say, is of order O(ζ_N/N), as the convergence in (13) indicates. Now, as there are N independent trials, the Poisson limit theorem should imply the convergence (45), where Π is a Poisson point process whose intensity measure is given in (46). We will prove (45) in Lemma 22 by reversing time.
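The Poisson limit mechanism invoked here is the elementary one: N independent trials, each succeeding with probability of order c/N. A generic numerical illustration (not specific to the model):

```python
import math

# Binomial(N, c/N) converges to Poisson(c) as N -> infinity;
# the probability mass functions agree up to an error of order 1/N.

def binom_pmf(N, p, k):
    return math.comb(N, k) * p ** k * (1.0 - p) ** (N - k)

def poisson_pmf(c, k):
    return math.exp(-c) * c ** k / math.factorial(k)
```

For instance, the error |P(Bin(10⁴, 2·10⁻⁴) = 3) − P(Poi(2) = 3)| is already below 10⁻³, and it shrinks further as N grows.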
For convergence of the loop-free (N, µ N , σ 2 N )-process, we do not need to assume linearity of the diffusion function. Here we may replace Assumption A3.1 with the following weaker assumption.
Assumption A4.1. Assumptions A2.1 and A2.3 hold for µ and σ². Moreover, (µ_N)_{N∈N} is uniformly upward Lipschitz continuous, that is, µ_N(x) − µ_N(y) ≤ L_µ|x − y| for all x ≥ y ∈ I, N ∈ N and some constant L_µ < ∞; an analogous uniform condition with a finite constant L_σ holds for (σ²_N)_{N∈N} for all y ∈ I and N ∈ N.
Note that if σ 2 N is linear, then Assumption A4.1 implies Assumption A3.1. Some steps of our proof are based on second moment estimates and require the following assumption of uniformly finite second moments of the initial distribution. This assumption is then relaxed in further steps.
Assumption A4.2. The initial distributions have uniformly finite second moments, that is, sup_{N∈N} E[ ( Σ_{i≤N} X^N_0(i) )² ] < ∞.

Preliminaries
In this subsection we establish preliminary results such as moment estimates and existence of the processes. The quick reader might want to skip this subsection. We begin with weak existence of the N -island process with migration levels.
Lemma 5. Assume A4.1. The (N, µ N , σ 2 N )-process with migration levels exists in the weak sense, that is, equation (42) has a weak solution for every N ∈ N.
Proof. As the proof is fairly standard, we only give an outline. Approximate (42) by stochastic differential equations for which weak solutions exist. For example, approximate µ_N and σ²_N with bounded continuous functions µ_{N,n} and σ²_{N,n}, respectively, and consider the equation obtained from (42) with µ_N and σ²_N replaced by µ_{N,n} and σ²_{N,n}, respectively, restricted to migration levels k ≤ n. This equation has a weak solution (X^{N,k,n}_t)_{t≥0} according to Theorem V.23.5 and Theorem V.20.1 of [34], as the coefficients are bounded and continuous and the stochastic differential equation is finite-dimensional. Show that the formal generator hereof converges to the formal generator associated with (42). In addition, establish tightness of (X^{N,k,n})_{n∈N}. Then there exists a converging subsequence and its limit solves the martingale problem associated with (42); see Lemma 4.5.1 in [15]. From this solution of the martingale problem, construct a weak solution of (42) as in Theorem V.20.1 of [34].
Next we prove that the N -island model with migration levels is indeed a decomposition of the N -island model (20).
for every initial configuration X^N_0 = x^N ∈ I^N and every N ∈ N. Proof. The process in (48) is a solution of (20). Sum (20) over i ≤ N, stop at time τ̃^N_K and take expectations to obtain an estimate for every t ≤ T and K ∈ [0, ∞); note that its right-hand side is finite. Now Gronwall's inequality proves inequality (51) with C_T := T exp(LT). The inequality for the loop-free N-island process follows similarly.
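The version of Gronwall's inequality used here (and repeatedly below) is the standard integral form:

```latex
g(t) \;\le\; a + L \int_0^t g(s)\, ds \quad (0 \le t \le T)
\qquad\Longrightarrow\qquad
g(t) \;\le\; a\, e^{L t} \quad (0 \le t \le T).
```

Applied with g(t) the expected (stopped) total mass and a the expected initial mass, it yields the exponential-in-time moment bounds asserted above.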
Proof. We prove inequality (55) for the solution of (42). The estimate for the solution of (43) is analogous.
for all T ≥ 0 and all k ∈ N 0 . Note that the right-hand side is finite due to Lemma 7 and Assumption A3.2. Summing over k ≤ K ∈ N and applying Gronwall's inequality implies for every K ∈ N. Letting K → ∞ proves (55).
is finite for every T < ∞. The analogous assertion holds for the loop-free (N, µ N , σ 2 N )-process.
Proof. Recall τ N K from (50) and fix N ∈ N for the moment. According to Lemma 6, for every t ≤ T , every K ∈ N and some constant C T ∈ [1, ∞). We used Lemma 7 for the last inequality. Note that the right-hand side is finite. Applying Doob's L 2 submartingale inequality (e.g. Theorem II.70.2 in [33]) to the submartingale N i=1 X N t (i), using Fatou's lemma and applying Gronwall's inequality to (59), we conclude that The right-hand side is bounded uniformly in N ∈ N due to Assumption A4.2. The proof in the case of the loop-free N -island model is analogous.
Recall τ N K from (50). Next we show that stopping at the time τ N K has no impact in the limit K → ∞.
, the assertion follows from the Markov inequality and from the second moment estimate of Lemma 9.
Next we prove some preliminary results for the solution (Y N,ζ t,s ) t≥0 of (44).
Lemma 11. Assume A4.1. Let ζ N : [0, ∞) → N ·I be a locally square Lebesgue integrable function. Then we have that for all x ∈ I, 0 ≤ s ≤ T , N ∈ N and some constant C T < ∞ which does not depend on x, N or on ζ N .
Proof. The proof is similar to the proof of Lemma 9, so we omit it.
for all 0 ≤ s ≤ T , N ∈ N and some constant C T < ∞ which does not depend on N or ζ N .
Proof. The proof is similar to the proof of Lemma 9, so we omit it.
for all N ∈ N, for all 0 ≤ s ≤ t ≤ T and all x, y ∈ I.
Proof. As in Theorem 1 of Yamada and Watanabe (1971) [42], an approximation of x → |x| with C 2 -functions results in Taking expectations, the upward Lipschitz continuity of µ N implies that for all t ≥ s. The right-hand side is finite due to Lemma 11. Therefore, Gronwall's inequality implies (63) with C T = exp(L µ T ). Now we study the solution (Y N,c t,s ) t≥s of (44) in which ζ N ≡ c ∈ N ·I. Let α denote a fixed point in (0, |I|). Recall the scale function S from (11). Define the scale function S N of (Y N,c t,s ) t≥s through for every δ > 0.
Proof. Fix δ > 0 and c > 0. Integration by parts yields that for every N ∈ N. As σ 2 (71) and applying the dominated convergence theorem shows that which is equal to one.
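For concreteness, the scale function appearing in this argument can be evaluated numerically. The sketch below is ours: it approximates S(x) = ∫ α x exp(− ∫ α y 2µ(z)/σ 2 (z) dz) dy with midpoint Riemann sums; the logistic drift and the Feller-type σ 2 are illustrative stand-ins for the coefficients of the text.

```python
import math

def scale_function(mu, sigma2, alpha, x, n=2000, m=200):
    """S(x) = int_alpha^x exp( - int_alpha^y 2*mu(z)/sigma2(z) dz ) dy,
    evaluated with midpoint Riemann sums; alpha is a reference point in the
    interior of the state space, so S(alpha) = 0 and S is increasing."""
    def inner(y):  # int_alpha^y 2*mu(z)/sigma2(z) dz
        h = (y - alpha) / m
        return sum(2.0 * mu(alpha + (j + 0.5) * h) /
                   sigma2(alpha + (j + 0.5) * h) for j in range(m)) * h
    h = (x - alpha) / n
    return sum(math.exp(-inner(alpha + (i + 0.5) * h)) for i in range(n)) * h

mu = lambda z: z * (1.0 - z)   # illustrative logistic drift
sigma2 = lambda z: z           # illustrative Feller-type diffusion coefficient
S = lambda x: scale_function(mu, sigma2, 0.5, x)
```

The resulting S is strictly increasing, negative below the reference point α and positive above it, in line with the role of S in the ratio considered here.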
We recall the following lemma from [19], see Lemma 9.8 there.
The last result of this subsection is a variation of the second moment estimate of Lemma 9. Define the stopping times τ N K ∈ [0, ∞) through for every K ∈ [0, ∞) and every N ∈ N.
is finite for every T < ∞ and K ∈ N.
Proof. Lemma 3.3 in [20] shows that, on the event {τ N K ≥ t}, X N t (i) is stochastically bounded above by Y N,K+N µ N (0) t,0 . By Assumption A4.1 we have that N µ N (0) ≤ 2θ for all N ∈ N.
Together with the second moment estimate of Lemma 11, this implies that for every N ∈ N and some constant C T < ∞. The right-hand side is finite due to Assumptions A2.3 and A4.2.

Poisson limit of independent diffusions with vanishing immigration
In this subsection, we prove (45) which is the central step in the proof of Theorem 1. Our proof is based on reversing time in the stationary process. For the time reversal, we consider the following stationary situation. Excursions from zero of the process (Y t ) t≥0 start at times given by the points of a homogeneous Poisson point process on R with rate 1. This process of immigrating excursions is invariant for the dynamics of (Y t ) t≥0 restricted to non-extinction. Now the time reversal of an excursion is again governed by the excursion measure, see Lemma 18. As a consequence, reversing time in the process of immigrating excursions does not change the distribution. Let us retell the story more formally. Consider a Poisson point process Π on (−∞, ∞) × U with intensity measure ds ⊗ Q. Then ( (s,η)∈Π δ η t−s ) t≥0 is the process of immigrating excursions. Note that at a fixed time t, {η t−s : (s, η) ∈ Π} is a Poisson point process on (0, ∞) with intensity measure where m is the speed measure defined in (67). Here we used that η t = 0 for t ∈ (−∞, 0]. The relation (77) between the speed measure and the excursion measure has been established in Lemma 9.8 of [19] by exploiting a well-known explicit formula for E y ∞ 0 f (Y s ) ds. It is also well-known (e.g. equation (15.5.34) in [23]) that the speed measure m is an invariant measure for the sub-Markov semigroup E · f (Y t )1 Yt>0 . This fact can also be seen from (77) by noting that Q(η s ∈ dy) is an entrance measure for this sub-Markov semigroup. Thus the process of immigrating excursions is indeed invariant for the dynamics of (Y t ) t≥0 restricted to non-extinction. Now we show that reversing time in the process of immigrating excursions does not change the distribution of the process. The process (Y t ) t≥0 restricted to non-extinction is time-reversible when started in the invariant measure m, that is, for every T ∈ [0, ∞) and every non-negative measurable function on C [0, T ] .
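The first ingredient of this stationary picture, a rate-1 Poisson point process of excursion start times, is easy to simulate. The following sketch is ours and purely illustrative: it samples start times on a finite window and counts the excursions alive at time 0, with i.i.d. exponential lifetimes standing in for draws from the sigma-finite excursion measure Q, which cannot be sampled directly.

```python
import random

def immigration_times(t0, t1, rate, rng):
    """Sample a homogeneous Poisson point process of the given rate on [t0, t1]:
    the start times of the immigrating excursions."""
    times, t = [], t0
    while True:
        t += rng.expovariate(rate)
        if t > t1:
            return times
        times.append(t)

rng = random.Random(1)
starts = immigration_times(-50.0, 0.0, rate=1.0, rng=rng)
# Illustrative i.i.d. lifetimes in place of excursion lengths under Q
lifetimes = [rng.expovariate(1.0) for _ in starts]
# An excursion born at s with length ell is alive at time 0 iff s + ell > 0
alive = sum(1 for s, ell in zip(starts, lifetimes) if s + ell > 0.0)
```

Consistent with the text, at the fixed time 0 only finitely many immigrated excursions contribute a positive value.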
If the speed measure m is replaced by the left-hand side of (77), then (78) can be extended to allow for extinction. To show this we first formulate the Markov property of the excursion measure. Definition (13) of Q as the rescaled law of (Y t ) t≥0 together with the Markov property of (Y t ) t≥0 implies that for all T ∈ R and all measurable functions F : Proof. It suffices (see e.g. Theorem 14.12 in [24]) to establish (80) for F n (η) : where t 1 < . . . < t n ∈ R and f 1 , . . . , f n ∈ C b [0, ∞), [0, ∞) . If F n (0) > 0, then both sides of (80) are infinite. For the rest of the proof, we assume F n (0) = 0, that is, f i (0) = 0 for at least one i ∈ {1, . . . , n}. We may even assume f i ∈ C c (0, ∞) for at least one i ∈ {1, . . . , n}.
Otherwise approximate f i monotonically from below with test functions which have compact support. In addition, we may without loss of generality assume t 1 = 0 = T . Otherwise use a time translation. If F n vanishes on {η : η 0 = 0 or η tn = 0}, then (80) is essentially (78). To see this, consider Applying (78) with T = t n , reversing the calculation in (81) and substituting s − t n → s shows that We prove (80) with F replaced by F n by induction on n ∈ N. The base case n = 1 follows from a time translation. The induction step n − 1 → n follows directly from (82) if f 1 (0) = 0 = f n (0). If f n (0) > 0, then For the second step we used linearity, applied the induction hypothesis and equation (82) and again used linearity. Adding (82) for all measurable functions F : C [0, T ] → [0, ∞) satisfying F (0) = 0 and all T ≥ 0.
Proof. Express the speed measure in terms of the excursion measure as in (77). The last two steps are the Markov property (79) and Lemma 18, respectively.
With Lemma 19 in hand, we now reverse time to prove a first version of the Poisson approximation (45).
where C δ is a suitable function of δ. The term C δ converges to zero as δ → 0 according to Lemma 14. The speed measure m N is an invariant (non-probability) measure for (Y N,c t,0 ) t≥0 , see e.g. Theorem V.54.5 in [34]. Thus we may reverse time. As we let N → ∞, we will exploit that (Y N,c t,0 ) t≥0 converges weakly to (Y t ) t≥0 . In addition, m N (dy) converges vaguely to m(dy) as N → ∞ since the densities converge. These observations imply that The last but one step is Lemma 19 and the last step follows from the dominated convergence theorem together with Lemma 16. Putting (87) and (88) together completes the proof of Lemma 20.
Next we use induction to generalize Lemma 20 to test functions which depend on finitely many time coordinates. For this let E T be the following set of bounded functions on C [0, T ], I which depend on finitely many coordinates and which are globally Lipschitz continuous in every coordinate Note that the set E T is closed under multiplication and separates points. Thus the linear span of E T is an algebra which separates points. According to Theorem 3.4.5 in [15] the linear span of E T is distribution determining and so is E T . Proof. Let L F be such that F satisfies (90). Moreover let F be bounded by C F . Lemma 13 implies that lim N →∞ where C T is the finite constant from Lemma 13. Therefore, it suffices to prove (91) with ζ N replaced by ζ.
We begin with the case of ζ being a simple function. W.l.o.g. we consider ζ(t) = n i=1 c i 1 [ti−1,ti) (t) where c 1 , ..., c n ≥ 0 and t 0 = s as we may let F depend trivially on further time points. The proof of (91) is by induction on n. The case n = 1 has been settled in Lemma 20. For the induction step we split up the left-hand side of (91) into two terms according to whether the process at time t 1 is essentially zero or not. In order to formalize the notion "essentially zero", let δ > 0 be arbitrary and choose functions φ δ ∈ C 2 [0, ∞), [0, 1] such that φ δ (x) = 1 for x ≥ 2δ and φ δ (x) = 0 for x ≤ δ. Furthermore let C T be the constant from Lemma 13 and define F 2 (η) := n i=2 f i (η ti ). First we consider the case that the process is away from 0 at time t 1 . The following estimate shows that we may discard immigration after time t 1 . The moment estimate of Lemma 13 implies that for every fixed δ > 0. Here we used that Y N,ζ ·,s converges to the zero function in distribution as N → ∞. If the process is essentially zero at time t 1 (that is, weighted with 1 − φ δ (Y N,ζ t1,s )), then The last step is Lemma 20. By the dominated convergence theorem together with Lemma 16, the right-hand side of (94) converges to zero as δ → 0. Therefore, using (93), (94) and applying the induction hypothesis lim N →∞ The last step follows from the pointwise convergence φ δ (x) → 1 as δ → 0 for every x > 0 together with the dominated convergence theorem and from the Markov property (79). Finally let ζ be integrable. Approximate ζ with simple functions (ζ n ) n∈N . Applying Lemma 13, it is straightforward to show that equation (91) with ζ replaced by ζ n converges to equation (91) as n → ∞.
where Π is a Poisson point process on [s, T ] × U with intensity measure Moreover letF be a continuous function on C [s, T ], R satisfying the Lipschitz condition for some s ≤ t 1 ≤ · · · ≤ t n ≤ T , some n ∈ N and some LF ∈ (0, ∞). In addition let f : I → R be a continuous function satisfying |f (x)| ≤ Lf x for all x ∈ I. Then we have that This type of argument has been established in Theorem 2.1 of Roelly-Coppoletta (1986) for the weak topology and C 0 . Following the proof hereof, one can show the analogous argument for the vague topology and C c . Fix f ∈ F and define S N t : all s ≤ t ≤ T and N ∈ N. Note that f is globally Lipschitz continuous. For K ∈ N and fixed t ∈ [0, ∞), global Lipschitz continuity of f implies that for some constant L f < ∞ and for all N ∈ N. The right-hand side is finite according to Lemma 13 and converges to zero as K → ∞. This proves tightness of S N t , N ∈ N, for every t ∈ [0, ∞). For the second part of the Aldous criterion, fix T ∈ [0, ∞) and let τ N , N ∈ N, be stopping times which are uniformly bounded by T . In addition define (103) The functions µ N , σ N and σ 2 N are uniformly globally Lipschitz continuous on the support of f according to Assumption A4.1. Therefore there exists a constant C f ∈ [1, ∞) such that N + x and f σ N 2 (x) ≤ C f x for all x ∈ I and all N ∈ N. For fixed η > 0 andδ ∈ [0, 1], we use Itô's formula to obtain that for all δ ≤δ and for all N ∈ N. In the last step we used that δ 0 h(u)du 2 ≤ δ δ 0 (h(u)) 2 du for every integrable function h. The right-hand side of (104) is finite by Lemma 12. Lettinḡ δ → 0, the left-hand side of (104) converges to zero uniformly in N ∈ N and δ ≤δ.
Next we prove convergence of finite-dimensional distributions. Let n ∈ N, f ∈ F, s ≤ t 1 ≤ · · · ≤ t n and λ 1 , · · · , λ n ≥ 0 be arbitrary. Using independence we obtain that for all x 1 , . . . , x n ∈ [0, ∞). Applying Lemma 21 to each summand of this telescope sum we get that This proves convergence of finite-dimensional distributions. It remains to prove (99), which includes non-bounded test functions. By the previous step and by the Skorokhod representation of weak convergence (e.g. Theorem II.86.1 in [33]) there exists a version of and Lemma 12. Thus (109) with M = ∞ converges also in expectation. The Lipschitz condition (98) implies that for all N ∈ N. Letting N → ∞ and then K → ∞, the right-hand side converges to zero.
The last equality follows from the dominated convergence theorem together with Assumption A2.3. This proves (99).

Convergence of the loop-free process
Recall the loop-free N -island process from (43). The following lemma shows that the loop-free N -island process converges to the virgin island model.
Tightness of the left-hand side of (113) in N ∈ N follows from tightness of This type of argument has been established in Theorem 2.1 of Roelly-Coppoletta (1986) for the weak topology and C 0 . Following the proof hereof, one can show the analogous argument for the vague topology and C c . Fix f ∈ F and define S N t := for some constant L f < ∞ and for all K, N ∈ N. The right-hand side is finite according to Lemma 7. This proves tightness of (114) for every fixed time point. For the second part of the Aldous criterion, let τ N , N ∈ N, be stopping times which are uniformly bounded by T .
for all x = (x k i ) i≤N,k∈N0 ∈ I N × I N0 . Assumption A4.1 implies that the functions µ N , σ N and σ 2 N are uniformly globally Lipschitz continuous on the support of f . Moreover N µ N (0) is bounded by 2θ uniformly in N ∈ N. Therefore there exists a constant C H ∈ [1, ∞) such that for all k ∈ N 0 and all N ∈ N and such that (f σ N )(y) ≤ C H y for all y ∈ I and N ∈ N. For fixed η > 0 and δ̄ ∈ [0, 1], we use Itô's formula to obtain that for all δ ≤ δ̄ and for all N ∈ N. The second step follows from (a + b) 2 ≤ 2a 2 + 2b 2 for a, b ∈ R and from Itô's isometry. In the last step we used that ( δ 0 h(u) du) 2 ≤ δ δ 0 (h(u)) 2 du for every integrable function h. The right-hand side of (117) is finite according to Lemma 9.
Tightness of (114) now follows from the Aldous criterion, see Aldous (1978). It remains to identify the limit of the left-hand side of (113) by proving that for all λ 1 , . . . , λ n , for all 0 ≤ t 1 ≤ · · · ≤ t n and for all m, n ∈ N 0 . Lemma 8 justifies to restrict the summation over k to finitely many summands. We prove (118) by induction on m ∈ N 0 using the Poisson limit (96) for independent one-dimensional diffusions. Define F (η) := n j=1 λ j f (η tj ) for every η ∈ C([0, T ], R). Note that F satisfies the Lipschitz condition (25) for some constant L F ∈ (0, ∞). Let Y N,ζ t,0 (i) t≥0 , i ∈ N, and Ỹ N,ζ t,0 (i) t≥0 , i ∈ N, be independent solutions of (44) with Y N,ζ 0,0 (i) = X 0 (i),Ỹ N,ζ 0,0 (i) = 0 and ζ N (·) ≡ N µ N (0) for every N ∈ N and i ∈ N. In addition let Y t (i) t≥0 , i ∈ N, be independent solutions of (3) with Y 0 (·) ≡ X 0 (·). Note that Z N,0 · (i) is a solution of (44) with Z N,0 0 (i) = X N 0 (i). The first moment estimate of Lemma 13 implies that for all N ∈ N where C tn is the finite constant of Lemma 13. Letting N → ∞ the right-hand side converges to zero according to Assumption A3.2. The process Y N,ζ ·,0 (i) i≤N in turn is close to Ỹ N,ζ ·,0 (i) i≤N except for islands with a significant amount of mass at time zero. Formalizing this we use Lemma 13 to obtain that The first summand on the right-hand side is zero according to Assumption A3.2. Note that µ N → µ and σ N → σ as N → ∞ by Assumption A4.1. Thus Ỹ N,ζ t,0 (i) t≥0 converges in distribution to the zero function as N → ∞ for every fixed i ∈ N. Consequently the second summand on the right-hand side is zero. Moreover Y N,ζ t,0 (i) t≥0 converges in distribution to Y t (i) t≥0 as N → ∞ for every fixed i ∈ N. These observations imply that The last but one step follows from Lemma 22 withζ N (·) ≡ N µ N (0) and ζ(·) ≡ θ. This proves (118) in the base case m = 0. 
For the induction step m → m + 1 note that a version of (Z N,m+1 t (i)) t≥0 conditioned on N j=1 Z N,m r (j) =: ζ N (r), r ≥ 0, is given by the one-dimensional diffusion (Y N,ζ t,0 ) t≥0 with vanishing immigration. Thus we may realize (Z N,m+1 t (i)) t≥0 by choosing a suitable version of (Z N,m t (j)) t≥0 , j = 1, . . . , N , and by independently sampling a version of (Y N,ζ t,0 ) t≥0 whose driving Brownian motion is independent of {(Z N,m t (j)) t≥0 : j = 1, . . . , N }. Tightness together with the induction hypothesis implies that for everym ≤ m. Thanks to the Skorokhod representation of weak convergence (e.g. Theorem II.86.1 in [33]), we may assume that the convergence in (119) holds almost surely. As a consequence we obtain that holds almost surely. Using arguments from the proof of (99) one can deduce from this that holds almost surely where the total mass of the n-th generation of the virgin island model is defined as V for every n ∈ N 0 . Together with continuity of η s s≤T → t That this functional is not bounded is remedied by a truncation argument. Now the main step of the proof is Due to this decomposition of Π (m+1) , we may realize Π (m+1) conditioned on V (m) as the independent superposition of {Π (ι,s,ψ) : (ι, s, ψ) ∈ V (m) }. In other words, Π (m+1) is equal in distribution to the (m + 1)st-generation of the virgin island model. Therefore we get that which proves (118) and completes the proof of Lemma 23.

Reducing to the loop-free (N, µ N , σ 2 N )-process
In this subsection, we show that the (N, µ N , σ 2 N )-process with migration levels and the loop-free (N, µ N , σ 2 N )-process are identical in the limit N → ∞. Our proof formalizes the following intuition. The individuals of a certain migration level are concentrated on essentially finitely many islands. That these finitely many islands are populated by migrants of a different migration level has a probability of order 1/N. As a consequence, all individuals on one fixed island have the same migration level in the limit N → ∞. This intuition is the subject of Lemma 25.
First we show that a generation cannot be dispersed uniformly over all islands. To obtain this interpretation from the following lemma, assume X N,k t (i) ≈ 1/N for all i ≤ N and some time t ≥ 0. Then the cutting operation in (126) has no effect for N large enough. However it is clear that the total mass of all individuals with migration level k does not tend to zero as N → ∞. Thus X N,k t (i) ≈ 1/N cannot be true. Lemma 24. Assume A4.1, A3.2 and A4.2. Then any solution of (42) satisfies Proof. Fix T ∈ [0, ∞) and t ∈ [0, T ]. According to Lemma 8, for each ε > 0, there exists a Thus it suffices to prove convergence of every summand in (126). In addition, if we forget the migration levels in the (N, µ N , σ 2 N )-process with migration levels, then we obtain the N -island model. More formally, Lemma 6 shows that defines an N -island model. Recall τ N K from (50) and fix K ∈ N. According to Lemma 10 it suffices to prove (126) with expectation being restricted to the event {τ N K > t} for every N ∈ N. Now Lemma 3.3 in [20] shows that, on the event {τ N K > t}, X N t (i) is stochastically bounded above by Y . Hence we get for all N, K ∈ N, k ∈ N 0 and δ > 0 that for some constant C f < ∞. The last step follows from Lemma 13. Next we let N → ∞ and δ → 0 in (129). Applying Lemma 22 with ζ N := K + N µ N (0) and using Assumption A3.2, we see that the limit of (129) as N → ∞ and as δ → 0 is bounded above by for every K ∈ N and δ > 0. The first summand in (130) for every T ∈ [0, ∞). The assertion is also true if X N,k for all x ≤ K and all N ∈ N. Applying Itô's formula, we see that where (M N,m t (i)) t≥0 and (M N,k t (i)) t≥0 are suitable martingales for each i ≤ N and k, m ∈ N 0 . Now take expectations to obtain that for every N ∈ N and t ≤ T . Note that the right-hand side is finite. Using Gronwall's inequality, µ N (0) ≤ 2θ/N and for every N ∈ N. Letting N → ∞ proves (131) in the case s = t. For the case s < t, apply Itô's formula in and use similar estimates as above.
The case s > t is analogous to the case s < t.
Knowing that there is asymptotically at most one generation on every island, we are now in a position to prove that X N,k t (i) t≥0 and Z N,k t (i) t≥0 are close to each other.
be a solution of (42) and let for every t ∈ [0, ∞).
Proof. For the time being, assume that σ 2 N and µ N are uniformly globally Lipschitz continuous and bounded. The general case will later be handled by a stopping argument. Define x + := max(x, 0) and x − := (−x) + for all x ∈ R. We first prove (138) with |x| replaced by x + and x − , respectively, separately. As R ∋ x → x + is not differentiable, we will apply Itô's formula to functions φ n which are defined as follows. Let 1 = a 0 > a 1 > · · · > a n > · · · > 0 satisfy Note that a n → 0 as n → ∞. For every n ∈ N, there exists a continuous function ψ n : R → [0, ∞) with support in (a n , a n−1 ) such that for every n ∈ N. These functions satisfy φ n ∈ C 2 (R), |φ ′ n (x)| ≤ 1, φ ′′ n (x) = 1 x>0 ψ n (x), φ n (x) ≤ x + and φ n (x) → x + as n → ∞ for every x ∈ R and n ∈ N.
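Such Yamada-Watanabe approximations of x + can be made completely explicit. The following sketch is ours and only illustrative: it uses the admissible choice a k = exp(−k(k + 1)/2), so that ∫ a n a n−1 du/u = n, takes ψ n (u) = 1/(nu) on (a n , a n−1 ) (hence ∫ ψ n = 1 and ψ n (u) ≤ 2/(nu)), and evaluates φ n (x) = ∫ 0 x+ ∫ 0 y ψ n (u) du dy numerically.

```python
import math

def phi_n(x, n, m=4000):
    """Yamada-Watanabe approximation of x+: phi_n(x) = int_0^{x+} Psi(y) dy,
    where Psi(y) = int_0^y psi_n(u) du with psi_n(u) = 1/(n*u) on
    (a_n, a_{n-1}) and a_k = exp(-k*(k+1)/2).  Then |phi_n'| <= 1,
    phi_n <= x+ and phi_n(x) -> x+ as n -> infinity."""
    a_prev = math.exp(-(n - 1) * n / 2.0)
    a_n = math.exp(-n * (n + 1) / 2.0)
    def Psi(y):  # primitive of psi_n: 0 below a_n, 1 above a_{n-1}
        if y <= a_n:
            return 0.0
        return min(1.0, math.log(min(y, a_prev) / a_n) / n)
    xp = max(x, 0.0)
    h = xp / m
    return sum(Psi((i + 0.5) * h) for i in range(m)) * h
```

Since Psi equals one above a_{n-1}, the approximation error satisfies x + − φ n (x) ≤ a n−1 , which is the quantitative form of φ n (x) → x + used in the proof.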
Denote the difference process by ∆ k t (i) := X N,k t (i) − Z N,k t (i) for all i ∈ G, t ≥ 0, k ∈ N 0 and N ∈ N. The dependence on N is suppressed for the sake of a more compact notation. By definition of φ n , x + ≤ φ n (x) + a n−1 ∧ x + for all x ∈ R and n ∈ N. Thus for all K ∈ N and N ∈ N. The first summand on the right-hand side converges to zero as K → ∞ uniformly in N ∈ N according to Lemma 8. The last summand on the right-hand side converges to zero as N → ∞ according to Lemma 24. Consequently To estimate the right-hand side, we apply Itô's formula to obtain that for every k ∈ N 0 . Now we simplify the last summand on the right-hand side using the assumption that σ 2 N is linear. This linearity implies that x · σ 2 N (y)/y = σ 2 N (x) for every t ≥ 0, k ∈ N 0 and N ∈ N. The last but one summand on the right-hand side is estimated as follows. Let δ > 0. In case of X N,k for every t ≥ 0, k ∈ N 0 and N ∈ N. Summing over k ∈ {0, . . . , K} leads to for every t ≥ 0 and K, N ∈ N. The last three summands on the right-hand side of (147) converge to zero uniformly in t ∈ [0, T ] if we first let N → ∞ and then δ → 0. This follows from Lemma 25 and Lemma 24. After inserting (147) into (143), we see that for all t ∈ [0, T ]. Note that the right-hand side is finite by Lemma 8. Similarly we obtain (148) with (·) + replaced by | · |. Finally apply Gronwall's inequality to arrive at (138). Next we consider functions σ 2 N and µ N which are not globally Lipschitz continuous. For each K > 0 choose functions σ 2 N,K and µ N,K which agree with σ 2 N and µ N , respectively, on for every K > 0 by the preceding step. Lemma 10 handles the event {τ N K ≤ t}. This completes the proof of Lemma 26.

Proof of Theorem 1
Proof of Theorem 1. First we prove Theorem 1 under the additional Assumption A4.2. This will be relaxed later.
We begin with convergence of finite-dimensional distributions. Recall E T from (89). Let F (η) = n j=1 f j (η tj ) ∈ E T satisfy the Lipschitz condition (90) with Lipschitz constant L F ∈ (0, ∞) and let F be bounded by C F < ∞. Furthermore let the function f : I → R have compact support in (0, |I|), let f be bounded by C f < ∞ and let f be globally Lipschitz continuous with Lipschitz constant L f ∈ (0, ∞). Recall the (N, µ N , σ 2 N )-process with migration levels from (42). We will exploit below that all individuals on one island have the same migration level in order to show that Assuming (150) we now prove convergence of finite-dimensional distributions. We may replace the (N, µ N , σ 2 N )-process with migration levels in (150) by the loop-free process because of Lemma 26 and the Lipschitz continuity of F and f . Hence (150) and Lemma 26 imply that The last equality is the convergence of the loop-free (N, µ N , σ 2 N )-process to the virgin island model and has been established in Lemma 23.
Next we prove (150). According to Lemma 6, if we ignore the migration levels in the (N, µ N , σ 2 N )-process with migration levels, then we obtain a version of the (N, µ N , σ 2 N )-process, that is, To prove (150), we observe that for every sequence (x k ) k∈N0 ⊆ [0, ∞) and every δ > 0. Thus we get that for all N ∈ N, t ≥ 0 and δ > 0. The second summand on the right-hand side converges to zero as N → ∞ according to Lemma 25. The first summand on the right-hand side converges to zero as δ → 0 uniformly in N ∈ N according to Lemma 24. Using (154) we obtain that for all N ∈ N and δ > 0. Letting first N → ∞ and then δ → 0, the right-hand side converges to zero according to Lemmas 25 and 24 and according to the preceding step. Inserting this into (152) proves (150). The next step is to prove tightness. This is analogous to the tightness proof in Lemma 22, with Lemmas 10 and 17 used in place of Lemma 12, so we omit this step.
It remains to prove Theorem 1 in the case when Assumption A4.2 fails to hold. Fix T ∈ [0, ∞). Let H be a bounded continuous function on M F I . It follows from Assumption A3.2 that |X N 0 | converges in L 1 and thus also in distribution to |X 0 |. By the Skorokhod representation theorem there exists a version of {X N 0 (·) : N ∈ N} such that |X N 0 | converges almost surely to |X 0 | as N → ∞. Now the previous step implies that almost surely. Taking expectations and applying the dominated convergence theorem results in The last step is again the induction hypothesis.
Lemma 29. Let n ∈ N, c ∈ [0, ∞) and f ∈ F (n+1) ++ . Then the two functions and f are elements of F (n) ++ . This is also true if F ++ is replaced by F +− and F +± , respectively.
Proof. The functions f̄ and f̃ are non-decreasing and either bounded or nonnegative. It is clear that f̄ is again (i, j)-convex for 1 ≤ i, j ≤ n and that f̃ is (i, j)-convex for 1 ≤ i, j ≤ n−1. It remains to prove (i, n)-convexity of f̃ for 1 ≤ i ≤ n. Applying Lemma 28 at location x n+1 to the index tuple (i, n, n + 1), we obtain for all h 1 , h 2 ≥ 0 that that is, f̃ is (i, n)-convex. Proof. Fix 0 ≤ s ≤ t and n ∈ N. We only prove the case of µ being concave and f ∈ F (n+1) ++ as the remaining cases are similar. According to Lemma 29, it suffices to prove that is an element of F (n+1) ++ [0, ∞) . Let (Y ζ,x t,s ) t≥s , x ∈ I, be solutions of (159) with respect to the same Brownian motion. It is known that Y ζ,x t,s ≤ Y ζ,x+h t,s holds almost surely for all x ≤ x + h ∈ I, see e.g. Theorem V.43.1 in [34] for the time-homogeneous case. Thus the function f̄ is again non-decreasing. Moreover f̄ inherits (i, j)-convexity from f for every 1 ≤ i, j ≤ n. It remains to show that f̄ is (i, n + 1)-convex for 1 ≤ i ≤ n + 1. If i ≤ n, then (i, n + 1)-convexity of f at the point x 1 , . . . , x n , Y ζ,xn+1 t,s shows that for every h 1 , h 2 ≥ 0 and (x 1 , . . . , x n ) ∈ [0, ∞) n , that is, (i, n + 1)-convexity of f̄ in the case i ≤ n. One can establish convexity of y → f̄ (x 1 , . . . , x n , y) as in Lemma 6.1 of [20] (this Lemma 6.1 shows concavity if f is (n + 1, n + 1)-concave and smooth). This step uses concavity of µ. Consequently, f̄ is (n + 1, n + 1)-convex. This completes the proof of f̄ ∈ F (n+1) ++ . The proof of Lemma 30 is similar to that of Proposition 16 in [11]. This Proposition 16 is used in [11] to establish a comparison result between diffusions with different diffusion functions, see Theorem 1 in [11]. Using the above Lemma 30, this comparison result can be extended to more general test functions.
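The monotone coupling invoked in this proof, namely driving the solutions (Y ζ,x t,s ) for all starting points x by the same Brownian motion, can be sketched numerically. The code below is our illustration, not the general theorem: it uses a constant diffusion coefficient, for which the Euler scheme preserves the ordering deterministically, while Theorem V.43.1 in [34] covers state-dependent diffusion coefficients; the logistic drift and all parameters are illustrative.

```python
import math, random

def coupled_paths(x_low, x_high, T=1.0, dt=0.01, sigma=0.3, seed=7):
    """Two copies of dY = mu(Y) dt + sigma dB driven by the SAME Brownian
    increments.  Started from ordered initial states they stay ordered,
    which is the coupling behind monotonicity of x -> Y^{zeta,x}."""
    rng = random.Random(seed)
    mu = lambda y: y * (1.0 - y)  # illustrative logistic drift
    lo, hi = x_low, x_high
    for _ in range(int(T / dt)):
        db = rng.gauss(0.0, math.sqrt(dt))  # shared noise increment
        lo = min(max(lo + mu(lo) * dt + sigma * db, 0.0), 2.0)
        hi = min(max(hi + mu(hi) * dt + sigma * db, 0.0), 2.0)
    return lo, hi
```

Since the shared noise cancels in the difference and |µ′| ≤ 3 on [0, 2], each Euler step multiplies the gap by at least 1 − 3 dt > 0, so the initial order propagates to every time step.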

Decomposition of a one-dimensional diffusion with immigration into subfamilies
Feller's branching diffusion with immigration can be decomposed into independent families which originate either from an individual at time zero or from an immigrant, see e.
For the base case n = 1 additionally assume f 1 ∈ C 2 . Approximate σ and µ with functions σ l , µ l ∈ C ∞ (R) having the following properties. All derivatives of σ l and µ l are bounded, µ l is concave and σ 2 l is superadditive. Both functions vanish at zero. If |I| < ∞, then µ l (|I|) ≤ 0 = σ 2 l (|I|). Moreover µ l (x) → µ(x) and σ 2 l (x) → σ 2 (x) as l → ∞ for all x ∈ I. Let (Y ζ,x,l t,s ) t≥s be a solution of (159) with σ 2 and µ replaced by σ 2 l and µ l , respectively, and let (Ỹ ζ̃,y,l t,s ) t≥s be an independent version hereof starting in y ∈ I. Then Now as l → ∞, (Y c1,x,l t ) t≥0 converges weakly to (Y c1,x t ) t≥0 for every x ∈ I, see Lemma 19 in Cox et al. (1996) for a sketch of the proof. Therefore, letting l → ∞ in (177) proves (178). Note that the induction hypothesis implies that f n (x 1 + y 1 , . . . , x n + y n ) ≤ Ef n+1 x 1 + y 1 , . . . , x n + y n , Y ζ,xn tn+1,tn + Ỹ ζ̃,yn tn+1,tn and that Lemma 30 implies that f̄ n ∈ F (n) +− . Therefore, using the Markov property and the induction hypothesis, we get that for all x, y ∈ I satisfying x + y ∈ I. The last step follows from the Markov property and from independence of the two processes. This proves (176). In case of general functions ζ and ζ̃, approximate ζ and ζ̃ with simple functions ζ l and ζ̃ l , l ∈ N, respectively. The process Y ζ l ,x ·,s converges in the sense of finite-dimensional distributions in L 1 , see Lemma 11, and due to tightness also weakly to the process Y ζ,x ·,s . This completes the proof as the remaining cases are analogous. Finally we prove the main result of this subsection. The following lemma shows that the total mass increases if we let all subfamilies evolve independently. In the special case of µ and σ 2 being linear, inequality (182) is actually an equality according to the classical family decomposition of Feller's branching diffusion with immigration.
for every s ≥ 0 where Π is a Poisson point process on [0, ∞) × U with intensity measure Leb ⊗ Q and whereΠ is an independent Poisson point process on [0, ∞) × U with intensity measure ζ(s) ds ⊗ Q. If µ is concave and σ 2 is subadditive, then (182) holds with F +− replaced by F ++ . If µ is subadditive and σ 2 is additive, then (182) holds with F +− replaced by F +± .
Proof. The idea is to split the initial mass and the immigrating mass into smaller and smaller pieces. Fix s ≥ 0. Let µ be concave and let σ 2 be superadditive. According to Lemma 31 for every N ∈ N where all processes are independent of each other. Letting N → ∞ in (183), the right-hand side of (183) converges to the right-hand side of (182), see Lemma 22 and Lemma 32. The remaining cases are analogous.
Recall the total mass process (V for every k 0 ∈ N 0 . If µ is concave and σ 2 is subadditive, then inequality (192) holds with F +− replaced by F ++ . If µ is subadditive and σ 2 is additive, then inequality (192) holds with F +− replaced by F +± .

Proof of the comparison result of Theorem 2
Proof of Theorem 2. We prove the case of µ being concave and σ 2 being superadditive. The remaining two cases are analogous. According to Lemma 34 we have that for every finite subset Λ ⊆ G. Letting Λ ↑ G, we see that the total mass of the (G, m, µ, σ 2 )-process is dominated by the total mass of the loop-free (G, m, µ, σ 2 )-process. Now we get from Lemma 36 that for every k 0 ∈ N. Letting k 0 → ∞, we obtain that the total mass of the loop-free (G, m, µ, σ 2 )-process is dominated by the total mass of the virgin island model. Therefore, the total mass of the (G, m, µ, σ 2 )-process is dominated by the total mass of the virgin island model.