SuperBrownian motion and the spatial Lambda-Fleming-Viot process

It is well known that the dynamics of a subpopulation of individuals of a rare type in a Wright-Fisher diffusion can be approximated by a Feller branching process. Here we establish an analogue of that result for a spatially distributed population whose dynamics are described by a spatial Lambda-Fleming-Viot process (SLFV). The subpopulation of rare individuals is then approximated by a superBrownian motion. This result mirrors Cox et al. (2000), where it is shown that when suitably rescaled, sparse voter models converge to superBrownian motion. We also prove the somewhat more surprising result, that by choosing the dynamics of the SLFV appropriately we can recover superBrownian motion with stable branching in an analogous way. This is a spatial analogue of (a special case of) results of Bertoin and Le Gall (2006), who show that the generalised Fleming-Viot process that is dual to the beta-coalescent, when suitably rescaled, converges to a continuous state branching process with stable branching mechanism.


Background
Our aim in this paper is to establish a relationship between two, at first sight, very different classes of measure-valued processes. The first, the spatial Lambda-Fleming-Viot processes, is a collection of models for the evolution of frequencies of different genetic types in a population that is dispersed across a spatial continuum. The second is the (finite and infinite variance) superBrownian motions. Our motivation is two-fold. On the one hand, we add to the panoply of processes that converge to superBrownian motion; on the other, we address a question of some interest in population genetics: how does the frequency of a rare neutral mutation evolve in a spatially distributed population?
SuperBrownian motion, or the Dawson-Watanabe superprocess, was introduced independently by Watanabe (1968) and Dawson (1975) as a continuous time and space approximation to systems of branching Brownian motions. In this way it can be thought of as a spatial analogue of the Feller diffusion approximation to critical (or near-critical) Galton-Watson branching processes. We shall recall its definition in Section 2.1 below.
In addition to the huge literature exploring the rich mathematical structure of superBrownian motion, over the last two decades an increasing body of evidence has emerged that it is a universal scaling limit of critical interacting particle systems above a critical dimension. It has been obtained as a limit of lattice trees (above 8 dimensions, e.g. Holmes (2008)), oriented percolation (above 4 dimensions, Van der Hofstad and Slade (2003)), the contact process (above 4 dimensions, e.g. Van der Hofstad and Sakai (2010)), the voter model (in two or more dimensions, e.g. Cox et al. (2000)) and the Lotka-Volterra model (Cox and Perkins (2005)). By changing the range of the interaction with the scaling, one can also obtain it from the contact process in lower dimensions (Cox et al. (1999)). These analyses prove convergence of finite-dimensional distributions; Van der Hofstad et al. (2017) provide a tightness criterion that allows the extension to convergence on path space and apply it to the example of sufficiently spread out lattice trees above 8 dimensions. We also refer to that paper for a more complete list of references.
For populations that are not spatially distributed, one classically models frequencies of different genetic types (usually referred to as alleles) through a Wright-Fisher or a Cannings model. Suppose that we are interested in the proportion of individuals of a particular type, that we shall call type 1. Under the Wright-Fisher model, when suitably scaled, this proportion converges to the Wright-Fisher diffusion. If type 1 is rare, the absolute number of type 1 individuals evolves approximately according to a branching process which, under the same scaling, converges to a Feller diffusion. This branching process approximation for the rare type has been used extensively in the population genetics literature and so it is natural to try to establish analogous results for spatially distributed populations.
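The branching approximation for a rare type can be illustrated numerically. The following is a minimal sketch, not taken from the paper: it assumes a neutral Wright-Fisher model of fixed size in which each offspring picks its parent uniformly at random, and the function names and parameter values are hypothetical. When the type 1 count k is small compared with the population size, the next-generation count is approximately Poisson with mean k, so its mean and variance should both be close to k, as for a critical branching process in which each individual leaves a Poisson(1) number of offspring.

```python
import random


def wright_fisher_path(pop_size, initial_count, generations, rng):
    """Type 1 counts over successive generations of a neutral Wright-Fisher model."""
    counts = [initial_count]
    for _ in range(generations):
        freq = counts[-1] / pop_size
        # Each offspring independently has a type 1 parent with probability freq.
        counts.append(sum(1 for _ in range(pop_size) if rng.random() < freq))
    return counts


def one_generation_moments(pop_size, initial_count, n_reps, rng):
    """Empirical mean and variance of the type 1 count after one generation."""
    freq = initial_count / pop_size
    samples = [
        sum(1 for _ in range(pop_size) if rng.random() < freq)
        for _ in range(n_reps)
    ]
    mean = sum(samples) / n_reps
    var = sum((s - mean) ** 2 for s in samples) / n_reps
    return mean, var


rng = random.Random(0)
path = wright_fisher_path(500, 5, 40, rng)
mean, var = one_generation_moments(500, 5, 4000, rng)
print(path[:10], round(mean, 2), round(var, 2))
```

With 5 type 1 individuals in a population of 500, both empirical moments should come out close to 5, which is the signature of the critical branching approximation.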
In one spatial dimension, the Wright-Fisher diffusion has a stochastic pde counterpart:
\[ dw_t(x) = \tfrac{1}{2}\Delta w_t(x)\,dt + \sqrt{\tfrac{1}{K}\,w_t(x)\bigl(1 - w_t(x)\bigr)}\;W(dt, dx), \tag{1.1} \]
where $w_t(x)$ denotes the proportion of the population at spatial position $x$ at time $t$ that is of type 1, $K$ is the local population density, and $W(dt, dx)$ is a space-time white noise. Formally at least, if type 1 is rare, this reduces to
\[ dw_t(x) = \tfrac{1}{2}\Delta w_t(x)\,dt + \sqrt{\tfrac{1}{K}\,w_t(x)}\;W(dt, dx), \]
and if we set $X_t = K w_t$ to recover absolute numbers rather than proportions, the type 1 population is modelled by
\[ dX_t(x) = \tfrac{1}{2}\Delta X_t(x)\,dt + \sqrt{X_t(x)}\;W(dt, dx), \]
which is the stochastic pde governing the density with respect to Lebesgue measure of the (finite variance) superBrownian motion, and so it is certainly reasonable to hope to describe establishment of rare alleles in one dimensional populations using superBrownian motion.
In dimensions two and higher, equation (1.1) has no solution and so we need an alternative approach to modelling allele frequencies in higher dimensional spatial continua. The obstructions to finding such an approach, often referred to as 'the pain in the torus', are well documented. We refer to Barton et al. (2013) for a survey. The spatial Lambda-Fleming-Viot process (SLFV), introduced in Etheridge (2008), overcomes the pain in the torus to provide a class of models for allele frequencies in populations distributed across spatial continua of any dimension. The first rigorous construction is in Barton et al. (2010). The SLFV can be thought of as the spatial counterpart of the 'generalised Fleming-Viot process' (that we shall refer to as the Lambda-Fleming-Viot process in what follows) of Bertoin and Le Gall (2003) and, just as for their model, comes with a consistent 'backwards in time' model (a spatial analogue of the Lambda-coalescent) for the genealogies describing relatedness between genes in individuals sampled from the population. We recall the definition of the process in Section 2.2.
As a special case of the results in , in one spatial dimension one can recover (1.1) as a scaling limit of a particular SLFV. The corresponding scaling in higher dimensions leads to the (deterministic) heat equation. This can perhaps best be understood as a 'law of large numbers' effect. In particular, the initial conditions taken in that paper don't correspond to 'rare' alleles. As we shall see, we can recover superBrownian motion from the SLFV in arbitrary spatial dimensions, but only if we take a sufficiently 'sparse' initial condition. This should of course be compared to the results of Cox et al. (2000), who recover superBrownian motion from sparse voter models; our analysis in the finite variance case owes a great deal to that paper. We should also mention the work of Freeman (2010), in which he introduces a very close relative of the SLFV, which he calls a 'bursting process', on $\mathbb{Z}^d$ and shows that for $d \ge 3$, started from sparse initial conditions and suitably scaled, that process too converges to a superBrownian limit.
In the discussion up to this point we have (implicitly) considered the finite variance superBrownian motion. Where our work diverges from the body of work described above is that we are also able to obtain superBrownian motions with stable branching mechanisms from particular choices of the SLFV. Such superprocesses are the spatial analogue of the continuous state branching processes sometimes known as stable branching processes. In Birkner et al. (2005), it is shown that the special class of Lambda-Fleming-Viot processes that are dual to the so-called Beta-coalescents can be obtained as time-changed stable branching processes, revealing a deep connection between the two classes of processes. Bertoin and Le Gall (2006) show that in much the same way as the Feller diffusion describes evolution of a rare allele in a population evolving according to the Wright-Fisher diffusion, stable branching describes the evolution of a rare allele under this Lambda-Fleming-Viot process (see Lambert and Schertzer (2016) for a 'backwards in time' analogue). We provide the 'back of the envelope' calculation that explains Bertoin and Le Gall's result in Section 2.4. What is more surprising is that we can extract a superBrownian motion with stable branching mechanism from a sequence of rescaled SLFVs. First, the conditions on the 'Lambda'-measure under which we can construct the SLFV are more restrictive than those under which we can construct the (non-spatial) Lambda-Fleming-Viot process. Second, the spatial motion of individuals in the SLFV is intricately connected to the reproduction mechanism, yet we are trying to produce a limit in which spatial motion is continuous and reproduction is discontinuous. On the other hand, in  the analogue of (1.1) with the Laplacian replaced by the generator of a symmetric stable process is obtained as a scaling limit of an SLFV. In that case, in the limit the spatial motion is discontinuous and the reproduction mechanism continuous. 
The rest of this paper is laid out as follows. In Section 2 we remind the reader of the definitions of the superBrownian motion and the SLFV before stating our main results.
We also give a heuristic explanation of our results. In Section 3 we provide martingale characterisations of the scaled SLFVs, from which, in Section 4, we formally identify the limiting objects, deferring tightness to Section 5 and the proof of some key estimates to Section 6. The proof of convergence follows in Section 7.

Definitions and statement of results
Before stating our results in Section 2.3 below, we fix notation and define our two classes of processes.

SuperBrownian motion
We shall characterise superBrownian motion through a martingale problem. It is convenient to use distinct formulations in the finite and infinite variance cases. For an introduction to superprocesses and, in particular, their construction as scaling limits of branching particle systems, we refer to Dawson (1993), Perkins (2002) and Etheridge (2000).
A complete filtered probability space $(\Omega, \mathcal{F}, \mathcal{F}_t, \mathbb{P})$ will be implicit throughout. We write $M_F(\mathbb{R}^d)$ for the space of finite measures on $\mathbb{R}^d$, equipped with the topology of weak convergence, and $C^k_0(\mathbb{R}^d)$ for the space of $k$ times differentiable functions $\phi : \mathbb{R}^d \to \mathbb{R}$, vanishing at infinity, and such that $\phi$ and its derivatives up to $k$th order are bounded with norm
Definition 2.1 (Finite variance superBrownian motion) The finite variance superBrownian motion is the unique $M_F(\mathbb{R}^d)$-valued Markov process $\{X_t\}_{t \ge 0}$ with continuous sample paths such that for each non-negative $\phi \in C^3_0(\mathbb{R}^d)$, the process is a continuous, square integrable martingale with quadratic variation given by where $m, \kappa > 0$ are constants.
We shall not consider the most general possible superBrownian motions. Instead, we restrict ourselves to those that arise as scaling limits of branching Brownian motions with offspring distribution in the domain of attraction of a stable law. These are naturally parametrised by a parameter β ∈ (0, 1) (with β = 1 corresponding to the finite variance case).
Definition 2.2 (SuperBrownian motion with stable branching law) The superBrownian motion with stable branching law of parameter $\beta \in (0, 1)$ is the unique $M_F(\mathbb{R}^d)$-valued Markov process $\{X_t\}_{t \ge 0}$ with càdlàg sample paths such that for each non-negative $\phi \in C^3_0(\mathbb{R}^d)$, the process is a martingale, where $m, \kappa > 0$ are constants.

The Spatial Lambda-Fleming-Viot Process
We now introduce the SLFV processes. In fact there is a much richer class of these processes than those we consider here, incorporating, for example, various forms of natural selection. For a (somewhat out of date) survey we refer to Barton et al. (2013). We restrict ourselves to a population in which there are just two genetic types which we label by {0, 1}. At each time t, the random function {w t (x), x ∈ R d } is defined, up to a Lebesgue null set of R d , by w t (x) := proportion of type 1 at spatial position x at time t.
A construction of an appropriate state space for x → w t (x) can be found in Véber and Wakolbinger (2015). Using the identification this state space is in one-to-one correspondence with the space M λ of measures on R d × {0, 1} with 'spatial marginal' Lebesgue measure, which we endow with the topology of vague convergence. By a slight abuse of notation, we also denote the state space of the process (w t ) t∈R + by M λ .
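Definition 2.3 (SLFV) Let $\mu$ be a finite measure on $(0, \infty)$ and, for each $r \in (0, \infty)$, let $\nu_r$ be a probability measure on $(0, 1]$. Further, let $\Pi$ be a Poisson point process on $\mathbb{R}^d \times \mathbb{R}_+ \times (0, \infty) \times (0, 1]$ with intensity measure
\[ dx \otimes dt \otimes \mu(dr)\,\nu_r(d\rho). \tag{2.2} \]
The spatial Lambda-Fleming-Viot process (SLFV) driven by (2.2), when it exists, is the $\mathcal{M}_\lambda$-valued process $(w_t)_{t \in \mathbb{R}_+}$ with dynamics given as follows.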
If $(x, t, r, \rho) \in \Pi$, a reproduction event occurs at time $t$ within the closed ball $B_r(x)$ of radius $r$, centred on $x$, in which case:
1. Choose a parental location $z$ uniformly at random within $B_r(x)$, and a parental type, $\alpha$, according to $w_{t-}(z)$; that is, $\alpha = 1$ with probability $w_{t-}(z)$ and $\alpha = 0$ with probability $1 - w_{t-}(z)$.
2. For every $y \in B_r(x)$, set $w_t(y) = (1 - \rho)\,w_{t-}(y) + \rho\,\mathbf{1}_{\{\alpha = 1\}}$.
We shall refer to ρ as the impact of the event.
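A single reproduction event can be sketched in code. This is an illustration under stated assumptions rather than the construction used in the paper: space is one-dimensional and discretised to a finite grid (the true process lives on the continuum), the parental location is chosen uniformly among grid points in the ball, and the update is the usual SLFV rule in which a proportion ρ of the population in the ball is replaced by offspring of the parental type. The function name `slfv_event` and all parameter values are hypothetical.

```python
import random


def slfv_event(w, grid, centre, radius, impact, rng):
    """Apply one reproduction event to type 1 frequencies w sampled on grid.

    Returns a new list; frequencies outside the ball are left unchanged.
    """
    inside = [i for i, y in enumerate(grid) if abs(y - centre) <= radius]
    if not inside:
        return list(w)
    # Parental location: uniform over the grid points covered by the event.
    parent = rng.choice(inside)
    # Parental type: 1 with probability w at the parental location, else 0.
    alpha = 1 if rng.random() < w[parent] else 0
    new_w = list(w)
    for i in inside:
        # A proportion `impact` of the local population is replaced by type alpha.
        new_w[i] = (1 - impact) * w[i] + impact * alpha
    return new_w


rng = random.Random(1)
grid = [i / 10 for i in range(21)]  # grid points 0.0, 0.1, ..., 2.0
w0 = [0.05] * len(grid)             # type 1 is rare everywhere
w1 = slfv_event(w0, grid, centre=1.0, radius=0.3, impact=0.2, rng=rng)
print([round(v, 3) for v in w1])
```

Because type 1 is rare in this example, most events choose a type 0 parent and slightly lower the type 1 frequency inside the ball; occasionally a type 1 parent is chosen and the frequency jumps up by roughly the impact.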
Before providing conditions under which the process exists, it is convenient to introduce the dual process of coalescing lineages that plays the rôle for the SLFV played by the Lambda-coalescents for the (non-spatial) Lambda-Fleming-Viot processes. The idea is that these lineages trace out the ancestry of a sample from the population. The dual will also play a crucial rôle in establishing the estimates of Section 6. The dynamics of the dual are driven by the same Poisson process of events $\Pi$ that drives the SLFV. This driving process is reversible and we shall abuse notation by indexing events by 'backwards time' when discussing our dual. We suppose that at time 0, 'the present', we sample $k$ individuals from locations $x_1, \ldots, x_k$ and we write $\xi^1_s, \ldots, \xi^{N_s}_s$ for the locations of the $N_s$ 'ancestors' that make up our dual at time $s$ before the present.
Definition 2.4 (Dual to the SLFV) The coalescing dual process $(\Xi_t)_{t \ge 0}$ is the $\bigcup_{n \ge 1} (\mathbb{R}^d)^n$-valued Markov process with dynamics defined as follows. At each event $(x, t, r, \rho) \in \Pi$:
1. For each $\xi^i_{t-} \in B_r(x)$, independently mark the corresponding ancestral lineage with probability $\rho$;
2. if at least one lineage is marked, all marked lineages disappear and are replaced by a single ancestor, whose location is drawn uniformly at random from within $B_r(x)$.
If no particles are marked, then nothing happens.
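The two steps of the dual dynamics translate directly into code. The sketch below is illustrative only: lineages are stored as tuples of coordinates, the uniform point in the ball is drawn by rejection sampling from the enclosing cube (a convenience choice, not something specified in the text), and the names `uniform_in_ball` and `dual_event` are hypothetical.

```python
import math
import random


def uniform_in_ball(centre, radius, rng):
    """Draw a point uniformly from the closed ball by rejection sampling."""
    while True:
        offset = [rng.uniform(-radius, radius) for _ in centre]
        if sum(c * c for c in offset) <= radius * radius:
            return tuple(c + o for c, o in zip(centre, offset))


def dual_event(lineages, centre, radius, impact, rng):
    """Apply one event (centre, radius, impact) to a list of lineage locations."""
    # Step 1: each lineage inside the ball is marked independently with
    # probability `impact`.
    marked = {
        i
        for i, xi in enumerate(lineages)
        if math.dist(xi, centre) <= radius and rng.random() < impact
    }
    if not marked:
        return list(lineages)  # nothing happens
    # Step 2: marked lineages coalesce into a single ancestor, placed
    # uniformly at random in the ball.
    survivors = [xi for i, xi in enumerate(lineages) if i not in marked]
    return survivors + [uniform_in_ball(centre, radius, rng)]


rng = random.Random(2)
sample = [(0.0, 0.0), (0.1, 0.0), (0.0, -0.2), (5.0, 5.0)]
print(dual_event(sample, centre=(0.0, 0.0), radius=1.0, impact=0.5, rng=rng))
```

Lineages outside the ball are never touched, and a single marked lineage simply jumps to a new uniform location, which is how the spatial motion of an isolated lineage arises.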
Assuming that the SLFV and its dual exist, the duality is expressed through the following proposition.
Proposition 2.5 The SLFV is dual to the process (Ξ t ) t≥0 in the sense that for every k ∈ N and ψ ∈ C( where the subscripts on the expectations denote the initial values of the corresponding processes. In particular, E {x 1 ,...,x k } denotes expectation under the distribution of the dual process started from Ξ 0 = {x 1 , ..., x k }. In Barton et al. (2010), through a powerful result of Evans (1997), it is shown that existence of the SLFV can be deduced from existence of the dual. In that paper, by assuming that one guarantees that started from any finite number of individuals, the jump rate in the dual is finite, and so Definition 2.4 gives rise to a well-defined process. Ancestral lineages in the dual process move around according to (dependent) compound Poisson processes which can coalesce if they are affected by the same event. Although one can write down more general conditions under which the SLFV exists, see Etheridge and Kurtz (2014), Condition 2.4 is trivially satisfied for the processes considered below.

Main results
We are going to extract superBrownian motion from the SLFV through a scaling and a passage to the limit. There will be two cases, the first leading to the finite variance superprocess and the second to a superprocess with a stable branching law. At the $N$th stage of our scaling, the local population density will be $K = K(N)$. We shall denote our scaled SLFV by $w^N$ and the population of type 1 individuals by $X^N = K w^N$, which is defined Lebesgue almost everywhere. We shall think of $X^N$ as a measure-valued process and abuse notation by writing, for any Borel measurable $\phi$,
\[ \langle X^N_t, \phi \rangle = \int_{\mathbb{R}^d} \phi(x)\,X^N_t(x)\,dx. \]
Before scaling, the SLFV will be driven by a Poisson point process $\Pi^N$ with intensity $dx \otimes dt \otimes \mu^N(dr)\,\nu_r(d\rho)$. Each $(x, t, r, \rho) \in \Pi^N$ signals a reproduction event for the scaled process associated with the quadruple $(x/M, t/N, r/M, \rho/J)$. In other words, for the scaled process, time is sped up by a factor $N$, space is shrunk by $M(N)$, and the impact of each event is reduced by a factor $J(N)$. Moreover, the local population density is increased to $K = K(N)$ where $K(N)$ is another increasing function of $N$.
2. Variable radius case: Here we shall take where $\alpha$ is a real constant and $\gamma$ is a positive constant. We then take $\nu_r := \delta_{r^{-\gamma}}$.
The lower bound on r in the variable radius case ensures that after scaling the impact of each event is at most 1.
Theorem 2.6 (Fixed radius case) In the notation above, in the fixed radius case, suppose that X N 0 is absolutely continuous with respect to Lebesgue measure, that the support the sequence {X N } N ≥1 converges weakly to finite variance superBrownian motion with initial condition X 0 and parameters Conditions 1-3 guarantee tightness; (2.6) will ensure that type 1 is sufficiently 'sparse' that, asymptotically, descendants of different type 1 'individuals' evolve independently (they don't sense that total population density is constrained) and we recover a branching structure. Notice in particular that (2.6), combined with Conditions 2 and 3 implies that K → ∞ as N → ∞. In Section 2.4 we present a heuristic argument which suggests that these conditions are in some sense optimal. We note that the conditions of Theorem 2.6 are analogous to those of Cox et al. (2000), Theorem 1.1. It isn't hard to convince oneself that if we fix the radius of events, then it is not possible to find a sequence of impact distributions ν N and a scaling under which the limiting process is superBrownian motion with a stable branching mechanism of infinite variance, which is why we turn to µ N (dr). Theorem 2.7 provides conditions under which we do then have convergence to superBrownian motion with a stable branching mechanism. Even with our special choice of µ N , in d = 1 these are considerably more technical than those in the fixed radius case. However, we shall see them emerge in a natural way from our calculations.
Theorem 2.7 (Variable radius case) In the notation above, suppose that X N 0 is absolutely continuous with respect to Lebesgue measure, that the support supp(X N 0 ) ⊆ D, where D is a compact subset of R d (independent of N) and that X N 0 converges weakly to X 0 ∈ M F (R d ). We work in the variable radius case with µ N (dr) given by (2.5) and ν r = δ r −γ . Fix β ∈ (0, 1) and take α and γ such that In addition, we require: (2.7) Then the sequence {X N } N ≥1 converges to superBrownian motion with stable branching law with parameter β, initial condition X 0 , and Once again Conditions 3-5 guarantee tightness of the sequence, whereas (2.7) ensures 'sparsity'. For d = 1 the condition in (2.7) is not necessary, even our proof shows that it could be improved a little, but in this form it is easy to check: since γ > 2 and (1 − β)(γ − 1) < 1, we only have to make sure that J → ∞ as N → ∞ sufficiently quickly compared to M.
Example 2.8 The conditions of Theorem 2.7 are satisfied if
The structure of the proofs of Theorems 2.6 and 2.7 will come as no surprise. We establish tightness of the sequences of rescaled processes and then check that all limit points satisfy an appropriate martingale problem. The first step will be to write down the martingale characterisation of the scaled SLFV and manipulate it into a form that resembles the desired limit. Tightness for the fixed radius case is then highly reminiscent of the arguments in Cox et al. (2000). To prove tightness in the variable radius case, we modify the arguments used in constructing superBrownian motion with a stable branching mechanism as the limit of a sequence of branching Brownian motions (although the calculations here are somewhat more involved). In both cases, a key step in identifying the limit is to establish control over the probability that two individuals sampled from the same small region in the SLFV are close relatives. This is also reminiscent of Cox et al. (2000), being based on estimates for the coalescing dual of the SLFV.

Heuristics
Before proceeding to the proofs, let us try to motivate the scalings in Theorem 2.6 and (at least some of those in) Theorem 2.7.

Fixed radius
First consider the fixed radius case. If superBrownian motion really is a good approximation for the type 1 population, then, in particular, we expect that the motion of a single ancestral lineage in the SLFV should converge to Brownian motion. Events that affect regions in which a given lineage lies fall according to a Poisson process with rate proportional to $N$, and, for each such event, the chance that the lineage is affected by it is $u/J$. Thus the lineage will jump at rate proportional to $N/J$. Each jump has mean zero and finite variance, and is of size $O(1/M)$. In order to obtain a Brownian limit, we seek a diffusive rescaling; that is, $N/(JM^2)$ should converge.
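The diffusive scaling can be made explicit with a one-line variance computation. As a sketch, suppose that a lineage jumps at rate $c_1 N u / J$ and that each jump has the form $Z/M$ with $Z$ of mean zero and $\mathbb{E}|Z|^2 = \sigma^2$; the constants $c_1$ and $\sigma^2$ are placeholders introduced here for illustration, and the jumps are treated as independent. Writing $\xi_t$ for the position of the lineage at time $t$, the variance accumulated by a compound Poisson process gives

\[
\mathbb{E}\,|\xi_t - \xi_0|^2 \;\approx\; \frac{c_1 N u}{J}\, t \cdot \frac{\sigma^2}{M^2} \;=\; c_1 u\, \sigma^2\, t \cdot \frac{N}{J M^2}.
\]

The right-hand side grows linearly in $t$ with a coefficient proportional to $N/(JM^2)$, so a limiting Brownian motion with a finite, non-zero diffusion constant is only possible if this ratio converges to a positive limit.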
Next recall that in d ≥ 2 superBrownian motion is a two-dimensional object, whereas the support of the SLFV has the same dimension as the state space. The scaling of the impact of each event dictates that an 'atom' of mass is O(1/J), so in order that the number of atoms in a region of diameter M scale like KM 2 (as it would for a two-dimensional object) we take The large parameter K controls the total population density, but we must still ensure that the population of rare alleles in our scaled SLFV is sufficiently 'sparse' if we are to recover superBrownian motion. In the SLFV, the density of the population is strictly regulated, creating a strong dependence between the mass born during a reproduction event and that which dies. In contrast, in superBrownian motion, once born, 'individuals' reproduce and die independently of one another. In order to ensure that the dependence inherent in the SLFV is not apparent to us when we follow just a single (rare) type, we should like to know that if we sample individuals from the same small region they are not likely to be close relatives. In this way we can guarantee that individuals are not victims of reproduction events in which their own close family reproduces.
To check whether two individuals sampled from very close to one another are close relatives, we follow the dual process of ancestral lineages. We should like them to move apart to a distance of O(1) in the scaled process (without coalescing). If both lineages are in the region affected by an event, then the chance that they are both affected (and therefore coalesce) given that at least one of them jumps is of order 1/J. On the other hand, it only takes a finite number of events in which only one of the lineages jumps before they are sufficiently far apart that they cannot be affected by the same event and so evolve independently. We then think of them as making an excursion away from one another, before they once again come close enough that they are susceptible to coalescence. The number of such excursions before we see one in which they move apart to a distance of O(1) (after scaling) has mean Since at the end of each excursion, the chance that the lineages will coalesce rather than starting the next excursion is proportional to 1/J, we see that our 'sparsity' conditions (2.6) ensure that the probability that they successfully 'escape' to a distance of O(1) from one another tends to one. This is the intuition underlying the calculations in Section 6.

Variable radii
Now we turn to the case of variable radii, from which we are trying to extract a superBrownian motion with stable branching mechanism.
In the non-spatial setting, Bertoin and Le Gall (2006) recover a stable branching process with parameter $\beta$ from a Lambda-Fleming-Viot process in much the same way as we recovered the Feller branching process from the Wright-Fisher diffusion in the introduction. The Lambda-Fleming-Viot process is driven by a Poisson point process $\Pi$ on $[0, \infty) \times (0, 1]$. A point $(t, \rho) \in \Pi$ signals a reproduction event at time $t$ in which a proportion $\rho$ of the population is replaced by offspring of a randomly chosen parent. The intensity of $\Pi$ is $dt \otimes \Lambda(d\rho)/\rho^2$, where $\Lambda$ is such that the tail $\int_{[\varepsilon, 1]} \rho^{-2}\,\Lambda(d\rho)$ is regularly varying with index $-(1 + \beta)$ as $\varepsilon \to 0$. A simple 'back of the envelope' calculation illustrates why Bertoin and Le Gall's result should hold. To be completely concrete, we present it in the special case $\Lambda(d\rho) = C(\beta)\,\rho^{-\beta}(1 - \rho)^{\beta}\,d\rho$, corresponding to the Lambda-Fleming-Viot process which is dual to a so-called Beta-coalescent and, as shown in Birkner et al. (2005), is a time-change of the stable branching process.
Once again let K be the total population size and consider a rare allele that makes up a proportion w of the population. We are interested in the absolute number, X = Kw of rare alleles. We apply the infinitesimal generator of X to a test function of the form exp(−θX), where θ ≥ 0. This yields which we recognise as the infinitesimal generator of the stable branching process, timechanged by a factor proportional to K β . Recalling the construction of this Lambda-Fleming-Viot process from individual based models, for example as in Schweinsberg (2003), we see that the evolution of a population of size K should be compared to the Lambda-Fleming-Viot process on the timescale 1/K β (just as we see a factor 1/K in the Wright-Fisher diffusion (1.1)). This precisely cancels the K β we see here. This calculation confirms that the emergence of the stable branching process was dictated by the behaviour of the measure Λ( dρ) close to ρ = 0.
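The steps of this calculation can be sketched as follows. The simplifications are assumptions made here for illustration: the factor $(1 - \rho)^{\beta}$ is dropped because it is close to 1 near $\rho = 0$, the proportion $w = X/K$ is treated as small, and the range of integration is extended to $(0, \infty)$ after the substitution $v = K\rho$. The symbol $\mathcal{L}$ for the generator of $X$ is notation introduced for this sketch. An event of impact $\rho$ sends $X$ to approximately $X + K\rho$ with probability $X/K$ (the parent is of the rare type) and to $(1 - \rho)X$ otherwise, so that

\[
\begin{aligned}
\mathcal{L}\, e^{-\theta X}
&\approx C(\beta) \int_0^1 \Bigl[ \tfrac{X}{K}\bigl( e^{-\theta (X + K\rho)} - e^{-\theta X} \bigr) + \bigl( e^{-\theta (1 - \rho) X} - e^{-\theta X} \bigr) \Bigr] \rho^{-\beta - 2}\, d\rho \\
&\approx C(\beta)\, \frac{X}{K}\, e^{-\theta X} \int_0^1 \bigl( e^{-\theta K \rho} - 1 + \theta K \rho \bigr)\, \rho^{-\beta - 2}\, d\rho \\
&\approx C(\beta)\, K^{\beta}\, X\, e^{-\theta X} \int_0^\infty \bigl( e^{-\theta v} - 1 + \theta v \bigr)\, v^{-\beta - 2}\, dv \\
&= C(\beta)\, \frac{\Gamma(1 - \beta)}{\beta (1 + \beta)}\, K^{\beta}\, \theta^{1 + \beta}\, X\, e^{-\theta X}.
\end{aligned}
\]

The second line uses $e^{\theta \rho X} - 1 \approx \theta \rho X$ for small $\rho X$, the third uses $\rho^{-\beta - 2}\, d\rho = K^{\beta + 1} v^{-\beta - 2}\, dv$, and the last uses $\int_0^\infty (e^{-u} - 1 + u)\, u^{-\beta - 2}\, du = \Gamma(-1 - \beta) = \Gamma(1 - \beta)/(\beta(1 + \beta))$ for $\beta \in (0, 1)$. The result has the form $\psi(\theta)\, X e^{-\theta X}$ with $\psi(\theta)$ proportional to $K^{\beta} \theta^{1 + \beta}$, which is the action of the generator of a continuous state branching process with stable branching mechanism on $e^{-\theta X}$, run at speed proportional to $K^{\beta}$.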
If we are to extract infinite variance superBrownian motion from an SLFV in an analogous way, we must have random event radii and, since the spatial motion is bound up in reproduction events, if in the limit the spatial motion is to be continuous, the impact will depend on the event size in a nontrivial way. In order to make the calculations tractable, we fix the impact to be a negative power of the radius of events and we take µ N ( dr) to be a truncated power law. The purpose of the truncation is two-fold: first, by bounding the radii below we force the impact of each event to lie in (0, 1]; second, by bounding radii above, we ensure that the jumps of lineages in the unscaled SLFV have finite moments of all orders. As for the fixed radius case, we should like the motion of a single ancestral lineage to converge to Brownian motion. A lineage will fall in the region affected by an event of (scaled) radius r/M at rate Nr d µ N (dr) in which case, with probability ρ/J (with ρ sampled from ν r (dρ)), it will make a mean zero, finite variance jump of size of order r/M. Substituting our chosen form of µ N (dr)ν r (dρ), we see that in order to obtain a Brownian limit for the motion of lineages, we should require convergence of Under our conditions on α, γ, this implies that N/(JM 2 ) should converge, just as in the fixed radius case. We now turn to recovery of the stable branching mechanism. When an event of radius r falls, the total mass of the offspring in the rescaled population process is K/(JM d ) times v(r) = C(d)r d−γ . Such events fall on a given point x, and an 'individual' at that point is selected as parent of the event, at rate (N/K)r α dr dx (a factor of C(d)r d in the rate at which events of radius r cover the point x has cancelled with the reciprocal of the same factor in the probability that x is the point selected uniformly at random from within the ball to be the location of the parent). 
Observing that $dv(r) = C(d)(d - \gamma)\,r^{d - \gamma - 1}\,dr$ (and thus substituting for $r^{\alpha}\,dr$), we see that for a given 'individual' at $x$, events in which it produces offspring with mass proportional to $v$ occur at rate proportional to $v^{-(\gamma - d + \alpha + 1)/(\gamma - d)}\,dv\,dx$. We need to match this with the $v^{-\beta - 2}$ of the non-spatial case close to $v = 0$, and so we take
\[ \frac{\gamma - d + \alpha + 1}{\gamma - d} = \beta + 2, \qquad \text{that is,} \qquad \beta = \frac{\alpha + 1}{\gamma - d} - 1. \]
A more careful version of the argument in the previous paragraph leads to Conditions 4 and 5. In fact Condition 4 can be understood in terms of the dimension of the limit in much the same way as the corresponding condition for the fixed radius case. In the stable case, the Hausdorff dimension of the support of the limiting superBrownian motion will be $d \wedge (2/\beta)$, so at least in high enough dimensions we expect to need $JM^d \sim KM^{2/\beta}$, which combined with convergence of $N/(JM^2)$ leads to Condition 4. Conditions (2.7), which follow from the same considerations as in the fixed radius case, ensure sufficient 'sparsity'.

Martingale characterisation of the process {X N t } t≥0
In this section we characterise the distribution of the scaled process {X N t } t≥0 as a solution to a martingale problem. It will be convenient to use different formulations of the martingale problem for our two scalings. First we need some notation. Suppose that φ ∈ C 3 0 (R d ) and that f ∈ C 2 0 (R). We use the notation for the infinitesimal generator of the measure-valued process X N t applied to test functions of the form F (X N t ) = f ( X N t , φ ), and, with a slight abuse of notation, L N (φ)(X N 0 ) for the corresponding quantity when f (x) ≡ x.
Notice that we have not assumed that φ has compact support. Because X N 0 has compact support, the rate at which that support is overlapped by a reproduction event is bounded, and such an event can only increase the volume of the support of X N by an amount bounded above by the volume of the event (which in turn is uniformly bounded). Iterating this argument and comparing to a pure birth process, it is evident that the support will remain bounded (indeed compact) up to any finite time, and so the rate of events affecting X N is bounded.
Writing down the generator is now standard as our process evolves according to a series of jumps of finite rate. Recall that at each point $(x, t, r, \rho) \in \Pi^N$, the scaled process $w^N$ is subject to the reproduction event associated with the quadruple $(x/M, t/N, r/M, \rho/J)$. At such an event, within the ball $B_{r/M}(x)$ we have two possibilities:
We write $B^M_r(x)$ for the ball $B_{r/M}(x)$ and $|B^M_r| = |B_r|/M^d$ for its volume. At the time of the first event to affect $X^N$, we have $w^N_{t-} = w^N_0$ and since, moreover, we only consider $\nu_r$ of the form $\delta_{u(r)}$, we find
Since $\{X^N_t\}_{t \ge 0}$ is a pure jump Markov process, driven by a Poisson process of jumps, it follows immediately that for $f$ and $\phi$ as above, defines a mean zero local martingale. In the variable radius case, we shall exploit this with $f(x) = \exp(-x)$ and non-negative $\phi$. In the fixed radius case, the following lemma, which follows immediately on setting $f(x) = x$ and $f(x) = x^2$, will provide a more convenient tool.
Lemma 3.1 The quantity X N t , φ has the semimartingale decomposition: The local martingale M N t (φ) has quadratic variation process 3) It will be convenient to extend the semimartingale decomposition to we deduce that this extension simply results in the additional term in the generator. In particular, we arrive at the following lemma.
Lemma 3.2 The quantity X N t , φ t has the semimartingale decomposition: φ s (y)X N s (y) dy dz µ N (dr) dx.

Identifying the limit
We now turn to identifying the possible limit points of the sequence of processes {X N } N ≥1 under our chosen scalings, deferring the proof of tightness of the sequence to Section 5.

Spatial motion
Although our scaling and limits are quite different, our calculations are reminiscent of those of Berestycki et al. (2013). Our aim is to find an approximate form of the martingale problem that is close to that for superBrownian motion. For simplicity we take φ to be constant in time. An interchange of integrals followed by an interchange of the rôles of x and y in our notation yields We use the Taylor expansion Again using Taylor's Theorem (and interchanging the role of x and y) this is Combining the above with our expression for L N (φ) from Lemma 3.1, for the fixed radius case we obtain We note that, using the same manipulation as in (4.1), ∆φ(y)X N s (x) dy dx, and so since φ ∈ C 3 b , a Taylor expansion yields, For the variable radius case, we integrate this expression against ν r (du)µ N (dr). Evidently it is straightforward to extend the calculation above to suitable time dependent φ · . We record the result as a lemma.

Fixed radius case
We now turn to the identification of the limit as $N \to \infty$ of the quadratic variation $\langle M^N(\phi)\rangle_t$ of (3.3) in the fixed radius case. Recall that Expanding the brackets we see that φ(y)X N s (y) dy 2 dz dx ds, which can be rearranged to yield Mimicking the manipulation that gave us (4.1), and using the regularity of $\phi$, this can be written where we used that if $z \in B^M_r(x)$ and $y \in B^M_r(x)$, then $|z - y| < 2/M$ and $X/K \le 1$ to estimate the error in replacing the second term in (4.2) by twice the third. The necessity of Conditions 1-3 of Theorem 2.6 is already evident from (4.3). This result will be sufficient for the proof of tightness, but more work will be needed to check that the second term on the right tends to zero as $N \to \infty$, and thus to identify the limiting quadratic variation as that corresponding to superBrownian motion. The proof of the following lemma, which we defer to Section 6, rests on the duality of Proposition 2.5.
Lemma 4.3 Under the conditions of Theorem 2.6, for any $\phi \in C^3_0(\mathbb{R}^d)$,
If we can prove that our sequence $\{X^N\}_{N\ge1}$ of processes is tight and that all limit points are martingales, then granted Lemma 4.3 (and some uniform integrability), Lemmas 4.2 and 4.3 allow us to identify the limit points as solutions to the martingale problem for finite variance superBrownian motion with $m$ and $\kappa$ as in the statement of Theorem 2.6.

Variable radius case
We now turn to the variable radius case. Since, if we are to have convergence to the superBrownian motion with stable branching, the quadratic variation of the previous subsection must be unbounded as $N \to \infty$, we instead turn our attention to $L^N_f(\phi)$ with $f(x) = e^{-x}$. We suppose that $\phi$ is non-negative and, for simplicity, independent of time. Substituting in (3.1), we obtain Taylor expansion of the exponential function yields from which, noting that, under the assumptions of Theorem 2.7, $K/(JM^d) \to 0$,

The analogue of Lemma 4.3 that we need in this context is
Lemma 4.4 Under the assumptions of Theorem 2.7, for φ ∈ C 3 0 (R d ), Once again the proof, which relies on duality, is deferred to Section 6.
Rearranging the expression for L exp(−·) (φ)(X N 0 ) and reversing the order of integration we find Consider A N φ(x). Again by Taylor expansion, we have where the last line follows by observing that φ(y) − φ(x) = O(r φ C 1 /M) for y ∈ B M r (x). We note that the conditions γ − d < 1 1−β and α + 1 = (β + 1)(γ − d) will imply ∞ 0 u(r) 2 r 2d+2 µ N (dr) < ∞. The calculations of Section 4.1 then allow us to write We now specialise to u(r) = r −γ and µ N (dr) := r α 1 dr as specified in Theorem 2.7. However, it should be clear that other choices would result in nontrivial scaling limits.
Substituting into our expression for $B(x)$ and making the change of variable $\upsilon = |B^M_r| K u(r)/J$ where $\kappa_0$ is a constant, and so to recover the superBrownian motion with stable branching law with index $\beta$ in this limit, we choose $\alpha + 1 = (\beta + 1)(\gamma - d)$ (to get the right exponent) and $J^{(\gamma-d)/\gamma} K/(JM^d) \to \infty$, to ensure that the upper limit of integration tends to infinity. That the lower limit of integration $K/(JM^d) \to 0$ was imposed already in order for $A^N\phi(x)$ to converge to a nontrivial limit. To ensure that the term premultiplying the integral in (4.5) is positive and finite, we take $\gamma - d > 0$ and assume that N Granted Lemma 4.4, we have now shown that under the conditions of Theorem 2.7, for any $\phi \in C^3_0(\mathbb{R}^d)$, as $N \to \infty$, where $m, \kappa \in (0, \infty)$ are as in the statement of the Theorem.
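The exponent matching behind the choice $\alpha + 1 = (\beta+1)(\gamma-d)$ can be checked by hand. The sketch below assumes $u(r) = r^{-\gamma}$ and that $\mu^N$ has density $r^{\alpha}$ on an interval of radii $[a_N, b]$; the endpoints $a_N$, $b$, the constant $V_d$ (the volume of the unit ball in $\mathbb{R}^d$) and the shorthand $c_N$ are notation introduced here for illustration and are not taken from the text.

```latex
% Change of variable, using |B^M_r| = V_d r^d / M^d:
\upsilon = \frac{|B^M_r|\,K\,u(r)}{J} = c_N\, r^{-(\gamma-d)},
\qquad c_N := \frac{V_d K}{J M^d}.
% Inverting (\gamma - d > 0):
r = (\upsilon/c_N)^{-1/(\gamma-d)}, \qquad
|dr| = \frac{1}{\gamma-d}\; c_N^{\frac{1}{\gamma-d}}\;
       \upsilon^{-\frac{1}{\gamma-d}-1}\, d\upsilon .
% Hence the density r^\alpha dr becomes
r^{\alpha}\,|dr| = \frac{1}{\gamma-d}\; c_N^{\frac{\alpha+1}{\gamma-d}}\;
                   \upsilon^{-\frac{\alpha+1}{\gamma-d}-1}\, d\upsilon
                 = \frac{1}{\gamma-d}\; c_N^{1+\beta}\; \upsilon^{-(2+\beta)}\, d\upsilon
\quad\text{when } \alpha+1 = (\beta+1)(\gamma-d).
% Scaling check that \upsilon^{-(2+\beta)} d\upsilon gives a (1+\beta)-stable mechanism:
\int_0^\infty \big(e^{-\lambda\upsilon} - 1 + \lambda\upsilon\big)\,
      \upsilon^{-(2+\beta)}\, d\upsilon
  = \lambda^{1+\beta} \int_0^\infty \big(e^{-s} - 1 + s\big)\, s^{-(2+\beta)}\, ds,
\qquad 0 < \beta < 1 .
```

The range $r \in [a_N, b]$ maps to $\upsilon \in [c_N b^{-(\gamma-d)},\, c_N a_N^{-(\gamma-d)}]$. Taking $b = 1$ and $a_N = J^{-1/\gamma}$ gives, up to the constant $V_d$, the lower limit $K/(JM^d)$ and the upper limit $J^{(\gamma-d)/\gamma}K/(JM^d)$ quoted above.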

Tightness
In this section we turn to the proof of tightness of the sequence {X N } N ≥1 in the space of càdlàg M F (R d )-valued processes. In the fixed radius case, we shall also check that all limit points are actually continuous processes.
To prove tightness, we appeal to a specialised version of Jakubowski's general criterion that we have taken from Cox et al. (2000). For a Borel set $A$, let $X^N_t(A) := \langle X^N_t, \mathbf{1}_A \rangle$.
Proposition 5.1 (Cox et al. (2000), is tight if and only if the following conditions hold: In fact it is convenient to slightly modify the statements of Conditions 1 and 2.
Corollary 5.2 The conclusion of Proposition 5.1 remains valid if Conditions 1 and 2 are replaced by:
Proof. We mimic the proof of Ethier and Kurtz (1986), Chapter 3, Corollary 7.4. Suppose that (5.2) is satisfied. Then, given $\varepsilon > 0$, there exist a compact set $K^0_{T,\varepsilon}$ and an integer $N_0$ such that for all $N > N_0$ For each $N \le N_0$ (cf. the argument at the beginning of Section 3 that $X^N_t$ has compact support for all $t \le T$), there is a set $K^N_{T,\varepsilon}$ such that Set $K_{T,\varepsilon} = \bigcup_{N=0}^{N_0} K^N_{T,\varepsilon}$ and Condition 1 of Proposition 5.1 is satisfied. The proof that 2′ implies 2 is similar.
We borrow a convenient formulation of the additional criterion for the limit points to be continuous from the same paper.
Corollary 5.3 (Cox et al. (2000), Corollary 3.2) If a sequence of measure-valued processes satisfies the conditions of Proposition 5.1 with $\Phi = C^\infty_0(\mathbb{R}^d)$ and for each $\phi \in \Phi$, every limit point of $\{\langle X^N_\cdot, \phi\rangle\}_{N\ge1}$ is supported on the space of continuous functions, then $\{X^N\}_{N\ge1}$ is tight and all limit points are continuous.
These two results reduce much of the work in proving tightness to an examination of the one-dimensional projections $\{\langle X^N_\cdot, \phi\rangle\}_{N\ge1}$. For this we shall make use of the calculations of Section 4 which showed, in particular, that $L^N(\phi)(X^N_t)$ is close to $\langle X^N_t, A^N\phi\rangle$ where $A^N$, defined in Definition 4.1, is $m(N)\Delta/2$. This will allow us to exploit the following elementary properties of the heat equation which, for convenience, we record as a lemma.
Lemma 5.4 Suppose that ψ : R d → R ∈ C 3 0 and that v is a classical solution to where the diffusion coefficient is a constant m ∈ (0, ∞).

1. If ψ is uniformly Lipschitz continuous with Lipschitz constant M, then for all t, v(t, ·) is Lipschitz continuous with Lipschitz constant M.
2. For any k ∈ N, for each t, the C k norm of v(t, x) (as a function of x) is bounded above by that of ψ.
3. If v solves the equation with initial data ψ then ∆v solves the same equation with initial data ∆ψ.
Notation 5.5 We denote by $P^{(m)}_t$ the heat semigroup with diffusion coefficient $m$, so that the solution to (5.4) can be written where $B_t$ is a standard Brownian motion.
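The three properties in Lemma 5.4 can be read off from the probabilistic representation of the solution. The sketch below assumes that (5.4) is the heat equation $\partial_t v = \tfrac{m}{2}\Delta v$ with $v(0,\cdot) = \psi$, so that $v(t,x) = \mathbb{E}[\psi(x + B_{mt})]$; this normalisation is an assumption made here for illustration, and the argument is unchanged under any other deterministic time change of $B$. The multi-index $\mathbf{j}$ is notation introduced for this sketch.

```latex
% Assumed representation: v(t,x) = E[\psi(x + B_{mt})].
% 1. The Lipschitz constant M of \psi is inherited by v(t,\cdot):
|v(t,x) - v(t,y)|
  \le \mathbb{E}\big[\,|\psi(x + B_{mt}) - \psi(y + B_{mt})|\,\big]
  \le M\,|x - y| .
% 2. Bounded derivatives of \psi pass through the expectation:
\partial^{\mathbf{j}}_x v(t,x)
  = \mathbb{E}\big[(\partial^{\mathbf{j}}\psi)(x + B_{mt})\big],
\qquad
\|\partial^{\mathbf{j}}_x v(t,\cdot)\|_\infty \le \|\partial^{\mathbf{j}}\psi\|_\infty .
% 3. In particular \Delta v has the same representation with data \Delta\psi:
\Delta v(t,x) = \mathbb{E}\big[(\Delta\psi)(x + B_{mt})\big].
```

Summing the bound in the second display over the relevant multi-indices gives the comparison of $C^k$ norms, whenever the derivatives of $\psi$ involved exist and are bounded.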

Verification of Condition 2 ′ of Corollary 5.2
Lemma 5.6 Under the conditions of either Theorem 2.6 or Theorem 2.7, for each T > 0, Proof. 1. We set χ N (t) = E[ X N t , 1 ]. Let ε > 0 and let {h R } R≥1 be a sequence of smooth, compactly supported functions on R d such that We can further arrange that h R are uniformly bounded in C 3 .
In the notation of Lemma 4.2, Taking expectations, We have used the fact that, under our assumptions, m(N) converges to bound the constant in front of the Laplacian, and the bound on the error E[ζ N t (h R )] from Lemma 4.2. Letting R ↑ ∞ and applying the Monotone Convergence Theorem, and applying Gronwall's inequality and using that ε was arbitrary, as required.
2. Define a sequence of stopping times by and set X N t := X N t∧τ N . Then 1 ], and proceed as above, but with the stopped martingale problem, to obtain Markov's inequality now gives Noting that P[sup t≤T X N t , 1 ≥ H] = P[ X N t , 1 ≥ H], we can now conclude, since for any T , ε > 0 we can choose H 1 , N 1 such that for H ≥ H 1 , N ≥ N 1 and t ≤ T the right hand side of (5.6) is less than ε.

Verification of Condition 1 ′ of Corollary 5.2
Lemma 5.7 Under the conditions of either Theorem 2.6 or Theorem 2.7, for each T , ε > 0 there is a compact set K T,ε ⊂ R d such that Proof.
Define h n (x) := h(x/n). It will evidently suffice to show that for sufficiently large n, We note that the C 3 norms of ∆h n are uniformly bounded in n, so that by Lemma 5.4 the C 3 norms of ∆P and this bound is uniform in n. In particular, using (5.5), We recall that We shall consider each term on the right hand side separately. The first term will be zero for sufficiently large n, since supp(X N 0 ) ⊆ D for all N, where D ⊆ R d is compact. The bound (5.10) will help us to control the integral term. First note that, since m(N) converges, where C is independent of n and N. Now set m := ⌊ n−1 2 ⌋.
where we used (5.10) to obtain (5.12). Under the assumptions of Theorem 2.6 or Theorem 2.7, $N/(JM^2) \to C_1$ and $M \to \infty$ as $N \to \infty$ and so we can take $N$ sufficiently large that the error term is less than $\varepsilon^2/32$.
Denoting standard Brownian motion by $\{B_t\}_{t\ge0}$, consider now which (by our assumptions that $\mathrm{supp}(X^N_0) \subseteq D$ for all $N$ and that $m(N)$ converges) tends to 0 as $m \to \infty$ uniformly in $N$, so in particular is less than $\varepsilon^2/32$ for sufficiently large $m$.
Combining the estimates above, we have shown that given ε > 0, there exist N 0 , n 0 such that for N ≥ N 0 and n ≥ n 0 , T 0 X N s , 1 ds and we can estimate the right hand side through (5.5) from Lemma 5.6 and see that for sufficiently large N it is bounded above by ε 2 /16. In order to control the martingale term we shall control E[|M N T (h n )|] and then apply Doob's inequality.
For sufficiently large $N$, the argument that gave us (5.13) allows us to bound (5.10) with $t = T$ by $\varepsilon^2/8$. Rearranging (5.11) and using the triangle inequality, given $\varepsilon > 0$, there exist $N_0$, $n_0$ such that uniformly in $n \ge n_0$, for $N \ge N_0$, $\mathbb{E}[|M^N_T(h_n)|] \le \varepsilon^2/4$. Therefore by Doob's maximal inequality, for any such $n$ and $N$, we have that Combining the estimates above with another application of Markov's inequality to the sum of the remaining terms on the right hand side of (5.11) we have verified (5.8) and the proof is complete.

Tightness of projections: fixed radius case
We shall verify Condition 3 of Proposition 5.1 separately for our two cases. In the fixed radius case it is relatively straightforward as the quadratic variation of the martingale part of our semimartingale decomposition (5.11) will remain bounded as N → ∞. In this section we shall deal with that case and moreover check the conditions of Corollary 5.3 so that we can deduce that all the limit points of {X N } N ≥1 are in fact continuous processes. We shall exploit some standard results that we record here for convenience.
Proposition 5.8 (Jacod and Shiryaev (2003), Chapter VI, Part of Proposition 3.26) A sequence $\{Y^N\}_{N\ge1}$ of càdlàg $\mathbb{R}^d$-valued processes is tight and all limit points of the sequence of laws of $Y^N$ are laws of continuous processes if and only if the following two conditions are satisfied:
Theorem 5.9 (Jacod and Shiryaev (2003), Chapter VI, Theorem 4.13) If for each $N$, $M^N$ is a locally square integrable local martingale, then sufficient conditions for the sequence $\{M^N\}_{N\ge1}$ to be tight are:
1. The sequence $\{M^N_0\}_{N\ge1}$ is tight.
2. The sequence $\{\langle M^N\rangle\}_{N\ge1}$ is tight with all limit points being continuous.
Proposition 5.10 (Jacod and Shiryaev (2003), Chapter VI, Part of Proposition 3.26) A sequence $\{Y^N\}_{N\ge1}$ is tight and all limit points of the sequence of laws of $Y^N$ are laws of continuous processes if and only if $\{Y^N\}_{N\ge1}$ is tight and for all $T > 0$, $\varepsilon > 0$,
Lemma 5.11 Under the assumptions of Theorem 2.6, for each $\phi \in C^3_0(\mathbb{R}^d)$, $\{\langle X^N_\cdot, \phi\rangle\}_{N\ge1}$ is tight and all the limit points are continuous.
1 , the first condition of Proposition 5.8 follows immediately from our verification of Condition 2 ′ of Corollary 5.2. To check the second condition of Proposition 5.8, we once again use the semimartingale decomposition and combining with and (5.6) we see that under the conditions of Theorem 2.6 both these terms satisfy Condition 2 of Proposition 5.8 and our problem is reduced to showing tightness with continuous limits of the martingale part. Theorem 5.9 tells us that it suffices to consider the quadratic variation.
Recall from (4.3) that Using Proposition 5.8 and our bounds on $\langle X^N_s, 1\rangle$ from Lemma 5.6, tightness of $\{\langle M^N(\phi)\rangle\}_{N\ge1}$ is immediate.
We now conclude through an application of Proposition 5.10. The probability that the support of $X^N$, which is compact, is simultaneously overlapped by two events of the SLFV is zero, and $\phi$ is bounded, so
Proposition 5.12 Under the conditions of Theorem 2.6, $\{X^N\}_{N\ge1}$ is tight and all limit points are continuous.
Proof. This is now immediate from Corollary 5.3.

Tightness of projections: variable radius case
Our proof in the fixed radius case breaks down in the variable radius setting since the quadratic variation of the martingale part of $\langle X^N_\cdot, \phi\rangle$ will grow without bound as $N \to \infty$. Instead we exploit the fact that the $(1+\theta)$th moments of $\sup_{0\le t\le T}\langle X^N_t, \phi\rangle$ will remain bounded for any $\theta < \beta$. We follow Dawson (1993) who used essentially the same argument to show that branching Brownian motion with stable branching law, suitably rescaled, converges to superBrownian motion with stable branching, although there are some extra layers of estimation in our setting.
We exploit the following elementary lemma, which we learned from a preliminary version of Dawson (1993), but we record here since it does not appear in the published version.
Lemma 5.13 There exist $c_1, c_2, c_3 \in (0, \infty)$ such that 4. For a non-negative random variable $Y$ and θ = −1, 5. For a non-negative random variable $Y$ and any $y \ge 1$,
Proof (sketch). For 4, note that $x^{1+\theta} = 1 + (1+\theta)\int_1^x y^{\theta}\,dy$, and so where the last line follows from reversing the order of integration. For 5, observe that where we have used 1 and 2.
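The interchange of integration used for part 4 can be written out as follows. This is a sketch of one form such a moment bound can take, assuming $\theta > -1$ and $Y \ge 0$; it is not claimed to coincide with the precise statement of part 4, which is not reproduced above.

```latex
% For x \ge 1:  x^{1+\theta} = 1 + (1+\theta)\int_1^x y^{\theta}\,dy.
% Applied to Y on the event \{Y \ge 1\}:
Y^{1+\theta}\,\mathbf{1}_{\{Y \ge 1\}}
  = \mathbf{1}_{\{Y \ge 1\}}
    + (1+\theta)\int_1^\infty y^{\theta}\,\mathbf{1}_{\{Y \ge y\}}\,dy .
% Taking expectations and reversing the order of integration (Tonelli):
\mathbb{E}\big[Y^{1+\theta}\,\mathbf{1}_{\{Y \ge 1\}}\big]
  = \mathbb{P}[Y \ge 1] + (1+\theta)\int_1^\infty y^{\theta}\,\mathbb{P}[Y \ge y]\,dy .
% Since Y^{1+\theta} \le 1 on \{Y < 1\}:
\mathbb{E}\big[Y^{1+\theta}\big]
  \le 1 + (1+\theta)\int_1^\infty y^{\theta}\,\mathbb{P}[Y \ge y]\,dy .
```

A tail bound of the form $\mathbb{P}[Y \ge y] \le C y^{-(1+\beta)}$ for $y \ge 1$ then makes the integral finite exactly when $\theta < \beta$, which is the range of moments used in Lemma 5.14.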
In the classical setting of Dawson, at this stage one exploits the fact that the approximating processes are branching processes, in exact duality with the solution to a deterministic evolution equation. This is no longer true in our setting, so instead our first task is to write down an approximate evolution equation, whose solution we denote by $v^N_\phi$, with the property that From our calculations in Section 4.3 (in particular equations (4.4) and (4.5)), and assuming the result of Lemma 4.4, we can write (5.16) This is the evolution equation corresponding to a superprocess in which the spatial motion is the compound Poisson process with generator $L^N$ and the branching mechanism is determined by $B^N$. The existence of such a process (which then, by duality, guarantees the uniqueness of the solution of the evolution equation) follows from, for example, El Karoui and Roelly (1991). Let us write $\{S^N_t\}_{t\ge0}$ for the semigroup generated by $L^N$. Then the solution to (5.16) can be written An immediate consequence of (5.15) is that we have an approximate duality between $X^N$ and $v^N_\phi$ since and so integrating over $[0, t]$, where we used Lemma 5.6 to replace $\mathbb{E}[\int_0^t \langle X^N_s, 1\rangle\,ds]$ by $\langle X^N_0, 1\rangle$ at the expense of replacing $\varepsilon_N$ by $\varepsilon'_N$, but still with $\varepsilon'_N \to 0$ as $N \to \infty$, and we used (5.17) to see that $\|v^N_\phi(t-s)\|_\infty \le \|\phi\|_\infty$. In particular, applying this with $\lambda\phi$ in place of $\phi$ and differentiating with respect to $\lambda$ at $\lambda = 0$ gives The key step in proving tightness is the following:
Lemma 5.14 For $\{X^N\}_{N\ge1}$ satisfying the conditions of Theorem 2.7 and $\phi \in C^\infty_0$ we will have that, for any fixed $0 < \theta < \beta$, where $H$ is independent of $N$.
Proof. In the proof of this lemma, $C$ is a constant which is independent of $N$, but may change from line to line. First observe that, using Lemma 5.13 parts 4 and 5, where we have used (5.18) and (5.5) and the constant $C$ is determined by those in Lemma 5.13. Using Part 3 of Lemma 5.13, The first term is at most $c_2 \|\lambda\phi\|_\infty^{1+\beta} \langle X^N_0, 1\rangle^{1+\beta}$. To control the remaining terms, note that (where we have used (5.17)). Now for non-negative $\phi$, under the assumptions of Theorem 2.7, using (4.5), so that using (5.17) again, and since $\theta < \beta$ the right hand side is finite. An application of the Monotone Convergence Theorem yields with $H' < \infty$ independent of $t \le T$.
We have now established a bound of the form The proof now follows the pattern established in Lemmas 5.7 and 5.11. Taking the terms in the semimartingale decomposition of $\langle X^N_t, \phi\rangle$ one by one, first observe that by Jensen's inequality, which we can bound using (5.20) and the fact that, from Lemma 5.4, $|A^N\phi| \le m(N)\|\phi\|_{C^3}$. Rearranging the semimartingale decomposition to give an expression for $M^N_t(\phi)$ and combining with Minkowski's inequality we can bound $\mathbb{E}[|M^N_T(\phi)|^{1+\theta}]$, and then by Doob's martingale inequality we see that and, since we have established uniform bounds on each of the terms on the right hand side of this expression, this completes the proof.
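For reference, the form of Doob's inequality that fits the last step is the $L^p$ maximal inequality with $p = 1+\theta > 1$. The display below is a sketch that assumes $M^N(\phi)$ is a càdlàg martingale on $[0,T]$ with finite $(1+\theta)$th moment (for a local martingale one would first localise); it is not claimed to be the exact display used in the proof.

```latex
% Doob's L^p maximal inequality, p > 1:
%   E[ sup_{t \le T} |M_t|^p ] \le (p/(p-1))^p \, E[ |M_T|^p ].
% With p = 1 + \theta and \theta > 0:
\mathbb{E}\Big[\sup_{0 \le t \le T} \big|M^N_t(\phi)\big|^{1+\theta}\Big]
  \le \Big(\frac{1+\theta}{\theta}\Big)^{1+\theta}\,
      \mathbb{E}\Big[\big|M^N_T(\phi)\big|^{1+\theta}\Big].
```

The constant depends only on $\theta$, so a bound on $\mathbb{E}[|M^N_T(\phi)|^{1+\theta}]$ that is uniform in $N$ transfers directly to the supremum.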
To conclude the proof of tightness of {X N } N ≥1 we shall use a criterion due to Aldous.
Theorem 5.15 (Aldous (1978)) A sequence of càdlàg real-valued processes $\{Y^N\}_{N\ge1}$ is tight if and only if the following two conditions hold:
1. for each fixed $t$, $\{Y^N_t\}_{N\ge1}$ is tight in $\mathbb{R}$;
2. for any $\varepsilon > 0$, given a sequence of stopping times $\tau_N$ bounded above by $T$ and a sequence of real numbers $\delta_N \to 0$ as $N \to \infty$,
The first condition is trivially satisfied, and so it remains to prove the following lemma.
Tightness is now proved.

Proofs of Lemmas 4.3 and 4.4
In this section we exploit the duality of the SLFV with a process of coalescing lineages to prove the two key estimates provided by Lemmas 4.3 and 4.4. In the fixed radius case, it is straightforward to make the heuristic argument of Section 2.4 rigorous and analogous arguments can be found in, for example, Etheridge et al. (2015). However, in the variable radius case, we must work a little harder to recover the result claimed here. We present an approach that works in either setting, but to avoid repetition restrict ourselves to the variable radius case.
J|ε N,1 t − ε N,2 t | 2 ] = O 1 . (6.4) Before proving this result, let us see why it helps. The error terms ε N,i t are very far from being independent, but they are very small. Notice in particular that if ξ N,1 and ξ N,2 have coalesced by time t, then since Kw N 0 = X N 0 converges weakly to X 0 , R d R d ψ(y, z)E (y,z) [w N 0 (ξ N,1 t )] dy dz, is of order 1/K, which when multiplied by K 2 to give us the contribution to (6.1) is O(K). If, on the other hand, the two lineages have not coalesced, then they are close to independent random walks and Under the assumptions of Theorem 2.6, (KN)/(J 2 M d ) → C 2 as N → ∞, and so the proof of Lemma 4.3 is reduced to checking that the coalescence probability tends to zero as N → ∞. Under the conditions of Theorem 2.7, In d ≥ 2, since (1 − β)(γ − d)/γ < 1, we easily see that this tends to zero as N → ∞. The sparsity condition (2.7) guarantees that the same is true in d = 1 and so the contribution to the expression in Lemma 4.4 from lineages that have not coalesced is negligible. On the other hand, multiplying the expression in (6.5) by K we obtain an expression of order which grows without bound as N → ∞ (because of Condition 5 of Theorem 2.7). To prove Lemma 4.4 we must show that this quantity multiplied by the coalescence probability tends to zero. In d ≥ 2, under the conditions of Theorem 2.7, the coalescence probabilities will be of the same order as in the fixed radius case, and so the Lemma will follow easily (again using (1 − β)(γ − d)/γ < 1). In one dimension, things are more delicate, and we shall see the need for our more stringent sparsity condition.
Proof of Proposition 6.1. We rewrite the Poisson process of events $\Pi^N$ that drives the dynamics of $\xi^{N,1}$ and $\xi^{N,2}$ as the sum of four components through a thinning. Each event $(x, t, r, \rho) \in \Pi$ is augmented by two independent Bernoulli random variables, $\eta_1$, $\eta_2$, each with success probability $\rho = u(r) = r^{-\gamma}$. The random variable $\eta_i$ determines whether or not the $i$th ancestral lineage is affected by the event. We now let $\Pi^{N,1}$ be the events in $\Pi$ for which $\eta_1 = 1$, $\eta_2 = 0$; $\Pi^{N,2}$ is the subset of events with $\eta_1 = 0$, $\eta_2 = 1$; and $\Pi^{N,1,2}$ consists of the events with $\eta_1 = \eta_2 = 1$. The remaining events do not affect the motion of either lineage.
Proposition 6.2 Let the scaled ancestral lineages $\xi^{N,1}$, $\xi^{N,2}$ be as in the previous subsection. We start them from two points sampled independently at random from $B^M_r(0)$. The probability that they have coalesced by time $T$, which we shall denote by $p_N(T)$, satisfies (6.6) In the light of the discussion immediately after the statement of Proposition 6.1, this result will complete the proof of our key lemma in the variable radius case.
Proof of Proposition 6.2. Our aim is to show that is bounded by the quantities on the right hand side of (6.6). where $h(r)$ is given by (6.2).
To do this, we shall use the coupling of Proposition 6.1 to approximate $|\xi^{N,1} - \xi^{N,2}|$ by $|Y^{N,1} - Y^{N,2}|$ and then the classical Fourier transform approach of Darling and Kac (1957) to estimate the corresponding expected integral of the hazard rate. Since we are only interested in the separation of our lineages we denote $Y^N := |Y^{N,1} - Y^{N,2}|$ and $\varepsilon^N := |\varepsilon^{N,1} - \varepsilon^{N,2}|$. We first observe that for some $\theta(s) \in (0, 1)$ and that $h'(r) = C h(r)\,\frac{1}{r}\,\mathbf{1}_{[J^{-1/\gamma},1]}(r) \le J^{1/\gamma} h(r)$; $h''(r) = C' h(r)\,\frac{1}{r^2}\,\mathbf{1}_{[J^{-1/\gamma},1]}(r) \le J^{2/\gamma} h(r)$. (6.9) We now begin with $d \ge 2$. Using (6.3), we see that The random walk $Y^N$ jumps at exponentially distributed times with mean $O(1/M^2)$, and so will make $O(M^2)$ such jumps by time $T$. It is evidently enough to show that is bounded by the quantities on the right of (6.6) where $T_N$ is the random time at which the walk $Y^N$ has made a geometric number of jumps with mean $M^2$. Let us write $p(x, y)$ for the probability density function of the unscaled displacement $MY^N$ at a single jump