Optimisation-based representations for branching processes

It is shown that a certain functional of a branching process has representations in terms of both a maximisation problem and a minimisation problem. A consequence of these representations is that upper and lower bounds on the functional can be found easily, yielding a non-asymptotic Trotter product formula. As an application, the speed of the right-most particle of a branching Lévy process is calculated.


Introduction
Consider a branching process {X^i_t : i ∈ I_t, t ≥ 0} constructed as follows. Initially, there is one particle sitting at a point x_0 in a Polish space X. The position of the particle then evolves according to the law of a given right-continuous strong Markov process X started from X_0 = x_0. At time T > 0, the initial particle is killed and replaced with N particles, where both T and N are random. Each of these new particles then moves and branches as an independent copy of the initial particle, except that each new particle now starts from the final position X_T of the initial particle. We assume that the conditional law of the first branching time T given the Markov process X is P(T > t | X) = exp(−∫_0^t λ(X_s) ds) for all t ≥ 0, where X_t denotes the position at time t of the initial particle if it were allowed to continue living after the branching event.
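To make the construction above concrete, the branching mechanism can be sketched in a few lines of code. Everything here, the function names, the discretisation of the motion, and the thinning of the branching clock, is our own illustrative choice and not part of the paper:

```python
import random

def simulate_branching(x0, t, lam, offspring, step, dt=0.01, seed=0):
    """Sketch of the branching mechanism: each particle moves via `step`,
    and in a small interval of length dt it branches with probability
    approximately lam(x) * dt, which discretises the conditional law
    P(T > t | X) = exp(-int_0^t lam(X_s) ds)."""
    rng = random.Random(seed)
    particles = [x0]                  # positions of particles currently alive
    n_steps = int(round(t / dt))
    for _ in range(n_steps):
        nxt = []
        for x in particles:
            x = step(x, dt, rng)               # spatial motion over the interval
            if rng.random() < lam(x) * dt:     # branching event (dt assumed small)
                nxt.extend([x] * offspring(rng))   # N children start from X_T
            else:
                nxt.append(x)
        particles = nxt
        if not particles:             # population extinct
            break
    return particles
```

For instance, dyadic branching Brownian motion corresponds to `offspring` returning 2 and `step` adding a centred Gaussian increment of variance dt.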
We will assume that the branching rate λ(x) and the mean number of new offspring E(N | X_T = x) per branching event are bounded functions of x ∈ X. This is a sufficient condition for the branching process not to explode in finite time, in the sense that the number of particles |I_t| is almost surely finite for all t ≥ 0. See, for instance, the book of Athreya & Ney [1, Theorem III.2.1]. Our main result is the following: Theorem 1.1. Let F be the filtration generated by X. Let Z be the set of bounded adapted processes, let Z• be the set of bounded anticipative processes, and let M be the set of non-negative martingales.

Remark 1.2.
We are using the convention that all real processes have measurable sample paths, so that the pathwise integrals appearing in the statement of Theorem 1.1 are well-defined.

Remark 1.3. The proof will show that it is possible to replace the function h appearing in the statement of Theorem 1.1 with a function h̃ so long as h̃(x, z) ≥ h(x, z) for all (x, z) and h̃(x, z) = h(x, z) when λ(x)P(N = 1 | X_T = x) ≤ z ≤ λ(x)E(N | X_T = x). For instance, we may take h̃

The full proof of this result appears in Section 2. To put the above optimisation-based representations into context, we jump ahead a bit. The rough idea behind the equality u = M appearing in Theorem 1.1 is that the value function of the stochastic optimal control problem defining M should satisfy the Bellman equation of the problem. However, we have chosen the data of the control problem in such a way that the associated Bellman equation is, essentially, the S-equation (in the terminology of Ikeda-Nagasawa-Watanabe [12, equation (4)]) of the branching process. Although we have not found this done explicitly in other papers, we acknowledge that, in this respect, the result may not be surprising.
In contrast, the dual minimisation problem defining m is not in the form of a standard stochastic control problem, and so the usual dynamic programming arguments do not apply. In particular, there is no conventional Bellman equation for this minimisation problem. Our formulation of the dual problem is inspired by the pathwise stochastic control approach of Rogers [14]. The general formulation is a bit cumbersome, involving both a maximisation over anticipative processes and a minimisation over martingales.
EJP 25 (2020), paper 143. Page 2/15 https://www.imstat.org/ejp

However, in the special case of dyadic branching, when the number of offspring is the constant N = 2, the pathwise maximisation problem can be solved explicitly, yielding the following corollary: Corollary 1.4. With the notation of Theorem 1.1, suppose N = 2 almost surely. Then
A proof of this fact will be given in Section 2. We find it somewhat surprising (maybe even mysterious) that the maximisation problem appearing in Corollary 1.4 is related to a dyadic branching process.
A consequence of the connection between branching processes and the various optimisation problems is that lower and upper bounds of certain functionals of the branching process can be derived immediately, simply by evaluating the objective functions of the optimisation problems at feasible controls. In particular, this technique can be used in principle to derive asymptotic estimates on the behaviour of the branching process.
As an illustrative application, we consider the branching process where X is a real-valued Lévy process, and where the rate of branching λ is a positive constant and the distribution of the number of offspring N is independent of the position of the particles. Let K be the cumulant generating function of the underlying Lévy process, defined by E_x[e^{θX_t}] = e^{θx + tK(θ)} for all t ≥ 0.
Suppose that K is finite in a neighbourhood of θ = 0. Recall that by the Lévy-Khintchine formula we have, for some constants b, σ and measure ν, where we are supposing that ∫ (e^{θy} ∧ y²) ν(dy) < ∞ for all θ in some neighbourhood of θ = 0. Theorem 1.5. Let µ = E(N) − 1 be the mean net number of new particles created at a branching event and suppose µ > 0. Conditional on the event {I_t ≠ ∅ for all t ≥ 0} that the branching Lévy process never becomes extinct, we have

Remark 1.6. The condition µ > 0 is necessary and sufficient for supercriticality of the branching process, that is, P(I_t ≠ ∅ for all t ≥ 0) > 0, but our precise formulation seems new and requires fewer assumptions. More importantly, our proof will be rather different, using estimates derived from Theorem 1.1, rather than renewal theory.
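Anticipating Section 4, where the speed q is pinned down by the requirement that qθ − K(θ) ≤ γ for all θ > 0 (with γ = λµ), that characterisation reads q = inf_{θ>0} (γ + K(θ))/θ, and it can be evaluated numerically. The helper below is a sketch of our own; for Brownian motion, K(θ) = θ²/2 and γ = 1, it recovers McKean's classical speed √2 for dyadic branching Brownian motion:

```python
def wave_speed(K, gamma, lo=1e-9, hi=100.0, iters=300):
    """Evaluate q = inf_{theta > 0} (gamma + K(theta)) / theta by ternary
    search; the objective is unimodal because it is the slope of the chord
    from (0, -gamma) to (theta, K(theta)) and K is convex with K(0) = 0."""
    f = lambda theta: (gamma + K(theta)) / theta
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) <= f(m2):
            hi = m2
        else:
            lo = m1
    return f(0.5 * (lo + hi))
```

Adding a drift b to the Brownian case, K(θ) = bθ + θ²/2, shifts the speed to b + √(2γ), as one would expect by translation.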
The remainder of the paper is structured as follows. Section 2 contains the proof of Theorem 1.1. The key ingredient is a more general representation result given by Theorem 2.1. Section 3 gives the main take-away implications of Theorem 2.1: easy-to-apply bounds on the solutions of certain reaction-diffusion-type equations. Section 4 contains the proof of Theorem 1.5, which finds the speed of the right-most particle of a branching Lévy process.

Proof of Theorem 1.1
In this section, we prove Theorem 1.1. We first prove a more general result. As in the introduction, let X be a right-continuous strong Markov process valued in a Polish space X. As in Theorem 1.1, we let Z, Z• and M be the sets of bounded adapted processes, bounded anticipative processes and non-negative martingales, respectively. Theorem 2.1. Let φ : X × R → R be measurable and such that φ(x, ·) is concave and differentiable with a derivative bounded uniformly in x ∈ X. Suppose the bounded
For fixed (t, x), a minimiser is given by the adapted control
For fixed (t, x), a maximiser is given by the non-negative martingale
For the martingale ζ*, the essential infimum is attained for the control z* = Z*.
Remark 2.2. Formally, the differential form of the integral equation appearing in Theorem 2.1 is
where L is the infinitesimal generator of the Markov process X. The hypothesis can be reworded to say that v is a mild solution of the above differential equation.
We also note in passing that when X is a diffusion process in finite-dimensional Euclidean space, so that L is a second order differential operator, the semi-linear partial differential equation is of the reaction-diffusion type.
Under our assumption that φ(x, ·) is uniformly Lipschitz, one can show by a standard Picard iteration argument that, given a bounded initial condition v(0, ·), the integral equation has a unique bounded solution.
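As a sanity check of the Picard argument, one can iterate the integral map in the trivial case of a one-point state space, where the transition operator is the identity and the mild equation collapses to an integral equation for an ODE. This toy sketch is our own and not from the paper:

```python
def picard_solve(phi, v0, t, n_grid=4000, n_iter=50):
    """Picard iteration for the integral equation in the simplest case:
    the Markov process X sits at a single point, so the equation reduces to
        v(t) = v0 + int_0^t phi(v(r)) dr.
    Each sweep replaces v by the right-hand side evaluated on a time grid;
    for Lipschitz phi the sweeps converge to the unique fixed point."""
    dt = t / n_grid
    v = [v0] * (n_grid + 1)           # initial guess: the constant v0
    for _ in range(n_iter):
        new, acc = [v0], 0.0
        for k in range(n_grid):
            acc += phi(v[k]) * dt     # left-endpoint Riemann sum
            new.append(v0 + acc)
        v = new
    return v[-1]                      # approximation of v(t)
```

With phi(v) = v and v0 = 1 the fixed point is v(t) = e^t, and the sweeps converge factorially fast since the contraction constant over [0, t] is t^n/n!.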
The key observation is that
Note that the two pathwise Lebesgue integrals on the right-hand side are well-defined, though the second one might take the value −∞. Indeed, the integrand in the first integral is Lebesgue integrable almost surely, since by the assumed
with equality if Z = Z*. Note that Z* is bounded, and hence feasible, by the assumed uniform boundedness of ∂φ/∂v. This proves that v(t, x) is the value of the minimisation problem.
Now consider the max-min problem. Fix a non-negative martingale ζ and note that by Fubini's theorem and iterating expectations we have Since ζ is arbitrary, computing the supremum of the right-hand side yields the lower bound.
It remains to show that there is no duality gap. We now assume φ(x, 0) = 0 for all x. Under the uniform Lipschitz assumption, the function θ is bounded.
Note that by Fubini's theorem
so ζ* is a non-negative bounded martingale. Similarly to the key observation above, we have for any anticipative process z that
To prove Theorem 1.1 we need one more ingredient, the so-called S-equation, due to Skorokhod [15, equation (4)]. We provide a proof for completeness.
Theorem 2.3. Let {X^i_t : i ∈ I_t, t ≥ 0} be the branching process described in the introduction. Fix a measurable f : X → [0, 1] and for all (t, x) ∈ R+ × X, let
be the conditional probability generating function of the offspring distribution. Then we have
∫_0^t e^{−∫_0^s λ(X_r) dr} λ(X_s) G(X_s, u(t − s, X_s)) ds.
By assumption, the function λ is bounded and hence the pathwise integral on the right-hand side is integrable, with mean zero. Since E x (M t ) = M 0 = u(t, x) we are done.

Remark 2.4.
McKean [13] noted that the solution of the FKPP equation (

Bounding solutions
In this section, we explore a simple consequence of Theorem 2.1. We now assume that the non-linearity φ appearing in the integral equation is such that there is a concave, differentiable and Lipschitz function φ̂ : R → R such that φ(x, η) = φ̂(η) for all (x, η) ∈ X × R. In order to avoid overburdening the notation, we will drop the hat and simply write this function as φ.
We also introduce the following notation. We let V_t be the operator that sends a bounded measurable function v_0 : X → R to v(t, ·), where v is the solution of the integral equation with initial condition v(0, ·) = v_0. We let P_t be the transition operator of the Markov process, so that P_t f(x) = E_x[f(X_t)] for all bounded measurable f. Finally, we let R_t(r_0) = V_t(r_0 1) for all (t, r_0) ∈ R+ × R, where 1(x) = 1 for all x ∈ X. That is to say, if r : R+ → R solves the ordinary differential equation ṙ = φ(r), r(0) = r_0, then R_t(r_0) = r(t). Now let R_t be the operator acting pointwise by (R_t v)(x) = R_t(v(x)).

The main result of this section is the following non-asymptotic form of the Trotter product formula:
where L is the generator of the Markov process X. The 'diffusion' term corresponds to the Markov (linear) semigroup (P_t)_{t≥0} generated by L, while the 'reaction' term corresponds to the non-linear semigroup (R_t)_{t≥0} generated by the concave (state-independent) function φ. Finally, (V_t)_{t≥0} is the non-linear 'reaction-diffusion' semigroup generated by the sum L + φ.

An interesting reformulation of Corollary 3.1 is
Similarly, letting (Z*_s)_{0≤s≤t} be the minimiser of the minimisation problem in Theorem 2.1, we
Now, note that each of the operators P_t, R_t and V_t is increasing. In particular, we
The same argument works for the upper bound. Induction completes the proof.

Consider now the nonlinearity φ(η) = λ(G(η) − η), where λ > 0 is constant and G is the probability generating function of a non-negative integer-valued random variable N. This corresponds to the case of a branching process with a constant branching rate λ in which the distribution of the number of particles N produced at a branching event is independent of the event's location. In this case, we have the formula
where I_t is the set of particles alive at time t ≥ 0.
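The splitting behind the Trotter product formula can be illustrated numerically for an FKPP-type example: the reaction flow R_δ is solved exactly for the concave logistic nonlinearity φ(v) = γv(1 − v), while the transition operator P_δ is replaced by an explicit finite-difference heat step on a periodic grid. All discretisation choices below are illustrative assumptions of ours:

```python
import math

def reaction_flow(r0, gamma, dt):
    """Exact flow R_dt for the concave logistic reaction phi(v) = gamma*v*(1-v)."""
    e = math.exp(gamma * dt)
    return r0 * e / (1.0 - r0 + r0 * e)

def diffusion_step(v, c):
    """Explicit finite-difference step standing in for the heat semigroup P_dt
    on a periodic grid; c = D*dt/dx^2 must stay below 1/2 for stability."""
    n = len(v)
    return [v[i] + c * (v[(i - 1) % n] - 2.0 * v[i] + v[(i + 1) % n])
            for i in range(n)]

def lie_splitting(v0, gamma, diff, t, n):
    """Approximate the reaction-diffusion semigroup V_t by the Trotter-style
    product (P_{t/n} R_{t/n})^n applied to the profile v0."""
    dt = t / n
    v = list(v0)
    for _ in range(n):
        v = diffusion_step(v, diff * dt)
        v = [reaction_flow(x, gamma, dt) for x in v]
    return v
```

Both factors preserve the interval [0, 1], so the split product does as well, and refining n changes the answer only at first order in t/n.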

An application to a branching Lévy process
In this section, we prove Theorem 1.5. Recall that here the branching process {X^i_t : i ∈ I_t, t ≥ 0} is constructed from a real-valued Lévy process X starting from X_0 = x_0. Recall also that the branching rate is a positive constant λ and that the distribution of the number of particles N produced at a branching event is independent of the position of the particles. Recall also that the cumulant generating function K of X is assumed finite in a neighbourhood of the origin. In what follows, we will let X̂ be the Lévy process with the transition distribution of −X. Note that the function K plays the role of the Laplace exponent of X̂: for all t ≥ 0, where here the subscript x denotes conditioning on the event {X̂_0 = x}.
The key step of our proof of Theorem 1.5 is the following proposition: Before we prove Proposition 4.1, we show how it can be used to find the asymptotic speed of the right-most particle. Proof of Theorem 1.5. Let u(t, ·) be the distribution function of M_t = sup_{i ∈ I_t} X^i_t, with M_t = −∞ when I_t is empty. Given the result we are trying to prove, we may assume that the position of the initial particle is x_0 = 0. Note that by the translational invariance of the transition distribution of a Lévy process
According to Theorem 2.3 applied to the branching process {X^i_t : i ∈ I_t, t ≥ 0} we have
Note that g is convex with g(1) = 0 and g(0) = P(N = 0) ≥ 0. By the assumption that E[N] > 1, we have g′(1) > 0 and hence there exists a smaller root 0 ≤ α < 1 such that g(α) = 0. Note that v = 1 − u satisfies the conditions of Proposition 4.1 with β = 1 − α. Now recall that α = P(E), where E = {I_t = ∅ for some t > 0} is the event that the population becomes extinct.
the conclusion follows since
This shows that for any ε > 0 we have
The rest of this section contains the proof of Proposition 4.1. The case where X̂ is degenerate, in the sense that X̂_t = x + bt for a constant b, is immediate. Therefore, we will assume without loss of generality that X̂ is non-degenerate, so that K′′(0) = Var(X̂_1) > 0.
Of the two bounds, the upper bound is easier to obtain. Using the n = 1 case of
By the concavity of φ we have φ(v) ≤ γv, and hence by Grönwall's inequality R_t(r_0) ≤ r_0 e^{γt}.
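The comparison φ(v) ≤ γv and the resulting Grönwall bound R_t(r_0) ≤ r_0 e^{γt} are easy to check numerically; the sketch below uses an arbitrary concave nonlinearity of our own choosing, not one from the paper:

```python
def ode_flow(phi, r0, t, n=20000):
    """Forward-Euler approximation of R_t(r0), the time-t flow of r' = phi(r)."""
    dt = t / n
    r = r0
    for _ in range(n):
        r += phi(r) * dt
    return r
```

For phi(v) = γv − v², which is concave and satisfies phi(v) ≤ γv for v ≥ 0, the flow stays below the linear benchmark r_0 e^{γt}, which the linear case phi(v) = γv attains.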
For the lower bound, we will introduce some more notation. Let F_t(y) = P_0(X̂_t ≤ y) be the conditional distribution function of the random variable X̂_t given X̂_0 = 0. Note that by spatial homogeneity of the Lévy process, we have
The key estimates are the following:
where δ = t/n.

Remark 4.3.
It is interesting to note that Lemma 4.2 actually holds with no assumption on the law of the Lévy process. In particular, it holds for processes, such as stable processes, for which the Laplace exponent K(θ) is infinite for all θ ≠ 0.
Proof of Lemma 4.2. We fix δ and use induction on n. We first consider the n = 1 case. Since the points 0 and β ≤ 1 are fixed points of φ, we have R δ (0) = 0 and R δ (1) ≥ β.
In particular, we have
To do the inductive step, we will make use of the following observation: for any
In this notation, we must prove that for δ large enough. To do this, note that the limit of the left-hand side as δ → ∞ is c/γ < 1 by l'Hôpital's rule.
In particular, we may invoke Cramér's large deviation principle to conclude that, as δ → ∞,
where the large deviation rate function K̂ is the Legendre transform of K, defined by K̂(r) = sup_θ (θr − K(θ)). Hence, it is enough to show that K̂(r) < γ. Now, since r > K′(0) = −E_0(X̂_1), there exists an ε > 0 such that r > K′(ε), since K′ is continuous and increasing in a neighbourhood of θ = 0. By the convexity of K we have the inequality
The conclusion follows since qθ − K(θ) ≤ γ for all θ > 0 by the definition of q.
Since b < β and ε > 0 are arbitrary, the conclusion follows.
Remark 4.6. It is possible to express the speed q of the travelling wave front in several ways. The above proof shows that q can be rewritten as q = sup{r : K̂(r) < γ}, where K̂ is the Legendre transform of K. This formulation for the speed of the right-most particle appears in the paper of Biggins [3] and, more recently, in the paper of Groisman & Jonckheere [8].
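The agreement between the two expressions for the speed, q = sup{r : K̂(r) < γ} and q = inf_{θ>0} (γ + K(θ))/θ, can be verified numerically in the Brownian case K(θ) = θ²/2, where K̂(r) = r²/2 and both expressions equal √(2γ). The grid-based helpers below are illustrative sketches only:

```python
THETAS = [0.02 * i for i in range(1, 1001)]     # theta grid on (0, 20]

def legendre(K, r):
    """Grid approximation of the Legendre transform
    K_hat(r) = sup_theta (theta * r - K(theta))."""
    return max(th * r - K(th) for th in THETAS)

def speed_sup_form(K, gamma, r_max=10.0, steps=1000):
    """Approximate sup{ r : K_hat(r) < gamma } by scanning a grid of r values."""
    best = 0.0
    for i in range(1, steps + 1):
        r = r_max * i / steps
        if legendre(K, r) < gamma:
            best = r
    return best

def speed_inf_form(K, gamma):
    """Approximate inf_{theta > 0} (gamma + K(theta)) / theta over the same grid."""
    return min((gamma + K(th)) / th for th in THETAS)
```

The two answers differ only by the grid resolution, consistent with the duality between the chord-slope formula and the sublevel set of the Legendre transform.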
Following an idea in the paper of Hiriart-Urruty & Martínez-Legaz [9], an inverse to the function K̂ can be calculated as follows. First, define a new function K• by the
Note that the function K• is convex, and indeed, it is related to the perspective function of the Laplace exponent K. Define its Legendre transform K̂• in the usual fashion
Then it can be shown that an inverse function to K̂ is the function −K̂•. In particular, the speed q can be rewritten as q = −K̂•(γ).
Simplifying the above formula recovers the formula in Proposition 4.1.

Remark 4.7.
Consider the case of dyadic branching Brownian motion, where N = 2 and K(θ) = θ²/2. Letting m(t) be the median, defined by
It would be interesting to see if, by optimising over the free parameters b and n, it is possible to recover the log t term in the lower bound as well.