Haldane linearisation done right: Solving the nonlinear recombination equation the easy way

The nonlinear recombination equation from population genetics has a long history and is notoriously difficult to solve, both in continuous and in discrete time. This is particularly so if one aims at full generality, thus also including degenerate parameter cases. Due to recent progress for the continuous time case via the identification of an underlying stochastic fragmentation process, it became clear that a direct general solution at the level of the corresponding ODE itself should also be possible. This paper shows how to do it, and how to extend the approach to the discrete-time case as well.


Introduction
The recombination equation is a well-known dynamical system from mathematical population genetics [15,10,9,3], which describes the evolution of the genetic composition of a population that evolves under recombination. The genetic composition is described via a probability distribution (or measure) on a space of sequences of finite length; and recombination is the genetic mechanism in which two parent individuals are involved in creating the mixed sequence of their offspring during sexual reproduction. The model comes in a continuous-time and a discrete-time version; it can accommodate a variety of different mechanisms by which the genetic material of the offspring is partitioned across its parents. In all cases, the resulting equations are nonlinear and notoriously difficult to solve. Elucidating the underlying structure and finding solutions has been a challenge to theoretical population geneticists for nearly a century now.
The first studies go back to Jennings in 1917 [14] and Robbins in 1918 [19]. Geiringer in 1944 [13] and Bennett in 1954 [8] were the first to state the generic general form of the solution in terms of a convex combination of certain basis functions, and developed methods for the recursive evaluation of the corresponding coefficients to obtain the solution itself, at least in principle. The approach was later continued within the systematic framework of genetic algebras; compare [15,16]. It could be shown that, despite the nonlinearity, the dynamical system may be (exactly) transformed into a linear one by embedding it into a higher-dimensional space. More explicitly, a large number of further components are added that correspond to multilinear transformations of the original measure. This method is known as Haldane linearisation [16]. However, this line of research led to astonishingly few concrete or applicable results. This raises the question whether the program may be completed outside the abstract framework, and what kinds of results can be obtained via different approaches.
A first step forward, for an important special case, was achieved in [6], via a rather powerful use of the inclusion-exclusion principle in the form of the Möbius inversion formula. At the same time, a general formalism via nonlinear operators, called recombinators, was introduced that also allowed for an alternative consideration starting from the nonlinear equation with a single such operator and extending this to the same solution [5].
After various intermediate steps of gradual generalisations, the complete equation, in the setting of general partitions, was analysed and solved in [4]. In the generic parameter case, the general solution was given in recursive form. Also, the principal form of the solution in degenerate cases was analysed, but no general formula was given. The most important insight, however, was the identification of an underlying stochastic fragmentation process in [4,Sec. 6]. This means that the solution of the nonlinear recombination equation has a representation in terms of the solution of the Kolmogorov forward equation for this very process, which is a linear ODE.
In [7], the slightly simpler setting of ordered or interval partitions was analysed, with a focus on explicit solution formulas for all parameter values (thus including the degenerate cases) for sequences of length up to five. Here, the above-mentioned Kolmogorov equation was investigated further, and the Markov generator from [4, Sec. 6] was derived explicitly. This demonstrated two important things: (1) the solvability of the nonlinear recombination ODE ultimately rests upon the fact that its solution essentially also solves a system of linear equations; (2) the degenerate cases are in one-to-one correspondence with the cases where the Markov generator fails to be diagonalisable, and the appearance of Jordan blocks, well known from classic ODE theory, then determines the solutions. Now, with hindsight, one can ask whether one can treat the original nonlinear ODE in such a way that this becomes immediately transparent, without resorting to the underlying stochastic process. The answer is affirmative, and this paper explains how to do it. Effectively, this new approach amounts to re-interpreting the original Haldane linearisation in a suitable way, without any need for genetic algebras. Moreover, as we shall see, a completely analogous approach also works for the discrete-time recombination equation.
This paper builds on previous work, most importantly on [4,7]. Some of the results from these papers will be freely used below, and not re-derived here (though we will always provide precise references). Also, the biological background is explained in [4]. After recalling the preliminaries and our notation in Section 2, the general recombination equation in continuous time, together with its reduction to subsystems via marginalisation, is discussed in Section 3. This is followed by its general solution via our new and simplified strategy (Section 4), which leads to the first main result in Theorem 1. A stratified interpretation in terms of the underlying partitioning process is offered in Section 5, which gives our second main result (Theorem 2). The discrete-time version of the recombination equation is then discussed and solved in Section 6, by the same method, which leads to our third main result in Theorem 3. The corresponding stochastic process is also identified and briefly summarised.

Partitions, product spaces, measures and recombinators
Let S be a finite set, and consider the lattice P(S) of partitions of S; see [1] for general background on lattice theory and [4] for details of the present setting. Here, we write a partition of S as A = {A_1, ..., A_m}, where m = |A| is the number of its (non-empty) parts (also called blocks), and one has A_i ∩ A_j = ∅ for all i ≠ j together with A_1 ∪ · · · ∪ A_m = S. The natural ordering relation is denoted by ≼, where A ≼ B means that A is finer than B, or that B is coarser than A. The conditions A ≼ B and B ≽ A are synonymous, while A ≺ B means A ≼ B together with A ≠ B, so A is strictly finer than B.
The joint refinement of two partitions A and B is written as A ∧ B, and is the coarsest partition below A and B. The unique minimal partition within the lattice P(S) is denoted as 0 = {{x} | x ∈ S}, while the unique maximal one is 1 = {S}. When U and V are disjoint (finite) sets, two partitions A ∈ P(U) and B ∈ P(V) can be joined to form an element of P(U ∪ V). We denote such a joining by A ⊔ B, and similarly for multiple joinings. Conversely, if U ⊆ S, a partition A ∈ P(S), with A = {A_1, ..., A_m} say, defines a unique partition of U by restriction. The latter is denoted by A|_U, and its parts are precisely all non-empty sets of the form A_i ∩ U with 1 ≤ i ≤ m.
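The lattice operations just introduced are easy to experiment with on a computer. The following Python sketch is an illustration of ours (no part of the original analysis); the representation of a partition as a frozenset of frozenset blocks is merely one convenient choice. It implements the enumeration of P(S), the refinement relation ≼, the joint refinement A ∧ B and the restriction A|_U.

```python
def partitions(s):
    """Yield all partitions of the set s, each as a frozenset of blocks
    (blocks being frozensets); their number is the Bell number of |s|."""
    s = list(s)
    if not s:
        yield frozenset()
        return
    first, rest = s[0], s[1:]
    for p in partitions(rest):
        for block in p:                        # put `first` into an existing block
            yield (p - {block}) | {block | frozenset([first])}
        yield p | {frozenset([first])}         # or open a new singleton block

def refines(A, B):
    """A ≼ B in the refinement order: every block of A lies in a block of B."""
    return all(any(a <= b for b in B) for a in A)

def meet(A, B):
    """Joint refinement A ∧ B: all non-empty intersections of blocks."""
    return frozenset(a & b for a in A for b in B if a & b)

def restrict(A, U):
    """Restriction A|_U: the non-empty traces of the blocks of A on U."""
    return frozenset(a & U for a in A if a & U)

S = frozenset({1, 2, 3})
P_S = list(partitions(S))                              # Bell(3) = 5 partitions
A = frozenset({frozenset({1, 2}), frozenset({3})})     # the partition 12|3
B = frozenset({frozenset({1}), frozenset({2, 3})})     # the partition 1|23
finest = frozenset(frozenset({x}) for x in S)          # the partition 0
```

Here, for instance, meet(A, B) returns the all-singleton partition 0, and restrict(A, {2, 3}) gives the partition {{2}, {3}} of {2, 3}.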
Fix now S = {1, 2, ..., n} and define X := X_1 × · · · × X_n, where each X_i is a locally compact space (which we mean to include the Hausdorff property). The natural projection of X to its ith component is denoted by π_i, so π_i(X) = X_i. For an arbitrary non-empty subset U ⊆ S, we use the notation π_U : X → X_U := ⨉_{i∈U} X_i for the projection to X_U.
Let M(X) denote the space of signed, finite and regular Borel measures on X, equipped with the usual total variation norm ‖·‖, which makes it into a Banach space. Also, we need the closed subset (or cone) M₊(X) of positive measures, which includes the zero measure. Within M₊(X), we denote the closed subset of probability measures by P(X). Note that M₊(X) and P(X) are convex sets. The restriction of a measure µ ∈ M(X) to a subspace X_U is written as π_U.µ := µ ∘ π_U⁻¹, which is consistent with marginalisation of measures. When the context is clear, we will use the abbreviation µ_U := π_U.µ. For any Borel set A ⊆ X_U, one thus has the relation µ_U(A) = µ(π_U⁻¹(A)). Given a partition A = {A_1, ..., A_m} ∈ P(S), we now define a mapping R_A on M₊(X) by

R_A(µ) := ‖µ‖^{1−m} (π_{A_1}.µ) ⊗ · · · ⊗ (π_{A_m}.µ)

for µ ≠ 0, together with R_A(0) := 0. Note that the product is (implicitly) 'site ordered', which means that it matches the ordering of the sites as specified by the set S. We shall use (implicit) site ordering also for product sets. We call a mapping of type R_A a recombinator. Note that recombinators are nonlinear whenever A ≠ 1.
Let us recall some results from [4, Prop. 1 and Cor. 1] as follows.

Proposition 1. (1) R_A is positive homogeneous of degree 1, which means that R_A(aµ) = a R_A(µ) holds for all µ ∈ M₊(X) and all a ≥ 0.

(6) R_A preserves the norm of positive measures, and hence also maps P(X) into itself.

(7) For all A, B ∈ P(S), one has R_A ∘ R_B = R_{A∧B} on M₊(X).

In particular, each recombinator is an idempotent and any two recombinators commute.
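For a finite state space, the recombinators can be made completely explicit, and the idempotency and commutativity just mentioned can be checked numerically. Below is a Python sketch of ours (illustrative only; we work with probability measures, so the normalising factor ‖µ‖^{1−|A|} equals 1), with measures on X = X_1 × X_2 × X_3 encoded as dictionaries mapping site-ordered tuples to weights.

```python
from itertools import product

def marginal(mu, U):
    """pi_U.mu for a tuple U of (1-based) sites in increasing order."""
    out = {}
    for x, p in mu.items():
        key = tuple(x[i - 1] for i in U)
        out[key] = out.get(key, 0.0) + p
    return out

def recombinator(mu, A, n):
    """R_A(mu): the site-ordered product of the marginals pi_{A_i}.mu,
    for a probability measure mu on a product of n finite sets."""
    blocks = [tuple(sorted(b)) for b in A]
    margs = [marginal(mu, b) for b in blocks]
    out = {}
    for combo in product(*(m.items() for m in margs)):
        x, p = [None] * n, 1.0
        for (vals, pb), sites in zip(combo, blocks):
            p *= pb
            for site, v in zip(sites, vals):
                x[site - 1] = v
        out[tuple(x)] = out.get(tuple(x), 0.0) + p
    return out

def close(mu, nu, eps=1e-12):
    """Compare two measures entrywise."""
    keys = set(mu) | set(nu)
    return all(abs(mu.get(k, 0.0) - nu.get(k, 0.0)) < eps for k in keys)

# a fully correlated probability measure on {0,1}^3
mu = {(0, 0, 0): 0.5, (1, 1, 1): 0.5}
A = [{1}, {2, 3}]          # partition 1|23
B = [{1, 2}, {3}]          # partition 12|3
RA = recombinator(mu, A, 3)
RB = recombinator(mu, B, 3)
```

One checks, for instance, that applying R_A twice agrees with applying it once (idempotency), and that R_A after R_B gives the same measure as R_B after R_A, in line with the composition rule.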
Several of these properties will be used below without further mentioning, some in results that we simply recall from previous work. We are now set to define and analyse the recombination ODE.

The general recombination equation and marginalisation
The general recombination equation in continuous time is formulated within the Banach space (M(X), ‖·‖), as the nonlinear ODE

(d/dt) ω_t = Σ_{A∈P(S)} ̺(A) (R_A − Id)(ω_t),    (2)

with non-negative numbers ̺(A) that have the meaning of recombination rates in our context. We will usually assume that an initial condition ω_0 ∈ M(X) for t = 0 is given for the ODE (2), and then speak of the corresponding Cauchy problem (or initial value problem).
Setting Φ := Σ_{A∈P(S)} ̺(A) (R_A − Id), we may rewrite Eq. (2) compactly as (d/dt) ω_t = Φ(ω_t), but we must keep in mind that Φ is a nonlinear operator. Nevertheless, one has the following basic result [6,4]; see [2] for general background on ODEs on Banach spaces.

Proposition 2. Let S be a finite set and X the corresponding locally compact product space as introduced above. Then, the Cauchy problem of Eq. (2) with initial condition ω_0 ∈ M(X) has a unique solution. Moreover, the cone M₊(X) is forward invariant, and the flow is norm-preserving on M₊(X). In particular, P(X) is forward invariant under the flow.
Without loss of generality, when we start with a positive measure, we may thus restrict our attention to the investigation of the recombination equation on the cone M₊(X), and on P(X) in particular. So, let us assume that we consider the Cauchy problem with ω_0 ∈ P(X).
The way to a solution of the recombination equation in previous papers started with the ansatz

ω_t = Σ_{A∈P(S)} a_t(A) R_A(ω_0),    (4)

which effectively means a complete separation of the time evolution from the recombination of the initial condition. This led to a nonlinear ODE system for the coefficient functions a_t(A) that could be solved recursively in [4], for generic recombination rates. Via the identification of an underlying Markov partitioning process in [4, Sec. 6], the further analysis of [7] showed that these coefficient functions also solve a linear ODE system with constant coefficient matrix, Q say, which makes the entire solvability understandable in retrospect. As mentioned in the Introduction, the degenerate cases then correspond to Q not being diagonalisable. This approach meant significant progress and simplification in comparison to earlier attempts [15,11,12], while being more general at the same time; still, the number of steps was formidable, and a further simplification was suggestive. This is precisely what we want to describe now.

Here, the golden key emerges from also considering the time evolution of R_B(ω_t) for an arbitrary B ∈ P(S), where ω_t is a solution of the recombination equation (2). In view of Propositions 1 and 2, it suffices to consider probability measures, so that we may assume ω_t ∈ P(X) for all t ≥ 0. To proceed, it will be instrumental to understand the behaviour of recombination on subsystems defined by a set ∅ ≠ U ⊆ S, where we begin by recalling [4, Lemma 1]. Note that this result effectively underlies assertion (7) of Proposition 1.
Lemma 1. Let S be a finite set as above, and A = {A_1, ..., A_{|A|}} ∈ P(S) an arbitrary partition. If U ⊆ S is non-empty and ω ∈ M₊(X), one has

π_U.R_A(ω) = R^U_{A|_U}(π_U.ω),

where A|_U ∈ P(U) and the upper index of a recombinator indicates on which measure space it acts, with R^S_A = R_A.
To continue, it is clear that we will need the recombination rates on subsystems defined by some ∅ ≠ U ⊆ S, as induced by the marginalisation

̺_U(A) := Σ_{B∈P(S), B|_U = A} ̺(B),    (5)

where ̺_S(B) = ̺(B). Now, Lemma 1 implies the following marginalisation consistency on the level of probability measures, where we use the notation ω^U_t = π_U.ω_t as introduced earlier. The version we state here is a special case of [4, Prop. 6]; see also Lemma 3 below.

Proposition 3. Let ∅ ≠ U ⊆ S. If ω_t is a solution of the Cauchy problem of Eq. (2) with initial condition ω_0 ∈ P(X), the marginal measures (ω^U_t)_{t≥0} on X_U solve the ODE

(d/dt) ω^U_t = Σ_{A∈P(U)} ̺_U(A) (R^U_A − Id)(ω^U_t)

with initial condition ω^U_0 = π_U.ω_0 and marginalised rates ̺_U(A) according to Eq. (5). In particular, one has ω^U_t ∈ P(X_U) for all t ≥ 0.
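The marginalisation of the rates is easily made concrete. The sketch below is our own toy illustration (S = {1, 2, 3} and the rate values are arbitrary choices), assuming, as in Eq. (5), that ̺_U(A) is the sum of ̺(B) over all B ∈ P(S) with B|_U = A; it also confirms that the total rate mass is preserved under marginalisation.

```python
def restrict(A, U):
    """A|_U: the non-empty traces of the blocks of A on U."""
    return frozenset(a & U for a in A if a & U)

def marginal_rates(rho, U):
    """rho_U(A) = sum of rho(B) over all B in P(S) with B|_U = A."""
    out = {}
    for B, r in rho.items():
        A = restrict(B, U)
        out[A] = out.get(A, 0.0) + r
    return out

S = frozenset({1, 2, 3})
one   = frozenset({S})                                  # coarsest partition 1
p12_3 = frozenset({frozenset({1, 2}), frozenset({3})})
p13_2 = frozenset({frozenset({1, 3}), frozenset({2})})
p1_23 = frozenset({frozenset({1}), frozenset({2, 3})})
zero  = frozenset(frozenset({x}) for x in S)            # finest partition 0

# toy rates on all of P(S) for n = 3
rho = {one: 0.0, p12_3: 2.0, p13_2: 1.0, p1_23: 0.5, zero: 0.25}

U = frozenset({1, 2})
rho_U = marginal_rates(rho, U)
```

For U = {1, 2}, the partitions 1 and 12|3 both restrict to {12}, while 13|2, 1|23 and 0 all restrict to {1|2}; hence ̺_U({12}) = 2.0 and ̺_U({1|2}) = 1.75 here.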
In this context, it is helpful to also note a factorisation property of the recombinators on M₊(X).

Lemma 2. Let {U, V} be a partition of S, and assume that two partitions A ∈ P(U) and B ∈ P(V) are given. Then, for any 0 ≠ µ ∈ M₊(X), one has

R_{A⊔B}(µ) = R_{A⊔B}(R_{{U,V}}(µ)) = ‖µ‖^{−1} R^U_A(π_U.µ) ⊗ R^V_B(π_V.µ).

In particular, R_{A⊔B}(ω) = R^U_A(ω_U) ⊗ R^V_B(ω_V) holds for all ω ∈ P(X).

Proof. Observe first that {U, V} ∧ (A ⊔ B) = A ⊔ B due to our assumptions. By assertions (1) and (7) from Proposition 1, we then know that

R_{A⊔B}(µ) = R_{A⊔B}(R_{{U,V}}(µ)) = ‖µ‖^{−1} R_{A⊔B}(π_U.µ ⊗ π_V.µ),

from which the second claim is immediate, because the marginals of the product measure π_U.µ ⊗ π_V.µ with respect to the parts of A ⊔ B factor accordingly.
We are now set to proceed with solving Eq. (2).

Solution of the recombination equation in continuous time
Let us consider the time evolution of R_B(ω_t) for an arbitrary partition B ∈ P(S), here written as B = {B_1, ..., B_m}. As before, we assume ω_t to be a solution of the recombination equation (2) with initial condition ω_0 ∈ P(X), so that iterated application of Lemma 2 gives R_B(ω_t) = ω^{B_1}_t ⊗ · · · ⊗ ω^{B_m}_t. Using the product rule, and employing Proposition 3, we obtain

(d/dt) R_B(ω_t) = Σ_{i=1}^{m} ω^{B_1}_t ⊗ · · · ⊗ ((d/dt) ω^{B_i}_t) ⊗ · · · ⊗ ω^{B_m}_t
  = Σ_{i=1}^{m} Σ_{A_i∈P(B_i)} ̺_{B_i}(A_i) ω^{B_1}_t ⊗ · · · ⊗ (R^{B_i}_{A_i} − Id)(ω^{B_i}_t) ⊗ · · · ⊗ ω^{B_m}_t
  = Σ_{i=1}^{m} Σ_{A_i∈P(B_i)} ̺_{B_i}(A_i) (R_{(B\B_i)⊔A_i}(ω_t) − R_B(ω_t))
  = Σ_{i=1}^{m} Σ_{{B_i}≠A_i∈P(B_i)} ̺_{B_i}(A_i) (R_{(B\B_i)⊔A_i}(ω_t) − R_B(ω_t)),    (6)

where B \ B_i denotes the partition of S \ B_i that emerges from B by removing B_i. Note that the crucial third step follows from Lemma 2 used backwards. Clearly, we may restrict the inner summation to A_i ≠ {B_i}, as the then omitted term vanishes anyhow, which gives the last line. We can now state our main result as follows.
Theorem 1. Let ω_t be a solution of the recombination equation (2), with initial condition ω_0 ∈ P(X). Then, for any partition B ∈ P(S), the measure R_B(ω_t) satisfies an ODE of the linear form

(d/dt) R_B(ω_t) = Σ_{C≼B} Q_{BC} R_C(ω_t),

where the coefficients are explicitly given by

Q_{BC} = ̺_{B_i}(A_i), if C = (B \ B_i) ⊔ A_i with {B_i} ≠ A_i ∈ P(B_i) and precisely one index 1 ≤ i ≤ |B|,
Q_{BB} = − Σ_{C≺B} Q_{BC},
Q_{BC} = 0 otherwise.    (7)

In particular, the matrix Q = (Q_{BC})_{B,C∈P(S)} is a Markov generator with triangular structure, and R_B(ω_t) is a probability measure for all t ≥ 0.
Proof. It is clear from Eq. (6) that the derivative of R_B(ω_t) can indeed be written as a linear combination, namely

(d/dt) R_B(ω_t) = Σ_{i=1}^{|B|} Σ_{{B_i}≠A_i∈P(B_i)} ̺_{B_i}(A_i) (R_{(B\B_i)⊔A_i}(ω_t) − R_B(ω_t)) = Σ_{C≺B} Q_{BC} R_C(ω_t) + Q_{BB} R_B(ω_t) = Σ_{C≼B} Q_{BC} R_C(ω_t),

where the second step follows from a simple change of summation, while the third just reflects the fact that Q has a triangular structure. It remains to show that the coefficients Q_{BC} are those given by Eq. (7). If C ≺ B, Eq. (6) tells us that this coefficient must vanish unless C refines precisely one part of B, in which case its value is as claimed, and non-negative due to our general assumption on the recombination rates. When C = B, we read off from our change of summation that each off-diagonal coefficient of the row of Q defined by B must occur once on the diagonal, with negative sign. All other coefficients clearly vanish.
The Markov generator property is then clear, and the last claim follows from our general properties of the recombinators in conjunction with Proposition 2.
The meaning of the Markov generator Q will become clear in the next section. Let us now define the (column) vector ϕ_t := (ϕ_t(B))_{B∈P(S)} with ϕ_t(B) := R_B(ω_t). With this abbreviation, the ODEs from Theorem 1 turn into the linear ODE system (d/dt) ϕ_t = Q ϕ_t, with initial condition ϕ_0 = (R_B(ω_0))_{B∈P(S)} and solution ϕ_t = e^{tQ} ϕ_0.
Note that {e^{tQ} | t ≥ 0} is the Markov semigroup generated by Q. In particular, for the first component of ϕ_t, we now get

ω_t = ϕ_t(1) = Σ_{A∈P(S)} a_t(A) R_A(ω_0)    (8)

with a_t(A) = (e^{tQ})_{1A}, which leads us back to Eq. (4). Clearly, Σ_{A∈P(S)} a_t(A) = 1, since each e^{tQ} with t ≥ 0 is a Markov matrix, so all row sums are 1. Via the Kolmogorov forward equation for the Markov semigroup {e^{tQ} | t ≥ 0}, compare [18, Thm. 2.1.1], it is now immediate that the coefficient functions satisfy the linear ODE system ȧ_t = a_t Q with initial condition a_0(A) = δ_{A,1}; see [4,7] for the original (but much longer) derivation.

Let us pause to discuss what Eq. (8) tells us. First, the solution of the recombination equation may be expressed in terms of a convex combination of the initial measure recombined in all possible ways. This is quite plausible, given that the differential equation means a continuous replacement of the current measure by its recombined versions. Second, the procedure just described has uncovered a linear structure that underlies the nonlinear recombination equation, and thus reduced the problem to a linear one. With hindsight, we recognise a streamlined version of Haldane linearisation: The R_B(ω_t) with B ≠ 1 can be viewed as additional components that are used to enlarge the system in order to unravel its intrinsic linear structure. The latter is conveyed by the Markov generator Q, whose meaning still remains to be elucidated, as will be done next.
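To see this linear structure at work, one can assemble Q for a small example and exponentiate it numerically. The following self-contained Python sketch is our own illustration (S = {1, 2, 3} and the rates are arbitrary toy values; the matrix exponential is a plain truncated Taylor series, and the marginalised rates are taken, as in Eq. (5), to be ̺_U(A) = Σ_{B|_U = A} ̺(B)). It builds Q via the 'refine exactly one part' rule and verifies the Markov generator properties together with Σ_A a_t(A) = 1.

```python
import random
from math import factorial

def partitions(s):
    """All partitions of the set s, as frozensets of frozenset blocks."""
    s = list(s)
    if not s:
        yield frozenset()
        return
    first, rest = s[0], s[1:]
    for p in partitions(rest):
        for block in p:
            yield (p - {block}) | {block | frozenset([first])}
        yield p | {frozenset([first])}

def restrict(A, U):
    return frozenset(a & U for a in A if a & U)

def refines(A, B):
    return all(any(a <= b for b in B) for a in A)

S = frozenset({1, 2, 3})
P_S = sorted(partitions(S), key=lambda P: (len(P), sorted(map(sorted, P))))
idx = {P: k for k, P in enumerate(P_S)}
one = frozenset({S})                       # coarsest partition, start of the process

# toy recombination rates on P(S) (the value at the coarsest partition never enters)
random.seed(1)
rho = {P: (0.0 if P == one else random.uniform(0.2, 1.0)) for P in P_S}

def rho_marg(U, A):
    """Marginalised rate rho_U(A): sum over B in P(S) with B|_U = A."""
    return sum(r for B, r in rho.items() if restrict(B, U) == A)

# Q_{BC} = rho_{B_i}(A_i) if C arises from B by refining exactly one part B_i into A_i
N = len(P_S)
Q = [[0.0] * N for _ in range(N)]
for B in P_S:
    for b in B:
        for A in partitions(b):
            if A == frozenset([b]):        # A_i = {B_i} is excluded
                continue
            C = (B - {b}) | A
            Q[idx[B]][idx[C]] += rho_marg(b, A)
    Q[idx[B]][idx[B]] = -sum(Q[idx[B]])    # diagonal = negative row sum

def expm(M, t, terms=30):
    """exp(tM) via a truncated Taylor series (fine for small matrices and t)."""
    n = len(M)
    E = [[float(i == j) for j in range(n)] for i in range(n)]
    P = [row[:] for row in E]
    for k in range(1, terms):
        P = [[sum(P[i][l] * M[l][j] for l in range(n)) for j in range(n)]
             for i in range(n)]            # P = M^k
        for i in range(n):
            for j in range(n):
                E[i][j] += (t ** k / factorial(k)) * P[i][j]
    return E

t = 0.7
Pt = expm(Q, t)
a_t = Pt[idx[one]]                         # a_t(A) = (e^{tQ})_{1A}
```

One can then check that all rows of Q sum to zero, that Q vanishes outside the refinement order (triangularity), and that the row of e^{tQ} indexed by 1 is a probability vector, in line with Theorem 1.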

The backward point of view: Partitioning process
Now that we have understood the structure of the ODE system for the coefficients a_t in the usual (forward) direction of time, let us consider a related (stochastic) process that will provide an additional meaning for a_t. Let {Σ_t}_{t≥0} be a Markov chain in continuous time with values in P(S) that is constructed as follows. Start with Σ_0 = 1. If the current state is B = {B_1, ..., B_m}, each part B_i is refined into some {B_i} ≠ A_i ∈ P(B_i) at rate ̺_{B_i}(A_i), independently of all other parts. That is, the transition from B to (B \ B_i) ⊔ A_i occurs at rate ̺_{B_i}(A_i) for all {B_i} ≠ A_i ∈ P(B_i) and 1 ≤ i ≤ |B|. Put differently, the transition from B to C happens at rate Q_{BC} of Eq. (7). This way, we have given a meaning to the generator Q of Section 4: It holds the transition rates of the process of progressive refinements {Σ_t}_{t≥0}, which we have just described, and which we call the underlying partitioning process. The argument is illustrated in Figure 1.
Since Q is the Markov generator of {Σ_t}_{t≥0}, we can further conclude that

(e^{tQ})_{BC} = P(Σ_t = C | Σ_0 = B)

(where P denotes probability), that is, (e^{tQ})_{BC} is the transition probability from 'state' B to 'state' C during a time interval of length t. In particular, a_t(A) = (e^{tQ})_{1A} = P(Σ_t = A | Σ_0 = 1). We have therefore shown our second main result, which can now be stated as follows.

Theorem 2. Let ω_t be the solution of the recombination equation (2) with initial condition ω_0 ∈ P(X), and let {Σ_t}_{t≥0} be the partitioning process just described. Then, for all t ≥ 0, one has ω_t = Σ_{A∈P(S)} a_t(A) R_A(ω_0) with a_t(A) = P(Σ_t = A | Σ_0 = 1).
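The partitioning process admits a direct Monte Carlo implementation, which also serves as a sanity check on the description above. The sketch below is our own toy illustration (S, the rates and the representation are arbitrary choices; marginalised rates are again taken as ̺_U(A) = Σ_{B|_U = A} ̺(B)). From state B, each admissible refinement of a single part competes with its own exponential clock; every jump strictly refines the current partition.

```python
import random

def partitions(s):
    """All partitions of the set s, as frozensets of frozenset blocks."""
    s = list(s)
    if not s:
        yield frozenset()
        return
    first, rest = s[0], s[1:]
    for p in partitions(rest):
        for block in p:
            yield (p - {block}) | {block | frozenset([first])}
        yield p | {frozenset([first])}

def restrict(A, U):
    return frozenset(a & U for a in A if a & U)

def refines(A, B):
    return all(any(a <= b for b in B) for a in A)

S = frozenset({1, 2, 3})
one = frozenset({S})
zero = frozenset(frozenset({x}) for x in S)
rho = {P: 0.5 for P in partitions(S) if P != one}   # toy rates; coarsest needs none

def rho_marg(U, A):
    return sum(r for B, r in rho.items() if restrict(B, U) == A)

def jump(B):
    """One jump of the partitioning process from state B.
    Returns (next_state, waiting_time), or (B, inf) if B is absorbing."""
    moves = []
    for b in B:
        for A in partitions(b):
            if A == frozenset([b]):
                continue
            rate = rho_marg(b, A)
            if rate > 0:
                moves.append(((B - {b}) | A, rate))
    if not moves:
        return B, float("inf")
    total = sum(r for _, r in moves)
    wait = random.expovariate(total)
    u, acc = random.uniform(0.0, total), 0.0
    for C, r in moves:
        acc += r
        if u <= acc:
            return C, wait
    return moves[-1][0], wait

random.seed(7)
trajectory = [one]
while True:
    C, wait = jump(trajectory[-1])
    if wait == float("inf"):
        break
    trajectory.append(C)
```

Averaging the indicator of {Σ_t = A} over many such runs gives a Monte Carlo estimate of a_t(A) = (e^{tQ})_{1A}; for this rate choice, every trajectory descends from 1 to the absorbing all-singleton partition 0.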

Recombination equation in discrete time
Let us turn our attention to the discrete-time analogue of Eq. (2), which is often considered in population genetics [9,12,15]. We can use the same general setting of Section 2, in particular the space M(X) of measures and the action of the recombinators on it. Then, one has to consider the nonlinear iteration

ω_{t+1} = Σ_{A∈P(S)} r(A) R_A(ω_t),    (9)

where the parameters are now recombination probabilities, so r(A) ≥ 0 for A ∈ P(S) together with Σ_{A∈P(S)} r(A) = 1. Moreover, t ∈ ℕ_0 denotes discrete time (counting generations, say), with initial condition ω_0. Clearly, the positive cone M₊(X) is preserved under the iteration, as is the norm of a positive measure. In view of the general properties of the recombinators from Proposition 1, we can further confine our discussion to ω_0 ∈ P(X), which immediately implies that ω_t ∈ P(X) for all t ∈ ℕ as well.
As above in Proposition 3, we need marginalisation consistency for subsystems. A generalisation of [10, Eq. 2.5] to our setting leads to the following result.

Lemma 3. Let ω_t with t ∈ ℕ_0 be a solution of the discrete recombination equation (9), with initial condition ω_0 ∈ P(X). Then, for any ∅ ≠ U ⊆ S, the marginal measures ω^U_t satisfy the induced recombination equation

ω^U_{t+1} = Σ_{A∈P(U)} r_U(A) R^U_A(ω^U_t)

for all t ∈ ℕ_0, with R^U_A as before. Here,

r_U(A) = Σ_{B∈P(S), B|_U = A} r(B)

are the induced recombination probabilities for the subsystem, with Σ_{A∈P(U)} r_U(A) = 1.
Proof. The claim follows from the simple calculation

ω^U_{t+1} = π_U.ω_{t+1} = Σ_{B∈P(S)} r(B) π_U.R_B(ω_t)
  = Σ_{B∈P(S)} r(B) R^U_{B|_U}(ω^U_t) = Σ_{A∈P(U)} r_U(A) R^U_A(ω^U_t),

where the first step in the second line is a consequence of Lemma 1.
Let now B = {B_1, ..., B_m} be an arbitrary partition of S, and consider

R_B(ω_{t+1}) = ω^{B_1}_{t+1} ⊗ · · · ⊗ ω^{B_m}_{t+1}
  = Σ_{A_1∈P(B_1)} · · · Σ_{A_m∈P(B_m)} ( Π_{i=1}^{m} r_{B_i}(A_i) ) R^{B_1}_{A_1}(ω^{B_1}_t) ⊗ · · · ⊗ R^{B_m}_{A_m}(ω^{B_m}_t),

where we have used Lemma 3. Invoking the factorisation property from Lemma 2, we thus see that we can rewrite the last expression as a linear combination of terms of the form R_C(ω_t) with C = A_1 ⊔ · · · ⊔ A_m ∈ P(S), each of which is a refinement of B. We can now formulate the following general result.
Theorem 3. Let ω_t with t ∈ ℕ_0 be a solution of the discrete recombination equation (9), with initial condition ω_0 ∈ P(X). Then, for any B ∈ P(S) and any t ∈ ℕ_0, the measure R_B(ω_t) is a probability measure and satisfies the linear recursion

R_B(ω_{t+1}) = Σ_{C≼B} M_{BC} R_C(ω_t),    (10)

with the coefficients

M_{BC} = Π_{i=1}^{|B|} r_{B_i}(C|_{B_i}) for C ≼ B, and M_{BC} = 0 otherwise.
In particular, M = (M_{BC})_{B,C∈P(S)} is a triangular Markov matrix.
Proof. The first claim, as mentioned earlier, is a direct consequence of part (6) of Proposition 1. Our above calculation, in conjunction with Lemma 2, proves the second claim, where the determination of the coefficients M_{BC} is a straightforward exercise; note that a refinement C ≼ B corresponds bijectively to the choice of A_i = C|_{B_i} ∈ P(B_i) for 1 ≤ i ≤ |B|. Clearly, we have M_{BC} ≥ 0 for all B, C ∈ P(S) because r(A) ≥ 0 by assumption, hence also r_U(B) ≥ 0 for all ∅ ≠ U ⊆ S by the formula in Lemma 3. It remains to show that each row of M sums to 1. Indeed, given any B ∈ P(S), one has

Σ_{C∈P(S)} M_{BC} = Σ_{C≼B} Π_{i=1}^{|B|} r_{B_i}(C|_{B_i}) = Π_{i=1}^{|B|} Σ_{A∈P(B_i)} r_{B_i}(A) = 1,

because (r_U(A))_{A∈P(U)} is a probability vector for all ∅ ≠ U ⊆ S by Lemma 3.
Note that the matrix entry M BB is the probability that 'nothing happens' to partition B in one step. These diagonal entries are, due to the triangular structure, the eigenvalues of M . For a special case of our setting, they have been determined earlier, by rather different methods and without reference to a linear structure, in [15,Thm. 6.4.3]; see also [17,20]. Theorem 3 now gives them a clearer meaning in a more general setting. Also, the degenerate cases that had to be excluded in previous attempts [15,12] precisely correspond to the cases where M fails to be diagonalisable. They pose no problem in the above approach.
At this point, we can repeat our previous interpretation. If ϕ_t = (R_A(ω_t))_{A∈P(S)} is considered as a column vector, with ω_t a solution of the recombination equation, we get ϕ_{t+1} = M ϕ_t, and hence ϕ_t = M^t ϕ_0; in particular, ω_t = Σ_{A∈P(S)} (M^t)_{1A} R_A(ω_0). There is again an underlying stochastic process, in analogy to the continuous-time case in Section 5. Here, {Σ_t}_{t∈ℕ_0} is a Markov chain in discrete time with values in P(S), starting at Σ_0 = 1. When Σ_t = B, in the time step from t to t+1, part B_i of B is replaced by A_i ∈ P(B_i) with probability r_{B_i}(A_i), independently for each 1 ≤ i ≤ |B|. Note that, unlike in the continuous-time case, several parts can be refined in one step, which makes the discrete-time case actually more complicated. Of course, A_i = {B_i}, which means no action on this part, is also possible. Put together, it is not difficult to verify that one ends up precisely with the transition matrix M from Eq. (10).
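As in the continuous-time case, the matrix M can be assembled explicitly for a small example. The following Python sketch is our own illustration (S = {1, 2, 3} and the probabilities r(A) are toy values); it uses the coefficients in the product form M_{BC} = Π_i r_{B_i}(C|_{B_i}) for C ≼ B, with marginalised probabilities r_U(A) = Σ_{B|_U = A} r(B), and verifies that M is a triangular Markov matrix before iterating the coefficient recursion.

```python
def partitions(s):
    """All partitions of the set s, as frozensets of frozenset blocks."""
    s = list(s)
    if not s:
        yield frozenset()
        return
    first, rest = s[0], s[1:]
    for p in partitions(rest):
        for block in p:
            yield (p - {block}) | {block | frozenset([first])}
        yield p | {frozenset([first])}

def restrict(A, U):
    return frozenset(a & U for a in A if a & U)

def refines(A, B):
    return all(any(a <= b for b in B) for a in A)

S = frozenset({1, 2, 3})
P_S = sorted(partitions(S), key=lambda P: (len(P), sorted(map(sorted, P))))
idx = {P: k for k, P in enumerate(P_S)}
one = frozenset({S})

# toy recombination probabilities, summing to 1 over P(S)
weights = [3.0, 2.0, 1.0, 1.0, 1.0]
r = {P: w / sum(weights) for P, w in zip(P_S, weights)}

def r_marg(U, A):
    """r_U(A) = sum of r(B) over B in P(S) with B|_U = A."""
    return sum(p for B, p in r.items() if restrict(B, U) == A)

N = len(P_S)
M = [[0.0] * N for _ in range(N)]
for B in P_S:
    for C in P_S:
        if refines(C, B):                  # M_{BC} = 0 unless C refines B
            prod = 1.0
            for b in B:
                prod *= r_marg(b, restrict(C, b))
            M[idx[B]][idx[C]] = prod

# iterate the coefficient recursion: a_t = (M^t)_{1,.}, starting from delta at 1
a = [float(P == one) for P in P_S]
for _ in range(3):                         # three generations
    a = [sum(a[i] * M[i][j] for i in range(N)) for j in range(N)]
```

Note that the row of M indexed by the coarsest partition 1 simply reproduces r itself, since 1 has the single part S and r_S = r.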