Fluctuations of Martingales and Winning Probabilities of Game Contestants

Within a contest there is some probability M_i(t) that contestant i will be the winner, given information available at time t, and M_i(t) must be a martingale in t. Assume continuous paths, to capture the idea that relevant information is acquired slowly. Provided each contestant's initial winning probability is at most b, one can easily calculate, without needing further model specification, the expectations of the random variables N_b = number of contestants whose winning probability ever exceeds b, and D_{ab} = total number of downcrossings of the martingales over an interval [a,b]. The distributions of N_b and D_{ab} do depend on further model details, and we study how concentrated or spread out the distributions can be. The extremal models for N_b correspond to two contrasting intuitively natural methods for determining a winner: progressively shorten a list of remaining candidates, or sequentially examine candidates to be declared winner or eliminated. We give less precise bounds on the variability of D_{ab}. We formalize the setting of infinitely many contestants each with infinitesimally small chance of winning, in which the explicit results are more elegant. A canonical process in this setting is the Wright-Fisher diffusion associated with an infinite population of initially distinct alleles; we show how this process fits our setting and raise the problem of finding the distributions of N_b and D_{ab} for this process.


Introduction
Given a probability distribution p = (p_i, i ≥ 1), consider a collection of processes (M_i(t), 0 ≤ t < ∞, i ≥ 1) adapted to a filtration (F_t) and satisfying (i) M_i(0) = p_i, i ≥ 1; (ii) for each t > 0 we have 0 ≤ M_i(t) ≤ 1 ∀i and Σ_i M_i(t) = 1; (iii) for each i ≥ 1, (M_i(t), t ≥ 0) is a continuous path martingale; (iv) there exists a random time T < ∞ a.s. such that, for some random I, M_I(T) = 1 and M_j(T) = 0 ∀j ≠ I.
Call such a collection a p-feasible process, and call the M i (·) its component martingales. To motivate this definition, consider contestants in a contest which will have one winner at some random future time. Then the probability M i (t) that contestant i will be the winner, given information known at time t, must be a martingale as t increases. In this scenario all the assumptions will hold automatically except for path-continuity, which expresses the idea that information becomes known slowly.
In view of the fact that continuous-path martingales have long been a central concept in mathematical probability, it seems curious that this particular "contest" setting has apparently not previously been studied systematically. Moreover the topic is appealing at the expository level because it can be treated at any technical level. In an accompanying non-technical article for undergraduates [?] we show data on probabilities (from the Intrade prediction market) for candidates for the 2012 Republican U.S. Presidential Nomination. The data are observed values of the variables N_b and D_{ab} below, and one can examine the question of whether there was an unusually large number of candidates that year whose fortunes rose and fell substantially. In this paper, the proof in section 3 of distributional bounds on N_b is mostly accessible to a student taking a first course in continuous-time martingales, and subsequent sections slowly become more technically sophisticated.
The starting point for this paper is the observation that there are certain random variables associated with a p-feasible process whose expectations do not depend on the actual joint distribution of the component martingales, and indeed depend very little on p. For 0 < a < b < 1 consider
N_b := number of indices i such that M_i(t) = b for some t;
D_{a,b} := sum over i of the number of downcrossings of M_i(·) over [a, b].
Straightforward uses of the optional sampling theorem (described verbally in [?] as gambling strategies) establish

Lemma 1. If max_i p_i ≤ b then for any p-feasible process,
E[N_b] = 1/b and E[D_{a,b}] = (1 − b)/(b − a).

In contrast, the distributions of N_b and D_{a,b} will depend on the joint distributions of the component martingales, and one goal of this paper is to study the extremal possibilities. Here is our result for N_b.
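Lemma 1 is easy to sanity-check numerically. The sketch below is an illustrative discrete stand-in (not the Wright-Fisher construction used later): it builds a p-feasible process on a grid, holding mass in integer units and making a fair single-unit transfer between two randomly chosen nonzero components, so each component is a martingale and levels are hit exactly. With p_i = 0.2, a = 0.2, b = 0.4 the lemma predicts E[N_b] = 1/b = 2.5 and E[D_{a,b}] = (1 − b)/(b − a) = 3.

```python
import random
from statistics import mean

def run_once(units, n_parts, b_units, a_units, rng):
    # One p-feasible process on a grid: integer mass units; a fair single-unit
    # transfer between two randomly chosen nonzero components keeps each
    # component a martingale, and levels a, b are hit exactly.
    m = [units // n_parts] * n_parts
    hit_b = [False] * n_parts      # ever reached level b
    active = [False] * n_parts     # reached b more recently than a
    down = 0                       # completed downcrossings of [a, b]
    while max(m) < units:
        live = [i for i in range(n_parts) if m[i] > 0]
        i, j = rng.sample(live, 2)
        if rng.random() < 0.5:
            i, j = j, i
        m[i] += 1
        m[j] -= 1
        for k in (i, j):
            if m[k] >= b_units:
                hit_b[k] = True
                active[k] = True
            elif active[k] and m[k] <= a_units:
                active[k] = False
                down += 1
    return sum(hit_b), down

rng = random.Random(0)
# 5 contestants with p_i = 0.2; levels a = 0.2, b = 0.4 on a grid of 20 units
res = [run_once(20, 5, 8, 4, rng) for _ in range(2000)]
n_b = mean(r[0] for r in res)
d_ab = mean(r[1] for r in res)
assert abs(n_b - 2.5) < 0.3     # Lemma 1: E[N_b] = 1/b
assert abs(d_ab - 3.0) < 0.6    # Lemma 1: E[D_{a,b}] = (1-b)/(b-a)
```

Note that the expectations match whatever transfer rule is used to pick the pair (i, j), illustrating the model-independence claimed in the lemma.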
Proposition 2. (a) If max_i p_i ≤ b then there exists a p-feasible process for which the distribution of N_b^p is supported on the integers ⌊1/b⌋ and ⌈1/b⌉ bracketing its mean 1/b.
(b) There exists a family, that is a p-feasible process for each p, such that
(1.1) for each 0 < b < 1, the distribution of N_b^p converges to Geometric(b) as max_i p_i → 0.
(c) Any possible limit distribution for N_b^p as max_i p_i → 0 has variance at most (1 − b)/b², the variance of Geometric(b).
Clearly the distribution in (a) is the "most concentrated" possible, and part (c) gives a sense in which the Geometric(b) distribution is the "most spread out" distribution possible. The proof will be given in section 3. The construction for (a) formalizes the idea that we maintain a list of candidates still under consideration, and at each stage choose one candidate to be eliminated. The construction for (b) formalizes the idea that we examine candidates sequentially, deciding to declare the current candidate to be the winner or to be eliminated. Returning briefly to the theme that this topic is amenable to popular exposition, with some imagination one can relate these two alternative ideas to those used in season-long television shows. Shows like Survivor overtly follow the idea for (a), whereas the idea for (b) would correspond to a variant of . . . Millionaire in which contestants were required to try for the million dollar prize and where the season ends when the prize is won.
We give an analysis of downcrossings D ab in section 4, though with less precise results. The construction that gave the Geometric limit distribution for N b in (1.1) also gives a Geometric limit distribution for D ab (Proposition 6). We conjecture this is the maximum-variance possible limit, but can give only a weaker bound in Proposition 7. As for minimum-variance constructions, Proposition 8 shows one can construct feasible processes for which, in the limit as b → 0 with a/b bounded away from 1, the variance of D ab is bounded by a constant depending only on a/b. The case a/b ≈ 1 remains mysterious, but prompts novel open problems about negative correlations for Brownian local times -see section 7.
1.1. 0-feasible processes. As a second goal of this paper, it seems intuitively clear that the concept of p-feasible process can be taken to the limit as max_i p_i → 0, to represent the idea of starting with an infinite number of contestants each with only infinitesimal chance of winning. Informally, we define a 0-feasible process as a process with the properties: (i) for each t_0 > 0, conditional on M_i(t_0) = p_i, i ≥ 1, the process (M_i(t_0 + t), 0 ≤ t < ∞, i ≥ 1) is a p-feasible process; (ii) sup_i M_i(t) → 0 a.s. as t ↓ 0. There is some subtlety in devising a precise definition, which we will give in section 5. Once this is done we can deduce results for general 0-feasible processes as limits of results for p-feasible processes under the regime max_i p_i → 0, and also we can construct specific 0-feasible processes by splicing together specific p-feasible processes under the same regime (Proposition 11).
By eliminating any dependence on p, results often become cleaner for 0-feasible processes. For instance Proposition 2 becomes

Corollary 3. (a) There exists a 0-feasible process such that, for each 0 < b < 1, the distribution of N_b is supported on the integers ⌊1/b⌋ and ⌈1/b⌉ bracketing its mean 1/b.
(b) For each 0 < b_0 < 1 there exists a 0-feasible process such that N_b has Geometric(b) distribution for each b_0 ≤ b < 1.
(c) Moreover for any 0-feasible process and any 0 < b < 1 the variance of N_b is at most (1 − b)/b², the variance of Geometric(b).
Setting aside the "extremal" questions we have discussed so far, another motivation for considering the class of 0-feasible processes is that there is one particular such process which we regard intuitively as the "canonical" choice, and this is the 0-Wright-Fisher process discussed in section 6. This connection between (a corner of) the large literature on processes inspired by population genetics and our game contest setting seems not to have been developed before. In particular, questions about the fluctuation behavior of the 0-Wright-Fisher process - the distributions of N_b and D_{ab} - arise more naturally in the contest setting, though it seems hard to get quantitative estimates of these distributions.

Preliminary observations
2.1. The downcrossing formula. In our setting of a continuous-path martingale M(·) ultimately stopped at 0 or 1, recall the "fair game formula"
(2.1) P(M(·) hits y before x | M(0) = z) = (z − x)/(y − x), 0 ≤ x ≤ z ≤ y ≤ 1,
from which one can readily derive the well known formulas for the expectation of the number D of downcrossings of M(·) over [a, b]:
(2.2) E[D | M(·) currently at x, inactive] = x(1 − b)/(b − a), x ≤ b;
(2.3) E[D | M(·) currently at x, active (has reached b more recently than a)] = b(1 − x)/(b − a), x ≥ a.
Moreover, starting from b there is a modified Geometric distribution for D:
(2.4) P(D ≥ k | M(0) = b) = ((1 − b)/(1 − a)) · (a(1 − b)/(b(1 − a)))^{k−1}, k ≥ 1.

2.2. The Wright-Fisher diffusion. Note that throughout what follows, we consider only the case of no mutation and no selection. It is classical that the infinite population limit of the k-allele model is the multivariate Wright-Fisher diffusion on the (k − 1)-dimensional simplex, that is with generator
(2.5) G = (1/2) Σ_{i,j} x_i(δ_{ij} − x_j) ∂²/(∂x_i ∂x_j).
Each component is a martingale, the one-dimensional diffusion on [0, 1] with drift rate zero and variance rate x(1 − x). There has been extensive work since the 1970s on the infinitely-many-alleles case, but this has focussed on the case of positive mutation rates to novel alleles, in which case the martingale property no longer holds. In our setting (no mutation and no selection) it is straightforward to show directly (see section 6) that for any p = (p_i, i ≥ 1) with countable support there exists what we will call the p-Wright-Fisher process, the infinite-dimensional diffusion with generator analogous to (2.5) starting from state p, and that this is a p-feasible process. So we know that p-feasible processes do actually exist, and these p-Wright-Fisher processes will be useful ingredients in later constructions. (When p has finite support we could use instead Brownian motion on the finite-dimensional simplex, whose components are killed at 0 and 1, but this does not extend so readily to the infinite-dimensional setting.) It is convenient to adopt from genetics the phrase fixation time for the time T at which the winner is determined.
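The one-martingale downcrossing formulas can be checked exactly against a simple random walk, which is a martingale hitting grid levels exactly. The sketch below sets up the discrete Dirichlet problem for the mean number of future downcrossing completions, in the two states of the walker (inactive: has not reached b since last visiting a or starting; active: has reached b more recently than a), and solves it in exact rational arithmetic; the solutions agree with x(1 − b)/(b − a) and b(1 − x)/(b − a). The grid size and levels are arbitrary choices.

```python
from fractions import Fraction

def solve(A, c):
    # exact Gaussian elimination over the rationals: solve A u = c
    n = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, c)]
    for col in range(n):
        piv = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[piv] = M[piv], M[col]
        d = M[col][col]
        M[col] = [v / d for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [row[-1] for row in M]

n, na, nb = 20, 4, 8                  # levels a = na/n = 0.2, b = nb/n = 0.4
half = Fraction(1, 2)
iI = lambda i: i - 1                  # unknown u_I(i), 1 <= i <= nb-1 (inactive)
iA = lambda i: (nb - 1) + (i - na)    # unknown u_A(i), na <= i <= n-1 (active)
N = (nb - 1) + (n - na)
A = [[Fraction(0)] * N for _ in range(N)]
c = [Fraction(0)] * N
row = 0
for i in range(1, nb):                # inactive walk: harmonic; u_I(0)=0, u_I(nb)=u_A(nb)
    A[row][iI(i)] += 1
    if i > 1:
        A[row][iI(i - 1)] -= half
    A[row][iA(nb) if i + 1 == nb else iI(i + 1)] -= half
    row += 1
A[row][iA(na)] += 1                   # a downcrossing completes at a: u_A(na) = 1 + u_I(na)
A[row][iI(na)] -= 1
c[row] = Fraction(1)
row += 1
for i in range(na + 1, n):            # active walk: harmonic; u_A(n) = 0
    A[row][iA(i)] += 1
    A[row][iA(i - 1)] -= half
    if i < n - 1:
        A[row][iA(i + 1)] -= half
    row += 1
u = solve(A, c)
for i in range(1, nb):                # matches x(1-b)/(b-a) with x = i/n
    assert u[iI(i)] == Fraction(i * (n - nb), n * (nb - na))
for i in range(na, n):                # matches b(1-x)/(b-a)
    assert u[iA(i)] == Fraction(nb * (n - i), n * (nb - na))
```

In particular u_A(nb) equals b(1 − b)/(b − a), the mean of the modified Geometric distribution for D started from b.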
2.3. Constructions using Wright-Fisher. In a Wright-Fisher diffusion we have Σ_i M_i(t) ≡ 1, but trivially we can consider a rescaled Wright-Fisher diffusion for which Σ_i M_i(t) is a prescribed constant.
Our constructions of feasible processes typically proceed in stages. Within a stage we may declare that some component martingales are "frozen" (held constant) and the others evolve as a rescaled Wright-Fisher process. In particular if only two component martingales are unfrozen, say at the start S of the stage we have M_i(S) = x_i and M_j(S) = x_j, then during the stage we have a "reflection coupling" with M_i(t) + M_j(t) = x_i + x_j, and we can choose to continue the stage until the processes reach x_i + x_j and 0, or we can choose to stop earlier.
An alternative construction method is to select one component martingale M_i(·) at the start S of the stage, let (M_i(·), 1 − M_i(·)) evolve as the two-allele Wright-Fisher process during the stage, and set M_j(t) = M_j(S)(1 − M_i(t))/(1 − M_i(S)) for j ≠ i. We describe this construction by saying that the processes (M_j(·), j ≠ i) are tied.
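A few lines of code illustrate the tied construction: only M_1 moves (here as a simple random-walk martingale rather than Wright-Fisher), the tied components are the affine functions M_j(t) = p_j(1 − M_1(t))/(1 − p_1) (affine functions of a martingale are martingales), and total mass 1 is conserved along the whole path. The specific p and step size are illustrative.

```python
import random

random.seed(1)
p = [0.3, 0.4, 0.2, 0.1]   # initial configuration; component 0 runs, the rest are tied
m1 = p[0]
path_sums = []
while 0.0 < m1 < 1.0:
    m1 = round(m1 + random.choice([-0.1, 0.1]), 10)   # any martingale dynamic works
    # tied components: affine functions of m1, hence martingales themselves
    tied = [p[j] * (1 - m1) / (1 - p[0]) for j in range(1, len(p))]
    assert all(0.0 <= x <= 1.0 for x in tied)
    path_sums.append(m1 + sum(tied))
# total mass is conserved at every step of the path
assert path_sums and all(abs(s - 1.0) < 1e-9 for s in path_sums)
```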
Both constructions clearly give continuous-path martingale components. The results in sections 3 and 4 are based on concrete calculations and constructions, though in applying them to 0-feasible processes we "look ahead" and quote results from later (Propositions 11 and 12) which are designed for this purpose, formalizing the intuitive description from section 1.1 so as to allow results to be easily interchanged between p-feasible and 0-feasible processes.

Proofs of distributional bounds on N b
Proof of Proposition 2(a). Fix b. Run a Wright-Fisher process started at p until some M_i(·) reaches b. Freeze that i and run the remaining processes as rescaled Wright-Fisher until some other M_j(·) reaches b. Freeze that j and continue. After a finite number of such stages we must reach a state where all component martingales except one are frozen at b or at 0, and the remaining one is in [0, b]. Because Σ_i M_i(t) ≡ 1 the number frozen at b must be ⌊1/b⌋ and the remaining martingale must be at 1 − b⌊1/b⌋. Finally, unfreeze and run from this configuration to fixation as Wright-Fisher. Clearly N_b takes only the values ⌊1/b⌋ and ⌈1/b⌉.
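The resulting two-point distribution is explicit: ⌊1/b⌋ components freeze at b, and the leftover mass r = 1 − b⌊1/b⌋ later reaches b with probability r/b. A short exact-arithmetic check (for a few values of b chosen so that 1/b is not an integer) confirms the support and the mean 1/b:

```python
from fractions import Fraction
import math

def nb_distribution(b):
    # Proposition 2(a) construction: floor(1/b) components freeze at b; the
    # leftover mass r = 1 - b*floor(1/b) later reaches b with probability r/b.
    k = math.floor(1 / b)
    r = 1 - b * k
    return {k: 1 - r / b, k + 1: r / b}

for b in (Fraction(2, 5), Fraction(7, 10), Fraction(2, 7), Fraction(3, 8)):
    dist = nb_distribution(b)
    assert set(dist) == {math.floor(1 / b), math.ceil(1 / b)}
    assert sum(v * pr for v, pr in dist.items()) == 1 / b   # E[N_b] = 1/b
```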

Proof of Corollary 3(a)
. This construction is similar to that above, but is closer to our earlier informal description "maintain a list of candidates still under consideration, and at each stage choose one candidate to be eliminated".
For each integer m ≥ 2, we will define a stage which starts with m component martingales at 1/m, and ends with m − 1 of these martingales at 1/(m − 1) and the other frozen at 0. To construct this stage, run as Wright-Fisher until some M_i(·) reaches 1/(m − 1). Freeze that i and run the remaining martingales as rescaled Wright-Fisher until some other M_j(·) reaches 1/(m − 1). Freeze that j and continue. Eventually we must reach a state where m − 1 martingales are frozen at 1/(m − 1) and the remaining one is at 0. This stage takes some random time τ_m with finite expectation; without needing to calculate it, we can simply rescale time so that E[τ_m] = 2^{−m}.
Intuitively, we simply put these stages together, to obtain a 0-feasible process in which, for each M ≥ 1, at time Σ_{m>M} τ_m there are exactly M martingales at 1/M. Proposition 11 formalizes this construction. This process satisfies the assertion of the Corollary for each b = 1/M, and then for general b because N_b is monotone in b.
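The stage itself can be simulated with integer mass units (a discrete stand-in for the rescaled Wright-Fisher dynamics): by conservation of mass, a stage started from m components at 1/m can only end with m − 1 components frozen at 1/(m − 1) and one at 0, whatever the randomness. A sketch:

```python
import random

def survivor_stage(m, rng):
    # One Survivor stage with integer mass units (total m*(m-1) units):
    # each component starts at m-1 units (= 1/m) and freezes on reaching m
    # units (= 1/(m-1)); fair single-unit transfers keep components martingales.
    vals = [m - 1] * m
    frozen = [False] * m
    while True:
        live = [k for k in range(m) if not frozen[k] and vals[k] > 0]
        if len(live) < 2:
            break
        i, j = rng.sample(live, 2)
        if rng.random() < 0.5:
            i, j = j, i
        vals[i] += 1
        vals[j] -= 1
        if vals[i] == m:
            frozen[i] = True
    return sorted(vals)

rng = random.Random(2)
for m in (3, 4, 6):
    # conservation of mass forces the outcome: m-1 components frozen, one at 0
    assert survivor_stage(m, rng) == [0] + [m] * (m - 1)
```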
Remark 1. Let us call a 0-feasible process with the property above, that for each M ≥ 2 there is a time at which there are exactly M component martingales each at value 1/M, a Survivor process. We placed the proof here to illustrate the technical issue arising in making precise the construction of such a 0-feasible process, which is to arrange consistent labeling of each component martingale across the different stages. The point is that the one out of the M that does not reach 1/(M − 1) is a uniform random pick, so we cannot just label them as 1, . . . , M for each M.

Proof of Proposition 2(b)
We use the "tied" construction from section 2.3. Run (M_1(·), 1 − M_1(·)) as Wright-Fisher started from (p_1, 1 − p_1) and stopped at S_1 := min{t : M_1(t) = 0 or 1}, and for i ≥ 2 set
M_i(t) = p_i(1 − M_1(t))/(1 − p_1), 0 ≤ t ≤ S_1.
So M_i(·) is a martingale on this time interval. Note that if J ≥ 1 then no M_i(·) can reach b before time S_1, for i ≥ 2. If M_1(S_1) = 1 the process stops. If M_1(S_1) = 0 then for i ≥ 2 we have M_i(S_1) = p_i/(1 − p_1). For t ≥ S_1 run (M_2(·), 1 − M_2(·)) as Wright-Fisher started from (p_2/(1 − p_1), (1 − p_1 − p_2)/(1 − p_1)) and stopped at S_2 := min{t : M_2(t) = 0 or 1}, and for i ≥ 3 set
M_i(t) = M_i(S_1)(1 − M_2(t))/(1 − M_2(S_1)), S_1 ≤ t ≤ S_2.
If J ≥ 2 then no M_i(·) can reach b before time S_2, for i ≥ 3. Continue in this way to define processes (M_j(t), S_{j−1} ≤ t ≤ S_j) for 1 ≤ j ≤ J, or until some M_j(·) reaches 1 and the whole process stops. If the process has not stopped by time S_J, continue in an arbitrary manner which makes the resulting process p-feasible. Note that, if M_j(·) reaches b, then with probability exactly b it will reach 1, and that with probability 1 − Σ_{j≤J} p_j the process has not stopped by time S_J.
Write N_b^{(J)} := number of martingales j ≤ J that reach b. We can now apply Lemma 4 below to Z = N_b^{(J)}, and deduce that N_b^{(J)} is stochastically dominated by the variable Z′ constructed in Lemma 4, which has Geometric(b) distribution.

Lemma 4. Given 0 < b < 1 and probabilities q_i, 1 ≤ i ≤ J, define a counting process by: for each i, given not yet terminated: with probability q_i b, increment count by 1 and terminate; with probability q_i(1 − b), increment count by 1 and continue; with probability 1 − q_i, continue. Let Z be the value of the counting process after step J or at time T (the termination time, if any), whichever occurs first. Then there exists a coupling of Z with a random variable Z′ such that Z ≤ Z′ and Z′ has Geometric(b) distribution.

Proof. Augment the process by setting q_i = 1, i > J, and follow the algorithm for all i ≥ 1. The process must now terminate at some a.s. finite time T′, at which time the value Z′ of the counting process has exactly Geometric(b) distribution.
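Lemma 4's coupling is easy to simulate: run the counting process, and after step J continue with q_i = 1, so that the count Z′ at termination is exactly Geometric(b), while Z (the count at min(J, T)) is read off along the way. The q_i below are arbitrary illustrative values.

```python
import random
from statistics import mean

def coupled_counts(q, b, rng):
    # Lemma 4 counting process, augmented with q_i = 1 for i > J; returns
    # (Z, Z') where Z is the count at min(J, termination), Z' at termination.
    count, terminated, step, z = 0, False, 0, None
    while True:
        step += 1
        qi = q[step - 1] if step <= len(q) else 1.0
        if not terminated:
            u = rng.random()
            if u < qi * b:
                count += 1
                terminated = True
            elif u < qi:
                count += 1
        if step == len(q):
            z = count
        if terminated and step >= len(q):
            return z, count

rng = random.Random(3)
b, q = 0.4, [0.9, 0.5, 0.8, 0.3, 0.6]
pairs = [coupled_counts(q, b, rng) for _ in range(4000)]
assert all(z <= zp for z, zp in pairs)          # Z is dominated by Z' pathwise
zp_mean = mean(zp for _, zp in pairs)
assert abs(zp_mean - 1 / b) < 0.15              # Z' is Geometric(b), mean 1/b
```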
Proof of Proposition 2(c). Fix b and for k ≥ 1 let S_k ≤ ∞ be the first time at which k distinct component martingales have reached b. If N_b ≥ k, then at time S_k one martingale takes value b, the other k − 1 that previously reached b take some values Z_1, ..., Z_{k−1}, and the remaining martingales take some values M_j(S_k) < b. The chance that such a remaining martingale subsequently reaches b equals M_j(S_k)/b, and so, on {N_b ≥ k},
(3.1) E[N_b − k | F_{S_k}] = Σ_j M_j(S_k)/b = (1 − b − Σ_{j≤k−1} Z_j)/b ≤ (1 − b)/b,
the sum being over the remaining martingales. For later use (section 5) note that to have equality in the final inequality of (3.1) we need Σ_j Z_j = 0, implying each Z_j = 0, that is the martingale components that previously reached b have all reached zero. We deduce

Corollary 5. If, for a p-feasible process, N_b has Geometric(b) distribution, then there is no time at which more than one component martingale is in [b, 1].
Proof of Corollary 3(c). This follows from Proposition 2(c) and the definition (section 5) of 0-feasible process via embedded p-feasible processes.
Proof of Corollary 3(b). Given b_0, consider the vector p of Geometric probabilities with
(3.2) p_j = b_0(1 − b_0)^{j−1}, j ≥ 1.
The construction in the proof of Proposition 2(b) and its analysis show that for this p-feasible process and any b ≥ b_0 we have J = ∞ and that N_b has Geometric(b) distribution. So it is enough to show that there exists a 0-feasible process and a stopping time at which the values of the component martingales are p. But Proposition 12 shows this is true for every p.
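The reason this particular p works is the self-similarity of the Geometric sequence: deleting p_1 and renormalizing the remaining mass reproduces the same sequence, so each stage of the Proposition 2(b) construction starts from the same (rescaled) configuration. An exact check:

```python
from fractions import Fraction

b0 = Fraction(3, 10)
p = [b0 * (1 - b0) ** (j - 1) for j in range(1, 40)]   # p_j = b0 (1 - b0)^{j-1}
tail_mass = 1 - p[0]
# removing p_1 and renormalizing reproduces the same Geometric sequence
for j in range(1, len(p)):
    assert p[j] / tail_mass == p[j - 1]
```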

Distributional bounds on downcrossings
4.1. The large spread setting.
Proposition 6. Given b_0 > 0, there exists a 0-feasible process such that D_{ab} + 1 has Geometric((b − a)/(1 − a)) distribution, for each b_0 ≤ b < 1 and 0 < a < b. The corresponding result (cf. Proposition 2(b)) holds for p-feasible processes in the limit as max_i p_i → 0.
Proof. As in the proof of Corollary 3(b), we may start with the Geometric(b_0) distribution p at (3.2) and use the construction in the proof of Proposition 2(b). Every time a martingale component reaches b, the other (tied) components must be at positions ((1 − b)b_0(1 − b_0)^{i−1}, i ≥ 1), by the self-similarity of the Geometric sequence. The event that there are no further downcrossings is the event that, after the next time some component reaches b, it then reaches 1 before a, and this has probability (b − a)/(1 − a) by (2.1). So
P(D_{ab} ≥ k + 1 | D_{ab} ≥ k) = (1 − b)/(1 − a), k ≥ 0.
By the same argument P(D_{ab} = 0) = (b − a)/(1 − a).
The variance of the Geometric((b − a)/(1 − a)) distribution can be written as
(4.1) (1 − b)(1 − a)/(b − a)².
It is natural to guess, analogous to Corollary 3(c), that this is an upper bound on the variance of D_{ab} in any 0-feasible process.
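The identity behind (4.1) — the variance of Geometric(q) on {1, 2, ...} is (1 − q)/q², which for q = (b − a)/(1 − a) equals (1 − b)(1 − a)/(b − a)² — can be verified in exact arithmetic:

```python
from fractions import Fraction

def geometric_variance(q):
    # variance of the Geometric(q) distribution on {1, 2, ...}
    return (1 - q) / q ** 2

cases = [(Fraction(1, 5), Fraction(2, 5)), (Fraction(1, 2), Fraction(3, 4)),
         (Fraction(7, 10), Fraction(4, 5))]
for a, b in cases:
    q = (b - a) / (1 - a)
    assert geometric_variance(q) == (1 - b) * (1 - a) / (b - a) ** 2
```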
Conjecture 1. For any 0-feasible process,
var(D_{ab}) ≤ (1 − b)(1 − a)/(b − a)².
The following result establishes a weaker bound. One can check that in the a ↑ b limit this bound is first order asymptotic to ((1 − b)/(b − a))², which coincides with the first order asymptotics in (4.1).

Proposition 7. For any 0-feasible process and any 0 < a < b < 1,

Proof. Fix 0 < a < b < 1 and consider an arbitrary 0-feasible process. Call a particular component martingale at a particular time active if it is potentially part of a downcrossing of [a, b]. That is, the martingale is initially inactive; it becomes active if and when it first reaches b; it becomes inactive if and when it next reaches a; and so on. So a martingale at x is always active if x > b, is always inactive if x < a, but may be active or inactive if a < x < b.
Given that a particular martingale is currently at x, the mean number of future downcrossing completions equals, by (2.2)-(2.3), x(1 − b)/(b − a) if it is inactive and b(1 − x)/(b − a) if it is active. Analogously to the proof of Proposition 2(c), consider the time S_k at which the k'th downcrossing has been completed, and on {S_k < ∞} sum these conditional expectations over the component martingales; the number of active martingales at time S_k is at most N_a. The event {S_k < ∞} is the event {D_{ab} ≥ k}, so taking expectations and rearranging gives a bound on the mean number of further downcrossings. Apply the Cauchy-Schwarz inequality to the second summand on the right side. Next, for positive constants C_1, C_2 we have the elementary implication: if 0 ≤ A ≤ C_1 + 2C_2√A then √A ≤ C_2 + √(C_1 + C_2²). In our situation, this gives

Using first Jensen's inequality and then the result (Corollary 3(c)) that var(N_a) ≤ (1 − a)/a², one arrives at a bound from which the inequality in the proposition readily follows.
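The elementary implication used above (in the form: if 0 ≤ A ≤ C_1 + 2C_2√A then √A ≤ C_2 + √(C_1 + C_2²), which follows by completing the square in √A) can be grid-checked numerically:

```python
import itertools
import math

# implication: if 0 <= A <= C1 + 2*C2*sqrt(A) then sqrt(A) <= C2 + sqrt(C1 + C2^2)
vals = [x / 4 for x in range(81)]   # grid over [0, 20]
for c1, c2 in itertools.product(vals, vals):
    for a in vals:
        if a <= c1 + 2 * c2 * math.sqrt(a):
            assert math.sqrt(a) <= c2 + math.sqrt(c1 + c2 ** 2) + 1e-12
```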

4.2. The small spread setting. Proposition 2(a) showed that the spread of N_b could be very small. To see that the case of D_{ab} must be somewhat different, recall that for a martingale component which reaches b, its number of downcrossings has the modified Geometric distribution (2.4) with mean b(1 − b)/(b − a). So if we fix b and consider limits in distribution as a ↑ b, we must obtain a limit of the form
(b − a) D_{ab} → b(1 − b) Σ_{i=1}^{N_b} ξ_i in distribution,
where each ξ_i has Exponential(1) distribution. And although there will be some complicated dependence between (N_b, ξ_1, ξ_2, ...), it is clear that the limit cannot be a constant, and therefore in any p-feasible process the variance of D_{ab} as a ↑ b must grow at least as order (b − a)^{−2}. We will not consider that case further here (but see an open problem in section 7), instead turning to the case where a/b is bounded away from 1. Here, in a 0-feasible process, E[D_{ab}] grows as order 1/b as b ↓ 0. The next result shows there exist 0-feasible processes for which the variance of D_{ab} remains O(1). The idea behind the construction is to exploit reflection coupling. For instance, starting with 2m components at b, a reflection coupling moves the process to a configuration with m components at a and m at 2b − a while adding m downcrossings; one can extend this kind of construction to make the process pass through a deterministic sequence of configurations while adding a deterministic number of downcrossings.
Proposition 8. For each 0 ≤ α < 1 there exists a constant C(α) < ∞ such that: given 0 < a_k < b_k → 0 with a_k/b_k → α, there exist 0-feasible processes such that
(4.4) var(D_{a_k b_k}) ≤ C(α) for all k.
Proof. Fix k, set (a, b) = (a_k, b_k) and with an abuse of notation write α = a_k/b_k. By Proposition 12 we may assume we have a p_0-feasible process, where p_0 has finite support and its components are in (0, b). The proof makes repeated use of the following kind of construction. Specify an interval [a_0, b_0], freeze martingale components initially outside that interval, run the other components as a rescaled Wright-Fisher process and freeze them upon reaching a_0 or b_0 (typically there will be one component ending within (a_0, b_0)). Note this construction has a particular "deterministic" property: in the final random configuration (M_i(t), i ≥ 1) the ranked (decreasing ordered) values rank(M_i(t), i ≥ 1) are non-random, determined by the (ranked) initial values. This holds because Σ_i M_i(t) = 1.
The central idea of the proof is the following lemma.
Lemma 9. Write K = K(α) = 6⌊1/(1 − α)⌋ − 1. There exists a p_0-feasible process which reaches a configuration p_1 with at most one martingale with value in (b, 1] and at most K martingales taking values in (0, b], having accomplished a deterministic number of downcrossings before that time.
Proof. We construct the process in stages. At the start of each stage, we consider the first case in the list below which holds, and do the construction specified below for that case. If no case holds then stop; note the property "at most K martingales taking values in (0, b]" will then be satisfied.
Case 1. There are at least 1 + ⌊1/(1 − α)⌋ martingales at b.
Case 2. There are at least 2⌊1/(1 − α)⌋ + 1 active martingales in (a, b).
Case 3. There are at least 2⌊1/(1 − α)⌋ + 1 inactive martingales in (a, b).
Case 4. There are at least ⌊1/(1 − α)⌋ martingales in (0, a].
Construction in case 1. We let the martingales at b evolve according to the appropriately rescaled Wright-Fisher diffusion, while freezing all other martingales, and then freeze the evolving martingales that reach level a. At least ⌊1/(1 − α)⌋ martingales will reach level a, and exactly one will be above b. Once all martingales are frozen, we let those at a evolve as the rescaled Wright-Fisher diffusion until they reach 0 or b. Finally, if initially there were martingales above b, then we let all the martingales above b evolve as the appropriate Wright-Fisher diffusion and freeze those that reach b. This procedure adds a deterministic number of downcrossings (all in the first step), and leaves exactly one martingale above b.
Construction in cases 2 and 3. In case 2 we let the active martingales in (a, b) evolve until they either reach a or b and freeze them at that time. All except one of these martingales reach a or b, so either at least ⌊1/(1 − α)⌋ + 1 martingales end at b, or at least ⌊1/(1 − α)⌋ martingales end at a, adding a deterministic number of downcrossings. So the ending configuration will fit case 1 or case 4. In case 3 we do the same but with the inactive martingales instead; no additional downcrossings are added.
Construction in case 4. We let the martingales in (0, a] evolve until they reach 0 or b and freeze them at that time. At least one of them must reach 0, and no additional downcrossings are added.
The sequence of stages must end, because: in each case 4 stage at least one martingale is stopped at 0, and each case 1 stage creates at least one downcrossing, so there can be only a finite number of such stages; and each case 2 or 3 stage is followed by such a stage.
Moreover each stage is "deterministic", in the previous sense that the ranked configuration at the end of the stage is determined by the ranked configuration at the start, and therefore the ranked configuration p_1 at the termination of the entire construction is non-random, determined by the initial configuration p_0. This implies the total number of downcrossings is deterministic, because the number within each stage is determined by that stage's starting configuration. As already mentioned, p_1 has the property "at most K martingales taking values in (0, b]" by the termination condition. The number of martingale components taking values in (b, 1] is at most 1, because each case 1 stage ends that way and the other cases do not allow components to exceed level b.
In view of Lemma 9, to complete the proof of the proposition it suffices to show (4.4) for some p_1-feasible process with p_1 as in Lemma 9. In fact we can take an arbitrary such process. The point is that (as noted earlier) the number of downcrossings D_{ab} has a representation of the form
D_{ab} = Σ_{i=1}^{N*} G_i,
where N* is the number of martingale components that hit b, and each G_i has the modified Geometric distribution (2.4). Without any knowledge of the dependence between (N*, G_1, G_2, ...), the fact N* ≤ K + 1 implies E[D_{ab}²] ≤ (K + 1)² E[G_1²], which is bounded in the limit as b → 0 with a/b → α < 1, and (4.4) follows.
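The "deterministic ranked configuration" property of a single stage is easy to see in simulation: freeze components outside a window [lo, hi], move single mass units between randomly chosen unfrozen components (a discrete stand-in for rescaled Wright-Fisher), and freeze a component when it hits the window boundary. The ranked end configuration does not depend on the random seed; the start configuration and window below are arbitrary choices.

```python
import random

def run_stage(start_units, lo, hi, seed):
    # Freeze components outside (lo, hi); move single mass units between two
    # randomly chosen unfrozen components, freezing a component when it
    # reaches lo or hi. Return the ranked (decreasing) end configuration.
    vals = list(start_units)
    frozen = [not (lo < v < hi) for v in vals]
    rng = random.Random(seed)
    while True:
        live = [k for k in range(len(vals)) if not frozen[k]]
        if len(live) < 2:
            return sorted(vals, reverse=True)
        i, j = rng.sample(live, 2)
        if rng.random() < 0.5:
            i, j = j, i
        vals[i] += 1
        vals[j] -= 1
        for k in (i, j):
            if vals[k] in (lo, hi):
                frozen[k] = True

start = [3, 5, 6, 7, 9, 10]     # mass units; window [lo, hi] = [2, 8]
outcomes = {tuple(run_stage(start, 2, 8, seed)) for seed in range(25)}
assert len(outcomes) == 1       # ranked end configuration is deterministic
```

Conservation of total mass forces the counts of components frozen at lo and at hi (and the one leftover value), exactly as argued in the proof.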

0-feasible processes
In section 5.1 we will give one formalization of the notion of a 0-feasible process introduced informally in section 1.1, and in sections 5.2 and 5.3 we give results allowing one to relate constructions and properties of 0-feasible processes to those of p-feasible processes.
There are several possible choices for the level of generality we might adopt. The "canonical" example of the 0-Wright-Fisher process, and the "Survivor" process featuring in Corollary 3(a), have the property that at times t > 0 the process has only finitely many non-zero components, so we could make this a requirement. Instead we will allow a countable number of non-zero components - "because we can". In the other direction, consider the construction of reflecting Brownian motion R(t) from standard Brownian motion W(t) as
R(t) = W(t) − min_{0≤s≤t} W(s),
and run the process until R(·) hits 1. Within our setting, interpret this as saying that at time t there is one contestant with chance R(t) of winning, the remaining chance 1 − R(t) being split amongst an infinite number of unidentified contestants each with only infinitesimal chance of winning. Informally this is a 0-feasible process such that
(5.1) N_b has Geometric(b) distribution for every 0 < b < 1,
strengthening the assertion of Corollary 3(b), but it does not fit our set-up, which will require the unit mass to be split as a random discrete distribution at times t > 0. In fact Corollary 5 implies that, within our formalization, no 0-feasible process can have property (5.1). One could choose a more general set-up which allows such "dust", as in the literature [?] cited below, but we are not doing so.
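The reflecting-walk picture behind (5.1) can be simulated directly. Below, R is a reflecting simple random walk on a grid (a discrete analogue of R(t) = W(t) − min_{s≤t} W(s), with the move at 0 forced upward); each excursion from 0 that reaches b goes on to reach 1 with probability b, so the number N_b of such excursions should be Geometric(b):

```python
import random
from statistics import mean

def count_nb(n, b_units, rng):
    # Reflecting simple random walk on {0, ..., n}, forced up at 0; count the
    # excursions from 0 that reach level b before the walk first hits n.
    r, reached_b, n_b = 0, False, 0
    while r < n:
        r = r + 1 if (r == 0 or rng.random() < 0.5) else r - 1
        if r >= b_units and not reached_b:
            reached_b = True
            n_b += 1
        if r == 0:
            reached_b = False
    return n_b

rng = random.Random(4)
samples = [count_nb(20, 8, rng) for _ in range(3000)]   # b = 0.4
assert min(samples) >= 1
assert abs(mean(samples) - 1 / 0.4) < 0.3   # Geometric(b) has mean 1/b = 2.5
```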
The existing classes of processes in the literature with somewhat similar qualitative behavior -in the theory of stochastic fragmentation and coagulation processes [?] which studies partitions of unit mass into clusters, or in population genetics inspired processes associated with Kingman's coalescent, are (to our knowledge) explicitly Markovian, in which context the question becomes determining the entrance boundary of a specific Markov process [?, ?]. Our setting differs in that we wish to continue making only the "martingale" assumptions (ii,iii,iv) at the start of the Introduction, and we are seeking to define a class of processes.
The following observation shows that the most naive formalization does not work.

Lemma 10. Let I be countable. There does not exist any process (M_i(t), 0 ≤ t < ∞, i ∈ I) adapted to a filtration (F_t) and satisfying (i) for each t > 0 we have 0 ≤ M_i(t) ≤ 1 ∀i and Σ_i M_i(t) = 1; (ii) for each i ∈ I, (M_i(t), t > 0) is a continuous path martingale; (iii) sup_i M_i(t) → 0 a.s. as t ↓ 0. (Each M_i would extend continuously to t = 0 with M_i(0) = 0, and a nonnegative martingale started at 0 is identically zero, contradicting (i).)

5.1. A formalization. The issue, indicated by Lemma 10 above and the particular Survivor example in Remark 1, is to find a formalization which preserves the identity of martingale components as t varies. The often used device of simply ranking (decreasing-ordering) components at each time t does not work. Our formalization combines ranking and a point process representation. This is admittedly somewhat ad hoc; a different but equivalent formalization is mentioned in Remark 2. A probability distribution p with p_1 ≥ p_2 ≥ p_3 ≥ ... is called ranked; write ∇ for the space of ranked probability distributions. For a general discrete distribution q = (q_j, j ∈ J) write rank(q) for its decreasing ordering, where zero entries are omitted. More generally, for a collection (A_j, j ∈ J) of objects with the same index set as (q_j, j ∈ J), write rank(A_j, j ∈ J || q) for the collection re-ordered so that q is ranked (this is not completely specified if the values q_j are not distinct, but the arbitrariness does not matter for our purposes).
Write C_0 for the space of continuous functions f : [0, ∞) → [0, 1] with f(0) = 0. Consider a random point process on C_0. That is, a realization of the process is (informally) an unordered countable set {f_α(·)} of functions or (formally) the counting measure associated with that set. We will use the former notation, which is more intuitive. We define a 0-feasible process to be a random point process {M_α(·)} on C_0 such that
Σ_α M_α(t) = 1, 0 < t < ∞,
and with the following property. For each t_0 > 0 and each ranked p:
(5.2) Conditional on rank(M_α(t_0)) = p and on F(t_0), the ranked process rank(M_α(t_0 + ·) || {M_α(t_0)}) is a p-feasible process.
In words, given t_0 we simply label component martingales as 1, 2, 3, ... in decreasing order of their values at t_0, and we can use this labeling over t_0 ≤ t < ∞ to define a process (M_i(t_0 + t), t ≥ 0, i ≥ 1) which we require to be a p-feasible process, where p is the ranked ordering of (M_α(t_0)). For F_t we take the natural filtration, generated by the restriction of the point process to (0, t]. By standard arguments, property (5.2) extends to any stopping time S with 0 < S < ∞:
(5.3) Conditional on rank(M_α(S)) = p and on F(S), the ranked process rank(M_α(S + ·) || {M_α(S)}) is p-feasible.
In our initial definition of a p-feasible process we assumed the initial configuration p was deterministic. Now define a ⊕-feasible process to be a mixture over p of p-feasible processes; in other words, a process (M_i(t), i ≥ 1, t ≥ 0) which, conditional on (M_i(0), i ≥ 1) = (p_i, i ≥ 1), is a p-feasible process. So the ranked process rank(M_α(S + ·) || {M_α(S)}) in (5.3), considered unconditionally, is a ⊕-feasible process, and we describe the relationship (5.3) by saying this ⊕-feasible process is embedded into the 0-feasible process via the stopping time S. Similarly, any stopping time within a ⊕-feasible process specifies an embedded ⊕-feasible process.
Remark 2. An essentially equivalent formalization would be to assign random U[0, 1] labels U_α to component martingales, so the state of the process at t is described via the pairs (U_α, M_α(t)) for which M_α(t) > 0, and this can in turn be described via the probability measure ∑_α M_α(t) δ_{U_α} or its distribution function. We will use this "random labels" idea in an argument below.
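As a concrete sketch of this "random labels" device (illustrative only; the helper names `labeled_state` and `cdf` are ours), the time-t state becomes a set of (label, mass) pairs, summarized by the distribution function u ↦ ∑{M_α(t) : U_α ≤ u}:

```python
import random

def labeled_state(masses, rng=random.Random(0)):
    """Attach an independent Uniform(0,1) label to each positive-mass component."""
    return [(rng.random(), m) for m in masses if m > 0]

def cdf(state, u):
    """Distribution function of the measure sum_a M_a * delta_{U_a}."""
    return sum(m for label, m in state if label <= u)

state = labeled_state([0.2, 0.3, 0.5])
assert abs(cdf(state, 1.0) - 1.0) < 1e-12  # total mass is preserved
```

Zero-mass components are dropped, mirroring the restriction to pairs with M_α(t) > 0.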

5.2. A general construction of 0-feasible processes. Given a 0-feasible process and stopping times S_k ↓ 0 a.s., the associated embedded ⊕-feasible processes are embedded within each other, and their initial values satisfy max_i M_i^{(k)}(0) → 0 a.s. The following result formalizes the converse idea: one can construct a 0-feasible process from a sequence of p-feasible (or, more generally, ⊕-feasible) processes embedded into each other, via Kolmogorov consistency.
Proposition 11. Suppose that (µ_k, k ≥ 1) are probability measures on ∇ and that for each k there are families of processes M^k(·) and stopping times T_k satisfying
(i) each M^k(·) is a ⊕-feasible process with initial distribution µ_k;
(ii) T_k is a stopping time for M^k(·);
(iii) for each k ≥ 2, the embedded process rank(M^k(T_k + ·) || {M^k(T_k)}) is distributed as M^{k−1}(·);
(iv) ∑_k E[T_k] < ∞;
(v) max_i M^k_i(0) → 0 in probability as k → ∞.
Then there exists a 0-feasible process {M_α(·)} which is consistent with the families above, in the following sense. There exist stopping times S_k ↓ 0 such that for each k ≥ 1 the embedded process rank(M_α(S_k + ·) || {M_α(S_k)}) is distributed as M^k(·).
Proof. By conditions (i)-(iii), for each k ≥ 2 we can represent the process M^{k−1} as the process M^k(T_k + ·); more precisely, we can couple the two processes such that

(5.4) M^{k−1}(·) = rank(M^k(T_k + ·) || {M^k(T_k)}).

Then by the Kolmogorov consistency theorem we can assume this representation holds simultaneously for all k. We now attach labels α to the component martingales by the following inductive scheme. For k = 1, to each of the indices i designating a component martingale M^1_i(·) we associate an independent Uniform(0, 1) label. For k = 2, a component martingale M^2_i(·) might be zero or non-zero at T_2. If non-zero, then we copy the label already associated with M^1(·) via the coupling (5.4). If zero, then we create a new independent Uniform(0, 1) label.
Continue this scheme of copying or creating labels for each k. For each label α, the sample path of that martingale component in the process M^{k+1} is obtained from the sample path in M^k by inserting an extra initial segment. By (iv) the path converges as k → ∞ to a function M_α(t), 0 ≤ t < ∞, and by (v) we must have M_α(0) = 0. The remaining properties are straightforward.

5.3. All p-feasible processes embed. Proposition 11 enables construction of specific 0-feasible processes. The following result implies that any p-feasible process can be embedded into some 0-feasible process: simply splice the 0-feasible process in the proposition to the given p-feasible process at time S. We already used this fact in the proofs of Corollary 3(b) and Proposition 8.
Proposition 12. For each p ∈ ∇ there exists a 0-feasible process {M_α(·)} and a stopping time S such that rank({M_α(S)}) = p.
For the proof it is convenient to use Brownian-type processes instead of Wright-Fisher. Write Q(t) = ∑_i M_i²(t). We will use constructions with the property

(5.5) at each time 0 ≤ t ≤ S, at least one component is evolving as standard Brownian motion,

for a specified stopping time S. That is, our constructions can be written as dM_i(t) = σ_i(t) dW_i(t) for (dependent) standard Brownian motions W_i(t), and we require that some σ_i(t) equals 1. In general Q(t) − ∫_0^t ∑_i σ_i²(s) ds is a martingale, so the advantage of property (5.5) is that Q(t) − t is a submartingale, implying

Lemma 13. For a stopping time S in a construction satisfying (5.5), E[S] ≤ E[Q(S)] − Q(0).

A simple construction satisfying (5.5) is the Brownian reflection coupling of two component martingales. That is, on 0 ≤ t ≤ S we freeze components other than i, j, and set dM_i(t) = dW(t), dM_j(t) = −dW(t) for a standard Brownian motion W(t).

Lemma 14. Let I_0 be countable, and I_1 and I_2 be finite, index sets. Let (p_i, i ∈ I_0 ∪ I_1) and (q_i, i ∈ I_0 ∪ I_2) be probability distributions which coincide on I_0 and satisfy max_{i∈I_1} p_i ≤ min_{i∈I_2} q_i. Then there exists a p-feasible process {M_α(·)} satisfying (5.5) such that for some stopping time S we have rank({M_α(S)}) = rank(q).
Proof. Freeze permanently the component martingales with i ∈ I_0. Pick two arbitrary indices i′, i* in I_1 and run the Brownian reflection coupling on these two components M_{i′}(t), M_{i*}(t) until one component hits zero or min_{i∈I_2} q_i. In the latter case, freeze that component permanently, delete its index from I_1, and delete arg min_{i∈I_2} q_i from I_2. In the former case, only delete the index from I_1. The total number |I_1| + |I_2| of remaining indices is now decreased by at least 1. Continue inductively, picking two components from I_1 at each stage. Eventually all components are frozen and the ranked state is rank(q).
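The reflection-coupling step in this proof can be illustrated by a small Monte Carlo sketch. It is a discretization, not the paper's construction: we assume Euler steps of size dt approximate the coupled Brownian motions, and for simplicity we monitor only M_{i′} (with the chosen numbers the partner component stays positive).

```python
import random

def reflection_coupling_step(mi, mj, c, dt=1e-4, rng=random.Random(1)):
    """Run dM_i = dW, dM_j = -dW (other components frozen) until M_i hits 0 or c."""
    t = 0.0
    while 0.0 < mi < c:
        dw = rng.gauss(0.0, dt ** 0.5)
        mi += dw          # component i' gains dW
        mj -= dw          # component i* loses dW, so mi + mj is conserved
        t += dt
    mi = max(0.0, min(c, mi))  # clamp the O(sqrt(dt)) discretization overshoot
    return mi, mj, t

mi, mj, t = reflection_coupling_step(0.1, 0.2, 0.25)
assert mi in (0.0, 0.25)            # M_i ends frozen at 0 or at the target level c
assert abs(mi + mj - 0.3) < 0.05    # total mass conserved up to overshoot
```

Since σ_{i′} = σ_{i*} = 1 here, averaging the elapsed time t over many runs gives a numerical check of bounds of the form E[T_k] ≤ q_{k−1} − q_k used below.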
Proof of Proposition 12. Define p⁰ = p and for k ≥ 1 construct p^k from p by (i) retaining entries p_i with p_i ≤ 2^{−k}; (ii) replacing each other p_i by 2^{j(i)} copies of 2^{−j(i)} p_i, where j(i) ≥ 1 is the smallest integer such that 2^{−j(i)} p_i ≤ 2^{−k}. Each pair (p^k, p^{k−1}) satisfies the hypothesis of Lemma 14. So for each k, writing µ_k = δ_{p^k} and writing M^k and T_k for the p^k-feasible process and the stopping time given by Lemma 14, we see that hypotheses (i)-(iii) of Proposition 11 are satisfied. Moreover by Lemma 13 we have E[T_k] ≤ q_{k−1} − q_k for q_k := ∑_i (p^k_i)², implying that hypotheses (iv)-(v) are also satisfied. The conclusion of Proposition 11 now establishes Proposition 12.
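The dyadic splitting p ↦ p^k in this proof is easy to implement. The following sketch (the helper names `split` and `q` are ours) checks the properties the proof relies on: p^k is again a probability distribution, all its entries are at most 2^{−k}, and q_k = ∑_i (p^k_i)² is non-increasing in k.

```python
import math

def split(p, k):
    """Dyadic splitting: keep entries <= 2^-k; replace each larger p_i by
    2^j(i) equal pieces 2^-j(i) * p_i, with j(i) >= 1 minimal so each
    piece is at most 2^-k."""
    out = []
    for pi in p:
        if pi <= 2.0 ** -k:
            out.append(pi)
        else:
            j = max(1, math.ceil(math.log2(pi * 2.0 ** k)))
            out.extend([pi / 2 ** j] * (2 ** j))
    return out

def q(p):
    """q_k := sum of squared entries, which bounds the E[T_k] increments."""
    return sum(x * x for x in p)

p = [0.5, 0.3, 0.2]
p2 = split(p, 2)
assert abs(sum(p2) - 1.0) < 1e-9   # still a probability distribution
assert max(p2) <= 0.25 + 1e-12     # all entries at most 2^-2
assert q(p2) <= q(p)               # splitting only decreases sum of squares
```

Since q(p^k) ≤ 2^{−k} ∑_i p^k_i → 0, the telescoping bound E[T_k] ≤ q_{k−1} − q_k makes ∑_k E[T_k] finite, as the proof requires.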
6. The 0-Wright-Fisher process

Write ∆ for the (unranked) infinite simplex {(p_i, 1 ≤ i < ∞) : p_i ≥ 0, ∑_i p_i = 1}. As mentioned in section 2.2, for each p ∈ ∆ there exists the p-Wright-Fisher process, a process with sample paths in C([0, ∞), ∆) and initial state p, which is the infinite-dimensional diffusion with generator analogous to (2.5) starting from state p, and it is a p-feasible process. This has a straightforward construction: given p ∈ ∆, set p^n = (p_1, . . . , p_{n−1}, ∑_{m≥n} p_m), so the p^n-process exists as a finite-dimensional diffusion. But there is a natural coupling between the p^{n−1}- and the p^n-processes in which the first n − 2 coordinate processes coincide, and appealing to Kolmogorov consistency for the infinite sequence of processes we immediately obtain the p-process.
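The finite-dimensional approximation can be simulated directly. The sketch below uses the classical discrete Wright-Fisher chain (multinomial resampling with population size N), which approximates the diffusion when time is measured in units of N generations; this discrete stand-in is our assumption, not the paper's construction. Each coordinate frequency is a bounded martingale absorbed in {0, 1}, so one allele eventually fixes, matching condition (iv) of a p-feasible process.

```python
import random

def sample_index(x, rng):
    """Draw one index with probabilities x (guarding against round-off)."""
    r, acc = rng.random(), 0.0
    for i, xi in enumerate(x):
        acc += xi
        if r < acc:
            return i
    return len(x) - 1

def wf_step(x, N, rng):
    """One Wright-Fisher generation: resample N offspring from frequencies x."""
    counts = [0] * len(x)
    for _ in range(N):
        counts[sample_index(x, rng)] += 1
    return [c / N for c in counts]

rng = random.Random(0)
x = [0.5, 0.3, 0.2]      # the truncation p^3 = (p_1, p_2, sum of the tail)
while max(x) < 1.0:      # run until some allele fixes
    x = wf_step(x, 100, rng)
assert sorted(x) == [0.0, 0.0, 1.0]
```

Which allele fixes is random, with probability equal to its initial frequency, since each coordinate is a martingale.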
That process is in some sense the process we want, but that formalization does not suffice for our purposes because it does not preserve the identity of components as t varies. That is, we want the 0-feasible process {M_α(t)} whose components are martingales and for which

(6.1) X(t) = rank({M_α(t)})

with a separate ranking for each t.
The component processes X_i(·) are not martingales and we cannot define quantities like N_b and D_{ab} in terms of X. Note that by Lemma 10 we cannot represent X(t) as rank(M(t)) for any process M(·) in C([0, ∞), ∆) with martingale components.
Fortunately we can fit the 0-Wright-Fisher process into our abstract set-up by combining the existence of the process X(t) with our Proposition 11. Take times s_k ↓ 0 and let µ_k be the distribution of X(s_k). Then there is a ⊕-feasible Wright-Fisher process M^k with initial distribution µ_k, and existence of