Choices, intervals and equidistribution

We give a sufficient condition for a random sequence in [0,1] generated by a $\Psi$-process to be equidistributed. The condition is met by the canonical example -- the $\max$-2 process -- in which the $n$th term is whichever of two uniformly placed points falls in the larger gap formed by the previous $n-1$ points. This solves an open problem posed by Itai Benjamini, Pascal Maillard and Elliot Paquette. We also deduce equidistribution for more general $\Psi$-processes, including an interpolation of the $\min$-2 and $\max$-2 processes that is biased towards $\min$-2.


Introduction
A sequence in [0, 1] is equidistributed if the limiting proportion of terms in each subinterval equals the subinterval's length. Over a century ago Weyl proved that {βn mod 1}_{n≥1} is equidistributed for any irrational number β (see [Wey10]). Since then connections have been found to ergodic theory, number theory, complex analysis and computer science ([BM72], [Vau77], [FSZ09], [CKK+07]). See [KN06] for an overview.
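Weyl's theorem is easy to probe numerically. The sketch below (our own illustration, not part of the original argument; the function name and parameters are ours) estimates the proportion of the first n terms of {βn mod 1} falling in a subinterval and compares it to the subinterval's length.

```python
import math

def weyl_proportion(beta, n, a, b):
    """Proportion of the first n terms of {beta*k mod 1} lying in [a, b]."""
    count = sum(1 for k in range(1, n + 1) if a <= (beta * k) % 1.0 <= b)
    return count / n

# For irrational beta the proportion should approach b - a = 0.3.
p = weyl_proportion(math.sqrt(2), 100_000, 0.2, 0.5)
print(p)
```

For β = √2 the discrepancy decays quickly, so even modest n gives a proportion very close to 0.3.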
More recently attention has been given to equidistribution of random sequences. One way to obtain a random sequence in [0, 1] is to independently choose points uniformly. Call the resulting sequence the uniform process. The law of large numbers guarantees this is equidistributed almost surely.
Another random process known to equidistribute points is the Kakutani interval splitting procedure (introduced in [Kak76]), where at each step a point is added uniformly to the current largest subinterval. Almost sure equidistribution is proven in [Zwe78] and [Loo78] using stopping times. Because points are placed in the largest gaps they ought to spread more evenly than the uniform process. Indeed, [Pyk80] proves the size of the largest interval is asymptotic to 2/n, the same order as the average interval; compare to log n/n in the uniform process (see [Dar53]). [MP14] introduces a family of interval splitting processes that exhibit a wider range of behavior. The quintessential example is the max-2 process. The dynamics are as follows:
• Partition [0, 1] into subintervals by placing finitely many points in any manner.
• At each step sample two points uniformly from [0, 1]. Each lies in a subinterval formed by the previous configuration. Keep the point lying in the larger of the two subintervals and discard the other.
A discrete analogue of the max-2 process appears in [ABKU99], where n balls are placed into n bins. For each ball two bins are selected uniformly and the ball is placed in the bin with fewer balls. They find that the most-filled bin has ≈ log log n/log 2 balls, significantly less than the ≈ log n/log log n obtained when each ball is placed uniformly. This is studied in more detail in [MRS00] and [LM05].
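The two-choice phenomenon in the balls-in-bins analogue is easy to simulate. A minimal sketch (our own; the function max_load and the parameters are illustrative, not from [ABKU99]):

```python
import random

def max_load(n, choices, rng):
    """Throw n balls into n bins; each ball goes to the least-loaded
    of `choices` uniformly sampled bins. Return the maximum load."""
    bins = [0] * n
    for _ in range(n):
        picks = [rng.randrange(n) for _ in range(choices)]
        best = min(picks, key=lambda i: bins[i])
        bins[best] += 1
    return max(bins)

rng = random.Random(0)
one = max_load(50_000, 1, rng)   # uniform placement
two = max_load(50_000, 2, rng)   # power of two choices
print(one, two)
```

With two choices the maximum load drops from roughly log n/log log n to roughly log log n/log 2, which the simulation makes visible even at this scale.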
In the max-2 process choosing the larger gap should spread points more evenly. Despite our intuition this is difficult to formalize, and equidistribution was a primary open problem from [MP14].
The natural counterpart is the min-2 process, where the point contained in the smaller subinterval is kept. Unlike the previous processes, points are prone to clump together. We show that some random mixtures of the max-2, uniform and min-2 processes are equidistributed.
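To build intuition for how max-2 spreads points while min-2 clumps them, here is a hedged simulation sketch. The helper run_process and its rule interface are our own constructions, not the paper's formalism.

```python
import random
from bisect import bisect_right, insort

def run_process(steps, rule, rng):
    """Simulate an interval-splitting process on [0, 1].

    rule(a, b) returns True to keep the first candidate point, given
    the lengths a, b of the gaps containing the two candidates."""
    pts = [0.0, 1.0]  # current configuration, endpoints included

    def gap(u):
        i = bisect_right(pts, u)
        return pts[i] - pts[i - 1]

    for _ in range(steps):
        u, w = rng.random(), rng.random()
        insort(pts, u if rule(gap(u), gap(w)) else w)
    return pts

rng = random.Random(1)
max2 = run_process(10_000, lambda a, b: a >= b, rng)   # keep larger gap
min2 = run_process(10_000, lambda a, b: a <= b, rng)   # keep smaller gap

# Fraction of max-2 points below 1/2; near 1/2 suggests equidistribution.
prop = sum(1 for x in max2 if x < 0.5) / len(max2)
# Largest remaining gap under each rule; min-2 leaves a much larger one.
big_max2 = max(b - a for a, b in zip(max2, max2[1:]))
big_min2 = max(b - a for a, b in zip(min2, min2[1:]))
print(prop, big_max2, big_min2)
```

Under min-2 the largest gap is split only when both candidates land inside it, so it shrinks far more slowly than under max-2; the printed gap sizes illustrate the clumping described above.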
Theorem 2. Any mixture of the max-2, uniform and min-2 processes with probability p of placing a point uniformly and probability at most 0.6 − 0.5p of placing a point according to the min-2 process is equidistributed a.s.
The formal definition of a mixture is in Section 5. As a corollary we state two examples of equidistributed processes intuitively less spread out than the uniform process.
Two generalizations are the max-k process and min-k process. In the max-k process keep the point in the largest subinterval among k uniformly placed points. Alternatively, in the min-k process keep the point in the smallest subinterval.
Our final theorem (Theorem 12 in Section 5) informally states that, given any collection of max-k and min-k processes, there is an equidistributed mixture which places ε weight on these and the other 1 − ε on the uniform process.
Discussion and further questions. Processes in our theorems satisfy the special inequality at (18). The reason our approach works for only certain mixtures is unclear. Numerical methods indicate the inequality fails for other processes, suggesting a different approach is needed. Hopefully, the properties we establish for general mixtures in Proposition 11 will aid further progress. [MP14] conjectures that any max-k or min-k process is equidistributed. Based on this we suspect that any mixture of these is also equidistributed. The rate of convergence to a uniform placement of points, as well as the asymptotic size of the largest interval, remain open problems. More thorough discussion can be found in [MP14].
Overview. A more general family of interval splitting processes is introduced in [MP14]. Their main result is that, when properly scaled, the empirical distribution of subinterval lengths converges to a distribution function. The idea behind our argument is to reproduce parts of [MP14] restricted to subintervals contained in [0, α]. We find that the empirical distribution of subinterval lengths in [0, α] evolves to be essentially the same as the unrestricted version on [0, 1]. This sameness is enough to deduce equidistribution.
This article is organized to quickly arrive at the proof of Theorem 1. In Section 2 we describe the evolution of [0, α] and give the major definitions. In Section 3 we state without proof several propositions and then prove Theorem 1. Section 4 contains the proofs for the previous section. In Section 5 we generalize to random mixtures. Finally, in Section 6 we prove that processes captured by our theorems satisfy the inequality at (18).
[Figure: The empirical density of 10^4 (left) and 10^6 (right) points for the 20%-min-2/80%-uniform mixture, the uniform process, and the max-2 process. The interval is discretized into 100 equally sized bins.]

Intervals in [0, α]
Leading up to Theorem 1 we frame all of our discussion in terms of max-k and min-k processes. The reason is that the majority of our propositions hold for any k. Moreover, we will see in Section 5 that this readily generalizes to random mixtures.
We start with a formal definition of equidistribution for a process. Suppose n_0 points are initially placed. After n iterations of an interval splitting process let N^α_n be the number of the first n_0 + n terms smaller than α. We say a sequence is equidistributed if n^{−1} N^α_n → α for all α ∈ [0, 1]. It is convenient to work in continuous time. Following [MP14] we have points arrive as a Poisson process with intensity e^t. Formal details are in Section 4.1. So, in continuous time equidistribution is equivalent to e^{−t} N^α_t → α for all α ∈ [0, 1].

Fix k ≥ 2 and α ∈ [0, 1]. We use the convention that a bold face letter represents a process indexed by time (i.e. A = (A_t)_{t≥0}). Define the joint processes (A^α, A^{α+}, A) to be the size-biased empirical distributions restricted to intervals contained in [0, α], [α, 1] and [0, 1], respectively. Formally, letting I^{α,(t)}_1, ..., I^{α,(t)}_{N^α_t} be the lengths of subintervals contained in [0, α] at time t, we define

A^α_t(x) = Σ_{i=1}^{N^α_t} I^{α,(t)}_i 1{e^t I^{α,(t)}_i ≤ x},

and similarly for A^{α+}_t and A_t. The spark for the refined analysis comes from the relation

A_t = A^α_t + A^{α+}_t.    (1)

To ensure that no intervals are double counted, assume the initial set of points placed in [0, 1] always contains {α}. This assumption is only for convenience. Our proof could be adapted to omit it by running the process until two points α_1 ≤ α ≤ α_2 land sufficiently close to α, and then using the bound N^{α_1}_t ≤ N^α_t ≤ N^{α_2}_t.

For the max-k process define Ψ(u) = u^k and for the min-k process define Ψ(u) = 1 − (1 − u)^k. Also, let ψ(u) = Ψ'(u). In [MP14, Section 2] the authors prove that A_t satisfies a semimartingale equation, with a drift determined by Ψ, for some martingale M_t. The following proposition shows that A^α_t satisfies a similar equation.
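Taking A_t(x) = Σ_i I_i^{(t)} 1{e^t I_i^{(t)} ≤ x} as a reading of the size-biased empirical distribution (an assumption on our part, chosen so that the x^{−2} norm counts intervals), A_t(+∞) is the total covered length and ∫_0^∞ x^{−2} A_t(x) dx = e^{−t} N_t exactly. A small sanity check of this bookkeeping, with names of our own choosing:

```python
import math
import random

def size_biased_cdf(lengths, t, x):
    """Assumed definition: A_t(x) = sum of interval lengths I with e^t * I <= x."""
    return sum(I for I in lengths if math.exp(t) * I <= x)

def x2_norm(lengths, t):
    """The x^{-2} norm of the step function A_t, in closed form: an interval
    of length I contributes I * integral_{e^t I}^inf x^{-2} dx = e^{-t}."""
    return sum(I / (math.exp(t) * I) for I in lengths)

rng = random.Random(2)
cuts = sorted(rng.random() for _ in range(9))
lengths = [b - a for a, b in zip([0.0] + cuts, cuts + [1.0])]  # 10 subintervals
t = 1.5

# A_t(+infinity) is the total covered length (here alpha = 1)...
total = size_biased_cdf(lengths, t, float("inf"))
# ...and the x^{-2} norm counts intervals: e^{-t} * N_t.
norm = x2_norm(lengths, t)
print(total, norm, math.exp(-t) * len(lengths))
```

Both identities hold exactly (up to floating point) for any configuration of cut points, which is what makes the x^{−2} norm the right bridge between A^α_t and N^α_t.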
Proposition 5. For any max-k or min-k process, the joint processes (A^α, A^{α+}, A) satisfy the equation
The similarity between the semimartingale decompositions of A_t and A^α_t is paramount in obtaining our theorems. However, the details are a bit technical, so we delay the proof until Section 4. What follows are facts and notation essential to our main theorems.
Let non-tilde processes represent the original process scaled by

In light of Proposition 5 a change of variables gives the relationship
where C : X × X → X. The set X is a subspace of the space B([0, ∞), L^1_loc) of measurable maps from [0, ∞) to L^1_loc with the topology of locally uniform convergence, which we denote by →_X. We will use F and F_Ψ interchangeably to denote the a.s. pointwise limiting distribution of A_t from [MP14, Theorem 1.1]. Also define the stationary distribution F* so that F*_t = F for all t ≥ 0. With the convergence A_t → F in mind, we consider the operator C* and its set of fixed points F^α = {F ∈ X_1 : F = C*(F), F_t(+∞) = α for all t ≥ 0, and (α^{−1} F_t)_{t≥0} tight}. We will see in the proof of Theorem 1 that the limiting distribution of A^α_t belongs to F^α. We will use the following two norms on L^1_loc([0, ∞)): ‖F‖_{x^{−2}} = ∫_0^∞ x^{−2} F(x) dx and ‖F‖_{x^{−3/2}} = ∫_0^∞ x^{−3/2} F(x) dx.

Proof of Theorem 1
We delay the proofs of two propositions until the next section. The first gives a sufficient condition for processes in F^α to converge to αF in ‖·‖_{x^{−3/2}}. This is paired with a lemma proving the condition is met for the max-2 process.
Proof. This is the case p_2 = 1 covered by Lemma 13.
We will also need several properties of max-k and min-k processes. Direct analogues hold for random mixtures (see Proposition 11).
Proposition 8. For the max-k and min-k processes:
Proof of Theorem 1. All statements are meant to hold almost surely. We abbreviate items from Proposition 8 by their Roman numerals. In the continuous process points are added as a Poisson process with intensity e^t dt. So, it suffices to show e^{−t} N^α_t → α. By (II), (III) and the version of the Arzelà-Ascoli theorem in [MP14, Lemma 7.3] we may choose a subsequence (A^{α,(n_k)}) which converges to a family of (scaled by α) distributions F^{α,(∞)} with F^{α,(∞)}_t(+∞) = α for every t ≥ 0. Taking limits in the formula at (2) we obtain

By (IV) and (V) we have
This finishes the proof, since (I) states that ‖A^α_t‖_{x^{−2}} = e^{−t} N^α_t and ‖αF‖_{x^{−2}} = α.

Technical Proofs
4.1. Proposition 5. The idea is to compare subintervals selected from [0, α] against subintervals from [α, 1]. For example, in an iteration of the max-k process we consider the event that j of the k candidate points land in [0, α]. The other k − j must land in [α, 1]. We obtain an interval selected from [0, α] according to a max-j process and another selected from [α, 1] according to a max-(k − j) process. If the interval from [0, α] is larger, then the point is kept and accounted for by A α t . The relationship at (1) and a combinatorial identity then yield the desired formula.
Before giving the proof we first build up some necessary definitions. Our construction is for the max-k process; the definitions for the min-k process are similar. Define Ψ^a_j(u) = (u/a)^j to be the distribution function of the maximum of j independent Uniform[0, a] random variables (for the min-k process one instead uses the distribution function of the minimum). We use the convention that Ψ_j(u) = Ψ^1_j(u). To help clarify we give a brief explanation for each term:
• 1{ξ = j} accounts for how many of the k points land in [0, α].
• ℓ^α_s(u_j) is the initial length of the interval that potentially will be added to A^α_t.
• 1{ℓ^α_s(u_j) > x} is zero if the selected interval is smaller than x and is already being counted by A^α_t.
• 1{ℓ^α_s(u_j) > ℓ^{α+}_s(w_{k−j})} indicates whether the interval from [0, α] is larger than that from [α, 1]. The inequality would be reversed for the min-k process.
• h(v, ℓ^α_s(u_j), x) "cuts" the interval of length ℓ^α_s(u_j) and detects whether the resulting subintervals are smaller than x and so should be added to A^α_t.
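The distribution function Ψ^a_j(u) = (u/a)^j can be verified by simulation. This small check (function and variable names are ours) compares the empirical CDF of the maximum of j uniforms on [0, a] with the formula:

```python
import random

def max_of_uniforms_cdf_empirical(j, a, u, trials, rng):
    """Empirical P[max of j iid Uniform[0, a] variables <= u]."""
    hits = sum(1 for _ in range(trials)
               if max(rng.uniform(0, a) for _ in range(j)) <= u)
    return hits / trials

rng = random.Random(3)
j, a, u = 3, 0.8, 0.5
emp = max_of_uniforms_cdf_empirical(j, a, u, 100_000, rng)
print(emp, (u / a) ** j)  # both should be close
```

The agreement follows from independence: P[max ≤ u] = P[each ≤ u] = (u/a)^j.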
Proof of Proposition 5. Our proof is for the max-k process; the argument for the min-k process is similar (the functions Ψ would change, as would the bounds of the inside integral at (6)). We will obtain the semimartingale decomposition of A^α_t. We start by computing the integral. Using the fact that ∫_0^1 h(v, ℓ, x) dv = (x/ℓ)^2, we first integrate with respect to ξ and v. Integrating one step further and normalizing the Ψ^α_j and Ψ^{1−α}_{k−j} to Ψ_j and Ψ_{k−j}, we obtain factors of α^{−j} (1 − α)^{−(k−j)}. This lets us cancel all but the binomial coefficients from the q_j terms. Make the change of variables y = ℓ^{α+}_s(w_{k−j}) so that A^{α+}_s(y) = w_{k−j}. Integrate one step further and use the fact that Ψ_{k−j}(0) = 0. Now apply the change of variables z = ℓ^α_s(u_j), so that A^α_s(z) = u_j. Writing out Ψ_{k−j}(u) = u^{k−j} and ψ_j(u) = j u^{j−1}, and using the equality A^{α+}_s(z) = A_s(z) − A^α_s(z) from (1), we can rewrite the expression once more. The identity

Σ_{j=1}^k C(k, j) (a − b)^{k−j} j b^{j−1} = k a^{k−1} = ψ(a)

(derived by applying the binomial theorem to a^k = ((a − b) + b)^k and then differentiating both sides with respect to a) shows that (4) equals the desired drift term. Finish by multiplying by e^s and integrating from 0 to t.
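The combinatorial identity used in the last step can be confirmed exactly with rational arithmetic; the helper name lhs below is our own:

```python
from fractions import Fraction
from math import comb

def lhs(k, a, b):
    """sum_{j=1}^{k} C(k, j) * (a - b)^(k-j) * j * b^(j-1)"""
    return sum(comb(k, j) * (a - b) ** (k - j) * j * b ** (j - 1)
               for j in range(1, k + 1))

# Exact check with rationals: the sum collapses to k * a^(k-1) = psi(a).
a, b = Fraction(7, 10), Fraction(2, 10)
for k in range(1, 8):
    assert lhs(k, a, b) == k * a ** (k - 1)
print("identity holds for k = 1..7")
```

Using Fraction avoids floating-point noise, so the comparison is exact; equivalently, j·C(k, j) = k·C(k−1, j−1) reduces the sum to k(b + (a − b))^{k−1}.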
4.2. Proposition 6. The proof of Proposition 6 proceeds analogously to [MP14, Lemma 4.1 and Proposition 3.4]. A significant difference is that they apply integration by parts to (1/z) dΨ(F_s(z)), whereas our operator C* requires applying integration by parts to a similar expression with an extra term. The requirement at (3) arises from the extra term ψ(F(z)). Also, note that we work in the norm ‖·‖_{x^{−3/2}} to obtain the constant 2/3 in (3). We will need this factor to prove an inequality similar to (3) holds for processes biased towards min-2. In [MP14] they use the norm ‖·‖_{x^{−2}}. This change of norms does not significantly alter the argument. In fact, we could equally well work with any norm ‖·‖_{x^{−1−δ}} with 0 < δ < 1.
Proof of Proposition 6. Let F ∈ F^α. We consider the rescaled processes F_t(x) = F(e^t x). We seek to prove the distance between F and αF* is decreasing in t. Starting from the calculation below, multiply both sides by sgn(F_t − αF^Ψ_t) to obtain the subsequent inequality.
An application of integration by parts to the integral gives the display above. The previous two equations therefore yield the following. We next multiply both sides by x^{−3/2} and integrate with respect to x from 0 to infinity to obtain the stated bound.

An application of Fubini's theorem lets us rewrite the second integral as
Hence we can combine the integrals to obtain the bound above, which is less than or equal to zero by our hypothesis (3). This establishes the desired inequality, where at the last line we apply (8).
It remains to prove that ‖F_0 − αF‖_{x^{−3/2}} ≤ 6. By assumption, F ∈ X_1 and therefore ‖F_0‖_{x^{−2}} ≤ 1. As 0 ≤ F_0(x) ≤ 1 we can break up the integral and use integrability of x^{−3/2} 1{x > 1} to get ‖F_0‖_{x^{−3/2}} ≤ 3. Similarly, ‖αF‖_{x^{−3/2}} ≤ 3. Apply the triangle inequality to conclude ‖F_0 − αF‖_{x^{−3/2}} ≤ ‖F_0‖_{x^{−3/2}} + ‖αF‖_{x^{−3/2}} ≤ 6.

4.3. Proposition 8. In Proposition 8 we prove that A^α_t and A_t have similar properties. Each statement requires some manipulation. Fortunately [MP14] contains much of the 'heavy lifting'. We make one remark concerning the proof of (V). In [MP14] they prove continuity of an operator S_Ψ with domain X. Our operator C has domain X × X. This makes the proof more involved, and also restricts us to proving continuity along sequences of the form (F^{(n)}, A^{(n)}). Here µ^α_t denotes the empirical distribution of rescaled interval lengths. We can then write

Proof of (I). The equality αF
Applying Fubini's theorem shows that

Proof of (II). Recall that a family of distributions (F
Proof of (III). We say that a family of functions (F^{(n)})_{n∈N} in X is asymptotically equicontinuous if a uniform equicontinuity estimate holds on every compact K ⊂ [0, ∞). The proof is similar to [MP14, Lemma 7.5]. We omit the details and just remark that for any δ > 0 the number of points kept in [0, α] from time t to t + δ is bounded by the number of points added to [0, 1] in that same time interval: formally, N^α_{t+δ} − N^α_t ≤ N_{t+δ} − N_t. This lets us use the same bounds.

Proof of (V). Suppose that F^{(n)} →_X F. An equivalent notion of convergence in the topology of local uniform convergence is that

Proof of (IV
[MP14, Theorem 7.1] implies A^{(n)} →_X F*. Thus it suffices to prove that (10) holds for any fixed T > 0 and K > 0, uniformly for t ≤ T. For fixed n we can write the left side of (10) as a sum and bound it by (11). It suffices to show that as n → ∞ each summand converges to zero uniformly for t ≤ T. First summand: Start by bounding the summand at (11) by two quantities. The first quantity goes to zero uniformly for t ≤ T by the definition of F^{(n)} →_X F, after a change of variables. Expand the interior of the second quantity with integration by parts and take the absolute value signs inside. Multiply term one by (e^{s−t} x)^2 and integrate. Notice that ψ(u) ≤ k for all u ∈ [0, 1]. This puts us in the case of I_1 from [MP14, Lemma 3.3], and so the term converges to zero uniformly for t ≤ T. As for term two, we differentiate to rewrite it as (13). By Lemma 15 we know that zF'(z) is bounded. Since ψ and ψ' are also bounded we have C = sup_{0≤z<∞} |zF'(z) ψ'(F(z)) − ψ(F(z))| < ∞. Therefore, (13) is less than (14). Finally we are in the position of I_2 from [MP14, Lemma 3.3] and can conclude that (14) goes to zero uniformly for t ≤ T. Second summand: by continuity there is n_0 with |ψ(F^{(n)}_t(z)) − ψ(F(z))| < ε for n ≥ n_0.
We truncate the integral and then apply (15) to bound the absolute value of (12). Integrate the inside integral of (16) by parts. Once more using the bound F^{(n)}_s(z) ≤ 1, we conclude that for M large the result becomes arbitrarily small. Therefore, the absolute value of (12) can be bounded by any ε > 0 uniformly for t ≤ T.

Random Mixtures
We conclude by generalizing to a family of random mixtures of max-k, uniform and min-k processes. Let κ be a random variable supported on Z \ {−1, 0} and write P[κ = k] = p_k. Define the dynamics of the mix-κ process as follows:
• Start by partitioning [0, 1] into subintervals by placing finitely many points in any manner.
• Sample an i.i.d. copy κ_n ∼ κ and add the nth point according to the
  - min-|κ_n| process if κ_n ≤ −2,
  - uniform process if κ_n = 1,
  - max-κ_n process if κ_n ≥ 2.
We follow the same outline as the proof of Theorem 1; first stating the analogous definitions and propositions and then sketching the proofs of our main theorems.
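The sampling step of the mix-κ dynamics can be sketched as follows. The helpers sample_kappa and step_rule are hypothetical names of ours, and the 20%-min-2/80%-uniform weights echo the mixture pictured in the introduction:

```python
import random

def sample_kappa(p, rng):
    """Sample kappa from its point probabilities p = {k: p_k},
    with k ranging over Z \\ {-1, 0}."""
    r, acc = rng.random(), 0.0
    for k, pk in p.items():
        acc += pk
        if r < acc:
            return k
    return k  # guard against floating-point rounding

def step_rule(kappa):
    """Translate a sampled kappa into (number of candidates, keep-rule)."""
    if kappa >= 2:
        return kappa, max    # max-kappa: keep the point in the largest gap
    if kappa <= -2:
        return -kappa, min   # min-|kappa|: keep the point in the smallest gap
    return 1, max            # kappa = 1: a single uniformly placed point

# Example: a 20%-min-2 / 80%-uniform mixture.
p = {-2: 0.2, 1: 0.8}
rng = random.Random(4)
counts = {}
for _ in range(10_000):
    k = sample_kappa(p, rng)
    counts[k] = counts.get(k, 0) + 1
print(counts)
```

Each step first draws κ_n and only then runs one iteration of the corresponding interval-splitting rule, which is exactly the two-stage structure used in the proof of Proposition 9.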
Let A^{α,κ}_t, A^{α+,κ}_t and A^κ_t be the size-biased empirical interval distributions for subintervals of [0, α], [α, 1] and [0, 1] in the mix-κ process. Define Ψ_κ(u) = Σ_k p_k Ψ_k(u), where Ψ_k(u) = u^k for k ≥ 2, Ψ_1(u) = u and Ψ_k(u) = 1 − (1 − u)^{|k|} for k ≤ −2. Also let ψ_κ = Ψ'_κ. To ensure our processes are well defined we only consider κ such that ψ'_κ is bounded. The evolution of A^{α,κ}_t is essentially identical to that of A^α_t.
Proposition 9. For the mix-κ process the joint processes (A^{α,κ}, A^{α+,κ}, A^κ) satisfy the analogous equation.
Proof. We obtain a semimartingale decomposition by integrating each variable. The first step is to integrate over the possible values of κ. This results in a weighted sum of terms corresponding to either a max-k, uniform or min-k process. Handle each term separately as in Proposition 5 to arrive at the decomposition. Using linearity of the integral and the fact that ψ_κ(u) = Σ_k p_k ψ_k(u) gives the desired decomposition.
[MP14, Theorem 1.1] implies A^κ_t → F_{Ψ_κ} a.s. for any mix-κ process. Again let F* = (F_{Ψ_κ})_{t≥0}. With this convergence in mind, it is natural to consider the analogues of C and F^α from Section 2. We define F^{α,κ} = {F ∈ X_1 : F = C^{κ,*}(F), F_t(+∞) = α for all t ≥ 0, and (α^{−1} F_t)_{t≥0} tight}. The corresponding versions of Proposition 6 and Proposition 8 continue to hold for mix-κ processes. The proofs are very similar to before and we omit them.
Proof. Lemma 20 implies that (18) holds so long as C_κ ≤ 1/4. Thus, Proposition 10 and Proposition 11 hold. From here an argument identical to the proof of Theorem 1 gives the desired convergence.

Inequalities
For this entire section we will let F denote F_{Ψ_κ}, Ψ denote Ψ_κ and ψ denote ψ_κ. Our goal is to show that the mixtures in Theorem 2 and Theorem 12 satisfy (18). Key to establishing the inequality is the differential equation from [MP14, Proposition 8.1]. We split the work into two subsections. The first assumes that the mixture consists of only max-2, uniform and min-2 processes. The second deals with general mixtures. Throughout we will use the characterization of κ in terms of its point probabilities P[κ = k] = p_k.

6.1. Inequalities for Theorem 2. Our first lemma establishes that (18) holds for any mixture of max-2, uniform and min-2 processes so long as p_{−2} ≤ p_2.
After some algebra this is equivalent to zF''(z) ≤ F'(z).