One more approach to the convergence of the empirical process to the Brownian bridge

A theorem of Donsker asserts that the empirical process converges in distribution to the Brownian bridge. The aim of this paper is to provide a new and simple proof of this fact.

the inverse of the cumulative distribution function of the U_i's. Some problems of continuity arise due to the atoms of the U_i's, but roughly speaking one may say that all the difficulties are already present in the case of the uniform distribution.
We recall that the Brownian bridge b is the continuous centered Gaussian process such that cov(b(s), b(t)) = s(1 − t) when 0 ≤ s ≤ t ≤ 1. It admits the following trajectorial representation:

b(t) = B(t) − t B(1),   t ∈ [0, 1],

where B is the standard Brownian motion. This may immediately be checked using that B is a centered Gaussian process such that cov(B_s, B_t) = min(s, t).
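As a concrete illustration (not part of the argument), this representation can be simulated on a grid with a short NumPy sketch; the function name is mine:

```python
import numpy as np

def brownian_bridge(n_steps, rng=None):
    """Simulate b(t) = B(t) - t*B(1) on the grid t_k = k/n_steps."""
    if rng is None:
        rng = np.random.default_rng()
    t = np.linspace(0.0, 1.0, n_steps + 1)
    # Brownian motion: B(0) = 0, then cumulative sum of N(0, 1/n_steps) increments.
    increments = rng.normal(0.0, np.sqrt(1.0 / n_steps), n_steps)
    B = np.concatenate(([0.0], np.cumsum(increments)))
    # Subtracting the straight line t*B(1) pins the path at both ends.
    return t, B - t * B[-1]

t, b = brownian_bridge(1000)
# b is a discretized bridge: b(0) = b(1) = 0
```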
In fact, Donsker only proved in detail that max b_n converges in distribution to max b, justifying Doob's heuristic [3]. One may find in the literature numerous more or less direct proofs of Theorem 1; see e.g. Billingsley [2] (and references therein), Kallenberg [4], and also more advanced proofs and constructions (with stronger results) such as that of Komlós, Major and Tusnády [5]. Some books are devoted to the convergence of empirical measures and processes: we refer the interested reader to Shorack & Wellner [6] and van der Vaart & Wellner [7]. As a matter of fact, the usual proofs of Theorem 1 often use advanced constructions, or appear in probability books only after a lot of material has been introduced, leading to intricate and complex arguments that are difficult to teach in full to beginners. The aim of this paper is to present a new proof of Theorem 1 using only "simple" arguments: only immediate considerations about weak convergence in C[0, 1] and D[0, 1], together with the other very famous theorem of Donsker stating that a rescaled random walk converges to the Brownian motion, are used. The Appendix recalls this material.
The vector (N_j)_{j=1,…,n} defined by

N_j = #{i ≤ n : U_i ∈ ((j−1)/n, j/n]}

has the mult(n, 1/n, …, 1/n) distribution. The empirical process taken at time k/n for k ∈ {0, …, n} is a simple function of this vector:

b_n(k/n) = n^{−1/2} (N_1 + ⋯ + N_k − k).

In the sequel, consider b_n as interpolated linearly between the points {k/n, k ∈ {0, …, n}}.
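As a sanity check of this identity, the grid values of the empirical process can be computed both from the multinomial counts N_j and directly from the empirical c.d.f.; a NumPy sketch (variable names are mine):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
U = rng.uniform(size=n)                       # i.i.d. uniform sample

# N_j = number of U_i in ((j-1)/n, j/n]; (N_1,...,N_n) is mult(n, 1/n,...,1/n).
N = np.histogram(U, bins=np.linspace(0.0, 1.0, n + 1))[0]

# Grid values from the counts: b_n(k/n) = n^{-1/2} (N_1 + ... + N_k - k).
k = np.arange(n + 1)
b_n_grid = (np.concatenate(([0], np.cumsum(N))) - k) / np.sqrt(n)

# Same values from the empirical c.d.f. F_n(t) = #{U_i <= t}/n (a.s. equal,
# since no U_i hits a grid point).
F_n = np.searchsorted(np.sort(U), k / n, side="right") / n
direct = np.sqrt(n) * (F_n - k / n)
```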
Let (P_k) be a sequence of i.i.d. Poisson random variables with parameter 1. The distribution of (P_k)_{k=1,…,n} under the condition P_1 + ⋯ + P_n = n (or equivalently ∑_{k=1}^{n} (P_k − 1) = 0) is also mult(n, 1/n, …, 1/n), as can be straightforwardly checked. For any k ∈ {0, …, n}, set

S_k = ∑_{i=1}^{k} (P_i − 1),

and let S = (S_k)_{k=0,…,n} be the "centered" Poisson random walk, interpolated between integer points. Hence (b_n(k/n))_{k=0,…,n} is distributed as (n^{−1/2} S_k)_{k=0,…,n} conditioned by S_n = 0, and

sup_{t∈[0,1]} |b_n(t) − b_n(⌊nt⌋/n)| ≤ n^{−1/2} (max_i N_i + 1).

This is controlled as follows: the N_i's are Binomial(n, 1/n), and by the Markov inequality n^{−1/2} max_i N_i → 0 in probability. Hence (see also Lemma 7 in the Appendix) Theorem 1, stating the convergence of (b_n) to b in D[0, 1], is easily implied by the following proposition.

Proposition 2
The sequence (n^{−1/2} S_{nt})_{t∈[0,1]} conditioned by S_n = 0 converges in distribution to b in C[0, 1] equipped with the topology of uniform convergence.
The proof we propose for this classical proposition is the real novelty of this paper.
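Before turning to the proof, the conditioning identity recalled above — the law of (P_k)_{k=1,…,n} given P_1 + ⋯ + P_n = n is mult(n, 1/n, …, 1/n) — can be verified exactly on a small vector. A Python sketch (helper names are mine), using that P_1 + ⋯ + P_n ∼ Poisson(n):

```python
import math

def conditioned_poisson_pmf(p):
    """P(P_i = p_i for all i | sum_i P_i = n) for i.i.d. Poisson(1) variables,
    where n = len(p) and sum(p) = n."""
    n = len(p)
    joint = math.prod(math.exp(-1.0) / math.factorial(pi) for pi in p)
    normalizer = math.exp(-n) * n ** n / math.factorial(n)  # P(Poisson(n) = n)
    return joint / normalizer

def multinomial_pmf(p):
    """mult(n, 1/n, ..., 1/n) probability of the vector p, with n = len(p)."""
    n = len(p)
    coef = math.factorial(n)
    for pi in p:
        coef //= math.factorial(pi)
    return coef / n ** n

vec = (2, 0, 1, 1)   # a vector of n = 4 counts summing to 4
# Both computations give the same probability.
```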

The "correction" of a Poisson random walk
The main line of our approach is the comparison between S and S conditioned by S_n = 0. We introduce a correcting process C = (C_k)_{k=0,…,n} such that the pair (S, C) has the following features: • S is the centered Poisson random walk (defined above); • S + C is distributed as S conditioned by S_n = 0.
Note 2 Transforming a problem involving n random variables into a problem involving I_n ∼ Poisson(n) random variables is called Poissonization. Taking U_1, …, U_{I_n} instead of U_1, …, U_n in the construction presented at the beginning of the paper amounts to replacing S conditioned by S_n = 0 by the centered Poisson random walk S. This is a Poissonization. The correction of the Poisson random walk we propose, which allows one to pass from S to S conditioned by S_n = 0, is in our view different in nature from the usual depoissonization techniques. Here everything relies on an exact combinatorial correction, whereas usually one relies on the convergence in distribution of (I_n − n)/√n, ensuring that the problem with n variables and the problem with I_n variables are asymptotically equivalent; this is not the case here.
Let us come back to our correction procedure. To fix the details, we will use a classical interpretation of the Poisson random walk in terms of urns and balls. Conditionally on S_n = s, the vector (P_i)_{i=1,…,n} has the mult(s + n, 1/n, …, 1/n) law. When m balls labeled 1, …, m are sent independently into n urns according to the uniform distribution, the vector (N′_i)_{i=1,…,n} giving the numbers of balls in the urns also follows the mult(m, 1/n, …, 1/n) distribution.
Let us throw P_i balls into urn i, where (P_i)_{i=1,…,n} are i.i.d. Poisson random variables with parameter 1. Then three cases arise: S_n = 0, S_n < 0, S_n > 0. In the first case no correction is necessary, so set C_i = 0 for every i. The last two cases are treated below. Notice that we focus on the one-dimensional distributions of the process C, since this will turn out to be sufficient.
Case S_n < 0. We work conditionally on S_n = s. Since −s balls are lacking, throw −s new balls and denote by C_k the number of new balls fallen into the first k urns; for any k, C_k ∼ Binomial(−s, k/n).

Lemma 3
For any s < 0 and any n ≥ 1, conditionally on S_n = s, the process S + C is distributed as S conditioned by S_n = 0, and C_k ∼ Binomial(−s, k/n).
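The deficit-case correction can be simulated directly; the sketch below (function and variable names are mine) throws the −s missing balls uniformly and records the counting process C:

```python
import numpy as np

def correct_deficit(counts, rng):
    """counts: urn occupancies with total n + s for some s < 0, n = len(counts).
    Throw -s fresh balls uniformly at random; C[k] counts the new balls that
    landed in the first k urns, so conditionally C[k] ~ Binomial(-s, k/n)."""
    n = len(counts)
    s = int(counts.sum()) - n
    new_urns = rng.integers(0, n, size=-s)       # one uniform urn per new ball
    added = np.bincount(new_urns, minlength=n)
    C = np.concatenate(([0], np.cumsum(added)))  # C_0 = 0, ..., C_n = -s
    return counts + added, C

rng = np.random.default_rng(1)
counts = np.array([1, 0, 2, 0, 0])   # n = 5 urns, S_5 = -2
corrected, C = correct_deficit(counts, rng)
# corrected now sums to n = 5, and C is non-decreasing with C_5 = 2
```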
Case S_n > 0. We work conditionally on S_n = s. In this case n + s balls have been thrown instead of n, and so s balls must be taken out. The vector (V_k)_{k=1,…,n} giving the numbers of exceeding balls in the different urns (those with labels in {n+1, …, n+s}) follows the law mult(s, 1/n, …, 1/n). Then, given S_n = s, we search for a correcting process (ΔC_k)_{k=1,…,n} distributed as (−V_k)_{k=1,…,n}. Of course there is a problem in defining the correcting process in terms of balls and urns, since the balls-and-urns problem lives a priori on a larger probability space than the (P_i)'s. But this gives us the intuition for a right correcting process: we define C conditionally on the P_i's as follows. Let (p_i)_{i=1,…,n} be non-negative integers summing to n + s. Set

P(ΔC_k = −c_k, k = 1, …, n | P_i = p_i, i = 1, …, n) = [∏_{i=1}^{n} binom(p_i, c_i)] / binom(n+s, s)    (9)

for any given non-negative integers c_1, …, c_n summing to s (with c_i ≤ p_i), and 0 otherwise. This is the law obtained by removing s balls chosen uniformly among the n + s thrown balls.

Lemma 4
For any s > 0 and any n ≥ 1, conditionally on S_n = s, the process S + C is distributed as S conditioned by S_n = 0, and C_k ∼ −Binomial(s, k/n).
Proof. We have to check that C + S is distributed as S conditioned by S_n = 0: this follows by summing, over the non-negative integers (p_i) with ∑_i p_i = n + s and p_i ≥ c_i, the conditional probabilities given by (9), and by using the fact that n + S_n is Poisson(n) distributed. We now show that, knowing S_n = s, (−ΔC_k)_{k=1,…,n} ∼ mult(s, 1/n, …, 1/n); this implies the second point. Let c_1, …, c_n be non-negative integers summing to s. Write

P(ΔC_k = −c_k, ∀k | S_n = s) = ∑_{(p_i): p_i ≥ c_i, ∑_i p_i = n+s} P(ΔC_k = −c_k, ∀k | P_i = p_i, ∀i) P(P_i = p_i, ∀i) / P(S_n = s).
By (9) and the fact that S_n + n ∼ Poisson(n), this is easily shown to be equal to s!/(c_1! ⋯ c_n!) n^{−s}, which is the mult(s, 1/n, …, 1/n) probability of (c_1, …, c_n). From now on, consider the process C as being interpolated between integer points.
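The surplus-case correction admits the same concrete reading: by exchangeability, realizing the law (9) amounts to removing s balls chosen uniformly among the n + s thrown ones. A simulation sketch (names are mine):

```python
import numpy as np

def correct_surplus(counts, rng):
    """counts: urn occupancies with total n + s, s > 0, n = len(counts).
    Remove s balls chosen uniformly among the n + s balls; -C[k] counts the
    removals from the first k urns, so conditionally -C[k] ~ Binomial(s, k/n)."""
    n = len(counts)
    s = int(counts.sum()) - n
    balls = np.repeat(np.arange(n), counts)             # one entry per ball: its urn
    removed_urns = rng.choice(balls, size=s, replace=False)
    removed = np.bincount(removed_urns, minlength=n)
    C = -np.concatenate(([0], np.cumsum(removed)))      # C_0 = 0, ..., C_n = -s
    return counts - removed, C

rng = np.random.default_rng(2)
counts = np.array([3, 1, 0, 2, 1])   # n = 5 urns, S_5 = +2
corrected, C = correct_surplus(counts, rng)
# corrected sums to n = 5, stays non-negative, and C is non-increasing with C_5 = -2
```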
Lemma 5 For any t ∈ [0, 1],

n^{−1/2} (C_{nt} + t S_n) → 0 in probability.    (11)

Proof. We work with C_{⌊nt⌋} instead of C_{nt}. For t = 0, (11) holds clearly. Let t ∈ (0, 1] and ε > 0 be fixed. Write

P(n^{−1/2} |C_{⌊nt⌋} + t S_n| ≥ ε) = ∑_s P(n^{−1/2} |C_{⌊nt⌋} + t s| ≥ ε | S_n = s) P(S_n = s).

Let α > 0 be a fixed (small) positive real number. The central limit theorem applied to S_n ensures that there exists M such that the contribution W_n^M of the terms with |s| > M√n is at most α, for n large enough. For |s| ≤ M√n, conditionally on S_n = s, C_{⌊nt⌋} is (up to its sign) Binomial(|s|, ⌊nt⌋/n), with mean −s⌊nt⌋/n and variance at most M√n/4; the Bienaymé–Chebyshev inequality then bounds each remaining conditional probability by a quantity tending to 0 uniformly in s, so that the left-hand side is smaller than 2α for n large enough.
Lemma 6 (i) The sequence n^{−1/2} (S_{n·}, C_{n·}) converges in distribution in (C[0, 1])^2 to (B, (−t B_1)_{t∈[0,1]}). (ii) The sequence n^{−1/2} (S_{n·} + C_{n·}) converges in distribution in C[0, 1] to b.
Proof. Assertion (ii) is a consequence of (i), since addition is continuous and B_t − t B_1 is the trajectorial representation of b. Proof of (i): the convergence of n^{−1/2} S_{n·} to B in C[0, 1] is given by the other famous theorem of Donsker stating the convergence of rescaled random walks to the Brownian motion (see [2] or [4]). In particular n^{−1/2} S_n → B_1.
The finite dimensional distributions of n^{−1/2} C_{n·} converge to those of the process (−t B_1)_{t∈[0,1]}: indeed, this follows from Lemma 5 together with the convergence of n^{−1/2} S_n, for any 0 ≤ t_1 ≤ ⋯ ≤ t_k ≤ 1. The family (n^{−1/2} C_{n·}) is then tight since it is a sequence of monotone processes whose finite dimensional distributions converge to those of the a.s. continuous process (−t B_1)_{t∈[0,1]} (this is Lemma 8(ii)). Hence the family n^{−1/2} (S_{n·}, C_{n·}) is tight. The limit is identified again thanks to Lemma 5.
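Both cases of the correction can be combined into a single sampler, illustrating that the corrected occupancies always form a vector summing to n, i.e. that S + C ends at 0. A sketch (names are mine):

```python
import numpy as np

def corrected_occupancies(n, rng):
    """Throw Poisson(1) balls into each of n urns, then correct: add uniform
    balls if the total is short of n, or remove uniformly chosen balls if it
    exceeds n. The returned vector sums to exactly n."""
    counts = rng.poisson(1.0, size=n)
    s = int(counts.sum()) - n
    if s < 0:     # deficit: throw -s fresh balls uniformly
        extra = np.bincount(rng.integers(0, n, size=-s), minlength=n)
        counts = counts + extra
    elif s > 0:   # surplus: remove s balls chosen uniformly among those thrown
        balls = np.repeat(np.arange(n), counts)
        removed = np.bincount(rng.choice(balls, size=s, replace=False), minlength=n)
        counts = counts - removed
    return counts

rng = np.random.default_rng(4)
counts = corrected_occupancies(200, rng)
# After correction the total is exactly n: the corrected walk ends at 0.
```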

Conclusion
The idea of this proof appeared after a discussion with Philippe Duchon, in which he explained to me his algorithm to generate uniformly a Bernoulli bridge with 2n steps, that is, a random walk S = (S_k)_{k=0,…,2n} with increments ±1 conditioned by S_{2n} = 0: first build a simple random walk with 2n steps, choosing i.i.d.
increments +1 or −1 with probability 1/2. If S_{2n} = 0 then it is done. If not, assume that S_{2n} = 2k > 0. Then pick indices I_1, I_2, … uniformly at random in {1, …, 2n}. If I_i is the index of a positive increment, change it into a negative one; if it is negative, do nothing. Stop when k increments have been changed. By a simple symmetry argument the path obtained is uniform in the set of Bernoulli bridges of size 2n. I found that this was a nice way to prove that the rescaled Bernoulli bridge converges to the Brownian bridge; this can be proved using the same arguments as those exposed above: asymptotically, the correction procedure "removes a straight line from the Brownian motion". Therefore, I tried to find other increment distributions for which a similar correction procedure would be possible. This appears not to be so general, or at least not so agreeable. The problem is the following: in general there does not exist a simple correction procedure that conserves, at each step of the correction, the property of the trajectory to have, conditionally on its terminal position k, the law of a simple random walk conditioned by S_n = k.
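Duchon's generation procedure described above can be sketched directly (an illustrative implementation; the function name is mine, and the negative-excess case is handled by the obvious symmetry):

```python
import random

def bernoulli_bridge(n, rng=None):
    """Generate a uniform Bernoulli bridge of length 2n (a +/-1 walk with
    S_{2n} = 0) by flipping uniformly picked majority-sign increments."""
    if rng is None:
        rng = random.Random()
    steps = [rng.choice((-1, 1)) for _ in range(2 * n)]
    excess = sum(steps) // 2         # S_{2n} = 2k: k increments must be flipped
    sign = 1 if excess > 0 else -1   # by symmetry, same procedure if S_{2n} < 0
    excess = abs(excess)
    while excess > 0:
        i = rng.randrange(2 * n)     # pick a uniform index
        if steps[i] == sign:         # flip only increments of the majority sign
            steps[i] = -sign
            excess -= 1
    walk = [0]
    for step in steps:
        walk.append(walk[-1] + step)
    return walk

bridge = bernoulli_bridge(50)
# bridge has 2n + 1 = 101 points, starts and ends at 0, with +/-1 increments
```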
Convergence in C[0, 1] and in (C[0, 1])^2

We recall some classical facts concerning weak convergence in C[0, 1] and (C[0, 1])^2. First, tightness and relative compactness are equivalent in these spaces by Prohorov's theorem, since both are Polish spaces.
Lemma 8 (i) Let (X_n, Y_n) be a sequence of pairs of processes in (C[0, 1])^2. The tightness of both families (X_n) and (Y_n) implies that of (X_n, Y_n).
(ii) Let (X_n) be a sequence of monotone processes in C[0, 1]. If the finite dimensional distributions of (X_n) converge to those of an a.s. continuous process X, then (X_n) is tight, and hence X_n converges in distribution to X in C[0, 1].
Proof. (i) Take two compacts K_1 and K_2 of C[0, 1] such that P(X_n ∈ K_1) ≥ 1 − ε and P(Y_n ∈ K_2) ≥ 1 − ε; then P((X_n, Y_n) ∈ K_1 × K_2) ≥ 1 − 2ε and K_1 × K_2 is compact in (C[0, 1])^2.
(ii) Only the tightness of (X_n) in C[0, 1] has to be checked. For any function f : [0, 1] → R and δ > 0, the global modulus of continuity of f is

ω_δ(f) = sup_{|s − t| ≤ δ} |f(s) − f(t)|.

Since X_n is monotone, say increasing, for any positive integer m,

ω_{1/m}(X_n) ≤ A_{m,n} := 2 max{ X_n(k/m) − X_n((k−1)/m), k = 1, …, m }.
Since the finite dimensional distributions of (X_n) converge to those of X,