Bootstrapping the empirical distribution of a stationary process with change-point

Abstract: When detecting a change-point in the marginal distribution of a stationary time series, bootstrap techniques are required to determine critical values for the tests when the pre-change distribution is unknown. In this paper, we propose a sequential moving block bootstrap and demonstrate its validity under a converging alternative. Furthermore, we demonstrate that power is still achieved by the bootstrap under a non-converging alternative. We follow the approach taken by Peligrad in [14], and avoid assumptions of mixing, association or near epoch dependence. These results are applied to a linear process and are shown to be valid under very mild conditions on the existence of any moment of the innovations and a corresponding condition of summability of the coefficients.


Introduction
Structural stability is typically a key component of time series models and corresponding inferential methods. Consequently, tests for structural change are critical when fitting any time series model to real-world data, and as a result there is an enormous literature on change-point detection. Somewhat surprisingly however, there has been relatively little attention paid to detecting a change in the marginal distribution of the time series. The most relevant references in this context are [7], [8] and most recently, [17]. The clear advantage of this approach is that the nature of the change (change in location, scale, covariance, etc.) need not be specified in the alternative, provided that it results in a change in the marginal distribution.
In the case of [7] and [8], the key to studying the asymptotic behaviour of the test statistics is a functional central limit theorem (FCLT) for the sequential empirical process, both with and without a change-point. A different approach is taken in [17], where more general forms of functional data are considered, with the empirical distribution considered as a special case. In [17], complicated tightness arguments are avoided by regarding the empirical distribution as a Hilbert space-valued functional, with weak convergence defined in terms of the L^2-norm of the Hilbert space. Appropriate test statistics are defined in [7], [8] and [17]. In [8] and [17], critical values for the test statistics are found via bootstrap techniques. In the case of [8], a weighted moving block bootstrap is shown to be valid under strong mixing conditions when the change evolves gradually over the observed period, while in [17], a disjoint block bootstrap (DBB) is proposed whose asymptotic behaviour is considered under a converging alternative, assuming L^1-near epoch dependence (NEP(1)) of the stationary sequence. In [7], there is little discussion of finding critical values, although the long memory linear process is discussed in some depth.
In this article, we take a closer look at the validity of the moving block bootstrap (MBB) in establishing critical values for detecting a change in the marginal distribution of a stationary time series using an approach due to Peligrad [14]. Theorem 2.2 of [14] establishes the validity of the MBB for the empirical distribution of a stationary process under straightforward moment conditions that do not involve any specific assumptions of mixing, association or near epoch dependence. Here we define a sequential version of the bootstrapped empirical process; our main results are a sequential version of Peligrad's Theorem 2.2 and an extension to a time series with a change point. As noted in [17], the most difficult aspect is establishing tightness of the bootstrapped empirical process in the function space D(R × [0, 1]).
There are several advantages to this approach. The MBB is known to be superior to the DBB of [17] in concrete applications (cf. [16], [11] and [12]). With Peligrad's moment conditions, we are able to establish almost sure weak convergence of the sequential bootstrapped empirical process on D(R × [0, 1]) under both converging and non-converging alternatives. Our main example, the causal linear process, is used to illustrate that the bootstrap is valid for processes that are not mixing or NEP(1). Consequently, we are able to apply the bootstrap to heavy-tailed models, which is of particular importance for examples in finance and economics. Further, by working with the empirical process we achieve more flexibility than is possible with the L^2 approach of [17] and are able to bootstrap any statistic that is a continuous functional of the sequential empirical process.
We demonstrate that Kolmogorov-Smirnov and Cramér-Von Mises test statistics achieve good power under both converging and non-converging alternatives.
We proceed as follows: we introduce the sequential moving block bootstrap in the next section and present a sequential version of Peligrad's CLT for the bootstrapped empirical process. In Section 3 we consider the behaviour of the bootstrapped empirical process when there is a change-point. It is shown in Theorem 3.2 that under a converging alternative, the bootstrapped sequential empirical process has the same asymptotic behaviour as the sequential empirical process without a change. On the other hand, Theorem 3.3 illustrates that a stronger normalization is needed when the alternative is not converging. In Section 4 we consider the asymptotic behaviour of Kolmogorov-Smirnov and Cramér-Von Mises test statistics under the null hypothesis and both converging and non-converging alternatives, and show that the bootstrap leads to consistent tests in all cases. Examples are given in Section 5, including an in-depth discussion of the linear model. In Section 6, simulations illustrate the performance of the tests in the case of both converging and non-converging alternatives. The tests will be seen to perform well even when first moments do not exist. Concluding comments and directions for further research are presented in Section 7, and all proofs appear in Section 8.

The bootstrap sequential empirical central limit theorem
As noted in the Introduction, our goal is to establish the validity of a sequential bootstrap technique that does not require conditions of mixing, association or near epoch dependence. To avoid these assumptions, we use the approach taken by Peligrad in [14], where sufficient conditions for the moving block bootstrap empirical CLT are expressed in terms of moments. In this section we review the moving block bootstrap and define a sequential bootstrapped empirical process. Our main result is Theorem 2.2, a sequential version of the bootstrap empirical CLT of [14].
We shall assume throughout that we have a strictly stationary ergodic stochastic process (X_i, i ∈ Z) with marginal distribution F, defined on a probability space (Ω, F, P). The empirical distribution function (edf) F_n is

F_n(x) = (1/n) Σ_{i=1}^n I(X_i ≤ x), x ∈ R,

and the sequential edf is defined as

F_n(x, s) = (1/n) Σ_{i=1}^{[ns]} I(X_i ≤ x), (x, s) ∈ R × [0, 1]. (1)

The empirical process is defined as W_n(x) = √n (F_n(x) − F(x)), and the sequential empirical process is

W_n(x, s) = (1/√n) Σ_{i=1}^{[ns]} (I(X_i ≤ x) − F(x)). (2)

Under appropriate regularity conditions, including some form of short memory (see, for example, [4], [5], [8]),

W_n(·, ·) →_D W(·, ·),

where W(·, ·) is a mean zero Gaussian process with covariance function

Γ((x, s), (y, t)) = (s ∧ t) Σ_{j∈Z} Cov(I(X_0 ≤ x), I(X_j ≤ y)),

and →_D denotes weak convergence of random elements taking values in the space D(R × [0, 1]) equipped with Skorokhod's J_1-topology (cf. [1] and [9] for more details).
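For concreteness, the sequential edf and sequential empirical process above can be computed directly from their definitions. This is a minimal sketch; the function names are ours, not the paper's:

```python
import numpy as np

def sequential_edf(x, t, s):
    """F_n(t, s) = (1/n) * #{i <= [ns] : X_i <= t}."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    m = int(np.floor(n * s))          # [ns]
    return np.count_nonzero(x[:m] <= t) / n

def sequential_empirical_process(x, t, s, F):
    """W_n(t, s) = (1/sqrt(n)) * sum_{i <= [ns]} (I(X_i <= t) - F(t)),
    where F is the (known) marginal distribution function."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    m = int(np.floor(n * s))
    return (np.count_nonzero(x[:m] <= t) - m * F(t)) / np.sqrt(n)
```

With s = 1 these reduce to the usual (non-sequential) edf and empirical process.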
The (non-sequential) bootstrapped empirical process is defined as follows. Let l = l_n be the block length and k = k_n the number of blocks, with n = lk, and extend the sample circularly via the triangular array X_{ni} = X_i for 1 ≤ i ≤ n and X_{ni} = X_{i−n} for n < i ≤ n + l − 1. Let I_{n1}, ..., I_{nk} be i.i.d. random variables, each uniformly distributed on {1, 2, ..., n}, and define the bootstrap sample of size n by

(X_1^(b), ..., X_n^(b)) = (X_{n,I_{n1}}, ..., X_{n,I_{n1}+l−1}, ..., X_{n,I_{nk}}, ..., X_{n,I_{nk}+l−1}). (3)

The bootstrapped empirical process is

W_n^(b)(x) = √n (F_n^(b)(x) − E*[F_n^(b)(x)]), x ∈ R, (4)

where

F_n^(b)(x) = (1/n) Σ_{i=1}^n I(X_i^(b) ≤ x) (5)

is the bootstrapped empirical distribution and E* denotes expectation conditional on the sample. Continuing with the sequential bootstrapped empirical process, define the i-th block empirical distribution to be

F̂_{n,i}(x) = (1/l) Σ_{j=i}^{i+l−1} I(X_{nj} ≤ x), 1 ≤ i ≤ n.

F. El Ktaibi and B. G. Ivanoff
Using the definition (5) of the bootstrapped empirical distribution and the fact that Σ_{i=1}^n I(I_{nj} = i) = 1, when n = lk we can rewrite the bootstrapped empirical process defined above in (4) as follows:

W_n^(b)(x) = (1/√k) Σ_{j=1}^k √l (F̂_{n,I_{nj}}(x) − (1/n) Σ_{i=1}^n F̂_{n,i}(x)). (6)

The representation in (6) suggests the following definition for a sequential bootstrapped empirical process: for (x, s) ∈ R × [0, 1] and n = lk,

W_n^(b)(x, s) = (1/√k) Σ_{j=1}^{[ks]} √l (F̂_{n,I_{nj}}(x) − (1/n) Σ_{i=1}^n F̂_{n,i}(x)), (7)

where F̂_{n,i} is the i-th block empirical distribution. As in [14], we assume the following relationship between the block lengths l_n and the number of blocks k_n, which allows the block size l_n to be arbitrarily close to O(n^{1/3}), observed by Künsch [11] to be the optimal length for the MBB. Following the notation used in [14], we write a_n ≲ b_n to indicate a_n = O(b_n).
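The block resampling scheme behind (6) and (7) can be sketched as follows. This is an illustrative implementation under our own naming conventions, using the circular extension of the sample and block start indices drawn uniformly from the full index set:

```python
import numpy as np

def mbb_resample(x, l, rng):
    """One moving block bootstrap sample of size n = k*l.

    Block start indices are drawn uniformly (0-based here); the
    sample is extended circularly, so a block that starts near the
    end wraps around to the beginning.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    k = n // l                        # number of blocks; n = k*l assumed
    starts = rng.integers(0, n, size=k)
    idx = (starts[:, None] + np.arange(l)[None, :]) % n   # circular extension
    return x[idx].ravel()

def sequential_boot_edf(xstar, grid, s, l):
    """Sequential bootstrapped edf: count only the observations in
    the first [ks] resampled blocks, normalized by the full size n."""
    xstar = np.asarray(xstar, dtype=float)
    n = len(xstar)
    k = n // l
    m = int(np.floor(k * s)) * l      # observations in the first [ks] blocks
    return (xstar[:m, None] <= grid[None, :]).sum(axis=0) / n
```

Centring and scaling the block sums as in (7) then yields the sequential bootstrapped empirical process itself.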
Assumption 2.1. Let (l_n) and (k_n) be sequences of natural numbers satisfying l_n = l_{2^k} for 2^k ≤ n < 2^{k+1}, l_n → ∞ as n → ∞, and n = k_n l_n.
Our first result is the following sequential version of Theorem 2.2 of [14].

Theorem 2.2.
Let (X_n)_{n∈Z} be a stationary sequence of random variables. Let l_n, k_n be sequences of natural numbers satisfying Assumption 2.1. Assume there are two constants C_1 and C_2 such that, for some γ > 0, every x < y, and every 1 ≤ m ≤ n,

E[S_m(x, y)^2] ≤ C_1 m (F(y) − F(x))^γ, (9)

E[S_m(x, y)^4] ≤ C_2 [m (F(y) − F(x))^γ + m^2 (F(y) − F(x))^{2γ}], (10)

where S_m(x, y) = Σ_{i=1}^m (I(x < X_i ≤ y) − (F(y) − F(x))).
Then, with W_n^(b)(·, ·) as in (7),

W_n^(b)(·, ·) →_D W(·, ·) as n → ∞, almost surely,

where W(·, ·) is a Gaussian process with zero mean and covariance

Γ((x, s), (y, t)) = (s ∧ t) Σ_{j∈Z} Cov(I(X_0 ≤ x), I(X_j ≤ y)).

The proof appears in Section 8.

Comments:
1. Assumption 2.1 requires that n be a multiple of l_n. In practice, if this is not the case, then the number of blocks selected is k_n = ⌊n/l_n⌋. It is tedious but straightforward to show that this does not affect the asymptotic behaviour of the resampled process; see [17], for example.
2. We note that the sequential bootstrap is valid under the same conditions as in [14] for the usual empirical process. Although this is unsurprising, the proof of tightness is highly technical. Although the blocks are (conditionally) i.i.d., their definition changes with n. As a result, the straightforward argument used by van der Vaart and Wellner in [18] for sequential empirical processes based on a single sequence of i.i.d. random variables cannot be applied directly to the moving block bootstrap.
3. As observed by Peligrad in [14] and Radulovic in [16], the bootstrap process may converge in situations in which the original sequential empirical process does not, and vice versa. This point will be illustrated in the discussion in Section 5.

The bootstrap empirical central limit theorems with change-point
We now introduce the change-point model. Let (Y_i, i ∈ Z) and (Z_i, i ∈ Z) be strictly stationary ergodic sequences and let θ_n ∈ (0, 1]. Let F and G be the distributions of Y_0 and Z_0, respectively. Borrowing the notation of [7], we write X_n := (X_1, ..., X_n) ∈ Ψ_n(θ_n, F, G) if

X_i = Y_i for 1 ≤ i ≤ [nθ_n] and X_i = Z_i for [nθ_n] < i ≤ n.

In the case of no change (θ_n = 1 for all n), we write X_n ∈ Ψ_n(F).

Recall the sequential empirical distribution function F_n(x, s) defined in (1). When X_n ∈ Ψ_n(θ_n, F, G), define the centred process

W̄_n(x, s) = √n (F_n(x, s) − H^(n)(x, s)), where H^(n)(x, s) = (s ∧ θ_n) F(x) + (s − θ_n)^+ G(x).

We note H^(n)(x, s) = sF(x) when there is no change (θ_n = 1), and so under the null hypothesis |W_n(x, s) − W̄_n(x, s)| ≤ 1/√n and the processes are asymptotically equivalent.
Under appropriate regularity conditions on both the pre- and post-change stationary sequences (including some form of short memory) and assuming that θ_n → θ ∈ [0, 1], it may be shown that if X_n ∈ Ψ_n(θ_n, F, G), then as n → ∞,

W̄_n(·, ·) →_D W^(θ)(·, ·), (14)

where W^(θ)(·, ·) is a centred Gaussian process with a finite covariance function; see, for example, Theorem 2.4 of [5]. We now consider the behaviour of the sequential bootstrapped empirical process W_n^(b)(·, ·) defined in (7). We have two cases to consider, and we will use the following definition.

Definition 3.1. We say that we have a converging alternative if θ_n → 0 or if θ_n → 1. The alternative is non-converging if θ_n → θ ∈ (0, 1).
The so-called "converging alternative" means that the change takes place quite early or late in the observation period, and we shall see that consistency of the bootstrap depends on the rate at which θ n converges to 0 or 1.
When the assumption of a converging alternative is not satisfied, the conditional covariances of the bootstrapped process diverge. In this case, we need a stronger normalization.
Remark 3.4. The significance of Theorem 3.3 is that while the bootstrapped test statistics defined in the next section diverge, they do so more slowly than the original test statistics, and so the bootstrap test still achieves power without a converging alternative. This will be made precise in Proposition 4.6 and Comment 4.7.
The proofs of Theorems 3.2 and 3.3 appear in Section 8.

Test statistics
Recalling the notation introduced in the preceding section, when there is an unknown change-point, the hypothesis and alternative are

H_0: X_n ∈ Ψ_n(F) versus H_1: X_n ∈ Ψ_n(θ_n, F, G) with θ_n ∈ (0, 1) and F ≠ G.

In this case, the test statistics will be based on the following process:

V_n(x, s) = ([ns](n − [ns])/n^{3/2}) (F_{[ns]}(x) − F̃_{n−[ns]}(x)), (16)

where F̃_{n−m}(x) = (1/(n − m)) Σ_{i=m+1}^n I(X_i ≤ x) is the empirical distribution function based on X_{m+1}, ..., X_n. The process V_n(·, ·) compares the (suitably weighted) empirical distributions before and after [ns], for 0 ≤ s ≤ 1. To test the pair (H_0, H_1), we use the following statistics:
• Weighted Kolmogorov-Smirnov statistic: T_1 = sup_{x∈R} sup_{0≤s≤1} |V_n(x, s)|.
• Weighted Cramér-Von Mises statistic: T_2 = ∫_0^1 ∫_R (V_n(x, s))^2 dF_n(x) ds.
We reject the null hypothesis H_0 for large values of T_i, for i = 1, 2.
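As an illustration, T_1 and T_2 can be approximated on the finite grid of observed sample points. The CUSUM-type weight [ns](n − [ns])/n^{3/2} used below is our assumption for the weighting, and the integrals defining T_2 are replaced by Riemann sums:

```python
import numpy as np

def change_point_stats(x):
    """Approximate weighted KS (T1) and Cramér-Von Mises (T2)
    statistics on the grid of observed sample points.

    V_n(x, s) compares the empirical distributions before and after
    the candidate split [ns]; the weight [ns](n - [ns]) / n^(3/2)
    is an assumed CUSUM-type weighting.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    grid = np.sort(x)                      # evaluate both edfs at the data
    T1, T2 = 0.0, 0.0
    for m in range(1, n):                  # m plays the role of [ns]
        F_pre = (x[:m, None] <= grid[None, :]).mean(axis=0)
        F_post = (x[m:, None] <= grid[None, :]).mean(axis=0)
        V = m * (n - m) / n**1.5 * (F_pre - F_post)
        T1 = max(T1, float(np.abs(V).max()))
        T2 += float(np.sum(V**2)) / n**2   # Riemann sum over x and s
    return T1, T2
```

A sample with an abrupt mid-sample shift yields much larger values of both statistics than a sample with no change.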
The proof of Proposition 4.1 is identical to that of Proposition 2.7 of [5]. We next deal with consistency of the test statistics T_1 and T_2. There are two cases of converging alternatives to consider: θ_n → 0 and θ_n → 1. The test statistics are consistent under a converging alternative provided that the rate of convergence of θ_n is slower than 1/√n. The following is a slight sharpening of Proposition 2.8 of [5]. The proof appears in Section 8.
Suppressing the dependence of l and k on n for notational convenience, recall the bootstrap sample of size n defined in (3):

(X_1^(b), ..., X_n^(b)) = (X_{n,I_{n1}}, ..., X_{n,I_{n1}+l−1}, ..., X_{n,I_{nk}}, ..., X_{n,I_{nk}+l−1}),

where I_{n1}, I_{n2}, ..., I_{nk} are independent and identically distributed random variables, each having the uniform distribution on {1, 2, ..., n}.
The testing procedure is based on V_n^(b)(·, ·), the bootstrapped counterpart of the process V_n(·, ·) introduced in (16), defined from the bootstrap sample X_1^(b), ..., X_n^(b) in the same way that V_n is defined from X_1, ..., X_n. We define bootstrapped versions of the Kolmogorov-Smirnov and Cramér-Von Mises statistics as follows:
• Bootstrapped Kolmogorov-Smirnov statistic: T_1^(b) = sup_{x∈R} sup_{0≤s≤1} |V_n^(b)(x, s)|.
• Bootstrapped Cramér-Von Mises statistic: T_2^(b) = ∫_0^1 ∫_R (V_n^(b)(x, s))^2 dF_n^(b)(x) ds.
The asymptotic behaviour of the test statistics follows from an application of the continuous mapping theorem to Theorems 2.2, 3.2 and 3.3.
First we deal with the null hypothesis.

Proposition 4.3. Under the assumptions of Theorem 2.2, T_1^(b) and T_2^(b) converge almost surely as n → ∞ to the corresponding functionals of W(·, ·), where W(·, ·) is the limiting process in Theorem 2.2.
Next we deal with a converging alternative (θ n → 0 or 1).

Proposition 4.4. Under the assumptions of Theorem 3.2, T_1^(b) and T_2^(b) converge almost surely as n → ∞ to the corresponding functionals of W(·, ·), where W(·, ·) is the limiting process in Theorem 3.2. The required rate condition will certainly hold, for instance, if n^{h−1/2} ≲ θ_n ≲ n^{−1/3} for some 0 < h < 1/6 in the first case, or analogously n^{h−1/2} ≲ 1 − θ_n ≲ n^{−1/3} in the second case.
Under a non-converging alternative, the bootstrapped test statistics diverge at a slower rate than the corresponding test statistics T_1 and T_2, ensuring that critical values tabulated from repeated bootstrap samples yield consistent tests. In fact, as will be seen in Section 6, simulations illustrate that good power is still achieved when the alternative is not converging.
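The tabulation of critical values from repeated bootstrap samples can be sketched as follows. For brevity this sketch fixes s = 1/2 rather than taking the supremum over s, and all names are ours:

```python
import numpy as np

def ks_change_stat(x):
    """Weighted sup-norm comparison of the pre/post edfs at the
    midpoint split (a simplification: s is fixed at 1/2)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    m = n // 2
    grid = np.sort(x)
    F_pre = (x[:m, None] <= grid[None, :]).mean(axis=0)
    F_post = (x[m:, None] <= grid[None, :]).mean(axis=0)
    return m * (n - m) / n**1.5 * float(np.abs(F_pre - F_post).max())

def bootstrap_test(x, l, B=200, alpha=0.05, seed=0):
    """Reject H0 (no change) if the observed statistic exceeds the
    empirical (1 - alpha)-quantile of B MBB statistics."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    n = len(x)
    k = n // l
    t_obs = ks_change_stat(x)
    t_boot = np.empty(B)
    for b in range(B):
        # circularly extended moving blocks, uniform start indices
        idx = (rng.integers(0, n, size=k)[:, None] + np.arange(l)) % n
        t_boot[b] = ks_change_stat(x[idx].ravel())
    crit = float(np.quantile(t_boot, 1.0 - alpha))
    return t_obs, crit, t_obs > crit
```

Because the resampled blocks mix pre- and post-change observations, the bootstrapped statistic grows more slowly than the observed one under the alternative, which is what makes the tabulated critical values useful.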

Examples
For the test statistics T_1, T_2 and their bootstrapped counterparts to yield consistent tests of the hypothesis of no change, we need that:
1. the sequential empirical process converges weakly, both with and without a change (cf. (14)); and
2. the pre- and post-change sequences {Y_i} and {Z_i} both satisfy (9) and (10) of Theorem 2.2.
Note that the behaviour of the bootstrapped process W_n^(b) depends only on item 2 above, and no additional assumptions are needed on the relationship between the pre- and post-change sequences (cf. Theorems 3.2 and 3.3). However, for item 1 above, further information is required in order to ensure the asymptotic independence of the pre- and post-change increments of W_n (for example, some form of joint short memory of {(Y_i, Z_i)}). This will be investigated in detail for the causal linear process, which was the principal motivation for this work. First, we present two examples from [14] giving sufficient conditions for the validity of the bootstrap for mixing and associated sequences.

Causal linear processes
The causal linear process was the subject of both [6] and [5]. The validity of the MBB for the (non-sequential) empirical process under conditions similar to those of [3] was established in [6], and the linear process with a change-point was considered in [5]. Here, we combine these results to develop sufficient conditions for both items 1 and 2 above to hold, ensuring that the sequential MBB produces consistent tests for linear processes with a change-point.
The stationary causal linear process is defined as follows: for i ∈ Z,

X_i = Σ_{j=0}^∞ a_j ξ_{i−j},

where (ξ_j : j ∈ Z) is a sequence of independent and identically distributed (i.i.d.) random variables and (a_j : j ∈ N) is an absolutely summable sequence of constants. Summability of the coefficients ensures that if E[ξ_0^2] < ∞, the process has short memory in the sense that the covariances are summable and, more generally, if E[|ξ_0|] < ∞, the process is L^1-near epoch dependent (cf. [17]). However, as will be seen below, the linear model includes processes that are not mixing or NEP(1).
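A truncated version of the causal linear process can be simulated by convolving the innovations with the coefficient sequence. This sketch (names ours) approximates the infinite sum by a finite window:

```python
import numpy as np

def linear_process(n, a, innovations):
    """Truncated causal linear process X_i = sum_j a_j * xi_{i-j}.

    `a` is a finite array approximating the absolutely summable
    coefficient sequence; `innovations` must supply n + len(a) - 1
    i.i.d. draws (earliest first), so every X_i uses a full window.
    """
    a = np.asarray(a, dtype=float)
    J = len(a)
    xi = np.asarray(innovations, dtype=float)
    assert len(xi) == n + J - 1
    # X_i = a_0 xi_i + a_1 xi_{i-1} + ... : a causal moving average,
    # which is exactly np.convolve with mode="valid"
    return np.convolve(xi, a, mode="valid")
```

For AR(1)-type coefficients a_j = ρ^j, a window long enough that ρ^J is negligible gives a good approximation; heavy-tailed innovations (e.g. Cauchy draws) can be passed in directly.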
We begin by stating sufficient conditions for weak convergence of the sequential empirical process, both with and without a change (cf. (14)).

Assumptions 5.1.
1. Let {a_j, j ∈ Z} be a sequence of non-random weights, infinitely many of which are non-zero, satisfying

Comments 5.2.
• Any linear process with Gaussian innovations and summable coefficients satisfies Assumptions 5.1, but such a process is not necessarily mixing, as illustrated by a classic example due to Ibragimov of a linear process that is not strong mixing.
• We emphasize that this model can include any sort of heavy-tailed innovations ξ_i, provided that E[|ξ_0|^δ] < ∞ for some δ > 0. Since it is implicit in the definition of L^1-near epoch dependence that the X_i's have a finite first moment (cf. [17]), the linear model above includes processes that are not NEP(1).
• Assumption 5.1.2 implies that the distribution function F_ξ of ξ_0 satisfies the Hölder condition |F_ξ(x) − F_ξ(y)| < C|x − y|^Δ. It also implies that the distribution function of a partial sum of the a_j ξ_{i−j} terms is differentiable with a bounded density satisfying a uniform Lipschitz condition, provided that sufficiently many terms with non-zero a_j are included in the moving average (cf. [3]). Obviously, the distribution function of X_0 is uniformly Lipschitz as well.
• The assumption that infinitely many coefficients (a_j) are non-zero is not required if F_ξ has a uniformly Lipschitz derivative. In this case, all the results that follow remain valid.
We now introduce the change-point model for the linear process. Let {θ_n ∈ (0, 1], n ∈ N} be a convergent sequence with limit θ ∈ [0, 1]. To define the causal linear process with a change-point at [nθ_n], consider the following stationary processes: for i ∈ Z and a_j^(1), a_j^(2) ∈ R,

Y_i = Σ_{j=0}^∞ a_j^(1) ξ_{i−j} and Z_i = Σ_{j=0}^∞ a_j^(2) ξ'_{i−j},

where the vectors (ξ_i, ξ'_i) are i.i.d. and both sequences {a_j^(1)} and {a_j^(2)} are absolutely summable. We do not make any assumption about the relation between ξ_i and ξ'_i; they can have any sort of dependence structure (and consequently, {Y_i} and {Z_i} need not be independent). Denote by F and G the respective distribution functions of Y_0 and Z_0.
The following result is Theorem 2.4 of [5]. Note that under the null hypothesis, θ n = 1 for all n.
Next, we deal with weak convergence of the bootstrapped linear process W_n^(b). However, we impose a slightly different set of assumptions from those used for the convergence of W_n.

Assumptions 5.4.
1. Let (a_j, j ∈ Z) be a sequence of non-random weights, infinitely many of which are non-zero, such that for some γ ∈ (0, 1], Σ_j |a_j|^γ < ∞.
2. There exist constants C < ∞ and Δ > 0 such that for all u ∈ R, |E[e^{iuξ_0}]| ≤ C(1 + |u|)^{−Δ}.
3. E[|ξ_0|^{2γ}] < ∞, where γ ∈ (0, 1] is as in item 1 above.

The following theorem follows from the proof of Theorem 2.5 in [6] and demonstrates the validity of the sequential moving block bootstrap. In particular, the bootstrap is valid for linear processes that need not be NEP(1), and so cannot be handled by the results of [8] or [17].

Simulations
In this section, we illustrate the performance of the MBB tests in detecting changes under various scenarios. We begin by simulating the linear model to illustrate the performance of the tests proposed in Section 4 for both converging and non-converging alternatives. In addition, we consider both normal and Cauchy innovations to illustrate that the procedure performs well regardless of whether the innovations have a finite first moment.
To this end, we consider the following stationary autoregressive processes:

Y_i = ρ_1 Y_{i−1} + ξ_i and Z_i = ρ_2 Z_{i−1} + ξ'_i,

where the vectors of innovations (ξ_i, ξ'_i) are i.i.d. and |ρ_1|, |ρ_2| < 1 (a_j^(i) = ρ_i^j, i = 1, 2). The change-point model satisfies X_i = Y_i for i ≤ [nθ_n] and X_i = Z_i for i > [nθ_n]. In our first two examples we consider normal and Cauchy innovations. In both cases, we investigate separately changes in the location or scale of the innovations under converging (θ_n = 0.08) and non-converging (θ_n = 0.5) alternatives, illustrating the performance of both test statistics, the Kolmogorov-Smirnov (K.S) and Cramér-Von Mises (C.V.M). In all our examples, the tests are carried out at a nominal level of significance α = 5% with 500 bootstrap replications used to determine the appropriate critical value in each case. Each simulation was repeated 400 times for the analysis of the power.
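The autoregressive change-point model above can be simulated as follows. This is a sketch under our own naming; the pre- and post-change segments are generated independently here, although the model allows the innovation pairs (ξ_i, ξ'_i) to be dependent:

```python
import numpy as np

def ar1_changepoint_sample(n, theta, rho1, rho2, innov1, innov2, burn=500):
    """X_1, ..., X_n: an AR(1) with parameter rho1 and innovations
    drawn by innov1 up to [n*theta], then an AR(1) with parameter
    rho2 and innovations drawn by innov2.

    innov1/innov2 are callables mapping a size to i.i.d. draws,
    e.g. rng.standard_normal or rng.standard_cauchy.
    """
    cp = int(np.floor(n * theta))

    def ar1(rho, innov, m):
        e = innov(m + burn)
        y = np.zeros(m + burn)
        for i in range(1, m + burn):
            y[i] = rho * y[i - 1] + e[i]
        return y[burn:]                  # discard burn-in

    return np.concatenate([ar1(rho1, innov1, cp),
                           ar1(rho2, innov2, n - cp)])
```

A burn-in period is discarded so that each segment is (approximately) a draw from its stationary distribution.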
It was found that the converging alternative was much more sensitive to the relation between n and l than the non-converging alternative. In the case of a converging alternative (θ_n = 0.08), the sample size used was n = 15,000 with l = 15 and k = 1,000; for the non-converging alternative (θ_n = 0.5), the sample size was n = 10,000 with l = 25 and k = 400.

Example 1
Here we investigate the performance of our test statistics in detecting a change in an AR(1) process with normal innovations. We consider changes in the mean and the variance of the innovations.
• Change in the mean of the innovations: In this case, we consider the following model with ρ_1 = ρ_2 = 0.5: the pre-change innovations ξ_i are N(μ_0, 1) and the post-change innovations ξ'_i are N(μ_1, 1), for various values of μ_1.
• Change in the standard deviation of the innovations: In this case, we consider the following model with ρ_1 = ρ_2 = 0.5: the pre-change innovations ξ_i are N(0, 1) and the post-change innovations ξ'_i are N(0, σ²). We consider σ = 1 under the null hypothesis and σ = 0.5, 0.6, 0.7, 0.8, 0.9, 1.1, 1.2, 1.3, 1.4, 1.5 under the alternatives. The empirical size and the power performance of the tests are illustrated in Fig. 2.

Example 2
Using the same parameters as in the preceding example, we now consider a stationary autoregressive model with Cauchy innovations.
• Change in the location parameter: In this case, the pre-change innovations ξ_i are standard Cauchy and the post-change innovations ξ'_i are Cauchy with a shifted location parameter.
• Change in the scale parameter: In this case, the pre-change innovations ξ_i are standard Cauchy and the post-change innovations ξ'_i are Cauchy with a different scale parameter.
Examples 1 and 2 illustrate that our unified approach allows us to easily detect changes in location or scale in the marginal distribution of a linear model. In all cases, we can see that the rejection rate under the null hypothesis is close to the nominal level of significance α = 0.05 and that we achieve good power under the alternatives. We note that, contrary to what is frequently observed, the Cramér-Von Mises statistic does not consistently outperform the Kolmogorov-Smirnov statistic. For both normal and Cauchy innovations, we note a marked improvement in the performance of the tests under the non-converging alternative, despite the fact that the bootstrapped test statistics diverge.
The large sample sizes used in Examples 1 and 2 are consistent with financial data. Next we consider smaller sample sizes and various block lengths.

Example 3
Analogously to the first example, we examine the performance of the testing procedure for smaller sample sizes n and various values of l when detecting a change in the mean of the normal innovations from μ 0 = 0 to μ 1 = 0.5 with θ n = 0.5. The empirical power and size are displayed in Table 1.
We observe that the tests still perform very well with much smaller sample sizes under the non-converging alternative. For all combinations of l and k, the power achieved is very good. However, the empirical size exceeds the nominal size for highly correlated processes (ρ = 0.9) and small values of l. This is an unsurprising shortcoming of the bootstrapped tests: short blocks do not accurately reflect the stronger dependence structure, and consequently the critical values obtained by the bootstrap are too small.

Example 4
In this example, we present simulations that illustrate the effect of a change in the tail behaviour of the marginal distribution. To this end, we consider the following model. As in the previous example, θ_n = 0.5, and we consider various values of n and l. Table 2 shows that the test performs very well in this case. However, the empirical power decreases as the block length increases since, as observed in the preceding example, the bootstrapped critical values are too small with shorter blocks when the observations are highly correlated.

Example 5
Since the applicability of the results presented in this paper goes beyond the scope of linear processes, we analyse the performance of our testing approach in the nonlinear case. This example is of particular interest, since both the pre- and post-change distributions are defined to have mean 0 and variance 1. First we introduce a simple ARCH process, defined recursively. Figure 5 displays a change from an i.i.d. N(0, 1) sequence to a simple ARCH process, and Figure 6 illustrates a change from one simple ARCH process to another.
As in the previous examples, the simulations were carried out at a nominal level α = 5% with 500 bootstrap replications, and each simulation was repeated 400 times to obtain the empirical power under a non-converging alternative (θ_n = 0.5). The sample size used in this case was n = 5000 with l = 10 and k = 500. Figures 5 and 6 show that our procedure can be used to detect changes from an i.i.d. sequence to an ARCH process, or from one ARCH process to another, for moderate sample sizes. Although the usual distribution-free tests are applicable for i.i.d. random variables, our results show that the bootstrap works well even without the assumption that the pre-change process is i.i.d. In this case, the power increases as the coefficient a increases. This makes sense, since larger values of a cause more dependence in the post-change process, and this, in turn, is reflected in the post-change marginal distribution. The second change model shows that the empirical size is very close to the nominal level and that the test statistics achieve acceptable power for small values of a, with much better performance for large values of a. In both cases, the performance of the two test statistics (Kolmogorov-Smirnov and Cramér-Von Mises) is virtually identical.
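One standardization of a simple ARCH(1) recursion that keeps the marginal mean at 0 and the marginal variance at 1 can be simulated as follows. The specific recursion X_i = ξ_i √(1 − a + a X_{i−1}²) is our assumption for illustration, not necessarily the paper's:

```python
import numpy as np

def simple_arch(n, a, rng, burn=500):
    """X_i = xi_i * sqrt(1 - a + a * X_{i-1}^2), xi_i i.i.d. N(0, 1).

    This standardization (an assumed form) keeps the marginal mean
    at 0 and the marginal variance at 1 for 0 <= a < 1, so a change
    from an i.i.d. N(0, 1) sequence (a = 0) alters neither moment,
    only the dependence structure and higher moments.
    """
    xi = rng.standard_normal(n + burn)
    x = np.zeros(n + burn)
    for i in range(1, n + burn):
        x[i] = xi[i] * np.sqrt(1.0 - a + a * x[i - 1] ** 2)
    return x[burn:]                      # discard burn-in
```

Concatenating two such samples with different values of a (or an i.i.d. segment with a = 0) gives a change-point model in which neither the mean nor the variance of the marginal distribution changes.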

Conclusion
In this article, we have defined a sequential version of the moving block bootstrap and demonstrated its validity in detecting a change-point in a stationary sequence under simple moment conditions. These results have been applied to a linear model to illustrate that the MBB can be applied to processes that are not mixing, associated, or near epoch dependent. Simulations illustrate the performance of the procedure in detecting changes in both linear and non-linear models.
There are many open questions beyond the scope of this paper that are of interest for further research:
• In fact, Theorem 2.4 of [5] is slightly more general than as stated here in Theorem 5.3. In the case of the linear process, it is possible to prove that (14) holds for random values of θ_n, provided that θ_n →_P θ, where θ ∈ [0, 1] is fixed. The random change-point [nθ_n] can be either independent or data-dependent (for details, see [5]). It would be useful to investigate conditions under which the MBB can be extended to a random change-point.
• Estimation of the value of θ_n should be considered.
• The results obtained here should be extended to processes with multiple change-points.
• There are many interesting models of stationary random fields that generalize both mixing and linear sequences (cf. [10]). The block bootstrap has also been extended to planar processes. The methods developed here could be generalized to detect a planar change-point or change-set. This question is currently under investigation.

Proof of Theorem 2.2
In what follows, for notational convenience we write l = l_n and k = k_n and fix a realization {x_i} of the stochastic process. Hence, the resampling mechanism becomes the unique source of randomness. We also consider the triangular array {x_ni} defined as in Section 2. Some definitions will be needed in the sequel.
The i-th block sample mean is x̄_{n,i} = (1/l) Σ_{j=i}^{i+l−1} x_{nj}, and the sample mean is x̄_n = (1/n) Σ_{j=1}^n x_{nj}. The bootstrapped sample mean is then defined from the bootstrap sample (x_{n,I_{n1}}, ..., x_{n,I_{n1}+l−1}, ..., x_{n,I_{nk}}, ..., x_{n,I_{nk}+l−1}), which is defined as in Section 2.
We shall also use the following notation.
Define Z_n^(b)(x) to be a sample-based version of the bootstrapped empirical process, obtained by replacing X_ni by x_ni in (6); here f_n(x) is defined by replacing X_ni by x_ni in (5). We also define a sample-based version of the sequential bootstrapped empirical process, by replacing X_ni by x_ni in (7), for (x, s) ∈ R × [0, 1]. We note again that the terms in the partial sum are independent and identically distributed.
To prove Theorem 2.2, we proceed with a sequence of propositions that are sequential versions of the results in Section 3 of [14]. Recall that we use C to denote a generic constant that may change at each appearance.
Our first proposition is a straightforward generalization of Proposition 3.1 of [14] and so is presented without proof.
The following proposition proves the convergence of the finite-dimensional distributions of Z_n^(b)(·, ·). We shall first define, for x, y ∈ R, the limiting covariance σ(x, y); the proposition asserts that this limit exists and that the finite-dimensional distributions converge for every (z_1, . . . , z_p, s_1, . . . , s_p) ∈ [0, 1]^{2p}. Proof. Recalling the definition of Z_n^(b)(x) in (21), exactly as in the proof of Proposition 3.3 of [14] we have that the limit in (28) exists. Using the representation in (22), we can see that for 0 ≤ s < t ≤ 1 the relevant covariance converges to sσ(x, y) as n → ∞.
Suppose now, for instance, that 0 = s 0 ≤ s 1 < s 2 < . . . < s p ≤ 1 and let α 1 , α 2 , . . . , α p be real numbers. We will use the Cramér-Wold device and prove that We have the following representation: Denote and apply Proposition 8.1 with x i replaced by y u i to get, as n → ∞ √ n k provided the existence of Using (28), we get Remark now that the u-indexed sums in (31) are independent and This completes the proof of the proposition.
Next, we prove that the sequence (Z_n^(b)(·, ·))_n defined by (22) is tight in D([0, 1]^2); in particular, for every ε, η > 0 there exist δ ∈ (0, 1) and N_0 such that for every n ≥ N_0, and consequently, if Y is taken as a limiting distribution on a subsequence, For computational clarity, we will identify all of the constants involved in the following proof. Note that we now assume 0 ≤ x_i ≤ 1, i = 1, ..., n.
Proof. The tightness of the sequence Z_n^(b)(x, s) will be proven by closely following the approach used by Naik-Nimbalkar and Rajarshi in [13] and using the restricted chaining argument given in Theorem VII.26 of [15], applied with the semimetric d((x, s), (y, t)), where x, y, s, t ∈ [0, 1], 0 < b < 1 and C_3 > 0.
By (22), we have the following representation and suppose for instance that s ≤ t, so we can now obtain by virtue of the independence of the E nj 's and inequality (33) that We also have by (32) Therefore, by Bennett's inequality (see [15], page 192), we have for every η > 0 where Remark now that for any δ > 0 Let us denote the nearest member of the α-net of [0, 1] 2 to (x, s) with respect to the semimetric d by (x α , s α ). If d((x, s), (y, t)) ≤ δ and n −r ≤ δ 2 , where r = min( 1 2 + a, c), then D n (x, y, s, t) ≤ C 4 δ 2 for some C 4 > 0 by (32). It will be shown later that i) For every λ ∈ (0, 1), every η > 0 and δ > 0 such that δ 2 ≥ n −r and where α 2 = α 2 n = n −r + To apply Theorem VII.26 of Pollard [15], it remains to show that the associated covering integral with respect to the δ-net of the semimetric d is finite for any 0 < δ ≤ 1. By definition, the covering integral is for δ small enough. Now, taking λ = 1/4, D = 2 √ C 4 and α 2 = n −r + 2C 1 n −a /C 4 B −1 (1/4) in Theorem VII.26 of Pollard [15], inequality (34) holds.
To complete the proof of the proposition, we shall prove inequalities (37) and (38).
Let x and y be any points in [0, 1]. Then, arguing as in [13] Since f n and [ks] assume (n + 1) and (k + 1) different values respectively, Z n (y, t) assumes at most (n + 1) 2 (k + 1) 2 values as (x, s) and (y, t) vary in [0, 1] 2 . Therefore, Suppose for instance that s ≤ s α , then by the kind of computations seen before, we get Bernstein's inequality (see [15], page 193) leads to the following: The definition of α and the condition (32) imply that for some constants C 5 > 0 and p > 0. Hence, This completes the proof of inequality (38) and that of Proposition 8.3.
We now return to the proof of Theorem 2.2. We will be applying Propositions 8.2 and 8.3 to a fixed trajectory x i = X i (ω), i = 1, 2, ... of the sequence of random variables {X i }.
Define the variables U_i = F(X_i) and consider the bootstrapped empirical process W_n^(b)(·) based on the U_i's, as in (4). Now replace P with P* and x_i = U_i(ω) in Propositions 8.2 and 8.3, where P* denotes the conditional probability given the sample (X_1, X_2, · · · , X_n). The almost sure convergence of the sequential bootstrapped empirical process W_n^(b)(·, ·) will follow from Propositions 8.2 and 8.3 provided that conditions (27) and (33) hold almost surely: i.e.
and for each x and y in [0, 1], for some constant C > 0 depending only on the trajectory, and some 0 < b < 1. This was proven by Peligrad under the conditions of our Theorem 2.2 (Proposition 4.1 and the proof of Theorem 2.2 in [14]). The conclusion of our Theorem 2.2 can now be derived in a routine manner, as in [2], for the sequential bootstrapped empirical process W_n^(b)(F(x), s).

Proofs of Theorems 3.2 and 3.3
We begin with a slight variation of Theorem 2.2. Let (x_n)_{n∈Z} be a realization of a stationary sequence (X_n)_{n∈Z} and let (θ_n) be a sequence in [0, 1] such that θ_n → θ ∈ [0, 1]. We introduce some notation: F_{[nθ_n]}^X(x) denotes the empirical distribution based on X_1, ..., X_{[nθ_n]}. Furthermore, the limits in (a) and (b) are independent.
We defer the proof of Theorem 8.4 to the end of the subsection.
The key to proving Theorems 3.2 and 3.3 is the following representation of W_n^(b) (P* denotes the conditional probability given the sample (X_1, X_2, · · · , X_n)).
Proof. This can be seen as follows: