Weighted resampling of martingale difference arrays with applications

Abstract: In this paper the behaviour of linear resampling statistics in martingale difference arrays X_{n,i}, i ≤ k(n), is studied. It is shown that different bootstrap and permutation procedures work if the array (X_{n,i})_i fulfils the conditions of a general central limit theorem. As an application we obtain, among other things, resampling versions of the Kuan and Lee [20] test for the martingale difference hypothesis.


Introduction
The asymptotic behaviour of martingales plays a very important role in testing statistical hypotheses. One reason lies in the fact that many test statistics T_n of interest can, in the context of general Doob-Meyer decompositions, be divided into a martingale part M_n and a predictable part A_n. To obtain better power functions or even distribution-free tests it can then be reasonable to replace the original T_n-test by tests based on the martingale part M_n. This occurs, e.g., via Khmaladze transformations in connection with goodness-of-fit, survival or rank statistics, see Khmaladze [18], Khmaladze and Koul [19] or Janssen and Meyer [15].
Since every martingale (M_n)_{n∈N} can be written as the sum of its increments,

M_n = Σ_{i=1}^{n} X_i,   (1.1)

we consider the linear statistic

S_n := Σ_{i=1}^{k(n)} X_{n,i},   (1.2)

where k(n) is a subsequence of N such that k(n) → ∞ as n → ∞. Here the random variables X_{n,i} : (Ω, A, P) → (R, B), i ≤ k(n), form a martingale difference array (MDA for short). This means that (X_{n,i})_{i≤k(n)} is, for every fixed n ∈ N, integrable and adapted to a given array of filtrations (F_{n,i})_{i≤k(n)} and satisfies the condition

E(X_{n,i} | F_{n,i-1}) = 0 for all 1 ≤ i ≤ k(n) and n ∈ N,   (1.3)

where we have set X_{n,0} := 0 and F_{n,0} := {∅, Ω}. S_n can, for example, be a weighted form of (1.1),

S_n = Σ_{i=1}^{k(n)} C_{n,i} X_{n,i},

where C_{n,i} is F_{n,i-1}-measurable. The asymptotic properties of the linear statistic S_n have been studied in detail in the literature; we refer to the book of Hall and Heyde [12] and the references therein. Yet the asymptotic behaviour of the associated nonparametric bootstrap and permutation statistics has not been investigated in the same generality. This is in contrast to the case of i.i.d. variables, where even bootstrapping of the Student t-statistic is well understood, see Mason and Shao [23]. Applications of such resampling procedures are meaningful since the corresponding resampling tests are distribution-free and may have better finite sample properties. Moreover, in the case of two-sample problems permutation tests are even exact under the null hypothesis of exchangeability. This advantage of the permutation method has, e.g., been used by Janssen and Meyer [15] in the context of weighted two-sample survival statistics. Because of their good finite sample performance, bootstrap procedures have been used in the literature for testing the martingale difference hypothesis in different situations, see e.g. Clark and West [3] or Escanciano and Velasco [8].
In this paper we analyze weighted bootstrap versions of the linear statistic (1.2) for general MDAs. It will turn out that many different resampling procedures work if the MDA fulfils the conditions of a general central limit theorem. To this end we follow a weighted resampling approach that was first proposed by Mason and Newton [22] and extended, among others, by Praestgaard and Wellner [25], Janssen and Pauls [16], Janssen [17] and Del Barrio et al. [4].

Resampling martingale difference arrays
Let (X_{n,i}, F_{n,i})_{i≤k(n)} be an MDA. For a triangular array of exchangeable random weights W_{n,i} : (Ω̃, Ã, P̃) → (R, B), i ≤ k(n), that is independent of the MDA (both defined as random variables on the product space), we define a general resampling version of the linear statistic (1.2) as

T*_n := k(n)^{1/2} Σ_{i=1}^{k(n)} W_{n,i} (X̄_{n,i} − X̄_n), where X̄_{n,i} := X_{n,i}.   (2.1)

Here we have set X̄_n = k(n)^{-1} Σ_{i=1}^{k(n)} X_{n,i}. Remark that the choice of the weights determines the resampling procedure. As a special case the m(n)-bootstrap procedure (where we draw the bootstrap sample X*_{n,1}, ..., X*_{n,m(n)} by sampling m(n) times with replacement from {X_{n,1}, ..., X_{n,k(n)}}) is included. There we use the weights

W_{n,i} = m(n)^{-1/2} (M_{n,i} − m(n)/k(n)),   (2.2)

where (M_{n,1}, ..., M_{n,k(n)}) ∼ Mult(m(n), 1/k(n)) is a multinomially distributed random vector and m(n) is a subsequence of N such that m(n) → ∞ as n → ∞.
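For illustration, the m(n)-bootstrap weights and the resulting resampling statistic can be sketched in Python. This is a minimal sketch, not the paper's implementation: the function names are ours, the centring and scaling of the multinomial counts is our reading of the weight conditions, and the statistic uses the normalization T*_n = k(n)^{1/2} Σ_i W_{n,i}(X_{n,i} − X̄_n) stated for T*_n in the Appendix.

```python
import numpy as np

def bootstrap_weights(k, m, rng):
    """m(n)-bootstrap weights W_{n,i} = m^{-1/2} (M_{n,i} - m/k), where
    (M_{n,1}, ..., M_{n,k}) ~ Mult(m, (1/k, ..., 1/k)); assumed form."""
    M = rng.multinomial(m, np.full(k, 1.0 / k))
    return (M - m / k) / np.sqrt(m)

def resampling_statistic(X, W):
    """T*_n = k^{1/2} * sum_i W_{n,i} (X_{n,i} - Xbar_n), cf. (2.1)."""
    k = len(X)
    return np.sqrt(k) * np.sum(W * (X - X.mean()))

rng = np.random.default_rng(0)
X = rng.standard_normal(100) / 10.0      # one row of a toy MDA
W = bootstrap_weights(k=100, m=100, rng=rng)
T_star = resampling_statistic(X, W)
```

Note that the weights sum to zero by construction, so centring by X̄_n leaves the statistic unchanged for Efron-type counts.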
In the case of Efron's bootstrap (where m(n) = k(n) = n) the corresponding resampling statistic reduces to T*_n = Σ_{i=1}^{n} M_{n,i}(X_{n,i} − X̄_n), which is equal in distribution to Σ_{i=1}^{n} (X*_{n,i} − X_{n,i}). Another example is given by row-wise i.i.d. weights with E(W_{n,1}) = 0 and Var(W_{n,1}) = k(n)^{-1}. This setting corresponds to the general wild bootstrap, see e.g. Mammen [21]. Other included resampling procedures are, among others, permutation techniques, the m(n)-double bootstrap as well as the Bayesian bootstrap, see Rubin [26]. For more details on resampling schemes we refer to Janssen and Pauls [16], Janssen [17], Del Barrio et al. [4] or Chapter 6 in Pauly [24]. The advantage of this weighted approach is that we can prove consistency of several different resampling procedures simultaneously.
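A wild-bootstrap variant with row-wise i.i.d. weights satisfying E(W_{n,1}) = 0 and Var(W_{n,1}) = k(n)^{-1} can be sketched analogously; the scaled Rademacher signs below are one illustrative choice among many, and the normalization again follows the T*_n formula from the Appendix.

```python
import numpy as np

def wild_weights(k, rng):
    """Row-wise i.i.d. wild-bootstrap weights with E(W) = 0, Var(W) = 1/k:
    scaled Rademacher signs W_{n,i} = eps_i / sqrt(k), eps_i in {-1, +1}."""
    eps = rng.choice([-1.0, 1.0], size=k)
    return eps / np.sqrt(k)

rng = np.random.default_rng(1)
k = 200
W = wild_weights(k, rng)
X = rng.standard_normal(k) / np.sqrt(k)   # toy martingale difference row
T_star = np.sqrt(k) * np.sum(W * (X - X.mean()))
```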
To obtain a conditional central limit theorem for (2.1) given (X_{n,i})_{i≤k(n)} we will assume throughout that the weights fulfil the following conditions, which are due to Janssen [17]:

max_{1≤i≤k(n)} |W_{n,i}| →_P 0,   (2.3)

Σ_{i=1}^{k(n)} (W_{n,i} − W̄_n)² →_P 1,   (2.4)

k(n)^{1/2} W_{n,1} →_d Z,   (2.5)

where Z is a random variable with E(Z) = 0 and Var(Z) = 1. Here we use the notation →_P for convergence in P-probability and →_d for convergence in distribution as n → ∞. Recall that an S_n-type resampling test (i.e. a test with test statistic S_n and critical value given by quantiles of the conditional distribution of T*_n) is applicable if (1.2) and its resampling version (2.1) possess the same asymptotic limit under the null. The following theorem gives conditions for this situation. As usual, Φ denotes the distribution function of the standard normal distribution.
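In practice the conditional quantile of T*_n is approximated by Monte Carlo: conditionally on the data one draws B independent weight vectors and takes the empirical (1 − α)-quantile of the resampled statistics. The sketch below is hypothetical (function names and the Efron-weight form are our assumptions), not the paper's procedure.

```python
import numpy as np

def resampling_quantile(X, draw_weights, alpha=0.05, B=2000, rng=None):
    """Empirical (1 - alpha)-quantile of B Monte Carlo replicates of
    T*_n = k^{1/2} sum_i W_{n,i} (X_{n,i} - Xbar_n), given the data X."""
    rng = rng if rng is not None else np.random.default_rng()
    k = len(X)
    centered = X - X.mean()
    T_star = np.empty(B)
    for b in range(B):
        W = draw_weights(k, rng)       # fresh weight vector per replicate
        T_star[b] = np.sqrt(k) * np.sum(W * centered)
    return np.quantile(T_star, 1.0 - alpha)

def efron_weights(k, rng):
    """Efron-bootstrap weights (m = k): centred, scaled multinomial counts."""
    M = rng.multinomial(k, np.full(k, 1.0 / k))
    return (M - 1.0) / np.sqrt(k)

rng = np.random.default_rng(2)
X = rng.standard_normal(100) / 10.0
c = resampling_quantile(X, efron_weights, alpha=0.05, rng=rng)
```

An S_n-type resampling test then rejects when S_n exceeds the quantile c computed from the same data.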
Theorem 2.1. Suppose that the MDA (X_{n,i}, F_{n,i})_{i≤k(n)} is square-integrable and fulfils the following three assumptions:

max_{i≤k(n)} |X_{n,i}| →_P 0,   (2.6)

Σ_{i=1}^{k(n)} X_{n,i}² →_P σ² for a constant σ² > 0,   (2.7)

E(max_{i≤k(n)} X_{n,i}²) is bounded in n.   (2.8)

Then we have both the unconditional convergence

S_n →_d N(0, σ²)   (2.9)

and the conditional convergence in probability

sup_{x∈R} |P(T*_n ≤ x | X_{n,1}, ..., X_{n,k(n)}) − Φ(x/σ)| →_P 0   (2.10)

as n → ∞.
The proofs and all technical details of the following sections are deferred to the Appendix.
Remark 2.1. Suppose that the conditions of Theorem 2.1 hold, where now the limit σ² in (2.7) is, unlike above, not a constant but an a.s. finite random variable that is measurable with respect to the completions of all the σ-fields F_{n,i}. Then there also exist individual limit theorems for S_n and T*_n, respectively, but the limit distributions in (2.9) and (2.10) are in general not equal. Hence resampling cannot be applied in the described manner.
We now specialize to MDAs of the form

X_{n,i} := n^{-1/2} X_i, 1 ≤ i ≤ k(n) = n,   (2.11)

for a sequence (X_n)_{n≥1}. In this context the easiest example is given by the assumptions of the classical central limit theorem, where X_1, ..., X_n are i.i.d. random variables with finite variance. There F_{n,i} = F_i = σ(X_1, ..., X_i) and k(n) = n hold, and for the weights (2.2) the results (2.9) and (2.10) are well known, see e.g. Theorem 2.1 in Bickel and Freedman [1]. The next, more general step is to consider arrays of the form (2.11), where now (X_n)_{n≥1} is a strictly stationary and ergodic martingale difference sequence. This is the context of the next theorem.
Theorem 2.2. Let (X_n)_{n≥1} be a strictly stationary and ergodic martingale difference sequence (MDS for short) with respect to the canonical filtration F_n = σ(X_1, ..., X_n). In this case the limit theorems (2.9) and (2.10) hold if X_1 has finite variance E(X_1²) = σ² < ∞. Moreover, the convergence (2.10) even holds if (X_n)_{n≥1} is only strictly stationary and ergodic (i.e. need not be an MDS).
A special example occurs in the context of ARCH(1)- or, more generally, GARCH(1,1)-processes. These processes are, e.g., of interest in the context of financial time series.
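For concreteness, a GARCH(1,1) process X_t = σ_t ε_t with σ_t² = ω + α X_{t−1}² + β σ_{t−1}², driven by i.i.d. standard normal innovations, is a strictly stationary ergodic MDS with finite variance whenever α + β < 1. A simulation sketch (the parameter values are illustrative assumptions):

```python
import numpy as np

def simulate_garch11(n, omega=0.1, alpha=0.1, beta=0.8, rng=None):
    """Simulate X_t = sigma_t * eps_t with sigma_t^2 = omega
    + alpha * X_{t-1}^2 + beta * sigma_{t-1}^2, eps_t i.i.d. N(0, 1).
    For alpha + beta < 1 the stationary variance is omega/(1-alpha-beta)."""
    rng = rng if rng is not None else np.random.default_rng()
    X = np.empty(n)
    sigma2 = omega / (1.0 - alpha - beta)  # start in the stationary variance
    for t in range(n):
        X[t] = np.sqrt(sigma2) * rng.standard_normal()
        sigma2 = omega + alpha * X[t] ** 2 + beta * sigma2
    return X

X = simulate_garch11(5000, rng=np.random.default_rng(3))
```

Rows X_{n,i} = X_i/√n of such a series then satisfy the assumptions of Theorem 2.2.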

Applications
It is well known that bootstrap and permutation tests often possess better finite sample properties than the corresponding asymptotic tests with fixed critical values (derived from some limit theorem), see e.g. Edgington and Onghena [6], Good [11] or Clark and West [3] as well as Escanciano and Velasco [8] in the context of testing martingale differences. In order to use our conditional central limit theorem for constructing such resampling tests, we need a test statistic that is a martingale under H_0 and whose increments fulfil the conditions of Theorem 2.1 or 2.2. In the following we give some examples. The first natural application occurs in the context of testing or comparing means. For example, we could think of analyzing comparative gains of a financial time series. Denote by r_i the comparative log-gains (or comparative log-returns) from time i − 1 to time i. Then r_i is often modeled by r_i = X_i + µ, where (X_i)_{i≤n} is an ARCH(1)- or GARCH(1,1)-process and µ ∈ R is an unknown value of interest. By applying Theorem 2.2 it is straightforward to construct consistent resampling tests for the null hypothesis {µ ≤ µ_0}, where µ_0 is a given benchmark value. In addition, we could also compare the means of two independent time series r^{(j)}_i = X^{(j)}_i + µ_j, j = 1, 2, by permutation methods (if we assume that (X^{(j)}_i)_{i≤n}, j = 1, 2, are both ergodic and stationary MDS). This would extend some results of Janssen [14] and Section 5 in [17] for i.i.d. and row-wise i.i.d. random variables.
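The two-sample comparison of means can be carried out mechanically as a permutation test: permute the pooled observations and recompute the test statistic. A minimal sketch (the unstudentized mean difference is our illustrative choice of statistic; the asymptotic validity for dependent series rests on the theory above, not on this sketch):

```python
import numpy as np

def permutation_pvalue(r1, r2, B=2000, rng=None):
    """Two-sided permutation p-value for comparing the means of two samples,
    based on the mean difference recomputed on permuted pooled data."""
    rng = rng if rng is not None else np.random.default_rng()
    pooled = np.concatenate([r1, r2])
    n1 = len(r1)
    observed = r1.mean() - r2.mean()
    count = 0
    for _ in range(B):
        perm = rng.permutation(pooled)
        if abs(perm[:n1].mean() - perm[n1:].mean()) >= abs(observed):
            count += 1
    return (count + 1) / (B + 1)   # add-one correction keeps p > 0

rng = np.random.default_rng(4)
r1 = rng.standard_normal(50)
r2 = rng.standard_normal(60)
p = permutation_pvalue(r1, r2, rng=rng)
```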
Further possible applications of Theorem 2.2 can be derived from the monograph of Hayashi [13], where plenty of statistical models (so far without resampling procedures for them) are defined via strictly stationary and ergodic MDS.
Another example occurs in the context of testing for martingale differences. In the sequel we will use our theory to construct different resampling versions of the asymptotic test of Kuan and Lee [20]. Let us shortly recall their situation: Suppose that (Y_n)_{n≥1} is a strictly stationary and ergodic time series with finite variance and set Y_n(w) := (Y_n, ..., Y_{n−w+1}) for n ≥ w. For a fixed window w ∈ N and a given weight function g : R^w → R with ∫ |g(x)| dx < ∞ we define the C-valued random variable on which their test statistic Q_g is based.
Theorem 2.1 of Kuan and Lee implies that we have convergence Q_g →_d |Z|² in distribution under H_0, where Z is a normally distributed R²-valued random variable with expectation zero and covariance matrix

Σ_g = (σ_{j,r,g})_{1≤j,r≤2} := (Cov(ψ_{j,g}(Y_{w+1}, ..., Y_1), ψ_{r,g}(Y_{w+1}, ..., Y_1)))_{1≤j,r≤2}.

To avoid trivialities we assume throughout that Σ_g ≠ 0 holds. Based on the above weak convergence result, Kuan and Lee use covariance estimators to construct an asymptotic level α test under H_0. One of the main advantages of this test is that it also has power against non-MDS that fulfil H_1, see Corollary 2.2 of Kuan and Lee. For a more detailed discussion of this test and the (very important) choice of the function g we refer to their paper.
Instead of using covariance estimators it is also possible to apply various resampling techniques to derive tests with the same asymptotic properties. To this end set k(n) = n − w and let (W_{n,i})_{i≤k(n)} satisfy the conditions of Section 2. Moreover, we define a triangular array (X_{n,i})_{i≤k(n)} of R²-valued random variables built from the functions ψ_{1,g} and ψ_{2,g} evaluated along the time series. A weighted resampling version of Q_g is then given by

Q*_g := |T*_n((X_{n,i})_{i≤k(n)})|².   (3.4)

Indeed, it will be shown in the Appendix that we have conditional convergence Q*_g →_d |Z|² in probability given (Y_n)_{n≥1} whenever (Y_n)_{n≥1} is strictly stationary and ergodic (i.e. it need not be an MDS). Hence we can apply the resampling test ξ*_n := 1_{(c*_{n,α}, ∞)}(Q_g), where c*_{n,α} denotes the (1 − α)-quantile of the conditional distribution L(Q*_g | Y_1, ..., Y_n) of Q*_g given the data. Its properties are summarized in the following theorem.
Theorem 3.1. Under the above conditions the test ξ*_n is an asymptotic level α test, i.e. E(ξ*_n) → α holds under H_0 as n → ∞. Moreover, ξ*_n is consistent, i.e. we have E(ξ*_n) → 1 under the alternative H_1 as n → ∞.

A further application in the context of survival analysis will be given in a forthcoming paper.

Appendix
The following theorem is crucial for all proofs and generalizes parts of Theorem 2.1 of Janssen [17] to the multivariate case. Here (Z_{n,i})_{i≤k(n)} is a triangular array of R^p-valued random vectors, d_p denotes a distance that metrizes weak convergence on R^p, p ∈ N, e.g. the Prohorov metric (see p. 394 in [5]), and

T*_n((Z_{n,i})_{i≤k(n)}) = k(n)^{1/2} Σ_{i=1}^{k(n)} W_{n,i} (Z_{n,i} − Z̄_n),

see (2.1). Recall that we have assumed throughout the paper that the exchangeable weights (W_{n,i})_i fulfil the conditions (2.3)-(2.5).
Theorem 4.1. Suppose that (Z_{n,i})_{i≤k(n)} satisfies the following two assertions:

max_{i≤k(n)} |Z_{n,i}| →_P 0,   (4.1)

Σ_{i=1}^{k(n)} (Z_{n,i} − Z̄_n)(Z_{n,i} − Z̄_n)^T →_P Γ   (4.2)

for an a.s. finite random matrix Γ. Then we have

d_p(L(T*_n((Z_{n,i})_{i≤k(n)}) | Z_{n,1}, ..., Z_{n,k(n)}), N(0, Γ)) →_P 0.   (4.3)
Proof of Theorem 4.1. Remark that (4.2) and the Portmanteau theorem imply that Γ is a.s. a covariance matrix. Hence N(0, Γ(ω)) is well-defined for almost every ω. We now prove (4.3). First suppose that p = 1 holds and denote the order statistics of (Z_{n,i})_{i≤k(n)} by Z_{1:k(n)} ≤ Z_{2:k(n)} ≤ ... ≤ Z_{k(n):k(n)}. If we set Z_{i:k(n)} := 0 whenever i ∉ {1, ..., k(n)}, condition (4.1) shows the convergences in probability (Z_{i:k(n)})_{i∈N} →_P 0 and (Z_{k(n)+1−j:k(n)})_{j∈N} →_P 0. Now Slutsky's lemma and (4.2) imply condition (2.8) in [17]. Hence an application of Theorem 2.1 from [17] proves (4.3) for p = 1. For general p ∈ N we use a modified Cramér-Wold device. Let D := {λ_j : j ∈ N} be a countable dense subset of R^p. Applying (4.3) for p = 1 to the triangular array (λ_j^T Z_{n,i})_{i≤k(n)} yields

d_1(L(λ_j^T T*_n((Z_{n,i})_{i≤k(n)}) | Z_{n,1}, ..., Z_{n,k(n)}), N(0, λ_j^T Γ λ_j)) →_P 0 for every j ∈ N.   (4.4)

Hence we can find a set M with P(M) = 0 and a common subsequence such that (4.4) holds for all j ∈ N and ω ∈ M^c along this subsequence. Now continuity of the characteristic function of the limit and tightness of L(T*_n((Z_{n,i})_{i≤k(n)}) | Z_{n,1}, ..., Z_{n,k(n)})(ω, ·) show that (4.4) holds for all λ ∈ R^p and ω ∈ M^c along this subsequence. Thus an application of the classical Cramér-Wold device completes the proof.
Proof of Theorem 2.1. By Theorem 3.2 of Hall and Heyde [12] we have convergence in distribution of S_n to a normally distributed random variable, i.e. (2.9) holds. For the limit behaviour of T*_n we apply the above theorem to the centered array X_{n,i} − X̄_n, i ≤ k(n). Since (2.6) implies X̄_n →_P 0, we obtain max_{i≤k(n)} |X_{n,i} − X̄_n| →_P 0, i.e. (4.1) holds. Moreover, (4.2) with Γ = σ² follows from (2.7) and (2.9) by Slutsky's lemma, since k(n) X̄_n² = S_n²/k(n) →_P 0. As the occurring limit distributions are continuous, this shows (4.3) and hence (2.10).
Proof of Remark 2.1. Suppose that the conditions of Theorem 2.1 hold, where now σ² is an a.s. finite and non-degenerate random variable that is measurable with respect to the completions of all the σ-fields F_{n,i}. In this case a remark of Hall and Heyde [12], see p. 59, states that the convergence S_n →_d Z holds, where the random variable Z has the characteristic function φ_Z(t) = E(exp(−σ²t²/2)). In comparison, Theorem 4.1 shows that

d_1(L(T*_n((X_{n,i})_{i≤k(n)}) | X_{n,1}, ..., X_{n,k(n)}), N(0, σ²(·))) →_P 0

holds. Due to the fact that this limit is ω-dependent whereas φ_Z(t) does not depend on the special choice of ω ∈ Ω, we can conclude that consistency fails in general.
Proof of Theorem 2.2. The unconditional convergence follows from Theorem 23.1 of Billingsley [2], see also p. 106 in [13]. We now show the conditional convergence (2.10) with the help of Theorem 4.1. Notice that condition (2.6) is equivalent to the weak Lindeberg condition

E(X_1² 1{|X_1| > ε n^{1/2}}) → 0 for all ε > 0.   (4.5)

This is obvious and has, e.g., been used on p. 53 in [12]. Since Chebyshev's inequality yields P(|X_1| > ε n^{1/2}) ≤ ε^{-2} Var(X_1)/n → 0 for all ε > 0, we get E(X_1² 1{|X_1| > ε n^{1/2}}) → 0 by the dominated convergence theorem as n → ∞. This shows (4.5) and thus (4.1) (since we also have X̄_n →_P 0 by Slutsky's lemma and (2.6)).
The second condition (4.2) follows from the SLLN for stationary and ergodic sequences, see e.g. Theorem 3.5.7 in [27]. It ensures that we have the almost sure convergences

n^{-1} Σ_{i=1}^{n} X_i² → E(X_1²) and n^{-1} Σ_{i=1}^{n} X_i → 0

as n → ∞. Altogether this implies (4.2) with Γ = E(X_1²). Remark that we have not used the MDS property to prove (2.10).

Proof of Theorem 3.1. By the continuous mapping theorem the conditional convergence

Q*_g →_d |Z|² in probability, given Y_1, ..., Y_n,   (4.6)

follows if

d_2(L(T*_n((X_{n,i})_{i≤k(n)}) | Y_1, ..., Y_n), N(0, Σ_g)) →_P 0   (4.7)

holds, where T*_n is as in (3.4) and d_2 as in Theorem 4.1. Hence it remains to check the conditions of Theorem 4.1. Let X_{n,i} = (X^{(j)}_{n,i})_{1≤j≤2}. Since the time series (Y_n)_{n≥1} is ergodic and stationary, the same holds true for the sequences (X^{(1)}_{n,i} X^{(2)}_{n,i})_i and ((X^{(j)}_{n,i})²)_i, see e.g. p. 170 ff. in [27]. Thus the SLLN for stationary and ergodic sequences proves condition (4.2) with Γ = Σ_g. As in the proof of Theorem 2.2 above, we can conclude that each X^{(j)}_{n,i} fulfils (2.6). This shows (4.1) and therefore (4.7). Altogether this proves (4.6) under H_0, from which we deduce the asymptotic exactness of ξ*_n, i.e. E(ξ*_n) → α as n → ∞. For the consistency remark first that the above proof shows that the convergence (4.6) is still valid under H_1. Set E(ψ_{j,g}(Y_{w+1}, ..., Y_1)) =: δ_j, j = 1, 2, and suppose that (δ_1, δ_2) ≠ 0 holds under H_1. Again the SLLN for stationary and ergodic sequences implies Q_g → ∞ almost surely. Since, by (4.6), the critical value c*_{n,α} still converges in probability under H_1, this shows E(ξ*_n) → 1, which completes the proof.