A Sliding Blocks Estimator for the Extremal Index

In extreme value statistics for stationary sequences, blocks estimators are usually constructed by using disjoint blocks because exceedances over high thresholds of different blocks can be assumed asymptotically independent. In this paper we focus on the estimation of the extremal index which measures the degree of clustering of extremes. We consider disjoint and sliding blocks estimators and compare their asymptotic properties. In particular we show that the sliding blocks estimator is more efficient than the disjoint version and has a smaller asymptotic bias. Moreover we propose a method to reduce its bias when considering sufficiently large block sizes.


Introduction
Suppose that $(X_n)_{n \in \mathbb{N}}$ is a strictly stationary sequence of random variables with marginal distribution function $F$. We assume that this sequence has an extremal index $\theta \in (0, 1]$, that is, for each $\tau > 0$ there exists a sequence of levels $(u_n(\tau))_{n \in \mathbb{N}}$ such that $\lim_{n \to \infty} n \bar{F}(u_n(\tau)) = \tau$ and $\lim_{n \to \infty} \Pr\{M_n \le u_n(\tau)\} = e^{-\theta\tau}$, where $\bar{F} = 1 - F$ and $M_n = \max\{X_1, \ldots, X_n\}$. The extremal index can be interpreted in a number of ways, the most common one being the reciprocal of the mean cluster size in the limiting point process of exceedance times over high thresholds. The probabilistic theory was worked out in (11), (13), (15), (9), (14), and (12).
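As a quick check of the definition, consider the independent case. The following worked step is an illustration added under the assumption that $F$ is continuous, so that the level $u_n(\tau)$ can be chosen with $n \bar{F}(u_n(\tau)) = \tau$ exactly for $n > \tau$:

```latex
\begin{align*}
\Pr\{M_n \le u_n(\tau)\}
  &= F\bigl(u_n(\tau)\bigr)^{n}
   = \Bigl(1 - \frac{\tau}{n}\Bigr)^{n}
   \longrightarrow e^{-\tau}, \qquad n \to \infty .
\end{align*}
```

Comparing with the limit $e^{-\theta\tau}$ gives $\theta = 1$ for independent sequences; values $\theta < 1$ therefore reflect clustering of exceedances.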
Our objective is to estimate $\theta$ based on a finite stretch $X_1, \ldots, X_n$ from the time series. Inference about the extremal index has been studied extensively. The three most common approaches are the blocks method, the runs method and the inter-exceedance times method. The first two methods identify clusters and construct estimates for $\theta$ based on these clusters. For each of these methods, there are two parameters which determine the clusters and consequently the estimates of $\theta$: a threshold and a cluster identification scheme parameter. The third method is based on inter-exceedance times and obviates the need for a cluster identification scheme parameter. Some references on estimation of the extremal index using these three approaches are (7), (8), (17), (18), (6), (10) and (16), among others.
In this paper we focus on the blocks method. Traditionally, it consists of partitioning the $n$ observations into consecutive blocks of a certain length, say $r$. In each block, the number of exceedances over a certain high threshold is counted, and the blocks estimator is then defined as the reciprocal of the average number of exceedances per block among blocks with at least one exceedance.
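The traditional recipe just described can be sketched in a few lines of Python. This is a minimal illustration only; the function name `classical_blocks_estimate` and the toy data are hypothetical and not taken from the paper.

```python
import numpy as np

def classical_blocks_estimate(x, r, u):
    """Reciprocal of the average number of exceedances of u per block,
    taken over disjoint blocks of length r with at least one exceedance.
    (Hypothetical helper name, for illustration only.)"""
    x = np.asarray(x, dtype=float)
    k = len(x) // r                          # number of complete disjoint blocks
    exceed = (x[:k * r] > u).reshape(k, r)   # one row per block
    counts = exceed.sum(axis=1)              # exceedances per block
    nonempty = counts[counts > 0]            # keep blocks with >= 1 exceedance
    if nonempty.size == 0:
        raise ValueError("no block contains an exceedance of u")
    return 1.0 / nonempty.mean()

# Toy data: three blocks of length 4 with 2, 0 and 1 exceedances of u = 4.
x = [0, 5, 6, 0, 0, 0, 0, 0, 7, 0, 0, 0]
theta_hat = classical_blocks_estimate(x, r=4, u=4)
print(theta_hat)
```

In the toy data the two non-empty blocks hold 2 and 1 exceedances, so the average cluster size is 1.5 and the estimate is its reciprocal.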
Blocks estimators are usually constructed by using disjoint blocks, for in that case the blocks can be assumed to be approximately independent.
The main novelty in this paper is our proposal to use sliding rather than disjoint blocks, that is, to slide a window of length r through the sample, yielding n − r + 1 blocks rather than just n/r disjoint blocks. Surprisingly, this simple modification leads to a more efficient estimator with a smaller asymptotic variance. Moreover we provide estimators of the asymptotic variances of the estimators, which permits the construction of confidence intervals and the selection of variance-minimizing thresholds. We also provide a way to estimate and correct for the asymptotic bias of the estimators.
In contrast to most previous papers but in accordance with (16), we assume that thresholds and block sizes are such that the expected number of excesses per block converges to a positive constant. In practice, the threshold is chosen as a large order statistic. However, mathematical treatment of such random thresholds requires complicated empirical process techniques.
The content of the paper is organized as follows. In Section 2 we introduce the blocks estimators for the extremal index. In Section 3 we consider asymptotic variances and covariances of the mean number of excesses per block and the empirical distribution functions of disjoint and sliding block maxima. We establish consistency and asymptotic normality of our estimators in Section 4. We discuss how to estimate and minimize their asymptotic variance in Section 5 and how to reduce their bias in Section 6. In Section 7, we investigate the finite sample behavior of the estimators on simulated data and we provide a case study. Proofs are spelled out in the appendices.
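For a block size $r$ and a threshold $u$, put $F_r(u) = \Pr(M_r \le u)$ and $\tau_r(u) = r \bar{F}(u)$, and define
$$\theta_r(u) = -\frac{\log F_r(u)}{\tau_r(u)}. \qquad (2.1)$$
It follows from the definition of the extremal index that $\theta = \lim_{r \to \infty} \theta_r(u_r)$ where $u_r = u_r(\tau)$. The estimators of $\theta$ to be proposed are based upon empirical analogues of the functions $F_r$ and $\tau_r$.
For integer $0 \le s < r$, put $M_{s,r} := \max_{s < i \le r} X_i$. Note that $M_r = M_{0,r}$. The distribution function $F_r$ of the block maximum $M_r$ can be estimated using maxima of $k := \lfloor n/r \rfloor$ disjoint blocks or using maxima of $n - r + 1$ sliding blocks:
$$\hat{F}^{\mathrm{dj}}_{n,r}(u) := \frac{1}{k} \sum_{i=1}^{k} I(M_{(i-1)r,\,ir} \le u), \qquad \hat{F}^{\mathrm{sl}}_{n,r}(u) := \frac{1}{n-r+1} \sum_{t=0}^{n-r} I(M_{t,\,t+r} \le u).$$
One may wonder why the use of sliding rather than disjoint blocks should make a difference. After all, the $n - r + 1$ blocks in the definition of $\hat{F}^{\mathrm{sl}}_{n,r}(u)$ are overlapping and hence strongly dependent, even in the iid case. Nevertheless, we will show in Proposition 3.1 below that the asymptotic variance of $\hat{F}^{\mathrm{sl}}_{n,r}(u)$ is typically smaller than the one of $\hat{F}^{\mathrm{dj}}_{n,r}(u)$. Writing
$$\hat{\tau}_{n,r}(u) := \frac{1}{k} \sum_{i=1}^{kr} I(X_i > u), \qquad (2.2)$$
the disjoint and the sliding blocks estimators of the extremal index can now be defined as follows:
$$\hat{\theta}^{\mathrm{dj}}_{n,r}(u) := -\frac{\log \hat{F}^{\mathrm{dj}}_{n,r}(u)}{\hat{\tau}_{n,r}(u)}, \qquad \hat{\theta}^{\mathrm{sl}}_{n,r}(u) := -\frac{\log \hat{F}^{\mathrm{sl}}_{n,r}(u)}{\hat{\tau}_{n,r}(u)}.$$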
As above, the sliding version will turn out to be more efficient than the disjoint one, see Corollary 4.3.
The estimators require the choice of two tuning parameters: the threshold $u$ and the block size $r$. If $u$ is equal to the $\lfloor k\tau \rfloor$-th largest order statistic of $X_1, \ldots, X_n$, the disjoint blocks estimator is the same as $\hat{\theta}^{(\tau)}_{n,1}$ in (16). As mentioned in Section 1, the mathematical treatment of such random thresholds is intricate and requires empirical process techniques. For the sake of simplicity, the threshold sequence $(u_r)_{r \in \mathbb{N}}$ will be assumed to be deterministic. Comparing our Corollary 4.3 with (16, Corollary 4.2), it follows that this simplifying assumption does not make any difference asymptotically.
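A minimal Python sketch of the two estimators may help fix ideas. It rests on the observation that a block maximum is at most $u$ exactly when the block contains no exceedance of $u$, so both empirical distribution functions can be read off running exceedance counts. The function name `blocks_estimators` and the uniform test sample are assumptions made for illustration, not part of the paper.

```python
import numpy as np

def blocks_estimators(x, r, u):
    """Return (theta_dj, theta_sl) for block size r and threshold u.
    Hypothetical helper; a sketch, not a reference implementation."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    k = n // r                                    # number of disjoint blocks
    # c[t] = number of exceedances among X_1, ..., X_t, with c[0] = 0
    c = np.concatenate(([0], np.cumsum(x > u)))
    sliding_counts = c[r:] - c[:-r]               # exceedances in {t+1..t+r}, t = 0..n-r
    disjoint_counts = sliding_counts[0:k * r:r]   # t = 0, r, ..., (k-1)r
    # A block maximum is <= u exactly when its exceedance count is zero.
    F_sl = np.mean(sliding_counts == 0)
    F_dj = np.mean(disjoint_counts == 0)
    tau_hat = c[k * r] / k                        # mean number of excesses per disjoint block
    if min(F_sl, F_dj) == 0.0 or tau_hat == 0.0:
        raise ValueError("threshold unsuitable for this block size")
    return -np.log(F_dj) / tau_hat, -np.log(F_sl) / tau_hat

rng = np.random.default_rng(0)
x = rng.random(20_000)      # i.i.d. uniform sample, so theta = 1
r = 50
u = 1.0 - 1.0 / r           # deterministic threshold with r * (1 - F(u)) = 1
theta_dj, theta_sl = blocks_estimators(x, r, u)
print(theta_dj, theta_sl)
```

Both estimates share the same denominator and differ only in which blocks enter the empirical distribution function of the block maximum.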

Asymptotic variances and covariances
The disjoint and sliding blocks estimators for $\theta$ are functions of $\hat{F}^{\mathrm{dj}}_{n,r}(u)$, $\hat{F}^{\mathrm{sl}}_{n,r}(u)$, and $\hat{\tau}_{n,r}(u)$. We shall need to find the asymptotic variances and covariances of the latter three estimators. Most importantly, we will show that $\hat{F}^{\mathrm{dj}}_{n,r}(u) - \hat{F}^{\mathrm{sl}}_{n,r}(u)$ has a non-negligible asymptotic variance and is asymptotically uncorrelated with $\hat{F}^{\mathrm{sl}}_{n,r}(u)$. As a result, the sliding blocks estimator for $F_r(u)$ is the most efficient convex combination of the disjoint and sliding blocks estimators for $F_r(u)$. The proofs of the results in this section are to be found in Appendix A.
The maximal correlation coefficients of the process $(X_n)_{n \in \mathbb{N}}$ relative to the threshold $u$ are defined by Here $\mathcal{F}_{a,b}(u)$ is the σ-field generated by the events $\{X_i \le u\}$ for $i \in \{a, \ldots, b\}$, and $L_2(\mathcal{F})$ is the space of $\mathcal{F}$-measurable square-integrable random variables. Obviously, the random variables $\xi$ and $\eta$ in the definition of $\rho_{n,l}(u)$ should have positive variance. For comparisons with other mixing coefficients, see e.g. (2). Here we just wish to note that The coefficients $\alpha_{n,l}$ underlie the condition called $\Delta(u_n)$ in (9) and are themselves greater than the coefficients introduced in (11) yielding Leadbetter's $D(u_n)$ condition. Since the upper bounds we will impose on $\rho_{n,l}$ will trivially imply the same upper bounds on $\alpha_{n,l}$, the results in (9) become available to us as well.
Let $r_n$ and $l_n$ be positive integer sequences such that, as $n \to \infty$, Note that the assumptions imply $\sum_{l=1}^{n} \rho_{n,l}(u_{r_n}) = o(r_n)$ and that the final assumption is implied by $k_n \rho_{n,l_n}(u_{r_n}) \to 0$ as $n \to \infty$, where $k_n = \lfloor n/r_n \rfloor$.
Proposition 3.1. Let $(X_n)_{n \in \mathbb{N}}$ be stationary with extremal index $\theta$ and let $(u_r)_{r \in \mathbb{N}}$ be a sequence of thresholds such that $r \bar{F}(u_r) \to \tau \in (0, \infty)$ as $r \to \infty$. If (3.1) holds, then, as $n \to \infty$, denoting $\alpha := \theta\tau$,
By Proposition 3.1, we can write $\hat{F}^{\mathrm{dj}}_{n,r_n}(u_{r_n}) = \hat{F}^{\mathrm{sl}}_{n,r_n}(u_{r_n}) + \varepsilon_n$, the random term $\varepsilon_n$ having mean zero, being asymptotically uncorrelated with $\hat{F}^{\mathrm{sl}}_{n,r_n}(u_{r_n})$, and having non-negligible asymptotic variance. As a consequence, the sliding blocks estimator of the distribution function of the block maximum is more efficient than the disjoint blocks estimator. The two asymptotic variance functions as well as their ratio are shown in Figure 1. Observe that the relative efficiency of the disjoint versus the sliding blocks estimator is decreasing in $\alpha$. For $\alpha \to 0$, the clusters of exceedances become very sparse, and the two estimators are asymptotically equivalent.
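The variance reduction can be checked empirically with a small Monte Carlo experiment. The sketch below is an illustration under simplifying assumptions (an i.i.d. uniform sample, $\tau = 1$, and modest values of $n$ and $r$); the helper name `block_max_cdf_estimates` is hypothetical.

```python
import numpy as np

def block_max_cdf_estimates(x, r, u):
    """Disjoint and sliding blocks estimates of Pr(M_r <= u).
    Hypothetical helper used only for this illustration."""
    k = len(x) // r
    c = np.concatenate(([0], np.cumsum(x > u)))   # running exceedance counts
    sliding_counts = c[r:] - c[:-r]               # exceedances per sliding block
    F_sl = np.mean(sliding_counts == 0)
    F_dj = np.mean(sliding_counts[0:k * r:r] == 0)
    return F_dj, F_sl

rng = np.random.default_rng(1)
n, r, reps = 2000, 20, 2000
u = 1.0 - 1.0 / r                                 # r * (1 - F(u)) = tau = 1

# One row per simulated sequence: (F_dj, F_sl).
est = np.array([block_max_cdf_estimates(rng.random(n), r, u)
                for _ in range(reps)])
var_dj, var_sl = est.var(axis=0, ddof=1)
print(var_dj, var_sl, var_sl / var_dj)
```

The printed ratio gives a rough finite-sample impression of the efficiency gain; it is a simulation estimate and not a substitute for the limits in Proposition 3.1.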
In order to get the asymptotic covariances of $\hat{\tau}_{n,r_n}(u_{r_n})$ in (2.2) with the disjoint and sliding blocks estimators for $F_r(u)$, a somewhat stronger condition on the maximal correlation coefficients is needed: $\sum_{l=1}^{n} \rho_{n,l}(u_{r_n}) = o(r_n^{1/2})$, $n \to \infty$. If this condition holds, then as $n \to \infty$, $k_n \operatorname{cov}\{\hat{F}^{\mathrm{dj}}_{n,r_n}(u_{r_n}), \hat{\tau}_{n,r_n}(u_{r_n})\} \to -\tau e^{-\alpha}$ and $k_n \operatorname{cov}\{\hat{F}^{\mathrm{sl}}_{n,r_n}(u_{r_n}), \hat{\tau}_{n,r_n}(u_{r_n})\} \to -\tau e^{-\alpha}$.
Finally, in order to find the asymptotic variance of $\hat{\tau}_{n,r_n}(u_{r_n})$, another additional assumption is needed: there exists a positive integer sequence $(s_n)_{n \in \mathbb{N}}$ and a probability distribution $(\pi_j)_{j \in \mathbb{N}}$ on the positive integers such that as $n \to \infty$ as well as, writing $N_s(u) = \sum_{i=1}^{s} I(X_i > u)$, (3.8) The distribution $(\pi_j)_{j \in \mathbb{N}}$ is called the cluster size distribution; it describes the limiting probability distribution of the number of threshold excesses within the block $X_1, \ldots, X_{s_n}$ given that there is at least one such excess. The second part of (3.8) is a uniform integrability condition ensuring that the first two moments of the finite-sample cluster size distribution converge to the proper limits. Note that $\Pr(M_{s_n} > u_{r_n}) \le s_n \Pr(X_1 > u_{r_n}) \to 0$ as $n \to \infty$ while
Under the above conditions, the asymptotic distribution of $N_{r_n}(u_{r_n})$ is compound Poisson (9, Theorem 5.1):
$$N_{r_n}(u_{r_n}) \xrightarrow{d} N := \sum_{i=1}^{\nu} \zeta_i, \qquad (3.9)$$
where $\nu$ is a Poisson($\theta\tau$) random variable and $(\zeta_i)_{i \in \mathbb{N}}$ is a sequence of positive independent and identically distributed integer-valued random variables from the cluster size distribution, independent of $\nu$. Note that $E(\zeta_1) = \sum_{j \ge 1} j \pi_j = \theta^{-1}$. Moreover,

Weak consistency and asymptotic normality
The main result of this paper is the joint asymptotic normality of the disjoint and sliding blocks estimators for $\theta$ in Corollary 4.3. The proofs of the results in this section are to be found in Appendix B. Write Recall that $m_1 = \theta^{-1}$.
In order to get asymptotic normality of the estimators, we will need an additional technical assumption: there exists a constant $p$ with $p > 1$ such that as $n \to \infty$, We first state joint asymptotic normality of $\hat{F}^{\mathrm{dj}}_{n,r}(u)$, $\hat{F}^{\mathrm{sl}}_{n,r}(u)$, and $\hat{\tau}_{n,r}(u)$. Joint asymptotic normality of $\hat{\theta}^{\mathrm{dj}}_{n,r}(u)$ and $\hat{\theta}^{\mathrm{sl}}_{n,r}(u)$ then follows by the delta method.

C. Y. Robert, J. Segers and C. Ferro/Sliding blocks estimator for the extremal index 7
Recall $\theta_r(u)$ in (2.1). In order to control the bias of the extremal index estimators, assume that the block sizes are sufficiently large so that as $n \to \infty$, The asymptotic variance of the extremal index estimators will depend on $\theta$, $\tau$, and the squared coefficient of variation $c^2 = (m_2 - m_1^2)/m_1^2$ of the cluster size distribution $(\pi_j)_{j \in \mathbb{N}}$: The asymptotic variance of the disjoint blocks estimator corresponds with the one for the same estimator but at a random threshold (order statistic) in (16, Corollary 4.2). It is worth noting that $v_{22} \le v_{11}$. As a result, the sliding blocks estimator is more efficient than its disjoint version. Even more, the most efficient convex combination of the disjoint and sliding blocks estimators is the sliding blocks estimator itself.

Estimating and minimizing the asymptotic variance
For a fixed $c^2 \ge 0$, the asymptotic variance functions of $\sqrt{k_n}\,(\hat{\theta}_{n,r_n}/\theta - 1)$ are convex and possess unique global minima. These minima and the values of $\alpha$ for which they are attained can be computed numerically, see Figure 2. Hence, given an estimate of $c^2$, we can estimate the respective optimal values for $\alpha$, divide by an estimate of $\theta$, and thus obtain estimates of the optimal $\tau$ to be used for the disjoint or sliding blocks estimators. Given such estimates, we can for a given threshold $u$ estimate the asymptotically optimal block lengths $r$ and vice versa. The missing element in this procedure is an estimate of $c^2$. Knowledge of $c^2$ is also needed when one wants to construct asymptotic confidence intervals for $\theta$ based on Corollary 4.3 or estimate the asymptotic bias of the extremal index estimators in Section 6 below. In addition, the quantity $c^2$ is interesting in its own right as a measure of dispersion of the cluster size distribution $(\pi_j)_{j \in \mathbb{N}}$. Since the mean cluster size is equal to $m_1 = \theta^{-1}$, for which consistent estimators are available, we can focus here on estimating the cluster-size variance $m_2 - m_1^2$ or the second moment $m_2$. A first possible strategy to estimate the cluster-size variance is to partition the threshold exceedances into clusters and estimate the cluster-size variance by its empirical counterpart. However, this is difficult for two reasons: (a) the rareness of the clusters, and (b) the uncertainty on how to group the observed excesses into clusters. For nonparametric estimators of the cluster-size distribution, we refer to (5) and (16).
On the other hand, we can remain in the spirit of the paper and propose a sliding blocks estimator. Recall $N_r(u) = \sum_{i=1}^{r} I(X_i > u)$ and its compound Poisson limit $N$ in (3.9). Put $\sigma_r^2(u) := \operatorname{var} N_r(u)$. Under an appropriate uniform integrability condition, we have by (3.10), as $r \to \infty$, $\sigma_r^2(u_r) \to \operatorname{var} N = \theta\tau m_2$. For a threshold $u$ and a block size $r$, we define $\bar{N}_{n,r}(u)$ := We set the denominator equal to $n - 2r + 1$ in order to reduce the bias of $\hat{\sigma}^2_{n,r}(u)$. The sliding blocks estimator for We derive the consistency of $\hat{c}^2_{n,r}(u)$ under a condition on the fourth moment of At the price of a longer proof involving a characteristic function argument, condition (4.1) on the moment of order $2p$ (with $1 < p < 2$) would be sufficient as well. The proof of Proposition 5.1 is given in Appendix C.
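The idea can be illustrated with a plug-in sketch in Python. It uses two facts: the variance of the compound Poisson limit equals $\theta\tau m_2$, and $c^2 = m_2/m_1^2 - 1 = \theta^2 m_2 - 1$, so that $c^2 = \theta\,\operatorname{var}(N)/\tau - 1$. The exact form of the estimator below is an assumption made for illustration: the sample variance of the sliding block counts uses the denominator $n - 2r + 1$ mentioned above, and the function name `cluster_dispersion_sketch` is hypothetical.

```python
import numpy as np

def cluster_dispersion_sketch(x, r, u):
    """Plug-in sketch for (sigma2_hat, c2_hat); assumed form, not the
    paper's exact definition."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    k = n // r
    c = np.concatenate(([0], np.cumsum(x > u)))   # running exceedance counts
    counts = c[r:] - c[:-r]                       # exceedances per sliding block
    tau_hat = c[k * r] / k                        # mean excesses per disjoint block
    theta_sl = -np.log(np.mean(counts == 0)) / tau_hat
    # Sliding blocks variance of the block counts, denominator n - 2r + 1.
    sigma2_hat = np.sum((counts - counts.mean()) ** 2) / (n - 2 * r + 1)
    # c^2 = theta * var(N) / tau - 1, with each quantity replaced by its estimate.
    c2_hat = theta_sl * sigma2_hat / tau_hat - 1.0
    return sigma2_hat, c2_hat

rng = np.random.default_rng(4)
x = rng.random(50_000)          # i.i.d. sample: clusters of size one, so c^2 = 0
r = 50
u = 1.0 - 1.0 / r
sigma2_hat, c2_hat = cluster_dispersion_sketch(x, r, u)
print(sigma2_hat, c2_hat)
```

For an independent sample every cluster has size one, so an estimate of $c^2$ close to zero is what one would hope to see.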

Reducing the bias
Recall $\theta_r(u)$ as in (2.1) and let $\hat{\theta}_{n,r}(u)$ denote either the disjoint or the sliding blocks estimator. The bias of $\hat{\theta}_{n,r}(u)$ can be decomposed into two parts: $-\dfrac{\log E[\hat{F}_{n,r}(u)]}{E^3[\hat{\tau}_{n,r}(u)]} \operatorname{var} \hat{\tau}_{n,r}(u)$.
By the above expansion and Propositions 3.1 and 3.2, we obtain, as $n \to \infty$, for the disjoint and the sliding blocks estimator, respectively. Note that $0 \le \mu^{\mathrm{sl}} \le \mu^{\mathrm{dj}}$. If in addition

then it follows that, as $n \to \infty$, Just like the asymptotic variances in Corollary 4.3, the asymptotic biases of the disjoint and sliding blocks estimators in (6.1) are functions of $\theta$, $\alpha = \theta\tau$, and $c^2$. Given consistent estimators of these three quantities, we can estimate $\mu$ and then correct the extremal index estimators by subtracting $\hat{\mu}/k$. Observe that this procedure has to do with the $O(1/k)$ asymptotics of the estimators only, whereas minimization of the asymptotic variance affects the $O(1/\sqrt{k})$ asymptotics. Note that condition (6.2) is slightly stronger than (4.2). In case $\theta_{r_n}(u_{r_n}) - \theta = O(1/r_n)$, as in the three examples below, (6.2) is equivalent to $k_n = o(r_n)$, that is, $n^{1/2} = o(r_n)$. In contrast, for condition (4.2), the requirement is only that $k_n = o(r_n^2)$, that is, $n^{1/3} = o(r_n)$.
Example 6.1 (IID sequence). Let $(X_n)_{n \in \mathbb{N}}$ be a sequence of independent random variables with a common, continuous distribution function $F$. Then $\theta = 1$ and
Example 6.2 (Max Auto-Regressive Process). Let $(W_n)_{n \in \mathbb{N}}$ be a sequence of independent, unit-Fréchet distributed random variables. For $0 < \theta \le 1$, let The extremal index of the sequence is equal to $\theta$ and
Example 6.3 (Moving Maximum Process). Let $(W_n)_{n \in \mathbb{N}}$ be a sequence of independent, unit-Fréchet distributed random variables. Let $X_1 = 2W_1$ and $X_n = \max(W_{n-1}, W_n)$ for $n \ge 2$. The extremal index of the sequence is equal to $\theta = 1/2$ and $r\{\theta_r(u_r) - \theta\} \to \tau/4$, $r \to \infty$.
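The moving maximum process of Example 6.3 is easy to simulate, which gives a convenient sanity check for the estimators. The Python sketch below is illustrative only: the helper name `theta_dj_sl`, the sample size and the block size are assumptions. It uses the fact that the margins satisfy $F(x) = \exp(-2/x)$, so the deterministic threshold with $r \bar{F}(u_r) = \tau$ is $u_r = -2/\log(1 - \tau/r)$.

```python
import numpy as np

def theta_dj_sl(x, r, u):
    """Disjoint and sliding blocks estimates of the extremal index.
    Hypothetical helper, written for this illustration."""
    k = len(x) // r
    c = np.concatenate(([0], np.cumsum(x > u)))   # running exceedance counts
    counts = c[r:] - c[:-r]                       # exceedances per sliding block
    tau_hat = c[k * r] / k
    theta_sl = -np.log(np.mean(counts == 0)) / tau_hat
    theta_dj = -np.log(np.mean(counts[0:k * r:r] == 0)) / tau_hat
    return theta_dj, theta_sl

rng = np.random.default_rng(2)
n, r, tau = 20_000, 100, 1.0

# Unit-Frechet noise: W = -1/log(U) satisfies Pr(W <= w) = exp(-1/w).
w = -1.0 / np.log(rng.random(n))

# Moving maximum process: X_1 = 2 W_1 and X_n = max(W_{n-1}, W_n) for n >= 2.
x = np.empty(n)
x[0] = 2.0 * w[0]
x[1:] = np.maximum(w[:-1], w[1:])

# Threshold solving r * (1 - exp(-2/u)) = tau.
u = -2.0 / np.log1p(-tau / r)

theta_dj, theta_sl = theta_dj_sl(x, r, u)
print(theta_dj, theta_sl)
```

Both estimates should land in the vicinity of $\theta = 1/2$, up to sampling error and the bias terms discussed in this section.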

Simulation study
The finite sample properties of the disjoint and sliding blocks estimators for the extremal index are compared in a simulation study. Sequences of length $n = 10\,000$ are simulated from Max Auto-Regressive processes with $\theta = 0.25$, $0.5$, $0.75$ and $1$. For each sequence the estimators $\hat{\theta}^{\mathrm{dj}}_{n,r}(u)$ and $\hat{\theta}^{\mathrm{sl}}_{n,r}(u)$ are computed for five block sizes and two thresholds. The block size is $r = 25$, $50$, $100$, $200$ or $400$. The threshold $u$ is the $\lfloor k\tau \rfloor$-th largest order statistic and is defined by either a default value of $\tau = 1$ or the estimate of the optimal value of $\tau$ described in Section 5. The initial estimates of $c^2$ and $\theta$ required in the latter case are based on the threshold when $\tau = 1$. Monte Carlo approximations to the properties of the estimators are computed from $10\,000$ simulated sequences. Figure 3 shows the biases and standard errors of the estimators. Biases tend to be positive and smallest at intermediate block sizes while variances increase with block size. Sliding blocks always yield lower standard errors than disjoint blocks. There is also evidence that sliding blocks yield larger biases than disjoint blocks when $r$ is small and smaller biases when $r$ is large. Optimizing $\tau$ tends to yield lower variances than the default $\tau = 1$, but also larger biases when $\theta < 1$. This is explained by the fact that the estimated values for the optimal $\tau$ tend to exceed 1 except when $\theta = 1$. Example 6.2 suggests that increasing $\tau$ increases the bias.
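The data-driven threshold used here, the $\lfloor k\tau \rfloor$-th largest order statistic with $k = \lfloor n/r \rfloor$, can be computed as in the following Python sketch. The function name `order_statistic_threshold` and the toy sample are hypothetical.

```python
import numpy as np

def order_statistic_threshold(x, r, tau):
    """Return the floor(k * tau)-th largest value of x, with k = floor(n / r).
    Hypothetical helper, for illustration only."""
    x = np.asarray(x, dtype=float)
    k = len(x) // r
    m = int(np.floor(k * tau))     # rank counted from the top
    if m < 1:
        raise ValueError("floor(k * tau) must be at least 1")
    return np.sort(x)[-m]          # m-th largest observation

rng = np.random.default_rng(3)
x = rng.permutation(100).astype(float)   # the values 0, 1, ..., 99 in random order
u = order_statistic_threshold(x, r=10, tau=1.0)
n_exceed = int(np.sum(x > u))            # observations strictly above the threshold
print(u, n_exceed)
```

With distinct observations, exactly $\lfloor k\tau \rfloor - 1$ values lie strictly above this threshold, so the proportion of exceedances is close to $\tau/r$.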
The effect of the bias correction described in Section 6 is shown in Figure 4. There is little improvement for small block sizes, but biases are reduced significantly and stabilized for larger block sizes. The impact on the standard errors is negligible (not shown).
The positive biases of $\hat{\theta}^{\mathrm{dj}}_{n,r}(u)$ and $\hat{\theta}^{\mathrm{sl}}_{n,r}(u)$ can lead to poor coverage properties (not shown) of confidence intervals for $\theta$ based on the asymptotic Normal distribution of Section 4. Lower and upper confidence limits tend to be too high when $r$ is small but coverage improves when $r$ is large. Coverage is also affected by underestimation of standard errors when $\theta < 1$ (not shown). These simulations were repeated for the doubly stochastic process of (17) and for ARCH(1) processes; see (3) and (4, Chapter 8). Results for the doubly stochastic process were very similar to those reported above. Results for ARCH(1) processes were also similar but the improvement in variance afforded by sliding blocks was less clear. Qualitatively similar results were found when the simulations were repeated with $n = 1000$ and $r = 5$, $10$, $20$, $40$ or $80$.

Case study
The extremal index is now estimated for a financial time series: daily log returns of the FTSE100 index between 25 December 1968 and 12 November 2001. This series was analysed previously by (10) and is plotted in Figure 5; the data were kindly passed on by Jonathan Tawn. Clusters of large, negative returns can be financially damaging so estimates of the extremal index for the negated series are plotted against block size in Figure 6. Two sliding blocks estimators are compared: both employ the bias correction but one uses the default value $\tau = 1$ while the other uses estimated optimal values $\tau = \hat{\tau}_{\mathrm{opt}}$. Thresholds are the $\lfloor k\tau \rfloor$-th largest order statistics so that the proportion of data exceeding the threshold for block size $r$ is $\tau/r$. The lower horizontal axis in Figure 6 is therefore a transformation of the threshold used when $\tau = 1$, and coincides with the scale used by (10). The upper horizontal axis represents the same transformation of the threshold when $\tau = \hat{\tau}_{\mathrm{opt}}$. These latter thresholds are lower because $\hat{\tau}_{\mathrm{opt}} \approx 5$ for all but the smallest block sizes.
The point estimates from the two sliding blocks estimators are similar and both stabilize near $\theta = 1/3$. Estimates from the intervals estimator of (6) are also shown and differ slightly but are consistent with an extremal index of one-third once sampling variation is taken into account. However, these values are approximately half those obtained by (10) with a two-thresholds estimator. The confidence intervals in Figure 6 are computed using the estimated standard errors for the sliding blocks estimators with no bias correction. The confidence intervals when $\tau = \hat{\tau}_{\mathrm{opt}}$ are often much narrower than when $\tau = 1$ owing to the lower thresholds mentioned above.

Appendix A: Proofs for Section 3

A.1. Proof of Proposition 3.1
Asymptotic variance of $\hat{F}^{\mathrm{dj}}_{n,r_n}(u_{r_n})$. By stationarity, By definition of the extremal index, $F_{r_n}(u_{r_n}) \to e^{-\tau\theta}$ and $F_{2r_n}(u_{r_n}) \to e^{-2\tau\theta}$ as $n \to \infty$. Hence By hypothesis, this converges to zero as $n \to \infty$.
Asymptotic variance of $\hat{F}^{\mathrm{sl}}_{n,r_n}(u_{r_n})$. By stationarity, $\operatorname{var} \hat{F}^{\mathrm{sl}}_{n,r}(u) =$ The sum on the right-hand side of the previous display can be written as By dominated convergence, as $n \to \infty$, Hence (3.3) will follow if we can show that, as $n \to \infty$, But this follows from the assumption that $r_n^{-1} \sum_{l=1}^{n} \rho_{n,l}(u_{r_n}) \to 0$ as $n \to \infty$.
Asymptotic covariance of $\hat{F}^{\mathrm{dj}}_{n,r_n}(u_{r_n})$ and $\hat{F}^{\mathrm{sl}}_{n,r_n}(u_{r_n})$. We have $k \operatorname{cov}\{\hat{F}^{\mathrm{dj}}_{n,r}(u), \hat{F}^{\mathrm{sl}}_{n,r}(u)\} = \operatorname{cov}\{I(M_{0,r} \le u), \hat{F}^{\mathrm{sl}}_{n,r}(u)\} + \operatorname{cov}\{I(M_{(k-1)r,kr} \le u), \hat{F}^{\mathrm{sl}}_{n,r}(u)\} + \cdots$ The first two terms on the right-hand side are bounded by $\{\operatorname{var} \hat{F}^{\mathrm{sl}}_{n,r}(u)\}^{1/2}$; in view of (3.3), they will not contribute to the limit. The final term on the right-hand side of the previous display can be decomposed into two pieces, $I + II$ say, according to whether $(i - 2)r \le j < ir$ or not. For the first term, the union of the intervals of integers $\{(i - 1)r + 1, \ldots, ir\}$ and $\{j + 1, \ldots, j + r\}$ is again an interval of integers; by stationarity, $\operatorname{cov}\{I(M_{r,2r} \le u), I(M_{j,j+r} \le u)\}$ Adding the subscript $n$ to indicate the dependence on $n$, we get The second term can be bounded as follows: $|II| \le \dfrac{2(k - 2)}{n - r + 1} \sum_{l=1}^{n} \rho_{n,l}(u)$.
We split the sum into two pieces, $I + II$ say, according to whether $(i - 1)r < j \le ir$ or not. By stationarity, the first term is equal to Adding subscripts $n$ to indicate the dependence on $n$, we get $I_n \to -\tau e^{-\alpha}$ as $n \to \infty$. The second term, $II$, can be bounded in absolute value as follows: Assumptions (3.1) and (3.5) now imply that $II_n \to 0$ as $n \to \infty$.
We split the sum into two pieces, $I + II$ say, according to whether $i < j \le i + r$ or not. The first term, $I$, is the same as in (A.1) and so gives rise to the same limit. The second term, $II$, admits the same bounds as in (A.2) and hence is asymptotically negligible.

A.3. Proof of Proposition 3.3
Recall $N_s(u) = \sum_{i=1}^{s} I(X_i > u)$. We have $k \operatorname{var} \hat{\tau}_{n,r}(u) = \dfrac{1}{k} \operatorname{var} N_{rk}(u)$.
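For integer $0 \le a \le b$, put $N_{a,b}(u) := \sum_{a < i \le b} I(X_i > u)$, the sum being zero if $a = b$. Fix integer $1 \le l < s < n$ and write $m := \lfloor rk/s \rfloor$. We have $N_{r_n k_n}(u_{r_n}) = \sum_{i=1}^{m_n} N_{(i-1)s_n,\, i s_n - l_n}(u_{r_n}) + \sum_{i=1}^{m_n} N_{i s_n - l_n,\, i s_n}(u_{r_n}) + N_{m_n s_n,\, r_n k_n}(u_{r_n}) =: A_n + B_n + C_n$.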
By the Cauchy-Schwarz inequality, it is sufficient to show that, as $n \to \infty$, Before we treat these three terms, it is useful to note that the assumptions imply that $\operatorname{var} N_{l_n}(u_{r_n}) = o\{s_n \Pr(X_1 > u_{r_n})\}$, $n \to \infty$, Since $m_n \sim n/s_n$ and $\Pr(M_{s_n} > u_{r_n}) \sim s_n \Pr(X_1 > u_{r_n})\,\theta$ as $n \to \infty$, we find $(1/k_n)\, m_n \operatorname{var} N_{s_n}(u_{r_n}) \to \theta\tau m_2$, $n \to \infty$.
Since $N_{s_n}(u_{r_n}) = N_{s_n - l_n}(u_{r_n}) + N_{s_n - l_n,\, s_n}(u_{r_n})$ and since $N_{s_n - l_n,\, s_n}(u_{r_n})$ and $N_{l_n}(u_{r_n})$ have the same distribution, the previous display and (A.4) imply that $I_n \to \theta\tau m_2$ as $n \to \infty$.
Next we treat $II_n$. We have $|II_n| \le 2 (1/k_n)\, m_n^2 \operatorname{var} N_{s_n - l_n}(u_{r_n})\, \rho_{n,l_n}(u_{r_n})$.
In view of what we obtained for $I_n$ and since $m_n \rho_{n,l_n}(u_{r_n}) \to 0$, we conclude that $II_n \to 0$ as $n \to \infty$.
The term $B_n$. By stationarity, $(1/k_n)\, m_n \operatorname{var} N_{l_n}(u_{r_n}) + 2 (1/k_n)\, m_n^2 \operatorname{var} N_{l_n}(u_{r_n})\, \rho_{n,l_n}(u_{r_n})$.

The term $C_n$. By stationarity, By assumption, the limit as $n \to \infty$ is zero.
Note that where for $i \in \{1, \ldots, k\}$ and $t \in \{0, \ldots, n - r\}$, The idea of the proof is as follows: by clipping out certain terms in the definitions of $Z_{n,j}$, the latter can be viewed as sums of approximately independent random variables. Asymptotic normality then follows from the Lindeberg-Feller central limit theorem for triangular arrays.
Let k * ∈ {1, . . . , k − 1}. Construct a partition of {1, . . . , k} into subsets of size k * , with two adjacent such subsets separated by a singleton. The number of subsets of size k * that can be formed in this way is q = ⌊(k + 1)/(k * + 1)⌋. We have Let k * = k * n be such that k * n → ∞ but k * n = o(k n ) as n → ∞. The final two terms on the right-hand side of the previous display are negligible as their variances tend to zero: the variance of the second term on the right-hand side is of the order (k * n ) 2 k n ρ n,rn (u rn ) var(Ī dj 1,rn ) = o(1).
As a consequence, In a completely similar way, just replacing $\bar{I}^{\mathrm{dj}}_{i,r}$ by $\bar{N}_{i,r}$, we can also show that a crucial element here is that $\operatorname{var}(\bar{N}_{1,r_n}) = O(1)$, which follows from (4.1). Next, construct a partition of $\{1, \ldots, n\}$ into $k$ blocks of size $r$ and, in case $kr < n$, a final block of length $n - kr$. Form subsamples by taking unions over $k^*$ consecutive blocks of size $r$, two consecutive subsamples being separated by a single block of size $r$. The number of subsamples that can be formed in this way is again $q = \lfloor (k + 1)/(k^* + 1) \rfloor$, the $j$th subsample being In the definition of the sliding blocks estimator, retain only those $t \in \{0, \ldots, n - r\}$ such that the (sliding) block $\{t + 1, \ldots, t + r\}$ is contained entirely in one of the subsamples $S_{j,k^*,r}$. In other words, discard those $t$ such that $\{t + 1, \ldots, t + r\}$ has a nonempty intersection with one of the $q - 1$ blocks of size $r$ separating two consecutive subsamples or with the remaining part of the sample after the final subsample $S_{q,k^*,r}$. The values of $t$ to be retained are then given as follows: for $j \in \{1, \ldots, q\}$, $\{t + 1, \ldots, t + r\} \subset S_{j,k^*,r}$ if and only if $(j - 1)(k^* + 1)r \le t \le j(k^* + 1)r - 2r$.
We find Again, let $k^* = k^*_n$ be such that $k^*_n \to \infty$ and $k^*_n = o(k_n)$ as $n \to \infty$. Asymptotically, the variances of the final two sums tend to zero: the variance of the second sum on the right-hand side is of the order $O\bigl(\tfrac{k_n}{n^2} \{q_n + q_n^2 \rho_{n,r_n}(u_{r_n})\}\, r_n^2\bigr) = O\bigl(\tfrac{q_n}{k_n} + \tfrac{q_n^2}{k_n} \rho_{n,r_n}(u_{r_n})\bigr) = o(1)$; and since the number of terms in the third sum on the right-hand side is not larger than $2k^* r$, the variance of that sum is of the order $O\bigl(\tfrac{k_n}{n^2}\, k^*_n r_n^2\bigr) = O\bigl(\tfrac{k^*_n}{k_n}\bigr) = o(1)$.
As a consequence, $a_1 Z_{n,1} + a_2 Z_{n,2} + a_3 Z_{n,3} = \dfrac{1}{\sqrt{q_n}} \sum_{j=1}^{q_n} \xi_{n,j} + o_p(1)$ with $\xi_{n,j} = a_1 \xi_{n,j,1} + a_2 \xi_{n,j,2} + a_3 \xi_{n,j,3}$. Note that $\xi_{n,j,v}$ is measurable with respect to the σ-field generated by the events $\{X_t \le u_r\}$ with $t$ ranging over the $j$th subsample $S_{j,k^*_n,r_n}$. Since these subsamples are separated by at least one block of size $r_n$ and since $\rho_{n,r_n}(u_{r_n}) = o(1/k_n) = o(1/q_n)$, a characteristic-function argument shows that the asymptotic distribution of $q_n^{-1/2} \sum_{j=1}^{q_n} \xi_{n,j}$ is the same as if the variables $\xi_{n,1}, \ldots, \xi_{n,q_n}$ were independent.
We apply the Lindeberg-Feller central limit theorem with Lyapounov's condition. By Propositions 3.1, 3.2 and 3.3 applied for sample size $n^* = k^* r$ together with the fact that $q_n/k_n \sim 1/k^*_n$ and $q_n k_n/(n - r_n + 1)^2 \sim 1/(r_n^2 k^*_n)$, we have $\operatorname{var}(\xi_{n,1}) \to a^{\top} \Sigma a$, $n \to \infty$.

Above we have chosen k * n in such a way that k * n → ∞ and k * n = o(k n ) as n → ∞. Now we reinforce the latter requirement to k * n = o k For v ∈ {1, 3}, we have to proceed a little differently. Let ζ i,r be equal toĪ dj i,r if v = 1 andN i,r if v = 3. Then Now E[|ζ1,r n | 2+δ ] = O(1); for v = 1 this is obvious and for v = 3 this follows by condition (4.1). Again, requirement (B.5) on k * n ensures that the right-hand side of the previous display is o(1) as n → ∞.