Asymptotic theorems for kernel U-quantiles

Abstract. For a locally smooth statistical model, we investigate kernel $U$-quantile estimators. Under suitable assumptions, we establish a strong Bahadur representation theorem, an invariance principle, and asymptotic normality for randomly indexed sequences of observations.


1. Introduction
Let $X_1, X_2, \ldots$ be independent random variables having common unknown distribution function (df) $F$. Let $h(x_1, \ldots, x_m)$ be a real-valued measurable function symmetric in its $m$ arguments, and let
$$H(y) = P\big(h(X_1, \ldots, X_m) \le y\big), \quad y \in \mathbb{R},$$
denote the distribution function of the random variable $h(X_1, \ldots, X_m)$. Since this df is rarely known exactly, the quantiles
$$H^{-1}(p) = \xi_p = \inf\{x : H(x) \ge p\}, \quad 0 < p < 1,$$
and other features of $H$ must be estimated from the data. The natural and most widely used estimator of the parameter $\xi_p$ is the $U$-quantile $H_n^{-1}(p)$, where for each $n \ge m$ and real $y$
$$H_n(y) = \frac{1}{n^{(m)}} \sum I\big(h(X_{i_1}, \ldots, X_{i_m}) \le y\big),$$
the sum being taken over the $n^{(m)} = n(n-1)\cdots(n-m+1)$ $m$-tuples $(i_1, \ldots, i_m)$ of distinct elements from $\{1, 2, \ldots, n\}$.

Note that when $h(x) = x$, the empirical df of $U$-statistic structure $H_n$ reduces to the usual empirical df, and $H_n^{-1}(p)$ becomes the usual $p$th sample quantile of $F$. For $m \ge 2$, a second choice of interest is $h(x_1, \ldots, x_m) = (x_1 + x_2 + \cdots + x_m)/m$, in which case the $U$-quantile $H_n^{-1}(1/2)$ becomes the Hodges–Lehmann estimator $\operatorname{median}\, m^{-1}(X_{i_1} + \cdots + X_{i_m})$. Another interesting example corresponds to $h(x_1, x_2) = |x_1 - x_2|$, for which $H_n^{-1}(1/2)$ provides an estimator of the spread measure $H^{-1}(1/2)$, the median of the distribution of $|X_1 - X_2|$, where $X_1$ and $X_2$ are independent with common distribution function $F$.

$U$-quantile estimators have been investigated, among others, by Serfling [10], Choudhury and Serfling [2], Arcones [1], and Wendler [13]. A clear disadvantage of $H_n^{-1}(p)$ is its poor performance when $H$ is smooth. Estimation of $H^{-1}(p)$ in smooth models plays a fundamental role in many statistical applications, especially in data-analytic and functional statistical methods (see Parzen [8]).
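For concreteness, the definitions above translate directly into code. The following sketch (the helper name `u_quantile` is ours, and the brute-force enumeration of all $m$-subsets is only practical for small $n$) computes $H_n^{-1}(p)$ for the two example kernels; since $h$ is symmetric, summing over unordered subsets yields the same $H_n$ as summing over ordered $m$-tuples of distinct indices.

```python
import itertools
import random

def u_quantile(xs, h, m, p):
    """Empirical U-quantile H_n^{-1}(p) = inf{y : H_n(y) >= p}.

    h is symmetric, so unordered m-subsets give the same H_n as ordered
    m-tuples of distinct indices (brute force; illustrative only).
    """
    vals = sorted(h(*c) for c in itertools.combinations(xs, m))
    N = len(vals)
    # H_n(vals[i]) = (i + 1) / N, so the U-quantile is the first order
    # statistic whose empirical df reaches p.
    return vals[next(i for i in range(N) if (i + 1) / N >= p)]

random.seed(0)
xs = [random.gauss(0.0, 1.0) for _ in range(60)]

# Hodges-Lehmann estimator: median of the pairwise means (x_i + x_j)/2.
hl = u_quantile(xs, lambda a, b: (a + b) / 2, m=2, p=0.5)

# Spread estimator: median of the pairwise distances |x_i - x_j|.
spread = u_quantile(xs, lambda a, b: abs(a - b), m=2, p=0.5)
```

For $h(x) = x$ and $m = 1$ this reduces, as noted above, to the ordinary sample quantile.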
Studies have shown that a smoothed estimator $T_n(p)$ may be preferable to $H_n^{-1}(p)$. First, smoothing reduces the random variation in the data, resulting in a more efficient estimator. Second, smoothing reduces the "noise level" in the data, thus providing an estimator that better displays the interesting features of the df. Of the several alternative estimators that have been proposed, we consider the kernel $U$-quantile estimator
$$T_n(p) = \frac{1}{\alpha_n} \int_0^1 H_n^{-1}(t)\, k\!\left(\frac{t-p}{\alpha_n}\right) dt,$$
where $(\alpha_n)$ is a specified sequence of positive constants (bandwidths) tending to zero and $k(x)$ is a known kernel function. In the case $h(x) = x$, this estimator was proposed by Parzen [7] and has been studied by Falk [4,5], Yang [14], Sheather and Marron [11], and Ralescu [9]. For the general case, $T_n(p)$ has been investigated by Veraverbeke [12], who established its asymptotic normality. The kernel $U$-quantile estimator brings a clear improvement over the traditional $U$-quantile when $H$ is differentiable. The size and order of the improvement are usually revealed by studying the Edgeworth expansion of $T_n(p)$, since using one or more terms beyond the normal approximation significantly improves the accuracy for small to moderate samples.

This paper studies further asymptotic properties of the kernel $U$-quantile estimator. In Section 2 we establish a strong Bahadur representation of $T_n(p)$; in particular, under regularity conditions on $(\alpha_n)$, the a.s. rate $O\big(n^{-3/4}(\log n)^{1/4}\big)$ is obtained. In Section 3 we prove an invariance principle (functional CLT) for kernel $U$-quantiles, and in Section 4 we derive asymptotic normality results for random sample sizes.
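A minimal numerical sketch of the smoothed estimator, assuming the integral form $T_n(p) = \alpha_n^{-1} \int_0^1 H_n^{-1}(t)\, k((t-p)/\alpha_n)\, dt$, replaces the integral by a midpoint Riemann sum over the ordered $h$-values; the choice of a Gaussian kernel and the normalization by the realized weight mass (which damps boundary effects near $p = 0, 1$) are our illustrative choices, not prescriptions from the paper.

```python
import itertools
import math
import random

def kernel_u_quantile(xs, h, m, p, bandwidth):
    """Riemann-sum approximation of the kernel U-quantile
    T_n(p) = (1/a) * Integral_0^1 H_n^{-1}(t) k((t - p)/a) dt,
    i.e. a kernel-weighted average of the ordered h-values."""
    vals = sorted(h(*c) for c in itertools.combinations(xs, m))
    N = len(vals)
    gauss = lambda u: math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)
    # H_n^{-1}(t) = vals[i] for t in (i/N, (i+1)/N]; midpoint rule at t = (i+0.5)/N.
    w = [gauss(((i + 0.5) / N - p) / bandwidth) for i in range(N)]
    total = sum(w)  # normalize by realized weight mass (illustrative choice)
    return sum(wi * vi for wi, vi in zip(w, vals)) / total

random.seed(0)
xs = [random.gauss(0.0, 1.0) for _ in range(40)]
# Smoothed Hodges-Lehmann-type estimate of the median.
t_smooth = kernel_u_quantile(xs, lambda a, b: (a + b) / 2, m=2, p=0.5,
                             bandwidth=0.05)
```

As the bandwidth shrinks the weights concentrate near $t = p$ and the estimate approaches the raw $U$-quantile $H_n^{-1}(p)$; larger bandwidths trade bias for reduced variance.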

2. Asymptotic representation of $T_n(p)$
To study the strong asymptotic representation of $T_n(p)$, the following assumptions are needed. The next result provides the Bahadur representation for kernel $U$-quantiles.

3. The invariance principle for $T_n(p)$
Here we consider the Donsker-type invariance principle for $T_n(p)$. The proof makes use of Theorem 2.1. Define $Y_n(t)$ at the points $t = k/n$, $k = m, \ldots, n$, and by linear interpolation for the other $t \in [0, 1]$. We now prove that, as $n \to \infty$, the random function $Y_n(\cdot)$ converges weakly to a standard Brownian motion $W(\cdot)$ in the space $C[0,1]$ of all continuous functions on $[0,1]$ endowed with the uniform topology.

Proof. Define the associated process $\{Y_n^*(t)\}_{0 \le t \le 1}$ at the points $t = k/n$, $k = m, \ldots, n$, and on each interval $t \in [(k-1)/n, k/n]$, $k = m, \ldots, n$. Since, for fixed $y$, $H_n(y)$ is a $U$-statistic, by the functional central limit theorem for $U$-statistics (Miller and Sen [6]) it follows that $Y_n^*(\cdot)$ converges weakly to $W(\cdot)$ in $(C[0,1], \rho)$, where $\rho$ is the sup-norm on $C[0,1]$. Therefore, to conclude the proof it suffices to show that $\rho(Y_n, Y_n^*) \to 0$ a.s. as $n \to \infty$, for any sequence $\epsilon_n$ satisfying assumption $(A_3)$. Now, if $\sqrt{n}\,\alpha_n \to 0$, by taking $\epsilon_n = n^{-1/2 + \alpha}$ with $0 < \alpha < 1/2$, it is readily seen that $\alpha_n/\epsilon_n \to 0$. Also, $\sqrt{n}\, R_n \to 0$ a.s. as $n \to \infty$. Combining (3.7)–(3.9) we conclude that (3.5) holds, and the proof is complete.
Remark 3.1. From Theorem 3.1 it is clear that the form of the asymptotic variance used by Veraverbeke (1987) is incorrect; in fact, our result shows that the asymptotic variance of $\sqrt{n}\,(T_n(p) - \xi_p)$ is $m^2 v_p^2 / h^2(\xi_p)$.

Remark 3.2. Theorem 3.1 will be used to prove the result presented in the next section. Further applications of the weak convergence $Y_n(\cdot) \Rightarrow W(\cdot)$ may be obtained as follows:

(a) Consider a sequence $\{r_k\}_{k \ge 1}$ of positive real numbers such that $\lim_{k \to \infty} k^{-1/2} r_k = r$, $0 < r < \infty$. Let $N_k$ denote the first time $n$ at which $\sqrt{n}\,(T_n(p) - \xi_p)$ reaches or exceeds $r_k$, and let $G_k(x) = P\{N_k \le x\}$. If $x_k > 0$ is a sequence tending to infinity in such a way that $\lim_{k \to \infty} k^{-1} x_k = c > 0$, then, on account of Theorem 3.1, the limit of $G_k(x_k)$ follows from the first-passage distribution of Brownian motion: under the assumptions of Theorem 3.1, the invariance principle yields, for $x > 0$, a limit expressed in terms of $\Phi$, the distribution function of the $N(0,1)$ random variable.
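The first-passage limits in Remark 3.2 rest on the reflection principle for Brownian motion, $P\{\sup_{0 \le t \le c} W(t) \ge r\} = 2\big(1 - \Phi(r/\sqrt{c})\big)$. A small simulation (a generic illustration of this identity on a discretized path, not tied to the paper's specific $G_k$) checks the formula:

```python
import math
import random

def phi(x):
    """Standard normal df via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

r, c = 1.0, 1.0
exact = 2.0 * (1.0 - phi(r / math.sqrt(c)))  # reflection principle

random.seed(2)
steps, reps = 1000, 4000
dt = c / steps
hits = 0
for _ in range(reps):
    w = 0.0
    for _ in range(steps):
        w += random.gauss(0.0, math.sqrt(dt))  # Brownian increment over dt
        if w >= r:
            hits += 1
            break
est = hits / reps  # Monte Carlo estimate of P(sup W >= r)
```

The discrete walk slightly undershoots the continuous supremum, so `est` sits a little below the exact value $2(1 - \Phi(1)) \approx 0.3173$; the gap shrinks as `steps` grows.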

4. Asymptotic normality for randomly indexed sequences of random variables
In many applied models, statistical inference is based on a counting sequence $\{N_k\}_{k \ge 1}$ of nonnegative integer-valued random variables; for example, the sample size might be the number of observations obtained within a fixed period of time. Applications of randomly indexed samples appear often in queueing problems and in insurance and liability models; the situation is equally important in connection with stopping times arising in sequential tagging. Typically, in a sequential point or interval estimation problem, the sample size is not predetermined and is itself an integer-valued random variable. For such stochastic sample sizes, the usual asymptotic normality results may require extra regularity conditions, and a direct proof might be too involved. Our next result establishes the asymptotic normality of $T_{N_k}(p)$ for random sample sizes.
Proof. Since $\sqrt{n_k}\,(T_{n_k}(p) - \xi_p)$ converges in distribution to a normal random variable with mean $0$ and standard deviation $m v_p / h(\xi_p)$, it suffices to show that
$$\sqrt{n_k}\,\big(T_{N_k}(p) - T_{n_k}(p)\big) \xrightarrow{P} 0, \quad k \to \infty. \tag{4.3}$$
To this end, let $0 < \delta < 1/2$. Note that on the event $\{|N_k/n_k - 1| \le \delta\}$ we have $N_k < 2n_k$ and $|1/N_k - 1/n_k| \le 2\delta/n_k$, and the following estimate obtains. By assumption, $D_{1k}$ can be made arbitrarily small for sufficiently large $k$. To treat $D_{2k}$, we use (4.4). Since $N_k/n_k \to 1$ in probability as $k \to \infty$, there exists $\delta > 0$ such that the first term on the right-hand side of (4.6) is at most $\epsilon/3$ for sufficiently large $k$. On the other hand, from Theorem 3.1, by the tightness of $Y_{2n_k}(t)$, there exists $\delta > 0$ such that, for all $k$ sufficiently large, the second term on the right-hand side of (4.6) is at most $\epsilon/3$. Therefore there exist $\delta > 0$ and $k_0 \ge 1$ such that, for all $k \ge k_0$, $D_{1k} \le \epsilon/3$ and $D_{2k} \le 2\epsilon/3$. From these estimates and (4.5) we obtain (4.3), which completes the proof of the theorem.
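The content of the theorem can be seen in the simplest special case $h(x) = x$, $m = 1$, where the $U$-quantile reduces to the sample median. The Monte Carlo sketch below (the Gaussian model for $N_k$, chosen so that $N_k/n_k \to 1$, is an illustrative assumption, not from the paper) checks that $\sqrt{n_k}\,(\text{median} - \xi_{1/2})$ under the random sample size still matches the fixed-$n$ asymptotic standard deviation, which for $N(0,1)$ data equals $1/(2\varphi(0)) = \sqrt{\pi/2}$:

```python
import math
import random

def sample_median(xs):
    s = sorted(xs)
    n = len(s)
    return s[n // 2] if n % 2 else 0.5 * (s[n // 2 - 1] + s[n // 2])

random.seed(1)
n_k, reps = 400, 2000
vals = []
for _ in range(reps):
    # Illustrative random sample size concentrating around n_k,
    # so that N_k / n_k -> 1 in probability.
    N = max(1, int(round(random.gauss(n_k, math.sqrt(n_k)))))
    xs = [random.gauss(0.0, 1.0) for _ in range(N)]
    # xi_{1/2} = 0 for N(0,1); scale by the deterministic sqrt(n_k).
    vals.append(math.sqrt(n_k) * sample_median(xs))

mean = sum(vals) / reps
sd = math.sqrt(sum((v - mean) ** 2 for v in vals) / reps)
target = math.sqrt(math.pi / 2.0)  # asymptotic sd for the N(0,1) median
```

The empirical standard deviation of the randomly indexed, $\sqrt{n_k}$-scaled medians lands near `target`, which is what the theorem asserts for general kernel $U$-quantiles.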
Remark 4.1. Theorem 4.1 may be used to study sequential fixed-width confidence intervals for $\xi_p = H^{-1}(p)$ with a given required accuracy. More precisely, by appropriately selecting the window width, one can construct random intervals $I_n$ from $T_n(p)$ with $\mathrm{length}(I_n) \to 0$ with probability $1$ as $n \to \infty$, such that for $d > 0$, if the random variable $N_d$ is defined to be the first $n \ge 1$ for which $\mathrm{length}(I_n) \le 2d$, then, given $0 < \alpha < 1/2$, we have:

(i) $\lim_{d \to 0} P\{\xi_p \in I_{N_d}\} = 1 - 2\alpha$, and

(ii) $N_d \approx c\, d^{-5/2}$ with probability $1$ as $d \to 0$.

Details are omitted.