Nonstandard functional limit laws for the increments of the compound empirical distribution function

Let $(Y_i,Z_i)_{i\geq 1}$ be a sequence of independent, identically distributed (i.i.d.) random vectors taking values in $\mathbb{R}^k\times\mathbb{R}^d$, for some integers $k$ and $d$. Given $z\in \mathbb{R}^d$, we provide a nonstandard functional limit law for the sequence of functional increments of the compound empirical process, namely $$\mathbf{\Delta}_{n,c}(h_n,z,\cdot):= \frac{1}{nh_n}\sum_{i=1}^{n} 1_{[0,\cdot)}\Big(\frac{Z_i-z}{h_n^{1/d}}\Big)\, Y_i.$$ Provided that $nh_n\sim c\log\log n$ as $n\to\infty$, we obtain, under some natural conditions on the conditional exponential moments of $Y\mid Z=z$, that $$\mathbf{\Delta}_{n,c}(h_n,z,\cdot)\leadsto \Gamma\quad\text{almost surely},$$ where $\leadsto$ denotes almost sure relative compactness with limit set, under the sup norm on $[0,1)^d$. Here, $\Gamma$ is a compact set that is related to the large deviations of certain compound Poisson processes.

A sequence $(f_n)$ in a metric space $(E, \rho)$ is said to be relatively compact with limit set equal to $K$ when $K$ is (nonvoid) compact and the following assertions are true: first, $\inf_{g \in K}\rho(f_n, g) \to 0$ as $n \to \infty$; second, for each $g \in K$, $\liminf_{n\to\infty} \rho(f_n, g) = 0$. We shall write this property $f_n \leadsto K$. Throughout this article, we shall consider a sequence of constants $(h_n)_{n\geq 1}$ satisfying the so-called local nonstandard conditions, namely, as $n \to \infty$,

(HV) $\quad 0 < h_n < 1,\quad h_n \downarrow 0,\quad nh_n \uparrow \infty,\quad nh_n / \log_2 n \to c \in (0, \infty).$
Here we have set $\log_2 n := \log(\log(n \vee 3))$, with the notation $a \vee b := \max\{a, b\}$. In a pioneering work, Deheuvels and Mason [4] established a nonstandard functional law of the iterated logarithm for a single functional increment of the empirical distribution function. With the notation of the present paper, their theorem can be stated as follows.
Later, Deheuvels and Mason [6] extended the just-mentioned result to a more general setting, where $d > 1$ and with fewer assumptions on the law of the $Z_i$, considering the $\Delta_{n,c}(h_n, z, \cdot)$ as random measures indexed by a class of sets. The aim of the present paper is to extend the above-mentioned results to the case where the random vectors $Y_i$ are not constant, but do satisfy some assumptions on their conditional exponential moments given $Z = z$. From now on, $\langle \cdot, \cdot \rangle$ will always denote the Euclidean scalar product on $\mathbb{R}^k$ and $\lambda$ stands for the Lebesgue measure. Define $\mathcal{C}$ as the class of all sets $C \subset \mathbb{R}^d$ that are unions of $d$ hypercubes of $\mathbb{R}^d$ and satisfy $\lambda(C) > 0$. The two key assumptions that we shall make upon the law of $(Y_1, Z_1)$ are stated as follows.
(HL1) There exists a constant $f(z) > 0$ satisfying, for each $C \in \mathcal{C}$, the condition displayed below. A straightforward analysis shows that these assumptions are fulfilled when the $Y_i$ are bounded by a constant and when the $Z_i$ admit a (version of a) density $f$ which is continuous at $z$. Another interesting case where (HL1) and (HL2) are fulfilled is a general semiparametric setting, which appears in the following proposition.
3. For each $z' \in V$ and each $t \in \mathbb{R}^k$, we have the property stated below.
4. $Z$ has a version of its density (with respect to the Lebesgue measure $\lambda$) which is continuous on $V$.
Proof: The proof is a straightforward application of Scheffé's lemma.

Remark 1.2.
Roughly speaking, assumption (HL2) imposes that the Laplace transform of the law of $Y \mid Z = z$ is finite on $\mathbb{R}^k$. One could argue that this assumption could be weakened. However, it seems that, when this assumption is dropped, Theorem 1 (see below) does not hold anymore under the strong norm $\|\cdot\|_k$. A close look at the works of Deheuvels [3] and Borovkov [2] on the functional increments of random walks leads to the conjecture that, when $L$ is finite only on a neighborhood of $0$, the appropriate topology is the so-called weak-star topology (see, e.g., [3]). This topic is however beyond the scope of this article, and shall be investigated in future works.
Notice that $L_Y$ and $L_{|Y|_k}$ are positive convex functions when they exist. We now introduce $h_Y$ (resp. $h_{|Y|_k}$), which is defined as the Legendre transform of $L_Y - 1$ (resp. $L_{|Y|_k} - 1$). Recall that the constant $c > 0$ appears in assumption (HV) and that $\Gamma_{h_Y}(1/c(z))$ has been defined by (1.4) and (1.8). Our result can be stated as follows.
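Written out under the standard convention for the Legendre transform (a reconstruction of the elided display, not taken verbatim from the source), the definition reads:

```latex
% Legendre transform of L_Y - 1 (reconstruction under the standard convention)
h_Y(x) := \sup_{t \in \mathbb{R}^k}
  \Big\{ \langle t, x \rangle - \big( L_Y(t) - 1 \big) \Big\},
  \qquad x \in \mathbb{R}^k,
```

and $h_{|Y|_k}$ is obtained in the same way, with $L_Y$ replaced by $L_{|Y|_k}$ and the supremum taken over $t \in \mathbb{R}$.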
A consequence of this result is the following inconsistency result, for which, to the best of our knowledge, no proof has yet been provided: let $K$ be a real function on $\mathbb{R}^d$ with bounded variation and compact support. The Nadaraya-Watson regression estimator of $r(z) := E[Y \mid Z = z]$ is defined as
$$r_n(z) := \frac{\sum_{i=1}^{n} Y_i\, K\big((Z_i - z)/h_n^{1/d}\big)}{\sum_{i=1}^{n} K\big((Z_i - z)/h_n^{1/d}\big)}.$$
Theorem 1 entails that, under (HV), (HL1) and (HL2), the pointwise strong consistency of $r_n$ does not hold. (1.10)

Moreover, assuming without loss of generality that $Y$ has a strictly positive second moment matrix, we have $\nabla^2 L_Y > 0$ (strictly positive matrix) on $\mathbb{R}^{k+1}$, which ensures that $\nabla L_Y$ is a $C^1$ diffeomorphism from $\mathbb{R}^{k+1}$ to an open set $O \ni m$, and hence admits an inverse that we write $\nabla L_Y^{-1}$. We deduce continuity in $x$, which implies that, for $\epsilon > 0$ small enough, we have (1.11).

For $g \in B_{k+1}([0, 1)^d)$, we shall write $g = (g_k, g_{k+1})$, where $g_{k+1}$ denotes the last coordinate of $g$ and $g_k \in B_k([0, 1)^d)$ is equal to $g$ without its last coordinate. We shall also write, for a Borel set $A$ and for $\ell = (\ell_1, \ldots)$, the integral defined below, which is well defined as soon as either $1_A$ or each $g_i$ has bounded variation on $\mathbb{R}^d$. Consider the mappings $\Psi$ and $T$ defined below. As $K$ has bounded variation, $\Psi$ is continuous, and so is $T \circ \Psi$. Applying Theorem 1, we then deduce the corresponding clustering statement, almost surely. It hence remains to show that the limit set has empty interior, which obviously implies that, almost surely, $r_n(z) \not\to r(z)$ as $n \to \infty$. As $\Psi'$ is continuous, surjective and linear, the conclusion follows by well-known arguments, which concludes the proof.
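For concreteness, here is a minimal numerical sketch of the estimator $r_n$ just defined (all names are hypothetical; `box_kernel` is merely one bounded-variation, compact-support choice of $K$, and the $h^{1/d}$ bandwidth scaling follows the convention used above):

```python
import numpy as np

def nadaraya_watson(z, Z, Y, h, kernel):
    """Nadaraya-Watson estimate of r(z) = E[Y | Z = z].

    Z: (n, d) array of covariates, Y: (n,) array of responses,
    h: bandwidth, kernel: maps (n, d) rescaled differences to (n,) weights.
    """
    d = Z.shape[1]
    # weights K((Z_i - z) / h^{1/d}), matching the scaling of the increments above
    w = kernel((Z - z) / h ** (1.0 / d))
    denom = w.sum()
    if denom == 0.0:
        return float("nan")  # no observation falls in the window
    return float((w * Y).sum() / denom)

def box_kernel(u):
    # indicator of the hypercube [-1/2, 1/2]^d: bounded variation, compact support
    return np.all(np.abs(u) <= 0.5, axis=-1).astype(float)
```

With the box kernel, the estimate is simply the average of the responses whose covariates fall in the local window; under (HV) that window contains only of order $\log\log n$ observations, which is the intuition behind the failure of pointwise strong consistency.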
The remainder of our paper is organised as follows. In §2, we introduce an almost sure approximation of ∆ n,c (z, h n , ·) by a sum of compound Poisson processes. This approximation is largely inspired by a lemma of Deheuvels and Mason [6]. We then focus on these "poissonised" processes and provide some exponential inequalities on their modulus of continuity. In §3, we establish a Large Deviation Principle (LDP). Then §4 and §5 are devoted to proving points (1.6) and (1.7) of Theorem 1 respectively.

2. A Poisson approximation
Recall that $z \in \mathbb{R}^d$ is fixed once and for all in our problem. For ease of notation, we introduce the shorthand defined below. Throughout this article, we shall refer to a generic stochastic process $U$, usually called a compound Poisson process. It is defined as follows: consider an infinite i.i.d. array $(Y_{ij}, Z_{ij})_{i\geq 1,\, j\geq 1}$ having the same law as $(Y_1, Z_1)$, as well as a Poisson random variable with expectation equal to $1$ fulfilling the requirements below. Note that the law of $U$ is entirely determined by the following property, for each $p \geq 1$ and for each partition. Recall that the expression $U(A)$ is understood according to (1.12). The following proposition enables us to switch the study of the almost sure behaviour of the sequence $(\Delta_{n,c}(z, h_n, \cdot))_{n\geq 1}$ to that of a sequence with the generic term displayed below, where the $U_i$ are suitably built independent copies of $U$. This result is in the spirit of Deheuvels and Mason (see [6], Lemma 2.1, or [4], Proposition 2.1).
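To fix ideas, one standard way of realising such a compound Poisson process (a sketch under assumptions of our own, not necessarily the exact construction labelled (2.2) in the source) is:

```latex
% eta: Poisson variable with mean 1, independent of the array (Y_{1j}, Z_{1j})_{j >= 1}
U(s) := \sum_{j=1}^{\eta} Y_{1j}\, 1_{[0,s)}(Z_{1j}),
\qquad s \in [0,1)^d,
```

so that $U$ attaches the mark $Y_{1j}$ to each of a Poisson number of sample points $Z_{1j}$, in analogy with the increments $\mathbf{\Delta}_{n,c}$.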
Proposition 2.1. On a probability space $(\Omega, \mathcal{A}, P)$ rich enough, we can construct an i.i.d. sequence of processes $(U_i)_{i\geq 1}$ having the same law as $U$, together with a sequence for which (2.5) holds, with $\Delta\Pi_{n,c}(\cdot, \cdot)$ defined in (2.4).
Proof : Denote by U a process having the same law as in (2.2).
(f) The union of these five families of random elements is a stochastically independent family.
In (e), equality in law is understood as equality with respect to the $\sigma$-algebra generated by the open balls. In (f), stochastic independence is understood with respect to a suitably chosen product $\sigma$-algebra. In fact, $\eta^*_i$ and $b_i$ form a coupling of a Poisson and a Bernoulli random variable $(\eta, b)$, each with expectation $p_i$, such that the probability $P(\eta = b)$ is maximal. Second, notice that the following random vectors are i.i.d. with common law equal to that of $(Y_1, Z_1)$. Moreover, the following assertions are true with probability one, for each $i \geq 1$. We now define, for each $i \geq 1$, the processes below. Here, $V^{\complement}$ denotes the complement of a given set $V \subset \mathbb{R}^d$. Standard computations on characteristic functions show that the processes $U_i(\cdot)$ fulfil (2.3), and hence are distributed as $U$. Moreover, since $h_n \downarrow 0$, it follows from (2.6), (2.8) and (2.10) that, for each $i \geq 1$,
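For the record, the maximal coupling of a Bernoulli and a Poisson variable with the same mean admits a classical, easily verified bound (Le Cam's inequality; this computation is ours, not taken from the source):

```latex
% maximal coupling of b ~ Bernoulli(p_i) and eta ~ Poisson(p_i):
% P(eta = b) = min(1 - p_i, e^{-p_i}) + min(p_i, p_i e^{-p_i})
%            = (1 - p_i) + p_i e^{-p_i},  hence
P(\eta \neq b) \;=\; p_i \big( 1 - e^{-p_i} \big) \;\le\; p_i^2 ,
```

which is exactly the summability ingredient $\sum_i p_i^2 < \infty$ used in the Borel-Cantelli step below.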

M. Maumy and D. Varron/Non standard behaviour of the compound empirical increments
Since $p_n = f(z)h_n(1 + o(1))$ as $n \to \infty$, and by assumption (HV), we have $\sum_i p_i^2 < \infty$, which entails, by making use of the Borel-Cantelli lemma, that (2.5) is true with respect to our construction.
By Proposition 2.1, proving Theorem 1 is equivalent to proving a version of Theorem 1 with the processes $\Delta_{n,c}(z, h_n, \cdot)$ replaced by their Poisson approximations $\Delta\Pi_{n,c}(h_n, \cdot)$. This will be the aim of §3, §4 and §5. In each of these three sections, we shall require the following exponential inequality for the absolute oscillations of $\Delta\Pi_{n,c}$, which are defined as the oscillations of the process displayed in (2.12). Recall that $h_{|Y|_k}$ has been defined by (1.9). (2.14)

Proof: Given $s$ and $s' \in [0, 1)^d$, we write $s \prec s'$ whenever each coordinate of $s$ is smaller than the corresponding coordinate of $s'$. Obviously, the $\Delta\Pi_{n,c}(h, s)$ are almost surely increasing in each coordinate of $s$. First fix $\delta > 0$ and set the quantities below. We then discretise $[0, 1)^d$ into a finite grid, where $\eta_n$ is a Poisson random variable with expectation $n$, independent of $(Y_i, Z_i)_{i\geq 1}$ (here $\overset{\mathcal{L}}{=}$ stands for equality in law between processes). For a Borel set, the corresponding expression is understood as in (1.12). By the triangle inequality, we have almost surely

By Markov's inequality, we have (2.21). Note that (2.21) has been obtained by conditioning with respect to $\eta_n$. Now, by conditioning with respect to $E_{i,h}$, we obtain the next bound. Note that assumptions (HL1) and (HL2) readily entail the stated convergence. Choose $h_x > 0$ small enough so that the quantity involved in the last display is as small as desired. This concludes the proof of Lemma 2.1.

3. Large deviations for $\Delta\Pi_{n,c}(h_n, \cdot)$

In this section, we establish a Large Deviation Principle (LDP) for the sequence of processes $\Delta\Pi_{n,c}(h_n, \cdot)$. For the definition of large deviations for sequences of bounded stochastic processes, and of a (good) rate function, we refer to Arcones [1].

3.1. Some tools in large deviation theory
We begin this subsection with some well-known properties of $h_Y$ and $h_{|Y|_k}$, given in (1.8) and (1.9) respectively (see, e.g., [3], Lemma 2.1, or Borovkov [2], just above his main theorem).
Arcones (see [1], Theorem 3.1) has established a very useful criterion to establish an LDP for processes in $B_k([0, 1)^d)$ (actually only with $k = 1$, but the extension of his results to $k > 1$ is straightforward). We cannot make direct use of his Theorem 3.1 and shall make use of a slight modification of it. To state this modification, we shall introduce some more notation. For each integer $p \geq 1$, consider the finite grid
$$S_p := \big\{ s_{j,p},\ j \in \{1, \ldots, 2^p\}^d \big\} := \big\{ 2^{-p} j,\ j \in \{1, \ldots, 2^p\}^d \big\}$$
(3.1) and consider its associated partition of $[0, 1)^d$ into hypercubes, namely the cells $C_{j,p}$ defined below. Here we have written $j - 1 := (j_1 - 1, \ldots, j_d - 1)$. Now, for each integer $p \geq 1$ and for each $g \in B_k([0, 1)^d)$, write the discretisation $g^{(p)}$ defined below. The following proposition is a straightforward variation of Theorem 3.1 of Arcones [1], and is written according to the notation of that theorem (in particular, we refer to [1] for a definition of the outer probability $P^*$).
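The cells of the partition just mentioned, and one natural choice for the discretisation $g^{(p)}$ (the latter is our reconstruction of the elided display and should be checked against Arcones [1]):

```latex
C_{j,p} := \prod_{l=1}^{d} \big[ 2^{-p}(j_l - 1),\ 2^{-p} j_l \big),
\qquad j \in \{1, \ldots, 2^p\}^d,
\qquad
g^{(p)} := \sum_{j \in \{1,\ldots,2^p\}^d} g(s_{j,p})\, 1_{C_{j,p}} .
```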
Proposition 3.1. Let (X n ) n≥1 be a sequence of stochastic processes and let (ǫ n ) n≥1 be a sequence of constants fulfilling ǫ n > 0, n ≥ 1 and ǫ n → 0 as n → ∞. Assume that the following conditions are satisfied.
1. The sequence of stochastic processes $(X^{(p)}_n)_{n\geq 1}$ satisfies the LDP for $(\epsilon^{-1}_n)_{n\geq 1}$ and for a rate function $J_p$ on $(B_k([0, 1)^d), \|\cdot\|_k)$.
2. For each $\tau > 0$ and $M > 0$, there exists an integer $p \geq 1$ satisfying the condition displayed below.

Then $(X_n)_{n\geq 1}$ satisfies the LDP for $(\epsilon^{-1}_n)_{n\geq 1}$ and for the rate function defined below.
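A plausible reconstruction of the elided requirement in condition 2, modelled on the exponential-approximation condition of Arcones' Theorem 3.1 (to be checked against [1]):

```latex
\limsup_{n \to \infty}\ \epsilon_n \log
P^{*}\Big( \big\| X_n - X_n^{(p)} \big\|_k > \tau \Big) \;\le\; -M .
```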
Proof: The proof follows the same lines as the proof of Theorem 3.1 of Arcones [1]. We omit details for the sake of brevity.
For $g = (g_1, \ldots, g_k) \in B_k([0, 1)^d)$ and a Borel set $A$, we shall use the integral notation below, which is valid as long as $1_A$ or each $g_l$ has bounded variation. We shall now consider the following (rate) functions on $(B_k([0, 1)^d), \|\cdot\|_k)$ that will play the role of successive approximations of $J_{h_Y}$, for given $p \geq 1$. The following fact is a straightforward extension to the multivariate case of Proposition 2.1 in [12]. Recall that $J_{h_Y}$ has been defined through (1.3) and (1.8). As a consequence, $J_{h_Y}$ is lower semicontinuous on $B_k([0, 1)^d)$.
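Although the display defining $J_{h_Y}$ is elided here, a form consistent with (1.3), (1.8) and the role of the approximations $J^{(p)}$ would be (our reconstruction):

```latex
J_{h_Y}(g) :=
\begin{cases}
  \displaystyle \int_{[0,1)^d} h_Y\big( \dot g \big)\, d\lambda ,
    & \text{if } g \text{ is absolutely continuous with density } \dot g , \\[4pt]
  +\infty , & \text{otherwise.}
\end{cases}
```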
Our next lemma states that the function J hY (recall (1.3)) is a "rate" function.
Proof: By Fact 3.1, there exists $M > 0$ such that $|x|_k \leq M + h_Y(x)$ for each $x$. Hence, for any $g \in \Gamma_{J_{h_Y}}(a)$, we have (recall that $\lambda$ stands for the Lebesgue measure) a total variation bounded by $M + a$, from which we conclude that $\Gamma_{J_{h_Y}}(a)$ is relatively compact in $B_k([0, 1)^d)$. It is also closed in $B_k([0, 1)^d)$ by a combination of Fact 3.2 and (3.8), which proves Lemma 3.1.

3.2. A large deviation principle
In this subsection, we state and prove a large deviation principle that will play a crucial role in the sequel of our proof of Theorem 1. This LDP is stated as follows: the sequence $(\Delta\Pi_{n,c}(h_n, \cdot))_{n\geq 1}$ satisfies the LDP for the sequence $(\epsilon_n^{-1})_{n\geq 1}$, with $\epsilon_n := (nh_n f(z))^{-1}$, and for the rate function $J_{h_Y}$.
Proof: As we shall make use of Proposition 3.1, we have to check conditions 1 and 2 of that proposition, which will be the aim of the following lemmas. Notice that, almost surely, we have the identity below, with $\Delta\Pi_{n,c}(h, C)$ defined according to (1.12). Our proof is divided into two steps, in which we shall respectively verify conditions 2 and 1 of Proposition 3.1.
Step 1: To check condition 2 of Proposition 3.1, we shall make use of Lemma 2.1, which readily entails, for fixed $p \geq 1$ and $\tau > 0$, and for all $n \geq n(p, \tau)$, the bound below. Now fix $M > 0$ and $\tau > 0$. By Fact 3.1, we have, for all large $p$, the required estimate, which implies that condition 2 of Proposition 3.1 is verified.
Step 2 : To check condition 1 of Proposition 3.1, we shall require the following preliminary lemma.

satisfies the LDP for the sequence $(\epsilon_n^{-1})_{n\geq 1}$, with $\epsilon_n := (nh_n f(z))^{-1}$, and for the rate function below.

Proof: The proof of Lemma 3.2 is divided into three steps. The first two steps deal with a single component of the random vectors written in (3.9).
Step 1: In our first step, we make an additional assumption on $L_Y$, which allows us to make full use of the Gärtner-Ellis theorem (see, e.g., [7], p. 44).
Proof of Lemma 3.3: We shall first show that, for each $t \in \mathbb{R}^k$, we have
$$\lim_{n\to\infty} \frac{1}{nh_n f(z)} \log E \exp\Big( \big\langle t,\ nh_n f(z)\, \Delta\Pi_{n,c}(C_{j,p}, h_n) \big\rangle \Big) = \lambda(C_{j,p}) \big( L_Y(t) - 1 \big). \tag{3.10}$$
To show this, we start from the equality (2.17) to obtain, by convolution,
$$\log E \exp\Big( \big\langle t,\ nh_n f(z)\, \Delta\Pi_{n,c}(C_{j,p}, h_n) \big\rangle \Big) = n \log E \exp\Big( \big\langle t,\ U\big(h_n^{1/d} C_{j,p}\big) \big\rangle \Big).$$
Recall that $U$ has been defined in (2.2). Next, we use the characterisation (2.3), applied to the simple partition consisting of $h_n^{1/d} C_{j,p}$ and its complement. Using that relation with $t_1 = t$ and $t_2 = 0$, we obtain the announced identity. Hence (3.10) follows from assumptions (HL1)-(HL2). By Lemma 2.3.9 in [7], p. 46, we know that $(H_0)$ implies that the set of exposed points of $h_Y$ is equal to $\{x \in \mathbb{R}^k,\ h_Y(x) < \infty\}$, from which the proof of Lemma 3.3 is concluded by an application of the Gärtner-Ellis theorem (see, e.g., [7], p. 44).
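As a consistency check (this verification is ours), the limit (3.10) and the rate function appearing in Lemma 3.2 are Legendre conjugates, as the Gärtner-Ellis theorem requires: writing $\lambda := \lambda(C_{j,p})$,

```latex
\sup_{t \in \mathbb{R}^k}
  \Big\{ \langle t, x \rangle - \lambda \big( L_Y(t) - 1 \big) \Big\}
= \lambda \sup_{t \in \mathbb{R}^k}
  \Big\{ \big\langle t, \lambda^{-1} x \big\rangle - \big( L_Y(t) - 1 \big) \Big\}
= \lambda\, h_Y\big( \lambda^{-1} x \big) .
```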
Step 2: In our second step, we shall get rid of assumption $(H_0)$, which is unfortunately not verified in all situations. For example, take $k = 1$ and $Y \equiv 1$, which leads to $L_Y(t) = \exp(t)$, $t \in \mathbb{R}$, and $h_Y(0) = 1$; but $(H_0)$ is not satisfied for $x = 0$.
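Indeed, in this example the computation is explicit:

```latex
h_Y(x) = \sup_{t \in \mathbb{R}} \big\{ t x - e^{t} + 1 \big\},
\qquad
h_Y(0) = \sup_{t \in \mathbb{R}} \big( 1 - e^{t} \big) = 1 ,
```

and the supremum defining $h_Y(0)$ is approached as $t \to -\infty$ but never attained, so $0$ is not an exposed point of $h_Y$.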
Proof of Lemma 3.4: First notice that the "closed sets" part of the LDP stated in Lemma 3.3 can be proved by making use of the Gärtner-Ellis theorem without assumption $(H_0)$. Only the "open sets" part of Lemma 3.3 needs assumption $(H_0)$, since the latter implies that the set of exposed points of $h_Y$ is equal to $\{x \in \mathbb{R}^k,\ h_Y(x) < \infty\}$. We thus only need to prove that the "open sets" lower bound remains valid without that assumption. To achieve this goal, we shall slightly modify the $Y_{i,j}$ by adding small Gaussian random vectors.
Here, $B(x, \epsilon)$ denotes the open ball with centre $x$ and radius $\epsilon$. Now introduce an array $(\zeta_{ij})_{i,j\geq 1}$ of $\mathbb{R}^k$-valued standard Gaussian random vectors that are independent of the array $(Y_{i,j}, Z_{i,j})_{i,j\in\mathbb{N}}$, and define $\Delta\Pi'_{n,c}(C_{j,p}, h_n)$ accordingly. We shall first show that the vector $Y + \delta_1\zeta$ fulfils assumption $(H_0)$. To prove this, first notice that $L_{Y+\delta_1\zeta} = L_Y L_{\delta_1\zeta}$, which holds since $Y$ and $\zeta$ are independent conditionally on $Z$. Obviously, since $\zeta \perp\!\!\!\perp Z$, the vector $\zeta$ fulfils $(H_0)$. Moreover, by Jensen's inequality, we have (3.12). Now consider $x \in \mathbb{R}^k$, and define the function $g(t) := \langle x, t \rangle - \big( L_{Y+\delta_1\zeta}(t) - 1 \big)$. By (3.12), we have $g(t) \to -\infty$ as $|t|_k \to \infty$. Hence, the continuous and differentiable function $g$ admits a maximum at some $\eta \in \mathbb{R}^k$ fulfilling $0 = \nabla g(\eta) = x - \nabla L_{Y+\delta_1\zeta}(\eta)$. This proves that the vector $Y + \delta_1\zeta$ fulfils $(H_0)$ and hence, by Lemma 3.3, we have (3.14). The last inequality holds for $\delta_1 > 0$ small enough, by Fact 3.1, replacing $Y$ by $\zeta$. Hence, by the triangle inequality, we have for all large $n$ the required bound, which is true by the choice of $\delta_1$. The proof of Lemma 3.4 is then concluded since $O$ and $\delta$ are arbitrary.
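The Gaussian factor can be made explicit (a standard computation, for $\zeta$ a standard Gaussian vector of $\mathbb{R}^k$):

```latex
L_{\delta_1 \zeta}(t)
= E \exp\big( \langle t, \delta_1 \zeta \rangle \big)
= \exp\Big( \tfrac{\delta_1^2}{2}\, | t |_k^2 \Big),
\qquad
L_{Y + \delta_1 \zeta}(t)
= L_Y(t)\, \exp\Big( \tfrac{\delta_1^2}{2}\, | t |_k^2 \Big),
```

so that, combined with Jensen's lower bound on $L_Y$, the function $g(t) = \langle x, t \rangle - (L_{Y+\delta_1\zeta}(t) - 1)$ indeed tends to $-\infty$ as $|t|_k \to \infty$.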
Step 3: We conclude the proof of Lemma 3.2 by a tensorisation argument of Lynch and Sethuraman. For each $n$, the collection $(\Delta\Pi_{n,c}(h_n, C_{j,p}))_{j \in \{1, \ldots, 2^p\}^d}$ is independent, and each sequence $(\Delta\Pi_{n,c}(h_n, C_{j,p}))_{n\geq 1}$ satisfies the LDP with the rate function $\lambda(C_{j,p})\, h_Y\big(\lambda(C_{j,p})^{-1} \cdot\big)$. Lemma 3.2 is then proved by applying Lemma 2.8 in [9].
A direct consequence of Lemma 3.2 is that condition 1 of Proposition 3.1 is satisfied, as our next lemma shows: the sequence $(\Delta\Pi^{(p)}_{n,c}(h_n, \cdot))_{n\geq 1}$ satisfies the LDP for $\epsilon_n := (nh_n f(z))^{-1}$ and for the rate function $J^{(p)}_{h_Y}$.

Proof: The proof is a straightforward application of the contraction principle (see, e.g., [1], Theorem 2.1), considering, for fixed $p$, the following application from $(\mathbb{R}^k)^{2^{pd}}$ to $(B_k([0, 1)^d), \|\cdot\|_k)$ (here we write $x = (x_j,\ j \in \{1, \ldots, 2^p\}^d)$, with each $x_j$ belonging to $\mathbb{R}^k$):
$$x \mapsto \sum_{j \in \{1,\ldots,2^p\}^d} x_j\, 1_{C_{j,p}} .$$
We conclude the proof of Proposition 3.2 by combining Steps 1 and 2 with Proposition 3.1.

4. Proof of point (1.6) of Theorem 1
We shall make use of some usual blocking arguments along a subsequence $(n_k)_{k\geq 1}$ defined below. For any $\epsilon > 0$ and $A \subset B_k([0, 1)^d)$, we shall write $A^{\epsilon}$ for the open $\epsilon$-neighbourhood of $A$ with respect to $\|\cdot\|_k$. Now, recalling the definition of $\Delta\Pi_{n,c}$ in (2.4), we define the following normalised Poisson processes, which will play a crucial role in our blocking arguments.
$$H_n(s) := \frac{1}{n_k h_{n_k} f(z)} \sum_{i=1}^{n} U_i\big( h_{n_k}^{1/d}\, s \big), \qquad k \geq 1,\ n \in N_k,\ s \in [0, 1)^d. \tag{4.4}$$
Fix $\epsilon > 0$. We shall proceed in two steps: first, we shall prove that, almost surely, ultimately as $n \to \infty$, the inclusion (4.5) holds.

Step 1: We first prove (4.5). In order to make use of usual blocking arguments along the blocks $N_k$, we shall first show that
$$\lim_{k\to\infty} \max_{n \in N_k} P\Big( \big\| H_n(\cdot) - \Delta\Pi_{n_k,c}(h_{n_k}, \cdot) \big\|_k > \epsilon \Big) = 0. \tag{4.7}$$
To prove this, choose $k \geq 1$ and $n \in N_k$ arbitrarily. A rough upper bound gives (excluding the trivial case where $n = n_k$)
$$P_{n,1} := P\Big( \big\| H_n(\cdot) - \Delta\Pi_{n_k,c}(h_{n_k}, \cdot) \big\|_k > \epsilon \Big) = P\Big( \big\| \Delta\Pi_{n_k - n,c}(h_{n_k}, \cdot) \big\|_k > \epsilon\, \frac{n_k}{n_k - n} \Big). \tag{4.8}$$
Now, making use of inequality (2.14) of Lemma 2.1 with $x := \epsilon n_k/(n_k - n)$, we get the corresponding bound for all large $k$ and for each $n \in N_k$ with $n \neq n_k$. Now, as $n_k - n \geq n_k - n_{k-1}$ and $n_k/(n_k - n_{k-1}) \to \infty$, we readily infer (4.7) from Fact 3.1.
We are now able to make use of a well known maximal inequality (see, e.g., Deheuvels and Mason [5], Lemma 3.4) to conclude that, for all large k,