Nonparametric conditional density estimation for censored data based on a recursive kernel

Abstract: Consider a regression model in which the response is subject to random right censoring. The main goal of this paper is the kernel estimation of the conditional density function when the variable of interest is censored. We employ a recursive version of the Nadaraya-Watson estimator in this context. The uniform strong consistency of the recursive kernel conditional density estimator is derived, and we also prove the asymptotic normality of this estimator.


Introduction
Studying the relationship between a response variable and an explanatory variable is one of the most important tasks in statistical analysis. Usually this relationship is modeled through the regression function. The recursive regression estimator for independent and identically distributed (i.i.d.) random variables has been studied by many authors, among whom we cite [1,18] and [24] for the nonparametric approach and [21] for semi-parametric models. In the strong mixing case, [22] derived the uniform almost sure convergence for (DW), while [23] showed its asymptotic normality. [20] studied some properties of local polynomial regression for dependent data.
Despite this great importance, the recursive kernel estimation under censoring has not yet been fully explored. The present work is the first contribution that considers a recursive estimator for censored data. The main aim of this contribution is to study the asymptotic properties of the recursive kernel estimator of the conditional density and its derivatives under random right censoring. Specifically, the asymptotic properties established are the uniform strong convergence and the asymptotic normality of these estimators. The paper is organized as follows. We present our model in Section 2. In Section 3 we introduce the notation and assumptions and state the main results. Finally, the proofs of the main results are relegated to Section 4, together with some auxiliary results and their proofs.

Presentation of estimates
Consider n pairs of independent random variables $(X_i, T_i)$, $i = 1, \dots, n$, drawn from the pair $(X, T)$ which is valued in $\mathbb{R}^d \times \mathbb{R}$. In this paper we consider the problem of nonparametric estimation of the conditional density of $T$ given $X = x$ when the variable of interest is subject to right censoring. Furthermore, we denote by $(C_i)_{i=1,\dots,n}$ the censoring random variables, which are assumed independent and identically distributed with a common unknown continuous distribution function $G$. Thus, instead of observing the $T_i$ we observe the pairs $(Y_i, \delta_i)$ with $Y_i = \min(T_i, C_i)$ and $\delta_i = \mathbb{1}_{\{T_i \le C_i\}}$, and we construct our estimators from these quantities, where $\mathbb{1}_A$ denotes the indicator function of the set $A$.
To follow the convention in biomedical studies, we assume that (C i ) 1≤i≤n and (T i , X i ) 1≤i≤n are independent; this condition is plausible whenever the censoring is independent of the modality of the patients.
The cumulative distribution function $G$ of the censoring random variables is estimated by the Kaplan-Meier estimator [14], defined by
$$\bar G_n(t) = \prod_{i=1}^{n}\left(1 - \frac{1-\delta_{(i)}}{n-i+1}\right)^{\mathbb{1}_{\{Y_{(i)} \le t\}}},$$
where $Y_{(1)} \le \dots \le Y_{(n)}$ are the order statistics of $(Y_i)_{1\le i\le n}$ and $\delta_{(i)}$ is the indicator associated with $Y_{(i)}$; this estimator is known to be uniformly convergent to $\bar G$. Given i.i.d. observations $(X_1, Y_1, \delta_1), \dots, (X_n, Y_n, \delta_n)$ of $(X, Y, \delta)$, the kernel estimate of the conditional density $\phi(t|x)$, denoted $\phi_n(t|x)$, is defined by
$$\phi_n(t|x) = \frac{\displaystyle\sum_{i=1}^{n} \frac{\delta_i}{\bar G_n(Y_i)}\, h_n^{-(d+1)} K\!\left(\frac{x - X_i}{h_n}\right) L\!\left(\frac{t - Y_i}{h_n}\right)}{\displaystyle\sum_{i=1}^{n} h_n^{-d} K\!\left(\frac{x - X_i}{h_n}\right)},$$
where $K$ and $L$ are kernels and $h_n$ is a sequence of positive real numbers. Note that this estimator has recently been used by [15]. A recursive version of the previous kernel estimator is defined by
$$\tilde\phi_n(t|x) = \frac{g_n(x,t)}{\ell_n(x)}, \quad\text{with}\quad g_n(x,t) = \frac{1}{n}\sum_{i=1}^{n} \frac{\delta_i}{\bar G_n(Y_i)\, h_i^{d+1}} K\!\left(\frac{x - X_i}{h_i}\right) L\!\left(\frac{t - Y_i}{h_i}\right)$$
and
$$\ell_n(x) = \frac{1}{n}\sum_{i=1}^{n} h_i^{-d} K\!\left(\frac{x - X_i}{h_i}\right).$$
Remark 2.1. The Kaplan-Meier estimator is not recursive, and the use of such an estimator can slightly penalize the efficiency of our estimator in terms of computational time.
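To make the construction concrete, here is a minimal Python sketch of the recursive estimator in the form reconstructed above, with $d = 1$ and a Gaussian kernel used for both $K$ and $L$. The weighting $\delta_i/\bar G_n(Y_i)$, the bandwidth rule and all function names (e.g. recursive_cond_density) are illustrative assumptions, not the authors' code.

import numpy as np

def km_censoring_survival(y, delta):
    """Kaplan-Meier estimate of the censoring survival function Gbar_n,
    evaluated at each observed y (delta == 0 marks a censored observation)."""
    order = np.argsort(y)
    n = len(y)
    # factor (1 - (1 - delta_(i)) / (n - i + 1)) for the i-th ordered observation
    factors = 1.0 - (1.0 - delta[order]) / (n - np.arange(n))
    surv = np.empty(n)
    surv[order] = np.cumprod(factors)
    return np.maximum(surv, 1e-6)  # guard against division by zero in the weights

def recursive_cond_density(x0, t0, X, Y, delta, h):
    """Recursive kernel estimate of the conditional density at (x0, t0);
    h is the vector of per-observation bandwidths h_i."""
    gauss = lambda u: np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)
    Gbar = km_censoring_survival(Y, delta)
    num = np.mean(delta / Gbar * gauss((x0 - X) / h) * gauss((t0 - Y) / h) / h ** 2)
    den = np.mean(gauss((x0 - X) / h) / h)
    return num / den if den > 0 else np.nan

# Toy usage with synthetic right-censored data (illustrative only).
rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=n)
T = np.exp(X) + 0.2 * rng.normal(size=n)
C = rng.normal(loc=2.0, scale=1.0, size=n)          # censoring variables
Y, delta = np.minimum(T, C), (T <= C).astype(float)
h = Y.std() * np.arange(1, n + 1) ** (-1.0 / 5.0)    # bandwidths h_i decreasing in i
print(recursive_cond_density(0.0, 1.0, X, Y, delta, h))

The classical estimator corresponds to replacing the vector h by a single bandwidth $h_n$ shared by all observations.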

Assumptions and main results
Throughout the paper, $(h_i)_{i\ge 1}$ denotes the sequence of bandwidths and, when no confusion is possible, we denote by $M$ and/or $M'$ any generic positive constant. Further, we denote by $F(\cdot)$ (resp. $G(\cdot)$) the distribution function of $T$ (resp. of $C$) and by $\tau_F$ (resp. $\tau_G$) the upper endpoint of the survival function $\bar F$ (resp. of $\bar G$). In the following we assume that $\tau_F < \infty$, $\bar G(\tau_F) > 0$ and that $C$ is independent of $(X, T)$. We also assume that there exists a compact set $\mathcal{C} \subset \mathcal{C}_0 = \{x \in \mathbb{R}^d : \ell(x) > 0\}$, where $\ell$ is the density of the explanatory variable $X$, and a compact set $\Omega \subset (-\infty, \tau]$ with $\tau < \tau_F \wedge \tau_G$.
We introduce the following assumptions:
Assumption A1. The kernels $K$ and $L$ are Lipschitz continuous, compactly supported functions, and the following conditions hold:
(i) The marginal density $\ell(\cdot)$ is twice differentiable and satisfies a Lipschitz condition; furthermore, $\ell(x) > \Gamma$ for all $x \in \mathcal{C}$, where $\Gamma > 0$ and $\mathcal{C}$ is a compact subset of $\mathbb{R}^d$.
(ii) The joint density $g(\cdot,\cdot)$ of $(X, T)$ is bounded and twice differentiable.
Remark 3.1. Assumptions A1 and A2 are commonly used in uncensored kernel estimation. The independence assumption between $(C_i)_i$ and $(X_i, T_i)_i$ may seem strong, and one could think of replacing it by the classical conditional independence assumption between $(C_i)_i$ and $(T_i)_i$ given $(X_i)_i$. However, the latter requires preliminary work to derive the rate of convergence of the conditional law of the censoring variable (see [11]). Moreover, our framework is classical and was considered by [7] and [17], among others.

Uniform strong consistency results with rate of convergence
In order to give the rate of uniform almost sure convergence of our estimator we need the following additional assumptions. Observe that, although the expression of the convergence rate is not standard in nonparametric data analysis, it reduces to the usual rate of the classical kernel method when, for all $i$, we have $h_i = h_n$. The proof of this theorem is based on a decomposition of $\tilde\phi_n(t|x) - \phi(t|x)$ in terms of $g_n(x,t) - g(x,t)$ and $\ell_n(x) - \ell(x)$ (a sketch of such a decomposition is given below), so the proof is a direct consequence of Lemmas 3.4-3.6.
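For concreteness, a standard decomposition of this type, written here under the assumption (suggested by the notation of the lemmas below, and not the authors' exact display) that the estimator takes the ratio form $\tilde\phi_n(t|x) = g_n(x,t)/\ell_n(x)$ with $\phi(t|x) = g(x,t)/\ell(x)$, is
$$\tilde\phi_n(t|x) - \phi(t|x) = \frac{1}{\ell_n(x)}\Big\{\big(g_n(x,t) - g(x,t)\big) - \phi(t|x)\big(\ell_n(x) - \ell(x)\big)\Big\},$$
so that uniform control of $g_n - g$ and $\ell_n - \ell$, together with a lower bound keeping $\ell_n$ away from zero, yields the stated rate.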

Asymptotic normality
Now, we study the asymptotic normality of our estimate. To do that, we replace condition C by the following assumption: Assumption N.

Proof of Theorem 3.7
The proof of Theorem 3.7 can be deduced directly from the following lemmas.
Lemma 3.9. Under the hypotheses of Theorem 3.7, we have, in particular, $\sqrt{n h_n^{d+1}}\,(\ell_n(x) - \ell(x)) \to 0$ in probability as $n \to \infty$.
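As a reading aid, and under the same ratio-form assumption $\tilde\phi_n = g_n/\ell_n$ used earlier (a sketch, not the authors' exact display), the normality argument can be organized as
$$\sqrt{n h_n^{d+1}}\big(\tilde\phi_n(t|x) - \phi(t|x)\big) = \frac{1}{\ell_n(x)}\Big\{\sqrt{n h_n^{d+1}}\big(g_n(x,t) - g(x,t)\big) - \phi(t|x)\,\sqrt{n h_n^{d+1}}\big(\ell_n(x) - \ell(x)\big)\Big\},$$
so that the asymptotic normality of the first bracketed term (established through Lemma 3.10), the negligibility stated in Lemma 3.9 and the convergence of $\ell_n(x)$ to $\ell(x)$ in probability yield Theorem 3.7 by Slutsky's theorem.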

Numerical study
In this short section we compare the finite-sample performance of the recursive kernel method and the classical kernel method via a Monte Carlo study. For this comparison we consider the same models as those used in [16], where the random variables $X$ and $\epsilon$ are independent and follow, respectively, the normal distributions $N(0, 1)$ and $N(0, \sigma)$.
It is clear that the expression of the conditional density is closely related to the distribution of $\epsilon$, and the conditional densities of the three models follow directly from it. In order to assess the effect of censoring on the efficiency of both estimators, we vary the percentage of censoring for each model by considering various censoring distributions. Precisely, we generate the censoring variable $C$ from an exponential distribution with parameter $\lambda_1$ shifted by $\lambda_2$ (for the exponential model), from a normal distribution $N(0, \sigma_1)$ (for the sinus case) and from $N(0, \sigma_2)$ (for the parabolic case). Thus, the behavior of both estimators is evaluated with respect to several parameters, such as the sample size $n$, the percentage of censoring $\tau$ controlled by $(\lambda_1, \lambda_2, \sigma_1, \sigma_2)$, the dimension of the regressors $d$ and the type of model $M$. For the sake of brevity, we consider the unidimensional case, fix the sample size $n = 200$, take $\sigma = 0.2$ and consider three censoring levels ($\tau = 10\%$, $\tau = 40\%$ and $\tau = 70\%$). The performance of both estimators is measured by averaged squared errors. For our practical study, we use the Gaussian kernel and the well-known smoothing parameter defined by $h_{n,i} = \sigma_n^2\, i^{-1/5}$. The obtained results are given in Table 1. It is clear from Table 1 that the recursive method is slightly better than the classical kernel method. However, the main advantage of the recursive method is that it is considerably faster than the classical one for the three models. In particular, it substantially reduces the computation time, whatever the sample size and the kind of model. Overall, both methods give a satisfactory level of accuracy, which depends strongly on the censoring rate.
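The following self-contained Python sketch illustrates how such a Monte Carlo comparison can be set up. The sinus-type model, the censoring parameters, the evaluation grid and the averaged squared error below are our own illustrative choices (the original model equations and error formula are not reproduced in this version of the text), and we use the empirical standard deviation of the observations as the scale factor in the bandwidth rule; only the contrast between the recursive bandwidths $h_{n,i} \propto i^{-1/5}$ and a single $h_n$ for the classical method follows the description above.

import numpy as np

rng = np.random.default_rng(1)
gauss = lambda u: np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)

def km_surv(y, delta):
    """Kaplan-Meier survival estimate of the censoring law at each y."""
    o = np.argsort(y)
    f = 1.0 - (1.0 - delta[o]) / (len(y) - np.arange(len(y)))
    s = np.empty(len(y))
    s[o] = np.cumprod(f)
    return np.maximum(s, 1e-6)

def cond_dens(x0, t0, X, Y, delta, h):
    """Censored kernel conditional density; h scalar (classical) or vector (recursive)."""
    Gbar = km_surv(Y, delta)
    num = np.mean(delta / Gbar * gauss((x0 - X) / h) * gauss((t0 - Y) / h) / h ** 2)
    den = np.mean(gauss((x0 - X) / h) / h)
    return num / den if den > 0 else 0.0

n, sigma, reps = 200, 0.2, 50
xs, ts = np.linspace(-1.0, 1.0, 10), np.linspace(-1.5, 1.5, 10)
true_dens = lambda x, t: gauss((t - np.sin(np.pi * x)) / sigma) / sigma
ase_classical, ase_recursive = [], []
for _ in range(reps):
    X = rng.normal(size=n)
    T = np.sin(np.pi * X) + sigma * rng.normal(size=n)   # sinus-type model (assumed form)
    C = rng.normal(loc=1.0, scale=1.0, size=n)            # tune loc to change the censoring rate
    Y, delta = np.minimum(T, C), (T <= C).astype(float)
    scale = Y.std()
    h_classical = scale * n ** (-1.0 / 5.0)
    h_recursive = scale * np.arange(1, n + 1) ** (-1.0 / 5.0)
    err_c = sum((cond_dens(x, t, X, Y, delta, h_classical) - true_dens(x, t)) ** 2
                for x in xs for t in ts)
    err_r = sum((cond_dens(x, t, X, Y, delta, h_recursive) - true_dens(x, t)) ** 2
                for x in xs for t in ts)
    ase_classical.append(err_c / (len(xs) * len(ts)))
    ase_recursive.append(err_r / (len(xs) * len(ts)))
print("mean ASE  classical:", np.mean(ase_classical), " recursive:", np.mean(ase_recursive))

Both estimators are evaluated on the same grid and the same samples, so the comparison isolates the effect of replacing the single bandwidth by the recursive sequence $h_i$.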
For the quantity $\sup_{x\in\mathcal{C}}\sup_{t\in\Omega}|E g_n(x,t) - g(x,t)|$, we use the fact that, for every measurable function $\varphi$ and all $i = 1, \dots, n$,
$$E\left[\frac{\delta_i}{\bar G(Y_i)}\,\varphi(X_i, Y_i)\right] = E\big[\varphi(X_i, T_i)\big],$$
which follows by conditioning on $(X_i, T_i)$ and using the independence of $C_i$ and $(X_i, T_i)$ together with the continuity of $G$. Then, the stated bound for this bias term follows from the smoothness assumptions on $g$.

Now, concerning the quantity $\sup_{x\in\mathcal{C}}\sup_{t\in\Omega}|g_n(x,t) - E g_n(x,t)|$, we use the compactness of the sets $\mathcal{C}$ and $\Omega$ to write a covering by balls centered at points $(x_k)_{1\le k\le \lambda_n}$ and $(t_j)_{1\le j\le \kappa_n}$, where $\lambda_n \sim a_n^{-d}$ and $\kappa_n \sim b_n^{-1}$ with $a_n = b_n = n^{-(d+1)\beta - 1/2}$. Now, for any $x \in \mathcal{C}$ and $t \in \Omega$, we set $k(x) = \arg\min_k \|x_k - x\|$ and $j(t) = \arg\min_j |t - t_j|$. Then, for any $(x, t) \in \mathcal{C}\times\Omega$, we can decompose $g_n(x,t) - E g_n(x,t)$ into five terms, denoted $T_{1,n} + T_{2,n} + T_{3,n} + T_{4,n} + T_{5,n}$.
So, under Assumption C(ii), we obtain the required bound for this term, and by using the same arguments as those used for $T_{1,n}$ we handle the analogous terms. Finally, in order to study $T_{5,n}$, we use Bernstein's inequality. Using the fact that the kernels $K$ and $L$ are bounded, and by ideas similar to those used in the first part of this lemma, we bound the summands and their variance. Hence, by Bernstein's inequality (see [13]), it follows that, for all $\epsilon > 0$, the deviation probability of each $T_{5,n}(x_k, t_j)$ is exponentially small, where $h(u) = 3u/(6+2u)$ for all $u > 0$. Now, taking $\epsilon = \epsilon_0 \left(\frac{\log n}{n h_n^{d+1}}\right)^{1/2}$, we obtain a bound valid for any $(k, j)$. Consequently, the Borel-Cantelli lemma and an appropriate choice of $\epsilon_0$ allow us to conclude.
Proof of Lemma 3.5. Firstly, we write the quantity of interest as the sum of two terms, $L_{1n}$ and $L_{2n}$. The first term $L_{1n}$ is very close to the last part of Lemma 3.4, so a standard analytical argument gives its rate, while the treatment of the second term $L_{2n}$ follows the same lines as in Lemma 3.4. Therefore, we obtain the stated rate, which completes the proof of this lemma.
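For reference, a standard form of Bernstein's inequality consistent with the function $h$ used in the proof of Lemma 3.4 above (we state a classical version; whether it matches [13] word for word is an assumption) is the following: if $Z_1, \dots, Z_n$ are independent centered random variables with $|Z_i| \le M$ and $\sum_{i=1}^{n}\operatorname{Var}(Z_i) \le V$, then for all $\epsilon > 0$,
$$P\Big(\Big|\sum_{i=1}^{n} Z_i\Big| \ge \epsilon\Big) \le 2\exp\!\left(-\frac{\epsilon^{2}}{2\big(V + M\epsilon/3\big)}\right) = 2\exp\!\left(-\frac{\epsilon}{M}\, h\!\left(\frac{M\epsilon}{V}\right)\right), \qquad h(u) = \frac{3u}{6+2u}.$$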
Proof of Lemma 3.6. It is clear that the difference between the quantities built with $\bar G_n$ and with $\bar G$ is controlled by $\sup_{t\le\tau}|\bar G_n(t) - \bar G(t)|$. Since $\bar G_n(\tau) > 0$, in conjunction with the SLLN and the LIL for the censoring law (see formula (4.28) in [11]), the result is an immediate consequence of Lemma 3.4.
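For orientation, the rate classically available at this point (we state the standard law-of-the-iterated-logarithm bound for the Kaplan-Meier estimator; whether formula (4.28) in [11] is exactly this statement is an assumption on our part) is
$$\sup_{t \le \tau} \big|\bar G_n(t) - \bar G(t)\big| = O\!\left(\sqrt{\frac{\log\log n}{n}}\right) \quad \text{a.s.}, \qquad \tau < \tau_F \wedge \tau_G,$$
which is what allows the substitution of $\bar G_n$ for $\bar G$ without affecting the rates obtained in Lemma 3.4.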
Proof of Lemma 3.9.
• Proof of (3.4). Similarly to the previous lemma, we obtain the corresponding decomposition; from N(ii) we deduce the negligibility of the remaining term. The latter, combined with the results of Lemma 3.4, allows us to complete the proof of (3.4).
• Proof of (3.5). It is shown in the first part of the proof of Lemma 3.4 that the corresponding term is of the stated order, which goes to zero under the second part of Assumption N(ii).
• Proof of (3.6). By simple analytical arguments we obtain the stated convergence.
Proof of Lemma 3.10. The proof of this lemma is based on the version of the central limit theorem given in ([19], p. 275), where the main point is to compute the limit of the normalized variance. Observe that this variance splits into two terms, $\nabla_n^1$ and $\nabla_n^2$. Once again, we use the result of Lemma 3.4 to show that $\nabla_n^2 = o(1)$. Now, concerning the first term $\nabla_n^1$, the continuity of the functions $\bar G$ and $g$ permits us to identify its limit, and from Assumption A1(ii) we obtain the claimed result (5.8).
Let us now prove our asymptotic result. To do that, we introduce the variables $w_{i,n}(x)$ and prove that, for some $\beta > 2$, a Lyapunov-type condition holds. Indeed, set $\psi_{i,n}^{\beta}(x) = E|w_{i,n}(x)|^{\beta}$. Applying the $C_r$-inequality (see [19], p. 155) and then standard arguments, we bound $\psi_n^{\beta}(x)$ by a quantity whose decisive factor is $n^{1-\beta/2}$.

Because $1 - \beta/2 < 0$, we have $\psi_n^{\beta}(x) \to 0$, which implies the required condition. The proof of this lemma is now complete.
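For completeness, the $C_r$-inequality invoked in the proof above is, in its usual form (stated here as a standard fact, not copied from [19]),
$$E\Big|\sum_{i=1}^{m} X_i\Big|^{\beta} \le m^{\beta-1} \sum_{i=1}^{m} E|X_i|^{\beta}, \qquad \beta \ge 1,$$
and in particular $E|X + Y|^{\beta} \le 2^{\beta-1}\big(E|X|^{\beta} + E|Y|^{\beta}\big)$.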