Parametrically guided local quasi-likelihood with censored data

It is widely pointed out in the literature that misspecification of a parametric model can lead to inconsistent estimators and wrong inference. However, even a misspecified model can provide valuable information about the phenomenon under study. This is the main idea behind the development of an approach known in the literature as parametrically guided nonparametric estimation. Due to its promising bias reduction property, this approach has been investigated in different frameworks, such as density estimation, least squares regression and local quasi-likelihood. Our contribution is concerned with parametrically guided local quasi-likelihood estimation adapted to randomly right censored data. The generalization to censored data involves synthetic data and local linear fitting. The asymptotic properties of the guided estimator as well as its finite sample performance are studied and compared with those of the unguided local quasi-likelihood estimator. The results confirm the bias reduction property and show that, using an appropriate guide and an appropriate bandwidth, the proposed estimator outperforms the classical local quasi-likelihood estimator.


Introduction
The concept of quasi-likelihood estimation was proposed by Wedderburn (1974) as a flexible extension of the maximum likelihood estimation method for generalized linear models (GLMs).
The latter, as introduced by Nelder and Wedderburn (1972), relies on strong parametric assumptions about the distribution of the data that can be hard to verify in practice. In such situations, quasi-likelihood estimation may be a suitable alternative, since it relies only on assumptions about the first two moments. Moreover, the quasi-likelihood function has properties similar to those of the classical full log-likelihood function; see McCullagh and Nelder (1989) for more details.
Likelihood and quasi-likelihood provide consistent and powerful estimators if the required assumptions on the data are met. However, a misspecified model can create an important bias in the estimation of the underlying target function. For this reason, nonparametric techniques, which are more robust, have been investigated in many studies. These include Green and Yandell (1985), O'Sullivan et al. (1986), Staniswalis (1989), Hunsberger (1994), and Severini and Staniswalis (1994), to cite just a few examples. More recent contributions include the work of Fan et al. (1995) and Fan and Chen (1999), who investigated local polynomial fitting for likelihood and quasi-likelihood in the context of GLMs, and Chen et al. (2006), who studied local quasi-likelihood for missing data.
Even when the proposed model is misspecified, parametric estimation can provide useful information about the target function. This information can be injected into a nonparametric estimator in order to improve its performance in terms of bias and mean squared error (MSE). In the literature, there exists an attractive method that allows for this, namely parametrically guided nonparametric estimation. In contrast to a traditional semi-parametric method, a parametrically guided estimator is fully nonparametric in the sense that no global parametric structure is imposed on the data. In the complete data case, considerable attention has recently been paid to this approach in the literature. First, Hjort and Glad (1995) introduced the parametrically guided kernel scheme for density estimation. Then, Glad (1998a, 1998b) and Martins-Filho et al. (2008) investigated this method for the mean regression function.
Later, the same approach was extended to GLMs and local quasi-likelihood by Fan et al. (2009). More recently, Fan et al. (2014) applied guided estimation to generalized additive models and Davenport et al. (2015) studied guided estimation for varying coefficient models. These papers established the bias reduction property of the guided nonparametric estimators compared with the unguided ones, without any increase in variance.
There exist three different schemes for parametrically guiding a nonparametric estimator. The first scheme was developed by Hjort and Glad (1995) using a multiplicative correction, which requires a nonzero value for the parametric part, a condition that does not always hold in practice. In the second scheme, the correction is carried out on an additive rather than a multiplicative scale. Such a guided scheme was introduced in kernel regression by Rahman and Ullah (2002) and used later in different frameworks. Finally, the last guided scheme combines both the additive and the multiplicative schemes in a unified family indexed by a calibration parameter that controls the balance between the two corrections. The unified family has the advantage of being more general than the two other corrections. However, the additional calibration parameter needs to be selected, which is not an easy task. In the context of local quasi-likelihood, Fan et al. (2009) studied the three different schemes in detail. For the sake of simplicity, we first restrict our attention to the additive scheme, and then we extend our results to the unified family of corrections. In the following, we give a brief description of the guided additive scheme in the context of local quasi-likelihood; more details are provided in Section 2. Suppose for the moment that we have completely observed i.i.d. data $(Y_i, X_i)$, $i = 1, \ldots, n$, and let $m(x) = E(Y|X = x)$ be the true mean regression function. In classical parametric GLMs, $m(x)$ is modeled linearly using a known link function $g(\cdot)$, that is, $g(m(x)) = \eta(x)$, with $\eta(x) = \theta_0 + \theta_1 x$. The parameter of interest $\theta = (\theta_0, \theta_1)^T$ can be estimated via the likelihood or the quasi-likelihood. However, in practice, the linearity assumption is not met in many situations. In such cases, local quasi-likelihood is more appropriate, since it allows the estimation of $\eta(x)$ without any explicit specification of its form.
In between these two "extreme" approaches, Fan et al. (2009) proposed a guided local quasi-likelihood estimator with the objective of combining the advantages of both parametric and local quasi-likelihood estimators. As stated before, we focus on the additively guided local quasi-likelihood estimator.
The additive scheme starts with a parametric quasi-likelihood estimator which is not necessarily correctly specified. Then, in a second step, this crude parametric approximation is adjusted using a local quasi-likelihood estimator. More formally, let $\eta(x, \hat\theta)$ be a "naive" quasi-likelihood estimator based on $\eta(x, \theta)$, a given, possibly misspecified, parametric model for $\eta(x)$. Fan et al. (2009) proposed to estimate the error term $r_\theta(x) := \eta(x) - \eta(x, \theta)$ using a nonparametric weighted local quasi-likelihood (LQL) estimator that we denote by $\hat r_{\hat\theta}(x)$. The additive parametrically guided local quasi-likelihood (GLQL) estimator is defined by $\hat\eta(x) = \eta(x, \hat\theta) + \hat r_{\hat\theta}(x)$. When the parametric model is properly chosen, $r_\theta$ may be flatter and easier to fit nonparametrically than the original function $\eta$. In this case, the guided local quasi-likelihood estimator should have a smaller MSE than the classical LQL estimator. Otherwise, the nonparametric correction is expected to correct for the misspecification, and there should not be much loss in accuracy for the resulting GLQL estimator compared to the classical LQL estimator.
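The two-step additive scheme can be sketched numerically. The sketch below assumes the identity link with a Gaussian working variance, in which case both quasi-likelihood fits reduce to (weighted) least squares; the function name and the polynomial guide are illustrative choices, not part of the paper's method.

```python
import numpy as np

def guided_estimate(x0, X, Y, h, guide_degree=1):
    """Additively guided local estimate at x0 (identity-link sketch).

    Step 1: fit a (possibly misspecified) global parametric guide.
    Step 2: fit the residuals locally (local linear, Epanechnikov kernel)
            and add the correction back to the guide.
    """
    # Step 1: parametric guide eta(x, theta) -- here a polynomial fit.
    theta = np.polyfit(X, Y, deg=guide_degree)
    guide = np.polyval(theta, X)
    # Step 2: local linear fit of the residuals r = Y - guide, at x0.
    u = (X - x0) / h
    w = np.maximum(0.75 * (1.0 - u**2), 0.0)         # Epanechnikov weights
    Z = np.column_stack([np.ones_like(X), X - x0])   # local linear design
    WZ = Z * w[:, None]
    beta = np.linalg.lstsq(WZ.T @ Z, WZ.T @ (Y - guide), rcond=None)[0]
    return np.polyval(theta, x0) + beta[0]           # guide + correction
```

Even with a crude linear guide, the local correction recovers a nonlinear target; with a well-chosen guide, the residual function is flatter and easier to fit.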
Regression problems in which the response is subject to censoring have been widely studied in the literature. Many investigations have been devoted to parametric regression, among them Buckley and James (1979), Koul et al. (1981), Lai and Ying (1991) and Delecroix et al. (2008). An extensive field of research has been developed for nonparametric regression; see for example Beran (1981), Fan and Gijbels (1994), Heuchenne and Van Keilegom (2007), El Ghouch and Van Keilegom (2008) and Lopez (2011), among others. However, only a few papers extending parametric quasi-likelihood to censored data exist in the literature. The first extension of quasi-likelihood to the right censored data case was established in the framework of partially linear single-index models by Lu and Cheng (2007). In the generalized linear model setting, Yu et al. (2012) and Yu and Peace (2012) proposed different semi-parametric quasi-likelihood estimators in the framework of accelerated failure time models. Note that none of the papers mentioned above has considered a fully nonparametric quasi-likelihood. Thus, one of the main objectives of this paper is to extend the local quasi-likelihood of Fan et al. (1995) to the censored data case.
Regarding parametrically guided nonparametric estimation, as far as we know, except for the recent work of Talamakrouni et al. (2015, 2016), the guided nonparametric approach has never been studied in the context of censored data. A well known challenge in the presence of censoring is that the response is not always available. Consequently, the parametrically guided local quasi-likelihood method cannot be directly applied. In order to address this problem, we first need to transform the data before applying the GLQL. Several transformations have been proposed in the literature. In this work we investigate the transformation proposed by Koul et al. (1981), since it does not require any iterative procedure. The paper is organized as follows. Section 2 explains in detail the different steps of the proposed methodology. Section 3 provides some asymptotic results for the proposed method, while Section 4 illustrates the performance of the proposed estimator via simulation studies.
Finally, some general conclusions are drawn in Section 5. The proofs are given in the Appendix.

2 Model and methodology
Regression techniques are commonly used to describe a relationship between a variable of interest $Y \in \mathbb{R}$ and a covariate $X \in \mathbb{R}$. In a right censored regression framework, the response $Y$ is not directly available. Indeed, in the presence of a censoring variable $C$ one can only observe an i.i.d. random sample $(X_i, T_i, \delta_i)$, $i = 1, \ldots, n$, from $(X, T, \delta)$, where $T = \min(Y, C)$ and $\delta = I(Y \leq C)$. In the following we suppose that, given the covariate $X$, the censoring variable $C$ is independent of the variable of interest $Y$. Set $F(y|x) = P(Y \leq y|X = x)$ and $G(y|x) = P(C \leq y|X = x)$, the conditional distribution functions of $Y$ and $C$ given $X = x$, respectively. Suppose that there exists a known positive function $V(\cdot)$ that relates the conditional mean and the conditional variance of $\phi(Y)$ given $X$ as follows:

$$\mathrm{Var}(\phi(Y)|X = x) = \sigma^2 V(m(x)), \quad \text{with } m(x) = E(\phi(Y)|X = x),$$

where $\phi$ is a known function used to cover various parameters of interest. For example, when $\phi(y) = y 1_{\{y \leq \tau\}}$, for some known $\tau$, we get the truncated mean $m(x) = \int_{-\infty}^{\tau} y \, dF(y|x)$. Our main objective is to estimate $\eta(x) = g(m(x))$, where $g(\cdot)$ is a known link function. Since only the relationship between the conditional mean and the conditional variance is known, the likelihood estimation method cannot be used. In the following, we first introduce the guided local quasi-likelihood for complete data, and then we adapt the method to handle censoring. Wedderburn (1974) defined the quasi-log-likelihood function as any function $Q(\mu, y)$ satisfying

$$\frac{\partial Q(\mu, y)}{\partial \mu} = \frac{y - \mu}{V(\mu)}.$$
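For concreteness, the defining property $\partial Q(\mu, y)/\partial\mu = (y - \mu)/V(\mu)$ can be integrated in closed form for common variance functions. The helper below is a minimal sketch (the function name and the "poisson"/"gamma" labels are ours); additive constants depending on $y$ only are dropped, since they do not affect maximization.

```python
import math

def quasi_loglik(mu, y, variance="poisson"):
    """Quasi-log-likelihood Q(mu, y) satisfying dQ/dmu = (y - mu)/V(mu).

    Only the variance function V is assumed known, not a full density.
    Closed forms for two common variance functions (up to terms in y only):
      V(mu) = mu     ->  Q = y*log(mu) - mu       (Poisson-type variance)
      V(mu) = mu**2  ->  Q = -y/mu - log(mu)      (gamma/exponential-type)
    """
    if variance == "poisson":
        return y * math.log(mu) - mu
    elif variance == "gamma":
        return -y / mu - math.log(mu)
    raise ValueError(variance)
```

A quick numerical differentiation confirms that each closed form satisfies Wedderburn's defining equation.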

Guided local quasi-likelihood for complete data
Assuming that $\eta(x) = \theta_1 + \theta_2 x$, the parameters $\theta_1$ and $\theta_2$ can be estimated by maximizing the parametric quasi-likelihood $\sum_{i=1}^n Q\big(g^{-1}(\theta_1 + \theta_2 X_i), Y_i\big)$, which plays the role of the log-likelihood in the classical GLM. Because the assumption of linearity does not hold in many situations, Fan et al. (1995) proposed a local polynomial quasi-likelihood method to estimate $\eta(\cdot)$ without assuming any specific form for it. The maximum local quasi-likelihood estimator of $\eta(x)$ is obtained by maximizing, with respect to $(\beta_0, \ldots, \beta_p)$,

$$\sum_{i=1}^n Q\Big(g^{-1}\big(\beta_0 + \beta_1 (X_i - x) + \cdots + \beta_p (X_i - x)^p\big), Y_i\Big) K_h(X_i - x),$$

and setting $\hat\eta(x) = \hat\beta_0$, where $K(\cdot)$ is a kernel density, $h \equiv h_n$ is a smoothing bandwidth, and $K_h(\cdot) = h^{-1} K(\cdot/h)$.
Let $\eta(x, \theta)$ be a parametric model which belongs to some family of parameterized functions $\{\eta(x, \theta) : \theta \in \Theta \subset \mathbb{R}^d\}$ and define the parametric maximum quasi-likelihood estimator of $\theta$ as

$$\hat\theta = \arg\max_{\theta \in \Theta} \sum_{i=1}^n Q\big(g^{-1}(\eta(X_i, \theta)), Y_i\big). \quad (2.1)$$

As discussed in the introduction, the parametric estimator given in (2.1) provides some useful and interesting information about the target function $\eta(\cdot)$ that may help us to improve the local quasi-likelihood estimator. To simplify the presentation, we focus on the local linear case ($p = 1$). Under the additive scheme, Fan et al. (2009) proposed to estimate the error term $r_\theta(x) = \eta(x) - \eta(x, \theta)$ by a local linear quasi-likelihood fit. The resulting guided estimator can, equivalently and directly, be derived by maximizing

$$\sum_{i=1}^n Q\Big(g^{-1}\big\{\beta_0 + \beta_1 (X_i - x) + \eta(X_i, \hat\theta) - \eta(x, \hat\theta)\big\}, Y_i\Big) K_h(X_i - x)$$

and taking $\hat\eta(x) = \hat\beta_0$, where $(\hat\beta_0, \hat\beta_1)$ maximizes the above function with respect to $(\beta_0, \beta_1)$.
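A local linear quasi-likelihood fit can be computed by Newton-Raphson on the kernel-weighted score. The sketch below assumes a Poisson-type variance $V(\mu) = \mu$ with canonical log link, so it illustrates the unguided local linear quasi-likelihood of Fan et al. (1995) rather than the guided version; all names are illustrative.

```python
import numpy as np

def local_quasi_poisson(x0, X, Y, h, n_iter=25):
    """Local linear quasi-likelihood estimate of eta(x0) = log m(x0).

    Poisson-type variance V(mu) = mu with canonical log link: the local
    score is sum_i (Y_i - mu_i) z_i K_h(X_i - x0), solved by Newton steps.
    """
    u = (X - x0) / h
    w = np.maximum(0.75 * (1.0 - u**2), 0.0) / h      # Epanechnikov K_h
    Z = np.column_stack([np.ones_like(X), X - x0])    # local linear design
    beta = np.array([np.log(Y[w > 0].mean() + 1e-9), 0.0])
    for _ in range(n_iter):
        mu = np.exp(Z @ beta)
        score = Z.T @ (w * (Y - mu))                  # weighted quasi-score
        hess = (Z * (w * mu)[:, None]).T @ Z          # weighted information
        beta = beta + np.linalg.solve(hess, score)
    return beta[0]                                    # estimate of eta(x0)
```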

Guided local quasi-likelihood and censoring
In the presence of censoring, as $E(\phi(T)|X = x) \neq m(x)$, one cannot directly use the observed data to estimate $\eta(x) = g(m(x))$. In order to overcome this problem, we will use the synthetic data approach. In this approach, the observed response $T$ is substituted by a synthetic response $Y^*$ such that, under the conditional independence of $Y$ and $C$ given $X$, $E(Y^*|X = x) = m(x)$. Different transformations satisfying this equality exist in the literature; see for instance Leurgans (1987) and Zheng (1987), among others. We limit ourselves to the transformation of Koul et al. (1981), defined by

$$Y^* = \frac{\delta\, \phi(T)}{1 - G(T|X)}. \quad (2.2)$$

This transformation is not directly applicable in practice, since it depends on $G(y|x)$, the conditional distribution of $C$ given $X = x$, which is unknown. An estimator of this function was proposed by Beran (1981) and is given by

$$\hat G(y|x) = 1 - \prod_{T_i \leq y,\, \delta_i = 0} \left(1 - \frac{w_i(x)}{\sum_{j=1}^n w_j(x) 1_{\{T_j \geq T_i\}}}\right),$$

where $w_i(x) = K_0\big((x - X_i)/h_0\big) \big/ \sum_{j=1}^n K_0\big((x - X_j)/h_0\big)$, $i = 1, \ldots, n$, are the Nadaraya-Watson weights, $K_0$ is a kernel density function and $h_0$ is a bandwidth parameter. Note that if $w_i(x) = n^{-1}$, $i = 1, \ldots, n$, then $\hat G$ reduces to the well known Kaplan-Meier estimator. Beran's estimator was studied by many authors, among them Doksum and Yandell (1982), Dabrowska (1987), Gonzalez-Manteiga and Cadarso-Suarez (1994) and Van Keilegom and Veraverbeke (1997). We define the synthetic response $\hat Y^*$ by plugging Beran's estimator into the transformation (2.2) as follows:

$$\hat Y_i^* = \frac{\delta_i\, \phi(T_i)}{1 - \hat G(T_i|X_i)}, \quad i = 1, \ldots, n. \quad (2.3)$$

Following Fan et al. (2009), we define our parametrically guided local quasi-likelihood estimator of $\eta$, based on the synthetic sample $(\hat Y_i^*, X_i)$, $i = 1, \ldots, n$, as $\hat\eta_{\hat G, \hat\theta}(x) = \hat\beta_0$, where $(\hat\beta_0, \hat\beta_1)$ maximizes

$$\sum_{i=1}^n Q\Big(g^{-1}\big\{\beta_0 + \beta_1 (X_i - x) + \eta(X_i, \hat\theta) - \eta(x, \hat\theta)\big\}, \hat Y_i^*\Big) K_h(X_i - x) \quad (2.4)$$

with respect to $\beta = (\beta_0, \beta_1)^T$, and $\hat\theta$ is a pseudo parametric quasi-likelihood estimator of $\theta$ adapted to censored data. The estimation approach that we adopt will be discussed in detail in Section 3.2. Note that the parametrically guided local quasi-likelihood estimator given in (2.4) raises new challenges when compared to the equivalent estimator with completely observed data, since the synthetic observations $\hat Y_i^*$, $i = 1, \ldots, n$, defined by (2.3) are estimated using the whole sample.
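The synthetic data step can be illustrated in the unconditional case $w_i(x) = n^{-1}$, where, as noted above, Beran's estimator reduces to the Kaplan-Meier estimator of the censoring distribution. A minimal sketch with $\phi(t) = t$ (the function name is ours):

```python
import numpy as np

def koul_synthetic(T, delta):
    """Koul-Susarla-Van Ryzin synthetic responses Y*_i = d_i T_i / (1 - G(T_i-)).

    G is estimated by the Kaplan-Meier estimator of the *censoring*
    distribution (Beran's estimator with uniform weights); in the
    covariate-dependent case, Nadaraya-Watson weights replace 1/n.
    """
    n = len(T)
    order = np.argsort(T)
    d_s = delta[order]
    # KM for the censoring variable: its "events" are the censored points.
    surv, s = np.ones(n), 1.0
    for i in range(n):
        if d_s[i] == 0:                  # censoring event at sorted rank i
            s *= 1.0 - 1.0 / (n - i)
        surv[i] = s                      # estimate of 1 - G just after T_(i)
    one_minus_G = np.ones(n)
    for j, i in enumerate(order):        # 1 - G(T_i-): value just before T_i
        one_minus_G[i] = surv[j - 1] if j > 0 else 1.0
    return delta * T / np.maximum(one_minus_G, 1e-12)
```

Under independent censoring, the synthetic responses have (approximately) the same conditional mean as the uncensored responses, which is exactly the property needed before applying the GLQL.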
Remark 2.1. We are not aware of any results in the literature concerning the estimation of a general misspecified parametric model using quasi-likelihood under censoring. We also note that using a linear guide reduces the estimator to the classical local quasi-likelihood estimator of $\eta$, which means that our GLQL estimator $\hat\eta_{\hat G, \hat\theta}(x)$ is a generalization of the classical LQL estimator; the latter can be obtained by maximizing (2.4) with $\eta \equiv 0$.

Theoretical properties
In order to show the bias reduction property of our new estimator, we investigate in this section the asymptotic distribution of $\hat\eta_{\hat G, \hat\theta}(\cdot)$. First, we derive in Theorem 3.1 the asymptotic properties of $\hat\eta_{\hat G}(\cdot)$, an estimator of $\eta(\cdot)$ guided by a given non-stochastic approximation.
Then, in Theorem 3.2 we generalize the results to cover the case of a data-driven parametric guide.

The model with non-random guide
Let $\bar\eta(x)$ be a non-stochastic guide that approximates the true function $\eta(x)$, and let $\hat\beta = (\hat\beta_0, \hat\beta_1)$ maximize the following function:

$$\sum_{i=1}^n Q\Big(g^{-1}\big\{\beta_0 + \beta_1 (X_i - x) + \bar\eta(X_i) - \bar\eta(x)\big\}, \hat Y_i^*\Big) K_h(X_i - x).$$

Define the corresponding GLQL estimator as $\hat\eta_{\hat G}(x) = \hat\beta_0$. In the following, we provide the assumptions required for the main results.

A2. The function $\phi$ is bounded and vanishes outside the interval $(-\infty, \tau]$.
A3. The functions H j (y|x) = P (T ≤ y, δ = j|X = x), j = 0, 1, have four derivatives with respect to x. Furthermore, the derivatives are bounded uniformly for all y ≤ τ and x ∈ S X .
A5. i. $K$ is a symmetric probability density function with compact support, say $S_K = [-1, 1]$.
ii. $K_0$ is a symmetric, twice continuously differentiable probability density function with compact support $S_{K_0}$. iii.
B1. $q_2(x, y) < 0$ for all $x \in S_X$ and $y \leq \tau$.
B2. The function $\sigma^2$

Condition A2 is related to the inconsistency of Beran's estimator in the right tail of the distribution of $Y$. Conditions A1, A3 and A4-A6 are usually required in kernel-based estimation with censored data. Finally, assumptions A7 and B1-B3 are similar to the assumptions in Fan et al. (1995).

Theorem 3.1. Suppose that Assumptions 3.1 and 3.2 hold. Then,

Remark 3.1. The bias produced by Beran's estimator is bounded by $(\log n / (n h_0))^{1/2}$. This extra term vanishes when the bandwidths are chosen such that $h_0/(h \log n) \to \infty$. Therefore, there is no loss of accuracy when one replaces the response by synthetic data, provided that the bandwidth for Beran's estimator is asymptotically larger than the bandwidth used in the local linear fit. This fact has also been pointed out by Talamakrouni et al. (2015) in the context of guided nonparametric regression with censored data. The bias term $B(x)$ is similar to that in the fully observed data case and reveals the effect of the parametric guide. If the guide, say $\bar\eta$, is chosen such that $|\eta''(x) - \bar\eta''(x)| \leq |\eta''(x)|$, then the bias of the GLQL estimator will be smaller than that of the classical LQL estimator. If the second derivatives of the parametric guide and of the true function are equal, then the bias term $B(x)$ vanishes. Regarding the variance, there is no difference compared with the classical LQL under censoring. The only difference appears when one compares the variance term of the GLQL estimator in the presence and in the absence of censoring: the term $\sigma^2$ differs, and this is due to the synthetic data. Note that, if the parametric guide is chosen to be constant, then the GLQL estimator reduces to the classical LQL estimator.
Therefore, the result of our Theorem 3.1 is a generalization of Theorem 1.a (for p = 1, r = 0) in Fan et al. (1995) to right censored data.

The model with an estimated guide
In the previous section, Theorem 3.1 covered the simple case of a fixed guide. However, in practice, the guide needs to be estimated. In the following, we consider the case where the parametric guide $\eta(x, \theta)$ is obtained from a first-stage estimation procedure. Following Fan et al. (2009), we denote by $f(x, y) = f_X(x) \exp\{Q(g^{-1}(\eta(x)), y)\}$ the true unknown joint density of $(X, Y)$ and by $f(x, y; \theta) = f_X(x) \exp\{Q(g^{-1}(\eta(x, \theta)), y)\}$ the proposed parametric joint density.
Define $\theta^* \in \Theta \subset \mathbb{R}^d$ as the value of $\theta$ which maximizes the function

$$\int_\Delta \log f(x, y; \theta)\, dF(x, y),$$

where $F(x, y)$ is the joint distribution function of $(X, Y)$, and the restriction to $\Delta = S_X \times (-\infty, \tau]$ is needed because the right tail of the distribution $F(x, y)$ cannot be estimated consistently when the response $Y$ is censored. $\theta^*$ is the parameter value that minimizes the Kullback-Leibler distance between the true joint density $f(x, y)$ and the parametric joint density $f(x, y; \theta)$, that is,

$$\theta^* = \arg\min_{\theta \in \Theta} \int_\Delta \log\left\{\frac{f(x, y)}{f(x, y; \theta)}\right\} dF(x, y).$$

If the parametric model is correct, i.e. there exists $\theta_0 \in \Theta$ such that $f(x, y) = f(x, y; \theta_0)$, then $\theta_0 = \theta^*$.
In the spirit of Suzukawa et al. (2001), we estimate $\theta$ by $\hat\theta$, the maximizer of a suitable analogue of (2.1), which we define as

$$\hat\theta = \arg\max_{\theta \in \Theta} \int_\Delta \log f(x, y; \theta)\, d\hat F(x, y), \quad (3.7)$$

where $\hat F$ is an estimator of $F$ satisfying the following assumptions, where $\Sigma$ is a nonnegative-definite matrix and $\nabla_\theta^r \Phi(x, y; \theta) = \partial^r \Phi(x, y; \theta)/\partial\theta^r$ for a twice differentiable function $\theta \mapsto \Phi(x, y; \theta)$ and $r = 0, 1, 2$.
When the data are completely observed, the estimator $\hat F$ may be replaced by the usual bivariate empirical distribution function $F_n(x, y) = n^{-1} \sum_{i=1}^n 1_{\{X_i \leq x,\, Y_i \leq y\}}$. In this case, the pseudo quasi-likelihood defined by (3.7) reduces to (2.1), meaning that our approach is more general.
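The reduction of (3.7) to (2.1) rests on the fact that integration with respect to $F_n$ is a sample average. A two-line illustration (function names are ours):

```python
import numpy as np

def ecdf2(x, y, X, Y):
    """Bivariate empirical distribution F_n(x, y) = n^-1 sum 1{X_i<=x, Y_i<=y}."""
    return np.mean((X <= x) & (Y <= y))

def integral_wrt_ecdf(Phi, X, Y):
    """Integral of Phi(x, y) w.r.t. F_n: the sample average of Phi(X_i, Y_i)."""
    return float(np.mean([Phi(xi, yi) for xi, yi in zip(X, Y)]))
```

Applied to $\Phi(x, y) = \log f(x, y; \theta)$, the second function turns the integral criterion into the familiar sum over observations.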
In the censored data framework, there have been few proposals for estimating $F(x, y)$ in the literature. For example, Lopez (2011) developed an estimator of $F(x, y)$ satisfying Assumption 3.3 (see Theorem 3.1 and Theorem 3.6 in Lopez (2011)), given in (3.8). Another interesting approach was introduced by Van Keilegom and Akritas (1999). Their estimator is constructed through an integrated version of Beran's estimator, as follows:

$$\hat F_{VA}(x, y) = \int 1_{\{u \leq x\}} \hat F(y|u)\, dF_n(u) = \frac{1}{n} \sum_{i=1}^n 1_{\{X_i \leq x\}} \hat F(y|X_i),$$

where $F_n(x)$ is the empirical distribution function of $X$ and $\hat F(y|x)$ is Beran's estimator of $F(y|x) = P(Y \leq y|X = x)$. We note that both estimators can be used in practice. However, to the best of our knowledge, Assumption 3.3 has not yet been investigated for $\hat F_{VA}$. Therefore, for the sake of consistency, we only use the estimator of Lopez (2011) in our simulation studies. Next, we give additional conditions that are also needed.
D1. $\eta(x, \theta)$ belongs to a parametrically indexed class of functions defined by the following characteristics: 1. $\theta \in \Theta$, where $\Theta$ is a compact subset of $\mathbb{R}^d$.
D2. The function log f (x, y; θ) is twice continuously differentiable with respect to θ.
Conditions D2, D3 and D4 are classical conditions in the uncensored case that allow differentiation under the integral sign. The following proposition provides the weak consistency and the asymptotic normality of the estimator $\hat\theta$.

Note that the results of Proposition 3.1 establish the $\sqrt{n}$-consistency of the estimator $\hat\theta$, that is, $\sqrt{n}(\hat\theta - \theta^*) = O_p(1)$. Given this result and some additional conditions, the next theorem states that there is no loss in accuracy when the parametric guide is estimated.
Theorem 3.2. Suppose that Assumptions 3.3 and 3.4 hold. Then, under the assumptions of Theorem 3.1, we have

Comparing this last result with the result of Theorem 3.1, we notice that the estimation of the parameter $\theta^*$ does not affect the asymptotic bias or the asymptotic variance. A crucial issue that arises in any nonparametric method is the choice of the bandwidth parameters. From Theorem 3.2, the asymptotic mean integrated squared error is given by (3.10). If $\eta''(x) - \eta''(x, \theta^*) = 0$, then $B(x, \theta^*) = 0$. In such a case, one can choose an arbitrarily large bandwidth so that the variance is reduced to its minimum possible value, which is impossible in a fully nonparametric framework (except for a linear $\eta$). If $\eta''(x) - \eta''(x, \theta^*) \neq 0$, then minimizing (3.10) with respect to $h$ gives the theoretical optimal bandwidth (3.11). This last expression indicates that, if the parametric guide is chosen so that its second derivative $\eta''(x, \theta^*)$ is close to the second derivative of the true function, $\eta''(x)$, then the optimal bandwidth for the GLQL estimator will be larger than the optimal bandwidth for the classical LQL estimator. This also allows the variance to be reduced compared with the classical LQL estimator, a fact widely observed in our simulation studies. In practice, expression (3.11) cannot be used directly, since it depends on a number of unknown quantities. Fan and Gijbels (1996) (see Section 4.9) and Fan et al. (1995) proposed some guidelines for the selection of the bandwidth $h$ based on the bias-variance tradeoff. Their procedures can easily be extended to the censored data framework by simply substituting the censored response $Y_i$ by the synthetic data $\hat Y_i^*$. Finally, the bandwidth for Beran's estimator can be chosen using, for example, the plug-in method (see Dabrowska (1992)) or the bootstrap method investigated by Van Keilegom and Veraverbeke (1997).
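Although expression (3.11) is not reproduced here, the calculus behind it is the standard AMISE tradeoff. As a sketch in our own notation (not the paper's exact expression), if the asymptotic MISE has the generic form $h^4 \int B(x, \theta^*)^2\,dx + (nh)^{-1} \int V(x)\,dx$, then

```latex
\frac{d}{dh}\left\{ h^4 \int B(x,\theta^*)^2\,dx + \frac{1}{nh}\int V(x)\,dx \right\} = 0
\;\Longrightarrow\;
h_{\mathrm{opt}} = \left[ \frac{\int V(x)\,dx}{4\int B(x,\theta^*)^2\,dx} \right]^{1/5} n^{-1/5},
```

so a smaller integrated squared bias (i.e. a better guide) yields a larger optimal bandwidth, consistent with the discussion above.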

Extension to unified family of corrections
As mentioned in the introduction, we investigated the additive correction in order to simplify our presentation. However, in addition to the additive scheme, Martins-Filho et al. (2008) proposed a unified family of corrections in the uncensored data case. In the following we give some guidelines allowing their proposal to be generalized to our framework. Starting from a parametric model $\eta(x, \theta)$, the basic idea of guided estimation can be generalized using the following more general identity:

$$\eta(x) = \eta(x, \theta) + r_{\theta,\alpha}(x)\, \eta(x, \theta)^\alpha, \quad \text{where } r_{\theta,\alpha}(x) = \frac{\eta(x) - \eta(x, \theta)}{\eta(x, \theta)^\alpha} \text{ and } \alpha \geq 0.$$

We propose to estimate the correction factor $r_{\theta,\alpha}(x)$ by $\hat r_{\hat\theta,\alpha}(x) = \hat\beta_0$, where $(\hat\beta_0, \hat\beta_1)$ is the maximizer of

$$\sum_{i=1}^n Q\Big(g^{-1}\big[\eta(X_i, \hat\theta) + \{\beta_0 + \beta_1 (X_i - x)\}\, \eta(X_i, \hat\theta)^\alpha\big], \hat Y_i^*\Big) K_h(X_i - x).$$

Therefore, the extended guided local quasi-likelihood estimator is given by $\hat\eta_{\hat G, \hat\theta, \alpha}(x) = \eta(x, \hat\theta) + \hat r_{\hat\theta,\alpha}(x)\, \eta(x, \hat\theta)^\alpha$. Similarly as in Section 2.1, the extended guided estimator $\hat\eta_{\hat G, \hat\theta, \alpha}(x)$ can be defined directly as the first component of the maximizer of the corresponding criterion with respect to $\beta = (\beta_0, \beta_1)$. All the results established before can be generalized to the guided estimator based on the unified family of corrections; the generalization of the proof is straightforward and is omitted here. Theorem 3.3 generalizes the result of Theorem 3.2.
Note that the additive correction is a special case of the unified family with $\alpha = 0$. The choice of the parameter $\alpha$ was investigated by Fan et al. (2009). However, using the best $\alpha$ does not enhance the performance considerably compared with the additive correction. Therefore, to simplify our simulation studies, we investigate only the additive correction.

Simulation results
This section is concerned with the evaluation of the finite sample performance of the GLQL estimator. To this end, we conduct two Monte Carlo simulation studies. In the first study, a Poisson model is investigated under right censoring. Such a model is widely used in studies dealing with quasi-likelihood and discrete responses; see for example Fan et al. (2009) and Davenport et al. (2015). Then, an exponential model is considered to cover the case of continuous responses. Our target function is $\eta(x) = g\big(\int_0^\tau y\, dF(y|x)\big)$, where $g$ is the canonical link, $\tau = \inf_x\{\tau_x\}$ and $\tau_x$ is the 99.99% upper quantile of $H(y|x) = P(T \leq y|X = x)$.
The parametric guides are estimated by maximizing the pseudo quasi-likelihood given in (3.7), combined with the estimator (3.8) proposed by Lopez (2011). Throughout the simulations we use local linear fitting and the Epanechnikov kernel for both $K_0$ and $K$. To reduce computation time, we first selected the values of the bandwidths $h_0$ and $h$ by minimizing the average mean squared error (MSE) over a small number of simulations. Then, we applied both the guided and the traditional LQL to 1000 other simulated data sets, using the selected "optimal" bandwidths for each method.
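The bandwidth search described above amounts to minimizing a Monte Carlo estimate of the average MSE over a grid of candidate bandwidths. A generic sketch (all argument names are placeholders, not the paper's notation):

```python
import numpy as np

def select_bandwidth(estimator, truth, grid_x, h_grid, simulate, n_sims=50):
    """Pick the bandwidth minimizing the Monte Carlo average MSE on a grid.

    `simulate()` returns one data set (X, Y); `estimator(x0, X, Y, h)`
    returns a point estimate; `truth(x)` is the known target, available
    only because the data are simulated.
    """
    mse = []
    for h in h_grid:
        errs = []
        for _ in range(n_sims):
            X, Y = simulate()
            errs.extend((estimator(x0, X, Y, h) - truth(x0)) ** 2
                        for x0 in grid_x)
        mse.append(np.mean(errs))          # Monte Carlo average MSE for h
    return h_grid[int(np.argmin(mse))]
```

The same routine serves both the guided and the unguided estimator; each method simply receives its own "optimal" bandwidth.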

Poisson model
In this model, the covariate X is generated from a uniform distribution over the interval [−2, 2].
As stated before, to select the bandwidths we repeat the simulation 200 times. Figure 1 shows how the squared bias, the variance and the MSE change with the bandwidth $h$, for sample size $n = 250$ and a censoring rate of 20%. As established in the asymptotic results, the bias is substantially reduced for the three guided estimators compared with the unguided estimator, while the variance remains unchanged or is slightly reduced, especially when a large bandwidth is used. We also note that when the appropriate (sinusoidal) guide is used, the bias of the GLQL estimator is almost zero. This allows us to choose a larger bandwidth and thus to reduce the variance substantially. Now, using the selected bandwidths, we compute the different estimators 1000 times. The squared bias ($\mathrm{Bias}^2 \times 10^3$), the variance ($\mathrm{Var} \times 10^3$) and the empirical mean squared error ($\mathrm{MSE} \times 10^3$), as well as the selected bandwidths, are given in Table 1 for each setting. Generally speaking, the results show that the GLQL estimators have lower MSE than the classical LQL, even if the parametric guide is not completely correct. As expected, the best results are obtained when the guide is correctly specified, namely with the sinusoidal guide. The selected bandwidths for the GLQL estimators are larger than or equal to those for the classical LQL. Overall, we can say that the GLQL estimator considerably outperforms the classical LQL estimator. As expected, increasing the sample size improves the quality of all the estimators in terms of MSE, while increasing the censoring rate affects the results negatively. The sample size is $n = 250$ and the proportion of censoring is $p = 0.20$.

Exponential model
This section addresses the case of a continuous response. Given $X = x$, the response $Y$ is generated from an exponential distribution with parameter $\eta(x) = (0.5 x^2 + 1) + a \sin^2(2\pi x)$, where $a = 0, 0.1, 0.3, 0.5$, while the covariate $X$ is uniformly distributed on $[0, 4]$. The censoring variable $C$ is independent of $Y$ given $X = x$ and is also generated from an exponential distribution, with parameter $\eta(x)/2$, which leads to a censoring rate of approximately 33.4%. Regarding the parametric guide, we consider a second order polynomial guide $\eta(x, \theta) = \theta_1 + \theta_2 x + \theta_3 x^2$.
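The data-generating mechanism of this section is easy to reproduce. The sketch below takes $\eta(x)$ as the rate parameter (an assumption consistent with the stated censoring rate, since $P(C < Y) = (\eta/2)/(\eta + \eta/2) = 1/3$ for independent exponentials); the function name is ours.

```python
import numpy as np

def simulate_exponential_model(n, a=0.3, rng=None):
    """One censored sample from the exponential model of this section.

    Y | X=x ~ Exp(rate eta(x)) and C | X=x ~ Exp(rate eta(x)/2), so the
    censoring probability is (eta/2)/(eta + eta/2) = 1/3 for every x.
    """
    if rng is None:
        rng = np.random.default_rng()
    X = rng.uniform(0.0, 4.0, n)
    eta = (0.5 * X**2 + 1.0) + a * np.sin(2 * np.pi * X) ** 2
    Y = rng.exponential(1.0 / eta)       # numpy parameterizes by the mean
    C = rng.exponential(2.0 / eta)       # rate eta/2 -> mean 2/eta
    T = np.minimum(Y, C)
    delta = (Y <= C).astype(int)
    return X, T, delta
```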
The parameter $a$ allows us to control the difference between the true function and the parametric guide. For the $j$-th replication, the GLQL estimator is computed on a grid of 10 points; with $b_i$ and $v_i$ computed at the $i$-th grid point, we calculate $\bar B^2 = 10^{-1} \sum_{i=1}^{10} b_i^2$, the average squared bias, $\bar V = 10^{-1} \sum_{i=1}^{10} v_i^2$, the average variance, and $\overline{MSE} = \bar B^2 + \bar V$, the average mean squared error. The obtained results are summarized in Table 2. When the guide is correct ($a = 0$), the GLQL estimator clearly outperforms the LQL estimator. In fact, in this case, the average squared bias is approximately reduced by half. For $a = 0.1, 0.3, 0.5$, even if the parametric guide is not correctly specified, the GLQL estimator behaves better than the classical LQL estimator. Regarding the variance, the guided estimator generally has a smaller variance, except for the case $a = 0.5$, where we observe a slightly larger variance for the GLQL estimator. Finally, as noticed in the first example, the bandwidth selected for the GLQL method is generally larger than the one selected for the classical LQL method.

Conclusion

Thanks to its bias reduction property, parametrically guided nonparametric estimation is being investigated in more and more areas of statistics. The application of the guided nonparametric method to density estimation, nonparametric regression, local quasi-likelihood, additive models and, very recently, varying coefficient models has revealed an improved performance of the guided estimator compared with the classical nonparametric estimator. However, most of these investigations are based on completely observed data.
In this paper, we focused on the adaptation of parametrically guided local quasi-likelihood estimation to the censored data case. To deal with censoring, we considered the synthetic data approach. We investigated the simplest guided scheme, which is based on the additive correction, and we also generalized the asymptotic results to a unified family of additive-multiplicative corrections. Our results provide a generalization to the censored data case of the results of both Fan et al. (1995) and Fan et al. (2009). The asymptotic results confirm the bias reduction property of the guided local quasi-likelihood estimator in the presence of censoring. They also show that, when an optimal bandwidth and an appropriate parametric guide are used, the variance can be reduced as well. Our finite sample simulations covered both discrete and continuous responses. The simulation results corresponded quite closely to the theoretical results and showed that the guided local quasi-likelihood estimator outperforms the unguided local quasi-likelihood estimator in terms of bias and mean squared error.
In view of conditions A1 and A7, for $1 \leq i, j \leq 2$, we have

The above supremum tends to zero in probability by Proposition 4.3 in Van Keilegom and Akritas (1999), and the empirical sum is bounded in probability by assumptions A1 and A5. Hence, $B_{n,\hat G} - B_{n,G} = o_p(1)$. (6.12)

Now, note that $(B_{n,G})_{ij} = (E B_{n,G})_{ij} + O_p\big(\mathrm{Var}\{(B_{n,G})_{ij}\}^{1/2}\big)$. Since $q_2$ is linear in $y$ and using A2, A5 and A7, we obtain that

In view of Assumption 3.1, we have

The result of Lemma 6.1 is now a direct consequence of (6.12) and (6.13).
Finally, it suffices to check the Lyapunov condition. Let $c \in \mathbb{R}^2$. Based on arguments similar to those used to develop (6.16), we can easily show that

$$\{c^T \mathrm{Var}(V_{n,G})\, c\}^{-3/2} \sum_{k=1}^n E\big|c^T v_{G,k} - E(c^T v_{G,k})\big|^3 = O_p\big((nh)^{-1/2}\big),$$

where $v_{G,k} = q_1\big(\eta(X_k) + \bar\eta(x, X_k),\, Y_k^*\big)\, X_k\, K\big(\frac{X_k - x}{h}\big)$. The result of Lemma 6.2 is now a direct consequence of the Cramér-Wold device.
Proof of Proposition 3.1.
2. In view of Corollary 5.8 in Bartle (1966), conditions D2, D3 and D4 ensure that differentiation and integration can be interchanged. Since $\Omega = \Omega(\theta^*)$ is non-singular by condition D6, and using Assumption 3.3, we get

Therefore, the second point follows directly from the first point together with Theorem 3.1 in Newey and McFadden (1994).