Robust estimation for varying-coefficient partially linear measurement error model with auxiliary instrumental variables

Abstract: We study the varying-coefficient partially linear model when some linear covariates are not observed but their auxiliary instrumental variables are available. Combining the calibrated error-prone covariates with modal regression, we present a two-stage efficient estimation procedure that is robust against outliers and heavy-tailed error distributions. Asymptotic properties of the resulting estimators are established. The performance of the proposed procedure is illustrated through numerous simulations and a real example, and the results confirm that the proposed methods are satisfactory.


Introduction
The varying-coefficient partially linear model is one of the most popular semiparametric regression models. It takes the form

Y = X^T Θ + Z^T α(U) + ε, (1.1)

where Y is the response variable, X, Z and U are the associated covariates, Θ is a p-dimensional vector of unknown parameters, α(·) is a q-dimensional vector of nonparametric functions, and ε is the model error with E(ε|X, Z, U) = 0 and Var(ε|X, Z, U) = σ². By combining the flexibility of nonparametric models with the interpretability of parametric models, model (1.1) is quite general and covers many important models as special cases, such as the linear model, the varying-coefficient model, the partially linear model and others. However, owing to the limited accuracy of instruments, imperfect measurement techniques, and high monetary or time costs, it is often impossible to observe accurate data in experiments; instead one observes data with errors-in-variables (EV) [1]. Simply ignoring the measurement error, known as the naive method, leads to unreliable statistical inference, such as biased estimators, loss of efficiency, and reduced power of hypothesis tests. The treatment of measurement errors differs according to the type of error in the variables.
Early research focused on simple additive measurement errors in the linear component. Instead of the true variable X, we observe η with η = X + e, where the measurement error e is independent of (X, Z, U, ε). You and Chen [2] proposed a modified profile least squares procedure and Wang et al. [3] applied empirical likelihood inference to study model (1.1) with measurement errors in the linear part. On the other hand, measurement error may occur in the nonparametric part of model (1.1); in that case the surrogate variable η is observed as η = Z + e with measurement error e. Feng and Xue [4] proposed local bias-corrected restricted profile least squares estimators, and Fan et al. [5] used auxiliary information to construct empirical log-likelihood ratios for both the parameters and the nonparametric functions. Note that the above literature assumes a simple additive relationship between the unobservable variable and the observed EV data. Furthermore, the variance of the measurement error must be known in advance for the statistical inference. When the error variance is unavailable in practice, it can be replaced by an estimator obtained from repeated measurements of the observed data η.
However, the additive measurement error structure may not capture the complexity of experimental data, and sometimes there are no repeated observations from which to estimate the variance of the measurement error. In this paper, we do not specify any model structure for the measurement error and do not require knowledge of its variance; repeated measurement data are not necessary either. We assume that some variables are not observed directly, but auxiliary information is available to recover the true variable. Specifically, the unobserved variable and the observed surrogate variable with measurement error are connected through a nonparametric structure involving an instrumental variable. Let X = (ξ^T, W^T)^T, where ξ is a p_1 × 1 vector that is not observed directly and W is the vector of the remaining exactly observed components. We assume that the true variable ξ is related to the observed surrogate η and the auxiliary variable V through the nonparametric structure

ξ = E(η|V) = ξ(V). (1.2)

Under assumption (1.2), model (1.1) with error-prone linear covariates becomes

Y = ξ(V)^T β + W^T θ + Z^T α(U) + ε, η = ξ(V) + e, (1.3)

where Θ = (β^T, θ^T)^T and e is the measurement error with E(e|V) = 0 and covariance matrix Σ_e. Therefore, the error structure of ξ is a special case of the additive error models. For simplicity, we assume the auxiliary variable V is scalar. This measurement error structure, introduced by Zhou and Liang [6], is more general and includes the denoised linear model, the rational expectation model and others. Zhou and Liang [6] proposed a regression calibration technique to reduce the bias due to mismeasurement and then developed a profile least-squares estimation procedure for the parametric and nonparametric components. This structure of error-prone covariates has also been studied extensively for other semiparametric models. For varying-coefficient models with error-prone covariates, Zhao and Xue [7] constructed confidence intervals for the varying-coefficient functions
using instrumental-variable-based empirical likelihood inference, and Xu et al. [8] presented an instrumental variable type local polynomial estimator. Zhang et al. [9] studied estimation and variable selection for the partially linear single-index model when some linear covariates are not observed but their ancillary variables are available. Huang and Ding [10] studied a new partially linear errors-in-variables model with error-prone covariates in the parametric part and mismeasured covariates in the nonparametric part simultaneously. Sun et al. [11] developed an estimation procedure for the semiparametric varying-coefficient partially linear model when both the response and part of the covariates are measured with error through functions of auxiliary variables.
In this paper, we propose a two-stage estimation procedure for model (1.3). In the first stage, we correct the bias of the error-prone covariates by using the ancillary information and the local linear technique; that is, the covariates are calibrated with the instrumental variables. In the second stage, instead of profile least squares estimation, we propose a robust estimation procedure. It is well known that the least squares technique can be sensitive to outliers and inefficient under many non-normal error distributions. Modal regression, introduced by Yao et al. [12], is a powerful tool for revealing the underlying relationship between the response and the covariates in the presence of outliers or non-normal errors, and it has been applied successfully to various semiparametric models (Zhao et al. [13], Yang and Yang [14], Yang et al. [15], Lv et al. [16], Yu et al. [17], among others). The technique has several appealing properties: it is easy to implement, robust to outliers, and fully asymptotically efficient under certain non-normal error distributions. Motivated by this, we adopt modal regression in the second stage to estimate the parameters and the nonparametric functions.
The rest of this paper is organized as follows. Section 2 introduces a two-stage robust instrumental-variable-based modal regression estimator, and Section 3 establishes its theoretical properties. In Section 4, we discuss bandwidth selection and the estimation algorithm. Simulation studies evaluating the performance of the proposed procedure are reported in Section 5, and a real data analysis illustrating its effectiveness is given in Section 6. We make concluding remarks in Section 7 and collect the proofs of the main theorems in Section 8.

Robust estimation procedure
In this section, we propose a two-stage robust estimation procedure for model (1.3). Firstly, local linear smoothing is adopted to calibrate the variable ξ using the auxiliary instrumental variables η and V. Secondly, based on B-spline basis functions and modal regression, the final estimators of the parametric and nonparametric components are obtained.

Covariate calibration
Suppose {(Y_i, ξ_i, W_i, Z_i, U_i, η_i, V_i)}_{i=1}^n is an independent and identically distributed sample from (Y, ξ, W, Z, U, η, V). The varying-coefficient partially linear model with error-prone linear covariates then takes the form

Y_i = ξ_i^T β + W_i^T θ + Z_i^T α(U_i) + ε_i, i = 1, . . ., n. (2.1)

Firstly, we need to calibrate the covariate ξ_i, which is not observed in model (2.1). Let η_ik be the kth entry of the vector η_i, i = 1, . . ., n. To estimate ξ_k(v), the kth component of ξ(v), we employ the local linear smoothing technique [18] and minimize

Σ_{i=1}^n [η_ik − a_k − b_k(V_i − v)]^2 L_b(V_i − v)

over (a_k, b_k), where L_b(·) = L(·/b)/b is a kernel function with bandwidth b; the minimizing intercept â_k gives the local linear estimator ξ̂_k(v). Therefore, the unobserved covariate ξ_i can be replaced by its local linear estimate ξ̂(V_i), abbreviated as ξ̂_i. The robust estimators of the parameters and the nonparametric functions are then constructed through the following procedure.
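As a concrete illustration, the calibration step above can be sketched in a few lines. The helper below is ours, not the paper's code; it implements a local linear estimator of ξ(v) = E(η|V = v) at each sample point, using the Epanechnikov kernel that the simulations later adopt:

```python
import numpy as np

def local_linear_calibrate(eta, V, b):
    """Local linear estimate of xi(v) = E(eta | V = v), evaluated at each V_i.

    eta : (n, p1) array of surrogate observations
    V   : (n,) array of scalar instrumental variables
    b   : bandwidth
    Uses the Epanechnikov kernel L(t) = 0.75 * (1 - t^2)_+.
    """
    n, p1 = eta.shape
    xi_hat = np.empty((n, p1))
    for i, v in enumerate(V):
        t = (V - v) / b
        w = 0.75 * np.clip(1.0 - t ** 2, 0.0, None)   # kernel weights L_b(V_j - v)
        X = np.column_stack([np.ones(n), V - v])       # local linear design (a, b)
        XtW = X.T * w
        coef = np.linalg.solve(XtW @ X, XtW @ eta)     # weighted least squares
        xi_hat[i] = coef[0]                            # intercept = xi_hat(v)
    return xi_hat
```

For a linear ξ(·) the local linear fit is exactly unbiased, so on noiseless data the output reproduces ξ(V_i) up to the surrogate noise averaged within each window.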

Modal regression estimator
Following Schumaker [19], we approximate the nonparametric functions α_k(u), k = 1, 2, . . ., q, by basis function expansions. Let B(u) = (B_1(u), . . ., B_L(u))^T be the B-spline basis functions of order M, where L = K + M and K is the number of interior knots. Then the nonparametric function α_k(u) can be approximated by

α_k(u) ≈ B(u)^T γ_k, (2.4)

where γ_k = (γ_k1, γ_k2, . . ., γ_kL)^T. Substituting (2.4) into model (2.1), we obtain

Y_i ≈ ξ_i^T β + W_i^T θ + Π_i^T γ + ε_i, (2.5)

where Π_i = (Z_i1 B(U_i)^T, . . ., Z_iq B(U_i)^T)^T and γ = (γ_1^T, . . ., γ_q^T)^T. Model (2.5) is a standard linear regression model containing three parameter vectors β, θ and γ, and each function α_k(·) in model (2.1) is characterized by γ_k through (2.4).
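The basis construction can be sketched as follows. `bspline_basis` is an illustrative helper (the equally spaced knot placement is our assumption); it evaluates the L = K + M basis functions of an order-M (degree M−1) spline via SciPy, so that row i of the returned matrix is B(U_i)^T:

```python
import numpy as np
from scipy.interpolate import BSpline

def bspline_basis(u, K=5, M=4):
    """Evaluate the L = K + M B-spline basis functions of order M on [0, 1].

    K interior knots are placed equally spaced in (0, 1), with clamped
    boundary knots repeated M times (M = 4 gives the cubic splines used later).
    Returns an (n, L) matrix whose rows are B(u_i)^T.
    """
    degree = M - 1
    interior = np.linspace(0.0, 1.0, K + 2)[1:-1]
    knots = np.concatenate([np.zeros(M), interior, np.ones(M)])
    L = K + M
    B = np.empty((len(u), L))
    for l in range(L):
        c = np.zeros(L)
        c[l] = 1.0                                   # pick out one basis function
        B[:, l] = BSpline(knots, c, degree, extrapolate=False)(u)
    return np.nan_to_num(B)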
Based on the idea of modal regression [12], the estimators of the parameters β, θ and γ are obtained by maximizing the objective function

Q(β, θ, γ) = Σ_{i=1}^n ϕ_h(Y_i − ξ_i^T β − W_i^T θ − Π_i^T γ), (2.6)

where ϕ_h(·) = ϕ(·/h)/h, ϕ(·) is a kernel function and h is a bandwidth. The value of h determines the degree of robustness of the estimator; the selection of the optimal bandwidth h is described in the next section.
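The objective is simply a sum of kernel evaluations of the residuals. The sketch below (our illustration, with ϕ taken as the standard normal density used in the simulations) makes the robustness mechanism visible: a gross outlier contributes almost nothing to the sum, whereas in least squares its squared residual would dominate:

```python
import numpy as np

def modal_objective(resid, h):
    """Modal regression objective: sum_i phi_h(resid_i), with phi_h(t) =
    phi(t/h)/h and phi the standard normal density.  Large h approaches the
    least squares fit; small h downweights outliers more sharply."""
    return np.sum(np.exp(-0.5 * (resid / h) ** 2) / (h * np.sqrt(2.0 * np.pi)))
```

A residual of 0 contributes 1/(h√(2π)) to Q, while a residual of 5h contributes a factor e^{−12.5} less, which is why maximizing Q concentrates the fit on the bulk of the data.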
However, we cannot maximize (2.6) directly because the variable ξ_i is unobserved. Instead, we substitute the local linear estimator ξ̂_i for the true variable ξ_i and rewrite the calibrated objective function as

Q̂(β, θ, γ) = Σ_{i=1}^n ϕ_h(Y_i − ξ̂_i^T β − W_i^T θ − Π_i^T γ). (2.7)

We obtain the spline modal regression estimators β̂, θ̂ and γ̂ by maximizing the objective function (2.7). The corresponding estimator of the coefficient function α_k(u) is α̂_k(u) = B(u)^T γ̂_k, k = 1, 2, . . ., q.

Theoretical properties
In this section, we discuss the asymptotic properties of the resulting estimators. Let Θ_0 = (β_0^T, θ_0^T)^T and α_0(·) denote the true values of Θ = (β^T, θ^T)^T and α(·) in model (1.1), and let γ_0j be the best approximation coefficient of α_0j(u) in the B-spline space. We begin with the following assumptions, which are required to derive the main results; these conditions are quite mild and easily satisfied.
A1: The random variable U has a bounded support U. Its density function f_u(·) is Lipschitz continuous and bounded away from 0 and infinity on its support. The density function f_v(·) of the random variable V is continuously differentiable and bounded away from 0 and infinity on its support V.
A4: The kernel functions L(·) and ϕ(·) are density functions with compact supports.
A5: There exists an s > 0 such that E||X||^{2s} < ∞ and for some r
A8: F(x, z, u, h) and G(x, z, u, h) are continuous with respect to (x, z, u). In addition, F(x, z, u, h) < 0 for any h > 0.
A9: are continuous with respect to x, z and u.
Theorem 1. Suppose that assumptions A1-A9 hold, and that the number of interior knots satisfies K = O(n^{1/(2r+1)}). Then we have the stated convergence results. The following Theorem 2 gives the asymptotic normality of the parameter estimator Θ̂.

Bandwidth selection and estimation algorithm
In this section, we discuss the selection of the bandwidth and the estimation procedure based on the modal expectation-maximization (MEM) algorithm proposed by Li et al. [20].

Bandwidth selection in practice
To obtain the spline modal estimators β̂, θ̂ and γ̂ by maximizing (2.7), we need an appropriate bandwidth h, since its value directly affects the robustness of the estimators. Following Yao et al. [12], the ratio of the asymptotic variance of the proposed estimator to that of the least squares B-spline estimator can be derived as in (4.1), where the initial estimators β^(0), θ^(0) and γ^(0) of the parameter vectors β, θ and γ are obtained by least squares, and σ̂² = n^{-1} Σ_{i=1}^n ε̂_i². The optimal bandwidth h is obtained by minimizing (4.1). In practice, we use a grid search over the candidate points h = 0.5σ̂ × 1.02^j, j = 0, 1, . . ., 100 (see Yao et al. [12] and Lv et al. [16]).
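The grid search itself is straightforward to implement. In the sketch below (our illustration), `ratio_fn` is a placeholder for the variance-ratio criterion (4.1), whose exact form follows Yao et al. [12]; only the candidate grid h_j = 0.5σ̂ × 1.02^j is taken from the text:

```python
import numpy as np

def select_bandwidth(resid, ratio_fn):
    """Grid search for the modal-regression bandwidth h.

    resid    : residuals from a pilot least-squares fit; their standard
               deviation plays the role of sigma_hat.
    ratio_fn : callable (h, resid) -> asymptotic-variance ratio in (4.1)
               (its form is assumed supplied, following Yao et al. [12]).
    Returns the grid point h_j = 0.5 * sigma_hat * 1.02**j minimizing the ratio.
    """
    sigma = np.std(resid)
    grid = 0.5 * sigma * 1.02 ** np.arange(101)   # j = 0, 1, ..., 100
    vals = np.array([ratio_fn(h, resid) for h in grid])
    return grid[np.argmin(vals)]
```

Because the grid is geometric with ratio 1.02, the selected bandwidth is within about 1% of the grid-restricted optimum.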

The MEM algorithm for parameters
In this subsection, the modal expectation-maximization (MEM) algorithm of Li et al. [20] is adopted to obtain the estimators of the parameters Θ and γ. Starting from initial estimators Θ^(0) and γ^(0) with m = 0, the algorithm alternates between an E-step, which computes kernel weights from the current residuals, and an M-step, which updates the estimates by weighted least squares, until convergence.
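The iteration can be sketched as follows. This is our illustrative implementation of the MEM scheme, not the paper's code; the design matrix `D` stacks the calibrated covariates and spline regressors, so its coefficient vector plays the role of (β, θ, γ):

```python
import numpy as np

def mem_modal_fit(D, Y, h, max_iter=200, tol=1e-8):
    """MEM iteration for spline modal regression with a Gaussian kernel.

    D : (n, d) design matrix, rows stacking (xi_hat_i^T, W_i^T, Pi_i^T)
    Y : (n,) responses
    h : modal-regression bandwidth
    E-step: weights proportional to phi_h(Y_i - D_i^T coef).
    M-step: weighted least squares with those weights.
    """
    coef = np.linalg.lstsq(D, Y, rcond=None)[0]        # least-squares start
    for _ in range(max_iter):
        resid = Y - D @ coef
        w = np.exp(-0.5 * (resid / h) ** 2)            # E-step kernel weights
        w /= w.sum()
        DtW = D.T * w
        new = np.linalg.solve(DtW @ D, DtW @ Y)        # M-step weighted LS
        done = np.max(np.abs(new - coef)) < tol
        coef = new
        if done:
            break
    return coef
```

Each MEM sweep increases the objective (2.6)-type criterion, and a gross outlier receives an exponentially small weight, so the fit converges to the bulk of the data.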

Simulation studies
In this section, we conduct simulations to assess the finite-sample performance of the proposed estimation procedure. For this purpose, we consider the varying-coefficient partially linear model (5.1) with error-prone linear covariates, where U_i ∼ U(0, 1); the covariates W_i are multivariate normal with mean 0 and covariance matrix Σ_1 = (σ_ij)_{1≤i,j≤3} with σ_ij = 0.5^{|i−j|} for i, j = 1, 2, 3; and the covariates Z_i are multivariate normal with mean 0 and covariance matrix Σ_2 = (σ_ij)_{1≤i,j≤2} with σ_ij = 0.5^{|i−j|} for i, j = 1, 2. Three model error distributions are considered: (1) the normal distribution ε_i ∼ N(0, 1); (2) the t distribution ε_i ∼ t(3); (3) the mixed normal (MN) distribution ε_i ∼ 0.9N(0, 1) + 0.1N(0, 9²).
The variables ξ_i in model (5.1) are functions of the auxiliary instrumental variables V_i. Note that ξ_i are not observed; the observed variables are η_i = ξ(V_i) + e_i. The measurement error e_i is independent of V_i and (W_i, Z_i, ε_i), with e_i ∼ N(0, I_3 σ_e²), where σ_e² = 0.4² and σ_e² = 0.6² represent two levels of measurement error.
In the simulation experiments, we choose the Epanechnikov kernel L(t) = (3/4)(1 − t²)_+ and adopt the bandwidth b = σ̂_v n^{−1/3} in the local linear smoothing following Zhou and Liang [6], where σ̂_v is the sample standard deviation of V_i. We use the standard normal density for ϕ(t) in the modal regression, select the bandwidth h by minimizing (4.1), and use cubic B-spline basis functions. The sample sizes are n = 100, 250 and 500, and the number of simulation replications is 200.
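A generator for this design can be sketched as follows. The covariate laws, error distributions and measurement-error structure follow the text; the distribution of V, the functions ξ(·) and the true regression coefficients did not survive in the text, so the choices below are labeled stand-ins:

```python
import numpy as np

def simulate_data(n, sigma_e=0.4, error="normal", rng=None):
    """One sample from the Section 5 design; xi(V) and the law of V are
    illustrative stand-ins, everything else follows the text."""
    if rng is None:
        rng = np.random.default_rng()
    U = rng.uniform(0, 1, n)
    V = rng.uniform(0, 1, n)                               # stand-in law for V
    cov_w = 0.5 ** np.abs(np.subtract.outer(np.arange(3), np.arange(3)))
    W = rng.multivariate_normal(np.zeros(3), cov_w, n)     # Sigma_1, 3 x 3
    cov_z = 0.5 ** np.abs(np.subtract.outer(np.arange(2), np.arange(2)))
    Z = rng.multivariate_normal(np.zeros(2), cov_z, n)     # Sigma_2, 2 x 2
    xi = np.column_stack([np.sin(2 * np.pi * V), V ** 2, np.exp(V)])  # stand-in xi(V)
    eta = xi + rng.normal(0, sigma_e, (n, 3))              # eta = xi(V) + e
    if error == "normal":
        eps = rng.normal(0, 1, n)
    elif error == "t":
        eps = rng.standard_t(3, n)
    else:                                                  # MN: 0.9 N(0,1) + 0.1 N(0,81)
        mix = rng.random(n) < 0.1
        eps = np.where(mix, rng.normal(0, 9, n), rng.normal(0, 1, n))
    return U, W, Z, V, xi, eta, eps
```

With such a generator, the naive, calibrated and benchmark estimators can be compared by fitting to (η_i, ξ̂_i, ξ_i) respectively on the same replications.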
To evaluate the performance of the proposed procedure on the parametric component, we compare the modal regression (MR) estimators with the profile least squares (PLS) estimators of Zhou and Liang [6]. Three types of estimators are considered: the naive estimators, in which ξ_i is simply replaced by η_i (denoted MR-N and PLS-N); our proposed calibrated estimators (denoted MR-C and PLS-C); and the benchmark estimators, which assume ξ_i is known (denoted MR-B and PLS-B). Tables 1-6 report the bias and standard deviation (denoted SD, in brackets) of the estimators of the parameters β and θ. From Tables 1-6 we draw the following conclusions. (1) Under all six approaches, the bias and SD become smaller as the sample size increases in all situations, indicating that all the estimation procedures are reasonable. (2) The bias and SD of the PLS estimator are smaller than those of the MR estimator when the model error follows the normal distribution. However, when the model error follows the t or MN distribution, the proposed MR estimator has smaller bias and SD than the PLS estimator, showing that the PLS estimator is sensitive to non-normally distributed model errors. (3) Comparing MR-N, MR-C and MR-B, we find that the proposed MR-C estimator is superior to the MR-N estimator and performs only slightly worse than MR-B; the PLS-C estimator behaves similarly to the MR-C estimator. This demonstrates that it is necessary to calibrate the measurement errors rather than ignore their effect. (4) For a given sample size, as the measurement error increases, the SD of PLS-C and MR-C becomes smaller, which reveals that the calibration of the error-prone covariates is effective.
Furthermore, to evaluate the estimation of the nonparametric functions, we present the fitted curves of α_1(u) and α_2(u). Figures 1 and 2 plot the estimated curves, with the dotted, dashed and solid lines denoting the PLS-C estimator, the MR-C estimator and the true function, respectively. We only report results for the measurement errors σ_e² = 0.4² and σ_e² = 0.6² with n = 250 and model error ε_i ∼ t(3). From Figures 1 and 2, both the PLS-C and MR-C estimates are close to the true curve, with MR-C noticeably closer. This indicates that the proposed procedure is more robust when the data follow non-normal distributions. In addition, we use the root mean squared error (RMSE) to assess the nonparametric estimators,

RMSE = [M^{-1} Σ_{i=1}^{M} {α̂_k(u_i) − α_k(u_i)}²]^{1/2},

where M = 200 and the u_i are equidistant points on (0, 1). Figures 3 and 4 give the boxplots of the RMSE for the nonparametric functions estimated by PLS-C and MR-C with measurement error σ_e² = 0.4² and σ_e² = 0.6², respectively, for sample sizes n = 250 and n = 500 under model error ε_i ∼ t(3). From Figures 3 and 4, the MR-C estimator has smaller RMSE than the PLS-C estimator, further confirming the robustness of the proposed procedure under non-normal model errors. Meanwhile, the RMSEs of both procedures decrease as the sample size increases.
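The RMSE criterion above is a one-liner; the helper below is our sketch, with the grid points chosen as midpoints of M equal subintervals of (0, 1) (the text only says "equidistant"):

```python
import numpy as np

def rmse_curve(alpha_hat, alpha_true, M=200):
    """RMSE of an estimated coefficient function over M equidistant points
    on (0, 1): sqrt(mean((alpha_hat(u_i) - alpha_true(u_i))^2))."""
    u = (np.arange(1, M + 1) - 0.5) / M   # midpoints of M equal subintervals
    return np.sqrt(np.mean((alpha_hat(u) - alpha_true(u)) ** 2))
```

For example, an estimate offset from the truth by a constant 0.1 everywhere has RMSE exactly 0.1.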

A real example
In this section, we apply the proposed estimation procedure to a diabetes data set. The data contain 10 baseline variables, including age, sex, body mass index, average blood pressure and six blood serum measurements obtained from 442 observations, together with the response of interest, a quantitative measure of disease progression one year after baseline. A detailed description of each variable is given in Efron et al. [21]. Similar to Sun et al. [11], we consider the varying-coefficient partially linear measurement error model (6.1), where the low-density level (ldl) and high-density level (hdl) are the error-prone covariates ξ; the exactly observed covariates W are triglycerides (tc), total cholesterol (tch) and the tension glaucoma level (ltg); the covariates Z are the glucose concentration (glu) and average blood pressure (bp); and the variable U is age. The body mass index (bmi) serves as the auxiliary instrumental variable V.
We applied the PLS-C and MR-C methods to estimate the parameters and nonparametric functions in model (6.1). The parametric estimates are reported in Table 7, which shows that the calibration based on auxiliary instrumental variables is practical and that the PLS-C and MR-C parametric estimates imply similar effects on the response variable. In addition, the fitted coefficient functions are shown in Figure 5, where the solid and dashed lines correspond to the PLS-C and MR-C methods, respectively. From Figure 5, the fitted trends of the nonparametric functions under PLS-C are close to those under MR-C. In summary, the proposed estimation procedures are effective for both the parametric and nonparametric components.

Conclusions
We have presented a robust estimation procedure for the varying-coefficient partially linear model when some covariates in the parametric part are error-prone. We assume that some linear covariates cannot be observed directly, and we do not specify a model structure for the measurement error. First, we calibrate the covariates with the auxiliary instrumental variable via the local linear smoothing technique. Then, using B-spline basis functions and modal regression, the parameters and coefficient functions are estimated simultaneously. The proposed procedure not only attenuates the effect of the measurement errors but is also robust against outliers and heavy-tailed data. Under mild conditions, the consistency and asymptotic normality of the resulting estimators are established. We also provide a strategy for choosing the optimal bandwidth and a practical algorithm. Simulation studies and a real data analysis demonstrate that the proposed methods are satisfactory.
We show that, for any given ε > 0, there exists a sufficiently large constant C such that the stated bound holds. Using a Taylor expansion of J_1 around ε_i, we obtain an expansion in which ε_i^* lies between ε_i and ε_i + Z_i^T R(U_i) + (ξ_i − ξ̂_i)^T β_0. By Corollary 6.21 of Schumaker [19], we have ||R(U_i)|| = O_p(K^{−r}) = O_p(δ). In addition, since ξ̂_i is the local linear estimator of ξ_i, invoking conditions A5 and A9, a simple calculation yields the required bound. Similarly, we also have

where H is defined as above. Thus, from (7.9) and a Taylor expansion, we have (7.10). From Lemma A.1 and (7.10), we obtain the asymptotic expression of (7.8). Then, by the law of large numbers, we have (7.12). Substituting (7.12) into (7.7), and arguing as for (7.11), we obtain (7.13). Let X̃_i = X_i − Φ_n^T Γ_n^{−1} Π_i; then (7.13) can be expressed accordingly. For I_4, since nb^4 → 0, we have I_41 = o_p(1). Then, by an argument similar to that of Zhang et al. [9], (7.14) can be rewritten in the stated form, where L_b(·) = L(·/b)/b is the kernel function and b = b_k, k = 1, . . ., p_1, is the bandwidth for the kth component of ξ; the local linear estimator of ξ_k(v), k = 1, . . ., p_1, is denoted by ξ̂_k(v).

Table 2 .
The bias and SD of the estimators of the parameters with ε_i ∼ t(3) and σ_e² = 0.4².

Table 5 .
The bias and SD of the estimators of the parameters with ε_i ∼ t(3) and σ_e² = 0.6².

Table 6 .
The bias and SD of the estimators of the parameters with ε_i ∼ 0.9N(0, 1) + 0.1N(0, 9²).

Table 7 .
The estimators of the parameters β and θ in the real example.