Goodness-of-fit tests for quantile regression with missing responses

Statistical Papers

Abstract

Goodness-of-fit tests for quantile regression models, in the presence of missing observations in the response variable, are introduced and analysed in this paper. The different proposals are based on the construction of empirical processes considering three different approaches, which involve the use of the gradient vector of the quantile function, a linear projection of the covariates (suitable for high-dimensional settings) and a projection of the estimating equations. In addition, two types of estimators for the null parametric model to be tested are considered. The performance of the different test statistics is analysed in an extensive simulation study. An application to real data is also included.

References

  • Bahari F, Parsi S, Ganjali M (2019) Empirical likelihood inference in general linear models with missing values in response and covariates by MNAR mechanism. Stat Pap. https://doi.org/10.1007/s00362-019-01103-0

  • Bianco A, Boente G, González-Manteiga W, Pérez-González A (2011) Asymptotic behavior of robust estimators in partially linear models with missing responses: the effect of estimating the missing probability on the simplified marginal estimators. Test 20:524–548

  • Bierens HJ, Ginther DK (2001) Integrated conditional moment testing of quantile regression models. Empir Econ 26:307–324

  • Benoit DF, Alhamzawi R, Yu K (2013) Bayesian lasso binary quantile regression. Comput Stat 28:2861–2873

  • Chen X, Wan ATK, Zhou Y (2015) Efficient quantile regression analysis with missing observations. J Am Stat Assoc 110:723–741

  • Conde-Amboage M, Sánchez-Sellero C, González-Manteiga W (2015) A lack-of-fit test for quantile regression models with high-dimensional covariates. Comput Stat Data Anal 88:128–138

  • Cotos-Yáñez TR, Pérez-González A, González-Manteiga W (2016) Model checks for nonparametric regression with missing data: a comparative study. J Stat Comput Simul 86:3188–3204

  • Davino C, Furno M, Vistocco D (2014) Quantile regression: theory and applications. Wiley, Hoboken

  • Dong C, Li G, Feng X (2019) Lack-of-fit tests for quantile regression models. J R Stat Soc B. https://doi.org/10.1111/rssb.12321

  • Escanciano JC (2006) A consistent diagnostic test for regression models using projections. Econom Theory 22:1030–1051

  • Escanciano JC, Goh SC (2014) Specification analysis of linear quantile models. J Econom 178:495–507

  • Feng X, He X, Hu J (2011) Wild bootstrap for quantile regression. Biometrika 98:995–999

  • García-Portugués E, González-Manteiga W, Febrero-Bande M (2014) A goodness-of-fit test for the functional linear model with scalar response. J Comput Graph Stat 23:761–778

  • He X, Zhu L-X (2003) A lack-of-fit test for quantile regression. J Am Stat Assoc 98:1013–1022

  • Horvitz DG, Thompson DJ (1952) A generalization of sampling without replacement from a finite universe. J Am Stat Assoc 47:663–685

  • Hruschka ER, Hruschka ER Jr., Ebecken NFF (2003) Evaluating a nearest–neighbor method to substitute continuous missing values. AI 2003: advances in artificial intelligence, pp 723–734. Springer, Berlin

  • Huang Q, Zhang H, Chen J, He M (2017) Quantile regression models and their applications: a review. J Biom Biostat 8:2–6

  • Koenker R (2005) Quantile regression. Cambridge University Press, Cambridge

  • Koenker R, Bassett GS (1978) Regression quantiles. Econometrica 46:33–50

  • Otsu T (2008) Conditional empirical likelihood estimation and inference for quantile regression models. J Econom 142:508–538

  • Purwar A, Singh SK (2015) Hybrid prediction model with missing value imputation for medical data. Expert Syst Appl 42:5621–5631

  • Ruppert D, Wand MP (1994) Multivariate locally weighted least squares regression. Ann Stat 22:1346–1370

  • Shen Y, Liang HY (2018) Quantile regression and its empirical likelihood with missing response at random. Stat Pap 59:685–707

  • Sherwood B, Wang L, Zhou X (2013) Weighted quantile regression for analyzing health care cost data with missing covariates. Stat Med 32:4967–4979

  • Stute W (1997) Nonparametric model checks for regression. Ann Stat 25:613–641

  • Sun Z, Wang Q, Dai P (2009) Model checking for partially linear models with missing responses at random. J Multivar Anal 100:636–651

  • Sun Z, Chen F, Zhou X, Zhang Q (2017) Improved model checking methods for parametric models with responses missing at random. J Multivar Anal 154:147–161

  • van der Vaart AW, Wellner JA (1996) Weak convergence and empirical processes. Springer, New York

  • Wang CY, Wang S, Gutiérrez RG, Carroll RJ (1998) Local linear regression for generalized linear models with missing data. Ann Stat 26:1028–1050

  • Wei Y, Yang Y (2014) Quantile regression with covariates missing at random. Stat Sin 24:1277–1299

  • Xu W, Zhu L (2013) Testing the adequacy of varying coefficient models with missing responses at random. Metrika 76:53–69

  • Xu HX, Fan GL, Liang HY (2017) Hypothesis test on response mean with inequality constraints under data missing when covariables are present. Stat Pap 58:53–75

  • Yu K, Lu Z, Stander J (2003) Quantile regression: applications and current research areas. J R Stat Soc Ser D 52:331–350

  • Zheng JX (1998) A consistent nonparametric test of parametric regression models under conditional quantile restrictions. Econom Theory 14:123–138

  • Zhou Y, Wan ATK, Wang X (2008) Estimating equation inference with missing data. J Am Stat Assoc 103:1187–1199

Acknowledgements

The authors acknowledge the support of the Projects MTM2016-76969-P (AEI/FEDER, UE) by the Spanish Ministry of Economy and Competitiveness, MTM2017-89422-P by the Spanish Ministry of Economy, Industry and Competitiveness and the support of the Competitive Reference Groups, 2016–2019 (ED431C 2016/040) and 2017–2020 (ED431C 2017/38), supported by the Consellería de Cultura, Educación e Ordenación Universitaria, Xunta de Galicia. We would also like to thank the reviewers of the paper and the Associate Editor for their interesting comments, which have helped to improve the contents of this manuscript.

Author information

Corresponding author

Correspondence to Ana Pérez-González.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Theoretical results

In order to derive the asymptotic properties of the empirical processes, it is crucial to obtain the following representations of the estimators \(\hat{\theta }_{S}\) and \(\hat{\theta }_{W}\) resulting from (7) and (8), respectively. The following hypotheses on the missing probability are required:

  1. H1. \(\inf _{x}p\left( x\right)>C_{0}>0\).

  2. H2. \(\sup _{x} \left| \hat{p}\left( x\right) -p\left( x\right) \right| \xrightarrow {a.s.} 0.\)
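H2 is a uniform consistency requirement on whatever estimator \(\hat{p}\) is used. As a rough illustration only, the sketch below uses a Nadaraya-Watson kernel estimator of \(p(x)=P(\delta =1|X=x)\) for a scalar covariate and checks its sup-error on a grid. The choice of estimator, the Gaussian kernel, the bandwidth and the linear form of `p_true` are all assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def nw_missing_prob(x_eval, x, delta, bandwidth):
    """Nadaraya-Watson estimate of p(x) = P(delta = 1 | X = x) for scalar X."""
    x_eval = np.atleast_1d(np.asarray(x_eval, dtype=float))
    u = (x_eval[:, None] - x[None, :]) / bandwidth
    k = np.exp(-0.5 * u**2)  # Gaussian kernel; its normalising constant cancels
    return (k @ delta) / k.sum(axis=1)

rng = np.random.default_rng(0)
n = 2000
x = rng.uniform(0.0, 1.0, n)
p_true = 0.6 + 0.3 * x  # hypothetical model, bounded below by 0.6 (so H1 holds)
delta = (rng.uniform(size=n) < p_true).astype(float)

grid = np.linspace(0.05, 0.95, 19)
p_hat = nw_missing_prob(grid, x, delta, bandwidth=0.1)
sup_error = np.max(np.abs(p_hat - (0.6 + 0.3 * grid)))  # empirical analogue of H2
```

The sup-error should shrink as `n` grows and the bandwidth is reduced at a suitable rate, which is the behaviour H2 asks for.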

1.1 Previous lemmas

Lemma 1

Consider a random sample \(\{(X_i,Y_i,\delta _i)\}\), with \(i=1,\ldots ,n\), from model (11) and assume that \(f\left( \cdot |X\right) \) (the conditional density of the error given X) is bounded in a neighbourhood of zero with \(f\left( 0|X\right) >0\), and \(\left| f\left( t|X\right) -f\left( 0|X\right) \right| \le c\left| t\right| ^{1/2}\) for some \(c<\infty \). Assume also that there exist \(A\left( x\right) \), \(B\left( x\right) \), such that

$$\begin{aligned}&\sup _{\left\| \theta _1-\theta \right\| \le C} \left\| \dot{g}\left( x;\theta _1\right) \right\| \le A\left( x\right) ,\\&\left\| \dot{g}\left( x;\theta _1\right) -\dot{g}\left( x;\theta _2\right) \right\| \le B\left( x\right) \left\| \theta _1-\theta _2\right\| \quad \text {for any } x, \theta _1,\theta _2, \end{aligned}$$

with bounded \(E\left( \left| A\left( X\right) \right| ^3\right) \), \(E\left( \left| h\left( X\right) A\left( X\right) \right| \right) \), \(E\left( \left| h\left( X\right) A\left( X\right) ^3\right| \right) \) and \(E\left( \left| B\left( X\right) \right| ^2\right) \). If H1 holds, then:

$$\begin{aligned} \sqrt{n}\left( \hat{\theta }_{S}-\theta _0\right)= & {} \frac{1}{\sqrt{n}}{} \mathbf S ^{-1}\sum _{i=1}^{n} \delta _{i}\psi _{\tau }\left( \epsilon _{i}\right) \dot{g}\left( X_{i};\theta _0\right) \nonumber \\&+\,\mathbf S ^{-1}E\left[ p\left( X\right) f\left( 0|X\right) h\left( X\right) \dot{g}\left( X;\theta _0\right) \right] +o_{p}\left( 1\right) , \end{aligned}$$
(26)

where \(\epsilon _{i}=Y_{i}-g\left( X_{i};\theta _0\right) \) and \(\mathbf S =E\left[ p\left( X\right) f\left( 0|X\right) \dot{g}\left( X;\theta _0\right) \dot{g}^T\left( X;\theta _0\right) \right] \).

Proof of Lemma 1

The proof of this lemma follows arguments similar to those of Lemma 1 in He and Zhu (2003). \(\square \)

Lemma 2

Under the conditions of Lemma 1, assuming that H1 and H2 hold, it can be proved that:

$$\begin{aligned} \sqrt{n}\left( \hat{\theta }_{W}-\theta _0\right)= & {} \frac{1}{\sqrt{n}}{} \mathbf S _W^{-1}\sum _{i=1}^{n}\frac{\delta _{i}}{p\left( X_{i}\right) }\psi _{\tau }\left( \epsilon _{i}\right) \dot{g}\left( X_{i};\theta _0\right) \nonumber \\&+\,\mathbf S _W^{-1}E\left[ f\left( 0|X\right) h\left( X\right) \dot{g}\left( X;\theta _0\right) \right] +o_{p}\left( 1\right) , \end{aligned}$$
(27)

where \(\mathbf S _W=E\left[ f\left( 0|X\right) \dot{g}\left( X;\theta _0\right) \dot{g}\left( X;\theta _0\right) ^T\right] \).

Proof of Lemma 2

Let \(\hat{\theta }_{W}\) be obtained by minimising in \(\theta \) the following expression:

$$\begin{aligned} S\left( \theta \right) =\sum _{i=1}^{n}\frac{\delta _{i}}{\hat{p}\left( X_{i}\right) }\rho _{\tau }\left( Y_{i}-g\left( X_{i};\theta \right) \right) . \end{aligned}$$

Using directional derivatives of \( S\left( \theta \right) \) and following calculations similar to those of Lemma A.1 in He and Zhu (2003) leads to:

$$\begin{aligned}&\sum _{Y_{i}\ne g\left( X_{i};\hat{\theta }_{W}\right) }\frac{\delta _{i}}{\hat{p}\left( X_{i}\right) }\psi _{\tau }\left( Y_{i}-g\left( X_{i};\hat{\theta }_{W}\right) \right) \beta ^T\dot{g}\left( X_{i};\hat{\theta }_{W}\right) \nonumber \\&\qquad \le -\sum _{Y_{i}=g\left( X_{i};\hat{\theta }_{W}\right) }\frac{\delta _{i}}{\hat{p}\left( X_{i}\right) }\psi _{\tau }\left( Y_{i}-\alpha ^T\dot{g}\left( X_{i};\hat{\theta }_{W}\right) \right) \beta ^T\dot{g}\left( X_{i};\hat{\theta }_{W}\right) . \end{aligned}$$
(28)

Up to sign, the right-hand side of the previous expression can be written as

$$\begin{aligned}&\sum _{Y_{i}=g\left( X_{i};\hat{\theta }_{W}\right) }\frac{\delta _{i}}{p\left( X_{i}\right) }\psi _{\tau }\left( Y_{i}-\alpha ^T\dot{g}\left( X_{i};\hat{\theta }_{W}\right) \right) \beta ^T\dot{g}\left( X_{i};\hat{\theta }_{W}\right) \\&\qquad +\sum _{Y_{i}=g\left( X_{i};\hat{\theta }_{W}\right) }\left( \frac{\delta _{i}}{\hat{p}\left( X_{i}\right) }-\frac{\delta _{i}}{p\left( X_{i}\right) }\right) \psi _{\tau }\left( Y_{i}-\alpha ^T\dot{g}\left( X_{i};\hat{\theta }_{W}\right) \right) \beta ^T\dot{g}\left( X_{i};\hat{\theta }_{W}\right) . \end{aligned}$$

The number of zero residuals, that is, of indices with \(Y_{i}=g\left( X_{i};\hat{\theta }_{W}\right) \), is finite with probability 1. Given the moment conditions on \(A\left( X\right) \) and the properties of p and \(\hat{p}\), the previous expression is \(o_{p}\left( \sqrt{n}\right) \). Then, (28) can be written as:

$$\begin{aligned} n^{-1/2}\sum _{i=1}^{n} \frac{\delta _{i}}{\hat{p}\left( X_{i}\right) }\psi _{\tau }\left( Y_{i}-g\left( X_{i};\hat{\theta }_W\right) \right) \dot{g}\left( X_{i};\hat{\theta }_W\right) =o_{p}\left( 1\right) . \end{aligned}$$

Using the conditions on p and \(\hat{p}\), it is easy to check that:

$$\begin{aligned} n^{-1/2}\sum _{i=1}^{n} \frac{\delta _{i}}{p\left( X_{i}\right) }\psi _{\tau }\left( Y_{i}-g\left( X_{i};\hat{\theta }_W\right) \right) \dot{g}\left( X_{i};\hat{\theta }_W\right) =o_{p}\left( 1\right) . \end{aligned}$$
(29)

Denote by e a new random variable with the same distribution as \(\epsilon \) (the error variable in model (1)), i.e. with distribution function F and density f. Define \(l\left( X_{i};\hat{\theta }_W\right) =g\left( X_{i};\hat{\theta }_W\right) -g\left( X_{i};\theta _0\right) -\frac{h\left( X_{i}\right) }{\sqrt{n}}\). Then \(Y_{i}-g\left( X_{i};\hat{\theta }_W\right) =\epsilon _{i}-l\left( X_{i};\hat{\theta }_W\right) \).

Consider now the following decomposition:

$$\begin{aligned}&n^{-1/2}\sum _{i=1}^{n} \frac{\delta _{i}}{p\left( X_{i}\right) }\left[ \psi _{\tau }\left( \epsilon _{i}-l\left( X_{i};\hat{\theta }_W\right) \right) -E_{e}\psi _{\tau }\left( e-l\left( X_{i};\hat{\theta }_W\right) \right) \right] \dot{g}\left( X_{i};\hat{\theta }_W\right) \\&\qquad \overset{\left( i\right) }{=}-n^{-1/2}\sum _{i=1}^{n} \frac{\delta _{i}}{p\left( X_{i}\right) }E_{e}\psi _{\tau }\left( e-l\left( X_{i};\hat{\theta }_W\right) \right) \dot{g}\left( X_{i};\hat{\theta }_W\right) +o_{p}\left( 1\right) ,\\ \end{aligned}$$

where \(\left( i\right) \) is obtained using (29). Now, applying the local expansions of functions F and g respectively, it can be proved that:

$$\begin{aligned}&-n^{-1/2}\sum _{i=1}^{n}\frac{\delta _{i}}{p\left( X_{i}\right) }\left[ \tau \left( 1-F\left( l\left( X_{i};\hat{\theta }_W\right) \right) \right) -\left( 1-\tau \right) F\left( l\left( X_{i};\hat{\theta }_W\right) \right) \right] \dot{g}\left( X_{i};\hat{\theta }_W\right) \\&\quad =\mathbf S _W\sqrt{n}\left( \hat{\theta }_W-\theta _0\right) -E\left[ f\left( 0|X\right) h\left( X\right) \dot{g}\left( X;\theta _0\right) \right] +o_{p}\left( 1\right) \\&\qquad +\,o_{p}\left( \sqrt{n}\left( \hat{\theta }_W-\theta _0\right) \right) , \end{aligned}$$

where \(\mathbf S _W=E\left[ f\left( 0|X\right) \dot{g}\left( X;\theta _0\right) \dot{g}^T\left( X;\theta _0\right) \right] \) is the matrix defined in Lemma 2.
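The expansion step above can be sketched as follows. This is only an outline under two assumptions that are not stated in this appendix: that \(\psi _{\tau }\left( r\right) =\tau -I\left( r<0\right) \), and that the \(\tau \)-th conditional quantile of the error is zero, so that \(F\left( 0|X\right) =\tau \). Writing \(l\) for \(l\left( X_{i};\hat{\theta }_W\right) \):

$$\begin{aligned} E_{e}\psi _{\tau }\left( e-l\right)&=\tau \left( 1-F\left( l|X_{i}\right) \right) -\left( 1-\tau \right) F\left( l|X_{i}\right) =\tau -F\left( l|X_{i}\right) ,\\ \tau -F\left( l|X_{i}\right)&=F\left( 0|X_{i}\right) -F\left( l|X_{i}\right) =-f\left( 0|X_{i}\right) l+o\left( \left| l\right| \right) ,\\ l&=\dot{g}^T\left( X_{i};\theta _0\right) \left( \hat{\theta }_W-\theta _0\right) -n^{-1/2}h\left( X_{i}\right) +o\left( \left\| \hat{\theta }_W-\theta _0\right\| \right) . \end{aligned}$$

Substituting these into the weighted sum and using \(E\left[ \delta /p\left( X\right) |X\right] =1\) (which holds when the missingness depends only on X) replaces the sample averages by \(E\left[ f\left( 0|X\right) \dot{g}\left( X;\theta _0\right) \dot{g}^T\left( X;\theta _0\right) \right] \) and \(E\left[ f\left( 0|X\right) h\left( X\right) \dot{g}\left( X;\theta _0\right) \right] \), which are the two leading terms in the display.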

On the other hand, following the arguments used to prove (A.3) in He and Zhu (2003), together with the hypotheses on p, it can be shown that:

$$\begin{aligned}&\frac{1}{\sqrt{n}}\sum _{i=1}^{n} \frac{\delta _{i}}{p\left( X_{i}\right) }\left[ \psi _{\tau }\left( \epsilon _{i}-l\left( X_{i};\hat{\theta }_W\right) \right) -E_{e}\psi _{\tau }\left( e-l\left( X_{i};\hat{\theta }_W\right) \right) \right] \dot{g}\left( X_{i};\hat{\theta }_W\right) \\&\quad =n^{-1/2}\sum _{i=1}^{n} \frac{\delta _{i}}{p\left( X_{i}\right) }\psi _{\tau }\left( \epsilon _{i}\right) \dot{g}\left( X_{i};\theta _0\right) +O_{p}\left( \left( \left\| \hat{\theta }_W-\theta _0\right\| +n^{-1/2}\right) ^{1/2}\log \, n\right) . \end{aligned}$$

Combining the last three displays and solving for \(\sqrt{n}\left( \hat{\theta }_W-\theta _0\right) \) gives the asymptotic representation (27). \(\square \)
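For readers who want to experiment with the weighted estimator analysed in Lemma 2, the following is a minimal sketch of minimising the inverse-probability-weighted check loss \(\sum _{i}\delta _{i}\rho _{\tau }\left( Y_{i}-g\left( X_{i};\theta \right) \right) /\hat{p}\left( X_{i}\right) \) for a linear \(g\left( x;\theta \right) =\theta _{0}+\theta _{1}x\). It uses iteratively reweighted least squares as a numerical approximation; this solver, the simulated data and the use of the true `p_true` in place of \(\hat{p}\) are assumptions of the example, not the authors' implementation.

```python
import numpy as np

def ipw_quantile_fit(x, y, delta, p_hat, tau, n_iter=100, eps=1e-6):
    """Approximate argmin of sum_i delta_i / p_hat_i * rho_tau(y_i - th0 - th1 * x_i).

    Iteratively reweighted least squares: rho_tau(r) = c(r) * r**2 / |r|,
    with c(r) = tau if r >= 0 else 1 - tau, and |r| floored at eps.
    """
    obs = delta == 1                      # only observed responses enter the loss
    design = np.column_stack([np.ones(obs.sum()), x[obs]])
    yo = y[obs]
    ipw = 1.0 / p_hat[obs]                # inverse-probability weights
    theta = np.linalg.lstsq(design, yo, rcond=None)[0]  # least-squares start
    for _ in range(n_iter):
        r = yo - design @ theta
        w = ipw * np.where(r >= 0, tau, 1.0 - tau) / np.maximum(np.abs(r), eps)
        wd = design * w[:, None]
        theta = np.linalg.solve(design.T @ wd, design.T @ (w * yo))
    return theta

rng = np.random.default_rng(0)
n, tau = 2000, 0.5
x = rng.uniform(0.0, 1.0, n)
p_true = 0.6 + 0.3 * x                    # hypothetical missingness model
delta = (rng.uniform(size=n) < p_true).astype(float)
# Median regression line 1 + 2x; responses with delta = 0 are set to NaN.
y = np.where(delta == 1, 1.0 + 2.0 * x + rng.normal(size=n), np.nan)

theta_hat = ipw_quantile_fit(x, y, delta, p_true, tau)
```

Dropping the factor `ipw` gives the unweighted objective, which corresponds to the simple estimator of Lemma 1 under the same assumptions. An exact solver (for example a linear-programming formulation) could replace the reweighting loop.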

1.2 Main theorems

Theorem 1

Under the conditions of Lemma 1, assuming that H1 and H2 hold, the empirical processes \(R_{n}^{1}\) and \(R_{n,W}^{1}\) can be written as:

$$\begin{aligned} R_{n}^{1}\left( t\right)= & {} n^{-1/2}\sum _{i=1}^{n}\delta _{i}\psi _{\tau }\left( \epsilon _{i}\right) \left[ I\left( X_{i}\le t\right) -\mathbf S \left( t\right) \mathbf S ^{-1}\right] \dot{g}\left( X_{i};\theta _0\right) \\&\quad -\,\mathbf S \left( t\right) \mathbf S ^{-1}E\left[ p\left( X\right) f\left( 0|X\right) h\left( X\right) \dot{g}\left( X;\theta _0\right) \right] \\&\quad +\,E\left[ p\left( X\right) f\left( 0|X\right) h\left( X\right) \dot{g}\left( X;\theta _0\right) I\left( X\le t\right) \right] +o_{p}\left( 1\right) , \end{aligned}$$

uniformly in t, where \(f\left( 0|X\right) \) denotes the conditional density of the error at zero and the matrices \(\mathbf S \left( t\right) \) and \(\mathbf S \) are defined as:

$$\begin{aligned} \mathbf S \left( t\right)= & {} E\left[ p\left( X\right) f\left( 0|X\right) \dot{g}\left( X;\theta _0\right) \dot{g}^T\left( X;\theta _0\right) I\left( X\le t\right) \right] \\ \ \text {and } \mathbf S= & {} E\left[ p\left( X\right) f\left( 0|X\right) \dot{g}\left( X;\theta _0\right) \dot{g}^T\left( X;\theta _0\right) \right] , \end{aligned}$$

and similarly

$$\begin{aligned} R_{n,W}^{1}\left( t\right)= & {} n^{-1/2}\sum _{i=1}^{n}\frac{\delta _{i}}{p\left( X_{i}\right) }\psi _{\tau }\left( \epsilon _{i}\right) \left[ I\left( X_{i}\le t\right) -\mathbf S _W\left( t\right) \mathbf S _W^{-1}\right] \dot{g}\left( X_{i};\theta _0\right) \\&\quad -\,\mathbf S _W\left( t\right) \mathbf S _W^{-1}E\left[ f\left( 0|X\right) h\left( X\right) \dot{g}\left( X;\theta _0\right) \right] \\&\quad +E\left[ f\left( 0|X\right) \dot{g}\left( X;\theta _0\right) h\left( X\right) I\left( X\le t\right) \right] +o_{p}\left( 1\right) , \end{aligned}$$

uniformly in t, where:

$$\begin{aligned} \mathbf S _W\left( t\right)= & {} E\left[ f\left( 0|X\right) \dot{g}\left( X;\theta _0\right) \dot{g}^T\left( X;\theta _0\right) I\left( X\le t\right) \right] \quad \text{ and }\\ \mathbf S _W= & {} E\left[ f\left( 0|X\right) \dot{g}\left( X;\theta _0\right) \dot{g}^T\left( X;\theta _0\right) \right] . \end{aligned}$$

Table 8 Percentage of rejections for the tests based on empirical processes, for the linear model (24) with \(d=1\) and \(a=0\)

Table 9 Percentage of rejections for the tests based on empirical processes, for the linear model (24) with \(d=2\) and \(a=0\)

Table 10 Percentage of rejections for the tests based on empirical processes with projections, for the linear model (24) with \(d=2\) and \(a=0\)

Table 11 Percentage of rejections for the tests based on empirical processes with projections, for the linear model (24) with \(d=4\) and \(a=0\)

Table 12 Percentage of rejections for all the tests, for the linear model (24) with \(d=2\) and different values of a (\(a=0\) corresponding to the null hypothesis)

Table 13 Percentage of rejections for all the tests, for the linear model (24) with \(d=2\) and different values of a (\(a=0\) corresponding to the null hypothesis)

Table 14 Percentage of rejections for all the tests, for the linear model (24) with \(d=2\) and different values of a (\(a=0\) corresponding to the null hypothesis)

Table 15 Percentage of rejections for all the tests for the linear model in (24) with \(d=2\) and \(a=0\)

Proof of Theorem 1

By some simple computations, it can be seen that

$$\begin{aligned} R_{n}^{1}\left( t\right) =n^{-1/2}\sum _{i=1}^{n}\delta _{i}\psi _\tau \left( Y_{i}-g\left( X_{i};\hat{\theta }_{S}\right) \right) \dot{g}\left( X_{i};\hat{\theta }_{S}\right) I\left( X_{i}\le t\right) . \end{aligned}$$

The result for \(R_{n}^{1}\) can be obtained in a similar way to He and Zhu (2003), under H1. For brevity, only the derivation for \(R_{n,W}^1\) is presented. In this case, using arguments analogous to those in other papers on goodness-of-fit tests for regression with missing responses (see Sun et al. (2009), Xu and Zhu (2013) or Sun et al. (2017), among others) and assuming that H1 and H2 hold, it can be proved that:

$$\begin{aligned} R_{n,W}^{1}\left( t\right)= & {} n^{-1/2}\sum _{i=1}^{n}\frac{\delta _{i}}{\hat{p}\left( X_{i}\right) }\psi _\tau \left( Y_{i}-g\left( X_{i};\hat{\theta }_{W}\right) \right) \dot{g}\left( X_{i};\hat{\theta }_{W}\right) I\left( X_{i}\le t\right) \\= & {} n^{-1/2}\sum _{i=1}^{n}\frac{\delta _{i}}{p\left( X_{i}\right) }\psi _\tau \left( Y_{i}-g\left( X_{i};\hat{\theta }_{W}\right) \right) \dot{g}\left( X_{i};\hat{\theta }_{W}\right) I\left( X_{i}\le t\right) \\&+\,n^{-1/2}\sum _{i=1}^{n}\left( \frac{\delta _{i}}{\hat{p}\left( X_{i}\right) }-\frac{\delta _{i}}{p\left( X_{i}\right) }\right) \psi _\tau \left( Y_{i}-g\left( X_{i};\hat{\theta }_{W}\right) \right) \\&\qquad \dot{g}\left( X_{i};\hat{\theta }_{W}\right) I\left( X_{i}\le t\right) \\= & {} n^{-1/2}\sum _{i=1}^{n}\frac{\delta _{i}}{p\left( X_{i}\right) }\psi _\tau \left( Y_{i}-g\left( X_{i};\hat{\theta }_{W}\right) \right) \dot{g}\left( X_{i};\hat{\theta }_{W}\right) I\left( X_{i}\le t\right) \\&+\,o_{p}\left( 1\right) . \end{aligned}$$

The empirical process can be decomposed as:

$$\begin{aligned} R_{n,W}^{1}\left( t\right)= & {} n^{-1/2}\sum _{i=1}^{n}\frac{\delta _{i}}{p\left( X_{i}\right) }\left[ E_{\epsilon }\psi _\tau \left( \epsilon _{i}-l\left( X_{i};\hat{\theta }_{W}\right) \right) \right] \dot{g}\left( X_{i};\hat{\theta }_{W}\right) I\left( X_{i}\le t\right) +o_{p}\left( 1\right) \nonumber \\&+\,n^{-1/2}\sum _{i=1}^{n}\frac{\delta _{i}}{p\left( X_{i}\right) }\left[ \psi _\tau \left( \epsilon _{i}-l\left( X_{i};\hat{\theta }_{W}\right) \right) -E_{\epsilon }\psi _\tau \left( \epsilon _{i}-l\left( X_{i};\hat{\theta }_{W}\right) \right) \right] \nonumber \\&\qquad \dot{g}\left( X_{i};\hat{\theta }_{W}\right) I\left( X_{i}\le t\right) \end{aligned}$$
(30)

where \(l\left( X_{i};\hat{\theta }_{W}\right) =g\left( X_{i};\hat{\theta }_{W}\right) -g\left( X_{i};\theta _0\right) -n^{-1/2}h\left( X_{i}\right) \). Similarly to (A.3) in He and Zhu (2003), under H1, the second term of the previous expression can be approximated by

$$\begin{aligned}&n^{-1/2}\sum _{i=1}^{n}\frac{\delta _{i}}{p\left( X_{i}\right) }\left[ \psi _\tau \left( \epsilon _{i}-l\left( X_{i};\hat{\theta }_{W}\right) \right) -E_{\epsilon }\psi _\tau \left( \epsilon _{i}-l\left( X_{i};\hat{\theta }_{W}\right) \right) \right] \\&\qquad \times \dot{g}\left( X_{i};\hat{\theta }_{W}\right) I\left( X_{i}\le t\right) \\&\quad =n^{-1/2}\sum _{i=1}^{n}\frac{\delta _{i}}{p\left( X_{i}\right) }\psi _\tau \left( \epsilon _{i}\right) \dot{g}\left( X_{i};\theta _0\right) I\left( X_{i}\le t\right) +o_{p}\left( 1\right) \end{aligned}$$

uniformly in t. Moreover, the first term in (30) can be approximated as

$$\begin{aligned}&n^{-1/2}\sum _{i=1}^{n}\frac{\delta _{i}}{p\left( X_{i}\right) }\left[ E_{\epsilon }\psi _\tau \left( \epsilon _{i}-l\left( X_{i};\hat{\theta }_{W}\right) \right) \right] \dot{g}\left( X_{i};\hat{\theta }_{W}\right) I\left( X_{i}\le t\right) \\&\quad =-\mathbf S _W\left( t\right) n^{1/2}\left( \hat{\theta }_{W}-\theta _0\right) +E\left[ f\left( 0|X\right) \dot{g}\left( X;\theta _0\right) h\left( X\right) I\left( X\le t\right) \right] +o_{p}\left( 1\right) . \end{aligned}$$

Substituting the asymptotic expression for \(\left( \hat{\theta }_{W}-\theta _0\right) \) obtained in Lemma 2, the following representation holds:

$$\begin{aligned} R_{n,W}^{1}\left( t\right)= & {} n^{-1/2}\sum _{i=1}^{n}\frac{\delta _{i}}{p\left( X_{i}\right) }\psi _\tau \left( \epsilon _{i}\right) \dot{g}\left( X_{i};\theta _0\right) I\left( X_{i}\le t\right) \\&\quad -\mathbf S _W\left( t\right) \mathbf S _W^{-1}n^{-1/2}\sum _{i=1}^{n}\frac{\delta _{i}}{p\left( X_{i}\right) }\psi _\tau \left( \epsilon _{i}\right) \dot{g}\left( X_{i};\theta _0\right) \\&\quad -\mathbf S _W\left( t\right) \mathbf S _W^{-1}E\left[ f\left( 0|X\right) h\left( X\right) \dot{g}\left( X;\theta _0\right) \right] \\&\quad +E\left[ f\left( 0|X\right) \dot{g}\left( X;\theta _0\right) h\left( X\right) I\left( X\le t\right) \right] +o_{p}\left( 1\right) . \end{aligned}$$

\(\square \)
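The weighted process in the first line of the proof can be evaluated directly at the observed covariate values. The sketch below does this for a scalar covariate and a linear \(g\left( x;\theta \right) =\theta _{0}+\theta _{1}x\), assuming \(\psi _{\tau }\left( r\right) =\tau -I\left( r<0\right) \). The simulated data, the use of the true parameter and true `p_true` in place of \(\hat{\theta }_{W}\) and \(\hat{p}\), and the sup-norm summary are illustrative assumptions; the summary is not necessarily the test statistic used in the paper.

```python
import numpy as np

def psi_tau(r, tau):
    """Check-function score, assumed here to be psi_tau(r) = tau - 1{r < 0}."""
    return tau - (r < 0)

def weighted_process(x, y, delta, p_hat, theta, tau):
    """Evaluate the weighted process at t = X_1, ..., X_n (row j is t = X_j)."""
    n = len(x)
    grad = np.column_stack([np.ones(n), x])                # g-dot for a linear model
    resid = np.where(delta == 1, y - grad @ theta, 0.0)    # missing rows get weight 0
    score = (delta / p_hat * psi_tau(resid, tau))[:, None] * grad
    indicator = (x[:, None] <= x[None, :]).astype(float)   # [i, j] = 1{X_i <= X_j}
    return indicator.T @ score / np.sqrt(n), score

rng = np.random.default_rng(1)
n, tau = 300, 0.5
x = rng.uniform(0.0, 1.0, n)
p_true = 0.6 + 0.3 * x                                     # hypothetical missingness model
delta = (rng.uniform(size=n) < p_true).astype(float)
y = np.where(delta == 1, 1.0 + 2.0 * x + rng.normal(size=n), np.nan)
theta = np.array([1.0, 2.0])                               # true parameter, for illustration

process, score = weighted_process(x, y, delta, p_true, theta, tau)
sup_norm = np.max(np.linalg.norm(process, axis=1))         # illustrative summary only
```

At the largest covariate value the indicator equals one for every observation, so that row of `process` coincides with the full weighted score sum divided by \(\sqrt{n}\).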

Theorem 2

Under the conditions of Lemma 1, assuming that H1 and H2 hold, the empirical processes \(R_{n}^{2}\) and \(R_{n,W}^{2}\) can be written as:

$$\begin{aligned} R_{n}^{2}\left( \beta ,u\right)= & {} n^{-1/2}\sum _{i=1}^{n}\delta _{i}\psi _{\tau }\left( \epsilon _{i}\right) \left[ I\left( \beta ^TX_{i}\le u\right) -\mathbf S \left( \beta ,u\right) \mathbf S ^{-1}\right] \dot{g}\left( X_{i};\theta _0\right) \\&\quad +\,E\left[ p\left( X\right) f\left( 0|X\right) \dot{g}\left( X;\theta _0\right) h\left( X\right) I\left( \beta ^TX\le u\right) \right] \\&\quad -\,\mathbf S \left( \beta ,u\right) \mathbf S ^{-1}E\left[ p\left( X\right) h\left( X\right) f\left( 0|X\right) \dot{g}\left( X;\theta _0\right) \right] +o_{p}\left( 1\right) \end{aligned}$$

uniformly in \(\left( \beta ,u\right) \), where \(\mathbf S =E\left[ p\left( X\right) f\left( 0|X\right) \dot{g}\left( X;\theta _0\right) \dot{g}^T\left( X;\theta _0\right) \right] \) and \(\mathbf S \left( \beta ,u\right) =E\left[ p\left( X\right) f\left( 0|X\right) \dot{g}\left( X;\theta _0\right) \dot{g}^T\left( X;\theta _0\right) I\left( \beta ^TX\le u\right) \right] \).

Similarly,

$$\begin{aligned} R_{n,W}^{2}\left( \beta ,u\right)= & {} n^{-1/2}\sum _{i=1}^{n}\frac{\delta _{i}}{p\left( X_{i}\right) }\psi _{\tau }\left( \epsilon _{i}\right) \left[ I\left( \beta ^TX_{i}\le u\right) -\mathbf S _W\left( \beta ,u\right) \mathbf S _W^{-1}\right] \dot{g}\left( X_{i};\theta _0\right) \\&+\,E\left[ f\left( 0|X\right) \dot{g}\left( X;\theta _0\right) h\left( X\right) I\left( \beta ^TX\le u\right) \right] \\&-\,\mathbf S _W\left( \beta ,u\right) \mathbf S _W^{-1}E\left[ h\left( X\right) f\left( 0|X\right) \dot{g}\left( X;\theta _0\right) \right] +o_{p}\left( 1\right) \end{aligned}$$

uniformly in \(\left( \beta ,u\right) \), where \(\mathbf S _W=E\left[ f\left( 0|X\right) \dot{g}\left( X;\theta _0\right) \dot{g}^T\left( X;\theta _0\right) \right] \) and \(\mathbf S _W\left( \beta ,u\right) =E\left[ f\left( 0|X\right) \dot{g}\left( X;\theta _0\right) \dot{g}^T\left( X;\theta _0\right) I\left( \beta ^TX\le u\right) \right] .\)

Proof of Theorem 2

The proof of this theorem follows arguments similar to those of Theorem 1 in Conde-Amboage et al. (2015). \(\square \)
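The projected process of Theorem 2 replaces the indicator \(I\left( X_{i}\le t\right) \) by \(I\left( \beta ^TX_{i}\le u\right) \), which keeps the index set one-dimensional whatever the number of covariates. The sketch below assumes, by analogy with the weighted process in the proof of Theorem 1, that the sample version is \(n^{-1/2}\sum _{i}\delta _{i}\psi _{\tau }\left( \cdot \right) \dot{g}\left( \cdot \right) I\left( \beta ^TX_{i}\le u\right) /\hat{p}\left( X_{i}\right) \); the exact definition is in the main text, which is not reproduced here. Random directions on the unit sphere, the simulated data and the final maximum are illustrative choices.

```python
import numpy as np

def projected_process(X, y, delta, p_hat, theta, tau, beta):
    """Projected weighted process at u = beta^T X_j, j = 1..n, for a linear model."""
    n = X.shape[0]
    grad = np.column_stack([np.ones(n), X])               # g-dot = (1, x) for linear g
    resid = np.where(delta == 1, y - grad @ theta, 0.0)   # missing rows get weight 0
    score = (delta / p_hat * (tau - (resid < 0)))[:, None] * grad
    proj = X @ beta                                       # one-dimensional projection
    indicator = (proj[:, None] <= proj[None, :]).astype(float)
    return indicator.T @ score / np.sqrt(n)

rng = np.random.default_rng(2)
n, d, tau = 200, 2, 0.5
X = rng.uniform(0.0, 1.0, (n, d))
p_true = 0.6 + 0.3 * X[:, 0]                              # hypothetical missingness model
delta = (rng.uniform(size=n) < p_true).astype(float)
theta = np.array([1.0, 2.0, -1.0])                        # true parameter, for illustration
y = np.where(delta == 1, theta[0] + X @ theta[1:] + rng.normal(size=n), np.nan)

# Random directions beta on the unit sphere of R^d.
directions = rng.normal(size=(50, d))
directions /= np.linalg.norm(directions, axis=1, keepdims=True)

sup_over_directions = max(
    np.max(np.linalg.norm(projected_process(X, y, delta, p_true, theta, tau, b), axis=1))
    for b in directions
)
```

The cost per direction does not depend on how the indicator would scale with \(d\) in the unprojected process, which is one reason projections are attractive for higher-dimensional covariates.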

Extended simulation results

The design of the simulation study carried out in this work is described in Sect. 3. Some partial results were presented in that section; they are complemented here by the detailed results in this Appendix. Tables 8 and 9 provide the percentage of rejections for the tests based only on empirical processes, for \(d=1\) and \(d=2\) and different values of \(\alpha \). For tests based on empirical processes with projections, Tables 10 and 11 present analogous results for \(d=2\) and \(d=4\). Note that the results corresponding to \(\alpha =0.05\) coincide with those reported in Sect. 3.

In addition, for \(d=2\), the behaviour of all the proposed tests has been compared. The results are shown in Tables 12, 13 and 14, for different values of a and sample size \(n=100\).

Finally, Table 15 compares the behaviour of the tests when the true or an estimated missing-data model (p or \({\hat{p}}\)) is used, for different values of \(\alpha \) and \(n=100\).

Cite this article

Pérez-González, A., Cotos-Yáñez, T.R., González-Manteiga, W. et al. Goodness-of-fit tests for quantile regression with missing responses. Stat Papers 62, 1231–1264 (2021). https://doi.org/10.1007/s00362-019-01135-6

  • DOI: https://doi.org/10.1007/s00362-019-01135-6
