Conditional sure independence screening by conditional marginal empirical likelihood

Annals of the Institute of Statistical Mathematics

Abstract

In many applications, researchers know from previous investigations and experience that a certain set of predictors is related to the response. Based on this conditional information, we propose a conditional feature screening procedure that ranks conditional marginal empirical likelihood ratios. Because it uses centralized variables, the proposed screening approach works well in the presence of hidden important variables, of unimportant variables that are highly marginally correlated with the response, or of both. Moreover, by inheriting the advantages of the empirical likelihood approach, the new method is shown to be effective under less restrictive distributional assumptions, and it is computationally simple because it only needs to evaluate the conditional marginal empirical likelihood ratio at one point, without parameter estimation or iterative algorithms. The theoretical results show that the proposed procedure has sure screening properties. The merits of the procedure are illustrated by extensive numerical examples.
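
The ranking idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several choices are assumptions made only for this illustration: the "centralized" variables are approximated by least-squares residuals of the response and of each candidate predictor on the conditioning set, the empirical likelihood ratio is the generic Owen-style statistic for a zero mean of the product of those residuals, and the scalar Lagrange multiplier is located by bisection. The function names `el_ratio_at_zero` and `conditional_el_screen` are hypothetical.

```python
import numpy as np


def el_ratio_at_zero(g):
    """Owen-style empirical likelihood ratio statistic for H0: E[g] = 0.

    Returns 2 * sum(log(1 + lam * g_i)), where lam solves
    sum(g_i / (1 + lam * g_i)) = 0.
    """
    g = np.asarray(g, dtype=float)
    n = g.size
    g_min, g_max = g.min(), g.max()
    # If zero is not strictly inside the range of g, the constrained problem
    # has no interior solution; this sketch reports an infinite statistic.
    if g_min >= 0.0 or g_max <= 0.0:
        return np.inf
    # The implied weights satisfy 1 + lam * g_i >= 1/n, which brackets lam.
    lo = (1.0 / n - 1.0) / g_max
    hi = (1.0 / n - 1.0) / g_min
    # The score is strictly decreasing in lam, so bisection finds the root.
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        score = np.sum(g / (1.0 + mid * g))
        if score > 0.0:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    return 2.0 * np.sum(np.log1p(lam * g))


def conditional_el_screen(X, y, cond_idx):
    """Rank predictors outside cond_idx by the statistic above.

    Assumption of this sketch: centring is done by linear projection onto
    the conditioning predictors (plus an intercept).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    cond = set(cond_idx)
    Z = np.column_stack([np.ones(n), X[:, sorted(cond)]])

    def residual(v):
        # Remove the part of v that is linearly explained by Z.
        coef, *_ = np.linalg.lstsq(Z, v, rcond=None)
        return v - Z @ coef

    y_res = residual(y)
    stats = {}
    for j in range(p):
        if j in cond:
            continue
        stats[j] = el_ratio_at_zero(residual(X[:, j]) * y_res)
    # Larger statistics indicate stronger evidence of conditional association.
    ranking = sorted(stats, key=stats.get, reverse=True)
    return ranking, stats


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((200, 20))
    y = X[:, 0] + 2.0 * X[:, 1] + rng.standard_normal(200)
    ranking, _ = conditional_el_screen(X, y, cond_idx=[0])
    print(ranking[:5])
```

In this synthetic example, predictor 0 plays the role of the variable already known to be related to the response, and the remaining predictors are ranked after its linear effect has been removed. How the article itself constructs the centralized variables and evaluates the ratio may differ from these stand-ins.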

Acknowledgments

We thank the Editor, the Associate Editor and two referees for their constructive comments and suggestions, which have helped greatly improve the article. We are very grateful to Drs. Jinyuan Chang and Yichao Wu for sharing with us programs for implementing their methods.

Author information

Corresponding author

Correspondence to Qinqin Hu.

Additional information

The research was supported by NNSF projects (11171188, 11071145, 11221061, and 11231005) and the 111 project (B12023) of China, and by NSF and SRRF projects (ZR2010AZ001 and BS2011SF006) of Shandong Province of China.

About this article

Cite this article

Hu, Q., Lin, L. Conditional sure independence screening by conditional marginal empirical likelihood. Ann Inst Stat Math 69, 63–96 (2017). https://doi.org/10.1007/s10463-015-0534-9

  • DOI: https://doi.org/10.1007/s10463-015-0534-9
