Combining Survey and Non-survey Data for Improved Sub-area Prediction Using a Multi-level Model

Kim, Jae Kwang; Wang, Zhonglei; Zhu, Zhengyuan; Cruze, Nathan B.

doi:10.1007/s13253-018-0320-2

Published: 24 April 2018

Journal of Agricultural, Biological and Environmental Statistics
Volume 23, pages 175–189 (2018)

Jae Kwang Kim¹ (ORCID: orcid.org/0000-0002-0246-6029), Zhonglei Wang¹, Zhengyuan Zhu¹ & Nathan B. Cruze²

Abstract

Combining information from different sources is an important practical problem in survey sampling. Using a hierarchical area-level model, we establish a framework to integrate auxiliary information to improve state-level area estimates. The best predictors are obtained as the conditional expectations of latent variables given the observations, and an estimate of the mean squared prediction error is discussed. In work sponsored by the National Agricultural Statistics Service of the US Department of Agriculture, the proposed model is applied to the planted crop acreage estimation problem by combining information from three sources: the June Area Survey, which is obtained by probability-based sampling of land; administrative data on planted acreage; and the cropland data layer, a commodity-specific classification product derived from remote sensing data. The proposed model combines the available information at a sub-state level called the agricultural statistics district and aggregates it to improve state-level estimates of planted acreage for different crops. Supplementary materials accompanying this paper appear online.
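As a rough illustration of what "best predictor as a conditional expectation" means, the sketch below uses a generic normal-normal area-level setup (in the spirit of Fay and Herriot), not the paper's multi-level model, which is not specified in the abstract. It assumes a direct survey estimate y = θ + e with known sampling variance, and a latent value θ centred on a synthetic estimate built from auxiliary data with known model variance. The function name, argument names, and numbers are all hypothetical.

```python
def best_predictor(direct_estimate, sampling_variance,
                   synthetic_estimate, model_variance):
    """Conditional expectation E[theta | y] under an assumed normal-normal model.

    Assumed model (illustrative only, not taken from the paper):
        y     = theta + e,           e ~ N(0, sampling_variance)
        theta = synthetic + u,       u ~ N(0, model_variance)
    """
    # Weight given to the direct survey estimate; approaches 1 when the
    # survey is precise relative to the model.
    shrinkage = model_variance / (model_variance + sampling_variance)
    prediction = (shrinkage * direct_estimate
                  + (1.0 - shrinkage) * synthetic_estimate)
    # Conditional variance of theta given y when all variances are known;
    # it ignores the extra uncertainty from estimating model parameters.
    prediction_variance = shrinkage * sampling_variance
    return prediction, prediction_variance


# Hypothetical district-level inputs:
# (direct estimate, sampling variance, synthetic estimate, model variance)
districts = [
    (100.0, 25.0, 80.0, 75.0),
    (240.0, 90.0, 260.0, 30.0),
]

predictions = [best_predictor(*d) for d in districts]

# A state-level prediction as the sum of the district-level predictions.
state_prediction = sum(p for p, _ in predictions)
print(predictions, state_prediction)
```

The predictor moves each direct estimate toward the auxiliary-based value, more strongly where the survey estimate is noisier. Summing district-level predictions mirrors the aggregation to state level described in the abstract, under the stated assumptions only.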

Acknowledgements

We are grateful to three referees and the Associate Editor for their constructive comments. This research was supported by the National Agricultural Statistics Service of the US Department of Agriculture.

Author information

Authors and Affiliations

Department of Statistics, Iowa State University, Ames, IA, 50011, USA
Jae Kwang Kim, Zhonglei Wang & Zhengyuan Zhu
National Agricultural Statistics Service, United States Department of Agriculture, Washington, DC, 20250, USA
Nathan B. Cruze

Corresponding author

Correspondence to Jae Kwang Kim.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (zip 4 KB)

Supplementary material 2 (pdf 167 KB)

About this article

Cite this article

Kim, J.K., Wang, Z., Zhu, Z. et al. Combining Survey and Non-survey Data for Improved Sub-area Prediction Using a Multi-level Model. JABES 23, 175–189 (2018). https://doi.org/10.1007/s13253-018-0320-2

Received: 09 February 2017
Accepted: 03 March 2018
Published: 24 April 2018
Issue Date: June 2018
DOI: https://doi.org/10.1007/s13253-018-0320-2
