Abstract
Outlier detection techniques for spatial data should be able to identify two types of outliers: global and local. Local outliers typically have non-spatial attributes that differ strongly from those observed at their neighbors. Detecting them requires working locally, on neighborhoods, in order to account for the spatial dependence between the statistical units under consideration, even though outlyingness is usually measured on the non-spatial variables. Many procedures have been proposed in the literature, but far fewer are available when the non-spatial attributes are multivariate. This paper focuses on the multivariate context. Existing procedures are reviewed. A new approach, based on a two-step improvement of an existing one, is also designed and compared with the benchmark methods by means of examples and simulations.
Acknowledgments
This work was partially supported by the IAP Research Network P7/06 of the Belgian State.
Additional information
Responsible editor: Charu Aggarwal.
Appendix: Implementation: algorithm of the regularized spatial detection technique
About this article
Cite this article
Ernst, M., Haesbroeck, G. Comparison of local outlier detection techniques in spatial multivariate data. Data Min Knowl Disc 31, 371–399 (2017). https://doi.org/10.1007/s10618-016-0471-0
DOI: https://doi.org/10.1007/s10618-016-0471-0