ABSTRACT
Outlier mining is an important data analysis task to distinguish exceptional outliers from regular objects. However, in recent applications, traditional outlier mining approaches miss outliers that are hidden in subspace projections.
In this work, we propose a novel outlier ranking based on the degree of deviation in subspaces. Object deviation is measured only in a selection of relevant subspaces and is based on adaptive neighborhoods in these subspaces. We show that our approach outperforms competing outlier ranking approaches by detecting outliers in arbitrary subspaces.
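The idea of scoring deviation only in selected subspaces, with a neighborhood that adapts to each subspace, can be illustrated with a minimal sketch. This is not the paper's actual outlierness measure: the function names, the square-root radius rule in `adaptive_radius`, the neighbor-count density, and the toy data are all assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple

Point = Sequence[float]
Subspace = Tuple[int, ...]  # indices of the attributes that span the subspace


def adaptive_radius(base_radius: float, dims: int) -> float:
    """Hypothetical adaptation rule: widen the radius with sqrt(dimensionality),
    since distances grow as more attributes are included."""
    return base_radius * math.sqrt(dims)


def subspace_density(data: List[Point], idx: int, subspace: Subspace,
                     radius: float) -> int:
    """Count the neighbors of object idx within radius, using only the
    attributes of the given subspace."""
    p = data[idx]
    count = 0
    for j, q in enumerate(data):
        if j == idx:
            continue
        dist = math.sqrt(sum((p[a] - q[a]) ** 2 for a in subspace))
        if dist <= radius:
            count += 1
    return count


def outlier_scores(data: List[Point], subspaces: List[Subspace],
                   base_radius: float = 0.5) -> List[float]:
    """Average, over the selected subspaces, how far each object's density
    falls below the mean density in that subspace (0 = regular, 1 = isolated)."""
    n = len(data)
    scores = [0.0] * n
    for subspace in subspaces:
        radius = adaptive_radius(base_radius, len(subspace))
        densities = [subspace_density(data, i, subspace, radius) for i in range(n)]
        mean_density = sum(densities) / n
        if mean_density == 0:
            continue  # no object has neighbors here; nothing to compare against
        for i, d in enumerate(densities):
            scores[i] += max(0.0, 1.0 - d / mean_density)
    return [s / len(subspaces) for s in scores]


def rank_outliers(data: List[Point], subspaces: List[Subspace],
                  base_radius: float = 0.5) -> List[int]:
    """Return object indices sorted from most to least deviating."""
    scores = outlier_scores(data, subspaces, base_radius)
    return sorted(range(len(data)), key=lambda i: scores[i], reverse=True)


# Toy data (assumed): two clusters in attributes (0, 1) and one object that
# looks regular in each single attribute but fits neither cluster jointly.
cluster_a = [(0.0, 0.0), (0.1, 0.1), (0.2, 0.0), (0.0, 0.2), (0.1, 0.0)]
cluster_b = [(5.0, 5.0), (5.1, 5.1), (5.2, 5.0), (5.0, 5.2), (5.1, 5.0)]
toy_data = cluster_a + cluster_b + [(0.1, 5.1)]

hidden = outlier_scores(toy_data, [(0,)])      # one-dimensional view
visible = outlier_scores(toy_data, [(0, 1)])   # two-dimensional subspace
print(hidden[-1], visible[-1])
```

In this toy setting the last object receives no deviation in the one-dimensional projection but the maximal score in the two-dimensional subspace, which is the kind of hidden outlier the abstract refers to. How relevant subspaces are selected is left out of the sketch; here they are simply passed in by the caller.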
Index Terms
- Adaptive outlierness for subspace outlier ranking