Abstract.
In this paper, we propose a new tunable index scheme, called iMinMax(\(\theta\)), that maps each point in a high-dimensional space to a single-dimensional value determined by its maximum or minimum value among all dimensions. By varying the tuning "knob", \(\theta\), we obtain different families of iMinMax structures, each optimized for a different data distribution. The transformed data can then be indexed using an existing single-dimensional indexing structure such as the B+-tree. Queries in the high-dimensional space must be transformed into queries in the single-dimensional space and evaluated there. We present efficient algorithms for evaluating window queries as range queries on the single-dimensional space. We conducted an extensive performance study to evaluate the effectiveness of the proposed schemes. Our results show that iMinMax(\(\theta\)) outperforms existing techniques, including the Pyramid scheme and the VA-file, by a wide margin. We then describe how iMinMax could be used in approximate K-nearest neighbor (KNN) search, and we present a comparative study against the recently proposed iDistance, a specialized KNN indexing method.
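The mapping described above can be illustrated with a minimal Python sketch. The abstract does not state the exact transformation, so the rule used here is an assumption: points are taken to lie in the unit hypercube \([0,1]^d\), and a point is keyed by its minimum coordinate when \(x_{\min} + \theta < 1 - x_{\max}\) and by its maximum coordinate otherwise, with the dimension number added as an offset. The names `iminmax_key`, `build_index`, and `range_scan` are hypothetical, and a sorted list with binary search stands in for a B+-tree.

```python
from bisect import bisect_left, bisect_right


def iminmax_key(point, theta=0.0):
    """Map a point in [0, 1]^d to one value (assumed iMinMax-style rule)."""
    dims = range(len(point))
    d_min = min(dims, key=point.__getitem__)  # dimension holding the smallest value
    d_max = max(dims, key=point.__getitem__)  # dimension holding the largest value
    x_min, x_max = point[d_min], point[d_max]
    # Assumption: the point is keyed by whichever extreme lies closer to its
    # edge of the data space, with theta shifting that decision.
    if x_min + theta < 1.0 - x_max:
        return d_min + x_min
    return d_max + x_max


def build_index(points, theta=0.0):
    """Sorted (key, point) pairs; a stand-in for a B+-tree over the keys."""
    return sorted((iminmax_key(p, theta), p) for p in points)


def range_scan(index, lo, hi):
    """Return the points whose single-dimensional key lies in [lo, hi]."""
    keys = [k for k, _ in index]
    return [p for _, p in index[bisect_left(keys, lo):bisect_right(keys, hi)]]


points = [(0.1, 0.5, 0.8), (0.3, 0.9, 0.4), (0.6, 0.7, 0.2)]
index = build_index(points, theta=0.0)
hits = range_scan(index, 1.0, 2.0)  # keys that fall in dimension 1's interval
```

Under this assumed rule, a larger \(\theta\) makes the first branch less likely, so more points are keyed by their maximum value, and a smaller (negative) \(\theta\) pushes points toward their minimum value. The sketch covers only the key mapping and a plain range scan over keys; it does not implement the paper's window-query algorithm.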
Received: 21 May 2000, Revised: 14 March 2002, Published online: 8 April 2004
Edited by: M. Kitsuregawa.
Yu, C., Bressan, S., Ooi, B.C. et al. Querying high-dimensional data in single-dimensional space. The VLDB Journal 13, 105–119 (2004). https://doi.org/10.1007/s00778-004-0121-9
DOI: https://doi.org/10.1007/s00778-004-0121-9