Querying high-dimensional data in single-dimensional space

The VLDB Journal

Abstract.

In this paper, we propose a new tunable index scheme, called iMinMax(\(\theta\)), that maps each point in a high-dimensional space to a single-dimensional value determined by the point's maximum or minimum value among all dimensions. By varying the tuning “knob”, \(\theta\), we obtain different families of iMinMax structures, each optimized for a different data distribution. The transformed data can then be indexed using existing single-dimensional indexing structures such as the B+-tree. Queries in the high-dimensional space are transformed into queries in the single-dimensional space and evaluated there. We present efficient algorithms for evaluating window queries as range queries in the single-dimensional space. We conducted an extensive performance study to evaluate the effectiveness of the proposed schemes. Our results show that iMinMax(\(\theta\)) outperforms existing techniques, including the Pyramid scheme and the VA-file, by a wide margin. We then describe how iMinMax can be used for approximate K-nearest neighbor (KNN) search, and we present a comparative study against the recently proposed iDistance, a specialized KNN indexing method.
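The abstract does not spell out the mapping, so the following is only a minimal sketch of the idea under stated assumptions: data are normalized to the unit hypercube, and a point is keyed on its minimum coordinate when x_min + θ < 1 − x_max and on its maximum coordinate otherwise, which is how the iMinMax(θ) transformation is usually stated. The names `iminmax_key`, `build_index`, and `window_query` are hypothetical, a sorted Python list stands in for the B+-tree, and the window query uses one simple range per dimension followed by a filter step rather than the paper's optimized algorithms.

```python
from bisect import bisect_left, bisect_right


def iminmax_key(point, theta=0.0):
    """Map a point in [0, 1]^d to one value (assumed form of the rule)."""
    d_min = min(range(len(point)), key=point.__getitem__)
    d_max = max(range(len(point)), key=point.__getitem__)
    x_min, x_max = point[d_min], point[d_max]
    # theta shifts the balance between keying on the minimum edge
    # and keying on the maximum edge.
    if x_min + theta < 1.0 - x_max:
        return d_min + x_min
    return d_max + x_max


def build_index(points, theta=0.0):
    """Sorted (key, point) pairs; a stand-in for a B+-tree on the keys."""
    return sorted((iminmax_key(p, theta), tuple(p)) for p in points)


def window_query(index, low, high):
    """Answer a window query with one 1-D range scan per dimension.

    A point inside the window is keyed on some dimension j with a
    coordinate in [low[j], high[j]], so its key lies in
    [j + low[j], j + high[j]]. Candidates are then filtered exactly.
    """
    keys = [k for k, _ in index]
    seen, hits = set(), []
    for j in range(len(low)):
        start = bisect_left(keys, j + low[j])
        stop = bisect_right(keys, j + high[j])
        for pos in range(start, stop):
            point = index[pos][1]
            inside = all(l <= x <= h for x, l, h in zip(point, low, high))
            if inside and pos not in seen:
                seen.add(pos)
                hits.append(point)
    return hits


if __name__ == "__main__":
    data = [(0.1, 0.5), (0.9, 0.95), (0.2, 0.3)]
    idx = build_index(data, theta=0.0)
    print(window_query(idx, low=(0.0, 0.2), high=(0.3, 0.6)))
```

In this sketch a large positive θ pushes every point onto its maximum edge and a large negative θ onto its minimum edge, which is one way to read the "knob" described above; the choice of θ for a given data distribution is left to the paper.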

Author information

Corresponding author

Correspondence to Cui Yu.

Additional information

Received: 21 May 2000, Revised: 14 March 2002, Published online: 8 April 2004

Edited by: M. Kitsuregawa.

About this article

Cite this article

Yu, C., Bressan, S., Ooi, B.C. et al. Querying high-dimensional data in single-dimensional space. VLDB 13, 105–119 (2004). https://doi.org/10.1007/s00778-004-0121-9

  • DOI: https://doi.org/10.1007/s00778-004-0121-9
