Querying high-dimensional data in single-dimensional space

Yu, Cui; Bressan, Stéphane; Ooi, Beng Chin; Tan, Kian-Lee

doi:10.1007/s00778-004-0121-9

Published: May 2004

Volume 13, pages 105–119 (2004)

The VLDB Journal

Cui Yu¹, Stéphane Bressan², Beng Chin Ooi² & Kian-Lee Tan²

Abstract.

In this paper, we propose a new tunable index scheme, called iMinMax(\(\theta\)), that maps each point in a high-dimensional space to a single-dimensional value determined by its maximum or minimum coordinate value among all dimensions. By varying the tuning “knob”, \(\theta\), we can obtain different families of iMinMax structures that are optimized for different data distributions. The transformed data can then be indexed using an existing single-dimensional indexing structure such as the B⁺-tree. Queries in the high-dimensional space must be transformed into queries in the single-dimensional space and evaluated there. We present efficient algorithms for evaluating window queries as range queries on the single-dimensional space. We conducted an extensive performance study to evaluate the effectiveness of the proposed schemes. Our results show that iMinMax(\(\theta\)) outperforms existing techniques, including the Pyramid scheme and the VA-file, by a wide margin. We then describe how iMinMax could be used in approximate K-nearest neighbor (KNN) search, and we present a comparative study against the recently proposed iDistance, a specialized KNN indexing method.
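
The mapping and the window-to-range transformation described in the abstract can be illustrated with a minimal Python sketch. The abstract does not state the mapping formula, so the rule below is an assumption based on the commonly cited form of iMinMax(\(\theta\)): for a point in the unit hypercube, the key is `d_min + x_min` when `x_min + theta < 1 - x_max`, and `d_max + x_max` otherwise. The names `iminmax_key`, `build_index`, and `window_query` are hypothetical, a sorted Python list stands in for the B⁺-tree, and the window query uses one deliberately conservative range per dimension followed by a filtering step, not the authors' optimized algorithm.

```python
import bisect
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def iminmax_key(point: Sequence[float], theta: float = 0.0) -> float:
    """Map a point in [0, 1]^d to a single-dimensional key.

    Assumed rule (not given in the abstract): index the point on its
    minimum coordinate when x_min + theta < 1 - x_max, otherwise on its
    maximum coordinate. The integer part of the key is the chosen
    dimension, the fractional part is the coordinate value.
    """
    d_min = min(range(len(point)), key=lambda i: point[i])
    d_max = max(range(len(point)), key=lambda i: point[i])
    x_min, x_max = point[d_min], point[d_max]
    if x_min + theta < 1.0 - x_max:
        return d_min + x_min
    return d_max + x_max


def build_index(points: Sequence[Sequence[float]],
                theta: float = 0.0) -> List[Tuple[float, Point]]:
    """Return (key, point) pairs sorted by key (stand-in for a B+-tree)."""
    return sorted((iminmax_key(p, theta), tuple(p)) for p in points)


def window_query(index: List[Tuple[float, Point]],
                 low: Sequence[float],
                 high: Sequence[float]) -> List[Point]:
    """Answer the window [low, high] with one 1-D range scan per dimension.

    A point inside the window that was keyed on dimension j has
    low[j] <= p[j] <= high[j], so its key lies in [j + low[j], j + high[j]].
    Scanning that range for every j therefore finds every answer;
    candidates outside the window are filtered out afterwards.
    """
    keys = [k for k, _ in index]
    hits = set()  # positions in the index, so boundary keys are not reported twice
    for j in range(len(low)):
        start = bisect.bisect_left(keys, j + low[j])
        stop = bisect.bisect_right(keys, j + high[j])
        for pos in range(start, stop):
            p = index[pos][1]
            if all(lo <= v <= hi for lo, v, hi in zip(low, p, high)):
                hits.add(pos)
    return [index[pos][1] for pos in sorted(hits)]


if __name__ == "__main__":
    data = [(0.1, 0.5, 0.3), (0.8, 0.9, 0.7), (0.4, 0.45, 0.5)]
    idx = build_index(data, theta=0.0)
    print(window_query(idx, low=(0.0, 0.4, 0.2), high=(0.5, 0.6, 0.6)))
```

In this sketch a negative `theta` pushes more points onto their minimum coordinate and a positive `theta` onto their maximum, which is one way to picture the tuning "knob" for skewed data. The per-dimension ranges here are wider than necessary; a tighter transformation would prune more candidates before the filtering step.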

Author information

Authors and Affiliations

Department of Computer Science, Monmouth University, West Long Branch, NJ 07764, USA
Cui Yu
Department of Computer Science, National University of Singapore, 3 Science Drive 2, 117543, Singapore
Stéphane Bressan, Beng Chin Ooi & Kian-Lee Tan

Corresponding author

Correspondence to Cui Yu.

Additional information

Received: 21 May 2000, Revised: 14 March 2002, Published online: 8 April 2004

Edited by: M. Kitsuregawa.

About this article

Cite this article

Yu, C., Bressan, S., Ooi, B.C. et al. Querying high-dimensional data in single-dimensional space. VLDB 13, 105–119 (2004). https://doi.org/10.1007/s00778-004-0121-9

Issue Date: May 2004
DOI: https://doi.org/10.1007/s00778-004-0121-9
