Fast filtering false active subspaces for efficient high dimensional similarity processing

Wang, GuoRen; Yu, Ge; Xin, JunChang; Zhao, YuHai; Zhang, EnDe

doi:10.1007/s11432-009-0051-7

Published: 23 January 2009

Volume 52, pages 286–294, (2009)
Science in China Series F: Information Sciences

Abstract

The query space of a similarity query is usually narrowed down by pruning inactive query subspaces, which contain no query results, and keeping active query subspaces, which may contain objects that satisfy the request. However, some active query subspaces contain no query results at all; these are called false active query subspaces. Query processing performance degrades in the presence of false active query subspaces. Our experiments show that this problem becomes serious when the data are high dimensional, and that the number of accesses to false active subspaces increases as the dimensionality increases. To solve this problem, this paper proposes a space mapping approach that reduces such unnecessary accesses: a given query space can be refined by filtering within its mapped space. A mapping strategy called maxgap is proposed to improve the efficiency of this refinement. Based on the mapping strategy, an index structure called MS-tree and the corresponding query processing algorithms are presented. Finally, the performance of MS-tree is compared with that of other competitors in terms of range queries on a real data set.
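
To make the notion of a false active subspace concrete, the following is a minimal sketch and not the paper's method. It does not implement maxgap or the MS-tree, whose details are not given in the abstract. It assumes a subspace is a bounding sphere over its points and swaps in a generic pivot-distance mapping: each point is mapped to its distance from a hypothetical reference point (`pivot`), and the subspace stores the interval of mapped values. By the triangle inequality, any result of a range query with center `q` and radius `r` must map into the interval from d(q, pivot) - r to d(q, pivot) + r, so a subspace whose stored interval misses that range can be skipped. The class name, method names, and data are illustrative assumptions.

```python
import math


class Subspace:
    """Bounding sphere over a point set plus a 1-D mapped interval.

    Hypothetical structure for illustration only; not the MS-tree.
    """

    def __init__(self, points, pivot):
        self.points = points
        dim = len(points[0])
        # Sphere center is the centroid; radius covers every point.
        self.center = tuple(
            sum(p[i] for p in points) / len(points) for i in range(dim)
        )
        self.radius = max(math.dist(self.center, p) for p in points)
        # Mapped space: distance of each point to the pivot, kept as an interval.
        self.pivot = pivot
        mapped = [math.dist(pivot, p) for p in points]
        self.lo, self.hi = min(mapped), max(mapped)

    def is_active(self, q, r):
        """Original-space test: the query ball intersects the bounding sphere."""
        return math.dist(q, self.center) <= self.radius + r

    def survives_mapping(self, q, r):
        """Mapped-space test: the query's mapped range overlaps [lo, hi]."""
        dq = math.dist(q, self.pivot)
        return dq - r <= self.hi and dq + r >= self.lo

    def range_query(self, q, r):
        """Exhaustive scan of the subspace (the access we would like to avoid)."""
        return [p for p in self.points if math.dist(q, p) <= r]


# Points on a diagonal line; the bounding sphere covers much empty space.
points = [(float(t), float(t)) for t in range(11)]
sub = Subspace(points, pivot=(10.0, 0.0))

q, r = (8.0, 2.0), 1.0
print(sub.is_active(q, r))         # sphere test says the subspace is active
print(sub.range_query(q, r))       # yet it holds no results: a false active subspace
print(sub.survives_mapping(q, r))  # the mapped-space filter rejects it without a scan
```

In this sketch the query near the corner of the bounding sphere passes the sphere test but is rejected in the mapped space, while a query centered on the line, such as `(5.0, 5.0)` with radius `1.0`, passes both tests and is scanned as usual. The filter never discards a subspace that holds a result; it only removes some of the false active ones, and how many depends on the choice of mapping.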

Author information

Authors and Affiliations

School of Information Science and Engineering, Northeastern University, Shenyang, 110004, China
GuoRen Wang, Ge Yu, JunChang Xin, YuHai Zhao & EnDe Zhang

Corresponding author

Correspondence to GuoRen Wang.

Additional information

Supported by the National Basic Research Program of China (Grant No. 2006CB303103), the National Natural Science Foundation of China (Grant Nos. 60873011, 60802026, 60773219, 60773021) and the High Technology Program (Grant No. 2007AA01Z192).

About this article

Cite this article

Wang, G., Yu, G., Xin, J. et al. Fast filtering false active subspaces for efficient high dimensional similarity processing. Sci. China Ser. F-Inf. Sci. 52, 286–294 (2009). https://doi.org/10.1007/s11432-009-0051-7

Received: 19 May 2008
Accepted: 20 September 2008
Published: 23 January 2009
Issue Date: February 2009
DOI: https://doi.org/10.1007/s11432-009-0051-7