Principal Component Hashing: An Accelerated Approximate Nearest Neighbor Search

Matsushita, Yusuke; Wada, Toshikazu

doi:10.1007/978-3-540-92957-4_33

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 5414)

Abstract

Nearest neighbor (NN) search is a basic operation in data mining and machine learning applications, but accelerating it in high-dimensional spaces is difficult. Approximate NN search algorithms have been investigated to address this problem. In particular, locality-sensitive hashing (LSH) has recently attracted attention because it offers a clear relationship between the relative error ratio and the computational complexity. However, p-stable LSH computes hash values independently of the data distribution, so the search sometimes fails or takes a considerably long time. To solve this problem, we propose Principal Component Hashing (PCH), which exploits the distribution of the stored data. Experiments confirmed that PCH is faster than ANN and LSH at the same accuracy.
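
The contrast drawn in the abstract between data-independent and data-dependent hashing can be illustrated with a minimal Python sketch. The first function is the standard p-stable LSH hash, floor((a . x + b) / w), where a is drawn from a p-stable distribution (Gaussian for the Euclidean case) and b is uniform in [0, w). The remaining functions are only an assumed illustration of hashing along a principal axis with equal-frequency buckets; the abstract does not give the details of PCH, so the names `top_principal_axis`, `quantile_edges`, and `pc_hash`, and the bucketing rule itself, are hypothetical and should not be read as the authors' algorithm.

```python
import bisect
import math
import random


def pstable_hash(x, a, b, w):
    """Data-independent p-stable LSH hash: floor((a . x + b) / w)."""
    return math.floor((sum(ai * xi for ai, xi in zip(a, x)) + b) / w)


def project(x, mean, axis):
    """Scalar projection of the centered point x onto a unit axis."""
    return sum((xi - mi) * ai for xi, mi, ai in zip(x, mean, axis))


def top_principal_axis(data, iters=100):
    """Estimate the first principal axis by power iteration (illustrative).

    The fixed start vector keeps the sketch deterministic; it would fail
    to converge if it were exactly orthogonal to the principal axis.
    """
    n, d = len(data), len(data[0])
    mean = [sum(row[j] for row in data) / n for j in range(d)]
    centered = [[row[j] - mean[j] for j in range(d)] for row in data]
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        # Apply X^T X to v without forming the covariance matrix.
        proj = [sum(r[j] * v[j] for j in range(d)) for r in centered]
        u = [sum(proj[i] * centered[i][j] for i in range(n)) for j in range(d)]
        norm = math.sqrt(sum(c * c for c in u))
        if norm == 0.0:
            break
        v = [c / norm for c in u]
    return mean, v


def quantile_edges(values, n_buckets):
    """Bucket boundaries that give each bucket roughly equal counts."""
    s = sorted(values)
    return [s[(k * len(s)) // n_buckets] for k in range(1, n_buckets)]


def pc_hash(x, mean, axis, edges):
    """Hypothetical data-dependent hash: bucket index along the axis."""
    return bisect.bisect_right(edges, project(x, mean, axis))


# Usage on synthetic data that is strongly stretched along one direction.
rng = random.Random(0)
data = [[rng.gauss(0.0, 10.0), rng.gauss(0.0, 0.1)] for _ in range(200)]

a = [rng.gauss(0.0, 1.0) for _ in range(2)]  # random, ignores the data
b = rng.uniform(0.0, 4.0)
lsh_keys = [pstable_hash(x, a, b, 4.0) for x in data]

mean, axis = top_principal_axis(data)        # fitted to the data
edges = quantile_edges([project(x, mean, axis) for x in data], 8)
pc_keys = [pc_hash(x, mean, axis, edges) for x in data]
```

In this sketch the random direction `a` may or may not align with the direction in which the data actually spread, so bucket occupancy under `pstable_hash` depends on luck, whereas the quantile boundaries along the fitted axis keep bucket occupancy roughly balanced by construction.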

Author information

Authors and Affiliations

Graduate School of Systems Engineering, Wakayama University, 930 Sakaedani, Wakayama, 640-8510, Japan
Yusuke Matsushita & Toshikazu Wada

Editor information

Editors and Affiliations

Department of Computer and Communication Science, Wakayama University, 930 Sakaedani, Wakayama-shi, 640 8510, Wakayama, Japan
Toshikazu Wada
Institute of Computer Science and Information Engineering, National Ilan University, No. 1, Sec. 1, Shen-Lung Rd., 26047, Yi-Lan, Taiwan, ROC
Fay Huang
Microsoft Research Asia, Beijing Sigma Center, 5003, No. 49, Zhichun Road, 100190, Beijing, PR China
Stephen Lin

About this paper

Cite this paper

Matsushita, Y., Wada, T. (2009). Principal Component Hashing: An Accelerated Approximate Nearest Neighbor Search. In: Wada, T., Huang, F., Lin, S. (eds) Advances in Image and Video Technology. PSIVT 2009. Lecture Notes in Computer Science, vol 5414. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-92957-4_33

DOI: https://doi.org/10.1007/978-3-540-92957-4_33
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-92956-7
Online ISBN: 978-3-540-92957-4
eBook Packages: Computer Science, Computer Science (R0)
