Abstract
Similarity search, or approximate nearest neighbor search, is an increasingly important tool for finding the closest matches to a given query object in large-scale databases. Recently, learning-based hashing methods have attracted considerable attention due to their computational and memory efficiency. The basic idea of these approaches is to generate binary codes for data points such that the codes preserve the similarity between any two points. In this paper, we propose a novel algorithm named Approximate Bit-Vector (ABV) for hashing-based similarity search. The ABV algorithm maps data points into Hamming space and integrates with hash functions for fast similarity or k-NN search. Extensive experimental results on real large-scale datasets demonstrate the superiority of the proposed approach.
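To make the general idea of hashing-based similarity search concrete, the following is a minimal sketch. It is not the ABV algorithm, whose details are not given in the abstract. It assumes a standard random-hyperplane scheme to produce binary codes and ranks candidates by Hamming distance. All names (`make_hyperplanes`, `binary_code`, `hamming`, `knn_search`) and the toy data are hypothetical and chosen only for illustration.

```python
import random


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw num_bits random Gaussian hyperplanes in dim dimensions (assumed scheme)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def binary_code(point, hyperplanes):
    """Map a point to a binary code: one bit per hyperplane, packed into an int."""
    code = 0
    for plane in hyperplanes:
        dot = sum(p * w for p, w in zip(point, plane))
        code = (code << 1) | (1 if dot >= 0 else 0)
    return code


def hamming(a, b):
    """Hamming distance between two codes stored as ints."""
    return bin(a ^ b).count("1")


def knn_search(query, codes, hyperplanes, k):
    """Return indices of the k database codes closest to the query in Hamming space."""
    q = binary_code(query, hyperplanes)
    ranked = sorted(range(len(codes)), key=lambda i: hamming(q, codes[i]))
    return ranked[:k]


# Toy usage: encode a small database once, then answer a 2-NN query.
planes = make_hyperplanes(num_bits=32, dim=3)
database = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [-1.0, 0.0, 0.0], [0.0, -1.0, 0.2]]
codes = [binary_code(p, planes) for p in database]
nearest = knn_search([1.0, 0.05, 0.0], codes, planes, k=2)
```

In this sketch, points separated by a small angle tend to agree on most bits, so a short Hamming distance serves as a cheap proxy for similarity. The linear scan over codes is kept for clarity; a practical system would typically index the codes in hash tables instead.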
Acknowledgments
This work was supported by the Science and Technology Plan Projects of Jilin city (No. 201464059), by the Ph.D. Scientific Research Start-up Capital Project of Northeast Dianli University (No. BSJXM-201319), and by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT & Future Planning (No. 2013R1A2A2A01068923).
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Wang, L., Zhou, T.H., Liu, Z.H., Qu, Z.Y., Ryu, K.H. (2015). Approximate Bit-Vector Algorithms for Hashing-Based Similarity Searches. In: Huang, DS., Bevilacqua, V., Premaratne, P. (eds) Intelligent Computing Theories and Methodologies. ICIC 2015. Lecture Notes in Computer Science, vol 9225. Springer, Cham. https://doi.org/10.1007/978-3-319-22180-9_61
DOI: https://doi.org/10.1007/978-3-319-22180-9_61
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-22179-3
Online ISBN: 978-3-319-22180-9
eBook Packages: Computer Science, Computer Science (R0)