
Approximate Bit-Vector Algorithms for Hashing-Based Similarity Searches

  • Conference paper
Intelligent Computing Theories and Methodologies (ICIC 2015)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9225)

Abstract

Similarity search, or finding approximate nearest neighbors, is an increasingly important tool for finding the closest matches to a given query object in large-scale databases. Recently, learning-based hashing methods have attracted considerable attention because of their computational and memory efficiency. The basic idea of these approaches is to generate binary codes for data points that preserve the similarity between any two of them. In this paper, we propose a novel algorithm named Approximate Bit-Vector (ABV) for hashing-based similarity search. The ABV algorithm maps data points into Hamming space and integrates with hash functions for fast similarity or k-NN search. Extensive experimental results on real large-scale datasets demonstrate the superiority of the proposed approach.
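The general idea of mapping points into Hamming space and ranking by bit differences can be illustrated with a minimal sketch. This is not the ABV algorithm, whose details are not given in the abstract; it assumes plain random-hyperplane (sign random projection) hashing as a stand-in, and the names `make_hyperplanes`, `encode`, `hamming`, and `knn` are hypothetical helpers introduced only for this illustration.

```python
import random


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw one random Gaussian hyperplane per output bit (assumed hash family)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def encode(point, hyperplanes):
    """Map a real-valued point to a binary code packed into a Python int.

    Each bit records on which side of a hyperplane the point lies.
    """
    code = 0
    for plane in hyperplanes:
        dot = sum(p * w for p, w in zip(point, plane))
        code = (code << 1) | (1 if dot >= 0 else 0)
    return code


def hamming(a, b):
    """Hamming distance between two codes: XOR, then count the set bits."""
    return bin(a ^ b).count("1")


def knn(query, database_codes, hyperplanes, k):
    """Return the indices of the k database codes closest to the query code."""
    q = encode(query, hyperplanes)
    ranked = sorted(range(len(database_codes)),
                    key=lambda i: hamming(q, database_codes[i]))
    return ranked[:k]


# Toy usage: four 2-D points hashed to 16-bit codes.
points = [[1.0, 0.0], [0.9, 0.1], [-1.0, 0.0], [0.0, -1.0]]
planes = make_hyperplanes(num_bits=16, dim=2)
codes = [encode(p, planes) for p in points]
nearest = knn([1.0, 0.0], codes, planes, k=2)
```

Packing the code into a single integer keeps the distance computation to one XOR and a bit count, which is where the computational and memory efficiency of binary codes comes from. A linear scan is used here for brevity; a practical system would typically index the codes in hash tables instead.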

Acknowledgments

This work was supported by the Science and Technology Plan Projects of Jilin city (No. 201464059), by the Ph.D. Scientific Research Start-up Capital Project of Northeast Dianli University (No. BSJXM-201319), and by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT & Future Planning (No. 2013R1A2A2A01068923).

Author information

Corresponding authors

Correspondence to Ling Wang or Keun Ho Ryu.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Wang, L., Zhou, T.H., Liu, Z.H., Qu, Z.Y., Ryu, K.H. (2015). Approximate Bit-Vector Algorithms for Hashing-Based Similarity Searches. In: Huang, DS., Bevilacqua, V., Premaratne, P. (eds) Intelligent Computing Theories and Methodologies. ICIC 2015. Lecture Notes in Computer Science, vol 9225. Springer, Cham. https://doi.org/10.1007/978-3-319-22180-9_61

  • DOI: https://doi.org/10.1007/978-3-319-22180-9_61

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-22179-3

  • Online ISBN: 978-3-319-22180-9

  • eBook Packages: Computer Science, Computer Science (R0)
