Cardinality Computing: A New Step Towards Fully Representing Multi-sets by Bloom Filters

  • Conference paper
Web Information Systems – WISE 2006 (WISE 2006)

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 4255))

Abstract

Bloom Filters are space- and time-efficient randomized data structures for representing (multi-)sets with certain allowable errors, and are widely used in many applications. Previous work on Bloom Filters considered how to support insertions, deletions, membership queries, and multiplicity queries over (multi-)sets. In this paper, we introduce two novel algorithms for computing cardinalities of multi-sets represented by Bloom Filters, which extend the functionality of the Bloom Filter and thus make it usable in a variety of new applications. The Bloom structure presented in the previous work is used without any modification, and our algorithms do not affect its existing functionality. Because Bloom Filters can thus support cardinality computing in addition to insertions, deletions, membership queries, and multiplicity queries simultaneously, our work is a new step towards fully representing multi-sets by Bloom Filters. Performance analysis and experimental results show the differences between the two algorithms and show that our algorithms perform well in most cases.
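
The operations listed in the abstract can be illustrated with a small counter-based Bloom filter. The sketch below is not the paper's method: its two cardinality algorithms are not described on this page. The class name `CountingBloomFilter`, the salted SHA-256 hashing, and the parameters `m` (number of counters) and `k` (number of hash functions) are assumptions made for illustration. Multiplicity is answered with the common minimum-counter rule, the total size is recovered from the counter sum, and the distinct count uses the standard zero-counter estimate n ≈ -(m/k) ln(z/m), where z is the number of counters equal to zero.

```python
import hashlib
import math


class CountingBloomFilter:
    """Counter-based Bloom filter for multi-sets (illustrative sketch only)."""

    def __init__(self, m: int, k: int) -> None:
        self.m = m                  # number of counters (assumed parameter)
        self.k = k                  # number of hash functions (assumed parameter)
        self.counters = [0] * m

    def _positions(self, item: str) -> list[int]:
        # Derive k counter positions by salting SHA-256 with the hash index.
        positions = []
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            positions.append(int.from_bytes(digest[:8], "big") % self.m)
        return positions

    def insert(self, item: str) -> None:
        for p in self._positions(item):
            self.counters[p] += 1

    def delete(self, item: str) -> None:
        # Only meaningful for items that were inserted earlier.
        positions = self._positions(item)
        if min(self.counters[p] for p in positions) == 0:
            raise ValueError("item is not in the filter")
        for p in positions:
            self.counters[p] -= 1

    def multiplicity(self, item: str) -> int:
        # Minimum counter: may overestimate, never underestimates.
        return min(self.counters[p] for p in self._positions(item))

    def contains(self, item: str) -> bool:
        return self.multiplicity(item) > 0

    def total_count(self) -> int:
        # Every insert adds exactly k increments, so the sum is exact.
        return sum(self.counters) // self.k

    def estimate_distinct(self) -> float:
        # Zero-counter estimate of the number of distinct items.
        zeros = self.counters.count(0)
        if zeros == 0:
            return float("inf")     # filter saturated; no estimate possible
        return -(self.m / self.k) * math.log(zeros / self.m)
```

For example, after inserting 200 distinct strings into `CountingBloomFilter(m=4096, k=4)`, `estimate_distinct()` should land close to 200, while `total_count()` returns the exact number of insertions minus deletions.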

Supported by State Key Laboratory of Networking and Switching Technology, NSFC Grant 60473051 and 60503037, and NSFBC Grant 4062018.

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Zhao, J., Yang, D., Chen, L., Gao, J., Wang, T. (2006). Cardinality Computing: A New Step Towards Fully Representing Multi-sets by Bloom Filters. In: Aberer, K., Peng, Z., Rundensteiner, E.A., Zhang, Y., Li, X. (eds) Web Information Systems – WISE 2006. WISE 2006. Lecture Notes in Computer Science, vol 4255. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11912873_26

  • DOI: https://doi.org/10.1007/11912873_26

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-48105-8

  • Online ISBN: 978-3-540-48107-2

  • eBook Packages: Computer Science, Computer Science (R0)
