N-Gram Analysis Based on Zero-Suppressed BDDs

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4384)

Abstract

In this paper, we propose a new method of n-gram analysis using zero-suppressed BDDs (ZBDDs). ZBDDs are known as a compact representation of combinatorial item sets. Here, we apply ZBDD-based techniques to handle sets of sequences efficiently. Using the algebraic operations defined over ZBDDs, such as union, intersection, and difference, we can carry out various kinds of processing and analysis on large-scale sequence data. We conducted experiments on generating n-gram statistics from real document files. The results indicate the potential of the ZBDD-based method for sequence database analysis.
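To make the idea of set operations over sequences concrete, the following is a minimal Python sketch of a ZBDD with union, intersection, and difference, applied to the character 3-grams of two short strings. It is an illustration under stated assumptions, not the authors' implementation: the item encoding (position times 256 plus character code, so characters must have code points below 256), the helper names (`get_node`, `from_set`, `ngram_family`, `decode`), and the sample strings are all hypothetical. The sketch records only which n-grams occur, not how often, so it does not cover the frequency statistics mentioned in the abstract.

```python
import math
from functools import lru_cache


class Node:
    """ZBDD node; terminals carry var = infinity so they sort below every item."""
    __slots__ = ("var", "lo", "hi")

    def __init__(self, var, lo=None, hi=None):
        self.var, self.lo, self.hi = var, lo, hi


EMPTY = Node(math.inf)  # the family containing no sets
BASE = Node(math.inf)   # the family containing only the empty set
_unique = {}            # unique table, so equal subgraphs are shared


def get_node(var, lo, hi):
    if hi is EMPTY:     # zero-suppression rule: drop nodes whose 1-edge is EMPTY
        return lo
    key = (var, lo, hi)
    if key not in _unique:
        _unique[key] = Node(var, lo, hi)
    return _unique[key]


@lru_cache(maxsize=None)
def union(f, g):
    if f is EMPTY:
        return g
    if g is EMPTY or f is g:
        return f
    if f.var < g.var:
        return get_node(f.var, union(f.lo, g), f.hi)
    if f.var > g.var:
        return get_node(g.var, union(f, g.lo), g.hi)
    return get_node(f.var, union(f.lo, g.lo), union(f.hi, g.hi))


@lru_cache(maxsize=None)
def intersect(f, g):
    if f is EMPTY or g is EMPTY:
        return EMPTY
    if f is g:
        return f
    if f.var < g.var:
        return intersect(f.lo, g)
    if f.var > g.var:
        return intersect(f, g.lo)
    return get_node(f.var, intersect(f.lo, g.lo), intersect(f.hi, g.hi))


@lru_cache(maxsize=None)
def diff(f, g):
    if f is EMPTY or f is g:
        return EMPTY
    if g is EMPTY:
        return f
    if f.var < g.var:
        return get_node(f.var, diff(f.lo, g), f.hi)
    if f.var > g.var:
        return diff(f, g.lo)
    return get_node(f.var, diff(f.lo, g.lo), diff(f.hi, g.hi))


def from_set(items):
    """Build the ZBDD for a family holding exactly one item set."""
    f = BASE
    for v in sorted(items, reverse=True):
        f = get_node(v, EMPTY, f)
    return f


def ngram_family(text, n):
    """Assumed encoding: character c at offset k becomes item k * 256 + ord(c)."""
    f = EMPTY
    for i in range(len(text) - n + 1):
        gram = text[i:i + n]
        f = union(f, from_set(k * 256 + ord(c) for k, c in enumerate(gram)))
    return f


def sets_of(f):
    """Enumerate the item sets of a family, each as an ascending tuple."""
    if f is EMPTY:
        return
    if f is BASE:
        yield ()
        return
    yield from sets_of(f.lo)
    for s in sets_of(f.hi):
        yield (f.var,) + s


def decode(f):
    """Turn a family back into a Python set of n-gram strings."""
    return {"".join(chr(v % 256) for v in s) for s in sets_of(f)}


a = ngram_family("abracadabra", 3)
b = ngram_family("cadabra", 3)
common = decode(intersect(a, b))  # 3-grams present in both strings
only_a = decode(diff(a, b))       # 3-grams present only in the first string
```

Because each n-gram becomes an ordinary item set, questions such as "which n-grams appear in both documents" reduce to a single ZBDD operation rather than a scan over the sequences.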



Editor information

Takashi Washio, Ken Satoh, Hideaki Takeda, Akihiro Inokuchi

Copyright information

© 2007 Springer Berlin Heidelberg

About this paper

Cite this paper

Kurai, R., Minato, Si., Zeugmann, T. (2007). N-Gram Analysis Based on Zero-Suppressed BDDs. In: Washio, T., Satoh, K., Takeda, H., Inokuchi, A. (eds) New Frontiers in Artificial Intelligence. JSAI 2006. Lecture Notes in Computer Science, vol 4384. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-69902-6_25

  • DOI: https://doi.org/10.1007/978-3-540-69902-6_25

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-69901-9

  • Online ISBN: 978-3-540-69902-6

  • eBook Packages: Computer Science, Computer Science (R0)
