TMG Framework for Mining Unordered Subtrees

Hadzic, Fedja; Tan, Henry; Dillon, Tharam S.

doi:10.1007/978-3-642-17557-2_6

Fedja Hadzic,
Henry Tan &
Tharam S. Dillon

Part of the book series: Studies in Computational Intelligence ((SCI,volume 333))

794 Accesses

Abstract

This chapter describes the extension of the TMG framework for the mining of unordered induced/embedded subtrees. While in online tree-structured documents such as XML the information is presented in a particular order, in many applications the order among the sibling-nodes is considered unimportant or irrelevant to the task and is often not available. If one is interested in comparing different document structures, or the document is composed of data from several heterogeneous sources, it is very common for the order of sibling nodes to differ, although the information contained in the structure is essentially the same. In these cases, mining of unordered subtrees is much more suitable as a user can pose queries and does not have to worry about the order. All matching sub-structures will be returned with the difference being that the order of sibling nodes is not used as an additional candidate grouping criterion. Hence, the main difference when it comes to the mining of unordered subtrees is that the order of sibling nodes of a subtree can be exchanged and the resulting tree is still considered the same.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 129.00; Price excludes VAT (USA)

Softcover Book: USD 169.99; Price excludes VAT (USA)

Hardcover Book: USD 169.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

References

Agrawal, R., Srikant, R.: Fast Algorithms for Mining Association Rules. In: Proceedings of the 20th International Conference on Very Large Data Bases (VLDB), Santiago de Chile, Chile, Septemebr 12-15, pp. 487-499 (1994)
Google Scholar
Asai, T., Arimura, H., Uno, T., Nakano, S.-i.: Discovering Frequent Substructures in Large Unordered Trees. In: Grieser, G., Tanaka, Y., Yamamoto, A. (eds.) DS 2003. LNCS (LNAI), vol. 2843, pp. 47–61. Springer, Heidelberg (2003)
Chapter Google Scholar
Chehreghani, M.H., Rahgozar, M., Lucas, C., and Chehreghani, M.H, Mining Maximal Embedded Unordered Tree Patterns. Paper presented at the Proceedings of the, IEEE Symposium on Computational Intelligence and Data Mining (CIDM), Honolulu, Hawaii, April 1-5 (2007)
Google Scholar
Chi, Y., Yang, Y., Muntz, R.R.: Canonical forms for labeled trees and their applications in frequent subtree mining. Knowledge and Information Systems 8(2), 203–234 (2004a)
Article Google Scholar
Chi, Y., Yang, Y., Muntz, R.R.: HybridTreeMiner: An Efficient Algorithm for Mining Frequent Rooted Trees and Free Trees Using Canonical Forms. Paper presented at the Proceedings of the 16th International Conference on Scientific and Statistical Database Management (SSDBM 2004), Santorini Island, Greece, June 21-23 (2004b)
Google Scholar
Hadzic, F., Tan, H., Dillon, T.S.: UNI3 - Efficient Algorithm for Mining Unordered Induced Subtrees using TMG Candidate Generation. In: Proceedings of IEEE Symposium on Computational Intelligence and Data Mining (CIDM), Honolulu, Hawaii, USA, April 1-5, pp. 568–575. IEEE, Los Alamitos (2007)
Chapter Google Scholar
Hadzic, F., Tan, H., Dillon, T.S.: U3 - mining unordered embedded subtrees using TMG candidate generation. In: Proceedings of the IEEE / WIC / ACM International Conference on Web Intelligence, Sydney, Australia, December 9-12, pp. 285–292 (2008)
Google Scholar
Nijssen, S., Kok, J.N.: Efficient discovery of frequent unordered trees. In: Proceedings of the 1st International Workshop on Mining Graphs, Trees, and Sequences, Dubrovnik, Croatia (2003)
Google Scholar
Tan, H., Hadzic, F., Feng, L., Chang, E.: MB3-Miner: mining eMBedded subTREEs using tree model guided candidate generation. In: Proceedings of the 1st International Workshop on Mining Complex Data in conjunction with ICDM 2005, Houston, Texas, USA, November 27-30, pp. 103–110 (2005)
Google Scholar
Tan, H., Dillon, T.S., Hadzic, F., Chang, E., Feng, L.: IMB3-Miner: Mining Induced/Embedded Subtrees by Constraining the Level of Embedding. In: Ng, W.-K., Kitsuregawa, M., Li, J., Chang, K. (eds.) PAKDD 2006. LNCS (LNAI), vol. 3918, pp. 450–461. Springer, Heidelberg (2006)
Chapter Google Scholar
Zaki, M.J.: Efficiently Mining Frequent Trees in a Forest: Algorithms and Applications. IEEE Transactions on Knowledge and Data Engineering 17(8), 1021–1035 (2005a)
Article Google Scholar
Zaki, M.J.: Efficiently Mining Frequent Embedded Unordered Trees. Fundamenta Informaticae 66(1), 33–52 (2005b)
MATH MathSciNet Google Scholar

Download references

Authors

Fedja Hadzic
View author publications
You can also search for this author in PubMed Google Scholar
Henry Tan
View author publications
You can also search for this author in PubMed Google Scholar
Tharam S. Dillon
View author publications
You can also search for this author in PubMed Google Scholar

Rights and permissions

Reprints and permissions

Copyright information

About this chapter

Cite this chapter

Hadzic, F., Tan, H., Dillon, T.S. (2011). TMG Framework for Mining Unordered Subtrees. In: Mining of Data with Complex Structures. Studies in Computational Intelligence, vol 333. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-17557-2_6

Download citation

DOI: https://doi.org/10.1007/978-3-642-17557-2_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-17556-5
Online ISBN: 978-3-642-17557-2
eBook Packages: EngineeringEngineering (R0)

Publish with us

Policies and ethics