research-article
Artifacts Available / v1.1

Mining Frequent Infix Patterns from Concurrency-Aware Process Execution Variants

Published: 01 June 2023

Abstract

Event logs, as considered in process mining, document a large number of individual process executions, each consisting of various executed activities. To cope with the vast number of process executions in event logs, variants group process executions with identical ordering relations among their executed activities. Variants are an integral concept of process mining and help process analysts explore, filter, and manage large amounts of event data. In this paper, we consider concurrency-aware variants, which allow activities within a process execution to be partially ordered, that is, the execution of individual activities can overlap in time. However, the number of variants is often vast, making it challenging for process analysts to explore event data. Therefore, we present a novel approach to frequent pattern mining from concurrency-aware variants. We show that mining frequent patterns from concurrency-aware variants can be reduced to the frequent subtree mining problem. Further, we compare our proposed algorithm to a state-of-the-art frequent subtree mining algorithm; our algorithm exhibits improved performance on real-life event logs.
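To make the notion of a concurrency-aware variant concrete, the following is a minimal sketch that groups process executions by their ordering relations. It is not the paper's tree-based representation or its mining algorithm. The function names `variant_key` and `group_variants`, the `(activity, start, end)` tuple layout, and the rule "a precedes b if a ends before b starts" are assumptions made for illustration. Repeated activity labels are numbered by start time, which is a simplification.

```python
from collections import defaultdict


def variant_key(intervals):
    """Build a hashable key from one process execution.

    intervals: list of (activity, start, end) tuples (hypothetical layout).
    """
    seen = defaultdict(int)
    named = []
    # Number repeated labels by start time so each instance gets a unique name.
    for act, start, end in sorted(intervals, key=lambda x: (x[1], x[2], x[0])):
        seen[act] += 1
        named.append((f"{act}#{seen[act]}", start, end))
    nodes = tuple(sorted(name for name, _, _ in named))
    # Assumed rule: a strictly precedes b if a ends before b starts.
    # Pairs in neither direction overlap in time and count as concurrent.
    before = frozenset(
        (a, b)
        for a, _, a_end in named
        for b, b_start, _ in named
        if a_end < b_start
    )
    return nodes, before


def group_variants(log):
    """Group case ids whose executions share the same ordering relations."""
    variants = defaultdict(list)
    for case_id, intervals in log.items():
        variants[variant_key(intervals)].append(case_id)
    return dict(variants)


# Toy log: in c1 and c2, A precedes B and C, which overlap; c3 is sequential.
log = {
    "c1": [("A", 0, 2), ("B", 3, 5), ("C", 4, 6)],
    "c2": [("A", 10, 11), ("C", 12, 15), ("B", 13, 14)],
    "c3": [("A", 0, 2), ("B", 3, 4), ("C", 5, 6)],
}
groups = sorted(sorted(cases) for cases in group_variants(log).values())
print(groups)
```

In this toy log, `c1` and `c2` fall into the same variant even though B and C start in a different order, because in both cases the two activities overlap in time. `c3` forms its own variant because B finishes before C starts.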

