Discovering Process Models with Genetic Algorithms Using Sampling

  • Conference paper
Knowledge-Based and Intelligent Information and Engineering Systems (KES 2010)

Abstract

Process mining, a new business intelligence area, aims at discovering process models from event logs. Complex constructs, noise and infrequent behavior make this a hard problem. A genetic mining algorithm, which applies genetic operators to search the space of all possible process models, deals with these challenges successfully. Its drawback is high computation time, caused by the cost of fitness evaluation, which depends linearly on the number of process instances in the log. By using a sampling-based approach, i.e. evaluating fitness on a sample of the log instead of the whole log, we drastically reduce the computation time. When the desired fitness is achieved on the sample, we check the fitness on the whole log; if it is not yet achieved there, we increase the sample size and continue the computation iteratively. Our experiments show that sampling works well even for relatively small logs, and that the total computation time is reduced by a factor of 6 to 15.
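
The iterative scheme described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `mine_with_sampling`, `toy_search` and `fitness`, the growth factor, and the choice to draw a fresh random sample on each round are all hypothetical. The genetic search itself is replaced by a trivial stand-in, and fitness is taken to be the fraction of traces the model accepts.

```python
import random
from typing import Callable, List, Optional, Set, Tuple

Trace = Tuple[str, ...]   # one process instance: a sequence of activity names
Model = Set[Trace]        # toy "process model": the set of traces it accepts


def fitness(model: Model, log: List[Trace]) -> float:
    """Toy fitness: fraction of traces in the log accepted by the model.

    The cost is linear in len(log), which is why sampling pays off.
    """
    return sum(trace in model for trace in log) / len(log)


def toy_search(sample: List[Trace], previous: Optional[Model], target: float) -> Model:
    """Stand-in for the genetic search (assumption, not a real GA).

    It returns a model that reaches the target fitness on the sample by
    simply accepting every trace seen so far.
    """
    return set(sample) | (previous or set())


def mine_with_sampling(
    log: List[Trace],
    search: Callable[[List[Trace], Optional[Model], float], Model],
    fitness_fn: Callable[[Model, List[Trace]], float],
    target: float,
    initial_size: int,
    growth: float = 2.0,   # hypothetical growth factor for the sample size
    seed: int = 0,
) -> Tuple[Model, int]:
    """Search on a sample, check on the whole (non-empty) log, grow the sample if needed."""
    rng = random.Random(seed)
    size = max(1, min(initial_size, len(log)))
    model: Optional[Model] = None
    while True:
        sample = rng.sample(log, size)          # assumption: fresh sample each round
        model = search(sample, model, target)   # cheap: fitness evaluated on the sample only
        # Expensive check on the whole log, done once per round rather than per individual.
        if fitness_fn(model, log) >= target or size == len(log):
            return model, size
        size = min(len(log), max(size + 1, int(size * growth)))


# Small demo log: one frequent variant and one infrequent variant.
log = [("a", "b", "d")] * 90 + [("a", "c", "d")] * 10
model, size = mine_with_sampling(log, toy_search, fitness, target=1.0, initial_size=5)
```

In this sketch the whole-log fitness check runs once per round, while the search only ever touches the sample, so most of the evaluation cost scales with the sample size rather than with the size of the log.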

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Bratosin, C., Sidorova, N., van der Aalst, W. (2010). Discovering Process Models with Genetic Algorithms Using Sampling. In: Setchi, R., Jordanov, I., Howlett, R.J., Jain, L.C. (eds) Knowledge-Based and Intelligent Information and Engineering Systems. KES 2010. Lecture Notes in Computer Science, vol 6276. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15387-7_8

  • DOI: https://doi.org/10.1007/978-3-642-15387-7_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-15386-0

  • Online ISBN: 978-3-642-15387-7

  • eBook Packages: Computer Science, Computer Science (R0)
