Transformer Models for Activity Mining in Knowledge-Intensive Processes

  • Conference paper
Business Process Management Workshops (BPM 2022)

Part of the book series: Lecture Notes in Business Information Processing (LNBIP, volume 460)

Abstract

Mining useful information to analyze knowledge-intensive business processes requires data that describes the activities of knowledge workers. Emails are widely used in organizations to support the functioning of knowledge-intensive processes. The recent COVID-19 pandemic has increased reliance on technologies such as email to facilitate communication within organizations and to make up for the lack of face-to-face contact. In this work, we propose an activity mining technique that receives an incoming email message, classifies the sender’s intent, and translates it into a set of business process activities. Specifically, we leverage deep learning language models to first classify the email body into a group of intents, which are then mapped to related activities. To our knowledge, this is the first transfer-learning-based solution for mining activity information from emails. The effectiveness of our solution was evaluated on real-world data from email exchanges between knowledge workers. Our results, based on unsupervised experiments and a field study, show that transformer models can be used to semantically label emails and that mapping activities to matched intents is highly accurate.
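
The two-stage idea in the abstract (classify the email body into intents, then map each matched intent to activities) can be illustrated with a minimal Python sketch. The intent labels, the activity names, the 0.6 threshold, and the function names below are all hypothetical; the abstract does not give the paper's label set or mapping. The `keyword_scorer` is a toy stand-in for a language model, used only so that the example runs without downloading anything.

```python
import re
from typing import Callable, Dict, List, Tuple

# Hypothetical intents and intent-to-activity mapping (illustration only;
# not the label set or mapping used in the paper).
INTENT_TO_ACTIVITIES: Dict[str, List[str]] = {
    "request meeting": ["Schedule meeting", "Send calendar invite"],
    "request information": ["Look up information", "Reply with details"],
    "share document": ["Review document", "File document"],
}

# A scorer takes the email text and candidate labels and returns one
# score in [0, 1] per label.
Scorer = Callable[[str, List[str]], Dict[str, float]]


def keyword_scorer(text: str, labels: List[str]) -> Dict[str, float]:
    """Toy stand-in for a language model: the fraction of each label's
    words that also occur in the email text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    scores = {}
    for label in labels:
        label_words = label.split()
        hits = sum(1 for w in label_words if w in words)
        scores[label] = hits / len(label_words)
    return scores


def mine_activities(
    email_body: str,
    scorer: Scorer = keyword_scorer,
    threshold: float = 0.6,  # assumed cut-off, chosen for this example
) -> Tuple[List[str], List[str]]:
    """Stage 1: keep every intent whose score reaches the threshold.
    Stage 2: map the kept intents to activities, without duplicates."""
    labels = list(INTENT_TO_ACTIVITIES)
    scores = scorer(email_body, labels)
    intents = [label for label in labels if scores[label] >= threshold]
    activities: List[str] = []
    for intent in intents:
        for activity in INTENT_TO_ACTIVITIES[intent]:
            if activity not in activities:
                activities.append(activity)
    return intents, activities


if __name__ == "__main__":
    email = "Could we set up a meeting next week? I'd like to request a budget review."
    print(mine_activities(email))
```

Because the scorer is passed in as a function, a pre-trained transformer classifier could be wrapped to the same `Scorer` signature and substituted for `keyword_scorer`; how the paper's models are actually invoked is not stated in the abstract.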

Notes

  1. Note that a sentence can be represented as a bag of words or a sequence; our problem formulation is agnostic to how sentences are defined.

  2. For more information about these pre-trained models, visit https://huggingface.co/.

Author information

Correspondence to Faria Khandaker.

Copyright information

© 2023 Springer Nature Switzerland AG

About this paper

Cite this paper

Khandaker, F., Senderovich, A., Yu, E., Carbajales, S., Chan, A. (2023). Transformer Models for Activity Mining in Knowledge-Intensive Processes. In: Cabanillas, C., Garmann-Johnsen, N.F., Koschmider, A. (eds) Business Process Management Workshops. BPM 2022. Lecture Notes in Business Information Processing, vol 460. Springer, Cham. https://doi.org/10.1007/978-3-031-25383-6_2

  • DOI: https://doi.org/10.1007/978-3-031-25383-6_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-25382-9

  • Online ISBN: 978-3-031-25383-6

  • eBook Packages: Computer Science, Computer Science (R0)
