Abstract
Mining useful information to analyze knowledge-intensive business processes requires data that describes the activities of knowledge workers. Email is widely used in organizations to support knowledge-intensive processes. The COVID-19 pandemic increased reliance on technologies such as email to compensate for the lack of face-to-face contact within organizations. In this work, we propose an activity mining technique that receives an incoming email message, classifies the sender's intent, and translates it into a set of business process activities. Specifically, we leverage deep learning language models to first classify the email body into a group of intents, which are then mapped to related activities. To our knowledge, this is the first transfer-learning-based solution for mining activity information from emails. We evaluated the effectiveness of our solution on real-world data from email exchanges between knowledge workers. Our results, based on unsupervised experiments and a field study, show that transformer models can be used to semantically label emails and that mapping activities to matched intents is highly accurate.
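The two stages described above (intent classification, then intent-to-activity mapping) can be illustrated with a minimal sketch. It is not the authors' implementation: the intent labels, the activity names, the `mine_activities` function, and the threshold are all hypothetical, and a toy keyword scorer stands in for the transformer language model so that the example runs without any model download.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical intent labels and intent-to-activity mapping (not from the paper).
INTENT_TO_ACTIVITIES: Dict[str, List[str]] = {
    "request meeting": ["Check calendar", "Send meeting invite"],
    "request information": ["Look up information", "Reply with details"],
    "share document": ["Review attachment", "File document"],
}

# Toy keyword lists; a stand-in for what a language model would infer.
KEYWORDS: Dict[str, List[str]] = {
    "request meeting": ["meet", "schedule", "calendar"],
    "request information": ["could you send", "what is", "please provide"],
    "share document": ["attached", "attachment", "enclosed"],
}

# A scorer takes the email text and candidate labels, returns one score per label.
Scorer = Callable[[str, List[str]], Dict[str, float]]


def keyword_scorer(text: str, labels: List[str]) -> Dict[str, float]:
    """Count keyword hits per label (placeholder for a transformer classifier)."""
    lowered = text.lower()
    return {
        label: float(sum(kw in lowered for kw in KEYWORDS.get(label, [])))
        for label in labels
    }


def mine_activities(
    email_body: str,
    scorer: Scorer = keyword_scorer,
    threshold: float = 1.0,  # assumed cut-off; depends on the scorer's scale
) -> Tuple[List[str], List[str]]:
    """Stage 1: select intents scoring at or above the threshold.
    Stage 2: map each selected intent to its activities, without duplicates."""
    labels = list(INTENT_TO_ACTIVITIES)
    scores = scorer(email_body, labels)
    intents = [label for label in labels if scores[label] >= threshold]
    activities: List[str] = []
    for intent in intents:
        for activity in INTENT_TO_ACTIVITIES[intent]:
            if activity not in activities:
                activities.append(activity)
    return intents, activities


email = "Hi, can we schedule a call next week? The draft report is attached."
intents, activities = mine_activities(email)
print(intents)
print(activities)
```

Keeping the scorer as a plain function makes it easy to swap in a pre-trained model. For instance, the Hugging Face `transformers` zero-shot classification pipeline returns labels with scores that could be wrapped to match the `Scorer` signature, with the threshold then set on a probability scale.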
Notes
1. Note that a sentence can be represented as a bag of words or a sequence; our problem formulation is agnostic to how sentences are defined.
2. For more information about these pre-trained models, visit https://huggingface.co/.
Copyright information
© 2023 Springer Nature Switzerland AG
About this paper
Cite this paper
Khandaker, F., Senderovich, A., Yu, E., Carbajales, S., Chan, A. (2023). Transformer Models for Activity Mining in Knowledge-Intensive Processes. In: Cabanillas, C., Garmann-Johnsen, N.F., Koschmider, A. (eds) Business Process Management Workshops. BPM 2022. Lecture Notes in Business Information Processing, vol 460. Springer, Cham. https://doi.org/10.1007/978-3-031-25383-6_2
DOI: https://doi.org/10.1007/978-3-031-25383-6_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-25382-9
Online ISBN: 978-3-031-25383-6
eBook Packages: Computer Science, Computer Science (R0)