Semantically Enriching Embeddings of Highly Inflectable Verbs for Improving Intent Detection in a Romanian Home Assistant Scenario

Rad, Andrei-Cristian; Muntean, Ioan-Horia-Mihai; Stoica, Anda-Diana; Lemnaru, Camelia; Potolea, Rodica; Dînșoreanu, Mihaela

doi:10.1007/978-3-030-74251-5_20

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12695)

Included in the following conference series:

International Symposium on Intelligent Data Analysis

Abstract

Word embeddings are known to encapsulate semantic similarity and have become the preferred representation for NLP models. However, they do not identify the type of semantic relationship between words, which can be crucial in some applications. This paper adapts an existing solution for enhancing word embedding representations so as to better separate synonyms from antonyms in an intent detection task applied to a Romanian home assistant scenario. To account for the morphological richness of Romanian, our method adds an augmentation step that generates conjugated pairs of antonym and synonym verbs. The generated pairs are run through a counterfitting step (inspired by the literature), for which we propose a justified improvement to one of the hyperparameters. Evaluations on the home assistant scenario show that this pre-processing step plays an essential role in reducing opposing-intent errors in the classification model (by almost two thirds).
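The augmentation step described in the abstract can be illustrated with a minimal sketch. It assumes that each base verb pair is expanded by aligning forms that share the same grammatical slot (tense and person); the abstract does not spell out the alignment rule, so this is one plausible reading and not the authors' implementation. The conjugation table, the slot names, and the function `expand_pairs` are hypothetical, and the table is a hand-written toy where a real pipeline would obtain the forms from a conjugation tool.

```python
from typing import Dict, List, Tuple

# Hypothetical toy conjugation table: verb -> {grammatical slot -> form}.
# The slot names are made up for this sketch; a real pipeline would
# generate the forms with a conjugation tool instead of listing them.
Conjugations = Dict[str, Dict[str, str]]

TOY_TABLE: Conjugations = {
    "aprinde": {"pres_1sg": "aprind", "pres_2sg": "aprinzi", "pres_3sg": "aprinde"},
    "stinge": {"pres_1sg": "sting", "pres_2sg": "stingi", "pres_3sg": "stinge"},
}


def expand_pairs(
    base_pairs: List[Tuple[str, str]], table: Conjugations
) -> List[Tuple[str, str]]:
    """Expand base verb pairs into pairs of conjugated forms.

    Assumption: two forms are paired only when they fill the same slot,
    so "aprinzi" (2nd person singular) is paired with "stingi", not "sting".
    """
    expanded: List[Tuple[str, str]] = []
    for left, right in base_pairs:
        left_forms = table.get(left, {})
        right_forms = table.get(right, {})
        # Keep only slots available for both verbs; sort for stable output.
        for slot in sorted(left_forms.keys() & right_forms.keys()):
            expanded.append((left_forms[slot], right_forms[slot]))
    return expanded


# "aprinde" (turn on) and "stinge" (turn off) form an antonym pair.
antonym_pairs = expand_pairs([("aprinde", "stinge")], TOY_TABLE)
```

The same function would apply unchanged to synonym pairs; the resulting lists of conjugated pairs are what a counterfitting step would then consume as constraints.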

Notes

1. https://github.com/eaaskt/nlu/tree/master/rad-antonyms
2. https://rodrigopivi.github.io/Chatito/ - Generate datasets for NLU models using a simple DSL.
3. https://pypi.org/project/mlconjug/ - A Python library to conjugate verbs using Machine Learning techniques.
4. https://spacy.io - Industrial-Strength Natural Language Processing.

Acknowledgment

The work presented in this paper was supported by grant no. 72PCCDI/01.03.2018, ROBIN - Robots and Society: Cognitive Systems for Personal Robots and Autonomous Vehicles.

Author information

Authors and Affiliations

Computer Science Department, Technical University of Cluj-Napoca, Cluj-Napoca, Romania
Andrei-Cristian Rad, Ioan-Horia-Mihai Muntean, Anda-Diana Stoica, Camelia Lemnaru, Rodica Potolea & Mihaela Dînșoreanu

Editor information

Editors and Affiliations

University of Coimbra, Coimbra, Portugal
Pedro Henriques Abreu
University of Porto, Porto, Portugal
Pedro Pereira Rodrigues
University of Granada, Granada, Spain
Alberto Fernández
University of Porto, Porto, Portugal
João Gama

Appendix A Confusion matrices and histograms

The evolution of the confusion matrices (for evaluation Scenario 1) shows a reduction in the number of errors, with more elements on the main diagonal. The evolution of the confidence histograms (for evaluation Scenario 2) shows fewer wrong predictions at both high and low confidence: most low-margin misclassifications were resolved, and the confidence of the remaining high-confidence misclassifications was mostly reduced.
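The quantities this appendix refers to can be computed along the following lines. This is a minimal sketch under stated assumptions, not the authors' evaluation code: the intent names, the `OPPOSITE` map, the function `summarize`, and the choice of five equal-width confidence bins are all hypothetical. It counts the predictions on the main diagonal of the confusion matrix, counts opposing-intent errors, and bins the confidences of wrong predictions into a histogram.

```python
from collections import Counter
from typing import List, Sequence, Tuple

# Hypothetical intent names and opposite-intent map, for illustration only.
OPPOSITE = {
    "turn_on_light": "turn_off_light",
    "turn_off_light": "turn_on_light",
}


def summarize(
    y_true: Sequence[str],
    y_pred: Sequence[str],
    confidences: Sequence[float],
    n_bins: int = 5,
) -> Tuple[int, int, List[int]]:
    """Return (correct, opposing_errors, wrong_confidence_histogram)."""
    # Sparse confusion matrix: (true intent, predicted intent) -> count.
    counts = Counter(zip(y_true, y_pred))
    # Entries on the main diagonal are correct predictions.
    correct = sum(n for (t, p), n in counts.items() if t == p)
    # Opposing-intent errors: the prediction is the opposite of the truth.
    opposing = sum(n for (t, p), n in counts.items() if OPPOSITE.get(t) == p)
    # Histogram of confidences over wrong predictions only.
    hist = [0] * n_bins
    for t, p, c in zip(y_true, y_pred, confidences):
        if t != p:
            hist[min(int(c * n_bins), n_bins - 1)] += 1
    return correct, opposing, hist


y_true = ["turn_on_light", "turn_on_light", "turn_off_light", "turn_off_light"]
y_pred = ["turn_on_light", "turn_off_light", "turn_off_light", "set_temperature"]
conf = [0.9, 0.55, 0.8, 0.95]
correct, opposing, hist = summarize(y_true, y_pred, conf)
```

Comparing these three values before and after the embeddings are enriched gives a numeric counterpart to the visual comparison of matrices and histograms.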

About this paper

Cite this paper

Rad, AC., Muntean, IHM., Stoica, AD., Lemnaru, C., Potolea, R., Dînșoreanu, M. (2021). Semantically Enriching Embeddings of Highly Inflectable Verbs for Improving Intent Detection in a Romanian Home Assistant Scenario. In: Abreu, P.H., Rodrigues, P.P., Fernández, A., Gama, J. (eds) Advances in Intelligent Data Analysis XIX. IDA 2021. Lecture Notes in Computer Science, vol 12695. Springer, Cham. https://doi.org/10.1007/978-3-030-74251-5_20

DOI: https://doi.org/10.1007/978-3-030-74251-5_20
Published: 13 April 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-74250-8
Online ISBN: 978-3-030-74251-5
eBook Packages: Computer Science, Computer Science (R0)
