ABSTRACT
The advent of Generative AI has boosted interest in developing innovative chatbot applications. Although a vast body of machine learning (ML) and natural language processing (NLP) research, together with abundant English language resources, has greatly improved chatbot technology, the corresponding research and resources for the Greek language remain limited. The contribution of this paper is twofold: (i) it reports on state-of-the-art research in Greek NLP, covering language resources, embeddings-based techniques, deep learning models, and existing chatbot applications; (ii) it offers a set of insights on current NLP models and chatbot implementation methodologies, and outlines pending issues and future research directions.