Analysis of Tools for the Development of Conversational Agents

Abstract: Conversational agents are being increasingly adopted in various domains, such as e-commerce and customer services, and as a direct communication channel between companies and end-users. Several tools have been developed to facilitate their definition and deployment. They exploit existing cloud infrastructures and artificial intelligence (AI) techniques to efficiently process users' input and extract conversational information. Major Information Technology (IT) companies, such as Google, IBM, Microsoft, and Amazon, have provided powerful tools to develop conversational agents. Still, choosing the most appropriate tool is not easy, as it may require high costs associated with automatic natural language processing (NLP) services and expertise in software engineering and AI. Therefore, this paper aims to analyze different tools to help developers and non-developers to choose the optimal tool for their specific scenario of creating a conversational agent.


Introduction
Conversational agents (CAs), also called virtual assistants, voice assistants, or chatbots, are becoming part of our daily lives [1]. They provide users with various services through natural language (NL). Users can, for example, inquire about the weather, ask questions, control home automation devices, such as coffee machines, book flights, and manage other essential tasks, such as emails and calendars.
As a result of the success of CAs, various technologies have been developed to create those systems. Technology giants, such as Google, IBM, Microsoft, and Amazon, have released their CA creation tools, such as Dialogflow, Watson Assistant, Bot Framework, and Lex. Smaller companies, such as Rasa, ManyChat, Flow XO, and Pandorabots, have also proposed their tools. Those tools possess impressive capabilities, including NLP, automatic speech recognition (ASR) APIs, and speech synthesis. However, selecting the appropriate tool for a specific CA can be challenging due to the vast array of options. Moreover, operational factors, such as vendor lock-in and high costs, should also be considered.
Therefore, this work analyzes CA development tools to help developers and non-developers choose the most suitable solution for their scenario.
This paper is organized as follows: Section 2 presents an overview of how a CA works. Section 3 provides an analysis of agent development tools. Section 4 concludes our paper.

Building a Conversational Agent: An Overview
CAs are a good illustration of the progress in NLP [2]. Indeed, a CA is a computer program that is able to converse with users using NL. It can, thus, understand requests formulated using NL, process them, trigger actions, and formulate answers. CAs are attracting increasing interest as they provide access to various services, such as flight booking or weather checking, via mobile applications, websites, or social networks, such as Telegram, Twitter, or Slack. This approach allows users to benefit from these services without the need to install new applications, and their interactions with the service are facilitated via an NL text or voice conversation [3].
CAs can be classified into two main categories [4]: proactive agents, which can initiate the conversation with users in a given context without an explicit request from them (e.g., by alerting them), and reactive agents, which can interact with users regarding specific tasks, such as a hotel or flight booking. Figure 1 shows a simplified diagram of how a CA works [4,5]. A standard method of CA design is based on the use of "intents", which essentially correspond to the objectives or goals that the user wishes to achieve when they are initiating a conversation with the CA.

First, the CA receives the user's input in NL (e.g., "I want to book a car from 1 March 2023 to 15 March 2023", label 1 in Figure 1). It then tries to match the sentence to a specific intent (e.g., intent: book, label 2). Intents can be extracted from the text using different AI-based techniques, such as rule-based or machine learning-based ones. Next, the CA extracts the parameters of the request through entity extraction. This task is essential in NLP and consists of identifying specific text elements and classifying them into predefined categories called "entities". These entities can be names of people, dates, phone numbers, email addresses, currencies, etc. For the request above, the CA extracts the following entities: start date: 1 March 2023; end date: 15 March 2023 (label 3).
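The intent matching and entity extraction steps can be illustrated with a minimal rule-based sketch in Python. It is not taken from any of the analyzed tools: the rule table `INTENT_RULES`, the helper names, and the assumption that dates are written as "1 March 2023" are all hypothetical and chosen only to fit the example request.

```python
import re
from datetime import date

# Hypothetical keyword rules: intent name -> pattern that signals it.
INTENT_RULES = {
    "book": re.compile(r"\b(book|reserve|rent)\b", re.IGNORECASE),
    "cancel": re.compile(r"\b(cancel|drop)\b", re.IGNORECASE),
}

MONTHS = [
    "January", "February", "March", "April", "May", "June", "July",
    "August", "September", "October", "November", "December",
]

# Assumption: dates appear as "<day> <English month name> <year>".
DATE_PATTERN = re.compile(r"\b(\d{1,2}) (" + "|".join(MONTHS) + r") (\d{4})\b")


def detect_intent(text):
    """Return the first intent whose keyword rule matches, or None."""
    for intent, pattern in INTENT_RULES.items():
        if pattern.search(text):
            return intent
    return None


def extract_entities(text):
    """Extract date entities; the first two dates become start and end."""
    dates = [
        date(int(year), MONTHS.index(month) + 1, int(day))
        for day, month, year in DATE_PATTERN.findall(text)
    ]
    entities = {}
    if len(dates) >= 1:
        entities["start_date"] = dates[0]
    if len(dates) >= 2:
        entities["end_date"] = dates[1]
    return entities


request = "I want to book a car from 1 March 2023 to 15 March 2023"
intent = detect_intent(request)        # "book"
entities = extract_entities(request)   # start_date and end_date as date objects
```

A machine learning-based extractor would replace the two lookup functions with trained models, but the output (one intent plus a set of typed entities) would have the same shape.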
Then, the CA triggers an appropriate action by responding to the request (label 4). Actions may include sending a text response, performing an online task, or interacting with external services (label 5). Finally, it generates an NL response from the result of the action (label 6).
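The action and response steps can be sketched as a simple dispatch table in Python. This is an illustrative assumption rather than the mechanism of any specific tool: `book_car`, `ACTIONS`, and the response wording are hypothetical, and the booking call is stubbed where a real agent would contact an external service.

```python
from datetime import date


def book_car(entities):
    """Hypothetical action: a real agent would call a booking service here."""
    days = (entities["end_date"] - entities["start_date"]).days
    return {"status": "confirmed", "days": days}


# Dispatch table: intent name -> function that performs the action.
ACTIONS = {"book": book_car}


def respond(intent, entities):
    """Trigger the action for the intent and build an NL response."""
    action = ACTIONS.get(intent)
    if action is None:
        # Fallback response when no action is registered for the intent.
        return "Sorry, I did not understand your request."
    result = action(entities)
    return f"Your car is booked for {result['days']} days ({result['status']})."


reply = respond(
    "book",
    {"start_date": date(2023, 3, 1), "end_date": date(2023, 3, 15)},
)
```

Keeping the mapping from intents to actions in a table makes it straightforward to add new intents without changing the response logic.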

Analysis of the CA Development Tools
Various platforms, frameworks, and services are available to create CAs. These tools enable their creation for different messaging platforms, mobile applications, websites, and connected home devices.
This section analyzes several CA development tools, which are listed in Table 1. This analysis mainly focuses on the features offered by these tools and the concepts used to develop an agent. Because many tools are available to create CAs, choosing the best one can be challenging for developers. It is important to consider several factors, such as the features offered, the project's complexity, the developers' expertise, the budget, and the type of application.
These tools can be open source (such as the Rasa framework), which allows more flexibility and customization, or closed source (such as Dialogflow and Amazon Lex), which are often easier to use and provide more comprehensive technical support. Closed source tools are offered on a subscription- or usage-based pricing model, which can vary considerably depending on the complexity of the tool, its functionality, and the level of support the vendor provides.
Most tools provide user interfaces to create training sentences (such as Dialogflow, Bot Framework with LUIS, and Watson Assistant), allowing developers to define intents and entities from these training sentences. On the other hand, some CA development tools use regular expressions to detect patterns in text. In contrast to training sentences, regular expressions allow the identification of more complex patterns in a sentence. They are beneficial for spotting keywords in a sentence that can reveal the user's intent. Regular expressions can, therefore, complement training sentences to improve the CA's ability to understand users' requests.
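One way regular expressions can complement training sentences is sketched below in Python. The sketch is a toy under stated assumptions: the analyzed tools train machine learning models on the training sentences, whereas this example substitutes a simple word-overlap (Jaccard) score, and the training sentences, keyword patterns, and threshold are hypothetical.

```python
import re

# Hypothetical training sentences per intent.
TRAINING_SENTENCES = {
    "book": ["I want to book a car", "can I reserve a vehicle"],
    "weather": ["what is the weather today", "will it rain tomorrow"],
}

# Hypothetical keyword patterns used when no training sentence is close enough.
KEYWORD_PATTERNS = {
    "book": re.compile(r"\b(book(ing)?|reserv\w*)\b", re.IGNORECASE),
    "weather": re.compile(r"\b(weather|rain\w*|forecast)\b", re.IGNORECASE),
}


def tokens(text):
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def jaccard(a, b):
    """Word-overlap similarity between two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def classify(text, threshold=0.5):
    """Match against training sentences first, then fall back to regexes."""
    words = tokens(text)
    best_intent, best_score = None, 0.0
    for intent, examples in TRAINING_SENTENCES.items():
        for example in examples:
            score = jaccard(words, tokens(example))
            if score > best_score:
                best_intent, best_score = intent, score
    if best_score >= threshold:
        return best_intent
    # Fallback: keyword spotting with regular expressions.
    for intent, pattern in KEYWORD_PATTERNS.items():
        if pattern.search(text):
            return intent
    return None
```

In this sketch, a request that shares few words with the training sentences, such as "Any forecast for Paris?", is still assigned the weather intent by the keyword pattern.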
Some CA development tools support two input modes, text and voice, while others focus solely on one or the other. A text-based CA can be integrated with popular messaging apps, such as Facebook Messenger, WhatsApp, or Slack. In contrast, a voice-based CA can be integrated with voice assistants, such as Amazon Alexa, Google Assistant, and Apple Siri.
Implementing a CA involves choosing the tool that best suits the conversational scenarios and the client's needs. For example, if the goal is to converse with many customers via social media platforms, a tool that supports the development of multilingual CAs that can use different communication channels is necessary. Additionally, if the developer does not have the resources to host the infrastructure, tools offering hosting services are the best choice.
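This kind of requirement matching can be expressed as a simple filter, sketched below in Python. The catalog entries `ToolA`, `ToolB`, and `ToolC` are placeholders with invented features, not data about the tools analyzed in this paper, and the `Tool` record and `suitable` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    name: str
    languages: frozenset   # supported NL languages
    channels: frozenset    # supported communication channels
    hosted: bool           # whether the vendor hosts the infrastructure


def suitable(tool, languages, channels, need_hosting):
    """A tool fits if it covers every required language and channel,
    and offers hosting when the developer cannot host it themselves."""
    return (
        languages <= tool.languages
        and channels <= tool.channels
        and (tool.hosted or not need_hosting)
    )


# Placeholder catalog with invented features, for illustration only.
CATALOG = [
    Tool("ToolA", frozenset({"en", "fr"}), frozenset({"messenger", "slack"}), True),
    Tool("ToolB", frozenset({"en"}), frozenset({"messenger"}), True),
    Tool("ToolC", frozenset({"en", "fr"}), frozenset({"messenger", "slack"}), False),
]

matches = [
    tool.name
    for tool in CATALOG
    if suitable(tool, {"en", "fr"}, {"messenger"}, need_hosting=True)
]
```

With these placeholder entries, only `ToolA` satisfies a multilingual, hosted, Messenger-based scenario; other factors named in this section, such as budget and developer expertise, would need additional fields.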

Conclusions
CA development has become increasingly accessible due to the variety of tools available on the market and advancements in AI. However, choosing a suitable tool is not easy, as it depends on the specific needs and requirements of the project, including the required functions, supported languages, deployment mode, and pricing. In this paper, we present an analysis of 20 tools that can help developers choose the right tool to support the development of their CAs in conformance with their specific requirements.

Conflicts of Interest:
The authors declare no conflict of interest.