Field trials of the Italian Arise train timetable system
Introduction
Transportation information and related issues have often been selected as a reference domain for the development of spoken dialogue systems (SDSs). Examples include TRIPS, the Rochester Interactive Planning System (Ferguson and Allen, 1998), an interactive, intelligent problem-solving assistant in a transportation/logistics domain, and the DARPA Communicator Project (Rudnicky et al., 1999), whose goal is a conversational interface for travel information, e.g. flight, hotel and car reservations. These research projects aim to create very powerful conversational interfaces that can negotiate a complex task with a user. Other projects have focused on spoken language technologies for train timetable information that can be easily integrated into a call centre, yielding voice-interactive applications usable by a large population of real users.
The LE-3 project Automatic Railway Information Systems for Europe (Arise) promoted the creation, optimization and deployment of train timetable information SDSs in three European languages (Dutch, French and Italian); see den Os et al. (1999) for an overview of the project and a discussion of the major results. Several dimensions of an SDS were studied and tested in frequent field trials that directly involved the railway companies in the evaluation. The project partners studied several crucial issues, such as:
- The design of a mixed-initiative dialogue strategy, which includes the selection of explicit versus implicit confirmation strategies, i.e. the comparison of different strategies at IRIT (Lavelle et al., 1999) and the use of acoustic confidence scores for guiding the choice between explicit and implicit confirmations at KPN/KUN (Sturm et al., 1999).
- Usability issues, e.g. freedom and flexibility in the dialogue strategy (Rosset et al., 1999) and the use of barge-in for interrupting the speech output (Lamel et al., 1998).
- Evaluation issues, e.g. the comparison between technology-focused and user-focused evaluation approaches (van Haaren et al., 1998).
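To make the first issue concrete, the following is a minimal sketch of how an acoustic confidence score might steer the choice between explicit and implicit confirmation. It is not the method of Sturm et al. (1999) or of the Arise prototypes: the threshold value, the function names and the prompt wording are all assumptions introduced for illustration.

```python
from enum import Enum


class Confirmation(Enum):
    EXPLICIT = "explicit"  # ask the caller to confirm before moving on
    IMPLICIT = "implicit"  # echo the value while asking the next question


def choose_confirmation(confidence: float, threshold: float = 0.7) -> Confirmation:
    """Pick a confirmation strategy from a recognizer confidence in [0, 1].

    The 0.7 threshold is a hypothetical value, not one reported in the paper.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must lie in [0, 1]")
    if confidence >= threshold:
        return Confirmation.IMPLICIT
    return Confirmation.EXPLICIT


def build_prompt(slot: str, value: str, confidence: float, next_question: str) -> str:
    """Build an illustrative system prompt for one recognized slot value."""
    if choose_confirmation(confidence) is Confirmation.EXPLICIT:
        # Low confidence: verify the value on its own before continuing.
        return f"Did you say {slot} {value}?"
    # High confidence: embed the value in the next question, so the caller
    # can still correct it without an extra dialogue turn.
    return f"{slot.capitalize()} {value}. {next_question}"


if __name__ == "__main__":
    print(build_prompt("departure from", "Milano", 0.45, "Where are you travelling to?"))
    print(build_prompt("departure from", "Milano", 0.92, "Where are you travelling to?"))
```

The design trade-off illustrated here is that explicit confirmation costs an extra turn but limits the damage of a misrecognition, whereas implicit confirmation keeps the dialogue short when the recognizer is likely to be right.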
In Section 2 a brief description of the call centre architecture is given. This architecture was tested in a field trial with experimental subjects in October 1997; from then on, it has been improved on the basis of the subjective and objective evaluation results. The relevant data of the evaluation are reported in Section 3; in particular, we will present evaluation data related to the Full-Automatic modality and subjective evaluation data showing users’ acceptance of the SDS.
A key issue for the successful introduction of artificial agents in information centres that were traditionally human-operated is the acceptance of the advanced technology by the human operators, since they have to modify their working style. This aspect was taken into careful consideration during all phases of the project: Giovara (1998) illustrates the theoretical and practical aspects of the training courses organized for the human operators of the railway information call centre located in Milan. The operators’ training was held before the second field trial described in Section 4. This very large field trial was carried out with real callers connected via ordinary telephone lines. The results of this second trial were considered promising by the Italian railway company, which consequently decided to upgrade all its call centres, namely the FS-Informa information network, by implementing the architecture derived from the studies and experiments carried out during the Arise project. The exploitation in the FS-Informa call centres is briefly illustrated in Appendix A.
Section snippets
Description of the Italian system
Based on the results of the pilot project Railtel and then during the Arise project, an SDS for accessing timetable information of the Italian railway network was developed. Fig. 1 shows the architecture of the call centre in which the prototype was integrated. All the telephone activities of the call centre are managed by an Automatic Call
The first field trial
The first field trial of the system was organised by Saritel on 1–3 October 1997. The aim of the trial was to identify possible shortcomings in order to enhance and optimise the service. For a more detailed description of the results of this trial, see Baggia et al. (1998).
The FS field trial
The goal of the second field trial was to demonstrate the performance of the Arise prototype in field conditions, with professional FS-Informa operators and real telephone traffic. The trial was carried out between 13 January and 7 February 1998 in the experimental Arise call centre located in a railway station (Stazione Porta Garibaldi) in Milan.
Conclusions
The field trials showed that the integration of the SDS in a call centre could dramatically improve the work of the human operators (four operators were able to manage the telephone traffic traditionally served by a larger set of human operators). Careful training of the operators is essential for the successful introduction of spoken language technologies: the second field trial showed that the duration of a call was reduced by nearly 30% after the second week of the trial. The callers
Acknowledgements
The work described in this paper would not have been possible without the advice and efforts of our colleagues in the CSELT Spoken Language Systems group. Special thanks go to Marzia Lampis and Antonio Romeo, and to two anonymous reviewers for their helpful comments.
References (19)
- et al., 1997. Field trial evaluations of two different information inquiry systems. Speech Communication.
- et al., 1997. A robust system for human–machine dialogue in telephony-based applications. International Journal of Speech Technology.
- Baggia, P., Kellner, A., Pérennou, G., Popovici, C., Sturm, J., Wessel, F., 1999. Language modelling and spoken...
- Baggia, P., Danieli, M., 1998. CSELT approaches to spoken dialogue. In: Proceedings of the International Workshop on...
- Baggia, P., Castagneri, G., Danieli, M., 1998. Recent results from a field trial of the ARISE Italian system for train...
- Danieli, M., Gerbino, E., Moisa, L., 1997. Dialogue strategies for improving the usability of telephone human–machine...
- Danieli, M., 1996. On the use of expectation for detecting and repairing human–machine miscommunications. In:...
- Den Os, E., Boves, L., Lamel, L., Baggia, P., 1999. Overview of the ARISE project. In: Proceedings of EUROSPEECH,...
- Ferguson, G., Allen, J.F., 1998. TRIPS: An integrated intelligent problem-solving assistant. In: Proceedings of the...
Cited by (10)
Relations between de-facto criteria in the evaluation of a spoken dialogue system
2008, Speech Communication
Citation excerpt: However, in the literature some authors employ the term “field test” or “field trial” to describe studies in which the interactions are carried out by users who employ the telephone network instead of a laboratory environment, even when they are following predefined scenarios. This is the case, for example, for the evaluation of the ARISE spoken dialogue system (Baggia et al., 2000), which measures the impact of a train timetable system on the working routines of human operators, and on the callers who are traditionally served by the operators. Although the authors report experiments as “field studies”, in the first experiment they contacted different callers who were asked to use the system by following different scenarios, and to fill in a questionnaire where they expressed their opinions about the system performance.
Data-driven generation of phonetic broad classes, based on phoneme confusion matrix similarity
2005, Speech Communication
Citation excerpt: The size of vocabulary incorporated could be classified as medium size. Such vocabularies were applied in diverse telephone applications (Baggia et al., 2000; Johnston et al., 1997; Žgank and Rojc, 2003). As mentioned before, this paper addresses the monolingual and multilingual speech recognition systems.
Reinforcement Learning for Dialogue Generation: A Systematic Literature Review
2021, 4th International Conference on Innovative Computing, ICIC 2021
Quality of telephone-based spoken dialogue systems
2005, Quality of Telephone-Based Spoken Dialogue Systems
Knowledge-combining methodology for dialogue design in spoken language systems
2005, International Journal of Speech Technology
ADAM: The SI-TAL corpus of annotated dialogues
2002, Proceedings of the 3rd International Conference on Language Resources and Evaluation, LREC 2002