Elsevier

Speech Communication

Volume 31, Issue 4, August 2000, Pages 355-367

Field trials of the Italian Arise train timetable system

https://doi.org/10.1016/S0167-6393(99)00068-0

Abstract

This paper reports results from two field trials of the CSELT Arise spoken dialogue system in the Italian railway call centre FS-Informa. The system provides voice-driven access to railway timetables for the major Italian cities and some European ones. On the basis of the initial experiences we were able to integrate the automatic system into the architecture of a typical railway call centre, where timetable information is preferably exchanged with the caller via a spoken dialogue system, and where a human operator is involved only in answering more complex user requests. We argue that the results we present are relevant from different points of view: they allowed us to test the impact of the automatic system on the working routines of the human operators, and the reactions of real callers who are traditionally served by human operators.

Introduction

Transportation information and related issues have often been selected as a reference domain for the development of spoken dialogue systems (SDS). Some examples are TRIPS, the Rochester Interactive Planner System (Ferguson and Allen, 1998), which is an interactive, intelligent problem-solving assistant in a transportation/logistics domain, and the DARPA Communicator Project (Rudnicky et al., 1999), whose goal is the creation of a conversational interface for travel information, e.g. flight, hotel and car reservations. The goal of these research projects is the creation of very powerful conversational interfaces that are able to negotiate a complex task with a user. Other projects have focused on the development of spoken language technologies for train timetable information that can be easily integrated in a call centre, yielding voice-interactive applications usable by a large population of real users.

The LE-3 project Automatic Railway Information Systems for Europe (Arise) has promoted the creation, optimization and deployment of train timetable information SDSs in three different European languages (Dutch, French and Italian); see den Os et al. (1999) for an overview of the project and a discussion of the major results. Several dimensions of an SDS have been studied and tested in frequent field trials that directly involved the railway companies in the evaluation. The project partners have studied several crucial issues such as:

  • The design of a mixed-initiative dialogue strategy, which includes the selection of explicit versus implicit confirmation strategies, i.e. the comparison of different strategies at IRIT (Lavelle et al., 1999) and the use of acoustic confidence scores for guiding the choice between explicit and implicit confirmations at KPN/KUN (Sturm et al., 1999).

  • Usability issues, e.g. freedom and flexibility in the dialogue strategy (Rosset et al., 1999) and the use of barge-in for interrupting the speech output (Lamel et al., 1998).

  • Evaluation issues, e.g. the comparison between technology-focused and user-focused evaluation approaches (van Haaren et al., 1998).
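As an illustration of the first item in the list above, the following is a minimal sketch of how an acoustic confidence score might guide the choice between explicit and implicit confirmation. It is not the KPN/KUN implementation: the threshold value, the function names `choose_confirmation` and `build_prompt`, and the prompt wording are all assumptions made for this example.

```python
from enum import Enum


class Confirmation(Enum):
    EXPLICIT = "explicit"   # ask the caller to confirm the value directly
    IMPLICIT = "implicit"   # echo the value while moving on to the next question


def choose_confirmation(confidence: float, threshold: float = 0.8) -> Confirmation:
    """Pick a confirmation strategy from a recognition confidence in [0, 1].

    The threshold of 0.8 is a hypothetical value chosen for illustration.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must lie in [0, 1]")
    if confidence >= threshold:
        return Confirmation.IMPLICIT
    return Confirmation.EXPLICIT


def build_prompt(destination: str, confidence: float) -> str:
    """Build the next system prompt for a recognised destination (hypothetical wording)."""
    if choose_confirmation(confidence) is Confirmation.IMPLICIT:
        return f"To {destination}. At what time would you like to leave?"
    return f"Did you say {destination}?"
```

A high-confidence recognition lets the dialogue advance in a single turn, while a low-confidence one costs an extra turn but avoids carrying a misrecognised city through the rest of the call.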

The Italian consortium of the Arise project has focused its efforts on the integration of SDSs into railway information call centres by exploring the opportunity of integrating the activities of an artificial agent (the SDS) with those of the human railway operators. For instance, in a Semi-Automatic modality a human operator may be assisted by the automatic system in reading out the information for a set of trains; by contrast, in a Full-Automatic modality, SDSs allow the complete automation of queries concerning timetable information, while in case of failure the user call can be transferred to a human operator (who already knows the information that the user has previously communicated to the automatic system). These features have been implemented in a call centre architecture that supports the interaction between human and artificial agents. The automatic system is based on CSELT speech recognition technology. The spoken dialogue system can support different human–computer interaction models, ranging from system-driven to mixed-initiative (Danieli et al., 1997): the interaction style adopted in the experimental call centre described in this paper is guided by the system (Baggia and Danieli, 1998).
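The Full-Automatic modality with fallback to a human operator can be sketched roughly as follows. This is only an illustration under stated assumptions, not the CSELT architecture: the slot names, the failure limit of three turns, and the names `CallContext`, `route_call` and `operator_briefing` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict

# Hypothetical set of values the system needs before it can query the timetable.
REQUIRED_SLOTS = ("origin", "destination", "date", "time")


@dataclass
class CallContext:
    """Information gathered from the caller so far (hypothetical structure)."""
    slots: Dict[str, str] = field(default_factory=dict)
    failed_turns: int = 0


def route_call(ctx: CallContext, max_failures: int = 3) -> str:
    """Decide the next step for a call handled in Full-Automatic modality."""
    if ctx.failed_turns >= max_failures:
        # The dialogue is failing: hand the call over to a human operator.
        return "transfer_to_operator"
    if all(name in ctx.slots for name in REQUIRED_SLOTS):
        return "answer_automatically"
    return "continue_dialogue"


def operator_briefing(ctx: CallContext) -> str:
    """Summarise what the caller already told the system, for the operator's screen."""
    known = ", ".join(f"{name}={value}" for name, value in ctx.slots.items())
    return known or "no information collected"
```

The briefing step reflects the point made above: when a call is transferred, the operator starts from what the caller has already said rather than from scratch.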

In Section 2 a brief description of the call centre architecture is given. This architecture was tested in a field trial with experimental subjects in October 1997; since then, it has been improved on the basis of the subjective and objective evaluation results. The relevant data of the evaluation are reported in Section 3; in particular, we present evaluation data related to the Full-Automatic modality and subjective evaluation data showing users’ acceptance of the SDS.

A key issue for the successful introduction of artificial agents in information centres that were traditionally human-operated is the acceptance of the advanced technology by the human operators, since they have to modify their working style. This aspect was taken into careful consideration during all the phases of the project: Giovara (1998) illustrates the theoretical and practical aspects of the training courses organized for the human operators of the railway information call centre located in Milano. The operators’ training was held before the second field trial described in Section 4. This very large field trial was carried out with real callers connected via ordinary telephone lines. The results of this second trial were considered promising by the Italian railway company, which consequently decided to improve all its call centres, namely the FS-Informa information network, by implementing the architecture derived from the studies and experiments carried out during the Arise project. In Appendix A, the exploitation in the FS-Informa call centres is briefly illustrated.

Section snippets

Description of the Italian system

Building on the results of the pilot project Railtel, and subsequently within the Arise project, an SDS for accessing timetable information of the Italian railway network was developed. Fig. 1 shows the architecture of the call centre in which the prototype was integrated. All the telephone activities of the call centre are managed by an Automatic Call

The first field trial

The first field trial of the system was organised by Saritel on 1–3 October 1997. The aim of the trial was to identify possible shortcomings in order to enhance and optimise the service. For a more detailed description of the results of this trial see Baggia et al. (1998).

The FS field trial

The goal of the second field trial was to demonstrate the performance of the Arise prototype in field conditions, with professional FS-Informa operators and real telephone traffic. The trial was carried out between 13 January and 7 February 1998 in the experimental Arise call centre located in a railway station (Stazione Porta Garibaldi) in Milan.

Conclusions

The field trials showed that the integration of the SDS in a call centre could dramatically improve the work of the human operators (four operators were able to manage the telephone traffic traditionally served by a larger number of human operators). Careful training of the operators is essential for the successful introduction of spoken language technologies: the second field trial showed that the duration of a call was reduced by nearly 30% after the second week of the trial. The callers

Acknowledgements

The work described in this paper would not have been possible without the advice and the efforts of our colleagues in the CSELT Spoken Language Systems group. Special thanks to Marzia Lampis and Antonio Romeo, and to two anonymous reviewers for their helpful comments.

References (19)

  • Billi, R., et al., 1997. Field trial evaluations of two different information inquiry systems. Speech Communication.
  • Albesano, D., et al., 1997. A robust system for human–machine dialogue in telephony-based applications. International Journal of Speech Technology.
  • Baggia, P., Kellner, A., Pérennou, G., Popovici, C., Sturm, J., Wessel, F., 1999. Language modelling and spoken...
  • Baggia, P., Danieli, M., 1998. CSELT approaches to spoken dialogue. In: Proceedings of the International Workshop on...
  • Baggia, P., Castagneri, G., Danieli, M., 1998. Recent results from a field trial of the ARISE Italian system for train...
  • Danieli, M., Gerbino, E., Moisa, L., 1997. Dialogue strategies for improving the usability of telephone human–machine...
  • Danieli, M., 1996. On the use of expectation for detecting and repairing human–machine miscommunications. In:...
  • Den Os, E., Boves, L., Lamel, L., Baggia, P., 1999. Overview of the ARISE project. In: Proceedings of EUROSPEECH,...
  • Ferguson, G., Allen, J.F., 1998. TRIPS: An integrated intelligent problem-solving assistant. In: Proceedings of the...

Cited by (10)

  • Relations between de-facto criteria in the evaluation of a spoken dialogue system

    2008, Speech Communication
    Citation Excerpt :

    However, in the literature some authors employ the term “field test” or “field trial” to describe studies in which the interactions are carried out by users who employ the telephone network instead of a laboratory environment, even when they are following predefined scenarios. This is the case, for example, for the evaluation of the ARISE spoken dialogue system (Baggia et al., 2000), which measures the impact of a train timetable system on the working routines of human operators, and on the callers who are traditionally served by the operators. Although the authors report experiments as “field studies”, in the first experiment they contacted different callers who were asked to use the system by following different scenarios, and to fill in a questionnaire where they expressed their opinions about the system performance.

  • Data-driven generation of phonetic broad classes, based on phoneme confusion matrix similarity

    2005, Speech Communication
    Citation Excerpt :

    The size of vocabulary incorporated could be classified as medium size. Such vocabularies were applied in diverse telephone applications (Baggia et al., 2000; Johnston et al., 1997; Žgank and Rojc, 2003). As mentioned before, this paper addresses the monolingual and multilingual speech recognition systems.

  • Reinforcement Learning for Dialogue Generation: A Systematic Literature Review

    2021, 4th International Conference on Innovative Computing, ICIC 2021
  • Quality of telephone-based spoken dialogue systems

    2005, Quality of Telephone-Based Spoken Dialogue Systems
  • ADAM: The SI-TAL corpus of annotated dialogues

    2002, Proceedings of the 3rd International Conference on Language Resources and Evaluation, LREC 2002