DOI: 10.1145/1647314.1647323
research-article

Dialog in the open world: platform and applications

Published: 02 November 2009

ABSTRACT

We review key challenges of developing spoken dialog systems that can engage in interactions with one or multiple participants in relatively unconstrained environments. We outline a set of core competencies for open-world dialog, and describe three prototype systems. The systems are built on a common underlying conversational framework which integrates an array of predictive models and component technologies, including speech recognition, head and pose tracking, probabilistic models for scene analysis, multiparty engagement and turn taking, and inferences about user goals and activities. We discuss the current models and showcase their function by means of a sample recorded interaction, and we review results from an observational study of open-world, multiparty dialog in the wild.

            Published in

              ICMI-MLMI '09: Proceedings of the 2009 international conference on Multimodal interfaces
              November 2009
              374 pages
              ISBN: 9781605587721
              DOI: 10.1145/1647314

              Copyright © 2009 ACM

              Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

              Publisher

              Association for Computing Machinery

              New York, NY, United States

              Acceptance Rates

              Overall Acceptance Rate: 453 of 1,080 submissions, 42%