Abstract
With the recent spread of speech technologies and the increasing availability of application programming interfaces for speech synthesis and recognition, system designers are starting to consider whether to add speech functionality to their applications. The questions that ensue are by no means trivial. SMALTO, the tool described below, provides advice on the use of speech input and/or output modalities in combination with other modalities in the design of multimodal systems. SMALTO (Speech Modality AuxiLiary TOol) implements a theory of modalities and incorporates structured data extracted from a corpus of claims about speech functionality found in recent literature on multimodality. The current version of the system, implemented as a hypertext system, aims mainly at supporting decisions at early design stages. However, further uses of SMALTO as part of a complete domain-oriented design environment are also envisaged.
Cite this article
Luz, S., Bernsen, N.O. A Tool for Interactive Advice on the Use of Speech in Multimodal Systems. The Journal of VLSI Signal Processing-Systems for Signal, Image, and Video Technology 29, 129–137 (2001). https://doi.org/10.1023/A:1011183800658