Abstract
In this article, we discuss robustness and portability issues for parsing components in interactive speech systems. The robustness is obtained by choosing an appropriate grammar formalism. It should be well adapted to spontaneous speech effects, which are frequent in these application domains. Portability, on the other hand, can be achieved by choosing a flexible grammar implementation. We illustrate both issues by describing a stochastic parsing component implemented and evaluated for spoken language translation and information retrieval applications.
Cite this article
Minker, W. Robustness and Portability Issues in Multilingual Speech Processing. Machine Translation 16, 109–126 (2001). https://doi.org/10.1023/A:1014574522188
DOI: https://doi.org/10.1023/A:1014574522188