ABSTRACT
This paper presents a series of measurements of speech understanding accuracy under grammar-based and robust approaches. The robust approaches considered here are based on statistical language models (SLMs), with interpretation carried out by phrase-spotting or robust parsing methods. We propose a simple process that leverages existing grammars and logged utterances to make grammar-based applications more robust to out-of-coverage inputs. All experiments are run on data collected from deployed directed dialog applications, and they show that SLM-based techniques outperform grammar-based ones without requiring any change in the application logic.
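The contrast between strict grammar-based interpretation and phrase-spotting can be illustrated with a toy sketch. This is not the paper's implementation: the phrase table, the slot values, and the function names are all hypothetical, and the recognizer (grammar-based or SLM-based) is assumed to have already produced a word string.

```python
# Toy illustration only: the phrases, slot values, and function names are
# hypothetical and do not come from the paper.
GRAMMAR = {
    "checking account": "CHECKING",
    "savings account": "SAVINGS",
    "checking": "CHECKING",
    "savings": "SAVINGS",
}


def normalize(utterance):
    """Lowercase the utterance, drop commas and periods, split into tokens."""
    return utterance.lower().replace(",", " ").replace(".", " ").split()


def grammar_interpret(utterance):
    """Strict matching: the whole utterance must be an in-grammar phrase."""
    return GRAMMAR.get(" ".join(normalize(utterance)))


def phrase_spot_interpret(utterance):
    """Phrase spotting: return the slot of the longest in-grammar phrase
    found anywhere in the utterance, ignoring the surrounding words."""
    tokens = normalize(utterance)
    best = None  # (phrase length in tokens, slot value)
    for start in range(len(tokens)):
        for end in range(start + 1, len(tokens) + 1):
            phrase = " ".join(tokens[start:end])
            if phrase in GRAMMAR and (best is None or end - start > best[0]):
                best = (end - start, GRAMMAR[phrase])
    return best[1] if best else None


in_coverage = "checking account"
out_of_coverage = "um I'd like my checking account please"

print(grammar_interpret(in_coverage), phrase_spot_interpret(in_coverage))
print(grammar_interpret(out_of_coverage), phrase_spot_interpret(out_of_coverage))
```

In this sketch the strict matcher returns no interpretation for the out-of-coverage utterance, while the phrase spotter still recovers the slot. Because both functions return the same kind of slot value, the calling application logic would not need to change, which mirrors the claim in the abstract.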
Enhancing commercial grammar-based applications using robust approaches to speech understanding