Abstract
This article presents an approach to parsing natural language queries that integrates multiple subparsers and subgrammars, in contrast to the traditional approach of a single grammar and a single parser. When LR(k) parsers are used for natural language processing, the parsing table grows rapidly as the number of grammar rules increases. We propose partitioning the grammar into multiple subgrammars, each with its own parsing table and parser. Grammar partitioning reduces the overall parsing table size compared with using a single grammar. We used the GLR parser with an LR(1) parsing table in our framework because GLR parsers can handle ambiguity in natural language. A parser composition technique then combines the parsers' outputs to produce an overall parse that is the same as the output parse of a single parser. Two strategies were used for parser composition: (i) parser composition by cascading; and (ii) parser composition with predictive pruning.
Our experiments were conducted with natural language queries from the ATIS (Air Travel Information Service) domain. We manually translated the ATIS-3 corpora into Chinese, which allowed us to experiment with grammar partitioning on parallel linguistic corpora. For English, the unpartitioned ATIS grammar has 72,869 states in its parsing table, while the partitioned English grammar has 3,350 states in total. For Chinese, grammar partitioning reduced the overall parsing table size from 29,734 states to 3,894 states. Both results show that grammar partitioning greatly reduces the overall parsing table size. Language understanding performance was also examined. Parser composition gives our framework a robust parsing capability, and hence yields higher understanding performance than a single GLR parser.
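To make the reported table sizes easier to compare, the reduction can be worked out directly from the four state counts given in the abstract. The short Python sketch below does only that arithmetic; the function name `table_reduction` and the `results` dictionary are illustrative names introduced here, not anything from the article.

```python
# State counts as reported in the abstract: (unpartitioned, partitioned total).
STATE_COUNTS = {
    "English": (72_869, 3_350),
    "Chinese": (29_734, 3_894),
}


def table_reduction(before: int, after: int) -> tuple[float, float]:
    """Return (percent reduction, shrink factor) for a parsing table.

    `before` is the number of states for the single grammar, `after` the
    total number of states summed over all subgrammars.
    """
    percent = 100.0 * (before - after) / before
    factor = before / after
    return percent, factor


# Hypothetical helper structure: language -> (percent reduction, shrink factor).
results = {lang: table_reduction(b, a) for lang, (b, a) in STATE_COUNTS.items()}

for lang, (percent, factor) in results.items():
    print(f"{lang}: {percent:.1f}% fewer states ({factor:.1f}x smaller)")
```

By this calculation the partitioned English tables hold roughly 95% fewer states than the single table, and the partitioned Chinese tables roughly 87% fewer.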
Index Terms
- GLR parsing with multiple grammars for natural language queries