ABSTRACT
Recent contributions to statistical language modeling for speech recognition have shown that probabilistically parsing a partial word sequence aids the prediction of the next word, leading to "structured" language models that have the potential to outperform n-grams. Existing approaches to structured language modeling construct nodes in the partial parse tree after all of the underlying words have been predicted. This paper presents a different approach, based on probabilistic left-corner grammar (PLCG) parsing, that extends a partial parse both from the bottom up and from the top down, leading to a more focused and more accurate, though somewhat less robust, search of the parse space. At the core of our new structured language model is a fast context-sensitive and lexicalized PLCG parsing algorithm that uses dynamic programming. Preliminary perplexity and word-accuracy results appear to be competitive with previous ones, while speed is increased.
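The left-corner strategy the abstract describes — growing a partial parse bottom-up from a recognized constituent while filtering top-down against the current goal — can be illustrated with a toy, non-probabilistic recognizer. The grammar, lexicon, and function names below are assumptions made for the example; they are not the paper's PLCG model, which is probabilistic, lexicalized, and context-sensitive.

```python
from collections import defaultdict

# Toy grammar and lexicon (illustrative assumptions, not from the paper).
RULES = [
    ("S", ("NP", "VP")),
    ("NP", ("Det", "N")),
    ("NP", ("Pron",)),
    ("VP", ("V", "NP")),
]
LEX = {"the": "Det", "dog": "N", "she": "Pron", "saw": "V"}

def left_corner_closure(rules):
    """Reflexive-transitive closure of the left-corner relation:
    X relates to Y if some rule X -> Y ... exists."""
    syms = {lhs for lhs, _ in rules} | {s for _, rhs in rules for s in rhs}
    lc = defaultdict(set)
    for lhs, rhs in rules:
        lc[lhs].add(rhs[0])
    changed = True
    while changed:                      # transitive closure
        changed = False
        for x in list(lc):
            new = set().union(*(lc[y] for y in lc[x])) - lc[x]
            if new:
                lc[x] |= new
                changed = True
    for x in syms:                      # reflexive
        lc[x].add(x)
    return lc

LC = left_corner_closure(RULES)

def lc_recognize(words, goal="S"):
    """Left-corner recognizer: shift a word bottom-up, then project rules
    upward, filtering each projection top-down against the current goal."""

    def parse_goal(goal, i):
        # End positions j such that words[i:j] can be parsed as `goal`.
        if i < len(words) and LEX.get(words[i]) in LC[goal]:
            return complete(LEX[words[i]], i + 1, goal)
        return set()

    def complete(cat, j, goal):
        out = {j} if cat == goal else set()
        for lhs, rhs in RULES:
            # Bottom-up step: cat is the left corner of lhs.
            # Top-down filter: only project lhs if it can start the goal.
            if rhs[0] == cat and lhs in LC[goal]:
                ends = {j}
                for sym in rhs[1:]:
                    ends = {k for e in ends for k in parse_goal(sym, e)}
                for e in ends:
                    out |= complete(lhs, e, goal)
        return out

    return len(words) in parse_goal(goal, 0)
```

The top-down filter (`lhs in LC[goal]`) is what makes the search more focused than purely bottom-up parsing: a rule is projected only when its left-hand side can begin a derivation of the current goal, which is the mixed bottom-up/top-down behavior the abstract attributes to PLCG parsing.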
A structured language model based on context-sensitive probabilistic left-corner parsing