ABSTRACT
In this paper we describe a method for automatically learning domain theories from parsed corpora of sentences from the relevant domain, and we use finite-state automaton (FSA) techniques to represent such a theory graphically. By a 'domain theory' we mean a collection of facts and generalisations or rules that capture what commonly happens (or does not happen) in some domain of interest. As language users, we implicitly draw on such theories in various disambiguation tasks, such as anaphora resolution and prepositional phrase attachment, and formal encodings of domain theories can be used for this purpose in natural language processing. They may also be objects of interest in their own right, that is, as the output of a knowledge discovery process. The approach is generalisable to different domains, provided it is possible to obtain logical forms for the text in the domain.
- Learning theories from text