Abstract
The spread and abundance of electronic documents requires automatic techniques for extracting useful information from the text they contain. The availability of conceptual taxonomies can be of great help, but manually building them is a complex and costly task. Building on previous work, we propose a technique to automatically extract conceptual graphs from text and reason with them. Since automated learning of taxonomies needs to be robust with respect to missing or partial knowledge and flexible with respect to noise, this work proposes a way to deal with these problems. The case of poor data/sparse concepts is tackled by finding generalizations among disjoint pieces of knowledge. Noise is handled by introducing soft relationships among concepts rather than hard ones, and applying a probabilistic inferential setting. In particular, we propose to reason on the extracted graph using different kinds of relationships among concepts, where each arc/relationship is associated to a weight that represents its likelihood among all possible worlds, and to face the problem of sparse knowledge by using generalizations among distant concepts as bridges between disjoint portions of knowledge.
Keywords
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
References
Cimiano, P., Hotho, A., Staab, S.: Learning concept hierarchies from text corpora using formal concept analysis. J. Artif. Int. Res. 24(1), 305–339 (2005)
de Marneffe, M.C., MacCartney, B., Manning, C.D.: Generating typed dependency parses from phrase structure trees. In: LREC (2006)
Fellbaum, C. (ed.): WordNet: An Electronic Lexical Database. MIT Press, Cambridge (1998)
Ferilli, S., Biba, M., Di Mauro, N., Basile, T.M.A., Esposito, F.: Plugging taxonomic similarity in first-order logic horn clauses comparison. In: Serra, R., Cucchiara, R. (eds.) AI*IA 2009. LNCS, vol. 5883, pp. 131–140. Springer, Heidelberg (2009)
Ferilli, S., Leuzzi, F., Rotella, F.: Cooperating techniques for extracting conceptual taxonomies from text. In: Proceedings of The Workshop on Mining Complex Patterns at AI*IA XIIth Conference (2011)
Hamming, R.W.: Error detecting and error correcting codes. Bell System Technical Journal 29(2), 147–160 (1950)
Kimmig, A., Santos Costa, V., Rocha, R., Demoen, B., De Raedt, L.: On the efficient execution of probLog programs. In: Garcia de la Banda, M., Pontelli, E. (eds.) ICLP 2008. LNCS, vol. 5366, pp. 175–189. Springer, Heidelberg (2008)
Kimmig, A., De Raedt, L., Toivonen, H.: Probabilistic explanation based learning. In: Kok, J.N., Koronacki, J., Lopez de Mantaras, R., Matwin, S., Mladenič, D., Skowron, A. (eds.) ECML 2007. LNCS (LNAI), vol. 4701, pp. 176–187. Springer, Heidelberg (2007)
Klein, D., Manning, C.D.: Fast exact inference with a factored model for natural language parsing. In: Advances in Neural Information Processing Systems, vol. 15, MIT Press (2003)
Maedche, A., Staab, S.: Mining ontologies from text. In: Dieng, R., Corby, O. (eds.) EKAW 2000. LNCS (LNAI), vol. 1937, pp. 189–202. Springer, Heidelberg (2000)
Maedche, A., Staab, S.: The text-to-onto ontology learning environment. In: ICCS-2000 - Eight International Conference on Conceptual Structures, Software Demonstration (2000)
Ogata, N.: A formal ontology discovery from web documents. In: Zhong, N., Yao, Y., Ohsuga, S., Liu, J. (eds.) WI 2001. LNCS (LNAI), vol. 2198, pp. 514–519. Springer, Heidelberg (2001)
Cucchiarelli, A., Velardi, P., Navigli, R., Neri, F.: Evaluation of OntoLearn, a methodology for automatic population of domain ontologies. In: Ontology Learning from Text: Methods, Applications and Evaluation. IOS Press (2006)
De Raedt, L., Kimmig, A., Toivonen, H.: Problog: a probabilistic prolog and its application in link discovery. In: Proceedings of 20th IJCAI, pp. 2468–2473. AAAI Press (2007)
Sato, T.: A statistical learning method for logic programs with distribution semantics. In: Proceedings of the 12th ICLP 1995, pp. 715–729. MIT Press (1995)
Wu, Z., Palmer, M.: Verbs semantics and lexical selection. In: Proceedings of the 32nd Annual Meeting on Association for Computational Linguistics, Morristown, NJ, USA, pp. 133–138. Association for Computational Linguistics (1994)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Leuzzi, F., Ferilli, S., Rotella, F. (2013). Improving Robustness and Flexibility of Concept Taxonomy Learning from Text. In: Appice, A., Ceci, M., Loglisci, C., Manco, G., Masciari, E., Ras, Z.W. (eds) New Frontiers in Mining Complex Patterns. NFMCP 2012. Lecture Notes in Computer Science(), vol 7765. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37382-4_12
Download citation
DOI: https://doi.org/10.1007/978-3-642-37382-4_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-37381-7
Online ISBN: 978-3-642-37382-4
eBook Packages: Computer ScienceComputer Science (R0)