A Combined Approach for Ontology Enrichment from Textual and Open Data

Part of the book series: Studies in Computational Intelligence ((SCI,volume 732))

Abstract

This paper proposes an ontology enrichment approach for automatically labeling documents that describe entities, using very specific concepts that reflect particular users’ needs. The approach addresses a threefold challenge: (1) the concepts used for labeling have no direct terminology in the documents, (2) their formal definitions are not initially known, and (3) the information needed to label the documents is not necessarily mentioned in them. To solve these problems, we propose to use an existing ontology of the domain concerned and to enrich it with the definitions of the concepts used for labeling. To construct these definitions, we work on a set of manually labeled documents, used as examples. The ontology is populated with information extracted from these documents and with information coming from external resources (Linked Open Data). The desired definitions can then be learned from this populated ontology and the set of labeled documents. The learned definitions are then added to the ontology (ontology enrichment). Hence, whenever new documents of the same domain have to be labeled, the ontology can be populated in the same way and the definitions applied, allowing the new documents to be labeled. This approach, named Saupodoc, is a novel approach to ontology population and enrichment that exploits the foundations of the Semantic Web by combining text analysis, Linked Open Data extraction, machine learning and reasoning tools. An evaluation on two application domains yields good-quality results and demonstrates the value of the approach.
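The populate, learn, and label cycle described in the abstract can be illustrated with a deliberately simplified sketch. It is not the Saupodoc implementation: the ontology is reduced to a dictionary of plain-string facts, and the learning step is a toy stand-in (the intersection of the facts shared by all positive examples) rather than a description-logic concept learner. All function names, entity names, and facts below are hypothetical.

```python
from typing import Dict, FrozenSet, Set

# Toy stand-in for a populated ontology: entity name -> set of facts.
Facts = Dict[str, Set[str]]


def populate(text_facts: Facts, lod_facts: Facts) -> Facts:
    """Merge facts extracted from documents with facts from external data."""
    merged: Facts = {}
    for source in (text_facts, lod_facts):
        for entity, facts in source.items():
            merged.setdefault(entity, set()).update(facts)
    return merged


def learn_definition(kb: Facts, positives: Set[str], negatives: Set[str]) -> FrozenSet[str]:
    """Learn a conjunctive definition: the facts shared by every positive example.

    Assumption: a simple intersection replaces real concept learning.
    """
    if not positives:
        raise ValueError("at least one positive example is required")
    common = set.intersection(*(kb.get(e, set()) for e in positives))
    # The definition is only acceptable if no negative example satisfies it.
    covered = sorted(e for e in negatives if common <= kb.get(e, set()))
    if covered:
        raise ValueError(f"definition also covers negative examples: {covered}")
    return frozenset(common)


def label(kb: Facts, definition: FrozenSet[str]) -> Set[str]:
    """Return the entities whose facts satisfy every condition of the definition."""
    return {entity for entity, facts in kb.items() if definition <= facts}


# Hypothetical manually labeled examples for one target concept.
text_facts = {"dest_a": {"activity:diving", "has:beach"},
              "dest_b": {"activity:diving"},
              "dest_c": {"activity:skiing"}}
lod_facts = {"dest_a": {"climate:warm"},
             "dest_b": {"climate:warm", "has:beach"},
             "dest_c": {"climate:cold"}}

kb = populate(text_facts, lod_facts)
definition = learn_definition(kb, positives={"dest_a", "dest_b"}, negatives={"dest_c"})

# New documents of the same domain are populated the same way, then labeled.
new_kb = populate({"dest_d": {"activity:diving", "has:beach"}, "dest_e": {"has:beach"}},
                  {"dest_d": {"climate:warm"}, "dest_e": {"climate:cold"}})
labeled = label(new_kb, definition)
```

In this sketch, `dest_b` only qualifies as a positive example once the external facts are merged in, which mirrors the abstract's point that the information needed for labeling is not always present in the documents themselves.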

Notes

  1. http://www.wepingo.com/

  2. http://www.opencalais.com/

  3. http://dbpedia.org/sparql

  4. http://www.thomascook.com/

Acknowledgements

We thank the Wepingo startup, which funded this work as part of the Poraso project.

Author information

Corresponding author

Correspondence to Céline Alec.

Copyright information

© 2018 Springer International Publishing AG

About this chapter

Cite this chapter

Alec, C., Reynaud-Delaître, C., Safar, B. (2018). A Combined Approach for Ontology Enrichment from Textual and Open Data. In: Pinaud, B., Guillet, F., Cremilleux, B., de Runz, C. (eds) Advances in Knowledge Discovery and Management. Studies in Computational Intelligence, vol 732. Springer, Cham. https://doi.org/10.1007/978-3-319-65406-5_1

  • DOI: https://doi.org/10.1007/978-3-319-65406-5_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-65405-8

  • Online ISBN: 978-3-319-65406-5

  • eBook Packages: Engineering, Engineering (R0)
