Semi-Automatic Rule Learning Method Enabling Information Extraction for Ontology Population

  • Research Paper
  • Published:
Iranian Journal of Science and Technology, Transactions of Electrical Engineering

Abstract

Business firms around the world generate enormous volumes of domain-related documents, and most of these firms are adopting Semantic Web techniques in their software systems. They therefore want to enrich their documents semantically so that the information in them can be queried and processed more meaningfully. Imparting semantics to these documents requires ontologies relevant to the business domain. This paper presents a method for semi-automatic learning of extraction rules that populate a domain ontology with information from source documents; the method is implemented in a rule learning system. The paper also briefly presents a framework that separates business logic from application logic and stores business rules and extraction rules in an external, user-friendly format. The rule learning system was developed primarily as a component of this framework, where it learns extraction rules, but it can also be used as a standalone system to learn decision or association rules. The central idea is that the learned extraction rules are used by an information extraction system, itself part of the framework, to populate the domain ontology. With appropriate wrappers, the learned rules can also be used with any business rules management system (BRMS) for the same purpose.
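
To make the idea of externally stored extraction rules concrete, the following is a minimal sketch and not the authors' system. It assumes that an extraction rule can be reduced to a regular expression paired with a target ontology class and property, and that the populated ontology can be represented as simple (class, property, value) triples. The names `ExtractionRule`, `RULES`, and `populate`, and the invoice-style sample text, are hypothetical and chosen only for illustration.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class ExtractionRule:
    """Hypothetical rule: a regex whose first group fills an ontology property."""
    pattern: str
    ontology_class: str
    property_name: str


# Rules kept as plain data, standing in for an external, user-editable rule store.
# Both rules are illustrative assumptions, not rules from the paper.
RULES = [
    ExtractionRule(r"Invoice\s+No\.?\s*(\w+)", "Invoice", "hasNumber"),
    ExtractionRule(r"Supplier:\s*([A-Za-z ]+)", "Supplier", "hasName"),
]


def populate(text, rules):
    """Apply every rule to the text and return (class, property, value) triples."""
    triples = []
    for rule in rules:
        for match in re.finditer(rule.pattern, text):
            value = match.group(1).strip()
            triples.append((rule.ontology_class, rule.property_name, value))
    return triples


if __name__ == "__main__":
    sample = "Invoice No. A17\nSupplier: Acme Ltd\n"
    for triple in populate(sample, RULES):
        print(triple)
```

Because the rules are data rather than code, they could in principle be edited or replaced without touching `populate`, which loosely mirrors the separation of business logic from application logic described above. A rule learner would produce entries like those in `RULES` instead of having them written by hand.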

Author information

Corresponding author

Correspondence to Ismail Akbari.

Appendix

See Fig. 9.

Fig. 9. A part of the XML file with annotation information

About this article

Cite this article

Ranganathan, G.R., Biletskiy, Y. & Akbari, I. Semi-Automatic Rule Learning Method Enabling Information Extraction for Ontology Population. Iran J Sci Technol Trans Electr Eng 40, 103–115 (2016). https://doi.org/10.1007/s40998-016-0011-3

  • DOI: https://doi.org/10.1007/s40998-016-0011-3
