Abstract
Business firms around the world generate enormous volumes of domain-related documents. Most of these firms are adopting semantic Web-based techniques in their software systems, and they therefore want to enrich their documents semantically so that the information in them can be queried and processed more meaningfully. Imparting semantics to these documents requires ontologies relevant to the business domain. This paper presents a method for semi-automatic learning of extraction rules that populate the domain ontology with information from the source documents; the method is implemented in a rule learning system. A framework that separates business logic from application logic and stores business rules and extraction rules in an external, user-friendly format is also presented briefly. The rule learning system was developed mainly as a part of this framework, which uses it to learn extraction rules, but it can also be used as a standalone system to learn any decision or association rules. The main idea is to learn extraction rules that an information extraction system (part of the framework) then uses to populate the domain ontology. With appropriate wrappers, the learned extraction rules can also be used with any business rules management system (BRMS) to populate the domain ontology.
Appendix
See Fig. 9.
Cite this article
Ranganathan, G.R., Biletskiy, Y. & Akbari, I. Semi-Automatic Rule Learning Method Enabling Information Extraction for Ontology Population. Iran J Sci Technol Trans Electr Eng 40, 103–115 (2016). https://doi.org/10.1007/s40998-016-0011-3
DOI: https://doi.org/10.1007/s40998-016-0011-3