Partial multi-dividing ontology learning algorithm
Introduction
The concept of ontology, inspired by the philosophical notion, came into use in the sciences in the 1980s to refer to the different properties of a subject matter and the relations among them. Later, it was introduced into computer science and information technology, and from the 1990s it became one of the hot research fields in artificial intelligence. Because of its powerful semantic query and concept management abilities, ontology has been applied to many other fields over the past 10 years. It is now used in nearly all disciplines, such as chemical science (see for instance Vijayasarathi and Sankar [47] or Banchetti-Robino [4]), pharmacology (see Sarntivijai et al. [36]), biology (see Kohler et al. [26], Levine et al. [30] and Vishnu et al. [48]), psychology (see Aime and Charlet [1] and Petrunia [34]), education (see Demartini et al. [12], Kruger-Ross [28] and Ochara [33]), geographic information systems (GIS) (see Vaccari et al. [46], Delgado et al. [11] and Tahmoorespur et al. [44]), medical science (see Bertaud-Gounot et al. [6] and Lousado et al. [31]), material science (see Cuccia [9] and Ghibaudi and Cerruti [23]), and neuroscience (see Bowden et al. [7] and Fumagalli [13]).
As a conceptual model for storing and managing information, ontology has received wide attention in the field of information retrieval. Using ontology similarity calculation, we can effectively find concepts that are semantically similar to the original retrieval concept, carry out an extended query in the retrieval, and return the result to the user. This technique can greatly improve the intelligence of information retrieval. For example, if we search for the keyword “computer”, the traditional search method returns computer-related information ordered by degree of relevance from high to low and presents it to the user. However, this retrieval is based on keyword matching, so a similar item that contains only the word “laptop” cannot be matched to “computer”, even though the words “computer” and “laptop” share high semantic similarity. With the help of an ontology for query expansion, it can be found that the similarity between “laptop” and “computer” is very high. Thus, in order to find information related to the computer, we also find laptop-related information and return it to the user according to the similarity. The advantage is that the retrieval of query information is intelligent and very comprehensive.
There have been several advances in ontology semantic similarity computation. Rodriguez and Egenhofer [35] presented a method to compute semantic similarity which relaxes the requirement of a single ontology. Steichen et al. [42] constructed a morphological abnormality ontology in breast pathology to assist inter-observer consensus, and implemented position-based, content-based and mixed semantic similarity measures between concepts in this ontology. Al-Mubaid and Nguyen [3] proposed an ontology-structure-based technique for measuring semantic similarity across multiple ontologies. By means of the human phenotype ontology, Kohler et al. [27] adapted semantic similarity metrics to compute phenotypic similarity between queries and annotated hereditary diseases. Batet et al. [5] studied a measure that exploits the taxonomical structure of a biomedical ontology. Albacete et al. [2] gave a proposal for computing a similarity function for each dimension of knowledge. Taha [43] presented techniques for determining the semantic relationships among GO terms. Taieb et al. [45] proposed an ontology measure for quantifying the degree of semantic similarity between concepts. Mazandu et al. [32] introduced an adaptable gene ontology semantic similarity-based functional analysis tool. Lastra-Diaz et al. [29] presented a detailed companion reproducibility article for the methods and experiments proposed by former researchers, in a survey where the state of the art on this topic is presented.
Specifically, the framework of ontology can be expressed as a simple graph in which each concept, element or object corresponds to a vertex of the graph and each edge represents a potential link (or potential relationship) between two concepts.
Under these conditions, let G be a graph corresponding to the ontology O with vertex set V(G) and edge set E(G). In engineering applications of ontology to various fields, the fundamental goal of an ontology algorithm is to obtain the best ontology function, which is applied to measure the similarities between ontology vertices in a single ontology or in multiple ontologies. The aim of ontology mapping is to find highly similar vertices from different ontologies, i.e., to deduce the similarity between two or more ontologies. It is used to build a bridge between different ontologies and thus helps to yield potential connections among the elements or objects of the target ontologies.
Initially, the formulas for measuring ontology similarity were designed heuristically, i.e., the similarity formula was determined by the researchers according to the structural features of the ontology and the characteristics of the specific application domain. The shortcomings of this method are:
1. It relies on the participation of high-level domain experts.
2. The similarity formula contains many manually set parameters.
3. It cannot adapt to dynamic changes in the ontology.
4. It has high computational complexity and is therefore ill-suited to applications with a big data background.
In order to overcome these shortcomings, machine learning techniques have gradually been applied to ontology algorithms. The specific idea is to learn from samples an optimal ontology function f, which maps each vertex of the ontology graph to a real number and thus maps the whole ontology graph onto the one-dimensional real axis (for multiple ontologies, we put all the graphs into one graph, and each ontology is seen as a connected component of that graph). The similarity between ontology concepts is then determined by the distance between their corresponding vertices on the real axis. That is, the similarity between vertices vi and vj is measured by |f(vi) − f(vj)|: a smaller distance means a higher similarity. The advantages of this approach are that it does not depend on domain experts, the results are intuitive, the number of manually set parameters is greatly reduced, and, most importantly, the computational complexity is greatly reduced because no pairwise similarity formula has to be evaluated.
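The idea of ranking concepts by their distance on the real axis can be illustrated with a minimal sketch. The scores below are hypothetical values chosen for illustration, not the output of any learned ontology function, and the helper name `most_similar` is an assumption of this example.

```python
from typing import Dict, List, Tuple


def most_similar(scores: Dict[str, float], query: str, top_n: int = 3) -> List[Tuple[str, float]]:
    """Rank all other vertices by |f(v) - f(query)|, smallest distance first."""
    f_query = scores[query]
    distances = [
        (vertex, abs(f_vertex - f_query))
        for vertex, f_vertex in scores.items()
        if vertex != query
    ]
    # A smaller distance on the real axis means a higher similarity.
    return sorted(distances, key=lambda item: item[1])[:top_n]


# Hypothetical values of an ontology function f (assumed for illustration only).
scores = {"computer": 0.90, "laptop": 0.85, "printer": 0.60, "banana": 0.10}

for vertex, distance in most_similar(scores, "computer", top_n=2):
    print(vertex, round(distance, 2))
```

With these assumed scores, "laptop" is ranked ahead of "printer" for the query "computer". Each vertex is scored once and the scores are then compared, so no hand-designed pairwise similarity formula is needed.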
Several ontology learning algorithms and theoretical analysis results have been proposed in recent years. For instance, Gao et al. [17] studied the strong and weak stability of the k-partite ranking based ontology algorithm. Gao and Xu [19] presented a uniform stability analysis of learning algorithms for ontology similarity computation. Gao and Zhu [20] proposed a gradient based ontology learning algorithm. Gao et al. [21] obtained an ontology sparse vector learning algorithm using ADAL technology. Gao and Farahani [15] studied the generalization bounds and uniform bounds for multi-dividing ontology algorithms with convex ontology loss function. For more related work on ontology and machine learning, we refer to Cucker and Zhou [10], Smale and Zhou [41], Zhou [50], Ibrahim et al. [24], Jiao et al. [25] and Shang et al. [37], [38], [39], [40].
Among these ontology learning algorithms, the multi-dividing ontology algorithm is the most popular approach. In it, all vertices in the ontology graph or multi-ontology graph are divided into k parts, corresponding to k classes of rates. It is assumed that f(va) > f(vb) if va belongs to rate a and vb belongs to rate b with 1 ≤ a < b ≤ k. Note that for an ontology graph with a tree or tree-like structure, each kind of branch corresponds to a rate in the dividing. Since most ontology graphs have a tree structure, the multi-dividing ontology algorithm is widely used in various engineering fields such as biology, medicine and chemistry. Gao and Farahani [15] and Wu et al. [49] each presented examples showing how the multi-dividing ontology algorithm is applied to specific engineering applications.
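The ordering requirement f(va) > f(vb) for 1 ≤ a < b ≤ k can be checked empirically. The following is a minimal sketch, not the algorithm of this paper: it counts the fraction of cross-rate vertex pairs that violate the requirement. The function name `multi_dividing_error` and the sample scores are assumptions made for illustration.

```python
from itertools import combinations
from typing import Dict, List


def multi_dividing_error(parts: Dict[int, List[float]]) -> float:
    """Fraction of pairs (va in rate a, vb in rate b, a < b) with f(va) <= f(vb).

    `parts` maps each rate index to the list of scores f(v) of its vertices.
    """
    violations = 0
    total = 0
    # combinations over the sorted rate indices yields every pair with a < b.
    for a, b in combinations(sorted(parts), 2):
        for f_a in parts[a]:
            for f_b in parts[b]:
                total += 1
                if f_a <= f_b:  # the required order f(va) > f(vb) is broken
                    violations += 1
    return violations / total if total else 0.0


# Hypothetical scores for k = 3 rates (assumed for illustration only).
parts = {1: [0.9, 0.8], 2: [0.6, 0.85], 3: [0.1]}
print(multi_dividing_error(parts))  # 1 violated pair out of 8 -> 0.125
```

Here the only violated pair is the vertex scored 0.8 in rate 1 against the vertex scored 0.85 in rate 2. A score of 0.0 means that the function orders the k parts perfectly.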
Although there have been several recent advances in the development of algorithms for various settings of the multi-dividing ontology learning problem, the study of further techniques and of the generalization properties of multi-dividing ontology learning algorithms has been largely limited to special settings. This motivates us to explore more advanced ontology learning techniques in the multi-dividing setting, together with their theoretical analysis from the standpoint of statistical learning theory.
In this paper, we present a partial multi-dividing ontology learning algorithm and study its statistical characteristics from a mathematical point of view. In this approach, we divide the whole ontology graph into branches which correspond to several rates. The optimal ontology function is obtained by learning from the ontology sample set, which can likewise be divided into k training subsets, and the partial learning framework in the multi-dividing setting plays a key role in the implementation process. The structure of the paper is as follows: firstly, we introduce the setting of multi-dividing ontology learning; secondly, the main algorithm is presented in Section 3; and finally, the effectiveness of the proposed ontology learning algorithm is demonstrated via five experiments in various engineering applications.
Preliminaries, notation and background
To express the learning setting mathematically, for each vertex in the ontology graph we use a p-dimensional vector to express all the semantic information of its corresponding ontology concept. We shall use v to denote both the vertex and its corresponding vector in ℝ^p.
Let V be a vertex space for the ontology graph G, with the vertices in V drawn independently and randomly according to a certain unknown distribution. The target of ontology learning algorithms is to predict
Description of the partial multi-dividing ontology algorithm
In this section, we consider ontology function denoted by for some . This section is organized as follows: we first introduce the structural SVM based multi-dividing ontology framework with hinge ontology loss; then the partial multi-dividing ontology framework with hinge ontology loss is presented; next, we discuss the optimization methods for the partial multi-dividing ontology framework based on structural SVM in interval and respectively
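To give a feel for a hinge ontology loss in the multi-dividing setting, the following sketch computes an averaged pairwise hinge loss. It rests on two assumptions that are not taken from the text above: that the ontology function is linear, f(v) = ⟨w, v⟩, and that the loss for a pair (va, vb) with a < b is max(0, margin − (f(va) − f(vb))). It does not implement the structural SVM or the partial framework of this section; all names and values are hypothetical.

```python
from typing import Dict, List, Sequence


def score(w: Sequence[float], v: Sequence[float]) -> float:
    """Assumed linear ontology function f(v) = <w, v>."""
    return sum(w_i * v_i for w_i, v_i in zip(w, v))


def pairwise_hinge_loss(
    w: Sequence[float],
    parts: Dict[int, List[Sequence[float]]],
    margin: float = 1.0,
) -> float:
    """Average of max(0, margin - (f(va) - f(vb))) over all pairs with a < b."""
    losses = []
    rates = sorted(parts)
    for i, a in enumerate(rates):
        for b in rates[i + 1:]:  # only rates b with a < b
            for v_a in parts[a]:
                for v_b in parts[b]:
                    gap = score(w, v_a) - score(w, v_b)
                    losses.append(max(0.0, margin - gap))
    return sum(losses) / len(losses) if losses else 0.0


# Hypothetical 2-dimensional vertex vectors grouped by rate (illustration only).
w = [1.0, 0.0]
parts = {1: [[2.0, 0.0]], 2: [[0.5, 1.0]], 3: [[0.0, 0.0]]}
print(pairwise_hinge_loss(w, parts))
```

In this toy example only the pair from rates 2 and 3 falls inside the margin (score gap 0.5), so it is the only pair that contributes to the loss. Unlike a plain count of mis-ordered pairs, the hinge loss is convex in w, which is the usual reason for using it as a surrogate in SVM-type formulations.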
Experiments
We underline that, to implement our algorithm in the mathematical learning setting, for each vertex in each ontology in the experiments we use fixed-dimensional vectors to express the vertex’s semantic and structural information. All the information about a concept, including its name, attributes, instances and the structure of its vertex in the ontology graph, is packaged in its corresponding vector. In this section, five experiments are designed and presented to measure the effectiveness of our partial
Conclusions
In recent years, since most ontology structures can be expressed as a tree or a tree-like graph, multi-dividing ontology learning has become a hot topic in ontology research. In this setting, all concepts are divided into k parts corresponding to k rates according to the branches of the ontology tree, and the ranking among these k parts is determined by domain experts. There have been several advances in both theory and engineering applications in the multi-dividing ontology setting, and proved to be in high
Conflict of interests
The authors hereby declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgment
We thank the reviewers for their constructive comments in improving the quality of this paper. This work has been partially supported by MINECO grant number MTM2014-51891-P and Fundación Séneca de la Región de Murcia grant number 19219/PI/14 and National Science Foundation of China grant number 11761083.
References (50)
- et al., Social psychology insights into ontology engineering, Future Gen. Comput. Syst. Int. J. Escience (2016)
- et al., An ontology-based measure to compute semantic similarity in biomedicine, J. Biomed. Inform. (2011)
- et al., Margin based ontology sparse vector learning algorithm and applied in biology science, Saudi J. Biol. Sci. (2017)
- Raising the question of being in education by way of Heidegger’s phenomenological ontology, Indo-Pac. J. Phenomenol. (2015)
- et al., A tuberculosis ontology for host systems biology, Tuberculosis (2015)
- et al., A-DaGO-fun: an adaptable gene ontology semantic similarity-based functional analysis tool, Bioinformatics (2016)
- Towards a regional ontology of management education in Africa: a complexity leadership theory perspective, Acta Commercii (2017)
- Dimensionally ontology V. Frankl as the conceptual basis for interdisciplinary synthesis of biomedicine, psychology and computing, Biomed. Radioeng. (2015)
- et al., Global discriminative-based nonnegative spectral clustering, Pattern Recognit. (2016)
- et al., Computation of semantic similarity within an ontology of breast pathology to assist inter-observer consensus, Comput. Biol. Med. (2006)
- Determining the semantic similarities among gene ontology terms, IEEE J. Biomed. Health Inform.
- Ontology-based approach for measuring semantic similarity, Eng. Appl. Arti. Intell.
- Semantic similarity measures applied to an ontology for human-like interaction, J. Artif. Intel. Res.
- Measuring semantic similarity between biomedical concepts within multiple ontologies, IEEE Trans. Syst. Cybern. Part C-Appl. Rev.
- Van Helmont’s hybrid ontology and its influence on the chemical interpretation of spirit and ferment, Found. Chem.
- Ontology and medical diagnosis, Inform. Health Soc. Care
- Neuronames: an ontology for the braininfo portal to neuroscience on the web, Neuroinformatics
- Overview of the TREC 2003 Web Track, Proceedings of the Twelfth Text Retrieval Conference, Gaithersburg
- Aquinas’s ontology of the material world. Change: hylomorphism and material objects, Scripta Mediaevalia
- Learning Theory: An Approximation Theory Viewpoint
- An evaluation of ontology matching techniques on geospatial ontologies, Int. J. Geogr. Inf. Sci.
- The bowlogna ontology: fostering open curricula and agile knowledge bases for Europe’s higher education landscape, Semant. Web
- Choice models and realistic ontologies: three challenges to neuro-psychological modellers, Eur. J. Philos. Sci.
- Generalization bounds and uniform bounds for multi-dividing ontology algorithms with convex ontology loss function, Comput. J.
- Distance learning techniques for ontology similarity measuring and ontology mapping, Clust. Comput. J. Netw. Softw. Tools Appl.