Matching Sensor Ontologies with Simulated Annealing Particle Swarm Optimization

In recent years, innovative positioning and mobile communication techniques have been developed to enable Location-Based Services (LBSs). With the help of sensors, an LBS is able to detect and sense information from the outside world to provide location-related services. To implement intelligent LBSs, it is necessary to develop the Semantic Sensor Web (SSW), which makes use of sensor ontologies to implement sensor data interoperability, information sharing, and knowledge fusion among intelligent systems. Due to the subjectivity of sensor ontology engineers, the heterogeneity problem is introduced, which hampers communication among these sensor ontologies. To address this problem, sensor ontology matching is used to establish correspondences between different sensor terms. Among ontology matching technologies, Particle Swarm Optimization (PSO) is a promising method for dealing with the low-quality ontology alignment problem. To further enhance the quality of matching results, in this work, sensor ontology matching is first modeled as a meta-matching problem, and then, based on this model, a Simulated Annealing PSO (SAPSO) is proposed to optimize the aggregating weights of various similarity measures and the filtering threshold. In particular, approximate evaluation metrics for assessing the quality of an alignment without a reference are proposed, and a Simulated Annealing (SA) strategy is applied in PSO's evolutionary process, which helps the algorithm avoid local optima and enhances the quality of solutions. The well-known Ontology Alignment Evaluation Initiative's benchmark (OAEI's benchmark) and three real sensor ontologies are used to verify the effectiveness of SAPSO. The experimental results show that SAPSO is able to effectively match sensor ontologies.


Introduction
In recent years, innovative positioning and mobile communication techniques have been developed to enable Location-Based Services (LBSs) [1,2]. With the help of sensors, an LBS is able to detect and sense information from the outside world to provide location-related services. To implement intelligent LBSs, it is necessary to develop the Semantic Sensor Web (SSW) [3,4]; as the kernel technique of the SSW, a sensor ontology is a standard information exchange model, which serves as the basis for different machines to understand semantics and implement sensor data interoperability, information sharing, and knowledge fusion among intelligent systems.
Due to the subjectivity of sensor ontology engineers, they might use various concepts to mean the same thing, or one concept might have more than one meaning, yielding the problem of heterogeneity, which affects semantic interoperability between ontologies. Ontology matching [5][6][7] is a powerful tool to face this challenge, and it has been widely applied in different application domains, such as the Artificial Internet of Things (AIoT) [8,9] and the biomedical domain [10]. Sensor ontology matching can be used to discover the semantic relationships between different sensor ontologies and thereby determine the correspondences between concepts of heterogeneous sensor ontologies. The similarity measure is critical for a sensor ontology matching technique. Due to the complicated semantic relationships among sensor data, a single similarity measure cannot distinguish all the semantically identical entities in every matching context. Thus, several different similarity measures are usually aggregated to enhance the confidence of the results. Ontology matching is then generally interpreted as finding a set of appropriate weights and a threshold that yield high-quality ontology alignments.
Particle Swarm Optimization (PSO) [11] is a promising methodology for determining high-quality ontology alignments [12]. Although PSO converges fast, it is apt to fall into local optima, which makes it unable to find the global optimal solution. To overcome this drawback, in this work, a Simulated Annealing PSO (SAPSO) is proposed to optimize the aggregating weights of various similarity measures and the filtering threshold. In particular, SAPSO introduces a Simulated Annealing (SA) strategy into the evolutionary process to further enhance the quality of solutions. The innovation points of this work are as follows: (1) An approximate evaluation metric on ontology alignment is proposed, and an optimization model for the sensor ontology meta-matching problem is constructed. (2) To effectively solve the sensor ontology meta-matching problem, an ontology meta-matching framework and a SAPSO algorithm are proposed.
This paper is organized as follows. Section 2 presents the related work. Section 3 gives the formal definitions of the sensor ontology and similarity measures. Section 4 constructs the optimization model for the sensor meta-matching problem. Section 5 presents SAPSO. Section 6 shows the experimental results and the corresponding analysis. Finally, Section 7 draws the conclusions and puts forward future research directions.

Swarm Intelligence Algorithm-Based Ontology Matching Technique
In different sensor ontologies, due to the subjectivity of the designers, the same concept in a sensor system may be named and defined in different ways, causing communication inconvenience between different sensor ontologies [13,14]. Due to the complex intrinsic nature of matching two ontologies, swarm intelligence algorithms, such as PSO, the Parallel Compact Cuckoo Search Algorithm (PCCSA) [15], the Artificial Bee Colony (ABC) algorithm [16], the Firefly Algorithm (FA) [10,17], and Evolutionary Algorithms (EAs) [18,19], have become effective methods for determining ontology alignments.
Bock et al. [20] used a discrete PSO algorithm to optimize the results of ontology entity matching, which does not require the computation of large similarity matrices. He et al. [16] used an ABC-based matcher to solve the ontology meta-matching problem, whose results proved more effective. Xue et al. [17] proposed a Compact Cooperative Firefly Algorithm-(CCFA-) based ontology matching system, which effectively improves the search efficiency by using a new mechanism. Xue et al. [12] also proposed a compact multiobjective PSO to solve the matching problem of large-scale biomedical ontologies. In addition, they [10] also proposed a Compact Firefly Algorithm (CFA), which greatly reduces the running time and memory consumption through two compact movement operators. Chu et al. [21] first built an ontology model in vector space and proposed a Compact Evolutionary Algorithm (CEA) to solve the ontology matching problem. In this work, we further introduce SA into PSO's evolutionary process to trade off its exploration and exploitation, which effectively helps the algorithm jump out of local optima.

Sensor Ontology and Similarity Measure
3.1. Sensor Ontology. In the computer and information science field, an ontology is a formal list of all the concepts and their relationships in a particular domain [18]. With respect to the SSW, a sensor ontology is the most important and extensive model for describing concepts related to sensors and the IoT [22,23], such as a sensor's output, observations, observation characteristics, and so on. For ease of description, a triple (C, P, I) [24] is used to represent a sensor ontology, where C, P, and I represent the sets of classes (or concepts), properties, and instances, respectively. An example of a sensor ontology is shown in Figure 1, where an ellipse represents a class and the arrows between the ellipses represent the classes' properties. A class is a collection of instances, and each element of I is an instance of a class. Generally, classes, properties, and instances are collectively called entities. The goal of sensor ontology matching [25] is to establish correspondences between heterogeneous entities and find the set of entity correspondences, the so-called sensor ontology alignment [26]. Here, an entity correspondence is a five-tuple <id, e, e′, c, R>, where id refers to the identifier of the entity correspondence; e and e′ are the entities of the two ontologies, respectively; c is the degree of confidence that e and e′ match, usually in [0, 1]; and R is the equivalence relationship between e and e′. The process of matching two sensor ontologies is shown in Figure 2, where O_1 and O_2, respectively, represent the two sensor ontologies to be aligned, A_I is the input alignment, p is a set of parameters, r represents some external resources, and A_N is the obtained alignment.
A similarity measure uses particular information to calculate to what extent two entities are similar. Generally, similarity measures can be divided into three types, which are described in detail in the following subsections.

Syntax-Based Similarity Measure.
A syntactic measure calculates the string distance between entities of different ontologies. In our work, we use the N-Gram distance, which is an effective syntactic metric in the ontology matching domain. N-Gram has an obvious advantage in comparing the similarity between two strings [27,28]. Given two strings, their N-Gram distance is calculated by measuring the number of common substrings they have. To be specific, the N-Gram distance is defined as follows:

NGram(s_1, s_2) = (2 × C(s_1, s_2)) / (n_{s_1} + n_{s_2}),

where s_1 and s_2 are the two strings to be compared; N stands for the length of each substring after splitting the original string, which is generally set to 2 or 3 (the lower the value of N, the higher the resulting similarity values; the value of N in this work is 3); C(s_1, s_2) is the number of their common substrings; and n_{s_1} and n_{s_2} are their lengths, respectively.
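To make the measure concrete, the following is a minimal Python sketch of an N-Gram similarity with N = 3; the normalization here (dividing the number of shared trigrams by the total number of trigrams in both strings) is one common choice and may differ slightly from the exact formula used in the paper.

```python
def ngram_similarity(s1: str, s2: str, n: int = 3) -> float:
    """N-Gram similarity: proportion of shared n-character substrings."""
    # Strings shorter than n have no n-grams; fall back to exact match.
    if len(s1) < n or len(s2) < n:
        return 1.0 if s1 == s2 else 0.0
    grams1 = {s1[i:i + n] for i in range(len(s1) - n + 1)}
    grams2 = {s2[i:i + n] for i in range(len(s2) - n + 1)}
    common = len(grams1 & grams2)  # C(s1, s2): common substrings
    # Normalize by the total number of n-grams of both strings.
    return 2.0 * common / (len(grams1) + len(grams2))
```

Identical strings score 1.0, strings with no shared trigram score 0.0, and partially overlapping strings fall in between.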

Linguistic-Based Similarity Measure.
Semantic similarity calculates the similarity between entities according to the semantic context. In our approach, we use the Wu-Palmer [29] similarity measure; in particular, it returns a fraction that indicates the degree of similarity between two words. In this work, we use WordNet [30], an English dictionary based on cognitive linguistics, to calculate the related variables in Wu-Palmer. We choose Wu-Palmer because it is the most popular WordNet-based similarity measure, which calculates the semantic similarity between two strings by considering not only the conceptual depth in WordNet's hierarchical semantic structure but also their context information. To be specific, it is defined as follows:

WuPalmer(s_1, s_2) = (2 × depth(lcs(s_1, s_2))) / (depth(s_1) + depth(s_2)),

where depth denotes the depth of a word in WordNet's hierarchical semantic structure and lcs(s_1, s_2) is the closest common parent concept of s_1 and s_2.
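The Wu-Palmer computation can be illustrated with a toy taxonomy standing in for WordNet; in practice, the depths and the least common subsumer would be looked up in WordNet (e.g., through a library such as NLTK). The class names and the parent-map representation below are purely illustrative.

```python
def depth(node, parent):
    """Depth of a node in the taxonomy (the root has depth 1)."""
    d = 1
    while node in parent:
        node = parent[node]
        d += 1
    return d

def lcs(a, b, parent):
    """Least common subsumer: closest common ancestor of a and b."""
    ancestors = {a}
    node = a
    while node in parent:          # collect a's ancestor chain
        node = parent[node]
        ancestors.add(node)
    node = b
    while node not in ancestors:   # climb from b until the chains meet
        node = parent[node]
    return node

def wu_palmer(a, b, parent):
    """Wu-Palmer similarity: 2*depth(lcs) / (depth(a) + depth(b))."""
    common = lcs(a, b, parent)
    return 2.0 * depth(common, parent) / (depth(a, parent) + depth(b, parent))
```

For example, with the toy taxonomy Thing > Device > {Sensor, Actuator}, Sensor and Actuator share the subsumer Device, giving a similarity of 2·2/(3+3) = 2/3.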

Structure-Based Similarity Measure.
The main idea of a structure-based similarity measure is to determine two entities' similarity through their neighboring entities (superclass and subclass relationships). In general, matched entities have similar structures, that is, they have the same numbers of superclasses and subclasses; conversely, if two entities have the same numbers of superclasses and subclasses, they are considered similar. In our work, the structure-based similarity measure that we use is called Out-In degree, which calculates the similarity according to the numbers of superclasses and subclasses of entities in different ontologies. It is defined as follows:

OutIn(e_1, e_2) = (min(out_{e_1}, out_{e_2}) + min(in_{e_1}, in_{e_2})) / (max(out_{e_1}, out_{e_2}) + max(in_{e_1}, in_{e_2})),

where out_e and in_e are, respectively, the numbers of subclasses and superclasses of entity e. Based on the three similarity measures, we can obtain three similarity matrices, respectively. A similarity matrix is defined as an m × n matrix, where m and n are, respectively, the numbers of entities in the source ontology and the target ontology. Each element of the matrix is the similarity value of the two corresponding entities determined by the similarity measure. After that, by assigning an aggregating weight to each similarity matrix, we can obtain an aggregated matrix, which is filtered with a similarity threshold to determine the final matrix. The ontology meta-matching problem can be defined as determining the optimal aggregating weights and threshold to obtain a high-quality ontology alignment, which is formally defined in the following.
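The aggregation-and-filtering step just described can be sketched as follows; the function name and the strict greater-than comparison against the threshold are illustrative assumptions.

```python
import numpy as np

def aggregate_and_filter(matrices, weights, threshold):
    """Weighted sum of similarity matrices, then threshold filtering.

    matrices : list of m x n arrays, one per similarity measure
    weights  : aggregating weights, assumed to sum to 1
    Returns the aggregated matrix and the entity-pair indices
    whose aggregated similarity exceeds the threshold.
    """
    aggregated = sum(w * m for w, m in zip(weights, matrices))
    # Keep only correspondences above the similarity threshold.
    rows, cols = np.where(aggregated > threshold)
    return aggregated, list(zip(rows.tolist(), cols.tolist()))
```

With two 2×2 matrices and equal weights, only the diagonal pairs survive a threshold of 0.6, giving the final alignment {(0, 0), (1, 1)}.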

Sensor Ontology Meta-Matching Problem
In general, optimization problems can be divided into unconstrained and constrained optimization problems, the criterion being whether constraints are present. In this paper, the sensor ontology meta-matching problem is modeled as a constrained continuous optimization problem, whose constraints are on the sum of the aggregating weights and on the threshold of the similarity measures, as explained in more detail in the following.
There are three points to consider when building an optimization model: the constraint conditions, the decision variables, and the objective function.

Constraint Conditions and Decision Variables.
For convenience, the process of sensor ontology meta-matching can be described as a seven-tuple (O_1, O_2, n, M, ω, thres, A), where O_1 and O_2 represent the source ontology and target ontology, respectively; n represents the number of similarity measures; M represents the set of similarity matrices; ω is the set of aggregating weights; thres is the similarity threshold; and A is the obtained sensor ontology alignment. In particular, M and ω are, respectively, defined as follows:

M = {M_1, M_2, ..., M_n},
ω = {ω_1, ω_2, ..., ω_n}, where ω_i ∈ [0, 1] and ω_1 + ω_2 + ... + ω_n = 1. (4)

The framework of sensor ontology meta-matching is shown in Figure 3, where m_1, m_2, ..., m_n are the similarity measures; M_1, M_2, ..., M_n are the similarity matrices; ω_1, ω_2, ..., ω_n are, respectively, the aggregating weights on the similarity matrices; M is the aggregated matrix; A is the alignment determined by M; and threshold is the threshold. As can be seen from the figure, the ultimate goal of sensor ontology meta-matching is to find a suitable weight for each similarity matrix and a suitable threshold for the comprehensive similarity matrix, which together ensure the quality of the alignment.

Objective Function.
The quality of the results of sensor ontology meta-matching is usually measured by f-measure, whose value is related to both recall and precision. Traditional recall, precision, and f-measure [9] are defined in equations (5)-(7):

recall = |R ∩ A| / |R|, (5)
precision = |R ∩ A| / |A|, (6)
f-measure = (2 × recall × precision) / (recall + precision), (7)

where R is the standard (reference) alignment and A is the alignment determined by some matching technique. Recall divides the number of true positive correspondences found by the number of all correct correspondences, representing how complete the matching results are; precision divides the number of true positive correspondences found by the cardinality of the found alignment, representing how accurate the matching results are. Their values are between 0 and 1, and the quality of the results is judged by these values, but neither recall nor precision alone can evaluate an alignment effectively, because a high recall does not mean that the results are accurate, and a high precision does not mean that the results are complete. Therefore, to take both indicators into account, we use f-measure to combine recall and precision.
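Treating an alignment as a set of entity-pair correspondences, the three traditional metrics can be computed as follows (a sketch; the set-of-pairs representation of an alignment is an assumption):

```python
def recall(reference, found):
    """|R ∩ A| / |R|: how complete the found alignment is."""
    return len(reference & found) / len(reference)

def precision(reference, found):
    """|R ∩ A| / |A|: how accurate the found alignment is."""
    return len(reference & found) / len(found)

def f_measure(reference, found):
    """Harmonic mean of recall and precision."""
    r, p = recall(reference, found), precision(reference, found)
    return 0.0 if r + p == 0 else 2 * r * p / (r + p)
```

For instance, if the reference holds two correspondences and the found alignment recovers one of them plus one wrong pair, recall, precision, and f-measure are all 0.5.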
However, the traditional evaluation metrics require a reference alignment, which in most cases is impossible to obtain in advance. To overcome this drawback, in the following, we propose three new quality evaluation metrics [31] on sensor ontology alignments, i.e., ApproximateRecall, ApproximatePrecision, and ApproximateFmeasure, to approximate traditional recall, precision, and f-measure. In these metrics, M represents the composite similarity matrix and |M_ij| is the value in row i and column j of the composite similarity matrix M.
Finally, the objective function that we need to optimize is to maximize ApproximateFmeasure over the aggregating weights and the threshold:

max F(ω, thres) = ApproximateFmeasure(A),
s.t. ω_1 + ω_2 + ... + ω_n = 1, ω_i ∈ [0, 1], thres ∈ [0, 1],

where A is the alignment determined by the aggregating weights ω and the threshold thres.

Particle Swarm Optimization.
PSO is an algorithm based on swarm cooperation, which was developed by simulating birds' foraging behavior [32]. PSO initializes a set of random particles (stochastic solutions) and iteratively searches for the optimal solution; in each iteration, the particles update themselves by tracking two extremes. The formulas for updating the velocity and position in PSO are as follows:

v_i(t + 1) = v_i(t) + c_1 × rand × (pbest_i − present_i(t)) + c_2 × rand × (gbest − present_i(t)), (12)
present_i(t + 1) = present_i(t) + v_i(t + 1), (13)

where i denotes the ith particle, i ∈ [1, n]; n is the size of the population; t is the iteration number; v is the velocity of a particle; c_1 and c_2 are the learning factors; rand is a random number in [0, 1]; pbest is the individual extremum, i.e., the best solution found by the particle itself; gbest is the global extremum; and present is the current position of the particle. Compared with other swarm intelligence algorithms, PSO has the advantage of one-way information flow, i.e., all the particles are able to converge quickly, but it tends to fall into local optima. To solve this problem, the SA strategy is introduced into the evolutionary process of PSO to make it better optimized.
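One PSO update step under the formulas above might look as follows in Python; the clamping of positions to [0, 1] reflects this work's search space (cut points and a threshold) and is an added assumption rather than part of the classic update rules.

```python
import random

def pso_step(present, velocity, pbest, gbest, c1=2.0, c2=2.0):
    """One PSO velocity/position update for a single particle.

    Classic update without an inertia weight: the new velocity pulls
    the particle toward its personal best (pbest) and the global
    best (gbest), scaled by the learning factors c1 and c2.
    """
    new_v = [v + c1 * random.random() * (pb - x)
               + c2 * random.random() * (gb - x)
             for v, x, pb, gb in zip(velocity, present, pbest, gbest)]
    new_x = [x + v for x, v in zip(present, new_v)]
    # Keep every dimension inside [0, 1] (cut points and threshold).
    new_x = [min(max(x, 0.0), 1.0) for x in new_x]
    return new_x, new_v
```

Note that a particle already sitting at both its personal and the global best receives zero attraction and stays put.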

Encoding Mechanism.
A decimal encoding method is used in this work to encode a solution, which encodes a set of weights and a threshold into each particle. With respect to the encoding of n aggregating weights and one threshold, first, n real numbers are randomly generated in [0, 1], denoted as r_1, r_2, ..., r_{n−1}, r_n, which represent the encoding information of a particle. Then, the first n − 1 numbers r_1, r_2, ..., r_{n−1} are sorted in ascending order, giving r′_1, r′_2, ..., r′_{n−1}. In particular, the final number r_n is the threshold for filtering the final alignment. Finally, the n aggregating weights are obtained as follows:

ω_1 = r′_1, ω_i = r′_i − r′_{i−1} (i = 2, ..., n − 1), ω_n = 1 − r′_{n−1}.

Each particle in the population contains a set of weights and a threshold. An example of the encoding process on aggregating weights is shown in Figure 4.
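The decoding of a particle into weights and a threshold can be sketched as follows (function name illustrative): sorting the cut points and taking the gaps between 0, the sorted points, and 1 yields weights that are non-negative and sum to 1 by construction.

```python
def decode_particle(genes):
    """Decode a particle into n aggregating weights and a threshold.

    genes: n random numbers in [0, 1]; the first n-1 are cut points
    on the unit interval, the last one is the filtering threshold.
    """
    *cuts, threshold = genes
    cuts = sorted(cuts)                       # r'_1 <= ... <= r'_{n-1}
    points = [0.0] + cuts + [1.0]
    # Each weight is the length of one segment between cut points.
    weights = [points[i + 1] - points[i] for i in range(len(points) - 1)]
    return weights, threshold
```

For example, the genes (0.7, 0.2, 0.9) decode into weights (0.2, 0.5, 0.3) and a threshold of 0.9.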
This encoding mechanism on the aggregating weights meets the constraints defined in equation (4), helps reduce the solution's dimension, and ensures that different groups of numbers correspond to different aggregating weights.

Simulated Annealing.
The simulated annealing algorithm introduces random factors into the search process. It does not completely reject worse solutions, which greatly improves the probability of escaping from local optima. Generally, SA contains two parts, the Metropolis algorithm and the annealing process. The Metropolis algorithm aims at helping the solution jump out of local optima by accepting new solutions with a certain probability. Annealing is a process in which T, the parameter controlling the probability of accepting a worse solution, decreases with the iterations, so that as the iterations proceed, the probability of accepting a worse solution gradually decreases. Assuming that the system's previous solution is denoted as s(n) and the current solution is denoted as s(n + 1), where n is the current iteration number, the acceptance probability P of the system changing from s(n) to s(n + 1) is

P = 1, if F(n + 1) ≥ F(n); P = exp((F(n + 1) − F(n)) / T), otherwise, (15)

where F(n) and F(n + 1) are the fitness of the previous solution and the current solution, respectively, and T is a parameter representing the annealing temperature. Here, the initial temperature T_0 should be large, and as the iterations go on, the temperature T is gradually reduced, so that the probability of state transition gradually decreases from 1. In this way, almost any solution can be accepted at the beginning of the iterations, while the current solution stays essentially unchanged at the end. Therefore, SA not only prevents the algorithm from falling into local optima too quickly but also guarantees the algorithm's convergence.
For the sake of clarity, the pseudocode of SAPSO is presented in Algorithm 1.
First, the particles are initialized: each particle generates three random numbers r_1, r_2, and r_3 in the [0, 1] interval, representing the two cut points of the weights and a threshold, respectively, and each particle also generates three initial velocities. The cut points and threshold contained by each particle are taken as the best cut points and threshold of its individual history, denoted as pbest_1, pbest_2, and pbest_3, respectively, and the fitness value of each particle is calculated (line 8). The two cut points and one threshold of each particle are treated as the three dimensions of the particle. The best value of each dimension over all particles is found, denoted as gbest_1, gbest_2, and gbest_3, respectively. The temperature T_0 is initialized, and the annealing temperature T_k is calculated/updated (line 16) at the beginning of each iteration based on equation (16); the cut points and threshold of each particle are updated with the PSO formulas (lines 19 and 20), and the updated fitness values are obtained with the formulas in Section 4.2 (line 21). Then comes the key step of simulated annealing: if the updated particle has greater fitness than its predecessor, the solution transition probability is set to 1 (line 23) and the new particle is taken as the pbest; otherwise, it is accepted with a certain probability according to equation (15). If the probability condition is satisfied, the new particle is taken as the pbest in the next generation to update the velocity and position with PSO (lines 19 and 20). Finally, the pbest particle with the largest fitness value is treated as the global best particle of the population for the next generation of PSO updating. If the end condition is not met, the loop continues until it is met, and the globally optimal fitness value, f-measure, is output.
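Putting the pieces together, a minimal SAPSO loop might look like the sketch below; the toy fitness function, population size, learning factors, and geometric cooling schedule are illustrative assumptions rather than the paper's exact settings.

```python
import math
import random

def sapso(fitness, dim=3, pop=20, iters=60, t0=100.0, c1=2.0, c2=2.0):
    """Minimal SAPSO sketch maximizing `fitness` over [0, 1]^dim.

    pbest is replaced by a worse new position only with the
    Metropolis probability exp(dF / T); the temperature T is
    annealed geometrically each generation.
    """
    pos = [[random.random() for _ in range(dim)] for _ in range(pop)]
    vel = [[0.0] * dim for _ in range(pop)]
    pbest = [p[:] for p in pos]
    pfit = [fitness(p) for p in pos]
    gbest = pbest[max(range(pop), key=lambda i: pfit[i])][:]
    t = t0
    for _ in range(iters):
        for i in range(pop):
            # Standard PSO velocity/position update, clamped to [0, 1].
            for d in range(dim):
                vel[i][d] += (c1 * random.random() * (pbest[i][d] - pos[i][d])
                              + c2 * random.random() * (gbest[d] - pos[i][d]))
                pos[i][d] = min(max(pos[i][d] + vel[i][d], 0.0), 1.0)
            f = fitness(pos[i])
            # SA step: always keep improvements; accept a worse pbest
            # with the Metropolis probability exp((f - pfit) / t).
            if f >= pfit[i] or random.random() < math.exp((f - pfit[i]) / t):
                pbest[i], pfit[i] = pos[i][:], f
        best = max(range(pop), key=lambda i: pfit[i])
        if pfit[best] > fitness(gbest):
            gbest = pbest[best][:]
        t *= 0.95  # geometric cooling (one common annealing schedule)
    return gbest, fitness(gbest)
```

On a toy concave objective peaking at (0.5, 0.5, 0.5), this sketch reliably drives the global best close to the optimum while the SA step keeps early exploration loose.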

The Flowchart of SAPSO.
SAPSO uses the annealing strategy to help the PSO algorithm avoid local optima. The flowchart of SAPSO is shown in Figure 5. First, we initialize the entire population, including the parameters of each particle. The second step is to obtain the fitness value of each particle and then judge whether the maximum iteration has been reached; if so, the iteration process ends and the results are output; otherwise, the entire population is optimized with the PSO algorithm according to equations (12) and (13), and each particle's new fitness is obtained. Here, we use "state" to denote all the information that a particle contains, including its fitness value and encoding information, and the particle's fitness indicates the particle's state; the pbest state and gbest state are, respectively, the information of an individual's local best and the population's global best during the evolutionary process. Then, it is judged whether the new state is better than that of the previous generation. If the new state is better, it is accepted directly; otherwise, it is accepted with the probability given by equation (15), the formula of simulated annealing. If the particle accepts the new state, it treats the new state as its pbest state; otherwise, it keeps the original state as its pbest state. The gbest state is then obtained by comparing the pbest states of all particles. The annealing temperature is recalculated according to equation (16) before the next iteration, and the process loops until the end condition is met.

Experiment Results and Analysis
In this experiment, to verify the effectiveness of SAPSO, we use the OAEI's benchmark and three real sensor ontologies, i.e., SOSA [33], the new SSN ontology, and the old SSN ontology [22]. The test results of SAPSO and PSO are shown in Tables 1 and 2. A brief description of OAEI's benchmark is presented in Table 3. The first column in Table 3 is the ID of the testing cases, each corresponding to a testing ontology. We divide these test ontologies into five groups according to their specific characteristics, as described in the second column of the table. We compare SAPSO with the PSO-based ontology matching technique and OAEI's participants, i.e., edna, AML [34], LogMap [35], LogMapLt [35], XMap [36], and LogMapBio [35].
In Table 1, SAPSO's results outperform all the competitors except XMap on testing cases 221-247. The reason is that on testing cases 221-247, the source ontology and the target ontology are identical in terms of lexical and semantic features but differ in terms of structural features, where our structure-based similarity measure is not effective, which reduces the f-measure. In particular, on all testing cases, SAPSO's results are equal to or better than PSO's, which shows that the introduction of SA is able to improve PSO's searching ability and the solution's quality. In terms of the average f-measure, SAPSO performs better than the others, which shows that SAPSO plays an effective role in improving the quality of ontology matching.

Real Sensor Ontologies.
SOSA (http://www.w3.org/ns/sosa/), the basic classes and properties of the SSN (http://www.w3.org/ns/ssn/) ontology, represents the lightweight core of the new SSN ontology. These sensor ontologies describe the function and performance of sensors. They support many applications and use cases, such as signal detection in large-scale scientific exploration, home infrastructure monitoring, livelihood services, observation-driven ontology engineering, the World Wide Web, sensor data service systems, and more [37]. The new SSN differs from the original SSN in that it simplifies the relationships between the device, platform, and system classes of the old SSN. We tested SAPSO on the three real sensor ontologies with our sensor ontology meta-matching system and obtained the f-measure, recall, and precision values. Table 2 shows the matching results of SAPSO.
In Table 2, the first column refers to the two matched sensor ontologies, and the second, third, and fourth columns are, respectively, the f-measure, recall, and precision of the alignments. It can be seen from the table that on the task of matching the new SSN and SOSA, SAPSO is able to determine the perfect alignment. With respect to the other two matching tasks, SAPSO's f-measure is also close to 1.0. Since there exist some complex correspondences in the reference alignments, i.e., one source concept corresponds to several target concepts, which SAPSO fails to find, its f-measure is reduced. In general, SAPSO is able to effectively match various sensor ontologies.

Conclusion and Future Work
LBS architectures are widely used in the fields of vehicle speed estimation [38], vehicle travel time prediction [39], and bus arrival time prediction [40]. The technologies and applications of LBS cannot be separated from sensors. To implement intelligent LBSs, different sensor ontologies need to be integrated on the SSW. To this end, in this work, new quality evaluation metrics are proposed to approximate the three traditional evaluation metrics, a mathematical model of the sensor ontology meta-matching problem is constructed, and finally, a SAPSO is presented to address the problem, which uses SA to help the algorithm avoid local optima. To verify the effectiveness of SAPSO, we used the OAEI's benchmark and three real sensor ontologies, and the experiments prove that SAPSO is an effective method. In future work, the quality of the sensor ontology matching results will be further enhanced by taking complex correspondences into consideration. At present, SAPSO still has some defects in determining entity mappings with heterogeneous characteristics, which makes its f-measure relatively low in the testing cases with heterogeneous structures; at the same time, SAPSO has some limitations, for example, its performance depends on the initial values and it is sensitive to its parameters. Last but not least, it is necessary to improve the approximate evaluation metrics on ontology alignment to better guide the algorithm toward the global optima.

Data Availability
The data used to support this study can be found at http://oaei.ontologymatching.org.

Conflicts of Interest
The authors declare that they have no conflicts of interest.