Leverage Label and Word Embedding for Semantic Sparse Web Service Discovery

Information retrieval-based Web service discovery suffers from the semantic sparsity problem, which is caused by a lack of statistical information when Web services are described in short texts. To handle this problem, external information is often utilized to improve discovery performance. Inspired by this, we propose a novel Web service discovery approach based on a neural topic model that leverages Web service labels. More specifically, words in Web service descriptions are mapped into continuous embeddings, and labels are simultaneously integrated by a neural topic model to embody external semantics of the Web service description. Based on the topic model, the services are interpreted into hierarchical models for building a service querying and ranking model. Extensive experiments on several datasets demonstrate that the proposed approach achieves improved performance in terms of F-measure. The results also suggest that leveraging external information is useful for semantically sparse Web service discovery.


Introduction
In the era of Big Data, a growing number of business enterprises worldwide are driven to deploy their business applications as Web services on both intranets and the Internet [1,2]. A number of registry centers, such as Programmableweb (http://www.programmableweb.com) and Mashape (https://www.mashape.com/), as well as business enterprises, have built their own service discovery mechanisms to provide a convenient way to access these Web services. Search engine-based approaches are widely adopted among these registries. However, discovery methods based on search engine technology, which mainly focus on keyword-based matching, may suffer from poor recall due to missing keywords in Web service descriptions, the use of synonyms, or variations of keywords [3].
Two kinds of methods are often adopted to alleviate the poor recall problem in discovering nonsemantic Web services [4,5]. The first one is to perform a broad search and retrieve a potentially large number of Web services, many of which may not really be of interest to users. The second one is to cluster services into functionally similar groups using the descriptions of Web services to enhance the capability of the search engine. Since this kind of method can effectively reduce the search space, it has attracted more attention from researchers [3][4][5][6][7].
Some new issues have emerged in recent years when using the aforementioned Web service discovery approaches. One is the semantic sparsity problem: short text descriptions of Web services do not provide sufficient information to express the full semantics of the Web service. Current Web service marketplaces often briefly describe the main functions, the providers, and the types of a Web service using short sentences. These sentences do not contain enough statistical information, which hinders effective similarity computation and poses challenges to traditional service retrieval approaches [8,9]. To address this issue, the transfer of external knowledge to enrich the semantic representation of short text documents has been proposed. For example, Tian et al. [5] transfer external knowledge from auxiliary information by using Gaussian LDA and a word embedding model to enhance the semantics of Web services.
Inspired by these findings, we propose to introduce word embeddings, which have been shown to capture lexico-semantic regularities in language. In the embedding space, words with similar syntactic and semantic properties are found to be close to each other [10]. Thus, this feature is particularly suitable for handling synonyms and variations of keywords in the query. Furthermore, context information such as the co-occurrence information captured in word embeddings can be effectively used to enrich the semantics of a document. We therefore use word embeddings to handle the semantic sparsity problem in Web service discovery.
To enhance clustering performance, extensive research has been carried out on category information [11]. Inspired by this, some topic models directly integrate this information into the generative process to improve topic quality and clustering accuracy. Some excellent work has been done to leverage external metainformation to enhance the topic model [12].
Based on the above, we propose a label-aided neural topic model (LNTM) derived from Gaussian LDA [13], which leverages word embeddings and external label information to improve Web service discovery.
Our main contributions are as follows: (1) we present an approach that leverages pretrained word embeddings to enrich the semantics of Web service descriptions; (2) we propose a label-augmented neural topic model to retrieve Web services based on word embeddings and the categories of the Web services; (3) we experimentally show that the proposed approach outperforms several other approaches on the evaluation metrics.

Related Work
Web service discovery provides a mechanism to discover relevant services from different service registries. Based on the description method of services, Web service discovery can generally be divided into two categories: semantic-based and nonsemantic-based. For instance, the Ontology Web Language for Services (OWL-S) is a typical semantic description language. In contrast, WSDL, Web Application Description Language (WADL), and natural language are typical nonsemantic description languages. Semantic-based approaches mainly focus on high-level match-making [14,15], whereas nonsemantic-based discovery methods utilize information retrieval techniques [3][4][5][6][7]. In the proposed approach, we concentrate on nonsemantic Web services. Nonsemantic-based discovery approaches differ considerably because they target different description languages. For example, Elgazzar et al. [3] preprocessed the WSDL document to extract content, types, messages, ports, and service name as the main features for the discovery method and utilized an information retrieval approach to enhance Web service discovery.
The WSDL documents need to be preprocessed to construct the features that represent the Web service. If the Web services use different description languages, the WSDL-based methods must be adjusted for the discovery process. In this paper, we focus on the discovery of Web services that have shorter descriptions and may contain fewer features than services with sufficiently informative files. Therefore, the above methods may fail to work since they lack ways to handle the semantic sparsity problem.
Several studies have found that it is helpful to leverage external information to handle the semantic sparsity problem of information retrieval approaches [4,8,16]. Chen et al. proposed an augmented LDA model that utilizes both WSDL and tags for Web service discovery so as to provide effective Web service clustering [16]. There are also other methods for handling the semantic sparsity problem. For example, Hu et al. proposed to enhance short text clustering by leveraging world knowledge [17]. Jin et al. utilized a transfer learning model that clusters short texts with the help of auxiliary long texts [8].
These approaches can partially handle the semantic sparsity problem; however, they also have some limitations. For instance, the work of Hu et al. [17] makes the implicit assumption that the auxiliary data are semantically related to the short texts, which may not be true in the real world. Similarly, the work in [8] assumes that the topical structures of the two domains are completely identical, which may not be wholly correct.
Some studies have used complex networks to handle Web service clustering problems and to exploit the capability of network-based software. Many approaches take a complex network perspective by representing software systems (or service-oriented software systems) as software networks (or software service networks). Ma et al. [18] and Pan et al. [19] analyzed the topological structure of software networks, revealing many shared properties such as small-world and scale-free. Şora and Chirila [20] and Pan et al. [21,22] proposed approaches to identify key classes in Java systems. Ma et al. [18] and Pan et al. [23][24][25] proposed software metrics by using parameters in complex networks. Zhou and Wang [26] and Pan et al. [27,28] proposed approaches to cluster services by using community detection in complex networks.
These works are helpful in exploiting the capability of complex networks; however, they still suffer from the problem of semantic sparsity.
Faced with the above problems, we propose another solution that introduces external information through word embeddings, which have been shown to be beneficial for handling semantic sparsity [29].
Latent Dirichlet Allocation (LDA) and its extensions have proved to be efficient methods for boosting the discovery performance of Web services [16,30]. However, because they assume that words follow discrete multinomial distributions, these probabilistic models cannot benefit from word embeddings, which are continuous vectors. To address this, we propose to use a neural topic model to leverage the advantages of both word embeddings and probabilistic models. Category labels can also play an important role in clustering procedures. We therefore leverage both label information and embeddings to enhance discovery performance. As a result, a label-aided neural topic model derived from Gaussian LDA, which integrates both word embeddings and external label information, is proposed.

The Discovery Process of the Proposed Approach
As shown in Figure 1, the service discovery process of the proposed model consists of four major parts: service preprocessing, service modelling, query modelling, and service ranking. First, the Web services are crawled and preprocessed. The description and the label of each Web service are extracted from the collected materials. Then, the service descriptions are taken as the input of the word2vec model to create the word embeddings. After obtaining the word embeddings, we map the words in each service description into embeddings to produce one input and take the Web service label as the other input of the proposed LNTM. The LNTM converts each Web service into a representation over latent factors. To model users' queries, the words in a query are looked up in the embedding table and mapped into embeddings. In the service ranking phase, based on LNTM and the users' queries, a probabilistic service ranking model is used to retrieve relevant services for the users.
In the proposed approach, training word embedding and modelling services are conducted offline, and the efficiency of the proposed discovery model can be guaranteed. Hence, the focus of the approach will be placed on the accuracy of discovery.

LNTM.
To capture semantic regularities in language and handle the semantic sparsity of Web services, an augmented topic model with word embeddings for Web service discovery was proposed in [31]. Meanwhile, labels of documents can be used to guide topic learning so as to find more meaningful topics [12]. Therefore, in this paper, a label-augmented neural topic model is proposed to leverage label information and capture semantic regularities for enhancing the discovery performance of Web services.
In the proposed model LNTM, the word embedding of the term at position i in document d is written as $v_{d,i} \in \mathbb{R}^{W}$, where W is the length of the word embedding. As a result, the words in a document are mapped into continuous vectors in the W-dimensional space. Accordingly, each topic k is characterized as a multivariate Gaussian distribution with mean $\mu_k$ and covariance $\Sigma_k$. The Gaussian parameterization is motivated by both analytic convenience and the semantic similarity of embeddings. To govern the mean and covariance of each Gaussian, a Gaussian distribution centered at zero for the mean and an inverse Wishart distribution for the covariance are placed as the conjugate priors.
Similar to Gaussian LDA, a Web service modeled by LNTM is represented as a mixture over latent topics with proportions drawn from a Dirichlet prior.
To integrate labels, a binary indicator vector $l_d$ marks the presence of labels: it contains 1 in the position of each label listed on document d and 0 otherwise. The graphical representation of LNTM is shown in Figure 2. Based on the above notions, the generative process of LNTM for a document can be summarized as follows: (1) For topic k = 1, ..., K,

Web Service Modelling Using LNTM.
The LNTM is a generative model in which each embedding v in a service description is associated with a latent topic variable z, and each topic z is associated with the service description d. With these two distributions, a Web service can be expressed in two layers: the service topics and the topic embeddings.
After training the LNTM, the service-topic distribution is given by the parameter θ (a |services| × |topics| matrix), and the topic-embedding distribution is given by the multivariate Gaussians.
To infer the topic assignments of individual embeddings and the posterior distribution of services over the topics, a collapsed Gibbs sampling method is adopted. The topic assignment of each embedding is derived using the update rule in equation (1), where $z_{-(d,i)}$ represents the topic assignments of all word embeddings excluding the one at the i-th position of service d; $\lambda_{-(d,i)}$ represents the label assignments; $V_d$ is the sequence of vectors for service description d; M is the length of the word embedding; the tuple $\zeta = (\mu, \kappa, \Sigma, \nu)$ contains the parameters of the prior distribution; and $t_{\nu'}(x \mid \mu', \Sigma')$ is the multivariate t-distribution with $\nu'$ degrees of freedom and parameters $\mu'$ and $\Sigma'$.
Note that the first part of equation (1), which expresses the probability of topic k in service description d, is derived in the same way as in Gaussian LDA.
The second part of equation (1) expresses the probability of assigning topic k to the vector $v_{d,i}$ given the current topic assignments, which is represented by a multivariate t-distribution with parameters $(\mu_k, \kappa_k, \Sigma_k, \nu_k)$. These parameters of the posterior distribution are calculated by equation (2), as in Gaussian LDA. The sample statistics $\bar{v}_k$ and $C_k$ used there are computed over the embeddings assigned to topic k: $N_k$ is the total count of words assigned to topic k across all descriptions, and $\bar{v}_k$ and $C_k$ are the sample mean and the scaled sample covariance of the vectors assigned to topic k, respectively. Intuitively, the parameters $\mu_k$ and $\Sigma_k$ are the posterior mean and covariance. The parameters $\kappa_k$ and $\nu_k$ denote the strength of the priors for the mean and covariance, respectively. After obtaining these parameters, we can directly obtain the topic-embedding distribution discussed above.
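For reference, the posterior parameters that the text attributes to Gaussian LDA are usually written as below. This is a sketch of the standard normal-inverse-Wishart update as it appears in the Gaussian LDA literature, not a recovery of this paper's own equations. As assumptions of notation, $\Psi$ denotes the prior scale matrix (the third element of the prior tuple, written Σ in the text), and the bar distinguishes the sample mean $\bar{v}_k$ from the degrees of freedom $\nu_k$.

```latex
\begin{aligned}
\kappa_k &= \kappa + N_k, & \nu_k &= \nu + N_k, \\
\mu_k &= \frac{\kappa \mu + N_k \bar{v}_k}{\kappa_k}, & \Sigma_k &= \frac{\Psi_k}{\nu_k - M + 1}, \\
\Psi_k &= \Psi + C_k + \frac{\kappa N_k}{\kappa_k} (\bar{v}_k - \mu)(\bar{v}_k - \mu)^{\top}, \\
\bar{v}_k &= \frac{1}{N_k} \sum_{d} \sum_{i : z_{d,i} = k} v_{d,i}, & C_k &= \sum_{d} \sum_{i : z_{d,i} = k} (v_{d,i} - \bar{v}_k)(v_{d,i} - \bar{v}_k)^{\top}.
\end{aligned}
```

Under these updates, the predictive term of Gaussian LDA is $t_{\nu_k - M + 1}\left(v_{d,i} \mid \mu_k, \frac{\kappa_k + 1}{\kappa_k} \Sigma_k\right)$. How the label assignments enter the LNTM update rule cannot be read off from the text here, so that part is deliberately left out.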

Query Modelling and Ranking.
To retrieve relevant services with the proposed model, we first translate the user query into embeddings. The words in a query are extracted and mapped into embeddings by looking them up in the embedding table.
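The lookup step can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `query_to_embeddings`, the regular-expression tokenizer, and the toy two-dimensional vectors are all assumptions, and a real pipeline would apply the same stop-word removal and lemmatization used for the service descriptions.

```python
import re


def query_to_embeddings(query, embeddings):
    """Map the words of a free-text query to embedding vectors.

    `embeddings` is a plain dict from word to vector (hypothetical stand-in
    for a trained word2vec table). Words without a vector are reported
    separately instead of being silently dropped.
    """
    tokens = re.findall(r"[a-z0-9]+", query.lower())
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    missing = [t for t in tokens if t not in embeddings]
    return vectors, missing


# Toy embedding table; the values are made up for illustration.
toy_embeddings = {
    "food": [0.9, 0.1],
    "price": [0.2, 0.8],
}

vectors, missing = query_to_embeddings("Food price lookup", toy_embeddings)
```

Returning the out-of-vocabulary words makes it easy to see how much of a query is lost when the embedding corpus does not cover the service vocabulary.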
To rank the retrieved Web services, we use the generated probabilities to calculate the similarity between the user query and each Web service, as in [31]. The similarity is represented by $P(Q \mid s_i)$, where Q is the query and $s_i$ is the i-th Web service. Under the assumptions of the LNTM described above, it is calculated from $P(e \mid z)$ and $P(z \mid s)$, which are the posterior probabilities computed according to equation (2) and the matrix θ, respectively. Finally, we obtain a list of retrieved services for a query, ordered by the value of $P(Q \mid s_i)$.
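One common way to combine these two probabilities is the query likelihood $P(Q \mid s_i) = \prod_{e \in Q} \sum_{z} P(e \mid z) P(z \mid s_i)$, and the sketch below ranks services under that assumption. It is illustrative only: the factorization is assumed rather than taken from the paper, an isotropic Gaussian density stands in for the multivariate t-distribution of the model, and all names and toy numbers are hypothetical.

```python
import math


def log_gaussian_pdf(x, mean, var):
    """Log density of an isotropic Gaussian N(mean, var * I) at x."""
    dim = len(x)
    sq_dist = sum((xi - mi) ** 2 for xi, mi in zip(x, mean))
    return -0.5 * (dim * math.log(2 * math.pi * var) + sq_dist / var)


def log_query_likelihood(query_vectors, topic_probs, topic_means, topic_vars):
    """Compute log P(Q | s) = sum over e of log sum over z of P(e | z) P(z | s)."""
    total = 0.0
    for e in query_vectors:
        # Log of each mixture term; topics with zero weight contribute nothing.
        terms = [
            math.log(p_z) + log_gaussian_pdf(e, mean, var)
            for p_z, mean, var in zip(topic_probs, topic_means, topic_vars)
            if p_z > 0
        ]
        # Log-sum-exp keeps the mixture sum numerically stable.
        peak = max(terms)
        total += peak + math.log(sum(math.exp(t - peak) for t in terms))
    return total


def rank_services(query_vectors, theta, topic_means, topic_vars):
    """Return service names sorted by decreasing log P(Q | s)."""
    scores = {
        name: log_query_likelihood(query_vectors, probs, topic_means, topic_vars)
        for name, probs in theta.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Toy model with two topics in a two-dimensional embedding space.
topic_means = [[0.0, 0.0], [5.0, 5.0]]
topic_vars = [1.0, 1.0]
theta = {"food_service": [0.9, 0.1], "car_service": [0.1, 0.9]}
query = [[0.2, -0.1], [0.1, 0.3]]  # both query vectors lie near topic 0

ranking = rank_services(query, theta, topic_means, topic_vars)
```

Working in log space avoids underflow when a query contains many words, since each factor of the product is typically much smaller than 1.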

Experiment Setting
To evaluate the proposed approach, we conducted several experiments on the standard Web service test dataset SAWSDL-TC3 (TC3) (http://www.semwebcentral.org/projects/sawsdl-tc), as Tian et al. [31] did. To use TC3, we first parsed the WSDL files into plain text and then removed stop words and lemmatized the remaining words.
The Web services of TC3 do not have explicit category labels. However, there are some implicit categories in the WSDL files. As shown in Figure 3, the node "<xsd:annotation>" of the service "FoodMaxpricequantity.wsdl" has the values "#Food," "#MaxPrice," and "Quantity." We therefore extracted these values to generate the category labels for each Web service in our experiments. Since the Web services in real-world service registries all belong to certain categories, it is easy to collect the category label information needed by the proposed approach.

Figure 1: The discovery process.
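Once the annotation values have been extracted, they can be turned into the binary label indicator vectors that the model expects. The following sketch assumes the values are already available as strings per service; the function name `build_label_vectors` and the second service, `CarPrice.wsdl`, are hypothetical, and stripping the leading `#` is an assumption about how the raw values would be normalized.

```python
def build_label_vectors(service_labels):
    """Build a shared label vocabulary and one 0/1 indicator vector per service."""
    # Normalize raw annotation values such as "#Food" to "Food".
    cleaned = {
        service: [label.lstrip("#") for label in labels]
        for service, labels in service_labels.items()
    }
    vocab = sorted({label for labels in cleaned.values() for label in labels})
    index = {label: i for i, label in enumerate(vocab)}

    vectors = {}
    for service, labels in cleaned.items():
        vec = [0] * len(vocab)
        for label in labels:
            vec[index[label]] = 1  # 1 marks a label present on this service
        vectors[service] = vec
    return vocab, vectors


service_labels = {
    "FoodMaxpricequantity.wsdl": ["#Food", "#MaxPrice", "Quantity"],
    "CarPrice.wsdl": ["#Car", "#Price"],  # hypothetical second service
}

vocab, label_vectors = build_label_vectors(service_labels)
```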
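In our experiments, we used precision p, recall r, and F-measure f, defined in equation (5), as the evaluation criteria for the proposed approach. The larger the F-measure, the better the discovery performance.

$$p = \frac{|\mathit{relevant} \cap \mathit{predicted}|}{|\mathit{predicted}|}, \qquad r = \frac{|\mathit{relevant} \cap \mathit{predicted}|}{|\mathit{relevant}|}, \qquad f = \frac{2pr}{p + r} \tag{5}$$

where relevant is the set of relevant class labels and predicted is the set of results predicted by the classification methods.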
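The usual set-based definitions of precision, recall, and F-measure can be sketched as follows. The function name and the toy service identifiers are illustrative assumptions; returning 0.0 for empty sets is a convention chosen here, not something stated in the paper.

```python
def precision_recall_f(relevant, predicted):
    """Return (precision, recall, F-measure) for two collections of items."""
    relevant, predicted = set(relevant), set(predicted)
    hits = len(relevant & predicted)
    p = hits / len(predicted) if predicted else 0.0
    r = hits / len(relevant) if relevant else 0.0
    # F-measure is the harmonic mean of precision and recall.
    f = 2 * p * r / (p + r) if p + r > 0 else 0.0
    return p, r, f


# Four relevant services, three returned, two of them correct.
p, r, f = precision_recall_f({"s1", "s2", "s3", "s4"}, {"s1", "s2", "s5"})
```

In this toy case p = 2/3, r = 1/2, and f = 4/7, which shows how the harmonic mean stays closer to the lower of the two values.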

Performance of the Proposed Approach.
To examine the performance of our approach, we compare the proposed method with three other Web service discovery approaches. The compared approaches are as follows:
(1) LDA: the latent factors learnt from the Web service descriptions are adopted to represent the Web services, and then a discovery approach is used to rank the services [30].
(2) Meta-LDA: in Meta-LDA [12], metainformation such as the category of a Web service is directly incorporated into the generative process. The external metainformation can improve the topic quality and modelling accuracy. We use Meta-LDA to group Web services into different clusters and then employ a probabilistic model to rank the services.
(3) Gaussian LDA: a Gaussian LDA-based Web service discovery approach, which makes use of embeddings for semantically sparse Web service discovery, is implemented based on the work of Tian et al. [31].
(4) LNTM: we first train the word embeddings with word2vec on the prepared corpus. Then, we train the LNTM by incorporating embeddings and service category labels into the generative process and organize the Web services into different clusters. Finally, we represent the query by embeddings and utilize the probabilistic discovery model to rank the Web services, as described in Section 3.
For LDA and the Meta-LDA model, following the modelling process mentioned above, the topics are generated from the descriptions of the Web services. Then, we tuned each algorithm to its best parameter settings by cross validation. Table 1 shows the experimental results on TC3. From these experiments, we make several observations. Firstly, LNTM outperformed all the competitors in terms of F-measure in nearly all settings, which shows the benefit of using both word embeddings and service category labels and demonstrates the effectiveness of the proposed model.
Secondly, by looking at the approaches using the label information, we can see the significant improvement of these models over LDA, which indicates that document labels can play an important role in guiding topic modelling.
Thirdly, LNTM and Gaussian LDA perform better than the LDA-based method. The results show that the embedding-based approaches, which take continuous embeddings as the input, may capture more semantically coherent topics than the traditional LDA-based method.
Finally, it is interesting to note that Meta-LDA outperforms LDA and LNTM outperforms Gaussian LDA in this study. These findings are in agreement with the idea that utilizing the category label data of Web services improves the performance of Web service discovery. These results encourage further research on integrating other external information for effective Web service discovery.

Validation of Labels.
To validate that incorporating category label information can significantly improve the accuracy of the generated topics, we varied the proportion of services used in training from 20% to 80% and used the remainder for testing. Here, we utilize normalised pointwise mutual information (NPMI), as shown in equation (6), to evaluate the topic quality of LDA, Meta-LDA, and LNTM. The NPMI score of each topic in the experiments is calculated with the top 15 words (T = 15). As shown in Figure 4, both LNTM and Meta-LDA achieve higher NPMI scores than LDA. The result indicates that the label information can help the LDA-based model find more meaningful topics. The details of the two corpora are shown in Table 2.
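As an illustration of the metric, the sketch below scores a topic by NPMI using the standard pairwise definition, $\mathrm{NPMI}(w_i, w_j) = \log\frac{p(w_i, w_j)}{p(w_i)\,p(w_j)} \big/ \left(-\log p(w_i, w_j)\right)$, with probabilities estimated from document-level co-occurrence. Several choices here are assumptions rather than the paper's definition: averaging over all pairs of top words, scoring pairs that never co-occur as -1, and scoring the undefined case $p(w_i, w_j) = 1$ as 0.

```python
import math
from itertools import combinations


def npmi_topic_score(top_words, documents):
    """Average NPMI over all pairs of a topic's top words.

    `documents` is a list of token lists; a word "occurs" in a document
    if it appears there at least once.
    """
    doc_sets = [set(doc) for doc in documents]
    n_docs = len(doc_sets)

    def prob(*words):
        # Fraction of documents that contain every given word.
        return sum(all(w in d for w in words) for d in doc_sets) / n_docs

    scores = []
    for w_i, w_j in combinations(top_words, 2):
        p_i, p_j, p_ij = prob(w_i), prob(w_j), prob(w_i, w_j)
        if p_ij == 0:
            scores.append(-1.0)  # limit value for pairs that never co-occur
        elif p_ij == 1:
            scores.append(0.0)   # 0/0 case, scored as 0 by assumption
        else:
            pmi = math.log(p_ij / (p_i * p_j))
            scores.append(pmi / -math.log(p_ij))
    return sum(scores) / len(scores)


docs = [
    ["food", "price"],
    ["food", "price"],
    ["car", "price"],
    ["car", "engine"],
]

coherent = npmi_topic_score(["food", "price"], docs)
incoherent = npmi_topic_score(["food", "car"], docs)
```

With T = 15 top words, the same function would average over 105 word pairs per topic.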

Validation of Embedding.
As discussed above, embodying more semantic knowledge by moving from the Bag of Words model to a continuous embedding space with LNTM can enhance the performance of the Web service discovery model. Several experiments were conducted to validate this result. Figure 5 shows the F-measure performance of the proposed approach with word embeddings trained by the word2vec model on two different corpora, TC3 and Wikipedia.
As shown in Figure 5, the proposed approach achieves better F-measure performance with TC3 than with Wikipedia. A possible explanation is that some words extracted from the WSDL files do not appear often enough in the Wikipedia corpus and are therefore removed when training the embeddings, even though they are very informative [31].

Influence of Hyperparameters.
In LNTM, the parameter α controls the weight of the language model contribution, μ and Σ control the document contribution, and s controls the contribution of the label information. In our work, the hyperparameters are set empirically to α = 1/K, s = 0, μ = 0 (zero mean), and Σ = 3I, with 1,000 sampling iterations, as in [31]. Here, K is the number of topics, and I is the identity matrix. To check the influence of the topic number k, we calculated P(e | k) for different k. As shown in Figure 6, the result suggests that the data are best accounted for by an LNTM model with 8 topics.

Conclusion
In this paper, we proposed a Web service discovery approach that combines word embeddings and category labels to deal with the poor recall problem in searching semantically sparse Web services. We used word embeddings to map words into a continuous space so as to enrich the Web service semantics. We also introduced a label-augmented neural topic model, LNTM, which organizes the Web services into hierarchies for a probabilistic ranking approach. Several experiments were conducted on the widely used dataset TC3 to validate the performance of our approach. Experimental results suggest that the proposed approach is feasible and, in particular, that the word embeddings and label information both lead to enhanced performance in the Web service discovery process.
Since not all Web services have category labels, it remains necessary to clarify how to conduct effective Web service discovery without labels. In the future, there is abundant room to further investigate the usefulness of various metainformation of Web services and to propose different model variants based on Gaussian LDA to provide effective service discovery.

Conflicts of Interest
The authors declare that they have no conflicts of interest.

Mathematical Problems in Engineering