Abstract

In recent years, the optimization of multi-objective service composition in distributed systems has become an important issue. Existing work uses a smaller set of Pareto-optimal solutions to represent the Pareto Front (PF). However, these approaches do not support complex mapping of the Pareto-optimal solutions to the quality of service (QoS) objective space, which limits their ability to provide a representative set of solutions. We propose an enhanced multi-objective differential evolution algorithm to seek a representative set of solutions with good proximity and distributivity. Specifically, we propose a dual strategy that adjusts the usage of different creation operators to maintain the evolutionary pressure toward the true PF. Then, we propose a reference vector neighbor search to perform a fine-grained search. The proposed approach has been tested on a real-world dataset, and the results show that it locates a representative set of solutions with good proximity and distributivity.

1. Introduction

Service composition became popular after the introduction of service-oriented architecture (SOA), as it allows complex and distributed software systems to be composed of web services through open standards. QoS attributes [1] (e.g., reliability or throughput) provide the quality criteria for selecting and composing web services, thus establishing QoS-aware service composition (QOSC). Since QoS requirements usually involve multiple conflicting objectives, QOSC is a multi-objective optimization problem (MOP) whose goal is to find a set of Pareto-optimal solutions.

Existing work [2–4] has explored multi-objective evolutionary algorithms (MOEAs), which allow a set of feasible solutions to approximate the Pareto-optimal set by analyzing a set of non-dominated solutions after one run and maintaining good solution diversity during the search [5, 6]. However, this approach has two limitations. First, it includes elitist preservation in the selection strategy, so there may be a lack of evolutionary pressure to explore optimal solutions, especially as the number of service requests increases. Second, it does not explicitly consider fine-grained search, which can lead to overlapped or unstructured searches and thus an uneven distribution of Pareto-optimal solutions.

In this work, we tackle these issues by proposing an enhanced multi-objective differential evolution (EnMODE) algorithm that searches for a representative set of feasible solutions to approximate the Pareto-optimal solutions in terms of proximity and distributivity. Our main contributions are summarized as follows:
(1) Sufficient evolutionary pressure: We propose a dual strategy to adjust the usage of “rand/2/bin” and “current-to-best/1/bin” as the iteration evolves. “rand/2/bin” expands the evolution ability by performing new exploration around two different solutions, while “current-to-best/1/bin” improves the evolution robustness by performing guided exploration around the current best. The dynamic execution of “rand/2/bin” and “current-to-best/1/bin” provides sufficient evolutionary pressure as the population evolves.
(2) Fine-grained search: The reference vectors are used as neighborhood axes to divide the MOP, accumulating the non-dominated compositions around the nearest reference vector and the dominated compositions around the essential non-dominated compositions. The reference vector neighbor search simplifies the MOP by systematically breaking it down into similar sub-problems and directing the search to the Pareto-optimal set.

We have evaluated the EnMODE algorithm on a real-world dataset, and the results show that the proposed method has better proximity and distributivity than the baselines. The rest of the paper is organized as follows. Section 2 reviews related work. Section 3 introduces the multi-objective service composition model. We outline the proposed approach for the MOP in Section 4. Section 5 evaluates the proposed method in terms of proximity degree and uniformity. Finally, Section 6 concludes the paper and outlines future work.

2. Related Work

This section outlines the related work on multi-objective service composition (MOSC). Section 2.1 focuses on Pareto-based multi-objective service composition (PMOSC), which finds a set of Pareto-optimal solutions based on the hypothesis that users cannot accurately predefine weights or priorities for multidimensional objectives. Section 2.2 presents the utility-based multi-objective service composition (UMOSC), which computes the utility value of a solution based on the hypothesis that the weights or priorities can be accurately specified. The summary of the related work is presented in Section 2.3.

2.1. Pareto-Based Multi-Objective Service Composition

The intuitive method is to explore all Pareto-optimal solutions exhaustively. Since the Pareto-optimal set may include all possible solutions, whose number grows exponentially with the size of the service requests, the optimization cost of such a method would be prohibitive.

To solve this problem, Guo et al. [7] proposed a computationally efficient dropout neural network as a scalable alternative to the Gaussian process model for assisting the solution of expensive high-dimensional multi-objective and many-objective optimization problems. Li et al. [8] modeled the energy-efficient job-shop scheduling problem as a many-objective problem with five objectives, i.e., makespan, total tardiness, total idle time, total worker cost, and total energy. They adopted a novel fitness evaluation mechanism based on fuzzy correlation entropy to solve this many-objective optimization problem. Cruz et al. [9] proposed an evolutionary algorithm-based search strategy for choosing an efficient design of an ensemble of Convolutional Neural Networks (CNNs), which covers not only the network architectures but also the voting policy. During the search, both the combination of CNNs with different architectures and the most suitable policy used by the ensemble for generating the unified response are taken into consideration.

Zhou et al. [3] proposed a multi-population differential artificial bee colony optimizer for PMOSC. The optimization problem is divided into several sub-problems to reduce the search scale. Different search behaviors are considered in the artificial bee colony algorithm to select the solution set toward the Pareto-optimal set. The work of [10] integrated hyper-heuristics with genetic programming to solve the multi-objective dynamic service composition optimization. A set of Pareto-optimal solutions is provided to satisfy varied preferences. Wang et al. [11] proposed an improved whale optimization algorithm that divides the population into several sub-populations. A Pareto strategy is presented to improve the optimization. Yang et al. [12] adopted a multi-objective immune algorithm to implement PMOSC. The global ranking is incorporated into the evolution of multiple populations to obtain better generations.

Chen et al. [13] proposed an objective space partition-based adaptive multi-objective evolutionary algorithm to maintain diversity while strengthening convergence. The proposed approach defines the forward population distance as a metric to dynamically identify efficient subspaces and adaptively allocate computational resources to each subspace. In [14], an enhanced decomposition-based evolutionary many-objective optimization algorithm is proposed to solve irregular many-objective optimization problems. The local search is performed on external archives to alleviate the adverse effects of inappropriate weight vectors and strengthen the performance. Dai et al. [15] proposed a problem-specific multi-objective evolutionary algorithm in which a decomposition scheme decomposes PMOSC into multiple scalar sub-problems. The evolutionary operators search Pareto solutions in terms of maximizing the service quality and minimizing the overhead. Seada and Deb [16] developed a unified evolutionary optimization algorithm, U-NSGA-III, to solve mono-, multi-, and many-objective optimization problems. The ability of U-NSGA-III to solve different types of problems equally efficiently and sometimes better, with the added flexibility brought in through population size control, remains a hallmark achievement. Dhiman et al. [17] proposed a novel hybrid many-objective evolutionary algorithm named Reference Vector Guided Evolutionary Algorithm (H-RVEA). It decomposes the optimization problem into several sub-problems by reference vectors, and uses an adaptation strategy to adjust the reference vector distribution.

Lin et al. [18] proposed an adaptive immune-inspired multi-objective algorithm. This method embeds three differential evolution (DE) strategies with distinct features into multi-objective immune algorithms. At each generation, one of them is adaptively selected based on the current search stage. This adaptive selection lets the three DE strategies cooperate effectively, significantly improving search capability and population diversity. Kumar et al. [19] proposed a new hybrid optimization method based on differential evolution and the sine cosine algorithm. This method adapted multi-objective versions of evolutionary optimization-based methods to mine reduced high-quality numerical association rules automatically. Altay and Alatas [20] proposed an enhanced version of the multi-operator variant differential evolution model, named ESHADE. ESHADE utilized various mutation strategies and an exponential population size reduction (EPSR) technique to reduce the population size for the next iteration. Besides, ESHADE employed a version of the univariate sampling method in later iterations to balance exploitative and explorative searches. We can conclude that the multi-operator variants [18–20] enhance the diversity of the candidate solutions. The difference is that the execution probability in these multi-operator variants is not a quantitative benchmark that accurately reflects the current search stage, whereas we calculate the probability as the search proceeds. As a result, the multi-operator variants have difficulty tackling MOPs with different characteristics, which our proposed method addresses.

2.2. Utility-Based Multi-Objective Service Composition

Different approaches are developed to find the composition with the best utility, such as graph search [21, 22], evolutionary algorithms [23, 24], and so on.

Rodríguez-Mier et al. [25] used a Service Match Graph to represent all matches between the relevant services. On this basis, they proposed a hybrid local/global search to find the optimal solution. Siebert et al. [26] transformed the service composition problem into the subgraph isomorphism problem. A message-efficient localized algorithm is proposed to compose the component services according to the information from the collaboration candidate. There are also existing works on evolutionary algorithms for finding the optimal solution. For example, Hossain et al. [27] extended the particle swarm optimization algorithms to improve global and local optimization. The particles search the service space with guidance from extreme individual value and population extreme value. Martín et al. [28] proposed an ant colony optimization algorithm, in which a set of ants find the shortest path according to the pheromone mechanism. Some works used machine learning technologies to find the optimal solution. Wang et al. [29] integrated reinforcement learning with multi-agent techniques for finding the optimal solution. Game theory and a fictitious play process are combined to help improve performance. Peng et al. [30] and Wang et al. [31] used a restricted Boltzmann machine to learn the probability information of the global optimization contribution of concrete service. The information helps guide the search for solutions. Palade and Clarke [32] adopted collaborative agent communities to approximate the optimal solution.

2.3. Summary

The existing work for UMOSC can effectively find an optimal or near-optimal solution by maximizing or minimizing the utility value, which is computed on the basis of specified weights. However, it is not easy to determine weights in practice. One reason is that information on user preferences over multiple attributes may be unavailable. Even if the user preferences are known, it is hard to express them as accurate quantitative values.

The work for PMOSC finds a set of Pareto-optimal solutions under the assumption of unknown weights, but the performance, such as proximity and distributivity, needs to be improved. In this paper, we focus on the work for PMOSC, and provide a hybrid approach to search for a smaller set of solutions with proximity and distributivity.

3. Multi-Objective Service Composition Model

3.1. QoS Vectors

Definition 1. (QoS vector) Assuming a web service $ws$ has $M$ attributes, the quality of $ws$ is described by these attributes, which are considered an $M$-dimensional QoS vector. Thus, the QoS vector for $ws$ is defined as $Q(ws) = (q_1(ws), q_2(ws), \ldots, q_M(ws))$, where $q_k(ws)$ represents the $k$-th QoS attribute value of $ws$ (for $1 \le k \le M$).

Definition 2. (QoS vector for a composition) A composition is represented as $cs = (ws_1, ws_2, \ldots, ws_D)$, where $ws_j$ is the concrete service of $cs$ that instantiates the $j$-th abstract service. The QoS vector for $cs$ is defined as $Q(cs) = (q_1(cs), q_2(cs), \ldots, q_M(cs))$, where $q_k(cs)$ is the aggregation value of the $k$-th QoS attribute over all concrete services in $cs$.
As shown in Table 1, the aggregation value is computed based on the aggregation function. Other QoS attributes share similar aggregation functions, e.g., the cost computation in the case of sequential execution has a similar summation aggregation function.
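As an illustration of how such aggregation functions act on a composition, the following minimal sketch aggregates per-service QoS values for a purely sequential workflow. The rule set is an assumption made for this example (summation for response time and cost, product for availability and reliability, minimum for throughput), and the names `AGGREGATORS` and `aggregate_qos` are hypothetical.

```python
import math

# Assumed aggregation rules for sequential execution (illustrative only).
AGGREGATORS = {
    "response_time": sum,        # latencies of consecutive services add up
    "cost": sum,                 # costs add up
    "availability": math.prod,   # independent probabilities multiply
    "reliability": math.prod,
    "throughput": min,           # the slowest service is the bottleneck
}

def aggregate_qos(services):
    """Aggregate per-service QoS dicts into one QoS dict for the composition.

    Only attributes present on every service are aggregated.
    """
    return {
        attr: agg([ws[attr] for ws in services])
        for attr, agg in AGGREGATORS.items()
        if all(attr in ws for ws in services)
    }

composition = [
    {"response_time": 120.0, "availability": 0.99, "throughput": 40.0},
    {"response_time": 80.0, "availability": 0.95, "throughput": 25.0},
]
qos = aggregate_qos(composition)
```

Other workflow patterns (parallel, conditional, loop) would need their own rules; they are outside the scope of this sketch.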

3.2. Pareto-Optimality

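Definition 3. (Pareto-dominance) Given two compositions $cs_1$ and $cs_2$, their QoS vectors are denoted by $Q(cs_1)$ and $Q(cs_2)$, where $q_k(cs)$ is the $k$-th component of $Q(cs)$. $cs_1$ is said to Pareto-dominate $cs_2$ if (i) for every attribute, $cs_1$ has a better QoS value than $cs_2$ or an equivalent value, and (ii) for some attributes, $cs_1$ has strictly better QoS values than $cs_2$, i.e., with every attribute expressed so that smaller values are better,
$$\forall k \in \{1, \ldots, M\}: q_k(cs_1) \le q_k(cs_2) \quad \text{and} \quad \exists k \in \{1, \ldots, M\}: q_k(cs_1) < q_k(cs_2).$$
For brevity, the relation that $cs_1$ dominates $cs_2$ is denoted by $cs_1 \prec cs_2$. Through the notion of Pareto-dominance, we can also determine that $cs_1$ and $cs_2$ are non-dominated by each other if neither $cs_1 \prec cs_2$ nor $cs_2 \prec cs_1$.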

Definition 4. (Pareto-optimality) Given a set of compositions $CS$, a composition $cs \in CS$ is a Pareto-optimal solution if it is feasible and not dominated by any other feasible composition $cs' \in CS$, i.e., $\nexists\, cs' \in CS: cs' \prec cs$. For example, consider a set of four feasible compositions with the relations $cs_1 \prec cs_3$, $cs_2 \prec cs_3$, and $cs_2 \prec cs_4$. Since other compositions do not dominate $cs_1$ and $cs_2$, they are Pareto-optimal.
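Definitions 3 and 4 translate directly into a dominance test and a non-dominated filter. The sketch below assumes that every objective has been expressed so that smaller values are better and that QoS vectors are plain tuples; the function names are hypothetical.

```python
def dominates(a, b):
    """True if QoS vector a Pareto-dominates b (all objectives minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better

def non_dominated(vectors):
    """Return the vectors that no vector in the list dominates."""
    return [v for v in vectors if not any(dominates(u, v) for u in vectors)]

front = non_dominated([(1, 4), (2, 2), (3, 3), (4, 1)])
```

Here `(3, 3)` is removed because `(2, 2)` dominates it, while the remaining three vectors are mutually non-dominated. The filter is quadratic in the number of vectors, which is adequate for populations of the size used in this paper.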

3.3. Problem Statement

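The MOP of service composition can be defined as follows:
$$\min_{cs \in CS} \; Q(cs) = (q_1(cs), q_2(cs), \ldots, q_M(cs)),$$
where $cs$ represents a composition $(ws_1, ws_2, \ldots, ws_D)$, $Q(cs)$ is its QoS vector with every objective expressed so that smaller values are better, and $CS$ is the set of composite services.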

Considering multiple QoS attributes, service composition optimization is regarded as an MOP. Due to conflicting objectives and unavailable preferences, a single solution with the best values for all objectives generally cannot be found. An intuitive method to address this problem is to explore all Pareto-optimal solutions. However, the solution space grows exponentially as service requests increase, so finding all Pareto-optimal solutions is costly. Therefore, it is more desirable to approximate the Pareto-optimal set to allow runtime multi-objective service composition. This work aims to solve multi-objective service composition by seeking a representative set of solutions with good proximity and distributivity in the QoS objective space.

4. The EnMODE for Multi-Objective Service Composition

4.1. Initialization

The initialization of the proposed approach is conducted from two aspects: the population and the reference vectors.

For the population, an individual corresponds to a composite service composed of several concrete services. Each concrete service is randomly chosen from the service candidates. A unique identifier is used to identify the concrete service $ws$. After identifying the concrete services, an individual $cs$ is represented as $cs = (x_1, x_2, \ldots, x_D)$, where $x_j$ is the identifier of the concrete service chosen for the $j$-th abstract service and $D$ is the number of abstract services. The QoS vector of $cs$ is represented as $Q(cs) = (q_1(cs), q_2(cs), \ldots, q_M(cs))$, where $M$ is the number of objectives.

For the initialization of the reference vectors, the key steps are as follows. First, the reference points are generated by sampling points on a hyperplane. Then the reference points are mapped on the PF to generate the reference vectors. A reference vector is a vector that starts from the origin point in the objective space and ends in the reference point. Let $H$ be the parameter that controls the division on each objective axis; a reference vector $\mathbf{r} = (r_1, r_2, \ldots, r_M)$ is generated by selecting each $r_k$ from $\{0/H, 1/H, \ldots, H/H\}$ and satisfying $\sum_{k=1}^{M} r_k = 1$. Usually, the number of reference vectors equals the population size $N$. The initial reference vectors are stored in a set $R$ and are therefore represented as $R = \{\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N\}$.
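The sampling step can be illustrated with a simplex-lattice enumeration, which lists every point whose coordinates are multiples of 1/H and sum to 1. This is a minimal sketch of that standard construction, not necessarily the exact procedure used in the experiments; the function name `reference_points` is hypothetical.

```python
from itertools import combinations

def reference_points(M, H):
    """Enumerate all M-dimensional points with coordinates in
    {0/H, 1/H, ..., H/H} whose coordinates sum to 1."""
    points = []
    # Stars and bars: place M-1 bars among H+M-1 slots; the H remaining
    # slots are stars, and the stars between bars give each coordinate.
    for bars in combinations(range(H + M - 1), M - 1):
        counts = []
        prev = -1
        for b in bars:
            counts.append(b - prev - 1)   # stars between consecutive bars
            prev = b
        counts.append(H + M - 2 - prev)   # stars after the last bar
        points.append(tuple(c / H for c in counts))
    return points

vectors = reference_points(3, 4)
```

For M = 3 and H = 4 this yields 15 points, matching the binomial count C(H + M - 1, M - 1).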

4.2. Offspring Creation with Dual Strategy

According to [33], the state of the search space varies with the evolutionary process. At the early stage, the available information about the search (e.g., the individuals with better QoS) is limited. This information accumulates as the iterations increase. In the later stage, many individuals might be close to the true PF, so it is less likely to find better individuals even with a longer search. Such a change would leave the population without evolutionary pressure. To solve this problem, a dual strategy is needed to adjust the use of creation operators so as to provide sufficient evolutionary pressure throughout the evolutionary process.

First, two creation operators, namely, “rand/2/bin” and “current-to-best/1/bin”, are selected to manipulate individuals. The “rand/2/bin” operator provides stronger disturbance and creates offspring based on different random individuals, which expands the evolution ability. The “current-to-best/1/bin” operator creates the offspring by searching around the current best individual, which improves the evolutionary robustness. Both creation operators regard the abstract service as the operational dimension when generating a new individual. In both operators, the donors are chosen by the indices of different random individuals in the current population; in “current-to-best/1/bin”, best is additionally the index of the best individual. A random index drawn within a fixed range is used in the binomial crossover, and control parameters determine the proportion contributed by the different individuals.

Because the set of service candidates is finite, the identifier of a concrete service produced by the creation operators might fall outside its valid range. In that case, the identifier is reset to a value between the lower and upper bounds of the identifier.
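For orientation, the two creation operators and the identifier reset can be sketched with the textbook forms of DE mutation and binomial crossover. Everything specific here is an assumption of the sketch: a single scale factor `F` shared by all difference terms, the crossover rate `CR`, and a reset rule that redraws an out-of-range identifier uniformly at random. The function names are hypothetical.

```python
import random

def rand_2(pop, i, F, rng):
    """Textbook rand/2 mutation: a random base plus two scaled differences."""
    r1, r2, r3, r4, r5 = rng.sample([k for k in range(len(pop)) if k != i], 5)
    return [pop[r1][j] + F * (pop[r2][j] - pop[r3][j])
            + F * (pop[r4][j] - pop[r5][j]) for j in range(len(pop[i]))]

def current_to_best_1(pop, i, best, F, rng):
    """Textbook current-to-best/1 mutation: move toward the best individual."""
    r1, r2 = rng.sample([k for k in range(len(pop)) if k not in (i, best)], 2)
    x = pop[i]
    return [x[j] + F * (pop[best][j] - x[j]) + F * (pop[r1][j] - pop[r2][j])
            for j in range(len(x))]

def bin_crossover(target, mutant, CR, rng):
    """Binomial crossover; position j_rand always comes from the mutant."""
    j_rand = rng.randrange(len(target))
    return [m if (rng.random() < CR or j == j_rand) else t
            for j, (t, m) in enumerate(zip(target, mutant))]

def repair(trial, low, high, rng):
    """Round to integer identifiers; redraw any value outside [low, high]
    (the redraw rule is an assumption of this sketch)."""
    out = []
    for v in trial:
        v = int(round(v))
        out.append(v if low <= v <= high else rng.randint(low, high))
    return out

rng = random.Random(0)
pop = [[rng.randint(0, 99) for _ in range(4)] for _ in range(8)]
child = repair(bin_crossover(pop[0], rand_2(pop, 0, 0.8, rng), 0.4, rng),
               0, 99, rng)
```

Each position of an individual corresponds to one abstract service, so the operators act dimension by dimension on service identifiers.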

Second, these two operators are adjusted by a dual strategy, which controls the execution probability of “rand/2/bin” and “current-to-best/1/bin” during the evolutionary process. There is no clear division between the various evolutionary stages throughout the evolution process. Therefore, we use the ratio of the current iteration to the total number of iterations to distinguish between the different evolutionary stages. Based on this ratio, we denote their execution probabilities by $p_1$ and $p_2$ and compute them from $iter / iter_{max}$, where $iter$ is the number of the current iteration and $iter_{max}$ is the total number of iterations. The tendencies of $p_1$ and $p_2$ over the whole run are illustrated in Figure 1. At the early stage, “rand/2/bin” is run with a high probability in order to explore more high-quality individuals. As the evolution continues, the execution probability of “rand/2/bin” dynamically decreases, while the execution probability of “current-to-best/1/bin” dynamically increases. The exploration of “rand/2/bin” and the exploitation of “current-to-best/1/bin” are combined to speed up convergence and prevent premature convergence. In the later stage, “current-to-best/1/bin” is preferred in order to exploit the local information.
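The behavior described above can be illustrated with a simple schedule driven by the iteration ratio. The linear form used here is purely an assumption for illustration, chosen only because it reproduces the stated trend (the probability of “rand/2/bin” decreases while that of “current-to-best/1/bin” increases); the function names are hypothetical.

```python
import random

def operator_probabilities(iteration, iter_max):
    """Assumed linear schedule: returns (p_rand2, p_current_to_best)."""
    ratio = iteration / iter_max
    return 1.0 - ratio, ratio

def pick_operator(iteration, iter_max, rng=random):
    """Choose a creation operator for this iteration by the schedule."""
    p_rand2, _ = operator_probabilities(iteration, iter_max)
    if rng.random() < p_rand2:
        return "rand/2/bin"
    return "current-to-best/1/bin"

early = operator_probabilities(50, 200)
```

Any monotone schedule with the same endpoints would show the same qualitative behavior; only the rate of the hand-over between the two operators would differ.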

4.3. Reference Vector Neighbor Search

This search first performs a two-stage clustering to implement a fine-grained search under the guidance of the reference vectors and the elites among the non-dominated individuals, and then selects the solution set towards the true PF.

4.3.1. Two-Stage Clustering

In the two-stage clustering, the first-stage clustering uses the reference vector as the anchor to gather the non-dominated individuals. The second-stage clustering groups the dominated individuals under the direction of the elites in the first cluster.

For the first-stage clustering, we first search the non-dominated individuals from the union of the offspring and the current population. Then, the non-dominated individuals are clustered by computing their closeness degree with the reference vectors. The closeness degree can be measured by the perpendicular distance from the individual $cs$, with QoS vector $Q(cs)$, to a reference vector $\mathbf{r}$. The perpendicular distance is computed as follows.
$$d_2(cs, \mathbf{r}) = \left\| Q(cs) - d_1(cs, \mathbf{r}) \, \frac{\mathbf{r}}{\|\mathbf{r}\|} \right\|,$$
where $d_1(cs, \mathbf{r})$ represents the distance along $\mathbf{r}$, which is computed by $d_1(cs, \mathbf{r}) = |Q(cs)^{T} \mathbf{r}| / \|\mathbf{r}\|$, and $\|\cdot\|$ represents the norm of the vector. Each non-dominated individual can be attached to the nearest reference vector by comparing the distance values. Reference vectors with attached non-dominated individuals are labeled as active, while reference vectors without attached individuals are labeled as inactive.

Since more than one non-dominated individual may be attached to the same reference vector, we need to sort the individuals within a cluster. We evaluate an individual by its proximity and distributivity. The proximity is reflected by the distance along the closest reference vector. The smaller the value, the closer the individual is to the true PF. The distributivity is measured by the perpendicular distance between the non-dominated individual cs and that reference vector. This distance represents the distribution error between cs and the reference vector. The smaller the value, the closer cs is to the reference vector. These two criteria are integrated into a single compromise value, in which a parameter controls the balance between proximity and distributivity. We sort the non-dominated individuals in ascending order of the compromise value of cs.
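The first-stage clustering can be sketched as follows. The two distances follow the geometric definitions given above (projection length and perpendicular distance with respect to a vector through the origin). The compromise value is assumed here to be the weighted sum d1 + theta * d2, in the style of penalty-based boundary intersection; that form, the default `theta=5.0`, and the function names are assumptions of this sketch.

```python
import math

def distances(q, r):
    """Return (d1, d2): distance along r and perpendicular distance to r."""
    norm_r = math.sqrt(sum(x * x for x in r))
    d1 = abs(sum(a * b for a, b in zip(q, r))) / norm_r
    d2 = math.sqrt(sum((a - d1 * b / norm_r) ** 2 for a, b in zip(q, r)))
    return d1, d2

def first_stage_clusters(qos_vectors, ref_vectors, theta=5.0):
    """Attach each vector to its nearest reference vector (smallest d2),
    then sort every cluster by the assumed compromise value d1 + theta*d2."""
    clusters = {}
    for q in qos_vectors:
        k = min(range(len(ref_vectors)),
                key=lambda idx: distances(q, ref_vectors[idx])[1])
        d1, d2 = distances(q, ref_vectors[k])
        clusters.setdefault(k, []).append((d1 + theta * d2, q))
    return {k: [q for _, q in sorted(members)]
            for k, members in clusters.items()}

refs = [(1.0, 0.0), (0.5, 0.5), (0.0, 1.0)]
clusters = first_stage_clusters([(2.0, 0.1), (1.0, 0.1), (0.2, 3.0)], refs)
```

The keys of the returned dictionary are the indices of the active reference vectors; indices that do not appear correspond to inactive ones.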

For the second-stage clustering, we also need guidance for the search of the dominated individuals towards proximity and distributivity. Some characteristics of the second-stage clustering are summarized below.
(i) For each cluster in the first stage, the top individual is taken as the center of the second-stage clustering.
(ii) The dominated individuals are assigned to the closest center according to the Euclidean distance between the individuals and the centers.
(iii) The dominated individuals in the same cluster are sorted in ascending order of their distance to the center.
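Steps (i) to (iii) amount to a nearest-center assignment followed by a per-cluster sort. The sketch below assumes that the centers (the top individuals of the first-stage clusters) are already available as QoS vectors; the function name is hypothetical.

```python
import math

def second_stage_clusters(dominated, centers):
    """Assign each dominated vector to its nearest center (Euclidean
    distance) and sort each cluster by distance to that center."""
    clusters = {c: [] for c in range(len(centers))}
    for q in dominated:
        nearest = min(range(len(centers)),
                      key=lambda c: math.dist(q, centers[c]))
        clusters[nearest].append(q)
    for c, members in clusters.items():
        members.sort(key=lambda q: math.dist(q, centers[c]))
    return clusters

centers = [(1.0, 1.0), (5.0, 5.0)]
dominated = [(3.0, 1.5), (2.0, 2.0), (6.0, 6.0), (1.5, 1.5)]
groups = second_stage_clusters(dominated, centers)
```

A center with no nearby dominated individuals simply keeps an empty list, so every first-stage elite has a corresponding second-stage cluster.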

4.3.2. Population Selection

We propose a cyclic selection to distribute the new population evenly close to the true PF. The idea behind this is that, by selecting the feasible individuals having the best values in different clusters, we have a higher chance of obtaining a set of solutions with good proximity and distributivity. More specifically,
(i) For each first-stage cluster, the feasible head of the sorted list of non-dominated individuals is selected.
(ii) The feasible head of each sorted list of dominated individuals is selected in order.
(iii) If the number of selected individuals is less than the population size, other feasible non-dominated individuals are selected in order.
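One way to read steps (i) to (iii) is as repeated passes over the sorted cluster lists. The following minimal sketch makes that reading concrete; the interpretation of "in order" as a round-robin over clusters, the `feasible` predicate, and the function name are assumptions of the sketch.

```python
def cyclic_selection(nd_clusters, dom_clusters, n, feasible=lambda ind: True):
    """Select up to n individuals from pre-sorted cluster lists."""
    selected = []

    def pick_head(members):
        # Take the first feasible member that has not been selected yet.
        for ind in members:
            if feasible(ind) and ind not in selected:
                if len(selected) < n:
                    selected.append(ind)
                return

    for members in nd_clusters:       # step (i): non-dominated heads
        pick_head(members)
    for members in dom_clusters:      # step (ii): dominated heads
        pick_head(members)
    progress = True                   # step (iii): refill from non-dominated
    while len(selected) < n and progress:
        before = len(selected)
        for members in nd_clusters:
            pick_head(members)
        progress = len(selected) > before
    return selected

nd = [["a1", "a2"], ["b1"]]
dom = [["x1", "x2"], ["y1"]]
chosen = cyclic_selection(nd, dom, 5)
```

Taking one individual per cluster in each pass is what spreads the selected population across the reference vectors instead of concentrating it in a few dense clusters.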

5. Experiments and Analysis

5.1. Experiment Design
5.1.1. Dataset

Given a workflow with a set of tasks, there are concrete services with similar functions but different QoS values for each task. Therefore, there are a large number of composition instances. For each test case, the concrete services are randomly assigned using the QWS dataset (https://qwsdata.github.io/), which records the QoS measurements of real-world web services. We focus on the response time, availability, throughput, successability, and reliability attributes. All experimental results are collected on a 3.4 GHz PC with 8 GB RAM.

5.1.2. Comparative Approaches

(i) MODE: a basic multi-objective differential evolution algorithm, used to verify the impact of the dual strategy and fine-grained search on the performance of our proposed algorithm.
(ii) MOGP: a multi-objective genetic programming algorithm proposed in [34]. It is a powerful evolutionary metaheuristic to find the best trade-offs between more than two objectives.
(iii) MS-DABC: an improved artificial bee colony algorithm proposed in [3]. It achieves competitive performance by combining a synergistic mechanism, a diversity maintenance strategy, and a well-maintained external archive with the artificial bee colony algorithm.
(iv) NSGA-III-DDR: an improved evolutionary multi-objective optimization algorithm using a reference-point based non-dominated sorting approach, proposed in [35]. It introduces a distance dominance relationship into NSGA-III. The algorithm not only considers diverse solutions but also retains good convergence.

5.1.3. Parameter Setting

The shared parameters for all algorithms are as follows: the number of epochs is set to 200; the population size N is set to 100.

The individual parameters for each algorithm are as follows. For the EnMODE, two of its control parameters are set to 0.8 and two are set to 0.4; the parameter that controls the division on the objective axis is H = 8; a fixed value is used for the parameter that controls the proximity and distributivity; the maximum number of iterations is itermax = 200. For the comparative algorithms, the crossover rate and mutation rate are set to 0.7 and 0.3, respectively.

5.1.4. Evaluation Metrics

GD [36] measures the proximity degree of the obtained solution set toward the true Pareto-optimal set. It is computed using the quadratic mean of the Euclidean distances from the $N$ compositions in the obtained solution set to the closest composition in the true Pareto-optimal set. Based on [37], we identify the true Pareto-optimal set by selecting compositions from the union of the obtained solutions from the EnMODE and the comparative algorithms. GD is computed as follows.
$$GD(S, P^{*}) = \sqrt{\frac{1}{N} \sum_{i=1}^{N} d_i^{2}},$$
where $S$ and $P^{*}$ represent the obtained solution set and the true Pareto-optimal set, respectively, and $d_i$ is the Euclidean distance from $cs_i \in S$ to the closest composition in $P^{*}$. The smaller the value of GD, the better the proximity degree.
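Following the quadratic-mean description above, GD can be computed in a few lines. This minimal sketch assumes that both sets are given as lists of QoS vectors in the same normalized objective space; the function name is hypothetical.

```python
import math

def generational_distance(obtained, true_front):
    """Quadratic mean of the distance from each obtained vector to its
    nearest vector in the true Pareto-optimal set."""
    d = [min(math.dist(s, p) for p in true_front) for s in obtained]
    return math.sqrt(sum(x * x for x in d) / len(d))

gd = generational_distance([(0.0, 3.0), (4.0, 0.0)], [(0.0, 0.0)])
```

In this toy case the two nearest distances are 3 and 4, so the result is the square root of (9 + 16) / 2.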

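SP [38] measures the uniformity of the distribution of the obtained solution set. It is computed using the variance of the distances between compositions in the obtained solution set.
$$SP(S) = \sqrt{\frac{1}{N-1} \sum_{i=1}^{N} \left( \bar{d} - d_i \right)^{2}},$$
where $d_i$ represents the Euclidean distance from $cs_i$ to the closest $cs_j$, with $cs_i, cs_j \in S$ and $i \neq j$, and $\bar{d}$ is the mean of all $d_i$. The smaller the value of SP, the more uniform the obtained solution set.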
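The spacing metric can be sketched in the same style. The use of the sample standard deviation (denominator N - 1) is an assumption of this sketch, as is the function name; the nearest-neighbor distance is Euclidean, as stated in the text.

```python
import math

def spacing(obtained):
    """Standard deviation of nearest-neighbor distances in the obtained set."""
    d = [min(math.dist(s, t) for j, t in enumerate(obtained) if j != i)
         for i, s in enumerate(obtained)]
    mean = sum(d) / len(d)
    return math.sqrt(sum((mean - x) ** 2 for x in d) / (len(d) - 1))

even = spacing([(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)])
uneven = spacing([(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)])
```

An evenly spaced set gives a value of zero, and the value grows as the nearest-neighbor distances become more unequal.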

The values of the evaluation metrics are affected by the number of abstract services and concrete services. Thus, different parameter configurations are used: the number of abstract services varies from 5 to 50 with a step of 5, and the number of concrete services varies from 100 to 1000 with a step of 100.

5.2. Analysis of Experimental Results
5.2.1. Analysis of the Proximity Problem

Figure 2 shows the GD values obtained by the five algorithms on 10 test cases with different numbers of concrete services. The GD values obtained by the EnMODE on all test cases are smaller than those obtained by the other algorithms. More specifically, the EnMODE has smaller GD values than the baseline MODE. The GD values obtained by the EnMODE on the test cases of 100, 300, 400, 600, 700, 800, 900, and 1000 concrete services are slightly smaller than those obtained by the NSGA-III-DDR, while those on the test cases of 200 and 500 concrete services are significantly smaller. The GD values obtained by the EnMODE are also significantly smaller than those obtained by the MOGP and MS-DABC. As the number of concrete services grows, the GD values of every algorithm increase slowly, and the EnMODE has the slowest rate of increase.

Figure 3 displays the GD values on 10 test cases with different numbers of abstract services. The figure shows that the EnMODE has smaller GD values than the baseline MODE, MOGP, NSGA-III-DDR, and MS-DABC in all test cases. Specifically, the EnMODE has slightly smaller GD values than the NSGA-III-DDR on the first nine test cases, and significantly smaller GD values than the baseline MODE, MOGP, and MS-DABC. As the number of abstract services increases, the gap between the EnMODE and the other algorithms grows.

The experimental results on different test cases show that the EnMODE has better proximity in terms of GD values than the baseline MODE. This indicates that the dual strategy and the fine-grained search improve the performance of the algorithm, especially its proximity.

The experimental results on different test cases also show that the EnMODE has the best GD values among the competing approaches MOGP, NSGA-III-DDR, and MS-DABC. A possible reason is that, as populations evolve, the evolutionary pressure decreases, whereas the dual strategy proposed in Section 4 provides sufficient evolutionary pressure for the population at different stages. In the early stage, the exploration ability of “rand/2/bin” is exercised with a high probability. More and more services are utilized to compose different value-added services, which increases the chance of obtaining high-quality compositions. As the number of iterations increases, “current-to-best/1/bin” is gradually utilized to search around the potential high-quality region. The optimized information guides the exploration toward the potential optimized region, which makes it possible for the EnMODE to converge to the true PF. “rand/2/bin” is used at the same time to explore more new compositions. The usage of “current-to-best/1/bin” is increased during the later stage so that the creation operator uses more information to generate high-quality compositions. Even if the search status changes with the increase of concrete services and abstract services, the EnMODE provides sufficient evolutionary pressure to reduce the influence. Because MOGP, NSGA-III-DDR, and MS-DABC ignore the evolutionary pressure, they have worse proximity than the EnMODE.

5.2.2. Analysis of Uniform Distribution Problem

The SP results of each algorithm over 10 test cases for different numbers of concrete services are shown in Figure 4. The EnMODE has the best SP values on the test cases of 300, 400, 500, 700, 800, 900, and 1000 concrete services, and the NSGA-III-DDR has the best SP values on the test cases of 100, 200, and 600 concrete services. Overall, the EnMODE is the most competitive algorithm, while the baseline MODE, MOGP, and MS-DABC have worse SP values in all of these test cases. As the number of concrete services grows, the SP values of every algorithm increase, but the growth rate of the EnMODE is lower than that of the other algorithms.

The SP results of each algorithm over 10 test cases for different numbers of abstract services are displayed in Figure 5. The figure shows that the EnMODE outperforms its rivals. More specifically, the EnMODE has slightly smaller SP values than the NSGA-III-DDR, while the MODE, MOGP, and MS-DABC have significantly larger SP values than the EnMODE. As the number of abstract services increases, the SP values of the EnMODE and NSGA-III-DDR generally increase at a low rate, while the values of the MODE, MOGP, and MS-DABC increase quickly in all cases. In particular, the growth rate of the EnMODE is lower than that of the NSGA-III-DDR.

We can conclude from the experimental results on different test cases that the EnMODE has a more uniform distribution in terms of SP values than the baseline MODE. This indicates that the dual strategy and the fine-grained search improve the performance of the algorithm, especially the uniformity of its distribution.

From the experimental results, we also conclude that the EnMODE has the most uniform distribution overall in terms of SP values. We also found that the numbers of concrete services and abstract services influence the SP values. Still, the EnMODE has a narrower range of variation in SP values than the other algorithms. As stated in Section 4, the reference vector neighbor search uses two-stage clustering to downsize the problem and then conducts a fine-grained search. The first stage of clustering divides the non-dominated compositions under the guidance of the reference vectors so that similar compositions gather together and distinct ones are separated. The second stage of clustering assembles the dominated compositions in the same way around the elites from the first stage. Two-stage clustering thus achieves a natural-organized decomposition. A cyclic selection is then used to distribute the new generation evenly close to the true PF. Even if the numbers of concrete services and abstract services grow, the EnMODE provides a fine-grained search that makes the distribution of the obtained solution set over the whole extent of the current PF more uniform. The NSGA-III-DDR algorithm has only slightly poorer uniformity, possibly because it has a good distance dominance relationship. The MOGP and MS-DABC carry out the evolutionary process with genetic operators, which makes it difficult for them to distribute the solution set evenly close to the true PF.

5.3. Summary of Results

Based on the experimental evaluation, we have verified that the EnMODE algorithm finds a smaller set of solutions with better proximity and distributivity. Compared with MODE, NSGA-III-DDR, MS-DABC, and MOGP, the reference vector neighbor search gives EnMODE better proximity and distributivity. In addition, as the numbers of concrete and abstract services increase, EnMODE is less affected than the other algorithms.

6. Conclusion and Future Work

This paper proposes a novel multi-objective differential evolution algorithm as the search scheme. The proposed approach implements the natural-organized decomposition of the MOP, and guides the search of multiple sub-problems around the active reference vectors and high-quality non-dominated compositions. Experimental results verify that the proposed approach is more likely to find a representative set of solutions with good proximity and distributivity.

In future work, we expect to investigate the impact of the number of QoS objectives on the optimization problem of MOSC. Furthermore, the proposed approach will be improved to adapt to workflow changes.

Data Availability

The experimental data used to support the findings of this study are available upon request to the author.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was partially supported by the National Natural Science Foundation of China (No. U22A2026), the Project Funded of the China Postdoctoral Science Foundation (No. 2021M693727), the Natural Science Foundation of Chongqing (No. 2022NSCQ-MSX1809), the Special Fund of the Chongqing Postdoctoral Science Foundation (No. XmT2020180), the Science and Technology Research Program of Chongqing Municipal Education Commission (No. KJQN202100515), and the Foundation Projection of Chongqing Normal University (No. 21XLB003).