Effective anytime algorithm for multiobjective combinatorial optimization problems

In multiobjective optimization, the result of an optimization algorithm is a set of efficient solutions from which the decision maker selects one. It is common that not all the efficient solutions can be computed in a short time, and the search algorithm has to be stopped prematurely to analyze the solutions found so far. A set of efficient solutions that are well-spread in the objective space is preferred, to provide the decision maker with a great variety of solutions. However, only a few exact algorithms in the literature are able to provide such a well-spread set of solutions at any moment: we call them anytime algorithms. We propose a new exact anytime algorithm for multiobjective combinatorial optimization combining three novel ideas to enhance the anytime behavior. We compare the proposed algorithm with the state-of-the-art algorithms for anytime multiobjective combinatorial optimization using a set of 480 instances from different well-known benchmarks and four different performance measures: the overall non-dominated vector generation ratio, the hypervolume, the general spread and the additive epsilon indicator. A comprehensive experimental study reveals that our proposal outperforms the previous algorithms in most of the instances.


Introduction
MultiObjective Optimization (MOO) is a field of research with many applications in different areas, such as biology, computer science, scheduling, and finance. Broadly speaking, in a multiobjective optimization problem some objectives are in conflict and have to be optimized simultaneously. Without loss of generality, we consider minimization problems throughout this paper. In the case of a maximization problem, we use the property max(f) = −min(−f). A MultiObjective Program (MOP) is an optimization problem characterized by multiple and conflicting objective functions that are to be optimized over a feasible set of decisions. If the constraints and the objective functions are linear, the MOP is a MultiObjective Linear Program (MOLP), and when the variables are integer, we name it a MultiObjective Integer Program (MOIP). If some variables are constrained to be integer-valued and the rest are continuous, the problem is a MultiObjective Mixed Integer Program (MOMIP). MultiObjective Combinatorial Optimization (MOCO) is a special case of MOIP in which the feasible set is finite. MOIP and MOCO are special cases of MultiObjective Discrete Optimization (MODO).
Finding the whole Pareto front in MOCO problems can require too much computational time when a fast answer is needed. For example, one may need to solve an assignment problem in hours, while finding the whole Pareto front can take days. In particular, this happens in instances with many non-dominated points, or when the model is hard in itself. From a practical point of view, the decision maker with whom we interact may be interested in a set of solutions as well-spread as possible in the objective space at any time during the search. The concept of anytime algorithms was introduced by Dean and Boddy [4] to characterize algorithms with the following two properties: (i) they can be terminated at any time and still return an answer, and (ii) the answers returned improve in some well-behaved manner as a function of time. More precisely, algorithms that show a better trade-off between solution quality and runtime are said to have a better anytime behavior [23].
In this paper we propose a new exact algorithm to solve MOCO problems, which is also valid for MODO problems with finite feasible sets. The fact that the algorithm is exact and all the solutions found are in the Pareto front leaves less freedom when interpreting the algorithm and reduces the probability that two different implementations behave differently [31]. The algorithm is anytime in the sense that it is possible to interrupt its execution and take the non-dominated set found so far. Moreover, it is guaranteed that this set is well-spread in the objective space. To check this spread and the quality of the solutions properly, four well-known metrics from the literature have been used: the overall non-dominated vector generation ratio [13], the hypervolume indicator [9,36], the general spread [13,37], and the additive epsilon indicator [21]. In the computational results section, all these metrics are calculated for each of the 480 instances considered in the work presented here.
The proposed algorithm is based on a general framework by Dächert and Klamroth [2]. Broadly speaking, it consists in analyzing a search region in the objective space and looking for new non-dominated points at each iteration. These regions can be considered as boxes in R^p, where p is the number of objectives. The contribution of this paper is to present new design criteria and an innovative way of splitting the objective space so that the solutions obtained are well-spread over the objective space. These are detailed in the following three novel proposals:
• A new strategy to select the appropriate search space region as the next box to explore, in order to guarantee that the new solutions found are spread over the objective space.
• A new way of partitioning the search space after finding a new non-dominated point.This partition reorders the selection of the future boxes to explore and also has an influence in the spread of solutions.
• A new quality function to measure the priority for the new regions to explore.
In order to help readers reproduce the results of this paper, we have uploaded the source code to GitHub. This is a good practice that is becoming popular to address the reproducibility crisis in Computer Science [12].
The paper is organized as follows. In Section 2, we provide a literature review including the most relevant papers on MOO algorithms that could be used as anytime MOCO algorithms with good spread. In Section 3, we provide the background with some general definitions and the metrics we use in the computational results. In Section 4, a general formulation to solve MOCO problems is given. This formulation is the backbone of our contribution. In Section 5, our new anytime algorithm is presented. An exhaustive computational analysis is shown in Section 6. The last section is dedicated to the conclusions and future work.

Literature review
In this section we review different exact algorithms to solve multiobjective problems. We organize this review in four subsections. The first three subsections cover related works, which are presented (mostly) in chronological order in each subsection. The subsections focus respectively on the first algorithms, the ones based on recursion, and more recent algorithms. The fourth subsection summarizes the algorithms which can be considered as anytime with good spread and that are used for comparison with our proposed algorithm.

First exact algorithms
In 2004, Sylva and Crema [32], working on the idea of Klein et al. [16], designed a new method to solve MultiObjective Integer Linear Programming (MOILP). They partitioned the solution space of dimension p by adding p binary variables and p constraints to exclude the previously-generated non-dominated points. About three years later, Sylva and Crema [33] formulated a variant of their previous work, finding well-spread subsets of non-dominated points in MOMIP. If the variables are integer, the whole Pareto front is obtained. To the best of the authors' knowledge, this was the first work to show dispersed solutions in the objective space, supported by computational results.
Masin and Bukchin [24] developed an algorithm for MOMIP, called DMA (Diversity Maximization Approach), which finds solutions by maximizing a proposed diversity measure and guarantees the generation of the complete set of efficient points.
In 2013, Lokman and Köksalan [22] proposed two improved algorithms based on the work of Sylva and Crema [32]. The algorithms iteratively generate non-dominated points and exclude the regions that are dominated by the previously-generated non-dominated points.
Ceyhan et al. [1], based on the work of Lokman and Köksalan [22], proposed some algorithms to find a small representative set of non-dominated points for MOMIP. The first of them, called SBA, is able to find the complete Pareto front for discrete problems if enough time is available. The main advantage of the SBA algorithm is that it produces very well-spread solutions over the objective space. As noted in [1], the work of Masin and Bukchin [24] is similar to Sylva and Crema's method in [33], because they generate similar representative subsets of non-dominated points for MOMIPs. In fact, SBA outperformed Masin and Bukchin's algorithm.

Algorithms based on recursion
In 2003, Tenfelde-Podehl [34] proposed a recursive algorithm for MOCO problems with p criteria. Dhaenens et al. [6] proposed a new method to solve MOP, called K-PPM (Parallel Partitioning Method), which is based on the work of Lemesre et al. [19], called 2-PPM and valid only for the bi-objective case. This new approach is valid for any dimension p ≥ 2, and could also be considered as an extension of the idea of Tenfelde-Podehl [34].
Özlen and Azizoglu [25] developed a general approach to generate the Pareto front for MOIP problems, based on the ε-constraint method. They also used recursion to solve multiobjective problems with fewer objectives, obtaining efficiency ranges for each objective. One drawback of the method in [25] is that it generates the same non-dominated point many times. The algorithm was enhanced in the work of Özlen et al. [26].

Modern algorithms
Laumanns et al. [18,17] proposed the first algorithm to calculate the Pareto front in MOP through the resolution of a number of single-objective problems that depends only on the number of solutions in the front. The work of Laumanns et al. [18] was based on the ε-constraint method. They obtained a bound O(k^(p−1)) for the number of calls to the single-objective solver, where k is the number of non-dominated points and p the number of objectives. In [17], Laumanns et al. presented another algorithm for solving MOP using only lower bounds (for maximization problems), which leads to fewer constraints. The algorithm by Kirlik and Sayın [14] was an improvement of the work of Laumanns et al. [18]. They changed the order in which the subproblems were solved. The search is managed over (p − 1)-dimensional rectangles.
In the last decade, modern algorithms solving MOO problems have been developed. Apart from the aforementioned work by Ceyhan et al. [1], there are other important papers we should mention. Przybylski et al. [28] proposed an algorithm to compute the supported non-dominated extreme points for MOIP. This work was generalized by Przybylski et al. [29] to obtain the complete Pareto front for MOIP and is considered a generalization of the two-phase method of Ulungu and Teghem [35]. Özpeynirci and Köksalan [27], among other authors, developed an algorithm to generate supported non-dominated points for MOMIP. Dächert and Klamroth [2] conceived an algorithm to solve MOO for p = 3. Their method is based on the work of Przybylski et al. [28] and [29]. They named full m-split the general framework for solving any MOO of dimension m. This algorithm, suitably adapted, is used in our work. Later, Klamroth et al. [15] improved full m-split, developing two new methods, called RE and RA, where they were concerned with eliminating or avoiding redundancies in the search zone explorations. Recently, Dächert et al. [3] have provided new theoretical insights into structural properties of the search region, assuming that a finite set of mutually non-dominated points has already been computed. The work provides better results than those in [15] for high dimensions (p ≥ 6). In 2018, Holzmann and Smith [11] designed an algorithm to solve MODO problems using a weighted augmented Tchebycheff scalarization. This work is also based on the work of Dächert and Klamroth [2]. They improved the results of Kirlik and Sayın [14] and Özlen et al. [26].

Related algorithms that can be considered as anytime with good spread
As stated in the preceding review, only a few methods can be considered to have good spread over the objective space. This is in concordance with the work of Ceyhan et al. [1], which identified only two algorithms that can be used to compute the complete front in MOCO problems [24,33]. In fact, with their SBA algorithm, they outperformed the results in [33]. The work of Holzmann and Smith [11] can also be considered as anytime, so we include it in our study.
We analyze which other algorithms in the literature can be slightly modified to obtain a well-spread set of non-dominated points at any time during the search. The algorithms based on the ε-constraint method do not seem to be good at spreading solutions over the objective space, because only one selected objective is optimized in the objective function (Ch. 4 of [7]). Moreover, methods that use branch and bound as the general framework to obtain the Pareto front are widely surpassed by modern algorithms, which avoid repeating solutions [2]. In addition, those that compute the nadir point in a first phase can be slow, because computing the nadir point is an NP-hard problem in itself [10], so we do not consider these algorithms.
In conclusion, we have chosen two algorithms: SBA, from the work of Ceyhan et al. [1], and the method of Holzmann and Smith [11]. They are, to the best of the authors' knowledge, the algorithms that can potentially provide a well-spread set of non-dominated points at any time of execution.

Background
In this section, we first present some basic definitions to understand the paper, and then we define the four metrics used for the performance assessment in the computational results.

Definitions on multiobjective optimization
A MOCO problem can be defined as

min f(x) = (f_1(x), f_2(x), …, f_p(x)) subject to x ∈ X,

where x is the decision vector, p ∈ N is the number of objectives with p ≥ 2, f_i : X → R with i = 1, …, p are the objective functions, and X ≠ ∅ denotes the feasible solution set, which is discrete and bounded.
The notion of optimality with several objective functions is considered in the sense of Pareto optimization. Given two vectors x, y ∈ R^p, we say that x is weakly dominated by y if y_i ≤ x_i for all i = 1, …, p (denoted y ≦ x). If y weakly dominates x and y_k < x_k for at least one k ∈ {1, …, p}, then we say that y dominates x (denoted y ≤ x) or x is dominated by y. If strict inequality holds for all k ∈ {1, …, p}, then we say that y strictly dominates x (denoted y < x) or x is strictly dominated by y.
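As an illustration (not part of the original paper), the three dominance relations above can be sketched in a few lines of Python for minimization problems, with objective vectors represented as tuples:

```python
# Dominance relations for minimization (vectors as equal-length tuples).

def weakly_dominates(y, x):
    """y weakly dominates x: y_i <= x_i for all i (y <= x componentwise)."""
    return all(yi <= xi for yi, xi in zip(y, x))

def dominates(y, x):
    """y dominates x: weak dominance plus strictly better in one objective."""
    return weakly_dominates(y, x) and any(yi < xi for yi, xi in zip(y, x))

def strictly_dominates(y, x):
    """y strictly dominates x: strictly better in every objective."""
    return all(yi < xi for yi, xi in zip(y, x))
```

Note that every vector weakly dominates itself, but does not dominate itself.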
A feasible solution x ∈ X is an efficient solution if there is no y ∈ X such that f(y) dominates f(x). A solution x ∈ X is called weakly efficient if there is no y ∈ X such that f(y) strictly dominates f(x). The image of an efficient solution x is a non-dominated point, z = f(x). The image of a weakly efficient solution x' is a weakly non-dominated point, z' = f(x'). The set of all efficient solutions in a MOCO problem is called the efficient set, X_E, and its image is the Pareto front, PF = f(X_E) ⊆ R^p. An efficient solution is supported if its image lies on the frontier of the convex hull of PF ⊆ R^p. Equivalently, x ∈ X is supported if it minimizes a weighted sum of the p objectives involving positive weights. Since, in MOCO, many of the elements in X_E may have the same image, we are only interested in the set PF and one preimage for each element of this set.
Let z^1, z^2 ∈ R^p. We say that z^1 <_lex z^2 if, for q = min{k | z^1_k ≠ z^2_k}, it holds that z^1_q < z^2_q. The symbol <_lex represents the strict lexicographic order. Given a permutation σ and a vector function f : X → R^p, we denote by f_σ = (f_σ(1), f_σ(2), …, f_σ(p)) the vector function where the objectives are reordered using σ. We say that x ∈ X is a lexicographic optimal solution for permutation σ if there is no y ∈ X with f_σ(y) <_lex f_σ(x). There exist at most p! different lexicographic optimal solutions, one for each permutation.
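A minimal sketch of the strict lexicographic comparison and of reordering objectives by a permutation σ (0-indexed here, an implementation choice not fixed by the paper):

```python
# Strict lexicographic order and objective reordering by a permutation.

def lex_less(z1, z2):
    """True if z1 <_lex z2: smaller in the first coordinate where they differ."""
    for a, b in zip(z1, z2):
        if a != b:
            return a < b
    return False  # equal vectors are not strictly ordered

def reorder(f_values, sigma):
    """Objective vector reordered according to permutation sigma (0-indexed)."""
    return tuple(f_values[i] for i in sigma)
```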
Given a Pareto front, PF, the ideal point is z^I = (z^I_1, z^I_2, …, z^I_p), where z^I_i = min_{z ∈ PF} z_i, and the nadir point is z^N = (z^N_1, z^N_2, …, z^N_p), where z^N_i = max_{z ∈ PF} z_i. It is clear that z^I and z^N are a lower and an upper bound for the Pareto front, respectively. While the ideal point is found by solving p single-objective optimization problems, the computation of the nadir point without knowing the whole Pareto front involves optimization over the efficient set, a very difficult problem in general [8].
Although it is common to use the term 'efficient solution' in the decision space and 'non-dominated point' in the objective space, we sometimes use the term 'solution' for both spaces when there is no possible confusion. We assume that |PF| > 1; otherwise the problem can be solved in one iteration with a single-objective optimization technique.
Given two vectors l, u ∈ R^p with l < u, we define the box [l, u] as [l, u] = {z ∈ R^p | l ≦ z < u}. When the lower bound of a box is the ideal point of a Pareto front, we usually represent the box only with its upper bound, that is, [z^I, u] = [u].

Performance assessment
The performance assessment of algorithms for multiobjective optimization is not a trivial issue. A good analysis comparing different multiobjective evolutionary algorithms is given in [38]. Recently, Li and Yao [20] published a survey covering 100 different quality metrics. In [13], Jiang et al. categorize the MOO metrics into four groups: capacity metrics, convergence metrics, diversity metrics, and convergence-diversity metrics. Although the performance of anytime algorithms with good spread (effective anytime algorithms) is often evaluated subjectively (see [23]), we have tried to take a representative group of metrics for which we can assume that the better their values, the better the performance of the algorithm. Thus, we take one metric from each of the four categories defined by Jiang et al.
Capacity metrics [13] quantify the number or ratio of non-dominated solutions. We use here the Overall Non-dominated Vector Generation Ratio, defined as

ONVGR = |N| / |PF|,

where N ⊆ PF is the set of non-dominated solutions found by a run of a search algorithm and |·| is the cardinality of a set. Thus, ONVGR is a rational number in the interval [0, 1] that represents the fraction of points of the Pareto front that were found by the search algorithm. For computing ONVGR, we need to know the total number of points in the Pareto front.
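A sketch of this ratio, assuming both sets are available as collections of hashable objective tuples (illustrative only):

```python
# ONVGR = |N| / |PF|, where N is the subset of the known Pareto front
# found by the search algorithm.

def onvgr(found, pareto_front):
    """Fraction of Pareto-front points present in `found`."""
    n = set(found) & set(pareto_front)  # only count true non-dominated points
    return len(n) / len(pareto_front)
```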
Convergence metrics measure the degree of proximity of the set of solutions to the complete front. The additive epsilon indicator gives the minimum additive factor by which the approximation set has to be translated in the objective space in order to weakly dominate the reference set [38,21]. We have scaled each objective to obtain a value in the range [0,1]. This additive epsilon indicator is defined as

ε+(N) = max_{z ∈ PF} min_{z' ∈ N} max_{i=1,…,p} (z'_i − z_i) / r_i,

where r_i is the range of objective i in N. This metric is also used in the computational experiments conducted by Ceyhan et al. [1] under the name coverage gap.
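A sketch of the scaled additive epsilon indicator as described above; the per-objective ranges are passed in explicitly, since only their role (scaling each objective) is stated in the text:

```python
# Scaled additive epsilon indicator: smallest per-objective additive shift
# needed for the approximation set to weakly dominate every reference point.

def eps_plus(approx, reference, ranges):
    """approx, reference: lists of objective tuples; ranges: r_i per objective."""
    return max(
        min(
            max((a[i] - z[i]) / ranges[i] for i in range(len(z)))
            for a in approx
        )
        for z in reference
    )
```

When the approximation set contains the whole reference set, the indicator is zero.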
Diversity metrics indicate the distribution and spread of solutions. From this group, we have selected the general spread metric, also cited in [13]. The original spread metric (Deb et al. [5]) calculates the distance between two consecutive solutions, which only works for 2-objective problems. An extension to any dimension, defined in [37], computes the distance from a point to its nearest neighbor. Let N be a set of non-dominated points and {e_i}_{i=1}^m the extreme solutions, which are the images of the lexicographic optimal solutions of the complete Pareto front. Let d(z, N) = min_{z' ∈ N, z' ≠ z} ||z − z'|| and let d̄ be the mean of d(z, N) over all z ∈ N. The general spread is

Δ* = (Σ_{i=1}^m d(e_i, N) + Σ_{z ∈ N} |d(z, N) − d̄|) / (Σ_{i=1}^m d(e_i, N) + |N| · d̄).

Note that the metric Δ* is a non-negative number. The lower its value, the more well-spread N is. Every lexicographic optimal solution not obtained in the execution produces a positive value in Σ_{i=1}^m d(e_i, N) and a higher value for Δ*.

From the group of convergence-diversity metrics we have selected the hypervolume indicator. Given a set of d non-dominated points, N = {z^1, z^2, …, z^d}, the hypervolume HV is the measure of the region which is simultaneously dominated by N and bounded by a reference point r ∈ R^p:

HV(N) = volume(∪_{z ∈ N} [z, r]),

where volume is the Lebesgue measure in R^p. This is the quality measure with the highest discriminatory power among the known unary quality measures [23,30,38]. There are many software packages that calculate the exact hypervolume, given a non-dominated set and a reference point [9,36]. The reference point can be taken as r_i = max_{j=1,…,d} z^j_i for all i = 1, …, p. This is an acceptable choice when we have no information about the complete Pareto front. In the computational experiments presented in this paper, we have calculated the complete fronts for all the instances in order to have a fitted reference point. This means that we know the nadir point for every instance. As noted in the work of Li and Yao [20], there is still no consensus on how to choose a proper reference point for a given problem. To assign some importance to the extreme points, we add one unit to the nadir point, so the reference point is r_i = z^N_i + 1 for all i = 1, …, p. Sometimes, it is useful to consider the percentage of the total hypervolume reached: HV% = 100 · HV(N) / HV(PF).

Given two subsets of non-dominated points, A and B, we write A ⪰ B if every z^2 ∈ B is weakly dominated by at least one z^1 ∈ A. A metric is Pareto compliant if, for every two subsets A and B with A ⪰ B and B ⪰̸ A, the metric value for A is not worse than the metric value for B. It is desirable for a metric to be Pareto compliant [39]. In other words, it must not contradict the order induced by the Pareto dominance relation. Of the four metrics considered, ONVGR, HV and ε+ are Pareto compliant. This is equivalent to saying that, if we add a new element to the subset of solutions, then ONVGR and HV do not decrease and ε+ does not increase. Nevertheless, Δ* is not Pareto compliant, and we have to be more careful with its analysis.
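For p = 2 the hypervolume can be computed exactly with a simple sweep. A minimal sketch (minimization, reference point r, points assumed mutually non-dominated; general p requires the packages cited above):

```python
# 2-D hypervolume for minimization: sort the non-dominated points by the
# first objective and sum the rectangular slabs they dominate up to r.

def hypervolume_2d(points, r):
    hv = 0.0
    prev_z2 = r[1]
    for z1, z2 in sorted(points):      # ascending z1 => descending z2
        hv += (r[0] - z1) * (prev_z2 - z2)  # slab between prev_z2 and z2
        prev_z2 = z2
    return hv
```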

General formulation for solving MOCO problems
In this section we describe the framework of Dächert and Klamroth [2] to solve MOCO problems. The framework can be found in Algorithm 1. Our proposed algorithm, described in Section 5, is based on this framework, but we add three modifications that let us obtain a better spread of solutions over the objective space. The method used by Holzmann and Smith [11] is also based on this framework. The idea of the method is to maintain a set of search zones, U, which are p-dimensional boxes. Every search zone is defined by its upper bound, because we can consider that the ideal point is the lower bound.
At the beginning, the set N, containing the non-dominated points, is empty (Line 1), and the set of search zones contains the initial element U (Line 2), defined by an upper bound of the nadir point. The mathematical program used by the algorithm, P, is an input parameter. The algorithm then enters a loop while there is at least one zone to analyze. In each iteration of the loop it selects one zone (Line 4), solves the optimization problem and, if a new non-dominated point is found, saves it in N. Every time it finds a new non-dominated point, it updates the set U accordingly (Line 8), so as to prevent repeated solutions in the future. Another goal of the updating procedure is to reduce the number of search regions at each iteration. More specifically, at least box B is extracted from U. The algorithm terminates because, in MOCO problems, the number of non-dominated points is finite.
Algorithm 1 General method for MOCO problems
1: N ← ∅
2: U ← {U}
3: while U ≠ ∅ do
4:   Select a box B ∈ U
5:   if P(B) is feasible then
6:     x* ← Optimal solution of P(B)
7:     N ← N ∪ {f(x*)}
8:     Update U
9:   else
10:    U ← U \ {B}
11:  end if
12: end while
In what follows, we study the relevant elements of Algorithm 1 and analyze different options which may produce a well-spread set of solutions if we treat it as an anytime algorithm. We focus on three aspects of the general method that we think are important to consider.

Mathematical program
The selected mathematical program is one of the key aspects of the algorithm. Not all methods that solve multiobjective problems have the same performance. In fact, those based on the ε-constraint method (Ch. 4 of [7]) should not perform well as effective anytime algorithms, unless we use a strategy for the selection of the next search zone to explore that guarantees a good dispersion of solutions in the objective space. The mathematical program (9)-(11) used in the SBA method by Ceyhan et al. [1] maximizes a variable α, which represents the coverage gap, together with an augmentation term, where ε is a sufficiently small positive constant and ω_i > 0 for all i. If the program (9)-(11) has a solution, then it is efficient. Another mathematical program, used in the work of Holzmann and Smith [11], is based on a weighted augmented Tchebycheff norm:

min_{x ∈ X} max_{i=1,…,p} {ω_i (f_i(x) − z^I_i)} + ϵ Σ_{i=1}^p ω_i (f_i(x) − z^I_i),    (12)

where z^I is the ideal point of the problem. The parameters ω and ϵ are chosen to make the program feasible. Depending on the objective value, we conclude whether the problem has a new solution inside the box or not. The reader is referred to the original paper [11] for more information.
We observed relevant performance differences between the two programs in a preliminary study. The weighted augmented Tchebycheff norm behaves better regarding the spread of the solutions in the objective space.

Selection of the next zone to explore
The selection may be done randomly but, intuitively, if we split the objective space into different regions and at each step we explore those with a higher volume, we expect a better final spread of the solutions. Following this idea, we can define a numerical value for each search zone, called priority. One example of a priority function is the volume of the p-dimensional hyperrectangle with opposite vertices u (the upper bound of the search zone) and z^I (the ideal point). For instance, if u = (10, 20, 30) and z^I = (5, 12, 20), then the volume of the box is (10 − 5) · (20 − 12) · (30 − 20) = 400, and this could be its priority. A more sophisticated way of selecting the next box to explore is described in Section 5.1.
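The volume-based priority above can be sketched in a couple of lines (illustrative only):

```python
# Volume of the hyperrectangle with opposite vertices u (upper bound of
# the search zone) and z_ideal (ideal point), used as the zone's priority.
from math import prod

def volume_priority(u, z_ideal):
    return prod(ui - zi for ui, zi in zip(u, z_ideal))
```

With the example from the text, `volume_priority((10, 20, 30), (5, 12, 20))` gives 400.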

Updating the set of search zones
This may be the most important aspect of Algorithm 1 (Line 8) in terms of termination of the general algorithm and computational effort. One way to ensure that a non-dominated point is never computed again during the search consists in splitting the objective search zone and discarding at least that point. A good way to do this is found in the work of Dächert and Klamroth [2]. After a solution z is found, they divide the search zone into p new zones. The union of the new search zones discards the points dominated by z, and the new search regions to explore contain at least one point less than the original search region. Given a box B with upper bound u = (u_1, …, u_p), let us suppose that we solve the mathematical problem and obtain a new non-dominated point z = (z_1, …, z_p). Then, we split the box B into {B_i}_{i=1}^p with upper bounds {u^i}_{i=1}^p, where u^i = (u_1, …, u_{i−1}, z_i, u_{i+1}, …, u_p) for all i = 1, …, p. Each box B_i with upper bound u^i is said to have direction i. If the lower bound of all the boxes B and B_i is the ideal point, z^I, we call the split full p-split. A graphical example is shown in Figures 1a and 1b for p = 3. Note that when a new non-dominated point is obtained, we must apply full p-split to all search zones that contain that new point, in order to ensure the non-duplicity of the solutions. Full p-split is used in the work of Holzmann and Smith [11]. In Section 5.2 we describe a new way of splitting the objective space, called p-partition.
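A minimal sketch of the full p-split on upper bounds (the lower bounds all stay at the ideal point, so they are omitted here):

```python
# Full p-split: splitting a box with upper bound u around a new
# non-dominated point z yields p boxes; box i replaces the i-th
# coordinate of u with z_i (box i has "direction" i).

def full_p_split(u, z):
    p = len(u)
    return [tuple(z[i] if j == i else u[j] for j in range(p))
            for i in range(p)]
```

For example, splitting u = (10, 20, 30) around z = (6, 15, 25) yields the upper bounds (6, 20, 30), (10, 15, 30) and (10, 20, 25).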
The splitting process often generates redundant zones. If we have two search zones u^1, u^2 with u^1 ≤ u^2, all potential non-dominated points generated by u^1 could also be generated by u^2, which means that u^1 is redundant. Therefore, a filtering process should be implemented after the split. Klamroth et al. [15] proposed two different algorithms for this purpose. The first is called RE (redundancy elimination) and consists in eliminating the redundant search zones at each iteration. The second algorithm is called RA (redundancy avoidance) and is based on structural properties of local upper bounds which yield necessary and sufficient conditions for a candidate local upper bound to be non-redundant. The filtering step is avoided in this case. In the computational experiments of the work by Klamroth et al. [15], they obtained better results in runtimes using RE when p ∈ {3, 4, 5}, and using RA when p > 5.
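A sketch of the RE filter over a list of upper bounds (a quadratic scan, illustrative only; the actual RE procedure of [15] is more involved):

```python
# RE (redundancy elimination): a zone u1 is redundant if another zone u2
# satisfies u1 <= u2 componentwise, because every point reachable from u1
# is also reachable from u2.

def remove_redundant(zones):
    def le(a, b):
        return all(x <= y for x, y in zip(a, b))
    return [u for u in zones
            if not any(le(u, v) for v in zones if v != u)]
```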
A more recent paper by Dächert et al. [3] describes a specific neighborhood structure among local upper bounds. With this structure, updates to the search region when a new non-dominated point is found can be more efficient than the RE and RA approaches as the number of points increases, but RA and even RE perform better for a small number of solutions. Since our proposal is designed as an anytime algorithm where the decision maker can stop the execution whenever desired, it is reasonable to think that the total number of solutions may not be high, unless a complete execution is derived. Thus, we have decided to use the RE approach in this paper.

Proposed anytime algorithm: TPA
In this section, we propose a new exact anytime algorithm based on the general framework described in Section 4, but with three novel contributions. This algorithm can solve any MOCO problem and it can also be used for MODO problems with finite feasible sets. We call it TPA because it uses a Tchebycheff mathematical program with Partition of the objective space and Alternation of directions. The pseudocode is shown in Algorithm 2.
Algorithm 2 TPA
1: XZ ← ∅
2: Compute z^I
3: Compute z^N, an upper bound of the nadir point
4: r ← max_{i=1,…,p} (z^N_i − z^I_i)
5: ϵ ← 1/(2p(r − 1))
6: Insert the box [z^I, z^N] into L_1
7: k ← 1
8: while L ≠ ∅ and intime do
9:   (B, k) ← Select next box (L, k)
10:  Set ω = (ω_1, …, ω_p) with ω_i = 1/(B.u_i − z^I_i)
11:  x* ← optimal solution of Eq. (12); obj ← its objective value
12:  if obj < 1 and z^I < B.u then
13:    XZ ← XZ ∪ {(x*, f(x*))}
14:    Update (L, f(x*))
15:  else
16:    Remove B from its queue
17:  end if
18: end while

We now provide a high-level description of the algorithm and go in depth into the novel contributions in separate subsections. Every search zone is a box with the following fields: lower bound (l), upper bound (u), and priority. The vector L contains p priority queues (heaps) of boxes. The boxes in queue L_i have direction i and are sorted by non-increasing priority. The initial box has as lower and upper bounds the ideal point and an upper bound of the nadir point. Thus, it contains the whole Pareto front. We insert this box into L_1 (Line 6), although any other queue L_i could be chosen. The ϵ value is constant during the execution and depends on the initial bounds (Line 5).
TPA then executes a loop while there is a box in any list of L and the time limit is not reached (Boolean intime). In the loop, it first selects the next box to analyze (Line 9) using Algorithm 3. Then, it solves the mathematical program in Eq. (12). As pointed out in Section 4, this mathematical program provides a better spread of the non-dominated points. The parameters are taken from the work of Holzmann and Smith [11]: ω_i = 1/(u_i − z^I_i) for all i, and ϵ = 1/(2p(r − 1)), where u is the upper bound of the next box to analyze, r = max_{i=1,…,p} (z^N_i − z^I_i), z^I is the ideal point and z^N is an upper bound of the nadir point. When u ≦ z^N, the model is always feasible. Moreover, if z^I < u, the solution is inside the box [u] if and only if the objective value is less than one unit (Theorem 5 of [11]). Thus, we can slightly modify Line 5 of Algorithm 1, changing the expression 'is feasible' to 'obj < 1 and z^I < u' (Line 12 of Algorithm 2). If TPA finds a new solution, x*, it adds the pair (x*, f(x*)) to XZ (Line 13) and updates the boxes in the priority queues L (Line 14). Otherwise, it discards the box from the corresponding queue (Line 16).
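The parameter choice can be sketched as follows; this is our reading of the garbled text, so the formula for ω in particular should be checked against [11]:

```python
# Sketch of the TPA parameter choice (assumed, following Holzmann and
# Smith): omega_i scales objective i by the width of the current box, and
# epsilon depends on the largest initial objective range r.

def tchebycheff_params(u, z_ideal, z_nadir_ub):
    p = len(u)
    omega = tuple(1.0 / (u[i] - z_ideal[i]) for i in range(p))
    r = max(z_nadir_ub[i] - z_ideal[i] for i in range(p))
    eps = 1.0 / (2 * p * (r - 1))
    return omega, eps
```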
The correctness and termination of TPA are proven in Theorem 1. This is an exact anytime algorithm and, thus, the complete Pareto front is found if the algorithm is given enough time. This guarantees the optimality and convergence of the algorithm. Observe that all the solutions found belong to the Pareto front. This contrasts with other heuristic and metaheuristic algorithms that can only compute approximate solutions without any guarantee of being efficient. Such algorithms require mechanisms to explore the search space that can be interpreted differently by different developers. Rostami et al. [31] have recently studied this issue and found that the resulting approximate solutions can influence the final results in such a way that the differences in quality metrics can even be statistically significant. The reliability and stability of TPA are supported by the ILP solver used. The memory usage and computation time are proportional to the number of boxes to maintain, and this can be exponential in the number of solutions found. Furthermore, the number of solutions in the Pareto front can be exponential in the number of decision variables. Thus, the memory required by the algorithm and the run time can be exponential in the number of variables if the whole Pareto front is computed. Regarding memory usage, in our experiments, using 2 GB of RAM was enough even when the complete Pareto front was computed.
In the next subsections we detail the three novel contributions of the algorithm: a method to diversify the search by exploring boxes with different directions, a new way to split the boxes that forms a partition of the original one, and a new priority function for the boxes to explore.

Alternation of directions in the search
In this section we propose a new strategy for selecting the next box to explore, which tries to diversify the regions of the objective space that are explored. This is the first novel contribution of this paper. The priority queue L_i contains boxes with direction i. Boxes in different directions are displaced in the objective space along the different dimensions. We propose to select a box from a different priority queue L_i in each iteration of TPA. Among the boxes with the same direction, TPA chooses the one with the highest priority. The pseudo-code of the Select next box procedure (Line 9 in Algorithm 2) is in Algorithm 3.

Algorithm 3 Select next box
Input: {L_i}, k // lists of boxes and integer counter
Output: (B, k)
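Since the body of the Algorithm 3 listing did not survive extraction, the following is a plausible pure-Python sketch of the alternation of directions (the queue representation and helper name are assumptions, not the published code):

```python
import heapq

# Illustrative reconstruction of the selection procedure: L[0..p-1] are
# priority queues of (negated-priority, box) pairs, one per direction, and
# k is the round-robin counter carried between calls.

def select_next_box(L, k):
    """Pop the highest-priority box from the next non-empty queue, cycling
    over directions; assumes at least one queue is non-empty."""
    p = len(L)
    while not L[k % p]:          # skip exhausted directions
        k += 1
    neg_priority, box = heapq.heappop(L[k % p])
    return box, k + 1            # next call starts at the following direction
```

Negating priorities turns Python's min-heap into the max-priority queue the selection requires.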

New updating procedure of the search zone
This section presents a new way of splitting a box, called p-partition. This is the second contribution of the paper. The main difference between p-partition and full p-split is that the new boxes created by p-partition form a partition of the original box (they are pairwise disjoint).

Definition 1. Let B = [l, u] be a box and z ∈ R^p a point. For every i = 1, ..., p, we define B_i(z) = {x ∈ B | x_i < z_i and x_j ≥ z_j ∀j > i}. We also define B_0(z) = {x ∈ B | x_j ≥ z_j ∀j}.
It is easy to check that the sets B_i(z) are boxes, which can be written as B_i(z) = [l′, u′] with l′_j = l_j for j ≤ i, l′_j = ẑ_j for j > i, u′_i = ẑ_i and u′_j = u_j for j ≠ i, where ẑ_i = max{z_i, l_i} ∀i = 1, ..., p.
Proof. We need to prove that B_i(z) ∩ B_j(z) = ∅ for all i ≠ j; assume without loss of generality that i < j and let x ∈ B_i(z) ∩ B_j(z). As x ∈ B_i(z) and j > i, then x_j ≥ z_j. As x ∈ B_j(z), then x_j < z_j. This is a contradiction, so the sets are pairwise disjoint and the first part is proved.
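A minimal sketch of the p-partition of Definition 1, representing boxes as pairs of tuples with the half-open convention l ≤ x < u (the representation and helper name are assumptions for illustration):

```python
# Sketch of p-partition: split box [l, u) at point z into the boxes
# B_1(z), ..., B_p(z) of Definition 1.  B_0(z) (the region weakly
# dominated by z) is discarded, and empty boxes are dropped.

def p_partition(l, u, z):
    p = len(l)
    zhat = tuple(max(zi, li) for zi, li in zip(z, l))  # zhat_i = max{z_i, l_i}
    boxes = []
    for i in range(p):  # direction i+1 in the paper's 1-based notation
        new_l = tuple(l[j] if j <= i else zhat[j] for j in range(p))
        new_u = tuple(zhat[j] if j == i else u[j] for j in range(p))
        if all(a < b for a, b in zip(new_l, new_u)):  # keep non-empty boxes
            boxes.append((new_l, new_u))
    return boxes
```

Enumerating lattice points of a small example confirms the resulting boxes are pairwise disjoint and cover exactly B \ B_0(z).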
...the computational execution time grows exponentially. For p-partition, we apply a variant of the redundancy elimination (RE) used in full p-split. In the RE method, when a box A is contained in a box B, we eliminate A from the priority queue of boxes to explore. For a deep study of how this method works, see [15]. Nevertheless, it is not possible to apply this idea directly to p-partition because the lower bounds of the new boxes differ. We join boxes to overcome this obstacle. Once we introduce join boxes, we can no longer guarantee that all the boxes are pairwise disjoint, but we can guarantee that the new box does not exclude any non-dominated point. We prove this in the next proposition. The priority of the new box C can be greater than that of the previous boxes A and B, which means that box C will be preferred for analysis in subsequent iterations.
Proposition 2. With the conditions of Definition 3, it holds that A ∪ B ⊂ C, which means that analyzing box C replaces the analysis of the other two boxes.
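The join operation of Definition 3 (stated at the end of this section) can be sketched as follows; the helper name and tuple-pair box representation are assumptions:

```python
# Sketch of the join box of Definition 3: given A = [l_a, u_a] and
# B = [l_b, u_b] with u_a <= u_b componentwise, C takes the componentwise
# minimum of the lower bounds and the upper bound of B.

def join_boxes(a, b):
    """Return C = [min(l_a, l_b), u_b]; assumes u_a <= u_b componentwise."""
    (la, ua), (lb, ub) = a, b
    assert all(x <= y for x, y in zip(ua, ub))
    lc = tuple(min(x, y) for x, y in zip(la, lb))
    return (lc, ub)
```

Since the lower bound of C is below both lower bounds and its upper bound is above both upper bounds, A ∪ B ⊆ C, which is the content of Proposition 2.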
We are now ready to present in Algorithm 4 the updating procedure of Line 14 in Algorithm 2. It is divided into two parts. In the first part (Lines 1 to 9), the boxes in L are split using p-partition. Some of the new boxes may be empty. The non-empty boxes are inserted into the corresponding L_i according to their direction (Line 6). The priority function used for the boxes is detailed in the next subsection. The second part (Lines 10 to 17) filters the redundant boxes using RE, but taking into account that if a domination relation between two upper bounds is detected, the boxes are joined and the new box replaces the older ones (Lines 12 to 15). We finish this subsection by proving the correctness and termination of TPA.

Proposition 3. Algorithm TPA never finds the same non-dominated point twice.
Proof. Let B be the box being explored in TPA and z the non-dominated point found by solving P(B.u, ω) in Line 11 of Algorithm 2. The definition of the mathematical program P ensures that z < B.u, and the use of p-partition with each box B′ such that z < B′.u discards B′_0(z), which is the only box that can contain z. The upper bound of a join box is always the upper bound of a previous box. Thus, after the updating procedure, the remaining boxes do not contain z and z cannot be found again.
Theorem 1. TPA terminates after finding all the non-dominated points of the MOCO problem if the time limit is not set.
Proof. The number of non-dominated points is finite in MOCO problems. By Proposition 3, no point is found twice. Thus, TPA terminates. Proposition 2 ensures that the whole Pareto front is found if enough time is given.

(Algorithm 4, recoverable fragment: for each non-empty box B_i(z), set B_i(z).priority ← reduced scaled(B, i, z).)

A new priority function
As the third contribution, we define a new priority function for the boxes. We want to give more priority to boxes having a larger region where non-dominated points can potentially exist. Given a non-empty box B = [l, u], we define the scaled function as scaled(B) = ∏_{i=1}^{p} (u_i − l_i)/(z_i^N − z_i^I). Note that scaled(B) > 0. In the case of an empty box, we define scaled(B) = 0. We assume, without loss of generality, that z_i^N > z_i^I ∀i; otherwise, objective i always takes the same value and can be discarded.
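Assuming the scaled function is the normalized volume of the box, scaled(B) = ∏_i (u_i − l_i)/(z_i^N − z_i^I) (an assumption consistent with the normalization by the nadir-ideal ranges described above), it can be sketched as:

```python
# Sketch of the scaled priority: the volume of box [l, u] normalized by the
# nadir-ideal range in each objective (assumed form, for illustration).

def scaled(l, u, z_ideal, z_nadir):
    """Normalized volume of box [l, u]; 0 for an empty box.
    Requires z^N_i > z^I_i for every objective i."""
    if any(li >= ui for li, ui in zip(l, u)):
        return 0.0  # empty box
    val = 1.0
    for li, ui, zi, zn in zip(l, u, z_ideal, z_nadir):
        val *= (ui - li) / (zn - zi)
    return val
```

By construction the value lies in (0, 1] for any non-empty box inside the ideal-nadir region, so it is directly usable as a priority.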
In some boxes, there is a region that cannot contain any non-dominated point. In such cases, the scaled function overestimates the priority of the box. We define a new priority function to correct this overestimation. First, we need some results.

Proof. From the definition of box, z < u. Let x ∈ [l, z]. Then, l_i ≤ x_i < z_i < u_i ∀i = 1, ..., p. In particular, x_p < z_p. This means that x ∈ B_p(z), so [l, z] ⊆ B_p(z).
If z = l, we have [l, z] = ∅ and B_i(z) = ∅ for all i = 1, ..., p. The box [l, z], if it is not empty, contains points dominating z. Thus, if z is non-dominated, the box [l, z] does not contain any non-dominated point; otherwise z would be dominated and we would have a contradiction.

Definition 4. Let B = [l, u] be a box, z < B.u a non-dominated point, and B_i(z) for i = 1, ..., p the boxes with direction i after p-partition. Then, for every i = 1, ..., p, we define the reduced scaled function as reduced scaled(B, i, z) = scaled(B_i(z)) − scaled([l, z]) if i = p, and reduced scaled(B, i, z) = scaled(B_i(z)) otherwise. Note that some of the boxes after p-partition may be empty.
Proposition 5. The reduced scaled value is always non-negative.

Proof. For i = 1, ..., p − 1, reduced scaled(B, i, z) = scaled(B_i(z)) ≥ 0. If z = l, then [l, z] = ∅ and reduced scaled(B, p, z) = scaled(B_p(z)) ≥ 0. Let us now assume that z ≠ l. By Proposition 4, [l, z] ⊆ B_p(z), so scaled([l, z]) ≤ scaled(B_p(z)), and reduced scaled is non-negative.

We use the reduced scaled function as the priority value of a box.

Computational results
In this section, we present a deep experimental study divided into four subsections. The first subsection describes the benchmark of instances we have used. The instances are grouped into categories; each category has a fixed number of variables. The categories are also grouped into classes, where each class corresponds to a different MOCO problem. We consider this benchmark appropriate because the algorithms may perform differently on different MOCO problems. The second subsection describes the parameters for the three algorithms used in the comparison. The third subsection shows the results of the computational experiments. For each instance and algorithm, the results are displayed at several time points. We provide a deep analysis of the results in the last subsection, including a statistical validation.

Benchmark
The benchmark we use is divided into three classes of well-known multiobjective problems. The first is the multiobjective knapsack problem (KP), the second is the multiobjective assignment problem (AP), and the third class corresponds to multiobjective integer linear programming problems (ILP). All the details of the formulation of the three classes can be found in [14], at URL http://home.ku.edu.tr/~moolibrary/, and in the supplementary material. For each class, different problem categories are generated based on problem size. There are 10 instances in each category. In total, there are 480 instances. As stated in Section 3, the metrics we use to compare the algorithms are ONVGR, HV, ∆*, and ε+. To calculate these quality indicators, we need to calculate the complete Pareto front. The details of these experiments are shown in the supplementary material of the paper.

Algorithms and parameters
We downloaded and used the source code provided by Ceyhan et al. [1]. This algorithm will be called ceyhan henceforth. The algorithms of Holzmann and Smith [11] (called holzmann) and TPA were programmed in C++. Since holzmann and TPA are built on the same framework, we use the same filtering process of the solutions in both of them, the RE method, in order to make a fair comparison and observe the differences in performance due to our new contributions.
The computer used for the experiments is a multicore machine with four Intel Xeon CPUs (E5-2670 v3) at 3.1 GHz, a total of 48 cores, 64 GB of memory and Ubuntu 16.04 LTS.For each run we used only 1 core and 2 GB of RAM.
The three algorithms use CPLEX 12.6.2 as the ILP solver. We set the CPLEX parameters CPX_PARAM_EPGAP = CPX_PARAM_EPAGAP = CPX_PARAM_EPINT = 0 and CPX_PARAM_PARALLELMODE = CPX_PARAM_THREADS = 1. Parameter CPX_PARAM_EPINT indicates the integrality tolerance for the solution variables; its default value is 10^−5. Parameter CPX_PARAM_EPGAP sets a relative tolerance on the gap between the best integer objective and the objective of the best node remaining in the tree used in branch and cut; the default value is 10^−4. CPX_PARAM_EPAGAP sets an absolute tolerance on the same gap; the default value is 10^−6. CPX_PARAM_PARALLELMODE is set to deterministic mode, which means that multiple runs with the same model and the same parameter settings on the same platform will reproduce the same solution path and results. Finally, CPX_PARAM_THREADS sets the number of parallel threads invoked by any CPLEX parallel optimizer; in this case, it is set to 1.
Although the three algorithms are deterministic and the time limit is fixed, the number of solutions found can differ between executions because CPLEX manages some internal parameters, such as the remaining available memory of the machine, which can influence the tree exploration to obtain the next solution. To smooth out these differences, we performed 30 executions for each instance and algorithm, and then reported the average values.

Summary of the results
We calculated the four metrics for each algorithm (ceyhan, holzmann, TPA) at fixed times in the set {10, 60, 300, 900}, measured in seconds. We have chosen this set of cut-times to observe the performance of each algorithm at the very beginning of the run (just 10 s) and after a longer (but still short) time (900 s), plus some intermediate times.
Tables 1 and 2 present the results for each algorithm and metric. There are two columns per combination. The first one contains the number of instances in which the algorithm has the best average value (sometimes shared with another algorithm). The second column contains the number of instances in which the algorithm is the only one obtaining the best average value. The results are displayed for the three classes of the benchmark (AP, ILP and KP) at four cut-points each, and the total sum is shown in the last four rows of the table. Analyzing the ONVGR metric, we can see that holzmann and TPA provide the highest number of solutions. In fact, TPA is better in all the classes of the benchmark. If we observe the row with the total sum of the three classes, the newly proposed algorithm is clearly the best. Sometimes the number of solutions is not the most important issue. In fact, the hypervolume is considered one of the best metrics to measure the spread over the objective space. Looking at the results for HV, we see that in all the classes the total hypervolume reached by TPA is also maximum. It is the clear winner, with a large difference with respect to the second-best algorithm, holzmann. Note that ceyhan has the best hypervolume in only a few cases. For instance, in the ILP class, at 10 seconds, ceyhan has the best value 43 times, but in only 7 cases is it the unique winner; in the other 36 instances, its HV is similar to that of other algorithms. The large differences in HV may be explained by the fact that, overall, ceyhan finds far fewer solutions than the others and, consequently, reaches a worse total hypervolume. Ceyhan is good at finding a few solutions quickly, with a very good general spread, as we can see in the results for ∆* in Table 2, where it is the clear winner. For the last metric, ε+, Table 2 shows again that TPA has the best results in all cases.
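The best/excl counting convention of Tables 1 and 2 can be sketched as follows (hypothetical helper, not the authors' tooling):

```python
# Sketch: for each instance, count how often each algorithm attains the
# best average metric value (best), and how often it does so alone (excl).
# Assumes higher metric values are better, as for ONVGR and HV.

def best_counts(results):
    """results: dict algorithm -> list of per-instance average values.
    Returns dict algorithm -> (best, excl)."""
    algs = list(results)
    n = len(next(iter(results.values())))
    counts = {a: [0, 0] for a in algs}
    for i in range(n):
        top = max(results[a][i] for a in algs)
        winners = [a for a in algs if results[a][i] == top]
        for a in winners:
            counts[a][0] += 1
            if len(winners) == 1:
                counts[a][1] += 1
    return {a: tuple(c) for a, c in counts.items()}
```

A shared best value thus increments every winner's first column but nobody's second column, which is why the two columns of the tables can differ substantially.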
As a summary of the results shown in Tables 1 and 2, we see that holzmann and TPA provide better performance than ceyhan. This means that the structure of the algorithm has an influence on the final results. While ceyhan is designed to find a few well-spread solutions quickly, the dispersion of the solutions in the objective space is not as good once the search progresses. Ceyhan makes more than one call to the ILP solver at each iteration because it uses a subset of vectors that depends on the number of solutions found so far. We noticed in the experiments that these multiple calls to the solver per iteration seem to degrade performance as the search progresses. The other two methods (holzmann and TPA), based on the framework described in Algorithm 1, seem to be more efficient for longer execution times. There is an exception in the results for ∆*, where ceyhan is the clear winner. Comparing holzmann and TPA, the latter obtains better results. Although these two algorithms are based on the same framework and use the same mathematical program, TPA changes the way in which the next boxes to explore are selected: by alternating directions, using p-partition, and assigning a different priority value to each box. The results demonstrate that the three new contributions described in this paper have a positive effect on the anytime performance.

Detailed analysis
This subsection is divided into three third-level sections. In the first, we rank the algorithms according to their performance on the instances; clarifying graphs are shown at ten regular cut-points from 90 s to 900 s. In the second, we show the evolution of the metrics over time for three selected instances; in this way, we observe the anytime performance of the algorithms for each metric. The last one provides the results of non-parametric statistical tests to check whether the differences are statistically significant. In the supplementary material attached to this paper, we group the instances into categories and show tables with the results for the four metrics at four time points.

Ranking the algorithms in each class of instances
To better measure the performance of the algorithms, we computed a rank for each of them in each instance. We did this for ONVGR, HVR, ∆*, and ε+, at different stopping times from 90 s to 900 s, with a step of 90 s. For example, fixing a time limit of 360 s, we compared the average ONVGR value of the 30 runs of each algorithm in the first instance and assigned ranks from 1 to 3, with rank 1 given to the algorithm with the best value of the metric. We did the same in the rest of the cases. In the case of ties, we took the average of the tied ranks (e.g., 1.5 for a tie between the first two algorithms).
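The tie-averaged ranking described above can be sketched as follows (hypothetical helper for illustration):

```python
# Sketch of tie-averaged ranking: assign ranks 1..n to competing values,
# averaging the ranks within each tie group.

def average_ranks(values, lower_is_better=False):
    """Return the rank of each value; by default higher values are better
    (as for ONVGR and HVR), while lower is better for Delta* and epsilon+."""
    order = sorted(values, reverse=not lower_is_better)
    ranks = []
    for v in values:
        first = order.index(v) + 1               # best rank in the tie group
        count = order.count(v)                   # size of the tie group
        ranks.append(first + (count - 1) / 2.0)  # average over the group
    return ranks
```

For instance, two algorithms tied for the best ONVGR both receive rank 1.5 and the third receives rank 3.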
There are some time points at which one algorithm may have finished its execution and others not. In those cases, a rank value of 1 is assigned to the algorithm that finished and the following ranks are distributed among the rest. When all the algorithms have finished at a given time in all the runs, the instance is discarded. The results of Figures 2 and 3 indicate that the best average rank values correspond to TPA in most of the cases, with important differences with respect to the other algorithms in some of them.

Search progress
In this section we want to show the progress of the search on some selected instances, to further illustrate how the algorithms evolve and behave at any time.For each class of the benchmark, the instance which requires the longest time to be solved is selected.
Figure 4 shows the behavior of ONVGR and HVR for the three selected instances: KP p-3 n-100 ins-4, which belongs to the multiobjective knapsack problem group and has 100 variables; AP p-3 n-50 ins-4, from the assignment problem group, which has 50 agents; and ILP p-4 n-80 m-40 ins-9, from the ILP group, which has dimension 4, 80 variables, and 40 constraints. Figure 5 displays the behavior for the other metrics. The x-axis shows the execution time in seconds and the y-axis the corresponding quality metric. In this case, the time points are taken every 5 s, up to 900 s. For these particular instances, TPA provides the highest number of solutions at any time, HVR is quite similar for TPA and holzmann, ∆* is better for ceyhan at most cut-points, and ε+ is similar for the three methods. We observe a sharp increase in HVR during the first minutes and then a stabilization of the value towards the end. We also see that the evolution of ∆* in Figure 5 is not monotonic; this is a consequence of ∆* not being Pareto compliant.

Statistical validation
In order to check whether the observed differences are statistically significant, we applied the non-parametric Friedman test to compare the three algorithms. We applied the test for each metric and cut-point in the set {90, 180, ..., 900}. In the cases in which the p-value is below the significance level α = 0.01, there is strong evidence that the performances of the algorithms differ. To pinpoint the differences, we performed a post hoc analysis using the Nemenyi multiple comparison test. This test compares every pair of algorithms looking for significant differences in a paired test over the 480 instances, as well as within each class. We selected only the cases in which one method is significantly different from the other two, and then applied the Wilcoxon signed rank test to confirm that the corresponding algorithm is the best. All these tests are available in the R package PMCMRplus.
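The statistical tests were run with the R package PMCMRplus; purely for illustration, the Friedman statistic underlying the first test can be computed in a few lines (a sketch, not the package implementation):

```python
# Sketch of the Friedman chi-squared statistic: rows are instances, columns
# are the k algorithms, and metric values are ranked within each row with
# tie-averaged ranks (lower value = rank 1).

def friedman_statistic(table):
    n, k = len(table), len(table[0])
    rank_sums = [0.0] * k
    for row in table:
        srt = sorted(row)
        for j, v in enumerate(row):
            first = srt.index(v) + 1
            count = srt.count(v)
            rank_sums[j] += first + (count - 1) / 2.0  # tie-averaged rank
    # chi-squared approximation with k - 1 degrees of freedom
    return 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)
```

When every instance ranks the algorithms identically the statistic is maximal for the given n and k, and when all values tie it is zero, matching the intuition that the test measures consistent rank differences.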
A summary of the results of the tests is shown in Table 3. We use the initial of the algorithm (C, H or T) in the cases where an algorithm has the best performance with a significance level α = 0.01 in the three tests (Friedman, Nemenyi and Wilcoxon). We highlight in boldface the cases in which the p-value is below 0.001. The first conclusion is that TPA is almost always the best algorithm. The second conclusion is that holzmann is never the best for any metric. Another conclusion is that the behavior of the algorithms is problem-dependent. We observe that, for ILP and KP problems, ceyhan is significantly better with respect to ∆*. In AP instances, ceyhan is the best in the first 180 s, but the differences with TPA are not statistically significant; from 270 s to 900 s, ceyhan is surpassed by TPA. Grouping all the instances, the tests are statistically significant at level α = 0.001 at every cut-point for three of the four metrics, in favor of TPA. The algorithm proposed in this paper thus shows a statistically significant difference with holzmann for these quality indicators.

Conclusions and future work
We have designed a new anytime algorithm that allows the Pareto front to be calculated in multiobjective combinatorial optimization problems. The front is well-spread at any time and has good values for the metrics ONVGR, HV, ∆*, and ε+. We have first carried out an exhaustive study of the literature, and have identified two state-of-the-art methods which produce a well-spread set of non-dominated points at any time and are not outperformed by other methods. They were proposed by Ceyhan et al. [1] and Holzmann and Smith [11].

Table 3: Statistical tests for each class and metric at different time points. The initial of the best algorithm is shown when the differences with the others are statistically significant at level α = 0.01. If the differences are also significant at level α = 0.001, the initial of the algorithm is marked in bold.

d̄ = (1/|N|) Σ_{x∈N} d(x, N), where d(x, N) = min_{y∈N, y≠x} d*(x, y) and d* denotes the Euclidean distance between two p-dimensional vectors. We define the general spread as ∆*(N) = (Σ_{i=1}^{m} d(e_i, N) + Σ_{x∈N} |d(x, N) − d̄|) / (Σ_{i=1}^{m} d(e_i, N) + |N| d̄), where e_1, ..., e_m are the extreme points of the Pareto front.
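The general spread ∆* defined above can be computed directly from its formula; a minimal sketch (for extreme points the nearest-point distance over the whole set N is used, an assumption consistent with the usual definition):

```python
import math

# Sketch of the general spread Delta*: 0 for an ideally uniform front that
# reaches the extreme points; larger values indicate a worse spread.

def general_spread(N, extremes):
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    def d(x):  # distance from x in N to its nearest distinct neighbor
        return min(dist(x, y) for y in N if y != x)
    dbar = sum(d(x) for x in N) / len(N)
    ext = sum(min(dist(e, y) for y in N) for e in extremes)
    num = ext + sum(abs(d(x) - dbar) for x in N)
    den = ext + len(N) * dbar
    return num / den
```

For a perfectly uniform three-point front containing both extremes, every neighbor distance equals d̄ and the extreme terms vanish, so ∆* = 0.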

4.2. Selection of the next search zone to explore
Line 4 of Algorithm 1 proposes the extraction of the next search zone to analyze.

Definition 3. Let A = [l^a, u^a] and B = [l^b, u^b] be two boxes with u^a ≤ u^b. We define the join box of the two boxes as a new box C = [l^c, u^c], where l^c_i = min{l^a_i, l^b_i} and u^c_i = u^b_i ∀i = 1, ..., p.

Proposition 4. Let B = [l, u] be a box and z ∈ B. Then the box [l, z] ⊆ B_p(z), where p is the dimension of the objective space.

Figure 2: Average rank values for the algorithms at 10 cut-points, using the metrics ONVGR and HVR. The lower the rank, the better the algorithm.

Figure 3: Average rank values for the algorithms at 10 cut-points, using the metrics ∆* and ε+. The lower the rank, the better the algorithm.

Figure 4: ONVGR and HVR as a function of time in 3 selected instances. The higher the value, the better the algorithm.

Table 1: Number of times that each algorithm obtains the best average value without exclusivity (best) and with exclusivity (excl), for ONVGR and HV, for different execution times, and for each class in the benchmark. The total sum is displayed in the last four rows.

Table 2: Number of times that each algorithm obtains the best average value without exclusivity (best) and with exclusivity (excl), for ∆* and ε+, for different execution times, and for each class in the benchmark. The total sum is displayed in the last four rows.