On Necessary Optimality Conditions for Sets of Points in Multiobjective Optimization

Taking inspiration from what is commonly done in single-objective optimization, most local algorithms proposed for multiobjective optimization extend classical iterative scalar methods and produce sequences of points that converge to single efficient points. Recently, a growing number of local algorithms that build sequences of sets have been devised, in keeping with the real nature of multiobjective optimization, where the aim is to approximate the efficient set. This calls for a new analysis of the necessary optimality conditions for multiobjective optimization. We explore conditions for sets of points that share the same features as the necessary optimality conditions for single-objective optimization. On the one hand, from a theoretical point of view, these conditions define properties that are necessarily satisfied by the (weakly) efficient set. On the other hand, from an algorithmic point of view, any set that does not satisfy such conditions can be easily improved. We analyse both the unconstrained and the constrained case and give some examples.


Introduction
We consider multiobjective optimization problems of the following form:

$$\min_{x \in \mathcal{F}} \; f(x) = (f_1(x), \ldots, f_m(x))^\top, \tag{1}$$

where $f_i : \mathbb{R}^n \to \mathbb{R}$, for $i = 1, \ldots, m$, and $\mathcal{F} \subseteq \mathbb{R}^n$. The set $\mathcal{F}$ is commonly known as the decision space of the problem, while the image of the points in $\mathcal{F}$ through the function $f$ is called the image space (or criterion space). To characterize the solutions of Problem (1) we use the standard optimality notion based on the componentwise ordering in the image space. In particular, given two points $x', x'' \in \mathcal{F}$, we say that $x'$ dominates $x''$ and $f(x')$ dominates $f(x'')$ if $f_i(x') \le f_i(x'')$, $i = 1, \ldots, m$. A subset $N$ of $\mathbb{R}^m$ is stable with respect to the dominance relation $\le$, or simply stable, if for any $z, z' \in N$ with $z \ne z'$ we have $z \not\le z'$. We also say that a subset $N'$ of $\mathbb{R}^m$ is weakly stable if for any $z, z' \in N'$ we have $z \not< z'$.
Definition 1 (efficient and nondominated point). A feasible point $x^* \in \mathcal{F}$ is called an efficient point for Problem (1) if there is no point $x \in \mathcal{F}$ such that $f_i(x) \le f_i(x^*)$ for all $i = 1, \ldots, m$ and $f_k(x) < f_k(x^*)$ for some $k \in \{1, \ldots, m\}$.
The image $f(x^*)$ is called a nondominated point.
A feasible point $\bar{x} \in \mathcal{F}$ is called a weakly efficient point for Problem (1) if there is no point $x \in \mathcal{F}$ such that $f_i(x) < f_i(\bar{x})$ for all $i = 1, \ldots, m$.
The image $f(\bar{x})$ is called a weakly nondominated point.
Algorithms for multiobjective optimization aim at approximating the efficient set and the set of all nondominated points, called the nondominated set.
Definition 2 (efficient and nondominated set). The set of (weakly) efficient points of Problem (1) is called the (weakly) efficient set and is denoted by $E$ ($E_w$). The image set of all (weakly) efficient points is the (weakly) nondominated set and is denoted by $N$ ($N_w$) (also known, specifically for $m = 2$, as the Pareto front).
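The dominance notions above can be illustrated on a finite list of image points. The following is a minimal sketch, not part of the original analysis: the function names and the sample points are hypothetical, and a finite list only stands in for the image set $f(\mathcal{F})$.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(z: Sequence[float], w: Sequence[float]) -> bool:
    """z <= w componentwise, with strict inequality in at least one component."""
    return all(a <= b for a, b in zip(z, w)) and any(a < b for a, b in zip(z, w))


def strictly_dominates(z: Sequence[float], w: Sequence[float]) -> bool:
    """z < w in every component."""
    return all(a < b for a, b in zip(z, w))


def nondominated(points: List[Point]) -> List[Point]:
    """Points of the list that no other point of the list dominates."""
    return [z for z in points if not any(dominates(w, z) for w in points)]


def weakly_nondominated(points: List[Point]) -> List[Point]:
    """Points of the list that no other point of the list strictly dominates."""
    return [z for z in points if not any(strictly_dominates(w, z) for w in points)]


# Hypothetical bi-objective image points.
sample = [(1, 3), (2, 2), (3, 1), (2, 3), (3, 3)]
```

On this sample, `(2, 3)` is dominated by `(2, 2)` but not strictly dominated by any point, so it is weakly nondominated without being nondominated.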
As in single-objective optimization, we can divide algorithms for multiobjective optimization into global and local algorithms. Global algorithms are based on the use of sequences of sets, in order to get information on the global behavior of the problem. What can be considered a proper approximation of the nondominated set is a debated topic (see [22] for an overview). We recall, for example, the concept of enclosure defined in [8,9,10,11], which is essentially a well-structured set in the image space, for example a union of boxes, that contains the nondominated set as a subset.
On the other hand, taking inspiration from what is commonly done in single-objective optimization, local algorithms for multiobjective optimization exploit the fact that, when a point does not satisfy suitable necessary optimality conditions, it can be easily improved, in the sense that a "better" new point can be easily defined with respect to a specific quality measure, usually the objective function value. The majority of algorithms proposed in this respect, both in the unconstrained and constrained case, extend classical iterative scalar optimization algorithms, such as steepest descent [13], Newton [12,16], external penalty [14] and interior point [15] methods, just to name a few. The mentioned approaches produce sequences of points that converge to single efficient points. In particular, at every iteration, these algorithms look for a new "improved" point, namely a point that dominates the previous one.
Recently, local algorithms that build sequences of sets have been proposed [6,5,3,4,18,17], going back to the real aim of multiobjective optimization, which is to approximate a set (and not a single point). Producing sequences of sets instead of single points requires exploring new algorithmic approaches. Indeed, at every iteration, such algorithms look for a new "improved" set, namely either a larger set or a set containing points that dominate at least one point from the previous set.
The aim of this work is to explore the necessary optimality conditions in multiobjective optimization from a new perspective, namely the definition of conditions associated with a set of points, instead of a single point. This can be of interest both from a theoretical and an algorithmic point of view. As in single-objective optimization, these conditions play a twofold role:
• from a theoretical point of view, they define properties that are necessarily satisfied by a global solution, namely by the (weakly) efficient set $E$ ($E_w$);
• from an algorithmic point of view, they characterize points that cannot be "easily" improved, in the sense, for example, that the standard use of first order information may not be enough to get a new better set.
Therefore, in the multiobjective optimization context, we look for conditions for sets of points that approximate the nondominated set and share the above two features. Such conditions, as stated before, should help to define algorithms that build tailored sequences of sets.
In order to introduce our analysis, we start with the following definition.
Definition 3. Let $S \subseteq \mathcal{F}$ be a set whose image points with respect to $f$ form a weakly stable set. For any $x \in S$, we define $T_S(x) = \{I_1, \ldots, I_p\}$ as a collection of sets of indices, where
(i) $I_j \subseteq \{1, \ldots, m\}$, $j = 1, \ldots, p$, are subsets of objective indices such that there is no point $y \in S$ for which $f_i(y) < f_i(x)$ for all $i \in I_j$;
(ii) $\bigcup_{i \ne j} I_i \not\subseteq I_j$ for all $j \in \{1, \ldots, p\}$.
Remark 1. Note that Definition 3 allows the choice $T_S(x) = \{I_1\}$ with $I_1 = \{1, \ldots, m\}$, for all $x \in S$. However, in case we are able to detect at least one index set $I_j$ that is a proper subset of $\{1, \ldots, m\}$ for a point $x \in S$, we can also choose a different $T_S(x)$, not containing the whole set of objective functions, as condition (ii) requires. This means that either $T_S(x)$ consists of the whole set $\{1, \ldots, m\}$ only, or it is made of proper subsets of $\{1, \ldots, m\}$.
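For a finite set $S$ given through its objective values, candidate index sets can be enumerated by brute force. The sketch below is only illustrative and rests on an assumption: an index set $I$ is taken to be admissible for $x$ when no $y \in S$ satisfies $f_i(y) < f_i(x)$ for every $i \in I$, which is the reading of condition (i) used in the proofs later on. Condition (ii) is not checked, indices are 0-based, and the function names and data are hypothetical.

```python
from itertools import combinations
from typing import List, Sequence, Tuple


def is_admissible(I: Sequence[int], fx: Sequence[float],
                  images: Sequence[Sequence[float]]) -> bool:
    """True if no image fy in `images` is strictly smaller than fx on all of I."""
    return not any(all(fy[i] < fx[i] for i in I) for fy in images)


def minimal_admissible_sets(fx: Sequence[float],
                            images: Sequence[Sequence[float]]) -> List[Tuple[int, ...]]:
    """Enumerate admissible index sets that contain no smaller admissible set."""
    m = len(fx)
    found: List[Tuple[int, ...]] = []
    for size in range(1, m + 1):
        for I in combinations(range(m), size):
            # Skip supersets of an index set that was already accepted.
            if any(set(J) <= set(I) for J in found):
                continue
            if is_admissible(I, fx, images):
                found.append(I)
    return found


# Hypothetical bi-objective images of three points forming a stable set.
images = [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0)]
```

With these values, the first point needs only the first objective, the last point only the second, and the middle point needs both. The enumeration is exponential in $m$, so it is meant for small illustrative instances only.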
Example 1. In Figure 1, the image space of a bi-objective instance is depicted. The image of the feasible points through $f(x) = (f_1(x), f_2(x))$ is represented by $f(\mathcal{F})$ and the nondominated set $N$ is highlighted with a bold line. Let $S = \{x_1, x_2, x_3\}$. The images $f(x_1)$, $f(x_2)$, $f(x_3)$ belong to $N$, so that in particular they form a stable set. The following sets $T_S(x_i)$, $i = 1, 2, 3$, satisfy Definition 3:

$$T_S(x_1) = \{\{1\}\}, \qquad T_S(x_2) = \{\{1, 2\}\}, \qquad T_S(x_3) = \{\{2\}\}.$$

We can notice that since $f_1(x_1)$ is the minimum with respect to $f_1(x)$ for $x \in \mathcal{F}$, we have that $f(x_1)$ is a nondominated point in $S$ "thanks to" $f_1$, while $f_2$ does not play any role. Similarly, since $f_2(x_3)$ is the minimum with respect to $f_2(x)$, we have that $f(x_3)$ is nondominated with respect to the other points in $S$ since $f_2(x_3) < f_2(x_1)$ and $f_2(x_3) < f_2(x_2)$. On the other hand, $f(x_2)$ is nondominated with respect to the other two points as $f_1(x_2) < f_1(x_3)$ and $f_2(x_2) < f_2(x_1)$, meaning that $T_S(x_2)$ needs to include both indices.
Example 1 highlights that, in order to characterize an efficient point, the whole set of objective functions is not always needed. The obvious case is a point that is the minimum with respect to one objective function. This suggests looking for new necessary optimality conditions reflecting the fact that the efficiency of a point might be locally described by subsets of objective functions only, unlike in the classical conditions. This difference is more relevant when considering more than two objective functions, as the following example suggests.
)} is a stable set. The following choices of $T_S(x_i)$, $i = 1, \ldots, 4$, satisfy Definition 3: Note that, for example, the efficiency of $x_1$ with respect to the points in $S$ is described either by $f_2$ or by $f_5$, where the least value with respect to both $f_2$ and $f_5$ is attained. Furthermore, $f_4(x_1) \le f_4(x_i)$ for $i \ne 1$. This suggests that the characterization of the (weak) efficiency of $x_1$ can be described by the local behavior of these three functions only. Indeed, the behavior of, for example, $f_1$ can be ignored for $x_1$, as $f_1(x_1)$ is worse than the values attained at other points, like $f_1(x_2)$. In order to describe the efficiency of $x_3$, either we can consider $f_4$, for which $x_3$ attains its minimum value, or we can consider the subset made of $f_2$, $f_3$ and $f_3$, $f_5$. Indeed, the index 2 needs to belong to $I_2$ in order to have $x_3$ nondominated with respect to $x_2$, and the index 3 is needed in order to have $x_3$ nondominated with respect to $x_1$. The index 5 is needed in combination with 3 in order to have $x_3$ nondominated with respect to $x_4$. Note that in the definition of $T_S(x_i)$, $i = 1, \ldots, 4$, we are not considering the natural choice of the whole set of objective function indices $\{1, \ldots, 5\}$.

Unconstrained case
We start by analyzing the unconstrained multiobjective case, namely problems of the form

$$\min_{x \in \mathbb{R}^n} \; f(x) = (f_1(x), \ldots, f_m(x))^\top. \tag{2}$$

In the following, we assume that the objective functions $f_i : \mathbb{R}^n \to \mathbb{R}$, $i = 1, \ldots, m$, are continuously differentiable. In the literature, to characterize a single efficient point, the following condition has been proposed [21].
Subsequently, a Pareto stationary point is defined as follows.
if, for all $d \in \mathbb{R}^n$, an index $j \in \{1, \ldots, m\}$ exists such that $\nabla f_j(x)^\top d \ge 0$.
When dealing with the (weakly) efficient set, we can extend Proposition 1 as follows:
Proposition 2. If $E_w \subseteq \mathbb{R}^n$ is the weakly efficient set for Problem (2), then for all $x \in E_w$ and all $d \in \mathbb{R}^n$ there exists an index $i \in \{1, \ldots, m\}$ such that $\nabla f_i(x)^\top d \ge 0$.
According to Proposition 2, we can introduce the following definition:
Definition 5 (Standard-Pareto stationary set). Let $S \subseteq \mathbb{R}^n$ be a nonempty set such that $f(S)$ is a weakly stable set. We say that $S$ is a standard-Pareto stationary set for Problem (2) if, for all $x \in S$ and all $d \in \mathbb{R}^n$, there exists an index $i \in \{1, \ldots, m\}$ such that $\nabla f_i(x)^\top d \ge 0$.
As already mentioned in the introduction, the efficiency of a point within a set can be characterized by a subset of objective functions. This suggests that we can characterize an efficient set using stronger conditions than those in Proposition 2. In particular, there is no need to consider the whole set of objective functions. This can be important when the number of objective functions is large. In those cases, Definition 4 as well as Definition 5 poorly characterize efficient points and sets of efficient points. Indeed, Definition 4 equivalently states that a point is Pareto-stationary in case there is no direction that is a descent direction for every objective function, implying that the higher the number of objective functions, the easier it is to satisfy the definition. The same applies to Definition 5, as described in the following example.
Example 3. Let us consider a multiobjective problem with the following objective functions $f_i : \mathbb{R}^2 \to \mathbb{R}$, $i = 1, \ldots, 5$.
Their gradients are .
Note that $S$ satisfies Definition 5. Indeed, $x_1$ and $x_3$ are stationary with respect to functions $f_4$ and $f_5$ respectively, so that for any direction $d \in \mathbb{R}^n$ we have that $\nabla f_4(x_1)^\top d = 0$ and $\nabla f_5(x_3)^\top d = 0$. For $i = 2, 4$, we have that $\nabla f_1(x_i)$, $\nabla f_2(x_i)$ and $\nabla f_3(x_i)$ can be combined with positive coefficients to obtain the zero vector. From Gordan's theorem of the alternative (see Table 2.4.1 in [20] or Theorem 1 in the appendix) this implies that for any $d \in \mathbb{R}^n$ there is one index $j \in \{1, 2, 3\}$ such that $\nabla f_j(x_i)^\top d \ge 0$, $i = 2, 4$.
However, looking at the objective function values at the points in $S$, we can see that the set $S$ can be easily "improved", for example by moving point $x_2$ along a descent direction with respect to $f_5$, as will be clarified in Example 4.
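The appeal to Gordan's theorem in Example 3 can be spelled out as follows. This is the standard statement of the theorem; the matrix $A$, whose rows collect the gradients under consideration, and the multiplier vector $y$ are notation introduced here only for illustration.

```latex
% Gordan's theorem of the alternative: for A \in \mathbb{R}^{k \times n},
% exactly one of the two systems (a), (b) has a solution.
\text{(a)} \quad \exists\, d \in \mathbb{R}^n : \; A d < 0,
\qquad\qquad
\text{(b)} \quad \exists\, y \in \mathbb{R}^k : \; A^\top y = 0, \;\; y \ge 0, \;\; y \ne 0.

% Application: take the rows of A to be
% \nabla f_1(x_i)^\top, \nabla f_2(x_i)^\top, \nabla f_3(x_i)^\top.
% A positive combination of these gradients equal to zero solves (b),
% hence (a) has no solution, that is,
\forall\, d \in \mathbb{R}^n \;\; \exists\, j \in \{1, 2, 3\} : \;
\nabla f_j(x_i)^\top d \ge 0 .
```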
As shown in Example 3, there is room to improve the definition of standard-Pareto stationarity. One weakness of Definition 5 is that it does not fully exploit the fact that the considered set is a weakly stable set. In particular, it ignores the fact that not all the objective functions should necessarily be taken into account. Indeed, there exist objective functions that can be neglected in order to characterize the fact that a point in a weakly stable set is nondominated with respect to the others (as highlighted also in Example 2). In this respect, Definition 3 comes into play and allows us to state the following result, giving new, tailored and more flexible optimality conditions for Problem (2).
(ii) for all $x^* \in E_w$, for all $d \in \mathbb{R}^n$ and all $I \in T_{E_w}(x^*)$, an index $i \in I$ exists such that $\nabla f_i(x^*)^\top d \ge 0$.
Proof. Let $x^* \in E_w$. Then, no $x \in \mathcal{F}$ exists such that $f_i(x) < f_i(x^*)$, $i = 1, \ldots, m$, so that by Definition 3 we have $T_{E_w}(x^*) \ne \emptyset$ and (i) holds.
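Assume by contradiction that there exist $x^* \in E_w$, $d \in \mathbb{R}^n$ and $I \in T_{E_w}(x^*)$ such that

$$\nabla f_i(x^*)^\top d < 0 \quad \forall i \in I. \tag{3}$$

We will get a contradiction by showing that there exists a weakly efficient point for Problem (2) not belonging to $E_w$. Assuming (3), for any $i \in I$ there exists $\alpha_i > 0$ such that

$$f_i(x^* + \alpha d) < f_i(x^*) \quad \forall \alpha \in (0, \alpha_i].$$

Now consider any $\alpha \in (0, \min_{i \in I} \alpha_i]$. We can write

$$f_i(x^* + \alpha d) < f_i(x^*) \quad \forall i \in I. \tag{4}$$

For all $y \in E_w$, from the definition of $I$ it follows that an index $\hat{i} \in I$ exists such that

$$f_{\hat{i}}(x^*) \le f_{\hat{i}}(y).$$

Furthermore, from (4) we also have

$$f_{\hat{i}}(x^* + \alpha d) < f_{\hat{i}}(x^*) \le f_{\hat{i}}(y). \tag{5}$$

Since this holds for all $y \in E_w$, it follows that $(x^* + \alpha d) \notin E_w$.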
Moreover, for all $u \in \mathbb{R}^n \setminus E_w$, there exists $y \in E_w$ such that $f_i(y) < f_i(u)$ for all $i = 1, \ldots, m$. In particular, using (5), we have

$$f_{\hat{i}}(x^* + \alpha d) < f_{\hat{i}}(y) < f_{\hat{i}}(u).$$

Since this holds for all $u \in \mathbb{R}^n \setminus E_w$, it follows that $x^* + \alpha d \in E_w$, getting a contradiction.
Remark 2. Proposition 3 allows us to define different necessary optimality conditions according to the choice of $T_{E_w}(x^*)$ made. In particular, the larger the number of subsets of indices within $T_{E_w}(x^*)$ and the smaller their cardinality, the stronger the conditions become.
From Proposition 3, we introduce the following new definition of Pareto stationary set.
Definition 6 (Pareto stationary set). Let $S \subseteq \mathbb{R}^n$ be a nonempty set such that $f(S)$ is a weakly stable set. We say that $S$ is a Pareto-stationary set for Problem (2) if, for all $x \in S$, all $d \in \mathbb{R}^n$ and all $I \in T_S(x)$, there exists an index $i \in I$ such that $\nabla f_i(x)^\top d \ge 0$.
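A numerical check of Definition 6 on a finite set can be sketched as follows. This is not the authors' procedure. It relies on a standard fact: the gradients $\nabla f_i(x)$, $i \in I$, admit a common descent direction exactly when the zero vector is not in their convex hull, in which case the negative of the minimum-norm element of the hull is such a direction. The minimum-norm element is approximated here with a plain Frank-Wolfe iteration. All names, tolerances and the toy gradients are assumptions of this sketch, and a result of `None` only means that no descent direction was found numerically.

```python
from typing import List, Optional, Sequence

Vector = List[float]


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    return sum(a * b for a, b in zip(u, v))


def min_norm_point(grads: Sequence[Sequence[float]],
                   iters: int = 500, tol: float = 1e-12) -> Vector:
    """Frank-Wolfe approximation of the min-norm point of conv{grads}."""
    v = list(grads[0])
    for _ in range(iters):
        # Vertex of the hull minimizing the linearization <v, g>.
        s = min(grads, key=lambda g: dot(v, g))
        gap = dot(v, v) - dot(v, s)
        if gap <= tol:
            break
        diff = [si - vi for si, vi in zip(s, v)]
        # Exact line search on the segment [v, s], clipped to [0, 1].
        t = min(1.0, max(0.0, -dot(v, diff) / dot(diff, diff)))
        v = [vi + t * di for vi, di in zip(v, diff)]
    return v


def common_descent_direction(grads: Sequence[Sequence[float]],
                             eps: float = 1e-8) -> Optional[Vector]:
    """Return d with <g, d> < 0 for every g in grads, or None if none is found."""
    d = [-vi for vi in min_norm_point(grads)]
    return d if all(dot(g, d) < -eps for g in grads) else None


def is_pareto_stationary_set(jacobians: Sequence[Sequence[Sequence[float]]],
                             T: Sequence[Sequence[Sequence[int]]]) -> bool:
    """jacobians[k][i] is the gradient of f_i at the k-th point of S;
    T[k] lists the index sets (0-based) chosen for that point."""
    for J, index_sets in zip(jacobians, T):
        for I in index_sets:
            if common_descent_direction([J[i] for i in I]) is not None:
                return False
    return True


# Hypothetical single point with two opposite gradients.
J = [[(1.0, 0.0), (-1.0, 0.0)]]
```

In this toy instance the point passes the test when both objectives are used, `is_pareto_stationary_set(J, [[(0, 1)]])`, and fails it when the index set `(0,)` alone is allowed, which mirrors the way smaller index sets strengthen the condition.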
Local algorithms for multiobjective optimization that aim at detecting one single efficient point are generally based on the optimality conditions reported in Proposition 1 (see, e.g., [13]). In particular, in case a point does not satisfy the Pareto-stationarity conditions, it is possible to define a direction that is a descent direction for each objective function, and is thus able to produce a new point that dominates the starting one.
When the aim is to produce a sequence of sets of points, local algorithms can be based on the conditions introduced in Proposition 3. In particular, these conditions on the one hand allow us to certify whether a set $S$ is Pareto-stationary. On the other hand, in case a set $S$ does not satisfy these conditions, we can easily either increase the cardinality of $S$ or replace some points in $S$ by new points dominating them, as shown in the next proposition.
Proposition 4. Let $S \subseteq \mathbb{R}^n$ be a non Pareto-stationary set for Problem (2). Then, $x \in S$, a direction $d \in \mathbb{R}^n$ and a stepsize $\alpha > 0$ exist such that $f(x + \alpha d)$ is non-dominated by any $f(y)$, $y \in S$. Namely, for all $y \in S$ there exists an index $i \in \{1, \ldots, m\}$ such that $f_i(x + \alpha d) < f_i(y)$. Furthermore, $f(x + \alpha d)$ is non-dominated by any $z \in \mathbb{R}^m$ that is dominated by some $f(y)$, $y \in S$.
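Proof. Since $S \subseteq \mathbb{R}^n$ is a non Pareto-stationary set, there exist $x \in S$, $d \in \mathbb{R}^n$ and $I \in T_S(x)$ such that $\nabla f_i(x)^\top d < 0$ for all $i \in I$.
Then, for all $i \in I$, there exists $\alpha_i > 0$ such that

$$f_i(x + \alpha d) < f_i(x) \quad \forall \alpha \in (0, \alpha_i].$$

Now consider any $\alpha \in (0, \min_{i \in I} \alpha_i]$. We can write

$$f_i(x + \alpha d) < f_i(x) \quad \forall i \in I. \tag{6}$$

For all $y \in S$, from the definition of $I$ it follows that there exists an index $\hat{i} \in I$ such that

$$f_{\hat{i}}(x) \le f_{\hat{i}}(y).$$

Furthermore, from (6) we also have

$$f_{\hat{i}}(x + \alpha d) < f_{\hat{i}}(x) \le f_{\hat{i}}(y). \tag{7}$$

Since this holds for all $y \in S$, it follows that $f(x + \alpha d)$ is non-dominated by any $f(y)$, $y \in S$. Now, consider any $z \in \mathbb{R}^m \setminus f(S)$ dominated by some $f(y)$, $y \in S$. Namely, $f_i(y) \le z_i$ for all $i = 1, \ldots, m$ and $z \ne f(y)$. Using (7), we have

$$f_{\hat{i}}(x + \alpha d) < f_{\hat{i}}(y) \le z_{\hat{i}}.$$

It follows that $f(x + \alpha d)$ is non-dominated by $z$.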
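The construction used in the proof can be turned into a small backtracking step. The sketch below is illustrative only: it assumes that a point $x \in S$, an index set $I$ and a direction $d$ that is a descent direction for every $f_i$, $i \in I$, are already available, and it halves the stepsize until every $f_i$, $i \in I$, decreases. The objective function, the set and all names are hypothetical, and indices are 0-based.

```python
from typing import Callable, List, Optional, Sequence

Vector = List[float]


def dominates(z: Sequence[float], w: Sequence[float]) -> bool:
    """z <= w componentwise, with strict inequality in at least one component."""
    return all(a <= b for a, b in zip(z, w)) and any(a < b for a, b in zip(z, w))


def improve_point(S: Sequence[Vector], f: Callable[[Vector], Sequence[float]],
                  x: Vector, d: Vector, I: Sequence[int],
                  alpha: float = 1.0, shrink: float = 0.5,
                  max_halvings: int = 50) -> Optional[Vector]:
    """Backtrack on alpha until f_i(x + alpha d) < f_i(x) for all i in I and the
    new image is not dominated by the image of any point of S."""
    fx = f(x)
    images = [f(y) for y in S]
    for _ in range(max_halvings):
        x_new = [xi + alpha * di for xi, di in zip(x, d)]
        f_new = f(x_new)
        decreased = all(f_new[i] < fx[i] for i in I)
        if decreased and not any(dominates(fy, f_new) for fy in images):
            return x_new
        alpha *= shrink
    return None


def f_toy(x: Vector) -> Sequence[float]:
    """Hypothetical bi-objective function on the real line."""
    return (x[0] ** 2, (x[0] - 2.0) ** 2)


# Images (1, 9) and (9, 1): a stable set that is not Pareto-stationary.
S = [[-1.0], [3.0]]
```

Calling `improve_point(S, f_toy, [-1.0], [1.0], (0,))` moves the first point to the origin, whose image `(0, 4)` dominates `(1, 9)`, so the new point can replace the old one in the set. Moving in the opposite direction never decreases the first objective, and the routine gives up.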
Example 4. Let us consider the multiobjective problem proposed in Example 3, where each point belonging to $S$ satisfies Definition 4. Assume that this set $S$ is produced by an algorithm that builds sequences of points. The use of the standard-Pareto-stationarity definition would make the algorithm stop. On the other hand, this would not be the case when using our new stationarity characterization given in Definition 6, with $T_S(x)$ specifically chosen.
Indeed, $f(S) \subset \mathbb{R}^5$ is the following stable set: and these are possible choices of $T_S(x_i)$, $i = 1, \ldots, 4$, satisfying Definition 3: Note that even if $x_1$ is stationary with respect to $f_4$, since $I_1 = \{1\}$ belongs to $T_S(x_1)$, there exist directions $d \in \mathbb{R}^n$ such that $\nabla f_1(x_1)^\top d < 0$, so that Definition 6 is not satisfied and $S$ is a non Pareto-stationary set for Problem (2). Therefore, from Proposition 4, the set $S$ can be easily "improved", as a direction $d \in \mathbb{R}^n$ and a stepsize $\alpha > 0$ exist such that $f(x_1 + \alpha d)$ is non-dominated by the image of any point in $S$. Similar arguments apply to the other points $x_i$, $i = 2, 3, 4$. For example, starting from $x_2$, we can improve $S$ in different ways, using descent directions for $f_1$ or $f_2$. Starting from $x_3$ we can move along a descent direction for $f_2$ or $f_5$, and starting from $x_4$ we can move along a descent direction for $f_3$.

Constrained case
In this section, we extend the above analysis to the constrained setting. We consider the following constrained multiobjective optimization problem, where the feasible set $\mathcal{F}$ is explicitly defined by inequality and equality constraints:

$$\min \; f(x) \quad \text{s.t.} \quad g(x) \le 0, \;\; h(x) = 0. \tag{8}$$

We assume that the functions $f : \mathbb{R}^n \to \mathbb{R}^m$, $g : \mathbb{R}^n \to \mathbb{R}^p$ and $h : \mathbb{R}^n \to \mathbb{R}^q$ are continuously differentiable.
Furthermore, we can define the standard-Pareto stationary set as follows.
Definition 7 (Standard-Pareto stationary set). Let $S \subseteq \mathbb{R}^n$ be a nonempty set such that $f(S)$ is a weakly stable set. We say that $S$ is a standard-Pareto stationary set for Problem (8) if every $x \in S$ satisfies the Fritz-John conditions, namely, multipliers $(\sigma, \lambda, \mu) \in \mathbb{R}^m \times \mathbb{R}^p \times \mathbb{R}^q$ exist, $(\sigma, \lambda, \mu) \ne (0, 0, 0)$, such that (9) holds.
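As for the unconstrained case, Definition 3 comes into play and allows us to state new, tailored and more effective optimality conditions for Problem (8). To this end we need an intermediate result, based on the sets $G(\cdot)$ and $H(\cdot)$ introduced in the following:
Definition 8. Given $x \in \mathcal{F}$, let $A(x) = \{i \in \{1, \ldots, p\} : g_i(x) = 0\}$. We define the sets $G(x)$ and $H(x)$ as follows:

$$G(x) = \{d \in \mathbb{R}^n : \nabla g_i(x)^\top d < 0, \; i \in A(x)\}, \qquad H(x) = \{d \in \mathbb{R}^n : \nabla h_j(x)^\top d = 0, \; j = 1, \ldots, q\}.$$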
We now extend the classical result for single-objective constrained optimization [1, Theorem 4.3.1] to the multiobjective case, using the sets introduced in Definition 3:
Lemma 1. Let $E_w \subseteq \mathcal{F}$ be the weakly efficient set for Problem (8) and let $x^* \in E_w$. If $\nabla h_j(x^*)$, $j = 1, \ldots, q$, are linearly independent, we have that for all $d \in G(x^*) \cap H(x^*)$ and all $I \in T_{E_w}(x^*)$, there exists $i \in I$ such that $\nabla f_i(x^*)^\top d \ge 0$.
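Proof. Given $I \in T_{E_w}(x^*)$, let $F_I(x^*)$ be the set defined as follows:

$$F_I(x^*) = \{d \in \mathbb{R}^n : \nabla f_i(x^*)^\top d < 0, \; i \in I\}.$$

Proving the lemma is equivalent to showing that, for all $I \in T_{E_w}(x^*)$,

$$F_I(x^*) \cap G(x^*) \cap H(x^*) = \emptyset.$$

This follows from the proof of [1, Theorem 4.3.1] with minor changes.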
Thanks to Lemma 1 we are able to state a new characterization for a weakly efficient point of Problem (8).
Proposition 7. Let $x^* \in \mathcal{F}$ be a weakly efficient point of Problem (8). Then, for all $\lambda^*_j g_j(x^*) = 0$, $j = 1, \ldots, p$.
Taking inspiration from [19], we further extend our characterization of the weakly efficient points of Problem (8) given in Proposition 7, assuming regularity conditions on the constraints. This allows us to state the definition of Pareto-stationary set and to propose how to compute a descent direction in case $S$ is not a Pareto-stationary set. We first recall the classical Mangasarian-Fromovitz (CMF) condition.
Definition 9. The CMF holds at $x \in \mathcal{F}$ if there are no $\alpha_j \ge 0$, $j \in A(x)$, and $\beta_l$, $l = 1, \ldots, q$, with $(\alpha, \beta) \ne (0, 0)$, such that $\sum_{j \in A(x)} \alpha_j \nabla g_j(x) + \sum_{l=1}^{q} \beta_l \nabla h_l(x) = 0$.
Proposition 8. Let $x^* \in \mathcal{F}$ be a weakly efficient point of Problem (8) and let the CMF hold at $x^*$. Then, for all
Proof. From Proposition 5 we have that $x^*$ satisfies (10). Assume by contradiction that $\sigma^*_I = 0$. Then, (10a) would become $\sum_{j=1}^{p} \lambda^*_j \nabla g_j(x^*) + \sum_{l=1}^{q} \mu^*_l \nabla h_l(x^*) = 0$ or equivalently, since $\lambda^*_j = 0$ for $j \notin A(x^*)$, $\sum_{j \in A(x^*)} \lambda^*_j \nabla g_j(x^*) + \sum_{l=1}^{q} \mu^*_l \nabla h_l(x^*) = 0$, so that we get a contradiction to the CMF conditions.
Taking inspiration from other regularity conditions presented and analyzed in the literature [2], we define the extended Mangasarian-Fromovitz (EMF) conditions, in light of Definition 3.
Definition 10. Given $I \subset \{1, \ldots, m\}$, the EMF holds at $x \in \mathcal{F}$ if: for all $s \in I$, there is no
The extended Mangasarian-Fromovitz conditions allow us to prove a further characterization of the weakly efficient points of Problem (8):
Proposition 9. Let $x^* \in \mathcal{F}$ be a weakly efficient point of Problem (8) and let the EMF hold at $x^*$ for every $I \in T_{E_w}(x^*)$. Then, for all
Proof. From Proposition 5 we have that $x^*$ satisfies (10). Assume by contradiction that $s \in I$ exists such that $\sigma^*_s = 0$. Then, (10a) would become $\mu^*_l \nabla h_l(x^*) = 0$, so that we get a contradiction to the EMF conditions.
We are finally able to state the definition of Pareto-stationary set for multiobjective constrained problems.
Definition 11 (Pareto-stationary set). Let $S \subseteq \mathcal{F}$ be a nonempty set such that $f(S)$ is a weakly stable set. We say that $S$ is a Pareto-stationary set for Problem (8) if, for all $x \in S$ and all $I \in T_S(x)$, there exist multipliers $\sigma_i$, $i \in I$, $\lambda \in \mathbb{R}^p$, $\mu \in \mathbb{R}^q$, with $\sigma_I \ne 0$, such that
As a final result, we characterize how to compute a feasible descent direction in case $S$ is not a Pareto-stationary set. For the sake of simplicity, we assume that the feasible set $\mathcal{F}$ is defined by inequality constraints only.
Proposition 10. Let $S \subseteq \mathcal{F}$ be a non Pareto-stationary set for Problem (8) and assume that the CMF holds at each point in $S$. Then, $x \in S$, a direction $d \in \mathbb{R}^n$ and a stepsize $\alpha > 0$ exist such that $(x + \alpha d) \in \mathcal{F}$ and is non-dominated by any $y \in S$. Namely, Furthermore, $x + \alpha d$ is non-dominated by any $z \in \mathcal{F}$ that is dominated by some $y \in S$.
Proof. We start by showing that if $S \subseteq \mathcal{F}$ is a non Pareto-stationary set for Problem (8), we can detect a point $x \in S$, $I \in T_S(x)$ and a direction $d \in \mathbb{R}^n$ such that $\nabla f_j(x)^\top d < 0$ for all $j \in I$ and $\nabla g_i(x)^\top d \le 0$, $i = 1, \ldots, p$, with $\nabla g_i(x)^\top d < 0$ for all $i \in A(x)$.
We now show that necessarily $\nabla g_i(x)^\top d < 0$ for all $i \in A(x)$ and we proceed by contradiction. Assume that there is no $d \in \mathbb{R}^n$ such that $\nabla f_j(x)^\top d < 0$ for all $j \in I$ and $\nabla g_i(x)^\top d < 0$, $i \in A(x)$.
In particular, by multiplying the above expression with the direction $d \in \mathbb{R}^n$, we obtain $\sum_{j \in I} \rho_j \nabla f_j(x)^\top d + \sum_{i \in A(x)} \bar{\rho}_i \nabla g_i(x)^\top d = 0$.
Since $\nabla f_j(x)^\top d < 0$ for all $j \in I$, we necessarily have $\rho_j = 0$ for all $j \in I$. Therefore $\sum_{i \in A(x)} \bar{\rho}_i \nabla g_i(x) = 0$, getting a contradiction with the fact that the CMF holds at $x$. We have then proved that $x \in S$, $d \in \mathbb{R}^n$ and $I \in T_S(x)$ exist such that $\nabla f_i(x)^\top d < 0$ for all $i \in I$ and $\nabla g_i(x)^\top d \le 0$, $i = 1, \ldots, p$, with $\nabla g_i(x)^\top d < 0$ for all $i \in A(x)$.
Theorem 3 (Motzkin's Theorem [20]). Let $A \in \mathbb{R}^{s_1 \times n}$, $C \in \mathbb{R}^{s_2 \times n}$, $D \in \mathbb{R}^{s_3 \times n}$ be three matrices, with $A$ non vacuous. One and only one of the following systems has a solution