A Combinatorial Approximation Algorithm for the Vector Scheduling with Submodular Penalties on Parallel Machines

In this paper, we focus on solving the vector scheduling problem with submodular penalties on parallel machines. We are given n jobs and m parallel machines, where each job is associated with a d-dimensional vector. Each job can either be rejected, incurring a rejection penalty, or accepted and processed on one of the m parallel machines. The objective is to minimize the sum of the maximum load over all dimensions of the total vector of the accepted jobs and the total penalty of the rejected jobs. The penalty is determined by a submodular function. Our main work is to design a (2 − 1/m)(min{r, d})-approximation algorithm to solve this problem. Here, r denotes the maximum ratio of the maximum load to the minimum load on the d-dimensional vectors among all jobs.


Introduction
The multiprocessor scheduling problem was first studied in 1966 by Graham [1]. It is an important problem in the field of combinatorial optimization [2,3]. In this problem, we have a set of n jobs, denoted as J = {J_1, J_2, ..., J_n}, and a set of m parallel machines, denoted as M = {M_1, M_2, ..., M_m}. Each job can be processed on one of the machines. The objective is to minimize the makespan, which is the maximum completion time among all machines. The problem is strongly NP-hard [1,4], meaning that it is computationally challenging to find an optimal solution.
In 1966, Graham [1] proposed a classical (2 − 1/m)-approximation algorithm called LS (list scheduling) based on a greedy strategy. Later, Graham [5] designed a 4/3-approximation algorithm called LPT (longest processing time) by sorting the jobs in nonincreasing order of processing time. In 2020, Jansen et al. [6] introduced an efficient polynomial-time approximation scheme, which is currently the best-known result for this problem.
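To make the two classical heuristics concrete, the following is a minimal Python sketch of LS and LPT on identical machines. The function names `list_scheduling` and `lpt` and the sample processing times are illustrative assumptions, not taken from [1] or [5].

```python
import heapq

def list_scheduling(times, m):
    """LS: give each job, in list order, to the currently least-loaded machine."""
    heap = [(0, i) for i in range(m)]          # (current load, machine index)
    assignment = [[] for _ in range(m)]
    for j, t in enumerate(times):
        load, i = heapq.heappop(heap)          # least-loaded machine
        assignment[i].append(j)
        heapq.heappush(heap, (load + t, i))
    makespan = max(load for load, _ in heap)
    return makespan, assignment

def lpt(times, m):
    """LPT: list scheduling after sorting jobs by nonincreasing processing time."""
    order = sorted(range(len(times)), key=lambda j: -times[j])
    makespan, parts = list_scheduling([times[j] for j in order], m)
    # Map positions in the sorted list back to the original job indices.
    return makespan, [[order[pos] for pos in part] for part in parts]

times = [2, 3, 4, 6, 2, 2]                     # hypothetical processing times
print(list_scheduling(times, 3)[0], lpt(times, 3)[0])
```

On this small instance the sorted order helps: LPT places the long job first, whereas LS in the given order ends up stacking it on an already loaded machine.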
In some scenarios, rejecting certain jobs with poor cost performance can increase profits. Based on the multiprocessor scheduling problem, Bartal et al. [7] explored the problem of multiprocessor scheduling with rejection in 2000. This variant allows each job J_j ∈ J to either be rejected with a penalty or scheduled on one of the machines. The objective is to minimize the sum of the makespan of the accepted jobs and the total penalty incurred by the rejected jobs. They proposed a (2 − 1/m)-approximation algorithm. Later, Ou et al. [8] developed an improved (3/2 + ε)-approximation algorithm, where ε > 0 is a given small constant.
The vector scheduling problem, introduced by Chekuri and Khanna [9], is a further extension of the multiprocessor scheduling problem. In this problem, each job is associated with a d-dimensional vector. The objective is to schedule n d-dimensional jobs on m machines to minimize the maximum load over all dimensions of the machines. Chekuri and Khanna [9] proved in 2004 that this problem cannot be approximated within a constant factor unless NP = ZPP when the dimension d is part of the input. Here, ZPP is the class of problems solvable by probabilistic Turing machines in expected polynomial time with zero error [10]. In the same paper, Chekuri and Khanna [9] also presented an O(ln^2 d)-approximation algorithm and an O(ln dm / ln ln dm)-approximation algorithm with high probability, where m denotes the number of machines.
Recently, Harris and Srinivasan [11] developed a randomized polynomial-time approximation algorithm for the vector scheduling problem, achieving a ratio of O(ln d / ln ln d). For the vector scheduling problem with rejection, when m = 1, Dai and Li [12] demonstrated that the problem is NP-hard and proposed a d-approximation algorithm for a fixed constant d. Li and Cui [13] introduced a 2.54-approximation algorithm based on randomized rounding in the case where m = 2. Additional related results can be found in reference [14].
In many practical scenarios, the relationship between the number of rejected objects and the associated rejection penalties is often submodular rather than linear. As a result, combinatorial optimization problems with submodular penalties have gained significant attention. In these problems, the rejection penalty is determined by a submodular function [15,16].
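As an illustration of the difference between submodular and non-submodular penalties, the sketch below checks the standard condition π(X) + π(Y) ≥ π(X ∪ Y) + π(X ∩ Y) by brute force on a small ground set. The helper names and the two sample penalties are assumptions chosen for illustration only; the check is exponential in the size of the ground set.

```python
from itertools import combinations
from math import sqrt

def subsets(ground):
    """Yield every subset of the ground set as a frozenset."""
    items = list(ground)
    for k in range(len(items) + 1):
        for combo in combinations(items, k):
            yield frozenset(combo)

def is_submodular(pi, ground, tol=1e-12):
    """Brute-force test of pi(X) + pi(Y) >= pi(X | Y) + pi(X & Y) for all X, Y."""
    all_sets = list(subsets(ground))
    return all(
        pi(X) + pi(Y) + tol >= pi(X | Y) + pi(X & Y)
        for X in all_sets
        for Y in all_sets
    )

ground = range(4)
concave_penalty = lambda S: sqrt(len(S))   # each extra rejection costs less
convex_penalty = lambda S: len(S) ** 2     # each extra rejection costs more

print(is_submodular(concave_penalty, ground), is_submodular(convex_penalty, ground))
```

A penalty that is a concave function of the number of rejected jobs passes the test, while the convex one fails, for example on two disjoint singletons.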
Liu and Li [17] addressed the multiprocessor scheduling problem with submodular penalties. They presented a combinatorial (2 − 1/m)-approximation algorithm based on the list scheduling (LS) algorithm and a greedy method. Wang and Liu [18] focused on parallel-machine scheduling with release times and submodular penalties, proposing a combinatorial 2-approximation algorithm. Zhang et al. [19] tackled the precedence-constrained scheduling problem with submodular rejection on parallel machines and introduced a combinatorial 3-approximation algorithm for it. Liu et al. [20] considered the single-machine vector scheduling problem with submodular penalties. They presented a combinatorial min{r, d}-approximation algorithm to solve this problem, where r represents the maximum ratio of the maximum load to the minimum load on the d-dimensional vectors among all jobs.
Inspired by these studies, we focus on the problem of vector scheduling with submodular penalties on parallel machines. We propose a (2 − 1/m)(min{r, d})-approximation algorithm, where r represents the maximum ratio between the largest and smallest components of the d-dimensional vectors among all jobs. This algorithm extends the results of [17,20].
This paper is organized as follows: In Section 2, we provide a formal problem statement and a fundamental lemma to ensure the correctness of our approximation algorithm. In Section 3, we present an approximation algorithm for the problem. In Section 4, we present our conclusions.

Preliminaries
Definition 1 (see [15]). Let U be a given set and π: 2^U → R≥0 be a real-valued function defined on all subsets of U. It is called a submodular function if

$$\pi(X) + \pi(Y) \ge \pi(X \cup Y) + \pi(X \cap Y) \quad \text{for all } X, Y \subseteq U.$$

The problem of vector scheduling with submodular penalties on parallel machines (the VSSP-PM problem, for short) is defined as follows.
Given a set J = {J_1, J_2, ..., J_n} of n jobs and a set M = {M_1, M_2, ..., M_m} of m parallel machines, where each job J_j ∈ J is associated with a d-dimensional vector p_j = (p_j^1, p_j^2, ..., p_j^d) and π: 2^J → R≥0 is a submodular penalty function, it is asked to find a schedule (A_1, A_2, ..., A_m, R) that satisfies the following conditions: A_1, A_2, ..., A_m form a partition of the accepted job set A (i.e., J∖R), where each A_i is the set of jobs processed on machine M_i and R is the set of rejected jobs. In other words, the sets A_1, A_2, ..., A_m are pairwise disjoint and their union is the entire accepted job set A. The objective is to minimize the sum of the maximum load over all dimensions and the penalty π(R) incurred by the rejected jobs, where π(·) is a submodular function. That is,

$$\min \; \max_{i \in [m]} \max_{k \in [d]} \sum_{J_j \in A_i} p_j^k + \pi(R).$$

To further understand our problem, we present the following integer linear program (ILP) for the VSSP-PM problem.
where the variable z_R ∈ {0, 1} indicates whether R is picked, that is, z_R = 1 if and only if R is rejected, and the variable x_j^i ∈ {0, 1} indicates whether job J_j ∈ J is accepted and scheduled on machine M_i, that is, x_j^i = 1 if and only if J_j is accepted and scheduled on machine M_i. Clearly, if d = 1, the VSSP-PM problem is exactly the problem of parallel machine scheduling with submodular penalties studied in Liu and Li [17]. If m = 1, the VSSP-PM problem becomes the problem of single machine vector scheduling with general penalties studied in Liu et al. [20].
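The objective described in this section can be evaluated directly for any candidate schedule. The sketch below is one reading of it, assuming the value of a schedule is the largest load over all machines and dimensions plus π(R); the function name `objective`, the sample vectors, and the linear sample penalty are hypothetical.

```python
def objective(vectors, machines, rejected, penalty):
    """Largest load over all machines and dimensions, plus penalty(rejected).

    vectors  : list of d-dimensional tuples, one per job
    machines : list of job-index lists, one per machine (the sets A_i)
    rejected : iterable of rejected job indices (the set R)
    penalty  : set function standing in for pi(.)
    """
    d = len(vectors[0])
    max_load = 0.0
    for jobs in machines:
        # Load vector of this machine: componentwise sum of its jobs' vectors.
        load = [sum(vectors[j][k] for j in jobs) for k in range(d)]
        max_load = max(max_load, max(load))
    return max_load + penalty(frozenset(rejected))

vectors = [(2, 1), (1, 3), (4, 4)]            # hypothetical 2-dimensional jobs
penalty = lambda S: 1.5 * len(S)              # hypothetical (modular) penalty

print(objective(vectors, [[0], [1]], {2}, penalty))
print(objective(vectors, [[0, 1], []], {2}, penalty))
```

Spreading the two accepted jobs over both machines gives a smaller value than stacking them on one, since only the largest coordinate of the largest machine load enters the objective.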
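For convenience, we may assume I = (J, M, p(·), π(·)) to be an instance of the VSSP-PM problem and assume r to be the maximum ratio of the largest component to the smallest component of the d-dimensional vectors among all jobs, i.e.,

$$r = \max_{J_j \in J} \frac{\max_{k \in [d]} p_j^k}{\min_{k \in [d]} p_j^k}.$$

Note that r ≥ 1; in particular, we set r = +∞ if min_{J_j ∈ J} min_{k ∈ [d]} p_j^k = 0. Now, we construct an auxiliary instance I′ = (J′, M′, p′(·), π′(·)), where J′ = {J′_1, J′_2, ..., J′_n} is a set of n jobs, M′ is a set of m parallel machines, and for each J′_j ∈ J′, we define p′(J′_j) = p_j and π′(·) = π(·), i.e., for each subset S′ ⊆ J′, π′(S′) = π(S) with S = {J_j | J′_j ∈ S′}. Here, p_j on the right-hand side denotes the scalar size assigned to job J_j; for r ≤ d, it is min_{k ∈ [d]} p_j^k. We refer to I′ as an auxiliary instance of I = (J, M, p(·), π(·)). Then, we have the following result.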
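The ratio r and the resulting factor (2 − 1/m)·min{r, d} are straightforward to compute for a given instance. The following sketch assumes job vectors are given as tuples of nonnegative numbers; the function names `component_ratio` and `ratio_bound` are illustrative.

```python
import math

def component_ratio(vectors):
    """r = max over jobs of (largest component / smallest component)."""
    r = 1.0
    for p in vectors:
        if min(p) == 0:
            return math.inf        # convention in the text: r = +inf if some component is 0
        r = max(r, max(p) / min(p))
    return r

def ratio_bound(vectors, m):
    """The factor (2 - 1/m) * min{r, d} for an instance with m machines."""
    d = len(vectors[0])
    return (2 - 1 / m) * min(component_ratio(vectors), d)

vectors = [(2, 1), (1, 3), (4, 4)]   # hypothetical jobs with d = 2
print(component_ratio(vectors), ratio_bound(vectors, 2))
```

Because min{r, d} is used, a single job with a zero component only pushes the factor up to d, not to infinity.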

Lemma 2. Given an instance I = (J, M, p(·), π(·)) of the VSSP-PM problem and its auxiliary instance
where p(A_i) = Σ_{J_j ∈ A_i} p_j := (Σ_{J_j ∈ A_i} p_j^1, Σ_{J_j ∈ A_i} p_j^2, ..., Σ_{J_j ∈ A_i} p_j^d).
Proof. We are given an instance I = (J, M, p(·), π(·)) of the VSSP-PM problem and its auxiliary instance as the load vector of machine M_i for each i ∈ [m].
Case 1: r ≤ d. In this case, we have p_j = min_{k ∈ [d]} p_j^k for each j ∈ [n], and we can obtain the following for each i ∈ [m]: Therefore, we can obtain the following equation:
Case 2: r > d. In this case, we can obtain the following for each i ∈ [m]: Thus, we have max
To sum up, we conclude this lemma.
where
For convenience, we may assume that p(A_i by the LS algorithm [1]. Therefore, we have where the first inequality comes from the fact that (p(J′∖R_{j⋆}) − p_ĵ)/m is the average load on the m parallel machines, and the second inequality follows from the fact that J′_ĵ ∈ A_i^{j⋆} and p_ĵ ≤ p_{j⋆} for each ĵ (≤ j⋆).
Additionally, we have max where. By Lemma 2, we consider the following two cases.
(1) Let p_0 := 0 and sort all jobs in J in nondecreasing order of p_j; for convenience, we may assume that p_1 ≤ p_2 ≤ ... ≤ p_n.
(2) For j′ = 0, 1, ..., n do
    Construct R(j′) = {J_j | j > j′} and find the set R_{j′} = argmin_{S: R(j′) ⊆ S ⊆ J} {(1/m) Σ_{J_j ∈ J∖S} p_j + π(S)} by the method in [16]. Assign the jobs in J∖R_{j′} to the machines by the LS algorithm, obtaining A_1^{j′}, A_2^{j′}, ..., A_m^{j′}.
ALGORITHM 1: Algorithm based on the greedy and LS algorithms.
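The loop structure of Algorithm 1 can be sketched in Python on the scalar auxiliary instance. This is an illustration under stated assumptions rather than the authors' implementation: the submodular minimization of [16] is replaced by an exponential brute-force search (`best_superset`), the accepted jobs are passed to LS in sorted order, and the final step of returning the best of the n + 1 candidate schedules is an assumption, since that step is not visible in the text above. All function names are hypothetical.

```python
from itertools import combinations

def list_schedule(jobs, sizes, m):
    """LS: put each job, in the given order, on the currently least-loaded machine."""
    loads = [0.0] * m
    parts = [[] for _ in range(m)]
    for j in jobs:
        i = min(range(m), key=loads.__getitem__)
        parts[i].append(j)
        loads[i] += sizes[j]
    return parts, loads

def best_superset(forced, sizes, m, penalty):
    """Brute-force stand-in for the method of [16]: minimise
    (1/m) * (total size of jobs outside S) + penalty(S) over all S containing `forced`."""
    n = len(sizes)
    free = [j for j in range(n) if j not in forced]
    best_set, best_val = None, float("inf")
    for k in range(len(free) + 1):
        for extra in combinations(free, k):
            S = frozenset(forced) | frozenset(extra)
            val = sum(sizes[j] for j in range(n) if j not in S) / m + penalty(S)
            if val < best_val:
                best_set, best_val = S, val
    return best_set

def greedy_ls(sizes, m, penalty):
    """Try every threshold j', reject at least the jobs above it, schedule the rest by LS."""
    n = len(sizes)
    order = sorted(range(n), key=sizes.__getitem__)   # nondecreasing sizes
    best = None
    for jp in range(n + 1):
        forced = frozenset(order[jp:])                # R(j'): jobs ranked above j'
        R = best_superset(forced, sizes, m, penalty)
        accepted = [j for j in order if j not in R]
        parts, loads = list_schedule(accepted, sizes, m)
        value = max(loads) + penalty(R)
        if best is None or value < best[0]:           # assumed selection step
            best = (value, parts, R)
    return best

value, parts, rejected = greedy_ls([3, 1, 2], 2, lambda S: 10 * len(S))
print(value, parts, sorted(rejected))
```

With a polynomial-time submodular minimization routine in place of `best_superset`, each of the n + 1 iterations would run in polynomial time; the brute-force version is only suitable for very small instances.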

Journal of Mathematics
For convenience, let (A″_1, ..., A″_m, R″) be another feasible solution of instance where
Case 2: r > d. Using similar arguments as in Case 1, we can obtain the following for each Similarly, let (A″_1, ..., A″_m, R″) be the solution of instance where
The numerical experiments for our algorithm are described as follows. The experimental data are drawn uniformly at random from a given range (dataset link: https://github.com/wencheng2018/GLSalgorithm-based-data), and each experiment is repeated 10 times. The final results are averaged to reduce the impact of randomness.
We compared the GLS algorithm (Algorithm 1) with the optimal solution for the VSSP-PM problem. We used IBM's CPLEX solver to obtain the optimal solution. In cases where the optimal solution was not obtained within 1 minute, we terminated the CPLEX process. As illustrated in Figure 1, the results obtained from the GLS algorithm closely approximate the optimal solution. The numerical results indicate that the approximation ratio of our algorithm is less than 2.5 on the given instances, which suggests that the method performs well in practice.

Conclusions
In this paper, we studied the problem of vector scheduling with submodular penalties on parallel machines (VSSP-PM), which generalizes both the vector scheduling with submodular rejection on a single machine [20] and the parallel machine scheduling with submodular penalties [17]. We presented a (2 − 1/m)(min{r, d})-approximation algorithm, where m is the number of parallel machines and r is the maximum ratio of the largest component to the smallest component of the d-dimensional vectors among all jobs. This result generalizes the conclusions in [17,20]. In our numerical experiments, the observed approximation ratio stayed below 2.5 on the tested instances.
A challenging task for further research is to design approximation algorithms with better performance ratios or lower running times for the VSSP-PM problem. Additionally, vector scheduling with submodular penalties on unrelated machines is an interesting problem to be explored. It may be possible to design an O(d)-approximation algorithm for it, but doing so remains a challenge.

Case 1: r ≤ d. We can obtain the following for each i ∈ [m]
is a feasible solution of the instance I′, then there exists a feasible solution σ = (A_1, ..., A_m, R) of instance I satisfying