A size-reduction algorithm for the order scheduling problem with total tardiness minimization

Article history: Received: October 10, 2021. Received in revised format: December 20, 2021. Accepted: January 18, 2022. Available online: January 18, 2022.

We investigate a variant of the customer order scheduling problem that takes due dates into consideration to minimize the total tardiness. Since the problem under study is NP-hard, we propose an efficient size-reduction algorithm (SR). We perform extensive computational experiments and compare our proposal with the JPO-20 matheuristic, the best existing algorithm for the problem under study. We use the Relative Deviation Index (RDI) and the Success Rate (SRa) as performance indicators. SR presents the lowest average RDI (around 15.5%), whereas JPO-20 presents an average RDI approximately three times higher (around 52.5%). Furthermore, SR presents a higher average SRa (around 66.9%) than JPO-20 (around 25.7%). Our proposal also requires a lower computational effort, reducing the computation times by approximately 22%. The obtained results point to the superiority of the proposed SR over JPO-20.

© 2022 Growing Science Ltd. All rights reserved.


Introduction
Rising customer expectations, together with rigorous competition among firms, have changed production paradigms. The strong and continuous demand for customized goods has led to orders being produced in different places or production lines and then assembled in a facility. Over recent years, researchers in the production scheduling area have paid greater attention to assembly scheduling problems. Framinan et al. (2019) presented a new unified notation for this class of problems, surveying the current contributions and highlighting promising research topics. The customer order scheduling environment appears in several real-world settings, such as the paper and pharmaceutical industries, among others (Leung et al., 2005). In this paper, we address the customer order scheduling problem. Given a set of n customer orders and a set of m dedicated parallel machines, the components of every order must be produced exactly once on each of the available machines, in a given position of the sequence. Each order has an associated processing time on each machine as well as a due date. The objective function is the total tardiness minimization. Since customer order scheduling with total tardiness minimization is NP-hard for m ≥ 2 (Wagneur & Sriskandarajah, 1993), heuristic algorithms are required to find high-quality solutions within admissible computational times. This paper presents a size-reduction algorithm (SR) for the customer order scheduling problem with total tardiness minimization. We extend the traditional size-reduction approach based on processing times to an approach based on due dates. We develop an efficient scheme for fixing some decision variables as zero using the problem due dates and a dispatch rule, such as the well-known earliest due date algorithm.
Based on extensive computational experimentation performed with benchmark test instances, our proposal outperformed the JPO-20 matheuristic proposed by Framinan and Perez-Gonzalez (2018), the best algorithm found in the reviewed literature.
The remainder of the paper is structured as follows. In Section 2, we present some related approaches. In Section 3, we describe the problem under study. In Section 4, we present the proposed size-reduction algorithm. In Section 5, we present the computational experiments as well as the discussion of the results. Finally, in Section 6, we present the main conclusions and the suggested research avenues.

Literature Review
Here, we present the literature review using the notation provided by Framinan et al. (2019). Julien and Magazine (1990) studied a flexible manufacturing environment in which the customer requirements for each of the possible product types are known in advance. Wagneur and Sriskandarajah (1993) introduced the customer order scheduling environment and proved that the problem is NP-hard for the total tardiness minimization objective. Sung and Yoon (1998) addressed a DPm→0| |ΣwjCj problem in which each order has two types of components that are processed by two independent machines, each specialized in a given type of component. They proposed two constructive heuristics that presented high-quality results in comparison with a proposed lower bound. Ahmadi et al. (2005) introduced the coordinated customer order scheduling problem for the weighted completion time minimization. They proposed a Lagrangian heuristic as well as three constructive heuristics. The first one reached the best results in the analyzed set of test instances. Yang and Posner (2005) considered a production environment in which the jobs are processed in batches, and the objective function is the minimization of the total completion time of the batches. They proposed two heuristics that presented near-optimal solutions for the evaluated instances. Leung et al. (2005) presented two heuristics for the customer order scheduling problem with the weighted completion time objective, which outperformed all heuristics previously reported in the literature. Lin and Kononov (2007) approached the problems DPm→0| |ΣUj and DPm→0| |ΣwjUj. These authors proved the NP-hardness of the DPm→0| |ΣUj variant and developed a heuristic for the problems under study based on the tardiness weighting. Shi et al. (2017) presented a quadratic mathematical formulation for the DPm→0| |ΣC problem, which can be converted into an equivalent mixed-integer linear programming formulation.
They proposed a nested partition algorithm that presented high-quality solutions. Xu et al. (2015) addressed a variant of the customer order scheduling problem in which the orders are subdivided into sub-lots. These authors proposed a mixed-integer linear programming formulation, a lower bound, two heuristics, and a matheuristic. The matheuristic outperformed the two constructive heuristics, although with a higher computational cost. Xu et al. (2016) introduced a multiple-machine order scheduling problem with a learning effect to minimize the total tardiness. Some dominance relations are presented, as well as a lower bound. As solution procedures, these authors presented a simulated annealing (SA) meta-heuristic, a particle swarm optimization (PSO) meta-heuristic, and a branch-and-bound algorithm. The PSO outperformed the other approaches, although with a higher computational effort. Lin et al. (2017) addressed a two-agent multi-facility order scheduling problem with ready times (DPm→0| |ε(ΣC^A, C^B)). They derived several dominance properties and a lower bound on the optimal solution. As solution procedures, a PSO and an opposite-based particle swarm optimization (O-PSO) are presented. Framinan and Perez-Gonzalez (2017) addressed the DPm→0| |ΣC variant. They developed a new constructive heuristic as well as a greedy search algorithm for the problem under study. The first one (named the FP algorithm) incorporates a look-ahead procedure that evaluates the contribution of the candidate orders to the objective function, as well as an estimation of the contribution of the non-scheduled orders. The second one is a greedy search algorithm (GSA) with some improvement procedures (perturbations and local search). These authors concluded that both approaches outperformed the existing algorithms. Riahi et al. (2019) criticized the FP algorithm because placing an unscheduled customer order only at the end of the scheduled partial sequence is a greedy procedure.
Faced with this limitation, they proposed a new constructive heuristic considering 8 different initial priority lists. Furthermore, they developed a meta-heuristic based on perturbative and constructive procedures. The computational experiments show that the proposed approaches clearly outperformed the existing algorithms. Lee (2013) presented four constructive heuristics for the DPm→0| |ΣT problem: total processing time earliest due date (TPT-EDD), maximum processing time earliest due date (MCT-EDD), earliest due date maximum processing time (EDD-MCT), and order modified due date (OMDD). OMDD outperformed all the other algorithms. Framinan and Perez-Gonzalez (2018) proposed a constructive heuristic based on the look-ahead mechanism presented by Framinan and Perez-Gonzalez (2017). Furthermore, they proposed two matheuristics, called JPF and JPO, for the above-mentioned problem. JPO has an oscillation parameter δ for the fixation of decision variables in the MILP. Computational experiments highlighted δ = 20 as the best parameter value. Therefore, the JPO-20 algorithm is the best-so-far algorithm for the problem under study. In our view, the definition of the oscillation of the decision variables in the JPO algorithm does not explicitly consider the characteristics of a given instance. For example, if an algorithm exploits the information of the problem due dates, the fixation of decision variables could be more efficient. Moreover, if the oscillation of the JPO algorithm occurs in the first positions of the sequence, some decision variables cannot be fixed in the oscillation window. Our proposal takes these issues into account to obtain a better reduction of the search space. Prata et al. (2021b) introduced the customer order scheduling problem with sequence-dependent setup times to minimize the makespan (DPm→0|STSD|Cmax). As solution procedures, two mixed-integer linear programming models and two matheuristics are proposed.
[...] studied the DPm→0|STSD|ΣC problem. A mathematical formulation is developed, as well as an innovative hybrid discrete differential evolution algorithm. Antonioli et al. (2022) addressed the DPm→0|STSD|ΣT problem. The properties of the global optimal solutions are studied. Besides, several constructive heuristics and matheuristics are developed.

Problem description
Consider the following example with the processing times presented in Table 1. In addition, the due dates of the orders are d = {4, 5, 6}. A feasible solution for this instance is the sequence Π = {3, 2, 1}, with a completion time vector C = {9, 6, 2}. Thus, the first order presents a tardiness of 5 time units (9 − 4), the second order presents a tardiness of 1 time unit (6 − 5), and the third order is not tardy. Consequently, the solution illustrated in Fig. 1 presents a total tardiness of 6 time units. Next, we present the basic notation for the problem under study. Let I = {1, 2, ..., m} be a set of machines, J = {1, 2, ..., n} a set of positions, and K = {1, 2, ..., n} a set of orders. We define pik as the processing time of order k on machine i, and dk as the due date of order k. We then define the following decision variables: xkj is a binary decision variable equal to 1 if order k is produced in position j, and 0 otherwise. In addition, Tj is the tardiness of the order in position j. The mixed-integer programming model for the problem under study, proposed by Framinan and Perez-Gonzalez (2018), is presented as follows.
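\begin{align}
\min \quad & \sum_{j \in J} T_j \tag{1} \\
\text{subject to} \quad & \sum_{j \in J} x_{kj} = 1, && \forall k \in K \tag{2} \\
& \sum_{k \in K} x_{kj} = 1, && \forall j \in J \tag{3} \\
& T_j \ge \sum_{l=1}^{j} \sum_{k \in K} p_{ik} x_{kl} - \sum_{k \in K} d_k x_{kj}, && \forall i \in I,\ j \in J \tag{4} \\
& T_j \ge 0, && \forall j \in J \tag{5} \\
& x_{kj} \in \{0, 1\}, && \forall k \in K,\ j \in J \tag{6}
\end{align}

The objective function (1) is the total tardiness minimization. Set of constraints (2) ensures that each order k is scheduled in exactly one position. Set of constraints (3) enforces that each position j receives exactly one order. Set of constraints (4) calculates the tardiness of the order in each position. Finally, constraint sets (5) and (6) determine the domain of the decision variables.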
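The worked example of this section can be checked with a few lines of Python. This is a minimal sketch, not the authors' implementation: the processing-time matrix `P` is hypothetical (Table 1 is not reproduced here) and was chosen only so that the sequence Π = {3, 2, 1} yields the completion times C = {9, 6, 2} quoted above; the function name `position_tardiness` is likewise an assumption. The completion time of the order in a position is taken as the largest cumulative load over the machines, which is what the tardiness constraints of the positional model express.

```python
def position_tardiness(p, d, seq):
    """Return the tardiness of the order in each position of seq.

    p[i][k] is the processing time of order k on machine i,
    d[k] is the due date of order k, and seq lists the orders
    (0-based) in the positions they occupy.
    """
    load = [0] * len(p)           # cumulative load of each machine
    tardiness = []
    for k in seq:
        for i in range(len(p)):
            load[i] += p[i][k]
        completion = max(load)    # order is complete when its last component finishes
        tardiness.append(max(0, completion - d[k]))
    return tardiness


# Hypothetical processing times (2 machines, 3 orders), NOT the values of Table 1.
P = [[3, 4, 2],
     [4, 3, 1]]
D = [4, 5, 6]        # due dates d = {4, 5, 6} from the example
SEQ = [2, 1, 0]      # sequence {3, 2, 1} written with 0-based order indices

T = position_tardiness(P, D, SEQ)
print(T, sum(T))
```

With these assumed data the tardiness values per position are 0, 1, and 5, so the total tardiness is 6 time units, as in the example.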

Proposed solution approach
In an integer linear programming model with binary decision variables in which permutation constraints appear, the number of decision variables equal to one in the optimal solution is usually much smaller than the number of decision variables equal to zero. Given the parameters of the integer linear model, the chance that some decision variables appear in high-quality solutions can be small. Thus, to reduce the size of a given optimization problem and speed up its solution, a percentage of these decision variables can be fixed as zero before the solution process begins. The size-reduction algorithm (SR) was introduced by Fanjul-Peyro and Ruiz (2011, 2017) and has been applied to other production sequencing optimization problems, presenting competitive results (Fanjul-Peyro et al., 2017; Prata et al., 2021a). There is no guarantee that SR provides the global optimal solution; however, it is frequently a useful matheuristic. Lee (2013) shows that the global optimal solution for the DPm→0| |ΣT problem presents the same permutation of orders for all machines. Since the positional decision variables xkj represent a permutation, exactly n of them are equal to 1 in any feasible solution. Thus, most of the decision variables are equal to 0 in the feasible solutions. In the problem under study, we seek the total tardiness minimization. Therefore, we can use the information related to the due dates to set several decision variables as 0 in a size-reduction approach. We can infer that orders with the largest due dates will hardly be allocated to the first positions of the sequence in high-quality solutions.
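To make the sparsity argument concrete, the following short calculation counts the positional variables for the largest instance size considered later in the paper (n = 300). It is only an illustration of the counting argument, using the fact that a permutation assigns exactly one position to each order.

```latex
\[
\underbrace{n^2}_{\text{variables } x_{kj}} = 300^2 = 90\,000,
\qquad
\underbrace{n}_{\text{variables equal to } 1} = 300,
\qquad
\frac{n}{n^2} = \frac{1}{n} = \frac{1}{300} \approx 0.33\%.
\]
```

In other words, more than 99% of the positional variables are zero in any feasible solution of such an instance, which is the room that a size-reduction scheme tries to exploit.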
To determine which decision variables can be set as zero, we can use a constructive heuristic that considers the problem due dates. Depending on the solution returned by the constructive heuristic, the decision variable associated with a given position can be fixed, since high-quality solutions hardly allocate an order to a position that implies high tardiness. In Fig. 2, we present an example of the proposed size-reduction algorithm. We consider the well-known earliest due date (EDD) heuristic, in which the orders are sorted in non-decreasing order of due dates, resulting in the sequence Π = {7, 10, 6, 9, 1, 4, 8, 2, 5, 3}. Based on this initial solution, we use a parameter, called α, and set as zero all the decision variables associated with a difference between the due date values greater than α·dk. In this example, we adopt α = 50%. For the order o7, allocated in the first position, the orders o4, o8, o2, o5, and o3 present a percentage difference greater than 50%. Thus, the decision variables associated with these orders are set as 0. In Fig. 2, the rectangles illustrate the range of the positions that are not set as zero. The different colors of the rectangles emphasize that some positions are at the extremities of the sequence. The circles highlight the orders considered in each position. Fig. 3 illustrates the proposed SR algorithm. The algorithm receives as input parameters α, which determines the percentage of decision variables to be kept free for the solver, and tlimit, the time limit adopted for the solver. Firstly, the proposed approach uses an initial solution based on the OMDD heuristic (Lee, 2013). Given a sequence Π, x represents the binary decision variables corresponding to the positions of the permutation Π. For all the positions j (j = 1, ..., n), we calculate wmin and wmax, which are the lower and upper bounds of the window with free positions. The positions located outside this interval are set as zero.
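After that, the model defined by Eqs. (1)-(6) is solved during tlimit seconds with the additional constraints (7), which set as zero the decision variables located outside the window of free positions.

    Input: α, tlimit
    Build the initial sequence Π with the OMDD heuristic
    For each position j = 1, ..., n, calculate wmin and wmax
    Solve the model (1)-(6), as well as constraint (7), during tlimit seconds
    Store in Π the solution corresponding to solution x
    if x < xbest then
        xbest := x
        Πbest := Π
    end
    return xbest, Πbest

Fig. 3. Pseudocode of the proposed SR algorithm.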
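The variable-fixing step can be sketched in Python as follows. This is one possible reading of the example of Fig. 2 and should not be taken as the authors' exact rule: the text does not give closed-form expressions for wmin and wmax, so the sketch assumes that, for each position j, the order placed there by the EDD rule serves as a reference and that every order whose due date differs from the reference due date by more than α times that due date is forbidden in position j. The function names and the small data set are hypothetical, and due dates are assumed to be non-negative.

```python
def edd_sequence(due):
    """Orders (0-based) sorted by non-decreasing due date, ties broken by index."""
    return sorted(range(len(due)), key=lambda k: (due[k], k))


def zero_fixed_variables(due, alpha):
    """Return the EDD sequence and the set of pairs (k, j) with x[k][j] fixed to 0.

    Assumed rule: order k is forbidden in position j when its due date differs
    from the due date of the EDD order in position j by more than alpha times
    that reference due date.
    """
    seq = edd_sequence(due)
    fixed = set()
    for j, ref in enumerate(seq):
        threshold = alpha * due[ref]
        for k in range(len(due)):
            if abs(due[k] - due[ref]) > threshold:
                fixed.add((k, j))
    return seq, fixed


# Hypothetical due dates for four orders, with alpha = 50%.
DUE = [10, 12, 30, 11]
SEQ, FIXED = zero_fixed_variables(DUE, 0.5)
print(SEQ)
print(sorted(FIXED))
```

Under this rule the reference order of each position is never forbidden in that position, so the initial sequence remains feasible in the reduced model and can still be passed to the solver as a warm start. The pairs in `FIXED` would then be imposed as additional equality constraints on the positional model before it is solved within the time limit.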

Test instances, statistics used in the computational experiments and methods under comparison
We evaluate the BIG test instances proposed by Framinan and Perez-Gonzalez (2018). For the BIG data set, we have n ∈ {100, 150, 200, 300} and m ∈ {5, 10}. These instances have two key parameters: the range of due dates (RDD) and the tardiness factor (TF). We use these parameters in the discussion of the results. As performance indicators, we use the Relative Deviation Index (RDI) and the success rate. RDI is the usual indicator of quality for problems involving due dates (Fernandez-Viagas & Framinan, 2015). In this indicator, the total tardiness returned by a given method is compared with the best and the worst results obtained by all the methods under comparison. Mathematically, the RDI for a method s ∈ H when applied to instance t is defined as follows:

\[
RDI_{st} = \frac{T_{st} - \min_{h \in H} T_{ht}}{\max_{h \in H} T_{ht} - \min_{h \in H} T_{ht}} \times 100,
\]

where H = {MILP, JPO-20, SR} and Tst is the tardiness value obtained by method s in instance t. In our case, minh∈H Tht is the best solution found among the methods in H. The Success Rate (SRa) is calculated as the number of times that a given method results in the best solution (with or without a draw) divided by the number of test instances in the instance class. We consider the following methods in our computational experiments: the mixed-integer programming model (MILP), proposed by Framinan and Perez-Gonzalez (2018); the JPO-20 algorithm, proposed by Framinan and Perez-Gonzalez (2018); and the SR algorithm (our proposal). We implemented all the matheuristics in the Julia language (https://julialang.org/) within the Atom IDE (https://atom.io/).
For the pure MILP model as well as the matheuristics, the commercial solver is IBM ILOG CPLEX (https://www.ibm.com/products/ilog-cplex-optimization-studio) version 12.8. We perform the computational experiments on a PC with an Intel Core i5-3470 CPU at 3.20 GHz and 32 GB of memory. We use 600 seconds as the time limit for the MILP, JPO-20, and SR methods, as presented by Framinan and Perez-Gonzalez (2018). Since the MILP method is not efficient for large-sized instances, we use the OMDD heuristic (Lee, 2013) as a warm start for the JPO-20 and SR matheuristics. After several preliminary computational experiments, we determine the values of the parameter α according to the test problem size. The size-reduction parameter varies with the values of m and n, as illustrated in Table 2. For example, for the test instances with m = 5 and n = 100, we adopt α = 0.8, meaning that 80% of the decision variable values are not fixed. The parameter α is calibrated empirically. After several tests, we observed that small-sized instances require a smaller reduction, and large-sized instances require a greater reduction.
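The two indicators used in this section can be computed as in the following Python sketch. The total tardiness values are invented for illustration and are not results from the experiments; the function names are assumptions. The sketch sets the RDI to zero when all methods tie on an instance (the denominator would otherwise vanish) and does not handle a method that returns no feasible solution.

```python
def rdi(tardiness):
    """Relative Deviation Index (in %) of each method on one instance.

    tardiness maps a method name to the total tardiness it obtained.
    """
    best = min(tardiness.values())
    worst = max(tardiness.values())
    span = worst - best
    if span == 0:                      # all methods tie on this instance
        return {s: 0.0 for s in tardiness}
    return {s: 100.0 * (v - best) / span for s, v in tardiness.items()}


def success_rate(instances):
    """Percentage of instances in which each method reaches the best value (draws count)."""
    methods = list(instances[0].keys())
    wins = {s: 0 for s in methods}
    for inst in instances:
        best = min(inst.values())
        for s in methods:
            if inst[s] == best:
                wins[s] += 1
    return {s: 100.0 * wins[s] / len(instances) for s in methods}


# Invented total tardiness values for two instances of one class.
RESULTS = [
    {"MILP": 120, "JPO-20": 100, "SR": 80},
    {"MILP": 50, "JPO-20": 70, "SR": 50},
]
print(rdi(RESULTS[0]))
print(success_rate(RESULTS))
```

In the first invented instance, SR is the best method (RDI of 0), MILP the worst (RDI of 100), and JPO-20 lies halfway between them (RDI of 50); in the second, MILP and SR draw, and both are counted as successes.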

Table 2
Description of the parameters used in the proposed SR algorithm.

Table 3 illustrates the results for the methods under comparison grouped by problem size. We can observe that SR presents the lowest average RDI (around 15.5%), whereas JPO-20 presents an average RDI approximately three times higher (around 52.5%). For the test instances with the lower values of m and n, MILP reaches the best results. However, for the large-sized instances, the performance of MILP deteriorates drastically. For the test instances with 10 machines and 300 orders, CPLEX is not able to find a feasible integer solution within the specified time limit. In contrast, JPO-20 and SR return feasible integer solutions for all the considered test instances. Considering the success rate indicator, we can observe that the SR algorithm returns a success rate approximately three times higher than the JPO-20 algorithm and two times higher than the MILP method.

[...] Table 3 and Table 4 are different. Once again, there is evidence of the superiority of the SR algorithm in comparison with all the other evaluated methods.

Fig. 4 illustrates the boxplots for average RDI values depending on the number of orders. In this figure, for each value of n, we consider the test instances with 5 and 10 machines. We can observe that the MILP method returns the smallest RDI values for the test instances with 100 orders. For the test instances with 150, 200, and 300 orders, the SR algorithm returns lower RDI values than the MILP method as well as the JPO-20 algorithm.

To validate the results, an ANOVA experiment is applied to verify whether the observed differences in the results of the evaluated methods are statistically significant. Since the F value is greater than the critical value, as illustrated in Tables 5, 7, and 9, a statistically significant difference between the methods under comparison is found.
In these three tables, Df means the degrees of freedom, Sum Sq means the sum of squares, and Mean Sq means the mean of squares. In Fig. 7, the mean plots with Tukey HSD intervals (α = 0.05) of all evaluated methods are presented. Furthermore, Tables 6, 8, and 10 illustrate the Tukey HSD results. Table 5 presents the ANOVA of average RDI values depending on TF and RDD, where F is a statistic that determines whether the means of two or more populations are significantly different. We can observe that F is greater than the critical value (in this case, f = 3.4028). Thus, it is possible to analyze which algorithms present a statistically significant difference using the Tukey test. According to Table 6, the only pair that differs significantly at α = 0.05 is SR-JPO. We can emphasize that SR outperforms the JPO-20 algorithm. Furthermore, we cannot state that there is a statistically significant difference between any of the other methods for a 95% confidence level. One can observe that there are statistically significant differences between the average RDI values of the SR algorithm and the JPO-20 algorithm, as illustrated in Tables 6 and 8. Therefore, the SR algorithm outperforms the JPO-20 algorithm on the evaluated test instances. In addition, we can emphasize that the SR algorithm also outperforms the MILP method for RDD values of 0.5 and 0.8, as illustrated in Table 10.

According to Table 8, the only pair that differs significantly at α = 0.05 is SR-JPO with TF = 0.2. Therefore, there is a statistically significant difference between these methods in this case. For all the other situations, we cannot conclude whether there is a statistically significant difference for a confidence level of 95%.

Table 9 illustrates the ANOVA for average RDI values depending on RDD. The critical values for the RDD levels of 0.2, 0.5, and 0.8 are the same (in this case, f = 5.1433). Considering the Tukey test, as presented in Table 10, we find a difference [...] RDD = 0.2.
For RDD = 0.5, SR outperforms the MILP method and the JPO-20 algorithm. For RDD = 0.8, SR outperforms the MILP method and the JPO-20 algorithm. Furthermore, the MILP method outperforms the JPO-20 algorithm. For all the other cases, we cannot state whether there is a statistically significant difference between the methods under comparison for a confidence level of 95%. Concerning the computation times, the methods under study present distinct behaviors, as illustrated in Table 3. The MILP method, the JPO-20 algorithm, and the SR algorithm present average computational times of 566.8s, 630.7s, and 497.0s, respectively. Although we adopt a time limit of 600s, in some cases CPLEX is imprecise in controlling this time because of the pre-solve function. Because of this imprecision, JPO-20 presents an average computational time greater than the specified time limit. We can observe that the SR algorithm requires a lower computational effort than all the other methods under comparison. In comparison with the JPO-20 algorithm, the proposed SR algorithm returns an average RDI approximately three times smaller, with a resulting reduction of the computational times of approximately 22%. The computational experiments carried out show that our proposal outperforms the JPO-20 algorithm and can provide high-quality results within admissible CPU times.

Conclusions
In this paper, we investigate the customer order scheduling problem with the objective of minimizing the total tardiness. We develop a size-reduction matheuristic that leads to excellent results within an admissible computational effort. The results of the proposed approach are obtained on the literature benchmark instances presented by Framinan and Perez-Gonzalez (2018). We use the relative deviation index and the success rate as the performance measures. On these benchmark instances, the proposed size-reduction algorithm outperforms the JPO-20 matheuristic proposed by Framinan and Perez-Gonzalez (2018): it finds better solutions than the JPO-20 algorithm while using lower computational times. As extensions of this work, we suggest the consideration of explicit setup times in customer order scheduling. Meta-heuristics could also be proposed for the problem under study. In addition, future studies could investigate the behavior of the proposed approaches considering other objective functions, such as total completion time minimization, or a just-in-time environment.