Revisiting Surrogate Relaxation for the Multidimensional Knapsack Problem

The multidimensional knapsack problem (MKP) is a classic problem in combinatorial optimisation. Several authors have proposed to use surrogate relaxation to compute upper bounds for the MKP. In their papers, the surrogate dual is solved heuristically. We show that, using a modern dual simplex solver as a subroutine, one can solve the surrogate dual exactly in reasonable computing times. On the other hand, the resulting upper bound tends to be strong only for relatively small MKP instances.


Introduction
The multidimensional knapsack problem (MKP) is a classic problem in combinatorial optimisation, with a wide range of practical applications. It is defined as follows. We have n items and m resources. The profit of item j is p_j. The amount of resource i available is b_i. Item j uses a_{ij} units of resource i. The goal is to select a set of items of maximum total profit, while respecting the availabilities of the resources.
The MKP has a natural formulation as a 0-1 linear program (0-1 LP). One simple way to compute an upper bound for any 0-1 LP is to solve its continuous relaxation. A more sophisticated approach is to use surrogate relaxation [19,20]. In the 1980s, several authors experimented with the application of surrogate relaxation to the MKP, with quite promising results (see, e.g., [16,18,26,31,33]).
To obtain a strong upper bound with surrogate relaxation, one must solve an auxiliary optimisation problem, called the surrogate dual. It was shown in 1986 that the dual can be solved exactly in pseudo-polynomial time [5]. In the five above-mentioned papers, however, the dual was solved only approximately. Indeed, up to now, no paper has presented computational results obtained by solving the surrogate dual exactly for benchmark MKP instances.
To address this gap in the literature, we present a simple algorithm for solving the dual exactly. The algorithm exploits the fact that excellent simplex-based LP solvers are now incorporated into many modern optimisation packages (such as CPLEX and Gurobi). We then test the algorithm on several families of benchmark MKP instances. It turns out that the algorithm can solve the dual in very reasonable computing times, even for large instances. Surprisingly, however, the resulting upper bound tends to be strong only for relatively small MKP instances. Indeed, for many of the larger instances, the upper bounds from continuous and surrogate relaxation turn out to be virtually identical.
For interest, we also present a simple "primal heuristic", which enables one to generate lower as well as upper bounds during our algorithm. The results from this heuristic are rather promising.
The paper has the following structure. The literature is reviewed in Section 2. The algorithm for the dual is described in Section 3, and the primal heuristic is given in Section 4. The computational results are in Section 5. Finally, some concluding remarks are made in Section 6.
Throughout the paper, we assume that the reader is familiar with the basics of linear, integer and dynamic programming (see [7,9]). We write "SR" and "SD" for surrogate relaxation and surrogate dual, respectively. We let N denote {1, . . . , n}. We assume that the p_j, b_i and a_{ij} are positive integers. Finally, given a vector v ∈ R^p_+, we let |v| denote v_1 + · · · + v_p.

Literature Review
We now review the literature. For brevity, we mention only papers of direct relevance.

The KP
When m = 1, the MKP reduces to the standard 0-1 knapsack problem (KP). Although the KP is itself NP-hard [24], it is often easy in practice (see, e.g., [27]). Moreover, the KP can be solved in pseudo-polynomial time via dynamic programming (DP). The original DP algorithm, due to Bellman [3], takes O(nb_1) time. By reversing the roles of the profits and weights in Bellman's algorithm, one can solve the KP in O(nP) time, where P is the profit of the optimal solution (see Section 2.3 of [27]).
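The profit-based recursion can be sketched as follows (a minimal Python illustration of the idea, not taken from the paper; `solve_kp_by_profit` is a name we introduce, and for simplicity we bound the profit range by the total profit rather than the optimal profit):

```python
def solve_kp_by_profit(profits, weights, capacity):
    """0-1 knapsack via DP over profit values.

    min_weight[q] = least total weight of a subset with profit exactly q.
    The answer is the largest q whose minimum weight fits the capacity.
    Running time is O(n * sum(profits)).
    """
    P = sum(profits)
    INF = float("inf")
    min_weight = [INF] * (P + 1)
    min_weight[0] = 0
    for p, w in zip(profits, weights):
        # iterate downwards so that each item is used at most once
        for q in range(P, p - 1, -1):
            if min_weight[q - p] + w < min_weight[q]:
                min_weight[q] = min_weight[q - p] + w
    return max(q for q in range(P + 1) if min_weight[q] <= capacity)
```

For example, with profits (6, 10, 12), weights (1, 2, 3) and capacity 5, the best solution takes the last two items, for profit 22.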

The MKP
The MKP is NP-hard in the strong sense [17], but solvable in pseudo-polynomial time for fixed m [27]. It has a natural formulation as a 0-1 LP. For all j ∈ N, define a binary variable x_j, taking the value 1 if and only if item j is selected. We then have:

\[
\max \Big\{\, \sum_{j \in N} p_j x_j \;:\; \sum_{j \in N} a_{ij} x_j \le b_i \ (i = 1, \dots, m),\ x \in \{0,1\}^n \Big\}. \tag{1}
\]

The continuous relaxation of the 0-1 LP is obtained by replacing the binary condition with the weaker condition x ∈ [0, 1]^n. The resulting LP can be solved extremely quickly in practice. We will let x* ∈ [0, 1]^n denote the LP solution, U_LP the corresponding upper bound, π ∈ Q^m_+ the vector of dual prices, and ρ ∈ Q^n_+ the vector of reduced costs. Several exact and heuristic algorithms for the MKP attempt to exploit information in the LP solution, by giving "priority" to variables with large x*_j and/or small ρ_j (e.g., [1,11,29,35]).

Surrogate relaxation and duality
Now consider an arbitrary 0-1 LP of the form (1), where negative coefficients are permitted. In SR, we pick a vector µ ∈ Q^m_+ of surrogate multipliers, and solve the following simpler 0-1 LP [19,20]:

\[
\max \Big\{\, \sum_{j \in N} p_j x_j \;:\; \sum_{i=1}^m \mu_i \Big( \sum_{j \in N} a_{ij} x_j \Big) \le \sum_{i=1}^m \mu_i b_i,\ x \in \{0,1\}^n \Big\}. \tag{2}
\]

We will let U(µ) denote the resulting upper bound. The SD is the problem of finding the vector µ that minimises U(µ). It is shown in [19,20] that U(µ) is a quasi-convex function of µ. This fact is used in various heuristic algorithms for the SD (e.g., [4,25,28,33]), most of which are variants of the subgradient method.
It has been proved that the SD can be solved in pseudo-polynomial time [5,12]. However, the proofs in [5,12] rely on the equivalence of separation and optimisation, which in turn relies on the ellipsoid method [21]. Given that the ellipsoid method is very slow in practice, the results in [5,12] are of theoretical interest only. Now recall the definition of π from the previous subsection. It was shown in [19,20] that U (π) ≤ U LP . Thus, in theory at least, the upper bound from the SD is at least as strong as the one from the LP.

Surrogate relaxation for the MKP
Observe that, if we apply SR to the MKP, all coefficients are non-negative in (2). Thus, in this case, (2) is a KP. For this reason, SR may be an attractive option for the MKP.
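To make the aggregation concrete, here is a minimal Python sketch (our own illustration, not from the paper; `surrogate_data` is a name we introduce) that forms the single surrogate constraint (µᵀA)x ≤ µᵀb from the m original constraints:

```python
def surrogate_data(A, b, mu):
    """Aggregate the constraints Ax <= b into the single surrogate
    constraint (mu^T A) x <= mu^T b, yielding a KP instance whose
    weights and capacity are returned."""
    m, n = len(A), len(A[0])
    w = [sum(mu[i] * A[i][j] for i in range(m)) for j in range(n)]
    cap = sum(mu[i] * b[i] for i in range(m))
    return w, cap
```

Since all a_{ij} and b_i are non-negative for the MKP, the resulting weights and capacity are non-negative, so the aggregated problem is a genuine KP.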
In the 1980s, several papers applied SR to the MKP (e.g., [16,18,26,31,33]). In those papers, the SD was solved heuristically, via iterative approaches. The resulting upper bounds were good, but the test instances were rather small by today's standards (see again [11,29]).
Crama & Mazzola [10] proved a negative result concerning the quality of the upper bounds that one can expect to obtain by applying SR to the MKP.

Finally, we remark that SR was applied to the MKP more recently, in [6,30]. In both papers, however, only very simple heuristics were used to select the multipliers. Moreover, in [30], SR was used to guide branching decisions in a branch-and-bound algorithm, rather than to improve the upper bound from the LP.

Solving the Surrogate Dual
As mentioned above, the exact algorithms in [5,12] for solving the SD are of theoretical interest only, since the ellipsoid method is very slow in practice. Given the fact that excellent simplex-based LP solvers are now available, we consider using a simplex-based method instead.
Observe that, for any given µ ∈ Q^m_+, the upper bound U(µ) will be integral. We then use the following result, proved in [12]: for any given θ ∈ Z_+, there exists a vector µ ∈ Q^m_+ such that U(µ) ≤ θ if and only if the following LP is feasible:

\[
\min \; |\mu| \tag{3}
\]
\[
\text{s.t.} \quad \sum_{i=1}^m \mu_i \Big( \sum_{j \in N} a_{ij} \bar{x}_j - b_i \Big) \;\ge\; 1 \qquad \forall\, \bar{x} \in \{0,1\}^n \text{ with } \sum_{j \in N} p_j \bar{x}_j \ge \theta + 1 \tag{4}
\]
\[
\mu \in \mathbb{R}^m_+. \tag{5}
\]

We call the LP (3)-(5) the "master" LP. Note that the number of constraints (4) is typically exponential in n. Thus, to solve the master LP (for a given θ), one must use a specialised algorithm. In [12], we used the ellipsoid method. Here, however, we propose to use a simple cutting-plane algorithm based on the dual simplex method. This is because, as mentioned in the introduction, several excellent simplex-based LP solvers are now available.
In each iteration of our cutting-plane algorithm, we solve a relaxation of the master LP that contains only a subset of the constraints (4). The solution to the relaxed master LP yields a "tentative" multiplier vector, say µ̄. One then solves a separation problem to check whether µ̄ satisfies all of the constraints (4) (see [21] for a thorough treatment of separation problems in convex and combinatorial optimisation).
Intuitively, the separation problem attempts to find a binary vector x̄ which has large profit, but satisfies the current "tentative" surrogate constraint. The separation problem can itself be formulated as the following 0-1 LP:

\[
\min \Big\{\, \sum_{i=1}^m \bar{\mu}_i \Big( \sum_{j \in N} a_{ij} x_j \Big) \;:\; \sum_{j \in N} p_j x_j \ge \theta + 1,\ x \in \{0,1\}^n \Big\}. \tag{6}
\]

The problem (6) is itself a KP (of minimisation type). It is not hard to show that it can be solved exactly in O(nθ) time. (This can be done using a modified version of the O(nP)-time algorithm for the standard KP, mentioned in Subsection 2.1.) Moreover, since the profit of the optimal MKP solution cannot exceed |p|, we can assume without loss of generality that θ ≤ |p|. Thus, we can solve (6) in O(n|p|) time. Note that this running time is pseudo-polynomial.
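A DP for a minimisation knapsack of this kind can be sketched in Python as follows (our own illustration, not the paper's implementation; `separate_min_kp` is a name we introduce). The states are profit values capped at θ + 1, which gives the O(nθ) running time:

```python
def separate_min_kp(agg_weights, profits, theta):
    """Minimisation KP: minimise the total aggregated weight of a
    binary vector whose profit is at least theta + 1.

    min_w[q] = least aggregated weight of a subset whose profit,
    capped at T = theta + 1, equals q. The answer is min_w[T];
    float('inf') means no subset reaches the required profit.
    """
    T = theta + 1
    INF = float("inf")
    min_w = [0.0] + [INF] * T
    for p, w in zip(profits, agg_weights):
        # descending scan: the target state t >= q has already been
        # visited, so each item contributes at most once
        for q in range(T, -1, -1):
            if min_w[q] < INF:
                t = min(q + p, T)
                if min_w[q] + w < min_w[t]:
                    min_w[t] = min_w[q] + w
    return min_w[T]
```

In the cutting-plane context, one would compare the returned minimum against the right-hand side of the tentative surrogate constraint to decide whether a violated inequality exists.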
The above-mentioned cutting-plane algorithm takes a given value of θ ∈ Z_+ as input, but we do not know the optimal value of θ in advance. Thus, we perform a binary search on θ, solving the LP (3)-(5) in each major iteration. Now, let µ* denote the (as yet unknown) optimal solution to the SD, and let θ* = U(µ*) denote the corresponding upper bound (also unknown) for the MKP. For the binary search, we need initial lower and upper bounds on θ*. We will call these ℓ and u. To obtain ℓ, we simply run a greedy heuristic for the MKP (inserting items in non-increasing order of profit), and then set ℓ to the profit of the resulting MKP solution. As for u, we simply use U_LP.
The overall approach is described in Algorithm 1. Note that, towards the end of the outer "while" loop, we set θ to 0.9u + 0.1ℓ rather than to (u + ℓ)/2. This is because, in our experience, θ* tends to be closer to u than to ℓ.
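The binary-search driver can be sketched abstractly (our own skeleton, not the paper's code; `master_feasible(theta)` stands for the cutting-plane feasibility test of the master LP at a given θ, and feasibility is monotone: if it holds at θ, it holds at θ + 1):

```python
def binary_search_sd(lo, hi, master_feasible):
    """Return the smallest theta in [lo, hi] with
    master_feasible(theta) True, i.e. the surrogate bound theta*.
    Uses the biased split 0.9*u + 0.1*l, clamped so that the
    interval shrinks in every iteration."""
    l, u = lo, hi
    while u != l:
        theta = max(l, min(u - 1, int(0.9 * u + 0.1 * l)))
        if master_feasible(theta):
            u = theta        # a multiplier vector certifies U(mu) <= theta
        else:
            l = theta + 1    # the master LP is infeasible: theta* > theta
    return u
```

The clamping to [l, u − 1] guarantees termination regardless of the split point chosen.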
We remark that the algorithm as stated returns only the upper bound θ * . One can easily modify the algorithm so that it also returns the optimal multiplier vector µ * , and/or the associated x vector. We omit details for brevity.

Primal Heuristic
Observe that, each time we solve the separation problem (6), we obtain a vector x̄ ∈ {0, 1}^n. Typically, x̄ is not a feasible solution to the MKP. For interest, we designed a simple primal heuristic, which attempts to "repair" x̄, to make it feasible for the MKP.
Before running the heuristic for the first time, we sort the items in order of "attractiveness" and create a sorted list. In more detail, when we solve the LP relaxation at the start of Algorithm 1, we store the vectors x* and ρ. We then sort the items in N according to two criteria. First, we sort them in non-increasing order of x*_j value. Then, if there are any items with x*_j = 0, we sort those in non-decreasing order of ρ_j. We call the sorted list "SL".

Algorithm 1: Solving the Surrogate Dual of the MKP
input : positive integers m, n; non-negative integer matrix A; positive integer vectors p and b.
output: upper bound θ* for the MKP.
// initialisation
Solve the LP relaxation of the MKP and set u to U_LP;
Run a greedy heuristic for the MKP to get initial lower bound ℓ;
Set up initial "master" LP min{|µ| : |µ| ≥ 1, µ ∈ R^m_+};
Solve the master LP via primal simplex;
Set θ to u;
// binary search
while u ≠ ℓ do
    // cutting-plane algorithm
    while the master LP is feasible do
        Let µ̄ be the current solution to the master LP;
        Solve the 0-1 LP (6) and let x̄ be its optimal solution;
        if µ̄ violates the constraint (4) corresponding to x̄ then
            Add that constraint to the master LP and re-solve it via dual simplex;
        else
            Set u to θ and exit the inner loop;
        end
    end
    if the master LP became infeasible then set ℓ to θ + 1;
    Set θ to 0.9u + 0.1ℓ;
end
return u;
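The construction of SL can be sketched as follows (our own Python illustration; `build_sorted_list` is a name we introduce):

```python
def build_sorted_list(x_star, rho):
    """Sort items by 'attractiveness': non-increasing x*_j first;
    among the items with x*_j == 0, ties are broken by
    non-decreasing reduced cost rho_j."""
    n = len(x_star)
    return sorted(
        range(n),
        key=lambda j: (-x_star[j], rho[j] if x_star[j] == 0 else 0.0),
    )
```

For example, with x* = (1.0, 0.0, 0.5, 0.0) and ρ = (0, 3, 0, 1), the list is [0, 2, 3, 1]: the fractional items come first, then the zero items ordered by reduced cost.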
Our primal heuristic is described in Algorithm 2. The idea is that, instead of relying entirely on the list SL, we give priority to items that have x̄_j = 1. Since the separation problem is solved many times during Algorithm 1, this gives our primal heuristic several chances to find a good solution. Of course, the best primal solution found so far is stored in memory.

Algorithm 2: Primal Heuristic
input : positive integers m, n; positive integer vectors p and b; non-negative integer matrix A; sorted list SL; current solution x̄ of the subproblem (6).
output: feasible MKP solution S ⊆ N.
Set S to ∅;
Let N_1 be the set of items for which x̄_j = 1, and let N_2 = N \ N_1;
for k = 1, 2 do
    for t = 1, . . . , |N_k| do
        Let j be the t-th item in SL that belongs to N_k;
        if item j fits into the knapsack then
            Place item j into S;
        end
    end
end

We remark that, if Algorithm 2 obtains a solution whose profit is larger than ℓ, we can update ℓ in Algorithm 1. This can be accomplished by adding the following line to Algorithm 1, immediately after the cutting-plane phase: "if the primal heuristic has found a new best MKP solution, then set ℓ to the profit of that solution". We found, however, that the performance of Algorithm 2 is already satisfactory in practice, making this change unnecessary.
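A Python sketch of this repair idea follows (our own illustration, not the paper's code; `repair` and its argument names are ours). Items with x̄_j = 1 are tried first in SL order, then the remaining items:

```python
def repair(x_bar, SL, A, b):
    """Greedily repair an (infeasible) 0-1 vector x_bar:
    insert items in SL order, giving priority to those with
    x_bar[j] == 1, whenever the item still fits all m resources."""
    m = len(A)
    used = [0] * m
    S = []
    tiers = ([j for j in SL if x_bar[j] == 1],
             [j for j in SL if x_bar[j] == 0])
    for tier in tiers:
        for j in tier:
            if all(used[i] + A[i][j] <= b[i] for i in range(m)):
                for i in range(m):
                    used[i] += A[i][j]
                S.append(j)
    return S
```

With N_1 taken to be all of N (i.e. x_bar identically 1), this reduces to the purely LP-guided insertion order mentioned in Section 5.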

Computational Results
In this section, we present our computational results. Our algorithm was implemented in C++ and compiled with Microsoft Visual Studio (v. 15.0) in Windows 10. To solve the master LPs, we used the CPLEX (v. 12.7.1) dual simplex solver, under default settings. All experiments were conducted on a 2.80GHz, 4-core Intel i7-7700HQ laptop, with 16GB RAM.

Test instances
Several hundred benchmark MKP instances are available at the OR-Library [2]. Most of those, however, are very small by modern standards. Of the remaining instances, we have selected the following three sets:

• Instances proposed in 1967 by Weingartner & Ness [36]. All of these instances have m = 2. There are six instances with n = 28 and two with n = 105, making 8 instances in total. We will call these 'WN' instances.
• Instances proposed in 1979 by Shi [34]. All of these instances have m = 5. There are five instances with n = 30 and five with n = 90. For each value of n ∈ {40, 50, . . . , 80}, there are only four instances. This makes 30 instances in total. We will call these 'Shi' instances.

• Instances proposed by Chu & Beasley, with n ∈ {100, 250, 500} and m ∈ {5, 10, 30}. We will call these 'CB' instances.
The optimal values for the WN and Shi instances are available in the OR-Library. For the CB instances, the situation is more complicated. For the instances with n = 100 and/or m ≤ 10, the optimal values are known [29]. The remaining 60 instances, with n ∈ {250, 500} and m = 30, remain unsolved. The best known lower bounds for those instances can be found in [29].

Results from LP and SD

Table 1 presents a summary of the results obtained with our upper-bounding procedures. The first three columns give the source of the instances, the number of items n and the number of constraints m. The columns headed "UB %gap" show the average gap between the LP and SD upper bounds and the optimum, expressed as a percentage of the optimum. If a number is in parentheses, it is because the optimal values are not known for the corresponding instances. (For those instances, the true gaps could be smaller, but cannot be larger.) The next two columns show the average number of major iterations in the binary search procedure, and the average number of times the separation problem had to be solved, respectively. Finally, the last column shows the average time taken by our SD algorithm, in seconds. (The time taken to solve the LP was negligible, being less than 0.1 seconds in every case.) Each entry in the last five columns is an average taken over the instances of the given size. The full results will be made available at the Lancaster University Data Repository.

We see that, for the WN and Shi instances, the SD bound is much stronger than the LP bound. For the CB instances, however, the improvement is small. In fact, closer inspection of the output revealed that, for many of the larger CB instances, the SD bound was equal to the LP bound rounded down to the nearest integer. Thus, for large-scale random instances, one may as well just solve the continuous relaxation, without bothering to use SR.
Another observation is that both integrality gaps tend to decrease as n increases, but increase as m increases. The effort required to solve the SD, however, seems to be an increasing function of both n and m.
On the positive side, the running times of our exact algorithm are reasonable, being measured in seconds rather than minutes. We remark that some additional implementation "tricks" could potentially improve the speed of our algorithm. For example, one could begin by running a local-search heuristic for the MKP, in order to obtain a better initial lower bound ℓ. One could also use a heuristic to solve the separation problem (6), and only use an exact separation algorithm when the heuristic fails.

Results from primal heuristics
The results obtained with the primal heuristics are summarised in Table 2. Each entry in the last three columns shows the average gap between a lower bound and the optimum, expressed as a percentage of the optimum. The column "Gr" corresponds to the greedy heuristic, in which items are inserted in non-increasing order of profit. The column "LP" corresponds to the case in which items are first inserted in non-increasing order of x*_j, and then in non-decreasing order of ρ_j. (This is equivalent to running Algorithm 2 with N_1 set to N.) The column "SD" corresponds to our primal heuristic.
We do not report average running times in the table, because the time taken by all primal heuristics was negligible compared to the time taken to solve the SD.
We see that the LP-based primal heuristic performs much better than the greedy heuristic in almost all cases, and our SD-based primal heuristic performs much better still. Of course, one could attempt to improve the lower bounds further using local search or some other meta-heuristic. This is however beyond the scope of this paper.

Conclusion
Although surrogate relaxation has been applied many times to the MKP, we are the first to solve the surrogate dual exactly for instances of meaningful size. To do this, we used an approach based on the dual simplex method, rather than one of the classical heuristics, which are all essentially variations of the subgradient method.
The results indicate that one can solve the dual quite quickly in practice, using a modern implementation of the dual simplex method to solve the intermediate linear programs. Moreover, the upper bound from the surrogate dual was very strong for the instances with no more than 100 items. For the larger instances, however, the upper bound from the surrogate dual was often no better than the standard linear programming bound.
Finally, our simple primal heuristic, based on "repairing" the infeasible solutions found during the course of our algorithm, produced solutions of very good quality. This gives us hope that surrogate relaxation could be a useful tool to "drive" heuristics for various combinatorial optimisation problems. We explore this topic in a follow-up paper [13].