Heuristics with Performance Guarantees for the Minimum Number of Matches Problem in Heat Recovery Network Design

Heat exchanger network synthesis exploits excess heat by integrating process hot and cold streams and improves energy efficiency by reducing utility usage. Determining the minimum number of matches is the bottleneck of designing a heat recovery network using the sequential method. This subproblem is an NP-hard mixed-integer linear program exhibiting combinatorial explosion in the possible hot and cold stream configurations. We explore this challenging optimization problem from a graph theoretic perspective and correlate it with other special optimization problems such as cost flow network and packing problems. In the case of a single temperature interval, we develop a new optimization formulation without problematic big-M parameters. We develop heuristic methods with performance guarantees using three approaches: (i) relaxation rounding, (ii) water filling, and (iii) greedy packing. Numerical results from a collection of 48 instances substantiate the strength of the methods.


Introduction
Heat exchanger network synthesis (HENS) minimizes cost and improves energy recovery in chemical processes (Biegler et al. 1997, Smith 2000, Elia et al. 2010, Baliban et al. 2012). HENS exploits excess heat by integrating process hot and cold streams and improves energy efficiency by reducing utility usage (Floudas and Grossmann 1987, Gundersen and Naess 1988, Furman and Sahinidis 2002, Escobar and Trierweiler 2013). Floudas et al. (2012) review the critical role of heat integration for energy systems producing liquid transportation fuels (Niziolek et al. 2015). Other important applications of HENS include: refrigeration for the minimum number of matches problem. These methods have guaranteed solution quality and efficient run-time bounds. In the sequential method, many possible stream configurations are required to evaluate the minimum overall cost (Floudas 1995), so a complementary contribution of this work is a heuristic methodology for producing multiple solutions efficiently. We classify the heuristics based on their algorithmic nature into three categories: (i) relaxation rounding, (ii) water filling, and (iii) greedy packing.
The manuscript proceeds as follows: Section 2 formally defines the minimum number of matches problem and discusses mathematical models. Section 3 discusses computational complexity and introduces a new NP-hardness reduction of the minimum number of matches problem from bin packing. Section 4 focuses on the single temperature interval problem. Section 5 explores computing the maximum heat exchanged between the streams with match restrictions. Sections 6-8 present our heuristics for the minimum number of matches problem based on: (i) relaxation rounding, (ii) water filling, and (iii) greedy packing, respectively, as well as new theoretical performance guarantees. Section 9 evaluates the heuristics experimentally and discusses numerical results. Sections 10 and 11 discuss the manuscript contributions and conclude the paper.

Minimum Number of Matches for Heat Exchanger Network Synthesis
This section defines the minimum number of matches problem and presents the standard transportation and transshipment MILP models. Table 1 contains the notation.
Table 1: Notation.
Name: Description
$b$: Bin (single temperature interval problem)
$h_{\max}$: Maximum heat among all hot streams ($h_{\max} = \max_{i \in H} \{h_i\}$)
$c_j$: Total heat demanded by cold stream $j$ ($c_j = \sum_{t \in T} \delta_{j,t}$)
$\sigma_{i,s}$: Heat supply of hot stream $i$ in interval $s$
$\delta_{j,t}$: Heat demand of cold stream $j$ in interval $t$
$\vec{\sigma}, \vec{\delta}$: Vectors of all heat supplies, demands
$\vec{\sigma}_t, \vec{\delta}_t$: Vectors of all heat supplies, demands in temperature interval $t$
$R_t$: Residual heat exiting temperature interval $t$
$U_{i,j}$: Upper bound (big-M parameter) on the heat exchanged via match $(i, j)$
$\lambda_{i,j}$: Fractional cost approximation of match $(i, j)$ (Lagrangian relaxation)
$\vec{\lambda}$: Vector of all fractional cost approximations $\lambda_{i,j}$
Variables
$y_{i,j}$: Binary variable indicating whether $i$ and $j$ are matched
$q_{i,j,t}$: Heat of hot stream $i$ received by cold stream $j$ in interval $t$
$q_{i,s,j,t}$: Heat exported by hot stream $i$ in $s$ and received by cold stream $j$ in $t$
$\vec{y}, \vec{q}$: Vectors of binary, continuous variables
$r_{i,s}$: Heat residual of hot stream $i$ exiting $s$
$x_b$: Binary variable indicating whether bin $b$ is used
$w_{i,b}$: Binary variable indicating whether hot stream $i$ is placed in bin $b$
$z_{j,b}$: Binary variable indicating whether cold stream $j$ is placed in bin $b$
Other
$N$: Minimum cost flow network
$G$: Solution graph (single temperature interval problem)
$\phi(M)$: Filling ratio of a set $M$ of matches
$\vec{y}^f, \vec{q}^f$: Optimal fractional solution
$\alpha_i, \beta_j$: Number of matches of hot stream $i$, cold stream $j$
$L_{i,j}$: Heat exchanged from hot stream $i$ to cold stream $j$
$I$: Instance of the problem
$r$: Remaining heat of an algorithm

Problem Definition
The minimum number of matches problem posits a set of hot process streams to be cooled and a set of cold process streams to be heated. Each stream is associated with an initial and a target temperature. This set of temperatures defines a collection of temperature intervals.
Each hot stream exports (or supplies) heat in each temperature interval between its initial and target temperatures. Similarly, each cold stream receives (or demands) heat in each temperature interval between its initial and target temperatures. Appendix F formally defines the temperature range partitioning. Heat may flow from a hot to a cold stream in the same or a lower temperature interval, but not in a higher one. In each temperature interval, the residual heat descends to lower temperature intervals. A zero heat residual is a pinch point. A pinch point restricts the maximum energy integration and divides the network into subnetworks. A problem instance consists of a set $H = \{1, 2, \ldots, n\}$ of hot streams, a set $C = \{1, 2, \ldots, m\}$ of cold streams, and a set $T = \{1, 2, \ldots, k\}$ of temperature intervals. Hot stream $i \in H$ has heat supply $\sigma_{i,s}$ in temperature interval $s \in T$ and cold stream $j \in C$ has heat demand $\delta_{j,t}$ in temperature interval $t \in T$. Heat conservation is satisfied, i.e. $\sum_{i \in H} \sum_{s \in T} \sigma_{i,s} = \sum_{j \in C} \sum_{t \in T} \delta_{j,t}$. We denote by $h_i = \sum_{s \in T} \sigma_{i,s}$ the total heat supply of hot stream $i \in H$ and by $c_j = \sum_{t \in T} \delta_{j,t}$ the total heat demand of cold stream $j \in C$.
A feasible solution specifies a way to transfer the hot streams' heat supply to the cold streams, i.e. an amount $q_{i,s,j,t}$ of heat exchanged between hot stream $i \in H$ in temperature interval $s \in T$ and cold stream $j \in C$ in temperature interval $t \in T$. Heat may only flow to the same or a lower temperature interval, i.e. $q_{i,s,j,t} = 0$, for each $i \in H$, $j \in C$ and $s, t \in T$ such that $s > t$. A hot stream $i \in H$ and a cold stream $j \in C$ are matched if there is a positive amount of heat exchanged between them, i.e. $\sum_{s,t \in T} q_{i,s,j,t} > 0$. The objective is to find a feasible solution minimizing the number of matches $(i, j)$.
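To make the definition concrete, the following minimal Python sketch checks a candidate solution against the conditions above and returns its set of matches. The function name `matches_of`, the list-of-lists layout for the supplies and demands, and the dictionary keyed by `(i, s, j, t)` are illustrative assumptions, not notation from the models; intervals are indexed from 0 with 0 the hottest.

```python
from collections import defaultdict

def matches_of(sigma, delta, q, tol=1e-9):
    """Return the set of matches (i, j) used by q, after checking feasibility.

    sigma[i][s] / delta[j][t]: heat supply / demand per temperature interval
    (interval 0 is the hottest). q[(i, s, j, t)]: heat sent from hot stream i
    in interval s to cold stream j in interval t. Layout is an assumption.
    """
    sent = defaultdict(float)      # heat leaving supply node (i, s)
    received = defaultdict(float)  # heat entering demand node (j, t)
    load = defaultdict(float)      # total heat carried by match (i, j)
    for (i, s, j, t), heat in q.items():
        if heat < -tol:
            raise ValueError("negative heat exchange")
        if heat > tol and s > t:
            raise ValueError("heat cannot flow to a hotter interval")
        sent[i, s] += heat
        received[j, t] += heat
        load[i, j] += heat
    # Heat conservation: every supply is exported, every demand is covered.
    for i, row in enumerate(sigma):
        for s, supply in enumerate(row):
            if abs(sent[i, s] - supply) > tol:
                raise ValueError(f"supply of hot stream {i} in interval {s} violated")
    for j, row in enumerate(delta):
        for t, demand in enumerate(row):
            if abs(received[j, t] - demand) > tol:
                raise ValueError(f"demand of cold stream {j} in interval {t} violated")
    # A pair is matched when it exchanges a positive amount of heat.
    return {match for match, heat in load.items() if heat > tol}
```

Two feasible solutions of the same instance can differ in their number of matches, which is exactly the quantity the problem minimizes.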

Mathematical Models
The transportation and transshipment models formulate the minimum number of matches as a mixed-integer linear program (MILP).
Transportation Model (Cerda and Westerberg 1983). As illustrated in Figure 1a, the transportation model represents heat as a commodity transported from supply nodes to destination nodes. For each hot stream $i \in H$, there is a set of supply nodes, one for each temperature interval $s \in T$ with $\sigma_{i,s} > 0$. For each cold stream $j \in C$, there is a set of demand nodes, one for each temperature interval $t \in T$ with $\delta_{j,t} > 0$. There is an arc between the supply node $(i, s)$ and the destination node $(j, t)$ if $s \le t$, for each $i \in H$, $j \in C$ and $s, t \in T$.
In the MILP formulation, variable $q_{i,s,j,t}$ specifies the heat transferred from hot stream $i \in H$ in temperature interval $s \in T$ to cold stream $j \in C$ in temperature interval $t \in T$. Binary variable $y_{i,j}$ indicates whether streams $i \in H$ and $j \in C$ are matched or not. Parameter $U_{i,j}$ is a big-M parameter bounding the amount of heat exchanged between every pair of hot stream $i \in H$ and cold stream $j \in C$, e.g. $U_{i,j} = \min\{h_i, c_j\}$. The problem is formulated:
\begin{align}
\min\ & \sum_{i \in H} \sum_{j \in C} y_{i,j} && \tag{1} \\
& \sum_{j \in C} \sum_{t \in T} q_{i,s,j,t} = \sigma_{i,s} && i \in H,\ s \in T \tag{2} \\
& \sum_{i \in H} \sum_{s \in T} q_{i,s,j,t} = \delta_{j,t} && j \in C,\ t \in T \tag{3} \\
& \sum_{s,t \in T} q_{i,s,j,t} \le U_{i,j} \cdot y_{i,j} && i \in H,\ j \in C \tag{4} \\
& q_{i,s,j,t} = 0 && i \in H,\ j \in C,\ s, t \in T : s > t \tag{5} \\
& y_{i,j} \in \{0, 1\},\ q_{i,s,j,t} \ge 0 && i \in H,\ j \in C,\ s, t \in T \tag{6}
\end{align}
Expression (1), the objective function, minimizes the number of matches. Equations (2) and (3) ensure heat conservation. Equations (4) enforce a match between a hot and a cold stream if they exchange a positive amount of heat. Equations (4) are big-M constraints. Equations (5) ensure that no heat flows to a hotter temperature interval.
Figure 1: In the transportation model (Cerda and Westerberg 1983), each hot stream $i$ supplies $\sigma_{i,t}$ units of heat in temperature interval $t$ which can be received, in the same or a lower temperature interval, by a cold stream $j$ which demands $\delta_{j,t}$ units of heat in $t$. In the transshipment model (Papoulias and Grossmann 1983), there are also intermediate nodes transferring residual heat to a lower temperature interval. This figure is adapted from Furman and Sahinidis (2004).
Transshipment Model (Papoulias and Grossmann 1983). As illustrated in Figure 1b, the transshipment formulation transfers heat from hot streams to cold streams via intermediate transshipment nodes. In each temperature interval, the heat entering a transshipment node either transfers to a cold stream in the same temperature interval or it descends to the transshipment node of the subsequent temperature interval as residual heat.
Binary variable $y_{i,j}$ is 1 if hot stream $i \in H$ is matched with cold stream $j \in C$ and 0 otherwise. Variable $q_{i,j,t}$ represents the heat received by cold stream $j \in C$ in temperature interval $t \in T$ originally exported by hot stream $i \in H$. Variable $r_{i,s}$ represents the residual heat of hot stream $i \in H$ that descends from temperature interval $s$ to temperature interval $s + 1$. Parameter $U_{i,j}$ is a big-M parameter bounding the heat exchanged between hot stream $i \in H$ and cold stream $j \in C$, e.g. $U_{i,j} = \min\{h_i, c_j\}$. In the resulting formulation, Expression (7) minimizes the number of matches, Equations (8)-(10) enforce heat conservation, and Equation (11) allows positive heat exchange between hot stream $i \in H$ and cold stream $j \in C$ only if $(i, j)$ are matched.
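For reference, the variable definitions above lead to a formulation of the following shape. This is a sketch written from the standard transshipment model and the roles described in the text, not a verbatim copy of the numbered display; the convention $r_{i,0} = 0$ and the exact form of the domain constraints are assumptions. The rows are, in order: the objective, the heat balance of each hot stream in each interval, the requirement that no residual leaves the last interval, the demand of each cold stream, the big-M constraints, and the variable domains.

```latex
\begin{align*}
\min\ & \sum_{i \in H} \sum_{j \in C} y_{i,j} \\
\text{s.t.}\ & \sum_{j \in C} q_{i,j,s} + r_{i,s} = \sigma_{i,s} + r_{i,s-1}
  && i \in H,\ s \in T \quad (r_{i,0} = 0) \\
& r_{i,k} = 0 && i \in H \\
& \sum_{i \in H} q_{i,j,t} = \delta_{j,t} && j \in C,\ t \in T \\
& \sum_{t \in T} q_{i,j,t} \le U_{i,j} \cdot y_{i,j} && i \in H,\ j \in C \\
& q_{i,j,t} \ge 0,\ r_{i,s} \ge 0,\ y_{i,j} \in \{0, 1\}
  && i \in H,\ j \in C,\ s, t \in T
\end{align*}
```

Compared with the transportation model, the four-index variables $q_{i,s,j,t}$ are replaced by three-index variables plus residuals, which reduces the number of continuous variables from $O(nmk^2)$ to $O(nmk)$.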

Computational Complexity
We briefly introduce NP-completeness and basic computational complexity classes (Arora and Barak 2009, Papadimitriou 1994). A polynomial algorithm produces a solution for a computational problem with a running time polynomial in the size of the problem instance. There exist problems which admit a polynomial-time algorithm and others which do not.
There is also the class of NP-complete problems for which we do not know whether they admit a polynomial algorithm or not. The question of whether NP-complete problems admit a polynomial algorithm is known as the P = NP question. In general, it is conjectured that P ≠ NP, i.e. NP-complete problems are not solvable in polynomial time. An optimization problem is NP-hard if its decision version is NP-complete. A computational problem is strongly NP-hard if it remains NP-hard when all parameters are bounded by a polynomial in the size of the instance.
The minimum number of matches problem is known to be strongly NP-hard, even in the special case of a single temperature interval. Furman and Sahinidis (2004) propose an NP-hardness reduction from the well-known 3-Partition problem, i.e. they show that the minimum number of matches problem is at least as hard as the 3-Partition problem.

Figure 2: Analysis of an Approximation Algorithm
Appendix A presents an alternative NP-hardness reduction from the bin packing problem. This alternative setting of the minimum number of matches problem gives new insight into the packing nature of the problem. A major contribution of this paper is to design efficient, greedy heuristics motivated by packing.
Theorem 1. There exists an NP-hardness reduction from bin packing to the minimum number of matches problem with a single temperature interval.
Proof: See Appendix A.

Approximation Algorithms
A heuristic with a performance guarantee is usually called an approximation algorithm (Vazirani 2001, Williamson and Shmoys 2011). Unless P = NP, there is no polynomial algorithm solving an NP-hard problem. An approximation algorithm is a polynomial algorithm producing a near-optimal solution to an optimization problem. Formally, consider an optimization problem, without loss of generality minimization, and a polynomial Algorithm A for solving it (not necessarily to global optimality). For each problem instance $I$, let $C_A(I)$ and $C_{OPT}(I)$ be the algorithm's objective value and the optimal objective value, respectively. Algorithm A is $\rho$-approximate if, for every problem instance $I$, it holds that:
\[ C_A(I) \le \rho \cdot C_{OPT}(I). \]
That is, a $\rho$-approximation algorithm computes, in polynomial time, a solution with an objective value at most $\rho$ times the optimal objective value. The value $\rho$ is the approximation ratio of Algorithm A. To prove a $\rho$-approximation ratio, we proceed as depicted in Figure 2.
For each problem instance, we compute analytically a lower bound $C_{LB}(I)$ of the optimal objective value, i.e. $C_{LB}(I) \le C_{OPT}(I)$, and we show that the algorithm's objective value is at most $\rho$ times the lower bound, i.e. $C_A(I) \le \rho \cdot C_{LB}(I)$. The ratio of a $\rho$-approximation algorithm is tight if the algorithm is not $(\rho - \epsilon)$-approximate for any $\epsilon > 0$. An algorithm is $O(f(n))$-approximate and $\Omega(f(n))$-approximate, where $f(n)$ is a function of an input parameter $n$, if the algorithm does not have an approximation ratio asymptotically higher and lower, respectively, than $f(n)$.
Approximation algorithms have been developed for two problem classes relevant to process systems engineering: heat exchanger networks (Furman and Sahinidis 2004) and pooling (Dey and Gupte 2015).

Single Temperature Interval Problem
This section proposes efficient algorithms for the single temperature interval problem.
Using graph theoretic properties, we obtain: (i) a novel, efficiently solvable MILP formulation without big-M constraints and (ii) an improved 3/2-approximation algorithm.
In the single temperature interval problem, a feasible solution can be represented as a bipartite graph $G = (H \cup C, M)$ in which there is a node for each hot stream $i \in H$, a node for each cold stream $j \in C$ and the set $M \subseteq H \times C$ specifies the matches. Appendix B shows the existence of an optimal solution whose graph $G$ does not contain any cycle. A connected graph without cycles is a tree, so $G$ is a forest consisting of trees. Appendix B also shows that the number $v$ of edges in $G$, i.e. the number of matches, is related to the number $\ell$ of trees by the equality $v = n + m - \ell$. Since $n$ and $m$ are input parameters, minimizing the number of matches in a single temperature interval is equivalent to finding a solution whose graph consists of a maximum number of trees.
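The relation between matches and trees can be explored on tiny instances with the exhaustive Python sketch below. It assumes strictly positive integer heat loads and uses the fact that each tree is a group of streams whose hot and cold heat balance; the bitmask dynamic program and the function names are illustrative choices made here, and the running time is exponential in $n + m$, so this is only a checking tool and not one of the heuristics.

```python
def max_trees(hot, cold):
    """Largest number of groups covering all streams such that hot and cold
    heat balance inside every group (positive integer heats, tiny instances)."""
    values = list(hot) + [-c for c in cold]   # cold streams count negatively
    if sum(values) != 0:
        raise ValueError("heat conservation violated")
    size = len(values)
    total = [0] * (1 << size)   # total[mask]: signed heat of the streams in mask
    best = [0] * (1 << size)    # best[mask]: most balanced groups along a chain
    for mask in range(1, 1 << size):
        low = (mask & -mask).bit_length() - 1
        total[mask] = total[mask & (mask - 1)] + values[low]
        # Remove one stream at a time; count this subset if it is balanced.
        prev = max(best[mask ^ (1 << b)] for b in range(size) if (mask >> b) & 1)
        best[mask] = prev + (1 if total[mask] == 0 else 0)
    return best[-1]

def min_matches_single_interval(hot, cold):
    """Apply v = n + m - l with l the maximum number of trees."""
    return len(hot) + len(cold) - max_trees(hot, cold)
```

For instance, hot loads (3, 2) and cold loads (3, 2) split into two balanced groups, so two matches suffice, whereas hot loads (4, 4) against cold loads (3, 5) admit only one group and need three matches.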

Novel MILP Formulation
We propose a novel MILP formulation for the single temperature interval problem. In an optimal solution without cycles, there can be at most min{n, m} trees. From a packing perspective, we assume that there are min{n, m} available bins and each stream is placed into exactly one bin. If a bin is non-empty, then its content corresponds to a tree of the graph. The objective is to find a feasible solution with a maximum number of bins.
To formulate the problem as an MILP, we define the set $B = \{1, 2, \ldots, \min\{n, m\}\}$ of available bins. Binary variable $x_b$ is 0 if bin $b \in B$ is empty and 1 otherwise. A binary variable $w_{i,b}$ indicates whether hot stream $i \in H$ is placed into bin $b \in B$. Similarly, a binary variable $z_{j,b}$ specifies whether cold stream $j \in C$ is placed into bin $b \in B$. In the resulting formulation of the minimum number of matches problem, Expression (13), the objective function, maximizes the number of bins. Equations (14) and (15) ensure that a bin is used if there is at least one stream in it. Equations (16) and (17) enforce that each stream is assigned to exactly one bin.
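A formulation consistent with this description can be sketched as follows. The first five rows follow the roles stated above (objective, bin usage, assignment). The last two rows are assumptions added here because the surrounding text does not describe them: a per-bin heat balance, which is what makes each non-empty bin a self-contained tree, and the binary domains.

```latex
\begin{align*}
\max\ & \sum_{b \in B} x_b \\
\text{s.t.}\ & x_b \le \sum_{i \in H} w_{i,b} && b \in B \\
& x_b \le \sum_{j \in C} z_{j,b} && b \in B \\
& \sum_{b \in B} w_{i,b} = 1 && i \in H \\
& \sum_{b \in B} z_{j,b} = 1 && j \in C \\
& \sum_{i \in H} w_{i,b} \cdot h_i = \sum_{j \in C} z_{j,b} \cdot c_j && b \in B \\
& x_b,\ w_{i,b},\ z_{j,b} \in \{0, 1\} && b \in B,\ i \in H,\ j \in C
\end{align*}
```

No constraint multiplies a binary variable by a large constant, which is the sense in which the formulation avoids big-M parameters.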

Improved Approximation Algorithm
Furman and Sahinidis (2004) propose a greedy 2-approximation algorithm for the minimum number of matches problem in a single temperature interval. We show that their analysis is tight. We also propose an improved, tight 1.5-approximation algorithm by prioritizing matches with equal heat loads and exploiting graph theoretic properties.
The simple greedy (SG) algorithm considers the hot and the cold streams in nonincreasing heat load order (Furman and Sahinidis 2004). Initially, the first hot stream is matched to the first cold stream and an amount $\min\{h_1, c_1\}$ of heat is transferred between them. Without loss of generality $h_1 > c_1$, which implies that an amount $h_1 - c_1$ of heat load remains to be transferred from $h_1$ to the remaining cold streams. Subsequently, the algorithm matches $h_1$ to $c_2$, by transferring $\min\{h_1 - c_1, c_2\}$ heat. The same procedure repeats with the other streams until all remaining heat load is transferred. Furman and Sahinidis (2004) show that Algorithm SG is 2-approximate for one temperature interval. Our new result in Theorem 2 shows that this ratio is tight.
Theorem 2. Algorithm SG achieves an approximation ratio of 2 for the single temperature interval problem and it is tight.
Proof: See Appendix B.
Algorithm 1 Simple Greedy (SG), developed by Furman and Sahinidis (2004), is applicable to one temperature interval only.
  Sort the streams so that $h_1 \ge h_2 \ge \ldots \ge h_n$ and $c_1 \ge c_2 \ge \ldots \ge c_m$.
  Set $i = 1$ and $j = 1$.
  while there is remaining heat load to be transferred do
    Transfer $q_{i,j} = \min\{h_i, c_j\}$
    Set $h_i = h_i - q_{i,j}$ and $c_j = c_j - q_{i,j}$
    if $h_i = 0$, then set $i = i + 1$
    if $c_j = 0$, then set $j = j + 1$
  end while
Algorithm 2 Improved Greedy (IG) is applicable to one temperature interval only.
  for each pair of hot stream $i$ and cold stream $j$ such that $h_i = c_j$ do
    Transfer $h_i$ amount of heat load (also equal to $c_j$) between them and remove them.
  end for
  Run Algorithm SG with respect to the remaining streams.
Algorithm IG improves Algorithm SG by: (i) matching the pairs of hot and cold streams with equal heat loads and (ii) using the acyclic property in the graph representation of an optimal solution.
Theorem 3. Algorithm IG achieves an approximation ratio of 1.5 for the single temperature interval problem and it is tight.
Proof: See Appendix B.
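The two greedy procedures described above can be sketched in Python as follows. The sketch assumes integer heat loads satisfying heat conservation, so that the tests for zero remaining heat are exact; the function names and the `(hot index, cold index, heat)` output format are illustrative assumptions.

```python
def simple_greedy(hot, cold):
    """Sketch of Algorithm SG: scan both streams in nonincreasing heat order."""
    hot_order = sorted(range(len(hot)), key=lambda i: -hot[i])
    cold_order = sorted(range(len(cold)), key=lambda j: -cold[j])
    h = [hot[i] for i in hot_order]     # remaining supply, sorted
    c = [cold[j] for j in cold_order]   # remaining demand, sorted
    matches, a, b = [], 0, 0
    while a < len(h) and b < len(c):
        heat = min(h[a], c[b])
        if heat > 0:
            matches.append((hot_order[a], cold_order[b], heat))
        h[a] -= heat
        c[b] -= heat
        if h[a] == 0:
            a += 1
        if c[b] == 0:
            b += 1
    return matches

def improved_greedy(hot, cold):
    """Sketch of Algorithm IG: pair equal heat loads first, then run SG."""
    free_cold = dict(enumerate(cold))
    matches, free_hot = [], []
    for i, heat in enumerate(hot):
        j = next((j for j, c in free_cold.items() if c == heat), None)
        if j is None:
            free_hot.append(i)
        else:
            matches.append((i, j, heat))
            del free_cold[j]
    cold_ids = list(free_cold)
    rest = simple_greedy([hot[i] for i in free_hot], [cold[j] for j in cold_ids])
    # Map the indices of the reduced instance back to the original streams.
    matches += [(free_hot[a], cold_ids[b], heat) for a, b, heat in rest]
    return matches
```

On hot loads (7, 3) and cold loads (6, 3, 1), the SG sketch returns four matches while the IG sketch returns three, because IG first pairs the two streams of load 3.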

Maximum Heat Computations with Match Restrictions
This section discusses computing the maximum heat that can be feasibly exchanged in a minimum number of matches instance. Section 5.1 discusses the specific instance of two streams and thereby reduces the value of the big-M parameter $U_{i,j}$. Sections 5.2 and 5.3 generalize Section 5.1 from two streams to any set of candidate matches. Section 5.2 is limited to a restricted subset of matches in a single temperature interval. Section 5.3 calculates the maximum heat that can be feasibly exchanged for the most general case of multiple temperature intervals.
These maximum heat computations are an essential ingredient of our heuristic methods and aim at using each match in the most profitable way. They also answer the feasibility question for the minimum number of matches problem.

Two Streams and Big-M Parameter Computation
A common way of computing the big-M parameters is setting $U_{i,j} = \min\{h_i, c_j\}$ for each $i \in H$ and $j \in C$. Gundersen et al. (1997) propose a better method for calculating the big-M parameter. Our novel Greedy Algorithm MHG (Maximum Heat Greedy) obtains tighter $U_{i,j}$ bounds than either the trivial bounds or the Gundersen et al. (1997) bounds by exploiting the transshipment model structure.
Given hot stream $i$ and cold stream $j$, Algorithm MHG computes the maximum amount of heat that can be feasibly exchanged between $i$ and $j$ in any feasible solution. Algorithm MHG is tight in the sense that there is always a feasible solution where streams $i$ and $j$ exchange exactly $U_{i,j}$ units of heat. Note that, in addition to $U_{i,j}$, the algorithm computes a value $q_{i,s,j,t}$ of the heat exchanged between hot stream $i$ in each temperature interval $s \in T$ and cold stream $j$ in each temperature interval $t \in T$, so that $\sum_{s,t \in T} q_{i,s,j,t} = U_{i,j}$. These $q_{i,s,j,t}$ values are required by the greedy packing heuristics in Section 8.
Algorithm 3 is a pseudocode of Algorithm MHG. The correctness, i.e. the maximality of the heat exchanged between $i$ and $j$, is a corollary of the well known maximum flow-minimum cut theorem. Initially, the procedure transfers the maximum amount of heat across the same temperature interval; $q_{i,u,j,u} = \min\{\sigma_{i,u}, \delta_{j,u}\}$ for each $u \in T$. The remaining heat is transferred greedily in a top down manner, with respect to the temperature intervals, by accounting for heat residual capacities. For each temperature interval $u \in T$, the heat residual $R_u = \sum_{i \in H} \sum_{s=1}^{u} \sigma_{i,s} - \sum_{j \in C} \sum_{t=1}^{u} \delta_{j,t}$ imposes an upper bound on the amount of heat that may descend from temperature intervals $1, 2, \ldots, u$ to temperature intervals $u + 1, u + 2, \ldots, k$.
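The role of the heat residual capacities can be illustrated with the Python sketch below. It is a simplified single-pass variant written for this illustration, not Algorithm 3 itself: it returns only the bound on the heat exchanged by one pair and does not build the individual $q_{i,s,j,t}$ values. The residual of interval $u$ is taken to be the total supply minus the total demand in intervals up to $u$, and the function names and 0-based interval indexing are assumptions.

```python
def residual_capacities(sigma, delta):
    """R[u]: total heat that descends from intervals 0..u to the ones below."""
    k = len(sigma[0])
    capacities, running = [], 0
    for u in range(k):
        running += sum(row[u] for row in sigma) - sum(row[u] for row in delta)
        capacities.append(running)
    return capacities

def max_heat_pair(sigma, delta, i, j):
    """Upper bound on the heat hot stream i can send to cold stream j."""
    capacities = residual_capacities(sigma, delta)
    carried = 0   # heat of stream i that descended from hotter intervals
    total = 0
    for u, capacity in enumerate(capacities):
        available = carried + sigma[i][u]
        sent = min(available, delta[j][u])   # serve the demand of j in u
        total += sent
        # Only as much heat as the residual capacity allows may descend.
        carried = min(available - sent, capacity)
    return total
```

On an instance with three hot and three cold streams where stream $i$ has one unit of heat in interval $i$ only, all residuals are zero, so the sketch returns 0 for the pair (hot 1, cold 2) although the trivial bound $\min\{h_1, c_2\}$ equals 1.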

Single Temperature Interval
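Given an instance of the single temperature interval problem and a subset $M$ of matches, the maximum amount of heat that can be feasibly exchanged between the streams using only the matches in $M$ can be computed by solving MaxHeatLP. For simplicity, MaxHeatLP drops temperature interval indices for variables $q_{i,j}$.
\begin{align}
\max\ & \sum_{(i,j) \in M} q_{i,j} && \\
& \sum_{j \in C : (i,j) \in M} q_{i,j} \le h_i && i \in H \tag{21} \\
& \sum_{i \in H : (i,j) \in M} q_{i,j} \le c_j && j \in C \tag{22} \\
& q_{i,j} \ge 0 && (i, j) \in M
\end{align}
Constraints (21) and (22) ensure that each stream uses only part of its available heat.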
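Because MaxHeatLP has the structure of a bipartite flow problem, its optimal value can also be obtained without an LP solver. The sketch below swaps in a plain augmenting-path maximum flow computation (breadth-first search) in place of a general LP method; it assumes integer heat loads, and the node numbering and function name are illustrative.

```python
from collections import defaultdict, deque

def max_heat_single_interval(hot, cold, matches):
    """Maximum heat exchangeable using only the pairs (i, j) in matches."""
    n, m = len(hot), len(cold)
    source, sink = n + m, n + m + 1          # hot i -> node i, cold j -> node n + j
    capacity = defaultdict(int)
    neighbours = defaultdict(set)

    def add_arc(u, v, cap):
        capacity[u, v] += cap
        neighbours[u].add(v)
        neighbours[v].add(u)                 # reverse arc for the residual graph

    for i, supply in enumerate(hot):
        add_arc(source, i, supply)           # constraint (21): at most h_i leaves i
    for j, demand in enumerate(cold):
        add_arc(n + j, sink, demand)         # constraint (22): at most c_j enters j
    for i, j in matches:
        add_arc(i, n + j, sum(hot))          # allowed match, effectively uncapacitated

    flow = 0
    while True:
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v in neighbours[u]:
                if v not in parent and capacity[u, v] > 0:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow                      # no augmenting path is left
        path, v = [], sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(capacity[arc] for arc in path)
        for u, v in path:
            capacity[u, v] -= push
            capacity[v, u] += push
        flow += push
```

A set $M$ of matches is feasible for the single temperature interval problem exactly when the returned value equals the total heat $\sum_{i \in H} h_i$.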

Relaxation Rounding Heuristics
This section investigates relaxation rounding heuristics for the minimum number of matches problem. Figure 3 shows the main steps in relaxation rounding. These heuristics begin by optimizing an efficiently solvable relaxation of the original MILP. The efficiently solvable relaxation allows violation of certain constraints, so that the optimal solution(s) is

Fractional LP Rounding
The LP rounding heuristic, originally proposed by Furman and Sahinidis (2004), transforms an optimal fractional solution for the transportation MILP to a feasible integral solution. We show that the fractional LP can be solved efficiently via network flow techniques.
We observe that, in the worst case, the heuristic produces a weak solution if it starts with an arbitrary optimal solution of the fractional LP. We derive a novel performance guarantee showing that the heuristic is efficient when the heat of each chosen match (i, j) is close to big-M parameter U i,j , in the optimal fractional solution.
Consider the fractional LP obtained by replacing the integrality constraints $y_{i,j} \in \{0, 1\}$ of the transportation MILP, i.e. Eqs. (1)-(6), with the constraints $0 \le y_{i,j} \le 1$, for each $i \in H$ and $j \in C$:
\[ \min \Big\{ \sum_{i \in H} \sum_{j \in C} y_{i,j} \;:\; \text{Eqs. (2)-(5)},\ q_{i,s,j,t} \ge 0,\ 0 \le y_{i,j} \le 1 \Big\} \tag{FracLP} \]
FracLP can be solved via minimum cost flow methods. In an optimal fractional solution, each big-M constraint (4) holds with equality, i.e. $y_{i,j} = \sum_{s,t \in T} q_{i,s,j,t} / U_{i,j}$, so every unit of heat exchanged via $(i, j)$ contributes $1/U_{i,j}$ to the objective. Figure 4 illustrates a network $N$, i.e. a minimum cost flow problem instance, such that finding a minimum cost flow in $N$ is equivalent to optimizing the fractional LP. Network $N$ is a layered graph with six layers of nodes: (i) a source node $S$, (ii) a node for each hot stream $i \in H$, (iii) a node for each pair $(i, s)$ of hot stream $i \in H$ and temperature interval $s \in T$, (iv) a node for each pair $(j, t)$ of cold stream $j \in C$ and temperature interval $t \in T$, (v) a node for each cold stream $j \in C$, and (vi) a destination node $D$. We add: (i) the arc $(S, i)$ with capacity $h_i$ for each $i \in H$, (ii) the arc $(i, (i, s))$ with capacity $\sigma_{i,s}$ for each $i \in H$ and $s \in T$, (iii) the arc $((i, s), (j, t))$ with infinite capacity for each $i \in H$, $j \in C$ and $s, t \in T$ with $s \le t$, (iv) the arc $((j, t), j)$ with capacity $\delta_{j,t}$ for each $j \in C$ and $t \in T$, and (v) the arc $(j, D)$ with capacity $c_j$ for each $j \in C$. Each arc $((i, s), (j, t))$ has cost $1/U_{i,j}$ for $i \in H$, $j \in C$ and $s, t \in T$. Every other arc has zero cost. Any flow of value $\sum_{i \in H} h_i$ on network $N$ is equivalent to a feasible solution for FracLP with the same cost and vice versa. Furman and Sahinidis (2004) observe that any feasible solution of FracLP can be rounded to a feasible solution of the original problem via Algorithm 4, a simple greedy procedure that we call FLPR. Given a problem instance $I$, the procedure $FractionalLP(I)$ computes an optimal solution of FracLP. We denote by $(\vec{y}^f, \vec{q}^f)$ the optimal fractional solution.
An inherent drawback of the Furman and Sahinidis (2004) approach is the existence of optimal fractional solutions with unnecessary matches. Theorem 4 shows that Algorithm FLPR performs badly in the worst case, even for instances with a single temperature interval. The proof, given in Appendix C, can be extended so that unnecessary matches occur across multiple temperature intervals.
Figure 4: Minimum cost network flow formulation of FracLP. The heat is modeled as flow transferred from a source node $S$ to a destination node $D$. All finite capacities are labelled above the corresponding arcs. The cost is incurred in each arc between node $(i, s) \in H \times T$ and node $(j, t) \in C \times T$ under the condition that heat flows to the same or a lower temperature interval.
Algorithm 4 Fractional LP Rounding (FLPR) (Furman and Sahinidis 2004)
  $(\vec{y}^f, \vec{q}^f) \leftarrow FractionalLP(I)$
  $\vec{q} \leftarrow \vec{q}^f$
  for each $i \in H$ and $j \in C$ do
    if $\sum_{s,t \in T} q_{i,s,j,t} > 0$ then
      $y_{i,j} \leftarrow 1$
    else
      $y_{i,j} \leftarrow 0$
    end if
  end for
  Return $(\vec{y}, \vec{q})$
Consider an optimal fractional solution to FracLP and suppose that $M \subseteq H \times C$ is the set of pairs of streams exchanging a positive amount of heat. For each $(i, j) \in M$, denote by $L_{i,j}$ the heat exchanged between hot stream $i$ and cold stream $j$. We define:
\[ \phi(M) = \min_{(i,j) \in M} \left\{ \frac{L_{i,j}}{U_{i,j}} \right\} \]
as the filling ratio, which corresponds to the minimum portion of an upper bound $U_{i,j}$ filled with the heat $L_{i,j}$, for some match $(i, j)$. Given an optimal fractional solution with filling ratio $\phi(M)$, Theorem 5 obtains a $1/\phi(M)$-approximation ratio for FLPR.
Theorem 5. Given an optimal fractional solution with a set M of matches and filling ratio φ(M ), FLPR produces a (1/φ(M ))-approximate integral solution.
Proof: See Appendix C.
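In the case where all heat supplies and demands are integers, the integrality of the minimum cost flow polytope and Theorem 5 imply that FLPR is $U_{\max}$-approximate, where $U_{\max} = \max_{(i,j) \in H \times C} \{U_{i,j}\}$ is the largest big-M parameter: an integral optimal flow has $L_{i,j} \ge 1$ for every match, so $\phi(M) \ge 1/U_{\max}$. Because the performance guarantee of FLPR scales with the big-M parameters $U_{i,j}$, we improve the heuristic performance by computing a small big-M parameter $U_{i,j}$ using Algorithm MHG in Section 5.1.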
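The rounding step and the filling ratio are simple to compute once a fractional solution is available. The Python sketch below assumes the fractional heat exchanges are given as a dictionary keyed by `(i, s, j, t)` and the big-M parameters as a dictionary keyed by `(i, j)`; both layouts and the function names are illustrative, and solving FracLP itself is outside the sketch.

```python
from collections import defaultdict

def round_fractional(q_frac, tol=1e-9):
    """FLPR-style rounding: keep every pair that exchanges positive heat.
    Returns the heat load L[i, j] of each selected match."""
    load = defaultdict(float)
    for (i, s, j, t), heat in q_frac.items():
        load[i, j] += heat
    return {match: heat for match, heat in load.items() if heat > tol}

def filling_ratio(loads, big_m):
    """phi(M): the smallest fraction L[i, j] / U[i, j] over the matches."""
    return min(heat / big_m[match] for match, heat in loads.items())
```

For example, if match (0, 0) carries 2 of at most 2 units and match (0, 1) carries 1 of at most 4 units, the filling ratio is 0.25 and Theorem 5 gives a guarantee of 4 for the rounded solution.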

Lagrangian Relaxation Rounding
Furman and Sahinidis (2004) design efficient heuristics for the minimum number of matches problem by applying the method of Lagrangian relaxation and relaxing the big-M constraints. This approach generalizes Algorithm FLPR by approximating the fractional cost of every possible match (i, j) ∈ H × C and solving an appropriate LP using these costs.
We present the LP and revisit different ways of approximating the fractional match costs.
In a feasible solution, the fractional cost $\lambda_{i,j}$ of a match $(i, j)$ is the cost incurred per unit of heat transferred via $(i, j)$. In particular,
\[ \lambda_{i,j} = \begin{cases} 1/L_{i,j}, & \text{if } L_{i,j} > 0, \\ 0, & \text{otherwise,} \end{cases} \]
where $L_{i,j}$ is the heat exchanged via $(i, j)$. Then, the number of matches can be expressed as $\sum_{i,s,j,t} \lambda_{i,j} \cdot q_{i,s,j,t}$. Furman and Sahinidis (2004) propose a collection of heuristics computing a single cost value for each match $(i, j)$ and constructing a minimum cost solution. This solution is rounded to a feasible integral solution in the same way as in FLPR.
Given a cost vector $\vec{\lambda}$ of the matches, a minimum cost solution is obtained by solving:
\[ \min \Big\{ \sum_{i \in H} \sum_{j \in C} \sum_{s,t \in T} \lambda_{i,j} \cdot q_{i,s,j,t} \;:\; \text{Eqs. (2), (3), (5)},\ q_{i,s,j,t} \ge 0 \Big\} \tag{CostLP} \]
A challenge in Lagrangian relaxation rounding is computing a cost $\lambda_{i,j}$ for each hot stream $i \in H$ and cold stream $j \in C$. We revisit and generalize policies for selecting costs.
Cost Policy 1 (Maximum Heat). Matches that exchange large amounts of heat incur low fractional cost. This observation motivates selecting $\lambda_{i,j} = 1/U_{i,j}$, where $U_{i,j}$ is an upper bound on the heat that can be feasibly exchanged between $i$ and $j$.
In this case, Lagrangian relaxation rounding is equivalent to FLPR (Algorithm 4).
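Cost Policy 2 (Bounds on the Number of Matches). This cost selection policy uses lower bounds $\alpha_i$ and $\beta_j$ on the number of matches of hot stream $i \in H$ and cold stream $j \in C$, respectively, in an optimal solution. Given such lower bounds, at least $\alpha_i$ cost is incurred for the $h_i$ heat units of $i$ and at least $\beta_j$ cost is incurred for the $c_j$ units of $j$. On average, each heat unit of $i$ is exchanged with cost at least $\alpha_i / h_i$ and each heat unit of $j$ is exchanged with cost at least $\beta_j / c_j$. This suggests fractional costs such as $\lambda_{i,j} = \alpha_i / h_i$, $\lambda_{i,j} = \beta_j / c_j$, or their average. Furman and Sahinidis (2004) use lower bounds $\alpha_i = 1$ and $\beta_j = 1$, for each $i \in H$ and $j \in C$. We show that, for any choice of lower bounds $\alpha_i$ and $\beta_j$, this cost policy for selecting $\lambda_{i,j}$ is not effective. Even when $\alpha_i$ and $\beta_j$ are tighter than 1, all feasible solutions of CostLP attain the same cost. Consider any feasible solution $(\vec{y}, \vec{q})$ and the fractional cost $\lambda_{i,j} = \alpha_i / h_i$ for each $(i, j) \in H \times C$. Then the cost of $(\vec{y}, \vec{q})$ in CostLP is:
\[ \sum_{i \in H} \sum_{j \in C} \sum_{s,t \in T} \lambda_{i,j} \cdot q_{i,s,j,t} = \sum_{i \in H} \frac{\alpha_i}{h_i} \sum_{j \in C} \sum_{s,t \in T} q_{i,s,j,t} = \sum_{i \in H} \frac{\alpha_i}{h_i} \cdot h_i = \sum_{i \in H} \alpha_i. \]
Since every feasible solution in CostLP has cost $\sum_{i \in H} \alpha_i$, Lagrangian relaxation rounding returns an arbitrary solution. Similarly, if $\lambda_{i,j} = \beta_j / c_j$ for $(i, j) \in H \times C$, every feasible solution has cost $\sum_{j \in C} \beta_j$. If $\lambda_{i,j} = \frac{1}{2} \left( \frac{\alpha_i}{h_i} + \frac{\beta_j}{c_j} \right)$, all feasible solutions have the same cost $\frac{1}{2} \cdot \left( \sum_{i \in H} \alpha_i + \sum_{j \in C} \beta_j \right)$.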
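The observation that this cost policy cannot distinguish between feasible solutions can be checked numerically. The Python sketch below evaluates the CostLP objective under $\lambda_{i,j} = \alpha_i / h_i$ for a single temperature interval; the function name, the dictionary keyed by `(i, j)`, and the sample values are illustrative assumptions, and exact rational arithmetic is used so that the comparison is not affected by rounding.

```python
from fractions import Fraction

def policy2_cost(hot, q, alpha):
    """CostLP objective with lambda[i, j] = alpha[i] / h[i].
    q[(i, j)]: heat exchanged between hot stream i and cold stream j."""
    return sum(Fraction(alpha[i], hot[i]) * heat for (i, j), heat in q.items())

# Two different feasible solutions of the same instance
# (hot loads 3 and 2, cold loads 4 and 1).
hot = [3, 2]
alpha = [1, 2]
solution_a = {(0, 0): 3, (1, 0): 1, (1, 1): 1}
solution_b = {(0, 0): 2, (0, 1): 1, (1, 0): 2}
cost_a = policy2_cost(hot, solution_a, alpha)
cost_b = policy2_cost(hot, solution_b, alpha)
```

Both costs equal $\alpha_1 + \alpha_2 = 3$, although the two solutions use different matches, so minimizing this objective gives no guidance on which matches to select.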
Cost Policy 3 (Existing Solution). This method of computing costs uses an existing solution. The main idea is to use the actual fractional costs for the solution's matches and a non-zero cost for every unmatched pair of streams. A minimum cost solution with respect to these costs may improve the initial solution. Suppose that $M$ is the set of matches in the initial solution and let $L_{i,j}$ be the heat exchanged via $(i, j) \in M$. Furthermore, let $U_{i,j}$ be an upper bound on the heat exchanged between $i$ and $j$ in any feasible solution. Then, a possible selection

Covering Relaxation Rounding
This section proposes a novel covering relaxation rounding heuristic for the minimum number of matches problem. The efficiency of Algorithm FLPR depends on lower bounding the unitary cost of the heat transferred via each match. The goal of the covering relaxation is to use these costs and lower bound the number of matches on a stream-by-stream basis by relaxing heat conservation. The heuristic constructs a feasible integral solution by successively solving instances of the covering relaxation.
Consider a feasible MILP solution and suppose that $M$ is the set of matches. For each hot stream $i \in H$ and cold stream $j \in C$, denote by $C_i(M)$ and $H_j(M)$ the subsets of cold and hot streams matched with $i$ and $j$, respectively, in $M$. Moreover, let $U_{i,j}$ be an upper bound on the heat that can be feasibly exchanged between $i \in H$ and $j \in C$. Since the solution is feasible, it must be true that $\sum_{j \in C_i(M)} U_{i,j} \ge h_i$ and $\sum_{i \in H_j(M)} U_{i,j} \ge c_j$. These inequalities are necessary, though not sufficient, feasibility conditions. By minimizing the number of matches while ensuring these conditions, we obtain a covering relaxation:
\begin{align}
\min\ & \sum_{i \in H} \sum_{j \in C} y_{i,j} && \tag{CoverMILP} \\
& \sum_{j \in C} y_{i,j} \cdot U_{i,j} \ge h_i && i \in H \\
& \sum_{i \in H} y_{i,j} \cdot U_{i,j} \ge c_j && j \in C \\
& y_{i,j} \in \{0, 1\} && i \in H,\ j \in C
\end{align}
In certain cases, the matches of an optimal solution to CoverMILP overlap well with the matches in a near-optimal solution for the original problem. Our new Covering Relaxation Rounding (CRR) heuristic for the minimum number of matches problem successively solves instances of the covering relaxation CoverMILP. The heuristic chooses new matches iteratively until it terminates with a feasible set $M$ of matches. In the first iteration, Algorithm CRR constructs a feasible solution for the covering relaxation and adds the chosen matches to $M$. Then, Algorithm CRR computes the maximum heat that can be feasibly exchanged using the matches in $M$ and stores the computed heat exchanges in $\vec{q}$. In the second iteration, the heuristic performs the same steps with respect to the smaller updated instance $(\vec{\sigma}', \vec{\delta}')$, where $\sigma'_{i,s} = \sigma_{i,s} - \sum_{j,t} q_{i,s,j,t}$ and $\delta'_{j,t} = \delta_{j,t} - \sum_{i,s} q_{i,s,j,t}$. The heuristic terminates when all heat is exchanged.
Algorithm 5 is a pseudocode of heuristic CRR. Procedure $CoveringRelaxation(\vec{\sigma}, \vec{\delta})$ produces an optimal subset of matches for the instance of the covering relaxation in which the heat supplies and demands are specified by the vectors $\vec{\sigma}$ and $\vec{\delta}$, respectively. Procedure $MHLP(\vec{\sigma}, \vec{\delta}, M)$ (LP-based Maximum Heat) computes the maximum amount of heat that can be feasibly exchanged by using only the matches in $M$ and is based on solving the LP in Section 5.3.

Water Filling Heuristics
This section introduces water filling heuristics for the minimum number of matches problem. These heuristics produce a solution iteratively by exchanging the heat in each temperature interval, in a top down manner. The water filling heuristics use, in each iteration, an efficient algorithm for the single temperature interval problem (see Section 4). The greedy heuristic WFG adapts Algorithm IG by terminating when the entire heat demanded by the cold streams has been transferred. After addressing the single temperature interval, the excess heat descends to the next temperature interval. Algorithm 6 represents our water filling approach in pseudocode.
Algorithm 5 Covering Relaxation Rounding (CRR)
  ...
  For each $j \in C$ and $t \in T$, set $\delta_{j,t} \leftarrow \delta_{j,t} - \sum_{i \in H} \sum_{s \in T} q_{i,s,j,t}$
  ...
Figure 5: A water filling heuristic computes a solution by exploiting the top down temperature interval structure and moving from the higher to the lower temperature interval. In each temperature interval $t$, the heuristic isolates the streams with positive heat at $t$, it matches them and descends the excess heat to the next interval which is sequentially solved. Panel (b): Excess Heat Descending.
Algorithm 6 Water Filling
  ...
  if $t = 1$ then
    ...
  end if
  ...
Proof: See Appendix D.
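The top down procedure of this section can be sketched as follows. This is a simplified illustration under stated assumptions: a plain largest-first greedy rule stands in for the single temperature interval algorithm (it is not Algorithm IG), the function name `water_filling` is hypothetical, and the instance is assumed feasible with `sigma[i][t]` and `delta[j][t]` denoting heat supply and demand per interval.

```python
def water_filling(sigma, delta, tol=1e-9):
    """Top-down water filling sketch.
    sigma[i][t]: supply of hot stream i in interval t (t = 0 is hottest).
    delta[j][t]: demand of cold stream j in interval t.
    Returns the set of matches (i, j) that exchange positive heat."""
    n, m, k = len(sigma), len(delta), len(sigma[0])
    avail = [0.0] * n          # heat carried by each hot stream so far
    matches = set()
    for t in range(k):
        for i in range(n):
            avail[i] += sigma[i][t]      # excess from above + new supply
        demand = [delta[j][t] for j in range(m)]
        # Single-interval step (assumed rule): repeatedly pair the hot
        # stream with most available heat and the cold stream with most
        # unmet demand, until every demand in interval t is satisfied.
        while max(demand) > tol:
            i = max(range(n), key=lambda a: avail[a])
            j = max(range(m), key=lambda b: demand[b])
            if avail[i] <= tol:
                raise ValueError("instance is infeasible at interval %d" % t)
            moved = min(avail[i], demand[j])
            avail[i] -= moved
            demand[j] -= moved
            matches.add((i, j))
        # Whatever remains in `avail` descends to interval t + 1.
    return matches

# Hypothetical instance with 2 hot streams, 3 cold streams, 2 intervals.
sigma = [[3, 2], [0, 2]]
delta = [[1, 3], [0, 2], [1, 0]]
print(sorted(water_filling(sigma, delta)))
```

Each interval is solved without heat conservation: demands must be met exactly, while unused supply is simply kept in `avail` for the next interval.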

Greedy Packing Heuristics
This section proposes greedy heuristics motivated by the packing nature of the minimum number of matches problem. Each greedy packing heuristic starts from an infeasible solution with zero heat transferred between the streams and iterates towards feasibility by greedily selecting matches. The two main ingredients of such a heuristic are: (i) a match selection policy and (ii) a heat exchange policy for transferring heat via the matches. Section 8.1 observes that a greedy heuristic has poor worst-case performance if heat residual capacities are not respected.

A Pathological Example and Heat Residual Capacities
A greedy match selection heuristic is efficient if it performs a small number of iterations and chooses matches exchanging large heat load in each iteration. Our greedy heuristics perform large moves towards feasibility by choosing good matches in terms of: (i) heat and (ii) stream fraction. An efficient greedy heuristic should also be monotonic in the sense that every chosen match achieves a strictly positive increase on the covered instance size.
The Figure 6 example shows a pathological behavior of greedy non-monotonic heuristics.
The instance consists of 3 hot streams, 3 cold streams and 3 temperature intervals. Hot stream i ∈ H has heat supply $\sigma_{i,s} = 1$ for s = i and no supply in any other temperature interval. Cold stream j ∈ C has heat demand $\delta_{j,t} = 1$ for t = j and no demand in any other temperature interval. Consider the heuristic which, in each iteration, selects a match that may exchange the maximum amount of heat. The matches $(h_1, c_2)$ and $(h_2, c_3)$ constitute the initial selections. In the subsequent iteration, no match increases the heat that can be feasibly exchanged between the streams, so the heuristic chooses unnecessary matches.
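The pathology can be checked numerically. The sketch below computes the maximum heat exchangeable through a given set of matches with a small augmenting-path maximum flow routine; this is a stand-in for the LP-based computation of Section 5.3, and the function name `max_heat` and the 0-based indexing are assumptions of the example.

```python
from itertools import product

def max_heat(sigma, delta, matches):
    """Maximum heat exchangeable using only `matches` (integer data).
    Heat of hot stream i in interval s may reach cold stream j in
    interval t only if (i, j) is a chosen match and s <= t."""
    n, m, k = len(sigma), len(delta), len(sigma[0])
    cap = {}                                   # residual capacities
    def add(u, v, c):
        cap[(u, v)] = cap.get((u, v), 0) + c
        cap.setdefault((v, u), 0)              # reverse arc
    for i, s in product(range(n), range(k)):
        add("src", ("h", i, s), sigma[i][s])
    for j, t in product(range(m), range(k)):
        add(("c", j, t), "snk", delta[j][t])
    big = sum(map(sum, sigma))                 # effectively unbounded
    for (i, j) in matches:
        for s in range(k):
            for t in range(s, k):
                add(("h", i, s), ("c", j, t), big)
    adj = {}
    for (u, v) in cap:
        adj.setdefault(u, []).append(v)

    def augment(u, limit, seen):
        """Depth-first search for one augmenting path."""
        if u == "snk":
            return limit
        seen.add(u)
        for v in adj.get(u, []):
            if v not in seen and cap[(u, v)] > 0:
                pushed = augment(v, min(limit, cap[(u, v)]), seen)
                if pushed:
                    cap[(u, v)] -= pushed
                    cap[(v, u)] += pushed
                    return pushed
        return 0

    total = 0
    while True:
        pushed = augment("src", big, set())
        if not pushed:
            return total
        total += pushed

# The instance of the example: stream i has one unit in interval i.
sigma = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
delta = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
chosen = {(0, 1), (1, 2)}                      # (h1, c2) and (h2, c3)
candidates = set(product(range(3), range(3))) - chosen
gains = {p: max_heat(sigma, delta, chosen | {p}) for p in candidates}
print(max_heat(sigma, delta, chosen), set(gains.values()))
```

With the two initial selections fixed, every single additional match leaves the exchangeable heat at 2, whereas the three matches $(h_i, c_i)$ alone already exchange all 3 units.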
A sufficient condition enforcing strictly monotonic behavior and avoiding the above pathology is for each algorithm iteration to satisfy the heat residual capacities (see Figure 7).

Figure 7: Decomposition of a greedy packing heuristic. The problem instance I is the union of the instance $I_A$ already solved by the heuristic and the instance $I_B$ that remains to be solved.

Largest Heat Match First
Our Largest Heat Match First heuristics arise from the idea that the matches should individually carry large amounts of heat in a near optimal solution. Suppose that $Q_v$ is the maximum heat that may be transferred between the streams using only a number v of matches. Then, minimizing the number of matches is expressed as $\min\{v : Q_v \geq \sum_{i=1}^{n} h_i\}$.

Theorem 7. Algorithm LHM-LP is $O(\log n + \log(h_{\max}/\epsilon))$-approximate, where $\epsilon$ is the required precision.
Proof: See Appendix E.

    ...
    For each t ∈ T, set $\delta_{j',t} \leftarrow \delta_{j',t} - \sum_{s \in T} q_{i',s,j',t}$
    $r \leftarrow r - \sum_{s,t \in T} q_{i',s,j',t}$
  end while
  Return M

The LHM-LP heuristic is polynomial-time in the worst case. The i-th iteration solves nm − i + 1 LP instances, which sums to a total of $\sum_{i=1}^{nm} (nm - i + 1) = nm(nm+1)/2 = O(n^2 m^2)$ LP instances in the worst case. However, for large instances, the algorithm is time consuming because of this iterative LP solving. So, we also propose an alternative, time-efficient greedy approach. The new heuristic version builds a solution by selecting matches and deciding the heat exchanges, without modifying them in subsequent iterations.
The new approach for implementing the heuristic, which we call LHM, requires the MHG(σ, δ, i, j) procedure. Given an instance (σ, δ) of the problem, it computes the maximum heat that can be feasibly exchanged between hot stream i ∈ H and cold stream j ∈ C, as defined in Section 5.1. The procedure also computes a corresponding value $q_{i,s,j,t}$ of heat exchanged between i ∈ H in temperature interval s ∈ T and j ∈ C in temperature interval t ∈ T. LHM maintains a set M of currently chosen matches together with their respective vector q of heat exchanges. In each iteration, it selects the match $(i', j')$ and heat exchanges $q'$ between $i'$ and $j'$ so that the value MHG(σ, δ, i′, j′) is maximum. Algorithm 8 gives pseudocode for this heuristic.

Largest Fraction Match First
Suppose that $F_v$ is the maximum amount of total stream fraction that can be covered using no more than v matches. Then, minimizing the number of matches is expressed as $\min\{v : F_v \geq n + m\}$. Based on this observation, the main idea of the LFM heuristic is to construct a feasible set of matches iteratively, by selecting in each iteration the match covering the largest fraction of streams. That is, LFM prioritizes proportional matches, so that high heat hot streams are matched with high heat cold streams and low heat hot streams with low heat cold streams. In this sense, it generalizes the idea of Algorithm IG for the single temperature interval problem (see Section 4), according to which it is beneficial to match streams of (roughly) equal heat.

An alternative that would be similar to LHM-LP is an LFM heuristic with an MFLP(M) (LP-based Maximum Fraction) procedure computing the maximum fraction of streams that can be covered using only a given set M of matches. Like the LHM-LP heuristic, this procedure would be based on solving an LP (see Appendix E), except that the objective function maximizes the total stream fraction. The LFM heuristic can also be modified to attain more efficient running times using Algorithm MHG, as defined in Section 5.1. In each iteration, the heuristic selects the match (i, j) with the highest value $U'_{i,j}/h_i + U'_{i,j}/c_j$, where $U'_{i,j}$ is the maximum heat that can be feasibly exchanged between i and j in the remaining instance.
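The two selection rules can be compared on a small instance. The sketch below is an illustration under assumptions: `mhg` is a plain top-down greedy for one hot and one cold stream in isolation (the procedure of Section 5.1 is not reproduced in this section, and the heat residual capacities of Section 8.1 are not modelled), `select` and both score functions are hypothetical helper names, and the fraction score uses the form $U/h_i + U/c_j$.

```python
def mhg(sigma_i, delta_j):
    """Greedy maximum heat between one hot and one cold stream.
    Returns (total, q), where q[(s, t)] is the heat sent from hot
    interval s to cold interval t >= s (0-based, 0 is hottest)."""
    pool = []                       # [interval, unused heat] of the hot stream
    q, total = {}, 0.0
    for t, demand in enumerate(delta_j):
        pool.append([t, sigma_i[t]])
        for entry in pool:          # heat from interval t or above
            moved = min(entry[1], demand)
            if moved > 0:
                q[(entry[0], t)] = moved
                entry[1] -= moved
                demand -= moved
                total += moved
    return total, q

def select(sigma, delta, score):
    """Return the match (i, j) maximizing score(U, h_i, c_j), where U is
    the greedy maximum heat between i and j."""
    best = None
    for i, s_i in enumerate(sigma):
        for j, d_j in enumerate(delta):
            heat, _ = mhg(s_i, d_j)
            value = score(heat, sum(s_i), sum(d_j))
            if best is None or value > best[0]:
                best = (value, i, j)
    return best[1], best[2]

def heat_score(u, h, c):            # Largest Heat Match First
    return u

def fraction_score(u, h, c):        # Largest Fraction Match First
    return u / h + u / c if u > 0 else 0.0

# Hypothetical instance: h = (5, 2), c = (4, 2, 1), two intervals.
sigma = [[3, 2], [0, 2]]
delta = [[1, 3], [0, 2], [1, 0]]
lhm_choice = select(sigma, delta, heat_score)
lfm_choice = select(sigma, delta, fraction_score)
print(lhm_choice, lfm_choice)
```

On this instance the heat rule prefers the pair exchanging the most heat, while the fraction rule prefers the pair of equal heat loads that cover each other completely.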

Smallest Stream Heuristic
Subsequently, we propose the Smallest Stream First (SS) heuristic based on greedy match selection, which also incorporates stream priorities so that a stream is involved in a small number of matches. Let $\alpha_i$ and $\beta_j$ be the number of matches of hot stream i ∈ H and cold stream j ∈ C, respectively. The minimum number of matches problem is expressed as $\min\{\sum_{i \in H} \alpha_i\}$, or equivalently $\min\{\sum_{j \in C} \beta_j\}$. Based on this observation, we investigate heuristics that specify a certain order of the hot streams and match them one by one, each using a small number of matches. Such a heuristic requires: (i) a stream ordering strategy and (ii) a match selection strategy. To reduce the number of matches of small hot streams, heuristic SS uses the order $h_1 \leq h_2 \leq \ldots \leq h_n$.
In each iteration, the next stream is matched with a low number of cold streams using a greedy match selection strategy; we use the greedy LHM heuristic. Observe that the SS heuristic is more efficient in terms of running time than the other greedy packing heuristics, because it solves a subproblem with only one hot stream in each iteration. Algorithm 9 gives pseudocode for the SS heuristic. Note that other variants of ordered stream heuristics may be obtained in a similar way. The heuristic uses the MHG algorithm in Section 5.1.

Numerical Results
This section evaluates the proposed heuristics on three test sets. Section 9.1 provides information on system specifications and benchmark instances. Section 9.2 presents computational results of exact methods and shows that commercial, state-of-the-art approaches have difficulty solving moderately-sized instances to global optimality. Section 9.3 evaluates experimentally the heuristic methods and compares the obtained results with those reported by Furman and Sahinidis (2004). All result tables are provided in Appendix G.

Algorithm 9 Smallest Stream First (SS)
      ...
      $q \leftarrow q + q'$
      For each s ∈ T, set $\sigma_{i,s} \leftarrow \sigma_{i,s} - \sum_{t \in T} q_{i,s,j',t}$
      For each t ∈ T, set $\delta_{j',t} \leftarrow \delta_{j',t} - \sum_{s \in T} q_{i,s,j',t}$
      $r \leftarrow r - \sum_{s,t \in T} q_{i,s,j',t}$
    end while
  end for
  Return M

System Specification and Benchmark Instances
All computations are run on an Intel Core i7-4790 CPU 3.60GHz with 15.6 GB RAM running 64-bit Ubuntu 14.04. CPLEX 12.6.3 and Gurobi 6.5.2 solve the minimum number of matches problem exactly. The mathematical optimization models and heuristics are implemented in Python 2.7.6 and Pyomo 4.4.1 (Hart et al. 2011, 2012). We use problem instances from two existing test sets (Furman and Sahinidis 2004, Chen et al. 2015b). We also generate a collection of larger test cases using work of Grossmann (2017). An instance of general heat exchanger network design consists of streams and utilities with inlet and outlet temperatures, flowrate heat capacities and other parameters. Appendix F shows how a minimum number of matches instance arises from the original instance of general heat exchanger network design.
The Furman (2000) test set consists of test cases from the engineering literature. Table G.4 reports bibliographic information on the origin of these test cases. We manually digitize this data set and make it publicly available for the first time (Letsios et al. 2017). Table G.4 lists the 26 problem instance names and information on their sizes. The total number of streams and temperature intervals varies from 6 to 38 and from 5 to 32, respectively. Table G

The Grossmann (2017) test set is generated randomly. The inlet and outlet temperatures of these instances are fixed while the values of flowrate heat capacities are generated randomly with fixed seeds. This test set contains 12 moderately challenging problems (see Table G

Exact Methods
We evaluate exact methods using state-of-the-art commercial approaches. For each problem instance, CPLEX and Gurobi solve the Section 2 transportation and transshipment models. Based on the difficulty of each test set, we set a time limit for each solver run as follows: (i) 1800 seconds for the Furman (2000) test set, (ii) 7200 seconds for the Chen et al. (2015a,b) test set, and (iii) 14400 seconds for the Grossmann (2017) test set. In each solver run, we set absolute gap 0.99, relative gap 4%, and maximum number of threads 1. Table G.5 reports the best found objective value, CPU time and relative gap, for each solver run. Observe that state-of-the-art approaches cannot, in general, solve moderately-sized problems with 30-40 streams to global optimality. For example, none of the test cases in the Grossmann (2017) test set is solved to global optimality within the specified time limit. Table G.8 contains the results reported by Furman and Sahinidis (2004) using CPLEX 7.0 with a 7 hour time limit. CPLEX 7.0 fails to solve 4 instances to global optimality. Interestingly, CPLEX 12.6.3 still cannot solve 3 of these 4 instances with a 1.5 hour timeout.
Theoretically, the transshipment MILP is better than the transportation MILP because the former has asymptotically fewer variables. This observation is validated experimentally with the exception of very few instances, e.g. balanced10, in which the transportation model computes a better solution within the time limit. CPLEX and Gurobi are comparable and neither dominates the other. Instances with balanced streams are harder to solve, which highlights the difficulty introduced by symmetry, see Kouyialis and Misener (2017).

Heuristic Methods
The difficulty of solving the minimum number of matches problem to global optimality motivates the design of heuristic methods and approximation algorithms with proven performance guarantees. Tables G.6 and G.7 contain the computed objective value and CPU times, respectively, of the heuristics for all test cases. For the challenging Chen et al. (2015a,b) and Grossmann (2017) test sets, heuristic LHM-LP always produces the best solution. The LHM-LP running time is significantly higher than that of all other heuristics due to the iterative LP solving, despite the fact that it is guaranteed to be polynomial in the worst case. Alternatively, heuristic SS produces the second best heuristic result with very efficient running times in the Chen et al. (2015a,b) and Grossmann (2017) test sets. Figure 8 depicts the performance ratio of the proposed heuristics using a box and whisker plot, where the computed objective value is normalized with the one found by CPLEX for the transshipment MILP. Figure 9 shows a box and whisker plot of the CPU times of all heuristics in log 10 scale normalized by the minimum CPU time for each test case. Figure 10 shows a line chart verifying that our greedy packing approach produces better solutions than the relaxation rounding and water filling ones. The results also demonstrate the importance of the big-M parameter for the quality of the transportation MILP fractional relaxation.
In particular, Table G Algorithm MHG is strictly best for 33 FLPR and 32 LRR test cases. Finally, Algorithm MHG achieves the tightest fractional MILP relaxation for all test instances. Figure 8 and CRR solves a sequence of MILPs, Figure 9 and Table G.7 show that its running time is efficient compared to the other relaxation rounding heuristics.
Our water filling heuristics are equivalent to or better than Furman and Sahinidis (2004) for 25 of their 26 test set instances (all except 7sp2). In particular, our Algorithm WFG is strictly better than their WFG in 18 of 26 instances and is worse in just one. This improvement stems from the new 1.5-approximation algorithm for the single temperature interval problem (see Section 4.2). The novel Algorithm WFM is competitive with Algorithm WFG and produces equivalent or better feasible solutions for 37 of the 48 test cases. In particular, WFM has a better performance ratio than WFG (see Figure 8) and WFM is strictly better than WFG in all but 1 of the Grossmann (2017)  in the same order of magnitude as its greedy counterpart WFG (see Figure 9).
In summary, our heuristics obtained via the relaxation rounding and water filling methods improve the corresponding ones proposed by Furman and Sahinidis (2004). Furthermore, greedy packing heuristics achieve even better results in more than 90% of the test cases.

Discussion of Manuscript Contributions
This section reflects on this paper's contributions and situates the work with respect to existing literature. We begin in Section 4 by designing efficient heuristics for the minimum number of matches problem in the special case of a single temperature interval. Initially, we show that the performance guarantee of 2 by Furman and Sahinidis (2004) is tight. Using graph theoretic properties, we propose a new MILP formulation for the single temperature interval problem which does not contain any big-M constraints. We also develop an improved, tight, greedy 1.5-approximation algorithm which prioritizes stream matches with equal heat loads. Apart from its independent interest, solving the single temperature interval problem is a major ingredient of water filling heuristics.
The multiple temperature interval problem requires big-M parameters. We reduce these parameters in Section 5 by computing the maximum amount of heat transfer with match restrictions. Initially, we present a greedy algorithm for exchanging the maximum amount of heat between two streams. This algorithm computes tighter big-M parameters than Gundersen et al. (1997). We also propose LP-based ways for computing the maximum exchanged heat using only a subset of the available matches. Maximum heat computations are fundamental ingredients of our heuristic methods and detect the overall problem feasibility.
This paper emphasizes how tighter big-M parameters improve heuristics with performance guarantees, but notice that improving the big-M parameters will also tend to improve exact methods. Section 8 develops a new greedy packing approach for designing efficient heuristics for the minimum number of matches problem, motivated by the packing nature of the problem.
Greedy packing requires feasibility conditions which may be interpreted as a decomposition method analogous to pinch point decomposition, see Linnhoff and Hindmarsh (1983). Similarly to Cerda et al. (1983), stream ordering affects the efficiency of greedy packing heuristics. Smallest Stream First (SS) is inspired by the idea of the tick-off heuristic (Linnhoff and Hindmarsh 1983) and produces matches on a stream-by-stream basis, where a hot stream is ticked off by being matched with a small number of cold streams.
Finally, Section 9 shows numerically that our new way of computing the big-M parameters, our improved algorithms for the single temperature interval, and the other enhancements improve the performance of relaxation rounding and water-filling heuristics. The numerical results also show that our novel greedy packing heuristics dominate relaxation rounding and water-filling ones in the majority of test cases.

Conclusion
In his PhD thesis, Professor Floudas showed that, given a solution to the minimum number of matches problem, he could solve a nonlinear optimization problem designing effective heat recovery networks. But the sequential HENS method cannot guarantee that promising minimum number of matches solutions will be optimal (or even feasible!) to Professor Floudas' nonlinear optimization problem. Since the nonlinear optimization problem is relatively easy to solve, we propose generating many good candidate solutions to the minimum number of matches problem. This manuscript develops nine heuristics with performance guarantees to the minimum number of matches problem. Each of the nine heuristics is either novel or provably the best in its class. Beyond approximation algorithms, our work has interesting implications for solving the minimum number of matches problem exactly, e.g. the analysis into reducing big-M parameters.

Acknowledgments
We gratefully acknowledge support from EPSRC EP/P008739/1, an EPSRC DTP to G.K., and a Royal Academy of Engineering Research Fellowship to R.M.

Appendix A. NP-hardness Reduction
Theorem 1. There exists an NP-hardness reduction from bin packing to the minimum number of matches problem with a single temperature interval.

Proof:
For the first direction, consider a feasible packing $O_1, \ldots, O_m$. For each i ∈ H and j ∈ C, we obtain a solution for the minimum number of matches instance by setting $q_{i,j} = h_i$ if $i \in O_j$, and $q_{i,j} = 0$ otherwise. By the constraints $\cup_{j=1}^{m} O_j = O$ and $O_j \cap O_{j'} = \emptyset$ for each $1 \leq j < j' \leq m$, there is exactly one j ∈ B such that $i \in O_j$. Hence, the number of matches is $|\{(i, j) \in H \times C : q_{i,j} > 0\}| = n$ and $\sum_{j \in C} q_{i,j} = h_i$ for every i ∈ H. Since the capacity of bin j ∈ B is not exceeded, we have that $\sum_{i \in O_j} s_i \leq K$, or equivalently $\sum_{i \in H} q_{i,j} \leq c_j$ for all j ∈ C. Thus, the obtained solution is feasible.
For the other direction, consider a feasible solution for the minimum number of matches instance. Obtain a feasible packing by placing object i ∈ O in bin j if and only if $q_{i,j} > 0$. Since the solution contains at most n matches and $h_i > 0$ for each i ∈ H, each hot stream i ∈ H matches with exactly one cold stream j ∈ C and it holds that $q_{i,j} = h_i$. That is, each object is placed in exactly one bin. Given that $\sum_{i \in H} q_{i,j} \leq c_j = K$, the bin capacity constraints are also satisfied.
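Both directions of the argument can be traced on a toy instance. The sketch below assumes the construction implied by the proof, namely one hot stream per object with $h_i = s_i$ and one cold stream per bin with $c_j = K$; the function names and the example data are hypothetical.

```python
def packing_to_matches(sizes, bins):
    """First direction: q[(i, j)] = h_i = s_i if object i lies in bin j."""
    return {(i, j): sizes[i] for j, objs in enumerate(bins) for i in objs}

def is_feasible_solution(sizes, capacity, m, q):
    """Every hot stream sends exactly h_i and no cold stream receives
    more than c_j = K."""
    n = len(sizes)
    hot_ok = all(sum(q.get((i, j), 0) for j in range(m)) == sizes[i]
                 for i in range(n))
    cold_ok = all(sum(q.get((i, j), 0) for i in range(n)) <= capacity
                  for j in range(m))
    return hot_ok and cold_ok

def matches_to_packing(q, m):
    """Other direction: place object i in bin j iff q[(i, j)] > 0."""
    bins = [set() for _ in range(m)]
    for (i, j), heat in q.items():
        if heat > 0:
            bins[j].add(i)
    return bins

# Toy bin packing instance: 4 objects, 2 bins of capacity K = 4.
sizes, K = [3, 2, 2, 1], 4
packing = [{0, 3}, {1, 2}]
q = packing_to_matches(sizes, packing)
print(len(q), is_feasible_solution(sizes, K, len(packing), q))
```

The solution obtained from the packing uses exactly n = 4 matches, and mapping it back returns the original packing.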

Appendix B. Single Temperature Interval Problem
Lemma 8 concerns the structure of an optimal solution for the single temperature interval problem. It shows that the corresponding graph is acyclic and that the number of matches is related to the number of graph's connected components (trees), if arc directions are ignored.
Lemma 8. Let $(y^*, q^*)$ be an optimal solution of the single temperature interval problem. Then:
• if arc directions are ignored, the corresponding graph $G(y^*, q^*)$ is a forest consisting of $\ell^*$ trees, i.e. there are no cycles, and
• $(y^*, q^*)$ contains $v^* = m + n - \ell^*$ matches.

Since $G(y^*, q^*)$ does not contain a cycle, it must be a forest consisting of $\ell^*$ trees (which we call bins from a packing perspective). Let $B = \{1, \ldots, \ell^*\}$ be the set of these trees and $M_b$ the subset of matches in tree b ∈ B. By definition, tree b ∈ B contains $|M_b|$ matches (edges) and, therefore, $|M_b| + 1$ streams (nodes). Furthermore, each stream appears in exactly one tree, implying that $\sum_{b=1}^{\ell^*} |M_b| = n + m - \ell^*$. Thus, the number of matches in $(y^*, q^*)$ is equal to $v^* = \sum_{b=1}^{\ell^*} |M_b| = n + m - \ell^*$.

Theorem 2 shows that the analysis of Algorithm SG, developed by Furman and Sahinidis (2004), is tight.
Theorem 2. Algorithm SG achieves an approximation ratio of 2 for the single temperature interval problem and it is tight.

Proof:
In the algorithm's solution, the number v of matches is equal to the number of steps that the algorithm performs. For each pair of streams i ∈ H and j ∈ C matched by the algorithm, at least one has zero remaining heat load immediately after they have been matched. Therefore, the number of steps is at most v ≤ n + m − 1. The optimal solution contains at least max{n, m} matches, i.e. $v^* \geq \max\{n, m\}$, because every stream appears in at least one match. Since n + m − 1 < 2 max{n, m}, the algorithm is 2-approximate.
Consider a set of n hot streams with heat loads $h_i = 2n + 1 - i$ for 1 ≤ i ≤ n and m = n + 1 cold streams with $c_j = 2n - j$, for 1 ≤ j ≤ m. As shown in Figure B.11 for the special case n = 5, the algorithm uses 2n matches while the optimal solution has n + 1 matches. Hence, the 2 approximation ratio of Algorithm SG is asymptotically tight.

Lemma 9 formalizes the benefit of matching stream pairs with equal heat loads and indicates the way of manipulating these matches in the analysis of Algorithm IG and the proof of Theorem 3.
Lemma 9. Consider an instance (H, C) of the single temperature interval problem and suppose that there exists a pair of streams i ∈ H and j ∈ C such that $h_i = c_j$. Then,
• there exists an optimal solution $(y^*, q^*)$ s.t. $q^*_{i,j} = h_i$, i.e. i and j are matched together,
• any ρ-approximate solution for (H \ {i}, C \ {j}) is also ρ-approximate for (H, C) with the addition of match (i, j).

Proof:
Consider an optimal solution $(y^*, q^*)$ in which i and j are not matched solely to each other. Suppose that i is matched with $j_1, j_2, \ldots, j_{m'}$ while j is matched with $i_1, i_2, \ldots, i_{n'}$. Without loss of generality, $q^*_{i,j} = 0$; the case $0 < q^*_{i,j} < h_i$ is treated similarly. Starting from $(y^*, q^*)$, we obtain the slightly modified solution (y, q) in which i is matched only with j. The $c_j$ units of heat of $i_1, i_2, \ldots, i_{n'}$ originally transferred to j are now exchanged with $j_1, j_2, \ldots, j_{m'}$, which are no longer matched with i. The remaining solution is not modified. Analogously to the proof of Theorem 2, we show that there can be at most n′ + m′ − 1 new matches between the n′ hot streams (i.e. $i_1, i_2, \ldots, i_{n'}$) and the m′ cold streams (i.e. $j_1, j_2, \ldots, j_{m'}$) in (y, q). By also taking into account the new match (i, j), the modified solution uses at most n′ + m′ matches in place of the n′ + m′ original matches of i and j. We conclude that there always exists a solution in which i is only matched with j and which has no more matches than $(y^*, q^*)$.
Consider an optimal solution $(y^*, q^*)$ for (H, C), in which there are $v^*$ matches and i is matched only with j. An optimal solution for (H \ {i}, C \ {j}) contains $v^* - 1$ matches. Hence, a ρ-approximate solution for (H \ {i}, C \ {j}) has at most $\rho(v^* - 1)$ matches and, with the addition of match (i, j), at most $\rho(v^* - 1) + 1 \leq \rho v^*$ matches, because ρ ≥ 1.
The following theorem shows a tight analysis for Algorithm IG.
Theorem 3. Algorithm IG achieves an approximation ratio of 1.5 for the single temperature interval problem and it is tight.

Proof:
By Theorem 2, Algorithm IG produces a solution (y, q) with v ≤ n + m matches. Consider an optimal solution $(y^*, q^*)$. By Lemma 8, $(y^*, q^*)$ consists of $\ell^*$ trees and has $v^* = n + m - \ell^*$ matches. Due to Lemma 9, we may assume that the instance does not contain a pair of equal hot and cold streams. Hence, each tree in the optimal solution contains at least 3 streams, i.e. $\ell^* \leq (n + m)/3$. Thus, $v^* \geq (2/3)(n + m)$ and we conclude that $v \leq (3/2) v^*$.
For the tightness of our analysis, consider an instance of the problem with n hot streams, where h i = 4n − 2i for i = 1, . . . , n, and m = 2n cold streams such that c j = 4n − 2j − 1 for j = 1, . . . , n and c j = 1 for j = n + 1, . . . , 2n. Algorithm IG uses 3n matches, while the optimal solution uses 2n matches. Hence the 3/2 approximation ratio of the algorithm is tight. Figures B.12a and B.12b show the special case with n = 4.
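The counting argument of Lemma 8 can be checked on the tight instance above. The sketch below builds one solution with 2n matches (hot stream i paired with cold stream i and with one unit-demand cold stream; this particular solution is an assumption of the example, chosen to be consistent with the stated optimum of 2n) and verifies the forest structure with a small union-find. Algorithm IG itself is not simulated.

```python
def forest_stats(n, m, matches):
    """Return (number of trees, acyclic?) of the undirected bipartite
    graph with hot nodes 0..n-1, cold nodes n..n+m-1 and one edge per match."""
    parent = list(range(n + m))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    acyclic = True
    for i, j in matches:
        a, b = find(i), find(n + j)
        if a == b:
            acyclic = False                 # edge closes a cycle
        else:
            parent[a] = b
    trees = len({find(x) for x in range(n + m)})
    return trees, acyclic

n = 4
m = 2 * n
hot = [4 * n - 2 * i for i in range(1, n + 1)]
cold = [4 * n - 2 * j - 1 for j in range(1, n + 1)] + [1] * n

# Assumed solution: hot i sends h_i - 1 to cold i and 1 unit to cold n + i.
q = {}
for i in range(n):
    q[(i, i)] = hot[i] - 1
    q[(i, n + i)] = 1

trees, acyclic = forest_stats(n, m, list(q))
v = len(q)
print(v, trees, acyclic)
```

Every tree here has exactly 3 streams, so the bound $\ell^* \leq (n + m)/3$ used in the proof holds with equality and the solution has $n + m - \ell^* = 2n$ matches.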

Proof:
We construct a minimum number of matches instance for which Algorithm FLPR produces a solution a factor Ω(n) worse than the optimal solution. This instance consists of a single temperature interval and an equal number of hot and cold streams, i.e. n = m, with the same heat load $h_i = n$ and $c_j = n$, for each i ∈ H and j ∈ C. Because of the single temperature interval, we ignore the temperature interval indices of the variables q. In the optimal solution, each hot stream is matched with exactly one cold stream and there are $v^* = n$ matches in total. Given that there exist feasible solutions such that $q_{i,j} = n$, for every possible i ∈ H and j ∈ C, the algorithm computes the upper bound $U_{i,j} = n$. In an optimal fractional solution, it holds that $q^f_{i,j} = 1$, for each i ∈ H and j ∈ C. In this case, Algorithm FLPR sets $y_{i,j} = 1$ for each pair of streams i ∈ H, j ∈ C and uses a total number of matches equal to $v = \sum_{i \in H} \sum_{j \in C} y_{i,j} = \Omega(n^2)$. Therefore, it is Ω(n)-approximate.
Theorem 5. Given an optimal fractional solution with a set M of matches and filling ratio φ(M), FLPR produces a $\frac{1}{\varphi(M)}$-approximate integral solution.

Proof:
We denote Algorithm FLPR's solution and the optimal fractional solution by (y, q) and $(y^f, q^f)$, respectively. Moreover, suppose that $(y^*, q^*)$ is an optimal integral solution. Let M ⊆ H × C be the set of matched pairs of streams by the algorithm, i.e. $y_{i,j} = 1$, if (i, j) ∈ M, and $y_{i,j} = 0$, otherwise. Then, it holds that:

The first equality is obtained by using the fact that, for each (i, j) ∈ M, it holds that $U_{i,j}$ and $L_{i,j} = \sum_{s,t \in T} q_{i,s,j,t}$. The first inequality is true by the definition of the filling ratio φ(M) and the fact that $q = q^f$. The second inequality holds by the big-M constraint of the fractional relaxation. The final inequality is valid due to the fact that the optimal fractional solution is a lower bound on the optimal integral solution.

Appendix D. Water Filling Heuristics
The reformulated MILP in Eqs. (D.1)-(D.6) solves the single temperature interval problem without heat conservation. It is similar to the MILP in Eqs. (13)-(19) with heat conservation, except that it does not contain constraints (14) while Equalities (16) and (18) become the inequalities (D.3) and (D.5). In the single temperature interval problem with (without) heat conservation, the total heat of hot streams is equal to (greater than or equal to) the demand of the cold streams. Each water filling algorithm step solves the single temperature interval problem without heat conservation. All heat demands of cold streams are satisfied and the excess heat supply of hot streams descends to the subsequent temperature interval.
Theorem 6 shows an asymptotically tight performance guarantee for water filling heuristics proportional to the number of temperature intervals. The positive performance guarantee implies the proof of Furman and Sahinidis (2004).

Proof:
A water filling algorithm solves an instance of the single temperature interval problem in each temperature interval t = 1, . . . , k. This instance consists of at most n hot streams and at most m cold streams. By Theorem 2, algorithms WFG and WFM introduce at most n + m new matches in each temperature interval and produce a solution with v ≤ k(n + m) matches. In the optimal solution, each hot and cold stream appears in at least one match, which means that $v^* \geq \max\{n, m\}$ matches are chosen in total. So, $v \leq k(n + m) \leq 2k \cdot \max\{n, m\} \leq 2k \cdot v^*$.
On the negative side, we show a lower bound on the performance guarantee of algorithms WFG and WFM using the extension of the problem instance in Figure D.13 with an equal number of hot streams, cold streams and temperature intervals, i.e. m = n = k.
Each hot stream i ∈ H has heat supply σ i,s ∈ {0, 1} and each cold stream j ∈ C has heat demand δ j,t ∈ {0, 1}, for each s, t ∈ T . Hot stream i has unit heat in temperature intervals

Appendix E. Greedy Packing Heuristics
Lemma 10 shows a condition ensuring the strict monotonicity of a greedy heuristic which decomposes any instance I into the instances I A (already solved) and I B (remaining to be solved) in each iteration (see Section 8.1).

Lemma 10.
A greedy heuristic is strictly monotonic if I B is feasible in each iteration.

Proof:
Given that I A is of maximal heat (see Section 8.1), any match of M is redundant in any feasible solution of I B . Since I B is feasible, there exists a match in H × C \ M whose selection increases the amount of heat exchanged in I A .
Lemma 11 states necessary and sufficient conditions for the feasibility of a minimum number of matches instance I. The first set of conditions ensures that heat always flows from the hot side to the same or lower temperature intervals on the cold side. The last condition enforces heat conservation.
Lemma 11. An instance I of the minimum number of matches is feasible if and only if the following conditions hold.
• For each u ∈ T \ {k}, it is the case that $R_u \geq 0$, or equivalently $\sum_{i \in H} \sum_{s=1}^{u} \sigma_{i,s} \geq \sum_{j \in C} \sum_{t=1}^{u} \delta_{j,t}$.
• It holds that $R_k = 0$, or equivalently $\sum_{i \in H} \sum_{s \in T} \sigma_{i,s} = \sum_{j \in C} \sum_{t \in T} \delta_{j,t}$.

Proof: For the first direction, a violation of a condition makes the task of constructing a feasible solution impossible. For the opposite direction, Algorithm MHG in Section 3 constructs a feasible solution for every instance satisfying the conditions; it suffices to consider all the hot and cold streams as one large hot and one large cold stream, respectively. The single hot stream has heat load $\sum_{i \in H} \sigma_{i,s}$ in temperature interval s ∈ T and the single cold stream has heat load $\sum_{j \in C} \delta_{j,t}$ in temperature interval t ∈ T.
Given a decomposition of an instance I into instances I A and I B , Lemma 12 shows that a careful construction of I A respecting the proposed heat residual capacities in Section 8.1 implies that I B is also feasible.
Lemma 12. Consider a decomposition of a feasible instance I into the instances $I_A$ and $I_B$. Let $R$, $R^A$ and $R^B$ be the corresponding heat residual capacities. If $I_A$ is feasible and it holds that $R^A_u \leq R_u$ for each u ∈ T, then $I_B$ is also feasible.

Proof:
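To show that the Lemma is true, it suffices to show that $I_B$ satisfies the feasibility conditions of Lemma 11. Consider a temperature interval u ∈ T \ {k}. Since I is the union of $I_A$ and $I_B$, the heat residual capacities are additive and, therefore, $R^B_u = R_u - R^A_u \geq 0$. In the same fashion, the fact that $R_k = R^A_k = 0$ implies that $R^B_k = 0$. Hence, $I_B$ is feasible.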
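The conditions of Lemma 11 and the role of the residual capacities in Lemma 12 can be illustrated with a few lines of code. This is a sketch under assumptions: intervals are 0-based with index 0 the hottest, the helper names are hypothetical, and the instance and its two decompositions are made up for the example.

```python
from itertools import accumulate

def residuals(sigma, delta):
    """R_u: cumulative heat supply minus cumulative heat demand over
    the intervals up to and including u."""
    k = len(sigma[0])
    supply = [sum(row[u] for row in sigma) for u in range(k)]
    demand = [sum(row[u] for row in delta) for u in range(k)]
    return [a - b for a, b in zip(accumulate(supply), accumulate(demand))]

def is_feasible(sigma, delta, tol=1e-9):
    """Lemma 11: R_u >= 0 for every interval but the last, and R_k = 0."""
    R = residuals(sigma, delta)
    return all(r >= -tol for r in R[:-1]) and abs(R[-1]) <= tol

def remaining(full, part):
    """Instance I_B obtained by removing the heat of I_A from I."""
    return [[a - b for a, b in zip(fr, pr)] for fr, pr in zip(full, part)]

# Instance I: 2 hot streams, 3 cold streams, 2 temperature intervals.
sigma = [[3, 2], [0, 2]]
delta = [[1, 3], [0, 2], [1, 0]]
R = residuals(sigma, delta)

# Two ways of serving cold stream 1 entirely from hot stream 1.
delta_A = [[1, 3], [0, 0], [0, 0]]
sigma_good = [[2, 2], [0, 0]]   # keeps one unit of hot heat in the top interval
sigma_bad = [[3, 1], [0, 0]]    # uses all hot heat of the top interval

for sigma_A in (sigma_good, sigma_bad):
    R_A = residuals(sigma_A, delta_A)
    respects = all(a <= r for a, r in zip(R_A, R))
    sigma_B, delta_B = remaining(sigma, sigma_A), remaining(delta, delta_A)
    print(R_A, respects, is_feasible(sigma_B, delta_B))
```

The second decomposition exceeds the residual capacity of the top interval and leaves a demand of cold stream 3 that no remaining heat can reach, which is the situation the residual capacities are meant to exclude.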
Given a set M of matches, the LP in Eqs. (E.1)-(E.5) maximizes the total stream fraction that can be covered using only matches in M. It is similar to the LP in Eqs. (20)-(24) in Section 8.2, except that the maximum fraction objective function (E.1) replaces the maximum heat objective function (20).
The following theorem shows a performance guarantee for Algorithm LHM-LP using a standard packing argument.
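Theorem 7. Algorithm LHM-LP is $O(\log n + \log \frac{h_{\max}}{\epsilon})$-approximate, where $\epsilon$ is the required precision.
Proof: Initially, we show an approximation ratio of $O(\log n + \log h_{\max})$ for the special case of the problem with integer parameters. Then, we generalize the result to decimal parameters. We denote by $v$ and $v^*$ the number of matches in the algorithm's solution and in an optimal solution, respectively. Let $Q_\ell$ be the total heat transferred after the first $\ell$ iterations, with $Q_0 = 0$, let $E_\ell = Q_\ell - Q_{\ell-1}$ be the increase of transferred heat in the $\ell$-th iteration, and let $\kappa_\ell = 1/E_\ell$ be the corresponding cost per unit of heat. Since each iteration adds exactly one match,
\[ v = \sum_{\ell=1}^{v} \kappa_\ell E_\ell. \tag{E.6} \]
Let $S_\ell$ be the total remaining heat to be transferred when the $\ell$-th iteration completes. Then, $S_0 = Q$ and $S_\ell = Q - Q_\ell$, for $\ell = 1, \ldots, v$, where $Q = \sum_{i=1}^{n} h_i$ is the total amount of heat. Note that $S_v = 0$ because the algorithm produces a feasible solution. Since the algorithm chooses the match that results in the highest increase of transferred heat in each iteration, it must be the case that $E_1 \geq \ldots \geq E_v$ or, equivalently, $\kappa_1 \leq \ldots \leq \kappa_v$. At the end of the $\ell$-th iteration, the remaining heat can be transferred using at most $v^*$ additional matches by selecting the remaining matches of an optimal solution. Using a simple averaging argument, we get that $\kappa_\ell \leq v^* / S_{\ell-1}$, for each $\ell = 1, \ldots, v$. Thus, Eq. (E.6) implies:
\[ v \leq v^* \sum_{\ell=1}^{v} \frac{E_\ell}{S_{\ell-1}}. \tag{E.7} \]
By the integrality of the minimum cost network flow polytope, each value $E_\ell$ is an integer, for $\ell = 1, \ldots, v$. Hence, since $S_\ell = S_{\ell-1} - E_\ell$,
\[ \frac{E_\ell}{S_{\ell-1}} \leq \sum_{e=0}^{E_\ell - 1} \frac{1}{S_{\ell-1} - e} = \sum_{e=S_\ell+1}^{S_{\ell-1}} \frac{1}{e}. \tag{E.8} \]
Inequalities (E.7) and (E.8) imply:
\[ v \leq v^* \sum_{\ell=1}^{v} \sum_{e=S_\ell+1}^{S_{\ell-1}} \frac{1}{e} = v^* \sum_{e=1}^{Q} \frac{1}{e}. \]
Using the asymptotic bound $\sum_{e=1}^{Q} \frac{1}{e} = O(\log Q)$ of the harmonic series and the fact that $Q \leq n \cdot h_{\max}$, we conclude that the algorithm is $O(\log n + \log h_{\max})$-approximate, where $h_{\max} = \max_{i \in H} \{h_i\}$ is the maximum heat of a hot stream.
Generalizing to decimal parameters, the algorithm is $O(\log n + \log \frac{h_{\max}}{\epsilon})$-approximate, where $\epsilon$ is the precision required for solving the problem instance. The reasoning is the same except that, instead of considering integer units, we consider $\epsilon$ units to extend inequality (E.8).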
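The packing argument can be checked numerically on a small case. The sketch below is only an illustration: the function names `greedy_charge` and `harmonic` and the sample increments are hypothetical, not part of Algorithm LHM-LP. It sums each iteration's integer heat increase divided by the heat remaining before that iteration, and compares the total with the harmonic number of the total heat $Q$.

```python
from fractions import Fraction


def greedy_charge(increments):
    """Sum of E_l / S_{l-1} over iterations, for integer heat increases E_l.

    `increments` is a hypothetical list of per-iteration heat increases;
    S_0 is their sum Q and S_l = S_{l-1} - E_l.
    """
    remaining = sum(increments)          # S_0 = Q
    total = Fraction(0)
    for e in increments:
        total += Fraction(e, remaining)  # E_l / S_{l-1}
        remaining -= e                   # S_l = S_{l-1} - E_l
    return total


def harmonic(q):
    """Exact harmonic number H_q = 1 + 1/2 + ... + 1/q."""
    return sum(Fraction(1, e) for e in range(1, q + 1))


# Hypothetical run: four matches transferring 5, 3, 1 and 1 units (Q = 10).
increments = [5, 3, 1, 1]
charge = greedy_charge(increments)
bound = harmonic(sum(increments))
print(float(charge), float(bound), charge <= bound)
```

Exact rational arithmetic via `fractions.Fraction` avoids rounding noise in the comparison; the inequality holds for any list of positive integers because each term $E_\ell / S_{\ell-1}$ is bounded by a block of consecutive harmonic terms.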

Appendix F. Minimum Utility Cost Problem
This section shows how to obtain a minimum number of matches problem instance from a general heat exchanger network design problem instance via minimizing utility cost. We include this appendix for completeness, but this material is available elsewhere (Floudas 1995). Table F.3 lists the notation.
General heat exchanger network design. An instance of the general heat exchanger network design consists of a set $HS = \{1, 2, \ldots, ns\}$ of hot streams, a set $CS = \{1, 2, \ldots, ms\}$ of cold streams, a set $HU = \{1, 2, \ldots, nu\}$ of hot utilities and a set $CU = \{1, 2, \ldots, mu\}$ of cold utilities. Each hot stream $i \in HS$ (cold stream $j \in CS$) has inlet and outlet temperatures $T^{HS}_{in,i}$, $T^{HS}_{out,i}$ (resp. $T^{CS}_{in,j}$, $T^{CS}_{out,j}$) and flowrate heat capacity $FCp_i$ (resp. $FCp_j$). Each hot utility $i \in HU$ (cold utility $j \in CU$) is associated with inlet and outlet temperatures $T^{HU}_{in,i}$, $T^{HU}_{out,i}$ (resp. $T^{CU}_{in,j}$, $T^{CU}_{out,j}$) and a cost $\kappa^{HU}_i$ (resp. $\kappa^{CU}_j$).
Table F.3 (excerpt). $\delta^{CU}_{j,t}$: heat demand of cold utility $j$ in interval $t$. $R_t$: residual heat exiting temperature interval $t$.
Temperature intervals. The sequential method begins by computing a set $TI = \{1, 2, \ldots, k\}$ of $k$ temperature intervals (Linnhoff and Flower 1978, Ciric and Floudas 1989). A minimum heat recovery approach temperature $\Delta T_{\min}$ specifies the minimum temperature difference between two streams exchanging heat. To incorporate $\Delta T_{\min}$ in the problem setting, we enforce that each temperature interval corresponds to a temperature range on the hot stream side shifted up by $\Delta T_{\min}$ with respect to its corresponding temperature range on the cold stream side. Let $TI_H$ and $TI_C$ be the temperature intervals on the hot and cold side, respectively. Consider, on the hot side, all $k + 1$ discrete temperature values [...]. We set $TI = TI_H$ and we observe that $TI_C$ contains exactly the same temperature intervals as $TI$, shifted by $\Delta T_{\min}$. Moreover, we [...].
For each temperature interval $t \in TI$, we are now able to compute the quantity $\sigma^{HS}_{i,t}$ of heat load exported by hot stream $i \in HS$ as well as the amount $\delta^{CS}_{j,t}$ of heat load received by cold stream $j \in CS$ in $t \in TI$. Specifically, for each $i \in HS$ and $t \in TI$, we set [...]. Similarly, for each $j \in CS$ and $t \in TI$, [...].
Minimum utility cost.
This problem is solved in order to compute the minimum amount of utility heat so that there is heat balance in the network. For each hot utility $i \in HU$ and cold utility $j \in CU$, the continuous variables $\sigma^{HU}_{i,t}$ and $\delta^{CU}_{j,t}$ correspond to the amount of heat of $i$ and $j$, respectively, in temperature interval $t$. The LP uses a heat residual variable $R_t$, for each $t \in TI$. Let $TI_i$ be the set of temperature intervals to which hot utility $i \in HU$ can feasibly transfer heat. Similarly, let $TI_j$ be the set of temperature intervals from which cold utility $j \in CU$ can receive heat. The minimum utility cost problem can be solved by using the following LP model (see Cerda et al. (1983), Papoulias and Grossmann (1983)).
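A sketch of this LP in the notation above, written in the standard transshipment form of Papoulias and Grossmann (1983). The grouping and numbering of the constraints are assumptions and may differ from the original display, as are the convention $R_0 = 0$ for the residual entering the first interval and the reading of $\kappa^{HU}_i$, $\kappa^{CU}_j$ as costs per unit of heat.

```latex
\begin{align}
\min \quad & \sum_{i \in HU} \sum_{t \in TI} \kappa^{HU}_i \, \sigma^{HU}_{i,t}
           + \sum_{j \in CU} \sum_{t \in TI} \kappa^{CU}_j \, \delta^{CU}_{j,t} \tag{F.1} \\
\text{s.t.} \quad & \sum_{i \in HS} \sigma^{HS}_{i,t} + \sum_{i \in HU} \sigma^{HU}_{i,t} + R_{t-1}
           = \sum_{j \in CS} \delta^{CS}_{j,t} + \sum_{j \in CU} \delta^{CU}_{j,t} + R_t,
           & t \in TI \tag{F.2} \\
& R_0 = R_k = 0 \tag{F.3} \\
& \sigma^{HU}_{i,t} = 0, & i \in HU,\ t \in TI \setminus TI_i \tag{F.4} \\
& \delta^{CU}_{j,t} = 0, & j \in CU,\ t \in TI \setminus TI_j \tag{F.5} \\
& \sigma^{HU}_{i,t},\ \delta^{CU}_{j,t},\ R_t \geq 0, & i \in HU,\ j \in CU,\ t \in TI
\end{align}
```

In this form, the stream loads $\sigma^{HS}_{i,t}$ and $\delta^{CS}_{j,t}$ are parameters, while the utility loads and the residuals are the decision variables.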
Constraints (F.4) and (F.5) ensure that heat flows from a temperature interval to the same or a lower temperature interval.
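The temperature-interval construction and the utility targeting step can be illustrated with a minimal sketch. It assumes a single hot utility available in the first (hottest) interval and a single cold utility in the last interval; under that assumption the LP reduces to the classical heat cascade. The function names, the stream data, and the load formula (flowrate heat capacity times the overlap between a stream's temperature range and the interval) are illustrative assumptions, not necessarily the exact expressions used in this appendix.

```python
from typing import List, Sequence, Tuple

Stream = Tuple[float, float, float]  # (inlet temperature, outlet temperature, FCp)


def interval_bounds(hot: Sequence[Stream], cold: Sequence[Stream],
                    dt_min: float) -> List[float]:
    """Breakpoints T_1 > ... > T_{k+1} on the hot-side temperature scale."""
    points = set()
    for t_in, t_out, _ in hot:
        points.update((t_in, t_out))
    for t_in, t_out, _ in cold:
        # Cold temperatures are shifted up by dt_min onto the hot-side scale.
        points.update((t_in + dt_min, t_out + dt_min))
    return sorted(points, reverse=True)


def heat_loads(streams: Sequence[Stream], bounds: Sequence[float],
               shift: float = 0.0) -> List[List[float]]:
    """loads[s][t] = FCp * overlap of stream s with interval [T_{t+1}, T_t]."""
    loads = []
    for t_in, t_out, fcp in streams:
        high = max(t_in, t_out) + shift
        low = min(t_in, t_out) + shift
        loads.append([
            fcp * max(0.0, min(high, bounds[t]) - max(low, bounds[t + 1]))
            for t in range(len(bounds) - 1)
        ])
    return loads


def min_utilities(sigma: Sequence[Sequence[float]],
                  delta: Sequence[Sequence[float]], k: int):
    """Heat cascade: (hot utility, cold utility, residuals R_1..R_k)."""
    residuals, running = [], 0.0
    for t in range(k):
        running += sum(row[t] for row in sigma) - sum(row[t] for row in delta)
        residuals.append(running)
    # Smallest hot utility that makes every residual nonnegative.
    hot_utility = max(0.0, -min(residuals))
    residuals = [r + hot_utility for r in residuals]
    # Heat left after the last interval goes to the cold utility, so R_k = 0.
    cold_utility = residuals[-1]
    residuals[-1] = 0.0
    return hot_utility, cold_utility, residuals


# Hypothetical instance: two hot streams, one cold stream, dt_min = 10.
hot = [(180.0, 80.0, 2.0), (130.0, 40.0, 4.0)]
cold = [(60.0, 160.0, 3.0)]
dt_min = 10.0
bounds = interval_bounds(hot, cold, dt_min)
sigma = heat_loads(hot, bounds)
delta = heat_loads(cold, bounds, shift=dt_min)
q_hot, q_cold, residuals = min_utilities(sigma, delta, len(bounds) - 1)
print(bounds)
print(q_hot, q_cold, residuals)
```

On this instance the cascade needs 20 units of hot utility and 280 units of cold utility, and the zero residual after the second interval marks the pinch at a hot-side temperature of 130. With several utilities at different temperature levels or costs, the cascade shortcut no longer applies and the LP has to be solved directly.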
Minimum number of matches. Given an optimal solution of the minimum utility cost problem, we obtain an instance of the minimum number of matches problem as follows. All utilities are considered as streams, i.e. $H = HS \cup HU$, $C = CS \cup CU$, $n = ns + nu$ and $m = ms + mu$. Furthermore, $T = TI$. Finally, for each $i \in H$ and $t \in T$, the parameter $\sigma_{i,t}$ is equal to $\sigma^{HS}_{i,t}$ or $\sigma^{HU}_{i,t}$ depending on whether $i$ was originally a hot stream or utility. The parameters $\delta_{j,t}$ are obtained similarly, for each $j \in C$ and $t \in T$.
Table G.5: Computational results using exact solvers CPLEX 12.6.3 and Gurobi 6.5.2 with relative gap 4%. Relative gap is defined as (best incumbent - best lower bound) / best incumbent and * indicates timeout. The transshipment formulation performs better than the transportation model: the transshipment model solves one additional problem (balanced10) and performs as well or better than the transportation model on 46 of the 48 test cases (with respect to time or gap closed). CPLEX solves the small models slightly faster than Gurobi while Gurobi closes more of the optimality gap for large problems. All exact method results are available online (Letsios et al. 2017