Optimising Heat Exchanger Network Synthesis using Convexity Properties of the Logarithmic Mean Temperature Diﬀerence

Industrial processes typically involve heating and cooling ﬂuids via networks of heat exchangers which reuse excess process heat onsite. Optimally synthesising these networks of heat exchangers is a mixed-integer nonlinear optimisation problem with nonlinear terms including bilinear stream mixing, concave cost functions, and the logarithmic mean temperature diﬀerence (LMTD), which characterises the nonlinear nature of heat exchange. LMTD is typically associated with numerical diﬃculties, but, after adding the limits, this manuscript proves the strict convexity of LMTD β , β < 0, and also characterises the shape of the function for all β ≤ 1. These proofs motivate why previous, heuristic-based approaches work best when the problem is reformulated to move the LMTD terms into the objective. The convexity results also lead to an eﬀective algorithm bounding the simultaneous synthesis model SYNHEAT from the online test set MINLPLib2; this algorithm solves a series of mixed-integer linear optimisation problems converging to the global objective value of the original problem.


Introduction
Many industrial processes involve heating and cooling liquids: a quarter of the 2012 EU energy consumption came from industry and industry uses 73% of its energy on heating and cooling (European Commission 2016). With the present focus on reducing CO 2 emissions, e.g. the UK Climate Change Act 2008, reusing excess process heat becomes ever more important, as illustrated by applications of heat exchanger network synthesis (Castro et al. 2015, Niziolek et al. 2015, Novak-Pintarič and Kravanja 2015, Martín and Grossmann 2016. Heat exchanger networks allow excess heat to be reused onsite. A heat exchanger is a device that transfers heat from a hot liquid to a cold liquid without allowing them to mix. A Heat Exchanger Network (HEN) is a set of hot and cold streams with a collection of heat exchangers operating over them. Heat Exchanger Network Synthesis (HENS) designs a HEN; the objective is minimising the annual running cost by optimally placing heat exchangers in the network. The annual running cost consists of: a fixed cost per exchanger, a variable cost per exchanger proportional to its area and a variable cost per hot (cold) stream proportional to the amount of external cooling (heating) required, if any, to satisfy the outlet temperature. HENS models are difficult because they: are non-convex mixed-integer nonlinear optimisation problems (MINLP) and contain the logarithmic mean temperature difference (LMTD), a function that can cause numerical difficulties. See Belotti et al. (2013) and Boukouvala et al. (2016) for a review of MINLP and Furman and Sahinidis (2002) for a review of HENS. Approaches to overcome the numerical difficulties created by LMTD include using the Paterson (1984) or Chen (1987) approximations, reformulating the LMTD constraint and enforcing bounds (Huang et al. 2012) or perturbing the function by adding small values, e.g. 10 −6 , to prevent the indeterminate cases from arising (Huang et al. 2012). Some of these approximations add errors into the model and are not accounted for in the solution. This paper proposes a method for approximating LMTD while accounting for the error.
The study of HENS has existed since the mid-twentieth century (Broeck 1944, Hwa 1965. Since its conception, HENS has received a lot of interest with many models proposed to solve these problems. HENS models can be categorised into sequential synthesis and simultaneous synthesis (Furman and Sahinidis 2002). Sequential synthesis solves models sequentially; the solution from an intermediate model parameterises the following model. An example of a sequential synthesis approach consists of the: linear programming (LP) transshipment model, mixed integer linear programming (MILP) transshipment model (Papoulias and Grossmann 1983) and nonlinear programming (NLP) superstructure model (Floudas et al. 1986); these are solved in the stated order and correspond to minimising the: utility cost, number of matches and investment cost, respectively. A sequential approach does not guarantee the global solution, e.g. the global solution may not necessarily contain the minimal possible utility cost. A simultaneous synthesis model overcomes the drawbacks of the sequential approach by only minimising the running cost, e.g. the MINLP SYNHEAT model (Yee and Grossmann 1990). Since simultaneous optimisation models may have more a favourable global optimum than sequential synthesis models Floudas 1991, Escobar andTrierweiler 2013), this paper studies the simultaneous MINLP SYNHEAT model.
Recent research has developed general methods to generate convex envelopes (Tardella 2007, Jach et al. 2008, Locatelli and Schoen 2012, Khajavirad and Sahinidis 2013, but the functional forms examined by these papers cannot be directly applied to LMTD or its reciprocal (RecLMTD). This manuscript proves that the limits of function LMTD and RecLMTD exist over R 2 + , where R + is the set of positive reals. We also show that the limits of the Jacobian and Hessian exist over R 2 + . This leads us to the critical result of strict convexity for RecLMTD β , β > 0. It is important to note that this paper presents alternative proofs; Zavala-Río et al. (2005) showed continuous differentiability and positivity of LMTD and Floudas and Ciric (1989) showed convexity of RecLMTD β , 0 < β ≤ 1. While these properties have been previously proved, the new proofs are valuable because: 1. We give a complete analytical proof for the shape of LMTD for different constants β. After adding the limits, our results are LMTD β , over R 2 + , is concave, strictly concave, linear and strictly convex for β = 1, 0 < β < 1, β = 0 and β < 0, respectively.
2. Instead of analysing a series expansion, e.g. as Zavala-Río et al. (2005), we prove that the limits of LMTD exist via a coordinate transformation (to the polar system) that reformulates this bivariate function as the product of two univariate functions. This results in a shorter, simpler proof.
3. Limits of the derivatives of orders 0, 1 and 2 of LMTD and RecLMTD can be shown to exist using the same transformation described above showing twice-continuous differentiability of these functions.
4. The Floudas and Ciric (1989) convexity proof applies to the half-plane x < y and also applies to the half plane x > y. Our new proof accounts for function indeterminacies on the line x = y and is valid on the entire R 2 + domain.
This manuscript begins in Section 2 by formulating the MINLP SYNHEAT model and identifying model nonlinearities. Section 3 mathematically analyses LMTD; the key result is that RecLMTD, the reciprocal of LMTD, is strictly convex. Using the mathematical results of Section 3, Section 4 introduces an algorithm that solves the MINLP iteratively using a series of mixed-integer linear optimisation problems (MILP); we also show how the algorithm converges to the global optimum. Section 5 examines the numerical performance of the algorithm and identifies better lower bounds on the global minimum than the global MINLP solvers. Section 6 concludes with remarks on the manuscript results and algorithm.

Heat Exchanger Network Synthesis
Heat exchanger network costs are associated with the number of exchangers, the area of each exchanger and the amount of external heating/cooling required to achieve the required outlet temperatures. These costs are constrained by network-correctness properties, e.g. heat or energy balances (see Section 2.1). Among these constraints are the heat exchanger area calculations, which are defined by the log mean temperature difference (LMTD). LMTD characterises the nonlinear nature of heat transfer in the where x, y are the temperature differences between the two streams at either end of the exchanger. See Fig. 1 for reference (note that the Fig. 1 variables are with respect to the Section 2.1 model formulation).

The MINLP SYNHEAT Model
We analyse the MINLP SYNHEAT model (Yee and Grossmann 1990). The MINLP SYNHEAT model, which is diagrammed in Fig. 1, represents the HEN as being partitioned into stages. Each stage allows stream splitting, thereby allowing multiple heat exchangers to act on a single stream. Between stages, split streams are mixed back together giving a single temperature entering and leaving a stage for a particular stream.
The model consists of sets: HP, CP and ST, representing the: hot streams, cold streams, and stages, respectively. There are also labels CU and HU for the cold utility and hot utility. The model parameters and variables are given in Tables 1 and 2, respectively. The intuition for the optimisation problem is that it favours conserving energy by passively exchanging heat between the hot (HP) and cold (CP) process streams. Utilities CU and HU are typically expensive, so the model penalises consuming the energy required to produce CU and HU.
Outlet temperature of the {hot side, cold side} of exchanger i,j,k dt [ijk, cui, huj] [ K ] Temperature approach between {hot stream i and cold stream j at stage k; hot stream i and the cold utility; cold stream j and the hot utility} LMTD [ijk, cui, huj] [ K ] Log mean temperature difference between {hot stream i and cold stream j at stage k; hot stream i and the cold utility; cold stream j and the hot utility} q [ijk, cui, huj] [ kW ] Heat load between {hot stream i and cold stream j at stage k; hot stream i and the cold utility; cold stream j and the hot utility} t [ik, jk] [ K ] Temperature of {hot stream i, cold stream j} at hot end of stage k Binary Variables Units Description z [ijk, cui, huj] -Existence of a match between {hot stream i and cold stream j at stage k; hot stream i and the cold utility; cold stream j and the hot utility} The MINLP SYNHEAT constraints are (Yee and Grossmann 1990): The objective is minimising the Total Annual Cost (TAC). The nonlinearities (all non-convexities) arise from the constraints categorised under: energy balances (first four constraints only, bilinear), LMTD (concave 1 ), Area (bilinear) and the objective (concave for β ∈ (0, 1]).
Example 1. A realistic HENS model may have many hot and cold streams; this may lead to a large number of nonconvex terms in the model. Consider a network with 22 hot streams i, 17 cold streams j and 22 stages k (the number of stages is often taken to be the maximum out of the number of hot and cold streams) (Escobar and Trierweiler 2013). Each nonconvexity associated with: stream to stream heat exchangers (those involving subscript ijk) has 8,228 (= 22 × 17 × 22) instances, cold utility (those involving subscript cui) has 22 instances and hot utility (those involving subscript huj) has 17 instances.
There are: 7 nonconvex constraints associated with each stream to stream heat exchanger (4 in Energy Balances, 1 from LMTD, 1 from Area and 1 from Objective), 3 nonconvex constraints associated with each cold utility heat exchanger (1 from LMTD, 1 from Area and 1 from Objective) and 3 nonconvex constraints associated with each hot utility heat exchanger (similar to cold utility heat exchanger). So, for this example, there are 57,713 (= 7 × 8, 228 + 22 × 3 + 17 × 3)

Mathematical Results
This section studies LMTD and its reciprocal, RecLMTD. Adding the limits over the set S = (c, c) T ∈ R 2 + , we show that RecLMTD is continuous over R 2 + , the same Cold Utility Exchangers Figure 1: Labelled diagrams of the model heat exchangers for the SYNHEAT model. Top: Stream to stream heat exchangers, those associated with hot streams i ∈ HP, cold streams j ∈ CP at stages k ∈ ST, between consecutive stages (ascending and descending) the outlet temperatures are mixed together. Variables f H ijk and f C ijk correspond to the flowrate for the hot and cold streams. Bottom left: Hot utility heat exchangers, those associated with the hot utility and cold streams j ∈ CP. Bottom right: Cold utility heat exchangers, those associated with the cold utility and hot streams i ∈ HP. method of proof shows twice-continuous differentiability for both LMTD and RecLMTD over R 2 + . We also analyse the shape of LMTD and RecLMTD, here the key result is that RecLMTD β is strictly convex, β > 0. Table 3 summarises the convexity results; refer to the online supplement for proofs including more of the details. This paper primarily analyses RecLMTD instead of LMTD because the optimisation model has the following property (A, the area, forms part of the objective): To simplify error analysis in Section 4.3, we treat the area constraints as a product of variables opposed to a division as shown by Eq. (1).
In the model, the area of a heat exchanger (A) is raised to the β-th power (0 < β ≤ 1). Distributing the exponent β over the product on the right hand side of Eq. (1) gives an alternative function (RecLMTD β ) for analysis.

Limits
Direct evaluation of RecLMTD when we equate the variables gives: A similar result to that of Eq. (2) occurs for all elements of the gradient and Hessian of RecLMTD over the line y = x. The indeterminate set of points can be seen in Fig. 2, the heat map shows the difficulty in proving the limit existences as the indeterminate set gives a path (y = x) over which we cannot directly evaluate the limit.
Proof. Let S = (c, c) T ∈ R 2 + , the set over which we shall show that the limit of RecLMTD exists. S (line y = x) is associated with angle θ c = π / 4 in polar coordinates; this can be seen in the Fig. 2 heat map. We switch from the Cartesian to the polar system using transformations x r cos(θ) and y r sin(θ). The limit Applying the transformations mentioned above, RecLMTD becomes: Equation (3) is now a product of separable functions f and g where f is continuous and g is well-defined except at π / 4 , direct evaluation gives g( π / 4 ) = 0 / 0 . For g, applying l'Hôpital's rule allows us to evaluate the limit: We can now prove the limit: The polar transformation used to prove the limit of RecLMTD can also be used to evaluate its gradient and Hessian over the indeterminate set. Definition 1 in the online supplement shows how to create a well-defined formulation using these limits.

Shape
Figure 2 suggests that RecLMTD is convex. A similar plot for LMTD implies concavity. Proposition 2 generalises these observations; the results are outlined in Table 3. For the rest of the manuscript, we assume that the limits derived in Section 3.1 are explicitly added to the functions; now RecLMTD and LMTD are twice-continuously differentiable.
Proposition 2. Let β ≥ −1 be constant. RecLMTD β : R 2 + → R, the reciprocal of the log mean temperature difference raised to the β-th power: Proof. For the case of β = 0, since RecLMTD is well-defined and non-zero over R 2 + , RecLMTD 0 = 1 which is linear.
For the case of β > −1, β = 0 we shall analyse the Hessian of RecLMTD β , showing that ∇ 2 RecLMTD β is positive definite for β > 0 and negative definite for −1 < β < 0. We analyse it separately over the sets S = {(x, x) T | x > 0} and S * = R 2 + \ S . In the sequel, we aid readability by using m to represent RecLMTD (no exponent); the function parameters are also dropped. The Hessian of RecLMTD β is given by: Over S , evaluating the limits of the partial derivatives of Eq. (4) (similarly to Section 3.1) results in Eq. (5), justification is presented in the online supplement Proposition 1 and Corollary 1: where: Analysing the leading principle minors (the upper left element, D 1 , and the determinant, D 2 ) of the matrix A gives: For β > −1, both D 1 and D 2 are positive therefore, by Sylvester's criterion, the matrix A is positive definite. Hence, since For the remainder of the proof, we use the substitution w = x / y . Once again, we analyse the leading principle minors of the Hessian. For positive definiteness, both leading principle minors need to be positive. For negative definiteness, the leading principle minor of order 1 must be negative and the leading principle minor of order 2 must be positive.
Taking derivatives, we get: Using the identity: sinh(2n) ≡ 2 sinh(n) cosh(n), we get: Clearly, h (4) (n) > 0 when n = 0. Evaluating at n = 0, the derivatives of h of orders 0 to 3 all equal to 0. Using reasoning similar to that of the function r, we get h(n) > 0 when n = 0. This shows positivity of g(w).
The final case is β = −1. Over the line x = y, substituting β = −1 gives (note x > 0): Over S * , substituting β = −1 and taking out common factors gives: where: To see k(w) ∈ R + , substitute w = e n and reason similarly to the arguments for g(w).

Further Results
We found that LMTD has a well-defined formulation for its derivatives up to order 2, i.e. LMTD ∈ C 2 . The proofs for these results are similar to that in Section 3.1. The Section 4 algorithm only uses milder assumptions of once-continuous differentiability, but observe that the twice-continuously differentiability property implies the applicability of other convex MINLP algorithms, e.g. nonlinear branch-and-bound where the NLP subproblems are addressed via second-order methods (Bonami et al. 2008).

A MILP Algorithm
This section uses the strict convexity of RecLMTD β to develop an algorithm converting the MINLP to an MILP relaxation and iteratively tighten this approximation to converge to a solution. We show that the algorithm converges to the global minimum of the MINLP. Najman and Mitsos (2016) use an early version of our work (Mistry 2015) to show that the concavity of LMTD could be alternatively leveraged using multivariate McCormick relaxations (McCormick 1976, Tsoukalas andMitsos 2014); this immediately applies to prior work underestimating LMTD with the chain rule (Watson et al. 2015).

Changes to the MINLP Model
As mentioned in Section 3, we reformulate from LMTD to RecLMTD. We propagate these changes into the MINLP model by replacing variables {LMTD ijk , LMTD cui , LMTD huj } with {RecLMTD ijk , RecLMTD cui , RecLMTD huj }, respectively. We also substitute the LMTD and Area constraints with: Observe that these changes are similar to one of the Escobar and Grossmann (2010) reformulations from MINLP SYNHEAT to a specialised model. The Escobar and Grossmann (2010) specialised model makes the following changes to MINLP SYNHEAT: • Replaces LMTD with the Chen (1987) approximation (x · y · (x + y)/2) 1/3 ; • Moves the Chen (1987) approximation to the objective function; • Introduces an approximation to eliminate some of the bilinear terms in the constraints.
The Maranas and Floudas (1995) convexity/concavity identification procedure shows that the Chen (1987) approximation is concave and that its reciprocal is convex. The Escobar and Grossmann (2010) specialised model is not convex, but it does eliminate several nonconvexities. Escobar and Grossmann (2010) show that convex MINLP optimisation solvers Bonmin (Bonami et al. 2008), DICOPT (Duran and Grossmann 1986), and SBB perform well on a reformulated model with LMTD-related terms in the objective. The convexity result in this manuscript justifies why the Escobar and Grossmann (2010) model works better with the LMTD term moved to the objective.

Linear Outer Approximation
The MILP outer approximation is created by reformulating the model nonlinearities, these are found in constraints groups: Energy Balances, RecLMTD, Area and TAC.  ). This approach partitions the domain of the bilinearity, forming the union of multiple disjoint McCormick hulls. For the concavities, we use a piecewise linear approximation of secant lines which adds further binary variables into the model ).
The nonlinear convex terms are associated with RecLMTD. For these functions, we can create an outer approximation by adding linear cuts constraining RecLMTD to be greater than a set of tangents generated at a set of predetermined points (Duran andGrossmann 1986, Bonami et al. 2008). Using an outer approximation is justified because the objective function minimises: where i ∈ HP, j ∈ CP and k ∈ ST. Since α, β > 0, TAC minimises sums consisting of: and, since RecLMTD forms part of a product in the area calculations, we seek to minimise: RecLMTD ijk , RecLMTD cui and RecLMTD huj . Figure 3 illustrates that modelling RecLMTD with a set of 'greater than or equal to' constraints will end on the topmost tangent at a given point (the model is trying to minimise). The approximations are all relaxations: the McCormick Hull increases the search space, the piecewise linear functions are used on concavities which are minimised and the convexities are as described above, i.e. the approximated search space contains the feasible region regardless of the approximation tightness.

The Algorithm
The convexity of RecLMTD coupled with the MINLP structure gives us a useful way of approximating RecLMTD without introducing binary variables. The linear approximation also allows us to overcome the numerical difficulties that RecLMTD presents. This linear approximation could be alternatively applied to prior work approximating HENS as an MILP (Hasan et al. 2009, Nguyen et al. 2010); the step change to those algorithms would be a rigourous bound on the LMTD approximation. The algorithm we present uses RecLMTD opposed to LMTD and partitions the domains of   Björk and Westerlund (2002) split the area calculations (bilinearities) into a convex and a concave constraint via the Paterson approximation (1984), here the concave LMTD structure could be used similarly to how we use RecLMTD or directly by including the limits (Proposition 1 in the online supplement).
Algorithm 1, illustrated in Fig. 4, is an iterative process. The key steps are: assessing the termination criteria (block 3 in Fig. 4 and line 3 in Algorithm 1) and tightening the constraints (block 5 in Fig. 4 and lines 4 to 9 in Algorithm 1). Note that line 4 in Algorithm 1 ensures that we have correctness for mixer bilinear constraints (3 rd and 4 th equations in the Section 2.1 category 'Energy Balances'). The errors are categorised into energy balancing bilinear errors, area bilinear errors, concave errors and RecLMTD errors. These 4 categories have an associated ε > 0. If the maximal relative error is bounded by the associated ε across all categories, we terminate. For example, max{relative error across all RecLMTD approximations} < ε RecLMTD means that the Algorithm convergence is motivated by showing that, for any ε TAC , ε Mixer > 0, there exists a linear approximation returning an objective value within ε TAC of the correct TAC and satisfying the Mixer Energy Balancing Constraints (labelled Energy Balances in the Section 2.1 SYNHEAT model) within a tolerance of ε Mixer (assuming a MILP solver that solves to global optimality). This linear approximation can be derived by funnelling (Fig. 7) the model errors. The 'Mixer Energy Balancing Constraints' are missing from the Fig. 7 funnel, but the q ijk definitions contain the relevant nonlinear terms, so, tightening q ijk indirectly tightens the 'Mixer Energy Balancing Constraints'. We can construct approximations that will satisfy the derived error bounds because: • Reducing the interval sizes (more breakpoints) over which we partition the concave functions causes a strict error decrease size; • Similarly to the concave functions above, a reduction in the interval sizes over which we partition the bilinearities causes a strict decrease in the error size. Although, here we have the added restriction that the interval sizes must be small enough such that cascaded errors do not invalidate further estimation; • In the model, the RecLMTD domain is a closed box with positive lower and upper bounds. Over a domain of this form, we have shown that the function is continuously differentiable hence it is Lipschitz continuous. This approximation is also used in further approximations hence, as in the bilinear case, we assume that the l u   approximation error has a small enough bound so that we can achieve a tolerance of ε TAC on the objective, i.e. that this approximation does not invalidate the bound on the error in further approximations.
This reasoning shows that we can construct a model within ε TAC of the globally optimal objective value, but the model may take a long time to solve for relevant instances. Instead, the proposed algorithm places breakpoints and tangents dynamically, so the resulting approximation may not be evenly placed. For algorithm convergence, assume that the approximated functions have error tolerances (opposed to TAC and the Mixer Energy Balancing Constraints, the tolerances can produce worst case error violations). In the algorithm, placing further tangents/breakpoints excludes the current incumbent resulting in strict TAC increases and strict approximation error decreases in local neighbourhoods. In the worst case, the algorithm reduces to the construction described above. The construction has a finite number of tangents and breakpoints hence the algorithm will have a finite number of iterations to generate the construction. Since each iteration is solved in finite time, the algorithm will terminate in finite time satisfying the tolerances.

Distributing the Area Exponent
In the MILP outer approximation model (Section 4.2), we can distribute β (note 0 < β ≤ 1) over the product that forms the area. where (h −1 i + h −1 j ) β is constant, q β ijk is concave and, by Proposition 2, RecLMTD β ijk is strictly convex. We get a similar result for the cold and hot utility heat exchanger areas, they only differ from Eq. (9) by indices.
In the following, the variables are mentioned without their subscripts to aid readability, e.g. q means the variables q [ijk,cui,huj] , see Table 2 in Section 2.1. The linear outer approximation described in Section 4.2 can be amended according to the distribution of the exponent over the area bilinearity as shown by Eq. (9). The difference here is that the concave calculations and one of the sets of bilinear calculations are associated with different variables. We introduce a new set of variables: q [β] for the approximated value of q β (note the square brackets to show that this is a second model), RecLMTD [β] instead of RecLMTD. The variables associated with the area, A, are removed from the model and the calculation of A [β] becomes a bilinearity in the variables q [β] and RecLMTD [β] .
For this model, the associated algorithm is similar to that of Section 4.3. The approximations we make are still categorised under convex, concave and bilinear. The break/tangent-point placement for these categories remains the same, it is the variables that differ. Covergence can be proven in the same way as in Section 4.3 as there is still a funnel of errors as shown by Fig. 8 which is similar to that shown by Fig. 7. The cascade of errors shown by the funnel allows us to prove that we can construct a model that converges to an arbitrary positive precision (the process would be similar to what is described for the algorithm in Section 4.3).

Numerical Results
This section discusses numerical results from implementing the Sections 4.2 and 4.4 models. The associated algorithms that we have implemented are a modified version of the algorithm in Section 4.3, the change is that we do not carry out the tightening process of line 4 in Algorithm 1 (a subset of the line 4 approximations are tightened by line 6). It is important to note that this modification means that the implementations do not necessarily converge to the global solution however this modification was made because the implementation of Algorithm 1 takes a very long time to solve for the models that we test against, this is because we do a fresh solve on each iteration therefore previous knowledge is discarded causing the MILP solver to slow down. Dropping line 4 in Algorithm 1 still provides good bounds for the problems that we test against since they have a small number of stages. When referring to the two algorithms that these models create in this section, we shall call the algorithm associated with Section 4.2 the first algorithm and the algorithm associated with Section 4.4 the second algorithm.
The two algorithms differ in approximating the area objective calculations since the first algorithm does not distribute exponent β while the second algorithm does. These differences make it difficult to define termination criteria that result in equivalent solutions, i.e. solutions with equivalent objectives and final models. The funnel of errors for the first algorithm (Fig. 7) has 'levels' to it (excluding the Mixer Energy Balancing Constraints), we use these 'levels' to define termination criteria. For the first algorithm our termination criteria for the top row in Fig. 7 (approximations associated with q ijk , RecLMTD ijk , RecLMTD cui and RecLMTD huj ) is that the relative error of these approximations is 0.001. For the next two levels we have a similar condition however we increase the tolerance by an order of magnitude, i.e. max{relative error of A ijk , A cui and A huj } < 0.01 and max{relative error of A β ijk , A β cui and A β huj } < 0.1. The funnel of errors for the second algorithm (Fig. 8) doesn't give as obvious a grouping. However, for similarity, we group these terms in a similar fashion to that of the first algorithm. Here we take the q [β] 's in the second algorithm to be on the same level as the A's in the first algorithm. This gives our termination criteria, for the second algorithm, to be: • max{relative error of bilinearities associated with q ijk } < 0.001 • max{relative error of RecLMTD huj } < 0.1 The algorithms were implemented using Pyomo (Hart et al. 2011(Hart et al. , 2012 with solver Gurobi 6.0.3 (Gurobi Optimization Inc. 2015). The models were run on an HP EliteDesk 800 G1 with 16GB of RAM and Intel Core i7-4770 @ 3.40GHz, running Ubuntu 14.04.

The Models
We used three heat exchanger models for which the solution data is published online. The number of hot and cold streams, and stages (associated with number of binary variables) in the models are: • Model 1 2 : 2 hot streams/2 cold streams/2 stages • Model 2 3 : 5 hot streams/1 cold stream/2 stages • Model 3 4 : 5 hot streams/5 cold streams/2 stages The full specifications are given by Tables 5 to 7.
In these tables, the parameters: α, β, c, c CU and c HU ; are indirectly stated by: Cost of Heat Exchangers, Cost of Cooling Utility and Cost of Heating Utility. We compare our results to the: best solution and greatest lower bound; reported out of the MINLP solvers: ANTIGONE (Misener and Floudas 2014), BARON (Tawarmalani andSahinidis 2005, Sahinidis 2014), Couenne (Belotti et al. 2009), LINDO (Lin andSchrage 2009, LINDO Systems Inc. 2015) and SCIP (Achterberg 2009, Vigerske 2012); on MINLPLib2 (Bussieck et al. 2003) when solving the general model (Section 2.1). The reported results are shown in Table 8, we only report the best primal solution and greatest lower bound with the solvers that achieved them. Note that Alpha-ECP Pörn 2002, Westerlund andLundqvist 2005), which does not return lower bounds, achieves the best primal solution for one of the models. On each iteration our algorithms have a tighter outer approximation, therefore the reported result at each iteration will be a lower bound hence they should agree with the global optima.

The Results
For Models 1 & 2, we take the best primal solutions to be as reported in Table 8. For Model 3, we take the best primal solution to be $64,138.16, a feasible solution that we have found. Our solution does not invalidate the lower bound presented in Table 8 since these results are for instances where the parameters for the LMTD function are perturbed by a small value to prevent numerical difficulties from arising. Common approaches to avoid the numerical difficulties posed by the LMTD function introduce an error into the model that is not accounted for in the solution. Our algorithm and approximation methods allow for dynamic reduction of error while preventing numerical difficulties. The online solution is 1% larger than our feasible solution and Model 3 is a small problem; a larger, more realistic problem could see a larger gap between the correct solution and the solution achieved by using common approximation methods for LMTD.  (Escobar and Grossmann 2010) Stream Cost of Cooling Utility ($ · kW −1 · yr −1 ) = 15 Cost of Heating Utility ($ · kW −1 · yr −1 ) = 80 Table 6: Model 2 input data (Escobar and Grossmann 2010) Stream Cost of Cooling Utility ($ · kW −1 · yr −1 ) = 10 Cost of Heating Utility ($ · kW −1 · yr −1 ) = 140 Table 7: Model 3 input data (Escobar and Grossmann 2010) Stream    While the models are small in size, the results show different characteristics. We look at the: objective value and cumulative time; against the iteration. The results of the first algorithm for Models 1 to 3 are shown by Figs. 9 to 11 respectively and the results of the second algorithm for Models 1 to 3 are shown by Figs. 12 to 14 respectively.
Since the algorithm consists of only outer approximations, a solution will be less than or equal to the global solution. We see a progressive rise towards the global objective value across the models. For Model 2, we terminate at a value larger than the global value, but the difference between the two values is not significant relative to the global solution and arises from numerical error. The results all show a steep rise towards the global solution before flattening out implying that initial values could carry some useful information, e.g. from iteration 4 all models are within 10% of the global solution.
Looking at the running times, we generally see, as expected, a trend of increasing time required to solve each successive iteration. The time taken to terminate is a noticable difference between the two algorithms, shown in Table 9. The second algorithm is faster than the first for all of the models, particularly for Model 2 where the first algorithm takes more than 8 hours whereas the second algorithm takes under 3 minutes. We do not get a speed up of the same magnitude for the other two models therefore such results may be problem specific. For the area calculations in the objective, the exponent is applied after the bilinear calculate for the first algorithm and vice versa for the second algorithm. We place the speed up on this property. Comparing across models we can see that as problem sizes scale up the time taken to terminate rises, where Model 3 takes longer 2 hours to solve for both algorithms. A practical application of the algorithm would have to be able to handle problems that are much harder than Model 3 and may not terminate. The algorithms mainly encounter scalability issues because each iteration sees a fresh solve of the (tightened) problem, here knowledge of a previous branching tree could be used to drive the current search.  Scaling is an issue, but the algorithm acts as a method by which we can place a lower bound on the problem in question, i.e. a method that allows us to place a certificate on a MINLP solution. From Table 8, the best lower bounds (for Models 1 and 2 only) are placed quite far from the global solution. Our handling of LMTD allows us to close this gap. Running the algorithms with no additional binary variables (single McCormick hull for bilinearities and single piece for concavities) results in Tables 10 and 11, we shall refer to these algorithms as the weak first and second algorithms. The bounds achieved in Tables 10 and 11 took at most 33 seconds, the solutions and bounds stated on MINLPLib2 (Table 8) are achieved in at most 15 minutes. In 15 minutes, the second algorithm achieves a bound of $64,054.46 (iteration 12) for Model 3, it terminates for the other two problems. For the weak algorithms, the percentage gaps (calculated using: (best solution − lower bound)/best solution) reduce by about: 12% in Model 1 (both weak algorithms) and 2% (weak first algorithm) and 5% (weak second algorithm) in Model 2 (we cannot compare the Model 3 gap).
The second algorithm objective values outperform the first algorithm objective values in the early Models 2 and 3 iterations (since β = 1 for Model 1, the approximations are equivalent from a bounding perspective). The weak algorithms also favour the second approach since it results in a tighter lower bound. The second algorithm yields better initial results because it makes better use of the structure of RecLMTD β and the maximal approximation error of applying the piecewise approximation before the McCormick hull is less than the opposite order as found in the first algorithm. We will show these bounds on the error numerically for Models 2 and 3.
To compare the two algorithms, assume that each concave cost and bilinear term is RecLMTD β Figure 15: The difference in how the area (variables only) of a heat exchanger is calculated by the two algorithms. Left: the first algorithm -the product of q and RecLMTD is taken, the result is then raised to the β-th power. Right: the second algorithm -q is raised to the β-th power, RecLMTD β is still convex and forms a single approximation, the product of these two variables is then taken.
underestimated without partitioning and that sufficient tangents approximate RecLMTD and RecLMTD β such that the error is small. The algorithms differ in order-of-operations: the first multiplies variables and raises the result to the β-th power, the second raises each variable to the β-th power and multiplies the result, see Fig. 15. Since the approximations only appear in a minimisation objective function, we limit our analysis to the convex underestimators. Analysis is further simplified as the lower bound on {q, q β , A, A β } is zero. Notation q l , q u represents the lower and upper bounds of variable q, respectively, the notation is similar for q β and RecLMTD β , however we use label r for RecLMTD when referring to bounds. Androulakis et al. (1995) showed that the maximal error attained by the McCormick hull when estimating b · x · y (where b is constant) is: and occurs at the point 1 2 (x l + x u ), 1 2 (y l + y u ) . The maximal error of estimating x β for 0 < β < 1 and x l = 0 is and occurs when x = β−1 x β u βxu . The derivation of Eq. (11) is given by Proposition 5 in the online supplement. We want to calculate bounds on the errors in these approximations assuming that the input has an error.
The first algorithm takes the McCormick hull first and thereby creates potential error ξ M calculated as in Eq. (10), substituting {q, RecLMTD} for {x, y}. Auxiliary variablê A, created from the multiplication, is then used in the subsequent concave estimation whereÂ = A − ξ M and A is the correct value at the point of maximal error. Analysis of Figure 16: Where the maximal error occurs under the assumption that the input has a fixed error. Left: the case for the first algorithm, ξ M is the maximal error attained by making the McCormick estimation. Right: the case for second algorithm, ξ C is the maximal error attained by making the concave estimation (we assume that the RecLMTD β variable is fixed). For both cases, since the distance between the correct, p C , and estimated, p E , values is constant, a is constant resulting in the error, a + b, being maximised when b is maximised. The range in the bound on the error for the function approximations in the first and second algorithm and the average percentage gap between the bounds in the approximations for Models 2 & 3. The average gap is calculated relative to the bound in first algorithm. The gap for a particular approximation is given by: (first alg. bound − second alg. bound)/first alg. bound. Note: the bound on the error in the second algorithm approximations are always lower than the corresponding approximations in the first algorithm for these models.  Fig. 16 indicates why the bound occurs here. The second algorithm applies these approximations in the opposite order. Calculating the concave function first creates a potential error of ξ C calculated as in Eq. (11). The variableq β is then used in the subsequent McCormick estimation whereq β = q β − ξ C and q β is the correct value at the point of maximal error. We analyse the convex envelope under these conditions by adapting the proof presented by Androulakis et al. (1995), the difference, in our case, is that we analyse the following optimisation problem (ξ is constant): This gives a bound on the error of (u β is a scaling factor for area calculations): u β q β u r β u − r β l 4 + u β ξ C r β u + r β l 2 + u β ξ 2 C r β u + r β l 4q β u at the pointq β = q β u 2 − ξ C 2 and RecLMTD β = r β u +r β l 2 + ξ C (r β u −r β l ) 2q β u , for RecLMTD β fixed at this point, Fig. 16 indicates why the bound occurs here. The derivation of the above bounds are given by Section C.1 in the online supplement. Substituting the bounds derived from the input data of Models 2 and 3, results in the second approach giving a lower bound on the error for every estimation. Table 12 summarises the results; the bound on the error is on average 24.2% lower in the second algorithm. While the absolute values of the bounds are small, we see a significant difference in the second algorithm objective values (in early iterations) because the area has a cost factor, α, that scales the area by at least 146 (α Model 2 = 1200 and α Model 3 = 146), generally the value for α scales the area up by at least two orders of magnitude. The effectiveness of the second algorithm is shown in the active binary assignments (on exchangers only) over the iterations for Model 3 where the set of binary assignments that have been active is less than that of the first algorithm, the second algorithm also shows more convergence properties in the active binary assignments as iteration 13 onwards only tightens approximations for a single set of binary assignments, the first algorithm does not display a similar property.

Conclusion
This paper examines the non-convex HENS problem focussing on the LMTD function. We show that, from a numerical analysis perspective, the reciprocal, RecLMTD, is more appropriate for the optimisation model. While RecLMTD causes numerical issues for solvers, we showed that it is: twice-continuously differentiable and strictly convex. These results allowed us create an outer approximation algorithm that approaches the global solution of the SYNHEAT HENS model. Numerical results show that rate of convergence for smaller problems is not unreasonable and that we can achieve lower bounds that are significantly higher than what the solvers currently achieve. The solvers appear to be finding the global solutions however providing a certificate is not so easy as the lower bounds placed on the problems are not informative. Numerical issues are among the reasons that solvers struggle in placing bounds for these problems, we have proven results that allow these difficulties to be overcome.