Optimal inflow control of production systems with finite buffers

We introduce the optimal inflow control problem for buffer restricted production systems involving a conservation law with discontinuous flux. 
Based on an appropriate numerical method inspired by the wave front tracking algorithm, we present two techniques to solve the optimal control problem efficiently. A numerical study compares the different optimization procedures and comments on their benefits and drawbacks.

The models of interest rely on conservation laws with discontinuous flux functions representing production units with finite buffers. The evolution of the part density ρ(x, t) ∈ [0, ρ max ] satisfies for all x ∈ [0, 1] the equation where the relation between flux and density is given by Here, H(·) denotes the Heaviside function andf (ρ) is a smooth concave function. The existence of solutions to this problem is proven in [18]. A detailed motivation and discussion for the model (1)- (2) can be found in [2,13]. For our investigations, we restrict to the linear case withf (ρ) = aρ and a constant processing velocity a > 0. We distinguish between a free flow regime, i.e. H(·) > 0 as long as ρ < ρ max , and a congested regime, i.e H(·) = 0 if ρ = ρ max .
Remark 1. The solution of (1) with (3) is defined as the limit of a regularized flux function for δ → 0:

SIMONE GÖTTLICH AND PATRICK SCHINDLER
Wave Front Tracking DFG method Simulation Optimization Figure 1. Coherence diagram: The idea of the Wave Front Tracking is used to motivate a problem-adapted finite volume scheme, i.e. the Discontinuous Flux Godunov (DFG) method. The DFG method provides good numerical solutions within acceptable time, cf. Section 4.1. For optimization purposes, the DFG method directly leads to two different approaches to determine the optimal inflow into the system, cf. Section 4.2.
The limiting process merges the jammed and congested regimes to a single point ρ = ρ max , cf. [13].
In the context of optimizing capacity constrained production systems, the maximization of flow and/or the minimization of buffering zones are of particular importance. They are directly related to control issues such as routing of goods through the system [7,9,21], demand tracking [22] and maintenance scheduling [9].
Within this paper, we are interested in controlling the inflow subject to (1) and (3), cf. Figure 1. This corresponds to the idea of a boundary control ρ(0, t) by additionally ensuring that a predefined amount of goods will be produced. To solve this PDE-restricted problem numerically, we first have to discuss an appropriate numerical scheme for the discontinuous flux function (3).
We use the scheme proposed in [13] and present a detailed numerical analysis similar to [27]. Then, we exploit the theoretical results to solve the inflow control problem in two ways, both following the approach first discretize-then optimize, cf. [14,21]. One can show that the two approaches only differ in their numerical accuracy.
2. Numerical method. In this section we briefly review the weak solutions of (1) and (3) and discuss the corresponding wave front tracking algorithm, see [3,5,20]. The latter will be used to set up the numerical framework for the discontinuous flux Godunov (DFG) method recently introduced in [13]. The DFG method is a finite volume approach supplemented with a problem-adapted numerical flux allowing for a sharp tracking of shocks. Therefore this formulation will be essential for the numerical consideration of our optimal control problem.
According to the discussion in [2,13,18,27], the solution can be classified in three cases: Figure 2. Choice of the initial data.
(A.) If ρ l , ρ r < ρ max , then the solution is a shock wave with velocity s = f (ρ l )−f (ρr) ) If ρ l = ρ max and ρ r < ρ max , then the solution consists of two rarefaction shock waves. However, this is a special situation where in fact a wave with velocity negative infinity and zero magnitude (so-called zero waves) is followed by a classical shock wave with positive velocity s = a. This means, applying the Rankine Hugoniot condition s = −f (ρr) ρmax−ρr directly to the conservation law with discontinuous flux, one would get negative valued velocities for s.
Considering the regularization f δ in (4) one obtains a solution consisting of two dispersing shock waves with intermediate state σ. The speed of the two waves is s = − 1 δ and s = a. For small δ the solution of the conservation law with f δ approximates the classical shock wave solution to the Riemann problem with speed s = a. (C.) If ρ r = ρ max and ρ l < ρ max , then the shock speed is given by s = f (ρ l )−fout ρ l −ρmax , where f out is either f (ρ max ) = 0, i.e. s < 0, or used as boundary condition with f out ∈ [0, aρ max ].
Discretize the spatial domain in a grid 0 = x 0 < x 1 < x 2 < x N −1 < x N = 1 and define the step size ∆x := x i − x i−1 . Then, the Riemann problem is: The initial data at the boundaries is chosen as (cf. also Figure 2): The choice corresponds to the discussion in case (C.). Note that if f out = 0 for a certain time, the buffer is full and a backlog will occur. Otherwise, we are in the free flow regime, i.e. f out = aρ max .
Then, the wave front tracking algorithm works by solving the series of initial Riemann problems (5). We choose a starting time defined as τ N = 0, i.e. no interaction between two waves has happened so far. Since the initial data is piecewise constant given by the Riemann problem (5), different wave fronts will evolve over time. The core of the wave front tracking algorithm is to track each wave propagation individually, cf. Figures 3 and 4. As we will see in Section 2.2, the idea of the wave front tracking algorithm will be used to motivate an appropriate numerical method to rigorously derive the conservation law with discontinuous flux.
Corresponding to the cases (A.), (B.) the shock front at position x i moves with positive velocity, if there is no interaction between any backward traveling shock waves. Generally, the positive shock velocity is computed via cf. case (A.) in Section 2.1. Additionally, if f out > f (ρ N ), the shock moves in positive direction. Thus, there it exists no backward traveling shock wave. If f out < f (ρ N ), a shock with negative speed appears in the solution. This shock starts at location x N and moves with velocity s − N : If the negative shock wave interacts with the shock wave of velocity s + N , we get a new single Riemann problem with states ρ l = ρ N −1 , ρ r = ρ max , f (ρ r ) = f out . Therefore a new shock wave appears. Generally, the shock velocity is determined by Note that the shock with velocity s − i moves slower than s + i , i.e. s − i ≤ s + i . We define the time τ i−1 for the interaction of wave s − i with wave s + i . The slower traveling shock changes its velocity to s − i−1 . For an illustration, see Figure 3. Here, we observe that the collision of waves lead to new Riemann problems.

Remark 2.
In the case of a zero wave, i.e. a full buffer ρ i = ρ max , the shock speed is s = −∞. This yields that τ i−1 = τ i and we have an instantaneous transport of information.
For evaluating the time of wave interaction τ i−1 it is necessary to have knowledge about the previous waves. Therefore we assume that τ i is known. Then, the intersection of characteristic curves is In the next section we explain how the information induced by the wave front tracking algorithm can be used to derive a suitable and efficient numerical scheme to solve the conservation law with discontinuous flux.

2.2.
Discontinuous Flux Godunov. Based on our theoretical considerations, we introduce a finite volume scheme to solve (1) and (3). The proposed Discontinuous Flux Godunov scheme was presented for the first time in [13]. However, a detailed derivation and a numerical analysis are missing so far. Applying the idea of front tracking combined with the finite volume approach will lead to a scheme called discontinuous flux Godunov (DFG). As before, the spatial domain is divided into N cells [x i−1 , x i ] with constant width ∆x. The solution is assumed to be piecewise constant on each grid cell.

SIMONE GÖTTLICH AND PATRICK SCHINDLER
The following three-stage-algorithm, referred to as the Reconstruct-Evolve-Ave rage or REA algorithm as in [23], is used to compute the time evolution for ρ n+1 • Evolve the conservation law exactly using initial data ρ(x, t n ), thereby obtaining ρ(x, t n+1 ) at time t n+1 = t n + ∆t. • Average the solution ρ(x, t n+1 ) to determine new cell average values Here, the evolution step can be performed by solving local Riemann problems at each cell interface setting ρ l = ρ n i−1 and ρ l = ρ n i . We solve Riemann solutions of regularized flux functions according to (4) and consider the limit case δ → 0. Similar approaches are also used for conservation laws with discontinuous flux in [11,24,27]. Especially the work of [27] uses REA algorithms for the construction of finite volume methods. The main advantage of this approach is that zero waves with infinite speed can be reproduced without any hard restriction of the CFL condition, i.e. ∆t ≤ a∆x instead of ∆t ≤ |δ|∆x for the regularized problem (4). Since three different cases (A.)-(C.) may arise as Riemann solutions, the front tracking approach can be used to perform the evolution step: (1.) Solve each local Riemann-Problem and determine the shock velocity s. After the evolution step, the front tracking solution is averaged to obtain the cell average values ρ n+1 i . Let F (x, t) be the numerical flux f (ρ(x, t)) at position x and time t. Then, integration of (1) over the domain [t n , t n + ∆t] where λ = ∆t ∆x . Let us explain the choice of F n i : At first we consider the flux at the discretization point x N . If aρ n N < f out , then the shock moves in positive direction, i.e. F (x N , t) = aρ n N . Otherwise, the shock moves with the negative speed s − N and therefore the flux becomes F (x N , t) = f out . In particular, it holds F (x N , t) = min{aρ n N , f out }. Next, we consider the flux F (x i , t) at point x i . Here, two configurations are possible: • The backward traveling shock wave s − i never reaches the point x i at time t ∈ (t n , t n + ∆t). Then, the solution is just a shock wave between ρ n i , ρ n i+1 with positive speed s + i+1 = a > 0 and the flux at • The backward traveling shock wave passes the location x i at timeτ i ∈ (t n , t n + ∆t). Analogous to the previous situation, the flux at where the timeτ i can be computed as: This leads to the following numerical flux: The backward traveling shock wave passes x i if and if only the wave velocity s − i is negative. This is valid for aρ n i > f out . Hence, Summarizing, the discontinuous flux Godunov method (DFG) is defined as: supplemented with the numerical flux: The solution is also given by the initial data and the boundary values. For the inflow, we choose the left boundary value of the density ρ(0, t).
It is also necessary to prescribe outflow boundary conditions. We introduce a variable f n out that limits the outflow in a following way: The DFG-method satisfies the following numerical properties:

SIMONE GÖTTLICH AND PATRICK SCHINDLER
Proof. We consider the following case distinction for the numerical flux F n i−1 : is constant for all ρ n i . Case 2. Assume that the flux satisfies F n i−1 = aρ n i−1 . We also assume that there exists a constant K ≥ 0 such that Due to the CFL condition and (7)- (8), ρ n+1 i is non-decreasing for all ρ n i . Thus, the DFG-method is monotone.
We have shown that the DFG-method belongs to the class of monotone methods. These methods also imply the properties l 1 −contracting, TVD and monotonicity preserving, see [23].
Having this numerical scheme at hand, one is able to consider the respective optimal inflow control problem straightforward. We introduce a Lagrangian formalism and compare the latter to a mixed-integer program.
3. The inflow control problem. The inflow control problem consists of a minimization of inflow such that the constraints given by the conservation law with discontinuous flux is ensured. In other words: We assume that for a predefined outflow the optimal inflow into the production system is determined such that congestions are avoided. It is up to optimization to find the optimal time-dependent inflow u n = F 0 n regarding the fact that a certain supply S, i.e. F n 0 ≤ W ∀n, must be fulfilled.
To compute the optimal control, the continuous optimality system is discretized and solved by descent type methods. This approach is known as optimize-then discretize. In the context of hyperbolic partial differential equations as constraints, a variety of literature on sensitivity analysis and numerical methods can be found in [16,19,25,26] and the references therein.
Here, we proceed differently by first discretizing the constraints and objective function and then optimize the finite-dimensional optimization problem. This corresponds to the strategy first discretize-then optimize. In the case of production networks, a similar approach can be found in [21].
First, in Section 3.1, we consider the discrete optimality system. As well see, the discretization can be chosen such that the optimization problem is in fact a mixed-integer programming problem, see Section 3.2.
For this type of problem, we prefer the discrete optimization approach due to the almost linear nature of the problem. Similar to [14], we explain how the adjoint variables and dual variables are related. where C n i are positive weights and u n = F 0 n the controls. In the following sections, we formally derive discrete adjoint equations and a mixed-integer program as well to solve (9). We also focus on the interpretation of both approaches.
3.1. Optimality System. We formally derive the first order optimality system of the discrete problem. Therefore, we transform the pde-restricted problem (9) into an unrestricted one. We denote by φ n i the Lagrange multiplier for the discretized partial differential equation and ψ for the inflow condition. Then, the discrete Lagrangian function reads: We formally deduce the first order optimality system from (10) by assuming sufficient constraint qualifications.

SIMONE GÖTTLICH AND PATRICK SCHINDLER
Obviously, we need the derivatives of the numerical flux F n j in (10). Therefore, it is necessary to smooth the min-function in (FLUX) by a smooth approximation min ≈ min. We choose with α = aρ n i−1 and β = ρmax−ρ n i λ + F n i . Then, the derivatives of the numerical flux can be represented as for i = 1, . . . , N where δ i,j denotes the Kronecker delta, i.e. δ i,j = 1, i = j or δ i,j = 0, i = j. Note that the flux F n j does not depend on the density ρ n i for j > i. The derivatives are then The outflow F n N only depends on the density values ρ n N . Hence, • Considering the gradient equation, we end up with: An optimal solution of the first order optimality system can be found by projected gradient methods where the solution of the gradient equation (14) is computed by a serial realization of the state and adjoint equations. The idea is to start with a feasible solution of (9) and seek a control u n (k) = F n 0 that minimizes L( ρ i n , u n , φ n i , ψ) iteratively for each level k. To ensure that Instead of considering the second-order optimality system to check that there really exists a local minimum, we follow another way. The idea is to consider an alternative optimization approach which can be solved to global optimality. Having such a tool at hand, we show the connection between the proposed optimization models.

Mixed Integer Programming Model.
Mixed-integer programming (MIP) models can be used to solve a special class of pde-constrained optimization problems.
Since the optimal inflow problem (9) has a nearly linear structure, it is closely related to the problems mentioned in [6,9,14]. Usually a MIP model consists of a linear cost functional combined with linear constraints. The only nonlinearity appearing in (9) is the flux function. Here, we apply a standard linearization (see [9]) by introducing binary variables ξ n i ∈ {0, 1}. (FLUX1): MIP problems are solved using common software packages, e.g. CPLEX [4]. Note that increasing the number of binary variables, the computation time of the MIP may blow up. One possibility to reduce the computational effort provides the following lemma. We introduce an extension of the MIP model by including new constraints such that the original problem is solved faster.
Proof. Consider the binary variable ξ n i+1 = 0. Then, the inequalities (FLUX1) -(FLUX3) yield F n i = aρ n i and the numerical flux will be We assume that 0 ≤ ρ n i ≤ ρ max for all indices i, n. It holds that Combining the results (17) and (18), the flux F n i−1 simplifies to F n i−1 = aρ n i−1 . The inequalities (FLUX1) -(FLUX3) lead to ξ n i = 0. The choice of ξ n i+1 = 1 as a starting point works analogously.
The interpretation of Lemma 3.1 can be also done looking at the discussion of the Riemann problems in Section 2. The only combination of binaries that is not allowed is ξ n i = 1 and ξ n i+1 = 0. This corresponds to a sequence of congestion followed by a free flow regime, i.e. this is impossible.

Comparison of Optimization
Approaches. So far, we have presented two solution approaches for (9) which may at first sight seem different. In this section, we connect both optimization approaches by comparing the first order optimality system to the MIP approach, both based on the discretization proposed in Section 2.2. This means in detail, we formally compare the adjoint variables (10) with the dual variables of the relaxed MIP (15), see also [14]. To do so, we need to transform the relaxed MIP into its dual problem and explain in a second step the relevant similarities to (10). These computations show that for our discretization the finite dimensional optimization problem resulting from the first discretize-then optimize approach can be in fact related to a mixed integer model. Moreover, a numerical comparison is added in Section 4.
The relaxed MIP. Let the binary variables are treated as real-valued, i.e. ξ n i ∈ [0, 1]. Then, the relaxed MIP with ρ n i , F n i ∈ R reads for all indices n, i: (SUPPLY) constraints: Ψ : where the corresponding dual variables are introduced in an extra column. This leads to the dual program of the form: with Φ n i , Ψ ∈ R, Ψ n , ϕ n k,i ∈ R + , k = 1, 2 and where again the dual variables are denoted in a separate column. We set ϕ n 2,0 := 0 as well.
Mixed integer model and discrete adjoint calculus. The aim is now to compare the dual variables of the dual MIP (20) with the adjoint variables (10). We will show how these variables and optimization approaches respectively are related to each other. Let us assume that ρ n i , F n i are a feasible solution of the primal problem (19). We are interested in finding a feasible (not necessarily optimal) solution of the dual relaxed problem (20). By applying the complementary slackness theorem it is possible to obtain an (optimal) solution to the dual when only an (optimal) solution to the primal is known. In other words: If a MIP solution ρ n i , F n i is optimal for the primal problem (19), then the dual slack variables ϕ n 1,i , ϕ n 2,i fulfill the complementary slackness conditions (21). In case of no optimality, we have no restriction for the dual states ϕ n 1,i , ϕ n 2,i with respect to the primal state, i.e. the dual variables can be chosen freely.
Theorem 3.2. Let the inflow control problem be given by (9). Then, the variables of the adjoint equations and the dual variables of the mixed integer program correlate.
Proof. The main idea of the proof is to to pick those dual variables ϕ n 1,i , ϕ n 2,i such that the complementary slackness condition is satisfied: for k = i − 1, i. In fact, we have to analyze three different situations: freeflow, blocking and release of congestions.
Freeflow: Let F n k = aρ n k and F n k < ρmax−ρ n i+1 λ + F n k+1 for k = i − 1, i. Thus, due to (21), the dual variables must be ϕ n 1,k ≥ 0, ϕ n 2,k = 0, k = i − 1, i, and the constraints of the dual problem turn into Rearranging the last two equations, we end up with The equation for Φ n−1 i is similar to φ n−1 i in (10) since ∂ ρ n i F n j = a.
This result can be plugged into the constraint Again, we compare the dual variables Φ n−1 i with the adjoints φ n−1 i in (10), but with the crucial difference that the partial derivatives of ∂ ρ n i F n j are more involved, cf. (12) and (13). If α ≤ β, the derivative of the smoothed minimum function min is ∂ 1 min (α, β) = 1, ∂ 2 min (α, β) = 0 or otherwise, for α > β, Using the Taylor expansion to simplify the expressions ∂ 1 min (α, β), ∂ 2 min (α, β) for α > β yields Then, the partial derivatives of the numerical flux ∂ ρ n i F n j can be expressed as for i = j and for i * ≤ j < i. If j < i * , the derivatives are ∂ ρ n i * F n j = 0. A computation shows that the dual variables Φ n−1 i in (22) and the adjoint variables φ n−1 i from (10) only differ O( 2 ): Release: Let F n k = aρ n k and F n k = ρmax−ρ n i+1 λ + F n k+1 for k = i − 1, i. Then, the complementary slackness conditions (21) does not contain any information about the dual variables, i.e. ϕ n 1,k ≥ 0, ϕ n 2,k ≥ 0, k = i − 1, i. Note that the choice of ϕ n 2,k is not unique. Looking at the objective function of (20), we see that the maximization problem forces small values of ϕ n 2,k . In the ideal case, ϕ n 2,k = 0 and the release case immediately reduces to freeflow.
To highlight the numerical impact of these results, we solve the optimization problem 9 with the adjoint approach presented in Section 3.1 as well as the mixed integer program (MIP) presented in Section 3.2 applying CPLEX [4]. Comparisons are given in Section 4.2.
4. Numerical results. In this last section, we perform two types of numerical experiments. We point out the benefits of the proposed discontinuous flux Godunov (DFG) scheme since it is the main ingredient for our control purposes. Having a reliable forward solver at hand, we present optimization results concerning the choice of the optimal inflow. We compare the adjoint approach with the mixedinteger problem and address in particular the quality of results. 4.1. Wave Front Tracking Algorithm vs. DFG method. At first, we test the DFG method against the classical front tracking algorithm, see [3,5,20]. The latter is a grid-independent method, i.e. there is no CFL condition, that tracks all propagating waves and their interactions, see Section 2.
We consider the initial boundary value problem for solving (1) with (3): To compare the approximate solutions of both methods we choose a time horizon of T = 1 as well as a = 1 and ρ max = 1. In the time interval 0.5 ≤ t < 0.8, the outflow f out (t) is as large as possible (freeflow regime). Otherwise the outflow is blocked. otherwise.
For the wave front tracking algorithm, we divide the spatial domain into N = 20 and N = 100 cells. This yields a space step size of ∆x = 0.05 or ∆x = 0.01, respectively.

SIMONE GÖTTLICH AND PATRICK SCHINDLER
The initial data for the Riemann problems is given by the cell integrals over ρ 0 (x), i.e.
The inflow can be translated into a Riemann problem of the form ρ l = ρ 0 = 0.2 and ρ r = ρ 1 . The Riemann problems for N = 20 are shown in Figure 6.  Figure 6. Initial data for the Riemann problems (black), continuous initial data ρ 0 (x) (red).
In Figure 7, we observe for different discretizations that the density is transported in a forward direction with velocity a. Since the outflow is zero for 0 ≤ t < 0.5, a congestion occurs resulting in a backward traveling shock wave with maximal density ρ max . The shape of the latter shock wave is according to the nonlinear initial condition ρ 0 (x). For the time 0.5 ≤ t < 0.8, the congestion is released, i.e. the outflow is not zero anymore. All density ρ moves with velocity a in a positive direction. After time t ≥ 0.8, the outflow is blocked again and a new jam arises.
Furthermore, in Table 1, the computation times of the wave front tracking and the DFG method for different step sizes ∆x are compared.

4.2.
Optimal Inflow. Our main objective, however, is to find the optimal inflow F n 0 such that congestions are avoided. To do so, we try to keep the buffers as small as possible, i.e. we minimize the sum over the density i,n ρ n i , cf. Section 3. We investigate the following scenario. Let us assume a total supply of S = 4, a production velocity of a = 1 and a final time T = 15. The production system is mapped onto the unit interval [0, 1] and the last machine is stopped for maintenance in time t ∈ [4, 8) ∪ [11,15).
We restrict the inflow to the upper bound W = 0.5. For the space-time grid we take N = 10 cells and N T = 150 time points, i.e. ∆t = ∆x = 0.1. The adjoint approach is implemented with a smoothness parameter = 10 −2 . The termination criterion is for a tolerance of T OL = 10 −6 given by where d n (k) is the steepest descent direction in the k-th iteration. We have checked that the gradients computed by numerical differentiation and the adjoint approach coincide up to an error less than T OL.
All optimization results are plotted in Figure 8 [4] behave in the same way. Only small differences can be identified, i.e. there is a delay of 10 discrete time steps in the second interval of maintenance. This is a numerical artifact resulting from the different computational treatment of the min-function in equation (FLUX), compare (12) with (FLUX1)-(FLUX4).
Note that the duality gap of the MIP, i.e. the difference between the primal (19) and dual solution (20), is zero. Consequently, from the theory of strong duality, we know that the computed solution is globally optimal. The corresponding optimal densities are shown in Figure 9. Since the optimal inflow tries to prevent congestions ( Figure 8) by reducing the inflow to zero before the maintenance intervals are scheduled we naturally observe a similar behavior in the evolution of the density. Looking at Figure 10, we see that there is no outflow during the maintenance intervals. Hence, the whole system is blocked. Due to the relation F n N ≤ f out (t), the outflow is adjusted as soon as the system is released again. As already shown, the density variables of the adjoint approach differ slightly from the density variables of the MIP model due to the smooth approximation of the min-function for the adjoint approach. This effect is particularly apparent in the the optimal cost functional values, i.e. J * (ρ n i ) = 393.21 for the adjoint approach and J * (ρ n i ) = 396 for the MIP model. Computation times. We repeat the previous example with a different number of discrete cell points N and measure the computation times. The goal is to compare the efficiency of the different optimization approaches from a practical point of view.  Table 3. Number of variables and constraints of the MIP model. Table 2 and Table 3 stress the usability of the adjoint approach. Finer resolutions lead to a moderate increase of the computation times. In contrast, the MIP model without any acceleration takes approximately one hour to solve the N = 20 instance. By additionally including constraint (16), the solution of the MIP model can be speed up significantly. There is a certain threshold where the adjoint approach dominates the improved MIP. Furthermore, we observe that for N = 16 to N = 20 the solution time for the improved MIP is multiplied by a factor of 40 while the number of variables and constraints is doubled. This is can be explained by the increasing complexity of the problem resulting in larger branch and bound trees for the CPLEX solver.