Convex output feedback model predictive control for mitigation of COVID-19 pandemic

In this paper, a model predictive control approach is proposed for epidemic mitigation. The disease spreading dynamics is described by an 8-compartment smooth nonlinear model of the COVID-19 pandemic in Hungary known from the literature, where the manipulable control input is the stringency of the introduced non-pharmaceutical measures. It is assumed that only the number of hospitalized people is measured on-line, and the other state variables are computed using a state observer which is based on the dynamic inversion of a linear sub-system of the model. The objective function contains a measure of the direct harmful consequences of the restrictions, and the constraints refer to input bounds and to the capacity of the healthcare system. By exploiting the special properties of the model, the nonlinear optimization problem required by the control design is reformulated to convex tasks, allowing a computationally efficient solution. Two approaches are proposed: the first finds a suboptimal solution by geometric programming, while the second one further simplifies the problem and transforms it to a linear programming task. Simulations show that both suboptimal solutions fulfill the design specifications even in the presence of parameter uncertainties.


Introduction
The COVID-19 pandemic is one of the biggest challenges the world is currently facing with. Until the vaccine and effective treatment are widely available, carefully planned measures have to be introduced in every country to avoid the serious effects of the spreading disease. Choosing a right management policy is a sensitive task where several potentially contradicting objectives have to be taken into consideration. The most important limiting constraint is the capacity of the healthcare system, which can be easily overwhelmed if the spread of the disease is not controlled. It is clear that the transmission of the virus can be efficiently slowed down by appropriate restrictions (social distancing, lockdown), but these measures have negative impacts on the society and the economy that cannot be neglected. At the moment, governments are continuously evaluating the control measures, trying to find balance between public health concerns and the costs of social distancing measures.
principle might be dangerous in handling the pandemic. In Morato, Bastos, Cajueiro, and Normey-Rico (2020) a binary input representing full or no isolation is computed using MPC and a nonlinear epidemic model to reduce the peak of infected individuals while minimizing the economic loss due to the restrictions. This solution is refined and made more realistic in Morato, Pataro, da Costa, and Normey-Rico (2020), where weekly piecewise constant input with actuation dynamics are assumed. In Péni, Csutak, Szederkényi, and Röst (2020), a nonlinear MPC-based solution was proposed for epidemic management, where the main contribution is the application of temporal logic for the incorporation of interdependent and possibly time-varying constraints into the design.
As the plenty of successful solutions show, nonlinear MPC is a suitable tool to transform the various control goals and objectives related to the epidemic management into a transparent optimization problem. However, the numerical solution of this problem requires significant computational power even in the case of a simple 4-8 state compartment model. This makes it difficult to extend the algorithms for large dimensional (e.g. multi-scale or multi-region) epidemic models, to compute control sequence for long horizons and to apply advanced robust control design algorithms (e.g. tube or scenario-based methods). Therefore, an important aim of this work is to reformulate the MPC design in such a framework, where the numerical computations can be performed more efficiently. From this respect, a recent related work is Hayhoe, Barreras, and Preciado (2020), where a data-driven model is proposed to describe the spread of the COVID-19 pandemic, and an optimal controller is designed in a convex framework using geometric programming to minimize the final number of deaths while taking into consideration cost and hospital capacity constraints. Moreover, a detailed and realistic actuation model identified from mobility data is also presented in Hayhoe et al. (2020).
A well-known characteristic of predictive control is that it generates a full state feedback policy, which can be applied only if all of the states are measured or estimated. In the work of Péni et al. (2020) the possibility of output feedback control was demonstrated by designing a state observer for the epidemic model. In this paper an improved state estimator is proposed, which by exploiting the specific structure of the compartmental model can be constructed in an elegant way and provides reliable state estimates even in the presence of modeling uncertainty. The observer and the convexified nonlinear model predictive controller give an output feedback structure that is proposed as a possible solution to the control problem related to the mitigation of the pandemic.

Compartmental model
The transmission dynamics of the epidemic is usually described by compartmental models where each state variable represents one group of the population. The disjoint groups collect individuals of the same infectious status. Depending on the modeling assumptions and goals, different compartmental models can be constructed Ansumali, Kaushal, Kumar, Prakash, and Vidyasagar (2020), Carli, Cavone, Epicoco, Scarabaggio, and Dotoli (2020). In this paper we use the 8 state model introduced in Péni et al. (2020). This specific model structure has been developed to capture the dynamic characteristics of the epidemic that are relevant to our control design objectives. In the model construction it has been important that all parameters have clear physical interpretation and they can be determined with sufficient precision by using the available medical literature and epidemiological data.
In our model the total population is divided into 8 classes: denotes the susceptibles, i.e. those who can be infected by the disease, (latent) collects individuals, who have already contracted the disease but do not show symptoms and are not infectious yet; is the pre-symptomatic compartment for those who are infected, but they need a few days to develop any symptoms. After the incubation period elapses the infected individuals are transferred to the asymptomatic ( ) and symptomatic Average length of hospitalization 10 (days) Probability of fatal outcome, given hospitalization 0.145 ( ) compartments. Those in will always recover, while the more severe cases in may require hospitalization and move to compartment . Patients in may eventually recover ( ) or die ( ). The transition diagram is depicted in Fig. 1, further details on the model can be found in Péni et al. (2020).
The differential equations of the compartmental model are written aṡ The state variables represent the number of individuals in the compartments and are assumed to be normalized to 1, i.e. ∑ ∈{ , , , , , , , } = 1, where 1 represents the total population. By construction, ∑ ∈{ , , , , , , , }̇= 0 so the size of the population remains constant. The parameters of the model are summarized in Table 1. The nominal values are based on the estimates calculated from the epidemiological data collected in Hungary and also worldwide. For more details, see Péni et al. (2020), Röst et al. (2020).
Until vaccine or appropriate treatment is available, the spread of the virus can only be controlled by limiting the contacts and reducing the infection probability between individuals. The effect of different interventions (from mandatory mask wearing to total lockdown) can be quantified by different scalings of the transmission rate , therefore it is straightforward to choose the scaling factor as control input. Assuming that the number of hospitalized and deceased individuals can be reliably documented, state variables and are chosen for measured state variables. The official number of daily new infections and active cases cannot be used directly, since it is known that only a fraction of the actual cases are detected Phipps, Grafton, and Kompas (2020).
The observability of the model was briefly analyzed in an LPV framework in Péni et al. (2020) where it was found that the model is observable from , although the observability matrices are numerically badly conditioned. Following the methodology of Massonis, Banga, and Villaverde (2020) and using the STRIKE-GOLDD software tool Villaverde, Barreiro, and Papachristodoulou (2016), the identifiability of model parameters was also checked with the assumption that and are observed. Computation results showed that the model itself is not structurally identifiable, since the parameters and cannot be uniquely determined by measuring only and . This is an important fact from the point of view of model analysis and parameter estimation, but does not affect the solvability of the control task.
Discretization. Adapting to the predictive control framework, the epidemic model will be used later in discrete time. To discretize the T. Péni and G. Szederkényi  dynamics, we make the following observation: (1) can be considered as a linear time invariant (LTI) system driven by a nonlinear input function: and can be determined from (1). Vector comprises the states. A discrete time model can be obtained by applying standard zero order hold (ZOH) method to the LTI part and evaluating the input function only at the discrete time steps. The discretized model can be given formally as follows: We will use two different sampling times in our study. To obtain acceptable predictions for the considered long horizon, the predictive controller (Section 4.) uses = 0.5 day for updating the states, while = 1 day is used for the observer design (Section 3). We further assume that the update of the input can be performed weekly.

State reconstruction
The goal of the state observer is to reconstruct the states of model (1) from the available output measurements. The proposed observer is designed in 4 steps, these steps are presented in detail in this section. The observer is designed in discrete time and as we have mentioned above, the sampling time is = 1 day. This means that the output measurements can be updated only on a daily basis, which is feasible in practice. It is important to emphasize that the observer is designed for the nominal model, possible parameter uncertainites are not considered explicitly in the design. On the other hand, in Section 5, the observer is tested on uncertain models to analyze its properties on realistic scenarios.

Inversion based estimator for
We start from the LTI model defined by Eqs. (1c)-(1f), where state can be considered as an unmeasured external input and is the measured output. Let this dynamical system be converted to discrete time domain and let the resulting transfer function be denoted by ( ). One possible approach to reconstruct is to invert this dynamics and drive the inverse with input . If the model is accurate enough and a causal, stable inverse exists, then a reliable estimateĉ an be obtained. Note also that the other 3 state variables are not needed for inversion because the system is linear, so the inverse can be started from any (e.g. zero) initial state.
By examining ( ) it turns out that the system is strictly proper and contains unstable zeros. Therefore, the construction of the inverse requires the numerical differentiation of the output and the inverse obtained will not be stable. Fortunately, the first problem is easy to does not affect the → transfer. This state can therefore be trivially removed from the model. The third singular value is significantly smaller than the other two, so we expect that the removal of the corresponding state does not generate significant error. Doing so, two states are removed from the model. By performing a balanced reduction (function 'balred' in MATLAB) we obtain the following 2-state, proper transfer function: where 1 = 0.92787, 2 = 0.94446 and 1,2 = 1.27205 ± 0.37353 . (The numerical values of the poles and zeros have been obtained by using the nominal model parameters in Table 1.) Note that ( ) can be easily inverted, without numerical differentiation.
The second problem (instability of the inverse) still holds, because ( ) has a complex, unstable zero pair. To make the inverse stable, the zero is transformed back to continuous time bỹ1 ,2 = log( 1,2 )∕ , then reflected across the imaginary axis and finally it is transformed back to discrete time bȳ1 ,2 = exp(̃1 ,2 ). Then , ( ) = tf( 1 , 2 ,̄1,̄2) is considered as the stable, approximate inverse of ( ). Due to the reduction and the transformation of the zeros, , ( ) does not properly cancel ( ). The product of them generates a phase shift corresponding to = 3.5 days time delay. This can be seen on the bode diagram in Fig. 2, where ( ) , ( ) is compared with the 2nd order Pade approximation of the = 3.5 days time delay. It can also be seen that the magnitude of ( ) , ( ) significantly differs from 1 at higher frequencies. Since the epidemic model is controlled by piecewise constant input that changes relatively rarely compared to the sampling time, this approximate inverse will be suitable for computing an estimate for .

Reconstruction of , ,
By using , ( ), we can obtain at time = ⋅ , the delayed estimatê( − ). Using this input, together with the measured, delayed output ( − ) a state observer can be designed for the statespace counterpart of ( ). In this paper a standard Kalman filter is proposed for this purpose. The Kalman filter also has the practical advantage of being able to handle state and measurement noise, which are typically present in a real system. By using the time shifted inputs, the Kalman filter can produce estimate for the delayed states, that is, we obtain̂( − ),̂( − ) and̂( − ).

Estimation of and computing the actual state values
In this step the first 5 equations of the epidemic model are nu- and is regularly updated as we describe later. At the beginning it is initialized to 1 (assuming that the state estimation starts at the beginning of the epidemic). The integration of the model requires also the past control input sequence that has been applied on time interval [ − , ]. This control sequence is assumed to have been previously stored and is available.
The integration of the model is performed in two steps. First the state trajectories are computed on interval [ − , − + ] to obtain ( − + ), which is used to updatê−. Continuing the integration on time interval [ − + , ] the actual values of the state estimates, i.e.̂( ),̂( )̂( ),̂( ) and̂( ) can be obtained. The compartment of recovered patients will not be used later in the control design, but if it is necessary to estimate , it can be easily done by using the estimated states and measured output : as the sum of the states is 1, we havê( ) = 1 − ∑ ∈{ , , , , }̂( ) − ( ) − ( ). The complete algorithm of state reconstruction is summarized in Algorithm 1. We remark that another possible approach is to design an unknown input observer to estimate , , and using the theory in Bhattacharyya (1978). However, for this the first and some higher derivatives of are also needed as auxiliary outputs to fulfill the computability conditions.

Control problem formulation
The epidemic model has one manipulable input: the scalar factor that scales the transmission rate (see, Eq. (1)). This variable represents the effect of the measures that are applied on the society in order to control the contact rate between individuals. Regarding the control goals, several scenarios can be reasonable. We are focusing on the mitigation of the epidemic, where the goal is to avoid the overwhelming of the healthcare system in a finite time window by applying less stringent interventions whenever possible. Strict measures have serious adversarial effects on society and economy so avoiding them is desirable Mandel and Veetil (2020), Saladino, Algeri, and Auriemma (2020). The time window is typically chosen to be long enough to make acceptable long-term prediction for the future behavior of the epidemic. It is also an option to align it to the estimated date when a suitable vaccine or treatment becomes available.
Following a similar concept as in our earlier paper Péni et al. (2020), we formulate a nonlinear model predictive controller for the mitigation problem above. The control input is computed periodically based on the most recent measurements and the predicted future behavior of the model. Since the control window is fixed (not shifted in each time step), a shrinking horizon MPC is designed. The controller is implemented in discrete domain by using = 0.5 sampling time. At a time instant = ⋅ the control input is computed by solving the following optimization problem: where ( ) = ( | + ) , ( ) = ( | + ) are the predicted state and control input at the + th time step and (0) = ( | ) = ( ) = ( ) is the state at the current time instant. Weights (with 0 ≤ ) are used to differently penalize control actions at different time instants. They are a-priori fixed tuning parameters.
Strict interventions are penalized by maximizing the cost function (6a). To protect the functionality of the healthcare system a strict upper bound is introduced for the number of hospitalized patients (6e). Further constraints are added to keep the control input in a valid range (6h). The maximal value of is 1 (no intervention), the minimal value is chosen to correspond to the strictest measure, e.g. the total lockdown.
Since the control input determines rules and restrictions that have to be performed by the society, it takes time to have impact on the dynamics. Therefore it is no use changing the control input at each sampling instant. We therefore assume that is updated only at every th time step (6g). Assume the control window is [0, ], where is a priori fixed. The prediction horizon is ( ) + , where ( ) = − is shrinking as moves towards . defines a short time interval after the control window, where the control input is not updated, but the constraints have to be satisfied. This prevents the optimization to simply improve the cost function by generating less effective (in our case higher) control actions at the end of the horizon. Turning off the controller in the last few steps leads to high increase in the constrained state variable, which can therefore easily exceed the limit right after leaving the control window. This safety constraint can be used together with a small weight on the last control action (e.g. ( ) < min ≠ ( ) ) allowing strict intervention at the end of the control window. This procedure implements the following concept: the system is left at time in a state, where there exist (at least a strict) intervention that is able to keep the states below the prescribed limits at least for ⋅ time period after the end of the control window.
The last parameter of the procedure is ( ) = ⌊ ( )∕ ⌋ + 1, which denotes the number of free control moves to be designed. Remark 1. Since compartments of recovered ( ) and deceased ( ) patients are not used in the control synthesis process, the corresponding state variables and can be removed from the control model.

Suboptimal solution by geometric programming
The optimization problem (6) is nonlinear and nonconvex. In addition, the control horizon is long, considering that the control window is at least about six months long, but might be even longer. These factors make the control design computationally challenging. Our goal is therefore to recast the original problem to a convex optimization task. For this, the geometric programming (GP) framework Boyd, Kim, Vandenberghe, and Hassibi (2007), Boyd and Vandenberghe (2009) will be used. Geometric programming is a powerful tool to solve nonlinear and nonconvex optimization problems if the decision variables are positive and the cost and the constraints are polynomial functions with positive coefficients. More precisely, a geometric program in standard form can be given as follows: 2 … , , > 0 and , ∈ R. The main advantage of GP is that (7) can be systematically transformed to a nonlinear convex optimization problem by taking the logarithm of the variables. It is clear that this transformation converts (7c) to linear constraints, but it can also be shown that the logarithm of the posynomials are also convex in the transformed variables. The standard GP can be extended by allowing further functions of posynomials, e.g. max, positive (fractional) power, etc. in the construction of the optimization problem. It can be shown that these operators preserve the convexity. The convex counterpart of a standard (or extended) GP can be constructed manually or by using a suitable software tool, e.g. the CVX toolbox Grant and Boyd (2020). More details on GP and its applications can be found in the paper and book Boyd et al. (2007), Boyd and Vandenberghe (2009).
In the case of epidemic control, the GP method is motivated by the following properties of (6): • the model is polynomial in the state and input • all state variables are inherently positive (nonnegative), • apart from the dynamics of , the coefficients of every term in the discrete time model are nonnegative.
To eliminate the problem caused by the dynamics of , we can make the following observations: first, continuously decreasing; second, a successful mitigation control keeps the number of infections low, so only a limited portion of the society is expected to get infected by the end of the control window. Therefore, ( ) in (6) can be replaced by its constant upper bound̄= (0) . By using this upper bound, stricter control actions are generated that may degrade the performance (results in larger control cost). However, the simulations will show that this is an acceptable price to get a significantly simpler optimization task.
Note also that the original cost function cannot be transformed in the geometric programming framework. Therefore, it is replaced by ∑ (1∕ ). Of course, this new cost is not equivalent to the original one, but expresses the same intention: minimizes the strictness of the interventions. The modified optimization problem can be given as follows: wherẽ,̃and̃are obtained from , and by removing the first state variable . Variable is introduced to make constraint (8e) soft. A common solution of adding to the right hand side of ( ) ≤d oes not work here, because this formulation cannot be transformed to a geometric programming constraint. Therefore is used as a scaling of and ≥ 1 is prescribed. The value of is minimized by completing the cost with an additive term penalizing the constraint violation. By using the soft constraint, the numerical infeasibility caused by ( ) > can be avoided. After the optimization is completed, the value of gives information on the magnitude of the constraint violation that has been occurred during the synthesis.
This problem can be systematically transformed to a convex geometric program, which can be solved efficiently by numerical optimization. To perform the simulations presented in this paper we use CVX optimization package Grant and Boyd (2020) to formulate the geometric program. CVX also calls an external solver, which in our case is MOSEK MOSEK ApS (2020), to solve the convex optimization problem. (8) to decrease the gap between the convex and the original optimization problems. Since the upper bound̄is constant over the entire horizon, the simplified and the original model have similar behavior, if the trajectory of does not significantly differ from̄. To keep −̄under control, the dynamics of ∶= 1− can be added to the model. Since the dynamics of in discrete time is polynomial with positive coefficients, thus the optimization problem can still be converted to a geometric program. By prescribing upper bound for , we can limit the changes of . The drawback of this modification is that constraining narrows the set of feasible control policies. To overcome this limitation an iterative design can be applied. After each iteration, the trajectory of (computed from ) is stored and used in the next iteration in the place of the constant̄. (The simulations for this paper (Section 5) have been performed without this fine tuning step, so the control input has been obtained in one step by solving the geometric optimization problem (8) only once.

Remark 2. Further ingredients can be added to
Remark 3. Though in most cases the convex program above produces an acceptable result, it is possible to further improve the control policy by nonlinear optimization. For this, the control sequence generated by (8) can be used as a warm start for the nonlinear optimization (6).

Suboptimal solution by linear programming
We have seen that geometric programming provides an efficient way to simplify the optimization task required by the predictive control design. Now we go further. In this section we show that by applying additional relaxations it is possible to convert the original problem to a linear program (LP). LP is the simplest convex optimization task, it can be solved very efficiently even for thousands of free variables Gass (2003).
To generate a linear program from (6)  This reformulation greatly simplifies the optimization, but generates also some new problems. One of the problems is that, we can no longer prescribe piecewise constant input by simple linear constraints. This requirement should therefore be dropped from the optimization procedure and has to be handled by post processing the result. If the control input is allowed to change only at every ⋅ time step, the simplest way is to take the mean of the ( ) values in the same ⋅ window and to consider the result as the constant input over this time interval. This 'approximation' gives acceptable result only if does not change significantly inside the ⋅ intervals. Unfortunately this does not hold in our case: the LP tends to generate highly oscillating input sequence at certain sections of the control window. The piecewise constant approximation of this signal results in significant performance loss and even infeasibility of the optimization in the later time steps. Since the epidemic dynamics is relatively slow, we have found that not the term ( ) + ( ) + ( ) is responsible for the oscillation. Consequently, if is smoothed out, so can be expected from . We therefore applied the following strategy: is approximated by a 4th order B-spline The MathWorks (2020) and the optimization is reformulated to seek the coefficients of the spline instead of the ( ) values directly. To avoid large changes inside the ⋅ length sections, the knot points are apriori fixed at the boundaries of these intervals. This prevents from oscillating between the successive knots. As the spline bases (the knot points) are a-priori fixed, the optimization remains linear in the parameters. Simulation results will show that this strategy works in practice: the spline formulation smoothes out the solution of the optimization and results in input sequences that can be suitably approximated by a piecewise constant function thereafter.
The second problem with LP relaxation is that the original cost is nonlinear in the optimization variable . Therefore it is replaced by ( ) . This cost is of course, not equivalent to the original one, but again, it expresses similar intention: we seek the less stringent interventions that do not result in violation of state constraints. can be maximized by maximizing ( ) -s (which is the original task) and the terms ( ) + ( ) + ( ) . Note that, these two requirements are not contradictory, because applying less stringent interventions induces the increase of the number of infected individuals, i.e. the values of , and get larger. The increase is controlled by the upper bound prescribed for .
To guarantee numerical feasibility the constraint ≤̄is again transformed to a soft constraint ≤̄+ , where ≥ 0 is an additional free variable. Its value is minimized by adding − to the linear cost above.
The control design by LP optimization can be summarized as follows: where (⋅) denotes the row vector containing the values of B-spline basis functions evaluated at and is the vector of the coefficients. The spline is defined over the [0, ( ( ) − 1) ] interval.

Output feedback control
In this section an output feedback controller is formed by using the state observer of Section 3 and the two state feedback MPC controllers of Section 4. The properties of the control structure are analyzed via numerical simulations. In these simulations, the sensitivity of the controller to the uncertainty of the model parameters is also examined.

Analysis of the observer
To analyze the proposed observer structure, numerical simulations have been performed. The observer is tested in open loop without controller. The control input is chosen to be constant 1, which corresponds to applying no intervention. The initial state is selected to represent the beginning of the epidemic, that is the number of infected persons is very low, there are no hospitalized patients yet and nobody has been recovered or died. Numerically 0 = [ 0, ; 200; 50; 2; 1; 0; 0]∕ , where 0, = − 200 − 50 − 2 − 1 and is the total population, which is = 9800000 in Hungary. The Kalman filter required to reconstruct , , (Section 3.2) has been designed by assuming only a Gaussian output noise (i.e. a noise on ) with covariance = 0.1. In the simulations we assumed that the model parameters are not known exactly: the observer is designed for a nominal model and then applied to different ''real'' models generated by perturbing the nominal parameters. The parameters have been perturbed according to the following uncertainty bounds: These bounds are based on the latest uncertainty estimates calculated by using the epidemiological data collected worldwide. More information on the identification and uncertainty bounds of the model parameters can be found e.g. in Fernández-Villaverde and Jones (2020), Korolev (2021).
Taking the uncertainty intervals above, 100 different parameter sets (i.e. 100 different models) have been generated randomly by using uniform distribution over the uncertainty intervals. The relative absolute estimation error (i.e. the error divided by the maximal value of the state) are depicted for all scenarios in Fig. 3. It is not surprising that in the uncontrolled case the epidemic reaches a peak where the number of infected individuals are extremely high. We have found that the estimation error is maximal in the neighborhood of the extremal values of the states. Since the goal of the control is to avoid (delay) the epidemic peak, therefore smaller estimation errors are expected in closed loop. This will be justified in the next section.
By analyzing Fig. 3, a small estimation error can be seen even in the nominal case. This can be explained by the effects of model reduction and the numerical integrations needed to compensate the time delay. Considering the uncertain cases, the relative error is the highest in the estimation of , while at the other states it is less than 20%. This is not surprising as the model is nonlinear, several parameters are allowed to deviate from the nominal value and the observer is designed without explicitly taking the uncertainty into consideration. Nevertheless the observer well tracks the trend of the states and the accuracy it provides is sufficient to use the estimated states for feedback control design. Fig. 3. Analysis of state reconstruction error by testing the observer on 100 randomly generated epidemic models. The figures show for each state the relative absolute estimation error, i.e. absolute error divided be the maximal value of the state. The nominal case is highlighted by black.

Closed loop control by geometric programming
In this section the MPC controller introduced in Section 4 is applied to our particular epidemic model. The controller uses the state estimation provided by the observer (Section 3), they thus form an output feedback structure. The parameters of the controller have been chosen as follows. The control window is 190 days, so = 190∕ . is set to 3 weeks, so = 21∕ . The upper bound for the hospitalized patients (state ) is chosen tō= 0.001. This bound is based on the available capacity of the healthcare system in Hungary. The system is assumed to start from the initial state (10), but the controller (and the observer) are activated only after 2 weeks. We assume that these two weeks are needed to detect the outbreak of an epidemic. The lower bound of the control input is set to = 0.2, which corresponds to the strictest measure, the total lockdown of the cities. This value was determined using Wang et al. (2020) studying the situation in Wuhan in 2019-2020.
In the simulations we have used 6 models: beside the nominal one, 5 others with different parameter sets have been chosen to represent uncertain scenarios. Actually, these 5 models have been obtained by selecting the 5 worst case scenarios from the 100 random examples used in the observer analysis in Section 5.1. The selection was based on the maximal estimation error. The controllers have been tested on the continuous time model, i.e. in each sampling instant the actual control input is applied on the continuous model.
We have performed two simulation scenarios. In the first case (simulation scenario I.) all of the weights in the cost function are set to 1, i.e. no weighting is applied. In the second case (simulation scenario II.), we have chosen the weights to reflect the tolerance of the society for the interventions. In the first four weeks the weight is 1, which means that strict measures are tolerated. In the next four weeks the control input weights are 4 to force less stringent interventions. From the 9th week the weight is 12, that means strict interventions are definitely undesired. In compact form, the initial weight sequence at = 0 is 0,…,3 = 1, 4,…,15 = 4, 16,…, (0)−2 = 12, (0)−1 = 1. As time index gets closer to , ( ) decreases (shrinking horizon), the actual weights are obtained by taking the last ( ) elements of the initial weight sequence above. The small weight of the last control action is part of the terminal constraints (see Section 4 for details). The weight penalizing the soft constraint violation has been set to 10. Figs. 4,5,6 and 7,8,9 present the state trajectories and control inputs obtained in the two simulation scenarios. The nominal case is distinguished from the uncertain ones. It can be seen that in the nominal case, the safety ingredients guarantee the constraint satisfaction over the entire horizon, as well as beyond the control window. On the other hand if the model parameters significantly differ from the nominal ones, may slightly exceed the limit after the end of the control window. This motivates the design of a robust controller, e.g. by a simple scenario method introduced in the next subsection.
It can be seen that the two different weightings result in two different control strategies. In the first case (when = 1, ∀ ), an almost constant, mild intervention is proposed, which is gradually eased towards the end of the control window. In the second case, the control input follows the same pattern as the weighting: in the first 4 weeks strict intervention is applied, that is significantly eased in the next month. From the 8th week, the control input is pushed as close as possible to 1.
Finally, we analyze the computation time needed to solve the optimization task (8). The computation consists of two main steps: first, formulating the GP problem (CVX) and second, finding the optimal solution (MOSEK). The first part requires about 180-200 s while the second takes 70-80 s in general (measured on the longest control horizon, at = 0). (The computations have been performed on Mac Pro Late 2013 computer, under Matlab 2020a.) Compared to the nonlinear MPC methods in Péni et al. (2020), where finding an optimal solution required 10-12 min, this is a significant improvement in terms of T. Péni and G. Szederkényi   computational cost. However, it can be seen that the problem formulation takes up significant time compared to the numerical optimization. Since every time a similar numerical optimization task has to be performed (only the length of the horizon changes), it is possible to speed up the problem formulation phase by generating precompiled program components, identifying tasks that have to be performed only once and exploiting the specific structure of the problem. With these improvements the computation time can be further reduced.

Closed loop control by linear programming and simple scenario method
This section presents the results obtained by using the LP based control design presented in Section 4.3. To formulate and solve the LP optimization problem we used again the CVX environment. As we    which can compile the code and allows to generate parameterized optimizer objects. In this paper, we do not use this tool as we have found standard CVX suitably fast. However, by exploiting the benefit of code optimization and precompiling, the computation time can be further decreased.
By performing a control synthesis with the same parameters as in the GP example above (i.e. = 190∕ , = 21∕ , = 7∕ ), the computation time with LP is less than 10 s. From this, the construction of the optimization task takes approximately 8-8.5 s, the solve-time is always below 2 s. Since the solution of an LP requires significantly less time than GP, therefore we implemented a simple scenario method to improve the robustness of the controller. This approach can be applied to the GP-based control synthesis as well.
The basic concept of scenario based predictive control is generating a pool of possible system models and designing a control input that is feasible (and suboptimal) for all. If the size of the pool is suitably large, the probability of the constraint violation (infeasibility) on an arbitrary new model is close to zero. The details of scenario methods and scenario based control design can be found e.g. in Calafiore and Fagiano (2013), Campi (2018). The main drawback of this approach is that the computation of the control input generally requires to include the entire pool in the optimization process, which highly increases the dimension of the problem. Due to the specific structure of the epidemic model, this difficulty is not to be reckoned with. Since smaller control inputs mean stricter interventions, if feasible control inputs are computed separately for the models of the pool, then the minimum of these control input candidates gives a feasible control action for all models. The papers cited above propose specific algorithms on how to set up the pool correctly. In this paper we chose a very simple method. The pool is initiated with the nominal model. Random models are generated and the scenario MPC design is applied. If a constraint violation is detected, the actual model is added to the pool. The pool for the simulations presented in this paper has been constructed by running 100 design processes with 100 different models. (In all cases the weights of the cost function have been set to 1 (no input weighting has been applied), the weight of constraint violation has been set to 100.) After completing all of the design tasks we have obtained a pool of only 7 models. By testing the pool on another 100 examples, we have not run into constraint violation. This result suggests that the scenario method is a potential approach to design robust controller for our epidemic model. This motivates us to continue the analysis of this method and to make further efforts to derive mathematical guarantees for feasibility and optimality. This will be part of the future research.
After constructing the pool of models, we have tested the scenario controller in closed loop on the same 6 models that have been used in the earlier simulations. As before, we have not applied any weighting in the cost, i.e. = 1 has been chosen in (9) for all . The weight penalizing the soft constraint violation has been set to 100. The simulation results can be seen in Figs. 10-12. Similar to the results obtained by the GP-based design (simulation scenario I.), a mild intervention is applied in the most part of the control window, which is gradually relaxed towards the end of the horizon. The main difference between the two control strategies is at the beginning of the control window: the LP strategy starts with very mild measures, the control is practically activated only after the 4th week. In contrast, the GP-based approach applies significant interventions right from the beginning. By comparing the cost of the control we have computed the 1 norm of the control signals in both cases: by using the GP-based synthesis this norm is between 105 and 125, while it is between 115 and 135 in case of the LP. The LP is slightly better, but the difference is not significant, so we can say that practically the performance of the two control strategies is very similar. Note also that the characteristic and performance of either control strategy can be further tuned by modifying the weights of the cost function.
By analyzing the trajectories of we can see that with the LP controller there is no constraint violation. Due to the scenario design, a slightly conservative control action is designed, which prevents from reaching the prescribed limit.

Conclusions
An output feedback control approach was proposed in this paper to design the stringency of the non-pharmaceutical measures for mitigating the COVID-19 pandemic. The linearity of a subsystem of the used compartmental model allowed us to use dynamic inversion to give an estimation for the number of people in the latent compartment. The nonnegativity of the terms in the discrete-time model was exploited to formulate the control problem in a convex geometric programming framework. Additionally, by relaxing the optimization task further, the input can be computed by solving a simple linear programming problem. A state observer was designed and analyzed for the tracking of non-measured variables. The maximal estimation errors occur around the peak of the epidemic. Simulation results show that the designed feedback efficiently reduces the effects of the studied model uncertainties, and the control goals and constraints can be reached within an acceptable tolerance even in the linear programming case. The computations can be done in real-time and significantly faster than in the case of a previous nonlinear MPC solution Péni et al. (2020). The convexity of the re-formulated optimization problems guarantees a unique solution which is suboptimal with respect to the original optimal control problem, and makes possible to extend the control design to larger dimensional (e.g. multi-scale or multi-region) compartmental models Carli et al. (2020), as well as to apply more advanced uncertainty analysis techniques, e.g. scenario methods, in the future. Another subject of further work will be the application of the obtained suboptimal inputs as initial solutions in the original nonlinear MPC design.