An extended linear quadratic regulator for LTI systems with exogenous inputs

This paper proposes a cost-effective control law for a linear time-invariant (LTI) system having an extra set of exogenous inputs (or external disturbances) besides the traditional set of control inputs. No assumption is made with regard to a priori knowledge of the modeling equations for the exogenous inputs. The problem of optimal control for such a system is defined in the standard framework of linear quadratic control, and an extended linear quadratic regulator (ELQR) is proposed as the solution to the problem. The ELQR approach is demonstrated through an example and is shown to be significantly more cost-effective than currently available approaches for linear quadratic control.


Introduction
Optimal control of a dynamic system involves cost-effective operation of the system under constrained control effort. The particular case of optimal control in which the system dynamics are described by linear differential equations, and the associated cost is a quadratic function of the system states and the control effort, is termed the linear quadratic (LQ) problem (Kwakernaak & Sivan, 1972; Ogata & Yang, 2001). The solution to the LQ problem is provided by the linear quadratic regulator (LQR), a state-feedback controller (Kalman, 1960).
It is assumed in the LQ problem that all of the inputs given to the system are control inputs, which means that each of the inputs can be manipulated by the controller. But this may not be the case for a general LTI system, and some of the inputs to the system may be external disturbances, which can neither be disabled nor manipulated by the controller. Such inputs are also termed exogenous inputs, and the terms 'external disturbance' and 'exogenous input' are used interchangeably in this paper. Some examples of dynamic systems with exogenous inputs can be found in the control literature in Menguy, Boimond, Hardouin, and Ferrier (2000) and Kwak (2010, 2011), wherein such inputs are called 'uncontrollable inputs'. Another recent example of exogenous inputs can be found in the power system literature in Singh and Pal (2014), wherein they are referred to as 'pseudo-inputs'.

✩ This work was supported by EPSRC, U.K., under Grants EP/K036173/1 and EESC-P55251. The material in this paper was not presented at any conference. This paper was recommended for publication in revised form by Associate Editor Delin Chu under the direction of Editor Ian R. Petersen. E-mail addresses: a.singh11@imperial.ac.uk (A.K. Singh), b.pal@imperial.ac.uk (B.C. Pal).
The problem of linear quadratic control of LTI systems with external disturbances has been studied in the past (see, for instance, Cheok & Loh, 1987; Johnson, 1968, 1971; Ostertag, 2011). In these studies, it is assumed that the controller can either measure the external disturbances or estimate them using knowledge of the model which governs them. The disturbances are then eliminated using a control input with two components: one component exactly cancels out the disturbance, while the other provides optimal state feedback. In case the disturbance cannot be exactly canceled out (and this is practically the most likely case), the cancelling component is chosen as the one smallest in magnitude that minimizes the effect of the disturbance on the system. This technique of accommodating the external disturbance in the LQ problem by minimizing its effect has a drawback: it does not guarantee minimization of the net costs associated with control efforts and state deviations, as demonstrated in Section 5 of this paper. Thus, this technique fails to achieve the main objective of linear quadratic control. This paper proposes a new solution to the problem of optimal control of LTI systems with external disturbances or exogenous inputs; the solution not only optimally accommodates the disturbance, but also minimizes the overall quadratic costs of control efforts and state deviations.
Nomenclature

k: denotes the kth time sample
T: denotes the transpose of a matrix or a vector
T_0: the sampling period for the system, in s
0_{a×b}: denotes a zero matrix of size (a × b)
N: the final time sample at which a closed-loop system with LQR or ELQR reaches steady state
x: the vector of states of the system
u: the vector of control inputs to the system
u′: the vector of exogenous inputs to the system
A: the state matrix
B: the input matrix corresponding to control inputs
B′: the input matrix corresponding to exogenous inputs
F: the state-feedback gain in the LQR and the ELQR solutions
G: the feedback gain corresponding to u′ in the ELQR solution
G′: the supplementary feedback quantity in the ELQR solution
I_c: denotes an identity matrix of size (c × c)
J: the quadratic cost function for a discrete LTI system without any exogenous input
J′: the quadratic cost function for a discrete LTI system with both control inputs and exogenous inputs
P: the positive-definite matrix corresponding to F
Q: the cost weighting matrix corresponding to x
R: the cost weighting matrix corresponding to u
R′: the cost weighting matrix corresponding to u′
S: the matrix corresponding to G in the ELQR solution
S′: the matrix corresponding to G′ in the ELQR solution

The solution does not make any assumption on the availability of a priori knowledge of the model or the statistics of the exogenous input, and it remains valid for any sequence of exogenous inputs. The rest of the paper is organized as follows. Section 2 formally states the linear quadratic problem. Section 3 explains the classical LQR solution, while Section 4 describes the ELQR solution. This solution is demonstrated on an example system in Section 5, and Section 6 concludes the paper.

Problem statement
Some preliminary definitions:

Definition 1. A 'control input' given to an LTI system is an input whose magnitude can be decided and changed as per any required control scheme. This is the input in the traditional sense of system theory.
Definition 2. An 'exogenous input' given to an LTI system is an input which cannot be removed from the system and whose magnitude cannot be decided or changed. This input is an unavoidable quantity which cannot be used as a control input in corrective actions, although it may be possible to find a control input which cancels out or minimizes the effect of the exogenous input.
Using the above two definitions, the control problem is stated as follows: For a discrete-time open-loop LTI system in which both control inputs and exogenous inputs are present, find an optimal control law such that the sum of the quadratic costs associated with the system state deviations, the exogenous inputs and the control inputs is minimized.
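Formally, in the notation of the nomenclature, the ELQ problem can be written as the following program (a sketch consistent with the cost terms named above; Q, R and R′ are the cost weighting matrices):

```latex
\min_{u_0,\dots,u_{N-1}} \; J' \;=\; \sum_{k=0}^{N}
  \left( x_k^{T} Q\, x_k \;+\; u_k^{T} R\, u_k \;+\; u_k'^{\,T} R'\, u_k' \right)
\quad \text{subject to} \quad
x_{k+1} \;=\; A\, x_k \;+\; B\, u_k \;+\; B'\, u_k' .
```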
Thus, the aim of the above problem is to control the system via its control inputs under the constraints of exogenous inputs. This problem is termed as the extended linear quadratic (ELQ) problem in this paper.

Classical LQR control
A discrete-time open-loop LTI system without any exogenous input is represented by the following equation:

x_{k+1} = A x_k + B u_k.    (1)
The quadratic cost function for (1) over N + 1 samples is given by:

J = Σ_{k=0}^{N} (x_k^T Q x_k + u_k^T R u_k).    (2)

Minimizing J with respect to u_k gives the following LQR solution:

u_k^opt = -F_k x_k,    (3)
F_k = (R + B^T P_{k+1} B)^{-1} B^T P_{k+1} A,    (4)
P_k = Q + A^T P_{k+1} (A - B F_k),  with P_N = Q.    (5)
If N is finite, then the above optimal control policy is called the finite-horizon LQR; otherwise it is the infinite-horizon LQR. Moreover, P_k and F_k for the infinite-horizon case are bounded and have a steady-state solution iff the pair (A, B) is stabilizable, and the steady-state solution is found by solving the following discrete-time algebraic Riccati equation (ARE):

F = (R + B^T P B)^{-1} B^T P A,    (6)
P = Q + A^T P (A - B F).    (7)
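As a concrete sketch of the recursion just described, the following snippet evaluates F_k and P_k backwards from the terminal condition and approximates the steady-state gain by running the recursion over a long horizon. The system matrices are illustrative placeholders, not this paper's example system; the recursion itself is the standard discrete-time one of (4)-(5), written in the algebraically equivalent stabilized form for P_k.

```python
import numpy as np

def lqr_backward(A, B, Q, R, N):
    """Finite-horizon discrete LQR via the backward Riccati recursion:
        F_k = (R + B^T P_{k+1} B)^{-1} B^T P_{k+1} A,
        P_k = Q + F_k^T R F_k + (A - B F_k)^T P_{k+1} (A - B F_k),  P_N = Q.
    Returns gains[k] = F_k for k = 0..N-1, and P_0."""
    P = Q.copy()
    gains = []
    for _ in range(N):
        F = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + F.T @ R @ F + (A - B @ F).T @ P @ (A - B @ F)
        gains.append(F)
    gains.reverse()
    return gains, P

# Illustrative stabilizable pair (A, B): a discretized double integrator.
A = np.array([[1.0, 0.1],
              [0.0, 1.0]])
B = np.array([[0.0],
              [0.1]])
Q, R = np.eye(2), np.eye(1)

# For a long horizon, the first gain approximates the steady-state F of (6)-(7).
gains, P0 = lqr_backward(A, B, Q, R, N=500)
F_inf = gains[0]
```

At convergence, P0 satisfies the ARE and the closed-loop matrix A - B F_inf has all eigenvalues inside the unit circle, consistent with the stabilizability condition stated above.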

Linear quadratic control for systems with exogenous inputs
A discrete-time open-loop LTI system with both control inputs and exogenous inputs is given by the following equation:

x_{k+1} = A x_k + B u_k + B′ u′_k.    (8)
As explained in Section 1, many solutions have been proposed for linear quadratic control of the above system using disturbance accommodation (such as in Cheok & Loh, 1987; Johnson, 1968, 1971; Ostertag, 2011). In these solutions, the component of the control input which accommodates the exogenous inputs is -B⁺B′u′_k, where B⁺ denotes the Moore-Penrose pseudoinverse of B. The net control input is given by:

u_k = -F_k x_k - B⁺ B′ u′_k.    (9)

The above control solution in (9) does not consider the quadratic cost for the discrete system given by (8). This system has an extra term (corresponding to the exogenous inputs) as compared to the system given by (1); thus the quadratic cost for this system gets modified. For N + 1 samples it is given by:

J′ = Σ_{k=0}^{N} (x_k^T Q x_k + u_k^T R u_k + u′_k^T R′ u′_k).    (10)

In order to find the optimal control policy for (8), J′ in (10) needs to be minimized with respect to u_k. This minimization gives the following theorem.

Theorem 1. For an LTI system with pseudo-inputs (as given by (8)), provided u′_k = 0 ∀ k ≥ N, the optimal control policy for 0 ≤ k < N is given by (11)-(13) (and for k ≥ N, u_k = 0):

u_k^opt = -(F_k x_k + G_k u′_k + G′_k),    (11)

where the gains G_k and G′_k are obtained from the matrices S_k and S′_k through (12) and (13). F_k and P_k remain the same as in the LQR case (given by (4)-(5)).

Proof.
A preliminary modification needs to be made to the system given by (8) for the derivation of Theorem 1, by appending a constant pseudo-input at the end of the column vector u′_k, as in (14)-(16). Here a is the number of elements in x_k and b is the number of elements in u′_k. Because of (16), this modification has no effect on the dynamics of the original system; it is needed only to obtain an iterative expression for the optimal control policy. On its own, u′_k cannot be expressed in terms of u′_{k-1}, but when a new pseudo-input vector v_k is defined by appending a constant value 1 at the end of u′_k, then v_k can be expressed in terms of v_{k-1} using (15). The quadratic cost for the modified system (given by (14)) for N + 1 samples is given by (17)-(19). Eq. (19) and the definition of R_1 (given by (18)) ensure that the constant pseudo-input 1 in v_k has zero cost, so that the quadratic costs for the modified system and the original system (given by (17) and (10), respectively) are identical.

As it is given that u′_k = 0 ∀ k ≥ N and the system reaches its final steady state, the combined quadratic cost for k = N - 1 and k = N, provided that the cost for k = N is optimal (which is J′opt_N), is given by J′_{N-1} as in (20)-(21). Setting the partial derivative ∂J′_{N-1}/∂u_{N-1} to zero, and noting that ∂²J′_{N-1}/∂u²_{N-1} is positive definite, gives the optimal control u^opt_{N-1} and, on substitution back into (20), the optimal cost J′opt_{N-1}. Next, the combined quadratic cost for k = N - 2, N - 1 and N, provided that the combined cost for k = N - 1 and N is optimal, is minimized in the same manner. When the terms u^opt_{N-3} and J′opt_{N-3} are then evaluated, their expressions are similar to (32) and (35), respectively, with the only change that N - 2 is replaced by N - 3, and N - 1 by N - 2. Similar expressions follow for the rest of u^opt_k and J′opt_k (that is, for k < N - 3).

Thus, using the initial conditions U_N = 0_{a×(b+1)} and P_N = Q, and applying induction for k < N, the optimal cost for J′ in (17) comes out as J′opt_0 (found by iteratively evaluating the sequence J′opt_N, J′opt_{N-1}, . . . , J′opt_1, J′opt_0), and the corresponding optimal control policy required to arrive at this optimal cost is given by (39)-(45). It may be noted that W_k has no role in deciding u^opt_k. P_k (using (42)) can be rewritten, and substituting F_k from (40) into (45) gives (46). Similarly, U_k (using (43)) can be rewritten, and using (49) in (48) gives (50). Using (40), H_k in (41) can be rewritten; then, using (50) and (51), U_k in (48) and H_k in (39) are partitioned according to the numbers of elements in u_k and u′_k, giving (52)-(56). Hence, with (40), (46) and (53)-(56), Theorem 1 stands proved.
The optimal control solution in Theorem 1 is termed the extended linear quadratic regulator (ELQR) solution. In this solution, the finite-horizon case is applicable only when the sequence of exogenous inputs is known to be finite; if the sequence is unknown or infinite, the infinite-horizon case applies. If the pair (A, B) is stabilizable, then the infinite-horizon solutions for P_k, F_k, G_k and S_k exist, and are given by F and P as in (6)-(7), and by S and G as in (57)-(58). Although the terms F_k and P_k for the ELQR case remain the same as in the LQR case, this needs to be mathematically derived, and this derivation is an important contribution of this paper. The other terms G_k and S_k are independent of the sequence of u′_k, and hence they can be easily calculated if A, B, B′, Q and R are known. On the other hand, the terms G′_k and S′_k require knowledge of the sequence of u′_k for all present and future samples. Thus, if the sequence of exogenous inputs is not known, only the terms F_k, P_k, G_k and S_k can be calculated accurately, and the terms G′_k and S′_k can only be estimated/predicted based on the estimated/predicted values of u′_k. If u′_k cannot be estimated or predicted, the term G′_k should be ignored while finding the ELQR policy.
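To make the structure of the ELQR solution concrete, the following sketch implements the backward recursion of Theorem 1 in an equivalent affine-quadratic value-function form, V_k(x) = x^T P_k x + 2 s_k^T x + c_k, obtained by standard dynamic programming. This is a reformulation, not the paper's notation: the vector s_k below is a bookkeeping device standing in for the information the paper carries in S′_k and G′_k (and, like them, it depends on present and future u′_k), while F_k and G_k are independent of the u′ sequence. The system matrices are illustrative placeholders, not the paper's test system.

```python
import numpy as np

def elqr_policy(A, B, Bp, Q, R, u_ex):
    """ELQR policy u_k = -(F_k x_k + G_k u'_k + g_k), via the value function
    V_k(x) = x^T P_k x + 2 s_k^T x + c_k, with P_N = Q and s_N = 0:
        F_k = (R + B^T P_{k+1} B)^{-1} B^T P_{k+1} A     # same gain as LQR
        G_k = (R + B^T P_{k+1} B)^{-1} B^T P_{k+1} B'    # independent of u'
        g_k = (R + B^T P_{k+1} B)^{-1} B^T s_{k+1}       # needs future u'
        s_k = (A - B F_k)^T (P_{k+1} B' u'_k + s_{k+1})
    u_ex holds the known sequence u'_0 ... u'_{N-1} (zero afterwards)."""
    N = len(u_ex)
    P, s = Q.copy(), np.zeros(A.shape[0])
    policy = []
    for k in reversed(range(N)):
        Kmat = R + B.T @ P @ B
        F = np.linalg.solve(Kmat, B.T @ P @ A)
        G = np.linalg.solve(Kmat, B.T @ P @ Bp)
        g = np.linalg.solve(Kmat, B.T @ s)
        policy.append((F, G, g))
        s = (A - B @ F).T @ (P @ Bp @ u_ex[k] + s)   # uses P_{k+1}, s_{k+1}
        P = Q + F.T @ R @ F + (A - B @ F).T @ P @ (A - B @ F)
    policy.reverse()                                  # policy[k] = (F_k, G_k, g_k)
    return policy

def closed_loop_cost(policy, A, B, Bp, Q, R, u_ex, use_ff=True):
    """Accumulate sum_k (x^T Q x + u^T R u) + x_N^T Q x_N along the trajectory.
    With use_ff=False the feedforward terms G_k u'_k + g_k are dropped, which
    reduces the controller to the classical LQR state feedback."""
    x, cost = np.zeros(A.shape[0]), 0.0
    for k, (F, G, g) in enumerate(policy):
        u = -(F @ x + (G @ u_ex[k] + g if use_ff else 0.0))
        cost += x @ Q @ x + u @ R @ u
        x = A @ x + B @ u + Bp @ u_ex[k]
    return cost + x @ Q @ x

# Illustrative system: 2 states, 1 control input, 1 exogenous input.
A = np.array([[1.0, 0.1], [0.0, 0.9]])
B = np.array([[0.0], [0.1]])
Bp = np.array([[0.1], [0.0]])
Q, R = np.eye(2), np.eye(1)
N = 60
u_ex = np.array([[0.9 ** k] for k in range(N)])       # vanishing disturbance

policy = elqr_policy(A, B, Bp, Q, R, u_ex)
elqr_cost = closed_loop_cost(policy, A, B, Bp, Q, R, u_ex, use_ff=True)
lqr_cost = closed_loop_cost(policy, A, B, Bp, Q, R, u_ex, use_ff=False)
```

Because the ELQR recursion minimizes this finite-horizon cost exactly, the simulated closed-loop cost with the feedforward terms is never larger than the cost of the plain LQR feedback on the same disturbance sequence.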

Implementation example: A third-order LTI system
The ELQR control can be implemented on any system whose equations can be reduced to the form given by (8). An illustrative example is presented as follows, in which a simple third-order LTI system is controlled using the ELQR methodology.
The state-space matrices A, B and B′ of the test system, the equation of which is given by (8), define a system with three states, two control inputs and one exogenous input. Initially, all the states and inputs are zero; that is, x_0 = 0_{3×1}, u_0 = 0_{2×1}, and u′_0 = 0. The following three cases are considered for the exogenous input.

Known and deterministic model of the exogenous input
In this case, the exogenous input can be predicted and its model is known. For k ≥ 1, it is given by:

u′_k = (0.9)^k.    (59)

Although a vanishing exogenous input (one which vanishes as k → ∞) has been given in the above example system, any other sequence of exogenous input(s), which may or may not vanish, can be given to the system, and the ELQR solution can subsequently be applied.

ELQR policy
As u′_k in (59) is an exponentially decreasing function of the time sample and becomes zero only as k → ∞, the infinite-horizon case of ELQR needs to be used to optimally control this system. Using Eqs. (6), (7), (57) and (58), with the cost weighting matrices Q and R taken as I_3 and I_2, respectively, the infinite-horizon values of P, F, S and G in (60) are evaluated (rounded off to two decimal places). Also, S′_k can be evaluated by substituting P and S for P_k (or P_{k+1}) and S_k, respectively, in (13), and solving for S′_k iteratively. S′_k is then substituted in (12) to find G′_k. The final solutions for S′_k and G′_k are given in (61) (rounded off to two decimal places). Substituting the values of F, G and G′_k from (60) and (61) in (11), the optimal control policy for ELQR is obtained.

Classical LQR policy
The classical LQR control is also applied to the test system for performance comparison with the ELQR control. As only the state-feedback gain F is required for classical LQR, and it is the same as the state-feedback gain for the ELQR control, the classical LQR control policy comes out as:

u_k = - [ -0.42  0.11  0.60
           1.14  0.12  0.05 ] x_k.

Disturbance accommodating LQR policy
The disturbance-accommodating LQR (DALQR) policy given by (9) is also applied to the test system. After substituting the values of F, B and B′ in (9), the DALQR policy uses the same state-feedback gain F as above, together with the disturbance-cancelling component -B⁺B′u′_k.
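The Moore-Penrose cancellation underlying the DALQR component of (9) can be illustrated numerically. The matrices below are placeholders with the same dimensions as the test system (three states, two control inputs, one exogenous input), not the paper's actual values; the snippet shows that when B′u′_k lies in the column space of B, the component -B⁺B′u′_k cancels the disturbance term exactly.

```python
import numpy as np

# Illustrative matrices (not the paper's test system).
B  = np.array([[1.0, 0.0],
               [0.0, 1.0],
               [0.0, 0.0]])
Bp = np.array([[0.5],
               [0.3],
               [0.0]])
u_ex = np.array([1.0])                   # current exogenous input u'_k

# Disturbance-accommodating component of (9): the smallest-norm control
# input that best cancels the disturbance term B' u'_k.
u_d = -np.linalg.pinv(B) @ Bp @ u_ex

# Here B' u'_k lies in the column space of B, so the cancellation is exact.
residual = B @ u_d + Bp @ u_ex
```

When B′u′_k does not lie in the column space of B, the same expression gives the least-squares best cancellation, and the residual is the part of the disturbance the control inputs cannot reach.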

Comparison of control performance
The weighted norms of the states and the control inputs (given by x_k^T Q x_k and u_k^T R u_k, respectively) can be used as measures of the control performance of a control method. These weighted norms are also the quadratic costs associated with the control method for the kth sample, as can be inferred from the constituent terms of J′ in (10). The cost associated with the exogenous inputs, given by u′_k^T R′ u′_k, remains independent of the control method, because u′_k does not depend on the control method. The test system has been simulated in MATLAB, and the weighted norms of the states and the control inputs are plotted in Fig. 1. It should be noted that x_k^T Q x_k = x_k^T x_k and u_k^T R u_k = u_k^T u_k for the test system. Table 1 presents a comparison of the quadratic costs associated with the states and the control inputs for the three methods. It may be inferred from Fig. 1 and Table 1 that ELQR is much more efficient than both DALQR and classical LQR in the presence of exogenous inputs: for the test system, the total quadratic cost for the states and the control inputs is reduced by 66.5% as compared to classical LQR, and by 32.0% as compared to DALQR.

Known and stochastic model of the exogenous input
In this case, the exogenous input is stochastic with a known model, and the rest of the system is the same as in Case A. For k ≥ 1, the exogenous input is given in terms of X_k, a random variable with a truncated normal distribution with mean = 0.9, variance = 0.01, upper limit = 0.95, and lower limit = 0.85.
As A, B and B′ remain the same as in Case A, the values of P, F, S and G also remain unchanged and are given by (60). An exact value of S′_k cannot be found for this case, as the sequence of u′_k is non-deterministic. However, the expected value of S′_k can be evaluated by substituting Δu′_{k+1} with its expected value (which is -0.1(0.9)^k) and replacing P_k and S_k with P and S, respectively, in (13). S′_k is then solved iteratively and substituted in (12) to find G′_k. The final solutions for the expected values of S′_k and G′_k are given in (66). Substituting the values of F, G and G′_k from (60) and (66) in (11), the optimal control policy for ELQR is obtained.
The optimal control policies for DALQR and classical LQR remain the same as in Case A. As random inputs are used in the simulation, multiple simulations need to be run to obtain statistics of the quadratic costs. Fig. 2 and Table 2 show the mean values of the quadratic costs over 1000 simulations. It can be observed that in this case as well, the control performance of ELQR is better than that of the other two methods: the net mean quadratic cost for the states and the control inputs is reduced by 58.7% as compared to classical LQR, and by 14.1% as compared to DALQR.

Unknown model for the exogenous input
In the third case, it is assumed that no knowledge of the model of the exogenous input is available, and the exogenous inputs cannot be predicted or estimated. The term G_k in the ELQR policy is not a function of the exogenous inputs, and hence it can still be used for control, while the term G′_k must be ignored as it depends on the sequence of exogenous inputs. Substituting the values of F and G from (60) in (11), the optimal control policy for ELQR comes out as follows, while those for DALQR and classical LQR remain the same as in Case A:

u_k = - [ -0.42  0.11  0.60
           1.14  0.12  0.05 ] x_k - G u′_k,

with G as given in (60). For the simulation, a uniform random number between 0.5 and 1 is given as the exogenous input at each sample (in the above ELQR law, it is assumed that even this information about the randomness of the exogenous inputs is not available). As random inputs are used in this case as well, multiple simulations need to be run to obtain statistics of the quadratic costs. Fig. 3 and Table 3 show the mean values of the quadratic costs over 1000 simulations. It can be observed that in this case also, the control performance of ELQR is much better than that of the other two methods: the net mean quadratic cost for the states and the control inputs is reduced by 71.5% as compared to classical LQR, and by 46.9% as compared to DALQR.

Conclusions
A control scheme has been presented for the optimal control of a special class of LTI systems in which both control inputs and exogenous inputs are present. The scheme is termed the extended LQR (ELQR), and it is shown to be significantly more cost-effective than the available LQR schemes. The applicability of the scheme has been demonstrated on a simple model LTI system.