On the Optimal Control Computation of Linear Systems

In this paper, we consider a numerical method for designing an optimal control for the Linear Quadratic Regulator (LQR) problem. In the optimal control design process via the Pontryagin Maximum Principle (PMP), we obtain a system of differential equations in the state and costate variables. This system lacks an initial condition on the adjoint variables, a situation that creates the classic difficulty in solving optimal control problems. This paper proposes a constructive method to approximate the initial condition of the adjoint system.

DOI: http://dx.doi.org/10.22342/jims.15.1.40.13-20


INTRODUCTION
Let a linear control system be ẋ = Ax + Bu. We want to drive the state x from a given initial condition to a given final condition in a finite time while minimizing a cost functional. To solve this optimal control problem, Pontryagin's Maximum Principle is applied. This approach converts the optimal control problem into solving a Hamiltonian system consisting of a pair of linear differential equations. To solve this system, one must solve a two-point boundary value problem for the first equation. The second equation, on the other hand, lacks an initial condition. This is a classic problem in optimal control design. In order to overcome it, we propose a constructive method to approximate the initial condition of the adjoint variable. This method is quite general in the sense that it can be applied to any linear control system.
Several different methods have been proposed by other researchers for overcoming the classical difficulty regarding the initial conditions of the adjoint variables. Scheeres et al. [4] use generating functions in their work. Twigg et al. [5] use a genetic algorithm for computing the optimal controls. Another approach is proposed by Kelly et al. [2], who use polynomial spirals.

LINEAR QUADRATIC REGULATOR PROBLEM
We consider a Linear Quadratic Regulator (LQR) problem as follows:

ẋ = Ax + Bu,    (1)

where x ∈ R^n and u ∈ R^m. The initial and final conditions are

x(0) = m,    x(T) = s.    (2)

The problem discussed here is to determine a control u driving the state x from m to s in time T. At the same time, the control u must minimize the quadratic cost functional

J = ∫_0^T (δ + uᵀ(t)u(t)) dt,

where δ is some positive constant. The integrand represents the cost or energy of the controls used. In (2), the symbol x(t) denotes the state at time t, the symbol m denotes the initial state, and s denotes the final state. In order to make the discussion simpler, we consider a slightly different functional, namely the negative of the original J. So, we want to maximize the functional

J′ = −J = ∫_0^T (−δ − uᵀ(t)u(t)) dt.

Thus, instead of minimizing J, we will maximize J′. We assume that the control system is controllable. Because of the linearity of the system, controllability implies strong controllability. This means that there exists a control u such that (2) is satisfied for however small the time T is. Therefore, there is a u that makes J′ finite, and it follows that there exists a u maximizing J′.
The Hamiltonian function derived from the optimal control problem above is

H(x, u, p, p₀) = p₀(−δ − uᵀu) + pᵀ(Ax + Bu).

Using the Hamiltonian function above, one can derive the following Hamiltonian system:

ẋ = ∂H/∂p = Ax + Bu,    (3)

ṗ = −∂H/∂x = −Aᵀp.    (4)

By the Pontryagin Maximum Principle, the extremal trajectory must satisfy

∂H/∂u = −2p₀u + Bᵀp = 0,    (5)

and p₀ must be a positive constant. One may consult Hocking [1] for a more detailed explanation. We assume p₀ = 1, so that (5) gives u = (1/2)Bᵀp. If one substitutes the control u from (5) into (3), we obtain a system of differential equations as follows:

ẋ = Ax + (1/2)BBᵀp,    (6)

ṗ = −Aᵀp.    (7)

Using (7), one obtains

p(t) = exp(−Aᵀt)k

for some k ∈ R^n.
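As a numerical sanity check of (7), the sketch below (Python with NumPy; the matrix A is a hypothetical double-integrator choice made only so that the matrix exponential has a simple closed form) compares the closed-form adjoint solution p(t) = exp(−Aᵀt)k against a Runge-Kutta integration of ṗ = −Aᵀp.

```python
import numpy as np

# Hypothetical 2x2 system matrix (double integrator), chosen for
# illustration because A^2 = 0, so exp(A t) = I + A t in closed form.
A = np.array([[0.0, 1.0],
              [0.0, 0.0]])

def expm_nilpotent(M, t):
    # exp(M t) for a 2x2 matrix with M^2 = 0.
    return np.eye(2) + M * t

def rk4(f, y0, t1, n=1000):
    # Classical fourth-order Runge-Kutta integration from 0 to t1.
    h, y = t1 / n, np.array(y0, dtype=float)
    for _ in range(n):
        k1 = f(y); k2 = f(y + h / 2 * k1)
        k3 = f(y + h / 2 * k2); k4 = f(y + h * k3)
        y = y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    return y

k = np.array([1.0, 2.0])                      # p(0) = k
p_closed = expm_nilpotent(-A.T, 1.0) @ k      # p(1) = exp(-A^T) k
p_numeric = rk4(lambda p: -A.T @ p, k, 1.0)   # integrate p' = -A^T p
assert np.allclose(p_closed, p_numeric, atol=1e-9)
```

The two answers agree to integration tolerance, which is the property the construction in the next section relies on: once p(0) is fixed, the whole adjoint trajectory is determined.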
The system consisting of the pair of differential equations (6)-(7) must be solved to find the optimal trajectory of x. The first equation (6) has initial and final conditions that must be satisfied. However, the initial condition of the second equation (7) is not known. This is a classic difficulty in optimal control theory.

Steepest Descent Method
Because the initial condition of the second equation (7) is not known, we construct a method for approximating a suitable initial condition.
If we let p(0) = q for some q ∈ R^n, we obtain

p(t) = exp(−Aᵀt)q.    (8)

Next, if we use (8) in (6), we obtain

ẋ = Ax + (1/2)BBᵀ exp(−Aᵀt)q.

Therefore, the final state x(T) with p(0) = q is

x(T) = exp(AT)m + (∫_0^T exp(A(T − t)) (1/2)BBᵀ exp(−Aᵀt) dt) q.

Of course, the trajectory x in general will not end at s; in other words, in general x(T) ≠ s. Guessing a suitable q that makes x start exactly from m and end exactly at s in time T is not easy. Thus, we propose an algorithm for approximating a suitable q.
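The affine form of the final-state map can be checked numerically. The sketch below (Python with NumPy; A, B, m, and T are hypothetical illustrative choices, with A a double integrator so that exp(At) = I + At exactly) computes the matrix Q = ∫_0^T exp(A(T − t))(1/2)BBᵀ exp(−Aᵀt) dt by trapezoidal quadrature and verifies that integrating (6)-(7) forward indeed lands on exp(AT)m + Qq.

```python
import numpy as np

# Hypothetical data for illustration: double-integrator A (A^2 = 0),
# single input B, initial state m, horizon T.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
m = np.array([1.0, 0.0])
T = 1.0

def expA(t):
    # exp(A t) = I + A t, exact because A^2 = 0.
    return np.eye(2) + A * t

# Q = int_0^T exp(A(T-t)) (1/2) B B^T exp(-A^T t) dt  (trapezoidal rule).
ts = np.linspace(0.0, T, 2001)
G = np.array([expA(T - t) @ (0.5 * B @ B.T) @ expA(-t).T for t in ts])
dt = ts[1] - ts[0]
Q = (G[0] / 2 + G[1:-1].sum(axis=0) + G[-1] / 2) * dt

def final_state(q, n=1000):
    # RK4 on the coupled system x' = A x + (1/2) B B^T p, p' = -A^T p.
    h, y = T / n, np.concatenate([m, q])
    def f(y):
        x, p = y[:2], y[2:]
        return np.concatenate([A @ x + 0.5 * (B @ B.T) @ p, -A.T @ p])
    for _ in range(n):
        k1 = f(y); k2 = f(y + h / 2 * k1)
        k3 = f(y + h / 2 * k2); k4 = f(y + h * k3)
        y = y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    return y[:2]

q = np.array([0.3, -0.7])
# Direct integration agrees with the affine formula x(T; q) = exp(AT) m + Q q.
assert np.allclose(final_state(q), expA(T) @ m + Q @ q, atol=1e-6)
```

The agreement confirms that, for fixed m and T, the map q ↦ x(T; q) is affine, which is the key structural fact used below.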
Instead of finding the precise q above, we want to find a q that minimizes the functional

F(q) = ‖x(T) − s‖²,    (9)

where x(T) is the evaluation of x at time T and (x, p) is the solution of the differential equation system (6)-(7) with initial conditions (x(0), p(0)) = (m, q). We minimize the functional F by employing the Steepest Descent Method. The algorithm can be described as follows. First, we pick any positive numbers ε, α and any vector q₀ ∈ R^n. Using these and the initial condition x(0) = m, one can solve the initial value problem (6)-(7) with (x(0), p(0)) = (m, q₀). In particular, one can compute the final state x(T) for this particular q₀. Moreover, using this x(T), one can compute (9) to determine the scalar value F(q₀). Next, we want to find a new q₁ that makes F(q₁) smaller than F(q₀). The partial derivative of F(p₁, p₂, …, pₙ) with respect to pᵢ at q₀ is approximated by

Dᵢ = (F(q₀ + εeᵢ) − F(q₀)) / ε,

where eᵢ denotes the i-th standard basis vector. Thus, the gradient of F at q₀ is approximated by (D₁, D₂, …, Dₙ). Next, we set

E₁ = F(q₀ − α(D₁, D₂, …, Dₙ)),    E₂ = F(q₀).

If E₁ < E₂, we let q₁ = q₀ − α(D₁, D₂, …, Dₙ). Otherwise, we let q₁ = q₀ − (α/2)(D₁, D₂, …, Dₙ) and replace ε by ε/2 and α by α/2. We can start again using this q₁ in place of q₀ to determine q₂. Continuing this process, we obtain a vector sequence {qₘ}, m = 0, 1, 2, ….
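The loop above can be sketched in a few lines. The code below (Python with NumPy) is a minimal illustration, not the authors' implementation; A, B, m, s, T and the starting values of ε, α, q₀ are hypothetical choices (A is a fully actuated double integrator picked so that the descent converges quickly).

```python
import numpy as np

# Hypothetical problem data for illustration only.
A = np.array([[0.0, 1.0], [0.0, 0.0]])   # double integrator
B = np.eye(2)                            # fully actuated
m = np.array([1.0, 0.0])                 # initial state
s = np.array([0.0, 0.0])                 # target final state
T = 1.0

def final_state(q, n=100):
    # RK4 on x' = A x + (1/2) B B^T p, p' = -A^T p with x(0)=m, p(0)=q.
    h, y = T / n, np.concatenate([m, q])
    def f(y):
        x, p = y[:2], y[2:]
        return np.concatenate([A @ x + 0.5 * (B @ B.T) @ p, -A.T @ p])
    for _ in range(n):
        k1 = f(y); k2 = f(y + h / 2 * k1)
        k3 = f(y + h / 2 * k2); k4 = f(y + h * k3)
        y = y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    return y[:2]

def F(q):
    # F(q) = ||x(T; q) - s||^2
    return float(np.sum((final_state(q) - s) ** 2))

eps, alpha = 1e-6, 1.0
q = np.zeros(2)                          # q_0: arbitrary initial guess
for _ in range(100):
    Fq = F(q)
    if Fq < 1e-10:
        break
    # Forward-difference approximation D of the gradient of F at q.
    D = np.array([(F(q + eps * e) - Fq) / eps for e in np.eye(2)])
    q_try = q - alpha * D
    if F(q_try) < Fq:                    # E1 < E2: accept the full step
        q = q_try
    else:                                # half step, shrink eps and alpha
        q = q - (alpha / 2) * D
        eps, alpha = eps / 2, alpha / 2
```

With these illustrative choices the iteration drives F(q) below 10⁻¹⁰ within the iteration budget, so the trajectory started from (m, q) ends within numerical tolerance of s.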
The next task is to verify the convergence of the proposed algorithm. First, we prove the following theorem.

Theorem 3.1. The functional F defined in (9) is strictly convex as a function of q.
Proof. Let r = exp(AT)x(0) and

Q = ∫_0^T exp(A(T − t)) (1/2)BBᵀ exp(−Aᵀt) dt.

So the final state x(T) when the initial condition of the adjoint variable equals q is x(T; q) = r + Qq. Thus, x(T; q) is an affine function of q. Moreover, Q = (1/2) exp(AT) W(T), where W(T) = ∫_0^T exp(−At)BBᵀ exp(−Aᵀt) dt is the controllability Gramian, so Q is nonsingular because the system is controllable. Let q = λv + (1 − λ)w, where v, w ∈ R^n, v ≠ w, and λ satisfies 0 < λ < 1. Since Q is nonsingular, Qv ≠ Qw, so the points r + Qv − s and r + Qw − s are distinct. By the strict convexity of the squared norm,

F(λv + (1 − λ)w) = ‖λ(r + Qv − s) + (1 − λ)(r + Qw − s)‖² < λ‖r + Qv − s‖² + (1 − λ)‖r + Qw − s‖² = λF(v) + (1 − λ)F(w),

and we conclude that F is strictly convex with respect to q. □
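Strict convexity can also be spot-checked numerically using the affine form x(T; q) = r + Qq. In the Python sketch below, Q, r, s, v, w and λ are arbitrary hypothetical values, with Q invertible so that Qv ≠ Qw.

```python
import numpy as np

# Arbitrary invertible Q and arbitrary r, s (hypothetical values).
Q = np.array([[0.4, 0.25], [-0.25, 0.5]])
r = np.array([1.0, 0.0])
s = np.array([0.2, -0.3])

def F(q):
    # F(q) = ||x(T; q) - s||^2 with x(T; q) = r + Q q.
    return float(np.sum((r + Q @ q - s) ** 2))

v = np.array([1.0, -2.0])
w = np.array([0.5, 3.0])
lam = 0.3
# Strict convexity: the chord lies strictly above the graph when v != w.
assert F(lam * v + (1 - lam) * w) < lam * F(v) + (1 - lam) * F(w)
```

For this quadratic F the chord-minus-graph gap equals λ(1 − λ)‖Q(v − w)‖², which is positive exactly when Qv ≠ Qw, mirroring the role of the nonsingularity of Q in the proof.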
Next, starting from any initial point, we need a guarantee that the steepest descent sequence converges to the unique global minimizer of F. So, the following lemma is needed.
Lemma 2.1. If F is a strictly convex, coercive function with continuous first partial derivatives on R n , then for any initial point x(0), the steepest descent sequence with initial point x(0) converges to the unique global minimizer of F .
Interested readers may find the proof in [3]. Now, we are ready to state the main result of this paper in the next theorem.
Theorem 3.2. The sequence {qₘ} converges to a suitable initial condition p(0) that makes the final state x(T) equal to s.

Proof. By Theorem 3.1, the functional F is strictly convex with respect to q.
The functional F is also coercive with respect to q: since Q is nonsingular, ‖x(T; q) − s‖ → ∞ as ‖q‖ → ∞. By Lemma 2.1, we conclude that the steepest descent sequence {qₘ} converges to the unique global minimizer of F. Since Q is nonsingular, the equation r + Qq = s has a solution, so the minimum value of F is zero. This means that the sequence {x(T; qₘ)} converges to s as m goes to infinity. At the same time, the sequence {qₘ} converges to a suitable initial condition of the adjoint variable that makes the state x reach s at time T. □
In the next section, we give an example to validate the proposed method. It shows that the numerical results obtained by the proposed method are close to the theoretical solution.

AN EXAMPLE
Consider the following linear system, with the initial and final conditions given below. The cost functional that must be maximized is the analogue of J′ above. When v ≡ 0, the final state at t = 1 equals the prescribed final condition; thus, v ≡ 0 must be the exact optimal control. The Hamiltonian function of the above system is constructed as before, and from it the Hamiltonian system ẋ = ∂H/∂p, ṗ = −∂H/∂x is derived. By the Pontryagin Maximum Principle, a necessary condition for optimality is ∂H/∂v = 0.

Thus, one obtains the optimal control v in terms of the costate variables, and we have to solve the resulting system of differential equations. Since the exact optimal control in this case is the zero control, the exact values of p₁(0) and p₂(0) must be zero. We show that the proposed numerical method approaches these values as well.
We apply the proposed method to the above system. First, we let the error tolerance be 10⁻³. Setting an initial guess for (p₁(0), p₂(0)), the method produces a final state close to the terminal condition in (11). Second, we use a smaller error tolerance of 10⁻⁶. Using the same initial guess, after 1736 iterations we obtain

p₁(0) = 0.0019435296484999768101,
p₂(0) = −0.000088768635445718336227,

which is close to the exact values p₁(0) = p₂(0) = 0.
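The behaviour reported above can be mimicked in code. Since the paper's example system is not reproduced here, the Python sketch below uses a hypothetical fully actuated double integrator and chooses the final condition s = exp(AT)m, so that, as in the example, the zero control is optimal and the recovered costate initial condition should come out close to (0, 0).

```python
import numpy as np

# Hypothetical system, for illustration only.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.eye(2)
m = np.array([1.0, 2.0])
T = 1.0
s = (np.eye(2) + A * T) @ m      # s = exp(AT) m, so zero control is optimal

def final_state(q, n=100):
    # RK4 on x' = A x + (1/2) B B^T p, p' = -A^T p with x(0)=m, p(0)=q.
    h, y = T / n, np.concatenate([m, q])
    def f(y):
        x, p = y[:2], y[2:]
        return np.concatenate([A @ x + 0.5 * (B @ B.T) @ p, -A.T @ p])
    for _ in range(n):
        k1 = f(y); k2 = f(y + h / 2 * k1)
        k3 = f(y + h / 2 * k2); k4 = f(y + h * k3)
        y = y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    return y[:2]

def F(q):
    return float(np.sum((final_state(q) - s) ** 2))

eps, alpha = 1e-6, 1.0
q = np.array([0.5, -0.5])        # deliberately wrong initial guess
for _ in range(200):
    Fq = F(q)
    if Fq < 1e-12:
        break
    D = np.array([(F(q + eps * e) - Fq) / eps for e in np.eye(2)])
    q_try = q - alpha * D
    if F(q_try) < Fq:
        q = q_try
    else:
        q = q - (alpha / 2) * D
        eps, alpha = eps / 2, alpha / 2
# q now approximates the exact costate initial condition (0, 0).
```

As in the paper's experiment, tightening the tolerance drives the recovered initial costate toward the exact value, here the zero vector; the residual reflects the forward-difference bias in the gradient estimate.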