Application of Conjugate Gradient Approach for Nonlinear Optimal Control Problem with Model-Reality Differences
1. Introduction
Recently, the integrated optimal control and parameter estimation (IOCPE) algorithm was proposed [1] for solving nonlinear optimal control problems in discrete time, in both the deterministic and stochastic cases (see [2] - [9] for more detail). In essence, the IOCPE algorithm derives from the dynamic integrated system optimization and parameter estimation (DISOPE) algorithm, which was developed by [10] . With the DISOPE algorithm, optimal control of deterministic dynamical systems, in continuous time as well as in discrete time, has been widely discussed [10] [11] . From this point of view, the applications of the DISOPE algorithm are well established. Dating back to the 1970s, [12] and [13] proposed the integrated system optimization and parameter estimation (ISOPE) algorithm for solving static optimization problems. Since then, the dynamic version of the ISOPE algorithm has developed rapidly.
In fact, the basic idea behind the ISOPE, DISOPE and IOCPE algorithms is the principle of model-reality differences [1] [10] [13] . Because the structure of the nonlinear optimal control problem is complex and solving it is computationally demanding, a simplified model of the original optimal control problem is solved iteratively instead. By adding adjusted parameters to the model used, the differences between the model and the real plant can be measured. This measurement is repeated, in turn, to update the optimal solution of the model used. Once convergence is achieved, the iterative solution approximates the true optimal solution of the original optimal control problem, in spite of model-reality differences [1] [10] [13] . In addition, for solving the discrete-time nonlinear stochastic optimal control problem, Kalman filtering theory is combined with the principle of model-reality differences to perform state estimation and system optimization [2] [3] [4] [6] .
Through the evolution of these algorithms, a feedback optimal control law has been provided for solving nonlinear optimal control problems, and its effectiveness has been well confirmed. Nevertheless, the applicability of an open-loop optimal control law in these algorithms should be investigated so that the algorithms can be used more widely. This is because open-loop optimal control sequences can be generated by exploiting state-of-the-art nonlinear programming (NLP) solvers. In particular, as an efficient optimization technique, the conjugate gradient method [14] [15] has been used to solve optimal control problems [16] [17] [18] over the last few decades. This motivates us to explore the use of the conjugate gradient method within the IOCPE algorithm.
Hence, this paper discusses the application of the conjugate gradient method to the nonlinear optimal control problem in which model-reality differences are considered. First, the model-based optimal control problem, which is simplified from the nonlinear optimal control problem, is constructed. Following from this, the Hamiltonian function is defined and the augmented cost function is obtained. Then, the set of necessary conditions for optimality is derived. Consequently, the modified model-based optimal control problem is converted into a nonlinear optimization problem. By applying the conjugate gradient approach, the nonlinear optimization problem is solved and the optimal control sequences are generated. With this open-loop control law, the dynamical system is optimized and the cost function is evaluated. For illustration, optimal control of an economic growth problem [19] is discussed. The results obtained show the applicability of the proposed algorithm.
The paper is organized as follows. In Section 2, the problem statement is described briefly, and the simplified model of the nonlinear optimal control problem is discussed. In Section 3, system optimization with parameter estimation is discussed further: the use of the conjugate gradient approach in solving the model-based optimal control problem is presented, and the calculation procedure is summarized as an iterative algorithm. In Section 4, an economic growth problem is solved and the results are reported. Finally, concluding remarks are made.
2. Problem Statement
Consider a general discrete-time optimal control problem, given by
(1)
where
and
are, respectively, the control sequences and the state sequences. Here,
represents the real plant,
is the cost under summation and
is the terminal cost, whereas
is the scalar cost function and
is the known initial state vector. It is assumed that all functions in Equation (1) are continuously differentiable with respect to their respective arguments.
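As a hedged sketch only, a standard discrete-time form that is consistent with the quantities described for Equation (1) could be written as follows. The symbols $J_0$, $\varphi$, $x_0$ and the horizon $N$ are assumed notation introduced here for illustration; only $f$ and $L$ are named elsewhere in the paper.

```latex
\min_{u(k)} \; J_0(u) = \varphi\big(x(N), N\big) + \sum_{k=0}^{N-1} L\big(x(k), u(k), k\big)
\quad \text{subject to} \quad
x(k+1) = f\big(x(k), u(k), k\big), \qquad x(0) = x_0 .
```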
This problem is regarded as the real optimal control problem, and is referred to as Problem (P). Because the structure of Problem (P) is complex and nonlinear, solving it requires efficient computational techniques. From this point of view, a simplified model of Problem (P) is solved instead in order to approximate the true optimal solution of Problem (P). Therefore, let us define this simplified model-based optimal control problem as follows:
(2)
where
and
are introduced as the adjusted parameters, whereas A is an
transition matrix and B is an
control coefficient matrix. Besides,
and Q are
positive semi-definite matrices, and R is a
positive definite matrix. Here,
is the scalar cost function.
This problem is referred to as Problem (M).
Notice that, because of the different structures and parameters, solving Problem (M) alone, without the adjusted parameters, would not yield the optimal solution of Problem (P). However, by adding the adjusted parameters to Problem (M), the differences between the real plant and the model used can be calculated. In this way, solving Problem (M) iteratively can give the correct optimal solution of Problem (P), in spite of model-reality differences.
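For orientation, one linear-quadratic form that matches the description of Problem (M) is sketched below. It uses the matrices $A$, $B$, $Q$, $R$ and the adjusted parameters $\alpha(k)$ and $\gamma(k)$ named in the paper. The terminal weight $S(N)$, the label $J_1$, and the exact placement of the adjusted parameters are assumptions made for illustration.

```latex
\min_{u(k)} \; J_1(u) = \tfrac{1}{2}\, x(N)^{\top} S(N)\, x(N) + \gamma(N)
  + \sum_{k=0}^{N-1} \Big[ \tfrac{1}{2}\big( x(k)^{\top} Q\, x(k) + u(k)^{\top} R\, u(k) \big) + \gamma(k) \Big]
\quad \text{subject to} \quad
x(k+1) = A\, x(k) + B\, u(k) + \alpha(k), \qquad x(0) = x_0 .
```

In this form, $\alpha(k)$ absorbs the mismatch between the linear model and the plant dynamics, and $\gamma(k)$ absorbs the mismatch between the quadratic cost and the plant cost.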
3. System Optimization with Parameter Estimation
Now, introduce an expanded optimal control problem, which is referred to as Problem (E), given by
(3)
where
and
are introduced to separate the sequences of control and state in the optimization problem from the respective signals in the parameter estimation problem, and
$\|\cdot\|$ denotes the usual Euclidean norm. The terms
and
with
are introduced to improve the convexity and to facilitate the convergence of the resulting iterative algorithm. Here, it is noted that the algorithm is designed such that the constraints
and
are satisfied upon termination of the iterations, assuming that convergence is achieved. Moreover, the state constraint
and the control constraint
are used for the computation of the parameter estimation and matching scheme, while the corresponding state constraint
and control constraint
are reserved for optimizing the model-based optimal control problem. By virtue of this, system optimization and parameter estimation are mutually integrated.
3.1. Necessary Conditions for Optimality
Define the Hamiltonian function for Problem (E) by
(4)
where
,
and
are modifiers. Then, the augmented cost function becomes
(5)
where
and
are the appropriate multipliers to be determined later.
Applying the calculus of variations [20] [21] [22] , the following necessary conditions for optimality are obtained:
1) Stationary condition:
(6a)
2) Co-state equation:
(6b)
3) State equation:
(6c)
4) Boundary conditions:
and
(6d)
5) Adjusted parameter equations:
(7a)
(7b)
(7c)
6) Modifier equations:
(8a)
(8b)
(8c)
with
and
.
7) Separable variables:
,
,
. (9)
Notice that the parameter estimation problem is defined by Equation (7) and the computation of the multipliers is given by Equation (8). The necessary conditions defined by Equations (6a) to (6d) are the optimality conditions for the modified model-based optimal control problem.
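The parameter estimation step can be illustrated with a short Python sketch. It assumes that the adjusted parameters are chosen so that the linear model dynamics and the quadratic stage cost agree with the plant along a given trajectory, that is, α(k) = f(z, v, k) − A z − B v and γ(k) = L(z, v, k) − ½(zᵀQz + vᵀRv). This matching form, the function name `estimate_parameters`, and the scalar plant used to exercise it are illustrative assumptions rather than the paper's exact equations.

```python
import numpy as np

def estimate_parameters(f, L, A, B, Q, R, z, v):
    """Match the linear-quadratic model to the plant along (z, v).

    z has shape (N+1, n) and v has shape (N, m). Returns alpha with shape
    (N, n) and gamma with shape (N,), so that the model dynamics and stage
    cost equal the plant's at every step (assumed form of the matching).
    """
    N = v.shape[0]
    alpha = np.zeros((N, z.shape[1]))
    gamma = np.zeros(N)
    for k in range(N):
        alpha[k] = f(z[k], v[k], k) - (A @ z[k] + B @ v[k])
        gamma[k] = L(z[k], v[k], k) - 0.5 * (z[k] @ Q @ z[k] + v[k] @ R @ v[k])
    return alpha, gamma

# Hypothetical scalar plant, used only to exercise the function.
f = lambda x, u, k: np.array([np.sin(x[0]) + u[0]])
L = lambda x, u, k: 0.5 * (x[0] ** 4 + u[0] ** 2)
A = np.array([[1.0]])
B = np.array([[1.0]])
Q = np.array([[1.0]])
R = np.array([[1.0]])
z = np.array([[0.5], [0.2], [0.1]])
v = np.array([[-0.3], [-0.1]])
alpha, gamma = estimate_parameters(f, L, A, B, Q, R, z, v)
```

With these parameters, the model reproduces the plant's one-step prediction and stage cost at each point of the trajectory, which is the sense in which the differences between model and reality are measured.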
3.2. Modified Model-Based Optimal Control Problem
The modified model-based optimal control problem, which is referred to as Problem (MM), is given by
(10)
with the specified
and
, where the boundary conditions are given by
and
with the specified multiplier
.
3.3. Open-Loop Optimal Control Law
For simplicity, define Problem (MM) as an equivalent nonlinear optimization problem with the initial control
, given by
subject to
for
(11)
where the admissible control variable u is set to be
.
Let this problem be referred to as Problem (N). To proceed, note that Problem (N) can be solved once the state Equation (6c) is solved forward and the costate Equation (6b) is solved backward with the corresponding control sequences u. In addition, the gradient function for the objective function
is evaluated from
(12)
which can be calculated from the Hamiltonian function (4) and the stationary condition (6a) once the necessary conditions for optimality, given by Equations (6) - (9), are satisfied.
Suppose the gradient function (12) is represented as
. (13)
Then, for arbitrary initial control
, the initial gradient and the initial direction are, respectively, given by
and
. (14)
By using the line search equation [14] [16] , the control sequences can be generated from
(15)
where
is determined from the one-dimensional search, that is,
. (16)
Then, the gradient and the direction are updated as follows:
(17)
(18)
with the coefficient
(19)
where
represents the iteration numbers.
Thus, the result on obtaining the optimal control law discussed above is presented as the following proposition:
Proposition 1. Consider Problem (N). The control sequences
, which are defined in Equation (15) and are represented by
,
are generated through a set of direction vectors
whose components are linearly independent. Also, the directions
are conjugate.
Proof: See [14] .
Here, the conjugate gradient algorithm for obtaining the optimal control law is summarized below:
Algorithm 1: Conjugate gradient algorithm
Data Choose an arbitrary initial control
. Compute the initial gradient
and the initial direction
from Equation (14). Set
= 0.
Step 1 Solve the state Equation (6c) forward in time from
= 0 to
=
with the initial condition (6d) to obtain
.
Step 2 Solve the costate Equation (6b) backward in time from
=
to
= 0 with the boundary condition (6d), where
is the solution obtained.
Step 3 Calculate the value of the cost functional
from Equation (10).
Step 4 Determine the step size
from Equation (16).
Step 5 Update the control
from Equation (15).
Step 6 Update the gradient
from Equation (17). If the gradient
, stop, else go to Step 7.
Step 7 Compute the coefficient
from Equation (19).
Step 8 Update the direction
from Equation (18). Set
, go to Step 1.
Remarks:
1) The initial control
can be any vector, including the zero vector.
2) The gradient function
for Problem (N) defined by Equation (11) is calculated from the stationary condition (6a). This is the key step in using the conjugate gradient algorithm for solving Problem (M) defined by Equation (2) and Problem (MM) defined by Equation (10).
3) The optimal control sequences generated by the line search equation in Equation (15) are known as the open-loop control law.
4) The necessary conditions (6b) and (6c) shall be satisfied in solving Problem (N) defined by Equation (11).
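The steps of Algorithm 1 can be sketched in Python for a linear-quadratic model with an additive parameter α(k). This is a minimal sketch under stated assumptions, not the authors' implementation: the modifier terms and γ(k) are omitted, the gradient is taken as R u(k) + Bᵀp(k+1), the coefficient is the Fletcher-Reeves ratio, and the one-dimensional search is done in closed form because the cost is quadratic in the step size. All function names and the numerical values in the example are hypothetical.

```python
import numpy as np

def rollout(A, B, alpha, x0, u):
    """Solve the state equation forward in time."""
    x = np.zeros((u.shape[0] + 1, x0.size))
    x[0] = x0
    for k in range(u.shape[0]):
        x[k + 1] = A @ x[k] + B @ u[k] + alpha[k]
    return x

def cost(S, Q, R, x, u):
    """Quadratic cost with terminal weight S and stage weights Q, R."""
    stage = sum(x[k] @ Q @ x[k] + u[k] @ R @ u[k] for k in range(u.shape[0]))
    return 0.5 * (x[-1] @ S @ x[-1] + stage)

def gradient(A, B, S, Q, R, x, u):
    """Solve the costate equation backward; return R u(k) + B^T p(k+1)."""
    N = u.shape[0]
    g = np.zeros_like(u)
    p = S @ x[N]                      # boundary condition p(N)
    for k in range(N - 1, -1, -1):
        g[k] = R @ u[k] + B.T @ p     # stationarity residual at step k
        p = Q @ x[k] + A.T @ p        # p(k) computed from p(k+1)
    return g

def conjugate_gradient(A, B, S, Q, R, alpha, x0, u, tol=1e-10, max_iter=200):
    x = rollout(A, B, alpha, x0, u)
    g = gradient(A, B, S, Q, R, x, u)
    d = -g                            # initial direction
    for _ in range(max_iter):
        if np.sqrt(np.sum(g * g)) < tol:
            break
        # Exact line search: curvature d^T H d from the response to d alone.
        dx = rollout(A, B, np.zeros_like(alpha), np.zeros_like(x0), d)
        curvature = 2.0 * cost(S, Q, R, dx, d)
        step = -np.sum(g * d) / curvature
        u = u + step * d
        x = rollout(A, B, alpha, x0, u)
        g_new = gradient(A, B, S, Q, R, x, u)
        beta = np.sum(g_new * g_new) / np.sum(g * g)   # Fletcher-Reeves
        d = -g_new + beta * d
        g = g_new
    return u, x, g

# Hypothetical double-integrator example with zero initial control.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
S, Q, R = np.eye(2), np.eye(2), np.array([[0.1]])
N = 20
alpha = np.zeros((N, 2))
x0 = np.array([1.0, 0.0])
u0 = np.zeros((N, 1))
J_zero = cost(S, Q, R, rollout(A, B, alpha, x0, u0), u0)
u_opt, x_opt, g_opt = conjugate_gradient(A, B, S, Q, R, alpha, x0, u0)
J_opt = cost(S, Q, R, x_opt, u_opt)
```

For a general nonlinear objective the closed-form step would be replaced by a numerical one-dimensional search, and the gradient would include the modifier terms from the stationary condition.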
3.4. Iterative Procedure
Accordingly, from the discussion above, the calculation procedure for integrated system optimization and parameter estimation is summarized as follows:
Algorithm 2: Iterative procedure
Data
. Note that A and B could be determined based on the linearization of
at
or from the linear terms of
.
Step 0 Compute a nominal solution. Assume that
and
. Solve Problem (M) defined by Equation (2) to obtain
and
. Then, with
and using
from the data. Set
,
,
and
.
Step 1 Compute the parameters
and
from Equation (7). This is called the parameter estimation step.
Step 2 Compute the modifiers
and
from Equation (8). Notice that this step requires taking the derivatives of f and L with respect to
and
.
Step 3 With
and
, solve Problem (N) by using Algorithm 1. This is called the system optimization step.
a) Use Equation (15) to obtain the new control
.
b) Use Equation (6c) to obtain the new state
.
c) Use Equation (6b) to obtain the new costate
.
Step 4 Test the convergence and update the optimal solution of Problem (P). In order to provide a mechanism for regulating convergence, a simple relaxation method is employed:
(20a)
(20b)
(20c)
where
are scalar gains. If
and
, within a given tolerance, stop; else set
, and repeat the procedure starting from Step 1.
Remarks:
1) The variable
is zero in Step 0. The calculated value of
changes from iteration to iteration during the calculation procedure.
2) The conjugate gradient algorithm is applied to generate the control sequences
for Problem (M) and Problem (MM), respectively.
3) Problem (P) need not be linear or have a quadratic cost function.
4) The conditions
and
are required to be satisfied for the converged optimal control sequence and the converged state sequence. The following averaged 2-norms are computed and compared with a given tolerance to verify the convergence of
and
:
(21a)
(21b)
5) The convergence result for the conjugate gradient algorithm can be found in [14] , and the convergence result for Algorithm 2 is presented in [4] and [10] .
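The relaxation update and the convergence test in Step 4 can be illustrated with the following Python sketch. It assumes the relaxation takes the common form "old + gain × (new − old)" and that the averaged 2-norm is the root of the time-averaged squared Euclidean distance between two sequences. Both forms, the helper names and the sample values are assumptions for illustration, since the exact expressions of Equations (20) and (21) may differ.

```python
import numpy as np

def relax(old, new, gain):
    """Move the estimate a fraction `gain` of the way toward the new iterate."""
    return old + gain * (new - old)

def averaged_norm(a, b):
    """Root of the time-averaged squared Euclidean distance between sequences."""
    diff = a - b
    return np.sqrt(np.mean(np.sum(diff * diff, axis=1)))

# Hypothetical sequences: v is the estimation-side control, u the new control.
v = np.zeros((4, 1))
u = np.ones((4, 1))
v_next = relax(v, u, 0.5)          # scalar gain of 0.5
gap = averaged_norm(u, v_next)     # distance still to be closed
converged = gap < 1e-6             # tolerance chosen for illustration
```

A gain of 1 accepts the new iterate in full, while a smaller gain damps the update, which is how the scalar gains regulate convergence.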
4. Illustrative Example
Consider a basic economic growth model [19] [23] , which is a discrete time minimization problem, given by
where the payoff function and the dynamical system are, respectively, defined by
and
.
Here, x is the capital stock,
is the control variable,
is the discount factor, whereas
is a production function with constants
,
. The difference between the output and the next period’s capital stock is the consumption.
Let us refer to this problem as Problem (P). In the literature, the exact solution of Problem (P) is known [24] , and is given by
with
and
.
The unique optimal equilibrium for Problem (P) is given by
.
By using the specified parameters
,
and
, the optimal equilibrium is
[23] .
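As a numerical check, the equilibrium can be computed with a short Python sketch. The sketch assumes the standard logarithmic-payoff growth model with production function A·x^α and the closed-form policy x(k+1) = αβA·x(k)^α, whose fixed point is (αβA)^(1/(1−α)). The values A = 5, α = 0.34 and β = 0.95 are assumptions, chosen because they reproduce the steady state x = 2.0673 reported in this section.

```python
# Assumed parameter values; they are not stated explicitly in the text.
A, alpha, beta = 5.0, 0.34, 0.95

# Fixed point of the assumed closed-form policy x(k+1) = alpha*beta*A*x(k)**alpha.
x_star = (alpha * beta * A) ** (1.0 / (1.0 - alpha))

# Iterate the policy from an arbitrary positive capital stock.
x = 1.0
for _ in range(50):
    x = alpha * beta * A * x ** alpha
```

Under these assumptions, the iterates approach the same steady state from any positive initial capital stock, because the exponent α is less than one.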
In the following, we introduce a simplified model-based optimal control problem, which is derived from Problem (P) and is referred to as Problem (M), given below:
Note that Problem (M) and Problem (P) differ in their structures and in the parameters used.
After running the proposed algorithm with a tolerance of $10^{-6}$, the result is shown in Table 1. The initial cost, 13.072, is the cost before system optimization with parameter estimation is taken into account. On completion of the algorithm, the final cost is 22.198. Convergence is reached after 41 iterations in 7.84 seconds.
The graphical results for this economic growth model illustrate the application of the proposed algorithm. Figure 1 shows the final control trajectory and Figure 2 shows the final state trajectory. With this final control solution, the final state is observed to tend toward the steady state at x = 2.0673. Figure 3 shows the final costate trajectory, while Figure 4 and Figure 5 show, respectively, the adjusted parameters γ(k) and α(k). Overall, these solutions are optimal, as verified by the satisfaction of the stationary condition shown in Figure 6.
Figure 1. Final control trajectory.
Figure 2. Final state trajectory (--) and state equilibrium (×××××).
Figure 3. Final costate trajectory.
5. Concluding Remarks
The use of the conjugate gradient approach in solving the nonlinear optimal control problem with model-reality differences was discussed in this paper. Essentially, a simplified model of the original optimal control problem, namely a linear optimal control problem with added adjusted parameters, is formulated. In solving this model-based optimal control problem, the conjugate gradient approach is employed to generate the open-loop control sequences such that the optimal solution is obtained. Here, the stationary condition is used as the gradient function in the conjugate gradient approach. In addition, because the two problems have different structures, the differences between the real plant and the model used, which are measured repeatedly by the adjusted parameters, are taken into account during the iterative calculation procedure. At convergence, the optimal solution of the model used approximates the true optimal solution of the original optimal control problem, in spite of model-reality differences. For illustration, the application of the proposed algorithm to a basic economic growth model was discussed. The results obtained show the efficiency of the proposed algorithm. In conclusion, the algorithm is applicable and is recommended for solving problems of this kind.
Acknowledgements
The authors would like to acknowledge the Universiti Tun Hussein Onn Malaysia (UTHM) and the Ministry of Higher Education (MOHE) for the financial support for this study under the research grant FRGS VOT. 1561.
Conflicts of Interest
The authors declare no conflicts of interest regarding the publication of this paper.