OUTPUT REGULATION FOR DISCRETE-TIME NONLINEAR STOCHASTIC OPTIMAL CONTROL PROBLEMS WITH MODEL-REALITY DIFFERENCES

In this paper, we propose an output regulation approach, based on the principle of model-reality differences, to obtain the optimal output measurement of a discrete-time nonlinear stochastic optimal control problem. In our approach, a model-based optimal control problem with adjustable parameters is considered. We aim to regulate the optimal output trajectory of the model used as closely as possible to the output measurement of the original optimal control problem. To do so, an expanded optimal control problem is introduced, in which system optimization and parameter estimation are integrated. During the computation procedure, the differences between the real plant and the model used are measured repeatedly, and the optimal solution of the model is updated accordingly. Upon convergence of the iterations, the solution obtained approaches the true optimal solution of the original optimal control problem in spite of model-reality differences. It is worth noting that the resulting algorithm can give an output residual smaller than that obtained from Kalman filtering theory, so the output regulation is highly accurate. For illustration, a continuous stirred-tank reactor problem is studied. The results obtained show the efficiency of the proposed approach.


1. Introduction.
Recently, an integrated optimal control algorithm for solving discrete-time nonlinear stochastic optimal control problems has been proposed; see, for example, [3], [9], [10] and [4]. The developed algorithm is an iterative approach in which a model-based optimal control problem is solved repeatedly in order to approximate the true optimal solution of the original optimal control problem. With the adjustable parameters introduced in the model, the differences between the real plant and the model used can be measured. The repetitive solution then converges to the real optimal solution within a given tolerance in spite of model-reality differences [11], [12], [1]. On the other hand, because of the presence of random disturbances, an optimal filtering solution of the discrete-time nonlinear stochastic optimal control problem is obtained by solving the modified linear quadratic Gaussian optimal control problem repeatedly [5]. In addition, a least-squares output residual is introduced in the cost functional such that the output error is further minimized [6].
However, minimizing the output error does not necessarily give the minimum value of the cost function, owing to the weighting parameter selected for the least-squares output residual in the model. In this paper, we propose an efficient computational approach to overcome this limitation. In our approach, a linear quadratic regulator optimal control model is considered, where the state and control trajectories are smoothed in the expectation sense. Moreover, an adjustable parameter is introduced into the model output, which is measured from the expected state trajectory. The aim of this adjustable parameter is to regulate the expected output as closely as possible to the real output, thereby giving the smallest minimum output error. Note that the Kalman filtering theory is not applied here. It is remarked that, despite model-reality differences, the proposed approach gives both the optimal expected solution and the optimal regulated output at the end of the iterative computation procedure. Hence, the output solution obtained is highly accurate.
The rest of the paper is organized as follows. In Section 2, a general class of discrete-time nonlinear stochastic optimal control problems is described. In Section 3, the model-based optimal control problem with the adjustable parameters is discussed. The optimal expected solution is obtained, and the expected output is then regulated approximately to the real output in spite of model-reality differences. In Section 4, an illustrative example of a continuous stirred-tank reactor problem is presented to show the efficiency of the proposed approach. Finally, some concluding remarks are made.
2. Problem Description. Consider a general class of stochastic optimal control problems given below:

min J_0(u) = E[ ϕ(x(N), N) + Σ_{k=0}^{N−1} L(x(k), u(k), k) ]

subject to

x(k + 1) = f(x(k), u(k), k) + Gω(k),
y(k) = h(x(k), k) + η(k),                                (1)

where u(k) ∈ ℝ^m, k = 0, 1, ..., N − 1, x(k) ∈ ℝ^n, k = 0, 1, ..., N, and y(k) ∈ ℝ^p, k = 0, 1, ..., N, are, respectively, the control sequence, the state sequence, and the measured output. The terms ω(k) ∈ ℝ^q, k = 0, 1, ..., N − 1, and η(k) ∈ ℝ^p, k = 0, 1, ..., N, are stationary Gaussian white noise sequences with zero mean, and their covariance matrices are given by Q_ω and R_η, respectively, where Q_ω is a q × q positive definite matrix and R_η is a p × p positive definite matrix. G is an n × q process noise coefficient matrix, f : ℝ^n × ℝ^m × ℝ → ℝ^n represents the real plant, and h : ℝ^n × ℝ → ℝ^p is the output measurement, whereas ϕ : ℝ^n × ℝ → ℝ is the terminal cost and L : ℝ^n × ℝ^m × ℝ → ℝ is the cost under summation. Here, J_0 is the scalar cost function and E[·] is the expectation operator. It is assumed that all functions in (1) are continuously differentiable with respect to their respective arguments.
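As a concrete illustration of the plant structure in Problem (P), the following Python sketch simulates one noisy realization of x(k+1) = f(x(k), u(k), k) + Gω(k) and y(k) = h(x(k), k) + η(k). The dynamics f, the output map h, the dimensions and the noise levels here are illustrative assumptions, not the plant studied later in the paper.

```python
import numpy as np

# One realization of the real plant in Problem (P):
#   x(k+1) = f(x(k), u(k), k) + G w(k),   y(k) = h(x(k), k) + eta(k).
# f and h below are hypothetical placeholders.

rng = np.random.default_rng(0)
N, n, m, p, q = 20, 2, 1, 1, 2

G = np.eye(n)                        # n x q process-noise coefficient matrix
Qw = 1e-3 * np.eye(q)                # process-noise covariance Q_w
Reta = 1e-3 * np.eye(p)              # measurement-noise covariance R_eta

def f(x, u, k):                      # hypothetical nonlinear dynamics
    return np.array([x[0] + 0.1 * x[1],
                     x[1] + 0.1 * (-np.sin(x[0]) + u[0])])

def h(x, k):                         # hypothetical output map
    return np.array([x[0]])

x = np.array([0.05, 0.0])            # initial state (mean of x0)
ys = []
for k in range(N):
    y = h(x, k) + rng.multivariate_normal(np.zeros(p), Reta)
    ys.append(y)                     # measured output y(k)
    u = np.zeros(m)                  # zero control, for illustration only
    w = rng.multivariate_normal(np.zeros(q), Qw)
    x = f(x, u, k) + G @ w           # propagate the state one step
```

Because ω(k) and η(k) enter every step, two runs of this loop with different seeds produce different measured outputs, which is precisely why the exact solution of Problem (P) is generally unobtainable.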
The initial state is

x(0) = x_0,

where x_0 ∈ ℝ^n is a random vector with mean and covariance given, respectively, by

E[x_0] = x̄_0,    E[(x_0 − x̄_0)(x_0 − x̄_0)^T] = M_0.

Here, M_0 is an n × n positive definite matrix. It is assumed that the initial state, the process noise and the measurement noise are statistically independent. This optimal control problem is regarded as the real optimal control problem and is referred to as Problem (P). We note that the exact solution of Problem (P) is, in general, unobtainable. Furthermore, applying nonlinear filtering theory to estimate the state of the real plant is computationally demanding. In view of this, we propose to solve Problem (P) by solving a simplified model-based optimal control problem iteratively. Let this simplified model-based optimal control problem, referred to as Problem (M), be given below:

min J_1(u) = (1/2) x̄(N)^T S(N) x̄(N) + (1/2) Σ_{k=0}^{N−1} [ x̄(k)^T Q x̄(k) + u(k)^T R u(k) ]

subject to

x̄(k + 1) = A x̄(k) + B u(k),    x̄(0) = x̄_0,
ȳ(k) = C x̄(k),
y(k) = ȳ(k) + γ(k),

where x̄(k) ∈ ℝ^n, k = 0, 1, ..., N, ȳ(k) ∈ ℝ^p, k = 0, 1, ..., N, and y(k) ∈ ℝ^p, k = 0, 1, ..., N, are, respectively, the expected state sequence, the expected output sequence and the regulated output sequence, and γ(k) ∈ ℝ^p is the adjustable parameter. A is an n × n state transition matrix, B is an n × m control coefficient matrix, C is a p × n output coefficient matrix, while S(N) and Q are n × n positive semi-definite matrices and R is an m × m positive definite matrix. Notice that solving Problem (M) iteratively gives the optimal expected output solution of Problem (P), given by ȳ(k), and the optimal regulated output solution of Problem (P), represented by y(k). Since the Kalman filtering theory is not used here, the state error covariance is larger than the state error covariance presented in [5]. However, the additional output measurement, which is added into the model, regulates the expected output sequence as closely as possible to the real output trajectory. Through this regulation procedure, we aim to approximate the true output trajectory of Problem (P).
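The linear model of Problem (M) can be rolled out as in the sketch below. The adjustable parameter γ(k) shifts the expected output to produce the regulated output; an additional parameter α(k) in the state equation is also shown, since that is one common placement in model-reality-difference schemes, but the paper's exact parameterization may differ. All matrix values are illustrative.

```python
import numpy as np

n, m, p, N = 2, 1, 1, 20
A = np.array([[1.0, 0.1], [0.0, 1.0]])   # n x n state transition matrix
B = np.array([[0.0], [0.1]])             # n x m control coefficient matrix
C = np.array([[1.0, 0.0]])               # p x n output coefficient matrix

alpha = np.zeros((N, n))                 # assumed state-equation adjustment
gamma = np.zeros((N + 1, p))             # adjustable output parameter gamma(k)

def rollout(xbar0, u_seq):
    """Roll out the expected state, expected output and regulated output."""
    xbar = np.zeros((N + 1, n)); xbar[0] = xbar0
    ybar = np.zeros((N + 1, p))
    yreg = np.zeros((N + 1, p))
    for k in range(N):
        ybar[k] = C @ xbar[k]            # expected output ybar(k) = C xbar(k)
        yreg[k] = ybar[k] + gamma[k]     # regulated output y(k) = ybar(k) + gamma(k)
        xbar[k + 1] = A @ xbar[k] + B @ u_seq[k] + alpha[k]
    ybar[N] = C @ xbar[N]
    yreg[N] = ybar[N] + gamma[N]
    return xbar, ybar, yreg
```

With α(k) and γ(k) both zero this is an ordinary deterministic linear rollout; the iterative scheme described next updates these parameters so that the model trajectory tracks the real plant.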
3. Output Regulation with Model-Reality Differences. Now, let us introduce an expanded optimal control problem, referred to as Problem (E), in which the quadratic penalty terms (r_1/2)‖u(k) − v(k)‖^2, (r_2/2)‖x̄(k) − z(k)‖^2 and (r_3/2)‖ȳ(k) − ŷ(k)‖^2, with r_1 ∈ ℝ, r_2 ∈ ℝ and r_3 ∈ ℝ, are introduced to improve convexity and to facilitate convergence of the resulting iterative algorithm. It is important to note that the algorithm is designed such that the constraints v(k) = u(k), z(k) = x̄(k) and ŷ(k) = ȳ(k) are satisfied upon termination of the iterations, assuming that convergence is achieved. The state constraint z(k), the output constraint ŷ(k) and the control constraint v(k) are used in the parameter estimation and matching schemes, while the corresponding expected state x̄(k), expected output ȳ(k) and control u(k) are used in the optimization of the model-based optimal control problem. On this basis, the system optimization and the parameter estimation are mutually interactive.
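The convexifying penalties of Problem (E) can be evaluated as below. This is a minimal sketch of the penalty part of the expanded cost only; the variable names follow the text, and the weighting values passed in are the user's choice.

```python
import numpy as np

# Penalty part of the expanded cost in Problem (E):
#   (r1/2)||u - v||^2 + (r2/2)||xbar - z||^2 + (r3/2)||ybar - yhat||^2.
# At convergence v = u, z = xbar and yhat = ybar, so the penalty vanishes.

def convexification_penalty(u, v, xbar, z, ybar, yhat, r1, r2, r3):
    penalty = 0.5 * r1 * np.sum((np.asarray(u) - np.asarray(v)) ** 2)
    penalty += 0.5 * r2 * np.sum((np.asarray(xbar) - np.asarray(z)) ** 2)
    penalty += 0.5 * r3 * np.sum((np.asarray(ybar) - np.asarray(yhat)) ** 2)
    return penalty
```

Because each term is a squared mismatch between an optimization variable and its parameter-estimation counterpart, the penalty is zero exactly when the matching constraints of Problem (E) hold, which is what makes the added terms harmless at the converged solution.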
Note that Problem (E) is equivalent to Problem (P) in the sense that it estimates the solution of Problem (P) in terms of expectation and regulation.
3.1. Necessary optimality conditions. Applying the calculus of variations [4], [5], [6], [2], [8], the necessary optimality conditions are obtained, comprising the stationary condition (6a), the co-state equation (6b), the state equation (6c), the adjustable parameter equations (8) and the multiplier equations (9). In view of these optimality conditions, the multipliers are computed from (9); the parameter estimation problem, in which the adjustable parameters are calculated, is defined by (8); and the modified model-based optimal control problem, which satisfies the optimality conditions (6) and (7), is given below. This modified model-based optimal control problem is referred to as Problem (MM); it minimizes the modified cost subject to the model constraints (11), where the boundary conditions x̄(0) and p(N) are given and the modifier Γ is specified. Note that it is essential to include the modification terms λ(k)^T u(k), β(k)^T x̄(k) and θ_1(k)^T ȳ(k) in the cost function of Problem (MM). Otherwise, the correct solution estimate of Problem (P) cannot be obtained by simply iterating the solution of Problem (M) and performing parameter estimation at every iteration step. In addition, to obtain the solution of Problem (MM), it is necessary to solve the two-point boundary-value problem (TPBVP) defined by (6b) and (6c).
3.2. Feedback control law. The method for solving Problem (MM) is described in Theorem 3.1, from which the feedback control law results.

Theorem 3.1. Suppose the optimal control law for Problem (E) exists. Then, this control law is the feedback control law for Problem (MM) given by (12), in which S(k) and s(k) satisfy (13a)-(13d) with the boundary conditions S(N) given and s(N) = 0.

Proof. From the necessary optimality condition (6a), we obtain (14). Applying the sweep method [2], [8], we substitute (15), evaluated at k + 1, into (14), which yields (16). Then, substituting the expected state equation (6c) into (16) and performing some algebraic manipulations, the feedback control law (12) is obtained, where (13a) and (13b) are satisfied.
From the co-state equation (6b), we substitute (15), evaluated at k + 1, to give (17). Using the expected state equation (6c) in (17), we obtain (18). Substituting the feedback control law (12) into (18) and performing some algebraic manipulations, it is found, after comparing with (15), that (13c) and (13d) are satisfied. This completes the proof.
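When the modifier and adjustable-parameter terms of Problem (MM) are dropped, the backward sweep in the proof of Theorem 3.1 reduces to the standard discrete-time Riccati recursion. The following is a minimal sketch of that simplified recursion, with our own function and variable names rather than the paper's full equations (13a)-(13d):

```python
import numpy as np

# Simplified backward sweep: given A, B, the weights Q, R and the terminal
# matrix S(N), compute the Riccati matrices S(k) and feedback gains K(k) of
# the standard LQR law u(k) = -K(k) xbar(k).

def riccati_sweep(A, B, Q, R, S_N, N):
    S = [None] * (N + 1)
    K = [None] * N
    S[N] = S_N
    for k in range(N - 1, -1, -1):
        M = R + B.T @ S[k + 1] @ B                       # m x m, invertible
        K[k] = np.linalg.solve(M, B.T @ S[k + 1] @ A)    # feedback gain
        S[k] = Q + A.T @ S[k + 1] @ (A - B @ K[k])       # Riccati update
    return S, K
```

The affine term s(k), which carries the modifiers and adjustable parameters in the paper's version, would be propagated backward alongside S(k) with the boundary condition s(N) = 0; it is omitted here for brevity.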
Taking (12) into (6c) gives the closed-loop expected state equation, from which the expected output is measured and the regulated output is obtained.

3.3. Iterative computation procedure. From the discussion above, the result is summarized as an iterative algorithm, whose computation procedure is given below.

The iterative computation procedure
Note that A and B may be chosen based on the linearization of f, and C is obtained from the linearization of h.
Step 4. Test for convergence and update the optimal expected solution and the optimal regulated output of Problem (P). In order to provide a mechanism for regulating convergence, a simple relaxation method is employed. If the iterates agree with those of the previous iteration within a given tolerance, in particular ŷ(k)^{i+1} = ŷ(k)^{i}, k = 0, 1, ..., N, stop; else set i = i + 1 and repeat the procedure starting with Step 1.
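The relaxation and convergence test of Step 4 can be sketched as follows; the gain and tolerance values, and the function names, are illustrative assumptions rather than the paper's notation.

```python
import numpy as np

# Relaxation: move an iterate a fraction `gain` of the way toward its newly
# computed value. A gain of 1 accepts the new value outright; smaller gains
# damp the update and help the iteration converge.

def relax(old, new, gain):
    """Relaxed update old + gain * (new - old), with 0 < gain <= 1."""
    return old + gain * (new - old)

def converged(prev, curr, tol=1e-6):
    """True when every sequence matches the previous iterate within tol."""
    return all(np.max(np.abs(np.asarray(p) - np.asarray(c))) < tol
               for p, c in zip(prev, curr))
```

In the algorithm above, `relax` would be applied to each of the matching variables (control, state and output sequences) at the end of every iteration, and `converged` implements the stopping test of Step 4.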
4. Illustrative Example. Consider a continuous stirred-tank reactor problem; this problem is referred to as Problem (P).
To obtain an optimal output solution close enough to the real output, we simplify Problem (P) and propose the corresponding model-based optimal control problem, Problem (M), with the initial condition x̄(0) = [0.05, 0]^T and the adjustable parameters γ(k), α_3(k) and α_2(k). After running the iterative algorithm, the simulation result is shown in Table 1; the algorithm converges in 19 iterations with the final cost 0.0164, which is almost a 99% reduction toward the optimal cost. The output residual of the proposed approach is 0.016748, which is smaller than the output residual of the filtering solution, 0.034731; see [5]. Figures 1, 2 and 3 show, respectively, the trajectories of the final control, final state and final output. From these figures, the trajectories of control and state are smooth and free from disturbance, and the output trajectory is regulated closely to the real output. These trajectories are then compared with the filtering trajectories of the final control, final state and final output shown in Figures 4, 5 and 6, respectively. It is concluded that the regulated output trajectory tracks the real output trajectory efficiently while giving the smallest output residual.

5. Concluding Remarks. An output regulation scheme, which is added into the model-based optimal control problem, was discussed in this paper. The proposed iterative approach with adjustable parameters solves the discrete-time nonlinear stochastic optimal control problem in spite of model-reality differences. The expected trajectories of state and control were obtained and the output was measured deterministically. By introducing an adjustable parameter to the expected output, the regulated output was driven as closely as possible to the real output.

Table 1 .
Algorithm performance