Dynamic Programming and Hamilton–Jacobi–Bellman Equations on Time Scales

The Bellman optimality principle for stochastic dynamic systems on time scales is derived; it includes the continuous-time and discrete-time cases as special cases. In addition, the Hamilton–Jacobi–Bellman (HJB) equation on time scales is obtained. Finally, an example is employed to illustrate the main results.


Introduction
The stochastic control problem is to find an optimal control such that a cost functional associated with a stochastic system attains its minimum value. Dynamic programming is a powerful method for solving stochastic optimal control problems. It is a well-established subject [1][2][3][4] for continuous and discrete optimal control problems, respectively, and it has wide practical applications in various fields [5,6]. It is generally assumed that time in a dynamic system is either continuous or discrete. In reality, however, this cannot be guaranteed: the time scale may be neither continuous nor discrete. Many processes evolve on a mixture of continuous and discrete time, on nonuniform discrete time, or on a union of disjoint time intervals, such as production and storage processes in economics, investment processes in finance, and populations of seasonal insects. A more complex time domain makes the control problem more difficult. How can this problem be handled?
Time scales were first introduced by Hilger [7] in 1988 in order to unite differential and difference equations into a general framework.
This allows continuous and discrete analysis to be treated from a common point of view. Recently, time scale theory has been studied extensively [8][9][10][11][12][13]. It is well known that optimal control problems on time scales are an important field for both theory and applications. Since the calculus of variations on time scales was studied by Bohner [14], results on optimal control problems in the time scale setting and their applications have grown rapidly. The existence of optimal controls for dynamic systems on time scales was discussed in [15][16][17]. Subsequently, the Pontryagin maximum principle on time scales, which specifies necessary conditions for optimality, was studied in several works [18,19]. Dynamic programming for dynamic systems on time scales is not a simple unification of the continuous-time and discrete-time cases, because time scales admit more complex time structures. Seiffertt et al. [20] studied approximate dynamic programming for dynamic systems in the isolated time scale setting. In addition, Bellman dynamic programming on general time scales for deterministic optimal control problems was considered in [21,22]. However, only limited work [23,24] has been done on the linear quadratic stochastic optimal control problem on time scales. That is to say, the general stochastic optimal control problem on time scales is completely open.
Motivated by these significant works, the purpose of this paper is to study the dynamic programming method for stochastic optimal control problems on time scales. The stochastic dynamic programming principle differs from the deterministic one, reflecting the stochastic nature of the problem, so the deterministic method on time scales cannot be applied to the stochastic case directly. To overcome this difficulty, we first give a new form of the chain rule on time scales and, based on it, obtain Itô's formula for stochastic processes on time scales. Second, we consider a family of optimal control problems with different initial times and states to establish the Bellman optimality principle in the time scale setting. Third, using Itô's formula and the Bellman optimality principle, we derive the associated Hamilton–Jacobi–Bellman (HJB for short) equation on time scales, a nonlinear second-order partial differential equation involving an expectation. If the HJB equation is solvable, an optimal feedback control can be obtained. Our work enriches dynamic programming by providing a more general time framework and makes dynamic programming theory a powerful tool for optimal control problems on complex time domains. The organization of this paper is as follows. Section 2 presents preliminaries on time scale theory. Section 3 focuses on the Bellman optimality principle and the HJB equation on time scales: by introducing a new symbol, we present Itô's formula in a new form and, on this basis, obtain the main results. Finally, an illustrative example shows the effectiveness of the proposed results.

Preliminaries
A time scale T is a nonempty closed subset of the real numbers R, and we write [0, T]_T := [0, T] ∩ T. Throughout this paper, T is assumed to be bounded. The forward jump operator σ and the backward jump operator ρ are defined, respectively, by σ(t) := inf{s ∈ T : s > t} and ρ(t) := sup{s ∈ T : s < t}, supplemented by inf ∅ := sup T and sup ∅ := inf T, where ∅ denotes the empty set. If σ(t) = t and t < sup T, the point t is called right-dense; if σ(t) > t, it is called right-scattered. Similarly, if ρ(t) = t and t > inf T, the point t is called left-dense; if ρ(t) < t, it is called left-scattered. The graininess function μ is defined by μ(t) := σ(t) − t. We now present some basic concepts and properties of time scales (see [10,11]).
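The jump operators and the graininess above can be illustrated with a small sketch. This is not from the paper: representing a bounded time scale by a finite sorted list of points is an assumption made purely for illustration.

```python
# Illustrative sketch: jump operators on a finite time scale given as a
# sorted list of points. By convention inf(emptyset) := sup T.
def sigma(T, t):
    """Forward jump operator: smallest point of T strictly after t."""
    later = [s for s in T if s > t]
    return min(later) if later else max(T)

def mu(T, t):
    """Graininess function mu(t) = sigma(t) - t."""
    return sigma(T, t) - t

# Toy time scale with a gap: T = {0, 1, 2, 5}.
T = [0, 1, 2, 5]
print(sigma(T, 2))  # 5: t = 2 is right-scattered
print(mu(T, 2))     # 3
print(sigma(T, 5))  # 5: no later point exists, so sigma(sup T) = sup T
```

At t = 2 the graininess is 3, showing that a single time scale can mix unit steps with larger gaps.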
Definition 1 (see [10]). Let f be a function on T. f is called right-dense continuous if f is continuous at every right-dense point and has finite left-sided limits at every left-dense point. Similarly, f is called left-dense continuous if f is continuous at every left-dense point and has finite right-sided limits at every right-dense point. If f is both right-dense continuous and left-dense continuous, then f is called continuous.
Define the set T^κ as follows: if T has a left-scattered maximum m, then T^κ := T \ {m}; otherwise, T^κ := T.

Definition 2 (see [11]). Let f: T ⟶ R be a function and t ∈ T^κ. If for every ε > 0 there exists a neighborhood U of t such that

|f(σ(t)) − f(s) − f^Δ(t)(σ(t) − s)| ≤ ε|σ(t) − s| for all s ∈ U,

then f^Δ(t) is called the Δ-derivative of f at t. Furthermore, we give the differentiation rule for compound functions.
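At a right-scattered point the definition above reduces the Δ-derivative to a forward difference quotient, f^Δ(t) = [f(σ(t)) − f(t)]/μ(t). A minimal sketch under that assumption (helper names are illustrative; finite time scales only, so every point below sup T is right-scattered):

```python
def delta_derivative(f, T, t):
    """Delta derivative of f at a right-scattered point t of the finite
    time scale T (sorted list): the forward difference quotient."""
    later = [s for s in T if s > t]
    if not later:
        raise ValueError("t has no forward jump in T (t = sup T)")
    s = min(later)  # s = sigma(t) > t, so t is right-scattered
    return (f(s) - f(t)) / (s - t)

# On T = Z, f(t) = t^2 gives f^Delta(t) = 2t + 1 (vs. f'(t) = 2t in
# continuous time), since ((t+1)^2 - t^2)/1 = 2t + 1.
T = list(range(10))
print(delta_derivative(lambda t: t**2, T, 3))  # 7.0
```

The discrepancy with the classical derivative (2t + 1 versus 2t) is exactly the effect of positive graininess.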
Lemma 1 (see [25]). Let g: T ⟶ R be Δ-differentiable and f: R ⟶ R be continuously differentiable. Then f ∘ g is Δ-differentiable and

(f ∘ g)^Δ(t) = { ∫_0^1 f′(g(t) + h μ(t) g^Δ(t)) dh } g^Δ(t). (5)

In this paper, we adopt the stochastic integral defined by Bohner et al. [26]. Let (Ω, F, {F_t}_{t∈[0,T]_T}, P) be a complete probability space with an increasing and continuous filtration. A Brownian motion indexed by a time scale T was defined by Grow and Sanyal [13]. Although Brownian motion on time scales is very similar to that on continuous time, there are also differences between them. For example, the quadratic variation 〈W〉_t of a Brownian motion on time scales (see [27]) is still an increasing process, but it is not deterministic. Now, we give the definition of the stochastic Δ-integral and its properties.
Definition 3 (see [26]). The random process X(t) is stochastically Δ-integrable if the stochastic Δ-integral in (6) is well defined, where the Brownian motion on the right-hand side of (6) is indexed by continuous time.

Complexity
We also have the following properties. The integral of a process X with respect to the quadratic variation 〈W〉_t of the Brownian motion is defined as a Stieltjes integral. Let X be an n-dimensional stochastic process defined by (9).
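On an isolated time scale, the deterministic Δ-integral reduces to a weighted sum, ∫_a^b f(t) Δt = Σ f(t) μ(t) over t ∈ [a, b)_T. The toy sketch below is stated only under that assumption; it is not the stochastic integral of Definition 3, and the list representation of T is illustrative.

```python
def delta_integral(f, T, a, b):
    """Delta integral of f over [a, b)_T on an isolated (purely discrete)
    time scale T (sorted list): the sum of f(t) * mu(t) over t in [a, b)."""
    total = 0.0
    for t in [s for s in T if a <= s < b]:
        later = [s for s in T if s > t]
        sigma_t = min(later) if later else t  # sigma(sup T) = sup T
        total += f(t) * (sigma_t - t)         # weight by graininess mu(t)
    return total

# On T = {0, 1, 2, 3, 4}: integral of f(t) = t over [0, 3) is 0 + 1 + 2.
T = [0, 1, 2, 3, 4]
print(delta_integral(lambda t: t, T, 0, 3))  # 3.0
```

For T = Z this recovers the usual summation calculus, while for T = R the same Δ-integral is the Riemann integral; the time scale formalism interpolates between the two.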

Problem Statement and Main Results
Let (Ω, F, {F_t}_{t∈T}, P) be a given filtered probability space satisfying the usual conditions. Consider the stochastic control system (11).

where U is a convex subset of R^m, and the functions a(t, x, u) and b(t, x, u) satisfy a Lipschitz condition and a linear growth condition in x. Consequently, equation (11) admits a unique solution (see Bohner et al. [26]). The cost functional associated with (11) is

J(0, x_0; u(·)) = E[ ∫_0^T r(s, X(s), u(s)) Δs + h(X(T)) ],

where the maps r: [0, T]_T × R^n × U ⟶ R and h: R^n ⟶ R are continuous.
The optimal control problem is to find u*(·) ∈ U[0, T]_T such that

J(0, x_0; u*(·)) = inf_{u(·) ∈ U[0, T]_T} J(0, x_0; u(·)).

Such a u*(·) is called a stochastic optimal control of the problem, and the corresponding X(·; x_0, u*(·)) is called an optimal state process. Now, we consider a family of optimal control problems with different initial times and states. For (t, x) ∈ [0, T]_T × R^n, consider the state equation (15).

J(t, x; u(·)) = E[ ∫_t^T r(s, X(s), u(s)) Δs + h(X(T)) ]. (16)
For any (t, x) ∈ [0, T]_T × R^n, minimize (16) subject to (15) over U[t, T]_T. The value function of the optimal control problem is defined as

V(t, x) = inf_{u(·) ∈ U[t, T]_T} J(t, x; u(·)).

We first introduce a symbol which is useful in the sequel. Let g: T ⟶ R be Δ-differentiable and f: R ⟶ R be continuously differentiable. For any t ∈ T^κ, define f^Δ_g(g(t)) as follows:

f^Δ_g(g(t)) = f′(g(t)), if t is right-dense or g^Δ(t) = 0;
f^Δ_g(g(t)) = [f(g^σ(t)) − f(g(t))] / [g^σ(t) − g(t)], if t is right-scattered and g^Δ(t) ≠ 0.

Remark 1. Note that f^Δ_g(g(t)) depends not only on the functions f and g but also on the time scale T. If t is a right-dense point of T, then f^Δ_g(g(t)) = f′(g(t)). On the contrary, if t is a right-scattered point and g^σ(t) = σ(g(t)), we have f^Δ_g(g(t)) = [f(σ(g(t))) − f(g(t))] / [σ(g(t)) − g(t)]. With the help of the new symbol, we have the following lemma.
Lemma 3. Let f: R × R ⟶ R be continuously differentiable and x: T ⟶ R and y: T ⟶ R be Δ-differentiable; then (19) holds.

Proof. If x^Δ(t) = 0 or y^Δ(t) = 0, it is easy to verify that (19) is true, so we only give the proof under the conditions x^Δ(t) ≠ 0 and y^Δ(t) ≠ 0. If t is right-dense, (19) follows from the classical chain rule; when t is right-scattered, it follows by direct computation with the corresponding difference quotients. This completes the proof.
□

Remark 2. Similarly, (19) admits another equivalent form.

Remark 3. In particular, let g: T ⟶ R be Δ-differentiable and f: R ⟶ R be continuously differentiable. Then (f ∘ g)^Δ(t) = f^Δ_g(g(t)) g^Δ(t). It is easy to see that this equality is equivalent to (5).

Remark 4.
It is not hard to obtain the corresponding multidimensional result, where F: R^n × R^m ⟶ R is continuously differentiable and x: T ⟶ R^n and y: T ⟶ R^m are Δ-differentiable.
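The identity of Remark 3 can be checked numerically at a right-scattered point. The sketch below (all names illustrative; T = Z assumed, so σ(t) = t + 1) verifies (f ∘ g)^Δ(t) = f^Δ_g(g(t)) g^Δ(t) for one concrete pair of functions:

```python
def f_delta_g(f, g, t, sigma_t):
    """The symbol f^Delta_g at a right-scattered t with g^Delta(t) != 0:
    a divided difference of f along the values of g."""
    return (f(g(sigma_t)) - f(g(t))) / (g(sigma_t) - g(t))

def delta(h, t, sigma_t):
    """Delta derivative of h at the right-scattered point t."""
    return (h(sigma_t) - h(t)) / (sigma_t - t)

# On T = Z with f(x) = x^2 and g(t) = 2t, check the chain-rule identity.
f = lambda x: x**2
g = lambda t: 2 * t
t, st = 1, 2                                     # st = sigma(t) = t + 1
lhs = delta(lambda s: f(g(s)), t, st)            # (f o g)^Delta(t)
rhs = f_delta_g(f, g, t, st) * delta(g, t, st)   # f^Delta_g(g(t)) * g^Delta(t)
print(lhs, rhs)  # 12.0 12.0
```

Here the classical chain rule would give f′(g(t)) g^Δ(t) = 4 · 2 = 8, whereas the divided-difference symbol correctly accounts for the jump and reproduces the Δ-derivative 12.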
Next, we show Itô's formula in a new form on time scales.

Proposition 1. Let X satisfy (9) and f ∈ C^{1,2}(T × R^n; R); then the corresponding Itô formula holds, where I(·) is an indicator function and D is the set of all right-dense points.
Proof. Because of Lemma 2, it is enough to verify the displayed identity for the integral ∫ a(s, X(s)) Δs, and by some manipulation it is straightforward to show that this equation is true. Now, we state the Bellman optimality principle on time scales.
It follows that (31) holds. Combining (28) with (31), we obtain the result. Furthermore, we give the HJB equation on time scales, which parallels the continuous and discrete cases.
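On an isolated time scale, the Bellman optimality principle reduces to a backward recursion for the value function. The sketch below is a deterministic toy specialization (diffusion b ≡ 0, finite control set; the grid, costs, and dynamics are all assumptions of the sketch, not the paper's example): V(t, x) = min_u [ r(t, x, u) μ(t) + V(σ(t), x + a(t, x, u) μ(t)) ] with terminal condition V(T, x) = h(x).

```python
# Toy backward Bellman recursion on the isolated time scale T = {0, 0.5, 1}.
T = [0.0, 0.5, 1.0]
U = [-1.0, 0.0, 1.0]       # finite control set (assumption of the sketch)
r = lambda t, x, u: u**2   # running cost
a = lambda t, x, u: u      # drift: x^Delta = u, no diffusion
h = lambda x: x**2         # terminal cost

def V(i, x):
    """Value function at time T[i] and state x, computed backward."""
    if i == len(T) - 1:
        return h(x)                      # terminal condition V(T, x) = h(x)
    mu = T[i + 1] - T[i]                 # graininess at T[i]
    return min(r(T[i], x, u) * mu + V(i + 1, x + a(T[i], x, u) * mu)
               for u in U)

print(V(0, 1.0))  # 0.75
```

Starting from x = 1, both u ≡ 0 followed by u = −1 and u = −1 followed by u = 0 achieve the optimal cost 0.75, illustrating how the μ(t)-weighted recursion plays the role of the HJB equation on an isolated time scale.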