On the hierarchical optimal control of a chain of distributed systems

In this paper, we consider a chain of distributed systems governed by a degenerate parabolic equation, which satisfies a weak H\"{o}rmander type condition, with a control distributed over an open subdomain. In particular, we consider two objectives that we would like to accomplish. The first is of a controllability type: it consists of guaranteeing that the terminal state reaches a target set starting from an initial condition. The second is keeping the state trajectory of the overall system close to a given reference trajectory on a finite, compact time interval. We introduce the following framework. First, we partition the control subdomain into two disjoint open subdomains that are compatible with the strategy subspaces of the {\it leader} and of the {\it follower}, respectively. Then, using the notion of Stackelberg's optimization (which is a hierarchical optimization framework), we provide a new result on the existence of optimal strategies for such an optimization problem, where the {\it follower} (which corresponds to the second criterion) is required to respond optimally, in the sense of {\it best-response correspondence}, to the strategy of the {\it leader} (which is associated with the controllability-type criterion) so as to achieve the overall objectives. Finally, we remark on the implication of our result in assessing the influence of the reachable target set on the optimal strategy of the {\it follower} in relation to the direction of {\it leader-follower} and {\it follower-leader} information flows.


Consider the chain of distributed systems $S_1, S_2, \ldots, S_n$ given by
\[
\begin{aligned}
S_1 &: \; dx^1_t = f_1\bigl(t, x^1_t, \ldots, x^n_t\bigr)\,dt + \sigma\bigl(t, x^1_t, \ldots, x^n_t\bigr)\,dW, \\
S_j &: \; dx^j_t = f_j\bigl(t, x^{j-1}_t, \ldots, x^n_t\bigr)\,dt, \quad j = 2, \ldots, n,
\end{aligned}
\tag{1.1}
\]
where $u$ is a control distributed over an open subdomain and $I_1, I_2, \ldots, I_n$ are information for interconnecting subsystems.

Let us introduce the following notation that will be useful later. We use boldface letters to denote variables in $\mathbb{R}^{nd}$; for instance, $\mathbf{0}$ stands for the zero in $\mathbb{R}^{nd}$ (i.e., $\mathbf{0} \in \mathbb{R}^{nd}$) and, for any $t \ge 0$, the solution $(x^1_t, x^2_t, \ldots, x^n_t)$ to (1.1) is denoted by $\mathbf{x}_t$. Moreover, for $(t, (x^{j-1}, \ldots, x^n)) \in (0, \infty) \times \mathbb{R}^{(n-j+2)d}$, $j = 2, \ldots, n$, the function $x^j \mapsto f_j(t, x^{j-1}, \ldots, x^n)$ is continuously differentiable with respect to $x^j$, and its derivative is denoted by $(t, x^{j-1}, \ldots, x^n) \mapsto D_{x^j} f_j(t, x^{j-1}, \ldots, x^n)$.
Then, we can rewrite the distributed system in (1.1) as
\[
d\mathbf{x}_t = F(t, \mathbf{x}_t)\,dt + G\,\sigma(t, \mathbf{x}_t)\,dW,
\]
where $F = (f_1, f_2, \ldots, f_n)$ is an $\mathbb{R}^{nd}$-valued function and $G = (I_d, 0, \ldots, 0)^T$ stands for an $nd \times d$ matrix.

Let $\Omega$ be a regular bounded open domain in $\mathbb{R}^{nd}$, with smooth boundary $\Gamma$. For an open subdomain $U$ of $\Omega$, we consider a distributed control system, governed by a partial differential equation (PDE) of parabolic type, with a control distributed over $U$ (see (1.3)), where $u(t, x) \in L^2((0, T) \times U)$ is a control function, $\chi_U$ is the characteristic function of the subdomain $U$ and $L_{t,x}$ is a second-order operator.

REMARK 2. Note that, in (1.1), the random perturbation enters the first subsystem through the diffusive part and is then subsequently transmitted to the other subsystems. As a result, such a distributed system is described by an $\mathbb{R}^{nd}$-valued diffusion process, which is degenerate in the sense that the second-order operator associated with it is a degenerate parabolic operator. Moreover, we also assume that the distributed system in (1.1) satisfies a weak Hörmander condition (e.g., see [9] or [7, Section 3] for additional discussions).
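To make the chain structure of (1.1) concrete, the following is a minimal Euler-Maruyama sketch in which the noise enters only the first subsystem and reaches the others through the drift coupling. It assumes $n = 3$ and $d = 1$; the drifts, the constant diffusion coefficient `sigma` and the function name `simulate_chain` are hypothetical choices made here for illustration and are not taken from the paper.

```python
import math
import random


def simulate_chain(x0, T=1.0, steps=1000, sigma=0.5, seed=0):
    """Euler-Maruyama sketch for a 3-subsystem chain with scalar states.

    Hypothetical drifts (illustration only, not from the paper):
      f1 = -x1,  f2 = x1 - x2,  f3 = x2 - x3.
    The Brownian increment is applied to S1 only; S2 and S3 are noise-free
    and are driven solely through their dependence on the preceding state.
    """
    rng = random.Random(seed)
    dt = T / steps
    x1, x2, x3 = x0
    path = [(x1, x2, x3)]
    for _ in range(steps):
        dW = rng.gauss(0.0, math.sqrt(dt))  # increment with variance dt
        # All right-hand sides use the state at the start of the step.
        n1 = x1 + (-x1) * dt + sigma * dW
        n2 = x2 + (x1 - x2) * dt
        n3 = x3 + (x2 - x3) * dt
        x1, x2, x3 = n1, n2, n3
        path.append((x1, x2, x3))
    return path
```

Starting from the zero state, the first subsystem is perturbed after one step, the second after two steps and the third after three, which mirrors the way the perturbation is transmitted along the chain.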
In what follows, we assume that the following statements hold true for the distributed system in (1.1).

REMARK 3. In general, the hypoellipticity assumption is related to a strong accessibility property of controllable nonlinear systems that are driven by white noise (e.g., see [19] concerning the controllability of nonlinear systems, which is closely related to [17] and [10]; see also [7, Section 3]). From Part (b) of the above assumption, the Jacobian matrices $D_{x^j} f_j$, for $j = 2, \ldots, n$, are assumed to be nondegenerate uniformly in time and space (i.e., they satisfy Hölder conditions both with respect to time and second variables).
Here it is worth mentioning that some studies on the controllability of systems governed by parabolic equations have been reported in the literature (e.g., see [12] in the context of Stackelberg optimization; and [1] and [8] in the context of Stackelberg-Nash controllability-type problems). Note that the rationale behind our framework follows, in some sense, the settings of these papers. However, to our knowledge, the problem of optimal control for a chain of distributed systems governed by degenerate parabolic equations has not been addressed in the context of a hierarchical argument. It is important because it provides a mathematical framework that shows how hierarchical optimization can be systematically used to obtain optimal strategies for the leader and for the follower (distributed over an open subdomain) for a chain of distributed systems with random perturbations.

The remainder of this paper is organized as follows. In Section 2, using the remarks made above in Section 1, we state the optimal control problem for a chain of distributed systems. Section 3 presents our main results, where we introduce a hierarchical optimization framework under which the follower is required to respond optimally, in the sense of best-response correspondence, to the strategy of the leader (and vice versa) so as to achieve the overall objectives. This section also contains results on the controllability-type problem for such a distributed system. For the sake of readability, all proofs are presented in Section 4. Finally, Section 5 provides further remarks.
2. Problem Formulation. In this paper, we consider two objectives that we would like to accomplish. The first is of a controllability type: it consists of guaranteeing that the terminal state reaches a target set from an initial condition. The second is keeping the state trajectory of the overall system close to a given reference trajectory on a finite, compact time interval. Such a problem can be stated as follows.

Problem: Find an optimal control strategy $u^*(t, x) \in L^2((0, T) \times U)$ (which is distributed over $U$) such that

(i) The first objective: Suppose that we are given a target point $y^{tg}$ in $L^2(\Omega)$. Then, we would like to have
\[
y(T; u^*) \in y^{tg} + \alpha B,
\]
where $y(t; u^*)$ denotes the function $x \mapsto y(t, x; u^*)$, $B$ is a unit ball in $L^2(\Omega)$ and $\alpha$ is an arbitrarily small positive number.

(ii) The second objective: Suppose that we are given a reference trajectory $y^{rf}(t, x)$. Then, we would like to have the state trajectory $y(t, x; u^*)$ not too far from the reference $y^{rf}(t, x)$ for all $t \in (0, T)$.
In order to make the above problem more precise, we specifically consider the hierarchical cost functionals in (2.2) and (2.3). Note that, in general, finding an optimal strategy $u^* \in L^2((0, T) \times U)$ that simultaneously minimizes the cost functionals in (2.2) and (2.3) is not an easy problem. However, in what follows, we introduce the notion of Stackelberg's optimization [20] (which is a hierarchical optimization framework), where we specifically partition the control subdomain $U$ into two open subdomains $U_1$ and $U_2$ (with $U_1 \cap U_2 = \emptyset$) that are compatible with the strategy subspaces of the leader and of the follower, respectively. That is, the strategy for the leader (i.e., $u_1$) is from the subspace $L^2((0, T) \times U_1)$ and the strategy for the follower (i.e., $u_2$) is from the subspace $L^2((0, T) \times U_2)$.
Note that if $\chi_{U_i}$, for $i = 1, 2$, denotes the characteristic function for $U_i$ and $u_i$ is the restriction of the distributed control $u$ to $L^2((0, T) \times U_i)$, then the PDE in (1.3) can be rewritten in terms of $u_1 \chi_{U_1}$ and $u_2 \chi_{U_2}$. Suppose that the strategy for the leader $u_1 \in L^2((0, T) \times U_1)$ is given. Then, the problem of finding an optimal strategy for the follower, i.e., $u_2^* \in L^2((0, T) \times U_2)$, which minimizes the cost functional $J_2$, is reduced to finding an optimal solution for the problem in (2.6), for some unique map $R : L^2((0, T) \times U_1) \to L^2((0, T) \times U_2)$ (cf. (2.7)). Moreover, the controllability-type problem in (2.2) is then reduced to finding an optimal solution for the infimum over $u_1 \in L^2((0, T) \times U_1)$ in (2.8).

In the following section, we provide a hierarchical optimization framework for solving the above problems (i.e., the optimization problems in (2.6), together with (2.7) and (2.8)). Note that, for a given $u_1 \in L^2((0, T) \times U_1)$, the optimization problem in (2.6) has a unique solution on $L^2((0, T) \times U_2)$ (cf. Proposition 3.1). Moreover, the optimization problem in (2.8) makes sense if $y(T; (u_1, R(u_1)))$ spans a dense subset of $L^2(\Omega)$ when $u_1$ spans the subspace $L^2((0, T) \times U_1)$ (cf. Propositions 3.2 and 3.3).
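The leader-follower reduction described above can be illustrated on a scalar surrogate. The sketch below is not the PDE problem of this paper: it assumes a hypothetical terminal state $y = a_1 u_1 + a_2 u_2$, a hypothetical quadratic follower cost $\tfrac{1}{2}(y - y^{rf})^2 + \tfrac{N}{2} u_2^2$, and a leader that minimizes $\tfrac{1}{2} u_1^2$ subject to $|y - y^{tg}| \le \alpha$. All names (`a1`, `a2`, `N`, `follower_best_response`, `leader_strategy`, `solve`) are introduced here for illustration only.

```python
def follower_best_response(u1, a1, a2, y_rf, N):
    """Best response R(u1) for the assumed follower cost.

    Minimizes 0.5*(y - y_rf)**2 + 0.5*N*u2**2 with y = a1*u1 + a2*u2;
    setting the derivative with respect to u2 to zero gives the formula.
    """
    return a2 * (y_rf - a1 * u1) / (a2 * a2 + N)


def leader_strategy(a1, a2, y_rf, N, y_tg, alpha):
    """Least-effort u1 with |y(u1, R(u1)) - y_tg| <= alpha."""
    # With R substituted, the terminal state is affine in u1: y = c0 + c1*u1.
    c0 = a2 * a2 * y_rf / (a2 * a2 + N)
    c1 = a1 * N / (a2 * a2 + N)
    if c1 == 0.0:
        raise ValueError("leader has no influence on the terminal state")
    lo, hi = sorted(((y_tg - alpha - c0) / c1, (y_tg + alpha - c0) / c1))
    # Minimizing 0.5*u1**2 over [lo, hi] is the projection of 0 onto [lo, hi].
    return min(max(0.0, lo), hi)


def solve(a1=1.0, a2=1.0, y_rf=0.0, N=1.0, y_tg=1.0, alpha=0.1):
    """Return (u1, u2, y) for the scalar leader-follower surrogate."""
    u1 = leader_strategy(a1, a2, y_rf, N, y_tg, alpha)
    u2 = follower_best_response(u1, a1, a2, y_rf, N)
    return u1, u2, a1 * u1 + a2 * u2
```

The design mirrors the hierarchy in the text: the follower's problem is solved first for an arbitrary $u_1$, and the resulting map is substituted into the leader's constrained problem, so the leader optimizes while anticipating the follower's response.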
3. Main Results. In this section, we present our main results, where we introduce a framework under which the follower is required to respond optimally, in the sense of best-response correspondence, to the strategy of the leader (and vice versa) so as to achieve the overall objectives. Moreover, such a framework allows us to provide a new result on the existence of optimal strategies for such optimization problems pertaining to a chain of distributed systems with random perturbations.
3.1. On the optimality distributed system for the follower. Suppose that, for a given leader strategy $u_1 \in L^2((0, T) \times U_1)$, the strategy for the follower $u_2^* \in L^2((0, T) \times U_2)$ is an optimal solution to (2.6) (cf. (2.3)). Then, such a solution is characterized by the optimality condition in (3.1), where $y$ and $\hat{y}$ are, respectively, the solutions to the PDEs in (3.3) and (3.4). Furthermore, we introduce an adjoint state $p$ defined through $L^*_{t,x}$, the adjoint operator for $L_{t,x}$. Then, we have the following result, which characterizes the map $R$ in (2.7) (i.e., the optimality distributed system for the follower).
PROPOSITION 3.1. Suppose that the system in (3.5) admits a unique solution pair $(y(u_1), p(u_1))$ (which also depends uniformly on $u_1 \in L^2((0, T) \times U_1)$). Then, the optimality distributed system for the follower is given by (3.6).

REMARK 4. The above proposition states that if the strategy of the leader $u_1 \in L^2((0, T) \times U_1)$ is given, then the strategy for the follower $u_2^* = R(u_1)$, which is responsible for keeping the state trajectory $y(t, x; (u_1, R(u_1)))$ close to the given reference trajectory $y^{rf}(t, x)$ on the time interval $(0, T)$, is optimal in the sense of best-response correspondence. Later, in Proposition 3.2, we provide an additional optimality condition on the strategy of the leader, when such a correspondence is interpreted in the context of a hierarchical optimization framework.

3.2. On the optimality distributed system for the leader. In this subsection, we provide an optimality condition on the strategy of the leader in (2.2), when the strategy for the follower satisfies the optimality condition of Proposition 3.1.
For a given $\xi \in L^2(\Omega)$, let $\varphi$ and $\vartheta$ be the unique solutions to the corresponding PDE system. Next, define the following linear decompositions
\[
y = y_0 + z \quad \text{and} \quad p = p_0 + q, \tag{3.8}
\]
such that $y_0$ and $p_0$ are the unique solutions to the PDE in (3.9). Note that, from (3.5) and (3.9) together with (3.8), it is easy to show that $z$ and $q$ are the unique solutions to the PDE in (3.10), where $u_1^* \in L^2((0, T) \times U_1)$ is an optimal strategy for the leader which satisfies additional conditions (see (3.12) and (3.13) below).
Then, we have the following result which characterizes the optimality condition for the leader in (2.2).

(3.12), where $\varphi(\xi)$ is given from the unique solution set $\{y(\xi), p(\xi), \varphi(\xi), \vartheta(\xi)\}$ for the optimality distributed system in (3.13). Moreover, $\xi \in L^2(\Omega)$ is a unique solution to the variational inequality in (3.14).

REMARK 5. Note that the hierarchical optimization problem in Proposition 3.2 requires the follower to respond optimally to the strategy of the leader in the sense of best-response correspondence, where such a correspondence is implicitly embedded in (3.13) (see also Section 4 for additional remarks).
(3.15)

REMARK 6. Note that the above proposition implicitly requires the strong accessibility property of the distributed system in (1.1), which is concerned with the controllability property of nonlinear systems with random perturbations (see Assumption 1 and Remark 3).

4. Proof of the Main Results. In this section, we give the proofs of our results.

Proof of Proposition 3.1. For a given $u_1 \in L^2((0, T) \times U_1)$, let $y$ and $p$ be the unique solutions of (3.5). If we multiply the second equation in (3.5) by $\hat{y}$ and integrate by parts, then, noting the PDEs in (3.3) and (3.4), we obtain the identity in (4.1). Moreover, using the optimality condition in (3.1) together with (4.1), we obtain a relation which further gives an optimal strategy for the follower (cf. (3.6)), where $p$ is from the unique solution set $\{p(u_1), y(u_1)\}$ of (3.5) that depends uniformly on $u_1 \in L^2((0, T) \times U_1)$. This completes the proof of Proposition 3.1.

s.t. $y(T; (u_1, R(u_1))) \in y^{tg} - y_0(T) + \alpha B$ (see (3.8)).
Introduce the cost functionals $\bar{J}_i$. Let $H \in \mathcal{L}(L^2((0, T) \times U_1); L^2(\Omega))$ be a bounded linear operator. Then, the optimization problem in (2.2) is equivalent to an infimum over $u_1 \in L^2((0, T) \times U_1)$. Furthermore, using the Fenchel duality theorem (e.g., see [14] or [6]), this infimum over $u_1 \in L^2((0, T) \times U_1)$ can be expressed in terms of $H^*$, the adjoint operator of $H$, and the conjugate functions $\bar{J}_i^*$.

Note that if we multiply the first equation (respectively, the second one) in (3.13) by $z$ (respectively, by $q$) and integrate by parts, then we obtain a further identity. Then, for $\xi \in L^2(\Omega)$ that satisfies (3.11), we have a corresponding relation, where $\varphi$ is from the unique solutions of (3.13).
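For orientation, the Fenchel duality step used in the proof above is commonly stated in the following generic form in standard convex-analysis texts. The symbols $f$, $g$, $X$ and $Y$ are placeholders introduced here and are not the functionals of this paper; the assumptions listed in the comments are the usual ones and are stated as assumptions, not as conditions verified in the text.

```latex
% Generic Fenchel duality (placeholder symbols f, g, X, Y; not the paper's notation).
% Assumptions: X and Y are Hilbert spaces; f : X -> (-inf, +inf] and
% g : Y -> (-inf, +inf] are proper, convex and lower semicontinuous;
% H : X -> Y is bounded and linear; and g is continuous at some point H x_0
% with f(x_0) finite (a constraint qualification).
\[
\inf_{x \in X} \bigl\{ f(x) + g(Hx) \bigr\}
  = - \inf_{\xi \in Y} \bigl\{ f^*(H^* \xi) + g^*(-\xi) \bigr\},
\]
% Conjugate (Legendre-Fenchel transform) of f and of g:
\[
f^*(x^*) = \sup_{x \in X} \bigl\{ \langle x^*, x \rangle - f(x) \bigr\},
\qquad
g^*(\xi) = \sup_{y \in Y} \bigl\{ \langle \xi, y \rangle - g(y) \bigr\}.
\]
```

In this template, the primal variable plays the role of the leader strategy $u_1$ and the dual variable plays the role of $\xi \in L^2(\Omega)$, which is consistent with the appearance of $H^*$ and the conjugate functions in the argument.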
Note that the first two equations in (4.18) imply a quasi-elliptic equation; in view of Cauchy problems on bounded domains (e.g., see [18, Theorem 6.6.1]), for any fixed $t \in (0, T)$, $\varphi(t, x)$ is analytic in $U_2$, with zero Cauchy data on $\partial U_2$. As a result, since $\varphi(t, x) = 0$ on $(0, T) \times \partial U_2$ and $\varphi$ is also continuous in $t$, we have $\varphi(0, x) = 0$ and $\varphi(t, x) = \vartheta(t, x) = 0$ for $(t, x) \in (0, T) \times \partial U_2$, which implies $\xi \chi_{U_2} = 0$ (cf. (4.18), since $\varphi(T, x) = \xi \chi_{U_1}$). This completes the proof of Proposition 3.3.

5. Further remarks. In this section, we briefly comment on the implication of our result in assessing the influence of the reachable target set on the strategy of the follower in relation to the direction of leader-follower and follower-leader information flows.
Note that the statement in Proposition 3.1 (i.e., the optimality distributed system for the follower) is implicitly accounted for in Proposition 3.2 (cf. (3.13)). Hence, the optimal strategy for the follower (cf. (3.6)) is expressed in terms of $\xi \in L^2(\Omega)$, where $\xi$ is a minimum solution to the variational inequality in (3.14) and it also assumes a zero value outside of $U_2$ (i.e., $\xi \chi_{U_2} = 0$). Moreover, such a minimum solution lies in a certain dense subset of $L^2(\Omega)$, which is spanned by $y(T; (u_1, R(u_1)))$ when $u_1$ spans $L^2((0, T) \times U_1)$ (cf. Proposition 3.3).
Note that, from Proposition 3.2 (cf. (4.10)), we also observe that the optimal strategy for the leader is implicitly conditioned by the target set $y^{tg} + \alpha B$, where $\alpha$ is an arbitrarily small positive number (cf. (2.2) or (2.8)). Moreover, the terminal state is guaranteed to reach the target set starting from an initial condition $y(0, x) = 0$ on $\Omega$ (i.e., $y(T; (u_1^*, R(u_1^*))) \in y^{tg} + \alpha B$); and the state trajectory $y(t, x; (u_1^*, R(u_1^*)))$ is not too far from the reference $y^{rf}(t, x)$ for all $t \in (0, T)$. As a result, such interactions constitute a constrained information flow between the leader and the follower (i.e., an information flow from leader to follower and vice versa) that implicitly captures the influence of the reachable target set on the strategy of the follower.