Probabilistic message passing control and FPD based decentralised control for stochastic complex systems

This paper offers a novel decentralised control strategy for a class of linear stochastic large-scale complex systems. The proposed control strategy is developed to address the main challenges in controlling complex systems, such as high dimensionality, stochasticity, uncertainties, and unknown system parameters. To remain applicable across a wide range of complex systems, the proposed strategy decomposes the complex system into several subsystems and controls the system in a decentralised manner. The global control objective is achieved by individually controlling all the local subsystems and then exchanging information between subsystems about their state values. This paper focuses mainly on the probabilistic communication between subsystems, so the probabilistic message-passing framework is described in detail. For each subsystem, the regulation problem is considered, and fully probabilistic design (FPD) is applied to take the stochastic nature of complex systems into account. Also, since the governing equations of the system dynamics are assumed to be unknown, linear optimisation methods are employed to estimate the parameters of the subsystems. To demonstrate the effectiveness of the proposed control framework, a numerical example is given.

(5) This paper provides a basic understanding of the message passing between the subsystems in the decentralised control process. To that end, the proposed strategy, including information transmission and reception and how the local controller harmonises the subsystem's states with the information newly received from its neighbours, is demonstrated on a simple linear quadratic example.
The remainder of this paper is organised as follows. Section 2 introduces the structure of the subsystems and the estimation of the parameters. In Section 3, the FPD is presented and the stages of the proposed strategy are provided. Section 4 gives details about the message passing approach. In Section 5, the proposed algorithm is applied to a numerical example to show its effectiveness. Finally, the conclusions are summarised in Section 6.

System statement
In this section, the basic decentralised probabilistic message passing framework is introduced. Within this framework, the considered complex system is decomposed into K subsystems according to the system's structure, for example: 1) some states can be physically controlled together; 2) the global system is composed of many independent agents, such as drones. Each subsystem is modelled probabilistically and controlled independently by a probabilistic controller.
Unlike other decentralised frameworks, in this paper the neighbouring subsystems' states are also treated as states of the considered subsystem. More specifically, the full state of a subsystem consists of two parts: internal states and external states. The subsystem's own states are defined as internal states, which are controlled by its local controller. The states received from the neighbouring subsystems are defined as external states, over which the local controller has no control. Note that the external state in each subsystem is also modelled by a linear probabilistic model that depends only on the external state itself. Furthermore, the local control decision is made based on both internal and external states.
To better explain the proposed framework, we will demonstrate the subsystem structure and the message passing process using two subsystems: subsystem α and subsystem β as an example.

The subsystem α
The considered subsystem α (also called node α) is described by

$$x_{k;\alpha} = \bar{A}_1 x_{k-1;\alpha} + \bar{A}_2 y_{k-1;\alpha} + \bar{B}_\alpha u_{k-1;\alpha} + v_{k-1;\alpha}, \tag{2.1}$$

$$y_{k;\alpha} = A_3 y_{k-1;\alpha} + w_{k-1;\alpha}, \tag{2.2}$$

where $x_{k;\alpha} \in \mathbb{R}^{n_1}$ is the internal state of subsystem α and $y_{k;\alpha} \in \mathbb{R}^{n_2}$ is its external state, which is related to the neighbouring subsystem β. The full state of subsystem α is $z_{k;\alpha} = [x_{k;\alpha}^T, y_{k;\alpha}^T]^T$. The control input $u_{k-1;\alpha} \in \mathbb{R}^{n_3}$ is to be designed, and $\bar{A}_1$, $\bar{A}_2$, $A_3$ and $\bar{B}_\alpha$ are the system parameters with appropriate dimensions. In addition, $v_{k-1;\alpha}$ and $w_{k-1;\alpha}$ are zero-mean Gaussian noises with covariances $Q_\alpha$ and $R_\alpha$, respectively: $v_{k;\alpha} \sim \mathcal{N}(0, Q_\alpha)$ and $w_{k;\alpha} \sim \mathcal{N}(0, R_\alpha)$.

The subsystem β
Similar to Eq (2.1) and Eq (2.2), the state equations of subsystem β are given by

$$x_{k;\beta} = \bar{C}_1 x_{k-1;\beta} + \bar{C}_2 y_{k-1;\beta} + \bar{B}_\beta u_{k-1;\beta} + v_{k-1;\beta}, \tag{2.4}$$

$$y_{k;\beta} = C_3 y_{k-1;\beta} + w_{k-1;\beta}, \tag{2.5}$$

where $x_{k;\beta} \in \mathbb{R}^{n_2}$ is the internal state of subsystem β and $y_{k;\beta} \in \mathbb{R}^{n_1}$ is its external state, which is passed from the neighbouring subsystem α. Similarly, the full state of subsystem β is defined as $z_{k;\beta} = [x_{k;\beta}^T, y_{k;\beta}^T]^T$. The control input $u_{k-1;\beta} \in \mathbb{R}^{n_4}$ is to be designed, and $\bar{C}_1$, $\bar{C}_2$, $C_3$ and $\bar{B}_\beta$ are the system parameters with appropriate dimensions. In addition, $v_{k-1;\beta}$ and $w_{k-1;\beta}$ are zero-mean Gaussian noises with covariances $Q_\beta$ and $R_\beta$, respectively.
In each subsystem, Eq (2.1) and Eq (2.4) are the internal state equations, while Eq (2.2) and Eq (2.5) are the external state equations. From Eq (2.1) and Eq (2.4), the internal state $x_k$ depends on the previous internal state $x_{k-1}$, on the previous external state $y_{k-1}$ passed by the neighbouring subsystem, and on the designed control input. The internal states of each subsystem are therefore driven by their own controller and affected by the external states received from neighbouring subsystems. In addition, from Eq (2.2) and Eq (2.5), the external state $y_k$ is affected only by the previous external state $y_{k-1}$, so the local controller of each subsystem can control only its internal states. Note also that each local control decision is made based on both the subsystem's own states (internal states) and the neighbouring subsystems' states (external states). In this way, the subsystems cooperate to achieve the global goal. Without message passing, each subsystem would be controlled independently by its own control strategy, unaffected by its neighbours' information, and the global system goal might not be reached.
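The split between a controllable internal state and an uncontrollable external state can be illustrated with a minimal scalar sketch. The function name `step_subsystem`, the scalar parameters `a1`, `a2`, `a3`, `b` and the noise standard deviations `q_std`, `r_std` are illustrative assumptions, not quantities from the paper; the external state is assumed to follow a linear model that depends only on its own previous value, as described above.

```python
import random


def step_subsystem(x_prev, y_prev, u_prev, a1, a2, a3, b,
                   q_std=0.0, r_std=0.0, rng=None):
    """Advance a scalar subsystem by one step (illustrative sketch).

    Internal state: x_k = a1*x_{k-1} + a2*y_{k-1} + b*u_{k-1} + v_{k-1}
    External state: y_k = a3*y_{k-1} + w_{k-1}   (assumed linear form)
    """
    rng = rng or random.Random(0)
    v = rng.gauss(0.0, q_std)  # noise on the internal state
    w = rng.gauss(0.0, r_std)  # noise on the external state
    x_next = a1 * x_prev + a2 * y_prev + b * u_prev + v
    y_next = a3 * y_prev + w
    return x_next, y_next


# Noise-free step: the control input moves x but cannot influence y.
x1, y1 = step_subsystem(1.0, 2.0, 0.5, a1=0.5, a2=0.1, a3=0.9, b=1.0)
x2, y2 = step_subsystem(1.0, 2.0, -3.0, a1=0.5, a2=0.1, a3=0.9, b=1.0)
```

In this sketch the two calls differ only in the control input, so the internal states differ while the external states coincide.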

Parameter estimation
In industrial processes, precise system models are normally unavailable. Therefore, in this paper, all the system parameters are estimated first. To achieve this, a linear optimisation method is applied; the details are given below using subsystem α as an example.
The system Eq (2.1) can be rewritten in terms of an input matrix $\phi_{k-1;\alpha}$ and a weight matrix $\theta_\alpha$, the latter given in Eq (2.10). The estimated weight matrix $\hat{\theta}_\alpha$ is given in Eq (2.11), where $A_1$, $A_2$ and $B_\alpha$ are the estimated parameters for $\bar{A}_1$, $\bar{A}_2$ and $\bar{B}_\alpha$. Note that at each step, all the observed data up to time $k$ are used to estimate the system parameters, to ensure accurate estimation. Thus, using the observed data up to time $k$, $\Psi = [\phi_{0;\alpha}, \ldots, \phi_{k-1;\alpha}]$ and $X = [x_{1;\alpha}^T, \ldots, x_{k;\alpha}^T]^T$, one obtains Eq (2.12), from which the estimated weight matrix $\hat{\theta}_\alpha$ follows using $\Psi^\dagger$, the pseudo-inverse of $\Psi$ given in Eq (2.14). Using the same approach, the parameter $A_3$ in Eq (2.2) can be estimated.

The conditional distribution of the system dynamics can therefore be estimated, with the mean of the internal state given by $\mu_{k;\alpha} = A_1 x_{k-1;\alpha} + A_2 y_{k-1;\alpha} + B_\alpha u_{k-1;\alpha}$. Following the same estimation approach, the system parameters of subsystem β can be estimated and the corresponding system distribution obtained (Eq (2.21)), where $C_1$, $C_2$ and $B_\beta$ are the estimated parameters for $\bar{C}_1$, $\bar{C}_2$ and $\bar{B}_\beta$, respectively.

Using the estimated parameters, the full-state equation of subsystem α can then be specified (Eq (2.22)), and the conditional distribution of the full-state dynamics of subsystem α follows from it. Similarly, the full-state equation of subsystem β is specified in Eq (2.25), and the conditional distribution of the full-state dynamics of subsystem β follows in the same way.
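To make the estimation step concrete, the following is a minimal scalar sketch of a least-squares fit of the internal-state parameters from observed data. It solves the normal equations directly, which gives the same result as a pseudo-inverse when the regressors are linearly independent. The helper names `solve_linear` and `estimate_parameters`, and the layout of the data lists, are assumptions made for illustration rather than the paper's exact formulation.

```python
def solve_linear(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [mr - factor * mc for mr, mc in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def estimate_parameters(xs, ys, us):
    """Least-squares estimate of (a1, a2, b) in the scalar model
    x_k = a1*x_{k-1} + a2*y_{k-1} + b*u_{k-1} + noise.

    xs holds x_0..x_k; ys and us hold y_0..y_{k-1} and u_0..u_{k-1}.
    """
    rows = list(zip(xs[:-1], ys, us))  # regressors (x, y, u) at step k-1
    targets = xs[1:]                   # states observed one step later
    gram = [[sum(r[i] * r[j] for r in rows) for j in range(3)]
            for i in range(3)]
    moment = [sum(r[i] * t for r, t in zip(rows, targets))
              for i in range(3)]
    return solve_linear(gram, moment)


# Noise-free data generated with known (hypothetical) parameters.
true_a1, true_a2, true_b = 0.8, 0.1, 0.5
us = [1.0, -1.0, 0.5, 2.0, -0.3, 0.7]
ys = [0.9 ** k for k in range(6)]
xs = [2.5]
for y, u in zip(ys, us):
    xs.append(true_a1 * xs[-1] + true_a2 * y + true_b * u)

estimate = estimate_parameters(xs, ys, us)
```

With noise-free data the estimate recovers the generating parameters up to rounding; with noisy data it would only approximate them, which is why all data up to time k are reused at each step.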

Fully probabilistic design
The control strategy presented in this paper designs a controller for each subsystem so that each subsystem achieves its own objective and, consequently, the objectives of the overall complex system are achieved. The objective of each subsystem considered in this paper is to design a randomised control input $c(u_{k-1} \mid z_{k-1})$ that solves the regulation problem and brings all the internal states back to zero. Considering the stochastic nature of complex systems, the FPD is applied to each subsystem to achieve this.

General form of FPD
The performance index is formed by the Kullback-Leibler divergence (KLD), which describes the distance between the joint pdf of the closed-loop control system and the desired joint pdf. The KLD between the actual joint pdf $f(F)$ of the observed data $F = (x(H), u(H))$ and the ideal joint pdf $f_I(F)$ on a set of possible $F$ is defined as

$$D(f \,\|\, f_I) = \int f(F) \ln \frac{f(F)}{f_I(F)} \, dF, \tag{3.1}$$

where $H$ is the control horizon. According to the chain rule for pdfs [30], the joint distribution of the probabilistic closed-loop description of the system dynamics factorises into the system model pdfs $s(z_k \mid u_{k-1}, z_{k-1})$ and the controller pdfs $c(u_{k-1} \mid z_{k-1})$ (Eq (3.2)), where $c(u_{k-1} \mid z_{k-1})$ is the actual conditional pdf of the system controller $u_{k-1}$. Similarly, the ideal closed-loop pdf can be expressed in the same form as Eq (3.2), with the ideal system model pdf $s_I(z_k \mid u_{k-1}, z_{k-1})$ and the ideal controller pdf $c_I(u_{k-1} \mid z_{k-1})$ (Eq (3.3)).

With the KL distance (3.1), the closed-loop joint pdf (3.2) and the desired closed-loop joint pdf (3.3), the performance index can be written in the recursive form of Eq (3.4), in which the first term in parentheses is the partial cost and the second term is the expected minimum cost-to-go function. The recursive formulation of the performance index (3.4) is similar to dynamic programming. The full derivation of Eq (3.4) can be found in [27]. Based on the Fully Probabilistic Design (FPD) [23,27,31], the control law $c^*(u_{k-1} \mid z_{k-1})$ for the subsystem that minimises the performance index (3.4) is given by Eqs (3.5) and (3.6). The full derivation of Eqs (3.4)-(3.6) can be found in [7].
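For intuition about the KLD used in the performance index, the divergence between two scalar Gaussian pdfs has a well-known closed form. The sketch below evaluates it; the function name `kl_gaussian` and the scalar setting are illustrative assumptions, and the example is not the paper's multivariate closed-loop index.

```python
import math


def kl_gaussian(mean_f, var_f, mean_i, var_i):
    """KLD D(f || f_I) between scalar Gaussians f = N(mean_f, var_f)
    and f_I = N(mean_i, var_i), using the standard closed form."""
    return 0.5 * (math.log(var_i / var_f)
                  + (var_f + (mean_f - mean_i) ** 2) / var_i
                  - 1.0)


# Identical pdfs have zero divergence; shifting the mean increases it.
same = kl_gaussian(0.0, 1.0, 0.0, 1.0)
shifted = kl_gaussian(1.0, 1.0, 0.0, 1.0)
```

The divergence is zero only when the actual and ideal pdfs coincide, which is why minimising it drives the closed-loop behaviour towards the ideal one.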

Linear Gaussian quadratic design
Following the FPD algorithm described by Eq (3.6), the generalised fully probabilistic control solution of the regulation problem is derived in this section, using subsystem α as an example. As mentioned in the last section, the objective of the controller is to return the system states to zero from their initial values. The ideal distribution of the system is therefore specified as in Eq (3.7), where $\Sigma_2$ is the ideal covariance of the state. The ideal distribution of the controller is defined in Eq (3.8), where $\Gamma$ is the ideal covariance of the subsystem control input. Note that the covariance $\Gamma$ indicates the allowed range of the optimal control input. Based on Eqs (3.5)-(3.8), the form of the optimal controller is given by the following theorem.
Theorem 1. By substituting the ideal distribution of the system dynamics (3.7), the ideal distribution of the controller (3.8), and the real distribution of the system dynamics (2.23) and (2.24) into Eq (3.6), the optimal controller for system (2.1) that minimises the performance index (3.4) is obtained, as given in Eqs (3.9)-(3.10). Note that this part is not the main contribution of this paper; FPD is therefore used here as an established methodology. The detailed proof can be found in [7,24]. The same method is applied to subsystem β, and the details are omitted here.

Message passing
The FPD control methodology for each subsystem was stated in Section 3. As mentioned earlier, without communication between neighbouring subsystems, each subsystem is controlled individually by its own control strategy and might therefore fail to achieve the global system goal. Moreover, in this work the subsystem dynamics are described by probabilistic state space models to account for the stochastic nature of complex systems, which implies that the communication between subsystems should also be probabilistic. Thus, in this section, a novel probabilistic framework for message passing between two subsystems in a decentralised and synchronous manner is introduced, using the process of message passing from subsystem α to subsystem β as an example.
In general, the message passing approach can be divided into two stages: passing and receiving. More specifically, once the internal states of subsystem α are updated, they are passed in probabilistic form to the neighbouring subsystem β. After subsystem β receives the probabilistic information passed by subsystem α, it fuses the received information with its own prior external state distribution (2.19) to obtain the posterior external state distribution. The process can be written as $x_{k-1;\alpha} \to y_{k-1;\beta}$. The two steps are presented in detail below.

Message passing from subsystem α to subsystem β
The first step is for subsystem α to pass its updated internal state distribution to subsystem β, as given by the following theorem.
Theorem 2. Denote the distribution of the controller $u_{k-1;\alpha}$ by Eq (4.1), where $m_{k-1;\alpha}$ is the mean of the input $u_{k-1;\alpha}$ and $\Sigma_{k-1;\alpha}$ is its variance. The probabilistic model that subsystem α passes to subsystem β is then given by Eq (4.2).

Proof. Based on the chain rule, the conditional joint distribution of the behaviour of the closed-loop system is given by

$$L_\alpha(x_{k;\alpha}, y_{k;\alpha}, u_{k-1;\alpha} \mid z_{k-1;\alpha}) = s_1(x_{k;\alpha} \mid u_{k-1;\alpha}, z_{k-1;\alpha})\, s_2(y_{k;\alpha} \mid y_{k-1;\alpha})\, c(u_{k-1;\alpha} \mid z_{k-1;\alpha}). \tag{4.4}$$

Substituting Eq (2.18) and Eq (4.1) into Eq (4.4) gives the explicit form of the conditional joint distribution, Eq (4.5). To update the knowledge that subsystem β maintains about its external variables, all variables of the closed-loop probability density description of subsystem α except the internal states need to be integrated out:

$$M_{\beta \leftarrow \alpha}(x_{k;\alpha} \mid z_{k-1;\alpha}) = \int L_\alpha(x_{k;\alpha}, y_{k;\alpha}, u_{k-1;\alpha} \mid z_{k-1;\alpha}) \, dy_{k;\alpha} \, du_{k-1;\alpha}. \tag{4.6}$$

Substituting Eq (4.5) into Eq (4.6) gives Eq (4.7), which can be rearranged into Eq (4.8). One term of Eq (4.8) is rewritten using the Woodbury identity, and the push-through identity is then applied to obtain $b_{k-1}$. Similarly, $c_{k-1}$ is obtained using the push-through identity and the Woodbury identity, which gives Eq (4.15). Substituting $b_{k-1}$ and $c_{k-1}$ into Eq (4.12) yields the stated result. The proof is completed.
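The effect of integrating out the control input can be sketched in the scalar case using the standard linear-Gaussian marginalisation identity: if the internal state is linear in a Gaussian control input and has additive Gaussian noise, the marginal over the input is Gaussian with a shifted mean and an inflated variance. The function name `message_from_alpha` and the scalar parameters are illustrative assumptions; this is a sketch of the idea behind the theorem, not a reproduction of the paper's matrix expressions.

```python
def message_from_alpha(x_prev, y_prev, a1, a2, b, q, u_mean, u_var):
    """Scalar sketch of the message subsystem alpha sends about x_k.

    Assumes x_k = a1*x_prev + a2*y_prev + b*u + v, with v ~ N(0, q) and a
    randomised control input u ~ N(u_mean, u_var). Integrating out u (and
    the external state y_k, which does not enter x_k) leaves a Gaussian.
    """
    mean = a1 * x_prev + a2 * y_prev + b * u_mean
    var = q + b * u_var * b  # input uncertainty adds to the process noise
    return mean, var


# Hypothetical numbers: the controller mean cancels the predicted state.
msg_mean, msg_var = message_from_alpha(
    x_prev=1.0, y_prev=2.0, a1=0.5, a2=0.1, b=1.0,
    q=0.1, u_mean=-0.7, u_var=0.2)
```

The message variance is never smaller than the process noise variance, since the randomness of the controller adds to the uncertainty that the neighbour receives.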

Message receiving at subsystem β
Once subsystem β has received the distribution (4.2) passed by subsystem α, it updates its prior external state distribution by merging it with the newly obtained distribution, as stated in the following theorem.
Theorem 3. The posterior external state probability distribution $y_{\mathrm{fused},\beta;k}$, obtained after fusing with the newly received $M_{\beta \leftarrow \alpha}(x_{k;\alpha} \mid z_{k-1;\alpha})$, is given by

$$y_{\mathrm{fused},\beta;k} \propto \mathcal{N}(\mu_{k;f;\beta}, \Sigma_{k;f;\beta}). \tag{4.19}$$

Proof. Recall the prior external distribution of subsystem β, Eq (4.23). The probabilistic model (4.2) that subsystem α passes to subsystem β and the prior distribution (4.23) that subsystem β holds about its external state can be fused using Bayes' rule by multiplying the two together. The fused information is thus described by a new probability distribution whose parameters are given in Eqs (4.25) and (4.26). With suitable definitions, Eqs (4.25) and (4.26) can be expressed in the form stated in the theorem, which completes the proof.
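Fusing two Gaussian pdfs by multiplication has a simple closed form in the scalar case: the precisions (inverse variances) add, and the fused mean is the precision-weighted average of the two means. The sketch below illustrates this; the function name `fuse_gaussians` and the numbers are illustrative assumptions and do not reproduce the paper's matrix expressions.

```python
def fuse_gaussians(mean_prior, var_prior, mean_msg, var_msg):
    """Fuse a prior N(mean_prior, var_prior) with a received message
    N(mean_msg, var_msg) by multiplying the two pdfs and renormalising."""
    precision = 1.0 / var_prior + 1.0 / var_msg  # precisions add
    var_fused = 1.0 / precision
    mean_fused = var_fused * (mean_prior / var_prior + mean_msg / var_msg)
    return mean_fused, var_fused


# Equal confidence in prior and message: the fused mean is the midpoint
# and the fused variance is halved.
fused_mean, fused_var = fuse_gaussians(0.0, 1.0, 2.0, 1.0)
```

When the received message has a much smaller variance than the prior, the fused mean moves close to the message mean, so a confident neighbour dominates the update.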

Remark 2.
Compared with existing distributed approaches, this work introduces the concept of the external state. On the one hand, the external state equation can be treated as part of the full state, which simplifies the controller design. On the other hand, the external state equation provides a short-term prediction of the external values, which allows each subsystem to keep operating if the connection with the neighbouring subsystem is lost, or when future external values are required. Connection loss and the prediction of future state values will be demonstrated in our future work. In addition, the proposed framework follows a fully probabilistic approach, which is more efficient than most existing decentralised methods, which usually do not take the stochastic properties of the system into consideration.

Procedure of message passing
The procedure of the proposed probabilistic message passing framework can be summarised as follows:
1) Initialise all the subsystem states and parameters, including the system parameters $A_3$ and $C_3$ and the FPD Riccati matrix $S_0$;
2) Form the matrices $\tilde{A}$ and $\tilde{B}_\alpha$ as specified in Eq (2.22), and likewise $\tilde{C}$ and $\tilde{B}_\beta$ as specified in Eq (2.25);
3) For each subsystem, calculate the control input $u_k$ following Eqs (3.9)-(3.10) and update the system as given in Eqs (

7) Move to the next sampling instant $k = k + 1$ and update the system using step 2.

Simulation results
To demonstrate the effectiveness of the proposed control strategy, the numerical example used in [32] will be employed in this paper to test the proposed framework.
The system is given by a discrete-time dynamical equation taken from [32], where $x_k$ is the global system state, $u_k$ is the control input and $v_k$ is Gaussian noise with zero mean and variance 0.1, i.e. $v_k \sim \mathcal{N}(0, 0.1)$.
Whereas [32] splits the original system into three decoupled subsystems, in this paper the original system is divided into two subsystems so that the presented control strategy can be shown more clearly. The first subsystem takes the first two states as its internal states, while the second subsystem takes the last state as its internal state. To control the whole system, two control inputs are designed, one for each subsystem. In the resulting subsystem equations, the parameters $a_{11}$, $a_{12}$, $a_{21}$, $a_{22}$ and $a_{33}$ are related to the external observable states, which are updated and communicated between the two subsystems. They are initialised randomly and updated at each time instant.
The initial values of the states are $x_0 = [2.5, -2.3, 0.4]^T$. The objective for each subsystem is to use the FPD algorithm to design a local controller so that all the internal states of the individual subsystems are returned to the origin. The subsystems then exchange their internal state information with each other via the message passing method. In the end, all the system states should return to zero.
The simulation results are given in Figures 1-5. The system states in both subsystems are shown in Figures 1-3, from which it can be seen that all the states in both subsystems return to the origin within 20 steps. This means that the whole system is successfully controlled in a decentralised manner by individually controlling the two subsystems. In addition, Figures 1-3 show that the state trajectories held by the two subsystems match each other, which means that the message passing approach is successfully implemented. Figure 4 and Figure 5 show the FPD gain for subsystem 1 and subsystem 2, respectively. From these figures, the gains converge to their optimal values within 20 steps, indicating that the FPD algorithm works in both subsystems. In conclusion, the control objective is successfully reached using the proposed control strategy.

Conclusion
This paper concentrated on the communication problem between the subsystems of complex dynamical stochastic systems. A detailed discussion of how two subsystems exchange their newly updated internal states has been offered. The process by which the external states affect the subsystems, and thereby help to reach the global control goal, has also been explained. A regulation problem has been considered for a complex system with a large number of subsystems, and a decentralised control strategy using the proposed message passing approach has been developed adaptively. The conventional FPD has been applied to the subsystems as a randomised controller to reach the local control goals. Finally, the associated simulation results have been produced to verify the proposed control algorithm, and the desired results have been obtained.