Adaptive stochastic model predictive control of linear systems using Gaussian process regression

This paper presents a stochastic model predictive control method for linear time-invariant systems subject to state-dependent additive uncertainties modelled by a Gaussian process (GP). The method is developed by rebuilding the tube-based model predictive control framework with chance constraints via adaptive constraint tightening. In particular, the tightened constraint set is constructed by forecasting the confidence region of the uncertainty. Utilising this adaptive strategy, the Gaussian process based stochastic model predictive control (GP-SMPC) algorithm is designed by applying the adaptively tightened constraints over the whole prediction horizon. To reduce the computational load, a one-step GP-SMPC algorithm is further developed by imposing the tightened constraint only on the first-step nominal state and the worst-case constraints on the remaining steps. Under the assumption that the state-dependent uncertainties are bounded, the recursive feasibility of the designed optimisation problem is ensured, and closed-loop system stability is guaranteed. The effectiveness of the approach and its advantages over existing methods are verified via simulation and comparison studies.


INTRODUCTION
Model predictive control (MPC) is widely implemented in modern industry due to its capability of handling nonlinear dynamics, attaining optimal control performance, and fulfilling state and control input constraints. This popular approach solves a finite horizon optimal control problem, initialised by the sampled system state, online at each sampling time instant to compute an optimal control sequence over the horizon, and applies the first element of the computed sequence to the plant. Repeating this procedure produces an effective receding horizon control strategy [1,2]. In practice, robust MPC (RMPC) [3][4][5] is proposed to deal with model uncertainty. In general, by assuming the model uncertainty is bounded, the RMPC approach guarantees algorithm feasibility and constraint satisfaction for all possible realisations by considering the worst-case uncertainty. This framework does not utilise possible statistical properties of the uncertainty, even though statistical information may be available in many cases. Therefore, robust approaches may lead to an overly conservative or possibly infeasible controller design [6]. To reduce this conservatism, recent works employ statistical identification tools to learn model uncertainties. One of the promising statistical identification tools is Gaussian process regression (GPR) [16], which can provide a variance prediction for the uncertainty estimate and can be incorporated into MPC to improve control performance.
Some works model the full dynamics as a Gaussian process (GP). For example, in [17], the authors represent the full dynamics of a simulated pH neutralisation process by an offline-trained GP model in MPC, which results in a controller with a higher level of robustness due to the richer information utilised in the model. Similarly, [18] proposes to use a GP to identify the system behaviour in the event of a fault, and combines it with MPC to obtain a fault-tolerant control functionality, illustrated by two flight control examples in simulation. A GP-based MPC strategy is proposed in [19] and demonstrated on a trajectory tracking problem, where the GP model of the full dynamics is learned from empirical data.
However, as part of the dynamics is usually known in practice, it might be too conservative to model the full dynamics as unknown. To overcome this shortcoming, some other works model only the uncertainties as a GP. For example, a study applying GP-based SMPC to a drinking water network is developed in [20], where the strategy considers the influence of disturbances through uncertainty propagation using GP forecasting. The work in [21] learns a GP model of the uncertainties for a robot and designs an NMPC for a path-repeating experiment in outdoor environments. The work in [22] presents an MPC approach that integrates nonlinear nominal system dynamics with additive dynamics modelled as a GP, and demonstrates the approach in a simulation of an autonomous miniature racing car.
Note that all the aforementioned works mainly focus on the implementation of MPC with GPR, while the feasibility and stability properties are not well studied. In addition, it is either conservative or computationally heavy to directly apply the existing MPC approaches with GPR. In this paper, the model predictive control problem of linear time-invariant (LTI) systems under state-dependent additive uncertainties modelled by a GP is studied. To solve the problem, we propose a new GP-based SMPC (GP-SMPC) scheme based on the framework of learning MPC presented in [15], which falls in the category of analytic approximation methods.
The novelty of this work lies in using the predicted confidence region of the uncertainty in combination with robust tube-based MPC (RTMPC) [23] to adaptively scale the tightening parameters corresponding to the chance constraints. The main contributions of this paper are as follows: • A new GP-based SMPC algorithm, termed the full-step GP-SMPC, is developed. In particular, the uncertainty is modelled as a GP, by which the mean and variance of the uncertainty are predicted. With these predictions, the constraints are adaptively tightened along the whole prediction horizon, and the GP-based SMPC algorithm is then designed by extending the constantly tightened constraint stochastic MPC (CSMPC) [24] based on the framework of [15].
In comparison with [24], this work enlarges the feasible region and reduces conservatism by designing the adaptive tightening mechanism. • To reduce the computational complexity of the full-step algorithm, the one-step GP-SMPC algorithm is further developed by imposing the tightened constraints only on the first-step nominal state and the worst-case constraints on the remaining prediction steps. • The theoretical properties of the developed algorithms are analysed. In particular, recursive feasibility, chance constraint satisfaction and system stability are proved under the assumption that the state-dependent uncertainties are bounded.
The remainder of this paper is organised as follows. Section 2 introduces the problem to be solved and the preliminaries of the CSMPC algorithm. Section 3 presents the GP-SMPC scheme, starting with the GP model training for the uncertainties and the formulation of the uncertainty confidence region, followed by the online constraint tightening, and concluding with the GP-SMPC scheme. In Section 4, the theoretical properties of the proposed scheme are analysed. Section 5 presents the one-step GP-SMPC scheme. Numerical examples and comparison studies are given in Section 6. Finally, Section 7 provides conclusions.
Notations: ℝ denotes the set of all real numbers, ℕ_i denotes the set of all integers equal to or greater than i, and ℕ_i^j denotes the set of consecutive integers {i, …, j}. We denote the value of a variable x at time k as x_k, and the k-step-ahead predicted value of x at time t as x_{k|t}. We define the weighted norm ‖x‖_Q² = xᵀQx, and Q ≻ 0 refers to a positive definite matrix Q. The probability of the occurrence of an event A is denoted by Pr(A). For a random variable x, its expected value and variance are denoted by E(x) and Var(x), respectively. A Gaussian distribution with mean μ and variance Σ is denoted 𝒩(μ, Σ). The notation A ⊕ B = {a + b | a ∈ A, b ∈ B} represents the Minkowski sum, and A ⊖ B = {a ∈ A | a + b ∈ A, ∀b ∈ B} represents the Pontryagin set difference.
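As an aside, for axis-aligned boxes the two set operations above reduce to componentwise bound arithmetic. A minimal numpy sketch (the (lower, upper)-pair box representation and helper names are ours; general polytopes need a tool such as MPT3):

```python
import numpy as np

# Minkowski sum and Pontryagin difference specialised to axis-aligned
# boxes, each represented as a (lower, upper) pair of bound vectors.

def minkowski_sum(A, B):
    # A ⊕ B = {a + b}: for boxes, the bounds simply add componentwise.
    return (A[0] + B[0], A[1] + B[1])

def pontryagin_diff(A, B):
    # A ⊖ B = {a : a + b ∈ A for all b ∈ B}: shrink A by B's extremes.
    return (A[0] - B[0], A[1] - B[1])

X = (np.array([-5.0, -5.0]), np.array([5.0, 5.0]))   # state box
W = (np.array([-0.5, -0.2]), np.array([0.5, 0.2]))   # disturbance box

S = pontryagin_diff(X, W)   # tightened set X ⊖ W
R = minkowski_sum(S, W)     # (X ⊖ W) ⊕ W recovers X for boxes
```

For boxes the difference simply shrinks each bound by the corresponding disturbance bound, which is why the sum afterwards recovers the original set exactly.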

Problem statement
Consider a discrete-time LTI system subject to additive uncertainties

x_{k+1} = A x_k + B u_k + w_k, (1)

where x_k ∈ ℝⁿ, u_k ∈ ℝᵐ and w_k ∈ ℝⁿ are the system state, control input and uncertainty at time k, respectively. It is assumed that all the system states are measured directly, and that the uncertainty is state-dependent with bounded support, lying in a polytope 𝕎, that is, w_k ∈ 𝕎. Moreover, the system in (1) is subject to the following constraints on states and inputs:

x_k ∈ 𝕏, u_k ∈ 𝕌, (2)

where 𝕏 ⊂ ℝⁿ and 𝕌 ⊂ ℝᵐ are compact and each set contains the origin in its interior.
Considering the stochastic nature of the system dynamics, the chance constraint, which allows violation with a probability no greater than ε ∈ (0, 1), is chosen to reduce the conservatism for all the predicted states in the future horizon. The objective of this paper is to design a GP-based SMPC algorithm to stabilise the system in (1) under the following system constraints:

Pr(x_{k+1} ∈ 𝕏) ≥ 1 − ε, (3a)
u_k ∈ 𝕌, ∀w_k ∈ 𝕎, k ∈ ℕ_0. (3b)

Preliminaries
In this subsection, we recall the constantly tightened constraint stochastic MPC (CSMPC) [24] for the system in (1), which will facilitate the algorithm design. Before that, we present some definitions for ease of presentation.

Definition 1 (Robust Positively Invariant Set). A set Z is said to be a robust positively invariant (RPI) set for the discrete-time system x_{k+1} = f(x_k, w_k) if x_k ∈ Z implies x_{k+1} ∈ Z for all w_k ∈ 𝕎.

Definition 2 (Robust Successor Set). For the discrete-time system x_{k+1} = f(x_k, w_k), the robust successor set Suc(Z, 𝕎), that is, the one-step reachable set of the set Z, is the collection of all the states that can be reached in one step from the initial set Z, under the state constraint x_k ∈ Z and the uncertainty constraint w_k ∈ 𝕎. That is, Suc(Z, 𝕎) ≜ {x_{k+1} ∈ ℝⁿ : ∃x_k ∈ Z, ∃w_k ∈ 𝕎, x_{k+1} = f(x_k, w_k)}.

Definition 3 (Predecessor Set). For the discrete-time system x_{k+1} = f(x_k, u_k), the predecessor set Pre(S, 𝕏, 𝕌), that is, the one-step controllable set of the set S, is the collection of all the states that can be driven to the set S in one step, under the input constraint u_k ∈ 𝕌 and state constraint x_k ∈ 𝕏. That is, Pre(S, 𝕏, 𝕌) ≜ {x_k ∈ 𝕏 : ∃u_k ∈ 𝕌, f(x_k, u_k) ∈ S}.

Definition 4 (N-Step Controllable Set). For the discrete-time system x_{k+1} = f(x_k, u_k), the N-step controllable set 𝒦_N(S) is the collection of all the states that can be driven to the set S in N steps, under the input constraint u_k ∈ 𝕌 and state constraint x_k ∈ 𝕏. That is, 𝒦_N(S) ≜ {x_0 ∈ ℝⁿ : x_{k+1} = f(x_k, u_k), x_N ∈ S, ∃u_k ∈ 𝕌, x_k ∈ 𝕏, k ∈ ℕ_0^{N−1}}, and 𝒦_N(S) can also be defined recursively as 𝒦_{k+1}(S) = Pre(𝒦_k(S), 𝕏, 𝕌), 𝒦_0(S) = S, k ∈ ℕ_0^{N−1}.

The system dynamics in (1) can be decoupled into a nominal dynamics and an error dynamics [23]:

s_{k+1} = A s_k + B v_k, (4a)
e_{k+1} = A_cl e_k + w_k, (4b)

where the nominal state is denoted by s_k ∈ ℝⁿ and the nominal input by v_k ∈ ℝᵐ. A_cl = A + BK is stable, where the feedback gain K is obtained by solving the linear quadratic regulator (LQR) problem for the nominal dynamics in (4a). The error between the observed state x_k and the nominal state s_k is defined as

e_k = x_k − s_k. (5)

The state feedback control is used as in RTMPC [23]:

u_k = K(x_k − s_k) + v_k = Ke_k + v_k. (6)

For simplicity, it is assumed that 𝕎 is a compact polytope, and that there exists a polytope 𝕎̂ ⊂ 𝕎 with Pr(w_k ∈ 𝕎̂) ≥ 1 − ε. Let Z be the minimal robust positively invariant (mRPI) set for the error dynamics in (4b) with the disturbance w_k ∈ 𝕎̂, and let the successor invariant set be Z_s = Suc(Z, 𝕎̂); then it holds that [24]

Pr(e_{k+1} ∈ Z_s | e_k ∈ Z) ≥ 1 − ε. (7)

Construct the tightened state constraint set 𝕊 ≜ 𝕏 ⊖ Z_s. Then the satisfaction of the chance constraint (3a) can be guaranteed by (7). The satisfaction of the hard constraint (3b) can be guaranteed by imposing v_k ∈ 𝕍, where 𝕍 ≜ 𝕌 ⊖ KZ. Construct the terminal constraint set X_f as the maximal output admissible set [25] for the autonomous system s_{k+1} = (A + BK)s_k.
The stage cost is ℓ(s, v) = ‖s‖_Q² + ‖v‖_R², and the terminal cost is V_f(s) = ‖s‖_P², where Q ≻ 0, R ≻ 0 and P ≻ 0. Based on the above operations, the finite horizon optimal control problem ℙ_N^csmpc(x_t) in [24], solved at each time instant t, minimises the sum of the stage costs and the terminal cost over the nominal dynamics (4a), subject to the tightened state constraints s_{k|t} ∈ 𝕊, the tightened input constraints v_{k|t} ∈ 𝕍, the initial constraint x_t ∈ s_{0|t} ⊕ Z, and the terminal constraint s_{N|t} ∈ X_f. This tube-based framework, which decouples the planning of the nominal trajectory from the handling of the disturbance, will be used in the new algorithm design.
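The decoupling in (4)-(6) can be illustrated with a brief numpy/scipy sketch; the double-integrator matrices and the uniform disturbance below are our assumptions, not the paper's example:

```python
import numpy as np
from scipy.linalg import solve_discrete_are

# Tube decomposition in closed loop: actual state x tracks a nominal
# state s through the feedback u = K(x - s) + v, cf. (6).
A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[0.5], [1.0]])
Q, R = np.eye(2), np.array([[0.01]])

# LQR gain for the nominal dynamics; A_cl = A + BK must be stable
P = solve_discrete_are(A, B, Q, R)
K = -np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
A_cl = A + B @ K
assert np.max(np.abs(np.linalg.eigvals(A_cl))) < 1.0

rng = np.random.default_rng(0)
x = np.array([2.0, -1.0])   # actual state
s = x.copy()                # nominal state, so e_0 = 0
for _ in range(30):
    v = K @ s                              # nominal input (LQR on s)
    u = v + K @ (x - s)                    # tube feedback (6)
    w = rng.uniform(-0.1, 0.1, size=2)     # bounded uncertainty
    x = A @ x + B @ u + w                  # true dynamics (1)
    s = A @ s + B @ v                      # nominal dynamics (4a)
# the error e = x - s obeys e+ = A_cl e + w and stays bounded
```

Because A_cl is stable and w is bounded, the error never leaves a bounded tube around the nominal trajectory, which is exactly the property the mRPI set Z formalises.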

GAUSSIAN PROCESS BASED SMPC
Any formulation method for the uncertainty polytope 𝕎̂ can be combined with the above CSMPC scheme. For example, the Chebyshev inequality method [11], the box-shaped method and the ellipsoidal method [26] can be adopted if the mean and variance of the uncertainty are known. If samples of the uncertainty can be obtained, the scenario approach [14,24] can be used to generate the polytope. However, if the polytope 𝕎̂ cannot be formulated by the aforementioned methods, another approach may be needed. In the following, a Gaussian process regression method is proposed to address this issue when the uncertainty is state-dependent.

Uncertainty modelling and uncertainty confidence region
In the uncertainty model, the model input and output are the state-control tuple z_k = (x_k, u_k) and the corresponding uncertainty w_k, respectively. From the system in (1) we obtain the uncertainty at time k as

w_k = x_{k+1} − A x_k − B u_k. (9)

The resulting data pair {z_k, w_k} represents an individual experience. Given a collected data set {Z, w} and a test data pair {z*, w*}, the joint distribution is Gaussian [16]:

[w; w*] ∼ 𝒩(0, [K(Z, Z) + σ_n²I, K(Z, z*); K(z*, Z), K(z*, z*)]). (10)

The posterior distribution of w* is still Gaussian, with mean μ(z*) = E(w*) and variance σ²(z*) = Var(w*) given by

μ(z*) = K(z*, Z)(K(Z, Z) + σ_n²I)^{−1} w, σ²(z*) = K(z*, z*) − K(z*, Z)(K(Z, Z) + σ_n²I)^{−1} K(Z, z*). (11)

Here n separate GP models are trained, one for each dimension of w ∈ ℝⁿ, and the optimal hyperparameters of each GP model are obtained offline by maximising the log marginal likelihood of the collected data set. After learning the GP models, the predicted uncertainty w̃_k can be generated. Construct the prediction model corresponding to the system in (1) as

x̃_{k+1} = A x̃_k + B ũ_k + w̃_k, (12)

where x̃_k denotes the predicted state and ũ_k the predicted closed-loop input. The GP distribution of w̃_k based on x̃_k and ũ_k is

w̃_k ∼ 𝒩(μ̃_k, σ̃²_k). (13)

Let 1 − ε, with ε ∈ (0, 1), be the confidence coefficient of w̃_k; the quantile value corresponding to this confidence coefficient is

φ = Φ^{−1}(1 − ε/2), (14)

where Φ^{−1} denotes the quantile function of the standard normal distribution [20]. The confidence interval of w̃_k corresponding to the confidence coefficient 1 − ε can be obtained as

[μ̃_k − φσ̃_k, μ̃_k + φσ̃_k]. (15)

Then the predicted uncertainty confidence region is generated based on this confidence interval as

𝕎̃_k ≜ {w̃_k : μ̃_k − φσ̃_k ≤ w̃_k ≤ μ̃_k + φσ̃_k}, k ∈ ℕ_0. (16)

Suppose that the system is sufficiently excited; then 𝕎̃_k ⊆ 𝕎 follows directly from (14)-(16).
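A sketch of this residual-learning step, using scikit-learn's GP regressor in place of the paper's implementation; the toy dynamics, kernel choice and uncertainty function below are our assumptions for illustration:

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# Learn the residual w_k = x_{k+1} - A x_k - B u_k with one GP per
# state dimension and form a (1 - eps) confidence interval.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])

rng = np.random.default_rng(1)
X = rng.uniform(-2, 2, size=(80, 2))
U = rng.uniform(-1, 1, size=(80, 1))
W = 0.05 * np.sin(X) + 0.01 * rng.standard_normal((80, 2))  # ground truth
Xnext = X @ A.T + U @ B.T + W

Z = np.hstack([X, U])              # GP inputs z_k = (x_k, u_k)
Wres = Xnext - X @ A.T - U @ B.T   # GP targets w_k

eps = 0.2
phi = norm.ppf(1 - eps / 2)        # two-sided Gaussian quantile
gps = []
for i in range(2):                 # one GP per dimension of w
    gp = GaussianProcessRegressor(RBF() + WhiteKernel(), normalize_y=True)
    gp.fit(Z, Wres[:, i])
    gps.append(gp)

z_star = np.array([[0.5, -0.3, 0.2]])
intervals = []
for gp in gps:
    mu, std = gp.predict(z_star, return_std=True)
    intervals.append((mu[0] - phi * std[0], mu[0] + phi * std[0]))
```

The per-dimension intervals collected in `intervals` play the role of the confidence region from which the polytope of uncertainty is built at each prediction step.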
Remark 1. Note that the multiplicative uncertainty consists of a constant part and a time-varying part [1,11]. The time-varying part, which is a function of the state and input, can be seen as a state-dependent additive uncertainty. Therefore, the multiplicative uncertainty can be predicted by the proposed GP model as well.
It is worth noting that the proposed method can be used when the polytope 𝕎̂ cannot be formulated accurately, or cannot be formulated at all, for state-dependent uncertainties in practical applications such as robot arms [27], water supply networks [20] and trajectory tracking [19,21,22].

Constraints handling
Since computing probability integrals is challenging, solving the chance-constrained optimisation problem requires an approximation method. We propose a strategy that formulates the chance constraints as adaptive deterministic constraints on the nominal state by using the predicted uncertainty confidence region; in particular, the chance constraint in (3a) is converted into a deterministic constraint. To this end, we need the following results.
Let Z be the mRPI set for the error dynamics in (4b) with the uncertainty w ∈ 𝕎. Suppose there exists a polytope 𝕎̃_k ⊆ 𝕎. According to Definition 2, the robust successor set for the error dynamics in (4b) is Z_s^{k+1} = Suc(Z, 𝕎̃_k). Construct the successor invariant set as

Z_s^{k+1} ≜ A_cl Z ⊕ 𝕎̃_k; (17)

then the tightened state constraint set is

𝕊_{k+1} ≜ 𝕏 ⊖ Z_s^{k+1}. (18)

Clearly, if the nominal state s_{k+1} ∈ 𝕊_{k+1}, then the chance constraint in (3a) is satisfied, that is, Pr(x_{k+1} = s_{k+1} + e_{k+1} ∈ 𝕏) ≥ 1 − ε is guaranteed by (18). The satisfaction of the hard constraint in (3b), that is, u_k = v_k + Ke_k ∈ 𝕌, can be guaranteed by constructing the robust tightened input constraint set 𝕍 ≜ 𝕌 ⊖ KZ and letting v_k ∈ 𝕍.
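For symmetric axis-aligned boxes, the adaptive tightening above reduces to simple interval arithmetic. A numpy sketch (all matrices and numbers are illustrative assumptions; general polytopes would use MPT3 as in the paper):

```python
import numpy as np

# Adaptive tightening: X is shrunk by Z_s^{k+1} = A_cl Z ⊕ W̃_k, with
# W̃_k taken from the per-step GP confidence interval half-widths.

def box_image(M, half):
    # componentwise bound of {M e : |e_i| <= half_i}
    return np.abs(M) @ half

A_cl = np.array([[0.6, 0.1], [0.0, 0.5]])
z_half = np.array([0.4, 0.3])     # mRPI set Z as a symmetric box
x_half = np.array([5.0, 5.0])     # state set X = [-5, 5]^2

# per-step confidence half-widths, e.g. phi * sigma from the GP
w_half_k = [np.array([0.10, 0.05]), np.array([0.20, 0.08])]

tightened = []
for wh in w_half_k:
    zs_half = box_image(A_cl, z_half) + wh   # successor set bound
    tightened.append(x_half - zs_half)       # S_{k+1} = X ⊖ Z_s^{k+1}
```

A smaller GP confidence region at a given step directly yields a larger tightened set at that step, which is the source of the reduced conservatism.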
Remark 2. Due to the use of the predicted uncertainty confidence region, the state constraint set 𝕊_k is tightened adaptively along the prediction horizon. Compared with the robust tightened state constraint set 𝕊_r ≜ 𝕏 ⊖ Z [23], this adaptive method results in less restrictive tightening, as shown in the following lemma and Subsection 3.4.

Lemma 1.
Let Z be the mRPI set for the error dynamics in (4b) with the uncertainty w ∈ 𝕎. Suppose 𝕎̃_k ⊆ 𝕎 holds. Then 𝕊_r ⊆ 𝕊_k, k ∈ ℕ_0, and 𝕊_r ⊆ 𝕊.
Proof. According to Definition 1 and the error dynamics in (4b), e_{k+1} = A_cl e_k + w_k ∈ Z for all e_k ∈ Z and w_k ∈ 𝕎, that is, Z⁺ = A_cl Z ⊕ 𝕎 ⊆ Z. According to Definition 2 and the error dynamics in (4b), e_{k+1} = A_cl e_k + w_k ∈ Z_s^{k+1} for all e_k ∈ Z and w_k ∈ 𝕎̃_k. That is,

Z_s^{k+1} = A_cl Z ⊕ 𝕎̃_k ⊆ A_cl Z ⊕ 𝕎 ⊆ Z, (19)

so 𝕏 ⊖ Z ⊆ 𝕏 ⊖ Z_s^{k+1}, that is, 𝕊_r ⊆ 𝕊_k, by (18) and 𝕊_r ≜ 𝕏 ⊖ Z. Using a similar argument, it can be derived that Z_s ⊆ Z and 𝕊_r ⊆ 𝕊. □ Lemma 1 will facilitate comparing the feasible regions of different approaches.
To ensure feasibility, we construct the robust terminal constraint set X_f^r as the maximal output admissible set [25] for the autonomous system s_{k+1} = A_cl s_k, subject to s ∈ 𝕊_r and Ks ∈ 𝕍. (20) In this case, X_f^r satisfies the standard terminal set axioms [4]: (A1) X_f^r is positively invariant for s_{k+1} = A_cl s_k, with X_f^r ⊆ 𝕊_r and KX_f^r ⊆ 𝕍; (A2) the terminal cost V_f is a Lyapunov function on X_f^r.

Gaussian process based SMPC
Combining the adaptive constraint tightening method and the tube-based MPC framework, the Gaussian process based stochastic finite horizon optimal control problem ℙ_N^gpsmpc(x_t) at each time instant t is formulated as follows: minimise the cost Σ_{k=0}^{N−1} ℓ(s_{k|t}, v_{k|t}) + V_f(s_{N|t}) over the nominal sequences, subject to the nominal dynamics s_{k+1|t} = As_{k|t} + Bv_{k|t}, the prediction dynamics x̃_{k+1|t} = Ax̃_{k|t} + Bũ_{k|t} + w̃_{k|t} (21c) with samples w̃_{k|t} generated by gp(x̃_{k|t}, ũ_{k|t}) (21e), the adaptively tightened state constraints s_{k|t} ∈ 𝕊_{k|t} (21j), the tightened input constraints v_{k|t} ∈ 𝕍 (21k), the initial constraint x_t ∈ s_{0|t} ⊕ Z (21l), and the terminal constraint s_{N|t} ∈ X_f^r (21m). Here φ = Φ^{−1}(1 − ε/2) and gp(x̃_{k|t}, ũ_{k|t}) ∼ 𝒩(μ̃_{k|t}, σ̃²_{k|t}).
The optimal solution of ℙ_N^gpsmpc consists of the optimal nominal input sequence v*(x_t) and the associated optimal nominal state sequence s*(x_t), with first elements v*_{0|t} and s*_{0|t}, and the optimal control law is designed as

u_t = K(x_t − s*_{0|t}) + v*_{0|t}. (22)

The entire process of the GP-SMPC is repeated for all t ⩾ 0 to yield a receding horizon control strategy.
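The receding-horizon structure of (21)-(22) can be sketched as follows, with an unconstrained finite-horizon LQ solve standing in for the tightened optimisation; the system, weights and state-dependent disturbance are our assumptions:

```python
import numpy as np
from scipy.linalg import solve_discrete_are

# At each t: solve a finite-horizon problem for the nominal trajectory,
# keep only v*_{0|t}, apply the tube law u_t = K(x_t - s*_{0|t}) + v*_{0|t}.

def finite_horizon_lq(A, B, Q, R, P, N):
    # backward Riccati recursion -> time-varying feedback gains
    Ks, Pk = [], P
    for _ in range(N):
        Kk = -np.linalg.solve(R + B.T @ Pk @ B, B.T @ Pk @ A)
        Pk = Q + A.T @ Pk @ A + A.T @ Pk @ B @ Kk
        Ks.append(Kk)
    return Ks[::-1]            # Ks[0] is the gain for the first step

A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[0.5], [1.0]])
Q, R, N = np.eye(2), np.array([[0.01]]), 10
P = solve_discrete_are(A, B, Q, R)                   # terminal weight
K = -np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)   # tube gain

rng = np.random.default_rng(2)
x = np.array([3.0, -1.0])
s = x.copy()
for t in range(15):
    gains = finite_horizon_lq(A, B, Q, R, P, N)
    v0 = gains[0] @ s                        # first nominal input v*_{0|t}
    u = v0 + K @ (x - s)                     # tube control law (22)
    w = 0.05 * np.tanh(x) + rng.uniform(-0.02, 0.02, 2)  # state-dependent
    x = A @ x + B @ u + w
    s = A @ s + B @ v0
```

In the actual scheme the inner solve would additionally enforce the tightened constraints (21j)-(21m); the loop structure of solving, applying the first input and re-measuring the state is the same.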

Comparison on feasible regions
For each time instant, the feasible region of the nominal state for the problem ℙ_N^gpsmpc(x_t) is the state set satisfying the tightened state constraint in (21j), the tightened input constraint in (21k), the initial state constraint in (21l), and the terminal constraint in (21m). Hence, by Definition 4, the N-step feasible region of the nominal state for the problem ℙ_N^gpsmpc(x_t) is

𝒮_{N|t} ≜ 𝒦_{N|t}(X_f^r) = {s_{0|t} ∈ ℝⁿ : s_{k+1|t} = As_{k|t} + Bv_{k|t}, x_t ∈ s_{0|t} ⊕ Z, s_{N|t} ∈ X_f^r, ∃v_{k|t} ∈ 𝕍, s_{k|t} ∈ 𝕊_{k|t}, k ∈ ℕ_0^{N−1}}. (23)

As a result, the N-step feasible region of the actual state for the problem ℙ_N^gpsmpc(x_t) is

X_{N|t} = 𝒮_{N|t} ⊕ Z. (24)

The corresponding N-step feasible region of the actual state for RTMPC in [23] is X̄_N = 𝒮̄_N ⊕ Z, where 𝒮̄_N is defined with the worst-case tightened set 𝕊_r in place of 𝕊_{k|t}. According to Lemma 1, 𝕊_r ⊆ 𝕊_k, k ∈ ℕ_0, so it follows that X̄_N ⊆ X_{N|t}. Similarly, we can infer that X̄_N ⊆ X̂_N, where X̂_N is the corresponding feasible region of CSMPC [24].
Note that, since the uncertainty is state-dependent and the constraints are tightened adaptively in the GP-SMPC strategy, we can infer that 𝕊 ⊆ 𝕊_k, and hence X̂_N ⊆ X_{N|t}, with X̂_N the feasible region of CSMPC.

THEORETICAL PROPERTIES OF GP-SMPC
In this section, the feasibility and constraint satisfaction of the designed GP-SMPC scheme are first presented, and then the stability results for the closed-loop system are provided. Before proceeding to the main results, a lemma for developing the feasibility result is presented.

Lemma 2. Suppose that Z is the mRPI set for the error dynamics in (4b). For x_{t+1} = Ax_t + Bu_t + w_t and s_{1|t} = As_{0|t} + Bv_{0|t}, if x_t ∈ s_{0|t} ⊕ Z and u_t = K(x_t − s_{0|t}) + v_{0|t}, then x_{t+1} ∈ s_{1|t} ⊕ Z for all w_t ∈ 𝕎.

Proof. The proof follows a similar line of reasoning as Proposition 1 in [28]. In (5) we have defined e_t = x_t − s_{0|t} and e_{t+1} = x_{t+1} − s_{1|t}. Since x_t ∈ s_{0|t} ⊕ Z, it follows that e_t = x_t − s_{0|t} ∈ Z. Therefore, according to Definition 1 and the error dynamics in (4b), A_cl Z ⊕ 𝕎 ⊆ Z holds, so e_{t+1} = (A + BK)e_t + w_t ∈ Z for all e_t ∈ Z and w_t ∈ 𝕎. Since x_{t+1} = s_{1|t} + e_{t+1}, x_{t+1} ∈ s_{1|t} ⊕ Z follows. □

Based on Lemma 2, the feasibility and constraint satisfaction result is reported as follows.

Theorem 1 (Feasibility and Constraint Satisfaction). Let X_{N|t} be defined in (24). Then the following hold true.

(a) If x_t ∈ X_{N|t}, then the problem ℙ_N^gpsmpc(x_t) is feasible. (b) If a solution 𝒟_t = {v*(x_t), s*(x_t)} is feasible for the problem ℙ_N^gpsmpc(x_t) with x_t ∈ X_{N|t}, then applying the control law in (22) gives recursive feasibility, that is, there exists a feasible 𝒟_{t+1} for x_{t+1}. (c) If x_0 ∈ X_{N|t}, then for x_{t+1} = Ax_t + Bu_t + w_t and s_{1|t} = As_{0|t} + Bv_{0|t}, Pr(x_{t+1} ∈ 𝕏) ≥ 1 − ε and u_t ∈ 𝕌 for all t ∈ ℕ_0.

Proof. For part (a), if x_t ∈ X_{N|t} = 𝒮_{N|t} ⊕ Z, then by (23) there exist nominal sequences satisfying the constraints (21j)-(21m), so ℙ_N^gpsmpc(x_t) is feasible. The proof of part (b) follows a similar line of reasoning as Proposition 3 in [23]. Since 𝒟_t is feasible for the problem ℙ_N^gpsmpc(x_t) with x_t ∈ X_{N|t}, the constraints (21j)-(21m) are satisfied by v*(x_t) and s*(x_t). We choose the candidate solution as the shifted sequences ⃗v*(x_t) = [v*_{1|t}, …, v*_{N−1|t}, Ks*_{N|t}] and ⃗s*(x_t) = [s*_{1|t}, …, s*_{N|t}, A_cl s*_{N|t}]. Hence, the first N − 1 elements of ⃗v*(x_t) satisfy the constraint (21k) and the first N elements of ⃗s*(x_t) satisfy the constraint (21j). Because s*_{N|t} ∈ X_f^r, it follows from A1 that A_cl s*_{N|t} ∈ X_f^r and Ks*_{N|t} ∈ 𝕌 ⊖ KZ. Therefore, the last element Ks*_{N|t} of ⃗v*(x_t) satisfies constraint (21k) and the last element A_cl s*_{N|t} of ⃗s*(x_t) satisfies the terminal constraint (21m). Moreover, from Lemma 2 we have that if x_t ∈ s*_{0|t} ⊕ Z, then x_{t+1} ∈ s*_{1|t} ⊕ Z for all w ∈ 𝕎, that is, the initial constraint (21l) is satisfied. To sum up, 𝒟_{t+1} is feasible for the problem ℙ_N^gpsmpc(x_{t+1}). The proof of part (c) is as follows. From the ℙ_N^gpsmpc(x_t) scheme in (21) we have that e_t = x_t − s*_{0|t} ∈ Z, t ∈ ℕ_0. Using the fact that the successor invariant set of Z under w_t ∈ 𝕎̃_{0|t} is Z_s^{1|t} = Suc(Z, 𝕎̃_{0|t}), we get that Pr(A_cl e_t + w_t ∈ Z_s^{1|t} | e_t ∈ Z) ≥ 1 − ε for all w_t ∈ 𝕎, following (18). From the recursive feasibility of ℙ_N^gpsmpc(x_t) we have that s*_{1|t} ∈ 𝕊_{1|t}. Consequently, Pr(s*_{1|t} + A_cl e_t + w_t ∈ 𝕊_{1|t} ⊕ Z_s^{1|t} | e_t ∈ Z) ≥ 1 − ε. Finally, as x_{t+1} = s*_{1|t} + A_cl e_t + w_t and 𝕊_{1|t} ⊕ Z_s^{1|t} ⊆ 𝕏, it follows that Pr(x_{t+1} ∈ 𝕏) ≥ 1 − ε, t ∈ ℕ_0.

To prove the satisfaction of the hard input constraint, note that x_t − s*_{0|t} ∈ Z and v*_{0|t} ∈ 𝕍 = 𝕌 ⊖ KZ, t ∈ ℕ_0, due to the recursive feasibility of ℙ_N^gpsmpc(x_t); hence u_t = K(x_t − s*_{0|t}) + v*_{0|t} ∈ KZ ⊕ (𝕌 ⊖ KZ) ⊆ 𝕌. □ Finally, we show that the designed algorithm ensures closed-loop exponential stability. The main result is summarised in Theorem 2.
Theorem 2 (Stability). Suppose that Z is the mRPI set for the error dynamics in (4b) and let X_{N|t} be defined in (24). For the system in (1), the following hold.

(a) The feasible region X_{N|t} is RPI. (b) Z is exponentially stable with the feasible region X_{N|t}.
Proof. Since x_t ∈ X_{N|t}, it follows from the definition in (24) that there exists s_{0|t} ∈ 𝒮_{N|t} with x_t − s_{0|t} ∈ Z. Then there exists an input sequence v⁰(s_{0|t}) = [v_{k|t} ∈ 𝕍, k ∈ ℕ_0^{N−1}] and the associated state sequence s⁰(s_{0|t}) = [s_{k|t} ∈ 𝕊_{k|t}, k ∈ ℕ_0^{N−1}] such that s_{N|t} ∈ X_f^r and v_{N|t} = Ks_{N|t} ∈ 𝕌 ⊖ KZ. This implies s_{0|t} ∈ 𝒮_{N+1|t} whenever s_{0|t} ∈ 𝒮_{N|t}, that is, 𝒮_{N|t} ⊆ 𝒮_{N+1|t}. From the definition of 𝒮_{N|t} in (23) and Definition 4, s_{0|t} ∈ Pre(𝒦_{N−1|t}(X_f^r), 𝕊_{1|t}, 𝕍). Furthermore, according to Definition 3 and the nominal dynamics in (4a), it follows that s_{1|t} = As_{0|t} + Bv_{0|t} ∈ 𝒦_{N−1|t}(X_f^r) = 𝒮_{N−1|t} ⊆ 𝒮_{N|t}. Consequently, if s_{0|t} ∈ 𝒮_{N|t}, then s_{1|t} ∈ 𝒮_{N|t}; namely, 𝒮_{N|t} is a positively invariant set for the nominal dynamics in (4a). From Lemma 2 we have that e_{t+1} = x_{t+1} − s_{1|t} = A_cl(x_t − s_{0|t}) + w_t ∈ Z, ∀w_t ∈ 𝕎. Therefore, x_{t+1} ∈ (s_{1|t} ⊕ Z) ⊆ (𝒮_{N|t} ⊕ Z) = X_{N|t}. This proves part (a).
To prove (b), note that the terminal cost ‖·‖_P² is a control Lyapunov function on the terminal set X_f^r, and X_f^r satisfies the axioms A1 and A2. As a result, lim_{k→∞} s_{k|t} = 0 and the origin s = 0 is exponentially stable with feasible region 𝒮_{N|t}, following Theorems 2.7 and 2.8 in [5]. Moreover, since e_{t+k} = x_{t+k} − s_{k|t} ∈ Z, lim_{k→∞} x_{t+k} = lim_{k→∞}(s_{k|t} + e_{t+k}) = lim_{k→∞} e_{t+k} ∈ Z. Thus, Z is exponentially stable with the feasible region X_{N|t} = 𝒮_{N|t} ⊕ Z. □

ONE-STEP GP-SMPC
Theoretically speaking, if the offline-learned uncertainty model is sufficiently accurate, the GP-SMPC method can deliver the desired performance due to its step-by-step long-horizon estimation. However, the model accuracy may be limited by the amount of collected data, and the computational load of the full-step GP-SMPC is heavy. To resolve this issue, the one-step GP-SMPC is proposed. The reasoning is that, because of the receding horizon control strategy, a new measured state becomes available before the prediction at the next step, which means that the next input u_{t+1} can be computed only under the tightened constraint 𝕊_{0|t+1} [29], and the set 𝕊_{0|t+1} is quite close to the first-step tightened constraint 𝕊_{1|t}. We therefore compute only the first-step predicted tightened constraint set 𝕊_{1|t} in the open-loop process and fix all the remaining tightened constraint sets to the worst case 𝕊_r to alleviate computational consumption [30]. Following the same procedure as ℙ_N^gpsmpc(x_t) in (21), a new finite horizon optimal control problem ℙ_{N−one}^gpsmpc(x_t) (25) at each time instant t is formulated by replacing (21j) with s_{1|t} ∈ 𝕊_{1|t} and s_{k|t} ∈ 𝕊_r, k ∈ ℕ_2^{N−1}, where the single GP sample w̃_{0|t} is generated by gp(x̃_{0|t}, ũ_{0|t}) ∼ 𝒩(μ̃_{0|t}, σ̃²_{0|t}) and φ = Φ^{−1}(1 − ε/2) as before. The one-step GP-SMPC solves ℙ_{N−one}^gpsmpc(x_t) to generate the control input. Note that the tightened constraint set for the states only needs to be computed once at each time instant. Roughly speaking, the computational consumption of generating the uncertainty confidence region in the one-step GP-SMPC is about 1/N of that in the full-step version.
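The difference between the two tightening schedules can be sketched with the same box arithmetic as before (all numbers are illustrative assumptions):

```python
import numpy as np

# One-step tightening: only the first step uses the GP confidence
# region; steps 2..N fall back to the worst-case set S_r.
A_cl = np.array([[0.6, 0.1], [0.0, 0.5]])
z_half = np.array([0.4, 0.3])        # mRPI box Z
x_half = np.array([5.0, 5.0])        # state box X
w_half_max = np.array([0.5, 0.2])    # worst-case bound of W

s_r = x_half - (np.abs(A_cl) @ z_half + w_half_max)  # worst-case S_r

w_half_gp = np.array([0.12, 0.06])   # single GP confidence query
s_1 = x_half - (np.abs(A_cl) @ z_half + w_half_gp)   # adaptive S_{1|t}

N = 10
tightened = [s_1] + [s_r] * (N - 1)  # one GP evaluation instead of N
```

The first-step set is less restrictive than the worst-case fallback, while only a single confidence-region evaluation is required per time instant.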
The feasibility, constraint satisfaction and stability results for the one-step GP-SMPC scheme in (25) can be obtained similarly to those for the full-step GP-SMPC scheme in (21), so they are omitted here.

NUMERICAL SIMULATION
In this section, the feasible region of the proposed GP-SMPC scheme is first calculated and compared with those of existing methods. Then the control performances of the full-step GP-SMPC scheme and the one-step scheme are compared. Finally, the chance constraint violation in GP-SMPC is analysed and verified. In the simulations, the feasible region sets 𝒮_N and X_N are efficiently approximated using the algorithm in [31], and the robust positively invariant set Z is computed as an outer approximation of the minimal robust positively invariant set using the strategy in [32]. Furthermore, the implementations of those algorithms and the computations of 𝕊_k and Z_s^k are performed using the MPT3 toolbox [33].
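The outer approximation of the mRPI set can be sketched for symmetric boxes via a truncated Minkowski sum inflated by 1/(1 − α), in the spirit of [32]; the matrices and bounds below are illustrative assumptions, and general polytopes would use MPT3 as in the paper:

```python
import numpy as np

# Outer approximation of the mRPI set Z for e+ = A_cl e + w,
# |w_i| <= w_half_i: sum the first s terms A_cl^i W, then inflate.
A_cl = np.array([[0.6, 0.1], [0.0, 0.5]])
w_half = np.array([0.5, 0.2])

s_steps = 10
M = np.eye(2)
z_half = np.zeros(2)
for _ in range(s_steps):
    z_half += np.abs(M) @ w_half    # add the bound of A_cl^i W
    M = M @ A_cl
alpha = np.max((np.abs(M) @ w_half) / w_half)  # A_cl^s W ⊆ alpha W
z_half = z_half / (1.0 - alpha)     # inflate to cover the infinite tail
```

Since A_cl is stable, α < 1 for large enough s, and the inflated box is guaranteed to contain the true infinite sum.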

Comparison on feasible regions
Consider a discrete-time LTI system (26) of the form (1), subject to state-dependent additive uncertainties with bounded support, with uncertainty polytope 𝕎 ⊂ ℝ². The state and input constraints are Pr(x_{k+1} ∈ 𝕏) ≥ 0.8 and u_k ∈ 𝕌, ∀w_k ∈ 𝕎, k ∈ ℕ_0. The weights of the cost function are Q = I₂ and R = 0.01, respectively. K is the LQR feedback gain for the unconstrained optimal problem (A, B, Q, R). The prediction horizon is N = 10. The 10-step feasible regions for GP-SMPC, RTMPC, and CSMPC are shown in Figure 1. It can be seen that the feasible region of GP-SMPC is much larger than those of CSMPC and RTMPC, that is, X̄_N ⊂ X̂_N ⊂ X_{N|t}.

Comparison of full-step GP-SMPC and one-step GP-SMPC
Under the same conditions and setup, simulations of the full-step GP-SMPC and the one-step version for the system in (26) are conducted, where the simulation length is N_sim = 12. The results are shown in Figures 2-7. Figure 2 shows the closed-loop state trajectories of the full-step GP-SMPC and the relevant k-step (k ∈ ℕ_0^{10}) feasible regions. The solid-circle line (red) represents the actual trajectory, while the dash-dot line (blue) is the nominal trajectory. Along the nominal trajectory, the tubes s_{0|t} ⊕ Z (t ∈ ℕ_0^{N_sim−1}) are shown as the small hexagons. Figure 3 presents the corresponding state trajectories of the one-step GP-SMPC. Figure 4 shows the first-step open-loop state trajectories of the full-step GP-SMPC and their predicted varying tubes around the optimal nominal states. The solid-circle line (red) represents the predicted trajectory, while the dash-dot line (blue) denotes the nominal trajectory. Figure 5 shows the first-step open-loop predicted uncertainties of the full-step GP-SMPC. The dashed line (red) represents the open-loop predicted uncertainty, while the dash-dot line (blue) denotes the closed-loop actual uncertainty. If the offline-learned uncertainty model were sufficiently accurate, all the closed-loop actual uncertainties would fall into the 80% confidence interval of the open-loop prediction. In practice, however, among all 10 actual points of w_1, there are 4 points violating the 80% confidence interval of the open-loop prediction, and for w_2, 5 points. Figure 6 shows the closed-loop uncertainties of the one-step GP-SMPC. The dashed line (red) represents the closed-loop predicted uncertainty, while the dash-dot line (blue) denotes the closed-loop actual uncertainty. It can be seen that the one-step GP-SMPC performs a similar uncertainty prediction as the full-step case. Among all 12 actual points of w_1, there are 4 points violating the 80% confidence interval of the closed-loop prediction.
For w_2, 5 points do. Figure 7 compares the cost values of the full-step GP-SMPC algorithm and the one-step algorithm. The cost values of the full-step GP-SMPC algorithm are bigger than those of the one-step version in the first few steps. Considering the structure of the cost function, this signifies that the nominal states of the full-step GP-SMPC algorithm are closer to the constraint boundary than those of the one-step version, namely, the full-step case results in less conservatism.
Moreover, the average computation time of the full-step GP-SMPC per closed-loop step is 23.2 s, while that of the one-step version is 2.5 s.
To sum up, the simulation results validate that: (1) the proposed full-step and one-step GP-SMPC algorithms are both effective, enlarging the feasible region while handling the state-dependent uncertainty; (2) the one-step GP-SMPC has less computational load and achieves a control performance comparable to the full-step GP-SMPC, which makes it a good choice for practical implementation.
Chance constraint violation

The state and input constraints are Pr(x_{k+1} ∈ 𝕏) ≥ 0.8 and u_k ∈ 𝕌, ∀w_k ∈ 𝕎, k ∈ ℕ_0. The weights of the cost function are Q and R = 1, respectively. K is the LQR feedback gain for the unconstrained optimal problem (A, B, Q, R). The prediction horizon is N = 8, the simulation length is N_sim = 15 and the initial state is x_0 = [2.5, 2.8]ᵀ. Figure 8 reports the simulation results of the full-step GP-SMPC with 100 realisations. The left part shows the closed-loop trajectories over the 15-step simulation; the right part shows the detailed constraint violation. Over the 100 realisations, the average constraint violation in the first 6 steps is 17.5%, which is close to the 20% level specified in advance. Figure 9 reports the simulation results of the one-step GP-SMPC with 100 realisations. The average constraint violation in the first 6 steps is 17.17%, which is comparable to that of the full-step version.

CONCLUSION
This paper has developed a stochastic MPC approach based on an adaptive tightening technique for linear constrained systems with state-dependent stochastic disturbances, and designed two SMPC algorithms, namely the full-step GP-SMPC and its one-step version. The proposed GP-SMPC scheme reduces conservatism by tightening the constraints adaptively, using the predicted information to form the confidence region of the uncertainty. Furthermore, the first-step tightened constraints are exploited in the one-step GP-SMPC to reduce the computational load. Under mild assumptions, recursive feasibility and system stability are rigorously proved. Numerical simulations verify that the GP-SMPC approach provides an enlarged feasible region compared with RTMPC and CSMPC, and reduces conservatism in constraint violation. Future work will concentrate on finding better approximations of the time-varying confidence region by updating the GP model online.