Neural network‐based non‐linear adaptive controller design for a class of bilinear system

This study presents a novel neural network (NN)-based non-linear adaptive control strategy for the global stability of multi-input–multi-output state-control homogeneous bilinear system (BLS) at the equilibrium position. Although this class of non-linear system is neither piecewise nor feedback linearisable, conditionally stabilisable control system design can be utilised to generate multiple state transitions and corresponding control gains. The collected data was used to train a NN to obtain an optimal gain estimator. Then the optimal gain estimator was integrated into real-time control system operation to adaptively compute control gains, ensuring that the controller is continuously adjustable to changing behaviour of the system. The proposed design was shown, through an illustrative example, to overcome the stability limitations of traditional controllers for the investigated class of BLS. Furthermore, discussions about the utility of the traditional control and learning system integration, as well as stability analysis of the proposed scheme were presented.


Introduction
A bilinear system (BLS) can be described as a linear system with state and control input coupling terms [1][2][3]. Although these systems are linear in state as well as in control, they are jointly non-linear. Interest in the study of such systems has been sustained largely because several real-world dynamical systems exhibit bilinear behaviour. They include biological, chemical, nuclear and thermal processes [4,5]. For instance, biochemical reactions exhibit bilinear behaviour in that the enzyme concentration (control variable), and both the substrate concentration and the decomposing complex (the state variables), are jointly non-linear. A similar phenomenon is exhibited in the regulation of thyroxin in the human body, where the control variable is the concentration of the free protein, and the states are the free thyroxin and protein-bound thyroxin concentrations. In nuclear fission processes, the rate of change of neutron population is bilinear in nature. In that case, the neutron multiplicative factor (control variable) is jointly non-linear with the neutron population and the precursor (the state variables).
Whereas a general solution to the BLS control problem remains an open research question, control techniques for certain classes of BLS have been reported. These include piecewise linearisation methods, feedback linearisation techniques, and Lyapunov-based non-linear control schemes. In [6], a static linear state-feedback control was designed for BLS by solving complex optimisation problems that utilise linear matrix inequalities (LMIs). However, only local stability is guaranteed and the polytopic region must be within the domain of attraction of the equilibrium. The authors of [7] proposed a piecewise constant control method for planar BLS using the switching control approach. In [8], a convex optimisation procedure for the stability of BLS with binary inputs was proposed, with only local stability guaranteed.
An output feedback controller based on backstepping strategy was designed in [9] to stabilise multi-variable systems with bilinear stochastic coupling, invariably attenuating the stochastic coupling of the output and ensuring that the closed-loop trajectories are bounded. In [10], the passive control scheme was applied to the feedback stabilisation of BLS with bounded inputs and multiplicative noise, where a bounded non-linear feedback controller was introduced based on the storage function strategy. The problem of establishing a minimum-time control for near-controllability of BLS was addressed in [11]. In [12], a neural network (NN) was used to find the solution of the Hamilton-Jacobi-Bellman (HJB) equation by estimating the value function parameters. This, in turn, was used to stabilise BLS for which an optimal control solution with static parameters exists. Also, an NN was utilised in [13] to approximate the system behaviour of the hyperbolic systems of conservation laws (HSCL) class of BLS with nonstandard boundary conditions. The NN was used for computational modelling of the system dynamics, whereas a Lyapunov function was utilised for controller synthesis. Although the NN was integrated into the control system, the control action was still computed using the traditional approach.
To the best of the authors' knowledge, a class of BLS for which there is no reported global stability solution is the multiple-input multiple-output (MIMO) state-control homogeneous BLS (HBLS). This class of BLS has neither linear state nor linear control components, but only the state-control coupled terms. As a result, these systems are neither piecewise linearisable nor feedback linearisable. Moreover, existing control schemes only achieved conditional stability, as presented in Section 3. Therefore, in this work, an NN-based non-linear adaptive control method has been proposed.
In contrast to ordinary controllers, the parameters of adaptive controllers are continuously adjusted on-line based on the changing system dynamics. Therefore, rather than having a single controller, a family of controllers is realised due to the tunability of the controller parameters. However, there has been little success with the application of adaptive control to general non-linear systems. For the classes of systems where solutions exist, the following conditions must be satisfied: (1) non-linear dynamics of the plant can be linearly parameterised, (2) control input must be able to cancel non-linearities with no unstable hidden nodes or dynamics if the parameters are known, and (3) full plant state must be measurable [14]. As analysed in Section 4 of this paper, MIMO state-control HBLS does not satisfy conditions 1 and 2.
The development of learning models has presented tools for advanced decision making in adaptive control designs. For example, learning models have been deployed in the control of modern systems such as autonomous vehicle navigation [15] and robotics [16,17]. In order to stabilise and achieve tracking control of switched non-strict-feedback non-linear systems with unknown non-linearities, dead zones and unmeasured states, the authors of [18] proposed an observer-based fuzzy output feedback control scheme. The unknown functions were approximated using fuzzy logic systems, the unmeasurable states were estimated by a fuzzy switched state observer, and backstepping control design strategy was utilised. However, BLS was not considered in the problem formulation as the states and control input were not multiplicative.
In [19], a decentralised adaptive NN output-feedback controller was designed for the stabilisation and tracking control of non-linear systems. Whereas the NN was utilised by the authors to model unknown non-linear functions and the adaptive backstepping design strategy was employed, our proposed scheme employed NN to adaptively compute the control input. Although the interconnecting terms are unknown, the states and the control input were not multiplicative, indicating that BLS was not considered in the problem formulation. The effect of multiple fading channels on the state estimator performance of periodic NNs was investigated in [20]. Specifically, two sufficient conditions were provided to ensure that the estimation error system is stochastically stable and meets specified H∞ performance, and the estimator gains were obtained. The NN was used to approximate intermediate control functions in [21] to realise an adaptive finite-time decentralised control scheme. Backstepping strategy was integrated with the Lyapunov control theory to achieve desired control of large-scale non-linear systems with input saturation and time-varying output constraints in finite time. However, BLS was not considered.
Therefore, due to the capability of NN models to characterise highly non-linear behaviour once a pattern exists [22][23][24], an NN was employed in this work as a universal approximator [25] to compute adaptive controller parameters for MIMO state-control HBLS. By non-linear mapping of the system state transitions to stabilising controller gains across different domains of the state space, time-varying state-adaptive controller parameters for global stability of the equilibrium position are obtained. The block diagram of Fig. 1 gives a high-level description of the non-linear adaptive controller optimisation technique. An illustrative example is presented in Section 3 to demonstrate the effectiveness and superiority of the proposed strategy over the traditional scheme, with comparative stability and tracking control performance.
The contributions of this work are (1) design of an NN-based non-linear adaptive control system, (2) control of non-piecewise linearisable and non-feedback linearisable system to achieve global stability, and (3) demonstration of achieving global stability of the MIMO state-control homogeneous BLSs at the equilibrium position using the proposed control design.
The remainder of the paper is organised as follows: Section 2 describes the control problem and presents the proposed solution. The effectiveness and comparative advantage of the control scheme are investigated with an illustrative example in Section 3. Section 4 discusses the utility and merits of the proposed methodology. Section 5 contains the conclusion and future work.

Problem formulation and the proposed control design
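The general MIMO BLS model is described by the following state-space equation:

$$\dot{x}(t) = A x(t) + \sum_{p=1}^{P} N_p x(t) u_p(t) + B u(t) \tag{1}$$

where $x(t) \in \mathbb{R}^{\alpha}$ is the state vector; $u(t) \in \mathbb{R}^{\beta}$ is the control input vector; $A \in \mathbb{R}^{\alpha \times \alpha}$ is the state matrix; $N_p \in \mathbb{R}^{\alpha \times \alpha}$ is the weighting matrix of the coupling between the control and the state vectors, otherwise known as the bilinearity; $B \in \mathbb{R}^{\alpha \times \beta}$ is the control input matrix and p is the summation index. The general observation equation is described as

$$y(t) = C x(t) + D u(t) \tag{2}$$

where $y(t) \in \mathbb{R}^{\gamma}$ is the output vector; $C \in \mathbb{R}^{\gamma \times \alpha}$ is the output matrix; $D \in \mathbb{R}^{\gamma \times \beta}$ is the direct transmission matrix, which is usually zero.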
In (1), if B is zero, the system becomes state homogeneous, and if A is also equal to zero, the system is said to be state-control homogeneous [26] as defined in the following equation:

$$\dot{x}(t) = \sum_{p=1}^{P} N_p x(t) u_p(t) \tag{3}$$

For general non-linear systems, when the region of operation is around the equilibrium point, the behaviour of the system in such a neighbourhood can be linearly approximated. The behaviour of the equilibrium point and its stability are given in Definitions 1 and 2.
Definition 2: A non-linear system is said to be conditionally stable at the equilibrium point x̄ if, for some initial states x(0) ∈ Ω, the state x(t) converges to x̄ as t → ∞, where Ω is a set of initial states. However, if for any x(0), the condition of Definition 2 is satisfied, then the system is said to be globally stable at the equilibrium point.
According to Definition 1, if x̄ is an equilibrium point, and we apply a constant input ū, the state derivative will be zero; that is, the state of the system will remain unchanged. This behaviour is often used to approximate the response of non-linear systems within a certain operating range, such that well established linear control strategies can be applied. Definition 2 is an extension of Definition 1 where stability at equilibrium depends on the initial state conditions. More details about the definitions and application of equilibrium point and stability are reported in [27,28] and Section 4 of this paper.
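The equilibrium behaviour described above can be checked numerically. The following is a minimal sketch, assuming the bilinear form of (1) and (3); the function name `bilinear_rhs` and the coupling matrices in `N_demo` are hypothetical and are not taken from the paper.

```python
import numpy as np

def bilinear_rhs(x, u, N, A=None, B=None):
    """State derivative A x + sum_p N[p] x u[p] + B u of a bilinear system.

    With A and B omitted (treated as zero) this is the state-control
    homogeneous case.
    """
    x = np.asarray(x, dtype=float)
    u = np.asarray(u, dtype=float)
    dx = sum(u_p * (N_p @ x) for N_p, u_p in zip(N, u))
    if A is not None:
        dx = dx + A @ x
    if B is not None:
        dx = dx + B @ u
    return dx

# Hypothetical coupling matrices (2 states, 2 inputs), for illustration only.
N_demo = [np.diag([1.0, 2.0]), np.diag([2.0, 1.0])]

# In the homogeneous case the origin is an equilibrium for any constant input.
dx_origin = bilinear_rhs([0.0, 0.0], [3.0, -1.5], N_demo)
# With zero input, every state is an equilibrium.
dx_zero_input = bilinear_rhs([0.7, -0.2], [0.0, 0.0], N_demo)
# Away from both, the coupling term drives the state.
dx_coupled = bilinear_rhs([1.0, 1.0], [1.0, 0.0], N_demo)
print(dx_origin, dx_zero_input, dx_coupled)
```

Both zero-derivative cases follow directly from the product structure of the coupling term: it vanishes whenever either the state or the input is zero.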
For MIMO state-control HBLS, both the Jacobian linearisation and feedback linearisation control techniques fail, as analysed in Section 4. Moreover, the Lyapunov-based method is insufficient, as only conditional stability could be achieved, as shown in Section 3. Since traditional control techniques have proved inadequate for the solution of MIMO state-control HBLS, an NN was employed in this work as an adaptive state feedback controller gain estimator, owing to its powerful non-linear mapping capabilities. Whereas the Lyapunov-based state feedback method achieves only conditional stability, its state feedback control gain establishes a trajectory pattern in phase portraits, which provides useful insight and efficiency with respect to generating training data across different domains.
In order to adaptively compute the control input, time step measurements of multi-dimensional variables {x(t), x(t + τ), K(t)} are taken for several sequences of conditionally stable MIMO state-control HBLS. The NN input variables x(t) and x(t + τ) are the current state and the next state effected by the control input application, respectively. The output variable K(t) is the control gain that caused the state transition. The training model is described using a feed-forward NN model with four input variables. It should be noted that, because the input combines x(t) and x(t + τ), the number of input and output variables for any MIMO state-control HBLS model are 2α and β × α, respectively.
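The data collection step can be sketched as follows. This is an illustration under stated assumptions rather than the authors' procedure: the coupling matrices `N_demo`, the feedback law u = -K x, the forward-Euler step `dt`, and the ranges used to draw gains and initial states are all hypothetical. The example is chosen so that the closed loop decays for positive states, and each run stays inside that region.

```python
import numpy as np

def hbls_rhs(x, u, N):
    """dx/dt = sum_p N[p] @ x * u[p] for a state-control homogeneous BLS."""
    return sum(u_p * (N_p @ x) for N_p, u_p in zip(N, u))

def collect_transitions(N, K, x0, steps, dt):
    """Simulate u = -K x with forward Euler; record rows [x(t), x(t+dt), K]."""
    rows = []
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        u = -K @ x                       # assumed sign convention
        x_next = x + dt * hbls_rhs(x, u, N)
        rows.append(np.concatenate([x, x_next, K.ravel()]))
        x = x_next
    return np.array(rows)

rng = np.random.default_rng(0)
N_demo = [np.diag([1.0, 2.0]), np.diag([2.0, 1.0])]  # hypothetical system

runs = []
for _ in range(3):                                   # one gain matrix per run
    K = rng.uniform(0.5, 1.5, size=(2, 2))
    x0 = rng.uniform(0.1, 1.0, size=2)
    runs.append(collect_transitions(N_demo, K, x0, steps=50, dt=0.01))

dataset = np.vstack(runs)
features = dataset[:, :4]   # 2*alpha inputs: x(t) and x(t + dt)
targets = dataset[:, 4:]    # beta*alpha outputs: flattened gain matrix
print(features.shape, targets.shape)
```

Varying the gain between runs, as the text describes, is what lets the estimator see more than one gain per region of the state space.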
The neurons are the processing units with non-linear activation functions θ having weighted interconnections between them. Multiple sequences of data are obtained by varying the conditional control gain matrix for every run. A percentage of the sample data is used to train the model by minimising the sum of squares of output deviations as in (4), that is, to obtain optimal weights that map each state transition to the optimal control input. In (4), η denotes the loss function, n is the neuron index for a layer l, N is the total number of neurons and L is the final layer, which is the output layer.

In the traditional state feedback system (5), the control gain K is static. By iteratively training the NN, the elements of the adaptive controller gain matrix can instead be computed from the network output. For online operation as shown in Fig. 3, the input variable next state x(t + τ) is substituted with the desired state x_d(t) to obtain the optimal time-varying control gain function. Hence, substituting for K in (5), the control input for a MIMO system is computed from the NN-estimated gain.

Furthermore, the stability of a non-linear system can be analysed by evaluating the trajectory at the equilibrium point. However, the Jacobian of the closed-loop MIMO state-control HBLS is degenerate, with all elements equal to zero according to the following analysis. Substituting for u_p(t) in (3) based on the NN output gives the closed-loop state equation, where n is the index of output neurons, i.e. the matrix element index of the control gain, and P is the total number of control inputs. Evaluating the Jacobian of this closed-loop system at the equilibrium point gives J(0) = 0.
The eigenvalues are therefore all zero and the system is said to be degenerate, having an equilibrium subspace [29]. Hence, an expanded and more laborious trajectory analysis is required to evaluate the global stability of the equilibrium point: the state derivative (9) is computed over a grid of the state space, which gives the direction of the state vector at specified points and thereby characterises the stability behaviour of the system.
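Both steps, the vanishing Jacobian and the grid-based trajectory analysis, can be illustrated numerically. The sketch below assumes a static feedback u = -K x with hypothetical matrices `N_demo` and `K_demo`; it is not the system of the illustrative example. The sign of x · dx/dt is used as a simple indicator of whether the state moves towards the origin at each grid point.

```python
import numpy as np

N_demo = [np.diag([1.0, 2.0]), np.diag([2.0, 1.0])]   # hypothetical couplings
K_demo = np.array([[1.0, 0.5], [0.5, 1.0]])           # hypothetical static gain

def closed_loop(x):
    """dx/dt of the homogeneous BLS under u = -K x (quadratic in x)."""
    u = -K_demo @ x
    return sum(u_p * (N_p @ x) for N_p, u_p in zip(N_demo, u))

def numerical_jacobian(f, x, eps=1e-6):
    """Central-difference Jacobian of f at x."""
    x = np.asarray(x, dtype=float)
    cols = []
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = eps
        cols.append((f(x + step) - f(x - step)) / (2.0 * eps))
    return np.column_stack(cols)

# The closed loop has no linear part, so the Jacobian at the origin is zero.
J0 = numerical_jacobian(closed_loop, np.zeros(2))

# State derivative evaluated on a grid of the state space.
grid = np.linspace(-1.0, 1.0, 5)
X1, X2 = np.meshgrid(grid, grid)
points = np.column_stack([X1.ravel(), X2.ravel()])
derivs = np.array([closed_loop(p) for p in points])

# d/dt |x|^2 = 2 x . dx/dt, so a negative dot product means inward motion.
inward = np.einsum("ij,ij->i", points, derivs) < 0
print(J0)
print(inward.reshape(X1.shape))
```

For this hypothetical static gain the field points inward at (1, 1) and outward at (-1, -1), which is the kind of region-dependent behaviour that conditional stability refers to.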

Illustrative example
In this section, the effectiveness of the proposed method for global stability of MIMO state-control HBLS at the equilibrium position is presented using an illustrative example.

Setup
Consider a MIMO state-control HBLS described by the following state-space equation: It can be seen that the system has only non-linear state and control coupling terms with no linear term.

Benchmark: traditional control design
Although the system of (11) is neither piecewise linearisable nor feedback linearisable as presented in Section 4 of this work, conditional state feedback control gain was obtained by the author of [30] as which results in the closed-loop state equation with the conditionally stable domain given as

Proposed NN controller
By applying the proposed NN-based adaptive control strategy as described in Section 2, stability for all regions of initial states was achieved. The state transition and control gain matrices were sampled for 30 sequences of 1000 time steps, making a total of 30,000 data examples. The states and the next states {x_1(t), x_2(t), x_1(t + k), x_2(t + k)} are the input features whereas the adjustable control gain matrices are the output. NN model training was done using the MATLAB NN toolbox. Data was divided into 70% training, 15% validation and 15% testing, and the tan-sigmoid symmetric activation function was used. The model has one hidden layer with two hidden neurons, and the training algorithm was Levenberg-Marquardt [31][32][33]. The adaptive control gain function was then obtained, where $\theta(s) = \frac{2}{1 + e^{-2s}} - 1$ is the tan-sigmoid symmetric activation function and $s = b^{(1)} + W^{(1)} \times [(X(t) - \lambda)\mu + \rho]$ is the hidden layer input. X(t) is the desired state transition input vector at time t, where λ, μ and ρ are the offset, gain and minimum values of the normalised input vector, respectively. Similarly, ψ, ϕ and λ are the offset, gain and minimum values of the normalised output gain vector, respectively. $W^{(1)}$ and $W^{(2)}$ are the hidden and output layer optimal weights, respectively, obtained after training the NN, whereas $b^{(1)}$ and $b^{(2)}$ are the NN biases for the hidden and output layers, respectively. The control parameters were obtained by optimising the interconnecting weight and bias vectors among input, hidden, and output variables as described in Section 2. However, at the beginning of the NN training, the weights should be initialised to non-zero random numbers, otherwise only the scale of the weights will change but not the gradient, which determines the direction. Furthermore, the more the control design parameters deviate from the obtained optimal values, the more the control performance deteriorates.
The gain function with parameters given in Table 1 was integrated into the adaptive control structure of Fig. 3 as a powerful estimator, and the simulation results are shown in Section 3.4.
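The structure of such a gain function can be sketched as a forward pass through a one-hidden-layer network. This is a minimal illustration under several assumptions: the output layer is taken to be linear, the output scaling is inverted as (y - minimum) / gain + offset, the gain matrix is taken to be 2 × 2, the control law is u = -K x, and every numerical value in `params` is a placeholder rather than an entry of Table 1.

```python
import numpy as np

def tansig(s):
    """Tan-sigmoid activation 2 / (1 + exp(-2 s)) - 1, identical to tanh(s)."""
    return 2.0 / (1.0 + np.exp(-2.0 * s)) - 1.0

def nn_gain(X, p):
    """Map a state-transition vector X to a control gain matrix K."""
    z = (X - p["in_offset"]) * p["in_gain"] + p["in_min"]      # normalise input
    h = tansig(p["b1"] + p["W1"] @ z)                          # hidden layer
    y = p["b2"] + p["W2"] @ h                                  # assumed linear output
    k = (y - p["out_min"]) / p["out_gain"] + p["out_offset"]   # undo output scaling
    return k.reshape(p["shape"])

# Placeholder parameters: 4 inputs, 2 hidden neurons, 4 outputs.
params = {
    "in_offset": np.full(4, -1.0), "in_gain": np.full(4, 1.0), "in_min": -1.0,
    "W1": np.array([[0.5, -0.2, 0.1, 0.3], [-0.4, 0.6, 0.2, -0.1]]),
    "b1": np.array([0.1, -0.1]),
    "W2": np.array([[0.3, -0.5], [0.2, 0.4], [-0.1, 0.6], [0.7, 0.1]]),
    "b2": np.array([0.0, 0.1, -0.1, 0.2]),
    "out_offset": np.full(4, 0.5), "out_gain": np.full(4, 2.0), "out_min": -1.0,
    "shape": (2, 2),
}

x = np.array([0.5, -0.3])      # current state
x_d = np.zeros(2)              # desired state replaces the next state online
K = nn_gain(np.concatenate([x, x_d]), params)
u = -K @ x                     # assumed state-feedback sign convention
print(K, u)
```

Because the gain is recomputed from the current and desired states at every step, K varies with the region of the state space instead of staying fixed as in static state feedback.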

Simulation results
The simulations were done using MATLAB/Simulink, as shown in the Appendix. A measure of the predictive performance of the NN-based control gain estimator is shown in Fig. 4. It can be seen that the training, validation and test mean squared errors are very small at the termination of training, which indicates optimality. Furthermore, the training, validation and test mean squared errors are closely aligned, especially at the termination of training, which indicates strong generalisation of the obtained gain function.

In Fig. 7, where the initial states are both negative, only the NN-based controller stabilised the system, whereas the traditional state-feedback control system is unstable, with the states diverging significantly from the target zero state. Furthermore, in Figs. 8 and 9, the control input computed by the different control schemes is similar in the region where both strategies stabilise the system states. A significant control input is applied at the beginning to compensate for the initial state error without causing oscillations, and it diminishes as the states stabilise near the target value. However, when the initial states are both negative, excessive control input was supplied by the traditional state-feedback system, whereas the NN-based adaptive control system provided appropriate control input to stabilise and control the system states, as shown in Fig. 10. Figs. 11 and 12 show the time-varying adaptive control gain trajectories. It can be seen that the control input varies not only according to the state error but also according to the optimal gain computed by the NN, based on the region of the states. To further investigate the stability of the control methods, the phase portraits are shown in Fig. 13.
Whereas the traditional state-feedback control system shows strong divergence at the lower left quadrant of the phase portrait explaining why stability is not achieved when the states are both in the negative region, the NN-based adaptive control system has an

Discussion
In classical adaptive control, the two main structures are the model-reference adaptive control (MRAC) scheme and the self-tuning controllers (STCs). In MRAC, the estimation of the adjustable controller parameters is based on the reference model output, which specifies the ideal plant response, and the actual plant response. On the other hand, in STC, the adjustable controller parameters are estimated based on the past control input and the output. Nonetheless, the two classes of adaptive control systems have a unified framework in the sense that they both have a control inner loop and an estimation outer loop [14]. The proposed NN-based adaptive non-linear controller whose block diagram is shown in

The traditional non-linear adaptive control methods require that the following three conditions be satisfied:
(i) Non-linear dynamics of the plant must be linearly parameterisable.
(ii) Provided the parameters are known, control input must stably cancel non-linearities without unstable hidden dynamics or nodes.
(iii) System's full state must be measurable.
However, the class of BLS considered in this work does not satisfy conditions (i) and (ii). The proof is given below.
Substituting for x and u in (3), and applying Taylor expansion with higher-order terms neglected, the dynamics are linearised about the equilibrium. Since $f(\bar{x}, \bar{u}) = \sum_{p=1}^{P} N_p x(t) u_p(t) = 0$ at equilibrium, the linearised system is expressed in terms of the Jacobian coefficients $J_x$ and $J_u$. Because the Jacobian coefficients are both equal to zero, the piecewise control method cannot be applied to MIMO state-control HBLS, i.e. they are not linearly parameterisable (violation of condition (i)).

Whereas piecewise linearisation involves point-by-point linear approximation of the non-linear dynamics, feedback linearisation involves state transformations and feedback. It essentially cancels the non-linearities and transforms the system dynamics into a linear form. However, it is only applicable to systems that can be expressed in controllability canonical form [34], where x is the observed scalar output, f(x) and g(x) are non-linear functions of x, u is the scalar control input, and $\mathbf{x} = [x, \dot{x}, \ldots, x^{(\sigma - 1)}]^{T}$ is the state vector. The non-linearities are then cancelled using the control input, and a linear control law can be designed such that the resulting dynamics is exponentially stable, with roots strictly in the left-hand side (LHS) of the complex plane.

Cogn. Comput. Syst., 2020, Vol.
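For reference, the two arguments above can be written out in a standard textbook form. This is a sketch under stated assumptions and not a recovery of the paper's own numbered equations: the linearisation is taken about $(\bar{x}, \bar{u}) = (0, 0)$, and the auxiliary input $v$ and the feedback coefficients $k_0, \ldots, k_{\sigma-1}$ are illustrative symbols.

```latex
\begin{align*}
% Jacobian coefficients of f(x,u) = \sum_p N_p x u_p about (\bar{x}, \bar{u})
J_x &= \left.\frac{\partial f}{\partial x}\right|_{(\bar{x},\bar{u})}
     = \sum_{p=1}^{P} N_p \bar{u}_p, \qquad
J_u  = \left.\frac{\partial f}{\partial u}\right|_{(\bar{x},\bar{u})}
     = \begin{bmatrix} N_1 \bar{x} & \cdots & N_P \bar{x} \end{bmatrix}, \\
% both coefficients vanish at the origin with zero input
J_x &= 0, \quad J_u = 0 \quad \text{at } \bar{x} = 0,\ \bar{u} = 0. \\[6pt]
% controllability canonical form and the cancelling control input
x^{(\sigma)} &= f(\mathbf{x}) + g(\mathbf{x})\,u, \qquad
u = \frac{1}{g(\mathbf{x})}\bigl(v - f(\mathbf{x})\bigr)
\;\Longrightarrow\; x^{(\sigma)} = v, \\
% linear control law and the resulting closed-loop dynamics
v &= -k_0 x - k_1 \dot{x} - \cdots - k_{\sigma-1} x^{(\sigma-1)}
\;\Longrightarrow\;
x^{(\sigma)} + k_{\sigma-1} x^{(\sigma-1)} + \cdots + k_0 x = 0.
\end{align*}
```

The cancellation step requires dynamics that are affine in a scalar input with $g(\mathbf{x}) \neq 0$, which is the structure that the homogeneous bilinear form is argued not to admit.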
Taking the time derivative of (3) while keeping u_p constant, and substituting for ẋ(t) from (3) into (25), shows that, as given by (26), transformation to the controllability canonical form of (23) cannot be obtained from MIMO state-control HBLS (violation of condition (ii)). Furthermore, the Lyapunov-based linear state feedback is reported and shown to achieve only conditional stability in Section 3. The NN-based adaptive controller is effective for the class of BLS considered in this paper because it only requires the third condition in addition to the existence of a conditionally stabilising control gain. The simulation results demonstrate its superiority over the traditional method, as global stability was achieved irrespective of the initial state conditions. Since the training data for the NN is generated from the stabilised system, 'good' data were obtained, which provides training efficiency as the complete dataset is relevant to the estimator training. Moreover, the NN-based estimator needs to be trained only once offline. Furthermore, the risk of oscillatory or slower convergence associated with an otherwise online estimator is eliminated, and the challenge of insufficient data, especially at the early stages of an otherwise online estimation, is also avoided. However, if the system is not conditionally stabilisable at all by the traditional method, it becomes challenging to generate 'good' data. That, in turn, prolongs the training process of the NN-based estimator.

Conclusion
A novel NN-based non-linear adaptive control scheme for MIMO state-control homogeneous BLS has been proposed in this work. State transition data and corresponding control gain data were utilised for training an NN model integrated into the control system as a powerful time-varying non-linear gain estimator. Provided the system states are measurable and a conditionally stabilising control gain exists, it was shown that the global stability of the considered BLS could be achieved at the equilibrium point. In addition, because the estimator is pre-trained offline, estimator sophistication is not restricted and the challenge with parameter oscillations and slow convergence associated with an otherwise online estimator was avoided. Through the case study presented, the effectiveness of the proposed strategy was demonstrated, and stability analysis was done by plotting the trajectory of the state derivatives. Future work will investigate the realisation of an NN-based gain estimator when a conditionally stabilisable traditional control gain cannot be obtained. Furthermore, the improvement of the strategy for the stability of non-linear, environmentally dependent systems will be investigated.

Appendix
The MATLAB simulation block diagram for the proposed strategy is shown in Fig. 14.