Hammerstein Box-Jenkins System Identification of the Cascaded Tanks Benchmark System

A common process control application is the cascaded two-tank system, where the level is controlled in the second tank. A nonlinear system identification approach is presented in this work to predict the model structure parameters that minimize the difference between the estimated and measured data, using benchmark datasets. The general suggested structure consists of a static nonlinearity in cascade with a linear dynamic filter in addition to colored noise element. A one-step ahead prediction error-based technique is proposed to estimate the model. The model is identified using a separable least squares optimization, where only the parameters that appear nonlinearly in the output of the predictor are solved using a modified Levenberg–Marquardt iterative optimization approach, while the rest are fitted using simple least squares after each iteration. Finally, MATLAB simulation examples using benchmark data are included.


Introductions
Recently, system identification attracted the attention of many researchers and practitioners because of the difficulty in modeling many systems using physical modeling approaches. Good empirical mathematical models that represent a variety of practical applications in different engineering fields are required to address various needs [1]. ese needs might be to understand and analyze the limitations of current systems, predict and simulate new experiments, or design new or modified controllers. e process of creating a mathematical model from measured data is called system identification [2,3]. Unlike most machine learning methods, system identification gives more insight about the structure and dynamics of the system. A very common approach in system identifications is by using model structure methods, which is applied in this work. Unlike linear system identification methods that are well established and extensively applied in the literature, nonlinear system identification needs more attention because our real-life applications are nonlinear. e uniqueness of each nonlinearity increases the complexity of nonlinear systems and their identification [4][5][6][7][8][9][10][11].
ere are a few attempts to apply nonlinear system identification techniques to common applications in different engineering fields reported in the literature. In the area of chemical process control, Aljamaan et al. [12] identified the dynamics of the continuously stirred tank reactor (CSTR) in the presence of nonstationary disturbances, Zhu [13] identified a distillation column, and Birpoutsoukis et al. [14] estimated a Volterra series model of the cascaded two-tank system. Similarly, in biomedical engineering applications, Vlaar et al. [15] identified brain signals by electroencephalography (EEG) while Cescon et al. [16] identified models of the blood sugar level inside the human body.
In this paper, a prediction error minimization (PEM) nonlinear system identification parametric approach, similar to the one proposed in [10], is applied to approximate the two-tank cascaded model, where the benchmark data provided by Schoukens et al. [26] were used. e main contribution is to use such general nonlinear model structure, Hammerstein Box-Jenkins model, and separable least squares (SLS) identification approach to predict the parameters of cascaded tanks system. e suggested model structure to fit this application is Hammerstein Box-Jenkins model, where a static nonlinearity is in series with a linear dynamic filter whose output is summed with an autoregressive moving average (ARMA) noise model. Separable least squares (SLS) techniques are suggested, where secondorder gradient descent-based optimization approach is performed on the nonlinear parameters while ordinary least squares regression is used, after each iteration, to estimate the coefficients that appear linearly in the predictor. e arrangement of the paper is as follows. After a physical description of the two-tank system and approximated model structure in Section 2, Section 3 includes the identification suggested approach. en, simulation examples and results' discussion are illustrated. Finally, concluding remarks are provided in the conclusion.

e Cascaded Water Tanks.
e cascaded water tanks system is a benchmark system proposed in [26]. It comprises two water tanks, with free outlets as shown in Figure 1. e first tank is fed by a pump and drains into the second tank, which itself drains into the reservoir (see Figure 2). e input to the system is the voltage applied to the pump, while the output is the reading on a level sensor in the second tank. Note that if the upper tank overflows, some of the overflow will land in the second tank, while the rest will go directly into the reservoir. e source of nonlinearity is mainly at the input signal and combined with a small nonlinearity during the measurement [26].

e Proposed Model Structure.
e Box-Jenkins model is, in some sense, ideal for the modeling and identification of linear time invariant (LTI) systems because many models can be regarded as special cases such as the AR, MA, ARMA, ARX, ARMAX, and OE models [33]. It includes separate filters for the system and noise models, which gives it a great deal of flexibility. Replacing the linear process filter in the Box-Jenkins model with a nonlinear model produces a much more flexible model, as it can model nonlinear behaviour in the plant but can still provide accurate predictions, even when the stochastic disturbance is highly autocorrelated. In this section, we consider the Hammerstein Box-Jenkins model, shown in Figure 3. It combines the linear dynamics of a Box-Jenkins model with the nonlinear structure of a Hammerstein model. e main reason for selecting Hammerstein is the previous knowledge of the differential balance equations for the two-tank system that imply the main nonlinearity is at the input which is in accordance with the Hammerstein model. Different system identification approaches are presented in [34][35][36][37][38][39][40].
e main identification target is to estimate the process and noise parameters and the expansion coefficients of the nonlinearity. An important factor in achieving an acceptable identification result is a good initial guess. is is because when any iterative optimization method is used, it is uncertain that the assessment function will converge to the e water is boosted from reservoir in the upper tank to the lower tank and then returns back to the reservoir [26]. global minimum unless the initial guess of the parameter vector is close enough. e model is formulated as where y d (t) is the output of the deterministic part of the system and the noise part is (2) e deterministic process output side is computed by where x(t) is the output of the static nonlinearity: In this paper, the memoryless nonlinearity m(·) will be approximated by a polynomial of degree M because of the efficiency and ability to apply LS to find the coefficients of the polynomial nonlinearity.

Prediction Error Method (PEM).
e prediction error method (PEM) searches for the parameter vector, θ, that reduces the difference between the measured output and the output prediction generated by the model. us, ε is the prediction error.
Note that the parameter vector, θ ∈ R n×1 , consists of the elements of the process and noise filters and coefficients of the polynomial nonlinearity.

One-Step Ahead Prediction.
e most common option in PEM is the one-step ahead prediction, where the existing output, y(t), at time t is approximated given all I/O measurements up to including t − 1.
e one-step ahead predictor equation for the Hammerstein Box-Jenkins model is as follows: where the noise model predictor, By substituting (3) and (8) in (7): e main objective, (10), is to compute the parameter vector θ that minimizes the mean squared error (MSE) of the cost function, V N (θ), in (11).

Separable Least Squares.
In the SLS algorithm, the parameter vector, θ, is divided into two sets: a group of parameters that appears linearly in the prediction error, θ ℓ , and the remaining parameters, θ n [41]. e prediction error is generally a nonlinear function of θ n ; although in some cases, one or more elements of θ n may appear bilinearly with elements of θ ℓ . In these cases, the decision as to which parameter is "linear" or "nonlinear" is arbitrary. us, the parameter vector is subdivided as For the Hammerstein model, the prediction error can be treated as linear in either the polynomial coefficients or the plant numerator. e output of process model, y d (t), and polynomial X is formulated as follows: e parameter vector of the linear part, θ ℓ , is estimated by doing simple LS.
where Y is the vector that contains all experimental results of y(t). e nonlinear parameter vector, θ n , is estimated using an iterative approach.
It is noticed from (16) that θ ℓ relied on θ n in finding the direction of the search. Similar to the parameter vector θ, the H (q -1 ) + Figure 3: Hammerstein Box-Jenkins model.
Jacobian is divided into two parts too: J ℓ and J n that consist of the linear parameters and nonlinear coefficients, respectively: and formulated in (18) and (19): e Jacobian of the SLS J s is computed by Because the predictor output relies on the nonlinear parameters that might result variations in the linear elements the projection of the linear part, P ℓ , defined in (21), is required to calculate this variation.
e complementary projection, Q ℓ , is defined in the following equation: Now J s can be calculated by Once J s has been computed, θ n can be updated using any quasi-Newton optimization, and Levenberg-Marquardt is proposed in this work. e separable least squares approach is described in more detail by Aljamaan et al. [12]. e nonlinear vector parameters θ n can be updated using any quasi-Newton optimization.

L-M Optimization.
e Levenberg-Marquardt algorithm is a second-order iterative optimization-based technique, where the update applied to the parameter depends on the local gradient and the approximation of the error surface's local curvature [42,43]. It is applied to find the nonlinear LS problems, where minimum is explored, but it is not guaranteed to be the global minimum. Also, it is used in software applications to fit best curve. e phase direction and length were measured together in the L-M iteration. e update is defined in (14).
Note that the distance and the direction of the k th step are chosen so that the cost decreases.
e L-M method formulas are as follows: where ε is the error and J is the Jacobian. d (k) is the updated step to control the step size and also responsible for speeding up of the convergence and ensures the stability, and H is the approximate Hessian; the step size for the Hessian approximation is included. IM is the M × M identity Matrix.

Instrumental Variable (IV).
e global minimum can be always achieved by using LS if the parameters of the model are linear. On the other hand, models that are nonlinear in their parameters require iterative optimization techniques to converge to a minimum.
is minimum may not be the global minimum and hence the importance of a good initial guess. Moreover, predicting an excellent initial guess plays a significant factor to have a successful identification. e IV approach, used in this work, is a familiar option that yields unbiased approximated coefficients (this is explained in [2,3]). An extended IV approach is provided in [44].

Hammerstein Box-Jenkins Model Identification Using SLS.
In the following, a summary of the proposed SLS identification algorithm (Algorithm 1) is presented.

Experimental Results
MATLAB simulation examples using two datasets of 1024 points each, identification and validation data, were provided by Vlaar et al. [15]. ese datasets were collected from the cascade tanks system shown in Figure 2. e input u (t) is a multisine signal, which is the voltage applied to pump the water from the reservoir to the upper tank with frequency range between 0 and 0.0144 Hz and sampling time Ts equal to 4 seconds. e water level of the lower tank is the output of the system, y (t), with signal to noise ratio, SNR, approximately equal to 40 dB. Input and output are shown in Figure 4. More information about the data is included in [15].
A Hammerstein Box-Jenkins model structure similar to the illustrated one in Figure 3 is applied to identify the approximated mathematical model. A third-order polynomial nonlinearity in cascade with second-order filter that represents the plant model is selected. Unlike the process model, the noise subsystem was of first-order ARMA filter. e order was selected manually using the training data. On the other hand, Table 1 illustrates the MSE results using the validated dataset. MSE function was chosen as the assessment tool. e first experiment was applied by selecting a first-order process and noise model with zero-order nonlinearity, and it produces high MSE results. en, the order of the nonlinearity increases and it is clear that the cost decreases in the 2 nd and 3 rd nonlinearity cases. Unlike in the 2 nd and 3 rd cases, there is very small difference in the cost between 3 rd and 4 th order nonlinearities. After that, the process linear dynamic order was increased, and it was noticed that using 2 nd order plant model and 3 rd order nonlinearity produces the minimum cost. Notice that a 1 st order noise model was fixed for all the experiments. e proposed parametric identification method based on Section 3 was applied. Figure 5 shows the difference between the simulated outputs, dotted black lines, that tracks the measured output represented in red solid line. Moreover, the cost function using iterative second order gradient descent method based is plotted in Figure 6, where the final convergence was at iteration 25. e nonlinearity is illustrated in Figure 7. e mean square error, MSE, of the 2 nd order linear dynamic subsystem, the 1 st order noise filter, and 3 rd order nonlinearity was the minimum with 0.0073.
(3) Find J ℓ and J n as in (18) and (19). (4) Calculate the updated step d k , (26), and then update, θ n(k+1) , (16). (5) Compute x(t), (5), using updated θ n(k+1) , and fit θ (ℓ+1) . (6) Using LS. (7) Compute the error and cost function, ε and V n , respectively. (8) If the cost V n decreases. (9) If V n is significant en (10) Exit (11) else increase speed of convergence by setting the update step δ � δ/c, then go back to step 3 (12) End (13) else the cost, V n , increases en (14) Decrease the speed of convergence by setting: δ � δc and go back to step 5 (15) end ALGORITHM 1: Summary of the proposed identification algorithm.  Figure 4: Recorded input to of the cascaded tanks system, pump voltage, and output of the cascaded tanks system, the water level at the lower tank.       A comparison between the results obtained by applying this proposed approach and two previous studies using the same benchmark data by Relan et al. and Birpoutsoukis et al. [1,14] is given in Table 2 and Figure 8. It is clear that the proposed Hammerstein Box-Jenkins approach outperformed the other algorithms and has the least MSE value.

Conclusions
In this research, a nonlinear identification approach is applied to estimate a cascade tank system from data provided on the nonlinear system identification benchmark website as one of the challenging datasets, provided by Aljamaan [11]. e suggested model structure that fits this application is a Hammerstein Box-Jenkins model. is parametric method is prediction error based, where the linear dynamic process and noise subsystem elements are approximated using iterative Hessian method optimization technique; on the other hand, the polynomial nonlinearity coefficients are fitted using the simple LS method after each iteration because they appear linearly in the predictor. In order to prevent the ill-condition issue and increase the chance of convergence, an IV method is used for the initial guess. Different orders were tested to select the suitable one, and simulated data were applied to validate the results. e final results are promising for the successful identification of real two-tank systems.

Data Availability
No data were used to support this study.

Conflicts of Interest
e authors declare that they have no conflicts of interest.