Observation reconstruction and disturbance compensation-based position control for autonomous underwater vehicle

To address the position control problem of a six-degree-of-freedom autonomous underwater vehicle (AUV), this paper uses an observer to estimate the disturbance and a controller to compensate for it. To ensure that the vehicle reaches the specified position quickly, the sine cosine algorithm with individual memory is used to find suitable parameters for the different controllers. The simulation results show that the observer can extract the system's internal and external uncertainty information in real time, reduce the phase lag, and improve the stability of AUV position control. For a system with complex nonlinearity such as an AUV, the observer proposed in this paper has stronger applicability and better robustness.


Introduction
With the deepening of people's understanding of the ocean and the increasing demand for marine resources, we increasingly rely on autonomous underwater vehicles (AUVs). AUVs have shown good application prospects in both military and civilian fields. They can complete complex and diverse underwater tasks, such as seafloor imaging (Sato et al., 2014), bathymetric surveying (Wang et al., 2015), underwater exploration (Lee & Lee, 2014) and polar scientific research (Costarelli et al., 2017).
In the modelling process, the influence of fluid dynamics, propeller thrust, and fin angle variation on AUV motion is usually highly nonlinear and time-varying (Antonelli, 2008), so the vehicle model is strongly nonlinear. Moreover, the disturbances caused by waves are often fast and time-varying (Liu et al., 2017; Yu et al., 2017). The challenge for the controller is therefore to deal with internal and external uncertainties, and it is of practical significance to design a controller that can control the AUV's motion according to the desired performance. Many classical control methods have been applied to AUVs. Proportional-integral-derivative (PID) control is the most widely used method, but simple linear control such as PID has poor robustness in the face of model uncertainty, which is also verified in the simulation part of this paper. Therefore, PID needs to be combined with other intelligent algorithms. Bijani and Khosravi (2016) proposed a strategy for adjusting the parameters of the PID controller based on the constrained artificial bee colony algorithm. Khodayari and Balochian (2015) designed a new adaptive fuzzy PID controller in which the parameters of the PID controller are adjusted by established fuzzy rules. Li et al. (1994) proposed a CMAC-PID control algorithm with neural network decoupling and genetic algorithm optimization. As a classic nonlinear control technology, active disturbance rejection control (ADRC) was first proposed by Han (1998). The basic idea of ADRC is to observe and compensate for the nonlinear dynamics, model uncertainties, and external disturbances in the system in real time through the extended state observer (ESO) (Han, 2008). Gao (2003) linearized the ESO into the linear ESO (LESO) and simplified the original ADRC into linear ADRC (LADRC).
CONTACT Zengqiang Chen chenzq@nankai.edu.cn
The excellent performance of ADRC has been verified in many applications, such as aerial vehicles, hovercraft, and power systems (Morales et al., 2015; Zhao et al., 2020; Zheng et al., 2020). In recent years, disturbance rejection has become a research hotspot in the field of control (Chen et al., 2016), with the aim of finding better estimation and compensation methods that improve controller performance. Fischer et al. (2014) developed a nonlinear control method using a continuous robust integral of the sign of the error (RISE) to compensate for disturbance and uncertainty for a fully-actuated AUV. Fang et al. (2018) combined ADRC with fractional-order proportional-integral-derivative (FOPID) control and proposed an ADRC-FOPID control scheme for a hydro turbine speed governor system; the scheme has strong anti-disturbance capability and better performance. To improve the robustness of the controller, Chen et al. (2020) and Khodayari and Balochian (2015) used reinforcement learning and fuzzy rules, respectively, to adjust the controller parameters in real time according to the state of the controlled object.
Based on the above discussion and the structure of the LESO, an improved linear extended state observer (ILESO) for complex nonlinear systems is proposed. Specifically, the main contributions can be summarized as follows:
• The observation method of the disturbance is reconstructed. The new disturbance compensation not only helps to reduce the phase lag but also improves the stability of the position control process.
• An optimization algorithm is used to eliminate the deviation caused by manual adjustment of controller parameters. Memory is added to each individual to reduce the randomness of the algorithm.
• The situations that the AUV may encounter during its movement are considered, and ILESO's ability to resist internal and external disturbances and its robustness in the face of uncertainty are verified.
The remainder of this paper is organized as follows. The model of the AUV is introduced in Section 2. In Section 3, the position controller based on ILESO and the optimization algorithm with individual memory are presented. Comparative experiments are carried out in Section 4 to verify the superiority of the ILESO. Section 5 is the conclusion of the paper.

Model
The model of the autonomous underwater vehicle (AUV) is divided into kinematics and dynamics. To describe the movement of the AUV more clearly, two right-handed coordinates, the earth-fixed coordinate and the body-fixed coordinate, should be established. The origin of the earth-fixed coordinate can be taken at any position on the ground and is used to describe the AUV's position, velocity, and other kinematics. The origin of the body-fixed coordinate is taken at the centre of buoyancy of the AUV and is used to describe the dynamics, such as the change in position and attitude of the AUV under the action of forces. The two coordinates are shown in Figure 1. The motion of the AUV can be described by the following vectors: $\eta = [x, y, z, \phi, \theta, \psi]^T$, $\upsilon = [u, v, w, p, q, r]^T$, $\tau = [X, Y, Z, K, M, N]^T$. The vector $\eta$ is the AUV's position and attitude described in the earth-fixed coordinate; $\upsilon$ is the translational and rotational velocity of the AUV described in the body-fixed coordinate; $\tau$ is the total force and moment acting on the AUV described in the body-fixed coordinate.
Define the position of the AUV's gravity centre in the body-fixed coordinate as $r_G = [x_g, y_g, z_g]^T$. Because the body-fixed coordinate is centred at the centre of buoyancy of the AUV, the terms $I_{xy}$, $I_{xz}$, $I_{yz}$ in the inertia tensor matrix are negligible. The variables and symbols used in the equation and their meanings are summarized in Table 1. After the simplification, the motion equations of the AUV are given by Equation (1), in which the first three equations represent the translational motion of the AUV and the last three equations represent the rotational motion. To separate the acceleration terms, Equation (1) can be expressed as Equation (2). This paper uses the REMUS vehicle as the simulation model. The forces and moments acting on the vehicle can be expressed as in (Prestero, 2011).

Table 1. Variables and symbols.
$x_g, y_g, z_g$: The coordinates of the vehicle's gravity centre
$I_x, I_y, I_z$: The variables of the diagonal inertia tensor
$m$: Mass of the vehicle

In these expressions, one group of terms represents the body lift and moment, $\tau_{\delta_1} \cdots \tau_{\delta_4}$ are the fin lift force and moment coefficients, $X_{prop}$ is the propeller thrust, $K_{prop}$ is the propeller torque, and $\delta_s$ and $\delta_r$ are the fin angles referenced to the hull. Combining (2)(3) with (4), the motion equation of the AUV can be obtained as Equation (5). The velocity term in the earth-fixed coordinate can be obtained by the coordinate transform in Equation (6). According to (4), the control inputs of the AUV are the propeller thrust $X_{prop}$, the pitch fin angle $\delta_s$, and the rudder angle $\delta_r$. The horizontal fins and the vertical fins of REMUS set the pitch fin angle and the rudder angle, respectively, to make the vehicle heave and yaw. Because this paper mainly studies the position control of the AUV, the propeller speed of the REMUS is assumed to be constant at 1500 r/min. In this case, REMUS can maintain a speed of 1.51 m/s when there is no significant change in heave and yaw (Prestero, 2011).
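The coordinate transform from body-fixed velocities to earth-fixed rates can be illustrated with a short sketch. It assumes the usual Z-Y-X (yaw, pitch, roll) Euler-angle convention from marine vehicle modelling; the paper's own transform equation is not reproduced in this text, so the matrix entries and the function name `body_to_earth_rates` are assumptions rather than the authors' exact formulation.

```python
import numpy as np

def body_to_earth_rates(eta, nu):
    """Map body-fixed velocities nu = [u, v, w, p, q, r] to the earth-fixed
    rates of eta = [x, y, z, phi, theta, psi] (assumed Z-Y-X Euler angles)."""
    phi, theta, psi = eta[3:6]
    cphi, sphi = np.cos(phi), np.sin(phi)
    cth, sth, tth = np.cos(theta), np.sin(theta), np.tan(theta)
    cpsi, spsi = np.cos(psi), np.sin(psi)

    # Rotation of linear velocities from the body frame to the earth frame
    J1 = np.array([
        [cpsi * cth, -spsi * cphi + cpsi * sth * sphi,  spsi * sphi + cpsi * cphi * sth],
        [spsi * cth,  cpsi * cphi + sphi * sth * spsi, -cpsi * sphi + sth * spsi * cphi],
        [-sth,        cth * sphi,                       cth * cphi],
    ])
    # Transformation of angular velocities to Euler-angle rates;
    # it is singular at theta = +/-90 degrees
    J2 = np.array([
        [1.0, sphi * tth,  cphi * tth],
        [0.0, cphi,       -sphi],
        [0.0, sphi / cth,  cphi / cth],
    ])
    return np.concatenate([J1 @ nu[0:3], J2 @ nu[3:6]])
```

With zero attitude, a surge speed of 1.51 m/s maps directly to the earth-fixed x rate; with a yaw of 90 degrees the same surge speed appears as motion along the earth-fixed y axis.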

Position controller
In this section, a standard LESO is introduced to estimate the comprehensive disturbance, including nonlinear dynamics, model uncertainties, and external disturbances. Simple proportional compensation cannot properly approximate the complex disturbance, so the disturbance feedback term is adjusted based on the original LESO. To make the setting of controller parameters fairer, the sine cosine algorithm with individual memory is used to select parameters for the different controllers.

Regular linear extended state observer
This section develops a third-order LESO for the AUV. The AUV dynamics (5)(6) can be rewritten as (7) in the form of a more representative second-order system, where $x$ can represent any quantity in the vector $\eta$.
Convert (8) to an expanded state description. The standard LESO for the system (8) is given by (9), where $l = [l_1, l_2, l_3]^T$ is the gain of the LESO. To ensure that the characteristic matrix is Hurwitz, all poles of the observer are placed at $-\omega_o$, so we have $l_1 = 3\omega_o$, $l_2 = 3\omega_o^2$, $l_3 = \omega_o^3$ (10), where $\omega_o$ is called the observer bandwidth. The LESO uses the difference between the system output $x_1$ and the observation value $z_1$ as feedback compensation to bring the observation $z = [z_1, z_2, z_3]^T$ closer to the real value $x = [x_1, x_2, x_3]^T$ of the system. When $z_3$ estimates the disturbance $f$ in the system (8) well, that is, $z_3 \approx f$, the controller takes the form (11). Substituting (11) into (8), the model can be simplified into the standard integral form (12). For the system (12), only a proportional-derivative controller (13) is needed, where $k_p$ and $k_d$ are the gains of the controller, usually $k_p = \omega_c^2$ and $k_d = 2\omega_c$, where $\omega_c$ is called the controller bandwidth and $r$ is the setting value.
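The observer and controller structure described here can be sketched on a toy plant. The sketch assumes the standard linear ADRC forms: a second-order plant $\ddot{x} = f + b_0 u$; an observer with $\dot{z}_1 = z_2 + l_1(x_1 - z_1)$, $\dot{z}_2 = z_3 + l_2(x_1 - z_1) + b_0 u$, $\dot{z}_3 = l_3(x_1 - z_1)$; the compensation $u = (u_0 - z_3)/b_0$; and the PD law $u_0 = k_p(r - z_1) - k_d z_2$. These forms, the toy disturbance in `unknown_dynamics`, and all numerical values are illustrative assumptions; the REMUS model is not used.

```python
import math

def unknown_dynamics(x1, x2):
    # Hypothetical lumped disturbance f, hidden from the controller
    return -2.0 * x2 - 3.0 * math.sin(x1) + 0.5

def simulate_ladrc(r=1.0, b0=1.0, omega_o=50.0, omega_c=5.0, dt=1e-3, t_end=5.0):
    # Bandwidth parameterization: all observer poles at -omega_o
    l1, l2, l3 = 3.0 * omega_o, 3.0 * omega_o**2, omega_o**3
    kp, kd = omega_c**2, 2.0 * omega_c
    x1 = x2 = 0.0          # plant states
    z1 = z2 = z3 = 0.0     # observer states; z3 estimates f
    for _ in range(int(t_end / dt)):
        u0 = kp * (r - z1) - kd * z2      # PD law on the observed states
        u = (u0 - z3) / b0                # cancel the estimated disturbance
        f = unknown_dynamics(x1, x2)
        e = x1 - z1
        # Forward-Euler update of the observer, then of the plant
        z1, z2, z3 = (z1 + dt * (z2 + l1 * e),
                      z2 + dt * (z3 + l2 * e + b0 * u),
                      z3 + dt * (l3 * e))
        x1, x2 = x1 + dt * x2, x2 + dt * (f + b0 * u)
    return x1, z3, unknown_dynamics(x1, x2)

x_final, f_est, f_true = simulate_ladrc()
```

In this sketch the output settles at the setting value and `f_est` approaches `f_true`, which is the behaviour the compensation relies on: once $z_3 \approx f$, the plant seen by the PD law is close to a double integrator.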

Improved linear extended state observer
For LESO, the estimation of disturbance is given by (14). For a system with complex dynamics, the disturbance estimation may be unsatisfactory when only the integral of the observation error is used to approximate the disturbance. In (9), the observer output $\dot{z}_2$ is used to approach the control target $\dot{x}_2$ so that $z_2$ stays close to $x_2$, that is, $x_3 \approx z_3 + l_2(x_1 - z_1)$. A new disturbance estimate is therefore defined as

$$\hat{f} = z_3 + l_2 (x_1 - z_1) \quad (15)$$

With the reconstruction of the disturbance observation, the compensation is also modified, and the controller is adjusted accordingly. (8) and (9) can be rewritten as

$$z_1(s) = \frac{l_1 s^2 + l_2 s + l_3}{s^3 + l_1 s^2 + l_2 s + l_3}\, x(s)$$

Considering (9) and (15) as two observation methods for the comprehensive disturbance, the transfer functions from $f$ to $z_3$ and from $f$ to $\hat{f}$ show that the phase lags of the two disturbance observations are different. Under the same parameter settings, the phase lag of the adjusted observation method is smaller. For convenience, this improved LESO is abbreviated as ILESO.
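The phase-lag claim can be checked numerically under stated assumptions. For a standard third-order LESO with gains $l_1 = 3\omega_o$, $l_2 = 3\omega_o^2$, $l_3 = \omega_o^3$, the observation error dynamics give $z_3(s)/f(s) = l_3 / (s + \omega_o)^3$, and the reconstructed estimate $z_3 + l_2(x_1 - z_1)$ gives $(l_2 s + l_3) / (s + \omega_o)^3$. These expressions are derived here from the assumed observer structure and are not copied from the paper's transfer-function equations, which are not reproduced in this text. The function name is hypothetical.

```python
import cmath
import math

def disturbance_phase_lag_deg(omega, omega_o):
    """Phase lag in degrees of the two disturbance estimates at frequency omega.
    Returns (lag of z3, lag of z3 + l2*(x1 - z1)). Valid while the lag stays
    below 180 degrees, i.e. roughly omega < 1.7 * omega_o."""
    s = 1j * omega
    l2, l3 = 3.0 * omega_o**2, omega_o**3
    den = (s + omega_o) ** 3
    g_leso = l3 / den                # f -> z3
    g_ileso = (l2 * s + l3) / den    # f -> z3 + l2*(x1 - z1)
    return (-math.degrees(cmath.phase(g_leso)),
            -math.degrees(cmath.phase(g_ileso)))

lag_leso, lag_ileso = disturbance_phase_lag_deg(omega=10.0, omega_o=10.0)
```

At $\omega = \omega_o$ the first estimate lags by 135 degrees, while the added zero at $-\omega_o/3$ in the second transfer function contributes $\arctan(3) \approx 71.6$ degrees of phase lead, leaving a lag of about 63.4 degrees.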

Control system and fixed parameters
The block diagram of the depth control system for the AUV is shown in Figure 2. The yaw plane has the same structure as the depth control. The controller has two parts: the outer loop controls the position of the AUV, and the inner loop controls its attitude.
In the diagram, $\hat{z}$ represents the setting value of depth. The depth error $e_z$ is converted into the setting value of pitch $\hat{\theta}$ by the position controller, which is a proportional controller in this paper. The attitude error $e_\theta$ is fed into the attitude controller to obtain the control quantity $\delta_s$. Because LESO and ILESO in the inner loop use different definitions of disturbance, the two observers perform differently.
Because some parameters are related to the system, there is no need to adjust them once they have appropriate values. The proportional controller parameter $\gamma$ of the outer loop and the observer bandwidth $\omega_o$ of LESO and ILESO can be fixed. In addition, if the accurate value of $b_0$ cannot be obtained, an approximate estimate can be used instead, so that the observer treats the unknown part of $b_0$ as part of the disturbance (Tang & Zhang, 2017; Xue & Huang, 2015). Therefore, $b_0$ is estimated as a constant value and fixed in this paper.
After the above simplification, the parameters that need to be adjusted are $k_p$ and $k_d$ in $\delta$, or, more simply, the controller bandwidth $\omega_c$. To eliminate the deviation caused by manually adjusting the parameters, an optimization algorithm can be used to help the controller find the optimal parameters.

Sine cosine algorithm with individual memory
The sine cosine algorithm was proposed in 2016 (Mirjalili, 2016). Each candidate solution fluctuates randomly in the form of a sine or cosine, either outward or toward the optimal solution, so that it can search different areas of the space, effectively avoid local optima, and converge to the global optimal solution. In the iterative equation of the algorithm, $t$ represents the current iteration number, $P_j$ represents the value of the current global optimal solution in the $j$-th dimension, and $X_i^j(t)$ represents the value of individual $i$ in dimension $j$ at the $t$-th iteration. $r_1$ determines the maximum step size that the current solution can achieve, $r_2$ determines the iteration direction, $r_3$ determines the influence of the optimal solution on the candidate solution, and $r_4$ switches randomly between the sine and cosine forms. Among them, $r_2$ is a random parameter in the interval $[0, 2\pi]$, and $r_3$ and $r_4$ are random parameters in the interval $[0, 1]$. To balance exploration and exploitation so that the algorithm converges to the global optimum, the parameter $r_1$ changes adaptively with the iteration number, where $T$ is the maximum number of iterations.
In the iterative process, exploration and exploitation have strong randomness, and the new value of an individual depends entirely on the current global optimum. Individuals therefore easily lose their own characteristics and can skip over the real global optimum. To address this, fitness memory is added to each individual, and a greedy selection mechanism is used to update the position of the current solution. The fitness function in the added iterative equation is the integral absolute error (IAE). The complete algorithm flow combines the sine cosine update with this memory-based selection.
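A minimal sketch of a sine cosine search with per-individual fitness memory and greedy selection is given below. It follows the description in the text, but several details are assumptions: the step-size schedule $r_1 = a(1 - t/T)$ with $a = 2$ and the 0.5 switching threshold for $r_4$ are taken from the original sine cosine algorithm, candidates are clipped to the search bounds, and the function name and signature are hypothetical. The fitness here is any user-supplied function; in the paper it is the IAE of a closed-loop simulation.

```python
import math
import random

def sca_with_memory(fitness, bounds, n_individuals=10, n_iter=300, a=2.0, seed=0):
    """Minimize `fitness` over the box `bounds` = [(lo, hi), ...]."""
    rng = random.Random(seed)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_individuals)]
    memory = [fitness(x) for x in pop]    # best fitness remembered by each individual
    best_idx = min(range(n_individuals), key=memory.__getitem__)
    best, best_fit = pop[best_idx][:], memory[best_idx]

    for t in range(n_iter):
        r1 = a * (1.0 - t / n_iter)       # assumed linear decay of the step size
        for i in range(n_individuals):
            cand = []
            for j, (lo, hi) in enumerate(bounds):
                r2 = rng.uniform(0.0, 2.0 * math.pi)
                r3 = rng.random()
                r4 = rng.random()
                wave = math.sin(r2) if r4 < 0.5 else math.cos(r2)
                xj = pop[i][j] + r1 * wave * abs(r3 * best[j] - pop[i][j])
                cand.append(min(max(xj, lo), hi))   # keep the candidate in bounds
            f_cand = fitness(cand)
            # Greedy selection: move only if the candidate beats this
            # individual's own remembered fitness
            if f_cand < memory[i]:
                pop[i], memory[i] = cand, f_cand
                if f_cand < best_fit:
                    best, best_fit = cand[:], f_cand
    return best, best_fit
```

Because an individual moves only when its own fitness improves, the remembered fitness of every individual, and hence the global best, never gets worse from one iteration to the next.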

Simulation results
The simulation verifies the applicability of the ILESO in position control and verifies that changing the definition of disturbance improves the performance of the observer. The optimal parameters of the ILESO controller and the PD controller are found through the optimization algorithm. The effectiveness and robustness of the ILESO are then verified through several kinds of anti-disturbance comparison experiments.

LESO and ILESO
The simulation experiment aims at controlling the vehicle to reach [x, −1, 1]. The parameters of the two control systems are shown in Table 2. ·(z) and ·(y) represent the controller parameters on the depth plane and yaw plane, respectively. · represents the parameters of the ILESO controller. The controller parameters of ILESO have the same values on the two planes. The disturbance estimation of LESO is not stable, as shown in Figure 3. It can be seen from Figure 3(a,b) that the LESO only performs acceptably at the beginning of the movement. After that, the observed value has a large deviation, which causes the system to oscillate. The observed values in Figure 3(c,d) correspond to $z_1$ in formula (9), which are the observation values of the attitude angles. The two LESOs under the two types of parameter settings show different degrees of oscillation. At the same time, the fin angle and rudder angle of the AUV also change drastically between the maximum and minimum values. Because the system is unstable, the figures of the control quantities are omitted here.
ILESO's observation results of disturbance are shown in Figure 4(a,b). ILESO modifies the definition of disturbance from Equation (14) to Equation (15). Taking Figure 4(a) as an example, the LESO curve in the figure is $z_3$ in Equation (9) after the disturbance definition is modified. The performance of the observer is stable, and there is no oscillation like that in Figure 3. ILESO estimates disturbances well and has a smaller time lag than LESO. Figure 4(c,d) shows the observation values of the attitude angles using the ILESO, which are consistent with the actual values. Figure 5(a) shows the results of depth control using LESO and ILESO, and Figure 5(b) shows the pitch fin angle using ILESO. The movement of the vehicle using the ILESO is more stable.
It can be seen that for a strongly nonlinear system such as an AUV, modifying the definition of disturbance improves both the performance of the observer and the stability of the system. In the next section, the sine cosine algorithm with individual memory is used to find the optimal parameters of the controller in the AUV position control problem. The advantages of the ILESO are then verified through several comparative experiments.

ILESO and PD
Because some parameters are fixed in Section 3.3, this section uses the algorithm to find suitable parameters for the controller (13) and the PD controller. The parameters $k_p$ and $k_d$ in (13) correspond to P and D in the PD controller, and their variation ranges are shown in Table 3. In this paper, the fitness function of the algorithm only considers the integral absolute error of the motion, so the variation ranges of the parameters are chosen manually to ensure the stability of the control quantity. The algorithm takes controlling the vehicle to reach [x, −1, 1] as the simulation example.
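The integral absolute error used as the fitness function can be approximated from sampled simulation data. The sketch below uses the trapezoidal rule on uniformly sampled signals; the function name, the uniform sampling, and the choice of the trapezoidal rule are assumptions for illustration, since the paper does not state how the integral is discretized.

```python
def integral_absolute_error(reference, response, dt):
    """Trapezoidal approximation of the integral of |reference - response|
    for two equally long sequences sampled every dt seconds."""
    errors = [abs(r - y) for r, y in zip(reference, response)]
    if len(errors) < 2:
        return 0.0
    # Trapezoidal rule: full weight for interior samples, half for the end points
    return dt * (sum(errors) - 0.5 * (errors[0] + errors[-1]))

# Depth setting value of 1 m and a response that rises linearly to it
iae = integral_absolute_error([1.0, 1.0, 1.0], [0.0, 0.5, 1.0], dt=0.1)
```

A smaller value means the response stays closer to the setting value over the whole run, which is why minimizing it drives the vehicle to the target quickly but says nothing about how hard the fins are working.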
The settings of the algorithm are as follows: the number of individuals is N = 10; the position control of the AUV involves two motion planes, so a total of four parameters need to be found, D = 4; and the algorithm is iteratively updated T = 300 times. Figure 6 shows the memory, over the iteration process, of the individual that has the smallest fitness value after the final iteration. The final controller parameters are shown in Table 4. Due to the symmetry of the two motion planes, the following experiments take depth control as an example for comparison.
The algorithm helps the vehicle to reach the specified position as soon as possible. As shown in Figure 7(a), both the PD and ILESO controllers can control the vehicle to reach the specified depth and move steadily. Then, at the 20th second, a wave force disturbance lasting 10 seconds is added (Willy, 1994). The ILESO controller responds quickly, and its depth deviation is larger than the deviation under the PD controller, but the quick response helps the vehicle to return quickly to the set value. The values of several performance indices calculated from the moment the disturbance occurs are shown in Table 5.
When the vehicle is moving steadily, a step-type disturbance is added. As shown in Figure 8, a 0.5-second step signal with an amplitude of 0.5 m is added at the 20th second. The vehicle with the ILESO controller has a smaller position deviation and returns to the original position faster. The values of several performance indices for the step-type disturbance are shown in Table 6. At the same time, it can be seen from Figures 7(b) and 8(b) that the cost of the control quantity of the ILESO controller is relatively small.
The comparative experiments with two types of disturbances show that the system using the ILESO controller has the shortest adjustment time, which helps the system restore stability quickly. In addition, Prestero (2011) mentioned that the methods used to calculate rolling resistance and cross-flow drag have high uncertainty. Although this paper uses the coefficients adjusted according to field experiments in Prestero (2011), these coefficients are modified here in order to test the performance of the ILESO in the face of model uncertainty. After the coefficients of the vehicle are modified, the control result of the PD controller on the yaw plane diverges. Figure 9(a) compares the two control methods on the depth plane: the original PD controller is no longer applicable, and its parameters need to be re-adjusted. ILESO maintains the same control result, and Figure 9 shows that the observer can still achieve real-time estimation of disturbances. In practical applications, the required position change of the AUV is more than 1 m, so the control target is changed in the next experiment: the vehicle is controlled to sink 10 m while moving to the left by 5 m, keeping the forward direction unchanged. Figure 10 is a 3-D image of the position change process described in the earth-fixed coordinate. The process is stable, and the vehicle moves forward steadily once it reaches the designated position. Figure 11 shows the disturbance estimation results observed by the two ILESOs on the two motion planes. The observation results coincide with the actual values, and the performance of the observer is ideal. The disturbance on the yaw plane is not completely eliminated, which is related to the model itself.

Conclusion
For a system with complex dynamics, the disturbance estimation may be unsatisfactory when only the integral of the observation error is used to approximate the disturbance. ILESO is obtained by adjusting the structure of the disturbance estimation in the LESO. The new observer reduces the phase lag and, at the same time, helps to improve the stability of AUV position control. Compared with the PD controller, the controller combined with ILESO has a shorter adjustment time when facing external disturbances, allows the system to restore stability quickly, and shows strong robustness in the face of model uncertainty. To deal with the AUV motion control problems that will become more complicated in practical applications, the path following of the vehicle and the self-adjustment of the controller parameters will be studied in the future.

Data availability statement
The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.

Disclosure statement
No potential conflict of interest was reported by the author(s).