
An experimental comparison of different hierarchical self-tuning regulatory control procedures for under-actuated mechatronic systems

  • Omer Saleem,

    Roles Conceptualization, Formal analysis, Investigation, Methodology, Resources, Software, Validation, Visualization, Writing – original draft

    omer.saleem@nu.edu.pk

    Affiliation Department of Electrical Engineering, National University of Computer and Emerging Sciences, Lahore, Pakistan

  • Khalid Mahmood-ul-Hasan,

    Roles Supervision

    Affiliation Department of Electrical Engineering, University of Engineering and Technology, Lahore, Pakistan

  • Mohsin Rizwan

    Roles Validation

    Affiliation Department of Mechatronics and Control Engineering, University of Engineering and Technology, Lahore, Pakistan

Abstract

This paper presents an experimental comparison of four different hierarchical self-tuning regulatory control procedures for enhancing the robustness of under-actuated systems against bounded exogenous disturbances. The proposed hierarchical control procedure augments the ubiquitous Linear-Quadratic-Regulator (LQR) with an online reconfiguration block that acts as a superior regulator to dynamically adjust the critical weighting-factors of the LQR's quadratic-performance-index (QPI). The Algebraic-Riccati-Equation (ARE) uses these updated weighting-factors to re-solve the optimal control problem, after every sampling interval, and thus deliver time-varying state-feedback gains. This article experimentally compares four state-of-the-art rule-based online adaptation mechanisms that dynamically restructure the constituent blocks of the ARE. The proposed hierarchical control procedures are synthesized by self-adjusting (i) the controller's degree-of-stability, (ii) the control-weighting-factor of the QPI, (iii) the state-weighting-factors of the QPI as a function of "state-error-phases", and (iv) the state-weighting-factors of the QPI as a function of "state-error-magnitudes". Each adaptation mechanism is formulated via pre-calibrated hyperbolic scaling functions that are driven by state-error variations. The implications of each mechanism for the controller's behaviour are analyzed in real-time by conducting credible hardware-in-the-loop experiments on the QNET Rotary-Pendulum setup. The rotary pendulum is chosen as the benchmark platform owing to its under-actuated configuration and kinematic instability. The experimental outcomes indicate that the last of these self-adaptive controllers, driven by state-error-magnitudes, demonstrates superior adaptability and disturbance-rejection capability throughout the operating regime.

1. Introduction

The design principles of under-actuated self-stabilizing systems are extensively used in the fabrication of humanoid robotic systems, aeronautical systems, self-balancing transporters, robotic manipulators, and underwater vehicles [1, 2]. These systems offer high dexterity, better control-input economy, and a lesser propensity to break down [3]. However, under-actuated systems have fewer actuators than degrees-of-freedom to be regulated [4]. The system's under-actuated configuration, nonlinear dynamics, and open-loop instability pose a challenging problem to researchers in developing robust controllers that can effectively reject the exogenous disturbances encountered by the physical system in real-time applications [5, 6].

1.1. Literature review

Conventional controllers have been extensively used to optimize the disturbance-compensation behavior of the aforementioned class of mechatronic systems [7, 8]. The integer-order PID controllers are widely preferred in the control industry due to their simple structure and reliable control effort [9]. However, they cannot efficiently mitigate the influence of parametric uncertainties owing to their limited degrees-of-freedom and simple structure [10]. The fractional-order PID controllers offer relatively better flexibility of controller design, which increases the controller's degrees-of-freedom and enables it to quickly reject nonlinear disturbances [11]. However, tuning the controller parameters is an ill-posed problem [12]. Despite their enhanced flexibility, the fuzzy controllers require a large number of empirically-defined qualitative rules to deliver robust control decisions [13]. Apart from degrading the controller's computational economy, this arrangement also increases the human-rendered inaccuracies in the synthesized rule-base [14]. The neural controllers require rigorous training and large sets of training-data to deliver an accurate data-driven control model [15]. They also put an excessive recursive computational burden on the digital computer [16]. The Linear-Quadratic-Regulator (LQR) is a state-space control procedure that minimizes a quadratic-performance-index (QPI), which captures the state and control-input variations, to compute an optimal set of state-feedback gains [17, 18]. Despite its optimality and guaranteed stability, the LQR lacks robustness against exogenous disturbances, model variations, and identification errors [19, 20]. The robustness of the generic LQR can be improved by prescribing a "Degree-of-Stability" (DoS) in its structure [21]. The DoS design relocates the system's eigenvalues to the left of the line s = −β in the complex plane, where s is the Laplace operator and β > 0 is a preset parameter that defines the LQR's DoS [22]. The repositioning of eigenvalues enhances the controller's response speed and its damping against exogenous disturbances by manipulating its phase-margin [23]. However, this technique compromises the control-input expenditure of the controller [24].

The robust nonlinear controllers are difficult to derive exactly owing to the boundary conditions and complex geometry of the system's model [5, 25]. The nonlinear control scheme proposed in [26] effectively handles the actuated state-constraints, un-actuated state-constraints, and composite variable-constraints for a specific class of under-actuated systems. However, it does not address the effect of parametric uncertainties encountered by the system. The sliding-mode controllers are also renowned for delivering robust control efforts [27]. However, they apply a highly discontinuous control force which inevitably injects chattering into the response [28]. The back-stepping controllers are also used to regulate the performance of nonlinear systems [29]. However, the cancellation of indefinite cross-coupling terms, which is done to maintain the negativity of the Lyapunov function's first-derivative throughout the operating regime, contributes to higher control activity and degrades the system's robustness as well [30].

The adaptive controllers are an important tool for disturbance compensation in under-actuated systems [31]. They perform on-board reasoning to dynamically restructure the control procedure by self-tuning the critical controller parameters [32]. This setup enables the system to quickly adapt to abrupt state-variations [32–34]. Historically, the adaptive controllers are categorized as direct or indirect. The direct approach self-adjusts the critical controller-parameters as a function of the error-variables [13]. In the indirect approach, an identification scheme is used to estimate the system's unknown model-parameters to update the control law [14].

Extensive research has been done to synthesize robust adaptive controllers for under-actuated mechatronic systems [35, 36]. The Model-Reference-Adaptive-Controllers utilize the Lyapunov theory to track a reference control model, which leads to the online dynamic adjustment of the critical controller parameters [37, 38]. However, identifying an accurate reference model for the tracking purpose is a difficult task [39]. The gain-scheduling mechanism employs a state-driven look-up table to select pre-configured feedback controllers, where each controller is designed specifically for a given operating condition [40]. The calibration and stability assurance of the constituent controllers become quite laborious for a system with a large range of uncertainty [41]. The model-predictive-controllers use smaller time frames to solve the receding-horizon optimization problem and deliver time-varying controller gains [42]. However, they may render inaccurate predictions, which can lead to a fragile control effort under long-drifting disturbances or model variations [43]. The State-Dependent-Riccati-Equation based controllers require accurate state-dependent-coefficient matrices to update the Riccati-equation solutions [44]. However, an accurate definition of these matrices is quite hard to obtain due to the restrictions imposed by the nonlinear dynamics of higher-order systems [45]. The Markov-Jump-Linear-System is a stochastic control technique that is renowned for its resilience against random faults occurring in cyber-physical systems [46]. However, acquiring accurate a priori transition probabilities for the necessary computations is expensive, and their reliability is arguable [47].

The state-error-driven nonlinear scaling functions have also been extensively used in the development of expert adaptive systems that adapt the controller parameters online [48]. Retrofitting the linear compensators with nonlinear scaling functions to adaptively modify the critical controller gains has garnered a lot of traction in developing robust control for non-minimum-phase systems [49]. The nonlinear-type feedback controllers tend to improve the system's damping against oscillations, reference-tracking accuracy, and error-convergence rate [50, 51]. There are two main categories of nonlinear-type gain-adaptation laws that are widely used in the adaptive control field, namely, the state-error-magnitude observers and the state-error-phase observers. In state-error-magnitude observers, the online dynamic gain-adjustment depends on the magnitude of the state-error variable and its higher-order derivatives [52]. In state-error-phase observers, the online dynamic gain-adjustment is driven by the magnitude of the classical state-error as well as the direction of motion of the state-response (commonly referred to as the "phase" of the state-response) [53]. The phase information helps in flexibly manipulating the controller's characteristics as the response deviates from or converges to the reference [54]. The biologically-inspired artificial-immune system is a computationally intelligent adaptive mechanism that efficiently rejects exogenous disturbances [55]. It mimics the self-regulation capability of biological immune systems to adaptively tune the controller parameters, which optimizes the controller's adaptability to environmental indeterminacies [56].

The hierarchical self-tuning state-feedback regulators are yet another emerging control paradigm [57, 58]. They are implemented by dynamically adjusting the constituent weighting matrices of the LQR's QPI to indirectly modify the controller gains [59]. The online variations in these weighting-factors manipulate the critical parameters in the succeeding layers of the controller's structure, which eventually delivers time-varying state-feedback gains.

1.2. Proposed approach

The main contribution of this article is the development and experimental comparison of four unique state-of-the-art nonlinear-type hierarchical self-tuning state-feedback regulators for under-actuated mechatronic systems, in order to improve their robustness against exogenous disturbances. The proposed control scheme follows a hierarchical architecture that re-computes the state-feedback gains, after every sampling-interval, based on the state-error-dependent adaptive tuning of the weighting-factors associated with the LQR's quadratic-performance-index (QPI). For this purpose, the generic LQR structure is retrofitted with an auxiliary online self-tuning mechanism that acts as a superior regulator to adaptively tune the constituent weighting-factors associated with the QPI. The Riccati equation uses these adjusted weights to deliver the time-varying state-feedback gains. Each self-tuning mechanism is designed such that it exploits a specific aspect of the system's state-error profile and harnesses it to effectively reposition the system's closed-loop eigenvalues in the stable region of the complex-plane. The said hierarchical control procedure is quite innovative because, apart from adjusting the state-feedback gains online, the solution of the Riccati equation concurrently guarantees the asymptotic-convergence of the control law as long as the concerned weighting-factors are varied within pre-defined bounds. Hence, additional stability proofs are not required. The salient innovative contributions of this article are postulated as follows:

  1. Development of a self-tuning mechanism for the LQR’s “degree-of-stability”.
  2. Development of a self-tuning mechanism for the control-weighting-factor associated with the QPI.
  3. Development of a self-tuning mechanism for the state-weighting-factors of QPI that depends on the system’s state-error-phase.
  4. Development of a self-tuning mechanism for the state-weighting-factors of QPI that depends on the magnitudes of the system’s state-error variables.
  5. Formulation of each self-tuning mechanism by using pre-configured hyperbolic functions to re-scale the critical weighting-factors in real-time.
  6. Comparative performance assessment of the proposed self-tuning controller variants by conducting credible real-time experiments, designed specifically to emulate practical disturbance scenarios in the physical environment, on the standard QNET Rotary Pendulum setup [11].

The experimental results (shown later in this article) indicate that each self-tuning-regulator variant significantly enhances the system's robustness against exogenous disturbances, and improves the control-input economy to a certain degree, while preserving the system's asymptotic-stability throughout the operating regime. An experimental comparison of four different structures of hierarchical self-tuning regulators, employing innovative rule-based adaptation mechanisms to dynamically adjust the critical weighting-factors of the QPI, has not been attempted previously in the open literature. Hence, this is the main focus of this article.

The remaining paper is organized as follows: The pendulum system is mathematically modeled in Section 2. The baseline fixed-gain LQR is synthesized in Section 3. The detailed design of the four prescribed hierarchical self-tuning regulators is presented in Section 4. The experimental comparison of the proposed self-tuning regulators is presented in Section 5. The paper is concluded in Section 6.

2. System model

In this paper, a standard rotary inverted pendulum (RIP) system is used as the benchmark platform to experimentally analyze the implications of the proposed control procedure [60]. It requires an active control system to stabilize itself vertically. Apart from being under-actuated in nature, the said multivariable system also exhibits all the properties typically associated with mechatronic systems; such as open-loop (or kinematic) instability, complex geometry, and nonlinear dynamics [61]. The block diagram of an RIP system is illustrated in Fig 1. The system employs a DC geared servo-motor to apply the necessary control torque to rotate the pendulum’s arm, which is coupled to the motor’s shaft. The arm’s angular displacement energizes the pendulum rod to swing-up and balance itself vertically. The angular-displacements of the arm and the rod are denoted as α and θ, respectively.

Fig 1. Hardware schematic of a typical rotary pendulum system.

https://doi.org/10.1371/journal.pone.0256750.g001

The system’s nonlinear equations of motion are formulated via the Euler-Lagrange approach [62]. The system’s Lagrangian (L), expressed in Eq 1, is evaluated by computing the difference between the total kinetic energy (T) and the total potential energy (V) of the system, in terms of the coordinates (α and θ) and their corresponding angular-velocities (α̇ and θ̇).

(1) L = T − V

The Euler-Lagrange equations of the RIP system are derived as follows [62].

(2) d/dt(∂L/∂α̇) − ∂L/∂α = τ,  d/dt(∂L/∂θ̇) − ∂L/∂θ = 0

where τ represents the torque applied by the DC motor. It is expressed as follows.

(3)

The viscous damping forces and frictional forces are neglected in this research. The resulting nonlinear relationship between α, θ, and τ is expressed as follows [62].

(4)

The aforementioned set of nonlinear equations can be linearized around the pendulum’s upright equilibrium point. Furthermore, the small angular-displacements of the pendulum rod about this point are approximated via the following expressions.

(5)

The state-space model of a linear dynamical system is represented via Eq 6 [11].

(6) ẋ(t) = A x(t) + B u(t),  y(t) = C x(t) + D u(t)

where x is the state-vector, y is the output-vector, u is the control-input signal, A is the state-transition matrix, B is the input matrix, C is the output matrix, and D is the feed-forward matrix. The state-vector and the control input of the RIP system are identified in Eq 7 [59].

(7) x(t) = [α(t) θ(t) α̇(t) θ̇(t)]ᵀ,  u(t) = Vm(t)

where Vm is the control-input voltage applied to operate the DC motor. The nominal state-space model of the RIP system is presented as follows [59].

(8)

where the coefficients of the A and B matrices are populated using the system’s identified model parameters.

The model parameters of the QNET RIP are identified in Table 1 [11].

3. Linear quadratic regulator

The LQR is a standard state-space control strategy that is widely favored for optimal position-regulation of multivariable electro-mechanical systems [19]. The LQR yields an optimal control trajectory by minimizing an energy-like QPI, expressed in Eq 9, that captures the state-variations and the control input associated with the linear dynamical system [17].

(9) Jlq = ∫₀^∞ (x(t)ᵀQ x(t) + u(t)ᵀR u(t)) dt

where Q ∈ ℝ4×4 and R ∈ ℝ are the state and control-input weighting matrices, respectively. The QPI minimization is followed by the solution of the Hamilton-Jacobi-Bellman (HJB) equation to acquire the state-feedback gains offline [17]. The weighting-matrices are selected such that Q is a positive semi-definite matrix and R is a positive-definite matrix. For the RIP system considered in this research, the Q and R matrices are symbolically represented as shown in Eq 10.

(10) Q = diag(qα, qθ, qα̇, qθ̇),  R = ρ

where qx and ρ represent the real-numbered coefficients of the Q and R matrices, respectively. The value of ρ is selected as unity to maintain a reasonable control-input economy. The Q matrix is tuned in this research by iteratively minimizing the performance criterion, Jc, given in Eq 11, so as to minimize the position-regulation error as well as the control-input energy [63].

(11)

where eα(t) and eθ(t) represent the errors in the angular displacements of the arm and the rod from their corresponding reference positions, respectively. The reference position of the pendulum’s rod is set as π radians in order to stabilize it vertically. The angular position of the pendulum’s arm recorded at the beginning of every experimental trial is considered as its reference, αref. The LQR delivers the optimal set of state-feedback gains with the lowest cost of Jlq. However, these optimal gains are computed by using a specific set of Q and R matrices, which may not always contribute a good position-regulation behavior. Hence, to optimize the selection procedure, Jc is used to tune the state-weighting-factors in this research [59]. To acquire the best-fit solution, each state-weighting-factor is selected from the range [0, 500]. The search is initiated from a random point in the range-space; it is conducted in the direction of the descending gradient of Jc, and it is terminated when the minimum cost is achieved (a sketch of this search procedure is given after Eq 12). The coefficients of the Q and R matrices acquired for this research (corresponding to the minimum cost of Jc) are presented as follows.

(12)
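The descending-gradient tuning search described above can be sketched as follows. This is a minimal illustration, assuming a hypothetical helper evaluate_Jc(q) that runs a simulation or experimental trial with the candidate weights and returns the cost Jc of Eq 11; the finite-difference step logic is likewise an assumption, not the authors' exact routine.

```python
import numpy as np

def tune_q(evaluate_Jc, n=4, lo=0.0, hi=500.0, step=5.0, tol=1e-3, seed=None):
    """Descending-gradient search for the Q coefficients over [0, 500]."""
    rng = np.random.default_rng(seed)
    q = rng.uniform(lo, hi, n)                      # random starting point
    J = evaluate_Jc(q)
    while True:
        # finite-difference estimate of the gradient of Jc at q
        grad = np.array([(evaluate_Jc(q + step * e_i) - J) / step
                         for e_i in np.eye(n)])
        q_new = np.clip(q - step * grad / (np.linalg.norm(grad) + 1e-9), lo, hi)
        J_new = evaluate_Jc(q_new)
        if J - J_new < tol:                         # terminate at the minimum cost
            return q, J
        q, J = q_new, J_new
```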

The Algebraic-Riccati-Equation (ARE) utilizes the system’s nominal model as well as the tuned Q and R matrices to compute the solution, P, as shown in Eq 13.

(13) AᵀP + PA − PBR⁻¹BᵀP + Q = 0

where P ∈ ℝ4×4 is a symmetric positive-definite matrix. It is well-known that if the system is controllable, Q = Qᵀ ≥ 0, and R = Rᵀ > 0, then the solution of the ARE yields an asymptotically-stable control behavior [17]. The state-feedback gain vector, Kf, is calculated as shown in Eq 14.

(14) Kf = R⁻¹BᵀP

The optimal control law is expressed as follows.

(15) uo(t) = −Kf x(t)

The evaluation of the gain vector for the tuned weighting-matrices yields the baseline state-feedback gains. The linear control law is restructured by equipping it with the following state-error-integral variables.

(16) εα(t) = ∫₀ᵗ eα(τ) dτ,  εθ(t) = ∫₀ᵗ eθ(τ) dτ

This augmentation improves the pendulum’s damping against fluctuations and its reference-tracking behavior [18]. The integral control law is expressed as follows.

(17) ui(t) = Ki ε(t),  ε(t) = [εα(t) εθ(t)]ᵀ

The integral-gain vector, Ki, is tuned by iteratively minimizing the cost function, Jc, to minimize the position-regulation error. The Ki vector that yields the minimum cost, with each gain selected from the range [-5, 0], is used in this paper.

(18) u(t) = uo(t) + ui(t) = −Kf x(t) + Ki ε(t)
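A minimal sketch of this baseline synthesis is given below, using SciPy's continuous-time ARE solver. The numerical A, B matrices of Eq 8 and the tuned weights of Eq 12 are not reproduced in the extracted text, so the values used here are hypothetical placeholders for illustration only.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Placeholder model and weights (NOT the identified QNET RIP values).
A = np.array([[0.0,  0.0,   1.0, 0.0],
              [0.0,  0.0,   0.0, 1.0],
              [0.0, 40.0, -15.0, 0.0],
              [0.0, 80.0, -14.0, 0.0]])
B = np.array([[0.0], [0.0], [25.0], [25.0]])
Q = np.diag([50.0, 200.0, 5.0, 1.0])   # hypothetical tuned state weights
R = np.array([[1.0]])                  # rho = 1

P = solve_continuous_are(A, B, Q, R)   # solution of the ARE (Eq 13)
Kf = np.linalg.solve(R, B.T @ P)       # state-feedback gains (Eq 14)

def baseline_control(x, err_int, Ki):
    """Eq 18: optimal state feedback combined with integral action."""
    return float(-Kf @ x + Ki @ err_int)
```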

4. Hierarchical self-tuning-regulator design

The ubiquitous LQR uses the system’s linear state-space model to deliver fixed state-feedback gains. Thus, it lacks robustness against the state-deviations caused by the bounded disturbances, modeling uncertainties, identification errors, and other parametric variations. To solve this problem, the LQR is augmented with an online adaptation law that dynamically reconfigures the critical controller parameters. The adaptation law is realized by using state-error-dependent nonlinear scaling functions. These synthetic “nonlinear” functions flexibly manipulate the control profile to reject the exogenous disturbances. This arrangement significantly improves the controller’s adaptability and disturbance-rejection capability; although, the resulting self-tuning regulator continues to utilize the system’s linear state-space model.

This section presents the theoretical background and formulation of four different state-of-the-art hierarchical adaptive state-feedback control procedures. Each self-tuning mechanism adaptively modulates the gains of the LQR. The proposed mechanisms redesign the nominal LQR, after every sampling interval, to flexibly manipulate the control-input trajectory, which aids in efficiently rejecting the exogenous disturbances and parametric variations. It is to be noted that only the state-feedback gains are updated online in the proposed adaptive control procedures; the integral gains are kept fixed at their tuned values, as discussed in the previous section. Each proposed adaptive control procedure seeks a beneficial compromise between the position-regulation behaviour and the control energy expenditure while maintaining the system’s stability across a broad range of operating conditions. As discussed earlier, the proposed adaptation laws self-adjust specific parameters (existing naturally) within the hierarchical structure of the LQR control system. The online reconfiguration of these targeted parameters indirectly leads to the re-computation of the state-feedback gains after every sampling interval. In this article, four unique hierarchical self-tuning control procedures are investigated. These control procedures are individually synthesized by:

  1. Self-adjusting the degree-of-stability of the LQR by using state-error feedback.
  2. Self-adjusting the R matrix by using state-error feedback.
  3. Self-adjusting the coefficients of the Q matrix by using a well-established rationale that depends on the state-error-phase feedback.
  4. Self-adjusting the Q and R matrices by using well-postulated meta-rules that depend on the state-error-magnitude feedback.

The adaptation laws are formulated via pre-calibrated hyperbolic nonlinear scaling functions. These functions are continuous, which allows for a smooth variation of the concerned weights as the operating conditions change. These functions are bounded, which limits the variation of the concerned weights and thus ensures an asymptotically-stable control behaviour. The symmetry of the hyperbolic functions about the vertical axis helps to appropriately steer the control trajectory as the polarities of the state-error variables change. Finally, these algebraic equations can be solved in a single step after every sampling interval. Unlike the iterative auto-tuning or gradient-descent techniques, the real-time computation of hyperbolic scaling functions does not put an excessive recursive computational burden on the embedded processor. Hence, they are computationally economical and can be easily programmed in the control software by using modern-day digital computers. A generic sketch of such a scaling primitive is given below.
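This minimal sketch illustrates the single-step, bounded, even-symmetric scaling described above; the function and parameter names are illustrative rather than taken from the paper.

```python
import numpy as np

def sech(v):
    """Hyperbolic secant: continuous, bounded in (0, 1], and even-symmetric."""
    return 1.0 / np.cosh(v)

def hyperbolic_scale(err, lo, hi, variance, grow_with_error=True):
    """Re-scale a weighting-factor within [lo, hi] as a one-step algebraic
    function of a state-error signal (no iteration required)."""
    s = sech(variance * err)           # 1 at zero error, -> 0 for large errors
    if grow_with_error:
        return hi - (hi - lo) * s      # small near equilibrium, large under disturbance
    return lo + (hi - lo) * s          # large near equilibrium, small under disturbance
```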

4.1. Adjustable degree-of-stability

The baseline LQR is transformed into a self-tuning-regulator by retrofitting it with a self-adjusting degree-of-stability (DoS) [21]. The QPI is equipped with a reconfiguration block that relocates the system’s closed-loop poles to the left of the vertical line, s = −β(t), in the complex s-plane, where β(.) is a state-error-dependent, time-varying positive parameter. The original QPI is modified by associating with it a time-varying exponential multiplying factor of the form e^(2β(t)t), as shown in Eq 19 [22].

(19) Jd = ∫₀^∞ e^(2β(t)t) (x(t)ᵀQ x(t) + u(t)ᵀR u(t)) dt

The multiplication of the typical cost-function with the time-varying exponential term shifts the eigenvalues of the state-transition matrix A to the left of the line s = −β(t), which ensures the asymptotic-stability of the controller’s operation [22]. The revised cost-function can be simplified according to the following expression.

(20) Jd = ∫₀^∞ ((e^(β(t)t) x(t))ᵀQ (e^(β(t)t) x(t)) + (e^(β(t)t) u(t))ᵀR (e^(β(t)t) u(t))) dt

This simplification implies that the expressions of the state-vector, as well as the control-input vector, can be revised as expressed below [23].

(21) x̂(t) = e^(β(t)t) x(t),  û(t) = e^(β(t)t) u(t)

The substitution of the revised expressions of the state-vector and control-input vector yields the following expression of the cost-function.

(22) Jd = ∫₀^∞ (x̂(t)ᵀQ x̂(t) + û(t)ᵀR û(t)) dt

The system’s state-equation is also modified as expressed below [40].

(23) dx̂(t)/dt = (A + β(t)I) x̂(t) + B û(t)

The expression in Eq 23 reveals that the augmentation of the exponential term, e^(2β(t)t), in the quadratic cost-function ends up transforming the system’s state-matrix A into A + β(t)I. Hence, this arrangement contributes to varying the coefficients of the state-matrix as a function of the state-variables. The modified expression of the ARE is shown below [24].

(24) (A + β(t)I)ᵀP(t) + P(t)(A + β(t)I) − P(t)BR⁻¹BᵀP(t) + Q = 0

The time-varying state-feedback gain vector is updated online as follows.

(25) Kd(t) = R⁻¹BᵀP(t)

The updated gain vector, Kd(t), flexibly steers the control trajectory using the following Self-Tuning-Regulator (STR) control law.

(26) u(t) = −Kd(t) x(t) + Ki ε(t)

In order to constitute the adaptive control law, the value of β is dynamically adjusted via an online adaptation law. The proposed adaptation mechanism is formulated by using continuous nonlinear scaling functions that dynamically reconfigure the value of β online, based on the real-time variations in the system’s cumulative position-regulation error. The cumulative position-regulation and projected errors, contributed by the pendulum’s arm and the rod, are evaluated by taking a linear combination of the individual state-error variables. The modified Riccati equation (expressed in Eq 24) uses the updated values of β to re-compute its solution after every sampling interval, and thus yields a time-varying state-feedback gain vector. The structure of the STR employing the aforementioned adjustable-DoS (ADoS-STR) mechanism is illustrated in Fig 2 [64].

The online adaptation law for β is formulated by using a pre-calibrated continuous Hyperbolic-Secant-Function (HSF) that depends on the weighted sum of state-error variables [64]. The HSF is chosen because its waveform is continuous, bounded, and even-symmetric. The shape of HSF’s waveform is calibrated according to the following rationale [64].

  1. The magnitude of β is enlarged when the state-error magnitudes increase, in order to place the eigenvalues farther from the imaginary-axis. This arrangement yields stronger damping against overshoots and quickly reverses the direction of the response.

  2. The magnitude of β is reduced when the state-error magnitudes decrease, in order to place the eigenvalues closer to the imaginary-axis. This allows the response to settle naturally (and smoothly).

These characteristics yield rapid convergence with strong damping against oscillations, without contributing large actuating torques under the influence of bounded exogenous disturbances. The proposed HSF is formulated as follows.

(27) β(t) = βmax − (βmax − βmin)·sech(z(t)),  z(t) = σ1eα(t) + σ2ėα(t) + σ3eθ(t) + σ4ėθ(t)

where sech(.) represents the HSF, βmin and βmax represent the minimum and maximum limits of the HSF, z(t) is the weighted sum of all state-error variables in real-time, and the parameters σ1, σ2, σ3, and σ4 are the preset weights linked with each state-error variable in z(t). The waveform of the weight-adjusting function is shown in Fig 3.

The inclusion of the four state-error variables in the computation of z(t) informs the adaptation law regarding the effect of the disturbance on the system’s behavior. This self-reasoning capability improves the controller’s adaptability. To acquire the proposed self-reasoning capability, “positive” weights are selected for each state-error variable in z(t). Hence, when the state-responses diverge from the reference, the positive weights promote an increment in the magnitude of z(t) owing to the same polarities of the classical error variables and their derivatives in this phase. Conversely, when responses revert and approach the reference, the positive weights allow a decrement in the magnitude of z(t) owing to the opposite polarities of the classical error variables and their derivatives in this phase. This arrangement enhances the controller’s flexibility and ensures a stiff damping control effort under large error conditions to quickly attenuate the oscillations, and a softer control effort under small error conditions. The parameters are selected by minimizing Je to improve the reference-tracking and disturbance-rejection behavior. The tuned parameters are recorded in Table 2.
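The per-sample update of the ADoS-STR can then be sketched as follows, assuming the Eq 27 form reconstructed above and the stated ordering of the σ-weights in z(t); the ARE is re-solved with the shifted state-matrix A + β(t)I of Eqs 23 and 24.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

def ados_update(A, B, Q, R, e, de, sigma, beta_min, beta_max):
    """One sampling-interval update of the ADoS-STR gains.
    e, de: (e_alpha, e_theta) and their derivatives; sigma: (s1, s2, s3, s4)."""
    z = sigma[0]*e[0] + sigma[1]*de[0] + sigma[2]*e[1] + sigma[3]*de[1]
    beta = beta_max - (beta_max - beta_min) / np.cosh(z)  # Eq 27 (assumed form)
    A_shift = A + beta * np.eye(A.shape[0])               # A -> A + beta(t)*I (Eq 23)
    P = solve_continuous_are(A_shift, B, Q, R)            # modified ARE (Eq 24)
    Kd = np.linalg.solve(R, B.T @ P)                      # Eq 25
    return beta, Kd
```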

Table 2. Parameter selection of the HSF for the ADoS-STR mechanism.

https://doi.org/10.1371/journal.pone.0256750.t002

4.2. Adjustable control-weighting-factor

In the LQR problem, the control-weighting-factor (ρ) steers the control-input trajectory. The selection of ρ strikes a compromise between the system’s position-regulation behavior and its control energy expenditure [22]. A small value of ρ increases the controller’s robustness against disturbances but also induces a highly discontinuous control activity. On the contrary, a large value of ρ limits the system’s control activity under disturbance conditions, which inevitably degrades the position-regulation and transient-recovery behavior [65]. Hence, a fixed value of ρ renders the overall control mechanism uneconomical under rapidly changing operating conditions [66, 67]. On one hand, it applies superfluous control force under small error conditions; on the other hand, it contributes inadequate control resources under transient disturbances. A viable solution is to adaptively modulate ρ in the LQR’s QPI, while keeping the coefficients of the Q matrix fixed at their prescribed values, as shown below.

(28) Q = diag(qα, qθ, qα̇, qθ̇),  R(t) = ρ(t)

The idea is to smoothly slide the factor ρ across a continuous surface so that the control profile can be flexibly manipulated to minimize the reference-tracking error and to maintain a reasonable control-input economy throughout the operating regime. This arrangement automatically relocates the eigenvalues of the closed-loop system to effectively compensate for the disturbances. With the modification, R(t) = ρ(t), incorporated in the nominal LQR procedure, the QPI is revised as follows.

(29) Jρ = ∫₀^∞ (x(t)ᵀQ x(t) + ρ(t) u(t)²) dt

The modified Riccati Equation is expressed in Eq 30.

(30) AᵀP(t) + P(t)A − P(t)B ρ(t)⁻¹ BᵀP(t) + Q = 0

The gain vector is re-computed online as follows.

(31) Kc(t) = ρ(t)⁻¹ BᵀP(t)

The time-varying gain vector, Kc(t), delivers the following STR control law.

(32) u(t) = −Kc(t) x(t) + Ki ε(t)

The STR equipped with the adjustable-control-weighting-factor (or ACWF) is denoted as ACWF-STR in this research [68]. Its block diagram is shown in Fig 4. The ACWF-STR yields an asymptotically-stable control behavior, as long as ρ(t)>0.

The proposed STR is implemented by augmenting the baseline LQR with a reconfiguration module that self-adjusts the value of ρ as a pre-calibrated nonlinear scaling function of state-error variables. The following meta-rules are used to formulate the proposed reconfiguration module [68].

  1. Under small error conditions (or equilibrium state), the value of ρ is enlarged to allow for position-regulation with minimal control input expenditure.
  2. Under large error conditions (or disturbance state), the value of ρ is proportionally reduced to deliver a tighter control effort to efficiently reject the disturbances.
  3. If the control-input inflates drastically under the influence of bounded disturbances, the variation-rate of ρ is reduced to economize the control effort and limit the peak servo requirements.

With these qualities, the module dynamically restructures the control procedure to enhance the system’s response speed, strengthen its damping against oscillations, and ensure optimum allocation of control resources under exogenous disturbances. The HSF is used to ensure smooth transitions in the value of ρ as the operating conditions change [64]. The linear combination of the real-time state-error variables is used as the input to the HSF which aids in diagnosing the occurrence (and impact) of the exogenous disturbances. The feature dictated by the third meta-rule prevents the RIP’s DC motor from getting saturated while maintaining a reasonable response-speed and damping against oscillations [68]. This feature is incorporated in the HSF-based adaptation law by means of an auxiliary control-input-dependent function. The proposed ACWF adaptation law is formulated as follows.

(33)

where, ρmax and ρmin represent the upper and lower bounds of the HSF, μc is the preset variation-rate of the HSF, z(t) is the same state-error-driven variable as shown in Eq 27 [64], and γ(u, t) is the control-input-dependent self-adjusting variance of the HSF. The function γ(u, t) is specifically designed and implanted in the adaptation-law to realize the third meta-rule. The augmentation of γ(u, t) dynamically adjusts the variance of the adaptation law to maintain the controller’s robustness without contributing highly discontinuous control activity. The shape of the HSF waveform is adjusted, under large servo requirements, as shown in Fig 5.

Fig 5. Automatic adjustment in the HSF waveform for the ACWF mechanism.

https://doi.org/10.1371/journal.pone.0256750.g005

The parameter γo is the basic variance of the function; ω is a positive constant in [0, 1] that presets the lower bound of the variance; η is the positive weight of u(t); and ψ is the positive fractional exponent of the scaled u(t) that prevents the self-adjustment at smaller control signals. The aforementioned parameters are tuned offline by iteratively minimizing Je. The selected values of these parameters are recorded in Table 3 [68]. A sketch of this adaptation law is given below.
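Since the exact compositions of Eq 33 and of γ(u, t) are not reproduced in the extracted text, the forms below are assumptions chosen only to realize the three stated meta-rules.

```python
import numpy as np

def acwf_rho(z, u, rho_min, rho_max, mu_c, gamma_o, omega, eta, psi):
    """Assumed realization of the Eq 33 adaptation law for rho(t)."""
    # Variance grows from its lower bound omega*gamma_o as |u| inflates,
    # widening the sech bell so that rho varies more slowly (meta-rule 3).
    gamma = gamma_o * (omega + (eta * abs(u)) ** psi)
    # rho ~ rho_max near equilibrium (z ~ 0), rho ~ rho_min under large errors.
    return rho_min + (rho_max - rho_min) / np.cosh(mu_c * z / gamma)
```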

Table 3. Parameter selection of the HSF for the ACWF-STR mechanism.

https://doi.org/10.1371/journal.pone.0256750.t003

4.3. Adjustable SWFs using error-phase observers

This section presents another practical adaptive control scheme that self-tunes the LQR gains by adaptively modulating all the state weighting-factors associated with the QPI [57].

For the under-actuated systems, the degrees-of-freedom to be stabilized are greater than the rank of R which makes it quite hard to establish a correlation between ρ and the state-variables [58]. However, the coefficients of the state-weighting-matrix Q (denoted as qx) hold a one-to-one correspondence with the respective state-variables. This arrangement provides a pragmatic approach to dynamically adjust the values of qx online. Apart from obviating the necessity to tune and preset the state-weighting-factors based on a specific performance criterion, this approach increases the degree-of-freedom of the controller design [58]. Each weighting-factor is dynamically adjusted by using pre-calibrated nonlinear functions that are driven by the corresponding state-error variables of the system, as shown in Eq 34.

(34) Q(t) = diag(qα(t), qθ(t), qα̇(t), qθ̇(t))

The control weighting-factor is preset to unity to maintain an economical control activity. The time-varying state weighting-matrix, Q(t), is used to modify the solution of the Matrix-Riccati-Equation, after every sampling interval, as shown below [17].

(35) AᵀP(t) + P(t)A − P(t)BR⁻¹BᵀP(t) + Q(t) = 0

The updated P(t) re-computes the state-feedback gains online by using the following update law.

(36) K(t) = R⁻¹BᵀP(t)

The STR equipped with adjustable State-Weighting-Factors (SWF) is shown below.

(37) u(t) = −K(t) x(t) + Ki ε(t)

The block diagram of SWF-STR is shown in Fig 6. The following Lyapunov function is used to verify the asymptotic stability of the SWF-STR architecture [17].

(38) V(t) = x(t)ᵀP(t) x(t)

The first-derivative of V(t) is expressed as follows.

(39) V̇(t) = ẋ(t)ᵀP(t) x(t) + x(t)ᵀṖ(t) x(t) + x(t)ᵀP(t) ẋ(t)

The term x(t)ᵀṖ(t)x(t) approaches zero in an infinite-horizon control problem [35]. Thus, the simplified expression of V̇(t) reduces to Eq 40.

(40) V̇(t) = −x(t)ᵀ(Q(t) + P(t)BR⁻¹BᵀP(t)) x(t)

This expression of the first-derivative is negative-definite as long as Q(t) > 0, which justifies the stability of the proposed STR.

This adaptation law relies upon the “phase” of the system’s state-response(s) to adaptively tune the state-weighting-factors [53]. The baseline weight-adjusting functions are implemented via pre-calibrated HSFs that depend on the variations in the magnitude of the classical state-error and phase of the state-response. These HSFs are retrofitted with an auxiliary phase-observer that accurately “deduces” and informs the adaptation mechanism regarding the movement of the state-response (away or towards the reference) based only on the instantaneous polarities of the classical state-error and the state-error-derivative variables [54]. The “phase” information is also used to automatically “mutate” the shape of each HSF waveform. This synthetic self-deduction and self-mutation capability significantly enhances the robustness of the adaptive control procedure against exogenous disturbances; thus, making it highly suitable for damping control applications. The following qualitative rules are used to constitute the online adaptation mechanism [53].

  1. When the response is diverging from the reference, the values of qx are inflated to apply a stiff control effort which damps the overshoots and reverses the direction of response.
  2. When the response is converging to the reference, the values of qx are reduced to apply a soft control effort which allows the response to settle (naturally) with minimum fluctuations.

These characteristics induce rapid transits in the response with strong damping against oscillations while suppressing the peak servo requirements. However, this rationale requires precise information regarding the phase (direction of motion) of the response to restructure the control procedure. Consider the time-domain error profile of an arbitrary under-damped system, shown in Fig 7, under the influence of a bounded disturbance.

thumbnail
Fig 7. Error profile of an arbitrary under-damped system.

https://doi.org/10.1371/journal.pone.0256750.g007

The error profile is divided into four phases: A, B, C, and D. Each phase represents a distinct operating condition that is addressed individually to attain the best control effort. The polarities of the error and the error-derivative are the same when the response is deviating from the reference (phases A and C). The polarities of the error and the error-derivative are opposite when the response is converging to the reference (phases B and D) [53, 54]. In view of this state-error behavior, the phase is observed as follows [69].

(41) mx(t) = step(ex(t)·ėx(t))

where ex and ėx denote the state-error variable and its derivative, mx is a step(.) function that yields a “zero” if its internal product is negative and a “one” if it is positive, and ‘x’ denotes the state-variable being considered. This phase-observer is embedded within the structure of a state-error-dependent HSF to alter the waveform’s shape as the state-error changes [69]. The proposed self-mutating HSF is given in Eq 42 [59].

(42)

where ax and bx are the positive upper and lower bounds of each function such that ax ≥ bx to ensure qx(t) ≥ 0, and γx represents the variance of each function. The proposed HSF complies with the aforementioned meta-rules. Each weight-adjusting function is augmented with its corresponding Boolean operator, mα or mθ.

The logical rules governing the self-mutation of qx(t) are defined in Table 4. The mutation scheme is illustrated in Fig 8 [59]. In phases A and C, the response deviates from the reference. Since the error and error-derivative variables have the same polarities, the Boolean-setting of mx = 1 selects the growing form of the weight-adjusting function. This setting delivers a tight control effort to damp the overshoot (or undershoot). In phases B and D, the response converges to the reference. The error and error-derivative variables have opposite polarities, which leads to the Boolean-setting of mx = 0. This setting contributes a relatively gentle control effort to allow for a quick yet smooth settlement of the response. A sketch of this mutation logic is given below.
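In the following sketch, the phase-observer follows Eq 41 directly, whereas the two branch shapes are assumed realizations that are consistent with the growing/decaying waveforms of Figs 8 and 9 and the bounds ax ≥ bx > 0.

```python
import numpy as np

def phase_observer(e, de):
    """Eq 41: m_x = 1 while the response diverges from the reference
    (error and error-derivative share polarity), m_x = 0 while it converges."""
    return 1 if e * de > 0 else 0

def ep_weight(e, de, a, b, gamma):
    """Self-mutating weight-adjusting function (assumed realization of Eq 42);
    both branches remain within [b, a], so q_x(t) >= 0 holds for a >= b > 0."""
    s = 1.0 / np.cosh(gamma * e)    # sech: 1 at zero error, -> 0 far from it
    if phase_observer(e, de):       # phases A, C: growing waveform
        return a - (a - b) * s      # stiffens as |e| inflates
    return b + (a - b) * s          # phases B, D: decaying waveform
```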

Fig 8. Self-mutation scheme for weight-adjusting functions.

https://doi.org/10.1371/journal.pone.0256750.g008

With the commissioning of the phase-observer, the weight-adjusting function(s) autonomously reconfigure their waveforms as illustrated in Fig 9 [53]. The proposed augmentation strengthens the controller’s disturbance-rejection capability by autonomously transforming the growing behaviour of the waveform into a decaying behaviour as the state-response transits from divergence phase to convergence phase, and vice-versa. The self-mutating error-phase-based HSFs are formulated as follows [59].

(43)(44)(45)(46)

Fig 9. Variation rules for weighting coefficients in every phase.

https://doi.org/10.1371/journal.pone.0256750.g009

The hyper-parameters associated with each weight-adjusting function are tuned by iteratively minimizing Je to yield strong damping control. The tuned parameters are shown in Table 5 [59].

Table 5. Parameter selection of the error-phase-dependent HSFs.

https://doi.org/10.1371/journal.pone.0256750.t005

The adapted values of qx(t) remain positive throughout the operating regime, which ensures the system’s stability. The STR equipped with the self-mutating error-phase-dependent HSFs is denoted as the “EP-STR” [59].

4.4. Adjustable SWFs using error-magnitude observers

The proposed scheme dynamically updates the state-feedback gains, after every sampling interval, by adaptively modulating the state-weighting-factors as well as the control-input weighting-factor associated with the QPI, concurrently, by using online state-error-dependent expert self-tuning mechanisms [57]. This arrangement is beneficial because it harnesses the full potential of the proposed hierarchical adaptive LQR scheme: all the user-specified constituent weighting-factors of the Riccati equation are dynamically adjusted, which indirectly alters the state-feedback gains.

It enhances the adaptability of the control procedure to recognize environmental indeterminacies and flexibly steer the control profile to compensate for the consequent parametric variations. The weighting-matrices containing the self-adjusting coefficients are represented as follows.

(47) Q(t) = diag(qα(t), qθ(t), qα̇(t), qθ̇(t)),  R(t) = ρ(t)

Unlike the preceding schemes, the control-weighting-factor is not held at unity in this mechanism; it is modulated alongside the state-weighting-factors. The rationale and methodology used to formulate the state-error-dependent online self-tuning mechanisms for these weighting-factors are discussed below. The restructured Riccati equation is expressed in Eq 48.

(48) AᵀP(t) + P(t)A − P(t)B ρ(t)⁻¹ BᵀP(t) + Q(t) = 0

The Riccati Equation yields a time-varying solution, P(t), after every sampling instant. The self-adjusting state-feedback gain vector is calculated by using Eq 49.

(49) K(t) = ρ(t)⁻¹ BᵀP(t)

The proposed STR law is defined as follows.

(50) u(t) = −K(t) x(t) + Ki ε(t)

This self-tuning strategy observes the real-time variations in the state-error magnitudes to dynamically adjust the weighting-factors while preserving the system’s stability throughout the operating regime. The rationale used to develop the error-magnitude observer for self-tuning control of robotic systems has been experimentally verified in the available literature [11]. It relies upon the following two meta-rules to modify the critical controller parameters [52].

  1. The proportional state-weighting-factors (qα and qθ) are inflated as the magnitudes of the corresponding classical state-errors reduce, and vice-versa.
  2. The differential state-weighting-factors (qα̇ and qθ̇) are inflated as the magnitudes of the corresponding state-error-derivatives increase, and vice-versa.

Together, these characteristics dynamically reconfigure the control procedure to strengthen the system’s disturbance-compensation capability [11, 52]. To ensure a smooth transition of the weighting-factors, the nonlinear scaling functions are required to be continuous, bounded, and even-symmetric. Hence, the weight-adaptation functions are implemented via partial-hyperbolic-functions (PHFs), whose shapes and forms are configured offline according to the above-mentioned qualitative rules [70]. It is to be noted that hyperbolic-secant functions and zero-mean Gaussian functions can also be used instead of the PHFs to mathematically program the said adaptation law [64, 68]. The error-magnitude-driven PHFs used to scale each state and control weighting-factor are formulated below [71].

(51) (52) (53) (54) (55)

where ax and bx represent the prescribed upper and lower bounds of the state-weighting functions, and γx represents the variance of the state-weighting functions. The waveforms of the weight-adjusting functions are shown in Fig 10.

Fig 10. The waveforms of the proportional (left) and the differential (right) weight-adjusting functions.

https://doi.org/10.1371/journal.pone.0256750.g010

A proper selection of γx enables the controller to apply a stiffer control effort under the disturbed state and a softer control effort under the equilibrium state of the system. This arrangement strengthens the system’s damping against fluctuations, yields minimum-time transient recovery, and renders a smoother control activity [71]. It also averts the limit-cycles contributed by static-friction during dead-zones. In this mechanism, the value of ρ is also adaptively modulated as a nonlinear function of the classical state-error variables. This arrangement prevents the actuator from getting saturated due to rapid fluctuations and large overshoots in the control-input profile, without trading off the system’s robustness under exogenous disturbances [52]. It contributes rapid transits with strong damping against disturbances while economizing the control-energy expenditure [71]. A sketch of the resulting weight-adaptation is given below.
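Since the exact PHF compositions of Eqs 51–55 are not reproduced in the extracted text, the sketch below uses sech-based forms (explicitly sanctioned above as admissible substitutes) with an assumed parameter packing; the ρ composition in particular is an assumption that merely follows the stated saturation-avoidance rationale.

```python
import numpy as np

def sech(v):
    return 1.0 / np.cosh(v)

def em_weights(e, de, a, b, g):
    """Assumed sech-based realization of the EM-STR meta-rules (Eqs 51-55).
    a, b, g pack the (upper bound, lower bound, variance) of each function."""
    Q = np.diag([
        b[0] + (a[0] - b[0]) * sech(g[0] * e[0]),   # q_alpha: inflates as |e_alpha| shrinks
        b[1] + (a[1] - b[1]) * sech(g[1] * e[1]),   # q_theta: inflates as |e_theta| shrinks
        a[2] - (a[2] - b[2]) * sech(g[2] * de[0]),  # derivative weight: grows with |de_alpha|
        a[3] - (a[3] - b[3]) * sech(g[3] * de[1]),  # derivative weight: grows with |de_theta|
    ])
    # rho rises with the cumulative error magnitude (assumed composition) to
    # cap the peak servo demands under large overshoots.
    rho = b[4] + (a[4] - b[4]) * (1.0 - sech(g[4] * (abs(e[0]) + abs(e[1]))))
    return Q, np.array([[rho]])
```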

The prescribed bounds of each hyperbolic function are carefully selected so that qx(.)>0 and ρ(.)>0, under every operating condition, to ensure an asymptotically stable control behavior. The hyper-parameters associated with each weight-adjusting function are tuned by iteratively minimizing Jc to attain the best position-regulation accuracy. The tuned parameters are presented in Table 6 [71]. The STR constructed via the error-magnitude driven PHFs is denoted as “EM-STR”.

Table 6. Parameter selection of the error-magnitude-dependent PHFs.

https://doi.org/10.1371/journal.pone.0256750.t006

5. Comparative performance assessment

This section presents a detailed overview of the hardware setup, testing procedure, and comparative experimental analysis of the proposed control schemes.

5.1. Experimental setup

The proposed self-tuning control mechanisms are analyzed by conducting hardware experiments on the QNET RIP hardware setup [62]. The angular displacements, θ and α, are measured in real-time by using the optical rotary encoders that are commissioned on-board the hardware setup; these encoders are installed at the pivot of the pendulum rod and on the motor’s shaft, respectively. The hardware setup uses the NI-ELVIS II data-acquisition board to capture the encoder measurements and digitize them at a sampling rate of 1000 Hz [11]. The digitized measurements are then serially transmitted to the software control routine at 9600 bps. The customized control routine is digitally implemented by using the “Block Diagram” tool as well as the built-in mathematical functions available in the virtual-instrument file of the LabVIEW software. The said software runs on a 2.0 GHz digital computer with 8.0 GB RAM. After every sampling instant, the control routine receives the updated sensor measurements, adjusts the critical controller parameters, and computes the modified control signal. The control routine uses the computer’s built-in real-time clock to plan the successive updates of the weighting-factors after every sampling interval. The front-end of the control software acts as a user interface that records and graphically displays the real-time variations in the state and control-input. The generated control signals are serially transmitted back to the motor-driver circuit that is installed on-board the hardware setup. The driver circuit translates the incoming motor-control signals into pulse-width-modulated commands that are subsequently amplified to actuate the DC motor. The DC motor and its driving circuit, commissioned on the RIP hardware setup, are durable and agile enough to handle the discontinuous control activity contributed by the proposed control schemes. The QNET Rotary Pendulum’s hardware setup is shown in Fig 11.
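The per-sample pipeline described above can be summarized by the following skeleton. The actual routine is a LabVIEW virtual instrument, so this Python rendering is purely illustrative, and the I/O helpers read_encoders() and write_motor(), as well as the passed-in callbacks, are hypothetical placeholders.

```python
import time
import numpy as np

Ts = 0.001  # 1000 Hz sampling interval

def control_loop(read_encoders, adapt_weights, solve_gains, write_motor, Ki):
    err_int = np.zeros(2)                     # integrals of e_alpha, e_theta
    next_tick = time.monotonic()              # plan updates via the real-time clock
    while True:
        x, e, de = read_encoders()            # updated digitized measurements
        Q, R = adapt_weights(e, de)           # self-tune the QPI weighting-factors
        K = solve_gains(Q, R)                 # re-solve the ARE for fresh gains
        err_int += np.asarray(e) * Ts         # accumulate state-error integrals
        u = float(-K @ x + Ki @ err_int)      # STR law with fixed integral gains
        write_motor(u)                        # serial command to the motor driver
        next_tick += Ts
        time.sleep(max(0.0, next_tick - time.monotonic()))
```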

5.2. Tests and results

The position-regulation and disturbance-compensation capability of the proposed adaptive control schemes are compared by conducting five unique “hardware-in-the-loop” experiments on the QNET pendulum setup. The time-domain state and control-input variations are recorded for comparative analysis. The graphical results pertaining to θ and α are depicted in degrees (or deg.) to simplify the visual understanding. The detailed testing procedures along with the corresponding graphical results are presented as follows:

  1. Reference tracking: The position-regulation behavior of the pendulum under normal conditions is analyzed by allowing the rod and the arm to track their respective reference positions. The variations in θ(t), α(t), Vm(t), and K(t) are shown in Fig 12.
  2. Impulsive-disturbance compensation: The controller’s ability to compensate for the impact of bounded impulsive disturbances is examined by applying a pulse-signal in the Vm(t) profile to perturb the state-response(s). The applied pulse has a time-duration of 100.0 ms and a peak-magnitude of -5.0 V. The pulse signal is injected into the control response at discrete intervals. The resulting variations in θ(t), α(t), Vm(t), and K(t) are shown in Fig 13.
  3. Step-disturbance attenuation: The controller’s ability to attenuate random exogenous torques is assessed by injecting a -5.0 V step-disturbance signal in the Vm(t) profile at the t ≈ 5.0 s mark. The behavior of θ(t), α(t), Vm(t), and K(t) is illustrated in Fig 14.
  4. Noise suppression: The controller’s ability to suppress the chattering and control-input ripples induced by lumped disturbances, measurement noise, or the hysteresis contributed by parasitic impedances in electronic components is analyzed by injecting a low-amplitude, high-frequency sinusoidal signal, d(t) = 1.5 sin(20πt), into the system’s control-input voltage, Vm(t). The time-domain profiles of θ(t), α(t), Vm(t), and K(t) are depicted in Fig 15.
  5. Model-error rejection: The controller’s ability to reject identification errors and real-time model variations is evaluated by changing the pendulum arm’s mass to modify the coefficients of the state and input matrices of the system’s model, expressed in Section 2. This modification is realized by attaching a 0.10 kg metallic mass beneath the pendulum’s arm via a hook, as shown in Fig 11, at the t ≈ 5.0 s mark. This modification abruptly changes the coefficients of the system’s model during the experiment, and thus induces perturbations in the pendulum’s response. The behavior of θ(t), α(t), Vm(t), and K(t) is illustrated in Fig 16.
Fig 13. Pendulum’s response under impulsive disturbances.

https://doi.org/10.1371/journal.pone.0256750.g013

Fig 15. Pendulum’s response under sinusoidal disturbance.

https://doi.org/10.1371/journal.pone.0256750.g015

5.3. Analysis and discussions

The quantitative analysis of the experimental results is done with the aid of the following seven Key-Performance-Indicators (KPIs):

  1. The root-mean-squared value of error (RMSEx) in the pendulum angle response(s).
  2. The mean-squared value of the applied DC motor voltage (MSVm).
  3. The magnitude of the peak overshoot (OSθ) observed in θ(t).
  4. The time taken by the pendulum’s rod (tset) to settle within ±2% of the reference after a disturbance.
  5. The disturbance-induced angular offset in the arm’s position (αoffset).
  6. The peak-to-peak amplitude of the disturbance-induced fluctuations in the arm’s position (αpp).
  7. The magnitude of peak motor voltage (Vm,p).

The aforementioned KPIs are used as standard performance measures in the available literature to critically analyze the position-regulation behavior, disturbance-rejection capability, and control energy requirements of the system [15, 64]. The experimental results, expressed in terms of the aforementioned KPIs, are summarized in Table 7. The proposed control schemes remain stable under every disturbance condition. The results clearly indicate that the generic LQR underperforms as compared to the adaptive controller variants in every test case. The ADoS-STR exhibits a moderately better position-regulation behavior as compared to the generic LQR. Its control-input economy is relatively better than that of the ACWF-STR in every test-case. The ACWF-STR manifests a significant improvement in robustness but also renders a highly discontinuous control activity, which contributes to chattering in the response of θ(t).

The EP-STR exhibits a time-optimal behavior as compared to the ACWF-STR and the ADoS-STR. Apart from contributing enhanced disturbance-rejection, it delivers better control-input efficiency than the other STR variants while maintaining the system’s asymptotic-stability throughout the operating regime. However, its time-domain performance is inferior to that of the EM-STR, especially under the testing scenarios of Tests A, C, and E. The EM-STR demonstrates a significant enhancement in the disturbance-compensation capability and position-regulation accuracy as compared to the EP-STR. However, amid transient disturbances, the EM-STR consumes relatively large control energy and exhibits large peaks in the control-voltage profile. A concise qualitative analysis of the performances of the proposed STR variants is summarized as follows:

In Test-A, the RIP exhibits the largest deviations in the angular responses under the influence of the LQR. The deviations in the responses of θ and α progressively reduce as the nominal LQR is retrofitted with enhanced adaptation mechanisms. The EM-STR shows optimum position-regulation accuracy with minimum reference-tracking error, minimum chattering, and reasonably low control-energy consumption as compared to the other adaptive controller variants (except for the EP-STR). The EP-STR shows the second-best position-regulation performance and the best control-energy expenditure amongst the STR variants. The pendulum response of the ADoS-STR shows a fixed offset of 0.3 deg. from the vertical reference throughout the experimental trial. The ACWF-STR shows persistent chattering in θ(t).

In Test-B, the baseline integral-augmented LQR demonstrates the slowest transient-recovery and insufficient damping against the impulsive disturbances. It demonstrates the largest magnitude of the peak overshoot in the pendulum’s response, which is followed by persistent steady-state oscillations. The ACWF-STR continues to exhibit a highly discontinuous control activity. The EP-STR exhibits the minimum transient-recovery time to effectively attenuate the oscillations, and it shows the minimum OSθ while attenuating the impulsive disturbances. The EP-STR also consumes the minimum average control-input energy (MSVm), and its peak servo requirements are much smaller than those of the EM-STR. The EM-STR shows minimum steady-state fluctuations upon convergence, owing to its error-magnitude-driven self-tuning capability.

In Test-C, the step-disturbance permanently displaces the arm from its reference position. The LQR manifests the largest post-disturbance displacement in the arm’s position and large oscillations in the rod. The intermediate STR variants demonstrate moderately better transient-recovery behavior with reasonable damping against the oscillations. The EM-STR, however, effectively suppresses the influence of the applied step-disturbance by contributing the minimum RMSE and the minimum offset in the nominal positions of the pendulum and the arm, respectively. It exhibits the minimum αoffset and the minimum peak-to-peak amplitude of the oscillations in the pendulum’s response, θ(t). Furthermore, the EM-STR contributes a slightly better control-input economy as compared to the EP-STR. The ADoS-STR exhibits the most economical control-input behavior in this test-case.

In Test-D, the EP-STR effectively attenuates the ripples induced in the response by the noise. Despite the noise, the EP-STR-controlled system manages to regulate the pendulum at the desired reference position(s) with the minimum RMSE and the minimum control-voltage requirements. The EM-STR exhibits the second-best time-domain behavior in terms of control-energy expenditure and position-regulation accuracy.

In Test-E, the EM-STR again surpasses the other STR variants compared in this article. It robustly compensates the perturbations induced by the modeling error by delivering strong damping against the oscillations in the state responses, thereby minimizing the reference-tracking error as well as the control-energy consumption. The EM-STR effectively attenuates the peak-to-peak amplitude of the post-disturbance oscillations in the state responses, and its control activity is relatively smoother than that of the EP-STR. The EP-STR exhibits the minimum RMSE in the pendulum's angular profile, θ(t), while the ADoS-STR exhibits the most economical control-input behavior in this test case.

From a functional point of view, the state-feedback gains associated with the EM-STR respond and adapt to real-time state variations relatively quickly. Unlike those of the other STR variants, the abrupt yet small variations of the EM-STR gains account for its enhanced adaptability, robustness, and smoother control activity under exogenous disturbances. This flexibility is attributed to the dynamic self-adjustment of all the weighting-factors associated with the ARE. The enhanced adaptability of the EM-STR comes at the cost of tuning a relatively large number of hyper-parameters compared to the other adaptation mechanisms discussed here; however, the resulting performance improvement outweighs this drawback.
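To make this mechanism concrete, the following minimal Python sketch illustrates an error-magnitude-driven update of the QPI weighting-factors followed by a per-sample re-solution of the ARE, in the spirit of the EM-STR. The hyperbolic function forms, the calibration constants, and the fourth-order state dimension are assumptions chosen for illustration; they do not reproduce the exact formulation calibrated in the experiments.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Hypothetical calibration constants: nominal state weights, weight spans,
# control-weight bounds, and hyperbolic variances (all assumed for illustration).
Q_MIN = np.diag([1.0, 1.0, 0.1, 0.1])
Q_SPAN = np.diag([50.0, 50.0, 5.0, 5.0])
R_MIN, R_SPAN, SIGMA_R = 0.05, 0.5, 0.2
SIGMA_X = np.array([0.3, 0.2, 1.0, 1.0])     # per-state error variances

def em_str_gain(A, B, x_err):
    """Recompute the state-feedback gain for one sampling interval, scaling
    the QPI weighting-factors by hyperbolic functions of the state-error
    magnitudes (an EM-STR-style update)."""
    # State weights grow with |error| via tanh saturation (form assumed).
    s = np.tanh(np.abs(x_err) / SIGMA_X)
    Q = Q_MIN + Q_SPAN * np.diag(s)
    # Control weight shrinks with the error magnitude via sech, relaxing the
    # input penalty when a large corrective effort is needed (form assumed).
    r = R_MIN + R_SPAN / np.cosh(np.linalg.norm(x_err) / SIGMA_R)
    R = np.array([[r]])
    P = solve_continuous_are(A, B, Q, R)     # re-solve the ARE with updated weights
    K = np.linalg.solve(R, B.T @ P)          # time-varying state-feedback gain
    return K
```

At every sampling interval, the returned gain would be applied as u = −Kx; the weight bounds and hyperbolic variances must be pre-calibrated offline, as noted earlier.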

The experimental analysis validates the superior position-regulation accuracy and enhanced robustness of the EM-STR in almost every test case. It manifests better adaptability under perturbed conditions than the other STR variants, and it effectively removes the inherent shortcomings of the other adaptation mechanisms, which enables it to flexibly steer the control trajectory. However, the EM-STR does consume more control energy than the other controllers in almost every test case. The EP-STR shows the second-best time-domain performance after the EM-STR.

The proposed hierarchical control procedure is highly scalable. Each controller variant exhibits a certain degree of resilience against the aforementioned disturbance scenarios. In the future, however, the proposed control procedure can also be augmented with auxiliary neuro-fuzzy adaptive compensators, as suggested in [72, 73], to effectively handle the hardware limits imposed on under-actuated systems, such as input and actuator dead-zones, limit cycles, and parametric uncertainties associated with the system's actuated and un-actuated state variables.

The constitution of the proposed hierarchical control procedure only requires the a priori identification of the system's linear state-space model and the pre-calibrated weight-adjusting functions. Thus, apart from self-stabilizing mechatronic platforms, the proposed control schemes can be easily extended to flexible-joint robotic manipulators and other classes of under-actuated systems as well [74].
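As a rough illustration of this portability, the same self-tuning routine (the em_str_gain sketch above) could be reused unchanged on any identified linear model; the (A, B) pair below is a numerical placeholder, not the identified model of an actual flexible-joint manipulator.

```python
import numpy as np

# Placeholder linear model of a hypothetical fourth-order plant (illustrative only).
A = np.array([[0.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0, 1.0],
              [-2.0, -3.0, -4.0, -1.0]])
B = np.array([[0.0], [0.0], [0.0], [1.0]])

x = np.array([0.1, 0.0, 0.05, 0.0])   # current state sample
K = em_str_gain(A, B, -x)             # error about the zero reference; reuses the sketch above
u = float(-K @ x)                     # control input for this sampling interval
```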

6. Conclusion

This paper presents a comparative performance assessment of four state-error-driven hierarchical adaptive control strategies that enhance the disturbance-rejection capability of closed-loop under-actuated mechatronic systems. Each adaptation mechanism dynamically reconfigures the constituents of the Riccati equation in an innovative manner to self-tune the state-feedback gains of the LQR. The proposed architecture delivers adaptive actions in real-time without explicitly relying on the estimation of state-dependent coefficients in the system's state-space model, which makes it highly scalable and computationally economical. The improvement in time-domain performance and robustness imparted by each self-tuning regulator discussed in this article is analyzed under practical disturbance scenarios by conducting real-time hardware experiments on the QNET rotary pendulum system. The experimental outcomes validate the superior robustness and position-regulation accuracy of the EM-STR scheme in almost every test case. It is a resourceful scheme that utilizes the full state-error feedback to self-adjust the state- and control-input weighting-factors of the QPI online; its ability to self-mutate in real-time increases the controller's degrees-of-freedom, which enhances the system's response speed and damping against disturbances. The EP-STR delivers the second-best time-domain performance while maintaining a reasonable control-input economy, and it surpasses the EM-STR under the impulsive-disturbance and measurement-noise scenarios. In the future, the performance of the proposed control scheme can be further investigated by employing expert adaptive systems driven by soft-computing techniques. The proposed reconfiguration schemes can also be enhanced by self-regulating the variances and exponents of the hyperbolic functions. The feasibility of the proposed controller(s) can also be analyzed by extending them to other mechatronic systems.

References

  1. Mahmoud MS. Advanced Control Design with Application to Electromechanical Systems. 1st ed. Netherlands: Elsevier Science; 2018.
  2. Szuster M, Hendzel Z. Intelligent Optimal Adaptive Control for Mechatronic Systems. Cham: Springer; 2017.
  3. Krafes S, Chalh Z, Saka A. A Review on the Control of Second Order Underactuated Mechanical Systems. Complexity. 2018; Article ID 9573514: 1–18.
  4. An-chyau H, Yung-feng C, Chen-yu K. Adaptive Control of Underactuated Mechanical Systems. Singapore: World Scientific; 2015.
  5. Gritli H, Belghith S. Robust feedback control of the underactuated Inertia Wheel Inverted Pendulum under parametric uncertainties and subject to external disturbances: LMI formulation. J Franklin Inst. 2018; 355(18): 9150–9191.
  6. Peternel L, Noda T, Petrič T, Ude A, Morimoto J, Babič J. Adaptive Control of Exoskeleton Robots for Periodic Assistive Behaviours Based on EMG Feedback Minimisation. PLoS ONE. 2016; 11(2): 1–26. pmid:26881743
  7. Fukuda T, Hasegawa Y. Mechanism and control of mechatronic system with higher degrees of freedom. Annu Rev Control. 2004; 28(2): 137–155.
  8. Chan RPM, Stol KA, Halkyard CR. Review of modelling and control of two-wheeled robots. Annu Rev Control. 2013; 37(1): 89–103.
  9. Odili JB, Mohmad Kahar MN, Noraziah A. Parameters-tuning of PID controller for automatic voltage regulators using the African buffalo optimization. PLoS ONE. 2017; 12(4): 1–17. pmid:28441390
  10. Jeng JC, Ge GP. Disturbance-rejection-based tuning of proportional–integral–derivative controllers by exploiting closed-loop plant data. ISA Trans. 2016; 62: 312–324. pmid:26922494
  11. Saleem O, Mahmood-ul-Hasan K. Robust stabilisation of rotary inverted pendulum using intelligently optimised nonlinear self-adaptive dual fractional order PD controllers. Int J Syst Sci. 2019; 50(7): 1399–1414.
  12. Ahmed BS, Sahib MA, Gambardella LM, Afzal W, Zamli KZ. Optimum Design of PIλDμ Controller for an Automatic Voltage Regulator System Using Combinatorial Test Design. PLoS ONE. 2016; 11(11): 1–20. pmid:27829025
  13. Tang Y, Zhou D, Jiang W. A New Fuzzy-Evidential Controller for Stabilization of the Planar Inverted Pendulum System. PLoS ONE. 2016; 11(8): 1–16. pmid:27482707
  14. Bhatti OS, Tariq OB, Manzar A, Khan OA. Adaptive intelligent cascade control of a ball-riding robot for optimal balancing and station-keeping. Adv Robot. 2018; 32(2): 63–76.
  15. Awais M, Khan L, Ahmad S, Mumtaz S, Badar R. Nonlinear adaptive NeuroFuzzy feedback linearization based MPPT control schemes for photovoltaic system in microgrid. PLoS ONE. 2020; 15(6): 1–36. pmid:32603382
  16. Casellato C, Antonietti A, Garrido JA, Carrillo RR, Luque NR, et al. Adaptive Robotic Control Driven by a Versatile Spiking Cerebellar Network. PLoS ONE. 2014; 9(11): 1–17. pmid:25390365
  17. Lewis FL, Vrabie D, Syrmos VL. Optimal Control. New Jersey: John Wiley and Sons; 2012.
  18. Saleem O, Rizwan M, Ahmad M. Augmented Linear Quadratic Regulator for Enhanced Output-Voltage Control of DC-DC Buck Converter. Control Eng Appl Inform. 2018; 20(4): 40–49.
  19. Prasad LB, Tyagi B, Gupta HA. Optimal Control of Nonlinear Inverted Pendulum System Using PID Controller and LQR: Performance Analysis Without and With Disturbance Input. Int J Autom Comput. 2014; 11(6): 661–670.
  20. Ghartemani MK, Khajehoddin SA, Jain P, Bakhshai A. Linear quadratic output tracking and disturbance rejection. Int J Control. 2011; 84(8): 1442–1449.
  21. Azimi SM, Naghizadeh RA, Kian AR. Optimal Controller Design for Interconnected Power Networks with Predetermined Degree of Stability. IEEE Syst J. 2019; 13(3): 3165–3175.
  22. Xue D, Chen YQ, Atherton DP. Linear Feedback Control: Analysis and Design with MATLAB. Philadelphia: SIAM; 2007.
  23. Sun M, Liu J, Nian X, Dai L. The design of delay-dependent wide-area PSS for interconnected power system with prescribed degree of stability σ. Proceedings of the 35th Chinese Control Conference; 2016 Jul 27–29; Chengdu, China. New York: IEEE; 2016.
  24. Radisavljevic V, Koskie S. Suboptimal strategy for the finite-time linear-quadratic optimal control problem. IET Control Theor Appl. 2017; 6(10): 1516–1521.
  25. Gritli H, Belghith S. LMI-based synthesis of a robust saturated controller for an underactuated mechanical system subject to motion constraints. Eur J Control. 2021; 57: 179–193.
  26. Chen H, Sun N. Nonlinear Control of Underactuated Systems Subject to both Actuated and Unactuated State Constraints with Experimental Verification. IEEE Trans Ind Electron. 2020; 67(9): 7702–7714.
  27. Anjum W, Husain AR, Abdul Aziz J, Abbasi MA, Alqaraghuli H. Continuous dynamic sliding mode control strategy of PWM based voltage source inverter under load variations. PLoS ONE. 2020; 15(2): 1–20. pmid:32027697
  28. Jin S, Bak J, Kim J, Seo T, Kim HS. Switching PD-based sliding mode control for hovering of a tilting-thruster underwater robot. PLoS ONE. 2018; 13(3): 1–16. pmid:29547650
  29. Psillakis HE. Integrator backstepping with the nonlinear PI method: An integral equation approach. Eur J Control. 2016; 28: 49–55.
  30. Zhang X, Jiang W, Li Z, Song S. A hierarchical Lyapunov-based cascade adaptive control scheme for lower-limb exoskeleton. Eur J Control. 2019; 50: 198–208.
  31. Alagoz BB, Kavuran G, Ates A, Yeroglu C. Reference-shaping adaptive control by using gradient descent optimizers. PLoS ONE. 2017; 12(11): 1–20. pmid:29186173
  32. Smith AMC, Yang C, Ma H, Culverhouse P, Cangelosi A, Burdet E. Novel Hybrid Adaptive Controller for Manipulation in Complex Perturbation Environments. PLoS ONE. 2015; 10(6): 1–19. pmid:26029916
  33. Li SP. Adaptive control with optimal tracking performance. Int J Syst Sci. 2018; 49(3): 496–510.
  34. Gruenwald B, Yucelen T. On transient performance improvement of adaptive control architectures. Int J Control. 2015; 88(11): 2305–2315.
  35. Bai Y, Biggs JD, Wang X, Cui N. A singular adaptive attitude control with active disturbance rejection. Eur J Control. 2017; 35: 50–56.
  36. Saleem O, Rizwan M, Zeb AA, Hanan A, Saleem A. Online adaptive PID tracking control of an aero-pendulum using PSO-scaled fuzzy gain adjustment mechanism. Soft Comput. 2020; 24: 10629–10643.
  37. Zhang D, Wei B. A review on model reference adaptive control of robotic manipulators. Annu Rev Control. 2017; 43: 188–198.
  38. Saleem O, Rizwan M, Mahmood-ul-Hasan K, Ahmad M. Performance Enhancement of Multivariable Model-Reference Optimal Adaptive Motor Speed Controller using Error-Dependent Hyperbolic Gain Functions. Automatika. 2020; 61(1): 117–131.
  39. Maity A, Höcht L, Holzapfel F. Time-varying parameter model reference adaptive control and its application to aircraft. Eur J Control. 2019; 50: 161–175.
  40. Dian S, Chen L, Hoang S, Pu M, Liu J. Dynamic Balance Control Based on an Adaptive Gain-scheduled Backstepping Scheme for Power-line Inspection Robots. IEEE/CAA J Automatica Sinica. 2019; 6(1): 198–208.
  41. Ahmed MF, Dorrah HT. Design of gain schedule fractional PID control for nonlinear thrust vector control missile with uncertainty. Automatika. 2018; 59(3–4): 357–372.
  42. Önkol M, Kasnakoğlu C. Adaptive Model Predictive Control of a Two-wheeled Robot Manipulator with Varying Mass. Meas Control. 2018; 51(1–2): 38–56.
  43. Goodwin GC, Middleton RH, Seron MM, Campos B. Application of nonlinear model predictive control to an industrial induction heating furnace. Annu Rev Control. 2013; 37(2): 271–277.
  44. Çimen T. Systematic and effective design of nonlinear feedback controllers via the state-dependent Riccati equation (SDRE) method. Annu Rev Control. 2010; 34(1): 32–51.
  45. Batmani Y, Davoodi M, Meskin N. Nonlinear Suboptimal Tracking Controller Design Using State-Dependent Riccati Equation Technique. IEEE Trans Control Syst Technol. 2017; 25(5): 1833–1839.
  46. Cheng D, Zhang L. Adaptive control of linear Markov jump systems. Int J Syst Sci. 2016; 37(7): 477–483.
  47. Qi W, Gao X. L1 control for positive Markovian jump systems with partly known transition rates. Int J Control Autom Syst. 2017; 15: 274–280.
  48. Ma Z, Fang Y, Zheng H, Liu L. Active Disturbance Rejection Control with Self-Adjusting Parameters for Vibration Displacement System of Continuous Casting Mold. IEEE Access. 2019; 7: 52498–52507.
  49. Abdul-Adheem WR, Ibraheem IK. From PID to Nonlinear State Error Feedback Controller. Int J Adv Comput Sci Appl. 2017; 8(1): 312–322.
  50. Humaidi AJ, Ibraheem IK. Speed Control of Permanent Magnet DC Motor with Friction and Measurement Noise Using Novel Nonlinear Extended State Observer-Based Anti-Disturbance Control. Energies. 2019; 12(9): 1–22.
  51. Saleem O, Mahmood-ul-Hasan K. Adaptive collaborative speed control of PMDC motor using hyperbolic secant functions and particle swarm optimization. Turkish J Electr Eng Comput Sci. 2018; 26(3): 1612–1622.
  52. Shang WW, Cong S, Li ZX, Jiang SL. Augmented Nonlinear PD Controller for a Redundantly Actuated Parallel Manipulator. Adv Robot. 2009; 12(12–13): 1725–1742.
  53. Saleem O, Omer U. EKF-based self-regulation of an adaptive nonlinear PI speed controller for a DC motor. Turkish J Electr Eng Comput Sci. 2017; 25(5): 4131–4141.
  54. Armstrong B, Wade B. Nonlinear PID control with partial state knowledge: damping without derivatives. Int J Robot Res. 2000; 19(8): 715–731.
  55. Hui L, Xingqiao L, Jing L. The research of Fuzzy Immune Linear Active Disturbance Rejection Control Strategy for three-motor synchronous system. Control Eng Appl Inform. 2015; 17(4): 50–58.
  56. Lee YJ, Cho HC, Lee KS. Immune algorithm based active PID control for structure systems. J Mech Sci Technol. 2006; 20: 1823–1833.
  57. Basu B, Nagarajaiah S. A wavelet-based time-varying adaptive LQR algorithm for structural control. Eng Struct. 2008; 30: 2470–2477.
  58. Zhang H, Wang J, Lu G. Self-organizing fuzzy optimal control for under-actuated systems. J Syst Control Eng. 2014; 228(8): 578–590.
  59. Saleem O, Mahmood-ul-Hasan K. Indirect Adaptive State-Feedback Control of Rotary Inverted Pendulum Using Self-Mutating Hyperbolic-Functions for Online Cost Variation. IEEE Access. 2020; 8(1): 91236–91247.
  60. Boubaker O, Iriarte R. The Inverted Pendulum in Control Theory and Robotics: From Theory to New Innovations. Stevenage: Institution of Engineering and Technology; 2017.
  61. Kennedy E, King E, Tran H. Real-time implementation and analysis of a modified energy based controller for the swing-up of an inverted pendulum on a cart. Eur J Control. 2019; 50: 176–187.
  62. Balamurugan S, Venkatesh P. Fuzzy sliding-mode control with low pass filter to reduce chattering effect: an experimental validation on Quanser SRIP. Sadhana. 2017; 42(10): 1693–1703.
  63. Saleem O, Rizwan M. Performance optimization of LQR-based PID controller for DC-DC buck converter via iterative-learning-tuning of state-weighting matrix. Int J Numer Model. 2019; 32(3): 1–17.
  64. Saleem O, Rizwan M, Mahmood-ul-Hasan K. Self-tuning State-Feedback Control of a Rotary Pendulum System using Adjustable Degree-of-Stability Design. Automatika. 2021; 62(1): 84–97.
  65. Filip I, Szeidert I. Tuning the control penalty factor of a minimum variance adaptive controller. Eur J Control. 2017; 37: 16–26.
  66. Filip I, Vasar C, Szeidert I, Prostean O. Self-tuning strategy for a minimum variance control system of a highly disturbed process. Eur J Control. 2019; 46: 49–62.
  67. Filip I, Dragan F, Szeidert I, Albu A. Minimum-Variance Control System with Variable Control Penalty Factor. Appl Sci. 2020; 10(7): 1–21.
  68. Saleem O, Mahmood-ul-Hasan K, Rizwan M. Self-Tuning State-Feedback Control of Rotary Pendulum via Online Adaptive Reconfiguration of Control Penalty-Factor. Control Eng Appl Inform. 2020; 22(4): 1–11.
  69. Isayed BM, Hawwa MA. A nonlinear PID control scheme for hard disk drive servo systems. Proceedings of the 2007 IEEE Mediterranean Conference on Control & Automation; 2007 Jun 27–29; Athens, Greece. New York: IEEE; 2008.
  70. Bucklaew TP, Liu CS. A New Nonlinear Gain Structure for PD-Type Controllers in Robotic Applications. J Robot Syst. 1999; 16(11): 627–649.
  71. Saleem O, Mahmood-ul-Hasan K. Adaptive State-space Control of Under-actuated Systems Using Error-magnitude Dependent Self-tuning of Cost Weighting-factors. Int J Control Autom Syst. 2021; 19(3): 931–941.
  72. Yang T, Sun N, Chen H, Fang Y. Neural Network-Based Adaptive Antiswing Control of an Underactuated Ship-Mounted Crane With Roll Motions and Input Dead Zones. IEEE Trans Neural Netw Learn Syst. 2020; 31(3): 901–914. pmid:31059458
  73. Yang T, Sun N, Chen H, Fang Y. Adaptive Fuzzy Control for a Class of MIMO Underactuated Systems With Plant Uncertainties and Actuator Deadzones: Design and Experiments. IEEE Trans Cybern. 2021: 1–14. pmid:33531326
  74. Diao S, Sun W, Su SF, Xia J. Adaptive Fuzzy Event-Triggered Control for Single-Link Flexible-Joint Robots With Actuator Failures. IEEE Trans Cybern. 2021: 1–14. pmid:33502994