Multi-objective gain optimizer for a multi-input active disturbance rejection controller: Application to series elastic actuators

Series elastic actuators (SEAs) have been gaining popularity as a mechanical drive in contemporary force-controlled robotic manipulators thanks to their ability to infer the applied torque from measurements of the elastic element's deflection. Accurate deflection control is crucial to achieving a desired output torque; therefore, unmodelled dynamics and dynamic loads can severely compromise force fidelity. Multi-input active disturbance rejection controllers (ADRC) can estimate such disturbances affecting the plant behaviour and cancel them via an appropriate feedback controller, and thus offer a promising control architecture for SEAs. ADRC, however, can have upwards of eight tuning parameters for each controlled state, so tuning the controller becomes quite challenging, especially in the context of multi-input, multi-objective control. This paper tackles the problem of ADRC tuning as a multi-parametric and multi-objective optimization approach. An ADRC is developed to regulate the output torque of a multi-input hybrid motor-brake-clutch SEA. The controller has a total of 22 tunable parameters. A reference point dominance-based nondominated sorting genetic algorithm is used to find the optimal control gains, first considering nine individual control objectives, and then in a multi-objective context. The algorithm provides a set of potential solutions that highlight the tradeoffs between the control objectives; it is then up to the designer to select the solution that best suits a given application. The approach is validated experimentally and the results are compared with a simulated model. Experimental results confirm the suitability of the proposed approach for single and multiple control objectives in a variety of experimental scenarios and show good agreement with the analytical model.


Introduction
Series elastic actuators (SEAs) with passive compliance are gaining ground in the field of collaborative human-robot systems. They incorporate an elastic element between the mechanical drive and the robot arm, trading off bandwidth for gains in stability, force control, and robustness against shock loads (DeBoon, Nokleby, La Delfa, & Rossa, 2019). As a result of the inherent complexity of the actuation system, the SEA control architecture must be adapted to take into account unmodelled dynamics and other disturbances. SEAs often use model-based force controllers, which presume the controller has significant background knowledge of the plant. In many cases, such as in the context of human-robot interaction, creating the model is not feasible or, if time is of the essence, too resource-intensive. This is where active disturbance rejection controllers (ADRC) flourish, since they are error based and the exact plant model need not be known. ADRC is a viable substitute for the familiar proportional-integral-derivative (PID) controller where a more robust control strategy is necessary (Ahi & Nobakhti, 2017; Wu, Sun, & Lee, 2017; Xing, Jeon, Park, & Oh, 2013; Zhao & Guo, 2015). PID controllers have three tuning parameters, each with well-defined properties. ADRC, however, can have upwards of eight tuning parameters for each controlled state. Therefore, tuning the controller becomes challenging and depends entirely on the control objective for a given application. Single-objective genetic algorithms (GAs) and other stochastic methods have been used to tune ADRC gains in a variety of applications ranging from force and temperature control to rocket position control (Geng, Yang, Zhang, & Chen, 2010; Hou, Wang, Gong, & Zhang, 2018; Hu, Zhang, & Liu, 2013; Li et al., 2018; Wang, Lu, Hou, & Gao, 2018; Zhang, Fan, Zhao, Ai, & Gong, 2014).
Many optimization and convergence studies have been conducted on tuning the tracking differentiator and extended state observer gains for ADRCs (Wang, Zu, Duan, & Li, 2019; Zhang, Xiao, Yu, & Xie, 2020; Zhang, Xu, & Gerada, 2019). Optimizing a controller, however, is often not a single-objective task. ADRC gains have conflicting implications for the controller performance and, as such, simultaneously optimizing several control objectives, such as rise time, settling time, overshoot, control effort, and tracking error, is impractical with single-objective methods. In the majority of design problems, and in particular in human-machine interaction systems, many of these control objectives need to be taken into account and balanced.

https://doi.org/10.1016/j.conengprac.2021.104733 Received 4 May 2020; Received in revised form 24 November 2020; Accepted 9 January 2021. 0967-0661/© 2021 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
A more suitable approach to automate the tuning of an ADRC must incorporate multiple control objectives, giving rise to a new challenge: there may be more than a single set of parameters that satisfies the objectives (Madoński, Piosik, & Herman, 2013). Several algorithms have been developed to address this issue, including nondominated sorting genetic algorithms (NSGA-II) and strength pareto evolutionary algorithms (SPEA2). In problems with an increased number of objectives, the performance of these algorithms is known to deteriorate: because most solutions are nondominated with respect to each other, their behaviour becomes similar to randomly exploring the search space (Deb & Saxena, 2006; Kalyanmoy et al., 2001). For this reason, other solvers capable of handling many-objective problems are needed.
In all NSGA solvers, a solution that performs better than another in at least one objective, and no worse in any other objective, is said to be dominant. In order to tune the ADRC, nine objectives will be defined later on. With such a large number of objectives, most solutions will be uniquely optimal for a given objective, which prevents the algorithm from converging. This issue can be resolved using reference point domination to improve the diversity of solutions along the pareto front (Deb & Jain, 2014; Hernandez Mejia et al., 2017) by forcing them to distribute along the search space (Ciro, Dugardin, Yalaoui, & Kelly, 2016). This concept, embodied in NSGA-III, is further expanded in Yuan, Xu, and Wang (2014) to push solutions closer to the pareto front. Preference incorporation is integrated in Elarbi, Bechikh, Gupta, Said, and Ong (2018) to create a new algorithm, RPD-NSGA-II, that further improves the convergence and diversity of the solutions. This method is shown in Elarbi et al. (2018) to provide similar or better results when compared against its predecessor genetic algorithms on commonly used benchmark problems involving up to 20 objectives. Thus, it is selected for the multi-parametric and multi-objective problem of ADRC tuning presented in this paper. This paper tackles the problem of ADRC tuning as a multi-parametric and multi-objective optimization approach applied to SEAs for human-machine interaction. An ADRC is developed to regulate the output torque of a multi-input SEA comprising a brake and a motor. Torque control is achieved indirectly by monitoring the deflection of the elastic element and adjusting it via a distributed control law for the brake and motor torques, rendering the system a multi-input, single-output (MISO) system. The ADRC has 22 tuning parameters, 11 for the motor and 11 for the brake controller.
One of the contributions of this paper is using the reference point dominance-based nondominated sorting genetic algorithm to find the optimal control gains, first for nine individual control objectives, and then for all control objectives at once. Rather than converging to a single solution, the algorithm provides a set of optimal solutions that highlight the tradeoff between different control objectives. A solution that best suits a given application can then be selected. Another contribution of this paper is the optimization of a dual-input SEA controller. To the best of the authors' knowledge, this is the first implementation of a multi-parametric and multi-objective optimizer of ADRC gains applied specifically to a dual-input SEA. The framework proposed in this paper has applications in a variety of other control architectures and systems.
The remainder of this paper is structured as follows. First, the MISO SEA is introduced and its state-space model is derived. The multi-input ADRC is then laid out, including the extended state observers and the distributive control law for the brake and motor. The optimization algorithm is introduced in Section 4 along with all performance objectives, namely, tracking error, control effort, percent overshoot, rise time, settling time, maximum input, steady-state error, disengagement time, and the number of input direction changes. Experimental results confirm the suitability of the proposed approach for single and multiple control objectives in a variety of experimental scenarios and show good agreement with the analytical model.

MISO series elastic actuator model
Consider the differentially-clutched SEA we introduced in DeBoon et al. (2019). The actuator is composed of a DC motor in series with a spring, connected to a magnetic particle brake through a differential clutch. A simplified dynamics diagram and a picture of the device are shown in Fig. 1. The actuator has three operating modes. Mode 1: The brake is fully engaged and the actuator acts like a classical SEA. Mode 2: The brake is engaged and the motor is static. The user is then directly coupled to a grounded spring whose stored energy is controlled by adjusting the braking torque. Hybrid mode: Both the motor and brake are engaged, and the brake and differential act as a continuous variable-slip clutch between the spring and the output. As the motor compresses the spring, the brake regulates the amount of energy stored in the spring, thereby controlling the output torque.
The actuator described above is a multi-input device that measures spring deflection to infer the output torque at the end-effector. The deflection is determined from the difference between the motor-mounted encoder and the spring-side shaft encoder measurements.
The device dynamics can be summarized by a set of coupled differential equations (DeBoon et al., 2019), where $k$ is the stiffness of the spring, $\theta$ is the angular position, and the dot operators $(\dot{\;})$ and $(\ddot{\;})$ represent the first and second time derivatives, respectively. $J$ and $b$ represent the inertial and viscous friction coefficients, and $\tau$ represents a torque. Throughout this paper, subscripts $m$, $b$, $u$, and $d$ refer to the dynamics in the motor body, brake body, user/output body, and the differential gearbox, respectively. $k_l$ refers to the spring constant of an output measurement device or a passive elastic environment.
$\tau_m$ is the motor torque, which at low speed can be related to the input voltage of the device as $\tau_m = k_t v_m / R_m$, where $k_t$ is the motor torque constant and $R_m$ is the winding resistance of the motor. The brake torque can be modelled as $\tau_b = k_b (v_b / R_b)$, where $v_b$ is the input voltage of the brake and $R_b$ is its winding resistance. This function is naturally nonlinear due to the magnetic hysteresis of the particle brake; in this paper, however, it is simplified to be proportional to the input current through a gain $k_b$. Finally, $\tau_u$ represents the user torque acting against the satellite-side dynamics of the differential gearbox.
For an open differential layout where the torque is split evenly between the planetary gears, the output position is the average of the two input positions, $\theta_u = (\theta_d + \theta_b)/2$ (see Fig. 1a). Note that this relation also holds for the first and second time derivatives, i.e. $\ddot{\theta}_u = (\ddot{\theta}_d + \ddot{\theta}_b)/2$. Combining the previous equations to make the system independent of the position of the user yields two coupled differential equations, one governing the motor side and one governing the brake side. Substituting each into the other decouples them, and the result can be rearranged into the multi-input state-space model $\dot{x} = Ax + Bu$, where the state vector $x$ contains the angular positions and velocities of the motor, brake, and differential bodies, and $u$ is the input vector.

Fig. 1. (a) Simplified dynamics diagram of the actuator. A DC motor is in series with a torsion spring; the spring is located between the motor and the differential, whose other side is attached to a rheological brake. The satellite gears of the differential make up the output shaft of the actuator. (b) Image of the actuator showing the two inputs (motor and brake) as well as the differential clutch mechanism.

When the actuator rotates, the output torque can be calculated from the relative compression $\Delta\theta$ of both sides of the elastic element, its stiffness constant $k$, and all dynamic losses. Substituting the differential law above gives the governing equation for the output torque. Neglecting the minimal inertia in the spring-side and user-side differential bodies, the output torque reduces to a direct function of the spring deflection, $\tau_u \approx k\,\Delta\theta$. With the MISO state-space model known, the active disturbance rejection controller can be developed.
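To make the torque-from-deflection relationship concrete, the following sketch simulates the actuator's Mode 1 (brake fully engaged, so the brake side is grounded) as a single inertia winding a torsion spring. All parameter values are hypothetical placeholders, not the identified values of the physical device:

```python
def simulate_sea_mode1(tau_m, t_end=2.0, dt=1e-3, J_m=0.01, b_m=0.5, k=100.0):
    """Mode 1 of the actuator (brake fully engaged): the motor winds a grounded
    torsion spring, J th'' = tau_m - k th - b th', and the output torque is
    inferred from the spring deflection. Parameters are illustrative only.
    tau_m is a callable returning the commanded motor torque at time t.
    Returns lists of deflection and inferred output torque k * deflection."""
    n = int(t_end / dt)
    th, w = 0.0, 0.0                       # deflection and its rate
    defl, torque = [], []
    for i in range(n):
        t = i * dt
        a = (tau_m(t) - k * th - b_m * w) / J_m   # angular acceleration
        w += a * dt                               # semi-implicit Euler step
        th += w * dt
        defl.append(th)
        torque.append(k * th)
    return defl, torque
```

With a constant motor torque, the deflection settles at $\tau_m/k$ and the inferred output torque converges to the commanded torque, mirroring the assumption that the output torque is proportional to spring deflection.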

Multi-input active disturbance rejection torque controller
The objective of the controller is to provide accurate torque outputs based strictly on measurements of the deflection of the spring. A reasonable estimate of the deflection angle can be extracted from the encoder readings on either side of the spring. Active disturbance rejection control (ADRC) is well suited to this purpose: it is an error-based control method that can compensate for unmodelled disturbances, such as backlash and brake hysteresis, via a time-optimal solution to a reference trajectory designed for non-ideal systems. The output of the controller can be distributed to multiple inputs, as is the case for the actuator described earlier. Convergence of nonlinear ADRCs for multi-input systems is demonstrated in Guo and Zhao (2013).

Reference and transient output torque through spring deflection profile
In the context of torque control for the elastic actuator, the transient profile is an updated reference that is a function of the proportional error $e_1$ and the time-varying error $e_2$ for the single-output system, where $\tau_r$ is the reference torque and $\tau$ is the output torque inferred from the deflection of the elastic element at the current sample. These errors can be inserted into the fhan function from Han (2009), along with an acceleration coefficient $r_0$ and a smoothing factor $h_0$, to produce a desired transient profile as shown in Fig. 2. The fhan function is defined as:

$$\mathrm{fhan}(e_1, e_2, r_0, h_0) = \begin{cases} -r_0\,\mathrm{sign}(a), & |a| > d \\ -r_0\,a/d, & |a| \le d \end{cases}$$

This function creates distinct transient profiles with differing acceleration rates $r_0$ for the same reference. The output of fhan provides a realistic alternative to transients in physical systems, as an input reference such as a Heaviside step function has an infinite derivative at the transient point, which is impossible to recreate. The constants in (16) are (Han, 2009):

$$d = r_0 h_0, \quad d_0 = h_0 d, \quad y = e_1 + h_0 e_2, \quad a_0 = \sqrt{d^2 + 8 r_0 |y|},$$
$$a = \begin{cases} e_2 + \tfrac{1}{2}(a_0 - d)\,\mathrm{sign}(y), & |y| > d_0 \\ e_2 + y/h_0, & |y| \le d_0 \end{cases}$$

where $d$, $d_0$, $y$, $a_0$, and $a$ are intermediate variables. The position and velocity reference commands for each of the subsystems of the multi-input actuator can be derived from the reference torque generated by the desired transient profile mentioned above. Once the desired profile is determined, (14) and (16) can be used to determine the controllable reference position and velocity states. The process is demonstrated for one of the subsystems (motor position and velocity) through the following discrete set of equations:

$$x_1(k+1) = x_1(k) + h\,x_2(k)$$
$$x_2(k+1) = x_2(k) + h\,\mathrm{fhan}\big(e_1(k), x_2(k), r_{10}, h_{10}\big)$$

where $h$ is the sampling period and $r_{10}$ and $h_{10}$ are tuning parameters related to controller aggression and error levelling, respectively. The remaining subsystems can be computed in a similar manner to (19) and (20). This ensures each subsystem of the controller collectively tries to minimize the error $e_1(k)$ of the single output relative to the last sampled state of the actuator $(k-1)$.
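The fhan-based transient profile can be sketched in a few lines. The function below follows Han's (2009) published definition, and the discrete tracking differentiator drives its states toward the reference along a smooth, acceleration-limited path; all gain values here are illustrative:

```python
import math

def fhan(x1, x2, r, h):
    """Han's time-optimal synthesis function (Han, 2009): r limits the
    acceleration of the transient, h is the smoothing factor."""
    d = r * h
    d0 = h * d
    y = x1 + h * x2
    a0 = math.sqrt(d * d + 8.0 * r * abs(y))
    if abs(y) > d0:
        a = x2 + 0.5 * (a0 - d) * math.copysign(1.0, y)
    else:
        a = x2 + y / h
    if abs(a) > d:
        return -r * math.copysign(1.0, a)
    return -r * a / d

def tracking_differentiator(ref, n, h=0.01, r0=10.0, h0=0.01):
    """Discrete tracking differentiator: steps (x1, x2) toward (ref, 0)
    along the acceleration-limited transient generated by fhan."""
    x1, x2 = 0.0, 0.0
    for _ in range(n):
        fh = fhan(x1 - ref, x2, r0, h0)  # uses the current states
        x1 += h * x2
        x2 += h * fh
    return x1, x2
```

Running the differentiator against a unit step reference produces a smooth profile that reaches the reference with zero terminal velocity instead of the step's infinite derivative.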

Extended state observers
Consider the multi-input single-output (MISO) time-varying system described in Section 2, with six state variables defined by the vector $x \in \mathbb{R}^6$, $x = [x_{11}\ x_{12}\ x_{21}\ x_{22}\ x_{31}\ x_{32}]^T$, containing the angular positions and velocities defined in (10). A block diagram of the controller is shown in Fig. 3. In this example, three independent states and their first time derivatives are measurable. Therefore, a total of three system equations for the ADRC can be used. The three subsystems ($i = 1, 2, 3$) and their respective nonlinearities are:

$$\dot{x}_{i1} = x_{i2}, \qquad \dot{x}_{i2} = f_i\big(x_{i1}, x_{i2}, w_i(t)\big) + b_i u_i(t), \qquad y_i = x_{i1}$$

where $f_i$, $i = 1, 2, 3$, are imperfect or nonlinear functions describing the subsystem and any external disturbances captured in $w_i(t)$, $u_i(t)$ is the control input of the subsystem, and $y_i(t)$ is the output, an angular displacement for the multi-input plant described in Section 2. The three local total disturbance terms can be estimated and combined, which provides a description of the subsystem with a common disturbance term. A linear approximation $\bar{b}_i u_i(t)$ for the nonlinear input term allows the subsystem to be extended by a new state $\bar{x}_{i3}$ representing the sum of disturbances; this state and its first time derivative $\dot{\bar{x}}_{i3}$ capture the total disturbance of the subsystem. The controller has three extended state observers (ESO) to determine the angular displacement and velocity of each of the subsystems. The ESO evaluates discrepancies between expected and measured values and estimates the disturbances present in each of the subsystems. The state extension of the ESO provides a means of evaluating nonlinearities around the spring deflection, magnetic hysteresis, static friction, and other unmodelled disturbances. Each ESO is defined as:

$$e_i = z_{i1} - y_i, \quad \dot{z}_{i1} = z_{i2} - \beta_{i1} e_i, \quad \dot{z}_{i2} = z_{i3} - \beta_{i2}\,\mathrm{fal}(e_i, \alpha_1, \delta) + \bar{b}_i u_i(t), \quad \dot{z}_{i3} = -\beta_{i3}\,\mathrm{fal}(e_i, \alpha_2, \delta)$$

where $\beta_{ij}$, $j = 1, 2, 3$, are the observer gains for a dual-integral plant state and $\mathrm{fal}(e_i, \alpha, \delta)$ is a nonlinear error function for subsystem $i$. See Appendix A for the full derivation.
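A minimal sketch of one such extended state observer, applied to a generic double integrator with an unknown constant disturbance. The fal error function follows Han (2009); the plant, gains, and disturbance value are illustrative assumptions, not the paper's identified system:

```python
import math

def fal(e, alpha, delta):
    """Nonlinear error function: linear inside |e| <= delta, |e|^alpha outside."""
    if abs(e) <= delta:
        return e / (delta ** (1.0 - alpha))
    return (abs(e) ** alpha) * math.copysign(1.0, e)

def run_eso(n=2000, h=0.01, b1=100.0, b2=300.0, b3=1000.0, delta=0.1):
    """Simulate a double-integrator plant y'' = f + u with an unknown constant
    disturbance f, and a third-order nonlinear ESO estimating the position (z1),
    velocity (z2), and total disturbance (z3). Gains are illustrative."""
    f, u = -3.0, 0.0             # unknown disturbance, zero control input
    y, v = 0.0, 0.0              # true plant state
    z1, z2, z3 = 0.0, 0.0, 0.0   # observer state
    for _ in range(n):
        e = z1 - y
        z1 += h * (z2 - b1 * e)
        z2 += h * (z3 - b2 * fal(e, 0.5, delta) + u)
        z3 += h * (-b3 * fal(e, 0.25, delta))
        y += h * v               # plant stepped with the same Euler scheme
        v += h * (f + u)
    return z1, z2, z3, y, v
```

After a short transient the extended state converges to the true disturbance, which is what the controller then cancels.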

Control law
The issue surrounding multi-input systems lies in the derivation of the control law for each input. In systems employing a single extended state observer, the control law is often some combination of the proposed control input and a disturbance error correction. In single-input systems, the control law is similar to the following:

$$u(t) = \frac{u_0(t) - \hat{z}_3(t)}{b_0}$$

where $u_0(t)$ comes from the nonlinear feedback combiner detailed below, $\hat{z}_3(t)$ is the estimated total disturbance, and $b_0$ is the input gain. Note that many valid linear or nonlinear controllers for $u_0$ exist. Potential single-input nonlinear operators are suggested in Han (2009), where the proposed control input could be (see Fig. 3):

$$u_0 = \beta_1\,\mathrm{fal}(e_1, \alpha_1, \delta) + \beta_2\,\mathrm{fal}(e_2, \alpha_2, \delta)$$

with $0 < \alpha_1 < 1 < \alpha_2$ being tuning parameters, $\beta_1$ and $\beta_2$ being proportional and derivative control gains, respectively, and the nonlinear function $\mathrm{fal}$ designed to improve convergence time (Han, 2009). The goal of the nonlinear feedback combiner is to converge at a faster rate than a PID controller; it is akin to producing time-varying PD gains. Since static proportional and derivative gains can guarantee convergence over a reasonable range (Wang, Dodds, & Bailey, 1996), tuning the values of $\beta_1$ and $\beta_2$ can maintain this guarantee with the advantage of faster convergence times.

Fig. 3. Block diagram of a simplified version of the controller. The profile generator adjusts the input at unrealistic instantaneous reference transients to improve differential tracking error; the nonlinear feedback combiner aggregates the proportional and time-varying error in the states and proposes an input to the plant. The extended state observer provides a means of estimating and compensating unmodelled disturbances by creating a new state that encapsulates all disturbances in the system. This disturbance is fed into a control law that generates the actual control inputs for the MISO system.
The primary method of controlling the torque in the multi-input single-output (MISO) SEA from Section 2 uses both controllable inputs, i.e., the commanded motor voltage $v_m$ and brake voltage $v_b$, in a distributive manner. The actual control input for the dual-input system is given as shown in Fig. 4, where $u_0$ represents the total control effort of the actuator, and $u_1$ and $u_2$ represent the inputs for the motor and brake, respectively. $b_0$ is a distributive gain consisting of the sum of all subsystem gains $b_i$, $i = 1, 2, 3$. These are tunable proportional parameters that map subsystem error to the causal inputs. A unity gain for a specific subsystem error function indicates that the cause of the error for that subsystem is directly addressed by the input in question. $e_1$ and $e_2$ are the observed proportional and time-varying errors between the reference profiles generated in (19) and (20) and the observed states $\hat{z}_{i1}$ and $\hat{z}_{i2}$, respectively. These error functions are derived from the relationship between each subsystem and the single output given the current state of the actuator, i.e., the angular position and velocity of the remaining states.
$r_i$ and $h_{1i}$ are adjustable parameters unique to each subsystem. $\hat{z}_{i3}$ is the total disturbance estimated by the observer for subsystem $i$. With the above control law, the input distribution converges, with $\lambda_1 \in [-1, 1]$ and $\lambda_2 \in [0, 1]$ being linear distributing functions dependent on the current state of the spring deflection and the reference torque, and $k_0$ a tuning parameter that produces a meaningful distribution of the inputs. See Appendix B for the full derivation.
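The control law and a simple input distribution can be sketched as follows. The fal-based feedback and disturbance cancellation follow the standard single-loop ADRC form; the distribute function is only a hypothetical stand-in for the paper's state-dependent distributing functions, with made-up clamped-linear weights:

```python
import math

def fal(e, alpha, delta):
    """Han's nonlinear error function: linear near zero, |e|^alpha beyond delta."""
    if abs(e) <= delta:
        return e / (delta ** (1.0 - alpha))
    return (abs(e) ** alpha) * math.copysign(1.0, e)

def adrc_control(e1, e2, z3, b0, beta1=10.0, beta2=1.0,
                 alpha1=0.75, alpha2=1.25, delta=0.05):
    """Single-loop ADRC law: nonlinear state-error feedback u0 plus
    cancellation of the estimated disturbance z3. Gains are illustrative.
    Returns (u, u0) so the cancellation can be checked externally."""
    u0 = beta1 * fal(e1, alpha1, delta) + beta2 * fal(e2, alpha2, delta)
    return (u0 - z3) / b0, u0

def distribute(u, deflection, reference, k0=1.0):
    """Hypothetical split of the aggregate command between the motor and
    brake channels: lambda1 in [-1, 1] and lambda2 in [0, 1] are simple
    clamped-linear weights standing in for the distributing functions."""
    lam1 = max(-1.0, min(1.0, k0 * (reference - deflection)))
    lam2 = max(0.0, min(1.0, k0 * deflection))
    return lam1 * u, lam2 * u
```

By construction, the plant-side quantity $b_0 u + \hat{z}_3$ equals $u_0$, so a well-estimated disturbance is cancelled exactly and the plant behaves like the nominal double integrator driven by $u_0$.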

ADRC tunable parameters
Due to the large number of tunable parameters in the ADRC, it becomes difficult to evaluate the contribution of each parameter to the performance of the actuator. For each independent variable (inclusive of all time derivatives), there is one observer to estimate the states and one total disturbance term for the subsystem. The MISO systems in (29) have additional tunable gains equal to the number of inputs multiplied by the number of extended observers. Multi-input systems also have multiple transient profile generators. Each fhan function defined in (16) has two controllable parameters, $r_0$ and $h_0$. Therefore, the number of tunable parameters from the transient profile generators is twice the number of extended state observers. All of the tunable ADRC parameters to be optimized in the following sections are summarized in Table 1.

Table 1. Summary of all tunable parameters in an ADRC applied to series elastic actuator (SEA) control.

Reference point dominance-based nondominated sorting genetic algorithm
The MISO ADRC described in the previous section has 22 tunable parameters. Finding appropriate values for them becomes even more challenging with multiple control objectives, particularly when they contradict one another. Therefore, the controller tuning needs a multi-objective optimizer. As an example, consider two interdependent control objectives: the minimization of control effort and the minimization of rise time. Since rise time decreases as control effort increases, if the latter is to be minimized the system will have an extended rise time. When the control objectives are in opposition, there may not be a single set of values for the ADRC parameters (called a solution hereafter) that satisfies all of the control objectives. In fact, a set of optimal solutions creates a pareto front, that is, a set of solutions that are not strictly inferior to (or are not dominated by) any other solution. A solution that performs better than another in at least one objective and no worse in any other objective is said to be dominant. Fig. 5 illustrates this phenomenon. The solutions connected with a dotted line represent the location of entirely nondominated solutions (the pareto front). The dominated solutions have one or more solutions with a more optimal value for at least one of the objectives. This is represented in the figure by shaded regions: the dominating solution lies at the bottom-left corner of each region, and any other solution within the region is therefore dominated for the dual minimization objectives. In problems with multiple objectives, issues may arise around the diversity of solutions with complex pareto fronts (Ishibuchi, Setoguchi, Masuda, & Nojima, 2016; Li & Zhang, 2008).
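Pareto dominance and front extraction are mechanical to implement; a minimal sketch for minimization objectives:

```python
def dominates(f_a, f_b):
    """True if fitness vector f_a pareto-dominates f_b (all objectives are
    minimized): no worse in every objective, strictly better in at least one."""
    return (all(a <= b for a, b in zip(f_a, f_b))
            and any(a < b for a, b in zip(f_a, f_b)))

def pareto_front(fits):
    """Return the indices of the nondominated solutions in fits."""
    return [i for i, fi in enumerate(fits)
            if not any(dominates(fj, fi)
                       for j, fj in enumerate(fits) if j != i)]
```

For instance, with (control effort, rise time) pairs, a solution faster in one objective but slower in the other stays on the front, while any solution matched or beaten on both is dominated.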
A solution to automate the tuning of an ADRC should incorporate a multi-objective optimizer. As mentioned in the introduction, the RPD-NSGA-II algorithm from Elarbi et al. (2018) performs as well as or better than existing multi-objective solvers on many-objective benchmarks, so it is selected to tackle the multi-objective problem of tuning the ADRC parameters introduced earlier. In multi-objective optimization the goal is to obtain a set of potential solutions with satisfactory performance on all fronts. The chosen algorithm places emphasis on the diversity of the population in the evolutionary algorithm. The goal of the RPD-NSGA-II is to achieve a combination of convergence and diversity, which are not independent of one another. The stochastic nature of evolutionary algorithms helps improve diversity in the search for optimal solutions, while the non-RPD-dominated sorting and selection across multiple pareto fronts drives convergence around the objectives.

Reference points and distance measures
The diversity guarantee of the RPD-NSGA-II algorithm is attributed to the reference points generated at the beginning of the process. The reference points are generated using the method proposed in Das and Dennis (1998), with a set of evenly distributed points on a normalized hyperplane as shown in Fig. 6(a). Once the fitness values for each objective are obtained, each range of fitness values is normalized by its maximum and minimum values, producing the solution set marked by the green points in Fig. 6(b). The potential solutions are then associated with the nearest reference point vector drawn from the ideal point (the origin for minimization problems) along the normalized hyperplane. Consider the set of solutions in Fig. 6(b) as they approach the plane in Fig. 6(c), where each solution is attributed to a reference point; in this example, one solution is associated with reference point $r_2$ and two others with reference point $r_3$. Once all of the solutions have been associated with their respective closest reference point, each solution is assigned two distance measures, $d_1$ and $d_2$, used to aid the non-RPD-dominated sorting process. Distance measure $d_1$ is the distance between the origin (for minimization problems) and the foot of the normal drawn from the potential solution to the reference vector, as in Fig. 6(c). This distance relates to the convergence of the solutions, as a smaller $d_1$ means better overall fitness of a solution. Distance measure $d_2$ is the magnitude of that normal, as in Fig. 6(d). This distance is used to encourage diversity in the selection process for the next generation: reference points with solution crowding reduce the emphasis on some of their solutions to favour diversity.

Fig. 5. Example of a pareto front for control effort ($\int u$) and rise time ($t_r$) minimization. The pareto front is a set of nondominated solutions, with one or more solutions having an optimal value for at least one of the objectives.
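The reference point construction and the two distance measures can be sketched directly. das_dennis follows the standard simplex-lattice construction of Das and Dennis (1998), and d1_d2 computes the projection and perpendicular distances of a normalized fitness vector to a reference direction; the function names are ours:

```python
import itertools
import math

def das_dennis(m, p):
    """Evenly spaced reference points on the unit simplex for m objectives
    with p divisions (Das & Dennis, 1998); yields C(p + m - 1, m - 1) points."""
    pts = []
    for c in itertools.combinations(range(p + m - 1), m - 1):
        prev, coords = -1, []
        for ci in c:                      # stars-and-bars decomposition
            coords.append((ci - prev - 1) / p)
            prev = ci
        coords.append((p + m - 2 - prev) / p)
        pts.append(coords)
    return pts

def d1_d2(f, w):
    """Distance measures used by RPD-NSGA-II: d1 is the projection of the
    normalized fitness vector f onto reference direction w (convergence),
    d2 is the perpendicular distance to that direction (diversity)."""
    nw = math.sqrt(sum(x * x for x in w))
    d1 = abs(sum(a * b for a, b in zip(f, w))) / nw
    d2 = math.sqrt(max(0.0, sum(a * a for a in f) - d1 * d1))
    return d1, d2
```

Each generated point sums to one, so the set tiles the normalized hyperplane uniformly regardless of the number of objectives.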

RP-dominance and non-RPD-dominated sorting
The two distance measures are used to evaluate a solution's dominance over other solutions, generating an alternative to determining pareto fronts beyond pareto-dominance alone. Let $x = [x_1, x_2, \ldots, x_n]$ be a vector containing the ADRC tunable parameters listed in Table 1. A potential solution $x_a$ with multi-objective fitness $F(x_a) = [f_1(x_a), \ldots, f_m(x_a)]$ is said to pareto dominate another solution, say $x_b$, if $f_j(x_a) \le f_j(x_b)$ for all $j$ and $f_j(x_a) < f_j(x_b)$ for at least one $j = 1, 2, \ldots, m$. For the RPD-NSGA-II algorithm, dominance of a solution over another is taken a step further. Solution $x_a$ is said to RP-dominate solution $x_b$ if $x_a$ pareto dominates $x_b$, or if $x_a$ and $x_b$ are pareto equivalent and either of the following is true (Elarbi et al., 2018): (1) both solutions are associated with the same reference point, but the value of $d_1$ for $x_a$ is lower than that for $x_b$; or (2) the solutions are associated with different reference points, but the value of $d_1$ for $x_a$ is lower than that for $x_b$ and fewer solutions are associated with the reference point of $x_a$ than with that of $x_b$. Counting the solutions associated with a reference point is known as computing the reference point density.
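The RP-dominance test reduces to a short comparator. The dictionary layout here is an illustrative convenience, not the paper's data structure: each solution carries its fitness, associated reference point, convergence distance, and niche count (how many solutions share its reference point):

```python
def rp_dominates(a, b):
    """RP-dominance sketch: pareto dominance first, then the two tie-break
    conditions on reference point association, d1, and niche count.
    a and b are dicts with keys 'f', 'rp', 'd1', and 'niche'."""
    def pd(x, y):  # pareto dominance for minimization
        return (all(i <= j for i, j in zip(x, y))
                and any(i < j for i, j in zip(x, y)))
    if pd(a['f'], b['f']):
        return True
    if pd(b['f'], a['f']):
        return False
    # pareto-equivalent: break the tie with reference point information
    if a['rp'] == b['rp']:
        return a['d1'] < b['d1']
    return a['d1'] < b['d1'] and a['niche'] < b['niche']
```

Because the tie-break prefers less crowded reference points, two pareto-equivalent solutions are ranked in favour of the one that improves the spread of the population.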
Therefore, the entire population can be evaluated and placed into various dominance ranks. The ranks are an extension of pareto-dominance, with emphasis on diversity due to the second condition above. This methodology is referred to as non-RPD-dominated sorting (Elarbi et al., 2018). Once the entire population has been sorted, the top 50% of solutions amongst the best-performing RP-dominated pareto fronts progress to the next generation of the optimization loop as the parent population. If the termination condition of the optimization process is not met, the new parent population is varied using standard stochastic genetic algorithm operators (crossover and mutation) and is re-evaluated for its fitness and distance measures. The complete algorithm is displayed in Fig. 7.

Performance objectives for SEA
One can now define the control objectives of the ADRC of the multi-input SEA as the following performance metrics. While all of these are well known to the control engineer, it is important to clarify how they relate to SEAs for human-machine interaction specifically.
Tracking error $e_t$: The strict adherence of a desired output state to a reference profile is the goal of most control systems and, therefore, this objective carries significant weight when evaluating fitness in the optimizer. Tracking error refers to the integral of the error over all time, displayed as the shaded region in Fig. 8. Ideally, $e_t = 0$, which corresponds to perfect tracking between the reference $r(t)$ and the actual output $y(t)$:

$$e_t = \int_0^{\infty} |r(t) - y(t)|\,dt$$

In the context of the SEA in human-machine interaction, minimal tracking error refers to the closeness of the output torque to the reference torque (see Fig. 8).

Control effort $e_c$: Refers to the integral of the controller output $u_0(t)$ over all time:

$$e_c = \int_0^{\infty} |u_0(t)|\,dt$$

The control effort relates to the actuator's efficiency, which constitutes a significant objective in portable human-robot interaction systems such as the SEA presented earlier.
Percent overshoot $P_o$: Refers to the percentage by which the maximum output value exceeds the reference, which is particularly useful when the output $y(t)$ is nearing rated hardware limits. $P_o$ is defined as:

$$P_o = \frac{\max_t y(t) - r}{r} \times 100\%$$

Minimizing the percent overshoot limits the maximum output torque in the SEA, which is particularly important in the context of force/torque control during human-machine interaction.

Rise time $t_r$: Refers to the time it takes for the system to reach 95% of the reference value, measured from the time of transience. This is useful for high-speed switching applications. $t_r$ is defined by $t_r = \min(t) \mid y(t) \ge 0.95\,r(t)$.
Settling time $t_s$: Refers to the time it takes for the system to permanently settle within ±5% of the reference value, measured from the time of transience. The settling time is defined as $t_s = \max(t)$ where $|r(t) - y(t)| \ge 0.05\,r(t)$. Although the settling time is a good metric to determine stability in response to a transient state, as we shall see later on, minimizing the settling time in SEAs adds unwanted oscillations due to the presence of the elastic element.
Maximum input $u_{max}$: Refers to the maximum value of the input(s) of the device, defined as $u_{max} = \max_t(u(t))$. This objective is useful when the source is limited, nearing saturation, or when there are tight tolerances on the applied input to the plant. Maximum input is important for devices used in close proximity to humans. Note that the minimization objective for $u_{max}$ aligns with that for the control effort $e_c$, as both aim to reduce the input.
Steady-state error $e_{ss}$: The steady-state error is the error between the reference and the settled output state as $t \to \infty$, defined as $e_{ss} = \lim_{t \to \infty} |r(t) - y(t)|$. This is particularly useful in high-precision applications with tight tolerances on the final state of the output. Minimizing the steady-state error is important in SEAs where precise torque control is more desirable than the time it takes to reach the reference, allowing for smooth, well-defined motions.
Disengagement time $t_d$: Disengagement time is the time the SEA takes to release the stored energy in the elastic element or reverse back to an initial state. It is measured from the time disengagement is initiated, $t_0$, to the time when the output is within 5% of the reference value at the time disengagement was initiated. This is important in applications involving physical interaction with humans, as a means of collision mitigation. $t_d$ is defined as $t_d = \min(t) \mid y(t) \leq 0.05\,r(t_0)$. Quickly reducing the amount of energy in the actuator ensures there is no holding torque and minimizes the amount of energy transferred to the output.

Number of input direction changes $N_s$: Refers to the number of instances the control effort changes sign. High-switching solutions are difficult to implement physically and can damage sensitive plants. $N_s$ is defined as the count of sign changes in $u(t)$ over the run:
$$N_s = \sum_{k} \left[\operatorname{sgn}(u(t_k)) \neq \operatorname{sgn}(u(t_{k-1}))\right].$$
This objective is inserted into the algorithm as a method to reduce the number of high-switching solutions in the population. High switching in a SEA is undesirable and may be an indication of unstable behaviour.

Each of the above objectives is a minimization function, that is, the minimum value of each objective translates to better performance. A graphical representation of each of the above objective functions is displayed in Fig. 8.
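As an illustration, the objective definitions above can be computed from sampled simulation data. The sketch below is a minimal Python reconstruction (function and variable names are ours, not the paper's), assuming a step-like reference whose final value is the nominal setpoint:

```python
import numpy as np

def control_objectives(t, r, y, u, pct=0.05):
    """Compute the scalar control objectives for one simulated run.

    t: time samples (s); r: reference; y: output; u: controller output.
    Definitions follow the minimization objectives above; the names and
    signature are illustrative, not taken from the paper.
    """
    dt = np.diff(t, prepend=t[0])
    J_e = float(np.sum(np.abs(r - y) * dt))        # tracking error: integral of |error|
    J_u = float(np.sum(np.abs(u) * dt))            # control effort: integral of |input|
    r_f = r[-1] if r[-1] != 0 else 1.0             # final reference value
    M_p = max(0.0, (np.max(y) - np.max(r)) / r_f) * 100.0  # percent overshoot
    above = np.nonzero(y >= (1 - pct) * r_f)[0]
    t_r = t[above[0]] if above.size else np.inf    # rise time: first 95% crossing
    outside = np.nonzero(np.abs(r - y) >= pct * abs(r_f))[0]
    t_s = t[outside[-1]] if outside.size else t[0] # settling time: last 5% excursion
    u_max = float(np.max(np.abs(u)))               # maximum input
    e_ss = float(np.abs(r[-1] - y[-1]))            # steady-state error (final sample)
    N_s = int(np.sum(np.diff(np.sign(u)) != 0))    # number of input direction changes
    return dict(J_e=J_e, J_u=J_u, M_p=M_p, t_r=t_r, t_s=t_s,
                u_max=u_max, e_ss=e_ss, N_s=N_s)
```

In the optimizer, a vector of these values would constitute the fitness of one candidate gain set.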

Application to multi-input series elastic actuators
The ADRC and the optimizer described earlier can now be applied to control the SEA output torque. Fig. 1(b) shows the experimental setup. A similar reference profile to that shown in Fig. 8 is selected as the desired torque profile. It is assumed that the torque is proportional to the spring deflection, hence, the reference profile shown in Fig. 8 is the desired deflection of the SEA spring and is the input to the controller. This reference profile is selected to be challenging for the controller, as well as provide the necessary metrics to compute meaningful values for each control objective in Section 4.3.
The actuator's output shaft is connected to a load cell from which the output torque is measured. Due to this constraint and to ensure consistency in the experiments, the output angular position and velocity are held at zero, $\theta_o = \dot{\theta}_o = 0\ \forall t$, so the spring deflection reduces to the negative of the input-side angular displacement for all the scenarios considered later. Constraining the actuator to measure the output torque with a load cell eliminates the need for an additional extended state observer. Thus, the controller has two extended state observers: one for the contribution to the output deflection from the motor's angular displacement, and one for the contribution from the angular displacement of the magnetic particle brake. Therefore, the controller has a total of 22 tunable parameters, as listed in Table 1.
The state-space model of the actuator is implemented in the optimizer. For every generation, the entire merged population (parent population and varied population, see Fig. 7) is simulated and the fitness values for each objective are determined. The optimizer's crossover and mutation variation parameters are both set to 35, with a population of 300, over a total of 500 generations. To ensure the results provide meaningful control of the plant, a bias is placed on the tracking error. This produced much more desirable results; the fitness values over increasing generations are displayed in Table 2, considering a single objective at a time.
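The generation loop just described can be sketched as follows. This is a simplified stand-in, not the paper's implementation: a weighted-sum ranking replaces the full RPD-NSGA-II machinery, a synthetic quadratic landscape replaces the SEA state-space simulation, and a small demo population is used so the sketch runs quickly. All names and numbers other than the 22 gains and the 35 variation parameter are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

N_GAINS = 22   # tunable ADRC parameters (Table 1)
VAR = 0.35     # crossover/mutation variation parameters (35)

def simulate_and_score(gains):
    """Stand-in for simulating the SEA state-space model and computing
    objective fitness values; a synthetic landscape for illustration."""
    target = np.linspace(0.1, 1.0, gains.size)
    J_e = float(np.sum((gains - target) ** 2))   # pretend tracking error
    J_u = float(np.sum(np.abs(gains)))           # pretend control effort
    return np.array([J_e, J_u])

def evaluate_generation(parents, children, bias=(10.0, 1.0)):
    """Merge the parent and varied populations, score every member, and
    keep the best half under a bias-weighted ranking (the heavy weight on
    the first objective mirrors the paper's bias on the tracking error)."""
    merged = np.vstack([parents, children])
    fitness = np.array([simulate_and_score(g) for g in merged])
    order = np.argsort(fitness @ np.asarray(bias))
    return merged[order[: parents.shape[0]]], fitness[order[0]]

parents = rng.uniform(0.0, 1.0, size=(20, N_GAINS))   # small demo population
for _ in range(30):                                    # demo generations
    children = parents + rng.normal(0.0, VAR * 0.2, parents.shape)  # variation
    parents, best = evaluate_generation(parents, children)
```

Because the parents are carried into every merged population, the best-ranked member can never worsen between generations, which is the elitism the merged-population scheme of Fig. 7 provides.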

Experimental results
Once the optimization algorithm has run its course, the resulting Pareto front set can be evaluated experimentally with the SEA. Two sets of experiments are reported: single- and multiple-objective optimization. The solution that best minimizes one or more control objectives at a time is selected and the gains determined by the optimizer are physically implemented. Table 2 summarizes the best results obtained for each objective, and the experimental and simulation results are shown in Fig. 9. The top plot in each panel shows the desired deflection of the elastic element along with the simulated and measured deflections for the values listed in Table 2. The bottom plot in each panel presents the simulated and experimentally applied inputs to the actuator, i.e., the brake and motor voltages.

Single control objective results
The ADRC gains for the motor and brake found to minimize the tracking error are summarized in the first two lines of Table 3, and the experimental results obtained for these gains are displayed in Fig. 9a. From the experiments, one can discern that the actuator is able to maintain reasonable tracking of the spring deflection. The small perturbation around 12 s is due to backlash in the differential gear. The result is well matched with the simulated results and certainly performs favourably for reference tracking.
The optimized gains that minimize the control effort are listed in the third and fourth lines of Table 3 and the experimental results are shown in Fig. 9b. A net decrease in the motor voltage can be seen compared to Fig. 9a. One can also note the absence of any overshoot, which is another indication that the control objective has been attained.
The results for minimal overshoot are shown in Fig. 9c. One can again observe the lack of overshoot. Further, the control effort again resembles that of Fig. 9a. The results for the settling time, maximum input, and steady-state error, in both the simulations and the experiments, for the gains highlighted in Table 3, are displayed in Figs. 9d, 9e, and 9f, respectively. From the results it can be seen that each control objective is achieved. The results of the controller minimizing the disengagement time, using the gains in Table 3, in both the simulations and experiments, are displayed in Fig. 9g. For this objective, the fitness value is also a function of the number of input direction changes, therefore naturally biasing against high-switching solutions.
The minimization of rise time resulted in an unstable controller. The final population of the 500th generation contained members with unstable controllers; however, this instability does ensure the output rises to the reference as quickly as possible. This is a good example of the importance of defining objectives and biasing the outcome based on the more important objectives. If there were no bias on any objective, the unstable controller solution would be selected for variation into future generations. For the most part, this can have adverse effects on the progression of the controller gains, as it contradicts most of the other objectives. For obvious reasons, this controller was not implemented experimentally and the simulation results are not included in Fig. 9.

Multi-objective control results
The optimizer has demonstrated its validity in optimizing various single-objective cases; however, one also has the option to choose a solution that best represents a unique application from the population set at the end of the optimization. Consider five members of the resulting population shown in Table 4, and their respective fitness values, when the tracking error, control effort, and overshoot are set as concurrent control objectives.
If the device was to be used in the context of human-machine interaction, specifically robot-assisted rehabilitation, there may be a number of deterrents when selecting the appropriate control strategy.
For example, there may be a very specific torque goal in mind to ensure that the patient does not experience overexertion, and perhaps the device is destined to be mobile and, therefore, battery operated. In this case, the most significant objectives to optimize are the tracking error, control effort, and percent overshoot. If these were the specifications for the controller design, the candidate solutions could be Members 1, 2, 3, and 5 of the population from Table 4, as each of them has reasonable values for the three objectives in question. Member 4 may be discarded, as it is dominated by every other solution with respect to the significant objectives. Furthermore, Members 1 and 5 have a percent overshoot that could be considered unreasonably high for the design specifications and, therefore, could be discarded as well. The remaining members, 2 and 3, have relatively similar tracking error and percent overshoot values, but differ significantly in control effort. The designer may then select the gains optimized from Member 3, as it has the lowest control effort, compromising slightly on tracking error and percent overshoot compared to Member 2. The results for Member 3, with the optimized gains from Table 4 from the final population set, are demonstrated in Fig. 10.

This example demonstrates the importance of multi-objective optimization in the process of selecting controller gains for specific applications, where the designer can view the trade-offs between various solutions. The ability to gauge the overall performance of a controller provides a means of tailoring the controller to the specifications of the application. From the experimental results, it is clear that the multi-objective optimizer can combine the features of the several single-objective problems described in the previous section.
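The dominance reasoning used to prune the candidate set can be made concrete with a small sketch. The fitness values below are hypothetical placeholders (not the paper's Table 4 numbers), arranged only to mirror the narrative, with all objectives minimized:

```python
def dominates(a, b):
    """True if fitness vector a Pareto-dominates b (all objectives are
    minimized): a is no worse in every objective and strictly better in
    at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def nondominated(members):
    """Filter a population {name: (J_e, J_u, M_p)} down to the members
    not dominated by any other member."""
    return {m: f for m, f in members.items()
            if not any(dominates(g, f) for n, g in members.items() if n != m)}

# Hypothetical fitness tuples (tracking error, control effort, % overshoot).
population = {
    "m1": (0.9, 5.0, 12.0),
    "m2": (1.0, 6.0, 3.0),
    "m3": (1.1, 3.5, 4.0),
    "m4": (1.5, 6.5, 13.0),  # worse than m1 in every objective -> dominated
    "m5": (0.8, 7.0, 11.0),
}
front = nondominated(population)  # m4 is removed; the rest trade off objectives
```

After the dominance filter, the remaining choices (analogous to keeping Members 1, 2, 3, and 5 and discarding Member 4) are pure trade-offs, and the designer applies application-specific thresholds, such as an overshoot cap, to pick one.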

Conclusions
The most difficult portion of active disturbance rejection control is the tuning of the system parameters. Presented in this paper is the implementation of the RPD-NSGA-II from Elarbi et al. (2018) to optimize the parameters required for ADRC on a multi-input SEA. By using a multi-objective optimization technique coupled with a simulation, the parameters required to achieve the desired performance of a physical system were determined. The RPD-NSGA-II routine proved capable of handling this multi-objective optimization problem, providing the end user with a set of nondominated solutions from which the designer is able to choose gain values based on the objectives most suitable for their application. Comparing the results of this paper with other optimization algorithms is difficult, since this is the only implementation of a genetic algorithm on a dual-input SEA. However, for a comparison between the performance of this optimizer and other decomposition-based multi-objective genetic algorithms, the reader is referred to Elarbi et al. (2018), and a comparison between a PID controller and an ADRC applied to the same SEA is available in DeBoon, Nokleby, and Rossa (2020).
In order to choose gains that are favourable to multiple objectives, the designer can evaluate the set of fitness values for each member of the resulting population and determine how to bias the controller. The relative trade-offs between the objectives become apparent and, therefore, the designer can select the set of gains best suited for their application. There is a wide range of applications for an ADRC optimization method. Since active disturbance rejection control is a favourable alternative to PID control, the controller can be used in a multitude of plants, ranging from robotic actuators used in medical devices to autonomous vehicles and industrial automation.

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Appendix A. Expanded state observers for the multi-input series elastic actuator
The three extended state observers are determined by equating the first state of each observer with the corresponding measured quantity: the motor angular displacement, the brake angular displacement, and the spring deflection, respectively. The systems of equations for the subsystems are structured as follows. The motor, brake, and spring dynamics are each converted to three extended state variables, the third of which carries the corresponding total disturbance term. For the motor subsystem, the input function is approximated linearly by a constant gain, which defines the motor observer's input coefficient. Similarly, for the brake subsystem, the input function is linearized about the operating point, which defines the brake observer's input coefficient. Finally, for the spring subsystem, $\beta_{0i}$, $i = 1, 2, 3$, are observer proportional coefficients selected by the designer, and $g_i(\cdot)$, $i = 1, 2, 3$, are observer error functions. The error functions suggested by Han (2009) provide nonlinear observer error functions as:
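A standard form for these nonlinear error functions, following Han's widely cited formulation (reconstructed here; the specific exponents and thresholds used in the paper are assumptions), is the $\operatorname{fal}$ function:

```latex
g_i(e) = \operatorname{fal}(e, \alpha_i, \delta) =
\begin{cases}
\dfrac{e}{\delta^{1-\alpha_i}}, & |e| \le \delta, \\[4pt]
|e|^{\alpha_i} \operatorname{sgn}(e), & |e| > \delta,
\end{cases}
\qquad 0 < \alpha_i < 1,\quad \delta > 0.
```

The exponent $\alpha_i$ sets the degree of nonlinearity (smaller values give higher gain for small errors), while $\delta$ defines a linear band around the origin that avoids high-gain chattering near zero error.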