Imitation-Based Motion Planning and Control of a Multi-Section Continuum Robot Interacting With the Environment

Recently, flexible robots are growing in importance owing to their merits over rigid robots in maneuverability and safety, which equips them to work in unstructured environments, such as occur in medical applications. However, motion planning and control of flexible manipulators is challenging due to their compliance behavior and system uncertainties. Thus, this letter presents an Imitation-based Motion Planning (IbMP) approach, along with dynamic impedance control for learning, planning, and trajectory tracking of a two-section soft continuum robot in a dynamic environment. Point-to-point motion demonstrations, including the robot's tip position and orientation are intuitively provided by a motion capture system (OptiTrack V120-trio) and a similar kinematic flexible interface. Additionally, a singularity-free dynamic model based on Lagrangian formulation and Taylor expansion series of a two-section continuum robot is derived while planning for robot motions. The simulation results show that the IbMP approach, along with the dynamic impedance control, generalizes the robot's motion by varying the initial and goal of the robot's tip pose while avoiding static and dynamic obstacles, and moving back to the desired track after disturbances. Finally, the stability and performance of the IbMP algorithm against input disturbances are assessed using a Monte Carlo approach that can guide the selection of gain values.

octopus arms, elephant trunks, worms, and snakes to facilitate dexterous manoeuvring in static and dynamic environments. Unlike classical manipulators with rigid and bulky elements, which face challenges, especially in crowded environments, continuously bending continuum robots with compliance behavior and flexible backbones have shown promising performance in complex and dynamic environments [1]. Due to their morphological structure, continuum robots can extend, shrink, and bend, which facilitates shaping their structure to suit confined spaces in unstructured environments such as the biomedical field [2], Minimally Invasive Surgeries (MIS) [3], and exploration missions [4].
Despite the merits of soft robots, they still suffer from common limitations in modelling, motion planning, and control due to their compliant structure and viscoelastic characteristics, leading to unpredictable motion behaviors. Recently, multiple motion planning techniques, which were purposed originally for rigid robots, are used for continuum robots [5], [6]. However, these methods still need programming experts and significant computational time to precisely define a sequence of actions that the robot adheres to during movement to achieve the task and to keep up with environmental changes like changing goals or avoiding dynamic obstacles [7]. Thus, there has recently been an increasing interest in the Learning from Demonstration (LfD) framework due to its performance that allows non-expert users with limited robotics knowledge to quickly program and adapt the robot behaviors while concerning the task constraints through kinesthetic learning or via teleoperation [8]. In LfD, state-action pairs of the desired task are recorded from the human demonstrations either kinesthetically or via teleoperation. Then, the robot learns to regenerate and generalize the demonstrated motions. Impressive results were acquired by LfD approaches in many applications [9]. In [10], motion planningbased LfD of multiple-segment flexible manipulator actuated by ionic polymer-metal composite (IPMC) flexible actuators was developed using kinesthetic teaching. Human demonstrations of the manipulator passing through a narrow hole were extracted and smoothed based on Gaussian Mixture Model (GMM) and Gaussian Mixture Regression (GMR). The experimental results showed that the soft robot passed through a narrow hole successfully. However, the human operator needed to adjust the voltage of each robot section manually during motion in static environments based on the traditional trial and error method, which This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/ represented a time-consuming task. In addition, the developed LfD approach had no guarantee of generalizing the collected demonstrations and regenerating new movements.
The development of pose planning-based Dynamic Movement Primitives (DMP) and control of a pneumatic actuated bionic handling assistant robot for apple picking tasks was introduced in [11]. Kinesthetically, the robot's tip poses were recorded during teacher demonstrations. The experimental tests showed that the soft robot successfully picked and placed the apple. However, the proposed learning-based orientation method combined unit quaternion and angle axis method, which may generate angle error [12], and it was applied only in a static environment. In addition, in many medical applications, having a direct contact with the robot through kinesthetic teaching could be challenging. In our previous work [13], [14], a Demonstration Guided Pose Planning (DGPP) approach and control of a two-section continuum robot were proposed. The human demonstrations were recorded using a low accuracy Inertial Measurement Unit (IMU). In [13], a rotation matrix representing orientation suffered from singularity at specific configurations. In addition, the computed torque with adaptive kinematic control was applied to track the robots' tip position, which was not efficient while interacting with the environment. While in [14], a DGPP-based quaternion that was introduced to plan the tip position and orientation separately had the potential to cause unsynchronised movements. Moreover, the developed adaptive kinematic control was not appropriate to capture the dynamic behavior of the soft robot.
On the other hand, developing a control algorithm for continuum manipulators that can successfully interact robustly with dynamic environments is challenging [15]. Nonlinear hybrid position/force control of a tendon-driven catheter was presented in [16]. Cosserat rod theory was applied for static modeling of the catheter. The simulation results showed the effectiveness of the developed position/Force controller in regulating the constant applied force. In [17], a hybrid position/force control algorithm was developed for a single-section cable-driven continuum robot. The control structure was built depending on selection matrices to choose between position control application before or after interaction and force control during interaction. The experimental results showed that the controller was stable while interacting with rigid and soft environments. However, it was not effective and safe in many applications, such as surgical operations, due to its high velocities when applying position control directly before colliding with the environment, which may lead to unpredictable interaction. Also, this control approach neglected to follow the position trajectory during the interaction period leading to deviation from the reference motion trajectories [18].
The contributions of this article are summarized in the following aspects. (1) Proposing an Imitation-based Motion Planning (IbMP) approach for simultaneously learning the position and orientation of a two-section soft continuum robot which is chosen to increase the flexibility and dexterity for inspection tasks. Via teleoperation, the learning procedure starts by collecting the human demonstrations using a motion capture system (OptiTrack-V120) from a flexible interface, which has similar kinematics to the simulated robot. The proposed IbMP approach is not limited to reproducing the demonstrated movements but extends to generalizing the tip pose trajectories while changing the initial and goal poses, and withstanding external disturbances. The collision of the continuum robot with the environment represents an inevitable problem, especially when the robot works in narrow spaces and may lead to serious damage to the robot or the environment. Thus, the IbMP approach can avoid obstacles in static and dynamic environments. (2) Evaluating the robustness of IbMP based on a Monte Carlo simulationbased algorithm which can guide the IbMP tuning process. (3) developing a dynamic impedance controller to compensate for the interaction forces with the environment while tracking the robot's tip position trajectory. The dynamic model is derived based on a Lagrangian formulation with Taylor Expansion to overcome singularities at specific configurations.
The remainder of this article is organized as follows. Section II introduces kinematics and dynamic modeling of a two-section continuum robot based on Lagrangian formulation with Taylor expansion. Section III gives a detailed description of the IbMP approach. Dynamic impedance control for tracking the tip position trajectory while interacting with the environment is presented in Section IV. Simulation results of the IbMP approach and the developed impedance control are evaluated in Section V. Finally, the conclusion and future works are introduced in Section VI.

II. KINEMATIC AND DYNAMIC MODELLING
Different actuation mechanisms are used in continuum robots to achieve the required tip bending, such as cable-driven, pneumatic, hydraulic, and Shape Memory Alloy (SMA) actuation. Due to their simplicity, scalability, and simple control, soft robots actuated by cable-driven mechanisms are used in many fields such as inspection, exploration, and medicine.
The schematic diagram of a single-section continuum robot is shown in Fig. 1. Each section contains three cables spaced by 120 • that can be varied to control the robot's tip moment. Because of the precise representation and low computational time derivation compared to other kinematic modeling techniques, the constant curvature approach is chosen to describe the pose of section i relative to its base i − 1 [19], as follows: where q e = [h i , θ i , β i ] ∈ R 6 , for i = 1, 2 is the vector of configuration variables, in which h i is the section length, θ i represents the bending plane angle from +X axis, β i = κ i h i is the angle subtended by the bending arc in which κ i is the curvature, and R(q e ) represents the rotation that maps the i axes to the are the cable lengths of each section which are related to the joint variables as follows: where d is the distance from the robot center to each cable. The kinematic representation in (1) suffers from the singularities at specific configurations (κ i = 0) that can be tackled by applying 7th order Taylor expansion to approximate the Cartesian position p(q e ).
The Lagrangian formulation of the equation of motion is computed as follows: where Γ ∈ R 6 is the driving torques, F ∈ R 3 is the contact force vector, J = ∂p ∂q e ∈ R 3×6 represents the Jacobian matrix, M (q e ) ∈ R 6×6 , N (q e ,q e ) ∈ R 6×6 and G(q e ) ∈ R 6 are the inertia matrix, the Coriolis and Centrifugal matrix and the gravity vector respectively.

III. IMITATION BASED MOTION PLANNING
Dynamic Movement Primitives (DMP) is a motion learning technique in which the soft robotic arm can execute many tasks based on a collection of human demonstrations without relearning based on a second-order differential equation [20]. DMP is used to learn complex discrete and rhythmic movements and can generalize motion trajectories to adapt to variable conditions, including changing initial or goal poses while avoiding static obstacles. Here, we are interested in discrete point-to-point movements of the two section ends of a continuum robot.
The learning steps of the proposed IbMP approach start by capturing demonstrated tip pose trajectories and their derivative, respectively. Then, learning the tip trajectory in Cartesian space is introduced, with orientation represented in quaternion form. After that, to ensure synchronized motion and avoid external perturbations, a phase variable s is introduced into the differential equation for the pose error.

A. IbMP Representation of Robot Tip Position
The transformation system for 3D discrete point-to-point motion while avoiding obstacles is described by the differential equation of a spring-damper system with forcing function f (s) as follows [21]: where K a and D a are tuning gains, which act like the elastic and damping terms respectively, s represents a phase variable that monotonically changes from 1 to 0 during motion to avoid explicit time dependencies of the forcing function f , o ∈ R 3 is the obstacle position, x is the robot section's tip position trajectory, x 0 and x g are the initial and goal positions respectively. In addition, v andv represent the scaled velocity and acceleration respectively, and τ > 0 is a temporal scaling factor to alter the duration of motion trajectory slower or faster. Multiple LfD approaches have been proposed for avoiding obstacles based on neural networks to model the coupling terms [22]. However, these methods require additional learning of multiple human demonstrations to succeed in this task. Other approaches added a coupling term P (x, v) (4) to avoid obstacles in R 2 and R 3 without additional demonstrations, which is computed as follows: where ζ and ϑ are tuning gains which are selected based on the size of the obstacle,ȯ is the obstacle velocity, r represents a rotation matrix with rotation axis a = (o − x) × v and rotation angle π/2 that is added to overcome the inconsistency that may happen between the acceleration generated by the coupling term and the velocity of the robot [23]. The steering angle φ is the angle between the robot tip position x towards the obstacle position o and the robot tip velocity v relative to the obstacle velocityȯ which is computed as follows: The forcing function f (s) is parameterized by a nonlinear radial basis functions Υ, which improves the robot's capability to follow any trajectory from initial position x 0 to goal position x g . The force equation is calculated as follows: where W j represents the learned weights that are computed based on a local regression approach, and ψ j (s) are radial basis functions that are approximately equidistantly spaced and calculated as follows: where c j = e −α s [0:Υ−1] Υ−1 and w j = |(c (j+1) − c j )| for j = 1, 2. . .Υ represents the centers and widths of the Gaussian functions distributed along the motion phase [24].

B. IbMP Representation of Robot Tip Orientation
Recently, there has been an increased interest in developing motion planning algorithms concerning the orientation of continuum robots [25]. The robot's tip orientation can be represented in different forms such as rotation matrix, Euler angles, and Axis-angle. In this regard, equations (4), (5) can be reformulated to describe the tip orientation in these forms. However, representation in such a way may suffer from loss of DOF, discontinuity, and high computational load [14].
The IbMP based quaternion can be represented similar to (4) and (5) as follows [26]: where K z and D z are tuning parameters, q = q 0 + q 1 i + q 2 j + q 3 k ∈ R 4 represents the orientation in quaternion form, q g is the goal orientation and η = τ ω, andη = τω are the quaternion derivatives which are related to the angular velocity ω = 2 log(q gq ).
is a diagonal scaling matrix in which R 0 and R g are the reference and goal orientation matrices. The f q (s) is a nonlinear forcing term that allows the robot tip to track the orientation trajectory from the initial orientation q 0 to the goal orientation q g .
where μ j are the learned weights required to follow the orientation trajectory.
To achieve motion synchronization while learning both the position and orientation trajectories of the robot's end-effector, the phase variable s should combine the tracking error of position and orientation to overcome the time dependency problem. s is computed as follows: where α s and α r are tuning gains, e xq = ( e x 0 + e x + e q 0 + e q ) is the tracking error, in which e x 0 = x g − x and e x = x d − x are the tip position errors in which x d is the reference position trajectory, e q 0 = q g − q and e q = q d − q represents the orientation errors in which q d represents the desired orientation.
On the other hand, to overcome discontinuity and improve motion smoothness, the goal position x g in (4) is modified to vary continuously with time by integrating the following equation: where α g is constant gain, x g 0 is the new goal position. Similarly, the goal orientation q g in (10) varies with time to keep up with orientation discontinuity that occurs due to goal variation and is calculated as follows: where α g 0 ∈ R 4×4 is positive diagonal gain matrix and q g 0 is the new goal orientation. As depicted in Fig. 2, the robot's tip pose variables x, v,v, q, η andη are recorded at each time step t = 0, 1, . . ..T . Then, the phase variable s is computed by integrating the differential (13), then the target forces can be computed easily from (4) and (10) respectively. The problem is now addressed as a local weighted regression problem that could be solved based on linear least squares to get the weights W i and μ i . After that, these learned weights are the core of computing the actual forcing terms (8) and (12). Finally, by integrating (4), (5) and (10), (11), the actual position and orientation trajectories are easily obtained.

IV. CONTROLLER DESIGN
Impedance control is characterized by regulating the dynamic behavior between the robot's tip position and the exerted force on the environment. Meanwhile, achieving accurate force tracking is not mandatory, unlike traditional hybrid position/force control algorithms, which focus on position tracking and force tracking separately. Therefore, impedance control represents a reasonable trade-off between position tracking and force control, making it a suitable controller in many applications such as surgical operations.
Interaction between the robot tip and the environment causes shape deformation, leading to non-zero contact forces. Assuming the deformed part of the environment could be represented by a mass-spring-damper system as follows: where H ∈ R 3×3 , D ∈ R 3×3 , K ∈ R 3×3 are inertia, damping, and stiffness matrices of the environment that are selected as symmetric positive definite to ensure that the system will be asymptotically stable. e = x d − x ∈ R 3 is the error between the reference and actual position trajectories,ė =ẋ d −ẋ in whichẋ d is the desired velocity andẋ = Jq e , andë =ẍ d −ẍ in whichẍ d is the desired acceleration andẍ =Jq e + Jq e . It is worth noting that the reference tip position x d trajectory is changed during interaction which is represented by step disturbance for simplicity. The impedance control equation in joint coordinates is derived as follows [27]: where a =ẍ d is the desired acceleration. The block diagram of the dynamic impedance control is shown in Fig. 3.

A. Stability Analysis
Substituting the driving torques Γ (17) into (3), yields a closed-loop dynamics in (16). Thus, the user-defined matrices H , D and K determine the robot's behaviors when subjected to external forces. In this work, we are concerned environment with  fixed stiffness for simplicity. To discuss the stability analysis of (16), the Lyapunov candidate function is assumed as follows: Differentiating V (e,ė) along the trajectories of (16) with H constant yields:V (e,ė) =ė T Hë + e T Kė (19) By Substituting the value ofë from (16) in (19) with F = 0 and symmetric stiffness matrix K , results: Equation (20) shows thatV is negative definite for positive diagonal matrix D. Hence, we can conclude stability if all diagonal values of the damping matrix are constant and positive [28].

V. RESULTS AND DISCUSSION
This section presents the simulation results of IbMP and dynamic impedance control as follows. First, capturing the human demonstrations experimentally is introduced. Then, multiple simulation scenarios are conducted to evaluate the performance of IbMP, including trajectory tracking, avoiding obstacles and rejecting external disturbances. Additionally, the robustness of IbMP is analyzed based on the Monte-Carlo approach. Finally, the performance of dynamic impedance control of a two-section continuum robot while interacting with the environment is evaluated in terms of trajectory tracking. The simulated parameters of the IbMP approach and dynamic impedance control are listed in Table I.

A. Data Capture
The two-section continuum robot is controlled via teleoperation based on the motion of a flexible interface with similar kinematics for precise mapping, as shown in Fig. 4. The interface consists of two sections attached in series; each section is printed from the flexible filament (Ninja-flex (1.75 mm)) by using a 3D printer (MagiX MF-2500EP). As depicted in Fig. 4, a polygon shape with six markers represents the pose at the end of each section of the robot. Due to its high accuracy, simplicity, and auto-calibration feature, the motion capture system (OptiTrack V120-trio) is used to capture each marker's position and orientation relative to the base frame. Meanwhile, a cubic interpolation algorithm is implemented in the vision system (Motive software) to to compensate automatically for missing data. At the same time, a Moving Average Filter (MAF) is used to remove the noise in pose trajectories, as shown in Fig. 5.
As depicted in Fig. 4, the learning process is started by asking the user to move the flexible interface, each section of which 16.18 cm long, for randomly inspection tasks from one pose to another while capturing the pose of the end of each section of the robot by using the motion capture system.

B. Learning of the Tip Position Trajectories While Avoiding
Obstacles.  Fig. 6(a). While the error between the reference and learned trajectories is shown in Fig. 6(b). Fig. 6(a) illustrates the ability of IbMP to reproduce complex movements accurately, which can be observed at beginning and end of the demonstrated trajectory. Additionally, the generalization capability of the proposed    Fig. 7(a) and (b) respectively. The generated movement has successfully adapted to the position trajectories without an additional learning phase. Fig. 8 illustrates a motion simulation of the robot's tip while tracking one of the learned demonstrations using the Unity platform. In Unity, a kinematics model is built for the twosections continuum robot with the assumption that the velocities of the robot are low and no need to incorporate the robot's dynamics. The code is made publicly available for visualization https://github.com/elhussieny/continuum_robot.
Furthermore, IbMP has avoided collision successfully with a static and a dynamic sphere-shaped obstacles with different diameters that may exist within the pathway of the robot's tip before returning to the reference trajectory as shown in Fig. 9(a) and (b) respectively. It is worth mentioning that the position and velocity of obstacles are assumed here. However, in real dynamic environments, the motion capture system of LiDar scanner could be used to estimate these variables. However, there is no guarantee that the proposed IbMP approach can allow the whole robot to avoid obstacles. Due to the compliance behavior of continuum robots, it is acceptable to hit obstacles within reasonable ranges compared to rigid robots in many applications. In this regard, future plans will be proposed to apply the IbMP in actuation space to reliably generate similar shapes while considering the constraints of cable length.

C. Learning of the Tip Orientation Trajectories While Rejecting External Perturbations.
Similarly, IbMP has succeeded in reproducing the tip orientation trajectories based on equations (10), (11), and (13) Fig. 10(a). The error between the reference and actual orientation trajectories is shown in Fig. 10(b). Additionally, IbMP generates novel orientation trajectories by changing the goal orientation  Fig. 11. Fig. 12(a) illustrates the capability of IbMP in avoiding collisions and returning to the target pose after external perturbations. The error between the demonstrated and learned orientations after applying a step disturbance is shown in Fig. 12(b).

D. Robustness Analysis
Multiple simulation scenarios based on the Monte Carlo approach [29] were carried out to analyze the stability and performance of the IbMP approach in terms of a wide variation of the goal position. In each scenario, 200 MATLAB simulations were performed while changing the goal position randomly within acceptable uniform distributions [4,7,20] T ≤ [x, y, z cm] T ≤ [11,24,32] T . Due to limited space, only 10 simulations are evaluated and sorted based on Root Mean Square Error (RMSE) calculations. Based on the suggested gains K a 1 = 200, D a 1 = 50, figure 13 shows an insignificant change in the RMSE values during simulations while reaching the goal position, which implies the performance of the proposed IbMP. Subsequently, Monte Carlo simulation scenarios have been conducted to show the effect of changing the gain values (K a , D a ) in (4) on the robustness of the IbMP algorithm. As shown in Fig. 13, the gains (K a 1 , D a 1 ) give the lowest RMSE values which implies the best performance to reach to the goal position. While (K a 2 , D a 2 ) and (K a 6 , D a 6 ) result in worst performance. As noted, the Monte Carlo method could guide the direction of choosing the optimal gains that enhance the system performance.

E. The Performance of the Dynamic Impedance Control
The dynamic impedance control has successfully tracked the tip position trajectory as shown in Fig. 14(a). It is suggested  that the continuum robot's tip interacted with the environment during the period t = 5:6 s. For simplicity, this interaction is represented as a step disturbance of amplitude 2 cm, which means that the reference trajectory is changed to be x d = x d + 2. The error between the reference and actual trajectories is shown in Fig. 14(b).

VI. CONCLUSION
In this research article, the Imitation-based Motion Planning (IbMP) approach is developed for planning discrete complex movements of a two-section continuum robot. The human demonstrations are captured while moving the flexible interface, with similar kinematics to the main robot, in front of a motion capture system. The performance capability of IbMP is not limited to regenerating the captured demonstrations but extends to producing novel movements by changing the initial and goal positions, avoiding static and dynamic obstacles, and rejecting external disturbances with no need to relearn the weights. The robustness of the proposed IbMP has been successfully evaluated based on the Monte-Carlo algorithm that could suggest the direction of selecting the optimal gains for best performance. On the other hand, dynamic impedance control is developed to regulate the interaction forces between the robot and environment while smoothly tracking the reference position trajectory.
The dynamic model of two section continuum robot is derived based on a Lagrangian formulation with Taylor expansion to avoid singularities due to specific configurations. The compliance behavior of soft continuum robots, IbMP approach, and dynamic impedance control enhance the performance of robot in many applications, such as in the biomedical field that require a repeatable sequence of movements, high precision, and safety characteristics.
The IbMP may suffer from an oscillatory behavior, especially with volumetric obstacles. Additionally, a significant computational time is required to adjust all of the tuning gains. In future work, an obstacle avoidance approach based on a dynamic potential field with super-quadratic function will be developed to overcome the limitations of the current approach while allowing the whole robot to avoid volumetric obstacles not only in ambient space, but also in joint space. On the other hand, the Nonlinear Model Predictive Control (NMPC) will also be developed to improve the controller performance while applying constraints on the cable lengths and the application workspace volume. Furthermore, a practical validation of the IbMP approach with dynamic impedance control will be conducted in a dynamic environment.