Machine Learning-based Framework for Optimally Solving the Analytical Inverse Kinematics for Redundant Manipulators

Solving the analytical inverse kinematics (IK) of redundant manipulators in real time is a difficult problem in robotics since its solution for a given target pose is not unique. Moreover, choosing the optimal IK solution with respect to application-specific demands helps to improve the robustness and to increase the success rate when driving the manipulator from its current configuration towards a desired pose. This is necessary, especially in high-dynamic tasks like catching objects in mid-flights. To compute a suitable target configuration in the joint space for a given target pose in the trajectory planning context, various factors such as travel time or manipulability must be considered. However, these factors increase the complexity of the overall problem which impedes real-time implementation. In this paper, a real-time framework to compute the analytical inverse kinematics of a redundant robot is presented. To this end, the analytical IK of the redundant manipulator is parameterized by so-called redundancy parameters, which are combined with a target pose to yield a unique IK solution. Most existing works in the literature either try to approximate the direct mapping from the desired pose of the manipulator to the solution of the IK or cluster the entire workspace to find IK solutions. In contrast, the proposed framework directly learns these redundancy parameters by using a neural network (NN) that provides the optimal IK solution with respect to the manipulability and the closeness to the current robot configuration. Monte Carlo simulations show the effectiveness of the proposed approach which is accurate and real-time capable ($\approx$ \SI{32}{\micro\second}) on the KUKA LBR iiwa 14 R820.


Introduction
The inverse kinematics (IK) [40] solution is fundamental for many applications in robotics involving motion planning, e.g., point-to-point trajectory optimization [46,21], path-wise trajectory planning [19,12], dexterous grasping [15,33], and pick-and-place scenarios [36,31]. Solving the IK problem for a given target position in the task space yields the robot's configuration in joint space that satisfies the kinematic constraints [27].
In general, this algebraic problem can be solved for a manipulator with 6 degrees of freedom (DoF), see, e.g., [8], but is not generally applicable to kinematically redundant manipulators [37]. On the other hand, numerical methods, typically based on least-squares or pseudoinverse formulations, are widely employed, see, e.g., [48], for various kinematic structures due to their simplicity, low computing time, and their generality. However, these methods may converge to a local minimum that predominantly depends on the initial guess of the solution.
In contrast to the numerical IK, the analytical IK computes the exact solution, which is important for many industrial applications [50,13]. The computing time of the analytical IK solutions is much faster and real-time capable, compared to the numerical approach. While the analytical IK is only available for specific robot kinematics, most industrial robots are designed such that an analytical solution of the IK is available. Hence, the IK of 6-DoF industrial robots with a spherical wrist, non-offset 7 DoF S-R-S redundant manipulators [42,22], e.g., KUKA LBR iiwa 14 R820, but also the offset redundant manipulator, e.g., Franka Emika Panda [11], OB7 [32], and Neura Robotics LARA [29], can be solved analytically. These manipulators are often referred to as collaborative robots (Cobot). Typically, the analytical IK parameterizes the robot redundancy by additional (three) parameters, which are usually named redundancy parameters. Examples are [38] and [17] for the non-offset and offset redundant manipulator, respectively. Due to the redundant nature of these parameters, infinite sets of parameters exist in general yielding different IK solutions. However, finding the set of redundancy parameters which represents the best IK solution of a specific task is a non-trivial problem.
In this work, a learning-based framework to compute the optimal set of redundancy parameters for an analytical IK is proposed. This improves the computing performance for solving the IK problem, which is essential for highly dynamic tasks like catching objects in mid-flight or handing over objects between moving agents. These tasks are frequently solved using trajectory optimization where typically dynamic system constraints, state and control input constraints, and a target pose constraint are considered.
The target pose constraint is often formulated in the task space and for kinematically redundant robots, in which infinite joint configurations satisfy this constraint. Utilizing the analytical IK in the trajectory optimization problem allows to find the best target joint configuration for a specific task. On the other hand, computing time becomes an issue with this approach. Recently, [47] proposed a realtime capable closed-form solution for the KUKA iiwa 14 R820, where the authors minimize the joint velocities and accelerations while avoiding joint boundaries for a trajectory tracking task. This approach does neither consider the dynamic constraints nor the system state and input constraints in the trajectory optimization.
In addition, it is crucial that the target configuration is well chosen among the infinite solutions of the IK, i.e. close to the initial robot configuration and with high manipulability. For example in a dynamic handover of an object where the target is moving, see Fig. 1, it is advantageous to choose a target configuration that maximizes manipulability such that the robot end-effector can move to another target configuration with high agility. Moreover, choosing a target configuration that is close to the initial configuration of the robot can lead to high performance of the trajectory optimization with a high success rate. Including such criteria in the trajectory optimization is the motivation for the work in this paper.
To this end, a learning-based framework to include additional, application-specific criteria in the analytical IK of redundant robots is proposed. First, a database of 10 8 random pairs of initial configurations and target poses is generated. For each pair, the optimal trajectory is computed using classical approaches, considering applicationspecific criteria, and is stored in the database together with the set of optimal redundancy parameters. This database serves as the basis to train a neural network (NN), which is used to predict the optimal redundancy parameters for the analytical IK in highly dynamic real-time applications.
The main contribution of this work is a learning-based initial configuration target pose possible target configurations Figure 1: An example of a handover task between robot and human.
framework that employs a NN to predict the redundancy parameters of an analytical IK. This yields an optimal target joint configuration for a given target pose by considering application-specific criteria. In this work, the target joint configuration is chosen close to the current joint configuration of the robot and to have a high measure of manipulability. The proposed learning framework significantly speeds up the computing time of the trajectory optimization problem. Note that the proposed framework is tailored to the non-offset redundant manipulator KUKA LBR iiwa 14 R820. Nevertheless, it is also applicable to other kinematically redundant robots with an analytical IK, e.g., [11,32,29]. The remainder of this paper is organized as follows. Section 2 presents the mathematical modeling and analytical inverse kinematics. Additionally, details of the pointto-point (PTP) trajectory optimization problem and the algorithm for determining the optimal target joint configuration w.r.t. application-specific criteria are given. The learning framework for predicting the redundancy parameters for the analytical IK problem including database generation and the proposed NN is presented in Section 3. Simulation results are shown in Section 4. The last section, Section 5, concludes this work.

Trajectory Optimization Framework
This section presents the trajectory optimization framework which is commonly used in robotics [2]. For example, in Figure 1, with a given target pose and a robot's initial configuration, an optimal trajectory in joint space is planned for the robot to catch an object.
In this section, the mathematical modeling of the KUKA LBR iiwa 14 R820, including kinematics and system dynamics, is briefly summarized. Then, the analytical inverse kinematics with the redundancy parameters of this redundant manipulator is presented in Section 2.2. Subsequently, a point-to-point trajectory optimization is per- formed, which is used to plan trajectories to the optimal target configuration, explained in Section 2.4.

Mathematical modeling
The KUKA LBR iiwa 14 R820 is an anthropomorphic manipulator due to its similarity to a human arm, which has an S-R-S kinematic structure [42]. The coordinate frames O i and the corresponding seven revolute joints q i , i = 1, . . . , 7, of the robot are shown in Fig. 2. The red, green, and blue arrows represent the x-, y-, and z-axis, respectively. The shoulder intersection position p s of the joint axes q 1 , q 2 , and q 3 and the wrist intersection position p w of the joint axes q 5 , q 6 , and q 7 correspond to the shoulder and wrist positions of the human arm, respectively. The elbow position p e is in the center of the joint axis q 4 .
The robot is modeled as a rigid-body system with the generalized coordinates q T = [q 1 , q 2 , . . . , q 7 ], see Fig.  2, which are the rotation angles q i around the z-axes (blue arrows) of each coordinate frame O i , i = 1, . . . , 7. To describe the kinematic relationship between the joint angles and the pose of the robot links comprising position and orientation, the homogeneous transformations T m n between two adjacent frames O n and O m are constructed, see Tab.
1. Here and in the following, the homogeneous transformation of a simple translation by distance d along the local  axis j ∈ {x, y, z} is denoted by T D,j (d), while an elementary rotation around the local axis j ∈ {x, y, z} by the angle φ is described by T R,j (φ). The end-effector transformation matrix T e 0 is referred to as forward kinematics and is computed in the form comprising the 3D tip position p t ∈ R 3 and the 3D orientation of the end effector as rotation matrix R 7 0 ∈ R 3×3 . The equations of motion are derived using the Lagrange formalism, see, e.g., [40], where M(q) denotes the symmetric and positive definite mass matrix, C(q,q) is the Coriolis matrix, g(q) is the force vector associated with the potential energy, and τ are the motor torque inputs. The kinematic and dynamic parameters of the KUKA LBR iiwa in (2) are taken from [41].
Since the mass matrix M(q) is invertible, (2) is rewritten in the state-space forṁ with the system state x T = [q T ,q T ]. The kinematic and dynamic limits of the robot [23] are summarized in Tab.
2. All limits are symmetric w.r.t. zero, i.e. q i = −q i , q i = −q i , and τ i = −τ i . To reduce the complexity of the system dynamics (2), the vector of joint accelerationq is utilized as a new control input for planning a trajectory in Section 2.3, i.e., u = q = M −1 (q)(τ − C(q,q)q − g(q)). Hence, the system dynamics (3) is rewritten in the compact forṁ

Analytical inverse kinematics
Typically in manipulation tasks, the desired end-effector pose for a point-to-point motion is given in the 6D Cartesian space in the form, cf. (1) To compute the robot joint configuration q from a desired end-effector pose T e 0,d , the inverse kinematics (IK) of the robot has to be solved. In the following, an inverse kinematics solution with redundancy parmeters tailored to the non-offset 7-DoF robot KUKA LBR iiwa 14 R820 is shortly revisited, see, e.g., [38]. Similar to [22,10], the redundancy parameters of this robot are chosen as the binary vector j T c = [j s , j e , j w ] and the angle ϕ, which are introduced below.
With a given end-effector pose T e 0,d , the position of the robot wrist p w in the world frame is fixed and is computed as with the distance from the wrist point to the end effector of the robot d 7 + d t . The vector p sw from the fixed shoulder position p s = 0 0 d 1 + d 2 T to the wrist position p w is expressed as p sw = p w − p s . Using the law of cosines in the triangle formed by the shoulder, elbow, and wrist, the joint position q 4 is immediately calculated as where d se = d 3 +d 4 is the distance from the shoulder to the elbow and d ew = d 5 + d 6 is the distance from the elbow to the wrist. In (7), the binary redundancy parameter j e ∈ {−1, 1} distinguishes between the elbow-up and the elbow-down configuration. The constellation of the shoulder, elbow, and wrist position forms two triangles of which the sides have a constant length for a given end-effector pose. Further, these three points and the two triangles lie on a plane, denoted as arm plane, which can be rotated around the vector p sw resulting in two cones, see Fig. 3. Thereby, the elbow position p e always stays on the perimeter of the cone bases. As a result, the robot can perform self-motions by moving the elbow on this perimeter. To this end, an arm angle ϕ is introduced as a redundancy parameter, referring to the angle between a reference arm with the special configuration q 3,n = 0 (red lines in Fig. 3), and the actual arm plane (blue lines in Fig. 3). Here and in the following, the index n refers to the reference arm configuration. The actual elbow orientation R 4 0 is equivalent to rotating the orientation of the reference elbow orientation R 4 0,n about the shoulder-wrist vector p sw by ϕ, i.e.
with Rodrigues' formula [27] where I 3×3 is the identity matrix, and [a] × denotes the skew-symmetric matrix of the vector a. Since q 4 remains unchanged between the reference arm configuration and the actual arm configuration for a given end-effector pose, (8) leads to Note that R 3 0,n depends only on the joint angles q 1,n and q 2,n , since q 3,n = 0 in the reference configuration. The joint angles q 1,n and q 2,n , shown in Fig. 4, are simply found as q 1,n = arctan2(p sw,x , p sw,y ) (11a) with p T sw = [p sw,x , p sw,y , p sw,z ], and Note that R 3 0 and R 7 4 can be directly computed using (10b), (10c) and Tab. 1. Analytically, the rotation matrices R 3 0 and R 7 4 result from Tab. 1 in the form where the elements written as * are omitted for brevity. From (12), the joint angles of the redundant manipulator are computed in a straightforward manner where j s , j w ∈ {−1, 1} are the remaining binary redundancy parameters and R[i, j] is the matrix element of the i-th row and j-th column of R.
In summary, the parameterization of the inverse kinematics solution uses the three binary variables j T c = [j s , j e , j w ] and the arm angle ϕ in (7) and (13) as redundancy parameters to determine a unique joint configuration q for a desired end-effector pose T e 0,d . In Fig. 3, the blue lines illustrate a possible robot configuration with j c = [1, −1, 1] T that is rotated by ϕ = 95°from the reference arm plane, drawn with red lines. To this end, by combining (7) and (13), the unique analytical inverse kinematics of the KUKA LBR iiwa 14 reduces to the compact form with the redundancy parameters j c ∈ {1, −1} 3 and ϕ ∈ [0, 2π].

Point-to-point trajectory optimization
In the point-to-point (PTP) trajectory planning, a desired trajectory ξ * (t) = [x * (t), u * (t)] T , t ∈ [t 0 , t F ] for the robotic system (4) is planned from an initial con- The target configuration has to satisfy the forward kinematics relation for the desired endeffector pose T e 0,d , see (1) Without loss of generality, the initial time t 0 is chosen as t 0 = 0. Furthermore, the target configuration is assumed to be a stationary point The PTP trajectory planning is formulated as optimization problem using the direct collocation method, see, e.g., [3], by discretizing the trajectory ξ(t), t ∈ [0, t F ], with N +1 grid points and solving the resulting static optimization problem (17) denotes the optimal duration of the trajectory from the initial state x t0 to the target state x t F . In addition, R is a positive definite weighting matrix for the input u which also weighs the tradeoff between the cost of the duration and the smoothness of the trajectory. The system dynamics (4) is approximated by the trapezoidal rule in (16b). Moreover, x and x in (16d) denote the symmetric lower and upper bounds of the state, respectively, and (16e) considers the upper and lower torque limit τ and τ .
It should be noted that (16e) is a computationally expensive inequality constraint, mainly because of the large expressions in the Coriolis matrix C(q,q). Indeed, the Coriolis matrix is often neglected in industrial applications [4,44]. To still consider the influence of the Coriolis matrix C(q,q) for the torque limits, the range of values of the term C(q,q)q is investigated for the KUKA LBR iiwa 14 R820 using a Monte Carlo simulation. In this simulation, 10 8 uniformly distributed random state vectors x are selected from the admissible operating range, see Tab. N m, which is much smaller than the torque limits of the motor. Although the influence of the Coriolis matrix on the dynamics of the overall system is not significant, it is still advantageous to consider these physical limits in the optimization problem (16). To this end, the costly inequality condition (16e) is replaced by The optimal trajectory is computed by solving the static optimization problem (16a)-(16d) and (18)  2.4. Optimal target configuration q t F In this section, the optimal choice for the target configuration q t F is discussed. The inverse kinematics for a redundant robot does not yield a unique joint configuration q t F , as presented in Section 2.2. Moreover, choosing an unsuitable target configuration q t F may cause the trajectory optimization (16) to fail or deliver poor results.
For redundant robots, there is an infinite number of joint configuration solutions q t F for a desired end-effector pose T e 0,d . Therefore, two criteria for selecting the best inverse kinematics solution, i.e. the manipulability and closeness, are introduced in the following and an optimization problem is formulated.
First, the manipulability m(q) [49] is the most popular index used to measure the dexterity of a robot for a specific joint configuration q. It is defined as where the geometric manipulator Jacobian J(q) takes the form In (20), ω t is the angular velocity of the end effector described in the frame O 0 , which is computed by To reduce the computational burden of (19) due to the computation of the determinant, an analytical expression of the manipulability is derived, which is given in the appendix. Second, to consider the closeness between the inverse kinematics solution q and the initial joint configuration q 0 of the robot, the L ∞ -norm . ∞ is employed to find the largest deviation between these two joint space configurations. Here, the closeness is given by where q 0 is the initial joint configuration of the initial state x 0 = [q T 0 , 0]. Next, the two criteria (19) and (22) are considered in an optimization problem to choose the best target configuration q t F for a given target pose T e 0,d . To solve this problem, according to (14), the redundancy parameters of the inverse kinematics j c and ϕ have to be determined. Since there are three binary redundancy parameters in j c , 2 3 = 8 different values are contained in the set X jc = {j c,i | i = 1, ..., 8}. Additionally, the arm angle ϕ ∈ [0, 2π] is equidistantly discretized with the grid points The following optimization problem is solved to find the best target configuration q t F = q * i,j as well as its corresponding redundancy parameters j * c and the virtual angle ϕ * for the desired pose T e 0,d arg min with the user-defined weighting parameters ω m > 0 and ω c > 0. To compute an optimal trajectory ξ * from the current robot configuration, represented by q 0 , to the given desired pose T e 0,d , the optimization problem (23) is solved first to obtain the optimal solution q * i,j = q t F . Then, this optimal robot target configuration q t F is used in the PTP trajectory optimization (16). The block diagram of this process is illustrated in Fig. 5.

Framework for learning redundancy parameters
The cost function (23b) and the inverse kinematics (23c) are nonlinear and discontinuous functions with many local minima, which is illustrated on the right-hand side of Fig. 6 for an example joint configuration q. Therefore, to find the global optimum, the optimization problem (23) has to be solved by exhaustive search, which is a time-consuming process since (23c) has to be evaluated 8n ϕ times, see Fig. 5. To significantly reduce the computational effort for this step, a neural network (NN) is presented in this section to quickly determine the joint configuration j * c and narrow down the search space for the arm angle ϕ * for a desired end-effector pose T e 0,d and the given initial configuration q 0 .
First, the generation of the database to train the NN for learning the redundancy parameters j c and ϕ is introduced. Then, the network architecture of this NN is presented in the next step.

Database generation
For the database generation, N p pairs of robot initial joint configurations q 0,k and corresponding feasible desired poses T e,k 0,d are randomly selected from a uniform random distribution in the admissible ranges and are stored in the set X = {ζ k } = {q 0,k , T e,k 0,d k = 1, ..., N p }. For each pair of q 0,k and T e,k 0,d , the optimization problem (23) is solved by an exhaustive search to find the global optimum redundancy parameters j * c and ϕ * as well as the target configuration q * t F . The redundancy parameters are stored in the set Y = {η k } = {j c,k , ϕ k | k = 1, ..., N p }. The database D = (X , Y) comprises both sets X and Y. Elements from the set X are the input to the NN and elements from the set Y are the corresponding labeled outputs, see Fig. 7.
The input data in the set X contain redundant data due to the constant bottom row in the desired pose T e,k 0,d , see (5). Therefore, only the three basis vectors e x,k , e y,k , and e z,k , of R e,d 0 = [e x,k , e y,k , e z,k ] and the position of the end effector p t,k are considered in the set X . Thus, the input set X is re-arranged in the form Since (23b)-(23c) are discontinuous nonlinear functions, see Fig. 6, a complex NN is required to approximate these functions. However, the training and prediction time of such a neural network is very long, making it impossible to be implemented in real time. Thus, instead of directly predicting the virtual angle ϕ, only the range of this angle, denoted by the bin index b ϕ ∈ {1, ..., n b } with the total number of bins n b , is predicted. This way, the value of the bin index b ϕ indicates that the virtual angle ϕ is in .., n b . This helps to reduce the complexity and to realize the proposed NN for a real-time application. Consequently, ϕ k is replaced by its bin index b ϕ,k ∈ {1, ..., n b } in the set Y resulting in Y = η k = {j c,k , b ϕ,k }.

Network Architecture
The architecture of the proposed NN is shown in Fig.  8. This NN is designed for the two sub-problems, i.e., to initial configuration numerical solution learn the joint configuration j c and the bin index b ϕ of the arm angle ϕ. Note that the input of the proposed NN is ζ ∈ X and the output is the predicted value First, two fully connected layers of size 32 with a ReLU activation function [24] are utilized, as shown in Fig. 8. Since there are 8 possibilities for choosing j c , a fully connected layer of size 8 with a softmax activation function [5] is employed to output j c . The cross-entropy function is used to compute the loss between the prediction j c,k of the NN and the target value j c,k in the form where M is the size of the training dataset.
Second, the predictedĵ c is concatenated with the input ζ again as a new input for the second subproblem. Similar to the first subproblem, two fully connected layers of size 32 with a ReLU activation function are used. Subsequently, the fully connected layer of size 8 and the softmax activation function are implemented to predict the bin index b ϕ of the arm angle ϕ. Again, the cross entropy function is used to compute the loss between the predicted value of the bin indexb ϕ and the target value b ϕ The proposed NN is trained by using the Adam [20] optimizer with the learning rate of α = 10 −3 . Furthermore, the L 2 regularization [20] with λ = 10 −6 is added to both loss functions L jc and L bϕ . This helps to avoid overfitting [28].

Results
The simulation results presented in this section are obtained on a computer with a 3.4 GHz Intel Core i7-10700K and 32 GB RAM. The generated database with N p = 10 8 pairs described in Section 3.1 is randomly shuffled and divided into 3 subsets, i.e., training dataset, validation dataset, and test dataset, which are partitioned as 80%, 10%, and 10% of the generated database, respectively. To speed up the computing time of database generation, C++ code is generated for (23) using MAT-LAB coder in MATLAB R2021b. Additionally, the analytical expression of the manipulability in the appendix (29) is utilized. The remaining parameters are chosen as n ϕ = 100 and n b = 8. For the database generation, the average computing time of (23) for a given pose and initial joint configuration is approximately 1.5 ms. Since n ϕ in (23) is set to 100, the average computing time of the analytical inverse kinematics expression in (23c) is approximate 1.8 µs.

Statistical information on training the proposed NN
The proposed NN is trained using the open-source software package Keras [14]. To reduce the training time, the CUDA cores of an Nvidia GeForce RTX 3070 are employed. During training, the mini-batch size is set to 2000 and the training data is reshuffled in each epoch. Fig. 9 shows that the learning accuracy for the joint configuration j c of the training dataset and the validation dataset after 500 epochs reaches 96.62% and 95.76%, respectively. Also, the corresponding values of the loss function L jc decreased to 0.1034 and 0.1264, respectively. To further validate the training result, the accuracy of the test dataset with the trained parameters of the proposed NN is approximately 96.49%.  For further validation, the proposed NN is compared to the performance of four well-known algorithms, i.e., naive Bayes classifier [16], discriminant analysis classifier [43], binary decision tree classifier [26] and k-nearest neighbor classifier [18]. Similar to the proposed NN, each classifier takes the input ζ ∈ X and outputs the prediction η ∈ Y. The statistical performance of the four algorithms and the proposed NN is shown in Tab. 3. Among the above algorithms, the binary decision tree classifier achieves the highest prediction accuracy for j c and b ϕ , i.e., 77.8% and 65.5%, respectively. Moreover, the average execution time of this classifier is approximately 0.35 µs, i.e., the fastest algorithm. However, the prediction accuracy of the proposed NN is still significantly higher compared to the binary decision tree classifier. Another aspect is the memory consumption, which is with 0.17 MB much less compared to over 70 MB for each of the four other algorithms. This is reasonable since in the proposed NN, the memory consumption is mainly used for storing the network parameters. The average NN execution time for the prediction of j c and b ϕ is about 7.35 µs. Also, the average computing of the analytical inverse kinematics with the given j c and b ϕ is about 2 µs. Thus, the proposed NN provides the possibility to compute a good IK solution with respect to the two criteria, i.e., manipulability (19) and closeness (22), in real time.

Performance of the proposed NN in the framework of trajectory optimization
To verify the efficiency of the proposed NN, the example task of planning a PTP trajectory from the initial configuration is considered. First, the comparison between the well-known damped least-squares inverse kinematics solution [27,6] and the proposed algorithm is depicted in Fig. 6. On the righthand side of Fig. 6, a color map of (23b) is depicted where the x-axis comprises the 8 possible joint configurations j c ∈ X jc and the y-axis contains the n ϕ = 100 arm angles ϕ ∈ X ϕ . Using the network architecture of Fig. 8 with n b = 8, the proposed NN takes about 7.35 µs to predict the joint configuration j c = [−1, −1, −1] T and the bin b ϕ = 3, i.e. ϕ ∈ [π/2, 3π/4]. To find the optimum value for the arm angle ϕ inside the predicted bin, (14) and (23a) are evaluated on an equidistant grid for ϕ ∈ [π/2, 3π/4] with n ϕ /n b grid points. This way, the effort to solve the optimization problem (23) reduces from 8n ϕ = 800 to n ϕ /n b ≈ 13 evaluations of (14) and (23b). Since the analytical manipulability expression (29) is used in (23b), the computing time of (23b) is approximately 0.15 µs, which is much smaller than (14). Thus, the total execution time for computing the optimal target configuration q t F is approximately 32 µs including the computing time of (14) of 2 µs. On the other hand, the damped leastsquares method in this example requires 17 iterations to find the solution of the inverse kinematics with a tolerance of 10 −8 . The computing time of the numerical method is approximately 3 ms.
On the left-hand side of Fig. 6, the computed target configurations for the given desired target pose T e 0,d are To further demonstrate the effectiveness of the proposed IK approach, the two target configurations (27) and (28) are used in the trajectory optimization framework detailed in Section 2. The nonlinear optimization problem (16) is solved using the interior point solver IPOPT [45] together with the linear solver MA27 [9]. To increase the computational speed, the gradient and the numerical Hessian are computed using the BFGS method [25] and provided to the IPOPT solver. The trajectory in (16) is discretized with N = 50 collocation points, giving a total of 1051 optimization variables. For this comparison, the same initial configuration q 0 and two different target configurations q t F ,A according to (27) and q t F ,N according to (28) of the pose T e 0,d from (26) are used. While the computing time of the optimization (16) for both target configurations is almost the same (55 ms), the time for moving to the target configuration of the proposed algorithm q t F ,A is 3.83 s compared to 4.03 s of the numerical solution q t F ,N . Moreover, the cost function in (16a) with q t F ,A and q t F ,N is 4.7 and 5.03, respectively. The optimal trajectories ξ A and ξ N corresponding to the target configurations (27) and (28), respectively, are validated on the experimental setup depicted in Fig. 11. This experimental setup comprises two main components, i.e. the robot KUKA LBR iiwa R820 and the PC. The PC communicates with the robot via a network interface card (NIC) using the EtherCAT protocol. The computed torque controller is implemented as MATLAB/Simulink module, which is executed via the real-time automation software Beckhoff TwinCAT. The sampling time T s = 125 µs is used for the robot sensors and actuators. The scaled joint position, velocity, and torque for all robot axes, normalized to their respecting limits for the two optimal trajectories ξ A and ξ N and the corresponding measurements from the experiments are shown in Fig. 12. Note that in this figure, the trajectories do not exceed the value ±1, which means that all state and input constraints in (16d) and (18) are respected. The travel time of the trajectory from the solution of the proposed NN (≈ 3.9 s) is slightly shorter than that from the numerical IK (≈ 4.1 s). Since the proposed NN is designed to select the configuration that is closer to the robot's initial configuration via (22), the motion ranges of joints 6 and 7 of ξ A are much smaller than the corresponding ranges of ξ N . Consequently, this could lead to a more time-efficient optimal trajectory. A video of several experiments for comparison can be found in the supplementary material in https://www.acin.tuwien.ac.at/en/360e/.
Finally, a Monte Carlo simulation is performed to validate the efficiency of the proposed NN in the PTP trajectory optimization. To this end, 10 5 pairs of initial robot configurations q 0 and target poses T e 0,d are randomly selected from a uniform random distribution in the admissible ranges. Then, the proposed NN and the numerical IK are used to determine the target joint configuration and for each target configuration, an optimal trajectory is calculated using (16). The statistical results are summarized in Tab. 4. While the computing times of (16) utilizing the target configuration of the proposed algorithm q t F ,A and the numerical IK q t F ,N are nearly the same (≈ 30 ms),  Figure 12: Joint position, velocity, and torque for all robot axes, normalized to their respective limits, referred to with the bar symbol, for optimal trajectories ξ A and ξ N of the numerical IK and the proposed NN algorithm, respectively. The desired trajectories are shown as solid lines and the measured trajectories are drawn as dashed lines. For safety reasons, the limits for the motor torques are 50% lower than the limits in Tab. 2.
the average optimal trajectory time using the proposed algorithm is slightly better, i.e. 4.52 s compared to 5.39 s.
Since the solution of the numerical IK depends on the initial guess, 13896 test cases fail to converge to feasible target configurations. Note that the maximum number of iterations for the numerical IK is 50. Additionally, after excluding 13896 failed cases, 1588 test cases are not valid to plan the PTP trajectory using (16). Note that these test cases fail because of violating the iteration limit, i.e., 100 iterations, which is set in the IPOPT solver. The overall success rate by using the numerical IK is approximately 84.5%. On the other hand, for the proposed algorithm, in all the test cases, a feasible target configuration is found. Only 554 test cases fail during the planning of the PTP trajectory due to the iteration limit of the IPOPT solver. Overall, the proposed NN outperforms the numerical IK by achieving a success rate of 99.5%.

Conclusions
In this work, a machine learning-based approach for the inverse kinematics (IK) of kinematically redundant robots is presented, which is suitable for trajectory planning in highly dynamic real-time applications like humanrobot object handovers or robotic object catching. In this approach, the optimal redundancy parameters are predicted by a neural network (NN) according to the applicationspecific criteria, closeness to the initial robot configuration and manipulability at the target pose. Redundancy parameters, i.e. a virtual arm angle and binary variables describing the joint configurations, resolve the non-uniqueness of the analytical IK of redundant robots and allow for a unique mapping between the target pose and the joint configuration. Since a NN is employed, the proposed framework can be applied to different collaborative robots, e.g., KUKA LBR iiwa 14 R820, Franka Emika Panda, OB7, of which the analytical IK can be parameterized by redundancy parameters. The NN used in the proposed framework outperforms classical classification algorithms in terms of accuracy and the prediction run time. A Monte Carlo simulation of 10 5 random pairs of an initial configuration and a target pose validates the proposed algorithm in the context of point-to-point (PTP) trajectory optimization. The proposed method succeeds in 99.5% of the test cases to find a feasible target configuration while achieving a shorter optimal time of the trajectory from the initial to the target pose on average compared to using a numerical IK method at a significantly shorter computing time (≈ 32 µs for the proposed IK compared to ≈ 3 ms for the numerical IK). Thus, the proposed framework is perfectly suited for real-time applications.
In future works, this machine learning-based framework will be applied to dynamic human-robot handover tasks.