ARKOMA dataset: An open-source dataset to develop neural networks-based inverse kinematics model for NAO robot arms

The inverse kinematics plays a vital role in the planning and execution of robot motions. In the design of robotic motion control for the NAO robot arms, it is necessary to find a proper inverse kinematics model. Neural networks are a data-driven modeling technique flexible enough to model the inverse kinematics. This inverse kinematics model can be obtained by training neural networks with a dataset; the training process cannot be carried out without one. Therefore, the contribution of this research is to provide a dataset for developing a neural networks-based inverse kinematics model for the NAO robot arms. The dataset that we created in this paper is named ARKOMA, an acronym for ARif eKO MAuridhi, the creators of this dataset. The dataset contains 10,000 input-output data pairs in which the end-effector position and orientation are the input data and a set of joint angular positions is the output data. For further application, the dataset was split into three subsets: a training dataset, a validation dataset, and a testing dataset. From the set of 10,000 data points, 60% was allocated to the training dataset, 20% to the validation dataset, and the remaining 20% to the testing dataset. The dataset provided in this paper can be applied to the NAO H25 v3.3 or later.



Value of the Data
• In the design of robotic motion control for the NAO robot arms, it is required to obtain a proper inverse kinematics model. Inverse kinematics is a highly nonlinear problem that is frequently difficult to solve analytically. This problem can be overcome with neural networks. The dataset provided in this paper can be utilized by NAO robot developers to develop a neural networks-based inverse kinematics model.
• This dataset contains input-output data pairs, so it is well suited to supervised learning. Supervised learning is a paradigm in which neural networks are trained on labeled data. Labeled data means that each input has its corresponding output or desired target; in other words, labeled data is a dataset comprising input-output data pairs. Supervised learning covers different neural network architectures, such as multi-layer perceptron neural networks (MLPNN), convolutional neural networks (CNN), and radial basis function neural networks (RBFNN). Using this dataset, NAO robot developers can train different neural network architectures to obtain the inverse kinematics model.
• This dataset was divided into three subsets: a training dataset, a validation dataset, and a testing dataset. The training dataset is used to train the neural networks. The validation dataset is utilized to validate the performance of the neural networks during the training process, while the testing dataset is employed after training to test the performance of the trained neural networks. Therefore, the dataset provided in this paper can be applied not only to train neural networks but also to validate and test the trained neural networks-based inverse kinematics model for the NAO robot arms.

Objective
Inverse kinematics plays a significant role in various applications related to robot motions, because the planning and execution of appropriate robot motions are inseparable from it. It is important to note that most tasks to be performed by a robotic arm are defined in Cartesian space. To perform manipulation tasks such as grasping objects, pick-and-place, and path tracking, the control inputs are given in terms of the desired position and orientation of the robot end-effector in Cartesian space. Manipulation tasks in industrial manufacturing, for welding, painting, and assembly robots, are also specified in Cartesian space. However, what we can directly control in the robotic arm are its joint actuators, which operate in the joint space. Therefore, the only way to bridge this gap is through the inverse kinematics. The inverse kinematics deals with the mapping from the robot end-effector operational space, which is the Cartesian space, to the robot joint space. Specifically, it is used to find a set of joint angular positions from the given position and orientation of the robot end-effector. The joint angular positions obtained from the inverse kinematics are required to control the joint actuators so that the robot end-effector can be driven to the desired targets in Cartesian space [1].
Due to its nonlinearity, solving the inverse kinematics is a complex and challenging task. In general, the inverse kinematics can be solved by analytical or computational methods. The analytical methods depend highly on the robot's kinematic structure, so they are applicable only to certain robot structures. In the case of robotic arms, Pieper's criterion states that the inverse kinematics can be solved analytically when the robotic arm contains a spherical wrist whose three revolute joint axes intersect at a common point [2]. Otherwise, solving the inverse kinematics analytically becomes difficult. Unfortunately, the kinematic structure of the NAO robot arms does not meet that criterion. Computational methods are more flexible in solving the inverse kinematics because they do not depend on the robot's kinematic structure. Metaheuristic algorithms are computational methods frequently used for this purpose; algorithms such as particle swarm optimization [3], grey wolf optimization [4], and the firefly algorithm [5] have been widely applied to the inverse kinematics of various robots. However, they are stochastic optimization methods whose convergence to the desired solution depends heavily on the random initialization. These methods may not find an optimal solution when the initial guess is inappropriate; in that case, the objective function of the metaheuristic algorithm is vulnerable to being trapped in local minima, and the obtained inverse kinematics solution becomes unacceptable. Hence, to overcome the remaining problems of the aforementioned methods, we can use neural networks. Neural networks have the capability to learn from patterns in data through a process called training [6]. This capability makes neural networks more flexible for solving complex problems than the aforementioned methods.
Neural networks are a data-driven modeling technique, so they require a dataset to create the inverse kinematics model. The presence of a dataset is a fundamental requirement, a prerequisite for the use of neural networks. For that reason, we are motivated to create the dataset and disseminate it through this paper. The dataset provided in this paper can be widely used by NAO robot developers to build a neural networks-based inverse kinematics model for the NAO robot arms. Through this dataset, neural networks can learn from patterns in the data to find the relationship model between the inputs and the outputs. In this dataset, the inputs are the robot end-effector position and orientation, whereas the outputs are a set of joint angular positions. Based on the number of input and output parameters in the dataset, the neural networks-based inverse kinematics model will have six neurons in the input layer and five neurons in the output layer, as shown in Fig. 1. Between the input and output layers, there are hidden layers. To determine the number of hidden layers and the neurons in each of them, it is necessary to train the neural networks by feeding the dataset into them [6]. Through the training process, we can adjust the networks' hyperparameters to find the optimal number of hidden layers and neurons. As seen in Fig. 2, the training process must be conducted until the loss function, in terms of the mean squared error (MSE), approaches zero.
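As a sketch of this training setup, the 6-input, 5-output network could be trained as follows. The paper does not prescribe a particular framework, so scikit-learn's MLPRegressor and random placeholder data (standing in for the actual ARKOMA dataset) are assumptions here; the hidden-layer sizes are one candidate to be tuned against the validation set.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Placeholder data standing in for the ARKOMA dataset:
# 6 input features (Px, Py, Pz, Rx, Ry, Rz) -> 5 outputs (theta1..theta5).
rng = np.random.default_rng(0)
X = rng.random((200, 6))
y = rng.random((200, 5))

# One candidate architecture; the number of hidden layers and neurons is a
# hyperparameter to be adjusted during training, as described above.
model = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=300, random_state=0)
model.fit(X, y)
pred = model.predict(X)  # shape (200, 5): one joint-angle vector per pose
```

MLPRegressor minimizes a squared-error loss, matching the MSE criterion mentioned above.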

Data Description
The dataset that we published in this data repository [7,8] contains input-output data pairs. In this dataset, the input data is the end-effector position and orientation, while the output data is a set of joint angular positions. The end-effector position and orientation are in Cartesian space, and the joint angular positions are in the joint space. For the input data, there are six input parameters: the end-effector position is denoted by the position vector [Px, Py, Pz], and the end-effector orientation is expressed by the rotation vector [Rx, Ry, Rz]. It is important to note that the end-effector position refers to the position of the NAO's hand with respect to the torso, and the end-effector orientation refers to the orientation of the NAO's hand frame relative to the torso's coordinate frame. The torso, which is located in the center of the NAO robot's body, is defined as the base coordinate frame. The input parameters for the inverse kinematics are described in Table 1.

Table 1
The input parameters for the inverse kinematics.

Px  The end-effector position with respect to the torso's x-axis
Py  The end-effector position with respect to the torso's y-axis
Pz  The end-effector position with respect to the torso's z-axis
Rx  The end-effector orientation relative to the torso's x-axis
Ry  The end-effector orientation relative to the torso's y-axis
Rz  The end-effector orientation relative to the torso's z-axis

Meanwhile, for the output data, there are five output parameters. The number of output parameters for the inverse kinematics equals the number of joints. Physically, the NAO robot has five revolute joints in each arm: the shoulder pitch (θ1), shoulder roll (θ2), elbow yaw (θ3), elbow roll (θ4), and wrist yaw (θ5). These revolute joints have different operational ranges, all expressed in radians, as shown in Table 2 [9].

Experimental Design
The dataset plays a pivotal role in the implementation of artificial intelligence technologies, one of which is neural networks; its presence is essential to build them. Due to this important role, we are motivated to create a dataset for developing the neural networks-based inverse kinematics model for the NAO robot arms. This dataset can serve as a means of establishing robotic motion control for the NAO robot arms. The experimental design used to generate the dataset is illustrated in Fig. 3. The sequential steps are as follows: (1) data acquisition, (2) modeling the forward kinematics of the NAO robot arms, (3) matrix decomposition, (4) feature extraction, (5) aggregation, and (6) data allocation.

Data acquisition
The NAO robot is physically similar to the human body structure because it contains a head, two arms, and two legs. This humanoid robot is available in two models: the NAO H21 and the NAO H25. Compared with the NAO H21, the NAO H25 has more degrees of freedom (DOF): in total, the NAO H21 has 21 DOF, while the NAO H25 has 25 DOF distributed over its head, two arms, and two legs. The main difference between these two models lies in the number of joints and the type of end-effector on their arms; for the other parts of the body, the number of joints and the type of end-effector are the same. For this reason, our discussion focuses on the robotic arms of the NAO H25. The NAO robot version that we used for the data acquisition is the NAO H25 v3.3.
The NAO robot arms kinematically consist of a series of rigid links connected by revolute joints. The assembly of rigid links connected by revolute joints forms a kinematic chain [10], and the part located at the end of the kinematic chain is the so-called end-effector. For each arm, the NAO H25 has five revolute joints and a prehensile hand as the end-effector [9]. The movement of the end-effector in the workspace is driven by the revolute joints, each of which can move independently of the others. By varying the angular positions of these revolute joints, we can create various robotic arm poses. In the data acquisition, we generated 10,000 robotic arm poses by using the Choregraphe software. Fig. 4 shows a data sample of the robotic arm poses. To avoid data redundancy, these robotic arm poses should be unique.

A set of joint angular positions
From the data acquisition, we obtained 10,000 robotic arm poses. As previously mentioned, the robotic arm poses are created from various joint configurations. Therefore, by extracting the robotic arm poses, we can obtain the data containing a set of joint angular positions of the NAO robot arms. The data samples for the joint angular positions are presented in Table 3 and Table 4, respectively. These tables have fields (columns) and records (rows): the fields represent the joints of a robotic arm, while the records are sets of joint angular positions. The joint angular positions provided in this dataset are in radians.

Forward kinematics
Forward kinematics is concerned with the mapping from the joint space to the end-effector operational space, which is the Cartesian space; it is also referred to as direct kinematics. The forward kinematics has the specific function of calculating the end-effector position and orientation in Cartesian space from a given set of joint angular positions. Because of its important role, the forward kinematics must be formulated correctly. The way to formulate the forward kinematics of the NAO robot arms has been fully explained in our previous research [1]. In short, the forward kinematics can be obtained by multiplying a set of transformation matrices, each of which results from the transformation between two adjacent coordinate frames, from the torso as the base coordinate frame to the hand as the end-effector. The forward kinematics is typically expressed as a homogeneous transformation matrix, as shown in Eq. (1).
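In generic form (a reconstruction of the standard structure of such a chain, not the NAO-specific entries of Eq. (1)), the homogeneous transformation from the torso to the hand is the product of the five per-joint transformations:

```latex
T^{\mathrm{torso}}_{\mathrm{hand}}
= A_1(\theta_1)\,A_2(\theta_2)\,A_3(\theta_3)\,A_4(\theta_4)\,A_5(\theta_5)
= \begin{bmatrix}
R_{3\times 3} & p_{3\times 1} \\
0_{1\times 3} & 1
\end{bmatrix}
```

where each A_i is the transformation between two adjacent coordinate frames, R is the rotation block, and p is the translation block.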

Matrix decomposition
As mentioned in Section 5.1, the result of the forward kinematics is generally expressed as a homogeneous transformation matrix. This transformation matrix consists of 12 main elements that denote the end-effector position and orientation in Cartesian space. The transformation matrix can then be decomposed into a 3×1 translation matrix and a 3×3 rotation matrix, as shown in Eqs. (2)-(3). The translation matrix describes the end-effector position with respect to the torso, while the rotation matrix indicates the end-effector orientation relative to the torso's coordinate frame.
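This decomposition takes the standard form: the 12 main elements of the homogeneous transformation split into the 3 translation elements and the 9 rotation elements:

```latex
T = \begin{bmatrix}
r_{11} & r_{12} & r_{13} & p_x \\
r_{21} & r_{22} & r_{23} & p_y \\
r_{31} & r_{32} & r_{33} & p_z \\
0 & 0 & 0 & 1
\end{bmatrix},
\qquad
p = \begin{bmatrix} p_x \\ p_y \\ p_z \end{bmatrix},
\qquad
R = \begin{bmatrix}
r_{11} & r_{12} & r_{13} \\
r_{21} & r_{22} & r_{23} \\
r_{31} & r_{32} & r_{33}
\end{bmatrix}
```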

Feature extraction
From the matrix decomposition, we obtained the translation matrix and the rotation matrix. The translation matrix has three elements, each of which denotes the end-effector position in three-dimensional Cartesian space. Meanwhile, the rotation matrix contains nine elements, of which only three are independent [11]. This indicates that the rotation matrix gives a redundant description of the end-effector orientation in three-dimensional Cartesian space. For this reason, we require feature extraction. The objective of feature extraction is to reduce the number of features in the data by creating new features from the existing ones and then eliminating unnecessary ones. In this case, feature extraction is carried out by transforming the rotation matrix into a rotation vector, which can be done with Eqs. (4)-(6) [12,13]. Alternatively, to make the transformation from the rotation matrix to the rotation vector easier, we can use the SciPy library [14]. In general, the end-effector orientation represented by the rotation matrix contains nine parameters; the drawback of this representation is that so many parameters make it difficult to interpret. As a result of the feature extraction, the end-effector orientation can be represented by only three parameters.
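The SciPy route can be sketched as follows, using SciPy's `Rotation.from_matrix` and `as_rotvec`. The example matrix below is a rotation of π/2 about the z-axis, chosen for illustration rather than taken from the NAO data:

```python
import numpy as np
from scipy.spatial.transform import Rotation

def rotation_matrix_to_rotvec(R):
    """Convert a 3x3 rotation matrix to a rotation vector [Rx, Ry, Rz]."""
    return Rotation.from_matrix(R).as_rotvec()

# A rotation of pi/2 about the z-axis: its rotation vector is [0, 0, pi/2],
# i.e. the rotation axis scaled by the rotation angle in radians.
Rz = np.array([[0.0, -1.0, 0.0],
               [1.0,  0.0, 0.0],
               [0.0,  0.0, 1.0]])
rotvec = rotation_matrix_to_rotvec(Rz)
```

The rotation vector keeps only the three independent orientation parameters, which is exactly the reduction the feature extraction step is after.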

Aggregation
The end-effector position in three-dimensional Cartesian space is denoted by three parameters [Px, Py, Pz], and the end-effector orientation is likewise denoted by three parameters [Rx, Ry, Rz]. Accordingly, the total number of parameters representing the end-effector position and orientation in three-dimensional Cartesian space is six.

The end-effector position and orientation
The experiment was conducted to create the dataset. From the data acquisition explained in the previous section, we obtained data composed of sets of joint angular positions. In addition, we also require the end-effector position and orientation data. Using the forward kinematics, we can generate this data by feeding the sets of joint angular positions into Eq. (1). Because the result contains a redundant description of the end-effector orientation, we conducted feature extraction, which aims to reduce the redundancy and give a simpler representation of the end-effector orientation. The feature extraction was conducted by transforming the rotation matrix into a rotation vector using Eqs. (4)-(6). In this way, the end-effector position and orientation can be represented by only six parameters, each of which is independent. From the experimental result, we acquired the data comprising the end-effector position and orientation. The data samples for the end-effector position and orientation of the NAO robot arms can be seen in Tables 5 and 6.
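The whole chain of steps (forward kinematics, decomposition into translation and rotation, rotation-vector extraction, and aggregation into one six-parameter row) can be sketched with a placeholder serial chain. The joint transforms and link lengths below are illustrative assumptions, not the NAO's actual kinematic parameters:

```python
import numpy as np
from scipy.spatial.transform import Rotation

def rot_z(theta):
    """Homogeneous transform: rotation by theta about the z-axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0, 0.0],
                     [s,  c, 0.0, 0.0],
                     [0.0, 0.0, 1.0, 0.0],
                     [0.0, 0.0, 0.0, 1.0]])

def trans_x(length):
    """Homogeneous transform: translation by `length` along the x-axis."""
    T = np.eye(4)
    T[0, 3] = length
    return T

def pose_row(thetas, link_lengths):
    """Map joint angles to one dataset row [Px, Py, Pz, Rx, Ry, Rz]."""
    T = np.eye(4)
    for theta, length in zip(thetas, link_lengths):
        T = T @ rot_z(theta) @ trans_x(length)   # forward kinematics, Eq. (1) analogue
    p = T[:3, 3]                                 # translation part, Eq. (2) analogue
    r = Rotation.from_matrix(T[:3, :3]).as_rotvec()  # rotation vector, Eqs. (4)-(6) analogue
    return np.concatenate([p, r])                # aggregation into six parameters

# With all joint angles zero, the chain is fully extended along the x-axis.
row = pose_row(np.zeros(5), [0.1, 0.1, 0.1, 0.05, 0.05])
```

Running `pose_row` over every acquired joint configuration would produce the input side of the input-output pairs.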

Table 5
The end-effector position and orientation of the NAO's left arm.

Data allocation
The dataset to build the neural networks-based inverse kinematics model for the NAO robot arms consists of 10,000 input-output data pairs, where the input data is the end-effector position and orientation and the output data is a set of joint angular positions. This dataset should then be split into three partitions: a training dataset, a validation dataset, and a testing dataset [6], each with a different function. The training dataset is used for training the neural networks. The validation dataset is applied to validate the networks' performance during the training process; it can also be used to tune the networks' hyperparameters. Meanwhile, the testing dataset is employed post-training to test how well the trained neural networks perform. As depicted in Fig. 5, from the 10,000 data points, 60% were allocated to the training dataset, 20% to the validation dataset, and the other 20% to the testing dataset. Fig. 6 presents the result of the data allocation for the NAO's left arm. This dataset contains input-output data pairs with six input parameters and five output parameters. In the input parameters, the end-effector position is denoted by the position vector [Px, Py, Pz], and the end-effector orientation is represented by the rotation vector [Rx, Ry, Rz]. The output parameters are the left shoulder pitch joint (θ1), left shoulder roll joint (θ2), left elbow yaw joint (θ3), left elbow roll joint (θ4), and left wrist yaw joint (θ5), respectively. As a result of the data allocation, 6,000 data points were provided for the training dataset, 2,000 for the validation dataset, and the remaining 2,000 for the testing dataset.
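The 60/20/20 allocation can be sketched as follows, using random placeholder rows in place of the ARKOMA data:

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.random((10000, 11))  # placeholder: 6 inputs + 5 outputs per row

# Split 10,000 rows at indices 6,000 and 8,000 -> 60% / 20% / 20%.
n = len(data)
train, val, test = np.split(data, [int(0.6 * n), int(0.8 * n)])
```

This yields 6,000 training rows, 2,000 validation rows, and 2,000 testing rows, matching the allocation in Fig. 5.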
Meanwhile, the result of the data allocation for the NAO's right arm can be seen in Fig. 7. This dataset is likewise composed of input-output data pairs with six input parameters and five output parameters. In the input parameters, the end-effector position is denoted by the position vector [Px, Py, Pz], while the end-effector orientation is expressed by the rotation vector [Rx, Ry, Rz]. The output parameters include the right shoulder pitch joint (θ1), right shoulder roll joint (θ2), right elbow yaw joint (θ3), right elbow roll joint (θ4), and right wrist yaw joint (θ5). From the data allocation, 6,000 data points can be used for the training dataset, 2,000 for the validation dataset, and the remaining 2,000 for the testing dataset. The dataset generated in this research is entirely in CSV (Comma Separated Values) format, a file format widely used for saving data. A CSV file is tabular data saved as plain text, with commas separating the values, and the dataset in this paper is tabular data of this kind. As mentioned earlier, this dataset contains the input-output data pairs and, for further applications, is divided into three partitions: training, validation, and testing. The dataset is entirely stored in CSV files whose names can be seen in Fig. 8. Files saved in CSV format have a .csv extension. One of the main advantages of storing data in CSV format is that a CSV file can be opened and viewed in many applications, such as Notepad, Microsoft Excel, Google Spreadsheets, and so on. Moreover, because a CSV file is plain text, it requires less storage capacity than other file formats.
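Loading such a CSV file into input and output arrays can be sketched with pandas. The column names below are hypothetical stand-ins (the actual headers are defined by the ARKOMA files themselves), and an in-memory string replaces a file on disk so the example is self-contained:

```python
import io
import pandas as pd

# Hypothetical header and one sample row; real files would be read with
# pd.read_csv("<filename>.csv") using the names shown in Fig. 8.
csv_text = (
    "Px,Py,Pz,Rx,Ry,Rz,theta1,theta2,theta3,theta4,theta5\n"
    "0.12,0.10,0.05,0.00,0.10,-0.20,0.50,0.30,-1.00,-0.40,0.00\n"
)
df = pd.read_csv(io.StringIO(csv_text))
X = df[["Px", "Py", "Pz", "Rx", "Ry", "Rz"]].to_numpy()                # network inputs
y = df[["theta1", "theta2", "theta3", "theta4", "theta5"]].to_numpy()  # network targets
```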
This dataset consists of input-output data pairs, so it is specifically suited to supervised learning, a paradigm in which neural networks are trained on a set of labeled data. Labeled data means that each input has its corresponding output or desired target. In supervised learning, the objective is to find the relationship model between the inputs and the outputs. In this dataset, the inputs are the end-effector position and orientation in three-dimensional Cartesian space, and the outputs are a set of joint angular positions in the joint space. The relationship model resulting from this learning is called the neural networks-based inverse kinematics model. The amount of data provided in this paper is sufficient to build this model for the NAO robot arms. In supervised learning, where the dataset contains input-output data pairs, the amount of required data is typically not as large as in unsupervised learning, because the labeled data provides guidance in the learning process. This guidance reduces ambiguity, making it possible to achieve good performance with less data. In contrast, unsupervised learning typically requires more data to achieve good results because it has to learn from unlabeled data.

Limitations
The dataset provided in this paper is compatible with the NAO H25 v3.3 or later; it is incompatible with earlier versions.

Fig. 2. Training process to find the neural networks-based inverse kinematics model.

Fig. 4. The data sample of the NAO robot arm poses.

Table 2
The output parameters for the inverse kinematics.

Table 3
A set of joint angular positions extracted from the NAO's left arm poses.

Table 6
The end-effector position and orientation of the NAO's right arm.