Registration of Six Degrees of Freedom Data with Proper Handling of Positional and Rotational Noise

When two six degrees of freedom (6DOF) datasets are registered, a transformation is sought that minimizes the misalignment between the two datasets. Commonly, the measure of misalignment is the sum of the positional and rotational components. This measure has a dimensional mismatch between the positional component (unbounded and having length units) and the rotational component (bounded and dimensionless). The mismatch can be formally corrected by dividing the positional component by some scale factor with units of length. However, the scale factor is set arbitrarily and, depending on its value, more or less importance is associated with the positional component relative to the rotational component. This may result in a poorer registration. In this paper, a new method is introduced that uses the same form of bounded, dimensionless measure of misalignment for both components. Numerical simulations with a wide range of variances of positional and rotational noise show that the transformation obtained by this method is very close to ground truth. Additionally, knowledge of the contribution of noise to the misalignment from individual components enables the formulation of a rational method to handle noise in 6DOF data.


Introduction
The pose of a three dimensional rigid body is determined by six degrees of freedom: three coordinates of the position vector (defining, e.g., the location of the center of mass) and three angles (e.g., Euler angles or yaw, pitch, and roll) which uniquely parameterize a 3x3 rotation matrix. A pose measuring instrument outputs position and orientation data in a coordinate frame defined by the pose of the instrument in a global coordinate system. Any data acquired by the instrument from two different locations need to be transformed into one coordinate system through a process known as registration. A similar procedure is required when a robot's vision system collects data in one coordinate frame while the robot's arm operates in another coordinate frame (robot-world/hand-eye calibration problem). Due to noise present in acquired 6DOF data (both in their positional and rotational parts) the alignment of two datasets is not perfect. Mathematically, both the registration and calibration problems can be formulated as the minimization of some kind of error measure E pose (H) where the homogeneous transformation H being sought consists of a rotation and a translation.
Various techniques have been developed to obtain H (see [1] for a comprehensive review). Some methods are based on iterative minimization [2,3], while others provide closed form solutions [4,5]. There are techniques that parameterize rotation with the help of quaternions [6,7]; others use an axis and angle representation [8]. Many approaches follow a separable solution strategy: first calculate the rotational part of H and then calculate the translation [9]. Fewer procedures simultaneously solve both the rotational and translational components of H [10,11]. Another class of techniques falls into a category of structure from Initially, an arbitrary value was assigned to λ and the first minimization of E pose was performed. The weight factor was then updated to rot loc E E λ = and substituted back into (3). Next, the new E pose was minimized again and the whole process was repeated. Figure 2 in [18] shows that regardless of the initial value of λ , after only three steps, λ approaches a constant value, which is interpreted as the correct weight between the rotational and positional components of E pose . Unfortunately, this procedure removes only a mismatch in length units between E loc and 2 loc σ , so they both can be expressed in the same units, like mm or µm. The procedure does not provide a true measure of relative noise levels, because the limit in (4b) http://dx.doi.org/10.6028/jres.118.013 is equally as ill defined as E pose in (1). Depending on the size of the bounding box containing positional data, the same positional noise loc σ can be declared as large or small, regardless of the actual value of rot σ .
In this paper the mismatch between E loc and E rot in the registration procedure is removed in a different way. Instead of using the Euclidean norm to measure the distance between two sets of 3D points, E loc is expressed as a sum of squared angular differences between matching vectors. Then, the same form of function can be used to calculate E rot . This formulation ensures that the ratio E loc /E rot is truly a good measure of the relative amount of noise in the positional and rotational components of 6DOF data. Extensive numerical simulations reveal that it may be more advantageous to use only the positional or the rotational part of data in some experimental conditions.

Definition of Error Function
Two 6DOF datasets {p j , A j } and {q j , B j }, j=1,…,N, are acquired in two different coordinate systems where, for each j, p j and q j are two corresponding vectors and A j and B j are two corresponding 3x3 rotation matrices. The registration transformation consists of a rotation matrix R and a translation vector v. In this paper, all rotation matrices are used in an axis and angle representation. The axis can be represented as a unit column vector u cos cos , cos sin ,sin T ϑ ϕ ϑ ϕ ϑ ϕ ϑ = u (5) where ( )  (6) where I is 3x3 identity matrix and Following separable procedures, the rotation R is found first and then the translation v is calculated as ctr ctr = − v p Rq (8) where p ctr and q ctr are centroids of N points {p j } and {q j }. So, the hard part of registration is to find a correct rotation R. When only 3DOF positional data are available, the typical approach is to move the origins of coordinate systems to the corresponding centroids ctr ctr , , j j j j = − = − p p p q q q (9) and then find a rotation R which minimizes the following Euclidean norm Such defined E loc has dimensionality of (length) 2 and causes problems when 6DOF data need to be registered. To avoid this problem, vectors j p and j q are normalized ctr ctr ctr ctr p p q q p q p p q q (11) and the error function is defined as where 0 ≤ w j ≤ 1 is a weight factor for a given j-th term and ⋅ stands for the dot product of two vectors. The error function so defined gauges the angular misalignment between vectors j  p and j  Rq . This error function can now be minimized to find the best rotation R by using any (perhaps gradient based) optimization procedure. The gradient of E loc can be calculated as where individual partial derivatives of rotation matrix γ R are expressed as ( ) and derivatives of vector u can be explicitly evaluated from (5). E rot can be calculated using a similar form of error function as E loc in (12). Since A j is the rotation matrix, its three columns are unit vectors, i.e., the first column A j (:,1) is a unit vector along the rotated x direction, the second column A j (:,2) is a unit vector along the rotated y, and A j (:,3) along the rotated z (and similarly for B j ). Thus, E rot can be defined as (15) where 0 ≤ s j (k) ≤ 1 is a weight factor for the corresponding (j,k) term. The gradient of E rot can be calculated similarly as for E loc in (13) with γ R calculated in (14). Finally, the full error function E pose for registering 6DOF data using both the positional and rotational parts of data is defined as in (1), with E loc defined by (12) and E rot by (15). The gradient of E pose can be calculated from gradients of E loc and E rot as defined in (13) and (16), respectively.
In order to use the above procedure, the weight factors w j in (12) and s j (k) in (15) must be defined as well as a starting point for minimization ( ) If there were no noise in the acquired rotational data A j and B j , then the following would hold for every j In reality, a slightly different matrix R j has to be calculated for each j where u j and ρ j are the axis and the angle of rotation and where u 0 depends on angles ( ) Once the starting point for the minimization is determined, the weight factors w j in (12) and s j (k) in (15) can be calculated where j  p and j  q are defined in (11)

Numerical Simulations
In order to test the performance of the introduced procedure, extensive numerical simulations were done. First, a primary 6DOF dataset was generated , , , , , was formed as in (6). Then, the secondary 6DOF dataset , , , , ,  could be created as follows. The position vector is , , , The positional noise is represented by ( ) 0, f N , a vector with three components that are pseudo-random numbers obtained from a generator with Gaussian distribution, zero mean and standard deviation equal to f. The rotation matrix B j is calculated as ( ) , , where the rotation matrix C j was calculated as in (6), h is a standard deviation of angular noise, and , ,  are pseudo-random numbers obtained by the standard Gaussian generator (with zero mean and standard deviation equal to one). In order to make an easier comparison between the effects caused by angular noise (h in radians) and positional noise (f in mm), the standard deviation of positional noise was calculated as where g is expressed in radians and avg L is the averaged length of vectors p j centered at p ctr avg ctr 1 1 .
Once a pair of primary and secondary data was generated, the best fit rotation ( ) * * * * , , ϑ ϕ ρ R was determined by three different methods. In method 1, only positional data were used and E loc , defined in (12), was minimized. In method 2, only rotational data were used and E rot , as defined in (15) For every pair of noise parameters (g, h), the above procedure was repeated many times. First, in order to check the stability of results obtained for different noise realizations, for a given transformation { } 0 ,  t R and primary data, N noise secondary datasets were generated ( ) , , , , , Each n-th dataset was obtained by using different sequences of random numbers , , j j j ζ ω η     and ( ) 0, f N for rotational and positional noise, respectively. In addition to this test, M data primary datasets were generated ( ) , , , , , For each m-th primary dataset, N noise secondary datasets were generated as described previously ( , ) , , , , , where k = 1,…, K trans . For each k-th transformation, each m-th primary dataset and each n-th random sequence, a secondary dataset ( , , ) , , , , , Overall, for every pair of noise parameters (g, h), a total of N tot = N noise x M data x K trans secondary datasets were generated. Each secondary dataset was registered to its corresponding primary data using each of the three different methods described earlier.
All numerical calculations were performed on a 32-bit PC in double precision. A standard quasi-Newton minimization algorithm as implemented by Davidon-Fletcher-Powell (DFP) in [19] was used in all optimizations. Exit criteria from this iterative procedure were set to 10 -6 for both the minimum change in step and the flatness of a gradient. Pseudo-random numbers were generated using the gasdev function provided in [19].  Figure 2 shows which method most frequently delivered the smallest deviation from ground truth d k as defined in (24). Figure 3 displays how often a given method delivered the best results. Figure 4 shows the outcome of a prediction procedure defined as follows. For each pair of noise parameters (g, h) and each l-th pair of data, the ratio l α was calculated as in (26). Then, if low l T α ≤ or high l T α ≥ , where T low < T high are predefined parameters, the prediction was made. If high l T α ≥ , then the best results were expected to be delivered either by method 2 (minimization of E rot , positional data ignored) or by method 3 (minimization of E pose , full 6DOF data used). Similarly, if low l T α ≤ , then the best results were expected from either method 1 (minimization of E loc , rotational data ignored) or from method 3. This prediction was compared with the actual deviations from ground truth d k , k = 1,2,3. If the prediction was correct, the number of successful predictions for that pair (g, h) was increased by one. This process was repeated for every l-th pair of data (l = 1,…, N tot ). Displayed in Fig. 4

Discussion
The results presented in Fig. 1 indicate that, as expected, the ratio α correlates well with the ratio of noise parameters g/h. This is important because the systems that are used for pose determination usually do not provide much information about the noise levels present in positional and rotational data. When the ratio α becomes small, one may expect that discarding the rotational part of 6DOF data may, on average, lead to better registration. Similarly, for α large, discarding the positional part of the full data may on average yield a better result. For intermediate values of α using full 6DOF data should lead to the best results. Figure 2 and (g, h). The third method most frequently delivers the best or second best result. In the latter case, for low T α ≤ or high T α ≥ , the difference between the two best methods is in most cases two orders of magnitude smaller than the difference between the worst and the best method; see Fig. 5. This means there is a large difference between the worst method and the remaining two methods. At the same time, differences between the two best methods are small. Thus, using full 6DOF data and minimizing E pose seems to be the best or close to the best strategy across the whole range of noise parameters (g, h). For very small or very large values of the ratio , it may be worthwhile to discard the noisy part of the 6DOF data and to redo the minimization using E loc or E rot with only positional or rotational data, respectively. A similar conclusion was formulated already in [3] without a systematic study of the mutual relation between positional and rotational noise. That observation was based on particular parameters used in the simulations: length of position vector j ∈ p [500 mm, 1000 mm], length of translation 0 t = 800 mm, positional noise 1 mm and rotational noise 2.5 mrad (using a uniform random number generator). However, during the simulations, 0 t and j p were calculated in meters. Only positional data were also used in another hand-eye calibration procedure. The minimum variance method introduced in [20] delivered better results than two other methods [8,18]. Systematic studies presented in this paper explain why this apparently surprising conclusion can be correct.
One may wonder why for small or large values of α the predictions are not perfect, as the data in Fig.   4 indicate. However, it should be remembered that a residual value of the error function (like E rot or E loc ) provides only an indication of the noise level present in experimental data, not the actual deviation of best fit parameters from the unknown ground truth.

Conclusions
Performance of the iterative minimization procedure for registration of 6DOF noisy data was studied in computer simulations. The properly defined error function E pose removed the mismatch between the positional component of the error (unbounded, in length units) and the bounded, dimensionless rotational component. The error function E pose can be minimized and the resulting rotation matrix provides a good approximation to the true rotation across a large range of positional and rotational noise variances. Thus, both the developers and the users of pose determining systems could benefit from being able to properly gauge a relative amount of noise in the positional and the rotational parts of 6DOF data.