Fitting the space line of high repeatability to noisy point cloud data

Fitting a spatial line with high repeatability to noisy point cloud data is challenging, and it is of great significance for high-precision measurement, visual guidance of robots and CNC machining, but little work has been done on this topic. In this paper, a new algorithm is proposed to fit a spatial line with high repeatability to point cloud data that contain noise. The method combines the idea of repeated residuals with the least squares algorithm to filter out the noise points and then solve the fitting problem. The algorithm filters the noise points successfully. Four popular line fitting algorithms are implemented as benchmarks. Extensive experimental results show that the proposed algorithm obtains highly repeatable and accurate lines.


Introduction
Line fitting is one of the basic problems of data processing in the field of computer graphics, and it arises widely in computer-aided design, computer-aided manufacturing, robot path planning and optical measurement [1][2][3]. High repeatability fitting is defined as follows: the same object is repeatedly scanned by different people at different times to obtain a series of point cloud data sets, and one fitting algorithm is applied to each of them; the smaller the difference between the resulting fitted lines, the higher the repeatability of the algorithm. Little work has been done on line fitting with high repeatability in noisy environments. In this paper, a feasible method of spatial line fitting is proposed.
Generally, 2D line fitting is performed using Least Squares (LS) [4] or Total Least Squares (TLS) [5]. For 3D point data sets, the parametric equation of the line is reduced to four parameters in [7], and the TLS algorithm is then adopted to fit spatial lines. Hu [4] uses LS and TLS to fit a spatial line by projecting it along the X, Y and Z coordinate axes, respectively. However, the above methods are all sensitive to noise. Principal component analysis [8] has been explored to minimize the sum of squared distances between the points and the fitted line. Li [6] centers the data (subtracts the centroid) and performs singular value decomposition to fit the spatial line. Du [9] uses a Newton-gradient optimization algorithm to achieve spatial line fitting. Many methods have been proposed to handle outliers, among which the RANdom SAmple Consensus (RANSAC) [10] algorithm is the most popular. RANSAC robustly removes noise points through random sampling; M-estimator SAmple Consensus (MSAC) [11][12] has also been proposed to improve RANSAC. In this paper, a new robust line fitting algorithm is proposed to fit highly repeatable spatial lines. The repeated least trimmed squares idea is integrated into the total least squares method, improving robustness. In the algorithm, repeated residuals are first applied to the line fitting such that a line with high repeatability can be fitted.
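As a rough illustration of the PCA/SVD route mentioned above (centering the data and taking its dominant direction), the following Python sketch fits a 3D line without any outlier handling. The function name `fit_line_pca`, the use of power iteration in place of a library SVD, and the starting vector are assumptions made for this example; it is not the implementation of [6] or [8].

```python
import math

def fit_line_pca(points, iters=200):
    """Fit a 3D line as (centroid, unit direction) by PCA.

    The direction is the dominant eigenvector of the scatter matrix of the
    centered points, found here by power iteration (an illustrative choice).
    """
    n = len(points)
    # Centroid of the data; the fitted line passes through it.
    c = [sum(p[k] for p in points) / n for k in range(3)]
    centered = [[p[k] - c[k] for k in range(3)] for p in points]
    # 3x3 scatter matrix of the centered points.
    S = [[sum(q[i] * q[j] for q in centered) for j in range(3)]
         for i in range(3)]
    # Generic start vector, assumed not orthogonal to the true direction.
    v = [1.0, 0.7, 0.3]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iters):
        w = [sum(S[i][j] * v[j] for j in range(3)) for i in range(3)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # degenerate data (all points coincide)
            break
        v = [x / norm for x in w]
    return c, v
```

Because every point contributes to the scatter matrix, a single cluster of outliers can pull the fitted direction away from the true line, which is the sensitivity that motivates robust alternatives such as RANSAC.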

TLS method for line fitting
A space line passing through a point $(x_0, y_0, z_0)$ with a given direction vector can be expressed by Eq. (1). Eq. (1) can also be represented as Eq. (2), which can be further written as Eq. (3). Eq. (3) can be rewritten in matrix form as Eq. (4). According to the least squares indirect adjustment model [4], the error of Eq. (4) can be expressed as Eq. (5).
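One conventional way to write this chain of equations is sketched below. The direction symbols $(m, n, p)$, the parameters $a, b, c, d$, the residual vector $V$ and the assumption $p \neq 0$ are illustrative choices that follow the four-parameter form attributed to [7] in the Introduction; the original notation may differ.

```latex
% Symmetric form of a line through (x_0, y_0, z_0) with direction (m, n, p)
\frac{x - x_0}{m} = \frac{y - y_0}{n} = \frac{z - z_0}{p} \tag{1}

% Solving for x and y in terms of z (assumes p \neq 0)
x = \frac{m}{p}(z - z_0) + x_0, \qquad
y = \frac{n}{p}(z - z_0) + y_0 \tag{2}

% Four-parameter form with
% a = m/p, \; b = x_0 - a z_0, \; c = n/p, \; d = y_0 - c z_0
x = a z + b, \qquad y = c z + d \tag{3}

% Matrix form for one point (x, y, z)
\begin{bmatrix} x \\ y \end{bmatrix}
=
\begin{bmatrix} z & 1 & 0 & 0 \\ 0 & 0 & z & 1 \end{bmatrix}
\begin{bmatrix} a \\ b \\ c \\ d \end{bmatrix} \tag{4}

% Residuals in the indirect adjustment model
V =
\begin{bmatrix} z & 1 & 0 & 0 \\ 0 & 0 & z & 1 \end{bmatrix}
\begin{bmatrix} \hat{a} \\ \hat{b} \\ \hat{c} \\ \hat{d} \end{bmatrix}
-
\begin{bmatrix} x \\ y \end{bmatrix} \tag{5}
```

Stacking the two rows of (4) for every measured point gives an overdetermined linear system in the four unknowns, which is the form to which LS or TLS is applied.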

Optimized line fitting
Since the fitting result of the RANSAC algorithm is sensitive to the threshold value, the repeated minimum residual is introduced in this paper for line fitting optimization. The distance between a spatial point and a straight line is defined in Figure 1 and Eq. (7). The number of iterations $T$ in Step 2 is determined according to the probability [11] by Eq. (8):

$$T = \frac{\log(1 - P)}{\log\left(1 - (1 - \varepsilon)^{h}\right)} \qquad (8)$$

where $P$ is the probability that the selected points are valid points, usually >97%, which is determined by the performance of the 3D camera and the on-site sampling environment; $\varepsilon$ is the probability of any point being a noise point, usually less than 20%; and $h$ is the number of points participating in the fitting. In Eq. (8), $(1 - \varepsilon)^{h}$ is the probability that the selected $h$ points are all interior points, and $1 - (1 - \varepsilon)^{h}$ is the probability that the $h$ points include at least one noise point.
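The two quantities discussed here can be sketched in Python as follows. The point-to-line distance is assumed to take the usual cross-product form, and the iteration count is assumed to follow the usual RANSAC-style relation $T = \log(1 - P) / \log(1 - (1 - \varepsilon)^{h})$; the function names and argument names are hypothetical.

```python
import math

def point_line_distance(p, p0, d):
    """Distance from point p to the line through p0 with direction d.

    Uses |(p - p0) x d| / |d|; d does not need to be a unit vector.
    """
    w = [p[k] - p0[k] for k in range(3)]
    cross = (
        w[1] * d[2] - w[2] * d[1],
        w[2] * d[0] - w[0] * d[2],
        w[0] * d[1] - w[1] * d[0],
    )
    return math.sqrt(sum(c * c for c in cross)) / math.sqrt(sum(c * c for c in d))

def num_iterations(P, eps, h):
    """Smallest integer T such that 1 - (1 - (1 - eps)**h)**T >= P.

    P   : required probability of drawing at least one all-inlier sample
    eps : probability that a single point is a noise point
    h   : number of points drawn per sample
    """
    all_inliers = (1.0 - eps) ** h  # probability that one sample is clean
    if all_inliers <= 0.0:
        raise ValueError("no clean sample is possible when eps >= 1")
    if all_inliers >= 1.0:
        return 1  # no noise points: a single sample suffices
    return math.ceil(math.log(1.0 - P) / math.log(1.0 - all_inliers))
```

For instance, with P = 0.97, eps = 0.2 and h = 2, a single sample is clean with probability 0.64, and four iterations are enough to reach the required confidence.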

Results & Discussions
To verify the effectiveness of the proposed algorithm, it is implemented in MATLAB 2018a and run on a PC with Windows 7, an Intel i7-7700 CPU and 8 GB of DDR4 memory. Four algorithms are implemented as benchmarks: LS [4], TLS [7], RANSAC [10] and MSAC [11]. To evaluate the repeatability and accuracy of the algorithms, three quantitative criteria are defined, starting from Eq. (9).
Example I: It can be seen from Fig. 2(a) that all five fitting methods obtain good results for the data set without noise points. When the data set contains outlier clusters (see Fig. 2(b)), the fitting results of the LS and TLS algorithms deviate greatly, which indicates that these methods are particularly affected by clusters of noise points. For the noise ratios shown in Fig. 2(c) and Fig. 2(d), the LS and TLS fitting algorithms again show a large deviation. The RANSAC and MSAC algorithms perform better, but a visible deviation still occurs. The algorithm in this paper, however, achieves a good result. To study the influence of the outlier ratio on the line fitting algorithms, a line of 200 sampling points is generated. Thirty experiments are conducted by adjusting the ratio of noise points (15%, 30% and 45%). The statistics of the three criteria are evaluated and listed in Table 1 for repeatability evaluation. It can be seen from Table 1 that the results of all methods get worse as the ratio of noise points increases. However, in this example, the proposed algorithm performs best among all the algorithms.
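A synthetic test set of the kind used in this example could be generated along the following lines. The ground-truth line, the Gaussian noise level `sigma`, the bounding box for the outliers and the function name are all assumptions made for illustration; the data generation procedure of the experiments is not specified here.

```python
import random

def make_noisy_line(n=200, outlier_ratio=0.15, sigma=0.01, seed=0):
    """Sample n points: inliers near a line plus uniformly scattered outliers.

    Returns (points, is_outlier), where is_outlier[i] flags the i-th point.
    """
    rng = random.Random(seed)          # local generator for reproducibility
    p0 = (0.0, 0.0, 0.0)               # hypothetical point on the true line
    d = (1.0, 2.0, 2.0)                # hypothetical direction of the true line
    n_out = round(n * outlier_ratio)   # number of noise points
    points, is_outlier = [], []
    for i in range(n):
        if i < n_out:
            # Outlier: uniform in a box around the line segment.
            pt = tuple(rng.uniform(-10.0, 10.0) for _ in range(3))
        else:
            # Inlier: a point on the line with small Gaussian perturbation.
            t = rng.uniform(-5.0, 5.0)
            pt = tuple(p0[k] + t * d[k] + rng.gauss(0.0, sigma) for k in range(3))
        points.append(pt)
        is_outlier.append(i < n_out)
    return points, is_outlier
```

Calling `make_noisy_line(200, r, seed=s)` for r in (0.15, 0.30, 0.45) and thirty different seeds `s` would give one possible set of trials on which the fitted lines of each algorithm can be compared.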
Example II: In Figure 3, the straight outer edge of a workpiece needs to be measured. To measure it quickly, a laser scanner is generally used to obtain the point cloud data, a line fitting algorithm is then run to fit the data to a line, and the length of the line is measured. To test the repeatability of the proposed algorithm, the workpiece is placed ten times in different poses and at different positions. After each placement, one scan is performed, so that ten data sets are acquired for the same workpiece. Each tested algorithm is run on every data set to obtain a series of spatial lines. The fitting results of the five algorithms are shown in Figure 3, and the corresponding statistics are listed in Table 2. It can be seen from Table 2 that the minimum mean of the sum of distances between the spatial points and the fitted lines is 27.2413. At the same time, the root mean square errors of the direction vector and of the coordinates of the passing point obtained by the proposed algorithm are far smaller than those of the other four algorithms. For example, one such error value is 0.0441 for the RANSAC algorithm, while the proposed algorithm reduces it to 0.003. This shows that the proposed algorithm can fit spatial lines with higher accuracy and repeatability.
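One possible way to quantify the spread of the fitted directions over repeated trials is sketched below. It assumes that all fitted lines are expressed in a common coordinate frame and measures the root mean square deviation of the unit direction vectors from their mean; this definition and the function name are assumptions for illustration and may differ from the criteria used in Table 1 and Table 2.

```python
import math

def direction_rmse(directions):
    """RMSE of direction vectors about their mean, after sign alignment.

    directions: list of 3D vectors, one per repeated fit (any length > 0).
    """
    ref = directions[0]
    aligned = []
    for d in directions:
        norm = math.sqrt(sum(x * x for x in d))
        u = [x / norm for x in d]      # normalize to unit length
        # A line direction is defined only up to sign, so flip vectors
        # that point opposite to the first one.
        if sum(u[k] * ref[k] for k in range(3)) < 0:
            u = [-x for x in u]
        aligned.append(u)
    mean = [sum(u[k] for u in aligned) / len(aligned) for k in range(3)]
    sq = sum(sum((u[k] - mean[k]) ** 2 for k in range(3)) for u in aligned)
    return math.sqrt(sq / len(aligned))
```

A value close to zero means that the repeated fits agree on the direction of the line; the same pattern can be applied to the coordinates of the passing point.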
The above experimental results show that, for spatial point cloud data with different noise point ratios, the proposed algorithm is more accurate and repeatable than the benchmark algorithms.