A Novel Reconstruction Method for Measurement Data Based on MTLS Algorithm

Reconstruction methods for discrete data, such as the Moving Least Squares (MLS) and Moving Total Least Squares (MTLS), have achieved a great deal with the progress of modern industrial technology. Although the MLS and MTLS have good approximation accuracy, neither of them is a robust model reconstruction method: outliers in the data cannot be processed effectively, because the construction principle leads to distorted local approximations. This paper proposes an improved method, called the Moving Total Least Trimmed Squares (MTLTS), to achieve more accurate and robust estimations. By applying the Total Least Trimmed Squares (TLTS) method to the orthogonal construction in the proposed MTLTS, both the outliers and the random errors of all variables in the measurement data can be effectively suppressed. The results of the numerical simulation and measurement experiment show that the proposed algorithm is superior to the MTLS and MLS methods in terms of robustness and accuracy.


Introduction
Nowadays, benefitting from the development of reverse engineering and computer technology, the meshless method, which is widely used for reconstructing discrete data, has been studied by many scholars, and consequently different types of meshless methods have been proposed [1,2]. Unlike other numerical methods, the meshless method obtains the local approximants of the entire parameter domain based only on the nodal points instead of elements [3,4]. In view of its outstanding features, it has replaced traditional estimation methods in some research fields [5,6]. The meshless methods that have been widely used include the Moving Least Squares (MLS), the smoothed particle hydrodynamics, the radial basis function, etc., among which the MLS method is one of the most popular [7].
After years of development, the MLS method has been employed to solve engineering and scientific problems in many fields [8,9,10]. For example, Dabboura et al. used the MLS method to obtain the solution of the Kuramoto-Sivashinsky equation [11]. Amirfakhrian and Mafikandi applied the MLS method to approximate parametric curves and verified its reliability [7]. Lee [12] proposed an improved moving least-squares algorithm to approximate a set of unorganized points with a smooth curve without self-intersections, in which the Euclidean minimum spanning tree, region expansion and refining iteration were used. Belytschko et al. [13] first presented the Element-Free Galerkin (EFG) method by combining the weak form of the Galerkin method with the MLS method, which can obtain the segmentation. If the local data are sufficient to reflect the true feature information, the MLS method has good local approximation properties and the ability of low-order fitting [34].
Consider that $\theta = \{\theta_1, \theta_2, \ldots, \theta_N\}$ and $\vartheta = \{\vartheta_1, \vartheta_2, \ldots, \vartheta_N\}$ are nodes in a bounded region $\Omega$ in the space $R^D$ [35]. The approximation function $f^h$ for each point $\theta$ in the MLS method is defined as

$$f^h(\theta) = \sum_{j=1}^{m} b_j(\theta)\, a_j(\theta) = b^T(\theta)\, a(\theta), \qquad (1)$$

where $a(\theta) = [a_1(\theta), a_2(\theta), \ldots, a_m(\theta)]^T$ is a vector of the unknown coefficients $a_j(\theta)$ $(j = 1, 2, \ldots, m)$, $b(\theta)$ is a vector of the basis functions $b_j(\theta)$, and both vectors have dimension $m$. In view of the low-order fitting characteristics, we only consider the most commonly used linear least squares estimation. In this article, the basis functions for curve and surface reconstruction are $b = [1, \theta]^T$ and $b = [1, \theta, \vartheta]^T$, respectively.

To obtain the unknown optimal parameter vector $a(\theta)$, the MLS method minimizes the weighted sum of squared differences between $f(\theta)$ and $f^h(\theta)$ over all nodes [36]. The error function is based on Equation (1), in which the independent variable is $\theta$. The function is given as

$$J = \sum_{I=1}^{n} w_I(s) \left[ b^T(\theta_I)\, a(\theta) - f(\theta_I) \right]^2, \qquad (2)$$

where the $w_I(s)$ are the diagonal entries of $W = \mathrm{diag}(w_1(s), w_2(s), \ldots, w_n(s))$ and $s = |\theta - \theta_I| / r$. Here $r$ represents the radius of the influence domain.

The weight function $w(s)$ is used to ensure a good local approximation. The value of $w(s)$ decreases with the distance between the fitting point and the nodal points, which ensures that the value at the fitting point is affected only by the points in the influence domain. Many types of functions can meet this requirement [37], so the selection of the weight function is not fixed but is determined by the accuracy under the conditions of continuity and smoothness. This article adopts the spline weight function shown in Figure 1. The weight function plays an important role in the local approximants, as it provides weight values for the points in the influence domain, as shown in Figure 2.
The weight of each point in the influence domain can be determined by the projection distance from this point to nodal point, which ensures that the approximation is globally continuous and the shape functions satisfy the compatibility condition.
Taking the partial derivative of Equation (2) with respect to $a(\theta)$ and setting it equal to zero, i.e.,

$$\frac{\partial J}{\partial a} = 2 B^T W \left( B\, a(\theta) - f \right) = 0,$$

the sum of errors of the MLS approximation function reaches its extreme value. Then, the optimal coefficient vector is obtained as

$$a(\theta) = \left( B^T W B \right)^{-1} B^T W f,$$

where $B = [b(\theta_1), b(\theta_2), \ldots, b(\theta_n)]^T$ is the $n \times m$ matrix of basis vectors evaluated at the nodes, $f = [f(\theta_1), f(\theta_2), \ldots, f(\theta_n)]^T$ is the vector of nodal values, and $W$ is the diagonal weight matrix defined above.
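The MLS construction described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The cubic spline form of `spline_weight` is an assumption (the text only states that a spline weight function is adopted), and the names `spline_weight` and `mls_fit` are hypothetical. The sketch uses the linear basis $b = [1, \theta]^T$ and solves the weighted normal equations at each evaluation point.

```python
import numpy as np

def spline_weight(s):
    """Cubic spline weight with compact support on s <= 1 (assumed form)."""
    s = np.asarray(s, dtype=float)
    w = np.zeros_like(s)
    inner = s <= 0.5
    outer = (s > 0.5) & (s <= 1.0)
    w[inner] = 2/3 - 4*s[inner]**2 + 4*s[inner]**3
    w[outer] = 4/3 - 4*s[outer] + 4*s[outer]**2 - (4/3)*s[outer]**3
    return w

def mls_fit(theta_nodes, f_nodes, theta_eval, r):
    """Moving least squares with the linear basis b = [1, theta]^T."""
    theta_nodes = np.asarray(theta_nodes, dtype=float)
    f_nodes = np.asarray(f_nodes, dtype=float)
    # Rows of B are the basis vectors b^T(theta_I) at the nodes.
    B = np.column_stack([np.ones_like(theta_nodes), theta_nodes])
    out = np.empty(len(theta_eval))
    for i, t in enumerate(theta_eval):
        # Weights depend on the scaled distance s = |theta - theta_I| / r.
        w = spline_weight(np.abs(t - theta_nodes) / r)
        BtW = B.T * w  # equals B^T W without forming the diagonal matrix
        # Solve (B^T W B) a = B^T W f for the local coefficients a(theta).
        a = np.linalg.solve(BtW @ B, BtW @ f_nodes)
        out[i] = a[0] + a[1] * t
    return out
```

Because the basis is linear, data lying exactly on a straight line are reproduced exactly, which is a quick sanity check for any implementation. The sketch assumes that at least two nodes with nonzero weight fall inside every influence domain; otherwise the local system is singular.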

MTLS Method
The MLS method obtains the local approximation coefficients by least squares estimation, which assumes that the random errors follow the Gauss-Markov (GM) model. To process the random errors that exist in all variables, Scitovski et al. [26] proposed the MTLS method.
Suppose that $(x_j, y_j)$, $j = 1, \ldots, n$, is a set of data on the curve $y = f(x)$. When errors occur in all the variables of the measurement data, the local approximation parameters $c_0, c_1 \in \mathbb{R}$ of the function are obtained in the sense of TLS estimation. Unlike the MLS method, the MTLS method obtains the coefficients by minimizing the sum of weighted squared orthogonal distances. In actual measured data, random errors always exist in both the dependent and independent variables. According to error theory, the MTLS method is therefore more appropriate than the MLS method for processing the errors-in-variables (EV) model [28,38].
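As an illustration of minimizing the sum of weighted squared orthogonal distances for a local line $y = c_0 + c_1 x$, the following Python sketch may help. It is a standard SVD-based construction and is not taken from the paper. The function name `weighted_tls_line` is hypothetical, and the weights `w` are assumed to come from a weight function evaluated at the fitting point.

```python
import numpy as np

def weighted_tls_line(x, y, w):
    """Fit y = c0 + c1*x by minimizing sum_j w_j * d_j^2, where d_j is the
    orthogonal distance of (x_j, y_j) to the line. Assumes the line is
    not vertical and that the weights are nonnegative with a positive sum."""
    x, y, w = (np.asarray(v, dtype=float) for v in (x, y, w))
    # The weighted TLS line passes through the weighted centroid.
    xm = np.average(x, weights=w)
    ym = np.average(y, weights=w)
    # Scale the centered rows by sqrt(w) so the SVD minimizes the weighted sum.
    M = np.sqrt(w)[:, None] * np.column_stack([x - xm, y - ym])
    _, _, Vt = np.linalg.svd(M, full_matrices=False)
    nx, ny = Vt[-1]  # right singular vector of the smallest singular value
    c1 = -nx / ny    # the line normal is (nx, ny)
    c0 = ym - c1 * xm
    return c0, c1
```

In contrast to the MLS normal equations, the residual here is measured perpendicular to the line, so errors in $x$ and $y$ are treated symmetrically.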

LTS Method
A brief introduction to the LTS method for the polynomial model is given in this part. Consider a set of data $(x_j, y_j)$, $j = 1, \ldots, n$, on the curve $y = f(x)$. Its model can be expressed [39] as

$$Y = X \beta_t + e,$$

where $X$ is an $n \times p$ matrix, $\beta_t$ ($\beta_t \in \beta$, $t = 1, 2, \ldots, C_n^P$) is a $p \times 1$ regression coefficient vector and $e$ is the error vector. For each parameter vector $\beta_t$, the residual vector is defined as $r_t = Y - X \beta_t$. The unknown $r_t$ is an $n$-dimensional vector whose squared entries are arranged in ascending order as $r^2_{(1)} \le r^2_{(2)} \le \cdots \le r^2_{(n)}$. The LTS estimator was first introduced in [40], and it is defined as

$$\hat{\beta}_{LTS} = \arg\min_{\beta_t} \sum_{i=1}^{h} r^2_{(i)}(\beta_t),$$

where the value of the trimming constant $h \in (n/2, n)$ depends on the degree of data pollution [41]. In the calculation, the integer $h$ equals $(n + p + 1)/2$ and $P = p + 1$.

The breakdown point is the most basic standard for judging whether an estimator is robust. When $h = n/2$, the breakdown point of the LTS estimator is up to 1/2. When $h$ is equal to $n$, the estimator corresponds to the least squares estimation and its breakdown point is close to zero [42]. This means that the modelling process can automatically eliminate the $(n - h)$ largest residuals as long as the percentage of data pollution is no more than 50% [43].
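The trimmed objective can be made concrete with a small exhaustive-search sketch in Python. This is an illustrative assumption about how the subsamples might be enumerated, not the authors' code: the name `lts_fit` is hypothetical, each subsample of size $P = p + 1$ is fitted by ordinary least squares, and integer division is used for $h$.

```python
from itertools import combinations

import numpy as np

def lts_fit(X, y, subset_size=None, h=None):
    """Exhaustive LTS sketch: fit each subsample, keep the fit whose h
    smallest squared residuals over all n points have the smallest sum."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    P = p + 1 if subset_size is None else subset_size
    h = (n + p + 1) // 2 if h is None else h
    best_beta, best_obj = None, np.inf
    for idx in combinations(range(n), P):
        idx = list(idx)
        # Candidate coefficients from this subsample only.
        beta, *_ = np.linalg.lstsq(X[idx], y[idx], rcond=None)
        # Squared residuals of all points, sorted in ascending order.
        r2 = np.sort((y - X @ beta) ** 2)
        obj = r2[:h].sum()  # trimmed sum: the n - h largest are ignored
        if obj < best_obj:
            best_beta, best_obj = beta, obj
    return best_beta, best_obj
```

The number of subsamples grows combinatorially with $n$, so this exhaustive form is only practical for small point sets such as a single influence domain.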

MTLTS Method
As stated above, the MTLS method takes into account the random errors that exist in all variables, but it is susceptible to outliers. The LTS method is robust, but it cannot express the local geometric feature information of a complicated model. Therefore, we propose the MTLTS method, in which the TLTS method (a combination of the TLS and LTS) is employed to acquire the fitting coefficients of the influence domain (Figure 3).

For the proposed algorithm, the TLTS method is employed to determine the local optimal parameter vector in the influence domain. Let $k + 1 < n$, where $n$ and $k + 1$ are the numbers of nodes in the whole parameter domain and in the influence domain, respectively. For an arbitrary influence domain, there are $C_{k+1}^{P}$ subsamples based on the TLTS method. For each subsample, the SVD-based TLS method is utilized to obtain the regression coefficients $\beta_t$ ($\beta_t \in \beta$, $t = 1, 2, \ldots, C_{k+1}^{P}$). The function model is defined as

$$A_1 \beta_t = B_1,$$

where $A_1$ and $B_1$ are the true values, $A$ and $B$ represent the actual measured values, and the errors between them are $\Delta A$ and $\Delta B$.

An augmented matrix $C = [A \;\; B]$ is made for the subsample, and the SVD of $C$ is described by

$$C = U \Sigma V^T,$$

where the singular value matrix is $\Sigma = \mathrm{diag}(\sigma_1, \sigma_2, \ldots, \sigma_{P+1})$ and $v_{i,P+1}$ denotes the $i$-th entry of the last column of $V$. If $\sigma_P > \sigma_{P+1}$, the solution of TLS is unique and can be obtained by the following formula [24,25]

$$\beta_t = -\frac{1}{v_{P+1,P+1}} \left[ v_{1,P+1}, v_{2,P+1}, \ldots, v_{P,P+1} \right]^T.$$

The squared residuals can be obtained from the local coefficient vector and arranged in ascending order as

$$d_1^2 \le d_2^2 \le \cdots \le d_P^2.$$

The TLTS method is different from the traditional LTS method as it takes into account the random errors that exist in all variables, in which the distance $d_j^2$ ($j \in [1, 2, \ldots, P]$) is the squared residual in the orthogonal direction. On this occasion, the sum of the squared residuals of the smallest $h$-subset of each subsample is defined as

$$S_t = \sum_{j=1}^{h} d_j^2.$$

The coefficient matrix $\beta = [\beta_1, \beta_2, \ldots, \beta_{C_{k+1}^{P}}]$ can be obtained by repeating the calculations. The TLTS estimation is used to determine the corresponding optimal coefficient vector by finding the smallest $h$-subset. The estimation is defined as

$$\hat{\beta}_{TLTS} = \arg\min_{\beta_t} S_t.$$

Move the fixed point throughout the domain and repeat the previous steps, in which the estimation for each point is independent. Then, we get the reconstructed curve or surface. In this paper, we set
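The steps of this section can be sketched in Python for curve reconstruction with a local line. This is a simplified illustration under stated assumptions, not the authors' algorithm. The weight function is omitted, the orthogonal distances are evaluated over all nodes of the influence domain, and $h$ is taken as the integer part of (number of local nodes + subsample size)/2. The names `tls_line`, `tlts_line` and `mtlts_curve` are hypothetical.

```python
from itertools import combinations

import numpy as np

def tls_line(x, y):
    """SVD-based TLS line y = c0 + c1*x through a subsample (not vertical)."""
    xm, ym = x.mean(), y.mean()
    C = np.column_stack([x - xm, y - ym])
    _, _, Vt = np.linalg.svd(C, full_matrices=False)
    nx, ny = Vt[-1]  # normal direction of the best-fit line
    c1 = -nx / ny
    return ym - c1 * xm, c1

def tlts_line(x, y, subset_size, h):
    """Keep the subsample fit with the smallest trimmed sum of squared
    orthogonal distances (assumed to be taken over all local nodes)."""
    best, best_obj = None, np.inf
    for idx in combinations(range(len(x)), subset_size):
        idx = list(idx)
        c0, c1 = tls_line(x[idx], y[idx])
        d2 = np.sort((y - c0 - c1 * x) ** 2 / (1.0 + c1 ** 2))
        obj = d2[:h].sum()
        if obj < best_obj:
            best, best_obj = (c0, c1), obj
    return best

def mtlts_curve(x, y, x_eval, r, subset_size=3):
    """Move the fitting point through the domain; each point is independent."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    out = np.empty(len(x_eval))
    for i, t in enumerate(x_eval):
        mask = np.abs(x - t) <= r  # nodes inside the influence domain
        xs, ys = x[mask], y[mask]
        h = (len(xs) + subset_size) // 2  # assumed trimming constant
        c0, c1 = tlts_line(xs, ys, subset_size, h)
        out[i] = c0 + c1 * t
    return out
```

With one outlier inside an influence domain of about a dozen nodes, any subsample made of clean points yields a trimmed sum near zero, so the outlier does not affect the selected coefficients. That is the behaviour the section attributes to the TLTS step.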

Case Study
To validate the data fitting performance of the MTLTS method, numerical simulations as well as experimental examples are given in this section. In the numerical simulation, the tested data is simulated by artificially adding random errors and outliers. The spline weight function introduced is applied to all cases.

Case 1
Take the function in Equation (14) as an example. A uniformly distributed set of nodes $(x_j, y_j)$ from Equation (14) is first selected. Then, the data $(x_{jm}, y_{jm})$ are obtained by adding outliers $(0, \Delta y_i)$ and random errors $(\delta_j, \varepsilon_j)$ to $(x_j, y_j)$, where the random errors obey the normal distribution with a mean value of zero. The sum of absolute differences between the fitting points and the theoretical points, $\sum_{j=1}^{n} |y_{jn} - y_j|$, is employed to evaluate the performance, where $y_{jn}$ and $y_j$ are the fitting points and theoretical points. Let $n = 201$ and $r = (x_{jm}(201) - x_{jm}(1)) \times 3/100$ in Case 1, in which $x_{jm}(1) = -5$ and $x_{jm}(201) = 5$, so that $r = 0.3$. Figure 4 presents the fitting curves obtained by the MLS, MTLS and MTLTS. The sums of the differences for these methods under different conditions are shown in Table 1. The points marked in Figure 4 are outliers. In the cases of this paper, we provided relatively more outliers in the whole domain to verify the proposed algorithm. The fitting accuracy can also be evaluated by the Root Mean Square (RMS) value. The results are consistent with the sum of absolute differences, as shown in Table 2. To avoid repetition, the RMS values are not reported for the other cases.
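The evaluation setup of Case 1 can be mimicked with the short Python sketch below. The test function, noise level and outlier positions are placeholders, because the function of Equation (14) and the error magnitudes are not reproduced here. All names (`sum_abs_diff`, `rms`, `make_case1_like_data`) are hypothetical, and the RMS is taken in its usual form as the square root of the mean squared difference.

```python
import numpy as np

def sum_abs_diff(fit, true):
    """Sum of absolute differences between fitted and theoretical values."""
    fit = np.asarray(fit, dtype=float)
    true = np.asarray(true, dtype=float)
    return float(np.sum(np.abs(fit - true)))

def rms(fit, true):
    """Root mean square of the differences (standard definition, assumed)."""
    d = np.asarray(fit, dtype=float) - np.asarray(true, dtype=float)
    return float(np.sqrt(np.mean(d ** 2)))

def make_case1_like_data(f, n=201, lo=-5.0, hi=5.0, sigma=0.01,
                         outlier_idx=(40, 100, 160), outlier_size=1.0, seed=0):
    """Uniform nodes on [lo, hi], zero-mean normal errors in both variables,
    and outliers of the form (0, dy) added at a few placeholder positions."""
    rng = np.random.default_rng(seed)
    x = np.linspace(lo, hi, n)
    y = f(x)
    xm = x + rng.normal(0.0, sigma, n)  # random errors delta_j in x
    ym = y + rng.normal(0.0, sigma, n)  # random errors epsilon_j in y
    ym[list(outlier_idx)] += outlier_size
    r = (hi - lo) * 3 / 100  # influence radius, 0.3 for [-5, 5]
    return x, y, xm, ym, r
```

A reconstruction method can then be scored by passing its fitted values and the theoretical values `y` to `sum_abs_diff` and `rms`.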

Case 2
Take the function

$$z = (x^2 - y^2)/10 \qquad (16)$$

as an example, and evaluate the fit by the sum of absolute differences $\sum_{j} |z_{jn} - z_j|$, where $z_{jn}$ and $z_j$ are the fitting points and theoretical points. Following the same approach described in Case 1, the fitting results of the three methods under different random conditions are shown in Table 3 and Figure 5, respectively.

From Figures 4 and 5, it can be seen that MLS and MTLS are not robust model reconstruction methods. For these two methods, outliers have a great influence on the estimation of nearby fitting points and even lead to distortion of the results. In comparison, the sum of differences of the MTLTS method is much smaller in the presence of contaminated data. To validate the fitting accuracy of the MTLTS method when there are no abnormal points in the discrete data, we again use the curve function to generate the data, this time without outliers, in contrast to Case 1. As shown in Figure 6, the curves reconstructed by the three methods all provide good approximation characteristics. However, the comparison of the results listed in Table 4 and Figure 7 shows that the fitting differences of the MTLTS method are clearly lower than those of the other two methods.
To obtain the corresponding CPU times of MTLTS, MLS, and MTLS, Case 1 is taken as an example and MATLAB is used to test the computational load of these algorithms. All procedures are conducted on a PC with an Intel(R) Core(TM) i7 2.7/2.9 GHz processor and 8 GB RAM (Santa Clara County, CA, USA). The results are shown in Table 5.

Case 3
To further verify the performance of the MTLTS method, it is also applied to fit measurement data obtained by a precision measurement platform, as shown in Figure 8. The measurement system is based on the LM50 laser-interferometric gauging probe and measures the surface profile of the processed workpiece. The employed point-contact ruby probe has a low contact pressure while offering high measurement accuracy. At each planned layout point, the surface profile data of the workpiece are obtained by the X-axis and the LM50, respectively. The X-axis has a repetitive positioning error of about 41 nm and the sensor has a repetitive error of around 127 nm. As shown in Figure 9, the measurement data were obtained experimentally by measuring the profile of an optical flat, which has a peak-to-valley (PV) value of 31 nm. The measurement length is 90 mm and the total number of sampling points is 91. The MLS, MTLS, and MTLTS methods are applied to process the experimental data, and the TLTS method is used for linear regression. Then, the corresponding straightness values are used to verify the performance of these methods. The fitting results of the MTLTS with different $C_{k+1}^{P}$ parameters are shown in Figure 10. The straightness values obtained by the three reconstruction methods are listed in Table 6.
As shown in Table 6, MLS, MTLS, and MTLTS with different $C_{k+1}^{P}$ parameters are applied to fit the measurement data of the optical flat, and TLS and TLTS with different $C_{k+1}^{P}$ parameters are used for linear regression. Figure 11 shows the variation trend of the straightness values when different curve fitting and linear regression methods are chosen. As shown in Figure 11, the MLS and MTLS methods are both greatly influenced by the outliers. With the increase of the P value of the TLTS method, the results of the evaluated straightness get worse, which also illustrates the robustness of the TLTS method for linear regression. In comparison, the straightness obtained by the MTLTS method is always closest to the standard value.
Furthermore, with the increase of the P value of the MTLTS method (i.e., with the decrease of nodes for determining the local approximate coefficients within a single influence domain), the results of the MTLTS method tend to be stable, which confirms the effectiveness of the proposed method.
The same measuring instrument is used to measure the generatrix of a spherical surface, as shown in Figure 12. The radius of the spherical surface, tested by a Taylor Hobson PGI 1240 profilometer, is 254.0677 mm. The profile data are fitted by MLS, MTLS, and MTLTS, respectively. The reconstructed data are processed for circular registration by the simulated annealing algorithm. Figure 13 shows the error graphs, and the PV values of the three methods are listed in Table 7.
Figure 13. The error graphs processed by three methods.
As shown in Table 7, the PV value processed by the MTLTS method is significantly smaller than those of the other two methods. In order to verify the stability of the algorithm when different numbers of points of the influence domain are eliminated, Figure 14 shows the PV value trend graph of the process. As the P value increases, the PV values gradually stabilize.
The proposed MTLTS algorithm combines the advantages of the MTLS and LTS methods. Although the measurement data contain outliers, the improved method is still able to reconstruct the curve or surface from the discrete data with high accuracy. Furthermore, the comparison with the other two numerical estimation methods shows that the accuracy and robustness of the MTLTS algorithm are significantly enhanced whether there are outliers in the data or not.

Conclusions
In this study, a robust reconstruction algorithm for measurement data, based on the MTLS method, is presented by introducing the TLTS method into the influence domain to find an optimal local parameter vector. We studied the algorithm from the perspective of calculation and theory. Owing to its construction principle, the algorithm not only possesses the property of acquiring a shape function with high-order continuity and consistency from a low-order basis function, but also overcomes the lack of robustness that is difficult to resolve in the traditional numerical estimation methods (MLS and MTLS). To verify the fitting performance of the proposed method, all three methods are employed to fit data generated by numerical simulation and experimental measurement. The results show that the MTLTS method has significant advantages over the MTLS and MLS methods whether there are outliers or not, which demonstrates the performance of this robust algorithm.