A Regularized Regression Thermal Error Modeling Method for CNC Machine Tools under Different Ambient Temperatures and Spindle Speeds

Establishing a mathematical model to predict and compensate for the thermal error of CNC machine tools is a commonly used approach. Most existing methods, especially those based on deep learning algorithms, have complicated models that need huge amounts of training data and lack interpretability. Therefore, this paper proposes a regularized regression algorithm for thermal error modeling, which has a simple structure that can be easily implemented in practice and has good interpretability. In addition, automatic temperature-sensitive variable selection is realized. Specifically, the least absolute regression method combined with two regularization techniques is used to establish the thermal error prediction model. The prediction effects are compared with state-of-the-art algorithms, including deep-learning-based algorithms. Comparison of the results shows that the proposed method has the best prediction accuracy and robustness. Finally, compensation experiments with the established model are conducted and prove the effectiveness of the proposed modeling method.


Introduction
Thermal error is one of the main factors affecting the accuracy of CNC machine tools [1], and it is gradually becoming the dominant factor as machine tool accuracy improves. In general, there are two main approaches to solving the problem of thermal errors. The first approach is to establish a simulation model based on physical properties that can be used to simulate the thermal deformation of the structure [2]. However, this approach suffers from the difficulty of determining consistent boundary conditions and building an exact physical model in practice. Alternatively, the second approach is to establish a mathematical model to predict thermal errors. This approach has been widely used in practice and has been the subject of a large amount of research [1,3]. In general, a mathematical thermal error modeling method includes two key steps. The first step is to select temperature variables for modeling; this is called temperature-sensitive point (TSP) selection. The temperature variables refer to the values measured by temperature sensors at different positions on the machine tool. TSP selection can simplify the model structure and mitigate the collinearity between temperature variables. The idea of TSP selection is to first classify the temperature variables into different clusters and then select the most important one from each cluster [4]. This effectively prevents the modeling temperature variables from being strongly correlated and, at the same time, reduces the number of modeling temperature variables (MTVs).
In the second step of a mathematical thermal error modeling method, establishing a thermal error prediction model is the key to achieving satisfactory prediction effects. Various algorithms have been applied to thermal error modeling, such as backpropagation neural network (NN) [5], support vector machine [6], multiple linear regression [7,8], state-space [9], and Gaussian process regression (GPR) [10]. Notably, the NN [5] only consists of a single hidden layer whereas the deep learning method is more complicated with multiple hidden layers. In addition, ridge regression [11] and principal component regression [12] algorithms have been used to solve the collinearity between temperature variables. Recently, deep learning algorithms have been adopted for thermal error modeling to further improve prediction accuracy. For example, Fujishima et al. [13] proposed a novel deep-learning thermal error compensation method in which the compensation weight can be changed adaptively according to the reliability of thermal displacement prediction. The deep learning convolutional NN algorithm [14], bidirectional long short-term memory (LSTM) deep learning algorithm [15], and stacked LSTM algorithm [16] have all been used for thermal error modeling. Furthermore, several researchers have built hybrid thermal error models by combining different algorithms to take advantage of their respective features [17][18][19][20][21]. More recently, digital twin technology was adopted by [22] to solve the problem of thermal errors. The authors utilized the digital twin concept to propose a self-learning-empowered error control framework for the real-time thermal error prediction and control.
The above studies provide various solutions to thermal errors. However, thermal error models, especially those based on deep learning, have several limitations: a very complex structure, a requirement for large amounts of training data, and a lack of interpretability. As a result, these methods are difficult to deploy in practical engineering for thermal error compensation of machine tools. In other words, in addition to prediction accuracy, robustness [23], and adaptability [24], practicality should be considered another important indicator for effectively solving the engineering problem of thermal error modeling and compensation. From this perspective, traditional regression algorithms are more suitable. However, while traditional regression algorithms have a simple model structure and good interpretability, their prediction effects in the existing literature are not as good as those of deep learning algorithms. Therefore, there is a research gap: the existing literature lacks a thermal error modeling method that combines a simple structure, good interpretability, and prediction performance comparable to deep learning algorithms.
To fill the research gap, a new method based on a regularized regression algorithm is proposed to enhance the prediction ability of the regression algorithm. In particular, the least absolute regression algorithm is first used for thermal error modeling. To improve the robustness of the established model, both L1 and L2 regularizations are used by shrinking the regression coefficients. Accordingly, the stability of the regression model is improved. In addition, the proposed modeling method can automatically select TSPs owing to the presence of L1 regularization. Further, multiple batches of experimental data are used for modeling to ensure the sufficiency of thermal error information, which is a key prerequisite for effective modeling. Through analysis, the optimal combination of the number of MTVs and the coefficients of different regularization terms can be obtained, which is further used for thermal error modeling by the regularized regression algorithm. In summary, there are two main contributions of the proposed thermal error modeling method: (1) the least absolute regression algorithm, integrated with L1 and L2 regularization, is able to automatically select TSPs and reduce the collinearity simultaneously, thereby enhancing the prediction ability; (2) the prediction effects of the proposed thermal error modeling method are better with the simple model structure, compared with the state-of-the-art algorithms, including complex deep-learning-based algorithms.
Section 2 introduces the thermal error measurement experiments. Afterward, the existing thermal error modeling algorithms are briefly described in Section 3, including TSP selection and modeling algorithms. The proposed thermal error modeling method based on a regularized regression algorithm is introduced in Section 4. The effects of the number of MTVs and the coefficients of regularization terms are systematically analyzed for the proposed modeling method in Section 5. In Section 6, the proposed modeling method is compared with the state-of-the-art algorithms and verified by actual compensation experiments. Finally, conclusions are drawn in Section 7.

Thermal Error Measurement Experiments
To obtain sufficient experimental data, 46 batches of thermal error experiments were conducted within a year. The ambient temperatures and spindle speeds of these experiments are different.

Experimental Object
In this study, a vertical three-axis machining center (Vcenter-55 type) is regarded as the experimental object, as shown in Figure 1. The thermal errors of the spindle are measured using five displacement sensors, which are capacitive with a measurement accuracy of 1 µm. The thermal error can be obtained by subtracting the initial measurement value from each displacement measurement according to ISO 230-3: 2020 [25].

Ten platinum resistance temperature sensors (measurement accuracy: 0.1 °C) are arranged at different key positions of the machining center. Table 1 shows the distribution and function of these 10 sensors. The positions of these sensors are shown in Figure 2. Note that T10 is not displayed in Figure 2 since it was installed on the machine body frame to measure the ambient temperature.

Experimental Arrangements
The experiments were conducted according to ISO 230-3: 2020 [25]. The 46 batches of experiments were sorted based on initial ambient temperature and labeled K1-K46. The experimental parameters are shown in Table 2. The spindle speeds were set to three grades: 2000 revolutions per minute (rpm), 4000 rpm, and 6000 rpm. Each batch of experiments lasted at least 6 h. In each batch, the spindle idled and the worktable ran back and forth at a constant feed rate. The experimental data, including thermal errors and temperatures, were recorded every 5 min.


Experimental Data Analysis
Since the experiments in this study were conducted throughout an entire year, the range of the initial environmental temperature for K1-K46 is large, namely 3.7-32.2 °C. The temperature changes of K1 and K46 are shown in Figure 3, which illustrates the differences in temperature changes of the machining center under different ambient temperatures.

Combined with the three ambient temperature intervals and three spindle speeds, nine different experimental conditions can be distinguished, as shown in Table 3. One batch of experimental data can be randomly selected from each experimental condition, giving a total of nine batches for analysis. Among these nine batches, every three batches share the same ambient temperature range, with spindle speeds of 2000, 4000, and 6000 rpm, respectively.

Figure 4 shows that the thermal errors are significantly different under different experimental conditions. For example, the lower the ambient temperature, the larger the thermal error in the Z direction under the same spindle speed. Within the same ambient temperature interval, the higher the spindle speed, the larger the thermal error. That is to say, the thermal error is affected by both the spindle speed and the ambient temperature change. For instance, the thermal error of K5 is the largest among the nine batches of experiments because it has the lowest ambient temperature and the highest spindle speed. On the contrary, the thermal error of K44 is the smallest because it has the highest ambient temperature and the lowest spindle speed. This phenomenon indicates that the thermal errors contain different information under different experimental conditions. Note that for each experiment, the initial ambient temperature is subtracted from all temperature measurements before thermal error modeling.

Existing Thermal Error Modeling Algorithms
In this section, the existing TSP selection and modeling algorithms are introduced, respectively. Specifically, Section 3.1 introduces the correlation coefficient TSP selection algorithm. Section 3.2 introduces the ARX, LSTM, and GPR thermal error modeling algorithms.

TSP Selection Algorithm Based on the Correlation Coefficient
The rth temperature variable is x_r = [x_r1, x_r2, . . ., x_rn]^T, where n is the number of data samples. The temperature data can be recorded as X = [x_1, x_2, . . ., x_r, . . ., x_p], where p is the number of temperature variables. The thermal error is y = [y_1, y_2, . . ., y_n]^T. Then the correlation coefficient ρ_xry can be calculated as

ρ_xry = Cov(x_r, y) / √(Var(x_r) · Var(y)),

where Var(·) represents the variance of the corresponding variable and Cov(x_r, y) is the covariance between x_r and y. Specifically,

Cov(x_r, y) = (1/n) Σ_{i=1}^{n} (x_ri − x̄_r)(y_i − ȳ),

where x̄_r and ȳ are the average values of x_r and y, respectively. This correlation coefficient method selects the temperature variables that are highly correlated with the thermal error as TSPs. However, this method may lose important environmental temperature information, which has significant effects on thermal errors [4], because the environmental temperature variable is rarely selected as a TSP.
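As an illustrative sketch of this selection rule, the ranking of temperature variables by their correlation with the thermal error can be computed as follows (the function and variable names are our own, not the authors' code):

```python
import numpy as np

def rank_tsp_by_correlation(X, y):
    """Rank temperature variables by |correlation| with the thermal error.

    X : (n, p) array, one column per temperature sensor
    y : (n,) array of thermal errors
    Returns sensor indices sorted from most to least correlated.
    """
    Xc = X - X.mean(axis=0)          # centre each temperature variable
    yc = y - y.mean()                # centre the thermal error
    cov = Xc.T @ yc / len(y)         # Cov(x_r, y) for every r at once
    rho = cov / (Xc.std(axis=0) * yc.std())
    return np.argsort(-np.abs(rho))  # most correlated first

# Example with synthetic data: sensor 0 is built to track y closely.
rng = np.random.default_rng(0)
y = rng.normal(size=200)
X = np.column_stack([y + 0.1 * rng.normal(size=200),   # strongly correlated
                     rng.normal(size=200),              # pure noise
                     0.5 * y + rng.normal(size=200)])   # weakly correlated
order = rank_tsp_by_correlation(X, y)
print(order[0])  # sensor 0 ranks first
```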

Thermal Error Modeling Methods Based on ARX, LSTM, and GPR Algorithms
In this subsection, three state-of-the-art algorithms are briefly introduced: the ARX, LSTM, and GPR algorithms. The ARX thermal error model can be expressed as

y_t = Σ_{i=1}^{m_a} α_i · y_{t−i} + Σ_{j=1}^{p} Σ_{i=1}^{m_b} β_j,i · x_j,t−i,

where y_{t−i} = x_{j,t−i} = 0 when t ≤ i for j = 1, . . ., p, and t varies from 1 to n. In addition, m_a is the order of auto-regression and m_b is the length of the input memory. The output y_t is influenced by its past m_a values and the past m_b values of the input. Finally, α_i and β_j,i are the coefficients of y_{t−i} and x_{j,t−i}, respectively.

There are many studies establishing thermal error prediction models using the LSTM neural network algorithm [8,26,27]. In this study, the LSTM neural network, as implemented in MATLAB, is adopted as another baseline method. Specifically, the LSTM model contains a sequence input layer and a regression output layer, together with an LSTM layer and a fully connected layer. For more details on the structure of the LSTM and its training options, please refer to [10].
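Returning to the ARX model, its lagged structure can be illustrated with a short sketch that builds the regressor matrix with zero padding for t ≤ i and fits the coefficients by ordinary least squares (our own illustrative code; the paper does not specify an implementation):

```python
import numpy as np

def arx_design_matrix(y, X, ma, mb):
    """Build the ARX regressor matrix: past ma outputs and past mb inputs.

    Past values with t <= i are taken as zero, as in the model definition.
    """
    n, p = X.shape
    cols = []
    for i in range(1, ma + 1):                    # lagged outputs y_{t-i}
        cols.append(np.concatenate([np.zeros(i), y[:n - i]]))
    for j in range(p):                            # lagged inputs x_{j,t-i}
        for i in range(1, mb + 1):
            cols.append(np.concatenate([np.zeros(i), X[:n - i, j]]))
    return np.column_stack(cols)

# Fit the ARX coefficients by ordinary least squares on synthetic data
# generated by the known model y_t = 0.8*y_{t-1} + x_{1,t-1}.
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 2))
y = np.zeros(300)
for t in range(1, 300):
    y[t] = 0.8 * y[t - 1] + X[t - 1, 0]
A = arx_design_matrix(y, X, ma=1, mb=1)
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
print(np.round(coef, 2))   # recovers [0.8, 1.0, 0.0]
```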
The GPR algorithm can be used for thermal error modeling [10], which has good prediction effects. The GPR algorithm is selected as the third baseline method for prediction comparison in this study. It is noteworthy that although the GPR thermal error modeling algorithm can automatically select the TSPs, the TSP selection is conducted in an iterative procedure. However, our proposed method automatically selects TSPs in a single step, which is more straightforward and computationally inexpensive.

The Proposed Thermal Error Modeling Method
As noted above, no existing thermal error modeling algorithm combines a simple structure, good interpretability, and prediction performance comparable to deep learning algorithms. To fill this gap, we propose a new method based on a regularized regression model in this section. Specifically, the regularized regression model is formulated in Section 4.1. Then, the solution of the regularized regression model is provided in Section 4.2.

Thermal Error Modeling Based on Regularized Regression
The multiple linear regression model of the thermal error y with respect to the temperature variables x_1, x_2, . . ., x_p can be expressed as

y = β_0 + β_1·x_1 + β_2·x_2 + · · · + β_p·x_p + ε,

where β = [β_0, β_1, β_2, . . ., β_p]^T represents the coefficients of the model and ε is the random error obeying N(0, σ²), where σ² is the variance of the normal distribution.
According to the least-squares algorithm, β can be calculated by minimizing the objective function

Q_1(β) = (y − Aβ)^T (y − Aβ),

where A = [I, X] and I indicates an n-dimensional unit column vector. Then the regression coefficients can be calculated in closed form as

β̂ = (A^T A)^{−1} A^T y.

As pointed out in the existing literature [11], the least-squares algorithm is sensitive to outliers and to collinearity between independent variables. Collinearity leads to det(A^T A) ≈ 0, so the values of the main diagonal elements of (A^T A)^{−1} are large. As a result, the variance of the estimated regression coefficients β̂ is large:

Var(β̂) = σ² (A^T A)^{−1}.
To solve this problem, the ridge regression algorithm replaces the matrix A^T A in the closed-form solution with A^T A + λ_2·E. Then the coefficients can be estimated as

β̂ = (A^T A + λ_2·E)^{−1} A^T y,

where E represents a (p + 1)-dimensional identity matrix and λ_2 is called the ridge parameter. From the optimization point of view, the objective function of the ridge regression algorithm adds the L2 regularization term as a penalty to the least-squares objective:

Q_2(β) = (y − Aβ)^T (y − Aβ) + λ_2·‖β‖_2².
Further, the least absolute shrinkage and selection operator (LASSO) algorithm takes the L1 regularization term as the penalty term to select the important variables involved in the model. Its objective function is

Q_3(β) = (y − Aβ)^T (y − Aβ) + λ_1·‖β‖_1.

In addition, the elastic-net regression (ENR) algorithm combines the L1 and L2 regularization terms to construct the penalty:

Q_4(β) = (y − Aβ)^T (y − Aβ) + λ_1·‖β‖_1 + λ_2·‖β‖_2²,

where λ_1 and λ_2 are the coefficients of the L1 and L2 regularization terms, respectively. The ENR objective includes least-squares regression, ridge regression, and LASSO as special cases: when λ_1 = λ_2 = 0, it reduces to the least-squares objective; when λ_1 = 0 and λ_2 ≠ 0, it reduces to the ridge regression objective; and when λ_1 ≠ 0 and λ_2 = 0, it reduces to the LASSO objective.
Compared with least-squares regression, least-absolute regression has better robustness. The objective function of the least-absolute algorithm is

Q_5(β) = ‖y − Aβ‖_1 = Σ_{i=1}^{n} |y_i − a_i^T β|,

where a_i^T is the ith row of A. Similarly, the L1 and L2 regularization terms can be applied to this objective, which then becomes

Q_6(β) = ‖y − Aβ‖_1 + λ_3·‖β‖_1 + λ_4·‖β‖_2²,
where λ_3 and λ_4 are the coefficients of the L1 and L2 regularization terms, respectively. The regression coefficients are estimated by minimizing Q_6(β); the resulting method is called the least absolute elastic-net regression (LAENR) algorithm in this study. As a result, the LAENR algorithm not only selects important variables, as LASSO does, but also inherits the stability of ridge regression. Furthermore, owing to the least-absolute loss, the LAENR algorithm has better robustness.
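The LAENR objective can be transcribed almost directly into code; the sketch below (with our own hypothetical names) evaluates it for a given coefficient vector:

```python
import numpy as np

def laenr_objective(beta, A, y, lam3, lam4):
    """Least absolute elastic-net objective:
    ||y - A*beta||_1 + lam3*||beta||_1 + lam4*||beta||_2^2."""
    residual = y - A @ beta
    return (np.abs(residual).sum()
            + lam3 * np.abs(beta).sum()
            + lam4 * np.dot(beta, beta))

# Tiny check: beta = [1, 0] fits this toy data exactly, so only the
# penalty terms contribute: lam3*1 + lam4*1 = 2.0.
A = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y = np.array([1.0, 0.0, 1.0])
val = laenr_objective(np.array([1.0, 0.0]), A, y, lam3=1.0, lam4=1.0)
print(val)  # 2.0
```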
To intuitively show the differences between the ridge regression, LASSO, ENR, and LAENR algorithms, the case of only two independent variables is taken as an example to illustrate their optimal solutions (Figure 5). In Figure 5, the blue parts represent the original (unpenalized) objective function. The green parts represent the regularization terms, which limit the values of the model coefficients. It can be observed that the optimal solutions of the LASSO, ENR, and LAENR algorithms can easily fall on a coordinate axis. In this case, the coefficient of the independent variable on the other coordinate axis is zero, and variable selection is thus realized. The reason is the presence of L1 regularization, which restricts the feasible region of the optimal solution to a region with cusps. In comparison, the ridge regression algorithm cannot select variables, since its coefficients are close to zero but not equal to zero.
The shape of the objective function changes from a paraboloid (LASSO and ENR) to a conical surface (LAENR), which demonstrates the difference between the least-squares and the least-absolute algorithms.

Solution to the Least-Absolute Regularized Regression
Minimizing the LAENR objective is an unconstrained nonlinear multivariable minimization problem. Because the objective involves absolute values, it does not have a continuous first derivative, so an analytical solution does not exist. Therefore, the quasi-Newton method [28] is adopted to obtain the optimal solution quickly and reliably. In the Newton method, the minimizer β* of the objective function Q_6(β) can be calculated by the iterative equation

β_{k+1} = β_k − H_k^{−1}·g_k,
where H_k = H(β_k) = [∂²Q_6 / (∂β_i ∂β_j)]_{p×p} is the Hessian matrix, k is the index of the kth iteration, and g_k = g(β_k) = ∇Q_6(β_k) is the gradient vector of Q_6 at the point β_k. The difference between the quasi-Newton and Newton methods lies in how the inverse of the Hessian matrix is obtained. In the quasi-Newton method, the inverse of the Hessian matrix is represented by an approximate positive definite symmetric matrix, thus avoiding the calculation of second-order partial derivatives. There are several ways to construct this approximation, such as the Davidon-Fletcher-Powell method [29]. In this study, the inverse of the Hessian matrix is constructed using the Broyden-Fletcher-Goldfarb-Shanno (BFGS) method, which is generally considered the most efficient. The BFGS update of the inverse Hessian approximation is

H_{k+1}^{−1} = (E − δ_k·γ_k^T / (γ_k^T·δ_k)) · H_k^{−1} · (E − γ_k·δ_k^T / (γ_k^T·δ_k)) + δ_k·δ_k^T / (γ_k^T·δ_k),

where γ_k = g_{k+1} − g_k and δ_k = β_{k+1} − β_k. Based on the above iterative calculation, the optimal solution β* of the LAENR problem can be obtained. To minimize the influence of local minima and obtain the global optimal solution, a multi-start algorithm is adopted. The multi-start algorithm first generates many start points within the feasible region. Then the quasi-Newton method is used to minimize the objective from each start point. Finally, the global optimal solution is taken as the best of the solutions over all start points.
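A minimal sketch of the multi-start quasi-Newton procedure, using SciPy's BFGS implementation in place of the authors' MATLAB routine (function and parameter names are ours; the random-normal start points are an assumption, since the paper does not specify how starts are drawn):

```python
import numpy as np
from scipy.optimize import minimize

def laenr_objective(beta, A, y, lam3, lam4):
    # ||y - A*beta||_1 + lam3*||beta||_1 + lam4*||beta||_2^2
    return np.abs(y - A @ beta).sum() + lam3 * np.abs(beta).sum() + lam4 * beta @ beta

def fit_laenr(A, y, lam3, lam4, n_starts=20, seed=0):
    """Multi-start BFGS minimisation of the LAENR objective.

    BFGS builds an approximate inverse Hessian from gradient differences,
    so no second derivatives are needed; several random starts reduce the
    risk of returning a poor local solution on the non-smooth objective.
    """
    rng = np.random.default_rng(seed)
    best = None
    for _ in range(n_starts):
        beta0 = rng.normal(scale=0.5, size=A.shape[1])
        res = minimize(laenr_objective, beta0,
                       args=(A, y, lam3, lam4), method="BFGS")
        if best is None or res.fun < best.fun:
            best = res
    return best.x

# Two nearly collinear inputs: the regularized fit stays stable.
rng = np.random.default_rng(2)
x1 = rng.normal(size=100)
A = np.column_stack([x1, x1 + 0.01 * rng.normal(size=100)])
y = 2 * x1
beta = fit_laenr(A, y, lam3=1.0, lam4=0.1)
print(np.round(beta, 2))
```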
Note that the temperature and thermal error data are normalized before modeling:

x̃_ri = (x_ri − x̄_r) / σ_xr,  ỹ_i = (y_i − ȳ) / σ_y,

where x̄_r and ȳ represent the means of x_r and y, respectively, and σ_xr and σ_y are the standard deviations of x_r and y, respectively. Then the thermal error prediction model after normalization can be established based on the above modeling method:

ỹ̂ = b_1·x̃_1 + b_2·x̃_2 + · · · + b_p·x̃_p,

where b_i (i = 1, 2, . . ., p) are the coefficients of the normalized model. The model coefficients β of the original data can then be recovered as

β_r = b_r·σ_y / σ_xr (r = 1, 2, . . ., p),  β_0 = ȳ − Σ_{r=1}^{p} β_r·x̄_r.
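The normalization and the back-transformation of the coefficients can be sketched as follows (illustrative code with our own names; an ordinary least-squares fit stands in for the LAENR fit to keep the round trip exact):

```python
import numpy as np

def normalize(X, y):
    """Z-score the temperature and thermal error data before modelling."""
    x_mean, x_std = X.mean(axis=0), X.std(axis=0)
    y_mean, y_std = y.mean(), y.std()
    return (X - x_mean) / x_std, (y - y_mean) / y_std, (x_mean, x_std, y_mean, y_std)

def denormalize_coefficients(b, stats):
    """Map coefficients b of the normalized model back to the original scale:
    beta_r = b_r * sigma_y / sigma_xr;  beta_0 = y_mean - sum(beta_r * x_mean_r)."""
    x_mean, x_std, y_mean, y_std = stats
    beta = b * y_std / x_std
    beta0 = y_mean - beta @ x_mean
    return beta0, beta

# Round trip: recover a known linear model through the normalized fit.
rng = np.random.default_rng(3)
X = rng.normal(loc=20.0, scale=3.0, size=(200, 2))      # temperatures in °C
y = 5.0 + 1.5 * X[:, 0] - 0.5 * X[:, 1]                 # noiseless thermal error
Xn, yn, stats = normalize(X, y)
b, *_ = np.linalg.lstsq(Xn, yn, rcond=None)             # fit on normalized data
beta0, beta = denormalize_coefficients(b, stats)
print(round(beta0, 2), np.round(beta, 2))               # 5.0 [ 1.5 -0.5]
```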

Analysis of the Proposed Modeling Method
In this section, the Z-direction thermal error is taken as an example to show the analysis process. The number of MTVs and regularization coefficients of the proposed modeling method are analyzed respectively.

Analysis of Modeling Effects with Different Numbers of Temperature Variables
Although the proposed modeling method has the ability to select variables, the number of temperature variables initially involved in the modeling has an important impact on the modeling results. To study this influence, the nine batches of experiments selected in Section 2.3 are used together for thermal error modeling with different numbers of temperature variables, and the modeling effects are analyzed.
According to the TSP selection method in Section 3.1, the correlation coefficient between the thermal error and each temperature variable can be calculated. Accordingly, the order of the temperature variables is obtained as 1, 5, 2, 4, 3, 6, 9, 8, 7, and 10, and the MTVs are determined according to this order. The thermal error models are then built from the selected nine batches of experiments by the LAENR algorithm. The regularization coefficients are set to λ_3 = 1 and λ_4 = 1 for illustration. Before modeling, the initial values are subtracted from the temperature data X = [x_1, x_2, . . ., x_p] to obtain the temperature increments ΔX = [Δx_1, Δx_2, . . ., Δx_p] for modeling. Therefore, the regression model of the thermal error is

ŷ = β̂_0 + β̂_1·Δx_1 + β̂_2·Δx_2 + · · · + β̂_p·Δx_p,

where ŷ represents the predicted thermal error. The optimal solution of the LAENR algorithm is calculated in MATLAB. The modeling results, including the regression model coefficients and the fitting accuracy, are shown in Table 4. The fitting accuracy is characterized by the root mean square (RMS) [30] of the residuals when the established model predicts the modeling data themselves. From Table 4, we have the following observations. First, the coefficients of some temperature variables are equal to zero when the number of temperature variables is more than six. This proves that the LAENR algorithm can automatically eliminate unimportant temperature variables in the modeling process, owing to the sparsity induced by L1 regularization. Second, the trend of the fitting accuracy with the number of temperature variables, also shown in Figure 6, is that the fitting accuracy improves as the number of temperature variables increases. This indicates that more temperature variables involved in thermal error modeling provide more temperature information and thus improve the fitting accuracy.
In addition, the curve becomes flat when more than nine temperature variables are included for modeling. This shows that nine temperature variables are sufficient to model the thermal error. Note that the LAENR algorithm would automatically eliminate unimportant or duplicate temperature variables from the initial modeling temperature variables.
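The sweep over the number of MTVs can be sketched as below; ordinary least squares stands in for the LAENR fit purely to show the structure of the analysis (all names are ours, and the data are synthetic):

```python
import numpy as np

def fitting_rms_by_mtv_count(X, y, order):
    """Fit with the first m temperature variables (in correlation order) for
    m = 1..p and record the fitting RMS each time."""
    rms = []
    for m in range(1, X.shape[1] + 1):
        A = np.column_stack([np.ones(len(y)), X[:, order[:m]]])
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)
        rms.append(np.sqrt(np.mean((y - A @ beta) ** 2)))
    return np.array(rms)

# Synthetic example: only the first two variables carry information.
rng = np.random.default_rng(5)
X = rng.normal(size=(120, 4))
y = 2 * X[:, 0] + X[:, 1] + 0.1 * rng.normal(size=120)
rms = fitting_rms_by_mtv_count(X, y, order=[0, 1, 2, 3])
print(np.round(rms, 2))   # fitting RMS shrinks, then flattens, as variables are added
```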

Analysis of the Optimal Number of Temperature Variables and Regularization Coefficients
Based on the above analysis, it is obvious that the number of MTVs affects the thermal error prediction effects. However, there is no uniform principle for determining the number of MTVs. In addition, the values of the regularization coefficients have an important influence on the thermal error prediction. Therefore, considering different numbers of MTVs and different regularization coefficients, the thermal error model is established from the selected nine batches of experimental data.
The remaining 37 batches of experiments are predicted by the established model. The prediction accuracy of the model for the kth batch of experiments can be obtained by calculating the RMS [30] S_k:

S_k = √( (1/n_k) Σ_{i=1}^{n_k} (y_ki − ŷ_ki)² ),

where n_k is the number of samples in the kth batch, y_ki is the measured thermal error, and ŷ_ki is the predicted thermal error.
where k = 1, . . ., 37 represents the batch of the predicted experimental data. The mean [12] of S_k represents the prediction accuracy of the thermal error model:

S̄ = (1/K) Σ_{k=1}^{K} S_k,

where K = 37. The prediction accuracy of the established models with different numbers of MTVs and regularization coefficients can then be calculated. To visualize the influence of the number of MTVs and the regularization coefficients, the prediction accuracy results are drawn as a curved surface, as shown in Figure 7. From Figure 7, it can be observed that the prediction accuracy is better (a smaller value means better accuracy) with a larger number of temperature variables. According to the calculation, when all 10 temperature variables are involved in the modeling and the regularization coefficients are λ_3 = 15 and λ_4 = 14, the prediction effect is the best. The best prediction accuracy and robustness are 2.93 µm and 1.52 µm, respectively. The LAENR algorithm automatically eliminates unimportant or duplicate variables from the 10 temperature variables. The modeling results are given later in Section 6.1 to show which unimportant temperature variables are eliminated. As a result, no separate TSP selection algorithm is needed for thermal error modeling, which greatly simplifies the thermal error modeling process.
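The per-batch RMS and its summary statistics (mean for accuracy, standard deviation for robustness) can be sketched as follows (hypothetical helper names; `model` here is any fitted predictor, and the data are synthetic):

```python
import numpy as np

def batch_rms(y_true, y_pred):
    """S_k: root mean square prediction residual for one experiment batch."""
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

def accuracy_and_robustness(batches, model):
    """Mean of S_k = prediction accuracy; std of S_k = robustness.

    `batches` is a list of (X, y) prediction batches and `model` maps
    X to the predicted thermal error.
    """
    S = np.array([batch_rms(y, model(X)) for X, y in batches])
    return S.mean(), S.std(ddof=1)

# Toy check: a perfect model on noiseless batches gives accuracy 0.
rng = np.random.default_rng(4)
true_beta = np.array([1.0, -2.0])
batches = []
for _ in range(5):
    X = rng.normal(size=(50, 2))
    batches.append((X, X @ true_beta))
acc, rob = accuracy_and_robustness(batches, lambda X: X @ true_beta)
print(acc, rob)  # 0.0 0.0
```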


Prediction Effects with Nine Batches of Experiments
Based on our preliminary results, we found that randomly selecting multiple batches of data for modeling cannot effectively improve the sufficiency of the modeling data, because some experimental data contain repetitive information. Therefore, in this paper, we select the modeling data according to the experimental conditions, that is, we select data from different experimental conditions for modeling. According to the analysis in Section 2.3, there are nine different experimental conditions, as shown in Table 3. Therefore, one batch of experiments can be selected from each experimental condition for modeling. To verify the effectiveness of this approach, one batch of data is randomly selected from each experimental condition in Table 3. Thus, a total of nine batches is selected to build a thermal error model using the LAENR algorithm. The remaining data are predicted by the established model. The standard deviation [12] of S_k is used to represent the robustness of the model:

$$S_D = \sqrt{\frac{1}{K}\sum_{k=1}^{K}\left(S_k - S_M\right)^2}$$
where K = 37 is the total number of predicted batches. The prediction accuracy and robustness are then calculated using Equations (22)-(24). The above analysis is repeated 10 times to improve the credibility of the calculation results, and the results are shown in Table 5. From Table 5, the average prediction accuracy and robustness are 2.82 µm and 1.74 µm, respectively. However, if nine batches are randomly selected without considering the experimental conditions, the average prediction accuracy and robustness over 10 repetitions are 4.43 µm and 2.26 µm, respectively. This result proves that selecting the modeling data according to the experimental conditions achieves better prediction effects.
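Expressed in code, the three statistics S_k, S_M, and S_D can be computed from the per-batch residuals as follows. This is a generic sketch in pure Python with invented residual values; Equations (22)-(24) are assumed to be the standard RMS, mean, and (population) standard deviation.

```python
import math

def rms(residuals):
    """S_k: root mean square of the prediction residuals of one batch."""
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))

def mean(values):
    """S_M: mean of the per-batch RMS values (prediction accuracy)."""
    return sum(values) / len(values)

def std(values):
    """S_D: standard deviation of the per-batch RMS values (robustness).
    The population form (divide by K) is assumed here."""
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))

# Invented residuals (µm) for three predicted batches:
batches = [[1.0, -2.0, 2.0], [0.5, -0.5, 1.0], [3.0, -1.0, 2.0]]
s_k = [rms(b) for b in batches]
s_m, s_d = mean(s_k), std(s_k)
```

A smaller s_m means better average prediction accuracy, while a smaller s_d means the model behaves more consistently across batches.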

Prediction Effects Analysis
In this section, the modeling and prediction effects of the LAENR, ENR, ARX, LSTM, and GPR modeling methods are provided in Sections 6.1 and 6.2, respectively. Then, in Section 6.3, compensation experiments are carried out to show the effectiveness of the proposed modeling method.

Modeling Effects of the Five Methods
In this subsection, the modeling result of the proposed LAENR is compared with those of the ENR, ARX, LSTM, and GPR methods. As mentioned in Section 5.2, the thermal error LAENR model is built using the corresponding optimal regularization coefficients. In the same way, the optimal regularization coefficients of the ENR algorithm can be obtained as λ1 = 10 and λ2 = 8. Then all 10 temperature variables and the optimal regularization coefficients are applied to establish the thermal error ENR model. In addition, the thermal error ARX, LSTM, and GPR models are established according to reference [10]. The fitting curves of the thermal errors in the Z direction of the LAENR, ENR, ARX, LSTM, and GPR modeling methods are shown in Figure 8. Further calculations show that the fitting accuracies of the LAENR, ENR, ARX, LSTM, and GPR methods are 1.85 µm, 2.06 µm, 0.53 µm, 1.87 µm, and 0.18 µm, respectively. In addition, the coefficients of the ENR and LAENR models are shown in Table 6. The coefficients of temperature variables T5 and T10 are zero in the ENR algorithm, and the coefficient of temperature variable T4 is zero in the LAENR algorithm. This shows that the LAENR modeling method has a stronger variable selection ability, thereby improving the robustness of the model.
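As a rough illustration of the least-absolute elastic net idea, the snippet below minimizes a mean-absolute-error loss with combined L1 and L2 penalties by plain subgradient descent on invented one-feature data. The paper's actual LAENR solver, coefficient values, and data are not reproduced here; this only sketches the structure of the objective.

```python
# Objective: (1/n) * sum |y_i - w*x_i| + lam1*|w| + lam2*w^2
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.1, 5.9, 8.2]  # roughly y = 2x, invented data
lam1, lam2 = 0.01, 0.01    # invented regularization coefficients

def sign(v):
    return (v > 0) - (v < 0)

def loss(w):
    mae = sum(abs(y - w * x) for x, y in zip(xs, ys)) / len(xs)
    return mae + lam1 * abs(w) + lam2 * w * w

w, step = 0.0, 0.05
for _ in range(500):  # subgradient descent with a shrinking step size
    g = sum(-sign(y - w * x) * x for x, y in zip(xs, ys)) / len(xs)
    g += lam1 * sign(w) + 2 * lam2 * w
    w -= step * g
    step *= 0.995
```

The absolute-value data term makes the fit less sensitive to outlying measurements than the squared loss of ordinary elastic net regression, while the L1 penalty is what drives unimportant coefficients to exactly zero.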


Prediction Effects of the Five Methods
With the nine batches of experiments randomly selected according to the experimental conditions, the LAENR, ENR, ARX, LSTM, and GPR thermal error models can be established, and the remaining 37 batches of data are predicted. The prediction effects of these models can then be calculated by Equations (22)-(24). To improve the reliability of the calculation results, the process of "select-predict-calculate" is repeated 10 times, and the average values of the 10 calculation results are shown in Table 7. Table 7 shows that the values of S_M and S_D of the LAENR method are the smallest in most cases: 3.66 µm and 1.52 µm in the X direction, 3.88 µm and 1.50 µm in the Y direction, and 2.82 µm and 1.38 µm in the Z direction. These results prove that the proposed modeling method has the best prediction results. In particular, the prediction effects of the LAENR algorithm are better than those of the ENR algorithm, which means that the least absolute regression modeling method in this study can effectively improve the prediction ability. In addition, the LAENR method performs better than the deep-learning-based LSTM method in all directions, which shows that the proposed method combines a simple structure with better prediction performance. Finally, the prediction effects of the GPR algorithm are comparable to those of the proposed modeling method. However, the GPR model is complex, requires an iterative TSP selection procedure, and lacks interpretability. It is also noteworthy that although the GPR and ARX methods have better fitting accuracy than the LAENR method, as shown in Section 6.1, they do not perform better than the proposed method in prediction accuracy. One possible reason is that both the GPR and ARX methods overfit the data, so their prediction accuracies decrease on unseen test data.

Compensation Verification Experiments
The established LAENR thermal error prediction model was transmitted to the CNC system of the research object. Then three batches of thermal error compensation verification experiments with air cutting are carried out, recorded as V1-V3. The spindle in V1-V3 idles at 2000, 4000, and 6000 rpm, respectively, and the feed rate in all three experiments is 1500 mm/min. In each batch of the experiment, the temperatures of the machining center are measured in real time and transferred to the prediction model, so the thermal error can be predicted and compensated in real time.
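The measure-predict-compensate cycle described above can be sketched as follows. Everything in this snippet is an invented placeholder (the linear model coefficients, the sensor readings, and the offset convention are not the machine's actual CNC interface); it only illustrates how a regression model of this form turns live temperature readings into an axis offset.

```python
# Hypothetical linear thermal error model: error = b0 + sum(w_j * T_j)
COEFFS = [0.8, -0.3, 0.5]  # invented weights for three temperature sensors
INTERCEPT = -1.2           # invented intercept (µm)

def predict_error(temps):
    """Predict the thermal error (µm) from the current temperature readings."""
    return INTERCEPT + sum(w * t for w, t in zip(COEFFS, temps))

def compensation_offset(temps):
    """The offset applied to the axis is the negated predicted error."""
    return -predict_error(temps)

# One simulated control cycle with invented sensor readings (degrees C):
readings = [25.0, 26.5, 24.0]
offset = compensation_offset(readings)
```

In a real deployment this cycle runs continuously inside the CNC system, so the compensation tracks the thermal state of the machine as it warms up.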
The thermal error compensation function is turned on after the machine has been running for 2 h to visually show the actual compensation effect. The thermal error changes of V1-V3 are plotted in Figure 9. After compensation, the thermal errors range from −0.50 to 2.41 µm in the X direction, −1.48 to 2.27 µm in the Y direction, and −2.45 to 2.26 µm in the Z direction, which proves the effectiveness of the proposed modeling method.

Figure 9. Thermal error measurement results after compensation.


Conclusions
To address the growing complexity and lack of interpretability of thermal error modeling, this study proposes an effective and practical method based on regularized regression. The optimal number of MTVs and the regularization coefficients are analyzed based on experimental data. The prediction models are established with experimental data under different experimental conditions. The proposed modeling method is compared with the ENR, ARX, LSTM, and GPR algorithms. The calculation results show that the proposed modeling method achieves the best prediction accuracy and robustness in all of the X, Y, and Z directions. Finally, the effectiveness of the proposed modeling method in real-world applications is proved by compensation experiments, which control the thermal errors within ±3 µm.
For future research, modeling methods with other normalizations of the temperature differences between experiments will be studied first. Second, in our verification experiments, the error drifted out of the tolerance bandwidth after a certain amount of time. To address this issue, online updating of the thermal error prediction model will be considered as important future work to improve its ability to maintain prediction accuracy. Last, a universal modeling method applicable to most types of machine tools will be studied.