Abstract

Section flattening often occurs in the hot bending of magnesium alloy tubes with large curvature. To control the forming quality of the tube, the section profile of the magnesium alloy tube must be measured online. In this paper, a laser vision system is used to measure the profile of the magnesium alloy tube. Due to the influence of the environment and the surface quality of the tube, there are obvious isolated outliers in the profile data, which seriously affect the accuracy and precision of the tube measurement. An outlier identification algorithm based on robust locally weighted regression and the PaйTa criterion is proposed. The algorithm is used to identify the typical isolated outliers in the measurement process, and its identification ability is discussed and compared with that of the moving mean identifier and the Hampel identifier. Subsequently, ellipse fitting of the profile data is carried out, and the fitted ellipse parameters and fitting precision of the curved section are obtained. The fitting results before and after the outliers are eliminated are also compared. The experiment shows that the outlier identification method based on robust locally weighted regression and the PaйTa criterion can effectively identify outliers in profile data, especially spot outliers. The algorithm is a robust, accurate, and efficient outlier identification method that improves the laser profile measurement accuracy of the tube section and is of great significance for the quality control of magnesium alloy tubes.

1. Introduction

Magnesium alloy has the advantages of low density, high specific strength, and high specific stiffness, so it is gradually receiving attention from various industries [1–4]. In particular, various types of magnesium alloy tubes are widely used in aviation, aerospace, transportation, and other fields [5–7]. However, various defects often occur in the hot bending forming process of magnesium alloy tubes with large curvatures, such as section flattening, excessive thinning, and wrinkle fracture [8–11]. Section flattening is an unavoidable problem in magnesium alloy tubes with large curvature. As shown in Figure 1, the section flattening of the magnesium alloy tube is mainly due to the action of the bending moments, which produce compressive stress on the inner side and tensile stress on the outer side, so that the pipe is bent under uneven forces on its two sides. As a result, the section is deformed to become approximately elliptical. For the same specification, the smaller the bending radius, the larger these stresses and the more obvious the trend of section flattening. If the tube is bent without a mandrel, the flattening is more serious, which affects the forming quality of the pipe. In order to control the forming quality of the tube, the section profile must be measured online, and the flattening of the magnesium alloy section must be detected in real time.

Pipe profile measurement generally uses contact measurement and noncontact measurement [12–15]. Machine vision is an important method in noncontact measurement. Compared with traditional manual contact measurement methods, machine vision measurement has high accuracy and fast speed and causes no damage [16–19]. In this study, a laser vision system based on line-structured light was used to measure the flattened section of the magnesium alloy tube caused by large-curvature hot bending and to obtain accurate profile data of the tube section.

Online vision measurement of tube bending is in strong demand to improve dimensional accuracy. However, the quality of the profile data deteriorates sharply due to the environment and the surface quality of the pipe. In particular, many isolated outliers appear in the measured data. Outliers are observed values that differ significantly from most other observed values, that is, measured data that do not obey the statistical distribution law of the data [20]. In this experiment, the outliers appear in isolation in the form of spots or points, and they are not necessarily related to the quality of the data before and after them. Therefore, they are also called isolated outliers. Outliers can be generated for a variety of reasons, such as sensor noise, channel interference, and human factors [12, 21]. These outliers cause data distortion, leading to the miscalculation of model parameters and wrong analysis results. Therefore, it is necessary to identify and eliminate the outliers in the laser measurement process to improve the precision and accuracy of the measurement.

Commonly used identification methods for outliers include the Nair, Grubbs, and Dixon tests [22, 23]. These methods are difficult to apply in laser profile measurement because of their restrictions on the data distribution and the amount of data. This paper presents an outlier identification method based on robust locally weighted regression (RLWR), which is applied to laser profile detection. RLWR is developed from locally weighted regression (LWR) and belongs to nonparametric estimation. Nonparametric estimation is an important research direction of modern statistical analysis. It can adapt to more complex nonlinear changes without assuming the specific form of the population distribution and error distribution and without directly obtaining a data model, which makes it more flexible, robust, and widely applicable than parametric estimation [24, 25].

RLWR is a robust fitting process that integrates local polynomial estimation and locally weighted regression with excellent smoothing performance. This method was first proposed by Cleveland [26], further elaborated in [27, 28], and subsequently improved by Jacoby [29] and Loader [30]. It has been gradually applied to different fields of scientific research and engineering. Ma et al. [31] used RLWR to reduce the impact of high-frequency noise on the superresolution enhancement of multiangle remote sensing imagery. Leonor et al. [32] minimized the influence of the inhomogeneity effect on the tree reradiation pattern by using RLWR. Chen et al. [33] proposed an algorithm based on RLWR and robust z-scores for the construction of pit-free canopy height models. Nurunnabi et al. [34, 35] and Liu et al. [36] used RLWR techniques to study the filtering of ground points in 3D point cloud data. Yu et al. [37] applied the method to the smoothing of combustion kinetics data of pine sawdust biochar.

This study is based on the robust smoothing performance of RLWR, which is used to smooth the laser measurement data and realize the identification of isolated outliers. At the same time, profile fitting was carried out to verify the effectiveness of this algorithm in the profile inspection.

This paper is organized as follows. Section 2 introduces the laser vision measurement system and its basic principles and analyzes the profile data of the magnesium alloy tube, pointing out the typical outliers in the data. In Section 3, the moving mean algorithm, the Hampel algorithm, and the RLWR algorithm are each adopted to identify the outliers. The section focuses on the outlier recognition effect of RLWR combined with the PaйTa criterion, the median criterion, and the quartile criterion, respectively. In Section 4, the outliers identified by the RLWR-based algorithm are removed, ellipse fitting of the profile data is performed, and the resulting ellipse parameters and fitting errors are compared with those obtained from the original data. Lastly, a conclusion is given in Section 5.

2. Laser Vision Measurement System and Profile Outliers

2.1. Laser Vision Measurement System

The laser vision measurement system adopted in this experiment mainly consists of a line-structured light sensor, an image processing unit, and a control unit, as shown in Figure 2. The system works as follows: the laser source in the sensor projects line-structured light onto the measured object, and the reflected light is captured by the camera in the sensor. After image processing, the profile data are recorded and displayed by an industrial PC. The measuring platform, driven by the motion control system, can move along the axial direction of the pipe, so that the profile of each section of the pipe can be measured.

Laser triangulation is the basic principle of line-structured light vision measurement. The line-structured light projected by laser source forms a scattered light band on the pipe surface, and the scattered light is imaged in CCD array. The distance and coordinate data of the measured point can be obtained through triangular geometry [38, 39].

As shown in Figure 3, the measurement geometry is defined by a measurement point on the pipe surface, its imaging point in the CCD image plane, the focal length of the camera, the distance between the center of the light source and the center of the camera lens, and the angle between the axis and the construction line formed by the measured point and the center of the light source. The precise coordinates of the measurement point on the measured profile can be obtained from the spatial geometry in the figure.

2.2. Typical Profile Outliers

Inspection of the profile data of each section obtained by the measuring system reveals two typical kinds of isolated outliers in the measured profile, that is, spot outliers and point outliers. The spot outliers, or speckle outliers, are composed of multiple point outliers, as shown on the right side of Figure 4. The double-point isolated outliers are shown in the middle of Figure 5.

To facilitate the analysis, the double-point outliers in Figure 5 are superimposed on the profile in Figure 4, as shown in Figure 6. Subsequently, the ability of different identification algorithms to recognize the two types of outliers is investigated.

3. Identification of Profile Outliers

3.1. Moving Mean Identifier Recognizes Isolated Outliers

The moving mean identifier, or movmean identifier, is the simplest way to identify isolated outliers [40, 41]. It is defined as follows. For the data sequence $x_1, x_2, \ldots, x_n$ and a moving window of length $k$, the outliers are judged by the following equation:

$$\left| x_i - \bar{x}_i \right| > 3\sigma_i,$$

where $\bar{x}_i$ is the local mean and $\sigma_i$ is the local standard deviation within the moving window centered at $x_i$. In other words, the moving mean identifier applies the $3\sigma$ criterion within a data window of length $k$: when the difference between the measured value and the local mean is greater than three times the local standard deviation, the value is considered an outlier.
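The criterion can be illustrated with a minimal NumPy sketch. The function name `movmean_outliers`, the truncation of the window at the ends of the profile, and the use of the sample standard deviation are assumptions made for this illustration and are not taken from the measurement software.

```python
import numpy as np


def movmean_outliers(x, window, n_sigma=3.0):
    """Flag x[i] when it lies more than n_sigma local standard deviations
    from the mean of the window centred on it."""
    x = np.asarray(x, dtype=float)
    half = window // 2
    flags = np.zeros(x.size, dtype=bool)
    for i in range(x.size):
        # Assumption: the window is simply truncated at both ends of the profile.
        seg = x[max(0, i - half): i + half + 1]
        mean = seg.mean()
        std = seg.std(ddof=1) if seg.size > 1 else 0.0
        flags[i] = abs(x[i] - mean) > n_sigma * std
    return flags


# A single spike on a flat synthetic profile (not measured data).
profile = np.zeros(50)
profile[25] = 5.0
print(np.flatnonzero(movmean_outliers(profile, window=21)))  # [25]
print(np.flatnonzero(movmean_outliers(profile, window=9)))   # []
```

The short window misses the spike because the spike itself inflates the local standard deviation: for a single spike in a flat window of $n$ points, the deviation from the local mean is only $(n-1)/\sqrt{n}$ times the local sample standard deviation, which exceeds 3 only for $n \geq 11$.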

The moving mean identifier is used to identify outliers in Figure 6, and the recognition results are shown in Figures 7 and 8.

The moving window length is gradually increased. When the window length increases to 19, only one point outlier is identified, as shown in Figure 7. When the window length is increased to 25, the threshold range is reduced in the middle of the profile so that the double-point outliers in the middle could be completely identified (see Figure 8). However, the spot outliers on the right are never identified regardless of the length of the moving window.

It can be seen from Figures 7 and 8 that the distinguishing threshold at the location of outliers is greatly increased due to the existence of outliers in the middle of the profile. With the increase of window length, the threshold curve in the middle of the profile is gradually smooth and the threshold range is gradually reduced. However, the upper and lower threshold ranges on both sides of the profile increase significantly. It indicates that the identification ability of outliers on both sides of the profile decreases with the increase of window width.

3.2. Hampel Identifier Recognizes Isolated Outliers

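The Hampel identifier is a median-based identification method. It uses the median and the median absolute deviation as robust estimates of the location and spread of the data, which gives it good robustness [42–45].

The Hampel identifier is defined as follows: for the data sequence $x_1, x_2, \ldots, x_n$, the number of neighbors on either side of $x_i$ is $k$; then, the moving window length is $2k + 1$, and the local median is $m_i = \operatorname{median}(x_{i-k}, \ldots, x_i, \ldots, x_{i+k})$.

The local scaled median absolute deviation ($S_i$) is expressed as follows:

$$S_i = \kappa \cdot \operatorname{median}\left( |x_{i-k} - m_i|, \ldots, |x_{i+k} - m_i| \right),$$

where $\kappa = 1 / \left( \sqrt{2}\, \operatorname{erfc}^{-1}(1/2) \right) \approx 1.4826$, which makes $S_i$ an unbiased estimate of the standard deviation for Gaussian-distributed data.

When the difference between the measured data and the local median is greater than $t$ times $S_i$, where $t$ is the chosen threshold multiple, the measured value is considered to be an outlier, as shown in equation (4):

$$|x_i - m_i| > t\, S_i. \tag{4}$$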
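A minimal sketch of the Hampel identifier is given below. The function name `hampel_outliers`, the default threshold multiple of 3, and the truncation of the window at the ends of the data are assumptions of this illustration.

```python
import numpy as np

KAPPA = 1.4826  # approx. 1 / (sqrt(2) * erfcinv(1/2)); scales the MAD for Gaussian data


def hampel_outliers(x, k, n_sigma=3.0):
    """Flag x[i] when it deviates from the local median by more than
    n_sigma scaled MADs, using k neighbours on either side."""
    x = np.asarray(x, dtype=float)
    flags = np.zeros(x.size, dtype=bool)
    for i in range(x.size):
        # Assumption: the window is truncated at both ends of the data.
        seg = x[max(0, i - k): i + k + 1]
        med = np.median(seg)
        scaled_mad = KAPPA * np.median(np.abs(seg - med))
        flags[i] = abs(x[i] - med) > n_sigma * scaled_mad
    return flags


# Synthetic ramp with a double-point outlier (not measured data).
ramp = 0.1 * np.arange(30)
ramp[14:16] += 5.0
print(np.flatnonzero(hampel_outliers(ramp, k=3)))  # [14 15]
```

Because the median and the MAD of a seven-point window are not moved by two contaminated samples, both points are flagged with a short window. A cluster that fills more than half of the window would shift the local median itself, which is one plausible reason why wider spot outliers are harder for this identifier.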

The Hampel identifier is used to recognize the outliers in Figure 6. The identification effect is observed by changing the length of the moving window while the threshold multiple is held fixed. As shown in Figure 9, when the window length is 5, the threshold range of identification is relatively narrow. The Hampel identifier is able to identify the double-point outlier in the middle of the profile, but it is unable to identify the spot outliers on the right side. At the same time, the identification threshold fluctuates significantly at the spot outliers, and some normal data are misidentified as outliers. In addition, influenced by the discontinuous data on both sides of the profile, the identification threshold fluctuates greatly there and the endpoint data are identified as outliers.

When the window length is increased to 25, as shown in Figure 10, the upper and lower identification threshold on both sides of the profile increased significantly and the central double point outliers can be effectively identified, but the spot outliers on the right side of the profile are unrecognized. It is worth noting that the misidentification in Figure 9 no longer appeared. Meanwhile, several discontinuous data at the right end of the profile are identified as outliers.

If the moving window length continues to increase, the identification threshold on both sides of the profile will increase accordingly. Although the double-point outliers can be identified, the spot outliers are still unrecognizable.

Compared with the moving mean identifier, the Hampel identifier significantly reduces the threshold range and the fluctuation phenomenon, which can effectively identify point outliers, but it is still unable to identify spot outliers. In addition, the large interval of profile data will increase the difference between the data and the median, resulting in a large fluctuation for threshold, which affects its ability to identify outliers.

3.3. Recognition of Isolated Outliers Based on RLWR

In this section, the RLWR algorithm is combined with the PaйTa criterion, the median criterion, and the quartile criterion to identify the isolated outliers. An appropriate smoothing window length is chosen, and the identification effect of each combination is observed.

3.3.1. Identification of Outliers Based on RLWR and PaйTa Criterion

The RLWR smoothing algorithm is combined with the PaйTa criterion, i.e., 3σ criterion, to identify the laser profile outliers; this algorithm can be referred to as the RLWRP identifier. The basic approach of this identifier is to smooth the profile data by using the RLWR algorithm firstly, and then the residual between smoothing data and original data is calculated; finally, we use the PaйTa criterion to identify outliers.

The algorithm of RLWRP identifier is as follows.

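The measured data sequence is $(x_i, y_i)$, $i = 1, 2, \ldots, n$. The data model assumes the following:

$$y_i = g(x_i) + \varepsilon_i, \tag{5}$$

where $g(x)$ is a smooth function of $x$ and $\varepsilon_i$ is independent and normally distributed with zero mean and variance $\sigma^2$.

Set the smoothing coefficient as $f$, where $0 < f \le 1$, and round $fn$ to the nearest integer to get the data width $r$. Taking each observation point $x_i$ as the center, select an appropriate $f$ to determine the smoothing window length $r$.

Subsequently, a weight function is selected for the locally weighted regression (LWR). LWR typically uses the tricube weight function for the weighted least squares fit, defined as

$$w_k(x_i) = \begin{cases} \left( 1 - \left| \dfrac{x_k - x_i}{d(x_i)} \right|^3 \right)^3, & |x_k - x_i| < d(x_i), \\ 0, & \text{otherwise}, \end{cases}$$

where $x_k$ is a neighbor of $x_i$ within the smoothing window and $d(x_i)$ is the distance between $x_i$ and the farthest point in the smoothing window. The value of $w_k(x_i)$ is a maximum for the point closest to $x_i$ and reduces to zero for the point farthest from $x_i$ in the smoothing window.

The parameter estimates are obtained by the weighted least squares method. For a local polynomial of degree $p$ fitted to the model in equation (5), they are the values of $\beta_0, \beta_1, \ldots, \beta_p$ that minimize

$$\sum_{k=1}^{n} w_k(x_i) \left( y_k - \beta_0 - \beta_1 x_k - \cdots - \beta_p x_k^p \right)^2.$$

The coefficients from each local neighborhood are used to estimate the fitted value $\hat{y}_i$ at $x_i$.

Generally, the RLWR selects the bisquare weight function as follows:

$$B(u) = \begin{cases} \left( 1 - u^2 \right)^2, & |u| < 1, \\ 0, & |u| \ge 1, \end{cases}$$

where $u = e_k / (6s)$, in which $e_k$ is the fitting residual, i.e., $e_k = y_k - \hat{y}_k$, and $s$ is the median of $|e_k|$. The robustness weight of the $k$th point is $\delta_k = B\left( e_k / (6s) \right)$.

Replace $w_k(x_i)$ with $\delta_k w_k(x_i)$ as the new weight, which is used to estimate the new set of RLWR coefficients by minimizing the weighted error sum of squares:

$$\sum_{k=1}^{n} \delta_k w_k(x_i) \left( y_k - \beta_0 - \beta_1 x_k - \cdots - \beta_p x_k^p \right)^2.$$

The new RLWR fitting value is calculated using the weighted least squares method. The robust enhancement steps above are repeated, and the final robust locally weighted fitting value is obtained.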
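The smoothing steps above can be sketched in NumPy as follows. This is an illustration under stated assumptions rather than the implementation used in the experiments: the function name `rlwr_smooth`, the local linear fit (`degree=1`), the number of robustness iterations (`n_robust=3`), and the choice of the `window` nearest points as the local neighborhood are all assumptions, and the sketch has no guard for neighborhoods in which every weight becomes zero.

```python
import numpy as np


def rlwr_smooth(x, y, window, degree=1, n_robust=3):
    """Robust locally weighted regression with tricube neighbourhood
    weights and bisquare robustness weights."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    delta = np.ones(n)            # robustness weights, all 1 on the first pass
    y_hat = np.empty(n)
    for it in range(n_robust + 1):
        for i in range(n):
            dist = np.abs(x - x[i])
            idx = np.argsort(dist, kind="stable")[:window]   # nearest points
            d = dist[idx].max()
            w = (1.0 - (dist[idx] / d) ** 3) ** 3            # tricube weights
            sw = np.sqrt(w * delta[idx])
            # Polynomial in (x - x_i), so the intercept is the fit at x_i.
            basis = np.vander(x[idx] - x[i], degree + 1, increasing=True)
            beta, *_ = np.linalg.lstsq(basis * sw[:, None], y[idx] * sw, rcond=None)
            y_hat[i] = beta[0]
        e = y - y_hat
        s = np.median(np.abs(e))
        if it == n_robust or s == 0.0:
            break
        u = e / (6.0 * s)
        delta = np.where(np.abs(u) < 1.0, (1.0 - u ** 2) ** 2, 0.0)  # bisquare
    return y_hat


# Synthetic noisy line with one spike (not measured data).
rng = np.random.default_rng(0)
x = np.arange(60.0)
y = 0.5 * x + 0.1 * rng.standard_normal(60)
y[30] += 3.0
smooth = rlwr_smooth(x, y, window=11)
print(abs(smooth[30] - 0.5 * x[30]))  # remains small: the spike is down-weighted
```

Setting `n_robust=0` reduces the sketch to plain locally weighted regression, in which the spike pulls the fitted value at its own position toward itself; the robustness iterations remove that bias by assigning the spike a bisquare weight of zero.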

Next, the outliers are identified according to the 3σ criterion. The residual $e_i = y_i - \hat{y}_i$ is calculated from the original measurement data $y_i$ and the smoothing value $\hat{y}_i$ obtained by the RLWR algorithm, and then we get the mean of the residual, i.e., $\bar{e} = \frac{1}{n} \sum_{i=1}^{n} e_i$. Finally, the standard deviation is obtained:

$$\sigma_e = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} \left( e_i - \bar{e} \right)^2}.$$

When the difference between the residual and its mean is greater than three times the standard deviation, the point is considered to be an outlier, as shown in the following equation:

$$\left| e_i - \bar{e} \right| > 3\sigma_e.$$
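The residual test can be written in a few lines. In this sketch, `pauta_outliers` is a hypothetical helper name, `y_smooth` stands for any smoothed profile (for example, an RLWR result), and the sample standard deviation with $n - 1$ in the denominator is an assumption.

```python
import numpy as np


def pauta_outliers(y, y_smooth, n_sigma=3.0):
    """3-sigma test on the residuals between the data and a smoothed curve."""
    e = np.asarray(y, dtype=float) - np.asarray(y_smooth, dtype=float)
    return np.abs(e - e.mean()) > n_sigma * e.std(ddof=1)


# Synthetic residuals of +/-0.01 around a zero baseline plus one large deviation.
y = 0.01 * (-1.0) ** np.arange(50)
y[10] = 1.0
print(np.flatnonzero(pauta_outliers(y, np.zeros(50))))  # [10]
```

Unlike the moving mean identifier, the mean and standard deviation here are computed once over the whole residual sequence, so the threshold is a pair of horizontal lines in the residual plot.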

RLWRP identifier is used to identify the outliers in Figure 6, and its identification ability is observed under different smoothing windows. The length of smoothing window increases gradually from 5. This method shows good recognition effect when the length of smoothing window increases to 11.

As shown in Figure 11, the blue dot is the original data and the red dash dot line is the RLWR smoothing curve. The RLWR smoothing curve retains the characteristics of original profile without the risk of excessive smoothing. In Figure 12, the distribution trend of the original data is removed, which is obtained by using the residual between the smoothing data and the original data, and the outliers in the residual are identified by the 3σ criterion, as shown in the box. The identification result also is plotted in Figure 11. It can be seen from the figure that the method successfully identifies the double-point outlier and the spot outlier. Meanwhile, a few discontinuous data at the right end of the profile are recognized as outliers.

3.3.2. Identification of Outliers Based on RLWR and Median Criterion

This section uses RLWR smoothing algorithm combined with median criterion to identify outliers; it can be called the RLWRM identifier. The algorithm is as follows.

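According to the residual $e_i$ between the original data and the RLWR smoothing value obtained in the above section, the median of the residual sequence is calculated, i.e., $m = \operatorname{median}(e_1, e_2, \ldots, e_n)$.

The scaled median absolute deviation (MAD) is defined as follows:

$$\mathrm{MAD} = \kappa \cdot \operatorname{median}\left( |e_i - m| \right),$$

where $\kappa = 1 / \left( \sqrt{2}\, \operatorname{erfc}^{-1}(1/2) \right) \approx 1.4826$.

When a data element deviates from the median by more than three times the MAD, it is considered to be an outlier, as shown below:

$$|e_i - m| > 3\, \mathrm{MAD}.$$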
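A minimal sketch of the median criterion applied to a residual sequence is shown below; the function name `median_criterion_outliers` and the synthetic residuals are assumptions of this illustration.

```python
import numpy as np


def median_criterion_outliers(e, n_mad=3.0):
    """Flag residuals more than n_mad scaled MADs away from the median."""
    e = np.asarray(e, dtype=float)
    m = np.median(e)
    mad = 1.4826 * np.median(np.abs(e - m))
    return np.abs(e - m) > n_mad * mad


# Synthetic residuals: small scatter, one moderate deviation, one large one.
e = np.tile([-0.01, 0.0, 0.01], 10)
e[5] = 0.05
e[20] = 1.0
print(np.flatnonzero(median_criterion_outliers(e)))  # [ 5 20]
```

In this example the moderate deviation at index 5 is flagged together with the large one, because the MAD is not inflated by the large residual. A 3σ test on the same sequence flags only index 20, which illustrates why the median criterion tends to mark more points as outliers.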

The RLWRM identifier is used to identify outliers in Figure 6. The RLWR smoothing data are used when the smoothing window length is 11. The identification effect is shown in Figures 13 and 14. In Figure 13, the blue point is the residual between RLWR smoothing data and original data and the red boxes are the outliers identified by the median criterion. The identification result and the smoothed value are plotted in Figure 14 for observation. It can be seen from the figure that although the RLWRM identifier can recognize the double-point outliers and the spot outliers, there are many misidentifications about the isolated outliers.

3.3.3. Identification of Outliers Based on RLWR and Quartile Criterion

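This section uses the RLWR smoothing algorithm combined with the quartile criterion to identify outliers; it can be called the RLWRQ identifier. The quartile criterion is a relatively robust identification method; the algorithm divides the sorted data into quarters, and $Q_1$, $Q_2$, and $Q_3$ are their break points. $Q_1$ is the lower quartile (25th percentile), $Q_2$ is the median (50th percentile), and $Q_3$ is the upper quartile (75th percentile). The interquartile range (IQR) is introduced here as a statistic for checking outliers, i.e., $\mathrm{IQR} = Q_3 - Q_1$.

Outliers are defined as elements more than 1.5 IQR above $Q_3$ or below $Q_1$, that is, an element $e_i$ of the residual sequence is an outlier when

$$e_i > Q_3 + 1.5\, \mathrm{IQR} \quad \text{or} \quad e_i < Q_1 - 1.5\, \mathrm{IQR}.$$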
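The quartile criterion can be sketched as follows. The function name `quartile_outliers` is hypothetical, and the quartiles are computed with NumPy's default linear interpolation, which may differ slightly from the quartile definition used in the experiments.

```python
import numpy as np


def quartile_outliers(e, k=1.5):
    """Flag elements more than k * IQR above Q3 or below Q1."""
    e = np.asarray(e, dtype=float)
    q1, q3 = np.percentile(e, [25, 75])
    iqr = q3 - q1
    return (e < q1 - k * iqr) | (e > q3 + k * iqr)


# Ten regular values and one extreme value: Q1 = 3.5, Q3 = 8.5, IQR = 5,
# so the fences are -4 and 16 and only the last element is flagged.
values = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 100], dtype=float)
print(np.flatnonzero(quartile_outliers(values)))  # [10]
```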

The RLWRQ identifier is used to identify outliers in Figure 6. Similarly, the smoothing window length of RLWR is 11. The identification results are shown in Figures 15 and 16. It can be seen from the figures that the identification effect of the RLWRQ identifier is similar to the RLWRM identifier. This algorithm also has misidentification of outliers and identifies more normal data as outliers. If these values are removed, the profile will not be truly reflected, forming new errors and affecting the measurement accuracy.

4. Profile Fitting and Error Analysis

According to the previous section, the RLWRP identifier can obtain better identification results for the isolated outlier. The identification results in Figure 13 are used to remove outliers, and the ellipse fitting experiment is performed by the least squares method.

The least squares method is one of the most important methods of data fitting; it is simple, effective, and widely applicable. Therefore, this paper chooses the algebraic least squares method to carry out the ellipse fitting, which determines the ellipse parameters by minimizing the sum of squared algebraic distances from the data points to the fitted ellipse.

The elliptic algebraic equation is expressed as follows:

$$A x^2 + B x y + C y^2 + D x + E y + F = 0.$$

According to the principle of the least squares method, the objective function to be minimized is as follows:

$$f(A, B, C, D, E, F) = \sum_{i=1}^{n} \left( A x_i^2 + B x_i y_i + C y_i^2 + D x_i + E y_i + F \right)^2.$$

To minimize $f$ on the basis of the extreme value principle, the following equation exists:

$$\frac{\partial f}{\partial A} = \frac{\partial f}{\partial B} = \frac{\partial f}{\partial C} = \frac{\partial f}{\partial D} = \frac{\partial f}{\partial E} = \frac{\partial f}{\partial F} = 0.$$

Thus, a system of linear equations is obtained. By solving the linear equations in combination with the constraints, the values of the equation coefficients can be obtained. Finally, the ellipse equation is obtained and the fitted ellipse is drawn.
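A minimal sketch of an algebraic least squares ellipse fit is given below. The constraint used in the experiments is not stated in the text, so the normalization $A + C = 1$ is an assumption chosen here because it turns the problem into an ordinary linear least squares problem; the function name `fit_ellipse_algebraic` and the synthetic test ellipse are likewise illustrative.

```python
import numpy as np


def fit_ellipse_algebraic(x, y):
    """Fit A x^2 + B xy + C y^2 + D x + E y + F = 0 under the assumed
    normalisation A + C = 1; return (centre, semi-major, semi-minor)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # With C = 1 - A the conic is linear in (A, B, D, E, F):
    # A (x^2 - y^2) + B xy + D x + E y + F = -y^2
    design = np.column_stack([x**2 - y**2, x * y, x, y, np.ones_like(x)])
    (A, B, D, E, F), *_ = np.linalg.lstsq(design, -y**2, rcond=None)
    C = 1.0 - A
    M = np.array([[A, B / 2.0], [B / 2.0, C]])
    g = np.array([D, E])
    centre = np.linalg.solve(2.0 * M, -g)      # gradient of the conic vanishes
    f_c = F + 0.5 * g @ centre                 # conic value at the centre
    lam = np.linalg.eigvalsh(M)                # eigenvalues in ascending order
    if lam[0] <= 0.0 or f_c >= 0.0:
        raise ValueError("fitted conic is not an ellipse")
    semi_major, semi_minor = np.sqrt(-f_c / lam)
    return centre, semi_major, semi_minor


# Synthetic rotated ellipse: centre (2, -1), semi-axes 3 and 1.5.
t = np.linspace(0.0, 2.0 * np.pi, 50, endpoint=False)
th = 0.3
px = 2.0 + 3.0 * np.cos(t) * np.cos(th) - 1.5 * np.sin(t) * np.sin(th)
py = -1.0 + 3.0 * np.cos(t) * np.sin(th) + 1.5 * np.sin(t) * np.cos(th)
centre, a, b = fit_ellipse_algebraic(px, py)
print(centre, a, b)  # close to [2, -1], 3.0, 1.5
```

Because the fit minimizes algebraic rather than geometric distance, a few points far from the true section can shift the coefficients considerably, which is why removing outliers before fitting matters.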

The fitting results are compared before and after outlier eliminating, as shown in Figure 17. Meanwhile, the fitting ellipse parameters are shown in Table 1.

In Figure 17, the blue dots are the original data, the boxes are the outliers that are identified, the dotted line is the fitted ellipse of the original data, and the dash-dotted line is the fitting ellipse after removing the outliers.

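Before removing outliers, the center of the ellipse fitted with the original data is marked in Figure 17, the diameter of the ellipse's major axis is 11.7197 mm, the eccentricity is 0.9263, and the ellipticity is 0.6233. The ellipticity is defined as follows:

$$\text{ellipticity} = \frac{a - b}{a},$$

where $a$ and $b$ are the lengths of the major and minor axes of the fitted ellipse, respectively.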
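The relation between the two reported shape parameters can be checked numerically. The sketch below assumes that the ellipticity is $(a - b)/a$ and uses the standard eccentricity $\sqrt{1 - b^2/a^2}$; the helper name `ellipticity_from_eccentricity` is hypothetical.

```python
import math


def ellipticity_from_eccentricity(ecc):
    """Ellipticity (a - b) / a of an ellipse with the given eccentricity."""
    axis_ratio = math.sqrt(1.0 - ecc ** 2)   # b / a
    return 1.0 - axis_ratio


# Reported (eccentricity, ellipticity) pairs before and after outlier removal.
for ecc, reported in [(0.9263, 0.6233), (0.6739, 0.2611)]:
    print(ecc, ellipticity_from_eccentricity(ecc), reported)
```

Under this assumption, the values computed from the reported eccentricities differ from the reported ellipticities only in the fourth decimal place, which is consistent with rounding of the inputs.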

After eliminating the outliers, the center of the fitted ellipse is “+,” the major axis diameter is 15.8586 mm, the eccentricity is 0.6739, and the ellipticity is 0.2611.

At the same time, the profile was measured by the coordinate measuring machine (CMM). The measurement data are shown in Table 1. By comparison, it is found that the fitting ellipse with outliers removed is similar to the measurement results of CMM.

From the above figure and table, it can be found that the fitting result of the original profile has deviated from the actual situation. After removing the outliers by using the RLWRP identifier, the fitted ellipse conforms to the measurement reality.

In addition, the fitting error before and after removing outliers is analyzed, and the results are shown in Table 2. For the original profile, the sum of squares due to error (SSE) is $7.1938 \times 10^{-6}$ and the root mean square error (RMSE) is $1.5750 \times 10^{-4}$. After eliminating the outliers, the SSE and the RMSE of the fitted ellipse are reduced to $2.1157 \times 10^{-6}$ and $8.6617 \times 10^{-5}$, respectively. This shows that eliminating the outliers greatly reduces the fitting error and improves the fitting accuracy, so the measurement results are more accurate.

5. Conclusions

In this paper, the profile of magnesium alloy tubes with large curvature is measured by a laser vision system based on a line-structured laser. There are two typical kinds of outliers in the measurement, that is, point outliers and spot outliers. The moving mean method, the Hampel method, and the RLWR method were each adopted to identify these outliers in the profile data, and the ability of each method to identify outliers was discussed for different window lengths.

The experiment found that all the above methods could identify the isolated point outliers, but neither the moving mean method nor the Hampel method could identify the isolated spot outliers. This article focuses on the RLWR method for the isolated outliers, combined with the PaйTa criterion, the median criterion, and the quartile criterion. The research shows that the RLWR smoothing algorithm combined with the PaйTa criterion, i.e., the RLWRP identifier, can identify different types of outliers more accurately and with a lower misidentification rate.

At the same time, according to the outlier identification result of the RLWRP identifier, the profile fitting was carried out by the algebraic least squares method. Then, the main parameters of the fitted ellipse were obtained, and the fitting errors were calculated. The comparison of the fitting results before and after the outlier processing shows that data contaminated by outliers lead to a large deviation of the profile fitting and to wrong profile shape parameters. The RLWRP identifier is a robust, accurate, and efficient outlier identification method, which can effectively deal with outliers in profile data, especially spot outliers. This algorithm is suitable for data cleaning in line-structured light measurement and can effectively improve the precision and accuracy of online profile measurement in the hot bending of magnesium alloy tubes.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was supported by the Shanxi Engineering Technology Research Center Construction Project (201805D121006) and supported by the Science and Technology Major Project of Shanxi Province (20181102016 and 20181102002) and the Natural Science Foundation of Shanxi Province (201801D121169) and sponsored by the Collaborative Innovation Center of Taiyuan Heavy Machinery Equipment (1331 Project).