Detection and Recognition of Defects in X-ray Images of Welding Seams under Compressed Sensing

To detect defects in the welding seams of submerged arc welding, X-ray images were used. Through comprehensive analysis and experimental study, an offline database of welding seam defects was established, and an online intensification and segmentation algorithm as well as a recognition method based on compressed sensing were designed for defect images. First, the defect database of X-ray images of welding seams was established from offline data. After the welding images were acquired, a defect segmentation and extraction algorithm based on the clustering method was proposed. A series of characteristic values of the defects in the offline database was used as atoms in the dictionary of the compressed sensing algorithm, and the atoms were optimized with the PCA method to improve the processing speed. Supported by the optimal dictionary, the category of defects was obtained. Circular and linear defects were analyzed in practice, and ROC curves of the classification were given for two cases.


Introduction
Common NDT methods included the ray detection method, ultrasonic detection method, magnetic detection method, eddy-current detection method, and penetration detection method. Among the various NDT technologies, processing based on X-ray images of welding seams was the method used by many enterprises. For most welding seam images, segmentation should be carried out first to acquire the defect or noise area. Exact image segmentation was an important premise for defect detection. In literature [1], image segmentation was carried out by feature vector clustering, which achieved some results. Literature [2] pointed out that the algorithm of literature [1] did not consider the spatial factor of pixels, so that only second-best segmentation results could be obtained for some images. In literature [3], the Self-organizing Feature Map Neural Network was used to realize binary segmentation of the image. In literature [4], parallel image segmentation was carried out with a Hopfield network, which involved many segmentation problems, and simulated annealing was introduced to avoid being trapped in a local minimum. The X-ray detection images used in actual industrial production had low contrast, blurry edges, heavy noise and a fluctuating background, which increased the difficulty of real-time detection of defects in welding seams. The segmentation effect had to be further improved for effective recognition, which was studied in detail here. It should be noted that segmentation must be based on image intensification, such as median filtering [5], gradient inverse weighting [6], robust statistics [7], and histogram equalization [8]. For X-ray images of welding seams, suitable intensification strategies should be chosen according to the characteristics of the images [10].
In the past, the application of compressed sensing in image processing was generally concentrated on the transfer of image information, with few studies on image recognition. Some studies showed that compressed sensing had strong robustness in face image recognition [11][12]. Further study of compressed sensing and its application in recognition and classification could address the poor robustness of existing algorithms when they processed blurred X-ray images of welding seams. In particular, because the coefficient vector of compressed sensing was taken as a characteristic that reflected the image to be detected as a whole, a change in any single coefficient would not affect the overall judgment result. The study was therefore aimed at the coefficient vector, and the establishment of a nonparametric mathematical model of defects in welding seams based on the coefficient vector was conducive to improving and enriching the recognition theory of defects in welding seams.
For the links put forward above, including intensification and segmentation of welding seam images and recognition of defects, the recognition method was studied using the corresponding algorithms and compressed sensing theory, which was conducive to the early discovery and treatment of welding defects, thus avoiding the occurrence of accidents.

Offline establishment of defect database
In the study, the detection images of submerged arc welding seams provided by the X-ray detection section of an oil steel pipe factory were taken as the object. A linear-array image intensifier was used to visualize the attenuated X-rays, and an American PE field device was used to collect video information on the welding seams while the welded pipes were moving. In order to save the data, the VGA signals of the PE device were saved to the server. Some welding seams in the established image database were shown in

Online intensification of images of welding seams
The noise of images of welding seams was mainly generated from the heating, photoelectric conversion and image transformation processes of the resistive device. In the sin-function transformation formula used for intensification, $f(x, y)$ indicated the grayscale of a pixel before the transformation; $g(x, y)$ indicated the grayscale after the transformation; $m$ and $n$ respectively indicated the highest and lowest pixel grayscale of the image before the transformation. The welding seam images with poor contrast were processed globally with the sin function, and the processed images were shown in Figure 3.

Figure 3. Effect of thin-wall welded pipes and thick-wall welded pipes after the intensification of sin function: (a) intensification effect and grayscale histogram of thin-wall welded pipes; (b) intensification effect and grayscale histogram of thick-wall welded pipes.

The processed images showed that, for both thin-wall and thick-wall welded pipes, the contrast between the welding seam area and the background area was significantly improved, and the grayscale values were evenly distributed over the whole grayscale interval. Especially for the intensified images of thick-wall welded pipes, the grayscale distribution showed a double-peak curve, as shown in Fig 3(d).
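A sine-based global stretch of this kind can be sketched as follows. The exact transfer curve is an assumption here (a commonly used form that maps the input range [n, m] onto [0, 255]), not necessarily the formula used in this study, and the function name `sine_stretch` is hypothetical.

```python
import math

def sine_stretch(image):
    """Map grayscale values to [0, 255] with a sine-shaped transfer curve.

    Assumed form (not taken from the source):
        g = 127.5 * (1 + sin(pi * (f - (m + n) / 2) / (m - n)))
    where m and n are the highest and lowest grayscale values of the input.
    """
    pixels = [value for row in image for value in row]
    m, n = max(pixels), min(pixels)
    if m == n:  # flat image: nothing to stretch, return a copy
        return [row[:] for row in image]
    mid = (m + n) / 2.0
    return [
        [round(127.5 * (1.0 + math.sin(math.pi * (f - mid) / (m - n)))) for f in row]
        for row in image
    ]

# Illustrative low-contrast patch (made-up values).
low_contrast = [[100, 110, 120], [130, 140, 150]]
stretched = sine_stretch(low_contrast)
```

Under this assumed curve the darkest input pixel maps to 0 and the brightest to 255, while mid-range values are spread more steeply than the extremes, which is one way to raise the contrast between seam and background.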

Segmentation and extraction of defect area
The core idea of clustering was that, as long as the grayscale density at the center point of an area was larger than a certain threshold, the point was added to the similar cluster. The density-based clustering algorithm DBSCAN was used here to carry out the segmentation of welding defects. The grayscale density of the image was defined as in Formula (2). The clustering based on the grayscale density was realized according to the following steps:
(1) The ROI of the original image was filtered, the grayscale density of each point was marked, and the class value Cls of all pixels was set to 0.
(2) The clustering radius was set to Eps=2. The cluster template was shown in Fig 4, in which the hollow circle was the center point. Each pixel was clustered according to the 5*5 template.
(4) The template was made to traverse each pixel in the ROI area, and the pixels whose grayscale density differed from that of the template center point by a value within 3 were defined as "approximate points of grayscale density".
(5) The number of approximate points of grayscale density in the template area was judged. When it was larger than the lower limit of the clustering density value, the approximate points of grayscale density within this area were clustered into one class; otherwise, the center point of the template was defined as a noise point, and the corresponding class value was set to Cls=0.
(6) Within the template area, if the class value of a certain approximate point of grayscale density was Cls>0, the class value Cls of the other approximate points of grayscale density in this area was set to the same value. If the class value of all approximate points of grayscale density within the template was Cls=0, a new cluster was defined, and the class value Cls of all approximate points of grayscale density within the template area was set to the number of the new cluster.
(7) If the total class number NCls>255 in the ROI area, the original image was filtered, and the clustering was carried out again from Step (1).
The defects in welding seams were processed with the density clustering method based on DBSCAN, and the segmentation effect was shown in Fig 5.
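The template-based clustering steps can be sketched as a DBSCAN-style region growing pass. This is a simplified illustration under stated assumptions: the per-pixel grayscale density is taken as a ready-made input (the grayscale itself is used as a stand-in in the example), the lower limit `min_pts` is a hypothetical value, and the re-filtering of Step (7) is not implemented.

```python
from collections import deque

def density_cluster(density, eps=2, tol=3, min_pts=6):
    """Label pixels with a DBSCAN-style pass over a (2*eps+1) x (2*eps+1) template.

    density : 2-D list of per-pixel grayscale-density values (stand-in input)
    tol     : largest difference for an "approximate point" (3 in the text)
    min_pts : lower limit of the clustering density (hypothetical value)
    Returns (labels, n_cls); a label of 0 marks a noise point.
    """
    rows, cols = len(density), len(density[0])
    labels = [[0] * cols for _ in range(rows)]

    def approximate_points(r, c):
        # Pixels inside the template whose density is within tol of the center.
        points = []
        for i in range(max(0, r - eps), min(rows, r + eps + 1)):
            for j in range(max(0, c - eps), min(cols, c + eps + 1)):
                if abs(density[i][j] - density[r][c]) <= tol:
                    points.append((i, j))
        return points

    n_cls = 0
    for r in range(rows):
        for c in range(cols):
            if labels[r][c] != 0:
                continue
            seeds = approximate_points(r, c)
            if len(seeds) <= min_pts:
                continue  # too sparse: stays noise unless a cluster reaches it later
            n_cls += 1
            labels[r][c] = n_cls
            queue = deque(seeds)
            while queue:
                i, j = queue.popleft()
                if labels[i][j] != 0:
                    continue
                labels[i][j] = n_cls
                neighbours = approximate_points(i, j)
                if len(neighbours) > min_pts:  # dense enough: keep growing
                    queue.extend(neighbours)
    return labels, n_cls

# Illustrative 8x8 patch: flat background, a dark 3x3 blob, one bright outlier.
img = [[100] * 8 for _ in range(8)]
for i in range(2, 5):
    for j in range(2, 5):
        img[i][j] = 40
img[7][7] = 200
labels, n_cls = density_cluster(img)
```

In this example the background and the dark blob end up in separate classes, and the isolated bright pixel keeps the class value 0, which corresponds to the noise case of Step (5).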

Recognition rules
Different features represented different physical meanings, and the ranges of these feature parameters were different. A group of defect feature values was considered as an instance of the defect. Thus, a row of data had K defect features, recorded as $y \in \mathbb{R}^K$, in which $y$ was an instance of the defect. In the expressions below, the defect instance $y$ was no longer considered as an image constituted by pixels, but as a one-dimensional vector constituted by several defect feature values; this vector represented an image.
The defect images of the $i$-th class constituted the dictionary set $A_i$, and the dictionary sets of all classes formed the dictionary $A = [A_1, A_2, \ldots, A_K]$. When a fault instance $y$ to be recognized was given, its category was unknown at the beginning. The dictionary $A$ was used to represent the fault instance to be recognized, as shown in Formula (4):

$$y = x_{1,1} a_{1,1} + \cdots + x_{1,n_1} a_{1,n_1} + \cdots + x_{K,1} a_{K,1} + \cdots + x_{K,n_K} a_{K,n_K} \tag{4}$$

in which $a_{i,j}$ was the $j$-th atom of the $i$-th class, $n_i$ was the number of atoms of that class, and $x_{i,j}$ was the corresponding coefficient. The solved vector $x$ was a typical sparse signal, only a few components of which were non-zero. Combining it with the dictionary matrix $A$, the category of $y$ could be known. Formula (6) expressed this idea of sparse representation theory as a mathematical model. The original NP-hard problem was transformed into the minimization of the 1-norm, for which fast and effective mathematical tools were available for iterative solution. Once $x$ was solved accurately, the classification could be carried out, because $x$ was very sparse and its non-zero values were generally concentrated on the correct category, thus achieving the purpose of fault recognition. However, the non-zero values were sometimes not concentrated enough. Even when they were concentrated enough, a method without a human in the loop was needed, which meant that the judgment should be made automatically. To solve this problem, the following method could be used for calculation and judgment without human involvement, namely, by calculating the difference.
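The recognition rule can be illustrated with a small sketch that solves for a sparse coefficient vector $x$ over the dictionary $A$ and then decides the class automatically. Two assumptions are made: the 1-norm problem is replaced by its regularized form and solved with ISTA (iterative soft thresholding), which is a swapped-in solver and not necessarily the tool used in this study, and the "difference" is taken to be the class-wise reconstruction error that is customary in sparse-representation classification. All names and feature values are hypothetical.

```python
import math

def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (proximal step of the 1-norm)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def sparse_code(atoms, y, lam=0.01, iterations=2000):
    """Approximately minimise 0.5*||Ax - y||^2 + lam*||x||_1 with ISTA.

    atoms : list of dictionary atoms (columns of A), each a list of K features
    """
    n, k = len(atoms), len(y)
    # 1 / (squared Frobenius norm) is a safe step: it never exceeds 1 / ||A||_2^2.
    step = 1.0 / sum(v * v for atom in atoms for v in atom)
    x = [0.0] * n
    for _ in range(iterations):
        residual = [sum(x[j] * atoms[j][i] for j in range(n)) - y[i] for i in range(k)]
        gradient = [sum(atoms[j][i] * residual[i] for i in range(k)) for j in range(n)]
        x = [soft_threshold(x[j] - step * gradient[j], step * lam) for j in range(n)]
    return x

def classify(atoms, classes, y, lam=0.01):
    """Pick the class whose atoms alone reconstruct y with the smallest error."""
    x = sparse_code(atoms, y, lam)
    errors = {}
    for cls in sorted(set(classes)):
        partial = [
            sum(x[j] * atoms[j][i] for j in range(len(atoms)) if classes[j] == cls)
            for i in range(len(y))
        ]
        errors[cls] = math.sqrt(sum((y[i] - partial[i]) ** 2 for i in range(len(y))))
    return min(errors, key=errors.get), errors

# Made-up 3-feature atoms for two classes (illustrative values only).
atoms = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 1.0, 0.0], [0.0, 0.9, 0.2]]
classes = ["circular", "circular", "linear", "linear"]
label, errors = classify(atoms, classes, [0.95, 0.05, 0.0])
```

Because the decision is the class with the smallest reconstruction error, no visual inspection of the coefficients is needed, which matches the requirement that the judgment be made without a human in the loop.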

Dictionary selection
When the dictionary scale was large, it could contain rich atom structures. However, in actual application, one or two atoms might be enough to realize sparse representation, and even for defect feature vectors with complex structure, dozens of atoms might be enough. As fault images were continuously added, the redundant dictionary became overextended, and it was necessary to optimize the dictionary. PCA (Principal Component Analysis) was used here to realize the selection of atoms. This method depended on an orthogonal transformation, which transformed the original coordinate system into a new coordinate system whose axes pointed to the P orthogonal directions in which the sample points were most dispersed. Setting X to an m-dimension random variable, the weighted sum of n base vectors was used to express X. According to the orthogonality of the base vectors, both sides of Formula (12) were post-multiplied to obtain the weights. Figure 6 showed that when the threshold was set to 0.90, over 90% of the feature information of the principal components could be kept. At this time, the number of atoms decreased from 7 to 4, which reduced the dimension and at the same time improved the linear independence between different dimensions of the data.
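The threshold rule can be sketched as a cumulative contribution check on the eigenvalues of the covariance matrix. The eigenvalues below are made-up numbers chosen only to illustrate a reduction from 7 to 4 components; they are not the values behind Figure 6, and the function name is hypothetical.

```python
def components_to_keep(eigenvalues, threshold=0.90):
    """Smallest number of principal components whose cumulative
    contribution (share of total variance) reaches the threshold."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    cumulative = 0.0
    for count, value in enumerate(ordered, start=1):
        cumulative += value
        if cumulative / total >= threshold:
            return count
    return len(ordered)

# Hypothetical eigenvalues of a 7-feature covariance matrix.
example_eigenvalues = [4.2, 2.1, 1.4, 0.9, 0.5, 0.3, 0.1]
kept = components_to_keep(example_eigenvalues, threshold=0.90)
```

With these illustrative values the first four components carry about 90.5% of the total variance, so four components are kept.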

Conclusions
Aiming at two kinds of defects in welding seams (circular defects and linear defects), the method described above was used to carry out recognition and classification. The ROC curve reflected the continuous variables of sensitivity and specificity, and revealed the interrelation between sensitivity and specificity with the composition method. ROC could intuitively show the accuracy of the classification, and the black circle in the top left corner marked the cut-off point, namely, the best operating point. The AUC corresponding to the classification result of the 4-dimension special atoms was 0.67, and the AUC corresponding to the classification result of the 7-dimension atoms was 0.76. Figure 7 showed that when the optimal atoms were selected, some classification accuracy was sacrificed, but the atom structure was simplified and the operation time was reduced, which was more significant when the atoms were overly redundant.
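For readers who wish to reproduce this kind of evaluation, the sketch below builds an ROC curve from classifier scores, computes the AUC with the trapezoid rule, and picks an operating point. The labels and scores are made up, both classes are assumed to be present, and the use of Youden's index (sensitivity + specificity - 1) to choose the cut-off point is an assumption, since the selection criterion is not stated here.

```python
def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) pairs,
    one per distinct score threshold, starting from (0, 0)."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for l, s in zip(labels, scores) if s >= threshold and l == 1)
        fp = sum(1 for l, s in zip(labels, scores) if s >= threshold and l == 0)
        points.append((fp / negatives, tp / positives))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoid rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2.0
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )

def best_operating_point(points):
    """Point with the largest Youden index, i.e. tpr - fpr (assumed criterion)."""
    return max(points, key=lambda p: p[1] - p[0])

# Made-up scores: 1 = one defect class, 0 = the other.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.75, 0.3, 0.2, 0.1]
curve = roc_points(labels, scores)
area = auc(curve)
cut_off = best_operating_point(curve)
```

The operating point found this way is the one closest in spirit to the top left corner of the plot, where sensitivity is high and the false positive rate is low.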