On-machine surface defect detection using light scattering and deep learning

This paper presents an on-machine surface defect detection system using light scattering and deep learning. A supervised deep learning model is used to mine the information related to defects from light scattering patterns. A convolutional neural network is trained on a large dataset of scattering patterns that are predicted by a rigorous forward scattering model. The model is valid for any surface topography with homogeneous materials and has been verified by comparing with experimental data. Once the neural network is trained, it allows for fast, accurate, and robust defect detection. The system capability is validated on microstructured surfaces produced by ultraprecision diamond machining.


INTRODUCTION
Microstructured surfaces [1] can have a range of functionalities and have attracted much research attention from the perspective of their machining [2], measurement [3], and characterization [4]. In-process detection of defects in surface manufacturing is critical as it can ensure quality and save material, machine hours, and energy, ultimately reducing production costs.
Optical methods for areal surface topography measurement [5], such as coherence scanning interferometry, imaging confocal microscopy, and focus variation microscopy, struggle to accurately measure complex surface structures, such as sharp edges and vee-grooves, where multiple scattering effects cannot be ignored [6,7]. In addition, most optical methods for topography measurement require vertical scanning and are therefore slow. On-machine measurement of microstructured surfaces using atomic force microscopy (AFM) [8] and scanning tunneling microscopy (STM) [9] has been demonstrated, but besides the low scanning speed, AFM and STM are limited by a compromise between range and resolution [10].
Light scattering techniques such as scatterometry are used for in-process defect detection, as they can infer surface information from scattering patterns [11], with the advantages of being high speed, noncontact, and low cost. Scatterometry has been widely used for the measurement of periodic structures in semiconductor chips [12], but is usually limited to the measurement of surface features with relatively small surface height variations [11] (typically less than half the wavelength of the light source). To measure large features, one possible solution is to increase the wavelength of the light source, e.g., using infrared [13]. The commonly used scattering models in scatterometry, such as rigorous coupled wave analysis [12], can only model small periodic features, while finite element model methods [14] and finite-difference time-domain model methods [15] are computationally expensive when modeling far-field scattering.
In scatterometry, the far-field scattering pattern is interrogated to extract the surface characteristics. A library search method is commonly used, in which a library of reference scattering patterns is created for the surfaces of interest by experiment and/or simulation. The characteristics of the surface to be measured are then predicted from the measured scattering signal by searching for the closest dataset in the library [16], typically using minimum root-mean-squared error methods. However, this library search method can be slow, particularly when the library is large, because all datasets in the library need to be searched, which is a drawback for in-process implementation. To address this issue, neural networks were introduced to replace the library search methods [17,18]. However, neural networks have the drawback of high computational complexity, and in their standard formulation (where each part of the scattering spectrum is assigned to a different input node) they cannot capture any dependency between neighboring portions of the scattering spectrum [19], which is essential to the analysis of scattering signals [20].
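The library search described above can be sketched in a few lines. This is a minimal illustration rather than the implementation used in the cited work: the function name `library_search`, the toy reference patterns, and their labels are all hypothetical, and NumPy is assumed to be available.

```python
import numpy as np

def library_search(measured, library, labels):
    """Return the label and RMSE of the library entry closest to `measured`."""
    measured = np.asarray(measured, dtype=float)
    library = np.asarray(library, dtype=float)
    # RMSE between the measurement and every reference pattern (one per row).
    rmse = np.sqrt(np.mean((library - measured) ** 2, axis=1))
    best = int(np.argmin(rmse))
    return labels[best], float(rmse[best])

# Toy library: three hypothetical reference patterns sampled at five angles.
library = np.array([
    [0.0, 1.0, 0.0, 0.5, 0.0],
    [0.0, 0.2, 1.0, 0.2, 0.0],
    [1.0, 0.0, 0.0, 0.0, 1.0],
])
labels = ["nondefective", "defective", "defective"]
measured = np.array([0.05, 0.9, 0.1, 0.45, 0.0])

label, error = library_search(measured, library, labels)
print(label, round(error, 4))
```

Every query compares the measurement against all entries, so the cost grows linearly with the library size, which is the drawback noted in the text.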
In this paper, a novel light scattering and deep learning method is presented for on-machine surface defect detection. A rigorous scattering model based on a boundary element method (BEM) [21,22] that is capable of addressing multiple scattering effects is used to simulate the far-field scattering corresponding to specific types of surfaces. The simulated scattered field can be used to train the deep learning model. Once trained, the deep learning model can solve the inverse scattering problem far quicker than the slow library search method. With the help of the rigorous scattering model, large surface features can be addressed by the method without using a long-wavelength light source. The deep learning model is designed as a convolutional neural network (CNN)-based model, which can learn the scattering patterns in the training process and can extract surface defect information more efficiently and intelligently. The BEM model used was a 2D model, and one-directional surfaces such as gratings were considered. A prototype system was built, and experiments, including two types of defective surfaces fabricated by high-precision diamond machining, demonstrate the effectiveness of the proposed method.

LIGHT SCATTERING AND DEEP LEARNING METHOD
A schema of the proposed method for surface defect detection is shown in Fig. 1. A collimated laser beam illuminates the surface of the sample, and the scattered light is captured by a photodiode sensor in the far field. The measured scattering data is first filtered with a median filter to remove outliers, and the intensity of the signal is normalized to a zero to one interval. The scattering data is then fed into a trained deep learning model to detect defects. The deep learning model uses a convolutional neural network (CNN). A large number of nondefective and defective surfaces with different surface parameters are artificially generated. Their associated scattering signals are then simulated using a rigorous forward 2D BEM model [21,22], with which the multiple scattering problem in the one-directional microstructured grating surfaces can be solved and scattering signals for surfaces with features larger than the wavelength of the light source can be simulated. The scattering signals are labeled and used to train the supervised CNN. Once the CNN is trained, it can recognize any scattering signals associated with the surfaces within the range of the simulated parameters.
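The preprocessing step (median filtering followed by normalization) might look like the sketch below. The kernel size, the edge padding, and the use of min-max scaling are assumptions made for illustration; the text only states that a median filter is applied and that the intensity is normalized to the zero-to-one interval.

```python
import numpy as np

def preprocess(signal, kernel_size=5):
    """Median-filter a 1D scattering signal, then scale it to [0, 1]."""
    signal = np.asarray(signal, dtype=float)
    half = kernel_size // 2
    # Repeat the edge values so the output keeps the input length.
    padded = np.pad(signal, half, mode="edge")
    filtered = np.array([
        np.median(padded[i:i + kernel_size]) for i in range(signal.size)
    ])
    lo, hi = filtered.min(), filtered.max()
    if hi == lo:
        # A constant signal carries no pattern; return zeros.
        return np.zeros_like(filtered)
    return (filtered - lo) / (hi - lo)

# Hypothetical raw signal with a single outlier (9.0).
raw = np.array([0.1, 0.2, 9.0, 0.4, 0.8, 0.4, 0.2, 0.1])
clean = preprocess(raw, kernel_size=3)
```

A median filter is a natural choice here because it rejects isolated spikes without smearing them into neighboring samples the way a moving average would.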

A. Convolutional Neural Network Design
A CNN for defect detection using scattering patterns has been designed as shown in Fig. 2. The CNN consists of four convolutional layers, two pooling layers, and two fully connected layers. The input of the neural network is the scattering signal captured by the sensor over a range of angles. The CNN is trained using BEM simulation data. Simulation of the scattering patterns is implemented in MATLAB and parallelized in a high-performance computing (HPC) cluster (University of Nottingham's Augusta HPC, 115 compute nodes, 40 processors, and 192 GB RAM/node). As the target surfaces are one-directional gratings, simulated scattering patterns feature a large number of repeated, zero-valued data points, except for several peaks in different diffraction orders, which may cause overfitting in the CNN during training. To avoid overfitting, random noise is added to the simulated scattering signals, and dropout is added to a fully connected layer. The power of the noise was designed to be uniformly distributed between zero and 1% of the peak signal. The output of the CNN is two real numbers representing the probabilities of nondefective and defective states for the measured surface. The CNN model is implemented using TensorFlow/Keras and trained on hardware-accelerated graphics processing units (NVIDIA Quadro P5000, 2560 CUDA cores, 16 GB RAM).
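The noise augmentation described above could be sketched as follows. The exact noise model is not specified beyond its range, so this example assumes one reading: each sample receives additive noise drawn uniformly between zero and 1% of the pattern's peak value. The function name `add_random_noise` and the toy pattern are hypothetical.

```python
import numpy as np

def add_random_noise(pattern, max_fraction=0.01, rng=None):
    """Add uniform noise in [0, max_fraction * peak) to a simulated pattern."""
    rng = np.random.default_rng() if rng is None else rng
    pattern = np.asarray(pattern, dtype=float)
    peak = pattern.max()
    # Assumption: noise is additive and drawn independently for every sample.
    noise = rng.uniform(0.0, max_fraction * peak, size=pattern.shape)
    return pattern + noise

# Toy simulated pattern: zeros everywhere except a few diffraction peaks.
pattern = np.zeros(200)
pattern[[20, 60, 100, 140, 180]] = [0.2, 0.6, 1.0, 0.6, 0.2]

noisy = add_random_noise(pattern, rng=np.random.default_rng(0))
```

After augmentation the long runs of identical zero-valued samples are replaced by small varying values, which is the property the text relies on to reduce overfitting.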

B. Prototype System and Setup of the On-Machine Experiment
A prototype system was developed for on-machine experiments. A schema and photograph of the system and the on-machine experimental setup are shown in Fig. 3. A laser with a wavelength of 633 nm is plane polarized, projected to a mirror with an incidence angle of 45°, and reflected from the surface of the measured sample. A sensor module (SM) consisting of a pinhole, a focusing lens, and a photodiode is mounted on a rotational stage to capture the scattered light over a range of angles (0° to 120°). The scattering signal is first amplified and passed through an analog-to-digital converter (AD). The system is mounted on a diamond turning machine (Moore Nanotech 350FG) to perform on-machine experiments.

RESULTS AND DISCUSSION
Two different types of microstructured surfaces were designed, machined, and measured to verify the proposed method. The surfaces were manufactured on the diamond turning machine shown in Fig. 3(b). The material of the samples was aluminum.

A. Saw-Tooth Microstructured Surfaces
The design of the saw-tooth microstructured surface is shown in Fig. 4. The saw-tooth has an angle of 90° and is machined using a 90° single-crystal diamond tool. The pitch and height (peak-to-valley) of the microstructures were designed as 8 µm and 4 µm, respectively.
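Because the saw-tooth surface is periodic, its far-field pattern concentrates into discrete diffraction orders. As a rough guide to where those peaks fall, the sketch below evaluates the standard grating equation, sin θ_m = sin θ_i + mλ/d, for the 8 µm pitch and the 633 nm laser. Normal incidence and the sign convention are assumptions made for illustration; the actual incidence angle on the sample is not stated here, and the function name is hypothetical.

```python
import math

def diffraction_orders(pitch_um, wavelength_um, incidence_deg=0.0):
    """Return {order m: angle in degrees} for all propagating orders."""
    sin_i = math.sin(math.radians(incidence_deg))
    ratio = wavelength_um / pitch_um
    # Upper bound on |m|; non-propagating orders are filtered out below.
    m_max = int((1.0 + abs(sin_i)) / ratio) + 1
    orders = {}
    for m in range(-m_max, m_max + 1):
        s = sin_i + m * ratio
        if abs(s) <= 1.0:  # order propagates only if |sin(theta_m)| <= 1
            orders[m] = math.degrees(math.asin(s))
    return orders

orders = diffraction_orders(pitch_um=8.0, wavelength_um=0.633)
print(len(orders), round(orders[1], 2))
```

The grating equation only fixes the peak positions. The relative peak intensities depend on the groove profile, which is why a rigorous scattering model is needed to link the pattern to defects.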
During the machining process, the diamond tool becomes worn, and this can cause defects in the machined surface. A sharp and a worn tool were used to machine the samples. Figure 5 shows scanning electron microscope (SEM) images for a sharp tool and a worn tool. The worn tool has a tip angle equivalent to 157°. Both the sharp and worn tools were used to machine the designed saw-tooth surfaces, under the same machining conditions. The machined surfaces were labeled as nondefective and defective accordingly. Figure 6 shows the SEM and AFM results for the surface machined using the two diamond tools. The AFM result in Fig. 6(b) shows that the height of the microstructures machined by the sharp tool is approximately 2.5 µm, which is smaller than the design value of 4 µm. This may be caused by both the imperfection of the tooltip of the diamond tool and the material swelling effect [23]. The AFM result in Fig. 6(d) shows that the height of the microstructure machined using the worn tool is approximately 0.6 µm, which is significantly smaller than the design value and is smaller than the value calculated from the tooltip angle of 157° (4/tan(157°/2) ≈ 0.81 µm), which may also be due to the material swelling effect.
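The expected heights quoted above follow from simple geometry: for a symmetric V-profile that fills one pitch, the peak-to-valley height is half the pitch divided by the tangent of half the tip angle. The short check below assumes exactly that idealized profile; the function name is hypothetical.

```python
import math

def groove_height_um(pitch_um, tip_angle_deg):
    """Peak-to-valley height of a symmetric V-profile filling one pitch."""
    half_angle = math.radians(tip_angle_deg / 2.0)
    return (pitch_um / 2.0) / math.tan(half_angle)

sharp_height = groove_height_um(8.0, 90.0)   # design geometry, 90° tool
worn_height = groove_height_um(8.0, 157.0)   # worn tool, 157° tip angle
print(round(sharp_height, 2), round(worn_height, 2))
```

The measured heights (approximately 2.5 µm and 0.6 µm) fall below both idealized values, consistent with the material swelling explanation given in the text.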
Scattering signals for the designed surfaces, considering different parameters and experimental settings, were simulated. Forty-one different height values were equidistantly sampled from 2.0 to 6.0 µm for the nondefective surface, and 19 values from 0.1 to 1.9 µm and 20 values from 6.1 to 8.0 µm were sampled for the defective surfaces. To reflect the variations in the actual setup, three different values for every considered parameter and experiment setting were equidistantly sampled in the simulation, and their ranges are summarized in Table 1. Hence, 41 × 3⁵ = 9963 datasets were simulated as nondefective surfaces, and (19 + 20) × 3⁵ = 9477 datasets were simulated as defective surfaces. In total, 19,440 datasets were generated to train the deep learning model. Eighty percent of the datasets were used for training, and 20% were used for validation. After 200 epochs, the classification accuracies for the training and validation datasets reached 97.36% and 95.55%, respectively.

On-machine experiments were conducted using the prototype system. The measured scattering signals for the two surfaces are shown in Fig. 7. As the scattering signals were found to contain outliers, both were filtered using a median filter, which removed the outliers while preserving the shape of the scattering pattern, and the peak intensities were normalized to 1. After feeding the scattering signals into the trained deep learning model, the defect detection results were obtained as summarized in Table 2. The results show that both nondefective and defective surfaces were correctly detected, with high probabilities. The computational time for prediction was 3 ms in both cases, which is fast enough for real-time implementation.
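The dataset sizes quoted for the saw-tooth simulations can be reproduced with a few lines of arithmetic. The 0.1 µm step is inferred from the stated ranges and counts, and the helper name `n_values` is hypothetical.

```python
def n_values(start, stop, step=0.1):
    """Number of equidistant samples from start to stop, both inclusive."""
    return round((stop - start) / step) + 1

# Five parameters and settings, three values each (see Table 1).
n_settings = 3 ** 5

n_nondefective = n_values(2.0, 6.0) * n_settings
n_defective = (n_values(0.1, 1.9) + n_values(6.1, 8.0)) * n_settings
n_total = n_nondefective + n_defective

# 80/20 split between training and validation.
n_train = n_total * 80 // 100
n_validation = n_total - n_train
print(n_nondefective, n_defective, n_total, n_train, n_validation)
```

The two classes are close to balanced (9963 versus 9477), so plain classification accuracy is a reasonable summary metric for this dataset.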

B. Vee-Groove Microstructured Surfaces
The design of the vee-groove microstructure is shown in Fig. 8. The vee-grooves have an angle of 25.5° and were machined using a diamond tool with a tooltip angle of 25.5°. The pitch and height of the grooves were designed as 6 µm and 4 µm, respectively, and there were flat regions between the grooves. Two samples were machined: one using a sharp tool and one using a worn tool. SEM images of the tools are shown in Fig. 9. Figure 9(b) shows that the worn tooltip was skewed and its angle was about 44°. Under the same machining settings, the vee-grooves machined by the worn tool are expected to have a larger width at the top compared to those machined by the sharp tool. The depth of the grooves can be maintained at 4 µm due to the flat regions.
Both samples were machined with the same machining settings. Figure 10 shows the SEM images of the machined surfaces. The results show that, as expected, the grooves machined with the sharp tool are narrower than those machined with the worn tool.
For the BEM simulation, 22 different tooltip angles for the vee-grooves were equidistantly sampled from 20.  Table 3. Hence, 22 × 3⁶ = 16,038 datasets were simulated as nondefective surfaces, and (10 + 28) × 3⁶ = 27,702 datasets were simulated as defective surfaces. In total, 43,740 datasets were generated to train the deep learning model. Eighty percent of the datasets were used for training, and 20% were used for validation. After 200 epochs, the model reached a training accuracy of 89.44% and validation accuracy of 99.46%.
After the deep learning model was trained, on-machine experiments were conducted. The measured scattering signals are shown in Fig. 11, and the defect detection results are summarized in Table 4.

C. Comparison with Other Methods

Comparison with Support Vector Machine
Compared to conventional methods such as a support vector machine (SVM) [24], the proposed CNN-based method has some advantages: (1) The CNN-based method is more robust. The noise in the scattering pattern affects the resolution of the maximizing margin problem in the SVM, increasing the likelihood of wrong predictions. On the contrary, the CNN is intrinsically designed to extract the relevant features from the scattering pattern and thus is ultimately more robust to noise. (2) The output of the CNN-based method provides results as probabilities, while the SVM method only outputs the hard classification result (i.e., a true/false statement on the recognized class). Probabilities indicate how close the observation is to each class and thus provide a more detailed depiction of the classification result.
To support the above claims, additional experiments were conducted using the same datasets but trained with the SVM method. Tables 5 and 6 show the defect detection results for the saw-tooth and vee-groove microstructured surfaces using the SVM. The classification results for the saw-tooth microstructured surfaces shown in Table 5 are consistent with our CNN-based method, but no probability information is available to assess how close to each class the observations were found to be. This lack of information makes it more difficult to investigate the performance of the classifier and does not allow assessment of the conditions where the machine learning classifier may be performing less robustly. Concerning the results for the vee-groove microstructured surfaces shown in Table 6, the surface machined with the sharp tool was incorrectly detected as defective by the SVM. This may be due to the high complexity of the training dataset and small imperfections of the machined workpiece. The same difficulty is reflected in the results of the CNN-based method shown in Table 4, where the probability for correct detection is 0.916, which is relatively low compared to the other results but still corresponds to a correct detection with high probability. The results show that the CNN-based method is more robust than the SVM method. It is interesting to note that, with different datasets, the trained SVM models have different complexities (fitting to different datasets), which result in different prediction times, i.e., 2 ms for the saw-tooth microstructured surfaces (19,440 training datasets) and 19 ms for the vee-groove microstructured surfaces (43,740 datasets).
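The difference between a probabilistic output and a hard classification can be illustrated with a small sketch. It assumes that the two CNN outputs are produced by a softmax over two raw scores, which is a common choice but is not stated in the text; the scores used here are hypothetical and are not taken from the experiments.

```python
import math

def softmax(scores):
    """Convert raw scores into probabilities that sum to one."""
    shift = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

classes = ["nondefective", "defective"]
scores = [1.5, -0.5]  # hypothetical raw network outputs

probs = softmax(scores)
# A hard classifier would report only this label.
hard_label = classes[probs.index(max(probs))]
print(hard_label, [round(p, 3) for p in probs])
```

Two measurements can share the same hard label while having very different probabilities, so the probabilistic output flags borderline cases that a label alone would hide.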

Comparison with Library Search Method
The performance of the proposed CNN-based method was also compared with that of the conventional library search method. The results are shown in Table 7 and Table 8. It should be noted that the library search method was implemented in Python running on a single CPU core; the performance may be improved using parallel computing, but in all likelihood, it would still not be comparable to the CNN-based method. Table 8 shows that the library search method incorrectly detected the workpiece machined with the sharp tool as a defective surface, which is the same error made by the SVM method in the previous section. The classification error may be caused by a large local deviation of the scattering signal, influencing the library search method, which is based on minimizing the root-mean-square error. The results demonstrate that the CNN-based method is more robust than the library search method.

D. Discussion
Both experiments for the saw-tooth and vee-groove microstructured surfaces demonstrate the effectiveness of the proposed scattering and machine learning method for defect detection. As a demonstration, defects were caused by worn tools, reflecting scenarios in actual manufacturing processes. Both types of microstructures were difficult to accurately measure by conventional optical or stylus instruments.
The proposed method successfully demonstrates that the defect characteristics of the microstructured surfaces can be determined at far-field distances and on machine, which is critical in the manufacturing environment. The feature sizes of both microstructured surfaces were of the order of several micrometers, which is significantly larger than the wavelength of the light source (633 nm). The results show that the proposed method can overcome the limitations of traditional scatterometry. The method can potentially be extended to a wider range of applications for very rough surfaces, such as additively manufactured surfaces with a 3D BEM model [25], which will be investigated in the near future. Although several hours are needed to simulate the datasets and train the deep learning model, once the model is trained, the prediction is fast (several milliseconds using Python on a modern desktop PC) and suitable for on-machine implementation. Since a large dataset is necessary for the training of the deep learning model, using simulation data as opposed to experimental data significantly saves time and experimental resources. Combining light scattering with CNN-based deep learning benefits from automatic feature learning and extraction, which makes the proposed method capable of learning the complex patterns from the scattering signals. It should be noted that as a demonstration for the proposed method, the BEM model used in this work is a 2D model, and this paper focuses on defect detection for single-directional structured surfaces. The diameter of the laser beam used in the experiment was about 1 mm, which is much larger than the microstructured features in the experiment (several micrometers). Hence, each scattering pattern is representative of the aggregated contributions of multiple features over the illuminated area. One difference between the simulated and real topography should be noted. While the simulated surfaces only contained the main features specific to each surface class definition, the real surfaces may have contained additional features in the form of localized defects, imperfections, debris, and other singularities. These deviations may be the reason why classification performance was slightly degraded (less than 10%) in some cases. Further investigation of the effects of such additional features acting as disturbance factors is needed. The comparison experiments with SVM and the library search method also demonstrated the advantages of the proposed CNN-based method in terms of accuracy and performance.
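The scale argument in the discussion above can be made concrete with a short calculation. It uses only values quoted in the paper (a beam diameter of about 1 mm, pitches of 8 µm and 6 µm, a design height of 4 µm, and a wavelength of 633 nm); the variable names are hypothetical, and the beam is treated as illuminating a strip exactly one diameter wide, which is a simplification.

```python
beam_diameter_um = 1000.0   # about 1 mm
wavelength_um = 0.633       # 633 nm laser
design_height_um = 4.0      # design height of both structures
pitches_um = {"saw-tooth": 8.0, "vee-groove": 6.0}

# Approximate number of periods covered by the beam for each surface.
periods = {name: beam_diameter_um / pitch for name, pitch in pitches_um.items()}

# Design height expressed in wavelengths of the light source.
height_in_wavelengths = design_height_um / wavelength_um

for name, count in periods.items():
    print(name, round(count))
print(round(height_in_wavelengths, 1))
```

With more than a hundred periods under the beam, a single localized defect contributes little to the pattern, whereas a tool-wear defect repeated in every period changes it systematically.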

CONCLUSION
A method for on-machine surface defect detection using light scattering and deep learning has been proposed. A deep convolutional neural network was trained using a large scattering dataset simulated by a rigorous forward scattering model. With the trained neural network, defect information for surfaces within the range of simulated parameters can be predicted using the scattering signal. The proposed technique is promising for on-machine defect detection of surfaces with high speed and robustness.