Real-time fabric defect detection based on multi-scale convolutional neural network

Fabric defect detection plays an important role in quality control in the textile manufacturing industry. This study introduces a fabric defect detection method based on a multi-scale convolutional neural network (MSCNN) to improve both accuracy and time efficiency. For detection accuracy, the MSCNN is constructed to obtain feature maps at different scales, which enhances the representation of tiny-scale fabric defects. For time efficiency, a faster defect-locating method is designed using pre-known size information obtained by clustering analysis, which reduces computation time. Experiments show that the accuracy of the MSCNN for each defect type exceeds 92% and that the frame rate exceeds 29 frames per second (FPS). Further analysis demonstrates that the proposed MSCNN can accurately detect fabric defects at a tiny scale and that the detection speed can reach 30 m/min, satisfying industrial requirements.


Introduction
Fabric defect detection is significant for textile quality control [1]. Traditionally, defects are examined visually by skilled workers, as shown in Fig. 1. Manual detection has two disadvantages: a high error rate caused by human fatigue, and low speed, since even skilled workers can inspect only 15-20 m/min. It is therefore necessary to develop an automatic detection method to enhance production efficiency [2].
In recent years, with the rapid development of digital image processing and machine vision, many automatic detection methods have been proposed to replace manual inspection. Machine-vision fabric defect detection methods are mainly divided into two groups: motif-based and non-motif-based. Motif-based methods require a defect-free ground truth against which motifs are compared; however, there are many fabrics with complicated background features, and even the same type of fabric varies with the changing workshop environment, such as noise and illumination, so it is hard for motif-based methods to obtain a robust defect-free ground truth [3]. Therefore, most work has focused on the non-motif-based group, which can be further subdivided into five categories: statistical, spectral, model-based, learning-based, and structural [4], as shown in Fig. 2. Statistical methods [5][6][7] distinguish defective and defect-free regions by analysing differences in their statistical features, similarity, and regularity; they are not effective for low-contrast and patterned fabrics. Spectral methods [8,9], widely used in fabric defect studies, highlight the differences between defective and defect-free regions in the frequency domain and can effectively detect fabrics with simple backgrounds. Typical spectral methods include the Fourier transform, wavelet transform [10], and Gabor transform [11]. Although widely studied, these methods cannot effectively detect patterned fabrics, and their high computational demand makes real-time requirements difficult to meet. Model-based and structural methods [12] are comparatively rare, possibly because they are not robust enough and are highly data-dependent. In this paper, we concentrate on learning-based methods.
With the rapid development of artificial intelligence, deep learning has achieved impressive performance in image classification and detection, and numerous learning-based methods [13][14][15][16][17][18] have been proposed for fabric defect detection because of their strong ability to extract image features. Jing et al. [19] proposed a learning-based method that uses a deep convolutional neural network (CNN) to extract fabric features and the mean-shift algorithm to segment defects, realising the detection of yarn-dyed fabric. However, this method needs two steps and cannot perform end-to-end detection. Wei et al. [20] constructed a faster region-CNN (R-CNN) to classify and locate defect regions, achieving high-accuracy end-to-end detection of fabrics with complicated texture and small defects; however, this method requires substantial computational capacity and is therefore too time-consuming to meet the real-time requirements of fabric production. Liu et al. [21] proposed a single-shot multibox detector (SSD) method in which the detector is attached directly to the end of the CNN, improving detection speed; however, this method fails to locate some tiny defects accurately. Although deep-learning-based detection algorithms outperform other methods in accuracy and robustness, two main challenges remain in the defect detection task [22][23][24][25][26].
First, defects at tiny scales are difficult to detect, and existing fabric defect detection methods do not design networks for this characteristic. As shown in Fig. 3, hole and stain defects, the common types, have regular shapes and scales and are relatively easy to detect. However, end-out, double-flat, and broken-course defects are tiny and irregular in shape: they are long and narrow, only a few pixels wide, accounting for less than 1/100 of the whole image's pixels in one direction. Compared with regularly shaped defects, it is harder to extract effective features for these irregular, tiny-scale defects from the inspected fabric images, which increases the complexity of detection.
Second, fabric production imposes a strict real-time requirement on defect detection, as the fabric typically moves at more than 15 m/min. Existing deep-learning detection methods do not exploit the similarity in shape and size among defects of the same type when designing the detector, which makes detection time-consuming; they can hardly satisfy the real-time demands of fabric inspection, would reduce productivity, and therefore cannot be applied in practice.
In this paper, a deep-learning-based detection method, the multi-scale CNN (MSCNN), is proposed to detect five types of defects in linen and patterned fabrics. Compared with other detection methods, there are two main contributions: (i) To address the low accuracy of existing methods on tiny-scale defects, a feature-pyramid approach is applied in the neural network to enhance the feature representation and detection accuracy for tiny-scale fabric defects.
(ii) For real-time detection, we exploit the similarity among defects of the same type: K-means clustering analysis is used to obtain defect bounding boxes with pre-known size information, which replace existing time-consuming locating methods, reducing the computation required for localisation and improving detection speed.
The rest of this paper is organised as follows: Section 2 presents the MSCNN framework and the model training process. Section 3 reports the experiments and discussion. Section 4 shows the conclusion and raises several future issues.

Framework of MSCNN model
To address the low accuracy of existing fabric defect detection methods on tiny-scale defects, we take advantage of the strong ability of pyramid features [27] to extract multi-scale features, apply them in the neural network, and propose the MSCNN to improve the detection accuracy for tiny-scale defects. The MSCNN is constructed using feature fusion based on the VGG16 network [28].
To satisfy the real-time requirement of fabric inspection, we exploit the fact that defects of the same type usually have similar shapes and sizes: a clustering algorithm obtains the common sizes of all defect types, and these pre-known bounding boxes replace the region generation method in defect localisation, reducing the computation required in the localisation process and improving detection speed. The framework of the proposed MSCNN is shown in Fig. 4.

Fabric image feature extraction:
In the detection process, fabric images captured by a line-scan charge-coupled device (CCD) camera are used as input and standardised as the matrix X = (x_{i,j}). The CNN extracts features from the image matrix and produces feature maps of 13*13 and 26*26 pixels, respectively. These feature maps are input to the detection module, whose output is the location and classification information of the fabric defects.
At the front end of the CNN, the features of the fabric images are extracted by convolution and pooling. Small 3*3 convolution kernels with stride 1 are used in the convolutional layers to enhance the extraction of local features of the fabric images; the convolution is computed as

z_{u,v} = sum_{i=1}^{3} sum_{j=1}^{3} w_{i,j} x_{u+i-1, v+j-1} + b

where x_{i,j} is the value of the pixel matrix of the fabric image, w_{i,j} are the kernel weights, and b is the bias.
The tanh activation function is applied to the convolution feature z_{u,v} to introduce non-linearity and enhance the feature extraction ability of the CNN:

tanh(z) = (e^z - e^{-z}) / (e^z + e^{-z})

The convolutional layer then outputs the feature matrix H, which is taken as the input of the pooling layer. The pooling layer applies 2*2 max-pooling to down-sample the feature matrix; while extracting the key features of the fabric images, this reduces the parameters of the CNN and so improves its computation speed. After multiple convolution and pooling operations, feature maps of 13*13 and 26*26 pixels are extracted. At the back end of the CNN, the feature maps of different resolutions are used as two independent detection branches. The 26*26 feature maps have higher resolution and are used to identify tiny defects in fabric images, such as hole, end-out, and broken-course defects; the 13*13 feature maps are used to identify more obvious defects, such as stains. At the same time, the 13*13 feature maps are up-sampled by a factor of two to obtain new 26*26 feature maps, which are fused with the original 26*26 feature maps to enhance their feature extraction ability. By constructing the MSCNN to extract tiny-scale defect features in this way, the detection accuracy for tiny defects is improved. The parameters of the CNN are shown in Table 1.
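The fusion step above can be sketched as follows. This is an illustration rather than the authors' code; the channel counts and helper names are our own assumptions. The 13*13 branch is up-sampled by a factor of two and concatenated with the 26*26 branch along the channel axis:

```python
import numpy as np

def upsample2x(fmap):
    """Nearest-neighbour 2x up-sampling of a (channels, H, W) feature map."""
    return fmap.repeat(2, axis=1).repeat(2, axis=2)

def fuse(coarse_13, fine_26):
    """Up-sample the coarse 13*13 branch to 26*26 and concatenate it
    with the fine 26*26 branch along the channel axis."""
    return np.concatenate([upsample2x(coarse_13), fine_26], axis=0)

coarse = np.zeros((256, 13, 13))  # deep, low-resolution branch (channel count illustrative)
fine = np.zeros((128, 26, 26))    # shallow, high-resolution branch
fused = fuse(coarse, fine)        # fused map of shape (384, 26, 26)
```

Concatenation preserves both the coarse semantic features and the fine spatial detail, which is what lets the 26*26 branch represent tiny defects more richly.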

Defects location and classification:
In the fabric production process, the two main factors determining the shape and size of defects are the fabric type and the production equipment. Although defect shape varies from one fabric to another, there is no significant difference in shape and size between defects of the same type. To reduce the computation of localisation and improve detection speed, the locating method is designed with pre-known size information obtained by clustering analysis [29,30], reducing the computation time of defect region generation, as shown in Fig. 5.
The defect bounding boxes with pre-known size information are obtained by the K-means clustering algorithm, whose objective is

J = sum_{k=1}^{K} sum_{x_i in C_k} || x_i - u_k ||^2

where x_i are the sizes of the labelled defect bounding boxes, u_k are the cluster centres, i.e. the pre-known bounding box sizes we want to obtain, and k is the number of bounding box types; the u_k are randomly assigned initial values. In this study, we set k = 4, corresponding to four types of defects. By iteratively minimising J, the defect bounding boxes x_i are divided into four categories according to their size and aspect ratio.
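A minimal sketch of this clustering step (an illustration under our own assumptions, not the authors' implementation; the function name is ours) runs standard K-means on the (width, height) pairs of the labelled defect boxes:

```python
import numpy as np

def kmeans_boxes(sizes, k=4, iters=50, seed=0):
    """Cluster (width, height) pairs of labelled defect boxes into k
    prototype sizes u_k by iteratively minimising the objective J."""
    sizes = np.asarray(sizes, dtype=float)
    rng = np.random.default_rng(seed)
    # random initialisation of the centres from the data points
    centres = sizes[rng.choice(len(sizes), size=k, replace=False)]
    for _ in range(iters):
        # assignment step: nearest centre for every box size
        dists = np.linalg.norm(sizes[:, None, :] - centres[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # update step: each centre moves to the mean of its members
        for j in range(k):
            if np.any(labels == j):
                centres[j] = sizes[labels == j].mean(axis=0)
    return centres  # the k pre-known bounding-box sizes
```

In practice (as in the paper), a distance based on IOU rather than the Euclidean norm can be preferable for box sizes, since it is scale-aware; the Euclidean version above keeps the sketch simple.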
After obtaining the categories of defects, the intersection over union (IOU) between each type of pre-known-size bounding box and the corresponding defect bounding box is calculated. The IOU represents the overlap ratio: the higher the IOU, the higher the degree of overlap; when the IOU equals one, the pre-known-size bounding box and the corresponding defect bounding box coincide completely. The IOU is computed as

IOU = area(Box(w_box, h_box) ∩ Box_truth) / area(Box(w_box, h_box) ∪ Box_truth)   (6)

where Box(w_box, h_box) denotes the defect bounding box with pre-known size (w_box, h_box) in width and height.
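When two boxes are compared purely by their pre-known sizes with centres aligned, as in the clustering step above, the IOU reduces to a ratio of areas. A small sketch, with a function name of our own:

```python
def size_iou(w1, h1, w2, h2):
    """IOU of two axis-aligned boxes that share the same centre,
    so the intersection area is simply min(w)*min(h)."""
    inter = min(w1, w2) * min(h1, h2)
    union = w1 * h1 + w2 * h2 - inter
    return inter / union
```

For example, two identical sizes give an IOU of exactly 1.0, matching the "completely coincide" case described above.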
Let D be the optimisation parameters of the IOU. The gradient descent algorithm is used to optimise D; after many iterations, the maximum size parameters of the pre-known-size bounding boxes for each defect type are obtained. The gradient descent update is

D ← D − η ∂(IOU)/∂D

where η is the learning rate. After obtaining the size parameters of the pre-known-size bounding boxes, the feature map of the last convolution is input to the detection stage. Although CNNs can extract fabric image features efficiently, it is hard for them to locate fabric defects; to locate the defects, the pre-known bounding boxes must be matched to the feature map. Firstly, the scaling ratios p_w and p_h are obtained from the column and row ratios between the feature map and the image containing the defect bounding boxes. Secondly, the size parameters (w_box_k, h_box_k) are scaled by these ratios. Thirdly, the feature map is divided into N*N cells according to its columns and rows, and the centre of each cell gives the anchor coordinates (t_x, t_y). Each anchor coordinate is matched with the corresponding four types of pre-known bounding boxes, generating only a total of N*N*4 defect bounding boxes. Finally, the classification probability P(i) of the defects and the confidence of the bounding boxes are computed from the values h_i of the convolution kernels of the last convolutional layer. A confidence threshold is set according to the detection requirements, and threshold segmentation removes the bounding boxes with confidence lower than the threshold, giving the final detection output

Y = { boxes with confidence ≥ T }

where T is the threshold and Y is the set of defect bounding boxes in the final detection results.
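The matching and filtering steps above can be sketched as follows (an illustrative simplification with hypothetical names, not the paper's code): each cell centre of the N*N grid is paired with the four pre-known box sizes, giving exactly N*N*4 candidate boxes, which are then filtered by the confidence threshold T:

```python
def make_anchors(N, box_sizes):
    """Attach every pre-known (w, h) size to the centre (t_x, t_y)
    of each cell of an N x N feature-map grid."""
    anchors = []
    for row in range(N):
        for col in range(N):
            tx, ty = col + 0.5, row + 0.5  # cell-centre coordinates
            for w, h in box_sizes:
                anchors.append((tx, ty, w, h))
    return anchors

def threshold_filter(boxes, confidences, T=0.4):
    """Keep only the boxes whose confidence reaches the threshold T."""
    return [b for b, c in zip(boxes, confidences) if c >= T]
```

Because the candidate set is fixed at N*N*4 boxes of pre-known sizes, no per-image region generation is needed, which is the source of the speed gain claimed above.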

Model training
Image brightness changes and image noise caused by lighting variation and floating dust often appear during fabric image acquisition and can reduce detection accuracy. Illumination normalisation [31] and noise corruption are applied before model training to simulate fabric images in different workshop environments, enhancing the robustness of the detection model to environmental changes. Model training takes the fabric images and the pre-known-size defect bounding boxes as input and improves detection accuracy by minimising the loss function. The training procedure is as follows: Input: dataset X, defect bounding boxes with pre-known sizes u_k = (w_box_k, h_box_k), and the number of epochs.

Process
Step 1: Randomly initialise the network parameters W^(l), b^(l) and the anchor centre coordinates (x_i, y_i).
Step 2: Iteratively update the parameters. For i = 1 to m: calculate the loss; calculate the partial derivatives ∇W^(l) = ∂Loss/∂W^(l), ∇b^(l) = ∂Loss/∂b^(l), ∇x_i = ∂Loss/∂x_i, ∇y_i = ∂Loss/∂y_i; then update the parameters accordingly. The loss is

Loss = XY_loss + WH_loss + Class_loss + Confidence_loss   (19)

where N*N represents the number of cells in the feature map and B represents the number of defect bounding boxes generated in each feature map, over which the loss terms are summed.
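Eq. (19) sums four squared-error style terms. A minimal sketch of this composite loss, treating the per-term arrays as already gathered (a simplification of the full per-cell, per-box summation; the dict-based interface is our own assumption):

```python
import numpy as np

def total_loss(pred, target):
    """Composite training loss of Eq. (19): localisation (xy and wh),
    classification and confidence terms, each a sum of squared errors."""
    xy_loss = np.sum((pred["xy"] - target["xy"]) ** 2)
    wh_loss = np.sum((pred["wh"] - target["wh"]) ** 2)
    class_loss = np.sum((pred["cls"] - target["cls"]) ** 2)
    conf_loss = np.sum((pred["conf"] - target["conf"]) ** 2)
    return xy_loss + wh_loss + class_loss + conf_loss
```

The gradient of this scalar with respect to the network parameters is what Step 2 back-propagates.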

Datasets and evaluation criteria
In this section, typical fabric defects from production are selected as experimental samples, and automatic fabric defect detection equipment is built to collect fabric images for evaluating the detection algorithm. The experimental platform is shown in Fig. 6. The composition of the fabric defect datasets is shown in Table 2; they contain two types of fabrics and five types of defects, with example images shown in Fig. 3. In Figs. 3a-d, the linen fabric defects are classified as stain and broken course, stain, hole, and broken course; in Figs. 3e-h, the patterned fabric defects are classified as hole, end out, stain, and double flat. All images were captured by the experimental platform. The evaluation criteria include two aspects: mean average precision (MAP) and frames per second (FPS). FPS is the number of images that can be detected within one second and is the indicator of detection speed:

FPS = frameCount / elapsedTime

where frameCount is the number of detected images and elapsedTime is the time taken to detect them. MAP is the indicator of detection accuracy, representing the mean integral area of the precision-recall curve over all defect types:

MAP = (1/C) sum_{c=1}^{C} ∫ P_c(R) dR

where C is the number of defect types and P_c(R) is the precision-recall curve of type c.
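Both criteria can be computed directly. A brief sketch (function names are ours), with average precision approximated by trapezoidal integration of the precision-recall curve:

```python
import numpy as np

def fps(frame_count, elapsed_time):
    """Frames per second: images detected per unit of time."""
    return frame_count / elapsed_time

def average_precision(recalls, precisions):
    """Area under the precision-recall curve (trapezoidal rule)."""
    return float(np.trapz(precisions, recalls))

def mean_average_precision(per_class_ap):
    """MAP: the mean of the per-class average precisions."""
    return sum(per_class_ap) / len(per_class_ap)
```

Practical MAP implementations usually also interpolate the precision envelope before integrating; the raw trapezoidal area above conveys the idea.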

Evaluation of defect detection performance
To verify the performance of the proposed MSCNN model more fully, in this section we compare its detection results with those of other deep-learning detection algorithms: fast R-CNN [32], faster R-CNN [33], Yolo [34], Yolov3 [35], and SSD [36]. Fast R-CNN, faster R-CNN, and SSD are based on VGG16 and Yolo on DarkNet17, giving CNNs of similar depth to the MSCNN, whereas Yolov3 is based on the 53-layer DarkNet53. In terms of locating method, faster R-CNN uses a region proposal network (RPN) to locate defects, while fast R-CNN relies on externally generated region proposals; SSD, Yolo, and Yolov3 use randomly generated bounding boxes without pre-known defect sizes to locate the defects.
To ensure a fair experiment, all experiments were carried out in the same environment: the programs were written in Python 3.6, a GTX1060 GPU was used to train and test the detection models, 1000 iterations were carried out in the whole training process, the learning rate was 10^-4, the batch size was 16, and the bounding box confidence threshold was 0.4. The training and testing sets were split 8:2 before training, and all algorithms were evaluated on the same data. The fabric defect detection results of the different methods are shown in Table 3.
From these experimental results it can be seen that: (i) compared with the SSD algorithm, the MSCNN achieves higher detection accuracy by constructing a multi-scale neural network, and its accuracy is also higher than that of fast R-CNN and faster R-CNN at the same network depth; (ii) the proposed MSCNN has almost the same MAP as Yolov3, even though Yolov3 uses a 53-layer CNN, far deeper than the 16-layer MSCNN; (iii) in terms of detection speed, by using pre-known-size bounding boxes to locate the defects, the proposed MSCNN is the fastest, outperforming fast R-CNN and faster R-CNN with their region generation methods.
On the whole, the fabric defect detection performance of the MSCNN is better than that of the other algorithms. The MAP and FPS results of the above detection algorithms on the two types of fabrics are shown in Fig. 7.
To quantify the effect of the pre-known-size bounding boxes on detection, a contrast experiment was performed using randomly generated bounding boxes to locate the defects. The number of bounding boxes was set to 4, 8, and 12, giving variants named MSCNN-4, MSCNN-8, and MSCNN-12, respectively, all with the same network structure as the MSCNN; the results are shown in Table 4. A similar conclusion can be drawn: the MSCNN achieves better detection speed and accuracy by using pre-known-size bounding boxes to locate the defects. Table 5 shows the detection accuracy of the proposed MSCNN for the various defects on the two types of fabrics. The results show that the detection accuracy for each kind of tiny defect is higher than 92%, and that the algorithm suppresses false and missed detections well.
To verify the influence of feature fusion on the detection accuracy for tiny-scale fabric defects, we ran the MSCNN without the 13*13 feature map, i.e. the network extracted only 26*26 feature maps for detection. The resulting detection accuracies are shown in Table 6. For the tiny-scale defects, such as end out, broken course, and double flat, there is an obvious reduction in accuracy compared with Table 5, while the accuracy for stain and hole defects remains higher than 90%. It can be concluded that the detection accuracy for tiny defects is improved by using feature fusion to construct the MSCNN.

Verification of fabric defect detection
The fabric defect images were input into the detection model to obtain detection results with classification and location, as shown in Figs. 8 and 9. The detection results include three kinds of information: defect type, defect bounding box, and confidence; bounding boxes of different colours correspond to different types of defects. The results show that even fabrics with complicated texture backgrounds and overlapping defect-feature regions can be detected precisely.
The FPS of the detection algorithm is 29.1 (on average, 29.1 images can be detected per second), so the detection time for each image is 30-40 ms. In actual defect detection, the line-scan CCD camera has a resolution of 4096*2 pixels and captures a 4096*512-pixel fabric image, which is divided into eight 512*512 images input to the detection model. The fabric width is 1.2 m, and each detection of eight images corresponds to 0.15 m of fabric length. The detection speed can therefore reach about 30 m/min, which is faster than manual detection.
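This throughput figure can be checked arithmetically (a sketch with our own function name; the eight-tiles-per-0.15-m geometry is taken from the text above):

```python
def fabric_speed_m_per_min(fps, tiles_per_capture=8, metres_per_capture=0.15):
    """Fabric speed sustainable by the detector: each camera capture
    yields `tiles_per_capture` 512*512 tiles covering
    `metres_per_capture` metres of fabric length."""
    captures_per_second = fps / tiles_per_capture
    return captures_per_second * metres_per_capture * 60.0
```

At 29.1 FPS this gives roughly 32.7 m/min, consistent with the "about 30 m/min" figure quoted above.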

Conclusion
This study proposed an MSCNN for fabric defect detection that improves both the detection accuracy for tiny-scale defects and the detection speed. Compared with traditional detection methods, the MSCNN extracts defect feature maps at different resolutions, which enhances the representation of tiny-scale fabric defects. Furthermore, this paper designed a faster locating method that uses K-means clustering analysis to obtain defect bounding boxes with pre-known size information, replacing the region generation method in localisation and improving detection speed. The experimental results demonstrated that the proposed MSCNN achieves high detection accuracy for five common types of defects in linen and patterned fabrics: the accuracy for each type of tiny-scale defect exceeds 92%, and the detection speed reaches 30 m/min, satisfying the speed requirements of defect detection in production.
However, the proposed method still has two limitations. Firstly, it is data-driven and needs plenty of labelled fabric images to train the model. Secondly, owing to the lack of fabric images, fabrics with more complicated textures were not tested in this paper.
In future work, few-shot learning will be applied to the MSCNN to lower the demand for fabric images, using less data to train a useful detection model. In addition, more complicated fabrics and different CNN structures (such as DenseNet and ResNet) will be investigated to extend the types of fabrics that can be detected.