Convolutional neural network-based multi-label classification of PCB defects

Abstract: Due to the rapid development of printed circuit board (PCB) design technology, inspection of PCB surface defects has become an increasingly critical issue. Classifying PCB defects helps to identify their root causes. As PCB defects may be densely distributed, practical PCB defect classification should not be treated as a binary or multi-category problem but as a multi-label classification problem. Recently, the convolutional neural network (CNN), one of the deep learning frameworks, has achieved major breakthroughs in many areas of image processing, especially in image classification. This study proposes a multi-task CNN model that handles the multi-label learning problem by defining the learning of each label as a binary classification task. The multi-label learning is transformed into multiple binary classification tasks by customising the loss function. Extensive experiments demonstrate that the proposed method achieves high accuracy on the defect dataset.


Introduction
With the rapid development of the electronics industry, printed circuit board (PCB) designs are continually being revised, which poses a huge challenge for PCB quality assurance. Traditionally, the inspection and classification of PCB defects are performed manually, so the detection and classification accuracy depends on unreliable manual inspection. Such an approach is unrealistic and difficult to implement for the mass production of PCBs, and traditional manual visual inspection cannot meet the requirements of PCB development.
In contrast, using computers for PCB defect detection and classification is efficient and accurate. However, current methods place requirements on illumination, which results in high hardware cost. Therefore, research on PCB defect classification has practical significance for industrial production.

Related works
Many approaches have been proposed for PCB defect classification.
Wu et al. [1] proposed a method in which each detected defect is classified into one of seven defect types by three indices: the type of object detected, the difference in object numbers, and the difference in background numbers between the inspected image and the template.
Mauro and Roberto [2] introduced a technique for PCB inspection based on comparing the connected table (a list of connected PCB holes) of the reference image with that of the test image. The method used connected component analysis, a natural way to extract the connectivity information of the conductors of a PCB. Implementing this technique requires standard morphological image processing techniques. The technique is impractical for manufacturing because of its long processing time, around 5 s. Moreover, many types of defects cannot be detected, such as mousebite, under etching, over etching and variations between the printed lines.
Chen and Liu [3] proposed the concept of a connected component influencing region, which was used to extract, from the measured image, the feature pattern corresponding to the feature pattern of the reference image. By comparing the features and setting a threshold on the similarity, it is easy to determine whether a PCB has defects.
In Putera et al.'s study [4], defects were classified into seven categories. Each category contained a minimum of one and a maximum of four different types of defects. Building on this, Putera et al. [5] reorganised the two existing categories with two defects into four new groups, so that each group contained only one type of defect.
Similarly, Chen et al. [6] and Nakagawa et al. [7] proposed using support vector machines (SVMs) to classify PCB defects. Chen et al. classified PCB defects into four types: open circuit, short circuit, pinhole, and over-etched and under-etched types. They used an SVM with RGB colour information as categorical feature vectors for PCB classification. Nakagawa et al. proposed a referential method which classifies the defects into three defined classes. The approach separated true and pseudo defects by adding features to decrease the error rate, and consisted of two steps. First, the features were extracted from the defect candidate region after extracting the difference between the test image and the reference image. Second, the selected features were learned with multiple SVMs and assigned to a class.
Chauhan and Bhardwaj [8] compared a standard PCB image with the inspected one using a simple subtraction algorithm that can detect the defective regions. With this method, three types of defects can be detected: over-etchings (opens), under-etchings (short circuits) and holes.
Ren et al. [9] proposed a referential method which utilised the edge grey gradient of the PCB image to classify defects into five defined classes. In [10], all 14 types of defects were detected and classified into all possible classes using the referential inspection approach. The proposed algorithm is divided into five main stages: image registration, pre-processing, image segmentation, defect detection, and defect classification. The algorithm was able to perform inspection even when the captured test image was rotated, scaled and translated with respect to the template image. The experiment showed that the method can effectively classify all 14 kinds of PCB defects.
However, the limitation of these methods is that they can only classify one kind of defect for each image or group the defects, and cannot deal with multiple defects existing in a small area.

Our work
Most PCB classification methods are based on reference comparison and require complex image preprocessing. Deep networks can overcome these shortcomings since they form a self-learning framework which has achieved record-breaking results in image processing tasks [11]. As the most competitive image classification architecture, the CNN model is biologically inspired, and it only needs original images, instead of complex hand-crafted features, as input [12]. CNNs are multilayer feed-forward neural networks and have been used in a large number of image recognition tasks [13][14][15].
Image classification is implemented through a cascade of feature extraction and recognition. Inspired by this, this study presents a convolutional neural network (CNN) to solve multi-label classification problems. Multi-label learning usually allocates multiple labels to an instance at the same time. Therefore, we consider single-label allocation as a binary classification problem and convert multi-label learning into multiple single-label assignment tasks.
The classification algorithm consists of two steps. First, cropped images of the PCB are obtained with traditional image processing, instead of using the full image. Second, features are extracted from the cropped images by the CNN and then used for classification.
The method proposed in this study makes three contributions:
i. Image tiles are used as input instead of the entire original image.
ii. The method performs multi-label classification, which means it can automatically assign a set of labels to each instance.
iii. The PCB defect classification accuracy is better than the state-of-the-art.
The remainder of this paper is organised as follows: Section 2 introduces the method and model. Section 3 describes the dataset and implementation details and analyses the experimental results. Section 4 concludes the paper.

Methodology
In this section, we briefly review traditional PCB classification methods and introduce the CNN model we used in detail.

Traditional method
There are three main methods for the classification of PCBs: reference comparison method, non-reference comparison method, and hybrid method.
The reference comparison method compares the features of the image to be examined with the features of the reference image one by one. It is simple, but requires accurate image alignment together with satisfactory lighting conditions [16]. Typical methods include an edge comparison algorithm based on the reference image and the measured image [17], a PCB inspection technique based on comparing the connected table (i.e. the connected PCB hole list) with the test image [2], and a PCB detection algorithm based on mathematical morphology [18].
The non-reference comparison method uses the predefined design rules of the PCB to determine whether there is a defect in the image. For example, Benedek [19] introduced a probabilistic approach for optical quality checking of solder pastes on the PCB based on a hierarchical marked point process. The advantage is that neither a reference image nor image alignment is needed. However, the number of detectable defects is limited: defects that still satisfy the design rules cannot be detected, and the error rate cannot meet practical demand.
The hybrid method is a combination of a reference comparison method and a non-reference comparison method.
At present, the mainstream methods are reference comparison methods. They first align the reference image with the measured image and then extract features to classify the PCB defects.

Convolutional neural network
CNN [20] is a deep learning method developed upon the traditional multi-layer neural network [21], and is widely applied to image classification. Compared with traditional multilayer neural networks, CNNs mainly add three basic concepts: local receptive fields, shared weights, and pooling layers. With these settings, CNNs classify images through a cascade of feature extraction and classification.

Our model
The CNN constructed in this study has three blocks. Each block has a convolutional layer, an activation layer, and a max pooling layer. Finally, a six-category classification is performed through the fully connected layer. The proposed CNN structure for the classification of defects is illustrated in Fig. 1.

Network structure:
We introduce the main layers used in the study.
i. Convolutional layer: The convolutional layer is resource consuming. Through the convolution operation, the input image is transformed into new spaces whose outputs can be used as features. The convolutional layer consists of the input image (I), the filter (K) and the offset (b). The filter size is F, the step size is S, and the amount of zero padding is P. Suppose the input image size is W × H × D and the channel number is C. Given the above, the spatial size W2 × H2 of the output can be computed as [22]
$W_2 = \frac{W - F + 2P}{S} + 1, \quad H_2 = \frac{H - F + 2P}{S} + 1$
ii. Activation layer: The activation function maps neuron outputs through a non-linear function, preserving useful features and removing redundant data. The activation function used in this study is the rectified linear unit (ReLU) function, which is described as follows [23]:
$f(x) = \max(0, x)$
iii. Pooling layer: The pooling layer reduces the dimension of each feature map without losing important information in the image. Its main role is to reduce the spatial dimensions. Pooling layers can mainly be divided into max pooling and average pooling. In practice, max pooling generally performs better than average pooling, so we choose the max pooling layer. Suppose the input image size is W × H × D, the filter size is F, and the step size is S. Then, the output image size is W2 × H2 × D2, where [22]
$W_2 = \frac{W - F}{S} + 1, \quad H_2 = \frac{H - F}{S} + 1, \quad D_2 = D$
iv. Fully-connected layer: The fully-connected layer is the main building block of the traditional multi-layer perceptron. Through the fully connected layer, the features, which combine local features with global features, are used to calculate the probability of each category.
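The size bookkeeping for stacked convolution and pooling layers can be sketched as below. This is a minimal illustration that assumes the commonly used relations (W − F + 2P)/S + 1 for convolution and (W − F)/S + 1 for pooling; the 3 × 3 filters, stride 1, padding 1 and 2 × 2 pooling are hypothetical values chosen for the example and are not taken from Table 1.

```python
def conv_output_size(w, h, f, s, p):
    """Spatial size after a convolution with filter size f, stride s, zero padding p."""
    return (w - f + 2 * p) // s + 1, (h - f + 2 * p) // s + 1


def pool_output_size(w, h, f, s):
    """Spatial size after pooling with window size f and stride s (no padding)."""
    return (w - f) // s + 1, (h - f) // s + 1


# Hypothetical settings (not from the paper): three blocks of
# 3x3 convolution (stride 1, padding 1) + 2x2 max pooling (stride 2),
# applied to a 128x128 input tile.
size = (128, 128)
sizes = [size]
for _ in range(3):
    size = conv_output_size(*size, f=3, s=1, p=1)
    size = pool_output_size(*size, f=2, s=2)
    sizes.append(size)

print(sizes)
```

Under these assumed settings each block halves the spatial size, so the feature map passed to the fully connected layer would be 16 × 16 per channel.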

Network settings:
The network consists of three convolution processes, each of which is followed by a batch normalisation layer and ReLU activation. Through supervised learning, the parameters of all layers are tuned to suit this task. The network layer settings are shown in Table 1.

Optimisation function:
The optimisation function is used to calculate and update the network parameters that affect the training and the output of the model, so that they approach their optimal values and thereby minimise the loss function L(x). The optimisation function selected in this study is adaptive moment estimation (Adam). Adam dynamically adjusts the learning rate of each parameter using first-order and second-order moment estimates of the gradient. The advantage of Adam is that, after bias correction, the learning rate in each iteration stays within a certain range, which makes the parameter updates more stable. The main process is as follows [24].
Let ε be the step size, θ the initial parameters, δ a small constant for numerical stability, β_1 the decay coefficient of the first-order moment and β_2 the decay coefficient of the second-order moment. Let g_t be the gradient of the loss with respect to θ at iteration t, and let m_t and v_t denote the estimates of the first-order moment (mean) and the second-order raw moment (uncentred variance) of the gradient, respectively. Then
$m_t = \beta_1 m_{t-1} + (1 - \beta_1) g_t, \quad v_t = \beta_2 v_{t-1} + (1 - \beta_2) g_t^2$
$\hat{m}_t = \frac{m_t}{1 - \beta_1^t}, \quad \hat{v}_t = \frac{v_t}{1 - \beta_2^t}$
$\theta_t = \theta_{t-1} - \epsilon \frac{\hat{m}_t}{\sqrt{\hat{v}_t} + \delta}$
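As an illustration of the Adam update, the following pure-Python sketch applies the standard update rule to a toy one-parameter problem. The function name `adam_step`, the default coefficients (0.9, 0.999, 1e-8) and the quadratic toy loss are assumptions made for the example; in the actual experiments the optimiser provided by the deep learning toolbox would be used instead.

```python
import math


def adam_step(theta, grad, m, v, t, lr=1e-4, beta1=0.9, beta2=0.999, delta=1e-8):
    """One Adam update for a list of parameters; returns (theta, m, v)."""
    new_theta, new_m, new_v = [], [], []
    for p, g, m_i, v_i in zip(theta, grad, m, v):
        m_i = beta1 * m_i + (1 - beta1) * g        # first-order moment estimate
        v_i = beta2 * v_i + (1 - beta2) * g * g    # second-order raw moment estimate
        m_hat = m_i / (1 - beta1 ** t)             # bias correction
        v_hat = v_i / (1 - beta2 ** t)
        new_theta.append(p - lr * m_hat / (math.sqrt(v_hat) + delta))
        new_m.append(m_i)
        new_v.append(v_i)
    return new_theta, new_m, new_v


# Toy problem (an assumption for illustration): minimise L(x) = (x - 3)^2.
theta, m, v = [0.0], [0.0], [0.0]
first_step = None
for t in range(1, 2001):
    grad = [2 * (theta[0] - 3.0)]
    theta, m, v = adam_step(theta, grad, m, v, t, lr=0.01)
    if t == 1:
        first_step = theta[0]

print(first_step, theta[0])
```

After bias correction the very first step has a magnitude close to the learning rate regardless of the gradient scale, which is the bounded-step behaviour described above.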

Loss function:
In the study, the loss function is refined to satisfy multi-label classification.
As there is no correlation between PCB defects, there is no dependency between the labels, and the label correlations can be ignored. This study therefore decomposes the multi-label learning problem into six independent binary classification problems.
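One plausible way to realise such a decomposition is to apply a sigmoid to each of the six network outputs and average the six binary cross-entropy terms. The sketch below is an assumption about the general form of such a loss, not a reproduction of the loss defined in this study; the function names are hypothetical, and a production implementation would use a numerically stable formulation.

```python
import math


def sigmoid(z):
    """Logistic function mapping a raw score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def multilabel_loss(x, y):
    """Mean binary cross-entropy over the labels.

    x: raw network scores, one per label; y: 0/1 ground-truth flags.
    """
    total = 0.0
    for x_i, y_i in zip(x, y):
        p = sigmoid(x_i)
        total += y_i * math.log(p) + (1 - y_i) * math.log(1.0 - p)
    return -total / len(x)


def multilabel_loss_grad(x, y):
    """Derivative of the loss with respect to each raw score x_i."""
    n = len(x)
    return [(sigmoid(x_i) - y_i) / n for x_i, y_i in zip(x, y)]


# Example with six labels, two of which are present.
scores = [-2.0, 3.0, -1.0, -4.0, -3.0, 2.5]
target = [0, 1, 0, 0, 0, 1]
print(multilabel_loss(scores, target))
print(multilabel_loss_grad(scores, target))
```

Because each term depends on only one score, the gradient for one label is unaffected by the other five, which matches the independence assumption made for the defect labels.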
In this study, each image is labelled with a six-bit string such as '000000'. The bits correspond, in order, to no defect (which is regarded as one kind of defect), short circuit, open circuit, spurious copper, mousebite, and spur. A bit value of 1 means that this kind of defect exists, and 0 means that it does not. For example, an image labelled '010001' contains two kinds of defects, short circuit and spur. The refined loss function is defined as follows: let x_i be the prediction at position i and y_i be the true label, and let N be the total number of categories. The loss and its derivative are then computed from these quantities.

Experiments and results analysis

Dataset and protocols
The defect dataset was collected by the authors. It is divided into a training set and a test set, which are two separate parts of the dataset. As shown in Fig. 2, the categories of defects include open circuit, short circuit, spurious copper, mousebite, and spur. Moreover, no defect is regarded as one kind of defect. Each category has 200 images, so the training set contains 1200 images in total, including cases in which several categories of defects coexist. The test set contains 150 images, giving a total of 1350 images in the dataset.
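The six-bit label strings used to annotate these categories can be converted to and from defect names with a few lines of code. The sketch below assumes the bit order given for the loss function (no defect, short circuit, open circuit, spurious copper, mousebite, spur); the helper names are hypothetical.

```python
DEFECT_NAMES = [
    "no defect", "short circuit", "open circuit",
    "spurious copper", "mousebite", "spur",
]


def decode_label(bits):
    """Return the defect names whose bit is set in a six-character 0/1 string."""
    if len(bits) != len(DEFECT_NAMES) or set(bits) - {"0", "1"}:
        raise ValueError("label must be six characters, each 0 or 1")
    return [name for name, bit in zip(DEFECT_NAMES, bits) if bit == "1"]


def encode_label(names):
    """Build the six-character label string from a collection of defect names."""
    unknown = set(names) - set(DEFECT_NAMES)
    if unknown:
        raise ValueError(f"unknown defect names: {sorted(unknown)}")
    return "".join("1" if name in names else "0" for name in DEFECT_NAMES)


print(decode_label("010001"))
print(encode_label(["open circuit", "mousebite"]))
```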

Implementation details
In this section, we briefly introduce the image preprocessing and the implementation details of the CNN.

Pre-processing:
The original images used in this study were provided by Ma [25] and were acquired under a coaxial light source.
First, the original PCB images are cropped to remove the white edges. To concentrate the circuit within each small image block, we scale the PCB image to 800 × 600 using bilinear interpolation. The formula is defined as [26]
$f(x, y) \approx \frac{f(Q_{11})(x_2 - x)(y_2 - y) + f(Q_{21})(x - x_1)(y_2 - y) + f(Q_{12})(x_2 - x)(y - y_1) + f(Q_{22})(x - x_1)(y - y_1)}{(x_2 - x_1)(y_2 - y_1)}$
where Q_{11} = (x_1, y_1), Q_{12} = (x_1, y_2), Q_{21} = (x_2, y_1) and Q_{22} = (x_2, y_2) are the four known neighbouring pixels and f gives the pixel value at a position, so that the pixel at (x, y) after interpolation can be calculated. The image is then cut, with a step of 64, into small tiles of 128 × 128, and the dataset is labelled at this scale. In the training set, every category has the same proportion of images.
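The two pre-processing operations, bilinear interpolation and sliding-window tiling, can be sketched as follows. This is an illustrative pure-Python version: the function names are hypothetical, the image is represented as a nested list of grey values, and it is assumed that tiles which would extend past the image border are discarded, which the text does not specify.

```python
def bilinear(img, x, y):
    """Bilinearly interpolate a 2-D list of grey values at real-valued (x, y).

    x indexes columns and y indexes rows; both must lie inside the image.
    """
    h, w = len(img), len(img[0])
    x1, y1 = int(x), int(y)
    x2, y2 = min(x1 + 1, w - 1), min(y1 + 1, h - 1)
    dx, dy = x - x1, y - y1
    top = img[y1][x1] * (1 - dx) + img[y1][x2] * dx
    bottom = img[y2][x1] * (1 - dx) + img[y2][x2] * dx
    return top * (1 - dy) + bottom * dy


def tile_origins(width, height, tile=128, step=64):
    """Top-left corners of all tiles that fit entirely inside the image."""
    return [(x, y)
            for y in range(0, height - tile + 1, step)
            for x in range(0, width - tile + 1, step)]


print(bilinear([[0, 10], [20, 30]], 0.5, 0.5))
origins = tile_origins(800, 600)
print(len(origins), origins[0], origins[-1])
```

Under the stated border assumption, an 800 × 600 image yields 11 × 8 = 88 overlapping tiles, with adjacent tiles sharing half of their area.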

Training details:
The implementation is performed using the PyTorch toolbox. In the experiment, the initial learning rate of Adam is set to 0.0001. The batch sizes for the training set and test set are 32 and 24, respectively.

Results of multi-category
As shown in Table 2, the classification accuracy is highest for the open circuit (97.34%), followed by the short circuit (96.67%), while the accuracy for the spur is 83.34%. The high accuracy for open circuit and short circuit shows that the CNN model used in this study can effectively extract the image features of these defects. These two defects perform better because they have more obvious features and occupy more pixels in the image than the other defects. The test accuracies for spurious copper, mousebite, and no defects are 88.67, 87.34, and 86.00%, respectively, so these three categories can also be effectively classified and identified. Among these three, the recognition rate of the no-defect images is the lowest, mainly because the no-defect images differ considerably from one another while the amount of training data in this study is limited. Due to its small size and insignificant characteristics, the spur has the lowest accuracy overall.
As can be seen from the overall accuracy (89.89%), the CNN is effective for PCB defects classification.
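The overall figure is consistent with the unweighted mean of the six per-category accuracies quoted above. The short check below reproduces that arithmetic; that the overall accuracy was obtained in exactly this way is an assumption.

```python
# Per-category test accuracies (%) as quoted in the text for Table 2.
per_class_accuracy = {
    "open circuit": 97.34,
    "short circuit": 96.67,
    "spurious copper": 88.67,
    "mousebite": 87.34,
    "no defects": 86.00,
    "spur": 83.34,
}

# Unweighted mean over the six categories.
overall = sum(per_class_accuracy.values()) / len(per_class_accuracy)
print(f"{overall:.2f}%")
```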

Results of two categories
In the later stage of the experiment, in order to verify whether the CNN is suitable for PCB defect classification, the images are classified into two categories: no defects and existing defects. Each category has 320 images. The five actual defect types among the six categories are merged into one, and the binary classification of PCB defects is performed with the CNN.
The accuracy is 92.86%, which is higher than the accuracy of multi-category classification. Therefore, the results verify that the CNN can effectively extract PCB features.

Compared with state-of-the-art
In [25], it is proposed to quickly determine whether there is a defect in the target image by using the threshold segmentation of the difference image and the contour length recognition of the latent defect area.By calculating the change time of the grey value of the peripheral boundary, the defect type can be easily identified.
As shown in Table 3, the comparison of experimental results shows that the accuracies obtained by that method for short circuit (90.00%), open circuit (90.00%), spurious copper (85.00%), and mousebite (83.00%) are lower than those obtained in this study.
As shown in Table 4, the accuracy of the method (80.0000%) based on image subtraction proposed by Raihan and Ce [27] is lower than the method proposed in this study (92.8571%).
From the experimental results, the method proposed achieves a high accuracy.

Conclusion
This study transforms multi-label learning into multiple binary classification tasks and proposes a deep neural network architecture to handle this multi-task problem. The CNN model is used to classify the type of PCB defect. Instead of the entire image, small patches cut from the PCB image are used as input. These small patches are then assigned to six categories, namely no defects, short circuit, open circuit, spurious copper, mousebite, and spur. The experimental results show that this method can effectively classify the defects on the PCB, including both single and multiple defects.
Due to the limited number of images in the dataset used in this study, the accuracy of PCB defect classification has been affected to some extent. Therefore, future work will expand the image dataset to improve the classification accuracy of PCB defects.

Fig. 1 CNN structure for the classification of defects

Table 2 Results of multi-categories