Recognition of industrial machine parts based on transfer learning with convolutional neural network

Qiaoyang Li; Guiming Chen

doi:10.1371/journal.pone.0245735

Abstract

As industry gradually becomes unmanned and intelligent, future factories will need intelligent monitoring, diagnosis and maintenance of parts and components. To achieve this goal, the parts in the factory must first be accurately identified and classified. However, the existing literature rarely studies the classification and identification of parts across an entire factory. Because few data samples are available, this paper studies the identification and classification of industrial machine parts with small samples. To solve this problem, a convolutional neural network model based on the pretrained InceptionNet-V3 model is established through transfer learning. Through experimental design, the influence of data augmentation, learning rate and optimizer algorithm on model effectiveness is studied, the optimal model is determined, and the test accuracy reaches 99.74%. Comparison with the accuracy of other classifiers shows that the convolutional neural network model based on transfer learning can effectively solve the problem of recognizing and classifying industrial machine parts with small samples, and that the idea of transfer learning can be extended further.

Citation: Li Q, Chen G (2021) Recognition of industrial machine parts based on transfer learning with convolutional neural network. PLoS ONE 16(1): e0245735. https://doi.org/10.1371/journal.pone.0245735

Editor: Le Hoang Son, Vietnam National University, VIETNAM

Received: May 15, 2020; Accepted: January 7, 2021; Published: January 28, 2021

Copyright: © 2021 Li, Chen. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: Data are held in a public repository. https://share.weiyun.com/CTBAFgtB.

Funding: This research was funded by the National Natural Science Foundation of China (71601180).

Competing interests: The authors have declared that no competing interests exist.

1. Introduction

With the advancement of industrial technology and the transformation and development of modern factories, the manufacturing industry has gradually entered an unmanned and intelligent stage. In order to enable the entire production workshop to complete tasks systematically, orderly and independently, intelligent monitoring and maintenance are an important link. In order to improve the accuracy of monitoring and maintenance, accurate recognition of parts and machine parts of the entire industrial workshop is the primary goal.

To realize unmanned and intelligent factories, literature [1] suggests a method for recognizing the location and orientation of irregular machine parts with complex external contours. Literature [2] presents a geometric dimension measurement system for shaft parts based on machine vision. These studies have potential applications in manufacturing. However, there is no research on the identification of parts across a whole factory with small samples.

To accurately recognize various machine parts, it is necessary to study image recognition for industrial machine parts. Image recognition refers to technology that uses computers to process, analyze and understand large numbers of images in order to recognize images with the same characteristics in different modes and environments. Traditional image recognition processes include image acquisition, image preprocessing, feature extraction and image recognition. Literature [3] evaluates the performance of MLP and SOM neural network classifiers developed for detecting four conditions of a three-phase induction motor and examines the results. The cross-validation method in that paper is worth learning from, but its calculation is relatively complicated and it covers few classification types, so its effect on small samples with many classes cannot be determined, and the accuracy of its results can be further improved. In addition, that work studies only the induction motor and does not extend to the parts of an entire factory. In recent years, deep learning based on convolutional neural networks (CNN) [4] has been used for various types of image recognition and has achieved very good results. In 2012, a CNN model for the first time reduced the error rate from 25.77% to 15.319% in the ImageNet Large Scale Visual Recognition Challenge (ILSVRC). By the 2017 competition, the lowest error rate had dropped to 2.251%.

Although CNNs are effective for image feature extraction and recognition, they require a large amount of sample data for iterative training. For the specific field of industrial machine component recognition, however, there is not enough sample data. If a CNN is trained directly on a small sample, the resulting model will have a large error and will be difficult to generalize. To solve the problem of image recognition with small samples, Gene Kitamura detected and recognized ankle fractures from small samples using de novo training and a multi-view CNN ensemble [5], with a final recognition accuracy of 81%. Jufeng Yang proposed a self-paced learning algorithm for small sample recognition of rare clinical skin diseases [6]. Qian Huang proposed a blood cell classification framework based on medical hyperspectral imaging that combines the modulated Gabor wavelet with CNN kernels to classify white blood cells under small sample training [7]. In the field of industrial machine parts recognition, few algorithms have been studied specifically. To solve the problem of recognizing and classifying industrial machine parts with small samples, a recognition method for industrial machine parts based on transfer learning [8–15] with CNN is proposed. The method not only further improves recognition accuracy but also saves model training time. Its parameters are flexible and can be adjusted for different target images and recognition tasks, which makes it suitable for extension to related research and industrial practice.

The first part of this article explains the necessity and current state of research on this subject, which aims to enable future factories to accurately classify and recognize industrial machine parts from a small sample data set. The second part introduces the basic theory of transfer learning based on CNN and constructs its basic framework. The third part introduces the source and classification of the experimental data set, designs the experimental groups and studies the influence of different variables on the model training effect. The fourth part summarizes the experiments and puts forward the research direction of the next stage.

2. Construction of convolutional neural network model based on transfer learning

2.1. Convolutional neural network

CNN in deep learning is a neural network specifically used to process data with a grid structure. A CNN includes multiple convolutional layers, pooling layers, and fully connected layers [16]. Built on the feed-forward neural network, the model is trained iteratively: the error measured by the loss function is propagated back to each network layer and used to update the parameter weights. As training progresses, the weights are continuously updated until the desired training effect is achieved.

The role of the convolution layer is to perform feature mapping of the input through the convolution kernel to extract the features of the image [17–19]. The convolution operation formula is

$$S(i, j) = (I * K)(i, j) = \sum_{m} \sum_{n} I(i+m, j+n)\, K(m, n) \tag{1}$$

Where S (i, j) represents the output tensor of the convolution layer, I (i+m, j+n) represents the input tensor of the convolution layer, K (m, n) represents the convolution kernel, i, j represent the coordinate values of the tensor, m, n represent the coordinate value of the convolution kernel.
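The convolution operation described above can be illustrated with a minimal sketch in plain Python. It assumes no padding and a stride of 1, neither of which is specified in the text, and the function name `conv2d_valid` and the toy input values are chosen here purely for illustration.

```python
def conv2d_valid(image, kernel):
    """Compute S(i, j) = sum_m sum_n I(i + m, j + n) * K(m, n).

    Assumptions (not stated in the text): no padding, stride 1.
    """
    kernel_h, kernel_w = len(kernel), len(kernel[0])
    out_h = len(image) - kernel_h + 1
    out_w = len(image[0]) - kernel_w + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0
            # Slide the kernel over the window whose top-left corner is (i, j).
            for m in range(kernel_h):
                for n in range(kernel_w):
                    total += image[i + m][j + n] * kernel[m][n]
            row.append(total)
        output.append(row)
    return output


# Toy 3 x 3 input tensor I and 2 x 2 kernel K (illustrative values only).
image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
kernel = [
    [1, 0],
    [0, -1],
]
feature_map = conv2d_valid(image, kernel)
print(feature_map)
```

Each output element is the sum of element-wise products between the kernel and the image window that starts at (i, j), so a 3 × 3 input and a 2 × 2 kernel give a 2 × 2 feature map.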

The function of the pooling layer is to further process the feature mapping results obtained by the convolution operation. The pooling function statistically summarizes the feature values of a position in the plane and its adjacent positions, and uses the summarized result as the value of this position in the plane. Common pooling functions include average pooling, maximum pooling, and random pooling. Taking the maximum pooling function with a size of 2 × 2 as an example, its calculation formula is

$$f_{pool} = \max\left(s_{i,j},\ s_{i+1,j},\ s_{i,j+1},\ s_{i+1,j+1}\right) \tag{2}$$

Where f_pool represents the result after pooling, s_i,j represents the element whose position on the feature map tensor is (i, j).
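As a small illustration of 2 × 2 maximum pooling, the sketch below takes the largest of the four elements in each window. The stride of 2 (non-overlapping windows) is an assumption made here, and `max_pool_2x2` is a hypothetical helper name.

```python
def max_pool_2x2(feature_map):
    """2 x 2 maximum pooling over non-overlapping windows.

    Assumption (not stated in the text): stride 2, so windows do not overlap.
    """
    pooled = []
    for i in range(0, len(feature_map) - 1, 2):
        row = []
        for j in range(0, len(feature_map[0]) - 1, 2):
            # f_pool = max(s[i][j], s[i+1][j], s[i][j+1], s[i+1][j+1])
            window = (
                feature_map[i][j],
                feature_map[i + 1][j],
                feature_map[i][j + 1],
                feature_map[i + 1][j + 1],
            )
            row.append(max(window))
        pooled.append(row)
    return pooled


# Toy 4 x 4 feature map (illustrative values only).
feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 5],
    [0, 1, 7, 2],
    [3, 6, 4, 8],
]
pooled = max_pool_2x2(feature_map)
print(pooled)
```

With these settings, each spatial dimension is halved while the strongest response in every window is kept.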

The fully connected layer flattens the results obtained from the convolutional and pooling layers, reduces their dimensionality, and then applies a non-linear transformation through the activation function. Finally, the results are input into the classifier for classification.

2.2. Transfer learning and model building

2.2.1. Transfer learning.

Transfer learning is a machine learning method that uses existing knowledge to solve problems in different but related domains. For a CNN, the convolutional and pooling layers are retained. Convolutional layers trained on a large amount of sample data can extract features from other image data. The extracted feature vector is processed by the pooling layer, and a new fully connected layer is then added to form a new network model. Put simply, the model keeps its feature extraction capability and gains a new classification target, which enables it to complete new image recognition and classification tasks.

2.2.2. InceptionNet-V3 convolutional network model.

Commonly used pretrained models include Resnet [20], VGG [21], Alexnet [22, 23] and InceptionNet-V3 [24]. Compared with other models, InceptionNet-V3's classifier has a smaller number of operations, which reduces training time, and its convolution structure also reduces structural redundancy. In addition, literature [25] shows that InceptionNet-V3 achieves good results on classification problems based on transfer learning. Therefore, this paper uses the InceptionNet-V3 convolutional network model for transfer learning. InceptionNet-V3 was proposed in the paper "Rethinking the Inception Architecture for Computer Vision" in December 2015. It has two main improvements over InceptionNet-V2. The first is an optimized structure of the Inception module. The second is the factorization of a larger two-dimensional convolution into two smaller one-dimensional convolutions (for example, a 7 × 7 convolution is split into a 1 × 7 and a 7 × 1 convolution). This is called the "Factorization into small convolutions" idea. This asymmetric convolutional split is more effective than symmetric structures at handling richer spatial features and increasing feature diversity. The architecture diagram of the InceptionNet-V3 model is shown in Fig 1.

Fig 1. InceptionNet-V3 model architecture diagram.

https://doi.org/10.1371/journal.pone.0245735.g001

The InceptionNet-V3 model has a total of 46 layers and consists of 11 Inception modules, including 96 convolutional layers. The convolutional layers are implemented by TensorFlow's Slim tool.

2.2.3. Construction of transfer learning model.

InceptionNet-V3 completed training on the ImageNet data set and the number of training samples reached 1.2 million [26–28]. However, the number of images of industrial machine parts is not yet large enough. Therefore, the transfer learning method is used to recognize and classify industrial machine parts based on the InceptionNet-V3 model. For the trained InceptionNet-V3 model, the parameters of all convolutional layers are retained and the last fully connected layer is replaced. The previous network layer of this fully connected layer is called the bottleneck layer, which is the last Dropout layer in InceptionNet-V3. The results of the new fully connected layer are passed to a Softmax layer, and new recognition tasks can be processed. The modified module process of industrial machine parts recognition is shown in Fig 2.

Fig 2. Modified flowchart of recognition of industrial machine parts.

https://doi.org/10.1371/journal.pone.0245735.g002
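The idea of keeping the pretrained layers fixed and training only a new fully connected layer followed by Softmax can be sketched in plain Python as follows. This is a toy illustration under strong assumptions, not the authors' TensorFlow implementation: `frozen_feature_extractor` is a hypothetical stand-in for the retained InceptionNet-V3 layers, the two-class data are made up, and the learning rate and epoch count are arbitrary.

```python
import math


def frozen_feature_extractor(image):
    """Hypothetical stand-in for the retained pretrained layers.

    It maps an image to a fixed 2-value "bottleneck" vector and is never
    updated during training. It is NOT InceptionNet-V3.
    """
    pixels = [p for row in image for p in row]
    return [sum(pixels) / len(pixels), max(pixels)]


def softmax(logits):
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def class_probabilities(features, weights, biases):
    """New fully connected layer followed by a Softmax layer."""
    logits = [sum(w * x for w, x in zip(row, features)) + b
              for row, b in zip(weights, biases)]
    return softmax(logits)


def train_new_layer(feature_vectors, labels, num_classes, lr=0.5, epochs=200):
    """Train only the new layer with plain gradient descent (assumed settings)."""
    dim = len(feature_vectors[0])
    weights = [[0.0] * dim for _ in range(num_classes)]
    biases = [0.0] * num_classes
    for _ in range(epochs):
        for x, label in zip(feature_vectors, labels):
            probs = class_probabilities(x, weights, biases)
            for c in range(num_classes):
                # Gradient of the cross-entropy loss with respect to logit c.
                error = probs[c] - (1.0 if c == label else 0.0)
                for k in range(dim):
                    weights[c][k] -= lr * error * x[k]
                biases[c] -= lr * error
    return weights, biases


# Made-up toy data: class 0 = dark images, class 1 = bright images.
images = [
    [[0.1, 0.2], [0.0, 0.1]],
    [[0.2, 0.1], [0.1, 0.0]],
    [[0.9, 0.8], [1.0, 0.9]],
    [[0.8, 0.9], [0.9, 1.0]],
]
labels = [0, 0, 1, 1]

# Bottleneck vectors are computed once because the extractor is frozen.
bottlenecks = [frozen_feature_extractor(img) for img in images]
weights, biases = train_new_layer(bottlenecks, labels, num_classes=2)

predictions = []
for features in bottlenecks:
    probs = class_probabilities(features, weights, biases)
    predictions.append(probs.index(max(probs)))
print(predictions)
```

Because the extractor is frozen, its outputs can be cached and only the small new layer has to be optimized, which is where the saving in training time comes from.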

The gradient descent optimizers that can be used in training mainly include stochastic gradient descent, AdaGrad, RMSProp and Adam optimizers. Take the Adam algorithm as an example to introduce the principle of its optimization method.

First, set the global learning rate σ. The exponential decay rates of the moment estimates, ρ₁ and ρ₂, lie in the interval [0, 1) and default to 0.9 and 0.999, respectively. The parameters are initialized as ω. A small constant δ is used for numerical stability, with default δ = 10⁻⁸. The first and second moment variables s and r are initialized to 0, and the time step t is initialized to t = 0. The following steps are then repeated until the stopping condition is met.

(1) Take a mini-batch of m samples {x₁,x₂,⋯,x_m} from the training set, where y_i denotes the target corresponding to x_i.

(2) Calculate the gradient g, where L is the loss function and f(x_i; ω) is the model output.

$$g \leftarrow \frac{1}{m} \nabla_{\omega} \sum_{i=1}^{m} L\left(f(x_i; \omega),\ y_i\right) \tag{3}$$

(3) Update the time step.

$$t \leftarrow t + 1 \tag{4}$$

(4) Update the biased first moment estimate.

$$s \leftarrow \rho_1 s + (1 - \rho_1)\, g \tag{5}$$

(5) Update the biased second moment estimate.

$$r \leftarrow \rho_2 r + (1 - \rho_2)\, g \odot g \tag{6}$$

(6) Correct the bias of the first moment.

$$\hat{s} \leftarrow \frac{s}{1 - \rho_1^{t}} \tag{7}$$

(7) Correct the bias of the second moment.

$$\hat{r} \leftarrow \frac{r}{1 - \rho_2^{t}} \tag{8}$$

(8) Calculate the update amount of the parameters.

$$\Delta\omega = -\sigma \frac{\hat{s}}{\sqrt{\hat{r}} + \delta} \tag{9}$$

(9) Update the parameters according to Δω.

$$\omega \leftarrow \omega + \Delta\omega \tag{10}$$
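The Adam steps listed above can be sketched in plain Python as follows. The variable names mirror the symbols in the text (σ, ρ₁, ρ₂, δ, s, r, t, ω); the quadratic toy objective, the learning rate of 0.1 and the fixed number of steps used as the stopping condition are assumptions made only for this illustration.

```python
import math


def adam_minimize(grad_fn, omega, sigma=0.001, rho1=0.9, rho2=0.999,
                  delta=1e-8, steps=1000):
    """Run Adam updates; a fixed step count stands in for the stop condition."""
    s = [0.0] * len(omega)  # first moment variable
    r = [0.0] * len(omega)  # second moment variable
    t = 0                   # time step
    for _ in range(steps):
        g = grad_fn(omega)                                   # gradient
        t += 1                                               # time step
        s = [rho1 * s_k + (1 - rho1) * g_k for s_k, g_k in zip(s, g)]
        r = [rho2 * r_k + (1 - rho2) * g_k * g_k for r_k, g_k in zip(r, g)]
        s_hat = [s_k / (1 - rho1 ** t) for s_k in s]         # bias correction
        r_hat = [r_k / (1 - rho2 ** t) for r_k in r]         # bias correction
        delta_omega = [-sigma * sh / (math.sqrt(rh) + delta)
                       for sh, rh in zip(s_hat, r_hat)]
        omega = [w + dw for w, dw in zip(omega, delta_omega)]
    return omega


def toy_gradient(omega):
    """Gradient of the toy loss (w0 - 3)^2 + (w1 + 1)^2, minimum at (3, -1)."""
    return [2 * (omega[0] - 3), 2 * (omega[1] + 1)]


result = adam_minimize(toy_gradient, [0.0, 0.0], sigma=0.1, steps=500)
print(result)
```

In a real training loop, `grad_fn` would return the mini-batch gradient of the network loss rather than the gradient of a closed-form function.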

Assume that the outputs of the original neural network are y₁, y₂, …, y_n; then the output after softmax [29] regression processing is

$$\mathrm{softmax}(y_i) = \frac{e^{y_i}}{\sum_{j=1}^{n} e^{y_j}} \tag{11}$$
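A minimal Python sketch of the softmax regression step is given below. Subtracting the maximum output before exponentiation is a common numerical-stability measure added here as an assumption; it does not change the result. The three input values are arbitrary.

```python
import math


def softmax(outputs):
    """Map raw network outputs y_1..y_n to probabilities that sum to 1."""
    peak = max(outputs)
    # Shifting by the maximum avoids overflow and leaves the ratios unchanged.
    exps = [math.exp(y - peak) for y in outputs]
    total = sum(exps)
    return [e / total for e in exps]


probabilities = softmax([2.0, 1.0, 0.1])
print(probabilities)
```

The largest raw output receives the largest probability, and the predicted class is the index of that maximum.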

3. Model training and analysis of experimental results

3.1. Data set

The data set used in the experiment was photographed on site at a factory and includes 11 types of industrial machine parts, such as control panels, plates, robotic arms and assemblies, with a total of 1002 images. Fig 3A–3K shows sample images of each type. Table 1 summarizes the number of images in each category. In the experiment, the data set is augmented to 4008 images through rotation, flipping and similar operations. 80% of the data set is used for training, 10% for validation and 10% for testing.

Fig 3. Examples of dataset image.

https://doi.org/10.1371/journal.pone.0245735.g003

Table 1. Summary of the number of each category.

https://doi.org/10.1371/journal.pone.0245735.t001

In Fig 3, from top to bottom and from left to right, the categories are control panels, robotic arms, interactive module, assembly, big machines, engine, hangar, old machinery, plates, tech parts and others.
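The augmentation and the 80/10/10 split described in this section can be sketched as follows. The exact set of transformations is not given in the text, so this sketch assumes one rotation and two flips, which yields the fourfold increase from 1002 to 4008 images. Whether the split is made before or after augmentation is also not stated; splitting the augmented set is assumed here. All function names are hypothetical.

```python
import random


def rotate_90(image):
    """Rotate an image (list of pixel rows) 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def flip_horizontal(image):
    return [row[::-1] for row in image]


def flip_vertical(image):
    return [list(row) for row in image[::-1]]


def augment(image):
    """Return the original plus three transformed copies (assumed 4x scheme)."""
    return [image, rotate_90(image), flip_horizontal(image), flip_vertical(image)]


def split_80_10_10(items, seed=0):
    """Shuffle and split into training, validation and test subsets."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = int(0.8 * len(shuffled))
    n_val = int(0.1 * len(shuffled))
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


image = [[1, 2], [3, 4]]            # toy 2 x 2 "image"
augmented = augment(image)

num_augmented = 1002 * len(augmented)
train, val, test = split_80_10_10(range(num_augmented))
print(num_augmented, len(train), len(val), len(test))
```

If augmented copies of one photograph end up in different subsets, the validation and test scores can be optimistic, so splitting before augmentation is a common alternative.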

The classification and recognition process of industrial machine parts based on transfer learning is shown in Fig 4.

Fig 4. Classification and recognition process of industrial machine parts based on transfer learning.

https://doi.org/10.1371/journal.pone.0245735.g004

Compared with other models, the novelty of the model in the paper is as follows:

The training process is simplified, the amount of calculation is reduced, and training time is saved.
In the case of limited samples, better training results can be achieved.
The redundancy of the structure is reduced, which is conducive to further expansion, improvement and supplementation.

3.2 Experimental design

The experiments were completed in a software environment of Python 3.7.0 and TensorFlow 1.15.0. The hardware consisted of an Intel Core i5-6200U CPU with a clock frequency of 2.3 GHz and an NVIDIA GeForce 950M GPU with 2 GB of video memory.

The hyperparameters of the training neural network are set as follows: the initial learning rate is set to 0.01, the batch size is set to 32 and the total number of iteration training times is set to 40,000.

To obtain better training results, the following comparison experiments were set up:

Comparison of the models trained on the original data set (1002 images) and on the data set augmented by simple rotation, flipping and other operations (4008 images).
Comparison of models obtained under different learning rates.
Comparison of models obtained using different gradient descent optimizers.

3.3 Analysis of experimental results

3.3.1 Impact of image data augmentation on models.

For this experimental sample, the two trained models are compared with the learning rate set to 0.01 and a stochastic gradient descent optimizer. The trend of the training set and validation set accuracy with the number of training iterations is shown in Figs 5 and 6.

Fig 5. The trend of the accuracy of the training set of the original data and the augmented data with the number of iterative training.

https://doi.org/10.1371/journal.pone.0245735.g005

Fig 6. The trend of the accuracy of the validation set of the original data and the augmented data with the number of iterative training.

https://doi.org/10.1371/journal.pone.0245735.g006

During training, the training set accuracy for both the original data and the augmented data reached 100% after 10,000 iterations. After 25,000 iterations, the validation set accuracy for the augmented data is significantly higher than that for the original data. The loss function values during training are then compared, as shown in Fig 7. The loss of the augmented data is always slightly higher than that of the original data, which also shows that as the data set grows, more training iterations are needed to obtain ideal training results.

Fig 7. The change trend of the loss function value of the original data and the augmented data with the number of iterative training.

https://doi.org/10.1371/journal.pone.0245735.g007

After model training is completed, the held-out test set is used to evaluate the model, and the resulting accuracies are summarized in Table 2.

Table 2. Comparison of model accuracy between original and augmented data.

https://doi.org/10.1371/journal.pone.0245735.t002

Table 2 shows that after 40,000 iterations of training, the training set accuracy reached 100%. Expanding the image set through operations such as rotation and flipping increases the validation set accuracy of the model by 6.26 percentage points and the test set accuracy by 1.52 percentage points. The test set accuracy improves, but only by a small margin. One reason is that operations such as rotation and flipping do not change the features or quality of the images. Another is that, because of transfer learning, the model has already been trained on a large data set and has obtained good feature extraction ability, which weakens the effect of simply expanding a small sample.

3.3.2 Impact of different learning rates on models.

For the augmented data, under the condition of using a stochastic gradient descent optimizer, different learning rates are set and the loss function value is observed during the training of the model. Compare the loss function values of the first 200 iterations of training with learning rates [30–32] of 0.0001, 0.001, 0.01 and 0.1 respectively, as shown in Fig 8.

Fig 8. The trend of the loss function value at different learning rates with the number of iterative training.

https://doi.org/10.1371/journal.pone.0245735.g008

Fig 8 shows that if the learning rate is too small (for example, 0.0001), the value of the loss function fluctuates continuously but does not decrease toward convergence. The reason is that with too small a learning rate, convergence is slow, and no obvious convergence can be obtained within a small number of training iterations. It can also be seen that, for the experimental samples and settings, a significant gradient explosion occurs at the beginning of training when the learning rate is 0.1. To rule out chance and further test whether an excessively large learning rate causes gradient explosion, the learning rate was set to 1 and 3 under the same conditions. The loss function values of the first 200 iterations are shown in Fig 9.

Fig 9. The trend of loss function value with the number of iterative training at large learning rates.

https://doi.org/10.1371/journal.pone.0245735.g009

It can be obtained from Fig 9 that with the continuous increase of the learning rate, the peak value of the loss function at the beginning of training also increases and the effect of the gradient explosion is more significant. At the same time, in the subsequent iterative training, the value of the loss function continuously oscillated, proving that the model parameters are updated too quickly and the difference is large, destroying the previously trained weight information, causing the model to fail and the transfer learning to be meaningless.
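The qualitative effect of the learning rate can be reproduced on a toy problem. The sketch below runs plain gradient descent on the one-dimensional loss w², which is an assumption chosen for illustration and not the network studied here. The three learning rates are picked to show slow convergence, fast convergence and divergence on this toy loss, and they do not correspond to the thresholds observed in the experiments.

```python
def gradient_descent_losses(learning_rate, steps=20, start=1.0):
    """Minimize the toy loss w**2 and record the loss after every step."""
    w = start
    losses = []
    for _ in range(steps):
        gradient = 2 * w              # derivative of w**2
        w -= learning_rate * gradient
        losses.append(w * w)
    return losses


too_small = gradient_descent_losses(0.0001)  # barely moves in 20 steps
moderate = gradient_descent_losses(0.1)      # converges quickly
too_large = gradient_descent_losses(1.5)     # each step overshoots and grows

print(too_small[-1], moderate[-1], too_large[-1])
```

On this loss each step multiplies w by (1 − 2 × learning rate), so the iterate shrinks when that factor has magnitude below 1 and blows up when it exceeds 1, which mirrors the stalled and exploding loss curves discussed above.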

After this screening of learning rates, the training set and validation set accuracies of the models with learning rates of 0.001 and 0.01 are compared, as shown in Figs 10 and 11.

Fig 10. The trend of the accuracy of the training set with the number of iterative training at learning rates of 0.001 and 0.01.

https://doi.org/10.1371/journal.pone.0245735.g010

Fig 11. The trend of the accuracy of the validation set with the number of iterative training at learning rates of 0.001 and 0.01.

https://doi.org/10.1371/journal.pone.0245735.g011

Figs 10 and 11 show that, throughout the iterative training process, the training set and validation set accuracy of the model with a learning rate of 0.01 is always higher than that of the model with a learning rate of 0.001. The change of the loss function value with the number of training iterations under the two learning rates is then compared, as shown in Fig 12.

Fig 12. The trend of loss function value with the number of iterative training at learning rates of 0.001 and 0.01.

https://doi.org/10.1371/journal.pone.0245735.g012

Fig 12 shows that, for the experimental samples, neither learning rate leads to non-convergence or gradient explosion. The loss function value of the model with a learning rate of 0.01 is smaller than that of the model with a learning rate of 0.001; it converges faster and fluctuates less.

After 40,000 iterations of training are completed, the test set is used to evaluate the models trained with the two learning rates, and the resulting accuracies are summarized in Table 3.

Table 3. Comparison of model accuracy rates at different learning rates.

https://doi.org/10.1371/journal.pone.0245735.t003

According to Table 3, after 40,000 iterations of training, the accuracy of recognition of the validation set of the model with a learning rate of 0.01 is the same as that of the model with a learning rate of 0.001, but the accuracy of recognition of the test set of the model with a learning rate of 0.01 is increased by 6.77 percentage points.

3.3.3 Impact of different gradient descent optimizers on models.

To further optimize the model and improve accuracy, the training results on the augmented data are observed with the learning rate set to 0.01, using a stochastic gradient descent optimizer and adaptive learning rate optimizers [33–35] based on the AdaGrad, RMSProp and Adam algorithms. For the AdaGrad optimizer, initial_accumulator_value is set to 0.1. For the RMSProp optimizer, decay is set to 0.9, momentum is set to 0.0 and epsilon is set to 1e-10 by default. For the Adam optimizer, beta1 is set to 0.9, beta2 is set to 0.999 and epsilon is set to 1e-10 by default. To make the trends in training set and validation set accuracy under different optimizers easier to compare, the results of the 40,000 training iterations are sampled every 1000 iterations and plotted, as shown in Figs 13 and 14.

Fig 13. The trend of the accuracy of the training set and validation set with the number of iterative training under different optimizers.

https://doi.org/10.1371/journal.pone.0245735.g013

Fig 14. The trend of the accuracy of the validation set with the number of iterative training under different optimizers.

https://doi.org/10.1371/journal.pone.0245735.g014

Figs 13 and 14 show that, for the experimental samples, the accuracy with the stochastic gradient descent optimizer and with the AdaGrad optimizer is almost identical, and the accuracy with the Adam optimizer is higher than with the other three optimizers. The model with the RMSProp optimizer has lower training set and validation set accuracy than the other three models, and its accuracy fluctuates widely. The loss function values under the different optimizers (Fig 15) show that, during training with the RMSProp optimizer, the loss oscillates continuously and does not converge. We therefore conclude that the adaptive learning rate optimizer based on the RMSProp algorithm is not suitable for this experimental sample.

Fig 15. The trend of loss function value with the number of iterative training under different optimizers.

https://doi.org/10.1371/journal.pone.0245735.g015

Starting from Fig 15, the RMSProp optimizer is removed and the loss function values of the other three optimizers are compared, as shown in Fig 16. The Adam optimizer has a small loss function value and converges quickly: after 7000 iterations of training, the loss is close to 0 and changes by less than 0.005. The optimizer fluctuated briefly at around 6000 iterations, but the fluctuation was less than 0.4 and did not affect the parameter update.

Fig 16. The trend of loss function value with the number of iterative training under different optimizers.

https://doi.org/10.1371/journal.pone.0245735.g016

After 40,000 iterations of training were completed, the test set was used to evaluate the models trained with the four optimizers, and the resulting accuracies are summarized in Table 4.

Table 4. Comparison of model accuracy rates under different gradient descent optimizers.

https://doi.org/10.1371/journal.pone.0245735.t004

Table 4 shows that, for the experimental sample, the training set, validation set and test set accuracy of the model based on the Adam adaptive learning rate optimizer is higher than that of the other three optimizers, and the model training effect is better.

3.3.4 Further optimization of results.

Because the above experiments use 80% of the data set for training, which can easily lead to overfitting under normal circumstances, the samples are divided in different ratios for model training. The final accuracy rates of the training set, validation set and test set are shown in Table 5.

Table 5. Accuracy of results of different groups.

https://doi.org/10.1371/journal.pone.0245735.t005

It can be seen from the results that when the ratio is 8:1:1, no overfitting occurs, and when the ratio of the training set is reduced, the accuracy of the test set obtained decreases. This paper analyzes this phenomenon and the main reasons are as follows.

Compared with big data, the number of the data set of this article is relatively small. If the proportion of the training set is further reduced, the training sample will be too small and the model fitting effect will be poor.
The InceptionNet-V3 used in this article has been trained by ImageNet and has good feature extraction capabilities. Unlike ordinary deep learning, transfer learning solves research problems with small samples.
It can be seen from the above training results that the accuracy of the test set has reached 99.04%, so it is judged that there is no overfitting.

Since the model will be applied to large sample data in the future, it is necessary to further optimize the model to achieve higher accuracy. This paper draws on the ideas and algorithms of literature [36], and considers the method of k-fold cross-validation to improve the model. Take 10% of the original data set as the final test set, and perform 10-fold cross-validation on the remaining 90% of the data. The accuracy rates of the 10 validation sets and test set obtained are shown in Table 6.

Table 6. 10-fold cross-validation training results.

https://doi.org/10.1371/journal.pone.0245735.t006

Table 6 shows that the model is further improved by 10-fold cross-validation. Except for one validation set with an accuracy of 96.88%, the accuracy of the remaining 9 validation sets is 100%, and the accuracy of the final test set reaches 99.74%, which is 0.70 percentage points higher than the best result in Table 5. This demonstrates the effectiveness of the 10-fold cross-validation method.
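The hold-out plus 10-fold procedure can be sketched as an index split in plain Python. The sample count of 1002 is used only as an example, since the text does not say whether the folds are built from the original or the augmented images. Shuffling is omitted for brevity, and `k_fold_indices` is a hypothetical helper name.

```python
def k_fold_indices(num_samples, k=10):
    """Return k (train_indices, validation_indices) pairs covering all samples."""
    # Spread the remainder so that fold sizes differ by at most one.
    fold_sizes = [num_samples // k + (1 if i < num_samples % k else 0)
                  for i in range(k)]
    indices = list(range(num_samples))
    folds = []
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        training = indices[:start] + indices[start + size:]
        folds.append((training, validation))
        start += size
    return folds


num_total = 1002                      # example size, see the assumption above
num_test = int(0.1 * num_total)       # 10% held out as the final test set
num_remaining = num_total - num_test  # 90% used for cross-validation

folds = k_fold_indices(num_remaining, k=10)
for fold_number, (training, validation) in enumerate(folds, start=1):
    print(fold_number, len(training), len(validation))
```

Each of the ten models is trained on nine folds and validated on the remaining one, and the held-out test set is touched only once at the end.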

3.4 Comparison of classification results of different classifiers

To demonstrate the advantage of the method adopted in this paper, different classifiers were trained on the extracted image features, and the accuracy of the results is shown in Table 7. The classifiers use general or default parameter settings, and the ratio of the training set to the test set is 4:1. The k-fold cross-validation method is not used here: the accuracy obtained is much lower than that of the method used in this paper, so even with k-fold cross-validation these classifiers would be unlikely to reach comparable accuracy. For the recognition of industrial machine parts in factories with small samples, transfer learning based on CNN obtains very good results and can be applied in the intelligent construction of future factories.

Table 7. Accuracy of classification results of different classifiers.

https://doi.org/10.1371/journal.pone.0245735.t007

4. Conclusion

Based on transfer learning with the InceptionNet-V3 convolutional neural network model, this paper identifies and classifies 11 types of industrial machine components. Using data augmentation, different learning rates and different gradient descent optimizers, the training set, validation set and test set accuracy of the trained models are compared after 40,000 iterations of training. The best model is obtained with data augmentation, an initial learning rate of 0.01 and the Adam adaptive learning rate optimizer. Through analysis of the data set division ratio and 10-fold cross-validation, the final test set accuracy is 99.74%. Comparison with the accuracy of other classifiers shows that the method adopted in this paper performs better. This provides a basis for factories to carry out intelligent monitoring of their own parts and components in the future. Because the model is computationally complex, we will continue to study how to simplify the calculations so that the model can be applied quickly in industry.

Supporting information

S1 Appendix. Program.

https://doi.org/10.1371/journal.pone.0245735.s001

(ZIP)

S2 Appendix. Experimental data.

https://doi.org/10.1371/journal.pone.0245735.s002

(XLS)

S3 Appendix. Description of the dataset.

https://doi.org/10.1371/journal.pone.0245735.s003

(DOC)

References

1. Stryczek R. Finite point sets in recognizing location and orientation of machine parts of complex shapes. Pattern Analysis and Applications. 2019, 21.
2. Li Bin. Research on geometric dimension measurement system of shaft parts based on machine vision. EURASIP Journal on Image and Video Processing. 2018, 1, pp. 101.
3. Ghate V. N.; Dudul S. V. Optimal MLP neural network classifier for fault detection of three phase induction motor. Expert Systems with Applications. 2010, 37, pp. 3468–3481.
4. Krizhevsky A.; Sutskever I.; Hinton G. Imagenet classification with deep convolutional neural networks. Advances in neural information processing systems. 2012, 2.
5. Kitamura G.; Chung C. Y.; Moore B. E. Ankle fracture detection utilizing a convolutional neural network ensemble implemented with a small sample, de novo training, and multiview incorporation. Journal of Digital Imaging. 2019.
6. Yang J.; Wu X.; Liang J.; Sun X.; Wang L. Self-paced balance learning for clinical skin disease recognition. IEEE Transactions on Neural Networks & Learning Systems. 2019, 99, pp. 1–15.
7. Huang Q.; Li W.; Zhang B.; Li Q.; Tao R.; Lovell N. H. Blood cell classification based on hyperspectral imaging with modulated Gabor and CNN. IEEE Journal of Biomedical and Health Informatics. 2019, p. 1. pmid:30892256
8. Muhammad R.; Ghazala R.; Rockson A.; Gyu S. C.; Seong-I J. Scene Classification for Sports Video Summarization Using Transfer Learning. Sensors. 2020, 6.
9. Ying Z.; Xuan C.; Zhai Y.; Sun B.; Li J.; Deng W.; et al. TAI-SARNET: Deep Transferred Atrous-Inception CNN for Small Samples SAR ATR. Sensors. 2020, 6. pmid:32204506
10. Wei W.; Huerta E. A.; Whitmore B. C.; Lee J. C.; Stephen H.; Rupali C. Deep transfer learning for star cluster classification: I. Application to the PHANGS-HST survey. Monthly Notices of the Royal Astronomical Society.
11. Tian W.; Liao Z.; Wang X. Transfer learning for neural network model in chlorophyll-a dynamics prediction. Environmental Science and Pollution Research(C). 2019. pmid:31410825
12. Zhang C.; Qiao K.; Wang L.; Tong L.; Yan B. A visual encoding model based on deep neural networks and transfer learning for brain activity measured by functional magnetic resonance imaging. Journal of Neuroscience Methods. 2019, 325. pmid:31255596
13. Castelluccio B. C.; Kenney J. G.; Johannesen J. K. Individual alpha peak frequency moderates transfer of learning in cognitive remediation of schizophrenia. Journal of the International Neuropsychological Society. 2020, 1, pp. 19–30. pmid:31983373
14. Samala R. K.; Heang‐Ping Chan; Hadjiiski L.; Helvie M. A.; Wei J.; Cha K. Mass detection in digital breast tomosynthesis: deep convolutional neural network with transfer learning from mammography. Medical Physics. 2016, 12. pmid:27908154
15. Bruijne M. D. Machine learning approaches in medical image analysis: from detection to diagnosis. Medical image analysis. 2016, 33.
16. Peng Y.; Liao M.; Song Y.; Deng H.; Wang Y. FB-CNN: Feature Fusion-Based Bilinear CNN for Classification of Fruit Fly Image. IEEE Access. 2020, 8, pp. 3987–3995.
17. Anthimopoulos M.; Christodoulidis S.; Ebner L.; Christe A.; Mougiakakou S. Lung pattern classification for interstitial lung diseases using a deep convolutional neural network. IEEE Transactions on Medical Imaging. 2016, 5, pp. 1207–1216. pmid:26955021
18. Matsugu M.; Mori K.; Mitari Y.; Kaneda Y. Subject independent facial expression recognition with robust face detection using a convolutional neural network. Neural Networks. 2003, 16, pp. 555–559. pmid:12850007
19. Jin K. H.; Mccann M. T.; Froustey E.; Unser M. Deep convolutional neural network for inverse problems in imaging. IEEE Transactions on Image Processing. 2017, p. 1.
20. Lu J.; Yang J.; Batra D.; Parikh D. Hierarchical question-image co-attention for visual question answering.
21. He K.; Zhang X.; Ren S.; Sun J. Deep Residual Learning for Image Recognition. IEEE Conference on Computer Vision & Pattern Recognition. IEEE Computer Society. 2016.
22. Minhas R. A.; Javed A.; Irtaza A.; Mahmood M. T.; Joo Y. B. Shot classification of field sports videos using AlexNet Convolutional Neural Network. Applied Sciences. 2019, 9.
23. Rastegari M.; Ordonez V.; Redmon J.; Farhadi A. XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks. European Conference on Computer Vision. Springer, Cham. 2016.
24. Saikia A. R.; Bora K.; Mahanta L. B.; Das A. K. Comparative assessment of cnn architectures for classification of breast fnac images. Tissue and Cell. 2019. pmid:30947968
25. Raghu S.; Sriraam N.; Temel Y.; Rao S. V.; Kubben P. L. EEG based multi-class seizure type classification using convolutional neural network and transfer learning. Neural Networks. 2020, 124, pp. 202–212. pmid:32018158
26. Zhu X.; Zuo J.; Ren H. A modified deep neural network enables identification of foliage under complex background. Connection Science. 2019, 4, pp. 1–15.
27. Kim D. H.; Mackinnon T. Artificial intelligence in fracture detection: transfer learning from deep convolutional neural networks. Clinical Radiology. 2017. pmid:29269036
28. Sun Z.; Sun Y. Automatic detection of retinal regions using fully convolutional networks for diagnosis of abnormal maculae in optical coherence tomography images. Journal of biomedical optics. 2019, 5. pmid:31111697
29. Efendioglu H. S.; Yildirim T.; Fidanboylu K. Prediction of force measurements of a microbend sensor based on an artificial neural network. Sensors. 2009, 9, pp. 7167–7176. pmid:22399991
30. Bowling M.; Veloso M. Multiagent learning using a variable learning rate. Artificial Intelligence. 2002, 2, pp. 215–250.
31. Tan J.; Chen C. B. Deep learning the holographic black hole with charge. International Journal of Modern Physics D. 2019.
32. Bernardoni F.; Geisler D.; King J. A.; Javadi A. H.; Ehrlich S. Altered medial frontal feedback learning signals in anorexia nervosa. Biological Psychiatry. 2017, 3. pmid:29025688
33. Arab A.; Alfi A. An adaptive gradient descent-based local search in memetic algorithm applied to optimal controller design. Information Sciences. 2014. pmid:32226109
34. Gao Y.; Biguri A.; Blumensath T. Block stochastic gradient descent for large-scale tomographic reconstruction in a parallel network. IEEE Transactions on Parallel and Distributed Systems. 2019.
35. Chen Y.; Chi Y.; Fan J.; Ma C. Gradient descent with random initialization: fast global convergence for nonconvex phase retrieval. Mathematical Programming. 2018.
36. Vabalas A.; Gowen E.; Poliakoff E.; Casson A. J. Machine learning algorithm validation with a limited sample size. PLoS ONE. 2019, 14(11). pmid:31697686

