Active learning using a self-correcting neural network (ALSCN)


Abstract

Data labeling is a major obstacle in the development of new models: the performance of machine learning models depends directly on the quality of the datasets used to train them, and labeling requires substantial manual effort. Labeling the entire dataset is not always necessary, however, because not every item in an image dataset contributes equally to the training process. Active learning, or guided labeling, is one attempt to automate and speed up labeling as much as possible. In this study we present a novel active learning algorithm (ALSCN) that combines two networks, a convolutional neural network (CNN) and a self-correcting neural network (SCN). The convolutional network is trained using only manually labeled data and, once trained, predicts labels for the unlabeled items. The SCN is then trained with all available items, some of which are manually labeled while the rest carry the labels predicted by the first network. After training, the SCN predicts new labels for all available items, and these are compared with the labels used for its training. Items where the two disagree are selected for manual labeling and added to the set of previously manually labeled items. The convolutional network is then retrained with the extended dataset and the steps above are repeated. Our experiments show that a network trained on items selected by the proposed method outperforms a network trained on the same number of items drawn at random from the available pool. Items are selected from the complete datasets over several iterations and used to train the models. The accuracy of models trained with the selected items matched or exceeded the accuracy of models trained with the entire dataset, which indicates how much of the manual labeling effort can be saved. The efficiency of the proposed algorithm is evaluated on three datasets (MNIST, Fashion MNIST, and CIFAR-10). The final results show that manual labeling is required for only 6.11% (3,667/60,000), 23.92% (14,353/60,000), and 59.4% (29,704/50,000) of the items for MNIST, Fashion MNIST, and CIFAR-10, respectively.
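
The selection loop described above can be summarized in code. The sketch below is a minimal, framework-agnostic outline of one ALSCN iteration, assuming generic `train`, `predict`, and `query_oracle` helpers that are not part of the paper; it is intended only to illustrate the disagreement-based selection step, not the authors' implementation.

```python
# Hypothetical sketch of one ALSCN selection round as described in the abstract.
# `train`, `predict`, and `query_oracle` are placeholder helpers (assumptions),
# not the authors' code.
import numpy as np

def alscn_round(cnn, scn, x_labeled, y_labeled, x_unlabeled,
                train, predict, query_oracle):
    # 1. Train the convolutional network on manually labeled items only.
    train(cnn, x_labeled, y_labeled)

    # 2. Use the CNN to assign pseudo-labels to the unlabeled items.
    y_pseudo = predict(cnn, x_unlabeled)

    # 3. Train the self-correcting network (SCN) on all available items:
    #    manual labels plus CNN pseudo-labels.
    x_all = np.concatenate([x_labeled, x_unlabeled])
    y_all = np.concatenate([y_labeled, y_pseudo])
    train(scn, x_all, y_all)

    # 4. Let the SCN re-predict labels for every item and compare them with
    #    the labels it was trained on; disagreements mark informative items.
    y_scn = predict(scn, x_all)
    disagree = y_scn != y_all
    selected = disagree[len(x_labeled):]          # consider only unlabeled items

    # 5. Ask the human annotator (oracle) for true labels of the selected items
    #    and move them into the manually labeled pool for the next round.
    x_query = x_unlabeled[selected]
    y_query = query_oracle(x_query)
    x_labeled = np.concatenate([x_labeled, x_query])
    y_labeled = np.concatenate([y_labeled, y_query])
    x_unlabeled = x_unlabeled[~selected]
    return x_labeled, y_labeled, x_unlabeled
```

In this reading, the rounds repeat until the SCN's predictions agree with the training labels (few or no items are selected) or the labeling budget is exhausted, at which point the convolutional network is trained once more on the final labeled pool.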


Author information

Corresponding author

Correspondence to Velibor Ilić.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article

Cite this article

Ilić, V., Tadić, J. Active learning using a self-correcting neural network (ALSCN). Appl Intell 52, 1956–1968 (2022). https://doi.org/10.1007/s10489-021-02515-y

