A multi-pronged accurate approach to optical character recognition, using nearest neighborhood and neural-network-based principles

Kumar, G Kishor; Kumar, R Raja; Chakka, Ram; Viswanath, P

doi:10.1007/s12046-021-01703-3

Published: 13 September 2021

Sādhanā, Volume 46, article number 189 (2021)

G Kishor Kumar¹,
R Raja Kumar¹,
Ram Chakka² &
P Viswanath³

Abstract

Digital systems play a vital role in applications such as banking, finance, healthcare, manufacturing and security, and that role is becoming wider and more crucial. In many of these applications, accurately identifying and recognizing a character or a digit is significant, especially in banking, finance and other sectors where an error can cause substantial loss or damage. This is the optical character recognition (OCR) problem. In this context, the contribution of this paper is twofold. First, we propose a multi-layer perceptron (MLP) neural network architecture, comprising an input layer, hidden layers and an output layer, as an effective method for OCR. The architecture builds a model that learns representations from the input data, and these representations are then used to classify unknown data. The performance of the proposed MLP method is compared with that of existing nearest neighborhood methods: condensed nearest neighbor (CNN), modified condensed nearest neighbor (MCNN) and other class nearest neighbor (OCNN). For all four methods, posterior and conditional probabilities pertaining to recognition are estimated and validated on the test data (OCR and Pendigits). Using these posterior probabilities, the probabilities of detection of a newly drawn character or digit can be estimated. The proposed model outperforms the existing methods. Second, in certain critical applications it is very important to achieve the highest possible accuracy, even if doing so is expensive. To this end, a multi-pronged approach based on these four methods is developed, in order to improve the accuracy and to estimate it both when multiple methods concur and when they do not.
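
The idea of estimating accuracy according to whether several methods concur can be illustrated with a minimal sketch. It is not the authors' procedure, which the abstract does not specify. It assumes a held-out validation set on which each of the four methods (MLP, CNN, MCNN, OCNN) has already produced a label, and it estimates the empirical probability that the most-voted label is correct, grouped by how many methods agree on it. The function name `agreement_accuracy` and the toy data are hypothetical.

```python
from collections import Counter, defaultdict


def agreement_accuracy(predictions, labels):
    """Empirical P(most-voted label is correct | size of the largest agreeing group).

    predictions: one tuple per sample, holding one predicted label per method.
    labels:      the true label of each sample.
    Returns {k: (accuracy, sample_count)}, where k is the number of methods
    that voted for the most common label.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for preds, truth in zip(predictions, labels):
        # Most common label and its vote count; equal counts keep first-seen order.
        top_label, k = Counter(preds).most_common(1)[0]
        totals[k] += 1
        if top_label == truth:
            hits[k] += 1
    return {k: (hits[k] / totals[k], totals[k]) for k in sorted(totals)}


# Hypothetical validation outputs, ordered as (MLP, CNN, MCNN, OCNN).
toy_predictions = [
    ("a", "a", "a", "a"),
    ("b", "b", "b", "c"),
    ("c", "c", "d", "d"),
    ("a", "a", "a", "a"),
    ("b", "b", "c", "d"),
    ("d", "d", "d", "a"),
]
toy_labels = ["a", "b", "a", "a", "b", "a"]

for k, (acc, n) in agreement_accuracy(toy_predictions, toy_labels).items():
    print(f"{k} methods agree: accuracy {acc:.2f} over {n} samples")
```

In a setting like the one the abstract describes, estimates of this kind could be used to decide how far to trust a prediction: a label on which all methods concur would be accepted with the accuracy estimated for full agreement, while a split vote would carry a lower estimate.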

Author information

Authors and Affiliations

Department of Computer Science and Engineering, Rajeev Gandhi Memorial College of Engineering and Technology, Nandyal, India
G Kishor Kumar & R Raja Kumar
Independent Consultant: Projects, Research, Academics, Hyderabad, India
Ram Chakka
Department of Computer Science and Engineering, IIIT, Sri City, Chittoor, India
P Viswanath

Corresponding author

Correspondence to G Kishor Kumar.

About this article

Cite this article

Kumar, G.K., Kumar, R.R., Chakka, R. et al. A multi-pronged accurate approach to optical character recognition, using nearest neighborhood and neural-network-based principles. Sādhanā 46, 189 (2021). https://doi.org/10.1007/s12046-021-01703-3

Received: 19 August 2019
Revised: 16 July 2021
Accepted: 30 July 2021
Published: 13 September 2021
DOI: https://doi.org/10.1007/s12046-021-01703-3
