NN-based analytic approach to symbol level recognition for degraded Bengali printed documents

Published in Sādhanā

Abstract

The analysis of degraded printed documents has been an active research topic for the last several years. This article contributes a method for segmenting word images into symbols and for recognizing those symbols in degraded printed document images of Bengali, the 7th most popular language in the world. We propose a novel approach to symbol-level segmentation based on a Multilayer Perceptron (MLP) network. A database of segmenting and non-segmenting image columns is developed from the ISIDDI page-level database, and segmentation is treated as a two-class classification problem. The MLP weights are learnt from this database using the backpropagation algorithm. We introduce certain new metrics, on the basis of which the F-score of the proposed segmentation algorithm is determined. Our method uses the information that is relevant for character segmentation and ignores other highly variable information contained in a printed text document, thus allowing efficient transfer learning between datasets and alleviating the need for labelled training data. In addition to Bengali, we have tested the method on English, Tamil and Devnagari scripts. For classification, we have identified 336 symbols and developed the corresponding training and test sets from the ISIDDI database. Two classifiers, one CNN based and the other LSTM based, have been developed for this 336-class problem. The classification accuracies obtained on the test set by the CNN and LSTM classifiers are 86.05% and 88.11%, respectively. The proposed classifiers outperform the existing classifiers for the ISIDDI database.
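To make the two-class formulation concrete, the following is a minimal sketch, not the authors' implementation. It assumes each image column is represented as a vector of binary pixels, uses synthetic stand-in data instead of the ISIDDI database, and trains a small one-hidden-layer MLP with plain backpropagation. The function names (make_columns, train_mlp, predict, f_score), the layer sizes and the learning rate are all hypothetical, and the F-score shown is the standard one rather than the new metrics introduced in the article.

```python
import numpy as np

def make_columns(n, height=16, seed=0):
    """Synthetic stand-in data (assumption, not ISIDDI).
    Label 1 = segmenting column (mostly background),
    label 0 = non-segmenting column (crosses ink strokes)."""
    rng = np.random.default_rng(seed)
    labels = rng.integers(0, 2, size=n)
    ink_prob = np.where(labels == 1, 0.05, 0.5)
    cols = (rng.random((n, height)) < ink_prob[:, None]).astype(float)
    return cols, labels.astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_mlp(x, y, hidden=8, epochs=500, lr=0.5, seed=0):
    """One-hidden-layer MLP trained by full-batch backpropagation
    on the mean cross-entropy loss."""
    rng = np.random.default_rng(seed)
    w1 = rng.normal(0.0, 0.5, (x.shape[1], hidden))
    b1 = np.zeros(hidden)
    w2 = rng.normal(0.0, 0.5, hidden)
    b2 = 0.0
    for _ in range(epochs):
        # Forward pass.
        h = sigmoid(x @ w1 + b1)
        p = sigmoid(h @ w2 + b2)
        # Backward pass: error at the output, then at the hidden layer.
        d_out = (p - y) / len(y)
        d_hid = np.outer(d_out, w2) * h * (1.0 - h)
        # Gradient-descent updates.
        w2 -= lr * (h.T @ d_out)
        b2 -= lr * d_out.sum()
        w1 -= lr * (x.T @ d_hid)
        b1 -= lr * d_hid.sum(axis=0)
    return w1, b1, w2, b2

def predict(params, x):
    """Return 1.0 for columns classified as segmenting, else 0.0."""
    w1, b1, w2, b2 = params
    p = sigmoid(sigmoid(x @ w1 + b1) @ w2 + b2)
    return (p >= 0.5).astype(float)

def f_score(y_true, y_pred):
    """Standard F-score of the segmenting class (label 1)."""
    tp = np.sum((y_pred == 1) & (y_true == 1))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return float(2 * precision * recall / (precision + recall))

if __name__ == "__main__":
    x_train, y_train = make_columns(400, seed=0)
    x_test, y_test = make_columns(200, seed=1)
    params = train_mlp(x_train, y_train)
    print("F-score on synthetic columns:", f_score(y_test, predict(params, x_test)))
```

In a real pipeline the columns predicted as segmenting would mark candidate cut positions inside a word image, and the resulting symbol images would then be passed to the 336-class CNN or LSTM classifier.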

Author information

Corresponding author

Correspondence to Jayati Mukherjee.

About this article

Cite this article

Mukherjee, J., Parui, S.K. & Roy, U. NN-based analytic approach to symbol level recognition for degraded Bengali printed documents. Sādhanā 45, 263 (2020). https://doi.org/10.1007/s12046-020-01492-1

  • DOI: https://doi.org/10.1007/s12046-020-01492-1
