Abstract
As a precursor to optical character recognition (OCR), script identification has many applications, such as sorting and indexing document images. Classifying scripts, especially at different scales and orientations, is an important problem in document image analysis. This paper proposes an algorithm for script identification using scale- and rotation-robust log-polar wavelet and semi-decimated wavelet features. First, words are segmented from document images as text blobs using a Gaussian filter. Texture features are then computed using a combination of discrete wavelet and semi-decimated discrete wavelet transforms in the log-polar domain. Most rotational and scale variations are removed in the log-polar domain, while the wavelet transform extracts information at different resolution levels, which yields discriminative textures for characterization. Finally, a k-nearest neighbor classifier identifies the script. Comprehensive experiments on different databases illustrate the effectiveness of the proposed algorithm. Benchmarking analysis shows a maximum recall rate of 98.96% and better performance than other contemporary approaches.
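The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a nearest-neighbour log-polar resampling, a plain averaging Haar transform in place of the paper's discrete and semi-decimated wavelet transforms, mean absolute subband values as texture features, and Euclidean majority-vote k-NN. The Gaussian text-blob segmentation step is omitted. All function names and parameters (`log_polar`, `haar2d`, `texture_features`, `knn_predict`, `n_rho`, `n_theta`, `levels`) are hypothetical.

```python
import math
from collections import Counter


def log_polar(img, n_rho=16, n_theta=16):
    """Resample a grayscale image (list of rows) onto a log-polar grid.

    Radii grow geometrically up to the largest centred circle, so a scale
    change becomes a shift along rho and a rotation a shift along theta.
    """
    h, w = len(img), len(img[0])
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    r_max = min(cy, cx)
    out = []
    for i in range(n_rho):
        r = r_max ** ((i + 1) / n_rho)
        row = []
        for j in range(n_theta):
            t = 2.0 * math.pi * j / n_theta
            # Nearest-neighbour lookup, clamped to the image bounds.
            y = min(h - 1, max(0, int(round(cy + r * math.sin(t)))))
            x = min(w - 1, max(0, int(round(cx + r * math.cos(t)))))
            row.append(float(img[y][x]))
        out.append(row)
    return out


def haar2d(a):
    """One level of an averaging 2-D Haar transform on 2x2 blocks.

    Returns the approximation band and three detail bands (column
    difference, row difference, diagonal difference).
    """
    ll, dh, dv, dd = [], [], [], []
    for i in range(0, len(a) - 1, 2):
        rows = ([], [], [], [])
        for j in range(0, len(a[0]) - 1, 2):
            p, q, r, s = a[i][j], a[i][j + 1], a[i + 1][j], a[i + 1][j + 1]
            rows[0].append((p + q + r + s) / 4.0)
            rows[1].append((p - q + r - s) / 4.0)
            rows[2].append((p + q - r - s) / 4.0)
            rows[3].append((p - q - r + s) / 4.0)
        for band, row in zip((ll, dh, dv, dd), rows):
            band.append(row)
    return ll, dh, dv, dd


def energy(band):
    """Mean absolute coefficient value of one subband."""
    values = [abs(v) for row in band for v in row]
    return sum(values) / len(values)


def texture_features(img, levels=2):
    """Detail-band energies per level, plus the final approximation energy."""
    band = log_polar(img)
    feats = []
    for _ in range(levels):
        band, *details = haar2d(band)
        feats.extend(energy(d) for d in details)
    feats.append(energy(band))
    return feats


def knn_predict(train, query, k=3):
    """Majority vote among the k training samples nearest to the query.

    `train` is a list of (feature_vector, label) pairs.
    """
    nearest = sorted(train, key=lambda s: math.dist(s[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]
```

In use, `texture_features` would be applied to each segmented word image (at least a few pixels on each side), and the resulting vectors, paired with script labels, would form the `train` list passed to `knn_predict`. A semi-decimated (undecimated) transform would skip the downsampling in `haar2d`, which makes the features less sensitive to shifts in the log-polar grid.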
Cite this article
Sahare, P., Dhok, S.B. Script pattern identification of word images using multi-directional and multi-scalable textures. J Ambient Intell Human Comput 12, 9739–9755 (2021). https://doi.org/10.1007/s12652-020-02718-0
DOI: https://doi.org/10.1007/s12652-020-02718-0