Abstract
This paper presents a new adaptive binarization method for degraded document images. The degradations addressed are variable background, non-uniform illumination, and blur caused by humidity. The proposed method has four steps: contrast analysis, which calculates the local contrast threshold; contrast stretching; thresholding, which computes a global threshold; and noise removal, which improves the quality of the binarized image. The proposed method is evaluated using optical character recognition, visual criteria, and established measures: execution time, F-measure, peak signal-to-noise ratio, negative rate metric, and information to noise difference. The method is tested on four types of datasets, including the Document Image Binarization Contest (DIBCO) series datasets (DIBCO 2009, H-DIBCO 2010, and DIBCO 2011), which contain a variety of degraded document images. On the basis of these evaluation measures, the results are promising: the proposed method performs well in extensive comparisons against eight techniques from the literature.
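Three of the measures named above have widely used definitions in the DIBCO evaluations, and a minimal sketch of them may help make the comparison concrete. The code below assumes binary images stored as nested lists with 1 for text pixels and 0 for background; the function names are illustrative and not taken from the paper, and the exact formulas used by the authors may differ in detail. Information to noise difference is omitted because its definition is not given here.

```python
import math

def confusion_counts(pred, truth):
    """Count (TP, FP, FN, TN) over two equally sized binary images (1 = text)."""
    tp = fp = fn = tn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1
            elif p:
                fp += 1
            elif t:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn

def f_measure(tp, fp, fn):
    """Harmonic mean of precision and recall (assumes tp > 0)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def psnr(fp, fn, total):
    """Peak signal-to-noise ratio in dB for binary images (peak value 1)."""
    mse = (fp + fn) / total
    return math.inf if mse == 0 else 10 * math.log10(1 / mse)

def nrm(tp, fp, fn, tn):
    """Negative rate metric: mean of the false negative and false positive rates.

    Assumes the ground truth contains both text and background pixels.
    """
    return (fn / (fn + tp) + fp / (fp + tn)) / 2

# Toy example (assumed data, not from the paper).
truth = [[1, 1, 0, 0],
         [1, 0, 0, 0]]
pred = [[1, 0, 0, 0],
        [1, 1, 0, 0]]

tp, fp, fn, tn = confusion_counts(pred, truth)
print(f_measure(tp, fp, fn), psnr(fp, fn, tp + fp + fn + tn), nrm(tp, fp, fn, tn))
```

Higher F-measure and PSNR indicate a binarization closer to the ground truth, whereas a lower NRM is better.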
Cite this article
Singh, B.M., Sharma, R., Ghosh, D. et al. Adaptive binarization of severely degraded and non-uniformly illuminated documents. IJDAR 17, 393–412 (2014). https://doi.org/10.1007/s10032-014-0219-6
DOI: https://doi.org/10.1007/s10032-014-0219-6