Abstract
There is considerable interest in designing automatic systems that scan a paper document and store it on electronic media for easier storage, manipulation, and access. Most documents contain graphics and images in addition to text, so the document image has to be segmented to identify the text regions and OCR techniques can then be applied only to those regions. In this paper, we present a simple method for document image segmentation in which the text regions of a given document image are identified automatically. The method is based on a multichannel filtering approach to texture segmentation. The text in the document is treated as a textured region, and nontext contents, such as blank spaces, graphics, and pictures, are treated as regions with different textures. The problem of segmenting a document image into text and nontext regions can therefore be posed as a texture segmentation problem. Two-dimensional Gabor filters, which have been used extensively for a variety of texture segmentation tasks, are used to extract texture features for each of these regions. Our segmentation method does not assume any a priori knowledge about the content or font styles of the document, and is shown to work even for skewed images and handwritten text. Results are presented for several test images, which demonstrate the robustness of the technique.
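The multichannel idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the kernel size, the Gaussian width, the use of squared filter responses smoothed by a Gaussian as the texture feature, and the two-cluster k-means step are all assumptions chosen to keep the example short, and every function name is hypothetical. The synthetic image stands in for a page whose left half contains text-like horizontal stripes and whose right half is blank.

```python
import numpy as np

def gabor_kernel(freq, theta, sigma=4.0, size=17):
    """Even-symmetric Gabor kernel: a cosine grating under a Gaussian envelope."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    # Rotate coordinates so the grating varies along direction theta.
    xr = x * np.cos(theta) + y * np.sin(theta)
    kernel = np.exp(-(x**2 + y**2) / (2.0 * sigma**2)) * np.cos(2.0 * np.pi * freq * xr)
    return kernel - kernel.mean()  # remove DC so flat regions give ~zero response

def gaussian_kernel(sigma=4.0, size=17):
    """Normalized Gaussian used to average the local energy (assumed feature)."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    g = np.exp(-(x**2 + y**2) / (2.0 * sigma**2))
    return g / g.sum()

def convolve_fft(image, kernel):
    """Circular convolution via the FFT; adequate for a sketch."""
    padded = np.zeros_like(image, dtype=float)
    kh, kw = kernel.shape
    padded[:kh, :kw] = kernel
    # Move the kernel centre to the origin so the output is not shifted.
    padded = np.roll(padded, (-(kh // 2), -(kw // 2)), axis=(0, 1))
    return np.real(np.fft.ifft2(np.fft.fft2(image) * np.fft.fft2(padded)))

def texture_features(image, freqs, thetas):
    """One smoothed-energy feature image per (frequency, orientation) channel."""
    smooth = gaussian_kernel()
    channels = []
    for f in freqs:
        for t in thetas:
            response = convolve_fft(image, gabor_kernel(f, t))
            channels.append(convolve_fft(response**2, smooth))
    return np.stack(channels, axis=-1)

def two_means(features, iterations=10):
    """Two-cluster k-means on per-pixel feature vectors (assumed classifier)."""
    flat = features.reshape(-1, features.shape[-1])
    total = flat.sum(axis=1)
    # Deterministic start: lowest-energy and highest-energy pixels.
    centers = np.stack([flat[total.argmin()], flat[total.argmax()]])
    for _ in range(iterations):
        dist = ((flat[:, None, :] - centers[None, :, :])**2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for k in range(2):
            if np.any(labels == k):
                centers[k] = flat[labels == k].mean(axis=0)
    # Call the higher-energy cluster "text".
    text_cluster = centers.sum(axis=1).argmax()
    return (labels == text_cluster).reshape(features.shape[:-1])

# Synthetic page: stripes with a 4-pixel period on the left, blank on the right.
image = np.zeros((64, 64))
image[(np.arange(64) % 4) < 2, :32] = 1.0

features = texture_features(image, freqs=[0.25], thetas=[0.0, np.pi / 2])
text_mask = two_means(features)  # True where the pixel is labelled as text
```

On this toy input the channel tuned to horizontal stripes carries nearly all of the energy, so the two clusters separate the striped half from the blank half. A real page would need several frequencies and orientations and a boundary treatment other than circular wrap-around.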
Additional information
This work was supported by the National Science Foundation under NSF grant CDA-88-06599 and by a grant from E. I. Du Pont De Nemours & Company.
Cite this article
Jain, A.K., Bhattacharjee, S. Text segmentation using Gabor filters for automatic document processing. Machine Vis. Apps. 5, 169–184 (1992). https://doi.org/10.1007/BF02626996
DOI: https://doi.org/10.1007/BF02626996