Abstract
Proper processing and efficient representation of the digitized images of printed documents require the separation of the various information types: text, graphics, and image elements. For most applications it is sufficient to separate text and nontext, because text contains the most information. This paper describes the implementation and performance of a robust algorithm for text extraction and segmentation that is completely independent of text orientation and can deal with text in various font styles and sizes. Text objects can be nested in nontext areas, and inverse printing can also be analyzed. The classification is based only on rough image features; individual characters are not recognized. The three main processing steps of the system are the generation of connected components, neighborhood analysis, and generation of text lines and blocks. As output, connected components are classified as text or nontext. Text components are grouped as characters, words, lines, and blocks. Nontext objects are accumulated as a separate nontext block.
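To make the first processing step concrete, the following is a minimal sketch of connected-component generation on a small binary image, followed by a crude text/nontext split. It is not the authors' implementation: the breadth-first labeling, the 8-connectivity, the function names `connected_components` and `classify`, and the `max_text_size` threshold are all assumptions chosen for illustration. In particular, the size threshold is only a stand-in for the neighborhood analysis named in the abstract, which is not specified here.

```python
from collections import deque


def connected_components(image):
    """Return bounding boxes (top, left, bottom, right) of 8-connected
    foreground regions in a binary image given as a list of 0/1 rows."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1 or seen[r][c]:
                continue
            # Start a breadth-first flood fill from this unvisited pixel.
            seen[r][c] = True
            queue = deque([(r, c)])
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] == 1
                                and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def classify(box, max_text_size=4):
    """Hypothetical rule: small components count as text, large as nontext."""
    top, left, bottom, right = box
    height, width = bottom - top + 1, right - left + 1
    return "text" if max(height, width) <= max_text_size else "nontext"


# Toy page: two small blobs (character-like) and one long rule (graphic-like).
page = [
    [1, 1, 0, 0, 0, 0, 0, 0],
    [1, 1, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1, 1],
]
boxes = connected_components(page)
labels = [classify(b) for b in boxes]
```

Like the approach described in the abstract, the sketch uses only coarse geometric features of each component and never attempts to recognize a character. A fuller version would replace the size threshold with a comparison of each component against its neighbors before grouping text components into words, lines, and blocks.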
Cite this article
Hönes, F., Lichter, J. Layout extraction of mixed mode documents. Machine Vis. Apps. 7, 237–246 (1994). https://doi.org/10.1007/BF01213414