Layout extraction of mixed mode documents

Machine Vision and Applications

Abstract

Proper processing and efficient representation of the digitized images of printed documents require the separation of the various information types: text, graphics, and image elements. For most applications it is sufficient to separate text and nontext, because text contains the most information. This paper describes the implementation and performance of a robust algorithm for text extraction and segmentation that is completely independent of text orientation and can deal with text in various font styles and sizes. Text objects can be nested in nontext areas, and inverse printing can also be analyzed. It should be mentioned that the classification is based only on rough image features, and individual characters are not recognized. The three main processing steps of the system are the generation of connected components, neighborhood analysis, and generation of text lines and blocks. As output, connected components are classified as text or nontext. Text components are grouped as characters, words, lines, and blocks. Nontext objects are accumulated as a separate nontext block.
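The first of the three processing steps named in the abstract, the generation of connected components, can be illustrated with a minimal sketch. The abstract does not say how components are generated, so the code below is not the authors' method: it assumes a binarized page given as rows of 0/1 values, uses a plain breadth-first flood fill with 8-connectivity as a stand-in, and the function name `connected_components` is hypothetical. It returns one bounding box per component, the kind of rough feature on which later neighborhood analysis could operate.

```python
from collections import deque


def connected_components(image):
    """Find 8-connected foreground components (pixel value 1).

    `image` is a list of equal-length rows of 0/1 ints (an assumed input
    format). Returns a list of bounding boxes
    (min_row, min_col, max_row, max_col), one per component, ordered by
    the raster-scan position of each component's first pixel.
    """
    rows = len(image)
    cols = len(image[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1 or seen[r][c]:
                continue
            # Start a new component and flood-fill it breadth-first.
            seen[r][c] = True
            queue = deque([(r, c)])
            r0, c0, r1, c1 = r, c, r, c
            while queue:
                y, x = queue.popleft()
                # Grow the bounding box to include the current pixel.
                r0, c0 = min(r0, y), min(c0, x)
                r1, c1 = max(r1, y), max(c1, x)
                # Visit all 8 neighbors (the center pixel is already seen).
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] == 1
                                and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((r0, c0, r1, c1))
    return boxes
```

Because the fill only looks at pixel adjacency, the result does not depend on the orientation of the text, which is in the spirit of the orientation independence the abstract claims. Classifying the resulting boxes as text or nontext would require further features and thresholds that the abstract does not give.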

Author information

Corresponding author

Correspondence to Frank Hönes.

About this article

Cite this article

Hönes, F., Lichter, J. Layout extraction of mixed mode documents. Machine Vis. Apps. 7, 237–246 (1994). https://doi.org/10.1007/BF01213414

  • DOI: https://doi.org/10.1007/BF01213414
