19 January 2009 Character recognition in the presence of occluding clutter
Proceedings Volume 7247, Document Recognition and Retrieval XVI; 72470I (2009) https://doi.org/10.1117/12.805855
Event: IS&T/SPIE Electronic Imaging, 2009, San Jose, California, United States
Abstract
Many documents contain free-hand underlining, "COPY" stamps, crossed-out text, doodles and other "clutter" that occludes the text. In many cases, it is not possible to separate the text from the clutter, and commercial OCR solutions typically fail on cluttered text. We present a new method for locating the clutter using path analysis of points on the skeleton of the connected component formed by the clutter and the text. This method can separate the clutter from the text even for fairly complex clutter shapes. Even with good localization of the occluding clutter, feature-based recognition of occluded characters is difficult, because the clutter affects the features in various ways. We propose a new algorithm that uses adapted templates of the font in the document and can handle all forms of character occlusion. The method simulates the localized clutter at the corresponding position in each template and compares only the unaffected parts of the template and the character. The method has proved highly successful even when much of the character is occluded. We present examples of clutter localization and of recognition of occluded characters.
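The idea of comparing only the unaffected parts of a template and a character can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not describe how templates are adapted or scored, so the function names (`masked_score`, `classify`), the tiny 5x3 bitmaps, and the use of a pixel-agreement fraction as the score are all assumptions made for illustration. The sketch also assumes that the clutter mask is already known and that the character is aligned and scaled to the templates.

```python
from typing import Dict, List, Tuple

Bitmap = List[List[int]]  # 1 = ink, 0 = background


def masked_score(char: Bitmap, template: Bitmap, clutter: Bitmap) -> float:
    """Fraction of clutter-free pixels on which char and template agree."""
    agree = 0
    total = 0
    for char_row, tmpl_row, clut_row in zip(char, template, clutter):
        for c, t, m in zip(char_row, tmpl_row, clut_row):
            if m:  # pixel covered by clutter: excluded from the comparison
                continue
            total += 1
            agree += int(c == t)
    return agree / total if total else 0.0


def classify(char: Bitmap, templates: Dict[str, Bitmap],
             clutter: Bitmap) -> Tuple[str, float]:
    """Return the label of the best-scoring template and its score."""
    best = max(templates, key=lambda k: masked_score(char, templates[k], clutter))
    return best, masked_score(char, templates[best], clutter)


# Hypothetical toy font (illustration only).
TEMPLATES: Dict[str, Bitmap] = {
    "T": [[1, 1, 1], [0, 1, 0], [0, 1, 0], [0, 1, 0], [0, 1, 0]],
    "I": [[0, 1, 0], [0, 1, 0], [0, 1, 0], [0, 1, 0], [0, 1, 0]],
    "L": [[1, 0, 0], [1, 0, 0], [1, 0, 0], [1, 0, 0], [1, 1, 1]],
}

# A "T" struck through by a horizontal line on the middle row.
clutter: Bitmap = [[0, 0, 0], [0, 0, 0], [1, 1, 1], [0, 0, 0], [0, 0, 0]]
observed: Bitmap = [[1, 1, 1], [0, 1, 0], [1, 1, 1], [0, 1, 0], [0, 1, 0]]

print(classify(observed, TEMPLATES, clutter))  # ('T', 1.0)
```

Because the struck-through row is excluded from both the character and every template, the "T" template matches the remaining twelve pixels exactly, while "I" and "L" are penalized only on pixels that the clutter does not touch.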
© (2009) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Knut T. Fosseide and Lars Aurdal "Character recognition in the presence of occluding clutter", Proc. SPIE 7247, Document Recognition and Retrieval XVI, 72470I (19 January 2009); https://doi.org/10.1117/12.805855
KEYWORDS
Optical character recognition
Associative arrays
Detection and tracking algorithms
Binary data
Error analysis
Image filtering
Image processing