A Head-Mounted Device for Recognizing Text in Natural Scenes

Merino-Gracia, Carlos; Lenc, Karel; Mirmehdi, Majid

doi:10.1007/978-3-642-29364-1_3

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 7139)

Included in the following conference series:

International Workshop on Camera-Based Document Analysis and Recognition

Abstract

We present a mobile head-mounted device for detecting and tracking text, encased in an ordinary flat-cap hat. The main parts of the device are an integrated camera and audio webcam together with a simple remote control system, all connected via a USB hub to a laptop. We propose a near real-time text detection algorithm (around 14 fps for 640×480 images) that uses Maximally Stable Extremal Regions (MSERs) for image segmentation. We also present comparative text detection results on the ICDAR 2003 text locating competition database, along with performance figures.
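The abstract does not say how detections are scored against the ICDAR 2003 database. As a minimal sketch, assuming the evaluation follows the rectangle-matching precision, recall, and f measure defined for the ICDAR 2003 robust reading competition, the scoring could look like the following. The function names and the `(x_min, y_min, x_max, y_max)` rectangle layout are illustrative assumptions, not taken from the paper.

```python
from typing import List, Tuple

# Assumed rectangle layout (hypothetical): (x_min, y_min, x_max, y_max)
Rect = Tuple[float, float, float, float]


def area(r: Rect) -> float:
    """Area of a rectangle; degenerate or inverted rectangles count as 0."""
    return max(0.0, r[2] - r[0]) * max(0.0, r[3] - r[1])


def pair_match(a: Rect, b: Rect) -> float:
    """Intersection area divided by the area of the smallest box enclosing both."""
    inter = (max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3]))
    hull = (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))
    hull_area = area(hull)
    return area(inter) / hull_area if hull_area > 0 else 0.0


def best_match(r: Rect, rects: List[Rect]) -> float:
    """Best pairwise match of r against a set of rectangles (0 if the set is empty)."""
    return max((pair_match(r, other) for other in rects), default=0.0)


def precision_recall_f(
    estimates: List[Rect], targets: List[Rect], alpha: float = 0.5
) -> Tuple[float, float, float]:
    """Precision, recall, and f for one set of estimated and ground-truth boxes."""
    p = sum(best_match(e, targets) for e in estimates) / len(estimates) if estimates else 0.0
    r = sum(best_match(t, estimates) for t in targets) / len(targets) if targets else 0.0
    # f combines p and r as a weighted harmonic mean; alpha = 0.5 weights them equally.
    f = 1.0 / (alpha / p + (1.0 - alpha) / r) if p > 0 and r > 0 else 0.0
    return p, r, f


# Two ground-truth words, one of which is detected exactly.
targets = [(0, 0, 10, 10), (20, 0, 30, 10)]
estimates = [(0, 0, 10, 10)]
p, r, f = precision_recall_f(estimates, targets)
print(p, r, f)
```

In this toy case the single estimate matches one target perfectly, so precision is 1.0, while recall is 0.5 because the second target is missed. Averaging best-match scores rather than counting hits means partially overlapping boxes receive partial credit.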

Author information

Authors and Affiliations

Neurochemistry and Neuroimaging Laboratory, University of La Laguna, Spain
Carlos Merino-Gracia
Center for Machine Perception, Czech Technical University, Czech Republic
Karel Lenc
Visual Information Laboratory, University of Bristol, UK
Majid Mirmehdi

Editor information

Editors and Affiliations

Graduate School of Engineering, Dept. of Computer Science and Intelligent Systems, Osaka Prefecture University, 1-1 Gakuencho, Naka Sakai, 599-8531, Osaka, Japan
Masakazu Iwamura
German Research Center for Artificial Intelligence, Multimedia Analysis and Data Mining Competence Center, Trippstadter Str. 122, 67663, Kaiserslautern, Germany
Faisal Shafait

About this paper

Cite this paper

Merino-Gracia, C., Lenc, K., Mirmehdi, M. (2012). A Head-Mounted Device for Recognizing Text in Natural Scenes. In: Iwamura, M., Shafait, F. (eds) Camera-Based Document Analysis and Recognition. CBDAR 2011. Lecture Notes in Computer Science, vol 7139. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-29364-1_3

DOI: https://doi.org/10.1007/978-3-642-29364-1_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-29363-4
Online ISBN: 978-3-642-29364-1
eBook Packages: Computer Science, Computer Science (R0)
