Abstract
We present a mobile head-mounted device, encased in an ordinary flat-cap hat, for detecting and tracking text. The main parts of the device are an integrated camera and audio webcam together with a simple remote control system, all connected via a USB hub to a laptop. We propose a near real-time text detection algorithm (around 14 fps for 640×480 images) that uses Maximally Stable Extremal Regions (MSERs) for image segmentation. We present text detection results compared against the ICDAR 2003 text locating competition database, along with performance figures.
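The abstract names MSERs as the segmentation step without describing them. As a rough illustration only, and not the authors' detector, the sketch below computes the stability measure commonly used to define MSERs, q(t) = (|Q(t+Δ)| − |Q(t−Δ)|) / |Q(t)|, for the region Q(t) of pixels at or below threshold t that contains a chosen seed pixel. The toy image, the seed, the value of Δ, and the function names `component_area` and `stability_profile` are all assumptions made for this example; a practical detector tracks every region at once rather than re-flooding from a single seed.

```python
from collections import deque


def component_area(image, seed, threshold):
    """Area of the 4-connected region of pixels <= threshold that contains seed.

    Returns 0 when the seed pixel itself is brighter than the threshold.
    """
    rows, cols = len(image), len(image[0])
    seed_row, seed_col = seed
    if image[seed_row][seed_col] > threshold:
        return 0
    seen = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and (nr, nc) not in seen and image[nr][nc] <= threshold:
                seen.add((nr, nc))
                queue.append((nr, nc))
    return len(seen)


def stability_profile(image, seed, delta, levels=256):
    """Return (areas, profile) for the seed's region across all thresholds.

    areas[t] is the region area at threshold t.
    profile[t] is (areas[t + delta] - areas[t - delta]) / areas[t], defined
    only where the region exists; low values mean the region barely changes
    as the threshold moves, which is the "maximally stable" idea.
    """
    areas = [component_area(image, seed, t) for t in range(levels)]
    profile = {}
    for t in range(delta, levels - delta):
        if areas[t] > 0:
            profile[t] = (areas[t + delta] - areas[t - delta]) / areas[t]
    return areas, profile


# Toy 5x5 image: a dark three-pixel "stroke" (value 20) on a bright
# background (value 200), loosely imitating dark text on a light sign.
image = [[200] * 5 for _ in range(5)]
for col in (1, 2, 3):
    image[2][col] = 20

areas, profile = stability_profile(image, seed=(2, 2), delta=10)

# Thresholds where the measure is exactly zero, i.e. the region is unchanged
# across the whole window [t - delta, t + delta].
stable_thresholds = [t for t, q in profile.items() if q == 0.0]
```

In this toy case the stroke keeps the same area over a wide band of thresholds, so its stability measure is zero there, and the measure rises sharply near the threshold at which the stroke merges with the background. Real implementations also bound the region area, which would exclude the trivially stable whole-image region that appears here at high thresholds.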
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Merino-Gracia, C., Lenc, K., Mirmehdi, M. (2012). A Head-Mounted Device for Recognizing Text in Natural Scenes. In: Iwamura, M., Shafait, F. (eds) Camera-Based Document Analysis and Recognition. CBDAR 2011. Lecture Notes in Computer Science, vol 7139. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-29364-1_3
DOI: https://doi.org/10.1007/978-3-642-29364-1_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-29363-4
Online ISBN: 978-3-642-29364-1
eBook Packages: Computer Science (R0)