Abstract
Automatic extraction of text from multimedia content is an important problem whose solution would make retrieval engines more effective. Crandall, Antani and Kasturi recently showed that direct analysis of certain DCT coefficients can locate potential regions of caption text in MPEG-1 videos. In this paper, we extend their proposal to wavelet-coded images and show that text superimposed on natural scenes can also be localized effectively and efficiently by applying a wavelet transform to the image and then analyzing the distribution of second-order statistics in the high-frequency wavelet bands.
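The general idea of the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state which wavelet, which second-order statistic, which block size, or which decision rule the paper uses. The sketch assumes a single-level Haar transform, block-wise variance as the second-order statistic, and a fixed threshold; the names `haar_level`, `block_variance`, and `text_candidates`, and the parameters `size` and `threshold`, are hypothetical.

```python
from statistics import pvariance


def haar_level(img):
    """One level of an averaging 2-D Haar transform (assumed wavelet).

    img is a list of rows of grey levels with even height and width.
    Returns (approx, horiz, vert, diag), each half the size of img.
    """
    approx, horiz, vert, diag = [], [], [], []
    for r in range(0, len(img) - 1, 2):
        ra, rh, rv, rd = [], [], [], []
        for c in range(0, len(img[0]) - 1, 2):
            a, b = img[r][c], img[r][c + 1]
            d, e = img[r + 1][c], img[r + 1][c + 1]
            ra.append((a + b + d + e) / 4)  # low-pass approximation
            rh.append((a + b - d - e) / 4)  # top vs bottom: horizontal edges
            rv.append((a - b + d - e) / 4)  # left vs right: vertical edges
            rd.append((a - b - d + e) / 4)  # diagonal detail
        approx.append(ra)
        horiz.append(rh)
        vert.append(rv)
        diag.append(rd)
    return approx, horiz, vert, diag


def block_variance(band, size):
    """Population variance of the coefficients in each size x size block."""
    out = []
    for r in range(0, len(band) - size + 1, size):
        row = []
        for c in range(0, len(band[0]) - size + 1, size):
            vals = [band[i][j]
                    for i in range(r, r + size)
                    for j in range(c, c + size)]
            row.append(pvariance(vals))
        out.append(row)
    return out


def text_candidates(img, size=4, threshold=100.0):
    """Flag blocks whose summed high-frequency variance exceeds a threshold.

    Both size and threshold are illustrative values, not taken from the paper.
    """
    _, horiz, vert, diag = haar_level(img)
    maps = [block_variance(band, size) for band in (horiz, vert, diag)]
    return [[(hv + vv + dv) > threshold for hv, vv, dv in zip(*rows)]
            for rows in zip(*maps)]


if __name__ == "__main__":
    # Synthetic 16x16 image: flat left half, irregular stripes on the right.
    image = [[128 if c < 8 else (255 if c % 3 == 0 else 0) for c in range(16)]
             for _ in range(16)]
    for mask_row in text_candidates(image):
        print(mask_row)
```

In this sketch, flat regions give detail coefficients close to zero and therefore low variance, while stroke-like patterns give coefficients that vary strongly within a block. A practical detector would likely replace the fixed threshold with a rule learned from labeled examples.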
This work has been supported by the Spanish Ministerio de Ciencia y Tecnología under grants TIC2003-09291 and TIC2000-0399-C02-01.
References
O’Gorman, L., Kasturi, R. (eds.): Document Image Analysis. IEEE Computer Society Press, Los Alamitos (1997) (Published as Technical Briefing)
Hu, J., Bagga, A.: Categorizing images in web documents. IEEE Trans. on Multimedia, 22–30 (2004)
Allier, B., Duong, J., Gagneux, A., Mallet, P., Emptoz, H.: Texture feature characterization for logical pre-labeling. In: Proc. of Int. Conference on Document Analysis and Recognition, pp. 567–571 (2003)
Zhong, Y., Karu, K., Jain, A.K.: Locating text in complex color images. In: Proc. of Int. Conference on Document Analysis and Recognition, pp. 146–149 (1995)
Patel, D.: Page segmentation for document image analysis using a neural network. Optical Engineering 35, 1854–1861 (1996)
Payne, J.S., Stonham, T.J., Patel, D.: Document segmentation using texture analysis. In: Proc. of Int. Conference on Pattern Recognition, pp. 380–382 (1994)
Jain, A.K., Bhattacharjee, S.K.: Address block location on envelopes using Gabor filters. Pattern Recognition 25, 1459–1477 (1992)
Menoti, D., Borges, D.L., Facon, J., Britto, A.S.: Segmentation of postal envelopes for address block location: an approach based on feature selection in wavelet space. In: Proc. of Int. Conf. on Document Analysis and Recognition, pp. 699–703 (2003)
Crandall, D., Antani, S., Kasturi, R.: Extraction of special effects caption text events from digital video. Int. Journal on Document Analysis and Recognition 5, 138–157 (2003)
van Hateren, J.H., Ruderman, D.L.: Independent component analysis of natural image sequences yields spatio-temporal filters similar to simple cells in primary visual cortex. Proc. of the Royal Society of London, Series B 265, 2315–2320 (1998)
Brodatz, P.: Textures: A Photographic Album for Artists and Designers. Dover Publications, New York (1966)
Rao, K.R., Yip, P.: Discrete Cosine Transform: Algorithms, Advantages, Applications. Academic Press, London (1990)
Mallat, S.: A Wavelet Tour of Signal Processing. Academic Press, London (1998)
Olshausen, B.A., Field, D.J.: Sparse coding with an overcomplete basis set: a strategy employed by V1. Vision Research 37, 3311–3325 (1997)
Olshausen, B.A., Field, D.J.: Natural image statistics and efficient coding. Network: Computation in Neural Systems 7, 333–339 (1996)
Wang, J.Z.: Integrated Region-based Image Retrieval. Kluwer Academic Publishers, The Netherlands (2001)
Mallat, S.: A theory for multiresolution signal decomposition: the wavelet representation. IEEE Trans. on Pattern Analysis and Machine Intelligence 11, 674–693 (1989)
Bhattacharya, U., Chaudhuri, B.B.: A majority voting scheme for multiresolution recognition of handprinted numerals. In: Proc. of Int. Conference on Document Analysis and Recognition (ICDAR), pp. 16–20 (2003)
Li, H., Doermann, D.S.: Automatic identification of text in digital video key frames. In: Proc. of Int. Conference on Pattern Recognition, pp. 129–132 (1998)
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Jiménez, J., Martí, E. (2004). Localization of Caption Texts in Natural Scenes Using a Wavelet Transformation. In: Sanfeliu, A., Martínez Trinidad, J.F., Carrasco Ochoa, J.A. (eds) Progress in Pattern Recognition, Image Analysis and Applications. CIARP 2004. Lecture Notes in Computer Science, vol 3287. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30463-0_12
DOI: https://doi.org/10.1007/978-3-540-30463-0_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23527-9
Online ISBN: 978-3-540-30463-0
eBook Packages: Springer Book Archive