Abstract
In this paper, we present a novel video text detection and segmentation system. In the detection stage, we use an edge density feature, a pyramid strategy, and a set of weak rules to search for text regions, so that a high detection rate can be achieved. To eliminate false alarms and improve the precision rate, a multilevel verification strategy is then adopted. In the segmentation stage, a precise polarity estimation algorithm is applied first. Next, multiple frames containing the same text are integrated to enhance the contrast between text and background. Finally, a novel connected-component-based binarization algorithm is proposed to improve the recognition rate. Experimental results demonstrate the superior performance of the proposed system.
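The abstract does not give the details of these stages, so the following is only a minimal sketch of two of the ideas it names, under stated assumptions. It assumes that "polarity" means whether the text is brighter or darker than its background, that the frames are already aligned grayscale images, and that integration is a pixel-wise minimum (bright text) or maximum (dark text), which is a common choice in the multi-frame integration literature and not necessarily the authors' method. The function names, the `Frame` type, and the threshold value are hypothetical.

```python
from typing import List

# Hypothetical representation: a grayscale image as rows of 0-255 intensities.
Frame = List[List[int]]


def edge_density(frame: Frame, threshold: int = 40) -> float:
    """Fraction of horizontally adjacent pixel pairs whose intensity
    difference exceeds `threshold` (an assumed, illustrative value).
    Text regions tend to score higher than smooth background."""
    edges = 0
    total = 0
    for row in frame:
        for left, right in zip(row, row[1:]):
            total += 1
            if abs(right - left) > threshold:
                edges += 1
    return edges / total if total else 0.0


def integrate_frames(frames: List[Frame], bright_text: bool) -> Frame:
    """Pixel-wise min (bright text) or max (dark text) over aligned frames.
    Stable text pixels keep their value while a changing background is
    pushed toward the opposite extreme, which raises the contrast."""
    pick = min if bright_text else max
    height, width = len(frames[0]), len(frames[0][0])
    return [
        [pick(f[y][x] for f in frames) for x in range(width)]
        for y in range(height)
    ]
```

In this sketch, a detector would compute `edge_density` over candidate windows at several scales, and a segmenter would call `integrate_frames` on the frames of one text occurrence once its polarity has been estimated.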
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Xu, L., Wang, K. (2008). Extracting Text Information for Content-Based Video Retrieval. In: Satoh, S., Nack, F., Etoh, M. (eds) Advances in Multimedia Modeling. MMM 2008. Lecture Notes in Computer Science, vol 4903. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-77409-9_6
DOI: https://doi.org/10.1007/978-3-540-77409-9_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-77407-5
Online ISBN: 978-3-540-77409-9
eBook Packages: Computer Science, Computer Science (R0)