ABSTRACT
Personal albums are becoming increasingly popular owing to the explosive growth of digital image capturing devices, and an effective automatic annotation system is needed to browse and search them efficiently. Research on image annotation has evolved through two stages: learning-based methods and web-based methods. Learning-based methods learn classifiers or joint probabilities between images and concepts, but they have difficulty handling large-scale concept sets because training data is scarce. Web-based methods leverage web image data to learn relevant annotations, which greatly expands the scale of concepts. However, they still suffer from two problems: the query image lacks prior knowledge, and the resulting annotations are often noisy and incoherent. To address these issues, we propose a web-based approach that annotates a collection of photos simultaneously, rather than independently, by leveraging the abundant correlations among the photos. A multi-graph similarity propagation based semi-supervised learning (MGSP-SSL) algorithm is proposed to suppress the noise in the initial annotations obtained from the Web. Experiments on real personal albums show that the proposed approach outperforms existing annotation methods.
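The abstract does not state the MGSP-SSL update rule, so the following is only an illustrative sketch of the general idea of denoising initial annotation scores by propagating them over several photo-similarity graphs. It substitutes a generic graph-based propagation step (scores are repeatedly smoothed over a normalized affinity matrix while being pulled back toward their initial values) applied to a weighted sum of the graphs. The function names, the fixed graph weights, `alpha`, and the iteration count are all assumptions made for illustration and are not taken from the paper.

```python
from math import sqrt


def combine_graphs(graphs, weights):
    """Weighted sum of several n x n affinity matrices (lists of lists).

    Assumption: graphs are fused with fixed weights; the paper's actual
    multi-graph scheme is not described in the abstract.
    """
    n = len(graphs[0])
    return [[sum(w * g[i][j] for g, w in zip(graphs, weights)) for j in range(n)]
            for i in range(n)]


def normalize(affinity):
    """Symmetric normalization S = D^(-1/2) W D^(-1/2)."""
    n = len(affinity)
    degree = [sum(row) for row in affinity]
    # Photos with no neighbours get a zero factor and stay isolated.
    inv_sqrt = [1.0 / sqrt(d) if d > 0 else 0.0 for d in degree]
    return [[inv_sqrt[i] * affinity[i][j] * inv_sqrt[j] for j in range(n)]
            for i in range(n)]


def propagate(graphs, weights, initial_scores, alpha=0.5, iterations=50):
    """Iterate F <- alpha * S * F + (1 - alpha) * Y.

    initial_scores (Y) holds one row per photo and one column per concept,
    e.g. noisy scores obtained from a web search.
    """
    s = normalize(combine_graphs(graphs, weights))
    n = len(s)
    k = len(initial_scores[0])
    scores = [row[:] for row in initial_scores]
    for _ in range(iterations):
        scores = [[alpha * sum(s[i][j] * scores[j][c] for j in range(n))
                   + (1 - alpha) * initial_scores[i][c]
                   for c in range(k)]
                  for i in range(n)]
    return scores


if __name__ == "__main__":
    # Hypothetical album of three photos; photos 0 and 1 are close in both graphs.
    visual = [[0, 1, 0], [1, 0, 0.1], [0, 0.1, 0]]
    temporal = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
    # Columns are two hypothetical concepts: "beach", "car".
    initial = [[1.0, 0.0], [0.0, 0.0], [0.0, 1.0]]
    for row in propagate([visual, temporal], [0.5, 0.5], initial):
        print([round(v, 3) for v in row])
```

In this toy setting, photo 1 starts with no annotation and inherits most of its score from photo 0, its neighbour in both graphs, which is the kind of cross-photo correlation the abstract proposes to exploit.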
Index Terms
- Annotating personal albums via web mining