Cross-Modal Saliency Correlation for Image Annotation

Gu, Yun; Xue, Haoyang; Yang, Jie

doi:10.1007/s11063-016-9511-4

Published: 02 March 2016

Neural Processing Letters
Volume 45, pages 777–789 (2017)

Yun Gu¹,
Haoyang Xue² &
Jie Yang²

Abstract

Automatic image annotation is an attractive service for users and administrators of online photo-sharing websites. In this paper, we propose an image annotation approach that exploits cross-modal saliency correlation, covering both visual and textual saliency. For textual saliency, a concept graph is first established based on the associations between labels; semantic communities and latent textual saliency are then detected. For visual saliency, we adopt a dual-layer BoW (DL-BoW) model that integrates the local features and salient regions of the image. Experiments on the MIRFlickr and IAPR TC-12 datasets demonstrate that the proposed method outperforms other state-of-the-art approaches.
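
The textual side of the abstract (a concept graph built from label associations, followed by community detection) can be illustrated with a minimal sketch. The abstract does not state the association measure or the detection algorithm, so both choices here are assumptions: edges are weighted by the Jaccard co-occurrence of two labels, and communities are taken as connected components after dropping weak edges, a deliberately simple stand-in for a real community detection method. The function names `build_concept_graph` and `detect_communities` and the `threshold` parameter are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_concept_graph(image_labels):
    """Return {(a, b): weight} with Jaccard co-occurrence weights (assumed measure).

    Labels that never co-occur with another label get no edge and are omitted.
    """
    count = defaultdict(int)      # number of images carrying each label
    co_count = defaultdict(int)   # number of images carrying both labels
    for labels in image_labels:
        unique = sorted(set(labels))
        for label in unique:
            count[label] += 1
        for a, b in combinations(unique, 2):
            co_count[(a, b)] += 1
    return {
        (a, b): n / (count[a] + count[b] - n)
        for (a, b), n in co_count.items()
    }


def detect_communities(graph, threshold=0.5):
    """Group labels linked by edges with weight >= threshold.

    Connected components are a stand-in, not the paper's algorithm.
    """
    neighbours = defaultdict(set)
    nodes = set()
    for (a, b), weight in graph.items():
        nodes.update((a, b))
        if weight >= threshold:
            neighbours[a].add(b)
            neighbours[b].add(a)

    seen, communities = set(), []
    for start in sorted(nodes):
        if start in seen:
            continue
        stack, members = [start], set()
        while stack:  # depth-first walk over strong edges
            node = stack.pop()
            if node in members:
                continue
            members.add(node)
            stack.extend(neighbours[node] - members)
        seen |= members
        communities.append(members)
    return communities


if __name__ == "__main__":
    # Toy label sets, invented purely for illustration.
    toy = [
        ["sky", "cloud", "sea"],
        ["sky", "cloud"],
        ["sea", "beach"],
        ["dog", "grass"],
        ["dog", "grass", "sky"],
    ]
    for community in detect_communities(build_concept_graph(toy)):
        print(sorted(community))
```

On the toy data, labels that frequently appear together (such as "dog" and "grass") end up in the same group, while weak links (such as "dog" and "sky") are cut by the threshold. How the paper derives latent textual saliency from such communities is not described in the abstract and is not modeled here.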

Notes

The term ‘community’ comes from the research field of networks; it is similar, but not identical, to a ‘clique’ in graph-cut problems.
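
The distinction in this note can be made concrete with a small sketch. It relies on the usual network-science reading, which is an assumption here rather than the paper's own definition: a clique requires every pair of nodes to be directly connected, whereas a community only needs to be densely connected internally. The helper names `is_clique` and `internal_density` and the toy label graph are hypothetical.

```python
from itertools import combinations


def is_clique(nodes, edges):
    """True if every pair of nodes is directly connected."""
    return all(frozenset(pair) in edges for pair in combinations(nodes, 2))


def internal_density(nodes, edges):
    """Fraction of possible node pairs inside the group that are connected."""
    pairs = list(combinations(nodes, 2))
    if not pairs:
        return 0.0
    return sum(frozenset(pair) in edges for pair in pairs) / len(pairs)


# Toy undirected label graph, invented for illustration.
edges = {
    frozenset(edge)
    for edge in [
        ("sky", "cloud"),
        ("sky", "sun"),
        ("cloud", "sun"),
        ("sun", "sea"),
        ("sky", "sea"),
    ]
}
group = ["sky", "cloud", "sun", "sea"]

if __name__ == "__main__":
    print(is_clique(group, edges))         # one pair (cloud, sea) is missing
    print(internal_density(group, edges))  # still dense: 5 of 6 pairs linked
```

Here the four labels form a dense group that a community detection method could reasonably keep together, yet the group is not a clique because "cloud" and "sea" are not directly linked.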

Acknowledgments

This work is partly supported by NSFC, China (No. 61572315) and the 973 Plan, China (No. 2015CB856004).

Author information

Authors and Affiliations

School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China
Yun Gu
Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, Shanghai, China
Haoyang Xue & Jie Yang

Corresponding author

Correspondence to Jie Yang.

About this article

Cite this article

Gu, Y., Xue, H. & Yang, J. Cross-Modal Saliency Correlation for Image Annotation. Neural Process Lett 45, 777–789 (2017). https://doi.org/10.1007/s11063-016-9511-4

Issue Date: June 2017
DOI: https://doi.org/10.1007/s11063-016-9511-4
