Fuzzy bag of words for social image description

Li, Yanshan; Liu, Weiming; Huang, Qinghua; Li, Xuelong

doi:10.1007/s11042-014-2138-4

Published: 15 June 2014

Multimedia Tools and Applications, volume 75, pages 1371–1390 (2016)

Yanshan Li¹,²,³,
Weiming Liu²,
Qinghua Huang¹ &
…
Xuelong Li⁴

Abstract

The rapid growth of social media resources brings both challenges and opportunities for image description technologies. The performance of an image description method directly affects the accuracy of image retrieval, image annotation, and image recognition. Bag of Words (BoW), an efficient approach to describing images, has attracted increasing attention. However, in traditional BoW the mapping between the words in the codebook and the features extracted from the images is ambiguous. Because Fuzzy Sets Theory (FST) is a powerful means of dealing with uncertainty, we use it to address the problem caused by this ambiguity between features and words. Accordingly, we propose a new type of BoW based on FST, named FBoW, to describe images. First, features are extracted from the images. Second, k-means is used to learn the codebook. Third, a fuzzy membership function is designed to measure the similarity between the features and the words; its optimal parameters are obtained with a Genetic Algorithm (GA). Finally, a histogram describing each image is generated by summing the fuzzy membership values of each word. Experimental results show that the proposed FBoW outperforms traditional BoW for social image description.
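
The pipeline summarized in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the form of the fuzzy membership function, so the fuzzy c-means style membership used here, its fuzzifier `m`, and all function names are assumptions made for illustration only, not the authors' method. The fuzzifier is fixed by hand, whereas the paper tunes the membership parameters with a GA, and random vectors stand in for real local descriptors.

```python
import numpy as np


def learn_codebook(features, n_words, n_iter=20, seed=0):
    """Plain Lloyd's k-means; returns an (n_words, dim) array of visual words."""
    rng = np.random.default_rng(seed)
    # Initialise the words from randomly chosen, distinct feature vectors.
    idx = rng.choice(len(features), size=n_words, replace=False)
    words = features[idx].copy()
    for _ in range(n_iter):
        dists = np.linalg.norm(features[:, None, :] - words[None, :, :], axis=2)
        nearest = dists.argmin(axis=1)
        for k in range(n_words):
            members = features[nearest == k]
            if len(members) > 0:  # keep the old word if its cluster is empty
                words[k] = members.mean(axis=0)
    return words


def fuzzy_membership(features, words, m=2.0, eps=1e-12):
    """ASSUMED membership function (fuzzy c-means form, fuzzifier m > 1).

    Returns an (n_features, n_words) matrix whose rows sum to 1.
    """
    dists = np.linalg.norm(features[:, None, :] - words[None, :, :], axis=2) + eps
    inv = dists ** (-2.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)


def fbow_histogram(features, words, m=2.0):
    """Sum the membership values of each word over all features of one image."""
    return fuzzy_membership(features, words, m).sum(axis=0)


def bow_histogram(features, words):
    """Traditional hard-assignment BoW histogram, for comparison."""
    dists = np.linalg.norm(features[:, None, :] - words[None, :, :], axis=2)
    return np.bincount(dists.argmin(axis=1), minlength=len(words)).astype(float)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    training_features = rng.normal(size=(200, 8))  # stand-in for descriptors
    words = learn_codebook(training_features, n_words=5)
    image_features = rng.normal(size=(30, 8))      # features of one image
    print("FBoW:", fbow_histogram(image_features, words))
    print("BoW: ", bow_histogram(image_features, words))
```

In this sketch each feature contributes a total weight of 1 to the histogram in both variants. The hard-assignment version gives all of that weight to the nearest word, while the fuzzy version spreads it across words according to distance, which is one way to represent the feature-to-word ambiguity the abstract describes.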

Acknowledgments

This research was supported by the National Natural Science Funds of China (Nos. 61125106, 61372007, 91120302, and 61072093), the Guangdong Provincial Project of Transportation Science and Technology (No. 2012-02-084), the Natural Science Funds of Guangdong Province (No. S2012010009885), the Fundamental Research Funds for the Central Universities (No. 2014ZG0038), the Projects of Innovative Science and Technology, Department of Education, Guangdong Province (No. 2013KJCX0012), and the Shaanxi Key Innovation Team of Science and Technology (Grant No. 2012KCT-04).

Author information

Authors and Affiliations

School of Electronic and Information Engineering, South China University of Technology, Guangzhou, 510640, China
Yanshan Li & Qinghua Huang
School of Civil Engineering and Transportation, South China University of Technology, Guangzhou, 510640, China
Yanshan Li & Weiming Liu
Shenzhen University, Shenzhen, 518060, China
Yanshan Li
The Center for OPTical IMagery Analysis and Learning (OPTIMAL), State Key Laboratory of Transient Optics and Photonics, Xi’an Institute of Optics and Precision Mechanics, Chinese Academy of Sciences, Xi’an, 710119, Shaanxi, China
Xuelong Li

Corresponding author

Correspondence to Qinghua Huang.

About this article

Cite this article

Li, Y., Liu, W., Huang, Q. et al. Fuzzy bag of words for social image description. Multimed Tools Appl 75, 1371–1390 (2016). https://doi.org/10.1007/s11042-014-2138-4

Received: 04 December 2013
Revised: 17 April 2014
Accepted: 02 June 2014
Published: 15 June 2014
Issue Date: February 2016
DOI: https://doi.org/10.1007/s11042-014-2138-4
