Discriminative sequential association latent dirichlet allocation for visual recognition

Yao, Ting-Ting; Xie, Zhao; Gao, Jun; Wang, Chi

doi:10.1007/s10044-014-0444-0

Theoretical Advances
Published: 10 January 2015

Pattern Analysis and Applications, Volume 19, pages 719–730 (2016)

Abstract

Graphical models have been employed in a wide variety of computer vision tasks. In typical models, however, the latent variable assignments obtained by sampling are often ambiguous and hard to interpret. In this paper we present discriminative sequential association Latent Dirichlet Allocation, a novel statistical model for visual recognition, with a particular focus on the case of few training examples. By introducing switching variables and formulating a direct discriminative analysis, the model treats sequential associations as a prior and uses them to establish a relevance determination mechanism, which yields reasonable assignments of latent variables and avoids invalid labeling oscillations. We demonstrate the power of our model on two commonly used datasets. The experimental results show that it achieves better performance with efficient convergence while also giving good interpretations of specific topic assignments.

Notes

We call it ‘LDA-50’ for brevity: the name of the particular graphical model followed by the number of sampling iterations. The same convention applies to all names below.
We use KBoW, LBoW, KSPM and LSPM as shorthand for “BoW representation (an image represented as an orderless collection of local features) + RBF kernel SVM”, “BoW representation + linear SVM”, “SPM representation (the image partitioned into sub-regions, with a histogram of local features computed inside each sub-region) + RBF kernel SVM”, and “SPM representation + linear SVM”, respectively.
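
The two baseline representations named in the note above can be illustrated with a minimal sketch. It assumes each local feature has already been quantized to a visual-word index and carries a position normalized to [0, 1); the function names, the vocabulary size and the two-level pyramid are hypothetical choices for illustration, not details taken from the article. The pyramid here is unweighted, whereas the original SPM formulation weights levels differently.

```python
import numpy as np


def bow_histogram(words, vocab_size):
    """Orderless bag-of-words: normalized count of each visual word."""
    hist = np.bincount(words, minlength=vocab_size).astype(float)
    total = hist.sum()
    return hist / total if total > 0 else hist


def spm_histogram(words, xy, vocab_size, levels=2):
    """Unweighted spatial pyramid (an illustrative simplification).

    Concatenates per-cell word histograms over grids of 1x1, 2x2, ...
    `xy` holds feature positions normalized to [0, 1).
    """
    parts = []
    for level in range(levels):
        cells = 2 ** level
        # Map each feature to the grid cell that contains it.
        col = np.minimum((xy[:, 0] * cells).astype(int), cells - 1)
        row = np.minimum((xy[:, 1] * cells).astype(int), cells - 1)
        for r in range(cells):
            for c in range(cells):
                mask = (row == r) & (col == c)
                counts = np.bincount(words[mask], minlength=vocab_size)
                parts.append(counts.astype(float))
    vec = np.concatenate(parts)
    total = vec.sum()
    return vec / total if total > 0 else vec


# Hypothetical toy image: four local features, vocabulary of three words.
words = np.array([0, 2, 2, 1])
xy = np.array([[0.1, 0.1], [0.9, 0.1], [0.9, 0.9], [0.4, 0.6]])

bow = bow_histogram(words, vocab_size=3)       # length 3
spm = spm_histogram(words, xy, vocab_size=3)   # length 3 * (1 + 4) = 15
```

In the KBoW/LBoW and KSPM/LSPM baselines, vectors like `bow` and `spm` would then be fed to an RBF kernel or linear SVM; the classifier step is omitted here.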

Acknowledgments

This work was supported by the National Natural Science Foundation of China under Grant 61273237, the Fundamental Research Funds for the Central Universities under Grant 2012HGCX0001, and the National Basic Research Program of China (973 Program) under Grant 2013CB329604.

Author information

Authors and Affiliations

School of Computer and Information, Hefei University of Technology, P.O. BOX 98, No. 193 Tunxi Road, Hefei, 230009, Anhui, China
Ting-Ting Yao, Zhao Xie, Jun Gao & Chi Wang

Corresponding author

Correspondence to Zhao Xie.

About this article

Cite this article

Yao, TT., Xie, Z., Gao, J. et al. Discriminative sequential association latent dirichlet allocation for visual recognition. Pattern Anal Applic 19, 719–730 (2016). https://doi.org/10.1007/s10044-014-0444-0

Received: 18 October 2013
Accepted: 21 December 2014
Published: 10 January 2015
Issue Date: August 2016
DOI: https://doi.org/10.1007/s10044-014-0444-0
