Semantic Segmentation with Second-Order Pooling

Carreira, João; Caseiro, Rui; Batista, Jorge; Sminchisescu, Cristian

doi:10.1007/978-3-642-33786-4_32

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 7578)

Included in the following conference series:

European Conference on Computer Vision

Abstract

Feature extraction, coding, and pooling are important components of many contemporary object recognition paradigms. In this paper we explore novel pooling techniques that encode the second-order statistics of local descriptors inside a region. To achieve this effect, we introduce multiplicative second-order analogues of average and max-pooling that, together with appropriate non-linearities, lead to state-of-the-art performance on free-form region recognition without any type of feature coding. Instead of coding, we found that enriching local descriptors with additional image information leads to large performance gains, especially in conjunction with the proposed pooling methodology. We show that second-order pooling over free-form regions produces results superior to those of the winning systems in the Pascal VOC 2011 semantic segmentation challenge, with models that are 20,000 times faster.
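
As a rough illustration of the pooling idea in the abstract, the sketch below computes second-order analogues of average and max-pooling from the outer products of the local descriptors in one region. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the toy descriptors, the power non-linearity and its exponent `h=0.75`, and the upper-triangle vectorization are all illustrative choices that the abstract does not specify.

```python
import numpy as np

def second_order_avg_pool(X):
    """Mean of the outer products x_i x_i^T over the n descriptors in X (n, d)."""
    X = np.asarray(X, dtype=float)
    return X.T @ X / X.shape[0]

def second_order_max_pool(X):
    """Element-wise maximum of the outer products x_i x_i^T over the region."""
    X = np.asarray(X, dtype=float)
    outer = X[:, :, None] * X[:, None, :]  # shape (n, d, d)
    return outer.max(axis=0)

def power_normalize(G, h=0.75):
    """Assumed non-linearity: signed power scaling applied element-wise."""
    return np.sign(G) * np.abs(G) ** h

def region_descriptor(G):
    """Pooled matrices are symmetric, so keep only the upper triangle."""
    return G[np.triu_indices(G.shape[0])]

# Toy region with two 2-dimensional descriptors (hypothetical data).
X = np.array([[1.0, 2.0],
              [3.0, -1.0]])
G_avg = second_order_avg_pool(X)
G_max = second_order_max_pool(X)
feature = region_descriptor(power_normalize(G_avg))
```

Here `feature` is a fixed-length vector that could be passed to a linear classifier. Because the pooled matrix is symmetric, a d-dimensional descriptor yields d(d+1)/2 values per region, with no codebook involved.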

Author information

Authors and Affiliations

Institute of Systems and Robotics, University of Coimbra, Portugal
João Carreira, Rui Caseiro & Jorge Batista
Faculty of Mathematics and Natural Sciences, University of Bonn, Germany
João Carreira & Cristian Sminchisescu

Editor information

Editors and Affiliations

Microsoft Research Ltd., CB3 0FB, Cambridge, UK
Andrew Fitzgibbon
Dept. of Computer Science, University of North Carolina, 27599, Chapel Hill, NC, USA
Svetlana Lazebnik
California Institute of Technology, 91125, Pasadena, CA, USA
Pietro Perona
Institute of Industrial Science, The University of Tokyo, 153-8505, Tokyo, Japan
Yoichi Sato
INRIA, 38330, Montbonnot, France
Cordelia Schmid

About this paper

Cite this paper

Carreira, J., Caseiro, R., Batista, J., Sminchisescu, C. (2012). Semantic Segmentation with Second-Order Pooling. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds) Computer Vision – ECCV 2012. ECCV 2012. Lecture Notes in Computer Science, vol 7578. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33786-4_32

DOI: https://doi.org/10.1007/978-3-642-33786-4_32
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-33785-7
Online ISBN: 978-3-642-33786-4
eBook Packages: Computer Science, Computer Science (R0)
