Graph Neural Networks Based Multi-granularity Feature Representation Learning for Fine-Grained Visual Categorization

  • Conference paper
MultiMedia Modeling (MMM 2022)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13142)

Abstract

Object categories are inherently organized in a hierarchy with different levels of classification granularity. This hierarchy encodes rich semantic relationships among categories, which can benefit fine-grained visual categorization (FGVC) but is overlooked by most previous work. In this paper, we present a novel multi-granularity feature representation learning framework for FGVC based on graph neural networks, which boosts feature learning at different grain levels simultaneously and improves categorization at multiple granularities. Under this framework, we propose two kinds of correlation graphs: the Abstract Graph (AG) and the Detailed Graph (DG). AG assigns one node to each grain level, while DG treats the individual categories at each grain level as separate nodes. Building on AG and DG, we propose two graph neural network based methods for multi-grain feature learning. With AG, a graph gate neural network is used to explore the interactions between features from different grain levels and to help learn a more discriminative and comprehensive feature representation for each grain level. With DG, we employ a graph convolutional network to model the hierarchical semantic relationships among categories and enhance the features by regularizing the division of the semantic space. To facilitate research, we construct a large-scale car dataset, Car-FG3K (available at http://www.nlpr.ia.ac.cn/iva/homepage/jqwang/Car-FG3K.htm), which covers three-level categories and is more challenging than existing car datasets in terms of category count and view variation. We conduct experiments on this new dataset and on two other datasets, CUB-200-2011 and FGVC-Aircraft, and our methods achieve results comparable to state-of-the-art methods.
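To make the two graph types concrete, the following is a minimal sketch and not the authors' implementation. It assumes that DG edges connect each category to its parent in the hierarchy, that AG is fully connected across grain levels, and that the graph convolution uses the common symmetrically normalized propagation rule; the abstract confirms none of these details. The category names and all function names are hypothetical.

```python
import math

# Hypothetical three-level hierarchy (child -> parent); labels are illustrative only.
PARENT = {
    "sedan_a": "brand_x",
    "sedan_b": "brand_x",
    "suv_c": "brand_y",
    "brand_x": "car",
    "brand_y": "car",
}

def detailed_graph(parent):
    """DG-style graph: one node per category; edges assumed to link child and parent."""
    nodes = sorted(set(parent) | set(parent.values()))
    index = {name: i for i, name in enumerate(nodes)}
    adj = [[0.0] * len(nodes) for _ in nodes]
    for child, par in parent.items():
        i, j = index[child], index[par]
        adj[i][j] = adj[j][i] = 1.0
    return nodes, adj

def abstract_graph(num_levels):
    """AG-style graph: one node per grain level; full connectivity is an assumption."""
    return [[0.0 if i == j else 1.0 for j in range(num_levels)]
            for i in range(num_levels)]

def normalize(adj):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2} with self-loops added."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def gcn_layer(adj, features, weight):
    """One propagation step: ReLU(A_norm @ H @ W)."""
    out = matmul(matmul(normalize(adj), features), weight)
    return [[max(0.0, v) for v in row] for row in out]

# Usage: one-hot node features and an identity weight, so the output
# equals the normalized adjacency of the toy DG.
nodes, dg = detailed_graph(PARENT)
identity = [[1.0 if i == j else 0.0 for j in range(len(nodes))]
            for i in range(len(nodes))]
out = gcn_layer(dg, identity, identity)
ag = abstract_graph(3)
```

In this toy setup, each row of `out` mixes a category's own feature with those of its parent and children, which is one way hierarchical relationships could shape the learned semantic space. In practice, the node features and the weight matrix would be learned rather than fixed.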

Acknowledgements

This work was supported by the National Natural Science Foundation of China (No. 61772527, 62002356, 62076235, 61976210, 62176254, 62002357 and 62006230), the Ministry of Education Industry-University Cooperative Education Program (Wei Qiao Venture Group, No. E1425201) and the Open Research Projects of Zhejiang Lab (No. 2021KH0AB07).

Author information

Corresponding author

Correspondence to Min Huang.

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Cite this paper

Wu, H., Guo, H., Miao, Q., Huang, M., Wang, J. (2022). Graph Neural Networks Based Multi-granularity Feature Representation Learning for Fine-Grained Visual Categorization. In: Þór Jónsson, B., et al. MultiMedia Modeling. MMM 2022. Lecture Notes in Computer Science, vol 13142. Springer, Cham. https://doi.org/10.1007/978-3-030-98355-0_20

  • DOI: https://doi.org/10.1007/978-3-030-98355-0_20

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-98354-3

  • Online ISBN: 978-3-030-98355-0

  • eBook Packages: Computer Science, Computer Science (R0)
