Graph Neural Networks Based Multi-granularity Feature Representation Learning for Fine-Grained Visual Categorization

Wu, Hongyan; Guo, Haiyun; Miao, Qinghai; Huang, Min; Wang, Jinqiao

doi:10.1007/978-3-030-98355-0_20

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13142)

Included in the following conference series:

International Conference on Multimedia Modeling

Abstract

Object categories inherently form a hierarchy with different levels of classification granularity. This hierarchy encodes rich semantic relationships among categories, which can benefit fine-grained visual categorization (FGVC) but is overlooked by most previous work. In this paper, we present a novel graph neural network based multi-granularity feature representation learning framework for FGVC, which boosts feature learning at different grain levels simultaneously and improves categorization at multiple granularities. Under this framework, we propose two kinds of correlation graphs: the Abstract Graph (AG) and the Detailed Graph (DG). AG assigns one node to each grain level, while DG treats each category at each grain level as a separate node. Building on AG and DG, we propose two graph neural network based multi-grain feature learning methods. With AG, a gated graph neural network is used to explore the interactions between features from different grain levels and to help learn a more discriminative and comprehensive feature representation for each grain level. With DG, we employ a graph convolutional network to model the hierarchical semantic relationships among categories and enhance the features by regularizing the division of the semantic space. To facilitate research, we construct a large-scale car dataset, Car-FG3K (available at http://www.nlpr.ia.ac.cn/iva/homepage/jqwang/Car-FG3K.htm), which covers three-level categories and is more challenging than existing car datasets in terms of category count and view variation. We conduct experiments on this new dataset and two other datasets, CUB-200-2011 and FGVC-Aircraft, and our methods achieve results comparable to state-of-the-art methods.
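
The Detailed Graph idea can be illustrated with a minimal sketch. The abstract does not say how edges or weights are defined, so everything specific here is an assumption: the toy three-level hierarchy, the choice of parent-child edges, and the helper names `build_detailed_graph`, `normalized_adjacency`, and `gcn_layer` are all hypothetical. The propagation rule is the standard graph convolution of Kipf and Welling, ReLU(D^{-1/2}(A + I)D^{-1/2} H W), and is not claimed to be the authors' exact layer.

```python
import math

# Hypothetical three-level hierarchy (coarse -> middle -> fine). The labels are
# illustrative only and are not taken from Car-FG3K.
HIERARCHY = {
    "make_A": {"model_A1": ["A1_2019", "A1_2020"], "model_A2": ["A2_2020"]},
    "make_B": {"model_B1": ["B1_2018"]},
}


def build_detailed_graph(hierarchy):
    """Return (nodes, edges): one node per category, one edge per parent-child pair."""
    nodes, edges = [], []
    for coarse, middles in hierarchy.items():
        nodes.append(coarse)
        for middle, fines in middles.items():
            nodes.append(middle)
            edges.append((coarse, middle))
            for fine in fines:
                nodes.append(fine)
                edges.append((middle, fine))
    return nodes, edges


def normalized_adjacency(nodes, edges):
    """Compute D^{-1/2} (A + I) D^{-1/2} for an undirected graph with self-loops."""
    index = {name: i for i, name in enumerate(nodes)}
    n = len(nodes)
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for u, v in edges:
        a[index[u]][index[v]] = a[index[v]][index[u]] = 1.0
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def matmul(x, y):
    """Plain-Python matrix product of two nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
            for i in range(len(x))]


def gcn_layer(a_hat, h, w):
    """One propagation step: ReLU(A_hat @ H @ W)."""
    return [[max(0.0, v) for v in row] for row in matmul(matmul(a_hat, h), w)]


nodes, edges = build_detailed_graph(HIERARCHY)
a_hat = normalized_adjacency(nodes, edges)

# One-hot node features and an identity weight matrix (both assumptions) keep the
# example deterministic; a trained model would use learned embeddings and weights.
n = len(nodes)
identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
out = gcn_layer(a_hat, identity, identity)
```

With identity features and weights, `out` equals the normalized adjacency, so each row shows how a category's representation mixes with those of its parent and children after one step. Under the same assumptions, an Abstract Graph would be far smaller: one node per grain level rather than one per category.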

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Nos. 61772527, 62002356, 62076235, 61976210, 62176254, 62002357, and 62006230), the Ministry of Education Industry-University Cooperative Education Program (Wei Qiao Venture Group, No. E1425201), and the Open Research Projects of Zhejiang Lab (No. 2021KH0AB07).

Author information

Authors and Affiliations

School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China
Hongyan Wu, Qinghai Miao, Min Huang & Jinqiao Wang
Institute of Automation, Chinese Academy of Sciences, Beijing, China
Haiyun Guo & Jinqiao Wang

Corresponding author

Correspondence to Min Huang.

Editor information

Editors and Affiliations

IT University of Copenhagen, Copenhagen, Denmark
Björn Þór Jónsson
Dublin City University, Dublin, Ireland
Cathal Gurrin
University of Science, VNU-HCM, Ho Chi Minh City, Vietnam
Minh-Triet Tran
University of Bergen, Bergen, Norway
Duc-Tien Dang-Nguyen
National Tsing Hua University, Hsinchu, Taiwan
Anita Min-Chun Hu
Hanoi University of Science and Technology, Hanoi, Vietnam
Binh Huynh Thi Thanh
Median Technologies, Valbonne, France
Benoit Huet

About this paper

Cite this paper

Wu, H., Guo, H., Miao, Q., Huang, M., Wang, J. (2022). Graph Neural Networks Based Multi-granularity Feature Representation Learning for Fine-Grained Visual Categorization. In: Þór Jónsson, B., et al. MultiMedia Modeling. MMM 2022. Lecture Notes in Computer Science, vol 13142. Springer, Cham. https://doi.org/10.1007/978-3-030-98355-0_20

DOI: https://doi.org/10.1007/978-3-030-98355-0_20
Published: 15 March 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-98354-3
Online ISBN: 978-3-030-98355-0
eBook Packages: Computer Science, Computer Science (R0)
