Abstract
Fine-grained 3D point cloud classification is vital for shape analysis and understanding. However, due to subtle inter-class differences and significant intra-class variations, applying existing point cloud networks directly to fine-grained visual classification tasks can cause overfitting and poor performance. To address this problem, we propose a unified and robust learning framework, named dynamic confusion network (DCNet), which helps the network capture the subtle differences between samples from different sub-categories more robustly. Specifically, in the feature extraction stage, we design a novel mutual complementary mechanism between an attention block and a dynamic sample confusion block to extract richer discriminative features. Furthermore, we construct robust adversarial learning between a dynamic sample confusion loss and a cross-entropy loss within a Siamese network framework, driving the network to learn more stable feature distributions. Comprehensive experiments show that DCNet achieves the best performance on three fine-grained categories, with relative accuracy improvements of 1.35%, 1.28%, and 2.30% on Airplane, Car, and Chair, respectively, over state-of-the-art point cloud methods. In addition, our approach achieves comparable performance on the coarse-grained ModelNet40 dataset.
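The abstract describes training that balances a sample confusion loss against a standard cross-entropy loss over paired (Siamese) inputs. The paper's exact formulation is not given here, so the sketch below is only an illustrative assumption: it pairs per-branch cross-entropy with a pairwise-confusion-style regularizer (in the spirit of Dubey et al., 2018) that pulls the two branches' prediction distributions together, weighted by a hypothetical coefficient `lam`.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the last axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def cross_entropy(probs, labels):
    # Mean negative log-likelihood of the true class.
    n = probs.shape[0]
    return -np.log(probs[np.arange(n), labels] + 1e-12).mean()

def confusion_term(probs_a, probs_b):
    # Squared Euclidean distance between the two branches'
    # predicted distributions; zero when they agree exactly.
    return np.square(probs_a - probs_b).sum(axis=-1).mean()

def siamese_confusion_loss(logits_a, logits_b, labels_a, labels_b, lam=0.1):
    # Hypothetical combined objective: classify each branch correctly
    # while regularizing the gap between paired predictions.
    pa, pb = softmax(logits_a), softmax(logits_b)
    ce = cross_entropy(pa, labels_a) + cross_entropy(pb, labels_b)
    return ce + lam * confusion_term(pa, pb)
```

The tension between the two terms is the point: cross-entropy sharpens predictions toward the labels, while the confusion term discourages overconfident, sample-specific features, which is the overfitting failure mode the paper targets for fine-grained classes.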
Acknowledgements
This work was supported by the National Natural Science Foundation of China under Grants 62162001 and 61762003, the Natural Science Foundation of Ningxia Province of China under Grant 2022AAC02041, the Ningxia Excellent Talent Program, and the North Minzu University Innovation Project (YCX21093).
Cite this article
Wu, R., Bai, J., Li, W. et al. DCNet: exploring fine-grained vision classification for 3D point clouds. Vis Comput 40, 781–797 (2024). https://doi.org/10.1007/s00371-023-02816-y