
Learning representative viewpoints in 3D shape recognition

  • Original article
  • Published in: The Visual Computer

Abstract

3D shape recognition that infers an object's category from 2D rendered images has proven effective when many viewpoints are adopted and the relationships between them are mined. However, forming a reasonable expression of an object from a limited number of generally representative viewpoints is a task of both practical and theoretical significance. This paper proposes a multi-view CNN architecture with independent viewpoint feature extraction and unified importance weighting, which can dramatically decrease the number of viewpoints required by learning the representative ones. First, view-based and viewpoint-independent features are extracted by a deep neural network. Second, the network automatically learns the relationships between viewpoints and outputs an importance weight for each view. Finally, the view features are aggregated to predict the object's category. By iteratively learning these importance weights over instances, globally representative viewpoints are selected. We assess our method on two challenging datasets, ModelNet and ShapeNet. Rigorous experiments show that our strategy is competitive with the latest methods using only six viewpoints and RGB information as input; with 20 viewpoints as input, it achieves state-of-the-art performance. Specifically, the proposed approach achieves 99.34% and 97.49% accuracy on ModelNet10 and ModelNet40, respectively, and 80.0% mAP on ShapeNet.
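The final aggregation step described above can be illustrated with a minimal sketch: per-view features are scored for importance, the scores are normalized, and the features are combined into one shape descriptor by a weighted sum. This is not the paper's actual network; the linear scoring vector `score_weights` is a hypothetical stand-in for the learned importance-weight branch, and softmax normalization is an assumption.

```python
import numpy as np

def softmax(x):
    # numerically stable softmax over the view axis
    e = np.exp(x - x.max())
    return e / e.sum()

def aggregate_views(view_features, score_weights):
    """Importance-weighted aggregation of per-view features.

    view_features: (V, D) array, one D-dim feature per rendered view.
    score_weights: (D,) hypothetical scoring vector mapping each view
                   feature to a scalar importance score.
    Returns a (D,) aggregated shape descriptor.
    """
    scores = view_features @ score_weights   # (V,) raw importance scores
    weights = softmax(scores)                # (V,) normalized, sums to 1
    return weights @ view_features           # (D,) weighted sum of views

# toy example: 6 views (as in the low-viewpoint setting), 4-dim features
rng = np.random.default_rng(0)
feats = rng.standard_normal((6, 4))
w = rng.standard_normal(4)
desc = aggregate_views(feats, w)
```

In training, averaging the per-view weights over many instances would indicate which viewpoints are consistently important, which is the intuition behind selecting globally representative viewpoints.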





Acknowledgements

This work was supported by the National Natural Science Foundation of China (No. U20B2062), the Beijing Municipal Science & Technology Project (No. Z191100007419001), the Beijing National Research Center for Information Science and Technology, and the Key Laboratory of Opto-Electronic Information Processing, CAS (No. JGA202004027).

Author information


Corresponding author

Correspondence to Huimin Ma.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Chu, H., Le, C., Wang, R. et al. Learning representative viewpoints in 3D shape recognition. Vis Comput 38, 3703–3718 (2022). https://doi.org/10.1007/s00371-021-02203-5


  • Accepted:

  • Published:

  • Issue Date:


Keywords
