3D Feature Prediction for Masked-AutoEncoder-Based Point Cloud Pretraining

Yan, Siming; Yang, Yuqi; Guo, Yuxiao; Pan, Hao; Wang, Peng-shuai; Tong, Xin; Liu, Yang; Huang, Qixing

arXiv:2304.06911 (cs)

[Submitted on 14 Apr 2023 (v1), last revised 28 Apr 2024 (this version, v2)]

Abstract: Masked autoencoders (MAE) have recently been introduced to 3D self-supervised pretraining for point clouds, following their great success in NLP and computer vision. Unlike MAEs in the image domain, where the pretext task is to restore features at the masked pixels, such as colors, existing 3D MAE works reconstruct only the missing geometry, i.e., the locations of the masked points. In contrast to these studies, we argue that point location recovery is inessential and that restoring intrinsic point features is far more effective. To this end, we propose to ignore point position reconstruction and instead recover high-order features at the masked points, including surface normals and surface variations, through a novel attention-based decoder that is independent of the encoder design. We validate the effectiveness of our pretext task and decoder design using different encoder structures for 3D training, and demonstrate the advantages of our pretrained networks on various point cloud analysis tasks.
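
The abstract names surface normals and surface variations as the features to recover at masked points. As a rough illustration only, the sketch below estimates both quantities with local PCA over k nearest neighbors, a common textbook approach. It is not taken from the paper: the function name `normals_and_variation`, the neighborhood size `k`, and the definition of surface variation as the smallest covariance eigenvalue divided by the eigenvalue sum are all assumptions made for this example, and the authors' actual target computation may differ.

```python
import numpy as np

def normals_and_variation(points, k=8):
    """Hypothetical helper: per-point unit normals and surface variation via local PCA."""
    points = np.asarray(points, dtype=np.float64)
    # Brute-force pairwise squared distances; adequate for small clouds only.
    d2 = np.sum((points[:, None, :] - points[None, :, :]) ** 2, axis=-1)
    # Indices of the k nearest neighbours of every point (the point itself included).
    idx = np.argsort(d2, axis=1)[:, :k]
    neigh = points[idx]                                        # shape (n, k, 3)
    centered = neigh - neigh.mean(axis=1, keepdims=True)
    # One 3x3 covariance matrix per neighbourhood.
    cov = np.einsum("nki,nkj->nij", centered, centered) / k   # shape (n, 3, 3)
    # eigh returns eigenvalues in ascending order for each stacked matrix.
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Normal estimate: eigenvector of the smallest eigenvalue (sign is ambiguous).
    normals = eigvecs[:, :, 0]
    # Assumed definition: smallest eigenvalue over the eigenvalue sum.
    variation = eigvals[:, 0] / np.maximum(eigvals.sum(axis=1), 1e-12)
    return normals, variation

# Usage on a flat 5x5 grid in the z = 0 plane.
grid = np.array([[x, y, 0.0] for x in range(5) for y in range(5)])
normals, variation = normals_and_variation(grid, k=8)
```

On a flat patch such as this grid, the estimated normals align with the z axis up to sign and the variation is close to zero; curved or noisy regions give larger values. In a pretraining setup like the one the abstract describes, quantities of this kind would serve as regression targets at masked points, in place of point coordinates.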

Comments:	Published in ICLR 2024
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2304.06911 [cs.CV]
	(or arXiv:2304.06911v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2304.06911

Submission history

From: Siming Yan
[v1] Fri, 14 Apr 2023 03:25:24 UTC (10,983 KB)
[v2] Sun, 28 Apr 2024 18:36:19 UTC (43,790 KB)
