SPOT: Scalable 3D Pre-training via Occupancy Prediction for Autonomous Driving

Yan, Xiangchao; Chen, Runjian; Zhang, Bo; Yuan, Jiakang; Cai, Xinyu; Shi, Botian; Shao, Wenqi; Yan, Junchi; Luo, Ping; Qiao, Yu

arXiv:2309.10527 (cs)

[Submitted on 19 Sep 2023 (v1), last revised 25 Sep 2023 (this version, v2)]

Abstract: Annotating 3D LiDAR point clouds for perception tasks, including 3D object detection and LiDAR semantic segmentation, is notoriously time- and energy-consuming. To alleviate the labeling burden, it is promising to perform large-scale pre-training and then fine-tune the pre-trained backbone on different downstream datasets and tasks. In this paper, we propose SPOT, namely Scalable Pre-training via Occupancy prediction for learning Transferable 3D representations, and demonstrate its effectiveness on various public datasets with different downstream tasks under the label-efficiency setting. Our contributions are threefold: (1) Occupancy prediction is shown to be promising for learning general representations, as demonstrated by extensive experiments on many datasets and tasks. (2) SPOT uses a beam re-sampling technique for point cloud augmentation and applies class-balancing strategies to overcome the domain gap caused by the various LiDAR sensors and annotation strategies in different datasets. (3) Scalable pre-training is observed; that is, the downstream performance across all the experiments improves with more pre-training data. We believe that our findings can facilitate understanding of LiDAR point clouds and pave the way for future exploration in LiDAR pre-training. Code and models will be released.

Comments:	15 pages, 9 figures
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2309.10527 [cs.CV]
	(or arXiv:2309.10527v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2309.10527

Submission history

From: Bo Zhang
[v1] Tue, 19 Sep 2023 11:13:01 UTC (6,925 KB)
[v2] Mon, 25 Sep 2023 06:41:30 UTC (6,925 KB)
