MedoidsFormer: A Strong 3D Object Detection Backbone by Exploiting Interaction With Adjacent Medoid Tokens

Abstract:

In this paper, we propose MedoidsFormer, a novel transformer-based backbone equipped with a self-attention mechanism tailored explicitly to LiDAR-based 3D object detection. Unlike in 2D object detection, target objects in 3D object detection occupy a much smaller proportion of the input scene, and their distribution is significantly sparser. Given these observations, we introduce a new self-attention mechanism called Medoids Attention, which focuses on exploiting interactions within surrounding regions; this not only reduces computation and memory costs but also yields discriminative context information. Instead of aggregating tokens from adjacent areas, we present a dynamic, semantic-aware token mining process that uses k-Medoids clustering to directly select representative tokens for attention modeling. Extensive experiments show that the proposed method consistently improves over existing 3D object detectors and achieves state-of-the-art performance on the large-scale Waymo Open Dataset. We also conduct comprehensive ablation studies to verify the efficacy of the new self-attention mechanism and provide thorough insights.
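
The abstract does not give the details of Medoids Attention, so the following is only a minimal sketch of the general idea it describes: pick representative tokens with k-Medoids clustering, then let a query attend to those medoid tokens rather than to every token. It is not the authors' implementation. The function names (`k_medoids`, `attend_to_medoids`), the first-k initialization, the squared Euclidean distance, and the use of raw tokens as keys and values are all assumptions made for illustration, and the restriction to adjacent regions is not modeled.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_medoids(tokens, k, iters=10):
    """Plain alternating k-medoids; returns the indices of the medoid tokens.

    Assumption: medoids are initialized to the first k tokens so the
    sketch is deterministic. Real systems would choose differently.
    """
    medoids = list(range(k))
    for _ in range(iters):
        # Assignment step: each token joins the cluster of its nearest medoid.
        clusters = [[] for _ in range(k)]
        for i, t in enumerate(tokens):
            nearest = min(range(k), key=lambda c: sq_dist(t, tokens[medoids[c]]))
            clusters[nearest].append(i)
        # Update step: the new medoid is the member with the smallest
        # total distance to the other members of its cluster.
        new_medoids = []
        for c, members in enumerate(clusters):
            if not members:  # keep the old medoid if a cluster emptied out
                new_medoids.append(medoids[c])
                continue
            best = min(
                members,
                key=lambda i: sum(sq_dist(tokens[i], tokens[j]) for j in members),
            )
            new_medoids.append(best)
        if new_medoids == medoids:  # converged
            break
        medoids = new_medoids
    return medoids


def attend_to_medoids(query, tokens, medoid_idx):
    """Scaled dot-product attention of one query over the medoid tokens only.

    Assumption: keys and values are the raw medoid tokens (no learned
    projections), which keeps the example short.
    """
    d = len(query)
    scores = [
        sum(q * x for q, x in zip(query, tokens[m])) / math.sqrt(d)
        for m in medoid_idx
    ]
    top = max(scores)  # subtract the max for a numerically stable softmax
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    weights = [w / total for w in weights]
    out = [
        sum(w * tokens[m][j] for w, m in zip(weights, medoid_idx))
        for j in range(d)
    ]
    return out, weights


# Toy input: two well-separated groups of 2-D "tokens".
tokens = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
medoid_idx = k_medoids(tokens, k=2)
context, weights = attend_to_medoids([10, 10], tokens, medoid_idx)
```

Because the query attends to k medoid tokens instead of all N tokens, the per-query cost of the attention step in this sketch scales with k rather than N, which is consistent with the reduction in computation and memory that the abstract reports.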

Published in: IEEE Transactions on Circuits and Systems for Video Technology (Volume: 33, Issue: 10, October 2023)

Page(s): 5844 - 5854

Date of Publication: 21 March 2023

DOI: 10.1109/TCSVT.2023.3260115
