5 February 2024 HA-DQS-Net: dynamic query design based on transformer with hollow attention
Hongyi Wang, Di Yan, Yunpeng Li, Limei Song
Abstract

A common problem in object detection is that image features are not fully expressed. A second issue is that static query selection in detection transformer (DETR)-like models adapts poorly to different datasets because the number of selected object queries is fixed. To address these problems, hollow attention (HA) and dynamic query selection (DQS) modules are proposed and combined into a network, HA-DQS-Net. HA integrates specially designed masks into self-attention to better combine feature information along the channel and spatial directions, thereby learning more complex and comprehensive target features. DQS improves on static query selection in current DETR-like models by dynamically choosing the number of object queries based on the actual number of targets in the image, which improves model accuracy. HA-DQS-Net, which combines the advantages of HA and DQS, achieves competitive object detection performance. The effectiveness of the approach is validated on PASCAL VOC and a self-built smoking dataset. Notably, all average precision (AP) metrics improve when HA is applied to different DETR-like models, which demonstrates the generality of the HA module.
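The contrast between static and dynamic query selection can be illustrated with a minimal sketch. The abstract does not give the DQS selection rule, so everything here is an assumption for illustration only: the function names, the score threshold used to estimate the number of targets, and the clamping bounds are all hypothetical and are not taken from the paper.

```python
def static_query_selection(scores, k):
    """Static baseline: always keep the k highest-scoring candidates."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


def dynamic_query_selection(scores, threshold=0.5, min_queries=1, max_queries=300):
    """Hypothetical dynamic variant: the query count follows the image content.

    The threshold rule below is an assumed stand-in for whatever estimate
    of the target count the actual DQS module uses.
    """
    # Estimate how many targets are present by counting confident candidates.
    estimated = sum(1 for s in scores if s >= threshold)
    # Clamp the budget to [min_queries, max_queries] and to the candidate count.
    k = max(min_queries, min(estimated, max_queries, len(scores)))
    return static_query_selection(scores, k)


# Toy per-candidate confidence scores (made up for this example).
scores = [0.91, 0.12, 0.77, 0.05, 0.33, 0.64]
print(static_query_selection(scores, 4))  # fixed budget of 4 queries
print(dynamic_query_selection(scores))    # budget set by the threshold
```

With a fixed budget, the static version keeps low-confidence candidates whenever the image contains fewer targets than the budget; the dynamic version shrinks or grows the query set per image, which is the behavior the abstract attributes to DQS.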

© 2024 SPIE and IS&T
Hongyi Wang, Di Yan, Yunpeng Li, and Limei Song "HA-DQS-Net: dynamic query design based on transformer with hollow attention," Journal of Electronic Imaging 33(1), 013033 (5 February 2024). https://doi.org/10.1117/1.JEI.33.1.013033
Received: 21 June 2023; Accepted: 11 January 2024; Published: 5 February 2024
KEYWORDS
Matrices

Transformers

Object detection

Design

Education and training

Data modeling

Head
