HA-DQS-Net: dynamic query design based on transformer with hollow attention

Hongyi Wang; Di Yan; Yunpeng Li; Limei Song

doi:10.1117/1.JEI.33.1.013033

Journal of Electronic Imaging, Vol. 33, Issue 1, 013033 (February 2024). https://doi.org/10.1117/1.JEI.33.1.013033

Abstract

A common problem in object detection is that image features are not fully expressed. Another is that static query selection in detection transformer (DETR)-like models adapts poorly to different datasets because the number of selected object queries is fixed. To address these problems, hollow attention (HA) and dynamic query selection (DQS) modules are proposed and combined into a network, HA-DQS-Net. HA integrates specially designed masks into self-attention to better combine feature information along the channel and spatial directions, thereby learning more complex and comprehensive target features. DQS improves on the static query selection of current DETR-like models by dynamically choosing the number of object queries according to the actual number of targets in the image, which improves model accuracy. HA-DQS-Net, which combines the advantages of HA and DQS, achieves competitive object detection performance. The effectiveness of the approach is validated on PASCAL VOC and a self-built smoking dataset. Notably, all AP metrics improve when HA is applied to different DETR-like models, which demonstrates the generality of the HA module.
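The two ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the HA mask pattern or the rule DQS uses to pick the query count. The function names, the `threshold`, `k_min`, and `k_max` parameters, and the mask used in the usage note are all hypothetical and chosen only for illustration.

```python
import math


def masked_self_attention(q, k, v, mask):
    """Scaled dot-product attention with a boolean mask.

    mask[i][j] is True where query i may attend to key j. The mask pattern
    is supplied by the caller; the pattern HA actually uses is not given in
    the abstract. Every row must allow at least one key.
    """
    d = len(q[0])
    out = []
    for i, qi in enumerate(q):
        # Blocked positions get -inf so that softmax assigns them zero weight.
        scores = [
            sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d) if mask[i][j] else -math.inf
            for j, kj in enumerate(k)
        ]
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        weights = [e / z for e in exps]
        out.append([sum(w * vj[c] for w, vj in zip(weights, v)) for c in range(len(v[0]))])
    return out


def select_queries_static(scores, k):
    """Static selection: always keep the k highest-scoring candidates."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


def select_queries_dynamic(scores, threshold=0.5, k_min=1, k_max=900):
    """Dynamic selection (assumed rule): the number of queries follows the
    number of candidates whose score exceeds `threshold`, clamped to
    [k_min, k_max] and to the number of candidates available.
    """
    n_confident = sum(1 for s in scores if s > threshold)
    k = min(max(n_confident, k_min), k_max, len(scores))
    return select_queries_static(scores, k)
```

As a usage example, `select_queries_dynamic([0.9, 0.1, 0.7, 0.3, 0.6])` keeps three candidates because three scores exceed 0.5, whereas `select_queries_static` with `k=2` keeps two regardless of image content. For the attention function, passing a mask that blocks each position from attending to itself (an arbitrary example pattern) makes every output row a weighted mix of the other positions' values only.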

Citation

Hongyi Wang, Di Yan, Yunpeng Li, and Limei Song "HA-DQS-Net: dynamic query design based on transformer with hollow attention," Journal of Electronic Imaging 33(1), 013033 (5 February 2024). https://doi.org/10.1117/1.JEI.33.1.013033

Received: 21 June 2023; Accepted: 11 January 2024; Published: 5 February 2024

JOURNAL ARTICLE
17 PAGES

KEYWORDS

Matrices

Transformers

Object detection

Design

Education and training

Data modeling

Head
