28 November 2023 Aerial tracking of camouflaged people in woodlands
Yang Liu, Cong-Qing Wang, Bin Xu, Yong-Jun Zhou
Abstract

With the remarkable advances in unmanned aerial vehicles (UAVs) and machine vision, aerial tracking has attracted wide attention from researchers. Previous tracking methods were mostly developed for clean, well-lit environments, which makes it challenging for them to track camouflaged people rapidly and accurately in woodlands. We develop a transformer-based framework for camouflaged people aerial tracking (CPAT). Specifically, a camouflaged people discovery strategy is proposed to rapidly generate training samples from unlabeled videos captured by the UAV. Dynamic programming is also employed to filter noise and generate smooth candidate frames. To exploit multilevel feature information, a transformer fusion framework is designed to integrate shallow spatial information with deep semantic features. To reduce computational cost, a spatial attention reduction mechanism is embedded in the multihead attention for fast tracking. Further, we build a dataset called Cam235 for evaluating camouflaged people tracking, which consists of 85 manually labeled test sequences and an unlabeled training set of more than 100k frames. Extensive experiments on Cam235-test and popular tracking datasets show that CPAT is superior to other trackers for practical application. Under the most challenging condition of camouflaged people tracking, CPAT achieves a precision of 67.9%, surpassing state-of-the-art trackers by large margins.
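The abstract does not say how dynamic programming is used to filter noisy candidates, so the following is only an illustrative sketch under stated assumptions and not the authors' method. It assumes each frame comes with a few candidate boxes and confidence scores, and that a smooth track is one that maximizes total confidence minus a penalty on how far the box centre jumps between consecutive frames. The box format, the name `smooth_candidates`, and the `jump_weight` parameter are all hypothetical.

```python
from typing import List, Tuple

# Hypothetical box format: (centre_x, centre_y, width, height).
Box = Tuple[float, float, float, float]
Candidate = Tuple[Box, float]  # (box, confidence score)


def _centre_distance(a: Box, b: Box) -> float:
    """Euclidean distance between the centres of two boxes."""
    return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5


def smooth_candidates(
    frames: List[List[Candidate]],
    jump_weight: float = 0.01,  # assumed trade-off between score and motion
) -> List[int]:
    """Pick one candidate index per frame with a Viterbi-style DP.

    The selected path maximizes the sum of confidence scores minus
    jump_weight times the centre distance between consecutive picks.
    """
    if not frames or any(len(f) == 0 for f in frames):
        raise ValueError("every frame needs at least one candidate")

    # best[t][j]: value of the best path ending at candidate j of frame t.
    best = [[score for _, score in frames[0]]]
    # back[t][j]: index in frame t-1 that the best path to (t, j) came from.
    back = [[-1] * len(frames[0])]

    for t in range(1, len(frames)):
        row_best, row_back = [], []
        for box, score in frames[t]:
            # Value of arriving from each predecessor, after the motion penalty.
            options = [
                best[t - 1][i] - jump_weight * _centre_distance(prev_box, box)
                for i, (prev_box, _) in enumerate(frames[t - 1])
            ]
            i_star = max(range(len(options)), key=options.__getitem__)
            row_best.append(options[i_star] + score)
            row_back.append(i_star)
        best.append(row_best)
        back.append(row_back)

    # Backtrack from the best final candidate to recover the whole path.
    j = max(range(len(best[-1])), key=best[-1].__getitem__)
    path = [j]
    for t in range(len(frames) - 1, 0, -1):
        j = back[t][j]
        path.append(j)
    path.reverse()
    return path
```

In this sketch, an isolated high-confidence detection far from the rest of the track is rejected when `jump_weight` is positive, and is picked again when `jump_weight` is set to zero, which reduces the selection to a per-frame maximum of the score.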

© 2023 SPIE and IS&T
Yang Liu, Cong-Qing Wang, Bin Xu, and Yong-Jun Zhou "Aerial tracking of camouflaged people in woodlands," Journal of Electronic Imaging 32(6), 063018 (28 November 2023). https://doi.org/10.1117/1.JEI.32.6.063018
Received: 6 July 2023; Accepted: 30 October 2023; Published: 28 November 2023
KEYWORDS
Video

Unmanned aerial vehicles

Education and training

Transformers

Feature fusion

Image segmentation

Visualization
