LSOTB-TIR: A Large-Scale High-Diversity Thermal Infrared Object Tracking Benchmark

Authors:
Qiao Liu (Harbin Institute of Technology, Shenzhen, Shenzhen, China)
Xin Li (Harbin Institute of Technology, Shenzhen, Shenzhen, China)
Zhenyu He (Harbin Institute of Technology, Shenzhen & Peng Cheng Laboratory, Shenzhen, China)
Chenglong Li (Anhui University, Hefei, China)
Jun Li (Harbin Institute of Technology, Shenzhen, Shenzhen, China)
Zikun Zhou (Harbin Institute of Technology, Shenzhen, Shenzhen, China)
Di Yuan (Harbin Institute of Technology, Shenzhen, Shenzhen, China)
Jing Li (Harbin Institute of Technology, Shenzhen, Shenzhen, China)
Kai Yang (Harbin Institute of Technology, Shenzhen, Shenzhen, China)
Nana Fan (Harbin Institute of Technology, Shenzhen, Shenzhen, China)
Feng Zheng (Southern University of Science and Technology, Shenzhen, China)
MM '20: Proceedings of the 28th ACM International Conference on Multimedia, October 2020, Pages 3847–3856
https://doi.org/10.1145/3394171.3413922

Published: 12 October 2020

ABSTRACT

In this paper, we present a Large-Scale and high-diversity general Thermal InfraRed (TIR) Object Tracking Benchmark, called LSOTB-TIR, which consists of an evaluation dataset and a training dataset with a total of 1,400 TIR sequences and more than 600K frames. We annotate the bounding boxes of objects in every frame of all sequences, generating over 730K bounding boxes in total. To the best of our knowledge, LSOTB-TIR is the largest and most diverse TIR object tracking benchmark to date. To evaluate a tracker on different attributes, we define 4 scenario attributes and 12 challenge attributes in the evaluation dataset. By releasing LSOTB-TIR, we encourage the community to develop deep-learning-based TIR trackers and to evaluate them fairly and comprehensively. We evaluate and analyze more than 30 trackers on LSOTB-TIR to provide a series of baselines, and the results show that deep trackers achieve promising performance. Furthermore, we re-train several representative deep trackers on LSOTB-TIR, and the results demonstrate that the proposed training dataset significantly improves the performance of deep TIR trackers. Code and dataset are available at https://github.com/QiaoLiuHit/LSOTB-TIR.
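
As a rough illustration of attribute-based evaluation, the sketch below averages a tracker's per-sequence score over the sequences tagged with each attribute. It is a minimal sketch under stated assumptions, not the benchmark's evaluation toolkit: the abstract does not specify the scoring metric or the attribute names, so the function `per_attribute_scores`, the sequence names, the attribute labels, and the score values are all hypothetical placeholders.

```python
from collections import defaultdict
from statistics import mean


def per_attribute_scores(sequence_scores, sequence_attributes):
    """Average per-sequence scores over the sequences tagged with each attribute.

    sequence_scores:     dict mapping sequence name -> score (any scalar metric)
    sequence_attributes: dict mapping sequence name -> list of attribute labels
    Both structures are assumptions made for this sketch.
    """
    grouped = defaultdict(list)
    for name, score in sequence_scores.items():
        # A sequence may carry several attributes; it counts toward each of them.
        for attribute in sequence_attributes.get(name, ()):
            grouped[attribute].append(score)
    return {attribute: mean(scores) for attribute, scores in grouped.items()}


# Hypothetical scores and labels, chosen only to make the example concrete.
scores = {"seq_a": 0.5, "seq_b": 0.25, "seq_c": 0.75}
attributes = {
    "seq_a": ["challenge_1", "scenario_1"],
    "seq_b": ["challenge_1"],
    "seq_c": ["scenario_1"],
}

report = per_attribute_scores(scores, attributes)
for attribute in sorted(report):
    print(attribute, report[attribute])
```

Grouping by attribute in this way lets one tracker be compared against another on a specific difficulty or scenario rather than only on an overall average.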

Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision problems
        Tracking
      2. Computer vision representations
        Image representations

Published in
MM '20: Proceedings of the 28th ACM International Conference on Multimedia
October 2020
4889 pages
ISBN: 9781450379885
DOI: 10.1145/3394171
General Chairs:
Chang Wen Chen (Chinese University of Hong Kong, Shenzhen, China)
Rita Cucchiara (UNIMORE, Italy)
Xian-Sheng Hua (Alibaba Group, China)
Program Chairs:
Guo-Jun Qi (Futurewei Technologies, USA)
Elisa Ricci (UNITN & Fondazione Bruno Kessler, Italy)
Zhengyou Zhang (Tencent, China)
Roger Zimmermann (National University of Singapore, Singapore)
Copyright © 2020 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
deep representation learning
thermal infrared dataset
thermal infrared object tracking
Qualifiers
- research-article
Acceptance Rates
Overall Acceptance Rate: 995 of 4,171 submissions, 24%