ABSTRACT
In this paper, we present a Large-Scale and high-diversity general Thermal InfraRed (TIR) Object Tracking Benchmark, called LSOTB-TIR, which consists of an evaluation dataset and a training dataset with a total of 1,400 TIR sequences and more than 600K frames. We annotate the bounding boxes of objects in every frame of all sequences, generating over 730K bounding boxes in total. To the best of our knowledge, LSOTB-TIR is the largest and most diverse TIR object tracking benchmark to date. To evaluate a tracker on different attributes, we define 4 scenario attributes and 12 challenge attributes in the evaluation dataset. By releasing LSOTB-TIR, we encourage the community to develop deep-learning-based TIR trackers and to evaluate them fairly and comprehensively. We evaluate and analyze more than 30 trackers on LSOTB-TIR to provide a series of baselines, and the results show that deep trackers achieve promising performance. Furthermore, we re-train several representative deep trackers on LSOTB-TIR, and the results demonstrate that the proposed training dataset significantly improves the performance of deep TIR trackers. The code and dataset are available at https://github.com/QiaoLiuHit/LSOTB-TIR.