Exploring the State-of-the-Art in Multi-Object Tracking: A Comprehensive Survey, Evaluation, Challenges, and Future Directions

Du, Chenjie; Lin, Chenwei; Jin, Ran; Chai, Bencheng; Yao, Yingbiao; Su, Siyu

doi:10.1007/s11042-023-17983-2

Published: 09 February 2024

Multimedia Tools and Applications

Chenjie Du¹,
Chenwei Lin²,
Ran Jin¹,
Bencheng Chai¹,
Yingbiao Yao² (ORCID: orcid.org/0000-0001-7946-6070) &
…
Siyu Su²

Abstract

Multiple object tracking (MOT), a typical application scenario of computer vision, has attracted significant attention from both the academic and industrial communities. With its rapid development, MOT has become a hot topic. However, maintaining robust MOT in complex scenarios remains challenging because of irregular motion patterns, similar appearances, and frequent occlusions. Based on an extensive investigation of state-of-the-art MOT, this survey makes the following contributions: 1) reviewing preceding MOT approaches and current classifications; 2) surveying MOT metrics and benchmark databases; 3) evaluating frequently employed MOT approaches; 4) discussing the main challenges for MOT; and 5) proposing potential directions for the development of future MOT approaches. In doing so, it strives to provide a systematic and comprehensive overview of existing MOT methods from SDE to TBA perspectives, thereby promoting further research into this emerging and important field.

Data availability

All relevant data are within the paper.

References

Seidenschwarz J, Brasó G, Serrano VC, Elezi I, Leal-Taixé L (2023) Simple cues lead to a strong multi-object tracker. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 13813–13823. https://doi.org/10.1109/CVPR52729.2023.01327
Book Google Scholar
Li S, Fischer T, Ke L, Ding H, Danelljan M, Yu F (2023) Ovtrack: Open vocabulary multiple object tracking. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 5567–5577. https://doi.org/10.1109/CVPR52729.2023.00539
Book Google Scholar
Wu D, Han W, Wang T, Dong X, Zhang X, Shen J (2023) Referring multi-object tracking. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 14633–14642. https://doi.org/10.1109/CVPR52729.2023.01406
Book Google Scholar
Meimetis D, Daramouskas I, Perikos I, Hatzilygeroudis I (2023) Real-time multiple object tracking using deep learning methods. Neural Comput Appl 35(1):89–118
Article Google Scholar
Yin J, Wang W, Meng Q, Yang R, Shen J (2020) A unified object motion and affinity model for online multi-object tracking. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 6768–6777. https://doi.org/10.1109/CVPR42600.2020.00680
Book Google Scholar
Welch G, Bishop G (1995) An introduction to the kalman filter. In: Proceedings of international conference on computer graphics and interactive techniques, pp 1–16
Google Scholar
Hu W, Li X, Luo W, Zhang X, Maybank S, Zhang Z (2012) Single and multiple object tracking using log-euclidean riemannian subspace and block-division appearance model. IEEE Trans Pattern Anal Mach Intell 34(12):2420–2440
Article PubMed Google Scholar
Zhang L, Van Der Maaten L (2013) Preserving structure in model-free tracking. IEEE Trans Pattern Anal Mach Intell 36(4):756–769
Article Google Scholar
Morimitsu H, Bloch I, Cesar-Jr RM (2017) Exploring structure for long-term tracking of multiple objects in sports videos. Comput Vis Image Underst 159:89–104
Article Google Scholar
Ošep A, Mehner W, Voigtlaender P, Leibe B (2018) Track, then decide: Category-agnostic vision-based multi-object tracking. In: Proceedings of the IEEE international conference on robotics and automation (ICRA), pp 3494–3501. https://doi.org/10.1109/ICRA.2018.8460975
Chapter Google Scholar
Zhang L, Maaten L (2013) Structure preserving object tracking. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1838–1845. https://doi.org/10.1109/CVPR.2013.240
Chapter Google Scholar
Bewley A, Ge Z, Ott L, Ramos F, Upcroft B (2016) Simple online and realtime tracking. In: Proceedings of the IEEE international conference on image processing (ICIP), pp 3464–3468. https://doi.org/10.1109/ICIP.2016.7533003
Chapter Google Scholar
Wojke N, Bewley A, Paulus D (2017) Simple online and realtime tracking with a deep association metric. In: Proceedings of the IEEE international conference on image processing (ICIP), pp 3645–3649. https://doi.org/10.1109/ICIP.2017.8296962
Chapter Google Scholar
Cao J, Pang J, Weng X, Khirodkar R, Kitani K (2023) Observation-centric sort: Rethinking sort for robust multi-object tracking. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 9686–9696. https://doi.org/10.1109/CVPR52729.2023.00934
Book Google Scholar
Meneses M, Matos L, Prado B, Carvalho A, Macedo H (2020) Learning to associate detections for real-time multiple object tracking. https://doi.org/10.48550/arXiv.2007.06041
Aharon N, Orfaig R, Bobrovsky BZ (2022) Bot-sort: Robust associations multi-pedestrian tracking. Comput Vis Pattern Recognit. https://doi.org/10.48550/arXiv.2206.14651
Du Y, Zhao Z, Song Y, Zhao Y, Su F, Gong T, Meng H (2023) Strongsort: Make deepsort great again. IEEE Trans Multimed. https://doi.org/10.1109/TMM.2023.3240881
Zhang Y, Sun P, Jiang Y, Yu D, Weng F, Yuan Z, Luo P, Liu W, Wang X (2022) Bytetrack: Multi-object tracking by associating every detection box. In: Proceedings of the european conference on computer vision, pp 1–21. https://doi.org/10.48550/arXiv.2110.06864
Chapter Google Scholar
Ren H, Han S, Ding H, Zhang Z, Wang H, Wang F (2023) Focus on details: Online multi-object tracking with diverse fine-grained representation. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 11289–11298. https://doi.org/10.1109/CVPR52729.2023.01086
Book Google Scholar
Kong J, Mo E, Jiang M, Liu T (2022) Motfr: Multiple object tracking based on feature recoding. IEEE Trans Circuits Syst Video Technol 32(11):7746–7757
Article Google Scholar
Jiang M, Zhou C, Kong J (2022) Aoh: Online multiple object tracking with adaptive occlusion handling. IEEE Signal Process Lett 29:1644–1648
Article ADS Google Scholar
Li C, Dobler G, Feng X, Tracknet WY (2019) Tracknet: Simultaneous object detection and tracking and its application in traffic video analysis. https://doi.org/10.48550/arXiv.1902.01466
Sun S, Akhtar N, Song H, Mian A, Shah M (2019) Deep affinity network for multiple object tracking. IEEE Trans Pattern Anal Mach Intell 43(1):104–119
Google Scholar
Liang C, Zhang Z, Zhou X, Li B, Zhu S, Hu W (2022) Rethinking the competition between detection and reid in multiobject tracking. IEEE Trans Image Process 31:3182–3196
Article ADS PubMed Google Scholar
Chu P, Wang J, You Q, Ling H, Liu Z (2023) Transmot: Spatial-temporal graph transformer for multiple object tracking. In: Proceedings of the IEEE/CVF winter conference on applications of computer vision, pp 4870–4880. https://doi.org/10.1109/WACV56688.2023.00485
Book Google Scholar
Xu J, Cao Y, Zhang Z, Hu H (2019) Spatial-temporal relation networks for multi-object tracking. In: Proceedings of the IEEE/CVF international conference on computer vision, pp 3988–3998. https://doi.org/10.1109/ICCV.2019.00409
Book Google Scholar
Ciaparrone G, Sánchez FL, Tabik S, Troiano L, Tagliaferri R, Herrera F (2020) Deep learning in video multi-object tracking: A survey. Neurocomputing 381:61–88
Article Google Scholar
Emami P, Pardalos PM, Elefteriadou L, Ranka S (2020) Machine learning methods for data association in multi-object tracking. ACM Computing Surveys (CSUR) 53(4):1–34
Article Google Scholar
Rakai L, Song H, Sun S, Zhang W, Yang Y (2022) Data association in multiple object tracking: A survey of recent techniques. Expert Syst Appl 192:116300
Article Google Scholar
Park Y, Dang LM, Lee S, Han D, Moon H (2021) Multiple object tracking in deep learning approaches: A survey. Electronics 10(19):2406
Article Google Scholar
Camplani M, Paiement A, Mirmehdi M, Damen D, Hannuna S, Burghardt T, Tao L (2017) Multiple human tracking in rgbdepth data: A survey. IET Comput Vision 11(4):265–285
Article Google Scholar
Luo W, Xing J, Milan A, Zhang X, Liu W, Kim TK (2021) Multiple object tracking: A literature review. Artif Intell 293:103448
Article MathSciNet Google Scholar
Cao ZQ, Sai B, Lu X (2020) Review of pedestrian tracking: Algorithms and applications. Acta Phys Sin 69(8):084203-1-084203-18
Article Google Scholar
Pal SK, Pramanik A, Maiti J, Mitra P (2021) Deep learning in multi-object detection and tracking: state of the art. Appl Intell 51:6400–6429
Article Google Scholar
Sun P, Cao JK, Jiang Y, Yuan ZH, Bai S, Kitani K, Luo P (2022) DanceTrack: Multi-object tracking in uniform appearance and diverse motion. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (CVPR), pp 20961–20970. https://doi.org/10.1109/CVPR52688.2022.02032
Chapter Google Scholar
Krizhevsky A, Sutskever I, Hinton GE (2012) Imagenet classification with deep convolutional neural networks. Commun ACM 60(6):84–90. https://doi.org/10.1145/3065386
Article Google Scholar
Szegedy C, Toshev A, Erhan D (2013) Deep neural networks for object detection. In: Proceedings of the neural information processing systems, pp 2553–2561
Google Scholar
Sermanet P, Eigen D, Zhang X, Mathieu M, Fergus R, LeCun Y (2014) Overfeat: Integrated recognition, localization and detection using convolutional networks. In: Proceedings of the international conference on learning representations
Google Scholar
Ren S, He K, Girshick R, Sun J (2017) Faster R-CNN: Towards real-time object detection with region proposal networks. IEEE Trans Pattern Anal Mach Intell 39(6):1137–1149. https://doi.org/10.1109/TPAMI.2016.2577031
Article PubMed Google Scholar
He K, Gkioxari G, Dollár P, Girshick R (2017) Mask R-CNN. In: Proceedings of the IEEE international conference on computer vision, pp 2961–2969. https://doi.org/10.1109/ICCV.2017.322
Book Google Scholar
Sun J, Chen L, Xie Y, Zhang S, Jiang Q, Zhou X, Bao H (2020) Disp R-CNN: Stereo 3d object detection via shape prior guided instance disparity estimation. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 10548–10557. https://doi.org/10.1109/CVPR42600.2020.01056
Book Google Scholar
Liu W, Anguelov D, Erhan D, Szegedy C, Reed S, Fu CY, Berg A.C (2016) Ssd: Single shot multibox detector. In: Proceedings of the european conference on computer vision, pp 21–37. https://doi.org/10.1007/978-3-319-46448-0_2
Lin TY, Goyal P, Girshick R, He K, Dollár P (2017) Focal loss for dense object detection. IEEE Trans Pattern Anal Mach Intell 99:2999–3007
Google Scholar
Wang CY, Bochkovskiy A, Liao HYM (2023) Yolov7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 7464–7475. https://doi.org/10.1109/CVPR52729.2023.00721
Book Google Scholar
Carion N, Massa F, Synnaeve G, Usunier N, Kirillov A, Zagoruyko S (2020) End-to-end object detection with transformers. In: Proceedings of the european conference on computer vision (ECCV), pp 213–229. https://doi.org/10.1007/978-3-030-58452-8_13
Chapter Google Scholar
Gupta A, Narayan S, Joseph KJ, Khan S, Khan FS, Shah M (2022) Ow-detr: Open-world detection transformer. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 9225–9234. https://doi.org/10.1109/CVPR52688.2022.00902
Book Google Scholar
Zhu X, Su W, Lu L, Li B, Wang X, Dai J (2020) Deformable detr: Deformable transformers for end-to-end object detection. https://doi.org/10.48550/arXiv.2010.04159
Sun P, Tan M, Wang W, Liu C, Xia F, Leng Z, Anguelov D (2022) Swformer: Sparse window transformer for 3d object detection in point clouds. In: Proceedings of the European conference on computer vision, pp 426–442. https://doi.org/10.1007/978-3-031-20080-9_25
Book Google Scholar
Wang X, Doretto G, Sebastian T, Rittscher J, Tu P (2007) Shape and appearance context modeling. In: Proceedings of the IEEE 11^th international conference on computer vision, pp 1–8. https://doi.org/10.1109/ICCV.2007.4409019
Farenzena M, Bazzani L, Perina A, Murino V, Cristani M (2010) Person re-identification by symmetry-driven accumulation of local features. In: Proceedings of the IEEE computer society conference on computer vision and pattern recognition, pp 2360–2367. https://doi.org/10.1109/CVPR.2010.5539926
Book Google Scholar
Zhao R, Ouyang W, Wang X (2013) Unsupervised salience learning for person re-identification. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 3586–3593. https://doi.org/10.1109/CVPR.2013.460
Book Google Scholar
Liao S, Hu Y, Zhu X, Li SZ (2015) Person re-identification by local maximal occurrence representation and metric learning. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2197–2206. https://doi.org/10.1109/CVPR.2015.7298832
Book Google Scholar
Zhang Y, Wang C, Wang X, Zeng W, Liu W (2021) Fairmot: On the fairness of detection and re-identification in multiple object tracking. Int J Comput Vision 129:3069–3087
Article Google Scholar
Xiao T, Li S, Wang B, Lin WX (2017) Joint detection and identification feature learning for person search. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3415–3424. https://doi.org/10.1109/CVPR.2017.360
Book Google Scholar
Liu H, Feng J, Qi M, Jiang J, Yan S (2017) End-to-end comparative attention networks for person re-identification. IEEE Trans Image Process 26(7):3492–3506
Article ADS MathSciNet Google Scholar
Chang X, Huang PY, Shen YD, Liang X, Yang Y, Hauptmann AG (2018) Rcaa: Relational context-aware agents for person search. In: Proceedings of the European conference on computer vision (ECCV), pp 84–100. https://doi.org/10.1007/978-3-030-01240-3_6
Wang Z, Zheng L, Liu Y, Li Y, Wang S (2020) Towards real-time multi-object tracking. In: Proceedings of the European conference on computer vision (ECCV), pp 107–122. https://doi.org/10.1007/978-3-030-58621-8_7
Lu Z, Rathod V, Votel R, Huang J (2020) Retinatrack: Online single stage joint detection and tracking. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 14668–14678. https://doi.org/10.1109/CVPR42600.2020.01468
Book Google Scholar
Chen D, Zhang S, Yang J, Schiele B (2021) Norm-aware embedding for efficient person search and tracking. Int J Comput Vision 129:3154–3168
Article Google Scholar
Yoon JH, Lee CR, Yang MH, Yoon KJ (2016) Online multi-object tracking via structural constraint event aggregation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1392–1400. https://doi.org/10.1109/CVPR.2016.155
Book Google Scholar
Bochinski E, Eiselein V, Sikora T (2017) High-speed tracking-by-detection without using image information. In: Proceedings of the 14^th IEEE international conference on advanced video and signal based surveillance (AVSS), pp 1–6. https://doi.org/10.1109/avss.2017.8078516
Zhou H, Ouyang W, Cheng J, Wang X, Li H (2018) Deep continuous conditional random fields with asymmetric inter-object constraints for online multi-object tracking. IEEE Trans Circuits Syst Video Technol 29(4):1011–1022
Article Google Scholar
Shan C, Wei C, Deng B, Huang J, Hua XS, Cheng X, Liang K (2020) Tracklets predicting based adaptive graph tracking. https://doi.org/10.48550/arXiv.2010.09015
Girbau A, Giró-i-Nieto X, Rius I, Marqués F (2021) Multiple object tracking with mixture density networks for trajectory estimation. https://doi.org/10.48550/arXiv:2106.10950
Peng J, Wang C, Wan F, Wu Y, Wang Y, Tai Y, Wang C, Li J, Huang F, Fu Y (2020) Chained-tracker: Chaining paired attentive regression results for end-to-end joint multiple-object detection and tracking. In: Proceedings of the European conference on computer vision (ECCV), pp 145–161. https://doi.org/10.1007/978-3-030-58548-8_9
Pang B, Li Y, Zhang Y, Li LC (2020) Tubetk: Adopting tubes to track multi-object in a one-step training model. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 6308–6318. https://doi.org/10.1109/CVPR42600.2020.00634
Book Google Scholar
Han S, Huang P, Wang H, Yu E, Liu D, Pan X (2022) Mat: Motion-aware multi-object tracking. Neurocomputing 476:75–86
Article Google Scholar
Bergmann P, Meinhardt T, Leal-Taixe L (2019) Tracking without bells and whistles. In: Proceedings of the IEEE/CVF international conference on computer vision, pp 941–951. https://doi.org/10.1109/ICCV.2019.00103
Book Google Scholar
Yu E, Li Z, Han S, Wang H (2022) Relationtrack: Relation-aware multiple object tracking with decoupled representation. IEEE Trans Multimed. https://doi.org/10.1109/TMM.2022.3150169
Liang C, Zhang Z, Zhou X, Li B, Lu Y (2022) One more check: Making “fake background” be tracked again. In: Proceedings of the AAAI conference on artificial intelligence, pp 1546–1554. https://doi.org/10.1609/aaai.v36i2.20045
Book Google Scholar
Liu Q, Chen D, Chu Q, Yuan L, Liu B, Zhang L, Yu N (2022) Online multi-object tracking with unsupervised re-identification learning and occlusion estimation. Neurocomputing 483:333–347
Article Google Scholar
Cui YM, Yan LQ, Cao ZW, Liu DF (2021) TF-Blender: Temporal feature blender for video object detection. In: Proceedings of the IEEE/CVF international conference on computer vision (ICCV), pp 8118–8127. https://doi.org/10.1109/ICCV48922.2021.00803
Liu DF, Cui YM, Chen YJ, Zhang JY, Fan B (2020) Video object detection for autonomous driving: Motion-aid feature calibration. Neurocomputing 409:1–11
Article Google Scholar
Sheng H, Zhang Y, Wu YB, Wang S, Lyu WF, Ke W, Xiong Z (2020) Hypothesis testing based tracking with spatio-temporal joint interaction modeling. IEEE Trans Circuits Syst Video Technol 30(9):2971–2983
Article Google Scholar
Wang S, Sheng H, Zhang Y, Wu YB, Xiong Z (2021) A general recurrent tracking framework without real data. In: Proceedings of the IEEE/CVF international conference on computer vision (ICCV), pp 13219–13228. https://doi.org/10.1109/ICCV48922.2021.01297
Chapter Google Scholar
Wu H, Nie JH, Zhu ZM, He ZW, Gao MY (2022) Leveraging temporal-aware FNE-grained features for robust multiple object tracking. J Supercomput 79:2910–2931
Article Google Scholar
Lang C, Braun A, Schillingmann L, Valada A (2023) Self-supervised multi-object tracking for autonomous driving from consistency across timescales. IEEE Robot Autom Lett 8(11):7711–7718
Article Google Scholar
Zhou TF, Li JW, Li XY, Shao L (2021) Target-aware object discovery and association for unsupervised video multi-object segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR), pp 6985–6994. https://doi.org/10.1109/CVPR46437.2021.00691
Peng JL, Wang T, Lin WY, Wang J, See J, Wen SL, Ding E (2020) TPM: Multiple object tracking with tracklet-plane matching. Pattern Recogn 107:107480
Article Google Scholar
Mhalla A, Chateau T (2019) Improving multi-object tracking-by-detection model using a temporal interlaced encoding and a specialized deep detector. In: Proceedings of the IEEE intelligent vehicles symposium, pp 510–516. https://doi.org/10.1109/IVS.2019.8814102
Book Google Scholar
Zhao SY, Wu YB, Wang S, Ke W, Sheng H (2022) Mask guided spatial-temporal fusion network for multiple object tracking. In: Proceedings of the IEEE international conference on image processing (ICIP), pp 3231–3235. https://doi.org/10.1109/ICIP46576.2022.9898054
Chapter Google Scholar
Zhang JJ, Wang MY, Jiang HR, Zhang XY, Yan CG, Zeng D (2023) STAT: Multi-object tracking based on spatio-temporal topological constraints. IEEE Trans Multimed. https://doi.org/10.1109/TMM.2023.3323852
You SS, Yao HT, Xu CS (2022) Multi-object tracking with spatial-temporal topology-based detector. IEEE Trans Circuits Syst Video Technol 32(5):3023–3035
Article Google Scholar
Pang ZQ, Li J, Tokmakov P, Chen D, Zagoruyko S, Wang YX (2023) Standing between past and future spatio-temporal modeling for multi-camera 3D multi-object tracking. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR), pp 17928–17938. https://doi.org/10.1109/CVPR52729.2023.01719
Wang YX, Kitani K, Weng XS (2021) Joint object detection and multi-object tracking with graph neural networks. In: Proceedings of the IEEE international conference on robotics and automation (ICRA), pp 13708–13715. https://doi.org/10.1109/ICRA48506.2021.9561110
Wang SK, Sun YX, Wang Z, Liu M (2024) ST-TrackNet: A multiple-object tracking network using spatio-temporal information. IEEE Trans Autom Sci Eng 21(1):284–295. https://doi.org/10.1109/TASE.2022.3216450
Article Google Scholar
Zhu TY, Hiller M, Ehsanpour M, Ma RK, Drummond T, Rezatofighi H (2021) Looking beyond two frames: End-to-end multi-object tracking using spatial and temporal transformers. IEEE Trans Pattern Anal Mach Intell 45:12783–12797
Google Scholar
Hu MJ, Zhu XT, Wang HT, Cao SX, Liu C, Song Q (2023) STDFormer: Spatial-temporal motion transformer for multiple object tracking. IEEE Trans Circuits Syst Video Technol 33(11):6571–6594
Article Google Scholar
Yang M, Wu Y, Jia Y (2017) A hybrid data association framework for robust online multi-object tracking. IEEE Trans Image Process 26(12):5667–5679
Article ADS MathSciNet Google Scholar
Yang M, Jia Y (2016) Temporal dynamic appearance modeling for online multi-person tracking. Comput Vis Image Underst 153:16–28
Article Google Scholar
Guo S, Wang J, Wang X, Tao D (2021) Online multiple object tracking with cross-task synergy. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 8136–8145. https://doi.org/10.1109/CVPR46437.2021.00804
Chapter Google Scholar
Xu Y, Osep A, Ban Y, Horaud R, LealTaixé L, Alameda-Pineda X (2020) How to train your deep multi-object tracker. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 6787–6796. https://doi.org/10.1109/CVPR42600.2020.00682
Book Google Scholar
Sadeghian A, Alahi A, Savarese S (2017) Tracking the untrackable: Learning to track multiple cues with long-term dependencies. In: Proceedings of the IEEE international conference on computer vision, pp 300–311. https://doi.org/10.1109/ICCV.2017.41
Book Google Scholar
Rezatofighi SH, Milan A, Zhang Z, Shi Q, Dick A, Reid I (2015) Joint probabilistic data association revisited. In: Proceedings of the IEEE international conference on computer vision, pp 3047–3055. https://doi.org/10.1109/ICCV.2015.349
Book Google Scholar
Benfold B, Reid I (2011) Stable multi-target tracking in real-time surveillance video. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 3457–3464. https://doi.org/10.1109/CVPR.2011.5995667
Book Google Scholar
Kim C, Li F, Ciptadi A, Rehg JM (2015) Multiple hypothesis tracking revisited. In: Proceedings of the IEEE international conference on computer vision, pp 4696–4704. https://doi.org/10.1109/ICCV.2015.533
Book Google Scholar
Brasó G, Leal-Taixé L (2020) Learning a neural solver for multiple object tracking. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 6247–6257. https://doi.org/10.1109/CVPR42600.2020.00628
Book Google Scholar
Gori M, Monfardini G, Scarselli F (2005) A new model for learning in graph domains. In: Proceedings of 2005 IEEE international joint conference on neural networks, pp 729–734. https://doi.org/10.1109/IJCNN.2005.1555942
Zhang L, Li Y, Nevatia R (2008) Global data association for multi-object tracking using network flows. In: Proceedings of 2008 IEEE conference on computer vision and pattern recognition, pp 1–8. https://doi.org/10.1109/CVPR.2008.4587584
Chari V, Lacoste-Julien S, Laptev I, Sivic J (2015) On pairwise costs for network flow multi-object tracking. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 5537–5545. https://doi.org/10.1109/CVPR.2015.7299193
Book Google Scholar
Butt AA, Collins RT (2013) Multi-target tracking by lagrangian relaxation to mincost network flow. In: Proceedings of the IEEE conference on computer vision and pattern recognition. IEEE, pp 1846–1853. https://doi.org/10.1109/CVPR.2013.241
Berclaz J, Fleuret F, Turetken E, Fua P (2011) Multiple object tracking using k-shortest paths optimization. IEEE Trans Pattern Anal Mach Intell 33(9):1806–1819
Article PubMed Google Scholar
Jiang H, Fels S, Little JJ (2007) A linear programming approach for multiple object tracking. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 1–8. https://doi.org/10.1109/CVPR.2007.383180
Book Google Scholar
Pirsiavash H, Ramanan D, Fowlkes CC (2011) Globally-optimal greedy algorithms for tracking a variable number of objects. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1201–1208. https://doi.org/10.1109/CVPR.2011.5995604
Book Google Scholar
Roshan Zamir A, Dehghan A, Shah M (2012) Gmcp-tracker: Global multi-object tracking using generalized minimum clique graphs. In: Proceedings of the European conference on computer vision (ECCV), pp 343–356. https://doi.org/10.1007/978-3-642-33709-3_25
Wang B, Wang G, Chan KL, Wang L (2016) Tracklet association by online target-specific metric learning and coherent dynamics estimation. IEEE Trans Pattern Anal Mach Intell 39(3):589–602
Article Google Scholar
Xiang J, Xu G, Ma C, Hou J (2020) End-to-end learning deep crf models for multi-object tracking deep crf models. IEEE Trans Circuits Syst Video Technol 31(1):275–288
Article Google Scholar
Brendel W, Amer M, Todorovic S (2011) Multiobject tracking as maximum weight independent set. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1273–1280. https://doi.org/10.1109/CVPR.2011.5995395
Book Google Scholar
Wang T, Chen K, Lin W, See J, Zhang Z, Xu Q, Jia X (2023) Spatio-temporal point process for multiple object tracking. IEEE Trans Neural Netw Learn Syst 34(4):1777–1788. https://doi.org/10.1109/TNNLS.2020.2997006
Article PubMed Google Scholar
Peng J, Gu Y, Wang Y, Wang C, Li J, Huang F (2020) Dense scene multiple object tracking with box-plane matching. In: Proceedings of the 28th ACM International Conference on Multimedia, pp 4615–4619. https://doi.org/10.1145/3394171.3416283
Ren W, Wang X, Tian J, Tang Y, Chan AB (2020) Tracking-by-counting: Using network flows on crowd density maps for tracking multiple targets. IEEE Trans Image Process 30:1439–1452
Article ADS MathSciNet PubMed Google Scholar
He Y, Wei X, Hong X, Ke W, Gong Y (2022) Identity-quantity harmonic multi-object tracking. IEEE Trans Image Process 31:2201–2215
Article ADS PubMed Google Scholar
Yu F, Li W, Li Q, Liu Y, Shi X, Yan J (2016) Poi: Multiple object tracking with high performance detection and appearance feature. In: Proceedings of the European conference on computer vision (ECCV), pp 36–42. https://doi.org/10.1007/978-3-319-48881-3_3
Fang K, Xiang Y, Li X, Savarese S (2018) Recurrent autoregressive networks for online multi-object tracking. In: Proceedings of the IEEE winter conference on applications of computer vision (WACV), pp 466–475. https://doi.org/10.1109/WACV.2018.00057
Zhou Z, Xing J, Zhang M, Hu W (2018) Online multi-target tracking with tensor-based high-order graph matching. In: Proceedings of the 24th international conference on pattern recognition (ICPR), pp 1809–1814. https://doi.org/10.1109/ICPR.2018.8545450
Mahmoudi N, Ahadi SM, Rahmati M (2019) Multi-target tracking using CNN-based features: CNNMTT. Multimed Tools Appl 78:7077–7096
Article Google Scholar
Baisa NL (2021) Occlusion-robust online multi-object visual tracking using a GM-PHD filter with CNN-based re-identification. J Vis Commun Image Represent 80:103279
Article Google Scholar
Yan LQ, Wang QF, Ma SQ, Wang JG, Yu CB (2022) Solve the puzzle of instance segmentation in videos: A weakly supervised framework with spatio-temporal collaboration. IEEE Trans Circuits Syst Video Technol 33:393–406
Article Google Scholar
Liu DF, Cui YM, Yan LQ, Mousas C, Yang B, Chen YJ (2021) Densernet: Weakly supervised visual localization using multi-scale feature aggregation. In: Proceedings of the AAAI conference on artificial intelligence, pp 6101–6109. https://doi.org/10.1609/aaai.v35i7.16760
Book Google Scholar
Bastani F, He ST, Madden S (2021) Self-supervised multi-object tracking with cross-input consistency. Adv Neural Inf Process Syst 34:13695–13706
Google Scholar
Su C, Zhang SL, Xing JL, Gao W, Tian Q (2016) Deep attributes driven multi-camera person re-identification. In: Proceedings of the European conference on computer vision (ECCV), pp 475–491. https://doi.org/10.1007/978-3-319-46475-6_30
Huang K, Lertniphonphan K, Chen F, Li J, Wang ZP (2023) Multi-object tracking by self-supervised learning appearance model. In: Proceedings of the IEEE conference on computer vision and pattern recognition workshops (CVPRW), pp 3163–3169. https://doi.org/10.1109/CVPRW59228.2023.00318
Engilberge M, Liu WZ, Fua P (2023) Multi-view tracking using weakly supervised human motion prediction. In: Proceedings of the IEEE Winter conference on applications of computer vision (WACV), pp 1582–1592. https://doi.org/10.1109/WACV56688.2023.00163
Cucchiara R, Fabbri M (2022) Fine-grained human analysis under occlusions and perspective constraints in multimedia surveillance. ACM Trans Multimed Comput Commun Appl (TOMM) 18:1–23. https://doi.org/10.1145/3476839
Article Google Scholar
Kieritz H, Hubner W, Arens M (2018) Joint detection and online multi-object tracking. In: Proceedings of the IEEE conference on computer vision and pattern recognition workshops, pp 1459–1467. https://doi.org/10.1109/CVPRW.2018.00195
Book Google Scholar
Shuai B, Berneshawi A, Wang M, Liu C, Modolo D, Li X, Tighe J (2020) Application of multi-object tracking with siamese track-RCNN to the human in events dataset. In: Proceedings of the 28th ACM international conference on multimedia, pp 4625–4629. https://doi.org/10.1145/3394171.3416297
Liu K, Jin S, Fu ZH, Chen Z, Jiang RX, Ye JP (2023) Uncertainty-aware unsupervised multi-object tracking. In: Proceedings of the IEEE International conference on computer vision, pp 9962–9971. https://doi.org/10.1109/ICCV51070.2023.00917
Book Google Scholar
Li YL, Lu Y, Li J, Wang HZ (2023) Learning to reconnect interrupted trajectories for weakly supervised multi-object tracking. In: Proceedings of the IEEE international conference on acoustics, speech and signal processing (ICASSP), pp 1–5. https://doi.org/10.1109/ICASSP49357.2023.10095463
Ruiz I, Porzi L, Bulò SR, Kontschieder P, Serrat J (2021) Weakly supervised multi-object tracking and segmentation. In: Proceedings of the IEEE winter conference on applications of computer vision (WACV), pp 125–133. https://doi.org/10.1109/WACVW52041.2021.00018
Chu P, Ling H (2019) Famnet: Joint learning of feature, affinity and multi-dimensional assignment for online multiple object tracking. In: Proceedings of the IEEE/CVF international conference on computer vision, pp 6172–6181. https://doi.org/10.1109/ICCV.2019.00627
Shuai B, Berneshawi AG, Li XY, Modolo D, Tighe J (2021) SiamMOT: Siamese multi-object tracking. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR), pp 12372–12382. https://doi.org/10.1109/CVPR46437.2021.01219
Pang JM, Qiu LL, Li X, Chen HF, Li Q, Darrell T, Yu F (2021) Quasi-dense similarity learning for multiple object tracking. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR), pp 164–173. https://doi.org/10.1109/CVPR46437.2021.00023
Bertinetto L, Valmadre J, Henriques JF, Vedaldi A, Torr PH (2016) Fully-convolutional siamese networks for object tracking. In: Proceedings of the European conference on computer vision, pp 850–865. https://doi.org/10.1007/978-3-319-48881-3_56
Tao R, Gavves E, Smeulders AW (2016) Siamese instance search for tracking. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1420–1429. https://doi.org/10.1109/CVPR.2016.158
Li B, Yan J, Wu W, Zhu Z, Hu X (2018) High performance visual tracking with siamese region proposal network. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 8971–8980. https://doi.org/10.1109/CVPR.2018.00935
Li B, Wu W, Wang Q, Zhang F, Xing J, Yan J (2019) Siamrpn++: Evolution of siamese visual tracking with very deep networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 4282–4291. https://doi.org/10.1109/CVPR.2019.00441
Zhou X, Koltun V, Krähenbühl P (2020) Tracking objects as points. In: Proceedings of the European conference on computer vision (ECCV), pp 474–490. https://doi.org/10.1007/978-3-030-58548-8_28
Silva D, Alemu LT, Shah M (2020) CL-MOT: A contrastive learning framework for multi-object tracking. In: Proceedings of the British machine vision conference (BMVC), pp 1–13.
Chung T, Cho M, Lee H, Lee S (2022) SSAT: Self-supervised associating network for multiobject tracking. IEEE Trans Circuits Syst Video Technol 32(11):7858–7868
Kim S, Lee J, Ko BC (2022) SSL-MOT: Self-supervised learning based multi-object tracking. Appl Intell 53:930–940
Wang Q, Zheng Y, Pan P, Xu Y (2021) Multiple object tracking with correlation learning. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 3876–3886. https://doi.org/10.1109/CVPR46437.2021.00387
Tokmakov P, Li J, Burgard W, Gaidon A (2021) Learning to track with object permanence. In: Proceedings of the IEEE/CVF international conference on computer vision, pp 10860–10869. https://doi.org/10.1109/ICCV48922.2021.01068
Wang G, Wang Y, Gu R, Hu W, Hwang JN (2022) Split and connect: A universal tracklet booster for multi-object tracking. IEEE Trans Multimed 25:1256–1268. https://doi.org/10.1109/TMM.2022.3140919
Yang M, Liu S, Chen K, Zhang H, Zhao E, Zhao T (2020) A hierarchical clustering approach to fuzzy semantic representation of rare words in neural machine translation. IEEE Trans Fuzzy Syst 28(5):992–1002
Sun P, Cao J, Jiang Y, Zhang R, Xie E, Yuan Z, Wang C, Luo P (2020) Transtrack: Multiple object tracking with transformer. https://doi.org/10.48550/arXiv.2012.15460
Meinhardt T, Kirillov A, Leal-Taixe L, Feichtenhofer C (2022) Trackformer: Multi-object tracking with transformers. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 8844–8854. https://doi.org/10.1109/CVPR52688.2022.00864
Xu Y, Ban Y, Delorme G, Gan C, Rus D, Alameda-Pineda X (2021) Transcenter: Transformers with dense queries for multiple-object tracking. https://doi.org/10.48550/arXiv.2103.1514
Zeng F, Dong B, Zhang Y, Wang T, Zhang X, Wei Y (2022) Motr: End-to-end multiple-object tracking with transformer. In: Proceedings of the European conference on computer vision (ECCV), pp 659–675. https://doi.org/10.1007/978-3-031-19812-0_38
Chen X, Iranmanesh SM, Lien KC (2022) Patchtrack: Multiple object tracking using frame patches. https://doi.org/10.48550/arXiv.2201.00080
Leal-Taixé L, Milan A, Reid I, Roth S, Schindler K (2015) Motchallenge 2015: Towards a benchmark for multi-target tracking. https://doi.org/10.48550/arXiv.1504.01942
Yang B, Yan J, Lei Z, Li SZ (2014) Aggregate channel features for multi-view face detection. In: Proceedings of the IEEE international joint conference on biometrics, pp 1–8. https://doi.org/10.1109/BTAS.2014.6996284
Milan A, Leal-Taixé L, Reid I, Roth S, Schindler K (2016) Mot16: A benchmark for multi-object tracking. https://doi.org/10.48550/arXiv.1603.00831
Felzenszwalb PF, Girshick RB, McAllester D, Ramanan D (2009) Object detection with discriminatively trained part-based models. IEEE Trans Pattern Anal Mach Intell 32(9):1627–1645
Dendorfer P, Osep A, Milan A, Schindler K, Cremers D, Reid I, Roth S, Leal-Taixé L (2021) Motchallenge: A benchmark for single-camera multiple target tracking. Int J Comput Vision 129:845–881
Yang F, Choi W, Lin Y (2016) Exploit all the layers: Fast and accurate cnn object detector with scale dependent pooling and cascaded rejection classifiers. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2129–2137. https://doi.org/10.1109/CVPR.2016.234
Dendorfer P, Rezatofighi H, Milan A, Shi J, Cremers D, Reid I, Roth S, Schindler K, Leal-Taixé L (2020) Mot20: A benchmark for multi object tracking in crowded scenes. https://doi.org/10.48550/arXiv.2003.09003
Cheng ZY, Liang J, Tao GH, Liu DF, Zhang XY (2023) Adversarial training of self-supervised monocular depth estimation against physical-world attacks. Comput Vis Pattern Recognit. https://doi.org/10.48550/arXiv.2301.13487
Qin ZY, Lu XK, Liu DF, Nie XS, Yin YL, Shen JB, Loui AC (2023) Reformulating graph kernels for self-supervised space-time correspondence learning. IEEE Trans Image Process 32:6543–6557
Wang WG, Han C, Zhou TF, Liu DF (2022) Visual recognition with deep nearest centroids. In: Proceedings of the international conference on learning representations (ICLR), pp 1–30
Qin ZY, Lu XK, Nie XS, Liu DF, Yin YL, Wang WG (2023) Coarse-to-fine video instance segmentation with factorized conditional appearance flows. IEEE/CAA J Autom Sin 10:1192–1208
Liu DF, Liang J, Geng T, Loui AC, Zhou TF (2023) Tripartite feature enhanced pyramid network for dense prediction. IEEE Trans Image Process 32:2678–2692
Zhu P, Wen L, Du D, Bian X, Hu Q, Ling H (2020) Vision meets drones: Past, present and future. https://doi.org/10.48550/arXiv.2001.06303
Du D, Qi Y, Yu H, Yang Y, Duan K, Li G, Zhang W, Huang Q, Tian Q (2018) The unmanned aerial vehicle benchmark: Object detection and tracking. In: Proceedings of the European conference on computer vision (ECCV), pp 370–386. https://doi.org/10.1007/978-3-030-01249-6_23
Dave A, Khurana T, Tokmakov P, Schmid C, Ramanan D (2020) Tao: A large-scale benchmark for tracking any object. In: Proceedings of the European conference on computer vision (ECCV), pp 436–454. https://doi.org/10.1007/978-3-030-58558-7_26
Gupta A, Dollar P, Girshick R (2019) Lvis: A dataset for large vocabulary instance segmentation. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 5356–5364. https://doi.org/10.1109/CVPR.2019.00550
Geiger A, Lenz P, Urtasun R (2012) Are we ready for autonomous driving? The kitti vision benchmark suite. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3354–3361. https://doi.org/10.1109/CVPR.2012.6248074
Yu F, Chen H, Wang X, Xian W, Chen Y, Liu F, Madhavan V, Darrell T (2020) Bdd100k: A diverse driving dataset for heterogeneous multitask learning. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 2636–2645. https://doi.org/10.1109/CVPR42600.2020.00271
Wen L, Du D, Cai Z, Lei Z, Chang MC, Qi H, Lim J, Yang MH, Lyu S (2020) UA-DETRAC: A new benchmark and protocol for multi-object detection and tracking. Comput Vis Image Underst 193:102907
Sun P, Kretzschmar H, Dotiwalla X, Chouard A, Patnaik V, Tsui P, Guo J, Zhou Y, Chai Y, Caine B, Vasudevan V, Han W, Ngiam J, Zhao H, Timofeev A, Ettinger S, Krivokon M, Gao A, Joshi A, Anguelov D (2020) Scalability in perception for autonomous driving: Waymo open dataset. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 2443–2451. https://doi.org/10.1109/CVPR42600.2020.00252
Lin W, Liu H, Liu S, Li Y, Qian R, Wang T, Xu N, Xiong H, Qi GJ, Sebe N (2020) Human in events: A large-scale benchmark for human-centric video analysis in complex events. https://doi.org/10.48550/arXiv.2005.04490
Athar A, Luiten J, Voigtlaender P, Khurana T, Dave A, Leibe B, Ramanan D (2023) Burst: A benchmark for unifying object recognition, segmentation and tracking in video. In: Proceedings of the IEEE/CVF winter conference on applications of computer vision, pp 1674–1683. https://doi.org/10.1109/WACV56688.2023.00172
Voigtlaender P, Luo L, Yuan C, Jiang Y, Leibe B (2021) Reducing the annotation effort for video object segmentation datasets. In: Proceedings of the IEEE/CVF winter conference on applications of computer vision, pp 3060–3069. https://doi.org/10.1109/WACV48630.2021.00310
Sundararaman R, De Almeida BC, Marchand E, Pettre J (2021) Tracking pedestrian heads in dense crowd. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 3865–3875. https://doi.org/10.1109/CVPR46437.2021.00386
Weber M, Xie J, Collins M, Zhu Y, Voigtlaender P, Adam H, Green B, Geiger A, Leibe B, Cremers D, Osep A, Leal-Taixé L, Chen LC (2021) Step: Segmenting and tracking every pixel. https://doi.org/10.48550/arXiv.2102.11859
Fabbri M, Brasó G, Maugeri G, Cetintas O, Gasparini R, Ošep A, Calderara S, Leal-Taixé L, Cucchiara R (2021) Motsynth: How can synthetic data help pedestrian detection and tracking? In: Proceedings of the IEEE/CVF international conference on computer vision, pp 10849–10859. https://doi.org/10.1109/ICCV48922.2021.01067
Pedersen M, Haurum JB, Bengtson SH, Moeslund TB (2020) 3d-zef: A 3d zebrafish tracking benchmark dataset. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 2426–2436. https://doi.org/10.1109/CVPR42600.2020.00250
Anjum S, Gurari D (2020) Ctmc: Cell tracking with mitosis detection dataset challenge. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition workshops, pp 982–983. https://doi.org/10.1109/CVPRW50498.2020.00499
Voigtlaender P, Krause M, Osep A, Luiten J, Sekar BBG, Geiger A, Leibe B (2019) Mots: Multi-object tracking and segmentation. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 7942–7951. https://doi.org/10.1109/CVPR.2019.00813
Andriluka M, Roth S, Schiele B (2010) Monocular 3d pose estimation and tracking by detection. In: Proceedings of the IEEE computer society conference on computer vision and pattern recognition, pp 623–630. https://doi.org/10.1109/CVPR.2010.5540156
Ferryman J, Shahrokni A (2009) Pets2009: Dataset and challenge. In: Proceedings of the twelfth IEEE International workshop on performance evaluation of tracking and surveillance, pp 1–6. https://doi.org/10.1109/PETS-WINTER.2009.5399556
Bernardin K, Stiefelhagen R (2008) Evaluating multiple object tracking performance: the clear mot metrics. EURASIP J Image Vid Process 2008:1–10
Luiten J, Osep A, Dendorfer P, Torr P, Geiger A, Leal-Taixé L, Leibe B (2021) Hota: A higher order metric for evaluating multi-object tracking. Int J Comput Vision 129:548–578
Wu Y, Sheng H, Zhang Y, Wang S, Xiong Z, Ke W (2022) Hybrid motion model for multiple object tracking in mobile devices. IEEE Internet Things J 10(6):4735–4748. https://doi.org/10.1109/JIOT.2022.3219627
Hornakova A, Kaiser T, Swoboda P, Rolinek M, Rosenhahn B, Henschel R (2021) Making higher order mot scalable: An efficient approximate solver for lifted disjoint paths. In: Proceedings of the IEEE/CVF international conference on computer vision, pp 6330–6340. https://doi.org/10.1109/ICCV48922.2021.00627
Zhang J, Zhou S, Chang X, Wan F, Wang J, Wu Y, Huang D (2020) Multiple object tracking by flowing and fusing. https://doi.org/10.48550/arXiv.2001.11180
Zhang Y, Sheng H, Wu Y, Wang S, Ke W, Xiong Z (2020) Multiplex labeling graph for near-online tracking in crowded scenes. IEEE Internet Things J 7(9):7892–7902
Chen L, Ai H, Zhuang Z, Shang C (2018) Real-time multiple people tracking with deeply learned candidate selection and person reidentification. In: Proceedings of 2018 IEEE international conference on multimedia and expo (ICME), pp 1–6. https://doi.org/10.1109/ICME.2018.8486597
Son J, Baek M, Cho M, Han B (2017) Multi-object tracking with quadruplet convolutional neural networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 5620–5629. https://doi.org/10.1109/CVPR.2017.403
Chen J, Sheng H, Zhang Y, Xiong Z (2017) Enhancing detection model for multiple hypothesis tracking. In: Proceedings of the IEEE conference on computer vision and pattern recognition workshops, pp 18–27. https://doi.org/10.1109/CVPRW.2017.266


Funding

This work was supported in part by the Natural Science Foundation of China under Grant 61671192, in part by the National Science Foundation for Post-Doctoral Scientists of China under Grant 2017M114, and in part by the Top-Ranking Discipline a Class of Electronics Science and Technology in Zhejiang Province, China.

Author information

Authors and Affiliations

College of Big Data and Software Engineering, Zhejiang Wanli University, Ningbo, 315100, China
Chenjie Du, Ran Jin & Bencheng Chai
School of Communication Engineering, Hangzhou Dianzi University, Hangzhou, 310018, China
Chenwei Lin, Yingbiao Yao & Siyu Su

Authors

Chenjie Du
Chenwei Lin
Ran Jin
Bencheng Chai
Yingbiao Yao
Siyu Su

Corresponding author

Correspondence to Yingbiao Yao.

Ethics declarations

Conflicts of Interests

All authors certify that they have no affiliations with or involvement in any organization or entity with any financial interest or non-financial interest in the subject matter or materials discussed in this manuscript.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article

Cite this article

Du, C., Lin, C., Jin, R. et al. Exploring the State-of-the-Art in Multi-Object Tracking: A Comprehensive Survey, Evaluation, Challenges, and Future Directions. Multimed Tools Appl (2024). https://doi.org/10.1007/s11042-023-17983-2


Received: 02 November 2023
Revised: 17 December 2023
Accepted: 21 December 2023
Published: 09 February 2024
DOI: https://doi.org/10.1007/s11042-023-17983-2
