Abstract
Optimization-based 3D object tracking is known to be precise and fast, but sensitive to large inter-frame displacements. In this paper, we propose a fast and effective non-local 3D tracking method. Based on the observation that erroneous local minima are mostly due to out-of-plane rotation, we propose a hybrid approach combining non-local and local optimizations for different parameters, resulting in an efficient non-local search in the 6D pose space. In addition, a precomputed robust contour-based tracking method is proposed for the pose optimization. By using long search lines with multiple candidate correspondences, it can adapt to different frame displacements without the need for coarse-to-fine search. After the pre-computation, pose updates can be conducted very fast, enabling the non-local optimization to run in real time. Our method outperforms all previous methods for both small and large displacements. For large displacements, the accuracy is greatly improved (\(81.7\%\) vs. \(19.4\%\)). At the same time, real-time speed (>50 fps) can be achieved using only a CPU. The source code is available at https://github.com/cvbubbles/nonlocal-3dtracking.
X. Tian and X. Lin contributed equally.
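The hybrid idea summarized in the abstract (a non-local search over out-of-plane rotation combined with local optimization) can be illustrated with a toy sketch. This is not the authors' implementation: the energy `toy_cost`, the grid of offsets, the gradient-descent refiner, and all function names are hypothetical stand-ins, chosen only so that the energy has several local minima in the two out-of-plane angles and is convex in the remaining parameters. For simplicity, the sketch refines all six parameters from each sampled start.

```python
import itertools
import math

# Toy pose: [rx, ry, rz, tx, ty, tz]. Indices 0 and 1 stand in for the
# out-of-plane rotation. Everything here is illustrative, not the paper's energy.
OUT_OF_PLANE = (0, 1)


def toy_cost(pose, target):
    """Hypothetical energy: rippled (many local minima) in rx, ry; convex elsewhere."""
    total = 0.0
    for i, (p, t) in enumerate(zip(pose, target)):
        d = p - t
        if i in OUT_OF_PLANE:
            total += 0.05 * d * d + 1.0 - math.cos(3.0 * d)
        else:
            total += d * d
    return total


def toy_grad(pose, target):
    """Analytic gradient of toy_cost with respect to the pose."""
    grad = []
    for i, (p, t) in enumerate(zip(pose, target)):
        d = p - t
        if i in OUT_OF_PLANE:
            grad.append(0.1 * d + 3.0 * math.sin(3.0 * d))
        else:
            grad.append(2.0 * d)
    return grad


def local_optimize(pose, target, step=0.05, iters=300):
    """Plain gradient descent on all six parameters (stand-in for a local tracker)."""
    pose = list(pose)
    for _ in range(iters):
        g = toy_grad(pose, target)
        pose = [p - step * gi for p, gi in zip(pose, g)]
    return pose


def hybrid_search(prev_pose, target, offsets):
    """Grid-sample out-of-plane offsets, refine each start locally, keep the best."""
    best_pose = local_optimize(prev_pose, target)
    best_cost = toy_cost(best_pose, target)
    for dx, dy in itertools.product(offsets, repeat=2):
        start = list(prev_pose)
        start[0] += dx
        start[1] += dy
        cand = local_optimize(start, target)
        cost = toy_cost(cand, target)
        if cost < best_cost:
            best_pose, best_cost = cand, cost
    return best_pose, best_cost


# Large inter-frame displacement in the two out-of-plane angles (assumed values).
prev_pose = [0.0] * 6
target = [1.5, -1.2, 0.1, 0.05, -0.02, 0.3]
offsets = [-2.0 + 0.5 * k for k in range(9)]  # -2.0 ... 2.0 rad in 0.5 rad steps

local_pose = local_optimize(prev_pose, target)
local_cost = toy_cost(local_pose, target)
hybrid_pose, hybrid_cost = hybrid_search(prev_pose, target, offsets)
```

In this toy setting, purely local optimization from the previous pose settles in a neighboring minimum of the rippled energy, while the search seeded with sampled out-of-plane offsets reaches the target pose. Restricting the sampling to two of the six parameters keeps the number of starts at 81 here, rather than growing with a grid over the full 6D space.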
Acknowledgements
This work is supported by NSFC project 62172260, and the Industrial Internet Innovation and Development Project in 2019 of China.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Tian, X., Lin, X., Zhong, F., Qin, X. (2022). Large-Displacement 3D Object Tracking with Hybrid Non-local Optimization. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13682. Springer, Cham. https://doi.org/10.1007/978-3-031-20047-2_36
DOI: https://doi.org/10.1007/978-3-031-20047-2_36
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-20046-5
Online ISBN: 978-3-031-20047-2
eBook Packages: Computer Science (R0)