Temporal Extension Topology Learning for Video-Based Person Re-identification

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13848)

Abstract

Video-based person re-identification aims to match the same identity across video clips captured by multiple non-overlapping cameras. Exploiting both the temporal and the spatial cues of a video clip yields a more comprehensive representation of the identity in that clip. In this paper, we propose a novel graph-based framework, referred to as Temporal Extension Adaptive Graph Convolution (TE-AGC), which mines features in the spatial and temporal dimensions in a single graph convolution operation. Specifically, TE-AGC adopts a CNN backbone and a key-point detector to extract global and local features, which serve as graph nodes. Moreover, a carefully designed adaptive graph convolution module encourages meaningful information transfer by dynamically learning the reliability of local features from multiple frames. Comprehensive experiments on two video person re-identification benchmark datasets demonstrate the effectiveness and state-of-the-art performance of the proposed method.
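
The idea of a single graph convolution that spans both space and time can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify the graph construction or the reliability mechanism, so everything here is an assumption made for illustration. The function names `build_adjacency` and `adaptive_graph_conv`, the skeleton edge list, the rule of linking each key-point to its copy in the adjacent frame, and the use of a per-node reliability score as a multiplicative edge weight are all hypothetical. In a trained model the reliability scores would be learned; here they are random placeholders.

```python
import numpy as np

def build_adjacency(num_frames, num_keypoints, skeleton_edges):
    """Adjacency over T*K nodes (assumed layout: frame-major).

    Spatial edges connect key-points inside one frame; temporal edges
    connect the same key-point in adjacent frames, so one graph covers
    both dimensions.
    """
    n = num_frames * num_keypoints
    adj = np.eye(n)  # self-loops
    for t in range(num_frames):
        base = t * num_keypoints
        for i, j in skeleton_edges:  # spatial edges (hypothetical skeleton)
            adj[base + i, base + j] = adj[base + j, base + i] = 1.0
        if t + 1 < num_frames:  # temporal edges to the next frame
            for k in range(num_keypoints):
                a, b = base + k, base + num_keypoints + k
                adj[a, b] = adj[b, a] = 1.0
    return adj

def adaptive_graph_conv(features, adj, reliability, weight):
    """One graph convolution with reliability-weighted messages.

    Each incoming message is scaled by the reliability of its source
    node (an assumption), rows are normalised to sum to one, and a
    linear map followed by ReLU is applied.
    """
    weighted = adj * reliability[None, :]  # column j scaled by r_j
    row_sum = weighted.sum(axis=1, keepdims=True)
    norm = weighted / np.maximum(row_sum, 1e-8)
    return np.maximum(norm @ features @ weight, 0.0)  # ReLU

# Toy example: 4 frames, 3 key-points per frame, 8-dimensional features.
rng = np.random.default_rng(0)
T, K, D = 4, 3, 8
adj = build_adjacency(T, K, skeleton_edges=[(0, 1), (1, 2)])
feats = rng.normal(size=(T * K, D))       # stand-in for local features
rel = rng.uniform(0.1, 1.0, size=T * K)   # placeholder reliability scores
W = rng.normal(size=(D, D))               # stand-in for learned weights
out = adaptive_graph_conv(feats, adj, rel, W)
```

Under these assumptions, a node whose reliability is zero (for example, an occluded key-point) sends no information to any node, while reliable observations of the same key-point in neighbouring frames can still reach it through the temporal edges.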

Corresponding author

Correspondence to Jiaqi Ning.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Ning, J., Li, F., Liu, R., Takeuchi, S., Suzuki, G. (2023). Temporal Extension Topology Learning for Video-Based Person Re-identification. In: Zheng, Y., Keleş, H.Y., Koniusz, P. (eds) Computer Vision – ACCV 2022 Workshops. ACCV 2022. Lecture Notes in Computer Science, vol 13848. Springer, Cham. https://doi.org/10.1007/978-3-031-27066-6_15

  • DOI: https://doi.org/10.1007/978-3-031-27066-6_15

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-27065-9

  • Online ISBN: 978-3-031-27066-6

  • eBook Packages: Computer Science, Computer Science (R0)
