Person re-identification based on multi-scale feature fusion and multi-attention mechanism

  • Original Paper
Signal, Image and Video Processing

Abstract

Person re-identification is an image retrieval technique for finding persons in real scenes. Owing to factors such as camera angle, lighting, and occlusion, the representation of a given person shows high intra-class variation. Furthermore, discriminative local regions such as hats and shoes are often ignored, so useful local information goes unused during retrieval. In this paper, a multi-scale feature fusion network model combining global and local features is proposed. The network is built from four stacked building blocks, in which multi-scale features are assigned different weights and fused according to the output of each branch. In addition, a multi-attention mechanism network is combined with the multi-scale feature fusion. This enables the network to model the relations between input images and thereby aggregate the features of neighbouring person samples into a more robust image representation. Experimental results show that the proposed method improves retrieval performance.
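
The two ideas in the abstract, weighted fusion of multi-scale branch features and attention-based aggregation over neighbouring samples, can be illustrated with a minimal sketch. The abstract does not give the actual formulation, so everything here is an assumption for illustration: the function names (fuse_branches, aggregate_neighbours) are hypothetical, the branch weights come from a softmax over per-branch scalar scores, and the aggregation is plain scaled dot-product self-attention without learned projections. It is not the authors' network.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_branches(branch_feats, branch_scores):
    """Fuse one feature vector per scale branch into a single vector.

    Assumption: each branch yields a scalar score; the softmax of the
    scores gives the fusion weights (hypothetical gating scheme).
    """
    weights = softmax(branch_scores)
    dim = len(branch_feats[0])
    return [
        sum(w * feat[i] for w, feat in zip(weights, branch_feats))
        for i in range(dim)
    ]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def aggregate_neighbours(feats):
    """Replace each sample's feature by an attention-weighted average
    of all samples' features (scaled dot-product self-attention with
    identity projections, an assumption made for brevity).
    """
    scale = math.sqrt(len(feats[0]))
    out = []
    for query in feats:
        attn = softmax([dot(query, key) / scale for key in feats])
        out.append([
            sum(a * value[i] for a, value in zip(attn, feats))
            for i in range(len(query))
        ])
    return out


if __name__ == "__main__":
    # Two hypothetical scale branches with 3-dimensional features.
    fused = fuse_branches([[1.0, 0.0, 2.0], [0.0, 1.0, 2.0]], [0.3, 0.9])
    print("fused:", fused)

    # Three hypothetical samples; similar samples pull towards each other.
    batch = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
    print("aggregated:", aggregate_neighbours(batch))
```

Because each output of aggregate_neighbours is a convex combination of the input vectors, samples with similar features are drawn towards each other, which is one simple way to read the abstract's claim that neighbour aggregation yields a more robust representation.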

Availability of data and materials

Not applicable.

Funding

This work was supported by the Suzhou Science and Technology Planning Project (Grant No. SKJY2021044), the Natural Science Foundation of Jiangsu Province, China (Grant Nos. BK20130324, BK20171249), the Specialized Research Fund for the Doctoral Program of Higher Education (SRFDP) (Grant No. 20123201120009), and the Natural Science Foundation of the Jiangsu Higher Education Institutions of China (Grant No. 12KJB510029).

Author information

Contributions

JP designed the work and performed the experiment. WZ wrote the manuscript text and provided experiment guidance. All authors reviewed the manuscript.

Corresponding author

Correspondence to Wei Zou.

Ethics declarations

Competing interests

The authors declare no conflict of interest.

Ethical approval

Not applicable.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Pu, J., Zou, W. Person re-identification based on multi-scale feature fusion and multi-attention mechanism. SIViP 18, 243–253 (2024). https://doi.org/10.1007/s11760-023-02705-w

  • DOI: https://doi.org/10.1007/s11760-023-02705-w
