A Siamese inception architecture network for person re-identification

Li, Shuangqun; Ma, Huadong

doi:10.1007/s00138-017-0843-5

A Siamese inception architecture network for person re-identification

Special Issue Paper
Published: 16 May 2017

Volume 28, pages 725–736, (2017)
Cite this article

Machine Vision and Applications Aims and scope Submit manuscript

Shuangqun Li¹ &
Huadong Ma¹

759 Accesses
6 Citations
Explore all metrics

Abstract

Person re-identification is an extremely challenging problem as person’s appearance often undergoes dramatic changes due to the large variations of viewpoints, illuminations, poses, image resolutions, and cluttered backgrounds. How to extract discriminative features is one of the most critical ways to address these challenges. In this paper, we mainly focus on learning high-level features and combine the low-level, mid-level, and high-level features together to re-identify a person across different cameras. Firstly, we design a Siamese inception architecture network to automatically learn effective semantic features for person re-identification in different camera views. Furthermore, we combine multi-level features in null space with the null Foley–Sammon transform metric learning approach. In this null space, images of the same person are projected to a single point, which minimizes the intra-class scatter to the extreme and maximizes the relative inter-class separation simultaneously. Finally, comprehensive evaluations demonstrate that our approach achieves better performance on four person re-identification benchmark datasets, including Market-1501, CUHK03, PRID2011, and VIPeR.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Object detection using YOLO: challenges, architectural successors, datasets and applications

Article 08 August 2022

SSD: Single Shot MultiBox Detector

End-to-End Object Detection with Transformers

References

Wang, X.: Intelligent multi-camera video surveillance: a review. Pattern Recognit. Lett. 34(1), 3–19 (2013)
Article Google Scholar
Loy, C.C., Xiang, T., Gong, S.: Multi-camera activity correlation analysis. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1988–1995 (2009)
Köstinger, M., Hirzer, M., Wohlhart, P., Roth, P.M., Bischof, H.: Large scale metric learning from equivalence constraints. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2288–2295 (2012)
Zheng, L., Shen, L., Tian, L., Wang, S., Wang, J., Tian, Q.: Scalable person re-identification: a benchmark. In: Proceedings of the International Conference on Computer Vision, pp. 1116–1124 (2015)
Yang, Y., Yang, J., Yan, J., Liao, S., Yi, D., Li, S.Z.: Salient color names for person re-identification. In: Proceedings of the European Conference on Computer Vision, pp. 536–551 (2014)
Liu, X.C., Liu, W., Ma, H.D., Fu, H.Y.: Large-scale vehicle re-identification in urban surveillance videos. In: Proceedings of the International Conference on Multimedia and Expo, pp. 1–6 (2016)
Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 886–893 (2005)
Liao, S., Hu, Y., Zhu, X., Li, S.Z.: Person re-identification by local maximal occurrence representation and metric learning. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2197–2206 (2015)
Gan, C., Wang, N., Yang, Y., Yeung, D.Y., Hauptmann, A.G.: DevNet: A deep event network for multimedia event detection and evidence recounting. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2568–2577 (2015)
Yao, H., Zhang, S., Zhang, Y., Li, J., Tian, Q.: Coarse-to-fine description for fine-grained visual categorization. IEEE Trans. Image Process. 25(10), 4858–4872 (2016)
Article MathSciNet Google Scholar
Ahmed, E., Jones, M., Marks, T.K.: An improved deep learning architecture for person re-identification. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3908–3916 (2015)
Li, W., Zhao, R., Xiao, T., Wang, X.: DeepReID: deep filter pairing neural network for person re-identification. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 152–159 (2014)
Chopra, S., Hadsell, R., LeCun, Y.: Learning a similarity metric discriminatively, with application to face verification. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 539–546 (2005)
Li, S.Q., Liu, X.C., Liu, W., Ma, H.D., Zhang, H.T.: A discriminative null space based deep learning approach for person re-identification. In: Proceedings of the IEEE Conference on Cloud Computing and Intelligent Systems, pp. 480–484 (2016)
Ioffe, S., Szegedy, C.: Batch Normalization: accelerating deep network training by reducing internal covariate shift. In: Proceedings of the International Conference on Machine Learning, pp. 448–456 (2015)
Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., Wojna, Z.: Rethinking the inception architecture for computer vision. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2818–2826 (2016)
Zhang, L., Xiang, T., Gong, S.: Learning a discriminative null space for person re-identification. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1239–1248 (2016)
Zhang, C., Liu, W., Ma, H.D., Fu, H.Y.: Siamese neural network based gait recognition for human identification. In: Proceedings of the International Conference on Acoustics, Speech and Signal Processing, pp. 2832–2836 (2016)
Liu, W., Zhang, Y., Tang, S., Tang, J., Hong, R., Li, J.: Accurate estimation of human body orientation from RGB-D sensors. IEEE Trans. Cybern. 43(5), 1442–1452 (2013)
Article Google Scholar
Wang, B., Tang, S., Zhao, R., Liu, W., Cen, Y.: Pedestrian detection based on region proposal fusion. In: Proceedings of the International Workshop on Multimedia Signal Processing, pp. 1–6 (2015)
Peng, P., Xiang, T., Wang, Y., Pontil, M., Gong, S., Huang, T., Tian, Y.: Unsupervised cross-dataset transfer learning for person re-identification. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1306–1315 (2016)
Liu, W., Mei, T., Zhang, Y.: Instant mobile video search with layered audio-video indexing and progressive transmission. IEEE Trans. Multimed. 16(8), 2242–2255 (2014)
Article Google Scholar
Wang, F., Zuo, W., Lin, L., Zhang, D., Zhang, L.: Joint learning of single-image and cross-image representations for person re-identification. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1288–1296 (2016)
Bromley, J., Bentz, J.W., Bottou, L., Guyon, I., LeCun, Y., Moore, C., Säckinger, E., Shah, R.: Signature verification using a “Siamese” time delay neural network. IJPRAI 7(4), 669–688 (1993)
Google Scholar
Xiao, T., Li, H., Ouyang, W., Wang, X.: Learning deep feature representations with domain guided dropout for person re-identification. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1249–1258 (2016)
Hirzer, M., Beleznai, C., Roth, P.M., Bischof, H.: Person re-identification by descriptive and discriminative classification. In: Proceedings of the Scandinavian Conference on Image Analysis, pp. 91–102 (2011)
Doug, G., Shane, B., Hai, T.: Evaluating appearance models for recognition, reacquisition, and tracking. In: Proceedings of the IEEE International Workshop on Performance Evaluation of Tracking and Surveillance (2007)
Felzenszwalb, P.F., Girshick, R.B., McAllester, D.A., Ramanan, D.: Object detection with discriminatively trained part-based models. IEEE Trans. Pattern Anal. Mach. Intell. 32(9), 1627–1645 (2010)
Article Google Scholar
Jia, Y., Shelhamer, E., Donahue, J., Karayev, S., Long, J., Girshick, R.B., Guadarrama, S., Darrell, T.: Caffe: convolutional architecture for fast feature embedding. In: Proceedings of the ACM International Conference on Multimedia, pp. 675–678 (2014)
Liu, W., Mei, T., Zhang, Y., Che, C., Luo, J.: Multi-task deep visual-semantic embedding for video thumbnail selection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3707–3715 (2015)
Chu, L., Wang, S., Zhang, Y., Huang, Q.: Robust spatial consistency graph model for partial duplicate image retrieval. IEEE Trans. Multimed. 15(8), 1982–1996 (2013)
Article Google Scholar
Davis, J.V., Kulis, B., Jain, P., Sra, S., Dhillon, I.S.: Information-theoretic metric learning. In: Proceedings of the International Conference on Machine Learning, pp. 209–216 (2007)
Weinberger, K.Q., Blitzer, J., Saul, L.K.: Distance metric learning for large margin nearest neighbor classification. In: Advances in Neural Information Processing Systems, pp. 1473–1480 (2005)

Download references

Acknowledgements

This work is supported by the National Key Research and Development Plan (2016YFC0801005), the Funds for Creative Research Groups of China (61421061), the NSFC-Guangdong Joint Fund (U1501254), the Beijing Training Project for the Leading Talents in S&T (ljrc 201502), the Cosponsored Project of Beijing Committee of Education.

Author information

Authors and Affiliations

Beijing Key Laboratory of Intelligent Telecommunication Software and Multimedia, Beijing University of Posts and Telecommunications, Beijing, 100876, China
Shuangqun Li & Huadong Ma

Authors

Shuangqun Li
View author publications
You can also search for this author in PubMed Google Scholar
Huadong Ma
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Huadong Ma.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Li, S., Ma, H. A Siamese inception architecture network for person re-identification. Machine Vision and Applications 28, 725–736 (2017). https://doi.org/10.1007/s00138-017-0843-5

Download citation

Received: 31 October 2016
Revised: 03 March 2017
Accepted: 25 April 2017
Published: 16 May 2017
Issue Date: October 2017
DOI: https://doi.org/10.1007/s00138-017-0843-5

Keywords

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

A Siamese inception architecture network for person re-identification

Abstract

Access this article

Similar content being viewed by others

Object detection using YOLO: challenges, architectural successors, datasets and applications

SSD: Single Shot MultiBox Detector

End-to-End Object Detection with Transformers

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Rights and permissions

About this article

Cite this article

Keywords

Navigation

A Siamese inception architecture network for person re-identification

Abstract

Access this article

Similar content being viewed by others

Object detection using YOLO: challenges, architectural successors, datasets and applications

SSD: Single Shot MultiBox Detector

End-to-End Object Detection with Transformers

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation