research-article

Causal Intervention for Sparse-View Gait Recognition

Authors:
Jilong Wang (University of Science and Technology of China, Hefei, China; ORCID 0009-0001-9668-2987)
Saihui Hou (Beijing Normal University & WATRIX.AI, Beijing, China; ORCID 0000-0003-4689-2860)
Yan Huang (Institute of Automation, Chinese Academy of Sciences, Beijing, China; ORCID 0000-0002-8239-7229)
Chunshui Cao (WATRIX.AI, Beijing, China; ORCID 0000-0001-6634-1682)
Xu Liu (WATRIX.AI, Beijing, China; ORCID 0000-0002-0401-1343)
Yongzhen Huang (Beijing Normal University & WATRIX.AI, Beijing, China; ORCID 0000-0003-4389-9805)
Liang Wang (Institute of Automation, Chinese Academy of Sciences, Beijing, China; ORCID 0000-0001-5224-8647)

MM '23: Proceedings of the 31st ACM International Conference on Multimedia, October 2023, Pages 77–85
https://doi.org/10.1145/3581783.3612124

Published: 27 October 2023

ABSTRACT

Gait recognition aims to identify individuals at a long distance by their unique walking patterns. However, prevailing methods degrade substantially when applied to large-scale surveillance systems. We find that a significant cause of this issue is that previous methods rely heavily on full-view person annotations to reduce view differences by pulling the anchor closer to positive samples from different viewpoints. In in-the-wild scenarios, however, subjects usually have only a limited number of sequences from different viewpoints. As a result, the available viewpoints of each subject are sparse compared to the whole dataset, and simply minimizing intra-identity differences cannot adequately reduce the view differences across the whole dataset. In this work, we formulate this overlooked problem as Sparse-View Gait Recognition and provide a comprehensive analysis of it using a Structural Causal Model of the causal relationships among latent features, view distribution, and labels. Based on our analysis, we propose a simple yet effective method that enables networks to learn a representation that is more robust across different views. Specifically, our method consists of two parts: 1) an effective metric learning algorithm based on the backdoor adjustment, which improves the consistency of representations among different views; 2) an unsupervised view clustering algorithm to discover and identify the most influential view contexts. We evaluate our method on the popular GREW, Gait3D, CASIA-B, and OU-MVLP datasets, showing that it consistently outperforms baselines and achieves state-of-the-art performance. The code will be available at https://github.com/wj1tr0y/GaitCSV.
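For readers unfamiliar with the backdoor adjustment mentioned in the abstract, the following is a minimal numeric sketch of the general formula P(Y | do(X)) = Σ_v P(Y | X, V=v) P(V=v), with the view V treated as the confounder. It is not the authors' implementation: the two view names, all probabilities, and the function names `backdoor_adjust` and `observational` are hypothetical values chosen only for illustration.

```python
# Toy illustration of the backdoor adjustment with view V as the confounder.
# All numbers and names below are hypothetical, not taken from the paper.

def backdoor_adjust(p_y_given_xv, p_v):
    """P(Y=1 | do(X=x)) = sum_v P(Y=1 | X=x, V=v) * P(V=v)."""
    return sum(p_y_given_xv[v] * p_v[v] for v in p_v)

def observational(p_y_given_xv, p_v_given_x):
    """P(Y=1 | X=x) = sum_v P(Y=1 | X=x, V=v) * P(V=v | X=x)."""
    return sum(p_y_given_xv[v] * p_v_given_x[v] for v in p_v_given_x)

# Marginal view distribution over the whole dataset (assumed uniform here).
p_v = {"front": 0.5, "side": 0.5}
# View distribution for one sparsely observed subject (mostly frontal).
p_v_given_x = {"front": 0.9, "side": 0.1}
# Probability of a correct match for this subject under each view.
p_y_given_xv = {"front": 0.8, "side": 0.4}

adjusted = backdoor_adjust(p_y_given_xv, p_v)
observed = observational(p_y_given_xv, p_v_given_x)
print(f"observational: {observed:.2f}, backdoor-adjusted: {adjusted:.2f}")
```

In this toy setting the observational estimate is inflated by the subject's skewed view distribution, whereas the adjusted estimate weights every view by its dataset-wide prior. This is the general intuition behind deconfounding with respect to view; how the paper turns it into a metric learning loss is not specified in the abstract.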

Index Terms

Causal Intervention for Sparse-View Gait Recognition
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision problems
        Object recognition
      2. Computer vision tasks
        Biometrics
  2. Machine learning
    1. Machine learning algorithms
      1. Regularization

Published in
MM '23: Proceedings of the 31st ACM International Conference on Multimedia
October 2023
9913 pages
ISBN: 9798400701085
DOI: 10.1145/3581783
General Chairs:
Abdulmotaleb El Saddik (University of Ottawa, Canada & MBZUAI, UAE)
Tao Mei (HiDream.ai, China)
Rita Cucchiara (University of Modena and Reggio Emilia, Italy)
Program Chairs:
Marco Bertini (University of Florence, Italy)
Diana Patricia Tobon Vallejo (Universidad de Medellin, Colombia)
Pradeep K. Atrey (University at Albany, State University of New York, USA)
M. Shamim Hossain (King Saud University, KSA)
Copyright © 2023 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
causal inference
metric learning
sparse-view gait recognition
Qualifiers
- research-article
Acceptance Rates
Overall Acceptance Rate: 995 of 4,171 submissions, 24%