ABSTRACT
Gait recognition aims to identify individuals at a long distance by their unique walking patterns. However, prevailing methods degrade substantially when applied to large-scale surveillance systems. We find that a significant cause of this issue is that previous methods rely heavily on full-view person annotations, reducing view differences by pulling the anchor closer to positive samples from different viewpoints. Subjects in in-the-wild scenarios, however, usually have only a limited number of sequences from different viewpoints. As a result, the viewpoints available for each subject are sparse compared with those in the whole dataset, and simply minimizing intra-identity differences cannot sufficiently reduce the view differences across the whole dataset. In this work, we formulate this overlooked problem as Sparse-View Gait Recognition and provide a comprehensive analysis of it using a Structural Causal Model that captures the causalities among latent features, view distribution, and labels. Based on our analysis, we propose a simple yet effective method that enables networks to learn a representation that is more robust across different views. Specifically, our method consists of two parts: 1) an effective metric learning algorithm based on the backdoor adjustment, which improves the consistency of representations among different views; 2) an unsupervised view clustering algorithm that discovers and identifies the most influential view contexts. We evaluate our method on the popular GREW, Gait3D, CASIA-B, and OU-MVLP datasets, showing that it consistently outperforms baselines and achieves state-of-the-art performance. The code will be available at https://github.com/wj1tr0y/GaitCSV.
Index Terms
- Causal Intervention for Sparse-View Gait Recognition