Physics-Guided Human Motion Capture with Pose Probability Modeling

Jingyi Ju; Buzhen Huang; Chen Zhu; Zhihao Li; Yangang Wang

doi:10.24963/ijcai.2023/105

Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence

Main Track. Pages 947-955. https://doi.org/10.24963/ijcai.2023/105

Incorporating physics in human motion capture to avoid artifacts like floating, foot sliding, and ground penetration is a promising direction. Existing solutions always adopt kinematic results as reference motions, and the physics is treated as a post-processing module. However, due to depth ambiguity, monocular motion capture inevitably suffers from noise, and a noisy reference often causes physics-based tracking to fail. To address this obstacle, our key idea is to employ physics as denoising guidance in the reverse diffusion process to reconstruct physically plausible human motion from a modeled pose probability distribution. Specifically, we first train a latent Gaussian model that encodes the uncertainty of 2D-to-3D lifting to facilitate reverse diffusion. Then, a physics module is constructed to track the motion sampled from the distribution. The discrepancies between the tracked motion and the image observation are used to provide explicit guidance for the reverse diffusion model to refine the motion. Over several iterations, the physics-based tracking and kinematic denoising promote each other to generate a physically plausible human motion. Experimental results show that our method outperforms previous physics-based methods in both joint accuracy and success rate. More information can be found at https://github.com/Me-Ditto/Physics-Guided-Mocap.
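The alternation between sampling, physics-based tracking, and guided denoising described in the abstract can be illustrated with a deliberately small toy. The sketch below is not the authors' implementation: the "motion" is a one-dimensional sequence of foot heights, `physics_track` is a hypothetical stand-in that only forbids ground penetration (a real system would use a physics simulator), and a plain Gaussian replaces the learned latent pose distribution. The names `physics_track` and `guided_refine`, as well as the parameters `guidance` and `noise_scale`, are assumptions made for illustration.

```python
import random


def physics_track(motion, ground=0.0):
    """Toy stand-in for physics tracking: clamp heights so nothing penetrates the ground."""
    return [max(height, ground) for height in motion]


def guided_refine(observation, steps=20, guidance=0.5, noise_scale=1.0, seed=0):
    """Iteratively refine a noisy sample using the tracking/observation discrepancy.

    observation: per-frame heights standing in for the image evidence (assumed).
    guidance:    step size applied to the discrepancy (hypothetical parameter).
    """
    rng = random.Random(seed)
    # Sample an initial motion from a Gaussian centred on the observation.
    motion = [o + rng.gauss(0.0, noise_scale) for o in observation]
    for t in range(steps):
        # 1) Track the sampled motion under the (toy) physical constraint.
        tracked = physics_track(motion)
        # 2) Measure the discrepancy between tracked motion and observation.
        residual = [o - p for o, p in zip(observation, tracked)]
        # 3) Denoise: step along the discrepancy and add annealed noise,
        #    which shrinks to zero on the final iteration.
        sigma = noise_scale * (1.0 - (t + 1) / steps)
        motion = [
            p + guidance * r + rng.gauss(0.0, sigma)
            for p, r in zip(tracked, residual)
        ]
    # Final tracking pass so the returned motion satisfies the constraint.
    return physics_track(motion)


# The observation dips below the ground plane in one frame.
observation = [0.3, 0.1, -0.2, 0.0, 0.2]
refined = guided_refine(observation)
print(refined)
```

The point of the toy is only the control flow: each iteration feeds the tracked result back into the denoising step, so the constraint and the observation are reconciled gradually rather than in a single post-processing pass.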

Keywords:

Computer Vision: CV: Biometrics, face, gesture and pose recognition

Computer Vision: CV: 3D computer vision