Research Article · Open Access · Seminal Paper

A deep learning framework for character motion synthesis and editing

Published: 11 July 2016

Abstract

We present a framework to synthesize character movements based on high-level parameters, such that the produced movements respect the manifold of human motion, trained on a large motion capture dataset. The learned motion manifold, which is represented by the hidden units of a convolutional autoencoder, represents motion data as sparse components that can be combined to produce a wide range of complex movements. To map from high-level parameters to the motion manifold, we stack a deep feedforward neural network on top of the trained autoencoder. This network is trained to produce realistic motion sequences from parameters such as a curve over the terrain that the character should follow, or a target location for punching and kicking. The feedforward control network and the motion manifold are trained independently, allowing the user to easily switch between feedforward networks according to the desired interface without re-training the motion manifold. Once motion is generated, it can be edited by performing optimization in the space of the motion manifold. This allows kinematic constraints to be imposed, or the style of the motion to be transformed, while ensuring the edited motion remains natural. As a result, the system can produce smooth, high-quality motion sequences without any manual pre-processing of the training data.
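To make the pipeline concrete, here is a minimal sketch of the two-network design the abstract describes: a convolutional autoencoder whose hidden units span the motion manifold, a feedforward control network that maps high-level parameters into that space, and editing via gradient-based optimization over the hidden units. It is written in PyTorch purely for illustration (the paper predates that library), and every layer size, kernel width, and the toy constraint are assumptions, not the authors' actual architecture or losses.

```python
# Illustrative sketch only; layer sizes, kernel widths, and the constraint
# are assumed, and training loops are omitted.
import torch
import torch.nn as nn

N_DOF = 70        # assumed degrees of freedom per frame
N_HIDDEN = 256    # assumed width of the manifold (hidden units)
N_PARAMS = 7      # assumed number of control parameters per frame

class MotionAutoencoder(nn.Module):
    """Convolutional autoencoder over time; hidden units span the manifold."""
    def __init__(self):
        super().__init__()
        self.encode = nn.Sequential(
            nn.Conv1d(N_DOF, N_HIDDEN, kernel_size=25, padding=12),
            nn.ReLU(),
            nn.MaxPool1d(2),                 # halves the temporal resolution
        )
        self.decode = nn.Sequential(
            nn.Upsample(scale_factor=2),     # restores the temporal resolution
            nn.Conv1d(N_HIDDEN, N_DOF, kernel_size=25, padding=12),
        )

    def forward(self, x):                    # x: (batch, N_DOF, frames)
        return self.decode(self.encode(x))

class ControlNet(nn.Module):
    """Feedforward net from high-level parameters to manifold coordinates.

    Trained separately from the autoencoder, so it can be swapped out for a
    different control interface without re-training the manifold.
    """
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(N_PARAMS, N_HIDDEN, kernel_size=45, padding=22),
            nn.ReLU(),
            nn.Conv1d(N_HIDDEN, N_HIDDEN, kernel_size=25, padding=12),
            nn.ReLU(),
        )

    def forward(self, p):                    # p: (batch, N_PARAMS, frames // 2)
        return self.net(p)

# Synthesis: control parameters -> manifold coordinates -> decoded motion.
ae, ctrl = MotionAutoencoder(), ControlNet()
params = torch.randn(1, N_PARAMS, 120)       # e.g. a trajectory at pooled resolution
motion = ae.decode(ctrl(params))             # (1, N_DOF, 240)

# Editing: optimize in manifold space so the decoded motion satisfies a
# constraint, while the decoder keeps the result on the motion manifold.
h = ctrl(params).detach().requires_grad_(True)
opt = torch.optim.Adam([h], lr=0.01)
target = torch.zeros(1, 3)                   # toy end-effector target (assumed)
for _ in range(100):
    out = ae.decode(h)
    loss = ((out[:, :3, -1] - target) ** 2).mean()  # toy kinematic constraint
    opt.zero_grad()
    loss.backward()
    opt.step()
```

The sketch preserves the two design points the abstract emphasizes: the control network and the autoencoder are trained independently, so changing the control interface never requires re-learning the manifold; and edits are performed on manifold coordinates rather than raw joint values, which is what keeps the constrained or restyled motion looking natural.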


Supplemental Material

a138.mp4 (mp4, 321.3 MB)



Published in

ACM Transactions on Graphics, Volume 35, Issue 4
July 2016, 1396 pages
ISSN: 0730-0301
EISSN: 1557-7368
DOI: 10.1145/2897824

Also published in

Seminal Graphics Papers: Pushing the Boundaries, Volume 2 (ACM Overlay Books)
August 2023, 893 pages
ISBN: 9798400708978
DOI: 10.1145/3596711
Editor: Mary C. Whitton

Copyright © 2016 ACM

Publication rights licensed to ACM. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of a national government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

Publisher

Association for Computing Machinery, New York, NY, United States

Publication History

• Published: 11 July 2016
• Published in ACM Transactions on Graphics (TOG), Volume 35, Issue 4

