research-article

Learning bicycle stunts

Published: 27 July 2014

Abstract

We present a general approach for simulating and controlling a human character that is riding a bicycle. The two main components of our system are offline learning and online simulation. We simulate the bicycle and the rider as an articulated rigid body system. The rider is controlled by a policy that is optimized through offline learning. We apply policy search to learn the optimal policies, which are parameterized with splines or neural networks for different bicycle maneuvers. We use NeuroEvolution of Augmenting Topologies (NEAT) to optimize both the parametrization and the parameters of our policies. The learned controllers are robust enough to withstand large perturbations and allow interactive user control. The rider not only learns to steer and to balance in normal riding situations, but also learns to perform a wide variety of stunts, including wheelie, endo, bunny hop, front wheel pivot and back hop.
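
As a rough illustration of what offline policy search means here, the sketch below tunes a two-parameter feedback policy on a toy inverted pendulum, used as a stand-in for bicycle lean. It is not the method described in the abstract: it uses neither an articulated rigid body simulation, nor spline or neural network policies, nor NEAT. The dynamics, the reward, the random hill-climbing optimizer and all names (`rollout`, `policy_search`, `k_angle`, `k_rate`) are assumptions made for this example only.

```python
import math
import random


def rollout(params, steps=500, dt=0.01):
    """Simulate a toy inverted pendulum and return the accumulated reward.

    params holds two hypothetical feedback gains: one on the lean angle
    and one on the lean rate.
    """
    k_angle, k_rate = params
    theta, omega = 0.2, 0.0      # initial lean angle (rad) and lean rate (rad/s)
    g_over_l = 9.81 / 1.0        # gravity over an assumed 1 m pendulum length
    total = 0.0
    for _ in range(steps):
        # Linear feedback policy, clamped to an assumed actuation limit.
        u = -k_angle * theta - k_rate * omega
        u = max(-50.0, min(50.0, u))
        # Semi-implicit Euler step of the pendulum dynamics.
        alpha = g_over_l * math.sin(theta) + u
        omega += alpha * dt
        theta += omega * dt
        if abs(theta) > math.pi / 2:   # fell over: end the rollout early
            break
        total += 1.0 - theta * theta   # reward staying close to upright
    return total


def policy_search(iterations=100, population=8, sigma=2.0, seed=0):
    """Random hill climbing over the policy parameters (a simple policy search)."""
    rng = random.Random(seed)
    best = [0.0, 0.0]
    best_reward = rollout(best)
    for _ in range(iterations):
        for _ in range(population):
            # Perturb the current best parameters with Gaussian noise.
            candidate = [p + rng.gauss(0.0, sigma) for p in best]
            reward = rollout(candidate)
            if reward > best_reward:   # keep only strict improvements
                best, best_reward = candidate, reward
    return best, best_reward


if __name__ == "__main__":
    gains, reward = policy_search()
    print("learned gains:", gains, "reward:", reward)
```

The structure mirrors the offline/online split in the abstract only loosely: `policy_search` plays the role of offline learning, and `rollout` with the learned gains plays the role of online simulation.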

Supplemental Material

a50-sidebyside.mp4 (mp4, 31 MB)

      • Published in

        ACM Transactions on Graphics, Volume 33, Issue 4
        July 2014
        1366 pages
        ISSN: 0730-0301
        EISSN: 1557-7368
        DOI: 10.1145/2601097

        Copyright © 2014 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States
