
Multiresolution state-space discretization for Q-Learning with pseudorandomized discretization

Published in: Journal of Control Theory and Applications

Abstract

A multiresolution state-space discretization method with pseudorandom gridding is developed for the episodic reinforcement learning method of Q-learning. It serves as the learning agent for closed-loop control of morphing or highly reconfigurable systems. The state space is adaptively discretized by progressively finer pseudorandom grids around regions of interest within the state or learning space, in an effort to break the curse of dimensionality. Utility of the method is demonstrated on the problem of a morphing airfoil, simulated by a computationally intensive computational fluid dynamics model. By setting the multiresolution method to define the region of interest by the goal the agent seeks, it is shown that the method with the pseudorandom grid can learn a specified goal to within ±0.001 while reducing the total number of state-action pairs needed to achieve this level of specificity to fewer than 3000.
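The refinement idea described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: a toy 1-D state space stands in for the airfoil shape parameters, a van der Corput low-discrepancy sequence stands in for the pseudorandom grid, the reward and all parameter values (goal location, grid sizes, learning rates) are illustrative assumptions, and the CFD model is replaced by a simple distance-to-goal reward.

```python
import random

GOAL = 0.3  # hypothetical target value the agent must learn to reach

def van_der_corput(i, base=2):
    """i-th point of the van der Corput low-discrepancy sequence in [0, 1)."""
    f, r = 1.0, 0.0
    while i > 0:
        f /= base
        r += f * (i % base)
        i //= base
    return r

def refine(grid, center, half_width, n_new, offset):
    """Add n_new low-discrepancy points inside [center - hw, center + hw]."""
    lo = center - half_width
    pts = [lo + 2.0 * half_width * van_der_corput(offset + k)
           for k in range(1, n_new + 1)]
    return sorted(set(grid) | set(pts))

def q_learn(grid, episodes=2000, alpha=0.5, gamma=0.9, eps=0.1):
    """Tabular Q-learning on the discretized space; actions step to a neighbor."""
    actions = (-1, 1)
    q = {(s, a): 0.0 for s in range(len(grid)) for a in actions}
    # The goal cell is the grid point nearest the goal; the finer the grid
    # around GOAL, the smaller the residual discretization error.
    goal = min(range(len(grid)), key=lambda s: abs(grid[s] - GOAL))
    for _ in range(episodes):
        s = random.randrange(len(grid))
        for _ in range(50):
            if s == goal:
                break
            if random.random() < eps:          # epsilon-greedy exploration
                a = random.choice(actions)
            else:
                a = max(actions, key=lambda a: q[(s, a)])
            s2 = min(max(s + a, 0), len(grid) - 1)
            r = 1.0 if s2 == goal else -abs(grid[s2] - GOAL)
            q[(s, a)] += alpha * (r + gamma * max(q[(s2, b)] for b in actions)
                                  - q[(s, a)])
            s = s2
    return q, grid[goal]

# Coarse pseudorandom grid, then three progressively finer passes of new
# pseudorandom points around the current best estimate of the goal region.
grid = sorted(van_der_corput(i) for i in range(1, 17))
q, best = q_learn(grid)
for level in range(3):
    grid = refine(grid, best, 0.1 / 2 ** level, 16, offset=100 * (level + 1))
    q, best = q_learn(grid)
print(len(grid) * 2, abs(best - GOAL))  # state-action pairs, discretization error
```

Each refinement pass halves the window around the current best estimate, so the achievable precision improves geometrically while the state-action table grows only linearly — the mechanism behind the paper's claim of high goal specificity with few state-action pairs.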



Additional information

This work was partly supported by the Air Force Office of Scientific Research, USAF (No. FA9550-08-1-0038), and the National Science Foundation under a Graduate Research Fellowship.

Amanda LAMPTON joined Systems Technology, Inc. in June 2010. Her main areas of interest are flight mechanics and control, intelligent control, and autonomous systems. As a graduate student, she studied the use of learning algorithms to solve the shape control problem of a reconfigurable or morphing air vehicle, cast as a reinforcement learning problem. The majority of this work was funded by a National Science Foundation Graduate Research Fellowship. She developed the methodology to the point that funding was secured from the Air Force Office of Scientific Research (AFOSR) for further development. She served as student technical lead on this project for the remainder of her graduate career and her post-doctoral research. Dr. Lampton has extensive experience in designing flight controllers for a myriad of problems, including the morphing aircraft problem, aerial refueling tasks, and high-performance aircraft regulator and autopilot tasks, using a variety of techniques. Amanda serves on both the AIAA Guidance, Navigation, and Control Technical Committee and the AIAA Intelligent Systems Technical Committee. She earned her B.S., M.S., and Ph.D. degrees in Aerospace Engineering at Texas A&M University in 2004, 2006, and 2009, respectively. She is a member of the American Institute of Aeronautics and Astronautics (AIAA) and the Institute of Electrical and Electronics Engineers (IEEE).

John VALASEK is Director of the Vehicle Systems & Control Laboratory and Professor of Aerospace Engineering at Texas A&M University. His research focuses on bridging the gap between computer science and aerospace engineering, encompassing machine learning and multiagent systems, intelligent autonomous control, vision-based navigation systems, fault-tolerant adaptive control, and cockpit systems and displays. John was previously a flight control engineer for the Northrop Corporation, Aircraft Division, where he worked in the Flight Controls Research Group and on the AGM-137 Tri-Services Standoff Attack Missile (TSSAM) program. He was also a summer faculty researcher at NASA Langley in 1996 and an AFOSR summer faculty research fellow in the Air Vehicles Directorate, Air Force Research Laboratory, in 1997. John has served as committee chair for 32 completed graduate degrees, and his students have won national and regional student research competitions in topics ranging from aircraft design to smart materials to computational intelligence. John is an associate editor of the Journal of Guidance, Control, and Dynamics, and a current member of the AIAA Guidance, Navigation, and Control Technical Committee; the AIAA Intelligent Systems Technical Committee; and the IEEE Technical Committee on Intelligent Learning in Control Systems. John earned his B.S. degree in Aerospace Engineering from California State Polytechnic University, Pomona in 1986, and his M.S. (with honors) and Ph.D. degrees in Aerospace Engineering from the University of Kansas in 1990 and 1995, respectively. He is an Associate Fellow of AIAA and a Senior Member of IEEE.

Mrinal KUMAR received his B.Tech. degree from the Indian Institute of Technology, Kanpur in 2004, and his Ph.D. from Texas A&M University in 2009, both in Aerospace Engineering. He is currently an assistant professor in the Department of Mechanical and Aerospace Engineering at the University of Florida, Gainesville. His current research interests include uncertainty quantification using spectral methods and the design of randomized algorithms for large-scale engineering problems.

Cite this article

Lampton, A., Valasek, J. & Kumar, M. Multiresolution state-space discretization for Q-Learning with pseudorandomized discretization. J. Control Theory Appl. 9, 431–439 (2011). https://doi.org/10.1007/s11768-011-1012-4
