
Multiresolution state-space discretization for Q-Learning with pseudorandomized discretization

Published in: Journal of Control Theory and Applications

Abstract

A multiresolution state-space discretization method with pseudorandom gridding is developed for the episodic reinforcement learning method of Q-learning. It serves as the learning agent for closed-loop control of morphing or highly reconfigurable systems. The state space is adaptively discretized by progressively finer pseudorandom grids around regions of interest within the state or learning space, in an effort to break the curse of dimensionality. Utility of the method is demonstrated on the problem of a morphing airfoil, simulated by a computationally intensive computational fluid dynamics model. By setting the multiresolution method to define the region of interest by the goal the agent seeks, it is shown that the method with the pseudorandom grid can learn a specified goal to within ±0.001 while reducing the total number of state-action pairs needed to achieve this level of specificity to fewer than 3000.
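The refinement idea described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: a toy 1-D state space stands in for the airfoil shape parameters, a van der Corput low-discrepancy sequence stands in for the pseudorandom grid, the reward and all parameter values (goal location, grid sizes, learning rates) are illustrative assumptions, and the CFD model is replaced by a simple distance-to-goal reward.

```python
import random

GOAL = 0.3  # hypothetical target value the agent must learn to reach

def van_der_corput(i, base=2):
    """i-th point of the van der Corput low-discrepancy sequence in [0, 1)."""
    f, r = 1.0, 0.0
    while i > 0:
        f /= base
        r += f * (i % base)
        i //= base
    return r

def refine(grid, center, half_width, n_new, offset):
    """Add n_new low-discrepancy points inside [center - hw, center + hw]."""
    lo = center - half_width
    pts = [lo + 2.0 * half_width * van_der_corput(offset + k)
           for k in range(1, n_new + 1)]
    return sorted(set(grid) | set(pts))

def q_learn(grid, episodes=2000, alpha=0.5, gamma=0.9, eps=0.1):
    """Tabular Q-learning on the discretized space; actions step to a neighbor."""
    actions = (-1, 1)
    q = {(s, a): 0.0 for s in range(len(grid)) for a in actions}
    # The goal cell is the grid point nearest the goal; the finer the grid
    # around GOAL, the smaller the residual discretization error.
    goal = min(range(len(grid)), key=lambda s: abs(grid[s] - GOAL))
    for _ in range(episodes):
        s = random.randrange(len(grid))
        for _ in range(50):
            if s == goal:
                break
            if random.random() < eps:          # epsilon-greedy exploration
                a = random.choice(actions)
            else:
                a = max(actions, key=lambda a: q[(s, a)])
            s2 = min(max(s + a, 0), len(grid) - 1)
            r = 1.0 if s2 == goal else -abs(grid[s2] - GOAL)
            q[(s, a)] += alpha * (r + gamma * max(q[(s2, b)] for b in actions)
                                  - q[(s, a)])
            s = s2
    return q, grid[goal]

# Coarse pseudorandom grid, then three progressively finer passes of new
# pseudorandom points around the current best estimate of the goal region.
grid = sorted(van_der_corput(i) for i in range(1, 17))
q, best = q_learn(grid)
for level in range(3):
    grid = refine(grid, best, 0.1 / 2 ** level, 16, offset=100 * (level + 1))
    q, best = q_learn(grid)
print(len(grid) * 2, abs(best - GOAL))  # state-action pairs, discretization error
```

Each refinement pass halves the window around the current best estimate, so the achievable precision improves geometrically while the state-action table grows only linearly — the mechanism behind the paper's claim of high goal specificity with few state-action pairs.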



Additional information

This work was partly supported by the Air Force Office of Scientific Research, USAF (No. FA9550-08-1-0038), and the National Science Foundation under a Graduate Research Fellowship.

Amanda LAMPTON joined Systems Technology, Inc. in June 2010. Her main areas of interest are flight mechanics and control, intelligent control, and autonomous systems. As a graduate student, she studied the use of learning algorithms to solve the shape control problem of a reconfigurable or morphing air vehicle, cast as a reinforcement learning problem. The majority of this work was funded by a National Science Foundation Graduate Research Fellowship. She developed the methodology to the point that funding was secured from the Air Force Office of Scientific Research (AFOSR) for further development. She served as student technical lead on this project for the remainder of her graduate career and her post-doctoral research. Dr. Lampton has extensive experience in designing flight controllers for a myriad of problems, including the morphing aircraft problem, aerial refueling tasks, and high-performance aircraft regulator and autopilot tasks, using a variety of techniques. Amanda serves on both the AIAA Guidance, Navigation, and Control Technical Committee and the AIAA Intelligent Systems Technical Committee. She earned her B.S., M.S., and Ph.D. degrees in Aerospace Engineering at Texas A&M University in 2004, 2006, and 2009, respectively. She is a member of the American Institute of Aeronautics and Astronautics (AIAA) and the Institute of Electrical and Electronics Engineers (IEEE).

John VALASEK is Director of the Vehicle Systems & Control Laboratory and Professor of Aerospace Engineering at Texas A&M University. His research focuses on bridging the gap between computer science and aerospace engineering, encompassing machine learning and multiagent systems, intelligent autonomous control, vision-based navigation systems, fault-tolerant adaptive control, and cockpit systems and displays. John was previously a flight control engineer for the Northrop Corporation, Aircraft Division, where he worked in the Flight Controls Research Group and on the AGM-137 Tri-Services Standoff Attack Missile (TSSAM) program. He was also a summer faculty researcher at NASA Langley in 1996 and an AFOSR summer faculty research fellow in the Air Vehicles Directorate, Air Force Research Laboratory, in 1997. John has served as committee chair for 32 completed graduate degrees, and his students have won national and regional student research competitions in topics ranging from aircraft design to smart materials to computational intelligence. John is an associate editor of the Journal of Guidance, Control, and Dynamics, and a current member of the AIAA Guidance, Navigation, and Control Technical Committee; the AIAA Intelligent Systems Technical Committee; and the IEEE Technical Committee on Intelligent Learning in Control Systems. John earned his B.S. degree in Aerospace Engineering from California State Polytechnic University, Pomona in 1986, and his M.S. (with honors) and Ph.D. degrees in Aerospace Engineering from the University of Kansas in 1990 and 1995, respectively. He is an Associate Fellow of AIAA and a Senior Member of IEEE.

Mrinal KUMAR received his B.Tech. degree from the Indian Institute of Technology, Kanpur in 2004, and his Ph.D. from Texas A&M University in 2009, both in Aerospace Engineering. He is currently an assistant professor in the Department of Mechanical and Aerospace Engineering at the University of Florida, Gainesville. His current research interests include uncertainty quantification using spectral methods and the design of randomized algorithms for large-scale engineering problems.

Cite this article

Lampton, A., Valasek, J. & Kumar, M. Multiresolution state-space discretization for Q-Learning with pseudorandomized discretization. J. Control Theory Appl. 9, 431–439 (2011). https://doi.org/10.1007/s11768-011-1012-4
