Abstract
Machine learning is currently one of the most actively developing research areas, and considerable attention is paid to problems involving dynamical systems. One area in which machine learning technologies are being actively explored is aircraft of various types and purposes, owing to the complexity and variety of the tasks assigned to aircraft. A complicating factor is incomplete and inaccurate knowledge of the properties of the object under study and of the conditions in which it operates. In particular, a variety of abnormal situations may occur during flight, such as equipment failures and structural damage, which must be counteracted by reconfiguring the aircraft’s control system and control surfaces. The control system must remain effective under these conditions by promptly changing the parameters and/or structure of the control laws it uses; adaptive control methods make it possible to satisfy this requirement. One widely used way to synthesize control laws for dynamical systems is the linear quadratic regulator (LQR) approach. A significant limitation of this approach is that the resulting control law is not adaptive, which prevents its use when knowledge of the control object and its operating environment is incomplete and inaccurate. To overcome this limitation, we propose modifying the standard LQR on the basis of approximate dynamic programming, a special case of which is the adaptive critic design (ACD) method. For the combined ACD-LQR scheme, the problem of controlling the lateral motion of a maneuvering aircraft is solved. The results obtained demonstrate the promise of this approach to controlling airplane motion under uncertainty.
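The core of the ACD-LQR idea can be illustrated on the simplest possible case. The sketch below is an illustration under stated assumptions, not the authors' implementation: it runs actor-critic-style policy iteration for a scalar discrete-time LQR problem, where the critic evaluates the quadratic value coefficient P for the current gain and the actor improves the gain from P. In an adaptive critic design these quantities would be estimated online from measured data rather than computed from the model as done here; the system parameters a, b and the initial gain k0 are hypothetical.

```python
# Illustrative sketch (not the paper's implementation): Hewer-style
# policy iteration for the scalar discrete-time LQR problem
#   x[k+1] = a*x[k] + b*u[k],  cost = sum(q*x^2 + r*u^2),
# with control u = -K*x. The "critic" is the value coefficient P in
# V(x) = P*x^2; the "actor" is the feedback gain K. In an adaptive
# critic design both would be learned from data instead of the model.

def lqr_policy_iteration(a, b, q, r, k0, iters=50):
    """Alternate critic evaluation (solve the scalar Lyapunov equation
    for P under the current gain) and actor improvement (greedy gain
    update from P). Requires a stabilizing initial gain: |a - b*k0| < 1."""
    k = k0
    for _ in range(iters):
        acl = a - b * k                              # closed-loop dynamics
        p = (q + r * k * k) / (1.0 - acl * acl)      # critic: Lyapunov solve
        k = a * b * p / (r + b * b * p)              # actor: policy improvement
    return k, p

# Hypothetical unstable plant (a = 1.2) with a stabilizing initial guess:
k, p = lqr_policy_iteration(a=1.2, b=1.0, q=1.0, r=1.0, k0=0.5)
```

The converged pair (K, P) coincides with the solution of the scalar discrete algebraic Riccati equation; the model-free ACD variant reaches the same fixed point by estimating the critic from observed state transitions.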
Funding
The paper was prepared under the Program for the Development of the World-Class Research Center “Supersonic” in 2020–2025, funded by the Russian Ministry of Science and Higher Education (Agreement dated April 20, 2022, no. 075-15-2022-309).
Contributions
The authors contributed equally to this article.
Ethics declarations
CONFLICT OF INTEREST
The authors of this work declare that they have no conflicts of interest.
ABBREVIATIONS
ACD—Adaptive Critic Design;
ADP—Approximate Dynamic Programming;
FBL—Feedback Linearization;
LQR—Linear Quadratic Regulator;
NDI—Nonlinear Dynamic Inversion;
RL—Reinforcement Learning.
About this article
Cite this article
Tiumentsev, Y.V., Zarubin, R.A. Lateral Motion Control of a Maneuverable Aircraft Using Reinforcement Learning. Opt. Mem. Neural Networks 33, 1–12 (2024). https://doi.org/10.3103/S1060992X2401003X