
2v2 Close Air Combat Decision-Making Based on Improved MAPPO Algorithm

  • Conference paper
Proceedings of 3rd 2023 International Conference on Autonomous Unmanned Systems (3rd ICAUS 2023) (ICAUS 2023)

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 1171)


Abstract

Air combat between aircraft clusters is a complex and challenging scenario. Reinforcement learning is applied to unmanned cluster control because of its strong dynamic decision-making and control capabilities. In this scenario, however, multi-agent reinforcement learning algorithms still suffer from convergence to local optima and long training times. To address these issues, our work improves the Multi-Agent Proximal Policy Optimization (MAPPO) algorithm. Specifically, we apply a mechanism that reduces the dimensionality of actions and their corresponding values, and we design an adaptive reward function that helps the agent maintain a good balance between attack and defense. In addition, we build a 2v2 close air combat simulation scenario to evaluate our algorithm. The experimental results demonstrate that models trained with our algorithm make more effective decisions.
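For orientation, the clipped surrogate objective of standard PPO, on which MAPPO builds, can be written as below. This is a sketch of the baseline formulation only: the symbols (agent index $i$, local observation $o_t^i$, global state $s_t$, clip range $\epsilon$) are notation assumed here rather than taken from the paper, and the paper's own modifications (action dimensionality reduction, adaptive reward) are not reflected.

```latex
% Probability ratio for agent i between the new and old policies
r_t^i(\theta) = \frac{\pi_\theta(a_t^i \mid o_t^i)}{\pi_{\theta_{\mathrm{old}}}(a_t^i \mid o_t^i)}

% Clipped surrogate objective (maximized with respect to \theta)
L^{\mathrm{CLIP}}(\theta) = \mathbb{E}_{t,i}\!\left[
  \min\!\Big( r_t^i(\theta)\,\hat{A}_t^i,\;
  \operatorname{clip}\big(r_t^i(\theta),\, 1-\epsilon,\, 1+\epsilon\big)\,\hat{A}_t^i \Big)
\right]
```

In MAPPO the advantage estimate $\hat{A}_t^i$ is typically computed from a centralized value function $V_\phi(s_t)$ that sees the global state during training, while each policy acts only on its local observation.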

X. Wu: This work was supported by the National Natural Science Foundation of China (62103192), China Postdoctoral Science Foundation (2021M691597) and Fundamental Research Funds for Central Universities (30922010710).




Corresponding author

Correspondence to Xiang Wu.


Copyright information

© 2024 Beijing HIWING Scientific and Technological Information Institute

About this paper


Cite this paper

Yan, Q., Ren, J., Liu, Y., Wu, X. (2024). 2v2 Close Air Combat Decision-Making Based on Improved MAPPO Algorithm. In: Qu, Y., Gu, M., Niu, Y., Fu, W. (eds) Proceedings of 3rd 2023 International Conference on Autonomous Unmanned Systems (3rd ICAUS 2023). ICAUS 2023. Lecture Notes in Electrical Engineering, vol 1171. Springer, Singapore. https://doi.org/10.1007/978-981-97-1083-6_20
