A new multi-domain cooperative resource scheduling method using proximal policy optimization

  • Original Article
Neural Computing and Applications

Abstract

In complex environments with massive multi-source data, the capability for multi-domain cooperative resource scheduling (MDCRS) has become extremely important. Optimal scheduling can reduce operating costs and time, yet multidimensional dynamic list scheduling (MDLS) remains the most commonly used algorithm in combat task scheduling today, despite its defects. This research offers a new method for the MDCRS problem: a resource scheduling method based on deep reinforcement learning (DRL), an approach that has proven effective for other scheduling problems. For the resource scheduling problem in multi-domain cooperative operations under timing constraints, an MDCRS model is built with the shortest completion time as the objective function. On this basis, the paper presents an MDCRS-MDP model based on Markov decision processes. The model has a two-dimensional action space that simultaneously allocates tasks and matches platforms, and a dense reward function closely tied to the sparse makespan-minimization criterion. Based on the MDCRS-MDP model, a DRL resource scheduling approach is proposed that covers both task-platform matching and task sequencing. Finally, experiments on a joint landing operation verify the effectiveness of the proposed method for solving MDCRS and demonstrate significant advantages over traditional dispatching rules and meta-heuristic optimization algorithms.
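To make the idea of a two-dimensional action and a dense reward tied to the makespan concrete, the following is a minimal sketch under stated assumptions. It is not the paper's MDCRS-MDP model, whose state, action, and reward definitions are not given in the abstract. The class `ToySchedulingEnv`, its fields, and the reward rule are hypothetical. Each action is a `(task, platform)` pair, and each step's reward is the negative increase in the current makespan, so the rewards of an episode sum to the negative of the final makespan.

```python
class ToySchedulingEnv:
    """Hypothetical toy environment; NOT the paper's MDCRS-MDP model."""

    def __init__(self, durations, predecessors):
        # durations[task][platform]: processing time of a task on a platform.
        self.durations = durations
        # predecessors[task]: set of tasks that must finish first (timing constraint).
        self.predecessors = predecessors
        self.reset()

    def reset(self):
        self.platform_free = [0.0] * len(self.durations[0])
        self.finish = {}  # task -> completion time of already scheduled tasks

    def makespan(self):
        return max(self.finish.values(), default=0.0)

    def legal_actions(self):
        """All (task, platform) pairs whose task is unscheduled and ready."""
        ready = [
            t for t in range(len(self.durations))
            if t not in self.finish
            and all(q in self.finish for q in self.predecessors.get(t, set()))
        ]
        return [(t, p) for t in ready for p in range(len(self.platform_free))]

    def step(self, action):
        """Apply one two-dimensional action and return (reward, done)."""
        if action not in self.legal_actions():
            raise ValueError(f"illegal action {action}")
        task, platform = action
        before = self.makespan()
        preds_done = max(
            (self.finish[q] for q in self.predecessors.get(task, set())),
            default=0.0,
        )
        start = max(self.platform_free[platform], preds_done)
        end = start + self.durations[task][platform]
        self.finish[task] = end
        self.platform_free[platform] = end
        # Dense reward (assumed form): negative growth of the makespan.
        reward = before - self.makespan()
        done = len(self.finish) == len(self.durations)
        return reward, done


def rollout(env, actions):
    """Run a fixed action sequence and return the summed reward."""
    env.reset()
    total = 0.0
    for action in actions:
        reward, _ = env.step(action)
        total += reward
    return total
```

Because the per-step rewards telescope, maximizing their undiscounted sum is equivalent to minimizing the makespan in this toy setting. A policy-gradient learner such as PPO could in principle be trained on an environment of this kind, although the paper's actual training setup is not described in the abstract.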

Data availability

Data will be made available on request.

Acknowledgements

Research for this paper was supported by the Equipment advance research project (50912020401), the project of Xiangjiang Laboratory (No.22XJ02003), the National Natural Science Foundation (No.62122093), and the Natural Science Basic Research Plan in Shanxi Province of China (No.2018JM6011). The authors also gratefully acknowledge the helpful comments and suggestions of the reviewers, which have improved the presentation.

Author information

Contributions

HL contributed to conceptualization, methodology, and research management. ZH was involved in methodology and provided software. RW was involved in conceptualization and methodology. KH contributed to conceptualization and methodology. GC was involved in conceptualization and methodology.

Corresponding authors

Correspondence to Haiying Liu or Kuihua Huang.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Liu, H., He, Z., Wang, R. et al. A new multi-domain cooperative resource scheduling method using proximal policy optimization. Neural Comput & Applic 36, 4931–4945 (2024). https://doi.org/10.1007/s00521-023-09326-x

  • DOI: https://doi.org/10.1007/s00521-023-09326-x
