research-article

Integrating Reinforcement Learning with Multi-Agent Techniques for Adaptive Service Composition

Published: 25 May 2017

Abstract

Service-oriented architecture is a widely used software engineering paradigm to cope with complexity and dynamics in enterprise applications. Service composition, which provides a cost-effective way to implement software systems, has attracted significant attention from both industry and research communities. As online services may keep evolving over time and thus lead to a highly dynamic environment, service composition must be self-adaptive to tackle uninformed behavior during the evolution of services. In addition, service composition should also maintain high efficiency for large-scale services, which are common for enterprise applications. This article presents a new model for large-scale adaptive service composition based on multi-agent reinforcement learning. The model integrates reinforcement learning and game theory, where the former is to achieve adaptation in a highly dynamic environment and the latter is to enable agents to work for a common task (i.e., composition). In particular, we propose a multi-agent Q-learning algorithm for service composition, which is expected to achieve better performance when compared with the single-agent Q-learning method and multi-agent SARSA (State-Action-Reward-State-Action) method. Our experimental results demonstrate the effectiveness and efficiency of our approach.
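The two learning rules compared in the abstract differ only in how they bootstrap from the next state. The following minimal sketch shows the standard tabular Q-learning and SARSA updates; it is not the article's multi-agent algorithm and omits the game-theoretic coordination between agents. The function names, the dictionary-based Q-table, the default values of `alpha` and `gamma`, and the reading of a state as a position in a workflow and an action as a candidate service are assumptions made for illustration.

```python
from collections import defaultdict


def q_learning_update(q, state, action, reward, next_state, next_actions,
                      alpha=0.1, gamma=0.9):
    """Off-policy update: bootstrap from the best action available next."""
    # Value of the greedy action in the next state (0.0 if it is terminal).
    best_next = max((q[(next_state, a)] for a in next_actions), default=0.0)
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


def sarsa_update(q, state, action, reward, next_state, next_action,
                 alpha=0.1, gamma=0.9):
    """On-policy update: bootstrap from the action actually taken next."""
    target = reward + gamma * q[(next_state, next_action)]
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


# Hypothetical example: in workflow state "s0" the agent invoked service "x",
# received reward 1.0, and moved to "s1", where services "a" and "b" exist.
q_table = defaultdict(float)
q_table[("s1", "a")] = 1.0
q_table[("s1", "b")] = 2.0
q_learning_update(q_table, "s0", "x", 1.0, "s1", ["a", "b"])
```

Q-learning uses the maximum over the next actions regardless of which one the agent goes on to take, whereas SARSA uses the value of the action actually selected, so its estimates depend on the exploration policy.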


      • Published in

        ACM Transactions on Autonomous and Adaptive Systems  Volume 12, Issue 2
        June 2017
        162 pages
        ISSN:1556-4665
        EISSN:1556-4703
        DOI:10.1145/3099619

        Copyright © 2017 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 25 May 2017
        • Revised: 1 February 2017
        • Accepted: 1 February 2017
        • Received: 1 January 2015
        Published in TAAS, Volume 12, Issue 2

        Qualifiers

        • research-article
        • Research
        • Refereed
