Abstract
Most recent research on agent-based production scheduling has focused on developing negotiation schema for agent cooperation. However, successful implementation of agent-based approaches relies not only on cooperation among the agents but also on each individual agent's intelligence for making good decisions. Learning is one mechanism that can allow an agent to increase its intelligence while in operation. This paper presents a study of the implementation of the Q-learning algorithm, one of the most widely used reinforcement learning approaches, for use by job agents when making routing decisions in a job shop environment. A factorial experiment is carried out to study the settings used when applying Q-learning to the job routing problem. The study investigates the effects of this Q-learning application, provides recommendations for factor settings, and offers guidelines for future applications of Q-learning to agent-based production scheduling.
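For readers unfamiliar with Q-learning, the following is a minimal sketch of the standard one-step tabular update as a job agent might use it to choose a machine. It is not the authors' implementation: the state encoding (a tuple of queue lengths), the machine names, the reward (negative time in system), and the values of `alpha`, `gamma`, and `epsilon` are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def choose_machine(q, state, machines, epsilon, rng):
    """Epsilon-greedy routing: explore with probability epsilon,
    otherwise pick the machine with the highest current Q-value."""
    if rng.random() < epsilon:
        return rng.choice(machines)
    return max(machines, key=lambda m: q[(state, m)])


def q_update(q, state, action, reward, next_state, next_actions,
             alpha=0.1, gamma=0.9):
    """Standard one-step Q-learning update:
    Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max((q[(next_state, a)] for a in next_actions), default=0.0)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
    return q[(state, action)]


# Hypothetical example: two machines, state = queue length at each machine.
q = defaultdict(float)          # all Q-values start at 0.0
rng = random.Random(0)          # seeded for repeatability
machines = ["M1", "M2"]
state = (2, 0)

# Suppose the job was routed to M2 and spent 3 time units in the system,
# so the (assumed) reward is -3.0.
new_value = q_update(q, state, "M2", reward=-3.0,
                     next_state=(2, 1), next_actions=machines)

# With exploration switched off, the agent now prefers M1 in this state.
greedy_choice = choose_machine(q, state, machines, epsilon=0.0, rng=rng)
```

The learning rate, discount factor, and exploration rate in this sketch are the kind of settings that a factorial experiment can vary; the values shown here are placeholders rather than recommendations from the study.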
Cite this article
Wang, YC., Usher, J.M. A reinforcement learning approach for developing routing policies in multi-agent production scheduling. Int J Adv Manuf Technol 33, 323–333 (2007). https://doi.org/10.1007/s00170-006-0465-y
DOI: https://doi.org/10.1007/s00170-006-0465-y