In hierarchical reinforcement learning (HRL), continuous options provide a knowledge carrier that is better aligned with human behavior, but reliable methods for scheduling them are not yet available. To fill this gap, this paper proposes the hierarchical reinforcement learning with adaptive scheduling (HAS) algorithm, which aims to achieve an adaptive balance between exploration and exploitation during the frequent scheduling of continuous options. Building on multi-step static scheduling, HAS makes switching decisions according to the relative advantages of the previous and the newly estimated options, enabling the agent to focus on different behaviors at different phases of training. The expected t-step distance is applied to demonstrate the superiority of adaptive scheduling in terms of exploration. Furthermore, an annealing-based interruption incentive is proposed to alleviate excessive exploration and accelerate convergence. We develop a comprehensive experimental analysis scheme, and the results demonstrate the high performance and robustness of HAS; moreover, they provide evidence that adaptive scheduling has a positive effect on both the representation and the option policies.
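The switching decision described in the abstract can be sketched as follows. This is a minimal illustrative reading, not the authors' implementation: the function names (`interruption_incentive`, `should_switch`), the exponential annealing schedule, and all constants are our own assumptions, as is the exact sign convention of the incentive.

```python
import math

def interruption_incentive(step, xi0=0.5, decay=1e-3):
    """Hypothetical annealed interruption incentive: large early in training
    (encouraging option switches, i.e. exploration) and decaying toward zero
    so that excessive exploration is curbed later on. The exponential
    schedule and constants are illustrative assumptions."""
    return xi0 * math.exp(-decay * step)

def should_switch(adv_prev, adv_new, step):
    """Adaptive scheduling decision (assumed form): interrupt the previous
    option when the newly estimated option's advantage, boosted by the
    annealed incentive, exceeds the advantage of continuing the previous
    option."""
    return adv_new + interruption_incentive(step) > adv_prev
```

Under this reading, early in training even a slightly worse candidate option can interrupt the current one (exploration), while after the incentive has annealed away the agent switches only when the new option is strictly better (exploitation).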