HLG: bridging human heuristic knowledge and deep reinforcement learning for optimal agent performance
Date
2024
Authors
Chen, B.
Cao, Z.
Type
Conference paper
Citation
Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems, AAMAS, 2024, vol.2024-May, pp.2189-2191
Conference Name
The 23rd International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2024) (6 May 2024 - 10 May 2024 : Auckland, New Zealand)
Abstract
Training an optimal policy in deep reinforcement learning (DRL) remains a significant challenge because sampling is inefficient in dynamic environments with sparse rewards. In this paper, we propose a Human Local Guide (HLG) that incorporates high-level human knowledge and local policies to guide DRL agents toward optimal performance. HLG encodes heuristic rules derived from human knowledge in differentiable decision trees and injects them into neural networks, so that a suboptimal global policy can be improved continuously toward the optimal level. HLG includes action guides based on a policy-switching mechanism, and adaptive action guides inspired by an approximate policy evaluation scheme that uses a perturbation model to optimise the policy further. In MiniGrid environments with sparse reward signals, HLG outperforms PPO and PROLONET, improving training efficiency and exploration capability by at least 25%. These results suggest that HLG has significant potential to continuously assist DRL agents in reaching an optimal policy in dynamic and complex environments.
Rights
Copyright 2024 by the International Foundation for Autonomous Agents and Multiagent Systems (IFAAMAS). This work is licensed under a Creative Commons Attribution 4.0 International License.