HLG: bridging human heuristic knowledge and deep reinforcement learning for optimal agent performance

Date

2024

Authors

Chen, B.
Cao, Z.

Type

Conference paper

Citation

Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems, AAMAS, 2024, vol.2024-May, pp.2189-2191

Conference Name

The 23rd International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2024) (6 May 2024 - 10 May 2024 : Auckland, New Zealand)

Abstract

Training an optimal policy in deep reinforcement learning (DRL) remains a significant challenge because sampling is inefficient in dynamic environments with sparse rewards. In this paper, we propose a Human Local Guide (HLG) that incorporates high-level human knowledge and local policies to guide DRL agents towards optimal performance. HLG encodes heuristic rules from human knowledge in differentiable decision trees and then injects them into neural networks, so that a suboptimal global policy can be improved continuously until it reaches the optimal level. HLG includes action guides based on a policy-switching mechanism, as well as adaptive action guides, inspired by an approximate policy evaluation scheme that uses a perturbation model, to optimise the policy further. HLG outperforms PPO and PROLONET by at least 25% in training efficiency and exploration capability on MiniGrid environments with sparse reward signals. This suggests that HLG has significant potential to continuously assist DRL agents in achieving an optimal policy in dynamic and complex environments.
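
The abstract gives no implementation details, so the following is only an illustrative sketch of the two ideas it names: a human heuristic rule expressed as a soft (sigmoid-gated) decision node, and a policy-switching step that chooses between a local guide and the global policy. All function names, the example rule, the `sharpness` parameter, and the `guide_weight` switching probability are assumptions made for illustration and are not taken from the paper.

```python
import math
import random


def soft_rule(feature, threshold, sharpness=5.0):
    """Soft version of the rule `feature > threshold`.

    A sigmoid gate returns a value in (0, 1) instead of a hard True/False,
    which is what makes a decision node trainable by gradient descent.
    """
    return 1.0 / (1.0 + math.exp(-sharpness * (feature - threshold)))


def heuristic_action_probs(goal_dx):
    """Hypothetical human rule: if the goal lies to the right, move right."""
    p_right = soft_rule(goal_dx, 0.0)
    return {"left": 1.0 - p_right, "right": p_right}


def guided_action(global_probs, guide_probs, guide_weight, rng):
    """Assumed policy switching: follow the guide with probability
    `guide_weight`, otherwise sample from the global policy."""
    source = guide_probs if rng.random() < guide_weight else global_probs
    actions = sorted(source)
    weights = [source[a] for a in actions]
    return rng.choices(actions, weights=weights, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    global_probs = {"left": 0.5, "right": 0.5}  # untrained global policy
    guide_probs = heuristic_action_probs(goal_dx=2.0)
    print(guided_action(global_probs, guide_probs, guide_weight=0.8, rng=rng))
```

In a sketch like this, lowering `guide_weight` over the course of training would hand control back to the global policy as it improves; whether HLG schedules its switching this way is not stated in the abstract.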

Rights

Copyright 2024 by the International Foundation for Autonomous Agents and Multiagent Systems (IFAAMAS). This work is licensed under a Creative Commons Attribution 4.0 International License.
