Yang, Xintong (ORCID: https://orcid.org/0000-0002-7612-614X), Ji, Ze (ORCID: https://orcid.org/0000-0002-8968-9902), Wu, Jing (ORCID: https://orcid.org/0000-0001-5123-9861) and Lai, Yukun (ORCID: https://orcid.org/0000-0002-2094-5680)
2022.
Abstract demonstrations and adaptive exploration for efficient and stable multi-step sparse reward reinforcement learning.
Presented at: 27th IEEE International Conference on Automation and Computing (ICAC2022),
Bristol, United Kingdom,
1-3 September 2022.
2022 27th International Conference on Automation and Computing (ICAC).
IEEE,
10.1109/ICAC55051.2022.9911100
PDF: Accepted Post-Print Version (558kB)
Abstract
Although Deep Reinforcement Learning (DRL) has been popular in many disciplines including robotics, state-of-the-art DRL algorithms still struggle to learn long-horizon, multi-step, sparse-reward tasks, such as stacking several blocks given only a task-completion reward signal. To improve learning efficiency for such tasks, this paper proposes a DRL exploration technique, termed A2, which integrates two components inspired by human experience: Abstract demonstrations and Adaptive exploration. A2 starts by decomposing a complex task into subtasks, and then provides the correct order in which to learn the subtasks. During training, the agent explores the environment adaptively, acting more deterministically for well-mastered subtasks and more stochastically for ill-learnt subtasks. Ablation and comparative experiments are conducted on several grid-world tasks and three robotic manipulation tasks. We demonstrate that A2 helps popular DRL algorithms (DQN, DDPG, and SAC) learn more efficiently and stably in these environments.
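The adaptive exploration idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the class name `AdaptiveEpsilon`, the sliding-window success estimate, and the mapping `epsilon = 1 - success_rate` are all assumptions chosen for illustration, applied here to epsilon-greedy action selection over discrete actions (as one might use with DQN).

```python
import random
from collections import deque


class AdaptiveEpsilon:
    """Per-subtask epsilon-greedy exploration driven by recent success rates.

    Assumption (not from the paper): epsilon = 1 - success_rate,
    clipped to [eps_min, eps_max].
    """

    def __init__(self, num_subtasks, window=100, eps_min=0.05, eps_max=1.0):
        self.eps_min = eps_min
        self.eps_max = eps_max
        # One sliding window of recent outcomes (1 = success, 0 = failure) per subtask.
        self.outcomes = [deque(maxlen=window) for _ in range(num_subtasks)]

    def record(self, subtask, success):
        """Store the outcome of one attempt at the given subtask."""
        self.outcomes[subtask].append(1 if success else 0)

    def success_rate(self, subtask):
        """Fraction of recent successes; 0.0 when nothing has been recorded."""
        history = self.outcomes[subtask]
        return sum(history) / len(history) if history else 0.0

    def epsilon(self, subtask):
        """High for ill-learnt subtasks, low for well-mastered ones."""
        eps = 1.0 - self.success_rate(subtask)
        return min(self.eps_max, max(self.eps_min, eps))

    def select_action(self, subtask, q_values, rng=random):
        """Random action with probability epsilon, otherwise the greedy action."""
        if rng.random() < self.epsilon(subtask):
            return rng.randrange(len(q_values))
        return max(range(len(q_values)), key=lambda a: q_values[a])
```

Under these assumptions, a subtask that has never succeeded is explored with fully random actions, while a subtask that succeeds consistently is executed almost greedily. The ordered list of subtasks supplied by an abstract demonstration would determine which `subtask` index is passed in at each step.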
| Item Type: | Conference or Workshop Item (Paper) |
|---|---|
| Date Type: | Published Online |
| Status: | Published |
| Schools: | Schools > Engineering Schools > Computer Science & Informatics |
| Publisher: | IEEE |
| ISBN: | 978-1-6654-9807-4 |
| Date of First Compliant Deposit: | 27 July 2022 |
| Last Modified: | 20 Mar 2025 22:15 |
| URI: | https://orca.cardiff.ac.uk/id/eprint/151519 |