Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 

Abstract demonstrations and adaptive exploration for efficient and stable multi-step sparse reward reinforcement learning

Yang, Xintong ORCID: https://orcid.org/0000-0002-7612-614X, Ji, Ze ORCID: https://orcid.org/0000-0002-8968-9902, Wu, Jing ORCID: https://orcid.org/0000-0001-5123-9861 and Lai, Yukun ORCID: https://orcid.org/0000-0002-2094-5680 2022. Abstract demonstrations and adaptive exploration for efficient and stable multi-step sparse reward reinforcement learning. Presented at: 27th IEEE International Conference on Automation and Computing (ICAC2022), Bristol, United Kingdom, 1-3 September 2022. 2022 27th International Conference on Automation and Computing (ICAC). IEEE, 10.1109/ICAC55051.2022.9911100

Full text: PDF, Accepted Post-Print Version (558kB)

Abstract

Although Deep Reinforcement Learning (DRL) has been popular in many disciplines including robotics, state-of-the-art DRL algorithms still struggle to learn long-horizon, multi-step, sparse-reward tasks, such as stacking several blocks given only a task-completion reward signal. To improve learning efficiency for such tasks, this paper proposes a DRL exploration technique, termed A2, which integrates two components inspired by human experience: Abstract demonstrations and Adaptive exploration. A2 starts by decomposing a complex task into subtasks, and then provides the correct order in which the subtasks should be learnt. During training, the agent explores the environment adaptively, acting more deterministically for well-mastered subtasks and more stochastically for ill-learnt subtasks. Ablation and comparative experiments are conducted on several grid-world tasks and three robotic manipulation tasks. We demonstrate that A2 helps popular DRL algorithms (DQN, DDPG, and SAC) learn more efficiently and stably in these environments.
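The adaptive exploration idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes a discrete-action, epsilon-greedy agent (in the spirit of DQN) and assumes that "well-mastered" is measured by a sliding-window success rate per subtask. The class name `AdaptiveEpsilon`, the linear mapping from success rate to epsilon, and the parameters `window`, `eps_min` and `eps_max` are all hypothetical choices made for illustration.

```python
import random
from collections import deque


class AdaptiveEpsilon:
    """Per-subtask epsilon driven by recent success rate (illustrative only).

    Assumption: mastery of a subtask is estimated from a sliding window of
    recent success/failure outcomes. The paper's abstract does not specify
    this mechanism; it is a stand-in for the sake of the example.
    """

    def __init__(self, num_subtasks, window=20, eps_min=0.05, eps_max=1.0):
        self.eps_min = eps_min
        self.eps_max = eps_max
        # One sliding window of outcomes (1 = success, 0 = failure) per subtask.
        self.history = [deque(maxlen=window) for _ in range(num_subtasks)]

    def record(self, subtask, success):
        """Store the outcome of one attempt at the given subtask."""
        self.history[subtask].append(1 if success else 0)

    def success_rate(self, subtask):
        """Fraction of recent attempts that succeeded (0.0 if no data yet)."""
        outcomes = self.history[subtask]
        return sum(outcomes) / len(outcomes) if outcomes else 0.0

    def epsilon(self, subtask):
        """High success rate gives a small epsilon (near-deterministic acting)."""
        rate = self.success_rate(subtask)
        return self.eps_max - (self.eps_max - self.eps_min) * rate

    def select_action(self, subtask, q_values, rng=random):
        """Epsilon-greedy choice using the epsilon of the current subtask."""
        if rng.random() < self.epsilon(subtask):
            return rng.randrange(len(q_values))  # explore: uniform random action
        # Exploit: index of the largest Q-value.
        return max(range(len(q_values)), key=q_values.__getitem__)


if __name__ == "__main__":
    explorer = AdaptiveEpsilon(num_subtasks=2)
    for _ in range(20):
        explorer.record(0, True)   # subtask 0: consistently solved
        explorer.record(1, False)  # subtask 1: consistently failed
    print(explorer.epsilon(0), explorer.epsilon(1))
```

In this sketch, a subtask that has been solved on every recent attempt is acted on almost greedily, while a subtask that keeps failing is explored with fully random actions. For continuous-control algorithms such as DDPG or SAC, the same per-subtask signal could plausibly scale action noise instead of epsilon, although that is likewise an assumption rather than something stated in the abstract.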

Item Type: Conference or Workshop Item (Paper)
Date Type: Published Online
Status: Published
Schools: Engineering; Computer Science & Informatics
Publisher: IEEE
ISBN: 978-1-6654-9807-4
Date of First Compliant Deposit: 27 July 2022
Last Modified: 15 Dec 2022 15:36
URI: https://orca.cardiff.ac.uk/id/eprint/151519
