| Yang, Xintong ORCID: https://orcid.org/0000-0002-7612-614X, Ji, Ze ORCID: https://orcid.org/0000-0002-8968-9902, Wu, Jing ORCID: https://orcid.org/0000-0001-5123-9861 and Lai, Yukun ORCID: https://orcid.org/0000-0002-2094-5680 2022. Abstract demonstrations and adaptive exploration for efficient and stable multi-step sparse reward reinforcement learning. Presented at: 27th IEEE International Conference on Automation and Computing (ICAC2022), Bristol, United Kingdom, 1-3 September 2022. 2022 27th International Conference on Automation and Computing (ICAC). IEEE, 10.1109/ICAC55051.2022.9911100 |
| PDF - Accepted Post-Print Version (558kB) |
Abstract
Although Deep Reinforcement Learning (DRL) has been popular in many disciplines including robotics, state-of-the-art DRL algorithms still struggle to learn long-horizon, multi-step, sparse-reward tasks, such as stacking several blocks given only a task-completion reward signal. To improve learning efficiency for such tasks, this paper proposes a DRL exploration technique, termed A2, which integrates two components inspired by human experience: Abstract demonstrations and Adaptive exploration. A2 starts by decomposing a complex task into subtasks, and then provides the correct order in which the subtasks should be learnt. During training, the agent explores the environment adaptively, acting more deterministically for well-mastered subtasks and more stochastically for ill-learnt subtasks. Ablation and comparative experiments are conducted on several grid-world tasks and three robotic manipulation tasks. We demonstrate that A2 helps popular DRL algorithms (DQN, DDPG, and SAC) learn more efficiently and stably in these environments.
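The adaptive exploration idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not state how mastery is measured or how it maps to randomness, so the sketch assumes a rolling success rate per subtask and a linear mapping from that rate to an epsilon-greedy exploration rate. The class `AdaptiveEpsilon`, its parameters (`window`, `eps_min`, `eps_max`), and the subtask names are all hypothetical.

```python
import random
from collections import deque


class AdaptiveEpsilon:
    """Per-subtask exploration rate driven by recent success (hypothetical helper)."""

    def __init__(self, subtasks, window=20, eps_min=0.05, eps_max=1.0):
        self.eps_min = eps_min
        self.eps_max = eps_max
        # Rolling record of recent outcomes (1 = success, 0 = failure) per subtask.
        self.history = {name: deque(maxlen=window) for name in subtasks}

    def record(self, subtask, success):
        """Store the outcome of one attempt at a subtask."""
        self.history[subtask].append(1 if success else 0)

    def success_rate(self, subtask):
        """Fraction of recent attempts that succeeded (0.0 if never attempted)."""
        outcomes = self.history[subtask]
        return sum(outcomes) / len(outcomes) if outcomes else 0.0

    def epsilon(self, subtask):
        # Assumed rule: interpolate linearly from eps_max (never succeeds)
        # down to eps_min (always succeeds).
        rate = self.success_rate(subtask)
        return self.eps_max - (self.eps_max - self.eps_min) * rate

    def select_action(self, subtask, q_values, rng=random):
        """Epsilon-greedy choice over discrete actions for the current subtask."""
        if rng.random() < self.epsilon(subtask):
            return rng.randrange(len(q_values))  # explore: uniform random action
        # Exploit: index of the largest Q-value.
        return max(range(len(q_values)), key=q_values.__getitem__)


# Abstract demonstration (assumed form): only the subtask order is given,
# not the low-level actions that solve each subtask.
subtask_order = ["pick_block", "place_block"]
explorer = AdaptiveEpsilon(subtask_order)

# Pretend the agent has mastered picking but still fails at placing.
for _ in range(20):
    explorer.record("pick_block", True)
    explorer.record("place_block", False)

pick_eps = explorer.epsilon("pick_block")    # close to eps_min: mostly greedy
place_eps = explorer.epsilon("place_block")  # equal to eps_max: mostly random
```

Under these assumptions the agent acts almost greedily on the mastered subtask and almost uniformly at random on the unmastered one. For continuous-control algorithms such as DDPG or SAC, the same success rate could instead scale an action-noise magnitude, which is likewise an assumption rather than something the abstract specifies.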
| Item Type: | Conference or Workshop Item (Paper) | 
|---|---|
| Date Type: | Published Online | 
| Status: | Published | 
| Schools: | Schools > Engineering Schools > Computer Science & Informatics | 
| Publisher: | IEEE | 
| ISBN: | 978-1-6654-9807-4 | 
| Date of First Compliant Deposit: | 27 July 2022 | 
| Last Modified: | 20 Mar 2025 22:15 | 
| URI: | https://orca.cardiff.ac.uk/id/eprint/151519 | 