Yang, Xintong
Abstract
Multistep tasks, such as block stacking or parts (dis)assembly, are complex for autonomous robotic manipulation. A robotic system for such tasks would need to hierarchically combine motion control at a lower level and symbolic planning at a higher level. Recently, reinforcement learning (RL)-based methods have been shown to handle robotic motion control with better flexibility and generalizability. However, these methods have limited capability to handle such complex tasks involving planning and control with many intermediate steps over a long time horizon. First, current RL systems cannot achieve varied outcomes by planning over intermediate steps (e.g., stacking blocks in different orders). Second, the exploration efficiency of learning multistep tasks is low, especially when rewards are sparse. To address these limitations, we develop a unified hierarchical reinforcement learning framework, named Universal Option Framework (UOF), to enable the agent to learn varied outcomes in multistep tasks. To improve learning efficiency, we train both symbolic planning and kinematic control policies in parallel, aided by two proposed techniques: 1) an auto-adjusting exploration strategy (AAES) at the low level to stabilize the parallel training, and 2) abstract demonstrations at the high level to accelerate convergence. To evaluate its performance, we performed experiments on various multistep block-stacking tasks with blocks of different shapes and combinations and with different degrees of freedom for robot control. The results demonstrate that our method can accomplish multistep manipulation tasks more efficiently and stably, and with significantly less memory consumption.
Item Type: | Article |
---|---|
Date Type: | Publication |
Status: | Published |
Schools: | Engineering |
Publisher: | IEEE |
ISSN: | 2162-237X |
Date of First Compliant Deposit: | 18 February 2021 |
Date of Acceptance: | 11 February 2021 |
Last Modified: | 06 Nov 2024 08:45 |
URI: | https://orca.cardiff.ac.uk/id/eprint/138649 |
Citation Data
Cited 11 times in Scopus.