Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 

Enhancing reinforcement learning with a context-based approach

Munguia Galeano, Francisco 2023. Enhancing reinforcement learning with a context-based approach. PhD Thesis, Cardiff University.
Item availability restricted.

PDF - Accepted Post-Print Version (30MB)
PDF (Cardiff University Electronic Publication Form) - Supplemental Material (131kB). Restricted to Repository staff only.


Reinforcement Learning (RL) has shown outstanding capabilities in solving complex computational problems. However, most RL algorithms lack an explicit method for learning from contextual information. Humans, by contrast, rely on context to identify patterns and relations among elements in the environment and to avoid taking incorrect actions. Decisions that seem obviously poor from a human perspective can take an agent hundreds of steps to learn to avoid. This thesis investigates methods for incorporating contextual information into RL in order to enhance learning performance. The research follows an incremental approach. First, contextual information is incorporated into RL in simulated environments, more specifically, in games. The experiments show that all the algorithms that use contextual information significantly outperform the baseline algorithms, by 77% on average. Then, the concept is validated with a hybrid approach comprising a robot in a Human-Robot Interaction (HRI) scenario dealing with rigid objects, in which the robot learns in simulation while executing actions in the real world. In this setup, the proposed algorithm, CQL, uses contextual information to train in a reduced amount of time (2.7 seconds) and reaches an 84% success rate in a grasp-and-release task while interacting with a human user. The baseline algorithm with the highest success rate reached 68% after training for significantly longer (91.8 seconds). CQL therefore suits the robot’s requirement to observe the current scenario configuration and learn to solve it while dealing with dynamic changes provoked by the user.
Additionally, the thesis explores an RL framework that uses contextual information to learn how to manipulate bags in the real world. A bag is a deformable object that presents challenges ranging from grasping to planning, and RL has the potential to address them. The learning process is accomplished through a new RL algorithm introduced in this work, called Π-learning, which is designed to find the best grasping points of the bag based on a set of compact state representations. The framework uses a set of primitive actions and represents the task in five states. In the experiments, the framework reaches success rates of 60% and 80% after around three hours of training in the real world when starting the bagging task from folded and unfolded positions, respectively. Finally, the trained model is tested on two more bags of different sizes to evaluate its generalisation capability. Overall, this research seeks to contribute to the broader advancement of RL and robotics, supporting the development of intelligent, autonomous systems that can operate effectively in diverse and dynamic real-world settings. It also seeks to explore new possibilities for automation, HRI, and the use of contextual information in RL.

Item Type: Thesis (PhD)
Date Type: Completion
Status: Unpublished
Schools: Engineering
Uncontrolled Keywords: 1) Robotics 2) Reinforcement Learning 3) Deformable Objects Manipulation
Date of First Compliant Deposit: 23 April 2024
Last Modified: 24 Apr 2024 09:11
