Gao, Yan 2024. Deep reinforcement learning-based indoor mapless robot navigation. PhD Thesis, Cardiff University.
Item availability restricted.
PDF (Thesis) - Accepted Post-Print Version: Download (25MB)
PDF (Cardiff University Electronic Publication Form) - Supplemental Material: Restricted to repository staff only (891kB)
Abstract
Navigation is a critical capability for mobile robots, enabling movement from a source to a destination. Conventional methods depend on predefined maps, which are time-consuming and labour-intensive to construct. Mapless navigation removes this requirement by finding collision-free paths using partial environmental observations. With advances in computational power and machine learning, there is a growing shift toward deep reinforcement learning (DRL)-based mapless navigation, where robots learn actions directly from raw sensory inputs. This thesis focuses on developing and improving DRL-based mapless navigation systems.

Firstly, this thesis proposes a novel hierarchical reinforcement learning (HRL)-based mapless navigation framework. Specifically, it defines two subgoal worthiness metrics: Predictive Neighbouring Space Scoring (PNSS) and Predictive Exploration Worthiness (PEW). PNSS relates to the explorable space around each subgoal, while PEW captures the spatial distribution of obstacles, including the area of free space and the arrangement of obstacles around each subgoal. The PNSS and PEW models are trained to predict these values, enabling the robot to evaluate the worthiness of each subgoal. The predicted PNSS or PEW values are then incorporated into the high-level (HL) input representation, yielding a more compact and informative representation. Additionally, a penalty element is introduced into the HL reward function, allowing the HL policy to account for the capabilities of the low-level policy when selecting subgoals. Moreover, this thesis proposes a novel subgoal space layout that enables the robot to explore locations further from its current position. Experiments in unknown environments demonstrate significant improvements over baselines.

Then, this thesis develops a novel reward function and neural network (NN) structure for HRL-based mapless navigation, designed to address local-minimum issues in complex environments.
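The subgoal selection and penalty-augmented HL reward described above can be sketched roughly as follows. All names, weights, and the worthiness scores are illustrative stand-ins (the worthiness list plays the role of the thesis's learned PNSS/PEW model outputs); this is not the thesis's actual implementation.

```python
import math

def select_subgoal(candidates, goal, predicted_worthiness, alpha=0.5):
    """Pick the subgoal that best trades off progress toward the goal
    against its predicted worthiness (stand-in for PNSS/PEW output)."""
    def score(i):
        progress = -math.dist(candidates[i], goal)  # closer to goal is better
        return progress + alpha * predicted_worthiness[i]
    best = max(range(len(candidates)), key=score)
    return candidates[best]

def hl_reward(prev_dist, new_dist, subgoal_reached, fail_penalty=1.0):
    """Dense goal-progress term, minus a penalty when the low-level policy
    could not reach the chosen subgoal (the 'penalty element' above)."""
    r = prev_dist - new_dist
    if not subgoal_reached:
        r -= fail_penalty
    return r
```

The penalty term is what lets the HL policy learn to avoid subgoals that are nominally attractive but beyond the low-level policy's ability to reach.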
The reward function for training the HL policy consists of two components: an extrinsic reward and an intrinsic reward. The extrinsic reward encourages the robot to move towards the target location, while the intrinsic reward, calculated from novelty, episodic memory, and memory decay, enables the agent to engage in spontaneous exploration. The proposed NN architecture incorporates a Long Short-Term Memory (LSTM) network to enhance the agent's memory and reasoning capabilities. Testing in unknown environments shows a significant improvement in success rates and effective resolution of local minima, especially where baseline methods fail completely.

Finally, this thesis introduces a DRL-based mapless navigation method that does not assume the availability of accurate robot pose information. It uses RGB-D-based ORB-SLAM2 for robot localisation. The trained policy effectively directs the robot towards the target while improving pose estimation by considering the quality of observed features along selected paths; feature quality depends on both the quantity and the distribution of features. To facilitate policy training, a compact state representation based on the spatial distribution of map points is proposed, enhancing the robot's awareness of areas with reliable features. Additionally, a novel reward function incorporating relative pose error is designed, increasing the policy's responsiveness to individual actions. Instead of a predetermined threshold for judging whether the discrepancy between the SLAM-estimated pose and the ground truth exceeds an acceptable limit, a dynamic threshold is employed to assess localisation performance, improving the policy's adaptability to variations in SLAM performance across environments. Experiments show the method outperforms related RL-based approaches in localisation-challenging environments.
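The dynamic localisation threshold described above could take a form like the following sketch, where the acceptable pose error adapts to a running statistic of recent errors rather than being fixed in advance. The class name, window size, and mean-plus-k-sigma rule are all assumptions for illustration; the thesis's exact criterion may differ.

```python
class DynamicErrorThreshold:
    """Adaptive limit on SLAM pose error: an error counts as a
    localisation failure only if it stands out against the recent
    error history (illustrative sketch, not the thesis's rule)."""

    def __init__(self, k=2.0, window=50):
        self.k = k            # how many std-devs above the mean is "lost"
        self.window = window  # number of recent errors to remember
        self.errors = []

    def update(self, pose_error):
        # Record the latest error, keeping only a sliding window.
        self.errors.append(pose_error)
        if len(self.errors) > self.window:
            self.errors.pop(0)

    def is_lost(self, pose_error):
        # Flag failure when the error exceeds mean + k * std of the window.
        if not self.errors:
            return False
        mean = sum(self.errors) / len(self.errors)
        var = sum((e - mean) ** 2 for e in self.errors) / len(self.errors)
        return pose_error > mean + self.k * var ** 0.5
```

Because the limit tracks recent SLAM behaviour, the same policy can tolerate environments where feature tracking is inherently noisier without retuning a hand-picked constant.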
This thesis presents novel DRL-based mapless navigation approaches, making significant contributions to both theory and practical applications. Together, these contributions advance the field of autonomous navigation, offering more adaptable, efficient, and scalable solutions.
Item Type: | Thesis (PhD) |
---|---|
Date Type: | Completion |
Status: | Unpublished |
Schools: | Schools > Engineering |
Uncontrolled Keywords: | 1) Mapless navigation; 2) Deep reinforcement learning; 3) Collision avoidance; 4) Motion planning; 5) Hierarchical reinforcement learning; 6) Simultaneous localisation and mapping |
Date of First Compliant Deposit: | 25 April 2025 |
Last Modified: | 25 Apr 2025 12:33 |
URI: | https://orca.cardiff.ac.uk/id/eprint/177917 |