Gao, Yan 2024. Deep reinforcement learning-based indoor mapless robot navigation. PhD Thesis, Cardiff University.
Item availability restricted.
PDF (Thesis) - Accepted Post-Print Version: Download (25MB)
PDF (Cardiff University Electronic Publication Form) - Supplemental Material: Restricted to repository staff only (891kB)
Abstract
Navigation is a critical capability for mobile robots, enabling movement from a source to a destination. Conventional methods depend on predefined maps, which are time-consuming and labour-intensive to construct. Mapless navigation removes this requirement by finding collision-free paths using partial environmental observations. With advances in computational power and machine learning, there is a growing shift toward deep reinforcement learning (DRL)-based mapless navigation, where robots learn actions directly from raw sensory inputs. This thesis focuses on developing and improving DRL-based mapless navigation systems.

Firstly, this thesis proposes a novel hierarchical reinforcement learning (HRL)-based mapless navigation framework. Specifically, it defines two subgoal worthiness metrics: Predictive Neighbouring Space Scoring (PNSS) and Predictive Exploration Worthiness (PEW). PNSS relates to the explorable space around each subgoal, while PEW captures the spatial distribution of obstacles, including the area of free space and the arrangement of obstacles around each subgoal. The PNSS and PEW models are trained to predict these values, enabling the robot to evaluate the worthiness of each subgoal. The predicted PNSS or PEW values are then incorporated into the high-level (HL) input representation, yielding a more compact and informative representation. Additionally, a penalty element is introduced into the HL reward function, allowing the HL policy to account for the capabilities of the low-level policy when selecting subgoals. Moreover, this thesis proposes a novel subgoal space layout that enables the robot to explore locations further from its current position. Experiments in unknown environments demonstrate significant improvements over baselines.

Then, this thesis develops a novel reward function and neural network (NN) structure for HRL-based mapless navigation, designed to address local-minimum issues in complex environments.
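The subgoal selection and penalty-augmented HL reward described above can be sketched roughly as follows. All names, weights, and the worthiness scores are illustrative stand-ins (the worthiness list plays the role of the thesis's learned PNSS/PEW model outputs); this is not the thesis's actual implementation.

```python
import math

def select_subgoal(candidates, goal, predicted_worthiness, alpha=0.5):
    """Pick the subgoal that best trades off progress toward the goal
    against its predicted worthiness (stand-in for PNSS/PEW output)."""
    def score(i):
        progress = -math.dist(candidates[i], goal)  # closer to goal is better
        return progress + alpha * predicted_worthiness[i]
    best = max(range(len(candidates)), key=score)
    return candidates[best]

def hl_reward(prev_dist, new_dist, subgoal_reached, fail_penalty=1.0):
    """Dense goal-progress term, minus a penalty when the low-level policy
    could not reach the chosen subgoal (the 'penalty element' above)."""
    r = prev_dist - new_dist
    if not subgoal_reached:
        r -= fail_penalty
    return r
```

The penalty term is what lets the HL policy learn to avoid subgoals that are nominally attractive but beyond the low-level policy's ability to reach.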
The reward function for training the HL policy consists of two components: an extrinsic reward and an intrinsic reward. The extrinsic reward encourages the robot to move towards the target location, while the intrinsic reward, calculated from novelty, episodic memory, and memory decay, enables the agent to engage in spontaneous exploration. The proposed NN architecture incorporates a Long Short-Term Memory (LSTM) network to enhance the agent's memory and reasoning capabilities. Testing in unknown environments shows a significant improvement in success rates and effective resolution of local minima, especially where baseline methods fail completely.

Finally, this thesis introduces a DRL-based mapless navigation method that does not assume the availability of accurate robot pose information. It uses RGB-D-based ORB-SLAM2 for robot localisation. The trained policy effectively directs the robot towards the target while improving pose estimation by considering the quality of observed features along selected paths; feature quality depends on both the quantity and the distribution of features. To facilitate policy training, a compact state representation based on the spatial distribution of map points is proposed, enhancing the robot's awareness of areas with reliable features. Additionally, a novel reward function incorporating relative pose error is designed, increasing the policy's responsiveness to individual actions. Instead of a predetermined threshold for judging whether the discrepancy between the SLAM-estimated pose and the ground truth exceeds an acceptable limit, a dynamic threshold is employed to assess localisation performance, improving the policy's adaptability to variations in SLAM performance across environments. Experiments show the method outperforms related RL-based approaches in localisation-challenging environments.
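The dynamic localisation threshold described above could take a form like the following sketch, where the acceptable pose error adapts to a running statistic of recent errors rather than being fixed in advance. The class name, window size, and mean-plus-k-sigma rule are all assumptions for illustration; the thesis's exact criterion may differ.

```python
class DynamicErrorThreshold:
    """Adaptive limit on SLAM pose error: an error counts as a
    localisation failure only if it stands out against the recent
    error history (illustrative sketch, not the thesis's rule)."""

    def __init__(self, k=2.0, window=50):
        self.k = k            # how many std-devs above the mean is "lost"
        self.window = window  # number of recent errors to remember
        self.errors = []

    def update(self, pose_error):
        # Record the latest error, keeping only a sliding window.
        self.errors.append(pose_error)
        if len(self.errors) > self.window:
            self.errors.pop(0)

    def is_lost(self, pose_error):
        # Flag failure when the error exceeds mean + k * std of the window.
        if not self.errors:
            return False
        mean = sum(self.errors) / len(self.errors)
        var = sum((e - mean) ** 2 for e in self.errors) / len(self.errors)
        return pose_error > mean + self.k * var ** 0.5
```

Because the limit tracks recent SLAM behaviour, the same policy can tolerate environments where feature tracking is inherently noisier without retuning a hand-picked constant.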
This thesis presents novel DRL-based mapless navigation approaches, making significant contributions to both theory and practical applications. Together, these contributions advance the field of autonomous navigation, offering more adaptable, efficient, and scalable solutions.
Item Type: | Thesis (PhD) |
---|---|
Date Type: | Completion |
Status: | Unpublished |
Schools: | Schools > Engineering |
Uncontrolled Keywords: | 1) Mapless navigation; 2) Deep reinforcement learning; 3) Collision avoidance; 4) Motion planning; 5) Hierarchical reinforcement learning; 6) Simultaneous localisation and mapping |
Date of First Compliant Deposit: | 25 April 2025 |
Last Modified: | 25 Apr 2025 12:33 |
URI: | https://orca.cardiff.ac.uk/id/eprint/177917 |