Liu, Min and Demir, Emrah
PDF - Accepted Post-Print Version
Available under License Creative Commons Attribution.
Abstract
Recent advances in reinforcement learning (RL) and deep reinforcement learning (DRL) have shown strong potential in generating near-optimal solutions to the NP-hard vehicle routing problem (VRP). This review systematically analyzes 129 relevant studies published between 2015 and 2025 through bibliometric analysis, based on data retrieved from major academic databases such as IEEE Xplore and Web of Science. The search was conducted using keywords such as ‘reinforcement learning,’ ‘traveling salesman problem,’ ‘vehicle routing problem,’ and their variants. The reviewed studies primarily adopt one of two approaches: one reformulates the VRP as a Markov decision process and uses DRL to learn its inherent structure; the other integrates RL or DRL with approximation methods such as heuristics, metaheuristics, or hyperheuristics to enhance search efficiency and solution quality. Finally, the challenges of applying RL and DRL methods to the VRP and potential directions for future research are discussed.
Item Type: | Article |
---|---|
Date Type: | Published Online |
Status: | In Press |
Schools: | Schools > Business (Including Economics) |
Additional Information: | RRS policy applied |
Publisher: | Springer |
ISSN: | 2191-4281 |
Date of First Compliant Deposit: | 10 October 2025 |
Date of Acceptance: | 9 October 2025 |
Last Modified: | 21 Oct 2025 11:45 |
URI: | https://orca.cardiff.ac.uk/id/eprint/181585 |