Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 

Learning robust reward machines from noisy labels

Parać, Roko, Nodari, Lorenzo, Ardon, Leo, Furelos-Blanco, Daniel, Cerutti, Federico ORCID: https://orcid.org/0000-0003-0755-0358 and Russo, Alessandra 2024. Learning robust reward machines from noisy labels. Presented at: 21st International Conference on Principles of Knowledge Representation and Reasoning, Hanoi, Vietnam, 2-8 November 2024. Proceedings of the 21st International Conference on Principles of Knowledge Representation and Reasoning. pp. 909-919. 10.24963/kr.2024/85

Full text not available from this repository.

Abstract

This paper presents PROB-IRM, an approach that learns robust reward machines (RMs) for reinforcement learning (RL) agents from noisy execution traces. The key aspect of RM-driven RL is the exploitation of a finite-state machine that decomposes the agent’s task into different subtasks. PROB-IRM uses a state-of-the-art inductive logic programming framework robust to noisy examples to learn RMs from noisy traces using the Bayesian posterior degree of beliefs, thus ensuring robustness against inconsistencies. Pivotal for the results is the interleaving between RM learning and policy learning: a new RM is learned whenever the RL agent generates a trace that is believed not to be accepted by the current RM. To speed up the training of the RL agent, PROB-IRM employs a probabilistic formulation of reward shaping that uses the posterior Bayesian beliefs derived from the traces. Our experimental analysis shows that PROB-IRM can learn (potentially imperfect) RMs from noisy traces and exploit them to train an RL agent to solve its tasks successfully. Despite the complexity of learning the RM from noisy traces, agents trained with PROB-IRM perform comparably to agents provided with handcrafted RMs.

Item Type: Conference or Workshop Item (Paper)
Date Type: Publication
Status: Published
Schools: Computer Science & Informatics
ISBN: 9781956792058
ISSN: 2334-1033
Last Modified: 26 Nov 2024 16:32
URI: https://orca.cardiff.ac.uk/id/eprint/173713
