Learning robust reward machines from noisy labels

Parać, Roko, Nodari, Lorenzo, Ardon, Leo, Furelos-Blanco, Daniel, Cerutti, Federico and Russo, Alessandra 2024. Learning robust reward machines from noisy labels. Presented at: 21st International Conference on Principles of Knowledge Representation and Reasoning, Hanoi, Vietnam, 2-8 November 2024. Proceedings of the 21st International Conference on Principles of Knowledge Representation and Reasoning. pp. 909-919. 10.24963/kr.2024/85

Full text not available from this repository.

Official URL: https://doi.org/10.24963/kr.2024/85

Abstract

This paper presents PROB-IRM, an approach that learns robust reward machines (RMs) for reinforcement learning (RL) agents from noisy execution traces. The key aspect of RM-driven RL is the exploitation of a finite-state machine that decomposes the agent’s task into different subtasks. PROB-IRM uses a state-of-the-art inductive logic programming framework robust to noisy examples to learn RMs from noisy traces using the Bayesian posterior degree of beliefs, thus ensuring robustness against inconsistencies. Pivotal for the results is the interleaving between RM learning and policy learning: a new RM is learned whenever the RL agent generates a trace that is believed not to be accepted by the current RM. To speed up the training of the RL agent, PROB-IRM employs a probabilistic formulation of reward shaping that uses the posterior Bayesian beliefs derived from the traces. Our experimental analysis shows that PROB-IRM can learn (potentially imperfect) RMs from noisy traces and exploit them to train an RL agent to solve its tasks successfully. Despite the complexity of learning the RM from noisy traces, agents trained with PROB-IRM perform comparably to agents provided with handcrafted RMs.
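
As a rough illustration of the ideas named in the abstract, the sketch below models a reward machine as a small finite-state machine, propagates a belief over its states through one noisy label observation, and computes a potential-based shaping reward from that belief. It is not the PROB-IRM implementation: the class and function names, the "coffee then office" task, the label probabilities, and the potential values are all hypothetical, and the shaping rule shown is the standard potential-based form applied to expected potentials, which may differ from the paper's formulation.

```python
from dataclasses import dataclass


@dataclass
class RewardMachine:
    """Hypothetical minimal reward machine: a finite-state machine over labels."""
    initial: str
    accepting: str
    transitions: dict  # maps (state, label) -> next state

    def step(self, state, label):
        # Stay in the current state when no transition matches the label.
        return self.transitions.get((state, label), state)

    def accepts(self, trace):
        # A trace is accepted if it drives the machine to the accepting state.
        state = self.initial
        for label in trace:
            state = self.step(state, label)
        return state == self.accepting


def update_belief(rm, belief, label_probs):
    """Propagate a belief over RM states through one noisy observation.

    label_probs maps each candidate label to the (assumed) probability
    that it is the label that truly occurred at this step.
    """
    new_belief = {}
    for state, p_state in belief.items():
        for label, p_label in label_probs.items():
            nxt = rm.step(state, label)
            new_belief[nxt] = new_belief.get(nxt, 0.0) + p_state * p_label
    return new_belief


def shaped_reward(belief, next_belief, potential, gamma=0.99):
    """Potential-based shaping using the expected potential under each belief."""
    phi = sum(p * potential[s] for s, p in belief.items())
    phi_next = sum(p * potential[s] for s, p in next_belief.items())
    return gamma * phi_next - phi


# Hypothetical task: observe "coffee", then "office".
rm = RewardMachine(
    initial="u0",
    accepting="u_acc",
    transitions={("u0", "coffee"): "u1", ("u1", "office"): "u_acc"},
)

belief = {"u0": 1.0}
# Assumed sensor output: 80% confident that "coffee" was observed.
next_belief = update_belief(rm, belief, {"coffee": 0.8, "none": 0.2})
potential = {"u0": 0.0, "u1": 1.0, "u_acc": 2.0}
bonus = shaped_reward(belief, next_belief, potential, gamma=1.0)
```

In this toy setting the belief shifts mostly to the intermediate state after the noisy "coffee" observation, so the shaping bonus is positive but discounted by the sensor's uncertainty. A learner that interleaves RM learning with policy learning, as the abstract describes, would additionally trigger relearning when a generated trace is believed not to be accepted by the current machine; that step is omitted here.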

Item Type:	Conference or Workshop Item (Paper)
Date Type:	Publication
Status:	Published
Schools:	Schools > Computer Science & Informatics
ISBN:	9781956792058
ISSN:	2334-1033
Last Modified:	26 Nov 2024 16:32
URI:	https://orca.cardiff.ac.uk/id/eprint/173713
