Khalid, Muhammad, Nourollah, Amir Masoud and Schockaert, Steven ORCID: https://orcid.org/0000-0002-9256-2881
2025.
Large language and reasoning models are shallow disjunctive reasoners.
Presented at: The 63rd Annual Meeting of the Association for Computational Linguistics,
Vienna, Austria,
27 July - 1 August 2025.
Published in: Che, Wanxiang, Nabende, Joyce, Shutova, Ekaterina and Pilehvar, Mohammad Taher eds.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers).
Vol. 1.
Vienna, Austria:
Association for Computational Linguistics,
pp. 8843-8869.
10.18653/v1/2025.acl-long.433
PDF - Published Version
Available under License Creative Commons Attribution.
Abstract
Large Language Models (LLMs) have been found to struggle with systematic reasoning. Even on tasks where they appear to perform well, their performance often depends on shortcuts rather than genuine reasoning abilities, leading them to collapse on out-of-distribution (OOD) examples. Post-training strategies based on reinforcement learning and chain-of-thought prompting have recently been hailed as a step change. However, little is known about the potential of the resulting “Large Reasoning Models” (LRMs) beyond maths and programming-based problem solving, where genuine OOD problems can be sparse. In this paper, we focus on tasks that require systematic relational composition for qualitative spatial and temporal reasoning. The setting allows fine control over problem difficulty to precisely measure OOD generalization. We find that zero-shot LRMs generally outperform their LLM counterparts in single-path reasoning tasks but struggle in the multi-path setting. While showing comparatively better results, fine-tuned LLMs are also not capable of multi-path generalization. We also provide evidence for a behavioral interpretation of this, namely that LRMs are shallow disjunctive reasoners.
| Item Type: | Conference or Workshop Item - published (Paper) |
|---|---|
| Date Type: | Publication |
| Status: | Published |
| Schools: | Schools > Computer Science & Informatics |
| Publisher: | Association for Computational Linguistics |
| ISBN: | 979-8-89176-251-0 |
| Date of First Compliant Deposit: | 22 July 2025 |
| Date of Acceptance: | 15 May 2025 |
| Last Modified: | 29 Jan 2026 12:20 |
| URI: | https://orca.cardiff.ac.uk/id/eprint/179567 |