Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 

Dialogue meets data: a large language model approach for efficient linked data retrieval

Mussa, Omar, Rana, Omer ORCID: https://orcid.org/0000-0003-3597-2646, Goossens, Benoit ORCID: https://orcid.org/0000-0003-2360-4643, Orozco Ter Wengel, Pablo ORCID: https://orcid.org/0000-0002-7951-4148 and Perera, Charith ORCID: https://orcid.org/0000-0002-0190-3346 2026. Dialogue meets data: a large language model approach for efficient linked data retrieval. ACM Transactions on the Web
Item availability restricted.

Abstract

While large language models (LLMs) have captured global attention for their linguistic abilities, our work harnesses their power to overcome traditional barriers in querying Linked Data (LD) and Resource Description Framework (RDF) triplestores. This paper presents an innovative framework that integrates LLMs into conversational user interfaces (UIs), enabling the dynamic generation of precise SPARQL queries without the necessity for constant retraining. Most conversational UI models struggle with adaptability as they require frequent retraining whenever datasets are updated or expanded. This limitation impedes their effectiveness as general-purpose extraction tools. To address this challenge, our approach seamlessly incorporates LLMs into the conversational UI process, fostering a more sophisticated understanding and interpretation of user queries and enhancing overall responsiveness. By leveraging the advanced natural language processing capabilities of LLMs, our method improves RDF entity extraction in web systems that utilise conventional chatbots. Furthermore, it extends the functionality of these chatbots, allowing them to respond directly to queries based on the RDF schema while providing an assistive interface that deepens understanding of the dataset and its underlying domain. This facilitates the extraction of more meaningful information. By adopting this approach, interactions become more refined and context-sensitive, a crucial advancement for managing the intricate query structures commonly found in RDF datasets and Linked Open Data (LOD) endpoints. We have evaluated our approach in practical settings by assessing the tool's ability to address complex queries and to answer general ecological queries, with the outputs evaluated by human experts. The results demonstrate a notable improvement in system expressiveness and response accuracy, showcasing the potential of LLMs to transform information retrieval. The findings not only confirm their adaptability in enhancing existing systems but also open up exciting possibilities for their deployment in specialised web information domains, paving the way for future research in this evolving field.
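
The abstract does not spell out how a user request becomes a SPARQL query, so the following is only an illustrative sketch and not the authors' implementation. It assumes that an LLM extraction step has already produced a small structured result (the hypothetical `ExtractedIntent` class, with a class IRI and a text filter), and shows how such a result could be turned into a query string without retraining any model. The class name, field names, and the `http://example.org/Species` IRI are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExtractedIntent:
    """Hypothetical structured output of an LLM entity-extraction step."""
    rdf_class: str        # full IRI of the RDF class the user asked about
    label_contains: str   # free-text filter taken from the user's question
    limit: int = 10       # maximum number of rows to return


def escape_literal(text: str) -> str:
    """Escape backslashes, double quotes and newlines for a SPARQL string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def build_sparql(intent: ExtractedIntent) -> str:
    """Build a SELECT query listing labelled instances of the requested class."""
    # Reject characters that are not allowed inside an IRI reference (<...>).
    if any(ch in intent.rdf_class for ch in '<>" {}|\\^`'):
        raise ValueError("rdf_class is not a valid IRI")
    needle = escape_literal(intent.label_contains.lower())
    return (
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "SELECT ?item ?label WHERE {\n"
        f"  ?item a <{intent.rdf_class}> ;\n"
        "        rdfs:label ?label .\n"
        f'  FILTER(CONTAINS(LCASE(STR(?label)), "{needle}"))\n'
        "}\n"
        f"LIMIT {int(intent.limit)}"
    )


if __name__ == "__main__":
    # Example: the user asked something like "show me species named orangutan".
    print(build_sparql(ExtractedIntent("http://example.org/Species", "orangutan")))
```

Keeping the query template in ordinary code and limiting the model's role to filling a few validated slots is one possible design; the paper itself may generate queries quite differently.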

Item Type: Article
Status: In Press
Schools: Schools > Computer Science & Informatics
Schools > Biosciences
Publisher: Association for Computing Machinery (ACM)
ISSN: 1559-1131
Date of First Compliant Deposit: 26 January 2026
Date of Acceptance: 10 January 2026
Last Modified: 26 Jan 2026 11:15
URI: https://orca.cardiff.ac.uk/id/eprint/184155
