Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 

Grouping entities with shared properties using multi-facet prompting and property embeddings

Gajbhiye, Amit, Bailleux, Thomas, Bouraoui, Zied, Espinosa-Anke, Luis ORCID: https://orcid.org/0000-0001-6830-9176 and Schockaert, Steven ORCID: https://orcid.org/0000-0002-9256-2881 2025. Grouping entities with shared properties using multi-facet prompting and property embeddings. Presented at: The 2025 Conference on Empirical Methods in Natural Language Processing (EMNLP), Suzhou, China, 4-9 November 2025. Published in: Christodoulopoulos, Christos, Chakraborty, Tanmoy, Rose, Carolyn and Peng, Violet eds. Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing. Suzhou, China: Association for Computational Linguistics, pp. 15600-15615. DOI: 10.18653/v1/2025.emnlp-main.787

PDF - Published Version
Available under License Creative Commons Attribution.

Abstract

Methods for learning taxonomies from data have been widely studied. We study a specific version of this task, called commonality identification, where only the set of entities is given and we need to find meaningful ways to group those entities. While LLMs should intuitively excel at this task, it is difficult to directly use such models in large domains. In this paper, we instead use LLMs to describe the different properties that are satisfied by each of the entities individually. We then use pre-trained embeddings to cluster these properties, and finally group entities that have properties which belong to the same cluster. To achieve good results, it is paramount that the properties predicted by the LLM are sufficiently diverse. We find that this diversity can be improved by prompting the LLM to structure the predicted properties into different facets of knowledge.
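
The three-stage pipeline described in the abstract (describe properties per entity, cluster the properties by embedding, group entities whose properties share a cluster) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the entities and property strings are hard-coded stand-ins for LLM output, the 3-dimensional vectors are toy stand-ins for pre-trained embeddings, and the greedy threshold clustering with cosine similarity is a simple substitute for whatever clustering method the authors actually use.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical stand-in for LLM output: properties listed per entity.
ENTITY_PROPERTIES = {
    "lemon": ["sour taste", "yellow colour"],
    "vinegar": ["tastes sour"],
    "banana": ["yellow in colour", "sweet taste"],
    "honey": ["tastes sweet"],
}

# Toy 3-d vectors standing in for pre-trained property embeddings.
TOY_EMBEDDINGS = {
    "sour taste": (1.0, 0.1, 0.0),
    "tastes sour": (0.9, 0.2, 0.0),
    "yellow colour": (0.0, 1.0, 0.1),
    "yellow in colour": (0.1, 0.9, 0.0),
    "sweet taste": (0.0, 0.1, 1.0),
    "tastes sweet": (0.1, 0.0, 0.9),
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cluster_properties(embeddings, threshold=0.8):
    """Greedy clustering (an assumed, simplified method).

    Each property joins the first cluster whose seed (first member)
    is at least `threshold` similar; otherwise it starts a new cluster.
    """
    clusters = []
    for prop, vec in embeddings.items():
        for cluster in clusters:
            if cosine(vec, embeddings[cluster[0]]) >= threshold:
                cluster.append(prop)
                break
        else:
            clusters.append([prop])
    return clusters


def group_entities(entity_properties, clusters):
    """Group entities that have at least one property in the same cluster.

    Groups are labelled by the cluster's seed property; clusters that
    cover only a single entity are dropped.
    """
    prop_to_cluster = {p: i for i, c in enumerate(clusters) for p in c}
    members = defaultdict(set)
    for entity, props in entity_properties.items():
        for prop in props:
            members[prop_to_cluster[prop]].add(entity)
    return {
        clusters[i][0]: sorted(entities)
        for i, entities in members.items()
        if len(entities) > 1
    }


clusters = cluster_properties(TOY_EMBEDDINGS)
groups = group_entities(ENTITY_PROPERTIES, clusters)
for label, entities in groups.items():
    print(label, "->", entities)
```

Note that an entity can land in several groups (here "lemon" is grouped both by taste and by colour), which is why the diversity of the predicted properties matters: an entity can only be grouped along facets that the property list actually covers.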

Item Type: Conference or Workshop Item - published (Paper)
Date Type: Publication
Status: Published
Schools: Schools > Computer Science & Informatics
Publisher: Association for Computational Linguistics
ISBN: 979-8-89176-332-6
Date of First Compliant Deposit: 21 October 2025
Date of Acceptance: 20 August 2025
Last Modified: 30 Jan 2026 16:57
URI: https://orca.cardiff.ac.uk/id/eprint/181774
