Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 

Ultra-fine entity typing with prior knowledge about labels: a simple clustering based approach

Li, Na, Bouraoui, Zied and Schockaert, Steven ORCID: https://orcid.org/0000-0002-9256-2881 2023. Ultra-fine entity typing with prior knowledge about labels: a simple clustering based approach. Presented at: Findings of the Association for Computational Linguistics: EMNLP 2023, Singapore, December 2023. Published in: Bouamor, Houda, Pino, Juan and Bali, Kalika eds. Findings of the Association for Computational Linguistics: EMNLP 2023. Association for Computational Linguistics, 11744–11756. 10.18653/v1/2023.findings-emnlp.786

PDF - Accepted Post-Print Version

Abstract

Ultra-fine entity typing (UFET) is the task of inferring which semantic types, from a large set of fine-grained candidates, apply to a given entity mention. This task is especially challenging because we only have a small number of training examples for many types, even with distant supervision strategies. State-of-the-art models, therefore, have to rely on prior knowledge about the type labels in some way. In this paper, we show that the performance of existing methods can be improved using a simple technique: we use pre-trained label embeddings to cluster the labels into semantic domains and then treat these domains as additional types. We show that this strategy consistently leads to improved results as long as high-quality label embeddings are used. Furthermore, we use the label clusters as part of a simple post-processing technique, which results in further performance gains. Both strategies treat the UFET model as a black box and can thus straightforwardly be used to improve a wide range of existing models.
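
The core idea in the abstract (cluster the labels using their embeddings, then treat each cluster as an additional type) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration: the two-dimensional toy embeddings, the plain k-means routine, the `domain_<id>` naming, and the helper names `kmeans` and `add_domain_types`. The record does not state which clustering algorithm or embeddings the paper uses, so this should not be read as the authors' implementation.

```python
from math import dist

def kmeans(vectors, k, iters=20):
    """Plain k-means; the first k vectors serve as initial centroids (deterministic)."""
    centroids = [list(v) for v in vectors[:k]]
    assign = [0] * len(vectors)
    for _ in range(iters):
        # Assign each vector to its nearest centroid.
        assign = [min(range(k), key=lambda c: dist(v, centroids[c])) for v in vectors]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, a in zip(vectors, assign) if a == c]
            if members:
                centroids[c] = [sum(x) / len(members) for x in zip(*members)]
    return assign

def add_domain_types(example_labels, label_to_cluster):
    """Augment an example's gold labels with one pseudo-type per label cluster."""
    domains = {f"domain_{label_to_cluster[l]}" for l in example_labels}
    return sorted(set(example_labels) | domains)

# Hypothetical toy "label embeddings"; real ones would come from a pre-trained model.
label_embeddings = {
    "musician": (0.9, 0.1),
    "river": (0.1, 0.9),
    "singer": (0.8, 0.2),
    "lake": (0.2, 0.8),
}

labels = list(label_embeddings)
assign = kmeans([label_embeddings[l] for l in labels], k=2)
label_to_cluster = dict(zip(labels, assign))

# A training example labelled "singer" also receives its domain as an extra type.
augmented = add_domain_types(["singer"], label_to_cluster)
print(label_to_cluster)
print(augmented)
```

Because the domain labels are added only to the training targets, the underlying typing model needs no architectural change, which matches the abstract's point that the approach treats the UFET model as a black box.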

Item Type: Conference or Workshop Item (Paper)
Status: Published
Schools: Computer Science & Informatics
Publisher: Association for Computational Linguistics
Date of First Compliant Deposit: 13 February 2024
Date of Acceptance: 7 October 2023
Last Modified: 13 Feb 2024 16:45
URI: https://orca.cardiff.ac.uk/id/eprint/165642
