Antypas, Dimosthenis; Ushio, Asahi; Barbieri, Francesco; Neves, Leonardo; Rezaee, Kiamehr; Espinosa-Anke, Luis
PDF - Published Version. Available under License Creative Commons Attribution. Download (1MB)
Abstract
Despite its relevance, the maturity of NLP for social media pales in comparison with general-purpose models, metrics and benchmarks. This fragmented landscape makes it hard for the community to know, for instance, which model performs best on a given task and how it compares with others. To alleviate this issue, we introduce a unified benchmark for NLP evaluation in social media, SuperTweetEval, which includes a heterogeneous set of tasks and datasets combined, adapted and constructed from scratch. We benchmarked the performance of a wide range of models on SuperTweetEval and our results suggest that, despite the recent advances in language modelling, social media remains challenging.
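For context, SuperTweetEval is distributed through the Hugging Face Hub, and the sketch below shows how one of its tasks might be loaded with the `datasets` library. This is a minimal sketch under assumptions: the dataset id `cardiffnlp/super_tweeteval` and the config name `tweet_topic` are not stated on this page.

```python
# Minimal sketch: load one SuperTweetEval task via the HuggingFace `datasets` library.
# Assumptions (not stated on this page): the dataset id "cardiffnlp/super_tweeteval"
# and the task config name "tweet_topic".
from datasets import load_dataset

# Fetch the train/validation/test splits for a single benchmark task.
dataset = load_dataset("cardiffnlp/super_tweeteval", "tweet_topic")

# Inspect one training example to see the task's field layout.
print(dataset["train"][0])
```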
| Item Type: | Conference or Workshop Item (Paper) |
|---|---|
| Date Type: | Publication |
| Status: | Published |
| Schools: | Computer Science & Informatics |
| Publisher: | Association for Computational Linguistics |
| Date of First Compliant Deposit: | 4 November 2024 |
| Last Modified: | 04 Nov 2024 15:48 |
| URI: | https://orca.cardiff.ac.uk/id/eprint/172500 |