Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 

What's the meaning of superhuman performance in today's NLU?

Tedeschi, Simone, Bos, Johan, Declerck, Thierry, Hajic, Jan, Hershcovich, Daniel, Hovy, Eduard, Koller, Alexander, Krek, Simon, Schockaert, Steven ORCID: https://orcid.org/0000-0002-9256-2881, Sennrich, Rico, Shutova, Ekaterina and Navigli, Roberto 2023. What's the meaning of superhuman performance in today's NLU? Presented at: The 61st Annual Meeting of the Association for Computational Linguistics (ACL’23), Toronto, Canada, 9-14 July 2023. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics, vol. 1. Association for Computational Linguistics, pp. 12471-12491. 10.18653/v1/2023.acl-long.697

PDF - Accepted Post-Print Version
Available under License Creative Commons Attribution.

Abstract

In the last five years, there has been a significant focus in Natural Language Processing (NLP) on developing larger Pretrained Language Models (PLMs) and introducing benchmarks such as SuperGLUE and SQuAD to measure their abilities in language understanding, reasoning, and reading comprehension. These PLMs have achieved impressive results on these benchmarks, even surpassing human performance in some cases. This has led to claims of superhuman capabilities and the provocative idea that certain tasks have been solved. In this position paper, we take a critical look at these claims and ask whether PLMs truly have superhuman abilities and what the current benchmarks are really evaluating. We show that these benchmarks have serious limitations affecting the comparison between humans and PLMs and provide recommendations for fairer and more transparent benchmarks.

Item Type: Conference or Workshop Item (Paper)
Date Type: Publication
Status: Published
Schools: Schools > Computer Science & Informatics
Publisher: Association for Computational Linguistics
ISBN: 978-195942972-2
ISSN: 0736-587X
Date of First Compliant Deposit: 12 June 2023
Date of Acceptance: 2 May 2023
Last Modified: 12 May 2025 12:40
URI: https://orca.cardiff.ac.uk/id/eprint/160269
