Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 

IAPUCP at SemEval-2021 task 1: Stacking fine-tuned transformers is almost all you need for lexical complexity prediction

Rivas Rojas, Kervy and Alva-Manchego, Fernando 2021. IAPUCP at SemEval-2021 task 1: Stacking fine-tuned transformers is almost all you need for lexical complexity prediction. Presented at: 15th International Workshop on Semantic Evaluation (SemEval 2021), Virtual, 05-06 August 2021. Published in: Palmer, Alexis, Schneider, Nathan, Schluter, Natalie, Emerson, Guy, Herbelot, Aurelie and Zhu, Xiaodan eds. Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021). Association for Computational Linguistics, pp. 144-149. 10.18653/v1/2021.semeval-1.14

PDF (2021.semeval-1.14.pdf) - Published Version
Available under License Creative Commons Attribution.

Abstract

This paper describes our submission to SemEval-2021 Task 1: predicting the complexity score for single words. Our model leverages standard morphosyntactic and frequency-based features that proved helpful for Complex Word Identification (a related task), and combines them with predictions made by Transformer-based pre-trained models that were fine-tuned on the Shared Task data. Our submission system stacks all previous models with a LightGBM model on top. One novelty of our approach is the use of multi-task learning for fine-tuning a pre-trained model for both Lexical Complexity Prediction and Word Sense Disambiguation. Our analysis shows that all independent models achieve good performance on the task, but that stacking them obtains a Pearson correlation of 0.7704, merely 0.018 points behind the winning submission.
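The stacking idea in the abstract can be illustrated with a minimal sketch. It is not the authors' system: a plain linear meta-learner fitted by gradient descent stands in for LightGBM, the two "base models" are hypothetical, and every number below is made-up toy data. The function names (`pearson`, `fit_linear_stacker`, `stack_predict`, `mse`) are also illustrative assumptions. The sketch shows only the general pattern: base-model predictions become the input features of a second-level model, and the result is scored with Pearson correlation.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def mse(preds, targets):
    """Mean squared error between predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(targets)


def stack_predict(weights, bias, base_preds):
    """Combine each row of base-model predictions into one stacked prediction."""
    return [sum(w * x for w, x in zip(weights, row)) + bias for row in base_preds]


def fit_linear_stacker(base_preds, targets, lr=0.1, epochs=2000):
    """Fit a linear meta-learner on base-model predictions.

    Assumption: a linear model trained by gradient descent on MSE replaces
    the LightGBM meta-learner named in the abstract. Weights start as a
    plain average of the base models.
    """
    n_models = len(base_preds[0])
    n = len(targets)
    weights = [1.0 / n_models] * n_models
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_models
        grad_b = 0.0
        for row, y in zip(base_preds, targets):
            err = sum(w * x for w, x in zip(weights, row)) + bias - y
            for j, x in enumerate(row):
                grad_w[j] += 2 * err * x / n
            grad_b += 2 * err / n
        weights = [w - lr * g for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b
    return weights, bias


# Made-up toy data: gold complexity scores in [0, 1] and the predictions
# of two hypothetical base models (one row per target word).
gold = [0.10, 0.25, 0.40, 0.55, 0.70]
base_preds = [
    [0.15, 0.05],
    [0.20, 0.30],
    [0.45, 0.35],
    [0.50, 0.65],
    [0.60, 0.75],
]

average = stack_predict([0.5, 0.5], 0.0, base_preds)  # unweighted baseline
weights, bias = fit_linear_stacker(base_preds, gold)
stacked = stack_predict(weights, bias, base_preds)

print("Pearson (average):", round(pearson(average, gold), 4))
print("Pearson (stacked):", round(pearson(stacked, gold), 4))
```

For brevity the meta-learner here is fitted and scored on the same five rows. A real stacking setup would normally train the second-level model on held-out or out-of-fold base predictions, to avoid rewarding base models that overfit.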

Item Type: Conference or Workshop Item (Paper)
Date Type: Publication
Status: Published
Schools: Computer Science & Informatics
Additional Information: File distributed under a Creative Commons Attribution 4.0 International License.
Publisher: Association for Computational Linguistics
Date of First Compliant Deposit: 14 February 2022
Last Modified: 14 Feb 2022 16:30
URI: https://orca.cardiff.ac.uk/id/eprint/147258
