Ezeani, I, Piao, S, Neale, Steven, Rayson, P and Knight, Dawn ORCID: https://orcid.org/0000-0002-4745-6502 2019. Leveraging pre-trained embeddings for Welsh Taggers. Presented at: 4th Workshop on Representation Learning for NLP, Florence, Italy, July 2019. ACL Anthology: Proceedings of the 4th Workshop on Representation Learning for NLP, vol. W19-43. Association for Computational Linguistics. 10.18653/v1/W19-4332
Abstract
While the application of word embedding models to downstream Natural Language Processing (NLP) tasks has been shown to be successful, the benefits for low-resource languages are somewhat limited due to the lack of adequate data for training the models. However, NLP research efforts for low-resource languages have focused on seeking ways to harness pre-trained models to improve the performance of NLP systems built to process these languages without the need to re-invent the wheel. One such language is Welsh; in this paper, we therefore present the results of our experiments on learning a simple multi-task neural network model for part-of-speech and semantic tagging for Welsh using a pre-trained embedding model from FastText. Our model’s performance was compared with that of the existing rule-based stand-alone part-of-speech and semantic taggers. Despite its simplicity and capacity to perform both tasks simultaneously, our tagger compared very well with the existing taggers.
Item Type: | Conference or Workshop Item (Paper) |
---|---|
Date Type: | Publication |
Status: | Published |
Schools: | English, Communication and Philosophy |
Publisher: | Association for Computational Linguistics |
Last Modified: | 26 Oct 2022 08:05 |
URI: | https://orca.cardiff.ac.uk/id/eprint/126545 |