Spasic, Irena
PDF - Published Version (261kB)
Available under License Creative Commons Attribution.
Abstract
This paper investigates the adaptation to Welsh of an existing system for multi-word term recognition, originally developed for English. We give an overview of the modifications required, with a special focus on an important difference between the two languages, which represent the Germanic and Celtic language families respectively: the directionality of noun phrases. We successfully modelled this difference by means of lexico-syntactic patterns, which are parameters of the system and, therefore, required no re-implementation of the core algorithm. The performance of the Welsh version was compared against that of the English version. For this purpose, we assembled three parallel domain-specific corpora. The results were compared in terms of precision and recall. Comparable performance was achieved across the three domains in terms of the two measures (P = 68.9%, R = 55.7%), and also in the ranking of automatically extracted terms, as measured by the weighted kappa coefficient (κ = 0.7758). These early results indicate that our approach to term recognition can provide a basis for machine translation of multi-word terms.
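The idea of treating lexico-syntactic patterns as parameters can be illustrated with a minimal sketch. It assumes one-letter part-of-speech tags (`N` noun, `A` adjective, `O` other) and two deliberately simplified patterns; the names `PATTERNS` and `candidate_terms`, the tag set, and the toy sentences are hypothetical and are not the patterns or data used in the paper. The point is only that switching from modifier-first (English) to head-first (Welsh) noun phrases changes a pattern, not the matching code.

```python
import re

# Hypothetical, simplified patterns over one-letter POS tags
# (N = noun, A = adjective). These are illustrative assumptions,
# not the patterns used in the paper.
PATTERNS = {
    "en": re.compile(r"[AN]+N"),  # modifiers precede the head noun
    "cy": re.compile(r"N[AN]+"),  # modifiers follow the head noun
}


def candidate_terms(tagged, lang):
    """Return multi-word candidates whose tag sequence matches the pattern.

    `tagged` is a list of (word, tag) pairs with exactly one character
    per tag, so regex match offsets map directly onto token positions.
    """
    tags = "".join(tag for _, tag in tagged)
    return [
        " ".join(word for word, _ in tagged[m.start():m.end()])
        for m in PATTERNS[lang].finditer(tags)
    ]


# Toy inputs (assumed for illustration only).
english = [("the", "O"), ("chronic", "A"), ("back", "N"),
           ("pain", "N"), ("persists", "O")]
welsh = [("mae", "O"), ("poen", "N"), ("cefn", "N"), ("cronig", "A")]

print(candidate_terms(english, "en"))
print(candidate_terms(welsh, "cy"))
```

Because the language-specific knowledge lives entirely in the `PATTERNS` table, supporting another language in this sketch would mean adding one entry rather than changing `candidate_terms`.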
Item Type: | Conference or Workshop Item (Paper) |
---|---|
Date Type: | Published Online |
Status: | Published |
Schools: | Mathematics; English, Communication and Philosophy; Computer Science & Informatics; Data Innovation Research Institute (DIURI) |
Subjects: | Q Science > QA Mathematics > QA75 Electronic computers. Computer science; Q Science > QA Mathematics > QA76 Computer software |
Publisher: | European Association for Machine Translation |
Date of First Compliant Deposit: | 8 October 2019 |
Last Modified: | 18 Jan 2025 22:17 |
URI: | https://orca.cardiff.ac.uk/id/eprint/125820 |