Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 

Simulation and annotation of global acronyms

Filimonov, Maxim, Chopard, Daphne and Spasic, Irena 2022. Simulation and annotation of global acronyms. Bioinformatics 38 (11), pp. 3136-3138. 10.1093/bioinformatics/btac298

PDF - Published Version
Available under License Creative Commons Attribution.


Motivation: Global acronyms are used in written text without their formal definitions. This makes it difficult to interpret their sense automatically, as acronyms tend to be ambiguous. Supervised machine learning approaches to sense disambiguation require large training datasets. In clinical applications, large datasets are difficult to obtain because of patient privacy, and manual data annotation creates an additional bottleneck.

Results: We propose an approach that automatically modifies scientific abstracts to (1) simulate global acronym usage and (2) annotate the acronyms' senses without the need for external sources or manual intervention. We implemented it as a web-based application that can create large datasets, which in turn can be used to train supervised approaches to word sense disambiguation of biomedical acronyms.

Availability:
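
The two steps named in the Results can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes a simple heuristic in which a long form is introduced as "long form (ACRONYM)", the initials of the preceding words spell the acronym, and words are separated by single spaces. The function names `find_definitions` and `simulate_global_acronyms` are hypothetical and chosen for illustration only.

```python
import re

# Assumption: an acronym is defined as "(ABC)" right after its long form.
ACRONYM_IN_PARENS = re.compile(r"\(([A-Z]{2,})\)")


def find_definitions(text):
    """Return {acronym: long_form} for definitions whose word initials spell the acronym."""
    definitions = {}
    for match in ACRONYM_IN_PARENS.finditer(text):
        acronym = match.group(1)
        # Take as many preceding words as the acronym has letters.
        words = text[:match.start()].split()
        candidate = words[-len(acronym):]
        initials = "".join(word[0] for word in candidate).upper()
        if len(candidate) == len(acronym) and initials == acronym:
            definitions[acronym] = " ".join(candidate)
    return definitions


def simulate_global_acronyms(text):
    """Remove local definitions so acronyms appear undefined, and record their senses."""
    annotations = []
    for acronym, long_form in find_definitions(text).items():
        # (1) Simulate global usage: "long form (ABC)" becomes the bare "ABC".
        text = text.replace(f"{long_form} ({acronym})", acronym)
        # Replace any remaining mentions of the long form, ignoring case.
        text = re.sub(re.escape(long_form), acronym, text, flags=re.IGNORECASE)
        # (2) Annotate: the removed long form serves as the sense label.
        annotations.append({"acronym": acronym, "sense": long_form})
    return text, annotations


if __name__ == "__main__":
    abstract = (
        "Patients underwent magnetic resonance imaging (MRI). "
        "MRI findings were normal, and magnetic resonance imaging was repeated."
    )
    modified, senses = simulate_global_acronyms(abstract)
    print(modified)
    print(senses)
```

Because the sense label is taken from the abstract's own definition, the sketch needs no external dictionary, which mirrors the "without the need for external sources" property described above. A real system would need a more robust definition detector than this initials heuristic.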

Item Type: Article
Date Type: Publication
Status: Published
Schools: Computer Science & Informatics
Data Innovation Research Institute (DIURI)
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Q Science > QA Mathematics > QA76 Computer software
Additional Information: This is an Open Access article distributed under the terms of the Creative Commons Attribution License (
Publisher: Oxford University Press
ISSN: 1367-4803
Date of First Compliant Deposit: 9 May 2022
Date of Acceptance: 22 April 2022
Last Modified: 10 Nov 2022 11:09
