Filimonov, Maxim, Chopard, Daphne and Spasic, Irena ORCID: https://orcid.org/0000-0002-8132-3885 2022. Simulation and annotation of global acronyms. Bioinformatics 38 (11), pp. 3136-3138. 10.1093/bioinformatics/btac298
PDF (Published Version), available under License Creative Commons Attribution.
Abstract
Motivation: Global acronyms are used in written text without their formal definitions. This makes it difficult to automatically interpret their sense as acronyms tend to be ambiguous. Supervised machine learning approaches to sense disambiguation require large training datasets. In clinical applications, large datasets are difficult to obtain due to patient privacy. Manual data annotation creates an additional bottleneck.
Results: We proposed an approach to automatically modifying scientific abstracts to (1) simulate global acronym usage and (2) annotate their senses without the need for external sources or manual intervention. We implemented it as a web-based application, which can create large datasets that in turn can be used to train supervised approaches to word sense disambiguation of biomedical acronyms.
Availability: https://datainnovation.cardiff.ac.uk/acronyms/
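The two steps named in the abstract, simulating global acronym usage and annotating senses, can be illustrated with a minimal sketch. This is not the authors' implementation, which is not described on this page. It assumes that an abstract defines an acronym locally in the form "long form (SF)" and that each letter of the short form matches the initial of one preceding word. The function names `find_long_form` and `simulate_global_acronyms` are hypothetical.

```python
import re

# Assumed pattern for a local definition: an uppercase short form in
# parentheses, e.g. "magnetic resonance imaging (MRI)".
DEFINITION = re.compile(r"\(([A-Z]{2,10})\)")


def find_long_form(text_before, acronym):
    """Naive heuristic (an assumption, not the paper's method): take as many
    preceding words as the acronym has letters and check that the initials match."""
    words = text_before.split()
    candidate = words[-len(acronym):]
    if len(candidate) != len(acronym):
        return None
    if all(w[0].lower() == c.lower() for w, c in zip(candidate, acronym)):
        return " ".join(candidate).lower()
    return None


def simulate_global_acronyms(abstract):
    """Return (modified_text, senses), where senses maps acronym -> long form."""
    senses = {}
    for match in DEFINITION.finditer(abstract):
        acronym = match.group(1)
        long_form = find_long_form(abstract[:match.start()], acronym)
        if long_form:
            senses[acronym] = long_form

    modified = abstract
    for acronym, long_form in senses.items():
        escaped = re.escape(long_form)
        # Step 1: collapse the definition "long form (SF)" to the bare acronym.
        modified = re.sub(escaped + r"\s*\(" + re.escape(acronym) + r"\)",
                          acronym, modified, flags=re.IGNORECASE)
        # Step 2: replace any remaining long-form mentions with the acronym.
        modified = re.sub(r"\b" + escaped + r"\b", acronym, modified,
                          flags=re.IGNORECASE)
    return modified, senses


text = ("We used magnetic resonance imaging (MRI) in all patients. "
        "Magnetic resonance imaging was repeated after a year.")
modified, senses = simulate_global_acronyms(text)
print(modified)
print(senses)
```

The modified text contains only undefined acronyms, which mimics global usage, while the returned dictionary provides the sense annotation taken from the abstract itself. A realistic system would need a more robust definition matcher (for example, long forms containing stop words or hyphens), which this sketch deliberately omits.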
Item Type: | Article |
---|---|
Date Type: | Publication |
Status: | Published |
Schools: | Computer Science & Informatics; Data Innovation Research Institute (DIURI) |
Subjects: | Q Science > QA Mathematics > QA75 Electronic computers. Computer science; Q Science > QA Mathematics > QA76 Computer software |
Additional Information: | This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/) |
Publisher: | Oxford University Press |
ISSN: | 1367-4803 |
Date of First Compliant Deposit: | 9 May 2022 |
Date of Acceptance: | 22 April 2022 |
Last Modified: | 05 Jan 2024 08:03 |
URI: | https://orca.cardiff.ac.uk/id/eprint/149431 |