Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 

Text mining of adverse events in clinical trials: Deep learning approach

Chopard, Daphne, Treder, Matthias ORCID: https://orcid.org/0000-0001-5955-2326, Corcoran, Padraig ORCID: https://orcid.org/0000-0001-9731-3385, Johnson, Claire, Busse-Morris, Monica ORCID: https://orcid.org/0000-0002-5331-5909 and Spasic, Irena ORCID: https://orcid.org/0000-0002-8132-3885 2021. Text mining of adverse events in clinical trials: Deep learning approach. JMIR Medical Informatics 9(12), e28632. doi:10.2196/28632

PDF - Published Version
Available under License Creative Commons Attribution.

Abstract

Background: Pharmacovigilance and safety reporting, which involve processes for monitoring the use of medicines in clinical trials, play a critical role in identifying previously unrecognized adverse events or changes in the patterns of adverse events.

Objective: This study aimed to demonstrate the feasibility of automating the coding of adverse events described in the narrative section of serious adverse event report forms, to enable statistical analysis of these patterns.

Methods: We used the Unified Medical Language System (UMLS) as the coding scheme. The UMLS integrates 217 source vocabularies, which enables coding against other relevant terminologies such as ICD-10, MedDRA and SNOMED. We used MetaMap, a highly configurable dictionary lookup tool, to identify mentions of UMLS concepts. We then trained a binary classifier using Bidirectional Encoder Representations from Transformers (BERT), a transformer-based language model that captures contextual relationships, to differentiate between mentions of UMLS concepts that represent adverse events and those that do not.

Results: The model achieved a high F1 score of 0.8080 despite the class imbalance. This is 10.15 percentage points lower than human-like performance, but 17.45 percentage points higher than the baseline approach.

Conclusions: These results confirm that automated coding of adverse events described in the narrative section of serious adverse event reports is feasible. Once coded, adverse events can be analyzed statistically so that any correlations with the trialed medicines can be estimated in a timely fashion.

Keywords: natural language processing; deep learning; machine learning; classification
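The reported result is an F1 score rather than accuracy, which matters when classes are imbalanced. The following is a minimal Python sketch of that arithmetic, not the authors' evaluation code. The confusion counts are hypothetical and chosen only for illustration. The two "implied" scores are derived here from the percentage-point differences stated in the abstract, on the assumption that both differences are expressed in F1 terms; they are not figures quoted from the paper.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Return F1, the harmonic mean of precision and recall."""
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical imbalanced sample (an assumption, not data from the study):
# 100 adverse-event mentions among 1,000 candidate concept mentions.
tp, fp, fn, tn = 80, 20, 20, 880
accuracy = (tp + tn) / (tp + fp + fn + tn)
f1 = f1_score(tp, fp, fn)

# A classifier that never predicts "adverse event" on the same sample:
# it is right on all 900 negatives but finds none of the 100 positives.
majority_accuracy = 900 / 1000
majority_f1 = f1_score(tp=0, fp=0, fn=100)

# Scores implied by the abstract, assuming the differences are in F1 terms.
model_f1 = 0.8080
implied_human_f1 = round(model_f1 + 0.1015, 4)     # 10.15 points above the model
implied_baseline_f1 = round(model_f1 - 0.1745, 4)  # 17.45 points below the model

print(f"accuracy={accuracy:.2f}, f1={f1:.2f}")
print(f"majority accuracy={majority_accuracy:.2f}, majority f1={majority_f1:.2f}")
print(f"implied human F1={implied_human_f1}, implied baseline F1={implied_baseline_f1}")
```

In this toy sample the always-negative classifier still scores 0.90 accuracy while its F1 is 0, which is why F1 is the more informative measure for a rare positive class.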

Item Type: Article
Date Type: Publication
Status: Published
Schools: Computer Science & Informatics; Centre for Trials Research (CNTRR); Data Innovation Research Institute (DIURI)
Subjects: Q Science > QA Mathematics > QA76 Computer software
Publisher: JMIR Publications
ISSN: 2291-9694
Funders: EPSRC
Date of First Compliant Deposit: 18 January 2022
Date of Acceptance: 14 November 2021
Last Modified: 19 May 2023 01:18
URI: https://orca.cardiff.ac.uk/id/eprint/145494
