ORCA
Online Research @ Cardiff

Text mining of adverse events in clinical trials: Deep learning approach

Chopard, Daphne, Treder, Matthias, Corcoran, Padraig, Johnson, Claire, Busse-Morris, Monica and Spasic, Irena

2021. Text mining of adverse events in clinical trials: Deep learning approach. JMIR Medical Informatics 9 (12) , e28632. 10.2196/28632

PDF - Published Version
Available under License Creative Commons Attribution.

Official URL: http://doi.org/10.2196/28632

Abstract

Background: Pharmacovigilance and safety reporting, which involves processes for monitoring the use of medicines in clinical trials, plays a critical role in the identification of previously unrecognized adverse events or changes in the patterns of adverse events.

Objective: This study aimed to demonstrate the feasibility of automating the coding of adverse events described in the narrative section of serious adverse event report forms, to enable a statistical analysis of the aforementioned patterns.

Methods: We used the Unified Medical Language System (UMLS) as the coding scheme. It integrates 217 source vocabularies, thus enabling coding against other relevant terminologies such as ICD-10, MedDRA and SNOMED. We used MetaMap, a highly configurable dictionary lookup tool, to identify mentions of the UMLS concepts. We trained a binary classifier using Bidirectional Encoder Representations from Transformers (BERT), a transformer-based language model that captures contextual relationships, to differentiate between mentions of the UMLS concepts that represent adverse events and those that do not.

Results: The model achieved a high F1 score of 0.8080 despite the class imbalance. This is 10.15 percentage points lower than human-like performance, but also 17.45 percentage points higher than the baseline approach.

Conclusions: These results confirmed that automated coding of adverse events described in the narrative section of serious adverse event reports is feasible. Once coded, adverse events can be statistically analyzed so that any correlations with the trialed medicines can be estimated in a timely fashion.

Keywords: natural language processing; deep learning; machine learning; classification
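
As an illustration of why the abstract reports F1 rather than accuracy for an imbalanced binary task, the following minimal Python sketch computes both metrics from a confusion matrix. The counts are hypothetical and chosen only for illustration; they are not taken from the paper. The two reference scores at the end are not stated in the abstract either: they are simple arithmetic on the reported 0.8080 and the two reported gaps, assuming those gaps are differences in F1.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 for the positive class (here: 'this mention is an adverse event')."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts (NOT from the paper): positives are rare,
# so true negatives dominate the confusion matrix.
tp, fp, fn, tn = 80, 20, 18, 882

accuracy = (tp + tn) / (tp + fp + fn + tn)
f1 = f1_score(tp, fp, fn)

# Scores implied by the abstract, assuming the reported gaps are
# differences in F1 expressed in percentage points.
MODEL_F1 = 0.8080
implied_human_f1 = round(MODEL_F1 + 0.1015, 4)
implied_baseline_f1 = round(MODEL_F1 - 0.1745, 4)

print(f"accuracy={accuracy:.3f}  f1={f1:.4f}")
print(f"implied human-like F1={implied_human_f1}, implied baseline F1={implied_baseline_f1}")
```

With these toy counts, accuracy is inflated by the large number of true negatives, while F1 depends only on how the rare positive class is handled. This is why F1 is the more informative figure under class imbalance.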

Item Type:	Article
Date Type:	Publication
Status:	Published
Schools:	Schools > Computer Science & Informatics; Research Institutes & Centres > Centre for Trials Research (CNTRR); Research Institutes & Centres > Data Innovation Research Institute (DIURI)
Subjects:	Q Science > QA Mathematics > QA76 Computer software
Publisher:	JMIR Publications
ISSN:	2291-9694
Funders:	EPSRC
Date of First Compliant Deposit:	18 January 2022
Date of Acceptance:	14 November 2021
Last Modified:	19 May 2023 01:18
URI:	https://orca.cardiff.ac.uk/id/eprint/145494
