Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 

Interpreting patient case descriptions with biomedical language models

Alghanmi, Israa 2023. Interpreting patient case descriptions with biomedical language models. PhD Thesis, Cardiff University.
Item availability restricted.

PDF - Accepted Post-Print Version
Available under License Creative Commons Attribution No Derivatives.


Abstract

The advent of pre-trained language models (LMs) has enabled unprecedented advances in Natural Language Processing (NLP). Various specialised LMs have been introduced for the biomedical domain and, like their general-purpose counterparts, these models have achieved state-of-the-art results in many biomedical NLP tasks. Accordingly, it might be assumed that they can perform medical reasoning. However, given the challenging nature of the biomedical domain and the scarcity of labelled data, it is still not fully understood what type of knowledge these models encapsulate and how they can be enhanced further. This research addresses these questions, focusing on the task of interpreting patient case descriptions, which provides a means to investigate a model’s ability to perform medical reasoning. In general, this task is concerned with inferring a diagnosis or recommending a treatment from a text fragment describing a set of symptoms accompanied by other information. We started by probing pre-trained LMs, for which we constructed a benchmark derived from an existing dataset (MedNLI). Next, to improve the performance of LMs, we used a distant supervision strategy to identify cases that are similar to a given one, and showed that using such similar cases can lead to better results than other strategies for augmenting the input to the LM. As a final contribution, we studied the possibility of fine-tuning biomedical LMs on PubMed abstracts that correspond to case reports. In particular, we proposed a self-supervision task which mimics the downstream tasks of inferring diagnoses and recommending treatments. The findings in this thesis indicate that the performance of the considered biomedical LMs can be improved by using methods that go beyond relying on additional manually annotated datasets.

Item Type: Thesis (PhD)
Date Type: Completion
Status: Unpublished
Schools: Computer Science & Informatics
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Q Science > QA Mathematics > QA76 Computer software
Date of First Compliant Deposit: 13 June 2023
Date of Acceptance: 12 June 2023
Last Modified: 19 Jun 2023 11:39
URI: https://orca.cardiff.ac.uk/id/eprint/160355
