Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 

Development and benchmarking a novel scatter search algorithm for learning probabilistic graphical models in healthcare

Threlfall, John 2022. Development and benchmarking a novel scatter search algorithm for learning probabilistic graphical models in healthcare. PhD Thesis, Cardiff University.
Item availability restricted.

Full text: PhD_Thesis___John_Threlfall.pdf, PDF, Accepted Post-Print Version (1MB)
Supplemental Material: Cardiff University Electronic Publication Form, PDF (214kB), restricted to repository staff only

Abstract

Small healthcare datasets are widespread, and building accurate inference models from them is difficult. Many machine learning algorithms exist, but many are black boxes. Explainable models are essential in healthcare so that practitioners can understand the developed model and incorporate domain knowledge into it. Probabilistic graphical models offer a visual way to represent relationships between variables. Here we develop a new scatter search algorithm to learn Bayesian networks. This machine learning approach is applied to three case studies to assess its effectiveness in comparison with traditional machine learning techniques. First, a new scatter search approach is presented to construct the structure of a Bayesian network. Statistical tests are used to build small directed acyclic graphs, which are combined in an iterative process to build up multiple larger graphs. Probability distributions are fitted as the graphs are built up. These graphs are then scored based on classification performance. Once no new solutions can be found, the algorithm finishes. The first study examines the effectiveness of the scatter search constructed Bayesian network against other machine learning algorithms in the same class. These algorithms are benchmarked on standard datasets from the UCI Machine Learning Repository, on which many studies have been published. The second study assesses the effectiveness of the scatter search Bayesian network for classifying ovarian cancer patients. Several other machine learning algorithms were applied alongside the Bayesian network. All data for this study were collected by clinicians from the Aneurin Bevan University Health Board. The study concluded that machine learning techniques could be applied to classify patients based on early indicators. The third and final study applied machine learning techniques to breast cancer patients who fail to attend follow-up appointments (no-shows). Once again, the scatter search Bayesian network was used alongside other machine learning approaches. Socio-demographic and socio-economic factors involving low- to middle-income families were used in this study, together with feature selection techniques to improve machine learning performance. It was found that machine learning, when used with feature selection, could classify no-show patients with reasonable accuracy.
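
The structure-learning loop outlined in the abstract (statistical tests to seed small graphs, iterative combination, fitting distributions, scoring by classification performance, stopping when no new solution is found) can be illustrated with a minimal sketch. This is not the thesis implementation. The function names, the chi-square threshold of 3.84 (the 5% critical value for two binary variables), the reference set size, the Laplace smoothing and the use of training accuracy as the score are all assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import combinations

def chi2(data, i, j):
    """Pearson chi-square statistic for discrete columns i and j."""
    n = len(data)
    ci, cj = Counter(r[i] for r in data), Counter(r[j] for r in data)
    cij = Counter((r[i], r[j]) for r in data)
    return sum((cij[(a, b)] - ci[a] * cj[b] / n) ** 2 / (ci[a] * cj[b] / n)
               for a in ci for b in cj)

def is_acyclic(edges, n_vars):
    """Repeatedly strip nodes with no incoming edge; a cycle leaves none to strip."""
    edges, nodes = set(edges), set(range(n_vars))
    while nodes:
        free = [v for v in nodes if not any(c == v for _, c in edges)]
        if not free:
            return False
        nodes -= set(free)
        edges = {(p, c) for p, c in edges if p in nodes}
    return True

def accuracy(data, edges, target, n_vars):
    """Fit count-based conditional tables, then classify column `target`."""
    levels = [sorted({r[v] for r in data}) for v in range(n_vars)]
    parents = {v: sorted(p for p, c in edges if c == v) for v in range(n_vars)}
    counts = {v: Counter((tuple(r[p] for p in parents[v]), r[v]) for r in data)
              for v in range(n_vars)}

    def log_joint(row):
        total = 0.0
        for v, ps in parents.items():
            key = tuple(row[p] for p in ps)
            num = counts[v][(key, row[v])] + 1  # Laplace smoothing (assumption)
            den = sum(counts[v][(key, x)] for x in levels[v]) + len(levels[v])
            total += math.log(num / den)
        return total

    hits = 0
    for r in data:
        guess = max(levels[target],
                    key=lambda y: log_joint(r[:target] + (y,) + r[target + 1:]))
        hits += guess == r[target]
    return hits / len(data)  # training accuracy; held-out data would be better

def scatter_search(data, target, threshold=3.84, ref_size=5):
    """Return (best_edge_set, score) from a bounded reference set of DAGs."""
    n_vars = len(data[0])
    score = lambda g: accuracy(data, g, target, n_vars)
    # Seed solutions: the empty graph plus one single-edge graph per
    # orientation of every pair that passes the dependence test.
    seeds = [frozenset()] + [frozenset({e})
                             for i, j in combinations(range(n_vars), 2)
                             if chi2(data, i, j) > threshold
                             for e in ((i, j), (j, i))]
    ranked = sorted(((g, score(g)) for g in seeds), key=lambda kv: -kv[1])
    ref, seen = dict(ranked[:ref_size]), set(seeds)
    improved = True
    while improved:  # stop once a full pass adds no new solution
        improved = False
        for a, b in combinations(list(ref), 2):
            g = a | b  # combine two graphs by taking the union of their edges
            if g in seen or not is_acyclic(g, n_vars):
                continue
            seen.add(g)
            s, worst = score(g), min(ref, key=ref.get)
            if s > ref[worst]:
                del ref[worst]
                ref[g] = s
                improved = True
    best = max(ref, key=ref.get)
    return best, ref[best]
```

Rows are assumed to be tuples of discrete values, for example `scatter_search([(0, 0, 0), (0, 1, 0), (1, 0, 1), (1, 1, 1)] * 5, target=2)`, where the third column is the class to predict. Bounding the reference set keeps the number of candidate combinations small, at the cost of possibly discarding graphs that would only pay off after further combination.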

Item Type: Thesis (PhD)
Date Type: Completion
Status: Unpublished
Schools: Mathematics
Date of First Compliant Deposit: 18 July 2023
Last Modified: 18 Jul 2023 14:57
URI: https://orca.cardiff.ac.uk/id/eprint/161098
