Brady, Oisin Padraig, Fuentes Toro, Carolina ORCID: https://orcid.org/0000-0002-0871-939X, Johnson, Sean James, Giles, Peter ORCID: https://orcid.org/0000-0003-3143-6854, Alvares, Caroline ORCID: https://orcid.org/0000-0003-4391-9802 and Zabkiewicz, Joanna ORCID: https://orcid.org/0000-0003-0951-3825
2025.
Survival prediction in acute myeloid leukemia at distinct treatment time points: a performance comparison of random survival forest and Elastic-Net regularized Cox regression.
JMIR Bioinformatics and Biotechnology
10.2196/75678
Item availability restricted.
PDF - Accepted Post-Print Version. Restricted to Repository staff only (3MB).
PDF (Provisional file) - Accepted Post-Print Version (17kB).
Abstract
Background: Risk group stratification based on survival prediction for acute myeloid leukaemia (AML) patients is complex. Despite common risk group categorisation guidelines, overall prognosis remains poor. Machine learning (ML) techniques have been shown to provide more accurate risk group stratification than conventional approaches using trial data. However, many time-to-event models do not use training sets constrained to specific time windows, relying instead on aggregations of trial data. Objective: To evaluate the performance of (1) Random Survival Forest (RSF) and (2) Cox Proportional Hazards Regression (CPHR) with Elastic Net regularisation (CoxNet) for survival prediction of AML patients within a censoring window, with models trained on data recorded at discrete time points during the AML17 randomised controlled trial. Methods: For each stage of the AML17 trial, separate models were trained for each exhaustive k-choice combination of the available AML17 data subsets. The data combinations for each model were further constrained according to the respective trial stage to avoid data leakage. Pearson's correlation was first used to remove features directly correlated with the time-to-event target (time to death or the 5-year censoring point). Repeated stratified k-fold cross-validation was used on each dataset ablation to find candidate models. Permutation importance and Elastic Net regularisation were used to monitor stability across validation folds and to reduce the feature set of the highest-performing stage-specific RSF and CPHR models, respectively. Finally, the selected ablated models were re-evaluated using nested, stratified k-fold cross-validation with bootstrapping.
Results: The concordance index was used to rank the best models for data restricted up to the end of induction (RSF: 0.68, CoxNet: 0.67) and up to the end of stages 1 (RSF: 0.69, CoxNet: 0.68), 2 (RSF: 0.68, CoxNet: 0.66) and 3 (RSF: 0.69, CoxNet: 0.63) of the trial. Conclusion: This study details the high prediction accuracy of time-to-survival-event predictions when the training sets of CoxNet and RSF models are sequentially restricted to data measured up to the end of the respective AML17 trial stages. The performance of these sequential time-to-event models is intended to justify their use as part of a wider digital twin system simulating multiple time-to-event outcomes for AML patients.
Preprint: https://preprints.jmir.org/preprint/75678 (JMIR Preprints, unpublished, peer-reviewed preprint)
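For readers unfamiliar with the metric reported in the Results, the following is a minimal sketch of Harrell's concordance index for right-censored data. It is not the authors' implementation: the function name `concordance_index`, the toy follow-up times, event flags and risk scores are all hypothetical, and pairs with tied observed times are simply skipped for brevity.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable pairs ranked correctly by risk.

    times       -- observed follow-up time per patient
    events      -- 1 if death was observed, 0 if censored
    risk_scores -- model output, higher means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Simplification: skip tied times. A pair is comparable only
        # when the shorter time is an observed event, not a censoring.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # earlier death had the higher risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: months of follow-up, censored at 60 months.
times = [5, 10, 12, 20, 60]
events = [1, 1, 0, 1, 0]
risk = [0.9, 0.4, 0.5, 0.3, 0.1]

c_index = concordance_index(times, events, risk)
print(c_index)  # 0.875 (7 of 8 comparable pairs are concordant)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives a scale for reading the 0.63 to 0.69 values above.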
| Item Type: | Article |
|---|---|
| Status: | In Press |
| Schools: | Schools > Medicine; Schools > Computer Science & Informatics |
| Publisher: | JMIR Publications |
| ISSN: | 2563-3570 |
| Funders: | EPSRC |
| Date of First Compliant Deposit: | 15 January 2026 |
| Date of Acceptance: | 30 December 2025 |
| Last Modified: | 29 Jan 2026 11:46 |
| URI: | https://orca.cardiff.ac.uk/id/eprint/183705 |