Elhaddad, Heba, Chiriches, Claudia, Watts, Katie, Giles, Peter ORCID: https://orcid.org/0000-0003-3143-6854, Gilkes, Amanda and Ruthardt, Martin ORCID: https://orcid.org/0000-0003-1021-3811
2023. P546: Machine learning and response to induction chemotherapy in patients with acute myeloid leukemia [Abstract]. HemaSphere 7 (S3), pp. 948-949. 10.1097/01.hs9.0000969092.93079.e9
PDF (Published Version), available under License Creative Commons Attribution Non-commercial No Derivatives.
Abstract
Background: The toxic effects of induction chemotherapy for acute myeloid leukemia (AML) are justified only if the patient responds. To date, no validated scoring system accurately predicts patients’ responses to chemotherapy. Most prediction approaches are complicated because they draw on various cytogenetic and molecular abnormalities, which makes them both imprecise and time-consuming. Walter et al. used a scoring system with an area under the curve (AUC) of 0.64 to 0.69 to predict resistance to chemotherapy. Ng et al. incorporated mutational data with other covariates and obtained concordance statistics (C-statistics) of 0.65 to 0.8 in response prediction, but these were biased because death and relapse affected the C-statistics. Machine learning (ML) aims to make accurate predictions based on existing data. Random Forest (RF) is an ML algorithm widely used for cancer-related predictions such as progression and survival. In AML, RF has recently been applied to predict patients’ overall survival (OS), achieving an AUC of 0.59 to 0.64, better than the ELN risk stratification. We therefore focused on predicting the response to induction therapy from mutational data, as induction failure can worsen patients’ prognosis through treatment delay and exposure to unnecessary toxicity.

Aims: We applied ML to investigate whether targeted sequencing data contain the information needed to predict patients’ response to induction chemotherapy.

Methods: Sequencing data for 111 genes from a cohort of 1552 AML patients from the AML NCRI study group were retrospectively analyzed using RF. Following intensive induction chemotherapy, patients were stratified into responders, who achieved complete remission (CR), and resistant patients, who did not. The CR rate in our dataset was around 80%, matching rates reported in the literature. However, 47% of patients in the CR group relapsed during follow-up. Sequencing was performed on an Illumina platform. All analyses were done in R. The cohort was randomly split into training (70%) and test (30%) sets. The starting dataset was imbalanced (80% CR vs. 20% resistant), so the training set was balanced using SMOTE before RF analysis.

Results: As we aimed to distinguish responders from resistant patients, we first tested different relapse-time thresholds to decide whether relapsed patients should be assigned to the CR or the resistant group. RF gave the best prediction accuracy when all relapsed patients were included in the CR group. Patients who died within 28 days of starting therapy were excluded, as early death introduced bias into the RF analysis. The RF model trained on balanced data achieved 93% accuracy on the training set (AUC 0.9). On the test set (a subset of the data not used for training), it attained 79% accuracy, AUC 0.676, 87% sensitivity, 44% specificity, 86% positive predictive value (PPV), and 46% negative predictive value (NPV). The effect of data balancing is reflected in the test-set predictions: after balancing, accuracy increased from 73% to 79% and specificity from 19% to 44%, giving higher prediction accuracy for the minority (resistant) group.

Summary/Conclusion: Targeted sequencing data obtained at diagnosis contain information that allows RF to predict patients’ response to induction chemotherapy. RF is a practical, rapidly developing algorithm that can be incorporated into future technologies to obtain high response prediction accuracy in AML patients.
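The balancing step described in the Methods can be illustrated with a minimal sketch of the SMOTE idea: new minority-class rows are created by interpolating between an existing minority row and one of its nearest minority neighbours, and only the training set is oversampled. The abstract reports that the analysis was done in R; the Python below is an illustration only, not the authors' code. The function name `smote`, the parameters `k` and `seed`, and the toy feature rows are all assumptions made for this example, and the values are invented.

```python
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic rows interpolated between minority-class rows.

    Simplified sketch of SMOTE: pick a minority row, pick one of its k nearest
    minority neighbours, and place a new row at a random point on the segment
    joining them.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority rows to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority rows by squared Euclidean distance to base.
        others = [row for row in minority if row is not base]
        others.sort(key=lambda row: sum((a - b) ** 2 for a, b in zip(base, row)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Hypothetical toy TRAINING set (made-up values, not study data):
# 8 responders (label 1) and 2 resistant patients (label 0), mirroring the
# roughly 80/20 imbalance described in the abstract. Each row is a vector of
# mutation indicators for 4 imaginary genes.
train_X = [
    [0, 0, 0, 1], [0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0],
    [1, 1, 0, 0], [0, 1, 0, 1], [1, 0, 0, 1], [0, 0, 1, 0],
    [1, 1, 1, 0], [0, 1, 1, 1],
]
train_y = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]

# Oversample the minority (resistant) class until both classes are equal.
minority = [x for x, y in zip(train_X, train_y) if y == 0]
n_new = train_y.count(1) - train_y.count(0)
synthetic = smote(minority, n_new, k=5, seed=42)

balanced_X = train_X + synthetic
balanced_y = train_y + [0] * len(synthetic)

print("class counts before:", train_y.count(1), train_y.count(0))
print("class counts after: ", balanced_y.count(1), balanced_y.count(0))
```

The test set is deliberately left out of this step, so that reported test metrics reflect the original class proportions. Note that interpolation turns 0/1 mutation indicators into fractional values; how the study encoded its features is not stated in the abstract.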
| Item Type: | Short Communication |
|---|---|
| Date Type: | Publication |
| Status: | Published |
| Schools: | Schools > Medicine |
| Publisher: | Wiley |
| ISSN: | 2572-9241 |
| Date of First Compliant Deposit: | 27 February 2026 |
| Last Modified: | 27 Feb 2026 11:45 |
| URI: | https://orca.cardiff.ac.uk/id/eprint/185346 |