Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 

Qualitative assessment of oesophageal cancer metabolic tumour volumes delineated by an artificial intelligence algorithm

Parkinson, Craig ORCID: https://orcid.org/0000-0003-3454-4957, Foley, Kieran, Sobhee, Shailen, Riviera, Walter, Berenato, Salvatore, Stylianou, Costas, Crosby, Tom and Spezi, Emiliano ORCID: https://orcid.org/0000-0002-1452-8813 2020. Qualitative assessment of oesophageal cancer metabolic tumour volumes delineated by an artificial intelligence algorithm. Presented at: NCRI Virtual Showcase 2020, Virtual, 2-3 November 2020.

PDF - Accepted Post-Print Version (62kB)

Abstract

Year: 2020
Session type: E-poster/poster
Theme: Big data and AI

Background

Incidence of oesophageal cancer is rising. Radiotherapy is increasingly used to treat this poor-prognosis disease, but treatment planning requires significant resources, so automated methods would be preferred. Quantitative analysis of artificial intelligence (AI) algorithms is often reported, but qualitative evaluation is lacking. Using a Turing test, we investigated observers' ability to differentiate manual outlining of metabolic tumour volume (MTV) from outlining by a fully automated AI algorithm, including inter- and intra-observer variability.

Method

Five radiologists (Ob1 to Ob5) independently observed 580 contours: 256 were delineated using a U-Net deep learning (DL) model and 324 were delineated manually. For each contour, observers decided whether it had been created with a DL method or manually, or indicated that they were unable to tell. Of the 580 contours, 37 contours were repeated twice. Observers were blinded to the method and presented with a co-registered PET/CT with a contour overlay. CT imaging was windowed to a window width of 330 Hounsfield units (HU) and a window centre of -10 HU.

Results

Overall, Ob1 to Ob5 correctly identified 165 (28.4%), 199 (51.6%), 190 (32.8%), 181 (31.2%) and 193 (33.3%) out of 580 cases, respectively. Ob1 to Ob5 identified 202 (78.9%), 199 (77.7%), 159 (62.1%), 189 (73.8%) and 143 (55.9%) of the 256 DL contours as being manually delineated. In the repeated cases, Ob1 changed opinion in 9 cases, Ob2 in 10 cases, Ob3 in 10 cases, Ob4 in 7 cases and Ob5 in 8 cases. On average, observers changed opinion in 9 cases (21.6%), with a minimum of 7 cases (18.9%) and a maximum of 10 cases (27.0%). Observers on average identified 178.4 (69.6%) of the DL contours as being delineated manually (range: minimum 143 cases (55.9%), maximum 202 cases (78.9%)).

Conclusion

We have shown that Turing tests provide an additional method of qualitative evaluation that complements quantitative metrics when assessing AI algorithm performance in outlining metabolic tumour volumes. In our study, observers were unable to confidently determine the delineation method, suggesting strong performance of the AI algorithm. However, observer selection is subject to inter- and intra-observer variability and is potentially affected by clinical experience.
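
Two of the calculations implied by the abstract can be reproduced with a short Python sketch: the HU display range given by the stated CT window, and the share of DL contours each observer judged to be manual. This is an illustration only, not the authors' analysis code. The function and variable names are hypothetical, the counts are copied from the Results, and the window limits assume the common convention of centre plus or minus half the width.

```python
# Illustrative sketch only: names are hypothetical and not taken from the study.

def window_bounds(centre_hu, width_hu):
    """Return (lower, upper) HU display limits, assuming centre +/- width / 2."""
    half = width_hu / 2
    return centre_hu - half, centre_hu + half


def percent(count, total):
    """Express count as a percentage of total."""
    return 100 * count / total


# Counts reported in the abstract: DL contours each observer judged to be manual.
DL_TOTAL = 256
dl_judged_manual = {"Ob1": 202, "Ob2": 199, "Ob3": 159, "Ob4": 189, "Ob5": 143}

# CT window stated in the Method: width 330 HU, centre -10 HU.
lower, upper = window_bounds(centre_hu=-10, width_hu=330)
print(f"Display range: {lower:.0f} to {upper:.0f} HU")

for observer, count in dl_judged_manual.items():
    print(f"{observer}: {count}/{DL_TOTAL} = {percent(count, DL_TOTAL):.1f}%")

mean_count = sum(dl_judged_manual.values()) / len(dl_judged_manual)
print(f"Mean DL contours judged manual: {mean_count:.1f}")
```

Under that convention the stated window corresponds to a display range of -175 to 155 HU, and the mean count matches the 178.4 reported above.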

Item Type: Conference or Workshop Item (Poster)
Status: Published
Schools: Engineering
Funders: Welsh Government
Date of First Compliant Deposit: 27 January 2021
Date of Acceptance: 2 October 2020
Last Modified: 05 Nov 2022 04:16
URI: https://orca.cardiff.ac.uk/id/eprint/137987
