Chen, Ziman, Chambara, Nonhlanhla
Abstract
Background/Objectives: Recent advancements in large language models, such as ChatGPT-4o, have created new opportunities for analyzing complex multi-modal data, including medical images. This study aims to assess the potential of ChatGPT-4o in distinguishing between benign and malignant thyroid nodules via multi-modality ultrasound imaging: grayscale ultrasound, color Doppler ultrasound (CDUS), and shear wave elastography (SWE).

Materials and Methods: Patients who underwent thyroid nodule ultrasound examinations and had confirmed pathological diagnoses were included. ChatGPT-4o analyzed the multi-modality ultrasound data using two approaches: (1) a dual-modality strategy, which employed grayscale ultrasound and CDUS, and (2) a triple-modality strategy, which incorporated grayscale ultrasound, CDUS, and SWE. The diagnostic performance was compared against pathological findings utilizing receiver operating characteristic (ROC) curve analysis, while consistency was evaluated through Cohen’s Kappa analysis.

Results: A total of 106 thyroid nodules were evaluated; 65.1% were benign and 34.9% malignant. In the dual-modality approach, ChatGPT-4o achieved an area under the ROC curve (AUC) of 66.3%, moderate agreement with pathology results (Kappa = 0.298), a sensitivity of 70.3%, a specificity of 62.3%, and an accuracy of 65.1%. Conversely, the triple-modality approach exhibited higher specificity at 97.1% but lower sensitivity at 18.9%, with an accuracy of 69.8% and a reduced overall agreement (Kappa = 0.194), resulting in an AUC of 58.0%.

Conclusions: ChatGPT-4o exhibits potential, to some extent, in classifying thyroid nodules using multi-modality ultrasound imaging. However, the dual-modality approach unexpectedly outperforms the triple-modality approach. This indicates that ChatGPT-4o might encounter challenges in integrating and prioritizing different data modalities, particularly when conflicting information is present, which could impact diagnostic effectiveness.
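The reported metrics can be cross-checked with a short calculation. The sketch below is illustrative only: the confusion-matrix counts are not stated in the abstract and are inferred here from the reported percentages (106 nodules, roughly 37 malignant and 69 benign), and the function name `binary_metrics` is hypothetical. It also assumes each ChatGPT-4o output was a single benign/malignant call, in which case the AUC reduces to the mean of sensitivity and specificity.

```python
def binary_metrics(tp, fn, tn, fp):
    """Return diagnostic metrics for a 2x2 table (malignant = positive)."""
    n = tp + fn + tn + fp
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / n
    # For a single binary decision, the ROC curve has one operating point,
    # so the AUC equals the mean of sensitivity and specificity.
    auc = (sensitivity + specificity) / 2
    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n**2
    kappa = (accuracy - p_expected) / (1 - p_expected)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "auc": auc,
        "kappa": kappa,
    }


# Counts inferred from the reported percentages (an assumption, not source data).
dual = binary_metrics(tp=26, fn=11, tn=43, fp=26)
triple = binary_metrics(tp=7, fn=30, tn=67, fp=2)

for name, metrics in (("dual", dual), ("triple", triple)):
    print(name, {key: round(value, 3) for key, value in metrics.items()})
```

Under these inferred counts, the computed values agree with the figures reported in the abstract to three decimal places, which suggests the AUC values were derived from a single operating point per approach rather than from a graded malignancy score.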
Item Type: | Article |
---|---|
Date Type: | Published Online |
Status: | Published |
Schools: | Schools > Healthcare Sciences |
Additional Information: | License information from Publisher: LICENSE 1: URL: https://creativecommons.org/licenses/by/4.0/, Start Date: 2025-06-20 |
Publisher: | MDPI |
Date of First Compliant Deposit: | 8 July 2025 |
Date of Acceptance: | 18 June 2025 |
Last Modified: | 08 Jul 2025 09:00 |
URI: | https://orca.cardiff.ac.uk/id/eprint/179620 |