Arnold, Philipp Georg, Russe, Maximilian Frederik, Bamberg, Fabian, Emrich, Tilman, Vecsey-Nagy, Milán, Ashi, Ayaat, Kravchenko, Dmitrij, Varga-Szemes, Ákos, Soschynski, Martin, Rau, Alexander, Kotter, Elmar and Hagar, Muhammad Taha
2025.
Performance of large language models for CAD-RADS 2.0 classification derived from cardiac CT reports.
Journal of Cardiovascular Computed Tomography
10.1016/j.jcct.2025.03.007
PDF: Accepted Post-Print Version. Available under License Creative Commons Attribution.
Abstract
Background: The Coronary Artery Disease-Reporting and Data System (CAD-RADS) 2.0 offers standardized guidelines for interpreting coronary artery disease in cardiac CT. Accurate and consistent CAD-RADS 2.0 scoring is crucial for comprehensive disease characterization and clinical decision-making. This study investigates the capability of large language models (LLMs) to autonomously generate CAD-RADS 2.0 scores from cardiac CT reports.

Methods: A dataset of cardiac CT reports was created to evaluate the performance of several state-of-the-art LLMs in generating CAD-RADS 2.0 scores via in-context learning. The tested models comprised GPT-3.5, GPT-4o, Mistral 7b, Mixtral 8 × 7b, Llama3 8b, Llama3 8b with a 64k context length, and Llama3 70b. The scores generated by each model were compared to the ground truth, which was established in consensus by two board-certified cardiothoracic radiologists based on the reports.

Results: The final set comprised 200 cardiac CT reports. GPT-4o and Llama3 70b achieved the highest accuracy in generating full CAD-RADS 2.0 scores including all modifiers, at 93 % and 92.5 %, respectively, followed by Mixtral 8 × 7b at 78 %. In contrast, older LLMs such as Mistral 7b and GPT-3.5 performed poorly (16 %), and Llama3 8b demonstrated intermediate results with an accuracy of 41.5 %.

Conclusion: LLMs enhanced with in-context learning are capable of autonomously generating CAD-RADS 2.0 scores from cardiac CT reports with excellent accuracy, potentially enhancing both the efficiency and consistency of cardiac CT reporting. Open-source models not only deliver competitive accuracy but also offer the benefit of local hosting, mitigating concerns around data security.
Item Type: | Article |
---|---|
Date Type: | Published Online |
Status: | In Press |
Schools: | Schools > Medicine |
Additional Information: | License information from Publisher: LICENSE 1: URL: http://creativecommons.org/licenses/by/4.0/, Start Date: 2025-03-29 |
Publisher: | Elsevier |
ISSN: | 1934-5925 |
Date of First Compliant Deposit: | 16 April 2025 |
Date of Acceptance: | 28 March 2025 |
Last Modified: | 16 Apr 2025 09:30 |
URI: | https://orca.cardiff.ac.uk/id/eprint/177730 |