Garcia-Font, Marc, Dufey-Portilla, Nicolás, Durán-Sindreu, Fernando, González Sánchez, José Antonio, Millán, Gustavo Rodríguez, Nagendrababu, Venkateshbabu, Dummer, Paul M. H. ORCID: https://orcid.org/0000-0002-0726-7467 and Sans, Francesc Abella 2025. Evaluating retrieval-augmented large language models on external cervical resorption: a comparative study of Gemini and NotebookLM. Journal of Endodontics, S0099-2399(25)00665-X. 10.1016/j.joen.2025.10.016
Abstract
This study evaluated the accuracy and consistency of two large language models (LLMs) developed by Alphabet Inc., Google Gemini (GG), a base configuration, and NotebookLM (NLM), a document-grounded configuration, when answering clinical questions regarding external cervical resorption (ECR) using a retrieval-augmented framework. Forty-six dichotomous clinical questions related to ECR were developed by three academic endodontists based on established sources. Each question was submitted to GG and NLM using three independent user accounts, yielding 276 responses. The retrieval-augmented generation configuration was replicated by NLM, which was programmed to generate responses exclusively from the documents provided. Three endodontic experts independently evaluated all responses against predefined gold standard answers. Accuracy was defined as agreement with the gold standard; consistency referred to identical responses across the three trials. Statistical analyses included 95% confidence intervals (Wald and Wilson), Fleiss' kappa, and Fisher's exact test. GG achieved an accuracy of 89% (41/46; 95% CI, 76.96-95.27) and a consistency rate of 93% (κ = 0.89; p < 0.001). NLM achieved an accuracy of 96% (44/46; 95% CI, 85.47-98.79) and the same consistency (κ = 0.90; p < 0.001). No significant differences occurred between the LLMs for accuracy and consistency. The NLM and GG models exhibited a high level of accuracy and consistency. Although NLM had a slightly superior performance, retrieval augmentation did not significantly enhance the responses to structured clinical tasks. [Abstract copyright: Copyright © 2025 American Association of Endodontists. Published by Elsevier Inc. All rights reserved.]
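The interval and significance figures in the abstract can be sanity-checked with a short sketch. It is not the authors' analysis code: the function names are hypothetical, and two things are assumed here, namely that the reported accuracy intervals are Wilson score intervals and that the accuracy comparison is a two-sided Fisher's exact test on the 2×2 table of correct/incorrect counts (41/5 for GG, 44/2 for NLM). Only the Python standard library is used.

```python
from math import comb, sqrt

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal


def wilson_interval(successes, n, z=Z_95):
    """Wilson score interval for a binomial proportion (returns fractions)."""
    p_hat = successes / n
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half = z * sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


if __name__ == "__main__":
    for label, correct in (("GG", 41), ("NLM", 44)):
        low, high = wilson_interval(correct, 46)
        print(f"{label}: {correct}/46, 95% CI {100 * low:.2f}-{100 * high:.2f}")
    # Rows: GG, NLM; columns: correct, incorrect (counts from the abstract).
    print(f"Fisher p = {fisher_exact_two_sided(41, 5, 44, 2):.3f}")
```

Under these assumptions the intervals should land close to those reported, and the Fisher p-value should fall well above 0.05, in line with the statement that accuracy did not differ significantly between the two models.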
| Item Type: | Article |
|---|---|
| Date Type: | Published Online |
| Status: | In Press |
| Schools: | Schools > Dentistry |
| Publisher: | Elsevier |
| ISSN: | 0099-2399 |
| Date of Acceptance: | 31 October 2025 |
| Last Modified: | 27 Nov 2025 12:15 |
| URI: | https://orca.cardiff.ac.uk/id/eprint/182718 |