Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 

Investigating marker accuracy in differentiating between university scripts written by students and those produced using ChatGPT

Hassoulas, Athanasios ORCID: https://orcid.org/0000-0002-1029-1847, Powell, Ned, Roberts, Lindsay ORCID: https://orcid.org/0000-0001-7236-1276, Umla-Runge, Katja ORCID: https://orcid.org/0000-0002-9615-8907, Gray, Laurence and Coffey, Marcus 2023. Investigating marker accuracy in differentiating between university scripts written by students and those produced using ChatGPT. Journal of Applied Learning & Teaching 6 (2) 10.37074/jalt.2023.6.2.13

PDF - Published Version
Available under License Creative Commons Attribution.


Abstract

The introduction of OpenAI’s ChatGPT has widely been considered a turning point for assessment in higher education. Whilst we find ourselves on the precipice of a profoundly disruptive technology, generative artificial intelligence (AI) is here to stay. At present, institutions around the world are considering how best to respond to such new and emerging tools, ranging from outright bans to re-evaluating assessment strategies. In evaluating the extent of the problem that these tools pose to the marking of assessments, a study was designed to investigate marker accuracy in differentiating between scripts prepared by students and those produced using generative AI. A survey containing undergraduate reflective writing scripts and postgraduate extended essays was administered to markers at a medical school in Wales, UK. The markers were asked to assess the scripts on writing style and content, and to indicate whether they believed the scripts to have been produced by students or ChatGPT. Of the 34 markers recruited, only 23% and 19% were able to correctly identify the ChatGPT undergraduate and postgraduate scripts, respectively. A significant effect of suspected script authorship was found for script content, χ²(4, n = 34) = 10.41, p < 0.05, suggesting that written content holds clues as to how markers assign authorship. We recommend that consideration be given to how generative AI can be responsibly integrated into assessment strategies and to expanding our definition of what constitutes academic misconduct in light of this new technology.

Item Type: Article
Date Type: Published Online
Status: Published
Schools: Medicine
Subjects: L Education > L Education (General)
T Technology > T Technology (General)
Publisher: Journal of Applied Learning & Teaching
ISSN: 2591-801X
Date of First Compliant Deposit: 25 July 2023
Date of Acceptance: 22 July 2023
Last Modified: 10 Jun 2024 09:44
URI: https://orca.cardiff.ac.uk/id/eprint/161240
