Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 

Crowdsourcing formulaic phrases: towards a new type of spoken corpus

Adolphs, Svenja, Knight, Dawn, Smith, Catherine and Price, Dominic 2020. Crowdsourcing formulaic phrases: towards a new type of spoken corpus. Corpora 15 (2), pp. 141-168. 10.3366/COR.2020.0192

PDF - Accepted Post-Print Version

Spoken corpora have traditionally been assembled through careful recording and transcription of discourse events, a process which is both labour intensive and often restrictive in terms of breadth of recording contexts available. To overcome these potential challenges in spoken corpus compilation, we explore the use of crowdsourcing of language samples that are reported by participants. We investigate the level of precision and recall of the ‘crowd’ when it comes to reporting language they have heard in certain contexts, alongside the use of a crowdsourcing toolkit to facilitate this task. As a focussing device for the selection of reported language samples, we draw on the use of formulaic phrases as an area that has received considerable attention by corpus linguists and applied linguists over the years. We argue that while studying reported language usage instead of actual language-in-use is problematic for several reasons, many of which have been highlighted in the literature on Discourse Completion Tasks (Schauer and Adolphs, 2006), our suggested approach presents several advantages and opportunities for spoken corpus linguistics.

Item Type: Article
Date Type: Publication
Status: Published
Schools: English, Communication and Philosophy
Publisher: Edinburgh University Press
ISSN: 1749-5032
Date of First Compliant Deposit: 3 January 2019
Date of Acceptance: 14 December 2018
Last Modified: 26 Nov 2020 05:20
