Cabiddu, Francesco, Bott, Lewis
PDF - Published Version
Available under License Creative Commons Attribution.
Abstract
Word segmentation is a crucial step in children's vocabulary learning. While computational models of word segmentation can capture infants’ performance in small-scale artificial tasks, the examination of early word segmentation in naturalistic settings has been limited by the lack of measures that can relate models’ performance to developmental data. Here, we extended CLASSIC (Chunking Lexical and Sublexical Sequences in Children; Jones et al., 2021), a corpus-trained chunking model that can simulate several memory, phonological, and vocabulary learning phenomena, to allow it to perform word segmentation using utterance boundary information. We named this extended version CLASSIC utterance boundary (CLASSIC-UB). Further, we compared the model's performance with that of children on a wide range of new measures, capitalizing on the link between word segmentation and vocabulary learning abilities. We showed that the combination of chunking and utterance-boundary information used by CLASSIC-UB allowed a better prediction of English-learning children's output vocabulary than did other models.
Item Type: | Article |
---|---|
Date Type: | Publication |
Status: | Published |
Schools: | Psychology |
Publisher: | Wiley |
ISSN: | 0023-8333 |
Date of First Compliant Deposit: | 13 December 2022 |
Date of Acceptance: | 9 December 2022 |
Last Modified: | 11 Jan 2024 15:55 |
URI: | https://orca.cardiff.ac.uk/id/eprint/154927 |