Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 

Accessible automation: evaluating object segmentation solutions for parent-child interaction research

Thompson, Craig D. J., Ferreira De Oliveira, Catia, Zhang, Ziye, Lai, Yu-Kun ORCID: https://orcid.org/0000-0002-2094-5680 and D'Souza, Hana 2025. Accessible automation: evaluating object segmentation solutions for parent-child interaction research. Presented at: International Conference on Development and Learning 2025, Prague, Czech Republic, 16-19 September 2025. Proceedings of the IEEE International Conference on Development and Learning 2025. IEEE.

PDF - Accepted Post-Print Version (AccessibleAutomation_AuthorsVersion.pdf, 619 kB)
Available under License Creative Commons Attribution.

Abstract

Lab-based parent-child interaction (PCI) studies enable researchers to observe real-time behaviours in a controlled setting. With the rise of head-mounted eye-tracking and cameras, these studies now capture even richer data. However, extracting meaningful variables often requires time-consuming manual annotation. One key variable of interest to developmental scientists, due to its links to attention and learning, is the size of objects in the child’s view. Manually extracting object sizes from a single 5-minute recording (9,000 frames at 30 FPS) can take up to 225 hours.

Advances in computer vision now offer automated solutions. In this study, we evaluated six automated object segmentation solutions for their ability to extract object size from PCI videos featuring distinctly coloured objects:

- Colour-based extraction
- Segment and Track Anything (SAM-Track)
- Segment Anything Model 2 (SAM2)
- DeepLabv3
- PyTorch U-Net
- You Only Look Once (YOLOv11)

Some solutions require minimal setup (colour-based extraction, SAM-Track, and SAM2), while others require custom training on manually annotated frames (DeepLabv3, PyTorch U-Net, and YOLOv11). Two of the out-of-the-box models (SAM-Track and SAM2) and two of the custom-trained models (PyTorch U-Net and YOLOv11) demonstrated very high object segmentation accuracy:

- Median Dice scores: .92–.96
- Median IoU scores: .85–.92

These tools offer a scalable and accessible way to automate object segmentation, reducing annotation time from months to hours. This enables broader application of this approach in developmental science.
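For readers unfamiliar with the first solution listed in the abstract, colour-based extraction amounts to thresholding each frame in a colour space where distinctly coloured objects separate cleanly, then taking the resulting mask's area as the object size. The sketch below is illustrative only and is not the authors' implementation: it assumes OpenCV and NumPy, and the colour_mask name and HSV bounds are placeholders that would need tuning per object.

    # Minimal sketch of colour-based object extraction (illustrative, not the paper's code).
    import cv2
    import numpy as np

    def colour_mask(frame_bgr, hsv_low, hsv_high):
        """Return a boolean mask of pixels falling inside an HSV colour range."""
        hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
        mask = cv2.inRange(hsv, np.array(hsv_low), np.array(hsv_high))
        # Morphological opening removes isolated speckle noise.
        kernel = np.ones((5, 5), np.uint8)
        mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
        return mask > 0

    # Usage on one frame (the HSV bounds are placeholder values for a red object):
    # frame = cv2.imread("frame_0001.png")
    # mask = colour_mask(frame, (0, 120, 70), (10, 255, 255))
    # size = mask.mean()  # object size as a proportion of the frame's pixels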
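The Dice and IoU figures reported above compare each automated mask against a manually annotated ground-truth mask. Here is a minimal sketch, assuming both masks are boolean NumPy arrays of identical shape; the dice_and_iou name is illustrative, not the authors' evaluation code.

    import numpy as np

    def dice_and_iou(pred, truth):
        """Dice coefficient and IoU for two binary masks of the same shape."""
        pred = pred.astype(bool)
        truth = truth.astype(bool)
        intersection = np.logical_and(pred, truth).sum()
        union = np.logical_or(pred, truth).sum()
        if union == 0:  # both masks empty: treat as perfect agreement
            return 1.0, 1.0
        dice = 2.0 * intersection / (pred.sum() + truth.sum())
        iou = intersection / union
        return dice, iou

Because Dice normalises the overlap by the two mask sizes rather than their union, it is always at least as large as IoU for the same pair of masks, which is consistent with the reported ranges (.92–.96 Dice versus .85–.92 IoU).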

Item Type: Conference or Workshop Item (Paper)
Status: In Press
Schools: Computer Science & Informatics; Psychology
Publisher: IEEE
Date of First Compliant Deposit: 10 July 2025
Date of Acceptance: 1 June 2025
Last Modified: 21 Jul 2025 11:12
URI: https://orca.cardiff.ac.uk/id/eprint/179678
