Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 

PeruSIL: A framework to build a continuous Peruvian Sign Language interpretation dataset

Bejarano, Gissella, Huamani-Malca, Joe, Cerna-Herrera, Francisco, Alva Manchego, Fernando and Rivas, Pablo 2022. PeruSIL: A framework to build a continuous Peruvian Sign Language interpretation dataset. Presented at: LREC2022: 10th Workshop on the Representation and Processing of Sign Languages: Multilingual Sign Language Resources, Marseille, France, 20-25 June 2022. Published in: Efthimiou, Eleni, Fotinea, Stavroula-Evita, Hanke, Thomas, Hochgesang, Julie A., Kristoffersen, Jette, Mesch, Johanna and Schulder, Marc eds. Proceedings of the LREC2022 10th Workshop on the Representation and Processing of Sign Languages: Multilingual Sign Language Resources. European Language Resources Association, pp. 1-8.

Full text not available from this repository.

Abstract

Video-based datasets for continuous sign language are scarce because recording videos of native signers is challenging and few people can annotate sign language. COVID-19 has highlighted the key role of sign language interpreters in delivering nationwide health messages to deaf communities. In this paper, we present a framework for creating a multi-modal sign language interpretation dataset based on videos, and we use it to create the first dataset for Peruvian Sign Language (LSP) interpretation, annotated by hearing volunteers with intermediate knowledge of LSP who were guided by the video audio. We rely on hearing people to produce a first version of the annotations, which should be reviewed by native signers in the future. Our contributions are: i) we design a framework to annotate a sign language dataset; ii) we release the first annotated LSP multi-modal interpretation dataset (AEC); iii) we evaluate the annotation done by hearing people by training a sign language recognition model. Our model reaches up to 80.3% accuracy on the AEC dataset with a minimum of five classes (signs), and 52.4% on a second dataset. Nevertheless, analysis by subject in the second dataset shows variations worth discussing.

Item Type: Conference or Workshop Item (Paper)
Status: Published
Schools: Computer Science & Informatics
Publisher: European Language Resources Association
Last Modified: 03 Oct 2023 14:45
URI: https://orca.cardiff.ac.uk/id/eprint/161901
