ORCA
Online Research @ Cardiff


Spanning the spectrum of hatred detection: a Persian multi-label hate speech dataset with annotator rationales

Delbari, Zahra, Moosavi, Nafise Sadat and Pilehvar, Mohammad Taher 2024. Spanning the spectrum of hatred detection: a Persian multi-label hate speech dataset with annotator rationales. Presented at: Thirty-Eighth AAAI Conference on Artificial Intelligence, Vancouver, Canada, 20-27 February 2024. Published in: Wooldridge, M., Dy, J. and Natarajan, S. eds. Proceedings of the AAAI Conference on Artificial Intelligence, vol. 38 (16). Washington, DC, USA: Association for the Advancement of Artificial Intelligence, pp. 17889-17897. 10.1609/aaai.v38i16.29743

PDF - Accepted Post-Print Version

Official URL: http://dx.doi.org/10.1609/aaai.v38i16.29743

Abstract

With the alarming rise of hate speech in online communities, the demand for effective NLP models to identify instances of offensive language has reached a critical point. However, the development of such models heavily relies on the availability of annotated datasets, which are scarce, particularly for less-studied languages. To bridge this gap for the Persian language, we present a novel dataset specifically tailored to multi-label hate speech detection. Our dataset, called Phate, consists of an extensive collection of over seven thousand manually annotated Persian tweets, offering a rich resource for training and evaluating hate speech detection models in this language. Notably, each annotation in our dataset specifies the targeted group of hate speech and includes a span of the tweet which elucidates the rationale behind the assigned label. The incorporation of this information expands the potential applications of our dataset, facilitating the detection of targeted online harm or allowing the benchmark to serve research on interpretability of hate speech detection models. The dataset, annotation guideline, and all associated code are accessible at https://github.com/Zahra-D/Phate.

Item Type:	Conference or Workshop Item (Paper)
Date Type:	Publication
Status:	Published
Schools:	Schools > Computer Science & Informatics
Publisher:	Association for the Advancement of Artificial Intelligence
ISBN:	9781577358879
ISSN:	2374-3468
Date of First Compliant Deposit:	11 September 2024
Date of Acceptance:	9 December 2023
Last Modified:	07 Nov 2024 17:00
URI:	https://orca.cardiff.ac.uk/id/eprint/168944
