Delbari, Zahra, Moosavi, Nafise Sadat and Pilehvar, Mohammad Taher
2024.
Spanning the spectrum of hatred detection: a Persian multi-label hate speech dataset with annotator rationales.
Presented at: Thirty-Eighth AAAI Conference on Artificial Intelligence,
Vancouver, Canada,
20-27 February 2024.
Published in: Wooldridge, M., Dy, J. and Natarajan, S. eds.
Proceedings of the AAAI Conference on Artificial Intelligence, vol. 38 (16).
Washington, DC, USA:
Association for the Advancement of Artificial Intelligence,
pp. 17889-17897.
10.1609/aaai.v38i16.29743
Item availability restricted.
PDF (Accepted Post-Print Version): restricted to repository staff only until 11 October 2024 due to copyright restrictions.
Abstract
With the alarming rise of hate speech in online communities, the demand for effective NLP models to identify instances of offensive language has reached a critical point. However, the development of such models heavily relies on the availability of annotated datasets, which are scarce, particularly for less-studied languages. To bridge this gap for the Persian language, we present a novel dataset specifically tailored to multi-label hate speech detection. Our dataset, called Phate, consists of an extensive collection of over seven thousand manually annotated Persian tweets, offering a rich resource for training and evaluating hate speech detection models for this language. Notably, each annotation in our dataset specifies the targeted group of hate speech and includes a span of the tweet which elucidates the rationale behind the assigned label. The incorporation of this information expands the potential applications of our dataset, facilitating the detection of targeted online harm or allowing the benchmark to serve research on the interpretability of hate speech detection models. The dataset, annotation guideline, and all associated code are accessible at https://github.com/Zahra-D/Phate.
Item Type: | Conference or Workshop Item (Paper) |
---|---|
Date Type: | Publication |
Status: | Published |
Schools: | Computer Science & Informatics |
Publisher: | Association for the Advancement of Artificial Intelligence |
ISBN: | 9781577358879 |
ISSN: | 2374-3468 |
Date of First Compliant Deposit: | 11 September 2024 |
Date of Acceptance: | 9 December 2023 |
Last Modified: | 12 Sep 2024 03:52 |
URI: | https://orca.cardiff.ac.uk/id/eprint/168944 |