Burnap, Pete ORCID: https://orcid.org/0000-0003-0396-633X, Colombo, Gualtiero, Amery, Rosie, Hodorog, Andrei ORCID: https://orcid.org/0000-0002-4701-5643 and Scourfield, Jonathan ORCID: https://orcid.org/0000-0001-6218-8158 2017. Multi-class machine classification of suicide-related communication on Twitter. Online Social Networks and Media 2, pp. 32-44. 10.1016/j.osnem.2017.08.001
PDF
- Published Version
Available under License Creative Commons Attribution.
Abstract
The World Wide Web, and online social networks in particular, have increased connectivity between people such that information can spread to millions of people in a matter of minutes. This form of online collective contagion has provided many benefits to society, such as providing reassurance and emergency management in the immediate aftermath of natural disasters. However, it also poses a potential risk to vulnerable Web users who receive this information and could subsequently come to harm. One example of this would be the spread of suicidal ideation in online social networks, about which concerns have been raised. In this paper we report the results of a number of machine classifiers built with the aim of classifying text relating to suicide on Twitter. The classifier distinguishes between the more worrying content, such as suicidal ideation, and other suicide-related topics such as reporting of a suicide, memorial, campaigning and support. It also aims to identify flippant references to suicide. We built a set of baseline classifiers using lexical, structural, emotive and psychological features extracted from Twitter posts. We then improved on the baseline classifiers by building an ensemble classifier using the Rotation Forest algorithm and a Maximum Probability voting classification decision method, based on the outcome of base classifiers. This achieved an F-measure of 0.728 overall (for 7 classes, including suicidal ideation) and 0.69 for the suicidal ideation class. We summarise the results by reflecting on the most significant predictive principal components of the suicidal ideation class to provide insight into the language used on Twitter to express suicidal ideation. Finally, we perform a 12-month case study of suicide-related posts where we further evaluate the classification approach, showing a sustained classification performance and providing anonymous insights into the trends and demographic profile of Twitter users posting content of this type.
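As an illustration only, the Maximum Probability voting step mentioned in the abstract can be sketched as follows. This is a minimal sketch under one common reading of that rule (for each class, take the highest probability any base classifier assigns, renormalise, and predict the class with the largest combined value); it is not the authors' implementation, and it does not cover Rotation Forest or feature extraction. The function name `max_probability_vote` and the three-class example distributions are hypothetical.

```python
from typing import List, Sequence, Tuple


def max_probability_vote(
    distributions: Sequence[Sequence[float]],
) -> Tuple[int, List[float]]:
    """Combine per-classifier class distributions with a maximum-probability rule.

    `distributions` holds one probability vector per base classifier, all
    over the same ordered set of classes (an assumption of this sketch).
    Returns the index of the winning class and the renormalised combined
    distribution.
    """
    if not distributions:
        raise ValueError("at least one base classifier output is required")
    n_classes = len(distributions[0])
    if any(len(d) != n_classes for d in distributions):
        raise ValueError("all distributions must cover the same classes")

    # For each class, keep the highest probability any base classifier gave it.
    combined = [max(d[c] for d in distributions) for c in range(n_classes)]

    total = sum(combined)
    if total <= 0:
        raise ValueError("combined probabilities must sum to a positive value")

    # Renormalise so the combined scores form a distribution again.
    normalised = [p / total for p in combined]

    # Predict the class with the largest combined probability.
    winner = max(range(n_classes), key=lambda c: normalised[c])
    return winner, normalised


# Hypothetical outputs of three base classifiers over three classes.
base_outputs = [
    [0.6, 0.3, 0.1],
    [0.2, 0.7, 0.1],
    [0.5, 0.4, 0.1],
]
predicted_class, combined_distribution = max_probability_vote(base_outputs)
```

Under this rule a single confident base classifier can decide the outcome, which differs from majority voting, where each classifier contributes one equally weighted vote.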
Item Type: | Article |
---|---|
Date Type: | Publication |
Status: | Published |
Schools: | Computer Science & Informatics Social Sciences (Includes Criminology and Education) Data Innovation Research Institute (DIURI) |
Subjects: | H Social Sciences > H Social Sciences (General) Q Science > QA Mathematics > QA75 Electronic computers. Computer science |
Publisher: | Elsevier |
ISSN: | 2468-6964 |
Funders: | Department of Health |
Date of First Compliant Deposit: | 5 September 2017 |
Date of Acceptance: | 8 August 2017 |
Last Modified: | 19 Feb 2024 06:26 |
URI: | https://orca.cardiff.ac.uk/id/eprint/103420 |
Citation Data
Cited 102 times in Scopus.