Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 

Machine classification and analysis of suicide-related communication on Twitter

Burnap, Peter, Colombo, Gualtiero and Scourfield, Jonathan Bryn 2015. Machine classification and analysis of suicide-related communication on Twitter. Presented at: 26th ACM Conference on Hypertext & Social Media, Cyprus, 1-4 September 2015. HT '15: Proceedings of the 26th ACM Conference on Hypertext & Social Media. Association for Computing Machinery, pp. 75-84.


The World Wide Web, and online social networks in particular, have increased connectivity between people such that information can spread to millions of people in a matter of minutes. This form of online collective contagion has provided many benefits to society, such as providing reassurance and emergency management in the immediate aftermath of natural disasters. However, it also poses a potential risk to vulnerable Web users who receive this information and could subsequently come to harm. One example of this would be the spread of suicidal ideation in online social networks, about which concerns have been raised. In this paper we report the results of a number of machine classifiers built with the aim of classifying text relating to suicide on Twitter. The classifiers distinguish between the more worrying content, such as suicidal ideation, and other suicide-related topics such as reporting of a suicide, memorial, campaigning and support. They also aim to identify flippant references to suicide. We built a set of baseline classifiers using lexical, structural, emotive and psychological features extracted from Twitter posts. We then improved on the baseline classifiers by building an ensemble classifier using the Rotation Forest algorithm and a Maximum Probability voting classification decision method, based on the outcome of base classifiers. This achieved an F-measure of 0.728 overall (for 7 classes, including suicidal ideation) and 0.69 for the suicidal ideation class. We summarise the results by reflecting on the most significant predictive principal components of the suicidal ideation class to provide insight into the language used on Twitter to express suicidal ideation.
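
The abstract names a Maximum Probability voting decision method but does not spell it out. The following is a minimal sketch of one common reading of that rule, not the authors' implementation: for each class, take the highest probability any base classifier assigns to it, then predict the class with the largest such value. The function name, the class labels and the probability values are hypothetical and chosen only for illustration; the Rotation Forest step is not shown.

```python
from typing import Dict, List


def max_probability_vote(distributions: List[Dict[str, float]]) -> str:
    """Combine base-classifier outputs with a maximum-probability rule.

    Each element of `distributions` maps class labels to the probability
    one base classifier assigns to that class (an assumed input format).
    """
    if not distributions:
        raise ValueError("need at least one base classifier output")

    # For every class, keep the highest probability seen across classifiers.
    best: Dict[str, float] = {}
    for dist in distributions:
        for label, prob in dist.items():
            best[label] = max(prob, best.get(label, 0.0))

    # Predict the class whose best single probability is largest.
    return max(best, key=best.get)


# Hypothetical outputs from three base classifiers for one tweet.
base_outputs = [
    {"ideation": 0.55, "flippant": 0.35, "memorial": 0.10},
    {"ideation": 0.30, "flippant": 0.62, "memorial": 0.08},
    {"ideation": 0.48, "flippant": 0.40, "memorial": 0.12},
]

prediction = max_probability_vote(base_outputs)
```

In this made-up example two of the three base classifiers rank "ideation" first, so a majority vote would pick it, whereas the maximum-probability rule follows the single most confident classifier and picks "flippant". That contrast is the design choice the rule embodies under this reading.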

Item Type: Conference or Workshop Item (Paper)
Date Type: Publication
Status: Published
Schools: Computer Science & Informatics
Data Innovation Research Institute (DIURI)
Social Sciences (Includes Criminology and Education)
Subjects: H Social Sciences > H Social Sciences (General)
H Social Sciences > HM Sociology
Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Q Science > QA Mathematics > QA76 Computer software
Publisher: Association for Computing Machinery
Funders: Department of Health
Date of First Compliant Deposit: 30 March 2016
Last Modified: 28 Oct 2022 09:56

Citation Data

Cited 92 times in Scopus.
