ORCA
Online Research @ Cardiff

Accuracy of a pre-trained sentiment analysis (SA) classification model on tweets related to emergency response and early recovery assessment: the case of 2019 Albanian earthquake

Contreras Mojica, Diana, Wilkinson, Sean, Alterman, Evangeline and Hervás, Javier 2022. Accuracy of a pre-trained sentiment analysis (SA) classification model on tweets related to emergency response and early recovery assessment: the case of 2019 Albanian earthquake. Natural Hazards 113, pp. 403-421. 10.1007/s11069-022-05307-w

PDF - Published Version
Available under License Creative Commons Attribution.

Official URL: https://doi.org/10.1007/s11069-022-05307-w

Abstract

Traditionally, earthquake impact assessments have been made via fieldwork and data collection sponsored by non-governmental organisations (NGOs); however, this approach is time-consuming, expensive and often limited. Recently, social media (SM) has become a valuable tool for quickly collecting large amounts of first-hand data after a disaster and shows great potential for decision-making. Nevertheless, extracting meaningful information from SM is an ongoing area of research. This paper tests the accuracy of the pre-trained sentiment analysis (SA) model developed by the no-code machine learning platform MonkeyLearn using the text data related to the emergency response and early recovery phase of the three major earthquakes that struck Albania on 26 November 2019. These events caused 51 deaths, 3000 injuries and extensive damage. We obtained 695 tweets with the hashtags #Albania, #AlbanianEarthquake and #albanianearthquake from 26 November 2019 to 3 February 2020. We used these data to test the accuracy of the pre-trained SA classification model developed by MonkeyLearn to identify polarity in text data. This test explores the feasibility of automating the classification process to extract meaningful information from SM text data in real time in the future. We tested the no-code machine learning platform's performance using a confusion matrix. We obtained an overall accuracy (ACC) of 63% and a misclassification rate of 37%. We conclude that the ACC of the unsupervised classification is sufficient for a preliminary assessment, but further research is needed to determine whether the accuracy is improved by customising the training model of the machine learning platform.
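
The abstract reports overall accuracy (ACC) and a misclassification rate derived from a confusion matrix. As a rough illustration of how those two figures relate, the sketch below builds a confusion matrix from paired manual and model labels and computes both values. It is not the authors' code and does not call MonkeyLearn: the three polarity labels, the function names and the eight example labels are all assumptions made up for this illustration, and the resulting numbers are not the paper's results.

```python
from typing import List, Sequence

# Assumed polarity classes; the record does not list the exact labels used.
LABELS = ["negative", "neutral", "positive"]


def confusion_matrix(actual: Sequence[str], predicted: Sequence[str],
                     labels: Sequence[str] = LABELS) -> List[List[int]]:
    """Rows are the manually assigned (actual) class, columns the model's class."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix


def accuracy(matrix: List[List[int]]) -> float:
    """Overall accuracy: correctly classified items (diagonal) over all items."""
    total = sum(sum(row) for row in matrix)
    if total == 0:
        raise ValueError("confusion matrix is empty")
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def misclassification_rate(matrix: List[List[int]]) -> float:
    """Share of items placed in the wrong class, i.e. 1 - accuracy."""
    return 1.0 - accuracy(matrix)


# Hypothetical labels for eight tweets (illustrative only, not the study data).
actual = ["negative", "negative", "neutral", "positive",
          "positive", "neutral", "negative", "positive"]
predicted = ["negative", "neutral", "neutral", "positive",
             "negative", "neutral", "negative", "positive"]

matrix = confusion_matrix(actual, predicted)
print(matrix)
print(f"ACC = {accuracy(matrix):.2f}, "
      f"misclassification = {misclassification_rate(matrix):.2f}")
```

Because the misclassification rate is the complement of the accuracy, the reported 63% and 37% sum to 100%.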

Item Type:	Article
Date Type:	Publication
Status:	Published
Schools:	Schools > Earth and Environmental Sciences
Subjects:	G Geography. Anthropology. Recreation > G Geography (General); G Geography. Anthropology. Recreation > GB Physical geography; H Social Sciences > H Social Sciences (General); H Social Sciences > HA Statistics; H Social Sciences > HV Social pathology. Social and public welfare; J Political Science > JF Political institutions (General); J Political Science > JZ International relations; Q Science > Q Science (General); T Technology > TA Engineering (General). Civil engineering (General); T Technology > TH Building construction
Additional Information:	This article is licensed under a Creative Commons Attribution 4.0 International License
Publisher:	Springer
ISSN:	1573-0840
Funders:	Engineering and Physical Sciences Research Council (EPSRC) Grant number: EP/P025641/1
Date of First Compliant Deposit:	26 March 2022
Date of Acceptance:	24 February 2022
Last Modified:	06 May 2023 07:04
URI:	https://orca.cardiff.ac.uk/id/eprint/148906

Citation Data

Cited 4 times in Scopus.
