Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 

The role of idioms in sentiment analysis

Williams, Lowri, Bannister, Christian, Arribas-Ayllon, Michael, Preece, Alun and Spasic, Irena 2015. The role of idioms in sentiment analysis. Expert Systems with Applications 42 (21), pp. 7375-7385. 10.1016/j.eswa.2015.05.039

PDF - Published Version
Available under License Creative Commons Attribution.



In this paper we investigate the role of idioms in automated approaches to sentiment analysis. To estimate the degree to which including idioms as features may improve the results of traditional sentiment analysis, we compared our results against two such methods. First, to support idioms as features, we collected a set of 580 idioms that are relevant to sentiment analysis, i.e. ones that can be mapped to an emotion. These mappings were obtained using a web-based crowdsourcing approach. The quality of the crowdsourced information is demonstrated by high agreement among five independent annotators, calculated using Krippendorff's alpha coefficient (α = 0.662). Second, to evaluate the results of sentiment analysis, we assembled a corpus of sentences in which idioms are used in context. Each sentence was annotated with an emotion, which formed the basis for the gold standard used for the comparison against the two baseline methods. Performance was evaluated in terms of three measures: precision, recall and F-measure. Overall, our approach achieved 64% and 61% for these three measures in the two experiments, improving the baseline results by 20 and 15 percentage points respectively. F-measure was significantly improved across all three sentiment polarity classes: Positive, Negative and Other. The most notable improvement was recorded in the classification of positive sentiments, where recall improved by 45 percentage points in both experiments without compromising precision. The statistical significance of these improvements was confirmed by McNemar's test.
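The evaluation described in the abstract (per-class precision, recall and F-measure, followed by McNemar's test against a baseline) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function names, the toy labels, and the choice of the exact binomial form of McNemar's test. The abstract does not state which variant of the test the authors used, and the data below is not taken from the paper.

```python
from math import comb

LABELS = ("Positive", "Negative", "Other")


def per_class_scores(gold, predicted, labels=LABELS):
    """Return {label: (precision, recall, f_measure)} for each polarity class."""
    scores = {}
    pairs = list(zip(gold, predicted))
    for label in labels:
        tp = sum(1 for g, p in pairs if g == label and p == label)
        fp = sum(1 for g, p in pairs if g != label and p == label)
        fn = sum(1 for g, p in pairs if g == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        total = precision + recall
        f_measure = 2 * precision * recall / total if total else 0.0
        scores[label] = (precision, recall, f_measure)
    return scores


def mcnemar_exact(gold, system_a, system_b):
    """Two-sided exact (binomial) McNemar p-value for two classifiers.

    Only discordant sentences count: those that exactly one of the two
    systems labels correctly. Assumed variant, not confirmed by the source.
    """
    triples = list(zip(gold, system_a, system_b))
    a_only = sum(1 for g, a, b in triples if a == g and b != g)
    b_only = sum(1 for g, a, b in triples if a != g and b == g)
    n = a_only + b_only
    if n == 0:
        return 1.0  # the systems never disagree on correctness
    k = min(a_only, b_only)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Toy, made-up labels (hypothetical; not data from the paper).
gold = ["Positive", "Positive", "Negative", "Other", "Negative", "Positive"]
baseline = ["Other", "Other", "Negative", "Other", "Other", "Positive"]
idiom = ["Positive", "Positive", "Negative", "Other", "Negative", "Other"]

for name, predicted in (("baseline", baseline), ("idiom", idiom)):
    print(name, per_class_scores(gold, predicted))
print("McNemar p-value:", mcnemar_exact(gold, baseline, idiom))
```

With only six toy sentences the test cannot reach significance; the sketch shows the mechanics only. A comparison like the one reported in the abstract would apply the same steps to the full annotated corpus.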

Item Type: Article
Date Type: Publication
Status: Published
Schools: Computer Science & Informatics; Social Sciences (Includes Criminology and Education); MRC Centre for Neuropsychiatric Genetics and Genomics (CNGG)
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science; Q Science > QA Mathematics > QA76 Computer software
Uncontrolled Keywords: emotion recognition, sentiment analysis, natural language processing, user-generated content, tagging
Additional Information: This is an open access article under the CC BY license
Publisher: Elsevier
ISSN: 0957-4174
Date of First Compliant Deposit: 22 April 2016
Last Modified: 15 Dec 2023 07:27

Citation Data

Cited 58 times in Scopus.
