Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 

Diary mining: predicting emotion from activities, people and places

Alahdal, Shahd 2020. Diary mining: predicting emotion from activities, people and places. PhD Thesis, Cardiff University.
Item availability restricted.

PDF - Accepted Post-Print Version
Diary methods collect qualitative information from people about their everyday lives and are commonly used in fields such as psychology, sociology and medicine to understand human behaviour and improve mental health. By its nature, diary data is difficult to analyse and time-consuming to process manually, which creates a gap between collection, analysis and intervention. Technologies such as machine learning have the potential to shrink this gap, save time and effort, and hence give deeper insight into the diary data. Computational techniques are already widely used across disciplines to understand human behaviour. One such application is emotion detection from text: the process of automatically identifying either the emotion directly expressed by the author or the underlying emotion that prompted the author to write the text. Studies have shown promising results using a range of extracted features, both linguistic and non-linguistic (e.g., number of followers). However, very few have used activities for emotion prediction from text, and none of these have combined activities with other situational features of the relevant event. The research in this thesis proposes an approach to predict emotion from self-recorded personal textual diaries using a small set of domain-specific features. Daily activities, in association with people and places, are used as the main indicators of an individual's current situation. The association of each of these factors with emotion has been well studied independently in psychology, which motivated this investigation to validate the combination of all three features and test their ability to predict emotion from a computer science perspective.
This research begins by proposing a framework that classifies short diary entries into a small number of high-level personal activities (work/study, social/family, food/drink, leisure, essentials) and represents them as low-dimensional probability vectors using unsupervised (clustering) and supervised (classification) machine learning techniques. Because these entries are sparse, and because training data is lacking owing to their highly personal nature, the framework applies a transfer learning approach that exploits previously acquired knowledge as a foundation step: it uses a word embedding model pre-trained on similar, but not identical, publicly available data that is easy to obtain (tweets). Furthermore, references to people and places are recognised in the text using information extraction techniques. These automatically extracted features are then used to predict emotion under different emotion schemes, including Ekman's basic emotion model and the Circumplex model, together with simpler classifications into pleasantness/unpleasantness and emotional/neutral states. In addition, different learning strategies for predicting emotion are compared, including the use of personalised and global training data. This research has shown that activities, people, and places can successfully predict some emotions from the text, especially 'happiness' and 'neutral'.
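As an illustration only, the idea of representing a short diary entry as a low-dimensional probability vector over the five activity categories can be sketched as follows. This is not the method used in the thesis: the toy word vectors, the seed words, the `activity_probabilities` function and the cosine-plus-softmax scoring are all assumptions made for the example, and a real system would load embeddings pre-trained on a large corpus such as tweets.

```python
import math
import re

ACTIVITIES = ["work/study", "social/family", "food/drink", "leisure", "essentials"]

# Hypothetical toy word vectors standing in for a pre-trained embedding model.
TOY_VECTORS = {
    "meeting":  [0.9, 0.1, 0.0, 0.1, 0.0],
    "exam":     [0.8, 0.0, 0.1, 0.0, 0.1],
    "friends":  [0.1, 0.9, 0.1, 0.2, 0.0],
    "family":   [0.0, 0.8, 0.2, 0.1, 0.1],
    "lunch":    [0.1, 0.2, 0.9, 0.0, 0.1],
    "coffee":   [0.2, 0.1, 0.8, 0.1, 0.0],
    "movie":    [0.0, 0.2, 0.1, 0.9, 0.0],
    "game":     [0.1, 0.1, 0.0, 0.8, 0.1],
    "sleep":    [0.0, 0.0, 0.1, 0.1, 0.9],
    "shopping": [0.1, 0.1, 0.1, 0.2, 0.8],
}

# Hypothetical seed words used to build one prototype vector per activity.
SEED_WORDS = {
    "work/study": ["meeting", "exam"],
    "social/family": ["friends", "family"],
    "food/drink": ["lunch", "coffee"],
    "leisure": ["movie", "game"],
    "essentials": ["sleep", "shopping"],
}


def mean_vector(words):
    """Average the vectors of the known words; return None if none are known."""
    known = [TOY_VECTORS[w] for w in words if w in TOY_VECTORS]
    if not known:
        return None
    return [sum(column) / len(known) for column in zip(*known)]


def cosine(u, v):
    """Cosine similarity, defined as 0.0 when either vector has zero length."""
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(u, v)) / (norm_u * norm_v)


def softmax(scores, temperature=0.1):
    """Turn similarity scores into probabilities that sum to one."""
    exps = [math.exp(s / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def activity_probabilities(entry):
    """Map a diary entry to a probability for each activity category."""
    tokens = re.findall(r"[a-z']+", entry.lower())
    entry_vector = mean_vector(tokens)
    if entry_vector is None:
        # No known words: fall back to a uniform distribution.
        return {a: 1.0 / len(ACTIVITIES) for a in ACTIVITIES}
    prototypes = [mean_vector(SEED_WORDS[a]) for a in ACTIVITIES]
    scores = [cosine(entry_vector, p) for p in prototypes]
    return dict(zip(ACTIVITIES, softmax(scores)))


if __name__ == "__main__":
    for activity, p in activity_probabilities("Coffee and lunch with friends").items():
        print(f"{activity}: {p:.3f}")
```

A vector of this kind, combined with indicators for any people and places mentioned in the entry, could then serve as the input features for a standard emotion classifier. The temperature parameter only controls how peaked the resulting distribution is and is likewise an assumption of this sketch.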

Item Type: Thesis (PhD)
Date Type: Completion
Status: Unpublished
Schools: Computer Science & Informatics
Date of First Compliant Deposit: 29 October 2020
Last Modified: 29 Oct 2020 10:51
