Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 

Text mining patient experiences from online health communities

Greenwood, Mark 2015. Text mining patient experiences from online health communities. PhD Thesis, Cardiff University.
Item availability restricted.

2016greenwoodmaphd.pdf: PDF, Accepted Post-Print Version (2MB)
greenwoodma.pdf: PDF, Supplemental Material (153kB), restricted to repository staff only


Social media has had an impact on how patients experience healthcare. Through online channels, patients are sharing information and their experiences with potentially large audiences all over the world. While sharing in this way may offer immediate benefits to themselves and their readership (e.g. other patients), these unprompted, self-authored accounts of illness are also an important resource for healthcare researchers. They offer unprecedented insight into patients' experience of illness. Qualitative analyses have been undertaken to explore this source of data and to utilise the information expressed through these media. However, the manual nature of the analysis means that its scope is limited to a small proportion of the hundreds of thousands of authors who are creating content. In our research, we aim to explore the use of text mining to support traditional qualitative analysis of this data. Text mining uses a number of processes to extract useful facts from text and analyse patterns within it; the ultimate aim is to generate new knowledge by analysing textual data en masse. We developed QuTiP, a text mining framework that can enable large-scale qualitative analyses of patient narratives shared over social media. In this thesis, we describe QuTiP and our application of the framework to analyse the accounts of patients living with chronic lung disease. As well as a qualitative analysis, we describe our approaches to automated information extraction, term recognition and text classification, which automatically extract relevant information from blog post data. Within the QuTiP framework, these individual automated approaches can be brought together to support further analyses of large social media datasets.

Item Type: Thesis (PhD)
Status: Unpublished
Schools: Computer Science & Informatics
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Uncontrolled Keywords: Text Mining; Health Informatics; Social Media; Automatic Term Recognition; Sentiment Analysis; Chronic Obstructive Pulmonary Disease; COPD Exacerbation; Patient Experience; Qualitative Analysis; Information Extraction
Date of First Compliant Deposit: 30 March 2016
Last Modified: 15 Nov 2016 05:46
