Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 

KLOSURE: Closing in on open–ended patient questionnaires with text mining

Spasic, Irena, Owen, David, Smith, Andrew and Button, Kate 2019. KLOSURE: Closing in on open–ended patient questionnaires with text mining. Journal of Biomedical Semantics 10 (S1), 24. doi:10.1186/s13326-019-0215-3

PDF - Published Version
Available under License Creative Commons Attribution.


Background: The Knee injury and Osteoarthritis Outcome Score (KOOS) is an instrument used to quantify patients' perceptions of their knee condition and associated problems. It is administered as a 42-item closed-ended questionnaire in which patients self-assess five outcomes: pain, other symptoms, activities of daily living, sport and recreation activities, and quality of life. We developed KLOG as a 10-item open-ended version of the KOOS questionnaire in an attempt to obtain deeper insight into patients' opinions, including their unmet needs. However, the open-ended nature of the questionnaire incurs analytical overhead associated with the interpretation of responses. The goal of this study was to automate such analysis.

We implemented KLOSURE as a system for mining free-text responses to the KLOG questionnaire. It consists of two subsystems, one concerned with feature extraction and the other with classification of feature vectors. Feature extraction is performed by a set of four modules whose main functionalities are linguistic pre-processing, sentiment analysis, named entity recognition and lexicon lookup, respectively. The outputs produced by each module are combined into feature vectors, whose structure varies across the KLOG questions. Finally, Weka, a machine learning workbench, was used to classify the feature vectors.

Results: The precision of the system varied between 62.8% and 95.3%, whereas the recall varied from 58.3% to 87.6% across the 10 questions. The overall performance in terms of F-measure varied between 59.0% and 91.3%, with an average of 74.4% and a standard deviation of 8.8.

Conclusions: We demonstrated the feasibility of mining open-ended patient questionnaires. By automatically mapping free-text answers onto a Likert scale, we can effectively measure the progress of rehabilitation over time. In comparison to traditional closed-ended questionnaires, our approach offers much richer information that can be utilised to support clinical decision making. Overall, we demonstrated how text mining can be used to combine the benefits of qualitative and quantitative analysis of patient experiences.
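As a reading aid for the per-question figures in the abstract, the following is a minimal sketch of how an F-measure can be derived from precision and recall and then summarised across questions. It assumes the balanced F-measure (F1), which the abstract does not state explicitly, and a population standard deviation, which is also an assumption. The function name `f_measure`, the question labels and all numeric values are hypothetical illustrations, not data from the paper.

```python
from statistics import mean, pstdev


def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): the harmonic mean of precision and recall.

    Assumption: the abstract does not say which F-measure variant was used.
    """
    if precision + recall == 0:
        return 0.0  # conventional value when both inputs are zero
    return 2 * precision * recall / (precision + recall)


# Hypothetical per-question (precision, recall) pairs, NOT the paper's data.
scores = {
    "Q1": (0.80, 0.70),
    "Q2": (0.95, 0.85),
    "Q3": (0.65, 0.60),
}

# One F-measure per question, then summary statistics across questions.
per_question = {q: f_measure(p, r) for q, (p, r) in scores.items()}
average = mean(list(per_question.values()))
spread = pstdev(list(per_question.values()))  # population SD (assumed)

for question, value in per_question.items():
    print(f"{question}: F = {value:.3f}")
print(f"mean = {average:.3f}, standard deviation = {spread:.3f}")
```

Because the harmonic mean is pulled towards the smaller of its two inputs, a question with high precision but low recall still receives a modest F-measure. Note also that the precision and recall ranges quoted in the abstract are not paired per question, so the reported F-measure range cannot be recomputed from them alone.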

Item Type: Article
Date Type: Publication
Status: Published
Schools: Psychology; Computer Science & Informatics; Healthcare Sciences; Data Innovation Research Institute (DIURI)
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science; Q Science > QA Mathematics > QA76 Computer software
Publisher: BMC
ISSN: 2041-1480
Funders: Wellcome Trust
Date of First Compliant Deposit: 31 July 2019
Date of Acceptance: 30 July 2019
Last Modified: 04 May 2023 19:45

Citation Data

Cited 8 times in Scopus.
