Liang, Yan, Liu, Ying ORCID: https://orcid.org/0000-0001-9319-5940, Chen, Chong and Jiang, Zhigang 2018. Extracting topic-sensitive content from textual documents - A hybrid topic model approach. Engineering Applications of Artificial Intelligence 70, pp. 81-91. 10.1016/j.engappai.2017.12.010
PDF - Accepted Post-Print Version (1MB)
Abstract
When exploring information on a topic, users are often concerned with its different aspects. For instance, product designers seek information on specific topic aspects, such as technical challenges and usability, from online consumer opinions, while potential buyers wish to gauge the general sentiment of public opinion. In this paper, we study a problem called topic-sensitive content extraction (TSCE). TSCE aims to extract, from a single document in a given text collection, the content that is relevant to samples of topic aspects highlighted by users. To tackle TSCE, we propose a new hybrid topic model that integrates different structures in both topic space and context space. It focuses on identifying content associated with a specified topic aspect in each document. By modeling gradient documents via term profiles for context modeling, and by leveraging local and global differences between probability distributions over words in both topic modeling and context modeling, it better captures the features of various language patterns. Hence, sentence relevance ranking with respect to a specific topic aspect is substantially improved. Experimental studies on extracting critical content for specific aspects, including motivation and design solution, from technical patents for design analysis show the merits of the proposed model.
Item Type: | Article |
---|---|
Date Type: | Publication |
Status: | Published |
Schools: | Engineering |
Publisher: | Elsevier / International Federation of Automatic Control (IFAC) |
ISSN: | 0952-1976 |
Date of First Compliant Deposit: | 2 January 2018 |
Date of Acceptance: | 27 December 2017 |
Last Modified: | 27 Nov 2024 06:00 |
URI: | https://orca.cardiff.ac.uk/id/eprint/107835 |
Citation Data
Cited 11 times in Scopus.