Alrehamy, Hassan H. and Walker, Coral
Abstract
Keyphrases provide important semantic metadata for organizing and managing free-text documents. As data grow exponentially, there is a pressing demand for automatic and efficient keyphrase extraction methods. In this paper we introduce SemCluster, a clustering-based unsupervised keyphrase extraction method. By integrating an internal ontology (i.e., WordNet) with external knowledge sources, SemCluster identifies and extracts semantically important terms from a given document, clusters the terms, and, using the clustering results as heuristics, identifies the most representative phrases and selects them as keyphrases. SemCluster is evaluated against two baseline unsupervised methods, TextRank and KeyCluster, on the Inspec dataset using the F1-measure. The evaluation results clearly show that SemCluster outperforms both methods.
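For readers unfamiliar with the F1-measure mentioned in the abstract, the following is a minimal sketch of how it is commonly computed for keyphrase extraction on a single document. It is not taken from the paper: the function name `keyphrase_f1`, the case-insensitive exact-match rule, and the sample phrases are all assumptions made for illustration, and the paper's actual matching and averaging protocol may differ.

```python
def keyphrase_f1(extracted, gold):
    """Return (precision, recall, F1) for one document.

    Assumption: a phrase counts as correct only on a case-insensitive
    exact match with a gold-standard phrase (no stemming applied).
    """
    extracted_set = {p.strip().lower() for p in extracted}
    gold_set = {p.strip().lower() for p in gold}

    matched = len(extracted_set & gold_set)
    # Avoid division by zero when a list is empty or nothing matches.
    if matched == 0:
        return 0.0, 0.0, 0.0

    precision = matched / len(extracted_set)  # share of extracted phrases that are correct
    recall = matched / len(gold_set)          # share of gold phrases that were found
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical example data, not drawn from the Inspec dataset.
extracted = ["keyphrase extraction", "WordNet", "clustering"]
gold = ["Keyphrase Extraction", "wordnet", "ontology", "semantic metadata"]
precision, recall, f1 = keyphrase_f1(extracted, gold)
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f}")
```

F1 is the harmonic mean of precision and recall, so a method cannot score well by extracting very many phrases (high recall, low precision) or very few (the reverse).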
Item Type: | Conference or Workshop Item (Paper) |
---|---|
Date Type: | Publication |
Status: | Published |
Schools: | Computer Science & Informatics |
Publisher: | Springer |
ISBN: | 978-3-319-66939-7 |
ISSN: | 2194-5357 |
Last Modified: | 22 Apr 2023 14:02 |
URI: | https://orca.cardiff.ac.uk/id/eprint/111040 |
Citation Data
Cited 10 times in Scopus.