Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 
WelshClear Cookie - decide language by browser settings

Mining term similarities from corpora

Nenadic, Goran, Spasic, Irena and Ananiadou, Sophia 2004. Mining term similarities from corpora. Terminology 10 (1) , pp. 55-80. 10.1075/term.10.1.04nen

Full text not available from this repository.


In this article, we present an approach to the automatic discovery of term similarities, which may serve as a basis for a number of term-oriented knowledge mining tasks. The method for term comparison combines internal (lexical similarity) and two types of external criteria (syntactic and contextual similarities). Lexical similarity is based on sharing lexical constituents (i.e. term heads and modifiers). Syntactic similarity relies on a set of specific lexico-syntactic co-occurrence patterns indicating the parallel usage of terms (e.g., within an enumeration or within a term coordination/conjunction structure), while contextual similarity is based on the usage of terms in similar contexts. Such contexts are automatically identified by a pattern mining approach, and a procedure is proposed to assess their domain-specific and terminological relevance. Although automatically collected, these patterns are domain dependent and identify contexts in which terms are used. Different types of similarities are combined into a hybrid similarity measure, which can be tuned for a specific domain by learning optimal weights for individual similarities. The suggested similarity measure has been tested in the domain of biomedicine, and some experiments are presented.

Item Type: Article
Date Type: Publication
Status: Published
Schools: Computer Science & Informatics
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Uncontrolled Keywords: sim; spasic; srb; term clustering; pattern mining; automatic terminology management; term similarity; contextual similarity
Publisher: John Benjamins
ISSN: 0929-9971
Last Modified: 04 Jun 2017 01:56

Citation Data

Cited 19 times in Scopus. View in Scopus. Powered By Scopus® Data

Actions (repository staff only)

Edit Item Edit Item