Camacho Collados, Jose
PDF: Accepted Post-Print Version (245kB)
Abstract
Recently, a number of unsupervised approaches have been proposed for learning vectors that capture the relationship between two words. Inspired by word embedding models, these approaches rely on co-occurrence statistics obtained from sentences in which the two target words appear. However, the number of such sentences is often quite small, and most of the words that occur in them are not relevant for characterizing the considered relationship. As a result, standard co-occurrence statistics typically lead to noisy relation vectors. To address this issue, we propose a latent variable model that aims to explicitly determine which words from the given sentences best characterize the relationship between the two target words. Relation vectors then correspond to the parameters of a simple unigram language model that is estimated from these words.
Item Type: | Conference or Workshop Item (Paper) |
---|---|
Date Type: | Publication |
Status: | Published |
Schools: | Schools > Computer Science & Informatics |
Publisher: | IJCAI |
Date of First Compliant Deposit: | 14 August 2019 |
Last Modified: | 08 Aug 2025 11:15 |
URI: | https://orca.cardiff.ac.uk/id/eprint/124030 |
Citation Data
Cited 8 times in Scopus.