Camacho Collados, Jose ORCID: https://orcid.org/0000-0003-1618-7239 and Navigli, Roberto 2016. Find the word that does not belong: a framework for an intrinsic evaluation of word vector representations. Presented at: 1st Workshop on Evaluating Vector Space Representations for NLP, Berlin, 12 August 2016. Proceedings of the 1st Workshop on Evaluating Vector Space Representations for NLP. Stroudsburg, PA: The Association for Computational Linguistics, pp. 43-50. 10.18653/v1/W16-2508
Official URL: http://dx.doi.org/10.18653/v1/W16-2508
Abstract
We present a new framework for an intrinsic evaluation of word vector representations based on the outlier detection task. This task is intended to test the capability of vector space models to create semantic clusters in the space. We carried out a pilot study building a gold standard dataset and the results revealed two important features: human performance on the task is extremely high compared to the standard word similarity task, and state-of-the-art word embedding models, whose current shortcomings were highlighted as part of the evaluation, still have considerable room for improvement.
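As a rough illustration of the outlier detection task described in the abstract, the sketch below picks the word whose vector is, on average, least similar to the others in a group. The scoring rule (mean cosine similarity), the function names `cosine` and `find_outlier`, and the toy two-dimensional vectors are all assumptions made for this example; the abstract does not specify how the paper scores outliers, so this should not be read as the authors' method.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def find_outlier(vectors):
    """Return the word with the lowest mean cosine similarity to the rest.

    `vectors` maps each word to its embedding. This scoring rule is an
    assumption for illustration, not taken from the paper.
    """
    def mean_similarity(word):
        others = [w for w in vectors if w != word]
        return sum(cosine(vectors[word], vectors[w]) for w in others) / len(others)
    return min(vectors, key=mean_similarity)

# Hypothetical toy embeddings: three fruit words and one tool word.
toy_vectors = {
    "apple": [0.9, 0.1],
    "banana": [0.8, 0.2],
    "cherry": [0.85, 0.15],
    "hammer": [0.1, 0.9],
}

print(find_outlier(toy_vectors))
```

With real embeddings, the same function could be applied to vectors loaded from any pretrained model; a model that clusters related words well should single out the unrelated word more often.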
Item Type: | Conference or Workshop Item (Paper) |
---|---|
Date Type: | Publication |
Status: | Published |
Schools: | Computer Science & Informatics |
Publisher: | The Association for Computational Linguistics |
ISBN: | 978-1-945626-14-2 |
Last Modified: | 24 Oct 2022 07:04 |
URI: | https://orca.cardiff.ac.uk/id/eprint/114031 |