Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 

Improving cross-lingual word embeddings by meeting in the middle

Doval, Yerai, Camacho-Collados, Jose, Espinosa Anke, Luis and Schockaert, Steven ORCID: https://orcid.org/0000-0002-9256-2881 2018. Improving cross-lingual word embeddings by meeting in the middle. Presented at: Conference on Empirical Methods in Natural Language Processing, Brussels, 31 October-4 November.

Abstract

Cross-lingual word embeddings are becoming increasingly important in multilingual NLP. Recently, it has been shown that these embeddings can be effectively learned by aligning two disjoint monolingual vector spaces through linear transformations, using no more than a small bilingual dictionary as supervision. In this work, we propose to apply an additional transformation after the initial alignment step, which moves cross-lingual synonyms towards a middle point between them. By applying this transformation our aim is to obtain a better cross-lingual integration of the vector spaces. In addition, and perhaps surprisingly, the monolingual spaces also improve by this transformation. This is in contrast to the original alignment, which is typically learned such that the structure of the monolingual spaces is preserved. Our experiments confirm that the resulting cross-lingual embeddings outperform state-of-the-art models in both monolingual and cross-lingual evaluation tasks.
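The two-step idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say how either transformation is learned, so the sketch assumes an orthogonal Procrustes solution for the initial alignment and an unconstrained least-squares linear map for the step that moves dictionary pairs towards their midpoint. The function names (`procrustes_align`, `fit_to_midpoints`) and the synthetic data are hypothetical, chosen only for illustration.

```python
import numpy as np

def procrustes_align(X, Y):
    """Orthogonal map W minimising ||X W - Y||_F.
    Rows of X and Y are the vectors of dictionary pairs (assumed setup)."""
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt

def fit_to_midpoints(A, B):
    """Least-squares linear maps that send each space towards the
    row-wise midpoints of the aligned dictionary pairs (assumed method)."""
    M = (A + B) / 2.0
    Wa, *_ = np.linalg.lstsq(A, M, rcond=None)
    Wb, *_ = np.linalg.lstsq(B, M, rcond=None)
    return Wa, Wb

# Synthetic stand-in for two monolingual spaces linked by a dictionary:
# the "target" space is a rotated, noisy copy of the "source" space.
rng = np.random.default_rng(0)
n_pairs, dim = 200, 10
src = rng.normal(size=(n_pairs, dim))
Q, _ = np.linalg.qr(rng.normal(size=(dim, dim)))
tgt = src @ Q + 0.1 * rng.normal(size=(n_pairs, dim))

# Step 1: initial alignment with an orthogonal transformation.
W = procrustes_align(src, tgt)
src_aligned = src @ W

# Step 2: additional transformation towards the midpoints.
Wa, Wb = fit_to_midpoints(src_aligned, tgt)
src_mid, tgt_mid = src_aligned @ Wa, tgt @ Wb

# Frobenius-norm gap between dictionary pairs before and after step 2.
gap_before = np.linalg.norm(src_aligned - tgt)
gap_after = np.linalg.norm(src_mid - tgt_mid)
print(gap_before, gap_after)
```

In this sketch the second step cannot widen the overall gap between dictionary pairs, because leaving each space unchanged is one of the candidate maps the least-squares fit considers. Whether the step also improves the monolingual spaces, as the abstract reports, is an empirical question that this toy example does not address.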

Item Type: Conference or Workshop Item (Paper)
Status: Unpublished
Schools: Computer Science & Informatics
Last Modified: 24 Oct 2022 07:23
URI: https://orca.cardiff.ac.uk/id/eprint/114793

Citation Data

Cited 29 times in Scopus.
