Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 

Aligning visual prototypes with BERT embeddings for few-shot learning

Yan, Kun, Bouraoui, Zied, Wang, Ping, Jameel, Shoaib and Schockaert, Steven ORCID: https://orcid.org/0000-0002-9256-2881 2021. Aligning visual prototypes with BERT embeddings for few-shot learning. Presented at: ACM International Conference on Multimedia Retrieval (ACM ICMR 2021), Taipei, Taiwan, 21-24 August 2021. Proceedings of the 2021 International Conference on Multimedia Retrieval. New York, NY, USA: Association for Computing Machinery, pp. 367-375. 10.1145/3460426.3463641

PDF - Accepted Post-Print Version

Abstract

Few-shot learning (FSL) is the task of learning to recognize previously unseen categories of images from a small number of training examples. This is a challenging task, as the available examples may not be enough to unambiguously determine which visual features are most characteristic of the considered categories. To alleviate this issue, we propose a method that additionally takes into account the names of the image classes. While the use of class names has already been explored in previous work, our approach differs in two key aspects. First, while previous work has aimed to directly predict visual prototypes from word embeddings, we found that better results can be obtained by treating visual and text-based prototypes separately. Second, we propose a simple strategy for learning class name embeddings using the BERT language model, which we found to substantially outperform the GloVe vectors that were used in previous work. We furthermore propose a strategy for dealing with the high dimensionality of these vectors, inspired by models for aligning cross-lingual word embeddings. We provide experiments on miniImageNet, CUB and tieredImageNet, showing that our approach consistently improves the state-of-the-art in metric-based FSL.
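
The abstract's idea of keeping visual and text-based prototypes separate can be illustrated with a minimal sketch. The abstract does not say how the two kinds of prototype are combined, so everything specific here is an assumption made for illustration: the weighted sum of the two scores, the weight `lam`, the linear projections `W_img` and `W_txt` into a shared lower-dimensional space, and all function names. The class-name vectors are taken as given (for example, precomputed from a language model such as BERT). This is not the authors' implementation.

```python
import numpy as np


def visual_prototypes(support_feats, support_labels, n_classes):
    """Mean support embedding per class (the usual prototypical-network choice)."""
    return np.stack(
        [support_feats[support_labels == c].mean(axis=0) for c in range(n_classes)]
    )


def neg_sq_dist(queries, prototypes):
    """Negative squared Euclidean distance, shape (n_queries, n_classes)."""
    diff = queries[:, None, :] - prototypes[None, :, :]
    return -(diff ** 2).sum(axis=-1)


def classify(query_feats, support_feats, support_labels,
             class_name_vecs, W_img, W_txt, lam=0.5):
    """Score each query against visual and text-based prototypes separately.

    ASSUMPTIONS (not taken from the abstract):
      - W_img and W_txt are linear maps into a shared low-dimensional space;
      - the two scores are combined as visual + lam * text.
    """
    n_classes = class_name_vecs.shape[0]

    # Visual prototypes come only from the few labelled support images.
    protos = visual_prototypes(support_feats, support_labels, n_classes)
    visual_scores = neg_sq_dist(query_feats, protos)

    # Text-based prototypes come only from the class-name vectors; image
    # features and name vectors are compared after projection.
    text_scores = neg_sq_dist(query_feats @ W_img, class_name_vecs @ W_txt)

    return (visual_scores + lam * text_scores).argmax(axis=1)


# Toy 2-way, 2-shot episode with hand-picked values.
support_feats = np.array([[0.0, 0.0], [0.2, 0.0], [5.0, 5.0], [5.0, 5.2]])
support_labels = np.array([0, 0, 1, 1])
query_feats = np.array([[0.1, 0.1], [4.9, 5.0]])
class_name_vecs = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
W_img = np.eye(2)
W_txt = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 0.0]])

predictions = classify(query_feats, support_feats, support_labels,
                       class_name_vecs, W_img, W_txt)
```

In this sketch the text-based prototype never replaces or predicts the visual one. Each contributes its own score, and setting `lam=0` falls back to a purely visual prototype classifier.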

Item Type: Conference or Workshop Item (Paper)
Date Type: Publication
Status: Published
Schools: Schools > Computer Science & Informatics
Publisher: Association for Computing Machinery
ISBN: 978-1-4503-8463-6
Date of First Compliant Deposit: 4 June 2021
Date of Acceptance: 21 April 2021
Last Modified: 24 Jul 2025 14:29
URI: https://orca.cardiff.ac.uk/id/eprint/141725
