Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 

EHIN: Early-aware hierarchical interaction network for weakly-supervised referring image segmentation

Li, Hongjun, Wang, Nan, Chen, Anqing, Liu, Jiang, Ma, Wanli, Liu, Weide, Ju, Yakun, Rosin, Paul L. ORCID: https://orcid.org/0000-0002-4965-3884, Liu, Hantao ORCID: https://orcid.org/0000-0003-4544-3481 and Zhou, Wei 2026. EHIN: Early-aware hierarchical interaction network for weakly-supervised referring image segmentation. Neurocomputing 659, 131764. 10.1016/j.neucom.2025.131764

Full text not available from this repository.

Abstract

Referring image segmentation (RIS) aims to segment target regions in images based on natural language descriptions. Although weakly-supervised RIS frameworks have been proposed to reduce reliance on costly manual annotations, their performance remains limited due to both the low quality of pseudo-labels and the inherent difficulty in achieving effective interaction between visual and textual features. In this paper, we propose a novel weakly-supervised framework named Early-aware Hierarchical Interaction Network (EHIN). The proposed network includes two key components, which are designed to enhance pseudo-label generation and improve the interaction between visual and textual features for RIS, respectively. First, EHIN incorporates an Early-aware Contrastive Learning Module (ECLM) that enhances feature discrimination by leveraging contrastive learning to distinguish target features from background noise. By integrating the module early into the processing pipeline, ECLM operates on raw image features directly, preserving richer visual details while reducing reliance on labeled data and thus improving the reliability of pseudo-labels. Second, EHIN integrates a Hierarchical Interaction Prompt Module (HIPM) to facilitate comprehensive interaction between visual and textual features and enhance subsequent feature fusion. Extensive experimental results on four benchmark datasets demonstrate that the proposed EHIN outperforms state-of-the-art RIS methods. Code is available at https://github.com/CDUT-DBGroup/MFP-TRIS.

Item Type: Article
Date Type: Publication
Status: Published
Schools: Schools > Computer Science & Informatics
Publisher: Elsevier
ISSN: 0925-2312
Date of Acceptance: 8 October 2025
Last Modified: 04 Nov 2025 11:01
URI: https://orca.cardiff.ac.uk/id/eprint/182058
