Mao, Amin, Yan, Jiebin, Fang, Yuming and Liu, Hantao
PDF - Accepted Post-Print Version. Restricted to Repository staff only until 13 March 2026 due to copyright restrictions. Available under License Creative Commons Attribution Non-commercial No Derivatives.
Abstract
Deep learning based video salient object detection (VSOD) models have achieved great success in the past few years; however, they still suffer from two problems: i) they struggle to accurately predict the pixels surrounding salient objects; ii) unaligned features of different scales lead to deviations during feature fusion. To tackle these problems, we propose a hierarchical boundary feature alignment network (HBFA). Specifically, the proposed HBFA consists of a temporal-spatial fusion module (TSM) and three decoding branches. The TSM captures multi-scale spatiotemporal information. Two boundary feature branches guide the whole network to pay more attention to the boundaries of salient objects, while the feature alignment branch fuses the features from the internal and external branches and aligns features across different scales. Our extensive experiments show that the proposed method achieves new state-of-the-art performance.
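The paper's HBFA implementation is not included in this record, but as a rough, hypothetical illustration of the general idea behind cross-scale feature alignment before fusion, the following minimal PyTorch sketch upsamples a coarse feature map, predicts a per-pixel offset field, and warps the coarse map into alignment with the fine one before fusing them. The module name `AlignedFusion` and all layer choices are assumptions for illustration only, not the authors' method.

```python
# Hypothetical sketch of offset-based cross-scale feature alignment and fusion.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AlignedFusion(nn.Module):
    """Aligns a coarse feature map to a fine one with a learned 2-D offset
    field, then fuses the two by element-wise addition."""
    def __init__(self, channels):
        super().__init__()
        # Predict a per-pixel (dx, dy) offset from the concatenated features.
        self.offset_head = nn.Conv2d(2 * channels, 2, kernel_size=3, padding=1)
        self.fuse = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, fine, coarse):
        # Bring the coarse map to the fine resolution first.
        coarse_up = F.interpolate(coarse, size=fine.shape[-2:],
                                  mode='bilinear', align_corners=False)
        offset = self.offset_head(torch.cat([fine, coarse_up], dim=1))

        # Build a normalized sampling grid and shift it by the predicted offset.
        n, _, h, w = fine.shape
        ys, xs = torch.meshgrid(torch.linspace(-1, 1, h, device=fine.device),
                                torch.linspace(-1, 1, w, device=fine.device),
                                indexing='ij')
        base = torch.stack((xs, ys), dim=-1).unsqueeze(0).expand(n, -1, -1, -1)
        # Scale pixel offsets into the [-1, 1] grid coordinate system.
        shift = offset.permute(0, 2, 3, 1) / torch.tensor(
            [w / 2.0, h / 2.0], device=fine.device)
        aligned = F.grid_sample(coarse_up, base + shift,
                                mode='bilinear', align_corners=False)
        return self.fuse(fine + aligned)

# Usage example with dummy multi-scale features.
if __name__ == "__main__":
    fine = torch.randn(1, 64, 56, 56)
    coarse = torch.randn(1, 64, 28, 28)
    print(AlignedFusion(64)(fine, coarse).shape)  # torch.Size([1, 64, 56, 56])
```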
Item Type: | Article |
---|---|
Date Type: | Published Online |
Status: | In Press |
Schools: | Schools > Computer Science & Informatics |
Additional Information: | License information from Publisher: LICENSE 1: Title: This article is under embargo with an end date yet to be finalised. |
Publisher: | Elsevier |
ISSN: | 1047-3203 |
Date of First Compliant Deposit: | 19 March 2025 |
Date of Acceptance: | 5 March 2025 |
Last Modified: | 19 Mar 2025 10:30 |
URI: | https://orca.cardiff.ac.uk/id/eprint/176987 |