Mehri, Faridoun, Fayyaz, Mohsen, Baghshah, Mahdieh Soleymani and Pilehvar, Mohammad Taher 2024. SkipPLUS: Skip the first few layers to better explain vision transformers. Presented at: 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Seattle, WA, USA, 17-18 June 2024. 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). pp. 204-215. 10.1109/cvprw63382.2024.00025
PDF - Accepted Post-Print Version (4MB)
Abstract
Despite their remarkable performance, the explainability of Vision Transformers (ViTs) remains a challenge. While forward attention-based token attribution techniques have become popular in text processing, their suitability for ViTs hasn't been extensively explored. In this paper, we compare these methods against state-of-the-art input attribution methods from the Vision literature, revealing their limitations due to improper aggregation of information across layers. To address this, we introduce two general techniques, PLUS and SkipPLUS, that can be composed with any input attribution method to more effectively aggregate information across layers while handling noisy layers. Through comprehensive and quantitative evaluations of faithfulness and human interpretability on a variety of ViT architectures and datasets, we demonstrate the effectiveness of PLUS and SkipPLUS, establishing a new state-of-the-art in white-box token attribution. We conclude with a comparative analysis highlighting the strengths and weaknesses of the best versions of all the studied methods. The code used in this paper is freely available at https://github.com/NightMachinery/SkipPLUS-CVPR-2024.
Item Type: | Conference or Workshop Item (Paper) |
---|---|
Date Type: | Published Online |
Status: | Published |
Schools: | Computer Science & Informatics |
ISBN: | 979-8-3503-6547-4 |
ISSN: | 2160-7508 |
Date of First Compliant Deposit: | 10 December 2024 |
Date of Acceptance: | 7 April 2024 |
Last Modified: | 10 Jan 2025 02:45 |
URI: | https://orca.cardiff.ac.uk/id/eprint/173032 |