Towards minimal supervision for semantic segmentation of remote sensing imagery

Ma, Wanli 2024. Towards minimal supervision for semantic segmentation of remote sensing imagery. PhD Thesis, Cardiff University.
Item availability restricted.

PDF - Accepted Post-Print Version. Restricted to Repository staff only until 26 August 2026 due to copyright restrictions. (89MB)
PDF (Cardiff University Electronic Publication Form) - Supplemental Material. Restricted to Repository staff only. (301kB)

Abstract

Given the high cost of pixel-wise annotation in remote sensing imagery, this thesis investigates semantic segmentation under limited supervision, seeking to minimise labelled data requirements while preserving strong performance. To achieve this goal, the thesis explores three techniques for remote sensing segmentation tasks: (1) multi-modal image fusion, (2) semi-supervised learning, and (3) active learning. First, the thesis exploits the multi-modality of remote sensing data for minimal supervision in land cover mapping. We propose an attention-based multi-modal image fusion network (AMM-FuseNet), which employs dual encoders to handle multiple data modalities, integrates a channel-attention mechanism, and uses a densely connected atrous spatial pyramid pooling (DenseASPP) module for enhanced contextual representation. The proposed feature extraction module is effective at extracting and representing information from multi-modal data, and it demonstrates greater capability when training data are limited. Second, the thesis investigates semi-supervised learning for minimal supervision. Specifically, we propose two semi-supervised learning architectures, DiverseModel and DiverseHead, which achieve high performance on segmentation tasks; DiverseHead is, in addition, simple and relatively lightweight. We also observe that DiverseModel offers potential benefits for knowledge distillation. Third, the thesis introduces an active learning approach, combined with an optimised semi-supervised learning architecture, that identifies areas where the pseudo-labels are likely to be incorrect and selects them for manual labelling. To further reduce the labelling budget, we propose a self-assigned labelling process that relabels some uncertain areas of the pseudo-labels by comparing their feature representations with those of labelled data, without consuming any budget, while reliable pseudo-labels are still used for training. Together, the three techniques explored in this thesis demonstrate that high-performance semantic segmentation can be achieved with reduced supervision, offering a cost-effective and efficient solution for remote sensing applications. Nonetheless, several areas remain open for improvement, including model compactness, the integration of data augmentation techniques, and the combination with domain adaptation strategies for tasks requiring domain transfer. Future work will focus on these areas to further enhance minimal supervision approaches.
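As a rough illustration of the self-assigned labelling idea mentioned in the abstract, the sketch below relabels uncertain samples by comparing their features with those of labelled data. The abstract does not state how the comparison is made; the use of per-class mean feature vectors (prototypes), cosine similarity, the function names, and the toy two-dimensional features with classes "water" and "forest" are all assumptions made here for illustration, not the method of the thesis.

```python
import math

def mean_vector(vectors):
    """Element-wise mean of equal-length feature vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def class_prototypes(features, labels):
    """Hypothetical helper: one mean feature vector per labelled class."""
    by_class = {}
    for feature, label in zip(features, labels):
        by_class.setdefault(label, []).append(feature)
    return {label: mean_vector(group) for label, group in by_class.items()}

def self_assign(uncertain_features, prototypes):
    """Assign each uncertain feature the class of its most similar prototype.

    No manual annotation is requested, so no labelling budget is spent.
    """
    return [
        max(prototypes, key=lambda label: cosine(feature, prototypes[label]))
        for feature in uncertain_features
    ]

# Toy data (assumed): features of labelled pixels and their classes.
labelled_features = [[1.0, 0.1], [0.9, 0.0], [0.0, 1.0], [0.2, 0.8]]
labelled_classes = ["water", "water", "forest", "forest"]

prototypes = class_prototypes(labelled_features, labelled_classes)

# Features of pixels whose pseudo-labels were judged uncertain.
relabelled = self_assign([[0.8, 0.2], [0.1, 0.9]], prototypes)
print(relabelled)
```

In a real segmentation pipeline the features would come from a network's intermediate representation and the uncertain areas from some confidence criterion; both are outside the scope of this sketch.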

Item Type:	Thesis (PhD)
Date Type:	Completion
Status:	Unpublished
Schools:	Schools > Computer Science & Informatics
Subjects:	Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Funders:	Cardiff University School of Computer Science Studentship
Date of First Compliant Deposit:	27 August 2025
Date of Acceptance:	13 August 2025
Last Modified:	01 Sep 2025 10:53
URI:	https://orca.cardiff.ac.uk/id/eprint/180437
