Randell, Hayley
2022.
Dimension reduction methods for high-dimensional datasets.
MPhil Thesis,
Cardiff University.
Item availability restricted.
PDF - Accepted Post-Print Version (1MB)
PDF (Cardiff University Electronic Publication Form) - Supplemental Material (118kB), restricted to repository staff only
Abstract
In recent years, computing power has increased massively, which has in turn led to an increase in the size of datasets. This steep increase has created a pressing need for more modern ways of analysing such data. Classical methods were intended for a low-dimensional setting, so an increasingly popular approach to analysing large data is to first apply a dimension reduction technique that projects the data into a lower dimension. A 'good' dimension reduction technique accurately estimates the correct dimension reduction subspace without significantly harming the computational efficiency of the calculations. Many dimension reduction methods have already been developed, but few achieve a high level of accuracy without sacrificing computation time. Our aim is to develop a method that rivals both the most accurate existing methods and the most computationally efficient ones. Another common drawback of classic methods is that few are realistic options for data where the dimension exceeds the sample size: many depend on calculating the inverse of the covariance matrix of the predictor variables, which becomes singular as the dimension surpasses the sample size. It has also been shown that many classic estimators of the central dimension reduction subspace do not remain consistent when the dimension is larger than the sample size. This work makes two main contributions. First, we have developed a dimension reduction method using Distance-Weighted Discrimination (DWD), which has increased accuracy compared with classic methods and is computationally faster than more recent methods. Second, we have developed a dimension reduction method, in the form of a feature partitioning algorithm, which can tackle larger datasets without being restricted by the dimension and which further improves computational efficiency compared with classic methods.
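The singularity problem noted in the abstract can be illustrated numerically. The following is a minimal sketch, not code from the thesis: it assumes NumPy, uses simulated Gaussian predictors, and the function name `sample_covariance_rank` is hypothetical. It shows that when the dimension exceeds the sample size, the sample covariance matrix of the predictors cannot have full rank, so its inverse does not exist.

```python
import numpy as np


def sample_covariance_rank(n_samples, n_features, seed=0):
    """Return (dimension, rank) of the sample covariance of simulated predictors.

    Illustrative helper only; the simulated data are an assumption,
    not a dataset from the thesis.
    """
    rng = np.random.default_rng(seed)
    # Rows are observations, columns are predictor variables.
    X = rng.standard_normal((n_samples, n_features))
    # rowvar=False treats each column as a variable, giving a p x p matrix.
    cov = np.cov(X, rowvar=False)
    return cov.shape[0], int(np.linalg.matrix_rank(cov))


# High-dimensional case: more predictors (p = 50) than observations (n = 20).
p, rank = sample_covariance_rank(n_samples=20, n_features=50)
# The rank is at most n_samples - 1 because the data are mean-centred,
# so the 50 x 50 covariance matrix is rank-deficient and cannot be inverted.
print(f"dimension = {p}, rank = {rank}, singular = {rank < p}")
```

Any method that relies on inverting this matrix therefore needs some modification, or a different formulation, before it can be applied when the dimension exceeds the sample size.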
Item Type: | Thesis (MPhil) |
---|---|
Date Type: | Completion |
Status: | Unpublished |
Schools: | Mathematics |
Date of First Compliant Deposit: | 6 October 2022 |
Last Modified: | 06 May 2023 02:38 |
URI: | https://orca.cardiff.ac.uk/id/eprint/152961 |