Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 

Exploring the use of Machine Learning with extragalactic emission-line surveys, in preparation for the Square Kilometre Array

Dawson, James 2021. Exploring the use of Machine Learning with extragalactic emission-line surveys, in preparation for the Square Kilometre Array. PhD Thesis, Cardiff University.
Item availability restricted.

PDF (PhD Thesis) - Accepted Post-Print Version
Available under License Creative Commons Attribution No Derivatives.

PDF (Cardiff University Electronic Publication Form) - Supplemental Material
Restricted to Repository staff only



This thesis investigates the use of machine learning for analysing the kinematics of galaxies in a time-efficient manner. The application of machine learning in astronomy is arguably nascent, and very much so in the case of galaxy kinematics. Being able to extract kinematic information at speed will be important with the advent of next-generation telescopes such as the Square Kilometre Array. Such instruments will collect raw data on scales too large to store. Therefore, the use of on-the-fly modelling techniques, harnessing the power of machine learning, is crucial. I will show that it is possible and beneficial to use machine learning algorithms to tackle scientific questions in extragalactic astronomy in this way. This thesis starts by investigating the use of machine learning algorithms for rapidly discriminating between disturbed and orderly rotating gas discs in galaxies. Specifically, cold dense molecular gas discs are embedded onto a latent manifold using convolutional autoencoders (CAE), which boast powerful automated feature embedding capabilities. Using hydrodynamical simulations to create mock observational data, the CAE is trained on millions of naturally augmented moment one maps before testing on observational HI data from the Local Volume HI Survey (Koribalski et al. 2018), as well as CO observational data from various surveys using ALMA. Using a simple binary classifier on the embeddings, it can be shown that disturbed and orderly rotating discs are separately classified with high accuracy, even in the presence of injected noise. Such models may be useful as fast filtering tools for identifying mergers or relaxed discs for further kinematic modelling. Bearing in mind that transfer learning for next-generation survey datasets carries great risk, a new approach to kinematically characterising gas in galaxies is studied next.
Using self-supervised physics-aware neural networks, the need for a throw-away training set is removed entirely and replaced with a model which can learn physical parameterisations of galaxy rotation curves at rapid speed. With the introduction of Monte Carlo dropout, it is also possible to recover modelling errors for kinematic parameters, which will be useful in gauging the validity of learned parameters. These models are tested on simulated data as well as observational CO data from the WISDOM survey and HI data from THINGS (Walter et al. 2008). Learned rotation curves match well with those derived from more analytically motivated modelling tools (e.g. BBarolo; Di Teodoro & Fraternali 2015), but the parameterisations are computed in a fraction of the time. Finally, I study the use of the aforementioned self-supervised physics-aware neural networks to recover the H-alpha Tully-Fisher relation (TFR) from the largest IFU dataset to date. To do so, moment maps from both the SAMI and MaNGA IFU surveys are used to derive the rotational velocities of low-redshift galaxies. These are then fit against mass to derive both the forward and reverse TFR. The fits are in agreement with those found in the wider literature, except that my fits have shallower gradients because a correction for asymmetric drift is applied in this work but not in the comparison fits from the literature. Here, I identify and quantify trends between position along (and perpendicular to) the TFR and galaxy properties, namely age and mass-to-light ratio. A clear relation between velocity turnover radius, r-turn/r-e, and stellar mass is also discussed. The application to IFU data of models originally designed for use with millimetre and radio interferometric data shows the benefits of using self-supervised physics-aware approaches to circumvent the problems often associated with transfer learning. Such methods will be useful when applied to next-generation IFU survey data releases, with instruments such as HECTOR.
In summary, this thesis explores different machine learning approaches for kinematically characterising galaxies in a time-efficient manner. I conclude with some remaining questions and avenues for future research.
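
The distinction between the forward and reverse Tully-Fisher fits mentioned in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the thesis: the mock data, the slope and scatter values, the function name `fit_forward_and_reverse`, the use of ordinary least squares, and the convention that "forward" means regressing log mass on log velocity. The thesis may use a different fitting method and convention.

```python
import numpy as np


def fit_forward_and_reverse(log_v, log_m):
    """Return ((slope, intercept), (slope, intercept)) for two straight-line fits.

    Both results are expressed in the form log_m = slope * log_v + intercept.
    forward: ordinary least squares with log_m as the dependent variable.
    reverse: ordinary least squares with log_v as the dependent variable,
             with the fitted line then inverted into the same form.
    """
    a_fwd, b_fwd = np.polyfit(log_v, log_m, 1)

    # Reverse fit: log_v = c * log_m + d, then solve for log_m.
    c, d = np.polyfit(log_m, log_v, 1)
    a_rev, b_rev = 1.0 / c, -d / c

    return (a_fwd, b_fwd), (a_rev, b_rev)


# Hypothetical mock sample: a linear relation in log space with Gaussian
# scatter added to log mass only. These numbers are illustrative, not data.
rng = np.random.default_rng(0)
log_v = rng.uniform(1.8, 2.5, size=500)
log_m = 3.5 * log_v + 2.5 + rng.normal(0.0, 0.3, size=500)

forward, reverse = fit_forward_and_reverse(log_v, log_m)
print("forward slope, intercept:", forward)
print("reverse slope, intercept:", reverse)
```

For positively correlated data, the inverted reverse fit is always at least as steep as the forward fit under least squares, so reporting both gives a sense of how sensitive the recovered gradient is to the choice of dependent variable.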

Item Type: Thesis (PhD)
Date Type: Completion
Status: Unpublished
Schools: Physics and Astronomy
Subjects: Q Science > QB Astronomy
Q Science > QC Physics
Uncontrolled Keywords: extragalactic astronomy galaxies machine learning AI kinematics neural networks atomic hydrogen 21cm line emission interferometry SKA square kilometre array
Funders: STFC CDT
Date of First Compliant Deposit: 8 February 2022
Last Modified: 09 Feb 2022 14:52
