Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 

Augmenting official citizen science data collections with social media data related to wildlife observations

Edwards, Thomas J. 2022. Augmenting official citizen science data collections with social media data related to wildlife observations. PhD Thesis, Cardiff University.
Item availability restricted.

PDF (2022edwardsthomasphd.pdf, 11MB) - Accepted Post-Print Version
Restricted to Repository staff only until 24 November 2023 due to copyright restrictions.
Available under License Creative Commons GNU LGPL (Software).

PDF (Cardiff University Electronic Publication Form, 331kB) - Supplemental Material
Restricted to Repository staff only


Studies of wildlife species distribution patterns are increasingly important in the face of rapid ecosystem changes that have implications for disease emergence and spread, food security, climate change, and invasive species biology. Citizen science campaigns can be very effective for observing wildlife behaviour, but they can also be resource-consuming and limited in coverage and, sometimes, in accuracy. Due to their wide usage, social media platforms represent an untapped source of potentially valuable wildlife observational data which is less costly to obtain but could complement citizen science data collections and support real-time species monitoring and analysis. There are, however, concerns about the correctness and completeness of social media data sources. Further, the exploitation of social media data related to wildlife involves challenges such as its heterogeneity, noisiness, and lack of adequate labelled data. Previous research on using social media sites in ecology studies is limited and often involves manual or semi-automated approaches, with few attempts to exploit advanced machine learning methods. In this thesis, we aim to identify social media mining techniques that facilitate the usage of social media datasets as a source of wildlife observational data. First, we study the potential of social media data to supplement citizen science data collections and perform a range of statistical, spatial, and temporal analyses. We also present image- and text-classification-based verification approaches for identifying wildlife observations on social media which are suitable for large and diverse data collections. To address the fact that only a small proportion of social media posts have coordinates, we develop geo-referencing techniques that use state-of-the-art transformer-based neural network models, transfer learning, and regression models.
These methods are extended with hybrid approaches incorporating machine learning and rule-based methods to improve the precision of geo-referencing models given limited amounts of training data. A preliminary study of how social media can be exploited for spatio-temporal analysis is conducted. The thesis shows that the image-sharing platform Flickr and the micro-blogging service Twitter can be valuable sources of wildlife observational data but require verification and preparation techniques to support their use. We show that combining neural network models, transfer learning, and/or rule-based approaches can facilitate the verification and geo-referencing of social media datasets even in the presence of more specialised language and limited amounts of labelled data. We also present the largest collections of geo-referenced wildlife-related Twitter and Flickr datasets as well as a deep learning transformer model trained on wildlife Tweets. These resources can be beneficial for further studies into passive citizen science and social media mining.

Item Type: Thesis (PhD)
Date Type: Completion
Status: Unpublished
Schools: Computer Science & Informatics
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Date of First Compliant Deposit: 24 November 2022
Last Modified: 24 Nov 2022 14:10
