Tong, Tong 2024. Human intention recognition using context relationships in complex scenes. PhD Thesis, Cardiff University.
Item availability restricted.

- PDF (Accepted Post-Print Version): restricted to repository staff only until 14 May 2026 due to copyright restrictions. (4MB)
- PDF (Cardiff University Electronic Publication Form), Supplemental Material: restricted to repository staff only. (207kB)
Abstract
Recognising human intentions is a significant challenge in human-robot interaction research. However, most current studies can only identify human intentions in simple environments, such as determining which tool a person needs next in a handover task. Moreover, many studies rely on features such as gaze or gestures to recognise human intentions; in varied environments such as a living room or a kitchen, relying on these features alone may result in inaccurate and biased recognition. As human intentions are frequently influenced by their environmental context, a comprehensive understanding of complex scenes necessitates consideration of contextual relationships. To address these limitations, this thesis proposes a general framework for human intention recognition, incorporating context representation and contextual reasoning to overcome the challenges of recognising human intentions in complex environments.

Most research primarily targets the identification of a single type of intention, largely because of the restricted availability of datasets that support comprehensive human intention recognition. To address this issue, this thesis introduces a Dynamic Scene Graph (DSG) dataset designed to represent contextual relationships and multiple types of human intention. The dataset is derived from the publicly available video dataset Action Genome: a total of 471 videos were rigorously selected and annotated with 20 distinct human intentions. DSG addresses the current shortage of datasets for human intention recognition and contributes towards more generalised research in human intention prediction.

Next, a novel model called the Spatial Temporal Graph Attention Informer Neural Network (STGAIN) was developed, leveraging spatial relationships between humans and objects as well as their temporal evolution across successive video frames. The model achieved an accuracy of 0.81 in recognising human intentions, significantly outperforming current state-of-the-art models such as the Spatial Temporal Graph Convolutional Network (STGCN) (0.70) and the Spatial Temporal Graph Attention Network (STGAT) (0.75). Additionally, the thesis partitioned the dataset into varying proportions (50%, 75%, and 100%) to explore the importance of temporal changes and contextual relationships in recognising human intentions.

To validate the generalisation capability of the model, the thesis further applied STGAIN to a human activity recognition task using the derived DSG dataset. STGAIN achieved an accuracy of 0.86 on this task, surpassing both STGCN (0.76) and STGAT (0.80). Cross-validation across different tasks demonstrates the advantages of the proposed model within the domain of graph-based data.

This research aims to exploit contextual relationships for human intention recognition across different environments, highlighting the significance of such relationships for achieving more generalised human intention recognition. Understanding human-object contextual relationships can also enhance the capabilities of human-centred robotics, enabling these technologies to proficiently recognise and adapt to human intentions in various situations.
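The abstract describes the DSG dataset as per-frame scene graphs of human-object relations, each video annotated with one of 20 intentions. The following Python sketch illustrates how such a structure might be represented; the node names, relation predicates, and intention label are hypothetical illustrations, not the dataset's actual schema.

```python
# A hypothetical sketch of a dynamic scene graph: per-frame human/object
# nodes and relations, plus a clip-level intention label. Names are
# illustrative only, not the DSG dataset's real annotation format.
from dataclasses import dataclass, field

@dataclass
class Relation:
    subject: str      # e.g. "person"
    predicate: str    # e.g. "looking_at", "holding"
    obj: str          # e.g. "cup"

@dataclass
class FrameGraph:
    frame_index: int
    nodes: list[str]            # human and object nodes in this frame
    relations: list[Relation]   # spatial relations between nodes

@dataclass
class DynamicSceneGraph:
    video_id: str
    frames: list[FrameGraph] = field(default_factory=list)
    intention: str = ""         # one of the annotated intention labels

# Example: two frames of a person reaching for, then holding, a cup.
dsg = DynamicSceneGraph(
    video_id="clip_0001",
    frames=[
        FrameGraph(0, ["person", "cup"], [Relation("person", "looking_at", "cup")]),
        FrameGraph(1, ["person", "cup"], [Relation("person", "holding", "cup")]),
    ],
    intention="drink_water",
)
```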
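STGAIN is described as attending over spatial relationships within each frame and over their temporal evolution across frames. Below is a minimal PyTorch sketch of that two-stage attention idea; it substitutes standard multi-head attention for the Informer component and uses illustrative dimensions throughout, so it should be read as a generic spatial-temporal attention baseline, not the thesis's actual architecture.

```python
# A minimal spatial-then-temporal attention sketch (NOT the STGAIN
# implementation): standard multi-head attention stands in for the
# Informer module, and all sizes are assumptions for illustration.
import torch
import torch.nn as nn

class SpatialTemporalAttention(nn.Module):
    def __init__(self, feat_dim=64, num_heads=4, num_intentions=20):
        super().__init__()
        # Spatial attention: each node attends to the other nodes in its frame.
        self.spatial_attn = nn.MultiheadAttention(feat_dim, num_heads, batch_first=True)
        # Temporal attention: each frame embedding attends to the other frames.
        self.temporal_attn = nn.MultiheadAttention(feat_dim, num_heads, batch_first=True)
        self.classifier = nn.Linear(feat_dim, num_intentions)

    def forward(self, x):
        # x: (frames, nodes, feat_dim) -- node features for one video clip.
        spatial, _ = self.spatial_attn(x, x, x)          # (T, N, D)
        frame_emb = spatial.mean(dim=1)                  # (T, D), pool nodes
        seq = frame_emb.unsqueeze(0)                     # (1, T, D)
        temporal, _ = self.temporal_attn(seq, seq, seq)  # (1, T, D)
        clip_emb = temporal.mean(dim=1)                  # (1, D), pool frames
        return self.classifier(clip_emb)                 # (1, num_intentions)

model = SpatialTemporalAttention()
clip = torch.randn(8, 5, 64)   # 8 frames, 5 human/object nodes, 64-dim features
logits = model(clip)
print(logits.shape)            # torch.Size([1, 20])
```

The ordering (spatial aggregation per frame, then attention over the frame sequence) mirrors the abstract's emphasis on combining within-frame context with temporal evolution across successive frames.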
| Item Type: | Thesis (PhD) |
|---|---|
| Date Type: | Completion |
| Status: | Unpublished |
| Schools: | Schools > Engineering |
| Uncontrolled Keywords: | Artificial Intelligence; Human Intention; Context Representation; Graph Neural Network; Dynamic Graph; Spatial Temporal Graph |
| Date of First Compliant Deposit: | 19 May 2025 |
| Last Modified: | 19 May 2025 15:55 |
| URI: | https://orca.cardiff.ac.uk/id/eprint/178285 |