Llanos-Neuta, Nicolás, Obando-Ceron, Johan, Perafan-Villota, Juan C. and Romero Cano, Victor (ORCID: https://orcid.org/0000-0003-2910-5116). 2025. Learning structured spatiotemporal tasks with xLSTM under uncertainty: A multi-task approach. Presented at: 12th Workshop on Engineering Applications, Cali, Colombia, 29–31 October 2025. Published in: Figueroa-García, Juan Carlos, López-Sotelo, Jesús Alfonso, Moreno-Trujillo, John Freddy and Gaona-García, Elvis Eduardo eds. Applied Computer Sciences in Engineering. Communications in Computer and Information Science. Springer, pp. 234–246. doi:10.1007/978-3-032-08203-9_20
Item availability restricted.
PDF (Accepted Post-Print Version, 883kB): restricted to Repository staff only until 27 November 2025 due to copyright restrictions.
Abstract
The demand for real-time perception in complex robotic systems operating in dynamic environments has motivated the development of architectures capable of solving multiple learning tasks simultaneously. However, current Multi-Task Learning approaches often suffer from poor task balancing, lack of inter-task regularization, and static loss weighting. This work introduces a multi-task framework based on Extended Long Short-Term Memory (xLSTM) to jointly predict the motion of 3D bounding boxes, semantic classes, 3D velocity, and categorical dynamic behavior of objects. The framework adopts an encoder–multi-head architecture with shared temporal representations and task-specific heads. Two auxiliary tasks, velocity regression and dynamic state classification, are derived from physical approximations and incorporated to guide training. A homoscedastic uncertainty-based loss weighting strategy dynamically adjusts task influence during optimization. Quantitative results on the KITTI benchmark show that the proposed framework achieves lower RMSE for motion estimation and higher F1-scores for classification tasks compared to baselines. Auxiliary tasks improve convergence and coherence, while uncertainty weighting enhances training stability. This architecture offers a scalable and interpretable solution for spatiotemporal modeling and has the potential to benefit downstream applications in robotics and intelligent monitoring systems.
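The abstract does not state the exact form of the uncertainty-based weighting. As a minimal sketch, assuming the commonly used homoscedastic formulation in which each task loss L_i is scaled by exp(-s_i) and regularized by s_i (with s_i a learnable log-variance), the combined objective and its effect on the task weights could look like the following. The function names, the four example loss values, and the learning rate are all hypothetical and are not taken from the paper.

```python
import math

def uncertainty_weighted_loss(task_losses, log_vars):
    """Combine per-task losses as sum_i exp(-s_i) * L_i + s_i (assumed form)."""
    return sum(math.exp(-s) * loss + s for loss, s in zip(task_losses, log_vars))

def log_var_gradients(task_losses, log_vars):
    """Gradient of the combined loss with respect to each log-variance s_i."""
    return [1.0 - math.exp(-s) * loss for loss, s in zip(task_losses, log_vars)]

# Hypothetical per-task losses, e.g. box motion, semantic class,
# velocity regression, dynamic state classification.
task_losses = [4.0, 0.5, 2.0, 1.0]
log_vars = [0.0] * len(task_losses)

# Gradient descent on the log-variances only, holding the task losses fixed.
# In real training the losses change every step and s_i is learned jointly
# with the network parameters.
for _ in range(2000):
    grads = log_var_gradients(task_losses, log_vars)
    log_vars = [s - 0.05 * g for s, g in zip(log_vars, grads)]

# Effective weight applied to each task loss.
weights = [math.exp(-s) for s in log_vars]
```

Under this assumed form, each weight settles near the inverse of its task loss, so tasks with larger losses are down-weighted rather than dominating the gradient, which is one way a scheme of this kind can balance tasks without hand-tuned static weights.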
| Item Type: | Conference or Workshop Item (Paper) |
|---|---|
| Date Type: | Publication |
| Status: | Published |
| Schools: | Schools > Computer Science & Informatics |
| Publisher: | Springer |
| ISBN: | 9783032082022 |
| Date of First Compliant Deposit: | 25 October 2025 |
| Date of Acceptance: | 1 August 2025 |
| Last Modified: | 27 Oct 2025 12:46 |
| URI: | https://orca.cardiff.ac.uk/id/eprint/181891 |