Learning structured spatiotemporal tasks with xLSTM under uncertainty: A multi-task approach

Llanos-Neuta, Nicolás, Obando-Ceron, Johan, Perafan-Villota, Juan C. and Romero Cano, Victor

2025. Learning structured spatiotemporal tasks with xLSTM under uncertainty: A multi-task approach. Presented at: 12th Workshop on Engineering Applications, Cali, Colombia, 29–31 October 2025. Published in: Figueroa-García, Juan Carlos, López-Sotelo, Jesús Alfonso, Moreno-Trujillo, John Freddy and Gaona-García, Elvis Eduardo, eds. Applied Computer Sciences in Engineering. Communications in Computer and Information Science. Springer, pp. 234–246. doi:10.1007/978-3-032-08203-9_20

PDF - Accepted Post-Print Version

Official URL: http://dx.doi.org/10.1007/978-3-032-08203-9_20

Abstract

The demand for real-time perception in complex robotic systems operating in dynamic environments has motivated the development of architectures capable of solving multiple learning tasks simultaneously. However, current Multi-Task Learning approaches often suffer from poor task balancing, lack of inter-task regularization, and static loss weighting. This work introduces a multi-task framework based on Extended Long Short-Term Memory (xLSTM) to jointly predict the motion of 3D bounding boxes, semantic classes, 3D velocity, and categorical dynamic behavior of objects. The framework adopts an encoder–multi-head architecture with shared temporal representations and task-specific heads. Two auxiliary tasks, velocity regression and dynamic state classification, are derived from physical approximations and incorporated to guide training. A homoscedastic uncertainty-based loss weighting strategy dynamically adjusts task influence during optimization. Quantitative results on the KITTI benchmark show that the proposed framework achieves lower RMSE for motion estimation and higher F1-scores for classification tasks compared to baselines. Auxiliary tasks improve convergence and coherence, while uncertainty weighting enhances training stability. This architecture offers a scalable and interpretable solution for spatiotemporal modeling and has the potential to benefit downstream applications in robotics and intelligent monitoring systems.
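The abstract does not give the exact form of the uncertainty-based loss weighting, so the following is only a minimal sketch of one commonly used formulation of homoscedastic uncertainty weighting, in which each task i has a learnable log-variance s_i and the combined loss is the sum of exp(-s_i) * L_i + s_i. The function names, the task names, and the numeric loss values are illustrative assumptions, not taken from the paper, and the per-task losses are held fixed here purely to show how the weights adapt.

```python
import math


def uncertainty_weighted_loss(task_losses, log_vars):
    """Combine per-task losses using log-variances s_i = log(sigma_i^2).

    Assumed form (not confirmed by the source): sum_i exp(-s_i) * L_i + s_i.
    """
    if len(task_losses) != len(log_vars):
        raise ValueError("task_losses and log_vars must have the same length")
    return sum(math.exp(-s) * loss + s for loss, s in zip(task_losses, log_vars))


def log_var_gradients(task_losses, log_vars):
    """Gradient of the combined loss with respect to each s_i.

    d/ds [exp(-s) * L + s] = 1 - exp(-s) * L
    """
    return [1.0 - math.exp(-s) * loss for loss, s in zip(task_losses, log_vars)]


# Hypothetical, fixed per-task losses for the four tasks named in the abstract.
# In real training these would change at every step along with the network.
losses = {
    "box_motion": 0.8,
    "semantic_class": 0.3,
    "velocity": 1.5,
    "dynamic_state": 0.6,
}
names = list(losses)
log_vars = {name: 0.0 for name in names}  # start with equal weights of 1.0

learning_rate = 0.1
for _ in range(200):
    grads = log_var_gradients(
        [losses[n] for n in names], [log_vars[n] for n in names]
    )
    for name, grad in zip(names, grads):
        log_vars[name] -= learning_rate * grad  # plain gradient descent on s_i

# Effective weight applied to each task loss after adaptation.
weights = {name: math.exp(-log_vars[name]) for name in names}
for name in names:
    print(f"{name}: loss={losses[name]:.2f} weight={weights[name]:.3f}")
```

Under this assumed form, a task whose loss stays large receives a smaller weight, while the additive s_i term keeps the weights from collapsing to zero. In a full model the log-variances would be trained jointly with the shared encoder and the task-specific heads rather than against fixed losses.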

Item Type:	Conference or Workshop Item (Paper)
Date Type:	Publication
Status:	Published
Schools:	Schools > Computer Science & Informatics
Publisher:	Springer
ISBN:	9783032082022
Date of First Compliant Deposit:	25 October 2025
Date of Acceptance:	1 August 2025
Last Modified:	27 Nov 2025 02:30
URI:	https://orca.cardiff.ac.uk/id/eprint/181891
