Meng, Yiran, Ye, Junhong, Zhou, Wei, Yue, Guanghui, Mao, Xudong, Wang, Ruomei and Zhao, Baoquan 2025. VideoForest: Person-anchored hierarchical reasoning for cross-video question answering. Presented at: MM '25: The 33rd ACM International Conference on Multimedia, Dublin, Ireland, 27-31 October 2025. MM '25: Proceedings of the 33rd ACM International Conference on Multimedia. ACM, pp. 836-845. 10.1145/3746027.3754573
Abstract
Cross-video question answering presents significant challenges beyond traditional single-video understanding, particularly in establishing meaningful connections across video streams and managing the complexity of multi-source information retrieval. We introduce VideoForest, a novel framework that addresses these challenges through person-anchored hierarchical reasoning, enabling effective cross-video understanding without requiring end-to-end training. VideoForest integrates three key innovations: 1) a human-anchored feature extraction mechanism that employs ReID and tracking algorithms to establish robust spatiotemporal relationships across multiple video sources; 2) a multi-granularity spanning tree structure that hierarchically organizes visual content around person-level trajectories; and 3) a multi-agent reasoning framework that efficiently traverses this hierarchical structure to answer complex queries. To evaluate our method, we develop CrossVideoQA, a comprehensive benchmark specifically designed for person-centric cross-video analysis. Experimental results demonstrate VideoForest's superior performance in cross-video reasoning tasks, achieving 71.93% accuracy in person recognition, 83.75% in behavior analysis, and 51.67% in summarization and reasoning.
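As a rough illustration of the person-anchored hierarchy the abstract describes, the following minimal Python sketch builds one two-level tree per video and answers a "where does this person appear?" query across videos. It is not the authors' implementation: the names `Node`, `build_tree`, and `find_person`, the `(person_id, start, end)` track format, and the fixed segment length are all assumptions made for illustration, and a plain lookup stands in for the multi-agent reasoning step.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One node of a per-video tree (hypothetical structure)."""
    video: str
    start: float
    end: float
    persons: set = field(default_factory=set)
    children: list = field(default_factory=list)


def build_tree(video, tracks, segment_len):
    """Build a two-level tree: a root for the video, one child per time segment.

    `tracks` is a list of (person_id, start, end) tuples, assumed to be the
    output of some upstream ReID + tracking stage that is not modelled here.
    """
    end_time = max(e for _, _, e in tracks)
    root = Node(video, 0.0, end_time)
    t = 0.0
    while t < end_time:
        seg_end = min(t + segment_len, end_time)
        seg = Node(video, t, seg_end)
        for pid, s, e in tracks:
            if s < seg_end and e > t:  # track overlaps this segment
                seg.persons.add(pid)
        root.persons |= seg.persons  # the root summarises its children
        root.children.append(seg)
        t = seg_end
    return root


def find_person(forest, pid):
    """Return (video, start, end) for every segment in which `pid` appears."""
    hits = []
    for root in forest:
        if pid not in root.persons:
            continue  # prune the whole video at the coarse level
        for seg in root.children:
            if pid in seg.persons:
                hits.append((root.video, seg.start, seg.end))
    return hits


forest = [
    build_tree("cam_a", [("p1", 0, 12), ("p2", 5, 8)], segment_len=10),
    build_tree("cam_b", [("p2", 3, 4)], segment_len=10),
]
print(find_person(forest, "p2"))
```

The coarse-to-fine check at the root is the point of the hierarchy in this sketch: a video in which the person never appears is skipped without inspecting any of its segments.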
| Item Type: | Conference or Workshop Item (Paper) |
|---|---|
| Date Type: | Published Online |
| Status: | Published |
| Schools: | Schools > Computer Science & Informatics |
| Publisher: | ACM |
| ISBN: | 9798400720352 |
| Last Modified: | 18 Nov 2025 10:15 |
| URI: | https://orca.cardiff.ac.uk/id/eprint/182479 |