Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 

TPD-NeRF: Temporally Progressive Reconstruction of Dynamic Neural Radiance Fields from Monocular Video

Yuan, Yu-Jie, Kobbelt, Leif, Yang, Jie, Lai, Yu-Kun ORCID: https://orcid.org/0000-0002-2094-5680 and Gao, Lin 2025. TPD-NeRF: Temporally Progressive Reconstruction of Dynamic Neural Radiance Fields from Monocular Video. Presented at: 13th International Conference on Computational Visual Media (CVM 2025), Hong Kong SAR, China, 19–21 April 2025. Published in: Didyk, Piotr and Hou, Junhui eds. Computational Visual Media: 13th International Conference, CVM 2025, Proceedings, Part II. Lecture Notes in Computer Science, vol. 15664. Singapore: Springer, pp. 90-112. 10.1007/978-981-96-5812-1_6
Item availability restricted.

PDF (dynamicnerf_cvm.pdf, 22MB) - Accepted Post-Print Version
Restricted to repository staff only until 4 August 2025 due to copyright restrictions.

Abstract

Due to their strong performance in representing 3D scene geometry and appearance, Neural Radiance Fields (NeRF) have recently attracted considerable attention in applications such as novel view synthesis. Several extensions of NeRFs to dynamic scenes have been proposed, but they either require synchronized multi-view video input or fail for faster motions or longer sequences. In this paper, we propose a novel dynamic NeRF framework, called TPD-NeRF, which takes a single monocular video as input and enables high-quality synthesis of novel views for any time point, even in highly dynamic scenes. The idea is to first establish local frame-to-frame consistency by training a sub-network that predicts short-term offsets and hence generates frame-to-frame correspondences. Applying this network multiple times allows us to propagate correspondences from any frame of the input sequence to one global reference frame. Using the resulting global correspondences as supervision, we can train another sub-network to establish global consistency for the TPD-NeRF. This network effectively maps each dynamic state back to a canonical space, i.e., it captures the global motion in the scene. To further improve the visual quality, we introduce the space-time field network as the canonical NeRF to capture dynamic information missed by the two deformation networks. We extensively evaluate our method and compare it with previous work, demonstrating that it outperforms existing dynamic NeRF methods.
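The abstract outlines the core mechanism: a sub-network predicts short-term, frame-to-frame offsets, and composing its predictions step by step propagates correspondences from any frame back to a global reference frame. Below is a minimal PyTorch sketch of that composition idea; the class names, network architecture, and frame-stepping scheme are illustrative assumptions for exposition, not the authors' actual implementation.

# Minimal sketch (PyTorch) of progressive correspondence propagation:
# a small MLP predicts the displacement of a 3D point from frame t to
# frame t-1, and composing it repeatedly maps points back to frame 0.
# All names and sizes here are assumptions, not the paper's code.
import torch
import torch.nn as nn

class OffsetMLP(nn.Module):
    """Predicts the short-term offset of a 3D point from frame t to t-1."""
    def __init__(self, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(4, hidden),    # input: (x, y, z, t)
            nn.ReLU(),
            nn.Linear(hidden, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 3),    # output: offset (dx, dy, dz)
        )

    def forward(self, xyz: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([xyz, t], dim=-1))

def propagate_to_reference(offset_net: OffsetMLP,
                           xyz: torch.Tensor,
                           frame: int) -> torch.Tensor:
    """Compose short-term offsets to map points in `frame` back to frame 0."""
    for t in range(frame, 0, -1):
        t_in = torch.full((*xyz.shape[:-1], 1), float(t))
        xyz = xyz + offset_net(xyz, t_in)    # step one frame toward frame 0
    return xyz

# Usage: propagate points sampled in frame 10 back to the reference frame.
points_t = torch.rand(1024, 3)
points_ref = propagate_to_reference(OffsetMLP(), points_t, frame=10)

In the framework described above, the correspondences produced this way would serve as supervision for a second, global deformation network that maps each frame directly into the canonical space, avoiding the error accumulation of long offset chains at inference time.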

Item Type: Conference or Workshop Item (Paper)
Date Type: Publication
Status: Published
Schools: Computer Science & Informatics
ISBN: 978-981-96-5811-4
ISSN: 0302-9743
Date of First Compliant Deposit: 24 June 2025
Date of Acceptance: 18 December 2024
Last Modified: 04 Jul 2025 15:30
URI: https://orca.cardiff.ac.uk/id/eprint/179291
