Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 

Neural panoramic representation for spatially and temporally consistent 360° video editing

Kou, Simin, Zhang, Fang-Lue, Lai, Yukun ORCID: https://orcid.org/0000-0002-2094-5680 and Dodgson, Neil A. 2024. Neural panoramic representation for spatially and temporally consistent 360° video editing. Presented at: IEEE International Symposium on Mixed and Augmented Reality (ISMAR), Seattle, USA, 21-25 October 2024.

PDF - Accepted Post-Print Version

Abstract

Content-based 360° video editing allows users to manipulate panoramic content for interaction in a dynamic visual world. However, current related methods (2D neural representations and optical flow) are limited in producing high-quality panoramic content from 360° videos because they cannot model the inherent spatiotemporal relationships among pixels in the true panoramic space. To address this issue, we propose a Neural Panoramic Representation (NPR) method that models the global inter-pixel relationships, facilitating immersive video editing. Specifically, our method uses MLP-based networks to learn spherical implicit content layers, by encoding the spherical spatiotemporal positions and appearance details within the panoramic video, together with a bi-directional mapping between the original video frames and the learned content layers, so as to capture the interpretable and global omnidirectional visual characteristics of individual dynamic scenes. Additionally, we introduce two loss functions (spherical neighborhood consistency and unit spherical regularization) to ensure the creation of appropriate implicit spherical content layers. We further provide an interactive, layer-based neural panoramic editing approach built on the proposed NPR for use in a head-mounted display device. We evaluate this framework on diverse real-world 360° videos, showing superior performance on both reconstruction and consistent editing compared to existing state-of-the-art (SOTA) neural representation techniques.
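
As a rough illustration of the spherical positions and the unit spherical regularization mentioned in the abstract, the sketch below maps equirectangular pixel coordinates to points on the unit sphere and penalizes points that drift off it. The abstract does not give the actual formulas, so the coordinate convention, the function names (equirect_to_unit_sphere, unit_sphere_penalty) and the squared-norm form of the penalty are all assumptions made for this example, not the authors' definitions.

```python
import math


def equirect_to_unit_sphere(u, v, width, height):
    """Map pixel (u, v) of an equirectangular frame to a unit-sphere point.

    Assumed convention (not taken from the paper): longitude spans
    [-pi, pi) across the width, latitude runs from +pi/2 at the top row
    to -pi/2 at the bottom row, and pixels are sampled at their centres.
    """
    lon = ((u + 0.5) / width) * 2.0 * math.pi - math.pi
    lat = math.pi / 2.0 - ((v + 0.5) / height) * math.pi
    x = math.cos(lat) * math.cos(lon)
    y = math.sin(lat)
    z = math.cos(lat) * math.sin(lon)
    return (x, y, z)


def unit_sphere_penalty(points):
    """Mean squared deviation of each point's norm from 1.

    Hypothetical stand-in for a unit spherical regularizer: it is zero
    when every predicted 3D position lies exactly on the unit sphere.
    """
    total = 0.0
    for x, y, z in points:
        norm = math.sqrt(x * x + y * y + z * z)
        total += (norm - 1.0) ** 2
    return total / len(points)


# Positions derived from pixel coordinates lie on the sphere by construction.
on_sphere = [equirect_to_unit_sphere(u, v, 8, 4) for u in range(8) for v in range(4)]
penalty_on = unit_sphere_penalty(on_sphere)

# A point at radius 2 is off the sphere and is penalized.
penalty_off = unit_sphere_penalty([(2.0, 0.0, 0.0)])
```

In a learned setting, a penalty of this kind would be applied to the positions predicted by a mapping network rather than to analytically derived points; here it only shows what "staying on the unit sphere" means numerically.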

Item Type: Conference or Workshop Item (Paper)
Status: In Press
Schools: Computer Science & Informatics
Funders: The Royal Society
Date of First Compliant Deposit: 7 September 2024
Date of Acceptance: 31 July 2024
Last Modified: 03 Nov 2024 02:30
URI: https://orca.cardiff.ac.uk/id/eprint/171914
