Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 

StylizedNeRF: consistent 3D scene stylization as stylized NeRF via 2D-3D mutual learning

Huang, Yi-Hua, He, Yu, Yuan, Yu-Jie, Lai, Yukun ORCID: https://orcid.org/0000-0002-2094-5680 and Gao, Lin 2022. StylizedNeRF: consistent 3D scene stylization as stylized NeRF via 2D-3D mutual learning. Presented at: 2022 IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2022), New Orleans, LA, United States of America, 19-24 June 2022. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. IEEE, pp. 18321-18331. 10.1109/CVPR52688.2022.01780

PDF - Accepted Post-Print Version

Abstract

3D scene stylization aims to generate stylized images of a scene from arbitrary novel views, following a given set of style examples, while ensuring consistency across views. Directly applying image or video stylization methods to 3D scenes cannot achieve such consistency. Thanks to recently proposed neural radiance fields (NeRF), we are able to represent a 3D scene in a consistent way, so consistent 3D scene stylization can be effectively achieved by stylizing the corresponding NeRF. However, there is a significant domain gap between style examples, which are 2D images, and NeRF, which is an implicit volumetric representation. To address this problem, we propose a novel mutual learning framework for 3D scene stylization that combines a 2D image stylization network with NeRF, fusing the stylization ability of the 2D network with the 3D consistency of NeRF. We first pre-train a standard NeRF of the 3D scene to be stylized and replace its color prediction module with a style network to obtain a stylized NeRF. We then distill the prior knowledge of spatial consistency from NeRF into the 2D stylization network through an introduced consistency loss. We also introduce a mimic loss to supervise the mutual learning of the NeRF style module and to fine-tune the 2D stylization decoder. To further help our model handle ambiguities in 2D stylization results, we introduce learnable latent codes that obey probability distributions conditioned on the style. They are attached to training samples as conditional inputs to better learn the style module in our novel stylized NeRF. Experimental results demonstrate that our method is superior to existing approaches in both visual quality and long-range consistency.

Item Type: Conference or Workshop Item (Paper)
Date Type: Published Online
Status: Published
Schools: Computer Science & Informatics
Publisher: IEEE
ISBN: 978-1-6654-6947-0
ISSN: 1063-6919
Funders: The Royal Society
Date of First Compliant Deposit: 2 April 2022
Date of Acceptance: 2 March 2022
Last Modified: 09 Jul 2025 09:44
URI: https://orca.cardiff.ac.uk/id/eprint/149029
