Yang, Yi, Yue, Guanghui, Zhou, Wei, Mao, Xudong, Wang, Ruomei and Zhao, Baoquan 2025. Expressive human volumetric video generation with rich text. IEEE Transactions on Circuits and Systems for Video Technology 10.1109/tcsvt.2025.3628996
Abstract
Plain text has become the dominant interactive interface for text-driven human volumetric video generation. However, its limited customization options hinder users from expressing motion effects with accuracy. For example, plain text struggles to specify continuous variables such as motion amplitude, speed, and joint trajectories with precision, and it fails to convey stylized motion characteristics. Additionally, crafting detailed textual prompts for complex motion sequences is cumbersome, while excessively long prompts strain text encoders. To address these limitations, we propose a rich text-based framework that supports font styles, sizes, and trajectory sketching. By extracting motion-related attributes from rich text, our method enables fine-grained control over motion styles, precise speed regulation, and accurate joint trajectory manipulation. These capabilities are realized through gradient-guided noise editing and ControlNet-based motion optimization, which operate within the latent motion diffusion process. Specifically, we design a unified gradient-guided adaptation mechanism to ensure that the generated motion video adheres strictly to the specified constraints. Furthermore, we introduce realism-oriented optimization for stylistic and joint-level control, refining motion synthesis at a granular level to produce smoother, more natural movements. We present multiple comparative evaluations showcasing volumetric video generation from both rich text and plain text. Through quantitative analysis, we demonstrate that our method surpasses strong plain-text baselines, producing expressive, customizable human volumetric motion videos.
| Item Type: | Article |
|---|---|
| Date Type: | Published Online |
| Status: | In Press |
| Schools: | Schools > Computer Science & Informatics |
| Additional Information: | License information from Publisher: LICENSE 1: URL: https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html, Start Date: 2025-01-01 |
| Publisher: | Institute of Electrical and Electronics Engineers |
| ISSN: | 1051-8215 |
| Last Modified: | 20 Nov 2025 15:00 |
| URI: | https://orca.cardiff.ac.uk/id/eprint/182547 |