Zhou, Feng; Li, Chi; Dai, Ju; Zhu, Mengxiao; Zhang, Yongmei; Lai, Yukun
PDF - Accepted Post-Print Version (2MB)
Abstract
Creating humanoid 3D models has received increasing attention recently due to its fundamental support for many high-level 3D applications. Although automatic 3D pose and shape reconstruction methods have achieved promising results, there are still failure cases due to self-occlusions, viewpoint changes, and the complexity of human pose articulations. In this paper, we propose a novel way to leverage Large Language Models (LLMs) to interactively reconstruct human pose and shape based on a Skinned Multi-Person Linear (SMPL) model. We construct a mapping table to fine-tune an LLM, enabling it to better understand user inputs and output the positional information of the joints. Additionally, a simple neural network is adopted to regress the shape parameters of the SMPL model. We demonstrate a gallery of results covering numerous poses and shapes, and validate our method via numerical evaluations, user studies, and comparisons to manually posed characters and previous work.
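The abstract's shape-regression component can be illustrated with a minimal sketch. The layer sizes, ReLU activation, input encoding (24 SMPL joints flattened to a 72-dim vector), and the function name `regress_betas` are all illustrative assumptions, not the paper's actual architecture; SMPL itself conventionally uses 10 shape parameters (betas).

```python
import numpy as np

# Hypothetical two-layer MLP that regresses the 10 SMPL shape
# parameters (betas) from joint-position cues. Architecture details
# (hidden size 64, ReLU) are assumptions for illustration only.
rng = np.random.default_rng(0)

N_JOINTS, N_BETAS, HIDDEN = 24, 10, 64

W1 = rng.normal(0.0, 0.01, (N_JOINTS * 3, HIDDEN))
b1 = np.zeros(HIDDEN)
W2 = rng.normal(0.0, 0.01, (HIDDEN, N_BETAS))
b2 = np.zeros(N_BETAS)

def regress_betas(joints: np.ndarray) -> np.ndarray:
    """Map (24, 3) joint positions to a 10-dim SMPL shape vector."""
    x = joints.reshape(-1)            # flatten to (72,)
    h = np.maximum(W1.T @ x + b1, 0)  # ReLU hidden layer
    return W2.T @ h + b2              # (10,) shape parameters

joints = rng.normal(size=(N_JOINTS, 3))  # stand-in joint positions
betas = regress_betas(joints)
print(betas.shape)
```

In a trained system the weights would be learned from paired joint/shape data; here they are random, so only the shapes of the tensors are meaningful.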
| Item Type | Conference or Workshop Item (Paper) |
|---|---|
| Date Type | Published Online |
| Status | Published |
| Schools | Schools > Computer Science & Informatics |
| Publisher | IEEE |
| ISBN | 979-8-3503-6875-8 |
| ISSN | 1520-6149 |
| Date of First Compliant Deposit | 8 February 2025 |
| Date of Acceptance | 18 December 2024 |
| Last Modified | 26 Mar 2025 15:15 |
| URI | https://orca.cardiff.ac.uk/id/eprint/176050 |