Cosker, Darren P., Marshall, Andrew David ORCID: https://orcid.org/0000-0003-2789-1395, Rosin, Paul L. ORCID: https://orcid.org/0000-0002-4965-3884 and Hicks, Yulia Alexandrovna ORCID: https://orcid.org/0000-0002-7179-4587 2003. Video realistic talking heads using hierarchical non-linear speech-appearance models. Presented at: Mirage 2003, Rocquencourt, France, 10-11 March 2003.
Abstract
In this paper we present an audio-driven system capable of video-realistic synthesis of a speaker uttering novel phrases. The audio input signal requires no phonetic labelling and is speaker independent. The system requires only a small video training set and produces fully co-articulated, realistic facial synthesis. Natural mouth and face dynamics are learned in training, allowing new facial poses, unseen in the training video, to be rendered. To improve specificity and synthesis quality, the appearances of a speaker's mouth and face are modelled separately and combined to produce the final video. To achieve this we have developed a novel approach which utilizes a hierarchical and non-linear PCA model that couples speech and appearance. The model is highly compact, making it suitable for a wide range of real-time applications in multimedia and telecommunications using standard hardware.
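No implementation accompanies this abstract, so the following is a minimal sketch of the general idea of coupling speech and appearance features in a single PCA basis and driving appearance from audio alone. It is not the authors' method: the paper's model is hierarchical and non-linear, whereas this uses one linear joint PCA, and every function name, feature choice, and dimension below is an assumption made purely for illustration.

```python
import numpy as np

# Illustrative sketch only: a single *linear* joint PCA over concatenated
# [speech | appearance] frame vectors. The paper's actual model is
# hierarchical and non-linear; all names and dimensions here are assumed.

def fit_joint_pca(speech_feats, appearance_params, n_components=10):
    """Fit a joint PCA over concatenated [speech | appearance] vectors.

    speech_feats:      (n_frames, d_speech) per-frame audio features
    appearance_params: (n_frames, d_app)    per-frame appearance parameters
    """
    joint = np.hstack([speech_feats, appearance_params])
    mean = joint.mean(axis=0)
    centred = joint - mean
    # PCA via SVD of the centred data matrix; rows of vt are components.
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    basis = vt[:n_components]  # (n_components, d_speech + d_app)
    return mean, basis

def synthesise_appearance(new_speech, mean, basis, d_speech):
    """Estimate appearance parameters for unseen speech frames: project onto
    the joint basis using only the speech block (least squares), then
    reconstruct the appearance block from the same coefficients."""
    speech_basis = basis[:, :d_speech]   # speech part of each component
    app_basis = basis[:, d_speech:]      # appearance part of each component
    centred_speech = new_speech - mean[:d_speech]
    coeffs, *_ = np.linalg.lstsq(speech_basis.T, centred_speech.T, rcond=None)
    return (coeffs.T @ app_basis) + mean[d_speech:]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    speech = rng.normal(size=(200, 12))      # hypothetical audio features
    appearance = rng.normal(size=(200, 30))  # hypothetical appearance params
    mean, basis = fit_joint_pca(speech, appearance, n_components=8)
    new_app = synthesise_appearance(speech[:5], mean, basis, d_speech=12)
    print(new_app.shape)  # (5, 30): one appearance vector per input frame
```

In such a coupled model, driving the shared coefficients from the speech block alone is what lets novel audio generate plausible appearance parameters; separate mouth and face models, as described in the abstract, would each be combined with speech in their own model before the final composite is rendered.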
| Item Type: | Conference or Workshop Item (Paper) |
|---|---|
| Date Type: | Publication |
| Status: | Published |
| Schools: | Computer Science & Informatics; Engineering |
| Subjects: | Q Science > QA Mathematics > QA75 Electronic computers. Computer science; Q Science > QA Mathematics > QA76 Computer software |
| Uncontrolled Keywords: | Facial animation; facial modelling; speech animation; face; models |
| Additional Information: | INRIA - Institut national de recherche en informatique et en automatique (France) |
| Last Modified: | 17 Oct 2022 09:41 |
| URI: | https://orca.cardiff.ac.uk/id/eprint/5164 |