Dawood, Samia, Hicks, Yulia ORCID: https://orcid.org/0000-0002-7179-4587 and Marshall, David 2016. Speech-driven facial animation using manifold relevance determination. Hua, Gang and Jégou, Hervé, eds. Computer Vision – ECCV 2016 Workshops, Vol. 9914. Lecture Notes in Computer Science, Cham: Springer, pp. 869-882. (10.1007/978-3-319-48881-3_57)
Abstract
In this paper, a new approach to visual speech synthesis using a joint probabilistic model is introduced, namely the Gaussian process latent variable model with manifold relevance determination, which explicitly models coarticulation. A talking-head dataset (the LIPS dataset) is processed by extracting visual and audio features from its sequences. The model can capture the structure of data with extremely high dimensionality. Distinguishable visual features can be inferred directly from the trained model by sampling from the discovered latent points. Inferred visual features are evaluated statistically against ground-truth data and compared with the current state-of-the-art visual speech synthesis approach. The quantitative results demonstrate that the proposed approach outperforms the state-of-the-art technique.
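The inference direction the abstract describes (audio observation → shared latent point → visual features) can be illustrated with a toy GP-regression sketch in plain NumPy. This is a simplified analogue under invented synthetic data, not the paper's manifold relevance determination model: the feature dimensions, kernel hyperparameters, and the two-stage mean-prediction pipeline below are all assumptions for illustration only.

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel between the rows of A and the rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return variance * np.exp(-0.5 * d2 / lengthscale**2)

def gp_posterior_mean(X_train, Y_train, X_test, noise=1e-4):
    """Posterior mean of a zero-mean GP regression with an RBF kernel."""
    K = rbf_kernel(X_train, X_train) + noise * np.eye(len(X_train))
    K_star = rbf_kernel(X_test, X_train)
    return K_star @ np.linalg.solve(K, Y_train)

rng = np.random.default_rng(0)
# Toy stand-ins for the two observation spaces mentioned in the abstract:
# "audio" and "visual" features, both generated from one shared
# low-dimensional latent trajectory Z (all values here are synthetic).
Z = np.linspace(0, 1, 20)[:, None]                       # shared latent points
audio = np.hstack([np.sin(4 * Z), np.cos(4 * Z)]) \
        + 0.01 * rng.standard_normal((20, 2))
visual = np.hstack([Z**2, 1 - Z]) \
         + 0.01 * rng.standard_normal((20, 2))

# Step 1: map a novel audio observation to the shared latent space.
Z_hat = gp_posterior_mean(audio, Z, audio[5:6])
# Step 2: read out the corresponding visual features from the latent point.
visual_hat = gp_posterior_mean(Z, visual, Z_hat)
print(visual_hat, visual[5])
```

Because the query audio frame coincides with a training frame, the two-stage prediction should land close to the corresponding training visual frame; in the real model the latent space and its per-view relevance weights are learned jointly rather than fixed as here.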
| Item Type | Book Section |
| --- | --- |
| Date Type | Publication |
| Status | Published |
| Schools | Engineering |
| Publisher | Springer |
| ISBN | 978-3-319-48881-3 |
| ISSN | 1611-3349 |
| Last Modified | 25 Oct 2022 13:40 |
| URI | https://orca.cardiff.ac.uk/id/eprint/120474 |