Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 

Speech-driven facial animation using manifold relevance determination

Dawood, Samia, Hicks, Yulia and Marshall, David 2016. Speech-driven facial animation using manifold relevance determination. In: Hua, Gang and Jégou, Hervé, eds. Computer Vision – ECCV 2016 Workshops. Lecture Notes in Computer Science, Vol. 9914. Cham: Springer, pp. 869-882. (10.1007/978-3-319-48881-3_57)

Full text not available from this repository.


In this paper, a new approach to visual speech synthesis using a joint probabilistic model is introduced, namely the Gaussian process latent variable model with manifold relevance determination (MRD), which explicitly models coarticulation. A talking-head dataset (the LIPS dataset) is processed by extracting visual and audio features from its sequences. The model can capture the structure of data with extremely high dimensionality, and distinguishable visual features can be inferred directly from the trained model by sampling from the discovered latent points. The inferred visual features are evaluated statistically against ground-truth data and compared with the current state-of-the-art visual speech synthesis approach. The quantitative results demonstrate that the proposed approach outperforms the state-of-the-art technique.
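The shared-latent-space idea behind the abstract can be illustrated with a toy sketch. This is not the authors' method: it uses random synthetic features (not the LIPS data) and a linear PCA/least-squares analogue in place of the GP-LVM with MRD, purely to show how two modalities generated from a common latent trajectory allow visual features to be inferred from audio via the joint latent space.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for audio and visual feature sequences (hypothetical data,
# not the LIPS dataset): both views are generated from a shared
# low-dimensional latent trajectory, mimicking the shared-latent assumption.
T, q = 200, 2                          # frames, latent dimensionality
Z = rng.normal(size=(T, q))            # shared latent points
Wa = rng.normal(size=(q, 12))          # audio loading (MFCC-like dim)
Wv = rng.normal(size=(q, 20))          # visual loading (AAM-like dim)
A = Z @ Wa + 0.05 * rng.normal(size=(T, 12))
V = Z @ Wv + 0.05 * rng.normal(size=(T, 20))

# Joint embedding: PCA on the concatenated, standardized views recovers a
# latent space shared by both modalities (a linear analogue of the GP-LVM).
Y = np.hstack([(A - A.mean(0)) / A.std(0), (V - V.mean(0)) / V.std(0)])
U, s, Vt = np.linalg.svd(Y - Y.mean(0), full_matrices=False)
Zhat = U[:, :q] * s[:q]                # estimated shared latent points

# "Synthesis": map audio into the latent space, then latent to visual,
# via least squares (the paper samples the trained GP model instead).
Ba, *_ = np.linalg.lstsq(A, Zhat, rcond=None)    # audio -> latent
Bv, *_ = np.linalg.lstsq(Zhat, V, rcond=None)    # latent -> visual
V_pred = (A @ Ba) @ Bv

err = np.mean((V_pred - V) ** 2) / np.mean(V ** 2)
print(f"relative reconstruction error: {err:.3f}")
```

Because both views are driven by the same latent trajectory, the audio alone pins down the latent point for each frame, and the visual decoder then reconstructs the corresponding visual features; MRD additionally partitions the latent dimensions into shared and modality-private subspaces, which this linear sketch omits.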

Item Type: Book Section
Date Type: Publication
Status: Published
Schools: Engineering
Publisher: Springer
ISBN: 978-3-319-48881-3
ISSN: 1611-3349
Last Modified: 25 Oct 2022 13:40
