Benchmarking Speech Synchronized Facial Animation Based on Context-Dependent Visemes

De Martino, J.M., Violaro, F.

Abstract:
In this paper, we evaluate how effectively a speech-synchronized facial animation system based on context-dependent visemes conveys speech information. The evaluation procedure is based on an oral speech intelligibility test conducted with, and without, supplementary visual information provided by a real and a virtual speaker. Three situations (audio-only, audio+video, and audio+animation) are compared and analysed under five different levels of noise contamination of the audio signal. The results show that the virtual face driven by context-dependent visemes effectively contributes to speech intelligibility at high noise degradation levels (signal-to-noise ratio (SNR) of -18 dB).