Multimodal HCI output: Facial motion, gestures and synthesized speech synchronization (CROSBI ID 40611)
Book chapter | original scientific paper
Authorship data
Pandžić, Igor
English
Multimodal HCI output: Facial motion, gestures and synthesized speech synchronization
In this chapter we present an overview of the issues involved in generating multimodal output consisting of speech, facial motion and gestures. We start by introducing a basic audio-visual speech synthesis system that generates simple lip motion from input text using a TTS engine and an animation system. Throughout the chapter we gradually extend and improve this system, first with coarticulation, then with full facial motion and gestures, and finally we present it in the context of a complete Embodied Conversational Agent system. At each level we present the key concepts and discuss existing systems. We concentrate on real-time interactive systems, as required for HCI. This requires on-the-fly generation and synchronization of speech and animation, and does not allow for any time-consuming pre-processing. The practical issues that this requirement brings are discussed in the final section, which deals with obtaining timing information from the TTS engine. We concentrate on systems that accept plain text input (ASCII or Unicode) rather than those that require manual tagging of the text, because such tagging adds significant overhead to the implementation of any HCI application.
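The core synchronization idea described in the abstract can be sketched as follows: a TTS engine reports per-phoneme timing for the synthesized utterance, and these timings are mapped to viseme keyframes that the animation system plays back in lockstep with the audio. This is a minimal illustrative sketch only; the phoneme set, the `PHONEME_TO_VISEME` table and the tuple format are assumptions for the example, not taken from the chapter.

```python
# Hypothetical phoneme-to-viseme table; a real system would cover the
# full phoneme inventory of the TTS engine.
PHONEME_TO_VISEME = {
    "p": "lips_closed", "b": "lips_closed", "m": "lips_closed",
    "f": "lower_lip_teeth", "v": "lower_lip_teeth",
    "a": "open_wide", "o": "rounded", "u": "rounded",
    "sil": "neutral",
}

def viseme_keyframes(phoneme_timings):
    """Convert (phoneme, start_ms, duration_ms) tuples, as reported by
    a TTS engine, into (time_ms, viseme) keyframes for the animation
    player."""
    keyframes = []
    for phoneme, start_ms, duration_ms in phoneme_timings:
        viseme = PHONEME_TO_VISEME.get(phoneme, "neutral")
        # One keyframe at each phoneme onset; the coarticulation
        # extension discussed in the chapter would additionally blend
        # neighbouring visemes rather than switch abruptly.
        keyframes.append((start_ms, viseme))
    return keyframes

timings = [("sil", 0, 100), ("m", 100, 80), ("a", 180, 120), ("p", 300, 90)]
print(viseme_keyframes(timings))
```

Because the keyframes carry the same timestamps as the synthesized audio, playing both streams against a shared clock yields the on-the-fly synchronization the chapter requires, with no pre-processing step.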
character animation, multimodal speech synthesis
not recorded
not recorded
not recorded
not recorded
not recorded
not recorded
Contribution data
pp. 257-274.
published
Book data
Multimodal Signal Processing
Thiran, Jean-Philippe; Marques, Ferran; Bourlard, Herve (eds.)
Oxford: Academic Press
2010.
978-0-12-374825-6