Multimodal HCI output: Facial motion, gestures and synthesized speech synchronization

Pandžić, Igor

Bibliographic record number: 449341

Multimodal HCI output: Facial motion, gestures and synthesized speech synchronization // Multimodal Signal Processing / Thiran, Jean-Philippe ; Marques, Ferran ; Bourlard, Herve (eds.).
Oxford: Academic Press, 2010. pp. 257-274

CROSBI ID: 449341

Title
Multimodal HCI output: Facial motion, gestures and synthesized speech synchronization

Authors
Pandžić, Igor

Type, subtype and category of work
Book chapters, scientific

Book
Multimodal Signal Processing

Editor(s)
Thiran, Jean-Philippe ; Marques, Ferran ; Bourlard, Herve

Publisher
Academic Press

City
Oxford

Year
2010

Page range
257-274

ISBN
978-0-12-374825-6

Keywords
character animation, multimodal speech synthesis

Abstract
In this chapter we present an overview of the issues involved in generating multimodal output consisting of speech, facial motion and gestures. We start by introducing a basic audio-visual speech synthesis system that generates simple lip motion from input text using a TTS engine and an animation system. Throughout the chapter we gradually extend and improve this system, first with coarticulation, then with full facial motion and gestures, and finally we present it in the context of a full Embodied Conversational Agent system. At each level we present key concepts and discuss existing systems. We concentrate on real-time interactive systems, as required for HCI. This requires on-the-fly generation and synchronization of speech and animation, and does not allow for any time-consuming pre-processing. We discuss the practical issues that this requirement raises in the final section, which deals with obtaining timing information from the TTS engine. We concentrate on systems that take plain text input (ASCII or Unicode) rather than those that require manual tagging of text, because manual tagging adds significant overhead to the implementation of any HCI application.

Original language
English

Scientific fields
Computing

ASSOCIATED WITH THIS WORK

Projects:
036-0362027-2028 - Embodied conversational agents for services in networked and mobile systems (Pandžić, Igor Sunday, MZO)

Institutions:
Faculty of Electrical Engineering and Computing, Zagreb

Profiles:

Igor Sunday Pandžić (author)

CROSBI Croatian Scientific Bibliography