MULTIMODAL SPEAKER IDENTITY CONVERSION - CONTINUED (CROSBI ID 534914)
Conference paper in proceedings | original scientific paper
Authorship information
Inanoglu, Zeynep ; Jottrand, Matthieu ; Markaki, Maria ; Stanković, Kristina ; Zara, Aurélie ; Arslan, Levent ; Dutoit, Thierry ; Pandžić, Igor ; Saraclar, Murat ; Stylianou, Yannis
English
MULTIMODAL SPEAKER IDENTITY CONVERSION - CONTINUED
Converting the speech and facial movements of a given source speaker into those of another (identified) target speaker is a challenging problem. In this paper we build on the experience gained in a previous eNTERFACE workshop to produce a working, although still very imperfect, identity conversion system. The conversion system we develop is based on the late fusion of two independently obtained conversion results: voice conversion and facial movement conversion. In an attempt to perform parallel conversion of the glottal source and vocal tract features of speech, we examine the usability of the ARX-LF source-filter model of speech. Given its high sensitivity to parameter modification, we then use the codebook-based STASC model. For face conversion, we first build 3D facial models of the source and target speakers using the MPEG-4 standard. Facial movements are then tracked using the Active Appearance Model approach, and facial movement mapping is obtained by imposing source FAPs on the 3D model of the target, and using the target FAPUs to interpret the source FAPs.
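The final step of the face-conversion pipeline described above rests on a property of MPEG-4 facial animation: a FAP value is expressed in speaker-specific FAPU units, so the same FAP stream produces face-appropriate displacements when multiplied by a different speaker's FAPUs. A minimal sketch of that mapping, with illustrative function and variable names and made-up FAPU values (the paper does not give an implementation):

```python
# Hedged sketch of FAP transfer: a source FAP value, measured in FAPU
# units, is reinterpreted on the target face by scaling it with the
# target speaker's FAPUs. All names and numbers here are hypothetical.

def map_faps_to_target(source_faps, target_fapus):
    """Interpret source FAP values with the target speaker's FAPUs.

    source_faps  : dict mapping FAP name -> (value in FAPU units, FAPU name)
    target_fapus : dict mapping FAPU name -> length in target-model units
    Returns a dict mapping FAP name -> displacement on the target model.
    """
    return {
        fap: value * target_fapus[fapu_name]
        for fap, (value, fapu_name) in source_faps.items()
    }

# Example: jaw opening measured in MNS (mouth-nose separation) units,
# lip-corner stretch in MW (mouth width) units; FAPU lengths are invented.
source_faps = {"open_jaw": (200.0, "MNS"), "stretch_l_cornerlip": (-50.0, "MW")}
target_fapus = {"MNS": 0.04, "MW": 0.06}
print(map_faps_to_target(source_faps, target_fapus))
# → {'open_jaw': 8.0, 'stretch_l_cornerlip': -3.0}
```

Because the FAPUs normalize motion to each face's proportions, the same expressive gesture scales correctly from the source's geometry to the target's.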
Voice conversion – Speech-to-speech conversion – Speaker mapping
Not recorded
Not recorded
Not recorded
Not recorded
Not recorded
Not recorded
Contribution details
51-60.
2007.
Published
Host publication details
Proceedings eNTERFACE'07
Istanbul:
978-2-87463-105-4
Conference details
Summer workshop on Multimodal Interfaces
Other
16.07.2007-10.08.2007
Istanbul, Turkey