MULTIMODAL SPEAKER IDENTITY CONVERSION - CONTINUED // Proceedings eNTERFACE'07
Istanbul, 2007, pp. 51-60 (other, not peer-reviewed, full paper (in extenso), scientific)
CROSBI ID: 332623
Title
MULTIMODAL SPEAKER IDENTITY CONVERSION - CONTINUED
Authors
Inanoglu, Zeynep ; Jottrand, Matthieu ; Markaki, Maria ; Stanković, Kristina ; Zara, Aurélie ; Arslan, Levent ; Dutoit, Thierry ; Pandžić, Igor ; Saraclar, Murat ; Stylianou, Yannis
Type, subtype and category of work
Conference proceedings papers, full paper (in extenso), scientific
Source
Proceedings eNTERFACE'07
Istanbul, 2007, pp. 51-60
ISBN
978-2-87463-105-4
Conference
Summer workshop on Multimodal Interfaces
Place and date
Istanbul, Turkey, 16.07.2007 - 10.08.2007
Type of participation
Other
Type of peer review
Not peer-reviewed
Keywords
Voice conversion; Speech-to-speech conversion; Speaker mapping
Abstract
Converting the speech and facial movements of a given source speaker into those of another (identified) target speaker is a challenging problem. In this paper we build on the experience gained in a previous eNTERFACE workshop to produce a working, although still very imperfect, identity conversion system. The conversion system we develop is based on the late fusion of two independently obtained conversion results: voice conversion and facial movement conversion. In an attempt to perform parallel conversion of the glottal source and vocal tract features of speech, we examine the usability of the ARX-LF source-filter model of speech. Because of its high sensitivity to parameter modification, we then use the codebook-based STASC model. For face conversion, we first build 3D facial models of the source and target speakers, using the MPEG-4 standard. Facial movements are then tracked using the Active Appearance Model approach, and facial movement mapping is obtained by imposing source FAPs on the 3D model of the target, and using the target FAPUs to interpret the source FAPs.
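The facial mapping step in the abstract relies on a property of MPEG-4 facial animation: FAP amplitudes are expressed in face-relative FAPU units, so a FAP stream tracked on the source face can be replayed on the target face by interpreting it with the target's own FAPUs. The following is a minimal illustrative sketch of that retargeting step; the function and all names and values are hypothetical, not taken from the paper's implementation.

```python
# Hedged sketch of MPEG-4 FAP retargeting as described in the abstract:
# FAP amplitudes are speaker-neutral because they are expressed in FAPU
# (Facial Animation Parameter Units), so applying a source FAP stream to
# a target face amounts to scaling each FAP by the *target* face's FAPUs.
# All names and numbers below are illustrative assumptions.

def retarget_faps(source_faps, target_fapus):
    """Map source FAP values onto a target face model.

    source_faps  : dict, FAP name -> amplitude in FAPU units (speaker-neutral)
    target_fapus : dict, FAP name -> FAPU size on the target, in model units
    returns      : dict, FAP name -> displacement in target model units
    """
    return {name: value * target_fapus[name]
            for name, value in source_faps.items()}

# Illustrative example: an "open_jaw" FAP of 200 FAPU on a target whose
# corresponding FAPU is 0.01 model units yields a 2.0-unit displacement.
source_faps = {"open_jaw": 200.0, "stretch_l_cornerlip": 50.0}
target_fapus = {"open_jaw": 0.01, "stretch_l_cornerlip": 0.02}
print(retarget_faps(source_faps, target_fapus))
```

Because the scaling uses only the target's FAPUs, the same tracked source sequence animates any target face proportionally to that face's geometry, which is the mapping behaviour the abstract describes.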
Original language
English
Scientific fields
Computer science
RELATED TO THIS WORK
Projects:
036-0362027-2028 - Embodied conversational agents for services in networked and mobile systems (Pandžić, Igor Sunday, MZO) (CroRIS)
Institutions:
Fakultet elektrotehnike i računarstva, Zagreb