MULTIMODAL SPEAKER IDENTITY CONVERSION - CONTINUED // Proceedings eNTERFACE'07
Istanbul, 2007, pp. 51-60 (other, not peer-reviewed, full paper (in extenso), scientific)
CROSBI ID: 332623
Title
MULTIMODAL SPEAKER IDENTITY CONVERSION - CONTINUED
Authors
Inanoglu, Zeynep ; Jottrand, Matthieu ; Markaki, Maria ; Stanković, Kristina ; Zara, Aurélie ; Arslan, Levent ; Dutoit, Thierry ; Pandžić, Igor ; Saraclar, Murat ; Stylianou, Yannis
Type, subtype and category of work
Conference proceedings papers, full paper (in extenso), scientific
Source
Proceedings eNTERFACE'07
Istanbul, 2007, pp. 51-60
ISBN
978-2-87463-105-4
Conference
Summer workshop on Multimodal Interfaces
Place and date
Istanbul, Turkey, 16.07.2007 - 10.08.2007
Type of participation
Other
Type of peer review
Not peer-reviewed
Keywords
Voice conversion; Speech-to-speech conversion; Speaker mapping
Abstract
Converting the speech and facial movements of a given source speaker into those of another (identified) target speaker is a challenging problem. In this paper we build on the experience gained in a previous eNTERFACE workshop to produce a working, although still very imperfect, identity conversion system. The conversion system we develop is based on the late fusion of two independently obtained conversion results: voice conversion and facial movement conversion. In an attempt to perform parallel conversion of the glottal source and vocal tract features of speech, we examine the usability of the ARX-LF source-filter model of speech. Because of its high sensitivity to parameter modification, we then use the codebook-based STASC model. For face conversion, we first build 3D facial models of the source and target speakers, using the MPEG-4 standard. Facial movements are then tracked using the Active Appearance Model approach, and facial movement mapping is obtained by imposing source FAPs on the 3D model of the target, and using the target FAPUs to interpret the source FAPs.
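The facial mapping step in the abstract relies on a property of MPEG-4 facial animation: FAP amplitudes are expressed in face-relative FAPU units, so a FAP stream tracked on the source face can be replayed on the target face by interpreting it with the target's own FAPUs. The following is a minimal illustrative sketch of that retargeting step; the function and all names and values are hypothetical, not taken from the paper's implementation.

```python
# Hedged sketch of MPEG-4 FAP retargeting as described in the abstract:
# FAP amplitudes are speaker-neutral because they are expressed in FAPU
# (Facial Animation Parameter Units), so applying a source FAP stream to
# a target face amounts to scaling each FAP by the *target* face's FAPUs.
# All names and numbers below are illustrative assumptions.

def retarget_faps(source_faps, target_fapus):
    """Map source FAP values onto a target face model.

    source_faps  : dict, FAP name -> amplitude in FAPU units (speaker-neutral)
    target_fapus : dict, FAP name -> FAPU size on the target, in model units
    returns      : dict, FAP name -> displacement in target model units
    """
    return {name: value * target_fapus[name]
            for name, value in source_faps.items()}

# Illustrative example: an "open_jaw" FAP of 200 FAPU on a target whose
# corresponding FAPU is 0.01 model units yields a 2.0-unit displacement.
source_faps = {"open_jaw": 200.0, "stretch_l_cornerlip": 50.0}
target_fapus = {"open_jaw": 0.01, "stretch_l_cornerlip": 0.02}
print(retarget_faps(source_faps, target_fapus))
```

Because the scaling uses only the target's FAPUs, the same tracked source sequence animates any target face proportionally to that face's geometry, which is the mapping behaviour the abstract describes.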
Original language
English
Scientific fields
Computer science
RELATED TO THIS WORK
Projects:
036-0362027-2028 - Embodied conversational agents for services in networked and mobile systems (Pandžić, Igor Sunday, MZO) (CroRIS)
Institutions:
Fakultet elektrotehnike i računarstva, Zagreb