MULTIMODAL SPEAKER IDENTITY CONVERSION - CONTINUED (CROSBI ID 534914)
Conference paper in proceedings | original scientific paper
Authorship information
Inanoglu, Zeynep ; Jottrand, Matthieu ; Markaki, Maria ; Stanković, Kristina ; Zara, Aurélie ; Arslan, Levent ; Dutoit, Thierry ; Pandžić, Igor ; Saraclar, Murat ; Stylianou, Yannis
English
MULTIMODAL SPEAKER IDENTITY CONVERSION - CONTINUED
Converting the speech and facial movements of a given source speaker into those of another (identified) target speaker is a challenging problem. In this paper we build on the experience gained in a previous eNTERFACE workshop to produce a working, although still very imperfect, identity conversion system. The conversion system we develop is based on the late fusion of two independently obtained conversion results: voice conversion and facial movement conversion. In an attempt to perform parallel conversion of the glottal source and vocal tract features of speech, we examine the usability of the ARX-LF source-filter model of speech. Given its high sensitivity to parameter modification, we then use the codebook-based STASC model. For face conversion, we first build 3D facial models of the source and target speakers using the MPEG-4 standard. Facial movements are then tracked using the Active Appearance Model approach, and facial movement mapping is obtained by imposing source FAPs on the 3D model of the target, and using the target FAPUs to interpret the source FAPs.
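The final step of the face-conversion pipeline described above rests on a property of MPEG-4 facial animation: a FAP value is expressed in speaker-specific FAPU units, so the same FAP stream produces face-appropriate displacements when multiplied by a different speaker's FAPUs. A minimal sketch of that mapping, with illustrative function and variable names and made-up FAPU values (the paper does not give an implementation):

```python
# Hedged sketch of FAP transfer: a source FAP value, measured in FAPU
# units, is reinterpreted on the target face by scaling it with the
# target speaker's FAPUs. All names and numbers here are hypothetical.

def map_faps_to_target(source_faps, target_fapus):
    """Interpret source FAP values with the target speaker's FAPUs.

    source_faps  : dict mapping FAP name -> (value in FAPU units, FAPU name)
    target_fapus : dict mapping FAPU name -> length in target-model units
    Returns a dict mapping FAP name -> displacement on the target model.
    """
    return {
        fap: value * target_fapus[fapu_name]
        for fap, (value, fapu_name) in source_faps.items()
    }

# Example: jaw opening measured in MNS (mouth-nose separation) units,
# lip-corner stretch in MW (mouth width) units; FAPU lengths are invented.
source_faps = {"open_jaw": (200.0, "MNS"), "stretch_l_cornerlip": (-50.0, "MW")}
target_fapus = {"MNS": 0.04, "MW": 0.06}
print(map_faps_to_target(source_faps, target_fapus))
# → {'open_jaw': 8.0, 'stretch_l_cornerlip': -3.0}
```

Because the FAPUs normalize motion to each face's proportions, the same expressive gesture scales correctly from the source's geometry to the target's.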
Voice conversion – Speech-to-speech conversion – Speaker mapping
Not recorded
Not recorded
Not recorded
Not recorded
Not recorded
Not recorded
Contribution details
51-60.
2007.
Published
Host publication details
Proceedings eNTERFACE'07
Istanbul:
978-2-87463-105-4
Conference details
Summer workshop on Multimodal Interfaces
Other
16.07.2007-10.08.2007
Istanbul, Turkey