
Bibliographic record number: 653961

A Comparison of Two Approaches to Bilingual HMM-Based Speech Synthesis


Pobar, Miran; Justin, Tadej; Žibert, Janez; Mihelič, France; Ipšić, Ivo
A Comparison of Two Approaches to Bilingual HMM-Based Speech Synthesis // Text, Speech, and Dialogue / Habernal, Ivan ; Matoušek, Václav (eds.).
Pilsen: Springer Berlin Heidelberg, 2013. pp. 44-51 (lecture, international peer review, full paper (in extenso), scientific)


Title
A Comparison of Two Approaches to Bilingual HMM-Based Speech Synthesis

Authors
Pobar, Miran ; Justin, Tadej ; Žibert, Janez ; Mihelič, France ; Ipšić, Ivo

Type, subtype and category of work
Papers in conference proceedings, full paper (in extenso), scientific

Source
Text, Speech, and Dialogue / Habernal, Ivan ; Matoušek, Václav - Pilsen : Springer Berlin Heidelberg, 2013, 44-51

ISBN
978-3-642-40584-6

Conference
16th International Conference, TSD 2013, Pilsen, Czech Republic, September 1-5, 2013. Proceedings

Place and date
Pilsen, Czech Republic, 1-5 September 2013

Type of participation
Lecture

Type of review
International peer review

Keywords
Bilingual; HMM; speech synthesis; phoneme mapping; state mapping; speaker adaptation; Kullback-Leibler divergence

Abstract
We compare the performance of two approaches to building bilingual speech synthesis systems from cross-lingual data recorded by different speakers, so that speech in both languages is produced with the same speaker identity. The first approach treats the data from both languages as monolingual by labeling all of it with a manually joined phoneme set. A speaker-independent voice is trained on the joined data and adapted to the target speaker using CMLLR adaptation. In the second approach, speaker-independent voices are trained for each language separately. A state mapping between these voices is derived automatically from the minimum Kullback–Leibler divergence between state distributions. The mapping is used to apply the adaptation transformations calculated within one language to the speaker-independent voice of the other language. We evaluate speech quality on the MOS scale and the similarity of the synthesized speech to the target speaker using DMOS, taking the Croatian-Slovene language pair as an example.
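
The state mapping step in the second approach can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each HMM state is modeled by a single diagonal-covariance Gaussian, uses a symmetric form of the Kullback–Leibler divergence, and maps each state of one voice to the closest state of the other. The state names and parameter values are hypothetical.

```python
import math


def kl_diag_gauss(mean_p, var_p, mean_q, var_q):
    """KL(p || q) for two Gaussians with diagonal covariance."""
    total = 0.0
    for mp, vp, mq, vq in zip(mean_p, var_p, mean_q, var_q):
        total += math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0
    return 0.5 * total


def symmetric_kl(p, q):
    """Symmetric divergence KL(p || q) + KL(q || p); an assumption of this sketch."""
    return kl_diag_gauss(p[0], p[1], q[0], q[1]) + kl_diag_gauss(q[0], q[1], p[0], p[1])


def map_states(states_a, states_b):
    """Map every state in states_b to the state in states_a with minimum divergence."""
    mapping = {}
    for name_b, gauss_b in states_b.items():
        mapping[name_b] = min(
            states_a, key=lambda name_a: symmetric_kl(states_a[name_a], gauss_b)
        )
    return mapping


# Hypothetical states: name -> (mean vector, variance vector)
voice_a = {
    "hr_a_s2": ([0.0, 1.0], [1.0, 1.0]),
    "hr_e_s2": ([3.0, -1.0], [1.0, 2.0]),
}
voice_b = {
    "sl_a_s2": ([0.1, 0.9], [1.1, 0.9]),
    "sl_e_s2": ([2.8, -1.2], [0.9, 2.1]),
}

mapping = map_states(voice_a, voice_b)
for state_b, state_a in mapping.items():
    print(state_b, "->", state_a)
```

Given such a mapping, an adaptation transform estimated for a state of one voice could be looked up and applied to the mapped state of the other voice, which is the role the abstract assigns to the mapping.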

Original language
English

Scientific fields
Computing, Information and communication sciences



WORK AFFILIATIONS


Project / topic
318-0361935-0852 - Govorne tehnologije (Speech Technologies) (Ivo Ipšić)

Institutions
University of Rijeka - Department of Informatics

Authors with researcher identification number:
Miran Pobar (308561)
Ivo Ipšić (222462)

Indexed in:


  • Scopus