A Comparison of Two Approaches to Bilingual HMM-Based Speech Synthesis

Pobar, Miran; Justin, Tadej; Žibert, Janez; Mihelič, France; Ipšić, Ivo

Bibliographic record number: 653961

A Comparison of Two Approaches to Bilingual HMM-Based Speech Synthesis // Text, Speech, and Dialogue / Habernal, Ivan ; Matoušek, Václav (eds.).
Plzeň: Springer, 2013. pp. 44-51 (lecture, international peer review, full paper (in extenso), scientific)

CROSBI ID: 653961

Title
A Comparison of Two Approaches to Bilingual HMM-Based Speech Synthesis

Authors
Pobar, Miran ; Justin, Tadej ; Žibert, Janez ; Mihelič, France ; Ipšić, Ivo

Type, subtype and category of work
Papers in conference proceedings, full paper (in extenso), scientific

Source
Text, Speech, and Dialogue / Habernal, Ivan ; Matoušek, Václav - Plzeň : Springer, 2013, 44-51

ISBN
978-3-642-40584-6

Conference
16th International Conference, TSD 2013, Pilsen, Czech Republic, September 1-5, 2013. Proceedings

Place and date
Brno, Czech Republic, 01.09.2013 - 05.09.2013

Type of participation
Lecture

Type of review
International peer review

Keywords
bilingual; HMM; speech synthesis; phoneme mapping; state mapping; speaker adaptation; Kullback-Leibler divergence

Abstract
We compare the performance of two approaches to building bilingual speech synthesis systems from cross-lingual data recorded by different speakers, where the resulting systems must produce speech in both languages with the same speaker identity. The first approach treats data from both languages as monolingual by labeling all data with a manually joined phoneme set. A speaker-independent voice is trained on the joined data and adapted to the target speaker using CMLLR adaptation. In the second approach, speaker-independent voices are trained for each language separately. A state mapping between these voices is derived automatically by minimizing the Kullback–Leibler divergence between state distributions. The mapping is used to apply the adaptation transformations calculated within one language to the speaker-independent voice of the other language. We evaluate speech quality on the MOS scale and the similarity of the synthesized speech to the target speaker using DMOS, taking the Croatian-Slovene language pair as an example.
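
The state-mapping step of the second approach can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each HMM state is modeled by a single diagonal-covariance Gaussian, uses a symmetrized Kullback–Leibler divergence, and maps each state of one voice to its nearest state in the other voice. The function names (`kl_diag_gauss`, `symmetric_kl`, `map_states`), the state labels, and the toy values are all hypothetical.

```python
import math


def kl_diag_gauss(mean_p, var_p, mean_q, var_q):
    """KL(p || q) for two Gaussians with diagonal covariance."""
    total = 0.0
    for mp, vp, mq, vq in zip(mean_p, var_p, mean_q, var_q):
        total += math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0
    return 0.5 * total


def symmetric_kl(p, q):
    """Symmetrized divergence KL(p || q) + KL(q || p).

    Assumption: the abstract does not say which direction or variant is used.
    Each argument is a (mean, variance) pair of equal-length lists.
    """
    return kl_diag_gauss(*p, *q) + kl_diag_gauss(*q, *p)


def map_states(source_states, target_states):
    """Map every source state to the target state with minimum divergence."""
    mapping = {}
    for name, dist in source_states.items():
        mapping[name] = min(
            target_states,
            key=lambda other: symmetric_kl(dist, target_states[other]),
        )
    return mapping


# Toy state distributions (hypothetical labels and values): (mean, variance).
croatian_states = {
    "a_s2": ([0.0, 1.0], [1.0, 1.0]),
    "e_s2": ([3.0, -1.0], [0.5, 2.0]),
}
slovene_states = {
    "A_s2": ([0.1, 0.9], [1.1, 0.9]),
    "E_s2": ([2.8, -1.2], [0.6, 1.8]),
}

state_mapping = map_states(croatian_states, slovene_states)
```

In a setup like the one described, such a mapping would decide which adaptation transform, estimated on states of one language, is applied to each state of the other language's speaker-independent voice.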

Original language
English

Scientific fields
Computer science, Information and communication sciences

PAPER AFFILIATIONS

Projects:
318-0361935-0852 - Govorne tehnologije (Speech Technologies) (Ipšić, Ivo, MZOS) (CroRIS)

Institutions:
Faculty of Informatics and Digital Technologies, Rijeka

Profiles:

Miran Pobar (author)

Ivo Ipšić (author)

Indexed by:

Scopus
