Rapid Prototyping of a Croatian Large Vocabulary Continuous Speech Recognition System

Bajo, Dario; Turković, Danijel; Dembitz, Šandor

Data source: CROSBI

Rapid Prototyping of a Croatian Large Vocabulary Continuous Speech Recognition System (CROSBI ID 604564)

Conference paper in proceedings | original scientific paper | international peer review

Bajo, Dario ; Turković, Danijel ; Dembitz, Šandor. Rapid Prototyping of a Croatian Large Vocabulary Continuous Speech Recognition System // INFOCOMP 2013 / Rückemann, Claus-Peter ; Pankowska, Malgorzata (eds.). Lisbon: International Academy, Research, and Industry Association (IARIA), 2013, pp. 13-18

Responsibility details

Authors

Bajo, Dario ; Turković, Danijel ; Dembitz, Šandor

Basic data in the original language

Language

English

Title

Rapid Prototyping of a Croatian Large Vocabulary Continuous Speech Recognition System

Abstract

The Croatian language, like many minority languages used by less than 0.1% of the world population, is in need of mature automatic speech recognition (ASR) systems for applications such as transcription of speech recordings, voice control, aids for people with impairments, etc. This paper describes a short-term research and development project aimed at producing an applicable Croatian large vocabulary continuous speech recognition system from scratch. The open-source CMU Sphinx toolkit was our platform of choice. For acoustic model training, we built a speech training set of several hundred utterances, containing words carefully chosen according to their phonetic properties. Language models were derived from the Croatian large-scale n-gram system, which ensures the system’s applicability. During the project, we succeeded in developing an ASR system able to recognize, reasonably well, freely chosen utterances composed of the 15,000 most frequently used Croatian words.

Keywords

automatic speech recognition; continuous speech; large-scale n-gram model; large vocabulary.

Note

not recorded

Contribution details

Pages

13-18

Year of publication

2013

Publication status

published

Host publication details

Title

INFOCOMP 2013

Editors

Rückemann, Claus-Peter ; Pankowska, Malgorzata

Publisher

Lisbon: International Academy, Research, and Industry Association (IARIA)

ISBN

978-1-61208-310-0

Conference details

Conference

The Third International Conference on Advanced Communications and Computation

Type of participation

lecture

Conference dates

17.11.2013-22.11.2013

Conference location

Lisbon, Portugal

Related records

Related persons

Šandor Dembitz (author)

Related institutions

Fakultet elektrotehnike i računarstva (Faculty of Electrical Engineering and Computing) (036) (author's institution)

Related projects

Umrežena ekonomija (Networked Economy) (result of work on the project)

Field

Electrical engineering