Bibliographic record no.: 664514
Rapid Prototyping of a Croatian Large Vocabulary Continuous Speech Recognition System
Rapid Prototyping of a Croatian Large Vocabulary Continuous Speech Recognition System // INFOCOMP 2013 / Rückemann, Claus-Peter ; Pankowska, Malgorzata (eds.).
Lisbon: International Academy, Research, and Industry Association (IARIA), 2013, pp. 13-18 (lecture, international peer review, full paper (in extenso), scientific)
CROSBI ID: 664514
Title
Rapid Prototyping of a Croatian Large Vocabulary Continuous Speech Recognition System
Authors
Bajo, Dario ; Turković, Danijel ; Dembitz, Šandor
Type, subtype and category of work
Papers in conference proceedings, full paper (in extenso), scientific
Source
INFOCOMP 2013
/ Rückemann, Claus-Peter ; Pankowska, Malgorzata - Lisbon : International Academy, Research, and Industry Association (IARIA), 2013, 13-18
ISBN
978-1-61208-310-0
Conference
The Third International Conference on Advanced Communications and Computation
Place and date
Lisbon, Portugal, 17-22 November 2013
Type of participation
Lecture
Type of review
International peer review
Keywords
automatic speech recognition; continuous speech; large-scale n-gram model; large vocabulary.
Abstract
The Croatian language, like many minority languages used by less than 0.1% of the world population, is in need of mature automatic speech recognition (ASR) systems for applications such as transcription of speech recordings, voice control, and assistance for people with impairments. This paper describes a short-term research and development project aimed at producing an applicable Croatian large vocabulary continuous speech recognition system from scratch. The open-source CMU Sphinx toolkit was our platform of choice. For the purpose of acoustic model training, we made a speech training set of several hundred utterances, containing words carefully chosen according to their phonetic properties. Language models were derived from the Croatian large-scale n-gram system, which ensures the system’s applicability. During the project, we succeeded in developing an ASR system able to recognize freely chosen utterances composed of the 15,000 most frequently used Croatian words reasonably well.
Original language
English
Scientific fields
Electrical engineering
RELATED ENTITIES
Projects:
036-0362027-1638 - Umrežena ekonomija (Skočir, Zoran, MZO) (CroRIS)
Institutions:
Faculty of Electrical Engineering and Computing (Fakultet elektrotehnike i računarstva), Zagreb
Profiles:
Šandor Dembitz
(author)