Bibliographic record number: 585773

Comparison of Statistical Model-Based Voice Activity Detectors for Mobile Robot Speech Applications


Marković, Ivan; Domitrović, Hrvoje; Petrović, Ivan
Comparison of Statistical Model-Based Voice Activity Detectors for Mobile Robot Speech Applications // Proceedings of the 10th IFAC Symposium on Robotic Control (SYROCO2012), Volume 10, Part 1 / Petrovic, Ivan ; Korondi, Peter (eds.).
Dubrovnik, Croatia, 2012. (lecture, international peer review, full paper (in extenso), scientific)


Title
Comparison of Statistical Model-Based Voice Activity Detectors for Mobile Robot Speech Applications

Authors
Marković, Ivan ; Domitrović, Hrvoje ; Petrović, Ivan

Type, subtype and category of work
Papers in conference proceedings, full paper (in extenso), scientific

Source
Proceedings of the 10th IFAC Symposium on Robotic Control (SYROCO2012), Volume 10, Part 1 / Petrovic, Ivan ; Korondi, Peter - Dubrovnik, Croatia, 2012

ISBN
978-3-902823-11-3

Conference
10th IFAC Symposium on Robotic Control (SYROCO2012)

Place and date
Dubrovnik, Croatia, 5-7 September 2012

Type of participation
Lecture

Type of review
International peer review

Keywords
Voice activity detection; statistical model-based detectors; receiver operating characteristic curves

Abstract
This paper deals with the problem of voice activity detection in adverse acoustic conditions, namely high and varying noise scenarios. For robotic applications, the voice activity detector needs to be computationally light, robust to varying levels of background noise, and have low latency, especially when tracking moving speakers. We analyze three voice activity detectors: two model the discrete Fourier transform coefficients with Gaussian and generalized Gaussian distributions, respectively, while the third models the spectral envelope as having either a Rayleigh or a Rice distribution. We present them in a unifying and consistent manner, with respect to a statistical hypotheses ratio measure and a joint noise spectrum estimation algorithm. Moreover, we compare their performance under various noise conditions (three types of noise, three different signal-to-noise ratios, and six different speakers) by means of receiver operating characteristic curves and the area under the curve score. The results showed that the Rayleigh-Rice model had, on average, better results and a medium computational demand.

Original language
English

Scientific fields
Electrical Engineering, Computing, Basic Technical Sciences



RELATED PROJECTS AND INSTITUTIONS


Project / topic
036-0363078-3018 - Control of mobile robots and vehicles in unknown and dynamic environments (Ivan Petrović)

Institutions
Faculty of Electrical Engineering and Computing, Zagreb