Bibliographic record number: 585773
Comparison of Statistical Model-Based Voice Activity Detectors for Mobile Robot Speech Applications
Comparison of Statistical Model-Based Voice Activity Detectors for Mobile Robot Speech Applications // Proceedings of the 10th IFAC Symposium on Robotic Control (SYROCO2012), Volume 10, Part 1 / Petrovic, Ivan ; Korondi, Peter (eds.).
Dubrovnik, 2012. (lecture, international peer review, full paper (in extenso), scientific)
CROSBI ID: 585773
Title
Comparison of Statistical Model-Based Voice Activity Detectors for Mobile Robot Speech Applications
Authors
Marković, Ivan ; Domitrović, Hrvoje ; Petrović, Ivan
Type, subtype and category of work
Papers in conference proceedings, full paper (in extenso), scientific
Source
Proceedings of the 10th IFAC Symposium on Robotic Control (SYROCO2012), Volume 10, Part 1
/ Petrovic, Ivan ; Korondi, Peter - Dubrovnik, 2012
ISBN
978-3-902823-11-3
Conference
10th IFAC Symposium on Robotic Control (SYROCO2012)
Place and date
Dubrovnik, Croatia, 5-7 September 2012
Type of participation
Lecture
Type of review
International peer review
Keywords
voice activity detection; statistical model-based detectors; receiver operating characteristic curves
Abstract
This paper deals with the problem of voice activity detection in adverse acoustic conditions, namely high and varying noise scenarios. For robotic applications, the voice activity detector needs to be computationally light, robust to varying levels of background noise, and low in latency, especially when tracking moving speakers. We analyze three voice activity detectors: two model the discrete Fourier transform coefficients with a Gaussian and a generalized Gaussian distribution, respectively, while the third models the spectral envelope as having either a Rayleigh or a Rice distribution. We present them in a unifying and consistent manner, with respect to a statistical hypotheses ratio measure and a joint noise spectrum estimation algorithm. Moreover, we compare their performance under various noise conditions (three types of noise, three different signal-to-noise ratios, and six different speakers) by means of receiver operating characteristic curves and the area under the curve score. The results showed that the Rayleigh-Rice model had on average better results and medium computational demand.
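For orientation only, the kind of detector and evaluation the abstract describes can be sketched as follows. This is a minimal illustration, not the paper's implementation. It assumes the textbook complex Gaussian model of the DFT coefficients, a maximum-likelihood a priori SNR estimate, a known noise spectrum, and an arbitrary decision threshold of 0.5. The function names `gaussian_log_lr` and `auc` and all toy values are hypothetical.

```python
import math

def gaussian_log_lr(power_spectrum, noise_psd):
    """Mean per-bin log-likelihood ratio (speech vs. noise only),
    assuming complex Gaussian DFT coefficients."""
    total = 0.0
    for p, lam in zip(power_spectrum, noise_psd):
        gamma = p / lam                 # a posteriori SNR of this bin
        xi = max(gamma - 1.0, 0.0)      # ML a priori SNR estimate (assumption)
        total += gamma * xi / (1.0 + xi) - math.log1p(xi)
    return total / len(power_spectrum)

def auc(scores, labels):
    """Area under the ROC curve computed as the rank (Mann-Whitney) statistic:
    the fraction of (speech, noise) pairs that the score orders correctly."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy data: four frames with four frequency bins each (made-up values).
noise_psd = [1.0, 1.0, 1.0, 1.0]        # assumed known noise power per bin
frames = [
    [0.9, 1.1, 1.0, 0.8],               # noise-like frame
    [4.0, 6.0, 3.0, 5.0],               # speech-like frame
    [1.2, 0.7, 1.0, 1.1],               # noise-like frame
    [9.0, 2.0, 7.0, 4.0],               # speech-like frame
]
labels = [0, 1, 0, 1]                   # hypothetical ground truth (1 = speech)

scores = [gaussian_log_lr(f, noise_psd) for f in frames]
decisions = [s > 0.5 for s in scores]   # 0.5 is an arbitrary illustrative threshold
print(decisions, auc(scores, labels))
```

Sweeping the threshold instead of fixing it traces out the receiver operating characteristic curve, and the rank statistic gives the area under that curve without choosing any threshold.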
Original language
English
Scientific fields
Electrical engineering, Computing, Basic technical sciences
RELATED RECORDS
Projects:
036-0363078-3018 - Control of mobile robots and vehicles in unknown and dynamic environments (Petrović, Ivan, MZO) (CroRIS)
Institutions:
Faculty of Electrical Engineering and Computing, Zagreb