Data source: CROSBI

Perceptual Significance of Cepstral Distortion Measures in Digital Speech Processing (CROSBI ID 160490)

Journal article | original scientific paper | international peer review

Vasilijević, Antonio ; Petrinović, Davor Perceptual Significance of Cepstral Distortion Measures in Digital Speech Processing // Automatika : časopis za automatiku, mjerenje, elektroniku, računarstvo i komunikacije, 52 (2011), 2; 132-146

Responsibility details

Vasilijević, Antonio ; Petrinović, Davor

English

Perceptual Significance of Cepstral Distortion Measures in Digital Speech Processing

Currently, one of the most widely used distance measures in speech and speaker recognition is the Euclidean distance between mel frequency cepstral coefficients (MFCCs). MFCCs are based on a filter bank algorithm whose filters are equally spaced on a perceptually motivated mel frequency scale. The value of the mel cepstral vector, as well as the properties of the corresponding cepstral distance, is determined by several parameters used in mel cepstral analysis. The aim of this work is to examine how well the MFCC measure agrees with human perception for different values of these analysis parameters. An analysis of the mel filter bank parameters shows that a filter bank with 24 bands, a bandwidth of 220 mels and a band overlap coefficient equal to or greater than one gives optimal spectral distortion (SD) distance measures. For this kind of mel filter bank, the difference between vowels can be recognised when the full-length mel cepstral SD RMS measure exceeds 0.4 to 0.5 dB. We further show that using a truncated mel cepstral vector (12 coefficients) is justified for speech recognition, but may be debatable for speaker recognition. We also analysed the impact of aliasing in the cepstral domain on cepstral distortion measures. The results showed a high correlation between SD distances calculated from the aperiodic and the periodic mel cepstrum, leading to the conclusion that the impact of aliasing is generally minor. There are rare exceptions where aliasing is present, and these were also analysed.
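
As a rough illustration of the quantities named in the abstract, the sketch below places filter centres equally on a mel scale and computes an RMS cepstral distortion in dB between two mel cepstral vectors. It is not taken from the paper. The mel formula (2595 log10(1 + f/700)), the 0 to 4000 Hz band limits, the exclusion of the zeroth coefficient and the (10/ln 10)·sqrt(2·Σ) scaling are all common conventions assumed here, and the function names are hypothetical.

```python
import math

def hz_to_mel(f_hz):
    # Assumed mel mapping (a widely used convention, not confirmed by the abstract).
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    # Inverse of the assumed mapping above.
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_band_centers_hz(n_bands=24, f_low=0.0, f_high=4000.0):
    # Centres equally spaced on the mel scale; the band limits are assumptions.
    m_low, m_high = hz_to_mel(f_low), hz_to_mel(f_high)
    step = (m_high - m_low) / (n_bands + 1)
    return [mel_to_hz(m_low + k * step) for k in range(1, n_bands + 1)]

def cepstral_distortion_db(c_a, c_b, n_coeffs=None):
    # RMS log-spectral distortion in dB estimated from two cepstral vectors.
    # Index 0 (energy term) is skipped, which is an assumed convention.
    # n_coeffs truncates the comparison, e.g. 12 for a truncated vector.
    if len(c_a) != len(c_b):
        raise ValueError("cepstral vectors must have the same length")
    last = len(c_a) if n_coeffs is None else min(len(c_a), n_coeffs + 1)
    sq_sum = sum((c_a[n] - c_b[n]) ** 2 for n in range(1, last))
    return (10.0 / math.log(10.0)) * math.sqrt(2.0 * sq_sum)

centers = mel_band_centers_hz()                      # 24 centre frequencies in Hz
sd_full = cepstral_distortion_db([0.0, 0.10, 0.05], [0.0, 0.00, 0.05])
sd_trunc = cepstral_distortion_db([0.0, 0.10, 0.05], [0.0, 0.00, 0.00], n_coeffs=1)
```

Comparing the full-length and truncated values for the same pair of frames shows how much of the distortion is carried by the higher-order coefficients, which is the kind of trade-off the abstract raises for speech versus speaker recognition.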

Aliasing ; Digital speech processing ; MFCC ; Mel cepstrum ; SD Measure ; Speech recognition ; OBJECTIVE ASSESSMENT ; RECOGNITION ; QUALITY

Publication details

52 (2)

2011.

132-146

published

0005-1144

Related fields

Electrical engineering, Computing
