
Bibliographic record number: 451668

Perceptual Significance of Cepstral Distortion Measures in Digital Speech Processing


Vasilijević, Antonio; Petrinović, Davor
Perceptual Significance of Cepstral Distortion Measures in Digital Speech Processing // Automatika : časopis za automatiku, mjerenje, elektroniku, računarstvo i komunikacije, 52 (2011), 2; 132-146 (international peer review, article, scientific)


CROSBI ID: 451668

Title
Perceptual Significance of Cepstral Distortion Measures in Digital Speech Processing

Authors
Vasilijević, Antonio ; Petrinović, Davor

Source
Automatika : časopis za automatiku, mjerenje, elektroniku, računarstvo i komunikacije (0005-1144) 52 (2011), 2; 132-146

Type, subtype and category of work
Journal papers, article, scientific

Keywords
Aliasing ; Digital speech processing ; MFCC ; Mel cepstrum ; SD Measure ; Speech recognition ; OBJECTIVE ASSESSMENT ; RECOGNITION ; QUALITY

Abstract
Currently, one of the most widely used distance measures in speech and speaker recognition is the Euclidean distance between mel frequency cepstral coefficients (MFCCs). MFCCs are based on a filter bank algorithm whose filters are equally spaced on a perceptually motivated mel frequency scale. The value of the mel cepstral vector, as well as the properties of the corresponding cepstral distance, are determined by several parameters used in mel cepstral analysis. The aim of this work is to examine the compatibility of the MFCC measure with human perception for different values of the analysis parameters. An analysis of the mel filter bank parameters shows that a filter bank with 24 bands, a bandwidth of 220 mels and a band overlap coefficient equal to or greater than one gives optimal spectral distortion (SD) distance measures. For this kind of mel filter bank, the difference between vowels can be recognised when the full-length mel cepstral SD RMS measure exceeds 0.4 to 0.5 dB. We further show that the use of a truncated mel cepstral vector (12 coefficients) is justified for speech recognition but may be questionable for speaker recognition. We also analysed the impact of aliasing in the cepstral domain on cepstral distortion measures. The results showed a high correlation between SD distances calculated from the aperiodic and the periodic mel cepstrum, leading to the conclusion that the impact of aliasing is generally minor. There are rare exceptions where aliasing is present, and these were also analysed.
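
The distance described in the abstract, a Euclidean distance between mel cepstral vectors reported as an RMS spectral distortion in dB, can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes the commonly used textbook relation SD = (10 / ln 10) * sqrt(2 * sum_k (c_k - c'_k)^2) with the zeroth coefficient excluded, and the function names, the `n_coeffs` truncation parameter and the example vectors are hypothetical.

```python
import math

# Factor that converts a natural-log spectral difference to decibels.
DB_PER_NEPER = 10.0 / math.log(10.0)


def euclidean_distance(c1, c2, n_coeffs=None):
    """Euclidean distance between two cepstral vectors.

    If n_coeffs is given, only the first n_coeffs coefficients are used,
    which mimics a truncated cepstral vector.
    """
    if len(c1) != len(c2):
        raise ValueError("cepstral vectors must have the same length")
    n = len(c1) if n_coeffs is None else n_coeffs
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(c1[:n], c2[:n])))


def sd_rms_db(c1, c2, n_coeffs=None):
    """RMS spectral distortion in dB (assumed textbook relation).

    The inputs are taken to hold coefficients c_1, c_2, ... only;
    the energy term c_0 is assumed to have been dropped by the caller.
    """
    return DB_PER_NEPER * math.sqrt(2.0) * euclidean_distance(c1, c2, n_coeffs)


# Hypothetical example vectors, not data from the paper.
reference = [0.00, 0.00, 0.00]
distorted = [0.03, 0.04, 0.00]

full_sd = sd_rms_db(reference, distorted)
truncated_sd = sd_rms_db(reference, distorted, n_coeffs=1)
```

Comparing `full_sd` with `truncated_sd` shows how truncating the vector discards the contribution of the higher-order coefficients, which is the trade-off the abstract raises for speech versus speaker recognition.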

Original language
English

Scientific fields
Electrical engineering, Computing



RELATED ENTITIES


Projects:
0036054

Institutions:
Fakultet elektrotehnike i računarstva (Faculty of Electrical Engineering and Computing), Zagreb

Profiles:

Antonio Vasilijević (author)

Davor Petrinović (author)

Links to the full text:

Full text access: Hrčak

Cite this publication:

Vasilijević, Antonio; Petrinović, Davor
Perceptual Significance of Cepstral Distortion Measures in Digital Speech Processing // Automatika : časopis za automatiku, mjerenje, elektroniku, računarstvo i komunikacije, 52 (2011), 2; 132-146 (international peer review, article, scientific)
Vasilijević, A. & Petrinović, D. (2011) Perceptual Significance of Cepstral Distortion Measures in Digital Speech Processing. Automatika : časopis za automatiku, mjerenje, elektroniku, računarstvo i komunikacije, 52 (2), 132-146.
@article{article, author = {Vasilijevi\'{c}, Antonio and Petrinovi\'{c}, Davor}, year = {2011}, pages = {132-146}, keywords = {Aliasing, Digital speech processing, MFCC, Mel cepstrum, SD Measure, Speech recognition, OBJECTIVE ASSESSMENT, RECOGNITION, QUALITY}, journal = {Automatika : \v{c}asopis za automatiku, mjerenje, elektroniku, ra\v{c}unarstvo i komunikacije}, volume = {52}, number = {2}, issn = {0005-1144}, title = {Perceptual Significance of Cepstral Distortion Measures in Digital Speech Processing}, keyword = {Aliasing, Digital speech processing, MFCC, Mel cepstrum, SD Measure, Speech recognition, OBJECTIVE ASSESSMENT, RECOGNITION, QUALITY} }

The journal is indexed in:


  • Web of Science Core Collection (WoSCC)
    • Science Citation Index Expanded (SCI-EXP)
    • SCI-EXP, SSCI and/or A&HCI
  • Scopus


Inclusion in other bibliographic databases:


  • INSPEC
  • Science Citation Index Expanded
  • Chemical Abstracts
  • Current Bibliography on Science and Technology
  • Science Abstracts
  • Referativnyi Zhurnal




