Bibliographic record number: 451615

Gaussian Mixture Model Based Quantization of Line Spectral Frequencies for Adaptive Multirate Speech Codec


Tadić, Tihomir; Petrinović, Davor
Gaussian Mixture Model Based Quantization of Line Spectral Frequencies for Adaptive Multirate Speech Codec // CIT. Journal of computing and information technology, 19 (2011), 2; 113-126 doi:10.2498/cit.1001767 (international peer review, article, scientific)


CROSBI ID: 451615

Title
Gaussian Mixture Model Based Quantization of Line Spectral Frequencies for Adaptive Multirate Speech Codec

Authors
Tadić, Tihomir ; Petrinović, Davor

Source
CIT. Journal of computing and information technology (1330-1136) 19 (2011), 2; 113-126

Type, subtype and category of work
Journal papers, article, scientific

Keywords
Gaussian mixture models; Karhunen-Loève transform; Line spectral frequency; Adaptive Multi-Rate codec; Speech coding; Transform coding; Vector quantization; Entropy constrained scalar quantizer

Abstract
In this paper, we investigate the use of a Gaussian Mixture Model (GMM)-based quantizer for quantization of the Line Spectral Frequencies (LSFs) in the Adaptive Multi-Rate (AMR) speech codec. We estimate the parametric GMM model of the probability density function (pdf) for the prediction error (residual) of mean-removed LSF parameters that are used in the AMR codec for speech spectral envelope representation. The studied GMM-based quantizer is based on transform coding using the Karhunen-Loève transform (KLT) and transform-domain scalar quantizers (SQ) individually designed for each Gaussian mixture. We have investigated the applicability of such a quantization scheme in the existing AMR codec by replacing only the AMR LSF quantization algorithm segment. The main novelty in this paper lies in applying and adapting entropy constrained (EC) coding for fixed-rate scalar quantization of transformed residuals, thereby allowing for better adaptation to the local statistics of the source. We study and evaluate the compression efficiency, computational complexity and memory requirements of the proposed algorithm. Experimental results show that the GMM-based EC quantizer provides better rate/distortion performance than the quantization schemes used in the reference AMR codec, saving up to 7.32 bits/frame at much lower rate-independent computational complexity and memory requirements.
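
The general idea of GMM-based transform coding described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the entropy-constrained design of the scalar quantizers is replaced here by a plain uniform quantizer with a fixed step, and the function names, the `step` parameter and the toy two-component model are assumptions made only for illustration. Each mixture component contributes its own KLT (the eigenvectors of its covariance matrix); the input vector is coded with every component, and the component giving the lowest squared error is kept.

```python
import numpy as np

def klt_basis(cov):
    """Return the KLT basis of a covariance matrix (eigenvectors as columns)."""
    _, eigvecs = np.linalg.eigh(np.asarray(cov, dtype=float))
    return eigvecs

def quantize_with_component(x, mean, basis, step):
    """Code x with one mixture component: remove mean, apply KLT, round, invert."""
    y = basis.T @ (x - mean)              # decorrelated coefficients
    idx = np.round(y / step).astype(int)  # uniform scalar quantizer indices (assumed design)
    x_hat = basis @ (idx * step) + mean   # reconstruction in the original domain
    return idx, x_hat

def gmm_quantize(x, means, covs, step=0.05):
    """Try every component and keep the one with the smallest squared error."""
    x = np.asarray(x, dtype=float)
    best = None
    for m, (mean, cov) in enumerate(zip(means, covs)):
        mean = np.asarray(mean, dtype=float)
        idx, x_hat = quantize_with_component(x, mean, klt_basis(cov), step)
        err = float(np.sum((x - x_hat) ** 2))
        if best is None or err < best[3]:
            best = (m, idx, x_hat, err)
    return best  # (component index, coefficient indices, reconstruction, error)

# Toy two-component model in two dimensions (hypothetical values).
means = [np.zeros(2), np.ones(2)]
covs = [np.array([[1.0, 0.8], [0.8, 1.0]]), np.eye(2)]
component, indices, x_hat, err = gmm_quantize([0.3, -0.2], means, covs)
print(component, indices, x_hat, err)
```

Because the KLT basis is orthonormal, the squared reconstruction error equals the squared quantization error of the coefficients, so with a uniform step it is bounded by the dimension times `(step / 2) ** 2`. A real coder would additionally allocate bits per coefficient and transmit the chosen component index.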

Original language
English

Scientific fields
Electrical engineering, Computing



ASSOCIATED WITH THIS WORK


Projects:
0036054

Institutions:
Faculty of Electrical Engineering and Computing (Fakultet elektrotehnike i računarstva), Zagreb

Profiles:

Davor Petrinović (author)

Tihomir Tadić (author)

Links to the full text of the work:

Full-text access: doi, Hrčak

Cite this publication:

Tadić, Tihomir; Petrinović, Davor
Gaussian Mixture Model Based Quantization of Line Spectral Frequencies for Adaptive Multirate Speech Codec // CIT. Journal of computing and information technology, 19 (2011), 2; 113-126 doi:10.2498/cit.1001767 (international peer review, article, scientific)
Tadić, T. & Petrinović, D. (2011) Gaussian Mixture Model Based Quantization of Line Spectral Frequencies for Adaptive Multirate Speech Codec. CIT. Journal of computing and information technology, 19 (2), 113-126 doi:10.2498/cit.1001767.
@article{article,
  author   = {Tadi\'{c}, Tihomir and Petrinovi\'{c}, Davor},
  title    = {Gaussian Mixture Model Based Quantization of Line Spectral Frequencies for Adaptive Multirate Speech Codec},
  journal  = {CIT. Journal of computing and information technology},
  year     = {2011},
  volume   = {19},
  number   = {2},
  pages    = {113--126},
  issn     = {1330-1136},
  doi      = {10.2498/cit.1001767},
  keywords = {Gaussian mixture models, Karhunen-Lo\`{e}ve transform, Line spectral frequency, Adaptive Multi-Rate codec, Speech coding, Transform coding, Vector quantization, Entropy constrained scalar quantizer}
}

Journal indexed in:


  • Scopus


Included in other bibliographic databases:


  • CIS Current Index to Statistics
  • Compuscience Database on STN International and Internet
  • Computer Science Index, EBSCO Information Services
  • CrossRef, Publishers International Linking Association (PILA)
  • INSPEC Computer and Control Abstracts
  • The Database for Physics Electronics and Computing
  • LISA Library and Information Science Abstracts
  • PASCAL data base, INIST-CNRS
  • Zentralblatt fuer Mathema






