High Performance Processing for Speech Recognition (CROSBI ID 206780)
Journal contribution | original scientific paper | international peer review
Responsibility information
Ramljak, Milan ; Stella, Maja ; Šarić, Matko
English
High Performance Processing for Speech Recognition
The evolution of computer technology, including operating systems and applications, has led to the design of intelligent machines that can recognize spoken words and determine their meaning. Over the years, the processing time required for speech recognition has improved significantly, thanks not only to better algorithms but also to the greater processing power of today's computers. In this paper we analyze the processing time and reconstructed speech quality of three common front-end methods for calculating coefficients: Linear Predictive Coding (LPC), Mel-Frequency Cepstrum (MFC), and Perceptual Linear Prediction (PLP). Reconstructed speech quality is measured with the Perceptual Evaluation of Speech Quality (PESQ) score. Our analysis shows that, if required, a higher number of MFC and PLP coefficients can be used without a significant impact on processing time. Another very important factor in processing time is the choice of back-end. We propose a high-performance neural network back-end implementation on a distributed system based on the Erlang programming language. Erlang processes can act as neural network neurons, and asynchronous message exchange provides the connections between processes, turning an Erlang program into a regular neural network structure. With this kind of neural network implementation we obtained a significant increase in performance.
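The idea of neurons as independent processes connected by asynchronous messages can be illustrated with a minimal sketch. It is not the authors' Erlang implementation: Python threads and queues stand in for Erlang processes and mailboxes, and the network topology, the weights, and the names `Neuron` and `forward` are assumptions chosen only for illustration.

```python
import math
import queue
import threading


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class Neuron(threading.Thread):
    """One neuron as an independent worker with its own mailbox.

    Assumption: a thread plus a queue plays the role of an Erlang
    process and its message queue.
    """

    def __init__(self, neuron_id, weights, bias, targets):
        super().__init__(daemon=True)
        self.neuron_id = neuron_id
        self.weights = weights        # {sender_id: weight}
        self.bias = bias
        self.targets = targets        # mailboxes that receive the output
        self.mailbox = queue.Queue()

    def run(self):
        # Collect one message from every incoming connection,
        # in whatever order the messages happen to arrive.
        received = {}
        while len(received) < len(self.weights):
            sender, value = self.mailbox.get()
            received[sender] = value
        # Sum in a fixed order so the result does not depend on arrival order.
        total = self.bias + sum(w * received[s] for s, w in self.weights.items())
        output = sigmoid(total)
        # Fire: send the activation to every connected mailbox.
        for target in self.targets:
            target.put((self.neuron_id, output))


def forward(x1, x2):
    """Run one forward pass through a hypothetical 2-2-1 network."""
    result = queue.Queue()
    out = Neuron("out", {"h1": 1.0, "h2": -1.0}, 0.0, [result])
    h1 = Neuron("h1", {"x1": 0.5, "x2": 0.5}, 0.0, [out.mailbox])
    h2 = Neuron("h2", {"x1": 1.0, "x2": -1.0}, 0.0, [out.mailbox])
    for neuron in (out, h1, h2):
        neuron.start()
    # The inputs are delivered as ordinary messages to the hidden neurons.
    for hidden in (h1, h2):
        hidden.mailbox.put(("x1", x1))
        hidden.mailbox.put(("x2", x2))
    _, y = result.get(timeout=5)
    return y
```

Calling `forward(0.0, 0.0)` sends the two inputs to the hidden neurons and waits for the output neuron's message. Because each neuron waits only on its own mailbox, neurons in the same layer can compute concurrently, which is the property the abstract attributes to the Erlang design.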
speech recognition; coefficients; PESQ; processing time; neural network; Erlang
Publication details
8
2014.
166-172
published
1998-4464