Bibliographic record no.: 1002968
Emotion Classification Based on Convolutional Neural Network Using Speech Data
Emotion Classification Based on Convolutional Neural Network Using Speech Data // MIPRO 2019 42nd International Convention May 20 – 24, 2019 Opatija, Croatia Proceedings / Skala, Karolj (ed.).
Rijeka: Hrvatska udruga za informacijsku i komunikacijsku tehnologiju, elektroniku i mikroelektroniku - MIPRO, 2019. pp. 1191-1196 (lecture, international peer review, full paper (in extenso), scientific)
CROSBI ID: 1002968
Title
Emotion Classification Based on Convolutional Neural Network Using Speech Data
Authors
Vrebčević, Nikola ; Mijić, Igor ; Petrinović, Davor
Type, subtype and category of work
Papers in conference proceedings, full paper (in extenso), scientific
Source
MIPRO 2019 42nd International Convention May 20 – 24, 2019 Opatija, Croatia Proceedings
/ Skala, Karolj - Rijeka : Hrvatska udruga za informacijsku i komunikacijsku tehnologiju, elektroniku i mikroelektroniku - MIPRO, 2019, 1191-1196
Conference
MIPRO 2019 - 42nd International Convention
Place and date
Rijeka, Croatia, 20.05.2019 - 24.05.2019
Type of participation
Lecture
Type of peer review
International peer review
Keywords
emotions ; speech ; emotion classification ; convolutional neural network ; deep learning
Abstract
The human voice is the most frequently used mode of communication among people. It carries both linguistic and paralinguistic information. For an emotion classification task, the paralinguistic information is the important part, because it describes the current affective state of the speaker. This affective information can be used in health care, in customer service enhancement and in the entertainment industry. Previous research in the field mostly relied on handcrafted features derived from speech signals, which were then used to construct mainly statistical models. Today, new technologies make it possible to design models that both extract features and perform classification. This preliminary research explores the performance of a model that combines a convolutional neural network for feature extraction with a deep neural network that performs emotion classification. The model consists of three convolutional layers, which filter input spectrograms in the time and frequency dimensions, followed by two dense layers that form the deep part of the model. The unified neural network is trained and tested on spectrograms of speech utterances from the Berlin database of emotional speech.
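The architecture described in the abstract (three convolutional layers over a spectrogram, then two dense layers) can be illustrated with a small shape walk-through. This is only a sketch under stated assumptions: the record gives no hyperparameters, so the input size, kernel sizes, strides, filter counts and dense widths below are all hypothetical, as are the helper names Conv2D, Dense and trace. The seven output units assume one unit per emotion category in the Berlin database; the record does not state the number of classes used in the paper.

```python
from dataclasses import dataclass


@dataclass
class Conv2D:
    """Unpadded 2-D convolution over a (frequency, time, channels) input."""
    filters: int
    kernel: tuple            # (frequency, time) extent of each filter
    stride: tuple = (1, 1)

    def out_shape(self, shape):
        freq, time, _ = shape
        kf, kt = self.kernel
        sf, st = self.stride
        return ((freq - kf) // sf + 1, (time - kt) // st + 1, self.filters)

    def n_params(self, shape):
        kf, kt = self.kernel
        # one weight per kernel cell and input channel, plus one bias per filter
        return (kf * kt * shape[2] + 1) * self.filters


@dataclass
class Dense:
    """Fully connected layer applied to the flattened input."""
    units: int

    def out_shape(self, shape):
        return (self.units,)

    def n_params(self, shape):
        n_in = 1
        for dim in shape:
            n_in *= dim
        return (n_in + 1) * self.units


def trace(input_shape, layers):
    """Return a list of (output_shape, parameter_count), one entry per layer."""
    rows, shape = [], input_shape
    for layer in layers:
        params = layer.n_params(shape)
        shape = layer.out_shape(shape)
        rows.append((shape, params))
    return rows


# Hypothetical configuration: none of these numbers come from the paper.
SPECTROGRAM = (128, 256, 1)          # frequency bins, time frames, channels
MODEL = [
    Conv2D(16, (5, 5), (2, 2)),
    Conv2D(32, (3, 3), (2, 2)),
    Conv2D(64, (3, 3), (2, 2)),
    Dense(128),
    Dense(7),                        # assumed: one output per emotion class
]

if __name__ == "__main__":
    for layer, (shape, params) in zip(MODEL, trace(SPECTROGRAM, MODEL)):
        print(f"{type(layer).__name__:7s} -> {shape}, {params} parameters")
```

With these assumed sizes, most of the parameters sit in the first dense layer, which sees the flattened output of the last convolutional layer. That is typical of this kind of stack and says nothing about the actual model in the paper.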
Original language
English
Scientific fields
Electrical engineering, Computing, Cognitive science (natural, technical, biomedical and health, social sciences and humanities)
WORK AFFILIATIONS
Projects:
0036054
KK.01.1.1.01.009.
DOK-2018-01-2976
HRZZ-IP-2014-09-2625 - Iznad Nyquistove granice (BeyondLimit) (Seršić, Damir, HRZZ) (CroRIS)
Institutions:
Fakultet elektrotehnike i računarstva, Zagreb