Bibliographic record no.: 1242609
Speaker Identification Using Small Artificial Neural Network on Small Dataset
Speaker Identification Using Small Artificial Neural Network on Small Dataset // Proceedings of International Conference on Smart Systems and Technologies (SST 2022)
Osijek: Institute of Electrical and Electronics Engineers (IEEE), 2022, pp. 141-145, doi:10.1109/sst55530.2022.9954727 (lecture, international peer review, full paper (in extenso), scientific)
CROSBI ID: 1242609
Title
Speaker Identification Using Small Artificial Neural Network on Small Dataset
Authors
Loina, Luka
Type, subtype and category of work
Papers in conference proceedings, full paper (in extenso), scientific
Source
Proceedings of International Conference on Smart Systems and Technologies (SST 2022)
Osijek: Institute of Electrical and Electronics Engineers (IEEE), 2022, pp. 141-145
ISBN
978-1-6654-8215-8
Conference
International Conference on Smart Systems and Technologies 2022 (SST 2022)
Place and date
Osijek, Croatia, 19-21 October 2022
Type of participation
Lecture
Type of review
International peer review
Keywords
speaker recognition ; neural networks ; artificial intelligence ; preprocessing methods
Abstract
Speaker recognition answers the question “Who is speaking?”. Most research in the field focuses on models that recognize thousands of speakers and relies on large datasets to do so. In this paper, we explore the possibility of using small neural networks that can be trained quickly on small datasets to recognize a small number of speakers with high accuracy. To investigate this, an experimental analysis was conducted using hyperparameter optimization to find the optimal combination of structure and parameters for multiple neural network configurations. Furthermore, the investigation was performed with multiple preprocessing methods to determine the effect of preprocessing on the size and accuracy of the resulting models. As a result of the experiment, we found a residual neural network with 387,461 parameters that was able to classify speakers with high accuracy. In comparison, state-of-the-art speaker recognition uses 4.2 million parameters.
Original language
English
Scientific fields
Computing
RELATED TO THIS WORK
Institutions:
Fakultet elektrotehnike, računarstva i informacijskih tehnologija Osijek
Profiles:
Luka Loina
(author)