Bibliographic record no.: 1275014

Impact of missing values on the performance of machine learning algorithms


Radišić, Bojan; Seljan, Sanja; Dunđer, Ivan
Impact of missing values on the performance of machine learning algorithms // CEUR Workshop Proceedings: Recent Trends and Applications in Computer Science and Information Technology (RTA-CSIT 2023) / Xhina, Endrit ; Hoxha, Klesti (eds.).
Tirana: University of Tirana, Faculty of Natural Sciences, Department of Informatics, 2023. pp. 54-62 (lecture, international peer review, full paper (in extenso), scientific)


CROSBI ID: 1275014

Title
Impact of missing values on the performance of machine learning algorithms

Authors
Radišić, Bojan ; Seljan, Sanja ; Dunđer, Ivan

Type, subtype and category of work
Papers in conference proceedings, full paper (in extenso), scientific

Source
CEUR Workshop Proceedings: Recent Trends and Applications in Computer Science and Information Technology (RTA-CSIT 2023) / Xhina, Endrit ; Hoxha, Klesti - Tirana : University of Tirana, Faculty of Natural Sciences, Department of Informatics, 2023, 54-62

Conference
5th International Conference on Recent Trends and Applications in Computer Science and Information Technology (RTA-CSIT)

Place and date
Tirana, Albania, 26.04.2023 - 27.05.2023

Type of participation
Lecture

Type of review
International peer review

Keywords
machine learning ; neural network ; missing data ; confusion matrix ; accuracy

Abstract
Machine learning (ML) can be used to analyze and predict student success outcome in order to avoid various problems and to plan future actions for helping students overcome difficulties during their study. This paper analyzes data from a digital system of 309 students who were enrolled in the Specialist Study in Trade Business at the Faculty of Tourism and Rural Development from 2010 to 2018. The paper explores the impact of four different data sets on the performance of ML algorithms. The first data set is with partially missing data on the length of study (around 7%), the second one uses arithmetic means in place of missing data, the third is based on median values, whereas the fourth uses the geometric mean instead. Four popular ML algorithms were considered: k-Nearest Neighbors (KNN), Naïve Bayes (NB), Random Forest (RF) and Probabilistic Neural Network (PNN). All of them are used for predicting student success based on achieved ECTS credit points. The aim of this paper is to compare and analyze the impact of missing values on the results of individual ML algorithms.
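
The three replacement strategies named in the abstract (arithmetic mean, median, geometric mean) can be illustrated with a minimal sketch. This is not the authors' code: the record does not say which tools they used, and the function name `impute`, the variable names and the sample values below are hypothetical. It assumes missing entries are marked as `None` or NaN and that all observed values are positive, which the geometric mean requires.

```python
import math
from statistics import mean, median, geometric_mean

# Map each strategy name to the statistic used as the fill value.
STRATEGIES = {
    "mean": mean,                      # arithmetic mean
    "median": median,
    "geometric_mean": geometric_mean,  # requires positive observed values
}


def is_missing(value):
    """Treat None and NaN as missing."""
    return value is None or math.isnan(value)


def impute(values, strategy):
    """Return a copy of `values` with missing entries replaced by the
    chosen statistic computed over the observed entries only."""
    observed = [v for v in values if not is_missing(v)]
    fill = STRATEGIES[strategy](observed)
    return [fill if is_missing(v) else v for v in values]


# Hypothetical length-of-study values (in years); None marks a missing record.
length_of_study = [2.0, 4.0, None, 8.0, 2.0]

# One imputed variant per strategy; the untouched list plays the role of
# the data set with missing values.
variants = {name: impute(length_of_study, name) for name in STRATEGIES}
```

Each entry in `variants` could then be passed to the same classifier so that any difference in accuracy can be attributed to the imputation strategy rather than to the model.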

Original language
English

Scientific fields
Computing, Information and Communication Sciences



RELATED TO THIS WORK


Projects:
--11-933-1053 - Machine learning and natural language processing in the domain of computer security – Part II (Seljan, Sanja) (CroRIS)

Institutions:
Faculty of Humanities and Social Sciences, Zagreb
Faculty of Tourism and Rural Development in Požega

Profiles:

Ivan Dunđer (author)

Sanja Seljan (author)

Bojan Radišić (author)

Cite this publication:

Radišić, Bojan; Seljan, Sanja; Dunđer, Ivan
Impact of missing values on the performance of machine learning algorithms // CEUR Workshop Proceedings: Recent Trends and Applications in Computer Science and Information Technology (RTA-CSIT 2023) / Xhina, Endrit ; Hoxha, Klesti (eds.).
Tirana: University of Tirana, Faculty of Natural Sciences, Department of Informatics, 2023. pp. 54-62 (lecture, international peer review, full paper (in extenso), scientific)
Radišić, B., Seljan, S. & Dunđer, I. (2023) Impact of missing values on the performance of machine learning algorithms. In: Xhina, E. & Hoxha, K. (eds.) CEUR Workshop Proceedings: Recent Trends and Applications in Computer Science and Information Technology (RTA-CSIT 2023).
@inproceedings{article,
  author    = {Radi\v{s}i\'{c}, Bojan and Seljan, Sanja and Dun{\dj}er, Ivan},
  title     = {Impact of missing values on the performance of machine learning algorithms},
  booktitle = {CEUR Workshop Proceedings: Recent Trends and Applications in Computer Science and Information Technology (RTA-CSIT 2023)},
  editor    = {Xhina, Endrit and Hoxha, Klesti},
  year      = {2023},
  pages     = {54--62},
  keywords  = {machine learning, neural network, missing data, confusion matrix, accuracy},
  publisher = {University of Tirana, Faculty of Natural Sciences, Department of Informatics},
  address   = {Tirana, Albania}
}

Journal indexed in:


  • Scopus