Bibliographic record number: 1129013
An empirical study of classification algorithms when dealing with the problem of class imbalance and other data intrinsic characteristics
An empirical study of classification algorithms when dealing with the problem of class imbalance and other data intrinsic characteristics // Abstract Book - Fifth International Workshop on Data Science / Lončarić, Sven - Zagreb : Centre of Research Excellence for Data Science and Cooperative Systems Research Unit for Data Science, 2020, 38-41
Zagreb, Croatia, 2020, pp. 38-41 (workshop, not peer-reviewed, short communication, scientific)
CROSBI ID: 1129013
Title
An empirical study of classification algorithms when dealing with the problem of class imbalance and other data intrinsic characteristics
Authors
Dudjak, Mario ; Martinović, Goran
Type, subtype and category of work
Conference abstracts, short communication, scientific
Source
Abstract Book - Fifth International Workshop on Data Science / Lončarić, Sven - Zagreb : Centre of Research Excellence for Data Science and Cooperative Systems Research Unit for Data Science, 2020, 38-41
Conference
5th International Workshop on Data Science (IWDS 2020)
Place and date
Zagreb, Croatia, 24 November 2020
Type of participation
Workshop
Type of review
Not peer-reviewed
Keywords
class imbalance ; class overlapping ; small disjuncts
Abstract
Evaluating and comparing the performance and behaviour of different algorithms is a pivotal step when applying machine learning in various application domains. Nevertheless, learning the concepts of real-world problems is a challenging task because of the different intrinsic characteristics that may be present in such datasets. Since not all machine learning algorithms are made equal, these characteristics do not affect their behaviour uniformly. This paper presents a large-scale empirical study of four different types of classifiers in which we try to determine and rank the degrees of correlation between their performance and the level of class imbalance, data rarity, small disjuncts, class overlapping and noise, and provide insight into classifier behaviour when faced with these problems.
Original language
English
Scientific fields
Computing
WORK AFFILIATION
Institutions:
Fakultet elektrotehnike, računarstva i informacijskih tehnologija Osijek