Bibliographic record number: 1004920

An insight into the effects of class imbalance and sampling on classification accuracy in credit risk assessment


Andrić, Kristina; Kalpić, Damir; Bohaček, Zoran
An insight into the effects of class imbalance and sampling on classification accuracy in credit risk assessment // Computer Science and Information Systems, 16 (2019), 1; 155-178 doi:10.2298/CSIS180110037A (international peer review, article, scientific)


CROSBI ID: 1004920

Title
An insight into the effects of class imbalance and sampling on classification accuracy in credit risk assessment

Authors
Andrić, Kristina ; Kalpić, Damir ; Bohaček, Zoran

Source
Computer Science and Information Systems (1820-0214) 16 (2019), 1; 155-178

Type, subtype and category of work
Journal papers, article, scientific

Keywords
credit risk assessment, imbalanced data sets, class distribution, classification algorithms, sample size, undersampling

Abstract
In this paper we investigate the role of sample size and class distribution in credit risk assessments, focusing on real-life imbalanced data sets. Choosing the optimal sample is of utmost importance for the quality of predictive models and has become an increasingly important topic with the recent advances in automating lending decision processes and the ever-growing richness in data collected by financial institutions. To address the observed research gap, a large-scale experimental evaluation of real-life data sets of different characteristics was performed, using several classification algorithms and performance measures. Results indicate that various factors play a role in determining the optimal class distribution, namely the performance measure, classification algorithm and data set characteristics. The study also provides valuable insight into how to design the training sample to maximize prediction performance and into the suitability of using different classification algorithms by assessing their sensitivity to class imbalance and sample size.
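
The abstract does not spell out the sampling procedure, so the following is only a minimal sketch of plain random undersampling, one common way to vary the class distribution of a training sample. The function name `undersample`, the parameter `minority_share`, and the toy labels are assumptions made for illustration and are not taken from the paper.

```python
import random
from collections import Counter


def undersample(labels, minority_share, seed=0):
    """Return sorted row indices of a sample in which the minority class
    makes up roughly `minority_share` of the rows.

    Hypothetical helper: every minority row is kept and majority rows
    are dropped at random (plain random undersampling).
    """
    if not 0 < minority_share < 1:
        raise ValueError("minority_share must be strictly between 0 and 1")
    counts = Counter(labels)
    if len(counts) != 2:
        raise ValueError("expected exactly two classes")

    # The less frequent label is treated as the minority class.
    minority_label = min(counts, key=counts.get)
    minority_idx = [i for i, y in enumerate(labels) if y == minority_label]
    majority_idx = [i for i, y in enumerate(labels) if y != minority_label]

    # Number of majority rows needed to reach the requested share,
    # capped at the number of majority rows actually available.
    n_major = round(len(minority_idx) * (1 - minority_share) / minority_share)
    n_major = min(n_major, len(majority_idx))

    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    kept_majority = rng.sample(majority_idx, n_major)
    return sorted(minority_idx + kept_majority)


# Toy data (invented): 1 = default, 0 = non-default, 5% minority share.
labels = [1] * 50 + [0] * 950
idx = undersample(labels, minority_share=0.25)
sample_counts = Counter(labels[i] for i in idx)
```

In this toy run the 50 defaults are all kept and 150 non-defaults are drawn, giving a 25% minority share. Repeating the call over a grid of `minority_share` values and training a classifier on each sample is one way an experiment of the kind described could be organised, though the study's actual protocol may differ.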

Original language
English

Scientific fields
Computing



AFFILIATIONS OF THE WORK


Institutions:
Fakultet elektrotehnike i računarstva (Faculty of Electrical Engineering and Computing), Zagreb

Profiles:

Zoran Bohaček (author)

Damir Kalpić (author)

Links to the full text of the work:

doi www.doiserbia.nb.rs

Cite this publication:

Andrić, Kristina; Kalpić, Damir; Bohaček, Zoran
An insight into the effects of class imbalance and sampling on classification accuracy in credit risk assessment // Computer Science and Information Systems, 16 (2019), 1; 155-178 doi:10.2298/CSIS180110037A (international peer review, article, scientific)
Andrić, K., Kalpić, D. & Bohaček, Z. (2019) An insight into the effects of class imbalance and sampling on classification accuracy in credit risk assessment. Computer Science and Information Systems, 16 (1), 155-178 doi:10.2298/CSIS180110037A.
@article{article, author = {Andri\'{c}, Kristina and Kalpi\'{c}, Damir and Boha\v{c}ek, Zoran}, title = {An insight into the effects of class imbalance and sampling on classification accuracy in credit risk assessment}, journal = {Computer Science and Information Systems}, year = {2019}, volume = {16}, number = {1}, pages = {155--178}, issn = {1820-0214}, doi = {10.2298/CSIS180110037A}, keywords = {credit risk assessment, imbalanced data sets, class distribution, classification algorithms, sample size, undersampling} }

Journal indexed in:


  • Web of Science Core Collection (WoSCC)
    • Science Citation Index Expanded (SCI-EXP)
    • SCI-EXP, SSCI and/or A&HCI
  • Scopus






