Bibliographic record number: 1035930

Estimation of random accuracy and its use in validation of predictive quality of classification models within predictive challenges


Lučić, Bono; Batista, Jadranko; Bojović, Viktor; Lovrić, Mario; Sović Kržić, Ana; Bešlo, Drago; Nadramija, Damir; Vikić-Topić, Dražen
Estimation of random accuracy and its use in validation of predictive quality of classification models within predictive challenges // Croatica chemica acta, 92 (2019), 3; 379-391 doi:10.5562/cca3551 (international peer review, article, scientific)


CROSBI ID: 1035930

Title
Estimation of random accuracy and its use in validation of predictive quality of classification models within predictive challenges

Authors
Lučić, Bono ; Batista, Jadranko ; Bojović, Viktor ; Lovrić, Mario ; Sović Kržić, Ana ; Bešlo, Drago ; Nadramija, Damir ; Vikić-Topić, Dražen

Source
Croatica chemica acta (0011-1643) 92 (2019), 3; 379-391

Type, subtype and category of work
Journal papers, article, scientific

Keywords
model validation ; QSPR ; QSAR ; two-class variable ; classification model ; contingency table ; estimation ; prediction ; test set ; correlation coefficient ; predictive error ; classification accuracy ; model ranking ; random accuracy

Abstract
Shortcomings of the correlation coefficient (Pearson's) as a measure for estimating and calculating the accuracy of predictive model properties are analysed. Here we discuss two such cases that can often occur in the application of the model in predicting properties of a new external set of compounds. The first problem in using the correlation coefficient is its insensitivity to the systemic error that must be expected in predicting properties of a novel external set of compounds, which is not a random sample selected from the training set. The second problem is that an external set can be arbitrarily large or small and have an arbitrary and uneven distribution of the measured value of the target variable, whose values are not known in advance. In these conditions, the correlation coefficient can be an overoptimistic measure of agreement of predicted values with the corresponding experimental values and can lead to a highly optimistic conclusion about the predictive ability of the model. Due to these shortcomings of the correlation coefficient, the use of standard error (root-mean-square-error) of prediction is suggested as a better quality measure of predictive capabilities of a model. In the case of classification models, the use of the difference between the real accuracy and the most probable random accuracy of the model shows very good characteristics in ranking different models according to predictive quality, having at the same time an obvious interpretation.
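The two points of the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. The function names and the toy data are hypothetical, and the random accuracy is assumed here to be the chance agreement expected from the marginal totals of the two-class contingency table (the same quantity used in Cohen's kappa); the paper's exact definition of the most probable random accuracy may differ.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def rmse(observed, predicted):
    """Root-mean-square error of prediction."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def accuracy_above_random(tp, fn, fp, tn):
    """Return (accuracy, random accuracy, difference) for a 2x2 table.

    ASSUMPTION: random accuracy is the chance agreement computed from
    the row and column totals; this may not match the paper exactly.
    """
    n = tp + fn + fp + tn
    accuracy = (tp + tn) / n
    random_accuracy = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / n ** 2
    return accuracy, random_accuracy, accuracy - random_accuracy


# Toy regression case: every prediction is off by a constant +2.0.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
shifted = [value + 2.0 for value in observed]
r_shifted = pearson_r(observed, shifted)   # correlation ignores the shift
rmse_shifted = rmse(observed, shifted)     # RMSE reports it

# Toy classification case: 10 positives, 90 negatives, and a model that
# predicts "negative" for every compound.
acc, rnd_acc, delta = accuracy_above_random(tp=0, fn=10, fp=0, tn=90)

print(f"r = {r_shifted:.3f}, RMSE = {rmse_shifted:.3f}")
print(f"accuracy = {acc:.2f}, random = {rnd_acc:.2f}, difference = {delta:.2f}")
```

In the first toy case the correlation stays perfect while the RMSE equals the size of the constant shift. In the second, the trivial model reaches high raw accuracy on the unbalanced set, yet the difference from the assumed random accuracy is zero, which is the kind of interpretation the abstract attributes to this measure.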

Original language
English

Scientific fields
Chemistry, Computer science, Interdisciplinary biotechnical sciences

Note
HrZZ and EU-ESF, Basic grant of MZO/RBI to Bono Lučić and SCE for Marine Bioprospecting–BioProCro (KK.01.1.1.01)



RELATED PROJECTS AND INSTITUTIONS


Projects:
EK-KF-KK.01.1.1.01.0002 - Bioprospecting Jadranskog mora [Bioprospecting of the Adriatic Sea] (Jerković, Igor; Dragović-Uzelac, Verica; Šantek, Božidar; Čož-Rakovac, Rozelinda; Kraljević Pavelić, Sandra; Jokić, Stela, EK) (CroRIS)

Basic grant of MZO/RBI to Bono Lučić and Croatian Science Foundation (DOK-01-2018)

Institutions:
Faculty of Electrical Engineering and Computing, Zagreb,
Faculty of Agrobiotechnical Sciences Osijek,
Ruđer Bošković Institute, Zagreb,
Srebrnjak Children's Hospital

Links to the full text of the work:

doi hrcak.srce.hr doi.org fulir.irb.hr

Cite this publication:

Lučić, B., Batista, J., Bojović, V., Lovrić, M., Sović Kržić, A., Bešlo, D., Nadramija, D. & Vikić-Topić, D. (2019) Estimation of random accuracy and its use in validation of predictive quality of classification models within predictive challenges. Croatica chemica acta, 92 (3), 379-391 doi:10.5562/cca3551.
@article{article,
  author = {Lu\v{c}i\'{c}, Bono and Batista, Jadranko and Bojovi\'{c}, Viktor and Lovri\'{c}, Mario and Sovi\'{c} Kr\v{z}i\'{c}, Ana and Be\v{s}lo, Drago and Nadramija, Damir and Viki\'{c}-Topi\'{c}, Dra\v{z}en},
  title = {Estimation of random accuracy and its use in validation of predictive quality of classification models within predictive challenges},
  journal = {Croatica chemica acta},
  year = {2019},
  volume = {92},
  number = {3},
  pages = {379-391},
  issn = {0011-1643},
  doi = {10.5562/cca3551},
  keywords = {model validation, QSPR, QSAR, two-class variable, classification model, contingency table, estimation, prediction, test set, correlation coefficient, predictive error, classification accuracy, model ranking, random accuracy}
}

The journal is indexed in:


  • Current Contents Connect (CCC)
  • Web of Science Core Collection (WoSCC)
    • Science Citation Index Expanded (SCI-EXP)
    • SCI-EXP, SSCI and/or A&HCI
  • Scopus






