Bibliographic record number: 1531
Noise elimination in inductive concept learning: a case study in medical diagnosis
Noise elimination in inductive concept learning: a case study in medical diagnosis // 7th International Workshop on Algorithmic Learning Theory ALT-96 / Arikawa, Setsuo ; Sharma, Arun K. (eds.).
Berlin: Springer, 1996. pp. 199-212 (lecture, international peer review, full paper (in extenso), scientific)
CROSBI ID: 1531
Title
Noise elimination in inductive concept learning: a case study in medical diagnosis
Authors
Gamberger, Dragan ; Lavrač, Nada ; Džeroski, Sašo
Type, subtype and category of work
Papers in conference proceedings, full paper (in extenso), scientific
Source
7th International Workshop on Algorithmic Learning Theory ALT-96
/ Arikawa, Setsuo ; Sharma, Arun K. - Berlin : Springer, 1996, 199-212
Conference
7th International Workshop ALT '96
Place and date
Sydney, Australia, 23.10.1996 - 25.10.1996
Type of participation
Lecture
Type of review
International peer review
Keywords
machine learning; noise handling; rheumatic diseases
Abstract
Compression measures used in inductive learners, such as measures based on the MDL (Minimum Description Length) principle, provide a theoretically justified basis for grading candidate hypotheses. Compression-based induction is also appropriate for handling noisy data. This paper shows that a simple compression measure can be used to detect noisy examples. A technique is proposed in which noisy examples are detected and eliminated from the training set, and a hypothesis is then built from the set of remaining examples. The separation of noise detection and hypothesis formation has the advantage that noisy examples do not influence hypothesis construction, as opposed to most standard approaches to noise handling, in which the learner typically tries to avoid overfitting the noisy example set. This noise elimination method is applied to the problem of early diagnosis of rheumatic diseases, which is known to be difficult due both to its nature and to the imperfections in the dataset. The method is evaluated by applying the noise elimination algorithm in conjunction with the CN2 rule induction algorithm and by comparing the resulting performance with earlier results obtained by CN2 in this diagnostic domain.
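The detect-then-learn idea in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm or their compression measure: it assumes Boolean features, uses a greedy count of literals as a stand-in complexity measure, and flags an example as suspicious when removing it shrinks that count by at least a hypothetical threshold `min_gain`. The names `literals_needed`, `detect_noise`, `min_gain`, and the toy `data` are illustrative assumptions only.

```python
from itertools import product


def literals_needed(examples):
    """Greedy count of (feature, value) literals needed to separate every
    positive example from every negative one (assumed stand-in measure)."""
    pos = [x for x, label in examples if label]
    neg = [x for x, label in examples if not label]
    if not pos or not neg:
        return 0
    n_features = len(pos[0])
    # Every (positive, negative) pair must be told apart by some literal.
    uncovered = set(product(range(len(pos)), range(len(neg))))
    count = 0
    while uncovered:
        best = set()
        for f in range(n_features):
            for v in (True, False):
                covered = {(i, j) for (i, j) in uncovered
                           if pos[i][f] == v and neg[j][f] != v}
                if len(covered) > len(best):
                    best = covered
        if not best:
            # Identical positive and negative examples cannot be separated;
            # charge one unit per remaining pair and stop.
            return count + len(uncovered)
        uncovered -= best
        count += 1
    return count


def detect_noise(examples, min_gain=2):
    """Indices of examples whose removal saves at least min_gain literals.
    min_gain is a hypothetical threshold, not a value from the paper."""
    base = literals_needed(examples)
    return [k for k in range(len(examples))
            if base - literals_needed(examples[:k] + examples[k + 1:]) >= min_gain]


T, F = True, False
# Toy target concept: label equals feature 0; the last example is mislabeled.
data = [
    ((T, F, F), True), ((T, T, F), True), ((T, F, T), True), ((T, T, T), True),
    ((F, F, F), False), ((F, T, F), False), ((F, F, T), False),
    ((F, T, T), True),
]
noisy = detect_noise(data)
cleaned = [ex for k, ex in enumerate(data) if k not in noisy]
print(noisy, literals_needed(data), literals_needed(cleaned))
```

In this sketch the filtered list `cleaned` would then be passed to a separate learner (CN2 in the paper), so the flagged example plays no part in building the hypothesis.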
Original language
English
Scientific fields
Electrical engineering
AFFILIATIONS OF THE WORK