Bibliographic record no. 152723
Avoiding data overfitting in scientific discovery: Experiments in functional genomics
Avoiding data overfitting in scientific discovery: Experiments in functional genomics // ECAI 2004 / de Mantaras, Ramon L. ; Saitta, Lorenza (eds.).
Valencia: IOS Press, 2004. pp. 470-474 (lecture, international peer review, full paper (in extenso), scientific)
CROSBI ID: 152723
Title
Avoiding data overfitting in scientific discovery: Experiments in functional genomics
Authors
Gamberger, Dragan ; Lavrač, Nada
Type, subtype and category of work
Papers in conference proceedings, full paper (in extenso), scientific
Source
ECAI 2004
/ De Mantaras, Ramon L. ; Saitta, Lorenza - Valencia : IOS Press, 2004, 470-474
Conference
16th European Conference on Artificial Intelligence
Place and date
Valencia, Spain, 22-27 August 2004
Type of participation
Lecture
Type of review
International peer review
Keywords
Data overfitting; Scientific discovery; Functional genomics
Abstract
Functional genomics is a typical scientific discovery domain characterized by a very large number of attributes (genes) relative to the number of examples (observations). The danger of data overfitting is crucial in such domains. This work presents an approach which can help in avoiding data overfitting in supervised inductive learning of short rules that are appropriate for human interpretation. The approach is based on the subgroup discovery rule learning framework, enhanced by methods of restricting the hypothesis search space by exploiting the relevancy of features that enter the rule construction process as well as their combinations that form the rules. A multi-class functional genomics problem of classifying fourteen cancer types based on more than 16000 gene expression values is used to illustrate the methodology.
Original language
English
Scientific fields
Computer science
WORK AFFILIATIONS