CroRIS - CROSBI

Data source: CROSBI

Searching for credible relations in machine learning (CROSBI ID 458145)

Thesis | doctoral dissertation

Vidulin, Vedrana. Searching for credible relations in machine learning / Gams, Matjaž ; Filipič, Bogdan (mentors); Ljubljana, 2012

Responsibility data

Authors

Vidulin, Vedrana

Mentors

Gams, Matjaž ; Filipič, Bogdan

Basic data in the original language
Basic data in other languages

Language

English

Title

Searching for credible relations in machine learning

Abstract

Can a model constructed by machine learning or data mining programs be trusted? For example, it is known that a decision tree model can contain less-credible parts caused by pathologies in induction algorithms, noise and missing values in data, or simply because of the complexity of a domain. Such models typically contain relations that are statistically significant, but in reality meaningless. Meaningless relations are problematic since they undermine the user’s trust in the data mining system and can also lead to wrong conclusions about the most important relations in the domain. In this thesis we propose an interactive method for the construction of credible relations in complex domains, named Human-Machine Data Mining (HMDM). The basic idea of our approach is to construct a large number of models to extract the credible relations, i.e., relations that are meaningful and of high quality. The task is computationally very demanding, and for other than simple cases there is no possibility for humans to analyze a meaningful share of all the hypothesized models on their own. However, the introduced combination of human understanding and raw computer power enables a smart examination of the parts of the huge search space with most credible models. While data mining methods perform the search, humans examine and evaluate the results, make conclusions and redo the search in a way that seems to be the most promising based on the previous attempts. In this way, the humans guide the data mining to search the subspaces with the most credible models and finally the humans construct the overall conclusions from the various, most interesting solutions. The HMDM defines a toolbox composed of semi-automated data mining procedures and a set of scenarios for the human to guide the analysis towards credible models.
Furthermore, it defines a scheme for the extraction of credible relations from multiple models, which provides support to the human analyst in the process of constructing correct conclusions about the domain. The proposed approach is demonstrated in two complex domains that show how the higher education and the research and development sectors are related to economic welfare. In addition, we showed in a domain of automatic web genre identification that HMDM can be successfully used for learning predictive models in another domain. A user study justified the HMDM method by showing that the users are frequently not able to detect meaningless relations by observing a single model constructed by a machine learning algorithm. However, by observing interesting variations, i.e., candidate solutions suggested by the HMDM method, the participants realized the weaknesses of the default model and created better domain models.

Keywords

Interactive data mining, Interactive machine learning, Interactive explanation structure, Relation-extraction scheme, Domain analysis, Human–computer interaction

Note

not recorded

Language

not recorded

Title

not recorded

Abstract

not recorded

Keywords

not recorded

Note

not recorded

Publication data

Number of pages

149

Defense date

03.02.2012.

Publication status

defended

Data on the degree-granting institution

Place

Ljubljana

Related records

Related persons

Vedrana Vidulin (author)

Field

not recorded

Links

vedranavidulin.com