Searching for credible relations in machine learning

Vidulin, Vedrana

Bibliographic record number: 1253080

Searching for credible relations in machine learning, 2012, doctoral dissertation, Ljubljana

CROSBI ID: 1253080

Title
Searching for credible relations in machine learning

Authors
Vidulin, Vedrana

Type, subtype and category of work
Theses, doctoral dissertation

Place
Ljubljana

Date
03.02

Year
2012

Pages
149

Mentors
Gams, Matjaž ; Filipič, Bogdan

Keywords
Interactive data mining, Interactive machine learning, Interactive explanation structure, Relation-extraction scheme, Domain analysis, Human–computer interaction

Abstract
Can a model constructed by machine learning or data mining programs be trusted? For example, it is known that a decision tree model can contain less credible parts caused by pathologies in induction algorithms, by noise and missing values in the data, or simply by the complexity of the domain. Such models typically contain relations that are statistically significant but meaningless in reality. Meaningless relations are problematic because they undermine the user's trust in the data mining system and can lead to wrong conclusions about the most important relations in the domain. In this thesis we propose an interactive method for the construction of credible relations in complex domains, named Human-Machine Data Mining (HMDM). The basic idea of our approach is to construct a large number of models and extract from them the credible relations, i.e., relations that are meaningful and of high quality. The task is computationally very demanding, and in all but the simplest cases humans cannot analyze a meaningful share of the hypothesized models on their own. However, the proposed combination of human understanding and raw computer power enables a smart examination of those parts of the huge search space that contain the most credible models. While data mining methods perform the search, humans examine and evaluate the results, draw conclusions, and repeat the search in the direction that seems most promising given the previous attempts. In this way, humans guide the data mining towards the subspaces with the most credible models and finally construct the overall conclusions from the various most interesting solutions. HMDM defines a toolbox composed of semi-automated data mining procedures and a set of scenarios that the human uses to guide the analysis towards credible models.
Furthermore, it defines a scheme for the extraction of credible relations from multiple models, which supports the human analyst in constructing correct conclusions about the domain. The proposed approach is demonstrated in two complex domains that show how the higher education sector and the research and development sector are related to economic welfare. In addition, we showed in the domain of automatic web genre identification that HMDM can also be used successfully for learning predictive models. A user study justified the HMDM method by showing that users are frequently unable to detect meaningless relations by observing a single model constructed by a machine learning algorithm. However, by observing interesting variations, i.e., candidate solutions suggested by the HMDM method, the participants recognized the weaknesses of the default model and created better domain models.

Original language
English

RELATED RECORDS

Profiles:

Vedrana Vidulin (author)

Links to the full text of the work:

vedranavidulin.com

CROSBI: Croatian Scientific Bibliography