
Bibliographic record number: 939190

Redescription mining augmented with random forest of multi-target predictive clustering trees


Mihelčić, Matej; Džeroski, Sašo; Lavrač, Nada; Šmuc, Tomislav
Redescription mining augmented with random forest of multi-target predictive clustering trees // Journal of Intelligent Information Systems, 50 (2018), 1; 63-96 doi:10.1007/s10844-017-0448-5 (international peer review, article, scientific)


Title
Redescription mining augmented with random forest of multi-target predictive clustering trees

Authors
Mihelčić, Matej ; Džeroski, Sašo ; Lavrač, Nada ; Šmuc, Tomislav

Source
Journal of Intelligent Information Systems (0925-9902) 50 (2018), 1; 63-96

Type, subtype and category of work
Journal papers, article, scientific

Keywords
Knowledge discovery ; Redescription mining ; Random forest ; Predictive clustering trees ; World countries ; Computer science bibliography ; Bioclimatic niches

Abstract
In this work, we present a redescription mining algorithm that uses a Random Forest of Predictive Clustering Trees (RFPCTs) for generating and iteratively improving a set of redescriptions. The approach uses information about element membership in different queries, generated from a single constructed PCT, to explore the redescription space, while queries obtained from the Random Forest of PCTs increase candidate diversity. The approach is able to produce highly accurate, statistically significant redescriptions described by Boolean, nominal or numerical attributes. As opposed to current tree-based approaches that use multi-class or binary classification, we explore the benefits of using multi-label classification and multi-target regression to create redescriptions. A major benefit of the approach, compared to other state-of-the-art solutions, is that it does not require specifying a minimal threshold on redescription accuracy to obtain a highly accurate, optimized set of redescriptions. The process of Random Forest based augmentation and different modes of redescription set creation are evaluated on three datasets with different properties. We use the same datasets to compare the performance of our algorithm to state-of-the-art redescription mining approaches.
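
For readers unfamiliar with the term, the following minimal sketch illustrates what a single redescription is: a pair of queries over two different attribute sets (views) that select nearly the same entities. It does not implement the RFPCT-based algorithm from the paper. The toy data, the attribute names (`gdp`, `life_exp`), the thresholds, and the helper names `support` and `jaccard` are all hypothetical, and the use of the Jaccard index as the accuracy measure is an assumption based on common practice in redescription mining; the abstract does not name the measure.

```python
def support(view, query):
    """Return the set of entity ids whose attributes satisfy the query."""
    return {entity for entity, attrs in view.items() if query(attrs)}


def jaccard(a, b):
    """Jaccard index of two sets; defined as 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical toy data: the same four entities described by two views.
view1 = {"A": {"gdp": 50.0}, "B": {"gdp": 45.0}, "C": {"gdp": 5.0}, "D": {"gdp": 40.0}}
view2 = {"A": {"life_exp": 82}, "B": {"life_exp": 80}, "C": {"life_exp": 60}, "D": {"life_exp": 70}}

# One query per view; together they form a candidate redescription.
def q1(attrs):
    return attrs["gdp"] > 30.0


def q2(attrs):
    return attrs["life_exp"] >= 75


s1 = support(view1, q1)
s2 = support(view2, q2)

# Assumed accuracy measure: overlap of the two supports.
accuracy = jaccard(s1, s2)
print(sorted(s1), sorted(s2), round(accuracy, 3))
```

A value close to 1 means the two queries describe almost the same set of entities; in this toy case the supports differ by one entity.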

Original language
English

Scientific fields
Computer science



WORK AFFILIATIONS


Project / topic
HRZZ-IP-2013-11-9623 - Machine learning methods for in-depth analysis of complex data structures (Dragan Gamberger)

Institutions
Institut "Ruđer Bošković", Zagreb

The journal is indexed in:


  • Current Contents Connect (CCC)
  • Web of Science Core Collection (WoSCC)
    • Science Citation Index Expanded (SCI-EXP)
    • SCI-EXP, SSCI and/or A&HCI
  • Scopus

