Bibliographic record number: 939190
Redescription mining augmented with random forest of multi-target predictive clustering trees
Redescription mining augmented with random forest of multi-target predictive clustering trees // Journal of Intelligent Information Systems, 50 (2018), 1; 63-96 doi:10.1007/s10844-017-0448-5 (international peer review, article, scientific)
CROSBI ID: 939190
Title
Redescription mining augmented with random forest of multi-target predictive clustering trees
Authors
Mihelčić, Matej ; Džeroski, Sašo ; Lavrač, Nada ; Šmuc, Tomislav
Source
Journal of Intelligent Information Systems (0925-9902) 50 (2018), 1; 63-96
Type, subtype and category of work
Journal papers, article, scientific
Keywords
Knowledge discovery ; Redescription mining ; Random forest ; Predictive clustering trees ; World countries ; Computer science bibliography ; Bioclimatic niches
Abstract
In this work, we present a redescription mining algorithm that uses a Random Forest of Predictive Clustering Trees (RFPCTs) to generate and iteratively improve a set of redescriptions. The approach uses information about element membership in different queries, generated from a single constructed PCT, to explore the redescription space, while queries obtained from the Random Forest of PCTs increase candidate diversity. The approach is able to produce highly accurate, statistically significant redescriptions described by Boolean, nominal or numerical attributes. As opposed to current tree-based approaches that use multi-class or binary classification, we explore the benefits of using multi-label classification and multi-target regression to create redescriptions. A major benefit of the approach, compared to other state-of-the-art solutions, is that it does not require specifying a minimum threshold on redescription accuracy to obtain a highly accurate, optimized set of redescriptions. The process of Random Forest-based augmentation and the different modes of redescription set creation are evaluated on three datasets with different properties. We use the same datasets to compare the performance of our algorithm to state-of-the-art redescription mining approaches.
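For readers unfamiliar with the term, a redescription pairs two queries, one over each attribute view, that describe approximately the same set of entities. Its accuracy is commonly measured as the Jaccard index of the two query supports. The following is a minimal Python sketch of that measure only. The toy data, the attribute names (`gdp`, `life_exp`), the thresholds, and the helper names `support` and `jaccard` are all hypothetical and not taken from the paper, and the sketch does not reproduce the RFPCT-based algorithm.

```python
# Illustrative only: the data, attribute names and thresholds are made up.

def support(rows, query):
    """Return the set of entity ids whose attributes satisfy the query."""
    return {entity for entity, attrs in rows.items() if query(attrs)}


def jaccard(a, b):
    """Jaccard index |a & b| / |a | b|; taken as 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Two views describing the same (hypothetical) entities.
view1 = {"A": {"gdp": 40.0}, "B": {"gdp": 35.0}, "C": {"gdp": 5.0}, "D": {"gdp": 30.0}}
view2 = {"A": {"life_exp": 81}, "B": {"life_exp": 79}, "C": {"life_exp": 60}, "D": {"life_exp": 70}}


def q1(attrs):
    """Query over the first view (hypothetical threshold)."""
    return attrs["gdp"] >= 30.0


def q2(attrs):
    """Query over the second view (hypothetical threshold)."""
    return attrs["life_exp"] >= 75


s1 = support(view1, q1)     # entities A, B and D
s2 = support(view2, q2)     # entities A and B
accuracy = jaccard(s1, s2)  # 2 shared entities out of 3 in the union
```

In this toy case the two queries agree on two of the three entities that either of them covers, so the pair (q1, q2) would be a redescription with accuracy 2/3.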
Original language
English
Scientific fields
Computer science
ASSOCIATIONS OF THE WORK
Projects:
HRZZ-IP-2013-11-9623 - Postupci strojnog učenja za dubinsku analizu složenih struktura podataka (DescriptiveInduction) (Gamberger, Dragan, HRZZ - 2013-11)
Institutions:
Institut "Ruđer Bošković", Zagreb
The journal is indexed in:
- Current Contents Connect (CCC)
- Web of Science Core Collection (WoSCC)
- Science Citation Index Expanded (SCI-EXP)
- SCI-EXP, SSCI and/or A&HCI
- Scopus