Bibliographic record number: 1253589
Hierarchy Decomposition Pipeline: A Toolbox for Comparison of Model Induction Algorithms on Hierarchical Multi-label Classification Problems
In: Discovery Science, DS 2020, Lecture Notes in Computer Science / Appice, A.; Tsoumakas, G.; Manolopoulos, Y.; Matwin, S. (eds.).
Online: Springer, 2020, pp. 486-501. doi:10.1007/978-3-030-61527-7_32 (lecture, international peer review, full paper (in extenso), scientific)
CROSBI ID: 1253589
Title
Hierarchy Decomposition Pipeline: A Toolbox for Comparison of Model Induction
Algorithms on Hierarchical Multi-label Classification Problems
Authors
Vidulin, Vedrana ; Džeroski, Sašo
Type, subtype and category of work
Papers in conference proceedings, full paper (in extenso), scientific
Source
Discovery Science, DS 2020, Lecture Notes in Computer Science
/ Appice, A.; Tsoumakas, G.; Manolopoulos, Y.; Matwin, S. (eds.). Springer, 2020, pp. 486-501
ISBN
978-3-030-61526-0
Conference
23rd International Conference on Discovery Science (DS 2020)
Place and date
Online, 19-21 October 2020
Type of participation
Lecture
Type of peer review
International peer review
Keywords
Hierarchical multi-label classification, Hierarchy decomposition, Structured prediction
Abstract
Hierarchical multi-label classification (HMC) is a supervised machine learning task in which each example can be assigned more than one label and the possible labels are organized in a hierarchy. HMC problems emerge in domains such as functional genomics, habitat modelling, and text and image categorization. They can be addressed with global model induction algorithms, which induce a single model that predicts the complete hierarchy, as well as with local algorithms, which induce multiple models that predict different segments of the hierarchy. However, there is no consensus about which of these approaches performs best across different domains, especially in the setting of learning ensembles. We introduce the hierarchy decomposition pipeline, a publicly available toolbox for comparison of model induction algorithms on HMC problems in an ensemble setting. The pipeline includes five algorithms: the algorithm that predicts the complete hierarchy, and algorithms that perform partial and complete hierarchy decompositions. One of these algorithms is the novel “label specialization” algorithm, which constructs, for each parent label in a hierarchy, a local multi-label classification model that simultaneously predicts the respective children labels. We apply the pipeline to ten HMC data sets from four domains, which have both tree and directed acyclic graph label hierarchies, and confirm that there is no single best algorithm for all HMC problems. This finding shows that there is a need for a pipeline that enables users to choose the best performing algorithm for their HMC data set. Finally, we show that the choice can be narrowed to a specific type of algorithm, based on the characteristics of the label hierarchy and the data set label cardinality.
Original language
English
Indexed in:
- Scopus