Phenotype prediction with semi-supervised learning

Levatić, Jurica; Brbić, Maria; Stepišnik Perdih, Tomaž; Kocev, Dragi; Vidulin, Vedrana; Šmuc, Tomislav; Supek, Fran; Džeroski, Sašo

Data source: CROSBI

Phenotype prediction with semi-supervised learning (CROSBI ID 655809)

Conference paper in proceedings | original scientific paper | international peer review

Levatić, Jurica ; Brbić, Maria ; Stepišnik Perdih, Tomaž ; Kocev, Dragi ; Vidulin, Vedrana ; Šmuc, Tomislav ; Supek, Fran ; Džeroski, Sašo. Phenotype prediction with semi-supervised learning // New frontiers in mining complex patterns NFMCP 2017, Lecture Notes in Computer Science. 2017. pp. 1-11

Responsibility information

Authors

Levatić, Jurica ; Brbić, Maria ; Stepišnik Perdih, Tomaž ; Kocev, Dragi ; Vidulin, Vedrana ; Šmuc, Tomislav ; Supek, Fran ; Džeroski, Sašo

Basic information in the original language
Basic information in other languages

Language

English

Title

Phenotype prediction with semi-supervised learning

Abstract

In this work, we address the task of phenotypic trait prediction using methods for semi-supervised learning. More specifically, we propose to use supervised and semi-supervised classification trees as well as supervised and semi-supervised random forests of classification trees. We consider 114 datasets for different phenotypic traits referring to 997 microbial species. These datasets present a challenge for existing machine learning methods: they are not entirely labeled/annotated and their distribution is typically imbalanced. We investigate whether approaching phenotype prediction as a semi-supervised learning task can yield improved predictive performance. The results suggest that the semi-supervised methodology considered here is helpful for phenotype prediction when the amount of labeled data ranges from 20 to 40%. Furthermore, the semi-supervised classification trees exhibit good predictive performance for datasets where the presence of a given trait is not extremely imbalanced (i.e., less than 6%).

Keywords

semi-supervised learning ; phenotype ; decision trees ; predictive clustering trees ; random forests ; binary classification

Note

not recorded

Language

not recorded

Title

not recorded

Abstract

not recorded

Keywords

not recorded

Note

not recorded

Contribution details

Pages

1-11

Year of publication

2017

Publication status

published

Host publication details

Title

New frontiers in mining complex patterns NFMCP 2017, Lecture Notes in Computer Science

Conference details

Conference

New frontiers in mining complex patterns: Sixth edition of the International Workshop NFMCP 2017 in conjunction with ECML-PKDD 2017

Type of participation

lecture

Conference dates

18.09.2017-22.09.2017

Conference location

Skopje, North Macedonia

Related records

Related persons

Vedrana Vidulin (author)

Maria Brbić (author)

Tomislav Šmuc (author)

Fran Supek (author)

Related institutions

Institut Ruđer Bošković (098) (author's institution)

Field

Computer science

Links

di.uniba.it