Bibliographic record no. 1253644
Speed and accuracy benchmarks of large-scale microbial gene function prediction with supervised machine learning
Speed and accuracy benchmarks of large-scale microbial gene function prediction with supervised machine learning // Discovery science : book of abstracts
Bled, Slovenia, 2014, pp. 1-3 (poster, international peer review, extended abstract, scientific)
CROSBI ID: 1253644
Title
Speed and accuracy benchmarks of large-scale microbial gene function prediction with supervised machine learning
Authors
Vidulin, Vedrana ; Šmuc, Tomislav ; Supek, Fran
Type, subtype and category of work
Conference abstracts, extended abstract, scientific
Source
Discovery science : book of abstracts
2014, pp. 1-3
Conference
Discovery Science
Place and date
Bled, Slovenia, 8-10 October 2014
Type of participation
Poster
Type of peer review
International peer review
Keywords
gene function prediction, supervised machine learning, benchmark
Abstract
Machine learning approaches to microbial gene function prediction (MGFP) from genome context data are mostly unsupervised [1] and rely on pairwise distances between individual examples arranged into “functional interaction networks” [2]. Where supervised approaches have been used, most predicted a limited set of functions and/or took a single-label approach to classification [3, 4], constructing a separate classifier for each function and ignoring the relationships between the functions. Multi-label approaches may perform better, especially those that can exploit the relations between functions readily available in gene function ontologies [5]. Our aim is to compare the predictive accuracy and computational efficiency of single-label vs. multi-label approaches to supervised MGFP. High accuracy is a prerequisite for applying the classifier in real-life tasks, where confidence in predicted functions is of key importance for prioritizing downstream experimental work. Many such predictions have indeed been validated in biological experiments [6, 7]. A lower demand for computational time is important when the number of considered functions is high.
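The contrast between single-label and multi-label prediction drawn in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the k-nearest-neighbour scorer, the toy feature vectors, the function names and the small parent/child ontology are all assumptions made for illustration. The single-label variant builds an independent binary problem per function; the multi-label variant scores all functions from one shared neighbour search and then uses the ontology so that a parent function never scores lower than its child.

```python
# Toy illustration only: data, labels and the kNN scorer are hypothetical.

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(train_X, x, k):
    """Indices of the k training genes closest to x."""
    order = sorted(range(len(train_X)),
                   key=lambda i: squared_distance(train_X[i], x))
    return order[:k]

def predict_single_label(train_X, train_Y, x, labels, k=3):
    """One independent binary problem per function; label relations ignored."""
    scores = {}
    for label in labels:
        targets = [1 if label in ys else 0 for ys in train_Y]
        neighbours = nearest(train_X, x, k)  # search repeated for every function
        scores[label] = sum(targets[i] for i in neighbours) / k
    return scores

def predict_multi_label(train_X, train_Y, x, labels, parent, k=3):
    """All functions scored from one neighbour search, then made consistent
    with the ontology: an ancestor scores at least as high as its descendants."""
    neighbours = nearest(train_X, x, k)  # search done once for all functions
    raw = {label: sum(1 for i in neighbours if label in train_Y[i]) / k
           for label in labels}
    scores = dict(raw)
    for label, score in raw.items():
        ancestor = parent.get(label)
        while ancestor is not None:
            scores[ancestor] = max(scores.get(ancestor, 0.0), score)
            ancestor = parent.get(ancestor)
    return scores

# Hypothetical genome-context features and (incomplete) function annotations.
train_X = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (1.0, 1.0), (1.1, 1.0)]
train_Y = [{"ion transport"}, {"ion transport"}, {"transport"},
           {"metabolism"}, {"metabolism"}]
labels = ["transport", "ion transport", "metabolism"]
parent = {"ion transport": "transport"}  # child -> parent in the toy ontology

query = (0.05, 0.05)
single = predict_single_label(train_X, train_Y, query, labels)
multi = predict_multi_label(train_X, train_Y, query, labels, parent)
print("single-label:", single)
print("multi-label: ", multi)
```

In this toy case the single-label scores rank "ion transport" above its parent "transport", which is inconsistent with the ontology, while the multi-label variant corrects this and needs only one neighbour search regardless of how many functions are considered.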
Original language
English