Bibliographic record no. 915180
De Novo Assembly using Semi-Supervised Read Categorization
De Novo Assembly using Semi-Supervised Read Categorization // Second International Workshop on Data Science / Lončarić, Sven ; Šmuc, Tomislav (eds.).
Zagreb, Croatia, 2017, pp. 73-75 (poster, international peer review, extended abstract, scientific)
CROSBI ID: 915180
Title
De Novo Assembly using Semi-Supervised Read Categorization
Authors
Šebrek, Tomislav ; Tomljanović, Jan ; Šikić, Mile
Type, subtype and category of work
Conference abstracts, extended abstract, scientific
Source
Second International Workshop on Data Science
/ Lončarić, Sven ; Šmuc, Tomislav - , 2017, 73-75
Conference
Second International Workshop on Data Science
Place and date
Zagreb, Croatia, 30 November 2017
Type of participation
Poster
Type of review
International peer review
Keywords
Deep-learning ; Semi-supervised learning ; De novo assembly ; Chimeric read ; Repeat read.
Abstract
In this paper, we propose a semi-supervised deep learning method for categorizing reads that impede the de novo genome assembly process. Instead of dealing directly with sequenced reads, we analyze their coverage graphs converted to 1D signals. We noticed that specific signal patterns occur in each relevant class of reads. A semi-supervised approach was chosen because manually labeling the data is a very slow and tedious process, so our goal was to facilitate the assembly process with as little labeled data as possible. We tested two models to learn patterns in the coverage graphs: M1 + M2 and semi-GAN. We evaluated the performance of each model on a manually labeled dataset comprising various reads from multiple reference genomes, with respect to the number of labeled examples used during training. In addition, we embedded our detection in the assembly process, which improved the quality of the assemblies.
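As an illustration of what a read's coverage graph expressed as a 1D signal might look like, here is a minimal sketch. It is not the authors' implementation: the function name `coverage_signal`, the half-open interval convention, and the example overlaps are all assumptions made for this example. It assumes each overlap with another read is given as a `[start, end)` interval on the read and counts how many overlaps cover each base.

```python
from typing import List, Tuple


def coverage_signal(read_length: int, overlaps: List[Tuple[int, int]]) -> List[int]:
    """Return per-base coverage of one read as a 1D signal.

    Each overlap is a hypothetical half-open interval [start, end) on the read.
    """
    # Difference array: +1 where an overlap starts, -1 where it ends.
    diff = [0] * (read_length + 1)
    for start, end in overlaps:
        start = max(0, start)          # clip intervals to the read
        end = min(read_length, end)
        if start < end:
            diff[start] += 1
            diff[end] -= 1

    # A running sum of the difference array gives the coverage at each base.
    signal = []
    depth = 0
    for delta in diff[:read_length]:
        depth += delta
        signal.append(depth)
    return signal


# Hypothetical overlaps on a 10-base read: nothing spans position 5.
overlaps = [(0, 5), (0, 4), (6, 10), (7, 10)]
signal = coverage_signal(10, overlaps)
print(signal)  # [2, 2, 2, 2, 1, 0, 1, 2, 2, 2]
```

In this toy signal the coverage drops to zero in the middle of the read. A dip of this kind is the sort of pattern one might expect a classifier to associate with a chimeric read, although the abstract does not state which specific patterns the models learn.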
Original language
English
Scientific fields
Biology, Computer science