From Short to Long Reads: Benchmarking Assembly Tools

Sović, Ivan; Skala, Karolj; Šikić, Mile

Bibliographic record number: 631030

From Short to Long Reads: Benchmarking Assembly Tools // ISMB/ECCB 2013
Berlin, Germany, 2013, pp. 1-1 (poster, international peer review, abstract, scientific)

CROSBI ID: 631030

Title
From Short to Long Reads: Benchmarking Assembly Tools

Authors
Sović, Ivan; Skala, Karolj; Šikić, Mile

Type, subtype and category of work
Conference abstracts, abstract, scientific

Source
ISMB/ECCB 2013 / - , 2013, 1-1

Conference
ISMB/ECCB 2013

Place and date
Berlin, Germany, 20.07.2013 - 24.07.2013

Type of participation
Poster

Type of review
International peer review

Keywords
DNA; sequencing; assembly; tools; long read; benchmark; N50; performance

Abstract
An increasing number of de novo DNA assembly tools are being developed, each claiming to outperform its competitors in some respect. Little attention, however, has been paid to their comparative evaluation. Even where the quality of their results has been tested, information on their execution performance is hard to find. We designed a benchmarking methodology and applied it to several de novo DNA assembly tools. Unlike other comparative studies, our primary goal was to measure the assemblers' resource consumption as a function of the length and coverage of the input reads. Because such a study is very time-consuming, we have so far benchmarked a limited number of assemblers and report preliminary results here. We defined a collection of 77 datasets of simulated E. coli reads, designed to cover the space of varying read lengths and coverages. Benchmarking was performed on two de Bruijn graph (DBG) based assemblers, Velvet and SOAPdenovo, and two overlap graph (OG) based assemblers, SGA and Minimus. Preliminary results show that DBG-based assemblers generally run faster than OG-based ones. Additionally, the memory consumption of the DBG-based assemblers reaches a plateau at some point. The two tested OG-based assemblers differ in memory consumption, presumably because of their different underlying alignment algorithms. However, DBG-based assemblers seem to produce much lower N50 values and maximal contig lengths than OG-based ones, especially for longer reads. We conclude that the OG approach is preferable for upcoming sequencing technologies that will produce longer reads.
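
The abstract compares assemblers by N50 and maximal contig length. As an illustration only, the following sketch shows one common way to compute these two metrics from a list of contig lengths. It is not taken from the authors' benchmarking code; the function names `n50` and `summarize_assembly` are hypothetical, and the N50 definition assumed here is the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembly length.

```python
from typing import Dict, Iterable


def n50(contig_lengths: Iterable[int]) -> int:
    """Return the N50 of an assembly given its contig lengths.

    Assumed definition: sort contigs from longest to shortest and
    accumulate their lengths; N50 is the length of the contig at which
    the running sum first reaches half of the total assembly length.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig length is required")
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    # Not reachable: the running sum always reaches the total.
    raise AssertionError("unreachable")


def summarize_assembly(contig_lengths: Iterable[int]) -> Dict[str, int]:
    """Collect the contiguity metrics named in the abstract (hypothetical helper)."""
    lengths = list(contig_lengths)
    return {
        "num_contigs": len(lengths),
        "total_length": sum(lengths),
        "max_contig": max(lengths),
        "n50": n50(lengths),
    }


if __name__ == "__main__":
    # Toy contig lengths, not data from the study.
    print(summarize_assembly([80, 70, 50, 40, 30, 20]))
```

For the toy input, the total length is 290, so the running sum first reaches the half-way mark of 145 at the second-longest contig (80 + 70 = 150). A more fragmented assembly of the same total length would have a lower N50, which is the sense in which the abstract reports lower contiguity for DBG-based assemblers on longer reads.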

Original language
English

Scientific fields
Biology, Computing

WORK AFFILIATIONS

Institutions:
Fakultet elektrotehnike i računarstva, Zagreb,
Institut "Ruđer Bošković", Zagreb

Profiles:

Ivan Sović (author)

Mile Šikić (author)

Karolj Skala (author)

CROSBI Croatian Scientific Bibliography
