Bibliographic record number: 1052813
Algoritmi za de novo sastavljanje velikih genoma
Algoritmi za de novo sastavljanje velikih genoma, 2019, doctoral dissertation, Faculty of Electrical Engineering and Computing, Zagreb
CROSBI ID: 1052813
Title
Algoritmi za de novo sastavljanje velikih genoma
(Algorithms for de novo assembly of large genomes)
Authors
Vaser, Robert
Type, subtype and category of work
Theses, doctoral dissertation
Faculty
Faculty of Electrical Engineering and Computing
Place
Zagreb
Date
19 December
Year
2019
Pages
83
Mentor
Šikić, Mile
Keywords
de novo, assembly, long reads, PacBio, Oxford Nanopore, pile-o-gram, force-directed layout, partial order alignment, vectorization
Abstract
Because DNA sequencing technologies cannot read entire molecules, methods were developed that join the obtained short fragments back together in a puzzle-like process. These methods are called assemblers, and their design is guided by the notion that similar fragments originate from the same region of the genome. That assumption often fails because of sequencing errors and the repetitive nature of the genome. The short fragments produced by the first two generations of sequencing cannot span moderately long repetitive regions and thus hinder a complete assembly. The advent of new sequencing approaches, namely Pacific Biosciences and Oxford Nanopore Technologies, pushed the limit on fragment length at the cost of higher error rates, but still made the assembly problem considerably easier. The first assembly attempts applied various error correction approaches prior to assembly with the tools existing at that time. Although several long-read assemblers have been proposed in recent years, they demand significant computational resources. The focus of this research is the development of memory-efficient and scalable algorithms for de novo assembly of large genomes from third-generation sequencing data without error correction of the input sequences. Within the scope of the thesis we implemented three novel tools for genome assembly: a memory-friendly layout module called Rala, which builds the assembly graph from preprocessed sequences and resolves junctions in the graph with the help of force-directed placement; a fast and accurate consensus module called Racon, based on vectorized partial order alignment; and a complete de novo assembler called Raven, which competes with state-of-the-art assemblers in both quality and resource management.
Original language
English
Scientific fields
Computing
RELATED TO THIS WORK
Projects:
KK.01.1.1.01.0009 - Napredne metode i tehnologije u znanosti o podatcima i kooperativnim sustavima (EK)
HRZZ-UIP-2013-11-7353 - Algoritmi za analizu slijeda genoma (AGESA) (Šikić, Mile, HRZZ) (CroRIS)
Institutions:
Faculty of Electrical Engineering and Computing, Zagreb