Bibliographic record no. 1180082
DNA Nanopore Sequencing Basecaller
DNA Nanopore Sequencing Basecaller, 2021, master's thesis, graduate, Zagreb
CROSBI ID: 1180082
Title
DNA Nanopore Sequencing Basecaller
Authors
Pavlić, Stanislav
Type, subtype and category of work
Theses, master's thesis, graduate
Place
Zagreb
Date
01.07
Year
2021
Pages
47
Mentor
Šikić, Mile
Direct supervisor
Stanojević, Dominik
Keywords
bioinformatics ; basecalling ; nanopore sequencing ; deep learning ; transformers ; CTC
Abstract
Nanopore sequencing is one of the state-of-the-art sequencing technologies. A DNA strand is passed through a nanopore, and the nucleotides inside the pore alter the ionic current flowing through it. Because of the pore's size, there are usually five nucleotides (a 5-mer) in the pore at once, and they influence the measured signal. Each of the 4^5 = 1024 possible 5-mers produces a different signal, and basecalling (converting the raw signal into a sequence of nucleotides) relies on this information. The signal is approximately rectangular, because the 5-mer in the pore changes one nucleotide at a time, but it contains a lot of noise. The goal of this thesis was to develop a DNA nanopore sequencing basecaller using modern deep learning architectures, with self-supervised learning in mind. The architecture is based mainly on transformers. The basecaller was evaluated on publicly available datasets. The solution, called AttentionCall, was implemented in Python using the PyTorch library. The source code is available on GitHub at github.com/StanislavPavlic/attentioncall.
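The signal model described in the abstract can be illustrated with a small toy sketch. This is not code from AttentionCall. The current levels are random placeholder values rather than a real pore model, the fixed number of samples per 5-mer is a simplifying assumption (real dwell times vary), and all function names are hypothetical. The last helper shows the generic CTC collapse rule (merge repeated labels, then drop blanks), which the record lists as a keyword; the thesis may decode its network output differently.

```python
import itertools
import random

BASES = "ACGT"
K = 5  # nucleotides assumed to sit in the pore at once (a 5-mer)


def all_kmers(k=K):
    """Enumerate every k-mer over the DNA alphabet; 4**5 = 1024 for k = 5."""
    return ["".join(p) for p in itertools.product(BASES, repeat=k)]


def toy_level_table(seed=0):
    """Assign each 5-mer a hypothetical current level (random, NOT a real pore model)."""
    rng = random.Random(seed)
    return {kmer: rng.uniform(60.0, 120.0) for kmer in all_kmers()}


def simulate_signal(sequence, levels, samples_per_kmer=8, noise_sd=1.5, seed=1):
    """Build a noisy, roughly rectangular signal.

    The 5-mer window slides one nucleotide at a time; each window contributes
    a flat step at its level plus Gaussian noise. A fixed dwell time per
    5-mer is an assumption made only for this illustration.
    """
    rng = random.Random(seed)
    signal = []
    for i in range(len(sequence) - K + 1):
        level = levels[sequence[i:i + K]]
        signal.extend(level + rng.gauss(0.0, noise_sd) for _ in range(samples_per_kmer))
    return signal


def ctc_collapse(labels, blank="-"):
    """Generic CTC collapse: merge consecutive repeats, then remove blanks."""
    out = []
    prev = None
    for label in labels:
        if label != prev and label != blank:
            out.append(label)
        prev = label
    return "".join(out)


if __name__ == "__main__":
    levels = toy_level_table()
    print(len(levels))  # number of distinct 5-mers
    signal = simulate_signal("ACGTACGT", levels)
    print(len(signal))  # 4 sliding windows, 8 samples each
    print(ctc_collapse("AA-A-CC"))
```

A basecaller has to invert this process: given only the noisy samples, it must recover the nucleotide sequence, typically by predicting a per-timestep label sequence that is then collapsed as in `ctc_collapse`.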
Original language
English
Scientific fields
Computing
RELATED TO THIS WORK
Projects:
HRZZ-IP-2018-01-5886 - De novo genome and metagenome assembly (SIGMA) (Šikić, Mile, HRZZ)
Institutions:
Faculty of Electrical Engineering and Computing, Zagreb