
Bibliographic record number: 643480

Protein database search optimization based on CUDA and MPI


Pavlović, Dario; Vaser, Robert; Korpar, Matija; Šikić, Mile
Protein database search optimization based on CUDA and MPI // The 36th international ICT convention MIPRO
Opatija, Croatia, 2013 (lecture, international peer review, full paper (in extenso), scientific)


Title
Protein database search optimization based on CUDA and MPI

Authors
Pavlović, Dario; Vaser, Robert; Korpar, Matija; Šikić, Mile

Type, subtype and category of work
Papers in conference proceedings, full paper (in extenso), scientific

Conference
The 36th international ICT convention MIPRO

Place and date
Opatija, Croatia, 20-24 May 2013

Type of participation
Lecture

Type of review
International peer review

Keywords
Alignment; Smith-Waterman; sequence; GPU

Abstract
Protein database search is an important method in computational biology. An average database contains a large number of sequences, which makes such searches time and resource intensive. The rapid growth of these databases in recent years has created a need to speed up the search and, consequently, any alignments performed on them. This paper presents an acceleration of the database search tool sw#DB, which is based on a CUDA implementation of the Smith-Waterman algorithm. The speedup is achieved by reducing the number of database sequences that have to be aligned. The whole database is divided into seeds of a fixed length. The positions of these seeds and the corresponding sequence indexes from the database are then stored in a hash container. This allows a constant-time lookup of all positions of a seed in every sequence of the database. Candidate sequences for a query are filtered using this method, and only those that contain at least one seed from the query are forwarded to sw#DB. This reduces the number of alignments performed. Overall, it brings a speedup of around three times compared to the basic sw#DB tool, which is based solely on the Smith-Waterman algorithm, with almost no loss of accuracy. The implementation is written in the CUDA and C programming languages. For large queries, an MPI implementation with multiple CUDA cards is used.
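
The seed filter described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation (theirs is written in CUDA and C). The seed length of 3, the use of overlapping seeds, the function names `build_seed_index` and `candidate_sequences`, and the toy sequences are all assumptions made for illustration only; the abstract does not state these details.

```python
from collections import defaultdict

SEED_LEN = 3  # hypothetical seed length; the abstract does not give the value used


def build_seed_index(database, seed_len=SEED_LEN):
    """Map each fixed-length seed to a list of (sequence index, position) pairs.

    Overlapping seeds are an assumption of this sketch.
    """
    index = defaultdict(list)
    for seq_idx, seq in enumerate(database):
        for pos in range(len(seq) - seed_len + 1):
            index[seq[pos:pos + seed_len]].append((seq_idx, pos))
    return index


def candidate_sequences(query, index, seed_len=SEED_LEN):
    """Return sorted indexes of database sequences sharing at least one seed with the query."""
    hits = set()
    for pos in range(len(query) - seed_len + 1):
        # Hash lookup: average constant time per seed.
        for seq_idx, _ in index.get(query[pos:pos + seed_len], ()):
            hits.add(seq_idx)
    return sorted(hits)


# Toy data, invented for this example.
database = ["MKTAYIAKQR", "GGGSSGGG", "QRQISFVKSH"]
index = build_seed_index(database)
print(candidate_sequences("AKQRQI", index))  # prints [0, 2]
```

In this sketch only sequences 0 and 2 would be passed on to the Smith-Waterman stage; sequence 1 shares no seed with the query and is skipped, which is the source of the reduction in alignments.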

Original language
English

Scientific fields
Computing



WORK AFFILIATIONS


Project / topic
036-0362214-1987 - Modeliranje kompleksnih sustava (Modeling of Complex Systems) (Branko Jeren)

Institutions
Faculty of Electrical Engineering and Computing, Zagreb

Author with registry number:
Mile Šikić (250972)