Bibliographic record number: 813803

CLOUDFLOW - Enabling Faster Biomedical Pipelines with Mapreduce and Spark


Forer, Lukas; Afgan, Enis; Weissenteiner, Hansi; Davidović, Davor; Specht, Guenther; Kronenberg, Florian; Schoenherr, Sebastian
CLOUDFLOW - Enabling Faster Biomedical Pipelines with Mapreduce and Spark // Scalable Computing. Practice and Experience, 17 (2016), 2; 103-114 doi:10.12694/scpe.v17i2.1159 (international peer review, article, scientific)


CROSBI ID: 813803

Title
CLOUDFLOW - Enabling Faster Biomedical Pipelines with Mapreduce and Spark

Authors
Forer, Lukas ; Afgan, Enis ; Weissenteiner, Hansi ; Davidović, Davor ; Specht, Guenther ; Kronenberg, Florian ; Schoenherr, Sebastian

Source
Scalable Computing. Practice and Experience (1895-1767) 17 (2016), 2; 103-114

Type, subtype and category of work
Journal papers, article, scientific

Keywords
Apache YARN ; Pipeline Framework ; Spark ; Cloud Computing

Abstract
For many years Apache Hadoop has been used as a synonym for processing data in the MapReduce fashion. However, due to the complexity of developing MapReduce applications, adoption of this paradigm in genetics has been limited. To alleviate some of the issues, we have previously developed Cloudflow - a high-level pipeline framework that allows users to create sophisticated biomedical pipelines using predefined code blocks while the framework automatically translates those into the MapReduce execution model. With the introduction of the YARN resource management layer, new computational processing models such as Apache Spark are now pluggable into the Hadoop ecosystem. In this paper we describe the extension of Cloudflow to support Apache Spark without any adaptations to already implemented pipelines. The described performance evaluation demonstrates that Spark can bring an additional boost for analysing next-generation sequencing (NGS) data to the field of genetics. The Cloudflow framework is open source and freely available at https://github.com/genepi/cloudflow.

Original language
English

Scientific fields
Computer science



WORK AFFILIATIONS


Institutions:
Institut "Ruđer Bošković", Zagreb

Profiles:

Enis Afgan (author)

Davor Davidović (author)

Cite this publication:

Forer, L., Afgan, E., Weissenteiner, H., Davidović, D., Specht, G., Kronenberg, F. & Schoenherr, S. (2016) CLOUDFLOW - Enabling Faster Biomedical Pipelines with Mapreduce and Spark. Scalable Computing. Practice and Experience, 17 (2), 103-114 doi:10.12694/scpe.v17i2.1159.
@article{article, author = {Forer, Lukas and Afgan, Enis and Weissenteiner, Hansi and Davidovi\'{c}, Davor and Specht, Guenther and Kronenberg, Florian and Schoenherr, Sebastian}, year = {2016}, pages = {103-114}, doi = {10.12694/scpe.v17i2.1159}, keywords = {Apache YARN, Pipeline Framework, Spark, Cloud Computing}, journal = {Scalable Computing. Practice and Experience}, volume = {17}, number = {2}, issn = {1895-1767}, title = {CLOUDFLOW - Enabling Faster Biomedical Pipelines with Mapreduce and Spark} }

Journal indexed in:


  • Web of Science Core Collection (WoSCC)
    • Emerging Sources Citation Index (ESCI)
  • Scopus


Inclusion in other bibliographic databases:


  • Social Services Abstracts






