Bibliographic record number: 899731
ETLator - a scripting ETL framework
ETLator - a scripting ETL framework // Proceedings of 40th Jubilee International Convention MIPRO 2017 / Petar Biljanović (ed.).
Rijeka: Hrvatska udruga za informacijsku i komunikacijsku tehnologiju, elektroniku i mikroelektroniku - MIPRO, 2017. pp. 1581-1586 (lecture, international peer review, full paper (in extenso), scientific)
CROSBI ID: 899731
Title
ETLator - a scripting ETL framework
Authors
Radonić, Miran ; Mekterović, Igor
Type, subtype and category of work
Papers in conference proceedings, full paper (in extenso), scientific
Source
Proceedings of 40th Jubilee International Convention MIPRO 2017
/ Petar Biljanović (ed.) - Rijeka: Hrvatska udruga za informacijsku i komunikacijsku tehnologiju, elektroniku i mikroelektroniku - MIPRO, 2017, pp. 1581-1586
ISBN
978-953-233-093-9
Conference
40th jubilee international convention on information and communication technology, electronics and microelectronics
Place and date
Rijeka, Croatia, 22-26 May 2017
Type of participation
Lecture
Type of review
International peer review
Keywords
Data Warehouse ; ETL ; Scripting framework
Abstract
ETL (Extract, Transform, Load) is the industry-standard term for extracting data, transforming it, and loading it into a Data Warehouse (DW). The ETL process is the most resource-demanding part of a DW implementation and typically has to be evolved and maintained for the lifetime of the DW. To facilitate the development and maintenance of ETL processes, many ETL tools have been developed that feature Graphical User Interfaces (GUIs) and various built-in functionalities (parallelism, logging, rich transformation libraries, documentation generation, etc.). The downside of such GUI ETL tools is that development relies heavily on mouse operations rather than on writing program code, which feels unnatural to some developers, especially when there are many similar, repetitive tasks. In this paper we present an alternative approach: “ETLator”, an ETL framework based on the Python scripting language in which ETL tasks are defined by writing Python code. ETLator implements various typical ETL transformations and allows the user to define complex ETL tasks with multiple sources and parallel tasks simply and efficiently, while leveraging the full flexibility of Python. ETLator also provides logging and can document ETL tasks by generating data flow images. Using a test case, we show that ETLator simplifies ETL development and rivals the GUI approach.
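To make the idea of defining ETL tasks in Python code more concrete, the following is a minimal sketch that uses only the Python standard library. It is not ETLator's API, which this record does not describe; the names `SOURCES`, `extract`, `transform` and `run_etl`, as well as the sample data, are hypothetical and chosen purely for illustration. The sketch shows two sources being extracted and transformed in parallel, with logging, and then loaded into a single in-memory target.

```python
import csv
import io
import logging
from concurrent.futures import ThreadPoolExecutor

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("etl_sketch")

# Hypothetical in-memory CSV sources standing in for real files or databases.
SOURCES = {
    "customers_a": "id,name\n1, alice \n2,Bob\n",
    "customers_b": "id,name\n3,carol\n",
}


def extract(name):
    """Extract: parse one CSV source into a list of dict rows."""
    rows = list(csv.DictReader(io.StringIO(SOURCES[name])))
    log.info("extracted %d rows from %s", len(rows), name)
    return rows


def transform(rows):
    """Transform: cast ids to int and normalize the spelling of names."""
    return [{"id": int(r["id"]), "name": r["name"].strip().title()} for r in rows]


def run_etl(source_names):
    """Extract and transform each source in parallel, then load into one list."""
    with ThreadPoolExecutor() as pool:
        # pool.map yields results in the same order as source_names.
        batches = list(pool.map(lambda n: transform(extract(n)), source_names))
    # Load: here the "warehouse" is just a flat in-memory list of rows.
    warehouse = [row for batch in batches for row in batch]
    log.info("loaded %d rows", len(warehouse))
    return warehouse


if __name__ == "__main__":
    print(run_etl(["customers_a", "customers_b"]))
```

Because each step is an ordinary function, repetitive tasks can be generated with loops or helper functions rather than repeated by hand, which is the kind of flexibility the abstract attributes to a scripting approach.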
Original language
English
Scientific fields
Computer science
RELATED ENTITIES
Institutions:
Fakultet elektrotehnike i računarstva (Faculty of Electrical Engineering and Computing), Zagreb
Profiles:
Igor Mekterović
(author)