Bibliographic record number: 765003
Delta View Generation for Incremental Loading of Large Dimensions in a Data Warehouse
Delta View Generation for Incremental Loading of Large Dimensions in a Data Warehouse // Information and Communication Technology, Electronics and Microelectronics (MIPRO), 2015 38th International Convention on
Opatija, Croatia: Hrvatska udruga za informacijsku i komunikacijsku tehnologiju, elektroniku i mikroelektroniku - MIPRO, 2015. pp. 1417-1422 (lecture, international peer review, full paper (in extenso), scientific)
CROSBI ID: 765003
Title
Delta View Generation for Incremental Loading of Large Dimensions in a Data Warehouse
Authors
Mekterović, Igor ; Brkić, Ljiljana
Type, subtype and category of work
Papers in conference proceedings, full paper (in extenso), scientific
Source
Information and Communication Technology, Electronics and Microelectronics (MIPRO), 2015 38th International Convention on
Hrvatska udruga za informacijsku i komunikacijsku tehnologiju, elektroniku i mikroelektroniku - MIPRO, 2015, pp. 1417-1422
ISBN
978-9-5323-3082-3
Conference
The 38th International ICT Convention – MIPRO 2015
Place and date
Opatija, Croatia, 25.05.2015 - 29.05.2015
Type of participation
Lecture
Type of review
International peer review
Keywords
data warehouse; incremental etl; delta view; algorithm
Abstract
Incremental loading is the preferred approach in efficient ETL processes. Fact tables benefit the most from this approach, since they are large in terms of row count. For the sake of simplicity, dimension tables are often ignored and populated by a full reload. However, big dimensions (e.g. Client) can also have a significant impact on the ETL process and should also be considered for incremental loading. Although they have much smaller cardinality than a typical fact table, it usually takes far more resources to calculate one dimension table row than one fact table row. Large dimension tables are based on multiple source tables, and determining the changed records that should be considered for the incremental load is not trivial, because changes in any of the underlying source tables must be taken into account. In this paper, we present an algorithm for generating a dimension's delta view. The delta view for a dimension encompasses all its source tables and produces a set of keys (e.g. ClientIds) that should be incrementally processed. We have employed this approach in a real-world project and have observed a significant reduction in the loading time of big dimensions.
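The idea of a delta view that spans all source tables and yields the keys to reprocess can be illustrated with a minimal sketch. It is not the algorithm from the paper, whose details the abstract does not give. It assumes, hypothetically, that every source table carries the dimension key (`client_id`) and a `modified_at` timestamp, and that the time of the last load is stored in an `etl_state` table. All table, column and function names are illustrative only.

```python
import sqlite3

# Hypothetical source tables feeding a Client dimension. Each one is assumed
# to carry the dimension key (client_id) and a modified_at change timestamp.
SOURCE_TABLES = ["client", "address", "account"]


def delta_view_sql(view_name, key, tables):
    """Build a CREATE VIEW statement that unions the keys changed in any source table."""
    branches = [
        f"SELECT {key} FROM {t} WHERE modified_at > (SELECT last_load FROM etl_state)"
        for t in tables
    ]
    # UNION (not UNION ALL) removes keys that changed in several tables at once.
    return f"CREATE VIEW {view_name} AS\n" + "\nUNION\n".join(branches)


def changed_keys(conn, view_name="client_delta", key="client_id"):
    """Return the sorted keys that an incremental load would need to reprocess."""
    rows = conn.execute(f"SELECT {key} FROM {view_name} ORDER BY {key}")
    return [r[0] for r in rows]


def demo():
    """Create toy source data, build the delta view and return the changed keys."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE etl_state (last_load TEXT);
        CREATE TABLE client  (client_id INTEGER, name TEXT, modified_at TEXT);
        CREATE TABLE address (client_id INTEGER, city TEXT, modified_at TEXT);
        CREATE TABLE account (client_id INTEGER, iban TEXT, modified_at TEXT);
        INSERT INTO etl_state VALUES ('2015-05-01');
        INSERT INTO client  VALUES (1, 'Ana', '2015-04-01'), (2, 'Ivo', '2015-05-10');
        INSERT INTO address VALUES (1, 'Zagreb', '2015-04-02'), (3, 'Split', '2015-05-12');
        INSERT INTO account VALUES (2, 'HR00', '2015-05-11'), (4, 'HR01', '2015-03-01');
    """)
    conn.execute(delta_view_sql("client_delta", "client_id", SOURCE_TABLES))
    return changed_keys(conn)


if __name__ == "__main__":
    print(demo())
```

In this toy data, client 2 changed in two source tables and client 3 in one, so only those two keys would be reprocessed. Real source tables often reach the dimension key only through joins, which this sketch does not cover.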
Original language
English
Scientific fields
Computing
AFFILIATIONS
Institutions:
Fakultet elektrotehnike i računarstva (Faculty of Electrical Engineering and Computing), Zagreb