Bibliographic record number: 1155002
DEBS grand challenge: real-time detection of air quality improvement with Apache Flink
DEBS grand challenge: real-time detection of air quality improvement with Apache Flink // DEBS '21: Proceedings of the 15th ACM International Conference on Distributed and Event-based Systems
Milan, Italy: The Association for Computing Machinery (ACM), 2021. pp. 148-153 doi:10.1145/3465480.3466930 (other, international peer review, full paper (in extenso), scientific)
CROSBI ID: 1155002
Title
DEBS grand challenge: real-time detection of air quality improvement with Apache Flink
Authors
Marić, Josip ; Pripužić, Krešimir ; Antonić, Martina
Type, subtype and category of work
Papers in conference proceedings, full paper (in extenso), scientific
Source
DEBS '21: Proceedings of the 15th ACM International Conference on Distributed and Event-based Systems
The Association for Computing Machinery (ACM), 2021, pp. 148-153
ISBN
9781450385558
Conference
15th ACM International Conference on Distributed and Event-based Systems
Place and date
Milan, Italy, 28 June 2021 - 2 July 2021
Type of participation
Other
Type of review
International peer review
Keywords
sensor streams, big data, fast data, geospatial data streams
Abstract
The topic of the DEBS Grand Challenge 2021 is to develop a solution for detecting areas in which the air quality index (AQI) improved the most when compared to the previous year. The solution must run two given continuous queries in parallel on the incoming sensor data stream, which must return the following: 1) the top 50 cities in terms of AQI improvement with their current AQIs and 2) a histogram of the longest streaks of good AQI. The incoming data is accessed through an API which provides streaming sensor measurements in batches. We present our solution based on Apache Flink, a distributed stream processing framework for clusters. We opted for Flink since its applications can easily be scaled horizontally and vertically by adding computation nodes or increasing available resources, respectively. Flink allows us to divide the given queries into smaller tasks which can be run concurrently on different nodes in order to reduce the overall processing time and thus improve the performance of our solution. In more detail, the following performance-intensive tasks are run in parallel on distributed nodes: 1) retrieving measurement batches, 2) assigning a city to each measurement and 3) calculating the air quality index per city. We also discuss the main optimizations we have used to improve the performance and present an experimental evaluation of our solution.
Original language
English
Scientific fields
Electrical Engineering, Computing
RELATED TO THE WORK
Projects:
HRZZ-UIP-2017-05-9066 - Efficient real-time processing of fast geospatial data (RETROFIT) (Pripužić, Krešimir, HRZZ)
Institutions:
Faculty of Electrical Engineering and Computing, Zagreb