Bibliographic record no.: 1181738
A distributed geospatial publish/subscribe system on Apache Spark
A distributed geospatial publish/subscribe system on Apache Spark // Future Generation Computer Systems, 132 (2022), 282-298 doi:10.1016/j.future.2022.02.013 (international peer review, article, scientific)
CROSBI ID: 1181738
Title
A distributed geospatial publish/subscribe system on Apache Spark
Authors
Livaja, Ivan ; Pripužić, Krešimir ; Sovilj, Siniša ; Vuković, Marin
Source
Future Generation Computer Systems (0167-739X) 132 (2022); 282-298
Type, subtype and category of work
Journal papers, article, scientific
Keywords
Geospatial data ; Partitioning ; Data replication ; Big data ; Data stream processing
Abstract
Publish/subscribe is a messaging pattern in which message producers, called publishers, publish messages to be distributed to message consumers, called subscribers. Subscribers must subscribe to messages of interest in advance in order to receive them when they are published. In this paper, we discuss a special type of publish/subscribe system, namely geospatial publish/subscribe systems (GeoPS systems), in which both published messages (i.e., publications) and subscriptions include a geospatial object. Such an object is used to express both the location information of a publication and the location of interest of a subscription. We argue that there is great potential for using GeoPS systems for Internet of Things and Sensor Web applications. However, existing GeoPS systems are not suitable for this purpose because they are centralized and cannot cope with multiple high-frequency incoming geospatial data streams containing publications. To overcome this limitation, we present a distributed GeoPS system running on a cluster, which efficiently matches incoming publications in real time with a set of stored subscriptions. Additionally, we propose four different (distributed) replication and partitioning strategies for managing subscriptions in our distributed GeoPS system. Finally, we present the results of an extensive experimental evaluation in which we compare the throughput, latency and memory consumption of these strategies. These results clearly show that the strategies are both efficient and scalable to larger clusters. The comparison with centralized state-of-the-art approaches shows that the additional processing overhead introduced by Apache Spark in our distributed strategies is almost negligible.
Original language
English
Scientific fields
Electrical engineering, Computing
RELATED TO THIS WORK
Projects:
HRZZ-UIP-2017-05-9066 - Efficient real-time processing of fast geospatial data (RETROFIT) (Pripužić, Krešimir, HRZZ) (CroRIS)
Institutions:
Fakultet elektrotehnike i računarstva, Zagreb
Veleučilište u Šibeniku
Sveučilište Jurja Dobrile u Puli
The journal is indexed in:
- Current Contents Connect (CCC)
- Web of Science Core Collection (WoSCC)
- Science Citation Index Expanded (SCI-EXP)
- SCI-EXP, SSCI and/or A&HCI
- Scopus