A distributed geospatial publish/subscribe system on Apache Spark (CROSBI ID 306381)
Journal contribution | original scientific paper | international peer review
Statement of responsibility
Livaja, Ivan ; Pripužić, Krešimir ; Sovilj, Siniša ; Vuković, Marin
English
A distributed geospatial publish/subscribe system on Apache Spark
Publish/subscribe is a messaging pattern in which message producers, called publishers, publish messages to be distributed to message consumers, called subscribers. Subscribers must subscribe to messages of interest in advance in order to receive them when they are published. In this paper, we discuss a special type of publish/subscribe system, namely the geospatial publish/subscribe system (GeoPS system), in which both published messages (i.e., publications) and subscriptions include a geospatial object. Such an object expresses the location information of a publication or the location of interest of a subscription. We argue that there is great potential for using GeoPS systems in Internet of Things and Sensor Web applications. However, existing GeoPS systems are not suitable for this purpose because they are centralized and cannot cope with multiple high-frequency incoming geospatial data streams containing publications. To overcome this limitation, we present a cluster-based distributed GeoPS system that efficiently matches incoming publications in real time against a set of stored subscriptions. Additionally, we propose four different distributed replication and partitioning strategies for managing subscriptions in our distributed GeoPS system. Finally, we present the results of an extensive experimental evaluation in which we compare the throughput, latency and memory consumption of these strategies. The results show that the strategies are both efficient and scalable to larger clusters. The comparison with centralized state-of-the-art approaches shows that the additional processing overhead that Apache Spark introduces into our distributed strategies is almost negligible.
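As an illustration of the matching step described in the abstract, the following is a minimal single-machine sketch, not the authors' implementation. It assumes, purely for illustration, that each subscription's geospatial object is an axis-aligned rectangle, each publication's geospatial object is a point, and a publication matches a subscription when the point lies inside the rectangle. The names `Subscription`, `Publication`, `covers` and `match` are hypothetical, and the sketch does not use Apache Spark or any of the paper's replication and partitioning strategies.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Subscription:
    """Hypothetical subscription: a subscriber plus a rectangular area of interest."""
    subscriber: str
    min_x: float
    min_y: float
    max_x: float
    max_y: float

    def covers(self, x: float, y: float) -> bool:
        # Assumed matching predicate: the point lies inside the rectangle.
        return self.min_x <= x <= self.max_x and self.min_y <= y <= self.max_y


@dataclass(frozen=True)
class Publication:
    """Hypothetical publication: a point location plus a payload."""
    x: float
    y: float
    payload: str


def match(publication: Publication, subscriptions: List[Subscription]) -> List[str]:
    """Return the subscribers whose area of interest contains the publication."""
    return [s.subscriber for s in subscriptions
            if s.covers(publication.x, publication.y)]


# Subscriptions are stored in advance; publications arrive afterwards.
subscriptions = [
    Subscription("alice", 15.8, 45.7, 16.1, 45.9),
    Subscription("bob", 13.0, 44.0, 14.0, 45.0),
]
reading = Publication(15.97, 45.81, "temperature=21.5")
matched = match(reading, subscriptions)
```

This linear scan over all stored subscriptions is the simplest possible baseline; a distributed system such as the one the abstract describes would spread the subscription set and the incoming stream across cluster nodes instead.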
Geospatial data ; Partitioning ; Data replication ; Big data ; Data stream processing
Publication data
132
2022
282-298
published
0167-739X
10.1016/j.future.2022.02.013
Research areas
Electrical engineering, Computing