Federated Learning as a Tool for Open Machine Learning Models in eGovernment

Guberović, Emanuel; Čavrak, Igor; Bosnić, Ivana; Charalampos, Alexopoulos

Data source: CROSBI

Federated Learning as a Tool for Open Machine Learning Models in eGovernment (CROSBI ID 707930)

Conference contribution in proceedings | extended abstract of a conference presentation

Guberović, Emanuel ; Čavrak, Igor ; Bosnić, Ivana ; Charalampos, Alexopoulos. Federated Learning as a Tool for Open Machine Learning Models in eGovernment // Book of abstracts of the National Open Data Conference / Vujić, Miroslav ; Šalamon, Dragica (eds.). Zagreb, 2021

Responsibility data

Authors

Guberović, Emanuel ; Čavrak, Igor ; Bosnić, Ivana ; Charalampos, Alexopoulos

Basic data in the original language
Basic data in other languages

Language

English

Title

Federated Learning as a Tool for Open Machine Learning Models in eGovernment

Abstract

Federated learning (FL) [1] emerged as a new data-parallel machine learning (ML) technique, supplying the missing links that the field of artificial intelligence needs in order to comply with data privacy regulations [2]. Besides enabling ML to overcome data privacy obstacles, it creates new opportunities by facilitating global knowledge discovery through training models on distributed datasets from different data providers, with different ownership and access rights. Such an approach advocates the creation of open models, an extension of the open data concept, where the data required for open model construction can be open, closed, or a combination of both. FL open models (FLOMs) align with the use of new disruptive technologies for achieving 'knowledge of the crowd' in supporting data-driven and evidence-based decision and policy-making, recognized as a third-generation eGovernance methodology [3-4].
This article proposes a simple FL framework, with a step-by-step guide to implementing a FLOM, accompanied by two examples from the eGovernance domain. We specify the FLOM framework as a blueprint for using FL in the realization of open models, with the following specification items: client data and requirements, an aggregation server, an Application Programming Interface (API) on the aggregation server, and a runnable ML model. The client data and requirements item is a high-level description of the data and computational capabilities a client needs in order to participate in the learning process; it includes the required data attributes, their frequency and quantity, and possible additional qualitative data metrics [5]. The aggregation server creates an aggregate value from a set of client model weight updates, then notifies the participating clients and disseminates the new global model weights to them.
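As a minimal sketch of the aggregation step described above, the following Python function combines client weight updates into new global weights. The function name `aggregate_updates`, the flat list representation of model weights, and the weighting of each client by its number of local samples are illustrative assumptions in the spirit of model averaging [1]; the abstract itself does not specify the aggregation rule.

```python
from typing import List, Sequence, Tuple

# Illustrative representation: a model is a flat list of float weights.
Weights = List[float]


def aggregate_updates(updates: Sequence[Tuple[Weights, int]]) -> Weights:
    """Combine client updates into new global weights.

    Each update is a pair (client_weights, num_local_samples).
    Weighting by sample count is an assumption, not taken from the abstract.
    """
    if not updates:
        raise ValueError("at least one client update is required")
    total_samples = sum(n for _, n in updates)
    if total_samples <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(updates[0][0])
    if any(len(w) != dim for w, _ in updates):
        raise ValueError("all clients must send weight vectors of equal length")

    global_weights = [0.0] * dim
    for weights, n in updates:
        share = n / total_samples  # this client's share of the average
        for i, value in enumerate(weights):
            global_weights[i] += share * value
    return global_weights


# Two hypothetical clients: (model weights, number of local samples).
client_updates = [
    ([1.0, 2.0], 10),
    ([3.0, 6.0], 30),
]
new_global = aggregate_updates(client_updates)
print(new_global)
```

In this example the client holding 30 samples contributes three times the weight of the client holding 10. Only model weights and sample counts reach the server; the clients' underlying data never leaves them.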
The API on the aggregation server consists of endpoints for receiving client model weight updates and for disseminating the new global weights. Notably, the ML model at the core of the FLOM process needs to take the predefined input values from the client and provide the appropriate model weights to the API endpoint on the aggregation server.
FLOM is based on the typical FL process, which takes four distinct steps per iteration. In the first step, clients send their individual model updates; in the second, the aggregation server aggregates those updates. The third step returns the aggregated model weights to the clients, who use them in the final step to update their local models.
We validated the potential of FLOM as a third-generation eGovernance tool using two use cases, comparing the quality of the knowledge discovered with the confidential and private data available to the FL process against that discovered using only the data available to typical centralized ML. The first use case revolves around a horizontally partitioned environment, with the goal of predicting agricultural commodity prices by combining data from the EUROSTAT price index [6] and the FAO product import/export dataset [7]. This data is partitioned at the country level, with each country being a distinct data unit. Using FLOM in this example allows individual producers to gain better information about the cost-effectiveness of producing each commodity. This new knowledge can be discovered without producers having to exchange their production cost data, which is often confidential. The second use case relies on a dataset constructed from anonymized private data, created for a loan approval task and containing credit record data and some client-specific private data.
By vertically separating the dataset into credit balance data and private data, we compare the gains achieved using FL, with knowledge extracted from the complete dataset, against using only the credit balance data. Our validation of the FL and open model approach, based on the two use cases from the eGovernance domain, revealed significant gains compared to using only the data available to centralized ML techniques. With the introduction of the FLOM framework, we aim to facilitate the creation of new tools, services, and usage scenarios in various domains that were previously impractical or hard to achieve. In particular, we aim at usage scenarios that would allow the creation of new knowledge, in the form of open models, that combine both open and closed datasets and allow various parties to participate in the creation and usage of such open models.
KEY WORDS
Federated learning, machine learning, open data, open models, eGovernance
REFERENCES
1. McMahan, H. & Moore, Eider & Ramage, Daniel & Agüera y Arcas, Blaise. (2016). Federated Learning of Deep Networks using Model Averaging.
2. Z. Lachana, C. Alexopoulos, E. Loukis, and Y. Charalabidis, “Identifying the different generations of egovernment: an analysis framework,” in The 12th Mediterranean Conference on Information Systems (MCIS), 2018, pp. 1–13.
3. UNCTAD, “Data protection regulations and international data flows: Implications for trade and development,” 2016.
4. Y. Charalabidis, E. Loukis, C. Alexopoulos, and Z. Lachana, “The three generations of electronic government: From service provision to open data and to policy analytics,” in International Conference on Electronic Government, Springer, 2019, pp. 3–17.
5. Sidi, Fatimah & Hassany Shariat Panahy, Payam & Affendey, Lilly & A. Jabar, Marzanah & Ibrahim, Hamidah & Mustapha, Aida. (2013). Data quality: A survey of data quality dimensions. doi:10.1109/InfRKM.2012.6204995.
6. EUROSTAT, Price indices of agricultural products. Accessed September 2021. Available at https://ec.europa.eu/eurostat/cache/metadata/en/apri_pi_esms.htm
7. FAOSTAT, Commodities by country. Accessed September 2021. Available at http://www.fao.org/faostat/en/#rankings/commodities_by_country_imports

Keywords

Federated learning ; machine learning ; open data ; open models ; eGovernance

Note

not recorded

Language

not recorded

Title

not recorded

Abstract

not recorded

Keywords

not recorded

Note

not recorded

Contribution data

Paper number

Year of publication

2021.

Publication status

published

Host publication data

Title

Book of abstracts of the National Open Data Conference

Editors

Vujić, Miroslav ; Šalamon, Dragica

Publisher

Zagreb:

ISBN

978-953-243-123-0

Conference data

Conference

Nacionalna konferencija o otvorenim podacima = National Open Data Conference (NODC2021)

Type of participation

lecture

Conference dates

20.09.2021-23.09.2021

Conference venue

Zagreb, Croatia

Related records

Related persons

Emanuel Guberović (author)

Igor Čavrak (author)

Ivana Bosnić (author)

Related institutions

Fakultet elektrotehnike i računarstva (036) (author's institution)

Field

Computer science