Bibliographic record no. 634302
Partially Synthetic Dataset Generated for the Testing Purposes on the Basis of Available Public Use Anonymized Microdata
Partially Synthetic Dataset Generated for the Testing Purposes on the Basis of Available Public Use Anonymized Microdata // Proceedings of the 7th European Computing Conference (ECC '13) / Boras, Damir ; Mikelić Preradović, Nives ; Moya, Francisco ; Roushdy, Mohamed ; Salem, Abdel-Badeeh M. (eds.).
Dubrovnik: WSEAS Press, 2013. pp. 385-390 (lecture, international peer review, full paper (in extenso), scientific)
CROSBI ID: 634302
Title
Partially Synthetic Dataset Generated for the Testing Purposes on the Basis of Available Public Use Anonymized Microdata
Authors
Miličević, Mario ; Žubrinić, Krunoslav ; Sjekavica, Tomo
Type, subtype and category of work
Papers in conference proceedings, full paper (in extenso), scientific
Source
Proceedings of the 7th European Computing Conference (ECC '13)
/ Boras, Damir ; Mikelić Preradović, Nives ; Moya, Francisco ; Roushdy, Mohamed ; Salem, Abdel-Badeeh M. - Dubrovnik : WSEAS Press, 2013, 385-390
ISBN
978-960-474-304-9
Conference
7th European Computing Conference (ECC '13)
Place and date
Dubrovnik, Croatia, 25.06.2013 - 27.06.2013
Type of participation
Lecture
Type of review
International peer review
Keywords
Synthetic data; Confidentiality; Disclosure; Microdata; PPDP
Abstract
Governments and organizations increasingly recognize the huge opportunities in sharing and distributing collected data, and the research community must provide methods and algorithms for privacy-preserving data publishing. Without access to the original microdata, it is impossible to estimate the quality of developed anonymization methods or to compare the classification accuracy and computational time of various algorithms applied to both anonymized and original datasets. We propose another high-quality microdata source for testing purposes: a partially synthetic dataset generated on the basis of an actual public-use anonymized microdata set. The original distribution of the data should be simulated to a significant extent, as should attribute value correlations or functional dependencies. Since the synthesized data are based on published microdata sets, it is expected that hidden complex patterns within a dataset can also be preserved.
Original language
English
Scientific fields
Computing, Information and communication sciences
WORK AFFILIATIONS
Projects:
275-0000000-3260 - Integralna kvaliteta usluge komunikacijskih i informacijskih sustava [Integral Quality of Service of Communication and Information Systems] (Lipovac, Vladimir, MZO) (CroRIS)
Institutions:
University of Dubrovnik