Bibliographic record overview, number: 1051659
Data Set Synthesis Based on Known Correlations and Distributions for Expanded Social Graph Generation
Data Set Synthesis Based on Known Correlations and Distributions for Expanded Social Graph Generation // IEEE Access, 8 (2020), 1; 33013-33022 doi:10.1109/access.2020.2970862 (international peer review, article, scientific)
CROSBI ID: 1051659
Title
Data Set Synthesis Based on Known Correlations and Distributions for Expanded Social Graph Generation
Authors
Petricioli, Lucija ; Humski, Luka ; Vranic, Mihaela ; Pintar, Damir
Source
IEEE Access (2169-3536) 8 (2020), 1; 33013-33022
Type, subtype and category of work
Journal papers, article, scientific
Keywords
Correlation matrix ; data distribution ; social graph ; synthetic data generation
Abstract
Nowadays, data created through the usage of different services are most commonly not available to the average researcher. Security and privacy have become a top concern, which has further restricted access to certain real-life data, especially holding true for social networks. This is why synthetic data generators have become a very important area of research, particularly synthetic social graph generators. However, even today, such generators mostly create graphs that contain just the information whether two nodes are connected. Fortunately, there is an existing conceptual solution for an expanded social graph generator that aims to generate synthetic graphs containing multiple weighted edges between nodes, thus showing various types of relationships among those nodes, all based on known real-life data characteristics. One of its proposed steps is the generation of necessary data according to provided distributions and correlations. This paper focuses on the generation of such data by adapting an existing iterative algorithm for non-normal multivariate data simulation to generate synthetic data based on the publicly available distributions and correlations of Facebook interaction parameters. It is shown that the characteristics of the generated synthetic data are similar to the known characteristics of the real-life data, proving that the chosen algorithm, along with the accompanying alterations, can be used as one of the steps within the process of generating a synthetic expanded social graph.
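As a rough illustration of what "generating data according to provided distributions and correlations" can mean, the sketch below draws two non-normal variables and imposes a dependence between them by rank reordering against a pair of correlated normal scores. This is a simple swapped-in technique, not the adapted iterative algorithm from the paper, and it omits any iterative correction of the resulting Pearson correlation. The function names (`pearson`, `ranks`, `synthesize_pair`) and the exponential and log-normal marginals are hypothetical placeholders, not the Facebook interaction parameters the abstract refers to.

```python
import math
import random


def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """Return the 0-based rank of every element (ties broken by position)."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0] * len(values)
    for rank, idx in enumerate(order):
        result[idx] = rank
    return result


def synthesize_pair(draw_a, draw_b, score_rho, n, rng):
    """Sketch: two variables with given marginals and an imposed dependence.

    draw_a / draw_b are hypothetical samplers for the target marginals;
    score_rho is the correlation of the underlying normal scores, which is
    NOT guaranteed to equal the Pearson correlation of the final data.
    """
    # Step 1: correlated standard-normal scores.
    z1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
    noise_scale = math.sqrt(1.0 - score_rho ** 2)
    z2 = [score_rho * a + noise_scale * rng.gauss(0.0, 1.0) for a in z1]

    # Step 2: independent draws from each target marginal, sorted ascending.
    a_sorted = sorted(draw_a(rng) for _ in range(n))
    b_sorted = sorted(draw_b(rng) for _ in range(n))

    # Step 3: reorder each marginal sample so its ranks follow the scores.
    a = [a_sorted[r] for r in ranks(z1)]
    b = [b_sorted[r] for r in ranks(z2)]
    return a, b


if __name__ == "__main__":
    rng = random.Random(42)
    a, b = synthesize_pair(
        lambda r: r.expovariate(1.0),          # placeholder marginal
        lambda r: r.lognormvariate(0.0, 0.5),  # placeholder marginal
        0.6, 2000, rng,
    )
    print("Pearson correlation:", round(pearson(a, b), 3))
    print("Rank correlation:", round(pearson(ranks(a), ranks(b)), 3))
```

The reordering leaves each marginal sample untouched and only changes how values are paired, so the marginal distributions are preserved exactly. The Pearson correlation of the output generally differs from `score_rho` for skewed marginals, and that gap is the kind of discrepancy an iterative adjustment is meant to close.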
Original language
English
Scientific fields
Electrical engineering, Computing
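AFFILIATIONS OF THE WORK
Projects:
KK.01.1.1.01.0009 - Napredne metode i tehnologije u znanosti o podatcima i kooperativnim sustavima (Advanced Methods and Technologies in Data Science and Cooperative Systems) (EK)
KK.01.2.1.01.0041
Institutions:
Fakultet elektrotehnike i računarstva (Faculty of Electrical Engineering and Computing), Zagreb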
The journal is indexed in:
- Current Contents Connect (CCC)
- Web of Science Core Collection (WoSCC)
- Science Citation Index Expanded (SCI-EXP)
- Social Science Citation Index (SSCI)
- SCI-EXP, SSCI and/or A&HCI
- Scopus