Bibliographic record number: 1272976
On Differentiating Synthetic and Real Data in Medical Applications
On Differentiating Synthetic and Real Data in Medical Applications // The Second Serbian International Conference on Applied Artificial Intelligence (SICAAI) - Book of Abstracts / Filipović, Nenad (ed.).
Kragujevac: University of Kragujevac, 2023. 61, 4 (lecture, international peer review, full paper (in extenso), scientific)
CROSBI ID: 1272976
Title
On Differentiating Synthetic and Real Data in Medical Applications
Authors
Baressi Šegota, Sandi ; Anđelić, Nikola ; Štifanić, Daniel ; Štifanić, Jelena ; Car, Zlatan
Type, subtype and category of work
Papers in conference proceedings, full paper (in extenso), scientific
Source
The Second Serbian International Conference on Applied Artificial Intelligence (SICAAI) - Book of Abstracts
/ Filipović, Nenad - Kragujevac : University of Kragujevac, 2023
ISBN
978-86-81037-77-5
Conference
The Second Serbian International Conference on Applied Artificial Intelligence (SICAAI)
Place and date
Kragujevac, Serbia, 19-20 May 2023
Type of participation
Lecture
Type of review
International peer review
Keywords
data classification, machine learning, synthetic data.
Abstract
The use of synthetic (generated) data to meet machine-learning algorithms’ need for large numbers of data points is a growing trend in the research community. This paper tests four generation methods, namely Copula Generative Adversarial Network (GAN), CTGAN, Gaussian Copula, and Triplet-based variable encoder (TVAE), on a dataset of risk factors for lung cancer, selected because ML models applied to it perform well. The resulting datasets are evaluated using Pearson’s correlation coefficient and then used to train two multilayer perceptron (MLP)-based neural networks. The first network compares the classification performance of models trained on the synthetic datasets with that of models trained on the original data. The second network is trained on a mixed dataset to differentiate between real and synthetic data. The results show that correlation similarity is directly linked to the performance of both networks, with TVAE producing the best results and the data that is hardest to differentiate from real data.
Original language
English
Scientific fields
Electrical engineering, Computing, Interdisciplinary technical sciences
RELATED TO THIS WORK
Projects:
NadSve-Sveučilište u Rijeci-uniri-tehnic-18-275-1447 - Development of an intelligent expert system for online diagnostics of urinary bladder cancer (Car, Zlatan, NadSve - UNIRI grants) (CroRIS)
--KK.01.1.1.01.009 - Advanced methods and technologies in data science and cooperative systems (DATACROSS) (Šmuc, Tomislav; Lončarić, Sven; Petrović, Ivan; Jokić, Andrej; Palunko, Ivana) (CroRIS)
--uniri-mladi-technic-22-61 - Energy optimization of industrial robotic manipulators using evolutionary computing algorithms (Anđelić, Nikola) (CroRIS)
Institutions:
Faculty of Engineering, Rijeka
Profiles:
Zlatan Car
(author)
Nikola Anđelić
(author)
Sandi Baressi Šegota
(author)
Daniel Štifanić
(author)