Data source: CROSBI

Methods for Automatic Sensitive Data Detection in Large Datasets: a Review (CROSBI ID 707892)

Conference contribution in proceedings | original scientific paper | international peer review

Kužina, Vjeko ; Vušak, Eugen ; Jović, Alan Methods for Automatic Sensitive Data Detection in Large Datasets: a Review // MIPRO / Skala, Karolj (ed.). 2021. pp. 213-218

Responsibility details

Kužina, Vjeko ; Vušak, Eugen ; Jović, Alan

English

Methods for Automatic Sensitive Data Detection in Large Datasets: a Review

In recent years, the need for detection and de-identification of sensitive data in both structured and unstructured forms has increased. The methods used for these tasks have evolved accordingly and currently there are many solutions in different areas of interest. This paper describes the need for the detection of sensitive data in large datasets and describes the challenges associated with automating the detection process. It gives a brief overview of the rule-based and machine learning methods used in this area and examples of their application. The advantages and disadvantages of the described methods are also discussed. We show that the most recent detection solutions are based on the latest and most advanced models proposed in the field of natural language processing, but that there are still some rule-based methods used for certain types of sensitive data.

sensitive data ; detection ; de-identification ; unstructured data ; machine learning ; named entity recognition

Contribution details

213-218.

2021.

published

Host publication details

MIPRO 2021 Proceedings

Skala, Karolj

Rijeka: Hrvatska udruga za informacijsku i komunikacijsku tehnologiju, elektroniku i mikroelektroniku - MIPRO

1847-3938

1847-3946

Conference details

MIPRO 2021

lecture

27.09.2021-01.10.2021

Opatija, Croatia

Research fields

Computing