Bibliographic record number: 1148395

Methods for Automatic Sensitive Data Detection in Large Datasets: a Review


Kužina, Vjeko; Vušak, Eugen; Jović, Alan
Methods for Automatic Sensitive Data Detection in Large Datasets: a Review // MIPRO 2021 Proceedings / Skala, Karolj (ed.).
Rijeka: Hrvatska udruga za informacijsku i komunikacijsku tehnologiju, elektroniku i mikroelektroniku - MIPRO, 2021. pp. 213-218 (lecture, international peer review, full paper (in extenso), scientific)


CROSBI ID: 1148395

Title
Methods for Automatic Sensitive Data Detection in Large Datasets: a Review

Authors
Kužina, Vjeko ; Vušak, Eugen ; Jović, Alan

Type, subtype and category of work
Papers in conference proceedings, full paper (in extenso), scientific

Source
MIPRO 2021 Proceedings / Skala, Karolj - Rijeka : Hrvatska udruga za informacijsku i komunikacijsku tehnologiju, elektroniku i mikroelektroniku - MIPRO, 2021, 213-218

Conference
44th International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO 2021)

Place and date
Opatija, Croatia, 27.09.2021 - 01.10.2021

Type of participation
Lecture

Type of review
International peer review

Keywords
sensitive data ; detection ; de-identification ; unstructured data ; machine learning ; named entity recognition

Abstract
In recent years, the need for detection and de-identification of sensitive data in both structured and unstructured forms has increased. The methods used for these tasks have evolved accordingly and currently there are many solutions in different areas of interest. This paper describes the need for the detection of sensitive data in large datasets and describes the challenges associated with automating the detection process. It gives a brief overview of the rule-based and machine learning methods used in this area and examples of their application. The advantages and disadvantages of the described methods are also discussed. We show that the most recent detection solutions are based on the latest and most advanced models proposed in the field of natural language processing, but that there are still some rule-based methods used for certain types of sensitive data.

Original language
English

Scientific fields
Computing



RELATED TO THIS WORK


Institutions:
Fakultet elektrotehnike i računarstva, Zagreb

Profiles:

Vjeko Kužina (author)

Alan Jović (author)

Cite this publication:

Kužina, V., Vušak, E. & Jović, A. (2021) Methods for Automatic Sensitive Data Detection in Large Datasets: a Review. In: Skala, K. (ed.) MIPRO 2021 Proceedings.
@article{article, author = {Ku\v{z}ina, Vjeko and Vu\v{s}ak, Eugen and Jovi\'{c}, Alan}, editor = {Skala, K.}, year = {2021}, pages = {213-218}, keywords = {sensitive data, detection, de-identification, unstructured data, machine learning, named entity recognition}, title = {Methods for Automatic Sensitive Data Detection in Large Datasets: a Review}, keyword = {sensitive data, detection, de-identification, unstructured data, machine learning, named entity recognition}, publisher = {Hrvatska udruga za informacijsku i komunikacijsku tehnologiju, elektroniku i mikroelektroniku - MIPRO}, publisherplace = {Opatija, Hrvatska} }