Data source: CROSBI

Information Extraction from Free-Form CV Documents in Multiple Languages (CROSBI ID 295560)

Journal article | original scientific paper | international peer review

Vukadin, Davor ; Kurdija, Adrian Satja ; Delač, Goran ; Šilić, Marin. Information Extraction from Free-Form CV Documents in Multiple Languages // IEEE Access, 9 (2021), 84559-84575. doi: 10.1109/access.2021.3087913

Statement of responsibility

Vukadin, Davor ; Kurdija, Adrian Satja ; Delač, Goran ; Šilić, Marin

English

Information Extraction from Free-Form CV Documents in Multiple Languages

This paper proposes two natural language processing models for extracting useful information from multilingual, unstructured (free-form) CV documents. The models identify the relevant document sections (personal information, education, employment, etc.) and the corresponding specific information at the lower hierarchy level (names, addresses, roles, skill competences, etc.). Our approach employs the transformer architecture and its multilingual implementation of the encoder part in the form of the BERT language model. The models are trained and tested on a large, manually annotated CV dataset, achieving high scores on standard accuracy measures. The proposed models exhibit important properties of end-to-end training and interpretability, which was investigated by visualizing the model attention and its vector representations.

Information retrieval ; Natural language processing ; Text analysis ; Recurrent neural networks ; CV parsing

Publication details

Volume: 9

Year: 2021

Pages: 84559-84575

Status: published

ISSN: 2169-3536

DOI: 10.1109/access.2021.3087913

Open-access publication cost

APC

Paper affiliation

Computing
