Bibliographic record number: 701463

Multi-label Classification of Croatian Legal Documents Using EuroVoc Thesaurus


Šarić, Frane; Dalbelo Bašić, Bojana; Moens, Marie-Francine; Šnajder, Jan
Multi-label Classification of Croatian Legal Documents Using EuroVoc Thesaurus // Proceedings of Workshop on Semantic Processing of Legal Texts (SPLeT2014)
Reykjavík: European Language Resources Association (ELRA), 2014. pp. 7-12 (lecture, international peer review, full paper (in extenso), scientific)


CROSBI ID: 701463

Title
Multi-label Classification of Croatian Legal Documents Using EuroVoc Thesaurus

Authors
Šarić, Frane ; Dalbelo Bašić, Bojana ; Moens, Marie-Francine ; Šnajder, Jan

Type, subtype, and category of work
Papers in conference proceedings, full paper (in extenso), scientific

Source
Proceedings of Workshop on Semantic Processing of Legal Texts (SPLeT2014). Reykjavík: European Language Resources Association (ELRA), 2014, pp. 7-12

Conference
Workshop on Semantic Processing of Legal Texts

Place and date
Reykjavík, Iceland, 26.05.2014 - 31.05.2014

Type of participation
Lecture

Type of review
International peer review

Keywords
multi-label classification; automatic indexing; class sparsity; EuroVoc thesaurus; legal documents

Abstract
The automatic indexing of legal documents can improve access to legislation. The EuroVoc thesaurus has been used to index documents of the European Parliament as well as national legislation. A number of studies exist that address the task of automatic EuroVoc indexing. In this paper we describe our work on EuroVoc indexing of Croatian legislative documents. We focus on the machine learning aspect of the problem. First, we describe the manually indexed collection of Croatian legislative documents, which we make freely available. Second, we describe the multi-label classification experiments on this collection. A challenge of EuroVoc indexing is class sparsity, and we discuss some strategies to address it. Our best model achieves a precision of 79.7%, recall of 60.2%, and F1 score of 68.6%.
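The three scores reported in the abstract can be related with a short calculation. The sketch below is only an illustration, not the authors' evaluation code: it assumes F1 is the harmonic mean of precision and recall, and it shows one common way (micro-averaging) to score multi-label predictions. The record does not say which averaging scheme the paper uses, and the function names and the toy label sets are hypothetical.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def micro_prf(gold: list[set[str]], predicted: list[set[str]]) -> tuple[float, float, float]:
    """Micro-averaged precision, recall, and F1 over per-document label sets.

    Assumption: counts are pooled over all documents and labels before
    dividing. This is one common convention, not necessarily the paper's.
    """
    true_positives = sum(len(g & p) for g, p in zip(gold, predicted))
    n_predicted = sum(len(p) for p in predicted)
    n_gold = sum(len(g) for g in gold)
    precision = true_positives / n_predicted if n_predicted else 0.0
    recall = true_positives / n_gold if n_gold else 0.0
    return precision, recall, f1_score(precision, recall)


# The figures from the abstract: precision 79.7%, recall 60.2%.
reported_f1 = f1_score(0.797, 0.602)
print(f"{reported_f1:.3f}")  # 0.686

# Hypothetical toy data: two documents with made-up descriptor labels.
toy_gold = [{"a", "b"}, {"c"}]
toy_predicted = [{"a"}, {"c", "d"}]
toy_p, toy_r, toy_f1 = micro_prf(toy_gold, toy_predicted)
```

Under this assumption, the harmonic mean of the reported precision and recall rounds to 68.6%, which agrees with the F1 score stated in the abstract.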

Original language
English

Scientific fields
Computing



RELATED RECORDS


Projects:
036-1300646-1986 - Otkrivanje znanja u tekstnim podacima [Knowledge Discovery in Textual Data] (Dalbelo-Bašić, Bojana, MZO) (CroRIS)

Institutions:
Faculty of Electrical Engineering and Computing, Zagreb

Profiles:

Jan Šnajder (author)

Bojana Dalbelo Bašić (author)

Frane Šarić (author)

Cite this publication:

Šarić, Frane; Dalbelo Bašić, Bojana; Moens, Marie-Francine; Šnajder, Jan
Multi-label Classification of Croatian Legal Documents Using EuroVoc Thesaurus // Proceedings of Workshop on Semantic Processing of Legal Texts (SPLeT2014)
Reykjavík: European Language Resources Association (ELRA), 2014. pp. 7-12 (lecture, international peer review, full paper (in extenso), scientific)
Šarić, F., Dalbelo Bašić, B., Moens, M. & Šnajder, J. (2014) Multi-label Classification of Croatian Legal Documents Using EuroVoc Thesaurus. In: Proceedings of Workshop on Semantic Processing of Legal Texts (SPLeT2014).
@inproceedings{article,
  author = {\v{S}ari\'{c}, Frane and Dalbelo Ba\v{s}i\'{c}, Bojana and Moens, Marie-Francine and \v{S}najder, Jan},
  title = {Multi-label Classification of Croatian Legal Documents Using EuroVoc Thesaurus},
  booktitle = {Proceedings of Workshop on Semantic Processing of Legal Texts (SPLeT2014)},
  year = {2014},
  pages = {7--12},
  keywords = {multi-label classification, automatic indexing, class sparsity, EuroVoc thesaurus, legal documents},
  publisher = {European Language Resources Association (ELRA)},
  address = {Reykjav\'{\i}k, Iceland}
}