Data source: CROSBI

New inflectional lexicons and training corpora for improved morphosyntactic annotation of Croatian and Serbian (CROSBI ID 635177)

Conference paper in proceedings | original scientific paper | international peer review

Ljubešić, Nikola ; Klubička, Filip ; Agić, Željko ; Jazbec, Ivo-Pavao. New inflectional lexicons and training corpora for improved morphosyntactic annotation of Croatian and Serbian // Proceedings of the Tenth International conference on language resources and evaluation (LREC 2016). Portorož: European Language Resources Association (ELRA), 2016. pp. 4264-4270

Responsibility details

Ljubešić, Nikola ; Klubička, Filip ; Agić, Željko ; Jazbec, Ivo-Pavao

English

New inflectional lexicons and training corpora for improved morphosyntactic annotation of Croatian and Serbian

In this paper we present newly developed inflectional lexicons and manually annotated corpora of Croatian and Serbian. We introduce hrLex and srLex, two freely available inflectional lexicons of Croatian and Serbian, and describe the process of building these lexicons, supported by supervised machine learning techniques for lemma and paradigm prediction. Furthermore, we introduce hr500k, a manually annotated corpus of Croatian, 500 thousand tokens in size. We showcase the three newly developed resources on the task of morphosyntactic annotation of both languages by using a recently developed CRF tagger. We achieve the best results yet reported on the task for both languages, beating the HunPos baseline trained on the same datasets by a wide margin.

inflectional lexicon ; morphosyntactic annotation ; Croatian ; Serbian

Contribution details

4264-4270.

2016.

published

Host publication details

Proceedings of the Tenth International conference on language resources and evaluation (LREC 2016)

Portorož: European Language Resources Association (ELRA)

978-2-9517408-9-1

Conference details

Tenth International Conference on Language Resources and Evaluation (LREC 2016)

poster

23.05.2016-28.05.2016

Portorož, Slovenia

Related research fields

Information and communication sciences, Computing
