Bibliographic record number: 815830

New inflectional lexicons and training corpora for improved morphosyntactic annotation of Croatian and Serbian


Ljubešić, Nikola; Klubička, Filip; Agić, Željko; Jazbec, Ivo-Pavao
New inflectional lexicons and training corpora for improved morphosyntactic annotation of Croatian and Serbian // Proceedings of the Tenth International conference on language resources and evaluation (LREC 2016)
Portorož: European Language Resources Association (ELRA), 2016, pp. 4264-4270 (poster, international peer review, full paper (in extenso), scientific)


CROSBI ID: 815830

Title
New inflectional lexicons and training corpora for improved morphosyntactic annotation of Croatian and Serbian

Authors
Ljubešić, Nikola ; Klubička, Filip ; Agić, Željko ; Jazbec, Ivo-Pavao

Type, subtype and category of work
Papers in conference proceedings, full paper (in extenso), scientific

Source
Proceedings of the Tenth International conference on language resources and evaluation (LREC 2016) / - Portorož : European Language Resources Association (ELRA), 2016, 4264-4270

ISBN
978-2-9517408-9-1

Conference
Tenth International conference on language resources and evaluation - LREC 2016

Place and date
Portorož, Slovenia, 23-28 May 2016

Type of participation
Poster

Type of review
International peer review

Keywords
inflectional lexicon ; morphosyntactic annotation ; Croatian ; Serbian

Abstract
In this paper we present newly developed inflectional lexicons and manually annotated corpora of Croatian and Serbian. We introduce hrLex and srLex, two freely available inflectional lexicons of Croatian and Serbian, and describe the process of building these lexicons, supported by supervised machine learning techniques for lemma and paradigm prediction. Furthermore, we introduce hr500k, a manually annotated corpus of Croatian, 500 thousand tokens in size. We showcase the three newly developed resources on the task of morphosyntactic annotation of both languages by using a recently developed CRF tagger. We achieve the best results yet reported on the task for both languages, beating the HunPos baseline trained on the same datasets by a wide margin.

Original language
English

Scientific fields
Computing, Information and communication sciences



AFFILIATIONS OF THE WORK


Institutions:
Filozofski fakultet (Faculty of Humanities and Social Sciences), Zagreb

Profiles:

Željko Agić (author)

Nikola Ljubešić (author)

Links to the full text of the paper:

Full-text access: www.lrec-conf.org

Cite this publication:

Ljubešić, Nikola; Klubička, Filip; Agić, Željko; Jazbec, Ivo-Pavao
New inflectional lexicons and training corpora for improved morphosyntactic annotation of Croatian and Serbian // Proceedings of the Tenth International conference on language resources and evaluation (LREC 2016)
Portorož: European Language Resources Association (ELRA), 2016, pp. 4264-4270 (poster, international peer review, full paper (in extenso), scientific)
Ljubešić, N., Klubička, F., Agić, Ž. & Jazbec, I. (2016) New inflectional lexicons and training corpora for improved morphosyntactic annotation of Croatian and Serbian. In: Proceedings of the Tenth International conference on language resources and evaluation (LREC 2016).
@article{article, author = {Ljube\v{s}i\'{c}, Nikola and Klubi\v{c}ka, Filip and Agi\'{c}, \v{Z}eljko and Jazbec, Ivo-Pavao}, year = {2016}, pages = {4264-4270}, keywords = {inflectional lexicon, morphosyntactic annotation, Croatian, Serbian}, isbn = {978-2-9517408-9-1}, title = {New inflectional lexicons and training corpora for improved morphosyntactic annotation of Croatian and Serbian}, keyword = {inflectional lexicon, morphosyntactic annotation, Croatian, Serbian}, publisher = {European Language Resources Association (ELRA)}, publisherplace = {Portoro\v{z}, Slovenija} }



