Bibliographic record no.: 616808

Building Named Entity Recognition Models for Croatian and Slovene


Ljubešić, Nikola; Stupar, Marija; Jurić, Tereza
Building Named Entity Recognition Models for Croatian and Slovene // Proceedings of the Eighth LANGUAGE TECHNOLOGIES Conference / Erjavec, Tomaž ; Žganec Gros, Jerneja (eds.).
Ljubljana, 2012. pp. 129-134 (lecture, international peer review, full paper (in extenso), scientific)


CROSBI ID: 616808

Title
Building Named Entity Recognition Models for Croatian and Slovene

Authors
Ljubešić, Nikola ; Stupar, Marija ; Jurić, Tereza

Type, subtype and category of work
Papers in conference proceedings, full paper (in extenso), scientific

Source
Proceedings of the Eighth LANGUAGE TECHNOLOGIES Conference / Erjavec, Tomaž ; Žganec Gros, Jerneja - Ljubljana, 2012, 129-134

Conference
Eighth LANGUAGE TECHNOLOGIES Conference

Place and date
Ljubljana, Slovenia, 08.10.2012 - 09.10.2012

Type of participation
Lecture

Type of review
International peer review

Keywords
named entity recognition; distributional similarity; Croatian language; Slovene language

Abstract
The paper presents efforts to develop freely available models for named entity recognition and classification for Croatian and Slovene. Our experiments focus on the most informative set of linguistic features, taking into account the availability of language tools for the languages in question. Besides the classic linguistic features, distributional similarity features calculated from large unannotated monolingual corpora are exploited as well. Using distributional information improves the results by 7-8 points in F1, while adding morphological information improves the results by an additional 3-4 points in both languages. The best-performing models, along with test sets for comparison with future and existing systems and a HunPos part-of-speech model for Croatian, are available for download for academic use.

Original language
English

Scientific fields
Information and communication sciences



AFFILIATIONS OF THE WORK


Projects:
FP7-271022
130-1301679-1380 - Hrvatska rječnička baština i hrvatski europski identitet (Boras, Damir, MZOS) (CroRIS)

Institutions:
Filozofski fakultet, Zagreb

Profiles:

Nikola Ljubešić (author)


Cite this publication:

Ljubešić, N., Stupar, M. & Jurić, T. (2012) Building Named Entity Recognition Models for Croatian and Slovene. In: Erjavec, T. & Žganec Gros, J. (eds.) Proceedings of the Eighth LANGUAGE TECHNOLOGIES Conference.
@article{article,
  author = {Ljube\v{s}i\'{c}, Nikola and Stupar, Marija and Juri\'{c}, Tereza},
  year = {2012},
  pages = {129-134},
  keywords = {named entity recognition, distributional similarity, Croatian language, Slovene language},
  title = {Building Named Entity Recognition Models for Croatian and Slovene},
  keyword = {named entity recognition, distributional similarity, Croatian language, Slovene language},
  publisherplace = {Ljubljana, Slovenija}
}



