Data source: CROSBI

Named Entity Recognition in Croatian Tweets (CROSBI ID 619157)

Conference paper in proceedings | original scientific paper | international peer review

Baksa, Krešimir ; Dolović, Dino ; Glavaš, Goran ; Šnajder, Jan. Named Entity Recognition in Croatian Tweets // Proceedings of the Ninth Language Technologies Conference, Information Society (IS-JT 2014). Ljubljana, 2014, pp. 85-89

Authorship details

Baksa, Krešimir ; Dolović, Dino ; Glavaš, Goran ; Šnajder, Jan

English

Named Entity Recognition in Croatian Tweets

Existing named entity extraction tools, typically designed for formal texts written in standard language (e.g., news stories, essays, or legal texts), do not perform well on user-generated content (e.g., tweets). In this paper we present a supervised approach for named entity recognition and classification for Croatian tweets. Comparison of three different sequence labeling models (HMM, CRF, and SVM) revealed that CRF is the best model for the task, achieving a micro-averaged F1-score of over 87%. We also demonstrate that the state-of-the-art NER model designed for Croatian standard language texts performs much worse than our Twitter-specific NER models.

Named entity recognition; information extraction; Twitter data; Croatian language

Contribution details

85-89.

2014.

published

Host publication details

Proceedings of the Ninth Language Technologies Conference, Information Society (IS-JT 2014)

Ljubljana:

Conference details

Ninth Language Technologies Conference, Information Society (IS-JT 2014)

lecture

9 October 2014 - 10 October 2014

Ljubljana, Slovenia

Research field

Computing

Links