Domain Dependence of Statistical Named Entity Recognition and Classification in Croatian Texts (CROSBI ID 598149)
Conference paper in proceedings | original scientific paper | international peer review
Authorship details
Agić, Željko ; Bekavac, Božo
English
Domain Dependence of Statistical Named Entity Recognition and Classification in Croatian Texts
The influence of text domain selection on statistical named entity recognition and classification in Croatian texts is investigated. Two datasets of Croatian newspaper texts of differing text domains were manually annotated for named entities and used for training and testing the Stanford NER system for named entity recognition based on sequence labeling with CRF. State-of-the-art scores were observed in both domains. A strong preference for systems trained on mixed text domains is established by the experiment. The top-performing system was recorded with an overall F1-score of 0.876 on mixed-domain test sets, scoring 0.899 in one of the selected domains and 0.852 in the other. The single best domain F1-scores were recorded at 0.910 and 0.858.
text domain; domain dependence; named entity recognition; Croatian language
Contribution details
277-283.
2013.
published
Parent publication details
Proceedings of the 35th International Conference on Information Technology Interfaces (ITI 2013)
Lužar-Stiffler, Vesna ; Jarec, Iva
Zagreb: Sveučilišni računski centar Sveučilišta u Zagrebu (Srce)
978-953-7138-30-1
1330-1012
Conference details
35th International Conference on Information Technology Interfaces - ITI 2013
lecture
24.06.2013-27.06.2013
Cavtat, Croatia