Data source: CROSBI

Document Representation Methods for News Event Detection in Croatian (CROSBI ID 541390)

Conference paper in proceedings | original scientific paper | international peer review

Ljubešić, Nikola ; Agić, Željko ; Bakarić, Nikola. Document Representation Methods for News Event Detection in Croatian // Proceedings of the 6th International Conference on Formal Approaches to South Slavic and Balkan Languages / Tadić, Marko ; Dimitrova-Vulchanova, Mila ; Koeva, Svetla (eds.). Zagreb: Hrvatsko društvo za jezične tehnologije, 2008. pp. 79-84

Statement of responsibility

Ljubešić, Nikola ; Agić, Željko ; Bakarić, Nikola

English

Document Representation Methods for News Event Detection in Croatian

The constant increase in the amount of available data demands new ideas and approaches to organization and representation. Document clustering as a method for event detection uses, supplements, and upgrades existing information retrieval methods in order to improve knowledge management and representation. This article describes research on the impact of various document representation methods on cluster analysis. Several statistical and linguistic (NLP-based) morphological normalization methods of document representation are tested in an event detection scenario. Event detection was conducted on online newspaper articles published on a single day. Cluster analysis was performed using the various document representation methods and a clustering algorithm. The results were then compared against a human-evaluated gold standard. They show that both statistical and linguistic methods reduce representational complexity and minimally improve the results, which leads to the conclusion that statistical methods should be preferred for this task.
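The pipeline outlined in the abstract (represent documents, cluster them, score against a gold standard) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the prefix truncation and character n-grams stand in for "statistical" normalization, single-pass clustering with a cosine threshold stands in for the unnamed clustering algorithm, pairwise F1 stands in for the evaluation measure, and the three toy headlines are invented.

```python
import re
from collections import Counter
from itertools import combinations
from math import sqrt


def represent(text, method="word", n=4):
    """Turn a text into a bag-of-features Counter (hypothetical helper)."""
    tokens = re.findall(r"\w+", text.lower())
    if method == "word":      # raw word forms, no normalization
        return Counter(tokens)
    if method == "prefix":    # crude truncation to the first n characters
        return Counter(t[:n] for t in tokens)
    if method == "ngram":     # overlapping character n-grams within each token
        return Counter(t[i:i + n] for t in tokens
                       for i in range(max(1, len(t) - n + 1)))
    raise ValueError(f"unknown method: {method}")


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def single_pass_cluster(vectors, threshold=0.3):
    """Join the most similar existing cluster, or open a new one (assumed algorithm)."""
    centroids, labels = [], []
    for vec in vectors:
        sims = [cosine(vec, c) for c in centroids]
        best = max(range(len(sims)), key=sims.__getitem__, default=None)
        if best is not None and sims[best] >= threshold:
            centroids[best] = centroids[best] + vec  # summed counts act as centroid
            labels.append(best)
        else:
            centroids.append(Counter(vec))
            labels.append(len(centroids) - 1)
    return labels


def pairwise_f1(predicted, gold):
    """F1 over document pairs placed in the same cluster (assumed measure)."""
    pairs = list(combinations(range(len(gold)), 2))
    tp = sum(predicted[i] == predicted[j] and gold[i] == gold[j] for i, j in pairs)
    pred_pos = sum(predicted[i] == predicted[j] for i, j in pairs)
    gold_pos = sum(gold[i] == gold[j] for i, j in pairs)
    if not tp:
        return 0.0
    p, r = tp / pred_pos, tp / gold_pos
    return 2 * p * r / (p + r)


# Invented toy data: two inflected variants of one event, plus an unrelated one.
docs = ["vlada usvojila proračun", "vlade usvojile proračune", "nogomet utakmica pobjeda"]
gold = [0, 0, 1]

for method in ("word", "prefix", "ngram"):
    labels = single_pass_cluster([represent(d, method) for d in docs])
    print(method, labels, round(pairwise_f1(labels, gold), 2))
```

The toy data is contrived so that inflection alone separates the first two documents; it says nothing about the size of the effect reported in the paper, which the abstract describes as minimal.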

document representation ; document clustering ; news event detection

Contribution details

79-84.

2008.

published

Host publication details

Proceedings of the 6th International Conference on Formal Approaches to South Slavic and Balkan Languages

Tadić, Marko ; Dimitrova-Vulchanova, Mila ; Koeva, Svetla

Zagreb: Hrvatsko društvo za jezične tehnologije

978-953-55375-0-2

Conference details

6th International Conference on Formal Approaches to South Slavic and Balkan Languages (FASSBL 2008)

lecture

25.09.2008-28.09.2008

Dubrovnik, Croatia

Related research fields

Philology, Information and Communication Sciences, Computing