Bibliographic record no.: 433101
Automatic Keyphrase Extraction from Croatian Newspaper Articles
Automatic Keyphrase Extraction from Croatian Newspaper Articles // The Future of Information Sciences, Digital Resources and Knowledge Sharing / Stančić, Hrvoje ; Selja, Sanja ; Bawden, David ; Lasić-Lazić, Jadranka ; Slavić, Aida (eds.).
Zagreb, Croatia, 2009, pp. 207-218 (lecture, international peer review, full paper (in extenso), scientific)
CROSBI ID: 433101
Title
Automatic Keyphrase Extraction from Croatian Newspaper Articles
Authors
Ahel, Renee ; Dalbelo Bašić, Bojana ; Šnajder, Jan
Type, subtype and category of work
Papers in conference proceedings, full paper (in extenso), scientific
Source
The Future of Information Sciences, Digital Resources and Knowledge Sharing
/ Stančić, Hrvoje ; Selja, Sanja ; Bawden, David ; Lasić-Lazić, Jadranka ; Slavić, Aida (eds.), 2009, pp. 207-218
ISBN
978-953-175-305-0
Conference
2nd International Conference The Future of Information Sciences (INFuture 2009)
Place and date
Zagreb, Croatia, 04.11.2009 - 06.11.2009
Type of participation
Lecture
Type of review
International peer review
Keywords
keyphrase extraction; naïve Bayes classifier; Croatian language
Abstract
Keyphrases provide a way to summarize documents and enable cross-category retrieval. The paper describes a robust system for automatic keyphrase extraction from newspaper articles in the Croatian language. Keyphrase candidates are generated based on linguistic and statistical features, and a naïve Bayes classifier is used to select the best keyphrases among the candidates. A prediction model is built using training documents with human-assigned keyphrases. System performance is measured on a corpus of newspaper articles by comparing the automatically extracted keyphrases with those assigned by professional indexers. In the absence of comparable results, we consider the performance of our system to be modest.
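The pipeline outlined in the abstract (candidate generation, feature computation, naïve Bayes selection trained on human-assigned keyphrases) might be sketched roughly as follows. This is an illustrative sketch only, not the authors' system: the record does not specify the features, so the two used here (whether a phrase repeats, and whether it first appears early in the text) are assumptions, as are all function names, the bin thresholds, and the plain word n-gram candidates. A system for Croatian would presumably also need morphological normalization, which is omitted here.

```python
import math
import re
from collections import Counter, defaultdict


def candidates(text, max_len=3):
    """Map each word n-gram to (occurrence count, relative first position)."""
    tokens = re.findall(r"\w+", text.lower())
    total = max(len(tokens), 1)
    counts, first = Counter(), {}
    for n in range(1, max_len + 1):
        for i in range(len(tokens) - n + 1):
            phrase = " ".join(tokens[i:i + n])
            counts[phrase] += 1
            first.setdefault(phrase, i / total)
    return {p: (counts[p], first[p]) for p in counts}


def featurize(count, first_pos):
    """Assumed features: coarse bins for repetition and first occurrence."""
    return (
        "repeated" if count >= 2 else "single",
        "early" if first_pos < 0.3 else "late",
    )


class NaiveBayes:
    """Categorical naive Bayes with add-alpha (Laplace) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.class_counts = Counter()
        self.feature_counts = defaultdict(Counter)  # (label, index) -> value counts
        self.values = defaultdict(set)              # index -> values seen in training

    def fit(self, X, y):
        for features, label in zip(X, y):
            self.class_counts[label] += 1
            for j, v in enumerate(features):
                self.feature_counts[(label, j)][v] += 1
                self.values[j].add(v)
        return self

    def _log_score(self, features, label):
        total = sum(self.class_counts.values())
        score = math.log(self.class_counts[label] / total)
        for j, v in enumerate(features):
            num = self.feature_counts[(label, j)][v] + self.alpha
            den = self.class_counts[label] + self.alpha * len(self.values[j])
            score += math.log(num / den)
        return score

    def prob_keyphrase(self, features):
        """Posterior probability that a candidate is a keyphrase."""
        pos = self._log_score(features, True)
        neg = self._log_score(features, False)
        m = max(pos, neg)
        return math.exp(pos - m) / (math.exp(pos - m) + math.exp(neg - m))


def train(documents):
    """documents: list of (text, set of human-assigned keyphrases).

    The training data must yield both keyphrase and non-keyphrase candidates.
    """
    X, y = [], []
    for text, gold in documents:
        gold = {g.lower() for g in gold}
        for phrase, (count, pos) in candidates(text).items():
            X.append(featurize(count, pos))
            y.append(phrase in gold)
    return NaiveBayes().fit(X, y)


def extract(model, text, k=5):
    """Return the k candidates with the highest keyphrase probability."""
    scored = [
        (model.prob_keyphrase(featurize(c, p)), phrase)
        for phrase, (c, p) in candidates(text).items()
    ]
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [phrase for _, phrase in scored[:k]]
```

Evaluation as described in the abstract would then compare the output of `extract` for held-out articles against the keyphrases assigned by indexers, for example by counting exact matches.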
Original language
English
Scientific fields
Computing
ASSOCIATIONS OF THE WORK
Projects:
036-1300646-1986 - Otkrivanje znanja u tekstnim podacima [Knowledge Discovery in Textual Data] (Dalbelo-Bašić, Bojana, MZO) (CroRIS)
Institutions:
Faculty of Electrical Engineering and Computing, Zagreb