Bibliographic record no.: 728626
Node Selectivity as a Measure for Graph-Based Keyword Extraction in Croatian News
Node Selectivity as a Measure for Graph-Based Keyword Extraction in Croatian News // 6th International Conference on Information Technologies and Information Society (ITIS2014) / Boshkoska, Biljana Mileva ; Levnajić, Zoran (eds.).
Šmarješke Toplice, 2014. (lecture, international peer review, unpublished work, scientific)
CROSBI ID: 728626
Title
Node Selectivity as a Measure for Graph-Based Keyword Extraction in Croatian News
Authors
Beliga, Slobodan ; Martinčić-Ipšić, Sanda
Type, subtype and category of work
Conference abstracts, unpublished work, scientific
Source
6th International Conference on Information Technologies and Information Society (ITIS2014)
/ Boshkoska, Biljana Mileva ; Levnajić, Zoran - Šmarješke Toplice, 2014
Conference
6th International Conference on Information Technologies and Information Society (ITIS2014)
Place and date
Šmarješke Toplice, Slovenia, 05.11.2014 - 07.11.2014
Type of participation
Lecture
Type of review
International peer review
Keywords
keyword extraction; keyword candidate; keyword ranking; keyword expansion; node selectivity; Croatian news; complex network
Abstract
In this paper, we introduce selectivity-based keyword extraction as a new unsupervised method for graph-based keyword extraction. The node selectivity measure is defined as the average weight distribution on the links of a single node and is used in the keyword candidate extraction procedure. In particular, we propose extracting keyword sequences three words long and show that the obtained results compare favourably with previously published results. Experiments were conducted on a dataset of Croatian news articles with keywords annotated by human experts. The selectivity-based keyword extraction method achieved an average F2 score of 25.32% on isolated documents and an F2 score of 42.07% on a document collection. The proposed method is derived solely from statistical and structural information, which is reflected in the topological properties of the text network. Furthermore, comparative results indicate that our simple graph-based method provides results that are comparable with more complex supervised and unsupervised methods, as well as with human annotators.
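The two quantities named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes an undirected co-occurrence network in which adjacent words are linked and the link weight counts how often the pair occurs, and it reads "average weight on the links of a node" as node strength divided by node degree. The function names and the toy token sequence are hypothetical, and the record does not say whether the paper uses directed links or a wider co-occurrence window. The F2 score is the standard F-beta measure with beta = 2, which weights recall more heavily than precision.

```python
from collections import defaultdict


def selectivity(tokens):
    """Return {word: strength / degree} for an undirected co-occurrence graph.

    Assumption: two words are linked when they are adjacent in `tokens`,
    and the link weight is the number of times that pair occurs.
    """
    weights = defaultdict(int)
    for a, b in zip(tokens, tokens[1:]):
        if a != b:  # skip self-loops from immediately repeated words
            weights[frozenset((a, b))] += 1

    strength = defaultdict(int)  # sum of weights on a node's links
    degree = defaultdict(int)    # number of distinct neighbours
    for edge, weight in weights.items():
        for node in edge:
            strength[node] += weight
            degree[node] += 1

    return {node: strength[node] / degree[node] for node in strength}


def f_beta(precision, recall, beta=2.0):
    """Standard F-beta score; beta=2 gives the F2 score."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


if __name__ == "__main__":
    toy_tokens = "a b a b a c".split()  # hypothetical toy input
    ranked = sorted(selectivity(toy_tokens).items(), key=lambda kv: -kv[1])
    print(ranked)
    print(f_beta(0.5, 1.0))
```

Under these assumptions, words with high selectivity are those whose links are few but heavily weighted, which is the property the abstract uses to pick keyword candidates.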
Original language
English
Scientific fields
Computing, Information and communication sciences
AFFILIATION OF THE WORK
Projects:
Uniri-LangNet
Institutions:
Faculty of Informatics and Digital Technologies, Rijeka