Overview of bibliographic record number: 728626

Node Selectivity as a Measure for Graph-Based Keyword Extraction in Croatian News


Beliga, Slobodan; Martinčić-Ipšić, Sanda
Node Selectivity as a Measure for Graph-Based Keyword Extraction in Croatian News // 6th International Conference on Information Technologies and Information Society (ITIS2014) / Boshkoska, Biljana Mileva ; Levnajić, Zoran (eds.).
Šmarješke Toplice, Slovenia, 2014. (lecture, international peer review, unpublished paper, scientific)


Title
Node Selectivity as a Measure for Graph-Based Keyword Extraction in Croatian News

Authors
Beliga, Slobodan ; Martinčić-Ipšić, Sanda

Type, subtype and category of work
Conference abstracts, unpublished paper, scientific

Source
6th International Conference on Information Technologies and Information Society (ITIS2014) / Boshkoska, Biljana Mileva ; Levnajić, Zoran - Šmarješke Toplice, Slovenia, 2014

Conference
6th International Conference on Information Technologies and Information Society (ITIS2014)

Place and date
Šmarješke Toplice, Slovenia, 5-7 November 2014

Type of participation
Lecture

Type of review
International peer review

Keywords
Keyword extraction; keyword candidate; keyword ranking; keyword expansion; node selectivity; Croatian news; complex network

Abstract
In this paper, we introduce selectivity-based keyword extraction as a new unsupervised method for graph-based keyword extraction. The node selectivity measure is defined as the average weight distribution on the links of a single node and is used in the procedure of keyword candidate extraction. In particular, we propose extracting keyword sequences three words long and show that the obtained results compare favourably with previously published results. Experiments were conducted on a dataset of Croatian news articles with keywords annotated by human experts. The selectivity-based keyword extraction method achieved an average F2 score of 25.32% on isolated documents and an F2 score of 42.07% on a document collection. The proposed method is derived solely from statistical and structural information, which is reflected in the topological properties of the text network. Furthermore, comparative results indicate that our simple graph-based method provides results that are comparable with those of more complex supervised and unsupervised methods, as well as with those of human annotators.
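The abstract's definition of node selectivity can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes an undirected co-occurrence network in which adjacent words are linked and the link weight counts how often the pair occurs, and it reads "average weight on the links of a node" as node strength divided by node degree. The function names and the toy token sequence are hypothetical. The F2 score is the standard F-beta measure with beta = 2, which weights recall more heavily than precision.

```python
from collections import defaultdict


def build_cooccurrence_network(tokens):
    """Link adjacent words; each link weight counts co-occurrences (assumed setup)."""
    weights = defaultdict(int)
    for left, right in zip(tokens, tokens[1:]):
        if left != right:  # ignore self-loops
            weights[frozenset((left, right))] += 1
    return weights


def node_selectivity(weights):
    """Selectivity = strength (sum of link weights) / degree (number of links)."""
    strength = defaultdict(int)
    degree = defaultdict(int)
    for link, weight in weights.items():
        for node in link:
            strength[node] += weight
            degree[node] += 1
    return {node: strength[node] / degree[node] for node in strength}


def f_beta(precision, recall, beta=2.0):
    """Standard F-beta score; beta=2 gives the F2 score."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Toy token sequence, purely illustrative (not from the Croatian news dataset).
tokens = "a b a b a c".split()
selectivity = node_selectivity(build_cooccurrence_network(tokens))
ranked = sorted(selectivity, key=selectivity.get, reverse=True)
print(ranked, selectivity)
```

In this toy network the word "b" has a single heavily weighted link, so it ranks above "a", which has higher strength but spreads it over two links. That is the kind of distinction a selectivity ranking makes and a plain frequency count does not.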

Original language
English

Scientific fields
Computing, Information and communication sciences



AFFILIATIONS OF THE WORK


Project / topic
Uniri-LangNet

Institutions
University of Rijeka - Department of Informatics