Bibliographic record no. 723769

Toward Selectivity Based Keyword Extraction for Croatian News


Beliga, Slobodan; Meštrović, Ana; Martinčić-Ipšić, Sanda
Toward Selectivity Based Keyword Extraction for Croatian News // Surfacing the Deep and the Social Web (SDSW 2014) / Rupino da Cunha, Paulo ; Nguyen, Ngoc Thanh ; Boucelma, Omar ; Cautis, Bogdan ; Velegrakis, Yannis (eds.).
Italy: CEUR Proc. vol. 1310, 2014. pp. 1-14 (lecture, international peer review, full paper (in extenso), scientific)


Title
Toward Selectivity Based Keyword Extraction for Croatian News
(Toward Selectivity-Based Keyword Extraction for Croatian News)

Authors
Beliga, Slobodan ; Meštrović, Ana ; Martinčić-Ipšić, Sanda

Type, subtype and category of work
Papers in conference proceedings, full paper (in extenso), scientific

Source
Surfacing the Deep and the Social Web (SDSW 2014) / Rupino da Cunha, Paulo ; Nguyen, Ngoc Thanh ; Boucelma, Omar ; Cautis, Bogdan ; Velegrakis, Yannis - Italy : CEUR Proc. vol. 1310, 2014, 1-14

Conference
Surfacing the Deep and the Social Web

Place and date
Italy, 19.10

Type of participation
Lecture

Type of review
International peer review

Keywords
Keyword extraction; complex network; centrality measures; selectivity; Croatian news texts

Abstract
Our approach proposes a novel network measure - the node selectivity for the task of keyword extraction. The node selectivity is defined as the average strength of the node. Firstly, we show that selectivity-based keyword extraction slightly outperforms the extraction based on the standard centrality measures: in-degree, out-degree, betweenness, and closeness. Furthermore, from the data set of Croatian news we extract keyword candidates and expand extracted nodes to word-tuples ranked with the highest in/out selectivity values. The obtained sets are evaluated on manually annotated keywords: for the set of extracted keyword candidates the average F1 score is 24.63%, and the average F2 score is 21.19%; for the extracted word-tuples candidates the average F1 score is 25.9% and the average F2 score is 24.47%. Selectivity-based extraction does not require linguistic knowledge while it is purely derived from statistical and structural information of the network.
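
The selectivity measure named in the abstract can be illustrated with a minimal Python sketch. It rests on two assumptions that this record does not state: that the network is a directed, weighted co-occurrence network linking each word to the word that directly follows it, and that "average strength" means the node's strength (sum of edge weights) divided by its degree (number of distinct edges), computed separately for incoming and outgoing edges. The function names and the toy token list are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def build_cooccurrence_network(tokens):
    """Directed weighted network: weight of (u, v) counts how often v directly follows u.

    Assumption: adjacency-based co-occurrence; the record does not specify the construction.
    """
    weights = defaultdict(int)
    for u, v in zip(tokens, tokens[1:]):
        weights[(u, v)] += 1
    return dict(weights)


def selectivity(weights):
    """Return (in_selectivity, out_selectivity) as dicts mapping node -> strength / degree.

    Assumption: "average strength" is read as strength divided by degree.
    Nodes without incoming (outgoing) edges are absent from the respective dict.
    """
    in_strength, out_strength = defaultdict(int), defaultdict(int)
    in_degree, out_degree = defaultdict(int), defaultdict(int)
    for (u, v), w in weights.items():
        out_strength[u] += w   # total weight leaving u
        out_degree[u] += 1     # number of distinct successors of u
        in_strength[v] += w    # total weight entering v
        in_degree[v] += 1      # number of distinct predecessors of v
    in_sel = {n: in_strength[n] / in_degree[n] for n in in_degree}
    out_sel = {n: out_strength[n] / out_degree[n] for n in out_degree}
    return in_sel, out_sel


# Toy example (hypothetical data, not from the Croatian news data set).
tokens = "a b a b a c".split()
weights = build_cooccurrence_network(tokens)
in_sel, out_sel = selectivity(weights)

# Rank keyword candidates by in-selectivity, highest first.
ranked = sorted(in_sel, key=in_sel.get, reverse=True)
```

Under this reading, a word that co-occurs repeatedly with only a few distinct neighbours receives a high selectivity value, whereas a word spread thinly over many neighbours receives a low one.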

Original language
English

Scientific fields
Computing, Information and communication sciences



PAPER AFFILIATIONS


Project / topic
Uniri-LangNet

Institutions
Sveučilište u Rijeci - Odjel za informatiku