Data source: CROSBI

A Generic Method for Multi Word Extraction from Wikipedia (CROSBI ID 49163)

Book chapter | original scientific paper

Bekavac, Božo ; Tadić, Marko. A Generic Method for Multi Word Extraction from Wikipedia // Technologies for the Processing and Retrieval of Semi-Structured Documents: Experience from the CADIAL Project / Tadić, Marko ; Dalbelo Bašić, Bojana ; Moens, Marie-Francine (eds.). Zagreb: Hrvatsko društvo za jezične tehnologije, 2009, pp. 115-124

Responsibility details

Bekavac, Božo ; Tadić, Marko

English

A Generic Method for Multi Word Extraction from Wikipedia

This paper presents a generic method for multiword expression extraction from Wikipedia. The method uses the properties of this specific encyclopedic genre in its HTML format and relies on the intention of article authors to link to other articles. The relevant links were processed by applying local regular grammars within the NooJ development environment. We tested the method on the Croatian version of Wikipedia and present the results obtained.
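The link-based idea in the abstract can be illustrated with a rough sketch. This is not the authors' implementation: the paper applies local regular grammars in NooJ, whereas the sketch below swaps in Python's standard `html.parser` and a simple word-count filter. The class name `WikiLinkCollector`, the function `extract_multiword_candidates`, and the assumption that article links have the form `/wiki/Title` (with a colon marking namespace pages) are all hypothetical and chosen only for illustration.

```python
from html.parser import HTMLParser


class WikiLinkCollector(HTMLParser):
    """Collect the anchor text of internal Wikipedia article links."""

    def __init__(self):
        super().__init__()
        self._in_link = False
        self._buffer = []
        self.anchors = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        # Assumption: article links look like /wiki/Title; a colon marks
        # namespace pages (files, categories), which are skipped here.
        if href.startswith("/wiki/") and ":" not in href:
            self._in_link = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_link:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._in_link:
            self._in_link = False
            # Normalize internal whitespace in the anchor text.
            text = " ".join("".join(self._buffer).split())
            if text:
                self.anchors.append(text)


def extract_multiword_candidates(html):
    """Return unique link texts with two or more words, in document order."""
    collector = WikiLinkCollector()
    collector.feed(html)
    collector.close()
    candidates = []
    for text in collector.anchors:
        if len(text.split()) >= 2 and text not in candidates:
            candidates.append(text)
    return candidates
```

Calling `extract_multiword_candidates` on the HTML of an article would yield only candidate expressions; the grammar-based processing described in the abstract would still be needed to filter and classify them.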

multi word expressions, multi word extraction, Croatian, Wikipedia

Chapter details

115-124.

published

Book details

Technologies for the Processing and Retrieval of Semi-Structured Documents: Experience from the CADIAL Project

Tadić, Marko ; Dalbelo Bašić, Bojana ; Moens, Marie-Francine

Zagreb: Hrvatsko društvo za jezične tehnologije

2009.

978-953-55375-1-9

Related research fields

Information and communication sciences, Philology