Bibliographic record number: 356624

Building a Search Engine Model with Morphological Normalization Support


Mijić, Jure; Dalbelo Bašić, Bojana; Šnajder, Jan
Building a Search Engine Model with Morphological Normalization Support // Proceedings of the ITI 2008 30th Int. Conf. on Information Technology Interfaces / Luzar - Stiffler, Vesna ; Hljuz Dobric, Vesna ; Bekic, Zoran (eds.).
Zagreb: SRCE, 2008. pp. 619-624 (lecture, international peer review, full paper (in extenso), scientific)


Title
Building a Search Engine Model with Morphological Normalization Support

Authors
Mijić, Jure ; Dalbelo Bašić, Bojana ; Šnajder, Jan

Type, subtype and category of work
Papers in conference proceedings, full paper (in extenso), scientific

Source
Proceedings of the ITI 2008 30th Int. Conf. on Information Technology Interfaces / Luzar - Stiffler, Vesna ; Hljuz Dobric, Vesna ; Bekic, Zoran - Zagreb : SRCE, 2008, 619-624

ISBN
978-953-7138-13-4

Conference
30th Int. Conf. on Information Technology Interfaces

Place and date
Cavtat, Croatia, 23-26 June 2008

Type of participation
Lecture

Type of review
International peer review

Keywords
Information need; Information retrieval; Morphological normalization; Search engine

Abstract
Searching a collection of documents can seem like an easy task, but manipulating textual data can be difficult because the data are mostly unstructured. We undertook the task of building an effective search engine for a collection of Croatian legislative documents. The developed search engine model supports multiple modules for information retrieval. To improve the effectiveness of the retrieval, we used a morphological normalization module that uses an inflectional lexicon automatically acquired from a document corpus. As we do not have a gold standard for our legislative document collection, we evaluated our search engine on three English test collections, explored the effects of stemming, and compared the results to the vector space model.
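
The idea summarized in the abstract, mapping inflected word forms to a common normalized form before matching queries against documents, can be illustrated with a minimal sketch. This is not the authors' implementation: the toy lexicon entries, the function names (`normalize`, `tokenize`, `search`), and the TF-IDF weighting with cosine ranking are all assumptions chosen for illustration, and the paper's actual lexicon is acquired automatically from a corpus rather than written by hand.

```python
import math
import re
from collections import Counter

# Hypothetical hand-written inflectional lexicon (word form -> normalized form).
# Assumption for illustration only; the paper acquires its lexicon automatically.
LEXICON = {
    "zakona": "zakon",
    "zakonu": "zakon",
    "zakoni": "zakon",
    "poreza": "porez",
    "porezu": "porez",
    "suda": "sud",
}


def normalize(token):
    """Lowercase a token and map it to its normalized form if the lexicon knows it."""
    token = token.lower()
    return LEXICON.get(token, token)


def tokenize(text):
    """Split text into word tokens and normalize each one."""
    return [normalize(t) for t in re.findall(r"\w+", text)]


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query, docs):
    """Return indices of matching documents, best match first (TF-IDF + cosine)."""
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    # Document frequency: number of documents containing each normalized term.
    df = Counter(term for toks in tokenized for term in set(toks))
    idf = {t: math.log(n / df[t]) + 1.0 for t in df}
    vectors = [
        {t: c * idf[t] for t, c in Counter(toks).items()} for toks in tokenized
    ]
    # Query terms unseen in the collection get weight 0 and cannot match.
    q = {t: c * idf.get(t, 0.0) for t, c in Counter(tokenize(query)).items()}
    scored = [(cosine(q, v), i) for i, v in enumerate(vectors)]
    ranked = sorted(scored, key=lambda p: (-p[0], p[1]))
    return [i for score, i in ranked if score > 0]


docs = ["Tekst zakona o porezu", "Odluka suda", "Novi zakoni"]
results = search("zakon", docs)
```

In this toy collection the query "zakon" never appears verbatim, so exact string matching would retrieve nothing; after normalization both documents containing an inflected form of the word are retrieved, which is the kind of recall gain morphological normalization aims for.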

Original language
English

Scientific fields
Computer science



RELATED PROJECTS AND INSTITUTIONS


Project / topic
036-1300646-1986 - Knowledge Discovery in Textual Data (Bojana Dalbelo-Bašić)

Institutions
Fakultet elektrotehnike i računarstva (Faculty of Electrical Engineering and Computing), Zagreb