Overview of bibliographic record number: 356624
Building a Search Engine Model with Morphological Normalization Support
Building a Search Engine Model with Morphological Normalization Support // Proceedings of the ITI 2008 30th Int. Conf. on Information Technology Interfaces / Luzar - Stiffler, Vesna ; Hljuz Dobric, Vesna ; Bekic, Zoran (eds.).
Zagreb: Sveučilišni računski centar Sveučilišta u Zagrebu (Srce), 2008. pp. 619-624 (lecture, international peer review, full paper (in extenso), scientific)
CROSBI ID: 356624
Title
Building a Search Engine Model with Morphological Normalization Support
Authors
Mijić, Jure ; Dalbelo Bašić, Bojana ; Šnajder, Jan
Type, subtype and category of work
Papers in conference proceedings, full paper (in extenso), scientific
Source
Proceedings of the ITI 2008 30th Int. Conf. on Information Technology Interfaces
/ Luzar - Stiffler, Vesna ; Hljuz Dobric, Vesna ; Bekic, Zoran - Zagreb : Sveučilišni računski centar Sveučilišta u Zagrebu (Srce), 2008, 619-624
ISBN
978-953-7138-13-4
Conference
30th Int. Conf. on Information Technology Interfaces
Place and date
Cavtat, Croatia, 23-26 June 2008
Type of participation
Lecture
Type of review
International peer review
Keywords
Information need; Information retrieval; Morphological normalization; Search engine
Abstract
Searching a collection of documents can seem like an easy task, but manipulating textual data can be difficult because the data are mostly unstructured. We undertook the task of building an effective search engine for a collection of Croatian legislative documents. The developed search engine model supports multiple modules for information retrieval. To improve the effectiveness of the retrieval, we used a morphological normalization module that uses an inflectional lexicon automatically acquired from a document corpus. As we do not have a gold standard for our legislative document collection, we evaluated our search engine on three English test collections, explored the effects of stemming, and compared the results to the vector space model.
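The role of morphological normalization described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the lexicon below is a small hand-written table (the paper's lexicon is acquired automatically from a corpus), the sample documents are invented, and the names `LEXICON`, `normalize`, `build_index` and `search` as well as the Boolean AND retrieval are assumptions made only for illustration.

```python
import re
from collections import defaultdict

# Hypothetical toy lexicon: inflected word form -> normalized form.
# A real inflectional lexicon would be far larger and acquired automatically.
LEXICON = {
    "zakona": "zakon",
    "zakonu": "zakon",
    "zakonom": "zakon",
    "zakoni": "zakon",
}


def normalize(token):
    """Return the normalized form if the lexicon lists the token, else the token itself."""
    return LEXICON.get(token, token)


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def build_index(documents):
    """Build an inverted index from normalized terms to sets of document ids."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for token in tokenize(text):
            index[normalize(token)].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every normalized query term (Boolean AND)."""
    terms = [normalize(t) for t in tokenize(query)]
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Invented sample documents (assumption, not from the legislative collection).
DOCS = {
    1: "Izmjene zakona o radu",
    2: "Prema zakonu o porezu",
    3: "Pravilnik o radu",
}
INDEX = build_index(DOCS)
```

Because both documents and queries pass through the same `normalize` step, a query for "zakon" matches documents that contain only the inflected forms "zakona" or "zakonu"; without normalization neither document would be found.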
Original language
English
Scientific fields
Computing
ASSOCIATED WITH THE WORK
Projects:
036-1300646-1986 - Otkrivanje znanja u tekstnim podacima (Knowledge Discovery in Textual Data) (Dalbelo-Bašić, Bojana, MZO) (CroRIS)
Institutions:
Fakultet elektrotehnike i računarstva (Faculty of Electrical Engineering and Computing), Zagreb