Enhanced Thesaurus Terms Extraction for Document Indexing (CROSBI ID 507482)
Conference paper in proceedings | original scientific paper | international peer review
Statement of responsibility
Šarić, Frane ; Šnajder, Jan ; Dalbelo Bašić, Bojana ; Eklić, Hrvoje
English
Enhanced Thesaurus Terms Extraction for Document Indexing
In this paper we present an enhanced method for the thesaurus term extraction regarded as the main support to a semi-automatic indexing system. The enhancement is achieved by neutralising the effect of language morphology applying lemmatisation on both the text and the thesaurus, and by implementing an efficient recursive algorithm for term extraction. Formal definition and statistical evaluation of the experimental results of the proposed method for thesaurus term extraction are given. The need for disambiguation methods and the effect of lemmatisation in the realm of thesaurus term extraction are discussed.
Information retrieval; term extraction; NLP; lemmatisation; Eurovoc
Contribution details
227-232.
2005.
published
Host publication details
Proceedings of the 27th International Conference on Information Technology Interfaces : ITI 2005
Lužar-Stiffler, Vesna ; Hljuz Dobrić, Vesna
Zagreb: Sveučilišni računski centar Sveučilišta u Zagrebu (Srce)
Conference details
International Conference on Information Technology Interfaces (27 ; 2005)
lecture
20.06.2005-23.06.2005
Cavtat, Croatia