Data source: CROSBI

Sentence Retrieval using Stemming and Lemmatization with Different Length of the Queries (CROSBI ID 281817)

Journal article | original scientific paper | international peer review

Boban, Ivan ; Doko, Alen ; Gotovac, Sven Sentence Retrieval using Stemming and Lemmatization with Different Length of the Queries // Advances in science, technology and engineering systems journal, 5 (2020), 3; 349-354. doi: 10.25046/aj050345

Responsibility data

Boban, Ivan ; Doko, Alen ; Gotovac, Sven

English

Sentence Retrieval using Stemming and Lemmatization with Different Length of the Queries

In this paper we focus on sentence retrieval, which is similar to document retrieval but with a smaller unit of retrieval. Using data pre-processing in document retrieval is generally considered useful. When it comes to sentence retrieval, the situation is not that clear. In this paper we use the TF-ISF (term frequency – inverse sentence frequency) method for sentence retrieval. As pre-processing steps, we use stop word removal and language modeling techniques: stemming and lemmatization. We also experiment with different query lengths. The results show that data pre-processing with stemming and lemmatization is useful with sentence retrieval, as it is with document retrieval. Lemmatization produces better results with longer queries, while stemming shows worse results with longer queries. For the experiment we used data from the Text Retrieval Conference (TREC) novelty tracks.
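
The pipeline named in the abstract (stop word removal, stemming, TF-ISF ranking) can be illustrated with a minimal sketch. This record does not give the exact scoring formula, so the code assumes one commonly cited TF-ISF formulation, R(s|q) = Σ log(tf_q + 1) · log(tf_s + 1) · log((n + 1) / (0.5 + sf)), which may differ from the authors' variant. The names `STOP_WORDS`, `toy_stem`, `preprocess` and `tf_isf_rank` are hypothetical, the stop word list is a tiny illustrative one, and `toy_stem` is a crude stand-in for a real stemmer, not the one used in the paper.

```python
import math
from collections import Counter

# Tiny illustrative stop word list (assumption; not the list used in the paper).
STOP_WORDS = {"a", "an", "the", "of", "in", "on", "is", "are", "and", "to"}


def toy_stem(token):
    """Crude suffix stripping; a stand-in for a real stemmer such as Porter's."""
    for suffix in ("ing", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text, stem=True):
    """Lowercase, strip punctuation, drop stop words, optionally stem."""
    tokens = [t.strip(".,;:!?()\"'").lower() for t in text.split()]
    tokens = [t for t in tokens if t and t not in STOP_WORDS]
    return [toy_stem(t) for t in tokens] if stem else tokens


def tf_isf_rank(query, sentences, stem=True):
    """Return (score, sentence) pairs sorted by descending TF-ISF score."""
    q_tf = Counter(preprocess(query, stem))
    sent_tfs = [Counter(preprocess(s, stem)) for s in sentences]
    n = len(sentences)

    # Sentence frequency: in how many sentences each term occurs.
    sf = Counter()
    for tf in sent_tfs:
        sf.update(tf.keys())

    scored = []
    for sentence, s_tf in zip(sentences, sent_tfs):
        # Assumed formulation; terms absent from the sentence contribute 0.
        score = sum(
            math.log(q_tf[t] + 1)
            * math.log(s_tf[t] + 1)
            * math.log((n + 1) / (0.5 + sf[t]))
            for t in q_tf
        )
        scored.append((score, sentence))
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


sentences = [
    "Stemming reduces words to stems.",
    "The weather is sunny today.",
    "Retrieval systems rank sentences.",
]
for score, sentence in tf_isf_rank("ranking sentences", sentences):
    print(f"{score:.3f}  {sentence}")
```

With stemming enabled, the query term "ranking" and the sentence term "rank" reduce to the same form, so the third sentence matches on both query terms. Calling `tf_isf_rank(..., stem=False)` matches only on "sentences", which gives a lower score for the same sentence.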

Sentence retrieval ; TF-ISF ; Data pre-processing ; Stemming ; Lemmatization

Publication data

5 (3)

2020.

349-354

published

2415-6698

10.25046/aj050345

Work associations

Computing
