Bibliographic record no. 427399

String Distance-Based Stemming of the Highly Inflected Croatian Language


Šnajder, Jan; Dalbelo Bašić, Bojana
String Distance-Based Stemming of the Highly Inflected Croatian Language // Proceedings of Recent Advances in Natural Language Processing (RANLP-2009) / Angelova, Galia ; Bontcheva, Kalina ; Mitkov, Ruslan ; Nicolov, Nicolas ; Nikolov, Nikolai (eds.).
Shumen: Incoma, 2009. pp. 411-415 (poster, international peer review, full paper (in extenso), scientific)


CROSBI ID: 427399

Title
String Distance-Based Stemming of the Highly Inflected Croatian Language

Authors
Šnajder, Jan ; Dalbelo Bašić, Bojana

Type, subtype and category of work
Conference proceedings papers, full paper (in extenso), scientific

Source
Proceedings of Recent Advances in Natural Language Processing (RANLP-2009) / Angelova, Galia ; Bontcheva, Kalina ; Mitkov, Ruslan ; Nicolov, Nicolas ; Nikolov, Nikolai - Shumen : Incoma, 2009, 411-415

Conference
International Conference Recent Advances in Natural Language Processing'2009 (RANLP-2009)

Place and date
Borovets, Bulgaria, 14-16 September 2009

Type of participation
Poster

Type of review
International peer review

Keywords
Stemming; morphology; string distance; Croatian language

Abstract
Stemming refers to the grouping of morphologically related words into so-called stem classes for the purpose of improving information retrieval performance. Traditional approaches to stemming are language-specific and require a substantial amount of linguistic knowledge. A viable alternative is string distance-based stemming, in which stem classes are obtained by clustering word-forms from a corpus. In this paper, we apply string distance-based stemming to the highly inflected Croatian language using a number of string distance measures proposed in the literature. We focus on evaluating the stemming performance at both the inflectional and the derivational level, and investigate how this performance relates to the choice of the distance threshold value. Although our focus is on the Croatian language, we believe our results transfer well to languages of similar morphological complexity.
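
To make the idea in the abstract concrete, the following is a minimal sketch of string distance-based stemming. It is not the authors' method: the record does not say which distance measures or clustering algorithm the paper uses, so this sketch assumes a length-normalized Levenshtein distance and single-link clustering cut at a threshold. The function names (`edit_distance`, `normalized_distance`, `stem_classes`) and the sample word-forms are hypothetical, chosen only for illustration.

```python
from itertools import combinations


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the standard two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def normalized_distance(a: str, b: str) -> float:
    """Edit distance divided by the longer length (an assumed measure)."""
    if not a and not b:
        return 0.0
    return edit_distance(a, b) / max(len(a), len(b))


def stem_classes(words, threshold):
    """Single-link clustering: two word-forms end up in the same class if
    they are connected by a chain of pairs whose distance <= threshold."""
    words = sorted(set(words))
    parent = {w: w for w in words}

    def find(w):
        # Union-find lookup with path halving.
        while parent[w] != w:
            parent[w] = parent[parent[w]]
            w = parent[w]
        return w

    for a, b in combinations(words, 2):
        if normalized_distance(a, b) <= threshold:
            parent[find(a)] = find(b)

    classes = {}
    for w in words:
        classes.setdefault(find(w), []).append(w)
    return sorted(classes.values())


# Hypothetical sample: inflected forms of "knjiga" (book) and "stol" (table).
forms = ["knjiga", "knjige", "knjizi", "knjigom", "stol", "stola", "stolu"]
print(stem_classes(forms, threshold=0.35))
```

With the threshold set to 0.35, the seven sample forms fall into two classes, one per lemma. With a much stricter threshold such as 0.1, every form stays in its own class. This shows in miniature why the choice of threshold value matters for stemming quality.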

Original language
English

Scientific fields
Computing



RELATED INFORMATION


Projects:
036-1300646-1986 - Otkrivanje znanja u tekstnim podacima (Knowledge Discovery in Textual Data) (Dalbelo-Bašić, Bojana, MZO) (CroRIS)

Institutions:
Faculty of Electrical Engineering and Computing, Zagreb

Profiles:

Jan Šnajder (author)

Bojana Dalbelo Bašić (author)

Cite this publication:

Šnajder, J. & Dalbelo Bašić, B. (2009) String Distance-Based Stemming of the Highly Inflected Croatian Language. In: Angelova, G., Bontcheva, K., Mitkov, R., Nicolov, N. & Nikolov, N. (eds.) Proceedings of Recent Advances in Natural Language Processing (RANLP-2009).
@article{article, author = {\v{S}najder, Jan and Dalbelo Ba\v{s}i\'{c}, Bojana}, year = {2009}, pages = {411-415}, keywords = {Stemming, morphology, string distance, Croatian language}, title = {String Distance-Based Stemming of the Highly Inflected Croatian Language}, keyword = {Stemming, morphology, string distance, Croatian language}, publisher = {Incoma}, publisherplace = {Borovec, Bugarska} }



