
Bibliographic record no. 606016

Guessing the Correct Inflectional Paradigm of Unknown Croatian Words


Šnajder, Jan
Guessing the Correct Inflectional Paradigm of Unknown Croatian Words // Proceedings of the Eighth Language Technologies Conference / Erjavec, Tomaž ; Žganec Gros, Jerneja (eds.).
Ljubljana, 2012. pp. 185-190 (lecture, international peer review, full paper (in extenso), scientific)


CROSBI ID: 606016

Title
Guessing the Correct Inflectional Paradigm of Unknown Croatian Words

Authors
Šnajder, Jan

Type, subtype and category of work
Conference proceedings papers, full paper (in extenso), scientific

Source
Proceedings of the Eighth Language Technologies Conference / Erjavec, Tomaž ; Žganec Gros, Jerneja - Ljubljana, 2012, 185-190

Conference
Information Society 2012 - Eighth Language Technologies Conference

Place and date
Ljubljana, Slovenia, 8-9 October 2012

Type of participation
Lecture

Type of peer review
International peer review

Keywords
Morphological analysis; machine learning

Abstract
A real-life morphological analyzer must be able to handle out-of-vocabulary words properly. We address the task of guessing the correct inflectional paradigm of unknown Croatian words. We frame this as a supervised machine learning problem: we train a model that decides whether a candidate lemma-paradigm pair is correct, based on a number of string- and corpus-based features. Our aim is to examine the machine learning aspect of the problem: we analyze the features and evaluate the classification accuracy using different feature subsets. We show that a satisfactory level of accuracy (92%) can be achieved with an SVM using a combination of string- and corpus-based features. We discuss a number of possible directions for future research.
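
The setup described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the two toy paradigms (`N-fem-a`, `N-masc`) are heavily simplified, the corpus counts are invented, and the three features are only examples of what a string-based or corpus-based feature might look like. The sketch stops at feature extraction; in the approach the abstract describes, such feature vectors would then be passed to a trained classifier such as an SVM.

```python
# Hypothetical, heavily simplified paradigms: the lemma ending, plus the
# endings that replace it in inflected forms. Not the paper's paradigm set.
PARADIGMS = {
    "N-fem-a": ("a", ("a", "e", "i", "u", "o", "om", "ama")),
    "N-masc": ("", ("", "a", "u", "e", "om", "i", "ima")),
}

# Invented toy corpus frequencies, for illustration only.
CORPUS = {"riba": 12, "ribe": 9, "ribu": 5, "ribom": 3, "ribama": 2}


def candidates(word_form):
    """Return every (lemma, paradigm) pair that could yield word_form."""
    found = set()
    for name, (lemma_ending, endings) in PARADIGMS.items():
        for ending in endings:
            # Require a non-empty stem once the ending is stripped.
            if word_form.endswith(ending) and len(word_form) > len(ending):
                stem = word_form[:len(word_form) - len(ending)]
                found.add((stem + lemma_ending, name))
    return sorted(found)


def generate_forms(lemma, paradigm):
    """Inflect a lemma by swapping its ending for each paradigm ending."""
    lemma_ending, endings = PARADIGMS[paradigm]
    stem = lemma[:len(lemma) - len(lemma_ending)]
    return {stem + ending for ending in endings}


def features(lemma, paradigm, corpus_counts):
    """Compute example features for one candidate lemma-paradigm pair."""
    forms = generate_forms(lemma, paradigm)
    attested = [f for f in forms if corpus_counts.get(f, 0) > 0]
    return {
        # String-based: the last two characters of the candidate lemma.
        "lemma_suffix": lemma[-2:],
        # Corpus-based: share of generated forms that occur in the corpus.
        "attested_ratio": len(attested) / len(forms),
        # Corpus-based: whether the candidate lemma itself occurs.
        "lemma_attested": corpus_counts.get(lemma, 0) > 0,
    }


if __name__ == "__main__":
    for lemma, paradigm in candidates("ribu"):
        print(lemma, paradigm, features(lemma, paradigm, CORPUS))
```

For the form "ribu", the toy paradigms yield three candidate pairs, and the corpus-based features favor the pair whose generated forms are best attested. A single feature rarely separates all candidates, which is one reason for combining string- and corpus-based features.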

Original language
English

Scientific fields
Computing



RELATED RECORDS


Projects:
036-1300646-1986 - Otkrivanje znanja u tekstnim podacima (Knowledge Discovery in Textual Data) (Dalbelo-Bašić, Bojana, MZO) (CroRIS)

Institutions:
Faculty of Electrical Engineering and Computing, Zagreb

Profiles:

Jan Šnajder (author)

Cite this publication:

Šnajder, J. (2012) Guessing the Correct Inflectional Paradigm of Unknown Croatian Words. In: Erjavec, T. & Žganec Gros, J. (eds.) Proceedings of the Eighth Language Technologies Conference.
@inproceedings{article,
  author = {\v{S}najder, Jan},
  editor = {Erjavec, Toma\v{z} and \v{Z}ganec Gros, Jerneja},
  title = {Guessing the Correct Inflectional Paradigm of Unknown {Croatian} Words},
  booktitle = {Proceedings of the Eighth Language Technologies Conference},
  year = {2012},
  pages = {185--190},
  address = {Ljubljana, Slovenia},
  keywords = {Morphological analysis, machine learning}
}



