Bibliographic record number: 606016
Guessing the Correct Inflectional Paradigm of Unknown Croatian Words
Guessing the Correct Inflectional Paradigm of Unknown Croatian Words // Proceedings of the Eighth Language Technologies Conference / Erjavec, Tomaž ; Žganec Gros, Jerneja (eds.).
Ljubljana, 2012, pp. 185-190 (lecture, international peer review, full paper (in extenso), scientific)
CROSBI ID: 606016
Title
Guessing the Correct Inflectional Paradigm of Unknown Croatian Words
Authors
Šnajder, Jan
Type, subtype, and category of work
Papers in conference proceedings, full paper (in extenso), scientific
Source
Proceedings of the Eighth Language Technologies Conference
/ Erjavec, Tomaž ; Žganec Gros, Jerneja - Ljubljana, 2012, 185-190
Conference
Information Society 2012 - Eighth Language Technologies Conference
Place and date
Ljubljana, Slovenia, 08.10.2012 - 09.10.2012
Type of participation
Lecture
Type of review
International peer review
Keywords
Morphological analysis; machine learning
Abstract
A real-life morphological analyzer must be able to handle out-of-vocabulary words properly. We address the task of guessing the correct inflectional paradigm of unknown Croatian words. We frame this as a supervised machine learning problem: we train a model that decides whether a candidate lemma-paradigm pair is correct, based on a number of string- and corpus-based features. Our aim is to examine the machine learning aspect of the problem: we analyze the features and evaluate the classification accuracy using different feature subsets. We show that a satisfactory level of accuracy (92%) can be achieved with an SVM using a combination of string- and corpus-based features. We discuss a number of possible directions for future research.
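The setup described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the two paradigms, the toy corpus counts, the feature names, and the helper functions `candidates` and `features` are all hypothetical and heavily simplified. The sketch only shows how candidate lemma-paradigm pairs might be generated for an unknown word form and how string- and corpus-based features might be computed for each pair; in the approach the abstract describes, such feature vectors would then be passed to a trained classifier (the abstract reports using an SVM).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Paradigm:
    name: str
    lemma_suffix: str      # ending of the lemma (dictionary form)
    form_suffixes: tuple   # endings of the inflected forms


# Hypothetical toy paradigms, assumed for illustration only.
PARADIGMS = [
    Paradigm("noun-fem-a", "a", ("a", "e", "i", "u", "om", "ama")),
    Paradigm("noun-masc-0", "", ("", "a", "u", "om", "i", "e", "ima")),
]

# Hypothetical corpus frequencies of word forms.
TOY_COUNTS = {"lampa": 5, "lampe": 3, "lampu": 4, "lampom": 1}


def candidates(word):
    """Yield every (lemma, paradigm) pair that could have produced `word`."""
    for paradigm in PARADIGMS:
        for suffix in paradigm.form_suffixes:
            # Require a non-empty stem once the suffix is stripped.
            if word.endswith(suffix) and len(word) > len(suffix):
                stem = word[:len(word) - len(suffix)]
                yield stem + paradigm.lemma_suffix, paradigm


def features(lemma, paradigm, corpus_counts):
    """Compute illustrative string- and corpus-based features for one pair."""
    stem = lemma[:len(lemma) - len(paradigm.lemma_suffix)]
    forms = {stem + suffix for suffix in paradigm.form_suffixes}
    attested = [form for form in forms if corpus_counts.get(form, 0) > 0]
    return {
        "stem_length": len(stem),                           # string-based
        "lemma_ends_in_vowel": lemma[-1] in "aeiou",        # string-based
        "attested_ratio": len(attested) / len(forms),       # corpus-based
        "lemma_attested": corpus_counts.get(lemma, 0) > 0,  # corpus-based
    }


if __name__ == "__main__":
    # Print the feature dictionary for each candidate analysis of "lampe".
    for lemma, paradigm in candidates("lampe"):
        print(lemma, paradigm.name, features(lemma, paradigm, TOY_COUNTS))
```

In this sketch each candidate pair becomes one feature dictionary; a binary classifier trained on labelled correct and incorrect pairs would score them, and the highest-scoring pair would be taken as the guess for the unknown word.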
Original language
English
Scientific fields
Computer science
RELATED RECORDS
Projects:
036-1300646-1986 - Otkrivanje znanja u tekstnim podacima (Knowledge Discovery in Textual Data) (Dalbelo-Bašić, Bojana, MZO) (CroRIS)
Institutions:
Faculty of Electrical Engineering and Computing, Zagreb
Profiles:
Jan Šnajder
(author)