Bibliographic record number: 392376
Improved Methods of Word Acquisition in developing Hascheck Spell Checker Web Service System
Improved Methods of Word Acquisition in developing Hascheck Spell Checker Web Service System // X International PhD Workshop OWD 2008 / Grzegorz Kłapyta (ed.).
Gliwice: PTETiS, 2008, pp. 029-034 (lecture, international peer review, full paper (in extenso), scientific)
CROSBI ID: 392376
Title
Improved Methods of Word Acquisition in developing Hascheck Spell Checker Web Service System
Authors
Pavlek, Jakov ; Dembitz, Šandor ; Matasić, Marko
Type, subtype and category of work
Papers in conference proceedings, full paper (in extenso), scientific
Source
X International PhD Workshop OWD 2008
/ Grzegorz Kłapyta - Gliwice : PTETiS, 2008, 029-034
Conference
X International PhD Workshop OWD 2008
Place and date
Wisła, Poland, 18.10.2008. - 21.10.2008
Type of participation
Lecture
Type of review
International peer review
Keywords
spell checker; word acquisition; web service; Google Search; Wikipedia.
Abstract
Hascheck (Croatian Academic Spell CHECKer) is a free public Web service available globally, with a continually growing user base and rapidly increasing service volume. In this paper we discuss the methods used for processing new, previously unknown words and teaching them to the Hascheck system. As part of improving the Hascheck service, an interface for manual word acquisition has been developed that uses the Google Web Search engine restricted to appropriate given domains. Existing systematized knowledge resources, specifically Wikipedia and the Croatian Spell Checker for MS Word, have been used intensively for this purpose. Program modules have been developed for automatic retrieval and classification of word types based on information about domain, language, and spelling. As a result, some 135,000 new word types have been processed and sorted into appropriate classes using the developed software. We also evaluate the earlier methods used in the same process and compare them with the new ones in terms of accuracy, efficiency, and processing time. Combining the new methods has accelerated the processing of word types, that is, supervised learning in the Hascheck system, and has significantly reduced the time needed for decision-making.
Original language
English
Scientific fields
Electrical engineering
WORK AFFILIATIONS
Projects:
036-0362027-1638 - Umrežena ekonomija (Skočir, Zoran, MZO)
Institutions:
Fakultet elektrotehnike i računarstva, Zagreb