Bibliographic record no.: 280673

Croatian Lemmatization Server


Tadić, Marko
Croatian Lemmatization Server // Formal Approaches to South Slavic and Balkan Languages / Vulchanova, Mila Dimitrova ; Koeva, Svetla ; Krapova, Iliyana ; Vulchanov, Valentin (eds.).
Sofia: Bulgarian Academy of Sciences, 2006, pp. 140-146 (lecture, international peer review, full paper (in extenso), scientific)


CROSBI ID: 280673

Title
Croatian Lemmatization Server

Authors
Tadić, Marko

Type, subtype and category of work
Papers in conference proceedings, full paper (in extenso), scientific

Source
Formal Approaches to South Slavic and Balkan Languages / Vulchanova, Mila Dimitrova ; Koeva, Svetla ; Krapova, Iliyana ; Vulchanov, Valentin - Sofia : Bulgarian Academy of Sciences, 2006, 140-146

Conference
Fifth International Conference Formal Approaches to South Slavic and Balkan languages (FASSBL)

Place and date
Sofia, Bulgaria, 18-20 October 2006

Type of participation
Lecture

Type of review
International peer review

Keywords
lemmatization; POS tagging; MSD tagging; Croatian; web-service

Abstract
The need for lemmatization in inflectionally rich languages is indisputable: it applies to a whole range of procedures, from text search up to parsing. Of the two predominant approaches to lemmatization, 1) algorithmic (generally rule-based and realized with FSA) and 2) relational (generally data-driven and realized with databases), this paper opts for the latter. One reason is that formal-grammar approaches to Croatian morphology are rare and limited to just a part of the morphological system. The other is that a generator for Croatian has already been developed (Tadić 1994), as has the Croatian Morphological Lexicon (CML) (Tadić & Fulgosi 2003). The idea was to offer an on-line lemmatization and POS/MSD tagging service using CML v4.5 as the back-end. The Croatian Lemmatization Server (CLS) is available at http://hml.hnk.ffzg.hr and, for now, offers lemmatization and POS/MSD tagging at the unigram level. For each token in the submitted text, the server delivers all possible lemmas of which this token may be a word-form. For homographic tokens, each lemma is accompanied by all possible POS/MSD tags, which comply with the MULTEXT-East v3 specifications for Croatian. The CLS can also be used for generation: when a lemma is entered and marked as such, all its possible word-forms are retrieved and delivered.
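
The relational approach described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy lexicon, the names TOY_LEXICON, build_indexes, lemmatize and generate, and the tag strings (written in the style of MULTEXT-East MSDs but not checked against the specification) do not come from the paper, CML, or the actual server. The sketch only shows the lookup idea: each token is mapped to every lemma it may belong to, each lemma carries all of its candidate tags, and the reverse index returns the word-forms of a lemma.

```python
from collections import defaultdict

# Hypothetical toy lexicon of (word_form, lemma, tag) triples.
# The tags are illustrative placeholders, not verified MSDs.
TOY_LEXICON = [
    ("žena", "žena", "Ncfsn"),
    ("žene", "žena", "Ncfsg"),
    ("žene", "žena", "Ncfpn"),
    ("žene", "žena", "Ncfpa"),
    ("ženi", "žena", "Ncfsd"),
    ("ženi", "žena", "Ncfsl"),
    ("sam", "biti", "Vcip1s"),
    ("sam", "sam", "Afpmsn"),
]


def build_indexes(triples):
    """Build a form -> lemma -> tags index and a lemma -> forms index."""
    by_form = defaultdict(lambda: defaultdict(list))
    by_lemma = defaultdict(set)
    for form, lemma, tag in triples:
        by_form[form][lemma].append(tag)
        by_lemma[lemma].add(form)
    return by_form, by_lemma


BY_FORM, BY_LEMMA = build_indexes(TOY_LEXICON)


def lemmatize(text):
    """Unigram lookup: return (token, {lemma: [tags]}) for each token.

    Tokens are split on whitespace and lowercased; unknown tokens
    get an empty analysis dictionary.
    """
    results = []
    for token in text.lower().split():
        analyses = {
            lemma: list(tags)
            for lemma, tags in BY_FORM.get(token, {}).items()
        }
        results.append((token, analyses))
    return results


def generate(lemma):
    """Reverse lookup: return the sorted word-forms listed for a lemma."""
    return sorted(BY_LEMMA.get(lemma, ()))
```

In this sketch, lemmatize("sam") returns two candidate lemmas for the single token because no context is consulted, which mirrors the unigram-level behaviour the abstract describes; disambiguation would need information beyond the lookup table.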

Original language
English

Scientific fields
Philology



RELATED TO THIS PAPER


Projects:
0130418

Institutions:
Faculty of Humanities and Social Sciences, Zagreb

Profiles:

Marko Tadić (author)

Links to the full text of the paper:

Full text of the paper: hnk.ffzg.hr

Cite this publication:

Tadić, M. (2006) Croatian Lemmatization Server. In: Vulchanova, M., Koeva, S., Krapova, I. & Vulchanov, V. (eds.) Formal Approaches to South Slavic and Balkan Languages.
@article{article, author = {Tadi\'{c}, Marko}, year = {2006}, pages = {140-146}, keywords = {lemmatization, POS tagging, MSD tagging, Croatian, web-service}, title = {Croatian Lemmatization Server}, keyword = {lemmatization, POS tagging, MSD tagging, Croatian, web-service}, publisher = {Bugarska akademija znanosti}, publisherplace = {Sofija, Bugarska} }