
Bibliographic record number: 181464

Building the tagset for Croatian


Stojanov, Tomislav; Dovedan, Zdravko; Seljan, Sanja; Vučković, Kristina
Building the tagset for Croatian // 5th European Conference on Formal Description of Slavic Languages : Conference proceedings / Zybatow, Gerhild ; Szucsich, Luka ; Junghanns, Uwe ; Meyer, Roland (eds.).
Frankfurt : New York (NY): Peter Lang, 2008. (demonstration, international peer review, unpublished paper, scientific)


CROSBI ID: 181464

Title
Building the tagset for Croatian

Authors
Stojanov, Tomislav ; Dovedan, Zdravko ; Seljan, Sanja ; Vučković, Kristina

Type, subtype, and category of work
Conference abstracts, unpublished paper, scientific

Source
5th European Conference on Formal Description of Slavic Languages : Conference proceedings / Zybatow, Gerhild ; Szucsich, Luka ; Junghanns, Uwe ; Meyer, Roland - Frankfurt : New York (NY) : Peter Lang, 2008

ISBN
9783631551608

Conference
European Conference on Formal Description of Slavic Languages (5 ; 2003)

Place and date
Leipzig, Germany, 28 November 2003

Type of participation
Demonstration

Type of review
International peer review

Keywords
tagging; tagset; Croatian language; morphologic generator; Multext-East specifications; morphosyntax; morphosyntactic category; morphosyntactic feature; parts-of-speech; adjective aspect; definite adjectives; indefinite adjectives; grammar checker

Abstract
There are a number of morphological generators for the Croatian language, such as Kržak (1988), Silić (1996), Tadić (1994, 2003), and others that are part of Korektor©, Hrvatska Riječ©, Lapis© and other applications, all of which, except Tadić's, are used as spelling checkers. Tadić's GenOblik was developed for the needs of a corpus linguistics project and is annotated according to the Multext-East specification (Erjavec 2001), which Przepiórkowski & Woliński (2003a, b) have critically evaluated, adopting their own tagset that is closer to the grammatical system of Polish. This paper also starts from a criticism of that specification, but on different grounds. The following points are emphasized: (i) insufficient differentiation between inherent and relationally motivated morphosyntactic features: verbal relational categories such as modality, conditionality and compound tense cannot be annotated by a tag attached to an individual lexical unit, because these features (in Croatian as well as in other languages) do not derive from the form as such but are relationally conditioned; (ii) lack of adherence to morphosyntactic criteria in establishing formal criteria: semantic features, such as the distinction between common and proper nouns, are introduced, whereas other semantic categories, such as countability and collectiveness, are not included. A tagging system that relies on a more pronounced qualitative approach is the subject of the second part of the work. The authors' own tagging system rests on the assumption that greater grammatical adequacy could contribute to greater accuracy in resolving homography in the parsing algorithm that is to follow. The tag system designed by Stojanov (2003) annotates the units from Silić's Grammar Thesaurus©, licensed by Microsoft and built into its Office application as a spelling checker for Croatian. Arguments for selecting it, rather than some other morphological generator, are also given.
Apart from being complete and fully functional, it is structured according to grammatical criteria. As a result, the 'overgenerating' effect is minimized while a high degree of morphological well-formedness is maintained.

Original language
English

Scientific fields
Information and communication sciences, Philology

Note
Linguistik international ; Bd. 20



RELATED RECORDS


Projects:
0130440

Institutions:
Filozofski fakultet, Zagreb

Profiles:

Zdravko Dovedan Han (author)

Sanja Seljan (author)

Kristina Kocijan (author)


Cite this publication:

Stojanov, T., Dovedan, Z., Seljan, S. & Vučković, K. (2008) Building the tagset for Croatian. In: Zybatow, G., Szucsich, L., Junghanns, U. & Meyer, R. (eds.) 5th European Conference on Formal Description of Slavic Languages : Conference proceedings.
@article{article,
  author = {Stojanov, Tomislav and Dovedan, Zdravko and Seljan, Sanja and Vu\v{c}kovi\'{c}, Kristina},
  year = {2008},
  keywords = {tagging, tagset, Croatian language, morphologic generator, Multext-East specifications, morphosyntax, morphosyntactic category, morphosyntactic feature, parts-of-speech, adjective aspect, definite adjectives, indefinite adjectives, grammar checker},
  isbn = {9783631551608},
  title = {Building the tagset for Croatian},
  publisher = {Peter Lang},
  publisherplace = {Leipzig, Njema\v{c}ka}
}



