Data source: CROSBI

Inductive Morphosyntactic Tagsets (CROSBI ID 511181)

Unpublished conference participation | unpublished conference contribution

Stojanov, Tomislav ; Vučković, Kristina ; Dovedan, Zdravko Inductive Morphosyntactic Tagsets // Computational Modeling of Lexical Acquisition Split, Croatia, 25.08.2005-28.08.2005

Statement of responsibility

Stojanov, Tomislav ; Vučković, Kristina ; Dovedan, Zdravko

English

Inductive Morphosyntactic Tagsets

There are a number of morphological generators for the Croatian language, such as Kržak (1988), Silić (1996), Tadić (1994, 2003), and others that are parts of Korektor©, Hrvatska Riječ©, Lapis© and other applications, all of which, except Tadić's, are used as spelling checkers. Tadić's GenOblik was developed for the needs of a corpus linguistics project and is annotated according to the Multext-East specification (Erjavec 2001), which Przepiórkowski & Woliński (2003a, b) critically evaluated before adopting their own tagset, closer to the grammatical system of Polish. This paper also starts from a criticism of that specification, but on different grounds. Two points are emphasized: (i) insufficient differentiation between inherent and relationally motivated morphosyntactic features: verbal relational categories such as modality, conditionality and compound tense cannot be annotated by a tag attached to an individual lexical unit, because these features (in Croatian as well as in other languages) do not derive from the form as such but are relationally conditioned; (ii) lack of adherence to morphosyntactic criteria in establishing formal criteria: semantic features, such as the distinction between common and proper nouns, are introduced, whereas other semantic categories, such as countability, collectiveness, transitivity, and optativity, are not included. Most of the critique of the Multext-East specification is directed at the so-called deductive approach to tagset design. A tagging system that relies on a more pronounced qualitative approach is discussed in the second part of the paper. It explains the so-called inductive approach to creating a system of tags, in which the tags are derived from the morphological generator itself; this avoids the disadvantages of a deductive system of tags and gains greater grammatical reliability. This could contribute to greater accuracy in resolving homographic forms in the parser's algorithm.
Six arguments are made in favor of the inductive approach.
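
The idea of deriving tags from the generator itself, rather than from a predefined specification, can be illustrated with a minimal sketch. It is not the authors' implementation: the toy generator, the tag format, and the names `A_DECLENSION`, `generate` and `induce_tagset` are all assumptions made for illustration. The generator covers only the a-declension of Croatian nouns (for example "žena"), and the tagset is simply the set of feature bundles the generator actually emits.

```python
from collections import defaultdict

# Hypothetical toy paradigm: endings of the Croatian a-declension (e.g. "žena").
# Keys are (number, case) feature bundles; values are the endings.
A_DECLENSION = {
    ("sg", "nom"): "a", ("sg", "gen"): "e", ("sg", "dat"): "i",
    ("sg", "acc"): "u", ("sg", "voc"): "o", ("sg", "loc"): "i",
    ("sg", "ins"): "om",
    ("pl", "nom"): "e", ("pl", "gen"): "a", ("pl", "dat"): "ama",
    ("pl", "acc"): "e", ("pl", "voc"): "e", ("pl", "loc"): "ama",
    ("pl", "ins"): "ama",
}


def generate(stem):
    """Yield (form, tag) pairs; the tag is the feature bundle the generator used."""
    for (number, case), ending in A_DECLENSION.items():
        yield stem + ending, ("noun", number, case)


def induce_tagset(stems):
    """Collect the tagset and the form-to-tags map from generator output alone."""
    tagset = set()
    analyses = defaultdict(set)
    for stem in stems:
        for form, tag in generate(stem):
            tagset.add(tag)
            analyses[form].add(tag)
    return tagset, analyses


tagset, analyses = induce_tagset(["žen"])

# Forms that received more than one tag are the homographs a parser must resolve.
homographs = {form: tags for form, tags in analyses.items() if len(tags) > 1}
```

In this sketch every tag is guaranteed to correspond to a form the generator can produce, and the `homographs` map lists exactly the forms whose analysis is ambiguous, which is the information a disambiguating parser would need.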

tagging; tagset; Croatian language; morphological generator; MULTEXT-East; morphosyntax; morphosyntactic category; morphosyntactic feature; adjective aspect; adjective indefiniteness; deductive method; inductive method; corpus linguistics; machine translat

The second review for publication in the print edition is still in progress.

not recorded

not recorded

not recorded

not recorded

not recorded

Contribution data

not recorded

not recorded

Conference data

Computational Modeling of Lexical Acquisition

lecture

25.08.2005-28.08.2005

Split, Croatia

Related fields

Computing, Information and communication sciences, Philology