Vocabulary size prediction of Croatian texts (CROSBI ID 493518)
Conference paper in proceedings | original scientific paper | international peer review
Responsibility data
Tuđman, Miroslav ; Mikelić, Nives ; Boras, Damir
English
Vocabulary size prediction of Croatian texts
Preliminary research on the vocabulary size of Croatian lexical corpora shows that the distribution of types is regular and that the deviations of the calculated values are within theoretically acceptable limits. The research also led us to the conclusion that Zipf's law is not applicable to the Croatian language because the lexical density is different: the proportion of types to tokens differs between languages, and the parameters of that proportion need to be calculated for each language separately.
Lexical items; vocabulary size; Zipf law; lexical density; token; type; Croatian text corpus
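The type/token proportion that the abstract refers to can be illustrated with a minimal Python sketch. This is not the authors' method: the function name `type_token_stats`, the regex-based tokenization, the lowercasing step, and the sample sentence are all assumptions made here for illustration only.

```python
import re
from collections import Counter

def type_token_stats(text):
    """Return (token count, type count, type/token ratio) for a text.

    Assumption: a token is any lowercase run of word characters; real
    corpus work would need proper tokenization and possibly lemmatization.
    """
    tokens = re.findall(r"\w+", text.lower())
    counts = Counter(tokens)        # one entry per distinct word form (type)
    n_tokens = len(tokens)
    n_types = len(counts)
    ratio = n_types / n_tokens if n_tokens else 0.0
    return n_tokens, n_types, ratio

# Hypothetical sample sentence, not taken from the studied corpora.
sample = "more je plavo i nebo je plavo"
n_tokens, n_types, ratio = type_token_stats(sample)
print(n_tokens, n_types, round(ratio, 3))
```

In this sample there are 7 tokens and 5 types. Because the ratio depends on text length and on the morphology of the language, a sketch like this only shows what is being counted, not how the paper estimates vocabulary size.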
Contribution data
223-228.
2003.
published
Parent publication data
Proceedings of the 25th International Conference on Information Technology Interfaces
Budin, Leo ; Lužar-Stiffler, Vesna ; Bekić, Zoran ; Hljuz Dobrić, Vesna
Zagreb: Sveučilišni računski centar Sveučilišta u Zagrebu (Srce)
Conference data
International Conference on Information Technology Interfaces (25 ; 2003)
lecture
18.06.2003-19.06.2003
Cavtat, Croatia