Bibliographic record number: 695322

Croatian language networks


Martinčić-Ipšić, Sanda
Croatian language networks // 2014 Adriatic Conference on Graph Theory and Complexity / Vukičević, Damir (ed.).
Split: PMF Split, 2014, pp. 8-9 (lecture, not peer-reviewed, abstract, other)


Title
Croatian language networks

Authors
Martinčić-Ipšić, Sanda

Type, subtype and category of work
Conference abstracts, abstract, other

Source
2014 Adriatic Conference on Graph Theory and Complexity / Vukičević, Damir - Split : PMF Split, 2014, 8-9

Conference
2014 Adriatic Conference on Graph Theory and Complexity

Place and date
MedILS Split, Croatia, 25.4-27.4

Type of participation
Lecture

Type of review
Not peer-reviewed

Keywords
Complex networks; language networks; natural language processing

Abstract
Written as well as spoken language can be modeled via complex networks in which the linguistic units (words) are represented by nodes and their linguistic interactions by links. Such representations enable the analysis of language through varying linguistic units, the examination of language evolution, the modeling of language acquisition, and the assessment of text quality. Language networks can be constructed at the word level and at the subword level. The study of network interactions across language levels can reveal presently unavailable structural properties of the Croatian language at the phonological, syllabic, morphological, co-occurrence and syntactic levels. Our research focuses on the word and subword co-occurrence networks of Croatian. Initially, we study the structure of Croatian word co-occurrence networks and how the structural properties of the network change when the co-occurrence window size, the corpus size and the removal of stopwords are systematically varied. Below the word level, we constructed syllable networks. The results indicate that Croatian syllable networks exhibit certain properties of small-world networks. Furthermore, we compared Croatian syllable networks with Portuguese and Chinese syllable networks and showed that they have similar properties. The applied goal of this study is to derive, from complex network parameters, an assessment model for evaluating the quality of Croatian texts. Such a model could be used to develop software that consistently carries out a desired analysis of a given text, such as assessing the quality of a summary or estimating the quality of a machine translation.
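
To make the construction described in the abstract concrete, the following is a minimal sketch of a word co-occurrence network with a configurable window size and optional stopword removal, plus an average clustering coefficient (one of the quantities commonly inspected when checking for small-world structure). It is not the author's implementation: the function names, the toy English sentence, the stopword set, and the choice of an undirected weighted network are all assumptions made for illustration.

```python
from collections import defaultdict


def build_cooccurrence_network(tokens, window=1, stopwords=frozenset()):
    """Return {(u, v): weight} for an undirected co-occurrence network.

    Two words are linked when they appear at most `window` positions
    apart, counted after stopwords have been removed (an assumption;
    other orderings of these two steps are possible).
    """
    words = [t.lower() for t in tokens if t.lower() not in stopwords]
    weights = defaultdict(int)
    for i, u in enumerate(words):
        for v in words[i + 1 : i + 1 + window]:
            if u != v:  # skip self-links
                weights[tuple(sorted((u, v)))] += 1
    return dict(weights)


def adjacency(weights):
    """Convert the edge-weight mapping into {node: set of neighbours}."""
    adj = defaultdict(set)
    for u, v in weights:
        adj[u].add(v)
        adj[v].add(u)
    return dict(adj)


def average_clustering(adj):
    """Mean local clustering coefficient; nodes of degree < 2 count as 0."""
    if not adj:
        return 0.0
    total = 0.0
    for neighbours in adj.values():
        k = len(neighbours)
        if k < 2:
            continue
        links = sum(
            1 for a in neighbours for b in neighbours if a < b and b in adj[a]
        )
        total += 2 * links / (k * (k - 1))
    return total / len(adj)


# Hypothetical toy input; a real study would use a Croatian corpus.
tokens = "the cat sat on the mat".split()
stop = frozenset({"the", "on"})
for w in (1, 2):
    net = build_cooccurrence_network(tokens, window=w, stopwords=stop)
    print(w, sorted(net), average_clustering(adjacency(net)))
```

Varying `window` and toggling `stopwords` on the same token list mirrors, on a toy scale, the kind of systematic variation the abstract describes; a syllable network could be sketched the same way by passing syllables instead of words as `tokens`.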

Original language
English

Scientific fields
Computing, Information and communication sciences



WORK AFFILIATIONS


Project / topic
UniRi - 13.13.2.2.07

Institutions
Sveučilište u Rijeci - Odjel za informatiku

Author with registry number:
Sanda Martinčić-Ipšić, (250193)