
Bibliographic record number: 697756

Language Networks


Martinčić-Ipšić, Sanda
Language Networks, 2014. (lecture).


Title
Language Networks

Authors
Martinčić-Ipšić, Sanda

Source
HDJT - NLP Kruzok, FER

Type, subtype
Other types of work, lecture

Year
2014

Keywords
Complex networks; language networks

Abstract
Language can be viewed as a complex network when it is represented as a system of interacting linguistic units. Network analysis provides mechanisms that can reveal new patterns in a complex structure, and can thus be applied to the study of patterns in language structures. This, in turn, may contribute to a better understanding of the organization, structure and evolution of a language. Our research focuses on the word and sub-word co-occurrence networks of Croatian. Initially, we study the structure of Croatian word co-occurrence networks: how the network structure properties change when the co-occurrence window size, the corpus size and the removal of stopwords are systematically varied. At the word co-occurrence level, we compare the properties of linguistic networks for Croatian, English and Italian. We constructed co-occurrence networks from parallel text corpora consisting of translations of five books in the three languages. The network measures across the three languages differ particularly in average path length and average clustering coefficient. For text differentiation, we study linguistic networks built from different text types: literature, blogs and shuffled texts. The linguistic networks are constructed from texts as directed and weighted co-occurrence networks of words. The network structures are compared at the global level in terms of average node degree, average shortest path length, diameter, clustering coefficient, density and number of components. Furthermore, we perform an analysis at the local level by comparing the rank plots of in- and out-degree, in- and out-strength, and in- and out-selectivity. The selectivity-based measure points to differences between the structures of networks from different text types. Below the word level, we constructed syllable networks. The Croatian syllable networks exhibit small-world properties. Additionally, we compared networks built from syllables with networks built from the corresponding words. The results indicate some structural differences in their properties.
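
To illustrate the kind of network the abstract describes, the following is a minimal Python sketch of a directed, weighted word co-occurrence network and a selectivity measure. It is not the author's implementation. The function names, the toy sentence, and the reading of "window size" as the number of following words linked to each word are assumptions made for illustration. Selectivity is taken here as strength divided by degree, a common definition that this record does not itself spell out.

```python
from collections import defaultdict


def build_cooccurrence_network(tokens, window=2):
    """Return {(source, target): weight} for a directed, weighted word network.

    Assumption: an edge u -> v is added whenever v follows u within
    `window` positions; the weight counts how often that happens.
    """
    weights = defaultdict(int)
    for i, source in enumerate(tokens):
        for target in tokens[i + 1 : i + 1 + window]:
            weights[(source, target)] += 1
    return dict(weights)


def out_selectivity(weights):
    """Out-selectivity per node, assumed here to be out-strength / out-degree.

    Nodes without outgoing edges are omitted from the result.
    """
    strength = defaultdict(int)  # sum of outgoing edge weights
    degree = defaultdict(int)    # number of distinct outgoing edges
    for (source, _target), weight in weights.items():
        strength[source] += weight
        degree[source] += 1
    return {node: strength[node] / degree[node] for node in degree}


# Hypothetical toy text; a real study would use a tokenized corpus.
tokens = "the cat saw the cat and the dog".split()
net = build_cooccurrence_network(tokens, window=1)
sel = out_selectivity(net)
```

In this toy example the edge from "the" to "cat" has weight 2, so "the" has out-strength 3 over 2 distinct outgoing edges. Varying `window` changes which word pairs are linked, which is the kind of parameter the abstract reports varying.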

Original language
English

Scientific fields
Computer science, Information and communication sciences



WORK AFFILIATIONS


Institutions
Sveučilište u Rijeci - Odjel za informatiku

Author with registry number:
Sanda Martinčić-Ipšić, (250193)