Bibliographic record number: 689514

Frequency in Croatian CDI and Croatian Child Language Frequency Dictionary

Hržica, Gordana; Kovačević, Melita
Frequency in Croatian CDI and Croatian Child Language Frequency Dictionary // Early language acquisition
Lyon, France, 2012. (poster, international peer review, abstract, other)

Type, subtype and category of work
Conference abstracts, abstract, other

Early language acquisition

Place and date
Lyon, France, 05-07.2012

Type of participation

Type of peer review
International peer review

Keywords
Frequency; language acquisition; language corpora; CDI

The role of frequency in language acquisition has become increasingly evident over the last decades, especially with the growing number of studies based on spoken language corpora (Demuth 2007). Frequency is known to affect comprehension, production, and the emergence of linguistic categories and rules (see Diessel 2007). The effect is particularly important for the lexicon: word frequency is one of the two main predictors of spoken word processing, the other being neighbourhood density (Storkel and Morrisette 2002). Data about word frequency in a language are usually obtained from corpus analyses, whether of written or spoken language. However, there are alternatives. In this study we compared two possible sources of information about word frequencies in one language. Both target the production of spontaneous spoken language: one is a longitudinal corpus, and the other is a parent report instrument. The Croatian corpus of child language (Kovačević 2002) consists of recordings of the spontaneous speech of three monolingual children, taken from age 1;5 to 2;8, approximately twice a month (available at CHILDES). The complete corpus was morphologically tagged. The tagged corpus was used as the basis for compiling the first Croatian Child Language Frequency Dictionary (CCFD; Hržica et al., in progress). The corpus data were lemmatized before the frequency counts were made. Frequencies were counted separately for each time point within each child's recording period, which preserves the developmental time course. CCFD allows analysis of the most frequent lemmas in all three sub-corpora by frequency, alphabetical order, time of appearance, and part of speech. It also preserves the morphological encoding of types and the numbers of types and tokens. Koralje (Komunikacijske razvojne ljestvice; Kovačević et al. 2007) is the Croatian version of the MacArthur-Bates Communicative Development Inventories (CDI; Fenson et al. 1993), a parent report instrument adapted for a number of languages (Dale and Penfold 2011). The Croatian version was standardised on a representative sample of 627 Croatian children (infants: 250, toddlers: 377). CDI and CCFD are two different, separately developed methods for assessing child language. The goal of this study was to test the choice of words in the second vocabulary subscale of the Croatian CDI (which measures production only) by comparing it with a complementary method of data collection (spontaneous speech samples, as presented in CCFD). More specifically, the goal was to examine the presence of CDI lemmas in CCFD and the relation between the number of markings (in CDI) and relative frequency (in CCFD). The results show that a high percentage of the CDI lemmas are present in CCFD. The overlap is highest for adverbs, prepositions, connectors and particles (100%) and lowest for nouns. As for the frequency counts, the number of markings (in CDI) is consistent with relative frequency (in CCFD).
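
The comparison described in the abstract (which CDI lemmas occur in CCFD, by part of speech, and how CDI markings relate to CCFD relative frequency) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the abstract does not state which statistic was used, so Spearman rank correlation is only one plausible choice, and the lemmas, marking counts and frequencies are invented toy values, not data from the study. The names `overlap_by_pos`, `spearman`, `cdi_items`, `cdi_markings` and `ccfd_freq` are hypothetical.

```python
from collections import defaultdict


def overlap_by_pos(cdi_items, ccfd_freq):
    """Share of CDI lemmas found in the frequency dictionary, per part of speech.

    cdi_items: list of (lemma, part_of_speech) pairs from the checklist.
    ccfd_freq: dict mapping lemma -> relative frequency in the corpus.
    """
    totals = defaultdict(int)
    found = defaultdict(int)
    for lemma, pos in cdi_items:
        totals[pos] += 1
        if lemma in ccfd_freq:
            found[pos] += 1
    return {pos: found[pos] / totals[pos] for pos in totals}


def _average_ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation, computed as Pearson correlation of ranks."""
    rx, ry = _average_ranks(x), _average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Invented toy data (NOT from the study).
cdi_items = [("mama", "noun"), ("zebra", "noun"), ("i", "connector"), ("tu", "adverb")]
cdi_markings = {"mama": 370, "zebra": 40, "i": 300, "tu": 250}  # parents marking the word
ccfd_freq = {"mama": 0.012, "i": 0.030, "tu": 0.008}            # relative corpus frequency

overlap = overlap_by_pos(cdi_items, ccfd_freq)

# Correlate only lemmas that appear in both sources.
shared = [lemma for lemma, _ in cdi_items if lemma in ccfd_freq]
rho = spearman([cdi_markings[l] for l in shared], [ccfd_freq[l] for l in shared])
```

A rank-based measure is used here because marking counts and relative frequencies are on different scales; any other association measure could be substituted.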

Original language


Edukacijsko-rehabilitacijski fakultet, Zagreb