
Bibliographic record number: 795386

Learner Corpus of Croatian as a Second and Foreign Language

Mikelić Preradović, Nives; Berać, Monika; Boras, Damir
Learner Corpus of Croatian as a Second and Foreign Language // Multidisciplinary Approaches to Multilingualism / Cergol Kovačević, Kristina ; Udier, Sanda Lucija (eds.).
Frankfurt am Main, Germany: Peter Lang, 2015. pp. 107-126

Type, subtype and category of work
Book chapters, scientific

Multidisciplinary Approaches to Multilingualism

Cergol Kovačević, Kristina ; Udier, Sanda Lucija

Peter Lang

Frankfurt am Main, Germany


Page range
107-126

Keywords
Learner corpus, Croatian as a second and foreign language, Non-native speakers, Interlanguage, Foreign language learning

Years of experience in teaching Croatian as a foreign language have clearly indicated the need for a lexical resource that overcomes the limitations of existing bilingual dictionaries and manuals, enabling optimal presentation of the Croatian language to non-native speakers. The aim of this study is to describe the development methodology of the first learner corpus of Croatian as a second and foreign language, a corpus that will be used for producing new lexical resources aimed at non-native speakers of Croatian. The learner corpus is based on written texts and audio recordings of foreign students attending Croaticum – Center for Croatian as a Second and Foreign Language at the Faculty of Humanities and Social Sciences, University of Zagreb. The learner corpus consists of two sub-corpora: a corpus of written texts systematically described by detailed metadata (gender, age, nationality, mother tongue, bilingual or multilingual competence, language competence of parents, etc.) and a corpus of recordings of read-aloud and spontaneous speech. Our hypothesis is that computer processing of our learner corpus should provide an in-depth analysis of learners' language (a description of their interlanguage and deviations from the standard). Computer processing will result in the extraction of important linguistic patterns, owing to the different levels of corpus annotation and enrichment: morphosyntactic tagging and error tagging. The research will provide insight into the interlanguage of learners and enable language instructors to better understand the real needs of their students. Research on collected authentic natural language data and computer analysis of the corpus of errors will result in the creation of interactive tools and computer applications for individual assistance in improving reading and writing skills in Croatian at the lexical, grammatical and discourse levels.
The developed corpus will ensure representative data for systematic research of Croatian as a foreign language, as well as the development of a consistent methodology and set of tools that will allow comparative analysis of Croatian as a first and as a foreign language. It will also allow the improvement of teaching materials at different levels of Croatian language acquisition (A1 to C1), as well as the development of descriptors for language tests and various innovations in teaching Croatian as a foreign language. Owing to the typological characteristics of Croatian as an inflectional language, the Croatian learner corpus can be used to develop methods and tools for other Slavic languages that do not yet possess their own learner corpora.

Original language

Scientific fields
Information and communication sciences


Filozofski fakultet, Zagreb