Croatian error-annotated corpus of non-professional written language

Štefanec, Vanja; Ljubešić, Nikola; Kuvač Kraljević, Jelena

Bibliographic record number: 825905

Croatian error-annotated corpus of non-professional written language // Proceedings of the Tenth International conference on language resources and evaluation (LREC 2016) / Calzolari, Nicoletta ; Choukri, Khalid ; Declerck, Thierry ; Goggi, Sara ; Grobelnik, Marko ; Maegaard, Bente ; Mariani, Joseph ; Mazo, Hélène ; Moreno, Asunción ; Odijk, Jan ; Piperidis, Stelios (eds.).
Portorož: The European Language Resources Association, 2016. pp. 3220-3226 (poster, international peer review, full paper (in extenso), scientific)

CROSBI ID: 825905

Title
Croatian error-annotated corpus of non-professional written language

Authors
Štefanec, Vanja ; Ljubešić, Nikola ; Kuvač Kraljević, Jelena

Type, subtype and category of work
Papers in conference proceedings, full paper (in extenso), scientific

Source
Proceedings of the Tenth International conference on language resources and evaluation (LREC 2016) / Calzolari, Nicoletta ; Choukri, Khalid ; Declerck, Thierry ; Goggi, Sara ; Grobelnik, Marko ; Maegaard, Bente ; Mariani, Joseph ; Mazo, Hélène ; Moreno, Asunción ; Odijk, Jan ; Piperidis, Stelios - Portorož : The European Language Resources Association, 2016, 3220-3226

ISBN
978-2-9517408-9-1

Conference
Tenth International conference on language resources and evaluation - LREC 2016

Place and date
Portorož, Slovenia, 23-28 May 2016

Type of participation
Poster

Type of review
International peer review

Keywords
error corpus ; language disorders ; Croatian

Abstract
This paper presents the Croatian corpus of non-professional written language. The corpus consists of two subcorpora: a clinical subcorpus of written texts produced by speakers with various types of language disorders, and a healthy-speakers subcorpus. This structure, together with the levels of annotation, opens up several lines of research. The authors present the corpus structure, describe the sampling methodology, explain the levels of annotation, and give some basic statistics. On the basis of the corpus data, existing language technologies for Croatian will be adapted for use in a platform that facilitates text production for speakers with language disorders. To this end, several analyses of the corpus data are presented.

Original language
English

Scientific fields
Pedagogy

RELATED ENTITIES

Institutions:
Edukacijsko-rehabilitacijski fakultet, Zagreb
Filozofski fakultet, Zagreb

Profiles:

Nikola Ljubešić (author)