Bibliographic record no.: 174994

Making Monolingual Corpora Comparable: a Case Study of Bulgarian and Croatian


Bekavac, Božo; Osenova, Petya; Simov, Kiril; Tadić, Marko
Making Monolingual Corpora Comparable: a Case Study of Bulgarian and Croatian // Fourth International Conference on Language Resources and Evaluation LREC2004 / Lino, Maria Teresa ; Xavier, Maria Francesca ; Ferreira, Fátima ; Costa, Rute ; Silva, Raquel (eds.).
Paris-Lisbon: ELRA, 2004. pp. 1187-1190


CROSBI ID: 174994

Title
Making Monolingual Corpora Comparable: a Case Study of Bulgarian and Croatian

Authors
Bekavac, Božo ; Osenova, Petya ; Simov, Kiril ; Tadić, Marko

Type, subtype and category of work
Book chapters, scientific

Book
Fourth International Conference on Language Resources and Evaluation LREC2004

Editor(s)
Lino, Maria Teresa ; Xavier, Maria Francesca ; Ferreira, Fátima ; Costa, Rute ; Silva, Raquel

Publisher
ELRA

City
Paris-Lisbon

Year
2004

Page range
1187-1190

ISBN
2-9517408-1-6

Keywords
corpus linguistics, comparable corpora, Croatian, Bulgarian

Abstract
This paper describes the first steps towards the creation of a Bulgarian-Croatian comparable corpus. It is based on two newspaper subcorpora drawn from larger reference corpora of Bulgarian and Croatian. Initially, we rely on more extralinguistically oriented but methodologically cleaner parameters of similarity, such as specific topics, a pre-defined time span, and data size. The idea of ‘light’ and ‘hard’ comparable corpora is introduced. At this stage we aim at producing a ‘light’ bilingual comparable corpus. The algorithm for identifying lexical similarity and aligning linguistic units is presented, and the initial experiments are outlined.

Original language
English

Scientific fields
Information and communication sciences, Philology, Ethnology and anthropology



AFFILIATIONS OF THE WORK


Project / topic
0130418

Institutions
Filozofski fakultet, Zagreb

Profiles:

Božo Bekavac (author)

Marko Tadić (author)

Cite this publication

Bekavac, Božo; Osenova, Petya; Simov, Kiril; Tadić, Marko
Making Monolingual Corpora Comparable: a Case Study of Bulgarian and Croatian // Fourth International Conference on Language Resources and Evaluation LREC2004 / Lino, Maria Teresa ; Xavier, Maria Francesca ; Ferreira, Fátima ; Costa, Rute ; Silva, Raquel (eds.).
Paris-Lisbon: ELRA, 2004. pp. 1187-1190
Bekavac, B., Osenova, P., Simov, K. & Tadić, M. (2004) Making Monolingual Corpora Comparable: a Case Study of Bulgarian and Croatian. In: Lino, M., Xavier, M., Ferreira, F., Costa, R. & Silva, R. (eds.) Fourth International Conference on Language Resources and Evaluation LREC2004. Paris-Lisbon, ELRA, pp. 1187-1190.
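@inbook{inbook,
  author    = {Bekavac, Božo and Osenova, Petya and Simov, Kiril and Tadić, Marko},
  title     = {Making Monolingual Corpora Comparable: a Case Study of Bulgarian and Croatian},
  booktitle = {Fourth International Conference on Language Resources and Evaluation LREC2004},
  editor    = {Lino, Maria Teresa and Xavier, Maria Francesca and Ferreira, Fátima and Costa, Rute and Silva, Raquel},
  publisher = {ELRA},
  address   = {Paris-Lisbon},
  year      = {2004},
  pages     = {1187--1190},
  isbn      = {2-9517408-1-6},
  keywords  = {corpus linguistics, comparable corpora, Croatian, Bulgarian}
}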



