Bibliographic record no. 174994

Making Monolingual Corpora Comparable: a Case Study of Bulgarian and Croatian


Bekavac, Božo; Osenova, Petya; Simov, Kiril; Tadić, Marko
Making Monolingual Corpora Comparable: a Case Study of Bulgarian and Croatian // Fourth International Conference on Language Resources and Evaluation LREC2004 / Lino, Maria Teresa ; Xavier, Maria Francesca ; Ferreira, Fátima ; Costa, Rute ; Silva, Raquel (eds.).
Paris : Lisbon: European Language Resources Association (ELRA), 2004. pp. 1187-1190


CROSBI ID: 174994

Title
Making Monolingual Corpora Comparable: a Case Study of Bulgarian and Croatian

Authors
Bekavac, Božo ; Osenova, Petya ; Simov, Kiril ; Tadić, Marko

Type, subtype and category of work
Book chapters, scientific

Book
Fourth International Conference on Language Resources and Evaluation LREC2004

Editor(s)
Lino, Maria Teresa ; Xavier, Maria Francesca ; Ferreira, Fátima ; Costa, Rute ; Silva, Raquel

Publisher
European Language Resources Association (ELRA)

City
Paris : Lisbon

Year
2004

Page range
1187-1190

ISBN
2-9517408-1-6

Keywords
corpus linguistics, comparable corpora, Croatian, Bulgarian

Abstract
This paper describes the first steps towards the creation of a Bulgarian-Croatian comparable corpus. It is based on two newspaper subcorpora drawn from larger reference corpora of Bulgarian and Croatian. Initially we rely on more extralinguistically oriented but methodologically cleaner parameters of similarity, such as specific topics, a pre-defined time span and data size. The idea of ‘light’ and ‘hard’ comparable corpora is introduced. At this stage we aim at producing a ‘light’ bilingual comparable corpus. The algorithm for identifying lexical similarity and aligning linguistic units is presented, and the initial experiments are outlined.

Original language
English

Scientific fields
Information and communication sciences, Philology, Ethnology and anthropology



CONNECTIONS OF THIS WORK


Projects:
0130418

Institutions:
Filozofski fakultet, Zagreb

Profiles:

Božo Bekavac (author)

Marko Tadić (author)

Cite this publication:

Bekavac, B., Osenova, P., Simov, K. & Tadić, M. (2004) Making Monolingual Corpora Comparable: a Case Study of Bulgarian and Croatian. In: Lino, M., Xavier, M., Ferreira, F., Costa, R. & Silva, R. (eds.) Fourth International Conference on Language Resources and Evaluation LREC2004. Paris : Lisbon, European Language Resources Association (ELRA), pp. 1187-1190.
@incollection{inbook,
  author    = {Bekavac, Bo\v{z}o and Osenova, Petya and Simov, Kiril and Tadi\'{c}, Marko},
  title     = {Making Monolingual Corpora Comparable: a Case Study of Bulgarian and Croatian},
  booktitle = {Fourth International Conference on Language Resources and Evaluation LREC2004},
  editor    = {Lino, Maria Teresa and Xavier, Maria Francesca and Ferreira, F\'{a}tima and Costa, Rute and Silva, Raquel},
  publisher = {European Language Resources Association (ELRA)},
  address   = {Paris : Lisbon},
  year      = {2004},
  pages     = {1187--1190},
  isbn      = {2-9517408-1-6},
  keywords  = {corpus linguistics, comparable corpora, Croatian, Bulgarian}
}