Data source: CROSBI

Building the Spanish-Croatian Parallel Corpus (CROSBI ID 690827)

Conference contribution in proceedings | other | international peer review

Mikelenić, Bojana ; Tadić, Marko Building the Spanish-Croatian Parallel Corpus // Proceedings of The 12th Language Resources and Evaluation Conference / Calzolari, Nicoletta ; Béchet, Frédéric ; Blache, Philippe et al. (eds.). Marseille: European Language Resources Association (ELRA), 2020. pp. 3932-3936

Responsibility details

Mikelenić, Bojana ; Tadić, Marko

English

Building the Spanish-Croatian Parallel Corpus

This paper describes the building of the first Spanish-Croatian unidirectional parallel corpus, constructed at the Faculty of Humanities and Social Sciences of the University of Zagreb. The corpus comprises eleven Spanish novels and their translations into Croatian by six different professional translators. All the texts were published between 1999 and 2012. The corpus has more than 2 Mw, with approximately 1 Mw for each language. It was automatically sentence-segmented and aligned, then manually post-corrected, and contains 71,778 translation units. To protect copyright and to make the corpus available under a permissive CC-BY licence, the aligned translation units are shuffled. This limits the usability of the corpus to research on language units at the sentence level and below. There are two versions of the corpus in TMX format that will be available for download through the META-SHARE and CLARIN ERIC infrastructures. The former contains plain TMX, while the latter is lemmatised and POS-tagged and stored in the aTMX format.

written corpus ; parallel corpus ; Spanish ; Croatian

Due to the coronavirus pandemic, the conference was not held, but the proceedings were published on 2020-05-15.

Contribution details

3932-3936.

2020.

published

Host publication details

Proceedings of The 12th Language Resources and Evaluation Conference

Calzolari, Nicoletta ; Béchet, Frédéric ; Blache, Philippe ; Choukri, Khalid ; Cieri, Christopher ; Declerck, Thierry ; Goggi, Sara ; Isahara, Hitoshi ; Maegaard, Bente ; Mariani, Joseph ; Mazo, Hélène ; Moreno, Asuncion ; Odijk, Jan ; Piperidis, Stelios

Marseille: European Language Resources Association (ELRA)

979-10-95546-34-4

Conference details

The 12th Language Resources and Evaluation Conference (LREC2020)

poster

11.05.2020-16.05.2020

Marseille, France

Related research fields

Philology, Information and communication sciences
