Data source: CROSBI

Parallel Corpus of Croatian-Italian Administrative Texts (CROSBI ID 685323)

Conference contribution in proceedings | conference presentation abstract | international peer review

Brkic Bakaric, Marija ; Lalli Pacelat, Ivana Parallel Corpus of Croatian-Italian Administrative Texts // Proceedings of the 2nd Workshop on Human-Informed Translation and Interpreting Technology (HiT-IT 2019) / Temnikova, I. ; Orăsan, C. ; Corpas Pastor, G. et al. (eds.). 2019. pp. 11-18 doi: 10.26615/issn.2683-0078.2019_002

Authorship details

Brkic Bakaric, Marija ; Lalli Pacelat, Ivana

English

Parallel Corpus of Croatian-Italian Administrative Texts

Parallel corpora constitute a unique resource for providing assistance to human translators. The selection and preparation of the parallel corpora also condition the quality of the resulting MT engine. Since Croatian is a national language and Italian is officially recognized as a minority language in seven cities and twelve municipalities of Istria County, a large number of parallel texts are produced on a daily basis. However, there have been no attempts to use these texts to compile a parallel corpus. A domain-specific, sentence-aligned parallel Croatian-Italian corpus of administrative texts would be of high value in creating different language tools and resources. The aim of this paper is, therefore, to explore the value of parallel documents which are publicly available, mostly in PDF format, and to investigate the use of automatically built dictionaries in corpus compilation. The effects that the document format and, consequently, sentence splitting, as well as the dictionary input, have on the sentence alignment process are manually evaluated.

parallel corpora ; sentence alignment ; automatic dictionary

Contribution details

11-18.

2019.

published

10.26615/issn.2683-0078.2019_002

Host publication details

Proceedings of the 2nd Workshop on Human-Informed Translation and Interpreting Technology (HiT-IT 2019)

Temnikova, I. ; Orăsan, C. ; Corpas Pastor, G. ; Mitkov, R.

Conference details

2nd Workshop on Human-Informed Translation and Interpreting Technology

lecture

05.09.2019-06.09.2019

Varna, Bulgaria

Research areas

Philology, Information and communication sciences
