Bibliographic record number: 419754
Procedures in building the Croatian-English parallel corpus
Procedures in building the Croatian-English parallel corpus // Text Corpora and Multilingual Lexicography / Teubert, Wolfgang (ed.).
Amsterdam : Philadelphia: John Benjamins Publishing, 2007. pp. 93-107
CROSBI ID: 419754
Title
Procedures in building the Croatian-English parallel corpus
Authors
Tadić, Marko
Type, subtype and category of work
Book chapters, scientific
Book
Text Corpora and Multilingual Lexicography
Editor(s)
Teubert, Wolfgang
Publisher
John Benjamins Publishing
City
Amsterdam : Philadelphia
Year
2007
Page range
93-107
ISBN
978 90 272 2238 1
Keywords
parallel corpus, Croatian, English, building
Abstract
This contribution surveys the procedures and formats used in building the Croatian-English parallel corpus, which is being collected at the Institute of Linguistics at the Philosophical Faculty, University of Zagreb. The primary text source is the newspaper Croatia Weekly, which has been published since the beginning of 1998 by HIKZ (Croatian Institute for Information and Culture). After a brief survey of existing English-Croatian parallel corpora, the article deals with the procedures involved in text conversion and text encoding, particularly alignment. Several recent suggestions for alignment encoding are listed and elaborated at the end of the article.
Original language
English
Scientific fields
Philology
RELATED ENTITIES
Projects:
130-1300646-0645 - Hrvatski jezični resursi i njihovo obilježavanje (Tadić, Marko, MZOS) (CroRIS)
Institutions:
Filozofski fakultet, Zagreb
Profiles:
Marko Tadić
(author)