Data source: CROSBI

Bootstrapping Bilingual Lexicons from Comparable Corpora for Closely Related Languages (CROSBI ID 45034)

Book chapter | original scientific paper

Ljubešić, Nikola ; Fišer, Darja. Bootstrapping Bilingual Lexicons from Comparable Corpora for Closely Related Languages // Text, Speech and Dialogue / Habernal, Ivan ; Matoušek, Václav (eds.). Berlin, Heidelberg: Springer, 2011. pp. 91-98

Responsibility details

Ljubešić, Nikola ; Fišer, Darja

English

Bootstrapping Bilingual Lexicons from Comparable Corpora for Closely Related Languages

In this paper we present an approach to bootstrap a Croatian-Slovene bilingual lexicon from comparable news corpora from scratch, without relying on any external bilingual knowledge resource. Instead of using a dictionary to translate context vectors, we build a seed lexicon from identical words in both languages and extend it with context-based cognates and translation candidates of the most frequent words. By enlarging the seed dictionary by only 7%, we were able to improve the baseline precision from 0.597 to 0.731 on the mean reciprocal rank for the ten top-ranking translation candidates, with a 50.4% recall on the gold standard of 500 entries.
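
The two ideas named in the abstract, a seed lexicon built from identical words and evaluation by mean reciprocal rank over the ten top-ranking candidates, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the `min_freq` threshold, and the toy data are assumptions made for illustration only, and the paper's exact filtering and evaluation details may differ.

```python
from collections import Counter


def seed_lexicon(src_tokens, tgt_tokens, min_freq=2):
    """Map each word form to itself if it occurs in both corpora.

    min_freq is a hypothetical threshold (not taken from the paper)
    that drops rare forms from the seed.
    """
    src_counts = Counter(src_tokens)
    tgt_counts = Counter(tgt_tokens)
    return {
        word: word
        for word in src_counts
        if src_counts[word] >= min_freq and tgt_counts.get(word, 0) >= min_freq
    }


def mean_reciprocal_rank(ranked, gold, k=10):
    """Average of 1/rank of the first correct candidate among the top k.

    Entries with no correct candidate in the top k contribute 0.
    """
    if not ranked:
        return 0.0
    total = 0.0
    for word, candidates in ranked.items():
        for rank, candidate in enumerate(candidates[:k], start=1):
            if candidate in gold.get(word, set()):
                total += 1.0 / rank
                break
    return total / len(ranked)


# Toy data, invented for illustration.
hr_tokens = ["voda", "voda", "grad", "grad", "kuća"]
sl_tokens = ["voda", "voda", "grad", "grad", "hiša"]
seed = seed_lexicon(hr_tokens, sl_tokens)

ranked = {"kuća": ["dom", "hiša"], "kruh": ["kruh"]}
gold = {"kuća": {"hiša"}, "kruh": {"kruh"}}
mrr = mean_reciprocal_rank(ranked, gold)
```

In this toy run the seed contains only the forms shared by both token lists, and the score averages the reciprocal rank of the first correct candidate for each test word.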

comparable corpora, bilingual lexicon extraction, bootstrapping

Chapter details

91-98.

published

Book details

Text, Speech and Dialogue

Habernal, Ivan ; Matoušek, Václav

Berlin, Heidelberg: Springer

2011.

978-3-642-23537-5

Related fields

Information and communication sciences