Bibliographic record no.: 552910

Bootstrapping Bilingual Lexicons from Comparable Corpora for Closely Related Languages


Ljubešić, Nikola; Fišer, Darja
Bootstrapping Bilingual Lexicons from Comparable Corpora for Closely Related Languages // Text, Speech and Dialogue / Habernal, Ivan ; Matoušek, Václav (eds.).
Berlin : Heidelberg: Springer, 2011. pp. 91-98


CROSBI ID: 552910

Title
Bootstrapping Bilingual Lexicons from Comparable Corpora for Closely Related Languages

Authors
Ljubešić, Nikola ; Fišer, Darja

Type, subtype and category of work
Book chapters, scientific

Book
Text, Speech and Dialogue

Editor(s)
Habernal, Ivan ; Matoušek, Václav

Publisher
Springer

City
Berlin : Heidelberg

Year
2011

Page range
91-98

ISBN
978-3-642-23537-5

Keywords
comparable corpora, bilingual lexicon extraction, bootstrapping

Abstract
In this paper we present an approach to bootstrapping a Croatian-Slovene bilingual lexicon from comparable news corpora from scratch, without relying on any external bilingual knowledge resource. Instead of using a dictionary to translate context vectors, we build a seed lexicon from words that are identical in both languages and extend it with context-based cognates and translation candidates of the most frequent words. By enlarging the seed dictionary by only 7%, we were able to improve the baseline precision from 0.597 to 0.731, measured as mean reciprocal rank over the ten top-ranking translation candidates, with 50.4% recall on a gold standard of 500 entries.
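Two ideas named in the abstract can be illustrated with a minimal Python sketch: a seed lexicon built from words that are identical in both corpora, and mean reciprocal rank over the ten top-ranking candidates. This is not the authors' implementation. The function names, the `min_freq` threshold, the toy sentences, and the choice to count every evaluated source word in the denominator are all assumptions made for illustration; the record does not describe the paper's exact procedure.

```python
from collections import Counter


def identical_word_seed(src_tokens, tgt_tokens, min_freq=2):
    """Map a word to itself if it occurs at least min_freq times in both corpora.

    min_freq is a hypothetical threshold, not a value taken from the paper.
    """
    src_counts = Counter(src_tokens)
    tgt_counts = Counter(tgt_tokens)
    return {
        word: word
        for word, count in src_counts.items()
        if count >= min_freq and tgt_counts.get(word, 0) >= min_freq
    }


def mean_reciprocal_rank(ranked_candidates, gold, top_n=10):
    """Average 1/rank of the first correct translation among the top_n candidates.

    Source words with no correct candidate in the top_n contribute 0
    (an assumption about how misses are scored).
    """
    if not ranked_candidates:
        return 0.0
    total = 0.0
    for source_word, candidates in ranked_candidates.items():
        correct = gold.get(source_word, set())
        for rank, candidate in enumerate(candidates[:top_n], start=1):
            if candidate in correct:
                total += 1.0 / rank
                break
    return total / len(ranked_candidates)


# Toy data invented for illustration; not drawn from the paper's corpora.
hr_tokens = "vlada je danas predstavila novi zakon a vlada je zakon poslala dalje".split()
sl_tokens = "vlada je danes predstavila nov zakon in vlada je zakon poslala naprej".split()
seed = identical_word_seed(hr_tokens, sl_tokens)

# Hypothetical ranked translation candidates and gold-standard translations.
ranked = {
    "danas": ["danes", "dan"],     # correct at rank 1
    "novi": ["novo", "nov"],       # correct at rank 2
    "dalje": ["daleč", "dolgo"],   # no correct candidate
}
gold = {"danas": {"danes"}, "novi": {"nov"}, "dalje": {"naprej"}}
mrr = mean_reciprocal_rank(ranked, gold)

print(sorted(seed))
print(mrr)
```

In this toy run the seed contains only the words that are frequent and spelled identically on both sides, and the score is the mean of 1, 1/2 and 0. The paper additionally extends the seed with cognates and first translations of frequent words, which this sketch does not attempt.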

Original language
English

Scientific fields
Information and communication sciences



RELATED TO THIS WORK


Projects:
130-1301679-1380 - Hrvatska rječnička baština i hrvatski europski identitet (Boras, Damir, MZOS) (CroRIS)

Institutions:
Filozofski fakultet, Zagreb

Profiles:

Nikola Ljubešić (author)

Cite this publication:

Ljubešić, Nikola; Fišer, Darja
Bootstrapping Bilingual Lexicons from Comparable Corpora for Closely Related Languages // Text, Speech and Dialogue / Habernal, Ivan ; Matoušek, Václav (eds.).
Berlin : Heidelberg: Springer, 2011. pp. 91-98
Ljubešić, N. & Fišer, D. (2011) Bootstrapping Bilingual Lexicons from Comparable Corpora for Closely Related Languages. In: Habernal, I. & Matoušek, V. (eds.) Text, Speech and Dialogue. Berlin : Heidelberg, Springer, pp. 91-98.
@inbook{inbook,
  author    = {Ljube\v{s}i\'{c}, Nikola and Fi\v{s}er, Darja},
  title     = {Bootstrapping Bilingual Lexicons from Comparable Corpora for Closely Related Languages},
  year      = {2011},
  pages     = {91--98},
  isbn      = {978-3-642-23537-5},
  keywords  = {comparable corpora, bilingual lexicon extraction, bootstrapping},
  publisher = {Springer},
  address   = {Berlin : Heidelberg}
}



