Bibliographic record number: 616795

Disambiguating vectors for bilingual lexicon extraction from comparable corpora


Apidianaki, Marianna; Ljubešić, Nikola; Fišer, Darja
Disambiguating vectors for bilingual lexicon extraction from comparable corpora // Proceedings of the Eighth LANGUAGE TECHNOLOGIES Conference / Erjavec, Tomaž ; Žganec Gros, Jerneja (eds.).
Ljubljana, 2012. pp. 10-15 (lecture, international peer review, full paper (in extenso), scientific)


CROSBI ID: 616795

Title
Disambiguating vectors for bilingual lexicon extraction from comparable corpora

Authors
Apidianaki, Marianna ; Ljubešić, Nikola ; Fišer, Darja

Type, subtype and category of work
Papers in conference proceedings, full paper (in extenso), scientific

Source
Proceedings of the Eighth LANGUAGE TECHNOLOGIES Conference / Erjavec, Tomaž ; Žganec Gros, Jerneja - Ljubljana, 2012, 10-15

Conference
Eighth LANGUAGE TECHNOLOGIES Conference

Place and date
Ljubljana, Slovenia, 8-9 October 2012

Type of participation
Lecture

Type of review
International peer review

Keywords
bilingual lexicon extraction; cross-lingual sense clustering; feature disambiguation

Abstract
This paper presents an approach to improving the extraction of translation equivalents from comparable corpora by plugging in bilingual lexico-semantic knowledge harvested from a parallel corpus. First, the bilingual lexicon obtained by word-aligning the parallel corpus replaces an external seed dictionary, making the approach knowledge-light and portable. Next, instead of using simple 1:1 mappings between the source and the target language, translation equivalents are clustered into sets of synonyms based on contextual similarities, which makes it possible to expand the translation of vector features with several translation variants. Finally, the vector features are disambiguated and translated only with the translation variants from the most appropriate cluster, producing less noisy vectors that allow a more successful cross-lingual comparison than simpler methods.
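
The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy sense inventory `SENSE_CLUSTERS`, its cue words and target strings, the overlap-based choice in `pick_cluster`, and the use of cosine similarity are all assumptions made for illustration. The sketch only shows the general idea of translating each vector feature with the variants of a single, context-selected cluster instead of a 1:1 dictionary entry.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical toy sense inventory (an assumption, not data from the paper):
# source feature -> list of (source-language cue words, target translations).
SENSE_CLUSTERS = {
    "bank": [
        ({"money", "loan"}, {"banka"}),
        ({"river", "water"}, {"obala", "breg"}),
    ],
    "money": [({"bank", "loan"}, {"denar"})],
    "loan": [({"bank", "money"}, {"posojilo", "kredit"})],
}


def pick_cluster(feature, vector):
    """Return the translations of the cluster whose cue words overlap most
    with the other features of the vector (ties go to the first cluster)."""
    clusters = SENSE_CLUSTERS.get(feature)
    if not clusters:
        return set()  # feature not covered by the inventory: left untranslated
    context = set(vector) - {feature}
    cues, translations = max(clusters, key=lambda c: len(c[0] & context))
    return translations


def translate_vector(vector):
    """Translate a source context vector, spreading each feature's weight
    over all translation variants of its selected cluster."""
    translated = defaultdict(float)
    for feature, weight in vector.items():
        for target in pick_cluster(feature, vector):
            translated[target] += weight
    return dict(translated)


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm = sqrt(sum(w * w for w in u.values())) * sqrt(sum(w * w for w in v.values()))
    return dot / norm if norm else 0.0
```

In this sketch, a translated source vector would be compared with candidate target-language vectors using `cosine`, and the highest-scoring candidates would be proposed as translation equivalents.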

Original language
English

Scientific fields
Information and communication sciences



RELATED INFORMATION


Projects:
130-1301679-1380 - Hrvatska rječnička baština i hrvatski europski identitet (Boras, Damir, MZOS)

Institutions:
Filozofski fakultet, Zagreb

Profiles:

Nikola Ljubešić (author)


Cite this publication:

Apidianaki, Marianna; Ljubešić, Nikola; Fišer, Darja
Disambiguating vectors for bilingual lexicon extraction from comparable corpora // Proceedings of the Eighth LANGUAGE TECHNOLOGIES Conference / Erjavec, Tomaž ; Žganec Gros, Jerneja (eds.).
Ljubljana, 2012. pp. 10-15 (lecture, international peer review, full paper (in extenso), scientific)
Apidianaki, M., Ljubešić, N. & Fišer, D. (2012) Disambiguating vectors for bilingual lexicon extraction from comparable corpora. In: Erjavec, T. & Žganec Gros, J. (eds.) Proceedings of the Eighth LANGUAGE TECHNOLOGIES Conference.
@inproceedings{article,
  author    = {Apidianaki, Marianna and Ljube\v{s}i\'{c}, Nikola and Fi\v{s}er, Darja},
  title     = {Disambiguating vectors for bilingual lexicon extraction from comparable corpora},
  booktitle = {Proceedings of the Eighth LANGUAGE TECHNOLOGIES Conference},
  editor    = {Erjavec, Toma\v{z} and \v{Z}ganec Gros, Jerneja},
  year      = {2012},
  pages     = {10--15},
  address   = {Ljubljana, Slovenia},
  keywords  = {bilingual lexicon extraction, cross-lingual sense clustering, feature disambiguation}
}



