Slovene-Croatian Treebank Transfer Using Bilingual Lexicon Improves Croatian Dependency Parsing (CROSBI ID 590524)
Conference paper in proceedings | original scientific paper | international peer review
Authorship details
Agić, Željko ; Merkler, Danijela ; Berović, Daša
English
Slovene-Croatian Treebank Transfer Using Bilingual Lexicon Improves Croatian Dependency Parsing
A method is presented for transferring dependency treebanks between similar languages using a bilingual lexicon, with the aim of improving dependency parsing accuracy on the target language. It is illustrated by transferring the Slovene Dependency Treebank to Croatian using a GIZA++ bilingual lexicon constructed from the Croatian-Slovene 1984 parallel corpus from the Multext East project. The transferred treebank is merged with the Croatian Dependency Treebank, and the merged treebank is used to train and test two graph-based dependency parsers. The accuracy of MSTParser and CroDep shows a statistically significant increase when parsing the 1984 fictional text and a similar decrease when parsing the Croatian Dependency Treebank newspaper text.
treebank transfer; bilingual lexicon; dependency parsing
Contribution details
5-9.
2012.
published
Host publication details
Proceedings of the 15th International Multiconference Information Society (IS 2012), Volume C, Proceedings of the 8th Language Technologies Conference
Erjavec, Tomaž ; Žganec Gros, Jerneja
Ljubljana: Institut Jožef Stefan
978-961-264-048-4
1581-9973
Conference details
Eighth Language Technologies Conference
lecture
08.10.2012-09.10.2012
Ljubljana, Slovenia