Dealing with data sparseness in SMT with factored models and morphological expansion: a case study on Croatian (CROSBI ID 245681)
Journal contribution | original scientific paper | international peer review
Responsibility details
Sanchez-Cartagena, Victor M. ; Ljubešić, Nikola ; Klubička, Filip
English
This paper describes our experience using available linguistic resources for Croatian to address data sparseness when building an English-to-Croatian general-domain phrase-based statistical machine translation system. We report the results obtained with factored translation models and morphological expansion, highlight the impact of the algorithm used for tagging the corpora, and show that the improvement brought by these methods is compatible with the application of data selection on out-of-domain parallel corpora.
data sparseness, factored translation models, morphological expansion
Publication details
Related fields and affiliations
Information and communication sciences