Domain Adaptation for Machine Translation Involving a Low-Resource Language: Google AutoML vs. from-scratch NMT Systems (CROSBI ID 685957)
Conference contribution in proceedings | original scientific paper | international peer review
Responsibility data
Šoštarić, Margita ; Pavlović, Nataša ; Boltužić, Filip
English
Domain Adaptation for Machine Translation Involving a Low-Resource Language: Google AutoML vs. from-scratch NMT Systems
Despite the advances in machine translation (MT) made with neural models, adapting such systems to specialist domains remains challenging. The problem is more acute for low-resource languages. Additionally, the computational resources and expertise needed to train neural models present barriers for smaller translation companies and freelancers, for whom paid but affordable customization services may offer a viable solution. One such service, Google Cloud AutoML, is here compared to domain adaptation of neural MT systems trained from scratch using OpenNMT, an open-source MT toolkit. The from-scratch systems are trained on a larger out-of-domain dataset and a smaller in-domain dataset consisting of medical texts. The same in-domain data are used to customize Google Translate. System performance is compared using automatic and human evaluation. The resources, skills, costs and time necessary to set up the examined systems are discussed.
machine translation, domain adaptation, low-resource language, neural machine translation
Contribution data
113-124.
2019.
published
Host publication data
Translating and the Computer 41
Esteves-Ferreira, João ; Macan, Juliet Margaret ; Mitkov, Ruslan ; Stefanov, Olaf-Michael
Geneva: Editions Tradulex
978-2970-10957-0
Conference data
Translating and the Computer (TC41 2019)
lecture
21.11.2019-22.11.2019
London, United Kingdom