On Machine Translation of User Reviews (CROSBI ID 707507)
Conference paper in proceedings | other | international peer review
Responsibility information
Popović, Maja ; Poncelas, Alberto ; Brkić Bakarić, Marija ; Way, Andy
English
On Machine Translation of User Reviews
This work investigates neural machine translation (NMT) systems for translating English user reviews into Croatian and Serbian, two similar, morphologically complex languages. Two types of reviews are used for testing the systems: IMDb movie reviews and Amazon product reviews. Two types of training data are explored: large out-of-domain bilingual parallel corpora, and a small synthetic in-domain parallel corpus obtained by machine translation of monolingual English Amazon reviews into the target languages. Both automatic scores and human evaluation show that using the synthetic in-domain corpus together with a selected subset of out-of-domain data is the best option. Separate results on IMDb and Amazon reviews indicate that MT systems perform differently on different review types, so user reviews should not, in general, be treated as a homogeneous genre. Nevertheless, more detailed research on a larger number of reviews covering different domains and topics is needed to fully understand these differences.
neural machine translation, user reviews genre, morphologically complex languages, out-of-domain parallel corpora, synthetic parallel corpora, evaluation
Contribution details
1113-1122.
2021.
published
Host publication details
Proceedings of Recent Advances in Natural Language Processing (RANLP)
Angelova, Galia ; Kunilovskaya, Maria ; Mitkov, Ruslan ; Nikolova-Koleva, Ivelina
978-954-452-072-4
2603-2813
Conference details
International Conference Recent Advances in Natural Language Processing (RANLP 2021)
lecture
01.09.2021-03.09.2021
online