Bibliographic record number: 1146758
On Machine Translation of User Reviews
On Machine Translation of User Reviews // Proceedings of Recent Advances in Natural Language Processing (RANLP) / Angelova, Galia ; Kunilovskaya, Maria ; Mitkov, Ruslan ; Nikolova-Koleva, Ivelina (eds.).
Online, 2021, pp. 1113-1122 (lecture, international peer review, full paper (in extenso), other)
CROSBI ID: 1146758
Title
On Machine Translation of User Reviews
Authors
Popović, Maja ; Poncelas, Alberto ; Brkić Bakarić, Marija ; Way, Andy
Type, subtype and category of work
Papers in conference proceedings, full paper (in extenso), other
Source
Proceedings of Recent Advances in Natural Language Processing (RANLP)
/ Angelova, Galia ; Kunilovskaya, Maria ; Mitkov, Ruslan ; Nikolova-Koleva, Ivelina, 2021, 1113-1122
ISBN
978-954-452-072-4
Conference
International Conference Recent Advances in Natural Language Processing (RANLP 2021)
Place and date
Online, 01.09.2021 - 03.09.2021
Type of participation
Lecture
Type of review
International peer review
Keywords
neural machine translation, user reviews genre, morphologically complex languages, out-of-domain parallel corpora, synthetic parallel corpora, evaluation
Abstract
This work investigates neural machine translation (NMT) systems for translating English user reviews into Croatian and Serbian, two similar morphologically complex languages. Two types of reviews are used for testing the systems: IMDb movie reviews and Amazon product reviews. Two types of training data are explored: large out-of-domain bilingual parallel corpora and a small synthetic in-domain parallel corpus obtained by machine translation of monolingual English Amazon reviews into the target languages. Both automatic scores and human evaluation show that using the synthetic in-domain corpus together with a selected subset of out-of-domain data is the best option. Separate results on IMDb and Amazon reviews indicate that MT systems perform differently on different review types, so user reviews should generally not be considered a homogeneous genre. Nevertheless, more detailed research on a larger number of different reviews covering different domains/topics is needed to fully understand these differences.
Original language
English
Scientific fields
Information and communication sciences
WORK AFFILIATIONS
Institutions:
Fakultet informatike i digitalnih tehnologija, Rijeka
Profiles:
Marija Brkić Bakarić
(author)