Bibliographic record no.: 1262054
Analysis of Corpus-based Word-Order Typological Methods
Analysis of Corpus-based Word-Order Typological Methods // Proceedings of the Sixth Workshop on Universal Dependencies (UDW, GURT/SyntaxFest 2023) / Grobol, Loïc ; Tyers, Francis (eds.).
Washington (MD): Association for Computational Linguistics (ACL), 2023. pp. 36-46 (lecture, international peer review, full paper (in extenso), scientific)
CROSBI ID: 1262054
Title
Analysis of Corpus-based Word-Order Typological Methods
Authors
Alves, Diego ; Bekavac, Božo ; Zeman, Daniel ; Tadić, Marko
Type, subtype and category of work
Papers in conference proceedings, full paper (in extenso), scientific
Source
Proceedings of the Sixth Workshop on Universal Dependencies (UDW, GURT/SyntaxFest 2023)
/ Grobol, Loïc ; Tyers, Francis - Washington (MD) : Association for Computational Linguistics (ACL), 2023, 36-46
ISBN
978-1-959429-34-0
Conference
Sixth Workshop on Universal Dependencies (UDW, GURT/SyntaxFest 2023)
Place and date
Washington, D.C., United States of America, 9-12 March 2023
Type of participation
Lecture
Type of review
International peer review
Keywords
dependency parsing ; typology ; multilingualism
Abstract
This article presents a comparative analysis of four different syntactic typological approaches applied to 20 different languages. We compared three specific quantitative methods, using parallel CoNLL-U corpora, to the classification obtained via syntactic features provided by a typological database (lang2vec). First, we analyzed the Marsagram linear approach, which consists of extracting the frequency of word-order patterns regarding the position of components inside syntactic nodes. The second approach considers the relative position of heads and dependents, and the third is based simply on the relative position of verbs and objects. From the results, it was possible to observe that each method provides different language clusters, which can be compared to the classic genealogical classification (the lang2vec and the head and dependent methods being the closest). As different word-order phenomena are considered in these specific typological strategies, each one provides a different angle of analysis to be applied according to the precise needs of the researchers.
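As an illustration of the second and third approaches mentioned in the abstract, the following is a minimal sketch of how head/dependent order and verb/object order could be counted from CoNLL-U text. It is not the authors' implementation: the function names (`sentences`, `head_direction_counts`, `verb_object_counts`) and the toy `SAMPLE` sentence are hypothetical, and the sketch assumes standard 10-column, tab-separated CoNLL-U input with Universal Dependencies labels such as `obj` and `VERB`.

```python
from collections import Counter

# Toy one-sentence corpus in CoNLL-U format (hypothetical example data).
SAMPLE = """\
# text = The dog chased the cat
1\tThe\tthe\tDET\t_\t_\t2\tdet\t_\t_
2\tdog\tdog\tNOUN\t_\t_\t3\tnsubj\t_\t_
3\tchased\tchase\tVERB\t_\t_\t0\troot\t_\t_
4\tthe\tthe\tDET\t_\t_\t5\tdet\t_\t_
5\tcat\tcat\tNOUN\t_\t_\t3\tobj\t_\t_
"""


def sentences(conllu_text):
    """Yield one dict per sentence: token id -> (upos, head id, base deprel)."""
    for block in conllu_text.strip().split("\n\n"):
        tokens = {}
        for line in block.splitlines():
            if line.startswith("#"):
                continue  # sentence-level comment
            cols = line.split("\t")
            # Skip multiword ranges (1-2), empty nodes (1.1) and malformed rows.
            if len(cols) != 10 or not cols[0].isdigit() or not cols[6].isdigit():
                continue
            base_relation = cols[7].split(":")[0]  # drop subtypes, e.g. nsubj:pass
            tokens[int(cols[0])] = (cols[3], int(cols[6]), base_relation)
        if tokens:
            yield tokens


def head_direction_counts(conllu_text):
    """Count, per relation, whether the head precedes or follows its dependent."""
    counts = Counter()
    for tokens in sentences(conllu_text):
        for tok_id, (_upos, head, deprel) in tokens.items():
            if head == 0:
                continue  # the root has no head inside the sentence
            order = "head_first" if head < tok_id else "dependent_first"
            counts[(deprel, order)] += 1
    return counts


def verb_object_counts(conllu_text):
    """Count VO versus OV orders for objects attached to a verb."""
    counts = Counter()
    for tokens in sentences(conllu_text):
        for tok_id, (_upos, head, deprel) in tokens.items():
            if deprel == "obj" and head in tokens and tokens[head][0] == "VERB":
                counts["VO" if head < tok_id else "OV"] += 1
    return counts


if __name__ == "__main__":
    print(head_direction_counts(SAMPLE))
    print(verb_object_counts(SAMPLE))
```

Relative frequencies derived from such counts could then serve as per-language feature vectors for clustering; how the vectors are normalised and clustered is left open here, since the record does not specify it.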
Original language
English
Scientific fields
Information and communication sciences, Philology
CONNECTIONS OF THE WORK
Projects:
EK-H2020-812997 - Cross-lingual Event-centric Open Analytics Research Academy (Cleopatra) (Tadić, Marko, EK - H2020-MSCA-ITN-2018) (CroRIS)
Institutions:
Filozofski fakultet, Zagreb