Bibliographic record number: 1106494
Mining Semantic Relations from Comparable Corpora through Intersections of Word Embeddings
Mining Semantic Relations from Comparable Corpora through Intersections of Word Embeddings // Proceedings of the 13th Workshop on Building and Using Comparable Corpora, Language Resources and Evaluation Conference (LREC 2020) / Rapp, Reinhard ; Zweigenbaum, Pierre ; Sharoff, Serge (eds.).
Marseille: European Language Resources Association (ELRA), 2020. pp. 29-34
CROSBI ID: 1106494
Title
Mining Semantic Relations from Comparable Corpora through Intersections of Word Embeddings
Authors
Vintar, Špela ; Grčić Simeunović, Larisa ; Martinc, Matej ; Pollak, Senja ; Stepišnik, Uroš
Type, subtype and category of work
Book chapters, scientific
Book
Proceedings of the 13th Workshop on Building and Using Comparable Corpora, Language Resources and Evaluation Conference (LREC 2020)
Editor(s)
Rapp, Reinhard ; Zweigenbaum, Pierre ; Sharoff, Serge
Publisher
European Language Resources Association (ELRA)
City
Marseille
Year
2020
Page range
29-34
ISBN
979-10-95546-42-9
Keywords
semantic relations, word embeddings, comparable corpus, karstology, frame-based terminology
Abstract
We report an experiment aimed at extracting words expressing a specific semantic relation using intersections of word embeddings. In a multilingual frame-based domain model, specific features of a concept are typically described through a set of non-arbitrary semantic relations. In karstology, our domain of choice, which we are exploring through a comparable corpus in English and Croatian, karst phenomena such as landforms are usually described through their FORM, LOCATION, CAUSE, FUNCTION and COMPOSITION. We propose an approach to mine words pertaining to each of these relations by using a small number of seed adjectives, for which we retrieve the closest words using word embeddings and then use intersections of these neighbourhoods to refine our search. Such cross-language expansion of semantically rich vocabulary is a valuable aid in improving the coverage of a multilingual knowledge base, but also in exploring differences between languages in their respective conceptualisations of the domain.
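The neighbourhood-intersection idea in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the two-dimensional vectors in `TOY_EMBEDDINGS` are invented, the function names `nearest` and `intersect_neighbourhoods` are hypothetical, and cosine similarity with a fixed neighbourhood size `k` is one plausible choice. The record does not state which embedding model, similarity measure or neighbourhood size the authors used, so this is not their implementation.

```python
import math

# Invented 2-D vectors for illustration only; real embeddings would be
# trained on the comparable corpus and have hundreds of dimensions.
TOY_EMBEDDINGS = {
    "circular":  (1.00, 0.10),
    "oval":      (0.90, 0.20),
    "round":     (0.95, 0.15),
    "elongated": (0.80, 0.30),
    "limestone": (0.10, 1.00),
    "coastal":   (0.20, 0.90),
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest(embeddings, seed, k):
    """Return the set of the k words most similar to `seed` (seed excluded)."""
    target = embeddings[seed]
    scored = [
        (cosine(target, vec), word)
        for word, vec in embeddings.items()
        if word != seed
    ]
    scored.sort(reverse=True)
    return {word for _, word in scored[:k]}


def intersect_neighbourhoods(embeddings, seeds, k):
    """Keep only words that occur in the neighbourhood of every seed."""
    neighbourhoods = [nearest(embeddings, seed, k) for seed in seeds]
    return set.intersection(*neighbourhoods) - set(seeds)


if __name__ == "__main__":
    # Two hypothetical seed adjectives for the FORM relation.
    candidates = intersect_neighbourhoods(TOY_EMBEDDINGS, ["circular", "oval"], k=3)
    print(sorted(candidates))  # ['elongated', 'round']
```

In this toy setting, words close to only one seed are dropped by the intersection, and a smaller `k` narrows the candidate set further (with `k=2` only `round` survives).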
Original language
English
Scientific fields
Philology