Bibliographic record no. 922824
Leveraging Lexical Substitutes for Unsupervised Word Sense Induction
Leveraging Lexical Substitutes for Unsupervised Word Sense Induction // Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence (AAAI-18)
New Orleans (LA), United States, 2018, pp. 5004-5011 (lecture, international peer review, full paper (in extenso), scientific)
CROSBI ID: 922824
Title
Leveraging Lexical Substitutes for Unsupervised Word Sense Induction
Authors
Alagić, Domagoj ; Šnajder, Jan ; Padó, Sebastian
Type, subtype, and category of work
Papers in conference proceedings, full paper (in extenso), scientific
Source
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence (AAAI-18)
2018, pp. 5004-5011
Conference
32nd AAAI Conference on Artificial Intelligence (AAAI-18)
Place and date
New Orleans (LA), United States, 2 February 2018 - 7 February 2018
Type of participation
Lecture
Type of peer review
International peer review
Keywords
Lexical semantics ; Natural language processing ; Lexical substitution ; Polysemy ; Word sense induction
Abstract
Word sense induction is the most prominent unsupervised approach to lexical disambiguation. It clusters word instances, typically represented by their bag-of-words contexts. Therefore, uninformative and ambiguous contexts present a major challenge. In this paper, we investigate the use of an alternative instance representation based on lexical substitutes, i.e., contextually suitable, meaning-preserving replacements. Using lexical substitutes predicted by a state-of-the-art automatic system and a simple clustering algorithm, we outperform bag-of-words instance representations and compete with much more complex structured probabilistic models. Furthermore, we show that an oracle based on manually-labeled lexical substitutes yields yet substantially higher performance. Taken together, this provides evidence for a complementarity between word sense induction and lexical substitution that has not been given much consideration before.
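As a rough illustration of the idea in the abstract, the sketch below represents each instance of a target word by a bag of its lexical substitutes and groups the instances with a simple clustering procedure. The abstract does not specify the clustering algorithm, the similarity measure, or any parameters, so average-link agglomerative clustering, cosine similarity, the `threshold` value, the function names, and the toy substitutes for "bank" are all assumptions made here for illustration, not the authors' method or data.

```python
from collections import Counter
from math import sqrt


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def induce_senses(substitutes, threshold=0.2):
    """Average-link agglomerative clustering over bag-of-substitutes vectors.

    substitutes: one list of substitute words per instance of the target word.
    threshold: assumed stopping point; merging stops once the best pair of
               clusters is less similar than this value.
    Returns a sorted list of clusters, each a sorted list of instance indices.
    """
    vectors = [Counter(s) for s in substitutes]
    clusters = [[i] for i in range(len(vectors))]

    def link(c1, c2):
        # Average pairwise similarity between the members of two clusters.
        total = sum(cosine(vectors[i], vectors[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        best, x, y = max(
            (link(clusters[x], clusters[y]), x, y)
            for x in range(len(clusters))
            for y in range(x + 1, len(clusters))
        )
        if best < threshold:
            break
        clusters[x] = sorted(clusters[x] + clusters[y])
        del clusters[y]  # y > x, so index x is unaffected
    return sorted(clusters)


# Hypothetical substitutes for four instances of the word "bank".
toy_substitutes = [
    ["lender", "institution", "firm"],
    ["institution", "lender", "company"],
    ["shore", "riverside", "edge"],
    ["shore", "embankment", "edge"],
]
senses = induce_senses(toy_substitutes)
print(senses)  # [[0, 1], [2, 3]]
```

In this toy case the financial and riverside instances share no substitutes, so they end up in separate induced senses; with noisier, automatically predicted substitutes the choice of threshold would matter considerably more.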
Original language
English
Scientific fields
Computing, Information and communication sciences
RELATED ENTITIES
Projects:
HRZZ-UIP-2014-09-7312 - SenseHive: Dynamic models for the incremental construction of lexico-semantic resources supported by crowdsourcing (SenseHive) (Šnajder, Jan, HRZZ)
Institutions:
Faculty of Electrical Engineering and Computing (Fakultet elektrotehnike i računarstva), Zagreb