Bibliographic record number: 1129724
Word sense induction using leader-follower clustering of automatically generated lexical substitutes
Word sense induction using leader-follower clustering of automatically generated lexical substitutes // Expert systems with applications, 181 (2021), 115162, 12 doi:10.1016/j.eswa.2021.115162 (international peer review, article, scientific)
CROSBI ID: 1129724
Title
Word sense induction using leader-follower clustering of automatically generated lexical substitutes
Authors
Akkasi, Abbas ; Šnajder, Jan
Source
Expert systems with applications (0957-4174) 181 (2021); 115162, 12
Type, subtype and category of work
Journal papers, article, scientific
Keywords
Word sense induction ; Natural language processing ; Graph clustering ; Clustering refinement ; Lexical substitution
Abstract
Word Sense Induction (WSI) concerns the automatic identification of the various senses of polysemous words. Any improvement in this process can directly affect the quality of applications in which knowing a word’s senses is important, such as word sense disambiguation, information retrieval, and clustering of web search results for lexically ambiguous queries. In this paper, we propose a novel WSI model that uses automatically generated lexical substitutes for a target word to construct a graph and prepare the data for the subsequent steps. Following the data preparation step, we use Leader–Follower graph clustering to find the basic senses of the target word. The senses of the target word in the remaining or newly arriving instances are then decided according to the similarity of their contextual embeddings to the basic senses. In addition, to bring the number of sense groups found for a target word closer to reality, we apply a post-processing step at the end. Experimental results on the SemEval-2010 dataset confirm that the proposed method outperforms all state-of-the-art solutions in terms of both the harmonic and geometric means of V-measure and F-score, with a lower average number of sense groups.
Original language
English
Scientific fields
Computer science
RELATED ENTITIES
Institutions:
Fakultet elektrotehnike i računarstva, Zagreb
Profiles:
Jan Šnajder (author)
Journal indexed in:
- Current Contents Connect (CCC)
- Web of Science Core Collection (WoSCC)
- Science Citation Index Expanded (SCI-EXP)
- SCI-EXP, SSCI and/or A&HCI
- Scopus