Representing word meaning with lexical substitutes (CROSBI ID 448677)
Thesis | doctoral dissertation
Responsibility information
Alagić, Domagoj
Šnajder, Jan
English
Representing word meaning with lexical substitutes
The thesis explores computational approaches to representing word meaning in context. While representing the meaning of individual words is crucial for most natural language processing (NLP) tasks, it remains a challenge because word meaning often depends on context. This research investigates computational models for representing word meaning in context using lexical substitutes (LS), meaning-preserving replacements for a word in context. More specifically, it explores in depth to what extent the computational substitute-based representation corresponds to the more established sense-based representation. First, a proof-of-concept study is presented that aims to validate the initial hypothesis that lexical substitutes are suitable for representing word meaning. Since this hypothesis is best tested on a downstream benchmark NLP task, the study opts for word sense induction (WSI), a well-established semantic NLP task. The results obtained using simple methods based on lexical substitutes motivated a more detailed experiment on the correspondence between the sense-based and substitute-based representations. The thesis introduces a new lexical sample dataset annotated with both word senses and lexical substitutes, which served as a testbed for the study. Experiments using both manually and automatically produced lexical substitutes are also conducted, uncovering the performance gap between the two. Lastly, WSI is considered again, now using the computational approaches verified in the preceding experiments, and compared against the state-of-the-art WSI model. Complementing the previous experiments, a focused end-to-end study on lexical substitution for the Croatian language was also performed, yielding the first Croatian lexical substitution dataset.
word meaning ; computational lexical semantics ; lexical substitution ; word sense induction ; natural language processing
Publication information
73
12.10.2021.
defended
Degree-granting institution
Fakultet elektrotehnike i računarstva
Zagreb