Bibliographic record no. 711764
Towards Semantic Validation of a Derivational Lexicon
Towards Semantic Validation of a Derivational Lexicon // Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers
Dublin, 2014, pp. 1728-1739 (lecture, international peer review, full paper (in extenso), scientific)
CROSBI ID: 711764
Title
Towards Semantic Validation of a Derivational Lexicon
Authors
Padó, Sebastian ; Zeller, Britta ; Šnajder, Jan
Type, subtype and category of work
Papers in conference proceedings, full paper (in extenso), scientific
Source
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers
Dublin, 2014, pp. 1728-1739
Conference
The 25th International Conference on Computational Linguistics (COLING 2014)
Place and date
Dublin, Ireland, 23-29 August 2014
Type of participation
Lecture
Type of peer review
International peer review
Keywords
derivational morphology; morphosemantics; support vector machines; German language
Abstract
Derivationally related lemmas like friend_N–friendly_A–friendship_N are derived from a common stem. Frequently, their meanings are also systematically related. However, there are also many examples of derivationally related lemma pairs whose meanings differ substantially, e.g., object_N–objective_N. Most broad-coverage derivational lexicons do not reflect this distinction, mixing up semantically related and unrelated word pairs. In this paper, we investigate strategies to recover the above distinction by recognizing semantically related lemma pairs, a process we call semantic validation. We make two main contributions: First, we perform a detailed data analysis on the basis of a large German derivational lexicon. It reveals two promising sources of information (distributional semantics and structural information about derivational rules), but also systematic problems with these sources. Second, we develop a classification model for the task that reflects the noisy nature of the data. It achieves an improvement of 13.6% in precision and 5.8% in F1-score over a strong majority class baseline. Our experiments confirm that both information sources contribute to semantic validation, and that they are complementary enough that the best results are obtained from a combined model.
Original language
English
Scientific fields
Computing
AFFILIATIONS
Institutions:
Fakultet elektrotehnike i računarstva, Zagreb
Profiles:
Jan Šnajder
(author)