Unsupervised Acquisition of Comprehensive Multiword Lexicons using Competition in an n-gram Lattice (CROSBI ID 250828)
Journal contribution | original scientific paper | international peer review
Responsibility details
Brooke, Julian ; Šnajder, Jan ; Baldwin, Timothy
English
We present a new model for acquiring comprehensive multiword lexicons from large corpora based on competition among n-gram candidates. In contrast to the standard approach of simple ranking by association measure, in our model n-grams are arranged in a lattice structure based on subsumption and overlap relationships, with nodes inhibiting other nodes in their vicinity when they are selected as a lexical item. We show how the configuration of such a lattice can be optimized tractably, and demonstrate using annotations of sampled n-grams that our method consistently outperforms alternatives by at least 0.05 F-score across several corpora and languages.
Multiword expressions ; natural language processing ; lexical semantics
Publication details
5 (1)
2017
455-470
published
2307-387X