Data source: CROSBI

Extending Lexical Association Measures for Collocation Extraction (CROSBI ID 151427)

Journal article | original scientific paper | international peer review

Petrović, Saša ; Šnajder, Jan ; Dalbelo Bašić, Bojana Extending Lexical Association Measures for Collocation Extraction // Computer speech and language, 24 (2010), 2; 383-394. doi: 10.1016/j.csl.2009.06.001

Responsibility details

Petrović, Saša ; Šnajder, Jan ; Dalbelo Bašić, Bojana

English

Extending Lexical Association Measures for Collocation Extraction

Collocations are linguistic phenomena that occur when two or more words appear together more often than by chance and whose meaning often cannot be inferred from the meanings of their parts. As collocations have found many applications in the fields of natural language processing, information retrieval, and text mining, extracting them from large corpora has been the focus of many studies over the past few years. In this paper, we introduce the notion of an extension pattern, a formalization of the idea of extending lexical association measures (AMs) defined for bigrams. An extension pattern provides a measure-independent way of extending AMs for extracting collocations of arbitrary length. We define different extension patterns and compare them on a task of extracting collocations from a newspaper corpus. We show that the stopword-sensitive extension patterns we propose outperform other extensions, which indicates that AMs could benefit from taking into account linguistic information about an n-gram's part-of-speech pattern.
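
The general idea of extending a bigram association measure to longer n-grams can be illustrated with a minimal sketch. This is not one of the extension patterns defined in the paper: the function names (ngram_counts, pmi, extend_by_splits), the choice of pointwise mutual information as the bigram AM, and the averaging over all split points are assumptions made purely for illustration. Probabilities are estimated by dividing every count by the token count, which is a simplification.

```python
import math
from collections import Counter


def ngram_counts(tokens, max_n):
    """Count every n-gram of length 1..max_n as a tuple of tokens."""
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


def pmi(left, right, counts, total):
    """Pointwise mutual information of two adjacent units (tuples).

    Assumes left + right occurs at least once in the corpus;
    otherwise the logarithm is undefined.
    """
    joint = counts[left + right] / total
    p_left = counts[left] / total
    p_right = counts[right] / total
    return math.log2(joint / (p_left * p_right))


def extend_by_splits(ngram, counts, total, am=pmi):
    """Hypothetical extension: treat each side of every split point as a
    single unit, apply the two-argument measure, and average the scores.
    Any function with the same signature as pmi can be passed as am."""
    scores = [am(ngram[:i], ngram[i:], counts, total)
              for i in range(1, len(ngram))]
    return sum(scores) / len(scores)


tokens = "new york city is big new york city".split()
counts = ngram_counts(tokens, 3)
score = extend_by_splits(("new", "york", "city"), counts, len(tokens))
```

Because the extension only calls the two-argument measure, the same function works unchanged with other bigram AMs, which mirrors the measure-independent property described in the abstract. For a bigram, the sketch reduces to the underlying measure itself.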

collocations; collocation extraction; lexical association measures

Publication details

24 (2)

2010

383-394

published

0885-2308

10.1016/j.csl.2009.06.001

Related field

Computing
