
Bibliographic record number: 385286

TermeX: A Tool for Collocation Extraction


Delač, Davor; Krleža, Zoran; Dalbelo Bašić, Bojana; Šnajder, Jan; Šarić, Frane
TermeX: A Tool for Collocation Extraction // Lecture Notes in Computer Science (Computational Linguistics and Intelligent Text Processing), 5449 (2009), 149-157 doi:10.1007/978-3-642-00382-0_12 (international peer review, article, scientific)


Title
TermeX: A Tool for Collocation Extraction

Authors
Delač, Davor ; Krleža, Zoran ; Dalbelo Bašić, Bojana ; Šnajder, Jan ; Šarić, Frane

Source
Lecture Notes in Computer Science (Computational Linguistics and Intelligent Text Processing) (0302-9743) 5449 (2009); 149-157

Type, subtype and category of work
Journal papers, article, scientific

Keywords
TermeX; tool; Collocation Extraction

Abstract
Collocations – word combinations occurring together more often than by chance – have a wide range of NLP applications. Many approaches for automating collocation extraction based on lexical association measures have been proposed in the literature. This paper presents TermeX – a tool for efficient extraction of collocations based on a variety of association measures. TermeX implements POS filtering and lemmatization, and is capable of extracting collocations up to length four. We address trade-offs between high memory consumption and processing speed and propose an efficient implementation. Our implementation allows for processing time linear in corpus size and memory consumption linear in the number of word types.
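
To illustrate what extraction based on a lexical association measure can look like, the following is a minimal sketch that ranks adjacent word pairs by pointwise mutual information (PMI). It is not TermeX's implementation: the function name `extract_bigram_collocations`, the `min_count` threshold, and the choice of PMI are assumptions made for illustration, and the sketch omits POS filtering, lemmatization, and collocations longer than two words.

```python
import math
from collections import Counter


def extract_bigram_collocations(tokens, min_count=2):
    """Rank adjacent word pairs by pointwise mutual information (PMI).

    Hypothetical helper for illustration only; not the TermeX algorithm.
    """
    unigram_counts = Counter(tokens)
    bigram_counts = Counter(zip(tokens, tokens[1:]))
    n_unigrams = len(tokens)
    n_bigrams = max(len(tokens) - 1, 1)

    scored = []
    for (w1, w2), count in bigram_counts.items():
        # Skip rare pairs: PMI is unreliable for low-frequency events.
        if count < min_count:
            continue
        p_xy = count / n_bigrams
        p_x = unigram_counts[w1] / n_unigrams
        p_y = unigram_counts[w2] / n_unigrams
        # PMI compares the observed pair probability with the probability
        # expected if the two words occurred independently.
        scored.append(((w1, w2), math.log2(p_xy / (p_x * p_y))))

    # Highest-scoring candidates first.
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored


if __name__ == "__main__":
    text = "new york is big and new york is old and the city is big"
    for pair, score in extract_bigram_collocations(text.split()):
        print(pair, round(score, 3))
```

The counting step makes a single pass over the tokens, and the frequency threshold is one common way to keep PMI from over-ranking rare pairs; other association measures could be substituted in the scoring line.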

Original language
English

Scientific fields
Computer science



AFFILIATIONS OF THE WORK


Project / topic
036-1300646-1986 - Otkrivanje znanja u tekstnim podacima (Knowledge Discovery in Textual Data) (Bojana Dalbelo-Bašić)

Institutions
Faculty of Electrical Engineering and Computing, Zagreb

Journal indexed in:


  • Scopus


Included in other bibliographic databases:


  • INSPEC

