Bibliographic record number: 385286
TermeX: A Tool for Collocation Extraction
TermeX: A Tool for Collocation Extraction // Lecture Notes in Computer Science (Computational Linguistics and Intelligent Text Processing), 5449 (2009), 149-157 doi:10.1007/978-3-642-00382-0_12 (international peer review, article, scientific)
CROSBI ID: 385286
Title
TermeX: A Tool for Collocation Extraction
Authors
Delač, Davor ; Krleža, Zoran ; Dalbelo Bašić, Bojana ; Šnajder, Jan ; Šarić, Frane
Source
Lecture Notes in Computer Science (Computational Linguistics and Intelligent Text Processing) (0302-9743) 5449 (2009); 149-157
Type, subtype and category of work
Journal papers, article, scientific
Keywords
TermeX; tool; Collocation Extraction
Abstract
Collocations – word combinations occurring together more often than by chance – have a wide range of NLP applications. Many approaches for automating collocation extraction based on lexical association measures have been proposed in the literature. This paper presents TermeX – a tool for efficient extraction of collocations based on a variety of association measures. TermeX implements POS filtering and lemmatization, and is capable of extracting collocations up to length four. We address trade-offs between high memory consumption and processing speed and propose an efficient implementation. Our implementation allows for processing time linear to corpus size and memory consumption linear to the number of word types.
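The idea of scoring word combinations with a lexical association measure can be illustrated with a minimal Python sketch. It is not TermeX's implementation: the abstract does not name the measures used, so pointwise mutual information (PMI) is chosen here only as one common example, and the function name `extract_bigram_collocations`, the `min_count` threshold, and the toy corpus are all hypothetical. The sketch handles only two-word collocations, skips POS filtering and lemmatization, and counts in a single pass over the tokens; its bigram table grows with the number of distinct word pairs, so it does not reproduce the memory bound stated in the abstract.

```python
import math
from collections import Counter


def extract_bigram_collocations(tokens, min_count=2):
    """Rank adjacent word pairs by pointwise mutual information (PMI).

    Hypothetical helper for illustration only; PMI is an assumed
    association measure, not necessarily one used by TermeX.
    """
    # One pass over the corpus for each table: time linear in corpus size.
    unigram_counts = Counter(tokens)
    bigram_counts = Counter(zip(tokens, tokens[1:]))

    n_tokens = len(tokens)
    n_bigrams = max(n_tokens - 1, 1)

    scored = []
    for (w1, w2), count in bigram_counts.items():
        # Rare pairs give unreliable PMI estimates, so drop them.
        if count < min_count:
            continue
        p_joint = count / n_bigrams
        p_w1 = unigram_counts[w1] / n_tokens
        p_w2 = unigram_counts[w2] / n_tokens
        # PMI > 0 means the pair co-occurs more often than chance predicts.
        pmi = math.log2(p_joint / (p_w1 * p_w2))
        scored.append(((w1, w2), pmi))

    # Highest-scoring candidates first.
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored


# Toy corpus (assumed, already tokenized and lowercased).
tokens = "new york is big . new york is old . the dog is big".split()
ranked = extract_bigram_collocations(tokens, min_count=2)
for pair, score in ranked:
    print(pair, round(score, 3))
```

On this toy corpus the pair ("new", "york") ranks first, because both words occur only together, while pairs containing the more frequent word "is" score lower.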
Original language
English
Scientific fields
Computer science
AFFILIATIONS OF THE WORK
Projects:
036-1300646-1986 - Otkrivanje znanja u tekstnim podacima (Knowledge Discovery in Textual Data) (Dalbelo-Bašić, Bojana, MZO)
Institutions:
Fakultet elektrotehnike i računarstva (Faculty of Electrical Engineering and Computing), Zagreb
Journal indexed in:
- Scopus
Included in other bibliographic databases:
- INSPEC