Bibliographic record number: 460728

Compressing Gazetteers Revisited


Budišćak, Ivan; Piskorski, Jakub; Ristov, Strahil
Compressing Gazetteers Revisited // Pre-proceedings of the Eighth International Workshop on Finite-State Methods and Natural Language Processing 2009 workshop / Watson, Bruce ; Kourie, Derrick ; Cleophas, Loek ; Rautenbach, Pierre (eds.).
Pretoria: University of Pretoria, 2009. (lecture, international peer review, full paper (in extenso), scientific)


CROSBI ID: 460728

Title
Compressing Gazetteers Revisited

Authors
Budišćak, Ivan ; Piskorski, Jakub ; Ristov, Strahil

Type, subtype and category of work
Papers in conference proceedings, full paper (in extenso), scientific

Source
Pre-proceedings of the Eighth International Workshop on Finite-State Methods and Natural Language Processing 2009 workshop / Watson, Bruce ; Kourie, Derrick ; Cleophas, Loek ; Rautenbach, Pierre - Pretoria : University of Pretoria, 2009

ISBN
978-1-86854-743-2

Conference
Eighth International Workshop on Finite-State Methods and Natural Language Processing

Place and date
Pretoria, South Africa, 21-24 July 2009

Type of participation
Lecture

Type of review
International peer review

Keywords
Recursive Finite State Automata; Automata Compression; Gazetteer Compression

Abstract
Finite-state automata are the state-of-the-art representation of gazetteers in NLP. This paper compares different methods for gazetteer compression based on two independently published algorithms for automata substructure recognition. The more recent algorithm, which we denote REC-FSA (Recursive Finite State Automaton), was invented specifically for gazetteer compression and was reported as the most space-efficient approach at the time of publication. In this paper we apply the older method, denoted here REC-FSA-2, and obtain circa 30% improvement in the compression rate compared to the more recent algorithm. However, the latter algorithm is much faster. We employ a previously published modification of REC-FSA-2, which we denote REC-FSA-2-DICT, to achieve a viable compromise between compression efficiency and time complexity. The results reported here represent the state of the art in gazetteer compression.
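
As an illustration of the starting point of the abstract (a gazetteer stored as a finite-state automaton whose repeated substructure is shared), the following minimal Python sketch builds a trie from a word list and then merges equivalent states. It is not REC-FSA, REC-FSA-2, or REC-FSA-2-DICT, whose details this record does not give; it only shows plain suffix merging in an acyclic automaton. All names (Node, build_trie, minimize, accepts, count_states) and the sample words are hypothetical.

```python
from typing import Dict, Iterable, Tuple


class Node:
    """One automaton state: outgoing labelled edges plus a final flag."""

    def __init__(self) -> None:
        self.edges: Dict[str, "Node"] = {}
        self.final = False


def build_trie(words: Iterable[str]) -> Node:
    """Insert every word into a plain trie (no sharing of suffixes yet)."""
    root = Node()
    for word in words:
        node = root
        for ch in word:
            node = node.edges.setdefault(ch, Node())
        node.final = True
    return root


def minimize(root: Node) -> Node:
    """Merge states that accept the same set of suffixes (bottom-up)."""
    registry: Dict[Tuple, Node] = {}

    def canon(node: Node) -> Node:
        # Canonicalize children first, so equal subtrees share one object.
        for ch in list(node.edges):
            node.edges[ch] = canon(node.edges[ch])
        # Children are canonical and kept alive by the registry,
        # so their id() values identify them uniquely.
        key = (node.final,
               tuple(sorted((ch, id(t)) for ch, t in node.edges.items())))
        return registry.setdefault(key, node)

    return canon(root)


def count_states(root: Node) -> int:
    """Count distinct states reachable from the root."""
    seen = set()
    stack = [root]
    while stack:
        node = stack.pop()
        if id(node) in seen:
            continue
        seen.add(id(node))
        stack.extend(node.edges.values())
    return len(seen)


def accepts(root: Node, word: str) -> bool:
    """Return True if the automaton contains the word."""
    node = root
    for ch in word:
        node = node.edges.get(ch)
        if node is None:
            return False
    return node.final


if __name__ == "__main__":
    sample = ["tap", "taps", "top", "tops"]  # hypothetical gazetteer entries
    automaton = build_trie(sample)
    print("trie states:", count_states(automaton))
    automaton = minimize(automaton)
    print("merged states:", count_states(automaton))
```

For the four sample words the plain trie has 8 states and the merged automaton has 5, while both accept exactly the same entries. The methods compared in the paper go further than this kind of state merging by recognizing repeated substructures inside the automaton.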

Original language
English

Scientific fields
Mathematics, Computing, Information and communication sciences



LINKS TO THIS WORK


Projects:
098-0982560-2566 - Measurement and characterization of real-world data (Medved-Rogina, Branka, MZOS) (CroRIS)

Institutions:
Faculty of Electrical Engineering and Computing, Zagreb,
Ruđer Bošković Institute, Zagreb

Profiles:

Ivan Budišćak (author)

Strahil Ristov (author)


Cite this publication:

Budišćak, Ivan; Piskorski, Jakub; Ristov, Strahil
Compressing Gazetteers Revisited // Pre-proceedings of the Eighth International Workshop on Finite-State Methods and Natural Language Processing 2009 workshop / Watson, Bruce ; Kourie, Derrick ; Cleophas, Loek ; Rautenbach, Pierre (eds.).
Pretoria: University of Pretoria, 2009. (lecture, international peer review, full paper (in extenso), scientific)
Budišćak, I., Piskorski, J. & Ristov, S. (2009) Compressing Gazetteers Revisited. In: Watson, B., Kourie, D., Cleophas, L. & Rautenbach, P. (eds.) Pre-proceedings of the Eighth International Workshop on Finite-State Methods and Natural Language Processing 2009 workshop.
@article{article, author = {Budi\v{s}\'{c}ak, Ivan and Piskorski, Jakub and Ristov, Strahil}, year = {2009}, keywords = {Recursive Finite State Automata, Automata Compression, Gazetteer Compression}, isbn = {978-1-86854-743-2}, title = {Compressing Gazetteers Revisited}, keyword = {Recursive Finite State Automata, Automata Compression, Gazetteer Compression}, publisher = {University of Pretoria}, publisherplace = {Pretoria, Ju\v{z}noafri\v{c}ka Republika} }