Pregled bibliografske jedinice broj: 1014670
Enumerated Automata Implementation of String Dictionaries
Enumerated Automata Implementation of String Dictionaries // Implementation and Application of Automata / Michal Hospodár, Galina Jirásková (ur.).
Košice: Springer, 2019. str. 33-44 doi:10.1007/978-3-030-23679-3_3 (predavanje, međunarodna recenzija, cjeloviti rad (in extenso), znanstveni)
CROSBI ID: 1014670 Za ispravke kontaktirajte CROSBI podršku putem web obrasca
Naslov
Enumerated Automata Implementation of String Dictionaries
Autori
Bakarić, Robert ; Korenčić, Damir ; Ristov, Strahil
Vrsta, podvrsta i kategorija rada
Radovi u zbornicima skupova, cjeloviti rad (in extenso), znanstveni
Izvornik
Implementation and Application of Automata
/ Michal Hospodár, Galina Jirásková - Košice : Springer, 2019, 33-44
ISBN
978-3-030-23678-6
Skup
24th International Conference on Implementation and Application of Automata (CIAA 2019)
Mjesto i datum
Košice, Slovačka, 22.07.2019. - 25.07.2019
Vrsta sudjelovanja
Predavanje
Vrsta recenzije
Međunarodna recenzija
Ključne riječi
String dictionary ; Enumerated DFA ; Recursive automaton ; LZ trie ; DNA indexing
Sažetak
Over the last decade a considerable effort was invested into research on implementing string dictionaries. String dictionary is a data structure that bijectively maps a set of strings to a set of integers, and that is used in various index-based applications. A recent paper [18] can be regarded as a reference work on the subject of string dictionary implementations. Although very comprehensive, [18] does not cover the implementation of a string dictionary with the enumerated deterministic finite automaton, a data structure naturally suited for this purpose. We compare the results for the state-of-the-art compressed enumerated automaton with those presented in [18] on the same collection of data sets, and on the collection of natural language word lists. We show that our string dictionary implementation is a competitive variant for different types of data, especially when dealing with large sets of strings, and when strings have more similarity between them. In particular, our method presents as a prominent solution for storing DNA motifs and words of inflected natural languages. We provide the code used for the experiments.
Izvorni jezik
Engleski
Znanstvena područja
Računarstvo
POVEZANOST RADA
Projekti:
KK.01.1.1.01.0009 - Napredne metode i tehnologije u znanosti o podatcima i kooperativnim sustavima (EK )
HRZZ-IP-2018-01-7317 - Napredni deterministički i hibridni algoritmi na nizovima, sljedovima i stablima s primjenama u tehničkim znanostima i znanostima o životu (ALGSEQ18) (Ristov, Strahil, HRZZ - 2018-01) ( CroRIS)
Ustanove:
Institut "Ruđer Bošković", Zagreb
Citiraj ovu publikaciju:
Časopis indeksira:
- Scopus