Bibliographic record number: 76128

A method for compressing lexicons, DCC02, Data Compression Conference


Ristov, Strahil; Laporte, Eric
A method for compressing lexicons, DCC02, Data Compression Conference // DCC 2002 / Storer, James; Cohn, Martin (eds.).
Snowbird (UT), United States: IEEE, Computer Society, 2002. (poster, international peer review, abstract, scientific)


CROSBI ID: 76128

Title
A method for compressing lexicons, DCC02, Data Compression Conference

Authors
Ristov, Strahil ; Laporte, Eric

Type, subtype and category of work
Conference abstracts, abstract, scientific

Source
DCC 2002 / Storer, James; Cohn, Martin (eds.). IEEE, Computer Society, 2002

Conference
Data Compression Conference

Place and date
Snowbird (UT), United States, 2-4 April 2002

Type of participation
Poster

Type of review
International peer review

Keywords
natural language lexicon; spelling-to-phonetic conversion; compressed trie; index compression

Abstract
A natural language lexicon is a set of strings in which each string consists of a word and its associated linguistic data. Its computer representation is a structure that returns the appropriate linguistic data for a given input word; it should be small and fast. We propose a method for lexicon compression based on an existing efficient method for compressing tries. Straightforward trie compression becomes ineffective when strings are long, so the words and the associated data sets are compressed separately, additionally processed, and linked with an auxiliary index structure. The index file is compressed with canonical Huffman codes. For the example of a French phonetic lexicon of 660,000 entries and 18 Mbytes, the overall size of the searchable compressed string set is 7% of the original size.
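
The abstract names canonical Huffman codes as the compression step for the index file but gives no implementation details. The following is a minimal, generic sketch of how canonical Huffman codes can be assigned to a sequence of index values. It is not the authors' implementation: the function names `code_lengths` and `canonical_codes`, the sample index values, and the tie-breaking rule (shorter codes first, then ascending symbol value) are all assumptions made for illustration.

```python
import heapq
from collections import Counter
from itertools import count


def code_lengths(freqs):
    """Return {symbol: code length} derived from a standard Huffman merge."""
    if len(freqs) == 1:
        # A single symbol still needs one bit to be representable.
        return {next(iter(freqs)): 1}
    tie = count()  # unique tie-breaker so symbol lists are never compared
    heap = [(f, next(tie), [s]) for s, f in freqs.items()]
    heapq.heapify(heap)
    lengths = dict.fromkeys(freqs, 0)
    while len(heap) > 1:
        f1, _, syms1 = heapq.heappop(heap)
        f2, _, syms2 = heapq.heappop(heap)
        # Every symbol under the merged node moves one level deeper.
        for s in syms1 + syms2:
            lengths[s] += 1
        heapq.heappush(heap, (f1 + f2, next(tie), syms1 + syms2))
    return lengths


def canonical_codes(lengths):
    """Assign canonical codes: sort by (length, symbol) and count upward."""
    codes = {}
    code = 0
    prev_len = 0
    for sym, length in sorted(lengths.items(), key=lambda kv: (kv[1], kv[0])):
        code <<= length - prev_len  # append zeros when the length grows
        codes[sym] = format(code, f"0{length}b")
        code += 1
        prev_len = length
    return codes


# Hypothetical index values (illustrative only, not data from the paper).
index_values = [3, 3, 3, 3, 7, 7, 9, 12]
lengths = code_lengths(Counter(index_values))
codes = canonical_codes(lengths)
encoded = "".join(codes[v] for v in index_values)
```

Because canonical codes are fully determined by the code lengths, a decoder only needs the length of each symbol's code rather than the whole tree, which keeps the stored code table small.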

Original language
English

Scientific fields
Electrical engineering



LINKS


Projects:
0098024
00980502

Institutions:
Institut "Ruđer Bošković", Zagreb

Profiles:

Strahil Ristov (author)


Cite this publication:

Ristov, Strahil; Laporte, Eric
A method for compressing lexicons, DCC02, Data Compression Conference // DCC 2002 / Storer, James; Cohn, Martin (eds.).
Snowbird (UT), United States: IEEE, Computer Society, 2002. (poster, international peer review, abstract, scientific)
Ristov, S. & Laporte, E. (2002) A method for compressing lexicons, DCC02, Data Compression Conference. In: Storer, J. & Cohn, M. (eds.) DCC 2002.
@article{article, author = {Ristov, Strahil and Laporte, Eric}, year = {2002}, pages = {470}, keywords = {natural language lexicon, spelling-to-phonetic conversion, compressed trie, index compression}, title = {A method for compressing lexicons, DCC02, Data Compression Conference}, publisher = {IEEE, Computer Society}, publisherplace = {Snowbird (UT), United States} }