Bibliographic record number: 676340
Impact of Hash Value Length on Document Comparison System’s Performance
Impact of Hash Value Length on Document Comparison System’s Performance // Proceedings of 7th European Computing Conference (ECC '13) - Recent Advances in Information Science (Recent Advances in Computer Engineering Series 13) / Damir Boras, Nives Mikelić Preradović, Francisco Moya, Mohamed Roushdy, Abdel-Badeeh M. Salem (eds.).
Athens: WSEAS Press, 2013. pp. 89-94 (lecture, international peer review, full paper (in extenso), scientific)
CROSBI ID: 676340
Title
Impact of Hash Value Length on Document Comparison System’s Performance
Authors
Juričić, Vedran ; Soleša, Dragan ; Dunđer, Ivan
Type, subtype and category of work
Papers in conference proceedings, full paper (in extenso), scientific
Source
Proceedings of 7th European Computing Conference (ECC '13) - Recent Advances in Information Science (Recent Advances in Computer Engineering Series 13)
/ Damir Boras, Nives Mikelić Preradović, Francisco Moya, Mohamed Roushdy, Abdel-Badeeh M. Salem - Athens : WSEAS Press, 2013, 89-94
ISBN
978-960-474-304-9
Conference
7th European Computing Conference (ECC '13) - Recent Advances in Information Science (Recent Advances in Computer Engineering Series 13) / Language and Text Processing
Place and date
Dubrovnik, Croatia, 25.06.2013 - 27.06.2013
Type of participation
Lecture
Type of review
International peer review
Keywords
document comparison system; hash-based system; hash value length; performance analysis
Abstract
This paper analyzes the changes that occur in a document comparison system when the length of the hash values of documents’ n-grams is changed, that is, when the number of bits used to store the hash values is changed. A hash-based document comparison system was developed and used to perform different analyses. The authors analyzed the dependencies between hash value length and disk space requirements, comparison process time and F-measure, in order to find the optimum length: a balance between the best performance and the lowest space and time requirements. Because of the regularity of those dependencies, the authors tried to approximate the values obtained by testing with exponential and trigonometric functions.
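The idea described in the abstract, storing n-gram hashes in a limited number of bits, can be illustrated with a minimal sketch. It is not the authors' system: the record does not say which hash function, n-gram type or similarity measure the paper uses, so word trigrams, a truncated MD5 digest and Jaccard similarity are assumptions made only for illustration, and all function names are hypothetical.

```python
import hashlib


def ngrams(text, n=3):
    """Return the word n-grams of a text (word-level n-grams are an assumption)."""
    words = text.lower().split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def truncated_hash(gram, bits):
    """Hash an n-gram and keep only the top `bits` bits of the 128-bit MD5 digest."""
    if not 1 <= bits <= 128:
        raise ValueError("bits must be between 1 and 128")
    digest = hashlib.md5(gram.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") >> (128 - bits)


def fingerprint(text, bits, n=3):
    """Set of truncated n-gram hashes that represents a document."""
    return {truncated_hash(g, bits) for g in ngrams(text, n)}


def similarity(a, b, bits, n=3):
    """Jaccard similarity of two fingerprints (an assumed comparison measure)."""
    fa, fb = fingerprint(a, bits, n), fingerprint(b, bits, n)
    if not fa or not fb:
        return 0.0
    return len(fa & fb) / len(fa | fb)


doc_a = "the quick brown fox jumps over the lazy dog near the river bank"
doc_b = "a quick brown fox jumps over the lazy dog near the old bridge"

# Fewer bits need less storage per hash, but unrelated n-grams collide more often.
for bits in (4, 8, 16, 32, 64):
    print(bits, round(similarity(doc_a, doc_b, bits), 3))
```

Shorter hash values reduce the space needed per n-gram, while collisions between different n-grams become more likely and can inflate the similarity score; this is the kind of trade-off between space, time and F-measure that the abstract refers to.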
Original language
English
Scientific fields
Computer science, Information and communication sciences
Note
Indexed in: ISI (Thomson Reuters), ELSEVIER, SCOPUS, ACM - Association for Computing Machinery, Zentralblatt MATH, British Library, EBSCO, SWETS, EMBASE, CAS - American Chemical Society, CiteSeerx, Cabell Publishing, Electronic Journals Library, SAO/NASA Astrophysics Data System, EI Compendex, Engineering Village, CSA Cambridge Scientific Abstracts, DoPP, GEOBASE, Biobase, American Mathematical Society (AMS), Inspec - The IET, Ulrich's International Periodicals Directory
RELATED RECORDS
Projects:
130-1300646-0909 - Informacijska tehnologija u prevođenju hrvatskoga i e-učenju jezika (Seljan, Sanja, MZOS) (CroRIS)
Institutions:
Filozofski fakultet, Zagreb