Bibliographic record number: 302117
Query-Driven Indexing for Peer-to-Peer Text Retrieval
Query-Driven Indexing for Peer-to-Peer Text Retrieval // Proceedings of the 16th international conference on World Wide Web / Patel-Schneider, Peter ; Shenoy, Prashant (eds.).
New York (NY): The Association for Computing Machinery (ACM), 2007. pp. 1185-1186 (poster, international peer review, full paper (in extenso), scientific)
CROSBI ID: 302117
Title
Query-Driven Indexing for Peer-to-Peer Text Retrieval
Authors
Skobeltsyn, Gleb ; Luu, Toan ; Podnar Zarko, Ivana ; Rajman, Martin ; Aberer, Karl
Type, subtype and category of work
Papers in conference proceedings, full paper (in extenso), scientific
Source
Proceedings of the 16th international conference on World Wide Web
/ Patel-Schneider, Peter ; Shenoy, Prashant - New York (NY) : The Association for Computing Machinery (ACM), 2007, 1185-1186
ISBN
978-1-59593-654-7
Conference
16th International World Wide Web Conference
Place and date
Banff, Canada, 8 May 2007 - 12 May 2007
Type of participation
Poster
Type of review
International peer review
Keywords
P2P; DHT; IR; Text Retrieval; Query-Driven Indexing
Abstract
We describe a query-driven indexing framework for scalable text retrieval over structured P2P networks. To cope with the bandwidth consumption problem that has been identified as the major obstacle for full-text retrieval in P2P networks, we truncate posting lists associated with indexing features to a constant size, storing only top-k ranked document references. To compensate for the loss of information caused by the truncation, we extend the set of indexing features with carefully chosen term sets. Indexing term sets are selected based on the query statistics extracted from query logs; thus, we index only those combinations that are (a) frequently present in user queries and (b) non-redundant with respect to the rest of the index. The distributed index is compact and efficient, as it constantly evolves, adapting to the current query popularity distribution. Moreover, it is possible to control the tradeoff between the storage/bandwidth requirements and the quality of query answering by tuning the indexing parameters. Our theoretical analysis and experimental results indicate that we can indeed achieve scalable P2P text retrieval for very large document collections and deliver good retrieval performance.
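The selection rule in the abstract can be illustrated with a small, single-machine sketch. It is not the authors' implementation: the index is a plain dictionary rather than a DHT, relevance is a toy term-count score, term sets are limited to pairs, and the names `TOP_K`, `MIN_QUERY_FREQ`, `posting_list` and `build_index` are hypothetical. The redundancy test is also an assumption: a pair is treated as redundant when one of its terms already has an untruncated posting list, because that list contains every document the pair could match. The continuous adaptation of the index to changing query popularity is not modelled.

```python
from collections import Counter
from itertools import combinations

TOP_K = 2           # assumed constant posting-list size
MIN_QUERY_FREQ = 2  # assumed popularity threshold for term sets


def score(doc_terms, key):
    # Toy relevance (assumption): total occurrences of the key's terms.
    return sum(doc_terms[t] for t in key)


def posting_list(docs, key):
    """Return (top-k document ids for key, whether the list was truncated)."""
    matches = [d for d, terms in docs.items() if all(t in terms for t in key)]
    ranked = sorted(matches, key=lambda d: (-score(docs[d], key), d))
    return ranked[:TOP_K], len(ranked) > TOP_K


def build_index(docs, query_log):
    index = {}
    # Every single term is indexed with a truncated top-k posting list.
    for term in {t for terms in docs.values() for t in terms}:
        index[frozenset([term])] = posting_list(docs, [term])

    # Count how often each term pair co-occurs in logged queries.
    pair_counts = Counter()
    for query in query_log:
        for pair in combinations(sorted(set(query.split())), 2):
            pair_counts[frozenset(pair)] += 1

    for key, count in pair_counts.items():
        popular = count >= MIN_QUERY_FREQ
        # Non-redundant (assumption): every single-term list was truncated.
        non_redundant = all(
            index.get(frozenset([t]), ([], False))[1] for t in key
        )
        if popular and non_redundant:
            index[key] = posting_list(docs, sorted(key))
    return index


docs = {
    "d1": Counter("p2p text retrieval".split()),
    "d2": Counter("p2p p2p network".split()),
    "d3": Counter("text retrieval index".split()),
    "d4": Counter("p2p text text index".split()),
}
query_log = ["p2p text", "text p2p search", "p2p retrieval",
             "p2p retrieval", "network index"]

index = build_index(docs, query_log)
for key, (doc_ids, truncated) in sorted(index.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(key), doc_ids, "truncated" if truncated else "complete")
```

In this toy data the pair "p2p text" is both popular and non-redundant, so it gets its own posting list, while "p2p retrieval" is popular but skipped because the list for "retrieval" is already complete. Raising `TOP_K` or `MIN_QUERY_FREQ` shrinks the set of indexed pairs, which mirrors the storage/quality tradeoff the abstract mentions.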
Original language
English
Scientific fields
Electrical Engineering, Computing
RELATED ENTITIES
Projects:
036-0362027-1639 - Content Delivery and Mobility of Users and Services in New Generation Networks (Matijašević, Maja, MZO) (CroRIS)
Institutions:
Faculty of Electrical Engineering and Computing, Zagreb
Profiles:
Ivana Podnar Žarko
(author)