Sciweavers

17 search results - page 3 / 4
Load balancing for term-distributed parallel retrieval
ICDT
2005
ACM
13 years 10 months ago
Optimal Distributed Declustering Using Replication
A common technique for improving performance for database query retrieval is to decluster the database among multiple disks so that retrievals can be parallelized. In this paper we...
Keith B. Frikken
ICDCS
2007
IEEE
13 years 11 months ago
Defragmenting DHT-based Distributed File Systems
Existing DHT-based file systems use consistent hashing to assign file blocks to random machines. As a result, a user task accessing an entire file or multiple files needs to r...
Jeffrey Pang, Phillip B. Gibbons, Michael Kaminsky...
EDBT
2009
ACM
13 years 12 months ago
Distributed similarity search in high dimensions using locality sensitive hashing
In this paper we consider distributed K-Nearest Neighbor (KNN) search and range query processing in high dimensional data. Our approach is based on Locality Sensitive Hashing (LSH...
Parisa Haghani, Sebastian Michel, Karl Aberer
APPT
2009
Springer
13 years 11 months ago
Evaluating SPLASH-2 Applications Using MapReduce
MapReduce has been prevalent for running data-parallel applications. By hiding other non-functionality parts such as parallelism, fault tolerance and load balance from programmers,...
Shengkai Zhu, Zhiwei Xiao, Haibo Chen, Rong Chen, ...
PPOPP
2003
ACM
13 years 10 months ago
Optimizing data aggregation for cluster-based internet services
Large-scale cluster-based Internet services often host partitioned datasets to provide incremental scalability. The aggregation of results produced from multiple partitions is a f...
Lingkun Chu, Hong Tang, Tao Yang, Kai Shen