PVLDB
2010
An Architecture for Parallel Topic Models
This paper describes a high performance sampling architecture for inference of latent topic models on a cluster of workstations. Our system is faster than previous work by over an...
Alexander J. Smola, Shravan Narayanamurthy
PVLDB
2010
Retrieving Top-k Prestige-Based Relevant Spatial Web Objects
The location-aware keyword query returns ranked objects that are near a query location and that have textual descriptions that match query keywords. This query occurs inherently i...
Xin Cao, Gao Cong, Christian S. Jensen
PVLDB
2010
Distributed Caching Platforms
With the advances in processing, memory, and connectivity technologies, applications are becoming increasingly distributed, data-centric, and web based. These applications demand ...
Anil Nori
PVLDB
2010
Proximity Rank Join
We introduce the proximity rank join problem, where we are given a set of relations whose tuples are equipped with a score and a real-valued feature vector. Given a target feature...
Davide Martinenghi, Marco Tagliasacchi
PVLDB
2010
Secure Personal Data Servers: a Vision Paper
An increasing amount of personal data is automatically gathered and stored on servers by administrations, hospitals, insurance companies, etc. Citizens themselves often count on in...
Tristan Allard, Nicolas Anciaux, Luc Bouganim, Yan...
PVLDB
2010
FlashStore: High Throughput Persistent Key-Value Store
We present FlashStore, a high throughput persistent key-value store that uses flash memory as a non-volatile cache between RAM and hard disk. FlashStore is designed to store the ...
Biplob Debnath, Sudipta Sengupta, Jin Li
PVLDB
2010
Cheetah: A High Performance, Custom Data Warehouse on Top of MapReduce
Large-scale data analysis has become increasingly important for many enterprises. Recently, a new distributed computing paradigm, called MapReduce, and its open source implementat...
Songting Chen
PVLDB
2010
Ranking Continuous Probabilistic Datasets
Ranking is a fundamental operation in data analysis and decision support, and plays an even more crucial role if the dataset being explored exhibits uncertainty. This has led to m...
Jian Li, Amol Deshpande
PVLDB
2010
Peer coordination through distributed triggers
This is a demonstration of data coordination in a peer data management system through the employment of distributed triggers. The latter express in a declarative manner individual...
Verena Kantere, Maher Manoubi, Iluju Kiringa, Timo...
PVLDB
2010
Evaluating Entity Resolution Results
Entity Resolution (ER) is the process of identifying groups of records that refer to the same real-world entity. Various measures (e.g., pairwise F1, cluster F1) have been used fo...
David Menestrina, Steven Whang, Hector Garcia-Moli...