Sciweavers

SIGMOD
2012
ACM
Shark: fast data analysis using coarse-grained distributed memory
Shark is a research data analysis system built on a novel coarse-grained distributed shared-memory abstraction. Shark marries query processing with deep data analysis, providing a unifie...
Cliff Engle, Antonio Lupher, Reynold Xin, Matei Za...
SIGMOD
2012
ACM
A model-based approach to attributed graph clustering
Zhiqiang Xu, Yiping Ke, Yi Wang, Hong Cheng, James...
SIGMOD
2012
ACM
Locality-sensitive hashing scheme based on dynamic collision counting
Locality-Sensitive Hashing (LSH) and its variants are well-known methods for solving the c-approximate NN Search problem in high-dimensional space. Traditionally, several LSH funct...
Junhao Gan, Jianlin Feng, Qiong Fang, Wilfred Ng
SIGMOD
2012
ACM
Oracle in-database hadoop: when mapreduce meets RDBMS
Big data is the tar sands of the data world: vast reserves of raw gritty data whose valuable information content can only be extracted at great cost. MapReduce is a popular parall...
Xueyuan Su, Garret Swart
SIGMOD
2012
ACM
BloomUnit: declarative testing for distributed programs
We present BloomUnit, a testing framework for distributed programs written in the Bloom language. BloomUnit allows developers to write declarative test specifications that descri...
Peter Alvaro, Andrew Hutchinson, Neil Conway, Will...
SIGMOD
2012
ACM
Tiresias: a demonstration of how-to queries
In this demo, we will present Tiresias, the first how-to query engine. How-to queries represent fundamental data analysis questions of the form: “How should the input change in...
Alexandra Meliou, Yisong Song, Dan Suciu
SIGMOD
2012
ACM
Skew-aware automatic database partitioning in shared-nothing, parallel OLTP systems
The advent of affordable, shared-nothing computing systems portends a new class of parallel database management systems (DBMS) for on-line transaction processing (OLTP) applicatio...
Andrew Pavlo, Carlo Curino, Stanley B. Zdonik
SIGMOD
2012
ACM
Local structure and determinism in probabilistic databases
While extensive work has been done on evaluating queries over tuple-independent probabilistic databases, query evaluation over correlated data has received much less attention eve...
Theodoros Rekatsinas, Amol Deshpande, Lise Getoor
SIGMOD
2012
ACM
Sample-driven schema mapping
End-users increasingly find the need to perform light-weight, customized schema mapping. State-of-the-art tools provide powerful functions to generate schema mappings, but they u...
Li Qian, Michael J. Cafarella, H. V. Jagadish
SIGMOD
2012
ACM
SkewTune: mitigating skew in mapreduce applications
We present an automatic skew mitigation approach for user-defined MapReduce programs and present SkewTune, a system that implements this approach as a drop-in replacement for an e...
YongChul Kwon, Magdalena Balazinska, Bill Howe, Je...