Sciweavers

KDD
2009
ACM
Data Mining
16 years 1 month ago
Pervasive parallelism in data mining: dataflow solution to co-clustering large and sparse Netflix data
All Netflix Prize algorithms proposed so far are prohibitively costly for large-scale production systems. In this paper, we describe an efficient dataflow implementation of a coll...
Srivatsava Daruru, Nena M. Marin, Matt Walker, Joy...
SIGMOD
2010
ACM
Database
14 years 11 months ago
How to ConQueR why-not questions
One useful feature that is missing from today’s database systems is an explain capability that enables users to seek clarifications on unexpected query results. There are two t...
Quoc Trung Tran, Chee-Yong Chan
WWW
2004
ACM
16 years 1 month ago
Web data integration using approximate string join
Web data integration is an important preprocessing step for web mining. It is highly likely that several records on the web whose textual representations differ may represent the ...
Yingping Huang, Gregory R. Madey
CORR
2010
Springer
Education
14 years 7 months ago
In Cloud, Can Scientific Communities Benefit from the Economies of Scale?
The basic idea behind cloud computing is that resource providers offer elastic resources to end users. In this paper, we intend to answer one key question to the success of cloud c...
Lei Wang, Jianfeng Zhan, Weisong Shi, Yi Liang
SIGMOD
2010
ACM
Database
15 years 5 months ago
Probabilistic string similarity joins
Edit distance based string similarity join is a fundamental operator in string databases. Increasingly, many applications in data cleaning, data integration, and scientific compu...
Jeffrey Jestes, Feifei Li, Zhepeng Yan, Ke Yi