Sciweavers

5640 search results - page 750 / 1128
» Parallelizing the Data Cube
ICS
2009
Tsinghua U.
15 years 1 months ago
R-ADMAD: high reliability provision for large-scale de-duplication archival storage systems
Data de-duplication has become a commodity component in data-intensive systems and it is required that these systems provide high reliability comparable to others. Unfortunately, b...
Chuanyi Liu, Yu Gu, Linchun Sun, Bin Yan, Dongshen...
CLOUDCOM
2010
Springer
15 years 6 days ago
LEEN: Locality/Fairness-Aware Key Partitioning for MapReduce in the Cloud
This paper investigates the problem of Partitioning Skew in MapReduce-based systems. Our studies with Hadoop, a widely used MapReduce implementation, demonstrate that the presence ...
Shadi Ibrahim, Hai Jin, Lu Lu, Song Wu, Bingsheng ...
BIOINFORMATICS
2012
13 years 5 months ago
Epigenetic priors for identifying active transcription factor binding sites
Motivation: Accurate knowledge of the genome-wide binding of transcription factors in a particular cell type or under a particular condition is necessary for understanding transcri...
Gabriel Cuellar-Partida, Fabian A. Buske, Robert C...
KDD
2004
ACM
16 years 3 months ago
Support envelopes: a technique for exploring the structure of association patterns
This paper introduces support envelopes--a new tool for analyzing association patterns--and illustrates some of their properties, applications, and possible extensions. Specifical...
Michael Steinbach, Pang-Ning Tan, Vipin Kumar
KDD
2002
ACM
16 years 3 months ago
Visualized Classification of Multiple Sample Types
The goal of knowledge discovery and data mining is to extract useful knowledge from the given data. Visualization enables us to find structures, features, patterns, and re...
Li Zhang, Aidong Zhang, Murali Ramanathan