Sciweavers

» MRShare: Sharing Across Multiple Queries in MapReduce
PODS
2010
ACM
Optimal sampling from distributed streams
A fundamental problem in data management is to draw a sample of a large data set, for approximate query answering, selectivity estimation, and query planning. With large, streamin...
Graham Cormode, S. Muthukrishnan, Ke Yi, Qin Zhang
VLDB
2007
ACM
SciPort: An Adaptable Scientific Data Integration Platform for Collaborative Scientific Research
Scientific data are posing new challenges to data management due to the large volume, complexity and heterogeneity of the data. Meanwhile, scientific collaboration becomes increas...
Fusheng Wang, Pierre-Emmanuel Bourgue, Georg Hacke...
VLDB
2004
ACM
Structures, Semantics and Statistics
At a fundamental level, the key challenge in data integration is to reconcile the semantics of disparate data sets, each expressed with a different database structure. I argue th...
Alon Y. Halevy
ICDE
2007
IEEE
Outlier Detection for Fine-grained Load Balancing in Database Clusters
Recent industry trends towards reducing the costs of ownership in large data centers emphasize the need for database system techniques for both automatic performance tuning and ef...
Jin Chen, Gokul Soundararajan, Madalin Mihailescu,...
BMCBI
2005
DynGO: a tool for visualizing and mining of Gene Ontology and its associations
Background: A large volume of data and information about genes and gene products has been stored in various molecular biology databases. A major challenge for knowledge discovery ...
Hongfang Liu, Zhang-Zhi Hu, Cathy H. Wu