Sciweavers

15317 search results - page 290 / 3064
» Globally Distributed Data
CLOUD
2010
ACM
15 years 11 months ago
Comet: batched stream processing for data intensive distributed computing
Batched stream processing is a new distributed data processing paradigm that models recurring batch computations on incrementally bulk-appended data streams. The model is inspired...
Bingsheng He, Mao Yang, Zhenyu Guo, Rishan Chen, B...
CORR
2011
Springer
14 years 10 months ago
Learning When Training Data are Costly: The Effect of Class Distribution on Tree Induction
For large, real-world inductive learning problems, the number of training examples often must be limited due to the costs associated with procuring, preparing, and storing the tra...
Foster J. Provost, Gary M. Weiss
WWW
2010
ACM
16 years 1 month ago
Distributed nonnegative matrix factorization for web-scale dyadic data analysis on mapreduce
The Web abounds with dyadic data that keeps increasing by every single second. Previous work has repeatedly shown the usefulness of extracting the interaction structure inside dya...
Chao Liu, Hung-chih Yang, Jinliang Fan, Li-Wei He,...
TOCS
2008
15 years 6 months ago
Bigtable: A Distributed Storage System for Structured Data
Bigtable is a distributed storage system for managing structured data that is designed to scale to a very large size: petabytes of data across thousands of commodity servers. Many...
Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C...
CORR
2010
Springer
15 years 4 months ago
Progressive Decoding for Data Availability and Reliability in Distributed Networked Storage
To harness the ever growing capacity and decreasing cost of storage, providing an abstraction of dependable storage in the presence of crash-stop and Byzantine failures is compu...
Yunghsiang Han, Soji Omiwade, Rong Zheng