Sciweavers

» Data weaving: scaling up the state-of-the-art in data cluste...
DBISP2P
2008
Springer
14 years 11 months ago
Exploiting Distribution Skew for Scalable P2P Text Clustering
K-Means clustering is widely used in information retrieval and data mining. Distributed K-Means variants have already been proposed, but none of the past algorithms scales to large...
Odysseas Papapetrou, Wolf Siberski, Fabian Leitrit...
ICDM
2005
IEEE
15 years 3 months ago
CLUMP: A Scalable and Robust Framework for Structure Discovery
We introduce a robust and efficient framework called CLUMP (CLustering Using Multiple Prototypes) for unsupervised discovery of structure in data. CLUMP relies on finding multip...
Kunal Punera, Joydeep Ghosh
OSDI
2008
ACM
14 years 12 months ago
DryadLINQ: A System for General-Purpose Distributed Data-Parallel Computing Using a High-Level Language
DryadLINQ is a system and a set of language extensions that enable a new programming model for large scale distributed computing. It generalizes previous execution environments su...
Yuan Yu, Michael Isard, Dennis Fetterly, Mihai Bud...
DSRT
2006
IEEE
15 years 3 months ago
Grid-enabling FIRST: Speeding Up Simulation Applications Using WinGrid
The vision of grid computing is to make computational power, storage capacity, data and applications available to users as readily as electricity and other utilities. Grid infrast...
Navonil Mustafee, Anders Alstad, Bjorn Larsen, Sim...
ICDE
2012
IEEE
13 years 2 days ago
Temporal Analytics on Big Data for Web Advertising
“Big Data” in map-reduce (M-R) clusters is often fundamentally temporal in nature, as are many analytics tasks over such data. For instance, display advertising uses Behavio...
Badrish Chandramouli, Jonathan Goldstein, Songyun ...