Sciweavers

ICDM
2009
IEEE
121 views, Data Mining
15 years 4 months ago
Finding Time Series Motifs in Disk-Resident Data
Time series motifs are sets of very similar subsequences of a long time series. They are of interest in their own right, and are also used as inputs in several higher-level data...
Abdullah Mueen, Eamonn J. Keogh, Nima Bigdely Sham...
STOC
2010
ACM
199 views, Algorithms
15 years 2 months ago
Zero-One Frequency Laws
Data streams emerged as a critical model for multiple applications that handle vast amounts of data. One of the most influential and celebrated papers in streaming is the “AMS...
Vladimir Braverman and Rafail Ostrovsky
KDD
2009
ACM
191 views, Data Mining
15 years 10 months ago
Efficient methods for topic model inference on streaming document collections
Topic models provide a powerful tool for analyzing large text collections by representing high dimensional data in a low dimensional subspace. Fitting a topic model given a set of...
Limin Yao, David M. Mimno, Andrew McCallum
PODS
2010
ACM
215 views, Database
15 years 2 months ago
An optimal algorithm for the distinct elements problem
We give the first optimal algorithm for estimating the number of distinct elements in a data stream, closing a long line of theoretical research on this problem begun by Flajolet...
Daniel M. Kane, Jelani Nelson, David P. Woodruff
ANOR
2010
130 views
14 years 7 months ago
Greedy scheduling with custom-made objectives
We present a methodology to automatically generate an online job scheduling method for a custom-made objective and real workloads. The scheduling problem comprises independent para...
Carsten Franke, Joachim Lepping, Uwe Schwiegelshoh...