Sciweavers

» Mining Multiple Large Databases
KDD
2008
ACM
Stream prediction using a generative model based on frequent episodes in event sequences
This paper presents a new algorithm for sequence prediction over long categorical event streams. The input to the algorithm is a set of target event types whose occurrences we wis...
Srivatsan Laxman, Vikram Tankasali, Ryen W. White
KDD
2007
ACM
Cleaning disguised missing data: a heuristic approach
In some applications such as filling in a customer information form on the web, some missing values may not be explicitly represented as such, but instead appear as potentially va...
Ming Hua, Jian Pei
KAIS
2008
A new concise representation of frequent itemsets using generators and a positive border
A complete set of frequent itemsets can get undesirably large due to redundancy when the minimum support threshold is low or when the database is dense. Several concise representat...
Guimei Liu, Jinyan Li, Limsoon Wong
EDBT
2004
ACM
Declustering Two-Dimensional Datasets over MEMS-Based Storage
Due to the large difference between seek time and transfer time in current disk technology, it is advantageous to perform large I/O using a single sequential access rather than mu...
Hailing Yu, Divyakant Agrawal, Amr El Abbadi
ICDE
2012
IEEE
Exploiting Common Subexpressions for Cloud Query Processing
Many companies now routinely run massive data analysis jobs – expressed in some scripting language – on large clusters of low-end servers. Many analysis scripts are complex ...
Yasin N. Silva, Paul-Ake Larson, Jingren Zhou