Skew is prevalent in data streams and should be taken into account by algorithms that analyze the data. The problem of finding "biased quantiles", that is, approximate...
Graham Cormode, Flip Korn, S. Muthukrishnan, Dives...
This paper presents research in progress. An extensive customer-centric data warehouse architecture should enable both complex analytical queries and standard reporting que...
Similarity-based grouping of data entries in one or more data sources is a task underlying many different data management tasks, such as structuring search results, removal of red...
Capturing the recent trend of the data is an important issue when mining frequent itemsets from data streams. To avoid storing the whole transaction data within the sliding windo...
Several advanced techniques have been proposed for data clustering and many of them have been applied to gene expression data, with partial success. The high dimensionality and the...