In this paper, we formulate the problem of summarization of a dataset of transactions with categorical attributes as an optimization problem involving two objective functions - co...
The data stream model of computation is often used for analyzing huge volumes of continuously arriving data. In this paper, we present a novel algorithm called DUCstream ...
We apply a well-known Bayesian probabilistic model to textual information retrieval: the classification of documents based on their relevance to a query. This model was previously...
Existing data cleaning methods work on the basis of computing the degree of similarity between nearby records in a sorted database. High recall is achieved by accepting records wi...
Monitoring cluster evolution in data streams is a major research topic in data stream mining. Previous clustering methods for evolving data streams focus on global clustering res...
Liang Tang, Chang-jie Tang, Lei Duan, Chuan Li, Ye...