To discover patterns in historical data, climate scientists have applied various clustering methods with the goal of identifying regions that share some common climatological beha...
Karsten Steinhaeuser, Nitesh V. Chawla, Auroop R. ...
We present a generalization of frequent itemsets allowing the notion of errors in the itemset definition. We motivate the problem and present an efficient algorithm that identifie...
Previous efforts on event detection from the web have focused primarily on web content and structure data, ignoring the rich collection of web log data. In this paper, we propose t...
Qiankun Zhao, Tie-Yan Liu, Sourav S. Bhowmick, Wei...
In this paper, we propose a new similarity measure to compute the pairwise similarity of text-based documents based on the suffix tree document model. By applying the new suffix tree ...
Among various document clustering algorithms that have been proposed so far, the most useful are those that automatically reveal the number of clusters and assign each target docum...
Eugene Levner, David Pinto, Paolo Rosso, David Alc...