Abstract. In this paper we propose the clustering of top-ranking sentences (TRS) for effective information access. Top-ranking sentences are selected by a query-biased sentence ex...
A major challenge in document clustering is the extremely high dimensionality of the data. For example, the vocabulary of a document set can easily run to thousands of words. On the other hand, ...
A promising approach to graph clustering is based on the intuitive notion of intra-cluster density vs. inter-cluster sparsity. While both formalizations and algorithms focusing on ...
This paper proposes a new clustering algorithm in the fuzzy c-means family, designed to cluster time series and particularly suited to short time series and those wit...
In this paper we address the problem of analyzing web log data collected at a typical online newspaper site. We propose a two-way clustering technique based on probability theory....
Hannes Wettig, Jussi Lahtinen, Tuomas Lepola, Petr...