Subsequence similarity matching in time series databases is an important research area for many applications. This paper presents a new approximate approach for automatic online s...
Abstract— In this work, web-based metrics for semantic similarity computation between words or terms are presented and compared with the state-of-the-art. Starting from the funda...
We present our approach to defining similarity between software artifacts and discuss its potential exploitation in software reuse by analogy. We first establish properties of si...
We consider the problem of partitioning, in a highly accurate and highly efficient way, a set of n documents lying in a metric space into k non-overlapping clusters. We augment th...
Filippo Geraci, Marco Pellegrini, Paolo Pisati, Fa...
We present a principled methodology for filtering news stories by formal measures of information novelty, and show how the techniques can be used to custom-tailor newsfeeds based ...
Evgeniy Gabrilovich, Susan T. Dumais, Eric Horvitz