The problem of indexing time series has attracted much interest. Most algorithms used to index time series utilize the Euclidean distance or some variation thereof. However, it has...
The k-means algorithm is the method of choice for clustering large-scale data sets and it performs exceedingly well in practice. Most of the theoretical work is restricted to the c...
Abstract. The Mongue-Elkan method is a general text string comparison method based on an internal character-based similarity measure (e.g. edit distance) combined with a token leve...
Sergio Jimenez, Claudia Becerra, Alexander F. Gelb...
We use a set of photographs showing similar scenes as a model for a single photograph this scene. A distance measure for this model is defined by correlating the neigborhoods of p...
We address the problem of minimizing the communication involved in the exchange of similar documents. We consider two users, A and B, who hold documents x and y respectively. Neit...