Fast similarity search at large scale is of great importance to many Information Retrieval (IR) applications. A promising way to accelerate similarity search is sem...
The Self-Organizing Map (SOM), a powerful method for data mining and cluster extraction, is well suited to processing high-dimensional, complex data. Visualization met...
The assumptions behind linear classifiers for categorical data are examined and reformulated in the context of the multinomial manifold, the simplex of multinomial models furnishe...
In this paper we investigate whether paragraphs can be identified automatically in different languages and domains. We propose a machine learning approach which exploits textual a...
This paper introduces multiple instance regression, a variant of multiple regression in which each data point may be described by more than one vector of values for the independen...
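To make the multiple instance regression setting concrete, here is a minimal sketch of the data layout: each data point is a bag holding several candidate feature vectors and a single real-valued label. The names `Bag` and `mean_instance` are hypothetical, and averaging the instances is only a naive baseline chosen for illustration, not the method the paper proposes.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Bag:
    """One data point: several candidate feature vectors, one real-valued label."""
    instances: List[List[float]]
    label: float


def mean_instance(bag: Bag) -> List[float]:
    """Collapse a bag to one vector by averaging its instances (naive baseline)."""
    n = len(bag.instances)
    return [sum(column) / n for column in zip(*bag.instances)]


# Toy data (made up for illustration): the first bag has two instances,
# the second has one.
bags = [
    Bag(instances=[[1.0, 2.0], [3.0, 4.0]], label=0.5),
    Bag(instances=[[0.0, 1.0]], label=1.5),
]

# After collapsing, the data has the shape of an ordinary regression problem:
# exactly one feature vector per label.
X = [mean_instance(b) for b in bags]
y = [b.label for b in bags]
```

Collapsing each bag discards the ambiguity about which instance actually explains the label, which is the part of the problem that multiple instance methods are meant to handle.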