This paper uses an optimization approach to address the problem of conceptual clustering. The aim of AGAPE, which is based on the tabu-search meta-heuristic using split, merge and ...
We describe a large-scale application of methods for finding plagiarism and self-plagiarism in research document collections. The methods are applied to a collection of 284,834 d...
Daria Sorokina, Johannes Gehrke, Simeon Warner, Pa...
We develop the notion of normalized information distance (NID) [7] into a kernel distance suitable for use with a Support Vector Machine classifier, and demonstrate its use for an...
We derive a number of well-known deterministic latent variable models such as PCA, ICA, EPCA, NMF and PLSA as variational EM approximations with point posteriors. We show that the...
Max Welling, Chaitanya Chemudugunta, Nathan Sutter
This paper presents a unified view of a number of dimension reduction techniques under the common framework of tensors. Specifically, it is established that PCA, and the recentl...