Word searching and indexing in historical document collections is a challenging problem because, characters in these documents are often touching or broken due to degradation/agei...
We study the prevalent problem when a test distribution differs from the training distribution. We consider a setting where our training set consists of a small number of sample d...
Ruslan Salakhutdinov, Sham M. Kakade, Dean P. Fost...
This paper presents Anonymouth, a novel framework for anonymizing writing style. Without accounting for style, anonymous authors risk identification. This framework is necessary t...
Andrew W. E. McDonald, Sadia Afroz, Aylin Caliskan...
Motivation: Next-generation sequencing methods are generating increasingly massive datasets, yet still do not fully capture genetic diversity in the richest environments. To under...
Alex L. B. Leach, James P. J. Chong, Kelly R. Rede...
Nearest neighbour classifiers and related kernel methods often perform poorly in high dimensional problems because it is infeasible to include enough training samples to cover the...