In this paper we describe an interdisciplinary collaboration between researchers in machine learning and oceanography. The collaboration was formed to study the problem of open oc...
Joshua M. Lewis, Pincelli M. Hull, Kilian Q. Weinb...
Computing reliable gene expression levels from microarray experiments is a sophisticated process with many potential pitfalls. Quality control is one of the most important steps i...
While much of the data on the web is unstructured in nature, there is also a significant amount of embedded structured data, such as product information on e-commerce sites or sto...
This paper describes a text normalization system for deletion-based abbreviations in informal text. We propose using statistical classifiers to learn the probability of deleting ...
We propose a novel algorithm for clustering data sampled from multiple submanifolds of a Riemannian manifold. First, we learn a representation of the data using generalizations of...