In many Web applications, such as blog classification and newsgroup classification, labeled data are in short supply. It often happens that obtaining labeled data in a new domain ...
This work addresses the need for stateful dataflow programs that can rapidly sift through huge, evolving data sets. These data-intensive applications perform complex multi-step c...
Dionysios Logothetis, Christopher Olston, Benjamin...
The rapid growth of the Internet over the last decade has been startling. However, efforts to track its growth have often fallen afoul of bad data -- for instance, how much traffi...
Assessing the similarity between objects is a prerequisite for many data mining techniques. This paper introduces a novel approach to learn distance functions that maximizes the c...
Christoph F. Eick, Alain Rouhana, Abraham Bagherje...
—Methods for learning decision rules are being successfully applied to many problem domains, especially where understanding and interpretation of the learned model is necessary. ...