The rapid globalization of Wikipedia is generating a parallel, multi-lingual corpus of unprecedented scale. Pages for the same topic in many different languages emerge both as a r...
We present collaborative peer-to-peer algorithms for the problem of approximating frequency counts for popular items distributed across the peers of a large-scale network. Our alg...
In this paper, we propose a general framework for distributed boosting intended for efficient integrating specialized classifiers learned over very large and distributed homogeneo...
Background: Standardization of analytical approaches and reporting methods via community-wide collaboration can work synergistically with web-tool development to result in rapid c...
Adam J. Carroll, Murray R. Badger, A. Harvey Milla...
The promise of unsupervised learning methods lies in their potential to use vast amounts of unlabeled data to learn complex, highly nonlinear models with millions of free paramete...