The distributed boosting algorithm

10 years 4 months ago
The distributed boosting algorithm
In this paper, we propose a general framework for distributed boosting intended for efficient integrating specialized classifiers learned over very large and distributed homogeneous databases that cannot be merged at a single location. Our distributed boosting algorithm can also be used as a parallel classification technique, where a massive database that cannot fit into main computer memory is partitioned into disjoint subsets for a more efficient analysis. In the proposed method, at each boosting round the classifiers are first learned from disjoint datasets and then exchanged amongst the sites. Finally the classifiers are combined into a weighted voting ensemble on each disjoint data set. The ensemble that is applied to an unseen test set represents an ensemble of ensembles built on all distributed sites. In experiments performed on four large data sets the proposed distributed boosting method achieved classification accuracy comparable or even slightly better than the standard boo...
Aleksandar Lazarevic, Zoran Obradovic
Added 30 Nov 2009
Updated 30 Nov 2009
Type Conference
Year 2001
Where KDD
Authors Aleksandar Lazarevic, Zoran Obradovic
Comments (0)