A distributed learning framework for heterogeneous data sources

We present a probabilistic model-based framework for distributed learning that takes into account privacy restrictions and is applicable to scenarios where the different sites have diverse, possibly overlapping subsets of features. Our framework decouples data privacy issues from knowledge integration issues by requiring the individual sites to share only privacy-safe probabilistic models of the local data, which are then integrated to obtain a global probabilistic model based on the union of the features available at all the sites. We provide a mathematical formulation of the model integration problem using the maximum likelihood and maximum entropy principles and describe iterative algorithms that are guaranteed to converge to the optimal solution. For certain commonly occurring special cases involving hierarchically ordered feature sets or conditional independence, we obtain closed-form solutions and use these to propose an efficient alternative scheme by recursive decomposition o...
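The integration step summarized above can be illustrated with a small sketch. It is not the authors' algorithm: it uses iterative proportional fitting, a standard way to find the maximum entropy joint distribution consistent with a set of given marginals. The two sites, the three binary features A, B, C, the probability tables, and the function names `marginal` and `integrate` are all hypothetical and chosen only for illustration.

```python
from itertools import product

def marginal(joint, axes):
    """Sum a joint table (dict keyed by value tuples) down to the given axes."""
    out = {}
    for values, p in joint.items():
        key = tuple(values[i] for i in axes)
        out[key] = out.get(key, 0.0) + p
    return out

def integrate(local_models, n_features, n_iters=50):
    """Iterative proportional fitting over binary features.

    Starts from the uniform (maximum entropy) joint and repeatedly rescales
    it so that it matches each site's shared marginal in turn.
    """
    states = list(product((0, 1), repeat=n_features))
    joint = {s: 1.0 / len(states) for s in states}
    for _ in range(n_iters):
        for axes, target in local_models:
            current = marginal(joint, axes)
            for s in states:
                key = tuple(s[i] for i in axes)
                if current[key] > 0:
                    joint[s] *= target[key] / current[key]
    return joint

# Hypothetical local models: each site shares only a distribution over its
# own feature subset, never raw records. Feature indices: A=0, B=1, C=2.
site1 = ((0, 1), {(0, 0): 0.3, (0, 1): 0.2, (1, 0): 0.1, (1, 1): 0.4})  # p(A, B)
site2 = ((1, 2), {(0, 0): 0.1, (0, 1): 0.3, (1, 0): 0.5, (1, 1): 0.1})  # p(B, C)

global_model = integrate([site1, site2], n_features=3)
```

In this toy case the two sites overlap only on B, so the fitted joint coincides with the conditional independence form p(A,B) p(B,C) / p(B), which is the kind of special case the abstract says admits a closed-form solution.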
Srujana Merugu, Joydeep Ghosh
Added 28 Jun 2010
Updated 28 Jun 2010
Type Conference
Year 2005
Where KDD