Probabilistic frameworks for privacy-aware data mining

Often several cooperating parties would like a global view of their joint data for various data mining objectives, but cannot reveal the contents of individual records because of privacy, ownership, or competitive considerations. In this talk, we present a probabilistic framework for reconciling these seemingly contradictory goals. Rather than sharing parts of the original or perturbed data, the framework shares the parameters of suitable probabilistic models built at each local data site. We show mathematically that the best representative of all the data is a certain "mean" model, and show empirically that this model can be approximated quite well by generating artificial samples from the underlying distributions using Markov chain Monte Carlo techniques, and then fitting a combined global model with a chosen parametric form to these samples. We also propose a new measure that quantifies privacy in such situations based on information-theoretic concepts, and show that decr...
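The pipeline described in the abstract (fit a model at each site, share only its parameters, sample from the shared models, fit one global model) can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration and not taken from the talk: the local models are one-dimensional Gaussians, the "mean" model is taken to be the mixture of local models weighted by site size, the global parametric form is a single Gaussian, and the function names and synthetic data are hypothetical. Because Gaussians can be sampled directly, the sketch uses direct sampling in place of the Markov chain Monte Carlo step the abstract mentions.

```python
import random
import statistics


def fit_local_model(records):
    """Fit a 1-D Gaussian at one site; only (mean, std, count) leave the site."""
    return (statistics.fmean(records), statistics.pstdev(records), len(records))


def sample_from_local_models(models, n_samples, rng):
    """Draw artificial samples from the size-weighted mixture of local models.

    Assumption: this mixture stands in for the "mean" model of the abstract.
    Direct sampling is used here instead of MCMC.
    """
    weights = [count for _, _, count in models]
    samples = []
    for _ in range(n_samples):
        # Pick a site in proportion to its record count, then sample its model.
        mean, std, _ = rng.choices(models, weights=weights, k=1)[0]
        samples.append(rng.gauss(mean, std))
    return samples


def fit_global_model(samples):
    """Fit the chosen global parametric form (assumed: a single Gaussian)."""
    return (statistics.fmean(samples), statistics.pstdev(samples))


rng = random.Random(0)

# Hypothetical private data held at two sites; it is never pooled.
site_a = [rng.gauss(0.0, 1.0) for _ in range(500)]
site_b = [rng.gauss(4.0, 1.0) for _ in range(1500)]

# Only the fitted parameters are shared with the central party.
local_models = [fit_local_model(site_a), fit_local_model(site_b)]

artificial = sample_from_local_models(local_models, 20000, rng)
global_mean, global_std = fit_global_model(artificial)
print(f"global model: mean={global_mean:.2f}, std={global_std:.2f}")
```

In this sketch the central party sees three numbers per site and no individual records. A richer global form, such as a mixture model, would slot into `fit_global_model` without changing what the sites share.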
Joydeep Ghosh
Added 12 Dec 2010
Updated 12 Dec 2010
Type Journal
Year 2008
Where ISI
Authors Joydeep Ghosh