We briefly survey several privacy compromises in published datasets, some historical and some on paper. An inspection of these suggests that the problem lies with the nature of the...
In this work we provide efficient distributed protocols for generating shares of random noise, secure against malicious participants. The purpose of the noise generation is to crea...
Cynthia Dwork, Krishnaram Kenthapadi, Frank McSher...
Abstract. Decision-makers in critical fields such as medicine and finance make use of a wide range of information available over the Internet. Mediation, a data integration techn...
This paper describes the development of a ground truth dataset of culturally diverse Romanized names in which approximately 70,000 names are matched against a subset of 700. We ra...
Processing and extracting meaningful knowledge from count data is an important problem in data mining. The volume of data is increasing dramatically as the data is generated by da...