In data publishing, anonymization techniques such as generalization and bucketization have been designed to provide privacy protection. In the meanwhile, they reduce the utility o...
Many scalable data mining tasks rely on active learning to provide the most useful accurately labeled instances. However, what if there are multiple labeling sources (`oracles...
This paper describes and evaluates privacy-friendly methods for extracting quasi-social networks from browser behavior on user-generated content sites, for the purpose of finding ...
Foster J. Provost, Brian Dalessandro, Rod Hook, Xi...
The Drosophila gene expression pattern images document the spatial and temporal dynamics of gene expression and they are valuable tools for explicating the gene functions, interac...
Shuiwang Ji, Lei Yuan, Ying-Xin Li, Zhi-Hua Zhou, ...
In this paper, we consider the problem of combining link and content analysis for community detection from networked data, such as paper citation networks and Word Wide Web. Most ...