We propose a new unsupervised learning technique for extracting information from large text collections. We model documents as if they were generated by a two-stage stochastic pro...
Mark Steyvers, Padhraic Smyth, Michal Rosen-Zvi, T...
Cluster analysis is a primary method for database mining. It is either used as a stand-alone tool to get insight into the distribution of a data set, e.g. to focus further analysi...
Mihael Ankerst, Markus M. Breunig, Hans-Peter Krie...
Hardware trends suggest that large-scale CMP architectures, with tens to hundreds of processing cores on a single piece of silicon, are iminent within the next decade. While exist...
Bratin Saha, Ali-Reza Adl-Tabatabai, Anwar M. Ghul...
chool of Business Research – FY 2007 Research Abstracts Dare to Share: Protecting Sensitive Knowledge with Data Sanitization Data sanitization is a process that is used to promot...
Abstract. This paper introduces an efficient privacy-preserving protocol for distributed K-means clustering over an arbitrary partitioned data, shared among N parties. Clustering i...
Maneesh Upmanyu, Anoop M. Namboodiri, Kannan Srina...