Abstract. This paper presents PLDA, our parallel implementation of Latent Dirichlet Allocation on MPI and MapReduce. PLDA smooths out storage and computation bottlenecks and provid...
Yi Wang, Hongjie Bai, Matt Stanton, Wen-Yen Chen, ...
Large repositories of source code create new challenges and opportunities for statistical machine learning. Here we first develop Sourcerer, an infrastructure for the automated c...
Erik Linstead, Paul Rigor, Sushil Krishna Bajracha...
Traditional sensor network deployments consisted of fixed infrastructures and were relatively small in size. More and more, we see the deployment of ad-hoc sensor networks with h...
Martin F. O'Connor, Vincent Andrieu, Mark Roantree
—Emerging Web2.0 applications such as virtual worlds or social networking websites strongly differ from usual OLTP applications. First, the transactions are encapsulated in an AP...
In P2P systems, large volumes of data are declustered naturally across a large number of peers. But it is very difficult to control the initial data distribution because every use...