In this research we introduce the problem of the binary matrix partitioning in a biological context. Our idea is to use SNP matrix to construct a set of phylogenetic networks to r...
This paper presents a general-purpose distributed lookup service, denoted Passive Distributed Indexing (PDI). PDI stores entries in form of (key, value) pairs in index caches loca...
Today the major web search engines answer queries by showing ten result snippets, which need to be inspected by users for identifying relevant results. In this paper we investigat...
Abstract. This paper presents PLDA, our parallel implementation of Latent Dirichlet Allocation on MPI and MapReduce. PLDA smooths out storage and computation bottlenecks and provid...
Yi Wang, Hongjie Bai, Matt Stanton, Wen-Yen Chen, ...
−Document clustering has become an increasingly important task in analyzing huge numbers of documents distributed among various sites. The challenging aspect is to analyze this e...