Given that commercial search engines cover billions of web pages, efficiently managing the corresponding volumes of disk-resident data needed to answer user queries quickly is a f...
This paper introduces a novel statistical mixture model for probabilistic grouping of distributional histogram data. Adopting the Bayesian framework, we propose to perform anneale...
This paper proposes a novel approach to measuring XML document similarity by taking into account the semantics between XML elements. The motivation of the proposed approach is to ...
We consider the use of a cluster system with a shared-nothing architecture for update-intensive autonomous databases. To optimize load balancing, we use optimistic database repli...
This paper addresses the evaluation of clustering results and the comparison of clustering algorithms. We propose a new method based on the enrichment of a se...