We examine the learning-curve sampling method, an approach for applying machinelearning algorithms to large data sets. The approach is based on the observation that the computatio...
Abstract. In this paper we show how approximate matrix factorisations can be used to organise document summaries returned by a search engine into meaningful thematic categories. We...
Background: DNA microarray technology allows for the measurement of genome-wide expression patterns. Within the resultant mass of data lies the problem of analyzing and presenting...
Meng Piao Tan, Erin N. Smith, James R. Broach, Chr...
Multi-master update everywhere database replication, as achieved by protocols based on group communication such as DBSM and Postgres-R, addresses both performance and availability...
Data mining refers to the process of revealing unknown and potentially useful information from a large database. Frequent itemsets mining is one of the foundational problems in dat...