In applications such as fraud and intrusion detection, it is of great interest to measure the evolving trends in the data. We consider the problem of quantifying changes between tw...
Gene expression information from microarray experiments is a primary form of data for biological analysis and can offer insights into disease processes and cellular behaviour. Suc...
This paper describes the Network-Attached Secure Disk (NASD) storage architecture, prototype implementations of NASD drives, array management for our architecture, and three files...
Garth A. Gibson, David Nagle, Khalil Amiri, Jeff B...
MapReduce is emerging as an important programming model for large-scale data-parallel applications such as web indexing, data mining, and scientific simulation. Hadoop is an open-...
Matei Zaharia, Andy Konwinski, Anthony D. Joseph, ...
This paper presents the implementation of kDCI, an enhancement of DCI [10], a scalable algorithm for discovering frequent sets in large databases. The main contribution of kDCI re...
Salvatore Orlando, Claudio Lucchese, Paolo Palmeri...