Background: Expressed sequence tag (EST) collections are composed of a high number of single-pass, redundant, partial sequences, which need to be processed, clustered, and annotat...
All Netflix Prize algorithms proposed so far are prohibitively costly for large-scale production systems. In this paper, we describe an efficient dataflow implementation of a coll...
Srivatsava Daruru, Nena M. Marin, Matt Walker, Joy...
Abstract: This paper presents the initial design and implementation of a Gridbased distributed and parallel data mining system. The Grid system, namely the Business Intelligence Gr...
The deluge of available data for analysis demands the need to scale the performance of data mining implementations. With the current architectural trends, one of the major challen...
The support vector machine (SVM) is a wellestablished and accurate supervised learning method for the classification of data in various application fields. The statistical learnin...